Commit Graph

618 Commits

Author SHA1 Message Date
Nathan Chancellor a626beca4c This is the 3.10.105 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJYnZNdAAoJEE44bZycYXAvJv0P/jpPc+jKb+D0FVUOiYDkY5Rw
 jxsZ3oeruTIeSFAAIzusVMLm9moBJA6DThuTHU5Kt68mRaKB2lgmqwkkvQAPTSYh
 tnDQwlrF7dOSVmPczJFHalpaLpRdXdQP9r8y+38PibaFZPssKdnZr3BBfdOdi5DT
 lj029AGKfG7co6Hb/iAhsxuAFfPmvGHY4QNwJ2FRbU1m6MDtmCTbXzF0fc6X5AW1
 qrtaWwPulJtZ/5MPk7aFyNpuCpNvIaTEqNaQsZbuz3bHfzDQVLerWze98vgHC0QM
 2YOTP6TnEiHhxHGMb9SywUgSV1ylx0X542YDfxmcfyxBWRr0khlxQh1gpX+waqE3
 pqdSlvN7AFzifw6kubbG2/XjkNvFtJcDTgrL3qco4utIezSijXmoOsDpKNnJuzk/
 kSD5WYd+Q1CSHOkqZX29QPw1Dl/7Ftm7GPfxu7Pis1OBuPByqtRkEfmn9DpiKSs5
 Aja0ljZYiQ3jy3fH+WlEzo6PVSxx0ZxKg0fOShlpgjj8KjMUdGfl9cB1OZxyWnNH
 UiQ9iIWd3tJci7WbsBOfawsQpq3EIJxZKjyUmLYpBht5/YenYxOBDCr/CLJDQBGI
 IQUPAs/E1JGDxGTUY3AmsaMVrcX2yOfhLzjrsVJGqSdote0um+2PdTLZHE4MMiz2
 Dh6CbUVYWS1KNgmQ8T8L
 =k5mW
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEJDfLduVEy2qz2d/TmXOSYMtstxYFAlpqeiwACgkQmXOSYMts
 txaBAQ/+KqZh90YZI+gRHGdczbo3XnlryHMdpp+DTIFtN3zU+2LM352oP+haoJfr
 YhNsixcMhW5TX0is5fg4SkIc0B3ooGKZLVKOPIRw+1NLBAVG5yVuYxW7I1faJgk6
 F37+4rvq7KAOPCNMjAEXRt7GqZ4WZjgvgKy+u5wzKh3k5kUylqDwlP2qdgx2L5Rc
 IxyxgOuaVGV6dZTyAyRlRMild5Tlz+SMY4pWoMe0sulDDXhd5/5PnGNVIgh+XqB6
 m0AGkIIzPVe+wmg6n1iYs93dQO0Jmu6DL47Zv4f3ASZNL/XVSLvU9ie63FyWGZXG
 e52qAPtztXInEOo15vPQSAAq7McZHDTzhHhsU/ZtkBT+LeSUU+rsxXddJ2EO5UgC
 O3cVm11x1FWMzbBtFNFtkqeri2Y2OxvU4O81mfNP1oOUQBTMeSHTzQ8psbCdXeEr
 ktSOtI+nakPmDE3aq4YSaz7BwSgt2tU/vZehkrTxtAQJxt0b88r2xFfThy5WScT1
 v6muoqxlprjjvFld7v99P8cXxJq4QrxKUxXtEBTdB79Q5xtCC29OAcTelpPFDCED
 /KpgZflubzH/Z872AW9Ru8OL9PYty6hBNDOP4aHLSFWfCu3KQxL6BMEeqi5qBjBX
 mJ8JT0dCQYP6xONIWq6a3fICroNMazhNFxdpPSfsQFRhujhjGPg=
 =zhKv
 -----END PGP SIGNATURE-----

Merge 3.10.105 into android-msm-bullhead-3.10-oreo-m5

Changes in 3.10.105: (315 commits)
        sched/core: Fix a race between try_to_wake_up() and a woken up task
        sched/core: Fix an SMP ordering race in try_to_wake_up() vs. schedule()
        crypto: algif_skcipher - Require setkey before accept(2)
        crypto: af_alg - Disallow bind/setkey/... after accept(2)
        crypto: af_alg - Add nokey compatibility path
        crypto: algif_skcipher - Add nokey compatibility path
        crypto: hash - Add crypto_ahash_has_setkey
        crypto: shash - Fix has_key setting
        crypto: algif_hash - Require setkey before accept(2)
        crypto: skcipher - Add crypto_skcipher_has_setkey
        crypto: algif_skcipher - Add key check exception for cipher_null
        crypto: af_alg - Allow af_af_alg_release_parent to be called on nokey path
        crypto: algif_hash - Remove custom release parent function
        crypto: algif_skcipher - Remove custom release parent function
        crypto: af_alg - Forbid bind(2) when nokey child sockets are present
        crypto: algif_hash - Fix race condition in hash_check_key
        crypto: algif_skcipher - Fix race condition in skcipher_check_key
        crypto: algif_skcipher - Load TX SG list after waiting
        crypto: cryptd - initialize child shash_desc on import
        crypto: skcipher - Fix blkcipher walk OOM crash
        crypto: gcm - Fix IV buffer size in crypto_gcm_setkey
        MIPS: KVM: Fix unused variable build warning
        KVM: MIPS: Precalculate MMIO load resume PC
        KVM: MIPS: Drop other CPU ASIDs on guest MMU changes
        KVM: nVMX: postpone VMCS changes on MSR_IA32_APICBASE write
        KVM: MIPS: Make ERET handle ERL before EXL
        KVM: x86: fix wbinvd_dirty_mask use-after-free
        KVM: x86: fix missed SRCU usage in kvm_lapic_set_vapic_addr
        KVM: Disable irq while unregistering user notifier
        PM / devfreq: Fix incorrect type issue.
        ppp: defer netns reference release for ppp channel
        x86/mm/xen: Suppress hugetlbfs in PV guests
        xen: Add RING_COPY_REQUEST()
        xen-netback: don't use last request to determine minimum Tx credit
        xen-netback: use RING_COPY_REQUEST() throughout
        xen-blkback: only read request operation from shared ring once
        xen/pciback: Save xen_pci_op commands before processing it
        xen/pciback: Save the number of MSI-X entries to be copied later.
        xen/pciback: Return error on XEN_PCI_OP_enable_msi when device has MSI or MSI-X enabled
        xen/pciback: Return error on XEN_PCI_OP_enable_msix when device has MSI or MSI-X enabled
        xen/pciback: Do not install an IRQ handler for MSI interrupts.
        xen/pciback: For XEN_PCI_OP_disable_msi[|x] only disable if device has MSI(X) enabled.
        xen/pciback: Don't allow MSI-X ops if PCI_COMMAND_MEMORY is not set.
        xen-pciback: Add name prefix to global 'permissive' variable
        x86/xen: fix upper bound of pmd loop in xen_cleanhighmap()
        x86/traps: Ignore high word of regs->cs in early_idt_handler_common
        x86/mm: Disable preemption during CR3 read+write
        x86/apic: Do not init irq remapping if ioapic is disabled
        x86/mm/pat, /dev/mem: Remove superfluous error message
        x86/paravirt: Do not trace _paravirt_ident_*() functions
        x86/build: Build compressed x86 kernels as PIE
        x86/um: reuse asm-generic/barrier.h
        iommu/amd: Update Alias-DTE in update_device_table()
        iommu/amd: Free domain id when free a domain of struct dma_ops_domain
        ARM: 8616/1: dt: Respect property size when parsing CPUs
        ARM: 8618/1: decompressor: reset ttbcr fields to use TTBR0 on ARMv7
        ARM: sa1100: clear reset status prior to reboot
        ARM: sa1111: fix pcmcia suspend/resume
        arm64: avoid returning from bad_mode
        arm64: Define AT_VECTOR_SIZE_ARCH for ARCH_DLINFO
        arm64: spinlocks: implement smp_mb__before_spinlock() as smp_mb()
        arm64: debug: avoid resetting stepping state machine when TIF_SINGLESTEP
        MIPS: Malta: Fix IOCU disable switch read for MIPS64
        MIPS: ptrace: Fix regs_return_value for kernel context
        powerpc/mm: Don't alias user region to other regions below PAGE_OFFSET
        powerpc/vdso64: Use double word compare on pointers
        powerpc/powernv: Use CPU-endian PEST in pnv_pci_dump_p7ioc_diag_data()
        powerpc/64: Fix incorrect return value from __copy_tofrom_user
        powerpc/nvram: Fix an incorrect partition merge
        avr32: fix copy_from_user()
        avr32: fix 'undefined reference to `___copy_from_user'
        avr32: off by one in at32_init_pio()
        s390/dasd: fix hanging device after clear subchannel
        parisc: Ensure consistent state when switching to kernel stack at syscall entry
        microblaze: fix __get_user()
        microblaze: fix copy_from_user()
        mn10300: failing __get_user() and get_user() should zero
        m32r: fix __get_user()
        sh64: failing __get_user() should zero
        score: fix __get_user/get_user
        s390: get_user() should zero on failure
        ARC: uaccess: get_user to zero out dest in cause of fault
        asm-generic: make get_user() clear the destination on errors
        frv: fix clear_user()
        cris: buggered copy_from_user/copy_to_user/clear_user
        blackfin: fix copy_from_user()
        score: fix copy_from_user() and friends
        sh: fix copy_from_user()
        hexagon: fix strncpy_from_user() error return
        mips: copy_from_user() must zero the destination on access_ok() failure
        asm-generic: make copy_from_user() zero the destination properly
        alpha: fix copy_from_user()
        metag: copy_from_user() should zero the destination on access_ok() failure
        parisc: fix copy_from_user()
        openrisc: fix copy_from_user()
        openrisc: fix the fix of copy_from_user()
        mn10300: copy_from_user() should zero on access_ok() failure...
        sparc32: fix copy_from_user()
        ppc32: fix copy_from_user()
        ia64: copy_from_user() should zero the destination on access_ok() failure
        fix fault_in_multipages_...() on architectures with no-op access_ok()
        fix memory leaks in tracing_buffers_splice_read()
        arc: don't leak bits of kernel stack into coredump
        Fix potential infoleak in older kernels
        swapfile: fix memory corruption via malformed swapfile
        coredump: fix unfreezable coredumping task
        usb: dwc3: gadget: increment request->actual once
        USB: validate wMaxPacketValue entries in endpoint descriptors
        USB: fix typo in wMaxPacketSize validation
        usb: xhci: Fix panic if disconnect
        USB: serial: fix memleak in driver-registration error path
        USB: kobil_sct: fix non-atomic allocation in write path
        USB: serial: mos7720: fix non-atomic allocation in write path
        USB: serial: mos7840: fix non-atomic allocation in write path
        usb: renesas_usbhs: fix clearing the {BRDY,BEMP}STS condition
        USB: change bInterval default to 10 ms
        usb: gadget: fsl_qe_udc: signedness bug in qe_get_frame()
        USB: serial: cp210x: fix hardware flow-control disable
        usb: misc: legousbtower: Fix NULL pointer deference
        usb: gadget: function: u_ether: don't starve tx request queue
        USB: serial: cp210x: fix tiocmget error handling
        usb: gadget: u_ether: remove interrupt throttling
        usb: chipidea: move the lock initialization to core file
        Fix USB CB/CBI storage devices with CONFIG_VMAP_STACK=y
        ALSA: rawmidi: Fix possible deadlock with virmidi registration
        ALSA: timer: fix NULL pointer dereference in read()/ioctl() race
        ALSA: timer: fix division by zero after SNDRV_TIMER_IOCTL_CONTINUE
        ALSA: timer: fix NULL pointer dereference on memory allocation failure
        ALSA: ali5451: Fix out-of-bound position reporting
        ALSA: pcm : Call kill_fasync() in stream lock
        zfcp: fix fc_host port_type with NPIV
        zfcp: fix ELS/GS request&response length for hardware data router
        zfcp: close window with unblocked rport during rport gone
        zfcp: retain trace level for SCSI and HBA FSF response records
        zfcp: restore: Dont use 0 to indicate invalid LUN in rec trace
        zfcp: trace on request for open and close of WKA port
        zfcp: restore tracing of handle for port and LUN with HBA records
        zfcp: fix D_ID field with actual value on tracing SAN responses
        zfcp: fix payload trace length for SAN request&response
        zfcp: trace full payload of all SAN records (req,resp,iels)
        scsi: zfcp: spin_lock_irqsave() is not nestable
        scsi: mpt3sas: Fix secure erase premature termination
        scsi: mpt3sas: Unblock device after controller reset
        scsi: mpt3sas: fix hang on ata passthrough commands
        mpt2sas: Fix secure erase premature termination
        scsi: megaraid_sas: Fix data integrity failure for JBOD (passthrough) devices
        scsi: megaraid_sas: fix macro MEGASAS_IS_LOGICAL to avoid regression
        scsi: ibmvfc: Fix I/O hang when port is not mapped
        scsi: Fix use-after-free
        scsi: arcmsr: Buffer overflow in arcmsr_iop_message_xfer()
        scsi: scsi_debug: Fix memory leak if LBP enabled and module is unloaded
        scsi: arcmsr: Send SYNCHRONIZE_CACHE command to firmware
        ext4: validate that metadata blocks do not overlap superblock
        ext4: avoid modifying checksum fields directly during checksum verification
        ext4: use __GFP_NOFAIL in ext4_free_blocks()
        ext4: reinforce check of i_dtime when clearing high fields of uid and gid
        ext4: allow DAX writeback for hole punch
        ext4: sanity check the block and cluster size at mount time
        reiserfs: fix "new_insert_key may be used uninitialized ..."
        reiserfs: Unlock superblock before calling reiserfs_quota_on_mount()
        xfs: fix superblock inprogress check
        libxfs: clean up _calc_dquots_per_chunk
        btrfs: ensure that file descriptor used with subvol ioctls is a dir
        ocfs2/dlm: fix race between convert and migration
        ocfs2: fix start offset to ocfs2_zero_range_for_truncate()
        ubifs: Fix assertion in layout_in_gaps()
        ubifs: Fix xattr_names length in exit paths
        UBIFS: Fix possible memory leak in ubifs_readdir()
        ubifs: Abort readdir upon error
        ubifs: Fix regression in ubifs_readdir()
        UBI: fastmap: scrub PEB when bitflips are detected in a free PEB EC header
        NFSv4.x: Fix a refcount leak in nfs_callback_up_net
        NFSD: Using free_conn free connection
        NFS: Don't drop CB requests with invalid principals
        NFSv4: Open state recovery must account for file permission changes
        fs/seq_file: fix out-of-bounds read
        fs/super.c: fix race between freeze_super() and thaw_super()
        isofs: Do not return EACCES for unknown filesystems
        hostfs: Freeing an ERR_PTR in hostfs_fill_sb_common()
        driver core: Delete an unnecessary check before the function call "put_device"
        driver core: fix race between creating/querying glue dir and its cleanup
        drm/radeon: fix radeon_move_blit on 32bit systems
        drm: Reject page_flip for !DRIVER_MODESET
        drm/radeon: Ensure vblank interrupt is enabled on DPMS transition to on
        qxl: check for kmap failures
        Input: i8042 - break load dependency between atkbd/psmouse and i8042
        Input: i8042 - set up shared ps2_cmd_mutex for AUX ports
        Input: ili210x - fix permissions on "calibrate" attribute
        hwrng: exynos - Disable runtime PM on probe failure
        hwrng: omap - Fix assumption that runtime_get_sync will always succeed
        hwrng: omap - Only fail if pm_runtime_get_sync returns < 0
        i2c-eg20t: fix race between i2c init and interrupt enable
        em28xx-i2c: rt_mutex_trylock() returns zero on failure
        i2c: core: fix NULL pointer dereference under race condition
        i2c: at91: fix write transfers by clearing pending interrupt first
        iio: accel: kxsd9: Fix raw read return
        iio: accel: kxsd9: Fix scaling bug
        thermal: hwmon: Properly report critical temperature in sysfs
        cdc-acm: fix wrong pipe type on rx interrupt xfers
        timers: Use proper base migration in add_timer_on()
        EDAC: Increment correct counter in edac_inc_ue_error()
        IB/ipoib: Fix memory corruption in ipoib cm mode connect flow
        IB/core: Fix use after free in send_leave function
        IB/ipoib: Don't allow MC joins during light MC flush
        IB/mlx4: Fix incorrect MC join state bit-masking on SR-IOV
        IB/mlx4: Fix create CQ error flow
        IB/uverbs: Fix leak of XRC target QPs
        IB/cm: Mark stale CM id's whenever the mad agent was unregistered
        mtd: blkdevs: fix potential deadlock + lockdep warnings
        mtd: pmcmsp-flash: Allocating too much in init_msp_flash()
        mtd: nand: davinci: Reinitialize the HW ECC engine in 4bit hwctl
        perf symbols: Fixup symbol sizes before picking best ones
        perf: Tighten (and fix) the grouping condition
        tty: Prevent ldisc drivers from re-using stale tty fields
        tty: limit terminal size to 4M chars
        tty: vt, fix bogus division in csi_J
        vt: clear selection before resizing
        drivers/vfio: Rework offsetofend()
        include/stddef.h: Move offsetofend() from vfio.h to a generic kernel header
        stddef.h: move offsetofend inside #ifndef/#endif guard, neaten
        ipv6: don't call fib6_run_gc() until routing is ready
        ipv6: split duplicate address detection and router solicitation timer
        ipv6: move DAD and addrconf_verify processing to workqueue
        ipv6: addrconf: fix dev refcont leak when DAD failed
        ipv6: fix rtnl locking in setsockopt for anycast and multicast
        ip6_gre: fix flowi6_proto value in ip6gre_xmit_other()
        ipv6: correctly add local routes when lo goes up
        ipv6: dccp: fix out of bound access in dccp_v6_err()
        ipv6: dccp: add missing bind_conflict to dccp_ipv6_mapped
        ip6_tunnel: Clear IP6CB in ip6tunnel_xmit()
        ip6_tunnel: disable caching when the traffic class is inherited
        net/irda: handle iriap_register_lsap() allocation failure
        tcp: fix use after free in tcp_xmit_retransmit_queue()
        tcp: properly scale window in tcp_v[46]_reqsk_send_ack()
        tcp: fix overflow in __tcp_retransmit_skb()
        tcp: fix wrong checksum calculation on MTU probing
        tcp: take care of truncations done by sk_filter()
        bonding: Fix bonding crash
        net: ratelimit warnings about dst entry refcount underflow or overflow
        mISDN: Support DR6 indication in mISDNipac driver
        mISDN: Fixing missing validation in base_sock_bind()
        net: disable fragment reassembly if high_thresh is set to zero
        ipvs: count pre-established TCP states as active
        iwlwifi: pcie: fix access to scratch buffer
        svc: Avoid garbage replies when pc_func() returns rpc_drop_reply
        brcmsmac: Free packet if dma_mapping_error() fails in dma_rxfill
        brcmsmac: Initialize power in brcms_c_stf_ss_algo_channel_get()
        brcmfmac: avoid potential stack overflow in brcmf_cfg80211_start_ap()
        pstore: Fix buffer overflow while write offset equal to buffer size
        net/mlx4_core: Allow resetting VF admin mac to zero
        firewire: net: guard against rx buffer overflows
        firewire: net: fix fragmented datagram_size off-by-one
        netfilter: fix namespace handling in nf_log_proc_dostring
        can: bcm: fix warning in bcm_connect/proc_register
        net: fix sk_mem_reclaim_partial()
        net: avoid sk_forward_alloc overflows
        ipmr, ip6mr: fix scheduling while atomic and a deadlock with ipmr_get_route
        packet: call fanout_release, while UNREGISTERING a netdev
        net: sctp, forbid negative length
        sctp: validate chunk len before actually using it
        net: clear sk_err_soft in sk_clone_lock()
        net: mangle zero checksum in skb_checksum_help()
        dccp: do not send reset to already closed sockets
        dccp: fix out of bound access in dccp_v4_err()
        sctp: assign assoc_id earlier in __sctp_connect
        neigh: check error pointer instead of NULL for ipv4_neigh_lookup()
        ipv4: use new_gw for redirect neigh lookup
        mac80211: fix purging multicast PS buffer queue
        mac80211: discard multicast and 4-addr A-MSDUs
        cfg80211: limit scan results cache size
        mwifiex: printk() overflow with 32-byte SSIDs
        ipv4: Set skb->protocol properly for local output
        net: sky2: Fix shutdown crash
        kaweth: fix firmware download
        tracing: Move mutex to protect against resetting of seq data
        kernel/fork: fix CLONE_CHILD_CLEARTID regression in nscd
        Revert "ipc/sem.c: optimize sem_lock()"
        cfq: fix starvation of asynchronous writes
        drbd: Fix kernel_sendmsg() usage - potential NULL deref
        lib/genalloc.c: start search from start of chunk
        tools/vm/slabinfo: fix an unintentional printf
        rcu: Fix soft lockup for rcu_nocb_kthread
        ratelimit: fix bug in time interval by resetting right begin time
        mfd: core: Fix device reference leak in mfd_clone_cell
        PM / sleep: fix device reference leak in test_suspend
        mmc: mxs: Initialize the spinlock prior to using it
        mmc: block: don't use CMD23 with very old MMC cards
        pstore/core: drop cmpxchg based updates
        pstore/ram: Use memcpy_toio instead of memcpy
        pstore/ram: Use memcpy_fromio() to save old buffer
        mb86a20s: fix the locking logic
        mb86a20s: fix demod settings
        cx231xx: don't return error on success
        cx231xx: fix GPIOs for Pixelview SBTVD hybrid
        gpio: mpc8xxx: Correct irq handler function
        uio: fix dmem_region_start computation
        KEYS: Fix short sprintf buffer in /proc/keys show function
        hv: do not lose pending heartbeat vmbus packets
        staging: iio: ad5933: avoid uninitialized variable in error case
        mei: bus: fix received data size check in NFC fixup
        ACPI / APEI: Fix incorrect return value of ghes_proc()
        PCI: Handle read-only BARs on AMD CS553x devices
        tile: avoid using clocksource_cyc2ns with absolute cycle count
        dm flakey: fix reads to be issued if drop_writes configured
        mm,ksm: fix endless looping in allocating memory when ksm enable
        can: dev: fix deadlock reported after bus-off
        hwmon: (adt7411) set bit 3 in CFG1 register
        mpi: Fix NULL ptr dereference in mpi_powm() [ver #3]
        mfd: 88pm80x: Double shifting bug in suspend/resume
        ASoC: omap-mcpdm: Fix irq resource handling
        regulator: tps65910: Work around silicon erratum SWCZ010
        dm: mark request_queue dead before destroying the DM device
        fbdev/efifb: Fix 16 color palette entry calculation
        metag: Only define atomic_dec_if_positive conditionally
        Linux 3.10.105

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

Conflicts:
	arch/arm/mach-sa1100/generic.c
	arch/arm64/kernel/traps.c
	crypto/blkcipher.c
	drivers/devfreq/devfreq.c
	drivers/usb/dwc3/gadget.c
	drivers/usb/gadget/u_ether.c
	fs/ubifs/dir.c
	include/net/if_inet6.h
	lib/genalloc.c
	net/ipv6/addrconf.c
	net/ipv6/tcp_ipv6.c
	net/wireless/scan.c
	sound/core/timer.c
2018-01-25 17:45:32 -07:00
Al Viro 1eeb8d13b5 ia64: copy_from_user() should zero the destination on access_ok() failure
commit a5e541f796f17228793694d64b507f5f57db4cd7 upstream.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2017-02-06 23:33:03 +01:00
Tim Chen 3caa6fcb90 locking/mcs: Allow architecture specific asm files to be used for contended case
This patch allows each architecture to add its specific assembly optimized
arch_mcs_spin_lock_contended and arch_mcs_spinlock_uncontended for
MCS lock and unlock functions.

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Scott J Norton <scott.norton@hp.com>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: AswinChandramouleeswaran <aswin@hp.com>
Cc: George Spelvin <linux@horizon.com>
Cc: Rik vanRiel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: MichelLespinasse <walken@google.com>
Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Alex Shi <alex.shi@linaro.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Figo.zhang" <figo1802@gmail.com>
Cc: "Paul E.McKenney" <paulmck@linux.vnet.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Waiman Long <waiman.long@hp.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew R Wilcox <matthew.r.wilcox@intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1390347382.3138.67.camel@schen9-DESK
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Git-commit: ddf1d169c0a489d498c1799a7043904a43b0c159
[joonwoop@codeaurora.org: Resolve merge conflicts; we don't have changes
for arch other than ARM/ARM64]
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2014-08-15 11:41:16 -07:00
Peter Zijlstra dcd63889c3 arch: Introduce smp_load_acquire(), smp_store_release()
A number of situations currently require the heavyweight smp_mb(),
even though there is no need to order prior stores against later
loads.  Many architectures have much cheaper ways to handle these
situations, but the Linux kernel currently has no portable way
to make use of them.

This commit therefore supplies smp_load_acquire() and
smp_store_release() to remedy this situation.  The new
smp_load_acquire() primitive orders the specified load against
any subsequent reads or writes, while the new smp_store_release()
primitive orders the specifed store against any prior reads or
writes.  These primitives allow array-based circular FIFOs to be
implemented without an smp_mb(), and also allow a theoretical
hole in rcu_assign_pointer() to be closed at no additional
expense on most architectures.

In addition, the RCU experience transitioning from explicit
smp_read_barrier_depends() and smp_wmb() to rcu_dereference()
and rcu_assign_pointer(), respectively resulted in substantial
improvements in readability.  It therefore seems likely that
replacing other explicit barriers with smp_load_acquire() and
smp_store_release() will provide similar benefits.  It appears
that roughly half of the explicit barriers in core kernel code
might be so replaced.

[Changelog by PaulMck]

Reviewed-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Victor Kaplansky <VICTORK@il.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Link: http://lkml.kernel.org/r/20131213150640.908486364@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Git-commit: 47933ad41a86a4a9b50bed7c9b9bd2ba242aac63
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Ian Maund <imaund@codeaurora.org>
2014-04-17 17:05:07 -07:00
Ian Maund f1b32d4e47 Merge upstream linux-stable v3.10.28 into msm-3.10
The following commits have been reverted from this merge, as they are
known to introduce new bugs and are currently incompatible with our
audio implementation. Investigation of these commits is ongoing, and
they are expected to be brought in at a later time:

86e6de7 ALSA: compress: fix drain calls blocking other compress functions (v6)
16442d4 ALSA: compress: fix drain calls blocking other compress functions

This merge commit also includes a change in block, necessary for
compilation. Upstream has modified elevator_init_fn to prevent race
conditions, requring updates to row_init_queue and test_init_queue.

* commit 'v3.10.28': (1964 commits)
  Linux 3.10.28
  ARM: 7938/1: OMAP4/highbank: Flush L2 cache before disabling
  drm/i915: Don't grab crtc mutexes in intel_modeset_gem_init()
  serial: amba-pl011: use port lock to guard control register access
  mm: Make {,set}page_address() static inline if WANT_PAGE_VIRTUAL
  md/raid5: Fix possible confusion when multiple write errors occur.
  md/raid10: fix two bugs in handling of known-bad-blocks.
  md/raid10: fix bug when raid10 recovery fails to recover a block.
  md: fix problem when adding device to read-only array with bitmap.
  drm/i915: fix DDI PLLs HW state readout code
  nilfs2: fix segctor bug that causes file system corruption
  thp: fix copy_page_rep GPF by testing is_huge_zero_pmd once only
  ftrace/x86: Load ftrace_ops in parameter not the variable holding it
  SELinux: Fix possible NULL pointer dereference in selinux_inode_permission()
  writeback: Fix data corruption on NFS
  hwmon: (coretemp) Fix truncated name of alarm attributes
  vfs: In d_path don't call d_dname on a mount point
  staging: comedi: adl_pci9111: fix incorrect irq passed to request_irq()
  staging: comedi: addi_apci_1032: fix subdevice type/flags bug
  mm/memory-failure.c: recheck PageHuge() after hugetlb page migrate successfully
  GFS2: Increase i_writecount during gfs2_setattr_chown
  perf/x86/amd/ibs: Fix waking up from S3 for AMD family 10h
  perf scripting perl: Fix build error on Fedora 12
  ARM: 7815/1: kexec: offline non panic CPUs on Kdump panic
  Linux 3.10.27
  sched: Guarantee new group-entities always have weight
  sched: Fix hrtimer_cancel()/rq->lock deadlock
  sched: Fix cfs_bandwidth misuse of hrtimer_expires_remaining
  sched: Fix race on toggling cfs_bandwidth_used
  x86, fpu, amd: Clear exceptions in AMD FXSAVE workaround
  netfilter: nf_nat: fix access to uninitialized buffer in IRC NAT helper
  SCSI: sd: Reduce buffer size for vpd request
  intel_pstate: Add X86_FEATURE_APERFMPERF to cpu match parameters.
  mac80211: move "bufferable MMPDU" check to fix AP mode scan
  ACPI / Battery: Add a _BIX quirk for NEC LZ750/LS
  ACPI / TPM: fix memory leak when walking ACPI namespace
  mfd: rtsx_pcr: Disable interrupts before cancelling delayed works
  clk: exynos5250: fix sysmmu_mfc{l,r} gate clocks
  clk: samsung: exynos5250: Add CLK_IGNORE_UNUSED flag for the sysreg clock
  clk: samsung: exynos4: Correct SRC_MFC register
  clk: clk-divider: fix divisor > 255 bug
  ahci: add PCI ID for Marvell 88SE9170 SATA controller
  parisc: Ensure full cache coherency for kmap/kunmap
  drm/nouveau/bios: make jump conditional
  ARM: shmobile: mackerel: Fix coherent DMA mask
  ARM: shmobile: armadillo: Fix coherent DMA mask
  ARM: shmobile: kzm9g: Fix coherent DMA mask
  ARM: dts: exynos5250: Fix MDMA0 clock number
  ARM: fix "bad mode in ... handler" message for undefined instructions
  ARM: fix footbridge clockevent device
  net: Loosen constraints for recalculating checksum in skb_segment()
  bridge: use spin_lock_bh() in br_multicast_set_hash_max
  netpoll: Fix missing TXQ unlock and and OOPS.
  net: llc: fix use after free in llc_ui_recvmsg
  virtio-net: fix refill races during restore
  virtio_net: don't leak memory or block when too many frags
  virtio-net: make all RX paths handle errors consistently
  virtio_net: fix error handling for mergeable buffers
  vlan: Fix header ops passthru when doing TX VLAN offload.
  net: rose: restore old recvmsg behavior
  rds: prevent dereference of a NULL device
  ipv6: always set the new created dst's from in ip6_rt_copy
  net: fec: fix potential use after free
  hamradio/yam: fix info leak in ioctl
  drivers/net/hamradio: Integer overflow in hdlcdrv_ioctl()
  net: inet_diag: zero out uninitialized idiag_{src,dst} fields
  ip_gre: fix msg_name parsing for recvfrom/recvmsg
  net: unix: allow bind to fail on mutex lock
  ipv6: fix illegal mac_header comparison on 32bit
  netvsc: don't flush peers notifying work during setting mtu
  tg3: Initialize REG_BASE_ADDR at PCI config offset 120 to 0
  net: unix: allow set_peek_off to fail
  net: drop_monitor: fix the value of maxattr
  ipv6: don't count addrconf generated routes against gc limit
  packet: fix send path when running with proto == 0
  virtio: delete napi structures from netdev before releasing memory
  macvtap: signal truncated packets
  tun: update file current position
  macvtap: update file current position
  macvtap: Do not double-count received packets
  rds: prevent BUG_ON triggered on congestion update to loopback
  net: do not pretend FRAGLIST support
  IPv6: Fixed support for blackhole and prohibit routes
  HID: Revert "Revert "HID: Fix logitech-dj: missing Unifying device issue""
  gpio-rcar: R-Car GPIO IRQ share interrupt
  clocksource: em_sti: Set cpu_possible_mask to fix SMP broadcast
  irqchip: renesas-irqc: Fix irqc_probe error handling
  Linux 3.10.26
  sh: add EXPORT_SYMBOL(min_low_pfn) and EXPORT_SYMBOL(max_low_pfn) to sh_ksyms_32.c
  ext4: fix bigalloc regression
  arm64: Use Normal NonCacheable memory for writecombine
  arm64: Do not flush the D-cache for anonymous pages
  arm64: Avoid cache flushing in flush_dcache_page()
  ARM: KVM: arch_timers: zero CNTVOFF upon return to host
  ARM: hyp: initialize CNTVOFF to zero
  clocksource: arch_timer: use virtual counters
  arm64: Remove unused cpu_name ascii in arch/arm64/mm/proc.S
  arm64: dts: Reserve the memory used for secondary CPU release address
  arm64: check for number of arguments in syscall_get/set_arguments()
  arm64: fix possible invalid FPSIMD initialization state
  ...

Change-Id: Ia0e5d71b536ab49ec3a1179d59238c05bdd03106
Signed-off-by: Ian Maund <imaund@codeaurora.org>
2014-03-24 14:28:34 -07:00
Stefano Stabellini f3f3ea582c xen: introduce xen_dma_map/unmap_page and xen_dma_sync_single_for_cpu/device
Introduce xen_dma_map_page, xen_dma_unmap_page,
xen_dma_sync_single_for_cpu and xen_dma_sync_single_for_device.
They have empty implementations on x86 and ia64 but they call the
corresponding platform dma_ops function on arm and arm64.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

Changes in v9:
- xen_dma_map_page return void, avoid page_to_phys.
Git-commit: 7100b077ab4ff5fb0ba7760ce54465f623a0a763
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Ian Maund <imaund@codeaurora.org>
2014-02-07 13:49:53 -08:00
Stefano Stabellini 9a6d1b6158 xen: introduce xen_alloc/free_coherent_pages
xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu
and devices. On native x86 is sufficient to call __get_free_pages in
order to get a coherent buffer, while on ARM (and potentially ARM64) we
need to call the native dma_ops->alloc implementation.

Introduce xen_alloc_coherent_pages to abstract the arch specific buffer
allocation.

Similarly introduce xen_free_coherent_pages to free a coherent buffer:
on x86 is simply a call to free_pages while on ARM and ARM64 is
arm_dma_ops.free.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

Changes in v7:
- rename __get_dma_ops to __generic_dma_ops;
- call __generic_dma_ops(hwdev)->alloc/free on arm64 too.

Changes in v6:
- call __get_dma_ops to get the native dma_ops pointer on arm.
Git-commit: d6fe76c58c358498b91d21f0ca8054f6aa6e672d
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Ian Maund <imaund@codeaurora.org>
2014-02-07 13:49:51 -08:00
Linus Torvalds fd4defb83c Fix TLB gather virtual address range invalidation corner cases
Ben Tebulin reported:

 "Since v3.7.2 on two independent machines a very specific Git
  repository fails in 9/10 cases on git-fsck due to an SHA1/memory
  failures.  This only occurs on a very specific repository and can be
  reproduced stably on two independent laptops.  Git mailing list ran
  out of ideas and for me this looks like some very exotic kernel issue"

and bisected the failure to the backport of commit 53a59fc67f ("mm:
limit mmu_gather batching to fix soft lockups on !CONFIG_PREEMPT").

That commit itself is not actually buggy, but what it does is to make it
much more likely to hit the partial TLB invalidation case, since it
introduces a new case in tlb_next_batch() that previously only ever
happened when running out of memory.

The real bug is that the TLB gather virtual memory range setup is subtly
buggered.  It was introduced in commit 597e1c3580 ("mm/mmu_gather:
enable tlb flush range in generic mmu_gather"), and the range handling
was already fixed at least once in commit e6c495a96ce0 ("mm: fix the TLB
range flushed when __tlb_remove_page() runs out of slots"), but that fix
was not complete.

The problem with the TLB gather virtual address range is that it isn't
set up by the initial tlb_gather_mmu() initialization (which didn't get
the TLB range information), but it is set up ad-hoc later by the
functions that actually flush the TLB.  And so any such case that forgot
to update the TLB range entries would potentially miss TLB invalidates.

Rather than try to figure out exactly which particular ad-hoc range
setup was missing (I personally suspect it's the hugetlb case in
zap_huge_pmd(), which didn't have the same logic as zap_pte_range()
did), this patch just gets rid of the problem at the source: make the
TLB range information available to tlb_gather_mmu(), and initialize it
when initializing all the other tlb gather fields.

This makes the patch larger, but conceptually much simpler.  And the end
result is much more understandable; even if you want to play games with
partial ranges when invalidating the TLB contents in chunks, now the
range information is always there, and anybody who doesn't want to
bother with it won't introduce subtle bugs.

Ben verified that this fixes his problem.

Reported-bisected-and-tested-by: Ben Tebulin <tebulin@googlemail.com>
Build-testing-by: Stephen Rothwell <sfr@canb.auug.org.au>
Build-testing-by: Richard Weinberger <richard.weinberger@gmail.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 2b047252d087be7f2ba088b4933cd904f92e6fce
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[imaund@codeaurora.org: resolve merge conflicts]
Signed-off-by: Ian Maund <imaund@codeaurora.org>
2014-02-07 13:49:43 -08:00
Al Viro 031325e245 consolidate io_remap_pfn_range definitions
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Git-commit: 40d158e61840fbbe23be3f37302a3ca237c15491
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Ian Maund <imaund@codeaurora.org>
2014-02-07 13:49:39 -08:00
Kees Cook d389d610b1 exec/ptrace: fix get_dumpable() incorrect tests
commit d049f74f2dbe71354d43d393ac3a188947811348 upstream.

The get_dumpable() return value is not boolean.  Most users of the
function actually want to be testing for non-SUID_DUMP_USER(1) rather than
SUID_DUMP_DISABLE(0).  The SUID_DUMP_ROOT(2) is also considered a
protected state.  Almost all places did this correctly, excepting the two
places fixed in this patch.

Wrong logic:
    if (dumpable == SUID_DUMP_DISABLE) { /* be protective */ }
        or
    if (dumpable == 0) { /* be protective */ }
        or
    if (!dumpable) { /* be protective */ }

Correct logic:
    if (dumpable != SUID_DUMP_USER) { /* be protective */ }
        or
    if (dumpable != 1) { /* be protective */ }

Without this patch, if the system had set the sysctl fs/suid_dumpable=2, a
user was able to ptrace attach to processes that had dropped privileges to
that user.  (This may have been partially mitigated if Yama was enabled.)

The macros have been moved into the file that declares get/set_dumpable(),
which means things like the ia64 code can see them too.

CVE-2013-2929

Reported-by: Vasily Kulikov <segoon@openwall.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-29 11:11:44 -08:00
Linus Torvalds 8e220cfd1a Fix TLB gather virtual address range invalidation corner cases
commit 2b047252d087be7f2ba088b4933cd904f92e6fce upstream.

Ben Tebulin reported:

 "Since v3.7.2 on two independent machines a very specific Git
  repository fails in 9/10 cases on git-fsck due to an SHA1/memory
  failures.  This only occurs on a very specific repository and can be
  reproduced stably on two independent laptops.  Git mailing list ran
  out of ideas and for me this looks like some very exotic kernel issue"

and bisected the failure to the backport of commit 53a59fc67f ("mm:
limit mmu_gather batching to fix soft lockups on !CONFIG_PREEMPT").

That commit itself is not actually buggy, but what it does is to make it
much more likely to hit the partial TLB invalidation case, since it
introduces a new case in tlb_next_batch() that previously only ever
happened when running out of memory.

The real bug is that the TLB gather virtual memory range setup is subtly
buggered.  It was introduced in commit 597e1c3580 ("mm/mmu_gather:
enable tlb flush range in generic mmu_gather"), and the range handling
was already fixed at least once in commit e6c495a96ce0 ("mm: fix the TLB
range flushed when __tlb_remove_page() runs out of slots"), but that fix
was not complete.

The problem with the TLB gather virtual address range is that it isn't
set up by the initial tlb_gather_mmu() initialization (which didn't get
the TLB range information), but it is set up ad-hoc later by the
functions that actually flush the TLB.  And so any such case that forgot
to update the TLB range entries would potentially miss TLB invalidates.

Rather than try to figure out exactly which particular ad-hoc range
setup was missing (I personally suspect it's the hugetlb case in
zap_huge_pmd(), which didn't have the same logic as zap_pte_range()
did), this patch just gets rid of the problem at the source: make the
TLB range information available to tlb_gather_mmu(), and initialize it
when initializing all the other tlb gather fields.

This makes the patch larger, but conceptually much simpler.  And the end
result is much more understandable; even if you want to play games with
partial ranges when invalidating the TLB contents in chunks, now the
range information is always there, and anybody who doesn't want to
bother with it won't introduce subtle bugs.

Ben verified that this fixes his problem.

Reported-bisected-and-tested-by: Ben Tebulin <tebulin@googlemail.com>
Build-testing-by: Stephen Rothwell <sfr@canb.auug.org.au>
Build-testing-by: Richard Weinberger <richard.weinberger@gmail.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-20 08:43:05 -07:00
David Daney f75773103d [IA64] Fix include dependency in asm/irqflags.h
asm/kregs.h isn't always included first, so we need an explicit include.

[Fix build breakage introduced by f21afc25f9
 smp.h: Use local_irq_{save,restore}() in !SMP version of on_each_cpu().]

Signed-off-by: David Daney <david.daney@cavium.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-06-17 13:39:52 -07:00
Peter Zijlstra 29eb77825c arch, mm: Remove tlb_fast_mode()
Since the introduction of preemptible mmu_gather TLB fast mode has been
broken. TLB fast mode relies on there being absolutely no concurrency;
it frees pages first and invalidates TLBs later.

However now we can get concurrency and stuff goes *bang*.

This patch removes all tlb_fast_mode() code; it was found the better
option vs trying to patch the hole by entangling tlb invalidation with
the scheduler.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Tony Luck <tony.luck@intel.com>
Reported-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-06-06 10:07:26 +09:00
Linus Torvalds 01227a889e Merge tag 'kvm-3.10-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Gleb Natapov:
 "Highlights of the updates are:

  general:
   - new emulated device API
   - legacy device assignment is now optional
   - irqfd interface is more generic and can be shared between arches

  x86:
   - VMCS shadow support and other nested VMX improvements
   - APIC virtualization and Posted Interrupt hardware support
   - Optimize mmio spte zapping

  ppc:
    - BookE: in-kernel MPIC emulation with irqfd support
    - Book3S: in-kernel XICS emulation (incomplete)
    - Book3S: HV: migration fixes
    - BookE: more debug support preparation
    - BookE: e6500 support

  ARM:
   - reworking of Hyp idmaps

  s390:
   - ioeventfd for virtio-ccw

  And many other bug fixes, cleanups and improvements"

* tag 'kvm-3.10-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (204 commits)
  kvm: Add compat_ioctl for device control API
  KVM: x86: Account for failing enable_irq_window for NMI window request
  KVM: PPC: Book3S: Add API for in-kernel XICS emulation
  kvm/ppc/mpic: fix missing unlock in set_base_addr()
  kvm/ppc: Hold srcu lock when calling kvm_io_bus_read/write
  kvm/ppc/mpic: remove users
  kvm/ppc/mpic: fix mmio region lists when multiple guests used
  kvm/ppc/mpic: remove default routes from documentation
  kvm: KVM_CAP_IOMMU only available with device assignment
  ARM: KVM: iterate over all CPUs for CPU compatibility check
  KVM: ARM: Fix spelling in error message
  ARM: KVM: define KVM_ARM_MAX_VCPUS unconditionally
  KVM: ARM: Fix API documentation for ONE_REG encoding
  ARM: KVM: promote vfp_host pointer to generic host cpu context
  ARM: KVM: add architecture specific hook for capabilities
  ARM: KVM: perform HYP initilization for hotplugged CPUs
  ARM: KVM: switch to a dual-step HYP init code
  ARM: KVM: rework HYP page table freeing
  ARM: KVM: enforce maximum size for identity mapped code
  ARM: KVM: move to a KVM provided HYP idmap
  ...
2013-05-05 14:47:31 -07:00
Linus Torvalds 73287a43cc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Highlights (1721 non-merge commits, this has to be a record of some
  sort):

   1) Add 'random' mode to team driver, from Jiri Pirko and Eric
      Dumazet.

   2) Make it so that any driver that supports configuration of multiple
      MAC addresses can provide the forwarding database add and del
      calls by providing a default implementation and hooking that up if
      the driver doesn't have an explicit set of handlers.  From Vlad
      Yasevich.

   3) Support GSO segmentation over tunnels and other encapsulating
      devices such as VXLAN, from Pravin B Shelar.

   4) Support L2 GRE tunnels in the flow dissector, from Michael Dalton.

   5) Implement Tail Loss Probe (TLP) detection in TCP, from Nandita
      Dukkipati.

   6) In the PHY layer, allow supporting wake-on-lan in situations where
      the PHY registers have to be written for it to be configured.

      Use it to support wake-on-lan in mv643xx_eth.

      From Michael Stapelberg.

   7) Significantly improve firewire IPV6 support, from YOSHIFUJI
      Hideaki.

   8) Allow multiple packets to be sent in a single transmission using
      network coding in batman-adv, from Martin Hundebøll.

   9) Add support for T5 cxgb4 chips, from Santosh Rastapur.

  10) Generalize the VXLAN forwarding tables so that there is more
      flexibility in configurating various aspects of the endpoints.
      From David Stevens.

  11) Support RSS and TSO in hardware over GRE tunnels in bxn2x driver,
      from Dmitry Kravkov.

  12) Zero copy support in nfnelink_queue, from Eric Dumazet and Pablo
      Neira Ayuso.

  13) Start adding networking selftests.

  14) In situations of overload on the same AF_PACKET fanout socket, or
      per-cpu packet receive queue, minimize drop by distributing the
      load to other cpus/fanouts.  From Willem de Bruijn and Eric
      Dumazet.

  15) Add support for new payload offset BPF instruction, from Daniel
      Borkmann.

  16) Convert several drivers over to mdoule_platform_driver(), from
      Sachin Kamat.

  17) Provide a minimal BPF JIT image disassembler userspace tool, from
      Daniel Borkmann.

  18) Rewrite F-RTO implementation in TCP to match the final
      specification of it in RFC4138 and RFC5682.  From Yuchung Cheng.

  19) Provide netlink socket diag of netlink sockets ("Yo dawg, I hear
      you like netlink, so I implemented netlink dumping of netlink
      sockets.") From Andrey Vagin.

  20) Remove ugly passing of rtnetlink attributes into rtnl_doit
      functions, from Thomas Graf.

  21) Allow userspace to be able to see if a configuration change occurs
      in the middle of an address or device list dump, from Nicolas
      Dichtel.

  22) Support RFC3168 ECN protection for ipv6 fragments, from Hannes
      Frederic Sowa.

  23) Increase accuracy of packet length used by packet scheduler, from
      Jason Wang.

  24) Beginning set of changes to make ipv4/ipv6 fragment handling more
      scalable and less susceptible to overload and locking contention,
      from Jesper Dangaard Brouer.

  25) Get rid of using non-type-safe NLMSG_* macros and use nlmsg_*()
      instead.  From Hong Zhiguo.

  26) Optimize route usage in IPVS by avoiding reference counting where
      possible, from Julian Anastasov.

  27) Convert IPVS schedulers to RCU, also from Julian Anastasov.

  28) Support cpu fanouts in xt_NFQUEUE netfilter target, from Holger
      Eitzenberger.

  29) Network namespace support for nf_log, ebt_log, xt_LOG, ipt_ULOG,
      nfnetlink_log, and nfnetlink_queue.  From Gao feng.

  30) Implement RFC3168 ECN protection, from Hannes Frederic Sowa.

  31) Support several new r8169 chips, from Hayes Wang.

  32) Support tokenized interface identifiers in ipv6, from Daniel
      Borkmann.

  33) Use usbnet_link_change() helper in USB net driver, from Ming Lei.

  34) Add 802.1ad vlan offload support, from Patrick McHardy.

  35) Support mmap() based netlink communication, also from Patrick
      McHardy.

  36) Support HW timestamping in mlx4 driver, from Amir Vadai.

  37) Rationalize AF_PACKET packet timestamping when transmitting, from
      Willem de Bruijn and Daniel Borkmann.

  38) Bring parity to what's provided by /proc/net/packet socket dumping
      and the info provided by netlink socket dumping of AF_PACKET
      sockets.  From Nicolas Dichtel.

  39) Fix peeking beyond zero sized SKBs in AF_UNIX, from Benjamin
      Poirier"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1722 commits)
  filter: fix va_list build error
  af_unix: fix a fatal race with bit fields
  bnx2x: Prevent memory leak when cnic is absent
  bnx2x: correct reading of speed capabilities
  net: sctp: attribute printl with __printf for gcc fmt checks
  netlink: kconfig: move mmap i/o into netlink kconfig
  netpoll: convert mutex into a semaphore
  netlink: Fix skb ref counting.
  net_sched: act_ipt forward compat with xtables
  mlx4_en: fix a build error on 32bit arches
  Revert "bnx2x: allow nvram test to run when device is down"
  bridge: avoid OOPS if root port not found
  drivers: net: cpsw: fix kernel warn on cpsw irq enable
  sh_eth: use random MAC address if no valid one supplied
  3c509.c: call SET_NETDEV_DEV for all device types (ISA/ISAPnP/EISA)
  tg3: fix to append hardware time stamping flags
  unix/stream: fix peeking with an offset larger than data in queue
  unix/dgram: fix peeking with an offset larger than data in queue
  unix/dgram: peek beyond 0-sized skbs
  openvswitch: Remove unneeded ovs_netdev_get_ifindex()
  ...
2013-05-01 14:08:52 -07:00
Linus Torvalds 08d7676083 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal
Pull compat cleanup from Al Viro:
 "Mostly about syscall wrappers this time; there will be another pile
  with patches in the same general area from various people, but I'd
  rather push those after both that and vfs.git pile are in."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
  syscalls.h: slightly reduce the jungles of macros
  get rid of union semop in sys_semctl(2) arguments
  make do_mremap() static
  sparc: no need to sign-extend in sync_file_range() wrapper
  ppc compat wrappers for add_key(2) and request_key(2) are pointless
  x86: trim sys_ia32.h
  x86: sys32_kill and sys32_mprotect are pointless
  get rid of compat_sys_semctl() and friends in case of ARCH_WANT_OLD_COMPAT_IPC
  merge compat sys_ipc instances
  consolidate compat lookup_dcookie()
  convert vmsplice to COMPAT_SYSCALL_DEFINE
  switch getrusage() to COMPAT_SYSCALL_DEFINE
  switch epoll_pwait to COMPAT_SYSCALL_DEFINE
  convert sendfile{,64} to COMPAT_SYSCALL_DEFINE
  switch signalfd{,4}() to COMPAT_SYSCALL_DEFINE
  make SYSCALL_DEFINE<n>-generated wrappers do asmlinkage_protect
  make HAVE_SYSCALL_WRAPPERS unconditional
  consolidate cond_syscall and SYSCALL_ALIAS declarations
  teach SYSCALL_DEFINE<n> how to deal with long long/unsigned long long
  get rid of duplicate logics in __SC_....[1-6] definitions
2013-05-01 07:21:43 -07:00
Linus Torvalds 8700c95adb Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull SMP/hotplug changes from Ingo Molnar:
 "This is a pretty large, multi-arch series unifying and generalizing
  the various disjunct pieces of idle routines that architectures have
  historically copied from each other and have grown in random, wildly
  inconsistent and sometimes buggy directions:

   101 files changed, 455 insertions(+), 1328 deletions(-)

  this went through a number of review and test iterations before it was
  committed, it was tested on various architectures, was exposed to
  linux-next for quite some time - nevertheless it might cause problems
  on architectures that don't read the mailing lists and don't regularly
  test linux-next.

  This cat herding excercise was motivated by the -rt kernel, and was
  brought to you by Thomas "the Whip" Gleixner."

* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
  idle: Remove GENERIC_IDLE_LOOP config switch
  um: Use generic idle loop
  ia64: Make sure interrupts enabled when we "safe_halt()"
  sparc: Use generic idle loop
  idle: Remove unused ARCH_HAS_DEFAULT_IDLE
  bfin: Fix typo in arch_cpu_idle()
  xtensa: Use generic idle loop
  x86: Use generic idle loop
  unicore: Use generic idle loop
  tile: Use generic idle loop
  tile: Enter idle with preemption disabled
  sh: Use generic idle loop
  score: Use generic idle loop
  s390: Use generic idle loop
  powerpc: Use generic idle loop
  parisc: Use generic idle loop
  openrisc: Use generic idle loop
  mn10300: Use generic idle loop
  mips: Use generic idle loop
  microblaze: Use generic idle loop
  ...
2013-04-30 07:50:17 -07:00
Gerald Schaefer 106c992a5e mm/hugetlb: add more arch-defined huge_pte functions
Commit abf09bed3c ("s390/mm: implement software dirty bits")
introduced another difference in the pte layout vs.  the pmd layout on
s390, thoroughly breaking the s390 support for hugetlbfs.  This requires
replacing some more pte_xxx functions in mm/hugetlbfs.c with a
huge_pte_xxx version.

This patch introduces those huge_pte_xxx functions and their generic
implementation in asm-generic/hugetlb.h, which will now be included on
all architectures supporting hugetlbfs apart from s390.  This change
will be a no-op for those architectures.

[akpm@linux-foundation.org: fix warning]
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Hillf Danton <dhillf@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.cz>	[for !s390 parts]
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29 15:54:33 -07:00
Alex Williamson 2a5bab1004 kvm: Allow build-time configuration of KVM device assignment
We hope to at some point deprecate KVM legacy device assignment in
favor of VFIO-based assignment.  Towards that end, allow legacy
device assignment to be deconfigured.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-04-28 12:58:56 +03:00
Alexander Graf 22e64024fb KVM: IA64: Carry non-ia64 changes into ia64
We changed a few things in non-ia64 code paths. This patch blindly applies
the changes to the ia64 code as well, hoping it proves useful in case anyone
revives the ia64 kvm code.

Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26 20:27:27 +02:00
Luck, Tony 2412aa1293 ia64: Make sure interrupts enabled when we "safe_halt()"
In commit d166991234
   idle: Implement generic idle function
Thomas Gleixner cleaned up many things but perturbed some
fragile code that was keeping ia64 alive. So we started
seeing:
   WARNING: at kernel/cpu/idle.c:94 cpu_idle_loop+0x360/0x380()
and other unpleasantness like system hangs during boot.

We really shouldn't ever halt with interrupts disabled.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: magnus.damm@gmail.com
Link: http://lkml.kernel.org/r/516d9a0c26048eae9c@agluck-desk.sc.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-04-17 10:39:37 +02:00
Thomas Gleixner ee761f629d arch: Consolidate tsk_is_polling()
Move it to a common place. Preparatory patch for implementing
set/clear for the idle need_resched poll implementation.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Magnus Damm <magnus.damm@gmail.com>
Link: http://lkml.kernel.org/r/20130321215233.446034505@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-04-08 17:39:22 +02:00
Yijing Wang eee46b3de7 Fix build error for numa_clear_node() under IA64
numa_clear_node() function is not implemented under IA64,
it will be called in unmap_cpu_on_node() in mm/memory_hotplug.c.
This cause build error under IA64, this patch adds numa_clear_node()
in IA64 to fix this problem.

[Added __cpuinit notation to numa_clear_node() to keep linker happy -Tony]

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-04-02 09:39:48 -07:00
Tony Luck d303e9e98f Fix initialization of CMCI/CMCP interrupts
Back 2010 during a revamp of the irq code some initializations
were moved from ia64_mca_init() to ia64_mca_late_init() in

	commit c75f2aa13f
	Cannot use register_percpu_irq() from ia64_mca_init()

But this was hideously wrong. First of all these initializations
are now down far too late. Specifically after all the other cpus
have been brought up and initialized their own CMC vectors from
smp_callin(). Also ia64_mca_late_init() may be called from any cpu
so the line:
	ia64_mca_cmc_vector_setup();       /* Setup vector on BSP */
is generally not executed on the BSP, and so the CMC vector isn't
setup at all on that processor.

Make use of the arch_early_irq_init() hook to get this code executed
at just the right moment: not too early, not too late.

Reported-by: Fred Hartnett <fred.hartnett@hp.com>
Tested-by: Fred Hartnett <fred.hartnett@hp.com>
Cc: stable@kernel.org # v2.6.37+
Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-04-02 09:37:06 -07:00
Keller, Jacob E 7d4c04fc17 net: add option to enable error queue packets waking select
Currently, when a socket receives something on the error queue it only wakes up
the socket on select if it is in the "read" list, that is the socket has
something to read. It is useful also to wake the socket if it is in the error
list, which would enable software to wait on error queue packets without waking
up for regular data on the socket. The main use case is for receiving
timestamped transmit packets which return the timestamp to the socket via the
error queue. This enables an application to select on the socket for the error
queue only instead of for the regular traffic.

-v2-
* Added the SO_SELECT_ERR_QUEUE socket option to every architechture specific file
* Modified every socket poll function that checks error queue

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jeffrey Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Matthew Vick <matthew.vick@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-31 19:44:20 -04:00
Stephan Schreiber 136f39ddc5 Wrong asm register contraints in the futex implementation
The Linux Kernel contains some inline assembly source code which has
wrong asm register constraints in arch/ia64/include/asm/futex.h.

I observed this on Kernel 3.2.23 but it is also true on the most
recent Kernel 3.9-rc1.

File arch/ia64/include/asm/futex.h:

static inline int
futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
			      u32 oldval, u32 newval)
{
	if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
		return -EFAULT;

	{
		register unsigned long r8 __asm ("r8");
		unsigned long prev;
		__asm__ __volatile__(
			"	mf;;					\n"
			"	mov %0=r0				\n"
			"	mov ar.ccv=%4;;				\n"
			"[1:]	cmpxchg4.acq %1=[%2],%3,ar.ccv		\n"
			"	.xdata4 \"__ex_table\", 1b-., 2f-.	\n"
			"[2:]"
			: "=r" (r8), "=r" (prev)
			: "r" (uaddr), "r" (newval),
			  "rO" ((long) (unsigned) oldval)
			: "memory");
		*uval = prev;
		return r8;
	}
}

The list of output registers is
			: "=r" (r8), "=r" (prev)
The constraint "=r" means that the GCC has to maintain that these vars
are in registers and contain valid info when the program flow leaves
the assembly block (output registers).
But "=r" also means that GCC can put them in registers that are used
as input registers. Input registers are uaddr, newval, oldval on the
example.
The second assembly instruction
			"	mov %0=r0				\n"
is the first one which writes to a register; it sets %0 to 0. %0 means
the first register operand; it is r8 here. (The r0 is read-only and
always 0 on the Itanium; it can be used if an immediate zero value is
needed.)
This instruction might overwrite one of the other registers which are
still needed.
Whether it really happens depends on how GCC decides what registers it
uses and how it optimizes the code.

The objdump utility can give us disassembly.
The futex_atomic_cmpxchg_inatomic() function is inline, so we have to
look for a module that uses the funtion. This is the
cmpxchg_futex_value_locked() function in
kernel/futex.c:

static int cmpxchg_futex_value_locked(u32 *curval, u32 __user *uaddr,
				      u32 uval, u32 newval)
{
	int ret;

	pagefault_disable();
	ret = futex_atomic_cmpxchg_inatomic(curval, uaddr, uval, newval);
	pagefault_enable();

	return ret;
}

Now the disassembly. At first from the Kernel package 3.2.23 which has
been compiled with GCC 4.4, remeber this Kernel seemed to work:
objdump -d linux-3.2.23/debian/build/build_ia64_none_mckinley/kernel/futex.o

0000000000000230 <cmpxchg_futex_value_locked>:
      230:	0b 18 80 1b 18 21 	[MMI]       adds r3=3168,r13;;
      236:	80 40 0d 00 42 00 	            adds r8=40,r3
      23c:	00 00 04 00       	            nop.i 0x0;;
      240:	0b 50 00 10 10 10 	[MMI]       ld4 r10=[r8];;
      246:	90 08 28 00 42 00 	            adds r9=1,r10
      24c:	00 00 04 00       	            nop.i 0x0;;
      250:	09 00 00 00 01 00 	[MMI]       nop.m 0x0
      256:	00 48 20 20 23 00 	            st4 [r8]=r9
      25c:	00 00 04 00       	            nop.i 0x0;;
      260:	08 10 80 06 00 21 	[MMI]       adds r2=32,r3
      266:	00 00 00 02 00 00 	            nop.m 0x0
      26c:	02 08 f1 52       	            extr.u r16=r33,0,61
      270:	05 40 88 00 08 e0 	[MLX]       addp4 r8=r34,r0
      276:	ff ff 0f 00 00 e0 	            movl r15=0xfffffffbfff;;
      27c:	f1 f7 ff 65
      280:	09 70 00 04 18 10 	[MMI]       ld8 r14=[r2]
      286:	00 00 00 02 00 c0 	            nop.m 0x0
      28c:	f0 80 1c d0       	            cmp.ltu p6,p7=r15,r16;;
      290:	08 40 fc 1d 09 3b 	[MMI]       cmp.eq p8,p9=-1,r14
      296:	00 00 00 02 00 40 	            nop.m 0x0
      29c:	e1 08 2d d0       	            cmp.ltu p10,p11=r14,r33
      2a0:	56 01 10 00 40 10 	[BBB] (p10) br.cond.spnt.few 2e0
<cmpxchg_futex_value_locked+0xb0>
      2a6:	02 08 00 80 21 03 	      (p08) br.cond.dpnt.few 2b0
<cmpxchg_futex_value_locked+0x80>
      2ac:	40 00 00 41       	      (p06) br.cond.spnt.few 2e0
<cmpxchg_futex_value_locked+0xb0>
      2b0:	0a 00 00 00 22 00 	[MMI]       mf;;
      2b6:	80 00 00 00 42 00 	            mov r8=r0
      2bc:	00 00 04 00       	            nop.i 0x0
      2c0:	0b 00 20 40 2a 04 	[MMI]       mov.m ar.ccv=r8;;
      2c6:	10 1a 85 22 20 00 	            cmpxchg4.acq r33=[r33],r35,ar.ccv
      2cc:	00 00 04 00       	            nop.i 0x0;;
      2d0:	10 00 84 40 90 11 	[MIB]       st4 [r32]=r33
      2d6:	00 00 00 02 00 00 	            nop.i 0x0
      2dc:	20 00 00 40       	            br.few 2f0
<cmpxchg_futex_value_locked+0xc0>
      2e0:	09 40 c8 f9 ff 27 	[MMI]       mov r8=-14
      2e6:	00 00 00 02 00 00 	            nop.m 0x0
      2ec:	00 00 04 00       	            nop.i 0x0;;
      2f0:	0b 58 20 1a 19 21 	[MMI]       adds r11=3208,r13;;
      2f6:	20 01 2c 20 20 00 	            ld4 r18=[r11]
      2fc:	00 00 04 00       	            nop.i 0x0;;
      300:	0b 88 fc 25 3f 23 	[MMI]       adds r17=-1,r18;;
      306:	00 88 2c 20 23 00 	            st4 [r11]=r17
      30c:	00 00 04 00       	            nop.i 0x0;;
      310:	11 00 00 00 01 00 	[MIB]       nop.m 0x0
      316:	00 00 00 02 00 80 	            nop.i 0x0
      31c:	08 00 84 00       	            br.ret.sptk.many b0;;

The lines
      2b0:	0a 00 00 00 22 00 	[MMI]       mf;;
      2b6:	80 00 00 00 42 00 	            mov r8=r0
      2bc:	00 00 04 00       	            nop.i 0x0
      2c0:	0b 00 20 40 2a 04 	[MMI]       mov.m ar.ccv=r8;;
      2c6:	10 1a 85 22 20 00 	            cmpxchg4.acq r33=[r33],r35,ar.ccv
      2cc:	00 00 04 00       	            nop.i 0x0;;
are the instructions of the assembly block.
The line
      2b6:	80 00 00 00 42 00 	            mov r8=r0
sets the r8 register to 0 and after that
      2c0:	0b 00 20 40 2a 04 	[MMI]       mov.m ar.ccv=r8;;
prepares the 'oldvalue' for the cmpxchg but it takes it from r8. This
is wrong.
What happened here is what I explained above: An input register is
overwritten which is still needed.
The register operand constraints in futex.h are wrong.

(The problem doesn't occur when the Kernel is compiled with GCC 4.6.)

The attached patch fixes the register operand constraints in futex.h.
The code after patching of it:

static inline int
futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
			      u32 oldval, u32 newval)
{
	if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
		return -EFAULT;

	{
		register unsigned long r8 __asm ("r8") = 0;
		unsigned long prev;
		__asm__ __volatile__(
			"	mf;;					\n"
			"	mov ar.ccv=%4;;				\n"
			"[1:]	cmpxchg4.acq %1=[%2],%3,ar.ccv		\n"
			"	.xdata4 \"__ex_table\", 1b-., 2f-.	\n"
			"[2:]"
			: "+r" (r8), "=&r" (prev)
			: "r" (uaddr), "r" (newval),
			  "rO" ((long) (unsigned) oldval)
			: "memory");
		*uval = prev;
		return r8;
	}
}

I also initialized the 'r8' var with the C programming language.
The _asm qualifier on the definition of the 'r8' var forces GCC to use
the r8 processor register for it.
I don't believe that we should use inline assembly for zeroing out a
local variable.
The constraint is
"+r" (r8)
what means that it is both an input register and an output register.
Note that the page fault handler will modify the r8 register which
will be the return value of the function.
The real fix is
"=&r" (prev)
The & means that GCC must not use any of the input registers to place
this output register in.

Patched the Kernel 3.2.23 and compiled it with GCC4.4:

0000000000000230 <cmpxchg_futex_value_locked>:
      230:	0b 18 80 1b 18 21 	[MMI]       adds r3=3168,r13;;
      236:	80 40 0d 00 42 00 	            adds r8=40,r3
      23c:	00 00 04 00       	            nop.i 0x0;;
      240:	0b 50 00 10 10 10 	[MMI]       ld4 r10=[r8];;
      246:	90 08 28 00 42 00 	            adds r9=1,r10
      24c:	00 00 04 00       	            nop.i 0x0;;
      250:	09 00 00 00 01 00 	[MMI]       nop.m 0x0
      256:	00 48 20 20 23 00 	            st4 [r8]=r9
      25c:	00 00 04 00       	            nop.i 0x0;;
      260:	08 10 80 06 00 21 	[MMI]       adds r2=32,r3
      266:	20 12 01 10 40 00 	            addp4 r34=r34,r0
      26c:	02 08 f1 52       	            extr.u r16=r33,0,61
      270:	05 40 00 00 00 e1 	[MLX]       mov r8=r0
      276:	ff ff 0f 00 00 e0 	            movl r15=0xfffffffbfff;;
      27c:	f1 f7 ff 65
      280:	09 70 00 04 18 10 	[MMI]       ld8 r14=[r2]
      286:	00 00 00 02 00 c0 	            nop.m 0x0
      28c:	f0 80 1c d0       	            cmp.ltu p6,p7=r15,r16;;
      290:	08 40 fc 1d 09 3b 	[MMI]       cmp.eq p8,p9=-1,r14
      296:	00 00 00 02 00 40 	            nop.m 0x0
      29c:	e1 08 2d d0       	            cmp.ltu p10,p11=r14,r33
      2a0:	56 01 10 00 40 10 	[BBB] (p10) br.cond.spnt.few 2e0
<cmpxchg_futex_value_locked+0xb0>
      2a6:	02 08 00 80 21 03 	      (p08) br.cond.dpnt.few 2b0
<cmpxchg_futex_value_locked+0x80>
      2ac:	40 00 00 41       	      (p06) br.cond.spnt.few 2e0
<cmpxchg_futex_value_locked+0xb0>
      2b0:	0b 00 00 00 22 00 	[MMI]       mf;;
      2b6:	00 10 81 54 08 00 	            mov.m ar.ccv=r34
      2bc:	00 00 04 00       	            nop.i 0x0;;
      2c0:	09 58 8c 42 11 10 	[MMI]       cmpxchg4.acq r11=[r33],r35,ar.ccv
      2c6:	00 00 00 02 00 00 	            nop.m 0x0
      2cc:	00 00 04 00       	            nop.i 0x0;;
      2d0:	10 00 2c 40 90 11 	[MIB]       st4 [r32]=r11
      2d6:	00 00 00 02 00 00 	            nop.i 0x0
      2dc:	20 00 00 40       	            br.few 2f0
<cmpxchg_futex_value_locked+0xc0>
      2e0:	09 40 c8 f9 ff 27 	[MMI]       mov r8=-14
      2e6:	00 00 00 02 00 00 	            nop.m 0x0
      2ec:	00 00 04 00       	            nop.i 0x0;;
      2f0:	0b 88 20 1a 19 21 	[MMI]       adds r17=3208,r13;;
      2f6:	30 01 44 20 20 00 	            ld4 r19=[r17]
      2fc:	00 00 04 00       	            nop.i 0x0;;
      300:	0b 90 fc 27 3f 23 	[MMI]       adds r18=-1,r19;;
      306:	00 90 44 20 23 00 	            st4 [r17]=r18
      30c:	00 00 04 00       	            nop.i 0x0;;
      310:	11 00 00 00 01 00 	[MIB]       nop.m 0x0
      316:	00 00 00 02 00 80 	            nop.i 0x0
      31c:	08 00 84 00       	            br.ret.sptk.many b0;;

Much better.
There is a
      270:	05 40 00 00 00 e1 	[MLX]       mov r8=r0
which was generated by C code r8 = 0. Below
      2b6:	00 10 81 54 08 00 	            mov.m ar.ccv=r34
what means that oldval is no longer overwritten.

This is Debian bug#702641
(http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=702641).

The patch is applicable on Kernel 3.9-rc1, 3.2.23 and many other versions.

Signed-off-by: Stephan Schreiber <info@fs-driver.org>
Cc: stable@vger.kernel.org
Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-03-19 16:14:53 -07:00
Al Viro e1b5bb6d12 consolidate cond_syscall and SYSCALL_ALIAS declarations
take them to asm/linkage.h, with default in linux/linkage.h

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-03-03 22:55:19 -05:00
Linus Torvalds d895cb1af1 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs pile (part one) from Al Viro:
 "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
  locking violations, etc.

  The most visible changes here are death of FS_REVAL_DOT (replaced with
  "has ->d_weak_revalidate()") and a new helper getting from struct file
  to inode.  Some bits of preparation to xattr method interface changes.

  Misc patches by various people sent this cycle *and* ocfs2 fixes from
  several cycles ago that should've been upstream right then.

  PS: the next vfs pile will be xattr stuff."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
  saner proc_get_inode() calling conventions
  proc: avoid extra pde_put() in proc_fill_super()
  fs: change return values from -EACCES to -EPERM
  fs/exec.c: make bprm_mm_init() static
  ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
  ocfs2: fix possible use-after-free with AIO
  ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
  get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
  target: writev() on single-element vector is pointless
  export kernel_write(), convert open-coded instances
  fs: encode_fh: return FILEID_INVALID if invalid fid_type
  kill f_vfsmnt
  vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
  nfsd: handle vfs_getattr errors in acl protocol
  switch vfs_getattr() to struct path
  default SET_PERSONALITY() in linux/elf.h
  ceph: prepopulate inodes only when request is aborted
  d_hash_and_lookup(): export, switch open-coded instances
  9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
  9p: split dropping the acls from v9fs_set_create_acl()
  ...
2013-02-26 20:16:07 -08:00
Al Viro e72837e3e7 default SET_PERSONALITY() in linux/elf.h
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26 02:46:08 -05:00
Linus Torvalds 89f883372f Merge tag 'kvm-3.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Marcelo Tosatti:
 "KVM updates for the 3.9 merge window, including x86 real mode
  emulation fixes, stronger memory slot interface restrictions, mmu_lock
  spinlock hold time reduction, improved handling of large page faults
  on shadow, initial APICv HW acceleration support, s390 channel IO
  based virtio, amongst others"

* tag 'kvm-3.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (143 commits)
  Revert "KVM: MMU: lazily drop large spte"
  x86: pvclock kvm: align allocation size to page size
  KVM: nVMX: Remove redundant get_vmcs12 from nested_vmx_exit_handled_msr
  x86 emulator: fix parity calculation for AAD instruction
  KVM: PPC: BookE: Handle alignment interrupts
  booke: Added DBCR4 SPR number
  KVM: PPC: booke: Allow multiple exception types
  KVM: PPC: booke: use vcpu reference from thread_struct
  KVM: Remove user_alloc from struct kvm_memory_slot
  KVM: VMX: disable apicv by default
  KVM: s390: Fix handling of iscs.
  KVM: MMU: cleanup __direct_map
  KVM: MMU: remove pt_access in mmu_set_spte
  KVM: MMU: cleanup mapping-level
  KVM: MMU: lazily drop large spte
  KVM: VMX: cleanup vmx_set_cr0().
  KVM: VMX: add missing exit names to VMX_EXIT_REASONS array
  KVM: VMX: disable SMEP feature when guest is in non-paging mode
  KVM: Remove duplicate text in api.txt
  Revert "KVM: MMU: split kvm_mmu_free_page"
  ...
2013-02-24 13:07:18 -08:00
Linus Torvalds 9e2d59ad58 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal
Pull signal handling cleanups from Al Viro:
 "This is the first pile; another one will come a bit later and will
  contain SYSCALL_DEFINE-related patches.

   - a bunch of signal-related syscalls (both native and compat)
     unified.

   - a bunch of compat syscalls switched to COMPAT_SYSCALL_DEFINE
     (fixing several potential problems with missing argument
     validation, while we are at it)

   - a lot of now-pointless wrappers killed

   - a couple of architectures (cris and hexagon) forgot to save
     altstack settings into sigframe, even though they used the
     (uninitialized) values in sigreturn; fixed.

   - microblaze fixes for delivery of multiple signals arriving at once

   - saner set of helpers for signal delivery introduced, several
     architectures switched to using those."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: (143 commits)
  x86: convert to ksignal
  sparc: convert to ksignal
  arm: switch to struct ksignal * passing
  alpha: pass k_sigaction and siginfo_t using ksignal pointer
  burying unused conditionals
  make do_sigaltstack() static
  arm64: switch to generic old sigaction() (compat-only)
  arm64: switch to generic compat rt_sigaction()
  arm64: switch compat to generic old sigsuspend
  arm64: switch to generic compat rt_sigqueueinfo()
  arm64: switch to generic compat rt_sigpending()
  arm64: switch to generic compat rt_sigprocmask()
  arm64: switch to generic sigaltstack
  sparc: switch to generic old sigsuspend
  sparc: COMPAT_SYSCALL_DEFINE does all sign-extension as well as SYSCALL_DEFINE
  sparc: kill sign-extending wrappers for native syscalls
  kill sparc32_open()
  sparc: switch to use of generic old sigaction
  sparc: switch sys_compat_rt_sigaction() to COMPAT_SYSCALL_DEFINE
  mips: switch to generic sys_fork() and sys_clone()
  ...
2013-02-23 18:50:11 -08:00
Linus Torvalds a0b1c42951 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking update from David Miller:

 1) Checkpoint/restarted TCP sockets now can properly propagate the TCP
    timestamp offset.  From Andrey Vagin.

 2) VMWARE VM VSOCK layer, from Andy King.

 3) Much improved support for virtual functions and SR-IOV in bnx2x,
    from Ariel ELior.

 4) All protocols on ipv4 and ipv6 are now network namespace aware, and
    all the compatability checks for initial-namespace-only protocols is
    removed.  Thanks to Tom Parkin for helping deal with the last major
    holdout, L2TP.

 5) IPV6 support in netpoll and network namespace support in pktgen,
    from Cong Wang.

 6) Multiple Registration Protocol (MRP) and Multiple VLAN Registration
    Protocol (MVRP) support, from David Ward.

 7) Compute packet lengths more accurately in the packet scheduler, from
    Eric Dumazet.

 8) Use per-task page fragment allocator in skb_append_datato_frags(),
    also from Eric Dumazet.

 9) Add support for connection tracking labels in netfilter, from
    Florian Westphal.

10) Fix default multicast group joining on ipv6, and add anti-spoofing
    checks to 6to4 and 6rd.  From Hannes Frederic Sowa.

11) Make ipv4/ipv6 fragmentation memory limits more reasonable in modern
    times, rearrange inet frag datastructures for better cacheline
    locality, and move more operations outside of locking.  From Jesper
    Dangaard Brouer.

12) Instead of strict master <--> slave relationships, allow arbitrary
    scenerios with "upper device lists".  From Jiri Pirko.

13) Improve rate limiting accuracy in TBF and act_police, also from Jiri
    Pirko.

14) Add a BPF filter netfilter match target, from Willem de Bruijn.

15) Orphan and delete a bunch of pre-historic networking drivers from
    Paul Gortmaker.

16) Add TSO support for GRE tunnels, from Pravin B SHelar.  Although
    this still needs some minor bug fixing before it's %100 correct in
    all cases.

17) Handle unresolved IPSEC states like ARP, with a resolution packet
    queue.  From Steffen Klassert.

18) Remove TCP Appropriate Byte Count support (ABC), from Stephen
    Hemminger.  This was long overdue.

19) Support SO_REUSEPORT, from Tom Herbert.

20) Allow locking a socket BPF filter, so that it cannot change after a
    process drops capabilities.

21) Add VLAN filtering to bridge, from Vlad Yasevich.

22) Bring ipv6 on-par with ipv4 and do not cache neighbour entries in
    the ipv6 routes, from YOSHIFUJI Hideaki.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1538 commits)
  ipv6: fix race condition regarding dst->expires and dst->from.
  net: fix a wrong assignment in skb_split()
  ip_gre: remove an extra dst_release()
  ppp: set qdisc_tx_busylock to avoid LOCKDEP splat
  atl1c: restore buffer state
  net: fix a build failure when !CONFIG_PROC_FS
  net: ipv4: fix waring -Wunused-variable
  net: proc: fix build failed when procfs is not configured
  Revert "xen: netback: remove redundant xenvif_put"
  net: move procfs code to net/core/net-procfs.c
  qmi_wwan, cdc-ether: add ADU960S
  bonding: set sysfs device_type to 'bond'
  bonding: fix bond_release_all inconsistencies
  b44: use netdev_alloc_skb_ip_align()
  xen: netback: remove redundant xenvif_put
  net: fec: Do a sanity check on the gpio number
  ip_gre: propogate target device GSO capability to the tunnel device
  ip_gre: allow CSUM capable devices to handle packets
  bonding: Fix initialize after use for 3ad machine state spinlock
  bonding: Fix race condition between bond_enslave() and bond_3ad_update_lacp_rate()
  ...
2013-02-20 18:58:50 -08:00
Linus Torvalds 8793422fd9 ACPI and power management updates for 3.9-rc1
- Rework of the ACPI namespace scanning code from Rafael J. Wysocki
   with contributions from Bjorn Helgaas, Jiang Liu, Mika Westerberg,
   Toshi Kani, and Yinghai Lu.
 
 - ACPI power resources handling and ACPI device PM update from
   Rafael J. Wysocki.
 
 - ACPICA update to version 20130117 from Bob Moore and Lv Zheng
   with contributions from Aaron Lu, Chao Guan, Jesper Juhl, and
   Tim Gardner.
 
 - Support for Intel Lynxpoint LPSS from Mika Westerberg.
 
 - cpuidle update from Len Brown including Intel Haswell support, C1
   state for intel_idle, removal of global pm_idle.
 
 - cpuidle fixes and cleanups from Daniel Lezcano.
 
 - cpufreq fixes and cleanups from Viresh Kumar and Fabio Baltieri
   with contributions from Stratos Karafotis and Rickard Andersson.
 
 - Intel P-states driver for Sandy Bridge processors from
   Dirk Brandewie.
 
 - cpufreq driver for Marvell Kirkwood SoCs from Andrew Lunn.
 
 - cpufreq fixes related to ordering issues between acpi-cpufreq and
   powernow-k8 from Borislav Petkov and Matthew Garrett.
 
 - cpufreq support for Calxeda Highbank processors from Mark Langsdorf
   and Rob Herring.
 
 - cpufreq driver for the Freescale i.MX6Q SoC and cpufreq-cpu0 update
   from Shawn Guo.
 
 - cpufreq Exynos fixes and cleanups from Jonghwan Choi, Sachin Kamat,
   and Inderpal Singh.
 
 - Support for "lightweight suspend" from Zhang Rui.
 
 - Removal of the deprecated power trace API from Paul Gortmaker.
 
 - Assorted updates from Andreas Fleig, Colin Ian King,
   Davidlohr Bueso, Joseph Salisbury, Kees Cook, Li Fei,
   Nishanth Menon, ShuoX Liu, Srinivas Pandruvada, Tejun Heo,
   Thomas Renninger, and Yasuaki Ishimatsu.
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQIcBAABAgAGBQJRIsArAAoJEKhOf7ml8uNsD6MP/j7C4NA+GTq6RdwoJt+Yki0K
 9Ep8I4pEuRFoN/oskv24EyQhpGJIk6UxWcJ/DWFBc+1VhmKORta7k2Idv/wlJA77
 s7AcDveA9xcDh+TVfbh87TeuiMSXiSdDZbiaQO+wMizWJAF3F84AnjiAqqqyQcSK
 bA5/Siz/vWlt9PyYDaQtHTVE4lpvPuVcQdYewsdaH2PsmUjvIg/TUzg28CTrdyvv
 eHOdBK9R0/OLQLhzRbL0VOGJ//wEl+HJRO0QEhTKPgdQ1e/VH/4Zu5WSzF8P/x4C
 s2f8U4IKQqulDuDHXtpMpelFm7hRWgsOqZLkcyXLs+0dvSM9CTPO6P0ZaImxUctk
 5daHWEsXUnCErDQawt1mcZP8l6qnxofMQIfLXyPVzvlSnHyToTmrtXa1v2u4AuL/
 hOo4MYWsFNUmRdtGFFGlExGgEDZ4G5NwiYjRBl/6XJ3v4nhnnMbuzxP8scpoe5m1
 8tjroJHZFUUs/mFU/H+oRbHzSzXPmp1sddNaTg4OpVmTn3DDh6ljnFhiItd1Ndw0
 5ldVbSe6ETq5RoK0TbzvQOeVpa9F3JfqbrXLQPqfd2iz/No41LQYG1uShRYuXKuA
 wfEcc+c9VMd3FILu05pGwBnU8VS9VbxTYMz7xDxg6b29Ywnb7u+Q1ycCk2gFYtkS
 E2oZDuyewTJxaskzYsNr
 =wijn
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI and power management updates from Rafael Wysocki:

 - Rework of the ACPI namespace scanning code from Rafael J.  Wysocki
   with contributions from Bjorn Helgaas, Jiang Liu, Mika Westerberg,
   Toshi Kani, and Yinghai Lu.

 - ACPI power resources handling and ACPI device PM update from Rafael
   J Wysocki.

 - ACPICA update to version 20130117 from Bob Moore and Lv Zheng with
   contributions from Aaron Lu, Chao Guan, Jesper Juhl, and Tim Gardner.

 - Support for Intel Lynxpoint LPSS from Mika Westerberg.

 - cpuidle update from Len Brown including Intel Haswell support, C1
   state for intel_idle, removal of global pm_idle.

 - cpuidle fixes and cleanups from Daniel Lezcano.

 - cpufreq fixes and cleanups from Viresh Kumar and Fabio Baltieri with
   contributions from Stratos Karafotis and Rickard Andersson.

 - Intel P-states driver for Sandy Bridge processors from Dirk
   Brandewie.

 - cpufreq driver for Marvell Kirkwood SoCs from Andrew Lunn.

 - cpufreq fixes related to ordering issues between acpi-cpufreq and
   powernow-k8 from Borislav Petkov and Matthew Garrett.

 - cpufreq support for Calxeda Highbank processors from Mark Langsdorf
   and Rob Herring.

 - cpufreq driver for the Freescale i.MX6Q SoC and cpufreq-cpu0 update
   from Shawn Guo.

 - cpufreq Exynos fixes and cleanups from Jonghwan Choi, Sachin Kamat,
   and Inderpal Singh.

 - Support for "lightweight suspend" from Zhang Rui.

 - Removal of the deprecated power trace API from Paul Gortmaker.

 - Assorted updates from Andreas Fleig, Colin Ian King, Davidlohr Bueso,
   Joseph Salisbury, Kees Cook, Li Fei, Nishanth Menon, ShuoX Liu,
   Srinivas Pandruvada, Tejun Heo, Thomas Renninger, and Yasuaki
   Ishimatsu.

* tag 'pm+acpi-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (267 commits)
  PM idle: remove global declaration of pm_idle
  unicore32 idle: delete stray pm_idle comment
  openrisc idle: delete pm_idle
  mn10300 idle: delete pm_idle
  microblaze idle: delete pm_idle
  m32r idle: delete pm_idle, and other dead idle code
  ia64 idle: delete pm_idle
  cris idle: delete idle and pm_idle
  ARM64 idle: delete pm_idle
  ARM idle: delete pm_idle
  blackfin idle: delete pm_idle
  sparc idle: rename pm_idle to sparc_idle
  sh idle: rename global pm_idle to static sh_idle
  x86 idle: rename global pm_idle to static x86_idle
  APM idle: register apm_cpu_idle via cpuidle
  cpufreq / intel_pstate: Add kernel command line option disable intel_pstate.
  cpufreq / intel_pstate: Change to disallow module build
  tools/power turbostat: display SMI count by default
  intel_idle: export both C1 and C1E
  ACPI / hotplug: Fix concurrency issues and memory leaks
  ...
2013-02-20 11:26:56 -08:00
Al Viro d64008a8f3 burying unused conditionals
__ARCH_WANT_SYS_RT_SIGACTION,
__ARCH_WANT_SYS_RT_SIGSUSPEND,
__ARCH_WANT_COMPAT_SYS_RT_SIGSUSPEND,
__ARCH_WANT_COMPAT_SYS_SCHED_RR_GET_INTERVAL - not used anymore
CONFIG_GENERIC_{SIGALTSTACK,COMPAT_RT_SIG{ACTION,QUEUEINFO,PENDING,PROCMASK}} -
can be assumed always set.
2013-02-14 09:21:15 -05:00
Al Viro 574c4866e3 consolidate kernel-side struct sigaction declarations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-03 15:09:22 -05:00
Al Viro 92a3ce4a1e consolidate declarations of k_sigaction
Only alpha and sparc are unusual - they have ka_restorer in it.
And nobody needs that exposed to userland.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-03 15:09:22 -05:00
Al Viro eaca6eae3e sanitize rt_sigaction() situation a bit
Switch from __ARCH_WANT_SYS_RT_SIGACTION to opposite
(!CONFIG_ODD_RT_SIGACTION); the only two architectures that
need it are alpha and sparc.  The reason for use of CONFIG_...
instead of __ARCH_... is that it's needed only kernel-side
and doing it that way avoids a mess with include order on many
architectures.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-03 15:09:18 -05:00
Frederic Weisbecker abf917cd91 cputime: Generic on-demand virtual cputime accounting
If we want to stop the tick further idle, we need to be
able to account the cputime without using the tick.

Virtual based cputime accounting solves that problem by
hooking into kernel/user boundaries.

However implementing CONFIG_VIRT_CPU_ACCOUNTING require
low level hooks and involves more overhead. But we already
have a generic context tracking subsystem that is required
for RCU needs by archs which plan to shut down the tick
outside idle.

This patch implements a generic virtual based cputime
accounting that relies on these generic kernel/user hooks.

There are some upsides of doing this:

- This requires no arch code to implement CONFIG_VIRT_CPU_ACCOUNTING
if context tracking is already built (already necessary for RCU in full
tickless mode).

- We can rely on the generic context tracking subsystem to dynamically
(de)activate the hooks, so that we can switch anytime between virtual
and tick based accounting. This way we don't have the overhead
of the virtual accounting when the tick is running periodically.

And one downside:

- There is probably more overhead than a native virtual based cputime
accounting. But this relies on hooks that are already set anyway.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
2013-01-27 19:23:27 +01:00
Frederic Weisbecker 39613766e8 cputime: Librarize per nsecs resolution cputime definitions
The full dynticks cputime accounting that we'll soon introduce
will rely on sched_clock(). And its clock can have a per
nanosecond granularity.

To prepare for this, we need to have a cputime_t implementation
that has this precision.

ia64 virtual cputime accounting already uses that granularity
so all we need is to librarize its implementation in the asm
generic headers.

Also librarize the default per jiffy granularity cputime_t
as well so that we can easily pick either implementation
depending on the cputime accounting config we choose.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
2013-01-27 19:23:16 +01:00
Tom Herbert 055dc21a1d soreuseport: infrastructure
Definitions and macros for implementing soreusport.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-23 13:44:00 -05:00
Vincent Bernat d59577b6ff sk-filter: Add ability to lock a socket filter program
While a privileged program can open a raw socket, attach some
restrictive filter and drop its privileges (or send the socket to an
unprivileged program through some Unix socket), the filter can still
be removed or modified by the unprivileged program. This commit adds a
socket option to lock the filter (SO_LOCK_FILTER) preventing any
modification of a socket filter program.

This is similar to OpenBSD BIOCLOCK ioctl on bpf sockets, except even
root is not allowed change/drop the filter.

The state of the lock can be read with getsockopt(). No error is
triggered if the state is not changed. -EPERM is returned when a user
tries to remove the lock or to change/remove the filter while the lock
is active. The check is done directly in sk_attach_filter() and
sk_detach_filter() and does not affect only setsockopt() syscall.

Signed-off-by: Vincent Bernat <bernat@luffy.cx>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-17 03:21:25 -05:00
Lv Zheng 0947c6dee3 ACPICA: Update compilation environment settings.
This patch does not affect the generation of the Linux binary.
This patch decreases 300 lines of 20121018 divergence.diff.

This patch updates architecture specific environment settings for compiling
ACPICA as such enhancement already has been done in ACPICA.

Note that the appended compiler default settings in the
<acpi/platform/acenv.h> will deprecate some of the macros defined in the
architecture specific <asm/acpi.h>. Thus two of the <asm/acpi.h> headers
have been cleaned up in this patch accordingly.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-01-10 12:36:17 +01:00
Linus Torvalds 49569646b2 Driver core __dev* removal patches
Here are the remaining __dev* removal patches against the 3.8-rc2 tree.
 All of these patches were previously sent to the subsystem maintainers,
 most of them were picked up and pushed to you, but there were a number
 that fell through the cracks, and new drivers were added during the
 merge window, so this series cleans up the rest of the instances of
 these markings.
 
 Third time's the charm...
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iEYEABECAAYFAlDmHOIACgkQMUfUDdst+ykTZgCePgK84Im3FFooEXJwaPbaf4ls
 lO4AoMEDoWK+BHWOsjQwFPOwFFPEN2Xh
 =6oAQ
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-3.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core __dev* removal patches - take 3 - from Greg Kroah-Hartman:
 "Here are the remaining __dev* removal patches against the 3.8-rc2
  tree.  All of these patches were previously sent to the subsystem
  maintainers, most of them were picked up and pushed to you, but there
  were a number that fell through the cracks, and new drivers were added
  during the merge window, so this series cleans up the rest of the
  instances of these markings.

  Third time's the charm...

  Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"

Fixed up trivial conflict with the pinctrl pull in pinctrl-sirf.c.

* tag 'driver-core-3.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (54 commits)
  misc: remove __dev* attributes.
  include: remove __dev* attributes.
  Documentation: remove __dev* attributes.
  Drivers: misc: remove __dev* attributes.
  Drivers: block: remove __dev* attributes.
  Drivers: bcma: remove __dev* attributes.
  Drivers: char: remove __dev* attributes.
  Drivers: clocksource: remove __dev* attributes.
  Drivers: ssb: remove __dev* attributes.
  Drivers: dma: remove __dev* attributes.
  Drivers: gpu: remove __dev* attributes.
  Drivers: infinband: remove __dev* attributes.
  Drivers: memory: remove __dev* attributes.
  Drivers: mmc: remove __dev* attributes.
  Drivers: iommu: remove __dev* attributes.
  Drivers: power: remove __dev* attributes.
  Drivers: message: remove __dev* attributes.
  Drivers: macintosh: remove __dev* attributes.
  Drivers: mfd: remove __dev* attributes.
  pstore: remove __dev* attributes.
  ...
2013-01-03 16:17:50 -08:00
Greg Kroah-Hartman 5b5e76e9cb IA64: drivers: remove __dev* attributes.
CONFIG_HOTPLUG is going away as an option.  As a result, the __dev*
markings need to be removed.

This change removes the use of __devinit, __devexit_p, __devinitdata,
and __devexit from these drivers.

Based on patches originally written by Bill Pemberton, but redone by me
in order to handle some of the coding style issues better, by hand.

Cc: Bill Pemberton <wfp5p@virginia.edu>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-03 15:57:13 -08:00
Luck, Tony 062fe95afe Wire up finit_module syscall
Linux was granted a new system call to load modules by file descriptor
in commit 34e1169d99 ("module: add syscall to load module from fd").

Wire it up for ia64 (ready for the Chrome port :-)

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-01-03 10:57:23 -08:00
Linus Torvalds 54d46ea993 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal
Pull signal handling cleanups from Al Viro:
 "sigaltstack infrastructure + conversion for x86, alpha and um,
  COMPAT_SYSCALL_DEFINE infrastructure.

  Note that there are several conflicts between "unify
  SS_ONSTACK/SS_DISABLE definitions" and UAPI patches in mainline;
  resolution is trivial - just remove definitions of SS_ONSTACK and
  SS_DISABLED from arch/*/uapi/asm/signal.h; they are all identical and
  include/uapi/linux/signal.h contains the unified variant."

Fixed up conflicts as per Al.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
  alpha: switch to generic sigaltstack
  new helpers: __save_altstack/__compat_save_altstack, switch x86 and um to those
  generic compat_sys_sigaltstack()
  introduce generic sys_sigaltstack(), switch x86 and um to it
  new helper: compat_user_stack_pointer()
  new helper: restore_altstack()
  unify SS_ONSTACK/SS_DISABLE definitions
  new helper: current_user_stack_pointer()
  missing user_stack_pointer() instances
  Bury the conditionals from kernel_thread/kernel_execve series
  COMPAT_SYSCALL_DEFINE: infrastructure
2012-12-20 18:05:28 -08:00
Linus Torvalds 787314c35f IOMMU Updates for Linux v3.8
A few new features this merge-window. The most important one is
 probably, that dma-debug now warns if a dma-handle is not checked with
 dma_mapping_error by the device driver. This requires minor changes to
 some architectures which make use of dma-debug. Most of these changes
 have the respective Acks by the Arch-Maintainers.
 Besides that there are updates to the AMD IOMMU driver for refactor the
 IOMMU-Groups support and to make sure it does not trigger a hardware
 erratum.
 The OMAP changes (for which I pulled in a branch from Tony Lindgren's
 tree) have a conflict in linux-next with the arm-soc tree. The conflict
 is in the file arch/arm/mach-omap2/clock44xx_data.c which is deleted in
 the arm-soc tree. It is safe to delete the file too so solve the
 conflict. Similar changes are done in the arm-soc tree in the common
 clock framework migration. A missing hunk from the patch in the IOMMU
 tree will be submitted as a seperate patch when the merge-window is
 closed.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJQzbQQAAoJECvwRC2XARrjXCIP/2RxBzbVOiaPOorl+ZWbsZ41
 lzWiXsCHJkh4BK4/qGsVeKhiNd9LcbQUlhywnBbhWxym3spzmjGtvU2Hcg8QiO/M
 R83r9S4e8Z6DnF9Gcats1Ns9BufgpyhLXg3XoXPxtyHOgRS59fvYi6xXOxyX30Dy
 uhbj+WL6UD0zvOMNztEnM1p6UhX+XlpvzKDTR5+G5xKdVPkcgeiaKSwqz739caTn
 QE2NpqIh+8Mwuu1nIapk8h07xhUYU5eGMXa38u1LvDwSHsrsCMLC+lXIjtInn7Gw
 Bv+XcCHgtOaoPQwwk/xd2HVwJQxO9HNb5YX51EIjwP0C5S/3yW9Ji1RgqFb6Ewqq
 jIkF6ckwUheLWsBGkw5UknI/f7RX3MDiTWkziYLIniYKKewm+ymGfgIqPt2TzLIO
 tMZZiIssKvy7wOXQ5JjpYJg5Xmrau6opNwdEguC8pWkJT7qsn+3SeLjMt0Lh9IoY
 +37DOgOLb3O3/vnZJ3i0KMRZBfVeaRj5HaGmlxFCYUZCNQymIPTih9Jtqm+WuVcu
 YaGQCTtynsQ0JVh8YEekLzSfgd3OODP68fyCg1CQNixEgvUi2hd/toX2/Z1wkkSA
 JC9bZarcoPkSWqaTAA2HvmaaxvRR+0UbhFPopFTQarVV0MVLZWBxoyuKy/nMrmMd
 UgTzrDYy74UKdrSTwIXg
 =pPHZ
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull IOMMU updates from Joerg Roedel:
 "A few new features this merge-window.  The most important one is
  probably, that dma-debug now warns if a dma-handle is not checked with
  dma_mapping_error by the device driver.  This requires minor changes
  to some architectures which make use of dma-debug.  Most of these
  changes have the respective Acks by the Arch-Maintainers.

  Besides that there are updates to the AMD IOMMU driver for refactor
  the IOMMU-Groups support and to make sure it does not trigger a
  hardware erratum.

  The OMAP changes (for which I pulled in a branch from Tony Lindgren's
  tree) have a conflict in linux-next with the arm-soc tree.  The
  conflict is in the file arch/arm/mach-omap2/clock44xx_data.c which is
  deleted in the arm-soc tree.  It is safe to delete the file too so
  solve the conflict.  Similar changes are done in the arm-soc tree in
  the common clock framework migration.  A missing hunk from the patch
  in the IOMMU tree will be submitted as a seperate patch when the
  merge-window is closed."

* tag 'iommu-updates-v3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (29 commits)
  ARM: dma-mapping: support debug_dma_mapping_error
  ARM: OMAP4: hwmod data: ipu and dsp to use parent clocks instead of leaf clocks
  iommu/omap: Adapt to runtime pm
  iommu/omap: Migrate to hwmod framework
  iommu/omap: Keep mmu enabled when requested
  iommu/omap: Remove redundant clock handling on ISR
  iommu/amd: Remove obsolete comment
  iommu/amd: Don't use 512GB pages
  iommu/tegra: smmu: Move bus_set_iommu after probe for multi arch
  iommu/tegra: gart: Move bus_set_iommu after probe for multi arch
  iommu/tegra: smmu: Remove unnecessary PTC/TLB flush all
  tile: dma_debug: add debug_dma_mapping_error support
  sh: dma_debug: add debug_dma_mapping_error support
  powerpc: dma_debug: add debug_dma_mapping_error support
  mips: dma_debug: add debug_dma_mapping_error support
  microblaze: dma-mapping: support debug_dma_mapping_error
  ia64: dma_debug: add debug_dma_mapping_error support
  c6x: dma_debug: add debug_dma_mapping_error support
  ARM64: dma_debug: add debug_dma_mapping_error support
  intel-iommu: Prevent devices with RMRRs from being placed into SI Domain
  ...
2012-12-20 10:07:25 -08:00
Al Viro 031b656698 unify SS_ONSTACK/SS_DISABLE definitions
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19 18:07:39 -05:00
Al Viro 1ca97bb541 new helper: current_user_stack_pointer()
Cross-architecture equivalent of rdusp(); default is
user_stack_pointer(current_pt_regs()) - that works for almost all
platforms that have usp saved in pt_regs.  The only exception from
that is ia64 - we want memory stack, not the backing store for
register one.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19 18:07:39 -05:00
Al Viro ae903caae2 Bury the conditionals from kernel_thread/kernel_execve series
All architectures have
	CONFIG_GENERIC_KERNEL_THREAD
	CONFIG_GENERIC_KERNEL_EXECVE
	__ARCH_WANT_SYS_EXECVE
None of them have __ARCH_WANT_KERNEL_EXECVE and there are only two callers
of kernel_execve() (which is a trivial wrapper for do_execve() now) left.
Kill the conditionals and make both callers use do_execve().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19 18:07:38 -05:00