Commit Graph

579 Commits

Author SHA1 Message Date
Nathan Chancellor 8eef28437c This is the 3.10.107 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJZUiosAAoJEE44bZycYXAvcHYP/1OKMYQB/3G7GfEhMXdlpV31
 VjdzUg5X1JOE60anYNopvWQJgDFXMy9mTceUI3axDkfYb5iDFUpRBFEh70ggDL04
 bGB/J4n2Linjkj35u+S5P3fK6qBfg9+VDpTfUYPZGB5YjOjmaD06E8InBF8iUuC3
 6pkMtQKOptmKOc2hw84PsB3qm9ER2MMa92Lrs1rtcOihEqQMyKjkI/kzogs8XGje
 5gMt31VweScZed3d7i1r9tl/DTmzGcpEyVpz/x8gI7Xwi69FeeLy6cWbhK0VOsLA
 u7ul9mDa77bUC/jpBzJmIkS8fhzaTyUw8NQbtol9RSSIfzb+mvXyx9Vr7o4LYK2B
 P6AekC16x6R8KUED1hfxKdagguRACDfKf91bMAxDCN/PXqITVbk3RxxxH6wHAvOx
 Ihf4G5h800/ks6X1oMBYZcbFFbNCUHZjyL7V1M/iy1TrKuRhEtou4Ft3X+gOauLS
 CG8VR9Jo1/BAvMaJmy5Hg9RPNoxEMstDi6x3ugD0wH57XHSZ5QmFMBzCbuWR6hWM
 q1DvBK/I54BXlsdYU9WySn1hm2gKCNPZ+zGzLTo1l426vme+YjhC5911V7Tv+WHm
 lc5FTXWtXGhoAZuNSIGDrlv3Dyq44iMNrqXrhlPmJjWD3Hx4hFGGp2GyHOpK+5+7
 7egPk9m1WrhUKzA9m1/M
 =InCr
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEJDfLduVEy2qz2d/TmXOSYMtstxYFAlpqfQUACgkQmXOSYMts
 txZNghAApD/SW4fTOx6RZFCPVjAP70FfXvZsQYf3Zfp44Ytm2Kax3GIABPuknlI+
 IZRAPnXb6KP8DNDdCyGcJ0avI5uw96sXyeZWlDZyeS1WHHizJq3+BLB09zzdegSk
 K1dJrobXCYNESmcQMT5diGwqLYkdOs3hh7Ehqut29njwCzVzNG3n43H9F15o9cUZ
 6lAM8/Zb6ai+0KgVgwC40QJneVltDEFfXVr6wo/IJXnYNaRCPKQM5lsG09pxxopG
 NVSsmUyeJI5bPWEm5vbuBL2JVhaCcMtTfAPHflqbtykE8eSVEWdTeCWPuGWcATB+
 2sGp3cVR2W7+4CHpbcnrXolmP/OI3jXHbG1LvyRqg4Iw1jgtZ8wwjCEkdsPz3fED
 g2+EtSYl/NLW7N8P4KQV9jzihYIfELBj9HQsEs5aPOstyjyxl12RxJvjw835v5ts
 oa7qKQAHIwZsuaB34qK+DjI5coNeKRvDMy5mm0GL3TqmLLFEzSVpaTceGpdvNLi0
 6k3RkuJzU0TwAoTShWyYu6AbV+8aHniBQbjzYs5sufRgDy9pjnfWzDqtUM+chTsm
 WaxwhpHdpOomwAfZr8/Zaf0xIxP/M99SFKevntE04Ft93P8dKuLqFcNAjQkMdibY
 UHrJ67nBllmDtlH8yGO9j4FD89O0QaBX4J3qGyIu5eE73/iibvo=
 =J7vi
 -----END PGP SIGNATURE-----

Merge 3.10.107 into android-msm-bullhead-3.10-oreo-m5

Changes in 3.10.107: (270 commits)
        Revert "Btrfs: don't delay inode ref updates during log, replay"
        Btrfs: fix memory leak in reading btree blocks
        ext4: use more strict checks for inodes_per_block on mount
        ext4: fix in-superblock mount options processing
        ext4: add sanity checking to count_overhead()
        ext4: validate s_first_meta_bg at mount time
        jbd2: don't leak modified metadata buffers on an aborted journal
        ext4: fix fencepost in s_first_meta_bg validation
        ext4: trim allocation requests to group size
        ext4: preserve the needs_recovery flag when the journal is aborted
        ext4: return EROFS if device is r/o and journal replay is needed
        ext4: fix inode checksum calculation problem if i_extra_size is small
        block: fix use-after-free in sys_ioprio_get()
        block: allow WRITE_SAME commands with the SG_IO ioctl
        block: fix del_gendisk() vs blkdev_ioctl crash
        dm crypt: mark key as invalid until properly loaded
        dm space map metadata: fix 'struct sm_metadata' leak on failed create
        md/raid5: limit request size according to implementation limits
        md:raid1: fix a dead loop when read from a WriteMostly disk
        md linear: fix a race between linear_add() and linear_congested()
        CIFS: Fix a possible memory corruption during reconnect
        CIFS: Fix missing nls unload in smb2_reconnect()
        CIFS: Fix a possible memory corruption in push locks
        CIFS: remove bad_network_name flag
        fs/cifs: make share unaccessible at root level mountable
        cifs: Do not send echoes before Negotiate is complete
        ocfs2: fix crash caused by stale lvb with fsdlm plugin
        ocfs2: fix BUG_ON() in ocfs2_ci_checkpointed()
        can: raw: raw_setsockopt: limit number of can_filter that can be set
        can: peak: fix bad memory access and free sequence
        can: c_can_pci: fix null-pointer-deref in c_can_start() - set device pointer
        can: ti_hecc: add missing prepare and unprepare of the clock
        can: bcm: fix hrtimer/tasklet termination in bcm op removal
        can: usb_8dev: Fix memory leak of priv->cmd_msg_buffer
        ALSA: hda - Fix up GPIO for ASUS ROG Ranger
        ALSA: seq: Fix race at creating a queue
        ALSA: seq: Don't handle loop timeout at snd_seq_pool_done()
        ALSA: timer: Reject user params with too small ticks
        ALSA: seq: Fix link corruption by event error handling
        ALSA: seq: Fix racy cell insertions during snd_seq_pool_done()
        ALSA: seq: Fix race during FIFO resize
        ALSA: seq: Don't break snd_use_lock_sync() loop by timeout
        ALSA: usb-audio: Add QuickCam Communicate Deluxe/S7500 to volume_control_quirks
        usb: gadgetfs: restrict upper bound on device configuration size
        USB: gadgetfs: fix unbounded memory allocation bug
        USB: gadgetfs: fix use-after-free bug
        USB: gadgetfs: fix checks of wTotalLength in config descriptors
        xhci: free xhci virtual devices with leaf nodes first
        USB: serial: io_ti: bind to interface after fw download
        usb: gadget: composite: always set ep->mult to a sensible value
        USB: cdc-acm: fix double usb_autopm_put_interface() in acm_port_activate()
        USB: cdc-acm: fix open and suspend race
        USB: cdc-acm: fix failed open not being detected
        usb: dwc3: gadget: make Set Endpoint Configuration macros safe
        usb: host: xhci-plat: Fix timeout on removal of hot pluggable xhci controllers
        usb: dwc3: gadget: delay unmap of bounced requests
        usb: hub: Wait for connection to be reestablished after port reset
        usb: gadget: composite: correctly initialize ep->maxpacket
        USB: UHCI: report non-PME wakeup signalling for Intel hardware
        arm/xen: Use alloc_percpu rather than __alloc_percpu
        xfs: set AGI buffer type in xlog_recover_clear_agi_bucket
        xfs: clear _XBF_PAGES from buffers when readahead page
        ssb: Fix error routine when fallback SPROM fails
        drivers/gpu/drm/ast: Fix infinite loop if read fails
        scsi: avoid a permanent stop of the scsi device's request queue
        scsi: move the nr_phys_segments assert into scsi_init_io
        scsi: don't BUG_ON() empty DMA transfers
        scsi: storvsc: properly handle SRB_ERROR when sense message is present
        scsi: storvsc: properly set residual data length on errors
        target/pscsi: Fix TYPE_TAPE + TYPE_MEDIMUM_CHANGER export
        scsi: lpfc: Add shutdown method for kexec
        scsi: sr: Sanity check returned mode data
        scsi: sd: Fix capacity calculation with 32-bit sector_t
        s390/vmlogrdr: fix IUCV buffer allocation
        libceph: verify authorize reply on connect
        nfs_write_end(): fix handling of short copies
        powerpc/ps3: Fix system hang with GCC 5 builds
        sg_write()/bsg_write() is not fit to be called under KERNEL_DS
        ftrace/x86: Set ftrace_stub to weak to prevent gcc from using short jumps to it
        cred/userns: define current_user_ns() as a function
        net: ti: cpmac: Fix compiler warning due to type confusion
        tick/broadcast: Prevent NULL pointer dereference
        netvsc: reduce maximum GSO size
        drop_monitor: add missing call to genlmsg_end
        drop_monitor: consider inserted data in genlmsg_end
        igmp: Make igmp group member RFC 3376 compliant
        HID: hid-cypress: validate length of report
        Input: xpad - use correct product id for x360w controllers
        Input: i8042 - add noloop quirk for Dell Embedded Box PC 3000
        Input: iforce - validate number of endpoints before using them
        Input: kbtab - validate number of endpoints before using them
        Input: joydev - do not report stale values on first open
        Input: tca8418 - use the interrupt trigger from the device tree
        Input: mpr121 - handle multiple bits change of status register
        Input: mpr121 - set missing event capability
        Input: i8042 - add Clevo P650RS to the i8042 reset list
        i2c: fix kernel memory disclosure in dev interface
        vme: Fix wrong pointer utilization in ca91cx42_slave_get
        sysrq: attach sysrq handler correctly for 32-bit kernel
        pinctrl: sh-pfc: Do not unconditionally support PIN_CONFIG_BIAS_DISABLE
        x86/PCI: Ignore _CRS on Supermicro X8DTH-i/6/iF/6F
        qla2xxx: Fix crash due to null pointer access
        ARM: 8634/1: hw_breakpoint: blacklist Scorpion CPUs
        ARM: dts: da850-evm: fix read access to SPI flash
        NFSv4: Ensure nfs_atomic_open set the dentry verifier on ENOENT
        vmxnet3: Wake queue from reset work
        Fix memory leaks in cifs_do_mount()
        Compare prepaths when comparing superblocks
        Move check for prefix path to within cifs_get_root()
        Fix regression which breaks DFS mounting
        apparmor: fix uninitialized lsm_audit member
        apparmor: exec should not be returning ENOENT when it denies
        apparmor: fix disconnected bind mnts reconnection
        apparmor: internal paths should be treated as disconnected
        apparmor: check that xindex is in trans_table bounds
        apparmor: add missing id bounds check on dfa verification
        apparmor: don't check for vmalloc_addr if kvzalloc() failed
        apparmor: fix oops in profile_unpack() when policy_db is not present
        apparmor: fix module parameters can be changed after policy is locked
        apparmor: do not expose kernel stack
        vfio/pci: Fix integer overflows, bitmask check
        bna: Add synchronization for tx ring.
        sg: Fix double-free when drives detach during SG_IO
        move the call of __d_drop(anon) into __d_materialise_unique(dentry, anon)
        serial: 8250_pci: Detach low-level driver during PCI error recovery
        bnx2x: Correct ringparam estimate when DOWN
        tile/ptrace: Preserve previous registers for short regset write
        sysctl: fix proc_doulongvec_ms_jiffies_minmax()
        ISDN: eicon: silence misleading array-bounds warning
        ARC: [arcompact] handle unaligned access delay slot corner case
        parisc: Don't use BITS_PER_LONG in userspace-exported swab.h header
        nfs: Don't increment lock sequence ID after NFS4ERR_MOVED
        ipv6: addrconf: Avoid addrconf_disable_change() using RCU read-side lock
        af_unix: move unix_mknod() out of bindlock
        drm/nouveau/nv1a,nv1f/disp: fix memory clock rate retrieval
        crypto: api - Clear CRYPTO_ALG_DEAD bit before registering an alg
        ata: sata_mv:- Handle return value of devm_ioremap.
        mm/memory_hotplug.c: check start_pfn in test_pages_in_a_zone()
        mm, fs: check for fatal signals in do_generic_file_read()
        ARC: [arcompact] brown paper bag bug in unaligned access delay slot fixup
        sched/debug: Don't dump sched debug info in SysRq-W
        tcp: fix 0 divide in __tcp_select_window()
        macvtap: read vnet_hdr_size once
        packet: round up linear to header len
        vfs: fix uninitialized flags in splice_to_pipe()
        siano: make it work again with CONFIG_VMAP_STACK
        futex: Move futex_init() to core_initcall
        rtc: interface: ignore expired timers when enqueuing new timers
        irda: Fix lockdep annotations in hashbin_delete().
        tty: serial: msm: Fix module autoload
        rtlwifi: rtl_usb: Fix for URB leaking when doing ifconfig up/down
        af_packet: remove a stray tab in packet_set_ring()
        MIPS: Fix special case in 64 bit IP checksumming.
        mm: vmpressure: fix sending wrong events on underflow
        ipc/shm: Fix shmat mmap nil-page protection
        sd: get disk reference in sd_check_events()
        samples/seccomp: fix 64-bit comparison macros
        ath5k: drop bogus warning on drv_set_key with unsupported cipher
        rdma_cm: fail iwarp accepts w/o connection params
        NFSv4: fix getacl ERANGE for some ACL buffer sizes
        bcma: use (get|put)_device when probing/removing device driver
        powerpc/xmon: Fix data-breakpoint
        KVM: VMX: use correct vmcs_read/write for guest segment selector/base
        KVM: PPC: Book3S PR: Fix illegal opcode emulation
        KVM: s390: fix task size check
        s390: TASK_SIZE for kernel threads
        xtensa: move parse_tag_fdt out of #ifdef CONFIG_BLK_DEV_INITRD
        mac80211: flush delayed work when entering suspend
        drm/ast: Fix test for VGA enabled
        drm/ttm: Make sure BOs being swapped out are cacheable
        fat: fix using uninitialized fields of fat_inode/fsinfo_inode
        drivers: hv: Turn off write permission on the hypercall page
        xhci: fix 10 second timeout on removal of PCI hotpluggable xhci controllers
        crypto: improve gcc optimization flags for serpent and wp512
        mtd: pmcmsp: use kstrndup instead of kmalloc+strncpy
        cpmac: remove hopeless #warning
        mvsas: fix misleading indentation
        l2tp: avoid use-after-free caused by l2tp_ip_backlog_recv
        net: don't call strlen() on the user buffer in packet_bind_spkt()
        dccp: Unlock sock before calling sk_free()
        tcp: fix various issues for sockets morphing to listen state
        uapi: fix linux/packet_diag.h userspace compilation error
        ipv6: avoid write to a possibly cloned skb
        dccp: fix memory leak during tear-down of unsuccessful connection request
        futex: Fix potential use-after-free in FUTEX_REQUEUE_PI
        futex: Add missing error handling to FUTEX_REQUEUE_PI
        give up on gcc ilog2() constant optimizations
        cancel the setfilesize transation when io error happen
        crypto: ghash-clmulni - Fix load failure
        crypto: cryptd - Assign statesize properly
        ACPI / video: skip evaluating _DOD when it does not exist
        Drivers: hv: balloon: don't crash when memory is added in non-sorted order
        s390/pci: fix use after free in dma_init
        cpufreq: Fix and clean up show_cpuinfo_cur_freq()
        igb: Workaround for igb i210 firmware issue
        igb: add i211 to i210 PHY workaround
        ipv4: provide stronger user input validation in nl_fib_input()
        tcp: initialize icsk_ack.lrcvtime at session start time
        ACM gadget: fix endianness in notifications
        mmc: sdhci: Do not disable interrupts while waiting for clock
        uvcvideo: uvc_scan_fallback() for webcams with broken chain
        fbcon: Fix vc attr at deinit
        crypto: algif_hash - avoid zero-sized array
        virtio_balloon: init 1st buffer in stats vq
        c6x/ptrace: Remove useless PTRACE_SETREGSET implementation
        sparc/ptrace: Preserve previous registers for short regset write
        metag/ptrace: Preserve previous registers for short regset write
        metag/ptrace: Provide default TXSTATUS for short NT_PRSTATUS
        metag/ptrace: Reject partial NT_METAG_RPIPE writes
        libceph: force GFP_NOIO for socket allocations
        ACPI: Fix incompatibility with mcount-based function graph tracing
        ACPI / power: Avoid maybe-uninitialized warning
        rtc: s35390a: make sure all members in the output are set
        rtc: s35390a: implement reset routine as suggested by the reference
        rtc: s35390a: improve irq handling
        padata: avoid race in reordering
        HID: hid-lg: Fix immediate disconnection of Logitech Rumblepad 2
        HID: i2c-hid: Add sleep between POWER ON and RESET
        drm/vmwgfx: NULL pointer dereference in vmw_surface_define_ioctl()
        drm/vmwgfx: avoid calling vzalloc with a 0 size in vmw_get_cap_3d_ioctl()
        drm/vmwgfx: Remove getparam error message
        drm/vmwgfx: fix integer overflow in vmw_surface_define_ioctl()
        Reset TreeId to zero on SMB2 TREE_CONNECT
        metag/usercopy: Drop unused macros
        metag/usercopy: Zero rest of buffer from copy_from_user
        powerpc: Don't try to fix up misaligned load-with-reservation instructions
        mm/mempolicy.c: fix error handling in set_mempolicy and mbind.
        mtd: bcm47xxpart: fix parsing first block after aligned TRX
        net/packet: fix overflow in check for priv area size
        x86/vdso: Plug race between mapping and ELF header setup
        iscsi-target: Fix TMR reference leak during session shutdown
        iscsi-target: Drop work-around for legacy GlobalSAN initiator
        xen, fbfront: fix connecting to backend
        char: lack of bool string made CONFIG_DEVPORT always on
        platform/x86: acer-wmi: setup accelerometer when machine has appropriate notify event
        platform/x86: acer-wmi: setup accelerometer when ACPI device was found
        mm: Tighten x86 /dev/mem with zeroing reads
        virtio-console: avoid DMA from stack
        catc: Combine failure cleanup code in catc_probe()
        catc: Use heap buffer for memory size test
        net: ipv6: check route protocol when deleting routes
        Drivers: hv: don't leak memory in vmbus_establish_gpadl()
        Drivers: hv: get rid of timeout in vmbus_open()
        ubi/upd: Always flush after prepared for an update
        x86/mce/AMD: Give a name to MCA bank 3 when accessed with legacy MSRs
        powerpc: Reject binutils 2.24 when building little endian
        net/packet: fix overflow in check for tp_frame_nr
        net/packet: fix overflow in check for tp_reserve
        tty: nozomi: avoid a harmless gcc warning
        hostap: avoid uninitialized variable use in hfa384x_get_rid
        gfs2: avoid uninitialized variable warning
        net: neigh: guard against NULL solicit() method
        sctp: listen on the sock only when it's state is listening or closed
        ip6mr: fix notification device destruction
        MIPS: Fix crash registers on non-crashing CPUs
        RDS: Fix the atomicity for congestion map update
        xen/x86: don't lose event interrupts
        p9_client_readdir() fix
        nfsd: check for oversized NFSv2/v3 arguments
        ftrace/x86: Fix triple fault with graph tracing and suspend-to-ram
        kvm: nVMX: Allow L1 to intercept software exceptions (#BP and #OF)
        tun: read vnet_hdr_sz once
        printk: use rcuidle console tracepoint
        ipv6: check raw payload size correctly in ioctl
        x86: standardize mmap_rnd() usage
        x86/mm/32: Enable full randomization on i386 and X86_32
        mm: larger stack guard gap, between vmas
        mm: fix new crash in unmapped_area_topdown()
        Allow stack to grow up to address space limit
        Linux 3.10.107

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

Conflicts:
	arch/x86/mm/mmap.c
	drivers/mmc/host/sdhci.c
	drivers/usb/host/xhci-plat.c
	fs/ext4/super.c
	kernel/sched/core.c
2018-01-25 17:57:41 -07:00
Nathan Chancellor 32224079fd This is the 3.10.76 stable release
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJVQJeUAAoJEDjbvchgkmk+NlgP/jf4vyIubRhZNnEveDjCqCam
 OWVT/Q8fRLru9tEp8/b+gdWDDTZON7jJh5ERrwzzRud/LQwQudK+OhVlm7kKPXmY
 8uz6xdlcieKnMQleAQ/UiYqMv6VvjlAhNzUqcn60EeeAmMTix9WDFU0DFU2tvyFt
 qUR17px8vJDVI34Vv2d/Ihgt5lxMa8Bue4jZIQmPxdDHNiW9c8IUqr6vMDer4Ih0
 KuiES3FkQa61b5rEisNOWqEv/w+BH5Hn1XiN3gjBm5YznOhLHWJ5kR/2ewqCWbef
 DRdegojueSyN+ktzEDUnEWpC8zLhk3L4lBXILDSLBHvnoeEc17G5wMYKIe8DSk0B
 +tjSMq/IZMOaj68fYznOWH+UH3iBbQTSGriQShjR1zNqIz0XMMljkJNwnvzN4N3x
 0wNxr8mniIWvX7sCMYS6AWPFJCTNB6xi4mD7SJAeXzsD9+y/wUQkVymY0VkI7wxS
 OenmRy7GF7s/HLMUDTllEZ787dUdZSkrGLYooIii45dwWK87EB+4LkQT/9q1aPry
 JzAQYUxIK9vco/Xhy+CHfImoaovqzKHdxqgt5PbTHuCwj+3uKYChmBECQ4raeuU8
 JIqsr33wxlzhmzXhiwD4IcJvtDd6UiDeTpMtQOvBBce1gsKpL2+MS6s+8OiasO5F
 6TQV7+NtW4CS74ExjBjt
 =KO7Y
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEJDfLduVEy2qz2d/TmXOSYMtstxYFAlpqavQACgkQmXOSYMts
 txas9Q//d0gcyDkHLn+viInrU5Mm4QV5GmMRPL3LnsNIiiJEjoLxKCX48dmPv1/1
 Qk8IOMQRTfBn6l4lNc5AoEP2Q2SYqCKJLKBooRBoDPw9FlsrlJ2BdGz/Bm8SXSxQ
 fM3IQ76GpeW2YOXF7bVNrmOdlye5/JcmhkOT0augXFWKyTLVpVrqO/V4Nm07UAUe
 6iw82ZNxLN9Rcm6H0l92VqI0//gcxYGZi0dfpDoWrvoND10Sz2URMRxzcns8t5Ri
 1YmT+Y9cPO+3DCqNbTmwct1zQ2TlnJotm5S5DGqTznoGzNmv2PMTEIAWaWN63P0M
 T+cJgQxLDJp8ycr9C1oyfwmH2nvS3SQUGS8eDKXN0eNAbAaVrDN80Kl5pzQUfvba
 sUfK3IPU/8brZRcKYtvlc9TlEUlStxQSaUtcTXLpQ4bkoMst+9W/SFEKLScOW0No
 PBi3EhTpAjBN+EmDn6QHG0virHPn9M18s0Z1ZlSzdAyCA4mLY//JyabIEofCTz58
 mIXmfPix9iNoBL/bekGpmqF/4B9nymUrWHQDvUflXbFwf0DrrLnEEGMNmpjJk4bx
 oxDuRb0/gzD7zs2pbTaWeysnl5UsmVDiQM9xLQBowNo+I4NoPv2+u9klHii4qEHW
 JlWsWJg4lFy6u7a9rQYjLEZWVZSNX6Y6e3er5m3qhyCbQwlKIQU=
 =U1aW
 -----END PGP SIGNATURE-----

Merge 3.10.76 into android-msm-bullhead-3.10-oreo-m5

Changes in 3.10.76: (34 commits)
        conditionally define U32_MAX
        remove extra definitions of U32_MAX
        tcp: prevent fetching dst twice in early demux code
        ipv6: Don't reduce hop limit for an interface
        tcp: fix FRTO undo on cumulative ACK of SACKed range
        tcp: tcp_make_synack() should clear skb->tstamp
        8139cp: Call dev_kfree_skby_any instead of kfree_skb.
        8139too: Call dev_kfree_skby_any instead of dev_kfree_skb.
        r8169: Call dev_kfree_skby_any instead of dev_kfree_skb.
        bnx2: Call dev_kfree_skby_any instead of dev_kfree_skb.
        tg3: Call dev_kfree_skby_any instead of dev_kfree_skb.
        ixgb: Call dev_kfree_skby_any instead of dev_kfree_skb.
        benet: Call dev_kfree_skby_any instead of kfree_skb.
        serial: 8250_dw: Fix deadlock in LCR workaround
        jfs: fix readdir regression
        splice: Apply generic position and size checks to each write
        mm: Fix NULL pointer dereference in madvise(MADV_WILLNEED) support
        Bluetooth: Enable Atheros 0cf3:311e for firmware upload
        Bluetooth: Add firmware update for Atheros 0cf3:311f
        Bluetooth: btusb: Add IMC Networks (Broadcom based)
        Bluetooth: Add support for Intel bootloader devices
        Bluetooth: Ignore isochronous endpoints for Intel USB bootloader
        netfilter: conntrack: disable generic tracking for known protocols
        KVM: x86: SYSENTER emulation is broken
        kconfig: Fix warning "‘jump’ may be used uninitialized"
        move d_rcu from overlapping d_child to overlapping d_alias
        deal with deadlock in d_walk()
        vm: add VM_FAULT_SIGSEGV handling support
        vm: make stack guard page errors return VM_FAULT_SIGSEGV rather than SIGBUS
        x86: mm: move mmap_sem unlock from mm_fault_error() to caller
        sb_edac: avoid INTERNAL ERROR message in EDAC with unspecified channel
        arc: mm: Fix build failure
        dcache: Fix locking bugs in backported "deal with deadlock in d_walk()"
        Linux 3.10.76

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
2018-01-25 16:40:36 -07:00
Nathan Chancellor 3fadcb88c4
Revert "BACKPORT: mm: larger stack guard gap, between vmas"
This will come back as commit 1ad9a25dd0 ("mm: larger stack guard gap,
between vmas") in 3.10.107 and it causes a conflict with 3.10.76.

This reverts commit 25cd784141.

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
2018-01-25 16:39:50 -07:00
Hugh Dickins 25cd784141 BACKPORT: mm: larger stack guard gap, between vmas
commit 1be7107fbe18eed3e319a6c3e83c78254b693acb upstream.

Stack guard page is a useful feature to reduce a risk of stack smashing
into a different mapping. We have been using a single page gap which
is sufficient to prevent having stack adjacent to a different mapping.
But this seems to be insufficient in the light of the stack usage in
userspace. E.g. glibc uses as large as 64kB alloca() in many commonly
used functions. Others use constructs liks gid_t buffer[NGROUPS_MAX]
which is 256kB or stack strings with MAX_ARG_STRLEN.

This will become especially dangerous for suid binaries and the default
no limit for the stack size limit because those applications can be
tricked to consume a large portion of the stack and a single glibc call
could jump over the guard page. These attacks are not theoretical,
unfortunatelly.

Make those attacks less probable by increasing the stack guard gap
to 1MB (on systems with 4k pages; but make it depend on the page size
because systems with larger base pages might cap stack allocations in
the PAGE_SIZE units) which should cover larger alloca() and VLA stack
allocations. It is obviously not a full fix because the problem is
somehow inherent, but it should reduce attack space a lot.

One could argue that the gap size should be configurable from userspace,
but that can be done later when somebody finds that the new 1MB is wrong
for some special case applications.  For now, add a kernel command line
option (stack_guard_gap) to specify the stack gap size (in page units).

Implementation wise, first delete all the old code for stack guard page:
because although we could get away with accounting one extra page in a
stack vma, accounting a larger gap can break userspace - case in point,
a program run with "ulimit -S -v 20000" failed when the 1MB gap was
counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK
and strict non-overcommit mode.

Instead of keeping gap inside the stack vma, maintain the stack guard
gap as a gap between vmas: using vm_start_gap() in place of vm_start
(or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few
places which need to respect the gap - mainly arch_get_unmapped_area(),
and and the vma tree's subtree_gap support for that.

Original-patch-by: Oleg Nesterov <oleg@redhat.com>
Original-patch-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
[wt: backport to 4.11: adjust context]
[wt: backport to 4.9: adjust context ; kernel doc was not in
admin-guide]
[wt: backport to 4.4: adjust context ; drop ppc hugetlb_radix changes]
[wt: backport to 3.18: adjust context ; no FOLL_POPULATE ;
     s390 uses generic arch_get_unmapped_area()]
[wt: backport to 3.16: adjust context]
[wt: backport to 3.10: adjust context ; code logic in PARISC's
     arch_get_unmapped_area() wasn't found ; code inserted into
     expand_upwards() and expand_downwards() runs under anon_vma lock;
     changes for gup.c:faultin_page go to memory.c:__get_user_pages();
     included Hugh Dickins' fixes]
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit 1ad9a25dd0)

Bug: 38413813
Change-Id: I07f79cec09c9e98fc3d82458f9a5f3f2e21e6ab4
Signed-off-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
2017-07-18 22:26:22 +00:00
Hugh Dickins 1ad9a25dd0 mm: larger stack guard gap, between vmas
commit 1be7107fbe18eed3e319a6c3e83c78254b693acb upstream.

Stack guard page is a useful feature to reduce a risk of stack smashing
into a different mapping. We have been using a single page gap which
is sufficient to prevent having stack adjacent to a different mapping.
But this seems to be insufficient in the light of the stack usage in
userspace. E.g. glibc uses as large as 64kB alloca() in many commonly
used functions. Others use constructs liks gid_t buffer[NGROUPS_MAX]
which is 256kB or stack strings with MAX_ARG_STRLEN.

This will become especially dangerous for suid binaries and the default
no limit for the stack size limit because those applications can be
tricked to consume a large portion of the stack and a single glibc call
could jump over the guard page. These attacks are not theoretical,
unfortunatelly.

Make those attacks less probable by increasing the stack guard gap
to 1MB (on systems with 4k pages; but make it depend on the page size
because systems with larger base pages might cap stack allocations in
the PAGE_SIZE units) which should cover larger alloca() and VLA stack
allocations. It is obviously not a full fix because the problem is
somehow inherent, but it should reduce attack space a lot.

One could argue that the gap size should be configurable from userspace,
but that can be done later when somebody finds that the new 1MB is wrong
for some special case applications.  For now, add a kernel command line
option (stack_guard_gap) to specify the stack gap size (in page units).

Implementation wise, first delete all the old code for stack guard page:
because although we could get away with accounting one extra page in a
stack vma, accounting a larger gap can break userspace - case in point,
a program run with "ulimit -S -v 20000" failed when the 1MB gap was
counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK
and strict non-overcommit mode.

Instead of keeping gap inside the stack vma, maintain the stack guard
gap as a gap between vmas: using vm_start_gap() in place of vm_start
(or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few
places which need to respect the gap - mainly arch_get_unmapped_area(),
and and the vma tree's subtree_gap support for that.

Original-patch-by: Oleg Nesterov <oleg@redhat.com>
Original-patch-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
[wt: backport to 4.11: adjust context]
[wt: backport to 4.9: adjust context ; kernel doc was not in admin-guide]
[wt: backport to 4.4: adjust context ; drop ppc hugetlb_radix changes]
[wt: backport to 3.18: adjust context ; no FOLL_POPULATE ;
     s390 uses generic arch_get_unmapped_area()]
[wt: backport to 3.16: adjust context]
[wt: backport to 3.10: adjust context ; code logic in PARISC's
     arch_get_unmapped_area() wasn't found ; code inserted into
     expand_upwards() and expand_downwards() runs under anon_vma lock;
     changes for gup.c:faultin_page go to memory.c:__get_user_pages();
     included Hugh Dickins' fixes]
Signed-off-by: Willy Tarreau <w@1wt.eu>
2017-06-21 15:42:43 +02:00
Linus Torvalds 0c42d1fbb3 vm: add VM_FAULT_SIGSEGV handling support
commit 33692f27597fcab536d7cbbcc8f52905133e4aa7 upstream.

The core VM already knows about VM_FAULT_SIGBUS, but cannot return a
"you should SIGSEGV" error, because the SIGSEGV case was generally
handled by the caller - usually the architecture fault handler.

That results in lots of duplication - all the architecture fault
handlers end up doing very similar "look up vma, check permissions, do
retries etc" - but it generally works.  However, there are cases where
the VM actually wants to SIGSEGV, and applications _expect_ SIGSEGV.

In particular, when accessing the stack guard page, libsigsegv expects a
SIGSEGV.  And it usually got one, because the stack growth is handled by
that duplicated architecture fault handler.

However, when the generic VM layer started propagating the error return
from the stack expansion in commit fee7e49d4514 ("mm: propagate error
from stack expansion even for guard page"), that now exposed the
existing VM_FAULT_SIGBUS result to user space.  And user space really
expected SIGSEGV, not SIGBUS.

To fix that case, we need to add a VM_FAULT_SIGSEGV, and teach all those
duplicate architecture fault handlers about it.  They all already have
the code to handle SIGSEGV, so it's about just tying that new return
value to the existing code, but it's all a bit annoying.

This is the mindless minimal patch to do this.  A more extensive patch
would be to try to gather up the mostly shared fault handling logic into
one generic helper routine, and long-term we really should do that
cleanup.

Just from this patch, you can generally see that most architectures just
copied (directly or indirectly) the old x86 way of doing things, but in
the meantime that original x86 model has been improved to hold the VM
semaphore for shorter times etc and to handle VM_FAULT_RETRY and other
"newer" things, so it would be a good idea to bring all those
improvements to the generic case and teach other architectures about
them too.

Reported-and-tested-by: Takashi Iwai <tiwai@suse.de>
Tested-by: Jan Engelhardt <jengelh@inai.de>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # "s390 still compiles and boots"
Cc: linux-arch@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[shengyong: Backport to 3.10
 - adjust context
 - ignore modification for arch nios2, because 3.10 does not support it
 - ignore modification for driver lustre, because 3.10 does not support it
 - ignore VM_FAULT_FALLBACK in VM_FAULT_ERROR, becase 3.10 does not support
   this flag
 - add SIGSEGV handling to powerpc/cell spu_fault.c, because 3.10 does not
   separate it to copro_fault.c
 - add SIGSEGV handling in mm/memory.c, because 3.10 does not separate it
   to gup.c
]
Signed-off-by: Sheng Yong <shengyong1@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-04-29 10:34:00 +02:00
Johannes Weiner e2ec2c2b96 arch: mm: pass userspace fault flag to generic fault handler
commit 759496ba6407c6994d6a5ce3a5e74937d7816208 upstream.

Unlike global OOM handling, memory cgroup code will invoke the OOM killer
in any OOM situation because it has no way of telling faults occuring in
kernel context - which could be handled more gracefully - from
user-triggered faults.

Pass a flag that identifies faults originating in user space from the
architecture-specific fault handlers to generic code so that memcg OOM
handling can be improved.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: azurIt <azurit@pobox.sk>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-21 09:22:56 -08:00
Johannes Weiner 0d49fb2dc7 arch: mm: pass userspace fault flag to generic fault handler
Unlike global OOM handling, memory cgroup code will invoke the OOM killer
in any OOM situation because it has no way of telling faults occuring in
kernel context - which could be handled more gracefully - from
user-triggered faults.

Pass a flag that identifies faults originating in user space from the
architecture-specific fault handlers to generic code so that memcg OOM
handling can be improved.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: azurIt <azurit@pobox.sk>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 759496ba6407c6994d6a5ce3a5e74937d7816208
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[imaund@codeaurora.org: resolve merge conflicts, removing changes in the
    tail that are not a part of this specific commit.]
Signed-off-by: Ian Maund <imaund@codeaurora.org>
2014-02-07 13:49:46 -08:00
Naoya Horiguchi 73c908594f mm: migrate: check movability of hugepage in unmap_and_move_huge_page()
Currently hugepage migration works well only for pmd-based hugepages
(mainly due to lack of testing,) so we had better not enable migration of
other levels of hugepages until we are ready for it.

Some users of hugepage migration (mbind, move_pages, and migrate_pages) do
page table walk and check pud/pmd_huge() there, so they are safe.  But the
other users (softoffline and memory hotremove) don't do this, so without
this patch they can try to migrate unexpected types of hugepages.

To prevent this, we introduce hugepage_migration_support() as an
architecture dependent check of whether hugepage are implemented on a pmd
basis or not.  And on some architecture multiple sizes of hugepages are
available, so hugepage_migration_support() also checks hugepage size.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Rik van Riel <riel@redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 83467efbdb7948146581a56cbd683a22a0684bbb
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[imaund@codeaurora.org: resolve merge conflicts. Since we are only picking
    changes related to arm64, there were additional changes in the tail
    which weren't needed.]
Signed-off-by: Ian Maund <imaund@codeaurora.org>
2014-02-07 13:49:46 -08:00
Jiang Liu b78e83204c mm: concentrate modification of totalram_pages into the mm core
Concentrate code to modify totalram_pages into the mm core, so the arch
memory initialized code doesn't need to take care of it.  With these
changes applied, only following functions from mm core modify global
variable totalram_pages: free_bootmem_late(), free_all_bootmem(),
free_all_bootmem_node(), adjust_managed_page_count().

With this patch applied, it will be much more easier for us to keep
totalram_pages and zone->managed_pages in consistence.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: <sworddragon2@aol.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michel Lespinasse <walken@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 0c988534737a358fdff42fcce78f0ff1a12dbfc5
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Ian Maund <imaund@codeaurora.org>
2014-02-07 13:49:40 -08:00
Linus Torvalds 5a148af669 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull powerpc update from Benjamin Herrenschmidt:
 "The main highlights this time around are:

   - A pile of addition POWER8 bits and nits, such as updated
     performance counter support (Michael Ellerman), new branch history
     buffer support (Anshuman Khandual), base support for the new PCI
     host bridge when not using the hypervisor (Gavin Shan) and other
     random related bits and fixes from various contributors.

   - Some rework of our page table format by Aneesh Kumar which fixes a
     thing or two and paves the way for THP support.  THP itself will
     not make it this time around however.

   - More Freescale updates, including Altivec support on the new e6500
     cores, new PCI controller support, and a pile of new boards support
     and updates.

   - The usual batch of trivial cleanups & fixes"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (156 commits)
  powerpc: Fix build error for book3e
  powerpc: Context switch the new EBB SPRs
  powerpc: Turn on the EBB H/FSCR bits
  powerpc: Replace CPU_FTR_BCTAR with CPU_FTR_ARCH_207S
  powerpc: Setup BHRB instructions facility in HFSCR for POWER8
  powerpc: Fix interrupt range check on debug exception
  powerpc: Update tlbie/tlbiel as per ISA doc
  powerpc: Print page size info during boot
  powerpc: print both base and actual page size on hash failure
  powerpc: Fix hpte_decode to use the correct decoding for page sizes
  powerpc: Decode the pte-lp-encoding bits correctly.
  powerpc: Use encode avpn where we need only avpn values
  powerpc: Reduce PTE table memory wastage
  powerpc: Move the pte free routines from common header
  powerpc: Reduce the PTE_INDEX_SIZE
  powerpc: Switch 16GB and 16MB explicit hugepages to a different page table format
  powerpc: New hugepage directory format
  powerpc: Don't truncate pgd_index wrongly
  powerpc: Don't hard code the size of pte page
  powerpc: Save DAR and DSISR in pt_regs on MCE
  ...
2013-05-02 10:16:16 -07:00
Linus Torvalds 20b4fb4852 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull VFS updates from Al Viro,

Misc cleanups all over the place, mainly wrt /proc interfaces (switch
create_proc_entry to proc_create(), get rid of the deprecated
create_proc_read_entry() in favor of using proc_create_data() and
seq_file etc).

7kloc removed.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (204 commits)
  don't bother with deferred freeing of fdtables
  proc: Move non-public stuff from linux/proc_fs.h to fs/proc/internal.h
  proc: Make the PROC_I() and PDE() macros internal to procfs
  proc: Supply a function to remove a proc entry by PDE
  take cgroup_open() and cpuset_open() to fs/proc/base.c
  ppc: Clean up scanlog
  ppc: Clean up rtas_flash driver somewhat
  hostap: proc: Use remove_proc_subtree()
  drm: proc: Use remove_proc_subtree()
  drm: proc: Use minor->index to label things, not PDE->name
  drm: Constify drm_proc_list[]
  zoran: Don't print proc_dir_entry data in debug
  reiserfs: Don't access the proc_dir_entry in r_open(), r_start() r_show()
  proc: Supply an accessor for getting the data from a PDE's parent
  airo: Use remove_proc_subtree()
  rtl8192u: Don't need to save device proc dir PDE
  rtl8187se: Use a dir under /proc/net/r8180/
  proc: Add proc_mkdir_data()
  proc: Move some bits from linux/proc_fs.h to linux/{of.h,signal.h,tty.h}
  proc: Move PDE_NET() to fs/proc/proc_net.c
  ...
2013-05-01 17:51:54 -07:00
Jiang Liu 88448b65a8 mm/SH: use common help functions to free reserved pages
Use common help functions to free reserved pages.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29 15:54:31 -07:00
Paul Bolle 45b02f8d94 memblock: kill "config MAX_ACTIVE_REGIONS"
The Kconfig symbol MAX_ACTIVE_REGIONS is unused. Commit
0ee332c145 ("memblock: Kill
early_node_map[]") removed the only place were it was actually used. But
it did not remove its Kconfig entries (for powerpc and sh).

Remove those two entries (and the entry for metag, that popped up in
v3.9-rc1).

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18 13:03:53 +10:00
Al Viro d9dda78bad procfs: new helper - PDE_DATA(inode)
The only part of proc_dir_entry the code outside of fs/proc
really cares about is PDE(inode)->data.  Provide a helper
for that; static inline for now, eventually will be moved
to fs/proc, along with the knowledge of struct proc_dir_entry
layout.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-09 14:13:32 -04:00
Linus Torvalds d895cb1af1 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs pile (part one) from Al Viro:
 "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
  locking violations, etc.

  The most visible changes here are death of FS_REVAL_DOT (replaced with
  "has ->d_weak_revalidate()") and a new helper getting from struct file
  to inode.  Some bits of preparation to xattr method interface changes.

  Misc patches by various people sent this cycle *and* ocfs2 fixes from
  several cycles ago that should've been upstream right then.

  PS: the next vfs pile will be xattr stuff."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
  saner proc_get_inode() calling conventions
  proc: avoid extra pde_put() in proc_fill_super()
  fs: change return values from -EACCES to -EPERM
  fs/exec.c: make bprm_mm_init() static
  ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
  ocfs2: fix possible use-after-free with AIO
  ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
  get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
  target: writev() on single-element vector is pointless
  export kernel_write(), convert open-coded instances
  fs: encode_fh: return FILEID_INVALID if invalid fid_type
  kill f_vfsmnt
  vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
  nfsd: handle vfs_getattr errors in acl protocol
  switch vfs_getattr() to struct path
  default SET_PERSONALITY() in linux/elf.h
  ceph: prepopulate inodes only when request is aborted
  d_hash_and_lookup(): export, switch open-coded instances
  9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
  9p: split dropping the acls from v9fs_set_create_acl()
  ...
2013-02-26 20:16:07 -08:00
Wen Congyang 24d335ca36 memory-hotplug: introduce new arch_remove_memory() for removing page table
For removing memory, we need to remove page tables.  But it depends on
architecture.  So the patch introduce arch_remove_memory() for removing
page table.  Now it only calls __remove_pages().

Note: __remove_pages() for some archtecuture is not implemented
      (I don't know how to implement it for s390).

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Wu Jianguo <wujianguo@huawei.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-23 17:50:12 -08:00
Al Viro 496ad9aa8e new helper: file_inode(file)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-22 23:31:31 -05:00
Kees Cook 0d57af1ec7 arch/sh: remove depends on CONFIG_EXPERIMENTAL
The CONFIG_EXPERIMENTAL config item has not carried much meaning for a
while now and is almost always enabled by default. As agreed during the
Linux kernel summit, remove it from any "depends on" lines in Kconfigs.

CC: Paul Mundt <lethal@linux-sh.org>
CC: Tejun Heo <tj@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-21 14:43:13 -08:00
Linus Torvalds 3d59eebc5e Automatic NUMA Balancing V11
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.18 (GNU/Linux)
 
 iQIcBAABAgAGBQJQx0kQAAoJEHzG/DNEskfi4fQP/R5PRovayroZALBMLnVJDaLD
 Ttr9p40VNXbiJ+MfRgatJjSSJZ4Jl+fC3NEqBhcwVZhckZZb9R2s0WtrSQo5+ZbB
 vdRfiuKoCaKM4cSZ08C12uTvsF6xjhjd27CTUlMkyOcDoKxMEFKelv0hocSxe4Wo
 xqlv3eF+VsY7kE1BNbgBP06SX4tDpIHRxXfqJPMHaSKQmre+cU0xG2GcEu3QGbHT
 DEDTI788YSaWLmBfMC+kWoaQl1+bV/FYvavIAS8/o4K9IKvgR42VzrXmaFaqrbgb
 72ksa6xfAi57yTmZHqyGmts06qYeBbPpKI+yIhCMInxA9CY3lPbvHppRf0RQOyzj
 YOi4hovGEMJKE+BCILukhJcZ9jCTtS3zut6v1rdvR88f4y7uhR9RfmRfsxuW7PNj
 3Rmh191+n0lVWDmhOs2psXuCLJr3LEiA0dFffN1z8REUTtTAZMsj8Rz+SvBNAZDR
 hsJhERVeXB6X5uQ5rkLDzbn1Zic60LjVw7LIp6SF2OYf/YKaF8vhyWOA8dyCEu8W
 CGo7AoG0BO8tIIr8+LvFe8CweypysZImx4AjCfIs4u9pu/v11zmBvO9NO5yfuObF
 BreEERYgTes/UITxn1qdIW4/q+Nr0iKO3CTqsmu6L1GfCz3/XzPGs3U26fUhllqi
 Ka0JKgnWvsa6ez6FSzKI
 =ivQa
 -----END PGP SIGNATURE-----

Merge tag 'balancenuma-v11' of git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma

Pull Automatic NUMA Balancing bare-bones from Mel Gorman:
 "There are three implementations for NUMA balancing, this tree
  (balancenuma), numacore which has been developed in tip/master and
  autonuma which is in aa.git.

  In almost all respects balancenuma is the dumbest of the three because
  its main impact is on the VM side with no attempt to be smart about
  scheduling.  In the interest of getting the ball rolling, it would be
  desirable to see this much merged for 3.8 with the view to building
  scheduler smarts on top and adapting the VM where required for 3.9.

  The most recent set of comparisons available from different people are

    mel:    https://lkml.org/lkml/2012/12/9/108
    mingo:  https://lkml.org/lkml/2012/12/7/331
    tglx:   https://lkml.org/lkml/2012/12/10/437
    srikar: https://lkml.org/lkml/2012/12/10/397

  The results are a mixed bag.  In my own tests, balancenuma does
  reasonably well.  It's dumb as rocks and does not regress against
  mainline.  On the other hand, Ingo's tests shows that balancenuma is
  incapable of converging for this workloads driven by perf which is bad
  but is potentially explained by the lack of scheduler smarts.  Thomas'
  results show balancenuma improves on mainline but falls far short of
  numacore or autonuma.  Srikar's results indicate we all suffer on a
  large machine with imbalanced node sizes.

  My own testing showed that recent numacore results have improved
  dramatically, particularly in the last week but not universally.
  We've butted heads heavily on system CPU usage and high levels of
  migration even when it shows that overall performance is better.
  There are also cases where it regresses.  Of interest is that for
  specjbb in some configurations it will regress for lower numbers of
  warehouses and show gains for higher numbers which is not reported by
  the tool by default and sometimes missed in treports.  Recently I
  reported for numacore that the JVM was crashing with
  NullPointerExceptions but currently it's unclear what the source of
  this problem is.  Initially I thought it was in how numacore batch
  handles PTEs but I'm no longer think this is the case.  It's possible
  numacore is just able to trigger it due to higher rates of migration.

  These reports were quite late in the cycle so I/we would like to start
  with this tree as it contains much of the code we can agree on and has
  not changed significantly over the last 2-3 weeks."

* tag 'balancenuma-v11' of git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma: (50 commits)
  mm/rmap, migration: Make rmap_walk_anon() and try_to_unmap_anon() more scalable
  mm/rmap: Convert the struct anon_vma::mutex to an rwsem
  mm: migrate: Account a transhuge page properly when rate limiting
  mm: numa: Account for failed allocations and isolations as migration failures
  mm: numa: Add THP migration for the NUMA working set scanning fault case build fix
  mm: numa: Add THP migration for the NUMA working set scanning fault case.
  mm: sched: numa: Delay PTE scanning until a task is scheduled on a new node
  mm: sched: numa: Control enabling and disabling of NUMA balancing if !SCHED_DEBUG
  mm: sched: numa: Control enabling and disabling of NUMA balancing
  mm: sched: Adapt the scanning rate if a NUMA hinting fault does not migrate
  mm: numa: Use a two-stage filter to restrict pages being migrated for unlikely task<->node relationships
  mm: numa: migrate: Set last_nid on newly allocated page
  mm: numa: split_huge_page: Transfer last_nid on tail page
  mm: numa: Introduce last_nid to the page frame
  sched: numa: Slowly increase the scanning period as NUMA faults are handled
  mm: numa: Rate limit setting of pte_numa if node is saturated
  mm: numa: Rate limit the amount of memory that is migrated between nodes
  mm: numa: Structures for Migrate On Fault per NUMA migration rate limiting
  mm: numa: Migrate pages handled during a pmd_numa hinting fault
  mm: numa: Migrate on reference policy
  ...
2012-12-16 15:18:08 -08:00
David Rientjes c2d23f919b mm, oom: remove statically defined arch functions of same name
out_of_memory() is a globally defined function to call the oom killer.
x86, sh, and powerpc all use a function of the same name within file scope
in their respective fault.c unnecessarily.  Inline the functions into the
pagefault handlers to clean the code up.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-12 17:38:34 -08:00
Linus Torvalds 608ff1a210 Merge branch 'akpm' (Andrew's patchbomb)
Merge misc updates from Andrew Morton:
 "About half of most of MM.  Going very early this time due to
  uncertainty over the coreautounifiednumasched things.  I'll send the
  other half of most of MM tomorrow.  The rest of MM awaits a slab merge
  from Pekka."

* emailed patches from Andrew Morton: (71 commits)
  memory_hotplug: ensure every online node has NORMAL memory
  memory_hotplug: handle empty zone when online_movable/online_kernel
  mm, memory-hotplug: dynamic configure movable memory and portion memory
  drivers/base/node.c: cleanup node_state_attr[]
  bootmem: fix wrong call parameter for free_bootmem()
  avr32, kconfig: remove HAVE_ARCH_BOOTMEM
  mm: cma: remove watermark hacks
  mm: cma: skip watermarks check for already isolated blocks in split_free_page()
  mm, oom: fix race when specifying a thread as the oom origin
  mm, oom: change type of oom_score_adj to short
  mm: cleanup register_node()
  mm, mempolicy: remove duplicate code
  mm/vmscan.c: try_to_freeze() returns boolean
  mm: introduce putback_movable_pages()
  virtio_balloon: introduce migration primitives to balloon pages
  mm: introduce compaction and migration for ballooned pages
  mm: introduce a common interface for balloon pages mobility
  mm: redefine address_space.assoc_mapping
  mm: adjust address_space_operations.migratepage() return code
  arch/sparc/kernel/sys_sparc_64.c: s/COLOUR/COLOR/
  ...
2012-12-11 18:05:37 -08:00
Michel Lespinasse b4265f1234 mm: use vm_unmapped_area() on sh architecture
Update the sh arch_get_unmapped_area[_topdown] functions to make use of
vm_unmapped_area() instead of implementing a brute force search.

[akpm@linux-foundation.org: remove now-unused COLOUR_ALIGN_DOWN()]
Signed-off-by: Michel Lespinasse <walken@google.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:26 -08:00
Peter Zijlstra cbee9f88ec mm: numa: Add fault driven placement and migration
NOTE: This patch is based on "sched, numa, mm: Add fault driven
	placement and migration policy" but as it throws away all the policy
	to just leave a basic foundation I had to drop the signed-offs-by.

This patch creates a bare-bones method for setting PTEs pte_numa in the
context of the scheduler that when faulted later will be faulted onto the
node the CPU is running on.  In itself this does nothing useful but any
placement policy will fundamentally depend on receiving hints on placement
from fault context and doing something intelligent about it.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
2012-12-11 14:42:45 +00:00
Cyril Chemparathy 7e6735c357 /dev/mem: use phys_addr_t for physical addresses
This patch fixes the /dev/mem driver to use phys_addr_t for physical
addresses.  This is required on PAE systems, especially those that run
entirely out of >4G physical memory space.

Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-10-24 15:32:50 -07:00
Shaohua Li 45cac65b0f readahead: fault retry breaks mmap file read random detection
.fault now can retry.  The retry can break state machine of .fault.  In
filemap_fault, if page is miss, ra->mmap_miss is increased.  In the second
try, since the page is in page cache now, ra->mmap_miss is decreased.  And
these are done in one fault, so we can't detect random mmap file access.

Add a new flag to indicate .fault is tried once.  In the second try, skip
ra->mmap_miss decreasing.  The filemap_fault state machine is ok with it.

I only tested x86, didn't test other archs, but looks the change for other
archs is obvious, but who knows :)

Signed-off-by: Shaohua Li <shaohua.li@fusionio.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:47 +09:00
Paul Mundt 90eed7d87b sh: Fix up recursive fault in oops with unset TTB.
Presently the oops code looks for the pgd either from the mm context or
the cached TTB value. There are presently cases where the TTB can be
unset or otherwise cleared by hardware, which we weren't handling,
resulting in recursive faults on the NULL pgd. In these cases we can
simply reload from swapper_pg_dir and continue on as normal.

Cc: stable@vger.kernel.org
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-07-25 13:11:13 +09:00
Paul Mundt 0412ddc822 sh64: Fix up section mismatch warnings.
WARNING: vmlinux.o(.cpuinit.text+0x280): Section mismatch in reference from the function cpu_probe() to the function .init.text:sh64_tlb_init()
The function __cpuinit cpu_probe() references
a function __init sh64_tlb_init().
If sh64_tlb_init is only used by cpu_probe then
annotate sh64_tlb_init with a matching annotation.

sh64_tlb_init() simply needs to be __cpuinit annotated, so fix that up.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-06-14 15:05:53 +09:00
Paul Mundt d8fd35fc58 sh64: Fix up vmalloc fault range check.
With the previous attempt reverted this switches to conditionalizing the
end address. Nominally VMALLOC_END, but extended for P3_ADDR_MAX in the
store queue case.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-05-18 20:01:16 +09:00
Paul Mundt c3e0af9879 Revert "sh: Ensure fixmap and store queue space can co-exist."
This reverts commit 20e7c297ef.
With store queues enabled the area above P4SEG has special properties
from the MMU's point of view, which was causing fixmap failure. We'll
have to do something else to satisfy the vmalloc range check.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-05-18 19:30:05 +09:00
Paul Mundt fd37e75ed5 sh64: Set additional fault code values.
The SSR.MD status amongst other things are already made available, which
can be used for encoding a more precise fault code value.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-05-14 17:46:49 +09:00
Paul Mundt 392c3822a6 sh64: Tidy up and consolidate the TLB miss fast path.
This unifies the fast-path TLB miss handler, allowing for further cleanup
and eventual utilization of a shared _32/_64 handler.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-05-14 17:24:21 +09:00
Paul Mundt 2ec08e141f sh64: Fix up caller-save register settings for fast-path.
Now that the fast-path handler has been moved, we also need to update the
Makefile to ensure that the same restrictions for caller-save registers
are observed.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-05-14 16:46:07 +09:00
Paul Mundt 4de5185629 sh64: Invert page fault fast-path error path values.
This brings the sh64 version in line with the sh32 one with regards to
how errors are handled. Base work for further unification of the
implementations.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-05-14 16:44:45 +09:00
Paul Mundt c06fd28387 sh64: Migrate to __update_tlb() API.
Now that we have a method for finding out if we're handling an ITLB fault
or not without passing it all the way down the chain, it's possible to
use the __update_tlb() interface in place of a special __do_tlb_refill().

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-05-14 15:52:28 +09:00
Paul Mundt 28080329ed sh: Enable shared page fault handler for _32/_64.
This moves the now generic _32 page fault handling code to a shared place
and adapts the _64 implementation to make use of it.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-05-14 15:33:28 +09:00
Paul Mundt e45af0e083 sh64: Kill off unused fixed I/O mapping window.
This was reworked some time ago to go through fixmaps instead, leaving
the range itself unused. As such, kill off the remaining references and
hand over the remaining space for fixmaps directly. This also makes it
possible to simplify the vmalloc fault case as we no longer have to care
about the special section.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-05-14 15:16:11 +09:00
Paul Mundt 20e7c297ef sh: Ensure fixmap and store queue space can co-exist.
At the moment the top of the fixmap space is calculated from P4SEG, which
places it at the end of the store queue space when that API is enabled.
Make sure we use P3_ADDR_MAX here instead to find the proper address
limit. With this done, it's also possible to switch to the generic
vmalloc address range check now that VMALLOC_START/END encapsulate the
translatable areas that we care about.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-05-14 15:11:35 +09:00
Paul Mundt 9a7b7739f9 sh64: Utilize thread fault code encoding.
This plugs in fault code encoding for the sh64 page fault, too.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-05-14 15:07:52 +09:00
Paul Mundt 5a1dc78a38 sh: Support thread fault code encoding.
This provides a simple interface modelled after sparc64/m32r to encode
the error code in the upper byte of thread_info for finer-grained
handling in the page fault path.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-05-14 14:57:28 +09:00
Paul Mundt dbdb4e9f3f sh: Tidy up and generalize page fault error paths.
This follows the x86 changes for tidying up the page fault error paths.
We'll build on top of this for _32/_64 unification.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-05-14 10:27:34 +09:00
Paul Mundt b2212ea41d sh64: Kill off unused trap_no/error_code from thread_struct.
While the trap number and error code are passed around for debugging
purposes, this occurs wholly independently of the thread struct values.
These values were never part of the sigcontext ABI and are thus never
passed anywhere, so we can just kill them off across the board.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-04-19 17:52:20 +09:00
Paul Mundt fb56a91922 Merge branches 'sh/st-integration' and 'sh/stackprotector' into sh-latest 2012-04-19 17:31:59 +09:00
Stuart Menefy 45c0e0e25e sh: Improve oops error reporting
In some cases the opps error reporting doesn't give enough information
to diagnose the problem, only printing information if it is thought
to be valid. Replace the current code with more detailed output.

This code is based on the ARM reporting, with minor changes for the SH.

[lethal@linux-sh.org: fixed up for 64-bit PTEs and pte_offset_kernel()]
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-04-19 17:25:03 +09:00
Stuart Menefy 8d9a784d1e sh: Fix error synchronising kernel page tables
The problem is caused by the interaction of two features in the Linux
memory management code.

A processes address space is described by a struct mm_struct, and
every thread has a pointer to the mm it should run in. The exception
to this are kernel threads, which don't have an mm, and so borrow
the mm from the last thread which ran. The system is bootstrapped
by the initial kernel thread using init's mm (even though init hasn't
been created yet, its mm is the static init_mm).

The other feature is how the kernel handles the page table which
describes the portion of the address space which is only visible when
executing inside the kernel, and which is shared by all threads. On
the SH4 the only portion of the kernel's address space which described
using the page table is called P3, from 0xc0000000 to 0xdfffffff. This
portion of the address space is divided into three:
  - mappings for dma_alloc_coherent()
  - mappings for vmalloc() and ioremap()
  - fixmap mappings, primarily used in copy_user_pages() to create
    kernel mappings of user pages with the correct cache colour.

To optimise the TLB miss handler we don't want to add an additional
condition which checks whether the faulting address is in the user or
the kernel portion of the address space, and so all page tables have a
common portion which describes the kernel part of the address
space. As the SH4 uses a two level page table, only the kernel portion
of first level page table (the pgd entries) is duplicated. These all
point to the same second level entries (the pte's), and so no memory
is wasted.

The reference page table for the kernel is called the swapper_pg_dir,
and when a new page table is created for a new process the kernel
portion of the page table is copied from swapper_pg_dir. This works
fine when changes only occur in the second level of the kernel's page
table, or the first level entries are created before any new user
processes. However if a change occurs to the first level of the page
table, and there are existing processes which don't have this entry in
their page table, this new entry needs to be added. This is done on
demand, when the kernel accesses a P3 address which isn't mapped using
the current page table, the code in vmalloc_fault() copies the entry
from the reference page table (swapper_pg_dir) into the current
processes page table.

The bug which this patch addresses is that the code in vmalloc_fault()
was not copying addresses which fell in the dma_alloc_coherent()
portion of the address space, and it should have been copying any P3
address.

Why we hadn't seen this before, and what made this hard to reproduce,
is that normally the kernel will have called dma_alloc_coherent(), and
accessed the memory mapping created, before any user process
runs. Typically drivers such as USB or SATA will have created and used
mappings of this type during the kernel initialisation, when probing
for the attached devices, before init runs. Ethernet is slightly
different, as it normally only creates and accesses
dma_alloc_coherent() mappings when the network is brought up, but if
kernel level IP configuration is used this will also occur before any
user space process runs. So the first reproduction of this problem
which we saw was occurred when USB and SATA were removed from the
kernel, and then bring up Ethernet from user space using ifconfig.
I'd like to thank Joseph Bormolini who did the hard work reducing the
problem to this simple to reproduce criteria.

In your case the situation is slightly different, and turns out to
depends on the exact kernel configuration (which we had) and your
ramdisk contents (which we didn't - hence the need for some assumptions).

In this case the problem is a side effect of kernel level module
loading. Kernel subsystems sometimes trigger the load of kernel
modules directly, for example the crypto subsystem tries to load the
cryptomgr and MTD tries to load modules for Flash partitioning if
these are not built into the kernel. This is done by the kernel
creating a user process which runs insmod to try and load the
appropriate module.

In order for this to cause problems the system must be running with a
initrd or initramfs, which contains an insmod executable - if the
kernel can't find an insmod to run, no user process is created, and
the problem doesn't occur.  If an insmod is found, a process is
created to run it, which will inherit the kernel portion of the
swapper_pg_dir first level page table. It doesn't matter whether the
inmod is successful or not, but when the the kernel scheduler context
switches back to the kernel initialisation thread, the insmod's mm is
'borrowed' by the kernel thread, as it doesn't have an address space
of its own. (Reference counting is used to ensure this mm is not
destroyed, even though the user process which caused its creation may no
longer exist.) If this address space doesn't have a first level page
table entry for the consistent mappings, and a driver tries to access
such a mapping, we are in the same situation as described above,
except this time in a kernel thread rather than a user thread
executing inside the kernel.

See bugzilla: 15425, 15836, 15862, 16106, 16793

Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-04-19 15:57:44 +09:00
Paul Mundt ba2a3cdf76 sh64: Kill off dead page fault debug cruft.
In the future we'll be unifying some of the 32/64 page fault path, so
start to tidy up the _64 one by killing off some of the unused debug
cruft.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-04-11 12:53:06 +09:00
Paul Mundt a1e2030122 sh64: Port OOM changes to do_page_fault
Reflect the sh32 OOM changes for the sh64 page fault handler, too.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-04-11 12:44:50 +09:00
Kautuk Consul 11fd982400 sh/mm/fault_32.c: Port OOM changes to do_page_fault
Commit d065bd810b
(mm: retry page fault when blocking on disk transfer) and
commit 37b23e0525
(x86,mm: make pagefault killable)

The above commits introduced changes into the x86 pagefault handler
for making the page fault handler retryable as well as killable.

These changes reduce the mmap_sem hold time, which is crucial
during OOM killer invocation.

Port these changes to the 32-bit SH platform.

Signed-off-by: Kautuk Consul <consul.kautuk@gmail.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-04-11 12:37:54 +09:00
Linus Torvalds 664481ed45 SuperH updates for 3.4-rc1
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iEYEABECAAYFAk99uBgACgkQGkmNcg7/o7hglwCgqi6CE7i5gyneNYBn2ocRps4O
 y1UAoMSIscO6YWcHPuxOiNBbJYUy/jMI
 =SEO8
 -----END PGP SIGNATURE-----

Merge tag 'sh-for-linus' of git://github.com/pmundt/linux-sh

Pull SuperH fixes from Paul Mundt.

* tag 'sh-for-linus' of git://github.com/pmundt/linux-sh:
  sh: fix clock-sh7757 for the latest sh_mobile_sdhi driver
  serial: sh-sci: use serial_port_in/out vs sci_in/out.
  sh: vsyscall: Fix up .eh_frame generation.
  sh: dma: Fix up device attribute mismatch from sysdev fallout.
  sh: dwarf unwinder depends on SHcompact.
  sh: fix up fallout from system.h disintegration.
2012-04-07 09:52:46 -07:00
Linus Torvalds 58bca4a8fa Merge branch 'for-linus' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping
Pull DMA mapping branch from Marek Szyprowski:
 "Short summary for the whole series:

  A few limitations have been identified in the current dma-mapping
  design and its implementations for various architectures.  There exist
  more than one function for allocating and freeing the buffers:
  currently these 3 are used dma_{alloc, free}_coherent,
  dma_{alloc,free}_writecombine, dma_{alloc,free}_noncoherent.

  For most of the systems these calls are almost equivalent and can be
  interchanged.  For others, especially the truly non-coherent ones
  (like ARM), the difference can be easily noticed in overall driver
  performance.  Sadly not all architectures provide implementations for
  all of them, so the drivers might need to be adapted and cannot be
  easily shared between different architectures.  The provided patches
  unify all these functions and hide the differences under the already
  existing dma attributes concept.  The thread with more references is
  available here:

    http://www.spinics.net/lists/linux-sh/msg09777.html

  These patches are also a prerequisite for unifying DMA-mapping
  implementation on ARM architecture with the common one provided by
  dma_map_ops structure and extending it with IOMMU support.  More
  information is available in the following thread:

    http://thread.gmane.org/gmane.linux.kernel.cross-arch/12819

  More works on dma-mapping framework are planned, especially in the
  area of buffer sharing and managing the shared mappings (together with
  the recently introduced dma_buf interface: commit d15bd7ee44
  "dma-buf: Introduce dma buffer sharing mechanism").

  The patches in the current set introduce a new alloc/free methods
  (with support for memory attributes) in dma_map_ops structure, which
  will later replace dma_alloc_coherent and dma_alloc_writecombine
  functions."

People finally started piping up with support for merging this, so I'm
merging it as the last of the pending stuff from the merge window.
Looks like pohmelfs is going to wait for 3.5 and more external support
for merging.

* 'for-linus' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
  common: DMA-mapping: add NON-CONSISTENT attribute
  common: DMA-mapping: add WRITE_COMBINE attribute
  common: dma-mapping: introduce mmap method
  common: dma-mapping: remove old alloc_coherent and free_coherent methods
  Hexagon: adapt for dma_map_ops changes
  Unicore32: adapt for dma_map_ops changes
  Microblaze: adapt for dma_map_ops changes
  SH: adapt for dma_map_ops changes
  Alpha: adapt for dma_map_ops changes
  SPARC: adapt for dma_map_ops changes
  PowerPC: adapt for dma_map_ops changes
  MIPS: adapt for dma_map_ops changes
  X86 & IA64: adapt for dma_map_ops changes
  common: dma-mapping: introduce generic alloc() and free() methods
2012-04-04 17:13:43 -07:00