-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJZUiosAAoJEE44bZycYXAvcHYP/1OKMYQB/3G7GfEhMXdlpV31
VjdzUg5X1JOE60anYNopvWQJgDFXMy9mTceUI3axDkfYb5iDFUpRBFEh70ggDL04
bGB/J4n2Linjkj35u+S5P3fK6qBfg9+VDpTfUYPZGB5YjOjmaD06E8InBF8iUuC3
6pkMtQKOptmKOc2hw84PsB3qm9ER2MMa92Lrs1rtcOihEqQMyKjkI/kzogs8XGje
5gMt31VweScZed3d7i1r9tl/DTmzGcpEyVpz/x8gI7Xwi69FeeLy6cWbhK0VOsLA
u7ul9mDa77bUC/jpBzJmIkS8fhzaTyUw8NQbtol9RSSIfzb+mvXyx9Vr7o4LYK2B
P6AekC16x6R8KUED1hfxKdagguRACDfKf91bMAxDCN/PXqITVbk3RxxxH6wHAvOx
Ihf4G5h800/ks6X1oMBYZcbFFbNCUHZjyL7V1M/iy1TrKuRhEtou4Ft3X+gOauLS
CG8VR9Jo1/BAvMaJmy5Hg9RPNoxEMstDi6x3ugD0wH57XHSZ5QmFMBzCbuWR6hWM
q1DvBK/I54BXlsdYU9WySn1hm2gKCNPZ+zGzLTo1l426vme+YjhC5911V7Tv+WHm
lc5FTXWtXGhoAZuNSIGDrlv3Dyq44iMNrqXrhlPmJjWD3Hx4hFGGp2GyHOpK+5+7
7egPk9m1WrhUKzA9m1/M
=InCr
-----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEJDfLduVEy2qz2d/TmXOSYMtstxYFAlpqfQUACgkQmXOSYMts
txZNghAApD/SW4fTOx6RZFCPVjAP70FfXvZsQYf3Zfp44Ytm2Kax3GIABPuknlI+
IZRAPnXb6KP8DNDdCyGcJ0avI5uw96sXyeZWlDZyeS1WHHizJq3+BLB09zzdegSk
K1dJrobXCYNESmcQMT5diGwqLYkdOs3hh7Ehqut29njwCzVzNG3n43H9F15o9cUZ
6lAM8/Zb6ai+0KgVgwC40QJneVltDEFfXVr6wo/IJXnYNaRCPKQM5lsG09pxxopG
NVSsmUyeJI5bPWEm5vbuBL2JVhaCcMtTfAPHflqbtykE8eSVEWdTeCWPuGWcATB+
2sGp3cVR2W7+4CHpbcnrXolmP/OI3jXHbG1LvyRqg4Iw1jgtZ8wwjCEkdsPz3fED
g2+EtSYl/NLW7N8P4KQV9jzihYIfELBj9HQsEs5aPOstyjyxl12RxJvjw835v5ts
oa7qKQAHIwZsuaB34qK+DjI5coNeKRvDMy5mm0GL3TqmLLFEzSVpaTceGpdvNLi0
6k3RkuJzU0TwAoTShWyYu6AbV+8aHniBQbjzYs5sufRgDy9pjnfWzDqtUM+chTsm
WaxwhpHdpOomwAfZr8/Zaf0xIxP/M99SFKevntE04Ft93P8dKuLqFcNAjQkMdibY
UHrJ67nBllmDtlH8yGO9j4FD89O0QaBX4J3qGyIu5eE73/iibvo=
=J7vi
-----END PGP SIGNATURE-----
Merge 3.10.107 into android-msm-bullhead-3.10-oreo-m5
Changes in 3.10.107: (270 commits)
Revert "Btrfs: don't delay inode ref updates during log, replay"
Btrfs: fix memory leak in reading btree blocks
ext4: use more strict checks for inodes_per_block on mount
ext4: fix in-superblock mount options processing
ext4: add sanity checking to count_overhead()
ext4: validate s_first_meta_bg at mount time
jbd2: don't leak modified metadata buffers on an aborted journal
ext4: fix fencepost in s_first_meta_bg validation
ext4: trim allocation requests to group size
ext4: preserve the needs_recovery flag when the journal is aborted
ext4: return EROFS if device is r/o and journal replay is needed
ext4: fix inode checksum calculation problem if i_extra_size is small
block: fix use-after-free in sys_ioprio_get()
block: allow WRITE_SAME commands with the SG_IO ioctl
block: fix del_gendisk() vs blkdev_ioctl crash
dm crypt: mark key as invalid until properly loaded
dm space map metadata: fix 'struct sm_metadata' leak on failed create
md/raid5: limit request size according to implementation limits
md:raid1: fix a dead loop when read from a WriteMostly disk
md linear: fix a race between linear_add() and linear_congested()
CIFS: Fix a possible memory corruption during reconnect
CIFS: Fix missing nls unload in smb2_reconnect()
CIFS: Fix a possible memory corruption in push locks
CIFS: remove bad_network_name flag
fs/cifs: make share unaccessible at root level mountable
cifs: Do not send echoes before Negotiate is complete
ocfs2: fix crash caused by stale lvb with fsdlm plugin
ocfs2: fix BUG_ON() in ocfs2_ci_checkpointed()
can: raw: raw_setsockopt: limit number of can_filter that can be set
can: peak: fix bad memory access and free sequence
can: c_can_pci: fix null-pointer-deref in c_can_start() - set device pointer
can: ti_hecc: add missing prepare and unprepare of the clock
can: bcm: fix hrtimer/tasklet termination in bcm op removal
can: usb_8dev: Fix memory leak of priv->cmd_msg_buffer
ALSA: hda - Fix up GPIO for ASUS ROG Ranger
ALSA: seq: Fix race at creating a queue
ALSA: seq: Don't handle loop timeout at snd_seq_pool_done()
ALSA: timer: Reject user params with too small ticks
ALSA: seq: Fix link corruption by event error handling
ALSA: seq: Fix racy cell insertions during snd_seq_pool_done()
ALSA: seq: Fix race during FIFO resize
ALSA: seq: Don't break snd_use_lock_sync() loop by timeout
ALSA: usb-audio: Add QuickCam Communicate Deluxe/S7500 to volume_control_quirks
usb: gadgetfs: restrict upper bound on device configuration size
USB: gadgetfs: fix unbounded memory allocation bug
USB: gadgetfs: fix use-after-free bug
USB: gadgetfs: fix checks of wTotalLength in config descriptors
xhci: free xhci virtual devices with leaf nodes first
USB: serial: io_ti: bind to interface after fw download
usb: gadget: composite: always set ep->mult to a sensible value
USB: cdc-acm: fix double usb_autopm_put_interface() in acm_port_activate()
USB: cdc-acm: fix open and suspend race
USB: cdc-acm: fix failed open not being detected
usb: dwc3: gadget: make Set Endpoint Configuration macros safe
usb: host: xhci-plat: Fix timeout on removal of hot pluggable xhci controllers
usb: dwc3: gadget: delay unmap of bounced requests
usb: hub: Wait for connection to be reestablished after port reset
usb: gadget: composite: correctly initialize ep->maxpacket
USB: UHCI: report non-PME wakeup signalling for Intel hardware
arm/xen: Use alloc_percpu rather than __alloc_percpu
xfs: set AGI buffer type in xlog_recover_clear_agi_bucket
xfs: clear _XBF_PAGES from buffers when readahead page
ssb: Fix error routine when fallback SPROM fails
drivers/gpu/drm/ast: Fix infinite loop if read fails
scsi: avoid a permanent stop of the scsi device's request queue
scsi: move the nr_phys_segments assert into scsi_init_io
scsi: don't BUG_ON() empty DMA transfers
scsi: storvsc: properly handle SRB_ERROR when sense message is present
scsi: storvsc: properly set residual data length on errors
target/pscsi: Fix TYPE_TAPE + TYPE_MEDIMUM_CHANGER export
scsi: lpfc: Add shutdown method for kexec
scsi: sr: Sanity check returned mode data
scsi: sd: Fix capacity calculation with 32-bit sector_t
s390/vmlogrdr: fix IUCV buffer allocation
libceph: verify authorize reply on connect
nfs_write_end(): fix handling of short copies
powerpc/ps3: Fix system hang with GCC 5 builds
sg_write()/bsg_write() is not fit to be called under KERNEL_DS
ftrace/x86: Set ftrace_stub to weak to prevent gcc from using short jumps to it
cred/userns: define current_user_ns() as a function
net: ti: cpmac: Fix compiler warning due to type confusion
tick/broadcast: Prevent NULL pointer dereference
netvsc: reduce maximum GSO size
drop_monitor: add missing call to genlmsg_end
drop_monitor: consider inserted data in genlmsg_end
igmp: Make igmp group member RFC 3376 compliant
HID: hid-cypress: validate length of report
Input: xpad - use correct product id for x360w controllers
Input: i8042 - add noloop quirk for Dell Embedded Box PC 3000
Input: iforce - validate number of endpoints before using them
Input: kbtab - validate number of endpoints before using them
Input: joydev - do not report stale values on first open
Input: tca8418 - use the interrupt trigger from the device tree
Input: mpr121 - handle multiple bits change of status register
Input: mpr121 - set missing event capability
Input: i8042 - add Clevo P650RS to the i8042 reset list
i2c: fix kernel memory disclosure in dev interface
vme: Fix wrong pointer utilization in ca91cx42_slave_get
sysrq: attach sysrq handler correctly for 32-bit kernel
pinctrl: sh-pfc: Do not unconditionally support PIN_CONFIG_BIAS_DISABLE
x86/PCI: Ignore _CRS on Supermicro X8DTH-i/6/iF/6F
qla2xxx: Fix crash due to null pointer access
ARM: 8634/1: hw_breakpoint: blacklist Scorpion CPUs
ARM: dts: da850-evm: fix read access to SPI flash
NFSv4: Ensure nfs_atomic_open set the dentry verifier on ENOENT
vmxnet3: Wake queue from reset work
Fix memory leaks in cifs_do_mount()
Compare prepaths when comparing superblocks
Move check for prefix path to within cifs_get_root()
Fix regression which breaks DFS mounting
apparmor: fix uninitialized lsm_audit member
apparmor: exec should not be returning ENOENT when it denies
apparmor: fix disconnected bind mnts reconnection
apparmor: internal paths should be treated as disconnected
apparmor: check that xindex is in trans_table bounds
apparmor: add missing id bounds check on dfa verification
apparmor: don't check for vmalloc_addr if kvzalloc() failed
apparmor: fix oops in profile_unpack() when policy_db is not present
apparmor: fix module parameters can be changed after policy is locked
apparmor: do not expose kernel stack
vfio/pci: Fix integer overflows, bitmask check
bna: Add synchronization for tx ring.
sg: Fix double-free when drives detach during SG_IO
move the call of __d_drop(anon) into __d_materialise_unique(dentry, anon)
serial: 8250_pci: Detach low-level driver during PCI error recovery
bnx2x: Correct ringparam estimate when DOWN
tile/ptrace: Preserve previous registers for short regset write
sysctl: fix proc_doulongvec_ms_jiffies_minmax()
ISDN: eicon: silence misleading array-bounds warning
ARC: [arcompact] handle unaligned access delay slot corner case
parisc: Don't use BITS_PER_LONG in userspace-exported swab.h header
nfs: Don't increment lock sequence ID after NFS4ERR_MOVED
ipv6: addrconf: Avoid addrconf_disable_change() using RCU read-side lock
af_unix: move unix_mknod() out of bindlock
drm/nouveau/nv1a,nv1f/disp: fix memory clock rate retrieval
crypto: api - Clear CRYPTO_ALG_DEAD bit before registering an alg
ata: sata_mv:- Handle return value of devm_ioremap.
mm/memory_hotplug.c: check start_pfn in test_pages_in_a_zone()
mm, fs: check for fatal signals in do_generic_file_read()
ARC: [arcompact] brown paper bag bug in unaligned access delay slot fixup
sched/debug: Don't dump sched debug info in SysRq-W
tcp: fix 0 divide in __tcp_select_window()
macvtap: read vnet_hdr_size once
packet: round up linear to header len
vfs: fix uninitialized flags in splice_to_pipe()
siano: make it work again with CONFIG_VMAP_STACK
futex: Move futex_init() to core_initcall
rtc: interface: ignore expired timers when enqueuing new timers
irda: Fix lockdep annotations in hashbin_delete().
tty: serial: msm: Fix module autoload
rtlwifi: rtl_usb: Fix for URB leaking when doing ifconfig up/down
af_packet: remove a stray tab in packet_set_ring()
MIPS: Fix special case in 64 bit IP checksumming.
mm: vmpressure: fix sending wrong events on underflow
ipc/shm: Fix shmat mmap nil-page protection
sd: get disk reference in sd_check_events()
samples/seccomp: fix 64-bit comparison macros
ath5k: drop bogus warning on drv_set_key with unsupported cipher
rdma_cm: fail iwarp accepts w/o connection params
NFSv4: fix getacl ERANGE for some ACL buffer sizes
bcma: use (get|put)_device when probing/removing device driver
powerpc/xmon: Fix data-breakpoint
KVM: VMX: use correct vmcs_read/write for guest segment selector/base
KVM: PPC: Book3S PR: Fix illegal opcode emulation
KVM: s390: fix task size check
s390: TASK_SIZE for kernel threads
xtensa: move parse_tag_fdt out of #ifdef CONFIG_BLK_DEV_INITRD
mac80211: flush delayed work when entering suspend
drm/ast: Fix test for VGA enabled
drm/ttm: Make sure BOs being swapped out are cacheable
fat: fix using uninitialized fields of fat_inode/fsinfo_inode
drivers: hv: Turn off write permission on the hypercall page
xhci: fix 10 second timeout on removal of PCI hotpluggable xhci controllers
crypto: improve gcc optimization flags for serpent and wp512
mtd: pmcmsp: use kstrndup instead of kmalloc+strncpy
cpmac: remove hopeless #warning
mvsas: fix misleading indentation
l2tp: avoid use-after-free caused by l2tp_ip_backlog_recv
net: don't call strlen() on the user buffer in packet_bind_spkt()
dccp: Unlock sock before calling sk_free()
tcp: fix various issues for sockets morphing to listen state
uapi: fix linux/packet_diag.h userspace compilation error
ipv6: avoid write to a possibly cloned skb
dccp: fix memory leak during tear-down of unsuccessful connection request
futex: Fix potential use-after-free in FUTEX_REQUEUE_PI
futex: Add missing error handling to FUTEX_REQUEUE_PI
give up on gcc ilog2() constant optimizations
cancel the setfilesize transation when io error happen
crypto: ghash-clmulni - Fix load failure
crypto: cryptd - Assign statesize properly
ACPI / video: skip evaluating _DOD when it does not exist
Drivers: hv: balloon: don't crash when memory is added in non-sorted order
s390/pci: fix use after free in dma_init
cpufreq: Fix and clean up show_cpuinfo_cur_freq()
igb: Workaround for igb i210 firmware issue
igb: add i211 to i210 PHY workaround
ipv4: provide stronger user input validation in nl_fib_input()
tcp: initialize icsk_ack.lrcvtime at session start time
ACM gadget: fix endianness in notifications
mmc: sdhci: Do not disable interrupts while waiting for clock
uvcvideo: uvc_scan_fallback() for webcams with broken chain
fbcon: Fix vc attr at deinit
crypto: algif_hash - avoid zero-sized array
virtio_balloon: init 1st buffer in stats vq
c6x/ptrace: Remove useless PTRACE_SETREGSET implementation
sparc/ptrace: Preserve previous registers for short regset write
metag/ptrace: Preserve previous registers for short regset write
metag/ptrace: Provide default TXSTATUS for short NT_PRSTATUS
metag/ptrace: Reject partial NT_METAG_RPIPE writes
libceph: force GFP_NOIO for socket allocations
ACPI: Fix incompatibility with mcount-based function graph tracing
ACPI / power: Avoid maybe-uninitialized warning
rtc: s35390a: make sure all members in the output are set
rtc: s35390a: implement reset routine as suggested by the reference
rtc: s35390a: improve irq handling
padata: avoid race in reordering
HID: hid-lg: Fix immediate disconnection of Logitech Rumblepad 2
HID: i2c-hid: Add sleep between POWER ON and RESET
drm/vmwgfx: NULL pointer dereference in vmw_surface_define_ioctl()
drm/vmwgfx: avoid calling vzalloc with a 0 size in vmw_get_cap_3d_ioctl()
drm/vmwgfx: Remove getparam error message
drm/vmwgfx: fix integer overflow in vmw_surface_define_ioctl()
Reset TreeId to zero on SMB2 TREE_CONNECT
metag/usercopy: Drop unused macros
metag/usercopy: Zero rest of buffer from copy_from_user
powerpc: Don't try to fix up misaligned load-with-reservation instructions
mm/mempolicy.c: fix error handling in set_mempolicy and mbind.
mtd: bcm47xxpart: fix parsing first block after aligned TRX
net/packet: fix overflow in check for priv area size
x86/vdso: Plug race between mapping and ELF header setup
iscsi-target: Fix TMR reference leak during session shutdown
iscsi-target: Drop work-around for legacy GlobalSAN initiator
xen, fbfront: fix connecting to backend
char: lack of bool string made CONFIG_DEVPORT always on
platform/x86: acer-wmi: setup accelerometer when machine has appropriate notify event
platform/x86: acer-wmi: setup accelerometer when ACPI device was found
mm: Tighten x86 /dev/mem with zeroing reads
virtio-console: avoid DMA from stack
catc: Combine failure cleanup code in catc_probe()
catc: Use heap buffer for memory size test
net: ipv6: check route protocol when deleting routes
Drivers: hv: don't leak memory in vmbus_establish_gpadl()
Drivers: hv: get rid of timeout in vmbus_open()
ubi/upd: Always flush after prepared for an update
x86/mce/AMD: Give a name to MCA bank 3 when accessed with legacy MSRs
powerpc: Reject binutils 2.24 when building little endian
net/packet: fix overflow in check for tp_frame_nr
net/packet: fix overflow in check for tp_reserve
tty: nozomi: avoid a harmless gcc warning
hostap: avoid uninitialized variable use in hfa384x_get_rid
gfs2: avoid uninitialized variable warning
net: neigh: guard against NULL solicit() method
sctp: listen on the sock only when it's state is listening or closed
ip6mr: fix notification device destruction
MIPS: Fix crash registers on non-crashing CPUs
RDS: Fix the atomicity for congestion map update
xen/x86: don't lose event interrupts
p9_client_readdir() fix
nfsd: check for oversized NFSv2/v3 arguments
ftrace/x86: Fix triple fault with graph tracing and suspend-to-ram
kvm: nVMX: Allow L1 to intercept software exceptions (#BP and #OF)
tun: read vnet_hdr_sz once
printk: use rcuidle console tracepoint
ipv6: check raw payload size correctly in ioctl
x86: standardize mmap_rnd() usage
x86/mm/32: Enable full randomization on i386 and X86_32
mm: larger stack guard gap, between vmas
mm: fix new crash in unmapped_area_topdown()
Allow stack to grow up to address space limit
Linux 3.10.107
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Conflicts:
arch/x86/mm/mmap.c
drivers/mmc/host/sdhci.c
drivers/usb/host/xhci-plat.c
fs/ext4/super.c
kernel/sched/core.c
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJWz1zgAAoJEDjbvchgkmk+yU8P/10DITNzrhCfz5wbhvvn9Uvo
7H1DziOora3u9h8/rz6xqgFEz2/9cZ03KoLcpGha7kEFBsvgVhN3uSI0YFpVV2mT
8/oh1ADdkky3Pld0f7gDGydDvrmgqx83/69SQ8hDQ8Mr2QTaKNvK05QGC2/EO9kI
OcUAXjdAGglmf5rfhNhXodG/F2DtsA55uCzeyuBhcPE3bM7d4/48pwr1b2tW2CR8
hsprRvSz+kGgHXQy8jYdxKEI66OC/i22xVnxEc8PZmPZ0fFfmszzc9nzhcseWfpe
0JGgfwAtM8Va+bX4kfvqPpc2qR0r8Z2iEKNnAHnGutOvSWvow0l1OEedsb/+s1J6
/AYlPIkgTxwLDAwBIymPgowkEMOPVZzPL0tkoZI8wjB+eqUxxLlIa2dNByCyUs/U
1xTy+0UDMMDXG911mJl+yZFvd4R7lQUavIEStmMQ+A/Go2KrATaqIM8WETBlm7oH
s3hZ3E+RBWmfD/6JQwsJNkwv6yWeaRXNE+bj8C1r/uBdPyGqX9T22OaIOlio+I71
XBNEM5mrTlNeNVIUIKW29qmLBxBrH2LLwpv/dRyfOfzfhi1B+dl9+3sJauvrSmWi
jrR1khGmmaZcfOT2DVmpwlDQCQcyMcy8S8RTTAHhhuNmWtSjdc3TcfRlHXvP0sOu
ruXBufxernb94E7sqsvF
=LW9r
-----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEJDfLduVEy2qz2d/TmXOSYMtstxYFAlpqdMoACgkQmXOSYMts
txYJKxAAkVgmXLjjtJbCUkYLzohjXabtfF9ekfy7UPRdBU+PPRC2c8tHcR6LCqXd
v+hEiI80h72BqEVE4y3ztFZlhbpSonIcmRrG+/gWsWcWmY9S0owilHwhmrl3uvmC
Fvso6+5oWVvVXuM8I4Ul/3bXmScVhv/rh22iN2hhOS7WgEVdqlhmYHC/KIpRK+rD
dyUQ2eONgr14FyGswgK0zLaFKXvKhQfEjvAu4KXJek0sIPIUEVdZ5xgS2v4eLigN
W0+ewi4DCTESCU8GCnZwwU1OIbe2De09sPIVwBM644bOIJRxOJxnL0a11IjwOaye
P9ne98G3M1vTruiM+/dA40eGh7kFiKKlIqCO1mf1IqrQSYq+sNEuDSmD9XY+huRZ
ktDue8NcUmFgJzJxeRYfdatCNF/esfdIzuzbFnw+Jr+EPACn6FiOXFgkJkUpo204
wvv+nOhiYlSJQT81jqmVTn3iGyvZIJd15uCEryguNt8LmLafGlztYBZ5dSUkejcu
nAipexnYGyrufD5XhshZlcBt1S1FCQZd3lUBETmqLzP+hiZG76ti96i2ro2hnyM5
TWva2zmC1Cp89l0dWJjtNSohD4S6226Jc6ebHTDO/67gpsj3dlbH3IR7rDqKXgof
AFltzPMYnfMPYuDmANTu7vqlJGI5974xrDA1hRAUN49YVxD5YKk=
=fJ2P
-----END PGP SIGNATURE-----
Merge 3.10.98 into android-msm-bullhead-3.10-oreo-m5
Changes in 3.10.98: (55 commits)
ALSA: seq: Fix double port list deletion
wan/x25: Fix use-after-free in x25_asy_open_tty()
staging/speakup: Use tty_ldisc_ref() for paste kworker
pty: fix possible use after free of tty->driver_data
pty: make sure super_block is still valid in final /dev/tty close
AIO: properly check iovec sizes
ext4: fix potential integer overflow
Btrfs: fix hang on extent buffer lock caused by the inode_paths ioctl
perf: Fix inherited events vs. tracepoint filters
ptrace: use fsuid, fsgid, effective creds for fs access checks
tools lib traceevent: Fix output of %llu for 64 bit values read on 32 bit machines
tracing: Fix freak link error caused by branch tracer
klist: fix starting point removed bug in klist iterators
scsi: restart list search after unlock in scsi_remove_target
scsi_sysfs: Fix queue_ramp_up_period return code
iscsi-target: Fix rx_login_comp hang after login failure
Fix a memory leak in scsi_host_dev_release()
SCSI: Fix NULL pointer dereference in runtime PM
iscsi-target: Fix potential dead-lock during node acl delete
SCSI: fix crashes in sd and sr runtime PM
drivers/scsi/sg.c: mark VMA as VM_IO to prevent migration
scsi_dh_rdac: always retry MODE SELECT on command lock violation
scsi: fix soft lockup in scsi_remove_target() on module removal
iio:ad7793: Fix ad7785 product ID
iio: lpc32xx_adc: fix warnings caused by enabling unprepared clock
iio:ad5064: Make sure ad5064_i2c_write() returns 0 on success
iio: adis_buffer: Fix out-of-bounds memory access
iio: dac: mcp4725: set iio name property in sysfs
cifs: fix erroneous return value
nfs: Fix race in __update_open_stateid()
udf: limit the maximum number of indirect extents in a row
udf: Prevent buffer overrun with multi-byte characters
udf: Check output buffer length when converting name to CS0
ARM: 8519/1: ICST: try other dividends than 1
ARM: 8517/1: ICST: avoid arithmetic overflow in icst_hz()
fuse: break infinite loop in fuse_fill_write_pages()
mm: soft-offline: check return value in second __get_any_page() call
Input: elantech - add Fujitsu Lifebook U745 to force crc_enabled
Input: elantech - mark protocols v2 and v3 as semi-mt
Input: i8042 - add Fujitsu Lifebook U745 to the nomux list
iommu/vt-d: Fix 64-bit accesses to 32-bit DMAR_GSTS_REG
mm/memory_hotplug.c: check for missing sections in test_pages_in_a_zone()
xhci: Fix list corruption in urb dequeue at host removal
m32r: fix m32104ut_defconfig build fail
dma-debug: switch check from _text to _stext
scripts/bloat-o-meter: fix python3 syntax error
memcg: only free spare array when readers are done
radix-tree: fix race in gang lookup
radix-tree: fix oops after radix_tree_iter_retry
intel_scu_ipcutil: underflow in scu_reg_access()
x86/asm/irq: Stop relying on magic JMP behavior for early_idt_handlers
futex: Drop refcount if requeue_pi() acquired the rtmutex
ip6mr: call del_timer_sync() in ip6mr_free_table()
module: wrapper for symbol name.
Linux 3.10.98
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJViKCyAAoJEDjbvchgkmk+7igQANfP456IVSk/KTAi61UtDwDI
csRV4yjVE8mVethBnhVilpj6Loi3sz9vZDGApCLOrHgPYvOLHJ0VfShXYlL5spSE
uCfQTLZJiaU5/vrT4J8fy0rJhIjjsUOav1EMoYSb4CJhU2aqpyCUC14t7kvOKBug
uszH4Tu6h7Xu9n0Kf5RD34fPqrp7bx4q/a7Tw9el2ngnvs8HLuvEO5o4gCT7qG55
3IaV5rnP/V3KJeth5K7IeNmbKLhcKfNpiIBYzx+btUVBUOuf3nud/IXiFbf6y0xf
7GmG4eRUVQIyW3oXGc6aUjw5+A14Ul1hz7hfYakqQ151708WWvRbAmBNTOBeov6L
Fmcb2+NvP1bbJL6oSoqPY+sLNhjSqWYkKjoHzC7Jl24sZEvlADevXPouWeIdS3Gg
VYZNDV+BxrjBMyTaycjec9QVekjOG2Wwrm5eODFoSs37t40Zgn/wKq50Ra9aQv10
HPvqDuWBKl0QfOpaA4hiHnenjTxLoeJ9P+cddkuDWRbMwmUeDMPtrQIijMcGnPA1
E73KTGXmfwbyCwf2XMRivT52HbplEj7KdcnOHvPztL8vSjfCWJjeEUHGh/qNQ4jW
kUAmRdWxtVdFUDgpcLn14u3g4ARyDwI3941/whzrtHwi2AAJV0Rf36wAYfFNx8qB
hxGNa4EHHXWTyrSkQoc+
=1zWu
-----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEJDfLduVEy2qz2d/TmXOSYMtstxYFAlpqbB8ACgkQmXOSYMts
txY1dg//VF/h2LTWKlyrm/256LLYX1uTgCdUYDmFfTHxw2GOHOxCp4TCw4sruodP
4p8OR3NC+XgpU7fXjsuOr5uxBCOz5rAv25CEtuLSnctDr8Ck2AwK/yVzRs/JzuSr
FRo9j4zB/IrD8FdfbCbD2zOIenKiW/kFGfezk1YXA+NPS0Xr8g1HfgTfIJS8IAUU
2N3SYcjW+ypUOoUyR9aaTDtzqqJuj0qg3gVCq1fPhRP+ylUUAwmitYtBtT55S6o7
VyokIOv4DAWPf8jVGGHva/ywmOpFE+8K9ySXbMPZrETmfKuqw+6gqf0LxQzVOgTf
xe6Ze321kDMFVwnv3GhTNkwYCAkekQZVtQZcdg6WsYSHuavVLSssa5qy+qpF8740
9M5wiD5N24X5+CC82b9NZUgL/llV97+QbFS0PbOQ+J1vbi2MT1ZNde12de1phXWv
MPxImGzEAUA40HlO5krZfUAzTnwm1jKR9hO0wTImQQ+IIWpOo9HONsrxOoJwMnn6
fsUa3aILQuOBrRkga6ST5UQAjsPrm8yL6VGFEl3Fn/nAACbWG4876SAQUYyCFbQe
uYobqpIs7ZMUcGD372Sb1AyfZgLVYlIStprF0eSF31Ee7Oth0KJSyLXGwmu/6FeP
KKxyp3Idp+PXCWSITqbCSGvtJaekXaZ4JB651ycaBZNIIDl50jw=
=oodM
-----END PGP SIGNATURE-----
Merge 3.10.81 into android-msm-bullhead-3.10-oreo-m5
Changes in 3.10.81: (30 commits)
net: phy: Allow EEE for all RGMII variants
ipv4: Avoid crashing in ip_error
bridge: fix parsing of MLDv2 reports
net: dp83640: fix broken calibration routine.
unix/caif: sk_socket can disappear when state is unlocked
net_sched: invoke ->attach() after setting dev->qdisc
udp: fix behavior of wrong checksums
xen: netback: read hotplug script once at start of day.
iio: adis16400: Report pressure channel scale
iio: adis16400: Use != channel indices for the two voltage channels
iio: adis16400: Compute the scan mask from channel indices
ALSA: hda/realtek - Add a fixup for another Acer Aspire 9420
ALSA: usb-audio: Add mic volume fix quirk for Logitech Quickcam Fusion
ALSA: usb-audio: add MAYA44 USB+ mixer control names
Input: elantech - fix detection of touchpads where the revision matches a known rate
block: fix ext_dev_lock lockdep report
USB: cp210x: add ID for HubZ dual ZigBee and Z-Wave dongle
USB: serial: ftdi_sio: Add support for a Motion Tracker Development Board
ring-buffer-benchmark: Fix the wrong sched_priority of producer
MIPS: Fix enabling of DEBUG_STACKOVERFLOW
ozwpan: Use proper check to prevent heap overflow
ozwpan: divide-by-zero leading to panic
ozwpan: unchecked signed subtraction leads to DoS
pata_octeon_cf: fix broken build
drm/i915: Fix DDC probe for passive adapters
mm/memory_hotplug.c: set zone->wait_table to null after freeing it
cfg80211: wext: clear sinfo struct before calling driver
btrfs: incorrect handling for fiemap_fill_next_extent return
btrfs: cleanup orphans while looking up default subvolume
Linux 3.10.81
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJVM2NqAAoJEDjbvchgkmk+YtIQAKHNWU09GUrIxzg2va+9cVYI
pCyiUHd1JF/DLmWQG4TeBn4OowIqvwOuljPDg/0RoVrfX2cx33oAyo+R6Cgyay5c
1s7hPgTIsrV5QHTTWODXsV48fWE/AsqFqw01XvMnhMgFPRc3859Thh9zy29fwxjR
2xlzf5GBtWfmmuSLO8TtC1FOnvi7BuNKvhMR/5pJZ40kS1vpw6qpJvMPMSR2hEVT
fFfO87c9XPUhh94kRhMIaDoMk7OeZFbr0R7IJCW1WcUJVqFP8YQOK/YYLQmJERjG
OnGOF5W2VKGV0lWdMJ+NiNKZ3eLAjMHHqvzqbhl8ANU7AkRsw8bvwZeXjJJGFcqS
L9Ik94MakuuZDypyejZCC3QmlCGQUjR0PjmNGhuXZlPn63y0/dlxCEHlBxUdvdHh
OkfNDPMXqbRFzQ6ASjOPW0O41KiTOIw2oGezFkQRxq65KkGmBiCrHaEqmUtBLoP6
s5xPf7quMOvINn5GTEBTpZjGz4mH2UadCoRVXJ27Wn+KAxZqJgwpGodoyk+lHMc/
Xo3ndTVJGPnwKgAixkOINusEY2ne/TWyjPlGQBju/NoVTXsotdCf8HDtbRCY0mj4
EkxytoSnoI7/S2jGSFoUB8uDQuoQgveOSfe1IxUmWvBaIuUKHtM8h4n6f+Os7BzG
S+lI/rnXJHBkB8Oz1AGv
=vntV
-----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEJDfLduVEy2qz2d/TmXOSYMtstxYFAlpqaNgACgkQmXOSYMts
txaJrQ/+OLQnIVrcr0DdA1ZElDmt186GuGE4sNRorp8F+zNEmRKOwz0qXG8YmXq0
UzM9CvwYWdKtBVZFkFqOEEivgtrgmtcB8BEtki+MmOcQS0iMJD4XyZdwbgG1UUn/
JyQQcSLa4Bde2xkUEn/VUcxYjYYbwmhywYDIS0ApxMotFHu7NSVvtyYYUhnZXYmv
u3110UKu1va2R8gnxv9jN0PIe3yfFb5DaSQHrPcGkjeLhRM2W1ae2KYG1oBG7Yo/
7OCdRjGJTz+Hxzt1zGl1g6ROWZesy9/hnC+kWOif2QSkSEckWbM32dcCpQUh6Cmw
8GUs9yV7c8nMvNB8GWMGn8z8Mur5opt1r/FbVifgzlDZ8irFeVa5qkIVbdgCI4P1
cPui1+2Rgsocn5HbEoGNjGONtsn20YzC4EI2vWPkZVJirMB6J1HRLT8WGOLXoFB5
SnnDzLPnk7qAb5xIiAg1TaogrRk+2vwyHEf65OpAFGlYYL7Ng5019Zj4eetWO9OW
zZ7+kLSw8AWK5MTpZbLWVn6571oKenstNQM6nfOLkDH/YIXN4Q1JWIJSLG/Fcg+/
AY1hCKiJ0S3NipuHDQjCPA8es+AoIeHSLTKJPQ0I3AH36thEGIIFMiPAC1BRq+eX
ebmtn0N+ZErt3RTx/SBS80qfBJzakoU0dmdZOrT6+nQdZjhtqEU=
=uI+2
-----END PGP SIGNATURE-----
Merge 3.10.75 into android-msm-bullhead-3.10-oreo-m5
Changes in 3.10.75: (35 commits)
ALSA: hda - Add one more node in the EAPD supporting candidate list
ALSA: usb - Creative USB X-Fi Pro SB1095 volume knob support
ALSA: hda - Fix headphone pin config for Lifebook T731
selinux: fix sel_write_enforce broken return value
tcp: Fix crash in TCP Fast Open
IB/core: Avoid leakage from kernel to user space
IB/uverbs: Prevent integer overflow in ib_umem_get address arithmetic
iwlwifi: dvm: run INIT firmware again upon .start()
nbd: fix possible memory leak
mm/memory hotplug: postpone the reset of obsolete pgdat
writeback: add missing INITIAL_JIFFIES init in global_update_bandwidth()
writeback: fix possible underflow in write bandwidth calculation
radeon: Do not directly dereference pointers to BIOS area.
USB: ftdi_sio: Added custom PID for Synapse Wireless product
USB: ftdi_sio: Use jtag quirk for SNAP Connect E10
Defer processing of REQ_PREEMPT requests for blocked devices
iio: inv_mpu6050: Clear timestamps fifo while resetting hardware fifo
iio: imu: Use iio_trigger_get for indio_dev->trig assignment
dmaengine: omap-dma: Fix memory leak when terminating running transfer
cpuidle: ACPI: do not overwrite name and description of C0
usb: xhci: apply XHCI_AVOID_BEI quirk to all Intel xHCI controllers
cifs: fix use-after-free bug in find_writable_file
be2iscsi: Fix kernel panic when device initialization fails
ocfs2: _really_ sync the right range
iscsi target: fix oops when adding reject pdu
media: s5p-mfc: fix mmap support for 64bit arch
core, nfqueue, openvswitch: fix compilation warning
ipc: fix compat msgrcv with negative msgtyp
net: rds: use correct size for max unacked packets and bytes
net: llc: use correct size for sysctl timeout entries
kernel.h: define u8, s8, u32, etc. limits
IB/mlx4: Saturate RoCE port PMA counters in case of overflow
console: Fix console name size mismatch
pagemap: do not leak physical addresses to non-privileged userspace
Linux 3.10.75
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Conflicts:
fs/proc/task_mmu.c
include/linux/kernel.h
commit deb88a2a19e85842d79ba96b05031739ec327ff4 upstream.
Patch series "fix a kernel oops when reading sysfs valid_zones", v2.
A sysfs memory file is created for each 2GiB memory block on x86-64 when
the system has 64GiB or more memory. [1] When the start address of a
memory block is not backed by struct page, i.e. a memory range is not
aligned by 2GiB, reading its 'valid_zones' attribute file leads to a
kernel oops. This issue was observed on multiple x86-64 systems with
more than 64GiB of memory. This patch-set fixes this issue.
Patch 1 first fixes an issue in test_pages_in_a_zone(), which does not
test the start section.
Patch 2 then fixes the kernel oops by extending test_pages_in_a_zone()
to return valid [start, end).
Note for stable kernels: The memory block size change was made by commit
bdee237c0343 ("x86: mm: Use 2GB memory block size on large-memory x86-64
systems"), which was accepted to 3.9. However, this patch-set depends
on (and fixes) the change to test_pages_in_a_zone() made by commit
5f0f2887f4de ("mm/memory_hotplug.c: check for missing sections in
test_pages_in_a_zone()"), which was accepted to 4.4.
So, I recommend that we backport it up to 4.4.
[1] 'Commit bdee237c0343 ("x86: mm: Use 2GB memory block size on
large-memory x86-64 systems")'
This patch (of 2):
test_pages_in_a_zone() does not check 'start_pfn' when it is aligned by
section since 'sec_end_pfn' is set equal to 'pfn'. Since this function
is called for testing the range of a sysfs memory file, 'start_pfn' is
always aligned by section.
Fix it by properly setting 'sec_end_pfn' to the next section pfn.
Also make sure that this function returns 1 only when the range belongs
to a zone.
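The boundary problem described above can be illustrated with a small
user-space sketch. This is not the kernel code: the toy_* helpers and the
section size of 8 pages are assumptions made up for illustration. It only
shows why rounding up 'start_pfn' leaves an empty range when 'start_pfn'
is already section-aligned, and why rounding up 'start_pfn + 1' does not.

```c
#include <assert.h>

/* Toy value for illustration only; the real PAGES_PER_SECTION is arch-specific. */
#define TOY_PAGES_PER_SECTION 8UL

/* Hypothetical helper: round pfn up to a multiple of the toy section size. */
static unsigned long toy_section_align_up(unsigned long pfn)
{
	return (pfn + TOY_PAGES_PER_SECTION - 1) / TOY_PAGES_PER_SECTION
		* TOY_PAGES_PER_SECTION;
}

/* Section end as described for the old code: equals pfn when pfn is aligned. */
static unsigned long toy_sec_end_old(unsigned long start_pfn)
{
	return toy_section_align_up(start_pfn);
}

/* Section end as described for the fix: always the next section boundary. */
static unsigned long toy_sec_end_fixed(unsigned long start_pfn)
{
	return toy_section_align_up(start_pfn + 1);
}

/* Count the pfns a scan over [start_pfn, sec_end) would actually visit. */
static unsigned long toy_pfns_scanned(unsigned long start_pfn,
				      unsigned long sec_end)
{
	unsigned long pfn, n = 0;

	assert(sec_end >= start_pfn);
	for (pfn = start_pfn; pfn < sec_end; pfn++)
		n++;
	return n;
}
```

With an aligned start such as pfn 16, the old computation yields an empty
range [16, 16), while the fixed one yields the full first section [16, 24).
For an unaligned start both computations agree.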
Link: http://lkml.kernel.org/r/20170127222149.30893-2-toshi.kani@hpe.com
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Andrew Banman <abanman@sgi.com>
Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
commit 5f0f2887f4de9508dcf438deab28f1de8070c271 upstream.
test_pages_in_a_zone() does not account for the possibility of missing
sections in the given pfn range. pfn_valid_within always returns 1 when
CONFIG_HOLES_IN_ZONE is not set, allowing invalid pfns from missing
sections to pass the test, leading to a kernel oops.
Wrap an additional pfn loop with PAGES_PER_SECTION granularity to check
for missing sections before proceeding into the zone-check code.
This also prevents a crash from offlining memory devices with missing
sections. Despite this, it may be a good idea to keep the related patch
'[PATCH 3/3] drivers: memory: prohibit offlining of memory blocks with
missing sections' because missing sections in a memory block may lead to
other problems not covered by the scope of this fix.
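A minimal user-space sketch of the section-granularity check described
above, under stated assumptions: the memory map, the toy_* names, and the
section size of 4 pages are all hypothetical, and the per-page zone check
is reduced to one zone id per section. It is meant to show the shape of
the outer loop, not the actual kernel implementation.

```c
#include <assert.h>

#define TOY_PAGES_PER_SECTION 4UL
#define TOY_NR_SECTIONS 4UL
#define TOY_NO_ZONE (-1)

/* Hypothetical memory map: zone id per section; TOY_NO_ZONE marks a missing section. */
static const int toy_section_zone[TOY_NR_SECTIONS] = { 0, TOY_NO_ZONE, 0, 1 };

/* Stand-in for a "is this section present" test on the section holding pfn. */
static int toy_section_present(unsigned long pfn)
{
	unsigned long nr = pfn / TOY_PAGES_PER_SECTION;

	return nr < TOY_NR_SECTIONS && toy_section_zone[nr] != TOY_NO_ZONE;
}

/*
 * Return 0 if the present sections in [start_pfn, end_pfn) span more than
 * one zone, otherwise 1. Missing sections are skipped before any zone
 * lookup, so no pfn from a missing section is ever examined.
 */
static int toy_test_pages_in_a_zone(unsigned long start_pfn,
				    unsigned long end_pfn)
{
	int zone = TOY_NO_ZONE;
	unsigned long pfn;

	assert(start_pfn % TOY_PAGES_PER_SECTION == 0);
	for (pfn = start_pfn; pfn < end_pfn; pfn += TOY_PAGES_PER_SECTION) {
		int z;

		if (!toy_section_present(pfn))
			continue;	/* hole: skip the whole section */
		z = toy_section_zone[pfn / TOY_PAGES_PER_SECTION];
		if (zone != TOY_NO_ZONE && z != zone)
			return 0;	/* range crosses a zone boundary */
		zone = z;
	}
	return 1;
}
```

In this toy map, the range covering sections 0 to 2 passes despite the
hole at section 1, while extending it to section 3 fails because that
section belongs to a different zone.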
Signed-off-by: Andrew Banman <abanman@sgi.com>
Acked-by: Alex Thorlton <athorlton@sgi.com>
Cc: Russ Anderson <rja@sgi.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Greg KH <greg@kroah.com>
Cc: Seth Jennings <sjennings@variantweb.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 85bd839983778fcd0c1c043327b14a046e979b39 upstream.
Izumi found the following oops when hot re-adding a node:
BUG: unable to handle kernel paging request at ffffc90008963690
IP: __wake_up_bit+0x20/0x70
Oops: 0000 [#1] SMP
CPU: 68 PID: 1237 Comm: rs:main Q:Reg Not tainted 4.1.0-rc5 #80
Hardware name: FUJITSU PRIMEQUEST2800E/SB, BIOS PRIMEQUEST 2000 Series BIOS Version 1.87 04/28/2015
task: ffff880838df8000 ti: ffff880017b94000 task.ti: ffff880017b94000
RIP: 0010:[<ffffffff810dff80>] [<ffffffff810dff80>] __wake_up_bit+0x20/0x70
RSP: 0018:ffff880017b97be8 EFLAGS: 00010246
RAX: ffffc90008963690 RBX: 00000000003c0000 RCX: 000000000000a4c9
RDX: 0000000000000000 RSI: ffffea101bffd500 RDI: ffffc90008963648
RBP: ffff880017b97c08 R08: 0000000002000020 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8a0797c73800
R13: ffffea101bffd500 R14: 0000000000000001 R15: 00000000003c0000
FS: 00007fcc7ffff700(0000) GS:ffff880874800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffc90008963690 CR3: 0000000836761000 CR4: 00000000001407e0
Call Trace:
unlock_page+0x6d/0x70
generic_write_end+0x53/0xb0
xfs_vm_write_end+0x29/0x80 [xfs]
generic_perform_write+0x10a/0x1e0
xfs_file_buffered_aio_write+0x14d/0x3e0 [xfs]
xfs_file_write_iter+0x79/0x120 [xfs]
__vfs_write+0xd4/0x110
vfs_write+0xac/0x1c0
SyS_write+0x58/0xd0
system_call_fastpath+0x12/0x76
Code: 5d c3 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 45 f8 31 c0 48 8d 47 48 <48> 39 47 48 48 c7 45 e8 00 00 00 00 48 c7 45 f0 00 00 00 00 48
RIP [<ffffffff810dff80>] __wake_up_bit+0x20/0x70
RSP <ffff880017b97be8>
CR2: ffffc90008963690
Reproduce method (re-add a node)::
Hot-add nodeA --> remove nodeA --> hot-add nodeA (panic)
This seems to be a use-after-free problem, and the root cause is that
zone->wait_table was not set to *NULL* after being freed in
try_offline_node.
When a node is hot re-added, its pgdat is reused, and so is the zone
struct. When pages are added to the target zone, the zone is initialized
first (including the wait_table) if it is not initialized yet. Whether a
zone is initialized is judged by zone->wait_table:
    static inline bool zone_is_initialized(struct zone *zone)
    {
            return !!zone->wait_table;
    }
so if we do not set zone->wait_table to *NULL* after freeing it, the
memory hotplug routine will skip the init of the new zone when the node
is hot re-added, and the wait_table will still point to the freed memory.
We will then access the invalid address when trying to wake up the
waiters after the I/O operation on the page is done, as in the oops
above.
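A minimal user-space model of the fix, assuming a toy zone with only the wait_table field. The toy_* names are hypothetical, and calloc()/free() stand in for the kernel allocators.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for struct zone; only the field discussed above. */
struct toy_zone {
	int *wait_table;
};

/* Mirrors the check quoted above: initialized means wait_table is set. */
static bool toy_zone_is_initialized(const struct toy_zone *zone)
{
	return zone->wait_table != NULL;
}

/* Hot-add path: allocate the wait table only if the zone needs it. */
static int toy_ensure_zone_is_initialized(struct toy_zone *zone)
{
	if (toy_zone_is_initialized(zone))
		return 0;
	zone->wait_table = calloc(16, sizeof(*zone->wait_table));
	return zone->wait_table ? 0 : -1;
}

/*
 * Offline path: free the table AND clear the pointer. Without the
 * second statement the zone would still look initialized, and the next
 * hot-add would skip the allocation and reuse a dangling pointer.
 */
static void toy_offline_zone(struct toy_zone *zone)
{
	free(zone->wait_table);
	zone->wait_table = NULL;
}
```

With the pointer cleared, a hot-add after an offline allocates a fresh table instead of trusting the stale one.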
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Reported-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In recent versions, the platform specific physical
offline returns the number of bytes offlined, so
a value of 0 indicates an error, not success as in
older versions. Make sure that the memory
for the original memory resource nodes is not
freed via kfree, as this memory was obtained
from alloc_bootmem very early in the system's life.
Change-Id: Iffcdd8be4483e043d7605fce596ed438b15f3e02
Signed-off-by: Larry Bassel <lbassel@codeaurora.org>
(cherry picked from commit 2421717cb10a06814d7bdb431485aa3a5e364f36)
Add Low Power mode TAG.
Add new APIs for mem lowpower modes.
Create new sys file for mem low power modes.
Set SECTION_SIZE_BITS to 28.
Change NPA_MEMORY_NODE_NAME to "/mem/apps/ddr_dpd".
Fix NPA node create function to do atomic_inc()
in atomic_dec_and_test() failure case.
Change-Id: Ia5cb18b99338c43165d5401e619c773cd8d6b3f6
Signed-off-by: Larry Bassel <lbassel@codeaurora.org>
(cherry picked from commit b054046e708f8c5b044e76c2df6f72fd607be558)
Conflicts:
arch/arm/include/asm/setup.h
arch/arm/kernel/setup.c
arch/arm/mach-msm/include/mach/memory.h
arch/arm/mach-msm/memory.c
drivers/base/memory.c
include/linux/memory_hotplug.h
The file /sys/devices/system/memory/low_power now exists.
Writing a physical address into this file will put this
section of memory into a low power state (retaining contents)
if the architecture and platform support it.
Change-Id: I70592d37f1091a1b533f2374546ba67b50ea7d30
Signed-off-by: Larry Bassel <lbassel@codeaurora.org>
(cherry picked from commit 1f4d1c8e295aaf66b23309caa0d03b09b7009b99)
Conflicts:
drivers/base/memory.c
include/linux/memory_hotplug.h
This provides the physical memory hotremove API (this is needed since our
physical memory removal is done by powering it off, which needs
to be requested by userspace, not by someone physically pulling memory
out of a machine).
Change-Id: Ic34426a91a1aac2bd4a45677ee00c2b7a3f84746
Signed-off-by: Larry Bassel <lbassel@codeaurora.org>
(cherry picked from commit d651b6964bbb50d3c1fee6f76467a0f867286dfb)
If offlined_pages is greater than
zone->present_pages, underflow will occur.
This change will set zone->present_pages to 0 if
offlined_pages is greater.
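The clamping described above amounts to a saturating subtraction on an unsigned counter. A small sketch, with the hypothetical helper name sub_pages_saturating():

```c
/*
 * Subtract offlined pages from an unsigned page counter, clamping at
 * zero instead of letting the value wrap around to a huge number.
 */
static unsigned long sub_pages_saturating(unsigned long present_pages,
					  unsigned long offlined_pages)
{
	if (offlined_pages > present_pages)
		return 0;
	return present_pages - offlined_pages;
}
```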
Change-Id: I728e90c60fb7fc391de7b9c4828ab264ca38653b
Signed-off-by: Jack Cheung <jackc@codeaurora.org>
(cherry picked from commit 80c201e25e8dbc00427b73d90b1527c356526442)
When onlining, the onlined pages must be added to the kernel's
list of free pages using __free_page(). However, pages are not
immediately added but placed in a queue to be processed
when the queue size reaches a watermark. The last pages in
the queue may not be processed in time, and if you try to
offline that memory before it is processed, offlining will
always fail.
This fix calls drain_all_pages(), which will process every
free page in the queue. This ensures that all pages are
accounted for when onlining and nothing gets stuck in the queue.
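The queueing behaviour can be pictured with a toy model, assuming a single queue and a fixed batch size. TOY_BATCH and the toy_* names are made up for illustration, and the real per-cpu page lists are more involved.

```c
/* Hypothetical batch size at which queued pages are flushed. */
#define TOY_BATCH 4U

struct toy_pcp {
	unsigned int queued;	/* pages waiting in the queue */
	unsigned int freed;	/* pages already handed to the free lists */
};

/* Queue one page; flush the queue only once the batch size is reached. */
static void toy_free_page(struct toy_pcp *pcp)
{
	if (++pcp->queued >= TOY_BATCH) {
		pcp->freed += pcp->queued;
		pcp->queued = 0;
	}
}

/* Explicit drain: flush whatever is still sitting in the queue. */
static void toy_drain_pages(struct toy_pcp *pcp)
{
	pcp->freed += pcp->queued;
	pcp->queued = 0;
}
```

After six calls to toy_free_page() two pages remain queued; only the explicit drain makes all six visible as free.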
Change-Id: I54dbc0749556702407090e51ce9246abc5db7d1c
Signed-off-by: Jack Cheung <jackc@codeaurora.org>
(cherry picked from commit aa7e9dec5cfd309cb9eb6cb56a284a61607a925a)
This patch prevents memory hotplug from marking pages of the memmap that
only reference holes in the physical address space as private. Some
architectures (including ARM) attempt to free these unneeded parts of the
memmap, and attempting to free a private page will throw bad_page warnings
and tie up the memory indefinitely.
This patch also allows early_pfn_valid to be architecture specific and
defines it for ARM. The definition for ARM takes into account memory banks
and the holes in physical memory.
CRs-Fixed: 247010
Change-Id: Iad88d427b1b923a808b026c22d2899fa0483cb9e
Signed-off-by: jesset@codeaurora.org
(cherry picked from commit 0b610c773ad6281a3d217fbbe894b2476e9e71dd)
Conflicts:
arch/arm/mm/init.c
Vmalloc will exit if the amount it needs to allocate is
greater than totalram_pages. Vmalloc cannot allocate
from the movable zone, so pages in the movable zone should
not be counted.
This change adds a new global variable: total_unmovable_pages.
It is calculated in init.c, based on totalram_pages minus
the pages in the movable zone. Vmalloc now looks at this new
global instead of totalram_pages.
total_unmovable_pages can be modified during memory_hotplug.
If the zone you are offlining/onlining is unmovable, then
you modify it similar to totalram_pages. If the zone is
movable, then no change is needed.
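The accounting rule can be sketched as below. The struct and the toy_* functions are hypothetical; only the two counter names come from the description above.

```c
#include <stdbool.h>

struct toy_page_counters {
	unsigned long totalram_pages;
	unsigned long total_unmovable_pages;
};

/* Onlining: every page counts as RAM, only unmovable-zone pages count twice. */
static void toy_online_pages(struct toy_page_counters *c,
			     unsigned long nr_pages, bool zone_is_movable)
{
	c->totalram_pages += nr_pages;
	if (!zone_is_movable)
		c->total_unmovable_pages += nr_pages;
}

/* Offlining mirrors onlining. */
static void toy_offline_pages(struct toy_page_counters *c,
			      unsigned long nr_pages, bool zone_is_movable)
{
	c->totalram_pages -= nr_pages;
	if (!zone_is_movable)
		c->total_unmovable_pages -= nr_pages;
}

/* The vmalloc-style sanity check now compares against unmovable pages. */
static bool toy_vmalloc_size_ok(const struct toy_page_counters *c,
				unsigned long nr_pages)
{
	return nr_pages <= c->total_unmovable_pages;
}
```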
Change-Id: Ie55c41051e9ad4b921eb04ecbb4798a8bd2344d6
Signed-off-by: Jack Cheung <jackc@codeaurora.org>
(cherry picked from commit 59f9f1c9ae463a3d4499cd9353619f8b1993371b)
Conflicts:
arch/arm/mm/init.c
mm/memory_hotplug.c
mm/page_alloc.c
mm/vmalloc.c
Fix printk format warnings in mm/memory_hotplug.c by using "%pa":
mm/memory_hotplug.c: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 2 has type 'resource_size_t' [-Wformat]
mm/memory_hotplug.c: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 3 has type 'resource_size_t' [-Wformat]
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
PFN_PHYS() is a phys_addr_t, which can be u32 or u64.
Fix the build warning when phys_addr_t is u32.
mm/memory_hotplug.c: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 2 has type 'unsigned int' [-Wformat]: => 1685:3
mm/memory_hotplug.c: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 3 has type 'unsigned int' [-Wformat]: => 1685:3
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
__remove_pages() is only necessary for CONFIG_MEMORY_HOTREMOVE. PowerPC
pseries will return -EOPNOTSUPP if unsupported.
Adding an #ifdef causes several other functions it depends on to also
become unnecessary, which saves in .text when disabled (it's disabled in
most defconfigs besides powerpc, including x86). remove_memory_block()
becomes static since it is not referenced outside of
drivers/base/memory.c.
Build tested on x86 and powerpc with CONFIG_MEMORY_HOTREMOVE both enabled
and disabled.
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Toshi Kani <toshi.kani@hp.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change __remove_pages() to call release_mem_region_adjustable(). This
allows a requested memory range to be released from the iomem_resource
table even if it does not exactly match a resource entry but still fits
within one. The resource entries initialized at bootup usually cover
whole contiguous memory ranges and may not match the size of memory
hot-delete requests.
If release_mem_region_adjustable() fails, __remove_pages() emits a
warning message and continues, as was the case with
release_mem_region(). release_mem_region(), which is defined as
__release_region(), emits a warning message and returns no error since
it is a void function.
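The idea of releasing a range that merely fits inside an entry can be illustrated with a toy interval split. This is not the kernel implementation: struct toy_range and toy_release_adjustable() are hypothetical, and the real function works on the resource tree under a lock.

```c
/* Inclusive address range, standing in for one resource entry. */
struct toy_range {
	unsigned long start;
	unsigned long end;
};

/*
 * Remove [start, start + size - 1] from *res.
 * Returns the number of ranges left (0, 1 or 2), or -1 if the request
 * does not fit inside *res. When 2 is returned, the upper remainder is
 * stored in *tail and *res keeps the lower remainder.
 */
static int toy_release_adjustable(struct toy_range *res,
				  struct toy_range *tail,
				  unsigned long start, unsigned long size)
{
	unsigned long end;

	if (size == 0)
		return -1;
	end = start + size - 1;
	if (end < start || start < res->start || end > res->end)
		return -1;

	if (start == res->start && end == res->end)
		return 0;			/* exact match: entry goes away */
	if (start == res->start) {
		res->start = end + 1;		/* trim the front */
		return 1;
	}
	if (end == res->end) {
		res->end = start - 1;		/* trim the back */
		return 1;
	}
	tail->start = end + 1;			/* hole in the middle: split */
	tail->end = res->end;
	res->end = start - 1;
	return 2;
}
```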
Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: T Makphaibulchoke <tmac@hp.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix a typo "end_pft" in the comment of walk_memory_range().
Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
zone->wait_table may be allocated from bootmem, in which case it cannot be freed.
Signed-off-by: Jianguo Wu <wujianguo@huawei.com>
Reviewed-by: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
remove_memory() calls walk_memory_range() with [start_pfn, end_pfn), where
end_pfn is exclusive in this range. Therefore, end_pfn needs to be set to
the next page of the end address.
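As a worked example of the exclusive end, assuming 4 KiB pages: the last byte of the range is start + size - 1, and the exclusive end pfn is one past the page holding that byte. The TOY_* macros and function names below are hypothetical.

```c
/* Assumed page size of 4 KiB for the example. */
#define TOY_PAGE_SHIFT 12

/* Page frame number containing the given byte address. */
static unsigned long long toy_pfn_down(unsigned long long addr)
{
	return addr >> TOY_PAGE_SHIFT;
}

/*
 * Exclusive end pfn for the byte range [start, start + size), size > 0:
 * the pfn of the last byte, plus one.
 */
static unsigned long long toy_end_pfn(unsigned long long start,
				      unsigned long long size)
{
	return toy_pfn_down(start + size - 1) + 1;
}
```

For start = 0x1000 and size = 0x2000 the last byte is 0x2fff, which lies in pfn 2, so the walk covers [1, 3).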
Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Wu Jianguo <wujianguo@huawei.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ensure_zone_is_initialized() checks if a zone is in an empty & not
initialized state (typically occurring after it is created in memory
hotplugging), and, if so, calls init_currently_empty_zone() to
initialize the zone.
Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
Cc: David Hansen <dave@linux.vnet.ibm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add 2 helpers (zone_end_pfn() and zone_spans_pfn()) to reduce code
duplication.
This also switches to using them in compaction (where an additional
variable needed to be renamed), page_alloc, vmstat, memory_hotplug, and
kmemleak.
Note that in compaction.c I avoid calling zone_end_pfn() repeatedly
because I expect at some point the synchronization issues with start_pfn &
spanned_pages will need fixing, either by actually using the seqlock or
clever memory barrier usage.
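The two helpers boil down to the following, shown here on a toy struct that carries only the two fields involved. The toy_ prefix marks this as a user-space sketch rather than the kernel header.

```c
#include <stdbool.h>

/* Toy stand-in for struct zone with just the span fields. */
struct toy_zone_span {
	unsigned long zone_start_pfn;
	unsigned long spanned_pages;
};

/* First pfn past the end of the zone (exclusive end). */
static unsigned long toy_zone_end_pfn(const struct toy_zone_span *zone)
{
	return zone->zone_start_pfn + zone->spanned_pages;
}

/* True if pfn lies within [zone_start_pfn, zone_end_pfn). */
static bool toy_zone_spans_pfn(const struct toy_zone_span *zone,
			       unsigned long pfn)
{
	return zone->zone_start_pfn <= pfn && pfn < toy_zone_end_pfn(zone);
}
```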
Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
Cc: David Hansen <dave@linux.vnet.ibm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
No functional change, but the only purpose of the offlining argument to
migrate_pages() etc, was to ensure that __unmap_and_move() could migrate a
KSM page for memory hotremove (which took ksm_thread_mutex) but not for
other callers. Now all cases are safe, remove the arg.
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Petr Holasek <pholasek@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Izik Eidus <izik.eidus@ravellosystems.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Function put_page_bootmem() is used to free pages allocated by bootmem
allocator, so it should increase totalram_pages when freeing pages into
the buddy system.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Maciej Rutecki <maciej.rutecki@gmail.com>
Cc: Chris Clayton <chris2553@googlemail.com>
Cc: "Rafael J . Wysocki" <rjw@sisk.pl>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Minchan Kim <minchan@kernel.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Jianguo Wu <wujianguo@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When the node is offlined, there is no memory/cpu on the node. If a
sleep task runs on a cpu of this node, it will be migrated to the cpu on
the other node. So we can clear cpu-to-node mapping.
[akpm@linux-foundation.org: numa_clear_node() and numa_set_node() can no longer be __cpuinit]
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Jiang Liu <liuj97@gmail.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
try_offline_node() will be needed in the tristate
drivers/acpi/processor_driver.c.
The node will be offlined when all memory/cpu on the node have been
hotremoved. So we need the function try_offline_node() in cpu-hotplug
path.
If the memory-hotplug is disabled, and cpu-hotplug is enabled:
1. no memory on the node
we don't online the node, and cpu's node is the nearest node.
2. the node contains some memory
the node has been onlined, and cpu's node is still needed
to migrate the sleep task on the cpu to the same node.
So we do nothing in try_offline_node() in this case.
[rientjes@google.com: export the function try_offline_node() fix]
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Jiang Liu <liuj97@gmail.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since there is no way to guarantee that the address of pgdat/zone is not
on the stack of any kernel thread or used by other kernel objects without
reference counting or another synchronization method, we cannot reset
node_data and free pgdat when offlining a node. Just reset pgdat to 0
and reuse the memory when the node is online again.
The problem is suggested by Kamezawa Hiroyuki. The idea is from Wen
Congyang.
NOTE: If we don't reset pgdat to 0, the WARN_ON in free_area_init_node()
will be triggered.
[akpm@linux-foundation.org: fix warning when CONFIG_NEED_MULTIPLE_NODES=n]
[akpm@linux-foundation.org: fix the warning again again]
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Reviewed-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Wu Jianguo <wujianguo@huawei.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We call hotadd_new_pgdat() to allocate memory to store node_data. So we
should free it when removing a node.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Reviewed-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Wu Jianguo <wujianguo@huawei.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Introduce a new function try_offline_node() to remove sysfs file of node
when all memory sections of this node are removed. If some memory
sections of this node are not removed, this function does nothing.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Wu Jianguo <wujianguo@huawei.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When memory is added, we update zone's and pgdat's start_pfn and
spanned_pages in __add_zone(). So we should revert them when the memory
is removed.
The patch adds a new function __remove_zone() to do this.
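A deliberately simplified sketch of shrinking a span when pages are removed at either edge. It assumes the removed range is contiguous and ignores holes inside the span, which the real code has to handle; struct toy_span and toy_shrink_span() are hypothetical names.

```c
struct toy_span {
	unsigned long start_pfn;
	unsigned long spanned_pages;
};

/* Shrink *s after removing [start_pfn, start_pfn + nr_pages). */
static void toy_shrink_span(struct toy_span *s, unsigned long start_pfn,
			    unsigned long nr_pages)
{
	unsigned long span_end = s->start_pfn + s->spanned_pages;
	unsigned long end_pfn = start_pfn + nr_pages;

	if (start_pfn <= s->start_pfn && end_pfn >= span_end) {
		/* The whole span is gone. */
		s->start_pfn = 0;
		s->spanned_pages = 0;
	} else if (start_pfn <= s->start_pfn && end_pfn > s->start_pfn) {
		/* Removed at the bottom: move the start up. */
		s->start_pfn = end_pfn;
		s->spanned_pages = span_end - end_pfn;
	} else if (end_pfn >= span_end && start_pfn < span_end) {
		/* Removed at the top: cut the length. */
		s->spanned_pages = start_pfn - s->start_pfn;
	}
	/* A removal strictly inside the span leaves it unchanged. */
}
```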
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Wu Jianguo <wujianguo@huawei.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently __remove_section for SPARSEMEM_VMEMMAP does nothing. But even
if we use SPARSEMEM_VMEMMAP, we can unregister the memory_section.
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Wu Jianguo <wujianguo@huawei.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In __remove_section(), we locked pgdat_resize_lock when calling
sparse_remove_one_section(). This lock disables IRQs. But we don't
need to lock the whole function. If we do some work to free pagetables
in free_section_usemap(), we need to call flush_tlb_all(), which needs
IRQs enabled. Otherwise the WARN_ON_ONCE() in smp_call_function_many()
will be triggered.
If we lock the whole sparse_remove_one_section(), then we come to this call trace:
------------[ cut here ]------------
WARNING: at kernel/smp.c:461 smp_call_function_many+0xbd/0x260()
Hardware name: PRIMEQUEST 1800E
......
Call Trace:
smp_call_function_many+0xbd/0x260
smp_call_function+0x3b/0x50
on_each_cpu+0x3b/0xc0
flush_tlb_all+0x1c/0x20
remove_pagetable+0x14e/0x1d0
vmemmap_free+0x18/0x20
sparse_remove_one_section+0xf7/0x100
__remove_section+0xa2/0xb0
__remove_pages+0xa0/0xd0
arch_remove_memory+0x6b/0xc0
remove_memory+0xb8/0xf0
acpi_memory_device_remove+0x53/0x96
acpi_device_remove+0x90/0xb2
__device_release_driver+0x7c/0xf0
device_release_driver+0x2f/0x50
acpi_bus_remove+0x32/0x6d
acpi_bus_trim+0x91/0x102
acpi_bus_hot_remove_device+0x88/0x16b
acpi_os_execute_deferred+0x27/0x34
process_one_work+0x20e/0x5c0
worker_thread+0x12e/0x370
kthread+0xee/0x100
ret_from_fork+0x7c/0xb0
---[ end trace 25e85300f542aa01 ]---
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Wu Jianguo <wujianguo@huawei.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
To remove a memmap region of sparse-vmemmap that was allocated from
bootmem, the region needs to be registered by get_page_bootmem(). So the
patch walks the pages of the virtual mapping and registers them with
get_page_bootmem().
NOTE: register_page_bootmem_memmap() is not implemented for ia64,
ppc, s390, and sparc. So introduce CONFIG_HAVE_BOOTMEM_INFO_NODE
and revert register_page_bootmem_info_node() when platform doesn't
support it.
It's implemented by adding a new Kconfig option named
CONFIG_HAVE_BOOTMEM_INFO_NODE, which will be automatically selected
by memory-hotplug feature fully supported archs(currently only on
x86_64).
Since we have two config options, MEMORY_HOTPLUG and MEMORY_HOTREMOVE,
used for memory hot-add and hot-remove separately, and the code in
register_page_bootmem_info_node() is only used for collecting
information for hot-remove, place it under MEMORY_HOTREMOVE.
page_isolation.c, selected by MEMORY_ISOLATION under MEMORY_HOTPLUG, is
a similar case, so move it too.
[mhocko@suse.cz: put register_page_bootmem_memmap inside CONFIG_MEMORY_HOTPLUG_SPARSE]
[linfeng@cn.fujitsu.com: introduce CONFIG_HAVE_BOOTMEM_INFO_NODE and revert register_page_bootmem_info_node()]
[mhocko@suse.cz: remove the arch specific functions without any implementation]
[linfeng@cn.fujitsu.com: mm/Kconfig: move auto selects from MEMORY_HOTPLUG to MEMORY_HOTREMOVE as needed]
[rientjes@google.com: fix defined but not used warning]
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Reviewed-by: Wu Jianguo <wujianguo@huawei.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Lin Feng <linfeng@cn.fujitsu.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
For removing memory, we need to remove page tables. But it depends on
architecture. So the patch introduces arch_remove_memory() for removing
page tables. For now it only calls __remove_pages().
Note: __remove_pages() is not implemented for some architectures
(I don't know how to implement it for s390).
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Wu Jianguo <wujianguo@huawei.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When (hot)adding memory into system, /sys/firmware/memmap/X/{end, start,
type} sysfs files are created. But there is no code to remove these
files. This patch implements the function to remove them.
We cannot free firmware_map_entry which is allocated by bootmem because
there is no way to do so when the system is up. But we can at least
remember the address of that memory and reuse the storage when the
memory is added next time.
This patch also introduces a new list map_entries_bootmem to link the
map entries allocated by bootmem when they are removed, and a lock to
protect it. And these entries will be reused when the memory is
hot-added again.
The idea was suggested by Andrew Morton.
NOTE: It is unsafe to return an entry pointer and release the
map_entries_lock. So we should not hold the map_entries_lock
separately in firmware_map_find_entry() and
firmware_map_remove_entry(). Hold the map_entries_lock across find
and remove /sys/firmware/memmap/X operation.
And also, users of these two functions need to be careful to
hold the lock when using these two functions.
[tangchen@cn.fujitsu.com: Hold spinlock across find|remove /sys operation]
[tangchen@cn.fujitsu.com: fix the wrong comments of map_entries]
[tangchen@cn.fujitsu.com: reuse the storage of /sys/firmware/memmap/X/ allocated by bootmem]
[tangchen@cn.fujitsu.com: fix section mismatch problem]
[tangchen@cn.fujitsu.com: fix the doc format in drivers/firmware/memmap.c]
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Reviewed-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Julian Calaby <julian.calaby@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We remove the memory like this:
1. lock memory hotplug
2. offline a memory block
3. unlock memory hotplug
4. repeat 1-3 to offline all memory blocks
5. lock memory hotplug
6. remove memory(TODO)
7. unlock memory hotplug
All memory blocks must be offlined before removing memory. But we don't
hold the lock across the whole operation. So we should check whether all
memory blocks are offlined before step 6. Otherwise, the kernel may
panic.
Offlining a memory block and removing a memory device can be two
different operations. Users can just offline some memory blocks without
removing the memory device. For this purpose, the kernel already takes
lock_memory_hotplug() in __offline_pages(). To reuse the code for
memory hot-remove, we repeat steps 1-3 to offline all the memory blocks,
repeatedly locking and unlocking memory hotplug, but not holding the
memory hotplug lock across the whole operation.
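The check before step 6 can be sketched as a re-validation done once the lock is taken again. The enum and the toy_* functions are hypothetical, and the locking itself is only indicated by comments.

```c
#include <stdbool.h>
#include <stddef.h>

enum toy_block_state { TOY_BLOCK_ONLINE, TOY_BLOCK_OFFLINE };

/* True only if every block of the device is offline. */
static bool toy_all_blocks_offline(const enum toy_block_state *blocks,
				   size_t nr_blocks)
{
	size_t i;

	for (i = 0; i < nr_blocks; i++)
		if (blocks[i] != TOY_BLOCK_OFFLINE)
			return false;
	return true;
}

/*
 * Steps 5-7: returns 0 if removal may proceed, -1 otherwise.
 * Because the lock was dropped between the per-block offline steps,
 * a block may have been onlined again in the meantime, so the state
 * is checked again here instead of being trusted.
 */
static int toy_try_remove_device(const enum toy_block_state *blocks,
				 size_t nr_blocks)
{
	/* lock memory hotplug (step 5) would be taken here */
	if (!toy_all_blocks_offline(blocks, nr_blocks))
		return -1;	/* unlock and bail out */
	/* the actual removal (step 6) and unlock (step 7) would follow */
	return 0;
}
```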
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Wu Jianguo <wujianguo@huawei.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Memory can't be offlined when CONFIG_MEMCG is selected. For example:
there is a memory device on node 1. The address range is [1G, 1.5G).
You will find 4 new directories memory8, memory9, memory10, and memory11
under the directory /sys/devices/system/memory/.
If CONFIG_MEMCG is selected, we allocate memory to store the page
cgroup when we online pages. When we online memory8, the memory that
stores the page cgroup is not provided by this memory device. But when
we online memory9, the memory that stores the page cgroup may be provided
by memory8. So we can't offline memory8 now. We should offline the
memory in the reverse order.
When the memory device is hot-removed, we automatically offline the
memory provided by this memory device. But we don't know which memory
was onlined first, so offlining memory may fail. In such a case, iterate
twice to offline the memory. 1st iteration: offline every non-primary
memory block. 2nd iteration: offline the primary (i.e. first added)
memory block.
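The two-pass order can be modelled as below, under the simplifying assumption that only the primary block holds metadata for the others and therefore refuses to go offline while any other block is online. All toy_* names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_block {
	bool online;
	bool primary;	/* first-added block; holds metadata for the rest */
};

/* Offlining the primary fails while any other block is still online. */
static int toy_offline_block(struct toy_block *blocks, size_t n, size_t idx)
{
	size_t i;

	if (blocks[idx].primary)
		for (i = 0; i < n; i++)
			if (i != idx && blocks[i].online)
				return -1;
	blocks[idx].online = false;
	return 0;
}

/* Pass 1 offlines the non-primary blocks, pass 2 the primary one. */
static int toy_offline_device(struct toy_block *blocks, size_t n)
{
	size_t i;
	int ret = 0;

	for (i = 0; i < n; i++)
		if (!blocks[i].primary && toy_offline_block(blocks, n, i))
			ret = -1;
	for (i = 0; i < n; i++)
		if (blocks[i].primary && toy_offline_block(blocks, n, i))
			ret = -1;
	return ret;
}
```

Trying the primary block first fails, while the two-pass walk succeeds regardless of where the primary block sits in the array.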
This idea is suggested by KOSAKI Motohiro.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Wu Jianguo <wujianguo@huawei.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge tag 'balancenuma-v11' of git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma
Pull Automatic NUMA Balancing bare-bones from Mel Gorman:
"There are three implementations for NUMA balancing: this tree
(balancenuma), numacore which has been developed in tip/master and
autonuma which is in aa.git.
In almost all respects balancenuma is the dumbest of the three because
its main impact is on the VM side with no attempt to be smart about
scheduling. In the interest of getting the ball rolling, it would be
desirable to see this much merged for 3.8 with the view to building
scheduler smarts on top and adapting the VM where required for 3.9.
The most recent set of comparisons available from different people are
mel: https://lkml.org/lkml/2012/12/9/108
mingo: https://lkml.org/lkml/2012/12/7/331
tglx: https://lkml.org/lkml/2012/12/10/437
srikar: https://lkml.org/lkml/2012/12/10/397
The results are a mixed bag. In my own tests, balancenuma does
reasonably well. It's dumb as rocks and does not regress against
mainline. On the other hand, Ingo's tests show that balancenuma is
incapable of converging for the workloads driven by perf which is bad
but is potentially explained by the lack of scheduler smarts. Thomas'
results show balancenuma improves on mainline but falls far short of
numacore or autonuma. Srikar's results indicate we all suffer on a
large machine with imbalanced node sizes.
My own testing showed that recent numacore results have improved
dramatically, particularly in the last week but not universally.
We've butted heads heavily on system CPU usage and high levels of
migration even when it shows that overall performance is better.
There are also cases where it regresses. Of interest is that for
specjbb in some configurations it will regress for lower numbers of
warehouses and show gains for higher numbers which is not reported by
the tool by default and sometimes missed in reports. Recently I
reported for numacore that the JVM was crashing with
NullPointerExceptions but currently it's unclear what the source of
this problem is. Initially I thought it was in how numacore batch
handles PTEs but I no longer think this is the case. It's possible
numacore is just able to trigger it due to higher rates of migration.
These reports were quite late in the cycle so I/we would like to start
with this tree as it contains much of the code we can agree on and has
not changed significantly over the last 2-3 weeks."
* tag 'balancenuma-v11' of git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma: (50 commits)
mm/rmap, migration: Make rmap_walk_anon() and try_to_unmap_anon() more scalable
mm/rmap: Convert the struct anon_vma::mutex to an rwsem
mm: migrate: Account a transhuge page properly when rate limiting
mm: numa: Account for failed allocations and isolations as migration failures
mm: numa: Add THP migration for the NUMA working set scanning fault case build fix
mm: numa: Add THP migration for the NUMA working set scanning fault case.
mm: sched: numa: Delay PTE scanning until a task is scheduled on a new node
mm: sched: numa: Control enabling and disabling of NUMA balancing if !SCHED_DEBUG
mm: sched: numa: Control enabling and disabling of NUMA balancing
mm: sched: Adapt the scanning rate if a NUMA hinting fault does not migrate
mm: numa: Use a two-stage filter to restrict pages being migrated for unlikely task<->node relationships
mm: numa: migrate: Set last_nid on newly allocated page
mm: numa: split_huge_page: Transfer last_nid on tail page
mm: numa: Introduce last_nid to the page frame
sched: numa: Slowly increase the scanning period as NUMA faults are handled
mm: numa: Rate limit setting of pte_numa if node is saturated
mm: numa: Rate limit the amount of memory that is migrated between nodes
mm: numa: Structures for Migrate On Fault per NUMA migration rate limiting
mm: numa: Migrate pages handled during a pmd_numa hinting fault
mm: numa: Migrate on reference policy
...
Merge misc VM changes from Andrew Morton:
"The rest of most-of-MM. The other MM bits await a slab merge.
This patch includes the addition of a huge zero_page. Not a
performance boost but it can save large amounts of physical memory in
some situations.
Also a bunch of Fujitsu engineers are working on memory hotplug.
Which, as it turns out, was badly broken. About half of their patches
are included here; the remainder are 3.8 material."
However, this merge disables CONFIG_MOVABLE_NODE, which was totally
broken. We don't add new features with "default y", nor do we add
Kconfig questions that are incomprehensible to most people without any
help text. Does the feature even make sense without compaction or
memory hotplug?
* akpm: (54 commits)
mm/bootmem.c: remove unused wrapper function reserve_bootmem_generic()
mm/memory.c: remove unused code from do_wp_page()
asm-generic, mm: pgtable: consolidate zero page helpers
mm/hugetlb.c: fix warning on freeing hwpoisoned hugepage
hwpoison, hugetlbfs: fix RSS-counter warning
hwpoison, hugetlbfs: fix "bad pmd" warning in unmapping hwpoisoned hugepage
mm: protect against concurrent vma expansion
memcg: do not check for mm in __mem_cgroup_count_vm_event
tmpfs: support SEEK_DATA and SEEK_HOLE (reprise)
mm: provide more accurate estimation of pages occupied by memmap
fs/buffer.c: remove redundant initialization in alloc_page_buffers()
fs/buffer.c: do not inline exported function
writeback: fix a typo in comment
mm: introduce new field "managed_pages" to struct zone
mm, oom: remove statically defined arch functions of same name
mm, oom: remove redundant sleep in pagefault oom handler
mm, oom: cleanup pagefault oom handler
memory_hotplug: allow online/offline memory to result movable node
numa: add CONFIG_MOVABLE_NODE for movable-dedicated node
mm, memcg: avoid unnecessary function call when memcg is disabled
...
Pull trivial branch from Jiri Kosina:
"Usual stuff -- comment/printk typo fixes, documentation updates, dead
code elimination."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
HOWTO: fix double words typo
x86 mtrr: fix comment typo in mtrr_bp_init
propagate name change to comments in kernel source
doc: Update the name of profiling based on sysfs
treewide: Fix typos in various drivers
treewide: Fix typos in various Kconfig
wireless: mwifiex: Fix typo in wireless/mwifiex driver
messages: i2o: Fix typo in messages/i2o
scripts/kernel-doc: check that non-void fcts describe their return value
Kernel-doc: Convention: Use a "Return" section to describe return values
radeon: Fix typo and copy/paste error in comments
doc: Remove unnecessary declarations from Documentation/accounting/getdelays.c
various: Fix spelling of "asynchronous" in comments.
Fix misspellings of "whether" in comments.
eisa: Fix spelling of "asynchronous".
various: Fix spelling of "registered" in comments.
doc: fix quite a few typos within Documentation
target: iscsi: fix comment typos in target/iscsi drivers
treewide: fix typo of "suport" in various comments and Kconfig
treewide: fix typo of "suppport" in various comments
...
Currently a zone's present_pages is calculated as below, which is
inaccurate and may cause trouble for memory hotplug.
spanned_pages - absent_pages - memmap_pages - dma_reserve.
While fixing bugs caused by inaccurate zone->present_pages, we found
zone->present_pages has been abused. The field zone->present_pages may
have different meanings in different contexts:
1) pages existing in a zone.
2) pages managed by the buddy system.
For more discussions about the issue, please refer to:
http://lkml.org/lkml/2012/11/5/866
https://patchwork.kernel.org/patch/1346751/
This patchset introduces a new field named "managed_pages" to
struct zone, which counts "pages managed by the buddy system", and reverts
zone->present_pages to count "physical pages existing in a zone", which
also keeps it consistent with pgdat->node_present_pages.
We will set an initial value for zone->managed_pages in function
free_area_init_core() and will adjust it later if the initial value is
inaccurate.
For DMA/normal zones, the initial value is set to:
(spanned_pages - absent_pages - memmap_pages - dma_reserve)
Later zone->managed_pages will be adjusted to the accurate value when the
bootmem allocator frees all free pages to the buddy system in the functions
free_all_bootmem_node() and free_all_bootmem().
The bootmem allocator doesn't touch highmem pages, so highmem zones'
managed_pages is set to the accurate value "spanned_pages - absent_pages"
in function free_area_init_core() and won't be updated anymore.
This patch also adds a new field "managed_pages" to /proc/zoneinfo
and sysrq showmem.
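The initial-value rules above can be written out as a short arithmetic sketch. It is a user-space model under stated assumptions, not the code in free_area_init_core(): the function name initial_managed_pages and the page counts in the usage lines are hypothetical, and the later bootmem adjustment for DMA/normal zones is not modeled.

```python
def initial_managed_pages(spanned_pages, absent_pages,
                          memmap_pages=0, dma_reserve=0, highmem=False):
    """Initial zone->managed_pages estimate as described in the text above."""
    # Pages that physically exist inside the zone's span (holes removed).
    existing = spanned_pages - absent_pages
    if highmem:
        # Bootmem never touches highmem, so this value is already accurate.
        return existing
    # DMA/normal zones: also subtract the memmap and the DMA reserve.
    # The description says this estimate is corrected later by bootmem.
    return existing - memmap_pages - dma_reserve


# Hypothetical page counts, chosen only to show the arithmetic.
normal = initial_managed_pages(262144, 1024, memmap_pages=3584)
high = initial_managed_pages(131072, 0, highmem=True)
```

With these made-up numbers the normal zone starts at 262144 - 1024 - 3584 - 0 = 257536 managed pages, and the highmem zone keeps all 131072 pages of its span.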
[akpm@linux-foundation.org: small comment tweaks]
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Maciej Rutecki <maciej.rutecki@gmail.com>
Tested-by: Chris Clayton <chris2553@googlemail.com>
Cc: "Rafael J . Wysocki" <rjw@sisk.pl>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Minchan Kim <minchan@kernel.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Memory management can now handle movable nodes, i.e. nodes which don't have
any normal memory, so we can dynamically configure and add a movable node by:
online ZONE_MOVABLE memory from a previously offline node
offline the last normal memory, which results in a non-normal-memory node
Movable nodes are very important for power saving, hardware partitioning and
high-availability systems (hardware fault management).
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Tested-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>