The trace event headers are required to include tracepoint.h. The only reason
they worked now is because module.h included tracepoint.h, and that will soon
change.
Link: http://lkml.kernel.org/r/20140226190644.442886305@goodmis.org
Fixes: 455b286468 "writeback: Initial tracing support"
Cc: Dave Chinner <dchinner@redhat.com>
Cc: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJZQspmAAoJEE44bZycYXAvLXMP/3Uqx7K7dGjHvvhGA4DhnzSp
bGLpjeP1sXXnnd932PN+qkGbl2j/NPjS74DobDqGWnrwxKRzQ21F4YkWJGtb4Pe2
JKcY7y2rbKGcwhpS9qDMkSWuaUKJWF5MAsH08LnCWqlGphGwAH/uPTdqS4iI/CJM
aQvaaITe5SVzvpvpyoCVdHqu8K+Ukraf91mvt7hlmrn9OnqO9us9MWulw5sSXQcd
pM8ZbRkBDE5OFeVnPKJDBY+cR2ML41wekMMwvJWt7uRyrX2i5c7oQVXYoeYE4MKx
Pueb7aG7LQwBUzNJCiZA6PAEFQPwNPCoxHZbAax0D6/JyDWOZukappquzjd6gLDM
+U7mxeFTeNZJ5v9tUcUIOb4GaaFcccS3wdDP23V2N8iM88hFVwJn0RSy/pksX37+
ZNDiEyDeJBjz3kh/Kf40zhFIIrABMozFeX3tpSRVVqXb+T6P9l8Y88O2LGY5FCXK
QBbAC+jC4X4YI+4v+QWImg9mkfTwzZyjyAlfyjPlHVSK9KDP9M6LXpr2+jKS7jOc
ievMOh9ku0HIVuSWGUKZSqjvcF01Bh99tFlX+KqipomwNTwa4hKCLmnOVflF1BPE
8sfD9hvenA0e949kXrURUmqpg6Ujkrbb/lXuD7e2CakCu+XjEMf317R11TyTsHNG
10hsmPsGDVcwbyFOFHS3
=mvzl
-----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEJDfLduVEy2qz2d/TmXOSYMtstxYFAlpqfEUACgkQmXOSYMts
txbJOQ/+Pce1eBSgjESWKuz0OP9BfAe9RpWFi7lBZ/EgRwJVYEx6jau9EYXAQ7YT
roCIsV6eufhMplYGHJz6EHxK2Hieb1zG9ooX9ss9GxiB6qmqeqC0Slm9EQE15yGT
px3fVz9r86edqjtj7UKK0/n8DJUaFh5LWOymLD3d3/115RYQsl/GowugH9F79PvN
pR+OyXq7srtfCmwdhZ65012Ef10RXqBRv0fCYBH6r+jkMqb7uSDFzdR39Z7k3QFk
AM4+3lTm6EEZ4xZkcMyX3GuQWslpPAlvFdEx43TjdCbseXAqURoppmxvz+Izum75
fy0oOdKl5OSpyZArRkUfZ0MnL6BHGcKxwYV4u1LupwvqPyaUT4yiT5VEUdy9EqJo
Syrr0oSR2lrXqQESdxKkmOZVXyul0nF3Fh1p5QlU1/Id9oskMLYqcXegFyhr2Wyp
+A4ZozljEQ4AGm4dYFdH3w8TcNDttjztYoKf8OXnaCOj3p/SEq84tk4Hm3vpoPvh
5OzsZC3UB9gJ1mXsKOVKLJFCPzmg61KOvwhopfAcC6cyiIIf/MPCneZeOzsavtQX
J+atSNcLVNE3jmrXvUrwxSpZ3KCc3Ti5Q8pD9ni6/B6st2+LO8EXPrS6n2+28nvu
hVpjyCXLbghdmn1mjOGW9lvMQEg/Dupj/ocpCPHJnXpbpM8Mcjo=
=3eAv
-----END PGP SIGNATURE-----
Merge 3.10.106 into android-msm-bullhead-3.10-oreo-m5
Changes in 3.10.106: (252 commits)
packet: fix race condition in packet_set_ring
crypto: crypto_memneq - add equality testing of memory regions w/o timing leaks
EVM: Use crypto_memneq() for digest comparisons
libceph: don't set weight to IN when OSD is destroyed
KVM: x86: fix emulation of "MOV SS, null selector"
KVM: x86: Introduce segmented_write_std
posix_acl: Clear SGID bit when setting file permissions
tmpfs: clear S_ISGID when setting posix ACLs
fbdev: color map copying bounds checking
selinux: fix off-by-one in setprocattr
tcp: avoid infinite loop in tcp_splice_read()
xfrm_user: validate XFRM_MSG_NEWAE XFRMA_REPLAY_ESN_VAL replay_window
xfrm_user: validate XFRM_MSG_NEWAE incoming ESN size harder
KEYS: Disallow keyrings beginning with '.' to be joined as session keyrings
KEYS: Change the name of the dead type to ".dead" to prevent user access
KEYS: fix keyctl_set_reqkey_keyring() to not leak thread keyrings
ext4: fix data exposure after a crash
locking/rtmutex: Prevent dequeue vs. unlock race
m68k: Fix ndelay() macro
hotplug: Make register and unregister notifier API symmetric
Btrfs: fix tree search logic when replaying directory entry deletes
USB: serial: kl5kusb105: fix open error path
block_dev: don't test bdev->bd_contains when it is not stable
crypto: caam - fix AEAD givenc descriptors
ext4: fix mballoc breakage with 64k block size
ext4: fix stack memory corruption with 64k block size
ext4: reject inodes with negative size
ext4: return -ENOMEM instead of success
f2fs: set ->owner for debugfs status file's file_operations
block: protect iterate_bdevs() against concurrent close
scsi: zfcp: fix use-after-"free" in FC ingress path after TMF
scsi: zfcp: do not trace pure benign residual HBA responses at default level
scsi: zfcp: fix rport unblock race with LUN recovery
ftrace/x86_32: Set ftrace_stub to weak to prevent gcc from using short jumps to it
IB/mad: Fix an array index check
IB/multicast: Check ib_find_pkey() return value
powerpc: Convert cmp to cmpd in idle enter sequence
usb: gadget: composite: Test get_alt() presence instead of set_alt()
USB: serial: omninet: fix NULL-derefs at open and disconnect
USB: serial: quatech2: fix sleep-while-atomic in close
USB: serial: pl2303: fix NULL-deref at open
USB: serial: keyspan_pda: verify endpoints at probe
USB: serial: spcp8x5: fix NULL-deref at open
USB: serial: io_ti: fix NULL-deref at open
USB: serial: io_ti: fix another NULL-deref at open
USB: serial: iuu_phoenix: fix NULL-deref at open
USB: serial: garmin_gps: fix memory leak on failed URB submit
USB: serial: ti_usb_3410_5052: fix NULL-deref at open
USB: serial: io_edgeport: fix NULL-deref at open
USB: serial: oti6858: fix NULL-deref at open
USB: serial: cyberjack: fix NULL-deref at open
USB: serial: kobil_sct: fix NULL-deref in write
USB: serial: mos7840: fix NULL-deref at open
USB: serial: mos7720: fix NULL-deref at open
USB: serial: mos7720: fix use-after-free on probe errors
USB: serial: mos7720: fix parport use-after-free on probe errors
USB: serial: mos7720: fix parallel probe
usb: xhci-mem: use passed in GFP flags instead of GFP_KERNEL
usb: musb: Fix trying to free already-free IRQ 4
ALSA: usb-audio: Fix bogus error return in snd_usb_create_stream()
USB: serial: kl5kusb105: abort on open exception path
staging: iio: ad7606: fix improper setting of oversampling pins
usb: dwc3: gadget: always unmap EP0 requests
cris: Only build flash rescue image if CONFIG_ETRAX_AXISFLASHMAP is selected
hwmon: (ds620) Fix overflows seen when writing temperature limits
clk: clk-wm831x: fix a logic error
iommu/amd: Fix the left value check of cmd buffer
scsi: mvsas: fix command_active typo
target/iscsi: Fix double free in lio_target_tiqn_addtpg()
mmc: mmc_test: Uninitialized return value
powerpc/pci/rpadlpar: Fix device reference leaks
ser_gigaset: return -ENOMEM on error instead of success
net, sched: fix soft lockup in tc_classify
net: stmmac: Fix race between stmmac_drv_probe and stmmac_open
gro: Enter slow-path if there is no tailroom
gro: use min_t() in skb_gro_reset_offset()
gro: Disable frag0 optimization on IPv6 ext headers
powerpc: Fix build warning on 32-bit PPC
Input: i8042 - add Pegatron touchpad to noloop table
mm/hugetlb.c: fix reservation race when freeing surplus pages
USB: serial: kl5kusb105: fix line-state error handling
USB: serial: ch341: fix initial modem-control state
USB: serial: ch341: fix open error handling
USB: serial: ch341: fix control-message error handling
USB: serial: ch341: fix open and resume after B0
USB: serial: ch341: fix resume after reset
USB: serial: ch341: fix modem-control and B0 handling
x86/cpu: Fix bootup crashes by sanitizing the argument of the 'clearcpuid=' command-line option
NFSv4.1: nfs4_fl_prepare_ds must be careful about reporting success.
powerpc/ibmebus: Fix further device reference leaks
powerpc/ibmebus: Fix device reference leaks in sysfs interface
IB/mlx4: Set traffic class in AH
IB/mlx4: Fix port query for 56Gb Ethernet links
perf scripting: Avoid leaking the scripting_context variable
ARM: dts: imx31: fix clock control module interrupts description
svcrpc: don't leak contexts on PROC_DESTROY
mmc: mxs-mmc: Fix additional cycles after transmission stop
mtd: nand: xway: disable module support
ubifs: Fix journal replay wrt. xattr nodes
arm64/ptrace: Preserve previous registers for short regset write
arm64/ptrace: Avoid uninitialised struct padding in fpr_set()
arm64/ptrace: Reject attempts to set incomplete hardware breakpoint fields
ARM: ux500: fix prcmu_is_cpu_in_wfi() calculation
ite-cir: initialize use_demodulator before using it
fuse: do not use iocb after it may have been freed
crypto: caam - fix non-hmac hashes
drm/i915: Don't leak edid in intel_crt_detect_ddc()
s5k4ecgx: select CRC32 helper
platform/x86: intel_mid_powerbtn: Set IRQ_ONESHOT
net: fix harmonize_features() vs NETIF_F_HIGHDMA
tcp: initialize max window for a new fastopen socket
svcrpc: fix oops in absence of krb5 module
ARM: 8643/3: arm/ptrace: Preserve previous registers for short regset write
mac80211: Fix adding of mesh vendor IEs
scsi: zfcp: fix use-after-free by not tracing WKA port open/close on failed send
drm/i915: fix use-after-free in page_flip_completed()
net: use a work queue to defer net_disable_timestamp() work
ipv4: keep skb->dst around in presence of IP options
netlabel: out of bound access in cipso_v4_validate()
ip6_gre: fix ip6gre_err() invalid reads
ping: fix a null pointer dereference
l2tp: do not use udp_ioctl()
packet: fix races in fanout_add()
packet: Do not call fanout_release from atomic contexts
net: socket: fix recvmmsg not returning error from sock_error
USB: serial: mos7840: fix another NULL-deref at open
USB: serial: ftdi_sio: fix modem-status error handling
USB: serial: ftdi_sio: fix extreme low-latency setting
USB: serial: ftdi_sio: fix line-status over-reporting
USB: serial: spcp8x5: fix modem-status handling
USB: serial: opticon: fix CTS retrieval at open
USB: serial: ark3116: fix register-accessor error handling
x86/platform/goldfish: Prevent unconditional loading
goldfish: Sanitize the broken interrupt handler
ocfs2: do not write error flag to user structure we cannot copy from/to
mfd: pm8921: Potential NULL dereference in pm8921_remove()
drm/nv50/disp: min/max are reversed in nv50_crtc_gamma_set()
net: 6lowpan: fix lowpan_header_create non-compression memcpy call
vti4: Don't count header length twice.
net/sched: em_meta: Fix 'meta vlan' to correctly recognize zero VID frames
MIPS: OCTEON: Fix copy_from_user fault handling for large buffers
MIPS: Clear ISA bit correctly in get_frame_info()
MIPS: Prevent unaligned accesses during stack unwinding
MIPS: Fix get_frame_info() handling of microMIPS function size
MIPS: Fix is_jump_ins() handling of 16b microMIPS instructions
MIPS: Calculate microMIPS ra properly when unwinding the stack
MIPS: Handle microMIPS jumps in the same way as MIPS32/MIPS64 jumps
uvcvideo: Fix a wrong macro
scsi: aacraid: Reorder Adapter status check
ath9k: use correct OTP register offsets for the AR9340 and AR9550
fuse: add missing FR_FORCE
RDMA/core: Fix incorrect structure packing for booleans
NFSv4: fix getacl head length estimation
s390/qdio: clear DSCI prior to scanning multiple input queues
IB/ipoib: Fix deadlock between rmmod and set_mode
ktest: Fix child exit code processing
nlm: Ensure callback code also checks that the files match
dm: flush queued bios when process blocks to avoid deadlock
USB: serial: digi_acceleport: fix OOB data sanity check
USB: serial: digi_acceleport: fix OOB-event processing
MIPS: ip27: Disable qlge driver in defconfig
tracing: Add #undef to fix compile error
USB: serial: safe_serial: fix information leak in completion handler
USB: serial: omninet: fix reference leaks at open
USB: iowarrior: fix NULL-deref at probe
USB: iowarrior: fix NULL-deref in write
USB: serial: io_ti: fix NULL-deref in interrupt callback
USB: serial: io_ti: fix information leak in completion handler
vxlan: correctly validate VXLAN ID against VXLAN_N_VID
ipv4: mask tos for input route
locking/static_keys: Add static_key_{en,dis}able() helpers
net: net_enable_timestamp() can be called from irq contexts
dccp/tcp: fix routing redirect race
net sched actions: decrement module reference count after table flush.
perf/core: Fix event inheritance on fork()
isdn/gigaset: fix NULL-deref at probe
xen: do not re-use pirq number cached in pci device msi msg data
net: properly release sk_frag.page
net: unix: properly re-increment inflight counter of GC discarded candidates
Input: ims-pcu - validate number of endpoints before using them
Input: hanwang - validate number of endpoints before using them
Input: yealink - validate number of endpoints before using them
Input: cm109 - validate number of endpoints before using them
USB: uss720: fix NULL-deref at probe
USB: idmouse: fix NULL-deref at probe
USB: wusbcore: fix NULL-deref at probe
uwb: i1480-dfu: fix NULL-deref at probe
uwb: hwa-rc: fix NULL-deref at probe
mmc: ushc: fix NULL-deref at probe
ext4: mark inode dirty after converting inline directory
scsi: libsas: fix ata xfer length
ALSA: ctxfi: Fallback DMA mask to 32bit
ALSA: ctxfi: Fix the incorrect check of dma_set_mask() call
ACPI / PNP: Avoid conflicting resource reservations
ACPI / resources: free memory on error in add_region_before()
ACPI / PNP: Reserve ACPI resources at the fs_initcall_sync stage
USB: OHCI: Fix race between ED unlink and URB submission
i2c: at91: manage unexpected RXRDY flag when starting a transfer
ipv4: igmp: Allow removing groups from a removed interface
ptrace: fix PTRACE_LISTEN race corrupting task->state
ring-buffer: Fix return value check in test_ringbuffer()
metag/usercopy: Fix alignment error checking
metag/usercopy: Add early abort to copy_to_user
metag/usercopy: Set flags before ADDZ
metag/usercopy: Fix src fixup in from user rapf loops
metag/usercopy: Add missing fixups
s390/decompressor: fix initrd corruption caused by bss clear
net/mlx4_en: Fix bad WQE issue
net/mlx4_core: Fix racy CQ (Completion Queue) free
char: Drop bogus dependency of DEVPORT on !M68K
powerpc: Disable HFSCR[TM] if TM is not supported
pegasus: Use heap buffers for all register access
rtl8150: Use heap buffers for all register access
tracing: Allocate the snapshot buffer before enabling probe
ring-buffer: Have ring_buffer_iter_empty() return true when empty
netfilter: arp_tables: fix invoking 32bit "iptable -P INPUT ACCEPT" failed in 64bit kernel
net: phy: handle state correctly in phy_stop_machine
l2tp: take reference on sessions being dumped
MIPS: KGDB: Use kernel context for sleeping threads
ARM: dts: imx31: move CCM device node to AIPS2 bus devices
ARM: dts: imx31: fix AVIC base address
tun: Fix TUN_PKT_STRIP setting
Staging: vt6655-6: potential NULL dereference in hostap_disable_hostapd()
net: sctp: rework multihoming retransmission path selection to rfc4960
perf trace: Use the syscall raw_syscalls:sys_enter timestamp
USB: usbtmc: add missing endpoint sanity check
ping: implement proper locking
USB: fix problems with duplicate endpoint addresses
USB: dummy-hcd: fix bug in stop_activity (handle ep0)
mm/init: fix zone boundary creation
can: Fix kernel panic at security_sock_rcv_skb
Drivers: hv: avoid vfree() on crash
xc2028: avoid use after free
xc2028: unlock on error in xc2028_set_config()
xc2028: Fix use-after-free bug properly
ipv6: fix ip6_tnl_parse_tlv_enc_lim()
ipv6: pointer math error in ip6_tnl_parse_tlv_enc_lim()
ipv6: fix the use of pcpu_tstats in ip6_tunnel
sctp: avoid BUG_ON on sctp_wait_for_sndbuf
sctp: deny peeloff operation on asocs with threads sleeping on it
KVM: x86: clear bus pointer when destroyed
kvm: exclude ioeventfd from counting kvm_io_range limit
KVM: kvm_io_bus_unregister_dev() should never fail
TTY: n_hdlc, fix lockdep false positive
tty: n_hdlc: get rid of racy n_hdlc.tbuf
ipv6: handle -EFAULT from skb_copy_bits
fs: exec: apply CLOEXEC before changing dumpable task flags
mm/huge_memory.c: respect FOLL_FORCE/FOLL_COW for thp
dccp/tcp: do not inherit mc_list from parent
char: lp: fix possible integer overflow in lp_setup()
dccp: fix freeing skb too early for IPV6_RECVPKTINFO
Linux 3.10.106
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Conflicts:
drivers/mfd/pm8921-core.c
include/linux/cpu.h
kernel/cpu.c
net/ipv4/inet_connection_sock.c
net/ipv4/ping.c
commit bf7165cfa23695c51998231c4efa080fe1d3548d upstream.
There are several trace include files that define TRACE_INCLUDE_FILE.
Include several of them in the same .c file (as I currently have in
some code I am working on), and the compile will blow up with a
"warning: "TRACE_INCLUDE_FILE" redefined #define TRACE_INCLUDE_FILE syscalls"
Every other include file in include/trace/events/ avoids that issue
by having a #undef TRACE_INCLUDE_FILE before the #define; syscalls.h
should have one, too.
Link: http://lkml.kernel.org/r/20160928225554.13bd7ac6@annuminas.surriel.com
Fixes: b8007ef742 ("tracing: Separate raw syscall from syscall tracer")
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
We want to use network trace events in production
builds, to help diagnose Wifi problems. However, we
don't want to expose raw kernel pointers in such
builds.
Change the format specifier for the skbaddr field,
so that, if kptr_restrict is enabled, the pointers
will be reported as 0.
BUG=30090733
TEST=manual (see below)
Manual test
- Connect device to a wifi network.
$ adb root && adb shell
device# cd /sys/kernel/debug/tracing
device# echo 1 > events/net/net_dev_xmit/enable
device# echo 1 > events/net/netif_rx/enable
device# echo 1 > tracing_on
device# ping google.com
device# cat trace
- verify that skbaddr is always 0
Change-Id: Ic4bd583d37af6637343601feca875ee24479ddff
When studying page stealing, I noticed some weird looking decisions in
try_to_steal_freepages(). The first I assume is a bug (Patch 1), the
following two patches were driven by evaluation.
Testing was done with stress-highalloc of mmtests, using the
mm_page_alloc_extfrag tracepoint and postprocessing to get counts of how
often page stealing occurs for individual migratetypes, and what
migratetypes are used for fallbacks. Arguably, the worst case of page
stealing is when UNMOVABLE allocation steals from MOVABLE pageblock.
RECLAIMABLE allocation stealing from MOVABLE allocation is also not ideal,
so the goal is to minimize these two cases.
The evaluation of v2 wasn't always clear win and Joonsoo questioned the
results. Here I used different baseline which includes RFC compaction
improvements from [1]. I found that the compaction improvements reduce
variability of stress-highalloc, so there's less noise in the data.
First, let's look at stress-highalloc configured to do sync compaction,
and how these patches reduce page stealing events during the test. First
column is after fresh reboot, other two are reiterations of test without
reboot. That was all accumulater over 5 re-iterations (so the benchmark
was run 5x3 times with 5 fresh restarts).
Baseline:
3.19-rc4 3.19-rc4 3.19-rc4
5-nothp-1 5-nothp-2 5-nothp-3
Page alloc extfrag event 10264225 8702233 10244125
Extfrag fragmenting 10263271 8701552 10243473
Extfrag fragmenting for unmovable 13595 17616 15960
Extfrag fragmenting unmovable placed with movable 7989 12193 8447
Extfrag fragmenting for reclaimable 658 1840 1817
Extfrag fragmenting reclaimable placed with movable 558 1677 1679
Extfrag fragmenting for movable 10249018 8682096 10225696
With Patch 1:
3.19-rc4 3.19-rc4 3.19-rc4
6-nothp-1 6-nothp-2 6-nothp-3
Page alloc extfrag event 11834954 9877523 9774860
Extfrag fragmenting 11833993 9876880 9774245
Extfrag fragmenting for unmovable 7342 16129 11712
Extfrag fragmenting unmovable placed with movable 4191 10547 6270
Extfrag fragmenting for reclaimable 373 1130 923
Extfrag fragmenting reclaimable placed with movable 302 906 738
Extfrag fragmenting for movable 11826278 9859621 9761610
With Patch 2:
3.19-rc4 3.19-rc4 3.19-rc4
7-nothp-1 7-nothp-2 7-nothp-3
Page alloc extfrag event 4725990 3668793 3807436
Extfrag fragmenting 4725104 3668252 3806898
Extfrag fragmenting for unmovable 6678 7974 7281
Extfrag fragmenting unmovable placed with movable 2051 3829 4017
Extfrag fragmenting for reclaimable 429 1208 1278
Extfrag fragmenting reclaimable placed with movable 369 976 1034
Extfrag fragmenting for movable 4717997 3659070 3798339
With Patch 3:
3.19-rc4 3.19-rc4 3.19-rc4
8-nothp-1 8-nothp-2 8-nothp-3
Page alloc extfrag event 5016183 4700142 3850633
Extfrag fragmenting 5015325 4699613 3850072
Extfrag fragmenting for unmovable 1312 3154 3088
Extfrag fragmenting unmovable placed with movable 1115 2777 2714
Extfrag fragmenting for reclaimable 437 1193 1097
Extfrag fragmenting reclaimable placed with movable 330 969 879
Extfrag fragmenting for movable 5013576 4695266 3845887
In v2 we've seen apparent regression with Patch 1 for unmovable events,
this is now gone, suggesting it was indeed noise. Here, each patch
improves the situation for unmovable events. Reclaimable is improved by
patch 1 and then either the same modulo noise, or perhaps sligtly worse -
a small price for unmovable improvements, IMHO. The number of movable
allocations falling back to other migratetypes is most noisy, but it's
reduced to half at Patch 2 nevertheless. These are least critical as
compaction can move them around.
If we look at success rates, the patches don't affect them, that didn't change.
Baseline:
3.19-rc4 3.19-rc4 3.19-rc4
5-nothp-1 5-nothp-2 5-nothp-3
Success 1 Min 49.00 ( 0.00%) 42.00 ( 14.29%) 41.00 ( 16.33%)
Success 1 Mean 51.00 ( 0.00%) 45.00 ( 11.76%) 42.60 ( 16.47%)
Success 1 Max 55.00 ( 0.00%) 51.00 ( 7.27%) 46.00 ( 16.36%)
Success 2 Min 53.00 ( 0.00%) 47.00 ( 11.32%) 44.00 ( 16.98%)
Success 2 Mean 59.60 ( 0.00%) 50.80 ( 14.77%) 48.20 ( 19.13%)
Success 2 Max 64.00 ( 0.00%) 56.00 ( 12.50%) 52.00 ( 18.75%)
Success 3 Min 84.00 ( 0.00%) 82.00 ( 2.38%) 78.00 ( 7.14%)
Success 3 Mean 85.60 ( 0.00%) 82.80 ( 3.27%) 79.40 ( 7.24%)
Success 3 Max 86.00 ( 0.00%) 83.00 ( 3.49%) 80.00 ( 6.98%)
Patch 1:
3.19-rc4 3.19-rc4 3.19-rc4
6-nothp-1 6-nothp-2 6-nothp-3
Success 1 Min 49.00 ( 0.00%) 44.00 ( 10.20%) 44.00 ( 10.20%)
Success 1 Mean 51.80 ( 0.00%) 46.00 ( 11.20%) 45.80 ( 11.58%)
Success 1 Max 54.00 ( 0.00%) 49.00 ( 9.26%) 49.00 ( 9.26%)
Success 2 Min 58.00 ( 0.00%) 49.00 ( 15.52%) 48.00 ( 17.24%)
Success 2 Mean 60.40 ( 0.00%) 51.80 ( 14.24%) 50.80 ( 15.89%)
Success 2 Max 63.00 ( 0.00%) 54.00 ( 14.29%) 55.00 ( 12.70%)
Success 3 Min 84.00 ( 0.00%) 81.00 ( 3.57%) 79.00 ( 5.95%)
Success 3 Mean 85.00 ( 0.00%) 81.60 ( 4.00%) 79.80 ( 6.12%)
Success 3 Max 86.00 ( 0.00%) 82.00 ( 4.65%) 82.00 ( 4.65%)
Patch 2:
3.19-rc4 3.19-rc4 3.19-rc4
7-nothp-1 7-nothp-2 7-nothp-3
Success 1 Min 50.00 ( 0.00%) 44.00 ( 12.00%) 39.00 ( 22.00%)
Success 1 Mean 52.80 ( 0.00%) 45.60 ( 13.64%) 42.40 ( 19.70%)
Success 1 Max 55.00 ( 0.00%) 46.00 ( 16.36%) 47.00 ( 14.55%)
Success 2 Min 52.00 ( 0.00%) 48.00 ( 7.69%) 45.00 ( 13.46%)
Success 2 Mean 53.40 ( 0.00%) 49.80 ( 6.74%) 48.80 ( 8.61%)
Success 2 Max 57.00 ( 0.00%) 52.00 ( 8.77%) 52.00 ( 8.77%)
Success 3 Min 84.00 ( 0.00%) 81.00 ( 3.57%) 79.00 ( 5.95%)
Success 3 Mean 85.00 ( 0.00%) 82.40 ( 3.06%) 79.60 ( 6.35%)
Success 3 Max 86.00 ( 0.00%) 83.00 ( 3.49%) 80.00 ( 6.98%)
Patch 3:
3.19-rc4 3.19-rc4 3.19-rc4
8-nothp-1 8-nothp-2 8-nothp-3
Success 1 Min 46.00 ( 0.00%) 44.00 ( 4.35%) 42.00 ( 8.70%)
Success 1 Mean 50.20 ( 0.00%) 45.60 ( 9.16%) 44.00 ( 12.35%)
Success 1 Max 52.00 ( 0.00%) 47.00 ( 9.62%) 47.00 ( 9.62%)
Success 2 Min 53.00 ( 0.00%) 49.00 ( 7.55%) 48.00 ( 9.43%)
Success 2 Mean 55.80 ( 0.00%) 50.60 ( 9.32%) 49.00 ( 12.19%)
Success 2 Max 59.00 ( 0.00%) 52.00 ( 11.86%) 51.00 ( 13.56%)
Success 3 Min 84.00 ( 0.00%) 80.00 ( 4.76%) 79.00 ( 5.95%)
Success 3 Mean 85.40 ( 0.00%) 81.60 ( 4.45%) 80.40 ( 5.85%)
Success 3 Max 87.00 ( 0.00%) 83.00 ( 4.60%) 82.00 ( 5.75%)
While there's no improvement here, I consider reduced fragmentation events
to be worth on its own. Patch 2 also seems to reduce scanning for free
pages, and migrations in compaction, suggesting it has somewhat less work
to do:
Patch 1:
Compaction stalls 4153 3959 3978
Compaction success 1523 1441 1446
Compaction failures 2630 2517 2531
Page migrate success 4600827 4943120 5104348
Page migrate failure 19763 16656 17806
Compaction pages isolated 9597640 10305617 10653541
Compaction migrate scanned 77828948 86533283 87137064
Compaction free scanned 517758295 521312840 521462251
Compaction cost 5503 5932 6110
Patch 2:
Compaction stalls 3800 3450 3518
Compaction success 1421 1316 1317
Compaction failures 2379 2134 2201
Page migrate success 4160421 4502708 4752148
Page migrate failure 19705 14340 14911
Compaction pages isolated 8731983 9382374 9910043
Compaction migrate scanned 98362797 96349194 98609686
Compaction free scanned 496512560 469502017 480442545
Compaction cost 5173 5526 5811
As with v2, /proc/pagetypeinfo appears unaffected with respect to numbers
of unmovable and reclaimable pageblocks.
Configuring the benchmark to allocate like THP page fault (i.e. no sync
compaction) gives much noisier results for iterations 2 and 3 after
reboot. This is not so surprising given how [1] offers lower improvements
in this scenario due to less restarts after deferred compaction which
would change compaction pivot.
Baseline:
3.19-rc4 3.19-rc4 3.19-rc4
5-thp-1 5-thp-2 5-thp-3
Page alloc extfrag event 8148965 6227815 6646741
Extfrag fragmenting 8147872 6227130 6646117
Extfrag fragmenting for unmovable 10324 12942 15975
Extfrag fragmenting unmovable placed with movable 5972 8495 10907
Extfrag fragmenting for reclaimable 601 1707 2210
Extfrag fragmenting reclaimable placed with movable 520 1570 2000
Extfrag fragmenting for movable 8136947 6212481 6627932
Patch 1:
3.19-rc4 3.19-rc4 3.19-rc4
6-thp-1 6-thp-2 6-thp-3
Page alloc extfrag event 8345457 7574471 7020419
Extfrag fragmenting 8343546 7573777 7019718
Extfrag fragmenting for unmovable 10256 18535 30716
Extfrag fragmenting unmovable placed with movable 6893 11726 22181
Extfrag fragmenting for reclaimable 465 1208 1023
Extfrag fragmenting reclaimable placed with movable 353 996 843
Extfrag fragmenting for movable 8332825 7554034 6987979
Patch 2:
3.19-rc4 3.19-rc4 3.19-rc4
7-thp-1 7-thp-2 7-thp-3
Page alloc extfrag event 3512847 3020756 2891625
Extfrag fragmenting 3511940 3020185 2891059
Extfrag fragmenting for unmovable 9017 6892 6191
Extfrag fragmenting unmovable placed with movable 1524 3053 2435
Extfrag fragmenting for reclaimable 445 1081 1160
Extfrag fragmenting reclaimable placed with movable 375 918 986
Extfrag fragmenting for movable 3502478 3012212 2883708
Patch 3:
3.19-rc4 3.19-rc4 3.19-rc4
8-thp-1 8-thp-2 8-thp-3
Page alloc extfrag event 3181699 3082881 2674164
Extfrag fragmenting 3180812 3082303 2673611
Extfrag fragmenting for unmovable 1201 4031 4040
Extfrag fragmenting unmovable placed with movable 974 3611 3645
Extfrag fragmenting for reclaimable 478 1165 1294
Extfrag fragmenting reclaimable placed with movable 387 985 1030
Extfrag fragmenting for movable 3179133 3077107 2668277
The improvements for first iteration are clear, the rest is much noisier
and can appear like regression for Patch 1. Anyway, patch 2 rectifies it.
Allocation success rates are again unaffected so there's no point in
making this e-mail any longer.
[1] http://marc.info/?l=linux-mm&m=142166196321125&w=2
This patch (of 3):
When __rmqueue_fallback() is called to allocate a page of order X, it will
find a page of order Y >= X of a fallback migratetype, which is different
from the desired migratetype. With the help of try_to_steal_freepages(),
it may change the migratetype (to the desired one) also of:
1) all currently free pages in the pageblock containing the fallback page
2) the fallback pageblock itself
3) buddy pages created by splitting the fallback page (when Y > X)
These decisions take the order Y into account, as well as the desired
migratetype, with the goal of preventing multiple fallback allocations
that could e.g. distribute UNMOVABLE allocations among multiple
pageblocks.
Originally, decision for 1) has implied the decision for 3). Commit
47118af076 ("mm: mmzone: MIGRATE_CMA migration type added") changed that
(probably unintentionally) so that the buddy pages in case 3) are always
changed to the desired migratetype, except for CMA pageblocks.
Commit fef903efcf0c ("mm/page_allo.c: restructure free-page stealing code
and fix a bug") did some refactoring and added a comment that the case of
3) is intended. Commit 0cbef29a7821 ("mm: __rmqueue_fallback() should
respect pageblock type") removed the comment and tried to restore the
original behavior where 1) implies 3), but due to the previous
refactoring, the result is instead that only 2) implies 3) - and the
conditions for 2) are less frequently met than conditions for 1). This
may increase fragmentation in situations where the code decides to steal
all free pages from the pageblock (case 1)), but then gives back the buddy
pages produced by splitting.
This patch restores the original intended logic where 1) implies 3).
During testing with stress-highalloc from mmtests, this has shown to
decrease the number of events where UNMOVABLE and RECLAIMABLE allocations
steal from MOVABLE pageblocks, which can lead to permanent fragmentation.
In some cases it has increased the number of events when MOVABLE
allocations steal from UNMOVABLE or RECLAIMABLE pageblocks, but these are
fixable by sync compaction and thus less harmful.
Note that evaluation has shown that the behavior introduced by
47118af076 for buddy pages in case 3) is actually even better than the
original logic, so the following patch will introduce it properly once
again. For stable backports of this patch it makes thus sense to only fix
versions containing 0cbef29a7821.
[iamjoonsoo.kim@lge.com: tracepoint fix]
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: <stable@vger.kernel.org> [3.13+ containing 0cbef29a7821]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In general, every tracepoint should be zero overhead if it is disabled.
However, trace_mm_page_alloc_extfrag() is one of exception. It evaluate
"new_type == start_migratetype" even if tracepoint is disabled.
However, the code can be moved into tracepoint's TP_fast_assign() and
TP_fast_assign exist exactly such purpose. This patch does it.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In the current code, the value of fallback_migratetype that is printed
using the mm_page_alloc_extfrag tracepoint, is the value of the
migratetype *after* it has been set to the preferred migratetype (if the
ownership was changed). Obviously that wouldn't have been the original
intent. (We already have a separate 'change_ownership' field to tell
whether the ownership of the pageblock was changed from the
fallback_migratetype to the preferred type.)
The intent of the fallback_migratetype field is to show the migratetype
from which we borrowed pages in order to satisfy the allocation request.
So fix the code to print that value correctly.
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Cody P Schafer <cody@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Decare war on uninterruptible sleep. Add a tracepoint which
walks the kernel stack and dumps the first non-scheduler function
called before the scheduler is invoked.
Change-Id: I19e965d5206329360a92cbfe2afcc8c30f65c229
Signed-off-by: Riley Andrews <riandrews@google.com>
This patch includes two trace events on generic_perform_write and
do_generic_file_read to check on the address_space mapping for the
pages to be accessed by the request.
Change-Id: Ib319b9b2c971b9e5c76645be6cfd995ef9465d77
Signed-off-by: Daniel Campello <campello@google.com>
(cherry picked from commit d3952c50853166bd04562766c9603ed86ab0da75)
The Inter Processor Interrupt is used to make another processor do a
specific action such as rescheduling tasks, signal a timer event or
execute something in another CPU's context. IRQs are already traceable
but IPIs were not. Tracing them is useful for monitoring IPI latency,
or to verify when they are the source of CPU wake-ups with power
management implications.
Three trace hooks are defined: ipi_raise, ipi_entry and ipi_exit. To make
them portable, a string is used to identify them and correlate related
events. Additionally, ipi_raise records a bitmask representing targeted
CPUs.
Link: http://lkml.kernel.org/p/1406318733-26754-3-git-send-email-nicolas.pitre@linaro.org
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Being able to show a cpumask of events can be useful as some events
may affect only some CPUs. There is no standard way to record the
cpumask and converting it to a string is rather expensive during
the trace as traces happen in hotpaths. It would be better to record
the raw event mask and be able to parse it at print time.
The following macros were added for use with the TRACE_EVENT() macro:
__bitmask()
__assign_bitmask()
__get_bitmask()
To test this, I added this to the sched_migrate_task event, which
looked like this:
TRACE_EVENT(sched_migrate_task,
TP_PROTO(struct task_struct *p, int dest_cpu, const struct cpumask *cpus),
TP_ARGS(p, dest_cpu, cpus),
TP_STRUCT__entry(
__array( char, comm, TASK_COMM_LEN )
__field( pid_t, pid )
__field( int, prio )
__field( int, orig_cpu )
__field( int, dest_cpu )
__bitmask( cpumask, num_possible_cpus() )
),
TP_fast_assign(
memcpy(__entry->comm, p->comm, TASK_COMM_LEN);
__entry->pid = p->pid;
__entry->prio = p->prio;
__entry->orig_cpu = task_cpu(p);
__entry->dest_cpu = dest_cpu;
__assign_bitmask(cpumask, cpumask_bits(cpus), num_possible_cpus());
),
TP_printk("comm=%s pid=%d prio=%d orig_cpu=%d dest_cpu=%d cpumask=%s",
__entry->comm, __entry->pid, __entry->prio,
__entry->orig_cpu, __entry->dest_cpu,
__get_bitmask(cpumask))
);
With the output of:
ksmtuned-3613 [003] d..2 485.220508: sched_migrate_task: comm=ksmtuned pid=3615 prio=120 orig_cpu=3 dest_cpu=2 cpumask=00000000,0000000f
migration/1-13 [001] d..5 485.221202: sched_migrate_task: comm=ksmtuned pid=3614 prio=120 orig_cpu=1 dest_cpu=0 cpumask=00000000,0000000f
awk-3615 [002] d.H5 485.221747: sched_migrate_task: comm=rcu_preempt pid=7 prio=120 orig_cpu=0 dest_cpu=1 cpumask=00000000,000000ff
migration/2-18 [002] d..5 485.222062: sched_migrate_task: comm=ksmtuned pid=3615 prio=120 orig_cpu=2 dest_cpu=3 cpumask=00000000,0000000f
Link: http://lkml.kernel.org/r/1399377998-14870-6-git-send-email-javi.merino@arm.com
Link: http://lkml.kernel.org/r/20140506132238.22e136d1@gandalf.local.home
Suggested-by: Javi Merino <javi.merino@arm.com>
Tested-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
This will cause ext3 and gfs2 to not compile correctly, but allows us
to get a modern version of ext4 into 3.10. This makes it easier to
backport newer features such as ext4 encryption into a downrev kernel.
It also fixes a number of xfstest failures that were fixed since 3.10.
The subsequent commits will fix up the 3.18 ext4 codebase so it will
compile against 3.10.
Signed-off-by: Theodore Ts'o <tytso@google.com>
Add support for enter/exit cycle sysfs nodes for io detection
There are some usecases which may benefit from different enter/exit
cycle load criteria for IO load. This change adds support for
that.
Change-Id: Iff135ed11b92becc374ace4578e0efc212d2b731
Signed-off-by: Tapas Kumar Kundu <tkundu@codeaurora.org>
Add support for multi_enter_cycles/multi_exit_cycles per cluster
There are some usecases which may benefit from different enter/exit
cycle load criteria for multimode cpu load. This change adds support for
that.
Change-Id: I3408405307ca03b9bba3f03e216ef59b98f29832
Signed-off-by: Tapas Kumar Kundu <tkundu@codeaurora.org>
Certain governors may stop sending out notifications once CPUs enter
idle at min frequency.If governor's notifications stop then single mode
will not exit for long time. It can happen only if the exit conditions are
set in such a way that the time taken to exit single mode exceeds the time
for the governor to ramp down, idle out and hence stop sending
notifications leaving the system in single mode indefinitely.
This change adds seperate enter/exit cycle sysfs nodes along with a per
cluster non-deferrable timer for single mode exit. The timer is armed only
when the load starts falling below the exit load threshold and is
cancelled when either the load starts going up or SINGLE mode is exited
due to exceeding exit cycle count. On expiry the timer resets SINGLE mode
and the enter/exit cycle counts.
Change-Id: I02dd3fa8af39ca320e80da6391eb2b1ea635a433
Signed-off-by: Tapas Kumar Kundu <tkundu@codeaurora.org>
This reverts commit 6105c52440.
Reverting all pm_qos common framework changes and their
dependencies.
Change-Id: Ida9372f549aba0253e43796f09256466241bf24b
CRs-Fixed: 811532
Signed-off-by: Dov Levenglick <dovl@codeaurora.org>
Add trace-points for tracking pm_qos voting.
This assists in following the voting for debugging
performance related issues.
Change-Id: I5a9e886c739252043e9f28f100e0493436a0eb75
Signed-off-by: Dov Levenglick <dovl@codeaurora.org>
Extend sched_get_nr_running_avg() API to return average nr_big_tasks,
in addition to average nr_running and average nr_io_wait tasks. Also
add a new trace point to record values returned by
sched_get_nr_running_avg() API.
Change-Id: Id3591e6d04da8db484b4d1cb9d95dba075f5ab9a
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
This driver is a current limit management module to help manage
instantaneous peak current drawn by multiple subsystems on shared
supply. The inputs to the mitigation algorithm are current states
of different subsystems sharing this supply like cpu frequency,
gpu frequency, number of cores online, soc temperature, core leakage,
and modem state. It throttles cpu frequency and limits number of
online cores to reduce the dynamic current so as to keep the total
current drawn from supply in safe limits.
Change-Id: I4592b8be48bad3709e8cfb09da53f23279a8ff9b
Signed-off-by: Manaf Meethalavalappu Pallikunhi <manafm@codeaurora.org>
Detect single and multi threaded heavy workloads based on loads
received from interactive governor.
- If the max load across all the CPUs is greater than a
user-specified threshold for certain number of governor windows
then the load is detected as a single-threaded workload.
- If the total load across all the CPUs is greater than a
user-specified threshold for certain number of governor windows
then the load is detected as a multi-threaded workload.
If one of these is detected then a notification is sent to the
userspace so that an entity can read the nodes exposed to get an
idea of the nature of workload running.
Change-Id: Iba75d26fb3981886b3a8460d5f8999a632bbb73a
Signed-off-by: Rohit Gupta <rohgup@codeaurora.org>
Some workloads spend a lot of time in IO activity and need higher
performance from system resources (for eg. CPU/DDR frequencies)to
complete with decent performance. Unfortunately cpufreq governors and
other sytem resources crucial for IO are tuned for general usecases
and hence might be slower to react to such demanding IO workloads.
This patch adds functionality to detect IO workloads and then send
hints to userspace of the detected activity so that userspace can
take necessary tuning action to prepare the system for such activity.
IO activity is tracked every interactive governor timer boundary and
if the percentage of iowait time in each cycle exceeds certain
threshold continuously for certain number of cycles then heavy IO
activity is detected.
Change-Id: I73859517cb436e50340ef14739183e61fc62f90f
Signed-off-by: Rohit Gupta <rohgup@codeaurora.org>
Key hmp stats (nr_big_tasks, nr_small_tasks and
cumulative_runnable_average) are currently maintained per-cpu in
'struct rq'. Merge those stats in their own structure (struct
hmp_sched_stats) and modify impacted functions to deal with the newly
introduced structure. This cleanup is required for a subsequent patch
which fixes various issues with use of CFS_BANDWIDTH feature in HMP
scheduler.
Change-Id: Ieffc10a3b82a102f561331bc385d042c15a33998
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
Add LMH Lite ftraces for the LMH Lite event.
Ftraces are added to events like,
1. SCM call entry and exit
2. Trim error event
3. LMH interrupt trigger and clear event
4. LMH sensor interrupt event with intensity value
5. LMH sensor intensity reading
Change-Id: I2bc0c31fab751f0ee23b52e7d978a90d20a0eea1
Signed-off-by: Ram Chandrasekar <rkumbako@codeaurora.org>
Add the ability to switch NoC masters to be in limiter and regulator mode
for the adhoc bus driver. These modes offer differing degrees of
throttling the io traffic from NoC master ports if needed.
Change-Id: If2f868430ebccff1a11aad7d90fa5b352ea2c876
Signed-off-by: Girish Mahadevan <girishm@codeaurora.org>
Add trace events to the ad-hoc bus driver to assist in
client agg vote, clocks, QOS debugging and rules applied.
Change-Id: I46ae10bd550117dea2f3c2934e8335c8c0b0e1bd
Signed-off-by: Alok Chauhan <alokc@codeaurora.org>
Add the current CPU temperature to the sched_cpu_load trace point.
This will allow us to track the CPU temperature.
CRs-Fixed: 764788
Change-Id: Ib2e3559bbbe3fe07a6b7c8115db606828bc36254
Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
Prefer idle determines whether the scheduler prefers an idle CPU
over a busy CPU or not to wake up a task on. Knowing the correct
value of this tunable is essential in understanding placement
decisions made in select_best_cpu().
Change-Id: I955d7577061abccb65d01f560e1911d9db70298a
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
Sync wakeups provide a hint to the scheduler about upcoming task
activity. Knowing which wakeups are sync wakeups from logs will
assist in workload analysis.
Change-Id: I6ffe73f2337e56b8234d4097069d5d70ab045eda
Signed-off-by: Steve Muckle <smuckle@codeaurora.org>
The irqload is used in determining whether CPUs are mostly idle
so it is useful to know this value while viewing scheduler traces.
Change-Id: Icbb74fc1285be878f254ae54886bdb161b14a270
Signed-off-by: Steve Muckle <smuckle@codeaurora.org>
Add ftrace event so that the state of all the clocks can be recorded
using ftrace. Use that event to add support in debugfs to take a
snapshot of all the clocks present and their current state to ftrace.
CRs-Fixed: 766583
Change-Id: Ibe95b5eaa013e2da378b9dc5e8c43162895ef272
Signed-off-by: Pushkar Joshi <pushkarj@codeaurora.org>
Add new APIs to the bus scaling driver. The new APIs make it
easier for clients to setup paths for bus scaling. The driver APIs
will return a pointer to a client handle in case of success and NULL or
error in cases of failure. For now the existing APIs will remain as is
eventually all clients will start switching over to the new APIs.
Change-Id: I22656dddf13802128ee5c4faab9f83f9c6f8e683
Signed-off-by: Girish Mahadevan <girishm@codeaurora.org>
Evaluate every rule for a given node when a bus transaction happens and
apply the first matched rule allowing for multiple rules to be applicable
but apply the most restrictive.
Change-Id: I25018ac4260916fd5c42d8a73b886b13a0d2b3a0
Signed-off-by: Girish Mahadevan <girishm@codeaurora.org>
Sometimes for power saving reasons we might want to keep fewer CPUs
online without adversely affecting performance for certain real world
usecases. This module helps to provide that hotplug support to the
userspace such that it tries to make a best effort in keeping a certain
number of CPUs online as specified by the userspace.
It allows any userspace entity to specify the CPUs that it wants to
manage with this module and of those, the number of CPUs that should be
kept online.
Change-Id: I82c6d6e998d3740ad6f8c67b47344ce87f328b8b
Signed-off-by: Rohit Gupta <rohgup@codeaurora.org>
This patch adds the profiling support for some of the time critical
operations like hibern8 enter/exit, clock gating & clock scaling.
Change-Id: I4dde1078dcd2af47f051639b03c44c423ee344fa
Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
Make criteria for notifying governor to be per-cpu. Governor is
notified of any large change in cpu's busy time statistics
(rq->prev_runnable_sum) since the last reported value.
Change-Id: I727354d994d909b166d093b94d3dade7c7dddc0d
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
In order to save power we should put the UFS link into hibern8 as soon as
UFS link is idle and power measurement of active usecases (like audio/video
playback/recording) show that putting UFS link in hibern8 @ 10ms of idle
(if not earlier) would save significant power.
Our current available solution is to do hibern8 with clock gating @idle
timeout of 150ms. As clock gating has huge latencies (7ms each in enter and
exit), we cannot bring down the idle timeout to <=10ms without degrading
UFS throughput. Hence this change has added support to enter into hibern8
with another idle timer.
Change-Id: I5a31f18fc21015d4a68236da9fd94f3f016e1d44
Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
Previously, there was a limitation in load change callback that it
can't attempt to wake up a task. Therefore the best we can do is to
schedule timer at current jiffy. The timer function will only be
executed at next timer tick. This could take up to 10ms.
Now that this limitation is removed, re-evaluate load immediately upon
receiving this callback.
Change-Id: Iab3de4705b9aae96054655b1541e32fb040f7e60
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
rq->curr/prev_runnable_sum counters represent cpu demand from various
tasks that have run on a cpu. Any task that runs on a cpu will have a
representation in rq->curr_runnable_sum. Their partial_demand value
will be included in rq->curr_runnable_sum. Since partial_demand is
derived from historical load samples for a task, rq->curr_runnable_sum
could represent "inflated/un-realistic" cpu usage. As an example, lets
say that task with partial_demand of 10ms runs for only 1ms on a cpu.
What is included in rq->curr_runnable_sum is 10ms (and not the actual
execution time of 1ms). This leads to cpu busy time being reported on
the upside causing frequency to stay higher than necessary.
This patch fixes cpu busy accounting scheme to strictly represent
actual usage. It also provides for conditional fixup of busy time upon
migration and upon heavy-task wakeup.
CRs-Fixed: 691443
Change-Id: Ic4092627668053934049af4dfef65d9b6b901e6b
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
Add two new ftrace event:
* trace_sched_freq_alert, to log notifications sent
to governor for requesting change in frequency.
* trace_sched_get_busy, to log cpu busytime information returned by
scheduler
Extend existing ftrace events as follows:
* sched_update_task_ravg() event to log irqtime parameter
* sched_migration_update_sum() to log threadid which is being migrated
(and thus responsible for update of curr_runnable_sum and
prev_runnable_sum counters)
Change-Id: Ia68ce0953a2d21d319a1db7f916c51ff6a91557c
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
While debugging its always useful to know whether a task is small or
not to determine the scheduling algorithm being used. Have the
sched_task_load tracepoint indicate this information rather than
having to do manual calculations for every task placement.
Change-Id: Ibf390095f05c7da80df1ebfe00f4c5af66c97d12
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
C-state information is used by the scheduler for small task placement
decisions. Track this information in the sched_cpu_load trace event.
Also add the trace event in best_small_task_cpu(). This will help
better understand small task placement decisions.
Change-Id: Ife5f05bba59f85c968fab999bd13b9fb6b1c184e
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>