android_kernel_lge_bullhead

Commit Graph

Author	SHA1	Message	Date
Linux Build Service Account	72de2d6cb8	Merge "sched: update_rq_clock() must skip ONE update"	2014-11-30 16:19:16 -08:00
Linux Build Service Account	b4b0ebc5f9	Merge "sched: tighten up jiffy to sched_clock mapping"	2014-11-29 17:17:43 -08:00
Srivatsa Vaddagiri	ab2ff007fe	sched: update_rq_clock() must skip ONE update Prevent large wakeup latencies from being accounted to the wrong task. Change-Id: Ie9932acb8a733989441ff2dd51c50a2626cfe5c5 Cc: <stable@vger.kernel.org> Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com> CRs-Fixed: 755576 Patch-mainline: http://permalink.gmane.org/gmane.linux.kernel/1677324 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-11-25 12:28:35 +05:30
Linux Build Service Account	aa285b577f	Merge "sched: per-cpu mostly_idle threshold"	2014-11-20 15:36:30 -08:00
Linux Build Service Account	62b1d26801	Merge "sched: Add API to set task's initial task load"	2014-11-20 15:36:29 -08:00
Steve Muckle	f17fe85baf	sched: tighten up jiffy to sched_clock mapping The tick code already tracks exact time a tick is expected to arrive. This can be used to eliminate slack in the jiffy to sched_clock mapping that aligns windows between a caller of sched_set_window and the scheduler itself. Change-Id: I9d47466658d01e6857d7457405459436d504a2ca Signed-off-by: Steve Muckle <smuckle@codeaurora.org>	2014-11-19 15:06:33 -08:00
Syed Rameez Mustafa	a40d3ce56e	sched: Avoid unnecessary load balance when tasks don't fit on dst_cpu When considering whether to pull over a task that does not fit on the destination CPU, make sure that the busiest group has exceeded its capacity. While the change is applicable to all groups, the biggest impact will be on migrating big tasks to little CPUs. This should only happen when the big cluster is no longer capable of balancing load within the cluster. This change should have no impact on single cluster systems. Change-Id: I6d1ef0e0d878460530f036921ce4a4a9c1e1394b Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2014-11-13 12:24:31 -08:00
Steve Muckle	e3d8a00dab	sched: print sched_cpu_load tracepoint for all CPUs When select_best_cpu() is called because a task is on a suboptimal CPU, certain CPUs are skipped because moving the task there would not make things any better. For the purposes of debugging though it is useful to always see the state of all CPUs. Change-Id: I76965663c1feef5c4cfab9909e477b0dcf67272d Signed-off-by: Steve Muckle <smuckle@codeaurora.org>	2014-11-10 19:22:51 -08:00
Srivatsa Vaddagiri	ed7d7749e9	sched: per-cpu mostly_idle threshold sched_mostly_idle_load and sched_mostly_idle_nr_run knobs help pack tasks on cpus to some extent. In some cases, it may be desirable to have different packing limits for different cpus. For example, pack to a higher limit on high-performance cpus compared to power-efficient cpus. This patch removes the global mostly_idle tunables and makes them per-cpu, thus letting task packing behavior be controlled in a fine-grained manner. Change-Id: Ifc254cda34b928eae9d6c342ce4c0f64e531e6c2 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-11-06 15:27:00 +05:30
Srivatsa Vaddagiri	f0e281597c	sched: Add API to set task's initial task load Add a per-task attribute, init_load_pct, that is used to initialize newly created children's initial task load. This helps important applications launch their child tasks on cpus with highest capacity. Change-Id: Ie9665fd2aeb15203f95fd7f211c50bebbaa18727 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-11-05 14:26:59 +05:30
Syed Rameez Mustafa	297c4ccce8	sched: use C-states in non-small task wakeup placement logic Currently when a non-small task wakes up, the task placement logic first tries to find the least loaded CPU before breaking any ties via the power cost of running the task on those CPUs. When the power cost is also same, however, the scheduler just selects the first CPU it came across. Use C-states to further break ties when the power cost is the same for multiple CPUs. The scheduler will now pick a CPU in the shallowest C-state. Change-Id: Ie1401b305fa02758a2f7b30cfca1afe64459fc2b Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2014-11-04 14:11:24 -08:00
Linux Build Service Account	c44483c313	Merge "sched: Provide an easy method to log context switch latencies"	2014-10-24 22:51:02 -07:00
Linux Build Service Account	0f1e07cdf9	Merge "sched: take rq lock prior to saving idle task's mark_start"	2014-10-24 16:09:25 -07:00
Syed Rameez Mustafa	a2006e83ba	sched: Provide an easy method to log context switch latencies Allow logging of various sections of context switch in order to derive the worst case latencies associated with them. This is required for scheduler profiling. Change-Id: I3a5009cb3088cc7ace2cd3130d4d7b24e957bada Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2014-10-23 17:38:22 -07:00
Linux Build Service Account	27c06362c4	Merge "sched: update governor notification logic"	2014-10-22 15:58:33 -07:00
Steve Muckle	eed96dfa2a	sched: take rq lock prior to saving idle task's mark_start When the idle task is being re-initialized during hotplug its mark_start value must be retained. The runqueue lock must be held when reading this value though to serialize this with other CPUs that could update the idle task's window-based statistics. CRs-Fixed: 743991 Change-Id: I1bca092d9ebc32a808cea2b9fe890cd24dc868cd Signed-off-by: Steve Muckle <smuckle@codeaurora.org>	2014-10-22 15:12:41 -07:00
Srivatsa Vaddagiri	f3386c7cfb	sched: update governor notification logic Make criteria for notifying governor to be per-cpu. Governor is notified of any large change in cpu's busy time statistics (rq->prev_runnable_sum) since the last reported value. Change-Id: I727354d994d909b166d093b94d3dade7c7dddc0d Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-10-15 14:57:18 -07:00
Srivatsa Vaddagiri	f99927a703	sched: window-stats: Retain idle thread's mark_start init_idle() is called on a cpu's idle-thread once at bootup and subsequently every time the cpu is hot-added. Since init_idle() calls __sched_fork(), we end up blowing idle thread's ravg.mark_start value. As a result we will fail to accurately maintain cpu's curr/prev_runnable_sum counters. The example below illustrates such a failure: CS = curr_runnable_sum, PS = prev_runnable_sum t0 -> New window starts for CPU2 <after some_task_activity> CS = X, PS = Y t1 -> <cpu2 is hot-removed. idle_task starts running on cpu2> At this time, cpu2_idle_thread.ravg.mark_start = t1 t1 -> t0 + W. One window elapses. CPU2 still hot-removed. We defer swapping CS and PS until some future task event occurs t2 -> CPU2 hot-added. _cpu_up()->idle_thread_get()->init_idle() ->__sched_fork() results in cpu2_idle_thread.ravg.mark_start = 0 t3 -> Some task wakes on cpu2. Since mark_start = 0, we don't swap CS and PS => which is a BUG! Fix this by retaining idle task's original mark_start value during init_idle() call. Change-Id: I4ac9bfe3a58fb5da8a6c7bc378c79d9930d17942 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-10-13 16:11:20 -07:00
Linux Build Service Account	53d2a04a26	Merge "sched: Stop task migration to busy CPUs due to power active balance"	2014-10-12 08:39:38 -07:00
Olav Haugan	72bbd4b7cb	sched: Add checks for frequency change We need to check for frequency change when a task is migrated due to affinity change and during active balance. Change-Id: I96676db04d34b5b91edd83431c236a1c28166985 Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>	2014-10-09 15:37:22 -07:00
Srivatsa Vaddagiri	19b3f3f871	sched: Use absolute scale for notifying governor Make the tunables used for deciding the need for notification to be on absolute scale. The earlier scale (in percent terms relative to cur_freq) does not work well with available range of frequencies. For example, 100% tunable value would work well for lower range of frequencies and not for higher range. Having the tunable to be on absolute scale makes tuning more realistic. Change-Id: I35a8c4e2f2e9da57f4ca4462072276d06ad386f1 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-10-03 14:03:56 -07:00
Srivatsa Vaddagiri	2568673dd6	sched: window-stats: Enhance cpu busy time accounting rq->curr/prev_runnable_sum counters represent cpu demand from various tasks that have run on a cpu. Any task that runs on a cpu will have a representation in rq->curr_runnable_sum. Their partial_demand value will be included in rq->curr_runnable_sum. Since partial_demand is derived from historical load samples for a task, rq->curr_runnable_sum could represent "inflated/un-realistic" cpu usage. As an example, let's say that a task with partial_demand of 10ms runs for only 1ms on a cpu. What is included in rq->curr_runnable_sum is 10ms (and not the actual execution time of 1ms). This leads to cpu busy time being reported on the upside causing frequency to stay higher than necessary. This patch fixes cpu busy accounting scheme to strictly represent actual usage. It also provides for conditional fixup of busy time upon migration and upon heavy-task wakeup. CRs-Fixed: 691443 Change-Id: Ic4092627668053934049af4dfef65d9b6b901e6b Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-10-03 14:03:51 -07:00
Srivatsa Vaddagiri	dababc266f	sched: window-stats: ftrace event improvements Add two new ftrace events: * trace_sched_freq_alert, to log notifications sent to governor for requesting change in frequency. * trace_sched_get_busy, to log cpu busytime information returned by scheduler Extend existing ftrace events as follows: * sched_update_task_ravg() event to log irqtime parameter * sched_migration_update_sum() to log threadid which is being migrated (and thus responsible for update of curr_runnable_sum and prev_runnable_sum counters) Change-Id: Ia68ce0953a2d21d319a1db7f916c51ff6a91557c Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-10-03 13:47:29 -07:00
Srivatsa Vaddagiri	86df733742	sched: improve logic for alerting governor Currently we send notification to governor not taking note of cpus that are synchronized with regard to their frequency. As a result, scheduler could send pointless notifications (notification spam!). Avoid this by considering synchronized cpus and alerting governor only when the highest demand of any cpu within cluster far exceeds or falls behind current frequency. Change-Id: I74908b5a212404ca56b38eb94548f9b1fbcca33d Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-10-03 13:46:18 -07:00
Syed Rameez Mustafa	0b013c8593	sched: Stop task migration to busy CPUs due to power active balance Power active balance should only be invoked when the destination CPU is calling load balance with either a CPU_IDLE or a CPU_NEWLY_IDLE environment. We do not want to push tasks towards busy CPUs even if they are a more power efficient place to run that task. This can cause higher scheduling latencies due to the resulting load imbalance. Change-Id: I8e0f242338887d189e2fc17acfb63586e7c40839 Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2014-10-02 17:37:09 -07:00
Srivatsa Vaddagiri	6cb3d32976	sched: window-stats: Fix accounting bug in legacy mode TASK_UPDATE event currently does not result in increment of rq->curr_runnable_sum in legacy mode, which is wrong. As a result, cpu busy time reported under legacy mode could be incorrect. Change-Id: Ifa76c735a0ead23062c1a64faf97e7b801b66bf9 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-09-15 12:00:46 +05:30
Srivatsa Vaddagiri	802a513d90	sched: window-stats: Note legacy mode in fork() and exit() In legacy mode, mark_task_starting() should avoid adding (new) task's (initial) demand to rq->curr_runnable_sum and rq->prev_runnable_sum. Similarly exit() should avoid removing (exiting) task's demand from rq->curr_runnable_sum and rq->prev_runnable_sum (as those counters don't include task's demand and partial_demand values in legacy mode). Change-Id: I26820b1ac5885a9d681d363ec53d6866a2ea2e6f Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-09-15 12:00:46 +05:30
Srivatsa Vaddagiri	718293c53c	sched: Fix reference to stale task_struct in try_to_wake_up() try_to_wake_up() currently drops p->pi_lock and later checks for need to notify cpufreq governor on task migrations or wakeups. However the woken task could exit between the time p->pi_lock is released and the time the test for notification is run. As a result, the test for notification could refer to an exited task. task_notify_on_migrate(p) could thus lead to invalid memory reference. Fix this by running the test for notification with task's pi_lock held. Change-Id: I1c7a337473d2d8e79342a015a179174ce00702e1 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-09-15 12:00:46 +05:30
Syed Rameez Mustafa	37c0e84719	sched: Remove hack to enable/disable HMP scheduling extensions The current method of turning HMP scheduling extensions on or off based on the number of CPUs is inappropriate as there may be SoCs with 4 or fewer cores that require the use of these extensions. Remove this hack as HMP extensions will now be enabled/disabled via command line options. Change-Id: Id44b53c2c3b3c3b83e1911a834e2c824f3958135 Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2014-09-11 09:18:03 -07:00
Linux Build Service Account	7a62303fd9	Merge "sched: add check for cpu idleness when using C-state information"	2014-09-11 01:52:32 -07:00
Linux Build Service Account	e81a3dc7f7	Merge "sched: extend sched_task_load tracepoint to indicate small tasks"	2014-09-11 01:52:31 -07:00
Linux Build Service Account	cef2bfadb1	Merge "sched: Add C-state tracking to the sched_cpu_load trace event"	2014-09-09 04:48:06 -07:00
Linux Build Service Account	0dbd5f1b7b	Merge "sched: window-stats: add a new AVG policy"	2014-09-09 04:47:32 -07:00
Linux Build Service Account	672d3eb95f	Merge "sched: fix wrong load_scale_factor/capacity/nr_big/small_tasks"	2014-09-09 00:57:10 -07:00
Srivatsa Vaddagiri	9e37153f17	sched: fix wrong load_scale_factor/capacity/nr_big/small_tasks A couple of bugs exist with incorrect use of cpu_online_mask in pre/post_big_small_task() functions, leading to potentially incorrect computation of load_scale_factor/capacity/nr_big/small_tasks. pre/post_big_small_task_count_change() use cpu_online_mask in an unreliable manner. While local_irq_disable() in pre_big_small_task_count_change() ensures a cpu won't go away in cpu_online_mask, nothing prevents a cpu from coming online concurrently. As a result, cpu_online_mask used in pre_big_small_task_count_change() can be inconsistent with that used in post_big_small_task_count_change() which can lead to an attempt to unlock rq->lock which was not taken before. Secondly, when either max_possible_freq or min_max_freq is changing, it needs to trigger recomputation of load_scale_factor and capacity for all cpus, even if some are offline. Otherwise, an offline cpu could later come online with incorrect load_scale_factor/capacity. While it should be sufficient to scan online cpus for updating their nr_big/small_tasks in post_big_small_task_count_change(), unfortunately it sounds pretty hard to provide a stable cpu_online_mask when it's called from cpufreq_notifier_policy(). cpufreq framework can trigger a CPUFREQ_NOTIFY notification in multiple contexts, some in cpu-hotplug paths, which makes it pretty hard to guess whether get_online_cpus() can be taken without causing deadlocks or not. To work around the insufficient information we have about the hotplug-safety context when CPUFREQ_NOTIFY is issued, have post_big_small_task_count_change() traverse all possible cpus in updating nr_big/small_task_count. CRs-Fixed: 717134 Change-Id: Ife8f3f7cdfd77d5a21eee63627d7a3465930aed5 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-09-08 17:18:24 -07:00
Syed Rameez Mustafa	04953f4035	sched: add check for cpu idleness when using C-state information Task enqueue on a CPU occurs prior to that CPU exiting an idle state. For the time duration between enqueue and idle exit, the CPU C-state information can no longer be relied on for further task placement since already enqueued/waiting tasks are not taken into account. The small task placement algorithm implicitly assumes a non zero C-state implies an idle CPU. Since this assumption is incorrect for the duration described above, make the cpu_idle() check explicit. This problem can lead to task packing beyond the mostly_idle threshold. Change-Id: Idb5be85705d6b15f187d011ea2196e1bfe31dbf2 Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2014-09-08 15:25:11 -07:00
Syed Rameez Mustafa	444e5dee14	sched: extend sched_task_load tracepoint to indicate small tasks While debugging, it's always useful to know whether a task is small or not to determine the scheduling algorithm being used. Have the sched_task_load tracepoint indicate this information rather than having to do manual calculations for every task placement. Change-Id: Ibf390095f05c7da80df1ebfe00f4c5af66c97d12 Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2014-09-08 14:40:58 -07:00
Syed Rameez Mustafa	e85e73f1d7	sched: Add C-state tracking to the sched_cpu_load trace event C-state information is used by the scheduler for small task placement decisions. Track this information in the sched_cpu_load trace event. Also add the trace event in best_small_task_cpu(). This will help better understand small task placement decisions. Change-Id: Ife5f05bba59f85c968fab999bd13b9fb6b1c184e Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2014-09-08 11:29:57 -07:00
Syed Rameez Mustafa	bf3e6c0e55	sched: window-stats: add a new AVG policy The current WINDOW_STATS_AVG policy is actually a misnomer since it uses the maximum value of the runtime in the recent window and the average of the past ravg_hist_size windows. Add a policy that only uses the average and call it WINDOW_STATS_AVG policy. Rename all the other policies to make them shorter and unambiguous. Change-Id: I080a4ea072a84a88858ca9da59a4151dfbdbe62c Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2014-09-08 11:07:41 -07:00
Linux Build Service Account	c5bc590f13	Merge "sched: Fix compile error"	2014-09-07 08:21:53 -07:00
Srivatsa Vaddagiri	594ce07f48	sched: Fix compile error sched_get_busy(), sched_set_io_is_busy() and sched_set_window() need to be defined only when CONFIG_SCHED_FREQ_INPUT is defined, otherwise we get compilation error related to dual definition of those routines Change-Id: Ifd5c9b6675b78d04c2f7ef0e24efeae70f7ce19b Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-09-04 12:14:38 +05:30
Syed Rameez Mustafa	e4600ab9eb	sched: update ld_moved for active balance from the load balancer ld_moved is currently left set to 0 when the load balancer calls upon active balance. This behavior is incorrect as it prevents the termination of load balance for parent sched domains. Currently the feature is used quite frequently for power active balance and sched boost. This means that while sched boost is in effect we could run into a scenario where a more power efficient newly idle big CPU first triggers active migration from a less power efficient busy big CPU. It then continues to load balance at the cluster level causing active migration for a task running on a little CPU. Consequently the more power efficient big CPU ends up with two tasks whereas the less power efficient big CPU may become idle. Fix this problem by updating ld_moved when active migration has been requested. Change-Id: I52e84eafb77249fd9378ebe531abe2d694178537 Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2014-09-03 20:01:44 -07:00
Syed Rameez Mustafa	5f5ecf01d3	sched: actively migrate tasks to idle big CPUs during sched boost The sched boost feature is currently tick driven, i.e. task placement decisions only take place at a tick (or wakeup). The load balancer does not have any knowledge of boost being in effect. Tasks that are woken up on a little CPU when all big CPUs are busy will continue executing there at least until the next tick even if one of the big CPUs becomes idle. Reduce this latency by adding support for detecting whether boost is in effect or not in the load balancer. If boost is in effect any big CPU running idle balance will trigger active migration from a little CPU with the highest task load. Change-Id: Ib2828809efa0f9857f5009b29931f63b276a59f3 Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2014-09-03 19:42:15 -07:00
Syed Rameez Mustafa	d3990aabb5	sched: always do idle balance with a NEWLY_IDLE idle environment With the introduction of energy aware scheduling, if idle_balance() is to be called on behalf of a different CPU which is idle, CPU_IDLE is used in the environment for load_balance(). This, however, introduces subtle differences in load calculations and policies in the load balancer. For example there are restrictions on which CPU is permitted to do load balancing during !CPU_NEWLY_IDLE (see update_sg_lb_stats) and find_busiest_group() uses different criteria to detect the presence of a busy group. There are other differences as well. Revert back to using the NEWLY_IDLE environment irrespective of whether idle_balance() is called for the newly idle CPU or on behalf of an already existing idle CPU. This will ensure that task movement logic while doing idle balance remains unaffected. Change-Id: I388b0ad9a38ca550667895c8ed19628f3d25ce1a Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2014-09-03 19:23:41 -07:00
Syed Rameez Mustafa	9c37494817	sched: fix bail condition in bail_inter_cluster_balance() Following commit efcad25cbfb (revert "sched: influence cpu_power based on max_freq and efficiency"), all CPUs in the system have the same cpu_power and consequently the same group capacity. Therefore, the check in bail_inter_cluster_balance() can now no longer be used to distinguish a higher performance cluster from one with lower performance. The check is currently broken and always returns true for every load balancing attempt. Fix this by using runqueue capacity instead which can still be used as a good measure of cluster capabilities. Also the logic for distinguishing between idle environments and using a different sched group capacity in update_sd_pick_busiest() is redundant. sgs->group_capacity would now always be equal to the number of CPUs in the group. Use sgs->group_capacity directly in conditional checks in that function. Change-Id: Idecfd1ed221d27d4324b20539e5224a92bf8b751 Signed-off-by: Steve Muckle <smuckle@codeaurora.org> Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2014-09-03 19:23:40 -07:00
Srivatsa Vaddagiri	1b36dc118d	sched: Initialize env->loop variable to 0 load_balance() function does not explicitly initialize env->loop variable to 0. As a result, there is a vague possibility of move_tasks() hitting a very long (unnecessary) loop when it's unable to move tasks from src_cpu. This can lead to unpleasant results like a watchdog bark. Fix this by explicitly initializing env->loop variable to 0 (in both load_balance() and active_load_balance_cpu_stop()). Change-Id: I36b84c91a9753870fa16ef9c9339db7b706527be Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-08-25 16:07:57 +05:30
Linux Build Service Account	a4e6dcf42b	Merge "sched: window-stats: use policy_mutex in sched_set_window()"	2014-08-24 20:01:43 -07:00
Linux Build Service Account	355e55afc6	Merge "sched: window-stats: Avoid taking all cpu's rq->lock for long"	2014-08-24 20:01:43 -07:00
Linux Build Service Account	d7bca8f374	Merge "sched: window_stats: Add "disable" mode support"	2014-08-24 20:01:41 -07:00
Linux Build Service Account	bf7b729348	Merge "sched: window-stats: Fix exit race"	2014-08-24 20:01:41 -07:00
Linux Build Service Account	e621b2a191	Merge "sched: window-stats: code cleanup"	2014-08-24 20:01:40 -07:00
Linux Build Service Account	5b1594145e	Merge "sched: window-stats: legacy mode"	2014-08-24 20:01:39 -07:00
Linux Build Service Account	755da8b25c	Merge "sched: window-stats: Code cleanup"	2014-08-24 20:01:38 -07:00
Linux Build Service Account	273f377789	Merge "sched: window-stats: Code cleanup"	2014-08-24 20:01:37 -07:00
Linux Build Service Account	0e3780151f	Merge "sched: window-stats: Code cleanup"	2014-08-24 20:01:37 -07:00
Linux Build Service Account	7b3f011d4e	Merge "sched: window-stats: Remove unused prev_window variable"	2014-08-24 20:01:36 -07:00
Srivatsa Vaddagiri	4f93bebd20	sched: window-stats: use policy_mutex in sched_set_window() Several configuration variable changes will result in reset_all_window_stats() being called. All of them, except sched_set_window(), are serialized via policy_mutex. Take policy_mutex in sched_set_window() as well to serialize use of reset_all_window_stats() function Change-Id: Iada7ff8ac85caa1517e2adcf6394c5b050e3968a Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-08-22 14:45:17 -07:00
Srivatsa Vaddagiri	e4ff6c07c5	sched: window-stats: Avoid taking all cpu's rq->lock for long reset_all_window_stats() walks task-list with all cpu's rq->lock held, which can cause spinlock timeouts if task-list is huge (and hence lead to a spinlock bug report). Avoid this by walking task-list without cpu's rq->lock held. Change-Id: Id09afd8b730fa32c76cd3bff5da7c0cd7aeb8dfb Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-08-22 14:45:16 -07:00
Srivatsa Vaddagiri	b432b691fa	sched: window_stats: Add "disable" mode support "disabled" mode (sched_disble_window_stats = 1) disables all window-stats related activity. This is useful when changing key configuration variables associated with window-stats feature (like policy or window size). Change-Id: I9e55c9eb7f7e3b1b646079c3aa338db6259a9cfe Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-08-22 14:45:15 -07:00
Srivatsa Vaddagiri	e81a1d6ece	sched: window-stats: Fix exit race Exiting tasks are removed from tasklist and hence at some point will become invisible to do_each_thread/for_each_thread task iterators. This breaks the functionality of reset_all_windows_stats() which has to reset stats for all tasks. This patch causes exiting tasks stats to be reset before they are removed from tasklist. DONT_ACCOUNT bit in exiting task's ravg.flags is also marked so that their remaining execution time is not accounted in cpu busy time counters (rq->curr/prev_runnable_sum). reset_all_windows_stats() is thus guaranteed to return with all task's stats reset to 0. Change-Id: I5f101156a4f958c1b3f31eb0db8cd06e621b75e9 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-08-22 14:45:11 -07:00
Srivatsa Vaddagiri	f90ea88fa6	sched: window-stats: code cleanup Provide a wrapper function to reset task's window statistics. This will be reused by a subsequent patch Change-Id: Ied7d32325854088c91285d8fee55d5a5e8a954b3 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-08-22 14:43:50 -07:00
Srivatsa Vaddagiri	85ed6be992	sched: window-stats: legacy mode Support legacy mode, which results in busy time being seen by governor that is close to what it would have seen via existing APIs i.e get_cpu_idle_time_us(), get_cpu_iowait_time_us() and get_cpu_idle_time_jiffy(). In particular, legacy mode means that only task execution time is counted in rq->curr_runnable_sum and rq->prev_runnable_sum. Also task migration does not result in adjustment of those counters. Change-Id: If374ccc084aa73f77374b6b3ab4cd0a4ca7b8c90 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-08-22 14:43:14 -07:00
Srivatsa Vaddagiri	da60007442	sched: window-stats: Code cleanup Collapse duplicated comments about keeping few of sysctl knobs initialized to same value as their non-sysctl copies Change-Id: Idc8261d86b9f36e5f2f2ab845213bae268ae9028 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-08-22 14:43:13 -07:00
Srivatsa Vaddagiri	dafe791457	sched: window-stats: Code cleanup Remove code duplication associated with update of various window-stats related sysctl tunables Change-Id: I64e29ac065172464ba371a03758937999c42a71f Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-08-22 14:43:12 -07:00
Srivatsa Vaddagiri	25d5c94d24	sched: window-stats: Code cleanup add_task_demand() and 'long_sleep' calculation in it are not strictly required. rq_freq_margin() check for need to change frequency, which removes need for long_sleep calculation. Once that is removed, need for add_task_demand() vanishes. Change-Id: I936540c06072eb8238fc18754aba88789ee3c9f5 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-08-22 14:43:12 -07:00
Srivatsa Vaddagiri	e1ea811d7a	sched: window-stats: Remove unused prev_window variable Remove unused prev_window variable in 'struct ravg' Change-Id: I22ec040bae6fa5810f9f8771aa1cb873a2183746 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-08-22 14:43:11 -07:00
Ian Maund	6440f462f9	Merge upstream tag 'v3.10.49' into msm-3.10 * commit 'v3.10.49': (529 commits) Linux 3.10.49 ACPI / battery: Retry to get battery information if failed during probing x86, ioremap: Speed up check for RAM pages Score: Modify the Makefile of Score, remove -mlong-calls for compiling Score: The commit is for compiling successfully. Score: Implement the function csum_ipv6_magic score: normalize global variables exported by vmlinux.lds rtmutex: Plug slow unlock race rtmutex: Handle deadlock detection smarter rtmutex: Detect changes in the pi lock chain rtmutex: Fix deadlock detector for real ring-buffer: Check if buffer exists before polling drm/radeon: stop poisoning the GART TLB drm/radeon: fix typo in golden register setup on evergreen ext4: disable synchronous transaction batching if max_batch_time==0 ext4: clarify error count warning messages ext4: fix unjournalled bg descriptor while initializing inode bitmap dm io: fix a race condition in the wake up code for sync_io Drivers: hv: vmbus: Fix a bug in the channel callback dispatch code clk: spear3xx: Use proper control register offset ... In addition to bringing in upstream commits, this merge also makes minor changes to maintain compatibility with upstream: The definition of list_next_entry in qcrypto.c and ipa_dp.c has been removed, as upstream has moved the definition to list.h. The implementation of list_next_entry was identical between the two. irq.c, for both arm and arm64 architecture, has had its calls to __irq_set_affinity_locked updated to reflect changes to the API upstream. Finally, as we have removed the sleep_length member variable of the tick_sched struct, all changes made by upstream commit `ec804bd` do not apply to our tree and have been removed from this merge. Only kernel/time/tick-sched.c is impacted. Change-Id: I63b7e0c1354812921c94804e1f3b33d1ad6ee3f1 Signed-off-by: Ian Maund <imaund@codeaurora.org>	2014-08-20 13:23:09 -07:00
Peter Zijlstra	c5ac12693f	arch: Mass conversion of smp_mb__() Mostly scripted conversion of the smp_mb__ barriers. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Link: http://lkml.kernel.org/n/tip-55dhyhocezdw1dg7u19hmh1u@git.kernel.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linux-arch@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git Git-commit: 4e857c58efeb99393cba5a5d0d8ec7117183137c [joonwoop@codeaurora.org: fixed trivial merge conflict.] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2014-08-15 11:45:28 -07:00
Peter Zijlstra	7d9e69c77f	arch: Prepare for smp_mb__{before,after}_atomic() Since the smp_mb__{before,after}*() ops are fundamentally dependent on how an arch can implement atomics it doesn't make sense to have 3 variants of them. They must all be the same. Furthermore, the 3 variants suggest they're only valid for those 3 atomic ops, while we have many more where they could be applied. So move away from smp_mb__{before,after}_{atomic,clear}_{dec,inc,bit}() and reduce the interface to just the two: smp_mb__{before,after}_atomic(). This patch prepares the way by introducing default implementations in asm-generic/barrier.h that default to a full barrier and providing __deprecated inlines for the previous 6 barriers if they're not provided by the arch. This should allow for a mostly painless transition (lots of deprecated warns in the interim). Signed-off-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Link: http://lkml.kernel.org/n/tip-wr59327qdyi9mbzn6x937s4e@git.kernel.org Cc: Arnd Bergmann <arnd@arndb.de> Cc: "Chen, Gong" <gong.chen@linux.intel.com> Cc: John Sullivan <jsrhbz@kanargh.force9.co.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mauro Carvalho Chehab <m.chehab@samsung.com> Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: linux-arch@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git Git-commit: febdbfe8a91ce0d11939d4940b592eb0dba8d663 [joonwoop@codeaurora.org: fixed trivial merge conflict.] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2014-08-15 11:45:27 -07:00
Steve Muckle	0a0adbb0b1	sched: disable frequency notifications by default The frequency notifications from the scheduler do not currently respect synchronous topologies. If demand on CPU 0 is driving frequency high and CPU 1 is in the same frequency domain, and demand on CPU 1 is low, frequency notifiers will be continuously sent by CPU 1 in an attempt to have its frequency lowered. Until the notifiers are fixed, disable them by default. They can still be re-enabled at runtime. Change-Id: Ic8a927af2236d8fe83b4f4a633b20a8ddcfba359 Signed-off-by: Steve Muckle <smuckle@codeaurora.org>	2014-08-12 11:15:34 -07:00
Steve Muckle	7ebd479ae3	sched: fix misalignment between requested and actual windows When set_window_start() is first executed sched_clock() has not yet stabilized. Refresh the sched_init_jiffy and sched_clock_at_init_jiffy values until it is known that sched_clock has stabilized - this will be the case by the time a client calls the sched_set_window() API. Change-Id: Icd057707ff44c3b240e5e7e96891b23c95733daa Signed-off-by: Steve Muckle <smuckle@codeaurora.org>	2014-08-12 11:15:33 -07:00
Syed Rameez Mustafa	efcad24cbf	Revert "sched: Influence cpu_power based on max_freq and efficiency" This reverts commit `0951ec0ff1` ("sched: Influence cpu_power based on max_freq and efficiency") to let all cpus be seen at same 'cpu_power' from load balance perspective. Without this revert, some cpus will be seen to have more 'cpu_power' than others, causing tasks to incur wait-time despite availability of idle cpus. This happens because a cpu with low 'cpu_power' can fail to see imbalance with another cpu having higher 'cpu_power' and thus can go idle without pulling any work. Change-Id: Iccb34319c527d5b45f29c2d12d2ebc7acdd9d07e Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2014-08-12 11:15:33 -07:00
Olav Haugan	df91ad278c	sched: Make RAVG_HIST_SIZE tunable Make RAVG_HIST_SIZE available from /proc/sys/kernel/sched_ravg_hist_size to allow tuning of the size of the history that is used in computation of task demand. CRs-fixed: 706138 Change-Id: Id54c1e4b6e974a62d787070a0af1b4e8ce3b4be6 Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>	2014-08-12 11:15:20 -07:00
Srivatsa Vaddagiri	7b76c244c2	sched: Fix possibility of "stuck" reserved flag check_for_migration() could mark a thread for migration (in rq->push_task) and invoke active_load_balance_cpu_stop(). However that thread could get migrated to another cpu by the time active_load_balance_cpu_stop() runs, which could fail to clear reserved flag for a cpu and drop task_struct reference when cpu has only one task (stopper thread running active_load_balance_cpu_stop()). This would cause a cpu to have reserved bit stuck, which prevents it from being used effectively. Fix this by having active_load_balance_cpu_stop() drop reserved bit always. Change-Id: I2464a46b4ddb52376a95518bcc95dd9768e891f9 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-08-12 10:51:31 -07:00
Srivatsa Vaddagiri	0699a566d3	sched: initialize env->flags variable to 0 env->flags and env->new_dst_cpu fields are not initialized in load_balance() function. As a result, load_balance() could wrongly see LBF_SOME_PINNED flag set and access (bogus) new_dst_cpu's runqueue leading to invalid memory reference. Fix this by initializing env->flags field to 0. While we are at it, fix similar issue in active_load_balance_cpu_stop() function, although there is no harm present currently in that function with uninitialized env->flags variable. Change-Id: Ied470b0abd65bf2ecfa33fa991ba554a5393f649 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-08-12 10:51:31 -07:00
Srivatsa Vaddagiri	098d8371ad	sched: window-stats: 64-bit type for curr/prev_runnable_sum Expand rq->curr_runnable_sum and rq->prev_runnable_sum to be 64-bit counters as otherwise they can easily overflow when a cpu has many tasks. Change-Id: I68ab2658ac6a3174ddb395888ecd6bf70ca70473 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-08-12 10:51:31 -07:00
Srivatsa Vaddagiri	5e8f14fbbc	sched: window-stats: Allow acct_wait_time to be tuned Add sysctl interface to tune sched_acct_wait_time variable at runtime Change-Id: I38339cdb388a507019e429709a7c28e80b5b3585 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-08-12 10:51:30 -07:00
Srivatsa Vaddagiri	4da7e167b3	sched: window-stats: Account interrupt handling time as busy time Account cycles spent by idle cpu handling interrupts (irq or softirq) towards its busy time. Change-Id: I84cc084ced67502e1cfa7037594f29ed2305b2b1 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-08-12 10:51:30 -07:00
Srivatsa Vaddagiri	a3c1ecd80a	sched: window-stats: Account idle time as busy time Provide a knob to consider idle time as busy time, when cpu becomes idle as a result of io_schedule() call. This will let governor parameter 'io_is_busy' to be appropriately honored. Change-Id: Id9fb4fe448e8e4909696aa8a3be5a165ad7529d3 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-08-12 10:51:30 -07:00
Srivatsa Vaddagiri	c55cc8b64a	sched: window-stats: Account wait time Extend window-based task load accounting mechanism to include wait-time as part of task demand. A subsequent patch will make this feature configurable at runtime. Change-Id: I8e79337c30a19921d5c5527a79ac0133b385f8a9 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-08-12 10:51:29 -07:00
Srivatsa Vaddagiri	dca58d0666	sched: window-stats: update task demand on tick A task can execute on a cpu for a long time without being preempted or migrated. In such case, its demand can become outdated for a long time. Prevent that from happening by updating demand of currently running task during scheduler tick. Change-Id: I321917b4590635c0a612560e3a1baf1e6921e792 CRs-Fixed: 698662 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-08-12 10:51:29 -07:00
Srivatsa Vaddagiri	fa15bd9937	sched: Fix herding issue check_for_migration() could run concurrently on multiple cpus, resulting in multiple tasks wanting to migrate to same cpu. This could cause cpus to be underutilized and lead to increased scheduling latencies for tasks. Fix this by serializing select_best_cpu() calls from cpus running check_for_migration() check and marking selected cpus as reserved, so that subsequent call to select_best_cpu() from check_for_migration() will skip reserved cpus. Change-Id: I73a22cacab32dee3c14267a98b700f572aa3900c Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-08-12 10:51:29 -07:00
Srivatsa Vaddagiri	02cc889604	sched: window-stats: print window size in /proc/sched_debug Printing window size in /proc/sched_debug would provide useful information to debug scheduler issues. Change-Id: Ia12ab2cb544f41a61c8a1d87bf821b85a19e09fd Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-08-12 10:51:29 -07:00
Srivatsa Vaddagiri	1f12e6698c	sched: Extend ftrace event to record boost and reason code Add a new ftrace event to record changes to boost setting. Also extend sched_task_load() ftrace event to record boost setting and reason code passed to select_best_cpu(). This will be useful for debug purpose. Change-Id: Idac72f86d954472abe9f88a8db184343b7730287 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-08-12 10:51:28 -07:00
Srivatsa Vaddagiri	232b0fe6f4	sched: Avoid needless migration Restrict check_for_migration() to operate on fair_sched class tasks only. Also check_for_migration() can result in a call to select_best_cpu() to look for a better cpu for currently running task on a cpu. However select_best_cpu() can end up suggesting a cpu that is not necessarily better than the cpu on which task is running currently. This will result in unnecessary migration. Prevent that from happening. Change-Id: I391cdda0d7285671d5f79aa2da12eaaa6cae42d7 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-08-12 10:51:28 -07:00
John Stultz	3984bb13c8	printk: rename printk_sched to printk_deferred commit aac74dc495456412c4130a1167ce4beb6c1f0b38 upstream. After learning we'll need some sort of deferred printk functionality in the timekeeping core, Peter suggested we rename the printk_sched function so it can be reused by needed subsystems. This only changes the function name. No logic changes. Signed-off-by: John Stultz <john.stultz@linaro.org> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Cc: Jan Kara <jack@suse.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Jiri Bohac <jbohac@suse.cz> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-08-07 14:30:26 -07:00
Srivatsa Vaddagiri	c6d6e960df	sched: Drop active balance request upon cpu going offline A cpu could mark its currently running task to be migrated to another cpu (via rq->push_task/rq->push_cpu) and could go offline before active load balance handles the request. In such case, clear the active load balance request. Change-Id: Ia3e668e34edbeb91d8559c1abb4cbffa25b1830b Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-08-06 15:36:59 +05:30
Srivatsa Vaddagiri	ea5020bcd2	sched: trigger immediate migration of tasks upon boost Currently turning on boost does not immediately trigger migration of tasks from lower capacity cpus. Tasks could incur migration latency of up to one timer tick (when check_for_migration() is run). Fix this by triggering a migration check on cpus with lower capacity as soon as boost is turned on for first time. Change-Id: I244649f9cb6608862d87631325967b887b7f4b7e Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-08-06 15:36:59 +05:30
Srivatsa Vaddagiri	6cd6b83b50	sched: Extend boost benefit for small and low-prio tasks Allow small and low-prio tasks to benefit from boost, which is expected to last for a short duration. Any task that wishes to run during that short period is allowed boost benefit. Change-Id: I02979a0c5feeba0f1256b7ee3d73f6b283fcfafa Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-08-06 15:36:59 +05:30
Srivatsa Vaddagiri	a57fe9b6df	sched: window-stats: Handle policy change properly sched_window_stat_policy influences task demand and thus various statistics maintained per-cpu like curr_runnable_sum. Changing policy non-atomically would lead to improper accounting. For example, when task is enqueued on a cpu's runqueue, its demand that is added to rq->cumulative_runnable_avg could be based on AVG policy and when it's dequeued its demand that is removed can be based on MAX, leading to erroneous accounting. This change causes policy change to be "atomic" i.e. all cpu's rq->lock are held and all task's window-stats are reset before policy is changed. Change-Id: I6a3e4fb7bc299dfc5c367693b5717a1ef518c32d CRs-Fixed: 687409 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-08-06 15:36:59 +05:30
Srivatsa Vaddagiri	7b5f42a8e1	sched: window-stats: Reset all window stats Currently, few of the window statistics for tasks are not reset when window size is changing. Fix this to completely reset all window statistics for tasks and cpus. Move the reset code to a function, which can be reused by a subsequent patch that resets same statistics upon policy change. Change-Id: Ic626260245b89007c4d70b9a07ebd577e217f283 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-08-06 15:36:58 +05:30
Srivatsa Vaddagiri	578851abaa	sched: window-stats: Additional error checking in sched_set_window() Check for invalid window size passed as argument to sched_set_window(). Also move up local_irq_disable() call to avoid thread from being preempted during calculation of window_start and its comparison against sched_clock(). Use right macro to evaluate whether window_start argument is ahead in time or not. Change-Id: Idc0d3ab17ede08471ae63b72a2d55e7f84868fd6 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-08-06 15:36:58 +05:30
Srivatsa Vaddagiri	f2a21ce199	sched: window-stats: Fix incorrect calculation of partial_demand When using MAX_POLICY, partial_demand is calculated incorrectly as 0. Fix this by picking maximum of previous 4 windows and most recent sample. Change-Id: I27850a510746a63b5382c84761920fc021b876c5 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-07-28 12:51:07 -07:00
Srivatsa Vaddagiri	5c351b809f	sched: window-stats: Fix potential wrong use of rq 'rq' reference to a cpu where a waking task last ran can be potentially incorrect leading to incorrect accounting. This happens when task_cpu() changes between points A & B in try_to_wake_up() listed below: try_to_wake_up() { cpu = src_cpu = task_cpu(p); rq = cpu_rq(src_cpu); -> Point A .. while (p->on_cpu) cpu_relax(); smp_rmb(); raw_spin_lock(&rq->lock); -> Point B Fix this by initializing 'rq' variable after task has slept (its on_cpu field becomes 0). Also avoid adding task demand to its old cpu runqueue (prev_runnable_sum) in case it's gone offline. Change-Id: I9e5d3beeca01796d944137b5416805b983a6e06e Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-07-28 12:51:07 -07:00
Mateusz Guzik	4aba6e3634	sched: Fix possible divide by zero in avg_atom() calculation commit b0ab99e7736af88b8ac1b7ae50ea287fffa2badc upstream. proc_sched_show_task() does: if (nr_switches) do_div(avg_atom, nr_switches); nr_switches is unsigned long and do_div truncates it to 32 bits, which means it can test non-zero on e.g. x86-64 and be truncated to zero for division. Fix the problem by using div64_ul() instead. As a side effect calculations of avg_atom for big nr_switches are now correct. Signed-off-by: Mateusz Guzik <mguzik@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1402750809-31991-1-git-send-email-mguzik@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-07-28 08:00:07 -07:00
Linux Build Service Account	0c445a72ec	Merge "sched: set initial task load to just above a small task"	2014-07-26 12:41:28 -07:00
Steve Muckle	1eaec37bfd	sched: set initial task load to just above a small task To maximize power savings, set the initial load of newly created tasks to just above a small task. Setting it below the small task threshold would cause new tasks to be packed which is very likely too aggressive. Change-Id: Idace26cc0252e31a5472c73534d2f5277a1e3fa4 Signed-off-by: Steve Muckle <smuckle@codeaurora.org>	2014-07-25 10:55:56 -07:00
Olav Haugan	47c59a6b72	sched/fair: Check whether any CPUs are available There is a possibility that there are no allowed CPUs online when we try to select the best cpu for a small task. Add a check to ensure we don't continue if there are no CPUs available. CRs-fixed: 692505 Change-Id: Iff955fb0d0b07e758a893539f7bc8ea8aa09d9c4 Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>	2014-07-25 08:29:24 -07:00
Steve Muckle	1aa9b6992a	sched: fixes for compilation without CONFIG_SCHED_HMP These fixes are necessary to compile without CONFIG_SCHED_HMP enabled. Change-Id: Iabbde3c22a81288242ed3a44fdfdb2a16db8b072 Signed-off-by: Steve Muckle <smuckle@codeaurora.org>	2014-07-22 16:08:02 -07:00
Steve Muckle	483fa0ade3	sched: enable hmp, power aware scheduling for targets with > 4 CPUs Enabling and disabling hmp/power-aware scheduling is meant to be done via kernel command line options. Until that is fully supported however, take advantage of the fact that current targets with more than 4 CPUs will need these features. Change-Id: I4916805881d58eeb54747e4b972816ffc96caae7 Signed-off-by: Steve Muckle <smuckle@codeaurora.org>	2014-07-22 16:08:01 -07:00
Srivatsa Vaddagiri	fefafa08b7	sched: remove sysctl control for HMP and power-aware task placement There is no real need to control HMP and power-aware task placement at runtime after kernel has booted. Boot-time control should be sufficient. Not allowing for runtime (sysctl) support simplifies the code quite a bit. Also rename sysctl_sched_enable_hmp_task_placement to be shorter. Change-Id: I60cae51a173c6f73b79cbf90c50ddd41a27604aa Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-07-22 16:07:58 -07:00
Srivatsa Vaddagiri	87df0beb43	sched: support legacy mode better It should be possible to bypass all HMP scheduler changes at runtime by setting sysctl_sched_enable_hmp_task_placement and sysctl_sched_enable_power_aware to 0. Fix various code paths to honor this requirement. Change-Id: I74254e68582b3f9f1b84661baf7dae14f981c025 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-07-22 16:07:05 -07:00
Srivatsa Vaddagiri	1ef9206d1c	sched: code cleanup Avoid the long if() block of code in set_task_cpu(). Move that code to its own function Change-Id: Ia80a99867ff9c23a614635e366777759abaccee4 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-07-22 16:05:48 -07:00
Srivatsa Vaddagiri	cf0d1f54be	sched: Add BUG_ON when task_cpu() is incorrect It would be fatal if task_cpu() information for a task does not accurately represent the cpu on which it's running. All sorts of weird issues can arise if that were to happen! Add a BUG_ON() in context switch to detect such cases. Change-Id: I4eb2c96c850e2247e22f773bbb6eedb8ccafa49c Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-07-22 14:23:05 -07:00
Srivatsa Vaddagiri	cea49887c4	sched: avoid active migration of tasks not in TASK_RUNNING state Avoid wasting effort in migrating tasks that are about to sleep. Change-Id: Icf9520b1c8fa48d3e071cb9fa1c5526b3b36ff16 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-07-22 14:23:05 -07:00
Srivatsa Vaddagiri	955b16f3ce	sched: fix up task load during migration Fix the hack to set task's on_rq to 0 during task migration. Task's load is temporarily added back to its runqueue so that update_task_ravg() can fixup task's load when its demand is changing. Task's load is removed immediately afterwards. Temporarily setting p->on_rq to 0 introduces a race condition with try_to_wake_up(). Another task (task A) may be attempting to wake up the migrating task (task B). As long as task A sees task B's p->on_rq as 1, the wake up will not continue. Changing p->on_rq to 0, then back to 1, allows task A to continue "waking" task B, at which point we have both try_to_wake_up and the migration code attempting to set the cpu of task B at the same time. CRs-Fixed: 695071 Change-Id: I525745f144da4ffeba1d539890b4d46720ec3ef1 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-07-22 14:23:05 -07:00
Prasad Sodagudi	6075c8a6f7	sched: avoid pushing tasks to an offline CPU Currently active_load_balance_cpu_stop is run by cpu stopper and it pushes running tasks off the busiest CPU onto idle target CPU. But there is no check to see whether target cpu is offline or not before pushing the tasks. With the introduction of active migration in the scheduler tick path (see check_for_migration()) there have been instances of attempts to migrate tasks to offline CPUs. Add a check as to whether the target cpu is online or not to prevent scheduling on offline CPUs. Change-Id: Ib8ac7f8aeabd3ca7365f3eae977075952dab4f21 Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>	2014-07-22 14:23:04 -07:00
Syed Rameez Mustafa	8f7e5b8ee8	sched: Add a per rq max_possible_capacity for use in power calculations In the absence of a power driver providing real power values, the scheduler currently defaults to using capacity of a CPU as a measure of power. This, however, is not a good measure since the capacity of a CPU can change due to thermal conditions and/or other hardware restrictions. These frequency restrictions have no effect on the power efficiency of those CPUs. Introduce max possible capacity of a CPU to track an absolute measure of capacity which translates into a good absolute measure of power efficiency. Max possible capacity takes the max possible frequency of CPUs into account instead of max frequency. Change-Id: Ia970b853e43a90eb8cc6fd990b5c47fca7e50db8 Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2014-07-22 14:23:04 -07:00
Syed Rameez Mustafa	16fa06671f	sched: Disable interrupts when holding the rq lock in sched_get_busy() Interrupts can end up waking processes on the same cpu as the one for which sched_get_busy() is called. Since sched_get_busy() takes the rq lock this can result in a deadlock as the same rq lock is required to enqueue the waking up task. Fix the deadlock by disabling interrupts when taking the rq lock. Change-Id: I46e14a14789c2fb0ead42363fbaaa0a303a5818f Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2014-07-22 14:23:03 -07:00
Srivatsa Vaddagiri	9361844015	sched: Make wallclock more accurate update_task_ravg() in context switch uses wallclock that is updated before running put_prev_task() and pick_next_task(), both of which can take some time. It's better to update wallclock after those routines, leading to more accurate accounting. Change-Id: I882b1f0e8eddd2cc17d42ca2ab8f7a2841b8d89a Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-07-22 14:23:03 -07:00
Syed Rameez Mustafa	cd0c28ddcc	sched: Resolve some 64 bit compilation issues On 64 bit architectures a pointer is no longer the same size as an int. Therefore any place that does a conversion from int to a pointer type gives a compilation error. Resolve these by type casting to long first which is guaranteed to be the same size as a pointer. Change-Id: I518ac3c562bd3f85893f91ad6dbcd2f0c7bf081b Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2014-07-22 14:23:03 -07:00
Syed Rameez Mustafa	3d972d3af1	sched: Make task and CPU load calculations safe from truncation Load calculations have been modified to accept and return 64 bit values. Fix up all the places where we make such calculations to store the result in 64 bit variables. This is necessary to avoid issues caused by truncation of values. While at it update scale_task_load() to scale_load_to_cpu(). This is because the API is used to scale load of both individual tasks as well as the cumulative load of CPUs. In this sense the name was a misnomer. Also clean up power_cost() to use max_task_load(). Change-Id: I51e683e1592a5ea3c4e4b2b06d7a7339a49cce9c Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2014-07-22 14:23:03 -07:00
Syed Rameez Mustafa	8eebaa1826	sched/fair: Introduce C-state aware task placement for small tasks Small tasks execute for small durations. This means that the power cost of taking CPUs out of a low power mode outweighs any performance advantage of using an idle core or power advantage of using the most power efficient CPU. Introduce C-state aware task placement for small tasks. This requires a two pass approach where we first determine the most power efficient CPU and establish a band of CPUs offering a similar power cost for the task. The order of preference then is as follows: 1) Any mostly idle CPU in active C-state in the same power band. 2) A CPU with the shallowest C-state in the same power band. 3) A CPU with the least load in the same power band. 4) Lowest power CPU in a higher power band. The patch also modifies the definition of a small task. Small tasks are now determined relative to minimum capacity CPUs in the system and not the task CPU. Change-Id: Ia09840a5972881cad7ba7bea8fe34c45f909725e Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2014-07-22 14:23:02 -07:00
Srivatsa Vaddagiri	34e72241a9	sched: Make the scheduler aware of C-state for cpus C-state represents a power-state of a cpu. A cpu could have one or more C-states associated with it. C-state transitions are based on various factors (expected sleep time for example). "Deeper" C-states implies longer wakeup latencies. Scheduler needs to know wakeup latency associated with various C-states. Having this information allows the scheduler to make better decisions during task placement. For example: - Prefer an idle cpu that is in the least shallow C-state - Avoid waking up small tasks on a idle cpu unless it is in the least shallow C-state This patch introduces APIs in the scheduler that can be used by the architecture specific power-management driver to inform the scheduler about C-states for cpus. Change-Id: I39c5ae6dbace4f8bd96e88f75cd2d72620436dd1 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2014-07-22 14:23:02 -07:00
Syed Rameez Mustafa	65eab4a6f5	sched/fair: Introduce scheduler boost for low latency workloads Certain low latency bursty workloads require immediate use of highest capacity CPUs in HMP systems. Existing load tracking mechanisms may be unable to respond to the sudden surge in the system load within the latency requirements. Introduce the scheduler boost feature for such workloads. While boost is in effect the scheduler bypasses regular load based task placement and prefers highest capacity CPUs in the system for all non-small fair sched class tasks. Provide both a kernel and userspace API for software that may have a priori knowledge about the system workload. Change-Id: I783f585d1f8c97219e629d9c54f712318821922f Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2014-07-22 14:23:02 -07:00
Srivatsa Vaddagiri	a6fa50d177	sched: Move call to trace_sched_cpu_load() select_best_cpu() invokes trace_sched_cpu_load() for all online cpus in a loop, before it enters the loop for core selection. Moving invocation of trace_sched_cpu_load() in inner core loop is potentially more efficient. Change-Id: Iae1c58b26632edf9ec5f5da905c31356eb95c925 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2014-07-22 14:23:01 -07:00
Srivatsa Vaddagiri	73bf6e4e2d	sched: fair: Reset balance_interval before sending NOHZ kick balance_interval needs to be reset for any cpu being kicked. Otherwise it can end up ignoring the kick (i.e. not doing load balance for itself). Also bypass the check for existence of idle cpus in tickless state for !CONFIG_SCHED_HMP to allow for more aggressive load balance. Change-Id: I52365ee7c2997ec09bd93c4e9ae0293a954e39a8 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2014-07-22 14:23:01 -07:00
Srivatsa Vaddagiri	5b97613e54	sched: Avoid active migration of small tasks We currently check the need to migrate the currently running task in scheduler_tick(). Skip that check for small tasks, as it's not worth the effort! Change-Id: Ic205cc6452f42fde6be6b85c3bf06a8542a73eba Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2014-07-22 14:23:01 -07:00
Srivatsa Vaddagiri	e87706488c	sched: Account for cpu's current frequency when calculating its power cost In estimating cost of running a task on a given cpu, cost of cpu at its current frequency needs to over-ride cost at frequency demanded by task, where cur_freq exceeds required frequency of task. This is because placing a task on a cpu can only result in an increase of cpu's frequency. Change-Id: I021a3bbaf179bf1ec2c7f4556870256936797eb9 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2014-07-22 14:23:01 -07:00
Srivatsa Vaddagiri	8e4ffa0d07	sched: make sched_set_window() return failure when PELT is in use Window-based load tracking is a pre-requisite for the scheduler to feed cpu load information to the governor. When PELT is in use, return failure when governor attempts to set window-size. This will let governor fall back to other APIs for retrieving cpu load statistics. Change-Id: I0e11188594c1a54b3b7ff55447d30bfed1a01115 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2014-07-22 14:23:01 -07:00
Srivatsa Vaddagiri	7d8d1bd095	sched: debug: Print additional information in /proc/sched_debug Provide information in /proc/sched_debug on min_capacity, max_capacity and whether pelt or window-based task load statistics is in use. Change-Id: Ie4e9450652f4c83110dda75be3ead8aa5bb355c3 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-07-22 14:23:00 -07:00
Srivatsa Vaddagiri	df5fd251ed	sched: Move around code Move up chunk of code to be defined early. This helps a subsequent patch that needs update_min_max_capacity() Change-Id: I9403c7b4dcc74ba4ef1034327241c81df97b01ea Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2014-07-22 14:23:00 -07:00
Srivatsa Vaddagiri	af9a2812eb	sched: Update capacity of all online cpus when min_max_freq changes During bootup, it's possible for min_max_freq to change as frequency information for additional clusters is processed. That would need to trigger recalculation of capacity/load_scale_factor for all (online) cpus, as they strongly depend on min_max_freq variable. Not doing so would imply some cpus will have their capacity/load_scale_factor computed wrongly. Change-Id: Iea5a0a517a2d71be24c2c71cdd805c0733ce37f8 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-07-22 14:23:00 -07:00
Srivatsa Vaddagiri	2baa89cd4c	sched: update task statistics when CPU frequency changes A CPU may have its frequency changed by a different CPU. Because of this, it is not guaranteed that we will update task statistics at approximately the same time that the frequency change occurs. To guard against accruing time to a task at the wrong frequency, update the task's window-based statistics if the CPU it is running on changes frequency. Change-Id: I333c3f8aa82676bd2831797b55fd7af9c4225555 Signed-off-by: Steve Muckle <smuckle@codeaurora.org> Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2014-07-22 14:23:00 -07:00
Srivatsa Vaddagiri	11262c23d6	sched: Add new trace events Add trace events for update_task_ravg(), update_history(), and set_task_cpu(). These tracepoints are useful for monitoring the per-task and per-runqueue demand statistics. Change-Id: Ibec9f945074ff31d1fc1a76ae37c40c8fea8cda9 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-07-22 14:22:59 -07:00
Steve Muckle	fd3c9f6c53	sched: do not balance on exec if SCHED_HMP Rebalancing at exec time will currently undo any beneficial placement that has been done during fork time, since select_best_cpu() will not discount the currently running task. For now just skip re-evaluating task placement at exec. Change-Id: I1e5e0fcc329b7b53c338c8c73795ebd5e85a118b Signed-off-by: Steve Muckle <smuckle@codeaurora.org>	2014-07-22 14:22:59 -07:00
Srivatsa Vaddagiri	00faa770e7	sched: Use historical load for freq governor input Historical load maintained per task can be used to influence cpu frequency better. For example, when a heavy demand task wakes up after prolonged sleep, we could use the historical load information to alert cpufreq governor about the need to raise cpu frequency. This patch changes CPU busy statistics to be aggregation of historical task demand. Also task's historical load (as defined by sysctl_sched_window_stats_policy) is added to cpu's busy statistics (rq->curr_runnable_sum) whenever it executes on a cpu. Change-Id: I2b66136f138b147ba19083b9b044c4feb20d9b57 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-07-22 14:22:59 -07:00
Srivatsa Vaddagiri	0352c87d18	sched: window-stats: apply scaling to full elapsed windows In the event that a full window (or multiple full windows) have elapsed when updating a task's window-based stats, the runtime of those windows needs to be scaled based on the CPU frequency. This is currently missing, causing full windows to be accounted as having elapsed at maximum frequency, erroneously inflating task demand. Change-Id: I356b4279d44d4f39c8aea881c04327b70ed66183 Signed-off-by: Steve Muckle <smuckle@codeaurora.org> Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2014-07-22 14:22:58 -07:00
Steve Muckle	80753c7e5e	sched: notify cpufreq on over/underprovisioned CPUs After a migration occurs the source and destination CPUs may not be running at frequencies which match the new task load on those CPUs. Previously, the scheduler was notifying cpufreq anytime a task greater than a certain size migrates. This is suboptimal however since this does not take into account the CPU's current frequency and other task activity that may be present. Change-Id: I5092bda3a517e1343f97e5a455957c25ee19b549 Signed-off-by: Steve Muckle <smuckle@codeaurora.org>	2014-07-22 14:22:58 -07:00
Syed Rameez Mustafa	dbd2db2471	sched: Introduce spill threshold tunables to manage overcommitment When the number of tasks intended for a cluster exceeds the number of mostly idle CPUs in that cluster, the scheduler currently freely uses CPUs in other clusters if possible. While this is optimal for performance the power trade off can be quite significant. Introduce spill threshold tunables that govern the extent to which the scheduler should attempt to contain tasks within a cluster. Change-Id: I797e6c6b2aa0c3a376dad93758abe1d587663624 Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2014-07-22 14:22:58 -07:00
Steve Muckle	dd76a2e00a	sched: add affinity, task load information to sched tracepoints Knowing the affinity mask and CPU usage of a task is helpful in understanding the behavior of the system. Affinity information has been added to the enq_deq trace event, and the migration tracepoint now reports the load of the task migrated. Change-Id: I29d8a610292b4dfeeb8fe16174e9d4dc196649b7 Signed-off-by: Steve Muckle <smuckle@codeaurora.org> Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2014-07-22 14:22:58 -07:00
Steve Muckle	bb3f4aae22	sched: add migration load change notifier for frequency guidance When a task moves between CPUs in two different frequency domains the cpufreq governor may wish to immediately modify the frequency of both the source and destination CPUs of the migrating task. A tunable is provided to establish what size task is considered "significant" enough to warrant notifying cpufreq. Also fix a bug that would cause load to not be accounted properly during wakeup migrations. Change-Id: Ie8f6b1cc4d43a602840dac18590b42a81327c95a Signed-off-by: Steve Muckle <smuckle@codeaurora.org>	2014-07-22 14:22:57 -07:00
Syed Rameez Mustafa	8968469989	sched/fair: Limit MAX_PINNED_INTERVAL for more frequent load balancing Should the system get stuck in a state where load balancing is failing due to all tasks being pinned, deferring load balancing for up to half a second may cause further performance problems. Eventually all tasks will not be pinned and load balancing should not be deferred for a great length of time. Change-Id: I06f93b5448353b5871645b9274ce4419dc9fae0f Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-07-22 14:22:57 -07:00
Syed Rameez Mustafa	6317c544d8	sched/fair: Help out higher capacity CPUs when they are overcommitted This comprises two parts: If we have a task to schedule, we currently don't consider CPUs where it will not fit even if they are idle. Instead we choose the previous CPU which is sub-optimal for performance if an idle CPU is present. This change introduces tracking of any idle CPUs irrespective of whether the task fits on them or not. If we don't have a good place to put the task, prefer the lowest power idle CPU. The other part involves the load balancer which was unable to move tasks despite the above mentioned task placement to balance out the load. The reason is that the load balancer checks the big cluster's group capacity and determines that it can take twice the amount of workload as the little cluster. Hence the big cluster does not get marked as busy. While this behavior is intended under heavily loaded systems where we want to push more work towards the higher capacity CPUs, it is sub-optimal when we have idle CPUs. Add the ability to differentiate between the two scenarios when marking a group as busy. If load_balance is called from a CPU_NOT_IDLE environment use the group capacity to determine whether the group is busy or not. For everything else use number of CPUs in the group. Change-Id: I4e8290639ad1602541a44a80ba4b2804068cac0f Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-07-22 14:22:57 -07:00
Syed Rameez Mustafa	5349ec0ee8	sched/rt: Introduce power aware scheduling for real time tasks Real Time task scheduling has historically been geared towards performance with a significant attempt to keep higher priority tasks on the same CPU. This is not optimal for power since the task CPU may not be the most power efficient CPU. Also task movement via select_lowest_rq() gives CPU priority the primary consideration before looking at CPU topologies to find a CPU closest to the task CPU in terms of topology. This again is not optimal for power since the closest CPU may be significantly worse for power than CPUs further away. This patch removes any bias for the task CPU. When the lowest priority CPUs in the system are found we give no consideration to the CPU topology. Instead we find the lowest power CPU within local_cpu_mask. This takes care of select_task_rq_rt() and push_task(). The pull model remains unaffected since we have no room for power optimization there. Change-Id: I4162ebe2f74be14240e62476f231f9e4a18bd9e8 Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-07-22 14:22:57 -07:00
Steve Muckle	ea296243e2	sched: balance power inefficient CPUs with one task Normally the load balancer does not pay attention to CPUs with one task since it is not possible to subdivide that load any further to achieve better balance. With power aware scheduling however it may be desirable to relocate that one task if the CPU it is currently executing on is less power efficient than other CPUs in the system. Change-Id: Idf3f5e22b88048184323513f0052827b884526b6 Signed-off-by: Steve Muckle <smuckle@codeaurora.org> Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-07-22 14:22:56 -07:00
Steve Muckle	6e6c17d05c	sched: check for power inefficient task placement in tick Although tasks are routed to the most power-efficient CPUs during task wakeup, a CPU-bound task will not go through this decision point. Load balancing can help if it is modified to dislodge a single task from an inefficient CPU. The situation can be further improved if during the tick, the task placement is checked to see if it is optimal. This sort of checking is already being done to ensure proper task placement in heterogeneous CPU topologies, so checking for power efficient task placement fits pretty well. Change-Id: I71e56d406d314702bc26dee1438c0eeda7699027 Signed-off-by: Steve Muckle <smuckle@codeaurora.org> Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-07-22 14:22:56 -07:00
Steve Muckle	7edb149465	sched: do nohz load balancing in order of power efficiency The nohz load balancer CPU does load balancing on behalf of all idle tickless CPUs. In the interest of power efficiency though, we should do load balancing on the most power efficient idle tickless CPU first, and then work our way towards the least power efficient idle tickless CPU. This will help load find its way to the most power efficient CPUs in the system. Since when selecting the CPU to balance next it is unknown what task load would be pulled, a frequency must be assumed in order to do a comparison of CPU power consumption. The maximum frequency supported by all CPUs is used for this. Change-Id: I96c7f4300fde2c677c068dc10fc0e57f763eb9b2 Signed-off-by: Steve Muckle <smuckle@codeaurora.org> Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-07-22 14:22:56 -07:00
Steve Muckle	ac1defc4d3	sched: run idle_balance() on most power-efficient CPU When a CPU goes idle, it checks to see whether it can pull any load from other busy CPUs. The CPU going idle may not be the most power-efficient idle CPU in the system however. This patch causes the CPU going idle to check to see whether there is a more power-efficient idle CPU within the same lowest sched domain. If there is, then it runs the load balancer on behalf of that CPU instead of itself. Since it is unknown at this point what task load would be pulled, a frequency must be assumed for this in order to do a comparison of CPU power consumption. The maximum frequency supported by all CPUs is used for this. Change-Id: I5eedddc1f7d10df58ecd358f37dba563eeecf4fc Signed-off-by: Steve Muckle <smuckle@codeaurora.org> Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-07-22 14:22:56 -07:00
Steve Muckle	5ef89ee90e	sched: add hook for platform-specific CPU power information To enable power-aware scheduling, provide a hook/infrastructure for platforms to communicate CPU power requirements for each supported CPU frequency. This information is then used to estimate the cost of running a task on a given CPU. Currently, an assumption is made that the task will be running by itself on the CPU. Given the current policy tries to spread tasks as much as possible this assumption should not be too far off. Change-Id: I19f1fa760a0d43222d2880f8aec0508c468b39bb Signed-off-by: Steve Muckle <smuckle@codeaurora.org> Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-07-22 14:22:56 -07:00
Steve Muckle	bf1afbbbcd	sched: add power aware scheduling sysctl The sched_enable_power_aware sysctl will control whether or not scheduling decisions are influenced by the power consumption of individual CPUs. Change-Id: I312f892cf76a3fccc4ecc8aa6703908b205267f0 Signed-off-by: Steve Muckle <smuckle@codeaurora.org> Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-07-22 14:22:55 -07:00
Srivatsa Vaddagiri	a6e9741047	sched: Extend update_task_ravg() to accept wallclock as argument This will make it easier to account interrupt time on a cpu, introduced in a subsequent patch. Change-Id: I0e1fb5255c280ca374fd255e7fc19d5de9f8b045 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-07-22 14:22:55 -07:00
Srivatsa Vaddagiri	61490dcfb4	sched: add sched_get_busy, sched_set_window APIs sched_get_busy() returns the busy time of a cpu during the most recent completed window. sched_set_window() sets the window size and aligns windows across all CPUs. Change-Id: Ic53e27f43fd4600109b7b6db979e1c52c7aca103 Signed-off-by: Steve Muckle <smuckle@codeaurora.org> Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-07-22 14:22:55 -07:00
Steve Muckle	d2cc14b9e2	sched: window-stats: adjust RQ curr, prev sums on task migration Adjust cpu's busy time in its recent and previous window upon task migration. This would enable scheduler to provide better inputs to cpufreq governor on a cpu's busy time in a given window. Change-Id: Idec2ca459382e9f46d882da3af53148412d631c6 Signed-off-by: Steve Muckle <smuckle@codeaurora.org> Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-07-22 14:22:54 -07:00
Steve Muckle	d7b56a170f	sched: window-stats: Add aggregated runqueue windowed stats Add counters per-cpu to track its busy time in the latest window and one previous to that. This would be needed to track accurate busy time per-cpu that accounts for migrations. Basically once a task migrates, its execution time in current window is migrated as well to new cpu. The idle task's runtime is not accounted since it should not count towards runqueue busy time. Change-Id: I4014dd686f95dbbfaa4274269bc36ed716573421 Signed-off-by: Steve Muckle <smuckle@codeaurora.org> Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-07-22 14:22:54 -07:00
Srivatsa Vaddagiri	af4d1578b9	sched: window-stats: add prev_window counter per-task Currently windows where tasks had no execution time are ignored. However accurate accounting of cpu busy time that factors in migration would need to know actual utilization of a task in the window previous to the latest one. This would help scheduler guide cpufreq governor on busy time per-cpu that is not subject to migration induced errors. Change-Id: I5841b1732c83e83d69002139de3bdb93333ce347 Signed-off-by: Steve Muckle <smuckle@codeaurora.org> Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-07-22 14:22:54 -07:00
Srivatsa Vaddagiri	975dbc9783	sched: window-stats: synchronize windows across cpus Synchronizing windows across cpus for task load measurements simplifies cpu busy time accounting during migrations. For task migrations, its usage in current window can be carried over to its new cpu. This lets cpufreq governor see a correct picture of cpu busy time that is not affected by migrations. This patch lines up windows across cpus. One of the cpus, sync_cpu, serves as a reference for all others. During bootup sync_cpu would initialize its window_start (from its sched_clock()). Other cpus will synchronize their window_start in reference to sync_cpu. This patch assumes synchronous sched_clock() across cpus and may need some change to address architectures which do not provide such synchronized sched_clock(). Change-Id: I13381389a72f5f9f85cc2446401d493a55c78ab7 Signed-off-by: Steve Muckle <smuckle@codeaurora.org> Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-07-22 14:22:54 -07:00
Srivatsa Vaddagiri	bf90a4be22	sched: window-stats: Do not account wait time Task load statistics are used for two purposes: cpu frequency management and placement. Task's load can't be accurately judged by its wait time. For example, a task could have waited for 10ms and when given opportunity to run, could just execute for 1ms. Accounting for 11ms as task's demand could be over-stating its needs in this example. For now, remove wait time from task demand and instead let task load be derived from its actual exec time. This may need to become a tunable feature. Change-Id: I47e94c444c6b44c3b0810347287d50f1ee685038 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-07-22 14:22:53 -07:00
Srivatsa Vaddagiri	bd020d066f	sched: window-stats: update during migration and earlier at wakeup During migrations accounting needs to be done in set_task_cpu() to subtract the task activity from the source CPU and add it to the destination CPU. This accounting will require that the task's window based load statistics be up to date. Unfortunately, the window-based statistics cannot always be updated in set_task_cpu() because they are already being updated in the wakeup path. We cannot update the statistics solely in the wakeup path because not all wakeups are migrations. Those non-migrating wakeups will not enter set_task_cpu(). To ensure the window-based stats are always updated for both wakeup migrations and regular migrations, they are updated earlier in the wakeup path, and also updated in set_task_cpu if the task is already runnable (this ensures it is not a wakeup migration, but a regular migration). Change-Id: Ib246028741d0be9bb38ce93679d6e6ba25b10756 Signed-off-by: Steve Muckle <smuckle@codeaurora.org> Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-07-22 14:22:53 -07:00
Srivatsa Vaddagiri	f0ad6a880a	sched: move definition of update_task_ravg() set_task_cpu() will need to call update_task_ravg(). Move up definition to make it easy. Change-Id: I95c1c9e009bd1805f28708e8d6fd3b7b2166410e Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2014-07-22 14:20:35 -07:00

710 Commits