cpufreq_register_driver() could race with CPU hotplug during
bootup. Since hotplug notification is not registered when
subsys_interface_register() is being executed, it's possible
cpufreq's view of online CPUs becomes stale before it registers
for hotplug notification.
Register for hotplug notification first and protect
subsys_interface_register() against hotplug using
get/put_online_cpus().
Change-Id: I26b2908f1d167c2becc4e8664c357bb7c6162406
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
commit 36b4bed5cd8f6e17019fa7d380e0836872c7b367 upstream.
The code that changes the policy to powersave also changes max_policy_pct
based on max_freq. The code that changes max_perf_pct has an upper limit based
on max_policy_pct. When the policy changes from powersave back to performance,
max_policy_pct is not changed. This means that max_perf_pct cannot be raised
to high values if max_freq was too low in the powersave policy.
Test case:
$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq
800000
$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq
3300000
$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
performance
$ cat /sys/devices/system/cpu/intel_pstate/max_perf_pct
100
$ echo powersave > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
$ echo 800000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq
$ echo 20 > /sys/devices/system/cpu/intel_pstate/max_perf_pct
$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
powersave
$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq
800000
$ cat /sys/devices/system/cpu/intel_pstate/max_perf_pct
20
$ echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
$ echo 3300000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq
$ echo 100 > /sys/devices/system/cpu/intel_pstate/max_perf_pct
$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
performance
$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq
3300000
$ cat /sys/devices/system/cpu/intel_pstate/max_perf_pct
24
At this point the intel_pstate driver still caps max_perf_pct at
max_policy_pct, which is 24 because of the previous powersave max_freq of
800000.
This patch resets max_policy_pct to its default value when the policy is set
to performance, so that max_perf_pct can again be raised to its maximum.
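The interaction in the test case above can be replayed with a small model. This is only a sketch: the Limits class, its method names and the reset_on_performance flag are hypothetical and not the driver's code, the cap is assumed to be max_freq as a whole-number percentage of a 3300000 kHz maximum, and each governor switch plus scaling_max_freq write is collapsed into one policy update.

```python
# Toy model of the limit bookkeeping described above. Field names follow
# the commit message; the class itself is hypothetical, not kernel code.
CPUINFO_MAX_FREQ = 3300000  # kHz, taken from the test case above


class Limits:
    def __init__(self, reset_on_performance):
        self.max_policy_pct = 100
        self.max_perf_pct = 100
        # True models the patched behaviour, False the old behaviour.
        self.reset_on_performance = reset_on_performance

    def set_policy(self, governor, max_freq):
        if governor == "performance":
            if self.reset_on_performance:
                self.max_policy_pct = 100
            return
        # powersave: derive the cap from scaling_max_freq (assumed formula)
        self.max_policy_pct = max_freq * 100 // CPUINFO_MAX_FREQ

    def set_max_perf_pct(self, pct):
        # A write to max_perf_pct is capped by max_policy_pct.
        self.max_perf_pct = min(pct, self.max_policy_pct)


def replay(reset_on_performance):
    """Replay the sysfs writes from the test case and return max_perf_pct."""
    limits = Limits(reset_on_performance)
    limits.set_policy("powersave", 800000)
    limits.set_max_perf_pct(20)
    limits.set_policy("performance", 3300000)
    limits.set_max_perf_pct(100)
    return limits.max_perf_pct
```

In this model replay(False) ends with the stale cap still applied, while replay(True) lets the final write of 100 take effect.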
Signed-off-by: Pali Rohár <pali.rohar@gmail.com>
Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Disable sample window alignment by default to match default behavior
of upstream interactive governor.
Change-Id: Ibbf4bdd4dd423f97d3a9dd5442eba78b378e66e2
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
When cpufreq_stats_table is not initialized for policy->last_cpu in
CPUFREQ_UPDATE_POLICY_CPU callback, no updates are necessary. Current
implementation dereferences the table unconditionally, causing a crash
if the table is NULL.
Return directly in cpufreq_stats_update_policy_cpu() if
cpufreq_stats_table of policy->last_cpu is NULL.
Change-Id: Ic9ef8120557702791ba5b873532e47cc0a5dc027
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
Failing to initialize cpufreq_stats table effectively disables
cpufreq_stats. Such failures should not be silent. Print error
messages when cpufreq_stats fails to create stats table.
Change-Id: I71cc0dd8262c7c6946e169f148ae39bd8f213a96
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
If a heavy task migrates between otherwise idle CPUs in a policy during
every sample window, the above hispeed delay window for the CPUs would get
restarted for every sample window. Due to the continuous restart of above
hispeed delay window, none of the CPUs would ever pick a target frequency
higher than hispeed frequency. This causes the policy's frequency to be
stuck at hispeed freq even if the load justifies a higher frequency.
To fix this, the above high speed delay window is restarted only when the
policy frequency changes. This ensures that tasks migrating between CPUs in
a policy are handled correctly.
Also, the hispeed load/frequency heuristic is only necessary when the
information is insufficient to determine if the load on the CPU needs at
least hispeed frequency. When the policy frequency is already at or above
hispeed frequency, if the CPU load% based on policy frequency is not above
hispeed load, then the information is clearly sufficient to determine that
the load on the CPU does not need hispeed frequency.
Therefore, compute CPU load% (which is used only to compare against hispeed
load) based on policy frequency instead of CPU target frequency.
Change-Id: I1749d663949e34753ecb5c426a16563796f8b0b2
Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
commit dfa5bb622555d9da0df21b50f46ebdeef390041b upstream.
The ondemand governor calculates load in terms of frequency and
increases it only if load_freq is greater than up_threshold
multiplied by the current or average frequency. This appears to
produce oscillations of frequency between min and max because,
for example, a relatively small load can easily saturate minimum
frequency and lead the CPU to the max. Then, it will decrease
back to the min due to small load_freq.
Change the calculation method of load and target frequency on the
basis of the following two observations:
 - Load computation should not depend on the current or average
   measured frequency. For example, absolute load of 80% at 100MHz
   is not necessarily equivalent to 8% at 1000MHz in the next
   sampling interval.
 - It should be possible to increase the target frequency to any
   value present in the frequency table proportional to the absolute
   load, rather than to the max only, so that:
     Target frequency = C * load
   where we take C = policy->cpuinfo.max_freq / 100.
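As a rough illustration of the formula above, the sketch below computes the proportional target and then maps it onto a frequency table. The function names are hypothetical, and the rule of picking the lowest table entry at or above the target is an assumption of this sketch, not something stated in this message.

```python
def target_freq(load_pct, cpuinfo_max_freq):
    """Target frequency = C * load, with C = cpuinfo_max_freq / 100."""
    return load_pct * cpuinfo_max_freq // 100


def pick_from_table(freq_table, wanted):
    """Lowest table entry at or above `wanted`, else the highest entry.

    The rounding direction is an assumption made for this sketch.
    """
    candidates = [f for f in sorted(freq_table) if f >= wanted]
    return candidates[0] if candidates else max(freq_table)
```

With a 1500000 kHz maximum, a 50% load asks for 750000 kHz, so a middle table entry can be selected instead of jumping straight to the maximum.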
Tested on Intel i7-3770 CPU @ 3.40GHz and on Quad core 1500MHz Krait.
Phoronix benchmark of Linux Kernel Compilation 3.1 test shows a
~1.5% increase in performance. cpufreq_stats (time_in_state) shows
that middle frequencies are used more with this patch. The highest
and lowest frequencies were used ~9% less.
[rjw: We have run multiple other tests on kernels with this
change applied and in the vast majority of cases it turns out
that the resulting performance improvement also leads to reduced
consumption of energy. The change is additionally justified by
the overall simplification of the code in question.]
Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a857c0b9e24e39fe5be82451b65377795f9538d8 upstream.
The time spent by a CPU under a given frequency is stored in jiffies unit
in the cpu var cpufreq_stats_table->time_in_state[i], i being the index of
the frequency.
This is what is displayed in the following file on the right column:
cat /sys/devices/system/cpu/cpuX/cpufreq/stats/time_in_state
2301000 19835820
2300000 3172
[...]
Now cpufreq converts this jiffies unit delta to clock_t before returning it
to the user as in the above file. And that conversion is achieved using the API
cputime64_to_clock_t().
Although it accidentally works on traditional tick based cputime accounting, where
cputime_t maps directly to jiffies, it doesn't work with other types of cputime
accounting such as CONFIG_VIRT_CPU_ACCOUNTING_* where cputime_t can map to nsecs
or any granularity preferred by the architecture.
For example we get a buggy zero delta on full dyntick configurations:
cat /sys/devices/system/cpu/cpuX/cpufreq/stats/time_in_state
2301000 0
2300000 0
[...]
Fix this by using the proper jiffies_64_to_clock_t() conversion.
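The difference between the two conversions can be shown with plain arithmetic. This is a simplified sketch, not the kernel implementation: the helper names are hypothetical, USER_HZ is taken as 100, the HZ values are examples, and nanoseconds are just one possible cputime_t granularity.

```python
USER_HZ = 100                  # clock_t ticks per second seen by userspace
NSEC_PER_SEC = 1_000_000_000


def jiffies_to_clock_t(jiffies, hz):
    """Convert a jiffies count to clock_t ticks (simplified)."""
    return jiffies * USER_HZ // hz


def nsec_cputime_to_clock_t(cputime_ns):
    """Result of a cputime conversion when cputime_t counts nanoseconds."""
    return cputime_ns * USER_HZ // NSEC_PER_SEC
```

Feeding a small jiffies count such as 3172 into the nanosecond-based conversion rounds down to zero, which is the kind of bogus delta described above.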
Reported-and-tested-by: Carsten Emde <C.Emde@osadl.org>
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Previously, the load change callback was not allowed to attempt to wake
up a task. The best it could do was to schedule a timer at the current
jiffy. The timer function would only be executed at the next timer
tick, which could take up to 10ms.
Now that this limitation is removed, re-evaluate load immediately upon
receiving this callback.
Change-Id: Iab3de4705b9aae96054655b1541e32fb040f7e60
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
rq->curr/prev_runnable_sum counters represent cpu demand from various
tasks that have run on a cpu. Any task that runs on a cpu will have a
representation in rq->curr_runnable_sum. Their partial_demand value
will be included in rq->curr_runnable_sum. Since partial_demand is
derived from historical load samples for a task, rq->curr_runnable_sum
could represent "inflated/un-realistic" cpu usage. As an example, let's
say that a task with partial_demand of 10ms runs for only 1ms on a cpu.
What is included in rq->curr_runnable_sum is 10ms (and not the actual
execution time of 1ms). This leads to cpu busy time being reported on
the upside causing frequency to stay higher than necessary.
This patch fixes cpu busy accounting scheme to strictly represent
actual usage. It also provides for conditional fixup of busy time upon
migration and upon heavy-task wakeup.
CRs-Fixed: 691443
Change-Id: Ic4092627668053934049af4dfef65d9b6b901e6b
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
Make sampling window alignment optional when scheduler inputs
are not enabled.
Change-Id: If69c111a3efe219cdd1e38c1f46f03404789c0bb
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
Previously known as sampling down factor, max_freq_hysteresis
extends the period that interactive governor will stay at policy->max.
This feature is to accommodate short idle periods in an otherwise very
intensive workload.
When the feature is enabled, it ensures that once a CPU goes to max
frequency, it doesn't reduce the frequency for max_freq_hysteresis
microseconds from the time it first goes to idle.
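The rule above amounts to a simple time comparison. The sketch below is only an illustration: may_lower_freq and its parameters are hypothetical names, and it assumes the CPU is already at the maximum frequency and that the time of its first idle entry is known.

```python
def may_lower_freq(now_us, first_idle_us, max_freq_hysteresis_us):
    """Return True once the hysteresis period since first idle has passed.

    A max_freq_hysteresis_us of 0 disables the feature, so the frequency
    may drop immediately.
    """
    return now_us - first_idle_us >= max_freq_hysteresis_us
```

For example, with a 20000 microsecond hysteresis and a first idle entry at 10000, the frequency is held until time 30000.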
Change-Id: Ia54985cb554f63f8c22d0b554a0a0f2ed2be038f
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
All features in interactive_pro governor are now ported to interactive
governor. Delete interactive_pro governor.
Change-Id: Ic847968f45079ba8d72f1b6fecce4c0e6b88b37a
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
* commit 'v3.10.49': (529 commits)
Linux 3.10.49
ACPI / battery: Retry to get battery information if failed during probing
x86, ioremap: Speed up check for RAM pages
Score: Modify the Makefile of Score, remove -mlong-calls for compiling
Score: The commit is for compiling successfully.
Score: Implement the function csum_ipv6_magic
score: normalize global variables exported by vmlinux.lds
rtmutex: Plug slow unlock race
rtmutex: Handle deadlock detection smarter
rtmutex: Detect changes in the pi lock chain
rtmutex: Fix deadlock detector for real
ring-buffer: Check if buffer exists before polling
drm/radeon: stop poisoning the GART TLB
drm/radeon: fix typo in golden register setup on evergreen
ext4: disable synchronous transaction batching if max_batch_time==0
ext4: clarify error count warning messages
ext4: fix unjournalled bg descriptor while initializing inode bitmap
dm io: fix a race condition in the wake up code for sync_io
Drivers: hv: vmbus: Fix a bug in the channel callback dispatch code
clk: spear3xx: Use proper control register offset
...
In addition to bringing in upstream commits, this merge also makes minor
changes to maintain compatibility with upstream:
The definition of list_next_entry in qcrypto.c and ipa_dp.c has been
removed, as upstream has moved the definition to list.h. The implementation
of list_next_entry was identical between the two.
irq.c, for both arm and arm64 architecture, has had its calls to
__irq_set_affinity_locked updated to reflect changes to the API upstream.
Finally, as we have removed the sleep_length member variable of the
tick_sched struct, all changes made by upstream commit ec804bd do not
apply to our tree and have been removed from this merge. Only
kernel/time/tick-sched.c is impacted.
Change-Id: I63b7e0c1354812921c94804e1f3b33d1ad6ee3f1
Signed-off-by: Ian Maund <imaund@codeaurora.org>
Interactive governor does not have enough information about the tasks
on a CPU to make a more informed decision on the frequency the CPUs
should run at. To address this problem, modify interactive governor
to get load information from scheduler. In addition, it can get
notification from scheduler on significant load change to reevaluate
CPU frequency immediately.
Add two sysfs files to control the behavior of load evaluation:
use_sched_load:
When enabled, governor uses load information from scheduler
instead of busy/idle time from past window.
use_migration_notif:
Whenever a task migrates, scheduler might send a notification
so that governor can re-evaluate load and scale frequency.
Governor will ignore this notification unless both
use_sched_load and use_migration_notif are true for
the policy group.
Change-Id: Iaf66e424c6166ec15480db027002b3a3b357d79c
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
Replace mod_timer_pinned() with del_timer(), add_timer_on().
mod_timer_pinned() always adds timer onto current CPU. Interactive
governor expects each CPU's timers to be running on the same CPU.
If cpufreq_interactive_timer_resched() is called from another CPU,
the timer will be armed on the wrong CPU.
Replacing mod_timer_pinned() with del_timer() and add_timer_on()
guarantees timers are still run on the right CPU even if another
CPU reschedules the timer. This would provide more flexibility
for future changes.
Change-Id: I3a10be37632afc0ea4e0cc9c86323b9783b216b1
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
Currently, tunables are only saved to per_cpu field when
CPUFREQ_GOV_POLICY_EXIT event happens. Save tunables the moment they
are created so that per_cpu cached_tunables field always matches
the tunables in use. This is useful for modifying tunable values
across clusters.
Change-Id: I9e30d5e93d6fde1282b5450458d8a605d568a0f5
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
Timers are scheduled in unit of jiffies. Round up timer_rate so that
it matches the actual sampling period.
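The rounding can be sketched as follows. The function name is hypothetical, timer_rate is assumed to be in microseconds, and HZ is passed in as a parameter because the tick rate is configuration dependent.

```python
def round_to_jiffy(timer_rate_us, hz):
    """Round timer_rate up to a whole number of jiffies, in microseconds."""
    usecs_per_jiffy = 1_000_000 // hz
    jiffies = -(-timer_rate_us // usecs_per_jiffy)  # ceiling division
    return jiffies * usecs_per_jiffy
```

With HZ=100 a requested 25000 us rate becomes 30000 us, which is the period the timer would actually run at.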
Change-Id: I47e666f835752528331f50b1e76784e6d67f8bcf
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
When CPU has been busy for a long time, last evaluated jiffy will be
quite behind because the timer would have been canceled. We don't want
to schedule a timer to fire in the past as load will always be 100%.
Reset last evaluated jiffy so that timer will be scheduled for the
next window.
Change-Id: Ie25e65eab1f16acdeda267987ca605d653f1f32a
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
When CPU has been busy for a long time, last evaluated jiffy will be
quite behind because the timer would have been canceled. We don't want
to schedule a timer to fire in the past as load will always be 100%.
Reset last evaluated jiffy so that timer will be scheduled for the
next window.
Change-Id: I4c3838f36bf4d1e4cebce29a26b45611b416d929
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
It's more advantageous to evaluate all CPUs at same time so that
interactive governor gets a complete picture of the load on
each CPU at a specific time. It could also reduce number of speed
changes made if there are many CPUs controlled by same policy. In
addition, waking up all CPUs at same time would allow the cluster
to go into a deeper sleep state when it's idle.
Change-Id: I6915050c5339ef1af106eb906ebe4b7c618061e2
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
Interactive governor already has a per_cpu field cpuinfo to keep track
of per_cpu data. Move cached_tunables into cpuinfo.
Change-Id: I77fda0cda76b56ff949456a95f96d129d877aa7b
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
Remove sampling_down_factor feature.
This commit reverts commit d094d23694
(cpufreq: interactive: Add a sampling_down_factor for max frequencies)
and subsequent modifications related to sampling down factor.
Change-Id: Ib7ec0a918bd3e85a3425dbdeefcd2f2aecffe69c
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
Sync freq feature is not valid for a HMP system with clusters.
This commit reverts commit f3d1980b4d
(cpufreq: interactive: sync freq feature for interactive governor)
Change-Id: I78cb91a94b1a022f8daed045f5aae69f1c00783d
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
Commit f8b276565c
(cpufreq: Sync on thread migration optimizations) is no longer needed
for targets with synchronous CPUs.
Part of that commit has already been reverted in
a913b3afca
(cpufreq: interactive: Revert timer start modification)
This commit reverts the remaining changes.
Change-Id: I7eadeb7e48cfbef8fec74eb1b0e221eb65482f52
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
Enable cpufreq and power kconfig menus on arm64 along with arm cpufreq
drivers. The power menu is needed for OPP support. At least on Calxeda
systems, the same cpufreq driver is used for arm and arm64 based
systems.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Mark Brown <broonie@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Git-commit: 52e7e816420383a340cfb6c3ffd12477c3c80b76
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
When the governor limits are changed, it was calling
cpufreq_interactive_timer_start() after deleting the timers. But there's a
possibility for another thread to race and add the timer before
cpufreq_interactive_timer_start() is called. Remove the possibility of this
race by calling cpufreq_interactive_timer_resched() that grabs a spinlock,
deletes the timer and then adds the timer again. This works because timers
can be deleted multiple times without any side effects.
[ 686.764888] ------------[ cut here ]------------
[ 686.768484] kernel BUG at .../kernel/timer.c:931!
[ 686.787923] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[ 686.793390] Modules linked in:
[ 687.311878] Call trace:
[ 687.314316] [<ffffffc000223dd4>] add_timer_on+0x5c/0x148
[ 687.319607] [<ffffffc00082584c>] cpufreq_interactive_timer_start+0x60/0xd8
[ 687.326466] [<ffffffc0008264f8>] cpufreq_governor_interactive+0x6d4/0x720
[ 687.333239] [<ffffffc00081ca94>] __cpufreq_governor+0x120/0x1d0
[ 687.339139] [<ffffffc00081cdb0>] cpufreq_set_policy+0x26c/0x284
[ 687.345043] [<ffffffc00081db5c>] cpufreq_update_policy+0x110/0x140
[ 687.351207] [<ffffffc0007ff220>] update_cpu_freq+0x30/0x68
[ 687.356672] [<ffffffc0007ff358>] update_cluster_freq+0x100/0x128
[ 687.362667] [<ffffffc000b80d30>] do_freq_mitigation+0x140/0x16c
[ 687.368570] [<ffffffc0002363e0>] kthread+0xac/0xb8
Change-Id: I4dbc084cc5bd7f583fdca2b593a227298c345518
Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
Scheduler provides an API to force tasks to the big cluster. To
improve performance, use this API to move most/all tasks to the
big cluster for short duration on an input event. On the removal of
frequency boost (after input_boost_ms), this scheduler boost is also
deactivated.
Change-Id: I9d643914ebc75266478cc22260a45862faad6236
Signed-off-by: Rohit Gupta <rohgup@codeaurora.org>
Currently each CPU queues up its own work for removing input boost.
This is not really required as boost removal for all the CPUs can
be done at the same time. So this patch uses a single work to
remove input boost for all the CPUs and updates the policy for
the online ones.
Change-Id: I37c809f2f155548b1d8c1b9aa7626c8852b3acc6
Signed-off-by: Rohit Gupta <rohgup@codeaurora.org>
Different types of CPUs could have different frequency to satisfy same
input workload. Add support for using different input_boost_freq on
different CPUs.
input_boost_freq now either takes a single number which applies to all
CPUs, or cpuid:freq pairs separated by space for different CPUs.
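A parser for the two accepted forms might look like the sketch below. It is not the driver's code: the function name is hypothetical, and leaving CPUs that are not named in a cpuid:freq list at 0 is an assumption of this sketch.

```python
def parse_input_boost_freq(value, num_cpus):
    """Return a list of boost frequencies indexed by CPU id.

    CPUs not named in a cpuid:freq list are left at 0 here; that default
    is an assumption of this sketch.
    """
    tokens = value.split()
    # Single number: the same frequency applies to every CPU.
    if len(tokens) == 1 and ":" not in tokens[0]:
        return [int(tokens[0])] * num_cpus
    freqs = [0] * num_cpus
    for token in tokens:
        cpu_str, freq_str = token.split(":")
        cpu = int(cpu_str)
        if not 0 <= cpu < num_cpus:
            raise ValueError("no such CPU: %d" % cpu)
        freqs[cpu] = int(freq_str)
    return freqs
```

For instance, "1200000" sets every CPU to 1200000, while "0:1200000 2:1500000" sets only CPU0 and CPU2.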
Change-Id: I20506a9fbdb4d532d94168bbd61744595bebc8e5
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
As pointer size changes to 64 bits in the 64-bit kernel, the int
declarations used for pointers in the legacy code give compilation
warnings, which get flagged as errors.
Replace int casting of pointers with long to get rid of these
warnings.
Change-Id: I96b6cf342c2bf110220eac0addfb72fbdd969c6e
Signed-off-by: Rohit Gupta <rohgup@codeaurora.org>
This reverts commit ff6af80775.
Commit ff6af807 tries to avoid a corner case where frequency is stuck in
hispeed_freq for one additional window. For example, if timer_rate is
20ms, and go_hispeed_delay is 40ms, frequency might be stuck at
hispeed_freq for 60ms due to imprecision in jiffies. Same problem can be
easily solved by making go_hispeed_delay 1ms smaller instead of changing
the code.
Change-Id: Idab7c29ed28374df219210e444454068864d144d
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
When tunables are not available for events other than
CPUFREQ_GOV_POLICY_INIT in cpufreq_governor_interactive(), trigger a
panic instead of throwing a warning.
When the original warning happens, some race condition must have
occurred, and governor will be in a bad state even if it might still
run for a while. Panic directly so that it's easier to catch the
first race event.
Change-Id: I2dc1185cabfe72a63739452731fe242924d2cf45
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
In cases where the system boots up under a constraint, with policy min and
max lower than cpuinfo min/max, and the user tries to set a higher user
policy min, the value would be overridden while verifying the
limits.
Once the user initiates the sysfs write, the previous user policy is
maintained in policy min and max, thus changing the limits for verification
of the current policy. Once the verification is completed, restore the
current user policy min/max with the updated values, if any. This takes
care of cases where the user policy min/max input is higher or lower than
the current min and max.
Change-Id: I5ad92ba05162cb5c32c3ba3fdae21d2e505493d3
Signed-off-by: Taniya Das <tdas@codeaurora.org>
Scheduler has more information about what potential load could be
on each CPU in the future. Use scheduler hints instead of busy/idle
time from past window.
In addition, replace mod_timer_pinned() with del_timer(), add_timer_on().
mod_timer_pinned() always adds timer onto current CPU. Interactive
governor expects each CPU's timers to be running on the same CPU, but
load change callback might be triggered from other CPUs.
Replacing mod_timer_pinned() with del_timer() and add_timer_on()
guarantees timers are still run on the right CPU even if another CPU
reschedules the timer.
Change-Id: Ic7c917bffe7bfef60c9ef93072276aa873927ad7
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
It's more advantageous to evaluate all CPUs at same time so that
interactive governor gets a complete picture of the load on
each CPU at a specific time. It could also reduce number of speed
changes made if there are many CPUs controlled by same policy.
Change-Id: I4cfa5027b7a8c647f34893215573dc1fcd6428d5
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
Userspace might change tunable values for a governor. Currently, if
all CPUs in a policy go offline, governor frees its tunable. This
wipes out all userspace modifications. Kernel drivers can call
cpu_up/down() directly and thus userspace won't have a chance to
restore the tunables.
Permanently save tunable struct in a per_cpu field so that we
preserve tunable values across hotplug, suspend/resume and governor
switch.
Change-Id: I5bc11d40e6bf649bdcb49f209468583d37a4c424
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
To avoid multiple frees of an allocated tunables struct during
module_exit(), the pointer to the allocated tunables should be stored in
only one of the per-CPU cached_tunables pointers.
So, in the case of per policy governor configuration, store the cached
values in the pointer of first CPU in a policy. In the case of one governor
across all policies, store it in the CPU0 pointer.
Change-Id: Id4334246491519ac91ab725a8758b2748f743bb0
Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
cpufreq_get/put_global_kobject() provides refcounting for
cpufreq_global_kobject. Users need to call these APIs if they intend
to use cpufreq_global_kobject.
Wrap cpufreq_global_kobject usage with get/put calls in interactive
governor.
Change-Id: I03c6830c297475a83c3eab723f1ec5449dcd151a
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
Rename governor in cpufreq_interactive_pro.c to interactive_pro.
Also fix compilation error for tracepoint definitions.
Change-Id: Ic4eb722fc50565489296a16c0193a4373d76ba5b
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
Upstream has moved get_cpu_idle_time() and related functions into core
CPUfreq framework. Remove unnecessary functions in interactive governor.
Change-Id: Ibba09e190581610b2c4bc17344164973414f1b12
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
The cpufreq_stats_create_table function does not fully protect cpu_stats'
members with a spinlock. This change fixes that. It also performs a
null-check on freq_table in freq_table_get_index and returns -ENOENT
appropriately.
Change-Id: I301d1b23ef766a77161bab1472dc810d53b6c5fe
Signed-off-by: Anurag Singh <anursing@codeaurora.org>
Introduce a compile time flag to enable scheduler guidance of
frequency selection. This flag is also used to turn on or off
window-based load stats feature.
Having a compile time flag will let some platforms avoid any
overhead that may be present with this scheduler feature.
Change-Id: Id8dec9839f90dcac82f58ef7e2bd0ccd0b6bd16c
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
commit 5a90af67c2126fe1d04ebccc1f8177e6ca70d3a9 upstream.
Commit 8a7b1227e3 (cpufreq: davinci: move cpufreq driver to
drivers/cpufreq) added a dependency only for CONFIG_ARCH_DAVINCI_DA850,
whereas the davinci_cpufreq_init() call is used by all davinci platforms.
This patch fixes the following build error:
arch/arm/mach-davinci/built-in.o: In function `davinci_init_late':
:(.init.text+0x928): undefined reference to `davinci_cpufreq_init'
make: *** [vmlinux] Error 1
Fixes: 8a7b1227e3 (cpufreq: davinci: move cpufreq driver to drivers/cpufreq)
Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Use write lock when updating cpufreq_cpu_data,
and read lock when getting the policy pointer.
CRs-Fixed: 689522
Change-Id: I454f0d575157b3411d369e04097386f50aeaaa1c
Signed-off-by: Maria Yu <aiquny@codeaurora.org>
Make sure CPU is online before proceeding with any "show"
ops. Without this check, the show can race with hotplug
and try to show the details of a stale or non-existent
policy.
CRs-Fixed: 689522
Change-Id: Ie791c73cb281bcfc4d722f7c8c10eee07cb11f2e
Signed-off-by: Maria Yu <aiquny@codeaurora.org>
The cpufreq_interactive_timer gets cancelled and rescheduled
whenever the cpufreq_policy is changed. When the cpufreq policy is
changed at a rate faster than the sampling_rate of the interactive
governor, the governor fails to change the target frequency
for a long duration. The patch removes the need to cancel the
timers when policy->min is changed.
Change-Id: Ibd98d151e1c73b8bd969484583ff98ee9f1135ef
Signed-off-by: Badhri Jagan Sridharan <Badhri@google.com>
Git-commit: 9b97d655a558607c5d46ef1f21365d695f8d1ee2
Git-Repo: https://android.googlesource.com/kernel/common.git
[junjiew@codeaurora.org: resolve merge conflicts]
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
commit f8b276565c
(cpufreq: Sync on thread migration optimizations)
introduced a change to cpufreq_interactive_timer_start() in order
to reschedule the timer differently based on whether min or max
is changed. A better way is to reschedule the timer only when
necessary.
Revert timer start modification in preparation for the final fix.
Change-Id: I13f3b75a6eee03ac6380c24db899806a9bfbc96a
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
Using the function wait_event in cpu_boost puts the
process into the 'D' state, which contributes to the
high load average. This change puts the process
boost_sync in the 'S' state (interruptible sleep) instead.
Change-Id: Ie121adbe1fac1d2862ac5342bb97c7c926f7d7a8
CRs-Fixed: 655484
Signed-off-by: Swetha Chikkaboraiah <schikk@codeaurora.org>
Signed-off-by: Raghavendra Ambadas <rambad@codeaurora.org>
It's no longer a requirement to pin frequency change on the CPU that
is being scaled. Therefore, there is no longer a need for per-cpu
workqueue in qcom-cpufreq. Remove the workqueue.
Change-Id: Ic6fd7f898fa8b1b1226a178b04530c24f0398daa
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
MSM_CPU_FREQ_SET_MIN_MAX and related Kconfigs are deprecated. Purge
them from Kconfig and qcom-cpufreq.
Change-Id: I8ac786c155c7e235154b60c79f97d76ea15dace2
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
qcom-cpufreq uses a per-cpu work to scale CPU frequency. For CPUs
in sync, only the first CPU that is plugged in has its work
initialized. When that CPU goes offline, it hands over policy
to another CPU, which doesn't have its work initialized. If CPU
scaling happens then, an uninitialized work will be queued onto
workqueue, causing a crash.
Initialize workqueue for all CPUs in sync in msm_cpufreq_init().
Change-Id: I4c3bc08182c4088de4a3675c47a8e0e10c8e4f47
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
Governor error messages point to important failures in governor or
framework. Output triggering CPU and policy->cpu to help debugging.
Change-Id: I4c5c392ec973b764ec3240bb2eb455c624bcaf63
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
* commit 'v3.10.40': (203 commits)
Linux 3.10.40
ARC: !PREEMPT: Ensure Return to kernel mode is IRQ safe
drm: cirrus: add power management support
Input: synaptics - add min/max quirk for ThinkPad Edge E431
Input: synaptics - add min/max quirk for ThinkPad T431s, L440, L540, S1 Yoga and X1
lockd: ensure we tear down any live sockets when socket creation fails during lockd_up
dm thin: fix dangling bio in process_deferred_bios error path
dm transaction manager: fix corruption due to non-atomic transaction commit
Skip intel_crt_init for Dell XPS 8700
mtd: sm_ftl: heap corruption in sm_create_sysfs_attributes()
mtd: nuc900_nand: NULL dereference in nuc900_nand_enable()
mtd: atmel_nand: Disable subpage NAND write when using Atmel PMECC
tgafb: fix data copying
gpio: mxs: Allow for recursive enable_irq_wake() call
rtlwifi: rtl8188ee: initialize packet_beacon
rtlwifi: rtl8192se: Fix regression due to commit 1bf4bbb
rtlwifi: rtl8192se: Fix too long disable of IRQs
rtlwifi: rtl8192cu: Fix too long disable of IRQs
rtlwifi: rtl8188ee: Fix too long disable of IRQs
rtlwifi: rtl8723ae: Fix too long disable of IRQs
...
Change-Id: If5388cf980cb123e35e1b29275ba288c89c5aa18
Signed-off-by: Ian Maund <imaund@codeaurora.org>
Some SoCs contain CPU clock trees the elements of which are
gated off when the CPUs enter power collapse. This includes
sources of glitch free muxes; therefore when the CPUs enter
power collapse, those muxes cannot be switched.
Now in the cpufreq driver, the CPU clocks are disabled in
the CPU_DEAD notifier, which implies that the CPU muxes
are switched to a safe source *after* the CPUs are power
collapsed. However, the source of the GFMUX may already be
turned off, causing the mux to get stuck.
Ideally, the mux should allow a static switch, since the
clock to the CPU is gated. Some implementations do not
allow this.
Change-Id: I37d3426f20250c59756a0b55d1284efad5359a23
Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org>
common_tunables are only used in cpufreq_interactive. Make it
static.
Change-Id: Iec8ee12af2728c8878d001dc1cf3613be529dc67
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
Userspace might change tunable values for a governor. Currently, if
all CPUs in a policy go offline, governor frees its tunable. This
wipes out all userspace modifications. Kernel drivers can call
cpu_up/down() directly and thus userspace won't have a chance to
restore the tunables.
Permanently save tunable struct in a per_cpu field so that we
preserve tunable values across hotplug, suspend/resume and governor
switch.
Change-Id: I126b8278c8e75c8eadb3e2ddfe97fcc72cddfa23
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
Even if all CPUs share the same frequency table, there might still be use
cases where governor-per-policy is useful. Move it before returning
from parsing common CPU freq table.
Change-Id: I0254dd4d09b6ea6595a183207da036b224c90f04
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
Light-weight init/teardown is introduced to preserve file permission and
reduce suspend/resume latency. However, it doesn't work correctly if
multiple CPUs controlled by same policy can all go offline.
Suspend and resume removes and adds back CPUs in the same order for
non-boot CPUs. Say CPU2 and CPU3 are both online when suspend starts.
CPU2 goes offline first, handing policy and sysfs over to CPU3. Then
CPU3 goes offline. Due to light-weight teardown, CPU3 still owns the
policy and sysfs files.
When CPU2 comes online after resume, it calls update_policy_cpu() to take
back the policy ownership, but sysfs is not touched. Now CPU2 is online,
with an active policy, but no sysfs folder. In addition, when CPU3 comes
online, it will attempt to create a symlink while owning the real sysfs
folder.
To really fix this issue, special handling for sysfs and symlinks is
required during suspend and resume. This requires non-trivial refactoring
of certain functions.
Temporarily disable light-weight init/teardown until a solution is found.
Change-Id: I485483244954129fa405bc5ef1a5e0da5c05a7d5
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
If CPUs have different performance characteristics, it makes sense to have
different CPU frequency tables for each unique CPU clock. Add support for
parsing different frequency tables for each unique CPU clock.
Change-Id: Ia9b064dfd1f84320d26dd41070339cec548abe7c
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>