Commit Graph

1076 Commits

Author SHA1 Message Date
Linux Build Service Account a4331ca2b8 Merge "cpufreq: cpu-boost: Don't register for cpufreq notifiers too early" 2014-02-26 21:23:40 -08:00
Linux Build Service Account 52159dfc55 Merge "qcom-cpufreq: Block hotplug until cpufreq is ready" 2014-02-26 17:32:08 -08:00
Linux Build Service Account 663435db62 Merge "cpufreq: interactive: delete timers for GOV_START" 2014-02-26 06:55:27 -08:00
Junjie Wu 2b63b109fc qcom-cpufreq: Block hotplug until cpufreq is ready
Hotplug before qcom-cpufreq is ready could lead to inconsistent CPU
clock state. Block hotplug by returning NOTIFY_BAD in hotplug callback
until qcom-cpufreq is probed.

Change-Id: I72a2f98c083c9b21b95ecafdb5a5be7a7682e842
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2014-02-25 11:22:11 -08:00
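
As a rough user-space illustration of commit 2b63b109fc above, the sketch below vetoes hotplug until a ready flag is set. The names hotplug_callback_sketch, probe_sketch, and the SKETCH_NOTIFY_* values are hypothetical stand-ins; the real driver would use the kernel's notifier API and NOTIFY_BAD, and its probe logic is not shown in the commit message.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_SKETCH 4

/* Stand-ins for the kernel's NOTIFY_OK / NOTIFY_BAD (values are illustrative). */
enum notify_result { SKETCH_NOTIFY_OK, SKETCH_NOTIFY_BAD };

/* Set to true once the (hypothetical) probe routine has finished. */
static bool cpufreq_ready;

/* Hypothetical hotplug callback: veto the transition until the driver is probed. */
static enum notify_result hotplug_callback_sketch(unsigned int cpu)
{
    assert(cpu < NR_CPUS_SKETCH);
    return cpufreq_ready ? SKETCH_NOTIFY_OK : SKETCH_NOTIFY_BAD;
}

/* Hypothetical probe: clock setup would happen here before the flag flips. */
static void probe_sketch(void)
{
    cpufreq_ready = true;
}
```

Any hotplug request that arrives before probe_sketch() runs is rejected, so the CPU clock state cannot be touched by a half-initialized driver.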
Shridhar Rasal 13f3529d5d cpufreq: interactive: delete timers for GOV_START
Make sure that the timers cpu_timer and cpu_slack_timer are
deactivated before new ones are added.

Change-Id: If31c4049606871df6f00efdc24b1d713c86a6f69
Signed-off-by: Shridhar Rasal <srasal@nvidia.com>
Signed-off-by: Bharat Nihalani <bnihalani@nvidia.com>
Git-commit: b430123367e78a6557bac3cf1558bcb85193fb12
Git-repo: https://android.googlesource.com/kernel/common/
[mattw@codeaurora.org: resolved trivial context conflict]
Signed-off-by: Matt Wagantall <mattw@codeaurora.org>
2014-02-24 16:14:50 -08:00
Junjie Wu d0e097389e msm: cpufreq: Move cpufreq to drivers/cpufreq/
Architectural changes in the ARM Linux kernel tree mandate the
eventual removal of the mach-* directories. Move mach/cpufreq to
drivers/cpufreq/. Also move the related header to include/linux.

Change-Id: I6dcf69e275b7ca7ba913e945353a42f0d6321731
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2014-02-21 11:17:03 -08:00
Saravana Kannan d1631df89a cpufreq: cpu-boost: Don't register for cpufreq notifiers too early
The cpufreq notifiers should be registered only after all the data
structures used in the notifier callbacks have been initialized. So, move
the cpufreq notifier registration to a later point in the init function.

Change-Id: I043ab5bc0ebb98164c40549fe151a8d801c8c186
Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
2014-02-13 19:08:21 -08:00
Anurag Singh a16e20e70e cpufreq: persistent_stats: export persistent CPU frequency data
CPU frequency statistics are lost when a CPU is put offline to save power.
Due to this behavior, there is currently no way of knowing what frequencies
a CPU ran at over the system's uptime (assuming it is hot-pluggable and was
offlined/onlined multiple times). To solve this problem, export these
statistics so that frequency residencies for all CPUs are preserved.

These statistics can be found under
/sys/devices/system/cpu/cpufreq/stats/cpuX - 'cpuX' is the CPU ID.

Following are the important nodes:
- time_in_state: reading this node would output all the frequencies the CPU
  is capable of running at and the amount of time (in jiffies) it ran at
  those frequencies
- reset: writing 1 to this node would reset all frequency time stats to 0
- enable: writing 0 or 1 to this node disables or enables stats collection,
  respectively. By default, stats collection is enabled.

Change-Id: I225ef89f7b359f1f94386f2f9445ece9d5119768
Signed-off-by: Anurag Singh <anursing@codeaurora.org>
2014-02-13 15:15:02 -08:00
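
A small sketch of how a tool might consume the time_in_state node described in commit a16e20e70e above. It assumes each line holds a frequency followed by a residency count, separated by whitespace, and that frequencies are in kHz; neither detail is confirmed by the commit message. The struct and function names are made up for this example.

```c
#include <assert.h>
#include <stdio.h>

/* One parsed line: a frequency and the time spent at it. */
struct residency {
    unsigned long freq_khz;      /* unit is an assumption */
    unsigned long long ticks;    /* jiffies, per the commit message */
};

/* Parse "<freq> <time>" pairs from text; returns the number of entries stored. */
static size_t parse_time_in_state(const char *text, struct residency *out, size_t max)
{
    size_t n = 0;
    int used = 0;

    assert(text != NULL && out != NULL);
    while (n < max &&
           sscanf(text, "%lu %llu%n", &out[n].freq_khz, &out[n].ticks, &used) == 2) {
        text += used;   /* advance past the pair just consumed */
        n++;
    }
    return n;
}
```

In practice the text would be read from /sys/devices/system/cpu/cpufreq/stats/cpuX/time_in_state with fopen() and fread() before being passed to the parser.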
Dirk Brandewie 0df520d459 intel_pstate: Correct calculation of min pstate value
commit 7244cb62d96e735847dc9d08f870550df896898c upstream.

The minimum P-state is supposed to be a percentage of the maximum
P-state available.  Calculate min using the maximum P-state and not the
current max, which may have been limited by the user.

Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-02-13 13:48:04 -08:00
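
To make the arithmetic in commit 0df520d459 above concrete, here is a minimal sketch. The function name and the plain integer percentage are assumptions for illustration; the driver's own calculation is not reproduced here.

```c
#include <assert.h>

/*
 * Minimum P-state as a percentage of a base P-state.
 * The commit's point is that the base should be the hardware maximum,
 * not a maximum that the user has already lowered.
 */
static int min_pstate_sketch(int base_max_pstate, int min_pct)
{
    assert(min_pct >= 0 && min_pct <= 100);
    return base_max_pstate * min_pct / 100;
}
```

With a hardware maximum of 30, a user-limited maximum of 20, and a 50% minimum, using the hardware maximum yields 15 while using the limited maximum yields 10, which is lower than the user asked for.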
Brennan Shacklett e34ce30f32 intel_pstate: Improve accuracy by not truncating until final result
commit d253d2a52676cfa3d89b8f0737a08ce7db665207 upstream.

This patch addresses Bug 60727
(https://bugzilla.kernel.org/show_bug.cgi?id=60727)
which was due to the truncation of intermediate values in the
calculations, which causes the code to consistently underestimate the
current cpu frequency, specifically 100% cpu utilization was truncated
down to the setpoint of 97%. This patch fixes the problem by keeping
the results of all intermediate calculations as fixed point numbers
rather than scaling them back and forth between integers and fixed point.

References: https://bugzilla.kernel.org/show_bug.cgi?id=60727
Signed-off-by: Brennan Shacklett <bpshacklett@gmail.com>
Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-02-13 13:48:04 -08:00
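
The effect described in commit e34ce30f32 above can be reproduced with a toy calculation. The helpers below are written for this example (8 fractional bits is an assumption) and the two busy_* functions are hypothetical; they only show how converting an intermediate value back to an integer loses precision compared with staying in fixed point until the end.

```c
#include <assert.h>
#include <stdint.h>

#define FRAC_BITS 8

static int32_t int_tofp(int32_t x) { return x * (1 << FRAC_BITS); }
static int32_t fp_toint(int32_t x) { return x >> FRAC_BITS; }
static int32_t mul_fp(int32_t x, int32_t y)
{
    return (int32_t)(((int64_t)x * y) >> FRAC_BITS);
}
static int32_t div_fp(int32_t x, int32_t y)
{
    return (int32_t)(((int64_t)x * (1 << FRAC_BITS)) / y);
}

/* Truncates the percentage to an integer before scaling it. */
static int32_t busy_truncating(int32_t aperf, int32_t mperf, int32_t max_ps, int32_t cur_ps)
{
    assert(mperf > 0 && cur_ps > 0);
    int32_t core_pct = fp_toint(div_fp(int_tofp(aperf * 100), int_tofp(mperf)));
    int32_t scale = div_fp(int_tofp(max_ps), int_tofp(cur_ps));
    return fp_toint(mul_fp(int_tofp(core_pct), scale));
}

/* Keeps every intermediate value in fixed point; truncates only the result. */
static int32_t busy_fixed_point(int32_t aperf, int32_t mperf, int32_t max_ps, int32_t cur_ps)
{
    assert(mperf > 0 && cur_ps > 0);
    int32_t core_pct_fp = div_fp(int_tofp(aperf * 100), int_tofp(mperf));
    int32_t scale = div_fp(int_tofp(max_ps), int_tofp(cur_ps));
    return fp_toint(mul_fp(core_pct_fp, scale));
}
```

For aperf=995, mperf=1000, max_ps=32, cur_ps=16 the truncating version loses the 0.5 from 99.5 before doubling it, so it reports a lower value than the fixed-point version.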
Srinivas Pandruvada 3dc642a398 intel_pstate: fix no_turbo
commit 1ccf7a1cdafadd02e33e8f3d74370685a0600ec6 upstream.

Even when the no_turbo sysfs attribute is set, some P-states in the turbo
region are still observed. This patch sets the IDA Engage bit when no_turbo
is set, to explicitly disengage turbo.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-02-13 13:48:04 -08:00
Nell Hardcastle 0b977de88f intel_pstate: Add Haswell CPU models
commit 6cdcdb793791f776ea9408581b1242b636d43b37 upstream.

Enable the intel_pstate driver for Haswell CPUs. One missing Ivy Bridge
model (0x3E) is also included. Models referenced from
tools/power/x86/turbostat/turbostat.c:has_nehalem_turbo_ratio_limit

Signed-off-by: Nell Hardcastle <nell@spicious.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-02-13 13:48:04 -08:00
Saravana Kannan dd16210636 cpufreq: cpu-boost: Fix deadlock in wake_up of sync threads
If wake_up() is called on the current task on a CPU, the call will wait
until the current task is switched out before it wakes it up again and
returns.

The sync notifier for a CPU always runs on that CPU.

These two together can result in a deadlock if the sync notifier on CPU A
tries to wake up the sync thread of CPU A as it goes to sleep (is the
current task). A previous commit fixed this by adding a check to the sync
notifier to not wake up the sync thread of CPU A if it's the current task.

But this is still not sufficient to prevent deadlocks.

Sync thread of CPU A could be the current task on CPU B and sync thread of
CPU B could be the current task on CPU A.  At this point, if sync notifier
of CPU A and B try to wake up the sync threads of CPU A and B, it will
result in CPU A waiting for the current task in CPU B to get switched out
and CPU B waiting for the current task in CPU A to get switched out.  This
will result in a deadlock.

Prevent this scenario from happening by pinning the sync threads of each
CPU to run on that CPU. By doing this, we guarantee that sync notifiers
will only try to wake up sync threads running on that CPU. The fix added by
"cpufreq: cpu-boost: Resolve deadlock when waking up sync thread" ensures a
deadlock doesn't happen when a sync notifier tries to wake up a sync thread
running on that CPU.

Change-Id: I864e545529722a23886dd5a82f66089155d2d193
Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
2014-01-31 17:22:22 -08:00
Saravana Kannan 6131333017 cpufreq: cpu-boost: Fix queue_delayed_work_on() race with hotplug
Calling queue_delayed_work_on() on a CPU that's in the process of getting
hotplugged out can result in that CPU infinitely looping in
msm_pm_wait_cpu_shutdown(). If queue_delayed_work_on() is called after the
CPU is hotplugged out, it could wake up the CPU without going through the
hotplug path and cause instability. To avoid this, make sure the CPU is
online, and stays online, while queuing work on it.

Change-Id: I1b4aae3db803e476b1a7676d08f495c1f38bb154
Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
2014-01-31 17:21:19 -08:00
Junjie Wu 0c3bb4bd90 cpufreq: Set policy for NULL before cleaning up kobj
__cpufreq_remove_dev_finish() cleans up the policy->kobj while
per_cpu(cpufreq_cpu_data) still contains a valid policy pointer.
This causes a race between kobject_get() called from cpufreq_cpu_get()
and cpu hotplug.

Set cpufreq_cpu_data to NULL before cleaning up kobject so that
subsequent cpufreq_cpu_get() will fail and thus not race with
cpu hotplug.

Change-Id: I0e2d1a64b7aac98aa69e137cc902e07d0edb786e
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2014-01-24 16:24:51 -08:00
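
The ordering change in commit 0c3bb4bd90 above can be pictured with a single-threaded user-space sketch. All names are hypothetical, the per-CPU table is a plain array, and the locking that the real code needs is left out; the only point is that the pointer is unpublished before the object is torn down.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_CPUS_SKETCH 4

struct policy_sketch {
    int refcount;
};

/* Stand-in for the per-CPU policy pointer. */
static struct policy_sketch *cpu_data_sketch[NR_CPUS_SKETCH];

/* Lookup: returns NULL once the slot has been cleared. */
static struct policy_sketch *cpu_get_sketch(unsigned int cpu)
{
    assert(cpu < NR_CPUS_SKETCH);
    struct policy_sketch *p = cpu_data_sketch[cpu];
    if (p)
        p->refcount++;
    return p;
}

/* Removal: unpublish first, then tear down. */
static void remove_dev_sketch(unsigned int cpu)
{
    assert(cpu < NR_CPUS_SKETCH);
    struct policy_sketch *p = cpu_data_sketch[cpu];
    cpu_data_sketch[cpu] = NULL;   /* step 1: later lookups now fail */
    free(p);                       /* step 2: safe to clean up */
}
```

If the two steps were swapped, a lookup running between them could take a reference on an object that is already being destroyed.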
Linux Build Service Account 912b94f9d2 Merge "cpufreq: interactive: Modifying sync_freq implementation" 2014-01-18 07:39:05 -08:00
Linux Build Service Account b839dcac69 Merge "cpufreq: interactive: Use default min_sample_time if SDF is zero" 2014-01-18 00:47:53 -08:00
Rohit Gupta 878010b9b1 cpufreq: interactive: Modifying sync_freq implementation
1. Check for up_threshold_any_cpu_freq instead of sync_freq to
   boost the frequency to sync_freq.
2. Change the load threshold name from sync_freq_load_threshold to
   up_threshold_any_cpu_load
3. Do not consider CPUs with load less than up_threshold_any_cpu_load
   while evaluating max load across CPUs

Change-Id: Ia0e537edbf38a5006c1a22f5c472daa0d086ffc9
Signed-off-by: Rohit Gupta <rohgup@codeaurora.org>
2014-01-16 16:45:02 -08:00
Linux Build Service Account 34c0f6190a Merge "cpufreq: ondemand: Fix update_sampling_rate race with hotplug" 2014-01-16 13:31:21 -08:00
Linux Build Service Account b94313f8d8 Merge "cpufreq: interactive: Fix null pointer dereference in interactive governor" 2014-01-16 13:28:56 -08:00
Dirk Brandewie d0ccf8a115 intel_pstate: Add X86_FEATURE_APERFMPERF to cpu match parameters.
commit 6cbd7ee10e2842a3d1f9b60abede1c8f3d1f1130 upstream.

KVM environments do not support APERF/MPERF MSRs. intel_pstate cannot
operate without these registers.

The previous validity checks in intel_pstate_msrs_not_valid() are
insufficient in nested KVMs.

References: https://bugzilla.redhat.com/show_bug.cgi?id=1046317
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-01-15 15:28:53 -08:00
Vijay Ganti d3d905242d cpufreq: interactive: Fix null pointer dereference in interactive governor
For the sync_freq feature currently we check pcpu->policy->cur frequency
for each online cpu.  But for a CPU that isn't using interactive governor
or for an offline CPU, pcpu->policy can be null or an invalid value.
This patch tries to avoid that scenario by using pcpu->target_freq
instead of policy->cur to get the frequency of an online CPU.

Kernel crash without this patch:
[   20.132373] Unable to handle kernel NULL pointer dereference at virtual address 00000028
[   20.132375] pgd = c34f34c0
[   20.132377] pgd = ef6f2440
[   20.132383] [00000028] *pgd=00000000
[   20.132385]
[   20.132388] [00000028] *pgd=2e98f003, *pmd=00000000
[   20.132390] Internal error: Oops: 205 [#1] PREEMPT SMP ARM
[   20.132394] Modules linked in:
[   20.132398] CPU: 0 PID: 1560 Comm: chown Tainted: G        W    3.10.0-perf-gb12057b-00001-ga2c6c16-dirty #7
[   20.132401] task: ef9af300 ti: ee49c000 task.ti: ee49c000
[   20.132411] PC is at cpufreq_interactive_timer+0x10c/0x650
[   20.132415] LR is at cpufreq_interactive_timer+0x128/0x650
<snip>
[   20.133002] [<c07eb204>] (cpufreq_interactive_timer+0x10c/0x650) from [<c02804d8>] (call_timer_fn+0x80/0x198)
[   20.133012] [<c02804d8>] (call_timer_fn+0x80/0x198) from [<c0280acc>] (run_timer_softirq+0x1f8/0x270)
[   20.133019] [<c0280acc>] (run_timer_softirq+0x1f8/0x270) from [<c0279e20>] (__do_softirq+0x12c/0x2d4)
[   20.133025] [<c0279e20>] (__do_softirq+0x12c/0x2d4) from [<c027a2d4>] (irq_exit+0x74/0xc8)
[   20.133034] [<c027a2d4>] (irq_exit+0x74/0xc8) from [<c0206a00>] (handle_IRQ+0x68/0x8c)
[   20.133041] [<c0206a00>] (handle_IRQ+0x68/0x8c) from [<c02004b8>] (gic_handle_irq+0x3c/0x60)
[   20.133051] [<c02004b8>] (gic_handle_irq+0x3c/0x60) from [<c0ac6900>] (__irq_svc+0x40/0x70)
<snip>

Change-Id: Ie834f5d383de4d41e0fe6fbd40c8b0a0c05d82f5
Signed-off-by: Vijay Ganti <viganti@codeaurora.org>
2014-01-14 15:14:11 -08:00
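
A simplified picture of the fix in commit d3d905242d above: read a field the governor owns rather than following a pointer that may be NULL. The struct layout and function name are assumptions for this sketch, not the governor's actual data structures.

```c
#include <assert.h>
#include <stddef.h>

struct policy_view {
    unsigned int cur;
};

struct cpu_info_view {
    struct policy_view *policy;   /* may be NULL for offline or other-governor CPUs */
    unsigned int target_freq;     /* maintained by the governor itself */
};

/* Highest target frequency across CPUs, without touching ->policy. */
static unsigned int max_target_freq(const struct cpu_info_view *cpus, size_t n)
{
    unsigned int max = 0;

    assert(cpus != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (cpus[i].target_freq > max)
            max = cpus[i].target_freq;
    }
    return max;
}
```

Because the loop never dereferences policy, a CPU whose policy pointer is NULL cannot trigger the oops shown in the commit message.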
Linux Build Service Account 8cf919625e Merge "cpufreq: Set policy to non-NULL only after all hotplug online work is done" 2014-01-14 02:29:17 -08:00
Rohit Gupta 61c9b7b970 cpufreq: ondemand: Fix update_sampling_rate race with hotplug
update_sampling_rate has a for loop which goes through each
online cpu and possibly queues up the ondemand work for them.
But while doing this it doesn't take any hotplug lock, which
could potentially cause a race condition where ondemand work
is queued after the hotplug code (which sets the policy to NULL)
in the governor has cancelled any pending work. This could cause
a crash while trying to access the NULL policy in dbs_check_cpu.

Protect the for_each_online_cpu loop with get_online_cpus()
and put_online_cpus().

Change-Id: Ia3f43ca7e4bed542834ab03ca1191d728f13311c
Signed-off-by: Rohit Gupta <rohgup@codeaurora.org>
2014-01-13 15:43:17 -08:00
Linux Build Service Account d86b06e030 Merge "cpufreq: Call cpufreq_update_policy() during cpufreq_stats_init()" 2014-01-10 01:27:42 -08:00
Saravana Kannan df2bff3319 cpufreq: Set policy to non-NULL only after all hotplug online work is done
The existing code sets the per CPU policy to a non-NULL value before all
the steps performed during the hotplug online path are done. Specifically,
this is done before the policy min/max, governors, etc are initialized for
the policy.  This in turn means that calls to cpufreq_cpu_get() return a
non-NULL policy before the policy/CPU is ready to be used.

To fix this, move the update of per CPU policy to a valid value after all
the initialization steps for the policy are completed.

Example kernel panic without this fix:
[  512.146185] Unable to handle kernel NULL pointer dereference at virtual address 00000020
[  512.146195] pgd = c0003000
[  512.146213] [00000020] *pgd=80000000004003, *pmd=00000000
[  512.146228] Internal error: Oops: 206 [#1] PREEMPT SMP ARM
<snip>
[  512.146297] PC is at __cpufreq_governor+0x10/0x1ac
[  512.146312] LR is at cpufreq_update_policy+0x114/0x150
<snip>
[  512.149740] ---[ end trace f23a8defea6cd706 ]---
[  512.149761] Kernel panic - not syncing: Fatal exception
[  513.152016] CPU0: stopping
[  513.154710] CPU: 0 PID: 7136 Comm: mpdecision Tainted: G      D W    3.10.0-gd727407-00074-g979ede8 #396
<snip>
[  513.317224] [<c0afe180>] (notifier_call_chain+0x40/0x68) from [<c02a23ac>] (__blocking_notifier_call_chain+0x40/0x58)
[  513.327809] [<c02a23ac>] (__blocking_notifier_call_chain+0x40/0x58) from [<c02a23d8>] (blocking_notifier_call_chain+0x14/0x1c)
[  513.339182] [<c02a23d8>] (blocking_notifier_call_chain+0x14/0x1c) from [<c0803c68>] (cpufreq_set_policy+0xd4/0x2b8)
[  513.349594] [<c0803c68>] (cpufreq_set_policy+0xd4/0x2b8) from [<c0803e7c>] (cpufreq_init_policy+0x30/0x98)
[  513.359231] [<c0803e7c>] (cpufreq_init_policy+0x30/0x98) from [<c0805a18>] (__cpufreq_add_dev.isra.17+0x4dc/0x7a4)
[  513.369560] [<c0805a18>] (__cpufreq_add_dev.isra.17+0x4dc/0x7a4) from [<c0805d38>] (cpufreq_cpu_callback+0x58/0x84)
[  513.379978] [<c0805d38>] (cpufreq_cpu_callback+0x58/0x84) from [<c0afe180>] (notifier_call_chain+0x40/0x68)
[  513.389704] [<c0afe180>] (notifier_call_chain+0x40/0x68) from [<c02812dc>] (__cpu_notify+0x28/0x44)
[  513.398728] [<c02812dc>] (__cpu_notify+0x28/0x44) from [<c0aeed90>] (_cpu_up+0xf4/0x1dc)
[  513.406797] [<c0aeed90>] (_cpu_up+0xf4/0x1dc) from [<c0aeeed4>] (cpu_up+0x5c/0x78)
[  513.414357] [<c0aeeed4>] (cpu_up+0x5c/0x78) from [<c0aec808>] (store_online+0x44/0x74)
[  513.422253] [<c0aec808>] (store_online+0x44/0x74) from [<c03a40f4>] (sysfs_write_file+0x108/0x14c)
[  513.431195] [<c03a40f4>] (sysfs_write_file+0x108/0x14c) from [<c03517d4>] (vfs_write+0xd0/0x180)
[  513.439958] [<c03517d4>] (vfs_write+0xd0/0x180) from [<c0351ca8>] (SyS_write+0x38/0x68)
[  513.447947] [<c0351ca8>] (SyS_write+0x38/0x68) from [<c0205de0>] (ret_fast_syscall+0x0/0x30)

In this specific case, CPU0 sets CPU1's policy->governor in
cpufreq_init_policy() to NULL while CPU1 is using the policy->governor in
__cpufreq_governor().

Change-Id: I8b8bc7114e44744f6f38925e4ed710126a45075d
Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
2014-01-09 19:51:09 -08:00
Junjie Wu f30be559b1 cpufreq: Call cpufreq_update_policy() during cpufreq_stats_init()
Commit da12b9488e (cpufreq: Fix misplaced call to cpufreq_update_policy())
removed cpufreq_update_policy() that was in cpufreq_stats_init(). The fix
of moving cpufreq_update_policy() in hotplug notifier is still valid.

However, by removing the call in cpufreq_stats_init(), it misses a
corner case where cpufreq_stats registers last and no hotplug or
cpufreq change happens afterwards. Then a CPUFREQ_NOTIFY will never
be sent to cpufreq_stats hotplug notifier, and sysfs won't be created.

Put back the cpufreq_update_policy() call after registering for hotplug
notifier in cpufreq_stats_init(). The call will force a CPUFREQ_NOTIFY
notification to be sent and sysfs to be created. In case cpufreq
driver is not registered, it will just fail silently.

Change-Id: I6d792d0ba600a7d8c70ccb2adae1e2cbadc0463e
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2014-01-09 19:06:21 -08:00
Linux Build Service Account 60040c6828 Merge "cpufreq: cpu-boost: Resolve deadlock when waking up sync thread" 2014-01-09 15:55:19 -08:00
Rafael J. Wysocki ec84b71390 intel_pstate: Fail initialization if P-state information is missing
commit 98a947abdd54e5de909bebadfced1696ccad30cf upstream.

If pstate.current_pstate is 0 after the initial
intel_pstate_get_cpu_pstates(), this means that we were unable to
obtain any useful P-state information and there is no reason to
continue, so free memory and return an error in that case.

This fixes the following divide error occurring in a nested KVM
guest:

Intel P-state driver initializing.
Intel pstate controlling: cpu 0
cpufreq: __cpufreq_add_dev: ->get() failed
divide error: 0000 [#1] SMP
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.13.0-0.rc4.git5.1.fc21.x86_64 #1
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
task: ffff88001ea20000 ti: ffff88001e9bc000 task.ti: ffff88001e9bc000
RIP: 0010:[<ffffffff815c551d>]  [<ffffffff815c551d>] intel_pstate_timer_func+0x11d/0x2b0
RSP: 0000:ffff88001ee03e18  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88001a454348 RCX: 0000000000006100
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff88001ee03e38 R08: 0000000000000000 R09: 0000000000000000
R10: ffff88001ea20000 R11: 0000000000000000 R12: 00000c0a1ea20000
R13: 1ea200001ea20000 R14: ffffffff815c5400 R15: ffff88001a454348
FS:  0000000000000000(0000) GS:ffff88001ee00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000001c0c000 CR4: 00000000000006f0
Stack:
 fffffffb1a454390 ffffffff821a4500 ffff88001a454390 0000000000000100
 ffff88001ee03ea8 ffffffff81083e9a ffffffff81083e15 ffffffff82d5ed40
 ffffffff8258cc60 0000000000000000 ffffffff81ac39de 0000000000000000
Call Trace:
 <IRQ>
 [<ffffffff81083e9a>] call_timer_fn+0x8a/0x310
 [<ffffffff81083e15>] ? call_timer_fn+0x5/0x310
 [<ffffffff815c5400>] ? pid_param_set+0x130/0x130
 [<ffffffff81084354>] run_timer_softirq+0x234/0x380
 [<ffffffff8107aee4>] __do_softirq+0x104/0x430
 [<ffffffff8107b5fd>] irq_exit+0xcd/0xe0
 [<ffffffff81770645>] smp_apic_timer_interrupt+0x45/0x60
 [<ffffffff8176efb2>] apic_timer_interrupt+0x72/0x80
 <EOI>
 [<ffffffff810e15cd>] ? vprintk_emit+0x1dd/0x5e0
 [<ffffffff81757719>] printk+0x67/0x69
 [<ffffffff815c1493>] __cpufreq_add_dev.isra.13+0x883/0x8d0
 [<ffffffff815c14f0>] cpufreq_add_dev+0x10/0x20
 [<ffffffff814a14d1>] subsys_interface_register+0xb1/0xf0
 [<ffffffff815bf5cf>] cpufreq_register_driver+0x9f/0x210
 [<ffffffff81fb19af>] intel_pstate_init+0x27d/0x3be
 [<ffffffff81761e3e>] ? mutex_unlock+0xe/0x10
 [<ffffffff81fb1732>] ? cpufreq_gov_dbs_init+0x12/0x12
 [<ffffffff8100214a>] do_one_initcall+0xfa/0x1b0
 [<ffffffff8109dbf5>] ? parse_args+0x225/0x3f0
 [<ffffffff81f64193>] kernel_init_freeable+0x1fc/0x287
 [<ffffffff81f638d0>] ? do_early_param+0x88/0x88
 [<ffffffff8174b530>] ? rest_init+0x150/0x150
 [<ffffffff8174b53e>] kernel_init+0xe/0x130
 [<ffffffff8176e27c>] ret_from_fork+0x7c/0xb0
 [<ffffffff8174b530>] ? rest_init+0x150/0x150
Code: c1 e0 05 48 63 bc 03 10 01 00 00 48 63 83 d0 00 00 00 48 63 d6 48 c1 e2 08 c1 e1 08 4c 63 c2 48 c1 e0 08 48 98 48 c1 e0 08 48 99 <49> f7 f8 48 98 48 0f af f8 48 c1 ff 08 29 f9 89 ca c1 fa 1f 89
RIP  [<ffffffff815c551d>] intel_pstate_timer_func+0x11d/0x2b0
 RSP <ffff88001ee03e18>
---[ end trace f166110ed22cc37a ]---
Kernel panic - not syncing: Fatal exception in interrupt

Reported-and-tested-by: Kashyap Chamarthy <kchamart@redhat.com>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-01-09 12:24:23 -08:00
Saravana Kannan 93d462463f cpufreq: Fix policy getting stuck when user & kernel min/max don't overlap
Every __cpufreq_set_policy starts with checking the new policy min/max has
some overlap with the current policy min/max. This works out fine until we
end up with the policy min/max being set to a range that doesn't overlap
with the user policy min/max. Once we get into this situation, the check at
the start of __cpufreq_set_policy fails and prevents us from getting out of
this state.

This only happens when one of the CPUFREQ_ADJUST/CPUFREQ_INCOMPATIBLE
notifiers called inside __cpufreq_set_policy picks a min/max outside the
range of user policy min/max.

The real intent of the check at the start of __cpufreq_set_policy is to
make sure userspace can't set user policy min > user policy max. Since
__cpufreq_set_policy always gets called only with current user policy
min/max except when the actual user space policy min/max is changed, we can
fix the issue by simply checking the new policy min/max against current
user policy min/max.

Change-Id: Iaac805825e64d7985c41fb9052bd96baacdf3d6f
Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
2014-01-07 17:43:56 -08:00
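
The stuck state described in commit 93d462463f above can be shown with two interval checks. This is a sketch under the assumption that the check is a plain overlap test; the struct, the function names, and the example frequencies are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

struct freq_range {
    unsigned int min, max;
};

static bool ranges_overlap(struct freq_range a, struct freq_range b)
{
    assert(a.min <= a.max && b.min <= b.max);
    return a.min <= b.max && a.max >= b.min;
}

/* Old behaviour as described: compare the request with the current policy range. */
static bool accept_old(struct freq_range request, struct freq_range current_policy)
{
    return ranges_overlap(request, current_policy);
}

/* New behaviour as described: compare the request with the user policy range. */
static bool accept_new(struct freq_range request, struct freq_range user_policy)
{
    return ranges_overlap(request, user_policy);
}
```

If a notifier has pushed the current policy to 1800000..2000000 while the user policy is 300000..1500000, re-applying the user policy fails the old check forever but passes the new one.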
Junjie Wu e0cc22a8a0 cpufreq: interactive: Use correct kobj when creating sysfs
cpufreq_global_kobject is no longer initialized during cpufreq_core_init.
Fix sysfs creation by properly requesting the kobject based on
have_governor_per_policy().

Change-Id: I5f9cff68043dad8822952bd43227d948a934a1c7
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-23 14:10:55 -08:00
lan,Tianyu 43c1a00e9d cpufreq: governor: Remove fossil comment in the cpufreq_governor_dbs()
The related code has been changed and the comment is out of date.
So remove it.

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: aae467c79b14db0d286764ed9ddbaefe3715ebd2
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:20 -08:00
Xiaoguang Chen 8c66d27021 cpufreq: conservative: set requested_freq to policy max when it is over policy max
When requested_freq is over policy->max, set it to policy->max.
This can help to speed up decreasing frequency.

Signed-off-by: Xiaoguang Chen <chenxg@marvell.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 6d7bcb1464a89181ddc4b4584ad6e0c7566ae31b
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:20 -08:00
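
Why the clamp in commit 8c66d27021 above speeds up frequency reduction can be seen by counting step-downs. The helper names and the fixed step size are assumptions made for this sketch; the governor's real stepping logic is more involved.

```c
#include <assert.h>

/* The clamp described by the commit. */
static unsigned int clamp_requested(unsigned int requested, unsigned int policy_max)
{
    return requested > policy_max ? policy_max : requested;
}

/* Number of step-downs needed before requested falls below policy_max. */
static unsigned int steps_to_drop_below(unsigned int requested, unsigned int policy_max,
                                        unsigned int step)
{
    unsigned int steps = 0;

    assert(step > 0);
    while (requested >= policy_max && requested >= step) {
        requested -= step;
        steps++;
    }
    return steps;
}
```

Starting from 2000000 with a policy max of 1500000 and a step of 100000, six step-downs pass before the request has any visible effect; after clamping, one is enough.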
Xiaoguang Chen 0d941cead1 cpufreq: conservative: fix requested_freq reduction issue
When decreasing frequency, requested_freq may be less than
freq_target, so requested_freq minus freq_target may be negative.
But requested_freq's type is unsigned int, so the negative result
wraps around to a large integer which may be even higher than
requested_freq.

This patch fixes that issue: when the result would become negative,
set requested_freq to the min value of the policy.

Signed-off-by: Xiaoguang Chen <chenxg@marvell.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 3baa976ae644f76f5cdb5be0fb26754c3bfb32cb
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:19 -08:00
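
The wrap-around fixed by commit 0d941cead1 above is easy to demonstrate in isolation. The two functions below are a sketch with invented names; the guarded version follows the behaviour the commit message describes (fall back to the policy minimum).

```c
#include <assert.h>

/* Naive step-down: wraps around when freq_target exceeds requested_freq. */
static unsigned int step_down_naive(unsigned int requested_freq, unsigned int freq_target)
{
    return requested_freq - freq_target;
}

/* Guarded step-down: falls back to the policy minimum instead of wrapping. */
static unsigned int step_down_checked(unsigned int requested_freq, unsigned int freq_target,
                                      unsigned int policy_min)
{
    assert(freq_target > 0);
    if (requested_freq > freq_target)
        return requested_freq - freq_target;
    return policy_min;
}
```

With requested_freq=200000 and freq_target=300000 the naive version returns a value far above 200000, so the frequency would rise instead of fall; the guarded version returns the policy minimum.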
Stratos Karafotis e2b0e8ad81 cpufreq: ondemand: Remove redundant return statement
After commit dfa5bb622555 (cpufreq: ondemand: Change the calculation
of target frequency), this return statement is no longer needed.

Reported-by: Henrik Nilsson <Karl.Henrik.Nilsson@gmail.com>
Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 880eef041655b35f9aa488726ea3c4303a4f2204
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:19 -08:00
Viresh Kumar c0ff1bbed6 cpufreq: move freq change notifications to cpufreq core
Most of the drivers do following in their ->target_index() routines:

	struct cpufreq_freqs freqs;
	freqs.old = old freq...
	freqs.new = new freq...

	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);

	/* Change rate here */

	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);

This is replicated over all cpufreq drivers today and there doesn't exist a
good enough reason why this shouldn't be moved to cpufreq core instead.

There are a few special cases though, like exynos5440, which don't do everything
in the call to the ->target_index() routine and instead use some kind of bottom
half (work, tasklet, etc.) to do this work.

They may continue doing notification from their own code as the flag
CPUFREQ_ASYNC_NOTIFICATION is already set for them.

All drivers are also modified in this patch to avoid breaking 'git bisect', as
double notification would happen otherwise.

Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Russell King <linux@arm.linux.org.uk>
Acked-by: Stephen Warren <swarren@nvidia.com>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Reviewed-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: d4019f0a92ab802f385cc9c8ad3ab7b5449712cb
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[junjiew@codeaurora.org: dropped all conflicted changes in arch specific
 cpufreq files that we don't use]
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:19 -08:00
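
The division of labour introduced by commit c0ff1bbed6 above might look roughly like this when modelled in user space. Everything here is a simplified stand-in: the flag value, the event log, and the single-argument callback are assumptions, and the real routines take a policy pointer and a cpufreq_freqs structure.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_ASYNC_NOTIFICATION 0x1u   /* stand-in for CPUFREQ_ASYNC_NOTIFICATION */

enum event_sketch { EV_PRECHANGE, EV_DRIVER_CALL, EV_POSTCHANGE };

static enum event_sketch event_log[8];
static size_t event_count;

static void log_event(enum event_sketch ev)
{
    assert(event_count < sizeof(event_log) / sizeof(event_log[0]));
    event_log[event_count++] = ev;
}

/* Hypothetical driver callback: only changes the rate, sends no notifications. */
static int demo_target_index(unsigned int index)
{
    (void)index;
    log_event(EV_DRIVER_CALL);
    return 0;
}

/* Core-side wrapper: brackets the driver call unless the driver notifies by itself. */
static int core_target_index(unsigned int driver_flags,
                             int (*target_index)(unsigned int), unsigned int index)
{
    int notify = !(driver_flags & SKETCH_ASYNC_NOTIFICATION);
    int ret;

    if (notify)
        log_event(EV_PRECHANGE);
    ret = target_index(index);
    if (notify)
        log_event(EV_POSTCHANGE);
    return ret;
}
```

A driver that sets the async flag is called without the surrounding notifications, which matches the exception the commit message makes for drivers that finish the change from a bottom half.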
Viresh Kumar a6a7321205 cpufreq: distinguish drivers that do asynchronous notifications
There are a few special cases like exynos5440 which don't send the POSTCHANGE
notification from their ->target() routine and instead use some kind of bottom
half (work, tasklet, etc.) for doing this work, from which they finally send the
POSTCHANGE notification.

It's better if we distinguish them from other cpufreq drivers in some way so that
the core can handle them specially. So this patch introduces another flag:
CPUFREQ_ASYNC_NOTIFICATION, which will be set by such drivers.

This also changes exynos5440-cpufreq.c and powernow-k8 in order to set this
flag.

Acked-by: Amit Daniel Kachhap <amit.daniel@samsung.com>
Acked-by: Kukjin Kim <kgene.kim@samsung.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 7dbf694db6ac7c759599316d50d7050efcbd512a
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[junjiew@codeaurora.org: resolve trivial merge conflicts]
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:19 -08:00
Nicolas Pitre 4d0f60f2c7 cpufreq: arm_big_little: reconfigure switcher behavior at run time
The b.L switcher can be turned on/off at run time.  It is therefore
necessary to change the cpufreq driver behavior accordingly.

The driver must be unregistered/registered with the cpufreq core
to reconfigure freq tables for the virtual or actual CPUs. This is
accomplished via the b.L switcher notifier callback.

Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 45cac118ffd7c9920b3d85bf551c2205674eb4f2
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:17 -08:00
Viresh Kumar 3b35d5501e cpufreq: arm_big_little: add in-kernel switching (IKS) support
This patch adds IKS (In Kernel Switcher) support to cpufreq driver.

This creates a combined freq table for A7-A15 CPU pairs. A7 frequencies
are virtualized and scaled down to half the actual frequencies to
approximate a linear scale across the combined A7+A15 range. When the
requested frequency change crosses the A7-A15 boundary a cluster switch
is invoked.

Based on earlier work from Sudeep KarkadaNagesha.

Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: e79a23c5b9870b7f80425793abeb10e57f7486d4
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:17 -08:00
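
The frequency virtualization in commit 3b35d5501e above can be sketched as a pair of conversions. The enum, the function names, and in particular the boundary rule in pick_cluster are assumptions for illustration; only the halving of A7 frequencies comes from the commit message.

```c
#include <assert.h>

enum cluster_sketch { CLUSTER_A15, CLUSTER_A7 };

/* Frequency as it appears in the combined table: A7 entries are halved. */
static unsigned int virt_freq(enum cluster_sketch cluster, unsigned int freq)
{
    assert(cluster == CLUSTER_A15 || cluster == CLUSTER_A7);
    return cluster == CLUSTER_A7 ? freq / 2 : freq;
}

/* Inverse mapping, from a table entry back to the real clock rate. */
static unsigned int actual_freq(enum cluster_sketch cluster, unsigned int freq)
{
    assert(cluster == CLUSTER_A15 || cluster == CLUSTER_A7);
    return cluster == CLUSTER_A7 ? freq * 2 : freq;
}

/* Assumed rule: virtual frequencies below the A15 minimum are served by the A7. */
static enum cluster_sketch pick_cluster(unsigned int virt, unsigned int a15_min)
{
    return virt < a15_min ? CLUSTER_A7 : CLUSTER_A15;
}
```

When consecutive requests map to different clusters under a rule like this, the driver would invoke a cluster switch instead of only changing the clock rate.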
viresh kumar ca72f8bfb0 cpufreq: create per policy rwsem instead of per CPU cpu_policy_rwsem
We have per-CPU cpu_policy_rwsem for cpufreq core, but we never use
all of them. We always use rwsem of policy->cpu and so we can
actually make this rwsem per policy instead.

This patch does this change. With this change other tricky situations
are also avoided now, like which lock to take while we are changing
policy->cpu, etc.

Suggested-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: ad7722dab7292dbc1c4586d701ac226b68122d39
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:17 -08:00
Viresh Kumar 28ffa72403 cpufreq: Implement light weight ->target_index() routine
Currently, the prototype of cpufreq_drivers target routines is:

int target(struct cpufreq_policy *policy, unsigned int target_freq,
		unsigned int relation);

And most of the drivers call cpufreq_frequency_table_target() to get a valid
index of their frequency table which is closest to the target_freq. And they
don't use target_freq and relation after that.

So, it makes sense to just do this work in cpufreq core before calling
cpufreq_frequency_table_target() and simply pass index instead. But this can be
done only with drivers which expose their frequency table with cpufreq core. For
others we need to stick with the old prototype of target() until those drivers
are converted to expose frequency tables.

This patch implements the new light weight prototype for target_index() routine.
It looks like this:

int target_index(struct cpufreq_policy *policy, unsigned int index);

CPUFreq core will call cpufreq_frequency_table_target() before calling this
routine and pass index to it. Because CPUFreq core now needs to call routines
present in freq_table.c, CONFIG_CPU_FREQ_TABLE must be enabled all the time.

This also marks the target() interface as deprecated, so that new drivers avoid
using it. Documentation is updated accordingly.

It also converts the existing .target() to the newly defined light weight
.target_index() routine for many drivers.
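
As an illustration only, the split described above can be sketched in plain
userspace C. The structures, the lookup rule (lowest table frequency at or
above the request, else the highest entry) and all names below are simplified
assumptions, not the kernel's actual definitions:

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct freq_entry {
	unsigned int frequency;		/* kHz */
};

struct policy {
	const struct freq_entry *table;
	size_t table_len;
	unsigned int cur;		/* kHz */
};

struct driver {
	/* New-style callback: receives a table index, not a frequency. */
	int (*target_index)(struct policy *policy, unsigned int index);
};

/*
 * Assumed lookup rule: lowest table frequency at or above target_freq,
 * else the highest entry. Requires table_len > 0.
 */
static unsigned int table_target(const struct policy *p,
				 unsigned int target_freq)
{
	unsigned int best = 0, highest = 0, i;
	int found = 0;

	for (i = 0; i < p->table_len; i++) {
		unsigned int f = p->table[i].frequency;

		if (f > p->table[highest].frequency)
			highest = i;
		if (f >= target_freq &&
		    (!found || f < p->table[best].frequency)) {
			best = i;
			found = 1;
		}
	}
	return found ? best : highest;
}

/* Core side: resolve the index once, then hand it to the driver. */
static int core_set_target(struct driver *d, struct policy *p,
			   unsigned int target_freq)
{
	if (p->table_len == 0)
		return -1;
	return d->target_index(p, table_target(p, target_freq));
}

/* Driver side: no table lookup left to do, just program the entry. */
static int demo_target_index(struct policy *p, unsigned int index)
{
	p->cur = p->table[index].frequency;
	return 0;
}
```

With this shape the driver callback never sees target_freq or relation, which
is the point the message above makes.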

Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Russell King <linux@arm.linux.org.uk>
Acked-by: David S. Miller <davem@davemloft.net>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rjw@rjwysocki.net>
Git-commit: 9c0ebcf78fde0ffa348a95a544c6d3f2dac5af65
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[junjiew@codeaurora.org: ignored all arch specific files that generated
 conflicts]
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:16 -08:00
Lan Tianyu ebb72a01ef cpufreq / governor: Remove fossil comment
cpufreq_set_policy() has been changed to the original __cpufreq_set_policy()
and policy->lock has been converted to a read-write lock by commit 5a01f2.
So remove the comment.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: a814613b9a32d9ab9578d9dab396265c826d37f0
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:16 -08:00
Srivatsa S. Bhat 0e6d14088d cpufreq: Detect spurious invocations of update_policy_cpu()
The function update_policy_cpu() is expected to be called when the policy->cpu
of a cpufreq policy is to be changed: ie., the new CPU nominated to become the
policy->cpu is different from the old one.

Print a warning if it is invoked with new_cpu == old_cpu, since such an
invocation might hint at faulty logic in the caller.

Suggested-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 99ec899eafe2ec0a7dd96e9de5fa0a2bea3032ba
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:16 -08:00
Sudeep KarkadaNagesha afc20abfb0 cpufreq: arm-big-little: use clk_get instead of clk_get_sys
Currently clk_get_sys is used with cpu-cluster.<n> as the device id
which is incorrect. It should be connection/consumer ID instead.

It is possible to specify input clock in the cpu device node along
with the optional clock-name. clk_get_sys can't handle that.

This patch replaces clk_get_sys with clk_get to extend support for
clocks specified in the device tree cpu node.

Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 076dec90fc32c830184b0f0fa1842a6de1199bc6
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:16 -08:00
Viresh Kumar 67f699ebc1 cpufreq: remove CONFIG_CPU_FREQ_TABLE
CONFIG_CPU_FREQ_TABLE will be always enabled when cpufreq framework is used, as
cpufreq core depends on it. So, we don't need this CONFIG option anymore as it
is not configurable. Remove CONFIG_CPU_FREQ_TABLE and update its users.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 3bc28ab6da039f8020bbcea8e832b63a900bdb66
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[junjiew@codeaurora.org: resolved merge conflicts for Kconfig.arm and
 Kconfig.powerpc by ignoring missing configs. Searched and removed
 CPU_FREQ_TABLE config in our tree (arch/arm/mach-tegra/Kconfig,
 arch/powerpc/platforms/Kconfig, Documentation/android.txt). These conflicts
 are generated because we don't pull Kconfig changes for archs we don't use.]
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:16 -08:00
Viresh Kumar 45b1c5e57e cpufreq: create cpufreq_generic_init() routine
Many CPUFreq drivers for SMP systems (where all cores share the same clock
lines) do similar things in their ->init() part.

This patch creates a generic routine in cpufreq core which can be used by these
so that we can remove some redundant code.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 70e9e778337973d5bf57004092b360bd3f3c412f
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:15 -08:00
Viresh Kumar b88c9a3589 cpufreq: arm_big_little: don't initialize part of policy that is set by core
Many common initializations of struct policy are moved to core now and hence
this driver doesn't need to do it. This patch removes such code.

Most recent of those changes is to call ->get() in the core after calling
->init().

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: e4c8afe3a06c682e215c3e38240126b652fa98d0
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:15 -08:00
Viresh Kumar 3db386d97a cpufreq: call cpufreq_driver->get() after calling ->init()
Almost all drivers set policy->cur with current CPU frequency in their ->init()
part. This can be done for all of them at core level and so they wouldn't need
to do it.

This patch adds supporting code in cpufreq core for calling get() after we have
called init() for a policy.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: da60ce9f2faca87013fd3cab1c3bed5183608c3d
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:15 -08:00
Viresh Kumar 1a491aee5a cpufreq: arm_big_little: Use generic cpufreq routines
Most of the CPUFreq drivers do similar things in .exit() and .verify() routines
and .attr. So it's better if we have generic routines for them which can be
used by cpufreq drivers.

This patch uses these generic routines in the arm_big_little driver.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 3c75a1503f2c5ca91279436b1f573002c869ef06
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:15 -08:00
Viresh Kumar 00d1c89bb8 cpufreq: define generic .attr, .exit() and .verify() routines
Most of the CPUFreq drivers do similar things in .exit() and .verify() routines
and .attr. So it's better if we have generic routines for them which can be
used by cpufreq drivers.

This patch introduces generic .attr, .exit() and .verify() for cpufreq drivers.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 184345129c53e76069c209f9912ed7c457eceb31
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:14 -08:00
Viresh Kumar 74253297ce cpufreq: add new routine cpufreq_verify_within_cpu_limits()
Most of the users of cpufreq_verify_within_limits() call it for
limiting with min/max from policy->cpuinfo. We can make that code
simpler by introducing another routine which will do this for them
automatically.

This patch adds another routine cpufreq_verify_within_cpu_limits()
and updates others to use it.
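
A minimal sketch of the idea, assuming simplified stand-in structures (the
struct layout and function names here are hypothetical, not the kernel's):
the new helper only forwards the limits already stored in the policy's
cpuinfo to the generic clamping routine.

```c
/* Hypothetical, simplified policy: requested limits plus hardware limits. */
struct cpuinfo {
	unsigned int min_freq, max_freq;	/* kHz */
};

struct policy_limits {
	unsigned int min, max;			/* kHz */
	struct cpuinfo cpuinfo;
};

/* Generic clamp: force policy min/max into [min_freq, max_freq]. */
static void verify_within_limits(struct policy_limits *p,
				 unsigned int min_freq,
				 unsigned int max_freq)
{
	if (p->min < min_freq)
		p->min = min_freq;
	if (p->max < min_freq)
		p->max = min_freq;
	if (p->min > max_freq)
		p->min = max_freq;
	if (p->max > max_freq)
		p->max = max_freq;
	if (p->min > p->max)
		p->min = p->max;
}

/* Convenience wrapper: callers no longer pass cpuinfo limits by hand. */
static void verify_within_cpu_limits(struct policy_limits *p)
{
	verify_within_limits(p, p->cpuinfo.min_freq, p->cpuinfo.max_freq);
}
```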

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: be49e3465f222b4b796be8a21d14afbfd8f5d20f
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:14 -08:00
Viresh Kumar 9e07689844 cpufreq: use cpufreq_driver->flags to mark CPUFREQ_HAVE_GOVERNOR_PER_POLICY
Use cpufreq_driver->flags to mark CPUFREQ_HAVE_GOVERNOR_PER_POLICY instead
of a separate field within cpufreq_driver. This will save some bytes of
memory.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 0b981e70748861a3e10ea2e2a689bdcee3e15085
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:14 -08:00
Viresh Kumar de2f4cb19f cpufreq: rename __cpufreq_set_policy() as cpufreq_set_policy()
Earlier there used to be two functions named __cpufreq_set_policy() and
cpufreq_set_policy(), but now that we only have a single routine, let's name
it cpufreq_set_policy() instead of __cpufreq_set_policy().

This also removes some invalid comments or fixes some incorrect comments.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 037ce8397d23b2f84ccfb879cf4b43277b0454e3
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[junjiew@codeaurora.org: resolve trivial merge conflicts]
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:13 -08:00
Viresh Kumar 442f568915 cpufreq: Optimize cpufreq_frequency_table_verify()
cpufreq_frequency_table_verify() is rewritten here to make it more logical
and efficient.
 - merge multiple lines for variable declarations together.
 - quit early if any frequency between min/max is found.
 - don't call cpufreq_verify_within_limits() in case any valid freq is
   found as it is of no use.
 - rename the count variable as found and change its type to boolean.
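
The early-exit shape described in the list above could look roughly like the
following userspace sketch. It assumes a plain array of frequencies and a
hypothetical limits struct; invalid-entry handling and the final re-clamp
against the CPU limits are left out.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the policy's requested limits (kHz). */
struct limits {
	unsigned int min, max;
};

/*
 * Returns true as soon as one table frequency lies within [min, max].
 * If none does, raise max to the smallest table frequency above it.
 */
static bool table_verify(struct limits *lim, const unsigned int *freqs,
			 size_t n)
{
	unsigned int next_larger = UINT_MAX;
	bool found = false;
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned int f = freqs[i];

		if (f >= lim->min && f <= lim->max) {
			found = true;	/* quit early, nothing to adjust */
			break;
		}
		if (f > lim->max && f < next_larger)
			next_larger = f;
	}

	if (!found && next_larger != UINT_MAX)
		lim->max = next_larger;
	return found;
}
```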

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 77db50c4eb1991d6e88254390ec368e1d23a8fa5
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:13 -08:00
Viresh Kumar 4beb48071b cpufreq: remove __cpufreq_remove_dev()
Nobody except cpufreq_remove_dev() calls __cpufreq_remove_dev() and
so we don't need two separate routines here. Merge code from
__cpufreq_remove_dev() into cpufreq_remove_dev() and get rid of
__cpufreq_remove_dev().

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 27a862e98341226a50835f29aa26ffa528215ecc
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:13 -08:00
Viresh Kumar 780b477ed7 cpufreq: don't break string in print statements
As a rule it's better not to break a string (quoted inside "") in a
print statement even if it crosses the 80 column boundary, as that may
introduce bugs, and so this patch rewrites one of the print statements.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 75949c9a1fe0fd07983788449059337edac2b9f6
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:13 -08:00
Viresh Kumar 59f75cc637 cpufreq: Remove extra blank line
We don't need a blank line just at the start of a block, let's remove it.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: bbdd04ab1f375ef46a0e2d98de439863d35e4d3e
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:12 -08:00
Viresh Kumar 0fd511e4dc cpufreq: remove invalid comment from __cpufreq_remove_dev()
Some section of kerneldoc comment for __cpufreq_remove_dev() is invalid now.
Remove it.

Suggested-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 67a29e558b17a923c3a53c348315c572b8ca261a
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:12 -08:00
Viresh Kumar 2f59f0d636 cpufreq: make return type of lock_policy_rwsem_{read|write}() as void
lock_policy_rwsem_{read|write}() currently has return type of int,
but it always returns zero and hence its return type should be void
instead. This patch makes that change and modifies all of the users
accordingly.

Reported-by: Jon Medhurst<tixy@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 1b750e3bdae5b2d0f3d377b0c56e7465f85b67f2
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:12 -08:00
Viresh Kumar 807d40fea3 cpufreq: arm_big_little: call cpufreq_frequency_table_put_attr()
Drivers which have an exit path must call cpufreq_frequency_table_put_attr() if
they have called cpufreq_frequency_table_get_attr() in their init path.

This driver was missing this part and is fixed with this patch.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 2457dac670f287b260d50792988f4788f403ca32
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:12 -08:00
Viresh Kumar 2e99c3a3d0 cpufreq: arm_big_little: use cpufreq_table_validate_and_show()
Let's use cpufreq_table_validate_and_show() instead of calling
cpufreq_frequency_table_cpuinfo() and cpufreq_frequency_table_get_attr().

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 39b10ebe5d30ef46ddea1daa89ca55bd2c817d7b
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:11 -08:00
Viresh Kumar a10db86d70 cpufreq: Add new helper cpufreq_table_validate_and_show()
Almost every cpufreq driver is required to validate its frequency table with:
cpufreq_frequency_table_cpuinfo() and then expose it to cpufreq core with:
cpufreq_frequency_table_get_attr().

This patch creates another helper routine cpufreq_table_validate_and_show() that
will do both these steps in a single call and will return 0 for success, error
otherwise.

This also fixes potential bugs in cpufreq drivers where people have called
cpufreq_frequency_table_get_attr() before calling
cpufreq_frequency_table_cpuinfo(), as the latter may fail.
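
The ordering guarantee can be sketched as follows. This is a simplified
userspace model with hypothetical names and a hypothetical policy struct; the
only point it shows is that the table is exposed only after validation has
succeeded.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified policy. */
struct policy {
	unsigned int min, max;			/* kHz */
	const unsigned int *shown_table;	/* table exposed to the core */
};

/* Step 1: validate the table and derive min/max from it. */
static int table_cpuinfo(struct policy *p, const unsigned int *freqs,
			 size_t n)
{
	size_t i;

	if (!freqs || n == 0)
		return -EINVAL;

	p->min = p->max = freqs[0];
	for (i = 1; i < n; i++) {
		if (freqs[i] < p->min)
			p->min = freqs[i];
		if (freqs[i] > p->max)
			p->max = freqs[i];
	}
	return 0;
}

/* Step 2: expose the (already validated) table. */
static void table_get_attr(struct policy *p, const unsigned int *freqs)
{
	p->shown_table = freqs;
}

/* Combined helper: 0 on success, negative error otherwise. */
static int table_validate_and_show(struct policy *p,
				   const unsigned int *freqs, size_t n)
{
	int ret = table_cpuinfo(p, freqs, n);

	if (!ret)
		table_get_attr(p, freqs);
	return ret;
}
```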

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 27047a603645d0885bcd72d7a0b6cce6e3c94ca7
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:11 -08:00
Nishanth Menon a2dc9b1bc9 PM / OPP: rename header to linux/pm_opp.h
Since Operating Performance Points (OPP) functions are specific
to device specific power management, be specific and rename opp.h
to pm_opp.h

Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: e4db1c7439b31993a4886b273bb9235a8eea82bf
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[junjiew@codeaurora.org: resolve trivial merge conflicts]
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:11 -08:00
Nishanth Menon 4dfc0d4764 PM / OPP: rename functions to dev_pm_opp*
Since Operating Performance Points (OPP) functions are specific to
device specific power management, be specific and rename opp_*
accessors in OPP library with dev_pm_opp_* equivalent.

Affected functions are:
 opp_get_voltage
 opp_get_freq
 opp_get_opp_count
 opp_find_freq_exact
 opp_find_freq_floor
 opp_find_freq_ceil
 opp_add
 opp_enable
 opp_disable
 opp_get_notifier
 opp_init_cpufreq_table
 opp_free_cpufreq_table

Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Nishanth Menon <nm@ti.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 5d4879cda67b09f086807821cf594ee079d6dfbe
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[junjiew@codeaurora.org: resolve trivial merge conflicts]
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:11 -08:00
Viresh Kumar f1093739ff cpufreq: check cpufreq driver is valid and cpufreq isn't disabled in cpufreq_get()
cpufreq_get() can be called from external drivers which might not be aware if
cpufreq driver is registered or not. And so we should actually check if cpufreq
driver is registered or not and also if cpufreq is active or disabled, at the
beginning of cpufreq_get().

Otherwise call to lock_policy_rwsem_read() might hit BUG_ON(!policy).

Reported-and-tested-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 26ca8694344af4c833d22590c5b77d6b9cff0722
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:10 -08:00
Yinghai Lu ed0ea642ab cpufreq: return EEXIST instead of EBUSY for second registering
On systems that support intel_pstate, acpi_cpufreq fails to load, and
udev keeps trying until trace gets filled up and kernel crashes.

The root cause is that the driver returns ret from cpufreq_register_driver():
when some other driver has already taken over, it returns EBUSY and then
udev keeps retrying.

cpufreq_register_driver() should return EEXIST instead so that the
system can boot without appending intel_pstate=disable and still use
intel_pstate.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 4dea5806d332f91d640d99943db99a5539e832c3
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:10 -08:00
Viresh Kumar eb24ac11f5 cpufreq: unlock correct rwsem while updating policy->cpu
Current code looks like this:

        WARN_ON(lock_policy_rwsem_write(cpu));
        update_policy_cpu(policy, new_cpu);
        unlock_policy_rwsem_write(cpu);

{lock|unlock}_policy_rwsem_write(cpu) takes/releases policy->cpu's rwsem.
Because cpu is changing with the call to update_policy_cpu(), the
unlock_policy_rwsem_write() will release the incorrect lock.

The right solution would be to release the same lock as was taken earlier.
update_policy_cpu() was also called from cpufreq_add_dev() without any locks,
so it's better if we move this locking inside update_policy_cpu().

This patch fixes a regression introduced in 3.12 by commit f9ba680d23
(cpufreq: Extract the handover of policy cpu to a helper function).
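
A toy model of the mismatch, under the assumption that the lock helpers
resolve the lock through the policy's current cpu. The counters stand in for
the per-CPU rwsems, and every name below is hypothetical:

```c
#define NR_CPUS 4

/* Stand-in for per-CPU rwsems: >0 means held, <0 means over-released. */
static int lock_depth[NR_CPUS];

struct policy {
	unsigned int cpu;
};

static void lock_write(unsigned int cpu)   { lock_depth[cpu]++; }
static void unlock_write(unsigned int cpu) { lock_depth[cpu]--; }

/* Buggy shape: the unlock looks up policy->cpu after it has changed. */
static void handover_buggy(struct policy *p, unsigned int new_cpu)
{
	lock_write(p->cpu);
	p->cpu = new_cpu;
	unlock_write(p->cpu);		/* releases new_cpu's lock instead */
}

/* Fixed shape: remember which lock was taken and release that one. */
static void handover_fixed(struct policy *p, unsigned int new_cpu)
{
	unsigned int locked_cpu = p->cpu;

	lock_write(locked_cpu);
	p->cpu = new_cpu;
	unlock_write(locked_cpu);
}
```

After handover_buggy() the old CPU's lock stays held and the new CPU's lock
has been released without ever being taken; handover_fixed() leaves both
balanced.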

Reported-and-tested-by: Jon Medhurst<tixy@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 8efd57657d8ef666810b55e609da72de92314dc4
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:10 -08:00
Viresh Kumar 240a20eaf0 cpufreq: Clear policy->cpus bits in __cpufreq_remove_dev_finish()
This broke after a recent change "cedb70a cpufreq: Split __cpufreq_remove_dev()
into two parts" from Srivatsa.

Consider a scenario where we have two CPUs in a policy (0 & 1) and we are
removing CPU 1. On the call to __cpufreq_remove_dev_prepare() we have cleared 1
from policy->cpus and now on a call to __cpufreq_remove_dev_finish() we read
cpumask_weight of policy->cpus, which will come as 1 and this code will behave
as if we are removing the last CPU from policy :)

Fix it by clearing the CPU mask in __cpufreq_remove_dev_finish() instead of
__cpufreq_remove_dev_prepare().
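
The scenario can be reproduced with a small bitmask model. This is only a
sketch: the mask type, the helper names and the clear_early switch are
assumptions made for the illustration.

```c
/* Hypothetical policy: one bit per CPU that belongs to it. */
struct policy {
	unsigned long cpus;
};

static unsigned int mask_weight(unsigned long mask)
{
	unsigned int count = 0;

	while (mask) {
		count += mask & 1UL;
		mask >>= 1;
	}
	return count;
}

/* clear_early != 0 models the broken ordering (bit cleared in prepare). */
static void remove_prepare(struct policy *p, unsigned int cpu,
			   int clear_early)
{
	if (clear_early)
		p->cpus &= ~(1UL << cpu);
}

/* Returns 1 if this step believes it is removing the policy's last CPU. */
static int remove_finish(struct policy *p, unsigned int cpu)
{
	unsigned int cpus = mask_weight(p->cpus);

	if (cpus > 1)
		p->cpus &= ~(1UL << cpu);
	return cpus == 1;
}
```

With CPUs 0 and 1 in the policy and CPU 1 going away, clearing the bit early
makes remove_finish() report "last CPU" even though CPU 0 is still there.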

Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 9c8f1ee40b6368e6b2775c9c9f816e2a5dca3c07
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:10 -08:00
Lan Tianyu 8f4301c5f5 cpufreq: Acquire the lock in cpufreq_policy_restore() for reading
In cpufreq_policy_restore(), the policy saved before system suspend is read
from the per-CPU cpufreq_cpu_data_fallback.  It's a read operation rather
than a write one, so take the lock for reading in there.

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 44871c9c7f7963f8869dd8bc9620221c9e9db153
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:10 -08:00
Srivatsa S. Bhat 005b70df3b cpufreq: Prevent problems in update_policy_cpu() if last_cpu == new_cpu
If update_policy_cpu() is invoked with the existing policy->cpu itself
as the new-cpu parameter, then a lot of things can go terribly wrong.

In its present form, update_policy_cpu() always assumes that the new-cpu
is different from policy->cpu and invokes other functions to perform their
respective updates. And those functions implement the actual update like
this:

per_cpu(..., new_cpu) = per_cpu(..., last_cpu);
per_cpu(..., last_cpu) = NULL;

Thus, when new_cpu == last_cpu, the final NULL assignment makes the per-cpu
references vanish into thin air! (memory leak). From there, it leads to more
problems: cpufreq_stats_create_table() now doesn't find the per-cpu reference
and hence tries to create a new sysfs-group; but sysfs already had created
the group earlier, so it complains that it cannot create a duplicate filename.
In short, the repercussions of a rather innocuous invocation of
update_policy_cpu() can turn out to be pretty nasty.

Ideally update_policy_cpu() should handle this situation (new == last)
gracefully, and not lead to such severe problems. So fix it by adding an
appropriate check.
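
A minimal userspace sketch of the failure and of such a check. The array
stands in for the per-cpu references and all names are hypothetical:

```c
#include <stddef.h>

#define NR_CPUS 4

/* Stand-in for a per-cpu pointer (for example a stats table reference). */
static void *stats_table[NR_CPUS];

/* Update as described above: move the reference, then clear the source. */
static void move_ref_unchecked(unsigned int last_cpu, unsigned int new_cpu)
{
	stats_table[new_cpu] = stats_table[last_cpu];
	stats_table[last_cpu] = NULL;	/* wipes it when new_cpu == last_cpu */
}

/* Guarded variant: nothing to move if the CPU does not change. */
static void move_ref_checked(unsigned int last_cpu, unsigned int new_cpu)
{
	if (new_cpu == last_cpu)
		return;
	move_ref_unchecked(last_cpu, new_cpu);
}
```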

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Tested-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: cb38ed5cf1c4fdb7454e4b48fb70c396f5acfb21
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:09 -08:00
Srivatsa S. Bhat def400bcef cpufreq: Restructure if/else block to avoid unintended behavior
In __cpufreq_remove_dev_prepare(), the code which decides whether to remove
the sysfs link or nominate a new policy cpu, is governed by an if/else block
with a rather complex set of conditionals. Worse, they harbor a subtlety
which leads to certain unintended behavior.

The code looks like this:

        if (cpu != policy->cpu && !frozen) {
                sysfs_remove_link(&dev->kobj, "cpufreq");
        } else if (cpus > 1) {
                new_cpu = cpufreq_nominate_new_policy_cpu(...);
                ...
                update_policy_cpu(..., new_cpu);
        }

The original intention was:
If the CPU going offline is not policy->cpu, just remove the link.
On the other hand, if the CPU going offline is the policy->cpu itself,
hand over the policy->cpu job to some other surviving CPU in that policy.

But because the 'if' condition also includes the 'frozen' check, now there
are *two* possibilities by which we can enter the 'else' block:

1. cpu == policy->cpu (intended)
2. cpu != policy->cpu && frozen (unintended)

Due to the second (unintended) scenario, we end up spuriously nominating
a CPU as the policy->cpu, even when the existing policy->cpu is alive and
well. This can cause problems further down the line, especially when we end
up nominating the same policy->cpu as the new one (ie., old == new),
because it totally confuses update_policy_cpu().

To avoid this mess, restructure the if/else block to only do what was
originally intended, and thus prevent any unwelcome surprises.
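
One restructuring consistent with the stated intention can be modelled as a
pure decision function. The enum and function names are hypothetical, and
the real code performs the sysfs and policy updates instead of returning an
action:

```c
#include <stdbool.h>

enum action {
	ACT_NONE,
	ACT_REMOVE_LINK,
	ACT_NOMINATE_NEW_CPU,
};

/* Old shape: 'frozen' is folded into the first condition. */
static enum action decide_old(unsigned int cpu, unsigned int policy_cpu,
			      bool frozen, unsigned int cpus)
{
	if (cpu != policy_cpu && !frozen)
		return ACT_REMOVE_LINK;
	else if (cpus > 1)
		return ACT_NOMINATE_NEW_CPU;
	return ACT_NONE;
}

/* Restructured: the else branch is reachable only for cpu == policy_cpu. */
static enum action decide_new(unsigned int cpu, unsigned int policy_cpu,
			      bool frozen, unsigned int cpus)
{
	if (cpu != policy_cpu) {
		if (!frozen)
			return ACT_REMOVE_LINK;
		return ACT_NONE;
	} else if (cpus > 1) {
		return ACT_NOMINATE_NEW_CPU;
	}
	return ACT_NONE;
}
```

For cpu != policy_cpu with frozen set, decide_old() nominates a new policy
CPU (the unintended case 2 above), while decide_new() does nothing.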

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Tested-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 61173f256a3bebfbd09b4bd2c164dde378614091
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:09 -08:00
Srivatsa S. Bhat 63384c5ec3 cpufreq: Fix crash in cpufreq-stats during suspend/resume
Stephen Warren reported that the cpufreq-stats code hits a NULL pointer
dereference during the second attempt to suspend a system. He also
pin-pointed the problem to commit 5302c3f "cpufreq: Perform light-weight
init/teardown during suspend/resume".

That commit actually ensured that the cpufreq-stats table and the
cpufreq-stats sysfs entries are *not* torn down (ie., not freed) during
suspend/resume, which makes it all the more surprising. However, it turns
out that the root-cause is not that we access already freed memory, but
that the reference to the allocated memory gets moved around and we lose
track of that during resume, leading to the reported crash in a subsequent
suspend attempt.

In the suspend path, during CPU offline, the value of policy->cpu is updated
by choosing one of the surviving CPUs in that policy, as long as there is
at least one CPU in that policy. And cpufreq_stats_update_policy_cpu() is
invoked to update the reference to the stats structure by assigning it to
the new CPU. However, in the resume path, during CPU online, we end up
assigning a fresh CPU as the policy->cpu, without letting cpufreq-stats
know about this. Thus the reference to the stats structure remains
(incorrectly) associated with the old CPU. So, in a subsequent suspend attempt,
during CPU offline, we end up accessing an incorrect location to get the
stats structure, which eventually leads to the NULL pointer dereference.

Fix this by letting cpufreq-stats know about the update of the policy->cpu
during CPU online in the resume path. (Also, move the update_policy_cpu()
function higher up in the file, so that __cpufreq_add_dev() can invoke
it).

Reported-and-tested-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 0d66b91ebff49841f607a3c079984c907c8a4199
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:09 -08:00
Rafael J. Wysocki 7b13c20fec Revert "cpufreq: make sure frequency transitions are serialized"
Commit 7c30ed5 (cpufreq: make sure frequency transitions are
serialized) attempted to serialize frequency transitions by
adding checks to the CPUFREQ_PRECHANGE and CPUFREQ_POSTCHANGE
notifications.  However, it assumed that the notifications will
always originate from the driver's .target() callback, but they
also can be triggered by cpufreq_out_of_sync() and that leads to
warnings like this on some systems:

 WARNING: CPU: 0 PID: 14543 at drivers/cpufreq/cpufreq.c:317
 __cpufreq_notify_transition+0x238/0x260()
 In middle of another frequency transition

accompanied by a call trace similar to this one:

 [<ffffffff81720daa>] dump_stack+0x46/0x58
 [<ffffffff8106534c>] warn_slowpath_common+0x8c/0xc0
 [<ffffffff815b8560>] ? acpi_cpufreq_target+0x320/0x320
 [<ffffffff81065436>] warn_slowpath_fmt+0x46/0x50
 [<ffffffff815b1ec8>] __cpufreq_notify_transition+0x238/0x260
 [<ffffffff815b33be>] cpufreq_notify_transition+0x3e/0x70
 [<ffffffff815b345d>] cpufreq_out_of_sync+0x6d/0xb0
 [<ffffffff815b370c>] cpufreq_update_policy+0x10c/0x160
 [<ffffffff815b3760>] ? cpufreq_update_policy+0x160/0x160
 [<ffffffff81413813>] cpufreq_set_cur_state+0x8c/0xb5
 [<ffffffff814138df>] processor_set_cur_state+0xa3/0xcf
 [<ffffffff8158e13c>] thermal_cdev_update+0x9c/0xb0
 [<ffffffff8159046a>] step_wise_throttle+0x5a/0x90
 [<ffffffff8158e21f>] handle_thermal_trip+0x4f/0x140
 [<ffffffff8158e377>] thermal_zone_device_update+0x57/0xa0
 [<ffffffff81415b36>] acpi_thermal_check+0x2e/0x30
 [<ffffffff81415ca0>] acpi_thermal_notify+0x40/0xdc
 [<ffffffff813e7dbd>] acpi_device_notify+0x19/0x1b
 [<ffffffff813f8241>] acpi_ev_notify_dispatch+0x41/0x5c
 [<ffffffff813e3fbe>] acpi_os_execute_deferred+0x25/0x32
 [<ffffffff81081060>] process_one_work+0x170/0x4a0
 [<ffffffff81082121>] worker_thread+0x121/0x390
 [<ffffffff81082000>] ? manage_workers.isra.20+0x170/0x170
 [<ffffffff81088fe0>] kthread+0xc0/0xd0
 [<ffffffff81088f20>] ? flush_kthread_worker+0xb0/0xb0
 [<ffffffff8173582c>] ret_from_fork+0x7c/0xb0
 [<ffffffff81088f20>] ? flush_kthread_worker+0xb0/0xb0

For this reason, revert commit 7c30ed5 along with the fix 266c13d
(cpufreq: Fix serialization of frequency transitions) on top of it
and we will revisit the serialization problem later.

Reported-by: Alessandro Bono <alessandro.bono@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 798282a8718347b04a2f0a4bae7d775c48c6bcb9
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:09 -08:00
Srivatsa S. Bhat 4f03fd6c97 cpufreq: Use signed type for 'ret' variable, to store negative error values
There are places where the variable 'ret' is declared as unsigned int
and then used to store negative return values such as -EINVAL. Fix them
by declaring the variable as a signed quantity.
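
Why the type matters can be seen in a small userspace sketch (function names
are hypothetical): once a negative errno value is stored in an unsigned int,
widening it later yields a large positive number instead of the error code.

```c
#include <errno.h>

/* 'ret' declared unsigned: -EINVAL wraps to a large positive value. */
static long long widen_unsigned_ret(void)
{
	unsigned int ret = (unsigned int)-EINVAL;

	return (long long)ret;		/* positive, error is lost */
}

/* 'ret' declared signed: the error code survives the widening. */
static long long widen_signed_ret(void)
{
	int ret = -EINVAL;

	return (long long)ret;		/* still -EINVAL */
}
```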

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 5136fa56582beadb7fa71eb30bc79148bfcba5c1
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:08 -08:00
Srivatsa S. Bhat 656b745f10 cpufreq: Remove temporary fix for race between CPU hotplug and sysfs-writes
Commit "cpufreq: serialize calls to __cpufreq_governor()" had been a temporary
and partial solution to the race condition between writing to a cpufreq sysfs
file and taking a CPU offline. Now that we have a proper and complete solution
to that problem, remove the temporary fix.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 56d07db274b7b15ca38b60ea4a762d40de093000
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:08 -08:00
Srivatsa S. Bhat d6f5eef479 cpufreq: Synchronize the cpufreq store_*() routines with CPU hotplug
The functions that are used to write to cpufreq sysfs files (such as
store_scaling_max_freq()) are not hotplug safe. They can race with CPU
hotplug tasks and lead to problems such as trying to acquire an already
destroyed timer-mutex etc.

Eg:

    __cpufreq_remove_dev()
     __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
       policy->governor->governor(policy, CPUFREQ_GOV_STOP);
        cpufreq_governor_dbs()
         case CPUFREQ_GOV_STOP:
          mutex_destroy(&cpu_cdbs->timer_mutex)
          cpu_cdbs->cur_policy = NULL;
      <PREEMPT>
    store()
     __cpufreq_set_policy()
      __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS);
        policy->governor->governor(policy, CPUFREQ_GOV_LIMITS);
         case CPUFREQ_GOV_LIMITS:
          mutex_lock(&cpu_cdbs->timer_mutex); <-- Warning (destroyed mutex)
           if (policy->max < cpu_cdbs->cur_policy->cur) <- cur_policy == NULL

So use get_online_cpus()/put_online_cpus() in the store_*() functions, to
synchronize with CPU hotplug. However, there is an additional point to note
here: some parts of the CPU teardown in the cpufreq subsystem are done in
the CPU_POST_DEAD stage, with cpu_hotplug.lock *released*. So, using the
get/put_online_cpus() functions alone is insufficient; we should also ensure
that we don't race with those latter steps in the hotplug sequence. We can
easily achieve this by checking if the CPU is online before proceeding with
the store, since the CPU would have been marked offline by the time the
CPU_POST_DEAD notifiers are executed.

Reported-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 4f750c930822b92df74327a4d1364eff87701360
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:08 -08:00
Srivatsa S. Bhat 844015c914 cpufreq: Invoke __cpufreq_remove_dev_finish() after releasing cpu_hotplug.lock
__cpufreq_remove_dev_finish() handles the kobject cleanup for a CPU going
offline. But because we destroy the kobject towards the end of the CPU offline
phase, there are certain race windows where a task can try to write to a
cpufreq sysfs file (eg: using store_scaling_max_freq()) while we are taking
that CPU offline, and this can bump up the kobject refcount, which in turn might
hinder the CPU offline task from running to completion. (It can also cause
other more serious problems such as trying to acquire a destroyed timer-mutex
etc., depending on the exact stage of the cleanup at which the task managed to
take a new refcount).

To fix the race window, we will need to synchronize those store_*() call-sites
with CPU hotplug, using get_online_cpus()/put_online_cpus(). However, that
in turn can cause a total deadlock because it can end up waiting for the
CPU offline task to complete, with incremented refcount!

Write to sysfs                            CPU offline task
--------------                            ----------------
kobj_refcnt++

                                          Acquire cpu_hotplug.lock

get_online_cpus();

                                          Wait for kobj_refcnt to drop to zero

                     **DEADLOCK**

A simple way to avoid this problem is to perform the kobject cleanup in the
CPU offline path, with the cpu_hotplug.lock *released*. That is, we can
perform the wait-for-kobj-refcnt-to-drop as well as the subsequent cleanup
in the CPU_POST_DEAD stage of CPU offline, which is run with cpu_hotplug.lock
released. Doing this helps us avoid deadlocks due to holding kobject refcounts
and waiting on each other on the cpu_hotplug.lock.

(Note: We can't move all of the cpufreq CPU offline steps to the
CPU_POST_DEAD stage, because certain things such as stopping the governors
have to be done before the outgoing CPU is marked offline. So retain those
parts in the CPU_DOWN_PREPARE stage itself).

Reported-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 1aee40ac9c86759c05f2ceb4523642b22fc8ea36
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:08 -08:00
Srivatsa S. Bhat c834b85822 cpufreq: Split __cpufreq_remove_dev() into two parts
During CPU offline, the cpufreq core invokes __cpufreq_remove_dev()
to perform work such as stopping the cpufreq governor, clearing the
CPU from the policy structure etc, and finally cleaning up the
kobject.

There are certain subtle issues related to the kobject cleanup, and
it would be much easier to deal with them if we separate that part
from the rest of the cleanup-work in the CPU offline phase. So split
the __cpufreq_remove_dev() function into 2 parts: one that handles
the kobject cleanup, and the other that handles the rest of the work.

Reported-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: cedb70afd077b00bff7379042fdbf7eef32606c9
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:07 -08:00
Andreas Schwab 0367d22be7 cpufreq: Fix wrong time unit conversion
The time spent by a CPU under a given frequency is stored in jiffies unit
in the cpu var cpufreq_stats_table->time_in_state[i], i being the index of
the frequency.

This is what is displayed in the right column of the following file:

     cat /sys/devices/system/cpu/cpuX/cpufreq/stats/time_in_state
     2301000 19835820
     2300000 3172
     [...]

Now cpufreq converts this jiffies unit delta to clock_t before returning it
to the user as in the above file. And that conversion is achieved using the API
cputime64_to_clock_t().

Although it accidentally works on traditional tick based cputime accounting, where
cputime_t maps directly to jiffies, it doesn't work with other types of cputime
accounting such as CONFIG_VIRT_CPU_ACCOUNTING_* where cputime_t can map to nsecs
or any granularity preferred by the architecture.

For example we get a buggy zero delta on full dyntick configurations:

     cat /sys/devices/system/cpu/cpuX/cpufreq/stats/time_in_state
     2301000 0
     2300000 0
     [...]

Fix this by using the proper jiffies_64_to_clock_t() conversion.
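
To make the unit conversion concrete, here is a simplified user-space model of a jiffies to clock_t conversion. It is a sketch under stated assumptions, not the kernel's implementation: the function name jiffies_to_clock_ticks is hypothetical, and the tick rates hz and user_hz are passed as parameters instead of coming from kernel configuration.

```c
#include <stdint.h>

/*
 * Simplified model: scale a jiffies count (ticks at hz per second) to
 * clock_t units (ticks at user_hz per second). Both rates are assumptions
 * supplied by the caller; very large counts could overflow the multiply.
 */
uint64_t jiffies_to_clock_ticks(uint64_t jiffies, uint32_t hz, uint32_t user_hz)
{
	return jiffies * user_hz / hz;
}
```

With an assumed hz of 1000 and user_hz of 100, a count of 19835820 jiffies scales to 1983582 clock ticks. The point of the fix is that the input really is a jiffies count, so a jiffies-based conversion is the right one regardless of how cputime_t is represented.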

Reported-and-tested-by: Carsten Emde <C.Emde@osadl.org>
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: a857c0b9e24e39fe5be82451b65377795f9538d8
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:07 -08:00
Viresh Kumar 84886f9776 cpufreq: serialize calls to __cpufreq_governor()
We can't take a big lock around __cpufreq_governor() as this causes
recursive locking for some cases. But calls to this routine must be
serialized for every policy. Otherwise we can see some unpredictable
events.

For example, consider following scenario:

__cpufreq_remove_dev()
 __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
   policy->governor->governor(policy, CPUFREQ_GOV_STOP);
    cpufreq_governor_dbs()
     case CPUFREQ_GOV_STOP:
      mutex_destroy(&cpu_cdbs->timer_mutex)
      cpu_cdbs->cur_policy = NULL;
  <PREEMPT>
store()
 __cpufreq_set_policy()
  __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS);
    policy->governor->governor(policy, CPUFREQ_GOV_LIMITS);
     case CPUFREQ_GOV_LIMITS:
      mutex_lock(&cpu_cdbs->timer_mutex); <-- Warning (destroyed mutex)
       if (policy->max < cpu_cdbs->cur_policy->cur) <- cur_policy == NULL

And so store() will eventually result in a crash if cur_policy is
NULL at this point.

Introduce an additional variable which would guarantee serialization
here.

Reported-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 19c763031acb831a5ab9c1a701b7fedda073eb3f
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:07 -08:00
Viresh Kumar 88dca9646b cpufreq: don't allow governor limits to be changed when it is disabled
__cpufreq_governor() returns with -EBUSY when governor is already
stopped and we try to stop it again, but when it is stopped we must
not allow calls to CPUFREQ_GOV_LIMITS event as well.

This patch adds this check in __cpufreq_governor().
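
The rule described above can be modeled in a few lines. This is a hedged user-space sketch, not the actual __cpufreq_governor() code: the enum, struct gov_policy_model and governor_event_check() are hypothetical names, and the policy is reduced to a single "governor enabled" flag.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's governor events. */
enum gov_event { GOV_START, GOV_STOP, GOV_LIMITS };

/* Hypothetical, reduced policy: only tracks whether the governor runs. */
struct gov_policy_model {
	bool governor_enabled;
};

/* Returns 0 if the event may be delivered, -EBUSY otherwise. */
int governor_event_check(struct gov_policy_model *policy, enum gov_event event)
{
	/* A stopped governor can be neither stopped again nor given limits. */
	if (!policy->governor_enabled &&
	    (event == GOV_STOP || event == GOV_LIMITS))
		return -EBUSY;

	if (event == GOV_START)
		policy->governor_enabled = true;
	else if (event == GOV_STOP)
		policy->governor_enabled = false;

	return 0;
}
```

In this model a GOV_LIMITS event is rejected with -EBUSY until a GOV_START has been delivered, and is rejected again once GOV_STOP has run.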

Reported-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: f73d39338444d9915c746403bd98b145ff9d2ba4
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:07 -08:00
Stratos Karafotis ecc0eef693 cpufreq: governor: Fix typos in comments
 - 'Governer' should be 'Governor'.
 - 'S' is used for Siemens (electrical conductance) in SI units,
   so use small 's' for seconds.

Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: c4afc410942f9f0675a5431adbdb03cf5908d1df
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:06 -08:00
Stratos Karafotis 57943819d6 cpufreq: governors: Remove duplicate check of target freq in supported range
Function __cpufreq_driver_target() checks if target_freq is within
policy->min and policy->max range. generic_powersave_bias_target() also
checks if target_freq is valid via a cpufreq_frequency_table_target()
call. So, drop the unnecessary duplicate check in *_check_cpu().

Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 934dac1ea072bd8adff8d6a6abba561731e093cf
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:06 -08:00
Sudeep KarkadaNagesha fcaef55c1f cpufreq: arm_big_little: remove device tree parsing for cpu nodes
Now that the cpu device registration initialises the of_node (if available)
appropriately for all the cpus, parsing here is redundant.

This patch removes all DT parsing and uses cpu->of_node instead.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
Git-commit: da0eb143dbbaf26b6f084bee81d56fc64efb5390
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:06 -08:00
Li Zhong a0a204fa01 cpufreq: fix bad unlock balance on !CONFIG_SMP
This patch tries to fix lockdep complaint attached below.

It seems that we should always read acquire the cpufreq_rwsem,
whether CONFIG_SMP is enabled or not.  And CONFIG_HOTPLUG_CPU
depends on CONFIG_SMP, so it seems we don't need CONFIG_SMP for the
code enabled by CONFIG_HOTPLUG_CPU.

[    0.504191] =====================================
[    0.504627] [ BUG: bad unlock balance detected! ]
[    0.504627] 3.11.0-rc6-next-20130819 #1 Not tainted
[    0.504627] -------------------------------------
[    0.504627] swapper/1 is trying to release lock (cpufreq_rwsem) at:
[    0.504627] [<ffffffff813d927a>] cpufreq_add_dev+0x13a/0x3e0
[    0.504627] but there are no more locks to release!
[    0.504627]
[    0.504627] other info that might help us debug this:
[    0.504627] 1 lock held by swapper/1:
[    0.504627]  #0:  (subsys mutex#4){+.+.+.}, at: [<ffffffff8134a7bf>] subsys_interface_register+0x4f/0xe0
[    0.504627]
[    0.504627] stack backtrace:
[    0.504627] CPU: 0 PID: 1 Comm: swapper Not tainted 3.11.0-rc6-next-20130819 #1
[    0.504627] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[    0.504627]  ffffffff813d927a ffff88007f847c98 ffffffff814c062b ffff88007f847cc8
[    0.504627]  ffffffff81098bce ffff88007f847cf8 ffffffff81aadc30 ffffffff813d927a
[    0.504627]  00000000ffffffff ffff88007f847d68 ffffffff8109d0be 0000000000000006
[    0.504627] Call Trace:
[    0.504627]  [<ffffffff813d927a>] ? cpufreq_add_dev+0x13a/0x3e0
[    0.504627]  [<ffffffff814c062b>] dump_stack+0x19/0x1b
[    0.504627]  [<ffffffff81098bce>] print_unlock_imbalance_bug+0xfe/0x110
[    0.504627]  [<ffffffff813d927a>] ? cpufreq_add_dev+0x13a/0x3e0
[    0.504627]  [<ffffffff8109d0be>] lock_release_non_nested+0x1ee/0x310
[    0.504627]  [<ffffffff81099d0e>] ? mark_held_locks+0xae/0x120
[    0.504627]  [<ffffffff811510cb>] ? kfree+0xcb/0x1d0
[    0.504627]  [<ffffffff813d77ea>] ? cpufreq_policy_free+0x4a/0x60
[    0.504627]  [<ffffffff813d927a>] ? cpufreq_add_dev+0x13a/0x3e0
[    0.504627]  [<ffffffff8109d2a4>] lock_release+0xc4/0x250
[    0.504627]  [<ffffffff8106c9f3>] up_read+0x23/0x40
[    0.504627]  [<ffffffff813d927a>] cpufreq_add_dev+0x13a/0x3e0
[    0.504627]  [<ffffffff8134a809>] subsys_interface_register+0x99/0xe0
[    0.504627]  [<ffffffff81b19f3b>] ? cpufreq_gov_dbs_init+0x12/0x12
[    0.504627]  [<ffffffff813d7f0d>] cpufreq_register_driver+0x9d/0x1d0
[    0.504627]  [<ffffffff81b19f3b>] ? cpufreq_gov_dbs_init+0x12/0x12
[    0.504627]  [<ffffffff81b1a039>] acpi_cpufreq_init+0xfe/0x1f8
[    0.504627]  [<ffffffff810002ba>] do_one_initcall+0xda/0x180
[    0.504627]  [<ffffffff81ae301e>] kernel_init_freeable+0x12c/0x1bb
[    0.504627]  [<ffffffff81ae2841>] ? do_early_param+0x8c/0x8c
[    0.504627]  [<ffffffff814b4dd0>] ? rest_init+0x140/0x140
[    0.504627]  [<ffffffff814b4dde>] kernel_init+0xe/0xf0
[    0.504627]  [<ffffffff814d029a>] ret_from_fork+0x7a/0xb0
[    0.504627]  [<ffffffff814b4dd0>] ? rest_init+0x140/0x140

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Acked-and-tested-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 5025d628c8659fbf939f929107bf76db81dcdfff
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:06 -08:00
Viresh Kumar e8c806d8e6 cpufreq: Use cpufreq_policy_list for iterating over policies
To iterate over all policies we currently iterate over all online
CPUs and then get the policy for each of them which is suboptimal.
Use the newly created cpufreq_policy_list for this purpose instead.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 1b27429446f0c37353179544e844dc2086fa2353
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:05 -08:00
Viresh Kumar 982c7c772e cpufreq: remove cpufreq_policy_cpu per-cpu variable
cpufreq_policy_cpu per-cpu variables are used for storing the ID of
the CPU that manages the given CPU's policy.  However, we also store
a policy pointer for each cpu in cpufreq_cpu_data, so the
cpufreq_policy_cpu information is simply redundant.

It is better to use cpufreq_cpu_data to retrieve a policy and get
policy->cpu from there, so make that happen everywhere and drop the
cpufreq_policy_cpu per-cpu variables which aren't necessary any more.

[rjw: Changelog]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 474deff744c4012f07cfa994947d7c6260c9ab89
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:05 -08:00
Viresh Kumar cd4947e249 cpufreq: remove unnecessary check in __cpufreq_governor()
We don't need to check if event is CPUFREQ_GOV_POLICY_INIT and put
governor module as we are sure event can only be START/STOP here.

Remove the useless check.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 9e9fd801676a946b759a8669baa24ba327c8c903
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:05 -08:00
Viresh Kumar 719216221d cpufreq: remove policy from cpufreq_policy_list during suspend
cpufreq_policy_list is a list of active policies.  We do remove
policies from this list when all CPUs belonging to that policy are
removed.  But during system suspend we don't really free a policy
struct as it will be used again during resume, so we didn't remove
it from cpufreq_policy_list as well.

However, this is incorrect.  We are saying this policy isn't valid
anymore and must not be referenced (though we haven't freed it), but
it can still be used by code that iterates over cpufreq_policy_list.

Remove policy from this list during system suspend as well.
Of course, we must add it back whenever the first CPU belonging to
that policy shows up.

[rjw: Changelog]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 9515f4d69b92feafe37581047a1bb41e41602faa
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:05 -08:00
Viresh Kumar 43a0679e2a cpufreq: Fix white space in __cpufreq_remove_dev()
Align closing brace '}' of an if block.

[rjw: Subject and changelog]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: edab2fbc21b9eb37007ad8bffe1159d536bbb451
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:04 -08:00
Rafael J. Wysocki 387bda797c Revert "cpufreq: Use cpufreq_policy_list for iterating over policies"
Revert commit eb60852 (cpufreq: Use cpufreq_policy_list for iterating
over policies), because it breaks system suspend/resume on multiple
machines.

It either causes resume to block indefinitely or causes the BUG_ON()
in lock_policy_rwsem_##mode() to trigger on sysfs accesses to cpufreq
attributes.

Conflicts:
	drivers/cpufreq/cpufreq.c

Git-commit: 878f6e074e9a7784a6e351512eace4ccb3542eef
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:04 -08:00
Viresh Kumar 4c36081b1f cpufreq: improve error checking on return values of __cpufreq_governor()
The __cpufreq_governor() function can fail in rare cases especially
if there are bugs in cpufreq drivers.  Thus we must stop processing
as soon as this routine fails, otherwise it may result in undefined
behavior.

This patch adds error checking code whenever this routine is called
from any place.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 3de9bdeb28638e164d1f0eb38dd68e3f5d2ac95c
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:04 -08:00
Viresh Kumar 326e2e9f4f cpufreq: Drop the owner field from struct cpufreq_driver
We don't need to set .owner = THIS_MODULE any more in cpufreq drivers
as this field isn't used any more by the cpufreq core.

This patch removes it and updates all dependent drivers accordingly.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: adc97d6a735dbb1e94cb4f1bf0b55f258b349941
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[junjiew@codeaurora.org: resolve trivial merge conflicts]
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:04 -08:00
Viresh Kumar b22d99e4b9 cpufreq: Use rwsem for protecting critical sections
Critical sections of the cpufreq core are protected with the help of
the driver module owner's refcount, which isn't the correct approach,
because it causes rmmod to return an error when some routine has
updated that refcount.

Let's use rwsem for this purpose instead.  Only
cpufreq_unregister_driver() will use write sem
and everybody else will use read sem.

[rjw: Subject & changelog]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 6eed9404ab3c4baea54ce4c7e862e69df1d39f38
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:04 -08:00
Viresh Kumar c15e6c9a55 cpufreq: Fix broken usage of governor->owner's refcount
The cpufreq governor owner refcount usage is broken.  We should only
increment that refcount when a CPUFREQ_GOV_POLICY_INIT event has come
and it should only be decremented if CPUFREQ_GOV_POLICY_EXIT has come.

Currently, there can be situations where the governor is in use, but
we have allowed it to be unloaded which may result in undefined
behavior.  Let's fix it.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: fe492f3f0332e23cc6ca4913e5a2ed78e1888902
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:03 -08:00
Viresh Kumar c267384a9c cpufreq: Use cpufreq_policy_list for iterating over policies
To iterate over all policies we currently iterate over all CPUs and
then get the policy for each of them.  Let's use the newly created
cpufreq_policy_list for this purpose.

[rjw: Changelog]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: eb608521f1e25a8c14295b6d9a3853c3cd8c6cf8
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:03 -08:00
Lukasz Majewski 6bc8f02509 cpufreq: Store cpufreq policies in a list
Policies available in the cpufreq framework are now linked together.
They are accessible via cpufreq_policy_list defined in the cpufreq
core.

[rjw: Fix from Yinghai Lu folded in]
Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: c88a1f8b96e7384627b918dfabbfc0c615a4a914
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:03 -08:00
Viresh Kumar d87467e6f2 cpufreq: Use sizeof(*ptr) convention for computing sizes
Chapter 14 of Documentation/CodingStyle says:

The preferred form for passing a size of a struct is the following:

	p = kmalloc(sizeof(*p), ...);

The alternative form where struct name is spelled out hurts
readability and introduces an opportunity for a bug when the pointer
variable type is changed but the corresponding sizeof that is passed
to a memory allocator is not.

This wasn't followed consistently in drivers/cpufreq, let's make it
more consistent by always following this rule.
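
A user-space illustration of the same rule, using calloc() in place of the kernel allocator. This is only a sketch: struct freq_entry and freq_table_alloc() are hypothetical names that do not appear in drivers/cpufreq.

```c
#include <stdlib.h>

/* Hypothetical structure used only for this illustration. */
struct freq_entry {
	unsigned int index;
	unsigned int frequency_khz;
};

/* Allocate a zeroed table of count entries; the caller frees it. */
struct freq_entry *freq_table_alloc(size_t count)
{
	struct freq_entry *table;

	/* sizeof(*table) follows the pointer's type automatically. */
	table = calloc(count, sizeof(*table));
	return table;
}
```

If the type of table is changed later, the allocation size stays correct without touching the calloc() line, which is the bug the spelled-out struct name form invites.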

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: d5b73cd870e2b049ef566aec2791dbf5fd26a7ec
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[junjiew@codeaurora.org: resolve trivial merge conflicts]
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:03 -08:00
Viresh Kumar 068323d345 cpufreq: Give consistent names to cpufreq_policy objects
They are called policy, cur_policy, new_policy, data, etc.  Just call
them policy wherever possible.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 3a3e9e06d0c11b8efa95933a88c9e67209fa4330
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[junjiew@codeaurora.org: resolve trivial merge conflict]
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:02 -08:00
Viresh Kumar b408618d46 cpufreq: Clean up header files included in the core
This patch addresses the following issues in the header files in the
cpufreq core:
 - Include headers in ascending order, so that we don't add the same
   header many times by mistake.
 - <asm/> must be included after <linux/>, so that they override
   whatever they need to.
 - Remove unnecessary includes.
 - Don't include files already included by cpufreq.h or
   cpufreq_governor.h.

[rjw: Changelog]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 5ff0a268037d344f86df690ccb994d8bc015d2d9
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:02 -08:00
Viresh Kumar 37b113da9d cpufreq: Pass policy to cpufreq_add_policy_cpu()
The caller of cpufreq_add_policy_cpu() already has a pointer to the
policy structure and there is no need to look it up again in
cpufreq_add_policy_cpu().  Let's pass it directly.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: d8d3b4711297e101bbad826474013edbe342c333
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:02 -08:00
Rafael J. Wysocki b6c1774b41 cpufreq: Avoid double kobject_put() for the same kobject in error code path
The only case triggering a jump to the err_out_unregister label in
__cpufreq_add_dev() is when cpufreq_add_dev_interface() fails.
However, if cpufreq_add_dev_interface() fails, it calls kobject_put()
for the policy kobject in its error code path and since that causes
the kobject's refcount to become 0, the additional kobject_put() for
the same kobject under err_out_unregister and the
wait_for_completion() following it are pointless, so drop them.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Git-commit: 10659ab7b50e963429f1a681882404ca37aa584c
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:01 -08:00
Rafael J. Wysocki ba921df0d0 cpufreq: Do not hold driver module references for additional policy CPUs
The cpufreq core is a little inconsistent in the way it uses the
driver module refcount.

Namely, if __cpufreq_add_dev() is called for a CPU that doesn't
share the policy object with any other CPUs, the driver module
refcount it grabs to start with will be dropped by it before
returning and will be equal to whatever it had been before that
function was invoked.

However, if the given CPU does share the policy object with other
CPUs, either cpufreq_add_policy_cpu() is called to link the new CPU
to the existing policy, or cpufreq_add_dev_symlink() is used to link
the other CPUs sharing the policy with it to the just created policy
object.  In that case, because both cpufreq_add_policy_cpu() and
cpufreq_add_dev_symlink() call cpufreq_cpu_get() for the given
policy (the latter possibly many times) without the balancing
cpufreq_cpu_put() (unless there is an error), the driver module
refcount will be left by __cpufreq_add_dev() with a nonzero value
(different from the initial one).

To remove that inconsistency make cpufreq_add_policy_cpu() execute
cpufreq_cpu_put() for the given policy before returning, which
decrements the driver module refcount so that it will be equal to its
initial value after __cpufreq_add_dev() returns.  Also remove the
cpufreq_cpu_get() call from cpufreq_add_dev_symlink(), since both the
policy refcount and the driver module refcount are nonzero when it is
called and they don't need to be bumped up by it.

Accordingly, drop the cpufreq_cpu_put() from __cpufreq_remove_dev(),
since it is only necessary to balance the cpufreq_cpu_get() called
by cpufreq_add_policy_cpu() or cpufreq_add_dev_symlink().

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Git-commit: 71c3461ef7c67024792d283b88630245a6c169ba
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:01 -08:00
Viresh Kumar 8e4bf985f3 cpufreq: Don't pass CPU to cpufreq_add_dev_{symlink|interface}()
Pointer to struct cpufreq_policy is already passed to these routines
and we don't need to send policy->cpu to them as well.  So, get rid
of this extra argument and use policy->cpu everywhere.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 308b60e71541518f3fe97171b4daf71adc607f3d
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:01 -08:00
Viresh Kumar 1a2410255c cpufreq: Remove extra variables from cpufreq_add_dev_symlink()
We call cpufreq_cpu_get() in cpufreq_add_dev_symlink() to increase usage
refcount of policy, but not to get a policy for the given CPU.  So, we
don't really need to capture the return value of this routine.  We can
simply use policy passed as an argument to cpufreq_add_dev_symlink().

Moreover, the debug print is rewritten to make it clearer.

[rjw: Changelog]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: e8fdde1011ea45792e60f14f620b01f78cb0d34d
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:01 -08:00
Srivatsa S. Bhat d6a685043e cpufreq: Perform light-weight init/teardown during suspend/resume
Now that we have the infrastructure to perform a light-weight init/tear-down,
use that in the cpufreq CPU hotplug notifier when invoked from the
suspend/resume path.

This also ensures that the file permissions of the cpufreq sysfs files are
preserved across suspend/resume, something which commit a66b2e (cpufreq:
Preserve sysfs files across suspend/resume) originally intended to do, but
had to be reverted due to other problems.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 5302c3fb2e62f4ca5e43e060491ba299f58c5231
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:00 -08:00
Srivatsa S. Bhat 420f1a2fae cpufreq: Preserve policy structure across suspend/resume
To perform light-weight cpu-init and teardown in the cpufreq subsystem
during suspend/resume, we need to separate out the 2 main functionalities
of the cpufreq CPU hotplug callbacks, as outlined below:

1. Init/tear-down of core cpufreq and CPU-specific components, which are
   critical to the correct functioning of the cpufreq subsystem.

2. Init/tear-down of cpufreq sysfs files during suspend/resume.

The first part requires accurate updates to the policy structure such as
its ->cpus and ->related_cpus masks, whereas the second part requires that
the policy->kobj structure is not released or re-initialized during
suspend/resume.

To handle both these requirements, we need to allow updates to the policy
structure throughout suspend/resume, but prevent the structure from getting
freed up. Also, we must have a mechanism by which the cpu-up callbacks can
restore the policy structure, without allocating things afresh. (That also
helps avoid memory leaks).

To achieve this, we use 2 schemes:
a. Use a fallback per-cpu storage area for preserving the policy structures
   during suspend, so that they can be restored during resume appropriately.

b. Use the 'frozen' flag to determine when to free or allocate the policy
   structure vs when to restore the policy from the saved fallback storage.
   Thus we can successfully preserve the structure across suspend/resume.
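
The two schemes above can be illustrated with a small user-space sketch. It is not the kernel code: the names `policy_sketch`, `cpu_policy_fallback`, `add_dev_sketch()` and `remove_dev_sketch()` are hypothetical, and plain arrays stand in for per-cpu storage.

```c
#include <stdbool.h>
#include <stdlib.h>

#define NR_CPUS_SKETCH 4

/* Hypothetical, much reduced stand-in for struct cpufreq_policy. */
struct policy_sketch {
	unsigned int cpu;
	unsigned int max_khz;
};

/* Plain arrays standing in for the live and fallback per-cpu storage. */
static struct policy_sketch *cpu_policy[NR_CPUS_SKETCH];
static struct policy_sketch *cpu_policy_fallback[NR_CPUS_SKETCH];

/* Tear-down: with 'frozen' set the policy is parked, otherwise freed. */
static void remove_dev_sketch(unsigned int cpu, bool frozen)
{
	struct policy_sketch *p = cpu_policy[cpu];

	cpu_policy[cpu] = NULL;
	if (frozen)
		cpu_policy_fallback[cpu] = p;	/* keep it for resume */
	else
		free(p);
}

/* Init: with 'frozen' set the parked policy is restored, not reallocated. */
static struct policy_sketch *add_dev_sketch(unsigned int cpu, bool frozen)
{
	struct policy_sketch *p;

	if (frozen && cpu_policy_fallback[cpu]) {
		p = cpu_policy_fallback[cpu];
		cpu_policy_fallback[cpu] = NULL;
	} else {
		p = calloc(1, sizeof(*p));
		if (!p)
			return NULL;
		p->cpu = cpu;
	}
	cpu_policy[cpu] = p;
	return p;
}
```

In this sketch a suspend/resume cycle hands back the same pointer with its contents intact, while a regular hotplug cycle frees the structure and allocates a fresh one.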

Effectively, this helps us complete the separation of the 'light-weight'
and the 'full' init/tear-down sequences in the cpufreq subsystem, so that
this can be made use of in the suspend/resume scenario.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 8414809c6a1e8479e331e09254adb58b33a36d25
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:00 -08:00
Srivatsa S. Bhat a3cd334d35 cpufreq: Introduce a flag ('frozen') to separate full vs temporary init/teardown
During suspend/resume we would like to do a light-weight init/teardown of
CPUs in the cpufreq subsystem and preserve certain things such as sysfs files
etc across suspend/resume transitions. Add a flag called 'frozen' to help
distinguish the full init/teardown sequence from the light-weight one.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: a82fab292898f88ea9ca99dd10c1773dcada08b6
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:00 -08:00
Srivatsa S. Bhat 4a247c9235 cpufreq: Extract the handover of policy cpu to a helper function
During cpu offline, when the policy->cpu is going down, some other CPU
present in the policy->cpus mask is nominated as the new policy->cpu.
Extract this functionality from __cpufreq_remove_dev() and implement
it in a helper function. This helps in upcoming code reorganization.
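
A rough user-space sketch of the nomination step, assuming the policy->cpus mask fits in an unsigned long and that the lowest-numbered remaining CPU is chosen (the helper name is hypothetical, not the one added by this patch):

```c
/*
 * Pick a new owner for the policy when old_cpu goes offline.
 * cpus_mask has one bit per CPU that shares the policy.
 * Returns the new owner, or -1 if old_cpu was the only CPU left.
 */
static int nominate_new_policy_cpu(unsigned long cpus_mask,
				   unsigned int old_cpu)
{
	unsigned int cpu;

	for (cpu = 0; cpu < sizeof(cpus_mask) * 8; cpu++) {
		if (cpu != old_cpu && (cpus_mask & (1UL << cpu)))
			return (int)cpu;
	}
	return -1;
}
```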

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: f9ba680d23ea7e2fc31b4b7106a482d90ec62a24
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:07:00 -08:00
Srivatsa S. Bhat e7fd09fd43 cpufreq: Extract non-interface related stuff from cpufreq_add_dev_interface
cpufreq_add_dev_interface() includes the work of exposing the interface
to the device, as well as a lot of unrelated stuff. Move the latter to
cpufreq_add_dev(), where it is more appropriate.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: e18f1682bce701ddcf88ba3651e07c7ee9b3ed60
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:06:59 -08:00
Srivatsa S. Bhat d35bcfbd1d cpufreq: Add helper to perform alloc/free of policy structure
Separate out the allocation of the cpufreq policy structure (along with
its error handling) to a helper function. This makes the code easier to
read and also helps with some upcoming code reorganization.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: e9698cc5d2749c5b74e137f94a95d7e505b097e8
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:06:59 -08:00
Srivatsa S. Bhat da12b9488e cpufreq: Fix misplaced call to cpufreq_update_policy()
The call to cpufreq_update_policy() is placed in the CPU hotplug callback
of cpufreq_stats, which has a higher priority than the CPU hotplug callback
of cpufreq-core. As a result, during CPU_ONLINE/CPU_ONLINE_FROZEN, we end up
calling cpufreq_update_policy() *before* calling cpufreq_add_dev() !
And for uninitialized CPUs, it just returns silently, not doing anything.

To add to that, cpufreq_stats is not even the right place to call
cpufreq_update_policy() to begin with. The cpufreq core ought to handle
this in its own callback, from an elegance/relevance perspective.

So move the invocation of cpufreq_update_policy() to cpufreq_cpu_callback,
and place it *after* cpufreq_add_dev().

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 23d328994b548d6822b88fe7e1903652afc354e0
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:06:59 -08:00
Viresh Kumar cf3f085426 cpufreq: rename ignore_nice as ignore_nice_load
This sysfs file was called ignore_nice_load earlier and commit
4d5dcc4 (cpufreq: governor: Implement per policy instances of
governors) changed its name to ignore_nice by mistake.

Let's rename it back to its original name.

Reported-by: Martin von Gagern <Martin.vGagern@gmx.net>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: 3.10+ <stable@vger.kernel.org> # 3.10+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 6c4640c3adfd97ce10efed7c07405f52d002b9a8
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:06:59 -08:00
Stratos Karafotis a61f710c5d cpufreq: Remove unused function __cpufreq_driver_getavg()
The target frequency calculation method in the ondemand governor has
changed and it is now independent of the measured average frequency.
Consequently, the __cpufreq_driver_getavg() function and getavg
member of struct cpufreq_driver are not used any more, so drop them.

[rjw: Changelog]
Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: cffe4e0e7413eb29fb8bd035c8b12b33a4b8522a
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:06:59 -08:00
Stratos Karafotis 8a061dbee2 cpufreq: ondemand: Change the calculation of target frequency
The ondemand governor calculates load in terms of frequency and
increases it only if load_freq is greater than up_threshold
multiplied by the current or average frequency.  This appears to
produce oscillations of frequency between min and max because,
for example, a relatively small load can easily saturate minimum
frequency and lead the CPU to the max.  Then, it will decrease
back to the min due to small load_freq.

Change the calculation method of load and target frequency on the
basis of the following two observations:

 - Load computation should not depend on the current or average
   measured frequency.  For example, absolute load of 80% at 100MHz
   is not necessarily equivalent to 8% at 1000MHz in the next
   sampling interval.

 - It should be possible to increase the target frequency to any
   value present in the frequency table proportional to the absolute
   load, rather than to the max only, so that:

   Target frequency = C * load

   where we take C = policy->cpuinfo.max_freq / 100.
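
As a worked illustration of the formula above, here is a minimal sketch. The function name and the clamping of load to 100 are assumptions made for the example; the governor itself still has to map the result onto an entry of the frequency table.

```c
/*
 * Target frequency = C * load, with C = max_freq / 100.
 * load_pct is the absolute load in percent; frequencies are in kHz.
 */
static unsigned int target_freq_khz(unsigned int max_freq_khz,
				    unsigned int load_pct)
{
	if (load_pct > 100)
		load_pct = 100;
	/* Multiply first so that small loads are not rounded away. */
	return (unsigned int)((unsigned long long)max_freq_khz * load_pct / 100);
}
```

With a 3.4 GHz maximum, a 50% load maps to 1.7 GHz rather than jumping straight to the maximum.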

Tested on Intel i7-3770 CPU @ 3.40GHz and on Quad core 1500MHz Krait.
Phoronix benchmark of Linux Kernel Compilation 3.1 test shows an
increase ~1.5% in performance. cpufreq_stats (time_in_state) shows
that middle frequencies are used more, with this patch.  Highest
and lowest frequencies were used less by ~9%.

[rjw: We have run multiple other tests on kernels with this
 change applied and in the vast majority of cases it turns out
 that the resulting performance improvement also leads to reduced
 consumption of energy.  The change is additionally justified by
 the overall simplification of the code in question.]

Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: dfa5bb622555d9da0df21b50f46ebdeef390041b
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:06:58 -08:00
Paul Gortmaker ab6ac74b49 cpufreq: delete __cpuinit usage from all cpufreq files
The __cpuinit type of throwaway sections might have made sense
some time ago when RAM was more constrained, but now the savings
do not offset the cost and complications.  For example, the fix in
commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time")
is a good example of the nasty type of bugs that can be created
with improper use of the various __init prefixes.

After a discussion on LKML[1] it was decided that cpuinit should go
the way of devinit and be phased out.  Once all the users are gone,
we can then finally remove the macros themselves from linux/init.h.

This removes all the drivers/cpufreq uses of the __cpuinit macros
from all C files.

[1] https://lkml.org/lkml/2013/5/20/589

[v2: leave 2nd lines of args misaligned as requested by Viresh]
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: cpufreq@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Git-commit: 2760984f6578d5a462155bb4727766d0c8b68387
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[junjiew@codeaurora.org: resolve trivial merge conflicts. Remove __cpuinit
 for arch/arm/mach-msm/cpufreq.c as well]
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:06:58 -08:00
Viresh Kumar a3ffd1408e cpufreq: Fix serialization of frequency transitions
Commit 7c30ed ("cpufreq: make sure frequency transitions are serialized")
interacts poorly with systems that have a single core frequency for all
cores.  On such systems we have a single policy covering several CPUs.
When we do a frequency transition the governor calls the pre and post
change notifiers, which results in one cpufreq_notify_transition() call
per CPU.  Since the policy is the same for all of them, and the warnings
added by that commit are generated by checking a per-policy flag, the
warnings will be triggered for every CPU after the first.

Fix this by allowing the notifier to be called n times, where n is the
number of CPUs in policy->cpus.
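
One way to picture the fix is a per-policy counter instead of a flag. The sketch below is a user-space approximation; the structure and field names are assumptions for illustration, and a `false` return marks the spot where the kernel would emit a warning.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the per-policy state. */
struct transition_sketch {
	unsigned int nr_cpus;		/* number of CPUs in policy->cpus */
	unsigned int transition_ongoing;	/* PRECHANGE calls not yet matched */
};

/* Up to nr_cpus PRECHANGE calls may be outstanding at once. */
static bool notify_prechange(struct transition_sketch *p)
{
	if (p->transition_ongoing == p->nr_cpus)
		return false;	/* more PRECHANGE calls than CPUs */
	p->transition_ongoing++;
	return true;
}

/* Every POSTCHANGE must match an earlier PRECHANGE. */
static bool notify_postchange(struct transition_sketch *p)
{
	if (p->transition_ongoing == 0)
		return false;	/* POSTCHANGE without a PRECHANGE */
	p->transition_ongoing--;
	return true;
}
```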

Reported-and-tested-by: Mark Brown <broonie@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 266c13d767be61a17d8e6f2310b9b7c46278273b
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:06:58 -08:00
Lan Tianyu d0c9f30b15 acpi-cpufreq: Add new sysfs attribute freqdomain_cpus
Commits fcf8058 (cpufreq: Simplify cpufreq_add_dev()) and aa77a52
(cpufreq: acpi-cpufreq: Don't set policy->related_cpus from .init())
changed the contents of the "related_cpus" sysfs attribute on systems
where acpi-cpufreq is used and user space can't get the list of CPUs
which are in the same hardware coordination CPU domain (provided by
the ACPI AML method _PSD) via "related_cpus" any more.

To make up for that loss add a new sysfs attribute "freqdomain_cpus"
for the acpi-cpufreq driver which exposes the list of CPUs in the
same domain regardless of whether it is coordinated by hardware or
software.

[rjw: Changelog, documentation]
References: https://bugzilla.kernel.org/show_bug.cgi?id=58761
Reported-by: Jean-Philippe Halimi <jean-philippe.halimi@exascale-computing.eu>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: f4fd3797848aa04e72e942c855fd279840a47fe4
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:06:57 -08:00
Viresh Kumar bdd9adf7ee cpufreq: make sure frequency transitions are serialized
Whenever we are changing frequency of a cpu, we are calling PRECHANGE and
POSTCHANGE notifiers. They must be serialized. i.e. PRECHANGE or POSTCHANGE
shouldn't be called twice contiguously.

This can happen due to bugs in users of __cpufreq_driver_target() or actual
cpufreq drivers who are sending these notifiers.

This patch adds some protection against this. Now, we keep track of the last
transaction and see if something went wrong.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 7c30ed532cf798a8d924562f2f44d03d7652f7a7
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:06:57 -08:00
Viresh Kumar ac137ff7c1 cpufreq: arm-big-little: call CPUFREQ_POSTCHANGE notfier in error cases
PRECHANGE and POSTCHANGE notifiers must be called in pairs, i.e. either
both are called or neither is.

In case we have started PRECHANGE notifier and found an error, we must call
POSTCHANGE notifier with freqs.new = freqs.old to guarantee that sequence of
calling notifiers is complete.
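
The pairing rule can be sketched in user space as follows. The types and the `hw_ret` parameter are hypothetical; counters stand in for the real notifier chains.

```c
/* Hypothetical stand-in for struct cpufreq_freqs. */
struct freqs_sketch {
	unsigned int old_khz;
	unsigned int new_khz;
};

/* Counters standing in for the PRECHANGE/POSTCHANGE notifier chains. */
struct notify_counts {
	int pre;
	int post;
	unsigned int last_post_new_khz;
};

/* hw_ret simulates the result of programming the clock (0 = success). */
static int set_target_sketch(struct freqs_sketch *f, int hw_ret,
			     struct notify_counts *n)
{
	n->pre++;				/* PRECHANGE */
	if (hw_ret)
		f->new_khz = f->old_khz;	/* failed: nothing changed */
	n->post++;				/* POSTCHANGE on both paths */
	n->last_post_new_khz = f->new_khz;
	return hw_ret;
}
```

On the error path listeners still see a complete PRECHANGE/POSTCHANGE pair, and the POSTCHANGE reports the old frequency as the new one.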

This patch fixes it.

This also removes code setting policy->cur as this is also done by POSTCHANGE
notifier.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Git-commit: 3d69dd50517f4a1c037298ac4af85aae1d070879
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:06:57 -08:00
Viresh Kumar 81266f636f cpufreq: make __cpufreq_notify_transition() static
__cpufreq_notify_transition() is used only in cpufreq.c,
make it static.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 0956df9c842a534b0b36f62f3a0fdb5fca19dc96
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:06:57 -08:00
Viresh Kumar 0309e39a08 cpufreq: Fix minor formatting issues
There were a few noticeable formatting issues in core cpufreq code.
This cleans them up to make code look better.  The changes include:
 - Whitespace cleanup.
 - Rearrangements of code.
 - Multiline comments fixes.
 - Formatting changes to fit 80 columns.

Copyright information in cpufreq.c is also updated to include my name
for 2013.

[rjw: Changelog]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: bb176f7d038fee4d46b3293e64e173bfb05ab7b5
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:06:56 -08:00
Viresh Kumar 399c4791d5 cpufreq: Simplify userspace governor
Userspace governor has got more code than what it needs for its
functioning, so simplify it.

Portions of code removed are:
 - Extra header files which aren't required anymore (rearrange them
   as well).
 - cpu_{max|min|cur|set}_freq, as they are always the same as
   policy->{max|min|cur}.
 - userspace_cpufreq_notifier_block as we don't need to set
   cpu_cur_freq anymore.
 - cpus_using_userspace_governor as it was for the notifier code.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: d1922f02562fe230396400e466e6e38dfeb072f5
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:06:56 -08:00
Viresh Kumar fa715b03d7 cpufreq: remove unnecessary cpufreq_cpu_{get|put}() calls
struct cpufreq_policy is already passed as an argument to some routines,
such as __cpufreq_driver_getavg(), so they don't need to call
cpufreq_cpu_get() and cpufreq_cpu_put() just to obtain a policy
structure.

Remove them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: a262e94cdcb961762e5d91e7fcb857bba7d420a0
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:06:56 -08:00
Viresh Kumar 2314e15b61 cpufreq: rename index as driver_data in cpufreq_frequency_table
The "index" field of struct cpufreq_frequency_table was never an
index and isn't used at all by the cpufreq core.  It only is useful
for cpufreq drivers for their internal purposes.

Many people nowadays blindly set it in ascending order with the
assumption that the core will use it, which is a mistake.

Rename it to "driver_data" as that's what its purpose is. All of its
users are updated accordingly.

[rjw: Changelog]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 5070158804b5339c71809f5e673cea1cfacd804d
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[junjiew@codeaurora.org: update non-upstream files]
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:06:56 -08:00
Viresh Kumar 0536fb861a cpufreq: Don't create empty /sys/devices/system/cpu/cpufreq directory
When we don't have any files in the cpu/cpufreq directory we shouldn't
create it. Especially with the introduction of the per-policy governor
instance patchset, even governors are moved to the
cpu/cpu*/cpufreq/governor-name directory and so this directory is
just not required.

Let's create it only when required.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 2361be23666232dbb4851a527f466c4cbf5340fc
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:06:55 -08:00
Viresh Kumar aa47efbd4e cpufreq: Move get_cpu_idle_time() to cpufreq.c
Governors other than ondemand and conservative can also use
get_cpu_idle_time() and they aren't required to compile
cpufreq_governor.c. So, move these independent routines to
cpufreq.c instead.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 72a4ce340a7ebf39e1c6fdc8f5feb4f974d6c635
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[junjiew@codeaurora.org: update non-upstream files]
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:06:55 -08:00
Viresh Kumar 4b7d898329 cpufreq: governors: Move get_governor_parent_kobj() to cpufreq.c
get_governor_parent_kobj() can be used by any governor, generic
cpufreq governors or platform specific ones and so must be present in
cpufreq.c instead of cpufreq_governor.c.

This patch moves it to cpufreq.c. This also adds
EXPORT_SYMBOL_GPL(get_governor_parent_kobj) so that modules can use
this function too.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 944e9a0316e60bc5bc122e46c1fde36e5f6e9f56
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:06:55 -08:00
Viresh Kumar 4f660094e4 cpufreq: Add EXPORT_SYMBOL_GPL for have_governor_per_policy
This patch adds: EXPORT_SYMBOL_GPL(have_governor_per_policy), so that
this routine can be used by modules too.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 3f869d6d41d032392abafe17ea5257a2514a24a7
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-12-20 19:06:55 -08:00
Srivatsa Vaddagiri 840cdb0958 cpufreq: cpu-boost: Resolve deadlock when waking up sync thread
CPU boost driver receives notification from scheduler when threads
migrate towards a cpu and in turn wakes up a sync thread associated
with that cpu to handle any required frequency transitions. The wakeup
call however can lead to a deadlock inside scheduler under some
circumstance. The deadlock is seen when sync thread is the only thread
running on a cpu and goes to sleep (say by calling wait_event() ->
schedule()). Midway through this sleep (schedule()) call, while cpu is
still running in context of sync thread, scheduler attempts a load
balance (realizing that cpu is about to become idle) which can result
in tasks being migrated towards the cpu going idle. This will cause
migration notification to be issued and in turn a wakeup on sync
thread. The wakeup call however gets stuck in below while() loop
inside scheduler:

try_to_wake_up(struct task_struct *p, ...)
{

	/*
	 * If the owning (remote) cpu is still in the middle of
	 * schedule() with this task as prev, wait until its done
	 * referencing the task.
	 */
	while (p->on_cpu)
		cpu_relax();

}

A possible fix could be to teach try_to_wake_up() about this
special case. Another fix, implemented in this patch and that helps
minimize scheduler changes, is to have cpu boost driver not issue a
wakeup under this special circumstance, which was found to occur very
infrequently.

Change-Id: I92bc68a22d51595a208673fe2a1eedfa97004f9e
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
2013-12-20 18:56:07 -08:00
Rohit Gupta 24d3ed232f cpufreq: interactive: Use default min_sample_time if SDF is zero
Use min_sample_time to step down from max frequencies if
sampling_down_factor is set to zero.
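
A minimal sketch of the resulting selection, assuming both tunables are expressed in the same time unit (the helper name is hypothetical):

```c
/*
 * Minimum time to stay at the current frequency before stepping down.
 * At max frequency sampling_down_factor applies, unless it is zero,
 * in which case min_sample_time is used as the default.
 */
static unsigned long min_hold_time(unsigned int cur_khz, unsigned int max_khz,
				   unsigned long min_sample_time,
				   unsigned long sampling_down_factor)
{
	if (cur_khz == max_khz && sampling_down_factor != 0)
		return sampling_down_factor;
	return min_sample_time;
}
```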

Change-Id: I2e89bef6220557cb95efd20d658e0c05753b4c3c
Signed-off-by: Rohit Gupta <rohgup@codeaurora.org>
2013-12-20 17:47:47 -08:00
Linux Build Service Account cd45ce66b0 Merge "cpufreq: interactive: Remove trace event from idle_start handler" 2013-12-10 17:15:21 -08:00
Linux Build Service Account 09efed950d Merge "cpufreq: interactive: sync freq feature for interactive governor" 2013-12-10 17:15:09 -08:00
Rohit Gupta cdd6914bb7 cpufreq: interactive: Remove trace event from idle_start handler
Removed the trace_cpufreq_interactive_idle_start.
Also fix a crash resulting from accessing NULL policy before taking
the pcpu->enable_sem lock. The policy can be NULL if the core is
hotplugged out before the enable_sem lock is taken.

Change-Id: I7e2809cc016b3b383a44cdf3c697013e2d2b5417
Signed-off-by: Rohit Gupta <rohgup@codeaurora.org>
2013-12-09 19:22:27 -08:00
Linux Build Service Account 0e50cfa58c Merge "cpufreq: Add Input Boost feature to the cpu-boost driver" 2013-12-08 13:30:02 -08:00
Linux Build Service Account 0f3ff5b57f Merge "cpufreq: interactive: Reset floor_validate_time if busy at max for 100ms" 2013-12-07 11:10:53 -08:00
Rohit Gupta b6cd9b3352 cpufreq: interactive: Reset floor_validate_time if busy at max for 100ms
When the interactive governor selects to run at max frequency it doesn't
re-schedule the timer until it hits an idle. This change checks if the CPU
has been continuously busy for last 100ms on hitting an idle start. If yes,
then floor_validate_time is reset so that the CPU stays at max frequency
for at least another 100 ms before stepping down.
This is an important feature for detecting CPU intensive workloads which
require high frequencies for achieving better performance.

Change-Id: I7d48ffbc3d50a80af9be3bf94667ee3d0120b763
Signed-off-by: Rohit Gupta <rohgup@codeaurora.org>
2013-12-06 16:43:07 -08:00
Rohit Gupta f3d1980b4d cpufreq: interactive: sync freq feature for interactive governor
1) Add load info to cpufreq_interactive_cpuinfo
2) If load on any other online cpu exceeds sync_freq_load_threshold,
   do not allow the frequency to drop below sync_freq
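
The second point can be sketched as a floor applied to the frequency the governor was about to pick. This is a user-space approximation; the structure and function names are assumptions, not the governor's own.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU load record. */
struct cpu_load_sketch {
	bool online;
	unsigned int load_pct;
};

/*
 * If any other online CPU is loaded above threshold_pct, keep this
 * CPU's new frequency at or above sync_freq_khz.
 */
static unsigned int apply_sync_freq(unsigned int new_khz, size_t this_cpu,
				    const struct cpu_load_sketch *cpus,
				    size_t nr_cpus, unsigned int sync_freq_khz,
				    unsigned int threshold_pct)
{
	size_t i;

	if (new_khz >= sync_freq_khz)
		return new_khz;		/* already above the floor */

	for (i = 0; i < nr_cpus; i++) {
		if (i == this_cpu || !cpus[i].online)
			continue;
		if (cpus[i].load_pct > threshold_pct)
			return sync_freq_khz;
	}
	return new_khz;
}
```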

Change-Id: I3617e10f87b85178914a18bcf04ac2a31a4f1ec1
Signed-off-by: Rohit Gupta <rohgup@codeaurora.org>
2013-12-06 15:44:10 -08:00
Rohit Gupta ff6af80775 cpufreq: interactive: Allow 1 ms error in above_hispeed_delay comparisons
Allow for an error of 1 ms while taking into account
above_hispeed_delay for a frequency

Change-Id: I744e44387152e4efb5978df4f2b9533bf79d4582
Signed-off-by: Rohit Gupta <rohgup@codeaurora.org>
2013-12-04 18:42:21 -08:00
Mark Langsdorf 16802f7ac3 cpufreq: highbank-cpufreq: Enable Midway/ECX-2000
commit fbbc5bfb44a22e7a8ef753a1c8dfb448d7ac8b85 upstream.

Calxeda's new ECX-2000 part uses the same cpufreq interface as highbank,
so add it to the driver's compatibility list.

This is a minor change that can safely be applied to the 3.10 and 3.11
stable trees.

Signed-off-by: Mark Langsdorf <mark.langsdorf@calxeda.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-12-04 10:57:18 -08:00
Rohit Gupta c556bd1322 cpufreq: Add Input Boost feature to the cpu-boost driver
On incoming input events boost the frequency of all online cpus
for at least input_boost_ms ms. This is accomplished by changing
the policy->min of all the online cpus to input_boost_freq.
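
A simplified model of the behaviour described, with time passed in explicitly so the sketch stays self-contained. The structure and helpers are hypothetical; the driver itself acts on policy->min rather than computing a value on demand.

```c
/* Hypothetical boost state; times are in milliseconds. */
struct input_boost_sketch {
	unsigned int input_boost_freq_khz;
	unsigned int input_boost_ms;
	unsigned long long boost_until_ms;	/* 0 = no boost requested yet */
};

/* An input event (re)starts the boost window. */
static void on_input_event(struct input_boost_sketch *b,
			   unsigned long long now_ms)
{
	b->boost_until_ms = now_ms + b->input_boost_ms;
}

/* Minimum frequency in effect: boosted inside the window, normal outside. */
static unsigned int effective_min_khz(const struct input_boost_sketch *b,
				      unsigned int policy_min_khz,
				      unsigned long long now_ms)
{
	if (now_ms < b->boost_until_ms &&
	    b->input_boost_freq_khz > policy_min_khz)
		return b->input_boost_freq_khz;
	return policy_min_khz;
}
```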

Change-Id: Idb0ab75d68ae4ceff259cbbaaec1a9bb3bc871d3
Signed-off-by: Rohit Gupta <rohgup@codeaurora.org>
2013-12-03 18:28:44 -08:00
Rohit Gupta 2dd0e605ee cpufreq: Add a sync limit to cpu-boost
Perform frequency synchronization only when source CPU's frequency
is less than sync_threshold, else sync to the sync_threshold.
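
In other words, the frequency requested for the destination CPU is capped at sync_threshold. A minimal sketch, assuming that a threshold of 0 means "no limit" (that convention is an assumption of this example):

```c
/* Frequency to sync the destination CPU to, given the source CPU's. */
static unsigned int sync_target_khz(unsigned int src_khz,
				    unsigned int sync_threshold_khz)
{
	/* Assumption: 0 disables the limit. */
	if (sync_threshold_khz && src_khz > sync_threshold_khz)
		return sync_threshold_khz;
	return src_khz;
}
```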

Change-Id: I544c414568d4e015b80ce5891dd215275bac95da
Signed-off-by: Rohit Gupta <rohgup@codeaurora.org>
2013-12-03 18:28:40 -08:00
Rohit Gupta d094d23694 cpufreq: interactive: Add a sampling_down_factor for max frequencies
Change min_sample_time to sampling_down_factor ms when running at max
frequency. This would keep the frequency up at max for at least that much
time before allowing it to step down. This is an important performance
enhancement for CPU intensive workloads like benchmarks where there is
consistently high load on CPU to keep it up at max frequency.

Change-Id: Ia7e35b0625bedf20e7ef3a1f52e5828ffbfed93e
Signed-off-by: Rohit Gupta <rohgup@codeaurora.org>
2013-11-27 17:42:17 -08:00
Saravana Kannan 6249bf152c cpufreq: cpu-boost: Add cpu-boost driver
When certain bursty and important events take place, it might take a while
for the current cpufreq governor to notice the new load and react to it.
That would result in poor user experience. To alleviate this, the cpu-boost
driver boosts the frequency of a CPU for a short duration to maintain good
user experience while the governor catches up.

Specifically, this commit deals with ensuring that when "important" tasks
migrate from a fast CPU to a slow CPU, the frequency of the slow CPU is
boosted to be at least as high as the fast CPU for a short duration.

Since this driver enforces the boost by hooking into standard cpufreq
ADJUST notifiers, it has several advantages:
- More portable across kernel versions where the cpufreq internals might
  have been rewritten.
- Governor agnostic and hence works with multiple governors like
  conservative, ondemand, interactive, etc.
- Does not affect the sampling period/logic of existing governors.
- Can have the boost period adjusted independent of governor sampling
  period.

Change-Id: Ibd814a20043d0aba64ee7637a4a79b9ffa1b0991
Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
2013-11-25 17:14:47 -08:00
Ian Maund f06163e6d0 msm: reap unused kernel files
This change removes source files from the kernel tree that
were not being used during make. The list of used files
was generated using an annotated make log and was then
compared with new files added since the public release of
kernel version 3.10.00. New files which were added but
not used have been removed from the tree.

A diff was also run to determine the list of files that had
been modified since the release of kernel version 3.10.00.
These files were then scrubbed based on the current kernel
configuration, removing invalid and unused conditionals.

Some files which support planned functionality or are
useful in debugging have been excluded from this reap.

Change-Id: Ia44a224d3cea7bc78dd45e8a8279860d35d4b008
Signed-off-by: Ian Maund <imaund@codeaurora.org>
2013-11-21 17:45:28 -08:00
Minsung Kim 01ae0541f0 cpufreq: interactive: fix show_target_loads and show_above_hispeed_delay
Remove the trailing whitespace from the output of target_loads and
above_hispeed_delay. The problem shows up when a user-space program saves
these parameters, changes them, and later tries to restore the saved
values: writing the saved string back fails with EINVAL.
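
A sketch of a show routine that avoids the trailing space by emitting the separator before every value except the first. This is a simplified user-space version: the function name is hypothetical and a single space is assumed as the only separator.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Write the values separated by single spaces and terminated by a
 * newline, with no space before the newline.
 * Returns the number of characters written, or -1 if buf is too small.
 */
static int show_values_sketch(char *buf, size_t size,
			      const unsigned int *vals, size_t n)
{
	size_t i, len = 0;

	for (i = 0; i < n; i++) {
		/* Separator goes before every value except the first. */
		int r = snprintf(buf + len, size - len, "%s%u",
				 i ? " " : "", vals[i]);
		if (r < 0 || (size_t)r >= size - len)
			return -1;	/* would not fit */
		len += (size_t)r;
	}
	if (size - len < 2)
		return -1;	/* no room for newline and terminator */
	buf[len++] = '\n';
	buf[len] = '\0';
	return (int)len;
}
```

Output produced this way can be written back to the same attribute unchanged, which is the save-and-restore case the commit is concerned with.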

Change-Id: I5a74e3824602cd6f2b74651adda5ec1b627e61e9
Signed-off-by: Minsung Kim <ms925.kim@samsung.com>
Git-commit: cf0fad49d17cb8273ce555dd5b7afab67d7923bf
Git-repo: https://android.googlesource.com/kernel/common/
Signed-off-by: Dilip Gudlur <dgudlur@codeaurora.org>
2013-11-18 11:57:24 -08:00
Dirk Brandewie f606b358df cpufreq / intel_pstate: Fix max_perf_pct on resume
commit 52e0a509e5d6f902ec26bc2a8bb02b137dc453be upstream.

If the system is suspended while max_perf_pct is less than 100 percent
or no_turbo is set, policy->{min,max} will be set incorrectly with scaled
values, which turns the scaled values into hard limits.

References: https://bugzilla.kernel.org/show_bug.cgi?id=61241
Reported-by: Patrick Bartels <petzicus@googlemail.com>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-13 12:05:30 +09:00
Junjie Wu aa3edab571 cpufreq: Save and restore user min/max freq for hotplug
In addition to restoring the previous governor after hotplug, restore
min/max frequency set in user_policy.

This patch combines commit:
1) 026cf76c5a2b1b84cbe12b79800f97389abfd589 (android-msm-2.6.35)
   cpufreq: Save and restore min and max frequencies
2) 82a8db17421e475eb662ad83674d4f7a351bc55e (msm-3.4)
   cpufreq: Save user policy min/max instead of policy min/max during
   hotplug

And refactors it to apply cleanly onto the 3.10 kernel.

Change-Id: Id097132765a5487618ca020bae3cc9c9ea791072
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2013-09-20 23:32:29 -07:00
Stephen Boyd b6d59d6bfc cpufreq: Don't use smp_processor_id() in preemptible context
Workqueues are preemptible even if works are queued on them with
queue_work_on(). Let's use raw_smp_processor_id() here to silence
the warning.

BUG: using smp_processor_id() in preemptible [00000000] code: kworker/3:2/674
caller is gov_queue_work+0x28/0xb0
CPU: 0 PID: 674 Comm: kworker/3:2 Tainted: G        W    3.10.0 #30
Workqueue: events od_dbs_timer
[<c010c178>] (unwind_backtrace+0x0/0x11c) from [<c0109dec>] (show_stack+0x10/0x14)
[<c0109dec>] (show_stack+0x10/0x14) from [<c03885a4>] (debug_smp_processor_id+0xbc/0xf0)
[<c03885a4>] (debug_smp_processor_id+0xbc/0xf0) from [<c0635864>] (gov_queue_work+0x28/0xb0)
[<c0635864>] (gov_queue_work+0x28/0xb0) from [<c0635618>] (od_dbs_timer+0x108/0x134)
[<c0635618>] (od_dbs_timer+0x108/0x134) from [<c01aa8f8>] (process_one_work+0x25c/0x444)
[<c01aa8f8>] (process_one_work+0x25c/0x444) from [<c01aaf88>] (worker_thread+0x200/0x344)
[<c01aaf88>] (worker_thread+0x200/0x344) from [<c01b03bc>] (kthread+0xa0/0xb0)
[<c01b03bc>] (kthread+0xa0/0xb0) from [<c01061b8>] (ret_from_fork+0x14/0x3c)

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
2013-09-04 17:24:51 -07:00
Stephen Boyd c43e0a8d13 cpufreq: Fix timer/workqueue corruption due to double queueing
When a CPU is hot removed we'll cancel all the delayed work items
via gov_cancel_work(). Normally this will just cancel a delayed
timer on each CPU that the policy is managing and the work won't
run, but if the work is already running the workqueue code will
wait for the work to finish before continuing to prevent the
work items from re-queuing themselves like they normally do. This
scheme will work most of the time, except for the case where the
work function determines that it should adjust the delay for all
other CPUs that the policy is managing. If this scenario occurs,
the canceling CPU will cancel its own work but queue up the other
CPUs' works to run. For example:

 CPU0                                        CPU1
 ----                                        ----
 cpu_down()
  ...
  __cpufreq_remove_dev()
   cpufreq_governor_dbs()
    case CPUFREQ_GOV_STOP:
     gov_cancel_work(dbs_data, policy);
      cpu0 work is canceled
       timer is canceled
       cpu1 work is canceled                    <work runs>
       <waits for cpu1>                         od_dbs_timer()
                                                 gov_queue_work(*, *, true);
                                                  cpu0 work queued
                                                  cpu1 work queued
                                                  cpu2 work queued
                                                  ...
       cpu1 work is canceled
       cpu2 work is canceled
       ...

At the end of the GOV_STOP case cpu0 still has a work queued to
run although the code is expecting all of the works to be
canceled. __cpufreq_remove_dev() will then proceed to
re-initialize all the other CPUs' works except for the CPU that is
going down. The CPUFREQ_GOV_START case in cpufreq_governor_dbs()
will trample over the queued work and debugobjects will spit out
a warning:

WARNING: at lib/debugobjects.c:260 debug_print_object+0x94/0xbc()
ODEBUG: init active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x10
Modules linked in:
CPU: 0 PID: 1491 Comm: sh Tainted: G        W    3.10.0 #19
[<c010c178>] (unwind_backtrace+0x0/0x11c) from [<c0109dec>] (show_stack+0x10/0x14)
[<c0109dec>] (show_stack+0x10/0x14) from [<c01904cc>] (warn_slowpath_common+0x4c/0x6c)
[<c01904cc>] (warn_slowpath_common+0x4c/0x6c) from [<c019056c>] (warn_slowpath_fmt+0x2c/0x3c)
[<c019056c>] (warn_slowpath_fmt+0x2c/0x3c) from [<c0388a7c>] (debug_print_object+0x94/0xbc)
[<c0388a7c>] (debug_print_object+0x94/0xbc) from [<c0388e34>] (__debug_object_init+0x2d0/0x340)
[<c0388e34>] (__debug_object_init+0x2d0/0x340) from [<c019e3b0>] (init_timer_key+0x14/0xb0)
[<c019e3b0>] (init_timer_key+0x14/0xb0) from [<c0635f78>] (cpufreq_governor_dbs+0x3e8/0x5f8)
[<c0635f78>] (cpufreq_governor_dbs+0x3e8/0x5f8) from [<c06325a0>] (__cpufreq_governor+0xdc/0x1a4)
[<c06325a0>] (__cpufreq_governor+0xdc/0x1a4) from [<c0633704>] (__cpufreq_remove_dev.isra.10+0x3b4/0x434)
[<c0633704>] (__cpufreq_remove_dev.isra.10+0x3b4/0x434) from [<c08989f4>] (cpufreq_cpu_callback+0x60/0x80)
[<c08989f4>] (cpufreq_cpu_callback+0x60/0x80) from [<c08a43c0>] (notifier_call_chain+0x38/0x68)
[<c08a43c0>] (notifier_call_chain+0x38/0x68) from [<c01938e0>] (__cpu_notify+0x28/0x40)
[<c01938e0>] (__cpu_notify+0x28/0x40) from [<c0892ad4>] (_cpu_down+0x7c/0x2c0)
[<c0892ad4>] (_cpu_down+0x7c/0x2c0) from [<c0892d3c>] (cpu_down+0x24/0x40)
[<c0892d3c>] (cpu_down+0x24/0x40) from [<c0893ea8>] (store_online+0x2c/0x74)
[<c0893ea8>] (store_online+0x2c/0x74) from [<c04519d8>] (dev_attr_store+0x18/0x24)
[<c04519d8>] (dev_attr_store+0x18/0x24) from [<c02a69d4>] (sysfs_write_file+0x100/0x148)
[<c02a69d4>] (sysfs_write_file+0x100/0x148) from [<c0255c18>] (vfs_write+0xcc/0x174)
[<c0255c18>] (vfs_write+0xcc/0x174) from [<c0255f70>] (SyS_write+0x38/0x64)
[<c0255f70>] (SyS_write+0x38/0x64) from [<c0106120>] (ret_fast_syscall+0x0/0x30)

The simplest fix is to check and see if the governor is being
stopped and ignore the all_cpus flag so that only the work that's
being canceled has the chance to re-queue itself.
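
The fix described above can be modeled in userspace roughly as follows. This is only a sketch under assumptions: struct policy_sketch, its governor_enabled flag and queue_work_sketch() are hypothetical stand-ins for the policy state and gov_queue_work(), and the array of booleans replaces real delayed work items.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS 4

/* Hypothetical model of a policy and its per-CPU delayed works. */
struct policy_sketch {
    bool governor_enabled;          /* false once GOV_STOP has begun */
    bool queued[SKETCH_NR_CPUS];    /* true if that CPU's work is queued */
};

/*
 * Queue the governor work. When the governor is being stopped, the
 * all_cpus request is ignored so that only the calling CPU's work can
 * re-queue itself. Returns the number of works queued.
 */
int queue_work_sketch(struct policy_sketch *p, int this_cpu, bool all_cpus)
{
    int cpu, count = 0;

    assert(this_cpu >= 0 && this_cpu < SKETCH_NR_CPUS);

    if (!p->governor_enabled)
        all_cpus = false;

    if (!all_cpus) {
        p->queued[this_cpu] = true;
        return 1;
    }

    for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
        p->queued[cpu] = true;
        count++;
    }
    return count;
}
```

With the flag cleared, a work function running on cpu1 can only re-queue cpu1's own work, which the canceling CPU is already waiting on, so no work for an already-canceled CPU is left behind.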

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
2013-09-04 17:24:47 -07:00
Steve Muckle 652b447d79 msm: dcvs: gpu minimum frequency levels
System performance is enhanced if the gpu frequency is given a
minimum corresponding to various frequency levels of CPU 0.

Change-Id: Iba168d708524fc8ef164428bb5f4e0631a499342
Signed-off-by: Steve Muckle <smuckle@codeaurora.org>
2013-09-04 15:36:27 -07:00
Abhijeet Dharmapurikar a7984e7688 msm: dcvs: remove core name
Currently core_name is used to identify which core the dcvs operates on.
Instead, use a type and a type num during registration with dcvs and
return an id (dcvs_core_id) upon successful registration.

The dcvs_core_id is used by the clients of msm_dcvs to call its
APIs, viz. freq_start, freq_stop, msm_dcvs_idle, etc.

The dcvs in turn uses the type num passed in at registration time to
invoke APIs on the clients, viz. set_freq, get_freq, idle_enable.

This further cleans up the internal dcvs add_core and get_core
implementation. One need not pass around the core_name and use the type
instead.

Change-Id: I1736f4befec02249eae969694c7b7696dc9cdb9d
Signed-off-by: Abhijeet Dharmapurikar <adharmap@codeaurora.org>
2013-09-04 15:25:40 -07:00
Abhijeet Dharmapurikar 6656d624c6 msm: dcvs: remove idle notification registration
Register the idle enable callback along with the core. The code
becomes cleaner and easier to update.

Importantly, the msm_dcvs_idle driver becomes useless. Remove it
and instead let the msm governor handle idle enabling and disabling.

Change-Id: Ia056b8b1fcca8ebd356fa0484148c3ade54026fc
Signed-off-by: Abhijeet Dharmapurikar <adharmap@codeaurora.org>
2013-09-04 15:25:39 -07:00
Abhijeet Dharmapurikar d078c4f1d1 msm: dcvs: provide frequency set/get callbacks at registration
There is no need to register a separate structure for setting and
getting frequency.

Simply pass function pointers to set and get callbacks when a
core is registered.

While at it, rename msm_dcvs_freq_sink_register/unregister to
msm_dcvs_freq_sink_start/stop to better reflect what those
APIs are meant to do.

Change-Id: I0e5bb1fe8d127e5841ee9916dad5c3fd64815228
Signed-off-by: Abhijeet Dharmapurikar <adharmap@codeaurora.org>
2013-09-04 15:25:25 -07:00
Abhijeet Dharmapurikar b420ae3196 msm: dcvs: update dcvs if the governor limits change
We had seen issues where dcvs goes out of sync with the actual
freq the cpu is running at. The root cause was that if userspace
changes the limits on the governor, the governor ends up changing
the frequency without notifying dcvs.

Provide an api for the governor to call dcvs when a frequency change
happens.

Change-Id: Ifb33fac9cdde8535e274b7c91534dc44cc19dbe7
Signed-off-by: Abhijeet Dharmapurikar <adharmap@codeaurora.org>
2013-09-04 15:25:24 -07:00
Abhijeet Dharmapurikar e585b1137c cpufreq: msm: fix race in cpufreq
__cpufreq_driver_target expects to be called with the rw semaphore
held. We are not doing this when the frequency is asked to be changed
by dcvs.

Use the cpufreq_driver_target variant of that function, which takes the
rw semaphore before setting the frequency.

Change-Id: I67a7d437f68749f43c26b95ae73f436890326019
Signed-off-by: Abhijeet Dharmapurikar <adharmap@codeaurora.org>
2013-09-04 15:25:19 -07:00
Abhijeet Dharmapurikar e2c08a46fc cpufreq: msm: remove gov_mutex from stop
The stop could end up calling the set_frequency callback. The set_frequency
callback takes the same gov_mutex lock and we end up in a deadlock.

Fix this by removing the locks around stop.

Change-Id: Ic687d92825f253aa3e23e4bcaf3c96d5879d4b43
Signed-off-by: Abhijeet Dharmapurikar <adharmap@codeaurora.org>
2013-09-04 15:25:19 -07:00
Abhijeet Dharmapurikar 2f4ea63bc6 msm: dcvs: Add thermal interfaces
The algorithm needs thermal inputs for all the cores. Create members in
the internal core_info structure and platform data/device tree to pass
in the sensors they use.

Update the dcvs code to notify the temperature to TZ.

Change-Id: I96e123eb49cdd564564e5fe12531407406eafa0c
Signed-off-by: Abhijeet Dharmapurikar <adharmap@codeaurora.org>
2013-09-04 15:25:18 -07:00
Abhijeet Dharmapurikar dd87ea3e19 msm: dcvs: rearrange platform data
This change
- removes the use of group id and instead introduces core type
- rearranges platform data, adds energy curve coefficients and power
  parameters
- allows the energy params to be negative numbers

The change also mandates updates to the msm8974-gpu.dtsi and the
associated binding documentation.

Also take this opportunity to remove devices for unsupported platforms:
8930 and 8960.

Change-Id: Ie8d5a691defb825955b240c1b304e04f0a21499a
Signed-off-by: Abhijeet Dharmapurikar <adharmap@codeaurora.org>
2013-09-04 15:25:07 -07:00
Amar Singhal ee974af4c9 cpufreq: make the "scaling_cur_freq" sysfs entry pollable
Wake up userspace pollers when the CPU frequency changes. Userspace
may then take action to change the power/performance
characteristics of the device.
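
From the userspace side, waiting on a pollable sysfs attribute typically looks like the sketch below. It assumes the usual sysfs notification convention (a change is reported as POLLPRI | POLLERR, and the file must be re-read from offset 0 after each wakeup). The helper name wait_for_freq_change() is hypothetical, and whether scaling_cur_freq actually notifies depends on the kernel carrying this patch.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Block until the attribute at `path` signals a change or the timeout
 * expires. Returns the new value (kHz for scaling_cur_freq), or -1 on
 * error or timeout.
 */
long wait_for_freq_change(const char *path, int timeout_ms)
{
    char buf[32];
    struct pollfd pfd;
    ssize_t n;
    long khz = -1;

    assert(path != NULL);

    pfd.fd = open(path, O_RDONLY);
    if (pfd.fd < 0)
        return -1;
    pfd.events = POLLPRI | POLLERR;
    pfd.revents = 0;

    /* Read once so that the next poll() only reports later changes. */
    if (read(pfd.fd, buf, sizeof(buf) - 1) < 0)
        goto out;

    /* poll() returns 0 on timeout and -1 on error. */
    if (poll(&pfd, 1, timeout_ms) <= 0)
        goto out;

    /* Rewind and re-read to fetch the updated value. */
    if (lseek(pfd.fd, 0, SEEK_SET) < 0)
        goto out;
    n = read(pfd.fd, buf, sizeof(buf) - 1);
    if (n <= 0)
        goto out;
    buf[n] = '\0';
    khz = strtol(buf, NULL, 10);
out:
    close(pfd.fd);
    return khz;
}
```

A caller might pass /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq with a timeout of a few seconds and react to the returned frequency.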

Change-Id: I3030b22084fe7e0143b978a198ddcc579e7d6e83
Signed-off-by: Amar Singhal <asinghal@codeaurora.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
2013-09-04 14:47:07 -07:00
Amar Singhal 187dd1a240 cpufreq: Make the "scaling_governor" sysfs node pollable
A userspace module programs different qos-rules depending on the
governor running in the system. Make the governor node
pollable, so that the userspace module can be triggered when
the value of the governor changes.

Change-Id: Ic89c77c7d16b0f8954d59a211612e9a8e98a2c28
Signed-off-by: Amar Singhal <asinghal@codeaurora.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
2013-09-04 14:47:07 -07:00
Praveen Chidambaram 0fa51886ea msm: dcvs: Add 'msm-dcvs' cpufreq governor
The 'msm-dcvs' CPUFreq governor interfaces the msm_dcvs driver frequency
change requests with the CPUFreq framework.

Change-Id: I950e5b09f568412760d9b022f59f208c6bcb54ce
Signed-off-by: Praveen Chidambaram <pchidamb@codeaurora.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
2013-09-04 14:47:06 -07:00
Stephen Boyd 38d8910730 Merge branch 'qandroid-3.10' into msm-3.10
* qandroid-3.10: (636 commits)
  netfilter: xt_qtaguid: Protect iface list access with necessary lock
  HID: magicmouse: Fix build warning
  USB: gadget: mtp: Fix OUT endpoint request length usage in read
  USB: gadget: f_mtp: Fix using tx buffer pointer
  msm: Fix race condition in domain lookup
  msm: Add null-pointer checks for domains
  base: sync: increase size of sync_timeline name
  USB: gadget: mtp: Add module parameters for Tx transfer length
  msm: iommu: Lock the genpool allocation
  gpu: ion: fix page offset in dma_buf_kmap()
  gpu: ion: Fix bug in ion_system_heap map_user
  gpu: ion: Only map as much of the vma as the user requested
  gpu: ion: use vmalloc to allocate page array to map kernel
  gpu: ion: Remove dead comments
  gpu: ion: Minimize allocation fallback delay
  mmc: sd: Set the card removed if card detect fails
  gpu: ion: don't fault in individual pages for the CP heap
  gpu: ion: do not ask for compound pages in system heap
  gpu: ion: Modify the system heap to try to allocate large/huge pages
  gpu: ion: Set the dma_address of the sg list at alloc time
  ...

Conflicts:
	arch/arm/Kconfig
	arch/arm/include/asm/hardware/cache-l2x0.h
	arch/arm/mm/cache-l2x0.c
	drivers/mmc/card/block.c
	drivers/usb/gadget/udc-core.c
2013-09-04 14:46:18 -07:00
Xiaoguang Chen 19d5afc535 cpufreq: Fix governor start/stop race condition
Cpufreq governors' stop and start operations should be carried out
in sequence.  Otherwise, there will be unexpected behavior, like in
the example below.

Suppose there are 4 CPUs and policy->cpu=CPU0, CPU1/2/3 are linked
to CPU0.  The normal sequence is:

 1) Current governor is userspace.  An application tries to set the
    governor to ondemand.  It will call __cpufreq_set_policy() in
    which it will stop the userspace governor and then start the
    ondemand governor.

 2) Current governor is userspace.  The online of CPU3 runs on CPU0.
    It will call cpufreq_add_policy_cpu() in which it will first
    stop the userspace governor, and then start it again.

If the sequence of the above two cases interleaves, it becomes:

 1) Application stops userspace governor
 2)                                  Hotplug stops userspace governor

which is a problem, because the governor shouldn't be stopped twice
in a row.  What happens next is:

 3) Application starts ondemand governor
 4)                                  Hotplug starts a governor

In step 4, the hotplug is supposed to start the userspace governor,
but now the governor has been changed by the application to ondemand,
so the ondemand governor is started once again, which is incorrect.

The solution is to prevent policy governors from being stopped
multiple times in a row.  A governor should only be stopped once for
one policy.  After it has been stopped, no more governor stop
operations should be executed.

Also add a mutex to serialize governor operations.
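
The guard described above can be sketched in userspace as follows. This is an illustrative model, not the kernel code: struct policy_sketch, governor_event_sketch() and the enum values are hypothetical, and a pthread mutex stands in for the kernel mutex.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

enum gov_event_sketch { GOV_START_SKETCH, GOV_STOP_SKETCH };

/* Hypothetical per-policy state: is a governor currently running? */
struct policy_sketch {
    bool governor_enabled;
};

/* Serializes all governor start/stop decisions. */
static pthread_mutex_t governor_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Returns 0 if the event may proceed, or -EBUSY if it would stop an
 * already-stopped governor or start an already-started one.
 */
int governor_event_sketch(struct policy_sketch *p, enum gov_event_sketch ev)
{
    int ret = 0;

    assert(p != NULL);

    pthread_mutex_lock(&governor_lock);
    if (ev == GOV_STOP_SKETCH) {
        if (!p->governor_enabled)
            ret = -EBUSY;               /* second stop in a row */
        else
            p->governor_enabled = false;
    } else {
        if (p->governor_enabled)
            ret = -EBUSY;               /* second start in a row */
        else
            p->governor_enabled = true;
    }
    pthread_mutex_unlock(&governor_lock);
    return ret;
}
```

In the interleaving from the example, the hotplug path's stop request would be rejected with -EBUSY because the application has already stopped the governor, so the hotplug path can bail out instead of later starting the wrong governor.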

[rjw: Changelog.  And you owe me a beverage of my choice.]
Signed-off-by: Xiaoguang Chen <chenxg@marvell.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 95731ebb114c5f0c028459388560fc2a72fe5049
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
2013-09-04 14:44:53 -07:00
Jacob Shin 22658c8250 cpufreq: don't leave stale policy pointer in cdbs->cur_policy
Clear ->cur_policy when stopping a governor, or the ->cur_policy
pointer may be stale on systems with have_governor_per_policy when a
new policy is allocated due to CPU hotplug offline/online.

[rjw: Changelog]
Suggested-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Jacob Shin <jacob.shin@amd.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 419e172145cf6c51d436a8bf4afcd17511f0ff79
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
2013-09-04 14:44:52 -07:00
Rafael J. Wysocki 5ef7dbef0c cpufreq: Fix cpufreq driver module refcount balance after suspend/resume
Since cpufreq_cpu_put() called by __cpufreq_remove_dev() drops the
driver module refcount, __cpufreq_remove_dev() causes that refcount
to become negative for the cpufreq driver after a suspend/resume
cycle.

This is not the only bad thing that happens there, however, because
kobject_put() should only be called for the policy kobject at this
point if the CPU is not the last one for that policy.

Namely, if the given CPU is the last one for that policy, the
policy kobject's refcount should be 1 at this point, as set by
cpufreq_add_dev_interface(), and only needs to be dropped once for
the kobject to go away.  This actually happens under the cpus == 1
check, so it need not be done before by cpufreq_cpu_put().

On the other hand, if the given CPU is not the last one for that
policy, this means that cpufreq_add_policy_cpu() has been called
at least once for that policy and cpufreq_cpu_get() has been
called for it too.  To balance that cpufreq_cpu_get(), we need to
call cpufreq_cpu_put() in that case.

Thus, to fix the described problem and keep the reference
counters balanced in both cases, move the cpufreq_cpu_put() call
in __cpufreq_remove_dev() to the code path executed only for
CPUs that share the policy with other CPUs.

Reported-and-tested-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: 3.10+ <stable@vger.kernel.org>
Git-commit: 2a99859932281ed6c2ecdd988855f8f6838f6743
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
2013-08-22 18:09:28 -07:00
Srivatsa S. Bhat 38f70e635d cpufreq: Revert commit 2f7021a8 to fix CPU hotplug regression
commit 2f7021a8 "cpufreq: protect 'policy->cpus' from offlining
during __gov_queue_work()" caused a regression in CPU hotplug,
because it led to a deadlock between the cpufreq governor worker thread
and the CPU hotplug writer task.

Lockdep splat corresponding to this deadlock is shown below:

[   60.277396] ======================================================
[   60.277400] [ INFO: possible circular locking dependency detected ]
[   60.277407] 3.10.0-rc7-dbg-01385-g241fd04-dirty #1744 Not tainted
[   60.277411] -------------------------------------------------------
[   60.277417] bash/2225 is trying to acquire lock:
[   60.277422]  ((&(&j_cdbs->work)->work)){+.+...}, at: [<ffffffff810621b5>] flush_work+0x5/0x280
[   60.277444] but task is already holding lock:
[   60.277449]  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81042d8b>] cpu_hotplug_begin+0x2b/0x60
[   60.277465] which lock already depends on the new lock.

[   60.277472] the existing dependency chain (in reverse order) is:
[   60.277477] -> #2 (cpu_hotplug.lock){+.+.+.}:
[   60.277490]        [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
[   60.277503]        [<ffffffff815b6157>] mutex_lock_nested+0x67/0x410
[   60.277514]        [<ffffffff81042cbc>] get_online_cpus+0x3c/0x60
[   60.277522]        [<ffffffff814b842a>] gov_queue_work+0x2a/0xb0
[   60.277532]        [<ffffffff814b7891>] cs_dbs_timer+0xc1/0xe0
[   60.277543]        [<ffffffff8106302d>] process_one_work+0x1cd/0x6a0
[   60.277552]        [<ffffffff81063d31>] worker_thread+0x121/0x3a0
[   60.277560]        [<ffffffff8106ae2b>] kthread+0xdb/0xe0
[   60.277569]        [<ffffffff815bb96c>] ret_from_fork+0x7c/0xb0
[   60.277580] -> #1 (&j_cdbs->timer_mutex){+.+...}:
[   60.277592]        [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
[   60.277600]        [<ffffffff815b6157>] mutex_lock_nested+0x67/0x410
[   60.277608]        [<ffffffff814b785d>] cs_dbs_timer+0x8d/0xe0
[   60.277616]        [<ffffffff8106302d>] process_one_work+0x1cd/0x6a0
[   60.277624]        [<ffffffff81063d31>] worker_thread+0x121/0x3a0
[   60.277633]        [<ffffffff8106ae2b>] kthread+0xdb/0xe0
[   60.277640]        [<ffffffff815bb96c>] ret_from_fork+0x7c/0xb0
[   60.277649] -> #0 ((&(&j_cdbs->work)->work)){+.+...}:
[   60.277661]        [<ffffffff810ab826>] __lock_acquire+0x1766/0x1d30
[   60.277669]        [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
[   60.277677]        [<ffffffff810621ed>] flush_work+0x3d/0x280
[   60.277685]        [<ffffffff81062d8a>] __cancel_work_timer+0x8a/0x120
[   60.277693]        [<ffffffff81062e53>] cancel_delayed_work_sync+0x13/0x20
[   60.277701]        [<ffffffff814b89d9>] cpufreq_governor_dbs+0x529/0x6f0
[   60.277709]        [<ffffffff814b76a7>] cs_cpufreq_governor_dbs+0x17/0x20
[   60.277719]        [<ffffffff814b5df8>] __cpufreq_governor+0x48/0x100
[   60.277728]        [<ffffffff814b6b80>] __cpufreq_remove_dev.isra.14+0x80/0x3c0
[   60.277737]        [<ffffffff815adc0d>] cpufreq_cpu_callback+0x38/0x4c
[   60.277747]        [<ffffffff81071a4d>] notifier_call_chain+0x5d/0x110
[   60.277759]        [<ffffffff81071b0e>] __raw_notifier_call_chain+0xe/0x10
[   60.277768]        [<ffffffff815a0a68>] _cpu_down+0x88/0x330
[   60.277779]        [<ffffffff815a0d46>] cpu_down+0x36/0x50
[   60.277788]        [<ffffffff815a2748>] store_online+0x98/0xd0
[   60.277796]        [<ffffffff81452a28>] dev_attr_store+0x18/0x30
[   60.277806]        [<ffffffff811d9edb>] sysfs_write_file+0xdb/0x150
[   60.277818]        [<ffffffff8116806d>] vfs_write+0xbd/0x1f0
[   60.277826]        [<ffffffff811686fc>] SyS_write+0x4c/0xa0
[   60.277834]        [<ffffffff815bbbbe>] tracesys+0xd0/0xd5
[   60.277842] other info that might help us debug this:

[   60.277848] Chain exists of:
  (&(&j_cdbs->work)->work) --> &j_cdbs->timer_mutex --> cpu_hotplug.lock

[   60.277864]  Possible unsafe locking scenario:

[   60.277869]        CPU0                    CPU1
[   60.277873]        ----                    ----
[   60.277877]   lock(cpu_hotplug.lock);
[   60.277885]                                lock(&j_cdbs->timer_mutex);
[   60.277892]                                lock(cpu_hotplug.lock);
[   60.277900]   lock((&(&j_cdbs->work)->work));
[   60.277907]  *** DEADLOCK ***

[   60.277915] 6 locks held by bash/2225:
[   60.277919]  #0:  (sb_writers#6){.+.+.+}, at: [<ffffffff81168173>] vfs_write+0x1c3/0x1f0
[   60.277937]  #1:  (&buffer->mutex){+.+.+.}, at: [<ffffffff811d9e3c>] sysfs_write_file+0x3c/0x150
[   60.277954]  #2:  (s_active#61){.+.+.+}, at: [<ffffffff811d9ec3>] sysfs_write_file+0xc3/0x150
[   60.277972]  #3:  (x86_cpu_hotplug_driver_mutex){+.+...}, at: [<ffffffff81024cf7>] cpu_hotplug_driver_lock+0x17/0x20
[   60.277990]  #4:  (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff815a0d32>] cpu_down+0x22/0x50
[   60.278007]  #5:  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81042d8b>] cpu_hotplug_begin+0x2b/0x60
[   60.278023] stack backtrace:
[   60.278031] CPU: 3 PID: 2225 Comm: bash Not tainted 3.10.0-rc7-dbg-01385-g241fd04-dirty #1744
[   60.278037] Hardware name: Acer             Aspire 5741G    /Aspire 5741G    , BIOS V1.20 02/08/2011
[   60.278042]  ffffffff8204e110 ffff88014df6b9f8 ffffffff815b3d90 ffff88014df6ba38
[   60.278055]  ffffffff815b0a8d ffff880150ed3f60 ffff880150ed4770 3871c4002c8980b2
[   60.278068]  ffff880150ed4748 ffff880150ed4770 ffff880150ed3f60 ffff88014df6bb00
[   60.278081] Call Trace:
[   60.278091]  [<ffffffff815b3d90>] dump_stack+0x19/0x1b
[   60.278101]  [<ffffffff815b0a8d>] print_circular_bug+0x2b6/0x2c5
[   60.278111]  [<ffffffff810ab826>] __lock_acquire+0x1766/0x1d30
[   60.278123]  [<ffffffff81067e08>] ? __kernel_text_address+0x58/0x80
[   60.278134]  [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
[   60.278142]  [<ffffffff810621b5>] ? flush_work+0x5/0x280
[   60.278151]  [<ffffffff810621ed>] flush_work+0x3d/0x280
[   60.278159]  [<ffffffff810621b5>] ? flush_work+0x5/0x280
[   60.278169]  [<ffffffff810a9b14>] ? mark_held_locks+0x94/0x140
[   60.278178]  [<ffffffff81062d77>] ? __cancel_work_timer+0x77/0x120
[   60.278188]  [<ffffffff810a9cbd>] ? trace_hardirqs_on_caller+0xfd/0x1c0
[   60.278196]  [<ffffffff81062d8a>] __cancel_work_timer+0x8a/0x120
[   60.278206]  [<ffffffff81062e53>] cancel_delayed_work_sync+0x13/0x20
[   60.278214]  [<ffffffff814b89d9>] cpufreq_governor_dbs+0x529/0x6f0
[   60.278225]  [<ffffffff814b76a7>] cs_cpufreq_governor_dbs+0x17/0x20
[   60.278234]  [<ffffffff814b5df8>] __cpufreq_governor+0x48/0x100
[   60.278244]  [<ffffffff814b6b80>] __cpufreq_remove_dev.isra.14+0x80/0x3c0
[   60.278255]  [<ffffffff815adc0d>] cpufreq_cpu_callback+0x38/0x4c
[   60.278265]  [<ffffffff81071a4d>] notifier_call_chain+0x5d/0x110
[   60.278275]  [<ffffffff81071b0e>] __raw_notifier_call_chain+0xe/0x10
[   60.278284]  [<ffffffff815a0a68>] _cpu_down+0x88/0x330
[   60.278292]  [<ffffffff81024cf7>] ? cpu_hotplug_driver_lock+0x17/0x20
[   60.278302]  [<ffffffff815a0d46>] cpu_down+0x36/0x50
[   60.278311]  [<ffffffff815a2748>] store_online+0x98/0xd0
[   60.278320]  [<ffffffff81452a28>] dev_attr_store+0x18/0x30
[   60.278329]  [<ffffffff811d9edb>] sysfs_write_file+0xdb/0x150
[   60.278337]  [<ffffffff8116806d>] vfs_write+0xbd/0x1f0
[   60.278347]  [<ffffffff81185950>] ? fget_light+0x320/0x4b0
[   60.278355]  [<ffffffff811686fc>] SyS_write+0x4c/0xa0
[   60.278364]  [<ffffffff815bbbbe>] tracesys+0xd0/0xd5
[   60.280582] smpboot: CPU 1 is now offline

The intention of that commit was to avoid warnings during CPU
hotplug, which indicated that offline CPUs were getting IPIs from the
cpufreq governor's work items.  But the real root-cause of that
problem was commit a66b2e5 (cpufreq: Preserve sysfs files across
suspend/resume) because it totally skipped all the cpufreq callbacks
during CPU hotplug in the suspend/resume path, and hence it never
actually shut down the cpufreq governor's worker threads during CPU
offline in the suspend/resume path.

Reflecting back, the reason why we never suspected that commit as the
root-cause earlier, was that the original issue was reported with
just the halt command and nobody had brought in suspend/resume to the
equation.

The reason for _that_ in turn, as it turns out, is that earlier
halt/shutdown was being done by disabling non-boot CPUs while tasks
were frozen, just like suspend/resume....  but commit cf7df378a
(reboot: migrate shutdown/reboot to boot cpu) which came somewhere
along that very same time changed that logic: shutdown/halt no longer
takes CPUs offline.  Thus, the test-cases for reproducing the bug
were vastly different and thus we went totally off the trail.

Overall, it was one hell of a confusion with so many commits
affecting each other and also affecting the symptoms of the problems
in subtle ways.  Finally, now since the original problematic commit
(a66b2e5) has been completely reverted, revert this intermediate fix
too (2f7021a8), to fix the CPU hotplug deadlock.  Phew!

Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reported-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Tested-by: Peter Wu <lekensteyn@gmail.com>
Cc: 3.10+ <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: e8d05276f236ee6435e78411f62be9714e0b9377
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
2013-08-22 18:09:27 -07:00
Srivatsa S. Bhat 659989d7c1 cpufreq: Revert commit a66b2e to fix suspend/resume regression
commit a66b2e (cpufreq: Preserve sysfs files across suspend/resume)
has unfortunately caused several things in the cpufreq subsystem to
break subtly after a suspend/resume cycle.

The intention of that patch was to retain the file permissions of the
cpufreq related sysfs files across suspend/resume.  To achieve that,
the commit completely removed the calls to cpufreq_add_dev() and
__cpufreq_remove_dev() during suspend/resume transitions.  But the
problem is that those functions do 2 kinds of things:
  1. Low-level initialization/tear-down that are critical to the
     correct functioning of cpufreq-core.
  2. Kobject and sysfs related initialization/teardown.

Ideally we should have reorganized the code to cleanly separate these
two responsibilities, and skipped only the sysfs related parts during
suspend/resume.  Since we skipped the entire callbacks instead (which
also included some CPU and cpufreq-specific critical components),
cpufreq subsystem started behaving erratically after suspend/resume.

So revert the commit to fix the regression.  We'll revisit and address
the original goal of that commit separately, since it involves quite a
bit of careful code reorganization and appears to be non-trivial.

(While reverting the commit, note that another commit f51e1eb
 (cpufreq: Fix cpufreq regression after suspend/resume) already
 reverted part of the original set of changes.  So revert only the
 remaining ones).

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Paul Bolle <pebolle@tiscali.nl>
Cc: 3.10+ <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: aae760ed21cd690fe8a6db9f3a177ad55d7e12ab
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
2013-08-22 18:09:26 -07:00
Srivatsa S. Bhat 021bac3da8 cpufreq: Fix cpufreq regression after suspend/resume
Toralf Förster reported that the cpufreq ondemand governor behaves erratically
(doesn't scale well) after a suspend/resume cycle. The problem was that the
cpufreq subsystem's idea of the cpu frequencies differed from the actual
frequencies set in the hardware after a suspend/resume cycle. Toralf bisected
the problem to commit a66b2e5 (cpufreq: Preserve sysfs files across
suspend/resume).

Among other (harmless) things, that commit skipped the call to
cpufreq_update_policy() in the resume path. But cpufreq_update_policy() plays
an important role during resume, because it is responsible for checking whether
the BIOS changed the cpu frequencies behind our back and, if so,
resynchronizing the cpufreq subsystem's knowledge of the cpu frequencies
accordingly.

So, restore the call to cpufreq_update_policy() in the resume path to fix
the cpufreq regression.

Reported-and-tested-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: 3.10+ <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: f51e1eb63d9c28cec188337ee656a13be6980cfd
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
2013-08-22 18:09:26 -07:00
Deepak Katragadda 059e669ed2 msm: cpufreq: Only apply driver limits for scaling_min/max_freq writes
When new values are written to scaling_min/max_freq sysfs files, the
current code applies all the limits imposed by various ADJUST and
INCOMPATIBLE notifiers handlers before storing them.

When the ADJUST and/or INCOMPATIBLE notifiers change their limits
frequently, this behavior makes it almost impossible for a
user/userspace process to store the scaling_min/max_freq limits
that go beyond the instantaneous limits imposed by the notifiers.
Since these sysfs nodes are typically meant to set limits that need
to be enforced for the foreseeable future, this is not a very user
friendly behavior.

So, change the behavior to only apply limits that are enforced by
the cpufreq driver. Typically, these are just the absolute limits
of the HW and don't change very often.
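
A minimal sketch of the behavior described above: the stored user limits are bounded only by the hardware (driver) limits, while the transient notifier limits are applied later when the effective limits are computed. All names here (struct limits_sketch, store_user_limits(), effective_limits()) are hypothetical and not taken from the driver.

```c
#include <assert.h>
#include <errno.h>

struct limits_sketch {
    unsigned int min_khz;
    unsigned int max_khz;
};

static unsigned int clamp_khz(unsigned int v, unsigned int lo, unsigned int hi)
{
    if (v < lo)
        return lo;
    if (v > hi)
        return hi;
    return v;
}

/* Store a user request, bounded only by the hardware limits. */
int store_user_limits(const struct limits_sketch *hw,
                      unsigned int req_min, unsigned int req_max,
                      struct limits_sketch *user)
{
    unsigned int min_khz, max_khz;

    assert(hw->min_khz <= hw->max_khz);

    min_khz = clamp_khz(req_min, hw->min_khz, hw->max_khz);
    max_khz = clamp_khz(req_max, hw->min_khz, hw->max_khz);
    if (min_khz > max_khz)
        return -EINVAL;

    user->min_khz = min_khz;
    user->max_khz = max_khz;
    return 0;
}

/* Effective limits: user limits narrowed by the current notifier limits. */
struct limits_sketch effective_limits(const struct limits_sketch *user,
                                      const struct limits_sketch *notifier)
{
    struct limits_sketch eff;

    eff.min_khz = user->min_khz > notifier->min_khz ?
                  user->min_khz : notifier->min_khz;
    eff.max_khz = user->max_khz < notifier->max_khz ?
                  user->max_khz : notifier->max_khz;
    if (eff.min_khz > eff.max_khz)
        eff.min_khz = eff.max_khz;  /* assumption: the max limit wins */
    return eff;
}
```

Because the stored value is not narrowed by the notifiers, a user maximum of 1.8 GHz written while a notifier caps the CPU at 1.2 GHz takes effect as soon as the notifier relaxes its cap.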

Change-Id: I1ccfaa2d1ee4ea595f882485d359dbdb407a0176
Signed-off-by: Deepak Katragadda <dkatraga@codeaurora.org>
2013-08-22 18:08:48 -07:00
Viresh Kumar da712f3a8c cpufreq: rename ignore_nice as ignore_nice_load
commit 6c4640c3adfd97ce10efed7c07405f52d002b9a8 upstream.

This sysfs file was called ignore_nice_load earlier and commit
4d5dcc4 (cpufreq: governor: Implement per policy instances of
governors) changed its name to ignore_nice by mistake.

Let's get it renamed back to its original name.

Reported-by: Martin von Gagern <Martin.vGagern@gmx.net>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-14 22:59:06 -07:00
Aaro Koskinen 4b0be00599 cpufreq: loongson2: fix regression related to clock management
commit f54fe64d14dff3df6d45a48115d248a82557811f upstream.

Commit 42913c799 (MIPS: Loongson2: Use clk API instead of direct
dereferences) broke the cpufreq functionality on Loongson2 boards:
clk_set_rate() is called before the CPU frequency table is
initialized, and therefore will always fail.

Fix by moving the clk_set_rate() after the table initialization.
Tested on Lemote FuLoong mini-PC.

Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-14 22:59:06 -07:00
Rafael J. Wysocki e9ef4410a7 cpufreq: Fix cpufreq driver module refcount balance after suspend/resume
commit 2a99859932281ed6c2ecdd988855f8f6838f6743 upstream.

Since cpufreq_cpu_put() called by __cpufreq_remove_dev() drops the
driver module refcount, __cpufreq_remove_dev() causes that refcount
to become negative for the cpufreq driver after a suspend/resume
cycle.

This is not the only bad thing that happens there, however, because
kobject_put() should only be called for the policy kobject at this
point if the CPU is not the last one for that policy.

Namely, if the given CPU is the last one for that policy, the
policy kobject's refcount should be 1 at this point, as set by
cpufreq_add_dev_interface(), and only needs to be dropped once for
the kobject to go away.  This actually happens under the cpus == 1
check, so it need not be done before by cpufreq_cpu_put().

On the other hand, if the given CPU is not the last one for that
policy, this means that cpufreq_add_policy_cpu() has been called
at least once for that policy and cpufreq_cpu_get() has been
called for it too.  To balance that cpufreq_cpu_get(), we need to
call cpufreq_cpu_put() in that case.

Thus, to fix the described problem and keep the reference
counters balanced in both cases, move the cpufreq_cpu_put() call
in __cpufreq_remove_dev() to the code path executed only for
CPUs that share the policy with other CPUs.
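
The balancing rule described above can be pictured with a small userspace
model. This is only a sketch under simplifying assumptions: struct
policy_model and the model_* helpers are hypothetical names, not kernel
structures or functions, and the driver module refcount is left out.

```c
#include <stdbool.h>

/* Hypothetical model of one shared policy; not the kernel's struct. */
struct policy_model {
	int kobj_refs;	/* references held on the policy kobject */
	int cpus;	/* CPUs currently attached to the policy */
};

/* First CPU: interface setup leaves the kobject with one reference. */
void model_add_first_cpu(struct policy_model *p)
{
	p->kobj_refs = 1;
	p->cpus = 1;
}

/* Each further CPU sharing the policy takes one extra reference. */
void model_add_policy_cpu(struct policy_model *p)
{
	p->kobj_refs++;
	p->cpus++;
}

/* Remove one CPU; returns true once the policy has been released. */
bool model_remove_cpu(struct policy_model *p)
{
	if (p->cpus > 1) {
		/* Not the last CPU: balance the reference taken on add. */
		p->kobj_refs--;
		p->cpus--;
		return false;
	}
	/* Last CPU: a single put drops the initial reference. */
	p->kobj_refs--;
	p->cpus = 0;
	return p->kobj_refs == 0;
}
```

With three CPUs on one policy the count goes 1, 2, 3 on the way up and
3, 2, 1, 0 on the way down, so it never becomes negative.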

Reported-and-tested-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-11 18:35:24 -07:00
Dirk Brandewie cb631ac773 cpufreq / intel_pstate: Change to scale off of max P-state
commit 2134ed4d614349b2b4e8d7bb593baa9179b8dd1e upstream.

Change to using max P-state instead of max turbo P-state.  This
change resolves two issues.

On a quiet system intel_pstate can fail to respond to a load change.

On CPU SKUs that have a limited number of P-states and no turbo range
intel_pstate fails to select the highest available P-state.

This change is suitable for stable v3.9+

References: https://bugzilla.kernel.org/show_bug.cgi?id=59481
Reported-and-tested-by: Arjan van de Ven <arjan@linux.intel.com>
Reported-and-tested-by: dsmythies@telus.net
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-04 16:50:51 +08:00
Srivatsa S. Bhat 916f4dbc2a cpufreq: Revert commit 2f7021a8 to fix CPU hotplug regression
commit e8d05276f236ee6435e78411f62be9714e0b9377 upstream.

commit 2f7021a8 "cpufreq: protect 'policy->cpus' from offlining
during __gov_queue_work()" caused a regression in CPU hotplug,
because it led to a deadlock between the cpufreq governor worker
thread and the CPU hotplug writer task.

Lockdep splat corresponding to this deadlock is shown below:

[   60.277396] ======================================================
[   60.277400] [ INFO: possible circular locking dependency detected ]
[   60.277407] 3.10.0-rc7-dbg-01385-g241fd04-dirty #1744 Not tainted
[   60.277411] -------------------------------------------------------
[   60.277417] bash/2225 is trying to acquire lock:
[   60.277422]  ((&(&j_cdbs->work)->work)){+.+...}, at: [<ffffffff810621b5>] flush_work+0x5/0x280
[   60.277444] but task is already holding lock:
[   60.277449]  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81042d8b>] cpu_hotplug_begin+0x2b/0x60
[   60.277465] which lock already depends on the new lock.

[   60.277472] the existing dependency chain (in reverse order) is:
[   60.277477] -> #2 (cpu_hotplug.lock){+.+.+.}:
[   60.277490]        [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
[   60.277503]        [<ffffffff815b6157>] mutex_lock_nested+0x67/0x410
[   60.277514]        [<ffffffff81042cbc>] get_online_cpus+0x3c/0x60
[   60.277522]        [<ffffffff814b842a>] gov_queue_work+0x2a/0xb0
[   60.277532]        [<ffffffff814b7891>] cs_dbs_timer+0xc1/0xe0
[   60.277543]        [<ffffffff8106302d>] process_one_work+0x1cd/0x6a0
[   60.277552]        [<ffffffff81063d31>] worker_thread+0x121/0x3a0
[   60.277560]        [<ffffffff8106ae2b>] kthread+0xdb/0xe0
[   60.277569]        [<ffffffff815bb96c>] ret_from_fork+0x7c/0xb0
[   60.277580] -> #1 (&j_cdbs->timer_mutex){+.+...}:
[   60.277592]        [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
[   60.277600]        [<ffffffff815b6157>] mutex_lock_nested+0x67/0x410
[   60.277608]        [<ffffffff814b785d>] cs_dbs_timer+0x8d/0xe0
[   60.277616]        [<ffffffff8106302d>] process_one_work+0x1cd/0x6a0
[   60.277624]        [<ffffffff81063d31>] worker_thread+0x121/0x3a0
[   60.277633]        [<ffffffff8106ae2b>] kthread+0xdb/0xe0
[   60.277640]        [<ffffffff815bb96c>] ret_from_fork+0x7c/0xb0
[   60.277649] -> #0 ((&(&j_cdbs->work)->work)){+.+...}:
[   60.277661]        [<ffffffff810ab826>] __lock_acquire+0x1766/0x1d30
[   60.277669]        [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
[   60.277677]        [<ffffffff810621ed>] flush_work+0x3d/0x280
[   60.277685]        [<ffffffff81062d8a>] __cancel_work_timer+0x8a/0x120
[   60.277693]        [<ffffffff81062e53>] cancel_delayed_work_sync+0x13/0x20
[   60.277701]        [<ffffffff814b89d9>] cpufreq_governor_dbs+0x529/0x6f0
[   60.277709]        [<ffffffff814b76a7>] cs_cpufreq_governor_dbs+0x17/0x20
[   60.277719]        [<ffffffff814b5df8>] __cpufreq_governor+0x48/0x100
[   60.277728]        [<ffffffff814b6b80>] __cpufreq_remove_dev.isra.14+0x80/0x3c0
[   60.277737]        [<ffffffff815adc0d>] cpufreq_cpu_callback+0x38/0x4c
[   60.277747]        [<ffffffff81071a4d>] notifier_call_chain+0x5d/0x110
[   60.277759]        [<ffffffff81071b0e>] __raw_notifier_call_chain+0xe/0x10
[   60.277768]        [<ffffffff815a0a68>] _cpu_down+0x88/0x330
[   60.277779]        [<ffffffff815a0d46>] cpu_down+0x36/0x50
[   60.277788]        [<ffffffff815a2748>] store_online+0x98/0xd0
[   60.277796]        [<ffffffff81452a28>] dev_attr_store+0x18/0x30
[   60.277806]        [<ffffffff811d9edb>] sysfs_write_file+0xdb/0x150
[   60.277818]        [<ffffffff8116806d>] vfs_write+0xbd/0x1f0
[   60.277826]        [<ffffffff811686fc>] SyS_write+0x4c/0xa0
[   60.277834]        [<ffffffff815bbbbe>] tracesys+0xd0/0xd5
[   60.277842] other info that might help us debug this:

[   60.277848] Chain exists of:
  (&(&j_cdbs->work)->work) --> &j_cdbs->timer_mutex --> cpu_hotplug.lock

[   60.277864]  Possible unsafe locking scenario:

[   60.277869]        CPU0                    CPU1
[   60.277873]        ----                    ----
[   60.277877]   lock(cpu_hotplug.lock);
[   60.277885]                                lock(&j_cdbs->timer_mutex);
[   60.277892]                                lock(cpu_hotplug.lock);
[   60.277900]   lock((&(&j_cdbs->work)->work));
[   60.277907]  *** DEADLOCK ***

[   60.277915] 6 locks held by bash/2225:
[   60.277919]  #0:  (sb_writers#6){.+.+.+}, at: [<ffffffff81168173>] vfs_write+0x1c3/0x1f0
[   60.277937]  #1:  (&buffer->mutex){+.+.+.}, at: [<ffffffff811d9e3c>] sysfs_write_file+0x3c/0x150
[   60.277954]  #2:  (s_active#61){.+.+.+}, at: [<ffffffff811d9ec3>] sysfs_write_file+0xc3/0x150
[   60.277972]  #3:  (x86_cpu_hotplug_driver_mutex){+.+...}, at: [<ffffffff81024cf7>] cpu_hotplug_driver_lock+0x17/0x20
[   60.277990]  #4:  (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff815a0d32>] cpu_down+0x22/0x50
[   60.278007]  #5:  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81042d8b>] cpu_hotplug_begin+0x2b/0x60
[   60.278023] stack backtrace:
[   60.278031] CPU: 3 PID: 2225 Comm: bash Not tainted 3.10.0-rc7-dbg-01385-g241fd04-dirty #1744
[   60.278037] Hardware name: Acer             Aspire 5741G    /Aspire 5741G    , BIOS V1.20 02/08/2011
[   60.278042]  ffffffff8204e110 ffff88014df6b9f8 ffffffff815b3d90 ffff88014df6ba38
[   60.278055]  ffffffff815b0a8d ffff880150ed3f60 ffff880150ed4770 3871c4002c8980b2
[   60.278068]  ffff880150ed4748 ffff880150ed4770 ffff880150ed3f60 ffff88014df6bb00
[   60.278081] Call Trace:
[   60.278091]  [<ffffffff815b3d90>] dump_stack+0x19/0x1b
[   60.278101]  [<ffffffff815b0a8d>] print_circular_bug+0x2b6/0x2c5
[   60.278111]  [<ffffffff810ab826>] __lock_acquire+0x1766/0x1d30
[   60.278123]  [<ffffffff81067e08>] ? __kernel_text_address+0x58/0x80
[   60.278134]  [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
[   60.278142]  [<ffffffff810621b5>] ? flush_work+0x5/0x280
[   60.278151]  [<ffffffff810621ed>] flush_work+0x3d/0x280
[   60.278159]  [<ffffffff810621b5>] ? flush_work+0x5/0x280
[   60.278169]  [<ffffffff810a9b14>] ? mark_held_locks+0x94/0x140
[   60.278178]  [<ffffffff81062d77>] ? __cancel_work_timer+0x77/0x120
[   60.278188]  [<ffffffff810a9cbd>] ? trace_hardirqs_on_caller+0xfd/0x1c0
[   60.278196]  [<ffffffff81062d8a>] __cancel_work_timer+0x8a/0x120
[   60.278206]  [<ffffffff81062e53>] cancel_delayed_work_sync+0x13/0x20
[   60.278214]  [<ffffffff814b89d9>] cpufreq_governor_dbs+0x529/0x6f0
[   60.278225]  [<ffffffff814b76a7>] cs_cpufreq_governor_dbs+0x17/0x20
[   60.278234]  [<ffffffff814b5df8>] __cpufreq_governor+0x48/0x100
[   60.278244]  [<ffffffff814b6b80>] __cpufreq_remove_dev.isra.14+0x80/0x3c0
[   60.278255]  [<ffffffff815adc0d>] cpufreq_cpu_callback+0x38/0x4c
[   60.278265]  [<ffffffff81071a4d>] notifier_call_chain+0x5d/0x110
[   60.278275]  [<ffffffff81071b0e>] __raw_notifier_call_chain+0xe/0x10
[   60.278284]  [<ffffffff815a0a68>] _cpu_down+0x88/0x330
[   60.278292]  [<ffffffff81024cf7>] ? cpu_hotplug_driver_lock+0x17/0x20
[   60.278302]  [<ffffffff815a0d46>] cpu_down+0x36/0x50
[   60.278311]  [<ffffffff815a2748>] store_online+0x98/0xd0
[   60.278320]  [<ffffffff81452a28>] dev_attr_store+0x18/0x30
[   60.278329]  [<ffffffff811d9edb>] sysfs_write_file+0xdb/0x150
[   60.278337]  [<ffffffff8116806d>] vfs_write+0xbd/0x1f0
[   60.278347]  [<ffffffff81185950>] ? fget_light+0x320/0x4b0
[   60.278355]  [<ffffffff811686fc>] SyS_write+0x4c/0xa0
[   60.278364]  [<ffffffff815bbbbe>] tracesys+0xd0/0xd5
[   60.280582] smpboot: CPU 1 is now offline

The intention of that commit was to avoid warnings during CPU
hotplug, which indicated that offline CPUs were getting IPIs from the
cpufreq governor's work items.  But the real root-cause of that
problem was commit a66b2e5 (cpufreq: Preserve sysfs files across
suspend/resume) because it totally skipped all the cpufreq callbacks
during CPU hotplug in the suspend/resume path, and hence it never
actually shut down the cpufreq governor's worker threads during CPU
offline in the suspend/resume path.

Looking back, the reason why we never suspected that commit as the
root-cause earlier was that the original issue was reported with
just the halt command and nobody had brought suspend/resume into the
equation.

The reason for _that_ in turn, as it turns out, is that earlier
halt/shutdown was being done by disabling non-boot CPUs while tasks
were frozen, just like suspend/resume.  But commit cf7df378a
(reboot: migrate shutdown/reboot to boot cpu), which came in at
around the same time, changed that logic: shutdown/halt no longer
takes CPUs offline.  Thus, the test-cases for reproducing the bug
were vastly different and we went totally off the trail.

Overall, it was one hell of a confusion with so many commits
affecting each other and also affecting the symptoms of the problems
in subtle ways.  Finally, now that the original problematic commit
(a66b2e5) has been completely reverted, revert this intermediate fix
(2f7021a8) too, to fix the CPU hotplug deadlock.  Phew!
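
The circular dependency in the lockdep splat above can be reproduced with a
toy lock-order graph. This is a minimal sketch and not how lockdep is
implemented: the three enum values stand for the work item, timer_mutex and
cpu_hotplug.lock from the splat, and struct lock_graph, add_dependency() and
would_deadlock() are hypothetical names.

```c
#include <stdbool.h>

enum { WORK_ITEM, TIMER_MUTEX, HOTPLUG_LOCK, NLOCKS };

/* dep[a][b] is true when b has been acquired while a was held. */
struct lock_graph {
	bool dep[NLOCKS][NLOCKS];
};

void add_dependency(struct lock_graph *g, int held, int wanted)
{
	g->dep[held][wanted] = true;
}

/* Depth-first search: is 'to' reachable from 'from' via recorded edges? */
static bool reaches(const struct lock_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NLOCKS; next++)
		if (g->dep[from][next] && !seen[next] &&
		    reaches(g, next, to, seen))
			return true;
	return false;
}

/* Taking 'wanted' while holding 'held' closes a cycle if 'held' is
 * already reachable from 'wanted'. */
bool would_deadlock(const struct lock_graph *g, int held, int wanted)
{
	bool seen[NLOCKS] = { false };

	return reaches(g, wanted, held, seen);
}
```

Recording WORK_ITEM -> TIMER_MUTEX and TIMER_MUTEX -> HOTPLUG_LOCK (the
governor worker path) and then asking whether the hotplug path may wait for
WORK_ITEM while holding HOTPLUG_LOCK reports a cycle, which matches the
"Chain exists of" line in the splat.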

Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reported-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Tested-by: Peter Wu <lekensteyn@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-25 14:07:23 -07:00
Srivatsa S. Bhat 9d3ce4af3b cpufreq: Revert commit a66b2e to fix suspend/resume regression
commit aae760ed21cd690fe8a6db9f3a177ad55d7e12ab upstream.

commit a66b2e (cpufreq: Preserve sysfs files across suspend/resume)
has unfortunately caused several things in the cpufreq subsystem to
break subtly after a suspend/resume cycle.

The intention of that patch was to retain the file permissions of the
cpufreq related sysfs files across suspend/resume.  To achieve that,
the commit completely removed the calls to cpufreq_add_dev() and
__cpufreq_remove_dev() during suspend/resume transitions.  But the
problem is that those functions do 2 kinds of things:
  1. Low-level initialization/tear-down that are critical to the
     correct functioning of cpufreq-core.
  2. Kobject and sysfs related initialization/teardown.

Ideally we should have reorganized the code to cleanly separate these
two responsibilities, and skipped only the sysfs related parts during
suspend/resume.  Since we skipped the callbacks entirely instead (which
also included some CPU and cpufreq-specific critical components), the
cpufreq subsystem started behaving erratically after suspend/resume.

So revert the commit to fix the regression.  We'll revisit and address
the original goal of that commit separately, since it involves quite a
bit of careful code reorganization and appears to be non-trivial.

(While reverting the commit, note that another commit f51e1eb
 (cpufreq: Fix cpufreq regression after suspend/resume) already
 reverted part of the original set of changes.  So revert only the
 remaining ones).

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-25 14:07:23 -07:00
Srivatsa S. Bhat c02527487f cpufreq: Fix cpufreq regression after suspend/resume
commit f51e1eb63d9c28cec188337ee656a13be6980cfd upstream.

Toralf Förster reported that the cpufreq ondemand governor behaves erratically
(doesn't scale well) after a suspend/resume cycle. The problem was that the
cpufreq subsystem's idea of the cpu frequencies differed from the actual
frequencies set in the hardware after a suspend/resume cycle. Toralf bisected
the problem to commit a66b2e5 (cpufreq: Preserve sysfs files across
suspend/resume).

Among other (harmless) things, that commit skipped the call to
cpufreq_update_policy() in the resume path. But cpufreq_update_policy() plays
an important role during resume, because it is responsible for checking
whether the BIOS changed the cpu frequencies behind our back and, if so,
resynchronizing the cpufreq subsystem's knowledge of the cpu frequencies.

So, restore the call to cpufreq_update_policy() in the resume path to fix
the cpufreq regression.

Reported-and-tested-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-13 11:42:27 -07:00
Lianwei Wang 37a05ab72b cpufreq: interactive: resched timer if max freq raised
When the policy max freq is raised, the cpu freq may get stuck at a
lower freq until the timer is rescheduled in the idle callback.

The target_freq must be updated too; otherwise, under high load,
new_freq is always equal to target_freq, which also causes the freq
to get stuck at a lower freq.

Reschedule the timer on gov limits callback.

Change-Id: I6c187001ab43e859731429b64f75a74eebc37a24
Signed-off-by: Lianwei Wang <a22439@motorola.com>
2013-07-01 15:46:29 -07:00
Lianwei Wang e1fb7646f1 cpufreq: interactive: fix race on cpufreq TRANSITION notifier
The cpufreq TRANSITION notifier callback does not check the
governor_enabled state on affected CPUs, which can cause a
kernel panic in update_load because the policy object may be
NULL or invalid when governor_enabled is false.

Change-Id: Ie0f1718124f61e2f9b5da57abc6981ada5b83908
Signed-off-by: Lianwei Wang <a22439@motorola.com>
2013-07-01 15:46:28 -07:00
Minsung Kim 3ab74abdc3 cpufreq: interactive: avoid underflow on active time calculation
Check that the idle time delta is less than the elapsed time delta,
to avoid underflow when computing active time.
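
A minimal sketch of the guarded subtraction, assuming both deltas are
unsigned microsecond counts; active_time_us() is a hypothetical helper name
and not the governor's own function.

```c
#include <stdint.h>

/* Active time is elapsed minus idle, clamped to zero when the idle
 * delta is not smaller than the elapsed delta. Without the check the
 * unsigned subtraction would wrap to a huge value. */
uint64_t active_time_us(uint64_t delta_time, uint64_t delta_idle)
{
	if (delta_idle >= delta_time)
		return 0;
	return delta_time - delta_idle;
}
```

For an elapsed delta of 1000 us this returns 600 for 400 us of idle time
and 0 for 1200 us of idle time.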

Change-Id: I3e4c6ef1ad794eec49ed379c0c50fa727fd6ad28
Signed-off-by: Minsung Kim <ms925.kim@samsung.com>
2013-07-01 14:16:27 -07:00
Todd Poynor d06bc7e5b3 cpufreq: interactive: reduce chance of zero time delta on load eval
Reschedule load sampling timer after timestamp of sample start taken,
hold spinlock across entire sequence to avoid preemption.  Avoid the
WARN for zero time delta in the load sampling timer function.

Change-Id: Idc10a756f09141decb6df92669521a1ebf0dbc10
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 14:16:27 -07:00
Todd Poynor fb07c42ed8 cpufreq: interactive: handle errors from cpufreq_frequency_table_target
Add checks for error return from cpufreq_frequency_table_target, and be
less noisy on the existing call with an error check.  CPU hotplug and
system shutdown may cause this call to return -EINVAL.

Bug: 8613560
Change-Id: Id78d8829920462c0db1c7e14e717d91740d6cb44
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 14:16:27 -07:00
Minsung Kim 790fbc3116 cpufreq: interactive: fix uninitialized spinlock
Add missing spinlock init

Backtrace:
[<c0011ce4>] (dump_backtrace+0x0/0x10c) from [<c0662a68>] (dump_stack+0x18/0x1c)
 r6:00000032 r5:c0bd09ec r4:e6848000 r3:00000000
[<c0662a50>] (dump_stack+0x0/0x1c) from [<c06670b0>] (spin_dump+0x80/0x94)
[<c0667030>] (spin_dump+0x0/0x94) from [<c06670f0>] (spin_bug+0x2c/0x30)
 r5:c08f91fc r4:c0bd09ec
[<c06670c4>] (spin_bug+0x0/0x30) from [<c0245f74>] (do_raw_spin_unlock+0x88/0xcc)
 r5:e547bac0 r4:c0bd09ec
[<c0245eec>] (do_raw_spin_unlock+0x0/0xcc) from [<c066c9cc>] (_raw_spin_unlock_irqrestore+0x14/0x40)
 r5:e547bac0 r4:60000013
[<c066c9b8>] (_raw_spin_unlock_irqrestore+0x0/0x40) from [<c044b884>] (store_above_hispeed_delay+0x6c/0x80)
 r4:c0b4cf78 r3:00000007
[<c044b818>] (store_above_hispeed_delay+0x0/0x80) from [<c0235d24>] (kobj_attr_store+0x1c/0x28)
 r7:e68ff000 r6:00000032 r5:e58137c0 r4:e61cde80
[<c0235d08>] (kobj_attr_store+0x0/0x28) from [<c0156b78>] (sysfs_write_file+0x104/0x184)
[<c0156a74>] (sysfs_write_file+0x0/0x184) from [<c0100680>] (vfs_write+0xb0/0x140)
[<c01005d0>] (vfs_write+0x0/0x140) from [<c0100900>] (sys_write+0x44/0x70)
 r8:00000000 r7:00000004 r6:00000032 r5:bee43c90 r4:e5600300
[<c01008bc>] (sys_write+0x0/0x70) from [<c000e400>] (ret_fast_syscall+0x0/0x30)
 r9:e6842000 r8:c000e584 r6:00000032 r5:bee43c90 r4:00000009

Change-Id: I80a1e0b3fecb24adba501ff44f568479deeff7fa
Signed-off-by: Minsung Kim <ms925.kim@samsung.com>
2013-07-01 14:16:25 -07:00
Todd Poynor 9b3941d0af cpufreq: interactive: base above_hispeed_delay on target freq, not current
Time to wait should be based on the intended target speed, not the
actual speed (which may be held high by another CPU).

Change-Id: Ifc5bb55d06adddb9a02af90af05398a78f282272
Reported-by: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 14:16:23 -07:00
Todd Poynor eb58ca7afd cpufreq: interactive: fix crash on error paths in get_tokenized_data
Use separate variable for error code, free proper pointer.

Change-Id: Ia83cccb195997789ac6afbf5b8761f7b278196d6
Reported-by: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 14:16:23 -07:00
Lianwei Wang fb117edf60 cpufreq: interactive: add io_is_busy interface
Previously the idle time returned from get_cpu_idle_time_us included the
iowait time, so the iowait time was always counted as idle time.

But now the idle time returned from get_cpu_idle_time_us no longer includes
the iowait time because of the commit below, which causes the iowait time
to always be counted as busy time:
    6beea0c nohz: Fix update_ts_time_stat idle accounting

Add the io_is_busy interface, as does the ondemand governor, and let the user
configure the iowait time as busy or idle through the io_is_busy sysfs
interface.

By default, io_is_busy is disabled.
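
The effect of the switch can be sketched as follows. This is an
illustration under the assumption that idle and iowait times are already
available as microsecond counts; effective_idle_us() is a hypothetical
helper, not the governor's function.

```c
#include <stdbool.h>
#include <stdint.h>

/* With io_is_busy disabled (the default), iowait is added back to the
 * idle time; with it enabled, iowait is left out and so counts as busy. */
uint64_t effective_idle_us(uint64_t idle_us, uint64_t iowait_us,
			   bool io_is_busy)
{
	return io_is_busy ? idle_us : idle_us + iowait_us;
}
```

With 700 us idle and 200 us iowait, the governor would see 900 us of idle
time by default and 700 us when io_is_busy is set.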

[toddpoynor@google.com: minor updates]
Change-Id: If7d70ff864c43bc9c8d7fd7cfc66f930d339f9b4
Signed-off-by: Lianwei Wang <lian-wei.wang@motorola.com>
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 14:16:22 -07:00
Minsung Kim a9dbf84177 cpufreq: interactive: allow arbitrary speed / delay mappings
Accept a string of delays and speeds at which to apply the delay before
raising each step above hispeed. For example, with hispeed_freq at 1GHz,
"80000 1300000:200000 1500000:40000" means that the delay is 80 msecs from
1GHz up to 1.3GHz, 200 msecs from 1.3GHz up to 1.5GHz, and 40 msecs at or
above 1.5GHz.
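
One way to look up a delay from such a mapping is sketched below. It
assumes the string has already been tokenized into a flat array of the form
{ delay0, freq1, delay1, freq2, delay2, ... } with speeds in ascending
order; freq_to_delay() and that layout are assumptions for illustration.

```c
/* tbl holds n values: a default delay followed by (freq, delay) pairs.
 * Walk the pairs while freq is at or above the next threshold and
 * return the delay of the last range entered. */
unsigned int freq_to_delay(const unsigned int *tbl, int n, unsigned int freq)
{
	int i;

	for (i = 0; i < n - 1 && freq >= tbl[i + 1]; i += 2)
		;
	return tbl[i];
}
```

For the example string, the array { 80000, 1300000, 200000, 1500000, 40000 }
yields 80000 at 1GHz, 200000 at 1.3GHz and 40000 at 1.5GHz.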

[toddpoynor@google.com: add documentation]
Change-Id: Ifeebede8b1acbdd0a53e5c6916bccbf764dc854f
Signed-off-by: Minsung Kim <ms925.kim@samsung.com>
2013-07-01 14:16:22 -07:00
Lianwei Wang de096a7b3f cpufreq: interactive: fix race on governor start/stop
There is a race condition when two CPUs do CPUFREQ_GOV_STOP and one CPU
does CPUFREQ_GOV_START soon afterwards. sysfs_remove_group has not yet
completed on one CPU when sysfs_create_group is called on another, which
causes the governor start to fail and then a kernel panic in the timer
callback, because the policy and cpu mask have been kfree'd in the cpufreq
driver.

Replace the atomic with a mutex to lock the whole START/STOP sequence.

Change-Id: I3762b3d44315ae021b8275aca84f5ea9147cc540
Signed-off-by: Lianwei Wang <a22439@motorola.com>
2013-07-01 14:16:19 -07:00
Todd Poynor 0f44c51f17 cpufreq: interactive: fix deadlock on spinlock in timer
Need to use irqsave/restore spinlock calls to avoid a deadlock in calls
from the timer.

Change-Id: I15b6b590045ba1447e34ca7b5ff342723e53a605
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 14:16:19 -07:00
Todd Poynor bfea469069 cpufreq: interactive: don't handle transition notification if not enabled
If multiple governors are in use then avoid processing frequency transition
notifications for CPUs on which the interactive governor is not enabled.

Change-Id: Ibd75255b921d887501a64774a8c4f62302f2d4e4
Reported-by: Francisco Franco <francisco.franco@cloudcar.com>
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 14:16:19 -07:00
Todd Poynor 5f94a8a7b6 cpufreq: interactive: init default values at compile time
Change-Id: Ia4966e949a6c24c34fdbd4a6e522cd7c37e4108e
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 14:16:18 -07:00
Todd Poynor cefa3db706 cpufreq: interactive: default go_hispeed_load 99%, doc updates
Update default go_hispeed_load from 85% to 99%.  Recent changes to the
governor now use a default target_load of 90%.  go_hispeed_load should
not be lower than the target load for hispeed_freq, which could lead
to oscillating speed decisions.  Other recent changes reduce the need
to dampen speed jumps on load spikes, while input event boosts from
userspace are the preferred method for anticipating load spikes with
UI impacts.

General update to the documentation to reflect recent changes.

Change-Id: I1b92f3091f42c04b10503cd1169a943b5dfd6faf
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 14:16:18 -07:00
Todd Poynor 1d0427d1c3 cpufreq: interactive: fix race on timer restart on governor start
Starting the governor, or restarting on a hotplugged-in CPU, can race
with the timer start in idle, triggering a BUG on timer already pending.
Start the timer before setting the enable flag, and use enable_sem to
protect the sequence (and ensure correct order of the update to the
enable flag).  Delete any existing timer for safety.

Change-Id: Ife77cf9fe099e8fd8543224cbf148c6722c2ffb0
Reported-by: Francisco Franco <francisco.franco@cloudcar.com>
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 14:16:18 -07:00
Todd Poynor beb55c19d5 cpufreq: interactive: fix racy timer stopping
When stopping the governor, del_timer_sync() can race against an
invocation of the idle notifier callback, which has the potential
to reactivate the timer.

To fix this issue, a read-write semaphore is used. Multiple readers are
allowed as long as pcpu->governor_enabled is true.  However, it can be
set to false only after taking the semaphore for writing, which waits for
any ongoing asynchronous activities to complete and prevents any more of
those activities from being initiated.

[toddpoynor@google.com: cosmetic and commit text changes]
Change-Id: Ib51165a735d73dcf964a06754c48bdc1913e13d0
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
2013-07-01 14:16:18 -07:00
Todd Poynor f5e5ad4292 cpufreq: interactive: fix boosting logic
35a84de cpufreq: interactive: apply above_hispeed_delay to each step above hispeed

caused the speed choice logic to oscillate between boosting and not boosting.
Add back code to ensure speed does not drop below boost frequency while
boosting.

Change-Id: Id420068480fcc7f5c4989ff523e2a8d22e2f4db2
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 14:16:18 -07:00
Todd Poynor 07af93ad6e cpufreq: interactive: add timer slack to limit idle at speed > min
Always use deferrable timer for load sampling.

Set a non-deferrable timer for an additional slack time to allow before
waking up from idle to drop speed when not at minimum speed.  A slack
value of -1 avoids wakeups to drop speed.  Default is 80ms.

Remove the governidle module param and its timer management in idle.  For
platforms on which holding speed above minimum in idle costs power, use the
new timer slack to select how long to wait before waking up to drop speed.
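
The decision of whether to arm the slack timer can be sketched like this.
It is an illustration only: slack_timer_expiry() and its parameters are
hypothetical, times are plain microsecond counts, and the real governor
works with kernel timers rather than returned values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Arm the non-deferrable slack timer only when slack is enabled
 * (slack_us >= 0) and the CPU is above its minimum speed. On success
 * *expiry_us is the sampling expiry plus the slack. */
bool slack_timer_expiry(uint64_t sample_expiry_us, int slack_us,
			unsigned int target_freq, unsigned int min_freq,
			uint64_t *expiry_us)
{
	if (slack_us < 0 || target_freq <= min_freq)
		return false;
	*expiry_us = sample_expiry_us + (uint64_t)slack_us;
	return true;
}
```

With the default slack of 80000 us, a CPU held above minimum speed would be
woken at most 80 ms after the deferrable sampling timer was due; a slack of
-1 or a CPU already at minimum speed arms nothing.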

Change-Id: I270b3980667e2c70a68e5bff534124b4411dbad5
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 14:16:18 -07:00
Todd Poynor 8c4fc9a26e cpufreq: interactive: specify duration of CPU speed boost pulse
Sysfs attribute boostpulse_duration specifies the duration of boosting CPU
speed in response to boostpulse events.  Duration is specified in usecs,
default 80ms.
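
A pulse with a fixed duration can be modelled with an end timestamp. The
sketch below assumes microsecond timestamps supplied by the caller; struct
boost_state, boost_pulse() and is_boosted() are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

struct boost_state {
	bool boost;		/* explicit on/off boost */
	uint64_t pulse_end_us;	/* time at which the last pulse expires */
};

/* A pulse event pushes the end of the boost window out from now. */
void boost_pulse(struct boost_state *b, uint64_t now_us, uint64_t duration_us)
{
	b->pulse_end_us = now_us + duration_us;
}

/* Boosted while the explicit boost is on or a pulse is still running. */
bool is_boosted(const struct boost_state *b, uint64_t now_us)
{
	return b->boost || now_us < b->pulse_end_us;
}
```

A pulse at t = 1000 us with the default duration of 80000 us keeps the CPU
boosted until t = 81000 us.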

Change-Id: Ifd41625574891a44f1787a4e85d1e7b4f2afb52b
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 14:16:17 -07:00
Todd Poynor 5b28b65949 cpufreq: interactive: adjust load for changes in speed
Add notifier for speed transitions.  Keep a count of CPU active
microseconds times current frequency, converted to a percentage relative
to the current frequency when load is evaluated.
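
The accounting described above can be sketched in userspace as follows.
struct load_acc, account_active() and load_percent() are hypothetical
names; frequencies are in kHz and times in microseconds.

```c
#include <stdint.h>

/* Running sum of (active microseconds * frequency in kHz). */
struct load_acc {
	uint64_t speedadj;
};

/* Called for each stretch of active time at a given speed. */
void account_active(struct load_acc *a, uint64_t active_us,
		    unsigned int freq_khz)
{
	a->speedadj += active_us * freq_khz;
}

/* Load over a window, as a percentage relative to cur_freq_khz. */
unsigned int load_percent(const struct load_acc *a, uint64_t window_us,
			  unsigned int cur_freq_khz)
{
	if (window_us == 0 || cur_freq_khz == 0)
		return 0;
	return (unsigned int)(a->speedadj / window_us * 100 / cur_freq_khz);
}
```

For a 20000 us window with 5000 us active at 500000 kHz and 5000 us active
at 1000000 kHz, the load is 37% relative to 1000000 kHz and 75% relative to
500000 kHz.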

Change-Id: I5c27adb11081c50490219784ca57cc46e97fc28c
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 14:16:17 -07:00
Todd Poynor 404bca0758 cpufreq: interactive: remove load since last speed change
The longer-term load since last speed change isn't terribly useful,
may delay recognition of dropping load, and would need forthcoming
changes to adjust load for changing CPU speeds.  Drop it.

Change-Id: Ic3cbb0542cc3484617031787e03ed9bdd632dec1
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 14:16:15 -07:00
Todd Poynor 04b3fd6f5f cpufreq: interactive: allow arbitrary speed / target load mappings
Accept a string of target loads and speeds at which to apply the
target loads, per the documentation update in this patch.  For example,
"85 1000000:90 1700000:99" targets CPU load 85% below 1GHz, 90% from
1GHz up to 1.7GHz, and 99% at or above 1.7GHz.
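
Parsing such a string into a flat array can be sketched as below. This is
a simplified userspace parser, not the governor's tokenizer:
parse_tokens() is a hypothetical name, it accepts either separator at any
position, and it only checks that the number of values is odd (one default
plus speed:load pairs).

```c
#include <stdlib.h>

/* Parse "v0 f1:v1 f2:v2 ..." into out[]. Returns the number of values
 * stored, or -1 if the input is malformed or does not fit in max. */
int parse_tokens(const char *s, unsigned int *out, int max)
{
	int n = 0;

	while (*s) {
		char *end;
		unsigned long v;

		if (n == max)
			return -1;
		v = strtoul(s, &end, 10);
		if (end == s)
			return -1;	/* no digits at this position */
		out[n++] = (unsigned int)v;
		if (*end == '\0')
			break;
		if (*end != ' ' && *end != ':')
			return -1;	/* unexpected separator */
		s = end + 1;
	}
	/* A default value plus complete pairs gives an odd count. */
	return (n % 2 == 1) ? n : -1;
}
```

The example string parses to { 85, 1000000, 90, 1700000, 99 }; a lookup can
then walk the pairs and return the load for the highest speed threshold not
above the current speed.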

Attempt to avoid oscillations by evaluating the current speed
weighted by current load against each new choice of speed, choosing a
higher speed if the current load requires a higher speed.

Change-Id: Ie3300206047c84eca5a26b0b63ea512e5207550e
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 14:16:15 -07:00
Todd Poynor b1a39c5c2a cpufreq: interactive: apply above_hispeed_delay to each step above hispeed
Apply above_hispeed_delay whenever increasing speed to a new speed above
hispeed (not just the first step above hispeed).

Change-Id: Ibb7add7db47f2a4306a9458c4e1ebabb60698636
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 14:16:14 -07:00
Todd Poynor 9e746fb291 cpufreq: interactive: change speed according to current speed and target load
Add a target_load attribute that specifies how aggressively the governor is
to adjust speed to meet the observed load.  New target speed is calculated
as the current actual speed (may be higher than target speed on SMP) times
the CPU load (as a fraction) divided by target load (fraction).

cpufreq_frequency_table_target() calls now use CPUFREQ_RELATION_L to
select the next higher speed rather than the next lower speed.
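
The calculation can be sketched as follows, assuming an ascending table of
available speeds in kHz and loads given as percentages. pick_speed() is a
hypothetical helper that imitates the "lowest speed at or above the
request" selection; it is not the cpufreq table API.

```c
#include <stdint.h>

/* Desired speed = current speed * load / target load, then rounded up
 * to the next available table entry (highest entry if none is enough). */
unsigned int pick_speed(const unsigned int *table, int n,
			unsigned int cur_freq, unsigned int cpu_load,
			unsigned int target_load)
{
	uint64_t want;
	int i;

	if (n <= 0 || target_load == 0)
		return 0;
	want = (uint64_t)cur_freq * cpu_load / target_load;
	for (i = 0; i < n; i++)
		if (table[i] >= want)
			return table[i];
	return table[n - 1];
}
```

With speeds { 300000, 600000, 1000000, 1500000 } and a target load of 90,
a CPU at 1000000 kHz with 95% load asks for about 1055555 kHz and is moved
up to 1500000 kHz, while a CPU at 600000 kHz with 80% load stays at 600000.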

Change-Id: If432451da82f5fed12e15c9421d7d27792376150
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 14:16:14 -07:00
Todd Poynor 1d1b87282b cpufreq: interactive: trace actual speed in target speed decisions
Tracing adds actual speed since this is expected to be key to the
choice of target speed.

Change-Id: Iec936102d0010c4e9dfa143c38a9fd0d551189c3
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 14:16:14 -07:00
Todd Poynor 9c6633a1bf cpufreq: interactive: kick timer on idle exit past expiry
The deferrable timer list isn't checked on all idle exits, such as when
hi-res timers expire or ISRs schedule workers.  If the idle loop is
exited and it's past time to run the governor load polling timer,
run it immediately.  This ensures we handle load spikes caused by activity
that does not run the normal timer list.

Rename the field that timestamps the "time_in_idle" value to be more
accurate.

Change-Id: Ied590ecbefc83c9a9ec5eb9e31903557f6fa1614
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 14:16:13 -07:00
Lianwei Wang a407739135 cpufreq: interactive: use deferrable timer by default
Avoid wakeups only to handle the governor timer when the system is otherwise
idle.

For platforms where the power cost of remaining in idle at higher CPU
speed may outweigh the cost of a governor wakeup from idle to lower the speed,
set parameter cpufreq_interactive.governidle=1.

Change-Id: Id6c43eb35caecf9b0574fcdd5b769711bc7e6de6
Signed-off-by: LianWei WANG <a22439@motorola.com>
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 14:16:13 -07:00
Todd Poynor 8cb7e24b53 cpufreq: interactive: pin timers to associated CPU
Helps avoid waking up other CPUs to react to activity on the local CPU.

Change-Id: Ife272aaa7916894a437705d44521b1a1693fbe8e
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 14:16:13 -07:00
Todd Poynor 92942e6393 cpufreq: interactive: run at fraction of hispeed_freq when load is low
When load is below go_hispeed_load, apply the percentage of CPU load to
a max frequency of hispeed_freq instead of the max speed.  This avoids
jumping too quickly to hispeed_freq when it is a relatively low
percentage of max speed.  This also allows go_hispeed_load to be set to
a high percentage relative to hispeed_freq (as a percentage of max speed,
again useful when hispeed_freq is a low fraction of max speed), to cap
larger loads at hispeed_freq.  For example, a load of 60% will typically
move to 60% of hispeed_freq, not 60% of max speed.  This causes the
governor to apply two different speed caps, depending on whether load is
below or above go_hispeed_load.
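
The two speed caps can be illustrated with a small helper. This is a
simplification that leaves out the initial bump to hispeed_freq and any
delays; scaled_speed() is a hypothetical name and loads are percentages.

```c
#include <stdint.h>

/* Below go_hispeed_load the load percentage is applied to hispeed_freq;
 * at or above it, the load percentage is applied to the max speed. */
unsigned int scaled_speed(unsigned int cpu_load, unsigned int go_hispeed_load,
			  unsigned int hispeed_freq, unsigned int max_freq)
{
	unsigned int cap = cpu_load >= go_hispeed_load ? max_freq
						       : hispeed_freq;

	return (unsigned int)((uint64_t)cap * cpu_load / 100);
}
```

With hispeed_freq at 1000000 kHz, max speed at 1500000 kHz and
go_hispeed_load at 85, a 60% load maps to 600000 kHz and a 90% load maps to
1350000 kHz.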

Also fix the type of hispeed_freq, which was u64, to match other
speed data types (and avoid overhead and allow division).

Change-Id: Ie2d0668be161c074aaad77db2037505431457b3a
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 14:16:09 -07:00
Todd Poynor 70fb7625fb cpufreq: interactive: always limit initial speed bump to hispeed
First bump speed up to hispeed_freq whenever the current speed is below
hispeed_freq, instead of only when the current speed is the minimum speed.
The previous code made it too difficult to use hispeed_freq as a common
intermediate speed on systems that frequently run at speeds between
minimum and hispeed_freq.

Change-Id: I04ec30bafabf5741e267ff289209b8c2d846824b
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 14:16:07 -07:00
Todd Poynor 4306ff6486 cpufreq: interactive: remove input_boost handling
Now handled in userspace Power HAL instead.

Change-Id: I78a4a2fd471308bfcd785bbefcc65fede27314cf
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 13:40:52 -07:00
Todd Poynor 76297aac95 cpufreq: interactive: handle speed up and down in the realtime task
It is not useful to have a separate, non-realtime workqueue for speed down
events; handle both in the realtime task, which also avoids priority
inversion for speed up events.

Change-Id: Iddcd05545245c847aa1bbe0b8790092914c813d2
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 13:40:52 -07:00
Sam Leffler 604395ffe9 cpufreq: interactive: keep freezer happy when not current governor
Fix a problem where the hung task mechanism was deeming the interactive
clock boost thread as hung.  This was because the thread is created at
module init but never run/woken up until needed.  If the governor is not
being used this can be forever.  To work around this, explicitly wake up
the thread once all the necessary data structures are initialized.  The
latter required some minor code shuffling.

Signed-off-by: Sam Leffler <sleffler@chromium.org>
Change-Id: Ie2c058dd75dcb6460ea10e7ac997e46baf66b1fe
2013-07-01 13:40:52 -07:00
Sam Leffler 4335359ee7 cpufreq: interactive: take idle notifications only when active
Register an idle notifier only when the governor is active.  Also
short-circuit work of idle end if the governor is not enabled.

Signed-off-by: Sam Leffler <sleffler@chromium.org>
Change-Id: I4cae36dd2e7389540d337d74745ffbaa0131870f
2013-07-01 13:40:52 -07:00
Todd Poynor 808bb2cfaa cpufreq: interactive: restart above_hispeed_delay at each hispeed load
Change-Id: I2e5b91d45e8806b0ab94ca2301ed671c9af9ab13
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 13:40:47 -07:00
John Stultz 4a52216c23 cpufreq-interactive: Compile fixup
Looks like AOSP has a compile bug. Fix it up.

Signed-off-by: John Stultz <john.stultz@linaro.org>
2013-07-01 13:40:44 -07:00
Todd Poynor fe51873dff cpufreq: interactive: add boost pulse interface
Change-Id: Icf1e86d2065cc8f0816ba9c6b065eb056d4e8249
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 13:40:44 -07:00
Todd Poynor a8e5a0cf1a cpufreq: interactive: set floor for boosted speed
Allow speed to drop to the floor frequency but not below; don't pin
to the speed at last boost.

Change-Id: I0147c2b7a2e61ba16820605af6baaf09570be787
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 13:40:44 -07:00
Todd Poynor 77417b69a8 cpufreq: interactive: Add sysfs boost interface for hints from userspace
The explicit hint on/off version.

Change-Id: Ibf62b6d45bf6fb8c9c055b9bdaf074ce9374c04f
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 13:40:44 -07:00
Todd Poynor b6c0f567e7 cpufreq: interactive: remove unused target_validate_time_in_idle
Change-Id: I37c5085b91318242612440dfd775ad762996612f
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 13:40:44 -07:00
Todd Poynor de9a6d3615 cpufreq: interactive: Boost frequency on touchscreen input
Based on previous patches by Tero Kristo <tero.kristo@nokia.com>,
Brian Steuer <bsteuer@codeaurora.org>,
David Ng <dave@codeaurora.org>,
Antti P Miettinen <amiettinen@nvidia.com>, and
Thomas Renninger <trenn@suse.de>

Change-Id: Ic55fedcf6f9310f43a7022fb88e23b0392122769
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 13:40:43 -07:00
Todd Poynor 276905725c cpufreq: interactive: Separate speed target revalidate time and initial set time
Allow speed drop after min_sample_time elapses from the last time
the current speed was re-validated as appropriate for the
current load / input boost.

Allow speed bump after min_sample_time (or above_hispeed_delay)
elapses from the time the current speed was originally set.

Change-Id: Ic25687a7a53d25e6544c30c47d7ab6f27a47bee8
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 13:40:43 -07:00
Todd Poynor 2e2151f88b cpufreq: interactive: base hispeed bump on target freq, not actual
For systems that set a common speed for all CPUs, checking current
speed here could bypass the intermediate hispeed bump decision for
this CPU when another CPU was already at hispeed.  This could
result in an overly high setting (for all CPUs) in situations
where all CPUs were about to drop to load levels that map to
hispeed or below.

Change-Id: I186f23dcfc5e2b6336cab8b0327f0c8a9a4482bc
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 13:40:43 -07:00
Todd Poynor cdd48fc92e cpufreq: interactive: adjust code and documentation to match
Change-Id: If59c668d514a29febe5c35404fd9d01df8548eb1
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 13:40:43 -07:00
Todd Poynor a48c5d30b7 cpufreq: interactive: configurable delay before raising above hispeed
Change-Id: I4d6ac40b23a3790d48e30c37408284e9f955e8fa
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 13:40:43 -07:00
Todd Poynor b6258ddd64 cpufreq: interactive: don't drop speed if recently at higher load
Apply min_sample_time to the last time the current target speed
was originally requested or re-validated as appropriate for the
current load, not to the time since the current speed was
originally set.  Avoids periodic dips in speed during bursty
loads.
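
A minimal sketch of the timing check described above.  The function and
parameter names are illustrative, and times are assumed to be in
microseconds:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sketch only: names are hypothetical.
 * last_validated_us is the last time the current target speed was
 * requested or re-validated as appropriate for the load.  A drop is
 * allowed only once min_sample_time_us has elapsed since that time,
 * rather than since the speed was originally set.
 */
int may_drop_speed(uint64_t last_validated_us, uint64_t now_us,
                   uint64_t min_sample_time_us)
{
    assert(now_us >= last_validated_us);

    return (now_us - last_validated_us) >= min_sample_time_us;
}
```

Each time a burst re-validates the current speed, last_validated_us moves
forward, so the speed does not dip between bursts that arrive closer
together than min_sample_time.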

Change-Id: I250bda657985de60373f9897cc41f480664d51a1
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 13:40:43 -07:00
Todd Poynor af95534ed6 cpufreq: interactive: set at least hispeed when above hispeed load
If load is above go_hispeed_load, always go to at least hispeed_freq,
even when reducing speed from a higher speed, not just when jumping
up from minimum speed.  Avoids running at a lower than intended
speed after a burst of even higher load.
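
A minimal sketch of the floor described above.  Names are illustrative,
and treating a load equal to go_hispeed_load as "above" is an assumption:

```c
#include <assert.h>

/*
 * Sketch only: names are hypothetical.  Frequencies are in kHz and
 * load is a percentage.  When load reaches go_hispeed_load, the chosen
 * speed is never allowed below hispeed_freq, whether the governor is
 * raising or lowering speed.
 */
unsigned int apply_hispeed_floor(unsigned int load_pct,
                                 unsigned int go_hispeed_load,
                                 unsigned int new_freq,
                                 unsigned int hispeed_freq)
{
    assert(load_pct <= 100);

    if (load_pct >= go_hispeed_load && new_freq < hispeed_freq)
        return hispeed_freq;

    return new_freq;
}
```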

Change-Id: I5b9d2a15ba25ce609b21bac7c724265cf6838dee
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 13:40:42 -07:00
Todd Poynor c9d6c1ff8d cpufreq: interactive: apply intermediate load to max speed not current
Evaluate spikes in load (below go_hispeed_load) against the maximum
speed supported by the device, not the current speed (which tends to
make it too difficult to raise speed to intermediate levels until
very busy).
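
To illustrate the difference, a minimal sketch comparing the two ways of
turning a load percentage into a target speed.  Function names are
illustrative, not taken from the governor source:

```c
#include <assert.h>

/* Sketch only: names are hypothetical.  Frequencies are in kHz. */

/* Previous approach: scale the load against the current speed. */
unsigned int target_from_current(unsigned int load_pct, unsigned int cur_freq)
{
    assert(load_pct <= 100);
    return (unsigned int)((unsigned long long)cur_freq * load_pct / 100);
}

/* Approach described above: scale the load against the maximum speed. */
unsigned int target_from_max(unsigned int load_pct, unsigned int max_freq)
{
    assert(load_pct <= 100);
    return (unsigned int)((unsigned long long)max_freq * load_pct / 100);
}
```

For a 50% load on a CPU currently at 300000 with a maximum of 1200000,
the first form yields 150000 and the second yields 600000, which is why
the first form rarely reaches intermediate speeds from a low one.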

Change-Id: Ib937006abf8bedb60891a739acd733e89b732ae0
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 13:40:42 -07:00
Todd Poynor ea2bc4da31 cpufreq interactive governor: event tracing
Change-Id: Ic13614a3da2faa2d4bd215ca3eb7191614f0cf66
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2013-07-01 13:40:42 -07:00
Colin Cross 32d3498d04 cpufreq: Prevent memory leak in cpufreq_stats on hotplug
Ensures that cpufreq_stats_free_table is called before
__cpufreq_remove_dev on cpu hotplug (which also occurs during
suspend on SMP systems) to make sure that sysfs_remove_group
can get called before the cpufreq kobj is freed.  Otherwise,
the sysfs file structures are leaked.

Change-Id: I87e55277272f5cfad47e9e7c92630e990bb90069
Signed-off-by: Colin Cross <ccross@android.com>
2013-07-01 13:40:31 -07:00
Mike Chan 879d744033 cpufreq: interactive: New 'interactive' governor
This governor is designed for latency-sensitive workloads, such as
interactive user interfaces.  The interactive governor aims to be
significantly more responsive, ramping the CPU up quickly when
CPU-intensive activity begins.

Existing governors sample CPU load at a particular rate, typically
every X ms.  This can lead to under-powering UI threads for the period of
time during which the user begins interacting with a previously-idle system
until the next sample period happens.

The 'interactive' governor uses a different approach. Instead of sampling
the CPU at a specified rate, the governor will check whether to scale the
CPU frequency up soon after coming out of idle.  When the CPU comes out of
idle, a timer is configured to fire within 1-2 ticks.  If the CPU is very
busy from exiting idle to when the timer fires then we assume the CPU is
underpowered and ramp to MAX speed.

If the CPU was not sufficiently busy to immediately ramp to MAX speed, then
the governor evaluates the CPU load since the last speed adjustment,
choosing the highest value between that longer-term load or the short-term
load since idle exit to determine the CPU speed to ramp to.
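
A minimal sketch of the decision described in the two paragraphs above.
Names are illustrative, and scaling the result proportionally to the
maximum speed is an assumption, since the text does not say how the load
maps to a speed:

```c
#include <assert.h>

/*
 * Sketch only: names are hypothetical.  Loads are percentages and
 * frequencies are in kHz.
 */
unsigned int pick_speed(unsigned int short_term_load,
                        unsigned int long_term_load,
                        unsigned int go_maxspeed_load,
                        unsigned int max_freq)
{
    /* Use the higher of the load since idle exit and the longer-term load. */
    unsigned int load = short_term_load > long_term_load ?
                        short_term_load : long_term_load;

    assert(load <= 100);

    /* Busy enough: assume the CPU is underpowered and go to max speed. */
    if (load >= go_maxspeed_load)
        return max_freq;

    /* Assumption: otherwise pick a speed proportional to the load. */
    return (unsigned int)((unsigned long long)max_freq * load / 100);
}
```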

A realtime thread is used for scaling up, giving the remaining tasks the
CPU performance benefit, unlike existing governors, which are more likely
to schedule ramp-up work to occur after performance-starved tasks have
completed.

The tunables for this governor are:
/sys/devices/system/cpu/cpufreq/interactive/min_sample_time:
	The minimum amount of time to spend at the current frequency before
	ramping down. This is to ensure that the governor has seen enough
	historic CPU load data to determine the appropriate workload.
	Default is 80000 uS.
/sys/devices/system/cpu/cpufreq/interactive/go_maxspeed_load:
	The CPU load at which to ramp to max speed.  Default is 85.

Change-Id: Ib2b362607c62f7c56d35f44a9ef3280f98c17585
Signed-off-by: Mike Chan <mike@android.com>
Signed-off-by: Todd Poynor <toddpoynor@google.com>
Bug: 3152864
2013-07-01 13:40:31 -07:00
Jacob Shin c28375583b cpufreq: fix NULL pointer dereference at od_set_powersave_bias()
When initializing the default powersave_bias value, we need to first
make sure that this policy is running the ondemand governor.

Reported-and-tested-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Jacob Shin <jacob.shin@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-06-25 22:42:37 +02:00
Guennadi Liakhovetski 0ca6843655 cpufreq: cpufreq-cpu0: use the exact frequency for clk_set_rate()
clk_set_rate() isn't supposed to accept approximate frequencies, instead
a supported frequency should be obtained from clk_round_rate() and then
used to set the clock.

Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-06-05 13:51:29 +02:00
Michael Wang 2f7021a815 cpufreq: protect 'policy->cpus' from offlining during __gov_queue_work()
Jiri Kosina <jkosina@suse.cz> and Borislav Petkov <bp@alien8.de>
reported the warning:

[   51.616759] ------------[ cut here ]------------
[   51.621460] WARNING: at arch/x86/kernel/smp.c:123 native_smp_send_reschedule+0x58/0x60()
[   51.629638] Modules linked in: ext2 vfat fat loop snd_hda_codec_hdmi usbhid snd_hda_codec_realtek coretemp kvm_intel kvm snd_hda_intel snd_hda_codec crc32_pclmul crc32c_intel ghash_clmulni_intel snd_hwdep snd_pcm aesni_intel sb_edac aes_x86_64 ehci_pci snd_page_alloc glue_helper snd_timer xhci_hcd snd iTCO_wdt iTCO_vendor_support ehci_hcd edac_core lpc_ich acpi_cpufreq lrw gf128mul ablk_helper cryptd mperf usbcore usb_common soundcore mfd_core dcdbas evdev pcspkr processor i2c_i801 button microcode
[   51.675581] CPU: 0 PID: 244 Comm: kworker/1:1 Tainted: G        W    3.10.0-rc1+ #10
[   51.683407] Hardware name: Dell Inc. Precision T3600/0PTTT9, BIOS A08 01/24/2013
[   51.690901] Workqueue: events od_dbs_timer
[   51.695069]  0000000000000009 ffff88043a2f5b68 ffffffff8161441c ffff88043a2f5ba8
[   51.702602]  ffffffff8103e540 0000000000000033 0000000000000001 ffff88043d5f8000
[   51.710136]  00000000ffff0ce1 0000000000000001 ffff88044fc4fc08 ffff88043a2f5bb8
[   51.717691] Call Trace:
[   51.720191]  [<ffffffff8161441c>] dump_stack+0x19/0x1b
[   51.725396]  [<ffffffff8103e540>] warn_slowpath_common+0x70/0xa0
[   51.731473]  [<ffffffff8103e58a>] warn_slowpath_null+0x1a/0x20
[   51.737378]  [<ffffffff81025628>] native_smp_send_reschedule+0x58/0x60
[   51.744013]  [<ffffffff81072cfd>] wake_up_nohz_cpu+0x2d/0xa0
[   51.749745]  [<ffffffff8104f6bf>] add_timer_on+0x8f/0x110
[   51.755214]  [<ffffffff8105f6fe>] __queue_delayed_work+0x16e/0x1a0
[   51.761470]  [<ffffffff8105f251>] ? try_to_grab_pending+0xd1/0x1a0
[   51.767724]  [<ffffffff8105f78a>] mod_delayed_work_on+0x5a/0xa0
[   51.773719]  [<ffffffff814f6b5d>] gov_queue_work+0x4d/0xc0
[   51.779271]  [<ffffffff814f60cb>] od_dbs_timer+0xcb/0x170
[   51.784734]  [<ffffffff8105e75d>] process_one_work+0x1fd/0x540
[   51.790634]  [<ffffffff8105e6f2>] ? process_one_work+0x192/0x540
[   51.796711]  [<ffffffff8105ef22>] worker_thread+0x122/0x380
[   51.802350]  [<ffffffff8105ee00>] ? rescuer_thread+0x320/0x320
[   51.808264]  [<ffffffff8106634a>] kthread+0xea/0xf0
[   51.813200]  [<ffffffff81066260>] ? flush_kthread_worker+0x150/0x150
[   51.819644]  [<ffffffff81623d5c>] ret_from_fork+0x7c/0xb0
[   51.918165] nouveau E[     DRM] GPU lockup - switching to software fbcon
[   51.930505]  [<ffffffff81066260>] ? flush_kthread_worker+0x150/0x150
[   51.936994] ---[ end trace f419538ada83b5c5 ]---

It was caused by policy->cpus changing while __gov_queue_work() was
running; in other words, a CPU went offline.

Use get/put_online_cpus() to prevent the offline from happening while
__gov_queue_work() is running.

[rjw: The problem has been present since recent commit 031299b
(cpufreq: governors: Avoid unnecessary per cpu timer interrupts)]

References: https://lkml.org/lkml/2013/6/5/88
Reported-by: Borislav Petkov <bp@alien8.de>
Reported-and-tested-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-06-05 13:46:54 +02:00
Ross Lagerwall 8673b83bf2 acpi-cpufreq: set current frequency based on target P-State
Commit 4b31e774 (Always set P-state on initialization) fixed bug
#4634 and caused the driver to always set the target P-State at
least once since the initial P-State may not be the desired one.
Commit 5a1c0228 (cpufreq: Avoid calling cpufreq driver's target()
routine if target_freq == policy->cur) caused a regression in
this behavior.

This fixes the regression by setting policy->cur based on the CPU's
target frequency rather than the CPU's current reported frequency
(which may be different).  This means that the P-State will be set
initially if the CPU's target frequency is different from the
governor's target frequency.

This fixes an issue where setting the default governor to
performance wouldn't correctly enable turbo mode on all cores.

Signed-off-by: Ross Lagerwall <rosslagerwall@gmail.com>
Reviewed-by: Len Brown <len.brown@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: 3.8+ <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-06-05 13:10:57 +02:00
Linus Torvalds 1aad08dc57 Power management and ACPI fixes for 3.10-rc3
Merge tag 'pm+acpi-3.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management and ACPI fixes from Rafael Wysocki:

 - Additional CPU ID for the intel_pstate driver from Dirk Brandewie.

 - More cpufreq fixes related to ARM big.LITTLE support and locking from
   Viresh Kumar.

 - VIA C7 cpufreq build fix from Rafał Bilski.

 - ACPI power management fix making it possible to use device power
   states regardless of the CONFIG_PM setting from Rafael J Wysocki.

 - New ACPI video blacklist item from Bastian Triller.

* tag 'pm+acpi-3.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI / video: Add "Asus UL30A" to ACPI video detect blacklist
  cpufreq: arm_big_little_dt: Instantiate as platform_driver
  cpufreq: arm_big_little_dt: Register driver only if DT has valid data
  cpufreq / e_powersaver: Fix linker error when ACPI processor is a module
  cpufreq / intel_pstate: Add additional supported CPU ID
  cpufreq: Drop rwsem lock around CPUFREQ_GOV_POLICY_EXIT
  ACPI / PM: Allow device power states to be used for CONFIG_PM unset
2013-05-25 20:32:00 -07:00
Viresh Kumar 9076eaca60 cpufreq: arm_big_little_dt: Instantiate as platform_driver
As multiplatform builds are being adopted by more and more ARM platforms,
initcall functions should be used very carefully. For example, when both
the arm_big_little_dt and cpufreq-cpu0 drivers are compiled in, the
arm_big_little_dt driver may try to register even if a platform device for
cpufreq-cpu0 has been registered.

To eliminate this undesired effect, this patch changes the
arm_big_little_dt driver so that it is instantiated as a platform_driver.
It will then only run on platforms that create the platform_device
"arm-bL-cpufreq-dt".

Reported-and-tested-by: Rob Herring <robherring2@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-05-22 12:43:33 +02:00
Viresh Kumar 92a9b5c291 cpufreq: arm_big_little_dt: Register driver only if DT has valid data
If arm_big_little_dt driver is enabled, then it will always try to register with
big LITTLE cpufreq core driver. In case DT doesn't have relevant data for cpu
nodes, i.e. operating points aren't present, then we should exit early and
shouldn't register with big LITTLE cpufreq core driver. Otherwise we will fail
continuously from the driver->init() routine.

This patch fixes this issue.

Reported-and-tested-by: Jon Medhurst <tixy@linaro.org>
Reviewed-by: Jon Medhurst <tixy@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-05-22 12:42:34 +02:00
Rafał Bilski b5f14720a6 cpufreq / e_powersaver: Fix linker error when ACPI processor is a module
on i386:
CONFIG_ACPI_PROCESSOR=m
CONFIG_X86_E_POWERSAVER=y

drivers/built-in.o: In function `eps_cpu_init.part.8':
e_powersaver.c:(.text.unlikely+0x2243): undefined reference to `acpi_processor_register_performance'
e_powersaver.c:(.text.unlikely+0x22a2): undefined reference to `acpi_processor_unregister_performance'
e_powersaver.c:(.text.unlikely+0x246b): undefined reference to `acpi_processor_get_bios_limit'

X86_E_POWERSAVER should also depend on ACPI_PROCESSOR.

Signed-off-by: Rafal Bilski <rafalbilski@interia.pl>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-05-22 12:41:25 +02:00
Ralf Baechle bdc92d74e0 MIPS: Idle: Consolidate all declarations in <asm/idle.h>.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2013-05-22 01:34:27 +02:00
Ralf Baechle fb40bc3e94 MIPS: Idle: Re-enable irqs at the end of r3081, au1k and loongson2 cpu_wait.
Without this, the

    WARN_ON_ONCE(irqs_disabled());

in the idle loop will be triggered.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2013-05-22 01:34:26 +02:00
Dirk Brandewie c96d53d600 cpufreq / intel_pstate: Add additional supported CPU ID
Add CPU ID for Ivy Bridge processor.

Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-05-22 00:28:44 +02:00
Viresh Kumar 955ef48335 cpufreq: Drop rwsem lock around CPUFREQ_GOV_POLICY_EXIT
With the rwsem lock around
__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT), we
get a circular dependency when we call sysfs_remove_group().

 ======================================================
 [ INFO: possible circular locking dependency detected ]
 3.9.0-rc7+ #15 Not tainted
 -------------------------------------------------------
 cat/2387 is trying to acquire lock:
  (&per_cpu(cpu_policy_rwsem, cpu)){+++++.}, at: [<c02f6179>] lock_policy_rwsem_read+0x25/0x34

 but task is already holding lock:
  (s_active#41){++++.+}, at: [<c00f9bf7>] sysfs_read_file+0x4f/0xcc

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

-> #1 (s_active#41){++++.+}:
        [<c0055a79>] lock_acquire+0x61/0xbc
        [<c00fabf1>] sysfs_addrm_finish+0xc1/0x128
        [<c00f9819>] sysfs_hash_and_remove+0x35/0x64
        [<c00fbe6f>] remove_files.isra.0+0x1b/0x24
        [<c00fbea5>] sysfs_remove_group+0x2d/0xa8
        [<c02f9a0b>] cpufreq_governor_interactive+0x13b/0x35c
        [<c02f61df>] __cpufreq_governor+0x2b/0x8c
        [<c02f6579>] __cpufreq_set_policy+0xa9/0xf8
        [<c02f6b75>] store_scaling_governor+0x61/0x100
        [<c02f6f4d>] store+0x39/0x60
        [<c00f9b81>] sysfs_write_file+0xed/0x114
        [<c00b3fd1>] vfs_write+0x65/0xd8
        [<c00b424b>] sys_write+0x2f/0x50
        [<c000cdc1>] ret_fast_syscall+0x1/0x52

-> #0 (&per_cpu(cpu_policy_rwsem, cpu)){+++++.}:
        [<c0055253>] __lock_acquire+0xef3/0x13dc
        [<c0055a79>] lock_acquire+0x61/0xbc
        [<c03ee1f5>] down_read+0x25/0x30
        [<c02f6179>] lock_policy_rwsem_read+0x25/0x34
        [<c02f6edd>] show+0x21/0x58
        [<c00f9c0f>] sysfs_read_file+0x67/0xcc
        [<c00b40a7>] vfs_read+0x63/0xd8
        [<c00b41fb>] sys_read+0x2f/0x50
        [<c000cdc1>] ret_fast_syscall+0x1/0x52

 other info that might help us debug this:

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(s_active#41);
                                lock(&per_cpu(cpu_policy_rwsem, cpu));
                                lock(s_active#41);
   lock(&per_cpu(cpu_policy_rwsem, cpu));

  *** DEADLOCK ***

 2 locks held by cat/2387:
  #0:  (&buffer->mutex){+.+.+.}, at: [<c00f9bcd>] sysfs_read_file+0x25/0xcc
  #1:  (s_active#41){++++.+}, at: [<c00f9bf7>] sysfs_read_file+0x4f/0xcc

 stack backtrace:
 [<c0011d55>] (unwind_backtrace+0x1/0x9c) from [<c03e9a09>] (print_circular_bug+0x19d/0x1e8)
 [<c03e9a09>] (print_circular_bug+0x19d/0x1e8) from [<c0055253>] (__lock_acquire+0xef3/0x13dc)
 [<c0055253>] (__lock_acquire+0xef3/0x13dc) from [<c0055a79>] (lock_acquire+0x61/0xbc)
 [<c0055a79>] (lock_acquire+0x61/0xbc) from [<c03ee1f5>] (down_read+0x25/0x30)
 [<c03ee1f5>] (down_read+0x25/0x30) from [<c02f6179>] (lock_policy_rwsem_read+0x25/0x34)
 [<c02f6179>] (lock_policy_rwsem_read+0x25/0x34) from [<c02f6edd>] (show+0x21/0x58)
 [<c02f6edd>] (show+0x21/0x58) from [<c00f9c0f>] (sysfs_read_file+0x67/0xcc)
 [<c00f9c0f>] (sysfs_read_file+0x67/0xcc) from [<c00b40a7>] (vfs_read+0x63/0xd8)
 [<c00b40a7>] (vfs_read+0x63/0xd8) from [<c00b41fb>] (sys_read+0x2f/0x50)
 [<c00b41fb>] (sys_read+0x2f/0x50) from [<c000cdc1>] (ret_fast_syscall+0x1/0x52)

This lock isn't required while calling __cpufreq_governor(policy,
CPUFREQ_GOV_POLICY_EXIT). Remove it.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-05-22 00:23:54 +02:00
Srivatsa S. Bhat a66b2e503f cpufreq: Preserve sysfs files across suspend/resume
The file permissions of cpufreq per-cpu sysfs files are not preserved
across suspend/resume because we internally go through the CPU
Hotplug path which reinitializes the file permissions on CPU online.

But the user is not supposed to know that we are using CPU hotplug
internally within suspend/resume (IOW, the kernel should not silently
wreck the user-set file permissions across a suspend cycle).
Therefore, we need to preserve the file permissions as they are
across suspend/resume.

The simplest way to achieve that is to just not touch the sysfs files
at all, i.e., just ignore the CPU hotplug notifications in the
suspend/resume path (_FROZEN) in the cpufreq hotplug callback.

Reported-by: Robert Jarzmik <robert.jarzmik@intel.com>
Reported-by: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-05-15 21:47:17 +02:00
Wei Yongjun b57ffac5e5 cpufreq / intel_pstate: use vzalloc() instead of vmalloc()/memset(0)
Use vzalloc() instead of vmalloc() and memset(0).

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-05-14 01:39:28 +02:00
Borislav Petkov 60e6726c7b cpufreq, ondemand: Remove leftover debug line
I don't see how the virtual address of the tuners pointer would be of
any help to anyone so remove it.

Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-05-13 14:02:31 +02:00
Wolfram Sang d96f733017 cpufreq / kirkwood: don't check resource with devm_ioremap_resource
devm_ioremap_resource does sanity checks on the given resource. No need to
duplicate this in the driver.

Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-05-12 14:04:17 +02:00
Dirk Brandewie 35363e943f cpufreq / intel_pstate: remove #ifdef MODULE compile fence
The driver can no longer be built as a module, so remove the compile
fence around the cpufreq tracing call.

Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-05-12 14:04:17 +02:00
Dirk Brandewie a73108d578 cpufreq / intel_pstate: Remove idle mode PID
Remove dead code from the driver.

Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-05-12 14:04:17 +02:00
Dirk Brandewie ca182aee38 cpufreq / intel_pstate: fix ffmpeg regression
The ffmpeg benchmark in the phoronix test suite has threads on
multiple cores that rely on the progress of threads on other cores
and ping pong back and forth fast enough to make the core appear less
busy than it "should" be.  If the core has been at minimum p-state for
a while, bump the p-state up to kick the core to see if it is in this
ping pong state.  If the core is truly idle, the p-state will be
reduced at the next sample time.  If the core makes more progress, it
will send more work to the thread, bringing both threads out of the
ping pong scenario, and the p-state will be selected normally.

This fixes a performance regression of approximately 30%.

Cc: 3.9+ <stable@vger.kernel.org>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-05-12 14:04:16 +02:00
Dirk Brandewie d8f469e9cf cpufreq / intel_pstate: use lowest requested max performance
There are two ways that the maximum p-state can be clamped: via a
policy change and via the sysfs file.

The acpi-thermal driver adjusts the p-state policy in response to
thermal events.  These changes override the user's settings at the
moment.

Use the lower of the two requested values.  This ensures that we will
not exceed the requested p-state from either mechanism.
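
A minimal sketch of the clamping rule, assuming both limits are expressed
as percentages of maximum performance.  The function and parameter names
are illustrative and are not taken from the driver:

```c
#include <assert.h>

/*
 * Sketch only: names are hypothetical.
 * policy_max_pct is the limit requested through a policy change (for
 * example by a thermal driver); sysfs_max_pct is the limit the user
 * wrote through sysfs.  The effective limit is the lower of the two,
 * so neither request can be exceeded.
 */
int effective_max_perf_pct(int policy_max_pct, int sysfs_max_pct)
{
    assert(policy_max_pct >= 0 && sysfs_max_pct >= 0);

    return policy_max_pct < sysfs_max_pct ? policy_max_pct : sysfs_max_pct;
}
```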

Reported-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: 3.9+ <stable@vger.kernel.org>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-05-12 14:04:16 +02:00
Dirk Brandewie 1abc4b20b8 cpufreq / intel_pstate: remove idle time and duration from sample and calculations
Idle time is taken into account in the APERF/MPERF ratio calculation,
so there is no reason for the driver to track it separately.  This
reduces the work in the driver and makes the code more readable.

Removal of the tracking of sample duration removes the possibility of
the divide by zero exception when the duration is below 1us.
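
A minimal sketch of a busy estimate built only from the APERF and MPERF
counter deltas, with no sample duration involved.  The function name, the
percentage scaling, and the zero guard are assumptions for illustration
and are not the driver's actual fixed-point code:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sketch only: names are hypothetical.
 * aperf_delta and mperf_delta are the changes in the two counters over
 * one sample.  The result is their ratio as a percentage.  No sample
 * duration appears in the calculation, so a very short sample cannot
 * introduce a division by a zero duration.
 */
unsigned int core_busy_pct(uint64_t aperf_delta, uint64_t mperf_delta)
{
    /* Keep the multiplication below from overflowing 64 bits. */
    assert(aperf_delta <= UINT64_MAX / 100);

    /* Assumption: report 0 when MPERF did not advance in this sample. */
    if (mperf_delta == 0)
        return 0;

    return (unsigned int)(aperf_delta * 100 / mperf_delta);
}
```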

References: https://bugzilla.kernel.org/show_bug.cgi?id=56691
Reported-by: Mike Lothian <mike@fireburn.co.uk>
Cc: 3.9+ <stable@vger.kernel.org>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-05-12 14:04:16 +02:00