path: root/kernel/sched
Age  Commit message  Author

2015-04-15  Merge branch 'for-lsk' of git://git.linaro.org/arm/big.LITTLE/mp into linux-linaro-lsk  (Alex Shi)

2015-04-14  sched: hmp: fix spinlock recursion in active migration  (Kevin Hilman)

Commit cd5c2cc93d3d ("hmp: Remove potential for task_struct access race")
introduced a put_task_struct() to prevent races, but in doing so introduced
potential spinlock recursion. (This change was further consolidated in
commit 0baa5811bacf, "sched: hmp: unify active migration code".)

Unfortunately, the put_task_struct() is done while the runqueue spinlock is
held, but put_task_struct() can also cause a reschedule, causing the
runqueue lock to be acquired recursively.

To fix, move the put_task_struct() outside the runqueue spinlock.

Reported-by: Victor Lixin <victor.lixin@hisilicon.com>
Cc: Jorge Ramirez-Ortiz <jorge.ramirez-ortiz@linaro.org>
Cc: Liviu Dudau <Liviu.Dudau@arm.com>
Signed-off-by: Kevin Hilman <khilman@linaro.org>
Reviewed-by: Jon Medhurst <tixy@linaro.org>
Reviewed-by: Alex Shi <alex.shi@linaro.org>
Reviewed-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>

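A minimal sketch of the reordering, standing in for the tail of the HMP
active-migration stopper callback; the function name and signature are
illustrative, not the exact LSK diff:

    /* Illustrative stand-in for the end of the HMP migration stopper.
     * The point is the ordering of the unlock versus the reference drop. */
    static void hmp_migration_finish(struct rq *busiest_rq,
                                     struct task_struct *p,
                                     unsigned long flags)
    {
        /* Before the fix, put_task_struct(p) was called here, while
         * busiest_rq->lock was still held; dropping the last reference
         * can reschedule, which re-acquires the runqueue lock and
         * recurses on the spinlock. */
        raw_spin_unlock_irqrestore(&busiest_rq->lock, flags);

        /* After the fix: drop the reference with no rq lock held. */
        put_task_struct(p);
    }
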
2014-08-14  Merge remote-tracking branch 'lsk/v3.10/topic/big.LITTLE' into linux-linaro-lsk  (Mark Brown)

2014-08-12  hmp: Restrict ILB events if no CPU has > 1 task  (Chris Redpath)

Frequently in HMP, the big CPUs are only active with one task per CPU and
there may be idle CPUs in the big cluster. This patch avoids triggering an
idle balance in situations where none of the active CPUs in the current
HMP domain have more than one task running.

When packing is enabled, only enforce this behaviour when we are not in
the smallest domain - there we idle balance whenever a CPU is over the
up_threshold regardless of task counts, in case one needs to be moved.

Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>

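The gating test reduces to scanning the domain's runqueues; a sketch with
a hypothetical helper name, assuming the current HMP domain is available
as a cpumask:

    /* Hypothetical helper: an idle balance is only worthwhile if some
     * online CPU in this HMP domain is running more than one task. */
    static bool hmp_domain_has_pullable_tasks(const struct cpumask *domain)
    {
        int cpu;

        for_each_cpu_and(cpu, domain, cpu_online_mask)
            if (cpu_rq(cpu)->nr_running > 1)
                return true;

        return false;
    }
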
2014-08-12  HMP: Do not fork-boost tasks coming from PIDs <= 2  (Chris Redpath)

System services are generally started by init, whilst kernel threads are
started by kthreadd. We do not want to give those tasks a head start, as
this costs power for very little benefit.

We do however wish to do that for tasks which the user launches. Further,
some tasks allocate per-cpu timers directly after launch, which can lead
to those tasks always being scheduled on a big CPU when there is no
computational need to do so. Not promoting services to big CPUs on launch
will prevent that, unless a service allocates its per-cpu resources after
a period of intense computation, which is not a common pattern.

Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>

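Since init is PID 1 and kthreadd is PID 2, the filter is a parent-PID
check; a sketch (the helper name is an assumption):

    /* Hypothetical helper: fork-boost only tasks whose parent is
     * neither init (PID 1) nor kthreadd (PID 2). */
    static inline bool hmp_task_should_forkboost(struct task_struct *task)
    {
        return task->parent && task->parent->pid > 2;
    }
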
2014-08-08  Merge tag 'v3.10.52' into linux-linaro-lsk  (Alex Shi)

This is the 3.10.52 stable release.

2014-08-07  printk: rename printk_sched to printk_deferred  (John Stultz)

commit aac74dc495456412c4130a1167ce4beb6c1f0b38 upstream.

After learning we'll need some sort of deferred printk functionality in
the timekeeping core, Peter suggested we rename the printk_sched function
so it can be reused by needed subsystems.

This only changes the function name. No logic changes.

Signed-off-by: John Stultz <john.stultz@linaro.org>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Bohac <jbohac@suse.cz>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

2014-07-29  Merge remote-tracking branch 'lts/linux-3.10.y' into linux-linaro-lsk  (Alex Shi)

Conflicts:
	arch/arm64/Kconfig

2014-07-28  sched: Fix possible divide by zero in avg_atom() calculation  (Mateusz Guzik)

commit b0ab99e7736af88b8ac1b7ae50ea287fffa2badc upstream.

proc_sched_show_task() does:

	if (nr_switches)
		do_div(avg_atom, nr_switches);

nr_switches is unsigned long and do_div truncates it to 32 bits, which
means it can test non-zero on e.g. x86-64 and be truncated to zero for
division.

Fix the problem by using div64_ul() instead. As a side effect,
calculations of avg_atom for big nr_switches are now correct.

Signed-off-by: Mateusz Guzik <mguzik@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1402750809-31991-1-git-send-email-mguzik@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

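A sketch of the corrected computation; the wrapper function is
illustrative, while the do_div()/div64_ul() semantics are as described
above:

    #include <linux/math64.h>

    /* Illustrative wrapper over the proc_sched_show_task() logic.
     * do_div() truncates the divisor to 32 bits, so a value such as
     * 1UL << 32 tests non-zero yet divides by zero; div64_ul() keeps
     * the full unsigned long divisor. */
    static u64 calc_avg_atom(u64 total_runtime, unsigned long nr_switches)
    {
        u64 avg_atom = total_runtime;

        if (nr_switches)
            avg_atom = div64_ul(avg_atom, nr_switches); /* was do_div() */

        return avg_atom;
    }
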
2014-06-27  Merge remote-tracking branch 'lsk/v3.10/topic/big.LITTLE' into linux-linaro-lsk  (Mark Brown)

2014-06-27  HMP: use per cpu cpuidle driver to fix deadlock in hmp_idle_pull  (Alex Shi)

Use the per-cpu cpuidle driver to fix a deadlock in hmp_idle_pull;
otherwise a deadlock occurs when bl_idle_init() runs:

[  113.878664] other info that might help us debug this:
[  113.878667]  Possible unsafe locking scenario:
[  113.878667]
[  113.878670]        CPU0
[  113.878673]        ----
[  113.878681]   lock(cpuidle_driver_lock);
[  113.878684]   <Interrupt>
[  113.878691]     lock(cpuidle_driver_lock);
[  113.878693]
[  113.878693]  *** DEADLOCK ***
[  113.878693]
[  113.878697] 1 lock held by ksoftirqd/4/28:
[  113.878719]  #0: (hmp_force_migration){+.....}, at: [<c0054da5>] hmp_idle_pull+0x49/0x508

This patch is just a quick/cheap workaround for the cpuidle_driver_lock
deadlock. It works for TC2 and any other platform where the idle driver
cannot be changed at runtime.

Signed-off-by: Alex Shi <alex.shi@linaro.org>
Signed-off-by: Jon Medhurst <tixy@linaro.org>

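A sketch of the workaround's shape: look the driver up through the
per-CPU cpuidle device so cpuidle_driver_lock is never taken from softirq
context. The helper name is an assumption:

    #include <linux/cpuidle.h>

    /* Hypothetical helper: avoids cpuidle_get_driver(), which takes
     * cpuidle_driver_lock and can deadlock under hmp_idle_pull. */
    static struct cpuidle_driver *hmp_cpu_idle_driver(int cpu)
    {
        struct cpuidle_device *dev = per_cpu(cpuidle_devices, cpu);

        return dev ? cpuidle_get_cpu_driver(dev) : NULL;
    }
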
2014-06-25  Merge remote-tracking branch 'lsk/v3.10/topic/big.LITTLE' into linux-linaro-lsk  (Mark Brown)

2014-06-12  Merge tag v3.10.43 into linux-linaro-lsk  (Alex Shi)

This is the 3.10.43 stable release.

2014-06-11  sched: Fix hotplug vs. set_cpus_allowed_ptr()  (Lai Jiangshan)

commit 6acbfb96976fc3350e30d964acb1dbbdf876d55e upstream.

Lai found that:

	WARNING: CPU: 1 PID: 13 at arch/x86/kernel/smp.c:124 native_smp_send_reschedule+0x2d/0x4b()
	...
	migration_cpu_stop+0x1d/0x22

was caused by set_cpus_allowed_ptr() assuming that cpu_active_mask is
always a sub-set of cpu_online_mask.

This isn't true since 5fbd036b552f ("sched: Cleanup cpu_active madness").

So set active and online at the same time to avoid this particular
problem.

Fixes: 5fbd036b552f ("sched: Cleanup cpu_active madness")
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael wang <wangyun@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Toshi Kani <toshi.kani@hp.com>
Link: http://lkml.kernel.org/r/53758B12.8060609@cn.fujitsu.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

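A sketch modelled on the described fix: flip the active bit together with
the online bit so no observer sees an active-but-not-online CPU. Treat
the bitmap names as assumptions about kernel/cpu.c internals:

    /* Sketch: publish 'active' and 'online' together. */
    void set_cpu_online(unsigned int cpu, bool online)
    {
        if (online) {
            cpumask_set_cpu(cpu, to_cpumask(cpu_online_bits));
            cpumask_set_cpu(cpu, to_cpumask(cpu_active_bits)); /* new */
        } else {
            cpumask_clear_cpu(cpu, to_cpumask(cpu_online_bits));
        }
    }
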
2014-06-11  sched: Sanitize irq accounting madness  (Thomas Gleixner)

commit 2d513868e2a33e1d5315490ef4c861ee65babd65 upstream.

Russell reported that irqtime_account_idle_ticks() takes ages due to:

	for (i = 0; i < ticks; i++)
		irqtime_account_process_tick(current, 0, rq);

It's sad that this code was written way _AFTER_ the NOHZ idle
functionality was available. I charge myself guilty for not paying
attention when that crap got merged with commit abb74cefa ("sched:
Export ns irqtimes through /proc/stat").

So instead of looping nr_ticks times, just apply the whole thing at once.

As a side note: the whole cputime_t vs. u64 business in that context
wants to be cleaned up as well. There is no point in having all these
back and forth conversions. Let's standardise on u64 nsec for all kernel
internal accounting and be done with it. Everything else does not make
sense at all for fine grained accounting. Frederic, can you please take
care of that?

Reported-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Venkatesh Pallipadi <venki@google.com>
Cc: Shaun Ruffell <sruffell@digium.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1405022307000.6261@ionos.tec.linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

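The batching amounts to passing the tick count down rather than looping;
a sketch assuming the per-tick routine gains a tick-count argument, as
the commit describes:

    /* Sketch: account all idle ticks in one call. The extra 'ticks'
     * parameter replaces the old one-call-per-tick loop. */
    static void irqtime_account_idle_ticks(int ticks)
    {
        struct rq *rq = this_rq();

        irqtime_account_process_tick(current, 0, rq, ticks);
    }
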
2014-06-11  sched: Use CPUPRI_NR_PRIORITIES instead of MAX_RT_PRIO in cpupri check  (Steven Rostedt (Red Hat))

commit 6227cb00cc120f9a43ce8313bb0475ddabcb7d01 upstream.

The check at the beginning of cpupri_find() makes sure that the task_pri
variable does not exceed the cp->pri_to_cpu array length. But that length
is CPUPRI_NR_PRIORITIES, not MAX_RT_PRIO, so the check misses the last
two priorities in that array.

As task_pri is computed from convert_prio(), which should never return a
value bigger than CPUPRI_NR_PRIORITIES, the check should cause a panic if
it is hit.

Reported-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1397015410.5212.13.camel@marge.simpson.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

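The resulting check, sketched from the description; the scan over
cp->pri_to_cpu is elided:

    /* Sketch of the corrected bound inside cpupri_find(). */
    static int cpupri_find_sketch(struct cpupri *cp, struct task_struct *p)
    {
        int task_pri = convert_prio(p->prio);

        /* was: if (task_pri >= MAX_RT_PRIO) return 0; which silently
         * skipped the last two entries of cp->pri_to_cpu[]. A violation
         * now panics instead of being masked. */
        BUG_ON(task_pri >= CPUPRI_NR_PRIORITIES);

        /* ... scan priorities 0..task_pri as before ... */
        return 0;
    }
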
2014-06-11  sched: hmp: fix out-of-range CPU possible  (Chris Redpath)

If someone hotplugs all the little CPUs while another CPU is handling a
wakeup, we can potentially return new_cpu == NR_CPUS from
hmp_select_slower_cpu (which is called internally by hmp_best_little_cpu
as well). That value is then used to dereference the per-cpu rq array in
hmp_next_down_delay, which can go boom.

Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>

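A sketch of the guard; the wrapper and the fallback policy are
assumptions:

    /* Illustrative: hmp_select_slower_cpu() can return NR_CPUS when all
     * slower CPUs are offline, so validate before indexing cpu_rq(). */
    static int hmp_pick_slower_cpu(struct task_struct *tsk, int prev_cpu)
    {
        int new_cpu = hmp_select_slower_cpu(tsk, prev_cpu);

        if (new_cpu >= nr_cpu_ids)
            new_cpu = prev_cpu;     /* hypothetical fallback */

        return new_cpu;
    }
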
2014-05-15  Merge remote-tracking branch 'lsk/v3.10/topic/big.LITTLE' into linux-linaro-lsk  (Mark Brown)

2014-05-09  hmp: dont attempt to pull tasks if affinity doesn't allow it  (Chris Redpath)

When looking for a task to be idle-pulled, don't consider tasks where the
affinity does not allow that task to be placed on the target CPU. Also
ensure that tasks with restricted affinity do not block selecting other
unrestricted busy tasks.

Use the knowledge of the target CPU more effectively in idle pull by
passing it to hmp_get_heaviest_task when we know it, otherwise only
checking for general affinity matches with any of the CPUs in the bigger
HMP domain.

We still need to explicitly check that affinity is allowed in idle pull,
since if we find no match in hmp_get_heaviest_task we will return the
current task, which may not be affine to the new CPU despite having high
enough load. In this case, there is nothing to move.

Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>

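The explicit affinity test reduces to a cpumask check; a sketch (the
wrapper name is an assumption, cpumask_test_cpu() and tsk_cpus_allowed()
are the standard 3.10 interfaces):

    /* Illustrative: a candidate may only be idle-pulled to 'cpu' if its
     * affinity mask contains that CPU. */
    static bool hmp_task_allowed_on(struct task_struct *p, int cpu)
    {
        return cpumask_test_cpu(cpu, tsk_cpus_allowed(p));
    }
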
2014-05-09  hmp: Use idle pull to perform forced up-migrations  (Chris Redpath)

When a normal forced up-migration takes place, we stop the task to be
migrated while the target CPU becomes available. This delay can range
from 80us to 1500us on TC2 if the target CPU is in a deep idle state.

Instead, interrupt the target CPU and ask it to pull a task. This lets
the current eligible task continue executing on the original CPU while
the target CPU wakes. Use a pinned timer to prevent the pulling CPU going
back into power-down with pending up-migrations.

If we trigger for a nohz kick, it doesn't matter about triggering for an
idle pull since the idle_pull flag will be set when we execute the
softirq and we'll still do the idle pull. If the target CPU is busy, we
will not pull any tasks.

Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>

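The kick can be as small as a per-runqueue flag plus a reschedule IPI; a
sketch with a hypothetical flag name:

    /* Illustrative: ask 'target_cpu' to pull a task instead of stopping
     * the migrating task while that CPU wakes from idle.
     * 'wake_for_idle_pull' stands in for a per-rq flag consumed on the
     * target CPU when it handles the resulting softirq. */
    static void hmp_kick_idle_pull(int target_cpu)
    {
        cpu_rq(target_cpu)->wake_for_idle_pull = 1;
        smp_send_reschedule(target_cpu);
    }
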
2014-05-09  sched: hmp: unify active migration code  (Chris Redpath)

The HMP active migration code is functionally identical to the CFS active
migration code, apart from one flag check. Share the code and make the
flag check optional. Two wrapper functions allow the flag check to be
present or not.

Thanks to tixy@linaro.org for pointing out the build break and a good
solution in an earlier version.

Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>

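The wrapper pattern, sketched; the shared body is elided and the names
follow the existing CFS stopper callback, so treat the exact signatures
as assumptions:

    /* Illustrative: one shared migration body, two thin wrappers that
     * differ only in whether the active-balance flag is checked. */
    static int __do_active_load_balance_cpu_stop(void *data, bool check_sd_lb_flag)
    {
        /* ... common CFS/HMP active-migration logic ... */
        return 0;
    }

    static int active_load_balance_cpu_stop(void *data)
    {
        return __do_active_load_balance_cpu_stop(data, true);
    }

    static int hmp_active_task_migration_cpu_stop(void *data)
    {
        return __do_active_load_balance_cpu_stop(data, false);
    }
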
2014-05-09  hmp: sched: Clean up hmp_up_threshold checks into a utility fn  (Chris Redpath)

In anticipation of modifying the up_threshold handling, make all
instances use the same utility fn to check if a task is eligible for
up-migration. This also removes the previous difference in threshold
comparison, where up-migration used '!<threshold' and idle pull used
'>threshold' to decide eligibility. Make them both use '!<threshold'
instead for consistency, although this is unlikely to change any results.

Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>

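A sketch of the utility; the load field follows the per-entity
load-tracking naming used by this tree, but treat the exact body as an
assumption:

    /* Illustrative: eligible when tracked load is not below the
     * threshold. Writing '!(x < t)' rather than 'x > t' makes a load
     * exactly at the threshold eligible, matching the old
     * up-migration test. */
    static inline bool hmp_task_eligible_for_up_migration(struct sched_entity *se)
    {
        return !(se->avg.load_avg_ratio < hmp_up_threshold);
    }
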
2014-05-07  Merge remote-tracking branch 'lsk/v3.10/topic/big.LITTLE' into linux-linaro-lsk  (Mark Brown)

2014-05-07  sched: hmp: Change small task packing defaults for all platforms  (Chris Redpath)

All platforms other than TC2 default to enabling packing. Since TC2 shows
no performance or energy degradation with this feature enabled, make it
default-enabled the same as everywhere else. Likewise, vendors have been
including TC2 support in multi-machine kernel builds, so they expect the
default thresholds to remain the same when the TC2 #ifdef is removed.

Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>

2014-04-08  Merge branch 'v3.10/topic/big.LITTLE' of git://git.linaro.org/kernel/linux-linaro-stable into linux-linaro-lsk  (Mark Brown)

2014-04-08  Revert "hmp: sched: Clean up hmp_up_threshold checks into a utility fn"  (Jon Medhurst)

This reverts commit 765aae26e6e296333c3a5f7a02360f5389dc439a.

Signed-off-by: Jon Medhurst <tixy@linaro.org>

2014-04-08  Revert "sched: hmp: unify active migration code"  (Jon Medhurst)

This reverts commit 0baa5811bacf15b0e76ee85ce29fedffb5136313.

Signed-off-by: Jon Medhurst <tixy@linaro.org>

2014-04-08  Revert "hmp: Use idle pull to perform forced up-migrations"  (Jon Medhurst)

This reverts commit aae7721f20f2520d24a149408a74f18e58f56472.

Signed-off-by: Jon Medhurst <tixy@linaro.org>

2014-04-08  Revert "hmp: dont attempt to pull tasks if affinity doesn't allow it"  (Jon Medhurst)

This reverts commit 5a570cfc01b06906faa8ac67ad7c0c6f278761c4.

Signed-off-by: Jon Medhurst <tixy@linaro.org>

2014-03-31  Merge tag 'v3.10.35' into linux-linaro-lsk  (Mark Brown)

This is the 3.10.35 stable release.

2014-03-31  sched/autogroup: Fix race with task_groups list  (Gerald Schaefer)

commit 41261b6a832ea0e788627f6a8707854423f9ff49 upstream.

In autogroup_create(), a tg is allocated and added to the task_groups
list. If CONFIG_RT_GROUP_SCHED is set, this tg is then modified while on
the list, without locking. This can race with someone walking the list,
like __enable_runtime() during CPU unplug, and result in a use-after-free
bug.

To fix this, move sched_online_group(), which adds the tg to the list, to
the end of the autogroup_create() function, after the modification.

Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1369411669-46971-2-git-send-email-gerald.schaefer@de.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

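The publish-last ordering, sketched against autogroup_create(); the
RT-group setup is elided and the wrapper is illustrative:

    /* Illustrative: finish modifying 'tg' while it is still private,
     * then publish it on the task_groups list as the very last step. */
    static struct task_group *autogroup_create_tg(void)
    {
        struct task_group *tg = sched_create_group(&root_task_group);

        if (IS_ERR(tg))
            return tg;
    #ifdef CONFIG_RT_GROUP_SCHED
        /* ... adjust tg's RT scheduling entities, still unpublished ... */
    #endif
        sched_online_group(tg, &root_task_group); /* list insert last */
        return tg;
    }
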
2014-03-31  Merge branch 'for-lsk' of git://git.linaro.org/arm/big.LITTLE/mp into linux-linaro-lsk  (Mark Brown)

2014-03-28  Merge tag 'v3.10.34' into linux-linaro-lsk  (Alex Shi)

This is the 3.10.34 stable release.

2014-03-24  hmp: dont attempt to pull tasks if affinity doesn't allow it  (Chris Redpath)

When looking for a task to be idle-pulled, don't consider tasks where the
affinity does not allow that task to be placed on the target CPU. Also
ensure that tasks with restricted affinity do not block selecting other
unrestricted busy tasks.

Use the knowledge of the target CPU more effectively in idle pull by
passing it to hmp_get_heaviest_task when we know it, otherwise only
checking for general affinity matches with any of the CPUs in the bigger
HMP domain.

We still need to explicitly check that affinity is allowed in idle pull,
since if we find no match in hmp_get_heaviest_task we will return the
current task, which may not be affine to the new CPU despite having high
enough load. In this case, there is nothing to move.

Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>

2014-03-24  hmp: Use idle pull to perform forced up-migrations  (Chris Redpath)

When a normal forced up-migration takes place, we stop the task to be
migrated while the target CPU becomes available. This delay can range
from 80us to 1500us on TC2 if the target CPU is in a deep idle state.

Instead, interrupt the target CPU and ask it to pull a task. This lets
the current eligible task continue executing on the original CPU while
the target CPU wakes. Use a pinned timer to prevent the pulling CPU going
back into power-down with pending up-migrations.

If we trigger for a nohz kick, it doesn't matter about triggering for an
idle pull since the idle_pull flag will be set when we execute the
softirq and we'll still do the idle pull. If the target CPU is busy, we
will not pull any tasks.

Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>

2014-03-24  sched: hmp: unify active migration code  (Chris Redpath)

The HMP active migration code is functionally identical to the CFS active
migration code, apart from one flag check. Share the code and make the
flag check optional. Two wrapper functions allow the flag check to be
present or not.

Thanks to tixy@linaro.org for pointing out the build break and a good
solution in an earlier version.

Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>

2014-03-24  hmp: sched: Clean up hmp_up_threshold checks into a utility fn  (Chris Redpath)

In anticipation of modifying the up_threshold handling, make all
instances use the same utility fn to check if a task is eligible for
up-migration. This also removes the previous difference in threshold
comparison, where up-migration used '!<threshold' and idle pull used
'>threshold' to decide eligibility. Make them both use '!<threshold'
instead for consistency, although this is unlikely to change any results.

Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>

2014-03-23  sched: Fix double normalization of vruntime  (George McCollister)

commit 791c9e0292671a3bfa95286bb5c08129d8605618 upstream.

dequeue_entity() is called when p->on_rq and sets se->on_rq = 0, which
appears to guarantee that the !se->on_rq condition is met. If the task
has done set_current_state(TASK_INTERRUPTIBLE) without schedule(), the
second condition will be met and vruntime will be incorrectly adjusted
twice.

In certain cases this can result in the task's vruntime never increasing
past the vruntime of other tasks on the CFS run queue, starving them of
CPU time.

This patch changes switched_from_fair() to use !p->on_rq instead of
!se->on_rq.

I'm able to cause a task with a priority of 120 to starve all other tasks
with the same priority on an ARM platform running 3.2.51-rt72 PREEMPT RT
by writing one character at a time to a serial tty (16550 UART) in a
tight loop. I'm also able to verify that making this change corrects the
problem on that platform and kernel version.

Signed-off-by: George McCollister <george.mccollister@gmail.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1392767811-28916-1-git-send-email-george.mccollister@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

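The fixed condition in switched_from_fair(), sketched; the normalization
body follows the existing CFS code, so treat the exact shape as an
approximation:

    static void switched_from_fair_sketch(struct rq *rq, struct task_struct *p)
    {
        struct sched_entity *se = &p->se;
        struct cfs_rq *cfs_rq = cfs_rq_of(se);

        /* was: !se->on_rq, but dequeue_entity() has already cleared
         * se->on_rq, so the vruntime was normalized a second time. */
        if (!p->on_rq && p->state != TASK_RUNNING) {
            /* Normalize exactly once for a task sleeping in a
             * non-running state. */
            place_entity(cfs_rq, se, 0);
            se->vruntime -= cfs_rq->min_vruntime;
        }
    }
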
2014-01-22  Merge remote-tracking branch 'lsk/v3.10/topic/big.LITTLE' into linux-linaro-lsk  (Mark Brown)

2014-01-22  HMP: Fix rt task allowed cpu mask restriction code on 1x1 system  (Dietmar Eggemann)

There is an error scenario where, on a 1x1 HMP system (the weight of the
hmp_slow_cpu_mask is 1), the short-cut of restricting the allowed cpu
mask of an rt task triggers a kernel bug in the rt sched class
set_cpus_allowed function, set_cpus_allowed_rt().

In case the task is on the run-queue, the weight of the required cpu mask
is 1, and this differs from the p->nr_cpus_allowed value, this back-end
function interprets the change as a task going from migratable to not
migratable and decrements the rt_nr_migratory counter. There is a
BUG_ON(!rq->rt.rt_nr_migratory) check in this code path which triggers in
this situation.

To circumvent this issue, set the number of allowed cpus for a task p to
the weight of the hmp_slow_cpu_mask before calling do_set_cpus_allowed()
in __setscheduler(). It will be set to this value in
do_set_cpus_allowed() after the call to the sched class related backend
function anyway. By doing this, set_cpus_allowed_rt() returns without
trying to update the rt_nr_migratory counter.

This patch has been tested with a test device driver requiring a threaded
irq handler on a TC2 system with a reduced cpu mask (1 Cortex-A15,
1 Cortex-A7).

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>

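The workaround, sketched as a helper around the __setscheduler() call
site; hmp_slow_cpu_mask is the existing HMP mask, the wrapper is an
assumption:

    /* Illustrative: pre-set nr_cpus_allowed to the new mask's weight so
     * set_cpus_allowed_rt() observes no migratable to non-migratable
     * transition; do_set_cpus_allowed() sets the same value anyway. */
    static void hmp_restrict_rt_task_cpus(struct task_struct *p)
    {
        p->nr_cpus_allowed = cpumask_weight(&hmp_slow_cpu_mask);
        do_set_cpus_allowed(p, &hmp_slow_cpu_mask);
    }
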
2014-01-22  sched: hmp: Fix potential task_struct memory leak  (Chris Redpath)

We use get_task_struct to increment the ref count on a task_struct so
that even if the task dies with a pending migration we are still able to
read the memory without causing a fault. In the case of non-running
tasks, we forgot to decrement the ref count when we are done with the
task.

Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>

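The required pairing, sketched; the inspection body is illustrative:

    /* Illustrative: every reference taken for a pending migration must
     * be dropped on every exit path, including the non-running case
     * this commit fixes. */
    static void hmp_inspect_task_load(struct task_struct *p)
    {
        get_task_struct(p);   /* p stays readable even if it exits */
        /* ... read p's tracked load, decide whether to migrate ... */
        put_task_struct(p);   /* the previously missing decrement */
    }
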
2014-01-22  sched: hmp: Change TC2 packing config to disabled default if present  (Chris Redpath)

Since TC2 power curves don't really have a utilisation hotspot where
packing makes sense, if packing is present on a TC2 system at least make
it default to disabled.

Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>

2014-01-22  sched: hmp: Make idle balance behaviour normal when packing disabled  (Chris Redpath)

The presence of packing permanently changed the idle balance behaviour.
Do not restrict idle balance on the smallest CPUs when packing is present
but disabled.

Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>

2014-01-22  sched: update runqueue clock before migrations away  (Chris Redpath)

If we migrate a sleeping task away from a CPU which has the tick stopped,
then both the clock_task and decay_counter will be out of date for that
CPU, and we will not decay load correctly regardless of how often we
update the blocked load.

This is only an issue for tasks which are not on a runqueue (because
otherwise that CPU would be awake) and, simultaneously, the CPU the task
previously ran on has had the tick stopped.

Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>

2014-01-22  sched: reset blocked load decay_count during synchronization  (Chris Redpath)

If an entity happens to sleep for less than one tick duration, the
tracked load associated with that entity can be decayed by an
unexpectedly large amount if it is later migrated to a different CPU.
This can interfere with correct scheduling when entity load is used for
decision making.

The reason for this is that when an entity is dequeued and enqueued
quickly, such that se.avg.decay_count and cfs_rq.decay_counter do not
differ when that entity is enqueued again, __synchronize_entity_decay
skips the calculation step and also skips clearing the decay_count. At a
later time that entity may be migrated and its load will be decayed
incorrectly.

All users of this function expect decay_count to be zeroed after use.

Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>

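A sketch of the restored invariant in __synchronize_entity_decay(); field
names follow the 3.10 per-entity load-tracking code, the body is
illustrative:

    static u64 synchronize_entity_decay_sketch(struct sched_entity *se)
    {
        struct cfs_rq *cfs_rq = cfs_rq_of(se);
        u64 decays = atomic64_read(&cfs_rq->decay_counter);

        decays -= se->avg.decay_count;
        se->avg.decay_count = 0;    /* clear even when decays == 0 */

        if (decays)
            se->avg.load_avg_contrib =
                decay_load(se->avg.load_avg_contrib, decays);

        return decays;
    }
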
2014-01-22  sched/debug: Add load-tracking statistics to task  (Kamalesh Babulal)

At present we print per-entity load-tracking statistics for the cfs_rq of
cgroups/runqueues. Given that per-task statistics are maintained, they
can be used to know the contribution made by the task to its parenting
cfs_rq level. This patch adds per-task load-tracking statistics to
/proc/<PID>/sched.

Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20130625080336.GA20175@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 939fd731eb88a0cdd9058d0b0143563172a217d7)
Signed-off-by: Jon Medhurst <tixy@linaro.org>

2014-01-16  Merge remote-tracking branch 'stable/linux-3.10.y' 3.10.27 into linux-linaro-lsk  (Alex Shi)

2014-01-15  sched: Guarantee new group-entities always have weight  (Paul Turner)

commit 0ac9b1c21874d2490331233b3242085f8151e166 upstream.

Currently, group entity load-weights are initialized to zero. This admits
some races with respect to the first time they are re-weighted in early
use. (Let g[x] denote the se for "g" on cpu "x".)

Suppose that we have root->a and that a enters a throttled state,
immediately followed by a[0]->t1 (the only task running on cpu[0])
blocking:

	put_prev_task(group_cfs_rq(a[0]), t1)
	put_prev_entity(..., t1)
	check_cfs_rq_runtime(group_cfs_rq(a[0]))
	throttle_cfs_rq(group_cfs_rq(a[0]))

Then, before unthrottling occurs, let a[0]->b[0]->t2 wake for the first
time:

	enqueue_task_fair(rq[0], t2)
	enqueue_entity(group_cfs_rq(b[0]), t2)
	enqueue_entity_load_avg(group_cfs_rq(b[0]), t2)
	account_entity_enqueue(group_cfs_rq(b[0]), t2)
	update_cfs_shares(group_cfs_rq(b[0]))
	< skipped because b is part of a throttled hierarchy >
	enqueue_entity(group_cfs_rq(a[0]), b[0])
	...

We now have b[0] enqueued, yet group_cfs_rq(a[0])->load.weight == 0,
which violates invariants in several code-paths. Eliminate the
possibility of this by initializing group entity weight.

Signed-off-by: Paul Turner <pjt@google.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20131016181627.22647.47543.stgit@sword-of-the-dawn.mtv.corp.google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

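The initialization itself is a one-liner in the group-entity setup path;
a sketch assuming the 3.10 update_load_set() helper:

    /* Illustrative: give a freshly created group se a non-zero initial
     * weight, so a skipped shares update can never leave
     * load.weight == 0 on its cfs_rq. */
    static void init_group_entity_weight(struct sched_entity *se)
    {
        update_load_set(&se->load, NICE_0_LOAD);
    }
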
2014-01-15  sched: Fix hrtimer_cancel()/rq->lock deadlock  (Ben Segall)

commit 927b54fccbf04207ec92f669dce6806848cbec7d upstream.

__start_cfs_bandwidth calls hrtimer_cancel while holding rq->lock,
waiting for the hrtimer to finish. However, if sched_cfs_period_timer
runs for another loop iteration, the hrtimer can attempt to take
rq->lock, resulting in deadlock.

Fix this by ensuring that cfs_b->timer_active is cleared only if the
_latest_ call to do_sched_cfs_period_timer is returning as idle. Then
__start_cfs_bandwidth can just call hrtimer_try_to_cancel and wait for
that to succeed or for timer_active == 1.

Signed-off-by: Ben Segall <bsegall@google.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: pjt@google.com
Link: http://lkml.kernel.org/r/20131016181622.22647.16643.stgit@sword-of-the-dawn.mtv.corp.google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

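The non-blocking cancel loop, sketched after the shape of the
__start_cfs_bandwidth() change described above; the caller holds
cfs_b->lock, and the exact body is an approximation:

    /* Illustrative: never block in hrtimer_cancel() while holding
     * rq-side locks; try-cancel and bounce cfs_b->lock so
     * do_sched_cfs_period_timer() can finish. */
    static void start_cfs_bandwidth_sketch(struct cfs_bandwidth *cfs_b)
    {
        while (hrtimer_active(&cfs_b->period_timer) &&
               hrtimer_try_to_cancel(&cfs_b->period_timer) < 0) {
            raw_spin_unlock(&cfs_b->lock);
            cpu_relax();
            raw_spin_lock(&cfs_b->lock);
            if (cfs_b->timer_active)  /* someone else restarted it */
                return;
        }

        cfs_b->timer_active = 1;
        start_bandwidth_timer(&cfs_b->period_timer, cfs_b->period);
    }
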
2014-01-15  sched: Fix cfs_bandwidth misuse of hrtimer_expires_remaining  (Ben Segall)

commit db06e78cc13d70f10877e0557becc88ab3ad2be8 upstream.

hrtimer_expires_remaining does not take internal hrtimer locks and thus
must be guarded against concurrent __hrtimer_start_range_ns (but
returning HRTIMER_RESTART is safe). Use cfs_b->lock to make it safe.

Signed-off-by: Ben Segall <bsegall@google.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: pjt@google.com
Link: http://lkml.kernel.org/r/20131016181617.22647.73829.stgit@sword-of-the-dawn.mtv.corp.google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

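The guarded read, sketched; the decision logic around the period timer is
elided and the wrapper is illustrative:

    /* Illustrative: read the remaining expiry only under cfs_b->lock,
     * since hrtimer_expires_remaining() takes no hrtimer base lock and
     * can race with a concurrent __hrtimer_start_range_ns(). */
    static ktime_t cfs_period_remaining(struct cfs_bandwidth *cfs_b)
    {
        ktime_t remaining;

        raw_spin_lock(&cfs_b->lock);
        remaining = hrtimer_expires_remaining(&cfs_b->period_timer);
        raw_spin_unlock(&cfs_b->lock);

        return remaining;
    }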