aboutsummaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)Author
2010-05-10sched: Remove a stale commentLi Zefan
This comment should have been removed together with uids_mutex when removing user sched. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Dhaval Giani <dhaval.giani@gmail.com> LKML-Reference: <4BE77C6B.5010402@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-09sched: Intoduce get_cpu_iowait_time_us()Arjan van de Ven
For the ondemand cpufreq governor, it is desired that the iowait time is microaccounted in a similar way as idle time is. This patch introduces the infrastructure to account and expose this information via the get_cpu_iowait_time_us() function. [akpm@linux-foundation.org: fix CONFIG_NO_HZ=n build] Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Rik van Riel <riel@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: davej@redhat.com LKML-Reference: <20100509082523.284feab6@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-09sched: Eliminate the ts->idle_lastupdate fieldArjan van de Ven
Now that the only user of ts->idle_lastupdate is update_ts_time_stats(), the entire field can be eliminated. In update_ts_time_stats(), idle_lastupdate is first set to "now", and a few lines later, the only user is an if() statement that assigns a variable either to "now" or to ts->idle_lastupdate, which has the value of "now" at that point. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Rik van Riel <riel@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: davej@redhat.com LKML-Reference: <20100509082439.2fab0b4f@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-09sched: Fold updating of the last_update_time_info into update_ts_time_stats()Arjan van de Ven
This patch folds the updating of the last_update_time into the update_ts_time_stats() function, and updates the callers. This allows for further cleanups that are done in the next patch. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Rik van Riel <riel@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: davej@redhat.com LKML-Reference: <20100509082403.60072967@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-09sched: Update the idle statistics in get_cpu_idle_time_us()Arjan van de Ven
Right now, get_cpu_idle_time_us() only reports the idle statistics upto the point the CPU entered last idle; not what is valid right now. This patch adds an update of the idle statistics to get_cpu_idle_time_us(), so that calling this function always returns statistics that are accurate at the point of the call. This includes resetting the start of the idle time for accounting purposes to avoid double accounting. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Rik van Riel <riel@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: davej@redhat.com LKML-Reference: <20100509082323.2d2f1945@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-09sched: Introduce a function to update the idle statisticsArjan van de Ven
Currently, two places update the idle statistics (and more to come later in this series). This patch creates a helper function for updating these statistics. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Rik van Riel <riel@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: davej@redhat.com LKML-Reference: <20100509082245.163e67ed@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-09sched: Add a comment to get_cpu_idle_time_us()Arjan van de Ven
The exported function get_cpu_idle_time_us() has no comment describing it; add a kerneldoc comment Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Rik van Riel <riel@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: davej@redhat.com LKML-Reference: <20100509082208.7cb721f0@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-09tracing: Drop the nested field from lock_release eventFrederic Weisbecker
Drop the nested field as we don't use it. Every nested state can be computed from a state machine on post processing already. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Steven Rostedt <rostedt@goodmis.org>
2010-05-09tracing: Drop lock_acquired waittime fieldFrederic Weisbecker
Drop the waittime field from the lock_acquired event, we can calculate it by substracting the lock_acquired event timestamp with the matching lock_acquire one. It is not needed and takes useless space in the traces. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Steven Rostedt <rostedt@goodmis.org>
2010-05-08Merge branch 'cpu_stop' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc into sched/core
2010-05-08kprobes: Move enable/disable_kprobe() out from debugfs codeMasami Hiramatsu
Move enable/disable_kprobe() API out from debugfs related code, because these interfaces are not related to debugfs interface. This fixes a compiler warning. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Acked-by: Tony Luck <tony.luck@intel.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> LKML-Reference: <20100427223312.2322.60512.stgit@localhost6.localdomain6> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-08cpu_stop: add dummy implementation for UPTejun Heo
When !CONFIG_SMP, cpu_stop functions weren't defined at all which could lead to build failures if UP code uses cpu_stop facility. Add dummy cpu_stop implementation for UP. The waiting variants execute the work function directly with preempt disabled and stop_one_cpu_nowait() schedules a workqueue work. Makefile and ifdefs around stop_machine implementation are updated to accomodate CONFIG_SMP && !CONFIG_STOP_MACHINE case. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Ingo Molnar <mingo@elte.hu>
2010-05-08perf_event: Make software events work againPaul Mackerras
Commit 6bde9b6ce0127e2a56228a2071536d422be31336 ("perf: Add group scheduling transactional APIs") added code to allow a group to be scheduled in a single transaction. However, it introduced a bug in handling events whose pmu does not implement transactions -- at the end of scheduling in the events in the group, in the non-transactional case the code now falls through to the group_error label, and proceeds to unschedule all the events in the group and return failure. This fixes it by returning 0 (success) in the non-transactional case. Signed-off-by: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Lin Ming <ming.m.lin@intel.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: eranian@gmail.com LKML-Reference: <20100508105800.GB10650@brick.ozlabs.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-07Merge branch 'core-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: rcu: create rcu_my_thread_group_empty() wrapper memcg: css_id() must be called under rcu_read_lock() cgroup: Check task_lock in task_subsys_state() sched: Fix an RCU warning in print_task() cgroup: Fix an RCU warning in alloc_css_id() cgroup: Fix an RCU warning in cgroup_path() KEYS: Fix an RCU warning in the reading of user keys KEYS: Fix an RCU warning
2010-05-07perf: Add group scheduling transactional APIsLin Ming
Add group scheduling transactional APIs to struct pmu. These APIs will be implemented in arch code, based on Peter's idea as below. > the idea behind hw_perf_group_sched_in() is to not perform > schedulability tests on each event in the group, but to add the group > as a whole and then perform one test. > > Of course, when that test fails, you'll have to roll-back the whole > group again. > > So start_txn (or a better name) would simply toggle a flag in the pmu > implementation that will make pmu::enable() not perform the > schedulablilty test. > > Then commit_txn() will perform the schedulability test (so note the > method has to have a !void return value. > > This will allow us to use the regular > kernel/perf_event.c::group_sched_in() and all the rollback code. > Currently each hw_perf_group_sched_in() implementation duplicates all > the rolllback code (with various bugs). ->start_txn: Start group events scheduling transaction, set a flag to make pmu::enable() not perform the schedulability test, it will be performed at commit time. ->commit_txn: Commit group events scheduling transaction, perform the group schedulability as a whole ->cancel_txn: Stop group events scheduling transaction, clear the flag so pmu::enable() will perform the schedulability test. Reviewed-by: Stephane Eranian <eranian@google.com> Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Lin Ming <ming.m.lin@intel.com> Cc: David Miller <davem@davemloft.net> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1272002160.5707.60.camel@minggr.sh.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-07perf: Annotate perf_event_read_group() vs perf_event_release_kernel()Peter Zijlstra
Stephane reported a lockdep warning while using PERF_FORMAT_GROUP. The issue is that perf_event_read_group() takes faults while holding the ctx->mutex, while perf_event_release_kernel() can be called from munmap(). Which makes for an AB-BA deadlock. Except we can never establish the deadlock because we'll only ever call perf_event_release_kernel() after all file descriptors are dead so there is no concurrency possible. Reported-by: Stephane Eranian <eranian@google.com> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-07Merge branch 'perf/urgent' into perf/coreIngo Molnar
Merge reason: Resolve patch dependency Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-07perf: Fix exit() vs PERF_FORMAT_GROUPPeter Zijlstra
Both Stephane and Corey reported that PERF_FORMAT_GROUP didn't work as expected if the task the counters were attached to quit before the read() call. The cause is that we unconditionally destroy the grouping when we remove counters from their context. Fix this by only doing this when we free the counter itself. Reported-by: Corey Ashford <cjashfor@linux.vnet.ibm.com> Reported-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1273160566.5605.404.camel@twins> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-07sched: Remove rq argument to the tracepointsPeter Zijlstra
struct rq isn't visible outside of sched.o so its near useless to expose the pointer, also there are no users of it, so remove it. Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1272997616.1642.207.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-07Merge commit 'v2.6.34-rc6' into sched/coreIngo Molnar
2010-05-07lockdep: Reduce stack_trace usageYong Zhang
When calling check_prevs_add(), if all validations passed add_lock_to_list() will add new lock to dependency tree and alloc stack_trace for each list_entry. But at this time, we are always on the same stack, so stack_trace for each list_entry has the same value. This is redundant and eats up lots of memory which could lead to warning on low MAX_STACK_TRACE_ENTRIES. Use one copy of stack_trace instead. V2: As suggested by Peter Zijlstra, move save_trace() from check_prevs_add() to check_prev_add(). Add tracking for trylock dependence which is also redundant. Signed-off-by: Yong Zhang <yong.zhang0@windriver.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20100504065711.GC10784@windriver.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-07rcu: need barrier() in UP synchronize_sched_expedited()Paul E. McKenney
If synchronize_sched_expedited() is ever to be called from within kernel/sched.c in a !SMP PREEMPT kernel, the !SMP implementation needs a barrier(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2010-05-06perf: Fix check at end of event searchDan Carpenter
The original code doesn't work because "call" is never NULL there. Signed-off-by: Dan Carpenter <error27@gmail.com> LKML-Reference: <20100320143911.GF5331@bicker> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-05-06sched: correctly place paranioa memory barriers in synchronize_sched_expedited()Paul E. McKenney
The memory barriers must be in the SMP case, not in the !SMP case. Also add a barrier after the atomic_inc() in order to ensure that other CPUs see post-synchronize_sched_expedited() actions as following the expedited grace period. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2010-05-06sched: kill paranoia check in synchronize_sched_expedited()Tejun Heo
The paranoid check which verifies that the cpu_stop callback is actually called on all online cpus is completely superflous. It's guaranteed by cpu_stop facility and if it didn't work as advertised other things would go horribly wrong and trying to recover using synchronize_sched() wouldn't be very meaningful. Kill the paranoid check. Removal of this feature is done as a separate step so that it can serve as a bisection point if something actually goes wrong. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Dipankar Sarma <dipankar@in.ibm.com> Cc: Josh Triplett <josh@freedesktop.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Dimitri Sivanich <sivanich@sgi.com>
2010-05-06sched: replace migration_thread with cpu_stopTejun Heo
Currently migration_thread is serving three purposes - migration pusher, context to execute active_load_balance() and forced context switcher for expedited RCU synchronize_sched. All three roles are hardcoded into migration_thread() and determining which job is scheduled is slightly messy. This patch kills migration_thread and replaces all three uses with cpu_stop. The three different roles of migration_thread() are splitted into three separate cpu_stop callbacks - migration_cpu_stop(), active_load_balance_cpu_stop() and synchronize_sched_expedited_cpu_stop() - and each use case now simply asks cpu_stop to execute the callback as necessary. synchronize_sched_expedited() was implemented with private preallocated resources and custom multi-cpu queueing and waiting logic, both of which are provided by cpu_stop. synchronize_sched_expedited_count is made atomic and all other shared resources along with the mutex are dropped. synchronize_sched_expedited() also implemented a check to detect cases where not all the callback got executed on their assigned cpus and fall back to synchronize_sched(). If called with cpu hotplug blocked, cpu_stop already guarantees that and the condition cannot happen; otherwise, stop_machine() would break. However, this patch preserves the paranoid check using a cpumask to record on which cpus the stopper ran so that it can serve as a bisection point if something actually goes wrong theree. Because the internal execution state is no longer visible, rcu_expedited_torture_stats() is removed. This patch also renames cpu_stop threads to from "stopper/%d" to "migration/%d". The names of these threads ultimately don't matter and there's no reason to make unnecessary userland visible changes. With this patch applied, stop_machine() and sched now share the same resources. stop_machine() is faster without wasting any resources and sched migration users are much cleaner. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Dipankar Sarma <dipankar@in.ibm.com> Cc: Josh Triplett <josh@freedesktop.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Dimitri Sivanich <sivanich@sgi.com>
2010-05-06stop_machine: reimplement using cpu_stopTejun Heo
Reimplement stop_machine using cpu_stop. As cpu stoppers are guaranteed to be available for all online cpus, stop_machine_create/destroy() are no longer necessary and removed. With resource management and synchronization handled by cpu_stop, the new implementation is much simpler. Asking the cpu_stop to execute the stop_cpu() state machine on all online cpus with cpu hotplug disabled is enough. stop_machine itself doesn't need to manage any global resources anymore, so all per-instance information is rolled into struct stop_machine_data and the mutex and all static data variables are removed. The previous implementation created and destroyed RT workqueues as necessary which made stop_machine() calls highly expensive on very large machines. According to Dimitri Sivanich, preventing the dynamic creation/destruction makes booting faster more than twice on very large machines. cpu_stop resources are preallocated for all online cpus and should have the same effect. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Dimitri Sivanich <sivanich@sgi.com>
2010-05-06cpu_stop: implement stop_cpu[s]()Tejun Heo
Implement a simplistic per-cpu maximum priority cpu monopolization mechanism. A non-sleeping callback can be scheduled to run on one or multiple cpus with maximum priority monopolozing those cpus. This is primarily to replace and unify RT workqueue usage in stop_machine and scheduler migration_thread which currently is serving multiple purposes. Four functions are provided - stop_one_cpu(), stop_one_cpu_nowait(), stop_cpus() and try_stop_cpus(). This is to allow clean sharing of resources among stop_cpu and all the migration thread users. One stopper thread per cpu is created which is currently named "stopper/CPU". This will eventually replace the migration thread and take on its name. * This facility was originally named cpuhog and lived in separate files but Peter Zijlstra nacked the name and thus got renamed to cpu_stop and moved into stop_machine.c. * Better reporting of preemption leak as per Peter's suggestion. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Dimitri Sivanich <sivanich@sgi.com>
2010-05-06rcu: create rcu_my_thread_group_empty() wrapperPaul E. McKenney
Some RCU-lockdep splat repairs need to know whether they are running in a single-threaded process. Unfortunately, the thread_group_empty() primitive is defined in sched.h, and can induce #include hell. This commit therefore introduces a rcu_my_thread_group_empty() wrapper that is defined in rcupdate.c, thus avoiding the need to include sched.h everywhere. Signed-off-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
2010-05-05tracing: Fix "integer as NULL pointer" warning.Thiago Farina
kernel/trace/trace_output.c:256:24: warning: Using plain integer as NULL pointer Signed-off-by: Thiago Farina <tfransosi@gmail.com> LKML-Reference: <1264349038-1766-3-git-send-email-tfransosi@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-05-05Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: flush_delayed_work: keep the original workqueue for re-queueing
2010-05-04Merge branch 'perf-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf: Fix resource leak in failure path of perf_event_open()
2010-05-04Merge branch 'core-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: rcu: Fix RCU lockdep splat on freezer_fork path rcu: Fix RCU lockdep splat in set_task_cpu on fork path mutex: Don't spin when the owner CPU is offline or other weird cases
2010-05-04sched: Fix an RCU warning in print_task()Li Zefan
With CONFIG_PROVE_RCU=y, a warning can be triggered: $ cat /proc/sched_debug ... kernel/cgroup.c:1649 invoked rcu_dereference_check() without protection! ... Both cgroup_path() and task_group() should be called with either rcu_read_lock or cgroup_mutex held. The rcu_dereference_check() does include cgroup_lock_is_held(), so we know that this lock is not held. Therefore, in a CONFIG_PREEMPT kernel, to say nothing of a CONFIG_PREEMPT_RT kernel, the original code could have ended up copying a string out of the freelist. This patch inserts RCU read-side primitives needed to prevent this scenario. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2010-05-04cgroup: Fix an RCU warning in alloc_css_id()Li Zefan
With CONFIG_PROVE_RCU=y, a warning can be triggered: # mount -t cgroup -o memory xxx /mnt # mkdir /mnt/0 ... kernel/cgroup.c:4442 invoked rcu_dereference_check() without protection! ... This is a false-positive. It's safe to directly access parent_css->id. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2010-05-04cgroup: Fix an RCU warning in cgroup_path()Li Zefan
with CONFIG_PROVE_RCU=y, a warning can be triggered: # mount -t cgroup -o debug xxx /mnt # cat /proc/$$/cgroup ... kernel/cgroup.c:1649 invoked rcu_dereference_check() without protection! ... This is a false-positive, because cgroup_path() can be called with either rcu_read_lock() held or cgroup_mutex held. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2010-05-04ring-buffer: Wrap open-coded WARN_ONCEBorislav Petkov
Wrap open-coded WARN_ONCE functionality into the equivalent macro. Signed-off-by: Borislav Petkov <bp@alien8.de> LKML-Reference: <20100502060354.GA5281@liondog.tnic> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-05-04hw_breakpoints: Fix percpu build failureFrederic Weisbecker
Fix this build error: kernel/hw_breakpoint.c:58:1: error: pasting "__pcpu_scope_" and "*" does not give a valid preprocessing token It happens if CONFIG_DEBUG_FORCE_WEAK_PER_CPU, because we concatenate someting with the name and we have the "*" in the name. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Cc: K. Prasad <prasad@linux.vnet.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Jason Wessel <jason.wessel@windriver.com> LKML-Reference: <20100503133942.GA5497@nowhere> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-04lockdep: No need to disable preemption in debug atomic opsFrederic Weisbecker
No need to disable preemption in the debug_atomic_* ops, as we ensure interrupts are disabled already. So let's use the __this_cpu_ops() rather than this_cpu_ops() that enclose the ops in a preempt disabled section. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org>
2010-05-04lockdep: Actually _dec_ in debug_atomic_decFrederic Weisbecker
Fix a silly copy-paste mistake that was making debug_atomic_dec use this_cpu_inc instead of this_cpu_dec. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org>
2010-05-04lockdep: Provide off case for redundant_hardirqs_on incrementFrederic Weisbecker
We forgot to provide a !CONFIG_DEBUG_LOCKDEP case for the redundant_hardirqs_on stat handling. Manage that in the headers with a new __debug_atomic_inc() helper. Fixes: kernel/lockdep.c:2306: error: 'lockdep_stats' undeclared (first use in this function) kernel/lockdep.c:2306: error: (Each undeclared identifier is reported only once kernel/lockdep.c:2306: error: for each function it appears in.) Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org>
2010-05-01perf: Fix resource leak in failure path of perf_event_open()Tejun Heo
perf_event_open() kfrees event after init failure which doesn't release all resources allocated by perf_event_alloc(). Use free_event() instead. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@au1.ibm.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: <stable@kernel.org> LKML-Reference: <4BDBE237.1040809@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-01hw-breakpoints: Get the number of available registers on boot dynamicallyFrederic Weisbecker
The breakpoint generic layer assumes that archs always know in advance the static number of address registers available to host breakpoints through the HBP_NUM macro. However this is not true for every archs. For example Arm needs to get this information dynamically to handle the compatiblity between different versions. To solve this, this patch proposes to drop the static HBP_NUM macro and let the arch provide the number of available slots through a new hw_breakpoint_slots() function. For archs that have CONFIG_HAVE_MIXED_BREAKPOINTS_REGS selected, it will be called once as the number of registers fits for instruction and data breakpoints together. For the others it will be called first to get the number of instruction breakpoint registers and another time to get the data breakpoint registers, the targeted type is given as a parameter of hw_breakpoint_slots(). Reported-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Paul Mundt <lethal@linux-sh.org> Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Cc: K. Prasad <prasad@linux.vnet.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Ingo Molnar <mingo@elte.hu>
2010-05-01hw-breakpoints: Handle breakpoint weight in allocation constraintsFrederic Weisbecker
Depending on their nature and on what an arch supports, breakpoints may consume more than one address register. For example a simple absolute address match usually only requires one address register. But an address range match may consume two registers. Currently our slot allocation constraints, that tend to reflect the limited arch's resources, always consider that a breakpoint consumes one slot. Then provide a way for archs to tell us the weight of a breakpoint through a new hw_breakpoint_weight() helper. This weight will be computed against the generic allocation constraints instead of a constant value. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Paul Mundt <lethal@linux-sh.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Cc: K. Prasad <prasad@linux.vnet.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu>
2010-05-01hw-breakpoints: Separate constraint space for data and instruction breakpointsFrederic Weisbecker
There are two outstanding fashions for archs to implement hardware breakpoints. The first is to separate breakpoint address pattern definition space between data and instruction breakpoints. We then have typically distinct instruction address breakpoint registers and data address breakpoint registers, delivered with separate control registers for data and instruction breakpoints as well. This is the case of PowerPc and ARM for example. The second consists in having merged breakpoint address space definition between data and instruction breakpoint. Address registers can host either instruction or data address and the access mode for the breakpoint is defined in a control register. This is the case of x86 and Super H. This patch adds a new CONFIG_HAVE_MIXED_BREAKPOINTS_REGS config that archs can select if they belong to the second case. Those will have their slot allocation merged for instructions and data breakpoints. The others will have a separate slot tracking between data and instruction breakpoints. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Paul Mundt <lethal@linux-sh.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Cc: K. Prasad <prasad@linux.vnet.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu>
2010-05-01hw-breakpoints: Change/Enforce some breakpoints policiesFrederic Weisbecker
The current policies of breakpoints in x86 and SH are the following: - task bound breakpoints can only break on userspace addresses - cpu wide breakpoints can only break on kernel addresses The former rule prevents ptrace breakpoints to be set to trigger on kernel addresses, which is good. But as a side effect, we can't breakpoint on kernel addresses for task bound breakpoints. The latter rule simply makes no sense, there is no reason why we can't set breakpoints on userspace while performing cpu bound profiles. We want the following new policies: - task bound breakpoint can set userspace address breakpoints, with no particular privilege required. - task bound breakpoints can set kernelspace address breakpoints but must be privileged to do that. - cpu bound breakpoints can do what they want as they are privileged already. To implement these new policies, this patch checks if we are dealing with a kernel address breakpoint, if so and if the exclude_kernel parameter is set, we tell the user that the breakpoint is invalid, which makes a good generic ptrace protection. If we don't have exclude_kernel, ensure the user has the right privileges as kernel breakpoints are quite sensitive (risk of trap recursion attacks and global performance impacts). [ Paul Mundt: keep addr space check for sh signal delivery and fix double function declaration] Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Cc: K. Prasad <prasad@linux.vnet.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-05-01hw-breakpoints: Check disabled breakpoints againFrederic Weisbecker
We stopped checking disabled breakpoints because we weren't allowing breakpoints on NULL addresses. And gdb tends to set NULL addresses on inactive breakpoints. But refusing NULL addresses was actually a regression that has been fixed now. There is no reason anymore to not validate inactive breakpoint settings. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Paul Mundt <lethal@linux-sh.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Cc: K. Prasad <prasad@linux.vnet.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Ingo Molnar <mingo@elte.hu>
2010-04-30lockdep: Simplify debug atomic opsFrederic Weisbecker
Simplify debug_atomic_inc/dec by using this_cpu_inc/dec() instead of doing it through an indirect get_cpu_var() and a manual incrementation. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org>
2010-04-30lockdep: Fix redundant_hardirqs_on incremented with irqs enabledFrederic Weisbecker
When a path restore the flags while irqs are already enabled, we update the per cpu var redundant_hardirqs_on in a racy fashion and debug_atomic_inc() warns about this situation. In this particular case, loosing a few hits in a stat is not a big deal, so increment it without protection. v2: Don't bother with disabling irq, we can miss one count in rare situations Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: David Miller <davem@davemloft.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-04-30Merge commit 'v2.6.34-rc6' into core/lockingFrederic Weisbecker
Merge reason: Further lockdep patches depend on per cpu updates made in -rc1.