aboutsummaryrefslogtreecommitdiff
path: root/arch/x86/kernel/vsyscall_64.c
AgeCommit message (Collapse)Author
2009-05-28x86: move rdtsc_barrier() into the TSC vread methodPetr Tesarik
The *fence instructions were moved to vsyscall_64.c by commit cb9e35dce94a1b9c59d46224e8a94377d673e204. But this breaks the vDSO, because vread methods are also called from there. Besides, the synchronization might be unnecessary for other time sources than TSC. [ Impact: fix potential time warp in VDSO ] Signed-off-by: Petr Tesarik <ptesarik@suse.cz> LKML-Reference: <9d0ea9ea0f866bdc1f4d76831221ae117f11ea67.1243241859.git.ptesarik@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@kernel.org>
2008-12-28Merge branch 'tracing-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (241 commits) sched, trace: update trace_sched_wakeup() tracing/ftrace: don't trace on early stage of a secondary cpu boot, v3 Revert "x86: disable X86_PTRACE_BTS" ring-buffer: prevent false positive warning ring-buffer: fix dangling commit race ftrace: enable format arguments checking x86, bts: memory accounting x86, bts: add fork and exit handling ftrace: introduce tracing_reset_online_cpus() helper tracing: fix warnings in kernel/trace/trace_sched_switch.c tracing: fix warning in kernel/trace/trace.c tracing/ring-buffer: remove unused ring_buffer size trace: fix task state printout ftrace: add not to regex on filtering functions trace: better use of stack_trace_enabled for boot up code trace: add a way to enable or disable the stack tracer x86: entry_64 - introduce FTRACE_ frame macro v2 tracing/ftrace: add the printk-msg-only option tracing/ftrace: use preempt_enable_no_resched_notrace in ring_buffer_time_stamp() x86, bts: correctly report invalid bts records ... Fixed up trivial conflict in scripts/recordmcount.pl due to SH bits being already partly merged by the SH merge.
2008-11-12trace: rename unlikely profiler to branch profilerSteven Rostedt
Impact: name change of unlikely tracer and profiler Ingo Molnar suggested changing the config from UNLIKELY_PROFILE to BRANCH_PROFILING. I never did like the "unlikely" name so I went one step farther, and renamed all the unlikely configurations to a "BRANCH" variant. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-12tracing: branch tracer, fix vdso crashIngo Molnar
Impact: fix bootup crash the branch tracer missed arch/x86/vdso/vclock_gettime.c from disabling tracing, which caused such bootup crashes: [ 201.840097] init[1]: segfault at 7fffed3fe7c0 ip 00007fffed3fea2e sp 000077 also clean up the ugly ifdefs in arch/x86/kernel/vsyscall_64.c by creating DISABLE_UNLIKELY_PROFILE facility for code to turn off instrumentation on a per file basis. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-12tracing: profile likely and unlikely annotationsSteven Rostedt
Impact: new unlikely/likely profiler Andrew Morton recently suggested having an in-kernel way to profile likely and unlikely macros. This patch achieves that goal. When configured, every(*) likely and unlikely macro gets a counter attached to it. When the condition is hit, the hit and misses of that condition are recorded. These numbers can later be retrieved by: /debugfs/tracing/profile_likely - All likely markers /debugfs/tracing/profile_unlikely - All unlikely markers. # cat /debug/tracing/profile_unlikely | head correct incorrect % Function File Line ------- --------- - -------- ---- ---- 2167 0 0 do_arch_prctl process_64.c 832 0 0 0 do_arch_prctl process_64.c 804 2670 0 0 IS_ERR err.h 34 71230 5693 7 __switch_to process_64.c 673 76919 0 0 __switch_to process_64.c 639 43184 33743 43 __switch_to process_64.c 624 12740 64181 83 __switch_to process_64.c 594 12740 64174 83 __switch_to process_64.c 590 # cat /debug/tracing/profile_unlikely | \ awk '{ if ($3 > 25) print $0; }' |head -20 44963 35259 43 __switch_to process_64.c 624 12762 67454 84 __switch_to process_64.c 594 12762 67447 84 __switch_to process_64.c 590 1478 595 28 syscall_get_error syscall.h 51 0 2821 100 syscall_trace_leave ptrace.c 1567 0 1 100 native_smp_prepare_cpus smpboot.c 1237 86338 265881 75 calc_delta_fair sched_fair.c 408 210410 108540 34 calc_delta_mine sched.c 1267 0 54550 100 sched_info_queued sched_stats.h 222 51899 66435 56 pick_next_task_fair sched_fair.c 1422 6 10 62 yield_task_fair sched_fair.c 982 7325 2692 26 rt_policy sched.c 144 0 1270 100 pre_schedule_rt sched_rt.c 1261 1268 48073 97 pick_next_task_rt sched_rt.c 884 0 45181 100 sched_info_dequeued sched_stats.h 177 0 15 100 sched_move_task sched.c 8700 0 15 100 sched_move_task sched.c 8690 53167 33217 38 schedule sched.c 4457 0 80208 100 sched_info_switch sched_stats.h 270 30585 49631 61 context_switch sched.c 2619 # cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }' 39900 36577 47 pick_next_task sched.c 4397 20824 15233 42 switch_mm mmu_context_64.h 18 0 7 100 __cancel_work_timer workqueue.c 560 617 66484 99 clocksource_adjust timekeeping.c 456 0 346340 100 audit_syscall_exit auditsc.c 1570 38 347350 99 audit_get_context auditsc.c 732 0 345244 100 audit_syscall_entry auditsc.c 1541 38 1017 96 audit_free auditsc.c 1446 0 1090 100 audit_alloc auditsc.c 862 2618 1090 29 audit_alloc auditsc.c 858 0 6 100 move_masked_irq migration.c 9 1 198 99 probe_sched_wakeup trace_sched_switch.c 58 2 2 50 probe_wakeup trace_sched_wakeup.c 227 0 2 100 probe_wakeup_sched_switch trace_sched_wakeup.c 144 4514 2090 31 __grab_cache_page filemap.c 2149 12882 228786 94 mapping_unevictable pagemap.h 50 4 11 73 __flush_cpu_slab slub.c 1466 627757 330451 34 slab_free slub.c 1731 2959 61245 95 dentry_lru_del_init dcache.c 153 946 1217 56 load_elf_binary binfmt_elf.c 904 102 82 44 disk_put_part genhd.h 206 1 1 50 dst_gc_task dst.c 82 0 19 100 tcp_mss_split_point tcp_output.c 1126 As you can see by the above, there's a bit of work to do in rethinking the use of some unlikelys and likelys. Note: the unlikely case had 71 hits that were more than 25%. Note: After submitting my first version of this patch, Andrew Morton showed me a version written by Daniel Walker, where I picked up the following ideas from: 1) Using __builtin_constant_p to avoid profiling fixed values. 2) Using __FILE__ instead of instruction pointers. 3) Using the preprocessor to stop all profiling of likely annotations from vsyscall_64.c. Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar for their feed back on this patch. (*) Not ever unlikely is recorded, those that are used by vsyscalls (a few of them) had to have profiling disabled. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-08x86: clean up rdtsc_barrier() useIngo Molnar
Impact: cleanup Move rdtsc_barrier() use to vsyscall_64.c where it's relied on, and point out its role in the context of its use. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-15Merge branch 'generic-ipi' into generic-ipi-for-linusIngo Molnar
Conflicts: arch/powerpc/Kconfig arch/s390/kernel/time.c arch/x86/kernel/apic_32.c arch/x86/kernel/cpu/perfctr-watchdog.c arch/x86/kernel/i8259_64.c arch/x86/kernel/ldt.c arch/x86/kernel/nmi_64.c arch/x86/kernel/smpboot.c arch/x86/xen/smp.c include/asm-x86/hw_irq_32.h include/asm-x86/hw_irq_64.h include/asm-x86/mach-default/irq_vectors.h include/asm-x86/mach-voyager/irq_vectors.h include/asm-x86/smp.h kernel/Makefile Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-14Merge branch 'auto-ftrace-next' into tracing/for-linusIngo Molnar
Conflicts: arch/x86/kernel/entry_32.S arch/x86/kernel/process_32.c arch/x86/kernel/process_64.c arch/x86/lib/Makefile include/asm-x86/irqflags.h kernel/Makefile kernel/sched.c Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08x86, 64-bit: use write_gdt_entry in vsyscall_set_cpuJeremy Fitzhardinge
Use write_gdt_entry to generate the special vgetcpu descriptor in the vsyscall page. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-26on_each_cpu(): kill unused 'retry' parameterJens Axboe
It's not even passed on to smp_call_function() anymore, since that was removed. So kill it. Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-06-26smp_call_function: get rid of the unused nonatomic/retry argumentJens Axboe
It's never used and the comments refer to nonatomic and retry interchangably. So get rid of it. Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-05-23x86: add notrace annotations to vsyscall.Steven Rostedt
Add the notrace annotations to the vsyscall functions - there we are not in kernel context yet, so the tracer function cannot (and must not) be called. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-24"make namespacecheck" fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-29x86: restore vsyscall64 prochandlerThomas Gleixner
a recent fix: commit ce28b9864b853803320c3f1d8de1b81aa4120b14 Author: Thomas Gleixner <tglx@linutronix.de> Date: Wed Feb 20 23:57:30 2008 +0100 x86: fix vsyscall wreckage removed the broken /kernel/vsyscall64 handler completely. This triggers the following debug check: sysctl table check failed: /kernel/vsyscall64 No proc_handler Restore the sane part of the proc handler. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-26x86: fix vsyscall wreckageThomas Gleixner
based on a report from Arne Georg Gleditsch about user-space apps misbehaving after toggling /proc/sys/kernel/vsyscall64, a review of the code revealed that the "NOP patching" done there is fundamentally unsafe for a number of reasons: 1) the patching code runs without synchronizing other CPUs 2) it inserts NOPs even if there is no clock source which provides vread 3) when the clock source changes to one without vread we run in exactly the same problem as in #2 4) if nobody toggles the proc entry from 1 to 0 and to 1 again, then the syscall is not patched out as a result it is possible to break user-space via this patching. The only safe thing for now is to remove the patching. This code was broken since v2.6.21. Reported-by: Arne Georg Gleditsch <arne.gleditsch@dolphinics.no> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30x86: map vsyscalls early enoughIngo Molnar
map vsyscalls early enough. This is important if a __vsyscall_fn function is used by other kernel code too. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: change gdt acessor macro nameGlauber de Oliveira Costa
This patch changes the name of x86_64 macro used to access the per-cpu gdt. It is now equal to the i386 version, which will allow code to be shared. Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: introduce native_read_tscpGlauber de Oliveira Costa
Targetting paravirt, this patch introduces native_read_tscp, in place of rdtscp() macro. When in a paravirt guest, this will involve a function call, and thus, cannot be done in the vdso area. These users then have to call the native version directly Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: rename the struct pt_regs members for 32/64-bit consistencyH. Peter Anvin
We have a lot of code which differs only by the naming of specific members of structures that contain registers. In order to enable additional unifications, this patch drops the e- or r- size prefix from the register names in struct pt_regs, and drops the x- prefixes for segment registers on the 32-bit side. This patch also performs the equivalent renames in some additional places that might be candidates for unification in the future. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivialLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (74 commits) fix do_sys_open() prototype sysfs: trivial: fix sysfs_create_file kerneldoc spelling mistake Documentation: Fix typo in SubmitChecklist. Typo: depricated -> deprecated Add missing profile=kvm option to Documentation/kernel-parameters.txt fix typo about TBI in e1000 comment proc.txt: Add /proc/stat field small documentation fixes Fix compiler warning in smount example program from sharedsubtree.txt docs/sysfs: add missing word to sysfs attribute explanation documentation/ext3: grammar fixes Documentation/java.txt: typo and grammar fixes Documentation/filesystems/vfs.txt: typo fix include/asm-*/system.h: remove unused set_rmb(), set_wmb() macros trivial copy_data_pages() tidy up Fix typo in arch/x86/kernel/tsc_32.c file link fix for Pegasus USB net driver help remove unused return within void return function Typo fixes retrun -> return x86 hpet.h: remove broken links ...
2007-10-20spelling fixes: arch/x86_64/Simon Arlott
Spelling fixes in arch/x86_64/. Signed-off-by: Simon Arlott <simon@fire.lp0.eu> Signed-off-by: Adrian Bunk <bunk@kernel.org>
2007-10-19x86: convert cpuinfo_x86 array to a per_cpu arrayMike Travis
cpu_data is currently an array defined using NR_CPUS. This means that we overallocate since we will rarely really use maximum configured cpus. When NR_CPU count is raised to 4096 the size of cpu_data becomes 3,145,728 bytes. These changes were adopted from the sparc64 (and ia64) code. An additional field was added to cpuinfo_x86 to be a non-ambiguous cpu index. This corresponds to the index into a cpumask_t as well as the per_cpu index. It's used in various places like show_cpuinfo(). cpu_data is defined to be the boot_cpu_data structure for the NON-SMP case. Signed-off-by: Mike Travis <travis@sgi.com> Acked-by: Christoph Lameter <clameter@sgi.com> Cc: Andi Kleen <ak@suse.de> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: Dmitry Torokhov <dtor@mail.ru> Cc: "Antonino A. Daplas" <adaplas@pol.net> Cc: Mark M. Hoffman <mhoffman@lightlink.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-19x86, vsyscall: fix the oops crash with __pa_vsymbol()Siddha, Suresh B
Appended patch fixes an oops while changing the vsyscall sysctl. I am sure no one tested this code before integrating into mainline :( BTW, using ioremap() in vsyscall_sysctl_change() to get the virtual address of a kernel symbol sounds like an over kill.. I wonder if we can define a simple __va_vsymbol() which will return directly the kernel direct mapping. comments in the code which says gcc has trouble with __va(__pa()) sounds bogus to me. __pa() on a vsyscall address will not work anyhow :( And also, the whole nop out syscall in vsyscall page infrastructure (vsyscall_sysctl_change()) is added to make some attacks difficult, and yet I don't see this nop out being done by default. This area requires more cleanups? Fix an oops with __pa_vsymbol(). VSYSCALL_FIRST_PAGE is a fixmap index. We want the starting virtual address of the vsyscall page and not the index. [ mingo: arch/x86 adaptation ] Reported-by: Yanmin Zhang <yanmin.zhang@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-18sparse pointer use of zero as nullStephen Hemminger
Get rid of sparse related warnings from places that use integer as NULL pointer. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Cc: Andi Kleen <ak@suse.de> Cc: Jeff Garzik <jeff@garzik.org> Cc: Matt Mackall <mpm@selenic.com> Cc: Ian Kent <raven@themaw.net> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Davide Libenzi <davidel@xmailserver.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18sysctl: x86_64 remove unnecessary binary pathsEric W. Biederman
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Alexey Dobriyan <adobriyan@sw.ru> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18Fix discrepancy between VDSO based gettimeofday() and sys_gettimeofday().Tony Breeds
On platforms that copy sys_tz into the vdso (currently only x86_64, soon to include powerpc), it is possible for the vdso to get out of sync if a user calls (admittedly unusual) settimeofday(NULL, ptr). This patch adds a hook for architectures that set CONFIG_GENERIC_TIME_VSYSCALL to ensure when sys_tz is updated they can also updatee their copy in the vdso. Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Cc: Andi Kleen <ak@suse.de> Cc: Tony Luck <tony.luck@intel.com> Acked-by: John Stultz <johnstul@us.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17x86: remove duplicated vsyscall nsec updateAndi Kleen
Spotted by Chuck Ebbert [ tglx: arch/x86 adaptation ] Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17x86: fix cpu_to_node referencesMike Travis
In x86_64 and i386 architectures most arrays that are sized using NR_CPUS lay in local memory on node 0. Not only will most (99%?) of the systems not use all the slots in these arrays, particularly when NR_CPUS is increased to accommodate future very high cpu count systems, but a number of cache lines are passed unnecessarily on the system bus when these arrays are referenced by cpus on other nodes. Typically, the values in these arrays are referenced by the cpu accessing it's own values, though when passing IPI interrupts, the cpu does access the data relevant to the targeted cpu/node. Of course, if the referencing cpu is not on node 0, then the reference will still require cross node exchanges of cache lines. A common use of this is for an interrupt service routine to pass the interrupt to other cpus local to that node. Ideally, all the elements in these arrays should be moved to the per_cpu data area. In some cases (such as x86_cpu_to_apicid) the array is referenced before the per_cpu data areas are setup. In this case, a static array is declared in the __initdata area and initialized by the booting cpu (BSP). The values are then moved to the per_cpu area after it is initialized and the original static array is freed with the rest of the __initdata. This patch: Fix four instances where cpu_to_node is referenced by array instead of via the cpu_to_node macro. This is preparation to moving it to the per_cpu data area. Signed-off-by: Mike Travis <travis@sgi.com> Cc: Andi Kleen <ak@suse.de> Cc: Christoph Lameter <clameter@sgi.com> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-13Delete filenames in comments.Dave Jones
Since the x86 merge, lots of files that referenced their own filenames are no longer correct. Rather than keep them up to date, just delete them, as they add no real value. Additionally: - fix up comment formatting in scx200_32.c - Remove a credit from myself in setup_64.c from a time when we had no SCM - remove longwinded history from tsc_32.c which can be figured out from git. Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-11x86_64: move kernelThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>