aboutsummaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2007-05-08kprobes: kretprobes simplificationsChristoph Hellwig
- consolidate duplicate code in all arch_prepare_kretprobe instances into common code - replace various odd helpers that use hlist_for_each_entry to get the first elemenet of a list with either a hlist_for_each_entry_save or an opencoded access to the first element in the caller - inline add_rp_inst into it's only remaining caller - use kretprobe_inst_table_head instead of opencoding it Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com> Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08utimensat implementationUlrich Drepper
Implement utimensat(2) which is an extension to futimesat(2) in that it a) supports nano-second resolution for the timestamps b) allows to selectively ignore the atime/mtime value c) allows to selectively use the current time for either atime or mtime d) supports changing the atime/mtime of a symlink itself along the lines of the BSD lutimes(3) functions For this change the internally used do_utimes() functions was changed to accept a timespec time value and an additional flags parameter. Additionally the sys_utime function was changed to match compat_sys_utime which already use do_utimes instead of duplicating the work. Also, the completely missing futimensat() functionality is added. We have such a function in glibc but we have to resort to using /proc/self/fd/* which not everybody likes (chroot etc). Test application (the syscall number will need per-arch editing): #include <errno.h> #include <fcntl.h> #include <time.h> #include <sys/time.h> #include <stddef.h> #include <syscall.h> #define __NR_utimensat 280 #define UTIME_NOW ((1l << 30) - 1l) #define UTIME_OMIT ((1l << 30) - 2l) int main(void) { int status = 0; int fd = open("ttt", O_RDWR|O_CREAT|O_EXCL, 0666); if (fd == -1) error (1, errno, "failed to create test file \"ttt\""); struct stat64 st1; if (fstat64 (fd, &st1) != 0) error (1, errno, "fstat failed"); struct timespec t[2]; t[0].tv_sec = 0; t[0].tv_nsec = 0; t[1].tv_sec = 0; t[1].tv_nsec = 0; if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0) error (1, errno, "utimensat failed"); struct stat64 st2; if (fstat64 (fd, &st2) != 0) error (1, errno, "fstat failed"); if (st2.st_atim.tv_sec != 0 || st2.st_atim.tv_nsec != 0) { puts ("atim not reset to zero"); status = 1; } if (st2.st_mtim.tv_sec != 0 || st2.st_mtim.tv_nsec != 0) { puts ("mtim not reset to zero"); status = 1; } if (status != 0) goto out; t[0] = st1.st_atim; t[1].tv_sec = 0; t[1].tv_nsec = UTIME_OMIT; if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0) error (1, errno, "utimensat failed"); if (fstat64 (fd, &st2) != 0) error (1, errno, "fstat failed"); if (st2.st_atim.tv_sec != st1.st_atim.tv_sec || st2.st_atim.tv_nsec != st1.st_atim.tv_nsec) { puts ("atim not set"); status = 1; } if (st2.st_mtim.tv_sec != 0 || st2.st_mtim.tv_nsec != 0) { puts ("mtim changed from zero"); status = 1; } if (status != 0) goto out; t[0].tv_sec = 0; t[0].tv_nsec = UTIME_OMIT; t[1] = st1.st_mtim; if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0) error (1, errno, "utimensat failed"); if (fstat64 (fd, &st2) != 0) error (1, errno, "fstat failed"); if (st2.st_atim.tv_sec != st1.st_atim.tv_sec || st2.st_atim.tv_nsec != st1.st_atim.tv_nsec) { puts ("mtim changed from original time"); status = 1; } if (st2.st_mtim.tv_sec != st1.st_mtim.tv_sec || st2.st_mtim.tv_nsec != st1.st_mtim.tv_nsec) { puts ("mtim not set"); status = 1; } if (status != 0) goto out; sleep (2); t[0].tv_sec = 0; t[0].tv_nsec = UTIME_NOW; t[1].tv_sec = 0; t[1].tv_nsec = UTIME_NOW; if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0) error (1, errno, "utimensat failed"); if (fstat64 (fd, &st2) != 0) error (1, errno, "fstat failed"); struct timeval tv; gettimeofday(&tv,NULL); if (st2.st_atim.tv_sec <= st1.st_atim.tv_sec || st2.st_atim.tv_sec > tv.tv_sec) { puts ("atim not set to NOW"); status = 1; } if (st2.st_mtim.tv_sec <= st1.st_mtim.tv_sec || st2.st_mtim.tv_sec > tv.tv_sec) { puts ("mtim not set to NOW"); status = 1; } if (symlink ("ttt", "tttsym") != 0) error (1, errno, "cannot create symlink"); t[0].tv_sec = 0; t[0].tv_nsec = 0; t[1].tv_sec = 0; t[1].tv_nsec = 0; if (syscall(__NR_utimensat, AT_FDCWD, "tttsym", t, AT_SYMLINK_NOFOLLOW) != 0) error (1, errno, "utimensat failed"); if (lstat64 ("tttsym", &st2) != 0) error (1, errno, "lstat failed"); if (st2.st_atim.tv_sec != 0 || st2.st_atim.tv_nsec != 0) { puts ("symlink atim not reset to zero"); status = 1; } if (st2.st_mtim.tv_sec != 0 || st2.st_mtim.tv_nsec != 0) { puts ("symlink mtim not reset to zero"); status = 1; } if (status != 0) goto out; t[0].tv_sec = 1; t[0].tv_nsec = 0; t[1].tv_sec = 1; t[1].tv_nsec = 0; if (syscall(__NR_utimensat, fd, NULL, t, 0) != 0) error (1, errno, "utimensat failed"); if (fstat64 (fd, &st2) != 0) error (1, errno, "fstat failed"); if (st2.st_atim.tv_sec != 1 || st2.st_atim.tv_nsec != 0) { puts ("atim not reset to one"); status = 1; } if (st2.st_mtim.tv_sec != 1 || st2.st_mtim.tv_nsec != 0) { puts ("mtim not reset to one"); status = 1; } if (status == 0) puts ("all OK"); out: close (fd); unlink ("ttt"); unlink ("tttsym"); return status; } [akpm@linux-foundation.org: add missing i386 syscall table entry] Signed-off-by: Ulrich Drepper <drepper@redhat.com> Cc: Alexey Dobriyan <adobriyan@openvz.org> Cc: Michael Kerrisk <mtk-manpages@gmx.net> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08dma_declare_coherent_memory wrong allocationGuennadi Liakhovetski
dma_declare_coherent_memory() allocates a bitmap 1 bit per page, it calculates the bitmap size based on size of long, but allocates bytes... Thanks to James Bottomley for clarifications and corrections. Signed-off-by: G. Liakhovetski <g.liakhovetski@gmx.de> Acked-by: James Bottomley <James.Bottomley@SteelEye.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08apm: fix incorrect commentAlan Cox
HZ has not always been 100Hz for some time. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08EFI: warn only for pre-1.00 system tablesBjorn Helgaas
We used to warn unless the EFI system table major revision was exactly 1. But EFI 2.00 firmware is starting to appear, and the 2.00 changes don't affect anything in Linux. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Andi Kleen <ak@suse.de> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08Replace deprecated SA_xxx interrupt flagsThomas Gleixner
Fix the last users of the deprecated SA_xxx interrupt flags. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08Simplify kallsyms_lookup()Alexey Dobriyan
Several kallsyms_lookup() pass dummy arguments but only need, say, module's name. Make kallsyms_lookup() accept NULLs where possible. Also, makes picture clearer about what interfaces are needed for all symbol resolving business. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Cc: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08Kprobes: print details of kretprobe on assertion failureAnanth N Mavinakayanahalli
In certain cases like when the real return address can't be found or when the number of tracked calls to a kretprobed function is less than the number of returns, we may not be able to find the correct return address after processing a kretprobe. Currently we just do a BUG_ON, but no information is provided about the actual failing kretprobe. Print out details of the kretprobe before calling BUG(). Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: Maneesh Soni <maneesh@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08kdump/kexec: calculate note size at compile timeSimon Horman
Currently the size of the per-cpu region reserved to save crash notes is set by the per-architecture value MAX_NOTE_BYTES. Which in turn is currently set to 1024 on all supported architectures. While testing ia64 I recently discovered that this value is in fact too small. The particular setup I was using actually needs 1172 bytes. This lead to very tedious failure mode where the tail of one elf note would overwrite the head of another if they ended up being alocated sequentially by kmalloc, which was often the case. It seems to me that a far better approach is to caclculate the size that the area needs to be. This patch does just that. If a simpler stop-gap patch for ia64 to be squeezed into 2.6.21(.X) is needed then this should be as easy as making MAX_NOTE_BYTES larger in arch/asm-ia64/kexec.h. Perhaps 2048 would be a good choice. However, I think that the approach in this patch is a much more robust idea. Acked-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08header cleaning: don't include smp_lock.h when not usedRandy Dunlap
Remove includes of <linux/smp_lock.h> where it is not used/needed. Suggested by Al Viro. Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc, sparc64, and arm (all 59 defconfigs). Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08<linux/sysdev.h> needs to include <linux/module.h>Ralf Baechle
sysdev.h uses THIS_MODULE so should include <linux/module.h>. [akpm@linux-foundation.org: couple of fixes] Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Cc: Andi Kleen <ak@suse.de> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08move die notifier handling to common codeChristoph Hellwig
This patch moves the die notifier handling to common code. Previous various architectures had exactly the same code for it. Note that the new code is compiled unconditionally, this should be understood as an appel to the other architecture maintainer to implement support for it aswell (aka sprinkling a notify_die or two in the proper place) arm had a notifiy_die that did something totally different, I renamed it to arm_notify_die as part of the patch and made it static to the file it's declared and used at. avr32 used to pass slightly less information through this interface and I brought it into line with the other architectures. [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: fix vmalloc_sync_all bustage] [bryan.wu@analog.com: fix vmalloc_sync_all in nommu] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: <linux-arch@vger.kernel.org> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Bryan Wu <bryan.wu@analog.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08Fixes and cleanups for earlyprintk aka boot consoleGerd Hoffmann
The console subsystem already has an idea of a boot console, using the CON_BOOT flag. The implementation has some flaws though. The major problem is that presence of a boot console makes register_console() ignore any other console devices (unless explicitly specified on the kernel command line). This patch fixes the console selection code to *not* consider a boot console a full-featured one, so the first non-boot console registering will become the default console instead. This way the unregister call for the boot console in the register_console() function actually triggers and the handover from the boot console to the real console device works smoothly. Added a printk for the handover, so you know which console device the output goes to when the boot console stops printing messages. The disable_early_printk() call is obsolete with that patch, explicitly disabling the early console isn't needed any more as it works automagically with that patch. I've walked through the tree, dropped all disable_early_printk() instances found below arch/ and tagged the consoles with CON_BOOT if needed. The code is tested on x86, sh (thanks to Paul) and mips (thanks to Ralf). Changes to last version: Rediffed against -rc3, adapted to mips cleanups by Ralf, fixed "udbg-immortal" cmd line arg on powerpc. Signed-off-by: Gerd Hoffmann <kraxel@exsuse.de> Acked-by: Paul Mundt <lethal@linux-sh.org> Acked-by: Ralf Baechle <ralf@linux-mips.org> Cc: Andi Kleen <ak@suse.de> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08kconfig: centralize the selection of semaphore debugging in lib/Kconfig.debugRobert P. J. Day
Remove the Kconfig selection of semaphore debugging from the ALPHA and FRV Kconfig files, and centralize it in lib/Kconfig.debug. There doesn't seem to be much point in letting individual architectures independently define the same Kconfig option when it can just as easily be put in a single Kconfig file and made dependent on a subset of architectures. that way, at least the option shows up in the same relative location in the menu each time. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Cc: David Howells <dhowells@redhat.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08ipmi: add new IPMI nmi watchdog handlingCorey Minyard
Convert over to the new NMI handling for getting IPMI watchdog timeouts via an NMI. This add config options to know if there is the ability to receive NMIs and if it has an NMI post processing call. Then it modifies the IPMI watchdog to take advantage of this so that it can know if an NMI comes in. It also adds testing that the IPMI NMI watchdog works. Signed-off-by: Corey Minyard <minyard@acm.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08simplify the stacktrace codeChristoph Hellwig
Simplify the stacktrace code: - remove the unused task argument to save_stack_trace, it's always current - remove the all_contexts flag, it's alwasy 0 Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Andi Kleen <ak@suse.de> Cc: Akinobu Mita <akinobu.mita@gmail.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08uml: an idle system should have zero load averageJeff Dike
The ever-vigilant users of linode.com noticed that an idle 2.6 UML has a persistent load average of ~.4. It turns out that because the UML timer handler processed softirqs before actually delivering the tick, the tick was counted in the context of the idle thread about half the time. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08use SLAB_PANIC flag cleanupAkinobu Mita
Use SLAB_PANIC and delete duplicated panic(). Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Ian Molton <spyro@f2s.com> Cc: David Howells <dhowells@redhat.com> Cc: Andi Kleen <ak@suse.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08Fix section mismatch of memory hotplug related code.Yasunori Goto
This is to fix many section mismatches of code related to memory hotplug. I checked compile with memory hotplug on/off on ia64 and x86-64 box. Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: [IA64] update memory attribute aliasing documentation & test cases [IA64] fail mmaps that span areas with incompatible attributes [IA64] allow WB /sys/.../legacy_mem mmaps [IA64] make ioremap avoid unsupported attributes [IA64] rename ioremap variables to match i386 [IA64] relax per-cpu TLB requirement to DTC [IA64] remove per-cpu ia64_phys_stacked_size_p8 [IA64] Fix example error injection program [IA64] Itanium MC Error Injection Tool: pal_mc_error_inject() interface [IA64] Itanium MC Error Injection Tool: Makefile changes [IA64] Itanium MC Error Injection Tool: Driver sysfs interface [IA64] Itanium MC Error Injection Tool: Doc and sample application [IA64] Itanium MC Error Injection Tool: Kernel configuration
2007-05-07Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6: [SERIAL] sunsu: Fix section mismatch warnings. [SPARC64]: pgtable_cache_init() should be __init. [SPARC64]: Fix section mismatch warnings in arch/sparc64/kernel/prom.c [SPARC64]: Fix section mismatch warnings in arch/sparc64/kernel/pci.c [SPARC64]: Fix section mismatch warnings in arch/sparc64/kernel/console.c [MM]: sparse_init() should be __init. [SPARC64]: Update defconfig. [VIDEO]: Add Sun XVR-2500 framebuffer driver. [VIDEO]: Add Sun XVR-500 framebuffer driver. [SPARC64]: SUN4U PCI-E controller support. [SPARC]: Fix comment typo in smp4m_blackbox_current(). [SCSI] SUNESP: sun_esp.c needs linux/delay.h Fix up conflict in arch/sparc64/mm/init.c manually due to removal of pgtable_cache_init() through the -mm patches (even though that patch was also by David ;) Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07Merge master.kernel.org:/pub/scm/linux/kernel/git/lethal/sh-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/lethal/sh-2.6: (38 commits) sh: R7785RP board updates. sh: Update r7780rp defconfig. sh: Add die chain notifiers. sh: Fix APM emulation on hp6xx. sh: Wire up more IRQs for SH7709. sh: Solution Engine 7722 board support. sh: Fix r7780rp build. sh: kdump support. sh: Move clock reporting to its own proc entry. sh: Solution Engine SH7705 board and CPU updates. serial: sh-sci: Fix module clock refcount for serial console. serial: sh-sci: Fix module clock refcounting. sh: SH7722 clock framework support. sh: hp6xx pata_platform support. sh: Obey CONFIG_HZ for HZ definition. sh: Fix fstatat64() syscall. sh: se7780 PCI support. sh: SH7780 Solution Engine board support. sh: Add a dummy SH-4 PCIC fixup. sh: Tidy up L-BOX area5 addresses. ...
2007-05-07xtensa: strlcpy is smart enoughJean Delvare
strlcpy already accounts for the trailing zero in its length computation, so there is no need to substract one to the buffer size. Signed-off-by: Jean Delvare <khali@linux-fr.org> Cc: Chris Zankel <chris@zankel.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07v850: generic timekeeping conversionjohn stultz
Convert an arch that does not currently implement sub-jiffy timekeeping to use the generic timekeeping code. v850 looks like it has some intent to implement sub-jiffy timekeeping, so it may not yet be appropriate to try to convert, but I figured I'd get the maintainer's input and submit the patch for comment. Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07uml: fix prototypesPaolo 'Blaisorblade' Giarrusso
Declare strlcpy and strlcat more correctly. Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Jeff Dike <jdike@addtoit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07uml: virtualized time fixJeff Dike
With the current timekeeping, !CONFIG_UML_REAL_TIME_CLOCK has inconsistent behavior. Previously, gettimeofday could be (and was) isolated from the clock ticking. Now, it's not, so when CONFIG_UML_REAL_TIME_CLOCK is disabled, gettimeofday must progress in lockstep with the clock, making it fully virtual. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07uml: out of tmpfs space error clarificationJeff Dike
It turns out that the message complaining about a lack of tmpfs space on the host can be misunderstood as referring to the UML. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07uml: only flush areas covered by VMAJeff Dike
When doing a full address space flush, only look at areas covered by a VMA. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07uml: more page fault path trimmingJeff Dike
More trimming of the page fault path. Permissions are passed around in a single int rather than one bit per int. The permission values are copied from libc so that they can be passed to mmap and mprotect without any further conversion. The register sets used by do_syscall_stub and copy_context_skas0 are initialized once, at boot time, rather than once per call. wait_stub_done checks whether it is getting the signals it expects by comparing the wait status to a mask containing bits for the signals of interest rather than comparing individually to the signal numbers. It also has one check for a wait failure instead of two. The caller is expected to do the initial continue of the stub. This gets rid of an argument and some logic. The fname argument is gone, as that can be had from a stack trace. user_signal() is collapsed into userspace() as it is basically one or two lines of code afterwards. The physical memory remapping stuff is gone, as it is unused. flush_tlb_page is inlined. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07uml: eliminate a piece of debugging codeJeff Dike
I missed removing another piece of debugging in an earlier patch. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07uml: speed page fault pathJeff Dike
Give the page fault code a specialized path. There is only one page to look at, so there's no point in going into the general page table walking code. There's only going to be one host operation, so there are no opportunities for merging. So, we go straight to the pte we want, figure out what needs doing, and do it. While I was in here, I fixed the wart where the address passed to unmap was a void *, but an unsigned long to map and protect. This gives me just under 10% on a kernel build. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07uml: aIO deadlock avoidanceJeff Dike
Allow deadlocks to be avoided in the AIO code by setting the pipe to the I/O thread non-blocking. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07uml: rename os_{read_write}_file_k back to os_{read_write}_fileJeff Dike
Rename os_{read_write}_file_k back to os_{read_write}_file, delete the originals and their bogus infrastructure, and fix all the callers. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07uml: remove debugging remnantsJeff Dike
I accidentally left the remnants of some debugging in an earlier patch. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07uml: formatting fixes around os_{read_write}_file callersJeff Dike
Formatting fixes ahead of renaming os_{read_write}_file_k to os_{read_write}_file and fixing all the callers. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07uml: change remaining callers of os_{read_write}_fileJeff Dike
Convert all remaining os_{read_write}_file users to use the simple {read,write} wrappers, os_{read_write}_file_k. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07uml: don't try to handle signals on initial process stackJeff Dike
Code running on the initial UML stack can't receive or process signals since current must be valid when IRQs are handled, and there is no current for this stack. So, instead of using UML_LONGJMP and UML_SETJMP, which are careful to save and restore signal state, and, as a side-effect, handle any deferred signals, start_idle_thread must use the bare equivalents, which don't do anything with signals. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07uml: dump core on panicJeff Dike
Dump core after a panic. This will provide better debugging information than is currently available. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07uml: fixup allocation in the ubd driverPeter Zijlstra
Sanitise gfp flags; it actually is an atomic context, so drop the GFP_KERNEL part. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Jeff Dike <jdike@addtoit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07uml: send pointers instead of structures to I/O threadJeff Dike
Instead of writing entire structures between UML and the I/O thread, we send pointers. This cuts down on the amount of data being copied and possibly allows more requests to be pending between the two. This requires that the requests be kmalloced and freed instead of living on the stack. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07uml: batch I/O requestsJeff Dike
Send as many I/O requests to the I/O thread as possible, even though it will still only handle one at a time. This provides an opportunity to reduce latency by starting one request before the previous one has been finished in the driver. Request handling is somewhat modernized by requesting sg pieces of a request and handling them separately, finishing off the entire request after all the pieces are done. When a request queue stalls, normally because its pipe to the I/O thread is full, it is put on the restart list. This list is processed by starting up the queues on it whenever there is some indication that progress might be possible again. Currently, this happens in the driver interrupt routine. Some requests have been finished, so there is likely to be room in the pipe again. This almost doubles throughput when copying data between devices, but made no noticable difference on anything else I tried. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07uml: convert libc layer to call read and writeJeff Dike
This patch converts calls in the os layer to os_{read,write}_file to calls directly to libc read() and write() where it is clear that the I/O buffer is in the kernel. We can do that here instead of calling os_{read,write}_file_k since we are in libc code and can call libc directly. With the change in the calls, error handling needs to be changed to refer to errno directly rather than the return value of the call. CATCH_EINTR wrappers were also added where needed. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07uml: tidy libc codeJeff Dike
This patch lays some groundwork for the next one, which converts calls to os_{read,write}_file into {read,write}, by doing some tidying in the affected areas. do_not_aio gets restructured to make the final result a bit cleaner. There are also whitespace and other formatting fixes, fixes in error messages, and a typo fix. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07uml: start fixing os_read_file and os_write_fileJeff Dike
This patch starts the removal of a very old, very broken piece of code. This stems from the problem of passing a userspace buffer into read() or write() on the host. If that buffer had not yet been faulted in, read and write will return -EFAULT. To avoid this problem, the solution was to fault the buffer in before the system call by touching the pages that hold the buffer by doing a copy-user of a byte to each page. This is obviously bogus, but it does usually work, in tt mode, since the kernel and process are in the same address space and userspace addresses can be accessed directly in the kernel. In skas mode, where the kernel and process are in separate address spaces, it is completely bogus because the userspace address, which is invalid in the kernel, is passed into the system call instead of the corresponding physical address, which would be valid. Here, it appears that this code, on every host read() or write(), tries to fault in a random process page. This doesn't seem to cause any correctness problems, but there is a performance impact. This patch, and the ones following, result in a 10-15% performance gain on a kernel build. This code can't be immediately tossed out because when it is, you can't log in. Apparently, there is some code in the console driver which depends on this somehow. However, we can start removing it by switching the code which does I/O using kernel addresses to using plain read() and write(). This patch introduces os_read_file_k and os_write_file_k for use with kernel buffers and converts all call locations which use obvious kernel buffers to use them. These include I/O using buffers which are local variables which are on the stack or kmalloc-ed. Later patches will handle the less obvious cases, followed by a mass conversion back to the original interface. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07uml: remove unused x86_64 codeJeff Dike
It turns out that essentially none of the x86_64 bugs.c is needed. So, we can delete most of it. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07uml: speed up page table walkingJeff Dike
The previous page table walking code was horribly inefficient. This patch replaces it with code taken from elsewhere in the kernel. Forking from bash is now ~5% faster and page faults are handled ~10% faster. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07uml: dump registers on ptrace or wait failureJeff Dike
Provide a register dump if handle_trap fails. Abstract out ptrace_dump_regs since it now has two callers. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07uml: drivers get release methodsJeff Dike
Define release methods for the ubd and net drivers. They contain as much of the remove methods as make sense. All error checking must have already been done as well as anything else that might be holding a reference on the device kobject. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07uml: delete HOST_FRAME_SIZEJeff Dike
HOST_FRAME_SIZE isn't used any more. It has been replaced with MAX_REG_NR. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07uml: irq locking commentaryJeff Dike
Locking commentary. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>