aboutsummaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2008-01-06CPU hotplug: fix cpu_is_offline() on !CONFIG_HOTPLUG_CPUIngo Molnar
make randconfig bootup testing found that the cpufreq code crashes on bootup, if the powernow-k8 driver is enabled and if maxcpus=1 passed on the boot line to a !CONFIG_HOTPLUG_CPU kernel. First lockdep found out that there's an inconsistent unlock sequence: ===================================== [ BUG: bad unlock balance detected! ] ------------------------------------- swapper/1 is trying to release lock (&per_cpu(cpu_policy_rwsem, cpu)) at: [<ffffffff806ffd8e>] unlock_policy_rwsem_write+0x3c/0x42 but there are no more locks to release! Call Trace: [<ffffffff806ffd8e>] unlock_policy_rwsem_write+0x3c/0x42 [<ffffffff80251c29>] print_unlock_inbalance_bug+0x104/0x12c [<ffffffff80252f3a>] mark_held_locks+0x56/0x94 [<ffffffff806ffd8e>] unlock_policy_rwsem_write+0x3c/0x42 [<ffffffff807008b6>] cpufreq_add_dev+0x2a8/0x5c4 ... then shortly afterwards the cpufreq code crashed on an assert: ------------[ cut here ]------------ kernel BUG at drivers/cpufreq/cpufreq.c:1068! invalid opcode: 0000 [1] SMP [...] Call Trace: [<ffffffff805145d6>] sysdev_driver_unregister+0x5b/0x91 [<ffffffff806ff520>] cpufreq_register_driver+0x15d/0x1a2 [<ffffffff80cc0596>] powernowk8_init+0x86/0x94 [...] ---[ end trace 1e9219be2b4431de ]--- the bug was caused by maxcpus=1 bootup, which brought up the secondary core as !cpu_online() but !cpu_is_offline() either, which on on !CONFIG_HOTPLUG_CPU is always 0 (include/linux/cpu.h): /* CPUs don't go offline once they're online w/o CONFIG_HOTPLUG_CPU */ static inline int cpu_is_offline(int cpu) { return 0; } but the cpufreq code uses cpu_online() and cpu_is_offline() in a mixed way - the low-level drivers use cpu_online(), while the cpufreq core uses cpu_is_offline(). This opened up the possibility to add the non-initialized sysdev device of the secondary core: cpufreq-core: trying to register driver powernow-k8 cpufreq-core: adding CPU 0 powernow-k8: BIOS error - no PSB or ACPI _PSS objects cpufreq-core: initialization failed cpufreq-core: adding CPU 1 cpufreq-core: initialization failed which then blew up. The fix is to make cpu_is_offline() always the negation of cpu_online(). With that fix applied the kernel boots up fine without crashing: Calling initcall 0xffffffff80cc0510: powernowk8_init+0x0/0x94() powernow-k8: Found 1 AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ processors (1 cpu cores) (version 2.20.00) powernow-k8: BIOS error - no PSB or ACPI _PSS objects initcall 0xffffffff80cc0510: powernowk8_init+0x0/0x94() returned -19. initcall 0xffffffff80cc0510 ran for 19 msecs: powernowk8_init+0x0/0x94() Calling initcall 0xffffffff80cc328f: init_lapic_nmi_sysfs+0x0/0x39() We could fix this by making CPU enumeration aware of max_cpus, but that would be more fragile IMO, and the cpu_online(cpu) != cpu_is_offline(cpu) possibility was quite confusing and a continuous source of bugs too. Most distributions have kernels with CPU hotplug enabled, so this bug remained hidden for a long time. Bug forensics: The broken cpu_is_offline() API variant was introduced via: commit a59d2e4e6977e7b94e003c96a41f07e96cddc340 Author: Rusty Russell <rusty@rustcorp.com.au> Date: Mon Mar 8 06:06:03 2004 -0800 [PATCH] minor cleanups for hotplug CPUs ( this predates linux-2.6.git, this commit is available from Thomas's historic git tree. ) Then 1.5 years later the cpufreq code made use of it: commit c32b6b8e524d2c337767d312814484d9289550cf Author: Ashok Raj <ashok.raj@intel.com> Date: Sun Oct 30 14:59:54 2005 -0800 [PATCH] create and destroy cpufreq sysfs entries based on cpu notifiers + if (cpu_is_offline(cpu)) + return 0; which is a correct use of the subtly broken new API. v2.6.15 then shipped with this bug included. then it took two more years for random-kernel qa to hit it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-06Revert "scsi: revert "[SCSI] Get rid of scsi_cmnd->done""Linus Torvalds
This reverts commit ac40532ef0b8649e6f7f83859ea0de1c4ed08a19, which gets us back the original cleanup of 6f5391c283d7fdcf24bf40786ea79061919d1e1d. It turns out that the bug that was triggered by that commit was apparently not actually triggered by that commit at all, and just the testing conditions had changed enough to make it appear to be due to it. The real problem seems to have been found by Peter Osterlund: "pktcdvd sets it [block device size] when opening the /dev/pktcdvd device, but when the drive is later opened as /dev/scd0, there is nothing that sets it back. (Btw, 40944 is possible if the disk is a CDRW that was formatted with "cdrwtool -m 10236".) The problem is that pktcdvd opens the cd device in non-blocking mode when pktsetup is run, and doesn't close it again until pktsetup -d is run. The effect is that if you meanwhile open the cd device, blkdev.c:do_open() doesn't call bd_set_size() because bdev->bd_openers is non-zero." In particular, to repeat the bug (regardless of whether commit 6f5391c283d7fdcf24bf40786ea79061919d1e1d is applied or not): " 1. Start with an empty drive. 2. pktsetup 0 /dev/scd0 3. Insert a CD containing an isofs filesystem. 4. mount /dev/pktcdvd/0 /mnt/tmp 5. umount /mnt/tmp 6. Press the eject button. 7. Insert a DVD containing a non-writable filesystem. 8. mount /dev/scd0 /mnt/tmp 9. find /mnt/tmp -type f -print0 | xargs -0 sha1sum >/dev/null 10. If the DVD contains data beyond the physical size of a CD, you get I/O errors in the terminal, and dmesg reports lots of "attempt to access beyond end of device" errors." which in turn is because the nested open after the media change won't cause the size to be set properly (because the original open still holds the block device, and we only do the bd_set_size() when we don't have other people holding the device open). The proper fix for that is probably to just do something like bdev->bd_inode->i_size = (loff_t)get_capacity(disk)<<9; in fs/block_dev.c:do_open() even for the cases where we're not the original opener (but *not* call bd_set_size(), since that will also change the block size of the device). Cc: Peter Osterlund <petero2@telia.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Ingo Molnar <mingo@elte.hu> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-03[IA64] Update Altix BTE error return status patchRuss Anderson
I neglected to send Tony the most recent version of the patch ("Fix Altix BTE error return status") applied as commit: 64135fa97ce016058f95345425a9ebd04ee1bd2a This patch gets it up to date. Without this patch on shub2, if there is no error xpcBteUnmappedError is returned instead of xpcSuccess. Signed-off-by: Russ Anderson (rja@sgi.com) Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-01-02restrict reading from /proc/<pid>/maps to those who share ->mm or can ptrace pidAl Viro
Contents of /proc/*/maps is sensitive and may become sensitive after open() (e.g. if target originally shares our ->mm and later does exec on suid-root binary). Check at read() (actually, ->start() of iterator) time that mm_struct we'd grabbed and locked is - still the ->mm of target - equal to reader's ->mm or the target is ptracable by reader. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-02scsi: revert "[SCSI] Get rid of scsi_cmnd->done"Ingo Molnar
This reverts commit 6f5391c283d7fdcf24bf40786ea79061919d1e1d ("[SCSI] Get rid of scsi_cmnd->done") that was supposed to be a cleanup commit, but apparently it causes regressions: Bug 9370 - v2.6.24-rc2-409-g9418d5d: attempt to access beyond end of device http://bugzilla.kernel.org/show_bug.cgi?id=9370 this patch should be reintroduced in a more split-up form to make testing of it easier. Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Matthew Wilcox <matthew@wil.cx> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-02Unify /proc/slabinfo configurationLinus Torvalds
Both SLUB and SLAB really did almost exactly the same thing for /proc/slabinfo setup, using duplicate code and per-allocator #ifdef's. This just creates a common CONFIG_SLABINFO that is enabled by both SLUB and SLAB, and shares all the setup code. Maybe SLOB will want this some day too. Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86: x86: fix asm-x86/msr.h for user-space export x86: fix asm-x86/byteorder.h for userspace export
2008-01-01slub: provide /proc/slabinfoPekka J Enberg
This adds a read-only /proc/slabinfo file on SLUB, that makes slabtop work. [ mingo@elte.hu: build fix. ] Cc: Andi Kleen <andi@firstfloor.org> Cc: Christoph Lameter <clameter@sgi.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-01x86: fix asm-x86/msr.h for user-space exportMike Frysinger
Use __asm__ and __volatile__ in code that is exported to userspace. Wrap kernel functions with __KERNEL__ so they get scrubbed. No code changed: text data bss dec hex filename 9681036 1698924 3407872 14787832 e1a4f8 vmlinux.before 9681036 1698924 3407872 14787832 e1a4f8 vmlinux.after Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-01x86: fix asm-x86/byteorder.h for userspace exportMike Frysinger
Since asm-x86/byteorder.h is exported to userspace, use __asm__ rather than asm in its code. Signed-Off-By: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-12-28[POWERPC] Oprofile: Remove dependency on spufs moduleBob Nelson
This removes an OProfile dependency on the spufs module. This dependency was causing a problem for multiplatform systems that are built with support for Oprofile on Cell but try to load the oprofile module on a non-Cell system. Signed-off-by: Bob Nelson <rrnelson@us.ibm.com> Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com> Acked-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: [IPV4]: Fix ip command line processing. [VETH]: move veth.h to include/linux [NET] tc_nat: header install [TUNTAP]: Fix wrong debug message. [NETFILTER]: nf_conntrack_ipv4: fix module parameter compatibility mac80211: warn when receiving frames with unaligned data mac80211: round station cleanup timer
2007-12-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6: [SPARC64]: Implement pci_resource_to_user()
2007-12-26Revert quicklist need->flush fixChristoph Lameter
Did not fix the reported issue. Apart from other weirdness this causes a bad link between the TLB flushing logic and the quicklists. If there is indeed an issue that an arch needs a tlb flush before free then the arch code needs to set tlb->need_flush before calling quicklist_free. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-12-26[VETH]: move veth.h to include/linuxStephen Hemminger
Move veth.h from net/ to linux/ since it is a user api, and add it to user header processing Kbuild. [ Use header-y as suggested by Sam Ravnborg. -DaveM ] Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-12-26[NET] tc_nat: header installStephen Hemminger
iproute2 build needs tc_nat.h header from kernel make install_headers. Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-12-26[NETFILTER]: nf_conntrack_ipv4: fix module parameter compatibilityPatrick McHardy
Some users do "modprobe ip_conntrack hashsize=...". Since we have the module aliases this loads nf_conntrack_ipv4 and nf_conntrack, the hashsize parameter is unknown for nf_conntrack_ipv4 however and makes it fail. Allow to specify hashsize= for both nf_conntrack and nf_conntrack_ipv4. Note: the nf_conntrack message in the ringbuffer will display an incorrect hashsize since nf_conntrack is first pulled in as a dependency and calculates the size itself, then it gets changed through a call to nf_conntrack_set_hashsize(). Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-12-26[SPARC64]: Implement pci_resource_to_user()David S. Miller
This makes libpciaccess able to mmap() resources of the device properly. Signed-off-by: David S. Miller <davem@davemloft.net>
2007-12-23quicklists: do not release off node pages earlyChristoph Lameter
quicklists must keep even off node pages on the quicklists until the TLB flush has been completed. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-12-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (23 commits) [IPV4]: OOPS with NETLINK_FIB_LOOKUP netlink socket [NET]: Fix function put_cmsg() which may cause usr application memory overflow [ATM]: Spelling fixes [NETFILTER] ipv4: Spelling fixes [NETFILTER]: Spelling fixes [SCTP]: Spelling fixes [NETLABEL]: Spelling fixes [PKT_SCHED]: Spelling fixes [NET] net/core/: Spelling fixes [IPV6]: Spelling fixes [IRDA]: Spelling fixes [DCCP]: Spelling fixes [NET] include/net/: Spelling fixes [NET]: Correct two mistaken skb_reset_mac_header() conversions. [IPV4] ip_gre: set mac_header correctly in receive path [XFRM]: Audit function arguments misordered [IPSEC]: Avoid undefined shift operation when testing algorithm ID [IPV4] ARP: Remove not used code [TG3]: Endianness bugfix. [TG3]: Endianness annotations. ...
2007-12-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6: [SPARC32]: Spelling fixes [SPARC64]: Spelling fixes [SPARC64]: Fix OOPS in dma_sync_*_for_device()
2007-12-20[NET] include/net/: Spelling fixesJoe Perches
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-12-20dm: merge max_hw_sectorNeil Brown
Make sure dm honours max_hw_sectors of underlying devices We still have no firm testing evidence in support of this patch but believe it may help to resolve some bug reports. - agk Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2007-12-20[SPARC64]: Fix OOPS in dma_sync_*_for_device()David S. Miller
I included these operations vector cases for situations where we never need to do anything, the entries aren't filled in by any implementation, so we OOPS trying to invoke NULL pointer functions. Really make them NOPs, to fix the bug. Signed-off-by: David S. Miller <davem@davemloft.net>
2007-12-19Merge branch 'release' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: [IA64] Adjust CMCI mask on CPU hotplug [IA64] make flush_tlb_kernel_range() an inline function [IA64] Guard elfcorehdr_addr with #if CONFIG_PROC_FS [IA64] Fix Altix BTE error return status [IA64] Remove assembler warnings on head.S [IA64] Remove compiler warinings about uninitialized variable in irq_ia64.c [IA64] set_thread_area fails in IA32 chroot [IA64] print kernel release in OOPS to make kerneloops.org happy [IA64] Two trivial spelling fixes [IA64] Avoid unnecessary TLB flushes when allocating memory [IA64] ia32 nopage [IA64] signal: remove redundant code in setup_sigcontext() IA64: Slim down __clear_bit_unlock
2007-12-19[IA64] make flush_tlb_kernel_range() an inline functionJan Beulich
This fixes an unused variable warning in mm/vmalloc.c. Tony: also fix resulting fallout in uncached.c with a typo in args to flush_tlb_kernel_range(). Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-12-19[IA64] Fix Altix BTE error return statusRuss Anderson
The Altix shub2 BTE error detail bits are in a different location than on shub1. The current code does not take this into account resulting in all shub2 BTE failures mapping to "unknown". This patch reads the error detail bits from the proper location, so the correct BTE failure reason is returned for both shub1 and shub2. Signed-off-by: Russ Anderson <rja@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-12-18[IA64] Two trivial spelling fixesJoe Perches
s/addres/address/ s/performanc/performance/ Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-12-18IA64: Slim down __clear_bit_unlockChristoph Lameter
__clear_bit_unlock does not need to perform atomic operations on the variable. Avoid a cmpxchg and simply do a store with release semantics. Add a barrier to be safe that the compiler does not do funky things. Tony: Use intrinsic rather than inline assembler Signed-off-by: Christoph Lameter <clameter@sgi.com> Acked-by: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-12-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86: x86: fix "Kernel panic - not syncing: IO-APIC + timer doesn't work!" genirq: revert lazy irq disable for simple irqs x86: also define AT_VECTOR_SIZE_ARCH x86: kprobes bugfix x86: jprobe bugfix timer: kernel/timer.c section fixes genirq: add unlocked version of set_irq_handler() clockevents: fix reprogramming decision in oneshot broadcast oprofile: op_model_athlon.c support for AMD family 10h barcelona performance counters
2007-12-18x86: also define AT_VECTOR_SIZE_ARCHJan Beulich
The patch introducing this left out 64-bit x86 despite it also having extra entries. this solves Xen guest troubles. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-12-18x86: jprobe bugfixMasami Hiramatsu
jprobe for x86-64 may cause kernel page fault when the jprobe_return() is called from incorrect function. - Use jprobe_saved_regs instead getting it from stack. (Especially on x86-64, it may get incorrect data, because pt_regs can not be get by using container_of(rsp)) - Change the type of stack pointer to unsigned long *. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-12-18genirq: add unlocked version of set_irq_handler()Kevin Hilman
Add unlocked version for use by irq_chip.set_type handlers which may wish to change handler to level or edge handler when IRQ type is changed. The normal set_irq_handler() call cannot be used because it tries to take irq_desc.lock which is already held when the irq_chip.set_type hook is called. Signed-off-by: Kevin Hilman <khilman@mvista.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-12-18Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: Cleanup umem driver: fix most checkpatch warnings, conform to kernel block: let elv_register() return void as-iosched: fix write batch start point as-iosched: fix incorrect comments block: use jiffies conversion functions in scsi_ioctl.c
2007-12-18Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc: mmc: remove unused 'mode' from the mmc_host structure sdhci: support JMicron JMB38x chips sdhci: use PIO when DMA can't satisfy the request sdhci: don't warn about sdhci 2.0 controllers sdhci: describe quirks
2007-12-18block: let elv_register() return voidAdrian Bunk
elv_register() always returns 0, and there isn't anything it does where it should return an error (the only error condition is so grave that it's handled with a BUG_ON). Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-12-17Merge branch 'upstream-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: libata: fix ATAPI draining libata: update atapi_eh_request_sense() such that lbam/lbah contains buffer size libata-acpi: implement _GTF command filtering libata-acpi: improve _GTF execution error handling and reporting libata-acpi: improve ACPI disabling libata-acpi: implement dev->gtf_cache and evaluate _GTF right after _STM during resume libata-acpi: implement and use ata_acpi_init_gtm() libata-acpi: add new hooks ata_acpi_dissociate() and ata_acpi_on_disable() libata: ata_dev_disable() should be called from EH context libata: add more opcodes to ata.h libata: update ata_*_printk() macros such that level can be a variable libata-acpi: adjust constness in ata_acpi_gtm/stm() parameters sata_mv: improve warnings about Highpoint RocketRAID 23xx cards libata: add ST3160023AS / 3.42 to NCQ blacklist libata: clear link->eh_info.serror from ata_std_postreset() sata_sil: fix spurious IRQ handling
2007-12-17quicklist: Set tlb->need_flush if pages are remaining in quicklist 0Christoph Lameter
This ensures that the quicklists are drained. Otherwise draining may only occur when the processor reaches an idle state. Fixes fatal leakage of pgd_t's on 2.6.22 and later. Signed-off-by: Christoph Lameter <clameter@sgi.com> Reported-by: Dhaval Giani <dhaval@linux.vnet.ibm.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-12-17Revert "hugetlb: Add hugetlb_dynamic_pool sysctl"Nishanth Aravamudan
This reverts commit 54f9f80d6543fb7b157d3b11e2e7911dc1379790 ("hugetlb: Add hugetlb_dynamic_pool sysctl") Given the new sysctl nr_overcommit_hugepages, the boolean dynamic pool sysctl is not needed, as its semantics can be expressed by 0 in the overcommit sysctl (no dynamic pool) and non-0 in the overcommit sysctl (pool enabled). (Needed in 2.6.24 since it reverts a post-2.6.23 userspace-visible change) Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Acked-by: Adam Litke <agl@us.ibm.com> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-12-17hugetlb: introduce nr_overcommit_hugepages sysctlNishanth Aravamudan
hugetlb: introduce nr_overcommit_hugepages sysctl While examining the code to support /proc/sys/vm/hugetlb_dynamic_pool, I became convinced that having a boolean sysctl was insufficient: 1) To support per-node control of hugepages, I have previously submitted patches to add a sysfs attribute related to nr_hugepages. However, with a boolean global value and per-mount quota enforcement constraining the dynamic pool, adding corresponding control of the dynamic pool on a per-node basis seems inconsistent to me. 2) Administration of the hugetlb dynamic pool with multiple hugetlbfs mount points is, arguably, more arduous than it needs to be. Each quota would need to be set separately, and the sum would need to be monitored. To ease the administration, and to help make the way for per-node control of the static & dynamic hugepage pool, I added a separate sysctl, nr_overcommit_hugepages. This value serves as a high watermark for the overall hugepage pool, while nr_hugepages serves as a low watermark. The boolean sysctl can then be removed, as the condition nr_overcommit_hugepages > 0 indicates the same administrative setting as hugetlb_dynamic_pool == 1 Quotas still serve as local enforcement of the size of the pool on a per-mount basis. A few caveats: 1) There is a race whereby the global surplus huge page counter is incremented before a hugepage has allocated. Another process could then try grow the pool, and fail to convert a surplus huge page to a normal huge page and instead allocate a fresh huge page. I believe this is benign, as no memory is leaked (the actual pages are still tracked correctly) and the counters won't go out of sync. 2) Shrinking the static pool while a surplus is in effect will allow the number of surplus huge pages to exceed the overcommit value. As long as this condition holds, however, no more surplus huge pages will be allowed on the system until one of the two sysctls are increased sufficiently, or the surplus huge pages go out of use and are freed. Successfully tested on x86_64 with the current libhugetlbfs snapshot, modified to use the new sysctl. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Acked-by: Adam Litke <agl@us.ibm.com> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-12-17apm_event{,info}_t are userspace typesAdam Jackson
These types define the size of data read from /dev/apm_bios. They should not be hidden behind #ifdef __KERNEL__. This is killing my xserver compile, apm_event_t is used in the xserver source. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-12-17alpha: build fixesIvan Kokshaysky
This fixes some of the alpha-specific build problems, except a) modpost warning about COMMON symbol "saved_config" and b) nasty final link failure with gcc-4.x, -Os and scsi-disk driver configured built-in (due to jump table in .rodata referencing discarded .exit.text). - build failure with gcc-4.2.x: fix up casts in cia_io* routines to avoid warnings ('discards qualifiers from pointer target type'), which are failures, thanks to -Werror; - modpost warnings: add missing __init qualifier for titan and marvel; for non-generic build, move machine vectors from .data to .data.init.refok section; - unbreak CPU-specific optimization: rearrange cpuflags-y assignments so that extended -mcpu value (ev56, pca56, ev67) overrides basic one (ev5, ev6) and not vice versa. Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-12-17fix headers_installAndrew Morton
make[3]: *** No rule to make target `/usr/src/devel/include/linux/ticable.h', needed by `/usr/src/devel/usr/include/linux/ticable.h'. Stop. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-12-17libata: fix ATAPI drainingTejun Heo
With ATAPI transfer chunk size properly programmed, libata PIO HSM should be able to handle full spurious data chunks. Also, it's a good idea to suppress trailing data warning for misc ATAPI commands as there can be many of them per command - for example, if the chunk size is 16 and the drive tries to transfer 510 bytes, there can be 31 trailing data messages. This patch makes the following updates to libata ATAPI PIO HSM implementation. * Make it drain full spurious chunks. * Suppress trailing data warning message for misc commands. * Put limit on how many bytes can be drained. * If odd, round up consumed bytes and the number of bytes to be drained. This gets the number of bytes to drain right for drivers which do 16bit PIO. This patch is partial backport of improve-ATAPI-data-xfer patchset pending for #upstream. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-12-17libata-acpi: implement dev->gtf_cache and evaluate _GTF right after _STM ↵Tejun Heo
during resume On certain implementations, _GTF evaluation depends on preceding _STM and both can be pretty picky about the configuration. Using _GTM result cached during controller initialization satisfies the most neurotic _STM implementation. However, libata evaluates _GTF after reset during device configuration and the hardware state can be different from what _GTF expects and can cause evaluation failure. This patch adds dev->gtf_cache and updates ata_dev_get_GTF() such that it uses the cached value if available. Cache is cleared with a call to ata_acpi_clear_gtf(). Because for SATA ACPI nodes _GTF must be evaluated after _SDD which can't be done till IDENTIFY is complete, _GTF caching from ata_acpi_on_resume() is used only for IDE ACPI nodes. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-12-17libata-acpi: implement and use ata_acpi_init_gtm()Tejun Heo
_GTM fetches currently configured transfer mode while _STM configures controller according to _GTM parameter and prepares transfer mode configuration TFs for _GTF. In many cases _GTM and _STM implementations are quite brittle and can't cope with configuration changed by libata. libata does not depend on ATA ACPI to configure devices. The only reason libata performs _GTM and _STM are to make _GTF evaluation succeed and libata also doesn't care about how _GTF TFs configure transfer mode. It overrides that configuration anyway, so from libata's POV, it doesn't matter what value is feeded to _STM as long as evaluation succeeds for _STM and following _GTF. This patch adds dev->__acpi_init_gtm and store initial _GTM values on host initialization before modified by reset and mode configuration. If the field is valid, ata_acpi_init_gtm() returns pointer to the saved _GTM structure; otherwise, NULL. This saved value is used for _STM during resume and peek at BIOS/firmware programmed initial timing for later use. The accessor is there to make building w/o ACPI easy as dev->__acpi_init doesn't exist if ACPI is not enabled. On driver detach, the initial BIOS configuration is restored by executing _STM with the initial _GTM values such that the next driver can also use the initial BIOS configured values. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-12-17libata: add more opcodes to ata.hTejun Heo
Add constants for DEVICE CONFIGURATION OVERLAY and SET_MAX to include/linux/ata.h. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-12-17libata: update ata_*_printk() macros such that level can be a variableTejun Heo
Make prink helpers format @lv together rather than prepending to the format string as constant. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-12-17libata-acpi: adjust constness in ata_acpi_gtm/stm() parametersTejun Heo
* No internal function uses const ata_port. Drop const from @ap. * Make ata_acpi_stm() copy @stm before using it and change @stm to const. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-12-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6: HOWTO: update misspelling and word incorrected add stable_api_nonsense.txt in korean HOWTO: change addresses of maintainer and lxr url for Korean HOWTO Add Documentation for FAIR_USER_SCHED sysfs files HOWTO: Change man-page maintainer address for Japanese HOWTO tipar: remove obsolete module kobject: fix the documentation of how kobject_set_name works