aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2010-02-27x86, paravirt: Remove kmap_atomic_pte paravirt op.Ian Campbell
Now that both Xen and VMI disable allocations of PTE pages from high memory this paravirt op serves no further purpose. This effectively reverts ce6234b5 "add kmap_atomic_pte for mapping highpte pages". Signed-off-by: Ian Campbell <ian.campbell@citrix.com> LKML-Reference: <1267204562-11844-3-git-send-email-ian.campbell@citrix.com> Acked-by: Alok Kataria <akataria@vmware.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-27x86, vmi: Disable highmem PTE allocation even when CONFIG_HIGHPTE=yIan Campbell
Preventing HIGHPTE allocations under VMI will allow us to remove the kmap_atomic_pte paravirt op. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> LKML-Reference: <1267204562-11844-2-git-send-email-ian.campbell@citrix.com> Acked-by: Alok Kataria <akataria@vmware.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-27x86, xen: Disable highmem PTE allocation even when CONFIG_HIGHPTE=yIan Campbell
There's a path in the pagefault code where the kernel deliberately breaks its own locking rules by kmapping a high pte page without holding the pagetable lock (in at least page_check_address). This breaks Xen's ability to track the pinned/unpinned state of the page. There does not appear to be a viable workaround for this behaviour so simply disable HIGHPTE for all Xen guests. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> LKML-Reference: <1267204562-11844-1-git-send-email-ian.campbell@citrix.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Pasi Kärkkäinen <pasik@iki.fi> Cc: <stable@kernel.org> # .32.x: 14315592: Allow highmem user page tables to be disabled at boot time Cc: <stable@kernel.org> # .32.x Cc: <xen-devel@lists.xensource.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-25x86, mm: Unify kernel_physical_mapping_init() APIPekka Enberg
This patch changes the 32-bit version of kernel_physical_mapping_init() to return the last mapped address like the 64-bit one so that we can unify the call-site in init_memory_mapping(). Cc: Yinghai Lu <yinghai@kernel.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> LKML-Reference: <alpine.DEB.2.00.1002241703570.1180@melkki.cs.helsinki.fi> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-25x86, mm: Allow highmem user page tables to be disabled at boot timeIan Campbell
Distros generally (I looked at Debian, RHEL5 and SLES11) seem to enable CONFIG_HIGHPTE for any x86 configuration which has highmem enabled. This means that the overhead applies even to machines which have a fairly modest amount of high memory and which therefore do not really benefit from allocating PTEs in high memory but still pay the price of the additional mapping operations. Running kernbench on a 4G box I found that with CONFIG_HIGHPTE=y but no actual highptes being allocated there was a reduction in system time used from 59.737s to 55.9s. With CONFIG_HIGHPTE=y and highmem PTEs being allocated: Average Optimal load -j 4 Run (std deviation): Elapsed Time 175.396 (0.238914) User Time 515.983 (5.85019) System Time 59.737 (1.26727) Percent CPU 263.8 (71.6796) Context Switches 39989.7 (4672.64) Sleeps 42617.7 (246.307) With CONFIG_HIGHPTE=y but with no highmem PTEs being allocated: Average Optimal load -j 4 Run (std deviation): Elapsed Time 174.278 (0.831968) User Time 515.659 (6.07012) System Time 55.9 (1.07799) Percent CPU 263.8 (71.266) Context Switches 39929.6 (4485.13) Sleeps 42583.7 (373.039) This patch allows the user to control the allocation of PTEs in highmem from the command line ("userpte=nohigh") but retains the status-quo as the default. It is possible that some simple heuristic could be developed which allows auto-tuning of this option however I don't have a sufficiently large machine available to me to perform any particularly meaningful experiments. We could probably handwave up an argument for a threshold at 16G of total RAM. Assuming 768M of lowmem we have 196608 potential lowmem PTE pages. Each page can map 2M of RAM in a PAE-enabled configuration, meaning a maximum of 384G of RAM could potentially be mapped using lowmem PTEs. Even allowing generous factor of 10 to account for other required lowmem allocations, generous slop to account for page sharing (which reduces the total amount of RAM mappable by a given number of PT pages) and other innacuracies in the estimations it would seem that even a 32G machine would not have a particularly pressing need for highmem PTEs. I think 32G could be considered to be at the upper bound of what might be sensible on a 32 bit machine (although I think in practice 64G is still supported). It's seems questionable if HIGHPTE is even a win for any amount of RAM you would sensibly run a 32 bit kernel on rather than going 64 bit. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> LKML-Reference: <1266403090-20162-1-git-send-email-ian.campbell@citrix.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-25x86: Do not reserve brk for DMI if it's not going to be usedThadeu Lima de Souza Cascardo
This will save 64K bytes from memory when loading linux if DMI is disabled, which is good for embedded systems. Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com> LKML-Reference: <1265758732-19320-1-git-send-email-cascardo@holoscopio.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-17x86: Convert tlbstate_lock to raw_spinlockThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-02-17Merge branch 'linus' into x86/mmThomas Gleixner
x86/mm is on 32-rc4 and missing the spinlock namespace changes which are needed for further commits into this topic. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-02-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6: serial: 8250: add serial transmitter fully empty test
2010-02-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: USB: gadget: fix EEM gadget CRC usage USB: otg Kconfig: let USB_OTG_UTILS select USB_ULPI option USB: g_multi: fix CONFIG_USB_G_MULTI_RNDIS usage kfifo: Don't use integer as NULL pointer USB: FHCI: Fix build after kfifo rework kfifo: Make kfifo_initialized work after kfifo_free USB: serial: add usbid for dell wwan card to sierra.c USB: SIS USB2VGA DRIVER: support KAIREN's USB VGA adaptor USB20SVGA-MB-PLUS USB: ehci: phy low power mode bug fixing USB: s3c-hsotg: Export usb_gadget_register_driver() USB: r8a66597-udc: Prototype IS_ERR() and PTR_ERR() USB: ftdi_sio: add device IDs (several ELV, one Mindstorms NXT) USB: storage: Remove unneeded SC/PR from unusual_devs.h USB: ftdi_sio: new device id for papouch AD4USB USB: usbfs: properly clean up the as structure on error paths USB: usbfs: only copy the actual data received
2010-02-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: class: Free the class private data in class_release sysfs: sysfs_sd_setattr set iattrs unconditionally
2010-02-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (22 commits) be2net: set proper value to version field in req hdr xfrm: Fix xfrm_state_clone leak ipcomp: Avoid duplicate calls to ipcomp_destroy ethtool: allow non-admin user to read GRO settings. ixgbe: fix WOL register setup for 82599 ixgbe: Fix - Do not allow Rx FC on 82598 at 1G due to errata sfc: Fix SFE4002 initialisation mac80211: fix handling of null-rate control in rate_control_get_rate inet: Remove bogus IGMPv3 report handling iwlwifi: fix AMSDU Rx after paged Rx patch tcp: fix ICMP-RTO war via-velocity: Fix races on shared interrupts via-velocity: Take spinlock on set coalesce via-velocity: Remove unused IRQ status parameter from rx_srv and tx_srv rtl8187: Add new device ID iwmc3200wifi: Test of wrong pointer after kzalloc in iwm_mlme_update_bss_table() ath9k: Fix sequence numbers for PAE frames mac80211: fix deferred hardware scan requests iwlwifi: Fix to set correct ht configuration mac80211: Fix probe request filtering in IBSS mode ...
2010-02-16serial: 8250: add serial transmitter fully empty testDick Hollenbeck
When controlling an industrial radio modem it can be necessary to manipulate the handshake lines in order to control the radio modem's transmitter, from userspace. The transmitter should not be turned off before all characters have been transmitted. serial8250_tx_empty() was reporting that all characters were transmitted before they actually were. === Discovered in parallel with more testing and analysis by Kees Schoenmakers as follows: I ran into an NetMos 9835 serial pci board which behaves a little different than the standard. This type of expansion board is very common. "Standard" 8250 compatible devices clear the 'UART_LST_TEMT" bit together with the "UART_LSR_THRE" bit when writing data to the device. The NetMos device does it slightly different I believe that the TEMT bit is coupled to the shift register. The problem is that after writing data to the device and very quickly after that one does call serial8250_tx_empty, it returns the wrong information. My patch makes the test more robust (and solves the problem) and it does not affect the already correct devices. Alan: We may yet need to quirk this but now we know which chips we have a way to do that should we find this breaks some other 8250 clone with dodgy THRE. Signed-off-by: Dick Hollenbeck <dick@softplc.com> Signed-off-by: Alan Cox <alan@linux.intel.com> Cc: Kees Schoenmakers <k.schoenmakers@sigmae.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-02-16class: Free the class private data in class_releaseLaurent Pinchart
Fix a memory leak by freeing the memory allocated in __class_register for the class private data. Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-02-16sysfs: sysfs_sd_setattr set iattrs unconditionallyEric W. Biederman
There is currently a bug in sysfs_sd_setattr inherited from sysfs_setattr in 2.6.32 where the first time we set the attributes on a sysfs file we allocate backing store but do not set the backing store attributes. Resulting in overly restrictive permissions on sysfs files. The fix is to simply modify the code so that it always executes when we update the sysfs attributes, as we did in 2.6.31 and earlier. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Tested-by: Jean Delvare <khali@linux-fr.org> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-02-16USB: gadget: fix EEM gadget CRC usageBrian Niebuhr
eem_wrap() is sending a sentinel CRC, but it didn't indicate that to the host, it should zero bit 14 (bmCRC) in the EEM packet header, instead of setting it. Also remove a redundant crc calculation in eem_unwrap(). Signed-off-by: Steve Longerbeam <stevel@netspectrum.com> Acked-by: Brian Niebuhr <bniebuhr@efjohnson.com> Acked-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-02-16USB: otg Kconfig: let USB_OTG_UTILS select USB_ULPI optionValentin Longchamp
With CONFIG_USB_ULPI=y, CONFIG_USB<=m, CONFIG_PCI=n and CONFIG_USB_OTG_UTILS=n, which is the default used for mx31moboard, the build for all mx3 platforms fails because drivers/usb/otg/ulpi.c where otg_ulpi_create is defined is not compiled. Build error: arch/arm/mach-mx3/built-in.o: In function `mxc_board_init': kzmarm11.c:(.init.text+0x73c): undefined reference to `otg_ulpi_create' kzmarm11.c:(.init.text+0x1020): undefined reference to `otg_ulpi_create' This isn't a strong dependency as drivers/usb/otg/ulpi.c doesn't use functions defined in drivers/usb/otg/otg.o and is only needed to get ulpi.o linked into the kernel image. Signed-off-by: Valentin Longchamp <valentin.longchamp@epfl.ch> Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-02-16USB: g_multi: fix CONFIG_USB_G_MULTI_RNDIS usageMichal Nazarewicz
g_multi used CONFIG_USB_ETH_RNDIS to check if RNDIS option was requested where it should check for CONFIG_USB_G_MULTI_RNDIS. As a result, RNDIS was never present in g_multi regardless of configuration. This fixes changes made in commit 396cda90d228d0851f3d64c7c85a1ecf6b8ae1e8. Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-02-16kfifo: Don't use integer as NULL pointerAnton Vorontsov
This patch fixes following sparse warnings: include/linux/kfifo.h:127:25: warning: Using plain integer as NULL pointer kernel/kfifo.c:83:21: warning: Using plain integer as NULL pointer Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Acked-by: Stefani Seibold <stefani@seibold.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-02-16USB: FHCI: Fix build after kfifo reworkAnton Vorontsov
After kfifo rework FHCI fails to build: CC drivers/usb/host/fhci-tds.o drivers/usb/host/fhci-tds.c: In function 'fhci_ep0_free': drivers/usb/host/fhci-tds.c:108: error: used struct type value where scalar is required drivers/usb/host/fhci-tds.c:118: error: used struct type value where scalar is required drivers/usb/host/fhci-tds.c:128: error: used struct type value where scalar is required This is because kfifos are no longer pointers in the ep struct. So, instead of checking the pointers, we should now check if kfifo is initialized. Reported-by: Josh Boyer <jwboyer@gmail.com> Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Acked-by: Stefani Seibold <stefani@seibold.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-02-16kfifo: Make kfifo_initialized work after kfifo_freeAnton Vorontsov
After kfifo rework it's no longer possible to reliably know if kfifo is usable, since after kfifo_free(), kfifo_initialized() would still return true. The correct behaviour is needed for at least FHCI USB driver. This patch fixes the issue by resetting the kfifo to zero values (the same approach is used in kfifo_alloc() if allocation failed). Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Acked-by: Stefani Seibold <stefani@seibold.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-02-16USB: serial: add usbid for dell wwan card to sierra.cRichard Farina
This patch adds support for Dell Computer Corp. Wireless 5720 VZW Mobile Broadband (EVDO Rev-A) Minicard GPS Port. I stole the name from lsusb, but my card does not have a GPS on it (at least not that I can make function). I'm sure the patch is whitespace damaged but the one line addition should be fairly straightforward nonetheless. Tested-by: Rick Farina <sidhayn@gmail.com> Signed-off-by: Rick Farina <sidhayn@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-02-16USB: SIS USB2VGA DRIVER: support KAIREN's USB VGA adaptor USB20SVGA-MB-PLUSTanaka Akira
This patch adds the USB product ID of KAIREN's USB VGA Adaptor, USB20SVGA-MB-PLUS, to sisusbvga work with it. Signed-off-by: Tanaka Akira <akr@fsij.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-02-16USB: ehci: phy low power mode bug fixingAlek Du
1. There are two msleep calls inside two spin lock sections, need to unlock and lock again after msleep. 2. Save a extra status reg setting. Signed-off-by: Alek Du <alek.du@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-02-16USB: s3c-hsotg: Export usb_gadget_register_driver()Mark Brown
USB gadget controller drivers normally export their driver registration function, allowing modular builds of the individual gadget drivers so do so for s3c-hsotg, fixing builds. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-02-16USB: r8a66597-udc: Prototype IS_ERR() and PTR_ERR()Mark Brown
The build of r8a66597-udc was failing on ARM since IS_ERR() and PTR_ERR() weren't protyped. Presumably err.h is being pulled in by another header on other platforms. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Acked-by: Yoshihiro Shimoda <shimoda.yoshihiro@renesas.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-02-16USB: ftdi_sio: add device IDs (several ELV, one Mindstorms NXT)Andreas Mohr
- add FTDI device IDs for several ELV devices and NXTCam of Lego Mindstorms NXT - add hopefully helpful new_id comment - remove less helpful "Due to many user requests for multiple ELV devices we enable them by default." comment (we simply add _all_ known devices - an enduser shouldn't have to fiddle with obscure module parameters...). - add myself to DRIVER_AUTHOR The missing NXTCam ID has been found at http://www.unixboard.de/vb3/showthread.php?t=44155 , ELV devices taken from ELV Windows .inf file. Signed-off-by: Andreas Mohr <andi@lisas.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-02-16USB: storage: Remove unneeded SC/PR from unusual_devs.hPhil Dibowitz
This patch removes the subclass and protocol entries from a Microtech entry in unusual_devs.h. This was reported by <ryck@pacbell.net>. Greg, please apply. Signed-off-by: Phil Dibowitz <phil@ipom.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-02-16USB: ftdi_sio: new device id for papouch AD4USBRadek Liboska
added new device pid (PAPOUCH_AD4USB_PID) to ftdi_sio.h and ftdi_sio.c AD4USB measuring converter is a 4-input A/D converter which enables the user to measure to four current inputs ranging from 0(4) to 20 mA or voltage between 0 and 10 V. The measured values are then transferred to a superior system in digital form. The AD4USB communicates via USB. Powered is also via USB. datasheet in english is here: http://www.papouch.com/shop/scripts/pdf/ad4usb_en.pdf Signed-off-by: Radek Liboska <liboska@uochb.cas.cz>
2010-02-16USB: usbfs: properly clean up the as structure on error pathsLinus Torvalds
I notice that the processcompl_compat() function seems to be leaking the 'struct async *as' in the error paths. I think that the calling convention is fundamentally buggered. The caller is the one that did the "reap_as()" to get the as thing, the caller should be the one to free it too. Freeing it in the caller also means that it very clearly always gets freed, and avoids the need for any "free in the error case too". From: Linus Torvalds <torvalds@linux-foundation.org> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Marcus Meissner <meissner@suse.de> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-02-16USB: usbfs: only copy the actual data receivedGreg KH
We need to only copy the data received by the device to userspace, not the whole kernel buffer, which can contain "stale" data. Thanks to Marcus Meissner for pointing this out and testing the fix. Reported-by: Marcus Meissner <meissner@suse.de> Tested-by: Marcus Meissner <meissner@suse.de> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-02-16be2net: set proper value to version field in req hdrAjit Khaparde
Before sending a command to the ASIC, set version properly. This is necessary for the ARM firmware to send correct data to the driver. This also fixes a bug in certain skews of the ASIC where the statistics are misreported. Signed-off-by: Ajit Khaparde <ajitk@serverengines.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-16xfrm: Fix xfrm_state_clone leakHerbert Xu
xfrm_state_clone calls kfree instead of xfrm_state_put to free a failed state. Depending on the state of the failed state, it can cause leaks to things like module references. All states should be freed by xfrm_state_put past the point of xfrm_init_state. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-16ipcomp: Avoid duplicate calls to ipcomp_destroyHerbert Xu
When ipcomp_tunnel_attach fails we will call ipcomp_destroy twice. This may lead to double-frees on certain structures. As there is no reason to explicitly call ipcomp_destroy, this patch removes it from ipcomp*.c and lets the standard xfrm_state destruction take place. This is based on the discovery and patch by Alexey Dobriyan. Tested-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-16ethtool: allow non-admin user to read GRO settings.stephen hemminger
Looks like an oversight in GRO design. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dmLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm: dm: sysfs revert add empty release function to avoid debug warning dm mpath: fix stall when requeueing io dm raid1: fix null pointer dereference in suspend dm raid1: fail writes if errors are not handled and log fails dm log: userspace fix overhead_size calcuations dm snapshot: persistent annotate work_queue as on stack dm stripe: avoid divide by zero with invalid stripe count
2010-02-16Merge branch 'release' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: [IA64] preserve personality flag bits across exec
2010-02-16dm: sysfs revert add empty release function to avoid debug warningAlasdair G Kergon
Revert commit d2bb7df8cac647b92f51fb84ae735771e7adbfa7 at Greg's request. Author: Milan Broz <mbroz@redhat.com> Date: Thu Dec 10 23:51:53 2009 +0000 dm: sysfs add empty release function to avoid debug warning This patch just removes an unnecessary warning: kobject: 'dm': does not have a release() function, it is broken and must be fixed. The kobject is embedded in mapped device struct, so code does not need to release memory explicitly here. Cc: Greg KH <gregkh@suse.de> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2010-02-16dm mpath: fix stall when requeueing ioKiyoshi Ueda
This patch fixes the problem that system may stall if target's ->map_rq returns DM_MAPIO_REQUEUE in map_request(). E.g. stall happens on 1 CPU box when a dm-mpath device with queue_if_no_path bounces between all-paths-down and paths-up on I/O load. When target's ->map_rq returns DM_MAPIO_REQUEUE, map_request() requeues the request and returns to dm_request_fn(). Then, dm_request_fn() doesn't exit the I/O dispatching loop and continues processing the requeued request again. This map and requeue loop can be done with interrupt disabled, so 1 CPU system can be stalled if this situation happens. For example, commands below can stall my 1 CPU box within 1 minute or so: # dmsetup table mp mp: 0 2097152 multipath 1 queue_if_no_path 0 1 1 service-time 0 1 2 8:144 1 1 # while true; do dd if=/dev/mapper/mp of=/dev/null bs=1M count=100; done & # while true; do \ > dmsetup message mp 0 "fail_path 8:144" \ > dmsetup suspend --noflush mp \ > dmsetup resume mp \ > dmsetup message mp 0 "reinstate_path 8:144" \ > done To fix the problem above, this patch changes dm_request_fn() to exit the I/O dispatching loop once if a request is requeued in map_request(). Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Cc: stable@kernel.org Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2010-02-16dm raid1: fix null pointer dereference in suspendTakahiro Yasui
When suspending a failed mirror, bios are completed by mirror_end_io() and __rh_lookup() in dm_rh_dec() returns NULL where a non-NULL return value is required by design. Fix this by not changing the state of the recovery failed region from DM_RH_RECOVERING to DM_RH_NOSYNC in dm_rh_recovery_end(). Issue On 2.6.33-rc1 kernel, I hit the bug when I suspended the failed mirror by dmsetup command. BUG: unable to handle kernel NULL pointer dereference at 00000020 IP: [<f94f38e2>] dm_rh_dec+0x35/0xa1 [dm_region_hash] ... EIP: 0060:[<f94f38e2>] EFLAGS: 00010046 CPU: 0 EIP is at dm_rh_dec+0x35/0xa1 [dm_region_hash] EAX: 00000286 EBX: 00000000 ECX: 00000286 EDX: 00000000 ESI: eff79eac EDI: eff79e80 EBP: f6915cd4 ESP: f6915cc4 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 Process dmsetup (pid: 2849, ti=f6914000 task=eff03e80 task.ti=f6914000) ... Call Trace: [<f9530af6>] ? mirror_end_io+0x53/0x1b1 [dm_mirror] [<f9413104>] ? clone_endio+0x4d/0xa2 [dm_mod] [<f9530aa3>] ? mirror_end_io+0x0/0x1b1 [dm_mirror] [<f94130b7>] ? clone_endio+0x0/0xa2 [dm_mod] [<c02d6bcb>] ? bio_endio+0x28/0x2b [<f952f303>] ? hold_bio+0x2d/0x62 [dm_mirror] [<f952f942>] ? mirror_presuspend+0xeb/0xf7 [dm_mirror] [<c02aa3e2>] ? vmap_page_range+0xb/0xd [<f9414c8d>] ? suspend_targets+0x2d/0x3b [dm_mod] [<f9414ca9>] ? dm_table_presuspend_targets+0xe/0x10 [dm_mod] [<f941456f>] ? dm_suspend+0x4d/0x150 [dm_mod] [<f941767d>] ? dev_suspend+0x55/0x18a [dm_mod] [<c0343762>] ? _copy_from_user+0x42/0x56 [<f9417fb0>] ? dm_ctl_ioctl+0x22c/0x281 [dm_mod] [<f9417628>] ? dev_suspend+0x0/0x18a [dm_mod] [<f9417d84>] ? dm_ctl_ioctl+0x0/0x281 [dm_mod] [<c02c3c4b>] ? vfs_ioctl+0x22/0x85 [<c02c422c>] ? do_vfs_ioctl+0x4cb/0x516 [<c02c42b7>] ? sys_ioctl+0x40/0x5a [<c0202858>] ? sysenter_do_call+0x12/0x28 Analysis When recovery process of a region failed, dm_rh_recovery_end() function changes the state of the region from RM_RH_RECOVERING to DM_RH_NOSYNC. When recovery_complete() is executed between dm_rh_update_states() and dm_writes() in do_mirror(), bios are processed with the region state, DM_RH_NOSYNC. However, the region data is freed without checking its pending count when dm_rh_update_states() is called next time. When bios are finished by mirror_end_io(), __rh_lookup() in dm_rh_dec() returns NULL even though a valid return value are expected. Solution Remove the state change of the recovery failed region from DM_RH_RECOVERING to DM_RH_NOSYNC in dm_rh_recovery_end(). We can remove the state change because: - If the region data has been released by dm_rh_update_states(), a new region data is created with the state of DM_RH_NOSYNC, and bios are processed according to the DM_RH_NOSYNC state. - If the region data has not been released by dm_rh_update_states(), a state of the region is DM_RH_RECOVERING and bios are put in the delayed_bio list. The flag change from DM_RH_RECOVERING to DM_RH_NOSYNC in dm_rh_recovery_end() was added in the following commit: dm raid1: handle resync failures author Jonathan Brassow <jbrassow@redhat.com> Thu, 12 Jul 2007 16:29:04 +0000 (17:29 +0100) http://git.kernel.org/linus/f44db678edcc6f4c2779ac43f63f0b9dfa28b724 Signed-off-by: Takahiro Yasui <tyasui@redhat.com> Reviewed-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2010-02-16dm raid1: fail writes if errors are not handled and log failsMikulas Patocka
If the mirror log fails when the handle_errors option was not selected and there is no remaining valid mirror leg, writes return success even though they weren't actually written to any device. This patch completes them with EIO instead. This code path is taken: do_writes: bio_list_merge(&ms->failures, &sync); do_failures: if (!get_valid_mirror(ms)) (false) else if (errors_handled(ms)) (false) else bio_endio(bio, 0); The logic in do_failures is based on presuming that the write was already tried: if it succeeded at least on one leg (without handle_errors) it is reported as success. Reference: https://bugzilla.redhat.com/show_bug.cgi?id=555197 Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2010-02-16dm log: userspace fix overhead_size calcuationsJonathan Brassow
This patch fixes two bugs that revolve around the miscalculation and misuse of the variable 'overhead_size'. 'overhead_size' is the size of the various header structures used during communication. The first bug is the use of 'sizeof' with the pointer of a structure instead of the structure itself - resulting in the wrong size being computed. This is then used in a check to see if the payload (data_size) would be to large for the preallocated structure. Since the bug produces a smaller value for the overhead, it was possible for the structure to be breached. (Although the current users of the code do not currently send enough data to trigger this bug.) The second bug is that the 'overhead_size' value is used to compute how much of the preallocated space should be cleared before populating it with fresh data. This should have simply been 'sizeof(struct cn_msg)' not overhead_size. The fact that 'overhead_size' was computed incorrectly made this problem "less bad" - leaving only a pointer's worth of space at the end uncleared. Thus, this bug was never producing a bad result, but still needs to be fixed - especially now that the value is computed correctly. Cc: stable@kernel.org Signed-off-by: Jonathan Brassow <jbrassow@redhat.com Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2010-02-16dm snapshot: persistent annotate work_queue as on stackMike Snitzer
chunk_io() declares its 'struct mdata_req' on the stack and then initializes its 'struct work_struct' member. Annotate the initialization of this workqueue with INIT_WORK_ON_STACK to suppress a debugobjects warning seen when CONFIG_DEBUG_OBJECTS_WORK is enabled. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2010-02-16dm stripe: avoid divide by zero with invalid stripe countNikanth Karthikesan
If a table containing zero as stripe count is passed into stripe_ctr the code attempts to divide by zero. This patch changes DM_TABLE_LOAD to return -EINVAL if the stripe count is zero. We now get the following error messages: device-mapper: table: 253:0: striped: Invalid stripe count device-mapper: ioctl: error adding target to table Signed-off-by: Nikanth Karthikesan <knikanth@suse.de> Cc: stable@kernel.org Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2010-02-16x86: ELF_PLAT_INIT() shouldn't worry about TIF_IA32Oleg Nesterov
The 64-bit version of ELF_PLAT_INIT() clears TIF_IA32, but at this point it has already been cleared by SET_PERSONALITY == set_personality_64bit. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-02-16x86: set_personality_ia32() misses force_personality32Oleg Nesterov
05d43ed8a "x86: get rid of the insane TIF_ABI_PENDING bit" forgot about force_personality32. Fix. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-02-15ixgbe: fix WOL register setup for 82599Don Skidmore
We need to have the WUS register set to all 1's in order for the hardware to be capable of ever waking up. Set it here in the ixgbe_probe(). Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-15ixgbe: Fix - Do not allow Rx FC on 82598 at 1G due to errataDon Skidmore
The 82598 has an erratum that receipt of pause frames at 1G could lead to a Tx Hang. To avoid this this patch disables Rx FC while at 1G speed for all 82598 parts. Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstableLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: Btrfs: btrfs_mark_extent_written uses the wrong slot
2010-02-15Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6: firewire: ohci: retransmit isochronous transmit packets on cycle loss firewire: net: fix panic in fwnet_write_complete