aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2013-05-11netfilter: xt_rpfilter: skip locally generated broadcast/multicast, tooFlorian Westphal
commit f83a7ea2075ca896f2dbf07672bac9cf3682ff74 upstream. Alex Efros reported rpfilter module doesn't match following packets: IN=br.qemu SRC=192.168.2.1 DST=192.168.2.255 [ .. ] (netfilter bugzilla #814). Problem is that network stack arranges for the locally generated broadcasts to appear on the interface they were sent out, so the IFF_LOOPBACK check doesn't trigger. As -m rpfilter is restricted to PREROUTING, we can check for existing rtable instead, it catches locally-generated broad/multicast case, too. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-11netfilter: ctnetlink: don't permit ct creation with random tupleFlorian Westphal
commit 442fad9423b78319e0019a7f5047eddf3317afbc upstream. Userspace can cause kernel panic by not specifying orig/reply tuple: kernel will create a tuple with random stack values. Problem is that tuple.dst.dir will be random, too, which causes nf_ct_tuplehash_to_ctrack() to return garbage. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@gnumonks.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-11netfilter: nf_ct_helper: don't discard helper if it is actually the sameFlorian Westphal
commit 6e2f0aa8cf8892868bf2c19349cb5d7c407f690d upstream. commit (32f5376 netfilter: nf_ct_helper: disable automatic helper re-assignment of different type) broke transparent proxy scenarios. For example, initial helper lookup might yield "ftp" (dport 21), while re-lookup after REDIRECT yields "ftp-2121". This causes the autoassign code to toss the ftp helper, even though these are just different instances of the same helper. Change the test to check for the helper function address instead of the helper address, as suggested by Pablo. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@gnumonks.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-11netfilter: ipset: "Directory not empty" error messageJozsef Kadlecsik
commit dd82088dab3646ed28e4aa43d1a5b5d5ffc2afba upstream. When an entry flagged with "nomatch" was tested by ipset, it returned the error message "Kernel error received: Directory not empty" instead of "<element> is NOT in set <setname>" (reported by John Brendler). The internal error code was not properly transformed before returning to userspace, fixed. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-11netfilter: nf_ct_sip: don't drop packets with offsets pointing outside the ↵Patrick McHardy
packet commit 3a7b21eaf4fb3c971bdb47a98f570550ddfe4471 upstream. Some Cisco phones create huge messages that are spread over multiple packets. After calculating the offset of the SIP body, it is validated to be within the packet and the packet is dropped otherwise. This breaks operation of these phones. Since connection tracking is supposed to be passive, just let those packets pass unmodified and untracked. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-11netfilter: ipset: list:set: fix reference counter updateJozsef Kadlecsik
commit 02f815cb6d3f57914228be84df9613ee5a01c2e6 upstream. The last element can be replaced or pushed off and in both cases the reference counter must be updated. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-11netfilter: nf_nat: fix race when unloading protocol modulesFlorian Westphal
commit c2d421e171868586939c328dfb91bab840fe4c49 upstream. following oops was reported: RIP: 0010:[<ffffffffa03227f2>] [<ffffffffa03227f2>] nf_nat_cleanup_conntrack+0x42/0x70 [nf_nat] RSP: 0018:ffff880202c63d40 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff8801ac7bec28 RCX: ffff8801d0eedbe0 RDX: dead000000200200 RSI: 0000000000000011 RDI: ffffffffa03265b8 [..] Call Trace: [..] [<ffffffffa02febed>] destroy_conntrack+0xbd/0x110 [nf_conntrack] Happens when a conntrack timeout expires right after first part of the nat cleanup has completed (bysrc hash removal), but before part 2 has completed (re-initialization of nat area). [ destroy callback tries to delete bysrc again ] Patrick suggested to just remove the affected conntracks -- the connections won't work properly anyway without nat transformation. So, lets do that. Reported-by: CAI Qian <caiqian@redhat.com> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-11ipvs: ip_vs_sip_fill_param() BUG: bad check of return valueHans Schillstrom
commit f7a1dd6e3ad59f0cfd51da29dfdbfd54122c5916 upstream. The reason for this patch is crash in kmemdup caused by returning from get_callid with uniialized matchoff and matchlen. Removing Zero check of matchlen since it's done by ct_sip_get_header() BUG: unable to handle kernel paging request at ffff880457b5763f IP: [<ffffffff810df7fc>] kmemdup+0x2e/0x35 PGD 27f6067 PUD 0 Oops: 0000 [#1] PREEMPT SMP Modules linked in: xt_state xt_helper nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_mangle xt_connmark xt_conntrack ip6_tables nf_conntrack_ftp ip_vs_ftp nf_nat xt_tcpudp iptable_mangle xt_mark ip_tables x_tables ip_vs_rr ip_vs_lblcr ip_vs_pe_sip ip_vs nf_conntrack_sip nf_conntrack bonding igb i2c_algo_bit i2c_core CPU 5 Pid: 0, comm: swapper/5 Not tainted 3.9.0-rc5+ #5 /S1200KP RIP: 0010:[<ffffffff810df7fc>] [<ffffffff810df7fc>] kmemdup+0x2e/0x35 RSP: 0018:ffff8803fea03648 EFLAGS: 00010282 RAX: ffff8803d61063e0 RBX: 0000000000000003 RCX: 0000000000000003 RDX: 0000000000000003 RSI: ffff880457b5763f RDI: ffff8803d61063e0 RBP: ffff8803fea03658 R08: 0000000000000008 R09: 0000000000000011 R10: 0000000000000011 R11: 00ffffffff81a8a3 R12: ffff880457b5763f R13: ffff8803d67f786a R14: ffff8803fea03730 R15: ffffffffa0098e90 FS: 0000000000000000(0000) GS:ffff8803fea00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff880457b5763f CR3: 0000000001a0c000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process swapper/5 (pid: 0, threadinfo ffff8803ee18c000, task ffff8803ee18a480) Stack: ffff8803d822a080 000000000000001c ffff8803fea036c8 ffffffffa000937a ffffffff81f0d8a0 000000038135fdd5 ffff880300000014 ffff880300110000 ffffffff150118ac ffff8803d7e8a000 ffff88031e0118ac 0000000000000000 Call Trace: <IRQ> [<ffffffffa000937a>] ip_vs_sip_fill_param+0x13a/0x187 [ip_vs_pe_sip] [<ffffffffa007b209>] ip_vs_sched_persist+0x2c6/0x9c3 [ip_vs] [<ffffffff8107dc53>] ? __lock_acquire+0x677/0x1697 [<ffffffff8100972e>] ? native_sched_clock+0x3c/0x7d [<ffffffff8100972e>] ? native_sched_clock+0x3c/0x7d [<ffffffff810649bc>] ? sched_clock_cpu+0x43/0xcf [<ffffffffa007bb1e>] ip_vs_schedule+0x181/0x4ba [ip_vs] ... Signed-off-by: Hans Schillstrom <hans@schillstrom.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-11xhci: Don't warn on empty ring for suspended devices.Sarah Sharp
commit a83d6755814e4614ba77e15d82796af0f695c6b8 upstream. When a device attached to the roothub is suspended, the endpoint rings are stopped. The host may generate a completion event with the completion code set to 'Stopped' or 'Stopped Invalid' when the ring is halted. The current xHCI code prints a warning in that case, which can be really annoying if the USB device is coming into and out of suspend. Remove the unnecessary warning. Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Tested-by: Stephen Hemminger <stephen@networkplumber.org> Cc: Luis Henriques <luis.henriques@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-11e1000e: fix accessing to suspended deviceKonstantin Khlebnikov
commit e60b22c5b7e59db09a7c9490b1e132c7e49ae904 upstream. This patch fixes some annoying messages like 'Error reading PHY register' and 'Hardware Erorr' and saves several seconds on reboot. Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Bruce Allan <bruce.w.allan@intel.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Borislav Petkov <bp@suse.de> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Tested-by: Tóth Attila <atoth@atoth.sote.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-11e1000e: fix runtime power management transitionsKonstantin Khlebnikov
commit 66148babe728f3e00e13c56f6b0ecf325abd80da upstream. This patch removes redundant actions from driver and fixes its interaction with actions in pci-bus runtime power management code. It removes pci_save_state() from __e1000_shutdown() for normal adapters, PCI bus callbacks pci_pm_*() will do all this for us. Now __e1000_shutdown() switches to D3-state only quad-port adapters, because they needs quirk for clearing false-positive error from downsteam pci-e port. pci_save_state() now called after clearing bus-master bit, thus __e1000_resume() and e1000_io_slot_reset() must set it back after restoring configuration space. This patch set get_link_status before calling pm_runtime_put() in e1000_open() to allow e1000_idle() get real link status and schedule first runtime suspend. This patch also enables wakeup for device if management mode is enabled (like for WoL) as result pci_prepare_to_sleep() would setup wakeup without special actions like custom 'enable_wakeup' sign. Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Bruce Allan <bruce.w.allan@intel.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Borislav Petkov <bp@suse.de> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Tested-by: Tóth Attila <atoth@atoth.sote.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-11PCI/PM: Clear state_saved during suspendRafael J. Wysocki
commit 82fee4d67ab86d6fe5eb0f9a9e988ca9d654d765 upstream. This patch clears pci_dev->state_saved at the beginning of suspending. PCI config state may be saved long before that. Some drivers call pci_save_state() from the ->probe() callback to get snapshot of sane configuration space to use in the ->slot_reset() callback. Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> # add comment Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Tóth Attila <atoth@atoth.sote.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-11perf/x86/intel/lbr: Demand proper privileges for PERF_SAMPLE_BRANCH_KERNELPeter Zijlstra
commit 7cc23cd6c0c7d7f4bee057607e7ce01568925717 upstream. We should always have proper privileges when requesting kernel data. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andi Kleen <ak@linux.intel.com> Cc: eranian@google.com Link: http://lkml.kernel.org/r/20130503121256.230745028@chello.nl [ Fix build error reported by fengguang.wu@intel.com, propagate error code back. ] Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: http://lkml.kernel.org/n/tip-v0x9ky3ahzr6nm3c6ilwrili@git.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-11perf/x86/intel/lbr: Fix LBR filterPeter Zijlstra
commit 6e15eb3ba6c0249c9e8c783517d131b47db995ca upstream. The LBR 'from' adddress is under full userspace control; ensure we validate it before reading from it. Note: is_module_text_address() can potentially be quite expensive; for those running into that with high overhead in modules optimize it using an RCU backed rb-tree. Reported-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: eranian@google.com Link: http://lkml.kernel.org/r/20130503121256.158211806@chello.nl Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: http://lkml.kernel.org/n/tip-mk8i82ffzax01cnqo829iy1q@git.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-11perf/x86/intel: Fix unintended variable name reuseJan-Simon Möller
commit 1b0dac2ac6debdbf1541e15f2cede03613cf4465 upstream. The variable name events_group is already in used and led to a compilation error when using clang to build the Linux Kernel . The fix is just to rename the var. No functional change. Please apply. Fix suggested in discussion by PaX Team <pageexec@freemail.hu> Signed-off-by: Jan-Simon Möller <dl9pf@gmx.de> Cc: rostedt@goodmis.org Cc: a.p.zijlstra@chello.nl Cc: paulus@samba.org Cc: acme@ghostprotocols.net Link: http://lkml.kernel.org/r/1367316153-14808-1-git-send-email-dl9pf@gmx.de Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-11perf/x86/intel: Add support for IvyBridge model 58 UncoreVince Weaver
commit 9a6bc14350b130427725f33e371e86212fa56c85 upstream. According to Intel Vol3b 18.9, the IvyBridge model 58 uncore is the same as that of SandyBridge. I've done some simple tests and with this patch things seem to work on my mac-mini. Signed-off-by: Vince Weaver <vincent.weaver@maine.edu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Stephane Eranian <eranian@gmail.com> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1304291549320.15827@vincent-weaver-1.um.maine.edu Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-11net/eth/ibmveth: Fixup retrieval of MAC addressBenjamin Herrenschmidt
commit 13f85203e1060da83d9ec1c1c5a63343eaab8de4 upstream. Some ancient pHyp versions used to create a 8 bytes local-mac-address property in the device-tree instead of a 6 bytes one for veth. The Linux driver code to deal with that is an insane hack which also happens to break with some choices of MAC addresses in qemu by testing for a bit in the address rather than just looking at the size of the property. Sanitize this by doing the latter instead. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-11iommu/amd: Properly initialize irq-table lockJoerg Roedel
commit 197887f03daecdb3ae21bafeb4155412abad3497 upstream. Fixes a lockdep warning. Reviewed-by: Shuah Khan <shuahkhan@gmail.com> Signed-off-by: Joerg Roedel <joro@8bytes.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-11hugetlbfs: fix mmap failure in unaligned size requestNaoya Horiguchi
commit af73e4d9506d3b797509f3c030e7dcd554f7d9c4 upstream. The current kernel returns -EINVAL unless a given mmap length is "almost" hugepage aligned. This is because in sys_mmap_pgoff() the given length is passed to vm_mmap_pgoff() as it is without being aligned with hugepage boundary. This is a regression introduced in commit 40716e29243d ("hugetlbfs: fix alignment of huge page requests"), where alignment code is pushed into hugetlb_file_setup() and the variable len in caller side is not changed. To fix this, this patch partially reverts that commit, and adds alignment code in caller side. And it also introduces hstate_sizelog() in order to get proper hstate to specified hugepage size. Addresses https://bugzilla.kernel.org/show_bug.cgi?id=56881 [akpm@linux-foundation.org: fix warning when CONFIG_HUGETLB_PAGE=n] Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reported-by: <iceman_dvd@yahoo.com> Cc: Steven Truelove <steven.truelove@utoronto.ca> Cc: Jianguo Wu <wujianguo@huawei.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-11autofs - remove autofs dentry mount checkDavid Jeffery
commit ce8a5dbdf9e709bdaf4618d7ef8cceb91e8adc69 upstream. When checking if an autofs mount point is busy it isn't sufficient to only check if it's a mount point. For example, if the mount of an offset mountpoint in a tree is denied for this host by its export and the dentry becomes a process working directory the check incorrectly returns the mount as not in use at expire. This can happen since the default when mounting within a tree is nostrict, which means ingnore mount fails on mounts within the tree and continue. The nostrict option is meant to allow mounting in this case. Signed-off-by: David Jeffery <djeffery@redhat.com> Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-11pwm: spear: Fix checking return value of clk_enable() and clk_prepare()Axel Lin
commit 563861cd633ae52932843477bb6ca3f1c9e2f78b upstream. The logic to check return value of clk_enable() and clk_prepare() is reversed, fix it. Signed-off-by: Axel Lin <axel.lin@ingics.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-11powerpc: fix numa distance for form0 device treeVaidyanathan Srinivasan
commit 7122beeee7bc1757682049780179d7c216dd1c83 upstream. The following commit breaks numa distance setup for old powerpc systems that use form0 encoding in device tree. commit 41eab6f88f24124df89e38067b3766b7bef06ddb powerpc/numa: Use form 1 affinity to setup node distance Device tree node /rtas/ibm,associativity-reference-points would index into /cpus/PowerPCxxxx/ibm,associativity based on form0 or form1 encoding detected by ibm,architecture-vec-5 property. All modern systems use form1 and current kernel code is correct. However, on older systems with form0 encoding, the numa distance will get hard coded as LOCAL_DISTANCE for all nodes. This causes task scheduling anomaly since scheduler will skip building numa level domain (topmost domain with all cpus) if all numa distances are same. (value of 'level' in sched_init_numa() will remain 0) Prior to the above commit: ((from) == (to) ? LOCAL_DISTANCE : REMOTE_DISTANCE) Restoring compatible behavior with this patch for old powerpc systems with device tree where numa distance are encoded as form0. Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-11powerpc: Emulate non privileged DSCR read and writeAnton Blanchard
commit 73d2fb758e678c93bc76d40876c2359f0729b0ef upstream. POWER8 allows read and write of the DSCR in userspace. We added kernel emulation so applications could always use the instructions regardless of the CPU type. Unfortunately there are two SPRs for the DSCR and we only added emulation for the privileged one. Add code to match the non privileged one. A simple test was created to verify the fix: http://ozlabs.org/~anton/junkcode/user_dscr_test.c Without the patch we get a SIGILL and it passes with the patch. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-11xen/arm: actually pass a non-NULL percpu pointer to request_percpu_irqStefano Stabellini
commit 2798ba7d19aed645663398a21ec4006bfdbb1ef3 upstream. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Ian Campbell <ian.camjpbell@citrix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-07Linux 3.8.12v3.8.12Greg Kroah-Hartman
2013-05-07mfd: adp5520: Restore mode bits on resumeLars-Peter Clausen
commit c6cc25fda58da8685ecef3f179adc7b99c8253b2 upstream. The adp5520 unfortunately also clears the BL_EN bit when the nSTNDBY bit is cleared. So we need to make sure to restore it during resume if it was set before suspend. Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Acked-by: Michael Hennerich <michael.hennerich@analog.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-07rcutrace: single_open() leaksAl Viro
commit 7ee2b9e56495c56dcaffa2bab19b39451d9fdc8a upstream. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-07mmc: atmel-mci: pio hang on block errorsTerry Barnaby
commit bdbc5d0c60f3e9de3eeccf1c1a18bdc11dca62cc upstream. The driver is doing, by default, multi-block reads. When a block error occurs, card/block.c instigates a single block read: "mmcblk0: retrying using single block read". It leaves the sg chain intact and just changes the length attribute for the first sg entry and the overall sg_len parameter. When atmci_read_data_pio is called to read the single block of data it ignores the sg_len and expects to read more than 512 bytes as it sees there are multiple items in the sg list. No more data comes as the controller has only been commanded to get one block. Signed-off-by: Terry Barnaby <terry@beam.ltd.uk> Acked-by: Ludovic Desroches <ludovic.desroches@atmel.com> Signed-off-by: Chris Ball <cjb@laptop.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-07mmc: core: Fix bit width test failing on old eMMC cardsPhilip Rakity
commit 836dc2fe89c968c10cada87e0dfae6626f8f9da3 upstream. PARTITION_SUPPORT needs to be set before doing the compare on version number so the bit width test does not get invalid data. Before this patch, a Sandisk iNAND eMMC card would detect 1-bit width although the hardware supports 4-bit. Only affects old emmc devices - pre 4.4 devices. Reported-by: Elad Yi <elad.yi@gmail.com> Signed-off-by: Philip Rakity <prakity@yahoo.com> Signed-off-by: Chris Ball <cjb@laptop.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-07x86: Eliminate irq_mis_count counted in arch_irq_statLi Fei
commit f7b0e1055574ce06ab53391263b4e205bf38daf3 upstream. With the current implementation, kstat_cpu(cpu).irqs_sum is also increased in case of irq_mis_count increment. So there is no need to count irq_mis_count in arch_irq_stat, otherwise irq_mis_count will be counted twice in the sum of /proc/stat. Reported-by: Liu Chuansheng <chuansheng.liu@intel.com> Signed-off-by: Li Fei <fei.li@intel.com> Acked-by: Liu Chuansheng <chuansheng.liu@intel.com> Cc: tomoki.sekiyama.qu@hitachi.com Cc: joe@perches.com Link: http://lkml.kernel.org/r/1366980611.32469.7.camel@fli24-HP-Compaq-8100-Elite-CMT-PC Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-07KVM: X86 emulator: fix source operand decoding for 8bit mov[zs]x instructionsGleb Natapov
commit 660696d1d16a71e15549ce1bf74953be1592bcd3 upstream. Source operand for one byte mov[zs]x is decoded incorrectly if it is in high byte register. Fix that. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-07Give the OID registry file module info to avoid kernel taintingDavid Howells
commit 9e6879460c8edb0cd3c24c09b83d06541b5af0dc upstream. Give the OID registry file module information so that it doesn't taint the kernel when compiled as a module and loaded. Reported-by: Dros Adamson <Weston.Adamson@netapp.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-07mmc: at91/avr32/atmel-mci: fix DMA-channel leak on module unloadJohan Hovold
commit 91cf54feecf815bec0b6a8d6d9dbd0e219f2f2cc upstream. Fix regression introduced by commit 796211b7953 ("mmc: atmel-mci: add pdc support and runtime capabilities detection") which removed the need for CONFIG_MMC_ATMELMCI_DMA but kept the Kconfig-entry as well as the compile guards around dma_release_channel() in remove(). Consequently, DMA is always enabled (if supported), but the DMA-channel is not released on module unload unless the DMA-config option is selected. Remove the no longer used CONFIG_MMC_ATMELMCI_DMA option completely. Signed-off-by: Johan Hovold <jhovold@gmail.com> Acked-by: Ludovic Desroches <ludovic.desroches@atmel.com> Signed-off-by: Chris Ball <cjb@laptop.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-07ext4: fix Kconfig documentation for CONFIG_EXT4_DEBUGTheodore Ts'o
commit 7f3e3c7cfcec148ccca9c0dd2dbfd7b00b7ac10f upstream. Fox the Kconfig documentation for CONFIG_EXT4_DEBUG to match the change made by commit a0b30c1229: ext4: use module parameters instead of debugfs for mballoc_debug Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-07ext4: fix online resizing for ext3-compat file systemsTheodore Ts'o
commit c5c72d814cf0f650010337c73638b25e6d14d2d4 upstream. Commit fb0a387dcdc restricts block allocations for indirect-mapped files to block groups less than s_blockfile_groups. However, the online resizing code wasn't setting s_blockfile_groups, so the newly added block groups were not available for non-extent mapped files. Reported-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-07ext4: fix big-endian bug in metadata checksum calculationsDmitry Monakhov
commit 171a7f21a76a0958c225b97c00a97a10390d40ee upstream. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-07ext4: fix journal callback list traversalDmitry Monakhov
commit 5d3ee20855e28169d711b394857ee608a5023094 upstream. It is incorrect to use list_for_each_entry_safe() for journal callback traversial because ->next may be removed by other task: ->ext4_mb_free_metadata() ->ext4_mb_free_metadata() ->ext4_journal_callback_del() This results in the following issue: WARNING: at lib/list_debug.c:62 __list_del_entry+0x1c0/0x250() Hardware name: list_del corruption. prev->next should be ffff88019a4ec198, but was 6b6b6b6b6b6b6b6b Modules linked in: cpufreq_ondemand acpi_cpufreq freq_table mperf coretemp kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode sg xhci_hcd button sd_mod crc_t10dif aesni_intel ablk_helper cryptd lrw aes_x86_64 xts gf128mul ahci libahci pata_acpi ata_generic dm_mirror dm_region_hash dm_log dm_mod Pid: 16400, comm: jbd2/dm-1-8 Tainted: G W 3.8.0-rc3+ #107 Call Trace: [<ffffffff8106fb0d>] warn_slowpath_common+0xad/0xf0 [<ffffffff8106fc06>] warn_slowpath_fmt+0x46/0x50 [<ffffffff813637e9>] ? ext4_journal_commit_callback+0x99/0xc0 [<ffffffff8148cae0>] __list_del_entry+0x1c0/0x250 [<ffffffff813637bf>] ext4_journal_commit_callback+0x6f/0xc0 [<ffffffff813ca336>] jbd2_journal_commit_transaction+0x23a6/0x2570 [<ffffffff8108aa42>] ? try_to_del_timer_sync+0x82/0xa0 [<ffffffff8108b491>] ? del_timer_sync+0x91/0x1e0 [<ffffffff813d3ecf>] kjournald2+0x19f/0x6a0 [<ffffffff810ad630>] ? wake_up_bit+0x40/0x40 [<ffffffff813d3d30>] ? bit_spin_lock+0x80/0x80 [<ffffffff810ac6be>] kthread+0x10e/0x120 [<ffffffff810ac5b0>] ? __init_kthread_worker+0x70/0x70 [<ffffffff818ff6ac>] ret_from_fork+0x7c/0xb0 [<ffffffff810ac5b0>] ? __init_kthread_worker+0x70/0x70 This patch fix the issue as follows: - ext4_journal_commit_callback() make list truly traversial safe simply by always starting from list_head - fix race between two ext4_journal_callback_del() and ext4_journal_callback_try_del() Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-07jbd2: fix race between jbd2_journal_remove_checkpoint and ->j_commit_callbackDmitry Monakhov
commit 794446c6946513c684d448205fbd76fa35f38b72 upstream. The following race is possible: [kjournald2] other_task jbd2_journal_commit_transaction() j_state = T_FINISHED; spin_unlock(&journal->j_list_lock); ->jbd2_journal_remove_checkpoint() ->jbd2_journal_free_transaction(); ->kmem_cache_free(transaction) ->j_commit_callback(journal, transaction); -> USE_AFTER_FREE WARNING: at lib/list_debug.c:62 __list_del_entry+0x1c0/0x250() Hardware name: list_del corruption. prev->next should be ffff88019a4ec198, but was 6b6b6b6b6b6b6b6b Modules linked in: cpufreq_ondemand acpi_cpufreq freq_table mperf coretemp kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode sg xhci_hcd button sd_mod crc_t10dif aesni_intel ablk_helper cryptd lrw aes_x86_64 xts gf128mul ahci libahci pata_acpi ata_generic dm_mirror dm_region_hash dm_log dm_mod Pid: 16400, comm: jbd2/dm-1-8 Tainted: G W 3.8.0-rc3+ #107 Call Trace: [<ffffffff8106fb0d>] warn_slowpath_common+0xad/0xf0 [<ffffffff8106fc06>] warn_slowpath_fmt+0x46/0x50 [<ffffffff813637e9>] ? ext4_journal_commit_callback+0x99/0xc0 [<ffffffff8148cae0>] __list_del_entry+0x1c0/0x250 [<ffffffff813637bf>] ext4_journal_commit_callback+0x6f/0xc0 [<ffffffff813ca336>] jbd2_journal_commit_transaction+0x23a6/0x2570 [<ffffffff8108aa42>] ? try_to_del_timer_sync+0x82/0xa0 [<ffffffff8108b491>] ? del_timer_sync+0x91/0x1e0 [<ffffffff813d3ecf>] kjournald2+0x19f/0x6a0 [<ffffffff810ad630>] ? wake_up_bit+0x40/0x40 [<ffffffff813d3d30>] ? bit_spin_lock+0x80/0x80 [<ffffffff810ac6be>] kthread+0x10e/0x120 [<ffffffff810ac5b0>] ? __init_kthread_worker+0x70/0x70 [<ffffffff818ff6ac>] ret_from_fork+0x7c/0xb0 [<ffffffff810ac5b0>] ? __init_kthread_worker+0x70/0x70 In order to demonstrace this issue one should mount ext4 with mount -o discard option on SSD disk. This makes callback longer and race window becomes wider. In order to fix this we should mark transaction as finished only after callbacks have completed Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-07ext4/jbd2: don't wait (forever) for stale tid caused by wraparoundTheodore Ts'o
commit d76a3a77113db020d9bb1e894822869410450bd9 upstream. In the case where an inode has a very stale transaction id (tid) in i_datasync_tid or i_sync_tid, it's possible that after a very large (2**31) number of transactions, that the tid number space might wrap, causing tid_geq()'s calculations to fail. Commit deeeaf13 "jbd2: fix fsync() tid wraparound bug", later modified by commit e7b04ac0 "jbd2: don't wake kjournald unnecessarily", attempted to fix this problem, but it only avoided kjournald spinning forever by fixing the logic in jbd2_log_start_commit(). Unfortunately, in the codepaths in fs/ext4/fsync.c and fs/ext4/inode.c that might call jbd2_log_start_commit() with a stale tid, those functions will subsequently call jbd2_log_wait_commit() with the same stale tid, and then wait for a very long time. To fix this, we replace the calls to jbd2_log_start_commit() and jbd2_log_wait_commit() with a call to a new function, jbd2_complete_transaction(), which will correctly handle stale tid's. As a bonus, jbd2_complete_transaction() will avoid locking j_state_lock for writing unless a commit needs to be started. This should have a small (but probably not measurable) improvement for ext4's scalability. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reported-by: Ben Hutchings <ben@decadent.org.uk> Reported-by: George Barnett <gbarnett@atlassian.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-07ixgbe: fix EICR write in ixgbe_msix_otherJacob Keller
commit d87d830720a1446403ed38bfc2da268be0d356d1 upstream. Previously, the ixgbe_msix_other was writing the full 32bits of the set interrupts, instead of only the ones which the ixgbe_msix_other is handling. This resulted in a loss of performance when the X540's PPS feature is enabled due to sometimes clearing queue interrupts which resulted in the driver not getting the interrupt for cleaning the q_vector rings often enough. The fix is to simply mask the lower 16bits off so that this handler does not write them in the EICR, which causes them to remain high and be properly handled by the clean_rings interrupt routine as normal. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-07ipc: sysv shared memory limited to 8TiBRobin Holt
commit d69f3bad4675ac519d41ca2b11e1c00ca115cecd upstream. Trying to run an application which was trying to put data into half of memory using shmget(), we found that having a shmall value below 8EiB-8TiB would prevent us from using anything more than 8TiB. By setting kernel.shmall greater than 8EiB-8TiB would make the job work. In the newseg() function, ns->shm_tot which, at 8TiB is INT_MAX. ipc/shm.c: 458 static int newseg(struct ipc_namespace *ns, struct ipc_params *params) 459 { ... 465 int numpages = (size + PAGE_SIZE -1) >> PAGE_SHIFT; ... 474 if (ns->shm_tot + numpages > ns->shm_ctlall) 475 return -ENOSPC; [akpm@linux-foundation.org: make ipc/shm.c:newseg()'s numpages size_t, not int] Signed-off-by: Robin Holt <holt@sgi.com> Reported-by: Alex Thorlton <athorlton@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-07wireless: regulatory: fix channel disabling race conditionJohannes Berg
commit 990de49f74e772b6db5208457b7aa712a5f4db86 upstream. When a full scan 2.4 and 5 GHz scan is scheduled, but then the 2.4 GHz part of the scan disables a 5.2 GHz channel due to, e.g. receiving country or frequency information, that 5.2 GHz channel might already be in the list of channels to scan next. Then, when the driver checks if it should do a passive scan, that will return false and attempt an active scan. This is not only wrong but can also lead to the iwlwifi device firmware crashing since it checks regulatory as well. Fix this by not setting the channel flags to just disabled but rather OR'ing in the disabled flag. That way, even if the race happens, the channel will be scanned passively which is still (mostly) correct. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-07nfsd: Decode and send 64bit time valuesBryan Schumaker
commit bf8d909705e9d9bac31d9b8eac6734d2b51332a7 upstream. The seconds field of an nfstime4 structure is 64bit, but we are assuming that the first 32bits are zero-filled. So if the client tries to set atime to a value before the epoch (touch -t 196001010101), then the server will save the wrong value on disk. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-07nfsd: don't run get_file if nfs4_preprocess_stateid_op return errorfanchaoting
commit b022032e195ffca83d7002d6b84297d796ed443b upstream. we should return error status directly when nfs4_preprocess_stateid_op return error. Signed-off-by: fanchaoting <fanchaoting@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-07nfsd4: don't close read-write opens too soonJ. Bruce Fields
commit 0c7c3e67ab91ec6caa44bdf1fc89a48012ceb0c5 upstream. Don't actually close any opens until we don't need them at all. This means being left with write access when it's not really necessary, but that's better than putting a file that might still have posix locks held on it, as we have been. Reported-by: Toralf Förster <toralf.foerster@gmx.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-07NFSv4: Handle NFS4ERR_DELAY and NFS4ERR_GRACE in nfs4_open_delegation_recallTrond Myklebust
commit 8b6cc4d6f841d31f72fe7478453759166d366274 upstream. A server shouldn't normally return NFS4ERR_GRACE if the client holds a delegation, since no conflicting lock reclaims can be granted, however the spec does not require the server to grant the open in this instance Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-07MD: ignore discard request for hard disks of hybid raid1/raid10 arrayShaohua Li
commit 32f9f570d04461a41bdcd5c1d93b41ebc5ce182a upstream. In SSD/hard disk hybid storage, discard request should be ignored for hard disk. We used to be doing this way, but the unplug path forgets it. This is suitable for stable tree since v3.6. Reported-and-tested-by: Markus <M4rkusXXL@web.de> Signed-off-by: Shaohua Li <shli@fusionio.com> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-07md: bad block list should default to disabled.NeilBrown
commit 486adf72ccc0c235754923d47a2270c5dcb0c98b upstream. Maintenance of a bad-block-list currently defaults to 'enabled' and is then disabled when it cannot be supported. This is backwards and causes problem for dm-raid which didn't know to disable it. So fix the defaults, and only enabled for v1.x metadata which explicitly has bad blocks enabled. The problem with dm-raid has been present since badblock support was added in v3.1, so this patch is suitable for any -stable from 3.1 onwards. Reported-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-07LOCKD: Ensure that nlmclnt_block resets block->b_status after a server rebootTrond Myklebust
commit 1dfd89af8697a299e7982ae740d4695ecd917eef upstream. After a server reboot, the reclaimer thread will recover all the existing locks. For locks that are blocked, however, it will change the value of block->b_status to nlm_lck_denied_grace_period in order to signal that they need to wake up and resend the original blocking lock request. Due to a bug, however, the block->b_status never gets reset after the blocked locks have been woken up, and so the process goes into an infinite loop of resends until the blocked lock is satisfied. Reported-by: Marc Eshel <eshel@us.ibm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-07exec: do not abuse ->cred_guard_mutex in threadgroup_lock()Oleg Nesterov
commit e56fb2874015370e3b7f8d85051f6dce26051df9 upstream. threadgroup_lock() takes signal->cred_guard_mutex to ensure that thread_group_leader() is stable. This doesn't look nice, the scope of this lock in do_execve() is huge. And as Dave pointed out this can lead to deadlock, we have the following dependencies: do_execve: cred_guard_mutex -> i_mutex cgroup_mount: i_mutex -> cgroup_mutex attach_task_by_pid: cgroup_mutex -> cred_guard_mutex Change de_thread() to take threadgroup_change_begin() around the switch-the-leader code and change threadgroup_lock() to avoid ->cred_guard_mutex. Note that de_thread() can't sleep with ->group_rwsem held, this can obviously deadlock with the exiting leader if the writer is active, so it does threadgroup_change_end() before schedule(). Reported-by: Dave Jones <davej@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>