AgeCommit message (Collapse)Author
2019-02-13ARM: fix the cockup in the previous patchlinux-linaro-lsk-v4.9-testRussell King
Commit d6951f582cc50ba0ad22ef46b599740966599b14 upstream. The intention in the previous patch was to only place the processor tables in the .rodata section if big.Little was being built and we wanted the branch target hardening, but instead (due to the way it was tested) it ended up always placing the tables into the .rodata section. Although harmless, let's correct this anyway. Fixes: 3a4d0c2172bc ("ARM: ensure that processor vtables is not lost after boot") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David A. Long <dave.long@linaro.org>
2019-02-13ARM: ensure that processor vtables is not lost after bootRussell King
Commit 3a4d0c2172bcf15b7a3d9d498b2b355f9864286b upstream. Marek Szyprowski reported problems with CPU hotplug in current kernels. This was tracked down to the processor vtables being located in an init section, and therefore discarded after kernel boot, despite being required after boot to properly initialise the non-boot CPUs. Arrange for these tables to end up in .rodata when required. Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Tested-by: Krzysztof Kozlowski <krzk@kernel.org> Fixes: 383fb3ee8024 ("ARM: spectre-v2: per-CPU vtables to work around big.Little systems") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David A. Long <dave.long@linaro.org>
2019-02-13ARM: spectre-v2: per-CPU vtables to work around big.Little systemsRussell King
Commit 383fb3ee8024d596f488d2dbaf45e572897acbdb upstream. In big.Little systems, some CPUs require the Spectre workarounds in paths such as the context switch, but other CPUs do not. In order to handle these differences, we need per-CPU vtables. We are unable to use the kernel's per-CPU variables to support this as per-CPU is not initialised at times when we need access to the vtables, so we have to use an array indexed by logical CPU number. We use an array-of-pointers to avoid having function pointers in the kernel's read/write .data section. Note: Added include of linux/slab.h in arch/arm/smp.c. Reviewed-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David A. Long <dave.long@linaro.org>
2019-02-13ARM: add PROC_VTABLE and PROC_TABLE macrosRussell King
Commit e209950fdd065d2cc46e6338e47e52841b830cba upstream. Allow the way we access members of the processor vtable to be changed at compile time. We will need to move to per-CPU vtables to fix the Spectre variant 2 issues on big.Little systems. However, we have a couple of calls that do not need the vtable treatment, and indeed cause a kernel warning due to the (later) use of smp_processor_id(), so also introduce the PROC_TABLE macro for these which always use CPU 0's function pointers. Reviewed-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David A. Long <dave.long@linaro.org>
2019-02-13ARM: clean up per-processor check_bugs method callRussell King
Commit 945aceb1db8885d3a35790cf2e810f681db52756 upstream. Call the per-processor type check_bugs() method in the same way as we do other per-processor functions - move the "processor." detail into proc-fns.h. Reviewed-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David A. Long <dave.long@linaro.org>
2019-02-13ARM: split out processor lookupRussell King
Commit 65987a8553061515b5851b472081aedb9837a391 upstream. Split out the lookup of the processor type and associated error handling from the rest of setup_processor() - we will need to use this in the secondary CPU bringup path for big.Little Spectre variant 2 mitigation. Reviewed-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David A. Long <dave.long@linaro.org>
2019-02-13ARM: make lookup_processor_type() non-__initRussell King
Commit 899a42f836678a595f7d2bc36a5a0c2b03d08cbc upstream. Move lookup_processor_type() out of the __init section so it is callable from (eg) the secondary startup code during hotplug. Reviewed-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David A. Long <dave.long@linaro.org>
2019-02-13ARM: 8810/1: vfp: Fix wrong assignement to ufp_excJulien Thierry
Commit 5df7a99bdd0de4a0480320264c44c04543c29d5a upstream. In vfp_preserve_user_clear_hwstate, ufp_exc->fpinst2 gets assigned to itself. It should actually be hwstate->fpinst2 that gets assigned to the ufp_exc field. Fixes commit 3aa2df6ec2ca6bc143a65351cca4266d03a8bc41 ("ARM: 8791/1: vfp: use __copy_to_user() when saving VFP state"). Reported-by: David Binderman <dcb314@hotmail.com> Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David A. Long <dave.long@linaro.org>
2019-02-13ARM: 8797/1: spectre-v1.1: harden __copy_to_userJulien Thierry
Commit a1d09e074250fad24f1b993f327b18cc6812eb7a upstream. Sanitize user pointer given to __copy_to_user, both for standard version and memcopy version of the user accessor. Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David A. Long <dave.long@linaro.org>
2019-02-13ARM: 8796/1: spectre-v1,v1.1: provide helpers for address sanitizationJulien Thierry
Commit afaf6838f4bc896a711180b702b388b8cfa638fc upstream. Introduce C and asm helpers to sanitize user address, taking the address range they target into account. Use asm helper for existing sanitization in __copy_from_user(). Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David A. Long <dave.long@linaro.org>
2019-02-13ARM: 8795/1: spectre-v1.1: use put_user() for __put_user()Julien Thierry
Commit e3aa6243434fd9a82e84bb79ab1abd14f2d9a5a7 upstream. When Spectre mitigation is required, __put_user() needs to include check_uaccess. This is already the case for put_user(), so just make __put_user() an alias of put_user(). Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David A. Long <dave.long@linaro.org>
2019-02-13ARM: 8794/1: uaccess: Prevent speculative use of the current addr_limitJulien Thierry
Commit 621afc677465db231662ed126ae1f355bf8eac47 upstream. A mispredicted conditional call to set_fs could result in the wrong addr_limit being forwarded under speculation to a subsequent access_ok check, potentially forming part of a spectre-v1 attack using uaccess routines. This patch prevents this forwarding from taking place, but putting heavy barriers in set_fs after writing the addr_limit. Porting commit c2f0ad4fc089cff8 ("arm64: uaccess: Prevent speculative use of the current addr_limit"). Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David A. Long <dave.long@linaro.org>
2019-02-13ARM: 8793/1: signal: replace __put_user_error with __put_userJulien Thierry
Commit 18ea66bd6e7a95bdc598223d72757190916af28b upstream. With Spectre-v1.1 mitigations, __put_user_error is pointless. In an attempt to remove it, replace its references in frame setups with __put_user. Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David A. Long <dave.long@linaro.org>
2019-02-13ARM: 8792/1: oabi-compat: copy oabi events using __copy_to_user()Julien Thierry
Commit 319508902600c2688e057750148487996396e9ca upstream. Copy events to user using __copy_to_user() rather than copy members of individually with __put_user_error(). This has the benefit of disabling/enabling PAN once per event intead of once per event member. Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David A. Long <dave.long@linaro.org>
2019-02-13ARM: 8791/1: vfp: use __copy_to_user() when saving VFP stateJulien Thierry
Commit 3aa2df6ec2ca6bc143a65351cca4266d03a8bc41 upstream. Use __copy_to_user() rather than __put_user_error() for individual members when saving VFP state. This has the benefit of disabling/enabling PAN once per copied struct intead of once per write. Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David A. Long <dave.long@linaro.org>
2019-02-13ARM: 8789/1: signal: copy registers using __copy_to_user()Julien Thierry
Commit 5ca451cf6ed04443774bbb7ee45332dafa42e99f upstream. When saving the ARM integer registers, use __copy_to_user() to copy them into user signal frame, rather than __put_user_error(). This has the benefit of disabling/enabling PAN once for the whole copy intead of once per write. Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David A. Long <dave.long@linaro.org>
2019-02-12Linux 4.9.156Greg Kroah-Hartman
2019-02-12ath9k: dynack: check da->enabled first in sampling routinesLorenzo Bianconi
commit 9d3d65a91f027b8a9af5e63752d9b78cb10eb92d upstream. Check da->enabled flag first in ath_dynack_sample_tx_ts and ath_dynack_sample_ack_ts routines in order to avoid useless processing Tested-by: Koen Vandeputte <koen.vandeputte@ncentric.com> Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12ath9k: dynack: make ewma estimation fasterLorenzo Bianconi
commit 0c60c490830a1a756c80f8de8d33d9c6359d4a36 upstream. In order to make propagation time estimation faster, use current sample as ewma output value during 'late ack' tracking Tested-by: Koen Vandeputte <koen.vandeputte@ncentric.com> Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12perf/x86/intel: Delay memory deallocation until x86_pmu_dead_cpu()Peter Zijlstra
commit 602cae04c4864bb3487dfe4c2126c8d9e7e1614a upstream. intel_pmu_cpu_prepare() allocated memory for ->shared_regs among other members of struct cpu_hw_events. This memory is released in intel_pmu_cpu_dying() which is wrong. The counterpart of the intel_pmu_cpu_prepare() callback is x86_pmu_dead_cpu(). Otherwise if the CPU fails on the UP path between CPUHP_PERF_X86_PREPARE and CPUHP_AP_PERF_X86_STARTING then it won't release the memory but allocate new memory on the next attempt to online the CPU (leaking the old memory). Also, if the CPU down path fails between CPUHP_AP_PERF_X86_STARTING and CPUHP_PERF_X86_PREPARE then the CPU will go back online but never allocate the memory that was released in x86_pmu_dying_cpu(). Make the memory allocation/free symmetrical in regard to the CPU hotplug notifier by moving the deallocation to intel_pmu_cpu_dead(). This started in commit: a7e3ed1e47011 ("perf: Add support for supplementary event registers"). In principle the bug was introduced in v2.6.39 (!), but it will almost certainly not backport cleanly across the big CPU hotplug rewrite between v4.7-v4.15... [ bigeasy: Added patch description. ] [ mingo: Added backporting guidance. ] Reported-by: He Zhe <zhe.he@windriver.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> # With developer hat on Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> # With maintainer hat on Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@kernel.org Cc: bp@alien8.de Cc: hpa@zytor.com Cc: jolsa@kernel.org Cc: kan.liang@linux.intel.com Cc: namhyung@kernel.org Cc: <stable@vger.kernel.org> Fixes: a7e3ed1e47011 ("perf: Add support for supplementary event registers"). Link: https://lkml.kernel.org/r/20181219165350.6s3jvyxbibpvlhtq@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> [ He Zhe: Fixes conflict caused by missing disable_counter_freeze which is introduced since v4.20 af3bdb991a5cb. ] Signed-off-by: He Zhe <zhe.he@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12IB/hfi1: Add limit test for RC/UC send via loopbackMike Marciniszyn
commit 09ce351dff8e7636af0beb72cd4a86c3904a0500 upstream. Fix potential memory corruption and panic in loopback for IB_WR_SEND variants. The code blindly assumes the posted length will fit in the fetched rwqe, which is not a valid assumption. Fix by adding a limit test, and triggering the appropriate send completion and putting the QP in an error state. This mimics the handling for non-loopback QPs. Fixes: 15703461533a ("IB/{hfi1, qib, rdmavt}: Move ruc_loopback to rdmavt") Cc: <stable@vger.kernel.org> #v4.20+ Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
2019-02-12PCI: vmd: Free up IRQs on suspend pathScott Bauer
commit e2b1820bd5d0962d6f271b0d47c3a0e38647df2f upstream. Free up the IRQs we request on the suspend path and reallocate them on the resume path. Fixes this error: CPU 111 disable failed: CPU has 9 vectors assigned and there are only 0 available. Error taking CPU111 down: -34 Non-boot CPUs are not disabled Enabling non-boot CPUs ... Signed-off-by: Scott Bauer <scott.bauer@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Jon Derrick <jonathan.derrick@intel.com> Signed-off-by: Sushma Kalakota <sushmax.kalakota@intel.com>
2019-02-12oom, oom_reaper: do not enqueue same task twiceTetsuo Handa
commit 9bcdeb51bd7d2ae9fe65ea4d60643d2aeef5bfe3 upstream. Arkadiusz reported that enabling memcg's group oom killing causes strange memcg statistics where there is no task in a memcg despite the number of tasks in that memcg is not 0. It turned out that there is a bug in wake_oom_reaper() which allows enqueuing same task twice which makes impossible to decrease the number of tasks in that memcg due to a refcount leak. This bug existed since the OOM reaper became invokable from task_will_free_mem(current) path in out_of_memory() in Linux 4.7, T1@P1 |T2@P1 |T3@P1 |OOM reaper ----------+----------+----------+------------ # Processing an OOM victim in a different memcg domain. try_charge() mem_cgroup_out_of_memory() mutex_lock(&oom_lock) try_charge() mem_cgroup_out_of_memory() mutex_lock(&oom_lock) try_charge() mem_cgroup_out_of_memory() mutex_lock(&oom_lock) out_of_memory() oom_kill_process(P1) do_send_sig_info(SIGKILL, @P1) mark_oom_victim(T1@P1) wake_oom_reaper(T1@P1) # T1@P1 is enqueued. mutex_unlock(&oom_lock) out_of_memory() mark_oom_victim(T2@P1) wake_oom_reaper(T2@P1) # T2@P1 is enqueued. mutex_unlock(&oom_lock) out_of_memory() mark_oom_victim(T1@P1) wake_oom_reaper(T1@P1) # T1@P1 is enqueued again due to oom_reaper_list == T2@P1 && T1@P1->oom_reaper_list == NULL. mutex_unlock(&oom_lock) # Completed processing an OOM victim in a different memcg domain. spin_lock(&oom_reaper_lock) # T1P1 is dequeued. spin_unlock(&oom_reaper_lock) but memcg's group oom killing made it easier to trigger this bug by calling wake_oom_reaper() on the same task from one out_of_memory() request. Fix this bug using an approach used by commit 855b018325737f76 ("oom, oom_reaper: disable oom_reaper for oom_kill_allocating_task"). As a side effect of this patch, this patch also avoids enqueuing multiple threads sharing memory via task_will_free_mem(current) path. Link: http://lkml.kernel.org/r/e865a044-2c10-9858-f4ef-254bc71d6cc2@i-love.sakura.ne.jp Link: http://lkml.kernel.org/r/5ee34fc6-1485-34f8-8790-903ddabaa809@i-love.sakura.ne.jp Fixes: af8e15cc85a25315 ("oom, oom_reaper: do not enqueue task if it is on the oom_reaper_list head") Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: Arkadiusz Miskiewicz <arekm@maven.pl> Tested-by: Arkadiusz Miskiewicz <arekm@maven.pl> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Roman Gushchin <guro@fb.com> Cc: Tejun Heo <tj@kernel.org> Cc: Aleksa Sarai <asarai@suse.de> Cc: Jay Kamat <jgkamat@fb.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12serial: fix race between flush_to_ldisc and tty_openGreg Kroah-Hartman
commit fedb5760648a291e949f2380d383b5b2d2749b5e upstream. There still is a race window after the commit b027e2298bd588 ("tty: fix data race between tty_init_dev and flush of buf"), and we encountered this crash issue if receive_buf call comes before tty initialization completes in tty_open and tty->driver_data may be NULL. CPU0 CPU1 ---- ---- tty_open tty_init_dev tty_ldisc_unlock schedule flush_to_ldisc receive_buf tty_port_default_receive_buf tty_ldisc_receive_buf n_tty_receive_buf_common __receive_buf uart_flush_chars uart_start /*tty->driver_data is NULL*/ tty->ops->open /*init tty->driver_data*/ it can be fixed by extending ldisc semaphore lock in tty_init_dev to driver_data initialized completely after tty->ops->open(), but this will lead to get lock on one function and unlock in some other function, and hard to maintain, so fix this race only by checking tty->driver_data when receiving, and return if tty->driver_data is NULL, and n_tty_receive_buf_common maybe calls uart_unthrottle, so add the same check. Because the tty layer knows nothing about the driver associated with the device, the tty layer can not do anything here, it is up to the tty driver itself to check for this type of race. Fix up the serial driver to correctly check to see if it is finished binding with the device when being called, and if not, abort the tty calls. [Description and problem report and testing from Li RongQing, I rewrote the patch to be in the serial layer, not in the tty core - gregkh] Reported-by: Li RongQing <lirongqing@baidu.com> Tested-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: Wang Li <wangli39@baidu.com> Signed-off-by: Zhang Yu <zhangyu31@baidu.com> Signed-off-by: Li RongQing <lirongqing@baidu.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12perf tests evsel-tp-sched: Fix bitwise operatorGustavo A. R. Silva
commit 489338a717a0dfbbd5a3fabccf172b78f0ac9015 upstream. Notice that the use of the bitwise OR operator '|' always leads to true in this particular case, which seems a bit suspicious due to the context in which this expression is being used. Fix this by using bitwise AND operator '&' instead. This bug was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org Fixes: 6a6cd11d4e57 ("perf test: Add test for the sched tracepoint format fields") Link: http://lkml.kernel.org/r/20190122233439.GA5868@embeddedor Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12perf/core: Don't WARN() for impossible ring-buffer sizesMark Rutland
commit 9dff0aa95a324e262ffb03f425d00e4751f3294e upstream. The perf tool uses /proc/sys/kernel/perf_event_mlock_kb to determine how large its ringbuffer mmap should be. This can be configured to arbitrary values, which can be larger than the maximum possible allocation from kmalloc. When this is configured to a suitably large value (e.g. thanks to the perf fuzzer), attempting to use perf record triggers a WARN_ON_ONCE() in __alloc_pages_nodemask(): WARNING: CPU: 2 PID: 5666 at mm/page_alloc.c:4511 __alloc_pages_nodemask+0x3f8/0xbc8 Let's avoid this by checking that the requested allocation is possible before calling kzalloc. Reported-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Julien Thierry <julien.thierry@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20190110142745.25495-1-mark.rutland@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12x86/MCE: Initialize mce.bank in the case of a fatal error in mce_no_way_out()Tony Luck
commit d28af26faa0b1daf3c692603d46bc4687c16f19e upstream. Internal injection testing crashed with a console log that said: mce: [Hardware Error]: CPU 7: Machine Check Exception: f Bank 0: bd80000000100134 This caused a lot of head scratching because the MCACOD (bits 15:0) of that status is a signature from an L1 data cache error. But Linux says that it found it in "Bank 0", which on this model CPU only reports L1 instruction cache errors. The answer was that Linux doesn't initialize "m->bank" in the case that it finds a fatal error in the mce_no_way_out() pre-scan of banks. If this was a local machine check, then this partially initialized struct mce is being passed to mce_panic(). Fix is simple: just initialize m->bank in the case of a fatal error. Fixes: 40c36e2741d7 ("x86/mce: Fix incorrect "Machine check from unknown source" message") Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: x86-ml <x86@kernel.org> Cc: stable@vger.kernel.org # v4.18 Note pre-v5.0 arch/x86/kernel/cpu/mce/core.c was called arch/x86/kernel/cpu/mcheck/mce.c Link: https://lkml.kernel.org/r/20190201003341.10638-1-tony.luck@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12perf/x86/intel/uncore: Add Node ID maskKan Liang
commit 9e63a7894fd302082cf3627fe90844421a6cbe7f upstream. Some PCI uncore PMUs cannot be registered on an 8-socket system (HPE Superdome Flex). To understand which Socket the PCI uncore PMUs belongs to, perf retrieves the local Node ID of the uncore device from CPUNODEID(0xC0) of the PCI configuration space, and the mapping between Socket ID and Node ID from GIDNIDMAP(0xD4). The Socket ID can be calculated accordingly. The local Node ID is only available at bit 2:0, but current code doesn't mask it. If a BIOS doesn't clear the rest of the bits, an incorrect Node ID will be fetched. Filter the Node ID by adding a mask. Reported-by: Song Liu <songliubraving@fb.com> Tested-by: Song Liu <songliubraving@fb.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> # v3.7+ Fixes: 7c94ee2e0917 ("perf/x86: Add Intel Nehalem and Sandy Bridge-EP uncore support") Link: https://lkml.kernel.org/r/1548600794-33162-1-git-send-email-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12KVM: nVMX: unconditionally cancel preemption timer in free_nested ↵Peter Shier
(CVE-2019-7221) commit ecec76885bcfe3294685dc363fd1273df0d5d65f upstream. Bugzilla: 1671904 There are multiple code paths where an hrtimer may have been started to emulate an L1 VMX preemption timer that can result in a call to free_nested without an intervening L2 exit where the hrtimer is normally cancelled. Unconditionally cancel in free_nested to cover all cases. Embargoed until Feb 7th 2019. Signed-off-by: Peter Shier <pshier@google.com> Reported-by: Jim Mattson <jmattson@google.com> Reviewed-by: Jim Mattson <jmattson@google.com> Reported-by: Felix Wilhelm <fwilhelm@google.com> Cc: stable@kernel.org Message-Id: <20181011184646.154065-1-pshier@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12kvm: fix kvm_ioctl_create_device() reference counting (CVE-2019-6974)Jann Horn
commit cfa39381173d5f969daf43582c95ad679189cbc9 upstream. kvm_ioctl_create_device() does the following: 1. creates a device that holds a reference to the VM object (with a borrowed reference, the VM's refcount has not been bumped yet) 2. initializes the device 3. transfers the reference to the device to the caller's file descriptor table 4. calls kvm_get_kvm() to turn the borrowed reference to the VM into a real reference The ownership transfer in step 3 must not happen before the reference to the VM becomes a proper, non-borrowed reference, which only happens in step 4. After step 3, an attacker can close the file descriptor and drop the borrowed reference, which can cause the refcount of the kvm object to drop to zero. This means that we need to grab a reference for the device before anon_inode_getfd(), otherwise the VM can disappear from under us. Fixes: 852b6d57dc7f ("kvm: add device control API") Cc: stable@kernel.org Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12KVM: x86: work around leak of uninitialized stack contents (CVE-2019-7222)Paolo Bonzini
commit 353c0956a618a07ba4bbe7ad00ff29fe70e8412a upstream. Bugzilla: 1671930 Emulation of certain instructions (VMXON, VMCLEAR, VMPTRLD, VMWRITE with memory operand, INVEPT, INVVPID) can incorrectly inject a page fault when passed an operand that points to an MMIO address. The page fault will use uninitialized kernel stack memory as the CR2 and error code. The right behavior would be to abort the VM with a KVM_EXIT_INTERNAL_ERROR exit to userspace; however, it is not an easy fix, so for now just ensure that the error code and CR2 are zero. Embargoed until Feb 7th 2019. Reported-by: Felix Wilhelm <fwilhelm@google.com> Cc: stable@kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12scsi: aic94xx: fix module loadingJames Bottomley
commit 42caa0edabd6a0a392ec36a5f0943924e4954311 upstream. The aic94xx driver is currently failing to load with errors like sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:03.0/0000:02:00.3/0000:07:02.0/revision' Because the PCI code had recently added a file named 'revision' to every PCI device. Fix this by renaming the aic94xx revision file to aic_revision. This is safe to do for us because as far as I can tell, there's nothing in userspace relying on the current aic94xx revision file so it can be renamed without breaking anything. Fixes: 702ed3be1b1b (PCI: Create revision file in sysfs) Cc: stable@vger.kernel.org Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12usb: gadget: musb: fix short isoc packets with inventra dmaPaul Elder
commit c418fd6c01fbc5516a2cd1eaf1df1ec86869028a upstream. Handling short packets (length < max packet size) in the Inventra DMA engine in the MUSB driver causes the MUSB DMA controller to hang. An example of a problem that is caused by this problem is when streaming video out of a UVC gadget, only the first video frame is transferred. For short packets (mode-0 or mode-1 DMA), MUSB_TXCSR_TXPKTRDY must be set manually by the driver. This was previously done in musb_g_tx (musb_gadget.c), but incorrectly (all csr flags were cleared, and only MUSB_TXCSR_MODE and MUSB_TXCSR_TXPKTRDY were set). Fixing that problem allows some requests to be transferred correctly, but multiple requests were often put together in one USB packet, and caused problems if the packet size was not a multiple of 4. Instead, set MUSB_TXCSR_TXPKTRDY in dma_controller_irq (musbhsdma.c), just like host mode transfers. This topic was originally tackled by Nicolas Boichat [0] [1] and is discussed further at [2] as part of his GSoC project [3]. [0] https://groups.google.com/forum/?hl=en#!topic/beagleboard-gsoc/k8Azwfp75CU [1] https://gitorious.org/beagleboard-usbsniffer/beagleboard-usbsniffer-kernel/commit/b0be3b6cc195ba732189b04f1d43ec843c3e54c9?p=beagleboard-usbsniffer:beagleboard-usbsniffer-kernel.git;a=patch;h=b0be3b6cc195ba732189b04f1d43ec843c3e54c9 [2] http://beagleboard-usbsniffer.blogspot.com/2010/07/musb-isochronous-transfers-fixed.html [3] http://elinux.org/BeagleBoard/GSoC/USBSniffer Fixes: 550a7375fe72 ("USB: Add MUSB and TUSB support") Signed-off-by: Paul Elder <paul.elder@ideasonboard.com> Signed-off-by: Bin Liu <b-liu@ti.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12usb: gadget: udc: net2272: Fix bitwise and boolean operationsGustavo A. R. Silva
commit 07c69f1148da7de3978686d3af9263325d9d60bd upstream. (!x & y) strikes again. Fix bitwise and boolean operations by enclosing the expression: intcsr & (1 << NET2272_PCI_IRQ) in parentheses, before applying the boolean operator '!'. Notice that this code has been there since 2011. So, it would be helpful if someone can double-check this. This issue was detected with the help of Coccinelle. Fixes: ceb80363b2ec ("USB: net2272: driver for PLX NET2272 USB device controller") Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12usb: phy: am335x: fix race condition in _probeBin Liu
commit a53469a68eb886e84dd8b69a1458a623d3591793 upstream. power off the phy should be done before populate the phy. Otherwise, am335x_init() could be called by the phy owner to power on the phy first, then am335x_phy_probe() turns off the phy again without the caller knowing it. Fixes: 2fc711d76352 ("usb: phy: am335x: Enable USB remote wakeup using PHY wakeup") Cc: stable@vger.kernel.org # v3.18+ Signed-off-by: Bin Liu <b-liu@ti.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12dmaengine: imx-dma: fix wrong callback invokeLeonid Iziumtsev
commit 341198eda723c8c1cddbb006a89ad9e362502ea2 upstream. Once the "ld_queue" list is not empty, next descriptor will migrate into "ld_active" list. The "desc" variable will be overwritten during that transition. And later the dmaengine_desc_get_callback_invoke() will use it as an argument. As result we invoke wrong callback. That behaviour was in place since: commit fcaaba6c7136 ("dmaengine: imx-dma: fix callback path in tasklet"). But after commit 4cd13c21b207 ("softirq: Let ksoftirqd do its job") things got worse, since possible delay between tasklet_schedule() from DMA irq handler and actual tasklet function execution got bigger. And that gave more time for new DMA request to be submitted and to be put into "ld_queue" list. It has been noticed that DMA issue is causing problems for "mxc-mmc" driver. While stressing the system with heavy network traffic and writing/reading to/from sd card simultaneously the timeout may happen: 10013000.sdhci: mxcmci_watchdog: read time out (status = 0x30004900) That often lead to file system corruption. Signed-off-by: Leonid Iziumtsev <leonid.iziumtsev@gmail.com> Signed-off-by: Vinod Koul <vkoul@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12dmaengine: bcm2835: Fix abort of transactionsLukas Wunner
commit 9e528c799d17a4ac37d788c81440b50377dd592d upstream. There are multiple issues with bcm2835_dma_abort() (which is called on termination of a transaction): * The algorithm to abort the transaction first pauses the channel by clearing the ACTIVE flag in the CS register, then waits for the PAUSED flag to clear. Page 49 of the spec documents the latter as follows: "Indicates if the DMA is currently paused and not transferring data. This will occur if the active bit has been cleared [...]" https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf So the function is entering an infinite loop because it is waiting for PAUSED to clear which is always set due to the function having cleared the ACTIVE flag. The only thing that's saving it from itself is the upper bound of 10000 loop iterations. The code comment says that the intention is to "wait for any current AXI transfer to complete", so the author probably wanted to check the WAITING_FOR_OUTSTANDING_WRITES flag instead. Amend the function accordingly. * The CS register is only read at the beginning of the function. It needs to be read again after pausing the channel and before checking for outstanding writes, otherwise writes which were issued between the register read at the beginning of the function and pausing the channel may not be waited for. * The function seeks to abort the transfer by writing 0 to the NEXTCONBK register and setting the ABORT and ACTIVE flags. Thereby, the 0 in NEXTCONBK is sought to be loaded into the CONBLK_AD register. However experimentation has shown this approach to not work: The CONBLK_AD register remains the same as before and the CS register contains 0x00000030 (PAUSED | DREQ_STOPS_DMA). In other words, the control block is not aborted but merely paused and it will be resumed once the next DMA transaction is started. That is absolutely not the desired behavior. A simpler approach is to set the channel's RESET flag instead. This reliably zeroes the NEXTCONBK as well as the CS register. It requires less code and only a single MMIO write. This is also what popular user space DMA drivers do, e.g.: https://github.com/metachris/RPIO/blob/master/source/c_pwm/pwm.c Note that the spec is contradictory whether the NEXTCONBK register is writeable at all. On the one hand, page 41 claims: "The value loaded into the NEXTCONBK register can be overwritten so that the linked list of Control Block data structures can be dynamically altered. However it is only safe to do this when the DMA is paused." On the other hand, page 40 specifies: "Only three registers in each channel's register set are directly writeable (CS, CONBLK_AD and DEBUG). The other registers (TI, SOURCE_AD, DEST_AD, TXFR_LEN, STRIDE & NEXTCONBK), are automatically loaded from a Control Block data structure held in external memory." Fixes: 96286b576690 ("dmaengine: Add support for BCM2835") Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: stable@vger.kernel.org # v3.14+ Cc: Frank Pavlic <f.pavlic@kunbus.de> Cc: Martin Sperl <kernel@martin.sperl.org> Cc: Florian Meier <florian.meier@koalo.de> Cc: Clive Messer <clive.m.messer@gmail.com> Cc: Matthias Reichl <hias@horus.com> Tested-by: Stefan Wahren <stefan.wahren@i2se.com> Acked-by: Florian Kauer <florian.kauer@koalo.de> Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12dmaengine: bcm2835: Fix interrupt race on RTLukas Wunner
commit f7da7782aba92593f7b82f03d2409a1c5f4db91b upstream. If IRQ handlers are threaded (either because CONFIG_PREEMPT_RT_BASE is enabled or "threadirqs" was passed on the command line) and if system load is sufficiently high that wakeup latency of IRQ threads degrades, SPI DMA transactions on the BCM2835 occasionally break like this: ks8851 spi0.0: SPI transfer timed out bcm2835-dma 3f007000.dma: DMA transfer could not be terminated ks8851 spi0.0 eth2: ks8851_rdfifo: spi_sync() failed The root cause is an assumption made by the DMA driver which is documented in a code comment in bcm2835_dma_terminate_all(): /* * Stop DMA activity: we assume the callback will not be called * after bcm_dma_abort() returns (even if it does, it will see * c->desc is NULL and exit.) */ That assumption falls apart if the IRQ handler bcm2835_dma_callback() is threaded: A client may terminate a descriptor and issue a new one before the IRQ handler had a chance to run. In fact the IRQ handler may miss an *arbitrary* number of descriptors. The result is the following race condition: 1. A descriptor finishes, its interrupt is deferred to the IRQ thread. 2. A client calls dma_terminate_async() which sets channel->desc = NULL. 3. The client issues a new descriptor. Because channel->desc is NULL, bcm2835_dma_issue_pending() immediately starts the descriptor. 4. Finally the IRQ thread runs and writes BCM2835_DMA_INT to the CS register to acknowledge the interrupt. This clears the ACTIVE flag, so the newly issued descriptor is paused in the middle of the transaction. Because channel->desc is not NULL, the IRQ thread finalizes the descriptor and tries to start the next one. I see two possible solutions: The first is to call synchronize_irq() in bcm2835_dma_issue_pending() to wait until the IRQ thread has finished before issuing a new descriptor. The downside of this approach is unnecessary latency if clients desire rapidly terminating and re-issuing descriptors and don't have any use for an IRQ callback. (The SPI TX DMA channel is a case in point.) A better alternative is to make the IRQ thread recognize that it has missed descriptors and avoid finalizing the newly issued descriptor. So first of all, set the ACTIVE flag when acknowledging the interrupt. This keeps a newly issued descriptor running. If the descriptor was finished, the channel remains idle despite the ACTIVE flag being set. However the ACTIVE flag can then no longer be used to check whether the channel is idle, so instead check whether the register containing the current control block address is zero and finalize the current descriptor only if so. That way, there is no impact on latency and throughput if the client doesn't care for the interrupt: Only minimal additional overhead is introduced for non-cyclic descriptors as one further MMIO read is necessary per interrupt to check for idleness of the channel. Cyclic descriptors are sped up slightly by removing one MMIO write per interrupt. Fixes: 96286b576690 ("dmaengine: Add support for BCM2835") Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: stable@vger.kernel.org # v3.14+ Cc: Frank Pavlic <f.pavlic@kunbus.de> Cc: Martin Sperl <kernel@martin.sperl.org> Cc: Florian Meier <florian.meier@koalo.de> Cc: Clive Messer <clive.m.messer@gmail.com> Cc: Matthias Reichl <hias@horus.com> Tested-by: Stefan Wahren <stefan.wahren@i2se.com> Acked-by: Florian Kauer <florian.kauer@koalo.de> Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12fuse: handle zero sized retrieve correctlyMiklos Szeredi
commit 97e1532ef81acb31c30f9e75bf00306c33a77812 upstream. Dereferencing req->page_descs[0] will Oops if req->max_pages is zero. Reported-by: syzbot+c1e36d30ee3416289cc0@syzkaller.appspotmail.com Tested-by: syzbot+c1e36d30ee3416289cc0@syzkaller.appspotmail.com Fixes: b2430d7567a3 ("fuse: add per-page descriptor <offset, length> to fuse_req") Cc: <stable@vger.kernel.org> # v3.9 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12fuse: decrement NR_WRITEBACK_TEMP on the right pageMiklos Szeredi
commit a2ebba824106dabe79937a9f29a875f837e1b6d4 upstream. NR_WRITEBACK_TEMP is accounted on the temporary page in the request, not the page cache page. Fixes: 8b284dc47291 ("fuse: writepages: handle same page rewrites") Cc: <stable@vger.kernel.org> # v3.13 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12fuse: call pipe_buf_release() under pipe lockJann Horn
commit 9509941e9c534920ccc4771ae70bd6cbbe79df1c upstream. Some of the pipe_buf_release() handlers seem to assume that the pipe is locked - in particular, anon_pipe_buf_release() accesses pipe->tmp_page without taking any extra locks. From a glance through the callers of pipe_buf_release(), it looks like FUSE is the only one that calls pipe_buf_release() without having the pipe locked. This bug should only lead to a memory leak, nothing terrible. Fixes: dd3bb14f44a6 ("fuse: support splice() writing to fuse device") Cc: stable@vger.kernel.org Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12ALSA: hda - Serialize codec registrationsTakashi Iwai
commit 305a0ade180981686eec1f92aa6252a7c6ebb1cf upstream. In the current code, the codec registration may happen both at the codec bind time and the end of the controller probe time. In a rare occasion, they race with each other, leading to Oops due to the still uninitialized card device. This patch introduces a simple flag to prevent the codec registration at the codec bind time as long as the controller probe is going on. The controller probe invokes snd_card_register() that does the whole registration task, and we don't need to register each piece beforehand. Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12ALSA: compress: Fix stop handling on compressed capture streamsCharles Keepax
commit 4f2ab5e1d13d6aa77c55f4914659784efd776eb4 upstream. It is normal user behaviour to start, stop, then start a stream again without closing it. Currently this works for compressed playback streams but not capture ones. The states on a compressed capture stream go directly from OPEN to PREPARED, unlike a playback stream which moves to SETUP and waits for a write of data before moving to PREPARED. Currently however, when a stop is sent the state is set to SETUP for both types of streams. This leaves a capture stream in the situation where a new start can't be sent as that requires the state to be PREPARED and a new set_params can't be sent as that requires the state to be OPEN. The only option being to close the stream, and then reopen. Correct this issues by allowing snd_compr_drain_notify to set the state depending on the stream direction, as we already do in set_params. Fixes: 49bb6402f1aa ("ALSA: compress_core: Add support for capture streams") Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12enic: fix checksum validation for IPv6Govindarajulu Varadarajan
[ Upstream commit 7596175e99b3d4bce28022193efd954c201a782a ] In case of IPv6 pkts, ipv4_csum_ok is 0. Because of this, driver does not set skb->ip_summed. So IPv6 rx checksum is not offloaded. Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12net: dsa: slave: Don't propagate flag changes on down slave interfacesRundong Ge
[ Upstream commit 17ab4f61b8cd6f9c38e9d0b935d86d73b5d0d2b5 ] The unbalance of master's promiscuity or allmulti will happen after ifdown and ifup a slave interface which is in a bridge. When we ifdown a slave interface , both the 'dsa_slave_close' and 'dsa_slave_change_rx_flags' will clear the master's flags. The flags of master will be decrease twice. In the other hand, if we ifup the slave interface again, since the slave's flags were cleared the 'dsa_slave_open' won't set the master's flag, only 'dsa_slave_change_rx_flags' that triggered by 'br_add_if' will set the master's flags. The flags of master is increase once. Only propagating flag changes when a slave interface is up makes sure this does not happen. The 'vlan_dev_change_rx_flags' had the same problem and was fixed, and changes here follows that fix. Fixes: 91da11f870f0 ("net: Distributed Switch Architecture protocol support") Signed-off-by: Rundong Ge <rdong.ge@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12net/mlx5e: Force CHECKSUM_UNNECESSARY for short ethernet framesCong Wang
[ Upstream commit e8c8b53ccaff568fef4c13a6ccaf08bf241aa01a ] When an ethernet frame is padded to meet the minimum ethernet frame size, the padding octets are not covered by the hardware checksum. Fortunately the padding octets are usually zero's, which don't affect checksum. However, we have a switch which pads non-zero octets, this causes kernel hardware checksum fault repeatedly. Prior to: commit '88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE ...")' skb checksum was forced to be CHECKSUM_NONE when padding is detected. After it, we need to keep skb->csum updated, like what we do for RXFCS. However, fixing up CHECKSUM_COMPLETE requires to verify and parse IP headers, it is not worthy the effort as the packets are so small that CHECKSUM_COMPLETE can't save anything. Fixes: 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"), Cc: Eric Dumazet <edumazet@google.com> Cc: Tariq Toukan <tariqt@mellanox.com> Cc: Nikola Ciprich <nikola.ciprich@linuxbox.cz> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12net: systemport: Fix WoL with password after deep sleepFlorian Fainelli
[ Upstream commit 8dfb8d2cceb76b74ad5b58cc65c75994329b4d5e ] Broadcom STB chips support a deep sleep mode where all register contents are lost. Because we were stashing the MagicPacket password into some of these registers a suspend into that deep sleep then a resumption would not lead to being able to wake-up from MagicPacket with password again. Fix this by keeping a software copy of the password and program it during suspend. Fixes: 83e82f4c706b ("net: systemport: add Wake-on-LAN support") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12rds: fix refcount bug in rds_sock_addrefEric Dumazet
[ Upstream commit 6fa19f5637a6c22bc0999596bcc83bdcac8a4fa6 ] syzbot was able to catch a bug in rds [1] The issue here is that the socket might be found in a hash table but that its refcount has already be set to 0 by another cpu. We need to use refcount_inc_not_zero() to be safe here. [1] refcount_t: increment on 0; use-after-free. WARNING: CPU: 1 PID: 23129 at lib/refcount.c:153 refcount_inc_checked lib/refcount.c:153 [inline] WARNING: CPU: 1 PID: 23129 at lib/refcount.c:153 refcount_inc_checked+0x61/0x70 lib/refcount.c:151 Kernel panic - not syncing: panic_on_warn set ... CPU: 1 PID: 23129 Comm: syz-executor3 Not tainted 5.0.0-rc4+ #53 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1db/0x2d0 lib/dump_stack.c:113 panic+0x2cb/0x65c kernel/panic.c:214 __warn.cold+0x20/0x48 kernel/panic.c:571 report_bug+0x263/0x2b0 lib/bug.c:186 fixup_bug arch/x86/kernel/traps.c:178 [inline] fixup_bug arch/x86/kernel/traps.c:173 [inline] do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:271 do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:290 invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:973 RIP: 0010:refcount_inc_checked lib/refcount.c:153 [inline] RIP: 0010:refcount_inc_checked+0x61/0x70 lib/refcount.c:151 Code: 1d 51 63 c8 06 31 ff 89 de e8 eb 1b f2 fd 84 db 75 dd e8 a2 1a f2 fd 48 c7 c7 60 9f 81 88 c6 05 31 63 c8 06 01 e8 af 65 bb fd <0f> 0b eb c1 90 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 54 49 RSP: 0018:ffff8880a0cbf1e8 EFLAGS: 00010282 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffc90006113000 RDX: 000000000001047d RSI: ffffffff81685776 RDI: 0000000000000005 RBP: ffff8880a0cbf1f8 R08: ffff888097c9e100 R09: ffffed1015ce5021 R10: ffffed1015ce5020 R11: ffff8880ae728107 R12: ffff8880723c20c0 R13: ffff8880723c24b0 R14: dffffc0000000000 R15: ffffed1014197e64 sock_hold include/net/sock.h:647 [inline] rds_sock_addref+0x19/0x20 net/rds/af_rds.c:675 rds_find_bound+0x97c/0x1080 net/rds/bind.c:82 rds_recv_incoming+0x3be/0x1430 net/rds/recv.c:362 rds_loop_xmit+0xf3/0x2a0 net/rds/loop.c:96 rds_send_xmit+0x1355/0x2a10 net/rds/send.c:355 rds_sendmsg+0x323c/0x44e0 net/rds/send.c:1368 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg+0xdd/0x130 net/socket.c:631 __sys_sendto+0x387/0x5f0 net/socket.c:1788 __do_sys_sendto net/socket.c:1800 [inline] __se_sys_sendto net/socket.c:1796 [inline] __x64_sys_sendto+0xe1/0x1a0 net/socket.c:1796 do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x458089 Code: 6d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fc266df8c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 0000000000458089 RDX: 0000000000000000 RSI: 00000000204b3fff RDI: 0000000000000005 RBP: 000000000073bf00 R08: 00000000202b4000 R09: 0000000000000010 R10: 0000000000000000 R11: 0000000000000246 R12: 00007fc266df96d4 R13: 00000000004c56e4 R14: 00000000004d94a8 R15: 00000000ffffffff Fixes: cc4dfb7f70a3 ("rds: fix two RCU related problems") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Sowmini Varadhan <sowmini.varadhan@oracle.com> Cc: Santosh Shilimkar <santosh.shilimkar@oracle.com> Cc: rds-devel@oss.oracle.com Cc: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12skge: potential memory corruption in skge_get_regs()Dan Carpenter
[ Upstream commit 294c149a209c6196c2de85f512b52ef50f519949 ] The "p" buffer is 0x4000 bytes long. B3_RI_WTO_R1 is 0x190. The value of "regs->len" is in the 1-0x4000 range. The bug here is that "regs->len - B3_RI_WTO_R1" can be a negative value which would lead to memory corruption and an abrupt crash. Fixes: c3f8be961808 ("[PATCH] skge: expand ethtool debug register dump") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12rxrpc: bad unlock balance in rxrpc_recvmsgEric Dumazet
[ Upstream commit 6dce3c20ac429e7a651d728e375853370c796e8d ] When either "goto wait_interrupted;" or "goto wait_error;" paths are taken, socket lock has already been released. This patch fixes following syzbot splat : WARNING: bad unlock balance detected! 5.0.0-rc4+ #59 Not tainted ------------------------------------- syz-executor223/8256 is trying to release lock (sk_lock-AF_RXRPC) at: [<ffffffff86651353>] rxrpc_recvmsg+0x6d3/0x3099 net/rxrpc/recvmsg.c:598 but there are no more locks to release! other info that might help us debug this: 1 lock held by syz-executor223/8256: #0: 00000000fa9ed0f4 (slock-AF_RXRPC){+...}, at: spin_lock_bh include/linux/spinlock.h:334 [inline] #0: 00000000fa9ed0f4 (slock-AF_RXRPC){+...}, at: release_sock+0x20/0x1c0 net/core/sock.c:2798 stack backtrace: CPU: 1 PID: 8256 Comm: syz-executor223 Not tainted 5.0.0-rc4+ #59 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x172/0x1f0 lib/dump_stack.c:113 print_unlock_imbalance_bug kernel/locking/lockdep.c:3391 [inline] print_unlock_imbalance_bug.cold+0x114/0x123 kernel/locking/lockdep.c:3368 __lock_release kernel/locking/lockdep.c:3601 [inline] lock_release+0x67e/0xa00 kernel/locking/lockdep.c:3860 sock_release_ownership include/net/sock.h:1471 [inline] release_sock+0x183/0x1c0 net/core/sock.c:2808 rxrpc_recvmsg+0x6d3/0x3099 net/rxrpc/recvmsg.c:598 sock_recvmsg_nosec net/socket.c:794 [inline] sock_recvmsg net/socket.c:801 [inline] sock_recvmsg+0xd0/0x110 net/socket.c:797 __sys_recvfrom+0x1ff/0x350 net/socket.c:1845 __do_sys_recvfrom net/socket.c:1863 [inline] __se_sys_recvfrom net/socket.c:1859 [inline] __x64_sys_recvfrom+0xe1/0x1a0 net/socket.c:1859 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x446379 Code: e8 2c b3 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 2b 09 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fe5da89fd98 EFLAGS: 00000246 ORIG_RAX: 000000000000002d RAX: ffffffffffffffda RBX: 00000000006dbc28 RCX: 0000000000446379 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003 RBP: 00000000006dbc20 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006dbc2c R13: 0000000000000000 R14: 0000000000000000 R15: 20c49ba5e353f7cf Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: David Howells <dhowells@redhat.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>