aboutsummaryrefslogtreecommitdiff
path: root/arch/powerpc
AgeCommit message (Collapse)Author
2013-07-25powerpc: Rename and flesh out the facility unavailable exception handlerMichael Ellerman
commit 021424a1fce335e05807fd770eb8e1da30a63eea upstream. The exception at 0xf60 is not the TM (Transactional Memory) unavailable exception, it is the "Facility Unavailable Exception", rename it as such. Flesh out the handler to acknowledge the fact that it can be called for many reasons, one of which is TM being unavailable. Use STD_EXCEPTION_COMMON() for the exception body, for some reason we had it open-coded, I've checked the generated code is identical. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-25powerpc: Remove KVMTEST from RELON exception handlersMichael Ellerman
commit c9f69518e5f08170bc857984a077f693d63171df upstream. KVMTEST is a macro which checks whether we are taking an exception from guest context, if so we branch out of line and eventually call into the KVM code to handle the switch. When running real guests on bare metal (HV KVM) the hardware ensures that we never take a relocation on exception when transitioning from guest to host. For PR KVM we disable relocation on exceptions ourself in kvmppc_core_init_vm(), as of commit a413f47 "Disable relocation on exceptions whenever PR KVM is active". So convert all the RELON macros to use NOTEST, and drop the remaining KVM_HANDLER() definitions we have for 0xe40 and 0xe80. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-25powerpc: Remove unreachable relocation on exception handlersMichael Ellerman
commit 1d567cb4bd42d560a7621cac6f6aebe87343689e upstream. We have relocation on exception handlers defined for h_data_storage and h_instr_storage. However we will never take relocation on exceptions for these because they can only come from a guest, and we never take relocation on exceptions when we transition from guest to host. We also have a handler for hmi_exception (Hypervisor Maintenance) which is defined in the architecture to never be delivered with relocation on, see see v2.07 Book III-S section 6.5. So remove the handlers, leaving a branch to self just to be double extra paranoid. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-25powerpc/tm: Fix return of active 64bit signalsMichael Neuling
commit 87b4e5393af77f5cba124638f19f6c426e210aec upstream. Currently we only restore signals which are transactionally suspended but it's possible that the transaction can be restored even when it's active. Most likely this will result in a transactional rollback by the hardware as the transaction will have been doomed by an earlier treclaim. The current code is a legacy of earlier kernel implementations which did software rollback of active transactions in the kernel. That code has now gone but we didn't correctly fix up this part of the signals code which still makes assumptions based on having software rollback. This changes the signal return code to always restore both contexts on 64 bit signal return. It also ensures that the MSR TM bits are properly restored from the signal context which they are not currently. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-25powerpc/tm: Fix return of 32bit rt signals to active transactionsMichael Neuling
commit 55e4341850ac56e63a3eefe9583a9000042164fa upstream. Currently we only restore signals which are transactionally suspended but it's possible that the transaction can be restored even when it's active. Most likely this will result in a transactional rollback by the hardware as the transaction will have been doomed by an earlier treclaim. The current code is a legacy of earlier kernel implementations which did software rollback of active transactions in the kernel. That code has now gone but we didn't correctly fix up this part of the signals code which still makes assumptions based on having software rollback. This changes the signal return code to always restore both contexts on 32 bit rt signal return. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-25powerpc/tm: Fix restoration of MSR on 32bit signal returnMichael Neuling
commit 2c27a18f8736da047bef2b997bdd48efc667e3c9 upstream. Currently we clear out the MSR TM bits on signal return assuming that the signal should never return to an active transaction. This is bogus as the user may do this. It's most likely the transaction will be doomed due to a treclaim but that's a problem for the HW not the kernel. The current code is a legacy of earlier kernel implementations which did software rollback of active transactions in the kernel. That code has now gone but we didn't correctly fix up this part of the signals code which still makes the assumption that it must be returning to a suspended transaction. This pulls out both MSR TM bits from the user supplied context rather than just setting TM suspend. We pull out only the bits needed to ensure the user can't do anything dangerous to the MSR. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-25powerpc/tm: Fix 32 bit non-rt signalsMichael Neuling
commit fee55450710dff32a13ae30b4129ec7b5a4b44d0 upstream. Currently sys_sigreturn() is TM unaware. Therefore, if we take a 32 bit signal without SIGINFO (non RT) inside a transaction, on signal return we don't restore the signal frame correctly. This checks if the signal frame being restoring is an active transaction, and if so, it copies the additional state to ptregs so it can be restored. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-25powerpc/tm: Fix writing top half of MSR on 32 bit signalsMichael Neuling
commit 1d25f11fdbcc5390d68efd98c28900bfd29b264c upstream. The MSR TM controls are in the top 32 bits of the MSR hence on 32 bit signals, we stick the top half of the MSR in the checkpointed signal context so that the user can access it. Unfortunately, we don't currently write anything to the checkpointed signal context when coming in a from a non transactional process and hence the top MSR bits can contain junk. This updates the 32 bit signal handling code to always write something to the top MSR bits so that users know if the process is transactional or not and the kernel can use it on signal return. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-25powerpc/powernv: Fix iommu initialization againBenjamin Herrenschmidt
commit 74251fe21bfa9310ddba9e0436d1fcf389e602ee upstream. So because those things always end up in trainwrecks... In 7846de406 we moved back the iommu initialization earlier, essentially undoing 37f02195b which was causing us endless trouble... except that in the meantime we had merged 959c9bdd58 (to workaround the original breakage) which is now ... broken :-) This fixes it by doing a partial revert of the latter (we keep the ppc_md. path which will be needed in the hotplug case, which happens also during some EEH error recovery situations). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-25powerpc/hw_brk: Fix off by one error when validating DAWR region endMichael Neuling
commit e2a800beaca1f580945773e57d1a0e7cd37b1056 upstream. The Data Address Watchpoint Register (DAWR) on POWER8 can take a 512 byte range but this range must not cross a 512 byte boundary. Unfortunately we were off by one when calculating the end of the region, hence we were not allowing some breakpoint regions which were actually valid. This fixes this error. Signed-off-by: Michael Neuling <mikey@neuling.org> Reported-by: Edjunior Barbosa Machado <emachado@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-25powerpc/hw_brk: Fix clearing of extraneous IRQMichael Neuling
commit 540e07c67efe42ef6b6be4f1956931e676d58a15 upstream. In 9422de3 "powerpc: Hardware breakpoints rewrite to handle non DABR breakpoint registers" we changed the way we mark extraneous irqs with this: - info->extraneous_interrupt = !((bp->attr.bp_addr <= dar) && - (dar - bp->attr.bp_addr < bp->attr.bp_len)); + if (!((bp->attr.bp_addr <= dar) && + (dar - bp->attr.bp_addr < bp->attr.bp_len))) + info->type |= HW_BRK_TYPE_EXTRANEOUS_IRQ; Unfortunately this is bogus as it never clears extraneous IRQ if it's already set. This correctly clears extraneous IRQ before possibly setting it. Signed-off-by: Michael Neuling <mikey@neuling.org> Reported-by: Edjunior Barbosa Machado <emachado@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-25powerpc/hw_brk: Fix setting of length for exact mode breakpointsMichael Neuling
commit b0b0aa9c7faf94e92320eabd8a1786c7747e40a8 upstream. The smallest match region for both the DABR and DAWR is 8 bytes, so the kernel needs to filter matches when users want to look at regions smaller than this. Currently we set the length of PPC_BREAKPOINT_MODE_EXACT breakpoints to 8. This is wrong as in exact mode we should only match on 1 address, hence the length should be 1. This ensures that the kernel will filter out any exact mode hardware breakpoint matches on any addresses other than the requested one. Signed-off-by: Michael Neuling <mikey@neuling.org> Reported-by: Edjunior Barbosa Machado <emachado@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-30powerpc/eeh: Fix fetching bus for single-dev-PEGavin Shan
While running Linux as guest on top of phyp, we possiblly have PE that includes single PCI device. However, we didn't return its PCI bus correctly and it leads to failure on recovery from EEH errors for single-dev-PE. The patch fixes the issue. Cc: <stable@vger.kernel.org> # v3.7+ Cc: Steve Best <sbest@us.ibm.com> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-29Merge branch 'merge' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc fixes from Ben Herrenschmidt: "We discovered some breakage in our "EEH" (PCI Error Handling) code while doing error injection, due to a couple of regressions. One of them is due to a patch (37f02195bee9 "powerpc/pci: fix PCI-e devices rescan issue on powerpc platform") that, in hindsight, I shouldn't have merged considering that it caused more problems than it solved. Please pull those two fixes. One for a simple EEH address cache initialization issue. The other one is a patch from Guenter that I had originally planned to put in 3.11 but which happens to also fix that other regression (a kernel oops during EEH error handling and possibly hotplug). With those two, the couple of test machines I've hammered with error injection are remaining up now. EEH appears to still fail to recover on some devices, so there is another problem that Gavin is looking into but at least it's no longer crashing the kernel." * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc/pci: Improve device hotplug initialization powerpc/eeh: Add eeh_dev to the cache during boot
2013-06-30powerpc/pci: Improve device hotplug initializationGuenter Roeck
Commit 37f02195b (powerpc/pci: fix PCI-e devices rescan issue on powerpc platform) fixes a problem with interrupt and DMA initialization on hot plugged devices. With this commit, interrupt and DMA initialization for hot plugged devices is handled in the pci device enable function. This approach has a couple of drawbacks. First, it creates two code paths for device initialization, one for hot plugged devices and another for devices known during the initial PCI scan. Second, the initialization code for hot plugged devices is only called when the device is enabled, ie typically in the probe function. Also, the platform specific setup code is called each time pci_enable_device() is called, not only once during device discovery, meaning it is actually called multiple times, once for devices discovered during the initial scan and again each time a driver is re-loaded. The visible result is that interrupt pins are only assigned to hot plugged devices when the device driver is loaded. Effectively this changes the PCI probe API, since pci_dev->irq and the device's dma configuration will now only be valid after pci_enable() was called at least once. A more subtle change is that platform specific PCI device setup is moved from device discovery into the driver's probe function, more specifically into the pci_enable_device() call. To fix the inconsistencies, add new function pcibios_add_device. Call pcibios_setup_device from pcibios_setup_bus_devices if device setup is not complete, and from pcibios_add_device if bus setup is complete. With this change, device setup code is moved back into device initialization, and called exactly once for both static and hot plugged devices. [ This also fixes a regression introduced by the above patch which causes dev->irq to be overwritten under some cirumstances after MSIs have been enabled for the device which leads to crashes due to the MSI core "hijacking" dev->irq to store the base MSI number and not the LSI. --BenH ] Cc: Yuanquan Chen <Yuanquan.Chen@freescale.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Hiroo Matsumoto <matsumoto.hiroo@jp.fujitsu.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-28powerpc/eeh: Add eeh_dev to the cache during bootThadeu Lima de Souza Cascardo
commit f8f7d63fd96ead101415a1302035137a866f8998 ("powerpc/eeh: Trace eeh device from I/O cache") broke EEH on pseries for devices that were present during boot and have not been hotplugged/DLPARed. eeh_check_failure will get the eeh_dev from the cache, and will get NULL. eeh_addr_cache_build adds the addresses to the cache, but eeh_dev for the giving pci_device is not set yet. Just reordering the call to eeh_addr_cache_insert_dev works fine. The ordering is similar to the one in eeh_add_device_late. Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Acked-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-25Merge branch 'merge' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc bugfix from Ben Herrenschmidt: "This is a fix for a regression causing a freescale "83xx" based platforms to crash on boot due to some PCI breakage" * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc/pci: Fix boot panic on mpc83xx (regression)
2013-06-24powerpc/pci: Fix boot panic on mpc83xx (regression)Rojhalat Ibrahim
The following commit caused a fatal oops when booting on mpc83xx with a non-express PCI bus (regardless of whether a PCI device is present): commit 50d8f87d2b39313dae9d0a2d9b23d377328f2f7b Author: Rojhalat Ibrahim <imr@rtschenk.de> Date: Mon Apr 8 10:15:28 2013 +0200 powerpc/fsl-pci Make PCIe hotplug work with Freescale PCIe controllers Up to now the PCIe link status on Freescale PCIe controllers was only checked once at boot time. So hotplug did not work. With this patch the link status is checked on every config read. PCIe devices not present at boot time are found after doing 'echo 1 >/sys/bus/pci/rescan'. Signed-off-by: Rojhalat Ibrahim <imr@rtschenk.de> Signed-off-by: Kumar Gala <galak@kernel.crashing.org> This patch fixes the issue by calling setup_indirect_pci for all device types. fsl_indirect_read_config is now only used for booke/86xx PCIe controllers. Reported-by: Michael Guntsche <mike@it-loops.com> Cc: Scott Wood <scottwood@freescale.com> Signed-off-by: Rojhalat Ibrahim <imr@rtschenk.de> Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-06-21Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Paolo Bonzini: "Three one-line fixes for my first pull request; one for x86 host, one for x86 guest, one for PPC" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: x86: kvmclock: zero initialize pvclock shared memory area kvm/ppc/booke: Delay kvmppc_lazy_ee_enable KVM: x86: remove vcpu's CPL check in host-invoked XCR set
2013-06-20powerpc: Fix bad pmd error with book3E configAneesh Kumar K.V
Book3E uses the hugepd at PMD level and don't encode pte directly at the pmd level. So it will find the lower bits of pmd set and the pmd_bad check throws error. Infact the current code will never take the free_hugepd_range call at all because it will clear the pmd if it find a hugepd pointer. Fix this by clearing bad pmd only if it is not a hugepd pointer. This is regression introduced by e2b3d202d1dba8f3546ed28224ce485bc50010be "powerpc: Switch 16GB and 16MB explicit hugepages to a different page table format" Reported-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-19kvm/ppc/booke: Delay kvmppc_lazy_ee_enableScott Wood
kwmppc_lazy_ee_enable() should be called as late as possible, or else we get things like WARN_ON(preemptible()) in enable_kernel_fp() in configurations where preemptible() works. Note that book3s_pr already waits until just before __kvmppc_vcpu_run to call kvmppc_lazy_ee_enable(). Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-14Merge branch 'merge' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc fixes from Benjamin Herrenschmidt: "So here are 3 fixes still for 3.10. Fixes are simple, bugs are nasty (though not recent regressions, nasty enough) and all targeted at stable" * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc: Fix missing/delayed calls to irq_work powerpc: Fix emulation of illegal instructions on PowerNV platform powerpc: Fix stack overflow crash in resume_kernel when ftracing
2013-06-15powerpc: Fix missing/delayed calls to irq_workBenjamin Herrenschmidt
When replaying interrupts (as a result of the interrupt occurring while soft-disabled), in the case of the decrementer, we are exclusively testing for a pending timer target. However we also use decrementer interrupts to trigger the new "irq_work", which in this case would be missed. This change the logic to force a replay in both cases of a timer boundary reached and a decrementer interrupt having actually occurred while disabled. The former test is still useful to catch cases where a CPU having been hard-disabled for a long time completely misses the interrupt due to a decrementer rollover. CC: <stable@vger.kernel.org> [v3.4+] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Tested-by: Steven Rostedt <rostedt@goodmis.org>
2013-06-15powerpc: Fix emulation of illegal instructions on PowerNV platformPaul Mackerras
Normally, the kernel emulates a few instructions that are unimplemented on some processors (e.g. the old dcba instruction), or privileged (e.g. mfpvr). The emulation of unimplemented instructions is currently not working on the PowerNV platform. The reason is that on these machines, unimplemented and illegal instructions cause a hypervisor emulation assist interrupt, rather than a program interrupt as on older CPUs. Our vector for the emulation assist interrupt just calls program_check_exception() directly, without setting the bit in SRR1 that indicates an illegal instruction interrupt. This fixes it by making the emulation assist interrupt set that bit before calling program_check_interrupt(). With this, old programs that use no-longer implemented instructions such as dcba now work again. CC: <stable@vger.kernel.org> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-15powerpc: Fix stack overflow crash in resume_kernel when ftracingMichael Ellerman
It's possible for us to crash when running with ftrace enabled, eg: Bad kernel stack pointer bffffd12 at c00000000000a454 cpu 0x3: Vector: 300 (Data Access) at [c00000000ffe3d40] pc: c00000000000a454: resume_kernel+0x34/0x60 lr: c00000000000335c: performance_monitor_common+0x15c/0x180 sp: bffffd12 msr: 8000000000001032 dar: bffffd12 dsisr: 42000000 If we look at current's stack (paca->__current->stack) we see it is equal to c0000002ecab0000. Our stack is 16K, and comparing to paca->kstack (c0000002ecab3e30) we can see that we have overflowed our kernel stack. This leads to us writing over our struct thread_info, and in this case we have corrupted thread_info->flags and set _TIF_EMULATE_STACK_STORE. Dumping the stack we see: 3:mon> t c0000002ecab0000 [c0000002ecab0000] c00000000002131c .performance_monitor_exception+0x5c/0x70 [c0000002ecab0080] c00000000000335c performance_monitor_common+0x15c/0x180 --- Exception: f01 (Performance Monitor) at c0000000000fb2ec .trace_hardirqs_off+0x1c/0x30 [c0000002ecab0370] c00000000016fdb0 .trace_graph_entry+0xb0/0x280 (unreliable) [c0000002ecab0410] c00000000003d038 .prepare_ftrace_return+0x98/0x130 [c0000002ecab04b0] c00000000000a920 .ftrace_graph_caller+0x14/0x28 [c0000002ecab0520] c0000000000d6b58 .idle_cpu+0x18/0x90 [c0000002ecab05a0] c00000000000a934 .return_to_handler+0x0/0x34 [c0000002ecab0620] c00000000001e660 .timer_interrupt+0x160/0x300 [c0000002ecab06d0] c0000000000025dc decrementer_common+0x15c/0x180 --- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0 [c0000002ecab09c0] c0000000000fe044 .trace_hardirqs_on+0x14/0x30 (unreliable) [c0000002ecab0fb0] c00000000016fe3c .trace_graph_entry+0x13c/0x280 [c0000002ecab1050] c00000000003d038 .prepare_ftrace_return+0x98/0x130 [c0000002ecab10f0] c00000000000a920 .ftrace_graph_caller+0x14/0x28 [c0000002ecab1160] c0000000000161f0 .__ppc64_runlatch_on+0x10/0x40 [c0000002ecab11d0] c00000000000a934 .return_to_handler+0x0/0x34 --- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0 ... and so on __ppc64_runlatch_on() is called from RUNLATCH_ON in the exception entry path. At that point the irq state is not consistent, ie. interrupts are hard disabled (by the exception entry), but the paca soft-enabled flag may be out of sync. This leads to the local_irq_restore() in trace_graph_entry() actually enabling interrupts, which we do not want. Because we have not yet reprogrammed the decrementer we immediately take another decrementer exception, and recurse. The fix is twofold. Firstly make sure we call DISABLE_INTS before calling RUNLATCH_ON. The badly named DISABLE_INTS actually reconciles the irq state in the paca with the hardware, making it safe again to call local_irq_save/restore(). Although that should be sufficient to fix the bug, we also mark the runlatch routines as notrace. They are called very early in the exception entry and we are asking for trouble tracing them. They are also fairly uninteresting and tracing them just adds unnecessary overhead. [ This regression was introduced by fe1952fc0afb9a2e4c79f103c08aef5d13db1873 "powerpc: Rework runlatch code" by myself --BenH ] CC: <stable@vger.kernel.org> [v3.4+] Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-11Merge branch 'fixes' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm bugfixes from Gleb Natapov: "There is one more fix for MIPS KVM ABI here, MIPS and PPC build breakage fixes and a couple of PPC bug fixes" * 'fixes' of git://git.kernel.org/pub/scm/virt/kvm/kvm: kvm/ppc/booke64: Fix lazy ee handling in kvmppc_handle_exit() kvm/ppc/booke: Hold srcu lock when calling gfn functions kvm/ppc/booke64: Disable e6500 support kvm/ppc/booke64: Fix AltiVec interrupt numbers and build breakage mips/kvm: Use KVM_REG_MIPS and proper size indicators for *_ONE_REG kvm: Add definition of KVM_REG_MIPS KVM: add kvm_para_available to asm-generic/kvm_para.h
2013-06-11kvm/ppc/booke64: Fix lazy ee handling in kvmppc_handle_exit()Scott Wood
EE is hard-disabled on entry to kvmppc_handle_exit(), so call hard_irq_disable() so that PACA_IRQ_HARD_DIS is set, and soft_enabled is unset. Without this, we get warnings such as arch/powerpc/kernel/time.c:300, and sometimes host kernel hangs. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-06-11kvm/ppc/booke: Hold srcu lock when calling gfn functionsScott Wood
KVM core expects arch code to acquire the srcu lock when calling gfn_to_memslot and similar functions. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-06-11kvm/ppc/booke64: Disable e6500 supportScott Wood
The previous patch made 64-bit booke KVM build again, but Altivec support is still not complete, and we can't prevent the guest from turning on Altivec (which can corrupt host state until state save/restore is implemented). Disable e6500 on KVM until this is fixed. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-06-11kvm/ppc/booke64: Fix AltiVec interrupt numbers and build breakageMihai Caraman
Interrupt numbers defined for Book3E follows IVORs definition. Align BOOKE_INTERRUPT_ALTIVEC_UNAVAIL and BOOKE_INTERRUPT_ALTIVEC_ASSIST to this rule which also fixes the build breakage. IVORs 32 and 33 are shared so reflect this in the interrupts naming. This fixes a build break for 64-bit booke KVM. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-06-10powerpc: Partial revert of "Context switch more PMU related SPRs"Michael Ellerman
In commit 59affcd I added context switching of more PMU SPRs, because they are potentially exposed to userspace on Power8. However despite me being a smart arse in the commit message it's actually not correct. In particular it interacts badly with a global perf record. We will have to do something more complicated, but that will have to wait for 3.11. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-10powerpc/perf: Fix deadlock caused by calling printk() in PMU exceptionMichael Ellerman
In commit bc09c21 "Fix finding overflowed PMC in interrupt" we added a printk() to the PMU exception handler. Unfortunately that is not safe. The problem is that the PMU exception may run even when interrupts are soft disabled, aka NMI context. We do this so that we can profile parts of the kernel that have interrupts soft-disabled. But by calling printk() from the exception handler, we can potentially deadlock in the printk code on logbuf_lock, eg: [c00000038ba575c0] c000000000081928 .vprintk_emit+0xa8/0x540 [c00000038ba576a0] c0000000007bcde8 .printk+0x48/0x58 [c00000038ba57710] c000000000076504 .perf_event_interrupt+0x2d4/0x490 [c00000038ba57810] c00000000001f6f8 .performance_monitor_exception+0x48/0x60 [c00000038ba57880] c0000000000032cc performance_monitor_common+0x14c/0x180 --- Exception: f01 (Performance Monitor) at c0000000007b25d4 ._raw_spin_lock_irq +0x64/0xc0 [c00000038ba57bf0] c00000000007ed90 .devkmsg_read+0xd0/0x5a0 [c00000038ba57d00] c0000000001c2934 .vfs_read+0xc4/0x1e0 [c00000038ba57d90] c0000000001c2cd8 .SyS_read+0x58/0xd0 [c00000038ba57e30] c000000000009d54 syscall_exit+0x0/0x98 --- Exception: c01 (System Call) at 00001fffffbf6f7c SP (3ffff6d4de10) is in userspace Fix it by making sure we only call printk() when we are not in NMI context. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Cc: <stable@vger.kernel.org> # 3.9 Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-10powerpc/hw_breakpoints: Add DABRX cpu feature to fix 32-bit regressionMichael Neuling
When introducing support for DABRX in 4474ef0, we broke older 32-bit CPUs that don't have that register. Some CPUs have a DABR but not DABRX. Configuration are: - No 32bit CPUs have DABRX but some have DABR. - POWER4+ and below have the DABR but no DABRX. - 970 and POWER5 and above have DABR and DABRX. - POWER8 has DAWR, hence no DABRX. This introduces CPU_FTR_DABRX and sets it on appropriate CPUs. We use the top 64 bits for CPU FTR bits since only 64 bit CPUs have this. Processors that don't have the DABRX will still work as they will fall back to software filtering these breakpoints via perf_exclude_event(). Signed-off-by: Michael Neuling <mikey@neuling.org> Reported-by: "Gorelik, Jacob (335F)" <jacob.gorelik@jpl.nasa.gov> cc: stable@vger.kernel.org (v3.9 only) Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-10powerpc/power8: Update denormalization handlerMichael Neuling
POWER8 can take a denormalisation exception on any VSX registers. This does the extra 32 VSX registers we don't currently handle. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-10powerpc/pseries: Simplify denormalization handlerMichael Neuling
The following simplifies the denorm code by using macros to generate the long stream of almost identical instructions. This patch results in no changes to the output binary, but removes a lot of lines of code. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-10powerpc/power8: Fix oprofile and perfMichael Neuling
In 2ac6f42 powerpc/cputable: Fix oprofile_cpu_type on power8 we broke all power8 hw events. This reverts this change and uses oprofile_type instead. Perf now works on POWER8 again and oprofile will revert to using timers on POWER8. Kudos to mpe this fix. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-10powerpc/eeh: Don't check RTAS token to get PE addrGavin Shan
RTAS token "ibm,get-config-addr-info" or ibm,get-config-addr-info2" are used to retrieve the PE address according to PCI address, which made up of domain/bus/slot/function. If we don't have those 2 tokens, the domain/bus/slot/function would be used as the address for EEH RTAS operations. Some older f/w might not have those 2 tokens and that blocks the EEH functionality to be initialized. It was introduced by commit e2af155c ("powerpc/eeh: pseries platform EEH initialization"). The patch skips the check on those 2 tokens so we can bring up EEH functionality successfully. And domain/bus/slot/function will be used as address for EEH RTAS operations. Cc: <stable@vger.kernel.org> # v3.4+ Reported-by: Robert Knight <knight@princeton.edu> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Tested-by: Robert Knight <knight@princeton.edu> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-10powerpc/pci: Check the bus address instead of resource address in ↵Kevin Hao
pcibios_fixup_resources If a BAR has the value of 0, we would assume that it is unset yet and then mark the resource as unset and would reassign it later. But after commit 6c5705fe (powerpc/PCI: get rid of device resource fixups) the pcibios_fixup_resources is invoked after the bus address was translated to linux resource. So the value of res->start is resource address. And since the resource and bus address may be different, we should translate it to the bus address before doing the check. Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-01powerpc/cputable: Fix typo on P7+ cputable entryWill Schmidt
Fix a typo in setting COMMON_USER2_POWER7 bits to .cpu_user_features2 cpu specs table. Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-01powerpc/perf: Add missing SIER supportMichael Ellerman
Commit 8f61aa3 "Add support for SIER" missed updates to siar_valid() and perf_get_data_addr(). In both cases we need to check the SIER instead of mmcra. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-01powerpc/perf: Revert to original NO_SIPR logicMichael Ellerman
This is a revert and then some of commit 860aad7 "Add regs_no_sipr()". This workaround was only needed on early chip versions. As before NO_SIPR becomes a static flag of the PMU struct. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-01powerpc/pci: Remove the unused variables in pci_process_bridge_OF_rangesKevin Hao
The codes which ever used these two variables have gone. Throw away them too. Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-01powerpc/pci: Remove the stale comments of pci_process_bridge_OF_rangesKevin Hao
These comments already don't apply to the current code. So just remove them. Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-01powerpc/pseries: Always enable CONFIG_HOTPLUG_CPU on PSERIES SMPSrivatsa S. Bhat
Adam Lackorzynski reported the following build failure on !CONFIG_HOTPLUG_CPU configuration: CC arch/powerpc/kernel/rtas.o arch/powerpc/kernel/rtas.c: In function ‘rtas_cpu_state_change_mask’: arch/powerpc/kernel/rtas.c:843:4: error: implicit declaration of function ‘cpu_down’ [-Werror=implicit-function-declaration] cc1: all warnings being treated as errors make[1]: *** [arch/powerpc/kernel/rtas.o] Error 1 make: *** [arch/powerpc/kernel] Error 2 The build fails because cpu_down() is defined only under CONFIG_HOTPLUG_CPU. Looking further, the mobility code in pseries is one of the call-sites which uses rtas_ibm_suspend_me(), which in turn calls rtas_cpu_state_change_mask(). And the mobility code is unconditionally compiled-in (it does not fall under any Kconfig option). And commit 120496ac (powerpc: Bring all threads online prior to migration/hibernation) which introduced this build regression is critical for the proper functioning of the migration code. So it appears that the only solution to this problem is to enable CONFIG_HOTPLUG_CPU if SMP is enabled on PPC_PSERIES platforms. So make that change in the Kconfig. Reported-by: Adam Lackorzynski <adam@os.inf.tu-dresden.de> Cc: stable@vger.kernel.org Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-01powerpc/kvm/book3s: Add support for H_IPOLL and H_XIRR_X in XICS emulationPaul Mackerras
This adds the remaining two hypercalls defined by PAPR for manipulating the XICS interrupt controller, H_IPOLL and H_XIRR_X. H_IPOLL returns information about the priority and pending interrupts for a virtual cpu, without changing any state. H_XIRR_X is like H_XIRR in that it reads and acknowledges the highest-priority pending interrupt, but it also returns the timestamp (timebase register value) from when the interrupt was first received by the hypervisor. Currently we just return the current time, since we don't do any software queueing of virtual interrupts inside the XICS emulation code. These hcalls are not currently used by Linux guests, but may be in future. Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-01powerpc/32bit:Store temporary result in r0 instead of r8Priyanka Jain
Commit a9c4e541ea9b22944da356f2a9258b4eddcc953b "powerpc/kprobe: Complete kprobe and migrate exception frame" introduced a regression: While returning from exception handling in case of PREEMPT enabled, _TIF_NEED_RESCHED bit is checked in TI_FLAGS (thread_info flag) of current task. Only if this bit is set, it should continue with the process of calling preempt_schedule_irq() to schedule highest priority task if available. Current code assumes that r8 contains TI_FLAGS and check this for _TIF_NEED_RESCHED, but as r8 is modified in the code which executes before this check, r8 no longer contains the expected TI_FLAGS information. As a result check for comparison with _TIF_NEED_RESCHED was failing even if NEED_RESCHED bit is set in the current thread_info flag. Due to this, preempt_schedule_irq() and in turn scheduler was not getting called even if highest priority task is ready for execution. So, store temporary results in r0 instead of r8 to prevent r8 from getting modified as subsequent code is dependent on its value. Signed-off-by: Priyanka Jain <Priyanka.Jain@freescale.com> CC: <stable@vger.kernel.org> [v3.7+] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-01powerpc/mm: Always invalidate tlb on hpte invalidate and updateAneesh Kumar K.V
If a hash bucket gets full, we "evict" a more/less random entry from it. When we do that we don't invalidate the TLB (hpte_remove) because we assume the old translation is still technically "valid". This implies that when we are invalidating or updating pte, even if HPTE entry is not valid we should do a tlb invalidate. This was a regression introduced by b1022fbd293564de91596b8775340cf41ad5214c Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-01powerpc/pseries: Improve stream generation comments in copypage/userMichael Neuling
No code changes, just documenting what's happening a little better. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-01powerpc/pseries: Kill all prefetch streams on context switchMichael Neuling
On context switch, we should have no prefetch streams leak from one userspace process to another. This frees up prefetch resources for the next process. Based on patch from Milton Miller. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-01powerpc/cputable: Fix oprofile_cpu_type on power8Nishanth Aravamudan
Maynard informed me that neither the oprofile kernel module nor oprofile userspace has been updated to support that "legacy" oprofile module interface for power8, which is indicated by "ppc64/power8." This results in no samples. The solution is to default to the "timer" type, instead. The raw entry also should be updated, as "ppc64/ibm-compat-v1" indicates to oprofile userspace to use "compatibility events" which are obsolete in ISA 2.07. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>