aboutsummaryrefslogtreecommitdiff
path: root/virt/kvm/ioapic.c
AgeCommit message (Collapse)Author
2013-04-25KVM: Fix bounds checking in ioapic indirect register reads (CVE-2013-1798)Andy Honig
commit a2c118bfab8bc6b8bb213abfc35201e441693d55 upstream. If the guest specifies a IOAPIC_REG_SELECT with an invalid value and follows that with a read of the IOAPIC_REG_WINDOW KVM does not properly validate that request. ioapic_read_indirect contains an ASSERT(redir_index < IOAPIC_NUM_PINS), but the ASSERT has no effect in non-debug builds. In recent kernels this allows a guest to cause a kernel oops by reading invalid memory. In older kernels (pre-3.3) this allows a guest to read from large ranges of host memory. Tested: tested against apic unit tests. Signed-off-by: Andrew Honig <ahonig@google.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2011-12-27KVM: drop bsp_vcpu pointer from kvm structGleb Natapov
Drop bsp_vcpu pointer from kvm struct since its only use is incorrect anyway. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-12-27KVM: Allow aligned byte and word writes to IOAPIC registers.Julian Stecklina
This fixes byte accesses to IOAPIC_REG_SELECT as mandated by at least the ICH10 and Intel Series 5 chipset specs. It also makes ioapic_mmio_write consistent with ioapic_mmio_read, which also allows byte and word accesses. Signed-off-by: Julian Stecklina <js@alien8.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-09-25KVM: Intelligent device lookup on I/O busSasha Levin
Currently the method of dealing with an IO operation on a bus (PIO/MMIO) is to call the read or write callback for each device registered on the bus until we find a device which handles it. Since the number of devices on a bus can be significant due to ioeventfds and coalesced MMIO zones, this leads to a lot of overhead on each IO operation. Instead of registering devices, we now register ranges which points to a device. Lookup is done using an efficient bsearch instead of a linear search. Performance test was conducted by comparing exit count per second with 200 ioeventfds created on one byte and the guest is trying to access a different byte continuously (triggering usermode exits). Before the patch the guest has achieved 259k exits per second, after the patch the guest does 274k exits per second. Cc: Avi Kivity <avi@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-22KVM: ioapic: Fix an error field referenceLiu Yuan
Function ioapic_debug() in the ioapic_deliver() misnames one filed by reference. This patch correct it. Signed-off-by: Liu Yuan <tailai.ly@taobao.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-02KVM: Convert mask notifiers to use irqchip/pin instead of gsiGleb Natapov
Devices register mask notifier using gsi, but irqchip knows about irqchip/pin, so conversion from irqchip/pin to gsi should be done before looking for mask notifier to call. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01KVM: Update Red Hat copyrightsAvi Kivity
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-06-10KVM: read apic->irr with ioapic lock heldMarcelo Tosatti
Read ioapic->irr inside ioapic->lock protected section. KVM-Stable-Tag Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-13KVM: convert ioapic lock to spinlockMarcelo Tosatti
kvm_set_irq is used from non sleepable contexes, so convert ioapic from mutex to spinlock. KVM-Stable-Tag. Tested-by: Ralf Bonenkamp <ralf.bonenkamp@swyx.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-30include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo
implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-01KVM: cleanup the failure path of KVM_CREATE_IRQCHIP ioctrlWei Yongjun
If we fail to init ioapic device or the fail to setup the default irq routing, the device register by kvm_create_pic() and kvm_ioapic_init() remain unregister. This patch fixed to do this. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: kvm->arch.vioapic should be NULL if kvm_ioapic_init() failureWei Yongjun
kvm->arch.vioapic should be NULL in case of kvm_ioapic_init() failure due to cannot register io dev. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: avoid taking ioapic mutex for non-ioapic EOIsAvi Kivity
When the guest acknowledges an interrupt, it sends an EOI message to the local apic, which broadcasts it to the ioapic. To handle the EOI, we need to take the ioapic mutex. On large guests, this causes a lot of contention on this mutex. Since large guests usually don't route interrupts via the ioapic (they use msi instead), this is completely unnecessary. Avoid taking the mutex by introducing a handled_vectors bitmap. Before taking the mutex, check if the ioapic was actually responsible for the acked vector. If not, we can return early. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-01KVM: convert slots_lock to a mutexMarcelo Tosatti
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-01KVM: convert io_bus to SRCUMarcelo Tosatti
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03KVM: Move IO APIC to its own lockGleb Natapov
The allows removal of irq_lock from the injection path. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Fix coalesced interrupt reporting in IOAPICGleb Natapov
This bug was introduced by b4a2f5e723e4f7df467. Cc: stable@kernel.org Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: make io_bus interface more robustGregory Haskins
Today kvm_io_bus_regsiter_dev() returns void and will internally BUG_ON if it fails. We want to create dynamic MMIO/PIO entries driven from userspace later in the series, so we need to enhance the code to be more robust with the following changes: 1) Add a return value to the registration function 2) Fix up all the callsites to check the return code, handle any failures, and percolate the error up to the caller. 3) Add an unregister function that collapses holes in the array Signed-off-by: Gregory Haskins <ghaskins@novell.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Add trace points in irqchip codeGleb Natapov
Add tracepoint in msi/ioapic/pic set_irq() functions, in IPI sending and in the point where IRQ is placed into apic's IRR. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Use temporary variable to shorten lines.Gleb Natapov
Cosmetic only. No logic is changed by this patch. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: fix lock imbalanceJiri Slaby
There is a missing unlock on one fail path in ioapic_mmio_write, fix that. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: remove in_range from io devicesMichael S. Tsirkin
This changes bus accesses to use high-level kvm_io_bus_read/kvm_io_bus_write functions. in_range now becomes unused so it is removed from device ops in favor of read/write callbacks performing range checks internally. This allows aliasing (mostly for in-kernel virtio), as well as better error handling by making it possible to pass errors up to userspace. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: convert bus to slots_lockMichael S. Tsirkin
Use slots_lock to protect device list on the bus. slots_lock is already taken for read everywhere, so we only need to take it for write when registering devices. This is in preparation to removing in_range and kvm->lock around it. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Introduce kvm_vcpu_is_bsp() function.Gleb Natapov
Use it instead of open code "vcpu_id zero is BSP" assumption. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: introduce irq_lock, use it to protect ioapicMarcelo Tosatti
Introduce irq_lock, and use to protect ioapic data structures. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: cleanup io_device codeGregory Haskins
We modernize the io_device code so that we use container_of() instead of dev->private, and move the vtable to a separate ops structure (theoretically allows better caching for multiple instances of the same ops structure) Signed-off-by: Gregory Haskins <ghaskins@novell.com> Acked-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-08-09KVM: Avoid redelivery of edge interrupt before next edgeGleb Natapov
The check for an edge is broken in current ioapic code. ioapic->irr is cleared on each edge interrupt by ioapic_service() and this makes old_irr != ioapic->irr condition in kvm_ioapic_set_irq() to be always true. The patch fixes the code to properly recognise edge. Some HW emulation calls set_irq() without level change. If each such call is propagated to an OS it may confuse a device driver. This is the case with keyboard device emulation and Windows XP x64 installer on SMP VM. Each keystroke produce two interrupts (down/up) one interrupt is submitted to CPU0 and another to CPU1. This confuses Windows somehow and it ignores keystrokes. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10KVM: APIC: get rid of deliver_bitmaskGleb Natapov
Deliver interrupt during destination matching loop. Signed-off-by: Gleb Natapov <gleb@redhat.com> Acked-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-06-10KVM: consolidate ioapic/ipi interrupt delivery logicGleb Natapov
Use kvm_apic_match_dest() in kvm_get_intr_delivery_bitmask() instead of duplicating the same code. Use kvm_get_intr_delivery_bitmask() in apic_send_ipi() to figure out ipi destination instead of reimplementing the logic. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-06-10KVM: ioapic/msi interrupt delivery consolidationGleb Natapov
ioapic_deliver() and kvm_set_msi() have code duplication. Move the code into ioapic_deliver_entry() function and call it from both places. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-06-10KVM: APIC: kvm_apic_set_irq deliver all kinds of interruptsGleb Natapov
Get rid of ioapic_inj_irq() and ioapic_inj_nmi() functions. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-06-10KVM: Merge kvm_ioapic_get_delivery_bitmask into kvm_get_intr_delivery_bitmaskSheng Yang
Gleb fixed bitmap ops usage in kvm_ioapic_get_delivery_bitmask. Sheng merged two functions, as well as fixed several issues in kvm_get_intr_delivery_bitmask 1. deliver_bitmask is a bitmap rather than a unsigned long intereger. 2. Lowest priority target bitmap wrong calculated by mistake. 3. Prevent potential NULL reference. 4. Declaration in include/kvm_host.h caused powerpc compilation warning. 5. Add warning for guest broadcast interrupt with lowest priority delivery mode. 6. Removed duplicate bitmap clean up in caller of kvm_get_intr_delivery_bitmask. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-06-10KVM: bit ops for deliver_bitmapSheng Yang
It's also convenient when we extend KVM supported vcpu number in the future. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10KVM: Change API of kvm_ioapic_get_delivery_bitmaskSheng Yang
In order to use with bit ops. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10KVM: Unify the delivery of IOAPIC and MSI interruptsSheng Yang
Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10KVM: Split IOAPIC structureSheng Yang
Prepared for reuse ioapic_redir_entry for MSI. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-03-24KVM: Report IRQ injection status to userspace.Gleb Natapov
IRQ injection status is either -1 (if there was no CPU found that should except the interrupt because IRQ was masked or ioapic was misconfigured or ...) or >= 0 in that case the number indicates to how many CPUs interrupt was injected. If the value is 0 it means that the interrupt was coalesced and probably should be reinjected. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-03-24KVM: make irq ack notifications aware of routing tableMarcelo Tosatti
IRQ ack notifications assume an identity mapping between pin->gsi, which might not be the case with, for example, HPET. Translate before acking. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Acked-by: Gleb Natapov <gleb@redhat.com>
2009-03-24KVM: Interrupt mask notifiers for ioapicAvi Kivity
Allow clients to request notifications when the guest masks or unmasks a particular irq line. This complements irq ack notifications, as the guest will not ack an irq line that is masked. Currently implemented for the ioapic only. Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31KVM: Export ioapic_get_delivery_bitmaskSheng Yang
It would be used for MSI in device assignment, for MSI dispatch. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31KVM: Kick NMI receiving VCPUJan Kiszka
Kick the NMI receiving VCPU in case the triggering caller runs in a different context. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2008-10-15KVM: ia64: add a dummy irq ack notificationXiantao Zhang
Before enabling notify_acked_irq for ia64, leave the related APIs as nop-op first. Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: irq ack notificationMarcelo Tosatti
Based on a patch from: Ben-Ami Yassour <benami@il.ibm.com> which was based on a patch from: Amit Shah <amit.shah@qumranet.com> Notify IRQ acking on PIC/APIC emulation. The previous patch missed two things: - Edge triggered interrupts on IOAPIC - PIC reset with IRR/ISR set should be equivalent to ack (LAPIC probably needs something similar). Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> CC: Amit Shah <amit.shah@qumranet.com> CC: Ben-Ami Yassour <benami@il.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20KVM: kvm_io_device: extend in_range() to manage len and write attributeLaurent Vivier
Modify member in_range() of structure kvm_io_device to pass length and the type of the I/O (write or read). This modification allows to use kvm_io_device with coalesced MMIO. Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20KVM: IOAPIC/LAPIC: Enable NMI supportSheng Yang
[avi: fix ia64 build breakage] Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-06KVM: IOAPIC: Fix level-triggered irq injection hangMark McLoughlin
The "remote_irr" variable is used to indicate an interrupt which has been received by the LAPIC, but not acked. In our EOI handler, we unset remote_irr and re-inject the interrupt if the interrupt line is still asserted. However, we do not set remote_irr here, leading to a situation where if kvm_ioapic_set_irq() is called, then we go ahead and call ioapic_service(). This means that IRR is re-asserted even though the interrupt is currently in service (i.e. LAPIC IRR is cleared and ISR/TMR set) The issue with this is that when the currently executing interrupt handler finishes and writes LAPIC EOI, then TMR is unset and EOI sent to the IOAPIC. Since IRR is now asserted, but TMR is not, then when the second interrupt is handled, no EOI is sent and if there is any pending interrupt, it is not re-injected. This fixes a hang only seen while running mke2fs -j on an 8Gb virtio disk backed by a fully sparse raw file, with aliguori "avoid fragmented virtio-blk transfers by copying" changes. Signed-off-by: Mark McLoughlin <markmc@redhat.com> Acked-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-06-24KVM: ioapic: fix lost interrupt when changing a device's irqAvi Kivity
The ioapic acknowledge path translates interrupt vectors to irqs. It currently uses a first match algorithm, stopping when it finds the first redirection table entry containing the vector. That fails however if the guest changes the irq to a different line, leaving the old redirection table entry in place (though masked). Result is interrupts not making it to the guest. Fix by always scanning the entire redirection table. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-06-06KVM: IOAPIC: only set remote_irr if interrupt was injectedMarcelo Tosatti
There's a bug in the IOAPIC code for level-triggered interrupts. Its relatively easy to trigger by sharing (virtio-blk + usbtablet was the testcase, initially reported by Gerd von Egidy). The "remote_irr" variable is used to indicate accepted but not yet acked interrupts. Its cleared from the EOI handler. Problem is that the EOI handler clears remote_irr unconditionally, even if it reinjected another pending interrupt. In that case, kvm_ioapic_set_irq() proceeds to ioapic_service() which sets remote_irr even if it failed to inject (since the IRR was high due to EOI reinjection). Since the TMR bit has been cleared by the first EOI, the second one fails to clear remote_irr. End result is interrupt line dead. Fix it by setting remote_irr only if a new pending interrupt has been generated (and the TMR bit for vector in question set). Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-03-04KVM: Route irq 0 to vcpu 0 exclusivelyAvi Kivity
Some Linux versions allow the timer interrupt to be processed by more than one cpu, leading to hangs due to tsc instability. Work around the issue by only disaptching the interrupt to vcpu 0. Problem analyzed (and patch tested) by Sheng Yang. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30KVM: Move ioapic code to common directory.Zhang Xiantao
Move ioapic code to common, since IA64 also needs it. Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>