path: root/virt
Age | Commit message | Author
2015-05-12 | Merge tag 'v3.10.77' into linux-linaro-lsk | Alex Shi
This is the 3.10.77 stable release. Conflicts: drivers/video/console/Kconfig, scripts/kconfig/menu.c
2015-05-06 | KVM: use slowpath for cross page cached accesses | Radim Krčmář
commit ca3f0874723fad81d0c701b63ae3a17a408d5f25 upstream. kvm_write_guest_cached() does not mark all written pages as dirty and code comments in kvm_gfn_to_hva_cache_init() talk about NULL memslot with cross page accesses. Fix all the easy way. The check is '<= 1' to have the same result for 'len = 0' cache anywhere in the page. (nr_pages_needed is 0 on page boundary.) Fixes: 8f964525a121 ("KVM: Allow cross page reads and writes from cached translations.") Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Message-Id: <20150408121648.GA3519@potion.brq.redhat.com> Reviewed-by: Wanpeng Li <wanpeng.li@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
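The page-count arithmetic behind the '<= 1' check can be sketched as follows. This is a minimal standalone model, not the kernel's code: the helper names and the 4 KiB PAGE_SHIFT are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Number of guest pages touched by the byte range [gpa, gpa + len). */
static uint64_t nr_pages_needed(uint64_t gpa, uint64_t len)
{
    uint64_t start_gfn = gpa >> PAGE_SHIFT;
    uint64_t end_gfn = (gpa + len - 1) >> PAGE_SHIFT;

    /*
     * For len == 0 on a page boundary (gpa > 0), end_gfn is
     * start_gfn - 1, so the result is 0 rather than 1.
     */
    return end_gfn - start_gfn + 1;
}

/* The cached fast path is only taken when at most one page is involved. */
static bool can_use_cached_fast_path(uint64_t gpa, uint64_t len)
{
    return nr_pages_needed(gpa, len) <= 1;
}
```

With this model, a zero-length cache gets the same answer whether it sits on a page boundary (count 0) or in the middle of a page (count 1), which is why the comparison is '<= 1' rather than '== 1'.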
2014-11-14 | Merge tag 'v3.10.60' into linux-linaro-lsk | Mark Brown
This is the 3.10.60 stable release
2014-11-14 | kvm: fix excessive pages un-pinning in kvm_iommu_map error path. | Quentin Casasnovas
commit 3d32e4dbe71374a6780eaf51d719d76f9a9bf22f upstream. The third parameter of kvm_unpin_pages() when called from kvm_iommu_map_pages() is wrong: it should be the number of pages to un-pin, not the page size. This error was facilitated by an inconsistent API: kvm_pin_pages() takes a size, but kvm_unpin_pages() takes a number of pages, so fix the problem by matching the two. This was introduced by commit 350b8bd ("kvm: iommu: fix the third parameter of kvm_iommu_put_pages (CVE-2014-3601)"), which fixes the lack of un-pinning for pages intended to be un-pinned (i.e. memory leak) but unfortunately potentially aggravated the number of pages we un-pin that should have stayed pinned. As far as I understand though, the same practical mitigations apply. This issue was found during review of Red Hat 6.6 patches to prepare Ksplice rebootless updates. Thanks to Vegard for his time on a late Friday evening to help me in understanding this code. Fixes: 350b8bd ("kvm: iommu: fix the third parameter of... (CVE-2014-3601)") Signed-off-by: Quentin Casasnovas <quentin.casasnovas@oracle.com> Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Jamie Iles <jamie.iles@oracle.com> Reviewed-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
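The unit mismatch described above can be illustrated with a toy pin counter. Everything here is hypothetical (the table, the helper names, the 4 KiB PAGE_SHIFT); it only shows why both helpers should take the same unit, a page count.

```c
#include <stddef.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB base pages */
#define NR_PFNS 1024  /* hypothetical size of the toy pin table */

static int pin_count[NR_PFNS]; /* toy per-page pin counter */

/* Both helpers take a number of pages, so callers cannot mix units. */
static void pin_pages(size_t pfn, size_t npages)
{
    for (size_t i = 0; i < npages && pfn + i < NR_PFNS; i++)
        pin_count[pfn + i]++;
}

static void unpin_pages(size_t pfn, size_t npages)
{
    for (size_t i = 0; i < npages && pfn + i < NR_PFNS; i++)
        pin_count[pfn + i]--;
}

/* Convert a mapping size in bytes to the page count the helpers expect. */
static size_t bytes_to_pages(size_t bytes)
{
    return bytes >> PAGE_SHIFT;
}
```

In this model, calling unpin_pages(8, 8192) with a byte size instead of bytes_to_pages(8192) would walk far past the two pages that were pinned and drop pins held on neighbouring pages, which is the class of error the commit describes.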
2014-10-31 | Merge tag 'v3.10.59' into linux-linaro-lsk | Mark Brown
This is the 3.10.59 stable release
2014-10-30 | kvm: don't take vcpu mutex for obviously invalid vcpu ioctls | David Matlack
commit 2ea75be3219571d0ec009ce20d9971e54af96e09 upstream. vcpu ioctls can hang the calling thread if issued while a vcpu is running. However, invalid ioctls can happen when userspace tries to probe the kind of file descriptors (e.g. isatty() calls ioctl(TCGETS)); in that case, we know the ioctl is going to be rejected as invalid anyway and we can fail before trying to take the vcpu mutex. This patch does not change functionality, it just makes invalid ioctls fail faster. Signed-off-by: David Matlack <dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
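A rough sketch of the early rejection follows. It assumes the generic Linux ioctl encoding (type byte in bits 8..15) and uses local stand-ins instead of kernel headers; the demo_ names and the counter that models the mutex are hypothetical.

```c
#include <errno.h>

/* Assumption: the asm-generic ioctl layout, where bits 8..15 hold the type. */
#define IOC_TYPE(cmd) (((cmd) >> 8) & 0xffu)
#define KVMIO 0xAEu          /* KVM's ioctl type byte, as in <linux/kvm.h> */
#define DEMO_TCGETS 0x5401u  /* TCGETS on architectures using the generic values */
#define DEMO_KVM_RUN 0xAE80u /* _IO(KVMIO, 0x80) under the generic encoding */

struct demo_vcpu {
    int mutex_acquisitions; /* stands in for taking the vcpu mutex */
};

static long demo_vcpu_ioctl(struct demo_vcpu *vcpu, unsigned int cmd)
{
    /* Reject foreign ioctls before touching the (possibly contended) mutex. */
    if (IOC_TYPE(cmd) != KVMIO)
        return -EINVAL;

    vcpu->mutex_acquisitions++; /* lock, dispatch and unlock would go here */
    return 0;
}
```

A probe such as isatty(), which issues TCGETS, fails immediately in this model without ever waiting on the lock, while a KVM-typed command proceeds as before.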
2014-10-11 | Merge remote-tracking branch 'lsk/v3.10/topic/kvm' into linux-linaro-lsk | Mark Brown
Conflicts: arch/arm/kvm/arm.c, arch/arm64/Makefile, arch/arm64/kernel/asm-offsets.c, virt/kvm/kvm_main.c
2014-10-08 | KVM: correct null pid check in kvm_vcpu_yield_to() | Sam Bobroff
Correct a simple mistake of checking the wrong variable before a dereference, resulting in the dereference not being properly protected by rcu_dereference(). Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 27fbe64bfa63cfb9da025975b59d96568caa2d53) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-08 | KVM: check for !is_zero_pfn() in kvm_is_mmio_pfn() | Ard Biesheuvel
Read-only memory ranges may be backed by the zero page, so avoid misidentifying it as an MMIO pfn. This fixes another issue I identified when testing QEMU+KVM_UEFI, where a read to an uninitialized emulated NOR flash brought in the zero page, but mapped as a read-write device region, because kvm_is_mmio_pfn() misidentifies it as an MMIO pfn due to its PG_reserved bit being set. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Fixes: b88657674d39 ("ARM: KVM: user_mem_abort: support stage 2 MMIO page mapping") Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 85c8555ff07ef09261bd50d603cd4290cff5a8cc) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
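The decision described above can be modelled with three per-page facts. The struct and function below are hypothetical stand-ins, and treating a pfn without a backing page as MMIO is an assumption of this sketch, not something the message states.

```c
#include <stdbool.h>

/* Toy model of the per-page facts the check consults (all hypothetical). */
struct demo_page {
    bool valid;     /* the pfn is backed by a page descriptor */
    bool reserved;  /* PG_reserved is set */
    bool zero_page; /* the pfn is the shared zero page */
};

static bool demo_is_mmio_pfn(const struct demo_page *p)
{
    /* The zero page is reserved but is ordinary memory, so exclude it. */
    if (p->valid)
        return !p->zero_page && p->reserved;

    return true; /* assumption: no page descriptor means device memory */
}
```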
2014-10-02 | Revert "arm, kvm: fix double lock on cpu_add_remove_lock" | Christoffer Dall
This reverts commit d77503eadd2f16f2900b9be79a1dc6f37e8cd579. The whole register cpu hotplug fix series has not been applied, so LSK is released without this fix. If we ever include that series in LSK later, then this can be fixed later too. Signed-off-by: Ming Lei <tom.leiming@gmail.com> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> (cherry picked from commit 553f809e23f00976caea7a1ebdabaa58a6383e7d) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 | arm/arm64: KVM: Fix set_clear_sgi_pend_reg offset | Christoffer Dall
The sgi values calculated in read_set_clear_sgi_pend_reg() and write_set_clear_sgi_pend_reg() were horribly incorrectly multiplied by 4 with catastrophic results in that subfunctions ended up overwriting memory not allocated for the expected purpose. This showed up as bugs in kfree() and the kernel complaining a lot if you turn on memory debugging. This addresses: http://marc.info/?l=kvm&m=141164910007868&w=2 Reported-by: Shannon Zhao <zhaoshenglong@huawei.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> (cherry picked from commit 0fea6d7628ed6e25a9ee1b67edf7c859718d39e8) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
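To see why the extra multiplication overruns, consider this sketch. It assumes the GICv2 layout of the SGI pending registers (one byte per SGI, four SGIs per 32-bit register), so the byte offset of a register equals the number of its first SGI. The function names are hypothetical.

```c
#define NR_SGIS 16 /* SGIs 0..15, one source-CPU byte each */

/* First SGI covered by the 32-bit register at byte `offset` in the bank. */
static int min_sgi_fixed(unsigned int offset)
{
    return (int)(offset & ~0x3u);
}

/* The pre-fix computation, kept only to show the overrun. */
static int min_sgi_buggy(unsigned int offset)
{
    return (int)((offset & ~0x3u) * 4);
}

/* A register covers min_sgi .. min_sgi + 3; all must be valid SGI numbers. */
static int sgi_range_in_bounds(int min_sgi)
{
    return min_sgi >= 0 && min_sgi + 3 < NR_SGIS;
}
```

Under these assumptions, offset 4 should address SGIs 4 to 7, but the multiplied value starts at SGI 16, which lies beyond any per-SGI storage sized for 16 entries.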
2014-10-02 | arm/arm64: KVM: vgic: make number of irqs a configurable attribute | Marc Zyngier
In order to make the number of interrupts configurable, use the new fancy device management API to add KVM_DEV_ARM_VGIC_GRP_NR_IRQS as a VGIC configurable attribute. Userspace can now specify the exact size of the GIC (by increments of 32 interrupts). Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> (cherry picked from commit a98f26f183801685ef57333de4bafd4bbc692c7c) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
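A validation routine for such an attribute might look like the sketch below. Only the "multiple of 32" rule comes from the message; the lower and upper bounds, the constants and the function name are assumptions made for illustration.

```c
#include <errno.h>

#define DEMO_NR_PRIVATE_IRQS 32 /* assumption: 16 SGIs + 16 PPIs per CPU */
#define DEMO_MAX_IRQS 1024      /* assumption: upper bound on GIC interrupt IDs */

static int demo_set_nr_irqs(unsigned int *nr_irqs, unsigned int val)
{
    /*
     * Require at least one block of 32 shared interrupts, stay within
     * the assumed limit, and accept only multiples of 32.
     */
    if (val < DEMO_NR_PRIVATE_IRQS + 32 || val > DEMO_MAX_IRQS || (val & 31))
        return -EINVAL;

    *nr_irqs = val;
    return 0;
}
```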
2014-10-02 | arm/arm64: KVM: vgic: delay vgic allocation until init time | Marc Zyngier
It is now quite easy to delay the allocation of the vgic tables until we actually require it to be up and running (when the first vcpu is kicking around, or someone tries to access the GIC registers). This allows us to allocate memory for the exact number of CPUs we have. As nobody configures the number of interrupts just yet, use a fallback to VGIC_NR_IRQS_LEGACY. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> (cherry picked from commit 4956f2bc1fdee4bc336532f3f34635a8534cedfd) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 | arm/arm64: KVM: vgic: kill VGIC_NR_IRQS | Marc Zyngier
Nuke VGIC_NR_IRQS entirely, now that the distributor instance contains the number of IRQs allocated to this GIC. Also add VGIC_NR_IRQS_LEGACY to preserve the current API. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> (cherry picked from commit 5fb66da64064d0cb8dcce4cc8bf4cb1b921b13a0) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 | arm/arm64: KVM: vgic: handle out-of-range MMIO accesses | Marc Zyngier
Now that we can (almost) dynamically size the number of interrupts, we're facing an interesting issue: We have to evaluate at runtime whether or not an access hits a valid register, based on the sizing of this particular instance of the distributor. Furthermore, the GIC spec says that accessing a reserved register is RAZ/WI. For this, add a new field to our range structure, indicating the number of bits a single interrupt uses. That allows us to find out whether or not the access is in range. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> (cherry picked from commit c3c918361adcceb816c92b21dd95d2b46fb96a8f) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
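The bits-per-interrupt idea can be sketched like this. The struct layout and names are hypothetical; the point is only that a byte offset into a register bank, together with the number of bits each interrupt occupies, yields the interrupt number to compare against the instance's size.

```c
#include <stdbool.h>

struct demo_mmio_range {
    unsigned long base; /* illustrative only; unused by the check */
    unsigned long len;  /* illustrative only; unused by the check */
    int bits_per_irq;   /* 0 means the range is not indexed by interrupt */
};

/* Does byte `offset` within the range refer to an interrupt that exists? */
static bool demo_access_in_range(const struct demo_mmio_range *r,
                                 unsigned long offset, int nr_irqs)
{
    if (r->bits_per_irq == 0)
        return true; /* not an interrupt-indexed register bank */

    unsigned long irq = offset * 8 / (unsigned long)r->bits_per_irq;
    return irq < (unsigned long)nr_irqs;
}
```

An access that fails this check would then be handled as RAZ/WI, as the message notes: reads return zero and writes are ignored.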
2014-10-02 | arm/arm64: KVM: vgic: kill VGIC_MAX_CPUS | Marc Zyngier
We now have the information about the number of CPU interfaces in the distributor itself. Let's get rid of VGIC_MAX_CPUS, and just rely on KVM_MAX_VCPUS where we don't have the choice. Yet. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> (cherry picked from commit fc675e355e705a046df7b635d3f3330c0ad94569) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 | arm/arm64: KVM: vgic: Parametrize VGIC_NR_SHARED_IRQS | Marc Zyngier
Having a dynamic number of supported interrupts means that we cannot rely on VGIC_NR_SHARED_IRQS being fixed anymore. Instead, make it take the distributor structure as a parameter, so it can return the right value. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> (cherry picked from commit fb65ab63b8cae510ea1e43e68b5da2f9980aa6d5) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 | arm/arm64: KVM: vgic: switch to dynamic allocation | Marc Zyngier
So far, all the VGIC data structures are statically defined by the *maximum* number of vcpus and interrupts it supports. It means that we always have to oversize it to cater for the worst case. Start by changing the data structures to be dynamically sizeable, and allocate them at runtime. The sizes are still very static though. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> (cherry picked from commit c1bfb577addd4867a82c4f235824a315d5afb94a) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 | KVM: ARM: vgic: plug irq injection race | Marc Zyngier
As it stands, nothing prevents userspace from injecting an interrupt before the guest's GIC is actually initialized. This goes unnoticed so far (as everything is pretty much statically allocated), but ends up exploding in a spectacular way once we switch to a more dynamic allocation (the GIC data structure isn't there yet). The fix is to test for the "ready" flag in the VGIC distributor before trying to inject the interrupt. Note that in order to avoid breaking userspace, we have to ignore what is essentially an error. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Christoffer Dall <christoffer.dall@linaro.org> (cherry picked from commit 71afaba4a2e98bb7bdeba5078370ab43d46e67a1) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
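The guard described above amounts to a check like the following. The struct and function are hypothetical; the one detail taken from the message is that an injection before the distributor is ready is dropped while still reporting success.

```c
#include <stdbool.h>

struct demo_dist {
    bool ready;   /* set once the distributor has been initialized */
    int injected; /* counts interrupts actually forwarded */
};

static int demo_inject_irq(struct demo_dist *dist, int irq)
{
    (void)irq; /* the interrupt number does not matter for this sketch */

    /* Not initialized yet: ignore the request rather than fail userspace. */
    if (!dist->ready)
        return 0;

    dist->injected++;
    return 0;
}
```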
2014-10-02 | arm/arm64: KVM: vgic: Clarify and correct vgic documentation | Christoffer Dall
The VGIC virtual distributor implementation documentation was written a very long time ago, before the true nature of the beast had been partially absorbed into my bloodstream. Clarify the docs. Plus, it fixes an actual bug. ICFRn, pfff. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> (cherry picked from commit 7e362919a59e6fc60e08ad1cf0b047291d1ca2e9) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 | arm/arm64: KVM: vgic: Fix SGI writes to GICD_I{CS}PENDR0 | Christoffer Dall
Writes to GICD_ISPENDR0 and GICD_ICPENDR0 ignore all settings of the pending state for SGIs. Make sure the implementation handles this correctly. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> (cherry picked from commit 9da48b5502622f9f0e49df957521ec43a0c9f4c1) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 | arm/arm64: KVM: vgic: Improve handling of GICD_I{CS}PENDRn | Christoffer Dall
Writes to GICD_ISPENDRn and GICD_ICPENDRn are currently not handled correctly for level-triggered interrupts. The spec states that for level-triggered interrupts, writes to the GICD_ISPENDRn activate the output of a flip-flop which is in turn or'ed with the actual input interrupt signal. Correspondingly, writes to GICD_ICPENDRn simply deactivate the output of that flip-flop, but do not (of course) affect the external input signal. Reads from GICC_IAR will also deactivate the flip-flop output. This requires us to track the state of the level-input separately from the state in the flip-flop. We therefore introduce two new variables on the distributor struct to track these two states. Astute readers may notice that this is introducing more state than required (because an OR of the two states gives you the pending state), but the remaining vgic code uses the pending bitmap for optimized operations to figure out, at the end of the day, if an interrupt is pending or not on the distributor side. Refactoring the code to consider the two state variables in all the places where we currently access the precomputed pending value did not look pretty. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> (cherry picked from commit faa1b46c3e9f4d40359aee04ff275eea5f4cae3a) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
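The two pieces of state and their OR can be modelled for a single level-triggered interrupt as below. The names are hypothetical, and the sketch computes the pending state on demand instead of caching it in a bitmap as the message says the vgic code does.

```c
#include <stdbool.h>

struct demo_level_irq {
    bool level;     /* state of the external input line */
    bool soft_pend; /* flip-flop set by writes to GICD_ISPENDRn */
};

/* Pending is the OR of the input line and the flip-flop output. */
static bool demo_pending(const struct demo_level_irq *irq)
{
    return irq->level || irq->soft_pend;
}

static void demo_write_ispendr(struct demo_level_irq *irq) { irq->soft_pend = true; }
static void demo_write_icpendr(struct demo_level_irq *irq) { irq->soft_pend = false; }
static void demo_read_iar(struct demo_level_irq *irq) { irq->soft_pend = false; }
static void demo_set_line(struct demo_level_irq *irq, bool high) { irq->level = high; }
```

In this model a write to GICD_ICPENDRn while the line is still high leaves the interrupt pending, because only the flip-flop is cleared.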
2014-10-02 | arm/arm64: KVM: vgic: Clear queued flags on unqueue | Christoffer Dall
If we unqueue a level-triggered interrupt completely, and the LR does not stick around in the active state (and will therefore no longer generate a maintenance interrupt), then we should clear the queued flag so that the vgic can actually queue this level-triggered interrupt at a later time and deal with its pending state then. Note: This should actually be properly fixed to handle the active state on the distributor. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> (cherry picked from commit cced50c9280ef7ca1af48080707a170efa1adfa0) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 | arm/arm64: KVM: Rename irq_active to irq_queued | Christoffer Dall
We have a special bitmap on the distributor struct to keep track of when level-triggered interrupts are queued on the list registers. This was named irq_active, which is confusing, because the active state of an interrupt as per the GIC spec is a different thing, not specifically related to edge-triggered/level-triggered configurations but rather indicates an interrupt which has been ack'ed but not yet eoi'ed. Rename the bitmap and the corresponding accessor functions to irq_queued to clarify what this is actually used for. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> (cherry picked from commit dbf20f9d8105cca531614c8bff9a74351e8e67e7) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 | arm/arm64: KVM: Rename irq_state to irq_pending | Christoffer Dall
The irq_state field on the distributor struct is ambiguous in its meaning; the comment says it's the level of the input, but that doesn't make much sense for edge-triggered interrupts. The code actually uses this state variable to check if the interrupt is in the pending state on the distributor, so clarify the comment and rename the actual variable and accessor methods. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> (cherry picked from commit 227844f53864077ccaefe01d0960fcccc03445ce) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 | KVM: VFIO: register kvm_device_ops dynamically | Will Deacon
Now that we have a dynamic means to register kvm_device_ops, use that for the VFIO kvm device, instead of relying on the static table. This is achieved by a module_init call to register the ops with KVM. Cc: Gleb Natapov <gleb@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Alex Williamson <Alex.Williamson@redhat.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 80ce1639727e9d38729c34f162378508c307ca25) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 | KVM: ARM: vgic: register kvm_device_ops dynamically | Will Deacon
Now that we have a dynamic means to register kvm_device_ops, use that for the ARM VGIC, instead of relying on the static table. Cc: Gleb Natapov <gleb@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit c06a841bf36340e9e917ce60d11a6425ac85d0bd) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 | KVM: device: add simple registration mechanism for kvm_device_ops | Will Deacon
kvm_ioctl_create_device currently has knowledge of all the device types and their associated ops. This is fairly inflexible when adding support for new in-kernel device emulations, so move what we currently have out into a table, which can support dynamic registration of ops by new drivers for virtual hardware. Cc: Alex Williamson <Alex.Williamson@redhat.com> Cc: Alex Graf <agraf@suse.de> Cc: Gleb Natapov <gleb@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit d60eacb07053142bfb9b41582074a89a790a9d46) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
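A table with dynamic registration, as described above, could be sketched as follows. The struct, the table size, the function names and the choice of error codes are all assumptions of this example and are not taken from the message.

```c
#include <errno.h>
#include <stddef.h>

#define DEMO_DEV_TYPE_MAX 8 /* hypothetical table size */

struct demo_device_ops {
    const char *name;
};

/* One slot per device type; NULL means no driver has registered yet. */
static const struct demo_device_ops *demo_ops_table[DEMO_DEV_TYPE_MAX];

static int demo_register_device_ops(const struct demo_device_ops *ops,
                                    unsigned int type)
{
    if (type >= DEMO_DEV_TYPE_MAX)
        return -ENOSPC; /* type does not fit in the table */
    if (demo_ops_table[type] != NULL)
        return -EEXIST; /* another driver already owns this type */

    demo_ops_table[type] = ops;
    return 0;
}

/* The device-creation path looks ops up by type instead of hard-coding them. */
static const struct demo_device_ops *demo_lookup_device_ops(unsigned int type)
{
    return type < DEMO_DEV_TYPE_MAX ? demo_ops_table[type] : NULL;
}
```

With such a table, a new in-kernel device emulation only has to call the registration function at initialization time, and the creation path needs no knowledge of individual device types.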
2014-10-02 | KVM: EVENTFD: remove inclusion of irq.h | Eric Auger
No longer needed. irq.h would be void on ARM. Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> (cherry picked from commit 0ba09511ddc3ff0b462f37b4fe4b9c4dccc054ec) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 | KVM: remove redundant assignments in __kvm_set_memory_region | Christian Borntraeger
__kvm_set_memory_region sets r to EINVAL very early. Doing it again is not necessary. The same is true later on, where r is assigned -ENOMEM twice. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit f2a25160887e00434ce1361007009120e1fecbda) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 | KVM: remove redundant assignment of return value in kvm_dev_ioctl | Christian Borntraeger
The first statement of kvm_dev_ioctl is `long r = -EINVAL;'. No need to reassign the same value. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit a13f533b2f1d53a7c0baa7490498caeab7bc8ba5) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 | KVM: remove redundant check of in_spin_loop | Christian Borntraeger
The expression `vcpu->spin_loop.in_spin_loop' is always true, because it is evaluated only when the condition `!vcpu->spin_loop.in_spin_loop' is false. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 34656113182b704682e23d1363417536addfec97) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 | KVM: remove garbage arg to *hardware_{en,dis}able | Radim Krčmář
In the beginning was on_each_cpu(), which required an unused argument to kvm_arch_ops.hardware_{en,dis}able, but this was soon forgotten. Remove unnecessary arguments that stem from this. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 13a34e067eab24fec882e1834fbf2cc31911d474) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 | KVM: Unconditionally export KVM_CAP_READONLY_MEM | Christoffer Dall
The idea behind capabilities and the KVM_CHECK_EXTENSION ioctl is that userspace can, at run-time, determine if a feature is supported or not. This allows KVM to begin supporting a new feature with a new kernel version without any need to update user space. Unfortunately, since the definition of KVM_CAP_READONLY_MEM was guarded by #ifdef __KVM_HAVE_READONLY_MEM, such discovery still required a user space update. Therefore, unconditionally export KVM_CAP_READONLY_MEM and change the in-kernel conditional to rely on __KVM_HAVE_READONLY_MEM. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 0f8a4de3e088797576ac76200b634b802e5c7781) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 | KVM: vgic: declare probe function pointer as const | Will Deacon
We extract the vgic probe function from the of_device_id data pointer, which is const. Kill the sparse warning by ensuring that the local function pointer is also marked as const. Cc: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> (cherry picked from commit de56fb1923ca11f428bf557870e0faa99f38762e) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 | KVM: vgic: return int instead of bool when checking I/O ranges | Will Deacon
vgic_ioaddr_overlap claims to return a bool, but in reality it returns an int. Shut sparse up by fixing the type signature. Cc: Christoffer Dall <christoffer.dall@linaro.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> (cherry picked from commit 1fa451bcc67fa921a04c5fac8dbcde7844d54512) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 | KVM: Introduce gfn_to_hva_memslot_prot | Christoffer Dall
To support read-only memory regions on arm and arm64, we need to resolve a gfn to an hva given a pointer to a memslot. To avoid looping through the memslots twice and to reuse the hva error checking of gfn_to_hva_prot(), add a new gfn_to_hva_memslot_prot() function and refactor gfn_to_hva_prot() to use this function. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> (cherry picked from commit 64d831269ccbca1fc6d739a0f3c8aa24afb43a5e) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 | KVM: add kvm_arch_sched_in | Radim Krčmář
Introduce preempt notifiers for architecture specific code. Advantage over creating a new notifier in every arch is slightly simpler code and guaranteed call order with respect to kvm_sched_in. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit e790d9ef6405633b007339d746b709aed43a928d) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 | KVM: avoid unnecessary synchronize_rcu | Christian Borntraeger
We don't have to wait for a grace period if there is no oldpid that we are going to free. put_pid() also checks for NULL, so this patch only fences synchronize_rcu. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 7103f60de8bed21a0ad5d15d2ad5b7a333dda201) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 | KVM: Move more code under CONFIG_HAVE_KVM_IRQFD | Paolo Bonzini
Commit e4d57e1ee1ab (KVM: Move irq notifier implementation into eventfd.c, 2014-06-30) included the irq notifier code unconditionally in eventfd.c, while it was under CONFIG_HAVE_KVM_IRQCHIP before. Similarly, commit 297e21053a52 (KVM: Give IRQFD its own separate enabling Kconfig option, 2014-06-30) moved code from CONFIG_HAVE_IRQ_ROUTING to CONFIG_HAVE_KVM_IRQFD but forgot to move the pieces that used to be under CONFIG_HAVE_KVM_IRQCHIP. Together, this broke compilation without CONFIG_KVM_XICS. Fix by adding or changing the #ifdefs so that they point at CONFIG_HAVE_KVM_IRQFD. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit c77dcacb397519b6ade8f08201a4a90a7f4f751e) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 | KVM: Give IRQFD its own separate enabling Kconfig option | Paul Mackerras
Currently, the IRQFD code is conditional on CONFIG_HAVE_KVM_IRQ_ROUTING. So that we can have the IRQFD code compiled in without having the IRQ routing code, this creates a new CONFIG_HAVE_KVM_IRQFD, makes the IRQFD code conditional on it instead of CONFIG_HAVE_KVM_IRQ_ROUTING, and makes all the platforms that currently select HAVE_KVM_IRQ_ROUTING also select HAVE_KVM_IRQFD. Signed-off-by: Paul Mackerras <paulus@samba.org> Tested-by: Eric Auger <eric.auger@linaro.org> Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 297e21053a52f060944e9f0de4c64fad9bcd72fc) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 | KVM: Move irq notifier implementation into eventfd.c | Paul Mackerras
This moves the functions kvm_irq_has_notifier(), kvm_notify_acked_irq(), kvm_register_irq_ack_notifier() and kvm_unregister_irq_ack_notifier() from irqchip.c to eventfd.c. The reason for doing this is that those functions are used in connection with IRQFDs, which are implemented in eventfd.c. In future we will want to use IRQFDs on platforms that don't implement the GSI routing implemented in irqchip.c, so we won't be compiling in irqchip.c, but we still need the irq notifiers. The implementation is unchanged. Signed-off-by: Paul Mackerras <paulus@samba.org> Tested-by: Eric Auger <eric.auger@linaro.org> Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit e4d57e1ee1ab59f0cef0272800ac6c52e0ec814a) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 | KVM: Move all accesses to kvm::irq_routing into irqchip.c | Paul Mackerras
Now that struct _irqfd does not keep a reference to storage pointed to by the irq_routing field of struct kvm, we can move the statement that updates it out from under the irqfds.lock and put it in kvm_set_irq_routing() instead. That means we then have to take a srcu_read_lock on kvm->irq_srcu around the irqfd_update call in kvm_irqfd_assign(), since holding the kvm->irqfds.lock no longer ensures that the routing can't change. Combined with changing kvm_irq_map_gsi() and kvm_irq_map_chip_pin() to take a struct kvm * argument instead of the pointer to the routing table, this allows us to move all references to kvm->irq_routing into irqchip.c. That in turn allows us to move the definition of the kvm_irq_routing_table struct into irqchip.c as well. Signed-off-by: Paul Mackerras <paulus@samba.org> Tested-by: Eric Auger <eric.auger@linaro.org> Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 9957c86d659a4d5a2bed25ccbd3bfc9c3f25e658) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 | KVM: irqchip: Provide and use accessors for irq routing table | Paul Mackerras
This provides accessor functions for the KVM interrupt mappings, in order to reduce the amount of code that accesses the fields of the kvm_irq_routing_table struct, and restrict that code to one file, virt/kvm/irqchip.c. The new functions are kvm_irq_map_gsi(), which maps from a global interrupt number to a set of IRQ routing entries, and kvm_irq_map_chip_pin, which maps from IRQ chip and pin numbers to a global interrupt number. This also moves the update of kvm_irq_routing_table::chip[][] into irqchip.c, out of the various kvm_set_routing_entry implementations. That means that none of the kvm_set_routing_entry implementations need the kvm_irq_routing_table argument anymore, so this removes it. This does not change any locking or data lifetime rules. Signed-off-by: Paul Mackerras <paulus@samba.org> Tested-by: Eric Auger <eric.auger@linaro.org> Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 8ba918d488caded2c4368b0b922eb905fe3bb101) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 | KVM: Don't keep reference to irq routing table in irqfd struct | Paul Mackerras
This makes the irqfd code keep a copy of the irq routing table entry for each irqfd, rather than a reference to the copy in the actual irq routing table maintained in virt/kvm/irqchip.c. This will enable us to change the routing table structure in future, or even not have a routing table at all on some platforms. The synchronization that was previously achieved using srcu_dereference on the read side is now achieved using a seqcount_t structure. That ensures that we don't get a halfway-updated copy of the structure if we read it while another thread is updating it. We still use srcu_read_lock/unlock around the read side so that when changing the routing table we can be sure that after calling synchronize_srcu, nothing will be using the old routing. Signed-off-by: Paul Mackerras <paulus@samba.org> Tested-by: Eric Auger <eric.auger@linaro.org> Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 56f89f3629ffd1a21d38c3d0bea23deac0e284ce) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 | KVM: arm64: GICv3: mandate page-aligned GICV region | Marc Zyngier
Just like GICv2 was fixed in 63afbe7a0ac1 (kvm: arm64: vgic: fix hyp panic with 64k pages on juno platform), mandate the GICV region to be both aligned on a page boundary and its size to be a multiple of page size. This prevents a guest from being able to poke at regions where we have no idea what is sitting there. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> (cherry picked from commit fb3ec67942e92e5713e05b7691b277d0a0c0575d) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 | kvm: arm64: vgic: fix hyp panic with 64k pages on juno platform | Will Deacon
If the physical address of GICV isn't page-aligned, then we end up creating a stage-2 mapping of the page containing it, which causes us to map neighbouring memory locations directly into the guest. As an example, consider a platform with GICV at physical 0x2c02f000 running a 64k-page host kernel. If qemu maps this into the guest at 0x80010000, then guest physical addresses 0x80010000 - 0x8001efff will map host physical region 0x2c020000 - 0x2c02efff. Accesses to these physical regions may cause UNPREDICTABLE behaviour, for example, on the Juno platform this will cause an SError exception to EL3, which brings down the entire physical CPU resulting in RCU stalls / HYP panics / host crashing / wasted weeks of debugging. SBSA recommends that systems alias the 4k GICV across the bounding 64k region, in which case GICV physical could be described as 0x2c020000 in the above scenario. This patch fixes the problem by failing the vgic probe if the physical base address or the size of GICV aren't page-aligned. Note that this generated a warning in dmesg about freeing enabled IRQs, so I had to move the IRQ enabling later in the probe. Cc: Christoffer Dall <christoffer.dall@linaro.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Joel Schopp <joel.schopp@amd.com> Cc: Don Dutile <ddutile@redhat.com> Acked-by: Peter Maydell <peter.maydell@linaro.org> Acked-by: Joel Schopp <joel.schopp@amd.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> (cherry picked from commit 63afbe7a0ac184ef8485dac4914e87b211b5bfaa) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
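The probe-time check can be sketched with the addresses from the example above. The function names are hypothetical, and the page size is passed as a parameter here only so that 4k and 64k hosts can be compared in one program.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if `value` is a multiple of `page_size` (which must be a power of two). */
static bool demo_page_aligned(uint64_t value, uint64_t page_size)
{
    return (value & (page_size - 1)) == 0;
}

/* Probe-time check: both the base and the size of GICV must be page-aligned. */
static bool demo_gicv_region_ok(uint64_t base, uint64_t size, uint64_t page_size)
{
    return demo_page_aligned(base, page_size) &&
           demo_page_aligned(size, page_size);
}
```

With 64k pages, a 4k GICV at 0x2c02f000 fails the check, while the aliased description at 0x2c020000 covering the whole 64k region passes. The same 0x2c02f000 base is acceptable on a 4k-page host.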
2014-10-02 | KVM: Allow KVM_CHECK_EXTENSION on the vm fd | Alexander Graf
The KVM_CHECK_EXTENSION is only available on the kvm fd today. Unfortunately on PPC some of the capabilities change depending on the way a VM was created. So instead we need a way to expose capabilities as VM ioctl, so that we can see which VM type we're using (HV or PR). To enable this, add the KVM_CHECK_EXTENSION ioctl to our vm ioctl portfolio. Signed-off-by: Alexander Graf <agraf@suse.de> Acked-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 92b591a4c46b103ebd3fc0d03a084e1efd331253) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 | KVM: Rename and add argument to check_extension | Alexander Graf
In preparation to make the check_extension function available to VM scope we add a struct kvm * argument to the function header and rename the function accordingly. It will still be called from the /dev/kvm fd, but with a NULL argument for struct kvm *. Signed-off-by: Alexander Graf <agraf@suse.de> Acked-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 784aa3d7fb6f729c06d5836c9d9569f58e4d05ae) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 | kvm: Resolve missing-field-initializers warnings | Mark Rustad
Resolve missing-field-initializers warnings seen in W=2 kernel builds by having macros generate more elaborated initializers. That is enough to silence the warnings. Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 25f97ff451a4aab534afc1290af97d23ea0b4fb3) Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>