aboutsummaryrefslogtreecommitdiff
path: root/kvm-all.c
AgeCommit message (Collapse)Author
2011-02-01Merge remote branch 'qemu-kvm/uq/master' into stagingAnthony Liguori
aliguori: fix build with !defined(KVM_CAP_ASYNC_PF) Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-01-31virtio-pci: Disable virtio-ioeventfd when !CONFIG_IOTHREADStefan Hajnoczi
It is not possible to use virtio-ioeventfd when building without an I/O thread. We rely on a signal to kick us out of vcpu execution. Timers and AIO use SIGALRM and SIGUSR2 respectively. Unfortunately eventfd does not support O_ASYNC (SIGIO) so eventfd cannot be used in a signal driven manner. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-01-23kvm: Flush coalesced mmio buffer on IO window exitsJan Kiszka
We must flush pending mmio writes if we leave kvm_cpu_exec for an IO window. Otherwise we risk to loose those requests when migrating to a different host during that window. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23kvm: Consolidate must-have capability checksJan Kiszka
Instead of splattering the code with #ifdefs and runtime checks for capabilities we cannot work without anyway, provide central test infrastructure for verifying their availability both at build and runtime. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23kvm: Drop smp_cpus argument from init functionsJan Kiszka
No longer used. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23x86: Optionally dump code bytes on cpu_dump_stateJan Kiszka
Introduce the cpu_dump_state flag CPU_DUMP_CODE and implement it for x86. This writes out the code bytes around the current instruction pointer. Make use of this feature in KVM to help debugging fatal vm exits. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23kvm: Improve reporting of fatal errorsJan Kiszka
Report KVM_EXIT_UNKNOWN, KVM_EXIT_FAIL_ENTRY, and KVM_EXIT_EXCEPTION with more details to stderr. The latter two are so far x86-only, so move them into the arch-specific handler. Integrate the Intel real mode warning on KVM_EXIT_FAIL_ENTRY that qemu-kvm carries, but actually restrict it to Intel CPUs. Moreover, always dump the CPU state in case we fail. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23kvm: Stop on all fatal exit reasonsJan Kiszka
Ensure that we stop the guest whenever we face a fatal or unknown exit reason. If we stop, we also have to enforce a cpu loop exit. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-21kvm: Fix coding style violationsJan Kiszka
No functional changes. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-21kvm: convert kvm_ioctl(KVM_CHECK_EXTENSION) to kvm_check_extension()Lai Jiangshan
simple cleanup and use existing helper: kvm_check_extension(). Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-10kvm: test for ioeventfd support on old kernelsStefan Hajnoczi
There used to be a limit of 6 KVM io bus devices in the kernel. On such a kernel, we can't use many ioeventfds for host notification since the limit is reached too easily. Add an API to test for this condition. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2010-12-02migration: stable ram block orderingMichael S. Tsirkin
This makes ram block ordering under migration stable, ordered by offset. This is especially useful for migration to exec, for debugging. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Jason Wang <jasowang@redhat.com>
2010-10-20Add RAM -> physical addr mapping in MCE simulationHuang Ying
In QEMU-KVM, physical address != RAM address. While MCE simulation needs physical address instead of RAM address. So kvm_physical_memory_addr_from_ram() is implemented to do the conversion, and it is invoked before being filled in the IA32_MCi_ADDR MSR. Reported-by: Dean Nelson <dnelson@redhat.com> Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-09-25Introduce qemu_madvise()Andreas Färber
vl.c has a Sun-specific hack to supply a prototype for madvise(), but the call site has apparently moved to arch_init.c. Haiku doesn't implement madvise() in favor of posix_madvise(). OpenBSD and Solaris 10 don't implement posix_madvise() but madvise(). MinGW implements neither. Check for madvise() and posix_madvise() in configure and supply qemu_madvise() as wrapper. Prefer madvise() over posix_madvise() due to flag availability. Convert all callers to use qemu_madvise() and QEMU_MADV_*. Note that on Solaris the warning is fixed by moving the madvise() prototype, not by qemu_madvise() itself. It helps with porting though, and it simplifies most call sites. v7 -> v8: * Some versions of MinGW have no sys/mman.h header. Reported by Blue Swirl. v6 -> v7: * Adopt madvise() rather than posix_madvise() semantics for returning errors. * Use EINVAL in place of ENOTSUP. v5 -> v6: * Replace two leftover instances of POSIX_MADV_NORMAL with QEMU_MADV_INVALID. Spotted by Blue Swirl. v4 -> v5: * Introduce QEMU_MADV_INVALID, suggested by Alexander Graf. Note that this relies on -1 not being a valid advice value. v3 -> v4: * Eliminate #ifdefs at qemu_advise() call sites. Requested by Blue Swirl. This will currently break the check in kvm-all.c by calling madvise() with a supported flag, which will not fail. Ideas/patches welcome. v2 -> v3: * Reuse the *_MADV_* defines for QEMU_MADV_*. Suggested by Alexander Graf. * Add configure check for madvise(), too. Add defines to Makefile, not QEMU_CFLAGS. Convert all callers, untested. Suggested by Blue Swirl. * Keep Solaris' madvise() prototype around. Pointed out by Alexander Graf. * Display configure check results. v1 -> v2: * Don't rely on posix_madvise() availability, add qemu_madvise(). Suggested by Blue Swirl. Signed-off-by: Andreas Färber <afaerber@opensolaris.org> Cc: Blue Swirl <blauwirbel@gmail.com> Cc: Alexander Graf <agraf@suse.de> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-08-23Merge remote branch 'qemu-kvm/uq/master' into stagingAnthony Liguori
2010-08-10Add function to assign ioeventfd to MMIO.Cam Macdonell
Signed-off-by: Cam Macdonell <cam@cs.ualberta.ca> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-08-05kvm: remove guest triggerable abort()Gleb Natapov
This abort() condition is easily triggerable by a guest if it configures pci bar with unaligned address that overlaps main memory. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-05kvm: Don't walk memory_size == 0 slots in kvm_client_migration_logAlex Williamson
If we've unregistered a memory area, we should avoid calling qemu_get_ram_ptr() on the left over phys_offset cruft in the slot array. Now that we support removing ramblocks, the phys_offset ram_addr_t can go away and cause a lookup fault and abort. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-07-22Rework debug exception processing for gdb useJan Kiszka
Guest debugging is currently broken under CONFIG_IOTHREAD. The reason is inconsistent or even lacking signaling the debug events from the source VCPU to the main loop and the gdbstub. This patch addresses the issue by pushing this signaling into a CPUDebugExcpHandler: cpu_debug_handler is registered as first handler, thus will be executed last after potential breakpoint emulation handlers. It sets informs the gdbstub about the debug event source, requests a debug exit of the main loop and stops the current VCPU. This mechanism works both for TCG and KVM, with and without IO-thread. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-06-28kvm: Enable XSAVE live migration supportSheng Yang
Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-06-28kvm: Switch kvm_update_guest_debug to run_on_cpuJan Kiszka
Guest debugging under KVM is currently broken once io-threads are enabled. Easily fixable by switching the fake on_vcpu to the real run_on_cpu implementation. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-11Do not stop VM if emulation failed in userspace.Gleb Natapov
Continue vcpu execution in case emulation failure happened while vcpu was in userspace. In this case #UD will be injected into the guest allowing guest OS to kill offending process and continue. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-11kvm: enable smp > 1Marcelo Tosatti
Process INIT/SIPI requests and enable -smp > 1. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-11kvm: synchronize state from cpu contextJan Kiszka
It is not safe to retrieve the KVM internal state of a given cpu while its potentially modifying it. Queue the request to run on cpu context, similarly to qemu-kvm. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-11kvm: set cpu_single_env around KVM_RUN ioctlMarcelo Tosatti
Zero cpu_single_env before leaving global lock protection, and restore on return. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-28kvm: port qemu-kvm's bitmap scanningMarcelo Tosatti
Which is significantly faster. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-04-26kvm: handle internal errorMarcelo Tosatti
Port qemu-kvm's KVM_EXIT_INTERNAL_ERROR handling to upstream. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-26KVM: x86: Add debug register saving and restoringJan Kiszka
Make use of the new KVM_GET/SET_DEBUGREGS to save/restore the x86 debug registers. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-04-19provide a stub version of kvm-all.c if !CONFIG_KVMPaolo Bonzini
This allows limited use of kvm functions (which will return ENOSYS) even in once-compiled modules. The patch also improves a bit the error messages for KVM initialization. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> [blauwirbel@gmail.com: fixed Win32 build] Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-04-18kvm: avoid collision with dprintf macro in stdio.h, spotted by clangBlue Swirl
Fixes clang errors: CC i386-softmmu/kvm.o /src/qemu/target-i386/kvm.c:40:9: error: 'dprintf' macro redefined In file included from /src/qemu/target-i386/kvm.c:21: In file included from /src/qemu/qemu-common.h:27: In file included from /usr/include/stdio.h:910: /usr/include/bits/stdio2.h:189:12: note: previous definition is here CC i386-softmmu/kvm-all.o /src/qemu/kvm-all.c:39:9: error: 'dprintf' macro redefined In file included from /src/qemu/kvm-all.c:23: In file included from /src/qemu/qemu-common.h:27: In file included from /usr/include/stdio.h:910: /usr/include/bits/stdio2.h:189:12: note: previous definition is here Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-04-01S390: Tell user why VM creation failedAlexander Graf
The KVM kernel module on S390 refuses to create a VM when the switch_amode kernel parameter is not used. Since that is not exactly obvious, let's give the user a nice warning. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-04-01kvm: add API to set ioeventfdMichael S. Tsirkin
Comment on kvm usage: rather than require users to do if (kvm_enabled()) and/or ifdefs, this patch adds an API that, internally, is defined to stub function on non-kvm build, and checks kvm_enabled for non-kvm run. While rest of qemu code still uses if (kvm_enabled()), I think this approach is cleaner, and we should convert rest of code to it long term. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-03-29Move KVM and Xen global flags to vl.cBlue Swirl
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-03-04KVM: Rework VCPU state writeback APIJan Kiszka
This grand cleanup drops all reset and vmsave/load related synchronization points in favor of four(!) generic hooks: - cpu_synchronize_all_states in qemu_savevm_state_complete (initial sync from kernel before vmsave) - cpu_synchronize_all_post_init in qemu_loadvm_state (writeback after vmload) - cpu_synchronize_all_post_init in main after machine init - cpu_synchronize_all_post_reset in qemu_system_reset (writeback after system reset) These writeback points + the existing one of VCPU exec after cpu_synchronize_state map on three levels of writeback: - KVM_PUT_RUNTIME_STATE (during runtime, other VCPUs continue to run) - KVM_PUT_RESET_STATE (on synchronous system reset, all VCPUs stopped) - KVM_PUT_FULL_STATE (on init or vmload, all VCPUs stopped as well) This level is passed to the arch-specific VCPU state writing function that will decide which concrete substates need to be written. That way, no writer of load, save or reset functions that interact with in-kernel KVM states will ever have to worry about synchronization again. That also means that a lot of reasons for races, segfaults and deadlocks are eliminated. cpu_synchronize_state remains untouched, just as Anthony suggested. We continue to need it before reading or writing of VCPU states that are also tracked by in-kernel KVM subsystems. Consequently, this patch removes many cpu_synchronize_state calls that are now redundant, just like remaining explicit register syncs. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-04KVM: Rework of guest debug state writingJan Kiszka
So far we synchronized any dirty VCPU state back into the kernel before updating the guest debug state. This was a tribute to a deficite in x86 kernels before 2.6.33. But as this is an arch-dependent issue, it is better handle in the x86 part of KVM and remove the writeback point for generic code. This also avoids overwriting the flushed state later on if user space decides to change some more registers before resuming the guest. We furthermore need to reinject guest exceptions via the appropriate mechanism. That is KVM_SET_GUEST_DEBUG for older kernels and KVM_SET_VCPU_EVENTS for recent ones. Using both mechanisms at the same time will cause state corruptions. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-02-22kvm-all.c: define smp_wmb and use it for coalesced mmioMarcelo Tosatti
Acked-by: "Michael S. Tsirkin" <mst@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-02-22kvm: remove pre-entry exit_request check with iothread enabledMarcelo Tosatti
With SIG_IPI blocked vcpu loop exit notification happens via -EAGAIN from KVM_RUN. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-02-22kvm: consume internal signal with sigtimedwaitMarcelo Tosatti
Change the way the internal qemu signal, used for communication between iothread and vcpus, is handled. Block and consume it with sigtimedwait on the outer vcpu loop, which allows more precise timing control. Change from standard signal (SIGUSR1) to real-time one, so multiple signals are not collapsed. Set the signal number on KVM's in-kernel allowed sigmask. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-02-10kvm: reduce code duplication in config_iothreadAmit Shah
We have some duplicated code in the CONFIG_IOTHREAD #ifdef and #else cases. Fix that. Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-02-09kvm: move kvm to use memory notifiersMichael S. Tsirkin
remove direct kvm calls from exec.c, make kvm use memory notifiers framework instead. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Avi Kivity <avi@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-02-09kvm: move kvm_set_phys_mem aroundMichael S. Tsirkin
move kvm_set_phys_mem so that it will be later available earlier in the file. needed for next patch using memory notifiers. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Avi Kivity <avi@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-02-03KVM: Move and rename regs_modifiedJan Kiszka
Touching the user space representation of KVM's VCPU state is - naturally - a per-VCPU thing. So move the dirty flag into KVM_CPU_COMMON and rename it at this chance to reflect its true meaning. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
2010-02-03kvm: Flush coalesced MMIO buffer periodlySheng Yang
The default action of coalesced MMIO is, cache the writing in buffer, until: 1. The buffer is full. 2. Or the exit to QEmu due to other reasons. But this would result in a very late writing in some condition. 1. The each time write to MMIO content is small. 2. The writing interval is big. 3. No need for input or accessing other devices frequently. This issue was observed in a experimental embbed system. The test image simply print "test" every 1 seconds. The output in QEmu meets expectation, but the output in KVM is delayed for seconds. Per Avi's suggestion, I hooked flushing coalesced MMIO buffer in VGA update handler. By this way, We don't need vcpu explicit exit to QEmu to handle this issue. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03kvm: x86: Add support for VCPU event statesJan Kiszka
This patch extends the qemu-kvm state sync logic with support for KVM_GET/SET_VCPU_EVENTS, giving access to yet missing exception, interrupt and NMI states. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-12-03Don't leak file descriptorsKevin Wolf
We're leaking file descriptors to child processes. Set FD_CLOEXEC on file descriptors that don't need to be passed to children to stop this misbehaviour. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-11-17kvm: Add arch reset handlerJan Kiszka
Will be required by succeeding changes. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-11-12kvm: Move KVM mp_state accessors to i386-specific codeHollis Blanchard
Unbreaks PowerPC and S390 KVM builds. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-12unlock iothread mutex before running kvm ioctlGlauber Costa
Without this, kvm will hold the mutex while it issues its run ioctl, and never be able to step out of it, causing a deadlock. Patchworks-ID: 35359 Signed-off-by: Glauber Costa <glommer@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-05temporary fix for on_vcpuGlauber Costa
Recent changes made on_vcpu hit the abort() path, even with the IO thread disabled. This is because cpu_single_env is no longer set when we call this function. Although the correct fix is a little bit more complicated that that, the recent thread in which I proposed qemu_queue_work (which fixes that, btw), is likely to go on a quite different direction. So for the benefit of those using guest debugging, I'm proposing this simple fix in the interim. Signed-off-by: Glauber Costa <glommer@redhat.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-05kvm: Fix guest single-steppingJan Kiszka
Hopefully the last regression of 4c0960c0: KVM_SET_GUEST_DEBUG requires properly synchronized guest registers (on x86: eflags) on entry. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>