\input texinfo @c -*- texinfo -*-
@c %**start of header
@setfilename qemu-tech.info

@documentlanguage en
@documentencoding UTF-8

@settitle QEMU Internals
@exampleindent 0
@paragraphindent 0
@c %**end of header

@ifinfo
@direntry
* QEMU Internals: (qemu-tech). The QEMU Emulator Internals.
@end direntry
@end ifinfo

@iftex
@titlepage
@sp 7
@center @titlefont{QEMU Internals}
@sp 3
@end titlepage
@end iftex

@ifnottex
@node Top
@top

@menu
* CPU emulation::
* Translator Internals::
* Device emulation::
* QEMU compared to other emulators::
* Bibliography::
@end menu
@end ifnottex

@contents

@node CPU emulation
@chapter CPU emulation

@menu
* x86:: x86 and x86-64 emulation
* ARM:: ARM emulation
* MIPS:: MIPS emulation
* PPC:: PowerPC emulation
* SPARC:: Sparc32 and Sparc64 emulation
* Xtensa:: Xtensa emulation
@end menu

@node x86
@section x86 and x86-64 emulation

QEMU x86 target features:

@itemize

@item The virtual x86 CPU supports 16 bit and 32 bit addressing with segmentation.
LDT/GDT and IDT are emulated. VM86 mode is also supported to run
DOSEMU. There is some support for MMX/3DNow!, SSE, SSE2, SSE3, SSSE3,
and SSE4 as well as x86-64 SVM.

@item Support for host page sizes larger than 4KB in user mode emulation.

@item QEMU can emulate itself on x86.

@item An extensive Linux x86 CPU test program is included (@file{tests/test-i386}).
It can be used to test other x86 virtual CPUs.

@end itemize

Current QEMU limitations:

@itemize

@item Limited x86-64 support.

@item IPC syscalls are missing.

@item The x86 segment limits and access rights are not tested at every
memory access (yet). Fortunately, very few OSes seem to rely on that for
normal use.

@end itemize

@node ARM
@section ARM emulation

@itemize

@item Full ARM 7 user emulation.

@item NWFPE FPU support included in user Linux emulation.

@item Can run most ARM Linux binaries.

@end itemize

@node MIPS
@section MIPS emulation

@itemize

@item The system emulation allows full MIPS32/MIPS64 Release 2 emulation,
including privileged instructions, FPU and MMU, in both little and big
endian modes.

@item The Linux userland emulation can run many 32 bit MIPS Linux binaries.

@end itemize

Current QEMU limitations:

@itemize

@item Self-modifying code is not always handled correctly.

@item 64 bit userland emulation is not implemented.

@item The system emulation is not complete enough to run real firmware.

@item The watchpoint debug facility is not implemented.

@end itemize

@node PPC
@section PowerPC emulation

@itemize

@item Full PowerPC 32 bit emulation, including privileged instructions,
FPU and MMU.

@item Can run most PowerPC Linux binaries.

@end itemize

@node SPARC
@section Sparc32 and Sparc64 emulation

@itemize

@item Full SPARC V8 emulation, including privileged
instructions, FPU and MMU. SPARC V9 emulation includes most privileged
and VIS instructions, FPU and I/D MMU. Alignment is fully enforced.

@item Can run most 32-bit SPARC Linux binaries, SPARC32PLUS Linux binaries and
some 64-bit SPARC Linux binaries.

@end itemize

Current QEMU limitations:

@itemize

@item IPC syscalls are missing.

@item Floating point exception support is buggy.

@item Atomic instructions are not correctly implemented.

@item There are still some problems with Sparc64 emulators.

@end itemize

@node Xtensa
@section Xtensa emulation

@itemize

@item Core Xtensa ISA emulation, including most options: code density,
loop, extended L32R, 16- and 32-bit multiplication, 32-bit division,
MAC16, miscellaneous operations, boolean, FP coprocessor, coprocessor
context, debug, multiprocessor synchronization,
conditional store, exceptions, relocatable vectors, unaligned exception,
interrupts (including high priority and timer), hardware alignment,
region protection, region translation, MMU, windowed registers, thread
pointer, processor ID.

@item Options not implemented: data/instruction cache (including cache
prefetch and locking), XLMI, processor interface. Options not
covered by the core ISA (e.g. FLIX, wide branches) are not implemented either.

@item Can run most Xtensa Linux binaries.

@item A new core configuration that requires no additional instructions
may be created from an overlay with a minimal amount of hand-written code.

@end itemize

@node Translator Internals
@chapter Translator Internals

@menu
* CPU state optimisations::
* Translation cache::
* Direct block chaining::
* Self-modifying code and translated code invalidation::
* Exception support::
* MMU emulation::
@end menu

QEMU is a dynamic translator. When it first encounters a piece of code,
it converts it to the host instruction set. Dynamic translators are
usually very complicated and highly CPU dependent. QEMU uses some tricks
which make it relatively portable and simple while achieving good
performance.

QEMU's dynamic translation backend is called TCG, for "Tiny Code
Generator". For more information, please take a look at @code{tcg/README}.

@node CPU state optimisations
@section CPU state optimisations

The target CPUs have many internal states which change the way they
evaluate instructions. In order to achieve good speed, the
translation phase assumes that some state information of the virtual
CPU cannot change within a translated block. This state is recorded in
the Translation Block (TB). If the state changes (e.g. privilege level),
a new TB will be generated and the previous TB won't be used anymore
until the state matches the state recorded in the previous TB. For
example, if the SS, DS and ES segments have a zero base, then the
translator does not even generate an addition for the segment base.

[The FPU stack pointer register is not handled that way yet.]
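The effect of recording state in the TB can be sketched as a cache lookup
keyed on both the guest PC and the assumed CPU state. This is only a
minimal illustration in Python: the names @code{TranslationCache},
@code{lookup} and the flag strings are hypothetical and do not correspond
to QEMU's actual data structures.

```python
class TranslationCache:
    """Toy cache of translated blocks keyed on (pc, cpu_flags)."""

    def __init__(self):
        self.blocks = dict()
        self.translations = 0

    def lookup(self, pc, cpu_flags):
        key = (pc, cpu_flags)
        tb = self.blocks.get(key)
        if tb is None:
            # State not seen before at this PC: translate a new block.
            self.translations += 1
            tb = "tb(pc=%#x, flags=%r)" % (pc, cpu_flags)
            self.blocks[key] = tb
        return tb


cache = TranslationCache()
user = ("cpl=3", "zero_seg_base")      # hypothetical state snapshots
kernel = ("cpl=0", "zero_seg_base")
first = cache.lookup(0x1000, user)
second = cache.lookup(0x1000, kernel)  # privilege level changed: new TB
third = cache.lookup(0x1000, user)     # state matches again: old TB reused
```

In this sketch the same guest address yields two distinct blocks, and the
first one becomes usable again as soon as the recorded state matches.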

@node Translation cache
@section Translation cache

A 32 MByte cache holds the most recently used translations. For
simplicity, it is completely flushed when it is full. A translation unit
contains just a single basic block (a block of x86 instructions
terminated by a jump or by a virtual CPU state change which the
translator cannot deduce statically).
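The flush-when-full policy described above can be sketched as follows.
The class @code{FlushingCodeCache} and its tiny capacity are assumptions
made for illustration only; the real cache is a contiguous code buffer,
not a Python dictionary.

```python
class FlushingCodeCache:
    """Toy code cache that is emptied completely once it is full."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.blocks = dict()
        self.flushes = 0

    def insert(self, pc, code_size):
        if self.used + code_size > self.capacity:
            # No per-block eviction: drop everything and start over.
            self.blocks.clear()
            self.used = 0
            self.flushes += 1
        self.blocks[pc] = code_size
        self.used += code_size


cache = FlushingCodeCache(capacity_bytes=100)
for pc in (0x10, 0x20, 0x30):
    cache.insert(pc, 40)   # the third insert overflows and forces a flush
```

The trade-off is that no bookkeeping is needed to pick a victim block, at
the cost of retranslating still-useful code after each flush.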

@node Direct block chaining
@section Direct block chaining

After each translated basic block is executed, QEMU uses the simulated
Program Counter (PC) and other CPU state information (such as the CS
segment base value) to find the next basic block.

In order to accelerate the most common cases where the new simulated PC
is known, QEMU can patch a basic block so that it jumps directly to the
next one.

The most portable code uses an indirect jump. An indirect jump makes
it easier to make the jump target modification atomic. On some host
architectures (such as x86 or PowerPC), the @code{JUMP} opcode is
directly patched so that the block chaining has no overhead.
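A minimal sketch of the chaining idea, assuming each block has a single
statically known successor: the first transition between two blocks goes
through the slow lookup, after which the previous block is patched to jump
directly. The names @code{Block}, @code{chained} and @code{run} are
hypothetical, and a Python attribute stands in for the patched jump.

```python
class Block:
    def __init__(self, pc, next_pc):
        self.pc = pc
        self.next_pc = next_pc     # statically known successor PC
        self.chained = None        # patched direct jump target, if any


def run(blocks, start_pc, max_blocks):
    """Execute blocks, counting how often the slow lookup is needed."""
    lookups = 0
    trace = []
    block = None
    pc = start_pc
    for _ in range(max_blocks):
        if block is not None and block.chained is not None:
            nxt = block.chained            # fast path: follow patched jump
        else:
            lookups += 1
            nxt = blocks[pc]               # slow path: look up by PC
            if block is not None:
                block.chained = nxt        # patch the previous block
        trace.append(nxt.pc)
        block = nxt
        pc = nxt.next_pc
    return trace, lookups


loop = dict()
loop[0x100] = Block(0x100, 0x200)
loop[0x200] = Block(0x200, 0x100)
trace, lookups = run(loop, 0x100, 6)
```

For this two-block loop, only the first pass needs lookups; every later
transition follows a patched link.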

@node Self-modifying code and translated code invalidation
@section Self-modifying code and translated code invalidation

Self-modifying code is a special challenge in x86 emulation because no
instruction cache invalidation is signaled by the application when code
is modified.

When translated code is generated for a basic block, the corresponding
host page is write protected if it is not already read-only. Then, if
a write access is made to the page, Linux raises a SEGV signal. QEMU
then invalidates all the translated code in the page and enables write
accesses to the page.

Correct translated code invalidation is done efficiently by maintaining
a linked list of every translated block contained in a given page. Other
linked lists are also maintained to undo direct block chaining.

On RISC targets, correctly written software uses memory barriers and
cache flushes, so some of the protection above would not be
necessary. However, QEMU still requires that the generated code always
matches the target instructions in memory in order to handle
exceptions correctly.
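The per-page bookkeeping described above can be modelled roughly as
follows. @code{PageTracker} and its methods are hypothetical names; a
Python set stands in for the host page protection, and blocks spanning two
pages are ignored for brevity.

```python
PAGE_SIZE = 4096


class PageTracker:
    """Toy model: per-page lists of translated blocks plus protection."""

    def __init__(self):
        self.blocks_by_page = dict()   # page number -> list of block PCs
        self.write_protected = set()

    def add_block(self, pc):
        page = pc // PAGE_SIZE
        self.blocks_by_page.setdefault(page, []).append(pc)
        self.write_protected.add(page)      # stands in for write protection

    def on_write_fault(self, addr):
        """Called when a write hits a protected page (the SEGV case)."""
        page = addr // PAGE_SIZE
        invalidated = self.blocks_by_page.pop(page, [])
        self.write_protected.discard(page)  # writes are allowed again
        return invalidated


tracker = PageTracker()
tracker.add_block(0x1000)
tracker.add_block(0x1010)
tracker.add_block(0x3000)
dropped = tracker.on_write_fault(0x1FF0)    # a write somewhere in page 1
```

Only the blocks on the written page are discarded; translations on other
pages stay valid and protected.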

@node Exception support
@section Exception support

@code{longjmp()} is used when an exception such as division by zero is
encountered.

The host SIGSEGV and SIGBUS signal handlers are used to catch invalid
memory accesses. The simulated program counter is found by
retranslating the corresponding basic block and by looking where the
host program counter was at the exception point.

The virtual CPU cannot retrieve the exact @code{EFLAGS} register because
in some cases it is not computed because of condition code
optimisations. It is not a big concern because the emulated code can
still be restarted in any case.
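Recovering the simulated program counter from the host one can be
sketched as rebuilding a table of host code offsets for the block and
searching it. The functions @code{retranslate} and @code{guest_pc_at} and
the instruction sizes below are invented for illustration; they only
model the lookup, not the actual retranslation.

```python
import bisect


def retranslate(guest_insns):
    """Rebuild the host-offset table for one block.

    guest_insns is a list of (guest_pc, host_code_size) pairs, in order.
    Returns two parallel lists: host start offsets and guest PCs.
    """
    host_starts, guest_pcs = [], []
    offset = 0
    for guest_pc, host_size in guest_insns:
        host_starts.append(offset)
        guest_pcs.append(guest_pc)
        offset += host_size
    return host_starts, guest_pcs


def guest_pc_at(guest_insns, host_offset):
    """Map a faulting host offset back to the guest instruction."""
    host_starts, guest_pcs = retranslate(guest_insns)
    index = bisect.bisect_right(host_starts, host_offset) - 1
    return guest_pcs[index]


block = [(0x400, 5), (0x402, 12), (0x407, 3)]
fault_pc = guest_pc_at(block, 9)   # offset 9 lies in the second instruction
```

Because the table is rebuilt only when a fault occurs, no per-instruction
mapping has to be stored while the block runs normally.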

@node MMU emulation
@section MMU emulation

For system emulation QEMU supports a soft MMU. In that mode, the MMU
virtual to physical address translation is done at every memory
access. QEMU uses an address translation cache to speed up the
translation.

In order to avoid flushing the translated code each time the MMU
mappings change, QEMU uses a physically indexed translation cache. It
means that each basic block is indexed with its physical address.

When MMU mappings change, only the chaining of the basic blocks is
reset (i.e. a basic block can no longer jump directly to another one).
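The two caches described above can be illustrated together: a small
address translation cache in front of a page table, and a translated-code
cache indexed by physical address. This is a simplified sketch; the class
@code{SoftMMU}, the single-level page table and the 4 KB page size are
assumptions chosen for the example.

```python
PAGE_BITS = 12
PAGE_MASK = (1 << PAGE_BITS) - 1


class SoftMMU:
    def __init__(self, page_table):
        self.page_table = page_table   # virtual page -> physical page
        self.tlb = dict()              # cache of recent translations
        self.walks = 0

    def translate(self, vaddr):
        vpage = vaddr >> PAGE_BITS
        ppage = self.tlb.get(vpage)
        if ppage is None:
            self.walks += 1            # slow path: consult the page table
            ppage = self.page_table[vpage]
            self.tlb[vpage] = ppage
        return (ppage << PAGE_BITS) | (vaddr & PAGE_MASK)

    def remap(self, page_table):
        self.page_table = page_table
        self.tlb.clear()               # address translations are stale


tb_cache = dict()                      # physical address -> translated block
mmu = SoftMMU(dict([(0x10, 0x80)]))
tb_cache[mmu.translate(0x10234)] = "block A"
mmu.remap(dict([(0x20, 0x80)]))        # same physical page, new virtual page
reused = tb_cache.get(mmu.translate(0x20234))
```

After the remapping, the block is still found because its key is the
physical address, which did not change.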

@node Device emulation
@chapter Device emulation

Systems emulated by QEMU are organized by boards. During the
initialization phase, each board instantiates a number of CPUs, devices,
RAM and ROM. Each device in turn can assign I/O ports or memory areas (for
MMIO) to its handlers. When the emulation starts, an access to the
ports or MMIO memory areas assigned to the device causes the
corresponding handler to be called.
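The handler dispatch described above can be sketched as a bus that maps
port ranges to device objects. @code{IOBus} and @code{ToySerial} are
hypothetical names, and the port number is used purely as an example; none
of this is QEMU's real device API.

```python
class IOBus:
    """Toy port-I/O bus: devices register handlers for port ranges."""

    def __init__(self):
        self.regions = []   # list of (first_port, last_port, device)

    def register(self, first_port, count, device):
        self.regions.append((first_port, first_port + count - 1, device))

    def write(self, port, value):
        for first, last, device in self.regions:
            if first <= port <= last:
                device.write(port - first, value)
                return True
        return False        # unassigned port: the access is ignored


class ToySerial:
    def __init__(self):
        self.output = []

    def write(self, offset, value):
        if offset == 0:                 # data register at offset 0
            self.output.append(chr(value))


bus = IOBus()
uart = ToySerial()
bus.register(0x3F8, 8, uart)            # board setup assigns the ports
for ch in "ok":
    bus.write(0x3F8, ord(ch))           # guest accesses reach the handler
```

The device only sees offsets relative to its own region, so the same
device model can be placed at different addresses by different boards.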

RAM and ROM are handled more efficiently: only the offset to the host
memory needs to be added to the guest address.

The video RAM of VGA and other display cards is special: it can be
read or written directly like RAM, but write accesses cause the memory
to be marked with the VGA_DIRTY flag as well.
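Both ideas, the constant offset for RAM and the dirty marking for video
memory, can be illustrated with a small model. @code{GuestRAM}, the
page-granular dirty set and the base address are assumptions for this
sketch; they do not reflect how the VGA_DIRTY flag is actually stored.

```python
PAGE_SIZE = 4096


class GuestRAM:
    """Toy guest RAM backed by a host bytearray, with dirty tracking."""

    def __init__(self, guest_base, size):
        self.guest_base = guest_base
        self.host = bytearray(size)
        self.dirty_pages = set()

    def write(self, guest_addr, value):
        offset = guest_addr - self.guest_base   # guest address -> host offset
        self.host[offset] = value
        self.dirty_pages.add(offset // PAGE_SIZE)

    def read(self, guest_addr):
        return self.host[guest_addr - self.guest_base]

    def take_dirty(self):
        """Return and clear the dirty set, as a display refresh might."""
        pages = sorted(self.dirty_pages)
        self.dirty_pages.clear()
        return pages


vram = GuestRAM(guest_base=0xA0000, size=2 * PAGE_SIZE)
vram.write(0xA0000, 0x41)
vram.write(0xA1004, 0x42)
```

A display refresh would then only need to redraw the pages returned by
@code{take_dirty} instead of scanning the whole video memory.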

QEMU supports some device classes like serial and parallel ports, USB,
drives and network devices, by providing APIs for easier connection to
the generic, higher level implementations. These APIs hide the
implementation details from the devices, such as native device use or
advanced block device formats like QCOW.

Usually the devices implement a reset method and register support for
saving and loading of the device state. The devices can also use
timers, especially together with the use of bottom halves (BHs).

@node QEMU compared to other emulators
@chapter QEMU compared to other emulators

Like Bochs [1], QEMU emulates an x86 CPU. But QEMU is much faster than
Bochs as it uses dynamic compilation. Bochs is closely tied to x86 PC
emulation while QEMU can emulate several processors.

Like Valgrind [2], QEMU does user space emulation and dynamic
translation. Valgrind is mainly a memory debugger, while QEMU has no
support for memory debugging (QEMU could be used to detect out-of-bounds
memory accesses as Valgrind does, but it has no support for tracking
uninitialised data as Valgrind does). The Valgrind dynamic translator
generates better code than QEMU (in particular it does register
allocation) but it is closely tied to an x86 host and target and has no
support for precise exceptions and system emulation.

EM86 [3] is the closest project to user space QEMU (and QEMU still uses
some of its code, in particular the ELF file loader). EM86 was limited
to an Alpha host and used a proprietary and slow interpreter (the
interpreter part of the FX!32 Digital Win32 code translator [4]).

TWIN from Willows Software was a Windows API emulator like Wine. It is less
accurate than Wine but includes a protected mode x86 interpreter to launch
x86 Windows executables. Such an approach has greater potential because most
of the Windows API is executed natively but it is far more difficult to
develop because all the data structures and function parameters exchanged
between the API and the x86 code must be converted.

User mode Linux [5] was the only solution before QEMU to launch a
Linux kernel as a process while not needing any host kernel
patches. However, user mode Linux requires heavy patches to the kernel
being launched, while QEMU accepts unpatched Linux kernels. The price to
pay is that QEMU is slower.

The Plex86 [6] PC virtualizer is done in the same spirit as the now
obsolete qemu-fast system emulator. It requires a patched Linux kernel
to work (you cannot launch the same kernel on your PC), but the
patches are really small. As it is a PC virtualizer (no emulation is
done except for some privileged instructions), it has the potential of
being faster than QEMU. The downside is that a complicated (and
potentially unsafe) host kernel patch is needed.

The commercial PC virtualizers (VMWare [7], VirtualPC [8]) are faster
than QEMU (without virtualization), but they all need specific, proprietary
and potentially unsafe host drivers. Moreover, they are unable to
provide cycle-exact simulation as an emulator can.

VirtualBox [9], Xen [10] and KVM [11] are based on QEMU. QEMU-SystemC
[12] uses QEMU to simulate a system where some hardware devices are
developed in SystemC.

@node Bibliography
@chapter Bibliography

@table @asis

@item [1]
@url{http://bochs.sourceforge.net/}, the Bochs IA-32 Emulator Project,
by Kevin Lawton et al.

@item [2]
@url{http://www.valgrind.org/}, Valgrind, an open-source memory debugger
for GNU/Linux.

@item [3]
@url{http://ftp.dreamtime.org/pub/linux/Linux-Alpha/em86/v0.2/docs/em86.html},
the EM86 x86 emulator on Alpha-Linux.

@item [4]
@url{http://www.usenix.org/publications/library/proceedings/usenix-nt97/@/full_papers/chernoff/chernoff.pdf},
DIGITAL FX!32: Running 32-Bit x86 Applications on Alpha NT, by Anton
Chernoff and Ray Hookway.

@item [5]
@url{http://user-mode-linux.sourceforge.net/},
The User-mode Linux Kernel.

@item [6]
@url{http://www.plex86.org/},
The new Plex86 project.

@item [7]
@url{http://www.vmware.com/},
The VMWare PC virtualizer.

@item [8]
@url{https://www.microsoft.com/download/details.aspx?id=3702},
The VirtualPC PC virtualizer.

@item [9]
@url{http://virtualbox.org/},
The VirtualBox PC virtualizer.

@item [10]
@url{http://www.xen.org/},
The Xen hypervisor.

@item [11]
@url{http://www.linux-kvm.org/},
Kernel Based Virtual Machine (KVM).

@item [12]
@url{http://www.greensocs.com/projects/QEMUSystemC},
QEMU-SystemC, a hardware co-simulator.

@end table

@bye