APEI tables generating and CPER record
======================================

Copyright (c) 2019 HUAWEI TECHNOLOGIES CO., LTD.
This work is licensed under the terms of the GNU GPL, version 2 or later.
See the COPYING file in the top-level directory.

Design Details
--------------
::

         etc/acpi/tables                                 etc/hardware_errors
      ====================                      ==========================================
  + +--------------------------+            +-----------------------+
  | | HEST                     |            |    address            |            +--------------+
  | +--------------------------+            |    registers          |            | Error Status |
  | | GHES1                    |            | +---------------------+            | Data Block 1 |
  | +--------------------------+ +--------->| |error_block_address1 |----------->| +------------+
  | | .................        | |          | +---------------------+            | |  CPER      |
  | | error_status_address-----+-+ +------->| |error_block_address2 |--------+   | |  CPER      |
  | | .................        |   |        | +---------------------+        |   | |  ....      |
  | | read_ack_register--------+-+ |        | |    ..............   |        |   | |  CPER      |
  | | read_ack_preserve        | | |        +-----------------------+        |   | +------------+
  | | read_ack_write           | | | +----->| |error_block_addressN |------+ |   | Error Status |
  + +--------------------------+ | | |      | +---------------------+      | |   | Data Block 2 |
  | | GHES2                    | +-+-+----->| |read_ack_register1   |      | +-->| +------------+
  + +--------------------------+   | |      | +---------------------+      |     | |  CPER      |
  | | .................        |   | | +--->| |read_ack_register2   |      |     | |  CPER      |
  | | error_status_address-----+---+ | |    | +---------------------+      |     | |  ....      |
  | | .................        |     | |    | |  .............      |      |     | |  CPER      |
  | | read_ack_register--------+-----+-+    | +---------------------+      |     +-+------------+
  | | read_ack_preserve        |     |   +->| |read_ack_registerN   |      |     | |..........  |
  | | read_ack_write           |     |   |  | +---------------------+      |     | +------------+
  + +--------------------------+     |   |                                 |     | Error Status |
  | | ...............          |     |   |                                 |     | Data Block N |
  + +--------------------------+     |   |                                 +---->| +------------+
  | | GHESN                    |     |   |                                       | |  CPER      |
  + +--------------------------+     |   |                                       | |  CPER      |
  | | .................        |     |   |                                       | |  ....      |
  | | error_status_address-----+-----+   |                                       | |  CPER      |
  | | .................        |         |                                       +-+------------+
  | | read_ack_register--------+---------+
  | | read_ack_preserve        |
  | | read_ack_write           |
  + +--------------------------+

(1) QEMU generates the ACPI HEST table and places it in the existing
    "etc/acpi/tables" fw_cfg blob. Each error source can use a different
    notification type.

(2) QEMU introduces and populates a new fw_cfg blob called
    "etc/hardware_errors". This blob contains two tables: an address
    registers table and an Error Status Data Block table.

(3) The address registers table contains N Error Block Address entries
    and N Read Ack Register entries. Each entry is 8 bytes in size.
    The Error Status Data Block table contains N Error Status Data Block
    entries. Each entry is 4096 (0x1000) bytes in size. The total size
    of the "etc/hardware_errors" fw_cfg blob is therefore
    (N * 8 * 2 + N * 4096) bytes, where N is the number of kinds of
    hardware error sources.

(4) QEMU generates the ACPI linker/loader script for the firmware. The
    firmware pre-allocates memory for the "etc/acpi/tables" and
    "etc/hardware_errors" blobs and copies the blob contents there.

(5) QEMU generates N ADD_POINTER commands, which patch the
    "error_status_address" fields of the HEST table with pointers to the
    corresponding "error_block_address" entries in the address registers
    table of the "etc/hardware_errors" blob.

(6) QEMU generates N ADD_POINTER commands, which patch the
    "read_ack_register" fields of the HEST table with pointers to the
    corresponding "read_ack_register" entries in the address registers
    table of the "etc/hardware_errors" blob.

(7) QEMU generates N ADD_POINTER commands for the firmware, which patch
    the "error_block_address" fields with pointers to the respective
    "Error Status Data Block" in the "etc/hardware_errors" blob.

(8) QEMU defines a third, write-only fw_cfg blob called
    "etc/hardware_errors_addr". Through that blob, the firmware can send
    the guest-side allocation address back to QEMU. The
    "etc/hardware_errors_addr" blob contains an 8-byte entry. QEMU
    generates a single WRITE_POINTER command for the firmware. The
    firmware writes the start address of the "etc/hardware_errors" blob
    back to the fw_cfg file "etc/hardware_errors_addr".

(9) When QEMU receives a SIGBUS from the kernel, it formats the CPER
    directly into guest memory and then injects a platform-specific
    interrupt (a Synchronous External Abort in the case of the arm/virt
    machine) to notify the guest.

(10) The guest kernel handles this notification (raised by the virtual
     hardware): the guest APEI driver reads the CPER recorded by QEMU
     and performs the recovery.
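
The size formula in step (3) and the pointer patching in step (7) can be
sketched in a few lines of Python. This is only an illustrative model, not
QEMU code: the function names are hypothetical, the entry order (all Error
Block Address entries, then all Read Ack Register entries, then the data
blocks with no padding) is an assumption read off the diagram above, and
ADD_POINTER is modelled simply as adding the blob's guest address to an
8-byte little-endian offset already stored in the field.

```python
import struct

ENTRY_SIZE = 8            # each address register entry is 8 bytes
DATA_BLOCK_SIZE = 0x1000  # each Error Status Data Block is 4096 bytes


def blob_size(n: int) -> int:
    """Total size in bytes of "etc/hardware_errors" for n error sources."""
    return n * ENTRY_SIZE * 2 + n * DATA_BLOCK_SIZE


def error_block_address_offset(i: int) -> int:
    """Offset of the i-th (0-based) Error Block Address entry."""
    return i * ENTRY_SIZE


def read_ack_register_offset(i: int, n: int) -> int:
    """Offset of the i-th Read Ack Register entry.

    Assumption: these entries follow the n Error Block Address entries.
    """
    return (n + i) * ENTRY_SIZE


def data_block_offset(i: int, n: int) -> int:
    """Offset of the i-th Error Status Data Block.

    Assumption: the data blocks start right after the 2 * n registers.
    """
    return 2 * n * ENTRY_SIZE + i * DATA_BLOCK_SIZE


def build_patched_blob(n: int, base: int) -> bytearray:
    """Model the blob after the firmware has applied the step (7) commands.

    Each error_block_address field first holds the offset of its data
    block; the ADD_POINTER model then adds the guest address `base` at
    which the firmware placed the blob.
    """
    blob = bytearray(blob_size(n))
    for i in range(n):
        field = error_block_address_offset(i)
        # QEMU side: store the blob-relative offset of the data block.
        struct.pack_into("<Q", blob, field, data_block_offset(i, n))
        # Firmware side: turn the offset into an absolute guest address.
        (offset,) = struct.unpack_from("<Q", blob, field)
        struct.pack_into("<Q", blob, field, base + offset)
    return blob
```

Under these assumptions, a single error source (N = 1) gives a blob of
1 * 8 * 2 + 1 * 4096 = 4112 bytes, with the only data block starting at
offset 16.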
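
Steps (8) and (9) describe how QEMU finds the place to write a record once
the firmware has reported the blob address. The following Python sketch
walks through that lookup against a toy memory model. Everything here is
hypothetical and simplified: ``GuestMemory``, ``firmware_setup`` and
``record_error`` are illustrative names, the layout assumes the Error Block
Address entries sit at the start of the blob, the record is an opaque byte
string rather than a real CPER, and interrupt injection and the read-ack
handshake are not modelled.

```python
import struct

ENTRY_SIZE = 8            # size of one address register entry, in bytes
DATA_BLOCK_SIZE = 0x1000  # size of one Error Status Data Block, in bytes


class GuestMemory:
    """Toy stand-in for a range of guest physical memory (hypothetical)."""

    def __init__(self, base: int, size: int) -> None:
        self.base = base
        self.data = bytearray(size)

    def read_u64(self, addr: int) -> int:
        return struct.unpack_from("<Q", self.data, addr - self.base)[0]

    def write_u64(self, addr: int, value: int) -> None:
        struct.pack_into("<Q", self.data, addr - self.base, value)

    def read(self, addr: int, length: int) -> bytes:
        start = addr - self.base
        return bytes(self.data[start:start + length])

    def write(self, addr: int, payload: bytes) -> None:
        start = addr - self.base
        self.data[start:start + len(payload)] = payload


def firmware_setup(mem: GuestMemory, n: int) -> int:
    """Mimic the firmware: patch the n error_block_address entries and
    return the blob start address, i.e. the value that would be written
    to "etc/hardware_errors_addr"."""
    blob_start = mem.base
    for i in range(n):
        block = blob_start + 2 * n * ENTRY_SIZE + i * DATA_BLOCK_SIZE
        mem.write_u64(blob_start + i * ENTRY_SIZE, block)
    return blob_start


def record_error(mem: GuestMemory, hardware_errors_addr: int,
                 source: int, record: bytes) -> int:
    """Write `record` into the data block of error source `source`
    (0-based) and return the guest address it was written to."""
    if len(record) > DATA_BLOCK_SIZE:
        raise ValueError("record does not fit in one data block")
    # Follow the pointer stored in the source's error_block_address entry.
    entry_addr = hardware_errors_addr + source * ENTRY_SIZE
    block_addr = mem.read_u64(entry_addr)
    mem.write(block_addr, record)
    return block_addr
```

The point of the indirection is that QEMU only needs the single address
reported through "etc/hardware_errors_addr"; every data block location is
then recovered from the address registers that the firmware patched.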