Code layout
===========
A Introduction
The software contained in the 'bootwrapper' directory allows
the execution of a software payload e.g. a Linux stack to
alternate between two multi-core clusters of ARM Cortex-A15
& Cortex-A7 processors connected by a coherent
interconnect. To achieve this aim it provides the ability
to:
1. Save the processor context on one cluster (henceforth
called the outbound cluster) and restore it on the other
cluster (henceforth called the inbound cluster).
2. Hide any software visible microarchitectural differences
between the Cortex-A15 & Cortex-A7 processors.
3. Use the ARM Virtualization Extensions to perform 1. and 2.
in a payload software agnostic manner.
This software is intended to be executed on the Real-Time
System Model v7.0.1 (RTSM_VE_Cortex_A15x1_A7x1 and
RTSM_VE_Cortex_A15x4_A7x4).
In addition to switching the payload software execution
between the two clusters, the software also contains support
for executing the payload software simultaneously on the two
clusters.
This is called the MP configuration. In its current state,
it mainly involves making the payload software believe that
the A15 cluster includes the cpus present on the Cortex-A7
cluster, i.e. there is one cluster with more cpus than there
physically are. [Note that MP support is highly experimental
and unstable. It is NOT the focus of this release and is
intended for purely informational purposes. The cluster
switching mode of operation remains the focus of this
release.]
The Virtualizer software needs initialization prior to being
used to perform any of the above functions. The
initialization needs to be done before the payload software
is executed. Hence, it makes sense to do this from the
existing boot firmware being used on the platform. The code
in the 'bootwrapper' directory is a bare-minimal bootloader
that:
1. Sets up the environment for execution of the payload
software in the Non-secure world by programming the
appropriate coprocessor and memory mapped peripheral
registers from the Secure world.
2. Invokes the entry point of the Virtualizer software
(bl_setup()) which does the necessary initialization.
3. Passes control to the payload software in the Non-secure
world.
B Code layout overview
1. bootwrapper/
Apart from containing the bootloader, this directory
also contains scatter files to load the bootloader,
Virtualizer and the payload software correctly on the
target platform as a single ELF file (img.axf).
The important files here are:
1. vectors.S
1. Implements the Secure world exception vectors
which are loaded to the base of physical memory
(0x80000000) at reset.
2. boot.S
1. Handles a power-on reset.
2. Initialises the I-Cache, sets up the stack &
passes control to the C handler for performing
the rest of the initialization.
3. c_start.c
1. Picks up from where the start() routine left off
in the previous file.
2. Programs the exception vector tables for the
Secure world.
3. Provides Non-secure access to certain
coprocessor registers and memory mapped
peripherals e.g. access to the cache
coherent interconnect registers, coprocessors
etc.
4. Enables functionality which can be initialised
only in the Secure world. e.g. Configuration of
interrupts as Non-secure.
5. Synchronises execution with the secondary cpus
(if present) so that any global peripheral is
accessed by them only after the primary has
initialised it.
6. Enters the non-secure HYP mode and initialises
the Virtualizer.
7. Enters the non-secure SVC mode and jumps to the
payload software entry point.
4. payload/
1. Contains two files 'fsimg' and 'kernel'.
2. The 'kernel' is a raw Linux kernel binary image.
The instructions to build this Linux image can
be found in docs/03-Linux-kernel-build.txt.
This image can be replaced with a raw binary
image of any other software payload which is
desired to be run on this system.
3. The 'fsimg' is an empty filesystem stub. If
desired, it can be replaced with a suitable
filesystem image in a Linux initramfs format. A
custom busybox filesystem was used for testing.
More complex filesystems may be used if needed
but will require the use of MMC emulation with
the ARM FastModels.
See docs/06-Optional-rootfs-build.txt for
details.
5. boot.map.template
1. Scatter file which combines the payload
software, Virtualizer and the bootloader into a
single ELF file (img.axf) which can
then be loaded on the relevant platform.
6. makemap
1. Simple perl script that takes an ELF image of
the Virtualizer, parses through the relevant
sections & adds those sections to the scatter
file so that a consolidated image can be
created.
2. big-little/common
This directory mainly deals with setting up of the HYP
processor mode and the Virtual GIC. This allows the
payload software to run unmodified while either the
Switching or the MP mode is active in the background.
The important files here are:
1. hyp_vectors.s
1. Implements the HYP mode vector table.
2. It contains the entry point "bl_setup()" which
is invoked by the bootwrapper to initialise the
Virtualizer software.
3. The exception vector for interrupts
[irq_entry()] is the entry point for all
physical interrupts. The exception vector for
hypervisor traps [hvc_entry()] is the entry
point for all accesses made by the payload
software that need to be handled in the HYP
mode.
4. Also contained is rudimentary support for fault
exception handlers [dabt_entry(), iabt_entry() &
undef_entry()].
2. hyp_setup.c
1. Extends the initialization of the Virtualizer
software into C code after a cold reset.
2. If switching is being done asynchronously then
the HYP timer interrupt is set up to periodically
(every ~12 million instructions) trigger a
switchover to the other cluster.
3. If in MP mode, then CCI snoops are enabled for
both the clusters.
3. vgic_handle.c
1. Extends handling of physical interrupts into C
code from irq_entry(). Interrupts are
acknowledged (optionally EOI'ed) and queued as
virtual interrupts. The HYP timer interrupt is
handled differently. When received, it is used as
a trigger to initiate the switchover process.
4. vgiclib.c
1. Implements handling of virtual interrupts once
they have been queued up in the vGIC HYP view
list registers. It maintains the list registers
and also saves and restores the context of the
vGIC HYP view interface.
5. pagetable_setup.c
1. Creates and sets up the HYP mode and 2nd stage
translation page tables. Accesses by the payload
software to the vGIC physical cpu interface are
mapped to the vGIC virtual cpu interface using
the 2nd stage translation page tables.
2. In the MP configuration, the translation tables
are shared by all the cpus in the two clusters.
Hence the first cpu in only one of the clusters
creates them.
6. vgic_setup.c
1. Enables virtual interrupts and exceptions,
initialises the physical cpu interface and the
HYP view interface.
3. big-little/lib
This directory implements common functionality that is
used across all the Virtualizer code. This includes:
1. Locks which can be used with Strongly Ordered and
Device memory.
2. Code tracing support on the Fast Models platform
through the use of memory mapped TUBE registers &
the Generic Trace plugin.
Details of this feature can be found in
docs/04-Cache-hit-rate-howto.txt.
3. Events to synchronise the switching process between
the clusters and within the clusters. They are also
used to synchronise the setup phase after a cold reset
in the MP configuration.
4. UART routines to support semihosting of the
printf family of functions.
5. Cache maintenance, Stack manipulation and Locking
routines.
6. Use of IPIs for HYP mode communication.
4. big-little/include
1. This directory contains the headers specific to HYP
mode setup, Switching process and common helper
routines. Most importantly, context.h contains the
data structures which are used to save and restore
the processor context.
5. big-little/switcher
This directory implements code to save and restore
processor context and to initiate/handle an
asynchronous/synchronous switchover request.
1. context/
1. ns_context.c
1. Contains top level routines to save and
restore the Non-secure world context.
2. If the type of operation is a cluster
switch it requests the secure world to save
its own context and bring the inbound
cluster out of reset. It also uses events to
synchronise the switching process between
the inbound and outbound clusters.
3. If the type of operation is a cpu hotplug
it requests the secure world to save
its own context and then saves only the
relevant HYP mode context before placing the
cpu in reset.
2. gic.c
1. Contains routines to save and restore the
context of the vGIC physical distributor and
cpu interfaces.
3. sh_vgic.c
1. The two clusters share the interrupt
controller instead of each cluster having
its own. A consequence of this is that there
is no longer a 1 to 1 mapping between cpu
ids and cpu interface ids e.g. on an
MPx1+MPx1 cluster configuration,
cpu0 of the Cortex-A7 cluster would
correspond to cpuinterface1 on the shared
vGIC. This in turn affects routing of
peripheral and software generated
interrupts. This file implements code to
allow use of the shared vGIC correctly
keeping this limitation in mind.
2. trigger/
1. async_switchover.c
1. Contains code to use the HYP timer interrupt
as a trigger to initiate a switchover
asynchronously.
2. Exports a function [signal_switchover()] which
can be used to trigger an asynchronous/synchronous
switch.
3. Implements logic to ensure that only the cpus
which have not been hot-plugged are switched
to the inbound cluster.
2. handle_switchover.s
1. Contains code to start saving the non-secure
world context and request the secure world to
power down the outbound cluster once the
inbound cluster is up and running.
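The cpu id to cpu interface id mismatch described for
sh_vgic.c above can be sketched as follows. This is only an
illustration, not the code in sh_vgic.c: it assumes that the
Cortex-A15 cpus occupy the lowest cpu interface ids and that
the Cortex-A7 cpus follow them, which matches the MPx1+MPx1
example given above. The function and parameter names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* GICv2 supports at most 8 cpu interfaces. */
#define MAX_CPU_INTERFACES 8u

/*
 * Map a (cluster, cpu id) pair to a cpu interface id on the
 * shared vGIC. ASSUMPTION: Cortex-A15 cpus come first and the
 * Cortex-A7 cpus follow. 'on_a7' is non-zero for a cpu in the
 * Cortex-A7 cluster and 'a15_cpus' is the number of cpus in
 * the Cortex-A15 cluster.
 */
unsigned cpu_interface_id(unsigned on_a7, unsigned cpu_id,
                          unsigned a15_cpus)
{
    unsigned id = on_a7 ? a15_cpus + cpu_id : cpu_id;

    assert(id < MAX_CPU_INTERFACES);
    return id;
}

/*
 * Interrupt routing is expressed as a bitmask with one bit per
 * cpu interface, so the mask has to be built from the
 * interface id rather than from the cpu id.
 */
uint8_t cpu_interface_mask(unsigned interface_id)
{
    assert(interface_id < MAX_CPU_INTERFACES);
    return (uint8_t)(1u << interface_id);
}
```

With these assumptions, cpu0 of the Cortex-A7 cluster in an
MPx1+MPx1 configuration maps to cpu interface 1, so an
interrupt aimed at it needs mask 0x02 rather than 0x01.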
6. big-little/virtualisor
This directory implements code that, using the ARM
Virtualization Extensions:
1. Hides any microarchitectural differences between the
Cortex-A15 & Cortex-A7 processors visible to the
payload software.
2. Provides a different view of the underlying hardware
than what really exists e.g. in the switching mode
it traps accesses made by the host cluster
(Cortex-A7 cluster currently) to the shared vGIC
physical distributor interface, so that routing of
interrupts can take place correctly. In the MP mode,
the L2 control and MPIDR registers are virtualized
to tell the payload software that there is one
cluster with multiple processors instead of two.
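As a rough illustration of the MPIDR virtualization mentioned
in item 2, the sketch below folds the cpus of the host cluster
into the target cluster. It is not the Virtualizer's own code.
It relies on the ARMv7 MPIDR layout (cpu id in bits [7:0],
cluster id in bits [15:8]) and assumes that the host cpus are
numbered after the target cpus. The function name and its
parameters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MPIDR_CPU_MASK      0x000000ffu  /* Aff0: cpu id     */
#define MPIDR_CLUSTER_MASK  0x0000ff00u  /* Aff1: cluster id */
#define MPIDR_CLUSTER_SHIFT 8

/*
 * Return the MPIDR value shown to the payload software in the
 * MP configuration. A cpu on the target cluster sees its
 * physical value. A cpu on the host cluster is presented as an
 * extra cpu of the target cluster (ASSUMPTION: numbered after
 * the 'target_cpus' real ones).
 */
uint32_t virtual_mpidr(uint32_t phys_mpidr, uint32_t host_cluster,
                       uint32_t target_cpus)
{
    uint32_t cpu = phys_mpidr & MPIDR_CPU_MASK;
    uint32_t cluster = (phys_mpidr & MPIDR_CLUSTER_MASK)
                       >> MPIDR_CLUSTER_SHIFT;

    /* Cluster ids can only take the values 0 and 1. */
    assert(host_cluster <= 1u);

    if (cluster != host_cluster)
        return phys_mpidr;

    cpu += target_cpus;
    cluster = host_cluster ^ 1u;

    /* Keep every bit outside the two affinity fields. */
    return (phys_mpidr & ~(MPIDR_CPU_MASK | MPIDR_CLUSTER_MASK))
           | (cluster << MPIDR_CLUSTER_SHIFT) | cpu;
}
```

For example, with host cluster 1 and four target cpus, cpu0 of
the host cluster (physical MPIDR 0x80000100) would be reported
as cpu4 of cluster 0 (0x80000004).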
The ARM Virtualization Extensions provide a set of trap
registers (HCPTR (Hyp Coprocessor Trap Register), HSTR
(Hyp System Trap Register), HDCR (Hyp Debug
Configuration Register)) that select which accesses made
by the payload software to the coprocessor block are
trapped into the HYP mode.
Accesses to memory mapped peripherals e.g. shared vGIC
can be trapped into the HYP mode by populating
appropriate entries in the 2nd stage translation tables.
This is how microarchitectural differences between the
two processor sets are resolved.
Whenever a trap into HYP mode is taken, the HSR (Hyp
Syndrome Register) contains enough information about the
type of trap taken for the software to take appropriate
action.
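A minimal sketch of how a handler might pick apart the HSR is
shown below. The field positions (exception class in bits
[31:26], instruction length in bit 25, syndrome in bits
[24:0]) and the three exception class values are taken from
the ARMv7 Virtualization Extensions. The enum and function
names are hypothetical and do not come from the Virtualizer
sources.

```c
#include <assert.h>
#include <stdint.h>

#define HSR_EC_SHIFT 26
#define HSR_EC_MASK  0x3fu
#define HSR_IL_BIT   (1u << 25)
#define HSR_ISS_MASK 0x01ffffffu

/* A few architecturally defined exception classes. */
#define HSR_EC_CP15_32  0x03u /* trapped MCR/MRC access to CP15 */
#define HSR_EC_HVC      0x12u /* HVC instruction executed       */
#define HSR_EC_DABT_LOW 0x24u /* data abort taken to HYP mode   */

/* Hypothetical classification used only in this sketch. */
enum trap_kind { TRAP_CP15, TRAP_HVC, TRAP_MEM, TRAP_OTHER };

unsigned hsr_ec(uint32_t hsr)
{
    return (hsr >> HSR_EC_SHIFT) & HSR_EC_MASK;
}

uint32_t hsr_iss(uint32_t hsr)
{
    return hsr & HSR_ISS_MASK;
}

/* Non-zero when the trapped instruction was 32 bits long. */
int hsr_il(uint32_t hsr)
{
    return (hsr & HSR_IL_BIT) != 0;
}

enum trap_kind classify_trap(uint32_t hsr)
{
    switch (hsr_ec(hsr)) {
    case HSR_EC_CP15_32:  return TRAP_CP15; /* coprocessor trap */
    case HSR_EC_HVC:      return TRAP_HVC;  /* hypervisor call  */
    case HSR_EC_DABT_LOW: return TRAP_MEM;  /* 2nd stage fault  */
    default:              return TRAP_OTHER;
    }
}

/* For an HVC trap, ISS[15:0] holds the instruction's imm16. */
uint16_t hsr_hvc_imm16(uint32_t hsr)
{
    assert(hsr_ec(hsr) == HSR_EC_HVC);
    return (uint16_t)(hsr_iss(hsr) & 0xffffu);
}
```

A trapped coprocessor access and a trapped access to a memory
mapped peripheral arrive with different exception classes,
which is what lets one HYP mode entry point hand each trap to
the right handler.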
The Virtualizer design centres around the traps
recognized by the HSR. Also, to deal with
microarchitectural differences the concept of a HOST
cluster is introduced. It is possible for each
cpu to find out the system topology using the Kingfisher
System Control Block. Once it knows the host cluster id
& whether the software is expected to switch execution
or run in the MP mode (provided at compile time), the
CPU can configure itself accordingly.
The processor cluster on which the payload software has
been built to run [assumed to be Cortex-A15 for this
release] is termed the TARGET, while the cluster on
which the differences are expected to crop up is called
the HOST (assumed to be Cortex-A7 for this release).
The HOST environment variable is used to specify
the host cluster. The target cluster is assumed to be
the logical complement of the host i.e. cluster ids can
only take the values of 0 and 1.
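Because cluster ids are restricted to 0 and 1, the target id
follows directly from the host id. The snippet below is a
minimal sketch of that relationship. The function names are
hypothetical and are not taken from the Virtualizer sources.

```c
#include <assert.h>

/*
 * The target cluster is the logical complement of the host
 * cluster. This only holds because ids are limited to 0 and 1.
 */
unsigned target_cluster_id(unsigned host_cluster_id)
{
    assert(host_cluster_id <= 1u);
    return host_cluster_id ^ 1u;
}

/* Non-zero if the given cluster is the HOST cluster. */
int is_host_cluster(unsigned cluster_id, unsigned host_cluster_id)
{
    assert(cluster_id <= 1u && host_cluster_id <= 1u);
    return cluster_id == host_cluster_id;
}
```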
The HOST processor emulates the TARGET processor by
trapping the accesses to differing processor features
into the HYP mode. Most of the microarchitectural
differences & registers that need to be virtualized are
handled in a generic (CPU Independent) layer of
code. Additionally, each processor exports functions to
set up, handle & optionally save/restore context of each
trap that the HSR recognises. These handlers are invoked
whenever the software runs on that processor.
1. virt_setup.c
1. Generic function that initialises the required
traps. This is done once each on both the host
and target clusters if the trap handler needs
to obtain some information about the target
cluster to be able to work correctly e.g. the
Cortex-A7 processor cluster needs to find out
the cache geometry of the Cortex-A15
processor cluster to be able to handle cache
maintenance operations by set/way correctly.
This function further calls any setup function
that has been exported by the processor the code
is executing on.
2. virt_handle.c
1. Generic function that extends the hvc_entry()
routine to C Code. It calls the generic trap
handler (if registered) and then any trap
handlers exported by the processor on
which the trap has been invoked.
2. It invokes a synchronous cluster switch if the
appropriate 'HVC' instruction is issued by the
payload software. Please refer to
"docs/09-HVC-calling-conventions.pdf" for details
of this 'HVC' API.
3. It returns the value of the physical MPIDR
register if the appropriate 'HVC' instruction
is issued by the payload software. Please refer
to "docs/09-HVC-calling-conventions.pdf" for
details of this 'HVC' API.
3. virt_context.c
1. Generic function that saves and restores traps
on the host cluster & then calls any
save/restore function that has been exported by
the processor the code is executing on.
4. cache_geom.c
1. Generic function that detects cache geometries
on the host and target clusters & then maps
cache maintenance operations by set/way from the
target to the host cache.
5. mem_trap.c
1. Generic function that sets up any memory traps
by editing the 2nd stage translation tables.
6. vgic_trap_handler.c
1. Generic function that handles trapped accesses
to the shared vGIC.
7. kfscb_trap_handler.c
1. Generic function that handles trapped accesses
to the Kingfisher System Control Block. This is
usually done to start a cpu hotplug operation.
8. pmu_trap_handler.c
1. Generic function that enables the use of PMU
with the Virtualizer software as per the design
detailed in:
'docs/10-ARM-Virtualizer-support-for-debug-and-the-PMU.pdf'
7. include/
Header files specific to the Virtualisor code.
8. cpus/
Allows implementation of trap handling specific to the
Cortex-A7 or A15 processors.
1. a15/a15.c
2. a7/a7.c
1. The differences between the I-cache topologies of
the Cortex-A7 & A15 processors cannot be handled
within the existing abstraction of HOST & TARGET
clusters. These differences are treated as cpu-
specific and handled within these two files.
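The cache geometry that cache_geom.c has to detect on each
cluster is reported by the ARMv7 CCSIDR register, and a
set/way maintenance operation takes an operand whose layout
depends on that geometry. The sketch below shows only these
two building blocks. It is not the mapping algorithm used by
the Virtualizer, and the structure and function names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for one cache level's geometry. */
struct cache_geom {
    uint32_t line_bytes; /* bytes per cache line */
    uint32_t ways;       /* associativity        */
    uint32_t sets;       /* number of sets       */
};

/* Smallest n such that 2^n >= x (x >= 1). */
static uint32_t ceil_log2(uint32_t x)
{
    uint32_t n = 0;

    while ((1u << n) < x)
        n++;
    return n;
}

/*
 * Decode an ARMv7 CCSIDR value:
 *   bits [2:0]   LineSize      = log2(words per line) - 2
 *   bits [12:3]  Associativity = ways - 1
 *   bits [27:13] NumSets       = sets - 1
 */
struct cache_geom decode_ccsidr(uint32_t ccsidr)
{
    struct cache_geom g;

    g.line_bytes = 1u << ((ccsidr & 0x7u) + 4u);
    g.ways = ((ccsidr >> 3) & 0x3ffu) + 1u;
    g.sets = ((ccsidr >> 13) & 0x7fffu) + 1u;
    return g;
}

/*
 * Build the operand of a set/way operation (e.g. DCCISW).
 * 'level' is the cache level minus one. The way index sits in
 * the topmost bits and the set index starts at bit
 * log2(line_bytes), so the same (set, way) pair encodes
 * differently on caches with different geometries.
 */
uint32_t setway_operand(const struct cache_geom *g, uint32_t level,
                        uint32_t set, uint32_t way)
{
    uint32_t way_bits = ceil_log2(g->ways);
    uint32_t set_shift = ceil_log2(g->line_bytes);
    uint32_t op = (set << set_shift) | (level << 1);

    assert(set < g->sets && way < g->ways && level < 7u);
    if (way_bits != 0u)
        op |= way << (32u - way_bits);
    return op;
}
```

As an illustration with a synthetic register value, 0x001FE00A
decodes to 64-byte lines, 2 ways and 256 sets, i.e. a 32KB
cache. A 4-way cache places the way index in two bits instead
of one, which is why an operand built for one geometry cannot
be reused unchanged on the other.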
9. big-little/secure_world
Since both Cortex-A7 & Cortex-A15 processors support ARM
TrustZone Security Extensions, there is certain context
that needs to be setup, saved & restored in the Secure
world.
This context allows access to certain coprocessor and
peripheral registers to the Non-secure world. It also
configures the shared vGIC for use by the Non-secure
world.
Execution shifts to the Secure world through the SMC
instruction, which is a part of the ARMv7 ISA.
1. monmode_vectors.s
1. Implements the monitor mode vector table. It
contains the secure entry point [do_smc()] for
the SMC instruction along with rudimentary
support for other fault exceptions taken while
executing in the secure world.
2. Three types of SMC exceptions are expected (type
of exception is contained in r0):
1. SMC_SEC_INIT
Called once after a power on reset to
initialise the Secure world stacks,
coherency, pagetables, to configure some
coprocessor and memory mapped peripheral
(Coherent interconnect & shared vGIC)
registers for use of these features by
the Non-secure world.
2. SMC_SEC_SAVE
Called from ns_context.c to request the
secure world to save its context and bring
the corresponding core in the inbound
cluster out of reset so that it can start
restoring the saved state.
3. SMC_SEC_SHUTDOWN
Called from handle_switchover.s to request
the secure world to flush the L1 and L2 caches
and power down the outbound cluster.
Also implemented is a function to handle warm
resets on the inbound cluster. Bare-minimal
context is initialised while the rest is restored
before control is passed to the Non-secure world
handler for restoring context [restore_context()]
in ns_context.c.
2. secure_context.c
Implements code to save and restore the secure world
context.
3. secure_resets.c
Implements code to power down the outbound cluster
and bring individual cores in the inbound cluster
out of reset.
4. ve_reset_handler.s
Base of physical memory in the Versatile Express
memory map is at 0x80000000. The processors are
brought out of reset at 0x0 which points to Secure
RAM/Flash memory. This file implements a small stub
function that is placed at 0x0 so that execution
jumps to 0x80000000 after a cold reset and to the
warm_reset() handler in monmode_vectors.s
after a warm reset.
The secure world code is built into a separate ELF image
to maintain its distinction from the Virtualizer code
that executes in the Non-secure world.
10. big-little/bl.scf.template
1. Scatter file that is used to build the Non-secure
world code in the Virtualizer software. The
resultant image is bl.axf.
11. big-little/bl-sec.scf.template
1. Scatter file that is used to build the Secure world
code in the Virtualizer software. The resultant
image is bl_sec.axf.
12. acsr/
1. helpers.s
Helper functions to access the CP15 coprocessor
space.
2. v7.s
Contains routines to save and restore ARM processor
context.
3. v7_c.c
Contains routines to save and restore a processor's
debug subsystem state. State is saved through the
cp14 interface for v7.1 of the debug subsystem &
through the memory mapped interface for v7.0.
4. debug_helpers.s
5. debug_helpers.h
Helper functions to save and restore the ARM Debug
subsystem context.