docs/02-Code-layout.txt


Code layout
===========

A   Introduction

    The software contained in the 'bootwrapper' directory allows
    the execution of a software payload, e.g. a Linux stack, to
    alternate between two multi-core clusters of ARM Cortex-A15
    & Cortex-A7 processors connected by a coherent
    interconnect. To achieve this aim it provides the ability
    to:

    1.  Save the processor context on one cluster (henceforth
        called the outbound cluster) and restore it on the other
        cluster (henceforth called the inbound cluster).

    2.  Hide any software visible microarchitectural differences
        between the Cortex-A15 & Cortex-A7 processors.

    3.  Use the ARM Virtualization Extensions to perform 1. and 2.
        in a payload software agnostic manner.

    This software is intended to be executed on the Real-Time
    System Model v7.0.1 (RTSM_VE_Cortex_A15x1_A7x1 and
    RTSM_VE_Cortex_A15x4_A7x4).

    In addition to switching the payload software execution
    between the two clusters, the software also contains support
    for executing the payload software simultaneously on the two
    clusters.

    This is called the MP configuration. In its current state,
    it mainly involves making the payload software believe that
    the A15 cluster includes the cpus present on the Cortex-A7
    cluster, i.e. there is one cluster with more cpus than there
    physically are. [Note that MP support is highly experimental
    and unstable. It is NOT the focus of this release and is
    intended for purely informational purposes. The cluster
    switching mode of operation remains the focus of this
    release.]

    The Virtualizer software needs initialization prior to being
    used to perform any of the above functions. The
    initialization needs to be done before the payload software
    is executed. Hence, it makes sense to do this from the
    existing boot firmware being used on the platform. The code
    in the 'bootwrapper' directory is a bare-minimal bootloader
    that:

    1.  Sets up the environment for execution of the payload
        software in the Non-secure world by programming the
        appropriate coprocessor and memory mapped peripheral
        registers from the Secure world.

    2.  Invokes the entry point of the Virtualizer software
        (bl_setup()) which does the necessary initialization.

    3.  Passes control to the payload software in the Non-secure
        world.

B   Code layout overview

    1.  bootwrapper/

        Apart from containing the bootloader, this directory
        also contains scatter files to load the bootloader,
        Virtualizer and the payload software correctly on the
        target platform as a single ELF file (img.axf).

        The important files here are:

        1.  vectors.S

            1.  Implements the Secure world exception vectors
                which are loaded to the base of physical memory
                (0x80000000) at reset.

        2.  boot.S

            1.  Handles a power-on reset.

            2.  Initialises the I-Cache, sets up the stack &
                passes control to the C handler for performing
                the rest of the initialization.

        3.  c_start.c

            1.  Picks up from where the start() routine left off
                in the previous file.

            2.  Programs the exception vector tables for the
                Secure world.

            3.  Provides Non-secure access to certain
                coprocessor registers and memory mapped
                peripherals e.g. access to the cache
                coherent interconnect registers, coprocessors
                etc.

            4.  Enables functionality which can be initialised
                only in the Secure world, e.g. configuration of
                interrupts as Non-secure.

            5.  Synchronises execution with the secondary cpus
                (if present) so that any global peripheral is
                accessed by them only after the primary has
                initialised it.

            6.  Enters the non-secure HYP mode and initialises
                the Virtualizer.

            7.  Enters the non-secure SVC mode and jumps to the
                payload software entry point.

        4.  payload/

            1.  Contains two files 'fsimg' and 'kernel'.

            2.  The 'kernel' is a raw Linux kernel binary image.
                The instructions to build this Linux image can
                be found in docs/03-Linux-kernel-build.txt.
                This image can be replaced with a raw binary
                image of any other software payload which is
                desired to be run on this system.

            3.  The 'fsimg' is an empty filesystem stub. If
                desired, it can be replaced with a suitable
                filesystem image in a Linux initramfs format. A
                custom busybox filesystem was used for testing.
                More complex filesystems may be used if needed
                but will require the use of MMC emulation with
                the ARM FastModels.
                See docs/06-Optional-rootfs-build.txt for
                details.

        5.  boot.map.template

            1.  Scatter file which combines the payload
                software, Virtualizer and the bootloader into a
                single ELF file (img.axf) which can
                then be loaded on the relevant platform.

        6.  makemap

            1.  Simple perl script that takes an ELF image of
                the Virtualizer, parses through the relevant
                sections & adds those sections to the scatter
                file so that a consolidated image can be
                created.

    2.  big-little/common

        This directory mainly deals with setting up of the HYP
        processor mode and the Virtual GIC. This allows the
        payload software to run unmodified while either the
        Switching or the MP mode is active in the background.

        The important files here are:

        1.  hyp_vectors.s

            1.  Implements the HYP mode vector table.

            2.  It contains the entry point "bl_setup()" which
                is invoked by the bootwrapper to initialise the
                Virtualizer software.

            3.  The exception vector for interrupts
                [irq_entry()] is the entry point for all
                physical interrupts. The exception vector for
                hypervisor traps [hvc_entry()] is the entry
                point for all accesses made by the payload
                software that need to be handled in the HYP
                mode.

            4.  Also contained is rudimentary support for fault
                exception handlers [dabt_entry(), iabt_entry() &
                undef_entry()].

        2.  hyp_setup.c

            1.  Extends the initialization of the Virtualizer
                software into C code after a cold reset.

            2.  If switching is being done asynchronously then
                the HYP timer interrupt is set up to periodically
                (~12 million instructions) trigger a switchover
                to the other cluster.

            3.  If in MP mode, then CCI snoops are enabled for
                both the clusters.

        3.  vgic_handle.c

            1.  Extends handling of physical interrupts into C
                code from irq_entry(). Interrupts are
                acknowledged (optionally EOI'ed) and queued as
                virtual interrupts. The HYP timer interrupt is
                handled differently. When received, it is used
                as a trigger to initiate the switchover process.

        4.  vgiclib.c

            1.  Implements handling of virtual interrupts once
                they have been queued up in the vGIC HYP view
                list registers. It maintains the list registers
                and also saves and restores the context of the
                vGIC HYP view interface.

        5.  pagetable_setup.c

            1.  Creates and sets up the HYP mode and 2nd stage
                translation page tables. Accesses by the payload
                software to the vGIC physical cpu interface are
                mapped to the vGIC virtual cpu interface using
                the 2nd stage translation page tables.

            2.  In the MP configuration, the translation tables
                are shared by all the cpus in the two clusters.
                Hence the first cpu in only one of the clusters
                creates them.

        6.  vgic_setup.c

            1.  Enables virtual interrupts and exceptions,
                initialises the physical cpu interface and the
                HYP view interface.
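
        The interrupt path described for vgic_handle.c above can
        be pictured with the minimal sketch below. It is not the
        code from that file. The names vgic_state_t and
        handle_physical_irq(), the queue depth of 4 and the use
        of interrupt id 26 for the HYP timer are assumptions made
        for illustration, and the array merely stands in for the
        vGIC list registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define HYP_TIMER_IRQ 26u /* assumption: interrupt id of the HYP timer */
#define MAX_QUEUED    4u  /* assumption: number of list registers */

/* Hypothetical software model of the per-cpu interrupt state. */
typedef struct {
    uint32_t queued[MAX_QUEUED]; /* stand-in for the list registers */
    unsigned count;              /* number of queued virtual interrupts */
    bool     switch_requested;   /* set when the HYP timer fires */
} vgic_state_t;

/*
 * Handle one acknowledged physical interrupt.
 * The HYP timer is never queued; it only requests a switchover.
 * Every other interrupt is queued as a virtual interrupt.
 * Returns false if there is no free slot left.
 */
bool handle_physical_irq(vgic_state_t *s, uint32_t irq_id)
{
    assert(s != NULL);

    if (irq_id == HYP_TIMER_IRQ) {
        s->switch_requested = true;
        return true;
    }

    if (s->count == MAX_QUEUED)
        return false;

    s->queued[s->count++] = irq_id;
    return true;
}
```

        A real handler would also have to deal with overflow of
        the list registers, which this sketch only reports to
        the caller.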

    3.  big-little/lib

        This directory implements common functionality that is
        used across all the Virtualizer code. This includes:

        1.  Locks which can be used with Strongly Ordered and
            Device memory.

        2.  Code tracing support on the Fast Models platform
            through the use of memory mapped TUBE registers &
            the Generic Trace plugin.
            Details of this feature can be found in
            docs/04-Cache-hit-rate-howto.txt.

        3.  Events to synchronise the switching process between
            the clusters and within the clusters. They are also
            used to synchronise the setup phase after a cold
            reset in the MP configuration.

        4.  UART routines to support semihosting of the printf
            family of functions.

        5.  Cache maintenance, Stack manipulation and Locking
            routines.

        6.  Use of IPIs for HYP mode communication.
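
        Locks that must work with Strongly Ordered and Device
        memory cannot rely on exclusive load/store instructions,
        so they need an algorithm built from plain reads and
        writes. One classic algorithm of that kind is Lamport's
        bakery lock, sketched below with C11 atomics. Whether
        the library uses this particular algorithm is an
        assumption; the type bakery_lock_t, the function names
        and MAX_CPUS are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_CPUS 8u /* assumption: upper bound on contending cpus */

/* Hypothetical lock layout: one flag and one ticket per cpu. */
typedef struct {
    atomic_bool choosing[MAX_CPUS];
    atomic_uint ticket[MAX_CPUS]; /* 0 means "not waiting" */
} bakery_lock_t;

void bakery_init(bakery_lock_t *l)
{
    for (unsigned i = 0; i < MAX_CPUS; i++) {
        atomic_init(&l->choosing[i], false);
        atomic_init(&l->ticket[i], 0u);
    }
}

void bakery_acquire(bakery_lock_t *l, unsigned me)
{
    assert(me < MAX_CPUS);

    /* Take a ticket one larger than any ticket currently held. */
    atomic_store(&l->choosing[me], true);
    unsigned max = 0;
    for (unsigned i = 0; i < MAX_CPUS; i++) {
        unsigned t = atomic_load(&l->ticket[i]);
        if (t > max)
            max = t;
    }
    unsigned mine = max + 1;
    atomic_store(&l->ticket[me], mine);
    atomic_store(&l->choosing[me], false);

    /* Wait for every cpu with a smaller ticket (ties: lower cpu id). */
    for (unsigned other = 0; other < MAX_CPUS; other++) {
        if (other == me)
            continue;
        while (atomic_load(&l->choosing[other]))
            ; /* other cpu is still picking its ticket */
        for (;;) {
            unsigned t = atomic_load(&l->ticket[other]);
            if (t == 0 || t > mine || (t == mine && other > me))
                break;
        }
    }
}

void bakery_release(bakery_lock_t *l, unsigned me)
{
    assert(me < MAX_CPUS);
    atomic_store(&l->ticket[me], 0u);
}
```

        On real hardware the loads and stores would additionally
        need the barriers appropriate to the memory type in use.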

    4.  big-little/include

        1.  This directory contains the headers specific to HYP
            mode setup, Switching process and common helper
            routines. Most importantly, context.h contains the
            data structures which are used to save and restore
            the processor context.

    5.  big-little/switcher

        This directory implements code to save and restore
        processor context and to initiate/handle an
        asynchronous/synchronous switchover request.

        1.  context/

            1.  ns_context.c

                1.  Contains top level routines to save and
                    restore the Non-secure world context.

                2.  If the type of operation is a cluster
                    switch it requests the secure world to save
                    its own context and bring the inbound
                    cluster out of reset. It also uses events to
                    synchronise the switching process between
                    the inbound and outbound clusters.

                3.  If the type of operation is a cpu hotplug
                    it requests the secure world to save
                    its own context and then saves only the
                    relevant HYP mode context before placing the
                    cpu in reset.

            2.  gic.c

                1.  Contains routines to save and restore the
                    context of the vGIC physical distributor and
                    cpu interfaces.

            3.  sh_vgic.c

                1.  The two clusters share the interrupt
                    controller instead of each cluster having
                    its own. A consequence of this is that there
                    is no longer a 1 to 1 mapping between cpu
                    ids and cpu interface ids e.g. on an
                    MPx1+MPx1 cluster configuration,
                    cpu0 of the Cortex-A7 cluster would
                    correspond to cpuinterface1 on the shared
                    vGIC. This in turn affects routing of
                    peripheral and software generated
                    interrupts. This file implements code to
                    allow use of the shared vGIC correctly
                    keeping this limitation in mind.
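
        The cpu id to cpu interface id mapping described for
        sh_vgic.c can be illustrated as follows. This is a sketch
        under stated assumptions, not the code from that file:
        it assumes that cluster 0 is the Cortex-A15 cluster and
        that its cpus own the lowest cpu interface ids, which
        matches the MPx1+MPx1 example above. The function names
        are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Map (cluster id, cpu id) to a cpu interface id on the shared vGIC.
 * Assumption: cluster 0 uses interfaces 0..cpus_in_cluster0-1 and
 * cluster 1 uses the interfaces that follow.
 */
unsigned cpu_interface_id(unsigned cluster_id, unsigned cpu_id,
                          unsigned cpus_in_cluster0)
{
    assert(cluster_id <= 1u);
    return cluster_id == 0u ? cpu_id : cpus_in_cluster0 + cpu_id;
}

/*
 * Translate an 8-bit target mask expressed in cpu ids of one cluster
 * into a mask expressed in cpu interface ids of the shared vGIC.
 */
uint8_t remap_cpu_targets(uint8_t targets, unsigned cluster_id,
                          unsigned cpus_in_cluster0)
{
    uint8_t out = 0;

    for (unsigned cpu = 0; cpu < 8u; cpu++) {
        if (targets & (1u << cpu)) {
            unsigned intf = cpu_interface_id(cluster_id, cpu,
                                             cpus_in_cluster0);
            assert(intf < 8u); /* a GIC has at most 8 cpu interfaces */
            out |= (uint8_t)(1u << intf);
        }
    }
    return out;
}
```

        With one cpu per cluster, cpu_interface_id(1, 0, 1)
        yields 1, i.e. cpu0 of the second cluster maps to
        cpu interface 1.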

        2.  trigger/

            1.  async_switchover.c

                1.  Contains code to use the HYP timer interrupt
                    as a trigger to initiate a switchover
                    asynchronously.

                2.  Exports a function [signal_switchover()] which
                    can be used to trigger an
                    asynchronous/synchronous switch.

                3.  Implements logic to ensure that only the cpus
                    which have not been hot-plugged are switched
                    to the inbound cluster.

            2.  handle_switchover.s

                1.  Contains code to start saving the non-secure
                    world context and request the secure world to
                    power down the outbound cluster once the
                    inbound cluster is up and running.

    6.  big-little/virtualisor

        This directory implements code that, using the ARM
        Virtualization Extensions:

        1.  Hides any microarchitectural differences between the
            Cortex-A15 & Cortex-A7 processors visible to the
            payload software.

        2.  Provides a different view of the underlying hardware
            than what really exists e.g. in the switching mode
            it traps accesses made by the host cluster
            (Cortex-A7 cluster currently) to the shared vGIC
            physical distributor interface, so that routing of
            interrupts can take place correctly. In the MP mode,
            the L2 control and MPIDR registers are virtualized
            to tell the payload software that there is one
            cluster with multiple processors instead of two.
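
        As an illustration of the MPIDR virtualization mentioned
        for the MP mode, the sketch below folds the cpus of the
        second cluster into the first one. It relies on the
        architectural MPIDR layout (cpu id in bits [7:0], cluster
        id in bits [15:8]). The function name virtual_mpidr() and
        the numbering policy (second cluster's cpus are appended
        after the first cluster's) are assumptions, not taken
        from the sources.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Build the MPIDR value presented to the payload in MP mode.
 * Assumption: cpus of cluster 1 are renumbered to follow the cpus of
 * cluster 0, and the cluster id field always reads as 0.
 */
uint32_t virtual_mpidr(uint32_t phys_mpidr, unsigned cpus_in_cluster0)
{
    uint32_t cpu     = phys_mpidr & 0xffu;
    uint32_t cluster = (phys_mpidr >> 8) & 0xffu;

    assert(cluster <= 1u);

    if (cluster == 1u)
        cpu += cpus_in_cluster0;
    assert(cpu <= 0xffu);

    /* Keep the upper bits, clear the cluster id, insert the new cpu id. */
    return (phys_mpidr & ~0xffffu) | cpu;
}
```

        For example, with four cpus in the first cluster, cpu1
        of the second cluster would be presented as cpu5 of
        cluster 0.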

        The ARM Virtualization extensions provide a set of trap
        registers (HCPTR (Hyp Coprocessor Trap Register), HSTR
        (Hyp System Trap Register), HDCR (Hyp Debug
        Configuration Register)) to be able to select what
        accesses made by the payload software to the coprocessor
        block will be trapped in the HYP mode.

        Accesses to memory mapped peripherals e.g. shared vGIC
        can be trapped into the HYP mode by populating
        appropriate entries in the 2nd stage translation tables.
        This is how microarchitectural differences between the
        two processor sets are resolved.

        Whenever a trap into HYP mode is taken, the HSR (Hyp
        Syndrome Register) contains enough information about the
        type of trap taken for the software to take appropriate
        action.
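
        A minimal sketch of how a handler might classify a trap
        from the HSR is shown below. The field positions (EC in
        bits [31:26], ISS in bits [24:0]) and the exception class
        values are those defined by the ARMv7 Virtualization
        Extensions; the names trap_kind_t, classify_trap() and
        hvc_immediate() are hypothetical and do not come from the
        Virtualizer sources.

```c
#include <assert.h>
#include <stdint.h>

#define HSR_EC(hsr)  (((hsr) >> 26) & 0x3fu) /* exception class */
#define HSR_ISS(hsr) ((hsr) & 0x1ffffffu)    /* syndrome bits   */

/* Hypothetical classification used only for this illustration. */
typedef enum {
    TRAP_CP15_32, /* trapped MCR/MRC access to CP15   */
    TRAP_CP15_64, /* trapped MCRR/MRRC access to CP15 */
    TRAP_HVC,     /* HVC instruction executed         */
    TRAP_SMC,     /* trapped SMC instruction          */
    TRAP_IABT,    /* prefetch abort routed to HYP     */
    TRAP_DABT,    /* data abort routed to HYP         */
    TRAP_OTHER
} trap_kind_t;

trap_kind_t classify_trap(uint32_t hsr)
{
    switch (HSR_EC(hsr)) {
    case 0x03: return TRAP_CP15_32;
    case 0x04: return TRAP_CP15_64;
    case 0x12: return TRAP_HVC;
    case 0x13: return TRAP_SMC;
    case 0x20: return TRAP_IABT;
    case 0x24: return TRAP_DABT;
    default:   return TRAP_OTHER;
    }
}

/* For an HVC trap the 16-bit immediate is held in ISS[15:0]. */
uint16_t hvc_immediate(uint32_t hsr)
{
    assert(HSR_EC(hsr) == 0x12u);
    return (uint16_t)(HSR_ISS(hsr) & 0xffffu);
}
```

        Data aborts taken on accesses to trapped memory mapped
        peripherals would fall under TRAP_DABT in this scheme.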

        The Virtualizer design centres around the traps
        recognized by the HSR. Also, to deal with
        microarchitectural differences the concept of a HOST
        cluster is introduced. It is possible for each
        cpu to find out the system topology using the Kingfisher
        System Control Block. Once it knows the host cluster id
        & whether the software is expected to switch execution
        or run in the MP mode (provided at compile time), the
        CPU can configure itself accordingly.

        The processor cluster for which the payload software has
        been built [assumed to be Cortex-A15 for this release]
        is termed the TARGET, while the cluster on which the
        differences are expected to crop up is called the HOST
        (assumed to be Cortex-A7 for this release). The HOST
        environment variable is used to specify the host
        cluster. The target cluster is assumed to be the
        logical complement of the host i.e. cluster ids can
        only take the values of 0 and 1.

        The HOST processor emulates the TARGET processor by
        trapping the accesses to differing processor features
        into the HYP mode. Most of the microarchitectural
        differences & registers that need to be virtualized are
        handled in a generic (CPU independent) layer of
        code. Additionally, each processor exports functions to
        set up, handle & optionally save/restore context of each
        trap that the HSR recognises. These handlers are invoked
        whenever the software runs on that processor.
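
        The split between the generic layer and the functions
        exported by each processor can be pictured as a table of
        function pointers. The sketch below is an assumption
        about the general shape only: trap_ops_t,
        dispatch_trap() and the example tables are hypothetical
        names, and the handlers are stubs that merely report
        that they ran.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical set of functions a layer may export; NULL if unused. */
typedef struct {
    int (*setup)(void);
    int (*handle)(uint32_t hsr);
    int (*save)(void);
    int (*restore)(void);
} trap_ops_t;

/* Stub handlers: each returns 1 to signal that it was invoked. */
static int generic_handle(uint32_t hsr) { (void)hsr; return 1; }
static int a7_handle(uint32_t hsr)      { (void)hsr; return 1; }

/* Example tables: the A15 entry exports no trap handler of its own. */
const trap_ops_t generic_ops = { NULL, generic_handle, NULL, NULL };
const trap_ops_t a7_ops      = { NULL, a7_handle,      NULL, NULL };
const trap_ops_t a15_ops     = { NULL, NULL,           NULL, NULL };

/*
 * Call the generic handler first (if registered), then the handler
 * exported by the processor the trap was taken on.
 * Returns the number of handlers that ran.
 */
int dispatch_trap(const trap_ops_t *generic, const trap_ops_t *cpu,
                  uint32_t hsr)
{
    assert(generic != NULL && cpu != NULL);

    int handled = 0;
    if (generic->handle != NULL)
        handled += generic->handle(hsr);
    if (cpu->handle != NULL)
        handled += cpu->handle(hsr);
    return handled;
}
```

        The same pattern would apply to the setup and
        save/restore entries of the table.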

        1.  virt_setup.c

            1.  Generic function that initialises the required
                traps. This is done once each on both the host
                and target clusters if the trap handler needs
                to obtain some information about the target
                cluster to be able to work correctly, e.g. the
                Cortex-A7 processor cluster needs to find out
                the cache geometry of the Cortex-A15
                processor cluster to be able to handle cache
                maintenance operations by set/way correctly.
                This function further calls any setup function
                that has been exported by the processor the code
                is executing on.

        2.  virt_handle.c

            1.  Generic function that extends the hvc_entry()
                routine to C code. It calls the generic trap
                handler (if registered) and then any trap
                handlers exported by the processor on
                which the trap has been invoked.

            2.  It invokes a synchronous cluster switch if the
                appropriate 'HVC' instruction is issued by the
                payload software. Please refer to
                "docs/09-HVC-calling-conventions.pdf" for details
                of this 'HVC' API.

            3.  It returns the value of the physical MPIDR
                register if the appropriate 'HVC' instruction
                is issued by the payload software. Please refer
                to "docs/09-HVC-calling-conventions.pdf" for
                details of this 'HVC' API.

        3.  virt_context.c

            1.  Generic function that saves and restores traps
                on the host cluster & then calls any
                save/restore function that has been exported by
                the processor the code is executing on.

        4.  cache_geom.c

            1.  Generic function that detects cache geometries
                on the host and target clusters & then maps
                cache maintenance operations by set/way from the
                target to the host cache.

        5.  mem_trap.c

            1.  Generic function that sets up any memory traps
                by editing the 2nd stage translation tables.

        6.  vgic_trap_handler.c

            1.  Generic function that handles trapped accesses
                to the shared vGIC.

        7.  kfscb_trap_handler.c

            1.  Generic function that handles trapped accesses
                to the Kingfisher System Control Block. This is
                usually done to start a cpu hotplug operation.

        8.  pmu_trap_handler.c

            1.  Generic function that enables the use of PMU
                with the Virtualizer software as per the design
                detailed in:

                'docs/10-ARM-Virtualizer-support-for-debug-and-the-PMU.pdf'
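
        To make the cache geometry handling described for
        cache_geom.c more concrete, the sketch below decodes an
        ARMv7 CCSIDR value into line size, ways and sets, and
        builds the operand of a set/way cache maintenance
        operation for a given geometry. The register field
        layout is architectural; cache_geom_t, decode_ccsidr()
        and setway_operand() are hypothetical names. The policy
        used to map target operations onto the host cache is
        not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one cache level. */
typedef struct {
    unsigned line_bytes;
    unsigned ways;
    unsigned sets;
} cache_geom_t;

/* Decode the geometry fields of an ARMv7 CCSIDR value. */
cache_geom_t decode_ccsidr(uint32_t ccsidr)
{
    cache_geom_t g;

    /* LineSize holds log2(words per line) - 2; a word is 4 bytes. */
    g.line_bytes = 1u << ((ccsidr & 0x7u) + 4u);
    g.ways       = ((ccsidr >> 3) & 0x3ffu) + 1u;
    g.sets       = ((ccsidr >> 13) & 0x7fffu) + 1u;
    return g;
}

/* Smallest n such that 2^n >= v. */
static unsigned log2_ceil(unsigned v)
{
    unsigned n = 0;
    while ((1u << n) < v)
        n++;
    return n;
}

/*
 * Build the operand of a set/way operation: way in the top bits,
 * set above the line offset bits, (level - 1) in bits [3:1].
 */
uint32_t setway_operand(const cache_geom_t *g, unsigned level,
                        unsigned set, unsigned way)
{
    assert(g != NULL);
    assert(level >= 1u && level <= 7u);
    assert(set < g->sets && way < g->ways);

    unsigned l = log2_ceil(g->line_bytes);
    unsigned a = log2_ceil(g->ways);

    uint32_t op = ((level - 1u) << 1) | (set << l);
    if (a > 0u) /* a direct-mapped cache has no way field */
        op |= way << (32u - a);
    return op;
}
```

        Because the way and set fields depend on the geometry,
        an operand built by the payload for the target cache
        cannot simply be replayed on a host cache with a
        different geometry.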

    7.  include/

        Header files specific to the Virtualisor code.

    8.  cpus/

        Allows implementation of trap handling specific to the
        Cortex-A7 or A15 processors.

        1.  a15/a15.c
        2.  a7/a7.c

            1.  The differences between the I-cache topologies of
                the Cortex-A7 & A15 processors cannot be handled
                within the existing abstraction of HOST & TARGET
                clusters. These differences are treated as cpu-
                specific and handled within these two files.

    9.  big-little/secure_world

        Since both Cortex-A7 & Cortex-A15 processors support ARM
        TrustZone Security Extensions, there is certain context
        that needs to be setup, saved & restored in the Secure
        world.

        This context allows access to certain coprocessor and
        peripheral registers to the Non-secure world. It also
        configures the shared vGIC for use by the Non-secure
        world.

        Execution shifts to the Secure world through the SMC
        instruction which is a part of the ARMv7 ISA.

        1.  monmode_vectors.s

            1.  Implements the monitor mode vector table. It
                contains the secure entry point [do_smc()] for
                the SMC instruction along with rudimentary
                support for other fault exceptions taken while
                executing in the secure world.

            2.  Three types of SMC exceptions are expected (type
                of exception is contained in r0):

                1.  SMC_SEC_INIT

                    Called once after a power on reset to
                    initialise the Secure world stacks,
                    coherency, pagetables, to configure some
                    coprocessor and memory mapped peripheral
                    (Coherent interconnect & shared vGIC)
                    registers for use of these features by
                    the Non-secure world.

                2.  SMC_SEC_SAVE

                    Called from ns_context.c to request the
                    secure world to save its context and bring
                    the corresponding core in the inbound
                    cluster out of reset so that it can start
                    restoring the saved state.

                3.  SMC_SEC_SHUTDOWN

                    Called from handle_switchover.s to request
                    the secure world to flush the L1 and L2 caches
                    and power down the outbound cluster.

                Also implemented is a function to handle warm
                resets on the inbound cluster. A bare-minimal
                context is initialised while the rest is restored
                before control is passed to the Non-secure world
                handler for restoring context [restore_context()]
                in ns_context.c.

        2.  secure_context.c

            Implements code to save and restore the secure world
            context.

        3.  secure_resets.c

            Implements code to power down the outbound cluster
            and bring individual cores in the inbound cluster
            out of reset.

        4.  ve_reset_handler.s

            Base of physical memory in the Versatile Express
            memory map is at 0x80000000. The processors are
            brought out of reset at 0x0 which points to Secure
            RAM/Flash memory. This file implements a small stub
            function that is placed at 0x0 so that execution
            jumps to 0x80000000 after a cold reset and to the
            warm_reset() handler in monmode_vectors.s
            after a warm reset.

        The secure world code is built into a separate ELF image
        to maintain its distinction from the Virtualizer code
        that executes in the Non-secure world.
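
        The three SMC request types handled in monmode_vectors.s
        can be modelled in C roughly as shown below. This is a
        sketch only: the real entry point is written in
        assembler, the numeric values of the SMC_SEC_* constants
        are hypothetical, and secure_state_t and smc_dispatch()
        are names invented for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: numeric values chosen for illustration only. */
enum {
    SMC_SEC_INIT     = 0,
    SMC_SEC_SAVE     = 1,
    SMC_SEC_SHUTDOWN = 2
};

/* Hypothetical record of what the secure world has done so far. */
typedef struct {
    bool initialised;      /* stacks, coherency, page tables set up */
    bool context_saved;    /* secure context saved                  */
    bool inbound_released; /* inbound core brought out of reset     */
    bool outbound_off;     /* caches flushed, outbound powered down */
} secure_state_t;

/*
 * Dispatch on the request type passed in r0.
 * Returns 0 on success and -1 for an unknown request.
 */
int smc_dispatch(secure_state_t *s, uint32_t r0)
{
    assert(s != NULL);

    switch (r0) {
    case SMC_SEC_INIT:
        s->initialised = true;
        return 0;
    case SMC_SEC_SAVE:
        s->context_saved = true;
        s->inbound_released = true;
        return 0;
    case SMC_SEC_SHUTDOWN:
        s->outbound_off = true;
        return 0;
    default:
        return -1;
    }
}
```

        In this model a cluster switch issues SMC_SEC_SAVE from
        the context save path and SMC_SEC_SHUTDOWN once the
        inbound cluster is running.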

    10. big-little/bl.scf.template

        1.  Scatter file that is used to build the Non-secure
            world code in the Virtualizer software. The
            resultant image is bl.axf.

    11. big-little/bl-sec.scf.template

        1.  Scatter file that is used to build the Secure world
            code in the Virtualizer software. The resultant
            image is bl_sec.axf.

    12. acsr/

        The secure world code is built into a separate ELF image
        to maintain its distinction from the Virtualizer code
        that executes in the Non-secure world.

        1.  helpers.s

            Helper functions to access the CP15 coprocessor
            space.

        2.  v7.s

            Contains routines to save and restore ARM processor
            context.

        3.  v7_c.c

            Contains routines to save and restore a processor's
            debug subsystem state. State is saved through the
            cp14 interface for v7.1 of the debug subsystem &
            through the memory mapped interface for v7.0.

        4.  debug_helpers.s
        5.  debug_helpers.h

            Helper functions to save and restore the ARM Debug
            subsystem context.