diff options
Diffstat (limited to 'Documentation')
19 files changed, 1493 insertions, 14 deletions
diff --git a/Documentation/IRQ-domain.txt b/Documentation/IRQ-domain.txt index 8a8b82c9ca53..39cfa72732ff 100644 --- a/Documentation/IRQ-domain.txt +++ b/Documentation/IRQ-domain.txt @@ -151,3 +151,74 @@ used and no descriptor gets allocated it is very important to make sure that the driver using the simple domain call irq_create_mapping() before any irq_find_mapping() since the latter will actually work for the static IRQ assignment case. + +==== Hierarchy IRQ domain ==== +On some architectures, there may be multiple interrupt controllers +involved in delivering an interrupt from the device to the target CPU. +Let's look at a typical interrupt delivering path on x86 platforms: + +Device --> IOAPIC -> Interrupt remapping Controller -> Local APIC -> CPU + +There are three interrupt controllers involved: +1) IOAPIC controller +2) Interrupt remapping controller +3) Local APIC controller + +To support such a hardware topology and make software architecture match +hardware architecture, an irq_domain data structure is built for each +interrupt controller and those irq_domains are organized into hierarchy. +When building irq_domain hierarchy, the irq_domain near to the device is +child and the irq_domain near to CPU is parent. So a hierarchy structure +as below will be built for the example above. + CPU Vector irq_domain (root irq_domain to manage CPU vectors) + ^ + | + Interrupt Remapping irq_domain (manage irq_remapping entries) + ^ + | + IOAPIC irq_domain (manage IOAPIC delivery entries/pins) + +There are four major interfaces to use hierarchy irq_domain: +1) irq_domain_alloc_irqs(): allocate IRQ descriptors and interrupt + controller related resources to deliver these interrupts. +2) irq_domain_free_irqs(): free IRQ descriptors and interrupt controller + related resources associated with these interrupts. +3) irq_domain_activate_irq(): activate interrupt controller hardware to + deliver the interrupt. +3) irq_domain_deactivate_irq(): deactivate interrupt controller hardware + to stop delivering the interrupt. + +Following changes are needed to support hierarchy irq_domain. +1) a new field 'parent' is added to struct irq_domain; it's used to + maintain irq_domain hierarchy information. +2) a new field 'parent_data' is added to struct irq_data; it's used to + build hierarchy irq_data to match hierarchy irq_domains. The irq_data + is used to store irq_domain pointer and hardware irq number. +3) new callbacks are added to struct irq_domain_ops to support hierarchy + irq_domain operations. + +With support of hierarchy irq_domain and hierarchy irq_data ready, an +irq_domain structure is built for each interrupt controller, and an +irq_data structure is allocated for each irq_domain associated with an +IRQ. Now we could go one step further to support stacked(hierarchy) +irq_chip. That is, an irq_chip is associated with each irq_data along +the hierarchy. A child irq_chip may implement a required action by +itself or by cooperating with its parent irq_chip. + +With stacked irq_chip, interrupt controller driver only needs to deal +with the hardware managed by itself and may ask for services from its +parent irq_chip when needed. So we could achieve a much cleaner +software architecture. + +For an interrupt controller driver to support hierarchy irq_domain, it +needs to: +1) Implement irq_domain_ops.alloc and irq_domain_ops.free +2) Optionally implement irq_domain_ops.activate and + irq_domain_ops.deactivate. +3) Optionally implement an irq_chip to manage the interrupt controller + hardware. +4) No need to implement irq_domain_ops.map and irq_domain_ops.unmap, + they are unused with hierarchy irq_domain. + +Hierarchy irq_domain may also be used to support other architectures, +such as ARM, ARM64 etc. diff --git a/Documentation/device-mapper/dm-crypt.txt b/Documentation/device-mapper/dm-crypt.txt index c81839b52c4d..692171fe9da0 100644 --- a/Documentation/device-mapper/dm-crypt.txt +++ b/Documentation/device-mapper/dm-crypt.txt @@ -5,7 +5,7 @@ Device-Mapper's "crypt" target provides transparent encryption of block devices using the kernel crypto API. For a more detailed description of supported parameters see: -http://code.google.com/p/cryptsetup/wiki/DMCrypt +https://gitlab.com/cryptsetup/cryptsetup/wikis/DMCrypt Parameters: <cipher> <key> <iv_offset> <device path> \ <offset> [<#opt_params> <opt_params>] @@ -51,7 +51,7 @@ Parameters: <cipher> <key> <iv_offset> <device path> \ Otherwise #opt_params is the number of following arguments. Example of optional parameters section: - 1 allow_discards + 3 allow_discards same_cpu_crypt submit_from_crypt_cpus allow_discards Block discard requests (a.k.a. TRIM) are passed through the crypt device. @@ -63,11 +63,24 @@ allow_discards used space etc.) if the discarded blocks can be located easily on the device later. +same_cpu_crypt + Perform encryption using the same cpu that IO was submitted on. + The default is to use an unbound workqueue so that encryption work + is automatically balanced between available CPUs. + +submit_from_crypt_cpus + Disable offloading writes to a separate thread after encryption. + There are some situations where offloading write bios from the + encryption threads to a single thread degrades performance + significantly. The default is to offload write bios to the same + thread because it benefits CFQ to have writes submitted using the + same context. + Example scripts =============== LUKS (Linux Unified Key Setup) is now the preferred way to set up disk encryption with dm-crypt using the 'cryptsetup' utility, see -http://code.google.com/p/cryptsetup/ +https://gitlab.com/cryptsetup/cryptsetup [[ #!/bin/sh diff --git a/Documentation/device-mapper/verity.txt b/Documentation/device-mapper/verity.txt index 9884681535ee..0075f70cd3f9 100644 --- a/Documentation/device-mapper/verity.txt +++ b/Documentation/device-mapper/verity.txt @@ -125,7 +125,7 @@ block boundary) are the hash blocks which are stored a depth at a time The full specification of kernel parameters and on-disk metadata format is available at the cryptsetup project's wiki page - http://code.google.com/p/cryptsetup/wiki/DMVerity + https://gitlab.com/cryptsetup/cryptsetup/wikis/DMVerity Status ====== @@ -142,7 +142,7 @@ Set up a device: A command line tool veritysetup is available to compute or verify the hash tree or activate the kernel device. This is available from -the cryptsetup upstream repository http://code.google.com/p/cryptsetup/ +the cryptsetup upstream repository https://gitlab.com/cryptsetup/cryptsetup/ (as a libcryptsetup extension). Create hash on the device: diff --git a/Documentation/devicetree/bindings/arm/arm-boards b/Documentation/devicetree/bindings/arm/arm-boards index c554ed3d44fb..10615f5053b0 100644 --- a/Documentation/devicetree/bindings/arm/arm-boards +++ b/Documentation/devicetree/bindings/arm/arm-boards @@ -92,3 +92,134 @@ Required nodes: - core-module: the root node to the Versatile platforms must have a core-module with regs and the compatible strings "arm,core-module-versatile", "syscon" + +ARM RealView Boards +------------------- +The RealView boards cover tailored evaluation boards that are used to explore +the ARM11 and Cortex A-8 and Cortex A-9 processors. + +Required properties (in root node): + /* RealView Emulation Baseboard */ + compatible = "arm,realview-eb"; + /* RealView Platform Baseboard for ARM1176JZF-S */ + compatible = "arm,realview-pb1176"; + /* RealView Platform Baseboard for ARM11 MPCore */ + compatible = "arm,realview-pb11mp"; + /* RealView Platform Baseboard for Cortex A-8 */ + compatible = "arm,realview-pba8"; + /* RealView Platform Baseboard Explore for Cortex A-9 */ + compatible = "arm,realview-pbx"; + +Required nodes: + +- soc: some node of the RealView platforms must be the SoC + node that contain the SoC-specific devices, withe the compatible + string set to one of these tuples: + "arm,realview-eb-soc", "simple-bus" + "arm,realview-pb1176-soc", "simple-bus" + "arm,realview-pb11mp-soc", "simple-bus" + "arm,realview-pba8-soc", "simple-bus" + "arm,realview-pbx-soc", "simple-bus" + +- syscon: some subnode of the RealView SoC node must be a + system controller node pointing to the control registers, + with the compatible string set to one of these tuples: + "arm,realview-eb-syscon", "syscon" + "arm,realview-pb1176-syscon", "syscon" + "arm,realview-pb11mp-syscon", "syscon" + "arm,realview-pba8-syscon", "syscon" + "arm,realview-pbx-syscon", "syscon" + + Required properties for the system controller: + - regs: the location and size of the system controller registers, + one range of 0x1000 bytes. + +Example: + +/dts-v1/; +#include <dt-bindings/interrupt-controller/irq.h> +#include "skeleton.dtsi" + +/ { + model = "ARM RealView PB1176 with device tree"; + compatible = "arm,realview-pb1176"; + + soc { + #address-cells = <1>; + #size-cells = <1>; + compatible = "arm,realview-pb1176-soc", "simple-bus"; + ranges; + + syscon: syscon@10000000 { + compatible = "arm,realview-syscon", "syscon"; + reg = <0x10000000 0x1000>; + }; + + }; +}; + +ARM Versatile Express Boards +----------------------------- +For details on the device tree bindings for ARM Versatile Express boards +please consult the vexpress.txt file in the same directory as this file. + +ARM Juno Boards +---------------- +The Juno boards are targeting development for AArch64 systems. The first +iteration, Juno r0, is a vehicle for evaluating big.LITTLE on AArch64, +with the second iteration, Juno r1, mainly aimed at development of PCIe +based systems. Juno r1 also has support for AXI masters placed on the TLX +connectors to join the coherency domain. + +Juno boards are described in a similar way to ARM Versatile Express boards, +with the motherboard part of the hardware being described in a separate file +to highlight the fact that is part of the support infrastructure for the SoC. +Juno device tree bindings also share the Versatile Express bindings as +described under the RS1 memory mapping. + +Required properties (in root node): + compatible = "arm,juno"; /* For Juno r0 board */ + compatible = "arm,juno-r1"; /* For Juno r1 board */ + +Required nodes: +The description for the board must include: + - a "psci" node describing the boot method used for the secondary CPUs. + A detailed description of the bindings used for "psci" nodes is present + in the psci.txt file. + - a "cpus" node describing the available cores and their associated + "enable-method"s. For more details see cpus.txt file. + +Example: + +/dts-v1/; +/ { + model = "ARM Juno development board (r0)"; + compatible = "arm,juno", "arm,vexpress"; + interrupt-parent = <&gic>; + #address-cells = <2>; + #size-cells = <2>; + + cpus { + #address-cells = <2>; + #size-cells = <0>; + + A57_0: cpu@0 { + compatible = "arm,cortex-a57","arm,armv8"; + reg = <0x0 0x0>; + device_type = "cpu"; + enable-method = "psci"; + }; + + ..... + + A53_0: cpu@100 { + compatible = "arm,cortex-a53","arm,armv8"; + reg = <0x0 0x100>; + device_type = "cpu"; + enable-method = "psci"; + }; + + ..... + }; + +}; diff --git a/Documentation/devicetree/bindings/arm/coresight.txt b/Documentation/devicetree/bindings/arm/coresight.txt new file mode 100644 index 000000000000..88602b75418e --- /dev/null +++ b/Documentation/devicetree/bindings/arm/coresight.txt @@ -0,0 +1,199 @@ +* CoreSight Components: + +CoreSight components are compliant with the ARM CoreSight architecture +specification and can be connected in various topologies to suit a particular +SoCs tracing needs. These trace components can generally be classified as +sinks, links and sources. Trace data produced by one or more sources flows +through the intermediate links connecting the source to the currently selected +sink. Each CoreSight component device should use these properties to describe +its hardware characteristcs. + +* Required properties for all components *except* non-configurable replicators: + + * compatible: These have to be supplemented with "arm,primecell" as + drivers are using the AMBA bus interface. Possible values include: + - "arm,coresight-etb10", "arm,primecell"; + - "arm,coresight-tpiu", "arm,primecell"; + - "arm,coresight-tmc", "arm,primecell"; + - "arm,coresight-funnel", "arm,primecell"; + - "arm,coresight-etm3x", "arm,primecell"; + + * reg: physical base address and length of the register + set(s) of the component. + + * clocks: the clock associated to this component. + + * clock-names: the name of the clock as referenced by the code. + Since we are using the AMBA framework, the name should be + "apb_pclk". + + * port or ports: The representation of the component's port + layout using the generic DT graph presentation found in + "bindings/graph.txt". + +* Required properties for devices that don't show up on the AMBA bus, such as + non-configurable replicators: + + * compatible: Currently supported value is (note the absence of the + AMBA markee): + - "arm,coresight-replicator" + + * port or ports: same as above. + +* Optional properties for ETM/PTMs: + + * arm,cp14: must be present if the system accesses ETM/PTM management + registers via co-processor 14. + + * cpu: the cpu phandle this ETM/PTM is affined to. When omitted the + source is considered to belong to CPU0. + +* Optional property for TMC: + + * arm,buffer-size: size of contiguous buffer space for TMC ETR + (embedded trace router) + + +Example: + +1. Sinks + etb@20010000 { + compatible = "arm,coresight-etb10", "arm,primecell"; + reg = <0 0x20010000 0 0x1000>; + + clocks = <&oscclk6a>; + clock-names = "apb_pclk"; + port { + etb_in_port: endpoint@0 { + slave-mode; + remote-endpoint = <&replicator_out_port0>; + }; + }; + }; + + tpiu@20030000 { + compatible = "arm,coresight-tpiu", "arm,primecell"; + reg = <0 0x20030000 0 0x1000>; + + clocks = <&oscclk6a>; + clock-names = "apb_pclk"; + port { + tpiu_in_port: endpoint@0 { + slave-mode; + remote-endpoint = <&replicator_out_port1>; + }; + }; + }; + +2. Links + replicator { + /* non-configurable replicators don't show up on the + * AMBA bus. As such no need to add "arm,primecell". + */ + compatible = "arm,coresight-replicator"; + + ports { + #address-cells = <1>; + #size-cells = <0>; + + /* replicator output ports */ + port@0 { + reg = <0>; + replicator_out_port0: endpoint { + remote-endpoint = <&etb_in_port>; + }; + }; + + port@1 { + reg = <1>; + replicator_out_port1: endpoint { + remote-endpoint = <&tpiu_in_port>; + }; + }; + + /* replicator input port */ + port@2 { + reg = <0>; + replicator_in_port0: endpoint { + slave-mode; + remote-endpoint = <&funnel_out_port0>; + }; + }; + }; + }; + + funnel@20040000 { + compatible = "arm,coresight-funnel", "arm,primecell"; + reg = <0 0x20040000 0 0x1000>; + + clocks = <&oscclk6a>; + clock-names = "apb_pclk"; + ports { + #address-cells = <1>; + #size-cells = <0>; + + /* funnel output port */ + port@0 { + reg = <0>; + funnel_out_port0: endpoint { + remote-endpoint = + <&replicator_in_port0>; + }; + }; + + /* funnel input ports */ + port@1 { + reg = <0>; + funnel_in_port0: endpoint { + slave-mode; + remote-endpoint = <&ptm0_out_port>; + }; + }; + + port@2 { + reg = <1>; + funnel_in_port1: endpoint { + slave-mode; + remote-endpoint = <&ptm1_out_port>; + }; + }; + + port@3 { + reg = <2>; + funnel_in_port2: endpoint { + slave-mode; + remote-endpoint = <&etm0_out_port>; + }; + }; + + }; + }; + +3. Sources + ptm@2201c000 { + compatible = "arm,coresight-etm3x", "arm,primecell"; + reg = <0 0x2201c000 0 0x1000>; + + cpu = <&cpu0>; + clocks = <&oscclk6a>; + clock-names = "apb_pclk"; + port { + ptm0_out_port: endpoint { + remote-endpoint = <&funnel_in_port0>; + }; + }; + }; + + ptm@2201d000 { + compatible = "arm,coresight-etm3x", "arm,primecell"; + reg = <0 0x2201d000 0 0x1000>; + + cpu = <&cpu1>; + clocks = <&oscclk6a>; + clock-names = "apb_pclk"; + port { + ptm1_out_port: endpoint { + remote-endpoint = <&funnel_in_port1>; + }; + }; + }; diff --git a/Documentation/devicetree/bindings/arm/gic-v3.txt b/Documentation/devicetree/bindings/arm/gic-v3.txt index 33cd05e6c125..ddfade40ac59 100644 --- a/Documentation/devicetree/bindings/arm/gic-v3.txt +++ b/Documentation/devicetree/bindings/arm/gic-v3.txt @@ -49,11 +49,29 @@ Optional occupied by the redistributors. Required if more than one such region is present. +Sub-nodes: + +GICv3 has one or more Interrupt Translation Services (ITS) that are +used to route Message Signalled Interrupts (MSI) to the CPUs. + +These nodes must have the following properties: +- compatible : Should at least contain "arm,gic-v3-its". +- msi-controller : Boolean property. Identifies the node as an MSI controller +- reg: Specifies the base physical address and size of the ITS + registers. + +The main GIC node must contain the appropriate #address-cells, +#size-cells and ranges properties for the reg property of all ITS +nodes. + Examples: gic: interrupt-controller@2cf00000 { compatible = "arm,gic-v3"; #interrupt-cells = <3>; + #address-cells = <2>; + #size-cells = <2>; + ranges; interrupt-controller; reg = <0x0 0x2f000000 0 0x10000>, // GICD <0x0 0x2f100000 0 0x200000>, // GICR @@ -61,11 +79,20 @@ Examples: <0x0 0x2c010000 0 0x2000>, // GICH <0x0 0x2c020000 0 0x2000>; // GICV interrupts = <1 9 4>; + + gic-its@2c200000 { + compatible = "arm,gic-v3-its"; + msi-controller; + reg = <0x0 0x2c200000 0 0x200000>; + }; }; gic: interrupt-controller@2c010000 { compatible = "arm,gic-v3"; #interrupt-cells = <3>; + #address-cells = <2>; + #size-cells = <2>; + ranges; interrupt-controller; redistributor-stride = <0x0 0x40000>; // 256kB stride #redistributor-regions = <2>; @@ -76,4 +103,16 @@ Examples: <0x0 0x2c060000 0 0x2000>, // GICH <0x0 0x2c080000 0 0x2000>; // GICV interrupts = <1 9 4>; + + gic-its@2c200000 { + compatible = "arm,gic-v3-its"; + msi-controller; + reg = <0x0 0x2c200000 0 0x200000>; + }; + + gic-its@2c400000 { + compatible = "arm,gic-v3-its"; + msi-controller; + reg = <0x0 0x2c400000 0 0x200000>; + }; }; diff --git a/Documentation/devicetree/bindings/arm/gic.txt b/Documentation/devicetree/bindings/arm/gic.txt index c7d2fa156678..45ccc2db3436 100644 --- a/Documentation/devicetree/bindings/arm/gic.txt +++ b/Documentation/devicetree/bindings/arm/gic.txt @@ -31,12 +31,16 @@ Main node required properties: The 3rd cell is the flags, encoded as follows: bits[3:0] trigger type and level flags. 1 = low-to-high edge triggered - 2 = high-to-low edge triggered + 2 = high-to-low edge triggered (invalid for SPIs) 4 = active high level-sensitive - 8 = active low level-sensitive + 8 = active low level-sensitive (invalid for SPIs). bits[15:8] PPI interrupt cpu mask. Each bit corresponds to each of the 8 possible cpus attached to the GIC. A bit set to '1' indicated the interrupt is wired to that CPU. Only valid for PPI interrupts. + Also note that the configurability of PPI interrupts is IMPLEMENTATION + DEFINED and as such not guaranteed to be present (most SoC available + in 2014 seem to ignore the setting of this flag and use the hardware + default value). - reg : Specifies base physical address(s) and size of the GIC registers. The first region is the GIC distributor register base and size. The 2nd region is @@ -96,3 +100,56 @@ Example: <0x2c006000 0x2000>; interrupts = <1 9 0xf04>; }; + + +* GICv2m extension for MSI/MSI-x support (Optional) + +Certain revisions of GIC-400 supports MSI/MSI-x via V2M register frame(s). +This is enabled by specifying v2m sub-node(s). + +Required properties: + +- compatible : The value here should contain "arm,gic-v2m-frame". + +- msi-controller : Identifies the node as an MSI controller. + +- reg : GICv2m MSI interface register base and size + +Optional properties: + +- arm,msi-base-spi : When the MSI_TYPER register contains an incorrect + value, this property should contain the SPI base of + the MSI frame, overriding the HW value. + +- arm,msi-num-spis : When the MSI_TYPER register contains an incorrect + value, this property should contain the number of + SPIs assigned to the frame, overriding the HW value. + +Example: + + interrupt-controller@e1101000 { + compatible = "arm,gic-400"; + #interrupt-cells = <3>; + #address-cells = <2>; + #size-cells = <2>; + interrupt-controller; + interrupts = <1 8 0xf04>; + ranges = <0 0 0 0xe1100000 0 0x100000>; + reg = <0x0 0xe1110000 0 0x01000>, + <0x0 0xe112f000 0 0x02000>, + <0x0 0xe1140000 0 0x10000>, + <0x0 0xe1160000 0 0x10000>; + v2m0: v2m@0x8000 { + compatible = "arm,gic-v2m-frame"; + msi-controller; + reg = <0x0 0x80000 0 0x1000>; + }; + + .... + + v2mN: v2m@0x9000 { + compatible = "arm,gic-v2m-frame"; + msi-controller; + reg = <0x0 0x90000 0 0x1000>; + }; + }; diff --git a/Documentation/devicetree/bindings/arm/mediatek/mediatek,sysirq.txt b/Documentation/devicetree/bindings/arm/mediatek/mediatek,sysirq.txt new file mode 100644 index 000000000000..d680b07ec6e8 --- /dev/null +++ b/Documentation/devicetree/bindings/arm/mediatek/mediatek,sysirq.txt @@ -0,0 +1,28 @@ +Mediatek 65xx/81xx sysirq + +Mediatek SOCs sysirq support controllable irq inverter for each GIC SPI +interrupt. + +Required properties: +- compatible: should be one of: + "mediatek,mt8135-sysirq" + "mediatek,mt8127-sysirq" + "mediatek,mt6589-sysirq" + "mediatek,mt6582-sysirq" + "mediatek,mt6577-sysirq" +- interrupt-controller : Identifies the node as an interrupt controller +- #interrupt-cells : Use the same format as specified by GIC in + Documentation/devicetree/bindings/arm/gic.txt +- interrupt-parent: phandle of irq parent for sysirq. The parent must + use the same interrupt-cells format as GIC. +- reg: Physical base address of the intpol registers and length of memory + mapped region. + +Example: + sysirq: interrupt-controller@10200100 { + compatible = "mediatek,mt6589-sysirq", "mediatek,mt6577-sysirq"; + interrupt-controller; + #interrupt-cells = <3>; + interrupt-parent = <&gic>; + reg = <0 0x10200100 0 0x1c>; + }; diff --git a/Documentation/devicetree/bindings/mfd/mfd.txt b/Documentation/devicetree/bindings/mfd/mfd.txt new file mode 100644 index 000000000000..af9d6931a1a2 --- /dev/null +++ b/Documentation/devicetree/bindings/mfd/mfd.txt @@ -0,0 +1,41 @@ +Multi-Function Devices (MFD) + +These devices comprise a nexus for heterogeneous hardware blocks containing +more than one non-unique yet varying hardware functionality. + +A typical MFD can be: + +- A mixed signal ASIC on an external bus, sometimes a PMIC (Power Management + Integrated Circuit) that is manufactured in a lower technology node (rough + silicon) that handles analog drivers for things like audio amplifiers, LED + drivers, level shifters, PHY (physical interfaces to things like USB or + ethernet), regulators etc. + +- A range of memory registers containing "miscellaneous system registers" also + known as a system controller "syscon" or any other memory range containing a + mix of unrelated hardware devices. + +Optional properties: + +- compatible : "simple-mfd" - this signifies that the operating system should + consider all subnodes of the MFD device as separate devices akin to how + "simple-bus" inidicates when to see subnodes as children for a simple + memory-mapped bus. For more complex devices, when the nexus driver has to + probe registers to figure out what child devices exist etc, this should not + be used. In the latter case the child devices will be determined by the + operating system. + +Example: + +foo@1000 { + compatible = "syscon", "simple-mfd"; + reg = <0x01000 0x1000>; + + led@08.0 { + compatible = "register-bit-led"; + offset = <0x08>; + mask = <0x01>; + label = "myled"; + default-state = "on"; + }; +}; diff --git a/Documentation/devicetree/bindings/power/opp.txt b/Documentation/devicetree/bindings/opp/opp.txt index 74499e5033fc..74499e5033fc 100644 --- a/Documentation/devicetree/bindings/power/opp.txt +++ b/Documentation/devicetree/bindings/opp/opp.txt diff --git a/Documentation/devicetree/bindings/pci/arm,juno-r1-pcie.txt b/Documentation/devicetree/bindings/pci/arm,juno-r1-pcie.txt new file mode 100644 index 000000000000..f7514c170a32 --- /dev/null +++ b/Documentation/devicetree/bindings/pci/arm,juno-r1-pcie.txt @@ -0,0 +1,10 @@ +* ARM Juno R1 PCIe interface + +This PCIe host controller is based on PLDA XpressRICH3-AXI IP +and thus inherits all the common properties defined in plda,xpressrich3-axi.txt +as well as the base properties defined in host-generic-pci.txt. + +Required properties: + - compatible: "arm,juno-r1-pcie" + - dma-coherent: The host controller bridges the AXI transactions into PCIe bus + in a manner that makes the DMA operations to appear coherent to the CPUs. diff --git a/Documentation/devicetree/bindings/pci/plda,xpressrich3-axi.txt b/Documentation/devicetree/bindings/pci/plda,xpressrich3-axi.txt new file mode 100644 index 000000000000..f3f75bfb42bc --- /dev/null +++ b/Documentation/devicetree/bindings/pci/plda,xpressrich3-axi.txt @@ -0,0 +1,12 @@ +* PLDA XpressRICH3-AXI host controller + +The PLDA XpressRICH3-AXI host controller can be configured in a manner that +makes it compliant with the SBSA[1] standard published by ARM Ltd. For those +scenarios, the host-generic-pci.txt bindings apply with the following additions +to the compatible property: + +Required properties: + - compatible: should contain "plda,xpressrich3-axi" to identify the IP used. + + +[1] http://infocenter.arm.com/help/topic/com.arm.doc.den0029a/ diff --git a/Documentation/devicetree/bindings/pci/xgene-pci-msi.txt b/Documentation/devicetree/bindings/pci/xgene-pci-msi.txt new file mode 100644 index 000000000000..36d881c8e6d4 --- /dev/null +++ b/Documentation/devicetree/bindings/pci/xgene-pci-msi.txt @@ -0,0 +1,68 @@ +* AppliedMicro X-Gene v1 PCIe MSI controller + +Required properties: + +- compatible: should be "apm,xgene1-msi" to identify + X-Gene v1 PCIe MSI controller block. +- msi-controller: indicates that this is X-Gene v1 PCIe MSI controller node +- reg: physical base address (0x79000000) and length (0x900000) for controller + registers. These registers include the MSI termination address and data + registers as well as the MSI interrupt status registers. +- reg-names: not required +- interrupts: A list of 16 interrupt outputs of the controller, starting from + interrupt number 0x10 to 0x1f. +- interrupt-names: not required + +Each PCIe node needs to have property msi-parent that points to msi controller node + +Examples: + +SoC DTSI: + + + MSI node: + msi@79000000 { + compatible = "apm,xgene1-msi"; + msi-controller; + reg = <0x00 0x79000000 0x0 0x900000>; + interrupts = <0x0 0x10 0x4> + <0x0 0x11 0x4> + <0x0 0x12 0x4> + <0x0 0x13 0x4> + <0x0 0x14 0x4> + <0x0 0x15 0x4> + <0x0 0x16 0x4> + <0x0 0x17 0x4> + <0x0 0x18 0x4> + <0x0 0x19 0x4> + <0x0 0x1a 0x4> + <0x0 0x1b 0x4> + <0x0 0x1c 0x4> + <0x0 0x1d 0x4> + <0x0 0x1e 0x4> + <0x0 0x1f 0x4>; + }; + + + PCIe controller node with msi-parent property pointing to MSI node: + pcie0: pcie@1f2b0000 { + status = "disabled"; + device_type = "pci"; + compatible = "apm,xgene-storm-pcie", "apm,xgene-pcie"; + #interrupt-cells = <1>; + #size-cells = <2>; + #address-cells = <3>; + reg = < 0x00 0x1f2b0000 0x0 0x00010000 /* Controller registers */ + 0xe0 0xd0000000 0x0 0x00040000>; /* PCI config space */ + reg-names = "csr", "cfg"; + ranges = <0x01000000 0x00 0x00000000 0xe0 0x10000000 0x00 0x00010000 /* io */ + 0x02000000 0x00 0x80000000 0xe1 0x80000000 0x00 0x80000000>; /* mem */ + dma-ranges = <0x42000000 0x80 0x00000000 0x80 0x00000000 0x00 0x80000000 + 0x42000000 0x00 0x00000000 0x00 0x00000000 0x80 0x00000000>; + interrupt-map-mask = <0x0 0x0 0x0 0x7>; + interrupt-map = <0x0 0x0 0x0 0x1 &gic 0x0 0xc2 0x1 + 0x0 0x0 0x0 0x2 &gic 0x0 0xc3 0x1 + 0x0 0x0 0x0 0x3 &gic 0x0 0xc4 0x1 + 0x0 0x0 0x0 0x4 &gic 0x0 0xc5 0x1>; + dma-coherent; + clocks = <&pcie0clk 0>; + msi-parent= <&msi>; + }; diff --git a/Documentation/devicetree/bindings/thermal/thermal.txt b/Documentation/devicetree/bindings/thermal/thermal.txt index f5db6b72a36f..99d6608c9d5f 100644 --- a/Documentation/devicetree/bindings/thermal/thermal.txt +++ b/Documentation/devicetree/bindings/thermal/thermal.txt @@ -167,6 +167,13 @@ Optional property: by means of sensor ID. Additional coefficients are interpreted as constant offset. +- sustainable-power: An estimate of the sustainable power (in mW) that the + Type: unsigned thermal zone can dissipate at the desired + Size: one cell control temperature. For reference, the + sustainable power of a 4'' phone is typically + 2000mW, while on a 10'' tablet is around + 4500mW. + Note: The delay properties are bound to the maximum dT/dt (temperature derivative over time) in two situations for a thermal zone: (i) - when passive cooling is activated (polling-delay-passive); and @@ -546,6 +553,8 @@ thermal-zones { */ coefficients = <1200 -345 890>; + sustainable-power = <2500>; + trips { /* Trips are based on resulting linear equation */ cpu-trip: cpu-trip { diff --git a/Documentation/devicetree/bindings/vendor-prefixes.txt b/Documentation/devicetree/bindings/vendor-prefixes.txt index a344ec2713a5..3efffccf8dd7 100644 --- a/Documentation/devicetree/bindings/vendor-prefixes.txt +++ b/Documentation/devicetree/bindings/vendor-prefixes.txt @@ -115,6 +115,7 @@ panasonic Panasonic Corporation phytec PHYTEC Messtechnik GmbH picochip Picochip Ltd plathome Plat'Home Co., Ltd. +plda PLDA pixcir PIXCIR MICROELECTRONICS Co., Ltd powervr PowerVR (deprecated, use img) qca Qualcomm Atheros, Inc. diff --git a/Documentation/thermal/cpu-cooling-api.txt b/Documentation/thermal/cpu-cooling-api.txt index fca24c931ec8..71653584cd03 100644 --- a/Documentation/thermal/cpu-cooling-api.txt +++ b/Documentation/thermal/cpu-cooling-api.txt @@ -3,7 +3,7 @@ CPU cooling APIs How To Written by Amit Daniel Kachhap <amit.kachhap@linaro.org> -Updated: 12 May 2012 +Updated: 6 Jan 2015 Copyright (c) 2012 Samsung Electronics Co., Ltd(http://www.samsung.com) @@ -25,8 +25,173 @@ the user. The registration APIs returns the cooling device pointer. clip_cpus: cpumask of cpus where the frequency constraints will happen. -1.1.2 void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev) +1.1.2 struct thermal_cooling_device *of_cpufreq_cooling_register( + struct device_node *np, const struct cpumask *clip_cpus) + + This interface function registers the cpufreq cooling device with + the name "thermal-cpufreq-%x" linking it with a device tree node, in + order to bind it via the thermal DT code. This api can support multiple + instances of cpufreq cooling devices. + + np: pointer to the cooling device device tree node + clip_cpus: cpumask of cpus where the frequency constraints will happen. + +1.1.3 struct thermal_cooling_device *cpufreq_power_cooling_register( + const struct cpumask *clip_cpus, u32 capacitance, + get_static_t plat_static_func) + +Similar to cpufreq_cooling_register, this function registers a cpufreq +cooling device. Using this function, the cooling device will +implement the power extensions by using a simple cpu power model. The +cpus must have registered their OPPs using the OPP library. + +The additional parameters are needed for the power model (See 2. Power +models). "capacitance" is the dynamic power coefficient (See 2.1 +Dynamic power). "plat_static_func" is a function to calculate the +static power consumed by these cpus (See 2.2 Static power). + +1.1.4 struct thermal_cooling_device *of_cpufreq_power_cooling_register( + struct device_node *np, const struct cpumask *clip_cpus, u32 capacitance, + get_static_t plat_static_func) + +Similar to cpufreq_power_cooling_register, this function register a +cpufreq cooling device with power extensions using the device tree +information supplied by the np parameter. + +1.1.5 void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev) This interface function unregisters the "thermal-cpufreq-%x" cooling device. cdev: Cooling device pointer which has to be unregistered. + +2. Power models + +The power API registration functions provide a simple power model for +CPUs. The current power is calculated as dynamic + (optionally) +static power. This power model requires that the operating-points of +the CPUs are registered using the kernel's opp library and the +`cpufreq_frequency_table` is assigned to the `struct device` of the +cpu. If you are using CONFIG_CPUFREQ_DT then the +`cpufreq_frequency_table` should already be assigned to the cpu +device. + +The `plat_static_func` parameter of `cpufreq_power_cooling_register()` +and `of_cpufreq_power_cooling_register()` is optional. If you don't +provide it, only dynamic power will be considered. + +2.1 Dynamic power + +The dynamic power consumption of a processor depends on many factors. +For a given processor implementation the primary factors are: + +- The time the processor spends running, consuming dynamic power, as + compared to the time in idle states where dynamic consumption is + negligible. Herein we refer to this as 'utilisation'. +- The voltage and frequency levels as a result of DVFS. The DVFS + level is a dominant factor governing power consumption. +- In running time the 'execution' behaviour (instruction types, memory + access patterns and so forth) causes, in most cases, a second order + variation. In pathological cases this variation can be significant, + but typically it is of a much lesser impact than the factors above. + +A high level dynamic power consumption model may then be represented as: + +Pdyn = f(run) * Voltage^2 * Frequency * Utilisation + +f(run) here represents the described execution behaviour and its +result has a units of Watts/Hz/Volt^2 (this often expressed in +mW/MHz/uVolt^2) + +The detailed behaviour for f(run) could be modelled on-line. However, +in practice, such an on-line model has dependencies on a number of +implementation specific processor support and characterisation +factors. Therefore, in initial implementation that contribution is +represented as a constant coefficient. This is a simplification +consistent with the relative contribution to overall power variation. + +In this simplified representation our model becomes: + +Pdyn = Capacitance * Voltage^2 * Frequency * Utilisation + +Where `capacitance` is a constant that represents an indicative +running time dynamic power coefficient in fundamental units of +mW/MHz/uVolt^2. Typical values for mobile CPUs might lie in range +from 100 to 500. For reference, the approximate values for the SoC in +ARM's Juno Development Platform are 530 for the Cortex-A57 cluster and +140 for the Cortex-A53 cluster. + + +2.2 Static power + +Static leakage power consumption depends on a number of factors. For a +given circuit implementation the primary factors are: + +- Time the circuit spends in each 'power state' +- Temperature +- Operating voltage +- Process grade + +The time the circuit spends in each 'power state' for a given +evaluation period at first order means OFF or ON. However, +'retention' states can also be supported that reduce power during +inactive periods without loss of context. + +Note: The visibility of state entries to the OS can vary, according to +platform specifics, and this can then impact the accuracy of a model +based on OS state information alone. It might be possible in some +cases to extract more accurate information from system resources. + +The temperature, operating voltage and process 'grade' (slow to fast) +of the circuit are all significant factors in static leakage power +consumption. All of these have complex relationships to static power. + +Circuit implementation specific factors include the chosen silicon +process as well as the type, number and size of transistors in both +the logic gates and any RAM elements included. + +The static power consumption modelling must take into account the +power managed regions that are implemented. Taking the example of an +ARM processor cluster, the modelling would take into account whether +each CPU can be powered OFF separately or if only a single power +region is implemented for the complete cluster. + +In one view, there are others, a static power consumption model can +then start from a set of reference values for each power managed +region (e.g. CPU, Cluster/L2) in each state (e.g. ON, OFF) at an +arbitrary process grade, voltage and temperature point. These values +are then scaled for all of the following: the time in each state, the +process grade, the current temperature and the operating voltage. +However, since both implementation specific and complex relationships +dominate the estimate, the appropriate interface to the model from the +cpu cooling device is to provide a function callback that calculates +the static power in this platform. When registering the cpu cooling +device pass a function pointer that follows the `get_static_t` +prototype: + + int plat_get_static(cpumask_t *cpumask, int interval, + unsigned long voltage, u32 &power); + +`cpumask` is the cpumask of the cpus involved in the calculation. +`voltage` is the voltage at which they are operating. The function +should calculate the average static power for the last `interval` +milliseconds. It returns 0 on success, -E* on error. If it +succeeds, it should store the static power in `power`. Reading the +temperature of the cpus described by `cpumask` is left for +plat_get_static() to do as the platform knows best which thermal +sensor is closest to the cpu. + +If `plat_static_func` is NULL, static power is considered to be +negligible for this platform and only dynamic power is considered. + +The platform specific callback can then use any combination of tables +and/or equations to permute the estimated value. Process grade +information is not passed to the model since access to such data, from +on-chip measurement capability or manufacture time data, is platform +specific. + +Note: the significance of static power for CPUs in comparison to +dynamic power is highly dependent on implementation. Given the +potential complexity in implementation, the importance and accuracy of +its inclusion when using cpu cooling devices should be assessed on a +case by case basis. + diff --git a/Documentation/thermal/power_allocator.txt b/Documentation/thermal/power_allocator.txt new file mode 100644 index 000000000000..c3797b529991 --- /dev/null +++ b/Documentation/thermal/power_allocator.txt @@ -0,0 +1,247 @@ +Power allocator governor tunables +================================= + +Trip points +----------- + +The governor requires the following two passive trip points: + +1. "switch on" trip point: temperature above which the governor + control loop starts operating. This is the first passive trip + point of the thermal zone. + +2. "desired temperature" trip point: it should be higher than the + "switch on" trip point. This the target temperature the governor + is controlling for. This is the last passive trip point of the + thermal zone. + +PID Controller +-------------- + +The power allocator governor implements a +Proportional-Integral-Derivative controller (PID controller) with +temperature as the control input and power as the controlled output: + + P_max = k_p * e + k_i * err_integral + k_d * diff_err + sustainable_power + +where + e = desired_temperature - current_temperature + err_integral is the sum of previous errors + diff_err = e - previous_error + +It is similar to the one depicted below: + + k_d + | +current_temp | + | v + | +----------+ +---+ + | +----->| diff_err |-->| X |------+ + | | +----------+ +---+ | + | | | tdp actor + | | k_i | | get_requested_power() + | | | | | | | + | | | | | | | ... + v | v v v v v + +---+ | +-------+ +---+ +---+ +---+ +----------+ + | S |-------+----->| sum e |----->| X |--->| S |-->| S |-->|power | + +---+ | +-------+ +---+ +---+ +---+ |allocation| + ^ | ^ +----------+ + | | | | | + | | +---+ | | | + | +------->| X |-------------------+ v v + | +---+ granted performance +desired_temperature ^ + | + | + k_po/k_pu + +Sustainable power +----------------- + +An estimate of the sustainable dissipatable power (in mW) should be +provided while registering the thermal zone. This estimates the +sustained power that can be dissipated at the desired control +temperature. This is the maximum sustained power for allocation at +the desired maximum temperature. The actual sustained power can vary +for a number of reasons. The closed loop controller will take care of +variations such as environmental conditions, and some factors related +to the speed-grade of the silicon. `sustainable_power` is therefore +simply an estimate, and may be tuned to affect the aggressiveness of +the thermal ramp. For reference, the sustainable power of a 4" phone +is typically 2000mW, while on a 10" tablet is around 4500mW (may vary +depending on screen size). + +If you are using device tree, do add it as a property of the +thermal-zone. For example: + + thermal-zones { + soc_thermal { + polling-delay = <1000>; + polling-delay-passive = <100>; + sustainable-power = <2500>; + ... + +Instead, if the thermal zone is registered from the platform code, pass a +`thermal_zone_params` that has a `sustainable_power`. If no +`thermal_zone_params` were being passed, then something like below +will suffice: + + static const struct thermal_zone_params tz_params = { + .sustainable_power = 3500, + }; + +and then pass `tz_params` as the 5th parameter to +`thermal_zone_device_register()` + +k_po and k_pu +------------- + +The implementation of the PID controller in the power allocator +thermal governor allows the configuration of two proportional term +constants: `k_po` and `k_pu`. `k_po` is the proportional term +constant during temperature overshoot periods (current temperature is +above "desired temperature" trip point). Conversely, `k_pu` is the +proportional term constant during temperature undershoot periods +(current temperature below "desired temperature" trip point). + +These controls are intended as the primary mechanism for configuring +the permitted thermal "ramp" of the system. For instance, a lower +`k_pu` value will provide a slower ramp, at the cost of capping +available capacity at a low temperature. On the other hand, a high +value of `k_pu` will result in the governor granting very high power +whilst temperature is low, and may lead to temperature overshooting. + +The default value for `k_pu` is: + + 2 * sustainable_power / (desired_temperature - switch_on_temp) + +This means that at `switch_on_temp` the output of the controller's +proportional term will be 2 * `sustainable_power`. The default value +for `k_po` is: + + sustainable_power / (desired_temperature - switch_on_temp) + +Focusing on the proportional and feed forward values of the PID +controller equation we have: + + P_max = k_p * e + sustainable_power + +The proportional term is proportional to the difference between the +desired temperature and the current one. When the current temperature +is the desired one, then the proportional component is zero and +`P_max` = `sustainable_power`. That is, the system should operate in +thermal equilibrium under constant load. `sustainable_power` is only +an estimate, which is the reason for closed-loop control such as this. + +Expanding `k_pu` we get: + P_max = 2 * sustainable_power * (T_set - T) / (T_set - T_on) + + sustainable_power + +where + T_set is the desired temperature + T is the current temperature + T_on is the switch on temperature + +When the current temperature is the switch_on temperature, the above +formula becomes: + + P_max = 2 * sustainable_power * (T_set - T_on) / (T_set - T_on) + + sustainable_power = 2 * sustainable_power + sustainable_power = + 3 * sustainable_power + +Therefore, the proportional term alone linearly decreases power from +3 * `sustainable_power` to `sustainable_power` as the temperature +rises from the switch on temperature to the desired temperature. + +k_i and integral_cutoff +----------------------- + +`k_i` configures the PID loop's integral term constant. This term +allows the PID controller to compensate for long term drift and for +the quantized nature of the output control: cooling devices can't set +the exact power that the governor requests. When the temperature +error is below `integral_cutoff`, errors are accumulated in the +integral term. This term is then multiplied by `k_i` and the result +added to the output of the controller. Typically `k_i` is set low (1 +or 2) and `integral_cutoff` is 0. + +k_d +--- + +`k_d` configures the PID loop's derivative term constant. It's +recommended to leave it as the default: 0. + +Cooling device power API +======================== + +Cooling devices controlled by this governor must supply the additional +"power" API in their `cooling_device_ops`. It consists on three ops: + +1. int get_requested_power(struct thermal_cooling_device *cdev, + struct thermal_zone_device *tz, u32 *power); +@cdev: The `struct thermal_cooling_device` pointer +@tz: thermal zone in which we are currently operating +@power: pointer in which to store the calculated power + +`get_requested_power()` calculates the power requested by the device +in milliwatts and stores it in @power . It should return 0 on +success, -E* on failure. This is currently used by the power +allocator governor to calculate how much power to give to each cooling +device. + +2. int state2power(struct thermal_cooling_device *cdev, struct + thermal_zone_device *tz, unsigned long state, u32 *power); +@cdev: The `struct thermal_cooling_device` pointer +@tz: thermal zone in which we are currently operating +@state: A cooling device state +@power: pointer in which to store the equivalent power + +Convert cooling device state @state into power consumption in +milliwatts and store it in @power. It should return 0 on success, -E* +on failure. This is currently used by thermal core to calculate the +maximum power that an actor can consume. + +3. int power2state(struct thermal_cooling_device *cdev, u32 power, + unsigned long *state); +@cdev: The `struct thermal_cooling_device` pointer +@power: power in milliwatts +@state: pointer in which to store the resulting state + +Calculate a cooling device state that would make the device consume at +most @power mW and store it in @state. It should return 0 on success, +-E* on failure. This is currently used by the thermal core to convert +a given power set by the power allocator governor to a state that the +cooling device can set. It is a function because this conversion may +depend on external factors that may change so this function should the +best conversion given "current circumstances". + +Cooling device weights +---------------------- + +Weights are a mechanism to bias the allocation among cooling +devices. They express the relative power efficiency of different +cooling devices. Higher weight can be used to express higher power +efficiency. Weighting is relative such that if each cooling device +has a weight of one they are considered equal. This is particularly +useful in heterogeneous systems where two cooling devices may perform +the same kind of compute, but with different efficiency. For example, +a system with two different types of processors. + +If the thermal zone is registered using +`thermal_zone_device_register()` (i.e., platform code), then weights +are passed as part of the thermal zone's `thermal_bind_parameters`. +If the platform is registered using device tree, then they are passed +as the `contribution` property of each map in the `cooling-maps` node. + +Limitations of the power allocator governor +=========================================== + +The power allocator governor's PID controller works best if there is a +periodic tick. If you have a driver that calls +`thermal_zone_device_update()` (or anything that ends up calling the +governor's `throttle()` function) repetitively, the governor response +won't be very good. Note that this is not particular to this +governor, step-wise will also misbehave if you call its throttle() +faster than the normal thermal framework tick (due to interrupts for +example) as it will overreact. diff --git a/Documentation/thermal/sysfs-api.txt b/Documentation/thermal/sysfs-api.txt index 87519cb379ee..c1f6864a8c5d 100644 --- a/Documentation/thermal/sysfs-api.txt +++ b/Documentation/thermal/sysfs-api.txt @@ -95,7 +95,7 @@ temperature) and throttle appropriate devices. 1.3 interface for binding a thermal zone device with a thermal cooling device 1.3.1 int thermal_zone_bind_cooling_device(struct thermal_zone_device *tz, int trip, struct thermal_cooling_device *cdev, - unsigned long upper, unsigned long lower); + unsigned long upper, unsigned long lower, unsigned int weight); This interface function bind a thermal cooling device to the certain trip point of a thermal zone device. @@ -110,6 +110,8 @@ temperature) and throttle appropriate devices. lower:the Minimum cooling state can be used for this trip point. THERMAL_NO_LIMIT means no lower limit, and the cooling device can be in cooling state 0. + weight: the influence of this cooling device in this thermal + zone. See 1.4.1 below for more information. 1.3.2 int thermal_zone_unbind_cooling_device(struct thermal_zone_device *tz, int trip, struct thermal_cooling_device *cdev); @@ -127,9 +129,15 @@ temperature) and throttle appropriate devices. This structure defines the following parameters that are used to bind a zone with a cooling device for a particular trip point. .cdev: The cooling device pointer - .weight: The 'influence' of a particular cooling device on this zone. - This is on a percentage scale. The sum of all these weights - (for a particular zone) cannot exceed 100. + .weight: The 'influence' of a particular cooling device on this + zone. This is relative to the rest of the cooling + devices. For example, if all cooling devices have a + weight of 1, then they all contribute the same. You can + use percentages if you want, but it's not mandatory. A + weight of 0 means that this cooling device doesn't + contribute to the cooling of this zone unless all cooling + devices have a weight of 0. If all weights are 0, then + they all contribute the same. .trip_mask:This is a bit mask that gives the binding relation between this thermal zone and cdev, for a particular trip point. If nth bit is set, then the cdev and thermal zone are bound @@ -176,6 +184,14 @@ Thermal zone device sys I/F, created once it's registered: |---trip_point_[0-*]_type: Trip point type |---trip_point_[0-*]_hyst: Hysteresis value for this trip point |---emul_temp: Emulated temperature set node + |---sustainable_power: Sustainable dissipatable power + |---k_po: Proportional term during temperature overshoot + |---k_pu: Proportional term during temperature undershoot + |---k_i: PID's integral term in the power allocator gov + |---k_d: PID's derivative term in the power allocator + |---integral_cutoff: Offset above which errors are accumulated + |---slope: Slope constant applied as linear extrapolation + |---offset: Offset constant applied as linear extrapolation Thermal cooling device sys I/F, created once it's registered: /sys/class/thermal/cooling_device[0-*]: @@ -192,6 +208,8 @@ thermal_zone_bind_cooling_device/thermal_zone_unbind_cooling_device. /sys/class/thermal/thermal_zone[0-*]: |---cdev[0-*]: [0-*]th cooling device in current thermal zone |---cdev[0-*]_trip_point: Trip point that cdev[0-*] is associated with + |---cdev[0-*]_weight: Influence of the cooling device in + this thermal zone Besides the thermal zone device sysfs I/F and cooling device sysfs I/F, the generic thermal driver also creates a hwmon sysfs I/F for each _type_ @@ -265,6 +283,14 @@ cdev[0-*]_trip_point point. RO, Optional +cdev[0-*]_weight + The influence of cdev[0-*] in this thermal zone. This value + is relative to the rest of cooling devices in the thermal + zone. For example, if a cooling device has a weight double + than that of other, it's twice as effective in cooling the + thermal zone. + RW, Optional + passive Attribute is only present for zones in which the passive cooling policy is not supported by native thermal driver. Default is zero @@ -289,6 +315,66 @@ emul_temp because userland can easily disable the thermal policy by simply flooding this sysfs node with low temperature values. +sustainable_power + An estimate of the sustained power that can be dissipated by + the thermal zone. Used by the power allocator governor. For + more information see Documentation/thermal/power_allocator.txt + Unit: milliwatts + RW, Optional + +k_po + The proportional term of the power allocator governor's PID + controller during temperature overshoot. Temperature overshoot + is when the current temperature is above the "desired + temperature" trip point. For more information see + Documentation/thermal/power_allocator.txt + RW, Optional + +k_pu + The proportional term of the power allocator governor's PID + controller during temperature undershoot. Temperature undershoot + is when the current temperature is below the "desired + temperature" trip point. For more information see + Documentation/thermal/power_allocator.txt + RW, Optional + +k_i + The integral term of the power allocator governor's PID + controller. This term allows the PID controller to compensate + for long term drift. For more information see + Documentation/thermal/power_allocator.txt + RW, Optional + +k_d + The derivative term of the power allocator governor's PID + controller. For more information see + Documentation/thermal/power_allocator.txt + RW, Optional + +integral_cutoff + Temperature offset from the desired temperature trip point + above which the integral term of the power allocator + governor's PID controller starts accumulating errors. For + example, if integral_cutoff is 0, then the integral term only + accumulates error when temperature is above the desired + temperature trip point. For more information see + Documentation/thermal/power_allocator.txt + RW, Optional + +slope + The slope constant used in a linear extrapolation model + to determine a hotspot temperature based off the sensor's + raw readings. It is up to the device driver to determine + the usage of these values. + RW, Optional + +offset + The offset constant used in a linear extrapolation model + to determine a hotspot temperature based off the sensor's + raw readings. It is up to the device driver to determine + the usage of these values. + RW, Optional + ***************************** * Cooling device attributes * ***************************** @@ -318,7 +404,8 @@ passive, active. If an ACPI thermal zone supports critical, passive, active[0] and active[1] at the same time, it may register itself as a thermal_zone_device (thermal_zone1) with 4 trip points in all. It has one processor and one fan, which are both registered as -thermal_cooling_device. +thermal_cooling_device. Both are considered to have the same +effectiveness in cooling the thermal zone. If the processor is listed in _PSL method, and the fan is listed in _AL0 method, the sys I/F structure will be built like this: @@ -340,8 +427,10 @@ method, the sys I/F structure will be built like this: |---trip_point_3_type: active1 |---cdev0: --->/sys/class/thermal/cooling_device0 |---cdev0_trip_point: 1 /* cdev0 can be used for passive */ + |---cdev0_weight: 1024 |---cdev1: --->/sys/class/thermal/cooling_device3 |---cdev1_trip_point: 2 /* cdev1 can be used for active[0]*/ + |---cdev1_weight: 1024 |cooling_device0: |---type: Processor diff --git a/Documentation/trace/coresight.txt b/Documentation/trace/coresight.txt new file mode 100644 index 000000000000..77d14d51a670 --- /dev/null +++ b/Documentation/trace/coresight.txt @@ -0,0 +1,299 @@ + Coresight - HW Assisted Tracing on ARM + ====================================== + + Author: Mathieu Poirier <mathieu.poirier@linaro.org> + Date: September 11th, 2014 + +Introduction +------------ + +Coresight is an umbrella of technologies allowing for the debugging of ARM +based SoC. It includes solutions for JTAG and HW assisted tracing. This +document is concerned with the latter. + +HW assisted tracing is becoming increasingly useful when dealing with systems +that have many SoCs and other components like GPU and DMA engines. ARM has +developed a HW assisted tracing solution by means of different components, each +being added to a design at synthesis time to cater to specific tracing needs. +Compoments are generally categorised as source, link and sinks and are +(usually) discovered using the AMBA bus. + +"Sources" generate a compressed stream representing the processor instruction +path based on tracing scenarios as configured by users. From there the stream +flows through the coresight system (via ATB bus) using links that are connecting +the emanating source to a sink(s). Sinks serve as endpoints to the coresight +implementation, either storing the compressed stream in a memory buffer or +creating an interface to the outside world where data can be transferred to a +host without fear of filling up the onboard coresight memory buffer. + +At typical coresight system would look like this: + + ***************************************************************** + **************************** AMBA AXI ****************************===|| + ***************************************************************** || + ^ ^ | || + | | * ** + 0000000 ::::: 0000000 ::::: ::::: @@@@@@@ |||||||||||| + 0 CPU 0<-->: C : 0 CPU 0<-->: C : : C : @ STM @ || System || + |->0000000 : T : |->0000000 : T : : T :<--->@@@@@ || Memory || + | #######<-->: I : | #######<-->: I : : I : @@@<-| |||||||||||| + | # ETM # ::::: | # PTM # ::::: ::::: @ | + | ##### ^ ^ | ##### ^ ! ^ ! . | ||||||||| + | |->### | ! | |->### | ! | ! . | || DAP || + | | # | ! | | # | ! | ! . | ||||||||| + | | . | ! | | . | ! | ! . | | | + | | . | ! | | . | ! | ! . | | * + | | . | ! | | . | ! | ! . | | SWD/ + | | . | ! | | . | ! | ! . | | JTAG + *****************************************************************<-| + *************************** AMBA Debug APB ************************ + ***************************************************************** + | . ! . ! ! . | + | . * . * * . | + ***************************************************************** + ******************** Cross Trigger Matrix (CTM) ******************* + ***************************************************************** + | . ^ . . | + | * ! * * | + ***************************************************************** + ****************** AMBA Advanced Trace Bus (ATB) ****************** + ***************************************************************** + | ! =============== | + | * ===== F =====<---------| + | ::::::::: ==== U ==== + |-->:: CTI ::<!! === N === + | ::::::::: ! == N == + | ^ * == E == + | ! &&&&&&&&& IIIIIII == L == + |------>&& ETB &&<......II I ======= + | ! &&&&&&&&& II I . + | ! I I . + | ! I REP I<.......... + | ! I I + | !!>&&&&&&&&& II I *Source: ARM ltd. + |------>& TPIU &<......II I DAP = Debug Access Port + &&&&&&&&& IIIIIII ETM = Embedded Trace Macrocell + ; PTM = Program Trace Macrocell + ; CTI = Cross Trigger Interface + * ETB = Embedded Trace Buffer + To trace port TPIU= Trace Port Interface Unit + SWD = Serial Wire Debug + +While on target configuration of the components is done via the APB bus, +all trace data are carried out-of-band on the ATB bus. The CTM provides +a way to aggregate and distribute signals between CoreSight components. + +The coresight framework provides a central point to represent, configure and +manage coresight devices on a platform. This first implementation centers on +the basic tracing functionality, enabling components such ETM/PTM, funnel, +replicator, TMC, TPIU and ETB. Future work will enable more +intricate IP blocks such as STM and CTI. + + +Acronyms and Classification +--------------------------- + +Acronyms: + +PTM: Program Trace Macrocell +ETM: Embedded Trace Macrocell +STM: System trace Macrocell +ETB: Embedded Trace Buffer +ITM: Instrumentation Trace Macrocell +TPIU: Trace Port Interface Unit +TMC-ETR: Trace Memory Controller, configured as Embedded Trace Router +TMC-ETF: Trace Memory Controller, configured as Embedded Trace FIFO +CTI: Cross Trigger Interface + +Classification: + +Source: + ETMv3.x ETMv4, PTMv1.0, PTMv1.1, STM, STM500, ITM +Link: + Funnel, replicator (intelligent or not), TMC-ETR +Sinks: + ETBv1.0, ETB1.1, TPIU, TMC-ETF +Misc: + CTI + + +Device Tree Bindings +---------------------- + +See Documentation/devicetree/bindings/arm/coresight.txt for details. + +As of this writing drivers for ITM, STMs and CTIs are not provided but are +expected to be added as the solution matures. + + +Framework and implementation +---------------------------- + +The coresight framework provides a central point to represent, configure and +manage coresight devices on a platform. Any coresight compliant device can +register with the framework for as long as they use the right APIs: + +struct coresight_device *coresight_register(struct coresight_desc *desc); +void coresight_unregister(struct coresight_device *csdev); + +The registering function is taking a "struct coresight_device *csdev" and +register the device with the core framework. The unregister function takes +a reference to a "strut coresight_device", obtained at registration time. + +If everything goes well during the registration process the new devices will +show up under /sys/bus/coresight/devices, as showns here for a TC2 platform: + +root:~# ls /sys/bus/coresight/devices/ +replicator 20030000.tpiu 2201c000.ptm 2203c000.etm 2203e000.etm +20010000.etb 20040000.funnel 2201d000.ptm 2203d000.etm +root:~# + +The functions take a "struct coresight_device", which looks like this: + +struct coresight_desc { + enum coresight_dev_type type; + struct coresight_dev_subtype subtype; + const struct coresight_ops *ops; + struct coresight_platform_data *pdata; + struct device *dev; + const struct attribute_group **groups; +}; + + +The "coresight_dev_type" identifies what the device is, i.e, source link or +sink while the "coresight_dev_subtype" will characterise that type further. + +The "struct coresight_ops" is mandatory and will tell the framework how to +perform base operations related to the components, each component having +a different set of requirement. For that "struct coresight_ops_sink", +"struct coresight_ops_link" and "struct coresight_ops_source" have been +provided. + +The next field, "struct coresight_platform_data *pdata" is acquired by calling +"of_get_coresight_platform_data()", as part of the driver's _probe routine and +"struct device *dev" gets the device reference embedded in the "amba_device": + +static int etm_probe(struct amba_device *adev, const struct amba_id *id) +{ + ... + ... + drvdata->dev = &adev->dev; + ... +} + +Specific class of device (source, link, or sink) have generic operations +that can be performed on them (see "struct coresight_ops"). The +"**groups" is a list of sysfs entries pertaining to operations +specific to that component only. "Implementation defined" customisations are +expected to be accessed and controlled using those entries. + +Last but not least, "struct module *owner" is expected to be set to reflect +the information carried in "THIS_MODULE". + +How to use +---------- + +Before trace collection can start, a coresight sink needs to be identify. +There is no limit on the amount of sinks (nor sources) that can be enabled at +any given moment. As a generic operation, all device pertaining to the sink +class will have an "active" entry in sysfs: + +root:/sys/bus/coresight/devices# ls +replicator 20030000.tpiu 2201c000.ptm 2203c000.etm 2203e000.etm +20010000.etb 20040000.funnel 2201d000.ptm 2203d000.etm +root:/sys/bus/coresight/devices# ls 20010000.etb +enable_sink status trigger_cntr +root:/sys/bus/coresight/devices# echo 1 > 20010000.etb/enable_sink +root:/sys/bus/coresight/devices# cat 20010000.etb/enable_sink +1 +root:/sys/bus/coresight/devices# + +At boot time the current etm3x driver will configure the first address +comparator with "_stext" and "_etext", essentially tracing any instruction +that falls within that range. As such "enabling" a source will immediately +trigger a trace capture: + +root:/sys/bus/coresight/devices# echo 1 > 2201c000.ptm/enable_source +root:/sys/bus/coresight/devices# cat 2201c000.ptm/enable_source +1 +root:/sys/bus/coresight/devices# cat 20010000.etb/status +Depth: 0x2000 +Status: 0x1 +RAM read ptr: 0x0 +RAM wrt ptr: 0x19d3 <----- The write pointer is moving +Trigger cnt: 0x0 +Control: 0x1 +Flush status: 0x0 +Flush ctrl: 0x2001 +root:/sys/bus/coresight/devices# + +Trace collection is stopped the same way: + +root:/sys/bus/coresight/devices# echo 0 > 2201c000.ptm/enable_source +root:/sys/bus/coresight/devices# + +The content of the ETB buffer can be harvested directly from /dev: + +root:/sys/bus/coresight/devices# dd if=/dev/20010000.etb \ +of=~/cstrace.bin + +64+0 records in +64+0 records out +32768 bytes (33 kB) copied, 0.00125258 s, 26.2 MB/s +root:/sys/bus/coresight/devices# + +The file cstrace.bin can be decompressed using "ptm2human", DS-5 or Trace32. + +Following is a DS-5 output of an experimental loop that increments a variable up +to a certain value. The example is simple and yet provides a glimpse of the +wealth of possibilities that coresight provides. + +Info Tracing enabled +Instruction 106378866 0x8026B53C E52DE004 false PUSH {lr} +Instruction 0 0x8026B540 E24DD00C false SUB sp,sp,#0xc +Instruction 0 0x8026B544 E3A03000 false MOV r3,#0 +Instruction 0 0x8026B548 E58D3004 false STR r3,[sp,#4] +Instruction 0 0x8026B54C E59D3004 false LDR r3,[sp,#4] +Instruction 0 0x8026B550 E3530004 false CMP r3,#4 +Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1 +Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4] +Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c +Timestamp Timestamp: 17106715833 +Instruction 319 0x8026B54C E59D3004 false LDR r3,[sp,#4] +Instruction 0 0x8026B550 E3530004 false CMP r3,#4 +Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1 +Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4] +Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c +Instruction 9 0x8026B54C E59D3004 false LDR r3,[sp,#4] +Instruction 0 0x8026B550 E3530004 false CMP r3,#4 +Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1 +Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4] +Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c +Instruction 7 0x8026B54C E59D3004 false LDR r3,[sp,#4] +Instruction 0 0x8026B550 E3530004 false CMP r3,#4 +Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1 +Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4] +Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c +Instruction 7 0x8026B54C E59D3004 false LDR r3,[sp,#4] +Instruction 0 0x8026B550 E3530004 false CMP r3,#4 +Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1 +Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4] +Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c +Instruction 10 0x8026B54C E59D3004 false LDR r3,[sp,#4] +Instruction 0 0x8026B550 E3530004 false CMP r3,#4 +Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1 +Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4] +Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c +Instruction 6 0x8026B560 EE1D3F30 false MRC p15,#0x0,r3,c13,c0,#1 +Instruction 0 0x8026B564 E1A0100D false MOV r1,sp +Instruction 0 0x8026B568 E3C12D7F false BIC r2,r1,#0x1fc0 +Instruction 0 0x8026B56C E3C2203F false BIC r2,r2,#0x3f +Instruction 0 0x8026B570 E59D1004 false LDR r1,[sp,#4] +Instruction 0 0x8026B574 E59F0010 false LDR r0,[pc,#16] ; [0x8026B58C] = 0x80550368 +Instruction 0 0x8026B578 E592200C false LDR r2,[r2,#0xc] +Instruction 0 0x8026B57C E59221D0 false LDR r2,[r2,#0x1d0] +Instruction 0 0x8026B580 EB07A4CF true BL {pc}+0x1e9344 ; 0x804548c4 +Info Tracing enabled +Instruction 13570831 0x8026B584 E28DD00C false ADD sp,sp,#0xc +Instruction 0 0x8026B588 E8BD8000 true LDM sp!,{pc} +Timestamp Timestamp: 17107041535 |