Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/DocBook/media_api.tmpl | 4
-rw-r--r-- | Documentation/arm/small_task_packing.txt | 136
-rw-r--r-- | Documentation/devicetree/bindings/arm/cci.txt | 172
-rw-r--r-- | Documentation/devicetree/bindings/arm/pmu.txt | 3
-rw-r--r-- | Documentation/devicetree/bindings/arm/rtsm-dcscb.txt | 19
-rw-r--r-- | Documentation/devicetree/bindings/mfd/vexpress-spc.txt | 35
-rw-r--r-- | Documentation/hwmon/k10temp | 1
-rw-r--r-- | Documentation/i2c/busses/i2c-piix4 | 2
-rw-r--r-- | Documentation/kernel-parameters.txt | 28
-rw-r--r-- | Documentation/networking/ip-sysctl.txt | 12
-rw-r--r-- | Documentation/networking/packet_mmap.txt | 10
-rw-r--r-- | Documentation/parisc/registers | 8
-rw-r--r-- | Documentation/sysctl/kernel.txt | 25
13 files changed, 442 insertions, 13 deletions
diff --git a/Documentation/DocBook/media_api.tmpl b/Documentation/DocBook/media_api.tmpl index 6a8b7158697..9c92bb879b6 100644 --- a/Documentation/DocBook/media_api.tmpl +++ b/Documentation/DocBook/media_api.tmpl @@ -1,6 +1,6 @@ <?xml version="1.0"?> -<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" - "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" [ +<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN" + "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [ <!ENTITY % media-entities SYSTEM "./media-entities.tmpl"> %media-entities; <!ENTITY media-indices SYSTEM "./media-indices.tmpl"> diff --git a/Documentation/arm/small_task_packing.txt b/Documentation/arm/small_task_packing.txt new file mode 100644 index 00000000000..43f0a8b8023 --- /dev/null +++ b/Documentation/arm/small_task_packing.txt @@ -0,0 +1,136 @@ +Small Task Packing in the big.LITTLE MP Reference Patch Set + +What is small task packing? +---- +Simply that the scheduler will fit as many small tasks on a single CPU +as possible before using other CPUs. A small task is defined as one +whose tracked load is less than 90% of a NICE_0 task. This is a change +from the usual behavior since the scheduler will normally use an idle +CPU for a waking task unless that task is considered cache hot. + + +How is it implemented? +---- +Since all small tasks must wake up relatively frequently, the main +requirement for packing small tasks is to select a partly-busy CPU when +waking rather than looking for an idle CPU. We use the tracked load of +the CPU runqueue to determine how heavily loaded each CPU is and the +tracked load of the task to determine if it will fit on the CPU. We +always start with the lowest-numbered CPU in a sched domain and stop +looking when we find a CPU with enough space for the task. + +Some further tweaks are necessary to suppress load balancing when the +CPU is not fully loaded, otherwise the scheduler attempts to spread +tasks evenly across the domain. 
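The wake-up CPU selection described above can be sketched as a first-fit search. This is a minimal illustration in Python, not the scheduler code from the patch set: the function names, the `NICE_0_LOAD = 1024` scale, and the reading of "enough space" as `cpu_load + task_load <= packing_limit` are assumptions made for the example.

```python
# Illustrative model only; names and the load scale are assumptions,
# not identifiers taken from the patch set.
NICE_0_LOAD = 1024                          # assumed load of one NICE_0 task
SMALL_TASK_LOAD = NICE_0_LOAD * 90 // 100   # "less than 90% of a NICE_0 task"


def is_small_task(task_load):
    """A task is small when its tracked load is below the 90% threshold."""
    return task_load < SMALL_TASK_LOAD


def select_packing_cpu(cpu_loads, task_load, packing_limit):
    """First-fit: return the lowest-numbered CPU with room for the task.

    cpu_loads is the tracked runqueue load per CPU, in CPU-number order.
    Returns None when the task is not small or no CPU has room, in which
    case the caller would fall back to the usual idle-CPU search.
    """
    if not is_small_task(task_load):
        return None
    for cpu, load in enumerate(cpu_loads):
        if load + task_load <= packing_limit:
            return cpu        # stop at the first CPU that fits
    return None
```

With per-CPU loads of `[600, 100, 0]`, a task of load 100 and a limit of 650, CPU0 has no room, so the task lands on CPU1 and CPU2 is never examined.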
+ + +How does it interact with the HMP patches? +---- +Firstly, we only enable packing on the little domain. The intent is that +the big domain is intended to spread tasks amongst the available CPUs +one-task-per-CPU. The little domain however is attempting to use as +little power as possible while servicing its tasks. + +Secondly, since we offload big tasks onto little CPUs in order to try +to devote one CPU to each task, we have a threshold above which we do +not try to pack a task and instead will select an idle CPU if possible. +This maintains maximum forward progress for busy tasks temporarily +demoted from big CPUs. + + +Can the behaviour be tuned? +---- +Yes, the load level of a 'full' CPU can be easily modified in the source +and is exposed through sysfs as /sys/kernel/hmp/packing_limit to be +changed at runtime. The presence of the packing behaviour is controlled +by CONFIG_SCHED_HMP_LITTLE_PACKING and can be disabled at run-time +using /sys/kernel/hmp/packing_enable. +The definition of a small task is hard coded as 90% of NICE_0_LOAD +and cannot be modified at run time. + + +Why do I need to tune it? +---- +The optimal configuration is likely to be different depending upon the +design and manufacturing of your SoC. + +In the main, there are two system effects from enabling small task +packing. + +1. CPU operating point may increase +2. wakeup latency of tasks may be increased + +There are also likely to be secondary effects from loading one CPU +rather than spreading tasks. + +Note that all of these system effects are dependent upon the workload +under consideration. + + +CPU Operating Point +---- +The primary impact of loading one CPU with a number of light tasks is to +increase the compute requirement of that CPU since it is no longer idle +as often. Increased compute requirement causes an increase in the +frequency of the CPU through CPUfreq. 
+ +Consider this example: +We have a system with 3 CPUs which can operate at any frequency between +350MHz and 1GHz. The system has 6 tasks which would each produce 10% +load at 1GHz. The scheduler has frequency-invariant load scaling +enabled. Our DVFS governor aims for 80% utilization at the chosen +frequency. + +Without task packing, these tasks will be spread out amongst all CPUs +such that each has 2. This will produce roughly 20% system load, and +the frequency of the package will remain at 350MHz. + +With task packing set to the default packing_limit, all of these tasks +will sit on one CPU and require a package frequency of ~750MHz to reach +80% utilization. (0.75 = 0.6 * 0.8). + +When a package operates on a single frequency domain, all CPUs in that +package share frequency and voltage. + +Depending upon the SoC implementation there can be a significant amount +of energy lost to leakage from idle CPUs. The decision about how +loaded a CPU must be to be considered 'full' is therefore controllable +through sysfs (sys/kernel/hmp/packing_limit) and directly in the code. + +Continuing the example, lets set packing_limit to 450 which means we +will pack tasks until the total load of all running tasks >= 450. In +practise, this is very similar to a 55% idle 1Ghz CPU. + +Now we are only able to place 4 tasks on CPU0, and two will overflow +onto CPU1. CPU0 will have a load of 40% and CPU1 will have a load of +20%. In order to still hit 80% utilization, CPU0 now only needs to +operate at (0.4*0.8=0.32) 320MHz, which means that the lowest operating +point will be selected, the same as in the non-packing case, except that +now CPU2 is no longer needed and can be power-gated. + +In order to use less energy, the saving from power-gating CPU2 must be +more than the energy spent running CPU0 for the extra cycles. This +depends upon the SoC implementation. 
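The arithmetic in this example can be checked with a short sketch. It assumes a governor that picks the lowest frequency at which the CPU's load (expressed as a percentage of its 1GHz capacity) stays at or below the 80% utilization target, clamped to the 350MHz to 1GHz range. The function name and the linear model are assumptions for illustration and are not part of the patch.

```python
F_MIN_MHZ = 350
F_MAX_MHZ = 1000
TARGET_UTIL_PCT = 80


def required_freq_mhz(load_pct_at_fmax):
    """Frequency at which the given load reaches the utilization target.

    load_pct_at_fmax is the CPU load as a percentage of F_MAX_MHZ capacity.
    The result is clamped to the available frequency range.
    """
    wanted = load_pct_at_fmax * F_MAX_MHZ / TARGET_UTIL_PCT
    return min(max(wanted, F_MIN_MHZ), F_MAX_MHZ)


# Six tasks of 10% each, spread two per CPU: 20% load per CPU.
spread = required_freq_mhz(20)    # 250MHz wanted, clamped to the 350MHz floor
# All six tasks packed on one CPU: 60% load.
packed = required_freq_mhz(60)    # 60 / 0.8 = 750MHz
```

Under this model the 750MHz figure comes from dividing the load by the target (0.6 / 0.8). The same model would put a 40% load at 40 / 0.8 = 500MHz rather than the 320MHz obtained by multiplying, so the packing_limit figures in the example are best read as approximate.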
+ +This is obviously a contrived example requiring all the tasks to +be runnable at the same time, but it illustrates the point. + + +Wakeup Latency +---- +This is an unavoidable consequence of trying to pack tasks together +rather than giving them a CPU each. If you cannot find an acceptable +level of wakeup latency, you should turn packing off. + +Cyclictest is a good test application for determining the added latency +when configuring packing. + + +Why is it turned off for the VersatileExpress V2P_CA15A7 CoreTile? +---- +Simply, this core tile only has power gating for the whole A7 package. +When small task packing is enabled, all our low-energy use cases +normally fit onto one A7 CPU. We therefore end up with 2 mostly-idle +CPUs and one mostly-busy CPU. This decreases the amount of time +available where the whole package is idle and can be turned off. + diff --git a/Documentation/devicetree/bindings/arm/cci.txt b/Documentation/devicetree/bindings/arm/cci.txt new file mode 100644 index 00000000000..92d36e2aa87 --- /dev/null +++ b/Documentation/devicetree/bindings/arm/cci.txt @@ -0,0 +1,172 @@ +======================================================= +ARM CCI cache coherent interconnect binding description +======================================================= + +ARM multi-cluster systems maintain intra-cluster coherency through a +cache coherent interconnect (CCI) that is capable of monitoring bus +transactions and manage coherency, TLB invalidations and memory barriers. + +It allows snooping and distributed virtual memory message broadcast across +clusters, through memory mapped interface, with a global control register +space and multiple sets of interface control registers, one per slave +interface. + +Bindings for the CCI node follow the ePAPR standard, available from: + +www.power.org/documentation/epapr-version-1-1/ + +with the addition of the bindings described in this document which are +specific to ARM. 
+ +* CCI interconnect node + + Description: Describes a CCI cache coherent Interconnect component + + Node name must be "cci". + Node's parent must be the root node /, and the address space visible + through the CCI interconnect is the same as the one seen from the + root node (ie from CPUs perspective as per DT standard). + Every CCI node has to define the following properties: + + - compatible + Usage: required + Value type: <string> + Definition: must be set to + "arm,cci-400" + + - reg + Usage: required + Value type: <prop-encoded-array> + Definition: A standard property. Specifies base physical + address of CCI control registers common to all + interfaces. + + - ranges: + Usage: required + Value type: <prop-encoded-array> + Definition: A standard property. Follow rules in the ePAPR for + hierarchical bus addressing. CCI interfaces + addresses refer to the parent node addressing + scheme to declare their register bases. + + CCI interconnect node can define the following child nodes: + + - CCI control interface nodes + + Node name must be "slave-if". + Parent node must be CCI interconnect node. + + A CCI control interface node must contain the following + properties: + + - compatible + Usage: required + Value type: <string> + Definition: must be set to + "arm,cci-400-ctrl-if" + + - interface-type: + Usage: required + Value type: <string> + Definition: must be set to one of {"ace", "ace-lite"} + depending on the interface type the node + represents. + + - reg: + Usage: required + Value type: <prop-encoded-array> + Definition: the base address and size of the + corresponding interface programming + registers. + +* CCI interconnect bus masters + + Description: masters in the device tree connected to a CCI port + (inclusive of CPUs and their cpu nodes). 
+ + A CCI interconnect bus master node must contain the following + properties: + + - cci-control-port: + Usage: required + Value type: <phandle> + Definition: a phandle containing the CCI control interface node + the master is connected to. + +Example: + + cpus { + #size-cells = <0>; + #address-cells = <1>; + + CPU0: cpu@0 { + device_type = "cpu"; + compatible = "arm,cortex-a15"; + cci-control-port = <&cci_control1>; + reg = <0x0>; + }; + + CPU1: cpu@1 { + device_type = "cpu"; + compatible = "arm,cortex-a15"; + cci-control-port = <&cci_control1>; + reg = <0x1>; + }; + + CPU2: cpu@100 { + device_type = "cpu"; + compatible = "arm,cortex-a7"; + cci-control-port = <&cci_control2>; + reg = <0x100>; + }; + + CPU3: cpu@101 { + device_type = "cpu"; + compatible = "arm,cortex-a7"; + cci-control-port = <&cci_control2>; + reg = <0x101>; + }; + + }; + + dma0: dma@3000000 { + compatible = "arm,pl330", "arm,primecell"; + cci-control-port = <&cci_control0>; + reg = <0x0 0x3000000 0x0 0x1000>; + interrupts = <10>; + #dma-cells = <1>; + #dma-channels = <8>; + #dma-requests = <32>; + }; + + cci@2c090000 { + compatible = "arm,cci-400"; + #address-cells = <1>; + #size-cells = <1>; + reg = <0x0 0x2c090000 0 0x1000>; + ranges = <0x0 0x0 0x2c090000 0x6000>; + + cci_control0: slave-if@1000 { + compatible = "arm,cci-400-ctrl-if"; + interface-type = "ace-lite"; + reg = <0x1000 0x1000>; + }; + + cci_control1: slave-if@4000 { + compatible = "arm,cci-400-ctrl-if"; + interface-type = "ace"; + reg = <0x4000 0x1000>; + }; + + cci_control2: slave-if@5000 { + compatible = "arm,cci-400-ctrl-if"; + interface-type = "ace"; + reg = <0x5000 0x1000>; + }; + }; + +This CCI node corresponds to a CCI component whose control registers sits +at address 0x000000002c090000. +CCI slave interface @0x000000002c091000 is connected to dma controller dma0. 
+CCI slave interface @0x000000002c094000 is connected to CPUs {CPU0, CPU1}; +CCI slave interface @0x000000002c095000 is connected to CPUs {CPU2, CPU3}; diff --git a/Documentation/devicetree/bindings/arm/pmu.txt b/Documentation/devicetree/bindings/arm/pmu.txt index 343781b9f24..4ce82d045a6 100644 --- a/Documentation/devicetree/bindings/arm/pmu.txt +++ b/Documentation/devicetree/bindings/arm/pmu.txt @@ -16,6 +16,9 @@ Required properties: "arm,arm1176-pmu" "arm,arm1136-pmu" - interrupts : 1 combined interrupt or 1 per core. +- cluster : a phandle to the cluster to which it belongs + If there are more than one cluster with same CPU type + then there should be separate PMU nodes per cluster. Example: diff --git a/Documentation/devicetree/bindings/arm/rtsm-dcscb.txt b/Documentation/devicetree/bindings/arm/rtsm-dcscb.txt new file mode 100644 index 00000000000..3b8fbf3c00c --- /dev/null +++ b/Documentation/devicetree/bindings/arm/rtsm-dcscb.txt @@ -0,0 +1,19 @@ +ARM Dual Cluster System Configuration Block +------------------------------------------- + +The Dual Cluster System Configuration Block (DCSCB) provides basic +functionality for controlling clocks, resets and configuration pins in +the Dual Cluster System implemented by the Real-Time System Model (RTSM). 
+ +Required properties: + +- compatible : should be "arm,rtsm,dcscb" + +- reg : physical base address and the size of the registers window + +Example: + + dcscb@60000000 { + compatible = "arm,rtsm,dcscb"; + reg = <0x60000000 0x1000>; + }; diff --git a/Documentation/devicetree/bindings/mfd/vexpress-spc.txt b/Documentation/devicetree/bindings/mfd/vexpress-spc.txt new file mode 100644 index 00000000000..1d71dc2ff15 --- /dev/null +++ b/Documentation/devicetree/bindings/mfd/vexpress-spc.txt @@ -0,0 +1,35 @@ +* ARM Versatile Express Serial Power Controller device tree bindings + +Latest ARM development boards implement a power management interface (serial +power controller - SPC) that is capable of managing power/voltage and +operating point transitions, through memory mapped registers interface. + +On testchips like TC2 it also provides a configuration interface that can +be used to read/write values which cannot be read/written through simple +memory mapped reads/writes. + +- spc node + + - compatible: + Usage: required + Value type: <stringlist> + Definition: must be + "arm,vexpress-spc,v2p-ca15_a7","arm,vexpress-spc" + - reg: + Usage: required + Value type: <prop-encode-array> + Definition: A standard property that specifies the base address + and the size of the SPC address space + - interrupts: + Usage: required + Value type: <prop-encoded-array> + Definition: SPC interrupt configuration. 
A standard property + that follows ePAPR interrupts specifications + +Example: + +spc: spc@7fff0000 { + compatible = "arm,vexpress-spc,v2p-ca15_a7","arm,vexpress-spc"; + reg = <0 0x7FFF0000 0 0x1000>; + interrupts = <0 95 4>; +}; diff --git a/Documentation/hwmon/k10temp b/Documentation/hwmon/k10temp index 90956b61802..4dfdc8f8363 100644 --- a/Documentation/hwmon/k10temp +++ b/Documentation/hwmon/k10temp @@ -12,6 +12,7 @@ Supported chips: * AMD Family 12h processors: "Llano" (E2/A4/A6/A8-Series) * AMD Family 14h processors: "Brazos" (C/E/G/Z-Series) * AMD Family 15h processors: "Bulldozer" (FX-Series), "Trinity" +* AMD Family 16h processors: "Kabini" Prefix: 'k10temp' Addresses scanned: PCI space diff --git a/Documentation/i2c/busses/i2c-piix4 b/Documentation/i2c/busses/i2c-piix4 index 1e6634f54c5..a370b2047cf 100644 --- a/Documentation/i2c/busses/i2c-piix4 +++ b/Documentation/i2c/busses/i2c-piix4 @@ -13,7 +13,7 @@ Supported adapters: * AMD SP5100 (SB700 derivative found on some server mainboards) Datasheet: Publicly available at the AMD website http://support.amd.com/us/Embedded_TechDocs/44413.pdf - * AMD Hudson-2 + * AMD Hudson-2, CZ Datasheet: Not publicly available * Standard Microsystems (SMSC) SLC90E66 (Victory66) southbridge Datasheet: Publicly available at the SMSC website http://www.smsc.com diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 2fe6e767b3d..15b24a2be6b 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -1240,6 +1240,15 @@ bytes respectively. Such letter suffixes can also be entirely omitted. See comment before ip2_setup() in drivers/char/ip2/ip2base.c. 
+ irqaffinity= [SMP] Set the default irq affinity mask + Format: + <cpu number>,...,<cpu number> + or + <cpu number>-<cpu number> + (must be a positive range in ascending order) + or a mixture + <cpu number>,...,<cpu number>-<cpu number> + irqfixup [HW] When an interrupt is not handled search all handlers for it. Intended to get systems with badly broken @@ -1456,6 +1465,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted. * dump_id: dump IDENTIFY data. + * atapi_dmadir: Enable ATAPI DMADIR bridge support + + * disable: Disable this device. + If there are multiple matching configurations changing the same attribute, the last one is used. @@ -3341,6 +3354,21 @@ bytes respectively. Such letter suffixes can also be entirely omitted. that this also can be controlled per-workqueue for workqueues visible under /sys/bus/workqueue/. + workqueue.power_efficient + Per-cpu workqueues are generally preferred because + they show better performance thanks to cache + locality; unfortunately, per-cpu workqueues tend to + be more power hungry than unbound workqueues. + + Enabling this makes the per-cpu workqueues which + were observed to contribute significantly to power + consumption unbound, leading to measurably lower + power usage at the cost of small performance + overhead. + + The default value of this parameter is determined by + the config option CONFIG_WQ_POWER_EFFICIENT_DEFAULT. + x2apic_phys [X86-64,APIC] Use x2apic physical mode instead of default x2apic cluster mode on platforms supporting x2apic. diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt index 3458d6343e0..a59ee432a98 100644 --- a/Documentation/networking/ip-sysctl.txt +++ b/Documentation/networking/ip-sysctl.txt @@ -478,6 +478,15 @@ tcp_syn_retries - INTEGER tcp_timestamps - BOOLEAN Enable timestamps as defined in RFC1323. +tcp_min_tso_segs - INTEGER + Minimal number of segments per TSO frame. 
+ Since linux-3.12, TCP does an automatic sizing of TSO frames, + depending on flow rate, instead of filling 64Kbytes packets. + For specific usages, it's possible to force TCP to build big + TSO frames. Note that TCP stack might split too big TSO packets + if available window is too small. + Default: 2 + tcp_tso_win_divisor - INTEGER This allows control over what percentage of the congestion window can be consumed by a single TSO frame. @@ -562,9 +571,6 @@ tcp_limit_output_bytes - INTEGER typical pfifo_fast qdiscs. tcp_limit_output_bytes limits the number of bytes on qdisc or device to reduce artificial RTT/cwnd and reduce bufferbloat. - Note: For GSO/TSO enabled flows, we try to have at least two - packets in flight. Reducing tcp_limit_output_bytes might also - reduce the size of individual GSO packet (64KB being the max) Default: 131072 tcp_challenge_ack_limit - INTEGER diff --git a/Documentation/networking/packet_mmap.txt b/Documentation/networking/packet_mmap.txt index 23dd80e82b8..0f4376ec885 100644 --- a/Documentation/networking/packet_mmap.txt +++ b/Documentation/networking/packet_mmap.txt @@ -123,6 +123,16 @@ Transmission process is similar to capture as shown below. [shutdown] close() --------> destruction of the transmission socket and deallocation of all associated resources. +Socket creation and destruction is also straight forward, and is done +the same way as in capturing described in the previous paragraph: + + int fd = socket(PF_PACKET, mode, 0); + +The protocol can optionally be 0 in case we only want to transmit +via this socket, which avoids an expensive call to packet_rcv(). +In this case, you also need to bind(2) the TX_RING with sll_protocol = 0 +set. Otherwise, htons(ETH_P_ALL) or any other protocol, for example. + Binding the socket to your network interface is mandatory (with zero copy) to know the header size of frames used in the circular buffer. 
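As a rough illustration of the transmit-only case described above, the sketch below uses Python's standard `socket` module instead of C. `packet_protocol()` and `open_tx_only_socket()` are hypothetical helper names, and `SOCK_RAW` is just one possible choice for `mode`. Opening the socket needs Linux and CAP_NET_RAW, so only the protocol selection can be exercised without privileges.

```python
import socket

ETH_P_ALL = 0x0003  # value from <linux/if_ether.h>


def packet_protocol(tx_only):
    """Protocol argument for socket(): 0 for transmit-only sockets,
    otherwise htons(ETH_P_ALL) to also receive every protocol."""
    return 0 if tx_only else socket.htons(ETH_P_ALL)


def open_tx_only_socket(ifname):
    """Create a packet socket with protocol 0 and bind it to ifname.

    Linux only; requires CAP_NET_RAW. The bind address carries protocol 0
    as well, matching the sll_protocol = 0 requirement for the TX_RING.
    """
    sock = socket.socket(socket.AF_PACKET, socket.SOCK_RAW,
                         packet_protocol(tx_only=True))
    sock.bind((ifname, 0))
    return sock
```

A caller would use something like `open_tx_only_socket("eth0")`, where the interface name is only an example, and then set up the TX_RING on the returned socket.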
diff --git a/Documentation/parisc/registers b/Documentation/parisc/registers index dd3caddd1ad..10c7d1730f5 100644 --- a/Documentation/parisc/registers +++ b/Documentation/parisc/registers @@ -78,6 +78,14 @@ Shadow Registers used by interruption handler code TOC enable bit 1 ========================================================================= + +The PA-RISC architecture defines 7 registers as "shadow registers". +Those are used in RETURN FROM INTERRUPTION AND RESTORE instruction to reduce +the state save and restore time by eliminating the need for general register +(GR) saves and restores in interruption handlers. +Shadow registers are the GRs 1, 8, 9, 16, 17, 24, and 25. + +========================================================================= Register usage notes, originally from John Marvin, with some additional notes from Randolph Chung. diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt index ccd42589e12..9b34b168507 100644 --- a/Documentation/sysctl/kernel.txt +++ b/Documentation/sysctl/kernel.txt @@ -289,13 +289,24 @@ Default value is "/sbin/hotplug". kptr_restrict: This toggle indicates whether restrictions are placed on -exposing kernel addresses via /proc and other interfaces. When -kptr_restrict is set to (0), there are no restrictions. When -kptr_restrict is set to (1), the default, kernel pointers -printed using the %pK format specifier will be replaced with 0's -unless the user has CAP_SYSLOG. When kptr_restrict is set to -(2), kernel pointers printed using %pK will be replaced with 0's -regardless of privileges. +exposing kernel addresses via /proc and other interfaces. + +When kptr_restrict is set to (0), the default, there are no restrictions. + +When kptr_restrict is set to (1), kernel pointers printed using the %pK +format specifier will be replaced with 0's unless the user has CAP_SYSLOG +and effective user and group ids are equal to the real ids. 
This is +because %pK checks are done at read() time rather than open() time, so +if permissions are elevated between the open() and the read() (e.g via +a setuid binary) then %pK will not leak kernel pointers to unprivileged +users. Note, this is a temporary solution only. The correct long-term +solution is to do the permission checks at open() time. Consider removing +world read permissions from files that use %pK, and using dmesg_restrict +to protect against uses of %pK in dmesg(8) if leaking kernel pointer +values to unprivileged users is a concern. + +When kptr_restrict is set to (2), kernel pointers printed using +%pK will be replaced with 0's regardless of privileges. ==============================================================
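The three kptr_restrict settings described in this hunk can be summarized as a small decision function. This is a sketch of the documented behaviour only, with a hypothetical function name and parameters; it is not the kernel's %pK implementation, and treating any value other than 0 or 1 as fully restricted is an assumption.

```python
def pk_shows_address(kptr_restrict, has_cap_syslog, uid, euid, gid, egid):
    """Return True if %pK would print the real pointer, False if zeros."""
    if kptr_restrict == 0:
        return True                 # no restrictions
    if kptr_restrict == 1:
        # Needs CAP_SYSLOG, and the effective ids must equal the real ids,
        # which rules out a reader whose ids were changed by a setuid binary.
        return has_cap_syslog and uid == euid and gid == egid
    return False                    # 2: replaced with 0's for everyone
```

For example, a root-owned setuid helper run by uid 1000 has `euid == 0` but `uid == 1000`, so with `kptr_restrict = 1` the function returns False even when CAP_SYSLOG is held.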