Age | Commit message (Collapse) | Author |
|
'jstultz-android/linaro-fixes/experimental/android-3.9' into merge-android
Conflicts:
arch/arm/common/Makefile
arch/arm/include/asm/irq.h
arch/arm/include/asm/smp.h
arch/arm/kernel/smp.c
|
|
|
|
Conflicting files:
|
|
Conflicting files:
|
|
|
|
commit 509eb76ebf9771abc9fe51859382df2571f11447 upstream.
__my_cpu_offset is non-volatile, since we want its value to be cached
when we access several per-cpu variables in a row with preemption
disabled. This means that we rely on preempt_{en,dis}able to hazard
with the operation via the barrier() macro, so that we can't end up
migrating CPUs without reloading the per-cpu offset.
Unfortunately, GCC doesn't treat a "memory" clobber on a non-volatile
asm block as a side-effect, and will happily re-order it before other
memory clobbers (including those in prempt_disable()) and cache the
value. This has been observed to break the cmpxchg logic in the slub
allocator, leading to livelock in kmem_cache_alloc in mainline kernels.
This patch adds a dummy memory input operand to __my_cpu_offset,
forcing it to be ordered with respect to the barrier() macro.
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Cc: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This patch determines the (mpidr,irq) pairings associated with each
PMU from the DT at probe time, and uses this information to work
out which IRQs to request for which logical CPU when enabling an
event on a PMU.
This patch also ensures that each PMU's init function is called on
a CPU of the correct type.
Previously, this was relying on luck.
Signed-off-by: Dave Martin <dave.martin@linaro.org>
|
|
This patch adds preliminary, highly experimental perf support for
CONFIG_BL_SWITCHER=y.
In this configuration, every PMU is registered as valid for every
logical CPU, in a way which covers all the combinations which will
be seen at runtime, regardless of whether the switcher is enabled
or not.
Tracking of which PMUs are active at a given point in time is
delegated to the lower-level abstractions in perf_event_v7.c.
Warning: this patch does not handle PMU interrupt affinities
correctly.
Because of the way the switcher pairs up CPUs, this
does not cause a problem when the switcher is active; however,
interrupts may be directed to the wrong CPU when the switcher is
disabled.
This will result in spurious interrupts and wrong event
counts.
Signed-off-by: Dave Martin <dave.martin@linaro.org>
|
|
We need a allocate some per-cpu-pmu data outside atomic context,
along with other actions required for setting up the cpu_pmu
struct.
This code does not need to run on any particular CPU, so
we call this after the per-CPU init method is called.
Signed-off-by: Dave Martin <dave.martin@linaro.org>
|
|
This patch aims to provide basic register file functionality for
ARMv7 CPU PMUs while the PMU is offline.
It is incomplete and
lacks the necessary plumbing to actually make use of this, but the
extra code needed is not expected to be large or complex.
Save/restore are ported over the register emulation framework,
since the offline logical state of the CPU matches exactly what
needs to be captures in save/restore.
Because this patch is rather invasive, it should be
dropped in the future in favour of higher-level abstraction before merging upstream.
Signed-off-by: Dave Martin <dave.martin@linaro.org>
|
|
In a system where Linux logical CPUs can migrate between different
physical CPUs, multiple CPU PMUs can logically count events for
each logical CPU, as logical CPUs migrate from one cluster to
another.
This patch allows multiple PMUs to be registered against each CPU.
The pairing of a PMU and a CPU is reperesented by a struct
arm_cpu_pmu, with existing per-CPU state used by perf moving into
this structure.
arm_cpu_pmus are per-cpu-allocated, and hang off
the relevant arm_pmu structure.
This arrangement allows us to find all the CPU-PMU pairings for a
given PMU, but not for a given CPU.
Do do the latter, a list of
all registered CPU PMUs is maintained, and we iterate over that
when we need to find all of a CPU's CPU PMUs.
This is not elegent,
but it shouldn't be a heavy cost since the number of different CPU
PMUs across the system is currently expected to be low (i.e., 2 or
fewer).
This could be improved later.
As a side-effect, the get_hw_events() method no longer has enough
context to provide an answer, because there may be multiple
candidate PMUs for a CPU.
This patch adds the struct arm_pmu * for
the relevant PMU to this interface to resolve this problem,
resulting in trivial changes to various ARM PMU implementations.
Signed-off-by: Dave Martin <dave.martin@linaro.org>
|
|
This adds core support for saving and restoring CPU PMU registers
for suspend/resume support i.e. deeper C-states in cpuidle terms.
This patch adds support only to ARMv7 PMU registers save/restore.
It needs to be extended to xscale and ARMv6 if needed.
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
|
|
In a system with multiple heterogeneous CPU PMUs and each PMUs can handle
events on a subset of CPUs, probably belonging a the same cluster.
This patch introduces a cpumask to track which CPUs each PMU supports.
It also updates armpmu_event_init to reject cpu-specific events being
initialised for unsupported CPUs. Since process-specific events can be
initialised for all the CPU PMUs,armpmu_start/stop/add are modified to
prevent from being added on unsupported CPUs.
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
|
|
When the switcher is active, there is no straightforward way to
figure out which logical CPU a given physical CPU maps to.
This patch provides a function
bL_switcher_get_logical_index(mpidr), which is analogous to
get_logical_index().
This function returns the logical CPU on which the specified
physical CPU is grouped (or -EINVAL if unknown).
If the switcher
is inactive or not present, -EUNATCH is returned instead.
Signed-off-by: Dave Martin <dave.martin@linaro.org>
|
|
Conflicts:
arch/arm/Kconfig
arch/arm/common/Makefile
|
|
old/new value
commit 6eabb3301b1facee669d9938f7c5a0295c21d71d upstream.
The implementation of cmpxchg64() for the ARM v6 and v7 architecture
casts parameter 2 and 3 (the old and new 64bit values) to an unsigned
long before calling the atomic_cmpxchg64() function. This clears
the top 32 bits of the old and new values, resulting in the wrong
values being compare-exchanged. Luckily, this only appears to be used
for 64-bit sched_clock, which we don't (yet) have on ARM.
This bug was introduced by commit 3e0f5a15f500 ("ARM: 7404/1: cmpxchg64:
use atomic64 and local64 routines for cmpxchg64").
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Jaccon Bastiaansen <jaccon.bastiaansen@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This patch exports a bL_switcher_trace_trigger() function to
provide a means for drivers using the trace events to get the
current status when starting a trace session.
Calling this function is equivalent to pinging the trace_trigger
file in sysfs.
Signed-off-by: Dave Martin <dave.martin@linaro.org>
|
|
Some subsystems will need to respond synchronously to runtime
enabling and disabling of the switcher.
This patch adds a dedicated notifier interface to support such
subsystems. Pre- and post- enable/disable notifications are sent
to registered callbacks, allowing safe transition of non-b.L-
transparent subsystems across these control transitions.
Notifier callbacks may veto switcher (de)activation on pre notifications
only. Post notifications won't revert the action.
If enabling or disabling of the switcher fails after the pre-change
notification has been sent, subsystems which have registered
notifiers can be left in an inappropriate state.
This patch sends a suitable post-change notification on failure,
indicating that the old state has been reestablished.
For example, a failed initialisation will result in the following
sequence:
BL_NOTIFY_PRE_ENABLE
/* switcher initialisation fails */
BL_NOTIFY_POST_DISABLE
It is the responsibility of notified subsystems to respond in an
appropriate way.
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Signed-off-by: Nicolas Pitre <nico@linaro.org>
|
|
Some subsystems will need to know for sure whether the switcher is
enabled or disabled during certain critical regions.
This patch provides a simple mutex-based mechanism to discover
whether the switcher is enabled and temporarily lock out further
enable/disable:
* bL_switcher_get_enabled() returns true iff the switcher is
enabled and temporarily inhibits enable/disable.
* bL_switcher_put_enabled() permits enable/disable of the switcher
again after a previous call to bL_switcher_get_enabled().
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Signed-off-by: Nicolas Pitre <nico@linaro.org>
|
|
This allows to poke a predetermined value into a specific address
upon entering the early boot code in bL_head.S.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
|
|
We need a mechanism to let an inbound CPU signal that it is alive before
even getting into the kernel environment i.e. from early assembly code.
Using an IPI is the simplest way to achieve that.
This adds some basic infrastructure to register a struct completion
pointer to be "completed" when the dedicated IPI for this task is
received.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
|
|
The workqueues are problematic as they may be contended.
They can't be scheduled with top priority either. Also the optimization
in bL_switch_request() to skip the workqueue entirely when the target CPU
and the calling CPU were the same didn't allow for bL_switch_request() to
be called from atomic context, as might be the case for some cpufreq
drivers.
Let's move to dedicated kthreads instead.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
|
|
The main entry point for a switch request is:
void bL_switch_request(unsigned int cpu, unsigned int new_cluster_id)
If the calling CPU is not the wanted one, this wrapper takes care of
sending the request to the appropriate CPU with schedule_work_on().
In the future, some switching related tasks which do not require a
strict CPU affinity might be moved here though.
At the moment the core switch operation is handled by bL_switch_to()
which must be called on the CPU for which a switch is requested.
What this code does:
* Return early if the current cluster is the wanted one.
* Close the gate in the kernel entry vector for both the inbound
and outbound CPUs.
* Wake up the inbound CPU so it can perform its reset sequence in
parallel up to the kernel entry vector gate.
* Migrate all interrupts in the GIC targeting the outbound CPU
interface to the inbound CPU interface, including SGIs. This is
performed by gic_migrate_target() in arch/arm/common/gic.c.
* Shut down the local timer for the outbound CPU.
* Call cpu_pm_enter() which takes care of flushing the VFP state to
RAM and save the CPU interface config from the GIC to RAM.
* Call cpu_suspend() which saves the CPU state (general purpose
registers, page table address) onto the stack and store the
resulting stack pointer in an array indexed by processor number,
then call the provided shutdown function. This happens in
arch/arm/kernel/sleep.S.
At this point, the provided shutdown function executed by the outbound
CPU ungates the inbound CPU. Therefore the inbound CPU:
* Picks up the saved stack pointer in the array indexed by
processor number above. At the moment the corresponding code in
arch/arm/kernel/sleep.S only looks at the CPU number field in the
MPIDR so the current code works unmodified even if the new CPU
comes from a different cluster.
* The MMU and caches are re-enabled using the saved state on the
provided stack, just like if this was a resume operation from a
suspended state.
* Then cpu_suspend() returns, although this is on the inbound CPU
rather than the outbound CPU which called it initially.
* The function cpu_pm_exit() is called which effect is to restore the
CPU interface state in the GIC using the state previously saved by
the outbound CPU.
* The local timer on the inbound CPU is restored.
* Exit of bL_switch_to() to resume normal kernel execution on the
new CPU.
However, the outbound CPU is potentially still running in parallel while
the inbound CPU is resuming normal kernel execution, hence we need
per CPU stack isolation to execute bL_do_switch(). After the outbound
CPU has ungated the inbound CPU, it calls bL_cpu_power_down() to:
* Clean its L1 cache.
* If it is the last CPU still alive in its cluster (last man standing),
it also cleans its L2 cache and disables cache snooping from the other
cluster.
Code called from bL_do_switch() might end up referencing 'current' for
some reasons. However, 'current' is derived from the stack pointer.
With any arbitrary stack, the returned value for 'current' and any
dereferenced values through it are just random garbage which may lead to
segmentation faults.
The active page table during the execution of bL_do_switch() is also a
problem. There is no guarantee that the inbound CPU won't destroy the
corresponding task which would free the attached page table while the
outbound CPU is still running and relying on it.
To solve both issues, we borrow some of the task space belonging to
the init/idle task which, by its nature, is lightly used and therefore
is unlikely to clash with our usage. The init task is also never going
away.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
|
|
|
|
|
|
|
|
Add a new 'smp_init' hook to machine_desc so platforms can specify a
function to be used to setup smp ops instead of having a statically
defined value. The hook must return true when smp_ops are initialized.
If false the static mdesc->smp_ops will be used by default.
Add the definition of "bool" by including the linux/types.h file to
asm/mach/arch.h and make it self-contained.
Signed-off-by: Jon Medhurst <tixy@linaro.org>
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
|
|
mcpm-merge-nico
|
|
commit 104ad3b32d7a71941c8ab2dee78eea38e8a23309 upstream.
ARM processors with LPAE enabled use 3 levels of page tables, with an
entry in the top level (pgd) covering 1GB of virtual space. Because of
the branch relocation limitations on ARM, the loadable modules are
mapped 16MB below PAGE_OFFSET, making the corresponding 1GB pgd shared
between kernel modules and user space.
If free_pgtables() is called with the default ceiling 0,
free_pgd_range() (and subsequently called functions) also frees the page
table shared between user space and kernel modules (which is normally
handled by the ARM-specific pgd_free() function). This patch changes
defines the ARM USER_PGTABLES_CEILING to TASK_SIZE when CONFIG_ARM_LPAE
is enabled.
Note that the pgd_free() function already checks the presence of the
shared pmd page allocated by pgd_alloc() and frees it, though with
ceiling 0 this wasn't necessary.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Change-Id: Icb73d60324ad0ddfc3e8a450a28bb3d90c702788
Signed-off-by: Arve Hjønnevåg <arve@android.com>
|
|
Change-Id: Ib91208a2c33621aa2d7bd9aa72bfbc670d9d5f1d
Signed-off-by: Arve Hjønnevåg <arve@android.com>
|
|
Change-Id: I520dfb6e593dac131de8b9b1db77f1c734f18c24
Signed-off-by: Arve Hjønnevåg <arve@android.com>
|
|
Change-Id: Iff964ba2f6236ed81863e02ec7b3ec9fbc48044a
Signed-off-by: Arve Hjønnevåg <arve@android.com>
|
|
Change-Id: If2cb7928d0711f48348443d882a12416be9c5910
Signed-off-by: Arve Hjønnevåg <arve@android.com>
|
|
If more than one ETM or PTM are present, configure all of them
and enable the formatter in the ETB. This allows tracing on dual
core systems (e.g. omap4).
Change-Id: I028657d5cf2bee1b23f193d4387b607953b35888
Signed-off-by: Arve Hjønnevåg <arve@android.com>
|
|
The old code enabled data tracing, but did not configure the
range. We now configure it to trace all data addresses by default,
and add a trace_data_range attribute to change the range or disable
data tracing.
Change-Id: I9d04e3e1ea0d0b4d4d5bcb93b1b042938ad738b2
Signed-off-by: Arve Hjønnevåg <arve@android.com>
|
|
Change-Id: Ibb536c88f0dbaf4766d0599296907e35e42cbfd6
Signed-off-by: Iliyan Malchev <malchev@google.com>
Signed-off-by: Arve Hjønnevåg <arve@android.com>
|
|
Change-Id: I27d2554e07d9de204e0a06696d38db51608d9f6b
Signed-off-by: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Colin Cross <ccross@android.com>
|
|
Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>
|
|
Change-Id: I9f10244b0603f7842b8504a16124d40dc4a71ed2
Signed-off-by: Colin Cross <ccross@android.com>
|
|
This patch implements CONFIG_DEBUG_RODATA, allowing
the kernel text section to be marked read-only in
order to catch bugs that write over the kernel. This
requires mapping the kernel code, plus up to 4MB, using
pages instead of sections, which can increase TLB
pressure.
The kernel is normally mapped using 1MB section entries
in the first level page table, and the first level page
table is copied into every mm. This prevents marking
the kernel text read-only, because the 1MB section
entries are too large granularity to separate the init
section, which is reused as read-write memory after
init, and the kernel text section. Also, the top level
page table for every process would need to be updated,
which is not possible to do safely and efficiently on SMP.
To solve both problems, allow alloc_init_pte to overwrite
an existing section entry with a fully-populated second
level page table. When CONFIG_DEBUG_RODATA is set, all
the section entries that overlap the kernel text section
will be replaced with page mappings. The kernel always
uses a pair of 2MB-aligned 1MB sections, so up to 2MB
of memory before and after the kernel may end up page
mapped.
When the top level page tables are copied into each
process the second level page tables are not copied,
leaving a single second level page table that will
affect all processes on all cpus. To mark a page
read-only, the second level page table is located using
the pointer in the first level page table for the
current process, and the supervisor RO bit is flipped
atomically. Once all pages have been updated, all TLBs
are flushed to ensure the changes are visible on all
cpus.
If CONFIG_DEBUG_RODATA is not set, the kernel will be
mapped using the normal 1MB section entries.
Change-Id: I94fae337f882c2e123abaf8e1082c29cd5d483c6
Signed-off-by: Colin Cross <ccross@android.com>
|
|
Based on a rough patch by frank.rowand@am.sony.com
Since ARM doesn't have an NMI (fiq's are not always available),
send an IPI to all other CPUs (current cpu prints the stack directly)
to capture a backtrace.
Change-Id: I8b163c8cec05d521b433ae133795865e8a33d4e2
Signed-off-by: Dima Zavin <dima@android.com>
|
|
ARM errata 727915 for PL310 has been updated to include a new
workaround required for PL310 r2p0 for l2x0_flush_all, which also
affects l2x0_clean_all in my testing. For r2p0, clean or flush
each set/way individually. For r3p0 or greater, use the debug
register for cleaning and flushing.
Requires exporting the cache_id, sets and ways detected in the
init function for later use.
Change-Id: I215055cbe5dc7e4e8184fb2befc4aff672ef0a12
Signed-off-by: Colin Cross <ccross@android.com>
|
|
These functions have been introduced by commit 10a8c383 (irq: introduce
entry and exit functions for chained handlers) in asm/mach/irq.h. This
patch moves them to linux/irqchip/chained_irq.h so that generic irqchip
drivers do not rely on architecture specific header files.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <rob.herring@calxeda.com>
|
|
This patch prepares the removal of <asm/mach/irq.h> include in the
GIC and VIC irqchip drivers.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <rob.herring@calxeda.com>
|
|
Move the private set_auxcr/get_auxcr functions from
drivers/cpuidle/cpuidle-calxeda.c so they can be used across platforms.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Tony Lindgren <tony@atomide.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
|
|
This patch adds a probe function to check if the secure firmware has an
implementation of the Power State Coordination Interface.
'bL_platform_power_ops' will be implemented by:
a. a native backend when Linux runs in secure world
b. a psci backend which relies on the secure firmware to implement the
power ops
presence of b. will be indicated by the psci device node in the device tree.
The device node is expected to be populated by the secure firmware if it
supports psci. If the native backend detects a psci node then it bails out
allowing the psci backend to be registered.
Also a dummy 'psci_probe' function is added for the case when psci support
is not included. This prevents the build from breaking for tc2 and the
rtsm platforms.
Signed-off-by: Achin Gupta <achin.gupta@arm.com>
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
|
|
This patch defines constants to allow callers of the psci 'suspend'
& 'off' calls specify supported affinity levels.
Signed-off-by: Achin Gupta <achin.gupta@arm.com>
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
|
|
|
|
This is cleaner than exporting the mcpm_smp_ops structure.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Jon Medhurst <tixy@linaro.org>
|