aboutsummaryrefslogtreecommitdiff
path: root/arch/x86/include/asm/cpufeature.h
AgeCommit message (Collapse)Author
2015-04-26x86_64, asm: Work around AMD SYSRET SS descriptor attribute issueAndy Lutomirski
AMD CPUs don't reinitialize the SS descriptor on SYSRET, so SYSRET with SS == 0 results in an invalid usermode state in which SS is apparently equal to __USER_DS but causes #SS if used. Work around the issue by setting SS to __KERNEL_DS __switch_to, thus ensuring that SYSRET never happens with SS set to NULL. This was exposed by a recent vDSO cleanup. Fixes: e7d6eefaaa44 x86/vdso32/syscall.S: Do not load __USER32_DS to %ss Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Peter Anvin <hpa@zytor.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <vda.linux@googlemail.com> Cc: Brian Gerst <brgerst@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-14Merge branch 'perf-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf changes from Ingo Molnar: "Core kernel changes: - One of the more interesting features in this cycle is the ability to attach eBPF programs (user-defined, sandboxed bytecode executed by the kernel) to kprobes. This allows user-defined instrumentation on a live kernel image that can never crash, hang or interfere with the kernel negatively. (Right now it's limited to root-only, but in the future we might allow unprivileged use as well.) (Alexei Starovoitov) - Another non-trivial feature is per event clockid support: this allows, amongst other things, the selection of different clock sources for event timestamps traced via perf. This feature is sought by people who'd like to merge perf generated events with external events that were measured with different clocks: - cluster wide profiling - for system wide tracing with user-space events, - JIT profiling events etc. Matching perf tooling support is added as well, available via the -k, --clockid <clockid> parameter to perf record et al. (Peter Zijlstra) Hardware enablement kernel changes: - x86 Intel Processor Trace (PT) support: which is a hardware tracer on steroids, available on Broadwell CPUs. The hardware trace stream is directly output into the user-space ring-buffer, using the 'AUX' data format extension that was added to the perf core to support hardware constraints such as the necessity to have the tracing buffer physically contiguous. This patch-set was developed for two years and this is the result. A simple way to make use of this is to use BTS tracing, the PT driver emulates BTS output - available via the 'intel_bts' PMU. More explicit PT specific tooling support is in the works as well - will probably be ready by 4.2. (Alexander Shishkin, Peter Zijlstra) - x86 Intel Cache QoS Monitoring (CQM) support: this is a hardware feature of Intel Xeon CPUs that allows the measurement and allocation/partitioning of caches to individual workloads. These kernel changes expose the measurement side as a new PMU driver, which exposes various QoS related PMU events. (The partitioning change is work in progress and is planned to be merged as a cgroup extension.) (Matt Fleming, Peter Zijlstra; CPU feature detection by Peter P Waskiewicz Jr) - x86 Intel Haswell LBR call stack support: this is a new Haswell feature that allows the hardware recording of call chains, plus tooling support. To activate this feature you have to enable it via the new 'lbr' call-graph recording option: perf record --call-graph lbr perf report or: perf top --call-graph lbr This hardware feature is a lot faster than stack walk or dwarf based unwinding, but has some limitations: - It reuses the current LBR facility, so LBR call stack and branch record can not be enabled at the same time. - It is only available for user-space callchains. (Yan, Zheng) - x86 Intel Broadwell CPU support and various event constraints and event table fixes for earlier models. (Andi Kleen) - x86 Intel HT CPUs event scheduling workarounds. This is a complex CPU bug affecting the SNB,IVB,HSW families that results in counter value corruption. The mitigation code is automatically enabled and is transparent. (Maria Dimakopoulou, Stephane Eranian) The perf tooling side had a ton of changes in this cycle as well, so I'm only able to list the user visible changes here, in addition to the tooling changes outlined above: User visible changes affecting all tools: - Improve support of compressed kernel modules (Jiri Olsa) - Save DSO loading errno to better report errors (Arnaldo Carvalho de Melo) - Bash completion for subcommands (Yunlong Song) - Add 'I' event modifier for perf_event_attr.exclude_idle bit (Jiri Olsa) - Support missing -f to override perf.data file ownership. (Yunlong Song) - Show the first event with an invalid filter (David Ahern, Arnaldo Carvalho de Melo) User visible changes in individual tools: 'perf data': New tool for converting perf.data to other formats, initially for the CTF (Common Trace Format) from LTTng (Jiri Olsa, Sebastian Siewior) 'perf diff': Add --kallsyms option (David Ahern) 'perf list': Allow listing events with 'tracepoint' prefix (Yunlong Song) Sort the output of the command (Yunlong Song) 'perf kmem': Respect -i option (Jiri Olsa) Print big numbers using thousands' group (Namhyung Kim) Allow -v option (Namhyung Kim) Fix alignment of slab result table (Namhyung Kim) 'perf probe': Support multiple probes on different binaries on the same command line (Masami Hiramatsu) Support unnamed union/structure members data collection. (Masami Hiramatsu) Check kprobes blacklist when adding new events. (Masami Hiramatsu) 'perf record': Teach 'perf record' about perf_event_attr.clockid (Peter Zijlstra) Support recording running/enabled time (Andi Kleen) 'perf sched': Improve the performance of 'perf sched replay' on high CPU core count machines (Yunlong Song) 'perf report' and 'perf top': Allow annotating entries in callchains in the hists browser (Arnaldo Carvalho de Melo) Indicate which callchain entries are annotated in the TUI hists browser (Arnaldo Carvalho de Melo) Add pid/tid filtering to 'report' and 'script' commands (David Ahern) Consider PERF_RECORD_ events with cpumode == 0 in 'perf top', removing one cause of long term memory usage buildup, i.e. not processing PERF_RECORD_EXIT events (Arnaldo Carvalho de Melo) 'perf stat': Report unsupported events properly (Suzuki K. Poulose) Output running time and run/enabled ratio in CSV mode (Andi Kleen) 'perf trace': Handle legacy syscalls tracepoints (David Ahern, Arnaldo Carvalho de Melo) Only insert blank duration bracket when tracing syscalls (Arnaldo Carvalho de Melo) Filter out the trace pid when no threads are specified (Arnaldo Carvalho de Melo) Dump stack on segfaults (Arnaldo Carvalho de Melo) No need to explicitely enable evsels for workload started from perf, let it be enabled via perf_event_attr.enable_on_exec, removing some events that take place in the 'perf trace' before a workload is really started by it. (Arnaldo Carvalho de Melo) Allow mixing with tracepoints and suppressing plain syscalls. (Arnaldo Carvalho de Melo) There's also been a ton of infrastructure work done, such as the split-out of perf's build system into tools/build/ and other changes - see the shortlog and changelog for details" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (358 commits) perf/x86/intel/pt: Clean up the control flow in pt_pmu_hw_init() perf evlist: Fix type for references to data_head/tail perf probe: Check the orphaned -x option perf probe: Support multiple probes on different binaries perf buildid-list: Fix segfault when show DSOs with hits perf tools: Fix cross-endian analysis perf tools: Fix error path to do closedir() when synthesizing threads perf tools: Fix synthesizing fork_event.ppid for non-main thread perf tools: Add 'I' event modifier for exclude_idle bit perf report: Don't call map__kmap if map is NULL. perf tests: Fix attr tests perf probe: Fix ARM 32 building error perf tools: Merge all perf_event_attr print functions perf record: Add clockid parameter perf sched replay: Use replay_repeat to calculate the runavg of cpu usage instead of the default value 10 perf sched replay: Support using -f to override perf.data file ownership perf sched replay: Fix the EMFILE error caused by the limitation of the maximum open files perf sched replay: Handle the dead halt of sem_wait when create_tasks() fails for any task perf sched replay: Fix the segmentation fault problem caused by pr_err in threads perf sched replay: Realloc the memory of pid_to_task stepwise to adapt to the different pid_max configurations ...
2015-04-03x86/asm: Add support for the CLWB instructionRoss Zwisler
Add support for the new CLWB (cache line write back) instruction. This instruction was announced in the document "Intel Architecture Instruction Set Extensions Programming Reference" with reference number 319433-022. https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf The CLWB instruction is used to write back the contents of dirtied cache lines to memory without evicting the cache lines from the processor's cache hierarchy. This should be used in favor of clflushopt or clflush in cases where you require the cache line to be written to memory but plan to access the data again in the near future. One of the main use cases for this is with persistent memory where CLWB can be used with PCOMMIT to ensure that data has been accepted to memory and is durable on the DIMM. This function shows how to properly use CLWB/CLFLUSHOPT/CLFLUSH and PCOMMIT with appropriate fencing: void flush_and_commit_buffer(void *vaddr, unsigned int size) { void *vend = vaddr + size - 1; for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size) clwb(vaddr); /* Flush any possible final partial cacheline */ clwb(vend); /* * Use SFENCE to order CLWB/CLFLUSHOPT/CLFLUSH cache flushes. * (MFENCE via mb() also works) */ wmb(); /* PCOMMIT and the required SFENCE for ordering */ pcommit_sfence(); } After this function completes the data pointed to by vaddr is has been accepted to memory and will be durable if the vaddr points to persistent memory. Regarding the details of how the alternatives assembly is set up, we need one additional byte at the beginning of the CLFLUSH so that we can flip it into a CLFLUSHOPT by changing that byte into a 0x66 prefix. Two options are to either insert a 1 byte ASM_NOP1, or to add a 1 byte NOP_DS_PREFIX. Both have no functional effect with the plain CLFLUSH, but I've been told that executing a CLFLUSH + prefix should be faster than executing a CLFLUSH + NOP. We had to hard code the assembly for CLWB because, lacking the ability to assemble the CLWB instruction itself, the next closest thing is to have an xsaveopt instruction with a 0x66 prefix. Unfortunately XSAVEOPT itself is also relatively new, and isn't included by all the GCC versions that the kernel needs to support. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Acked-by: Borislav Petkov <bp@suse.de> Acked-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1422377631-8986-3-git-send-email-ross.zwisler@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02x86: Add Intel Processor Trace (INTEL_PT) cpu feature detectionAlexander Shishkin
Intel Processor Trace is an architecture extension that allows for program flow tracing. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kaixu Xia <kaixu.xia@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Robert Richter <rric@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@infradead.org Cc: adrian.hunter@intel.com Cc: kan.liang@intel.com Cc: markus.t.metzger@intel.com Cc: mathieu.poirier@linaro.org Link: http://lkml.kernel.org/r/1421237903-181015-11-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-04Merge tag 'alternatives_padding' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp into x86/asm Pull alternative instructions framework improvements from Borislav Petkov: "A more involved rework of the alternatives framework to be able to pad instructions and thus make using the alternatives macros more straightforward and without having to figure out old and new instruction sizes but have the toolchain figure that out for us. Furthermore, it optimizes JMPs used so that fetch and decode can be relieved with smaller versions of the JMPs, where possible. Some stats: x86_64 defconfig: Alternatives sites total: 2478 Total padding added (in Bytes): 6051 The padding is currently done for: X86_FEATURE_ALWAYS X86_FEATURE_ERMS X86_FEATURE_LFENCE_RDTSC X86_FEATURE_MFENCE_RDTSC X86_FEATURE_SMAP This is with the latest version of the patchset. Of course, on each machine the alternatives sites actually being patched are a proper subset of the total number." Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-25x86: Add support for Intel Cache QoS Monitoring (CQM) detectionPeter P Waskiewicz Jr
This patch adds support for the new Cache QoS Monitoring (CQM) feature found in future Intel Xeon processors. It includes the new values to track CQM resources to the cpuinfo_x86 structure, plus the CPUID detection routines for CQM. CQM allows a process, or set of processes, to be tracked by the CPU to determine the cache usage of that task group. Using this data from the CPU, software can be written to extract this data and report cache usage and occupancy for a particular process, or group of processes. More information about Cache QoS Monitoring can be found in the Intel (R) x86 Architecture Software Developer Manual, section 17.14. Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Borislav Petkov <bp@suse.de> Cc: Chris Webb <chris@arachsys.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Jacob Shin <jacob.w.shin@gmail.com> Cc: Jan Beulich <JBeulich@suse.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kanaka Juvva <kanaka.d.juvva@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Steven Honeyman <stevenhoneyman@gmail.com> Cc: Steven Rostedt <srostedt@redhat.com> Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com> Link: http://lkml.kernel.org/r/1422038748-21397-5-git-send-email-matt@codeblueprint.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-23x86/alternatives: Make JMPs more robustBorislav Petkov
Up until now we had to pay attention to relative JMPs in alternatives about how their relative offset gets computed so that the jump target is still correct. Or, as it is the case for near CALLs (opcode e8), we still have to go and readjust the offset at patching time. What is more, the static_cpu_has_safe() facility had to forcefully generate 5-byte JMPs since we couldn't rely on the compiler to generate properly sized ones so we had to force the longest ones. Worse than that, sometimes it would generate a replacement JMP which is longer than the original one, thus overwriting the beginning of the next instruction at patching time. So, in order to alleviate all that and make using JMPs more straight-forward we go and pad the original instruction in an alternative block with NOPs at build time, should the replacement(s) be longer. This way, alternatives users shouldn't pay special attention so that original and replacement instruction sizes are fine but the assembler would simply add padding where needed and not do anything otherwise. As a second aspect, we go and recompute JMPs at patching time so that we can try to make 5-byte JMPs into two-byte ones if possible. If not, we still have to recompute the offsets as the replacement JMP gets put far away in the .altinstr_replacement section leading to a wrong offset if copied verbatim. For example, on a locally generated kernel image old insn VA: 0xffffffff810014bd, CPU feat: X86_FEATURE_ALWAYS, size: 2 __switch_to: ffffffff810014bd: eb 21 jmp ffffffff810014e0 repl insn: size: 5 ffffffff81d0b23c: e9 b1 62 2f ff jmpq ffffffff810014f2 gets corrected to a 2-byte JMP: apply_alternatives: feat: 3*32+21, old: (ffffffff810014bd, len: 2), repl: (ffffffff81d0b23c, len: 5) alt_insn: e9 b1 62 2f ff recompute_jumps: next_rip: ffffffff81d0b241, tgt_rip: ffffffff810014f2, new_displ: 0x00000033, ret len: 2 converted to: eb 33 90 90 90 and a 5-byte JMP: old insn VA: 0xffffffff81001516, CPU feat: X86_FEATURE_ALWAYS, size: 2 __switch_to: ffffffff81001516: eb 30 jmp ffffffff81001548 repl insn: size: 5 ffffffff81d0b241: e9 10 63 2f ff jmpq ffffffff81001556 gets shortened into a two-byte one: apply_alternatives: feat: 3*32+21, old: (ffffffff81001516, len: 2), repl: (ffffffff81d0b241, len: 5) alt_insn: e9 10 63 2f ff recompute_jumps: next_rip: ffffffff81d0b246, tgt_rip: ffffffff81001556, new_displ: 0x0000003e, ret len: 2 converted to: eb 3e 90 90 90 ... and so on. This leads to a net win of around 40ish replacements * 3 bytes savings =~ 120 bytes of I$ on an AMD guest which means some savings of precious instruction cache bandwidth. The padding to the shorter 2-byte JMPs are single-byte NOPs which on smart microarchitectures means discarding NOPs at decode time and thus freeing up execution bandwidth. Signed-off-by: Borislav Petkov <bp@suse.de>
2015-02-23x86/alternatives: Add instruction paddingBorislav Petkov
Up until now we have always paid attention to make sure the length of the new instruction replacing the old one is at least less or equal to the length of the old instruction. If the new instruction is longer, at the time it replaces the old instruction it will overwrite the beginning of the next instruction in the kernel image and cause your pants to catch fire. So instead of having to pay attention, teach the alternatives framework to pad shorter old instructions with NOPs at buildtime - but only in the case when len(old instruction(s)) < len(new instruction(s)) and add nothing in the >= case. (In that case we do add_nops() when patching). This way the alternatives user shouldn't have to care about instruction sizes and simply use the macros. Add asm ALTERNATIVE* flavor macros too, while at it. Also, we need to save the pad length in a separate struct alt_instr member for NOP optimization and the way to do that reliably is to carry the pad length instead of trying to detect whether we're looking at single-byte NOPs or at pathological instruction offsets like e9 90 90 90 90, for example, which is a valid instruction. Thanks to Michael Matz for the great help with toolchain questions. Signed-off-by: Borislav Petkov <bp@suse.de>
2015-02-20x86/asm: Add support for the pcommit instructionRoss Zwisler
Add support for the new pcommit (persistent commit) instruction. This instruction was announced in the document "Intel Architecture Instruction Set Extensions Programming Reference" with reference number 319433-022: https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf The pcommit instruction ensures that data that has been flushed from the processor's cache hierarchy with clwb, clflushopt or clflush is accepted to memory and is durable on the DIMM. The primary use case for this is persistent memory. This function shows how to properly use clwb/clflushopt/clflush and pcommit with appropriate fencing: void flush_and_commit_buffer(void *vaddr, unsigned int size) { void *vend = vaddr + size - 1; for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size) clwb(vaddr); /* Flush any possible final partial cacheline */ clwb(vend); /* * sfence to order clwb/clflushopt/clflush cache flushes * mfence via mb() also works */ wmb(); /* pcommit and the required sfence for ordering */ pcommit_sfence(); } After this function completes the data pointed to by vaddr is has been accepted to memory and will be durable if the vaddr points to persistent memory. Pcommit must always be ordered by an mfence or sfence, so to help simplify things we include both the pcommit and the required sfence in the alternatives generated by pcommit_sfence(). The other option is to keep them separated, but on platforms that don't support pcommit this would then turn into: void flush_and_commit_buffer(void *vaddr, unsigned int size) { void *vend = vaddr + size - 1; for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size) clwb(vaddr); /* Flush any possible final partial cacheline */ clwb(vend); /* * sfence to order clwb/clflushopt/clflush cache flushes * mfence via mb() also works */ wmb(); nop(); /* from pcommit(), via alternatives */ /* * sfence to order pcommit * mfence via mb() also works */ wmb(); } This is still correct, but now you've got two fences separated by only a nop. With the commit and the fence together in pcommit_sfence() you avoid the final unneeded fence. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Acked-by: Borislav Petkov <bp@suse.de> Acked-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1424367448-24254-1-git-send-email-ross.zwisler@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-01-28Merge branch 'perf/hw_breakpoints' into perf/coreIngo Molnar
The new hw_breakpoint bits are now ready for v3.20, merge them into the main branch, to avoid conflicts. Conflicts: tools/perf/Documentation/perf-record.txt Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-12-03perf/x86/amd: AMD support for bp_len > HW_BREAKPOINT_LEN_8Jacob Shin
Implement hardware breakpoint address mask for AMD Family 16h and above processors. CPUID feature bit indicates hardware support for DRn_ADDR_MASK MSRs. These masks further qualify DRn/DR7 hardware breakpoint addresses to allow matching of larger addresses ranges. Valuable advice and pseudo code from Oleg Nesterov <oleg@redhat.com> Signed-off-by: Jacob Shin <jacob.w.shin@gmail.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: xiakaixu <xiakaixu@huawei.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2014-11-12x86: Add support for Intel HWP feature detection.Dirk Brandewie
Add support of Hardware Managed Performance States (HWP) described in Volume 3 section 14.4 of the SDM. One bit CPUID.06H:EAX[bit 7] expresses the presence of the HWP feature on the processor. The remaining bits CPUID.06H:EAX[bit 8-11] denote the presense of various HWP features. Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-10-14Merge branch 'x86-cpufeature-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cpufeature updates from Ingo Molnar: "This tree includes the following changes: - Introduce DISABLED_MASK to list disabled CPU features, to simplify CPU feature handling and avoid excessive #ifdefs - Remove the lightly used cpu_has_pae() primitive" * 'x86-cpufeature-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Add more disabled features x86: Introduce disabled-features x86: Axe the lightly-used cpu_has_pae
2014-10-08Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "Fixes and features for 3.18. Apart from the usual cleanups, here is the summary of new features: - s390 moves closer towards host large page support - PowerPC has improved support for debugging (both inside the guest and via gdbstub) and support for e6500 processors - ARM/ARM64 support read-only memory (which is necessary to put firmware in emulated NOR flash) - x86 has the usual emulator fixes and nested virtualization improvements (including improved Windows support on Intel and Jailhouse hypervisor support on AMD), adaptive PLE which helps overcommitting of huge guests. Also included are some patches that make KVM more friendly to memory hot-unplug, and fixes for rare caching bugs. Two patches have trivial mm/ parts that were acked by Rik and Andrew. Note: I will soon switch to a subkey for signing purposes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (157 commits) kvm: do not handle APIC access page if in-kernel irqchip is not in use KVM: s390: count vcpu wakeups in stat.halt_wakeup KVM: s390/facilities: allow TOD-CLOCK steering facility bit KVM: PPC: BOOK3S: HV: CMA: Reserve cma region only in hypervisor mode arm/arm64: KVM: Report correct FSC for unsupported fault types arm/arm64: KVM: Fix VTTBR_BADDR_MASK and pgd alloc kvm: Fix kvm_get_page_retry_io __gup retval check arm/arm64: KVM: Fix set_clear_sgi_pend_reg offset kvm: x86: Unpin and remove kvm_arch->apic_access_page kvm: vmx: Implement set_apic_access_page_addr kvm: x86: Add request bit to reload APIC access page address kvm: Add arch specific mmu notifier for page invalidation kvm: Rename make_all_cpus_request() to kvm_make_all_cpus_request() and make it non-static kvm: Fix page ageing bugs kvm/x86/mmu: Pass gfn and level to rmapp callback. x86: kvm: use alternatives for VMCALL vs. VMMCALL if kernel text is read-only kvm: x86: use macros to compute bank MSRs KVM: x86: Remove debug assertion of non-PAE reserved bits kvm: don't take vcpu mutex for obviously invalid vcpu ioctls kvm: Faults which trigger IO release the mmap_sem ...
2014-09-24x86: kvm: use alternatives for VMCALL vs. VMMCALL if kernel text is read-onlyPaolo Bonzini
On x86_64, kernel text mappings are mapped read-only with CONFIG_DEBUG_RODATA. In that case, KVM will fail to patch VMCALL instructions to VMMCALL as required on AMD processors. The failure mode is currently a divide-by-zero exception, which obviously is a KVM bug that has to be fixed. However, picking the right instruction between VMCALL and VMMCALL will be faster and will help if you cannot upgrade the hypervisor. Reported-by: Chris Webb <chris@arachsys.com> Tested-by: Chris Webb <chris@arachsys.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-11x86: Add more disabled featuresDave Hansen
The original motivation for these patches was for an Intel CPU feature called MPX. The patch to add a disabled feature for it will go in with the other parts of the support. But, in the meantime, there are a few other features than MPX that we can make assumptions about at compile-time based on compile options. Add them to disabled-features.h and check them with cpu_feature_enabled(). Note that this gets rid of the last things that needed an #ifdef CONFIG_X86_64 in cpufeature.h. Yay! Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: http://lkml.kernel.org/r/20140911211524.C0EC332A@viggo.jf.intel.com Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-09-11x86: Introduce disabled-featuresDave Hansen
I believe the REQUIRED_MASK aproach was taken so that it was easier to consult in assembly (arch/x86/kernel/verify_cpu.S). DISABLED_MASK does not have the same restriction, but I implemented it the same way for consistency. We have a REQUIRED_MASK... which does two things: 1. Keeps a list of cpuid bits to check in very early boot and refuse to boot if those are not present. 2. Consulted during cpu_has() checks, which allows us to optimize out things at compile-time. In other words, if we *KNOW* we will not boot with the feature off, then we can safely assume that it will be present forever. But, we don't have a similar mechanism for CPU features which may be present but that we know we will not use. We simply use our existing mechanisms to repeatedly check the status of the bit at runtime (well, the alternatives patching helps here but it does not provide compile-time optimization). Adding a feature to disabled-features.h allows the bit to be checked via a new macro: cpu_feature_enabled(). Note that for features in DISABLED_MASK, checks with this macro have all of the benefits of an #ifdef. Before, we would have done this in a header: #ifdef CONFIG_X86_INTEL_MPX #define cpu_has_mpx cpu_has(X86_FEATURE_MPX) #else #define cpu_has_mpx 0 #endif and this in the code: if (cpu_has_mpx) do_some_mpx_thing(); Now, just add your feature to DISABLED_MASK and you can do this everywhere, and get the same benefits you would have from #ifdefs: if (cpu_feature_enabled(X86_FEATURE_MPX)) do_some_mpx_thing(); We need a new function and *not* a modification to cpu_has() because there are cases where we actually need to check the CPU itself, despite what features the kernel supports. The best example of this is a hypervisor which has no control over what features its guests are using and where the guest does not depend on the host for support. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: http://lkml.kernel.org/r/20140911211513.9E35E931@viggo.jf.intel.com Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-09-11x86: Axe the lightly-used cpu_has_paeDave Hansen
cpu_has_pae is only referenced in one place: the X86_32 kexec code (in a file not even built on 64-bit). It hardly warrants its own macro, or the trouble we go to ensuring that it can't be called in X86_64 code. Axe the macro and replace it with a direct cpu feature check. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: http://lkml.kernel.org/r/20140911211511.AD76E774@viggo.jf.intel.com Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-08-17x86: Support compiling out human-friendly processor feature namesJosh Triplett
The table mapping CPUID bits to human-readable strings takes up a non-trivial amount of space, and only exists to support /proc/cpuinfo and a couple of kernel messages. Since programs depend on the format of /proc/cpuinfo, force inclusion of the table when building with /proc support; otherwise, support omitting that table to save space, in which case the kernel messages will print features numerically instead. In addition to saving 1408 bytes out of vmlinux, this also saves 1373 bytes out of the uncompressed setup code, which contributes directly to the size of bzImage. Signed-off-by: Josh Triplett <josh@joshtriplett.org>
2014-07-14x86, cpu: Kill cpu_has_mpBorislav Petkov
It was used only for checking for some K7s which didn't have MP support, see http://www.hardwaresecrets.com/article/How-to-Transform-an-Athlon-XP-into-an-Athlon-MP/24 and it was unconditionally set on 64-bit for no reason. Kill it. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1403609105-8332-4-git-send-email-bp@alien8.de Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2014-07-14x86/cpufeature: Add bug flags to /proc/cpuinfoBorislav Petkov
Dump the flags which denote we have detected and/or have applied bug workarounds to the CPU we're executing on, in a similar manner to the feature flags. The advantage is that those are not accumulating over time like the CPU features. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1403609105-8332-2-git-send-email-bp@alien8.de Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2014-06-18x86, cpufeature: Convert more "features" to bugsBorislav Petkov
X86_FEATURE_FXSAVE_LEAK, X86_FEATURE_11AP and X86_FEATURE_CLFLUSH_MONITOR are not really features but synthetic bits we use for applying different bug workarounds. Call them what they really are, and make sure they get the proper cross-CPU behavior (OR rather than AND). Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1403042783-23278-1-git-send-email-bp@alien8.de Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-05-29x86/xsaves: Detect xsaves/xrstors featureFenghua Yu
Detect the xsaveopt, xsavec, xgetbv, and xsaves features in processor extended state enumberation sub-leaf (eax=0x0d, ecx=1): Bit 00: XSAVEOPT is available Bit 01: Supports XSAVEC and the compacted form of XRSTOR if set Bit 02: Supports XGETBV with ECX = 1 if set Bit 03: Supports XSAVES/XRSTORS and IA32_XSS if set The above features are defined in the new word 10 in cpu features. The IA32_XSS MSR (index DA0H) contains a state-component bitmap that specifies the state components that software has enabled xsaves and xrstors to manage. If the bit corresponding to a state component is clear in XCR0 | IA32_XSS, xsaves and xrstors will not operate on that state component, regardless of the value of the instruction mask. Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: http://lkml.kernel.org/r/1401387164-43416-3-git-send-email-fenghua.yu@intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-05-29x86/cpufeature.h: Reformat x86 feature macrosFenghua Yu
In each X86 feature macro definition, add one space in front of the word number which is a one-digit number currently. The purpose of reformatting the macros is to align one-digit and two-digit word numbers. Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: http://lkml.kernel.org/r/1401387164-43416-2-git-send-email-fenghua.yu@intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-04-01Merge tag 'driver-core-3.15-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core and sysfs updates from Greg KH: "Here's the big driver core / sysfs update for 3.15-rc1. Lots of kernfs updates to make it useful for other subsystems, and a few other tiny driver core patches. All have been in linux-next for a while" * tag 'driver-core-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (42 commits) Revert "sysfs, driver-core: remove unused {sysfs|device}_schedule_callback_owner()" kernfs: cache atomic_write_len in kernfs_open_file numa: fix NULL pointer access and memory leak in unregister_one_node() Revert "driver core: synchronize device shutdown" kernfs: fix off by one error. kernfs: remove duplicate dir.c at the top dir x86: align x86 arch with generic CPU modalias handling cpu: add generic support for CPU feature based module autoloading sysfs: create bin_attributes under the requested group driver core: unexport static function create_syslog_header firmware: use power efficient workqueue for unloading and aborting fw load firmware: give a protection when map page failed firmware: google memconsole driver fixes firmware: fix google/gsmi duplicate efivars_sysfs_init() drivers/base: delete non-required instances of include <linux/init.h> kernfs: fix kernfs_node_from_dentry() ACPI / platform: drop redundant ACPI_HANDLE check kernfs: fix hash calculation in kernfs_rename_ns() kernfs: add CONFIG_KERNFS sysfs, kobject: add sysfs wrapper for kernfs_enable_ns() ...
2014-02-27x86, cpufeature: Rename X86_FEATURE_CLFLSH to X86_FEATURE_CLFLUSHH. Peter Anvin
We call this "clflush" in /proc/cpuinfo, and have cpu_has_clflush()... let's be consistent and just call it that. Cc: Gleb Natapov <gleb@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Alan Cox <alan@linux.intel.com> Link: http://lkml.kernel.org/n/tip-mlytfzjkvuf739okyn40p8a5@git.kernel.org
2014-02-27x86: Add support for the clflushopt instructionRoss Zwisler
Add support for the new clflushopt instruction. This instruction was announced in the document "Intel Architecture Instruction Set Extensions Programming Reference" with Ref # 319433-018. http://download-software.intel.com/sites/default/files/managed/50/1a/319433-018.pdf [ hpa: changed the feature flag to simply X86_FEATURE_CLFLUSHOPT - if that is what we want to report in /proc/cpuinfo anyway... ] Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Link: http://lkml.kernel.org/r/1393441612-19729-2-git-send-email-ross.zwisler@linux.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-02-20x86, AVX-512: AVX-512 Feature DetectionFenghua Yu
AVX-512 is an extention of AVX2. Its spec can be found at: http://download-software.intel.com/sites/default/files/managed/71/2e/319433-017.pdf This patch detects AVX-512 features by CPUID. Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: http://lkml.kernel.org/r/1392931491-33237-1-git-send-email-fenghua.yu@intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: <stable@vger.kernel.org> # hw enabling
2014-02-18x86: align x86 arch with generic CPU modalias handlingArd Biesheuvel
The x86 CPU feature modalias handling existed before it was reimplemented generically. This patch aligns the x86 handling so that it (a) reuses some more code that is now generic; (b) uses the generic format for the modalias module metadata entry, i.e., it now uses 'cpu:type:x86,venVVVVfamFFFFmodMMMM:feature:,XXXX,YYYY' instead of the 'x86cpu:vendor:VVVV:family:FFFF:model:MMMM:feature:,XXXX,YYYY' that was used before. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-12-06x86, cpufeature: Define the Intel MPX feature flagQiaowei Ren
Define the Intel MPX (Memory Protection Extensions) CPU feature flag in the cpufeature list. Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> Link: http://lkml.kernel.org/r/1386375658-2191-2-git-send-email-qiaowei.ren@intel.com Signed-off-by: Xudong Hao <xudong.hao@intel.com> Signed-off-by: Liu Jinsong <jinsong.liu@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-10-11compiler/gcc4: Add quirk for 'asm goto' miscompilation bugIngo Molnar
Fengguang Wu, Oleg Nesterov and Peter Zijlstra tracked down a kernel crash to a GCC bug: GCC miscompiles certain 'asm goto' constructs, as outlined here: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58670 Implement a workaround suggested by Jakub Jelinek. Reported-and-tested-by: Fengguang Wu <fengguang.wu@intel.com> Reported-by: Oleg Nesterov <oleg@redhat.com> Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Suggested-by: Jakub Jelinek <jakub@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-28x86, cpufeature: Use new CC_HAVE_ASM_GOTOBorislav Petkov
... for checking for "asm goto" compiler support. It is more explicit this way and we cover the cases where distros have backported that support even to gcc versions < 4.5. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1372437701-13351-1-git-send-email-bp@alien8.de Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-06-20x86: Add a static_cpu_has_safe variantBorislav Petkov
We want to use this in early code where alternatives might not have run yet and for that case we fall back to the dynamic boot_cpu_has. For that, force a 5-byte jump since the compiler could be generating differently sized jumps for each label. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1370772454-6106-5-git-send-email-bp@alien8.de Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-06-20x86: Sanity-check static_cpu_has usageBorislav Petkov
static_cpu_has may be used only after alternatives have run. Before that it always returns false if constant folding with __builtin_constant_p() doesn't happen. And you don't want that. This patch is the result of me debugging an issue where I overzealously put static_cpu_has in code which executed before alternatives have run and had to spend some time with scratching head and cursing at the monitor. So add a jump to a warning which screams loudly when we use this function too early. The alternatives patch that check away in conjunction with patching the rest of the kernel image. [ hpa: factored this into its own configuration option. If we want to have an overarching option, it should be an option which selects other options, not as a group option in the source code. ] Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1370772454-6106-4-git-send-email-bp@alien8.de Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-06-20x86, cpu: Add a synthetic, always true, cpu featureBorislav Petkov
This will be used in alternatives later as an always-replace flag. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1370772454-6106-2-git-send-email-bp@alien8.de Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-05-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds
Pull crypto update from Herbert Xu: - XTS mode optimisation for twofish/cast6/camellia/aes on x86 - AVX2/x86_64 implementation for blowfish/twofish/serpent/camellia - SSSE3/AVX/AVX2 optimisations for sha256/sha512 - Added driver for SAHARA2 crypto accelerator - Fix for GMAC when used in non-IPsec secnarios - Added generic CMAC implementation (including IPsec glue) - IP update for crypto/atmel - Support for more than one device in hwrng/timeriomem - Added Broadcom BCM2835 RNG driver - Misc fixes * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (59 commits) crypto: caam - fix job ring cleanup code crypto: camellia - add AVX2/AES-NI/x86_64 assembler implementation of camellia cipher crypto: serpent - add AVX2/x86_64 assembler implementation of serpent cipher crypto: twofish - add AVX2/x86_64 assembler implementation of twofish cipher crypto: blowfish - add AVX2/x86_64 implementation of blowfish cipher crypto: tcrypt - add async cipher speed tests for blowfish crypto: testmgr - extend camellia test-vectors for camellia-aesni/avx2 crypto: aesni_intel - fix Kconfig problem with CRYPTO_GLUE_HELPER_X86 crypto: aesni_intel - add more optimized XTS mode for x86-64 crypto: x86/camellia-aesni-avx - add more optimized XTS code crypto: cast6-avx: use new optimized XTS code crypto: x86/twofish-avx - use optimized XTS code crypto: x86 - add more optimized XTS-mode for serpent-avx xfrm: add rfc4494 AES-CMAC-96 support crypto: add CMAC support to CryptoAPI crypto: testmgr - add empty test vectors for null ciphers crypto: testmgr - add AES GMAC test vectors crypto: gcm - fix rfc4543 to handle async crypto correctly crypto: gcm - make GMAC work when dst and src are different hwrng: timeriomem - added devicetree hooks ...
2013-04-30Merge tag 'pm+acpi-3.10-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management and ACPI updates from Rafael J Wysocki: - ARM big.LITTLE cpufreq driver from Viresh Kumar. - exynos5440 cpufreq driver from Amit Daniel Kachhap. - cpufreq core cleanup and code consolidation from Viresh Kumar and Stratos Karafotis. - cpufreq scalability improvement from Nathan Zimmer. - AMD "frequency sensitivity feedback" powersave bias for the ondemand cpufreq governor from Jacob Shin. - cpuidle code consolidation and cleanups from Daniel Lezcano. - ARM OMAP cpuidle fixes from Santosh Shilimkar and Daniel Lezcano. - ACPICA fixes and other improvements from Bob Moore, Jung-uk Kim, Lv Zheng, Yinghai Lu, Tang Chen, Colin Ian King, and Linn Crosetto. - ACPI core updates related to hotplug from Toshi Kani, Paul Bolle, Yasuaki Ishimatsu, and Rafael J Wysocki. - Intel Lynxpoint LPSS (Low-Power Subsystem) support improvements from Rafael J Wysocki and Andy Shevchenko. * tag 'pm+acpi-3.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (192 commits) cpufreq: Revert incorrect commit 5800043 cpufreq: MAINTAINERS: Add co-maintainer cpuidle: add maintainer entry ACPI / thermal: do not always return THERMAL_TREND_RAISING for active trip points ARM: s3c64xx: cpuidle: use init/exit common routine cpufreq: pxa2xx: initialize variables ACPI: video: correct acpi_video_bus_add error processing SH: cpuidle: use init/exit common routine ARM: S5pv210: compiling issue, ARM_S5PV210_CPUFREQ needs CONFIG_CPU_FREQ_TABLE=y ACPI: Fix wrong parameter passed to memblock_reserve cpuidle: fix comment format pnp: use %*phC to dump small buffers isapnp: remove debug leftovers ARM: imx: cpuidle: use init/exit common routine ARM: davinci: cpuidle: use init/exit common routine ARM: kirkwood: cpuidle: use init/exit common routine ARM: calxeda: cpuidle: use init/exit common routine ARM: tegra: cpuidle: use init/exit common routine for tegra3 ARM: tegra: cpuidle: use init/exit common routine for tegra2 ARM: OMAP4: cpuidle: use init/exit common routine ...
2013-04-30Merge branch 'x86-cpu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cpuid changes from Ingo Molnar: "The biggest change is x86 CPU bug handling refactoring and cleanups, by Borislav Petkov" * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, CPU, AMD: Drop useless label x86, AMD: Correct {rd,wr}msr_amd_safe warnings x86: Fold-in trivial check_config function x86, cpu: Convert AMD Erratum 400 x86, cpu: Convert AMD Erratum 383 x86, cpu: Convert Cyrix coma bug detection x86, cpu: Convert FDIV bug detection x86, cpu: Convert F00F bug detection x86, cpu: Expand cpufeature facility to include cpu bugs
2013-04-30Merge branch 'timers-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core timer updates from Ingo Molnar: "The main changes in this cycle's merge are: - Implement shadow timekeeper to shorten in kernel reader side blocking, by Thomas Gleixner. - Posix timers enhancements by Pavel Emelyanov: - allocate timer ID per process, so that exact timer ID allocations can be re-created be checkpoint/restore code. - debuggability and tooling (/proc/PID/timers, etc.) improvements. - suspend/resume enhancements by Feng Tang: on certain new Intel Atom processors (Penwell and Cloverview), there is a feature that the TSC won't stop in S3 state, so the TSC value won't be reset to 0 after resume. This can be taken advantage of by the generic via the CLOCK_SOURCE_SUSPEND_NONSTOP flag: instead of using the RTC to recover/approximate sleep time, the main (and precise) clocksource can be used. - Fix /proc/timer_list for 4096 CPUs by Nathan Zimmer: on so many CPUs the file goes beyond 4MB of size and thus the current simplistic seqfile approach fails. Convert /proc/timer_list to a proper seq_file with its own iterator. - Cleanups and refactorings of the core timekeeping code by John Stultz. - International Atomic Clock time is managed by the NTP code internally currently but not exposed externally. Separate the TAI code out and add CLOCK_TAI support and TAI support to the hrtimer and posix-timer code, by John Stultz. - Add deep idle support enhacement to the broadcast clockevents core timer code, by Daniel Lezcano: add an opt-in CLOCK_EVT_FEAT_DYNIRQ clockevents feature (which will be utilized by future clockevents driver updates), which allows the use of IRQ affinities to avoid spurious wakeups of idle CPUs - the right CPU with an expiring timer will be woken. - Add new ARM bcm281xx clocksource driver, by Christian Daudt - ... various other fixes and cleanups" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits) clockevents: Set dummy handler on CPU_DEAD shutdown timekeeping: Update tk->cycle_last in resume posix-timers: Remove unused variable clockevents: Switch into oneshot mode even if broadcast registered late timer_list: Convert timer list to be a proper seq_file timer_list: Split timer_list_show_tickdevices posix-timers: Show sigevent info in proc file posix-timers: Introduce /proc/PID/timers file posix timers: Allocate timer id per process (v2) timekeeping: Make sure to notify hrtimers when TAI offset changes hrtimer: Fix ktime_add_ns() overflow on 32bit architectures hrtimer: Add expiry time overflow check in hrtimer_interrupt timekeeping: Shorten seq_count region timekeeping: Implement a shadow timekeeper timekeeping: Delay update of clock->cycle_last timekeeping: Store cycle_last value in timekeeper struct as well ntp: Remove ntp_lock, using the timekeeping locks to protect ntp state timekeeping: Simplify tai updating from do_adjtimex timekeeping: Hold timekeepering locks in do_adjtimex and hardpps timekeeping: Move ADJ_SETOFFSET to top level do_adjtimex() ...
2013-04-25crypto: blowfish - add AVX2/x86_64 implementation of blowfish cipherJussi Kivilinna
Patch adds AVX2/x86-64 implementation of Blowfish cipher, requiring 32 parallel blocks for input (256 bytes). Table look-ups are performed using vpgatherdd instruction directly from vector registers and thus should be faster than earlier implementations. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2013-04-21perf/x86/amd: Add support for AMD NB and L2I "uncore" countersJacob Shin
Add support for AMD Family 15h [and above] northbridge performance counters. MSRs 0xc0010240 ~ 0xc0010247 are shared across all cores that share a common northbridge. Add support for AMD Family 16h L2 performance counters. MSRs 0xc0010230 ~ 0xc0010237 are shared across all cores that share a common L2 cache. We do not enable counter overflow interrupts. Sampling mode and per-thread events are not supported. Signed-off-by: Jacob Shin <jacob.shin@amd.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Stephane Eranian <eranian@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20130419213428.GA8229@jshin-Toonie Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-04-10cpufreq: AMD "frequency sensitivity feedback" powersave bias for ondemand ↵Jacob Shin
governor Future AMD processors, starting with Family 16h, can provide software with feedback on how the workload may respond to frequency change -- memory-bound workloads will not benefit from higher frequency, where as compute-bound workloads will. This patch enables this "frequency sensitivity feedback" to aid the ondemand governor to make better frequency change decisions by hooking into the powersave bias. Signed-off-by: Jacob Shin <jacob.shin@amd.com> Acked-by: Thomas Renninger <trenn@suse.de> Acked-by: Borislav Petkov <bp@suse.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-02x86, cpu: Convert AMD Erratum 400Borislav Petkov
Convert AMD erratum 400 to the bug infrastructure. Then, retract all exports for modules since they're not needed now and make the AMD erratum checking machinery local to amd.c. Use forward declarations to avoid shuffling too much code around needlessly. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1363788448-31325-7-git-send-email-bp@alien8.de Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2013-04-02x86, cpu: Convert AMD Erratum 383Borislav Petkov
Convert the AMD erratum 383 testing code to the bug infrastructure. This allows keeping the AMD-specific erratum testing machinery private to amd.c and not export symbols to modules needlessly. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1363788448-31325-6-git-send-email-bp@alien8.de Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2013-04-02x86, cpu: Convert Cyrix coma bug detectionBorislav Petkov
... to the new facility. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1363788448-31325-5-git-send-email-bp@alien8.de Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2013-04-02x86, cpu: Convert FDIV bug detectionBorislav Petkov
... to the new facility. Add a reference to the wikipedia article explaining the FDIV test we're doing here. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1363788448-31325-4-git-send-email-bp@alien8.de Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2013-04-02x86, cpu: Convert F00F bug detectionBorislav Petkov
... to using the new facility and drop the cpuinfo_x86 member. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1363788448-31325-3-git-send-email-bp@alien8.de Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2013-04-02x86, cpu: Expand cpufeature facility to include cpu bugsBorislav Petkov
We add another 32-bit vector at the end of the ->x86_capability bitvector which collects bugs present in CPUs. After all, a CPU bug is a kind of a capability, albeit a strange one. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1363788448-31325-2-git-send-email-bp@alien8.de Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2013-03-15x86: Add cpu capability flag X86_FEATURE_NONSTOP_TSC_S3Feng Tang
On some new Intel Atom processors (Penwell and Cloverview), there is a feature that the TSC won't stop in S3 state, say the TSC value won't be reset to 0 after resume. This feature makes TSC a more reliable clocksource and could benefit the timekeeping code during system suspend/resume cycle, so add a flag for it. Signed-off-by: Feng Tang <feng.tang@intel.com> [jstultz: Fix checkpatch warning] Signed-off-by: John Stultz <john.stultz@linaro.org>
2013-02-16perf/x86/amd: Enable northbridge performance counters on AMD family 15hJacob Shin
On AMD family 15h processors, there are 4 new performance counters (in addition to 6 core performance counters) that can be used for counting northbridge events (i.e. DRAM accesses). Their bit fields are almost identical to the core performance counters. However, unlike the core performance counters, these MSRs are shared between multiple cores (that share the same northbridge). We will reuse the same code path as existing family 10h northbridge event constraints handler logic to enforce this sharing. Signed-off-by: Jacob Shin <jacob.shin@amd.com> Acked-by: Stephane Eranian <eranian@google.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Jacob Shin <jacob.shin@amd.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1360171589-6381-7-git-send-email-jacob.shin@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>