path: root/arch
Age | Commit message | Author
2010-05-17 | x86, hweight: Use a 32-bit popcnt for __arch_hweight32() | H. Peter Anvin
Use a 32-bit popcnt instruction for __arch_hweight32(), even on x86-64. Even though the input register will *usually* be zero-extended due to the standard operation of the hardware, it isn't necessarily so if the input value was the result of truncating a 64-bit operation.
Note: the POPCNT32 variant used on x86-64 has a technically unnecessary REX prefix to make it five bytes long, the same as a CALL instruction, therefore avoiding an unnecessary NOP.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <alpine.LFD.2.00.1005171443060.4195@i5.linux-foundation.org>
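The truncation concern described above can be illustrated in portable C. This is only a sketch: popcnt64_model() and popcnt32_model() are hypothetical helpers that model counting over a full 64-bit register image versus only its low 32 bits; they are not the kernel's __arch_hweight32() and do not emit the popcnt instruction.

```c
#include <stdint.h>

/* Hypothetical helper: count set bits across a full 64-bit register image. */
static unsigned int popcnt64_model(uint64_t reg)
{
    unsigned int n = 0;

    while (reg) {
        reg &= reg - 1; /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Hypothetical helper: model a 32-bit popcnt, which ignores bits 63..32. */
static unsigned int popcnt32_model(uint64_t reg)
{
    return popcnt64_model(reg & 0xffffffffu);
}
```

If the upper half of the register still holds leftovers from a 64-bit computation, the two helpers disagree, which is why a 32-bit count is the safer choice for a 32-bit argument.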
2010-04-06 | x86: Add optimized popcnt variants | Borislav Petkov
Add support for the hardware version of the Hamming weight function, popcnt, present in CPUs which advertise it under CPUID, Function 0x0000_0001_ECX[23]. On CPUs which don't support it, we fall back to the default lib/hweight.c sw versions.
A synthetic benchmark comparing popcnt with __sw_hweight64 showed almost a 3x speedup on an F10h machine.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <20100318112015.GC11152@aftab>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
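For readers unfamiliar with a software Hamming weight, the fallback path can be pictured with the classic parallel bit count. This is a minimal sketch of that well-known technique, not a copy of lib/hweight.c; the names demo_sw_hweight32() and demo_sw_hweight64() are hypothetical.

```c
#include <stdint.h>

/* Parallel bit count over 32 bits: sum bits in pairs, nibbles, then bytes. */
static unsigned int demo_sw_hweight32(uint32_t w)
{
    w -= (w >> 1) & 0x55555555u;                      /* 2-bit partial sums */
    w = (w & 0x33333333u) + ((w >> 2) & 0x33333333u); /* 4-bit partial sums */
    w = (w + (w >> 4)) & 0x0f0f0f0fu;                 /* 8-bit partial sums */
    return (w * 0x01010101u) >> 24;                   /* add the four bytes */
}

/* 64-bit weight as the sum of the two 32-bit halves. */
static unsigned int demo_sw_hweight64(uint64_t w)
{
    return demo_sw_hweight32((uint32_t)w) +
           demo_sw_hweight32((uint32_t)(w >> 32));
}
```

A hardware popcnt replaces all of these shifts, masks and the multiply with a single instruction, which is consistent with the kind of speedup reported above.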
2010-04-06 | bitops: Optimize hweight() by making use of compile-time evaluation | Peter Zijlstra
Rename the existing runtime hweight() implementations to __arch_hweight(), rename the compile-time versions to __const_hweight() and then have hweight() pick between them.
Suggested-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20100318111929.GB11152@aftab>
Acked-by: H. Peter Anvin <hpa@zytor.com>
LKML-Reference: <1265028224.24455.154.camel@laptop>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
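The selection between a compile-time and a runtime implementation can be sketched as follows. This assumes a compiler that provides __builtin_constant_p() (GCC and Clang do); all DEMO_ and demo_ names are hypothetical stand-ins and are not the kernel's actual macros.

```c
#include <stdint.h>

/* Constant-foldable weight of the low 8 bits of w. */
#define DEMO_CONST_HWEIGHT8(w)                                         \
    ((unsigned int)(!!((w) & (1ULL << 0)) + !!((w) & (1ULL << 1)) +    \
                    !!((w) & (1ULL << 2)) + !!((w) & (1ULL << 3)) +    \
                    !!((w) & (1ULL << 4)) + !!((w) & (1ULL << 5)) +    \
                    !!((w) & (1ULL << 6)) + !!((w) & (1ULL << 7))))
#define DEMO_CONST_HWEIGHT16(w) \
    (DEMO_CONST_HWEIGHT8(w) + DEMO_CONST_HWEIGHT8((w) >> 8))
#define DEMO_CONST_HWEIGHT32(w) \
    (DEMO_CONST_HWEIGHT16(w) + DEMO_CONST_HWEIGHT16((w) >> 16))

/* Stand-in for the runtime (arch or software) implementation. */
static unsigned int demo_runtime_hweight32(uint32_t w)
{
    unsigned int n = 0;

    while (w) {
        w &= w - 1; /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Pick the foldable form for constants, the runtime form otherwise. */
#define DEMO_HWEIGHT32(w)                               \
    (__builtin_constant_p(w) ? DEMO_CONST_HWEIGHT32(w)  \
                             : demo_runtime_hweight32(w))

/* Example caller whose argument is generally not a compile-time constant. */
static unsigned int demo_count(uint32_t v)
{
    return DEMO_HWEIGHT32(v);
}
```

With a literal argument such as DEMO_HWEIGHT32(0xF0F0u), the compiler can fold the whole expression to a constant, so no call or instruction is emitted at run time.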
2010-04-05 | Merge branch 'master' into export-slabh | Tejun Heo
2010-04-04 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6 | Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
  sunxvr500: Ignore secondary output PCI devices.
  sparc64: Implement perf_arch_fetch_caller_regs
  sparc64: Update defconfig.
  sparc64: Fix array size reported by vmemmap_populate()
  sparc: Fix regset register window handling.
  drivers/serial/sunsu.c: Correct use after free
2010-04-04 | Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip | Linus Torvalds
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf: Always build the powerpc perf_arch_fetch_caller_regs version
  perf: Always build the stub perf_arch_fetch_caller_regs version
  perf, probe-finder: Build fix on Debian
  perf/scripts: Tuple was set from long in both branches in python_process_event()
  perf: Fix 'perf sched record' deadlock
  perf, x86: Fix callgraphs of 32-bit processes on 64-bit kernels
  perf, x86: Fix AMD hotplug & constraint initialization
  x86: Move notify_cpu_starting() callback to a later stage
  x86,kgdb: Always initialize the hw breakpoint attribute
  perf: Use hot regs with software sched switch/migrate events
  perf: Correctly align perf event tracing buffer
2010-04-03 | sparc64: Implement perf_arch_fetch_caller_regs | David S. Miller
We provide regs->tstate, regs->tpc, regs->tnpc and regs->u_regs[UREG_FP].
regs->tstate is necessary for:
  user_mode() (via perf_exclude_event())
  perf_misc_flags() (via perf_prepare_sample())
regs->tpc is necessary for:
  perf_instruction_pointer() (via perf_prepare_sample())
and regs->u_regs[UREG_FP] is necessary for:
  perf_callchain() (via perf_prepare_sample())
The regs->tnpc value is provided just to be tidy.
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-03 | sparc64: Update defconfig. | David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-03 | Merge branch 'master' of /home/davem/src/GIT/linux-2.6/ | David S. Miller
2010-04-03 | sparc64: Fix array size reported by vmemmap_populate() | Ben Hutchings
vmemmap_populate() attempts to report the used index and total size of vmemmap_table, but it wrongly shifts the total size so that it is always shown as 0.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-03 | perf: Always build the powerpc perf_arch_fetch_caller_regs version | Frederic Weisbecker
Now that software events use perf_arch_fetch_caller_regs() too, we need the powerpc version to be always built.
Fixes the following build error:
  (.text+0x3210): undefined reference to `perf_arch_fetch_caller_regs'
  (.text+0x3324): undefined reference to `perf_arch_fetch_caller_regs'
  (.text+0x33bc): undefined reference to `perf_arch_fetch_caller_regs'
  (.text+0x33ec): undefined reference to `perf_arch_fetch_caller_regs'
  (.text+0xd4a0): undefined reference to `perf_arch_fetch_caller_regs'
  arch/powerpc/kernel/built-in.o:(.text+0xd528): more undefined references to `perf_arch_fetch_caller_regs' follow
  make[1]: *** [.tmp_vmlinux1] Error 1
  make: *** [sub-make] Error 2
Reported-by: Michael Ellerman <michael@ellerman.id.au>
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
2010-04-02 | Merge master.kernel.org:/home/rmk/linux-2.6-arm | Linus Torvalds
* master.kernel.org:/home/rmk/linux-2.6-arm:
  ARM: 5965/1: Fix soft lockup in at91 udc driver
  ARM: 6006/1: ARM: Use the correct NOP size in memmove for Thumb-2 kernel builds
  ARM: 6005/1: arm: kprobes: fix register corruption with jprobes
  ARM: 6003/1: removing compilation warning from pl061.h
  ARM: 6001/1: removing compilation warning comming from clkdev.h
  ARM: 6000/1: removing compilation warning comming from <asm/irq.h>
  ARM: 5999/1: Including device.h and resource.h header files in linux/amba/bus.h
  ARM: 5997/1: ARM: Correct the VFPv3 detection
  ARM: 5996/1: ARM: Change the mandatory barriers implementation (4/4)
  ARM: 5995/1: ARM: Add L2x0 outer_sync() support (3/4)
  ARM: 5994/1: ARM: Add outer_cache_fns.sync function pointer (2/4)
  ARM: 5993/1: ARM: Move the outer_cache definitions into a separate file (1/4)
2010-04-02 | Merge branch 'merge' of git://git.secretlab.ca/git/linux-2.6 | Linus Torvalds
* 'merge' of git://git.secretlab.ca/git/linux-2.6:
  powerpc/5200: in lpbfifo, flag DMA irqs as enabled after requesting them
  powerpc/fsl: add device tree binding for QE firmware
  of/flattree: Fix unhandled OF_DT_NOP tag when unflattening the device tree
2010-04-02 | perf, x86: Fix callgraphs of 32-bit processes on 64-bit kernels | Török Edwin
When profiling a 32-bit process on a 64-bit kernel, callgraph tracing stopped after the first function, because it saw a garbage memory address (it tried to interpret the frame pointer and return address as 64-bit pointers).
Fix this by using a struct stack_frame with 32-bit pointers when the TIF_IA32 flag is set.
Note that the TIF_IA32 flag must be used, and not is_compat_task(), because the latter is only set when the 32-bit process is executing a syscall, which may not always be the case (when tracing page fault events, for example).
Signed-off-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
LKML-Reference: <1268820436-13145-1-git-send-email-edwintorok@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
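The idea of walking frames whose fields are 32 bits wide can be sketched in user-space C. This is an illustration under stated assumptions, not the kernel's code: the "user memory" is modelled as a byte array with addresses as offsets into it, and struct demo_frame32, demo_walk_frames32() and demo_two_frames() are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Frame layout with 32-bit fields, as a 32-bit process would lay it out. */
struct demo_frame32 {
    uint32_t next_frame;     /* saved frame pointer of the caller */
    uint32_t return_address; /* where this function returns to */
};

/* Follow the frame chain starting at fp; store up to max return addresses. */
static size_t demo_walk_frames32(const uint8_t *mem, size_t mem_size,
                                 uint32_t fp, uint32_t *out, size_t max)
{
    size_t n = 0;

    while (n < max && fp != 0 &&
           (size_t)fp + sizeof(struct demo_frame32) <= mem_size) {
        struct demo_frame32 f;

        memcpy(&f, mem + fp, sizeof(f)); /* read 8 bytes, not 16 */
        out[n++] = f.return_address;
        if (f.next_frame <= fp) /* chain must move to higher addresses */
            break;
        fp = f.next_frame;
    }
    return n;
}

/* Build two chained frames at offsets 8 and 24 and walk them. */
static size_t demo_two_frames(uint32_t *out, size_t max)
{
    uint8_t mem[64] = {0};
    struct demo_frame32 inner = { 24, 0x08048123u };
    struct demo_frame32 outer = { 0, 0x08048456u };

    memcpy(mem + 8, &inner, sizeof(inner));
    memcpy(mem + 24, &outer, sizeof(outer));
    return demo_walk_frames32(mem, sizeof(mem), 8, out, max);
}
```

Reading the same bytes through a frame structure with 64-bit fields would merge the frame pointer and return address into one bogus pointer, which matches the garbage address described in the commit message.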
2010-04-02 | perf, x86: Fix AMD hotplug & constraint initialization | Peter Zijlstra
Commit 3f6da39 ("perf: Rework and fix the arch CPU-hotplug hooks") moved the amd northbridge allocation from CPUS_ONLINE to CPUS_PREPARE_UP. However, amd_nb_id() doesn't work yet on prepare, so it would simply bail, basically reverting to a state where we do not properly track node wide constraints, causing weird perf results.
Fix up the AMD NorthBridge initialization code by allocating from CPU_UP_PREPARE and installing it from CPU_STARTING once we have the proper nb_id. It also properly deals with the allocation failing.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
[ robustify using amd_has_nb() ]
Signed-off-by: Stephane Eranian <eranian@google.com>
LKML-Reference: <1269353485.5109.48.camel@twins>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-02 | x86: Move notify_cpu_starting() callback to a later stage | Peter Zijlstra
Because we need to have cpu identification things done by the time we run CPU_STARTING notifiers.
( This init ordering will be relied on by the next fix. )
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1269353485.5109.48.camel@twins>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-02 | Merge branch 'perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/urgent | Ingo Molnar
2010-04-02 | Merge branch 'sh/for-2.6.34' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6 | Linus Torvalds
* 'sh/for-2.6.34' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
  sh: Fix up the SH-3 build for recent TLB changes.
  sh: export return_address() symbol.
  sh: Enable the mmu in start_secondary()
  sh: Fix FDPIC binary loader
  arch/sh/kernel: Use set_cpus_allowed_ptr
  sh: Update ecovec_defconfig
  USB gadget r8a66597-udc.c: duplicated include
  sh: update the TLB replacement counter for entry wiring.
2010-04-02 | sh: Fix up the SH-3 build for recent TLB changes. | Paul Mundt
While the MMUCR.URB and ITLB/UTLB differentiation works fine for all SH-4 and later TLBs, these features are absent on SH-3. This splits out local_flush_tlb_all() into SH-4 and PTEAEX copies while restoring the old SH-3 one, subsequently fixing up the build.
This will probably want some further reordering and tidying in the future, but that's out of scope at present.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-04-02 | sh: export return_address() symbol. | Paul Mundt
This is needed with some of the tracing code built as modules, so provide the export.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-04-01 | microblaze: Support word copying in copy_tofrom_user | Michal Simek
Word copying is used only for aligned addresses. There is room for improvement by using a better copying technique; see the memcpy implementation.
Signed-off-by: Michal Simek <monstr@monstr.eu>
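The "word copy only when aligned" strategy can be sketched in portable C. This is an illustration of the general technique rather than the microblaze assembly: demo_copy() is a hypothetical name, a word is assumed to be 32 bits, and the user-access fault handling that copy_tofrom_user needs is left out entirely.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy n bytes, using 32-bit words only if both pointers are word-aligned. */
static void demo_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    if ((((uintptr_t)d | (uintptr_t)s) & (sizeof(uint32_t) - 1)) == 0) {
        while (n >= sizeof(uint32_t)) {
            uint32_t w;

            memcpy(&w, s, sizeof(w)); /* one word load */
            memcpy(d, &w, sizeof(w)); /* one word store */
            d += sizeof(w);
            s += sizeof(w);
            n -= sizeof(w);
        }
    }
    /* Byte loop: handles the tail, or everything if unaligned. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
}
```

A more elaborate version would first copy bytes until the destination becomes aligned and then shift-and-merge source words, which is the kind of improvement the commit message leaves open.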
2010-04-01 | microblaze: Print early printk information to log buffer | Michal Simek
If the early printk console is not enabled, all messages are written to the log buffer.
Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-04-01 | microblaze: head.S typo fix | Michal Simek
I forgot to change the register name in the comments.
Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-04-01 | microblaze: Use MICROBLAZE_TLB_SIZE in asm code | Michal Simek
The TLB size was hardcoded in asm code. This patch makes it possible to change the TLB size in only one place (mmu.h).
Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-04-01 | microblaze: Kconfig Fix - pci | Michal Simek
I forgot to remove the pci Kconfig option.
Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-04-01 | microblaze: Adding likely macros | Michal Simek
Based on GCOV analysis, it is helpful to add likely/unlikely macros.
Signed-off-by: Michal Simek <monstr@monstr.eu>
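Branch hints of this kind are usually built on the compiler's __builtin_expect(). The sketch below assumes GCC or Clang; the macros are prefixed demo_ to make clear they are stand-ins, and demo_checked_div() is a hypothetical example function rather than anything from the microblaze tree.

```c
/* Hint that the condition is expected to be true or false, respectively. */
#define demo_likely(x)   __builtin_expect(!!(x), 1)
#define demo_unlikely(x) __builtin_expect(!!(x), 0)

/* Divide a by b; the zero-divisor error path is marked as the rare case. */
static int demo_checked_div(int a, int b, int *out)
{
    if (demo_unlikely(b == 0))
        return -1; /* rare path: the compiler may move it out of line */

    *out = a / b;
    return 0;
}
```

The hints change only code layout and static branch prediction, never the result, so coverage data such as GCOV output is a reasonable basis for deciding where to place them.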
2010-04-01 | microblaze: Add .type and .size to ASM functions | Michal Simek
Cachegrind analysis needs this fix to be able to log asm functions.
Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-04-01 | microblaze: Fix TLB macros | Michal Simek
To be able to trace TLB operations.
Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-04-01 | microblaze: Use instruction with delay slot | Michal Simek
Sync labels.
Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-04-01 | microblaze: Remove additional resr and rear loading | Michal Simek
RESR and REAR use the same regs in the whole file.
Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-04-01 | microblaze: Change register usage for ESR and EAR | Michal Simek
This change synchronizes register usage in the code.
  ESR = R4
  EAR = R3
Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-04-01 | microblaze: Prepare work for optimization in exception code | Michal Simek
Any sync branch must follow mts instructions, not mfs.
Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-04-01 | microblaze: Add DEBUG option | Michal Simek
Disable debug option in asm code.
Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-04-01 | microblaze: Support systems without lmb bram | Michal Simek
When the system has no lmb bram, main memory should start from zero because of the microblaze vectors.
DTS fragment could look like:
  DDR2_SDRAM: memory@0 {
      device_type = "memory";
      reg = < 0x0 0x10000000 >;
  } ;
Then you have to set CONFIG_KERNEL_BASE_ADDR=0, which causes the kernel physical start address to be zero. The reset vector will hold a jump to 0x100, and the kernel text starts at 0x100.
You have to solve how to load the kernel before the CPU starts. Tested with XMD.
Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-04-01 | microblaze: uaccess: Sync strlen, strnlen, copy_to/from_user | Michal Simek
Last sync.
Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-04-01 | microblaze: uaccess: Unify __copy_tofrom_user | Michal Simek
Move to generic location.
Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-04-01 | microblaze: uaccess: Move functions to generic location | Michal Simek
noMMU and MMU use them.
Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-04-01 | microblaze: uaccess: Fix put_user for noMMU | Michal Simek
There is a small regression on dhrystone tests and, I think, on all benchmarking tests. It is due to the better checking mechanism in the put_user macro.
Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-04-01 | microblaze: uaccess: Fix get_user macro for noMMU | Michal Simek
Use unified version.
Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-04-01 | microblaze: uaccess: fix clear_user for noMMU kernel | Michal Simek
Previous patches fixed only the MMU version; this is the first patch for the noMMU kernel.
Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-04-01 | microblaze: uaccess: Fix strncpy_from_user function | Michal Simek
Generic implementation for noMMU and MMU version.
Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-04-01 | microblaze: uaccess: fix copy_from_user macro | Michal Simek
The copy_from_user macro also uses the copy_tofrom_user function.
Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-04-01 | microblaze: uaccess: copy_to_user unification | Michal Simek
noMMU and MMU kernels will use the copy_tofrom_user asm implementation.
Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-04-01 | microblaze: uaccess: sync put/get/clear_user macros | Michal Simek
Add macro descriptions and re-sort.
Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-04-01 | microblaze: uaccess: fix put_user and get_user macros | Michal Simek
Use FIXUP macros and re-sort them.
Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-04-01 | microblaze: uaccess: fix __get_user_asm macro | Michal Simek
It uses the __FIXUP_SECTION and __EX_TABLE_SECTION macros.
Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-04-01 | microblaze: uaccess: fix clean user macro | Michal Simek
This is the first patch which does uaccess unification. I chose to split it into several patches so that bisect can be used in the future if any fault appears.
Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-04-01 | microblaze: move noMMU __range_ok function to uaccess.h | Michal Simek
The same noMMU and MMU functions should be placed in one file.
Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-04-01 | microblaze: Move exception_table_entry upward | Michal Simek
Just a reordering, to be able to remove the whole block.
Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-04-01 | microblaze: Remove segment.h | Michal Simek
I would like to use asm-generic uaccess.h, where the segment macros are defined. This is just the first step.
Signed-off-by: Michal Simek <monstr@monstr.eu>