llvm/llvm.git - [no description]

Age	Commit message	Author
2017-11-08	[X86] Add some initial scheduling tests for generic x86 instructions	Simon Pilgrim
	These will be using inline asm to ensure we have coverage that we're unlikely to get from lowering of basic IR. Currently waiting for D39728 to land to add support for scheduler comments for inline asm. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317698 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-08	[X86] Add patterns to fold EVEX store with EVEX encoded vcvtps2ph ↵	Craig Topper
	instructions. Remove bad pattern that had v4f32 vcvtps2ph storing 128-bits. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317662 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-08	[X86] Allow legacy vcvtps2ph intrinsics to select EVEX encoded instructions. ↵	Craig Topper
	Rely on EVEX->VEX to convert back. Missed store folding opportunities will be fixed in a subsequent commit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317661 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-08	Attribute nonlazybind should not affect calls to functions with hidden ↵	Sriraman Tallam
	visibility. Differential Revision: https://reviews.llvm.org/D39625 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317639 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-07	Reland "Correct dwarf unwind information in function epilogue for X86"	Petar Jovanovic
	Reland r317100 with minor fix regarding ComputeCommonTailLength function in BranchFolding.cpp. Skipping top CFI instructions block needs to be executed on several more return points in ComputeCommonTailLength(). Original r317100 message: "Correct dwarf unwind information in function epilogue for X86" This patch aims to provide correct dwarf unwind information in function epilogue for X86. It consists of two parts. The first part inserts CFI instructions that set appropriate cfa offset and cfa register in emitEpilogue() in X86FrameLowering. This part is X86 specific. The second part is platform independent and ensures that: - CFI instructions do not affect code generation - Unwind information remains correct when a function is modified by different passes. This is done in a late pass by analyzing information about cfa offset and cfa register in BBs and inserting additional CFI directives where necessary. Changed CFI instructions so that they: - are duplicable - are not counted as instructions when tail duplicating or tail merging - can be compared as equal Added CFIInstrInserter pass: - analyzes each basic block to determine cfa offset and register valid at its entry and exit - verifies that outgoing cfa offset and register of predecessor blocks match incoming values of their successors - inserts additional CFI directives at basic block beginning to correct the rule for calculating CFA Having CFI instructions in function epilogue can cause incorrect CFA calculation rule for some basic blocks. This can happen if, due to basic block reordering, or the existence of multiple epilogue blocks, some of the blocks have wrong cfa offset and register values set by the epilogue block above them. CFIInstrInserter is currently run only on X86, but can be used by any target that implements support for adding CFI instructions in epilogue. Patch by Violeta Vukobrat. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317579 91177308-0d34-0410-b5e6-96231b3b80d8
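	The fix-up step described in this message can be pictured with a small standalone model. The sketch below is not the code of CFIInstrInserter; `CfaRule`, `BlockInfo` and `blocksNeedingFixup` are hypothetical names, and the model assumes the per-block incoming and outgoing rules have already been computed. It only shows why a block laid out directly below an epilogue block may need a corrective directive at its beginning.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified CFA rule: CFA = Register + Offset.
struct CfaRule {
  unsigned Register;
  int Offset;
};

inline bool sameRule(const CfaRule &A, const CfaRule &B) {
  return A.Register == B.Register && A.Offset == B.Offset;
}

// Hypothetical per-block summary: the rule that must hold at block entry and
// the rule in effect after the block's last instruction.
struct BlockInfo {
  CfaRule Incoming;
  CfaRule Outgoing;
};

// Walk the blocks in layout order. Unwind directives take effect in layout
// order, so the rule in effect at a block's start is the outgoing rule of the
// block laid out just before it. Return the indices of blocks whose expected
// incoming rule differs from that, i.e. blocks that would need a corrective
// directive at their beginning.
std::vector<std::size_t>
blocksNeedingFixup(const std::vector<BlockInfo> &Layout) {
  assert(!Layout.empty() && "expected at least an entry block");
  std::vector<std::size_t> Result;
  for (std::size_t I = 1; I < Layout.size(); ++I)
    if (!sameRule(Layout[I].Incoming, Layout[I - 1].Outgoing))
      Result.push_back(I);
  return Result;
}
```

	With an entry block followed by two epilogue blocks that both expect the frame-pointer-based rule, only the second epilogue block is reported, because the first epilogue block above it has already switched the rule back to the stack-pointer-based one.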
2017-11-07	[X86] Regenerate select tests	Simon Pilgrim
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317571 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-07	[X86] Don't clobber reserved registers with stack adjustments	Bjorn Steinbrink
	Summary: Calls using invoke in funclet based functions are assumed to clobber all registers, which causes the stack adjustment using pops to consider all registers not defined by the call to be undefined, which can unfortunately include the base pointer, if one is needed. To prevent this (and possibly other hazards), skip reserved registers when looking for candidate registers. This fixes issue #45034 in the Rust compiler. Reviewers: mkuper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39636 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317551 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-07	[X86] Add patterns to fold a 64-bit load into the EVEX vcvtph2ps instructions.	Craig Topper
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317548 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-07	[X86] Add patterns for folding a v16i8 with the VEX vcvtph2ps intrinsics.	Craig Topper
	Disable the peephole pass to prove that the pattern is working. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317547 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-07	[X86] Add a test for a 128-bit vector load feeding a cvtph2ps intrinsic.	Craig Topper
	The instruction only loads 64-bits, but we should be able to fold a wider load and let it be narrowed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317546 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-07	[X86] Remove alignment from a load in the f16c intrinsic test. The alignment ↵	Craig Topper
	shouldn't be required for load folding. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317545 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-07	[X86] Add support for using EVEX instructions for the legacy vcvtph2ps ↵	Craig Topper
	intrinsics. Looks like there are some missed load folding opportunities for i64 loads. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317544 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-07	[X86] Add AVX512VL command line to f16c intrinsic test to show missed EVEX ↵	Craig Topper
	opportunities for the legacy intrinsics. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317543 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-07	[X86] Use IMPLICIT_DEF in VEX/EVEX vcvtss2sd/vcvtsd2ss patterns instead of a ↵	Craig Topper
	COPY_TO_REGCLASS. ExeDepsFix pass should take care of making the registers match. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317542 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-06	[X86] Make FeatureAVX512 imply FeatureF16C.	Craig Topper
	The EVEX to VEX pass is already assuming this is true under AVX512VL. We had special patterns to use zmm instructions if VLX and F16C weren't available. Instead just make AVX512 imply F16C to make the EVEX to VEX behavior explicitly legal and remove the extra patterns. All known CPUs with AVX512 have F16C so this should be safe for now. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317521 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-06	[MIRPrinter] Use %subreg.xxx syntax for subregister index operands	Bjorn Pettersson
	Summary: Print %subreg.<subregidxname> instead of just the subregister index when printing immediate operands corresponding to subreg indices in INSERT_SUBREG, EXTRACT_SUBREG, SUBREG_TO_REG and REG_SEQUENCE. Reviewers: qcolombet, MatzeB Reviewed By: MatzeB Subscribers: nhaehnle, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D39696 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317513 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-06	[X86][AVX512] Improve lowering of AVX512 test intrinsics	Uriel Korach
	Added TESTM and TESTNM to the list of instructions that already zero the unused upper bits and do not need the redundant shift left and shift right instructions afterwards. Added a pattern for TESTM and TESTNM in iselLowering, so now icmp(neq,and(X,Y), 0) folds into TESTM and icmp(eq,and(X,Y), 0) folds into TESTNM. This commit is a preparation for lowering the test and testn X86 intrinsics to IR. Differential Revision: https://reviews.llvm.org/D38732 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317465 91177308-0d34-0410-b5e6-96231b3b80d8
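	The two folded patterns can be read as per-element predicates that produce one mask bit each. The sketch below is a scalar model of that reading for four 32-bit lanes, not the lowering code; the names `V4I32`, `testm` and `testnm` are chosen here for illustration only.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using V4I32 = std::array<std::uint32_t, 4>;

// Model of icmp(ne, and(X, Y), 0): mask bit I is set when lanes X[I] and Y[I]
// share at least one set bit.
std::uint8_t testm(const V4I32 &X, const V4I32 &Y) {
  std::uint8_t Mask = 0;
  for (std::size_t I = 0; I < X.size(); ++I)
    if ((X[I] & Y[I]) != 0)
      Mask |= static_cast<std::uint8_t>(1u << I);
  return Mask;
}

// Model of icmp(eq, and(X, Y), 0): mask bit I is set when lanes X[I] and Y[I]
// share no set bit.
std::uint8_t testnm(const V4I32 &X, const V4I32 &Y) {
  std::uint8_t Mask = 0;
  for (std::size_t I = 0; I < X.size(); ++I)
    if ((X[I] & Y[I]) == 0)
      Mask |= static_cast<std::uint8_t>(1u << I);
  return Mask;
}
```

	In this model the bits above lane 3 are always zero, which mirrors the point in the message that no extra shift pair is needed to clear unused upper mask bits.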
2017-11-06	X86 ISel: Basic support for variable-index vector permutations	Zvi Rackover
	Summary: Try to lower a BUILD_VECTOR composed of extract-extract chains that can be reasoned to be a permutation of a vector by indices in a non-constant vector. We saw this pattern created by ISPC, which resorts to creating it due to the requirement that shufflevector's mask operand be a constant vector. I didn't check this but we could possibly use this pattern for lowering the X86 permute C intrinsics instead of llvm.x86 intrinsics. This change can be followed by more improvements: 1. Handle vectors with undef elements. 2. Utilize pshufb and zero-mask-blending to support more efficient construction of vectors with constant-0 elements. 3. Use smaller-element vectors of same width, and "interpolate" the indices, when no native operation is available. Reviewers: RKSimon, craig.topper Reviewed By: RKSimon Subscribers: chandlerc, DavidKreitzer Differential Revision: https://reviews.llvm.org/D39126 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317463 91177308-0d34-0410-b5e6-96231b3b80d8
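	An "extract-extract chain" here means that each result element is an element of the source vector, selected by an element of the index vector. The scalar model below illustrates that shape for four lanes; it is a sketch with hypothetical names (`V4I32`, `permuteByIndices`), and it deliberately does not model out-of-range indices.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using V4I32 = std::array<std::uint32_t, 4>;

// Result[I] = extract(Src, extract(Idx, I)) for every lane I. Taken together
// the four lanes form a permutation of Src by the run-time indices in Idx.
V4I32 permuteByIndices(const V4I32 &Src, const V4I32 &Idx) {
  V4I32 Result{};
  for (std::size_t I = 0; I < Result.size(); ++I) {
    assert(Idx[I] < Src.size() && "out-of-range indices are not modeled");
    Result[I] = Src[Idx[I]];
  }
  return Result;
}
```

	Because `Idx` is an ordinary run-time value, this cannot be written as a single shufflevector with a constant mask, which is why the pattern shows up as a BUILD_VECTOR of chained extracts.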
2017-11-06	[x86][AVX512] Lowering Broadcastm intrinsics to LLVM IR	Jina Nahias
	This patch, together with a matching clang patch (https://reviews.llvm.org/D38683), implements the lowering of X86 broadcastm intrinsics to IR. Differential Revision: https://reviews.llvm.org/D38684 Change-Id: I709ac0b34641095397e994c8ff7e15d1315b3540 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317458 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-06	[X86] Use EVEX encoded intrinsics for legacy FMA intrinsics when possible.	Craig Topper
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317454 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-06	[X86] Add avx512vl command line to fma-instrinsics-x86.ll	Craig Topper
	Some of these demonstrate a missed EVEX to VEX compression because we aren't preferring EVEX instructions during isel. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317452 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-06	[X86] Simplify command lines on the fma-instrinsics-x86.ll test and add ↵	Craig Topper
	-show-mc-encoding. Use feature names instead of CPU names. A future commit will add avx512vl command lines to demonstrate missed use of EVEX instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317451 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-06	[X86] Use EVEX encoded instructions for legacy scalar sqrt intrinsics.	Craig Topper
	Fixes PR35161. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317445 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-05	[X86] Remove some more RCP and RSQRT patterns from InstrAVX512.td that I ↵	Craig Topper
	missed in r317413. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317441 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-05	[X86][SSE] Tests for integer min/max horizontal reductions	Simon Pilgrim
	Matching patterns that vectorizers should have created for us. The experimental intrinsics should probably be added as well. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317439 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-04	[X86][AVX] Regenerate test. NFCI.	Simon Pilgrim
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317424 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-04	[X86] Don't use RCP14 and RSQRT14 for reciprocal estimations or for legacy ↵	Craig Topper
	SSE rcp/rsqrt intrinsics when AVX512 features are enabled. Summary: AVX512 added RCP14 and RSQRT14 instructions which improve accuracy over the legacy RCP and RSQRT instructions, but not enough accuracy to remove the need for a Newton Raphson refinement. Currently we use these new instructions for the legacy packed SSE intrinsics, but not the scalar intrinsics. And we use it for fast math optimization of division and reciprocal sqrt. I think switching the legacy intrinsics may be surprising to the user since it changes the answer based on which processor you're using regardless of any fastmath settings. It's also weird that we did something different between scalar and packed. As far as the reciprocal estimation goes, I think it creates unnecessary deltas in our output behavior (and prevents EVEX->VEX). A little playing around with gcc and icc and godbolt suggests they don't change which instructions they use here. This patch adds new X86ISD nodes for the RCP14/RSQRT14 and uses those for the new intrinsics. Leaving the old intrinsics to use the old instructions. Going forward I think our focus should be on -Supporting 512-bit vectors, which will have to use the RCP14/RSQRT14. -Using RSQRT28/RCP28 to remove the Newton Raphson step on processors with AVX512ER -Supporting double precision. Reviewers: zvi, DavidKreitzer, RKSimon Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39583 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317413 91177308-0d34-0410-b5e6-96231b3b80d8
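	For readers unfamiliar with the refinement mentioned above, the sketch below shows one Newton-Raphson step for a reciprocal estimate and for a reciprocal square root estimate. It is plain scalar arithmetic in double precision, not the DAG code LLVM emits, and the starting estimates are supplied by the caller rather than by any RCP or RSQRT instruction. The function names are chosen here for illustration.

```cpp
#include <cmath>

// One Newton-Raphson step for 1/A, starting from estimate X0:
//   X1 = X0 * (2 - A * X0)
// If X0 has relative error e, X1 has relative error e * e.
double refineReciprocal(double A, double X0) { return X0 * (2.0 - A * X0); }

// One Newton-Raphson step for 1/sqrt(A), starting from estimate X0:
//   X1 = X0 * (1.5 - 0.5 * A * X0 * X0)
double refineRsqrt(double A, double X0) {
  return X0 * (1.5 - 0.5 * A * X0 * X0);
}

// Relative error of Approx with respect to Exact.
double relativeError(double Approx, double Exact) {
  return std::fabs(Approx - Exact) / std::fabs(Exact);
}
```

	Each step roughly doubles the number of correct bits, so the accuracy of the starting estimate decides whether a refinement step is still needed to reach full single precision. That is the trade-off the message is weighing.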
2017-11-04	[X86] Regenerate a couple more tests that I missed in r317410.	Craig Topper
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317412 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-04	[X86] Teach EVEX->VEX pass to turn SHUFI32X4/SHUFF32X4/SHUFI64X2/SHUFF64X2 ↵	Craig Topper
	into VPERM2F128/VPERM2I128. This recovers some of the tests that were changed by r317403. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317410 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-04	[X86] Teach shuffle lowering to use 256-bit SHUF128 when possible.	Craig Topper
	This allows masked operations to be used and allows the register allocator to use YMM16-31 if necessary. As a follow up I'll look into teaching EVEX->VEX how to turn this back into PERM2X128 if any of the additional features don't work out. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317403 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-03	[X86] Give unary PERMI priority over SHUF128 in lowerV8I64VectorShuffle to ↵	Craig Topper
	make it possible to fold a load. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317382 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-03	Fix for Bug 34475 - LOCK/REP/REPNE prefixes emitted as instruction on their own.	Andrew V. Tischenko
	Differential Revision: https://reviews.llvm.org/D39546 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317330 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-03	re-land "[ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass."	Clement Courbet
	Fix undefined references: ExpandMemCmp belongs to CodeGen/, not Scalar/. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317318 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-03	[X86][SSE] Add PACKUS support to combineVectorTruncation	Simon Pilgrim
	Similar to the existing code to lower to PACKSS, we can use PACKUS if the input vector's leading zero bits extend all the way to the packed/truncated value. We have to account for pre-SSE41 targets not supporting PACKUSDW. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317315 91177308-0d34-0410-b5e6-96231b3b80d8
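	The condition on leading zero bits can be checked on a single lane. The sketch below models one 32-bit to 16-bit lane with unsigned saturation and compares it with plain truncation; it is a scalar illustration with hypothetical helper names, not the combine itself.

```cpp
#include <cstdint>

// One lane of a signed 32-bit to unsigned 16-bit saturating pack.
std::uint16_t packusLane(std::int32_t V) {
  if (V < 0)
    return 0;
  if (V > 0xFFFF)
    return 0xFFFF;
  return static_cast<std::uint16_t>(V);
}

// One lane of a plain truncation: keep the low 16 bits.
std::uint16_t truncLane(std::int32_t V) {
  return static_cast<std::uint16_t>(static_cast<std::uint32_t>(V));
}

// True when the 16 bits above the truncated value are all zero, i.e. the
// leading zeros reach down to the packed width.
bool upperHalfIsZero(std::int32_t V) {
  return (static_cast<std::uint32_t>(V) >> 16) == 0;
}
```

	When `upperHalfIsZero` holds, the value already lies in the range 0 to 0xFFFF, so saturation never triggers and the pack gives the same result as the truncation. When it does not hold, the two can differ: 0x12345 packs to 0xFFFF but truncates to 0x2345.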
2017-11-03	[X86] Remove PALIGNR/VALIGN handling from combineBitcastForMaskedOp and move ↵	Craig Topper
	to isel patterns instead. Prefer 128-bit VALIGND/VALIGNQ over PALIGNR during lowering when possible. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317299 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-03	Avoid PLT for external calls when attribute nonlazybind is used.	Sriraman Tallam
	Differential Revision: https://reviews.llvm.org/D39065 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317292 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-02	[X86] Give AVX512VL instructions priority over their AVX equivalents.	Craig Topper
	I thought we had gotten all these priority bugs worked out, but I guess not. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317283 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-02	Revert "[ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass."	Clement Courbet
	undefined reference to `llvm::TargetPassConfig::ID' on clang-ppc64le-linux-multistage This reverts commit eea333c33fa73ad225ef28607795984829f65688. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317213 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-02	[ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass.	Clement Courbet
	Summary: This is mostly a noop (most of the test diffs are renamed blocks). There are a few temporary register renames (eax<->ecx) and a few blocks are shuffled around. See the discussion in PR33325 for more details. Reviewers: spatel Subscribers: mgorny Differential Revision: https://reviews.llvm.org/D39456 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317211 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-02	[X86] Fix bug in legalize vector types - Split large loads	Ayman Musa
	When splitting a large load to smaller legally-typed loads, the last load should be padded to reach the size of the previous one so a CONCAT_VECTORS node could reunite them again. The code currently pads the last load to reach the size of the first load (instead of the previous). Differential Revision: https://reviews.llvm.org/D38495 Change-Id: Ib60b55ed26ce901fabf68108daf52683fbd5013f git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317206 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-02	Adding test for extract sub vector load and store avx512	Michael Zuckerman
	Change-Id: Iefcb0ec6b6aa1b530ce5358081f02e6e522a8e50 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317202 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-02	The patch updates sched numbers for YMM AVX instrs such as VMOVx, VORx, ↵	Andrew V. Tischenko
	VXOR, VPERMILx, VBROADCASTx, etc. PR32857 should be closed. Differential Revision: https://reviews.llvm.org/D39227 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317196 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-02	[X86] Fix fast-isel-int-float-conversion test	Steven Wu
	Test is failing due to the revert in r317136. Fix the test to make all the bots happy. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317153 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-01	Revert "Correct dwarf unwind information in function epilogue for X86"	Petar Jovanovic
	This reverts r317100 as it introduced sanitizer-x86_64-linux-autoconf buildbot failure (build #15606). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317136 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-01	[X86][SSE] Add PACKUS support to LowerTruncate	Simon Pilgrim
	Similar to the existing code to lower to PACKSS, we can use PACKUS if the input vector's leading zero bits extend all the way to the packed/truncated value. We have to account for pre-SSE41 targets not supporting PACKUSDW. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317128 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-01	[X86] Add custom code to EVEX to VEX pass to turn unmasked 128-bit ↵	Craig Topper
	VALIGND/Q into VPALIGNR if the extended registers aren't being used. This will enable us to prefer VALIGND/Q during shuffle lowering in order to get the extended register encoding space when BWI isn't available. But if we end up not using the extended registers we can switch VPALIGNR for the shorter VEX encoding. Differential Revision: https://reviews.llvm.org/D39401 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317122 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-01	[X86] Prevent fast isel from folding loads into the instructions listed in ↵	Craig Topper
	hasPartialRegUpdate. This patch moves the check for opt size and hasPartialRegUpdate into the lower level implementation of foldMemoryOperandImpl to catch the entry point that fast isel uses. We're still folding undef register instructions in AVX that we should also probably disable, but that's a problem for another patch. Unfortunately, this requires reordering a bunch of functions which is why the diff is so large. I can do the function reordering separately if we want. Differential Revision: https://reviews.llvm.org/D39402 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317112 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-01	[X86] Regenerate test to attempt to fix build bot failure.	Craig Topper
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317107 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-01	[X86] Add 64-bit int to float/double conversion with AVX to ↵	Craig Topper
	X86FastISel::X86SelectSIToFP Summary: [X86] Teach fast isel to handle i64 sitofp with AVX. For some reason we only handled i32 sitofp with AVX. But with SSE only we support i64 so we should do the same with AVX. Also add i686 command lines for the 32-bit tests. 64-bit tests are in a separate file to avoid a fast-isel abort failure in 32-bit mode. Reviewers: RKSimon, zvi Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39450 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317102 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-01	Update VCVTx, VMOVNTPx and VROUNDYPx instructions scheduling on btver2.	Andrew V. Tischenko
	Differential Revision: https://reviews.llvm.org/D39059 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317101 91177308-0d34-0410-b5e6-96231b3b80d8