2021-06-21  target/arm: Make VMOV scalar <-> gpreg beatwise for MVE  (Peter Maydell)

In a CPU with MVE, the VMOV (vector lane to general-purpose register) and VMOV (general-purpose register to vector lane) insns are not predicated, but they are subject to beatwise execution if they are not in an IT block. Since our implementation always executes all 4 beats in one tick, all we need to handle is PSR.ECI:
 * we must do the usual check for bad ECI state
 * we must advance the ECI state if the insn succeeds
 * if ECI says we should not be executing the beat corresponding to the lane of the vector register being accessed, then we should skip performing the move

Note that if PSR.ECI is non-zero then we cannot be in an IT block.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-45-peter.maydell@linaro.org
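[Editor's note: a minimal C sketch of the beat-skipping logic; the ECI_* values and the shape of mve_eci_mask() are modeled on QEMU's MVE helpers but simplified here. Each of the four beats covers 4 bytes of the 16-byte vector, so the ECI state maps directly to a byte mask, and the scalar <-> gpreg move happens only if the accessed lane's bytes are still live.]

    #include <stdint.h>

    enum { ECI_NONE = 0, ECI_A0 = 1, ECI_A0A1 = 2, ECI_A0A1A2 = 4 };

    /* One predicate bit per byte of the 16-byte vector */
    static uint16_t mve_eci_mask(int eci)
    {
        switch (eci) {
        case ECI_NONE:   return 0xffff; /* all 4 beats still to do */
        case ECI_A0:     return 0xfff0; /* beat 0 already executed */
        case ECI_A0A1:   return 0xff00; /* beats 0-1 already executed */
        case ECI_A0A1A2: return 0xf000; /* beats 0-2 already executed */
        default:         return 0;      /* bad ECI state: UNDEF */
        }
    }

    /* Should VMOV touch the lane whose bytes start at this offset? */
    static int lane_is_live(int eci, int lane_byte_offset)
    {
        return (mve_eci_mask(eci) >> lane_byte_offset) & 1;
    }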
2021-06-21  target/arm: Implement MVE VADDV  (Peter Maydell)

Implement the MVE VADDV insn, which performs an addition across vector lanes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-44-peter.maydell@linaro.org
2021-06-21  target/arm: Implement MVE VHCADD  (Peter Maydell)

Implement the MVE VHCADD insn, which is similar to VCADD but performs a halving step. This one overlaps with VADC.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-43-peter.maydell@linaro.org
2021-06-21  target/arm: Implement MVE VCADD  (Peter Maydell)

Implement the MVE VCADD insn, which performs a complex add with rotate. Note that the size=0b11 encoding is VSBC.

The architecture grants some leeway for the "destination and Vm source overlap" case for the size MO_32 case, but we choose not to make use of it, instead always calculating all 16 bytes worth of results before setting the destination register.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-42-peter.maydell@linaro.org
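[Editor's note: a minimal sketch of the VCADD computation for 32-bit elements with rotate 90 (function name hypothetical); all results land in a temporary before the destination is written, matching the choice described above.]

    #include <stdint.h>
    #include <string.h>

    static void vcadd90_w(uint32_t *d, const uint32_t *a, const uint32_t *b)
    {
        uint32_t tmp[4];
        /* even lanes subtract the rotated operand, odd lanes add it */
        for (int e = 0; e < 4; e += 2) {
            tmp[e]     = a[e]     - b[e + 1];
            tmp[e + 1] = a[e + 1] + b[e];
        }
        memcpy(d, tmp, sizeof(tmp)); /* write back only when complete */
    }

The rotate-270 form swaps which lane is added and which is subtracted.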
2021-06-21  target/arm: Implement MVE VADC, VSBC  (Peter Maydell)

Implement the MVE VADC and VSBC insns. These perform an add-with-carry or subtract-with-carry of the 32-bit elements in each lane of the input vectors, where the carry-out of each add is the carry-in of the next. The initial carry input is either 1 or is from FPSCR.C; the carry out at the end is written back to FPSCR.C.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-41-peter.maydell@linaro.org
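[Editor's note: the carry chain can be pictured with this small sketch for the four 32-bit lanes (names hypothetical; predication omitted).]

    #include <stdint.h>

    static void vadc_w(uint32_t *d, const uint32_t *a, const uint32_t *b,
                       int *carry /* FPSCR.C in, FPSCR.C out */)
    {
        for (int e = 0; e < 4; e++) {
            uint64_t r = (uint64_t)a[e] + b[e] + *carry;
            d[e] = (uint32_t)r;
            *carry = r >> 32;    /* carry-out feeds the next lane */
        }
    }

VSBC runs the same chain with the second operand inverted, and the VADCI/VSBCI forms start from a fixed initial carry rather than reading FPSCR.C.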
2021-06-21  target/arm: Implement MVE VRHADD  (Peter Maydell)

Implement the MVE VRHADD insn, which performs a rounded halving addition.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-40-peter.maydell@linaro.org
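[Editor's note: rounded halving addition is (a + b + 1) >> 1; a sketch of one 8-bit lane showing the usual trick for computing it without a wider intermediate type.]

    #include <stdint.h>

    static uint8_t vrhadd_u8(uint8_t a, uint8_t b)
    {
        /* the +1 rounding bit survives whenever either low bit is set */
        return (a >> 1) + (b >> 1) + ((a | b) & 1);
    }

For example, vrhadd_u8(1, 2) is 2, where a truncating halving add would give 1.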
2021-06-21  target/arm: Implement MVE VQDMULL (vector)  (Peter Maydell)

Implement the vector form of the MVE VQDMULL insn.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-39-peter.maydell@linaro.org
2021-06-21  target/arm: Implement MVE VQDMLSDH and VQRDMLSDH  (Peter Maydell)

Implement the MVE VQDMLSDH and VQRDMLSDH insns, which are like VQDMLADH and VQRDMLADH except that products are subtracted rather than added.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-38-peter.maydell@linaro.org
2021-06-21  target/arm: Implement MVE VQDMLADH and VQRDMLADH  (Peter Maydell)

Implement the MVE VQDMLADH and VQRDMLADH insns. These multiply elements, and then add pairs of products, double, possibly round, saturate and return the high half of the result.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-37-peter.maydell@linaro.org
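[Editor's note: a sketch of one 16-bit lane (function name hypothetical). For this element size the whole add-double-round sequence fits in 64 bits, so it can be written directly before saturating to 32 bits and returning the high half.]

    #include <stdbool.h>
    #include <stdint.h>

    static int16_t vqdmladh_h(int16_t a, int16_t b, int16_t c, int16_t d,
                              bool round, bool *sat)
    {
        int64_t r = ((int64_t)a * b + (int64_t)c * d) * 2;
        if (round) {
            r += 1 << 15;            /* VQRDMLADH rounding bias */
        }
        if (r > INT32_MAX) {
            *sat = true;             /* feeds the sticky FPSCR.QC flag */
            r = INT32_MAX;
        } else if (r < INT32_MIN) {
            *sat = true;
            r = INT32_MIN;
        }
        return r >> 16;              /* high half of the saturated result */
    }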
2021-06-21  target/arm: Implement MVE VRSHL  (Peter Maydell)

Implement the MVE VRSHL insn (vector form).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-36-peter.maydell@linaro.org
2021-06-21  target/arm: Implement MVE VSHL insn  (Peter Maydell)

Implement the MVE VSHL insn (vector form).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-35-peter.maydell@linaro.org
2021-06-21  target/arm: Implement MVE VQRSHL  (Peter Maydell)

Implement the MVE VQRSHL (vector) insn. Again, the code to perform the actual shifts is borrowed from neon_helper.c.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-34-peter.maydell@linaro.org
2021-06-21  target/arm: Implement MVE VQSHL (vector)  (Peter Maydell)

Implement the MVE VQSHL insn (encoding T4, which is the vector-shift-by-vector version).

The DO_SQSHL_OP and DO_UQSHL_OP macros here are derived from the neon_helper.c code for qshl_u{8,16,32} and qshl_s{8,16,32}.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-33-peter.maydell@linaro.org
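[Editor's note: a sketch of an unsigned saturating shift-by-register in the spirit of those helpers (function name hypothetical). The shift count is signed, so negative counts shift right, and only left shifts can saturate; the rounding variant of the previous entry adds a rounding increment before right shifts.]

    #include <stdbool.h>
    #include <stdint.h>

    static uint32_t uqshl32(uint32_t val, int8_t shift, bool *sat)
    {
        if (shift <= -32) {
            return 0;
        } else if (shift < 0) {
            return val >> -shift;
        } else if (shift >= 32) {
            if (val) {
                *sat = true;         /* any set bit is shifted out */
                return UINT32_MAX;
            }
            return 0;
        } else {
            uint32_t r = val << shift;
            if ((r >> shift) != val) {
                *sat = true;         /* bits lost off the top */
                return UINT32_MAX;
            }
            return r;
        }
    }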
2021-06-21  target/arm: Implement MVE VQADD, VQSUB (vector)  (Peter Maydell)

Implement the vector forms of the MVE VQADD and VQSUB insns.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-32-peter.maydell@linaro.org
2021-06-21  target/arm: Implement MVE VQDMULH, VQRDMULH (vector)  (Peter Maydell)

Implement the vector forms of the MVE VQDMULH and VQRDMULH insns.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-31-peter.maydell@linaro.org
2021-06-21  target/arm: Implement MVE VQDMULL scalar  (Peter Maydell)

Implement the MVE VQDMULL scalar insn. This multiplies the top or bottom half of each element by the scalar, doubles and saturates to a double-width result.

Note that this encoding overlaps with VQADD and VQSUB; it uses what in VQADD and VQSUB would be the 'size=0b11' encoding.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-30-peter.maydell@linaro.org
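[Editor's note: a sketch of one 16-to-32-bit lane (function name hypothetical). The product is doubled in 64-bit arithmetic, and the only value that can overflow the 32-bit result is INT16_MIN * INT16_MIN * 2.]

    #include <stdbool.h>
    #include <stdint.h>

    static int32_t vqdmull_h(int16_t a, int16_t b, bool *sat)
    {
        int64_t r = (int64_t)a * b * 2;
        if (r > INT32_MAX) {         /* only INT16_MIN * INT16_MIN * 2 */
            *sat = true;
            return INT32_MAX;
        }
        return (int32_t)r;
    }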
2021-06-21  target/arm: Implement MVE VQDMULH and VQRDMULH (scalar)  (Peter Maydell)

Implement the MVE VQDMULH and VQRDMULH scalar insns, which multiply elements by the scalar, double, possibly round, take the high half and saturate.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-29-peter.maydell@linaro.org
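[Editor's note: a sketch of one 16-bit lane covering both insns (function name hypothetical); VQRDMULH differs from VQDMULH only in the rounding bias added before the high half is taken.]

    #include <stdbool.h>
    #include <stdint.h>

    static int16_t vqrdmulh_h(int16_t a, int16_t b, bool round, bool *sat)
    {
        int64_t r = (int64_t)a * b * 2;
        if (round) {
            r += 1 << 15;
        }
        r >>= 16;                    /* take the high half */
        if (r > INT16_MAX) {         /* the INT16_MIN * INT16_MIN case */
            *sat = true;
            r = INT16_MAX;
        } else if (r < INT16_MIN) {
            *sat = true;
            r = INT16_MIN;
        }
        return (int16_t)r;
    }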
2021-06-21  target/arm: Implement MVE VQADD and VQSUB  (Peter Maydell)

Implement the MVE VQADD and VQSUB insns, which perform saturating addition or subtraction of a scalar to or from each element.

Note that individual bytes of each result element are used or discarded according to the predicate mask, but FPSCR.QC is only set if the predicate mask for the lowest byte of the element is set.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-28-peter.maydell@linaro.org
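[Editor's note: the QC rule can be sketched like this for 32-bit elements (names hypothetical): 'mask' carries one predicate bit per byte of the vector, and only bit 0 of each element's 4-bit group decides whether that element's saturation reaches FPSCR.QC.]

    #include <stdbool.h>
    #include <stdint.h>

    static bool update_qc(const bool sat[4], uint16_t mask, bool qc)
    {
        for (int e = 0; e < 4; e++) {
            if (sat[e] && (mask & (1u << (e * 4)))) {
                qc = true;           /* sticky: never cleared here */
            }
        }
        return qc;
    }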
2021-06-21  target/arm: Implement MVE VPST  (Peter Maydell)

Implement the MVE VPST insn, which sets the predicate mask fields in the VPR to the immediate value encoded in the insn.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-27-peter.maydell@linaro.org
2021-06-21  target/arm: Implement MVE VBRSR  (Peter Maydell)

Implement the MVE VBRSR insn, which reverses a specified number of bits in each element, setting the rest to zero.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-26-peter.maydell@linaro.org
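[Editor's note: a sketch of one 32-bit lane (names hypothetical): the element is bit-reversed with the usual swap ladder, then shifted so that only the reversed low n bits remain, with n taken from the bottom byte of the scalar operand.]

    #include <stdint.h>

    static uint32_t revbit32(uint32_t x)
    {
        x = ((x & 0x55555555u) << 1) | ((x >> 1) & 0x55555555u);
        x = ((x & 0x33333333u) << 2) | ((x >> 2) & 0x33333333u);
        x = ((x & 0x0f0f0f0fu) << 4) | ((x >> 4) & 0x0f0f0f0fu);
        x = ((x & 0x00ff00ffu) << 8) | ((x >> 8) & 0x00ff00ffu);
        return (x << 16) | (x >> 16);
    }

    static uint32_t vbrsr_w(uint32_t elem, uint32_t scalar)
    {
        unsigned n = scalar & 0xff;
        if (n == 0) {
            return 0;
        }
        uint32_t r = revbit32(elem);
        /* keep the reversed low n bits; everything else becomes zero */
        return (n < 32) ? (r >> (32 - n)) : r;
    }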
2021-06-21  target/arm: Implement MVE VHADD, VHSUB (scalar)  (Peter Maydell)

Implement the scalar variants of the MVE VHADD and VHSUB insns.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-25-peter.maydell@linaro.org
2021-06-21  target/arm: Implement MVE VSUB, VMUL (scalar)  (Peter Maydell)

Implement the scalar forms of the MVE VSUB and VMUL insns.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-24-peter.maydell@linaro.org
2021-06-21  target/arm: Implement MVE VADD (scalar)  (Peter Maydell)

Implement the scalar form of the MVE VADD insn. This takes the scalar operand from a general-purpose register.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-23-peter.maydell@linaro.org
2021-06-21  target/arm: Implement MVE VRMLALDAVH, VRMLSLDAVH  (Peter Maydell)

Implement the MVE VRMLALDAVH and VRMLSLDAVH insns, which accumulate the results of a rounded multiply of pairs of elements into a 72-bit accumulator, returning the top 64 bits in a pair of general-purpose registers.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-22-peter.maydell@linaro.org
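[Editor's note: the 72-bit accumulation can be modeled with a 128-bit intermediate (gcc/clang __int128; function name hypothetical): the incoming 64-bit accumulator re-enters at bits [71:8], products are summed at natural scale, and a bias of 1 << 7 rounds the final right shift. This is a sketch of the arithmetic, not QEMU's actual helper.]

    #include <stdint.h>

    static int64_t vrmlaldavh_sw(const int32_t *a, const int32_t *b,
                                 int64_t acc)
    {
        __int128 r = (__int128)acc << 8;      /* previous top-64 bits */
        for (int e = 0; e < 4; e++) {
            r += (int64_t)a[e] * b[e];        /* 64-bit products */
        }
        r += 1 << 7;                          /* round the >> 8 below */
        return (int64_t)(r >> 8);             /* top 64 of the 72 bits */
    }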
2021-06-21  target/arm: Implement MVE VMLSLDAV  (Peter Maydell)

Implement the MVE insn VMLSLDAV, which multiplies source elements, alternately adding and subtracting them, and accumulates into a 64-bit result in a pair of general-purpose registers.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-21-peter.maydell@linaro.org
2021-06-21  target/arm: Implement MVE VMLALDAV  (Peter Maydell)

Implement the MVE VMLALDAV insn, which multiplies pairs of integer elements, accumulating them into a 64-bit result in a pair of general-purpose registers.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-20-peter.maydell@linaro.org
2021-06-21  target/arm: Implement MVE VMULL  (Peter Maydell)

Implement the MVE VMULL insn, which multiplies two single-width integer elements to produce a double-width result.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-19-peter.maydell@linaro.org
2021-06-21  target/arm: Implement MVE VHADD, VHSUB  (Peter Maydell)

Implement the MVE VHADD and VHSUB insns, which perform an addition or subtraction and then halve the result.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-18-peter.maydell@linaro.org
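[Editor's note: a sketch of the halving operations for one 32-bit lane (names hypothetical): doing the arithmetic in 64 bits keeps the carry or borrow that the final shift folds back in.]

    #include <stdint.h>

    static int32_t vhadd_s32(int32_t a, int32_t b)
    {
        return (int32_t)(((int64_t)a + b) >> 1);
    }

    static int32_t vhsub_s32(int32_t a, int32_t b)
    {
        return (int32_t)(((int64_t)a - b) >> 1);
    }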
2021-06-21  target/arm: Implement MVE VABD  (Peter Maydell)

Implement the MVE VABD insn.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-17-peter.maydell@linaro.org
2021-06-21  target/arm: Implement MVE VMAX, VMIN  (Peter Maydell)

Implement the MVE VMAX and VMIN insns.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-16-peter.maydell@linaro.org
2021-06-21  target/arm: Implement MVE VRMULH  (Peter Maydell)

Implement the MVE VRMULH insn, which performs a rounding multiply and then returns the high half.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-15-peter.maydell@linaro.org
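[Editor's note: a sketch of one 16-bit lane covering VRMULH and, with round == false, the plain VMULH of the next entry (function name hypothetical); the rounding bias is half of the discarded low half.]

    #include <stdbool.h>
    #include <stdint.h>

    static int16_t vmulh_s16(int16_t a, int16_t b, bool round)
    {
        int32_t p = (int32_t)a * b;
        if (round) {
            p += 1 << 15;            /* round to nearest on the >> 16 */
        }
        return (int16_t)(p >> 16);   /* high half only */
    }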
2021-06-21  target/arm: Implement MVE VMULH  (Peter Maydell)

Implement the MVE VMULH insn, which performs a vector multiply and returns the high half of the result.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-14-peter.maydell@linaro.org
2021-06-21  target/arm: Implement MVE VADD, VSUB, VMUL  (Peter Maydell)

Implement the MVE VADD, VSUB and VMUL insns.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-13-peter.maydell@linaro.org
2021-06-21  target/arm: Implement MVE VAND, VBIC, VORR, VORN, VEOR  (Peter Maydell)

Implement the MVE vector logical operations operating on two registers.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-12-peter.maydell@linaro.org
2021-06-21  target/arm: Implement MVE VDUP  (Peter Maydell)

Implement the MVE VDUP insn, which duplicates a value from a general-purpose register into every lane of a vector register (subject to predication).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-11-peter.maydell@linaro.org
2021-06-21  tcg: Make gen_dup_i32/i64() public as tcg_gen_dup_i32/i64  (Peter Maydell)

The Arm MVE VDUP implementation would like to be able to emit code to duplicate a byte or halfword value into an i32. We have code to do this already in tcg-op-gvec.c, so all we need to do is make the functions global.

For consistency with other functions made available to the frontends:
 * we rename to tcg_gen_dup_*
 * we expose both the _i32 and _i64 forms
 * we provide the #define for a _tl form

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20210617121628.20116-10-peter.maydell@linaro.org
2021-06-21  target/arm: Implement MVE VNEG  (Peter Maydell)

Implement the MVE VNEG insn (both integer and floating point forms).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-9-peter.maydell@linaro.org
2021-06-21  target/arm: Implement MVE VABS  (Peter Maydell)

Implement the MVE VABS functions (both integer and floating point).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-8-peter.maydell@linaro.org
2021-06-21  target/arm: Implement MVE VMVN (register)  (Peter Maydell)

Implement the MVE VMVN (register) operation. Note that for predication this operation is byte-by-byte.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-7-peter.maydell@linaro.org
2021-06-21  target/arm: Implement MVE VREV16, VREV32, VREV64  (Peter Maydell)

Implement the MVE instructions VREV16, VREV32 and VREV64.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-6-peter.maydell@linaro.org
2021-06-21  target/arm: Implement MVE VCLS  (Peter Maydell)

Implement the MVE VCLS insn.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-5-peter.maydell@linaro.org
2021-06-21  target/arm: Implement MVE VCLZ  (Peter Maydell)

Implement the MVE VCLZ insn (and the necessary machinery for MVE 1-input vector ops).

Note that for non-load instructions predication is always performed at byte-level granularity regardless of element size (R_ZLSJ), and so the masking logic here differs from that used in the VLDR and VSTR helpers.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-4-peter.maydell@linaro.org
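[Editor's note: the byte-granularity merge can be sketched for a single 32-bit element (modeled loosely on the idea of a merge-mask helper; names hypothetical): expand the element's 4 predicate bits into a byte mask, then combine new and old values.]

    #include <stdint.h>

    static uint32_t merge_bytes(uint32_t old, uint32_t val, uint8_t pred4)
    {
        uint32_t m = 0;
        for (int byte = 0; byte < 4; byte++) {
            if (pred4 & (1u << byte)) {
                m |= 0xffu << (byte * 8);  /* this byte is predicated in */
            }
        }
        return (val & m) | (old & ~m);     /* untouched bytes keep old */
    }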
2021-06-21  target/arm: Implement widening/narrowing MVE VLDR/VSTR insns  (Peter Maydell)

Implement the variants of MVE VLDR (encodings T1, T2) which perform "widening" loads where bytes or halfwords are loaded from memory and zero or sign-extended into halfword or word length vector elements, and the narrowing MVE VSTR (encodings T1, T2) where bytes or halfwords are stored from halfword or word elements.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-3-peter.maydell@linaro.org
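[Editor's note: a sketch of the widening idea for a byte-to-word VLDRB variant (names hypothetical; memory modeled as a plain array and predication simplified to one bit per lane): four consecutive bytes each sign-extend into a 32-bit element.]

    #include <stdint.h>

    static void vldrb_sw(int32_t *q, const int8_t *mem, uint16_t mask)
    {
        for (int e = 0; e < 4; e++) {
            if (mask & (1u << (e * 4))) {  /* lane predicated in */
                q[e] = mem[e];             /* load byte, sign-extend */
            }
        }
    }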
2021-06-21  target/arm: Implement MVE VLDR/VSTR (non-widening forms)  (Peter Maydell)

Implement the forms of the MVE VLDR and VSTR insns which perform non-widening loads of bytes, halfwords or words from memory into vector elements of the same width (encodings T5, T6, T7).

(At the moment we know for MVE and M-profile in general that vfp_access_check() can never return false, but we include the conventional return-true-on-failure check for consistency with non-M-profile translation code.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-2-peter.maydell@linaro.org
2021-06-21  target/arm: Handle FPU check for FPCXT_NS insns via vfp_access_check_m()  (Peter Maydell)

Instead of open-coding the "take NOCP exception if FPU disabled, otherwise call gen_preserve_fp_state()" code in the accessors for FPCXT_NS, add an argument to vfp_access_check_m() which tells it to skip the gen_update_fp_context() call, so we can use it for the FPCXT_NS case.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210618141019.10671-8-peter.maydell@linaro.org
2021-06-21  target/arm: Split vfp_access_check() into A and M versions  (Peter Maydell)

vfp_access_check() and its helper routine full_vfp_access_check() have gradually grown and are now an awkward mix of A-profile-only and M-profile-only pieces. Refactor them into an A-profile-only and an M-profile-only version, taking advantage of the fact that now the only direct call to full_vfp_access_check() is in A-profile-only code.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210618141019.10671-7-peter.maydell@linaro.org
2021-06-21  target/arm: Factor FP context update code out into helper function  (Peter Maydell)

Factor the code in full_vfp_access_check() which updates the ownership of the FP context and creates a new FP context out into its own function.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210618141019.10671-6-peter.maydell@linaro.org
2021-06-21  target/arm: Handle writeback in VLDR/VSTR sysreg with no memory access  (Peter Maydell)

A few subcases of VLDR/VSTR sysreg succeed but do not perform a memory access:
 * VSTR of VPR when unprivileged
 * VLDR to VPR when unprivileged
 * VLDR to FPCXT_NS when fpInactive

In these cases, even though we don't do the memory access we should still update the base register and perform the stack limit check if the insn's addressing mode specifies writeback. Our implementation failed to do this, because we handle these side-effects inside the memory_to_fp_sysreg() and fp_sysreg_to_memory() callback functions, which are only called if there's something to load or store.

Fix this by adding an extra argument to the callbacks which is set to true to actually perform the access and false to only do side effects like writeback, and calling the callback with do_access = false for the three cases listed above.

This produces slightly suboptimal code for the case of a write to FPCXT_NS when the FPU is inactive and the insn didn't have side effects (i.e. no writeback, or via VMSR), in which case we'll generate a conditional branch over an unconditional branch. But this doesn't seem to be important enough to merit requiring the callback to report back whether it generated any code or not.

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210618141019.10671-5-peter.maydell@linaro.org
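[Editor's note: the control flow can be sketched generically (all names hypothetical; this is not QEMU's actual translator code): the callback is now always invoked, and the do_access flag tells it whether to perform the real access or only its side effects.]

    #include <stdbool.h>
    #include <stdio.h>

    /* The callback owns both the access and its side effects;
     * do_access == false means "side effects only". */
    typedef bool sysreg_storefn(int sysreg, bool do_access);

    static bool store_to_memory(int sysreg, bool do_access)
    {
        if (do_access) {
            printf("store sysreg %d to memory\n", sysreg);
        }
        /* base-register writeback + stack limit check happen here,
         * whether or not the access itself was performed */
        return true;
    }

    static bool gen_vstr_sysreg(int sysreg, sysreg_storefn *fn,
                                bool access_suppressed)
    {
        return fn(sysreg, !access_suppressed);
    }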
2021-06-21  target/arm: Don't NOCP fault for FPCXT_NS accesses  (Peter Maydell)

The M-profile architecture requires that accesses to FPCXT_NS when there is no active FP state must not take a NOCP fault even if the FPU is disabled. We were not implementing this correctly, because in our decode we catch the NOCP faults early in m-nocp.decode.

Fix this bug by moving all the handling of M-profile FP system register accesses from vfp.decode into m-nocp.decode and putting it above the NOCP blocks. This provides the correct behaviour:
 * for accesses other than FPCXT_NS the trans functions call vfp_access_check(), which will check for FPU disabled and raise a NOCP exception if necessary
 * for FPCXT_NS we have the special case code that doesn't call vfp_access_check()
 * when these trans functions want to raise an UNDEF they return false, so the decoder will fall through into the NOCP blocks. This means that NOCP correctly takes precedence over UNDEF for these insns. (This is a difference from the other insns handled by m-nocp.decode, where UNDEF takes precedence and which we implement by having those trans functions call unallocated_encoding() in the appropriate places.)

[Note for backport to stable: this commit has a semantic dependency on commit 9a486856e9173af, which was not marked as cc-stable because we didn't know we'd need it for a for-stable bugfix.]

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210618141019.10671-4-peter.maydell@linaro.org
2021-06-21  target/arm: Handle FPU being disabled in FPCXT_NS accesses  (Peter Maydell)

If the guest makes an FPCXT_NS access when the FPU is disabled, one of two things happens:
 * if there is no active FP context, then the insn behaves the same way as if the FPU was enabled: writes ignored, reads same value as FPDSCR_NS
 * if there is an active FP context, then we take a NOCP exception

Add code to the sysreg read/write functions which emits code to take the NOCP exception in the latter case. At the moment this will never be used, because the NOCP checks in m-nocp.decode happen first, and so the trans functions are never called when the FPU is disabled. The code will be needed when we move the sysreg access insns to before the NOCP patterns in the following commit.

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210618141019.10671-3-peter.maydell@linaro.org