peter.maydell/qemu-arm.git - This qemu repo is mostly a place to put together pull requests for upstream

Age	Commit message (Collapse)	Author
2019-06-04	target/arm: Convert float-to-integer VCVT insns to decodetreevfp-decodetree	Peter Maydell
	Convert the float-to-integer VCVT instructions to decodetree. Since these are the last unconverted instructions, we can delete the old decoder structure entirely now. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-04	target/arm: Convert VCVT fp/fixed-point conversion insns to decodetree	Peter Maydell
	Convert the VCVT (between floating-point and fixed-point) instructions to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-04	target/arm: Convert VJCVT to decodetree	Peter Maydell
	Convert the VJCVT instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-04	target/arm: Convert integer-to-float insns to decodetree	Peter Maydell
	Convert the VCVT integer-to-float instructions to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-04	target/arm: Convert double-single precision conversion insns to decodetree	Peter Maydell
	Convert the VCVT double/single precision conversion insns to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-04	target/arm: Convert VFP round insns to decodetree	Peter Maydell
	Convert the VFP round-to-integer instructions VRINTR, VRINTZ and VRINTX to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-04	target/arm: Convert the VCVT-to-f16 insns to decodetree	Peter Maydell
	Convert the VCVTT and VCVTB instructions which convert from f32 and f64 to f16 to decodetree. Since we're no longer constrained to the old decoder's style using cpu_F0s and cpu_F0d we can perform a direct 16 bit store of the right half of the input single-precision register rather than doing a load/modify/store sequence on the full 32 bits. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-04	target/arm: Convert the VCVT-from-f16 insns to decodetree	Peter Maydell
	Convert the VCVTT, VCVTB instructions that deal with conversion from half-precision floats to f32 or 64 to decodetree. Since we're no longer constrained to the old decoder's style using cpu_F0s and cpu_F0d we can perform a direct 16 bit load of the right half of the input single-precision register rather than loading the full 32 bits and then doing a separate shift or sign-extension. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-04	target/arm: Convert VFP comparison insns to decodetree	Peter Maydell
	Convert the VFP comparison instructions to decodetree. Note that comparison instructions should not honour the VFP short-vector length and stride information: they are scalar-only operations. This applies to all the 2-operand instructions except for VMOV, VABS, VNEG and VSQRT. (In the old decoder this is implemented via the "if (op == 15 && rn > 3) { veclen = 0; }" check.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-04	target/arm: Convert VMOV (register) to decodetree	Peter Maydell
	Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-04	target/arm: Convert VSQRT to decodetree	Peter Maydell
	Convert the VSQRT instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-04	target/arm: Convert VNEG to decodetree	Peter Maydell
	Convert the VNEG instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-04	target/arm: Convert VABS to decodetree	Peter Maydell
	Convert the VFP VABS instruction to decodetree. Unlike the 3-op versions, we don't pass fpst to the VFPGen2OpSPFn or VFPGen2OpDPFn because none of the operations which use this format and support short vectors will need it. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-04	target/arm: Convert VMOV (imm) to decodetree	Peter Maydell
	Convert the VFP VMOV (immediate) instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-04	target/arm: Convert VFP fused multiply-add insns to decodetree	Peter Maydell
	Convert the VFP fused multiply-add instructions (VFNMA, VFNMS, VFMA, VFMS) to decodetree. Note that in the old decode structure we were implementing these to honour the VFP vector stride/length. These instructions were introduced in VFPv4, and in the v7A architecture they are UNPREDICTABLE if the vector stride or length are non-zero. In v8A they must UNDEF if stride or length are non-zero, like all VFP instructions; we choose to UNDEF always. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-04	target/arm: Convert VDIV to decodetree	Peter Maydell
	Convert the VDIV instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-04	target/arm: Convert VSUB to decodetree	Peter Maydell
	Convert the VSUB instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-04	target/arm: Convert VADD to decodetree	Peter Maydell
	Convert the VADD instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-04	target/arm: Convert VNMUL to decodetree	Peter Maydell
	Convert the VNMUL instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-04	target/arm: Convert VMUL to decodetree	Peter Maydell
	Convert the VMUL instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-04	target/arm: Convert VFP VNMLA to decodetree	Peter Maydell
	Convert the VFP VNMLA instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-04	target/arm: Convert VFP VNMLS to decodetree	Peter Maydell
	Convert the VFP VNMLS instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-04	target/arm: Convert VFP VMLS to decodetree	Peter Maydell
	Convert the VFP VMLS instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-04	target/arm: Convert VFP VMLA to decodetree	Peter Maydell
	Convert the VFP VMLA instruction to decodetree. This is the first of the VFP 3-operand data processing instructions, so we include in this patch the code which loops over the elements for an old-style VFP vector operation. The existing code to do this looping uses the deprecated cpu_F0s/F0d/F1s/F1d TCG globals; since we are going to be converting instructions one at a time anyway we can take the opportunity to make the new loop use TCG temporaries, which means we can do that conversion one operation at a time rather than needing to do it all in one go. We include an UNDEF check which was missing in the old code: short-vector operations (with stride or length non-zero) were deprecated in v7A and must UNDEF in v8A, so if the MVFR0 FPShVec field does not indicate that support for short vectors is present we UNDEF the operations that would use them. (This is a change of behaviour for Cortex-A7, Cortex-A15 and the v8 CPUs, which previously were all incorrectly allowing short-vector operations.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-04	target/arm: Remove VLDR/VSTR/VLDM/VSTM use of cpu_F0s and cpu_F0d	Peter Maydell
	Expand out the sequences in the new decoder VLDR/VSTR/VLDM/VSTM trans functions which perform the memory accesses by going via the TCG globals cpu_F0s and cpu_F0d, to use local TCG temps instead. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-04	target/arm: Convert the VFP load/store multiple insns to decodetree	Peter Maydell
	Convert the VFP load/store multiple insns to decodetree. This includes tightening up the UNDEF checking for pre-VFPv3 CPUs which only have D0-D15 : they now UNDEF for any access to D16-D31, not merely when the smallest register in the transfer list is in D16-D31. This conversion does not try to share code between the single precision and the double precision versions; this looks a bit duplicative of code, but it leaves the door open for a future refactoring which gets rid of the use of the "F0" registers by inlining the various functions like gen_vfp_ld() and gen_mov_F0_reg() which are hiding "if (dp) { ... } else { ... }" conditionalisation. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-04	target/arm: Convert VFP VLDR and VSTR to decodetree	Peter Maydell
	Convert the VFP single load/store insns VLDR and VSTR to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-04	target/arm: Convert VFP two-register transfer insns to decodetree	Peter Maydell
	Convert the VFP two-register transfer instructions to decodetree (in the v8 Arm ARM these are the "Advanced SIMD and floating-point 64-bit move" encoding group). Again, we expand out the sequences involving gen_vfp_msr() and gen_msr_vfp(). Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-04	target/arm: Convert "single-precision" register moves to decodetree	Peter Maydell
	Convert the "single-precision" register moves to decodetree: * VMSR * VMRS * VMOV between general purpose register and single precision Note that the VMSR/VMRS conversions make our handling of the "should this UNDEF?" checks consistent between the two instructions: * VMSR to MVFR0, MVFR1, MVFR2 now UNDEF from EL0 (previously was a nop) * VMSR to FPSID now UNDEFs from EL0 or if VFPv3 or better (previously was a nop) * VMSR to FPINST and FPINST2 now UNDEF if VFPv3 or better (previously would write to the register, which had no guest-visible effect because we always UNDEF reads) We also tighten up the decode: we were previously underdecoding some SBZ or SBO bits. The conversion of VMOV_single includes the expansion out of the gen_mov_F0_vreg()/gen_vfp_mrs() and gen_mov_vreg_F0()/gen_vfp_msr() sequences into the simpler direct load/store of the TCG temp via tcg_gen_st_f32(): we know in the new function that we're always single-precision, we don't need to use the old-and-deprecated cpu_F0* TCG globals, and we don't happen to have the declaration of gen_vfp_msr() and gen_vfp_mrs() at the point in the file where the new function is. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-04	target/arm: Convert "double-precision" register moves to decodetree	Peter Maydell
	Convert the "double-precision" register moves to decodetree: this covers VMOV scalar-to-gpreg, VMOV gpreg-to-scalar and VDUP. Note that the conversion process has tightened up a few of the UNDEF encoding checks: we now correctly forbid: * VMOV-to-gpr with U:opc1:opc2 == 10x00 or x0x10 * VMOV-from-gpr with opc1:opc2 == 0x10 * VDUP with B:E == 11 * VDUP with Q == 1 and Vn<0> == 1 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-04	target/arm: Move the VFP trans_* functions to translate-vfp.inc.c	Peter Maydell
	Move the trans_*() functions we've just created from translate.c to translate-vfp.inc.c. This is pure code motion with no textual changes (this can be checked with 'git show --color-moved'). Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-04	target/arm: Convert VCVTA/VCVTN/VCVTP/VCVTM to decodetree	Peter Maydell
	Convert the VCVTA/VCVTN/VCVTP/VCVTM instructions to decodetree. trans_VCVT() is temporarily left in translate.c. We include one minor comment syntax fix in the new trans_VCVT() function; this will allow us to move it to translate-vfp.inc.c without checkpatch complaining about the style problem in the "new" code. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-04	target/arm: Convert VRINTA/VRINTN/VRINTP/VRINTM to decodetree	Peter Maydell
	Convert the VRINTA/VRINTN/VRINTP/VRINTM instructions to decodetree. Again, trans_VRINT() is temporarily left in translate.c. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-04	target/arm: Convert VMINNM, VMAXNM to decodetree	Peter Maydell
	Convert the VMINNM and VMAXNM instructions to decodetree. As with VSEL, we leave the trans_VMINMAXNM() function in translate.c for the moment. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-03	target/arm: Convert the VSEL instructions to decodetree	Peter Maydell
	Convert the VSEL instructions to decodetree. We leave trans_VSEL() in translate.c for now as this allows the patch to show just the changes from the old handle_vsel(). In the old code the check for "do D16-D31 exist" was hidden in the VFP_DREG macro, and assumed that VFPv3 always implied that D16-D31 exist. In the new code we do the correct ID register test. This gives identical behaviour for most of our CPUs, and fixes previously incorrect handling for Cortex-R5F, Cortex-M4 and Cortex-M33, which all implement VFPv3 or better with only 16 double-precision registers. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-05-31	target/arm: Factor out VFP access checking code	Peter Maydell
	Factor out the VFP access checking code so that we can use it in the leaf functions of the decodetree decoder. We call the function full_vfp_access_check() so we can keep the more natural vfp_access_check() for a version which doesn't have the 'ignore_vfp_enabled' flag -- that way almost all VFP insns will be able to use vfp_access_check(s) and only the special-register access function will have to use full_vfp_access_check(s, ignore_vfp_enabled). Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-05-24	target/arm: Add stubs for AArch32 VFP decodetree	Peter Maydell
	Add the infrastructure for building and invoking a decodetree decoder for the AArch32 VFP encodings. At the moment the new decoder covers nothing, so we always fall back to the existing hand-written decode. We need to have one decoder for the unconditional insns and one for the conditional insns, as otherwise the patterns for conditional insns would incorrectly match against the unconditional ones too. Since translate.c is over 14,000 lines long and we're going to be touching pretty much every line of the VFP code as part of the decodetree conversion, we create a new translate-vfp.inc.c to hold the code which deals with VFP in the new scheme. It should be possible to convert this into a standalone translation unit eventually, but the conversion process will be much simpler if we simply #include it midway through translate.c to start with. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-05-23	target/arm: Fix vector operation segfault	Alistair Francis
	Commit 89e68b575 "target/arm: Use vector operations for saturation" causes this abort() when booting QEMU ARM with a Cortex-A15: 0 0x00007ffff4c2382f in raise () at /usr/lib/libc.so.6 1 0x00007ffff4c0e672 in abort () at /usr/lib/libc.so.6 2 0x00005555559c1839 in disas_neon_data_insn (insn=<optimized out>, s=<optimized out>) at ./target/arm/translate.c:6673 3 0x00005555559c1839 in disas_neon_data_insn (s=<optimized out>, insn=<optimized out>) at ./target/arm/translate.c:6386 4 0x00005555559cd8a4 in disas_arm_insn (insn=4081107068, s=0x7fffe59a9510) at ./target/arm/translate.c:9289 5 0x00005555559cd8a4 in arm_tr_translate_insn (dcbase=0x7fffe59a9510, cpu=<optimized out>) at ./target/arm/translate.c:13612 6 0x00005555558d1d39 in translator_loop (ops=0x5555561cc580 <arm_translator_ops>, db=0x7fffe59a9510, cpu=0x55555686a2f0, tb=<optimized out>, max_insns=<optimized out>) at ./accel/tcg/translator.c:96 7 0x00005555559d10d4 in gen_intermediate_code (cpu=cpu@entry=0x55555686a2f0, tb=tb@entry=0x7fffd7840080 <code_gen_buffer+126091347>, max_insns=max_insns@entry=512) at ./target/arm/translate.c:13901 8 0x00005555558d06b9 in tb_gen_code (cpu=cpu@entry=0x55555686a2f0, pc=3067096216, cs_base=0, flags=192, cflags=-16252928, cflags@entry=524288) at ./accel/tcg/translate-all.c:1736 9 0x00005555558ce467 in tb_find (cf_mask=524288, tb_exit=1, last_tb=0x7fffd783e640 <code_gen_buffer+126084627>, cpu=0x1) at ./accel/tcg/cpu-exec.c:407 10 0x00005555558ce467 in cpu_exec (cpu=cpu@entry=0x55555686a2f0) at ./accel/tcg/cpu-exec.c:728 11 0x000055555588b0cf in tcg_cpu_exec (cpu=0x55555686a2f0) at ./cpus.c:1431 12 0x000055555588d223 in qemu_tcg_cpu_thread_fn (arg=0x55555686a2f0) at ./cpus.c:1735 13 0x000055555588d223 in qemu_tcg_cpu_thread_fn (arg=arg@entry=0x55555686a2f0) at ./cpus.c:1709 14 0x0000555555d2629a in qemu_thread_start (args=<optimized out>) at ./util/qemu-thread-posix.c:502 15 0x00007ffff4db8a92 in start_thread () at /usr/lib/libpthread. This patch ensures that we don't hit the abort() in the second switch case in disas_neon_data_insn() as we will return from the first case. Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Alex Bennée <alex.bennee@linaro.org> Message-id: ad91b397f360b2fc7f4087e476f7df5b04d42ddb.1558021877.git.alistair.francis@wdc.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-05-13	target/arm: Use tcg_gen_abs_i64 and tcg_gen_gvec_abs	Richard Henderson
	Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2019-05-13	tcg: Add support for integer absolute value	Richard Henderson
	Remove a function of the same name from target/arm/. Use a branchless implementation of abs gleaned from gcc. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2019-05-13	tcg: Specify optional vector requirements with a list	Richard Henderson
	Replace the single opcode in .opc with a null-terminated array in .opt_opc. We still require that all opcodes be used with the same .vece. Validate the contents of this list with CONFIG_DEBUG_TCG. All tcg_gen_*_vec functions will check any list active during .fniv expansion. Swap the active list in and out as we expand other opcodes, or take control away from the front-end function. Convert all existing vector aware front ends. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2019-04-29	target/arm: Implement VLLDM for v7M CPUs with an FPU	Peter Maydell
	Implement the VLLDM instruction for v7M for the FPU present cas. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20190416125744.27770-26-peter.maydell@linaro.org
2019-04-29	target/arm: Implement VLSTM for v7M CPUs with an FPU	Peter Maydell
	Implement the VLSTM instruction for v7M for the FPU present case. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20190416125744.27770-25-peter.maydell@linaro.org
2019-04-29	target/arm: Implement M-profile lazy FP state preservation	Peter Maydell
	The M-profile architecture floating point system supports lazy FP state preservation, where FP registers are not pushed to the stack when an exception occurs but are instead only saved if and when the first FP instruction in the exception handler is executed. Implement this in QEMU, corresponding to the check of LSPACT in the pseudocode ExecuteFPCheck(). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20190416125744.27770-24-peter.maydell@linaro.org
2019-04-29	target/arm: Activate M-profile floating point context when FPCCR.ASPEN is set	Peter Maydell
	The M-profile FPCCR.ASPEN bit indicates that automatic floating-point context preservation is enabled. Before executing any floating-point instruction, if FPCCR.ASPEN is set and the CONTROL FPCA/SFPA bits indicate that there is no active floating point context then we must create a new context (by initializing FPSCR and setting FPCA/SFPA to indicate that the context is now active). In the pseudocode this is handled by ExecuteFPCheck(). Implement this with a new TB flag which tracks whether we need to create a new FP context. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20190416125744.27770-20-peter.maydell@linaro.org
2019-04-29	target/arm: Set FPCCR.S when executing M-profile floating point insns	Peter Maydell
	The M-profile FPCCR.S bit indicates the security status of the floating point context. In the pseudocode ExecuteFPCheck() function it is unconditionally set to match the current security state whenever a floating point instruction is executed. Implement this by adding a new TB flag which tracks whether FPCCR.S is different from the current security state, so that we only need to emit the code to update it in the less-common case when it is not already set correctly. Note that we will add the handling for the other work done by ExecuteFPCheck() in later commits. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20190416125744.27770-19-peter.maydell@linaro.org
2019-04-29	target/arm: Overlap VECSTRIDE and XSCALE_CPAR TB flags	Peter Maydell
	We are close to running out of TB flags for AArch32; we could start using the cs_base word, but before we do that we can economise on our usage by sharing the same bits for the VFP VECSTRIDE field and the XScale XSCALE_CPAR field. This works because no XScale CPU ever had VFP. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20190416125744.27770-18-peter.maydell@linaro.org
2019-04-29	target/arm: Decode FP instructions for M profile	Peter Maydell
	Correct the decode of the M-profile "coprocessor and floating-point instructions" space: * op0 == 0b11 is always unallocated * if the CPU has an FPU then all insns with op1 == 0b101 are floating point and go to disas_vfp_insn() For the moment we leave VLLDM and VLSTM as NOPs; in a later commit we will fill in the proper implementation for the case where an FPU is present. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20190416125744.27770-7-peter.maydell@linaro.org
2019-04-29	target/arm: Honour M-profile FP enable bits	Peter Maydell
	Like AArch64, M-profile floating point has no FPEXC enable bit to gate floating point; so always set the VFPEN TB flag. M-profile also has CPACR and NSACR similar to A-profile; they behave slightly differently: * the CPACR is banked between Secure and Non-Secure * if the NSACR forces a trap then this is taken to the Secure state, not the Non-Secure state Honour the CPACR and NSACR settings. The NSACR handling requires us to borrow the exception.target_el field (usually meaningless for M profile) to distinguish the NOCP UsageFault taken to Secure state from the more usual fault taken to the current security state. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20190416125744.27770-6-peter.maydell@linaro.org
2019-04-29	target/arm: Disable most VFP sysregs for M-profile	Peter Maydell
	The only "system register" that M-profile floating point exposes via the VMRS/VMRS instructions is FPSCR, and it does not have the odd special case for rd==15. Add a check to ensure we only expose FPSCR. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20190416125744.27770-5-peter.maydell@linaro.org