My new favourite AArch64 CPU instruction: rotate then merge into flags (RMIF)
I find myself writing some CPU emulators at the moment, which has caused the AArch64 (a.k.a. ARM64) RMIF instruction to become my new favourite instruction. It takes a 64-bit general-purpose register, rotates it right by a specified number of bits, then selectively merges the low four bits into the flags register: bit 3 goes to N, bit 2 to Z, bit 1 to C, and bit 0 to V. A 6-bit field in the instruction gives the rotate amount, and a 4-bit field in the instruction gives a mask, in the same N, Z, C, V bit order, of which flag bits to overwrite versus which to leave unchanged.
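To make the semantics concrete, here is a minimal Python sketch of what the instruction computes. The function name `rmif`, the constant `MASK64`, and the choice to pass the flags around as a 4-bit integer (N=8, Z=4, C=2, V=1) are my own conventions for illustration, not anything from an official model.

```python
MASK64 = (1 << 64) - 1  # hypothetical helper constant: 64-bit register width


def rmif(xn: int, shift: int, mask: int, nzcv: int) -> int:
    """Illustrative model of RMIF; returns the new 4-bit NZCV value."""
    assert 0 <= shift < 64 and 0 <= mask < 16 and 0 <= nzcv < 16
    xn &= MASK64
    # Rotate the 64-bit register right by `shift` bits.
    rotated = ((xn >> shift) | (xn << (64 - shift))) & MASK64
    # Flags selected by `mask` are taken from the low four bits of the
    # rotated value; unselected flags keep their old values.
    return (rotated & mask) | (nzcv & ~mask & 0xF)
```

For instance, `rmif(0xF0, 4, 0b0011, 0b1100)` overwrites only C and V, leaving N and Z as they were.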
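One use of `rmif` is to emulate the x86 `bt reg, imm` instruction, which extracts one bit from a general-purpose register, writes that bit to the C flag, and leaves other flags unchanged. Thus `bt reg, imm` in x86 becomes `rmif reg, #((imm - 1) & 63), #2` in AArch64. The C flag is taken from bit 1 of the rotated value, so rotating right by imm - 1 (modulo 64) puts bit imm there, and the mask of 2 selects only C.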
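As a sanity check on the rotate amount, the following sketch models the translation in Python. The helpers `rmif` and `bt_via_rmif` are hypothetical names, and the flags are again assumed to be a 4-bit integer with N=8, Z=4, C=2, V=1.

```python
MASK64 = (1 << 64) - 1  # hypothetical helper constant: 64-bit register width


def rmif(xn: int, shift: int, mask: int, nzcv: int) -> int:
    """Illustrative model of RMIF; returns the new 4-bit NZCV value."""
    xn &= MASK64
    rotated = ((xn >> shift) | (xn << (64 - shift))) & MASK64
    return (rotated & mask) | (nzcv & ~mask & 0xF)


def bt_via_rmif(reg: int, imm: int, nzcv: int) -> int:
    """Model of `rmif reg, #((imm - 1) & 63), #2` standing in for x86 bt."""
    # Rotating right by imm - 1 (mod 64) lands bit `imm` in bit 1, the C slot.
    return rmif(reg, (imm - 1) & 63, 0b0010, nzcv)
```

Looping `imm` over 0 to 63 and comparing the resulting C bit against `(reg >> imm) & 1` covers every case, including the wrap-around at `imm = 0`, where the rotate amount becomes 63.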
At the other end of the spectrum is the x86 `inc` instruction, which adds 1 to a general-purpose register, and then sets most flags based on this addition, but leaves the C flag unchanged. To emulate `inc reg`, we can first save off the old value of the C flag (via `csinc tmp, wzr, wzr, cc` or `adc tmp, wzr, wzr`), then do `adds reg, reg, #1` to perform the addition and set all the flags, then `rmif tmp, #63, #2` to restore the old value of the C flag. Rotating right by 63 is the same as rotating left by 1, which moves the saved carry from bit 0 of `tmp` up to bit 1, where the mask of 2 picks it up as C.
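The three-instruction sequence can be modelled in Python as below. This is only a sketch of the AArch64 NZCV side: `adds_flags`, `emulate_inc`, and `rmif` are hypothetical helper names, the register is assumed to be 64 bits wide, and the x86 flags with no NZCV counterpart are ignored.

```python
MASK64 = (1 << 64) - 1  # hypothetical helper constant: 64-bit register width


def rmif(xn: int, shift: int, mask: int, nzcv: int) -> int:
    """Illustrative model of RMIF; returns the new 4-bit NZCV value."""
    xn &= MASK64
    rotated = ((xn >> shift) | (xn << (64 - shift))) & MASK64
    return (rotated & mask) | (nzcv & ~mask & 0xF)


def adds_flags(a: int, b: int) -> tuple[int, int]:
    """Model of a 64-bit ADDS: returns (result, nzcv)."""
    result = (a + b) & MASK64
    n = result >> 63
    z = int(result == 0)
    c = int(a + b > MASK64)                       # unsigned carry out
    v = ((a ^ result) & (b ^ result)) >> 63       # signed overflow
    return result, (n << 3) | (z << 2) | (c << 1) | v


def emulate_inc(reg: int, nzcv: int) -> tuple[int, int]:
    tmp = (nzcv >> 1) & 1                 # adc tmp, wzr, wzr  (tmp = old C)
    reg, nzcv = adds_flags(reg, 1)        # adds reg, reg, #1
    nzcv = rmif(tmp, 63, 0b0010, nzcv)    # rmif tmp, #63, #2  (restore C)
    return reg, nzcv
```

Incrementing an all-ones register is the interesting case: `adds` on its own would set C, but after the `rmif` step C is whatever it was before.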
As another example, the AArch32 `muls` instruction sets the N and Z flags based on the result of the multiplication, but leaves the C and V flags unchanged. To emulate this on AArch64, we can save off all the flags (`mrs tmp, NZCV`), then do the multiplication, then set N and Z based on the result but also clobber C and V (`ands wzr, dst, dst` or `adds wzr, dst, #0`), then restore the old values of C and V (`rmif tmp, #28, #3`). The `mrs` places the flags in bits 31:28 of `tmp`, so rotating right by 28 brings them down to bits 3:0, and the mask of 3 selects just C and V.
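Here is the same sequence as a small Python model, assuming a 32-bit multiply and the `ands` variant for setting N and Z. As before, `rmif` and `emulate_muls` are hypothetical names and the flags are represented as a 4-bit integer with N=8, Z=4, C=2, V=1.

```python
MASK64 = (1 << 64) - 1  # hypothetical helper constant: 64-bit register width


def rmif(xn: int, shift: int, mask: int, nzcv: int) -> int:
    """Illustrative model of RMIF; returns the new 4-bit NZCV value."""
    xn &= MASK64
    rotated = ((xn >> shift) | (xn << (64 - shift))) & MASK64
    return (rotated & mask) | (nzcv & ~mask & 0xF)


def emulate_muls(a: int, b: int, nzcv: int) -> tuple[int, int]:
    tmp = nzcv << 28                      # mrs tmp, NZCV  (flags in bits 31:28)
    dst = (a * b) & 0xFFFFFFFF            # 32-bit multiply
    n = dst >> 31                         # ands wzr, dst, dst sets N and Z...
    z = int(dst == 0)
    nzcv = (n << 3) | (z << 2)            # ...and clears C and V
    nzcv = rmif(tmp, 28, 0b0011, nzcv)    # rmif tmp, #28, #3  (restore C, V)
    return dst, nzcv
```

Calling `emulate_muls(0, 5, 0b0011)` gives a zero result with Z set while C and V come back as they were on entry.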