Decoding BLX instruction on ARM/Thumb(Android) - arm

I want to decode a BLX instruction on ARM, and I found a good answer here:
Decoding BLX instruction on ARM/Thumb (IOS)
But when I follow that answer step by step, I get the wrong result. Can anyone tell me why?
This is my test:
.plt: 000083F0 sub_83F0 ...
...
.text:00008436 FF F7 DC EF BLX sub_83F0
I parse the machine code 'FF F7 DC EF' as follows:
F7 FF EF DC
11110 1 1111111111 11 1 0 1 1111101110 0
S imm10H J1 J2 imm10L
I1 = NOT(J1 EOR S) = 1
I2 = NOT(J2 EOR S) = 1
imm32 = SignExtend(S:I1:I2:imm10H:imm10L:00)
= SignExtend(1111111111111111110111000)
= SignExtend(0x1FFFFB8)
= ?
So the offset is 0xFFB8?
But 0x83F0-0X8436-4=0xFFB6
I need your help!!!

When the target of a BLX is 32-bit ARM code, the immediate value encoded in the BLX instruction is added to align(PC,4), not the raw value of PC.
PC during execution of the BLX instruction is 0x8436 + 4 == 0x843a due to the ARM pipeline
align(0x843a, 4) == 0x8438
So:
0x00008438 + 0xffffffb8 == 0x83f0 (truncated to 32 bits)
The ARM ARM mentions this in the assembler syntax for the <label> part of the instruction:
For BLX (encodings T2, A2), the assembler calculates the required value of the offset from the Align(PC,4) value of the BLX instruction to this label, then selects an encoding that sets imm32 to that offset.
The alignment requirement can also be found by careful reading of the Operation pseudocode in the ARM ARM:
if ConditionPassed() then
    EncodingSpecificOperations();
    if CurrentInstrSet == InstrSet_ARM then
        next_instr_addr = PC - 4;
        LR = next_instr_addr;
    else
        next_instr_addr = PC;
        LR = next_instr_addr<31:1> : '1';
    if toARM then
        SelectInstrSet(InstrSet_ARM);
        BranchWritePC(Align(PC,4) + imm32); // <--- alignment of the current PC when BLX to non-Thumb ARM code
    else
        SelectInstrSet(InstrSet_Thumb);
        BranchWritePC(PC + imm32);
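The decoding steps from the question and the Align(PC,4) rule can be put together in a few lines of C. This is only a sketch: the function name blx_t2_target and its parameters are made up for illustration, and it handles just the Thumb BLX immediate (encoding T2) case discussed here.

```c
#include <stdint.h>

/* Sketch: compute the target of a Thumb-2 BLX (immediate, encoding T2).
 * hw1 and hw2 are the two halfwords in execution order (0xF7FF, 0xEFDC
 * in the question); addr is the address of the BLX instruction itself.
 * The name and signature are hypothetical, not from any library. */
uint32_t blx_t2_target(uint16_t hw1, uint16_t hw2, uint32_t addr)
{
    uint32_t s      = (hw1 >> 10) & 1;
    uint32_t imm10h = hw1 & 0x3FF;
    uint32_t j1     = (hw2 >> 13) & 1;
    uint32_t j2     = (hw2 >> 11) & 1;
    uint32_t imm10l = (hw2 >> 1) & 0x3FF;

    uint32_t i1 = !(j1 ^ s);            /* I1 = NOT(J1 EOR S) */
    uint32_t i2 = !(j2 ^ s);            /* I2 = NOT(J2 EOR S) */

    /* S:I1:I2:imm10H:imm10L:'00' is a 25-bit signed value. */
    uint32_t imm25 = (s << 24) | (i1 << 23) | (i2 << 22)
                   | (imm10h << 12) | (imm10l << 2);

    /* Sign-extend from bit 24 to 32 bits. */
    uint32_t imm32 = (imm25 ^ 0x1000000u) - 0x1000000u;

    uint32_t pc = addr + 4;             /* in Thumb state PC reads as addr + 4 */
    return (pc & ~3u) + imm32;          /* Align(PC,4) + imm32 */
}
```

With the values from the question, blx_t2_target(0xF7FF, 0xEFDC, 0x8436) gives 0x83F0, the address of sub_83F0.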

F7FF
1111011111111111
111 10 11111111111 h = 10 offset upper = 11111111111
EFDC
1110111111011100
111 01 11111011100 h = 01 blx offset lower = 11111011100
offset = 1111111111111111011100<<1
sign extended = 0xFFFFFFB8
0x00008436 + 2 + 0xFFFFFFB8 = 0x1000083F0
clip to 32 bits 0x000083F0

Related

Is there a bug in the nestest rom?

I am currently making an emulator for the NES (like many others), and while testing my emulation against the nestest ROM by Kevtris (found here: https://wiki.nesdev.com/w/index.php/Emulator_tests),
I've encountered a weird bug at instruction 877 of the nestest log (this one: http://www.qmtpro.com/~nes/misc/nestest.log, at line CE42).
The instruction is a PLA, which pulls the accumulator from the stack, with the stack pointer at $7E at the beginning. (I'm using a 1-byte value for the stack pointer, since the stack goes from 0x0100 to 0x01FF, so when I write $7E talking about the stack, it's 0x017E, not zero page ;) )
So, when PLA is executed at line 877, the stack pointer moves to $7F, and the byte there is retrieved and stored into the accumulator.
The problem is here: in the nestest log, this byte is 0x39; then, on instruction 878, which is also a PLA, the byte retrieved at $80 (stack pointer incremented by 1) is 0xCE, so the low byte and high byte are inverted compared to what I get.
The values written on the stack (0xCE39) have their origin in the JSR instruction at line CE37, and here is my implementation of the JSR opcode:
uint8_t JSR(){
    get(); // fetch the data of the opcode, like an absolute address operand or a value
    uint16_t newPC = PC - 1; // the program counter is decremented by 1
    uint8_t low = newPC & 0x00FF;
    uint8_t high = (newPC & 0xFF00) >> 8;
    write_to_stack(SP--, low); // we store the PC, highest address in stack takes the low bytes
    write_to_stack(SP--, high); // lower address on the stack takes the high bytes
    PC = new_address; // the address we read that points to the subroutine.
    return 0;
}
Here are the logs from nestest :
CE37 20 3D CE JSR $CE3D A:69 X:80 Y:01 P:A5 SP:80 PPU:233, 17 CYC:2017
CE3D BA TSX A:69 X:80 Y:01 P:A5 SP:7E PPU:251, 17 CYC:2023
CE3E E0 7E CPX #$7E A:69 X:7E Y:01 P:25 SP:7E PPU:257, 17 CYC:2025
CE40 D0 19 BNE $CE5B A:69 X:7E Y:01 P:27 SP:7E PPU:263, 17 CYC:2027
CE42 68 PLA A:69 X:7E Y:01 P:27 SP:7E PPU:269, 17 CYC:2029
CE43 68 PLA A:39 X:7E Y:01 P:25 SP:7F PPU:281, 17 CYC:2033
CE44 BA TSX A:CE X:7E Y:01 P:A5 SP:80 PPU:293, 17 CYC:2037
With my code, I have 0xCE at $7F and 0x39 at $80.
So the first PLA with my code stores 0xCE in the accumulator, and the second PLA stores 0x39, which is the inverse of what the nestest log shows.
I don't know if my JSR code is wrong; it has worked until now.
I tried swapping the low and high bytes of the program counter when they are stored on the stack, but, as expected, the instructions become invalid at the first JSR of the ROM.
So, what do you guys think I'm missing?
The mistake is not in nestest; the mistake is in your implementation of JSR and RTS!
You need to push the high byte first, and then the low byte. (This is so that the low byte can be retrieved first, and incremented while the high byte is being fetched)
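As a rough sketch of that push order, here is a minimal JSR/RTS pair in C. The mem array, the push/pull helpers, and the function names are placeholders for illustration and are not taken from the question's emulator; the sketch assumes PC still points at the JSR opcode when jsr() is called.

```c
#include <stdint.h>

static uint8_t  mem[0x10000];  /* hypothetical flat 64 KiB address space */
static uint8_t  SP;            /* 8-bit stack pointer, stack lives at 0x0100-0x01FF */
static uint16_t PC;

static void push(uint8_t value)
{
    mem[0x0100 + SP] = value;  /* write first, then decrement */
    SP--;
}

static uint8_t pull(void)
{
    SP++;                      /* increment first, then read */
    return mem[0x0100 + SP];
}

/* JSR: assumes PC points at the JSR opcode (3-byte instruction). */
void jsr(void)
{
    uint16_t target = (uint16_t)(mem[(uint16_t)(PC + 1)]
                               | (mem[(uint16_t)(PC + 2)] << 8));
    uint16_t ret = (uint16_t)(PC + 2);  /* address of the last byte of the JSR */
    push((uint8_t)(ret >> 8));          /* high byte first */
    push((uint8_t)(ret & 0xFF));        /* then low byte */
    PC = target;
}

/* RTS: pull low byte, then high byte, and resume one past the stored address. */
void rts(void)
{
    uint8_t lo = pull();
    uint8_t hi = pull();
    PC = (uint16_t)(((hi << 8) | lo) + 1);
}
```

Starting from PC = 0xCE37 and SP = 0x80 as in the log, this leaves 0xCE at 0x0180 and 0x39 at 0x017F with SP = 0x7E, so the first PLA reads 0x39 and the second reads 0xCE.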

MSP430 microcontroller - how to check addressing modes

I'm writing a simulation of a real MSP430 microcontroller in C. I got stuck on the addressing modes (https://en.wikipedia.org/wiki/TI_MSP430#MSP430_CPU), especially:
Addressing modes using R0 (PC)
Addressing modes using R2 (SR) and R3 (CG), special-case decoding
I don't understand what 0(PC), 2(SR) and 3(CG) mean. What are they?
How do I check for these values?
so for the source if the as bits are 01 and the source register bits are a 0 which is the pc for reference then
ADDR Symbolic. Equivalent to x(PC). The operand is in memory at address PC+x.
if the ad bit is a 1 and the destination is a 0 then also
ADDR Symbolic. Equivalent to x(PC). The operand is in memory at address PC+x.
x is going to be another word that follows this instruction so the cpu will fetch the next word, add it to the pc and that is the source
if the as bits are 11 and the source is register 0, the source is an immediate value which is in the next word after the instruction.
if the as bits are 01 and the source is a 2 which happens to be the SR register for reference then the address is x the next word after the instruction (&ADDR)
if the ad bit is a 1 and the destination register is a 2 then it is also an &ADDR
if the as bits are 10 the source bits are a 2, then the source is the constant value 4 and we dont have to burn a word in flash after the instruction for that 4.
it doesnt make sense to have a destination be a constant 4 so that isnt a real combination.
repeat for the rest of the table.
you can have both of these addressing modes at the same time
mov #0x5A80,&0x0120
generates
c000: b2 40 80 5a mov #23168, &0x0120 ;#0x5a80
c004: 20 01
which is
0x40b2 0x5a80 0x0120
0100000010110010
0100 opcode mov
0000 source
1 ad
0 b/w
11 as
0010 destination
so we have an as of 11 with source of 0, the immediate #x, and an ad of 1 with a destination of 2, so the destination is &ADDR. this is an important experiment because when you have 2 x values (a three word instruction, basically) you need to know which one goes with the source and which with the destination
0x40b2 0x5a80 0x0120
so the immediate 0x5a80, which is the source, is the first x to follow the instruction, then the destination address 0x0120 comes after that.
if it were just an immediate and a register then
c006: 31 40 ff 03 mov #1023, r1 ;#0x03ff
0x4031 0x03FF
0100000000110001
0100 mov
0000 source
0 ad
0 b/w
11 as
0001 dest
as of 11 and source of 0 is #immediate the X is 0x03FF in this case the word that follows. the destination is ad of 0
Register direct. The operand is the contents of Rn
where destination in this case is r1
so the first group Rn, x(Rn), @Rn and @Rn+ are the normal cases. the ones below that, the ones you are asking about, are special cases: if you get a combination that fits into a special case then you do that, otherwise you do the normal case like the mov immediate to r1 example above. the destination of r1 was a normal Rn case.
As=01, Ad=1, R0 (ADDR): This is exactly the same as x(Rn), i.e., the operand is in memory at address R0+x.
This is used for data that is stored near the code that uses it, when the compiler does not know at which absolute address the code will be located, but it knows that the data is, e.g., twenty words behind the instruction.
As=11, R0 (#x): This is exactly the same as @R0+, and is used for instructions that need a word of data from the instruction stream. For example, this assembler instruction:
MOV #1234, R5
is actually encoded and implemented as:
MOV @PC+, R5
.dw 1234
After the CPU has read the MOV instruction word, PC points to the data word. When reading the first MOV operand, the CPU reads the data word, and increments PC again.
As=01, Ad=1, R2 (&ADDR): this is exactly the same as x(Rn), but the R2 register reads as zero, so what you end up with is the value of x.
Using the always-zero register allows to encode absolute addresses without needing a special addressing mode for this (just a special register).
constants -1/0/1/2/4/8: it would not make sense to use the SR and CG registers with most addressing modes, so these encodings are used to generate special values without a separate data word, to save space:
encoding:          what actually happens:
MOV @SR, R5        MOV #4, R5
MOV @SR+, R5       MOV #8, R5
MOV CG, R5         MOV #0, R5
MOV x(CG), R5      MOV #1, R5    (no word for x)
MOV @CG, R5        MOV #2, R5
MOV @CG+, R5       MOV #-1, R5
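For a simulator written in C, the special cases above could be checked before falling back to the normal Rn cases. The following is a sketch only: the enum, struct, and function names are invented for illustration, and it covers just the source operand (register number plus the two As bits).

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical classification of a source operand. */
typedef enum {
    SRC_REGISTER,      /* Rn                       */
    SRC_INDEXED,       /* x(Rn)                    */
    SRC_SYMBOLIC,      /* ADDR,  i.e. x(PC)        */
    SRC_ABSOLUTE,      /* &ADDR, i.e. x(SR as 0)   */
    SRC_INDIRECT,      /* @Rn                      */
    SRC_INDIRECT_INC,  /* @Rn+                     */
    SRC_IMMEDIATE,     /* #x,    i.e. @PC+         */
    SRC_CONSTANT       /* constant generator value */
} src_kind;

typedef struct {
    src_kind kind;
    int16_t  constant;        /* valid only for SRC_CONSTANT */
    bool     needs_ext_word;  /* true if a word follows the instruction */
} src_mode;

/* reg: source register number 0..15, as: the two As bits 0..3. */
src_mode decode_src(unsigned reg, unsigned as)
{
    static const int16_t cg_r3[4] = { 0, 1, 2, -1 };
    src_mode m = { SRC_REGISTER, 0, false };

    if (reg == 3) {                        /* R3 (CG): always a constant */
        m.kind = SRC_CONSTANT;
        m.constant = cg_r3[as & 3];
        return m;
    }
    if (reg == 2 && as >= 2) {             /* R2 (SR) with As=10/11: 4 or 8 */
        m.kind = SRC_CONSTANT;
        m.constant = (as == 2) ? 4 : 8;
        return m;
    }
    switch (as & 3) {
    case 0: m.kind = SRC_REGISTER; break;
    case 1:                                /* x follows the instruction */
        m.kind = (reg == 0) ? SRC_SYMBOLIC
               : (reg == 2) ? SRC_ABSOLUTE : SRC_INDEXED;
        m.needs_ext_word = true;
        break;
    case 2: m.kind = SRC_INDIRECT; break;
    case 3:
        m.kind = (reg == 0) ? SRC_IMMEDIATE : SRC_INDIRECT_INC;
        m.needs_ext_word = (reg == 0);     /* the immediate is the next word */
        break;
    }
    return m;
}
```

For the instruction word 0x40b2 from the example, the source register is (0x40b2 >> 8) & 0xF = 0 and As is (0x40b2 >> 4) & 3 = 3, which this sketch classifies as an immediate with one extension word.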

What would be the C equivalent of rlwinm (PPC Instruction)

I was wondering if any of you would know the C equivalent of the PowerPC instruction below.
rlwinm r31, r0, 0,13,13
Thanks.
rlwinm is Rotate Left Word Immediate then AND with Mask.
The rotate count is 0 here, so we can ignore it. The mask has all bits set from 13 to 13, which is just bit 13. Note that PowerPC numbers bits from the most significant end (bit 0 is the MSB), so bit 13 corresponds to 1 << (31 - 13), i.e. 0x00040000 as a bitmask (this instruction was probably chosen over a plain and to document that bit 13 is selected).
So in this case, we need to build a mask for bit 13 and then apply bitwise and with the source.
r31 = r0 & (1u << (31 - 13));
<< is the shift left operation in C; we use it here to create a mask just for bit 13. & is the and operation in C.
Documentation source: http://sametwice.com/rlwinm
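For other operand values, the instruction can be modelled generically. The sketch below assumes the 32-bit form of rlwinm and PowerPC's bit numbering, where bit 0 is the most significant bit; the helper names rotl32 and ppc_mask are made up for illustration.

```c
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (n taken modulo 32). */
static uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31;
    return n ? (x << n) | (x >> (32 - n)) : x;
}

/* Build the mask with bits mb..me set, using PowerPC numbering
 * (bit 0 = MSB, bit 31 = LSB). If mb > me the mask wraps around. */
static uint32_t ppc_mask(unsigned mb, unsigned me)
{
    uint32_t from_mb = 0xFFFFFFFFu >> mb;         /* bits mb..31 */
    uint32_t to_me   = 0xFFFFFFFFu << (31 - me);  /* bits 0..me  */
    return (mb <= me) ? (from_mb & to_me) : (from_mb | to_me);
}

/* Sketch of rlwinm rA, rS, SH, MB, ME: returns the value written to rA. */
uint32_t rlwinm(uint32_t rs, unsigned sh, unsigned mb, unsigned me)
{
    return rotl32(rs, sh) & ppc_mask(mb, me);
}
```

Under these assumptions, rlwinm(r0, 0, 13, 13) reduces to r0 & 0x00040000.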

Understanding condition code flag setting in assembly

If I have the following table:
Case 1: x: 42 y: -15 (y-x) = -57
Case 2: x: -17 y: -17 (y-x) = 0
Case 3: x: 0x7ffffffd y: -67 (y-x) = 2147483584
Case 4: x: 67 y: -0x7ffffffd (y-x) = 2147483584
What would the condition code flags be set to (zero or one, per flag) for ZF, SF, OF and CF
when considering the instruction: cmpl %eax, %ecx if %eax contains x and %ecx contains y?
I understand that cmpl ...,... is executed as: cmpl SRC2,SRC1
which means: "sets condition codes of SRC1 - SRC2"
I understand that the flags represent:
OF = overflow (?)
ZF = zero flag i.e. zero...
CF = carry out from msb
SF - sign flag i.e. negative
For my four cases in the table, I believe the flags would be:
1) ZF = 0 SF = 1 CF = 0 OF = ?
2) ZF = 1 SF = 0 CF = 0 OF = ?
3) ZF = 0 SF = 0 CF = 1 OF = ?
4) ZF = 0 SF = 0 CF = 1 OF = ?
Am I correct? Please explain what CF and OF are and how to determine if either will be set TRUE, and correct any of my flawed understanding. Thank you.
A carry occurs when an arithmetic operation generates a carry out of the most significant bit that cannot fit into the register. So if you had 8-bit registers, and wanted to add 10000000 and 10000000 (unsigned):
  10000000
+ 10000000
----------
 100000000
The leading 1 is the carry out of the most significant bit, and thus sets CF = 1.
You might also want to check this other answer.
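To check the four cases from the question, the flags can be computed by hand or with a small C model. The sketch below assumes x86 semantics for a 32-bit compare: the result of dst - src is discarded, CF is set when the unsigned subtraction needs a borrow, and OF is set when the signed result does not fit in 32 bits. The struct and function names are invented for illustration.

```c
#include <stdint.h>

typedef struct { int zf, sf, cf, of; } flags;

/* Sketch of the flags after "cmpl src, dst" (AT&T order): dst - src. */
flags cmpl_flags(uint32_t src, uint32_t dst)
{
    uint32_t r = dst - src;   /* wraps modulo 2^32 */
    flags f;
    f.zf = (r == 0);
    f.sf = (int)((r >> 31) & 1);
    f.cf = (dst < src);       /* unsigned borrow */
    /* Signed overflow: operands have different signs and the result's
     * sign differs from dst's sign. */
    f.of = (int)((((dst ^ src) & (dst ^ r)) >> 31) & 1);
    return f;
}
```

Under these assumptions, calling cmpl_flags(x, y) for the table gives: case 1 ZF=0 SF=1 CF=0 OF=0; case 2 ZF=1 SF=0 CF=0 OF=0; cases 3 and 4 ZF=0 SF=0 CF=0 OF=1. In cases 3 and 4 a negative number minus a positive number produces a positive result, which is what sets OF.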

Setting text and background in Assembly intel

I have a programming assignment to run through and set the background and text of all the possible combinations. I am using a predefined function called SetTextColor which basically sets the values like this:
mov eax, white + (blue * 16)
Essentially this sets the text white and the background blue (to set the background you multiply by 16). In total there are 16 x 16 = 256 combinations.
TITLE BACKGROUND COLORS (main.asm)
; Description: T
; Author: Chad Peppers
; Revision date: June 21, 2012
INCLUDE Irvine32.inc
.data
COUNT = 16
COUNT2 = 16
LCOUNT DWORD ?
val1 DWORD 0
val2 DWORD 0
.code
main PROC
mov ecx, COUNT
L1:
mov LCOUNT, ecx
mov ecx, COUNT2
L2:
mov eax, val1 + (val2 * 16)
call SetTextColor
inc val2
Loop L2
mov ecx, LCOUNT
Loop L1
call DumpRegs
exit
main ENDP
END main
Basically I am doing a nested loop. My thinking is that I simply do a 1 * (1 * 16), then inc the value in a nested loop until 1 * (16 * 16).
I am getting the error A2026: constant expected
I imagine the error you are getting is at this line:
mov eax, val1 + (val2 * 16)
You just can't do that. If you intend to multiply val2 by 16 and then add val1 to the result, then you need to implement it step by step (you may come across addressing in the form of a+b*c but a and c need to be registers and b can only be 2, 4 or 8, not 16). Try replacing this line with something like this:
mov eax, val2
imul eax, 16
add eax, val1
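The arithmetic behind those three instructions, and the 256 combinations the assignment asks for, can be sketched in C. This only models the value that ends up in EAX; the function names are hypothetical, and it assumes each color is a number from 0 to 15.

```c
#include <stdint.h>

/* Pack a text color and a background color (each assumed 0..15)
 * into the single value passed in EAX: text + background * 16. */
uint32_t color_attr(uint32_t text, uint32_t background)
{
    return text + background * 16;
}

/* Fill out[256] with every combination, background in the outer loop
 * and text color in the inner loop. */
void all_color_attrs(uint32_t out[256])
{
    for (uint32_t bg = 0; bg < 16; bg++) {
        for (uint32_t fg = 0; fg < 16; fg++) {
            out[bg * 16 + fg] = color_attr(fg, bg);
        }
    }
}
```

Note that the inner counter restarts at 0 for each pass of the outer loop, and the outer value advances once per pass, which is what produces all 256 distinct values.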
