What is the issue with my branches? (ARM Assembly) - arm

I am working with the following code right now:
push {r1-r2, lr}
mov r1, r0
ldrb r2, [r1]
cmp r2, #'0'
blt notNum
cmpgt r2, #'9'
bgt notNum
ldrltb r2, [r1, #1]
cmplt r2, #0
beq isNum
bne notNum
isNum:
mov r0, #1
notNum:
mov r0, #0
The purpose of this particular code is to take in a string stored in r0 and test whether it is a number or not. The issue I am having at present is that for some reason the first cmp is always resulting in branching to notNum via blt. I've done some gdb testing and assuming my logic is correct (cmp = input1 - input2) then the compare flag would be greater than. Any insight into my issue would be greatly appreciated.
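The "cmp = input1 - input2" model mentioned above can be checked on a host machine. The following is a minimal C sketch, not part of the original code, that computes the N, Z, C and V flags the way a 32-bit ARM `cmp` does and then evaluates the `lt` and `gt` conditions from them. The names `Flags`, `arm_cmp`, `cond_lt` and `cond_gt` are hypothetical helpers made up for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the four ARM condition flags. */
typedef struct {
    bool n; /* result is negative (bit 31 set) */
    bool z; /* result is zero */
    bool c; /* no borrow occurred (a >= b as unsigned) */
    bool v; /* signed overflow occurred */
} Flags;

/* Model of "cmp a, b": compute a - b and keep only the flags. */
Flags arm_cmp(uint32_t a, uint32_t b)
{
    uint32_t r = a - b;
    Flags f;
    f.n = (r >> 31) & 1u;
    f.z = (r == 0);
    f.c = (a >= b);
    /* Overflow: operands differ in sign and the result's sign differs from a. */
    f.v = (((a ^ b) & (a ^ r)) >> 31) & 1u;
    return f;
}

/* blt is taken when N != V (signed less than). */
bool cond_lt(Flags f) { return f.n != f.v; }

/* bgt is taken when Z is clear and N == V (signed greater than). */
bool cond_gt(Flags f) { return !f.z && f.n == f.v; }
```

With this model, `cond_lt(arm_cmp('5', '0'))` is false, so if the branch is taken on real hardware it is worth checking in gdb what value is actually in r2 at the time of the compare.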

Related

ARM - Invalid Immediate operand value

The task is to store 12 in R1 and 27 in R2, subtract R1 from R2, and store the result at memory address 0x4000. Lastly, store R1 at 0x4004 and R2 at 0x4008. However, I get "Invalid immediate operand value" on MOV R5, #0x4004 and MOV R6, #0x4008.
MOV R2, #27
SUB R3, R2, R1
MOV R4, #0x4000
STR R3, [R4]
MOV R5, #0x4004
MOV R6, #0x4008
STR R1, [R5]
STR R2, [R6]
MOV R2, #27
SUB R3, R2, R1
MOV R4, #0x4000
STR R3, [R4, #0]
STR R1, [R4, #4]
STR R2, [R4, #8]
According to this tutorial, it looks like immediate values in 32-bit ARM are restricted to "neat" numbers which can be represented as a byte value rotated by some even amount. A quick check of 0x4000 yields 0b100000000000000, which can be represented as 0x1 shifted left by 14 bits. Your values 0x4004 and 0x4008 don't fall under this category: 0x4004 is 0b100000000000100 and 0x4008 is 0b100000000001000, and in each case the set bits are too far apart to fit in a single byte.
Since those specific values seem important, you could try adding to the value in R4 and saving those values to R5 and R6.
MOV R2, #27
SUB R3, R2, R1
MOV R4, #0x4000
STR R3, [R4]
ADD R5, R4, #0x4
ADD R6, R4, #0x8
STR R1, [R5]
STR R2, [R6]
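The encoding rule described above (an 8-bit value rotated by an even amount) can be tested on a host machine. This is a minimal C sketch, assuming the classic 32-bit ARM data-processing immediate format; `rotl32` and `arm_imm_encodable` are hypothetical names introduced only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (n in 0..31). */
static uint32_t rotl32(uint32_t v, unsigned n)
{
    n &= 31u;
    return n ? (v << n) | (v >> (32u - n)) : v;
}

/*
 * An ARM immediate is an 8-bit value rotated right by an even amount.
 * Undoing the rotation means rotating left by the same even amount,
 * so the value is encodable if some even left rotation fits in 8 bits.
 */
bool arm_imm_encodable(uint32_t value)
{
    for (unsigned rot = 0; rot < 32; rot += 2) {
        if (rotl32(value, rot) <= 0xFFu) {
            return true;
        }
    }
    return false;
}
```

Under this check 0x4000 is encodable while 0x4004 and 0x4008 are not, which matches the assembler error. The small offsets 0x4 and 0x8 used in the ADD instructions are encodable on their own.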
If you'd like more information, you can check out ARM's own documentation on immediate values here. Just make sure to check that the version of the ISA you're using is the same as the one in the documentation. In the future, try to give as much detail as possible about your environment, since there are many versions of the ARM ISA (for example, the 16-bit thumb version is very different compared to the 32 or 64 bit versions).

How to do the if or in assembly?

I am trying to translate the following code to assembly!
int i = 1;
int a = 3;
int b;
if(i == 1 || a == 3)
b = 95;
else
b = 0;
I am confused about the part where I have to use or in the if statement. Do you guys have any suggestions?
ldr r0, [r13, #0] //i = 1
ldr r1, [r13, #4] //a = 3
mov r2, #1 //put 1 in r2
mov r3, #3 //put 3 in r3
cmp r0, r2 //compare i and 1
orr r1, r3 //or a and 3
bgt else //if false branch to else
ldr r4 #0 // put 0 in r4
str r4, [r13, #8] //store it at location 208 with r13
b endif //branch to else if if true
ldr r5 #95 //put 95 on r5
str r5 [r13, #12] //store 95 on location 212 with r13
So far I have this!
Honestly looks wrong! So you can roast me I am here to learn so please teach me! :)
I don't recognize the assembly language. But the pseudo-code would be:
compare i with 1
if true, jump to if
compare a with 3
if true, jump to if
else:
store 0 in b
jump to endif
if:
store 95 in b
endif:
This also implements the short-circuiting of ||, since a == 3 is only tested if i == 1 fails.
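The same control flow can be written in C with explicit jumps, which makes it easy to check against the original `if` before translating it to any particular assembly language. This is only a sketch: the function name `or_with_branches` and the label names are made up for the illustration.

```c
/* Mirrors the pseudo-code: two tests that jump to the same label. */
int or_with_branches(int i, int a)
{
    int b;

    if (i == 1) goto if_body;  /* first test true: skip the second test */
    if (a == 3) goto if_body;  /* only reached when i != 1 */

    b = 0;                     /* else part: neither test was true */
    goto endif;

if_body:
    b = 95;                    /* at least one test was true */

endif:
    return b;
}
```

Each `if (...) goto` line corresponds to one compare followed by one conditional branch.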
Writing assembly code by hand quickly gets out of hand and turns into goto spaghetti, and a modern compiler does a better job of optimizing. That said, sometimes you want to write a few lines of assembler for some other reason.
I don't think you must load the constants into registers first and the cost of assigning a value to a register is low compared to conditional branching.
My approach here would be
Store the value of one of the braces (95) in one register.
Compare i with 1 and a with 3, and branch if either is equal
Overwrite the register with 0 if branch was not taken
Store the contents of the register in area of variable b
begin:
mov r5, #95 // assume the condition holds: b = 95
ldr r0, [r13, #0] // load i
ldr r1, [r13, #4] // load a
cmp r0, #1 // compare i and 1
beq store_b // i == 1: keep 95
cmp r1, #3 // compare a and 3
beq store_b // a == 3: keep 95
mov r5, #0 // neither was equal: b = 0
store_b:
str r5, [r13, #8] // store r5 in b
Edit: I found this cheat sheet. It looks like there are conditional variants of the mov instruction, so the code could be written like this, without any jumps at all:
begin:
mov r5, #0 // default: b = 0
ldr r0, [r13, #0] // load i
ldr r1, [r13, #4] // load a
cmp r0, #1 // compare i and 1
moveq r5, #95 // conditional move: i == 1, so b = 95
cmp r1, #3 // compare a and 3
moveq r5, #95 // conditional move: a == 3, so b = 95
str r5, [r13, #8] // store r5 in b
If i equals 1 then branch to code that assigns 95 to b. Otherwise, if a is 3, branch to that same code. Otherwise, assign 0 to b and branch to just after the other assignment.
I find the comments in another answer interesting and disturbing at the same time. And more interesting that that answer did not simply ask a compiler.
int fun ( int i, int a )
{
int b;
if(i == 1 || a == 3)
b = 95;
else
b = 0;
return b;
}
00000000 <fun>:
0: e3510003 cmp r1, #3
4: 13500001 cmpne r0, #1
8: 03a0005f moveq r0, #95 ; 0x5f
c: 13a00000 movne r0, #0
10: e12fff1e bx lr
so that means
ldr r0, [sp, #0] //i
ldr r1, [sp, #4] //a
cmp r0, #1 //compare i with 1, interested in equal or not
cmpne r1, #3 //if not equal then test a with 3, interested in equal or not
moveq r5, #95 //if either of the two were equal set b = 95
movne r5, #0 //if neither of the two were equal set b = 0
which is this machine code
0: e59d0000 ldr r0, [sp]
4: e59d1004 ldr r1, [sp, #4]
8: e3500001 cmp r0, #1
c: 13510003 cmpne r1, #3
10: 03a0505f moveq r5, #95 ; 0x5f
14: 13a05000 movne r5, #0
Start with the ARM Architecture Reference Manual for ARMv5 to get your feet wet with the basic 32-bit ARM instructions (and the base Thumb instructions common to all Thumb variants). Notice in that documentation that the top four bits of each instruction hold the condition code, so almost all instructions can be conditionally executed (to avoid branches for if-then-else type things).
0: e3a0505f mov r5, #95 ; 0x5f
4: 03a0505f moveq r5, #95 ; 0x5f
8: 13a0505f movne r5, #95 ; 0x5f
c: e3500001 cmp r0, #1
10: 03500001 cmpeq r0, #1
14: 13500001 cmpne r0, #1
18: c3500001 cmpgt r0, #1
1c: b3500001 cmplt r0, #1
See how the first 4 bits change but the other 28 do not? A feature you see in ARM instruction sets specifically and not necessarily in others. Some others have similar features though.
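To make the pattern in the listing above concrete, here is a small C sketch that splits a 32-bit ARM instruction word into its condition field and the remaining 28 bits. The function names `arm_condition` and `arm_without_condition` are hypothetical; the mnemonic table follows the standard ARM condition code numbering.

```c
#include <stdint.h>

/* Return the condition mnemonic held in bits 31..28 of an ARM instruction. */
const char *arm_condition(uint32_t insn)
{
    static const char *const names[16] = {
        "eq", "ne", "cs", "cc", "mi", "pl", "vs", "vc",
        "hi", "ls", "ge", "lt", "gt", "le", "al",
        "nv" /* 0xF: "never" in early ARM, later reused for unconditional instructions */
    };
    return names[insn >> 28];
}

/* Return the instruction with the condition field masked off. */
uint32_t arm_without_condition(uint32_t insn)
{
    return insn & 0x0FFFFFFFu;
}
```

Applied to the words above, 0xe3a0505f, 0x03a0505f and 0x13a0505f decode as "al", "eq" and "ne", and all three share the same lower 28 bits.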
Not heard of a32 instruction set, so it is not clear which of the handful or more of the arm instruction sets you are using. The above works on armv4t through armv7-a. But tell a modern compiler to build for armv7-a it is likely going to build thumb first then arm only if you can force it. See the ARM Architectural Reference Manual for armv7-ar (it also shows all the way back to armv4t each instruction indicating which architectures are supported).
This is arm code as well that runs on some arm processors:
0: 2903 cmp r1, #3 compare a with 3
2: bf18 it ne these two
4: 2801 cmpne r0, #1 do an if not equal then compare i with 1
6: bf0c ite eq these three do a
8: 205f moveq r0, #95 ; 0x5f if either are equal b = 95
a: 2000 movne r0, #0 else b = 0
c: 4770 bx lr
e: bf00 nop
(just to show that it matters very much which specific instruction set a question is asking about and for ARM which of the ARM instruction sets)
You are basically wanting to do a
if i == 1 set the z flag
else if a == 3 set the z flag
if the z flag is set (from either of the above) b = 95
else b = 0
There are many basic ways to do this and Simson's answer is a clean straightforward approach that saves a branch or two.
mov r5,#95
ldr r0, [sp, #0] //i
ldr r1, [sp, #4] //a
// if i == 1
cmp r0,#1
beq skip //equal, keep the 95
// or if a == 3
cmp r1,#3
beq skip //equal, keep the 95
// else
mov r5,#0 //neither were equal
skip:
str r5, [r13, #12]
I was focused on that answer, but looking at yours did you mean to place the result in two different places based on the result?
ldr r4 #0 // put 0 in r4
str r4, [r13, #8] //store it at location 208 with r13
ldr r5 #95 //put 95 on r5
str r5 [r13, #12] //store 95 on location 212 with r13
That breaks Simson's answer. And mine above.
Most folks would start with this, easy to read and follow, brute force straight from the high level code.
ldr r0, [sp, #0] //i
ldr r1, [sp, #4] //a
// if i == 1
check_i:
cmp r0,#1
bne check_a
b one_equal //folks will forget to do this one
check_a:
// or if a == 3
cmp r1,#3
beq one_equal
bne neither_equal //or just fall through
// else
neither_equal:
mov r4,#0
str r4, [r13, #8]
b the_end //many folks forget this branch
one_equal:
mov r5,#95
str r5, [r13, #12]
the_end:
Or something like it which can then be shortened slightly into this, some folks would start with something like this:
ldr r0, [sp, #0] //i
ldr r1, [sp, #4] //a
// if i == 1
cmp r0,#1
beq one_equal
// or if a == 3
cmp r1,#3
beq one_equal
// else
neither_equal:
mov r4,#0
str r4, [r13, #8]
b the_end //many folks forget this one
one_equal:
mov r5,#95
str r5, [r13, #12]
the_end:
Here is where you start to go off the rails
cmp r0, r2 //this is a valid starting point
orr r1, r3 //orr is a logical or, not an if this "or" that
// so we are confused by what you are doing here
bgt else //you are wanting to know if it is equal or not, not if greater
// than
It does not get any better after that
If you really meant the result in two different places then:
Still get the variables into registers from the stack
ldr r0, [sp, #0] //i
ldr r1, [sp, #4] //a
This still does an if this is equal or that is equal
cmp r0, #1 //is i == 1?
cmpne r1, #3 //if not then is a == 3?
You end up here with z set if either one is equal or z clear if neither are equal
moveq r5,#95 //one or the other is equal
streq r5,[r13, #12] //one or the other is equal
movne r4,#0 //neither are equal
strne r4,[r13, #8] //neither are equal
Final result:
ldr r0, [sp, #0] //i
ldr r1, [sp, #4] //a
cmp r0, #1 //is i == 1?
cmpne r1, #3 //if not then is a == 3
moveq r5,#95 //one or the other is equal
streq r5,[r13, #12] //one or the other is equal
movne r4,#0 //neither are equal
strne r4,[r13, #8] //neither are equal
It assembles fine, so the syntax is good
0: e59d0000 ldr r0, [sp]
4: e59d1004 ldr r1, [sp, #4]
8: e3500001 cmp r0, #1
c: 13510003 cmpne r1, #3
10: 03a0505f moveq r5, #95 ; 0x5f
14: 058d500c streq r5, [sp, #12]
18: 13a04000 movne r4, #0
1c: 158d4008 strne r4, [sp, #8]
I have edited this so many times that I hope I did not leave any mistakes; I am sure I will get beaten up for it if I did.
Before doing any assembly language you need the proper documentation. In this case you want one of the ARM Architecture Reference Manuals, likely the oldest one, the ARMv5 manual, which is directly derived from the printed versions that existed before PDFs were distributed.
In general you will see a compiler will do the opposite and jump over
if(x==1)
{
y = 5;
}
cmp r0,#1
bne skip //C code is equal so branch if not
mov r1,#5
skip:
If you had if ((i==1)&&(a==3)) you would also want to look at the opposite, skip over if (i!=1) skip over if (a!=3) having the two paths skip to a common label.
But in the case of a "this OR that" you want both tests to land in the same place by branching to a common label, and then fall through to the else code if neither is true. You do that with the comparisons as written: if i == 1 branch to the label, if a == 3 branch to the label.
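The AND case can be sketched in C with explicit jumps to show the inverted tests skipping to a common label. This is an illustration only: the function name `and_with_branches` is made up, and the values 95 and 0 are borrowed from the question rather than taken from the AND example itself.

```c
/* if ((i == 1) && (a == 3)) b = 95; written with inverted tests. */
int and_with_branches(int i, int a)
{
    int b = 0;               /* value when the body is skipped (assumed) */

    if (i != 1) goto skip;   /* first condition fails: skip the body */
    if (a != 3) goto skip;   /* second condition fails: skip the body */

    b = 95;                  /* both conditions held */

skip:
    return b;
}
```

Compared with the OR case, the tests are inverted and the common label sits after the body instead of in front of it.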

C - occasional CPU stall during memcmp on Cortex-R5

I'm running some tests on a Cortex-R5 (Ultrascale MpSoC). It basically generates 2 random numbers with a hardware module and compares them at the end to ensure they're not 0, nor the same values.
uint32_t status;
const uint8_t zeros[32] = {0};
uint8_t bytes1[32] = {0};
uint8_t bytes2[32] = {0};
// (generate random numbers and put them in bytes1)
// (generate random numbers and put them in bytes2)
printf("memcmp 0\n");
status = !memcmp(bytes1, bytes2, 32);
printf("memcmp 1\n");
status |= !memcmp(bytes1, zeros, 32);
printf("memcmp 2\n");
status |= !memcmp(bytes2, zeros, 32);
Some tests are running fine. Some executions are stalled after printing "memcmp 0" (when it freezes, it's always at the first memcmp)...
I have tried several things:
When I print the values in bytes1 and 2, they are indeed random numbers not equal to 0 and not equal with each other.
Moving the memcmp at different places, or switching the memcmp's. It's always the first one which freezes.
Replacing memcmp with a custom function to do comparison => it never freezes.
The memcmp function is used at other places of the code and it freezes nowhere else. Perhaps the difference is that the random check is the only place where the memcmp expects different values (at other places it's to ensure a function produces expected output).
I couldn't find the definition of memcmp... I don't know where to look. The only thing I could find is the assembly code, but it'd be difficult to attach a debugger to know exactly which instruction can't complete.
000064d0 <memcmp>:
64d0: 2a03 cmp r2, #3
64d2: b470 push {r4, r5, r6}
64d4: d912 bls.n 64fc <memcmp+0x2c>
64d6: ea40 0501 orr.w r5, r0, r1
64da: 4604 mov r4, r0
64dc: 07ad lsls r5, r5, #30
64de: 460b mov r3, r1
64e0: d120 bne.n 6524 <memcmp+0x54>
64e2: 681d ldr r5, [r3, #0]
64e4: 4619 mov r1, r3
64e6: 6826 ldr r6, [r4, #0]
64e8: 4620 mov r0, r4
64ea: 3304 adds r3, #4
64ec: 3404 adds r4, #4
64ee: 42ae cmp r6, r5
64f0: d118 bne.n 6524 <memcmp+0x54>
64f2: 3a04 subs r2, #4
64f4: 4620 mov r0, r4
64f6: 2a03 cmp r2, #3
64f8: 4619 mov r1, r3
64fa: d8f2 bhi.n 64e2 <memcmp+0x12>
64fc: 1e54 subs r4, r2, #1
64fe: b172 cbz r2, 651e <memcmp+0x4e>
6500: 7802 ldrb r2, [r0, #0]
6502: 780b ldrb r3, [r1, #0]
6504: 429a cmp r2, r3
6506: bf08 it eq
6508: 1864 addeq r4, r4, r1
650a: d006 beq.n 651a <memcmp+0x4a>
650c: e00c b.n 6528 <memcmp+0x58>
650e: f810 2f01 ldrb.w r2, [r0, #1]!
6512: f811 3f01 ldrb.w r3, [r1, #1]!
6516: 429a cmp r2, r3
6518: d106 bne.n 6528 <memcmp+0x58>
651a: 42a1 cmp r1, r4
651c: d1f7 bne.n 650e <memcmp+0x3e>
651e: 2000 movs r0, #0
6520: bc70 pop {r4, r5, r6}
6522: 4770 bx lr
6524: 1e54 subs r4, r2, #1
6526: e7eb b.n 6500 <memcmp+0x30>
6528: 1ad0 subs r0, r2, r3
652a: bc70 pop {r4, r5, r6}
652c: 4770 bx lr
652e: bf00 nop
Where can I see the source code of memcmp for cortex R5? FYI, the used compiler is armr5-none-eabi-gcc.
Any idea what could cause a CPU stall with this function?
Thank you
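For readers following the disassembly, its structure can be paraphrased in C: when the length is above 3 and both pointers are 4-byte aligned, whole words are compared first, and the remainder (or the first mismatching word) is compared byte by byte. The following is a sketch reconstructed from the listing, not the library's actual source, and `memcmp_sketch` is a hypothetical name. Bare-metal GCC toolchains commonly ship newlib, so its sources may be the place to look, though that is an assumption about this particular toolchain.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

int memcmp_sketch(const void *m1, const void *m2, size_t n)
{
    const unsigned char *a = m1;
    const unsigned char *b = m2;

    /* Word loop: only entered when n > 3 and both pointers are aligned. */
    if (n > 3 && ((((uintptr_t)a | (uintptr_t)b) & 3u) == 0)) {
        while (n > 3) {
            uint32_t wa, wb;
            memcpy(&wa, a, sizeof wa); /* stands in for the aligned ldr */
            memcpy(&wb, b, sizeof wb);
            if (wa != wb) {
                break;                 /* let the byte loop find the difference */
            }
            a += 4;
            b += 4;
            n -= 4;
        }
    }

    /* Byte loop: handles the tail and any mismatching word. */
    while (n-- > 0) {
        if (*a != *b) {
            return *a - *b;
        }
        a++;
        b++;
    }
    return 0;
}
```

One observable difference between the two paths is that the word loop issues 32-bit loads while the byte loop issues 8-bit loads, which may be relevant when comparing it with a custom byte-wise function that never stalls.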

How can I do this section of code, but using auto-indexing with ARM Assembly

this works, but I have to do it using auto-indexing and I can not figure out that part.
writeloop:
cmp r0, #10
beq writedone
ldr r1, =array1
lsl r2, r0, #2
add r2, r1, r2
str r2, [r2]
add r0, r0, #1
b writeloop
and for data I have
.balign 4
array1: .skip 40
What I had tried was this, and yes I know it is probably a poor attempt but I am new to this and do not understand
ldr r1, =array1
writeloop:
cmp r0, #10
beq writedone
ldr r2, [r1], #4
str r2, [r2]
add r0, r0, #1
b writeloop
It says segmentation fault when I try this. What is wrong? What I think should happen is that on every pass through the loop it sets the current element equal to its own address, then increments to the next element and does the same thing.
The ARM architecture provides several different addressing modes.
From ARM946E-S product overview and many other sources:
Load and store instructions have three primary addressing modes
- offset
- pre-indexed
- post-indexed.
They are formed by adding or subtracting an immediate or register-based offset to or from a base register. Register-based offsets can also be scaled with shift operations. Pre-indexed and post-indexed addressing modes update the base register with the base plus offset calculation. As the PC is a general purpose register, a 32‑bit value can be loaded directly into the PC to perform a jump to any address in the 4GB memory space.
As well, they support write back or updating of the register, hence the reason for pre-indexed and post-indexed. Post-index doesn't make much sense without write back.
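The three modes map fairly directly onto familiar pointer idioms. Below is a minimal C sketch of that mapping, using word indices instead of byte offsets (so an assembly offset of #4 corresponds to an index step of 1). The type `Access` and the function names are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Which slot was written, and where the base points afterwards. */
typedef struct {
    size_t written;
    size_t base_after;
} Access;

/* Offset, like "str r0, [r1, #4]": base is not modified. */
Access store_offset(uint32_t *mem, size_t base, uint32_t v)
{
    mem[base + 1] = v;
    return (Access){ base + 1, base };
}

/* Pre-indexed, like "str r0, [r1, #4]!": update base, then store there. */
Access store_pre_indexed(uint32_t *mem, size_t base, uint32_t v)
{
    base += 1;
    mem[base] = v;
    return (Access){ base, base };
}

/* Post-indexed, like "str r0, [r1], #4": store at base, then update base. */
Access store_post_indexed(uint32_t *mem, size_t base, uint32_t v)
{
    mem[base] = v;
    return (Access){ base, base + 1 };
}

/* Fill ten words with 0..9 using only post-indexed stores; return one element. */
uint32_t fill_post_indexed(size_t probe)
{
    uint32_t array1[10];
    size_t base = 0;
    for (uint32_t v = 0; v < 10; v++) {
        base = store_post_indexed(array1, base, v).base_after;
    }
    return array1[probe];
}
```

In C terms, post-indexed is the `*p++ = v` idiom and pre-indexed is `*++p = v`, which is why post-indexing suits walking an array from its first element.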
Now to your issue, I believe that you want to write the values 0-9 to an array of ten words (length four bytes). Assuming this, you can use indexing and update the value via add. This leads to,
mov r0, #0 @ start value
ldr r1, =array1 @ array pointer
writeloop:
cmp r0, #10
beq writedone
str r0, [r1, r0, lsl #2] @ index with r1 base by r0 scaled by *4
add r0, r0, #1
b writeloop
writedone:
@ code to jump somewhere else and not execute data.
.balign 4
array1: .skip 40
For interest a more efficient loop can be done by counting and writing down,
mov r0, #9 @ start value
ldr r1, =array1 @ array pointer
writeloop:
str r0, [r1, r0, lsl #2] @ index with r1 base by r0 scaled by *4
subs r0, r0, #1
bpl writeloop @ loop while r0 >= 0 so that element 0 is written too
Your original example was writing the pointer to the array; often referred to as 'value equals address'. If this is what you want,
ldr r0, =array_end @ finished?
ldr r1, =array1 @ array pointer
write_loop:
mov r2, r1 @ copy the address; a write-back store must not use its base register as the source
str r2, [r1], #4 @ add four and update after storing
cmp r0, r1
bne write_loop
@ code to jump somewhere else and not execute data.
.balign 4
array1: .skip 40
array_end:

Need Help understanding ARM function

I'm still learning ARM and I couldn't understand what this function is supposed to do.
Can you guys help me out explaining how it works?
.text:0006379C EXPORT _nativeD2AB
.text:0006379C _nativeD2AB
.text:0006379C var_28 = -0x28
.text:0006379C
.text:0006379C STMFD SP!, {R4-R11,LR}
.text:000637A0 SUB SP, SP, #0x3A4
.text:000637A4 STMFA SP, {R0-R3}
.text:000637A8 LDR R0, =(_GLOBAL_OFFSET_ - 0x637B8)
.text:000637AC LDR R1, =(__stack_chk - 0x134EAC)
.text:000637B0 ADD R0, PC, R0 ; _GLOBAL_OFFSET_
.text:000637B4 LDR R0, [R1,R0] ; __stack_chk
.text:000637B8 LDR R0, [R0]
.text:000637BC STR R0, [SP,#0x3C8+var_28]
.text:000637C0 MOV R0, #1
.text:000637C4 ADR R1, sub_637D0
.text:000637C8 MUL R0, R1, R0
.text:000637CC MOV PC, R0
.text:000637CC ; End of function _nativeD2AB
.
.got:00134EAC _GLOBAL_OFFSET_TABLE_ DCD 0
.
.got:00134B0C AREA .got, DATA
.got:00134B0C __stack_chk DCD __stack_chkA
.
Found the rest of the function. If I understood some of it correctly, it seems to be scrambling the data, though that may be just a wild guess:
.text:000637D0 sub_637D0
.text:000637D0 MOV R0, #1
.text:000637D4 ADR R1, sub_637E0
.text:000637D8 MUL R0, R1, R0
.text:000637DC MOV PC, R0
.text:000637DC ; End of function sub_637D0
.text:000637E0 sub_637E0
.text:000637E0
.text:000637E0 arg_14 = 0x14
.text:000637E0
.text:000637E0 STR R2, [SP,#arg_14]
.text:000637E4 MOV R0, #1
.text:000637E8 ADR R1, loc_637F4
.text:000637EC MUL R0, R1, R0
.text:000637F0 MOV PC, R0
.text:000637F0 ; End of function sub_637E0
.text:000637F4 loc_637F4
.text:000637F4 STR R2, [SP,#0x28]
.text:000637F8 STR R0, [SP,#0x18]
.text:000637FC MOV R1, #2
.text:00063800 STR R2, [SP,#0x1C]
.text:00063804 STR R0, [SP,#0x20]
.text:00063808 STR R0, [SP,#0x24]
The function has several parts:
Store registers to the stack and reserve space (they are not restored in this fragment).
Load into R0 the address of GLOBAL_OFFSET (once added to PC), in order to access __stack_chk (once added to GLOBAL_OFFSET). This is done in a very strange way.
Load the data at __stack_chk and store it on the stack.
Load into R0 the address of sub_637D0 (the multiplication by 1 changes nothing) and jump to it with MOV PC, R0. This is a branch rather than a return, so execution continues in sub_637D0.
So in my opinion, this does not seem to do anything useful...
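The address arithmetic in the second step can be worked through with the constants from the listing. This is a small C sketch of that arithmetic, assuming ARM state, where reading PC yields the address of the current instruction plus 8. The function names are hypothetical.

```c
#include <stdint.h>

/* In ARM state, an instruction that reads PC sees its own address + 8. */
uint32_t pc_read_value(uint32_t insn_addr)
{
    return insn_addr + 8u;
}

/* ADD R0, PC, R0 at 0x637B0, with R0 = (_GLOBAL_OFFSET_ - 0x637B8). */
uint32_t got_address(void)
{
    uint32_t literal = 0x134EACu - 0x637B8u;
    return pc_read_value(0x637B0u) + literal;
}

/* LDR R0, [R1,R0] with R1 = (__stack_chk - 0x134EAC), a negative offset. */
uint32_t stack_chk_slot(void)
{
    uint32_t offset = 0x134B0Cu - 0x134EACu; /* wraps around, as in the register */
    return got_address() + offset;
}
```

The results are 0x134EAC for the table address and 0x134B0C for the slot that holds the pointer to the stack check value, which matches the two .got entries shown in the question.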

Resources