I am trying to figure out how arrays work in ARM assembly, but I am just overwhelmed. I want to initialize an array of size 20 to 0, 1, 2 and so on.
A[0] = 0
A[1] = 1
I can't even figure out how to print what I have to see if I did it correctly. This is what I have so far:
.data
.balign 4 # Memory location divisible by 4
string: .asciz "a[%d] = %d\n"
a: .skip 80 # allocates 20
.text
.global main
.extern printf
main:
push {ip, lr} # return address + dummy register
ldr r1, =a # set r1 to index point of array
mov r2, #0 # index r2 = 0
loop:
cmp r2, #20 # 20 elements?
beq end # Leave loop if 20 elements
add r3, r1, r2, LSL #2 # r3 = r1 + (r2*4)
str r2, [r3] # r3 = r2
add r2, r2, #1 # r2 = r2 + 1
b loop # branch to next loop iteration
print:
push {lr} # store return address
ldr r0, =string # format
bl printf # c printf
pop {pc} # return address
ARM confuses me enough as it is, I don't know what i'm doing wrong. If anyone could help me better understand how this works that would be much appreciated.
This might help down the line for others who want to know about how to allocate memory for array in arm assembly language
here is a simple example to add corresponding array elements and store in the third array.
.global _start
_start:
MOV R0, #5
LDR R1,=first_array # loading the address of first_array[0]
LDR R2,=second_array # loading the address of second_array[0]
LDR R7,=final_array # loading the address of final_array[0]
MOV R3,#5 # len of array
MOV R4,#0 # to store sum
check:
cmp R3,#1 # like condition in for loop for i>1
BNE loop # if R3 is not equal to 1 jump to the loop label
B _exit # else exit
loop:
LDR R5,[R1],#4 # loading the values and storing in registers and base register gets updated automatically R1 = R1 + 4
LDR R6,[R2],#4 # similarly
add R4,R5,R6
STR R4,[R7],#4 # storing the values back to the final array
SUB R3,R3,#1 # decrment value just like i-- in for loop
B check
_exit:
LDR R7,=final_array # before exiting checking the values stored
LDR R1, [R7] # R1 = 60
LDR R2, [R7,#4] # R2 = 80
LDR R3, [R7,#8] # R3 = 100
LDR R4, [R7,#12] # R4 = 120
MOV R7, #1 # terminate syscall, 1
SWI 0 # execute syscall
.data
first_array: .word 10,20,30,40
second_array: .word 50,60,70,80
final_array: .word 0,0,0,0,0
as mentioned your printf has problems, you can use the toolchain itself to see what the calling convention is, and then conform to that.
#include <stdio.h>
unsigned int a,b;
void notmain ( void )
{
printf("a[%d] = %d\n",a,b);
}
giving
00001008 <notmain>:
1008: e59f2010 ldr r2, [pc, #16] ; 1020 <notmain+0x18>
100c: e59f3010 ldr r3, [pc, #16] ; 1024 <notmain+0x1c>
1010: e5921000 ldr r1, [r2]
1014: e59f000c ldr r0, [pc, #12] ; 1028 <notmain+0x20>
1018: e5932000 ldr r2, [r3]
101c: eafffff8 b 1004 <printf>
1020: 0000903c andeq r9, r0, ip, lsr r0
1024: 00009038 andeq r9, r0, r8, lsr r0
1028: 0000102c andeq r1, r0, ip, lsr #32
Disassembly of section .rodata:
0000102c <.rodata>:
102c: 64255b61 strtvs r5, [r5], #-2913 ; 0xb61
1030: 203d205d eorscs r2, sp, sp, asr r0
1034: 000a6425 andeq r6, sl, r5, lsr #8
Disassembly of section .bss:
00009038 <b>:
9038: 00000000 andeq r0, r0, r0
0000903c <a>:
903c:
the calling convention is generally first parameter in r0, second in r1, third in r2 up to r3 then use the stack. There are many exceptions to this, but we can see here that the compiler which normally works fine with a printf call, wants the address of the format string in r0. the value of a then the value of b in r1 and r2 respectively.
Your printf has the string in r0, but a printf call with that format string needs three parameters.
The code above used a tail optimization and branch to printf rather than called it and returned from. The arm convention these days prefers the stack to be aligned on 64 bit boundaries, so you can put some register, you dont necessarily care to preserve on the push/pop in order to keep that alignment
push {r3,lr}
...
pop {r3,pc}
It certainly wont hurt you to do this, it may or may not hurt to not do it depending on what downstream assumes.
Your setup and loop should function just fine assuming that r1 (label a) is a word aligned address. Which it may or may not be if you mess with your string, should put a first then the string or put another alignment statement before a to insure the array is aligned. There are instruction set features that can simply the code, but it appears functional as is.
Related
I am trying to translate the following code to assembly!
int i = 1
int a = 3
int b;
if(i == 1 || a == 3)
b = 95;
else
b = 0;
I am confused about the part where I have to use or in the if statement. Do you guys have any suggestions?
ldr r0, [r13, #0] //i = 1
ldr r1, [r13, #4] //a = 3
mov r2, #1 //put 1 in r2
mov r3, #3 //put 3 in r3
cmp r0, r2 //compare i and 1
orr r1, r3 //or a and 3
bgt else //if false branch to else
ldr r4 #0 // put 0 in r4
str r4, [r13, #8] //store it at location 208 with r13
b endif //branch to else if if true
ldr r5 #95 //put 95 on r5
str r5 [r13, #12] //store 95 on location 212 with r13
So far I have this!
Honestly looks wrong! So you can roast me I am here to learn so please teach me! :)
I don't recognize the assembly language. But the pseudo-code would be:
compare i with 1
if true, jump to if
compare a with 3
if true, jump to if
else:
store 0 in b
jump to endif
if:
store 95 in b
endif:
This also implements the short-circuiting of ||, since a == 3 is only tested if i == 1 fails.
Writing assembly code by hand will quickly get out of hand and become goto spaghetti and a modern compiler does a better job optimizing, with that said sometimes you want to write a few lines of assembler for some other reason.
I don't think you must load the constants into registers first and the cost of assigning a value to a register is low compared to conditional branching.
My approach here would be
Store the value of one of the braces (95) in one register.
Compare a with 1 and 3 and branch if equal
Overwrite the register with 0 if branch was not taken
Store the contents of the register in area of variable b
begin:
mov r5, #95 // The value
ldr r0, [r13, #0] //i = 1
ldr r1, [r13, #4] //a = 3
cmp r0, #1 //compare i and 1
beq else
cmp r1,#3 // compare a and 3
beq else //if false branch to else
mov r5,#0 // Clear r5
else:
str r5, [r13, #8] //store it at location 208 with r13
Edit: I Found this cheat sheet It looks like there are conditional variants of the mov instruction. Then the code could be written like this: Without any jumps at all.
begin:
mov r5, #95 // The value
mov r6,#0 // Clear r6
ldr r0, [r13, #0] // i = 1
ldr r1, [r13, #4] // a = 3
cmp r0, #1 // compare i and 1
moveq r5,r6 // conditional move
cmp r1,#3 // compare a and 3
moveq r5,r6
str r5, [r13, #8] // store it at location 208 with r13
If i equals 1 then branch to code that assigns 95 to b. Otherwise, if a is 3, branch to that same code. Otherwise, assign 0 to b and branch to just after the other assignment.
I find the comments in another answer interesting and disturbing at the same time. And more interesting that that answer did not simply ask a compiler.
int fun ( int i, int a )
{
int b;
if(i == 1 || a == 3)
b = 95;
else
b = 0;
return b;
}
00000000 <fun>:
0: e3510003 cmp r1, #3
4: 13500001 cmpne r0, #1
8: 03a0005f moveq r0, #95 ; 0x5f
c: 13a00000 movne r0, #0
10: e12fff1e bx lr
so that means
ldr r0, [sp, #0] //i
ldr r1, [sp, #4] //a
cmp r0, #1 //compare i with 1, interested in equal or not
cmpne r1, #3 //if not equal then test a with 3, interested in equal or not
moveq r5, #95 //if either of the two were equal set b = 95
movne r5, #0 //if neither of the two were equal set b = 0
which is this machine code
0: e59d0000 ldr r0, [sp]
4: e59d1004 ldr r1, [sp, #4]
8: e3500001 cmp r0, #1
c: 13510003 cmpne r1, #3
10: 03a0505f moveq r5, #95 ; 0x5f
14: 13a05000 movne r5, #0
As shown in the ARM documentation, start with the ARM Architectural Reference Manual for ARMv5 to get your feet wet with the basic 32 bit ARM instructions (and base (all thumb variants) thumb instructions). Notice in that documentation that the first nibble describes the condition code and all instructions can be conditionally executed (to avoid branches for if-then-else type things).
0: e3a0505f mov r5, #95 ; 0x5f
4: 03a0505f moveq r5, #95 ; 0x5f
8: 13a0505f movne r5, #95 ; 0x5f
c: e3500001 cmp r0, #1
10: 03500001 cmpeq r0, #1
14: 13500001 cmpne r0, #1
18: c3500001 cmpgt r0, #1
1c: b3500001 cmplt r0, #1
See how the first 4 bits change but the other 28 do not? A feature you see in ARM instruction sets specifically and not necessarily in others. Some others have similar features though.
Not heard of a32 instruction set, so it is not clear which of the handful or more of the arm instruction sets you are using. The above works on armv4t through armv7-a. But tell a modern compiler to build for armv7-a it is likely going to build thumb first then arm only if you can force it. See the ARM Architectural Reference Manual for armv7-ar (it also shows all the way back to armv4t each instruction indicating which architectures are supported).
This is arm code as well that runs on some arm processors:
0: 2903 cmp r1, #3 compare a with 3
2: bf18 it ne these two
4: 2801 cmpne r0, #1 do an if not equal then compare i with 1
6: bf0c ite eq these three do a
8: 205f moveq r0, #95 ; 0x5f if either are equal b = 95
a: 2000 movne r0, #0 else b = 0
c: 4770 bx lr
e: bf00 nop
(just to show that it matters very much which specific instruction set a question is asking about and for ARM which of the ARM instruction sets)
You are basically wanting to do a
if i == 1 set the z flag
else if a == 3 set the z flag
if the z flag is set (from either of the above) b = 95
else b = 0
There are many basic ways to do this and Simson's answer is a clean straightforward approach that saves a branch or two.
mov r5,#95
ldr r0, [sp, #0] //i
ldr r1, [sp, #4] //a
// if i == 1
cmp r0,#1
bne skip
// or if a == 3
cmp r1,#3
bne skip
// else
mov r5,#0 //neither were equal
skip:
str r5, [r13, #12]
I was focused on that answer, but looking at yours did you mean to place the result in two different places based on the result?
ldr r4 #0 // put 0 in r4
str r4, [r13, #8] //store it at location 208 with r13
ldr r5 #95 //put 95 on r5
str r5 [r13, #12] //store 95 on location 212 with r13
That breaks Simson's answer. And mine above.
Most folks would start with this, easy to read and follow, brute force straight from the high level code.
ldr r0, [sp, #0] //i
ldr r1, [sp, #4] //a
// if i == 1
check_i:
cmp r0,#1
bne check_a
b one_equal //folks will forget to do this one
check_a:
// or if a == 3
cmp r1,#3
beq one_equal
bne neither_equal //or just fall through
// else
neither_equal:
mov r4,#0
str r4, [r13, #8]
b the_end //many folks forget this branch
one_equal:
mov r5,#95
str r5, [r13, #12]
the_end:
Or something like it which can then be shortened slightly into this, some folks would start with something like this:
ldr r0, [sp, #0] //i
ldr r1, [sp, #4] //a
// if i == 1
cmp r0,#1
beq one_equal
// or if a == 3
cmp r1,#3
beq one_equal
// else
neither_equal:
mov r4,#0
str r4, [r13, #8]
b the_end //many folks forget this one
one_equal:
mov r5,#95
str r5, [r13, #12]
the_end:
Here is where you start to go off the rails
cmp r0, r2 //this is a valid starting point
orr r1, r3 //orr is a logical or, not an if this "or" that
// so we are confused by what you are doing here
bgt else //you are wanting to know if it is equal or not, not if greater
// than
It does not get any better after that
If you really meant the result in two different places then:
Still get the variables into registers from the stack
ldr r0, [sp, #0] //i
ldr r1, [sp, #4] //a
This still does an if this is equal or that is equal
cmp r0, #1 //is i == 1?
cmpne r1, #3 //if not then is a == 3?
You end up here with z set if either one is equal or z clear if neither are equal
moveq r4,#95 //one or the other is equal
streq r4,[r13, #8] //one or the other is equal
movne r5,#0 //neither are equal
strne r5,[r13, #12] //neither are equal
Final result:
ldr r0, [sp, #0] //i
ldr r1, [sp, #4] //a
cmp r0, #1 //is i == 1?
cmpne r1, #3 //if not then is a == 3
moveq r4,#95 //one or the other is equal
streq r4,[r13, #8] //one or the other is equal
movne r5,#0 //neither are equal
strne r5,[r13, #12] //neither are equal
It assembles fine, so the syntax is good
0: e59d0000 ldr r0, [sp]
4: e59d1004 ldr r1, [sp, #4]
8: e3500001 cmp r0, #1
c: 13510003 cmpne r1, #3
10: 03a0405f moveq r4, #95 ; 0x5f
14: 058d4008 streq r4, [sp, #8]
18: 13a05000 movne r5, #0
1c: 158d500c strne r5, [sp, #12]
I have edited this so many times I hope I did not leave any mistakes...I will get beat up for it if I did I am sure...Before doing any assembly language you need the proper documentation. In this case you want one of the ARM Architectural Reference Manuals, likely the oldest one which is directly derived from the printed versions before they distributed pdfs. The armv5 manual.
In general you will see a compiler will do the opposite and jump over
if(x==1)
{
y = 5;
}
cmp r0,#1
bne skip //C code is equal so branch if not
mov r1,#5
skip:
If you had if ((i==1)&&(a==3)) you would also want to look at the opposite, skip over if (i!=1) skip over if (a!=3) having the two paths skip to a common label.
But in the case of an this OR that you kind of want to have two paths land in the same place by branching to a common label and then have it fall through to the else code if neither are true. By doing the as written comparison if i == 1 branch to label, of a == 3 branch to label.
this works, but I have to do it using auto-indexing and I can not figure out that part.
writeloop:
cmp r0, #10
beq writedone
ldr r1, =array1
lsl r2, r0, #2
add r2, r1, r2
str r2, [r2]
add r0, r0, #1
b writeloop
and for data I have
.balign 4
array1: skip 40
What I had tried was this, and yes I know it is probably a poor attempt but I am new to this and do not understand
ldr r1, =array1
writeloop:
cmp r0, #10
beq writedone
ldr r2, [r1], #4
str r2, [r2]
add r0, r0, #1
b writeloop
It says segmentation fault when I try this. What is wrong? What I am thinking should happen is every time it loops through, it sets the element r2 it at = to the address of itself, and then increments to the next element and does the same thing
The ARM architechures gives several different address modes.
From ARM946E-S product overview and many other sources:
Load and store instructions have three primary addressing modes
- offset
- pre-indexed
- post-indexed.
They are formed by adding or subtracting an immediate or register-based offset to or from a base register. Register-based offsets can also be scaled with shift operations. Pre-indexed and post-indexed addressing modes update the base register with the base plus offset calculation. As the PC is a general purpose register, a 32‑bit value can be loaded directly into the PC to perform a jump to any address in the 4GB memory space.
As well, they support write back or updating of the register, hence the reason for pre-indexed and post-indexed. Post-index doesn't make much sense without write back.
Now to your issue, I believe that you want to write the values 0-9 to an array of ten words (length four bytes). Assuming this, you can use indexing and update the value via add. This leads to,
mov r0, #0 ; start value
ldr r1, =array1 ; array pointer
writeloop:
cmp r0, #10
beq writedone
str r0, [r1, r0, lsl #2] ; index with r1 base by r0 scaled by *4
add r0, r0, #1
b writeloop
writedone:
; code to jump somewhere else and not execute data.
.balign 4
array1: skip 40
For interest a more efficient loop can be done by counting and writing down,
mov r0, #9 ; start value
ldr r1, =array1 ; array pointer
writeloop:
str r0, [r1, r0, lsl #2] ; index with r1 base by r0 scaled by *4
subs r0, r0, #1
bne writeloop
Your original example was writing the pointer to the array; often referred to as 'value equals address'. If this is what you want,
ldr r0, =array_end ; finished?
ldr r1, =array1 ; array pointer
write_loop:
str r1, [r1], #4 ; add four and update after storing
cmp r0, r1
bne write_loop
; code to jump somewhere else and not execute data.
.balign 4
array1: skip 40
array_end:
I am writing a program that creates an array of 10 random integers between 0 and 999. My code for creating and occupying the array is functioning properly but when I attempt to print out the minimum and maximum value of the array, it prints the wrong value.
This is the code where I read through my array and print out each value. The spaced section is where I look for the min and max:
main:
BL _seedrand # seed random number generator with current time
MOV R0, #0 # initialze index variable
MOV R3, #1000
MOV R5, #-1
PUSH {R3}
PUSH {R5}
readloop:
CMP R0, #10 # check to see if we are done iterating
BEQ readdone # exit loop if done
LDR R1, =a # get address of a
LSL R2, R0, #2 # multiply index*4 to get array offset
ADD R2, R1, R2 # R2 now has the element address
LDR R1, [R2] # read the array at address
POP {R5}
CMP R5, R1
MOVGT R5, R1
POP {R3}
CMP R3, R1
MOVLT R3, R1
PUSH {R3}
PUSH {R5}
PUSH {R0} # backup register before printf
PUSH {R1} # backup register before printf
PUSH {R2} # backup register before printf
MOV R2, R1 # move array value to R2 for printf
MOV R1, R0 # move array index to R1 for printf
BL _printf # branch to print procedure with return
POP {R2} # restore register
POP {R1} # restore register
POP {R0} # restore register
ADD R0, R0, #1 # increment index
B readloop # branch to next loop iteration
readdone:
MOV R1, R3
LDR R0, =min_str # R0 contains formatted string address
BL printf
MOV R1, R5
LDR R0, =max_str
BL printf
B _exit
If you would like to see the rest of the code where the array is created and populated, I can do so. I have tried different tags and backing up R3 and R5 on the stack and it still prints the wrong values. Specifically the min value will always print the last integer in the array and the max is always printed as 0.
I am working on one of the simple codes assigned for a ARM microprocessor course I'm taking. I am having a slight issue with getting my code to load a value of an array to compare using Keil. The program is supposed to compare 5 numbers and then store values if the comparison is true. When I run my program it will not load array values that I declared. My professor isn't much help either and doesn't seem to know why it's not working properly.
Here's what I have done so far. I also think my PUSH is wrong but I can probably figure that out after I at least get the array to load. I should be pushing those values onto the stack but I am pretty sure I'm just loading values in registers instead.
AREA main, CODE, READONLY
EXPORT __main
ENTRY
__main PROC
MOVS r5, #0
LDR r0, =NUMB
loop1
LDR r1, [r0]
CMP r5, #5
BEQ stop
loop
CMP r1, #10
BLT low10
CMP r1, #100
BLT mid
CMP r1, #255
BLT high100
low10
PUSH {r2}
MOVS r2, #2
ADDS r5, #1
B loop1
mid
PUSH {r3}
MOVS r3, #0
ADDS r5, #1
B loop1
high100
PUSH {r4}
MOVS r4, #1
ADDS r5, #1
B loop1
stop B stop
ENDP
AREA myDATA, DATA, READWRITE
ALIGN
NUMB DCD 1,11,111,11,1
END
With respect to the array, your element size is not 1-byte, it's 4-bytes.
Using GNU & GDB, if we examine the address at R0 and interpret as signed words (i.e, 4 byte form), we see the expected array values.
.data
NUMB: .word 1,11,111,11,1
...
LDR r0, =NUMB
(gdb) x/8wd $r0
0x200dc: 1 11 111 11
0x200ec: 1 4929 1634033920 16804194
So you will need to change your values in the context of R5 to assume a 4-byte word. E.g.,
CMP r5, #(4*5)
ADDS r5, #4
Is is sadly very simple, just change in myData READWRITE to READONLY :)
I'm new to assembly programing and I'm programing for ARM.
I'm making a program with two subroutines: one that appends a byte info on a byte vector in memory, and one that prints this vector. The first address of the vector contains the number of elements that follows, up to 255. As I debug it with GDB, I can see that the "appendbyte" subroutine works fine. But when it comes to the "printvector" one, there are some problems. First, the element loaded in register r1 is wrong (it loads 0, when it should be 7). Then, when I read the registers values with GDB after I use the "printf" function, a lot of register get other values that weren't supposed to change, since I didn't modify them, I just used "printf". Why is "printf" modyfing the values.
I was thinking something about the align. I'm not sure if i'm using the directive correctly.
Here is the full code:
.text
.global main
.equ num, 255 # Max number of elements
main:
push {lr}
mov r8, #7
bl appendbyte
mov r8, #5
bl appendbyte
mov r8, #8
bl appendbyte
bl imprime
pop {pc}
.text
.align
printvector:
push {lr}
ldr r3, =vet # stores the address of the start of the vector in r3
ldr r2, [r3], #1 # stores the number of elements in r2
.align
loop:
cmp r2, #0 #if there isn't elements to print
beq fimimprime #quit subroutine
ldr r0, =node #r0 receives the print format
ldr r1, [r3], #1 #stores in r1 the value of the element pointed by r3. Increments r3 after that.
sub r2, r2, #1 #decrements r2 (number of elements left to print)
bl printf #call printf
b loop #continue on the loop
.align
endprint:
pop {pc}
.align
appendbyte:
push {lr}
ldr r0, =vet #stores in r0 the beggining address of the vector
ldr r1, [r0], #1 #stores in r1 the number of elements and makes r0 point to the next address
add r3, r0, r1 #stores in r3 the address of the first available position
str r8, [r3] #put the value at the first available position
ldr r0, =vet #stores in r0 the beggining address of the vector
add r1, r1, #1 # increment the number of elements in the vector
str r1, [r0] # stores it in the vector
pop {pc}
.data # Read/write data follows
.align # Make sure data is aligned on 32-bit boundaries
vet: .byte 0
.skip num # Reserve num bytes
.align
node: .asciz "[%d]\n"
.end
The problems are in
ldr r1, [r3], #1
and
bl printf
I hope I was clear on the problem.
Thanks in advance!
The ARM ABI specifies that registers r0-r3 and r12 are to be considered volatile on function calls. Meaning that the callee does not have to restore their value. LR also changes if you use bl, because LR will then contain the return address for the called function.
More information can be found on ARMs Information Center entry for the ABI or in the APCS (ARM Procedure Call Standard) document.
printvector:
push {lr}
ldr r3, =vet # stores the address of the start of the vector in r3
ldr r2, [r3], #1 # stores the number of elements in r2
.align
loop:
cmp r2, #0 #if there isn't elements to print
beq fimimprime #quit subroutine
ldr r0, =node #r0 receives the print format
ldr r1, [r3], #1 #stores in r1 the value of the element pointed by r3. Increments r3 after that.
sub r2, r2, #1 #decrements r2 (number of elements left to print)
bl printf #call printf
b loop #continue on the loop
.align
endprint:
pop {pc}
that is definitely not how you use align. Align is there to...align the thing that follows on some boundary (specified in an optional argument, note this is an assembler directive, not an instruction) by padding the binary with zeros or whatever the padding is. So you dont want a .align in the code flow, between instructions. You have done that between the ldr r1, and the cmp r2 after loop. Now the align after b loop is not harmful as the branch is unconditional but at the same time not necessary as there is no reason to align there the assembler is generating an instruction flow so the bytes cant be unaligned. Where you would use .align is after some data declaration before instructions:
.byte 1,2,3,4,5,
.align
some_code_branch_dest:
In particular one where the assembler complains or the code crashes.