I'm trying to use ARM assembly to insert one string into another, but my code always returns an empty string to the C program that calls the assembly routine. I believe I have narrowed the issue down to the STRB instruction. Below is my code with most of the irrelevant parts removed. The important part is the "test" block.
.global ins
ins:
stmfd sp!, {v1-v6, lr}
mov v1, a1 # save pointer to 1st string
mov v2, a2 # save pointer to 2nd string
bl strlen # find out length
mov v3, a1 # save string1 length
mov a1, v2 # recover pointer to string 2
bl strlen # length of string 2
add a1, a1, v3 # total length
add a1, a1, #1 # add one for null byte
bl malloc
add a3, a3, #1
test:
ldrb v3, [v1], #1
strb v3, [a1], #1
ldmfd sp!, {v1-v6, pc}
exit:
.end
v1 and v2 hold pointers to strings 1 and 2. When I have the test block written as:
test:
ldrb v3, [v1], #1
strb v3, [a1], #1
ldmfd sp!, {v1-v6, pc}
then the program returns an empty string. However, if I have it written as:
test:
ldrb v3, [v1], #1
strb v3, [a1]
ldmfd sp!, {v1-v6, pc}
it successfully returns the first character in string 1. Obviously, this is not sufficient to build a new string, as I'm not performing an offset on a1.
Does anyone know what is causing the string to be returned as empty? I honestly have no idea what the issue may be after several hours of experimenting and researching.
Any help is greatly appreciated!
The value in a1 is returned to the C function calling your assembler routine. You need to return the address of the start of the string, but if you increment a1 while writing the string, you will return the address just past the last byte written instead.
If you use another register to hold the current write address, the start address will still be in a1 when you return. For example:
test:
mov v4, a1 # copy address of new string to v4
ldrb v3, [v1], #1
strb v3, [v4], #1 # store the byte and post-increment v4;
# the start of the string is still in a1
ldmfd sp!, {v1-v6, pc}
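The same idea can be sketched in C, where the difference between the pointer you advance and the pointer you return is easier to see. This is only a model under assumptions: the function name `concat_keep_start` is made up for illustration, and it simply joins the two strings rather than inserting one at a position. `start` plays the role of a1 and `out` plays the role of v4.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical C model of the routine: build s1 followed by s2 in a new buffer.
   `start` corresponds to a1 (the value returned to the caller),
   `out` corresponds to v4 (the moving write pointer). */
char *concat_keep_start(const char *s1, const char *s2)
{
    size_t total = strlen(s1) + strlen(s2) + 1;  /* +1 for the terminating NUL */
    char *start = malloc(total);
    if (start == NULL)
        return NULL;

    char *out = start;          /* advance this copy, never `start` itself */
    while (*s1 != '\0')
        *out++ = *s1++;         /* like ldrb/strb with post-increment */
    while (*s2 != '\0')
        *out++ = *s2++;
    *out = '\0';

    /* Returning `out` here would hand the caller the address of the NUL,
       which reads as an empty string. */
    return start;
}
```

The caller owns the returned buffer and should release it with `free`.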
I know I'm doing something stupid, but I can't figure out what. The functions return a null string, so I'm not copying each character. Entry into my reverse_string_func has the input string in R0. The example I have used a syntax which gas doesn't seem to support, so I modified it, but it isn't functional. I want to move along the array of char one at a time. Any assistance is greatly appreciated, I've stared at it so many times I'm going code blind.
EDIT:
The input string is in R0 when it is passed to the function. The reverse_string is an empty buffer filled with 0's. What I'm attempting to do is load the location of the input string into R1, move through it (while loading into r3) until I find the terminating 0, then move to the copy loop. There I want to load the current location of R3 (end of the string) into the first position of R2 (the reverse string) then go backwards pulling the characters from R3 and putting them back into R2 in the reverse order. I assumed I was using R2 and not ignoring it?
ldr r2, [r0], r3 /* start at end of string */
add r3, #-1
I believe I'm pushing the value numbered R3 in the array held in R0 into R2? Is that not what is happening?
reverse_string_func:
PUSH {ip, lr}
LDR r1, =input_string # pointer to first string.
LDR r2, =reverse_string # pointer to second string
mov r3, #0
mv2end: /* try to find the end of the string */
ldr r1, [r1], #1
add r3, #1
cmp r1, #0
bne mv2end
/* found end of string, move on */
string_copy:
ldr r2, [r0], r3 /* start at end of string */
add r3, #-1
cmp r3, #0
bne string_copy /* move back for next character */
POP {ip, pc}
EDIT2:
OK, so I've made changes to attempt to str the values I'm finding. I have single-stepped through everything and it all seems to be in the registers, but it isn't being stored into the character buffer in memory? I still don't understand what I'm doing wrong.
reverse_string_func:
PUSH {ip, lr}
LDR r1, =input_string # pointer to first string.
LDR r4, =reverse_string # pointer to second string
mov r3, #0
mv2end: /* try to find the end of the string */
ldrb r2, [r1, r3]
add r3, #1
cmp r2, #0
bne mv2end
/* found end of string, move on */
add r3, #-2 /* on terminator, move back past terminator, and line feed */
string_copy:
add r3, #-1
ldrb r2, [r1, r3]
strb r4, [r1, r3] # <--- I believe I'm going wrong here, but how?
cmp r3, #0
bne string_copy /* move back for next character */
POP {ip, pc}
As it stands this just returns a NULL string into the main program.
The problem in my second edit was with the STRB mnemonic: I was misunderstanding the direction of the store. The correct form is:
strb r2, [r4, r5]
This stores the low byte of register R2 to the memory address formed by the base in R4 plus the offset in R5.
Hopefully this might help someone else who comes here later. There is a good tutorial here: https://azeria-labs.com/memory-instructions-load-and-store-part-4/
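For comparison, here is a minimal C sketch of the reverse copy, with the load and the store written as separate steps so the direction of each is explicit. The helper name `reverse_into` is hypothetical, and unlike the assembly above it does not skip a trailing line feed.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: write the characters of src into dst in reverse order.
   dst must have room for strlen(src) + 1 bytes. */
void reverse_into(char *dst, const char *src)
{
    size_t len = strlen(src);
    for (size_t i = 0; i < len; i++) {
        char c = src[len - 1 - i];  /* load:  like ldrb r2, [r1, r3] */
        dst[i] = c;                 /* store: like strb r2, [r4, r5],
                                       value register first, address second */
    }
    dst[len] = '\0';                /* terminate the reversed string */
}
```

In both STRB and the C assignment, the value comes from one place (R2, `c`) and the destination is computed from a base plus an offset (R4 plus R5, `dst` plus `i`).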
You load the current character into the r2 register (and then ignore it), whereas it appears that what you wanted was to store that character into the string pointed to by r2.
I am writing a program that creates an array of 10 random integers between 0 and 999. My code for creating and populating the array works properly, but when I attempt to print the minimum and maximum values of the array, it prints the wrong values.
This is the code where I read through my array and print out each value. The spaced section is where I look for the min and max:
main:
BL _seedrand # seed random number generator with current time
MOV R0, #0 # initialize index variable
MOV R3, #1000
MOV R5, #-1
PUSH {R3}
PUSH {R5}
readloop:
CMP R0, #10 # check to see if we are done iterating
BEQ readdone # exit loop if done
LDR R1, =a # get address of a
LSL R2, R0, #2 # multiply index*4 to get array offset
ADD R2, R1, R2 # R2 now has the element address
LDR R1, [R2] # read the array at address
POP {R5}
CMP R5, R1
MOVGT R5, R1
POP {R3}
CMP R3, R1
MOVLT R3, R1
PUSH {R3}
PUSH {R5}
PUSH {R0} # backup register before printf
PUSH {R1} # backup register before printf
PUSH {R2} # backup register before printf
MOV R2, R1 # move array value to R2 for printf
MOV R1, R0 # move array index to R1 for printf
BL _printf # branch to print procedure with return
POP {R2} # restore register
POP {R1} # restore register
POP {R0} # restore register
ADD R0, R0, #1 # increment index
B readloop # branch to next loop iteration
readdone:
MOV R1, R3
LDR R0, =min_str # R0 contains formatted string address
BL printf
MOV R1, R5
LDR R0, =max_str
BL printf
B _exit
If you would like to see the rest of the code, where the array is created and populated, I can post it. I have tried different tags and backing up R3 and R5 on the stack, and it still prints the wrong values. Specifically, the min value always prints as the last integer in the array and the max always prints as 0.
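As a reference point for the listing above, the intended scan can be modelled in C. This is a sketch, not a diagnosis: the function name `min_max` is made up. Two things in the listing may be worth checking against it. First, R3 starts at 1000 and is printed with min_str, yet it is updated by MOVLT after CMP R3, R1, which replaces it only when R3 is less than the element; R5 is the mirror image with MOVGT. Second, r0-r3 are not preserved across calls under the ARM procedure call standard, so R3 may not survive the last call before readdone.

```c
#include <stddef.h>

/* Hypothetical model of the min/max scan; n must be at least 1. */
void min_max(const int *a, size_t n, int *min_out, int *max_out)
{
    int min = a[0];
    int max = a[0];
    for (size_t i = 1; i < n; i++) {
        if (a[i] < min)     /* the running minimum shrinks when the element is smaller */
            min = a[i];
        if (a[i] > max)     /* the running maximum grows when the element is larger */
            max = a[i];
    }
    *min_out = min;
    *max_out = max;
}
```

Starting both values from the first element avoids depending on sentinel constants such as 1000 and -1.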
I am working on one of the simple programs assigned for an ARM microprocessor course I'm taking. I am having a slight issue getting my code to load a value from an array for comparison, using Keil. The program is supposed to compare 5 numbers and then store values if the comparison is true. When I run my program, it does not load the array values that I declared. My professor isn't much help either and doesn't seem to know why it's not working properly.
Here's what I have done so far. I also think my PUSH is wrong but I can probably figure that out after I at least get the array to load. I should be pushing those values onto the stack but I am pretty sure I'm just loading values in registers instead.
AREA main, CODE, READONLY
EXPORT __main
ENTRY
__main PROC
MOVS r5, #0
LDR r0, =NUMB
loop1
LDR r1, [r0]
CMP r5, #5
BEQ stop
loop
CMP r1, #10
BLT low10
CMP r1, #100
BLT mid
CMP r1, #255
BLT high100
low10
PUSH {r2}
MOVS r2, #2
ADDS r5, #1
B loop1
mid
PUSH {r3}
MOVS r3, #0
ADDS r5, #1
B loop1
high100
PUSH {r4}
MOVS r4, #1
ADDS r5, #1
B loop1
stop B stop
ENDP
AREA myDATA, DATA, READWRITE
ALIGN
NUMB DCD 1,11,111,11,1
END
With respect to the array, your element size is not 1 byte, it is 4 bytes.
Using GNU tools and GDB, if we examine the address in R0 and interpret the memory as signed words (i.e., in 4-byte form), we see the expected array values.
.data
NUMB: .word 1,11,111,11,1
...
LDR r0, =NUMB
(gdb) x/8wd $r0
0x200dc: 1 11 111 11
0x200ec: 1 4929 1634033920 16804194
So the values you use with R5 need to assume 4-byte words, for example:
CMP r5, #(4*5)
ADDS r5, #4
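To make the stride concrete, here is a small C sketch that walks a word array by byte offset, stepping by 4 and stopping at 4*5, as the two instructions above suggest. It assumes the load is changed to use R5 as an offset from R0 (the listing in the question loads from [r0] on every pass); the function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Read one 32-bit word located byte_offset bytes past base,
   roughly what LDR r1, [r0, r5] would do with r5 as a byte offset. */
int32_t load_word_at_offset(const int32_t *base, size_t byte_offset)
{
    int32_t value;
    memcpy(&value, (const unsigned char *)base + byte_offset, sizeof value);
    return value;
}

/* Count the elements below `limit`, advancing the byte offset by 4 each time
   and stopping when it reaches 4 * count (CMP r5, #(4*5) for five elements). */
size_t count_below(const int32_t *base, size_t count, int32_t limit)
{
    size_t hits = 0;
    size_t end = count * sizeof(int32_t);
    for (size_t off = 0; off != end; off += sizeof(int32_t)) {
        if (load_word_at_offset(base, off) < limit)
            hits++;
    }
    return hits;
}
```

With the array 1, 11, 111, 11, 1 from the question, byte offset 8 is the third element, 111.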
It is sadly very simple: in myDATA, just change READWRITE to READONLY :)
This is for homework
I have to take a given string and offset as parameters and create an encrypted version of the string. Here is what I have so far
.global
cypher:
stmfd sp!, {v1-v6, lr} #std
mov v1, a1 #hold the string pointer in v1
bl strlen #get the length of the string in a1
add a1, a1, #1 #add null byte space to strlen
mov v2, a1 #hold the length of space needed in v2
bl malloc #reserve space for new string in a1
mov v3, #0 #initial index of new string
loop:ldr v4, [v1], #4 #load v4 with string pointer and increment by bytes
add v5, v4, a2 #add the offset to the current character
str v5, [a1, v3] #store the new character in the new address
add v3, v3, #4 #increment the index by a byte
cmp v2, v3
bne loop
ldmfd sp!, {v1-v6, pc} #std
.end
I'm having trouble figuring out how to actually increment the character correctly. How do I add an offset to a character? (I'm guessing the ASCII codes need to be incremented?)
You are stepping through the string four bytes at a time, but each character is a single byte.
Below is a corrected version (also slightly optimized: you don't need v3 and v5):
loop:ldrb v4, [v1], #1 #load the next byte of the string into v4 and advance v1 by one byte
subs v2, v2, #1
add v4, v4, a2 #add the offset to the current character
strb v4, [a1], #1 #store the new character in the new address
bne loop
ldmfd sp!, {v1-v6, pc} #std
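The byte-at-a-time loop can also be modelled in C. This is a sketch under assumptions: the name `cypher_model` is made up, and it makes two choices the assembly above would need to handle separately, namely returning the start of the new buffer rather than the advanced write pointer, and copying the terminating NUL unshifted. Note also that a2 is not preserved across the calls to strlen and malloc under the ARM procedure call standard, so the offset would have to be saved in a callee-saved register first.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of cypher: return a new string in which `offset`
   has been added to every character of s. */
char *cypher_model(const char *s, int offset)
{
    size_t len = strlen(s);
    char *start = malloc(len + 1);      /* +1 for the terminating NUL */
    if (start == NULL)
        return NULL;

    for (size_t i = 0; i < len; i++) {
        /* one byte per iteration: ldrb, add, strb */
        start[i] = (char)((unsigned char)s[i] + offset);
    }
    start[len] = '\0';                  /* terminator is not shifted */
    return start;                       /* start of the buffer, not the end */
}
```

Adding an offset to a character is just integer addition on its character code, so an offset of 1 turns "HAL" into "IBM".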
I'm new to assembly programming and I'm programming for ARM.
I'm making a program with two subroutines: one that appends a byte to a byte vector in memory, and one that prints this vector. The first address of the vector contains the number of elements that follow, up to 255. When I debug it with GDB, I can see that the "appendbyte" subroutine works fine. But the "printvector" one has some problems. First, the element loaded into register r1 is wrong (it loads 0 when it should be 7). Then, when I read the register values with GDB after calling "printf", a lot of registers hold different values that weren't supposed to change, since I didn't modify them; I just called "printf". Why is "printf" modifying the values?
I was thinking it might have something to do with alignment. I'm not sure if I'm using the directive correctly.
Here is the full code:
.text
.global main
.equ num, 255 # Max number of elements
main:
push {lr}
mov r8, #7
bl appendbyte
mov r8, #5
bl appendbyte
mov r8, #8
bl appendbyte
bl printvector
pop {pc}
.text
.align
printvector:
push {lr}
ldr r3, =vet # stores the address of the start of the vector in r3
ldr r2, [r3], #1 # stores the number of elements in r2
.align
loop:
cmp r2, #0 #if there isn't elements to print
beq endprint #quit subroutine
ldr r0, =node #r0 receives the print format
ldr r1, [r3], #1 #stores in r1 the value of the element pointed by r3. Increments r3 after that.
sub r2, r2, #1 #decrements r2 (number of elements left to print)
bl printf #call printf
b loop #continue on the loop
.align
endprint:
pop {pc}
.align
appendbyte:
push {lr}
ldr r0, =vet #stores in r0 the beginning address of the vector
ldr r1, [r0], #1 #stores in r1 the number of elements and makes r0 point to the next address
add r3, r0, r1 #stores in r3 the address of the first available position
str r8, [r3] #put the value at the first available position
ldr r0, =vet #stores in r0 the beginning address of the vector
add r1, r1, #1 # increment the number of elements in the vector
str r1, [r0] # stores it in the vector
pop {pc}
.data # Read/write data follows
.align # Make sure data is aligned on 32-bit boundaries
vet: .byte 0
.skip num # Reserve num bytes
.align
node: .asciz "[%d]\n"
.end
The problems are in
ldr r1, [r3], #1
and
bl printf
I hope I was clear on the problem.
Thanks in advance!
The ARM ABI specifies that registers r0-r3 and r12 are volatile across function calls, meaning that the callee does not have to restore their values. LR also changes if you use bl, because LR will then contain the return address for the called function. In printvector, r2 and r3 hold the loop counter and the pointer, so they must either be kept in callee-saved registers (r4-r11) or be saved around the call to printf.
More information can be found in ARM's Information Center entry for the ABI or in the APCS (ARM Procedure Call Standard) document.
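Separately from the register question, the vector layout described in the question (one count byte followed by up to 255 element bytes) can be modelled in C. Every access in this sketch is one byte wide, which corresponds to ldrb/strb rather than the word-sized ldr/str used in the listing; that difference may be related to the wrong value seen in r1. The function names here are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

#define NUM 255                 /* maximum number of elements, as in .equ num */

/* vet[0] holds the element count; vet[1..NUM] hold the elements. */
static uint8_t vet[1 + NUM];

/* Append one byte. Returns 0 on success, -1 if the vector is full. */
int append_byte(uint8_t value)
{
    uint8_t count = vet[0];             /* byte load of the count */
    if (count == NUM)
        return -1;
    vet[1 + count] = value;             /* byte store at the first free slot */
    vet[0] = (uint8_t)(count + 1);      /* byte store of the new count */
    return 0;
}

/* Copy the stored elements into out (capacity cap); returns how many were copied. */
size_t read_vector(uint8_t *out, size_t cap)
{
    size_t count = vet[0];
    size_t n = count < cap ? count : cap;
    for (size_t i = 0; i < n; i++)
        out[i] = vet[1 + i];            /* byte load of each element */
    return n;
}
```

A word-sized store of the count, by contrast, would also write the three bytes after it, which is where the first elements live.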
printvector:
push {lr}
ldr r3, =vet # stores the address of the start of the vector in r3
ldr r2, [r3], #1 # stores the number of elements in r2
.align
loop:
cmp r2, #0 #if there isn't elements to print
beq endprint #quit subroutine
ldr r0, =node #r0 receives the print format
ldr r1, [r3], #1 #stores in r1 the value of the element pointed by r3. Increments r3 after that.
sub r2, r2, #1 #decrements r2 (number of elements left to print)
bl printf #call printf
b loop #continue on the loop
.align
endprint:
pop {pc}
That is definitely not how you use .align. The directive is there to align whatever follows it on some boundary (given by an optional argument; note that this is an assembler directive, not an instruction) by padding the binary with zeros or whatever the fill value is. So you don't want a .align in the code flow, between instructions. You have done that between the ldr r2 and the cmp r2 after loop. The .align after b loop is not harmful, because the branch is unconditional, but it is not necessary either: the assembler is generating a stream of instructions there, so the bytes can't end up unaligned. Where you would use .align is after a data declaration that comes before instructions:
.byte 1,2,3,4,5
.align
some_code_branch_dest:
This matters in particular when the assembler complains about alignment or the code crashes.
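The padding itself is simple arithmetic. As a rough illustration in C (the function names are made up, and the boundary is assumed to be a power of two), this is the rounding an alignment directive performs on the current location:

```c
#include <stdint.h>

/* Round addr up to the next multiple of boundary.
   boundary must be a power of two (for example 4 for word alignment). */
uintptr_t align_up(uintptr_t addr, uintptr_t boundary)
{
    return (addr + (boundary - 1)) & ~(boundary - 1);
}

/* Number of padding bytes needed to reach that boundary. */
uintptr_t padding_needed(uintptr_t addr, uintptr_t boundary)
{
    return align_up(addr, boundary) - addr;
}
```

For the five .byte values above, starting at a word-aligned address, the location after them is at offset 5, so three padding bytes are needed before a word-aligned instruction can follow. After an instruction in ARM state the location is already a multiple of 4, so no padding is inserted.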