Assembly: store a string in register - c

Let's say we work on architecture x86_64 and let's say we have the following string, "123456". In ASCII characters, it becomes 31 32 33 34 35 36 00.
Now, which assembly instructions should I use to move the entire (even if fragmented) content of this string somewhere in a way that %rdi stores the address of that string (points to that)?
Because I am not simply able to move the hex representation of the string into a register, like one can do with unsigned values, how do I do it?

There are a couple of ways to do so.
If you want to move the entire string to another offset first, you would have to do so with a loop.
mov rbx, 0
loop:
mov al, [string+rbx]
mov [copyoffset+rbx], al
inc rbx
cmp al, 0x0
jne loop
... Insert other code here
Then you can use the Lea instruction described below to move it into rdi.
If you just want to load the address of the string and don't care about moving it you can just use lea
lea rdi, [stringoffset]
Edit: Changed rax to al so we only move one byte at a time

Related

Copying to and Displaying an Array

Hello Everyone!
I'm a newbie at NASM and I just started out recently. I currently have a program that reserves an array and is supposed to copy and display the contents of a string from the command line arguments into that array.
Now, I am not sure if I am copying the string correctly as every time I try to display this, I keep getting a segmentation error!
This is my code for copying the array:
example:
%include "asm_io.inc"
section .bss
X: resb 50 ;;This is our array
~some code~
mov eax, dword [ebp+12] ; eax holds address of 1st arg
add eax, 4 ; eax holds address of 2nd arg
mov ebx, dword [eax] ; ebx holds 2nd arg, which is pointer to string
mov ecx, dword 0
;Where our 2nd argument is a string eg "abcdefg" i.e ebx = "abcdefg"
copyarray:
mov al, [ebx] ;; get a character
inc ebx
mov [X + ecx], al
inc ecx
cmp al, 0
jz done
jmp copyarray
My question is whether this is the correct method of copying the array and how can I display the contents of the array after?
Thank you!
The loop looks ok, but clunky. If your program is crashing, use a debugger. See the x86 for links and a quick intro to gdb for asm.
I think you're getting argv[1] loaded correctly. (Note that this is the first command-line arg, though. argv[0] is the command name.) https://en.wikibooks.org/wiki/X86_Disassembly/Functions_and_Stack_Frames says ebp+12 is the usual spot for the 2nd arg to a 32bit functions that bother to set up stack frames.
Michael Petch commented on Simon's deleted answer that the asm_io library has print_int, print_string, print_char, and print_nl routines, among a few others. So presumably you a pointer to your buffer to one of those functions and call it a day. Or you could call sys_write(2) directly with an int 0x80 instruction, since you don't need to do any string formatting and you already have the length.
Instead of incrementing separately for two arrays, you could use the same index for both, with an indexed addressing mode for the load.
;; main (int argc ([esp+4]), char **argv ([esp+8]))
... some code you didn't show that I guess pushes some more stuff on the stack
mov eax, dword [ebp+12] ; load argv
;; eax + 4 is &argv[1], the address of the 1st cmdline arg (not counting the command name)
mov esi, dword [eax + 4] ; esi holds 2nd arg, which is pointer to string
xor ecx, ecx
copystring:
mov al, [esi + ecx]
mov [X + ecx], al
inc ecx
test al, al
jnz copystring
I changed the comments to say "cmdline arg", to distinguish between those and "function arguments".
When it doesn't cost any extra instructions, use esi for source pointers, edi for dest pointers, for readability.
Check the ABI for which registers you can use without saving/restoring (eax, ecx, and edx at least. That might be all for 32bit x86.). Other registers have to be saved/restored if you want to use them. At least, if you're making functions that follow the usual ABI. In asm you can do what you like, as long as you don't tell a C compiler to call non-standard functions.
Also note the improvement in the end of the loop. A single jnz to loop is more efficient than jz break / jmp.
This should run at one cycle per byte on Intel, because test/jnz macro-fuse into one uop. The load is one uop, and the store micro-fuses into one uop. inc is also one uop. Intel CPUs since Core2 are 4-wide: 4 uops issued per clock.
Your original loop runs at half that speed. Since it's 6 uops, it takes 2 clock cycles to issue an iteration.
Another hacky way to do this would be to get the offset between X and ebx into another register, so one of the effective addresses could use a one-register addressing mode, even if the dest wasn't a static array.
mov [X + ebx + ecx], al. (Where ecx = X - start_of_src_buf). But ideally you'd make the store the one that used a one-register addressing mode, unless the load was a memory operand to an ALU instruction that could micro-fuse it. Where the dest is a static buffer, this address-different hack isn't useful at all.
You can't use rep string instructions (like rep movsb) to implement strcpy for implicit-length strings (C null-terminated, rather than with a separately-stored length). Well you could, but only scanning the source twice: once for find the length, again to memcpy.
To go faster than one byte clock, you'd have to use vector instructions to test for the null byte at any of 16 positions in parallel. Google up an optimized strcpy implementation for example. Probably using pcmpeqb against a vector register of all-zeros.

Passing values from one array to another in assembler

I enter assembler function with two C char arrays like that:
EncryptAsm(arr1,arr2)
where both are of type char*, one containing text and the second one is full of '#' signs and it acts like two dimensional array, both are the same length.
I'm trying to pass some values from first array to second one in asm procedure:
mov ecx,row ;calculating index of arr2 index=[row*inputLength+column]
imul ecx,ebx
add ecx,column
mov eax,1 ;calculating index of arr1
imul eax,iterator
mov esi,arr1[eax]
mov edi,arr2[ecx]
movsb
When the indexes of both arrays equal 0 (eax and ecx are 0) everything is fine, but if it's bigger it doesn't work and throws an error (eg. eax==1).
In asm code, the arrays are of type:
arr1:ptr byte, arr2:ptr byte
What am I doing wrong?
can you check the assembly guide of the movsb? If a normal Intel movsb,
it shall code like this:
CLD
MOV ECX ,100
LEA ESI,FIRST
LEA EDI,SECOND
REP MOVSB
And also, something needed to be checked:
1, the segment of SI/DI, if write access and segment length are right
2, interrupt protection during the REP MOVSB

8086 assembly - how to access array elements within a loop

Ok, to make things as simple as possible, say I have a basic loop that i want to use in order to modify some elements of an array labeled a. In the following sample code I've tried replacing all elements of a with 1, but that doesn't really work.
assume cs:code ,ds:data
data segment
a db 1,2,3,4
i db 0
data ends
code segment
start:
mov ax,data
mov ds,ax
lea si,a
the_loop:
mov cl,i
cmp cl,4
jae the_end
mov ds:si[i],1 ; this is the part that i don't really understand since
inc i ; i'm expecting i=0 and ds:si[i] equiv to ds:si[0] which
loop the_loop ; is apparently not the case here since i actually receives the
; the value 1
the_end:
mov ax,4c00h
int 21h
code ends
end start
I am aware that I could simply do this by modifying the element stored in al after the lodsb instruction, and just store that. But I would like to know if it is possible to do something like what I've tried above.
In x86 assembly you can't use a value stored to a memory to address memory indirectly.
You need to read i into some register that can be used for memory addressing, and use that instead. You may want to check Wikipedia for 8086 memory addressing modes.
So, replace
mov ds:si[i],1
with (segment ds is unnecessary here, as it's the default of si, bx and bx+si too):
xor bx,bx
mov bl,[i]
mov [bx+si],byte 1 ; some other assemblers want byte ptr
There are other problems with your code too. The entire loop can be made easier and fixed this way:
lea si,a
xor cx,cx
mov cl,[i]
#fill_loop:
mov [si], byte 1
inc si
dec cx
jnz #fill_loop
Or, if you want to save 1 byte and use loop instruction.
#fill_loop:
mov [si], byte 1
inc si
loop #fill_loop
Note that in 16-bit mode loop instruction decrements cx and jumps to label if cx is not zero after decrement. However, in 32-bit mode loop decrements ecx and in 64-bit mode (x86-64) it decrements rcx.
I suppose that your code does not even run through the assembler, since
mov ds:si[i],1
is not a valid address mode.
Use something like
mov byte ptr [si],1 ; store value 1 at [SI]
inc si ; point to next array element
instead (used MASM to verify the syntax).
The DS: prefix is unnecessary for [si] since this is the default.
See also The 80x86 Addressing Modes.

Displaying hexadecimal contents in assembly language

Hey guys I'm not sure if I'm going about all this the right way. I need the first 12 numbers of Fibonacci sequence to calculate which its doing already I'm pretty sure. But now I need to display the hexadecimal contents of (Fibonacci) in my program using dumpMem. I need to be getting a print out of : 01 01 02 03 05 08 0D 15 22 37 59 90
But I'm only getting: 01 01 00 00 00 00 00 00 00 00 00 00
Any tips or help is much much appreciated.
INCLUDE Irvine32.inc
.data
reg DWORD -1,1,0 ; Initializes a DOUBLEWORD array, giving it the values of -1, 1, and 0
array DWORD 48 DUP(?)
Fibonacci BYTE 1, 1, 10 DUP (?)
.code
main PROC
mov array, 1
mov esi,OFFSET array ; or should this be Fibonacci?
mov ecx,12
add esi, 4
L1:
mov edx, [reg]
mov ebx, [reg+4]
mov [reg+8], edx
add [reg+8], ebx ; Adds the value of the EBX and 'temp(8)' together and stores it as temp(8)
mov eax, [reg+8] ; Moves the value of 'temp(8)' into the EAX register
mov [esi], eax ; Moves the value of EAX into the offset of array
mov [reg], ebx ; Moves the value of the EBX register to 'temp(0)'
mov [reg+4], eax ; Moves the value of the EAX register to 'temp(4)
add esi, 4
; call DumpRegs
call WriteInt
loop L1
;mov ebx, offset array
;mov ecx, 12
;L2:
;mov eax, [esi]
;add esi, 4
;call WriteInt
;loop L2
;Below will show hexadecimal contents of string target-----------------
mov esi, OFFSET Fibonacci ; offset the variables
mov ebx,1 ; byte format
mov ecx, SIZEOF Fibonacci ; counter
call dumpMem
exit
main ENDP
END main
It seems to me that the problem here is with computing the Fibonacci sequence. Your code for that leaves me somewhat...puzzled. You have a bunch of "stuff" there, that seems to have nothing to do with computing Fibonacci numbers (e.g., reg), and others that could, but it seems you don't really know what you're trying to do with them.
Looking at your loop to compute the sequence, the first thing that practically jumps out at me is that you're using memory a lot. One of the first (and most important) things when you're writing assembly language is to maximize your use of registers and minimize your use of memory.
As a hint, I think if you read anything from memory in the course if computing the sequence, you're probably making a mistake. You should be able to do all the computation in registers, so the only memory references will be writing results. Since you're (apparently) producing only byte-sized results, you should need only one array of the proper number of bytes to hold the results (i.e., one byte per number you're going to generate).
I'm tempted to write a little routine showing how neatly this can be adapted to assembly language, but I suppose I probably shouldn't do that...
Your call to dumpMem is correct, but your program is not storing the results of your calculations at the correct location: the region you call "Fibonacci" remains initialized to 1, 1, and ten zeros. You need to make sure that your loop starts writing at the offset of Fibonacci plus 2, and moves ten times in one-byte increments (ten, not twelve, because you provided the two initial items in the initializer).
I'm sorry, I cannot be any more specific, as any question containing the word "Fibonacci" inevitably turns out to be someone's homework :-)

Copying Array Content to Another Array in Assembly

I'm looking to copy some elements of an array to another one in Assembly. Both arrays are accessed via pointers which are stored in registers. So, edx would be pointing to one array and eax would point to another. Basically, edx points to an array of character read in from a text file, and I'd like eax to contain only 32 of the characters. Here's what I'm attempting to do:
I386 Assembly using NASM
add edx, 8 ; the first 8 characters of the string are not wanted
mov cl, 32
ip_address:
; move the character currently pointed to by edx to eax (mov [eax], [edx])
inc edx
inc eax
loop ip_address
Again, i'd like this to place the 32 characters after the first eight to be placed in the second array. The problem is that I'm stumped on how to do this.. Any help is very much appreciated.
You can't do direct memory-to-memory moves in x86. You need to use another scratch register:
mov ecx, [edx]
mov [eax], ecx
Or something like that...
Both ia32 and ia64 do contain a memory-to-memory string move instruction that can move bytes, "words", and "doublewords".
movsb
movsw
movsd
The source address is specified in ESI and the destination in EDI.1 By itself, it moves one byte, word, or doubleword. If the rep prefix is used, then ECX will contain a count and the instruction will move an entire string of values.
1. I think these instructions are the reason that the ESI and EDI registers are so named. (Source Index and Destination Index.)
The simple solution is to just do:
mov ebx, [edx]
mov [eax], ebx
Be aware that under many platform's ABIs, ebx is a callee-save register, so you will need to save and restore its value in your function.
The simpler solution is to link against the standard library and call memcpy, which is perfectly acceptable in assembly, and will usually be substantially faster than writing your own loop.

Resources