Adding an offset to a memory address - masm

I have some code like so (emu8086)
data segment
str1 db "hello"
len dw 4h
data ends
code segment
...
...
mov si, offset str1
lea di, [si + len]
code ends
I would expect this to make di point to the address of DS:0004, however the actual instruction generated is LEA DI, [SI] + 021h.
If instead, I use:
lea di, [si + 4]
Then it works as expected.
How do I make the first version work in a similar way to the second?

Where is your "expected" 4 coming from? If it's from the contents of len dw 4h, then you need a load, like perhaps add si, [len].
lea does not access the contents of memory.
x86 doesn't have a copy-and-add with a memory source, so you have to choose between a "destructive" add with a register destination and a lea, which only does math with registers and assemble-time constants.
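To make the distinction concrete, here is a minimal C model of the two choices. The function names lea_model and add_load_model are hypothetical, and 16-bit wraparound is assumed in order to mirror 8086 registers. It is only a sketch of the semantics, not assembler output.

```c
#include <stdint.h>

/* Models `lea di, [si + disp]`: pure address arithmetic.
   disp is a constant fixed at assemble time (for the symbol `len`,
   that is its offset in the data segment); memory is never read. */
uint16_t lea_model(uint16_t si, uint16_t disp)
{
    return (uint16_t)(si + disp);
}

/* Models `mov di, si` followed by `add di, [len]`:
   the word stored at len is loaded from memory and added. */
uint16_t add_load_model(uint16_t si, const uint16_t *len)
{
    return (uint16_t)(si + *len);
}
```

With si = 0 and a variable holding 4, add_load_model returns 4, whereas lea_model only ever sees whatever constant the assembler substituted for the symbol.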

Can't write to memory requested with malloc/calloc in x64 Assembly

This is my first question on this platform. I'm trying to modify the pixels of an image file and copy them to memory requested with calloc. When the code tries to dereference the pointer to the memory requested with calloc at offset 16360 to write, an "access violation writing location" exception is thrown. Sometimes the offset is slightly higher or lower. The amount of memory requested is correct. When I write equivalent code in C++ with calloc, it works, but not in assembly. I've also tried requesting a larger amount of memory in assembly and raising the heap and stack size in the Visual Studio settings, but nothing works for the assembly code. I also had to set the option /LARGEADDRESSAWARE:NO before I could even build and run the program.
I know that the AVX instruction sets would be better suited for this, but the code would contain slightly more lines, so I made it simpler for this question. I'm also not a pro; I did this to practice the AVX instruction set.
Many thanks in advance :)
const uint8_t* getImagePtr(sf::Image** image, const char* imageFilename, uint64_t* imgSize) {
sf::Image* img = new sf::Image;
img->loadFromFile(imageFilename);
sf::Vector2u sz = img->getSize();
*imgSize = uint64_t((sz.x * sz.y) * 4u);
*image = img;
return img->getPixelsPtr();
}
EXTRN getImagePtr:PROC
EXTRN calloc:PROC
.data
imagePixelPtr QWORD 0 ; contains address to source array of 8 bit pixels
imageSize QWORD 0 ; contains size in bytes of the image file
image QWORD 0 ; contains pointer to image object
newImageMemory QWORD 0 ; contains address to destination array
imageFilename BYTE "IMAGE.png", 0 ; name of the file
.code
mainasm PROC
sub rsp, 40
mov rcx, OFFSET image
mov rdx, OFFSET imageFilename
mov r8, OFFSET imageSize
call getImagePtr
mov imagePixelPtr, rax
mov rcx, 1
mov rdx, imageSize
call calloc
add rsp, 40
cmp rax, 0
je done
mov newImageMemory, rax
mov rcx, imageSize
xor eax, eax
mov bl, 20
SomeLoop:
mov dl, BYTE PTR [imagePixelPtr + rax]
add dl, bl
mov BYTE PTR [newImageMemory + rax], dl ; exception when dereferencing and writing to offset 16360
inc rax
loop SomeLoop
done:
ret
mainasm ENDP
END
Let's translate this line back into C:
mov BYTE PTR [newImageMemory + rax], dl ;
In C, this is more or less equivalent to:
*((unsigned char *)&newImageMemory + rax) = dl;
Which is clearly not what you want. It's writing to an offset from the location of newImageMemory, and not to an offset from where newImageMemory points to.
You will need to load the pointer stored in newImageMemory into a register if you want to use it as the base address for an offset.
While we're at it, this line is also wrong, for the same reason:
mov dl, BYTE PTR [imagePixelPtr + rax]
It just happens not to crash.
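For comparison, here is a small C sketch of what the loop was meant to do. The helper name add_to_bytes is hypothetical. The point is that src and dst are pointer values held in locals (registers, in assembly terms), so indexing them reaches the buffers rather than the variables that store the pointers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper mirroring the intended loop: dst[i] = src[i] + delta. */
void add_to_bytes(const uint8_t *src, uint8_t *dst, size_t n, uint8_t delta)
{
    for (size_t i = 0; i < n; i++) {
        /* src[i] and dst[i] index from the addresses held in src and dst,
           not from the locations of the pointer variables themselves. */
        dst[i] = (uint8_t)(src[i] + delta);   /* wraps modulo 256, like add dl, bl */
    }
}
```

In the assembly version, this corresponds to loading imagePixelPtr and newImageMemory into registers before the loop and addressing [reg + rax] inside it.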

Copying to and Displaying an Array

Hello Everyone!
I'm a newbie at NASM and I just started out recently. I currently have a program that reserves an array and is supposed to copy and display the contents of a string from the command line arguments into that array.
Now, I am not sure if I am copying the string correctly as every time I try to display this, I keep getting a segmentation error!
This is my code for copying the array:
example:
%include "asm_io.inc"
section .bss
X: resb 50 ;;This is our array
~some code~
mov eax, dword [ebp+12] ; eax holds address of 1st arg
add eax, 4 ; eax holds address of 2nd arg
mov ebx, dword [eax] ; ebx holds 2nd arg, which is pointer to string
mov ecx, dword 0
;Where our 2nd argument is a string eg "abcdefg" i.e ebx = "abcdefg"
copyarray:
mov al, [ebx] ;; get a character
inc ebx
mov [X + ecx], al
inc ecx
cmp al, 0
jz done
jmp copyarray
My question is whether this is the correct method of copying the array and how can I display the contents of the array after?
Thank you!
The loop looks ok, but clunky. If your program is crashing, use a debugger. See the x86 tag wiki for links and a quick intro to gdb for asm.
I think you're getting argv[1] loaded correctly. (Note that this is the first command-line arg, though. argv[0] is the command name.) https://en.wikibooks.org/wiki/X86_Disassembly/Functions_and_Stack_Frames says ebp+12 is the usual spot for the 2nd arg to 32-bit functions that bother to set up a stack frame.
Michael Petch commented on Simon's deleted answer that the asm_io library has print_int, print_string, print_char, and print_nl routines, among a few others. So presumably you pass a pointer to your buffer to one of those functions and call it a day. Or you could call sys_write(2) directly with an int 0x80 instruction, since you don't need to do any string formatting and you already have the length.
Instead of incrementing separately for two arrays, you could use the same index for both, with an indexed addressing mode for the load.
;; main (int argc ([esp+4]), char **argv ([esp+8]))
... some code you didn't show that I guess pushes some more stuff on the stack
mov eax, dword [ebp+12] ; load argv
;; eax + 4 is &argv[1], the address of the 1st cmdline arg (not counting the command name)
mov esi, dword [eax + 4] ; esi holds 2nd arg, which is pointer to string
xor ecx, ecx
copystring:
mov al, [esi + ecx]
mov [X + ecx], al
inc ecx
test al, al
jnz copystring
I changed the comments to say "cmdline arg", to distinguish between those and "function arguments".
When it doesn't cost any extra instructions, use esi for source pointers, edi for dest pointers, for readability.
Check the ABI for which registers you can use without saving/restoring (eax, ecx, and edx at least. That might be all for 32bit x86.). Other registers have to be saved/restored if you want to use them. At least, if you're making functions that follow the usual ABI. In asm you can do what you like, as long as you don't tell a C compiler to call non-standard functions.
Also note the improvement in the end of the loop. A single jnz to loop is more efficient than jz break / jmp.
This should run at one cycle per byte on Intel, because test/jnz macro-fuse into one uop. The load is one uop, and the store micro-fuses into one uop. inc is also one uop. Intel CPUs since Core2 are 4-wide: 4 uops issued per clock.
Your original loop runs at half that speed. Since it's 6 uops, it takes 2 clock cycles to issue an iteration.
Another hacky way to do this would be to get the offset between X and ebx into another register, so one of the effective addresses could use a one-register addressing mode, even if the dest wasn't a static array.
mov [X + ebx + ecx], al. (Where ecx = X - start_of_src_buf). But ideally you'd make the store the one that used a one-register addressing mode, unless the load was a memory operand to an ALU instruction that could micro-fuse it. Where the dest is a static buffer, this address-difference hack isn't useful at all.
You can't use rep string instructions (like rep movsb) to implement strcpy for implicit-length strings (C null-terminated, rather than with a separately-stored length). Well, you could, but only by scanning the source twice: once to find the length, and again to memcpy.
To go faster than one byte per clock, you'd have to use vector instructions to test for the null byte at any of 16 positions in parallel. Look up an optimized strcpy implementation for an example. It probably uses pcmpeqb against a vector register of all-zeros.
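As a rough C rendering of the two scalar approaches discussed above, the sketch below shows the single-index copy loop and the two-pass variant (find the length, then block-copy). The function names are hypothetical, and the caller is assumed to provide a destination large enough for the string plus its terminator (the 50-byte X in the question).

```c
#include <stddef.h>
#include <string.h>

/* One index for both arrays; the terminator is copied before the test,
   matching the load / store / inc / test / jnz loop. */
size_t copy_single_index(char *dst, const char *src)
{
    size_t i = 0;
    char c;
    do {
        c = src[i];
        dst[i] = c;
        i++;
    } while (c != '\0');
    return i;               /* bytes written, including the terminator */
}

/* Two passes over the source: strlen to find the length, then memcpy.
   This is the shape a rep movsb based copy would need. */
size_t copy_two_pass(char *dst, const char *src)
{
    size_t n = strlen(src) + 1;   /* +1 for the terminator */
    memcpy(dst, src, n);
    return n;
}
```

Both return the number of bytes written, which is also the length you would hand to a write call if you dropped the terminator.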

Why do byte spills occur and what do they achieve?

What is a byte spill?
When I dump the x86 ASM from an LLVM intermediate representation generated from a C program, there are numerous spills, usually of a 4 byte size. I cannot figure out why they occur and what they achieve.
They seem to "cut" pieces of the stack off, but in an unusual way:
## this fragment comes from a C program right before a malloc() call to a struct.
## there are other spills in different circumstances in this same program, so it
## is not related exclusively to malloc()
...
sub ESP, 84
mov EAX, 60
mov DWORD PTR [ESP + 80], 0
mov DWORD PTR [ESP], 60
mov DWORD PTR [ESP + 60], EAX # 4-byte Spill
call malloc
mov ECX, 60
...
A register spill is simply what happens when you have more local variables than registers (it's an analogy - really the meaning is that they must be saved to memory). The instruction is saving the value of EAX, likely because EAX is clobbered by malloc and you don't have another spare register to save it in (and for whatever reason the compiler has decided it needs the constant 60 in the register later).
By the looks of it, the compiler could certainly have omitted mov DWORD PTR [ESP + 60], EAX and instead repeated the mov EAX, 60 where it would otherwise mov EAX, DWORD PTR [ESP + 60] or whatever offset it used, because the saved value of EAX cannot be other than 60 at that point. However, compilation is not guaranteed to be perfectly optimal.
Bear also in mind that after sub ESP, 84, the stack size is not adjusted (except by the call instruction which of course pushes the return address). The following instructions are using ESP as a memory offset, not a destination.

8086 assembly - how to access array elements within a loop

Ok, to make things as simple as possible, say I have a basic loop that I want to use in order to modify some elements of an array labeled a. In the following sample code I've tried replacing all elements of a with 1, but that doesn't really work.
assume cs:code ,ds:data
data segment
a db 1,2,3,4
i db 0
data ends
code segment
start:
mov ax,data
mov ds,ax
lea si,a
the_loop:
mov cl,i
cmp cl,4
jae the_end
mov ds:si[i],1 ; this is the part that i don't really understand since
inc i ; i'm expecting i=0 and ds:si[i] equiv to ds:si[0] which
loop the_loop ; is apparently not the case here since i actually receives the
; the value 1
the_end:
mov ax,4c00h
int 21h
code ends
end start
I am aware that I could simply do this by modifying the element stored in al after the lodsb instruction, and just store that. But I would like to know if it is possible to do something like what I've tried above.
In x86 assembly you can't use a value stored in memory to address memory indirectly.
You need to read i into some register that can be used for memory addressing, and use that instead. You may want to check Wikipedia for 8086 memory addressing modes.
So, replace
mov ds:si[i],1
with (the ds segment override is unnecessary here, as ds is the default segment for si, bx, and bx+si too):
xor bx,bx
mov bl,[i]
mov [bx+si],byte 1 ; some other assemblers want byte ptr
There are other problems with your code too. The entire loop can be made easier and fixed this way:
lea si,a
mov cx,4 ; number of elements in a (i holds 0, which would make the counter wrap around)
#fill_loop:
mov [si], byte 1
inc si
dec cx
jnz #fill_loop
Or, if you want to save 1 byte, use the loop instruction:
#fill_loop:
mov [si], byte 1
inc si
loop #fill_loop
Note that in 16-bit mode the loop instruction decrements cx and jumps to the label if cx is not zero after the decrement. In 32-bit mode loop decrements ecx, and in 64-bit mode (x86-64) it decrements rcx.
I suppose that your code does not even run through the assembler, since
mov ds:si[i],1
is not a valid address mode.
Use something like
mov byte ptr [si],1 ; store value 1 at [SI]
inc si ; point to next array element
instead (used MASM to verify the syntax).
The DS: prefix is unnecessary for [si] since this is the default.
See also The 80x86 Addressing Modes.
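For reference, the pointer-increment fill loop can be sketched in C as follows. The name fill_bytes is hypothetical. Unlike a bare loop instruction, which decrements cx before testing it and therefore runs 65536 times when cx starts at 0, this version checks the count first.

```c
#include <stddef.h>
#include <stdint.h>

/* Store `value` into `count` consecutive bytes starting at a.
   p plays the role of si; count plays the role of cx. */
void fill_bytes(uint8_t *a, size_t count, uint8_t value)
{
    uint8_t *p = a;
    while (count != 0) {    /* test before the first store, so count 0 is safe */
        *p = value;         /* mov byte ptr [si], value */
        p++;                /* inc si */
        count--;            /* dec cx */
    }
}
```

Calling fill_bytes(a, 4, 1) on the four-element array from the question sets every element to 1.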

Copying Array Content to Another Array in Assembly

I'm looking to copy some elements of an array to another one in Assembly. Both arrays are accessed via pointers which are stored in registers. So, edx would be pointing to one array and eax would point to another. Basically, edx points to an array of characters read in from a text file, and I'd like the array at eax to contain only 32 of the characters. Here's what I'm attempting to do:
I386 Assembly using NASM
add edx, 8 ; the first 8 characters of the string are not wanted
mov cl, 32
ip_address:
; move the character currently pointed to by edx to eax (mov [eax], [edx])
inc edx
inc eax
loop ip_address
Again, I'd like this to place the 32 characters after the first eight in the second array. The problem is that I'm stumped on how to do this. Any help is very much appreciated.
You can't do direct memory-to-memory moves in x86. You need to use another scratch register:
mov bl, [edx] ; one byte at a time; ecx is avoided because loop uses it as the counter
mov [eax], bl
Or something like that...
Both IA-32 and x86-64 also provide memory-to-memory string move instructions that can move bytes, "words", and "doublewords":
movsb
movsw
movsd
The source address is specified in ESI and the destination in EDI [1]. By itself, the instruction moves one byte, word, or doubleword. If the rep prefix is used, ECX holds a count and the instruction moves an entire string of values.
[1] I think these instructions are the reason that the ESI and EDI registers are so named (Source Index and Destination Index).
The simple solution is to just do:
mov ebx, [edx]
mov [eax], ebx
Be aware that under many platforms' ABIs, ebx is a callee-saved register, so you will need to save and restore its value in your function.
The simpler solution is to link against the standard library and call memcpy, which is perfectly acceptable in assembly, and will usually be substantially faster than writing your own loop.
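In C terms, the memcpy route for this question is a one-liner. The sketch below assumes the layout described in the question (skip 8 characters, copy the next 32). The names copy_field, FIELD_SKIP, and FIELD_LEN are hypothetical.

```c
#include <string.h>

enum { FIELD_SKIP = 8, FIELD_LEN = 32 };

/* Copy the 32 characters that follow the first 8 of src into dst.
   dst must have room for FIELD_LEN bytes; no terminator is added. */
void copy_field(char *dst, const char *src)
{
    memcpy(dst, src + FIELD_SKIP, FIELD_LEN);
}
```

From assembly, the equivalent is to pass the destination pointer, the source pointer plus 8, and the count 32 to memcpy according to the platform's calling convention.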
