Segmentation fault when writing assembly version of reference operator [duplicate] - c

I'm quite new to x86 assembly, and I'm trying to build off a hello world program. I'm trying to make a subroutine, that writes a single byte to stdout, but i've hit a problem.
The line mov ebx, [esp+1] (to load the byte passed, when I call the subroutine) causes a segfault.
I've tried xoring the ebx register with itself, to make sure that it is empty, to make sure, that it doesn't mess with the syscall
_start:
push 32h
call _writeByte
; This just jumps to an exit routine
jmp _exit
_writeByte:
; This line causes the problem. If I remove it the program works fine
mov ebx, [esp+1]
xor ebx, ebx
mov eax, 1
mov edi, 1
mov esi, tmp
mov edx, 1
syscall
ret
Why is the program segfaulting?

I'm in x64 mode, and like a bunch of people suggested in the comments using mov ebx, [rsp+8] worked, because esp are just the 4 lower bytes of the register. The stack is outside the low 4 GiB of virtual address space, so ESP != RSP and [esp] will be an unmapped page.
Note that x86-64 calling conventions pass the first few args in register, not on the stack, so you normally don't want to do this at all (unless your function has lots of args).

Related

x86 Assembly Language: Filling an array with values generated by a subroutine

Edit: Thank you so much for all your help! I really appreciate it because for some reason I am having some trouble conceptualizing assembly but I'm piecing it together.
1) I am using the debugger to step through line by line. The problem, Unhandled exception at 0x0033366B in Project2.exe: 0xC0000005: Access violation writing location 0x00335FF8 occurs at this line:
mov [arrayfib + ebp*4], edx
Is think maybe this because the return statement from the other loop is not able to be accessed by the main procedure, but not sure - having a hard time understanding what is happening here.
2) I have added additional comments, hopefully making this somewhat clearer. For context: I've linked the model I've used to access the Fibonacci numbers, and my goal is to fill this array with the values, looping from last to first.
.386
.model flat, stdcall
.stack 4096
INCLUDE Irvine32.inc
ExitProcess PROTO, dwExitCode: DWORD
.data
arrayfib DWORD 35 DUP (99h)
COMMENT !
eax = used to calculate fibonacci numbers
edi = also used to calculate fibonacci numbers
ebp = number of fibonacci sequence being calculated
edx = return value of fibonacci number requested
!
.code
main PROC
;stores n'th value to calculate fibonacci sequence to
mov ebp, 30
mov edx, 0
;recursive call to fibonacci sequence procedure
call FibSequence
mov [arrayfib + ebp*4], edx
dec ebp
jnz FibSequence
mov esi, OFFSET arrayfib
mov ecx, LENGTHOF arrayfib
mov ebx, TYPE arrayfib
call DumpMem
INVOKE ExitProcess, 0
main ENDP
;initializes 0 and 1 as first two fibonacci numbers
FibSequence PROC
mov eax, 0
mov edi, 1
;subrracts 2 from fibonacci number to be calculated. if less than 0, jumps to finish,
;else adds two previous fibonacci numbers together
L1:
sub ebp, 2
cmp ebp, 0
js FINISH
add eax, edi
add edi, eax
LOOP L1
;stores odd fibonacci numbers
FINISH:
test eax, 1
jz FINISHEVEN
mov edx, eax
ret
;stores even fibonacci numbers
FINISHEVEN:
mov edx, edi
ret
FibSequence ENDP
END main
Your Fibonacci function destroys EBP, returning with it less than zero.
If your array is at the start of a page, then arrafib + ebp*4] will try to access the previous page and fault. Note the fault address of 0x00335FF8 - the last 3 hex digits are the offset within a 4k virtual page, an 0x...3FF8 = 0x...4000 + 4*-2.
So this is exactly what happened: EBP = -2 when your mov store executed.
(It's normal for function calls to destroy their register args in typical calling conventions, although using EBP for arg-passing is unusual. Normally on Windows you'd pass args in ECX and/or EDX, and return in EAX. Or on the stack, but that sucks for performance.)
(There's a lot of other stuff that doesn't make sense about your Fibonacci function too, e.g. I think you want jmp L1 not loop L1 which is like dec ecx / jnz L1 without setting flags).
In assembly language, every instruction has a specific effect on the architectural state (register values and memory contents). You can look up this effect in an instruction-set reference manual like https://www.felixcloutier.com/x86/index.html.
There is no "magic" that preserves registers for you. (Well, MASM will push/pop for you if you write stuff like proc foo uses EBP, but until you understand what that's doing for you it's better not to have the assembler adding instructions to you code.)
If you want a function to preserve its caller's EBP value, you need to write the function that way. (e.g. mov to copy the value to another register.) See What are callee and caller saved registers? for more about this idea.
maybe this because the return statement from the other loop is not able to be accessed by the main procedure
That doesn't make any sense.
ret is just how we write pop eip. There are no non-local effects, you just have to make sure that ESP is pointing to the address (pushed by call) that you want to jump to. Aka the return address.

Assembler Intel x86 Loop n times with user input

I'm learning x86 assembly and I'm trying to write a program that reads a number n (2 digits) from user input and iterate n times.
I've tried many ways but I get an infinite loop or segment fault.
input:
push msgInputQty
call printf
add esp, 4
push quantity
call gets
add esp, 4
mov ecx, 2
mov eax, 0
mov ebx, 0
mov edi, 0
mov dl, 10
transform:
mul dl
mov ebx, 0
mov bl, byte[quantity+edi]
sub bl, 30h
add eax, ebx
inc edi
loop transform
mov ecx, eax
printNTimes:
push msgDig
call printf
add esp, 4
loop printNTimes
I'd like to save in ecx and iterate n times this number
Your ecx register is being blown away by the call to printf.
ecx is a volatile register in some calling conventions and its likely that your loop is being corrupted by what printf is leaving in there.
To begin with, I would follow Raymond's advice in the comment attached to your original question and attach a debugger to witness this behaviour for yourself.
As for a solution, you can try preserving ecx and restoring it after the call to see the difference:
; for example
mov edi,ecx
call printf
mov ecx,edi
There may be more issues here (hard to know for sure since your code is incomplete ... but things like your stack allocations that don't appear to be for any reason are interesting) - but that is a good place to start.
Peter has left a comment under my answer to point out that you could remove the issue and optimize my solution by just not using ecx for your loop at all and instead do it manually, making your code change:
mov edi, eax
printNTimes:
push msgDig
call printf
add esp, 4
dec edi
jnz printNTimes

Assembly Program to calculate the sum of an array of 10 32-bit integers using loop(s)

I am trying to put together this assembly language program, and honestly, I am not sure of what I am doing. I tried to look up to some online example. Please assist me and explain to me what each line is does?
Calculate the sum of an array of 10 32-bit integers using loop(s). You may hard-code the input integers. Save the sum in register EAX.
INCLUDE Irvine32.inc
.data
arrayVal DWORD 1,2,3,4,5,6,7,8,9,10
counter = 0
.code
main:
xor eax, eax
xor edx, edx
mov ecx, LENGTHOF arrayVal
L1:
mov ebx, DWORD arrayVal [edx]
add eax, ebx
inc edx
loop L1
Call WriteDec
exit
end main
I haven't done assembly since college but i will try to explain as much as i remember it might be bit off.
INCLUDE Irvine32.inc
this line is used to import Irvine32.inc which is required for 32 bit programming.
.data
An assembly programs is made up of multiple sections (memory segments) the data section is used to allocate memory spaces and variables to be used in the subsequent sections
arrayVal DWORD 1,2,3,4,5,6,7,8,9,10
here we create an array which is 32bit signified by DWORD and initialized with values 1 to 10. With arrayVal pointing to 1.
counter = 0
We will come to this little later i have doubts about this :P
.code
main:
The code section and main, the above two lines roughly says we are done with declaring variables, all the following stated stuff is instructions you should execute as main function.
Before i explain the other following code you should understand that even though you created memory segments that hold data they can't be used for operations in assembly. you must use special locations called registers for them more info here. edx, eax and ecx are all registers used for special functions.
xor eax, eax
xor edx, edx
register basically store binary data, they can have data from old operations before so we reset them to zero by using on xor with themselves.(they are faster than setting them by zero explanation here).
mov ecx, LENGTHOF arrayVal
this command basically moves the length of arrayval to the ecx register which is typically used as counter.
So the general logic will be. Read one value store to ebx, add it to eax (our accumalator) and then get next arrayvalue into ebx and add it to ebx and so on and so forth until we have added all values.
L1: and loop L1
the loop is instruction is very special. It basically does two things jump to where we have mentioned in this case L1 and reduce ecx automatically until ecx is zero. and everything in between gets repeated until ecx and hence these lines
mov ebx, DWORD arrayVal [edx]
add eax, ebx
inc edx
gets executed for all values of array. but this begs the question why do you need counter in the first place in data section(maybe it is a reserved word for ecx i am not sure).
Call WriteDec
exit
now after adding all the values if you call WriteDec it prints it values from eax register to standard output and we are done . so we exit and end main.
There are some things that seems off and unnecessary but if you google around a bit you should understand more. This seems a good place to start. Maybe read a couple of books, since you seem to be a very fresh beginner.

What is the correct way to loop if I use ECX in loop (Assembly)

I am currently learning assembly language, and I have a program which outputs "Hello World!" :
section .text
global _start
_start:
mov ebx, 1
mov ecx, string
mov edx, string_len
mov eax, 4
int 0x80
mov eax, 1
int 0x80
section .data
string db "Hello World!", 10, 0
string_len equ $ - string
I understand how this code works. But, now, I want to display 10 times the line. The code which I saw on the internet to loop is the following:
mov ecx, 5
start_loop:
; the code here would be executed 5 times
loop start_loop
Problem : I tried to implement the loop on my code but it outputs an infinite loop. I also noticed that the loop needs ECX and the write function also needs ECX. What is the correct way to display 10 times my "Hello World!" ?
This is my current code (which produces infinite loop):
section .text
global _start
_start:
mov ecx, 10
myloop:
mov ebx, 1 ;file descriptor
mov ecx, string
mov edx, string_len
mov eax, 4 ; write func
int 0x80
loop myloop
mov eax, 1 ;exit
int 0x80
section .data
string db "Hello World!", 10, 0
string_len equ $ - string
Thank you very much
loop uses the ecx register. This is easy the remember because the c stands for counter. You however overwrite the ecx register so that will never work!
The easiest fix is to use a different register for your loop counter, and avoid the loop instruction.
mov edi,10 ;registers are general purpose
loop:
..... do loop stuff as above
;push edi ;no need save the register
int 0x80
.... ;unless you yourself are changing edi
int 0x80
;pop edi ;restore the register. Remember to always match push/pop pairs.
sub edi,1 ;sub sometimes faster than dec and never slower
jnz loop
There is no right or wrong way to loop.
(Unless you're looking for every last cycle, which you are not, because you've got a system call inside the loop.)
The disadvantage of loop is that it's slower than the equivalent sub ecx,1 ; jnz start_of_loop.
The advantage of loop is that it uses less instruction bytes. Think of it as a code-size optimization you can use if ECX happens to be a convenient register for looping, but at the cost of speed on some CPUs.
Note that the use of dec reg + jcc label is discouraged for some CPUs (Silvermont / Goldmont being still relevant, Pentium 4 not). Because dec only alters part of the flags register it can require an extra merging uop. Mainstream Intel and AMD rename parts of EFLAGS separately so there's no penalty for dec / jnz (because jnz only reads one of the flags written by dec), and can even micro-fuse into a dec-and-branch uop on Sandybridge-family. (But so can sub). sub is never worse, except code-size, so you may want to use sub reg,1 for the forseeable future.
A system call using int 80h does not alter any registers other than eax, so as long as you remember not to mess with edi you don't need the push/pop pair. Pick a register you don't use inside the loop, or you could just use a stack location as your loop counter directly instead of push/pop of a register.

Accessing parameters passed on the stack in an ASM function

I am writing an assembly function to be called from C that will call the sidt machine instruction with a memory address passed to the C function. From my analysis of the code produced by MSVC10, I have constructed the following code (YASM syntax):
SECTION .data
SECTION .text
GLOBAL _sidtLoad
_sidtLoad:
push ebp
mov ebp,esp
sub esp,0C0h
push ebx
push esi
push edi
lea edi,[ebp-0C0h]
mov ecx,30h
mov eax,0CCCCCCCCh
sidt [ebp+8]
pop edi
pop esi
pop ebx
add esp,0C0h
cmp ebp,esp
mov esp,ebp
pop ebp
ret
Here is the C function signature:
void sidtLoad (void* mem);
As far as I can see everything should work, I have even checked the memory address passed to the function and have seen that it matches the address stored at [ebp+8] (the bytes are reversed, which I presume is a result of endianness and should be handled by the machine). I have tried other arguments for the sidt instruction, such as [ebp+12], [ebp+4], [ebp-8] etc but no luck.
P.S I am writing this in an external assembly module in order to get around the lack of inline assembly when targeting x64 using MSVC10. However, this particular assembly function/program is being run as x86, just so I can get to grips with the whole process.
P.P.S I am not using the __sidt intrinsic as the Windows Driver Kit (WDK) doesn't seem to support it. Or at least I can't get it to work with it!
Your code will write the IDT into the memory address ebp+8, which is not what you want. Instead, you need something like:
mov eax, [ebp+8]
sidt [eax]
compile:
int sidLoad_fake(void * mem) {
return (int)mem;
}
to assembly, and then look at where it pulls the value from to put in the return register.
I think that the first few arguments on x86_64 are passed in registers.

Resources