This question already has answers here:
What registers are preserved through a linux x86-64 function call
(3 answers)
How to restore x86-64 register saving conventions
(1 answer)
Why do gcc and clang generate mov reg,-1
(1 answer)
Closed 1 year ago.
I'd like to order the unordered array displayed in the source code. I don't know what I'm missing out here. Ok, I added some more details, hope this helps :)
The program
section .data
fmt: db "%d",0x0a,0
arr: dd 1000,100,10,1
section .text
global main
extern qsort
extern printf
main: push rbp
mov rbp, rsp
mov rdi, arr
mov rsi, 4
mov rdx, 4
mov rcx, cmp
call qsort
xor rbx, rbx
.L1: lea rdi, [fmt]
mov esi, [arr+rbx*4]
xor rax, rax
call printf
inc rbx
cmp rbx, 4
jnz .L1
xor rax, rax
leave
ret
cmp: movsxd rax, dword [rdi]
movsxd rbx, dword [rsi]
sub rax, rbx
ret
The program prints
1
100
10
1000
Related
This question already has answers here:
Segmentation fault when calling printf from C function called from assembly [duplicate]
(2 answers)
32-bit absolute addresses no longer allowed in x86-64 Linux?
(1 answer)
glibc scanf Segmentation faults when called from a function that doesn't align RSP
(1 answer)
Closed 4 months ago.
I have the following code (I'll print only the lines the program goes through before arriving at the segmentation fault)
push rbp
mov rbp, rsp
push rax
push rsi
push rdi
push rcx
push rdx
mov ecx, 0Ah
mov esi, 0h
for1:
cmp esi, DimMatrix * DimMatrix * 2
jge for1_end
mov edx, 011h
mov edi, 0h
for2:
cmp edi, DimMatrix
jge for2_end
mov ax, [m + esi + edi * 2]
movsx eax, ax
mov DWORD[number], eax
mov DWORD[rowScreen], ecx
mov DWORD[colScreen], edx
call showNumberP1
showNumberP1:
push rbp
mov rbp, rsp
push rax
push rbx
push rcx
push rdx
mov eax, [number]
mov cl, 0h
mov ebx, 0Ah
if1:
cmp eax, 000F423Fh
jle if1_end
mov eax, 000F423Fh
if1_end:
for:
cmp cl, 6h
jge for_end
mov BYTE[charac], 20h
if2:
cmp eax, 0h
jle if2_end
mov edx, 0h
div ebx
mov BYTE[charac], dl
add BYTE[charac], 30h
if2_end:
call gotoxyP1
gotoxyP1 is an nasm subroutine that calls on the correspoding C function which has one line:
printf("\x1B[%d;%dH",rowScreen,colScreen);
And that's where to program crashes with a segmentation fault.
number, rowScreen and colScreen are all int variables in the C program, and charac is a char variable. showNumberP1 works fine when tested on its own. Hoping somebody can find what's causing the segmentation fault, because I can't see it.
Here is a basic program I written on the godbolt compiler, and it's as simple as:
#include<stdio.h>
void main()
{
int a = 10;
int *p = &a;
printf("%d", *p);
}
The results after compilation I get:
.LC0:
.string "%d"
main:
push rbp
mov rbp, rsp
sub rsp, 16
mov DWORD PTR [rbp-12], 10
lea rax, [rbp-12]
mov QWORD PTR [rbp-8], rax
mov rax, QWORD PTR [rbp-8]
mov eax, DWORD PTR [rax]
mov esi, eax
mov edi, OFFSET FLAT:.LC0
mov eax, 0
call printf
nop
leave
ret
Question: Pushing the rbp, making the stack frame by making a 16 byte block, how from a register, a value is moved to a stack location and vice versa, how the job of LEA is to figure out the address, I got this part.
Problem:
lea rax, [rbp-12]
mov QWORD PTR [rbp-8], rax
mov rax, QWORD PTR [rbp-8]
mov eax, DWORD PTR [rax]
Lea -> getting address of rbp-12 into rax,
then moving the value which is the address of rbp-12 into rax,
but next line again says, move to rax, the value of rbp-8. This seems ambiguous. Then again moving the value of rax to eax. I don't understand the amount of work here. Why couldn't I have done
lea rax, [rbp-12]
mov QWORD PTR [rbp-8], rax
mov eax, QWORD PTR [rbp-8]
and be done with it? coz on the original line, rbp-12's address is stored onto rax, then rax stored to rbp-8. then rbp-8 stored again into rax, and then again rax is stored into eax? couldn't we have just copied the rbp-8 directly to eax? i guess not. But my question is why?
I know there is de-referencing in pointers, so How LEA helps grabbing the address of rbp-12, I understand, but on the next parts, when did it went from grabbing values from addresses I completely lost. And also, after that, I didn't understand any of the asm lines.
You're seeing very un-optimized code. Here's my line-by-line interpretation:
.LC0:
.string "%d" ; Format string for printf
main:
push rbp ; Save original base pointer
mov rbp, rsp ; Set base pointer to beginning of stack frame
sub rsp, 16 ; Allocate space for stack frame
mov DWORD PTR [rbp-12], 10 ; Initialize variable 'a'
lea rax, [rbp-12] ; Load effective address of 'a'
mov QWORD PTR [rbp-8], rax ; Store address of 'a' in 'p'
mov rax, QWORD PTR [rbp-8] ; Load 'p' into rax (even though it's already there - heh!)
mov eax, DWORD PTR [rax] ; Load 32-bit value of '*p' into eax
mov esi, eax ; Load value to print into esi
mov edi, OFFSET FLAT:.LC0 ; Load format string address into edi
mov eax, 0 ; Zero out eax (not sure why -- likely printf call protocol)
call printf ; Make the printf call
nop ; No-op (not sure why)
leave ; Remove the stack frame
ret ; Return
Compilers, when not optimizing, generate code like this as they parse the code you gave them. It's doing a lot of unnecessary stuff, but it is quicker to generate and makes using a debugger easier.
Compare this with the optimized code (-O2):
.LC0:
.string "%d" ; Format string for printf
main:
mov esi, 10 ; Don't need those variables -- just a 10 to pass to printf!
mov edi, OFFSET FLAT:.LC0 ; Load format string address into edi
xor eax, eax ; It's a few cycles faster to xor a register with itself than to load an immediate 0
jmp printf ; Just jmp to printf -- it will handle the return
The optimizer found that the variables weren't necessary, so no stack frame is created. Nothing is left but the printf call! And that's done as a jmp since nothing else need be done here when the printf is complete.
I am trying to understand how a variable sized static array work internally:
Following is a fixed size static array in C and its Assembly equivalent;
int main()
{
int arr[2] = {3};
}
================
main:
push rbp
mov rbp, rsp
mov QWORD PTR [rbp-8], 0
mov DWORD PTR [rbp-8], 2
mov eax, 0
pop rbp
ret
However a variable sized array is shown below
int main()
{
int varSize ;
int Arr[varSize];
}
=================
main:
push rbp
mov rbp, rsp
sub rsp, 32
mov rax, rsp
mov rcx, rax
mov eax, DWORD PTR [rbp-4]
movsx rdx, eax
sub rdx, 1
mov QWORD PTR [rbp-16], rdx
movsx rdx, eax
mov r8, rdx
mov r9d, 0
movsx rdx, eax
mov rsi, rdx
mov edi, 0
cdqe
lea rdx, [0+rax*4]
mov eax, 16
sub rax, 1
add rax, rdx
mov edi, 16
mov edx, 0
div rdi
imul rax, rax, 16
sub rsp, rax
mov rax, rsp
add rax, 3
shr rax, 2
sal rax, 2
mov QWORD PTR [rbp-24], rax
mov rsp, rcx
mov eax, 0
leave
ret
I am seeing a whole lot of assembly instructions if I declare a variable sized array. Can some one explain how is this flexibility of variable size achieved?
Same mechanism as alloca() - allocate memory by decreasing the stack pointer, with the assumption that the stack is big enough and/or the OS will grow it as needed.
There might be a bit of an issue when the requested size is over a memory page and the stack is near its end. Normally, the OS grows the stack by setting up a guard page at the stack top and watching for faults in that area, but that assumes that the stack grows more or less sequentially (by pushes and function calls). If the decreased stack pointer overshoots the guard page, it might end up pointing at a bogus location. I'm not sure what does the compiler do about that possibility.
This question already has an answer here:
Why does clang produce inefficient asm with -O0 (for this simple floating point sum)?
(1 answer)
Closed 3 years ago.
I used https://godbolt.org/ with "x86-64 gcc 9.1" to assemble the following C code to understand why passing a pointer to a local variable as a function argument works. Now I have difficulties to understand some steps.
I commented on the lines I have difficulties with.
void printStr(char* cpStr) {
printf("str: %s", cpStr);
}
int main(void) {
char str[] = "abc";
printStr(str);
return 0;
}
.LC0:
.string "str: %s"
printStr:
push rbp
mov rbp, rsp
sub rsp, 16 ; why allocate 16 bytes when using it just for the pointer to str[0] which is 4 bytes long?
mov QWORD PTR [rbp-8], rdi ; why copy rdi to the stack...
mov rax, QWORD PTR [rbp-8] ; ... just to copy it into rax again? Also rax seems to already contain the pointer to str[0] (see *)
mov rsi, rax
mov edi, OFFSET FLAT:.LC0
mov eax, 0
call printf
nop
leave
ret
main:
push rbp
mov rbp, rsp
sub rsp, 16 ; why allocate 16 bytes when "abc" is just 4 bytes long?
mov DWORD PTR [rbp-4], 6513249
lea rax, [rbp-4] ; pointer to str[0] copied into rax (*)
mov rdi, rax ; why copy the pointer to str[0] to rdi?
call printStr
mov eax, 0
leave
ret
Thanks to the help of Jester I could solve my confusion. The following code is compiled with the "-O1" flag of GCC (for me the best optimization level to understand what's going on):
.LC0:
.string "str: %s"
printStr:
sub rsp, 8
; now the call to printf gets prepared, rdi = first argument, rsi = second argument
mov rsi, rdi ; move str[0] to rsi
mov edi, OFFSET FLAT:.LC0 ; move address of static string literal "str: %s" to edi
mov eax, 0 ; set eax to the number of vector registers used, because printf is a varargs function
call printf
add rsp, 8
ret
main:
sub rsp, 24
mov DWORD PTR [rsp+12], 6513249 ; create string "abc" on the stack
lea rdi, [rsp+12] ; move address of str[0] (pointer to 'a') to rdi (first argument for printStr)
call printStr
mov eax, 0
add rsp, 24
ret
As Jester said, the 16 bytes were allocated for alignment. There is a good post on Stack Overflow which explains this here.
Edit:
There is a post on Stack Overflow which explains why al is zeroed before a call to a varargs function here.
section .data
text db 'Put a number',10,0
scanform db '%d'
number dw 0
section .text
extern printf,scanf
global main
main:
push rbp
mov rbp,rsp
push rdi
push rsi
push rbx
mov rdi,text
mov rax,0
call printf
mov rsi,number
mov rdi,scanform
mov rax,0
call scanf
pop rbx
pop rsi
pop rdi
ret
This is my code I write a other codes all day and I do not have problem with these but now when I call scanf, write program received signal SIGSEV, segfault... Specified first and last line in different files. I do not understand this message can someone help me?
You have the following issues:
You forgot to pop rbp.
You misalign the stack which needs to be 16 byte aligned.
You do not zero terminate your format string (thanks to Paul for pointing this out).
You use %d which writes a 4 byte integer but you only allocated 2 bytes with dw.
It is recommended to align integers to 4 bytes.
A possible fixed version:
section .data
number dd 0
text db 'Put a number',10,0
scanform db '%d', 0
section .text
extern printf,scanf
global main
main:
push rbp
mov rbp,rsp
push rdi
push rsi
push rbx
push rbx ; for alignment
mov rdi,text
mov rax,0
call printf
mov rsi,number
mov rdi,scanform
mov rax,0
call scanf
pop rbx
pop rbx
pop rsi
pop rdi
pop rbp
ret
Since rsi and rdi are caller-saved registers and rbx is not touched, you can simplify the code. I also changed to xor zeroing and rip-relative addressing as follows:
section .data
number dd 0
text db 'Put a number',10,0
scanform db '%d', 0
section .text
extern printf,scanf
global main
main:
push rbp
lea rdi, [rel text]
xor eax, eax
call printf
lea rsi, [rel number]
lea rdi, [rel scanform]
xor eax, eax
call scanf
pop rbp
ret