Hi I'm still new to the Assembly language using gcc compiler, right now I'm working on functions.
My program asks the user for 4 int values, store those values in register eax, ebx, ecx and edx, then calls a function to divide (ebx/eax). I store the value of "d" after the division because as I understand idiv uses edx to store the residue. It then subtracts (eax-ecx) and multiply (eax*edx), then returns the value inside of register eax. For some reason I get a segmentation fault: 11.
Here is my code:
#include <stdio.h>
int a, b, c, d;
int main (void)
{
printf("Dame a: ");
scanf("%d", &a);
printf("Dame b: ");
scanf("%d", &b);
printf("Dame c: ");
scanf("%d", &c);
printf("Dame d: ");
scanf("%d", &d);
__asm( ".intel_syntax noprefix;"
"xor eax, eax;"
"mov eax, dword ptr [_a];"
"xor ebx, ebx;"
"mov ebx, dword ptr [_b];"
"xor ecx, ecx;"
"mov ecx, dword ptr [_c];"
"Call fun1;"
"mov dword ptr [_a], eax;"
"fun1: xor edx, edx;"
"idiv ebx;"
"sub eax, ecx;"
"mov edx, dword ptr [_d];"
"imul eax, edx;"
"ret;"
".att_syntax");
printf("%d\n", a);
}
Is it something to do with some pointer error?
Aside from the other errors pointed out in comments, you have a significant issue here:
"mov ecx, dword ptr [_c];"
"Call fun1;"
"mov dword ptr [_a], eax;"
"fun1: xor edx, edx;"
"idiv ebx;"
"sub eax, ecx;"
"mov edx, dword ptr [_d];"
"imul eax, edx;"
"ret;"
Consider the program flow. Your C code falls into this assembly code. The assembly code calls an internal function of its own (not a problem), which then returns to the instruction before the call... Still no problem. A value is moved into EAX... and you then fall through your function to a return. This is horribly bad.
By falling through to that ret you are bypassing the entire C function epilog. This means that the stack is not properly cleaned up, nor is the stack from restored. This will almost certainly lead to a crash.
Related
I have a problem with the below code:
void swap(int* a, int* b) {
__asm {
mov eax, a;
mov ebx, b;
push[eax];
push[ebx];
pop[eax];
pop[ebx];
}
}
int main() {
int a = 3, b = 6;
printf("a: %d\tb: %d\n", a, b);
swap(&a, &b);
printf("a: %d\tb: %d\n", a, b);
}
I am running this code in visual studio and when I run this, it says:
Run-Time check failure- The value of ESP was not properly saved across a function call. This is usually a result of calling a function declared with one calling convention with a function pointer declared with a different calling convention.
What am I missing?
To answer the title question: make sure you balance pushes and pops. (Normally getting that wrong would just crash, not return with the wrong ESP). If you're writing a whole function in asm make sure ret 0 or ret 8 or whatever matches the calling convention you're supposed to be using and the amount of stack args to pop (e.g. caller-pops cdecl ret 0 or callee-pops stdcall ret n).
Looking at the compiler's asm output (e.g. on Godbolt or locally) reveals the problem: different operand-sizes for push vs. pop, MSVC not defaulting to dword ptr for pop.
; MSVC 19.14 (under WINE) -O0
_a$ = 8 ; size = 4
_b$ = 12 ; size = 4
void swap(int *,int *) PROC ; swap
push ebp
mov ebp, esp
push ebx ; save this call-preserved reg because you used it instead of ECX or EDX
mov eax, DWORD PTR _a$[ebp]
mov ebx, DWORD PTR _b$[ebp]
push DWORD PTR [eax]
push DWORD PTR [ebx]
pop WORD PTR [eax]
pop WORD PTR [ebx]
pop ebx
pop ebp
ret 0
void swap(int *,int *) ENDP
This code would just crash, with ret executing while ESP points to the saved EBP (pushed by push ebp). Presumably Visual Studio passes addition debug-build options to the compiler so it does more checking instead of just crashing?
Insanely, MSVC compiles/assembles push [reg] to push dword ptr (32-bit operand-size, ESP-=4 each), but pop [reg] to pop word ptr (16-bit operand-size, ESP+=2 each)
It doesn't even warn about the operand-size being ambiguous, unlike good assemblers such as NASM where push [eax] is an error without a size override. (push 123 of an immediate always defaults to an operand-size matching the mode, but push/pop of a memory operand usually needs a size specifier in most assemblers.)
Use push dword ptr [eax] / pop dword ptr [ebx]
Or since you're using EBX anyway, not limiting your function to just the 3 call-clobbered registers in the standard 32-bit calling conventions, use registers to hold the temporaries instead of stack space.
void swap_mov(int* a, int* b) {
__asm {
mov eax, a
mov ebx, b
mov ecx, [eax]
mov edx, [ebx]
mov [eax], edx
mov [ebx], ecx
}
}
(You don't need ; empty comments at the end of each line. The syntax inside an asm{} block is MASM-like, not C statements.)
Sorry if I am asking a stupid question, but I can't find the answer due to clumsy search terms I guess
If I declare three variables as follows
volatile uint16_t a, b, c;
Will all three variables be declared volatile?
Or should I really not declare multiple variables in a row but instead do:
volatile uint16_t a;
volatile uint16_t b;
volatile uint16_t c;
If I declare three variables as follows
volatile uint16_t a, b, c;
Will all three variables be declared volatile?
Yes, all 3 variables will be volatile.
Or should I really not declare multiple variables in a row but instead do:
That is related to code style and personal preference. Usually declaring variables one per line is preferred, is more readable, easier to read, easier to refactor and results in more readable changes when browsing diff output of files.
We can check the assembly generated by the compiler to see if it optimizes the variables out or not.
When I check this simple program:
#include <stdio.h>
#include <stdint.h>
int main(void)
{
uint16_t a = 1, b = 1, c = 1;
printf("%hu", a);
printf("%hu", b);
printf("%hu", c);
}
The generated assembly at -O3 (link) is:
.LC0:
.string "%hu"
main:
sub rsp, 8
mov esi, 1
mov edi, OFFSET FLAT:.LC0
xor eax, eax
call printf
mov esi, 1
mov edi, OFFSET FLAT:.LC0
xor eax, eax
call printf
mov esi, 1
mov edi, OFFSET FLAT:.LC0
xor eax, eax
call printf
xor eax, eax
add rsp, 8
ret
It's obvious here that the variables have been optimized out and 1 is being used as a parameter instead of the variables.
When I replace the uint16_t a = 1, b = 1, c = 1; with volatile uint16_t a = 1, b = 1, c = 1;, The assembly generated (link) is:
main:
sub rsp, 24
mov edx, 1
mov ecx, 1
mov eax, 1
mov WORD PTR [rsp+10], ax
mov edi, OFFSET FLAT:.LC0
xor eax, eax
mov WORD PTR [rsp+12], dx
mov WORD PTR [rsp+14], cx
movzx esi, WORD PTR [rsp+10]
call printf
movzx esi, WORD PTR [rsp+12]
mov edi, OFFSET FLAT:.LC0
xor eax, eax
call printf
movzx esi, WORD PTR [rsp+14]
mov edi, OFFSET FLAT:.LC0
xor eax, eax
call printf
xor eax, eax
add rsp, 24
ret
Here, volatile is working like it should for all variables. The variables are created and are not optimized out.
In comparison, if we replace volatile uint16_t a = 1, b = 1, c = 1; with volatile uint16_t a = 1; uint16_t b = 1, c = 1; we see that only a is not optimized out (link):
main:
sub rsp, 24
mov eax, 1
mov edi, OFFSET FLAT:.LC0
mov WORD PTR [rsp+14], ax
movzx esi, WORD PTR [rsp+14]
xor eax, eax
call printf
mov esi, 1
mov edi, OFFSET FLAT:.LC0
xor eax, eax
call printf
mov esi, 1
mov edi, OFFSET FLAT:.LC0
xor eax, eax
call printf
xor eax, eax
add rsp, 24
ret
I make program in order to find maximum depth of parenthesis in String.
At first, I make C program with Assembly code, it works good:
#include <stdio.h>
int main(){
char *s ="((abc) ∗ (ab+(ab)∗(cdx7)))−xxab((()))c";
int x;
asm volatile (
".intel_syntax noprefix;"
"mov eax,%1;"
"lea ebx,[eax];"
"xor eax,eax;" ecx - currentDepth, edx - MaxDepth
"xor ecx,ecx;"
"xor edx,edx;"
"loop:"
"mov al,[ebx];"
"or al, al;" // check if end of string
"jz print;" // if end jump to print
"cmp al, '(';"
"je increase;"
"cmp al,')';"
"je decrease;"
"inc ebx;" // next char
"jmp loop;"
"increase:"
"inc ecx; cmp edx,ecx;js changeMax;inc ebx;jmp loop;"
"changeMax:"
"mov edx,ecx; inc ebx; jmp loop;"
"decrease:"
"dec ecx;inc ebx;jmp loop;"
"print:"
"mov %0, edx;"
".att_syntax prefix;"
: "=r" (x)
: "r" (s)
: "eax", "ebx", "ecx", "edx"
);
printf("%d", x);
return 0;
}
But I have to make it in pure Assembler with .s extension, so i try this:
.intel_syntax noprefix
.globl main
.text
main:
mov eax,offset expr
lea ebx,[eax]
xor eax,eax
xor ecx,ecx
xor edx,edx
loop:
mov al,[ebx]
or al,al
jz print
cmp al,'('
je increase
cmp al,')'
je decrease
inc ebx
jmp loop
increase:
inc ecx
cmp edx,ecx
js changeMax
inc ebx
jmp loop
changeMax:
mov edx,ecx
inc ebx
jmp loop
decrease:
dec ecx
inc ebx
jmp loop
print:
push edx
call printf
data:
mesg: .ascii "%d\n"
expr: .ascii "((x))"
But I got segmentation fault(core dumped)
Maybe labels are working differently than in first program in c.
But I don't know, please help me.
UPDATE
Thank you Jester
I edited label print, now it print out the result, but after that I still get segmentation fault.
print:
push edx
mov edx, offset mesg
push edx
call printf
ret
Ok, I remove it from the stack. Now i don't get segmentation fault, but after the result i have some rubbish text
print:
push edx
mov edx, offset mesg
push edx
call printf
add esp,8
ret
xor edx,edx
ret
Output:
2
f�f��UWVS������×
Update 2
Ok, everything works right now, I use string from command line ./a.out '((x))'
Someone can check that everything is ok.
.intel_syntax noprefix
.globl main
.text
main:
pop eax #return address
pop eax #return argc
pop eax #return argv
mov eax,[eax+4] #argv[1]
sub esp,12 #return stack to the right position
lea ebx,[eax]
xor eax,eax
xor ecx,ecx
xor edx,edx
loop:
mov al,[ebx]
or al,al
jz print
cmp al,'('
je increase
cmp al,')'
je decrease
inc ebx
jmp loop
increase:
inc ecx
cmp edx,ecx
js changeMax
inc ebx
jmp loop
changeMax:
mov edx,ecx
inc ebx
jmp loop
decrease:
dec ecx
inc ebx
jmp loop
print:
push edx
mov edx, offset mesg
push edx
call printf
add esp,8
ret
mov edx,0
ret
data:
mesg: .asciz "%d\n"
In one of the exercises in my assembler course we have to create a program that will change all uppercase letters in char* to lowercase.
char *a = "AAA BB CCC";
char *b;
asm(
".intel_syntax noprefix;"
"xor ecx, ecx;"
"xor eax, eax;"
"lea eax, [%[a]];" // getting address of the first char into eax
"loop:"
"inc ecx;" // loop variable
"cmp ecx, 10;" // 10 cycles for very character in *a
"jne lowercase;"
"jmp end;"
"lowercase:"
"mov bl, [eax];" // copy value stored under eax address to bl
"add bl, 32;" // make it lowercase in ASCII
"mov [eax], bl;" // copy it back to address stored in eax
"inc eax;" // move pointer to the next char
"jmp loop;"
"end:"
"lea %[b], [eax];"
".att_syntax prefix;"
: [b] "=r" (b)
: [a] "r" (a)
: "eax", "ebx", "ecx"
);
My idea was to get address of the first char, increase it by 32 (making it lowercase in ASCII), save it on the same address, than move pointer to the next char and so on. Getting first char and adding 32 to it works fine, its this line that throws "Segmentation fault":
"mov [eax], bl;"
From what I understand it should change value under address held in eax to value of bl. I feel like I'm missing something obvious, but I'm pretty new to this and everything I found so far on the net leads me to believe that this is how I should do it.
I'm trying to understand how exactly the stack works, so I'll recreate here a small example with some questions.
Pretend that I have a small code in ASM that does the following thing:
(all this is x86, intel syntax, Linux)
push ebp
mov ebp, esp
sub esp, 16
mov eax, 0xdeadbeef <-- let's call this 'local variable a'
mov [ebp - 16], eax
mov eax, 0xabcdabcd <-- let's call this 'local variable b'
mov [ebp - 12], eax
mov eax, 0xcacafafa <-- let's call this 'local variable c'
mov [ebp - 8], eax
mov eax, 0xdada1111 <-- let's call this 'local variable d'
mov [ebp - 4], eax
call 0x10101010 <------- pretend that is the real address of function_B
mov esp, ebp
pop ebp
ret
Now pretend that I have a C function called function_B which get's called from the asm code and it looks like this:
asmlinkage void function_B(void){
//some code here...
}
How would I access a, b, c and d from function_B with inline ASM code?
Would this work? Should I be doing it differently?
uint32_t val_a;
__asm__ __volatile__(
".intel_syntax noprefix;"
"mov %0, dword [ebp + 4 + 4 + 0];"
".att_syntax;"
: "=r" (val_a)
);
uint32_t val_b;
__asm__ __volatile__(
".intel_syntax noprefix;"
"mov %0, dword [ebp + 4 + 4 + 4];"
".att_syntax;"
: "=r" (val_b)
);
Also, how would I access a, b, c and d if my function looked like this:
asmlinkage void function_B(unsigned int val){
//some code here...
}
What about this:
void function_B(uint32_t fake){
uint32_t* poke = &fake;
uint32_t a = poke[0];
uint32_t b = poke[1];
uint32_t c = poke[2];
uint32_t d = poke[3];
}
What will give you the values of a, b, c and d with your current asm code (which is not pushing any arguments to function_B). If you also want to pass real arguments to function_B you can still add the fake one after the last real argument:
void function_B(unsigned int val,uint32_t fake)
It is actually a little safer to do it like this (then with inline asm) since it has a better chance to survive optimizations (like stack frame omission). But it's still a hack of course, and nothing I would rely too much upon.