Segmentation Fault when getting the max of three numbers in Assembly x86 - c

I am trying to get the max of three numbers by calling a function written in 32-bit x86 assembly (AT&T syntax) from C. When the program runs, I get a "Segmentation fault (core dumped)" error and cannot figure out why. I have tried a mix of positive and negative numbers as well as 1, 2, 3, and both give the same error.
Assembly
# %eax - first parameter
# %ecx - second parameter
# %edx - third parameter
.code32
.file "maxofthree.S"
.text
.global maxofthree
.type maxofthree #function
maxofthree:
pushl %ebp # old ebp
movl %esp, %ebp # skip over
movl 8(%ebp), %eax # grab first value
movl 12(%ebp), %ecx # grab second value
movl 16(%ebp), %edx # grab third value
#test for first
cmpl %ecx, %eax # compare first and second
jl firstsmaller # first smaller than second, exit if
cmpl %edx, %eax # compare first and third
jl firstsmaller # first smaller than third, exit if
leave # reset the stack pointer and pop the old base pointer
ret # return since first > second and first > third
firstsmaller: # first smaller than second or third, resume comparisons
#test for second and third against each other
cmpl %edx, %ecx # compare second and third
jg secondgreatest # second is greatest, so jump to end
movl %edx, %eax # third is greatest, move third to eax
leave # reset the stack pointer and pop the old base pointer
ret # return third
secondgreatest: # second > third
movl %ecx, %eax #move second to eax
leave # reset the stack pointer and pop the old base pointer
ret # return second
C code
#include <stdio.h>
#include <inttypes.h>
#include <stdlib.h>
long int maxofthree(long int, long int, long int);
int main(int argc, char *argv[]) {
if (argc != 4) {
printf("Missing command line arguments. Instructions to"
" execute this program: ./a.out <num1> <num2> <num3>\n");
return 0;
}
long int x = atoi(argv[1]);
long int y = atoi(argv[2]);
long int z = atoi(argv[3]);
printf("%ld\n", maxofthree(x, y, z)); // TODO change back to (x, y, z)
}

The code causes a segmentation fault because ret tries to jump back to an invalid return address. This happens for all three ret instructions.
It occurs because you don't pop the old base pointer before returning, so ret pops the saved %ebp instead of the return address. A small change to the code will remove the fault. Change each ret instruction to:
leave
ret
The leave instruction will do the following:
movl %ebp, %esp
popl %ebp
This resets the stack pointer and pops the old base pointer that you saved.
Also, your comparisons do not do what the comments say. When you write:
cmp %eax, %edx
jl firstsmaller
The jump will happen when %edx is smaller than %eax. So you want the code to be
cmpl %edx, %eax
jl firstsmaller
which will jump when %eax is smaller than %edx, as specified in the comment.
See this page for details on the cmp instruction in AT&T/GAS syntax.
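The operand order is easy to get backwards, so here is a minimal C sketch of the rule, assuming the usual signed semantics of jl. The helper name jl_taken_after_cmpl is made up for illustration and is not part of any library.

```c
#include <stdbool.h>
#include <stdint.h>

/* Models the AT&T sequence
 *     cmpl  src, dst
 *     jl    target
 * cmpl computes dst - src and sets the flags; jl is then taken when the
 * signed comparison "dst < src" holds. (Hypothetical helper name.) */
bool jl_taken_after_cmpl(int32_t src, int32_t dst)
{
    return dst < src;
}
```

With %eax = 3 and %edx = 5, the pair cmpl %edx, %eax / jl corresponds to jl_taken_after_cmpl(5, 3), which is true, so the jump is taken because %eax is the smaller value.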

You forgot to pop ebp before returning from the function.
Also, cmpl %eax, %ecx compares ecx to eax, not the other way around. So the code
cmpl %eax, %ecx
jl firstsmaller
will jump if ecx is smaller than eax.
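When debugging a routine like this, it can help to have a plain C version to compare results against. The following is only a sketch; the name maxofthree_ref is an assumption for illustration and does not appear in the question.

```c
/* Plain C reference (hypothetical name) with the same signature as the
 * assembly routine, useful for cross-checking its return value. */
long int maxofthree_ref(long int a, long int b, long int c)
{
    long int max = a;
    if (b > max)
        max = b;        /* second is larger than first */
    if (c > max)
        max = c;        /* third is larger than both */
    return max;
}
```

Calling both maxofthree(x, y, z) and maxofthree_ref(x, y, z) from main and printing the two results side by side makes a wrong branch in the assembly visible immediately.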

Related

x86 Assembly/C: SIGABRT abortion when trying to create a dynamically allocated array

I am writing an x86 Assembly function (Intel/AT&T syntax) that creates a dynamically allocated array and returns a pointer to it, given two parameters:
the number of integers in the array
the default value to fill the array with
Below is my C code (which calls the x86 Assembly function):
#include <stdio.h>
#include <stdlib.h>
int* allocateDynamicArray(int size, int value);
int main(void) {
int* dynamicArray = allocateDynamicArray(3, 3);
printf("%d", dynamicArray[0]);
return EXIT_SUCCESS;
}
Below is my x86 Assembly code:
.extern malloc
.data
numIntegers:
.int 0
defaultValue:
.int 0
.text
# Defining the function allocateDynamicArray
.global allocateDynamicArray
allocateDynamicArray:
# Prologue
push %ebp
movl %esp, %ebp
# Process in parameter 1: the number of integers stored in the array
movl 8(%ebp), %ecx
movl %ecx, numIntegers
# Process in parameter 2: what to fill the array with
movl 12(%ebp), %ecx
movl %ecx, defaultValue
# Allocate appropriate space for our dynamic array
movl numIntegers, %edi
imul $4, %edi
push %edi
call malloc
pop %edi
# EAX register now stores a pointer to our dynamically allocated array
push %eax
# The ECX register will store the index of the cell we are storing our number in
movl $0, %ecx
# The EDX register will store the default value
movl defaultValue, %edx
# Loop through each cell in the dynamically allocated array
fillArray:
# See if we have reached the last cell
cmpl %ecx, numIntegers
jl return
# Move the default value into the cell's memory address
push %ecx
imul $4, %ecx
addl %eax, %ecx
movl %edx, (%ecx)
pop %ecx
# Shift to the next cell
incl %ecx
# Complete the loop
jmp fillArray
return:
# We want to return a pointer to the dynamically allocated array
pop %eax
# Epilogue
movl %ebp, %esp
pop %ebp
ret
pop %eax
I am running my code on a 32-bit Linux machine. When I run my code, I get the following error:
CallAssemblyFromC.out: malloc.c:2379: sysmalloc: Assertion `(old_top == initial_top (av) && old_size == 0) || ((unsigned long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long) old_end & (pagesize - 1)) == 0)' failed.
Aborted (core dumped)
This seems strange, since I didn't have any memory allocation errors in my assembly program allocateDynamicArray.s and I believe I am resetting my stack pointers to the correct value as I pop the EDI register after calling malloc.

Segmentation fault when calling assembly function from C code

I'm trying to link assembly functions with C code as an exercise.
Here's my assembly function, written in x86 assembly:
.code32
.section .text
.globl max_function
.type max_function, #function
# the parameters will be in reverse order starting from 8(%ebp)
max_function:
pushl %ebp # save ebp
movl %esp, %ebp # new frame function
movl $0, %edi # first index is 0
movl 8(%ebp), %ecx # ecx is loaded with the number of elements
cmpl $0, %ecx # check that the number of elements is not 0
je end_function_err #if it is, exit
movl 12(%ebp),%edx # edx is loaded with the array base
movl (%edx), %eax # first element of the array
start_loop:
incl %edi #increment the index
cmpl %edi,%ecx #if it's at the end quit
je loop_exit
movl (%edx,%edi,4),%ebx #pick the value
cmpl %ebx,%eax #compare with actual maximum value
jle start_loop #less equal -> repeat loop
movl %ebx,%eax #greater -> update value
jmp start_loop #repeat loop
loop_exit:
jmp end_function #finish
end_function: #exit operations
movl %ebp, %esp
popl %ebp
ret
end_function_err:
movl $0xffffffff, %eax #return -1 and quit
jmp end_function
It basically defines a function that finds the maximum value in an array (or at least it should).
And my C code:
#include <stdio.h>
#include <stdlib.h>
extern int max_function(int size, int* values);
int main(){
int values[] = { 4 , 5 , 7 , 3 , 2 , 8 , 5 , 6 } ;
printf("\nMax value is: %d\n",max_function(8,values));
}
I compile them with gcc -o max max.s max.c.
I get a SegmentationFault when executing the code.
My suspicion is that I'm not accessing the values correctly, but I can't see why, especially since I based my code on an example that prints the argc and argv values when called from the command line.
I'm running Debian 8 64-bit
The problems were:
not preserving %ebx and %edi
not compiling for 32 bit (had to use -m32 flag for gcc)
cmpl operands were inverted
Thanks everybody, problem is solved.
I'll focus more on debugging tools too (disassembling and running step by step was very useful)!
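For anyone checking a fixed version of the assembly, a plain C reference of the intended behaviour is handy to compare against. This is only a sketch: the name max_function_ref is hypothetical, and the -1 return for an empty array simply mirrors the end_function_err path above.

```c
/* Plain C reference (hypothetical name) for the intended behaviour of
 * max_function: the largest element, or -1 when size is 0. */
int max_function_ref(int size, const int *values)
{
    if (size == 0)
        return -1;                  /* mirrors end_function_err */

    int max = values[0];            /* start with the first element */
    for (int i = 1; i < size; i++) {
        if (values[i] > max)        /* update only when strictly greater */
            max = values[i];
    }
    return max;
}
```

Printing max_function(8, values) next to max_function_ref(8, values) in main shows quickly whether the inverted cmpl operands have really been corrected.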

C Code represented as Assembler Code - How to interpret?

I have this short piece of C code.
#include <stdint.h>
uint64_t multiply(uint32_t x, uint32_t y) {
uint64_t res;
res = x*y;
return res;
}
int main() {
uint32_t a = 3, b = 5, z;
z = multiply(a,b);
return 0;
}
There is also an Assembler Code for the given C code above.
I don't understand all of that assembler code. I commented each line, and you will find my questions in the comments.
The Assembler Code is:
.text
multiply:
pushl %ebp // stores the stack frame of the calling function on the stack
movl %esp, %ebp // takes the current stack pointer and uses it as the frame for the called function
subl $16, %esp // it leaves room on the stack, but why 16Bytes. sizeof(res) = 8Bytes
movl 8(%ebp), %eax // I don't know quite what "8(%ebp) mean? It has to do something with res, because
imull 12(%ebp), %eax // here is the multiplication done. And again "12(%ebp).
movl %eax, -8(%ebp) // Now, we got a negative number in front of. How to interpret this?
movl $0, -4(%ebp) // here as well
movl -8(%ebp), %eax // and here again.
movl -4(%ebp), %edx // also here
leave
ret
main:
pushl %ebp // stores the stack frame of the calling function on the stack
movl %esp, %ebp // // takes the current stack pointer and uses it as the frame for the called function
andl $-8, %esp // what happens here and why?
subl $24, %esp // here, it leaves room for local variables, but why 24 bytes? a, b, c: the size of each of them is 4 Bytes. So 3*4 = 12
movl $3, 20(%esp) // 3 gets pushed on the stack
movl $5, 16(%esp) // 5 also get pushed on the stack
movl 16(%esp), %eax // what does 16(%esp) mean and what happened with z?
movl %eax, 4(%esp) // we got the here as well
movl 20(%esp), %eax // and also here
movl %eax, (%esp) // what does happen in this line?
call multiply // thats clear, the function multiply gets called
movl %eax, 12(%esp) // it looks like the same as two lines before, except it contains the number 12
movl $0, %eax // I suppose, this line is because of "return 0;"
leave
ret
Negative references relative to %ebp are for local variables on the stack.
movl 8(%ebp), %eax // I don't know quite what "8(%ebp) mean? It has to do something with res, because
%eax = x
imull 12(%ebp), %eax // here is the multiplication done. And again "12(%ebp).
%eax = %eax * y
movl %eax, -8(%ebp) // Now, we got a negative number in front of. How to interpret this?
(uint32_t)res = %eax // sets low 32 bits of res
movl $0, -4(%ebp) // here as well
clears upper 32 bits of res to extend 32-bit multiplication result to uint64_t
movl -8(%ebp), %eax // and here again.
movl -4(%ebp), %edx // also here
return res; // 64-bit results are returned as a pair of 32-bit registers %edx:%eax
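The zeroed upper half is a consequence of the C source, not of the compiler. Here is a small sketch of the difference, assuming a platform where int is 32 bits (as on x86); the function names are made up for illustration.

```c
#include <stdint.h>

/* Same expression as the question's multiply(): the product is computed
 * in 32 bits (wrapping modulo 2^32) and only then widened to 64 bits,
 * so the upper half is always 0. */
uint64_t multiply_truncating(uint32_t x, uint32_t y)
{
    uint64_t res = x * y;
    return res;
}

/* Widening one operand first makes the multiplication itself 64-bit. */
uint64_t multiply_widening(uint32_t x, uint32_t y)
{
    return (uint64_t)x * y;
}

/* The two halves a 32-bit caller would find in %eax and %edx. */
uint32_t low_half(uint64_t v)  { return (uint32_t)v; }
uint32_t high_half(uint64_t v) { return (uint32_t)(v >> 32); }
```

For 3 and 5 both versions return 15. For 0x10000 and 0x10000 the first returns 0 while the second returns 0x100000000, which is why the compiler can store a constant 0 at -4(%ebp) for the original code.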
As for the main, see x86 calling convention which may help making sense of what happens.
andl $-8, %esp // what happens here and why?
The stack pointer is aligned down to a multiple of 8. I believe it's an ABI requirement.
subl $24, %esp // here, it leaves room for local variables, but why 24 bytes? a, b, c: the size of each of them is 4 Bytes. So 3*4 = 12
Multiples of 8 (probably due to alignment requirements)
movl $3, 20(%esp) // 3 gets pushed on the stack
a = 3
movl $5, 16(%esp) // 5 also get pushed on the stack
b = 5
movl 16(%esp), %eax // what does 16(%esp) mean and what happened with z?
%eax = b
z is at 12(%esp) and is not used yet.
movl %eax, 4(%esp) // we got the here as well
put b on the stack (second argument to multiply())
movl 20(%esp), %eax // and also here
%eax = a
movl %eax, (%esp) // what does happen in this line?
put a on the stack (first argument to multiply())
call multiply // thats clear, the function multiply gets called
multiply returns 64-bit result in %edx:%eax
movl %eax, 12(%esp) // it looks like the same as two lines before, except it contains the number 12
z = (uint32_t) multiply()
movl $0, %eax // I suppose, this line is because of "return 0;"
yup. return 0;
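The andl/subl pair at the top of main is plain integer arithmetic on the stack pointer, which a short C sketch can make concrete. The helper names and the sample address in the note below are assumptions chosen for illustration only.

```c
#include <stdint.h>

/* Models "andl $-8, %esp": as a 32-bit value -8 is 0xFFFFFFF8, so the
 * AND clears the low three bits and rounds the address down to a
 * multiple of 8. (Hypothetical helper name.) */
uint32_t align_down_8(uint32_t esp)
{
    return esp & (uint32_t)-8;
}

/* Models the two instructions "andl $-8, %esp; subl $24, %esp". */
uint32_t main_prologue_esp(uint32_t esp)
{
    return align_down_8(esp) - 24;
}
```

Starting from a made-up value of 0xFFFFD12C, the AND gives 0xFFFFD128 and the subtraction gives 0xFFFFD110, which is still a multiple of 8 because 24 is.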
Arguments are pushed onto the stack when the function is called. Inside the function, the stack pointer at that time is saved as the base pointer. (You got that much already.) The base pointer is used as a fixed location from which to reference arguments (which are above it, hence the positive offsets) and local variables (which are below it, hence the negative offsets).
The advantage of using a base pointer is that it is stable throughout the entire function, even when the stack pointer changes (due to function calls and new scopes).
So 8(%ebp) is one argument, and 12(%ebp) is the other.
The code is likely using more space on the stack than it needs to, because it is using temporary variables that could be optimized out if you had optimization turned on.
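To make the offsets concrete, here is a toy model of the frame in C. It is only a sketch: the struct, the helper names, and the placeholder return address are all invented for illustration, and the caller uses push here although gcc used mov in the listing (the resulting layout is the same).

```c
#include <stdint.h>

#define STACK_WORDS 64

/* Toy 32-bit stack: one array slot per 4-byte word (hypothetical model). */
struct toy_cpu {
    uint32_t mem[STACK_WORDS];
    uint32_t esp;   /* byte offset into mem, grows downwards */
    uint32_t ebp;
};

static void push(struct toy_cpu *c, uint32_t v)
{
    c->esp -= 4;                 /* push decrements esp first ... */
    c->mem[c->esp / 4] = v;      /* ... then stores the value     */
}

/* Reads the word at off(base), e.g. 8(%ebp) or -8(%ebp). */
static uint32_t load(const struct toy_cpu *c, uint32_t base, int32_t off)
{
    return c->mem[(base + (uint32_t)off) / 4];
}

/* Walks through a call to multiply(x, y) and returns the value the
 * callee ends up storing at -8(%ebp). */
uint32_t toy_call_multiply(uint32_t x, uint32_t y)
{
    struct toy_cpu c = { .esp = STACK_WORDS * 4, .ebp = 0 };

    push(&c, y);            /* caller: second argument          */
    push(&c, x);            /* caller: first argument           */
    push(&c, 0xDEADBEEF);   /* call: placeholder return address */

    push(&c, c.ebp);        /* callee: pushl %ebp               */
    c.ebp = c.esp;          /* callee: movl %esp, %ebp          */
    c.esp -= 16;            /* callee: subl $16, %esp           */

    uint32_t a = load(&c, c.ebp, 8);    /* 8(%ebp)  holds x */
    uint32_t b = load(&c, c.ebp, 12);   /* 12(%ebp) holds y */
    c.mem[(c.ebp - 8) / 4] = a * b;     /* -8(%ebp): low half of res */
    return load(&c, c.ebp, -8);
}
```

The saved %ebp sits at 0(%ebp) and the return address at 4(%ebp), which is why the first argument only starts at offset 8.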
You might find this helpful: http://en.wikibooks.org/wiki/X86_Disassembly/Functions_and_Stack_Frames
I started typing this as a comment but it was getting too long to fit.
You can compile your example with -masm=intel so the assembly is more readable. Also, don't confuse the push and pop instructions with mov. push always decrements esp before storing its operand, and pop always increments esp after loading, whereas mov does not change esp.
There are two ways to store values onto the stack. You can either push each item onto it one item at a time or you can allocate up-front the space required and then load each value onto the stackslot using mov + relative offset from either esp or ebp.
In your example, gcc chose the second method since that's usually faster because, unlike the first method, you're not constantly adjusting esp before saving each value onto the stack.
To address your other question in the comments: the x86 instruction set does not have a mov instruction for copying a value directly from memory location a to another memory location b. It is not uncommon to see code like:
mov eax, [esp+16]
mov [esp+4], eax
mov eax, [esp+20]
mov [esp], eax
call multiply(unsigned int, unsigned int)
mov [esp+12], eax
Register eax is being used as an intermediate temporary variable to help copy data between the two stack locations. You can mentally translate the above as:
esp[4] = esp[16]; // argument 2
esp[0] = esp[20]; // argument 1
call multiply
esp[12] = eax; // eax has return value
Here's what the stack approximately looks like right before the call to multiply:
lower addr    esp      => uint32_t:a_copy = 3   <--. arg1 to 'multiply'
              esp + 4     uint32_t:b_copy = 5   <--. arg2 to 'multiply'
    ^         esp + 8     ????
    ^         esp + 12    uint32_t:z = ?        <--.
    |         esp + 16    uint32_t:b = 5           | local variables in 'main'
    |         esp + 20    uint32_t:a = 3        <--.
    |         ...
    |         ...
higher addr   ebp         previous frame

Assembler and C programming linux -m32 (char-byte from register in assembler)

I'm very new to assembly programming. I wrote a function in C which needs to call another function in assembly. It seems like the register gives back four characters (bytes) instead of the one I want.
Ignore the code after the jump; I jump just to skip that part of the code until I make this work properly.
This is actually supposed to be part of my own simplified version of sprintf in C.
I removed some of my code just to get things to work. It's supposed to return the first parameter with a %. So, when I call this assembler function in C, I can write printf("%s", res); (or %c in this example) and it prints %:
.globl printpercent
# Name: printpercent
# Synopsis: A simplified sprintf
# C-signature: int printpercent(unsigned char *res, unsigned char *format, ...);
# Registers: %eax: first argument
# %ebx: second argument
printpercent: # sprinter
pushl %ebp # start of
movl %esp, %ebp # function
movl 8(%ebp), %eax # first argument
movl 12(%ebp), %ebx # second argument
loop:
movb $37, %bl # lowest bits to %
movb %bl, %al
jmp exit
movb (%ebx), %dl #
cmp $0, %dl # Check if 0
je exit # if 0 -> exit
cmp $37, %dl # Check '%'
movb %dl, (%eax) # if it doesnt equal any above/or default
# add to register %eax
jmp loop # jump back to the start of the loop
exit:
popl %ebp # popping standard end of function
# 0-byte ?
ret # return
Your function returns int, so of course the compiler will always take the full register as the return value. After all, int is 4 bytes in your environment. You have to clear EAX to make sure there are no leftover values in it.
You can easily clear a register with xor before using it again:
xor %eax, %eax
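To see why the caller gets four bytes back, it may help to model the register in C. This is a rough sketch; the helper names and the sample address in the note below are made up for illustration.

```c
#include <stdint.h>

/* Models "movb %bl, %al": only the low 8 bits of the 32-bit register
 * change; the upper 24 bits keep whatever was there before. */
uint32_t write_low_byte(uint32_t eax, uint8_t value)
{
    return (eax & 0xFFFFFF00u) | value;
}

/* Models "xor %eax, %eax" followed by "movb %bl, %al". */
uint32_t clear_then_write_low_byte(uint8_t value)
{
    uint32_t eax = 0;               /* xor %eax, %eax */
    return write_low_byte(eax, value);
}
```

In printpercent, %eax still holds the res pointer when the byte is written, so with an invented address such as 0x0804A000 the function would return 0x0804A025 rather than 37. Clearing the register first leaves just the '%' character (37).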

Translate C code to assembly code?

I have to translate this C code to assembly code:
#include <stdio.h>
int main(){
int a, b,c;
scanf("%d",&a);
scanf("%d",&b);
if (a == b){
b++;
}
if (a > b){
c = a;
a = b;
b = c;
}
printf("%d\n",b-a);
return 0;
}
My code is below, and incomplete.
rdint %eax # reading a
rdint %ebx # reading b
irmovl $1, %edi
subl %eax,%ebx
addl %ebx, %edi
je Equal
irmov1 %eax, %efx #flagged as invalid line
irmov1 %ebx, %egx
irmov1 %ecx, %ehx
irmovl $0, %eax
irmovl $0, %ebx
irmovl $0, %ecx
addl %eax, %efx #flagged as invalid line
addl %ebx, %egx
addl %ecx, %ehx
halt
Basically I think it is mostly done, but I have commented next to two lines flagged as invalid when I try to run it, and I'm not sure why they are invalid. I'm also not sure how to do an if statement for a > b. I could use any suggestions from people who know the Y86 assembly language.
From what I can find online (1, 2), the only supported registers are: eax, ecx, edx, ebx, esi, edi, esp, and ebp.
You are requesting non-existent registers (efx and further).
Also irmov is for moving an immediate operand (read: constant numerical value) into a register operand, whereas your irmov1 %eax, %efx has two register operands.
Finally, in computer software there's a huge difference between the character representing digit "one" and the character representing letter "L". Mind your 1's and l's. I mean irmov1 vs irmovl.
Jens,
First, Y86 does not have any efx, egx, and ehx registers, which is why you are getting the invalid lines when you pour the code through YAS.
Second, you make conditional branches by subtracting two registers with the subl instruction and then jumping on the condition codes set by the Y86 ALU by way of the jxx instructions.
Check my blog at http://y86tutoring.wordpress.com for details.
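As a rough illustration of that second point, the subtract-then-jump pattern can be modelled in C. This sketch assumes the textbook Y86 condition codes ZF, SF and OF; the struct and function names are invented for the example, and the conversion back to a signed value relies on the usual two's-complement behaviour of compilers such as gcc.

```c
#include <stdbool.h>
#include <stdint.h>

/* Condition codes as set by an arithmetic instruction. */
struct cond_codes {
    bool zf;   /* result was zero          */
    bool sf;   /* result was negative      */
    bool of;   /* signed overflow occurred */
};

/* Models "subl rA, rB", which computes rB = rB - rA and sets the flags.
 * Unsigned arithmetic is used so that wrap-around is well defined. */
struct cond_codes subl_flags(int32_t ra, int32_t rb, int32_t *result)
{
    uint32_t diff = (uint32_t)rb - (uint32_t)ra;
    int32_t sdiff = (int32_t)diff;
    struct cond_codes cc;

    cc.zf = (diff == 0);
    cc.sf = (sdiff < 0);
    /* Overflow: the operands had different signs and the sign of the
     * result differs from the sign of rB. */
    cc.of = ((ra < 0) != (rb < 0)) && ((sdiff < 0) != (rb < 0));
    *result = sdiff;
    return cc;
}

/* Jump conditions in terms of the flags. */
bool je_taken(struct cond_codes cc) { return cc.zf; }
bool jl_taken(struct cond_codes cc) { return cc.sf ^ cc.of; }
bool jg_taken(struct cond_codes cc) { return !(cc.sf ^ cc.of) && !cc.zf; }
```

For the if (a > b) test, subtracting b from a copy of a and following it with jg takes the branch exactly when a is greater than b. The copy is needed because subl overwrites its second operand.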
