Okay, so for fun I'm trying to write a program that counts the set bits in a number.
What I Want:
As I said, a program that counts the bits set in a given number
(for instance, countbits(1) = countbits(2) = countbits(4) = 1).
What I Get:
I get the correct output, but I also get the error message
"Segmentation fault: 11". I ran someone else's program and it did not produce this error, so clearly the mistake is mine. How can I fix this so that I don't get a segmentation fault?
The command I enter is:
gcc -m32 -mstackrealign countbit.c countbits.s
The program compiles just fine, but when I run the a.out that gcc generates I get the error. Any ideas?
My Code:
.text
.data
.globl _x
.globl _countbits
_countbits:
pushl %ebp
movl %esp,%ebp
pushl %ebx
mov $0,%edx
mov $0,%eax
mov 8(%ebp),%ebx
LOOP:
mov $1,%ecx
and %ebx,%ecx
add %ecx,%eax
shrl $1,%ebx
add $1,%edx
cmp $32,%edx
jle LOOP
pop %ebx
pop %ebp
ret
and the code that calls it from C:
#include <stdio.h>
int foo (int x){
int p=countbits(x);
printf("The count is: %d",p);
}
main(){
int x=16;
foo(16);
}
You can't really ask a question about assembly code without saying which processor's assembly you mean. For example, many processors have a dedicated instruction for counting the number of set bits; on x86, see POPCNT.
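For checking the assembly routine's results, a plain C reference can help. The sketch below is only an illustration: the names countbits_ref and countbits_builtin are made up here, and __builtin_popcount is a GCC/Clang builtin rather than standard C. Separately, one thing worth checking in the listing above is that .data follows .text, so the function is assembled into the data section, which is typically not executable; that may be related to the crash, though I have not verified it.

```c
#include <stdint.h>

/* Hypothetical reference implementation: add the low bit to the count,
   then shift right, until no set bits remain. */
unsigned countbits_ref(uint32_t x)
{
    unsigned count = 0;
    while (x != 0) {
        count += x & 1u;   /* 1 if the lowest bit is set, else 0 */
        x >>= 1;           /* logical shift, like shrl $1 */
    }
    return count;
}

/* Same result via the GCC/Clang builtin (an assumption about the
   compiler in use); it may compile to POPCNT when the target allows. */
unsigned countbits_builtin(uint32_t x)
{
    return (unsigned)__builtin_popcount(x);
}
```

Both functions should agree with the assembly version for any input, for example returning 1 for the inputs 1, 2, 4 and 16.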
I compiled some simple code with the Intel icc compiler, and I noticed that there are some numbers at the end of each line of the assembly output. I want to know what they mean,
for example the #3.12 in the following code.
#include <stdio.h>
int main() {
int a = 3, b;
scanf("%d", &b);
a = a + b;
printf("Hello, world! I am %d\n", a);
return 0;
}
...
main:
..B1.1: # Preds ..B1.0
# Execution count [1.00e+00]
..L1:
#3.12
pushl %ebp #3.12
movl %esp, %ebp #3.12
andl $-128, %esp #3.12
...
It is indeed the line and column of the corresponding source code. #3.12 is the opening { of the main function, which makes sense since the instructions shown are consistent with the start of a function.
If you insert an extra space before the { you will see that the output changes to #3.13; likewise the 3 changes to 4 if you insert an empty line before the main() function.
These instructions are the function prologue, which sets up the function's stack frame. It saves the caller's base pointer on the stack and then reserves stack space for the function to work with. Note that the end of the function performs the reverse process (the epilogue). Here is an example of the same thing from another compiler:
push ebp
mov ebp, esp
sub esp, 8
...
mov esp, ebp
pop ebp
ret 0
I'm trying to learn some assembly.
My goal is to create an external assembly function that reads an array of char, converts the elements to int and then performs various operations on them, just to learn something.
I've made many attempts, but I think I'm missing the point.
Code:
#include <stdio.h>
#define SIZE 5
extern int foo(char array[]);
int main(void){
char array[SIZE]={'0','1','1','0','1'};
printf("GAS said: %c\n", foo(array));
return 0;
}
assembly:
.data
.text
.global foo
foo:
pushl %ebp
movl %esp, %ebp
movl 8(%esp), %eax #saving in eax the pointer of the array
movl (%eax), %eax #saving in eax the first char of the array
popl %ebp
ret
The strange thing for me is this:
when I use, as in this case,
printf("GAS said: %c\n", foo(array));
the output is, as expected, GAS said: 0
Based on this, I also expected that changing it to
printf("GAS said: %i\n", foo(array));
would output GAS said: 48, but instead I get what looks like a random address.
Also, in the assembly file, I can't explain why, if I try
cmpl $48, %eax
je LABEL
the jump never happens.
The only thing I can think of is a problem with the size, since an int takes 4 bytes and a char only 1 byte, but I'm not sure.
So, how can I do the comparison and return an int to main in this case?
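The size suspicion looks right: movl reads four bytes from the array, not one. The C sketch below models that on the assumption of a little-endian x86 target and ASCII characters; load_dword and load_byte are hypothetical helper names, and movzbl is mentioned only as one zero-extending byte load that could replace the movl.

```c
#include <stdint.h>

/* Models `movl (%eax), %eax` on little-endian x86: four consecutive
   bytes are combined, lowest address in the least significant byte. */
uint32_t load_dword(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Models a zero-extending byte load such as `movzbl (%eax), %eax`:
   only the first char is read, the upper 24 bits become zero. */
uint32_t load_byte(const unsigned char *p)
{
    return p[0];
}
```

For the array {'0','1','1','0','1'}, load_dword gives 0x30313130 (808530224 in decimal), which is why %i prints a large number and cmpl $48, %eax never matches, while %c shows only the low byte 0x30, the character '0'. load_byte gives 48.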
Here is an example found via an assembly website. This is the C code:
int main()
{
int a = 5;
int b = a + 6;
return 0;
}
Here is the associated assembly code:
(gdb) disassemble
Dump of assembler code for function main:
0x0000000100000f50 <main+0>: push %rbp
0x0000000100000f51 <main+1>: mov %rsp,%rbp
0x0000000100000f54 <main+4>: mov $0x0,%eax
0x0000000100000f59 <main+9>: movl $0x0,-0x4(%rbp)
0x0000000100000f60 <main+16>: movl $0x5,-0x8(%rbp)
0x0000000100000f67 <main+23>: mov -0x8(%rbp),%ecx
0x0000000100000f6a <main+26>: add $0x6,%ecx
0x0000000100000f70 <main+32>: mov %ecx,-0xc(%rbp)
0x0000000100000f73 <main+35>: pop %rbp
0x0000000100000f74 <main+36>: retq
End of assembler dump.
I can safely assume that this line of assembly code:
0x0000000100000f6a <main+26>: add $0x6,%ecx
correlates to this line of C:
int b = a + 6;
But is there a way to extract which lines of assembly are associated to the specific line of C code?
In this small sample it's not too difficult, but in larger programs and when debugging a larger amount of code it gets a bit cumbersome.
But is there a way to extract which lines of assembly are associated to the specific line of C code?
Yes, in principle - your compiler can probably do it (GCC option -fverbose-asm, for example). Alternatively, objdump -lSd or similar will disassemble a program or object file with source and line number annotations where available.
In general though, for a large optimized program, this can be very hard to follow.
Even with perfect annotation, you'll see the same source line mentioned multiple times as expressions and statements are split up, interleaved and reordered, and some instructions associated with multiple source expressions.
In this case, you just need to think about the relationship between your source and the assembly, but it takes some effort.
One of the best tools I've found for this is Matthew Godbolt's Compiler Explorer.
It features multiple compiler toolchains, auto-recompiles, and it immediately shows the assembly output with colored lines to show the corresponding line of source code.
First, compile the program so that its object file keeps information about the source code, using the -gdwarf flag, the -g flag, or both. Next, if you want to debug, it is important to have the compiler avoid optimizations; otherwise it is difficult to see the correspondence between source code and assembly.
gcc -gdwarf -g3 -O0 prog.c -o out
Next, tell the disassembler to output the source code. The --source flag implies the --disassemble flag.
objdump --source out
@Useless is quite right. Anyway, a trick for finding where the C code has ended up in the machine code is to inject markers into it; for instance,
#define ASM_MARK do { __asm__ __volatile__("nop; nop; nop;\n\t" :::); } while (0)
int main()
{
int a = 5;
ASM_MARK;
int b = a + 6;
ASM_MARK;
return 0;
}
You will see:
main:
pushq %rbp
movq %rsp, %rbp
movl $5, -4(%rbp)
nop; nop; nop;
movl -4(%rbp), %eax
addl $6, %eax
movl %eax, -8(%rbp)
nop; nop; nop;
movl $0, %eax
popq %rbp
ret
You need to use the __volatile__ keyword or an equivalent to tell the compiler not to interfere, and this is often compiler-specific (notice the __), as C does not provide this kind of syntax.
While I shouldn't list out the entire 4-line sample I'm given (since this is a homework question), I'm confused about how this should be read and translated into C.
cmovge %edi, %eax
What I understand so far is that the instruction is a conditional move for when the result is >=. It's comparing the first parameter of a function %edi to the integer register %eax (which was assigned the other parameter value %esi in the previous line of assembly code). However, I don't understand its result.
My problem is interpreting the optimized code. It doesn't manipulate the stack, and I'm not sure how to write this in C (or even which gcc switch I could use to generate the same result when compiling).
Could someone please give a few small examples of how the cmovge instruction might translate into C code? If it doesn't make sense as its own line of code, feel free to make something up with it.
This is in x86-64 assembly through a virtualized Linux operating system (CentOS 7).
I'm probably giving you the whole solution here:
int
doit(int a, int b) {
return a >= b ? a : b;
}
With gcc -O3 -masm=intel (and -S to emit assembly), this becomes:
doit:
.LFB0:
.cfi_startproc
cmp edi, esi
mov eax, esi
cmovge eax, edi
ret
.cfi_endproc
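To make the mapping explicit, here is a rough C model of the three instructions, one statement per instruction. It is only a sketch (doit_model is a made-up name), but it shows that cmovge does no comparing itself: it reads the flags left behind by the earlier cmp.

```c
/* Hypothetical C model of the three instructions in doit:
     cmp    edi, esi   ; compare a with b, setting the flags
     mov    eax, esi   ; result = b (mov does not change the flags)
     cmovge eax, edi   ; if a >= b (signed), result = a
*/
int doit_model(int a, int b)
{
    int ge = (a >= b);   /* what cmp edi, esi records in the flags */
    int result = b;      /* mov eax, esi */
    if (ge)              /* cmovge eax, edi */
        result = a;
    return result;       /* the return value is left in eax */
}
```

So a lone cmovge %edi, %eax (AT&T operand order) corresponds roughly to "if the preceding signed comparison came out >=, then eax = edi", and the function as a whole returns the larger of its two arguments.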
Although I searched everywhere, I couldn't find a solution to my problem. I defined a function hello_world() in a C file, hello.c, and I want to call this function from an assembly file, hello_assembly.asm. Can anyone help me?
Thank you.
You could look at the example below, which might give you some idea.
#include <stdio.h>
void mymul(int *a, int *b); /* implemented in mymul.S */
int main(void)
{
signed int a, b;
a=5,b=25;
mymul(&a,&b);
printf("\nresult=%d",b);
return 0;
}
mymul is a function written in assembly language in a file called mymul.S.
Below is the code for mymul.S:
.globl mymul
mymul:
pushl %ebp # save the old base pointer register
movl %esp, %ebp # copy the stack pointer to the base pointer register
pushl %ebx # ebx is callee-saved, so preserve it before using it
movl 8(%ebp), %eax # get the address of a
movl 12(%ebp), %ebx # get the address of b
xchg (%eax), %ecx # get the value of a and store it in ecx
xchg (%ebx), %edx # get the value of b and store it in edx
imul %ecx,%edx # do the multiplication, product in edx
xchg %ecx, (%eax) # save the value back in a
xchg %edx, (%ebx) # save the product in b
popl %ebx # restore the caller's ebx
movl %ebp, %esp # restore the stack pointer from ebp
popl %ebp # restore old ebp
ret # back to the main function
We use the command "cc" to compile the programs above (this is 32-bit code, so on a 64-bit system you will probably need to add -m32):
$ cc mymul.S mul.c -o mulprogram
In mul.c, when we call mymul, we pass the addresses of a and b, and these addresses are pushed onto the stack. Once execution has entered mymul and the base pointer has been saved, the stack looks like this (from higher to lower addresses): address of b, address of a, return address, old ebp.
We read the values stored at the addresses of a and b using xchg (plain movl would also work here), do the multiplication, and save the result in b.
I hope the above program helps you.
gcc calling conventions
The gcc documentation should spell this out in more detail.
If you can't find documentation for your compiler and environment, I'd suggest you compile your C function to an assembler listing and look at how it expects arguments to be passed in and what it leaves on the stack when exiting.
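As a starting point for that experiment, the C side could look like the sketch below. It is an assumption-laden illustration: the question does not show hello.c, so the body and the int return type are invented here, and the commands in the comments assume GCC on 32-bit x86 and NASM-style syntax for the .asm file.

```c
#include <stdio.h>

/* Hypothetical contents of hello.c. The function takes no arguments,
   so under the usual 32-bit x86 C convention the caller has nothing to
   push; the int return value comes back in %eax.

   To see what the compiler expects, generate a listing with
       gcc -S -m32 hello.c        (writes hello.s)
   In a NASM-style assembly file the call would then be roughly
       extern hello_world
       call   hello_world
   (on macOS the symbol carries a leading underscore, _hello_world). */
int hello_world(void)
{
    /* printf returns the number of characters written */
    return printf("Hello, world!\n");
}
```

Reading the generated hello.s shows the exact symbol name and how the return value is produced, which is what the assembly caller has to match.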