Is it possible to convert given C code to x86 assembly?

I'm working on an AWD obstacle-avoidance robot in x86 assembly. I can find programs for it that are already written in C, but none written in x86 assembly.
How do I convert this C code to x86 assembly code?
The complete code is here:
http://www.mertarduino.com/arduino-obstacle-avoiding-robot-car-4wd/2018/11/22/
void compareDistance() // find the longest distance
{
if (leftDistance>rightDistance) //if left is less obstructed
{
turnLeft();
}
else if (rightDistance>leftDistance) //if right is less obstructed
{
turnRight();
}
else //if they are equally obstructed
{
turnAround();
}
}
int readPing() { // read the ultrasonic sensor distance
delay(70);
unsigned int uS = sonar.ping();
int cm = uS/US_ROUNDTRIP_CM;
return cm;
}

How do I convert this C code to x86 assembly code?
Converting source code to assembly is basically what a compiler does, so just compile it. Most (if not all) compilers have the option of outputting the intermediate assembly code.
If you use gcc -S main.c you will get a file called main.s containing the assembly code.
Here is an example:
$ cat hello.c
#include <stdio.h>
void print_hello() {
puts("Hello World!");
}
int main() {
print_hello();
}
$ gcc -S hello.c
$ cat hello.s
.file "hello.c"
.text
.section .rodata
.LC0:
.string "Hello World!"
.text
.globl print_hello
.type print_hello, @function
print_hello:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
leaq .LC0(%rip), %rdi
call puts@PLT
nop
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size print_hello, .-print_hello
.globl main
.type main, @function
main:
.LFB1:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl $0, %eax
call print_hello
movl $0, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE1:
.size main, .-main
.ident "GCC: (Debian 8.3.0-6) 8.3.0"
.section .note.GNU-stack,"",@progbits

How do I convert this C code to x86 assembly code?
You can use the gcc -m32 -S main.c command to do that, where:
the -S flag tells gcc to stop after compilation and emit assembly,
the -m32 flag requests 32-bit (i386) output instead of the host default.
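To try either command on the robot code from the question, the function has to live in a file that compiles on its own. The following is a minimal sketch, assuming plain C rather than the Arduino toolchain. The file name robot.c is made up for illustration; leftDistance and rightDistance come from the question, while lastAction and the stub bodies of turnLeft, turnRight and turnAround are hypothetical stand-ins for the real motor code.

```c
/* robot.c: stand-alone version of compareDistance() for gcc -S.
 * The turn*() bodies are placeholders (assumption); the real sketch
 * drives motors through the Arduino libraries. */

enum Action { ACTION_NONE, ACTION_LEFT, ACTION_RIGHT, ACTION_AROUND };

int leftDistance;                      /* distance measured to the left, in cm  */
int rightDistance;                     /* distance measured to the right, in cm */
enum Action lastAction = ACTION_NONE;  /* hypothetical: records the last turn   */

void turnLeft(void)   { lastAction = ACTION_LEFT; }
void turnRight(void)  { lastAction = ACTION_RIGHT; }
void turnAround(void) { lastAction = ACTION_AROUND; }

/* Turn toward the side with the longer free distance. */
void compareDistance(void)
{
    if (leftDistance > rightDistance) {        /* left is less obstructed */
        turnLeft();
    } else if (rightDistance > leftDistance) { /* right is less obstructed */
        turnRight();
    } else {                                   /* equally obstructed */
        turnAround();
    }
}
```

Running gcc -m32 -S robot.c on this file should produce robot.s, where the if/else chain shows up as comparisons followed by conditional jumps. No main is needed for -S, since the compiler only translates the file and does not link it.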

Related

Understanding argc and argv in main() assembly [duplicate]

This question already has answers here:
gcc argument register spilling on x86-64
(2 answers)
Why does clang produce inefficient asm with -O0 (for this simple floating point sum)?
(1 answer)
x86 explanation, number of function arguments and local variables
(2 answers)
Closed 2 years ago.
I have the following C program to see how the main function is called with argc and argv as follows:
#include <stdio.h>
int main(int argc, char *argv[]) {
// use a non-zero value so we can easily tell it ran properly with echo $?
return 3;
}
And the non-optimized assembly output with $ gcc ifile.c -S -o ifile.s gives us:
.file "ifile.c"
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl %edi, -4(%rbp) # <== here
movq %rsi, -16(%rbp) # <== here
movl $3, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0"
.section .note.GNU-stack,"",@progbits
I understand this except for the two marked lines, just before the return value is moved into %eax:
movl %edi, -4(%rbp)
movq %rsi, -16(%rbp)
What are these two lines doing? My guess is that the first line, with its offset of 4, stores the integer value of argc, and that the second stores an (8-byte) pointer, padded to 16, for the strings that can be passed in argv. Is this understanding correct? Where can I learn more about, not so much the full ABI, but the specific details of how the main() function gets invoked?
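As a hedged reading of the two stores: under the x86-64 System V calling convention used on Linux, the first integer argument arrives in %edi and the second in %rsi, so the two lines look like unoptimized copies of argc (4 bytes) and argv (one 8-byte pointer) into the stack frame. argv is a single pointer to a NULL-terminated array of string pointers, and the strings themselves are not copied. A small C sketch of that layout, with count_args as a hypothetical helper name:

```c
#include <stddef.h>

/* Hypothetical helper: count the entries of an argv-style array.
 * argv is one pointer to an array of char pointers; the C standard
 * guarantees that argv[argc] is a null pointer, so the array can be
 * walked without knowing argc. */
int count_args(char **argv)
{
    int n = 0;
    while (argv[n] != NULL) {
        n++;
    }
    return n;
}
```

Called from main as count_args(argv), it should return the same value as argc.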

How to run converted .asm code from .c using 'gcc' in Emu8086

I am new here. I converted some code from C to assembly, but the result doesn't look like normal assembly code. So my question is: how can I convert code from C (or C++) to assembly so that the converted code can be run on Emu8086?
Here is a simple C program:
#include<stdio.h>
void Hello(){
printf("Hello world");
}
int main (){
Hello();
return 0;
}
Then I converted it with gcc -S test.c and here is the output:
.file "test1.c"
.section .rodata
.LC0:
.string "Hello world"
.text
.globl Hello
.type Hello, @function
Hello:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
leaq .LC0(%rip), %rdi
movl $0, %eax
call printf@PLT
nop
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size Hello, .-Hello
.globl main
.type main, @function
main:
.LFB1:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl $0, %eax
call Hello
movl $0, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE1:
.size main, .-main
.ident "GCC: (Debian 6.3.0-18+deb9u1) 6.3.0 20170516"
.section .note.GNU-stack,"",@progbits
Emu8086 does what it says on the tin: it emulates an Intel 8086 processor. The assembly that GCC has produced is for your host machine (since you haven't told it to do otherwise), which evidently uses the x86-64 instruction set. The 8086 can't understand most of these instructions. You need to cross-compile to an x86 16-bit real-mode executable. The -m16 option on GCC will generate 16-bit code, but it apparently still uses 32-bit registers (EAX, etc.), which the 8086 does not have. So you will have to find a compiler that targets the basic 8086 instruction set.

inline vs static inline c

Here are some simple tests run on an x86_64 machine to show the assembler code generated when using the inline keyword:
Test 1
static inline void
show_text(void)
{
printf("Hello\n");
}
int main(int argc, char *argv[])
{
show_text();
return 0;
}
And the assembler output:
gcc -O0 -fno-asynchronous-unwind-tables -S -masm=att main.c && less main.s
.file "main.c"
.text
.section .rodata
.LC0:
.string "Hello"
.text
.type show_text, @function
show_text:
pushq %rbp
movq %rsp, %rbp
leaq .LC0(%rip), %rdi
call puts@PLT
nop
popq %rbp
ret
.size show_text, .-show_text
.globl main
.type main, @function
main:
pushq %rbp
movq %rsp, %rbp
subq $16, %rsp
movl %edi, -4(%rbp)
movq %rsi, -16(%rbp)
call show_text
movl $0, %eax
leave
ret
.size main, .-main
.ident "GCC: (GNU) 7.3.1 20180312"
.section .note.GNU-stack,"",@progbits
Test 1 result: the inline suggestion is not taken into account by the compiler
Test 2
Same code as test 1, but with -O1 optimization flag
gcc -O1 -fno-asynchronous-unwind-tables -S -masm=att main.c && less main.s
.file "main.c"
.text
.section .rodata.str1.1,"aMS",@progbits,1
.LC0:
.string "Hello"
.text
.globl main
.type main, @function
main:
subq $8, %rsp
leaq .LC0(%rip), %rdi
call puts@PLT
movl $0, %eax
addq $8, %rsp
ret
.size main, .-main
.ident "GCC: (GNU) 7.3.1 20180312"
.section .note.GNU-stack,"",@progbits
Test 2 result: the show_text function is no longer defined in the assembler output
Test 3
show_text not declared as inline, with the -O1 optimization flag
Test 3 result: show_text is no longer defined in the assembler output; the generated code is the same with or without inline
Test 4
#include <stdio.h>
static inline void
show_text(void)
{
printf("Hello\n");
printf("Hello\n");
printf("Hello\n");
printf("Hello\n");
printf("Hello\n");
printf("Hello\n");
}
int main(int argc, char *argv[])
{
show_text();
show_text();
return 0;
}
produces:
gcc -O1 -fno-asynchronous-unwind-tables -S -masm=att main.c && less main.s
.file "main.c"
.text
.section .rodata
.LC0:
.string "Hello"
.text
.type show_text, @function
show_text:
pushq %rbp
movq %rsp, %rbp
leaq .LC0(%rip), %rdi
call puts@PLT
leaq .LC0(%rip), %rdi
call puts@PLT
leaq .LC0(%rip), %rdi
call puts@PLT
leaq .LC0(%rip), %rdi
call puts@PLT
leaq .LC0(%rip), %rdi
call puts@PLT
leaq .LC0(%rip), %rdi
call puts@PLT
nop
popq %rbp
ret
.size show_text, .-show_text
.globl main
.type main, @function
main:
pushq %rbp
movq %rsp, %rbp
subq $16, %rsp
movl %edi, -4(%rbp)
movq %rsi, -16(%rbp)
call show_text
call show_text
movl $0, %eax
leave
ret
.size main, .-main
.ident "GCC: (GNU) 7.3.1 20180312"
.section .note.GNU-stack,"",@progbits
Test 4 result: show_text is defined in the assembler output; the inline suggestion is not taken into account
I understand that the inline keyword does not force inlining. But in Test 1, what prevents the body of show_text from being substituted into main?
Until now, I have declared some small static functions inline in my C source code. But from these results that seems quite useless.
Why should I declare some of my small functions static inline when using modern compilers (and possibly compiling optimized code)?
It is one of those questionable decisions of the C Language Standards people... use of inline does not guarantee a function to be inlined... the keyword only suggests to the compiler that the function could be inlined.
I've had lengthy exchanges on this topic with the ISO WG; this followed a MISRA guideline that requires all inline functions to be declared at module scope using the static keyword. Their logic is that there may be circumstances where the compiler needs to not inline the function... and equally, there may be cases where that non-inlined function needs to have global scope!
IMHO, if a programmer adds the inline keyword, then the suggestion is that they know what they are doing, and that function should be inline.
As you suggest, in its current form, the inline keyword is effectively pointless, unless a compiler treats it seriously.
In your first test you disable optimizations. Inlining is an optimization method. Do not expect it to happen.
Also, the inline keyword no longer works the way it used to. I'd say its only purpose now is to allow functions in headers without linker errors about duplicated symbols (when more than one source file includes such a header).
Let your compiler do its work. Just enable optimizations (including LTO) and do not worry about details.
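The header use case mentioned above can be sketched as follows. This is a minimal example, assuming GCC or Clang: clamp_percent, scale and brightness are hypothetical names, and __attribute__((always_inline)) is a compiler extension, not standard C. The plain static inline version leaves the decision to the optimizer, while the attribute asks for inlining even when optimization is off.

```c
/* Would normally live in a header included by several .c files.
 * static gives each translation unit its own private copy, so the
 * linker never sees duplicate symbols. */
static inline int clamp_percent(int v)
{
    if (v < 0)   return 0;
    if (v > 100) return 100;
    return v;
}

/* GCC/Clang extension (assumption: not available on every compiler):
 * request inlining regardless of the optimization level. */
static inline __attribute__((always_inline)) int scale(int v, int factor)
{
    return v * factor;
}

/* Ordinary caller; compile with gcc -O0 -S and with gcc -O1 -S to compare
 * whether calls to the two helpers remain in the output. */
int brightness(int raw)
{
    return scale(clamp_percent(raw), 2);
}
```

With this file, the -O0 output would be expected to keep a call to clamp_percent but not to scale, which matches the point that inlining is an optimization unless it is explicitly requested.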

Finding out type of assembly language generated by `gcc hello_world.c -S`

hello_world.c
#include <stdio.h>
int main()
{
printf("Hello World\n");
return 0;
}
Running gcc hello_world.c -S generates a hello_world.s file in assembly language.
hello_world.s
.file "hello_world.c"
.section .rodata
.LC0:
.string "Hello World"
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl $.LC0, %edi
call puts
movl $0, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3"
.section .note.GNU-stack,"",@progbits
Is there some way to find out which assembly language the code was generated in (other than knowing the syntax of all assembly languages)?
Reference for myself or anyone else who didn't know this:
To get your processor architecture run the following:
uname -p
By default, the output is in AT&T syntax for the GNU assembler, for whichever CPU the code was compiled for. There are options to change that (on x86, -masm=intel selects Intel syntax).
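Another way to check the target is to look at the compiler's predefined macros. A minimal sketch, assuming GCC or Clang, which define __x86_64__, __i386__, __aarch64__ and __arm__ for the corresponding targets; target_arch is a hypothetical helper name:

```c
/* Return a short name for the architecture this file was compiled for,
 * based on macros that GCC and Clang predefine per target. */
const char *target_arch(void)
{
#if defined(__x86_64__)
    return "x86-64";
#elif defined(__i386__)
    return "x86 (32-bit)";
#elif defined(__aarch64__)
    return "AArch64";
#elif defined(__arm__)
    return "ARM (32-bit)";
#else
    return "unknown";
#endif
}
```

The same macros can be listed without writing any code by running gcc -dM -E - < /dev/null, which prints every predefined macro for the current target.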

long compile time when using big arrays in the extern block

Why does gcc take a long time to compile C code that declares a big array at file scope?
#define MAXNITEMS 100000000
int buff[MAXNITEMS];
int main (int argc, char *argv[])
{
return 0;
}
I suspect a bug somewhere. There is no reason for the compile to take longer, no matter how big the array is: since you never assign a value to any element, the compiler only has to record the array's size for the .bss segment. Proof:
.file "big.c"
.comm buff,4000000000000000000,32
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl %edi, -4(%rbp)
movq %rsi, -16(%rbp)
movl $0, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Ubuntu/Linaro 4.7.3-1ubuntu1) 4.7.3"
.section .note.GNU-stack,"",@progbits
As you can see, the only thing left of the array in the assembly is .comm buff,4000000000000000000,32.
I suggest you run gcc with -S to see the assembler code. Maybe your version of GCC has a bug. I tested with GCC 4.7.3 and the compile times here are the same, no matter which value I use.
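The difference between a zero-initialized array and an explicitly initialized one can be seen with a small file. This is a sketch under the usual GCC/ELF behavior, with hypothetical names NITEMS, zeroed, seeded and sum_first: the first array only needs its size recorded (.comm or .bss), while the second carries real data that has to be written to the object file.

```c
#define NITEMS 1000000

/* No initializer: static storage is zero-initialized by the C standard,
 * so the object file only has to record the size. */
int zeroed[NITEMS];

/* Non-zero initializer: the remaining elements are zero, but the whole
 * array is typically emitted into .data. */
int seeded[NITEMS] = { 1, 2, 3 };

/* Sum the first n elements of an array. */
long sum_first(const int *a, int n)
{
    long total = 0;
    for (int i = 0; i < n; i++) {
        total += a[i];
    }
    return total;
}
```

Comparing the gcc -S output for the two declarations should show a single size directive for zeroed and explicit data directives for seeded.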
Related: Where are static variables stored (in C/C++)?
