I'm developing a shared library that can also be executed on its own to print its version number.
I've defined a custom entry point as:
const char my_interp[] __attribute__((section(".interp"))) = "/lib64/ld-linux-x86-64.so.2";
void my_main() {
printf("VERSION: %d\n", 0);
_exit(0);
}
and I compile with
gcc -o list.os -c -g -Wall -fPIC list.c
gcc -o liblist.so -g -Wl,-e,my_main -shared list.os -lc
This code compiles and runs perfectly.
The problem appears when I change the printf argument to a float or double (%f or %lf): the library still compiles, but it segfaults when run.
Does anyone have any ideas?
edit1:
Here is the code that segfaults:
const char my_interp[] __attribute__((section(".interp"))) = "/lib64/ld-linux-x86-64.so.2";
void my_main() {
printf("VERSION: %f\n", 0.1f);
_exit(0);
}
edit2:
Additional environmental details:
uname -a
Linux mjolnir.site 3.1.10-1.16-desktop #1 SMP PREEMPT Wed Jun 27 05:21:40 UTC 2012 (d016078) x86_64 x86_64 x86_64 GNU/Linux
gcc --version
gcc (SUSE Linux) 4.6.2
/lib64/libc.so.6
Configured for x86_64-suse-linux.
Compiled by GNU CC version 4.6.2.
Compiled on a Linux 3.1.0 system on 2012-03-30.
edit 3:
Output in /var/log/messages upon segfault:
Aug 11 08:27:45 mjolnir kernel: [10560.068741] liblist.so[11222] general protection ip:7fc2b3cb2314 sp:7fff4f5c7de8 error:0 in libc-2.14.1.so[7fc2b3c63000+187000]
Figured it out. :)
Floating-point operations on x86_64 use the XMM vector registers, and aligned SSE accesses to memory (including the stack) must fall on 16-byte boundaries. This explains why 32-bit platforms were unaffected and why integer and character printing worked.
I've compiled my code to assembly with:
gcc -W list.c -o list.S -shared -Wl,-e,my_main -S -fPIC
then altered the "my_main" function to reserve more stack space.
Before:
my_main:
.LFB6:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl $.LC0, %eax
movsd .LC1(%rip), %xmm0
movq %rax, %rdi
movl $1, %eax
call printf
movl $0, %edi
call _exit
.cfi_endproc
After:
my_main:
.LFB6:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
subq $8, %rsp # ADDED THIS LINE
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl $.LC0, %eax
movsd .LC1(%rip), %xmm0
movq %rax, %rdi
movl $1, %eax
call printf
movl $0, %edi
call _exit
.cfi_endproc
Then I compiled this .S file by:
gcc list.S -o liblist.so -Wl,-e,my_main -shared
This fixes the issue, but I will forward this thread to the GCC and GLIBC mailing lists, as it looks like a bug.
edit1:
According to noshadow on the GCC IRC channel, this is a non-standard way to do it. He said that if you use the -e option, you should either initialize the C runtime manually or not use libc functions. Makes sense.
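The alignment arithmetic behind the fix can be sketched in a few lines of C. This is only a model, not the author's code: it assumes the kernel enters my_main with %rsp 16-byte aligned (as the x86-64 System V ABI specifies for process entry) and that printf, like any function, expects %rsp to be 8 modulo 16 on entry, which is what a normal call from aligned code produces. The helper name rsp_mod16_at_printf is hypothetical.

```c
/* Model of %rsp modulo 16 through the prologue of my_main.
   entry_mod16: %rsp % 16 when my_main starts executing
                (0 when it is the ELF entry point, 8 after a normal call).
   extra_sub:   bytes removed by an additional "subq $N, %rsp". */
unsigned rsp_mod16_at_printf(unsigned entry_mod16, unsigned extra_sub)
{
    unsigned rsp = entry_mod16;
    rsp = (rsp - 8) & 15;          /* pushq %rbp */
    rsp = (rsp - extra_sub) & 15;  /* optional subq $N, %rsp */
    rsp = (rsp - 8) & 15;          /* call printf pushes the return address */
    return rsp;                    /* %rsp % 16 as seen on entry to printf */
}
```

Under these assumptions, rsp_mod16_at_printf(0, 0) is 0, so printf sees a stack that is off by 8 from what it expects, while rsp_mod16_at_printf(0, 8) is 8, matching the effect of the added subq $8, %rsp. A my_main reached through an ordinary call corresponds to rsp_mod16_at_printf(8, 0), which is also 8.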
I'm working on an AWD obstacle-avoidance robot in x86 assembly. I can find programs for it written in C, but I can't find any written in x86 assembly.
How do I convert this C code to x86 assembly code?
The complete code is here:
http://www.mertarduino.com/arduino-obstacle-avoiding-robot-car-4wd/2018/11/22/
void compareDistance() // find the longest distance
{
if (leftDistance>rightDistance) //if left is less obstructed
{
turnLeft();
}
else if (rightDistance>leftDistance) //if right is less obstructed
{
turnRight();
}
else //if they are equally obstructed
{
turnAround();
}
}
int readPing() { // read the ultrasonic sensor distance
delay(70);
unsigned int uS = sonar.ping();
int cm = uS/US_ROUNDTRIP_CM;
return cm;
}
How do I convert this C code to x86 assembly code?
Converting source code to assembly is basically what a compiler does, so just compile it. Most (if not all) compilers have the option of outputting the intermediate assembly code.
If you use gcc -S main.c you will get a file called main.s containing the assembly code.
Here is an example:
$ cat hello.c
#include <stdio.h>
void print_hello() {
puts("Hello World!");
}
int main() {
print_hello();
}
$ gcc -S hello.c
$ cat hello.s
.file "hello.c"
.text
.section .rodata
.LC0:
.string "Hello World!"
.text
.globl print_hello
.type print_hello, @function
print_hello:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
leaq .LC0(%rip), %rdi
call puts@PLT
nop
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size print_hello, .-print_hello
.globl main
.type main, @function
main:
.LFB1:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl $0, %eax
call print_hello
movl $0, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE1:
.size main, .-main
.ident "GCC: (Debian 8.3.0-6) 8.3.0"
.section .note.GNU-stack,"",@progbits
How do I convert this C code to x86 assembly code?
You can use the gcc -m32 -S main.c command to do that, where:
the -S flag indicates that the output must be assembly,
the -m32 flag indicates that you want to produce i386 (32-bit) output.
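As a concrete input for gcc -S, the decision logic of compareDistance can be isolated into plain C. This is a sketch under assumptions: the Arduino sketch itself relies on a sonar object and board-specific functions that will not compile as host C, so only the portable comparison is kept, and the enum turn and choose_turn names are hypothetical.

```c
/* Plain-C stand-in for the robot's decision logic, suitable as input to
   "gcc -S". The names are hypothetical; the original sketch calls
   turnLeft(), turnRight() and turnAround() directly instead of
   returning a value. */
enum turn { TURN_LEFT, TURN_RIGHT, TURN_AROUND };

enum turn choose_turn(int left_distance, int right_distance)
{
    if (left_distance > right_distance)   /* left is less obstructed */
        return TURN_LEFT;
    if (right_distance > left_distance)   /* right is less obstructed */
        return TURN_RIGHT;
    return TURN_AROUND;                   /* equally obstructed */
}
```

Saving this as, say, turn.c and running gcc -S turn.c (optionally with -m32) should produce a turn.s whose compare-and-branch sequence can serve as a starting point for a hand-written version.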
I want to test my code (I know it is still incomplete, and I plan to complete it before I compile it) to see whether it produces the correct assembly code when compiled with the -S switch. How do I do this?
I am not very familiar with compiling. All I have done so far is save my file; now I need to compile it to be able to run it.
typedef enum {MODE_A, MODE_B, MODE_C, MODE_D, MODE_E} mode_t;
long switch3 (long *p1, long *p2, mode_t action) {
long result = 0;
switch(action){
case MODE_A:
case MODE_B:
case MODE_C:
case MODE_D:
case MODE_E:
default:; // don't forget the colon
}
return result;
}
Open an editor, Vi or Emacs for example
Type and save your code in a file, maybe main.c
Exit the editor
Type gcc -S main.c or clang -S main.c in the terminal. You can also add the -fverbose-asm flag to make the compiler annotate the output with more information, or the -masm=intel flag to get the assembly in Intel syntax, which many people find easier to read.
On success, a file named main.s will be generated under the current directory, containing the assembly code; on failure, error messages will be printed on the screen.
Also note that your C code can only be compiled if it is valid, so you have to fix it first. At the very least, change default; to default:;
Here is the assembly code produced by clang -S main.c on my machine:
.section __TEXT,__text,regular,pure_instructions
.macosx_version_min 10, 11
.globl _switch3
.align 4, 0x90
_switch3: ## @switch3
.cfi_startproc
## BB#0:
pushq %rbp
Ltmp0:
.cfi_def_cfa_offset 16
Ltmp1:
.cfi_offset %rbp, -16
movq %rsp, %rbp
Ltmp2:
.cfi_def_cfa_register %rbp
movq %rdi, -8(%rbp)
movq %rsi, -16(%rbp)
movl %edx, -20(%rbp)
movq $0, -32(%rbp)
movl -20(%rbp), %edx
subl $4, %edx
movl %edx, -36(%rbp) ## 4-byte Spill
ja LBB0_2
jmp LBB0_1
LBB0_1:
jmp LBB0_2
LBB0_2:
jmp LBB0_3
LBB0_3:
movq -32(%rbp), %rax
popq %rbp
retq
.cfi_endproc
.subsections_via_symbols
To compile to assembly without assembling or linking using the GNU Compiler Collection (gcc), you can use the -S switch:
jan@jsn-dev:~/src/so> gcc -S main.c
main.c: In function ‘switch3’:
main.c:11:12: error: expected ‘:’ before ‘;’ token
default;
^
After correcting your code with the suggested fix, you get:
jan@jsn-dev:~/src/so> gcc -S main.c
jan@jsn-dev:~/src/so> cat main.s
.file "main.c"
.text
.globl switch3
.type switch3, @function
switch3:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movq %rdi, -24(%rbp)
movq %rsi, -32(%rbp)
movl %edx, -36(%rbp)
movq $0, -8(%rbp)
movq -8(%rbp), %rax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size switch3, .-switch3
.ident "GCC: (SUSE Linux) 4.8.3 20140627 [gcc-4_8-branch revision 212064]"
.section .note.GNU-stack,"",@progbits
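Both listings above are nearly empty because every case falls through to an empty default. To see the compiler emit real dispatch code, the cases need bodies. The following is a sketch with made-up case bodies, purely to give gcc -S something to translate; they are not the intended solution to the exercise. The enum type is renamed to the hypothetical mode_kind here because mode_t is also a POSIX type name and can clash once system headers are included.

```c
/* Completed variant of switch3 with hypothetical case bodies. */
typedef enum { MODE_A, MODE_B, MODE_C, MODE_D, MODE_E } mode_kind;

long switch3(long *p1, long *p2, mode_kind action)
{
    long result = 0;
    switch (action) {
    case MODE_A: result = *p1;       break;  /* placeholder body */
    case MODE_B: result = *p2;       break;  /* placeholder body */
    case MODE_C: result = *p1 + *p2; break;  /* placeholder body */
    case MODE_D: result = *p1 - *p2; break;  /* placeholder body */
    case MODE_E: result = 27;        break;  /* placeholder body */
    default:                         break;  /* note the colon */
    }
    return result;
}
```

Compiling this with gcc -S, with and without -O2, is a reasonable way to compare how the switch is lowered (a chain of compares or a jump table, depending on compiler and options).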
This is my first time here. I'm running 64-bit Kali Linux, and I'm a Linux rookie and new to ASM as well. I picked up some C code which works perfectly fine. Here is the code:
#include<stdio.h>
#include<string.h> //strlen
#include<sys/socket.h>
#include<arpa/inet.h> //inet_addr
int main(int argc , char *argv[])
{
int socket_desc;
struct sockaddr_in server;
char *message , server_reply[2000];
//Create socket
socket_desc = socket(AF_INET , SOCK_STREAM , 0);
if (socket_desc == -1)
{
printf("Could not create socket");
}
server.sin_addr.s_addr = inet_addr("127.0.0.1");
server.sin_family = AF_INET;
server.sin_port = htons( 2000 );
//Connect to remote server
if (connect(socket_desc , (struct sockaddr *)&server , sizeof(server)) <0)
{
puts("connect error");
return 1;
}
puts("Connected\n");
//Send some data
message = "Hola!!!!\n\r\n";
if( send(socket_desc , message , strlen(message) , 0) < 0)
{
puts("Send failed");
return 1;
}
puts("Data Send\n");
//Receive a reply from the server
if( recv(socket_desc, server_reply , 2000 , 0) < 0)
{
puts("recv failed");
}
puts("Reply received\n");
puts(server_reply);
return 0;
}
So I used gcc -S -o example.s example.c to get the ASM code, which is:
.file "test.c"
.section .rodata
.LC0:
.string "Could not create socket"
.LC1:
.string "127.0.0.1"
.LC2:
.string "connect error"
.LC3:
.string "Connected\n"
.align 8
.LC4:
.string "Hola!! , \n\r\n"
.LC5:
.string "Send failed"
.LC6:
.string "Data Send\n"
.LC7:
.string "recv failed"
.LC8:
.string "Reply received\n"
.text
.globl main
.type main, @function
main:
.LFB2:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $2048, %rsp
movl %edi, -2036(%rbp)
movq %rsi, -2048(%rbp)
movl $0, %edx
movl $1, %esi
movl $2, %edi
call socket
movl %eax, -4(%rbp)
cmpl $-1, -4(%rbp)
jne .L2
movl $.LC0, %edi
movl $0, %eax
call printf
.L2:
movl $.LC1, %edi
call inet_addr
movl %eax, -28(%rbp)
movw $2, -32(%rbp)
movl $2000, %edi
call htons
movw %ax, -30(%rbp)
leaq -32(%rbp), %rcx
movl -4(%rbp), %eax
movl $16, %edx
movq %rcx, %rsi
movl %eax, %edi
call connect
testl %eax, %eax
jns .L3
movl $.LC2, %edi
call puts
movl $1, %eax
jmp .L7
.L3:
movl $.LC3, %edi
call puts
movq $.LC4, -16(%rbp)
movq -16(%rbp), %rax
movq %rax, %rdi
call strlen
movq %rax, %rdx
movq -16(%rbp), %rsi
movl -4(%rbp), %eax
movl $0, %ecx
movl %eax, %edi
call send
testq %rax, %rax
jns .L5
movl $.LC5, %edi
call puts
movl $1, %eax
jmp .L7
.L5:
movl $.LC6, %edi
call puts
leaq -2032(%rbp), %rsi
movl -4(%rbp), %eax
movl $0, %ecx
movl $2000, %edx
movl %eax, %edi
call recv
testq %rax, %rax
jns .L6
movl $.LC7, %edi
call puts
.L6:
movl $.LC8, %edi
call puts
leaq -2032(%rbp), %rax
movq %rax, %rdi
call puts
movl $0, %eax
.L7:
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE2:
.size main, .-main
.ident "GCC: (Debian 4.9.2-10) 4.9.2"
After running as example.s -o example.o, I ran ld example.o -o example, and that's where I get the following errors:
ld: warning: cannot find entry symbol _start; defaulting to 00000000004000b0
test.o: In function `main':
test.c:(.text+0x28): undefined reference to `socket'
test.c:(.text+0x40): undefined reference to `printf'
test.c:(.text+0x4a): undefined reference to `inet_addr'
test.c:(.text+0x5d): undefined reference to `htons'
test.c:(.text+0x77): undefined reference to `connect'
test.c:(.text+0x85): undefined reference to `puts'
test.c:(.text+0x99): undefined reference to `puts'
test.c:(.text+0xad): undefined reference to `strlen'
test.c:(.text+0xc3): undefined reference to `send'
test.c:(.text+0xd2): undefined reference to `puts'
test.c:(.text+0xe3): undefined reference to `puts'
test.c:(.text+0xfe): undefined reference to `recv'
test.c:(.text+0x10d): undefined reference to `puts'
test.c:(.text+0x117): undefined reference to `puts'
test.c:(.text+0x126): undefined reference to `puts'
It seems to me that gcc is not correctly emitting _start, global main, etc., but to be honest I wouldn't know how to fix it. If this is correct, then why?
Any help will be appreciated.
Thank you.
The problem is that ld example.o -o example tries to link just example.o and nothing else. To resolve the missing symbols you need to link much more (e.g. startup code, the standard library, the C runtime, etc.). Try gcc -v example.c to see how the linker should be invoked.
The commands given in Harry's answer are the good ones:
gcc -Wall -O -fverbose-asm -S example.c
gcc -c example.s -o example.o
gcc example.o -o example
Basically, you should be aware that GCC links your code with:
startup code like crt0 (actually, that is several object files today)
the C standard library (libc.so), which makes some system calls
libgcc, which provides a few low-level, processor-specific functions (e.g. 64-bit arithmetic on a 32-bit machine); it has a permissive but ad-hoc license
and you often need a dynamic linker like ld-linux(8)
the kernel also provides vdso(7)
How all this is linked together is known by the gcc command, which will start some ld. Replace gcc with gcc -v in your compilation commands to understand what exactly is happening. If you want to issue your own ld command you should add the options providing what I have listed above. The errors you are getting are notably because of the lack of crt0 & libc
BTW on Linux most C standard libraries (e.g. GNU libc or musl-libc) are free software (and so is GCC), so you can study their source code.
Also try gcc -dumpspecs, which describes what gcc knows about issuing various commands (notice that gcc is only a driver program; the real C compiler is cc1). Also read the wiki page on GCC. Some slides and references in the documentation of GCC MELT give a lot more information. See also this and the picture there.
I strongly recommend also using gcc to assemble (some assembler code of yours) and to link (because you don't want to handle all the gory details mentioned above, plus some others I did not mention).
Try this
gcc -Wall -O -fverbose-asm -S example.c
gcc -c example.s -o example.o
gcc example.o -o example
This is the important part of what gcc passes to the linker:
/usr/lib/gcc/x86_64-linux-gnu/4.9/../../../x86_64-linux-gnu/crt1.o
/usr/lib/gcc/x86_64-linux-gnu/4.9/../../../x86_64-linux-gnu/crti.o
/usr/lib/gcc/x86_64-linux-gnu/4.9/crtbegin.o
-lgcc
--as-needed -lgcc_s
--no-as-needed -lc -lgcc
--as-needed -lgcc_s
--no-as-needed /usr/lib/gcc/x86_64-linux-gnu/4.9/crtend.o
/usr/lib/gcc/x86_64-linux-gnu/4.9/../../../x86_64-linux-gnu/crtn.o
crt1, crti and crtbegin supply the startup code where the _start entry point is actually defined (control is later passed to your main), stdio is initialized, etc. Similarly, crtend and crtn handle the cleanup after main returns. -lc supplies the standard library (puts and the other missing symbols). -lgcc and -lgcc_s provide the gcc-specific runtime support.
The bottom line is that you need all of that to be linked in.
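To illustrate what the startup files and libc normally provide, here is a sketch of a program that does without both. It is an illustration under stated assumptions, not a recommended way to build programs: it works only on x86-64 Linux with GCC or Clang, it uses raw system call numbers (1 for write, 60 for exit), and all function names (raw_syscall3, raw_write, raw_exit, my_start) are hypothetical.

```c
/* Sketch for x86-64 Linux only: a program that needs neither libc nor
   the crt*.o startup files. All function names here are hypothetical. */

/* Issue a raw three-argument Linux system call. */
static long raw_syscall3(long nr, long a, long b, long c)
{
    long ret;
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"(nr), "D"(a), "S"(b), "d"(c)
                      : "rcx", "r11", "memory");
    return ret;
}

/* write without libc; system call number 1 on x86-64 Linux. */
long raw_write(int fd, const void *buf, unsigned long len)
{
    return raw_syscall3(1, fd, (long)buf, (long)len);
}

/* exit without libc; system call number 60 on x86-64 Linux. */
void raw_exit(int status)
{
    raw_syscall3(60, status, 0, 0);
    for (;;) { }  /* not reached: the exit system call does not return */
}

/* Custom entry point, playing the role crt1.o's _start normally plays.
   force_align_arg_pointer asks the compiler to realign the stack,
   because an entry point is not reached through a normal call. */
__attribute__((force_align_arg_pointer))
void my_start(void)
{
    static const char msg[] = "hello without libc\n";
    raw_write(1, msg, sizeof msg - 1);
    raw_exit(0);
}
```

One plausible way to build it, not verified here, is gcc -nostdlib -static -Wl,-e,my_start hello.c -o hello, where hello.c is an assumed file name. Everything this sketch has to do by hand (entering the program, talking to the kernel, exiting) is what crt1.o and libc supply when gcc drives the link.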
From this question: Why do you have to link the math library in C?
I know that C math library (libm) is separated from C standard library (libc), and is not linked in by default.
But when I compiled the code below using gcc filename.c without -lm on Mac OS X 10.11.1:
#include <math.h>
#include <stdio.h>
int
main (void)
{
double x = sqrt (2.0);
printf ("The square root of 2.0 is %f\n", x);
return 0;
}
There's no link error and the output executable file works correctly.
Then I tried otool -L output:
output:
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1225.1.1)
/opt/local/lib/libgcc/libgcc_s.1.dylib (compatibility version 1.0.0, current version 1.0.0)
I wonder whether there are some differences in library structure on the Mac.
Or is it a new feature of gcc 5.2.0?
Thanks a lot!
Update:
I changed the code with:
double in = 0;
scanf("%lf", &in);
double x = sqrt(in);
and it still doesn't need -lm.
I disassembled the code with otool -vVt:
(__TEXT,__text) section
_main:
0000000100000eed pushq %rbp
0000000100000eee movq %rsp, %rbp
0000000100000ef1 subq $0x10, %rsp
0000000100000ef5 pxor %xmm0, %xmm0
0000000100000ef9 movsd %xmm0, -0x10(%rbp)
0000000100000efe leaq -0x10(%rbp), %rax
0000000100000f02 movq %rax, %rsi
0000000100000f05 leaq 0x82(%rip), %rdi ## literal pool for: "%lf"
0000000100000f0c movl $0x0, %eax
0000000100000f11 callq 0x100000f54 ## symbol stub for: _scanf
0000000100000f16 movq -0x10(%rbp), %rax
0000000100000f1a movd %rax, %xmm0
0000000100000f1f callq 0x100000f5a ## symbol stub for: _sqrt
0000000100000f24 movd %xmm0, %rax
0000000100000f29 movq %rax, -0x8(%rbp)
0000000100000f2d movq -0x8(%rbp), %rax
0000000100000f31 movd %rax, %xmm0
0000000100000f36 leaq 0x55(%rip), %rdi ## literal pool for: "The square root of 2.0 is %f\n"
0000000100000f3d movl $0x1, %eax
0000000100000f42 callq 0x100000f4e ## symbol stub for: _printf
0000000100000f47 movl $0x0, %eax
0000000100000f4c leave
0000000100000f4d retq
It seems sqrt is called. So why do things work differently on the Mac?
Update:
I found the answer in this question: C std library don't appear to be linked in object file
It says on OS X, the math library is part of libSystem:
$ ls -l /usr/lib/libm.dylib
lrwxr-xr-x 1 root wheel 15 3 Jun 01:39 /usr/lib/libm.dylib@ -> libSystem.dylib
There's no separate math library on OSX. While a lot of systems ship functions in the standard C math.h header in a separate math library, OSX does not do that, it's part of the libSystem library, which is always linked in.
In addition to that, a compiler might optimize away any such call if it can perform the computation at compile time.
sqrt is provided as a compiler built-in, so no link to the library is necessary (as it happens - doing so would still be good practice so that it compiles elsewhere).
Per this page:
The ISO C90 functions [long list including sqrt] are all recognized as built-in functions unless -fno-builtin is specified (or -fno-builtin-function is specified for an individual function). All of these functions have corresponding versions prefixed with __builtin_.
If you compile with -fno-builtin I would expect a failure at the link stage.
I'm studying NASM on 64-bit Linux and have been trying to implement some code examples. However, I ran into a problem with the following example. The function donothing is implemented in NASM and is supposed to be called from a program written in C:
File main.c:
#include <stdio.h>
#include <stdlib.h>
int donothing(int, int);
int main() {
printf(" == %d\n", donothing(1, 2));
return 0;
}
File first.asm
global donothing
section .text
donothing:
push rbp
mov rbp, rsp
mov eax, [rbp-0x4]
pop rbp
ret
All donothing does is return the value of the first parameter. But when donothing is called, the value 0 is printed instead of 1. I tried rbp+0x4, but that doesn't work either.
I compile the files using the following command:
nasm -f elf64 first.asm && gcc first.o main.c
When I compile the C function 'test' with gcc -S, the generated assembly that fetches the parameters looks similar to donothing:
int test(int a, int b) {
return a > b;
}
Assembly generated by gcc for the function 'test' above:
test:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl %edi, -4(%rbp)
movl %esi, -8(%rbp)
movl -4(%rbp), %eax
cmpl -8(%rbp), %eax
setg %al
movzbl %al, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
So, what's wrong with donothing?
In the x86-64 calling convention used on Linux, the first few parameters are passed in registers rather than on the stack. In your case you should find the 1 and 2 in RDI and RSI (as 32-bit ints, in EDI and ESI).
As you can see in the compiled C code, it takes a from edi and b from esi (although, being unoptimized, it goes through an unnecessary intermediate step by storing them in memory first).
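A corrected routine therefore only needs to copy EDI into EAX; in NASM syntax that is mov eax, edi followed by ret. The sketch below shows the same fix embedded in a C file as top-level assembly in AT&T syntax, so it can be built with gcc alone. It assumes x86-64 Linux with GCC or Clang, and the name donothing_fixed is hypothetical.

```c
/* x86-64 System V only (Linux, GCC or Clang). donothing_fixed is a
   hypothetical name for a corrected version of the NASM routine,
   written here in AT&T syntax as top-level assembly. */
int donothing_fixed(int a, int b);

__asm__(
    ".pushsection .text\n"
    ".globl donothing_fixed\n"
    ".type donothing_fixed, @function\n"
    "donothing_fixed:\n"
    "    movl %edi, %eax\n"   /* first int argument arrives in EDI */
    "    ret\n"               /* the return value is read from EAX */
    ".size donothing_fixed, .-donothing_fixed\n"
    ".popsection\n"
);
```

Called as donothing_fixed(1, 2), this returns the first argument. No stack frame is needed, since nothing is read from or written to the stack.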