Call libc function from assembly - c

I have a function defined in assembly that is calling a libc function (swapcontext). I invoke that function from my C code. For the purpose of creating a reproducible example, I'm using 'puts' instead:
foo.S:
.globl foo
foo:
call puts
ret
test.c:
void foo(char *str);
int main() {
foo("Hello World\n");
return 0;
}
Compile:
gcc test.c foo.S -o test
This compiles fine. Disassembling the resulting binary, however, shows that the linker did not insert a valid call instruction:
objdump -dR:
0000000000000671 <foo>:
671: e8 00 00 00 00 callq 676 <foo+0x5>
672: R_X86_64_PC32 puts@GLIBC_2.2.5-0x4
676: c3 retq
677: 66 0f 1f 84 00 00 00 nopw 0x0(%rax,%rax,1)
67e: 00 00
0000000000000530 <puts@plt>:
530: ff 25 9a 0a 20 00 jmpq *0x200a9a(%rip) # 200fd0 <puts@GLIBC_2.2.5>
536: 68 00 00 00 00 pushq $0x0
53b: e9 e0 ff ff ff jmpq 520 <.plt>
Execution:
./test1: Symbol `puts' causes overflow in R_X86_64_PC32 relocation
Segmentation fault
Any ideas why?

For your updated totally separate question, which replaced your question about disassembling a .o:
Semi-related: Unexpected value of a function pointer local variable mentions that the linker transforms references to puts into puts@plt for you in a non-PIE (because that lets you get efficient code when linking statically), but not in a PIE.
In a PIE, libc gets mapped more than 2GiB away from the main executable, so a call rel32 can't reach it.
See also Can't call C standard library function on 64-bit Linux from assembly (yasm) code, which shows AT&T and NASM syntax for calling libc functions from a PIE executable, either via the PLT with call puts@plt, or gcc -fno-plt style with call *puts@gotpcrel(%rip).
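The PLT variant can be sketched as a single C file that defines foo in file-scope assembly. This is a minimal sketch under assumptions, not the asker's exact file: it assumes x86-64 Linux with a GNU-compatible toolchain, falls back to plain C elsewhere, and changes the prototype to return int (the value puts leaves in eax) so the result can be checked. The sub/add pair around the call is an addition to the original two-instruction foo. It keeps the stack 16-byte aligned at the call, as the x86-64 System V ABI expects.

```c
#include <stdio.h>

/* Prototype changed from the question's void foo(char *) for illustration:
 * foo passes through whatever puts returns in eax. */
int foo(const char *str);

#if defined(__x86_64__) && defined(__linux__)
/* foo written in AT&T syntax. The argument is already in %rdi, so it is
 * simply forwarded to puts through the PLT, which works in a PIE. */
__asm__(
    ".pushsection .text\n"
    ".globl foo\n"
    ".type foo, @function\n"
    "foo:\n"
    "    sub $8, %rsp\n"      /* realign the stack to 16 bytes before the call */
    "    call puts@plt\n"     /* PLT call instead of a bare 'call puts' */
    "    add $8, %rsp\n"
    "    ret\n"
    ".size foo, .-foo\n"
    ".popsection\n"
);
#else
/* Assumed fallback for other targets: same behaviour in plain C. */
int foo(const char *str) { return puts(str); }
#endif
```

Calling foo("Hello World") from main should print the line and return a non-negative value, since puts appends the newline itself.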

You appear to be disassembling an object file with relocations.
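Relocations are placeholders that the linker (or, for a dynamically linked file, the dynamic loader) fills in later.
To view the relocations and symbol names properly, use objdump -dr test (static relocations, as in a .o) or objdump -dR test (dynamic relocations, as in a linked executable).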
The output will be similar to this:
0000000000000000 <foo>:
0: e8 00 00 00 00 callq 5 <foo+0x5>
1: R_X86_64_PLT32 swapcontext-0x4
You may also consider adding a ret instruction at the end of foo, just in case swapcontext errors.
As shown by your objdump -dR output, both of these refer to libc functions:
670: R_X86_64_PC32 swapcontext@GLIBC_2.2.5-0x4
# 200fd0 <swapcontext@GLIBC_2.2.5>

Related

In shared objects, why does gcc relocate through the GOT for global variables which are defined in the same shared object?

While answering another question on Stack, I came upon a question myself. With the following code in shared.c:
#include "stdio.h"
int glob_var = 0;
void shared_func(){
printf("Hello World!");
glob_var = 3;
}
Compiled with:
gcc -shared -fPIC shared.c -oshared.so
If I disassemble the code with objdump -D shared.so, I get the following for shared_func:
0000000000001119 <shared_func>:
1119: f3 0f 1e fa endbr64
111d: 55 push %rbp
111e: 48 89 e5 mov %rsp,%rbp
1121: 48 8d 3d d8 0e 00 00 lea 0xed8(%rip),%rdi # 2000 <_fini+0xebc>
1128: b8 00 00 00 00 mov $0x0,%eax
112d: e8 1e ff ff ff callq 1050 <printf@plt>
1132: 48 8b 05 9f 2e 00 00 mov 0x2e9f(%rip),%rax # 3fd8 <glob_var@@Base-0x54>
1139: c7 00 03 00 00 00 movl $0x3,(%rax)
113f: 90 nop
1140: 5d pop %rbp
1141: c3 retq
Using readelf -S shared.so, I get (for the GOT):
[22] .got PROGBITS 0000000000003fd8 00002fd8
0000000000000028 0000000000000008 WA 0 0 8
Correct me if I'm wrong, but looking at this, the access to glob_var seems to go through the GOT. According to some websites, this is due to a limitation of x86-64 machine code, where RIP-relative addressing is limited to a 32-bit offset. This explanation does not satisfy me: since the global variable is defined in the same shared object, it is guaranteed to be found in that object's own data segment, so it could be accessed with RIP-relative addressing without any issue.
I would understand the GOT relocation if glob_var had been declared extern but, in this case, it is defined in the same shared object. Why does gcc require a relocation through the GOT? Is it because it is not able to detect that the global variable is defined within the same shared object?
Related: Why are nonstatic global variables defined in shared objects referenced using GOT?
The above is 11 years old and doesn't answer my question because there doesn't seem to be an appropriate answer there.
Bonus: what does <glob_var@@Base-0x54> mean in the disassembly of shared_func? Why isn't it <glob_var@GOT>?
Thanks for any help!
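A likely explanation, offered here as an assumption rather than something stated in the question, is ELF symbol interposition. A global with default visibility is exported from the shared object, and another module may end up providing the definition that is actually used at run time. Position-independent code therefore loads the address from the GOT instead of assuming the local copy. The sketch below contrasts that with a hidden-visibility variable, which GCC and Clang can address RIP-relative; local_var is a hypothetical name added for the comparison.

```c
#include <stdio.h>

/* Default visibility: exported and possibly interposed, so -fPIC code
 * typically fetches its address from the GOT. */
int glob_var = 0;

/* Hidden visibility (GCC/Clang attribute): not exported from the shared
 * object, so the compiler may use a direct RIP-relative access.
 * local_var is a hypothetical name, not part of the question. */
__attribute__((visibility("hidden"))) int local_var = 0;

int shared_func(void) {
    printf("Hello World!\n");
    glob_var = 3;   /* expected to go through the GOT under -fPIC */
    local_var = 4;  /* expected to be a plain RIP-relative store */
    return glob_var + local_var;
}
```

Compiling this with gcc -shared -fPIC and comparing the two stores in objdump -d output is one way to check the difference on a given toolchain.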

Minimal 64-bit Windows executable crashes with tail-call optimization enabled by gcc

I'm trying to create a minimal 64-bit Windows executable to better understand how the Windows executable format works.
I wrote very basic assembly and C code as follows.
hi.s
section .text
hi:
db "hi", 0
global sayHi
align 16
sayHi:
lea rax, [rel hi]
ret
start.c
extern int puts();
extern const char *sayHi();
void start() {
puts(sayHi());
}
compiled with,
nasm -fwin64 hi.s
gcc -c -ostart.obj -O3 -fno-optimize-sibling-calls start.c
# I will explain the flag
and linked with,
golink /fo r.exe /console start.obj hi.obj msvcrt.dll
# create a console application `r.exe`
# the default entry point is `start`
The program runs fine and prints hi, but note the gcc flag -fno-optimize-sibling-calls. That flag disables tail-call optimizations so that the program always allocates stack space and calls a function. Without the flag, the program crashes.
This is the disassembled result without tail-call optimization. Not sure why gcc put a nop there, but otherwise it's very simple and runs fine.
0000000000401000 <.text>:
401000: 48 83 ec 28 sub rsp,0x28
401004: e8 27 00 00 00 call 0x401030 # sayHi
401009: 48 89 c1 mov rcx,rax
40100c: e8 ff 2f 00 00 call 0x404010 # puts
401011: 90 nop
401012: 48 83 c4 28 add rsp,0x28
401016: c3 ret
...
401020: 68 69 00 90 90 push 0xffffffff90900069 # "hi"
...
401030: 48 8d 05 e9 ff ff ff lea rax,[rip+0xffffffffffffffe9] # 0x401020
401037: c3 ret
This is the result with tail-call optimization enabled, in which case the program crashes.
0000000000401000 <.text>:
401000: 48 83 ec 28 sub rsp,0x28
401004: e8 27 00 00 00 call 0x401030 # sayHi
401009: 48 89 c1 mov rcx,rax
40100c: 48 83 c4 28 add rsp,0x28
401010: e9 eb 2f 00 00 jmp 0x404000 # puts
...
401020: 68 69 00 90 90 push 0xffffffff90900069 # "hi"
...
401030: 48 8d 05 e9 ff ff ff lea rax,[rip+0xffffffffffffffe9] # 0x401020
401037: c3 ret
Now the program doesn't allocate stack space before puts and simply does a jmp instead of call.
I investigated further to see where exactly it jumps when calling puts.
In the no-tail-call case, the called address 0x404010 in the .idata section has the instruction jmp QWORD PTR [rip+0xffffffffffffffea] # 0x404000, and 0x404000 seems to contain the address to puts.
However in the tail-call case, the called address 0x404000 has 54 40 00 00 which is no meaningful instruction. The debugger says the program segfaults at 0x404003, so I'm pretty sure the program chokes trying to execute a garbage instruction.
I must be doing something wrong, but I'm not sure what, so could you explain why the tail-call case fails and how to get it to work?
The problem was golink not handling tail calls correctly. It took me a while to work out how to make GNU ld link the program with the same options I had given to golink.
You can create a console-mode Windows executable with GNU ld using this command:
ld -o... --subsystem=console object-files...
--subsystem console and -subsystem=console mean the same thing. Use --subsystem=windows to create a GUI application.
GNU ld also handles Windows dll files, so in this case, simply giving ld a copy of msvcrt.dll from the system folder worked.

Jump to a label from inline assembly to C

I have written a piece of code in assembly, and at some points in it I want to jump to a label in C. So I have the following code (a shortened version, but it still has the same problem):
#include <stdio.h>
#define JE asm volatile("jmp end");
int main(){
printf("hi\n");
JE
printf("Invisible\n");
end:
printf("Visible\n");
return 0;
}
This code compiles, but there is no end label in the disassembled version of the code.
If I change the label name from end to anything else (say l1, both in the asm code (jmp l1) and in the C code), the build fails with:
main.c:(.text+0x6b): undefined reference to `l1'
collect2: error: ld returned 1 exit status
Makefile:2: recipe for target 'main' failed
make: *** [main] Error 1
I have tried different things (different lengths, different cases, upper, lower, etc.) and I think it only compiles with the end label. And with the end label, I get a segmentation fault because there is no end label in the disassembled code.
Compiled with: gcc -O0 main.c -o main
Disassembled code:
000000000000063a <main>:
63a: 55 push %rbp
63b: 48 89 e5 mov %rsp,%rbp
63e: 48 8d 3d af 00 00 00 lea 0xaf(%rip),%rdi # 6f4 <_IO_stdin_used+0x4>
645: e8 c6 fe ff ff callq 510 <puts@plt>
64a: e9 c9 09 20 00 jmpq 201018 <_end> # there is no _end label!
64f: 48 8d 3d a1 00 00 00 lea 0xa1(%rip),%rdi # 6f7 <_IO_stdin_used+0x7>
656: e8 b5 fe ff ff callq 510 <puts@plt>
65b: 48 8d 3d 9f 00 00 00 lea 0x9f(%rip),%rdi # 701 <_IO_stdin_used+0x11>
662: e8 a9 fe ff ff callq 510 <puts@plt>
667: b8 00 00 00 00 mov $0x0,%eax
66c: 5d pop %rbp
66d: c3 retq
66e: 66 90 xchg %ax,%ax
So, the questions are:
Am I doing something wrong? I have seen this kind of jump (from assembly to C) in other code. I can provide example links.
Why can't the compiler/linker find l1 when it can find end?
This is what asm goto is for; see GCC Inline Assembly: Jump to label outside block.
Note that defining a label inside another asm statement will sometimes work (e.g. with optimization disabled) but IS NOT SAFE.
asm("end:"); // BROKEN; NEVER USE
// except for toy experiments to look at compiler output
GNU C does not define the behaviour of jumping from one asm statement to another without asm goto. The compiler is allowed to assume that execution comes out the end of an asm statement and e.g. put a store after it.
The C end: label within a given function won't just have the asm symbol name of end or _end: - that wouldn't make sense because separate C functions are each allowed to have their own end: label. It could be something like main.end but it turns out GCC and clang just use their usual autonumbered labels like .L123.
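A minimal sketch of the asm goto approach, assuming GCC or a recent Clang. The jump mnemonic is chosen per architecture, and a plain C goto is used as an assumed fallback on other targets. JUMP_TO and jump_over are hypothetical names introduced for illustration. Listing the label in the asm goto statement tells the compiler that control may leave through it, so the C label stays valid and %l[end] expands to whatever local label the compiler assigned.

```c
#include <stdio.h>

/* JUMP_TO is a hypothetical helper macro. The label is passed in the fourth
 * colon-separated section of asm goto, and %l[name] refers to it. */
#if defined(__x86_64__) || defined(__i386__)
#define JUMP_TO(label) __asm__ goto ("jmp %l[" #label "]" : : : : label)
#elif defined(__aarch64__)
#define JUMP_TO(label) __asm__ goto ("b %l[" #label "]" : : : : label)
#else
#define JUMP_TO(label) goto label   /* assumed fallback: plain C goto */
#endif

/* Returns 1 if the "Invisible" line was reached, 0 if it was skipped. */
int jump_over(void) {
    int printed_invisible = 0;
    printf("hi\n");
    JUMP_TO(end);
    printf("Invisible\n");
    printed_invisible = 1;
end:
    printf("Visible\n");
    return printed_invisible;
}
```

Because the jump is unconditional, jump_over() is expected to skip the "Invisible" line and return 0. In real code the asm body would contain a conditional branch, which is the case asm goto was designed for.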
Then how does this code work: https://github.com/IAIK/transientfail/blob/master/pocs/spectre/PHT/sa_oop/main.c
It doesn't; the end label that asm volatile("je end"); references is in the .data section and happens to be defined by the compiler or linker to mark the end of that section.
asm volatile("je end") has no connection to the C label in that function.
I commented out some of the code in other functions to get it to compile without the "cacheutils.h" header but that didn't affect that part of the oop() function; see https://godbolt.org/z/jabYu3 for disassembly of the linked executable with JE_4k changed to JE_16 so it's not huge. It's disassembly of a linked executable so you can see the numeric address of je 6010f0 <_end> while the oop function itself starts at 4006e0 and ends at 400750. (So it doesn't contain the branch target).
If this happens to work for Spectre exploits, that's because apparently the branch is never actually taken.

Stripped binary shows "__cxa_finalize" instead of "__libc_start_main"

Why does a stripped binary show __cxa_finalize instead of __libc_start_main?
I am trying to locate and disassemble main() in a very simple C program on Linux (Ubuntu). The binary is stripped. Below you can see the disassembly (not stripped) vs. the disassembly (stripped) of the same instructions.
Question: what is __cxa_finalize in the stripped version? Why is __libc_start_main replaced by __cxa_finalize?
Not stripped:
106d: 48 8d 3d c1 00 00 00 lea rdi,[rip+0xc1] # 1135 <main>
1074: ff 15 66 2f 00 00 call QWORD PTR [rip+0x2f66] # 3fe0 <__libc_start_main@GLIBC_2.2.5>
Stripped:
106d: 48 8d 3d c1 00 00 00 lea rdi,[rip+0xc1] # 1135 <__cxa_finalize@plt+0xf5>
1074: ff 15 66 2f 00 00 call QWORD PTR [rip+0x2f66] # 3fe0 <__cxa_finalize@plt+0x2fa0>
It's not __cxa_finalize. It's __cxa_finalize@plt+0xf5 and __cxa_finalize@plt+0x2fa0 (notice the significant offsets). The disassembler has no information about the symbols main or __libc_start_main because you removed the symbol table, but for technical reasons it is still aware of the symbols associated with PLT thunks (they're needed for binding at dynamic linking time, and the disassembler probably falls back to that information when it lacks a symbol table). In general, the disassembler works backward from an address until it finds an address named by a symbol, and assumes (wrongly, here) that the address being disassembled is part of that function.
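The backward search can be sketched as a lookup for the nearest symbol at or below an address. This is only an illustration of the idea, not objdump's actual implementation. The single table entry at 0x1040 is inferred from the stripped output above (0x1135 - 0xf5 and 0x3fe0 - 0x2fa0 both give 0x1040), and the names nearest_symbol, offset_from and stripped_symbols are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

struct symbol {
    uint64_t addr;
    const char *name;
};

/* Return the symbol with the greatest address that is <= target,
 * or NULL if every symbol lies above target. */
const struct symbol *nearest_symbol(const struct symbol *table, size_t count,
                                    uint64_t target) {
    const struct symbol *best = NULL;
    for (size_t i = 0; i < count; i++) {
        if (table[i].addr <= target && (best == NULL || table[i].addr > best->addr))
            best = &table[i];
    }
    return best;
}

/* Offset the disassembler would print after the symbol name. */
uint64_t offset_from(const struct symbol *sym, uint64_t target) {
    return target - sym->addr;
}

/* Simplified stand-in for what survives stripping: only the PLT thunk.
 * The address 0x1040 is inferred from the offsets in the listing. */
const struct symbol stripped_symbols[] = {
    { 0x1040, "__cxa_finalize@plt" },
};
```

With this table, the lookup for 0x1135 (main) yields __cxa_finalize@plt+0xf5 and the lookup for 0x3fe0 (the GOT slot for __libc_start_main) yields __cxa_finalize@plt+0x2fa0, matching the stripped disassembly.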

C: Var and Function have same name -- a bug of ld?

In my project (https://github.com/zzt93/os-lab1), I found that a global variable had the same name as a function, but compiling it produced no error or warning, which caused a bug.
A simple program which can almost reproduce this problem:
//a.c
struct {
int t;
int *s;
} empty, full;
int main(){
printf("full is at %p", &full);
printf("empty is at %p", &empty);
empty.t = 1;
return 0;
}
//b.c
int empty() {
return 1;
}
Compiling them with gcc -o res.out -Wall -g -Wextra a.c b.c
produces only some warnings like these (note: in my project, it produces no error at all):
/usr/bin/ld: Warning: alignment 1 of symbol empty in /tmp/ccq70SCM.o is smaller than 16 in /tmp/ccVCOeWq.o
/usr/bin/ld: Warning: size of symbol empty changed from 16 in /tmp/ccVCOeWq.o to 11 in /tmp/ccq70SCM.o
/usr/bin/ld: Warning: type of symbol empty changed from 1 to 2 in /tmp/ccq70SCM.o
It seems the linker treats the struct empty and the function empty as the same symbol.
Disassembling it, you can clearly see that the linker uses the address of the function empty rather than that of the struct empty, so running res.out causes a segmentation fault.
40054e: be 73 05 40 00 mov $0x400573,%esi
400553: bf 12 06 40 00 mov $0x400612,%edi
400558: b8 00 00 00 00 mov $0x0,%eax
40055d: e8 ae fe ff ff callq 400410 <printf@plt>
400562: c7 05 07 00 00 00 01 movl $0x1,0x7(%rip) # 400573 <empty>
400569: 00 00 00
40056c: b8 00 00 00 00 mov $0x0,%eax
400571: 5d pop %rbp
400572: c3 retq
0000000000400573 <empty>:
400573: 55 push %rbp
400574: 48 89 e5 mov %rsp,%rbp
400577: b8 01 00 00 00 mov $0x1,%eax
40057c: 5d pop %rbp
40057d: c3 retq
40057e: 66 90 xchg %ax,%ax
Question:
Why does the linker choose the function rather than the struct? Am I right to think of this as a bug?
Why does adding static to the declaration of the struct prevent this error? I understand that static makes the variable invisible outside this file, but note that adding static to the struct empty, not to the function empty, is what solves the problem.
Edit:
Strangely enough, in the symbol table of res.out, there is only one empty:
Name Value Class Type Size Line Section
empty |0000000000400573| T | FUNC|000000000000000b| |.text
I am using
gcc version 4.9.2
Adding static prevents the error because static, when applied to functions or global variables, makes the symbol not be exported to the linker - in simple words, it makes it "private" to that file.
If you don't use static, the linker will see both definitions, but the types don't match. However, since compilation is applied file by file, the linker has no way to know the correct type of a variable - it must trust that you did your job and didn't lie.
This is why header files are important - it makes sure that types match in different files.
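The static fix can be sketched from the a.c side as follows. This is a minimal illustration, with the body of main moved into a hypothetical function mark_empty so it can be called and checked. With internal linkage, the names empty and full never reach the linker's global symbol table, so a function named empty defined in another file no longer collides with the struct.

```c
#include <stdio.h>

/* static gives both objects internal linkage: they are private to this
 * translation unit and cannot be merged with a global symbol named
 * empty that some other file defines. */
static struct {
    int t;
    int *s;
} empty, full;

/* mark_empty is a hypothetical stand-in for the body of main in a.c. */
int mark_empty(void) {
    printf("full is at %p\n", (void *)&full);
    printf("empty is at %p\n", (void *)&empty);
    empty.t = 1;   /* writes to the struct, not into another file's function */
    return empty.t;
}
```

The casts to void * are added because %p expects a void pointer. They are not part of the original snippet.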
