Why would a simple C program need syscalls? - c

Related to this other question. I am trying to run this simple C program in gem5:
int main() {
    int a=1, b=2;
    int c=a+b;
    return c;
}
And it fails because gem5 doesn't have some syscalls implemented.
My question is, why would a simple program like this require syscalls? This should run bare-metal without trouble. Is there a way to compile this to avoid syscalls? I am using arm-linux-gnueabi-gcc -static -DUNIX to compile it.

Without syscalls the program cannot exit. The way it works is typically something like this:
// Not how it's actually implemented... just a sketch.
void _start() {
    char **argv = ...;
    int argc = ...;
    // ... other initialization code ...
    int retcode = main(argc, argv);
    exit(retcode);
}
The exact details depend on the operating system, but exit(), which terminates the process, typically has to be a system call or is implemented with system calls.
Note that this is true for "hosted" C implementations, not for "freestanding" C implementations, and is highly operating-system specific. There are freestanding C implementations that can run on bare metal, but hosted C implementations usually need an operating system.
You can compile without standard libraries and without the runtime but your entry point cannot return... there is nothing to return to, without a runtime.
Creating a baremetal program
It is generally possible to compile programs capable of running baremetal.
Use -ffreestanding. This makes GCC generate code that does not assume that the standard library is available (and has other effects).
Use -nostdlib. This will prevent GCC from linking with the standard library. Note that memcmp, memset, memcpy, and memmove calls may be generated anyway, so you may have to provide these yourself.
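As a rough sketch of what "providing these yourself" can look like, here are byte-at-a-time versions of two of those functions. The my_ prefix is only an assumption for illustration, so that the snippet can also be compiled in a hosted environment without clashing with libc; in a real freestanding build they would carry the standard names memcpy and memset.

```c
#include <stddef.h>  /* size_t; this header is available even in freestanding C */

/* Byte-wise copy. The regions must not overlap (same contract as memcpy). */
void *my_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
    return dst;
}

/* Fill n bytes at dst with the low 8 bits of value (same contract as memset). */
void *my_memset(void *dst, int value, size_t n)
{
    unsigned char *d = dst;
    while (n--)
        *d++ = (unsigned char)value;
    return dst;
}
```

Depending on the GCC version and optimization level, the compiler may recognize such loops and turn them back into calls to memcpy or memset; if that happens, -fno-tree-loop-distribute-patterns is the flag to look at.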
At this point you can write your program, but you typically have to use _start instead of main:
void _start(void) {
    while (1) { }
}
Note that you can't return from _start! Think about it... there is nowhere to return to. When you compile a program like this you can see that it doesn't use any system calls and doesn't have a loader.
$ gcc -ffreestanding -nostdlib test.c
We can verify that it loads no libraries:
$ ldd a.out
statically linked
$ readelf -d a.out
Dynamic section at offset 0xf30 contains 8 entries:
Tag Type Name/Value
0x000000006ffffef5 (GNU_HASH) 0x278
0x0000000000000005 (STRTAB) 0x2b0
0x0000000000000006 (SYMTAB) 0x298
0x000000000000000a (STRSZ) 1 (bytes)
0x000000000000000b (SYMENT) 24 (bytes)
0x0000000000000015 (DEBUG) 0x0
0x000000006ffffffb (FLAGS_1) Flags: PIE
0x0000000000000000 (NULL) 0x0
We can also see that it doesn't contain any code that makes system calls:
$ objdump -d a.out
a.out: file format elf64-x86-64
Disassembly of section .text:
00000000000002c0 <_start>:
2c0: eb fe jmp 2c0 <_start>

My question is, why would a simple program like this require syscalls?
The run-time loader ld.so does syscalls. The C run-time does syscalls. Do strace <application> and see.

There are some parameters to gcc you might want to check out. Among others:
-ffreestanding
-nostdlib
-nodefaultlibs
My question is, why would a simple program like this require syscalls?
Because entering main and exiting the program is based on syscalls.

Compiling with arm-unknown-linux-uclibcgnueabi solved the issue. Apparently the uClibc implementation doesn't use the syscalls that gem5 doesn't have implemented.

Related

How do you call C functions from Assembly and how do you link it Statically?

I am playing around and trying to understand the low-level operation of computers and programs. To that end, I am experimenting with linking Assembly and C.
I have 2 program files:
Some C code here in "callee.c":
#include <unistd.h>
void my_c_func() {
    write(1, "Hello, World!\n", 14);
    return;
}
I also have some GAS x86_64 Assembly here in "caller.asm":
    .text
    .globl my_entry_pt
my_entry_pt:
    # call my c function
    call my_c_func    # this function has no parameters and no return data
    # make the 'exit' system call
    mov $60, %rax     # set the syscall to the index of 'exit' (60)
    mov $0, %rdi      # set the single parameter, the exit code to 0 for normal exit
    syscall
I can build and execute the program like this:
$ as ./caller.asm -o ./caller.obj
$ gcc -c ./callee.c -o ./callee.obj
$ ld -e my_entry_pt -lc ./callee.obj ./caller.obj -o ./prog.out -dynamic-linker /lib64/ld-linux-x86-64.so.2
$ ldd ./prog.out
linux-vdso.so.1 (0x00007fffdb8fe000)
libc.so.6 => /lib64/libc.so.6 (0x00007f46c7756000)
/lib64/ld-linux-x86-64.so.2 (0x00007f46c7942000)
$ ./prog.out
Hello, World!
Along the way, I had some problems. If I don't set the -dynamic-linker option, it defaults to this:
$ ld -e my_entry_pt -lc ./callee.obj ./caller.obj -o ./prog.out
$ ldd ./prog.out
linux-vdso.so.1 (0x00007ffc771c5000)
libc.so.6 => /lib64/libc.so.6 (0x00007f8f2abe2000)
/lib/ld64.so.1 => /lib64/ld-linux-x86-64.so.2 (0x00007f8f2adce000)
$ ./prog.out
bash: ./prog.out: No such file or directory
Why is this? Is there a problem with the linker defaults on my system? How can/should I fix it?
Also, static linking doesn't work.
$ ld -static -e my_entry_pt -lc ./callee.obj ./caller.obj -o ./prog.out
ld: ./callee.obj: in function `my_c_func':
callee.c:(.text+0x16): undefined reference to `write'
Why is this? Shouldn't write() just be a c library wrapper for the syscall 'write'? How can I fix it?
Where can I find the documentation on the C function calling convention so I can read up on how parameters are passed back and forth, etc...?
Lastly, while this seems to work for this simple example, am I doing something wrong in my initialization of the C stack? I mean, right now, I'm doing nothing. Should I be allocing memory from the kernel for the stack, setting bounds, and setting %rsp and %rbp before I start trying to call functions. Or is the kernel loader taking care of all this for me? If so, will all architectures under a Linux kernel take care of it for me?
While the Linux kernel provides a syscall named write, it does not mean that you automatically get a wrapper function of the same name you can call from C as write(). In fact, you need inline assembly to call any syscalls from C, if you're not using libc, because libc defines those wrapper functions.
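To illustrate the point about wrappers: when libc is available, its generic syscall() function can invoke the kernel's write directly, which shows that write() is only a thin layer on top of the syscall. This is a hosted sketch that assumes Linux with glibc or a compatible libc; raw_write is a hypothetical name, not something either library provides.

```c
#define _GNU_SOURCE        /* makes sure unistd.h declares syscall() on glibc */
#include <unistd.h>
#include <sys/syscall.h>   /* SYS_write */

/* Invoke the kernel's write syscall through libc's generic syscall()
 * function, bypassing the dedicated write() wrapper.
 * Returns the number of bytes written, or -1 with errno set
 * (that is libc's convention, not the kernel's). */
long raw_write(int fd, const void *buf, unsigned long len)
{
    return syscall(SYS_write, fd, buf, len);
}
```

Without libc there is no syscall() function either, which is why a freestanding program ends up using inline assembly for this.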
Instead of explicitly linking your binaries with ld, let gcc do it for you. It can even assemble assembly files (internally executing a suitable version of as), if the source ends with a .s suffix. It looks like your linking problems are simply a disagreement between what GCC assumes and how you do it via LD yourself.
No, it's not a bug; the ld default path for ld.so isn't the one used on modern x86-64 GNU/Linux systems. (/lib/ld64.so.1 might have been used on early x86-64 GNU/Linux ports before the dust settled on where multi-arch systems would put everything to support both i386 and x86-64 versions of libraries installed at the same time. Modern systems use /lib64/ld-linux-x86-64.so.2)
Linux uses the System V ABI. The AMD64 Architecture Processor Supplement (PDF) describes the initial execution environment (when _start gets invoked), and the calling convention. Essentially, you have an initialized stack, with environment and command-line arguments stored in it.
Let's construct a fully working example, containing both C and assembly (AT&T syntax) sources, and final static and dynamic binaries.
First, we need a Makefile to save typing long commands:
# SPDX-License-Identifier: CC0-1.0
CC      := gcc
CFLAGS  := -Wall -Wextra -O2 -march=x86-64 -mtune=generic -m64 \
           -ffreestanding -nostdlib -nostartfiles
LDFLAGS :=

all: static-prog dynamic-prog

clean:
        rm -f static-prog dynamic-prog *.o

%.o: %.c
        $(CC) $(CFLAGS) $^ -c -o $@

%.o: %.s
        $(CC) $(CFLAGS) $^ -c -o $@

dynamic-prog: main.o asm.o
        $(CC) $(CFLAGS) $^ $(LDFLAGS) -o $@

static-prog: main.o asm.o
        $(CC) -static $(CFLAGS) $^ $(LDFLAGS) -o $@
Makefiles are particular about their indentation: recipe lines must begin with a tab. SO converts tabs to spaces, so after pasting the above, run sed -e 's|^  *|\t|' -i Makefile to turn the leading spaces back into tabs.
The SPDX License Identifier in the above Makefile and all following files tell you that these files are licensed under Creative Commons Zero license: that is, these are all dedicated to public domain.
Compilation flags used:
-Wall -Wextra: Enable all warnings. It is a good practice.
-O2: Optimize the code. This is a commonly used optimization level, usually considered sufficient and not too extreme.
-march=x86-64 -mtune=generic -m64: Compile to 64-bit x86-64 AKA AMD64 architecture. These are the defaults; you can use -march=native to optimize for your own system.
-ffreestanding: Compilation targets the freestanding C environment. Tells the compiler it can't assume that strlen or memcpy or other library functions are available, so don't optimize a loop, struct copy, or array initialization into calls to strlen, memcpy, or memset, for example. If you do provide asm implementations of any functions gcc might want to invent calls to, you can leave this out. (Especially if you're writing a program that will run under an OS)
-nostdlib -nostartfiles: Do not link in the standard C library or its startup files. (Actually, -nostdlib already "includes" -nostartfiles, so -nostdlib alone would suffice.)
Next, let's create a header file, nolib.h, that implements nolib_exit() and nolib_write() wrappers around the exit_group and write syscalls:
// SPDX-License-Identifier: CC0-1.0

/* Require Linux on x86-64 */
#if !defined(__linux__) || !defined(__x86_64__)
#error "This only works on Linux on x86-64."
#endif

/* Known syscall numbers, without depending on glibc or kernel headers */
#define SYS_write       1
#define SYS_exit_group  231
// Normally you'd use
// #include <asm/unistd.h> for __NR_write and __NR_exit_group
// or even #include <sys/syscall.h> for SYS_write

/* Inline assembly macro for a single-parameter no-return syscall */
#define SYSCALL1_NORET(nr, arg1) \
    __asm__ volatile ( "syscall\n\t" : : "a" (nr), "D" (arg1) : "rcx", "r11", "memory")

/* Inline assembly macro for a three-parameter syscall */
#define SYSCALL3(retval, nr, arg1, arg2, arg3) \
    __asm__ volatile ( "syscall\n\t" : "=a" (retval) : "a" (nr), "D" (arg1), "S" (arg2), "d" (arg3) : "rcx", "r11", "memory" )

/* exit() function */
static inline void nolib_exit(int retval)
{
    SYSCALL1_NORET(SYS_exit_group, retval);
}

/* Some errno values */
#define EINTR   4   /* Interrupted system call */
#define EBADF   9   /* Bad file descriptor */
#define EINVAL  22  /* Invalid argument */
// or #include <asm/errno.h> to define these

/* write() syscall wrapper - returns negative errno if an error occurs */
static inline long nolib_write(int fd, const void *data, long len)
{
    long retval;
    if (fd == -1)
        return -EBADF;
    if (!data || len < 0)
        return -EINVAL;
    SYSCALL3(retval, SYS_write, fd, data, len);
    return retval;
}
The reason nolib_exit() uses the exit_group syscall instead of the exit syscall is that exit_group ends the entire process. If you run a program under strace, you'll see that it, too, calls the exit_group syscall at the very end. (Syscall implementation of exit())
Next, we need some C code. main.c:
// SPDX-License-Identifier: CC0-1.0
#include "nolib.h"

const char *c_function(void)
{
    return "C function";
}

static inline long nolib_put(const char *msg)
{
    if (!msg) {
        return nolib_write(1, "(null)", 6);
    } else {
        const char *end = msg;
        while (*end)
            end++; // strlen
        if (end > msg)
            return nolib_write(1, msg, (unsigned long)(end - msg));
        else
            return 0;
    }
}

extern const char *asm_function(int);

void _start(void)
{
    nolib_put("asm_function(0) returns '");
    nolib_put(asm_function(0));
    nolib_put("', and asm_function(1) returns '");
    nolib_put(asm_function(1));
    nolib_put("'.\n");
    nolib_exit(0);
}
nolib_put() is just a wrapper around nolib_write(), that finds the end of the string to be written, and calculates the number of characters to be written based on that. If the parameter is a NULL pointer, it prints (null).
Because this is a freestanding environment, and the default name for the entry point is _start, this defines _start as a C function that never returns. (It must not ever return, because the ABI does not provide any return address; it would just crash the process. Instead, an exit-type syscall must be called at end.)
The C source declares and calls a function asm_function, that takes an integer parameter, and returns a pointer to a string. Obviously, we'll implement this in assembly.
The C source also declares a function c_function, that we can call from assembly.
Here's the assembly part, asm.s:
# SPDX-License-Identifier: CC0-1.0
        .text
        .section        .rodata
.one:
        .string "One"           # includes zero terminator

        .text
        .p2align 4,,15
        .globl  asm_function    #### visible to the linker
        .type   asm_function, @function
asm_function:
        cmpl    $1, %edi
        jne     .else
        leaq    .one(%rip), %rax
        ret
.else:
        subq    $8, %rsp        # 16B stack alignment for a call to C
        call    c_function
        addq    $8, %rsp
        ret
        .size   asm_function, .-asm_function
We don't need to declare c_function as an extern because GNU as treats all unknown symbols as external symbols anyway. We could add Call Frame Information directives, at least .cfi_startproc and .cfi_endproc, but I left them out so it wouldn't be so obvious I just wrote the original code in C and let GCC compile it to assembly, and then prettified it just a bit. (Did I write that out aloud? Oops! But seriously, compiler output is often a good starting point for a hand-written asm implementation of something, unless it does a very bad job of optimizing.)
The subq $8, %rsp adjusts the stack so that it will be a multiple of 16 for the c_function. (On x86-64, stacks grow down, so to reserve 8 bytes of stack, you subtract 8 from the stack pointer.) After the call returns, addq $8, %rsp reverts the stack back to original.
With these four files, we're ready. To build the example binaries, run e.g.
reset ; make clean all
Running either ./static-prog or ./dynamic-prog will output
asm_function(0) returns 'C function', and asm_function(1) returns 'One'.
The two binaries are just 2 kB (static) and 6 kB (dynamic) in size or so, although you can make them even smaller by stripping unneeded stuff,
strip --strip-unneeded static-prog dynamic-prog
which removes about 0.5 kB to 1 kB of unneeded stuff from them – the exact amount varies depending on the version of GCC and Binutils you use.
On some other architectures, we'd need to also link against libgcc (via -lgcc), because some C features rely on internal GCC functions. 64-bit integer division (named udivdi or similar) on various architectures is a typical example.
As mentioned in the comments, the first version of the above examples had a few issues that need to be addressed. They do not stop the example from executing or working as intended, and were overlooked because the examples were written from scratch for this answer (in the hopes that others finding this question later on via web searches might find this useful), and I'm not perfect. :)
memory clobber argument to the inline assembly, in the syscall preprocessor macros
Adding "memory" in the clobbered list tells the compiler that the inline assembly may access (read and/or write) memory other than those specified in the parameter lists. It is obviously needed for the write syscall, but it is actually important for all syscalls, because the kernel can deliver e.g. signals in the same thread before returning from the syscall, and signal delivery can/will access memory.
As the GCC documentation mentions, this clobber also behaves like a read/write memory barrier for the compiler (but NOT for the processor!). In other words, with the memory clobber, the compiler knows that it must write any changes in variables etc. in memory before the inline assembly, and that unrelated variables and other memory content (not explicitly listed in the inline assembly inputs, outputs, or clobbers) may also change, and will generate the code we actually want, without making incorrect assumptions.
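The compiler-barrier effect can be seen in isolation with an empty inline assembly statement that has only the memory clobber. This is a minimal sketch using the GCC/Clang extended asm syntax; compiler_barrier and store_then_reload are hypothetical names chosen for the illustration.

```c
/* Compiler-only barrier: emits no instructions, but the "memory" clobber
 * makes GCC/Clang assume that any memory may have been read or written. */
static inline void compiler_barrier(void)
{
    __asm__ volatile ("" : : : "memory");
}

/* Store a value, then read it back across the barrier. The compiler has to
 * perform the store before the barrier and reload the value after it,
 * instead of simply reusing the value it already holds in a register. */
int store_then_reload(int *slot, int value)
{
    *slot = value;
    compiler_barrier();
    return *slot;
}
```

The result is the same with or without the barrier here; the difference is only visible in the generated code, and it says nothing about ordering as seen by other processors.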
-fPIC -pie: Omitted for simplicity
Position independent code is usually only relevant for shared libraries. In real projects' Makefiles, you will need to use a different set of compilation flags for objects that will be compiled as a dynamic library, static library, dynamically linked executable, or a static executable, as the desired properties (and therefore compiler/linker flags) vary.
In an example such as this one, it is better to try and avoid such extraneous things, as it is a reasonable question to ask on its own ("Which compiler options to use to achieve X, when needing Y ?"), and the answers depend on the required features and context.
In most modern distros, PIE is the default and you might want -fno-pie -no-pie to simplify debugging / disassembling. 32-bit absolute addresses no longer allowed in x86-64 Linux?
-nostdlib does imply (or "include") -nostartfiles
There are quite a few overall options and link options we can use to control how the code is compiled and linked.
Many of the options GCC supports are grouped. For example, -O2 is actually shorthand for a collection of optimization features that you can explicitly specify.
Here, the reason for keeping both is to remind human programmers of the expectations for the code: no standard library, and no start files/objects.
-march=x86-64 -mtune=generic -m64 is the default on x86-64
Again, this is kept more as a reminder of what the code expects. Without a specific architecture definition, one might get the wrong impression that the code should be compilable in general, because C typically is not architecture specific!
The nolib.h header file does contain preprocessor checks (using pre-defined compiler macros to detect the operating system and hardware architecture), halting the compilation with an error for other OSes and hardware architectures.
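The same predefined macros can also be used to select code per target rather than to stop the build. A small sketch, assuming GCC or Clang (which predefine __linux__, __x86_64__ and __aarch64__ on the respective targets); target_name is a hypothetical helper, not part of the example above.

```c
/* Report which target the compiler is building for, using macros that
 * GCC and Clang predefine. Always returns a non-empty string literal. */
const char *target_name(void)
{
#if defined(__linux__) && defined(__x86_64__)
    return "linux-x86_64";
#elif defined(__linux__) && defined(__aarch64__)
    return "linux-aarch64";
#elif defined(__linux__)
    return "linux-other";
#else
    return "unknown";
#endif
}
```

Running gcc -dM -E - < /dev/null lists every macro the compiler predefines for the current target.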
Most Linux distributions provide the syscall numbers in <asm/unistd.h>, as __NR_name.
These are derived from the actual kernel sources. However, for any given architecture, these are the stable userspace ABI, and will not change. New ones may be added. Only in some extraordinary circumstances (unfixable security holes, perhaps?) can a syscall be deprecated and stop functioning.
It is always better to use the syscall numbers from the kernel, preferably via the aforementioned header, but it's possible to build this program with only GCC, no glibc or Linux kernel headers installed. For someone writing their own standard C library, they should include the file (from Linux kernel sources).
I do know that Debian derivatives (Ubuntu, Mint, et cetera) all do provide the <asm/unistd.h> file, but there are many, many other Linux distributions, and I just am not sure about all of them. I opted to only define the two (exit_group and write), to minimize the risk of problems.
(Editor's note: the file might be in a different place in the filesystem, but the <asm/unistd.h> include path should always work if the right header package is installed. It's part of the kernel's user-space C/asm API.)
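When the headers are installed, it is easy to check the hard-coded numbers against them. The sketch below assumes a Linux system with the libc and kernel headers available; the two function names are hypothetical.

```c
#include <sys/syscall.h>   /* SYS_* names; pulls in the kernel's __NR_* values */

/* Return the syscall numbers the installed headers provide.
 * On x86-64 Linux these are expected to be 1 and 231, the same values
 * that nolib.h hard-codes. */
long header_write_nr(void)
{
    return SYS_write;
}

long header_exit_group_nr(void)
{
    return SYS_exit_group;
}
```

On other architectures the numbers differ, which is one more reason nolib.h refuses to compile anywhere except Linux on x86-64.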
Compilation flag -g adds debug symbols, which helps greatly when debugging – for example, when running and examining the binary in gdb.
I omitted this and all related flags, because I did not want to expand the topic any further, and because this example is easily debugged at the asm level and examined even without. See GDB asm tips like layout reg at the bottom of the x86 tag wiki
The System V ABI requires that before a call to a function, the stack is aligned to 16 bytes. So at the top of the function, RSP+-8 is 16-byte aligned, and if there are any stack args, they'll be aligned.
The call instruction pushes the current instruction pointer to the stack, and because this is a 64-bit architecture, that too is 64 bits = 8 bytes. So, to conform to the ABI, we really need to adjust the stack pointer by 8 before calling the function, to ensure it too gets a properly aligned stack pointer. These were initially omitted, but are now included in the assembly (asm.s file).
This matters, because on x86-64, SSE/AVX SIMD vectors have different instructions for aligned-to-16-bytes and unaligned accesses, with the aligned accesses being significantly faster on certain processors. (Why does System V / AMD64 ABI mandate a 16 byte stack alignment?). Using aligned SIMD instructions like movaps with unaligned addresses will cause the process to crash. (e.g. glibc scanf Segmentation faults when called from a function that doesn't align RSP is a real-life example of what happens when you get this wrong.)
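The alignment arithmetic itself is simple and can be written out in C. This is only an illustration of the 16-byte rule on plain integers, not code that touches the real stack pointer; both helper names are hypothetical.

```c
#include <stdint.h>

/* True if the address is a multiple of 16 bytes. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Round an address down to the nearest 16-byte boundary. Because the stack
 * grows down, rounding down is the safe direction for a stack pointer. */
uintptr_t align_down16(uintptr_t addr)
{
    return addr & ~(uintptr_t)15;
}
```

For example, if %rsp is 0x1000 just before a call, the pushed return address leaves it at 0xff8 on entry to the callee, and subtracting another 8 gives 0xff0, which is 16-byte aligned again for the next call.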
However, when we do such stack manipulations, we really should add CFI (Call Frame Information) directives to ensure debugging and stack unwinding etc. works correctly. In this case, for general CFI, we prepend .cfi_startproc before the first instruction in an assembly function, and .cfi_endproc after the last instruction in an assembly function. For the Canonical Frame Address, CFA, we add .cfi_def_cfa_offset N after any instruction that modifies the stack pointer. Essentially, N is 8 at the beginning of the function, and increases as much as %rsp is decremented, and vice versa. See this article for more.
Internally, these directives produce information (metadata) stored in the .eh_frame and .eh_frame_hdr sections in the ELF object files and binaries, depending on other compilation flags.
So, in this case, the subq $8, %rsp should be followed by .cfi_def_cfa_offset 16, and the addq $8, %rsp by .cfi_def_cfa_offset 8, plus .cfi_startproc at the beginning of asm_function and .cfi_endproc after the final ret.
Note that you can often see rep ret instead of just ret in assembly sources. This is nothing but a workaround to certain processors having branch-prediction performance issues when jumping to or falling through a JCC to a ret instruction. The rep prefix does nothing, except it does fix the issues those processors might otherwise have with such a jump. Recent GCC versions stopped doing this by default as the affected AMD CPUs are very old and not as relevant these days. What does `rep ret` mean?
The "key" option, -ffreestanding, is one that chooses a C "dialect"
The C programming language is actually separated into two different environments: hosted, and freestanding.
The hosted environment is one where the standard C library is available, and is used when you write programs, applications, or daemons in C.
The freestanding environment is one where the standard C library is not available. It is used when you write kernels, firmware for microcontrollers or embedded systems, implement (parts of) your own standard C library, or a "standard library" for some other C-derived language.
As an example, the Arduino programming environment is based on a subset of freestanding C++. The standard C++ library is not available, and many features of C++ like exceptions are not supported. In fact, it is very close to freestanding C with classes. The environment also uses a special pre-preprocessor, which for example automatically prepends declarations of functions without the user having to write them.
Probably the most well known example of freestanding C is the Linux kernel. Not only is the standard C library not available, but the kernel code must actually avoid floating-point operations as well, because of certain hardware considerations.
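A program can tell which of the two environments it is being compiled for through the standard __STDC_HOSTED__ macro. A minimal sketch (is_hosted is a hypothetical name); the two headers included are among those the standard requires even of freestanding implementations.

```c
#include <stddef.h>   /* size_t, NULL: required in freestanding C too */
#include <stdint.h>   /* fixed-width integer types: likewise (since C99) */

/* __STDC_HOSTED__ is 1 for a hosted implementation and 0 for a
 * freestanding one; GCC sets it to 0 when given -ffreestanding. */
int is_hosted(void)
{
    return __STDC_HOSTED__;
}
```

Compiling the same file with and without -ffreestanding and comparing the result is a quick way to confirm which environment a given set of flags selects.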
For a better understanding of what exactly the freestanding C environment looks like to a programmer, I think the best thing is to go look at the language standard itself. As of now (June 2020), the most recent standard is ISO C18. While the standard itself is not free, the final draft is; for C18, it is draft N2176 (PDF).
The ld default path for ld.so (the ELF interpreter) isn't the one used on modern x86-64 GNU/Linux systems.
/lib/ld64.so.1 might have been used on early x86-64 GNU/Linux ports before the dust settled on where multi-arch systems would put everything to support both i386 and x86-64 versions of libraries installed at the same time. Modern systems use /lib64/ld-linux-x86-64.so.2.
There was never a good time to update the default in GNU binutils ld; when some systems were using the default, changing it would have broken them. Multi-arch systems had to configure their GCC to pass -dynamic-linker /some/path to ld, so they simply did that instead of asking and waiting for the ld default to change. So nobody ever needed the ld default to change to make anything work, except for people playing around with assembly and using ld by hand to create dynamically-linked executables.
Instead of doing that, you can link using gcc -nostartfiles to omit CRT start code which defines a _start, but still link with the normal libraries including -lc, -lgcc internal helper functions if needed, etc.
See also Assembling 32-bit binaries on a 64-bit system (GNU toolchain) for more info on assembling with/without libc for asm that defines _start, or with libc + CRT for asm that defines main. (Leave out the -m32 from that answer for 64-bit; when using gcc to invoke as and ld for you, that's the only difference.)
ld -static -e my_entry_pt -lc ./callee.obj ./caller.obj -o ./prog.out
doesn't link because you put -lc before the object files that reference symbols in libc.
Order matters in linker command lines, for static libraries.
However, ld -static -e my_entry_pt ./callee.o ./caller.o -lc -o ./prog.out will link, but makes a program that segfaults when it calls glibc functions like write without having called glibc's init functions.
Dynamic linking takes care of that for you (glibc has .init functions that get called by the dynamic linker, the same mechanism that allows C++ static initializers to run in a C++ shared library). CRT startup code also calls those functions in the right order, but you left that out, too, and wrote your own entry point.
@Example's answer avoids that problem by defining its own write wrapper instead of linking with -lc, so it can be truly freestanding.
I thought glibc's write wrapper function would be simple enough not to crash, but that's not the case. It checks if the program is multi-threaded or something by loading from %fs:0x18. The kernel doesn't init FS base for thread-local storage; that's something user-space (glibc's internal init functions) would have to do.
glibc's write() faults on mov %fs:0x18,%eax if you haven't called glibc's init functions. (In a statically-linked executable where glibc couldn't get the dynamic linker to run them for you.)
Dump of assembler code for function write:
=> 0x0000000000401040 <+0>: endbr64 # for CET, or NOP on CPUs without CET
0x0000000000401044 <+4>: mov %fs:0x18,%eax ### this faults with no TLS setup
0x000000000040104c <+12>: test %eax,%eax
0x000000000040104e <+14>: jne 0x401060 <write+32>
0x0000000000401050 <+16>: mov $0x1,%eax # simple case: EAX = __NR_write
0x0000000000401055 <+21>: syscall
0x0000000000401057 <+23>: cmp $0xfffffffffffff000,%rax
0x000000000040105d <+29>: ja 0x4010b0 <write+112> # update errno on error
0x000000000040105f <+31>: retq # else return
0x0000000000401060 <+32>: sub $0x28,%rsp # the non-simple case:
0x0000000000401064 <+36>: mov %rdx,0x18(%rsp) # write is an async cancellation point or something
0x0000000000401069 <+41>: mov %rsi,0x10(%rsp)
0x000000000040106e <+46>: mov %edi,0x8(%rsp)
0x0000000000401072 <+50>: callq 0x4010e0 <__libc_enable_asynccancel>
0x0000000000401077 <+55>: mov 0x18(%rsp),%rdx
0x000000000040107c <+60>: mov 0x10(%rsp),%rsi
0x0000000000401081 <+65>: mov %eax,%r8d
0x0000000000401084 <+68>: mov 0x8(%rsp),%edi
0x0000000000401088 <+72>: mov $0x1,%eax
0x000000000040108d <+77>: syscall
0x000000000040108f <+79>: cmp $0xfffffffffffff000,%rax
0x0000000000401095 <+85>: ja 0x4010c4 <write+132>
0x0000000000401097 <+87>: mov %r8d,%edi
0x000000000040109a <+90>: mov %rax,0x8(%rsp)
0x000000000040109f <+95>: callq 0x401140 <__libc_disable_asynccancel>
0x00000000004010a4 <+100>: mov 0x8(%rsp),%rax
0x00000000004010a9 <+105>: add $0x28,%rsp
0x00000000004010ad <+109>: retq
0x00000000004010ae <+110>: xchg %ax,%ax
0x00000000004010b0 <+112>: mov $0xfffffffffffffffc,%rdx # errno update for the simple case
0x00000000004010b7 <+119>: neg %eax
0x00000000004010b9 <+121>: mov %eax,%fs:(%rdx) # thread-local errno?
0x00000000004010bc <+124>: mov $0xffffffffffffffff,%rax
0x00000000004010c3 <+131>: retq
0x00000000004010c4 <+132>: mov $0xfffffffffffffffc,%rdx # same for the async case
0x00000000004010cb <+139>: neg %eax
0x00000000004010cd <+141>: mov %eax,%fs:(%rdx)
0x00000000004010d0 <+144>: mov $0xffffffffffffffff,%rax
0x00000000004010d7 <+151>: jmp 0x401097 <write+87>
I don't fully understand what exactly write is checking for or doing. It may have something to do with async I/O, and/or POSIX thread cancellation points.
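The error-handling part of that disassembly is easier to follow in C. The cmp $0xfffffffffffff000,%rax / ja pair tests whether the raw return value, viewed as unsigned, lies above -4096, that is, whether it is in the range -4095..-1 that the Linux kernel uses for error codes. The sketch below mirrors that logic in hosted C; raw_to_libc_result is a hypothetical name, not glibc's internal one.

```c
#include <errno.h>

/* Convert a raw Linux syscall return value to the libc convention.
 * The kernel signals failure by returning a value in [-4095, -1];
 * libc stores the positive error code in errno and returns -1 instead. */
long raw_to_libc_result(long raw)
{
    if ((unsigned long)raw > (unsigned long)-4096L) {
        errno = (int)-raw;   /* corresponds to "neg %eax" + store via %fs */
        return -1;
    }
    return raw;              /* success: pass the value through unchanged */
}
```

Since errno is thread-local in glibc, the store goes through %fs, which is another reason the wrapper cannot work before the thread-local storage has been set up.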

When are the printf() and scanf() functions linked statically or dynamically to an application?

When a C program is compiled, it goes through the preprocessor, compiler, assembler, and linker, in that order.
One of the main tasks of the linker is to make the code of library functions available to your program.
The linker can link them in two ways: statically or dynamically.
stdio.h contains only declarations; no definitions are present in it.
We only include stdio.h in a program to tell the compiler the return types and names of functions such as printf(), scanf(), getc(), putc(), and so on.
Then how are printf() and scanf() linked in the example program below?
If they are linked dynamically, which "DLL" is responsible for the linking?
Is the entire C library linked dynamically to the program?
#include "stdio.h"
int main()
{
    int n;
    printf("Enter an integer\n");
    scanf("%d", &n);
    if (n%2 == 0)
        printf("Even\n");
    else
        printf("Odd\n");
    return 0;
}
I think the question you are trying to ask is: “I know that functions like printf and scanf are implemented by the C runtime library. But I can use them without telling my compiler and/or IDE to link my program with the C runtime library. Why don’t I need to do that?”
The answer to that question is: “Programs that don’t need to be linked with the C runtime library are very, very rare. Even if you don’t explicitly use any library functions, you will still need the startup code, and the compiler might issue calls to memcpy, floating-point emulation functions, and so on ‘under the hood.’ Therefore, as a convenience, the compiler automatically links your program with the C runtime library, unless you tell it to not do that.”
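As an illustration of a call issued "under the hood", consider plain struct assignment. Nothing in the source below names a library function, yet for a struct this large compilers commonly emit a call to memcpy; whether a particular compiler does so depends on the target and flags, so treat this as a sketch rather than a guarantee. The struct and function names are hypothetical.

```c
/* A struct large enough that compilers commonly copy it with a call to
 * memcpy rather than with inline move instructions. */
struct big {
    char bytes[4096];
};

/* Plain struct assignment: no library function appears in the source,
 * yet the generated code may contain a call to memcpy. */
void copy_big(struct big *dst, const struct big *src)
{
    *dst = *src;
}
```

Compiling this with gcc -S and searching the assembly output for memcpy shows what a given compiler actually does.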
You will have to consult the documentation for your compiler to learn how to tell it not to link in the C runtime library. GCC uses the -nostdlib command-line option. Below, I demonstrate the hoops you have to jump through to make that work...
$ cat > test.c
#include <stdio.h>
int main(void) { puts("hello world"); return 0; }
^D
$ gcc -nostdlib test.c && { ./a.out; echo $?; }
/usr/bin/ld: warning: cannot find entry symbol _start
/tmp/cc8svIx5.o: In function ‘main’:
test.c:(.text+0xa): undefined reference to ‘puts’
collect2: error: ld returned 1 exit status
puts is obviously in the C library, but so is this mysterious "entry symbol _start". Turn off the C library and you have to provide that yourself, too...
$ cat > test.c
int _start(void) { return 0; }
^D
$ gcc -nostdlib test.c && { ./a.out; echo $?; }
Segmentation fault
139
It links now, but we get a segmentation fault, because _start has nowhere to return to! The operating system expects it to call _exit. OK, let's do that...
$ cat > test.c
extern void _exit(int);
void _start(void) { _exit(0); }
^D
$ gcc -nostdlib test.c && { ./a.out; echo $?; }
/tmp/ccuDrMQ9.o: In function `_start':
test.c:(.text+0xa): undefined reference to `_exit'
collect2: error: ld returned 1 exit status
... nuts, _exit is a function in the C runtime library, too! Raw system call time...
$ cat > test.c
#include <unistd.h>
#include <sys/syscall.h>
void _start(void) { syscall(SYS_exit, 0); }
^D
$ gcc -nostdlib test.c && { ./a.out; echo $?; }
/tmp/cchtZnbP.o: In function `_start':
test.c:(.text+0x14): undefined reference to `syscall'
collect2: error: ld returned 1 exit status
... nope, syscall is also a function in the C runtime. I guess we just have to use assembly!
$ cat > test.S
#include <sys/syscall.h>
.text
.globl _start
.type _start, @function
_start:
movq $SYS_exit, %rax
movq $0, %rdi
syscall
$ gcc -nostdlib test.S && { ./a.out; echo $?; }
0
And that, finally, works. On my computer. It wouldn't work on a different operating system, with a different assembly-level convention for system calls.
You might now be wondering what the heck -nostdlib is even good for, if you have to drop down to assembly language just to make system calls. It's intended to be used when compiling completely self-contained, low-level system programs like the bootloader, the kernel, and (parts of) the C runtime itself — things that were going to have to implement their own everything anyway.
If we had it to do all over again from scratch, it might well make sense to separate out a low-level language-independent runtime, with just the syscall wrappers, language-independent process startup code, and the functions that any language's compiler might need to call "under the hood" (memcpy, _Unwind_RaiseException, __muldi3, that sort of thing). The problem with that idea is it rapidly suffers mission creep — do you include errno? Generic threading primitives? (Which ones, with which semantics?) The dynamic linker? An implementation of malloc, which several of the above things need? Windows's ntdll.dll began as this concept, and it's 1.8MB on disk in Windows 10, which is (slightly) bigger than libc.so + ld.so on my Linux partition. And it's rare and difficult to write a program that only uses ntdll.dll, even if you're Microsoft (the only example I'm sure of is csrss.exe, which might as well be a kernel component).
Generally, standard C libraries are linked dynamically, mainly because once a program has been statically linked, the library code in it is fixed forever. If someone finds and fixes a bug in printf or scanf, every statically linked program has to be linked again in order to pick up the fixed code.
With dynamic linking, none of the executable files (created after linking) contains a copy of the code for printf or scanf. If a new, fixed version of printf is available, it is picked up at run time.
-static-libstdc++
When the g++ program is used to link a C++ program, it normally
automatically links against libstdc++. If libstdc++ is available as a
shared library, and the -static option is not used, then this links
against the shared version of libstdc++. That is normally fine.
However, it is sometimes useful to freeze the version of libstdc++
used by the program without going all the way to a fully static link.
The -static-libstdc++ option directs the g++ driver to link libstdc++
statically, without necessarily linking other libraries statically.
For more details, please check this thread: "How can i statically link standard library to my c++ program?"
They are linked statically; that way your program will be able to determine if there are any compilation errors before proceeding to run the program.

printf and memcpy linkage to standard C library

It is my understanding that if I call printf in a program, by default (if the program isn't statically compiled) it makes a call to printf in the standard C library. However, if I were to call say memcpy, I'd hope the code would be inlined, as a function call is very expensive if memcpy is only copying a few bytes. If you're inlining sometimes and calling out others, the behaviour of your program after a libc upgrade is implementation dependent.
What actually occurs in both of these cases and generally?
First of all the function is never truly "inlined" - that applies to functions that you've written that are visible in the same compilation unit.
"If you're inlining sometimes and calling out others, the behaviour of your program after a libc upgrade is implementation dependent."
This is not the case. The memcpy might be "inlined" at compile time. Once compiled, your libc version makes no difference.
In GCC, memcpy is recognized as a builtin. That means that, if GCC decides to, the call to memcpy will be replaced with a suitable inline implementation. On x86, this will usually be a rep movsb or similar instruction, depending on the size of the copy and on whether that size is a compile-time constant.
An implementation is allowed by the C standard to behave "as if" the actual standard library function were called. This is indeed a common optimization: small memcpy calls can be unrolled/inlined, and much more.
You're right that in some cases you could upgrade your libc and not see any change in function calls which were optimized out.
It's going to depend on a lot of things, here's how you can find out. GNU Binutils comes with a utility objdump that gives all sorts of details on what's in a binary.
On my system (an ARM Chromebook), compiling test.c:
#include <stdio.h>
int main(void) {
printf("Hello, world!\n");
}
with gcc test.c -o test and then running objdump -R test gives
test: file format elf32-littlearm
DYNAMIC RELOCATION RECORDS
OFFSET TYPE VALUE
000105e4 R_ARM_GLOB_DAT __gmon_start__
000105d4 R_ARM_JUMP_SLOT puts
000105d8 R_ARM_JUMP_SLOT __libc_start_main
000105dc R_ARM_JUMP_SLOT __gmon_start__
000105e0 R_ARM_JUMP_SLOT abort
These are the dynamic relocation entries that are in the file, all the stuff that will be linked in from libraries external to the binary. Here it seems that the printf has been entirely optimized out, since it is only giving a constant string, and thus puts is sufficient. If we modify this to
printf("Hello world #%d\n", 1);
then we get the expected
000105e0 R_ARM_JUMP_SLOT printf
To get memcpy to be explicitly linked to, we have to prevent gcc from using its own builtin version with -fno-builtin-memcpy.
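As a rough illustration of the builtin behaviour discussed here, this is a minimal sketch; the helper names load_u32, store_u32 and copy_n are made up for the example. With optimization enabled, GCC will typically expand the fixed 4-byte copies inline, while the variable-length copy may stay a real call to memcpy. What you actually get depends on the compiler version, target and flags, so it is worth checking rather than assuming.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fixed-size copy: the size is a compile-time constant, so the compiler
 * is free to replace the memcpy call with a plain 32-bit load. */
uint32_t load_u32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Same idea in the other direction: typically becomes a 32-bit store. */
void store_u32(unsigned char *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}

/* Size only known at run time: this one is more likely to remain a
 * call into the C library (or an inline rep movsb on x86). */
void copy_n(void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);
}
```

Building this with and without -fno-builtin-memcpy and comparing the objdump -R output (or the -S assembly) should show whether memcpy ends up as an external reference.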
You can always attempt to drive the compiler behavior. For instance, with gcc:
gcc -fno-inline -fno-builtin-inline -fno-inline-functions -fno-builtin...
You should then compare the results with nm, or look directly for the calls in the generated assembly.

Compiling without libc

I want to compile my C-code without the (g)libc. How can I deactivate it and which functions depend on it?
I tried -nostdlib but it doesn't help: The code is compilable and runs, but I can still find the name of the libc in the hexdump of my executable.
If you compile your code with -nostdlib, you won't be able to call any C library functions (of course), but you also don't get the regular C bootstrap code. In particular, the real entry point of a program on Linux is not main(), but rather a function called _start(). The standard libraries normally provide a version of this that runs some initialization code, then calls main().
Try compiling this with gcc -nostdlib -m32:
// Tell the compiler incoming stack alignment is not RSP%16==8 or ESP%16==12
__attribute__((force_align_arg_pointer))
void _start() {
/* main body of program: call main(), etc */
/* exit system call */
asm("movl $1,%eax;"
"xorl %ebx,%ebx;"
"int $0x80"
);
__builtin_unreachable(); // tell the compiler to make sure side effects are done before the asm statement
}
The _start() function should always end with a call to exit (or other non-returning system call such as exec). The above example invokes the system call directly with inline assembly since the usual exit() is not available.
The simplest way is to compile the C code to object files (gcc -c to get some *.o files) and then link them directly with the linker (ld). You will have to link your object files with a few extra object files such as /usr/lib/crt1.o in order to get a working executable (between the entry point, as seen by the kernel, and the main() function, there is a bit of work to do). To know what to link with, try a normal link against glibc using gcc -v: this should show you what normally goes into the executable.
You will find that gcc generates code which may have some dependencies to a few hidden functions. Most of them are in libgcc.a. There may also be hidden calls to memcpy(), memmove(), memset() and memcmp(), which are in the libc, so you may have to provide your own versions (which is not hard, at least as long as you are not too picky about performance).
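A minimal sketch of what "provide your own versions" could look like, written for clarity rather than speed. The fs_ prefix is only there so the example can sit next to a normal libc; in a real -nostdlib build you would name the functions memcpy, memmove, memset and memcmp. One caveat, stated as an assumption to verify on your toolchain: GCC can recognize byte-copy loops like these and turn them back into a call to memcpy, and -fno-tree-loop-distribute-patterns is the option usually suggested to stop that.

```c
#include <stddef.h>

/* Byte-by-byte copy; regions must not overlap. */
void *fs_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
    return dst;
}

/* Copy that tolerates overlap: go forwards when dst is below src,
 * backwards when dst is above src. */
void *fs_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    if (d < s) {
        while (n--)
            *d++ = *s++;
    } else if (d > s) {
        d += n;
        s += n;
        while (n--)
            *--d = *--s;
    }
    return dst;
}

/* Fill n bytes with the low 8 bits of c. */
void *fs_memset(void *dst, int c, size_t n)
{
    unsigned char *d = dst;
    while (n--)
        *d++ = (unsigned char)c;
    return dst;
}

/* Compare as unsigned bytes; returns <0, 0 or >0 like memcmp. */
int fs_memcmp(const void *a, const void *b, size_t n)
{
    const unsigned char *x = a;
    const unsigned char *y = b;
    for (; n--; x++, y++) {
        if (*x != *y)
            return *x < *y ? -1 : 1;
    }
    return 0;
}
```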
Things might get clearer at times if you look at the produced assembly (use the -S flag).

LINUX: Is it possible to write a working program that does not rely on the libc library?

I wonder if I could write a program in the C-programming language that is executable, albeit not using a single library call, e.g. not even exit()?
If so, it obviously wouldn't depend on libraries (libc, ld-linux) at all.
I suspect you could write such a thing, but it would need to have an endless loop at the end, because you can't ask the operating system to exit your process. And you couldn't do anything useful.
Well start with compiling an ELF program, look into the ELF spec and craft together the header, the program segments and the other parts you need for a program. The kernel would load your code and jump to some initial address. You could place an endless loop there. But without knowing some assembler, that's hopeless from the start on anyway.
The start.S file as used by glibc may be useful as a start point. Try to change it so that you can assemble a stand-alone executable out of it. That start.S file is the entry point of all ELF applications, and is the one that calls __libc_start_main which in turn calls main. You just change it so it fits your needs.
Ok, that was theoretical. But now, what practical use does that have?
Answer to the Updated Question
Well. There is a library called libgloss that provides a minimal interface for programs that are meant to run on embedded systems. The newlib C library uses that one as its system-call interface. The general idea is that libgloss is the layer between the C library and the operating system. As such, it also contains the startup files that the operating system jumps into. Both these libraries are part of the GNU binutils project. I've used them to do the interface for another OS and another processor, but there does not seem to be a libgloss port for Linux, so if you make system calls, you will have to do it on your own, as others already stated.
It is absolutely possible to write such programs in the C programming language. The Linux kernel is a good example of such a program. But user programs are possible, too. What is minimally required is a runtime library (if you want to do any serious stuff). It would contain really basic functions, like memcpy, basic macros and so on. The C Standard has a special conformance mode called freestanding, which requires only a very limited set of functionality, suitable also for kernels. Actually, I have no clue about x86 assembler, but I've tried my luck with a very simple C program:
/* gcc -nostdlib start.c */
int main(int, char**, char**);
void _start(int args)
{
/* we do not care about arguments for main. start.S in
* glibc documents how the kernel passes them though.
*/
int c = main(0,0,0);
/* do the system-call for exit. */
asm("movl %0,%%ebx\n" /* first argument */
"movl $1,%%eax\n" /* syscall 1 */
"int $0x80" /* fire interrupt */
: : "r"(c) :"%eax", "%ebx");
}
int main(int argc, char** argv, char** env) {
/* yeah here we can do some stuff */
return 42;
}
We're happy, it actually compiles and runs :)
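The example above ignores the arguments. On Linux, the kernel leaves a block of words at the initial stack pointer: argc, then the argv pointers, a NULL, the environment pointers, and another NULL (the auxiliary vector follows). Getting hold of that stack pointer still takes a little assembly, but once you have it, the decoding can be plain C. A minimal sketch, assuming that layout; struct start_args and decode_initial_stack are hypothetical names, not anything glibc provides:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: the three values a runtime passes to main(). */
struct start_args {
    int    argc;
    char **argv;
    char **envp;
};

/* sp points at the word block the kernel placed on the initial stack:
 *   sp[0]          argc (stored as a pointer-sized integer)
 *   sp[1..argc]    argv[0..argc-1]
 *   sp[argc+1]     NULL terminator of argv
 *   sp[argc+2..]   envp entries, terminated by NULL
 */
struct start_args decode_initial_stack(char **sp)
{
    struct start_args a;
    a.argc = (int)(intptr_t)sp[0];
    a.argv = sp + 1;
    a.envp = a.argv + a.argc + 1;   /* skip argv and its NULL terminator */
    return a;
}
```

A real _start would call this with the stack pointer captured at entry and then pass the three fields on to main().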
Yes, it is possible, however you will have to make system calls and set up your entry point manually.
Example of a minimal program with entry point:
.globl _start
.text
_start:
xorl %eax,%eax
incl %eax
movb $42, %bl
int $0x80
Or in plain C (no exit):
void __attribute__((noreturn)) _start() {
while(1);
}
Compiled with:
gcc -nostdlib -o example example.s
gcc -nostdlib -o example example.c
In pure C? As others have said you still need a way to make syscalls, so you might need to drop down to inline asm for that. That said, if using gcc check out -ffreestanding.
You'd need a way to prevent the C compiler from generating code that depends on libc, which with gcc can be done with -fno-hosted. And you'd need one assembly language routine to implement syscall(2). They're not hard to write if you can get suitable OS doco. After that you'd be off to the races.
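With GCC, that one routine does not have to live in a separate .s file; inline assembly works too. Here is a minimal sketch for x86-64 Linux, where the syscall number goes in rax, the first three arguments in rdi, rsi and rdx, and the syscall instruction clobbers rcx and r11. The names raw_syscall3 and raw_exit are made up for this example. On other targets the sketch simply falls back to libc's syscall() so that it still builds, which of course defeats the purpose there.

```c
#include <sys/syscall.h>   /* SYS_* numbers */
#include <unistd.h>        /* only needed for the non-x86-64 fallback */

/* Issue a system call with up to three arguments.
 * On x86-64 the kernel's raw result is returned: a negative errno value
 * on failure. The libc fallback returns -1 and sets errno instead. */
long raw_syscall3(long nr, long a1, long a2, long a3)
{
#if defined(__linux__) && defined(__x86_64__)
    long ret;
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"(nr), "D"(a1), "S"(a2), "d"(a3)
                      : "rcx", "r11", "memory");
    return ret;
#else
    return syscall(nr, a1, a2, a3);
#endif
}

/* Terminate the process without going through libc's exit(). */
__attribute__((noreturn)) void raw_exit(int status)
{
    raw_syscall3(SYS_exit, status, 0, 0);
    for (;;) { }   /* not reached; keeps the noreturn promise */
}
```

In a -nostdlib program, _start would do its work and finish with raw_exit(code) instead of returning.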
Well, you would need to use some system calls to load all its information into memory, so I doubt it.
And you would almost have to use exit(), just because of the way that Linux works.
Yes you can, but it's pretty tricky.
There is essentially absolutely no point.
You can statically link a program, but then the appropriate pieces of the C library are included in its binary (so it doesn't have any dependencies).
You can completely do without the C library, in which case you need to make system calls using the appropriate low-level interface, which is architecture dependent, and not necessarily int 0x80.
If your goal is making a very small self-contained binary, you might be better off static-linking against something like uClibc.

Resources