Is errno thread-safe? - c

In errno.h, this variable is declared as extern int errno; so my question is, is it safe to check errno value after some calls or use perror() in multi-threaded code. Is this a thread safe variable? If not, then whats the alternative ?
I am using linux with gcc on x86 architecture.

Yes, it is thread safe. On Linux, the global errno variable is thread-specific. POSIX requires that errno be threadsafe.
See http://www.unix.org/whitepapers/reentrant.html
In POSIX.1, errno is defined as an
external global variable. But this
definition is unacceptable in a
multithreaded environment, because its
use can result in nondeterministic
results. The problem is that two or
more threads can encounter errors, all
causing the same errno to be set.
Under these circumstances, a thread
might end up checking errno after it
has already been updated by another
thread.
To circumvent the resulting
nondeterminism, POSIX.1c redefines
errno as a service that can access the
per-thread error number as follows
(ISO/IEC 9945:1-1996, §2.4):
Some functions may provide the error number in a variable accessed
through the symbol errno. The symbol
errno is defined by including the
header , as specified by the
C Standard ... For each thread of a
process, the value of errno shall not
be affected by function calls or
assignments to errno by other threads.
Also see http://linux.die.net/man/3/errno
errno is thread-local; setting it in one thread does not affect its value in any other thread.

Yes
Errno isn't a simple variable anymore, it's something complex behind the scenes, specifically for it to be thread-safe.
See $ man 3 errno:
ERRNO(3) Linux Programmer’s Manual ERRNO(3)
NAME
errno - number of last error
SYNOPSIS
#include <errno.h>
DESCRIPTION
...
errno is defined by the ISO C standard to be a modifiable lvalue of
type int, and must not be explicitly declared; errno may be a macro.
errno is thread-local; setting it in one thread does not affect its
value in any other thread.
We can double-check:
$ cat > test.c
#include <errno.h>
f() { g(errno); }
$ cc -E test.c | grep ^f
f() { g((*__errno_location ())); }
$

In errno.h, this variable is declared as extern int errno;
Here is what the C standard says:
The macro errno need not be the identifier of an object. It might expand to a modifiable lvalue resulting from a function call (for example, *errno()).
Generally, errno is a macro which calls a function returning the address of the error number for the current thread, then dereferences it.
Here is what I have on Linux, in /usr/include/bits/errno.h:
/* Function to get address of global `errno' variable. */
extern int *__errno_location (void) __THROW __attribute__ ((__const__));
# if !defined _LIBC || defined _LIBC_REENTRANT
/* When using threads, errno is a per-thread value. */
# define errno (*__errno_location ())
# endif
In the end, it generates this kind of code:
> cat essai.c
#include <errno.h>
int
main(void)
{
errno = 0;
return 0;
}
> gcc -c -Wall -Wextra -pedantic essai.c
> objdump -d -M intel essai.o
essai.o: file format elf32-i386
Disassembly of section .text:
00000000 <main>:
0: 55 push ebp
1: 89 e5 mov ebp,esp
3: 83 e4 f0 and esp,0xfffffff0
6: e8 fc ff ff ff call 7 <main+0x7> ; get address of errno in EAX
b: c7 00 00 00 00 00 mov DWORD PTR [eax],0x0 ; store 0 in errno
11: b8 00 00 00 00 mov eax,0x0
16: 89 ec mov esp,ebp
18: 5d pop ebp
19: c3 ret

yes, as it is explained in the errno man page and the other replies, errno is a thread local variable.
However, there is a silly detail which could be easily forgotten. Programs should save and restore the errno on any signal handler executing a system call. This is because the signal will be handled by one of the process threads which could overwrite its value.
Therefore, the signal handlers should save and restore errno. Something like:
void sig_alarm(int signo)
{
int errno_save;
errno_save = errno;
//whatever with a system call
errno = errno_save;
}

This is from <sys/errno.h> on my Mac:
#include <sys/cdefs.h>
__BEGIN_DECLS
extern int * __error(void);
#define errno (*__error())
__END_DECLS
So errno is now a function __error(). The function is implemented so as to be thread-safe.

We can check by running a simple program on a machine.
#include <stdio.h>
#include <pthread.h>
#include <errno.h>
#define NTHREADS 5
void *thread_function(void *);
int
main()
{
pthread_t thread_id[NTHREADS];
int i, j;
for(i=0; i < NTHREADS; i++)
{
pthread_create( &thread_id[i], NULL, thread_function, NULL );
}
for(j=0; j < NTHREADS; j++)
{
pthread_join( thread_id[j], NULL);
}
return 0;
}
void *thread_function(void *dummyPtr)
{
printf("Thread number %ld addr(errno):%p\n", pthread_self(), &errno);
}
Running this program and you can see different addresses for errno in each thread. The output of a run on my machine looked like:-
Thread number 140672336922368 addr(errno):0x7ff0d4ac0698
Thread number 140672345315072 addr(errno):0x7ff0d52c1698
Thread number 140672328529664 addr(errno):0x7ff0d42bf698
Thread number 140672320136960 addr(errno):0x7ff0d3abe698
Thread number 140672311744256 addr(errno):0x7ff0d32bd698
Notice that address is different for all threads.

On many Unix systems, compiling with -D_REENTRANT ensures that errno is thread-safe.
For example:
#if defined(_REENTRANT) || _POSIX_C_SOURCE - 0 >= 199506L
extern int *___errno();
#define errno (*(___errno()))
#else
extern int errno;
/* ANSI C++ requires that errno be a macro */
#if __cplusplus >= 199711L
#define errno errno
#endif
#endif /* defined(_REENTRANT) */

I think the answer is "it depends". Thread-safe C runtime libraries usually implement errno as a function call (macro expanding to a function) if you're building threaded code with the correct flags.

Related

x86, amd64: Why SIGTRAP' ucontext instruction pointer does not point to related int3

As the title says - rip from ucontext_t does not point to the int3 that raised a SIGTRAP. Instead it points to the next instruction.
This deviates from my (naive) expectations that every faulted instruction will be retried upon return from signal handler (if context was not changed and process did not explicitly terminate while in the handler).
On the other hand when getting SIGILL - context points to the bad instruction. also on ARM and Aarch64 - SIGTRAP context also points to the related bkpt #0/bpt #0.
Test program:
/* sigtest.c */
#define _GNU_SOURCE
#include <stdio.h>
#include <signal.h>
#include <ucontext.h>
extern void do_int3(void);
extern void do_ud2(void);
void sighandler(int signo, siginfo_t* info, void* context) {
struct ucontext_t* uctx = context;
/* Yes, printf() here is bad. I promise to never ever do this in real programs */
printf("Got signal %d:\n"
"\tsi_addr: %p\n"
"\tcontext RIP: %p\n",
signo,
info->si_addr,
(void*)uctx->uc_mcontext.gregs[REG_RIP]);
}
int main(int argc, char** argv) {
struct sigaction sa = {};
sa.sa_flags = SA_SIGINFO | SA_ONESHOT;
sa.sa_sigaction = sighandler;
sigemptyset(&sa.sa_mask);
sigaction(SIGTRAP, &sa, NULL);
sigaction(SIGILL, &sa, NULL);
void (*fn)(void) = 0;
if (argv[1][0] == 't') {
fn = do_int3;
}
if (argv[1][0] == 'u') {
fn = do_ud2;
}
printf("call function at %p\n", fn);
fn();
printf("call returned\n");
}
; do_int3.S
.text
.globl do_int3
.type do_int3, #function
do_int3:
int3
ret
.size do_int3, .-do_int3
; do_ud2.S
.text
.globl do_ud2
.type do_ud2, #function
do_ud2:
ud2
ret
.size do_ud2, .-do_ud2
Compile and run:
$ cc sigtest.c do_int3.S do_ud2.S
$ ./a.out t
call function at 0x5620b87c6908
Got signal 5:
si_addr: (nil)
context RIP: 0x5620b87c6909
call returned
$ ./a.out u
call function at 0x557d1bc1590a
Got signal 4:
si_addr: 0x557d1bc1590a
context RIP: 0x557d1bc1590a
Illegal instruction (core dumped)
It is easy to notice that for SIGTRAP rip value point to the next instruction rather than to the int3.
Why is instruction pointer adjusted in SIGTRAP context to not retry related int3?
Why is instruction pointer adjusted in SIGTRAP context to not retry related int3?
It's not adjusted by the kernel; the exception-return address pushed by hardware is the one after an int instruction, including int3.
Keep in mind that int3 is only slightly different from the normal case of int n such as int 0x80. int is designed as being like a syscall or far-call, so the (exception) return address is the one after the int. Otherwise an int 0x80 system call would re-run itself forever unless the kernel edited the exception-return info before iret.
So why doesn't Linux's int3 / int 3 handler decrement the saved RIP by 1? For one thing, debuggers can do that in software if they want to, and keeping the kernel simple is better. (For both maintenance, efficiency, and less demangling for software that does want to know the address pushed by hardware.)
For another, a fixed offset of 1 byte wouldn't always be correct: 2-byte CD 03 int 3 raises the same exception as 1-byte CC int3. And even if you wanted to try, x86 machine code doesn't uniquely decode backwards, so obfuscated machine code could give an address that wasn't the actual start of the instruction that executed. (Although it would run as int 3 or int3 if decoded from there). e.g. 2-byte rep int3 is indistinguishable from add al, 0xf3 / int3 if looking backwards.
Software like GDB that inserts an int3 will know what it inserted, and needs to know about the target machine details, so can deal with the offset.

What does make errno thread safe? [duplicate]

In errno.h, this variable is declared as extern int errno; so my question is, is it safe to check errno value after some calls or use perror() in multi-threaded code. Is this a thread safe variable? If not, then whats the alternative ?
I am using linux with gcc on x86 architecture.
Yes, it is thread safe. On Linux, the global errno variable is thread-specific. POSIX requires that errno be threadsafe.
See http://www.unix.org/whitepapers/reentrant.html
In POSIX.1, errno is defined as an
external global variable. But this
definition is unacceptable in a
multithreaded environment, because its
use can result in nondeterministic
results. The problem is that two or
more threads can encounter errors, all
causing the same errno to be set.
Under these circumstances, a thread
might end up checking errno after it
has already been updated by another
thread.
To circumvent the resulting
nondeterminism, POSIX.1c redefines
errno as a service that can access the
per-thread error number as follows
(ISO/IEC 9945:1-1996, §2.4):
Some functions may provide the error number in a variable accessed
through the symbol errno. The symbol
errno is defined by including the
header , as specified by the
C Standard ... For each thread of a
process, the value of errno shall not
be affected by function calls or
assignments to errno by other threads.
Also see http://linux.die.net/man/3/errno
errno is thread-local; setting it in one thread does not affect its value in any other thread.
Yes
Errno isn't a simple variable anymore, it's something complex behind the scenes, specifically for it to be thread-safe.
See $ man 3 errno:
ERRNO(3) Linux Programmer’s Manual ERRNO(3)
NAME
errno - number of last error
SYNOPSIS
#include <errno.h>
DESCRIPTION
...
errno is defined by the ISO C standard to be a modifiable lvalue of
type int, and must not be explicitly declared; errno may be a macro.
errno is thread-local; setting it in one thread does not affect its
value in any other thread.
We can double-check:
$ cat > test.c
#include <errno.h>
f() { g(errno); }
$ cc -E test.c | grep ^f
f() { g((*__errno_location ())); }
$
In errno.h, this variable is declared as extern int errno;
Here is what the C standard says:
The macro errno need not be the identifier of an object. It might expand to a modifiable lvalue resulting from a function call (for example, *errno()).
Generally, errno is a macro which calls a function returning the address of the error number for the current thread, then dereferences it.
Here is what I have on Linux, in /usr/include/bits/errno.h:
/* Function to get address of global `errno' variable. */
extern int *__errno_location (void) __THROW __attribute__ ((__const__));
# if !defined _LIBC || defined _LIBC_REENTRANT
/* When using threads, errno is a per-thread value. */
# define errno (*__errno_location ())
# endif
In the end, it generates this kind of code:
> cat essai.c
#include <errno.h>
int
main(void)
{
errno = 0;
return 0;
}
> gcc -c -Wall -Wextra -pedantic essai.c
> objdump -d -M intel essai.o
essai.o: file format elf32-i386
Disassembly of section .text:
00000000 <main>:
0: 55 push ebp
1: 89 e5 mov ebp,esp
3: 83 e4 f0 and esp,0xfffffff0
6: e8 fc ff ff ff call 7 <main+0x7> ; get address of errno in EAX
b: c7 00 00 00 00 00 mov DWORD PTR [eax],0x0 ; store 0 in errno
11: b8 00 00 00 00 mov eax,0x0
16: 89 ec mov esp,ebp
18: 5d pop ebp
19: c3 ret
yes, as it is explained in the errno man page and the other replies, errno is a thread local variable.
However, there is a silly detail which could be easily forgotten. Programs should save and restore the errno on any signal handler executing a system call. This is because the signal will be handled by one of the process threads which could overwrite its value.
Therefore, the signal handlers should save and restore errno. Something like:
void sig_alarm(int signo)
{
int errno_save;
errno_save = errno;
//whatever with a system call
errno = errno_save;
}
This is from <sys/errno.h> on my Mac:
#include <sys/cdefs.h>
__BEGIN_DECLS
extern int * __error(void);
#define errno (*__error())
__END_DECLS
So errno is now a function __error(). The function is implemented so as to be thread-safe.
We can check by running a simple program on a machine.
#include <stdio.h>
#include <pthread.h>
#include <errno.h>
#define NTHREADS 5
void *thread_function(void *);
int
main()
{
pthread_t thread_id[NTHREADS];
int i, j;
for(i=0; i < NTHREADS; i++)
{
pthread_create( &thread_id[i], NULL, thread_function, NULL );
}
for(j=0; j < NTHREADS; j++)
{
pthread_join( thread_id[j], NULL);
}
return 0;
}
void *thread_function(void *dummyPtr)
{
printf("Thread number %ld addr(errno):%p\n", pthread_self(), &errno);
}
Running this program and you can see different addresses for errno in each thread. The output of a run on my machine looked like:-
Thread number 140672336922368 addr(errno):0x7ff0d4ac0698
Thread number 140672345315072 addr(errno):0x7ff0d52c1698
Thread number 140672328529664 addr(errno):0x7ff0d42bf698
Thread number 140672320136960 addr(errno):0x7ff0d3abe698
Thread number 140672311744256 addr(errno):0x7ff0d32bd698
Notice that address is different for all threads.
On many Unix systems, compiling with -D_REENTRANT ensures that errno is thread-safe.
For example:
#if defined(_REENTRANT) || _POSIX_C_SOURCE - 0 >= 199506L
extern int *___errno();
#define errno (*(___errno()))
#else
extern int errno;
/* ANSI C++ requires that errno be a macro */
#if __cplusplus >= 199711L
#define errno errno
#endif
#endif /* defined(_REENTRANT) */
I think the answer is "it depends". Thread-safe C runtime libraries usually implement errno as a function call (macro expanding to a function) if you're building threaded code with the correct flags.

LD_PRELOAD and linkage

I have this small testcode atfork_demo.c:
#include <stdio.h>
#include <pthread.h>
void hello_from_fork_prepare() {
printf("Hello from atfork prepare.\n");
fflush(stdout);
}
void register_hello_from_fork_prepare() {
pthread_atfork(&hello_from_fork_prepare, 0, 0);
}
Now, I compile it in two different ways:
gcc -shared -fPIC atfork_demo.c -o atfork_demo1.so
gcc -shared -fPIC atfork_demo.c -o atfork_demo2.so -lpthread
My demo main atfork_demo_main.c is this:
#include <dlfcn.h>
#include <stdio.h>
#include <unistd.h>
int main(int argc, const char** argv) {
if(argc <= 1) {
printf("usage: ... lib.so\n");
return 1;
}
void* plib = dlopen("libpthread.so.0", RTLD_NOW|RTLD_GLOBAL);
if(!plib) {
printf("cannot load pthread, error %s\n", dlerror());
return 1;
}
void* lib = dlopen(argv[1], RTLD_LAZY);
if(!lib) {
printf("cannot load %s, error %s\n", argv[1], dlerror());
return 1;
}
void (*reg)();
reg = dlsym(lib, "register_hello_from_fork_prepare");
if(!reg) {
printf("did not found func, error %s\n", dlerror());
return 1;
}
reg();
fork();
}
Which I compile like this:
gcc atfork_demo_main.c -o atfork_demo_main.exec -ldl
Now, I have another small demo atfork_patch.c where I want to override pthread_atfork:
#include <stdio.h>
int pthread_atfork(void (*prepare)(void), void (*parent)(void), void (*child)(void)) {
printf("Ignoring pthread_atfork call!\n");
fflush(stdout);
return 0;
}
Which I compile like this:
gcc -shared -O2 -fPIC patch_atfork.c -o patch_atfork.so
And then I set LD_PRELOAD=./atfork_patch.so, and do these two calls:
./atfork_demo_main.exec ./atfork_demo1.so
./atfork_demo_main.exec ./atfork_demo2.so
In the first case, the LD_PRELOAD-override of pthread_atfork worked and in the second, it did not. I get the output:
Ignoring pthread_atfork call!
Hello from atfork prepare.
So, now to the question(s):
Why did it not work in the second case?
How can I make it work also in the second case, i.e. also override it?
In my real use case, atfork_demo is some library which I cannot change. I also cannot change atfork_demo_main but I can make it load any other code. I would prefer if I can just do it with some change in atfork_patch.
You get some more debug output if you also use LD_DEBUG=all. Maybe interesting is this bit, for the second case:
841: symbol=__register_atfork; lookup in file=./atfork_demo_main.exec [0]
841: symbol=__register_atfork; lookup in file=./atfork_patch_extended.so [0]
841: symbol=__register_atfork; lookup in file=/lib/x86_64-linux-gnu/libdl.so.2 [0]
841: symbol=__register_atfork; lookup in file=/lib/x86_64-linux-gnu/libc.so.6 [0]
841: binding file ./atfork_demo2.so [0] to /lib/x86_64-linux-gnu/libc.so.6 [0]: normal symbol `__register_atfork' [GLIBC_2.3.2]
So, it searches for the symbol __register_atfork. I added that to atfork_patch_extended.so but it doesn't find it and uses it from libc instead. How can I make it find and use my __register_atfork?
As a side note, my main goal is to ignore the atfork handlers when fork() is called, but this is not the question here, but actually here. One solution to that, which seems to work, is to override fork() itself by this:
pid_t fork(void) {
return syscall(SYS_clone, SIGCHLD, 0);
}
Before answering this question, I would stress that this is a really bad idea for any production application.
If you are using a third party library that puts such constraints in place, then think about an alternative solution, such as forking early to maintain a "helper" process, with a pipe between you and it... then, when you need to call exec(), you can request that it does the work (fork(), exec()) on your behalf.
Patching or otherwise side-stepping the services of a system call such as pthread_atfork() is just asking for trouble (missed events, memory leaks, crashes, etc...).
As #Sergio pointed out, pthread_atfork() is actually built into atfork_demo2.so, so you can't do anything to override it... However examining the disassembly / source of pthread_atfork() gives you a decent hint about how achieve what you're asking:
0000000000000830 <__pthread_atfork>:
830: 48 8d 05 f9 07 20 00 lea 0x2007f9(%rip),%rax # 201030 <__dso_handle>
837: 48 85 c0 test %rax,%rax
83a: 74 0c je 848 <__pthread_atfork+0x18>
83c: 48 8b 08 mov (%rax),%rcx
83f: e9 6c fe ff ff jmpq 6b0 <__register_atfork#plt>
844: 0f 1f 40 00 nopl 0x0(%rax)
848: 31 c9 xor %ecx,%ecx
84a: e9 61 fe ff ff jmpq 6b0 <__register_atfork#plt>
or the source (from here):
int
pthread_atfork (void (*prepare) (void),
void (*parent) (void),
void (*child) (void))
{
return __register_atfork (prepare, parent, child, &__dso_handle == NULL ? NULL : __dso_handle);
}
As you can see, pthread_atfork() does nothing aside from calling __register_atfork()... so patch that instead!
The content of atfork_patch.c now becomes: (using __register_atfork()'s prototype, from here / here)
#include <stdio.h>
int __register_atfork (void (*prepare) (void), void (*parent) (void),
void (*child) (void), void *dso_handle) {
printf("Ignoring pthread_atfork call!\n");
fflush(stdout);
return 0;
}
This works for both demos:
$ LD_PRELOAD=./atfork_patch.so ./atfork_demo_main.exec ./atfork_demo1.so
Ignoring pthread_atfork call!
$ LD_PRELOAD=./atfork_patch.so ./atfork_demo_main.exec ./atfork_demo2.so
Ignoring pthread_atfork call!
It doesn't work for the second case because there is nothing to override. Your second library is linked statically with pthread library:
$ readelf --symbols atfork_demo1.so | grep pthread_atfork
7: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND pthread_atfork
54: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND pthread_atfork
$ readelf --symbols atfork_demo2.so | grep pthread_atfork
41: 0000000000000000 0 FILE LOCAL DEFAULT ABS pthread_atfork.c
47: 0000000000000830 31 FUNC LOCAL DEFAULT 12 __pthread_atfork
49: 0000000000000830 31 FUNC LOCAL DEFAULT 12 pthread_atfork
So it will use local pthread_atfork each time, regardless of LD_PRELOAD or any other loaded libraries.
How to overcome that? Looks like for described configuration it is not possible since you need to modify atfork_demo library or main executable anyway.

unix socket error 14: EFAULT (bad address)

I have a very simple question, but I have not managed to find any answers to it all weekend. I am using the sendto() function and it is returning error code 14: EFAULT. The man pages describe it as:
"An invalid user space address was specified for an argument."
I was convinced that this was talking about the IP address I was specifying, but now I suspect it may be the memory address of the message buffer that it is referring to - I can't find any clarification on this anywhere, can anyone clear this up?
EFAULT It happen if the memory address of some argument passed to sendto (or more generally to any system call) is invalid. Think of it as a sort of SIGSEGV in kernel land regarding your syscall. For instance, if you pass a null or invalid buffer pointer (for reading, writing, sending, recieving...), you get that
See errno(3), sendto(2) etc... man pages.
EFAULT is not related to IP addresses at all.
Minimal runnable example with getcpu
Just to make things more concrete, we can have a look at the getcpu system call, which is very simple to understand, and shows the same EFAULT behaviour.
From man getcpu we see that the signature is:
int getcpu(unsigned *cpu, unsigned *node, struct getcpu_cache *tcache);
and the memory pointed to by the cpu will contain the ID of the current CPU the process is running on after the syscall, the only possible error being:
ERRORS
EFAULT Arguments point outside the calling process's address space.
So we can test it out with:
main.c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/syscall.h>
int main(void) {
int err, ret;
unsigned cpu;
/* Correct operation. */
assert(syscall(SYS_getcpu, &cpu, NULL, NULL) == 0);
printf("%u\n", cpu);
/* Bad trash address == 1. */
ret = syscall(SYS_getcpu, 1, NULL, NULL);
err = errno;
assert(ret == -1);
printf("%d\n", err);
perror("getcpu");
return EXIT_SUCCESS;
}
compile and run:
gcc -ggdb3 -O0 -std=c99 -Wall -Wextra -pedantic -o main.out main.c
./main.out
Sample output:
cpu 3
errno 14
getcpu: Bad address
so we see that the bad call with a trash address of 1 returned 14, which is EFAULT as seen from kernel code: https://stackoverflow.com/a/53958705/895245
Remember that the syscall itself returns -14, and then the syscall C wrapper detects that it is an error due to being negative, returns -1, and sets errno to the actual precise error code.
And since the syscall is so simple, we can confirm this from the kernel 5.4 implementation as well at kernel/sys.c:
SYSCALL_DEFINE3(getcpu, unsigned __user *, cpup, unsigned __user *, nodep,
struct getcpu_cache __user *, unused)
{
int err = 0;
int cpu = raw_smp_processor_id();
if (cpup)
err |= put_user(cpu, cpup);
if (nodep)
err |= put_user(cpu_to_node(cpu), nodep);
return err ? -EFAULT : 0;
}
so clearly we see that -EFAULT is returned if there is a problem with put_user.
It is worth mentioning that my glibc does have a getcpu wrapper as well in sched.h, but that implementation segfaults in case of bad addresses, which is a bit confusing: How do I include Linux header files like linux/getcpu.h? But it is not what the actual syscall does to the process, just whatever glibc is doing with that address.
Tested on Ubuntu 20.04, Linux 5.4.
EFAULT is a macro defined in a file "include/uapi/asm-generic/errno-base.h"
#define EFAULT 14 /* Bad address */

How do I start threads in plain C?

I have used fork() in C to start another process. How do I start a new thread?
Since you mentioned fork() I assume you're on a Unix-like system, in which case POSIX threads (usually referred to as pthreads) are what you want to use.
Specifically, pthread_create() is the function you need to create a new thread. Its arguments are:
int pthread_create(pthread_t * thread, pthread_attr_t * attr, void *
(*start_routine)(void *), void * arg);
The first argument is the returned pointer to the thread id. The second argument is the thread arguments, which can be NULL unless you want to start the thread with a specific priority. The third argument is the function executed by the thread. The fourth argument is the single argument passed to the thread function when it is executed.
AFAIK, ANSI C doesn't define threading, but there are various libraries available.
If you are running on Windows, link to msvcrt and use _beginthread or _beginthreadex.
If you are running on other platforms, check out the pthreads library (I'm sure there are others as well).
C11 threads + C11 atomic_int
Added to glibc 2.28. Tested in Ubuntu 18.10 amd64 (comes with glic 2.28) and Ubuntu 18.04 (comes with glibc 2.27) by compiling glibc 2.28 from source: Multiple glibc libraries on a single host
Example adapted from: https://en.cppreference.com/w/c/language/atomic
main.c
#include <stdio.h>
#include <threads.h>
#include <stdatomic.h>
atomic_int atomic_counter;
int non_atomic_counter;
int mythread(void* thr_data) {
(void)thr_data;
for(int n = 0; n < 1000; ++n) {
++non_atomic_counter;
++atomic_counter;
// for this example, relaxed memory order is sufficient, e.g.
// atomic_fetch_add_explicit(&atomic_counter, 1, memory_order_relaxed);
}
return 0;
}
int main(void) {
thrd_t thr[10];
for(int n = 0; n < 10; ++n)
thrd_create(&thr[n], mythread, NULL);
for(int n = 0; n < 10; ++n)
thrd_join(thr[n], NULL);
printf("atomic %d\n", atomic_counter);
printf("non-atomic %d\n", non_atomic_counter);
}
GitHub upstream.
Compile and run:
gcc -ggdb3 -std=c11 -Wall -Wextra -pedantic -o main.out main.c -pthread
./main.out
Possible output:
atomic 10000
non-atomic 4341
The non-atomic counter is very likely to be smaller than the atomic one due to racy access across threads to the non-atomic variable.
See also: How to do an atomic increment and fetch in C?
Disassembly analysis
Disassemble with:
gdb -batch -ex "disassemble/rs mythread" main.out
contains:
17 ++non_atomic_counter;
0x00000000004007e8 <+8>: 83 05 65 08 20 00 01 addl $0x1,0x200865(%rip) # 0x601054 <non_atomic_counter>
18 __atomic_fetch_add(&atomic_counter, 1, __ATOMIC_SEQ_CST);
0x00000000004007ef <+15>: f0 83 05 61 08 20 00 01 lock addl $0x1,0x200861(%rip) # 0x601058 <atomic_counter>
so we see that the atomic increment is done at the instruction level with the f0 lock prefix.
With aarch64-linux-gnu-gcc 8.2.0, we get instead:
11 ++non_atomic_counter;
0x0000000000000a28 <+24>: 60 00 40 b9 ldr w0, [x3]
0x0000000000000a2c <+28>: 00 04 00 11 add w0, w0, #0x1
0x0000000000000a30 <+32>: 60 00 00 b9 str w0, [x3]
12 ++atomic_counter;
0x0000000000000a34 <+36>: 40 fc 5f 88 ldaxr w0, [x2]
0x0000000000000a38 <+40>: 00 04 00 11 add w0, w0, #0x1
0x0000000000000a3c <+44>: 40 fc 04 88 stlxr w4, w0, [x2]
0x0000000000000a40 <+48>: a4 ff ff 35 cbnz w4, 0xa34 <mythread+36>
so the atomic version actually has a cbnz loop that runs until the stlxr store succeed. Note that ARMv8.1 can do all of that with a single LDADD instruction.
This is analogous to what we get with C++ std::atomic: What exactly is std::atomic?
Benchmark
TODO. Crate a benchmark to show that atomic is slower.
POSIX threads
main.c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <pthread.h>
enum CONSTANTS {
NUM_THREADS = 1000,
NUM_ITERS = 1000
};
int global = 0;
int fail = 0;
pthread_mutex_t main_thread_mutex = PTHREAD_MUTEX_INITIALIZER;
void* main_thread(void *arg) {
int i;
for (i = 0; i < NUM_ITERS; ++i) {
if (!fail)
pthread_mutex_lock(&main_thread_mutex);
global++;
if (!fail)
pthread_mutex_unlock(&main_thread_mutex);
}
return NULL;
}
int main(int argc, char **argv) {
pthread_t threads[NUM_THREADS];
int i;
fail = argc > 1;
for (i = 0; i < NUM_THREADS; ++i)
pthread_create(&threads[i], NULL, main_thread, NULL);
for (i = 0; i < NUM_THREADS; ++i)
pthread_join(threads[i], NULL);
assert(global == NUM_THREADS * NUM_ITERS);
return EXIT_SUCCESS;
}
Compile and run:
gcc -std=c99 -Wall -Wextra -pedantic -o main.out main.c -pthread
./main.out
./main.out 1
The first run works fine, the second fails due to missing synchronization.
There don't seem to be POSIX standardized atomic operations: UNIX Portable Atomic Operations
Tested on Ubuntu 18.04. GitHub upstream.
GCC __atomic_* built-ins
For those that don't have C11, you can achieve atomic increments with the __atomic_* GCC extensions.
main.c
#define _XOPEN_SOURCE 700
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>
enum Constants {
NUM_THREADS = 1000,
};
int atomic_counter;
int non_atomic_counter;
void* mythread(void *arg) {
(void)arg;
for (int n = 0; n < 1000; ++n) {
++non_atomic_counter;
__atomic_fetch_add(&atomic_counter, 1, __ATOMIC_SEQ_CST);
}
return NULL;
}
int main(void) {
int i;
pthread_t threads[NUM_THREADS];
for (i = 0; i < NUM_THREADS; ++i)
pthread_create(&threads[i], NULL, mythread, NULL);
for (i = 0; i < NUM_THREADS; ++i)
pthread_join(threads[i], NULL);
printf("atomic %d\n", atomic_counter);
printf("non-atomic %d\n", non_atomic_counter);
}
Compile and run:
gcc -ggdb3 -O3 -std=c99 -Wall -Wextra -pedantic -o main.out main.c -pthread
./main.out
Output and generated assembly: the same as the "C11 threads" example.
Tested in Ubuntu 16.04 amd64, GCC 6.4.0.
pthreads is a good start, look here
Threads are not part of the C standard, so the only way to use threads is to use some library (eg: POSIX threads in Unix/Linux, _beginthread/_beginthreadex if you want to use the C-runtime from that thread or just CreateThread Win32 API)
Check out the pthread (POSIX thread) library.

Resources