Corrupt Valgrind?

I have the following simple program:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <pthread.h>
#include <assert.h>
#include <errno.h>
#define handle_error_en(en, msg) \
{do { errno = en; perror(msg); exit(EXIT_FAILURE); } while (0);}
#define status_check(stat) ((stat != 0) ? (handle_error_en(stat, "status error")) : ((void)0))
static void* thread_start(void *arg)
{
printf("%s\n", "thread working..");
sleep(1);
return NULL;
}
int main(int argc, char const *argv[])
{
int status;
pthread_t thread;
status = pthread_create(&thread,
NULL,
thread_start,
NULL);
status_check(status);
status = pthread_join(thread, NULL);
status_check(status);
printf("%s\n", "exit program..");
return 0;
}
When I run Valgrind with
valgrind --tool=helgrind ./threads_simple
I get 54 errors, all of them data race errors. Picking just one of them:
Possible data race during write of size 4 at 0x1003FD2D8 by thread #1
==17334== Locks held: none
==17334== at 0x1003E2FDF: spin_unlock (in /usr/lib/system/libsystem_platform.dylib)
==17334== by 0x1003F7EAF: pthread_join (in /usr/lib/system/libsystem_pthread.dylib)
==17334== by 0x100000E37: main (threads_simple.c:29)
==17334==
==17334== This conflicts with a previous write of size 4 by thread #2
==17334== Locks held: none
==17334== at 0x1003E2FDF: spin_unlock (in /usr/lib/system/libsystem_platform.dylib)
==17334== by 0x1003F6FD6: _pthread_start (in /usr/lib/system/libsystem_pthread.dylib)
==17334== by 0x1003F43EC: thread_start (in /usr/lib/system/libsystem_pthread.dylib)
==17334== Address 0x1003fd2d8 is in the Data segment of /usr/lib/system/libsystem_pthread.dylib
I have written many more small threaded programs, and all of them give errors like this. Is my Valgrind installation corrupt? I installed it on a Mac via brew.

Related

Why does replacing malloc cause a segmentation fault (__GI__IO_file_overflow)?

I tried to mimic the way jemalloc replaces ptmalloc by replacing malloc myself, but the replacement immediately resulted in a segmentation fault.
code1.c:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
int main(int argc, char *argv[])
{
void *ptr = malloc(10);
printf("%p\n", ptr);
return EXIT_SUCCESS;
}
code2.c:
#include <stdlib.h>
#include <stdint.h>
void *malloc(size_t size)
{
return (void *)10;
}
Compile instructions
gcc -c code2.c
ar r libcode2.a code2.o
gcc code1.c -L. -lcode2 -g
gdb
Breakpoint 1, main (argc=1, argv=0x7fffffffe318) at code1.c
17 void *ptr = malloc(10);
(gdb) s
18 printf("%p\n", ptr);
(gdb) p ptr
$1 = (void *) 0xa
(gdb) n
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7e46658 in __GI__IO_file_overflow () from /lib64/libc.so.6
Regarding "the replacement resulted in a direct segment error":
If you replace malloc with a non-functional variant, you had better not call any libc functions that may use malloc in their implementation.
Here you called printf, which itself uses malloc internally. Use GDB's where command to see where the crash happened.
I am actually surprised your program made it as far as main(). I expected it to crash much earlier (thousands of instructions are executed long before main is reached).
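For contrast, here is a minimal sketch of a replacement that libc can actually use. It is a simple bump allocator, not how jemalloc works, and it assumes a single-threaded program that never needs more than the fixed arena. ARENA_SIZE, ALIGNMENT and header_t are names made up for this illustration. It defines malloc, free, calloc and realloc together, since libc may call any of them.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARENA_SIZE (1u << 20)   /* hypothetical fixed pool: 1 MiB */
#define ALIGNMENT  16u

/* Static storage is zero-initialized and needs no setup, so the allocator
   works even for calls made before main() runs. */
static _Alignas(16) unsigned char arena[ARENA_SIZE];
static size_t arena_used;

/* Each block is preceded by a 16-byte header recording the requested size. */
typedef union { size_t size; unsigned char pad[ALIGNMENT]; } header_t;

void *malloc(size_t size)
{
    if (size > ARENA_SIZE - sizeof(header_t))
        return NULL;
    size_t rounded = (size + ALIGNMENT - 1) & ~(size_t)(ALIGNMENT - 1);
    if (sizeof(header_t) + rounded > ARENA_SIZE - arena_used)
        return NULL;                       /* arena exhausted */
    header_t *h = (header_t *)(arena + arena_used);
    h->size = size;
    arena_used += sizeof(header_t) + rounded;
    return h + 1;                          /* payload follows the header */
}

/* A bump allocator never reuses memory, so free is a no-op. */
void free(void *ptr) { (void)ptr; }

void *calloc(size_t n, size_t size)
{
    if (size != 0 && n > SIZE_MAX / size)
        return NULL;                       /* multiplication would overflow */
    /* Arena memory is never reused, so it is still zero-filled. */
    return malloc(n * size);
}

void *realloc(void *ptr, size_t size)
{
    if (ptr == NULL)
        return malloc(size);
    header_t *h = (header_t *)ptr - 1;
    void *p = malloc(size);
    if (p != NULL)
        memcpy(p, ptr, h->size < size ? h->size : size);
    return p;
}
```

Linked in place of code2.c, this should let printf in code1.c obtain its stdout buffer from the arena instead of dereferencing the bogus address 0xa. It is not thread-safe and leaks by design, so treat it as a demonstration only.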

How to get AddressSanitizer output when sending SIGINT to halt a loop

When I compile this simple test program I get the obvious leak report from AddressSanitizer, but when I compile the same program with an infinite loop and break it by sending SIGINT, I don't get any output.
Checking the asm output, the malloc is not optimized away (if that is possible at all).
Is this the expected behavior of AddressSanitizer? I don't encounter this problem in other projects.
Working example:
#include <stdlib.h>
int main(void)
{
char *a = malloc(1024);
return 1;
}
Not working (kill with SIGINT):
#include <stdlib.h>
int main(void)
{
char *a = malloc(1024);
for(;;);
return 1;
}
compile: gcc test.c -o test -fsanitize=address
I encountered this problem in a full program but reduced it to this minimal example.
I tried many approaches, including exit() and abort() calls; this one works:
#include <stdlib.h>
#include <signal.h>
#include <stdio.h>
#include <setjmp.h>
jmp_buf jmpbuf;
void handler (int signum) {
printf("handler %d \n", signum);
// we jump from here to main()
// and then call return
longjmp(jmpbuf, 1);
}
int main(int argc, char *argv[])
{
if (setjmp(jmpbuf)) {
// we are in signal context here
return 2;
}
signal(SIGINT, handler);
signal(SIGTERM, handler);
char *a = malloc(1024);
while (argc - 1);
return 1;
}
Results in:
> gcc file.c -fsanitize=address && timeout 1 ./a.out arg
handler 15
=================================================================
==12970==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 1024 byte(s) in 1 object(s) allocated from:
#0 0x7f4798c9bd99 in __interceptor_malloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cc:86
#1 0x5569e64e0acd in main (/tmp/a.out+0xacd)
#2 0x7f479881206a in __libc_start_main (/usr/lib/libc.so.6+0x2306a)
SUMMARY: AddressSanitizer: 1024 byte(s) leaked in 1 allocation(s).
I guess that the AddressSanitizer functions are executed after main returns.
The code responsible for printing that error output is called as a destructor (fini) procedure. Since your program terminates without calling any of the process destructors (due to the SIGINT), you do not get any error printouts.
__lsan_do_leak_check() can help you avoid the longjmp used in @KamilCuk's answer:
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <sanitizer/lsan_interface.h>
void handler (int signum) {
__lsan_do_leak_check();
}
int main(int argc, char *argv[])
{
signal(SIGINT, handler);
char *a = malloc(1024);
a=0;
printf("lost pointer\n");
for(;;);
return 1;
}
Demo:
clang test.c -fsanitize=address -fno-omit-frame-pointer -g -O0 -o test && ./test
lost pointer
C-c C-c
=================================================================
==29365==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 1024 byte(s) in 1 object(s) allocated from:
#0 0x4c9ca3 in malloc (/home/bjacob/test+0x4c9ca3)
#1 0x4f9187 in main /home/bjacob/test.c:13:14
#2 0x7fbc9898409a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
SUMMARY: AddressSanitizer: 1024 byte(s) leaked in 1 allocation(s).
Note that I added a=0 to create a leak.
I also added a printf. Without that printf, the leak error is not printed. I suspect the compiler optimized out the use of the variable "a" despite my using the -O0 compiler option.

Unknown error when setting thread policy to SCHED_RR

I get an unknown error code (it's actually 48) when trying to set the scheduling policy to SCHED_RR for my thread.
Here is a sample of my code:
#include <sched.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h> /* for strerror() */
int main() {
pthread_attr_t attr;
pthread_attr_init(&attr);
pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
int ret = pthread_attr_setschedpolicy(&attr, SCHED_RR);
printf("ret: %s\n", strerror(ret));
return 0;
}
Trace:
ret: Unknown error
Why is that so? It's not EPERM like I've seen in other questions.
I'm on Windows 7 using cygwin.
If you read the documentation of pthreads in cygwin:
https://sourceware.org/pthreads-win32/announcement.html
you can see that only SCHED_OTHER is supported:
pthread_attr_setschedpolicy (only supports SCHED_OTHER)
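If the code has to run on platforms that may reject SCHED_RR, one way to cope is to check the return value and fall back to SCHED_OTHER. A minimal sketch follows; set_policy_with_fallback is a hypothetical helper, not part of any pthreads API.

```c
#include <pthread.h>
#include <sched.h>

/* Hypothetical helper: try the preferred policy first, then SCHED_OTHER.
   Returns the policy now stored in attr, or -1 if neither could be set. */
int set_policy_with_fallback(pthread_attr_t *attr, int preferred)
{
    /* pthread_attr_* functions return an error number, they do not set errno. */
    if (pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED) != 0)
        return -1;
    if (pthread_attr_setschedpolicy(attr, preferred) == 0)
        return preferred;
    if (pthread_attr_setschedpolicy(attr, SCHED_OTHER) == 0)
        return SCHED_OTHER;
    return -1;
}
```

The caller can compare the result with the requested policy and log a warning when the fallback was taken, rather than printing an error string that the platform may not know.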

Portable way to programmatically detect where a signal occurred

Assuming the following code (main.c):
#include <unistd.h>
#include <signal.h>
void handler(int sig)
{
pause(); /* line 7 */
}
int main(void)
{
signal(SIGALRM, handler);
alarm(1);
pause();
}
When I run this in gdb, set a breakpoint inside handler(), run the code and wait a second, I can do the following:
(gdb) b 7
Breakpoint 1 at 0x4005b7: file main.c, line 7.
(gdb) r
Starting program: a.out
Breakpoint 1, handler (sig=14) at main.c:7
7 pause();
(gdb) bt
#0 handler (sig=14) at main.c:7
#1 <signal handler called>
#2 0x00007ffff7afd410 in __pause_nocancel () at ../sysdeps/unix/syscall-template.S:82
#3 0x00000000004005e0 in main () at main.c:14
Is there a portable way to get 0x00007ffff7afd410 or 0x00000000004005e0?
With sigaction instead of signal the handler is called with the ucontext of the location where the signal occurred:
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <signal.h>
#include <ucontext.h>
static void handler(int sig, siginfo_t *siginfo, void *context)
{
ucontext_t *ucontext = context;
printf("rip %p\n", (void *)ucontext->uc_mcontext.gregs[REG_RIP]);
pause();
}
int main(void)
{
struct sigaction sact;
memset(&sact, 0, sizeof sact);
sact.sa_sigaction = handler;
sact.sa_flags = SA_SIGINFO;
if (sigaction(SIGALRM, &sact, NULL) < 0) {
perror("sigaction");
return 1;
}
alarm(1);
pause();
return 0;
}
rip output and gdb bt output:
(gdb) b 13
Breakpoint 1 at 0x4006de: file main.c, line 13.
(gdb) r
Starting program: /home/osboxes/a.out
rip 0x7ffff7ae28a0
Breakpoint 1, handler (sig=14, siginfo=0x7fffffffdf70, context=0x7fffffffde40)
at main.c:13
13 pause();
(gdb) bt
#0 handler (sig=14, siginfo=0x7fffffffdf70, context=0x7fffffffde40)
at main.c:13
#1 <signal handler called>
#2 0x00007ffff7ae28a0 in __pause_nocancel () from /lib64/libc.so.6
#3 0x0000000000400758 in main () at main.c:28
Not extremely portable I guess, but backtrace(3) is available in glibc and a few other libc's:
backtrace() returns a backtrace for the calling program, in the array
pointed to by buffer. A backtrace is the series of currently active
function calls for the program.
You'd have to check how many entries up the stack you need to look. It should be consistent for Linux at least.
If you want to translate the backtrace to something resembling gdb's display, you could use addr2line(1) from binutils. With something like
popen("addr2line -Cfip -e ./myprog", "w")
you could even do it at runtime by writing addresses (as strings) to the FILE* you get back.

pthread_cond_broadcast broken with dlsym?

I am trying to interpose calls to pthread_cond_broadcast using the LD_PRELOAD mechanism. My interposed pthread_cond_broadcast function just calls the original pthread_cond_broadcast. However, for a very simple pthread program where both pthread_cond_wait and pthread_cond_broadcast get invoked, I either end up with a segfault in glibc (for glibc 2.11.1) or the program hangs (for glibc 2.15). Any clues on what is going on?
The interposition code (that gets compiled as a shared library):
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <dlfcn.h>
static int (*orig_pthread_cond_broadcast)(pthread_cond_t *cond) = NULL;
__attribute__((constructor))
static void start() {
orig_pthread_cond_broadcast =
(int (*)()) dlsym(RTLD_NEXT, "pthread_cond_broadcast");
if (orig_pthread_cond_broadcast == NULL) {
printf("pthread_cond_broadcast not found!!!\n");
exit(1);
}
}
__attribute__((__visibility__("default")))
int pthread_cond_broadcast(pthread_cond_t *cond) {
return orig_pthread_cond_broadcast(cond);
}
The simple pthread program:
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>
pthread_mutex_t cond_mutex;
pthread_cond_t cond_var;
int condition;
void *thread0_work(void *arg) {
pthread_mutex_lock(&cond_mutex);
printf("Signal\n");
condition = 1;
pthread_cond_broadcast(&cond_var);
pthread_mutex_unlock(&cond_mutex);
return NULL;
}
void *thread1_work(void *arg) {
pthread_mutex_lock(&cond_mutex);
while (condition == 0) {
printf("Wait\n");
pthread_cond_wait(&cond_var, &cond_mutex);
printf("Done waiting\n");
}
pthread_mutex_unlock(&cond_mutex);
return NULL;
}
int main() {
pthread_t thread1;
pthread_mutex_init(&cond_mutex, NULL);
pthread_cond_init(&cond_var, NULL);
pthread_create(&thread1, NULL, thread1_work, NULL);
// Slowdown this thread, so the thread 1 does pthread_cond_wait.
usleep(1000);
thread0_work(NULL);
pthread_join(thread1, NULL);
return 0;
}
EDIT:
For glibc 2.11.1, gdb bt gives:
(gdb) set environment LD_PRELOAD=./libintercept.so
(gdb) run
Starting program: /home/seguljac/intercept/main
[Thread debugging using libthread_db enabled]
[New Thread 0x7ffff7436700 (LWP 19165)]
Wait
Signal
Before pthread_cond_broadcast
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff79ca0e7 in pthread_cond_broadcast@@GLIBC_2.3.2 () from /lib/libpthread.so.0
(gdb) bt
#0 0x00007ffff79ca0e7 in pthread_cond_broadcast@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#1 0x00007ffff7bdb769 in pthread_cond_broadcast () from ./libintercept.so
#2 0x00000000004008e8 in thread0_work ()
#3 0x00000000004009a4 in main ()
EDIT 2:
(Solved)
As suggested by R.. (thanks!), the issue is that on my platform pthread_cond_broadcast is a versioned symbol, and dlsym gives the wrong version. This blog explains this situation in great detail: http://blog.fesnel.com/blog/2009/08/25/preloading-with-multiple-symbol-versions/
The call through your function seems to end up in a different version of the function:
With LD_PRELOAD: __pthread_cond_broadcast_2_0 (cond=0x804a060) at old_pthread_cond_broadcast.c:37
Without LD_PRELOAD: pthread_cond_broadcast@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/pthread_cond_broadcast.S:39
So your situation is similar to the question "symbol versioning and dlsym", i.e. you are getting incompatible versions of the pthread functions.
This page gives one way to solve the problem, though a bit complex: http://blog.fesnel.com/blog/2009/08/25/preloading-with-multiple-symbol-versions/
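A simpler route on glibc is to request the version explicitly with dlvsym(), a GNU extension. The sketch below assumes the version string "GLIBC_2.3.2" seen in the gdb output above; other architectures or glibc builds may export a different version, so it falls back to plain dlsym() when the versioned lookup fails. The names cond_broadcast_fn, resolve_cond_broadcast and init_interposer are made up for this example.

```c
#define _GNU_SOURCE          /* needed for RTLD_NEXT and dlvsym */
#include <assert.h>
#include <dlfcn.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

typedef int (*cond_broadcast_fn)(pthread_cond_t *);

static cond_broadcast_fn real_cond_broadcast;

static cond_broadcast_fn resolve_cond_broadcast(void)
{
    /* Assumption: "GLIBC_2.3.2" matches the default version on this system,
       as in the backtrace above. Check with objdump -T or readelf if unsure. */
    void *sym = dlvsym(RTLD_NEXT, "pthread_cond_broadcast", "GLIBC_2.3.2");
    if (sym == NULL)
        /* Unversioned fallback; on old glibc this may pick the old symbol. */
        sym = dlsym(RTLD_NEXT, "pthread_cond_broadcast");
    return (cond_broadcast_fn)sym;
}

__attribute__((constructor))
static void init_interposer(void)
{
    real_cond_broadcast = resolve_cond_broadcast();
    assert(real_cond_broadcast != NULL);
}

__attribute__((__visibility__("default")))
int pthread_cond_broadcast(pthread_cond_t *cond)
{
    /* Lazy lookup in case another constructor calls us before ours ran. */
    if (real_cond_broadcast == NULL)
        real_cond_broadcast = resolve_cond_broadcast();
    if (real_cond_broadcast == NULL)
        return ENOSYS;
    return real_cond_broadcast(cond);
}
```

Built as a shared library (for example with gcc -shared -fPIC, adding -ldl on older glibc), this should forward to the same version the program would have bound to without LD_PRELOAD. If you also interpose pthread_cond_wait or pthread_cond_signal, resolve them with the same version so all condition-variable calls agree on the structure layout.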
