Deadlock (fork + malloc) libc (glibc-2.17, glibc-2.23) - c

I'm running into a pretty annoying problem:
I have a program which creates one thread at start-up; this thread launches other things during its execution (a fork() immediately followed by an execve()).
Here is the backtrace of both threads at the point where my program (I think) reached the deadlock:
Thread 2 (LWP 8839):
#0 0x00007ffff6cdf736 in __libc_fork () at ../sysdeps/nptl/fork.c:125
#1 0x00007ffff6c8f8c0 in _IO_new_proc_open (fp=fp@entry=0x7ffff00031d0, command=command@entry=0x7ffff6c26e20 "ps -u brejon | grep \"cvc\"
#2 0x00007ffff6c8fbcc in _IO_new_popen (command=0x7ffff6c26e20 "ps -u user | grep \"cvc\" | wc -l", mode=0x42c7fd "r") at iopopen.c:296
#3-4 ...
#5 0x00007ffff74d9434 in start_thread (arg=0x7ffff6c27700) at pthread_create.c:333
#6 0x00007ffff6d0fcfd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Thread 1 (LWP 8835):
#0 __lll_lock_wait_private () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
#1 0x00007ffff6ca0ad9 in malloc_atfork (sz=140737337120848, caller=) at arena.c:179
#2 0x00007ffff6c8d875 in __GI__IO_file_doallocate (fp=0x17a72230) at filedoalloc.c:127
#3 0x00007ffff6c9a964 in __GI__IO_doallocbuf (fp=fp@entry=0x17a72230) at genops.c:398
#4 0x00007ffff6c99de8 in _IO_new_file_overflow (f=0x17a72230, ch=-1) at fileops.c:820
#5 0x00007ffff6c98f8a in _IO_new_file_xsputn (f=0x17a72230, data=0x17a16420, n=682) at fileops.c:1331
#6 0x00007ffff6c6fcb2 in _IO_vfprintf_internal (s=0x17a72230, format=, ap=ap@entry=0x7fffffffcf18) at vfprintf.c:1632
#7 0x00007ffff6c76a97 in __fprintf (stream=, format=) at fprintf.c:32
#8-11 ...
#12 0x000000000042706e in main (argc=3, argv=0x7fffffffd698, envp=0x7fffffffd6b8) at mains/ignore/.c:146
Both threads stay stuck there forever, with both glibc-2.17 and glibc-2.23.
Any help is welcome :'D
EDIT:
Here is a minimal example:
1 #include <stdlib.h>
2 #include <pthread.h>
3 #include <unistd.h>
4
5 void * thread_handler(void * args)
6 {
7     char * argv[] = { "/usr/bin/ls" };
8     char * newargv[] = { "/usr/bin/ls", NULL };
9     char * newenviron[] = { NULL };
10     while (1) {
11         if (vfork() == 0) {
12             execve(argv[0], newargv, newenviron);
13         }
14     }
15
16     return 0;
17 }
18
19 int main(void)
20 {
21     pthread_t thread;
22     pthread_create(&thread, NULL, thread_handler, NULL);
23
24     int * dummy_alloc;
25
26     while (1) {
27         dummy_alloc = malloc(sizeof(int));
28         free(dummy_alloc);
29     }
30
31     return 0;
32 }
Environment:
user:deadlock$ cat /etc/redhat-release
Scientific Linux release 7.3 (Nitrogen)
user:deadlock$ ldd --version
ldd (GNU libc) 2.17
EDIT 2:
The rpm package version is glibc-2.17-196.el7.x86_64.
At first I could not get the line numbers using the rpm package; that was solved by installing the debuginfo package. Here is the backtrace using the glibc shipped with the distribution:
(gdb) thread apply all bt
Thread 2 (Thread 0x7ffff77fb700 (LWP 59753)):
#0 vfork () at ../sysdeps/unix/sysv/linux/x86_64/vfork.S:44
#1 0x000000000040074e in thread_handler (args=0x0) at deadlock.c:11
#2 0x00007ffff7bc6e25 in start_thread (arg=0x7ffff77fb700) at pthread_create.c:308
#3 0x00007ffff78f434d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
Thread 1 (Thread 0x7ffff7fba740 (LWP 59746)):
#0 0x00007ffff7878226 in _int_free (av=0x7ffff7bb8760 , p=0x602240, have_lock=0) at malloc.c:3927
#1 0x00000000004007aa in main () at deadlock.c:28

This is a custom-compiled glibc. It is possible that something went wrong with the installation. Note that Red Hat Enterprise Linux 7.4 backports a fix for a deadlock between malloc and fork, and you are missing that fix because you compiled your own glibc. The fix for the upstream bug went only into upstream version 2.24, so if you are basing your custom build on an earlier version, you may not have it. (Although the backtrace would look different for that one.)
I think we fixed at least one other post-2.17 libio-related deadlock bug as well.
EDIT: I have been dealing with fork-related deadlocks for too long. There are multiple issues with the reproducer as posted:
There is no waitpid call for the PID. As a result, the process table will be quickly filled with zombies.
No error checking for execve. If the /usr/bin/ls pathname does not exist (for example, on a system which did not undergo UsrMove), execve will return, and the next iteration of the loop will launch another vfork call.
I fixed both issues (because debugging what is approaching a fork bomb is not fun at all), but I can't reproduce a hang with glibc-2.17-196.el7.x86_64.
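For reference, the thread function of the reproducer with both fixes applied might look roughly like this (a sketch of the changes described above, not the exact code used for the test):

#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

void * thread_handler(void * args)
{
    char * newargv[] = { "/usr/bin/ls", NULL };
    char * newenviron[] = { NULL };

    while (1) {
        pid_t pid = vfork();
        if (pid == 0) {
            execve(newargv[0], newargv, newenviron);
            _exit(127);            /* execve failed; a vfork child must not fall through */
        }
        if (pid > 0)
            waitpid(pid, NULL, 0); /* reap the child so zombies do not pile up */
    }
    return 0;
}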

Related

Issues when running MPI program on two cluster nodes

I have a very simple MPI program:
#include <mpi.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    int my_rank;
    int my_new_rank;   /* unused here */
    int size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (my_rank == 0 || my_rank == 18 || my_rank == 36) {
        char hostbuffer[256];
        gethostname(hostbuffer, sizeof(hostbuffer));
        printf("Hostname: %s\n", hostbuffer);
    }

    MPI_Finalize();
    return 0;
}
I am running it on a cluster with two nodes. I have a makefile, and with the mpicc command I generate the cannon.run executable. I run it with the following command:
time mpirun --mca btl ^openib -n 64 -hostfile ../second_machinefile ./cannon.run
In second_machinefile I have the names of these two nodes. The weird problem is that when I run this command from one node it executes normally, but when I run the same command from the other node I get this error:
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
*** Process received signal ***
Signal: Segmentation fault (11)
Signal code: Address not mapped (1)
Failing at address: 0x30
After trying to run it with GDB I got this backtrace:
#0 0x00007ffff646e936 in ?? ()
from /usr/lib/x86_64-linux-gnu/pmix/lib/pmix/mca_gds_ds21.so
#1 0x00007ffff6449733 in pmix_common_dstor_init ()
from /lib/x86_64-linux-gnu/libmca_common_dstore.so.1
#2 0x00007ffff646e5b4 in ?? ()
from /usr/lib/x86_64-linux-gnu/pmix/lib/pmix/mca_gds_ds21.so
#3 0x00007ffff659e46e in pmix_gds_base_select ()
from /lib/x86_64-linux-gnu/libpmix.so.2
#4 0x00007ffff655688d in pmix_rte_init ()
from /lib/x86_64-linux-gnu/libpmix.so.2
#5 0x00007ffff6512d7c in PMIx_Init () from /lib/x86_64-linux-gnu/libpmix.so.2
#6 0x00007ffff660afe4 in ext2x_client_init ()
from /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_pmix_ext2x.so
#7 0x00007ffff72e1656 in ?? ()
from /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_ess_pmi.so
#8 0x00007ffff7a9d11a in orte_init ()
from /lib/x86_64-linux-gnu/libopen-rte.so.40
#9 0x00007ffff7d6de62 in ompi_mpi_init ()
from /lib/x86_64-linux-gnu/libmpi.so.40
#10 0x00007ffff7d9c17e in PMPI_Init () from /lib/x86_64-linux-gnu/libmpi.so.40
#11 0x00005555555551d6 in main ()
which to be honest I don't fully understand.
My main confusion is that the program executes properly from machine_1: it connects to machine_2 without errors and processes are initialized on both machines. But when I try to execute the same command from machine_2, it is not able to connect to machine_1. The program also runs correctly when I run it only on machine_2, decreasing the number of processes so that it fits on one machine.
Is there anything I am doing wrong? Or what could I try in order to better understand the cause of the problem?
This is indeed a bug in Open PMIx that is addressed at https://github.com/openpmix/openpmix/pull/1580
Meanwhile, a workaround is to blacklist the gds/ds21 component:
One option is to
export PMIX_MCA_gds=^ds21
before invoking mpirun
Another option is to add the following line
gds = ^ds21
to the PMIx config file located in <pmix_prefix>/etc/pmix-mca-params.conf
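For example, assuming a bash-like shell, the whole workaround applied to the command line from the question would look like this:

export PMIX_MCA_gds=^ds21
time mpirun --mca btl ^openib -n 64 -hostfile ../second_machinefile ./cannon.run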

gdb script to isolate a thread based on variable value

I have attached gdb to a program that has over 300 threads. I need to identify one particular thread whose call stack has a frame containing a variable whose value I want to use for matching. Can I script this in gdb?
(gdb) thread 3
[Switching to thread 3 (Thread 0x7f16c1eeb700 (LWP 18833))]
#4 0x00007f17f3a3bdd5 in start_thread () from /lib64/libpthread.so.0
(gdb) backtrace
#0 0x00007f17f3a3fd12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f17e72838be in __afr_shd_healer_wait (healer=healer@entry=0x7f17e05203d0) at afr-self-heald.c:101
#2 0x00007f17e728392d in afr_shd_healer_wait (healer=healer@entry=0x7f17e05203d0) at afr-self-heald.c:125
#3 0x00007f17e72848e8 in afr_shd_index_healer (data=0x7f17e05203d0) at afr-self-heald.c:572
#4 0x00007f17f3a3bdd5 in start_thread () from /lib64/libpthread.so.0
#5 0x00007f17f3302ead in clone () from /lib64/libc.so.6
(gdb) frame 3
#3 0x00007f17e72848e8 in afr_shd_index_healer (data=0x7f17e05203d0) at afr-self-heald.c:572
572 afr_shd_healer_wait (healer);
(gdb) p this->name
$6 = 0x7f17e031b910 "testvol-replicate-0"
For example, can I run a macro to loop over each thread, go to frame 3 in each of them, inspect the variable this->name, and print the thread number only if the value matches testvol-replicate-0?
It's possible to integrate Python into GDB. Then, with the Python GDB API, you can loop over threads and search for a match. Below are two examples of debugging threads with GDB and Python:
https://www.linuxjournal.com/article/11027
https://fy.blackhats.net.au/blog/html/2017/08/04/so_you_want_to_script_gdb_with_python.html
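For instance, a minimal sketch of such a script using the Python GDB API could look like the following (the command name, the hard-coded frame index, and the error handling are illustrative, not taken from either article):

import gdb

class FindThreadByName(gdb.Command):
    """find-thread-by-name TARGET: print threads whose frame-3 this->name equals TARGET."""

    def __init__(self):
        super(FindThreadByName, self).__init__("find-thread-by-name", gdb.COMMAND_USER)

    def invoke(self, arg, from_tty):
        target = arg.strip() or "testvol-replicate-0"
        for thread in gdb.selected_inferior().threads():
            thread.switch()                      # equivalent of "thread N"
            frame = gdb.newest_frame()
            for _ in range(3):                   # walk up to frame #3
                older = frame.older()
                if older is None:
                    break
                frame = older
            try:
                name = frame.read_var("this").dereference()["name"].string()
            except (gdb.error, ValueError):
                continue                         # no "this" (or no readable name) in this frame
            if name == target:
                print("Thread %d matches" % thread.num)

FindThreadByName()                               # register the command with GDB

Saved in a file (say find_thread.py, name hypothetical), it can be loaded with "source find_thread.py" inside gdb and then run as "find-thread-by-name testvol-replicate-0".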

Process threads stuck with __pthread_enable_asynccancel ()

I am debugging an issue with a multi-threaded TCP server application on the CentOS platform. The application suddenly stopped processing connections; even the event logging with syslog was no longer seen in the log files. It was as if the application had become a black hole.
I killed the process with signal 11 to get a core dump. In the core dump I observed that all threads are stuck in similar patterns.
Ex:
Thread 19 (Thread 0xb2de8b70 (LWP 3722)):
#0 0x00c9e424 in __kernel_vsyscall ()
#1 0x00c17189 in __pthread_enable_asynccancel () from /lib/libpthread.so.0
#2 0x080b367d in server_debug_stats_timer_handler (tmp=0x0) at server_debug_utils.c:75 ==> this line is an event print with syslog(...)
Almost all threads are attempting a syslog(...) print but get stuck in __pthread_enable_asynccancel().
What does __pthread_enable_asynccancel() do, and why isn't it returning?
Here is info reg from the thread mentioned:
(gdb) info reg
eax 0xfffffe00 -512
ecx 0x80 128
edx 0x2 2
ebx 0x154ea3c4 357475268
esp 0xb2de8174 0xb2de8174
ebp 0xb2de81a8 0xb2de81a8
esi 0x0 0
edi 0x0 0
eip 0xc9e424 0xc9e424 <__kernel_vsyscall+16>
eflags 0x200246 [ PF ZF IF ID ]
cs 0x73 115
ss 0x7b 123
ds 0x7b 123
es 0x7b 123
fs 0x0 0
gs 0x33 51
(gdb)
(gdb) print $orig_eax
$1 = 240
($orig_eax = 240 is SYS_futex)
The state of one of the threads is shown below:
Thread 27 (Thread 0xa97d9b70 (LWP 3737)):
#0 0x00c9e424 in __kernel_vsyscall ()
#1 0x00faabb3 in inet_ntop () from /lib/libc.so.6
#2 0x00f20e76 in freopen64 () from /lib/libc.so.6
#3 0x00f96a55 in fcvt_r () from /lib/libc.so.6
#4 0x00f96fd7 in qfcvt_r () from /lib/libc.so.6
#5 0x00a19932 in app_signal_handler (signum=11) at appBaseClass.cpp:920
#6 <signal handler called>
#7 0x00f2aec5 in sYSMALLOc () from /lib/libc.so.6
#8 0x0043431a in CRYPTO_free () from /usr/local/lr/packages/stg_app/5.3.8/lib/ssl/libcrypto.so.10
#9 0x00000000 in ?? ()
(gdb) print $orig_eax
$5 = 240
Your stuck threads are in the futex() syscall, probably on an internal glibc lock taken within malloc() (the syslog() call allocates memory internally).
The reason for this deadlock isn't apparent, but I have two suggestions:
1. If you call syslog() (or any other non-async-signal-safe function, like printf()) from a signal handler anywhere in your program, that could cause a deadlock like this (see the sketch after this list).
2. It's possible you're being bitten by a bug in futex() that was introduced in Linux 3.14 and fixed in 3.18. The bug was also backported to RHEL 6.6, so it would have been present for a while in CentOS too. The effect of this bug was to cause processes to fail to wake up from a FUTEX_WAIT when they should have.
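To illustrate the first point, a common pattern is to keep the signal handler async-signal-safe and defer the syslog() call to normal context (a generic sketch, not the OP's code):

#include <signal.h>
#include <string.h>
#include <syslog.h>
#include <unistd.h>

static volatile sig_atomic_t got_signal;

static void handler(int signum)
{
    static const char msg[] = "signal received\n";
    got_signal = signum;                         /* AS-safe: just record the event */
    write(STDERR_FILENO, msg, sizeof msg - 1);   /* write() is async-signal-safe */
}

int main(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = handler;
    sigaction(SIGUSR1, &sa, NULL);

    for (;;) {
        pause();                                 /* wait for a signal */
        if (got_signal) {
            syslog(LOG_INFO, "got signal %d", (int)got_signal);  /* safe outside the handler */
            got_signal = 0;
        }
    }
}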

__lll_lock_wait_private () when using malloc/free

I have a user-level thread library, and I changed a benchmark program to use my threads instead of pthreads, but it always gets stuck somewhere in the code where there is a malloc or free call.
This is the output of gdb:
^C
Program received signal SIGINT, Interrupt.
__lll_lock_wait_private ()
at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
95 ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S: No such file or directory.
(gdb) where
#0 __lll_lock_wait_private ()
at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
#1 0x00007ffff7569bb3 in _int_free (av=0x7ffff78adc00 <main_arena>,
p=0x6f1f40, have_lock=0) at malloc.c:3929
#2 0x00007ffff756d89c in __GI___libc_free (mem=<optimized out>)
at malloc.c:2950
#3 0x000000000040812d in mbuffer_free (m=m@entry=0x6a7660) at mbuffer.c:209
#4 0x00000000004038a8 in write_chunk_to_file (chunk=0x6a7610,
fd=<optimized out>) at encoder.c:279
#5 Reorder (targs=0x7fffffffab60,
targs@entry=<error reading variable: value has been optimized out>)
at encoder.c:1292
#6 0x000000000040b069 in wrapper_function (func=<optimized out>,
arg=<optimized out>) at gtthread.c:75
#7 0x00007ffff7532620 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#8 0x0000000000000000 in ?? ()
(gdb)
And here is some code:
mbuffer.c
208     if(ref==0) {
209         pausee();
210         free(m->mcb->ptr);
211         resume();
212         m->mcb->ptr=NULL;
213         free(m->mcb);
214         m->mcb=NULL;
215     }
The pausee and resume functions:
void pausee(){
    //printf("pauseeing\n");
    sigemptyset(&mask);
    sigaddset(&mask, SIGPROF); // block SIGPROF...
    if (sigprocmask(SIG_BLOCK, &mask, &orig_mask) < 0) {
        perror("sigprocmask");
        exit(1);
    }
}

void resume(){
    //printf("restarting\n");
    sigemptyset(&mask);
    sigaddset(&mask, SIGPROF); // unblock SIGPROF...
    if (sigprocmask(SIG_SETMASK, &orig_mask, NULL) < 0) {
        perror("sigprocmask");
        exit(1);
    }
}
I'm not sure whether it's related to my problem, but for scheduling of the threads I use the SIGPROF signal and a handler function. I tried blocking SIGPROF before every malloc/free call, but it had no effect.
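Roughly, the scheduling setup looks like this (a simplified sketch, not the exact gtthread code):

#include <signal.h>
#include <string.h>
#include <sys/time.h>

static sigset_t mask, orig_mask;   /* the globals used by pausee()/resume() above */

static void scheduler_tick(int signum)
{
    (void)signum;
    /* The real handler picks the next user-level thread and switches to it.
     * If the tick arrives while the interrupted thread is inside malloc()/free(),
     * libc's arena lock may still be held at the switch point. */
}

static void start_preemption(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = scheduler_tick;
    sigaction(SIGPROF, &sa, NULL);

    struct itimerval timer = { { 0, 10000 }, { 0, 10000 } };  /* ~10 ms time slices */
    setitimer(ITIMER_PROF, &timer, NULL);
}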
I don't have any concurrent threads, only one thread runs at a time.
Any help or idea would be very much appreciated.
As far as the code in mbuffer.c goes, there is no mistake there. I suggest you test mbuffer.c on its own to rule out your guess.

Why realloc deadlock after clone syscall?

I have a problem where realloc() sometimes deadlocks after a clone() syscall.
My code is:
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <linux/types.h>

#define CHILD_STACK_SIZE 4096*4
#define gettid() syscall(SYS_gettid)
#define log(str) fprintf(stderr, "[pid:%d tid:%d] "str, getpid(), gettid())

int clone_func(void *arg){
    int *ptr = (int *)malloc(10);
    int i;
    for (i = 1; i < 200000; i++)
        ptr = realloc(ptr, sizeof(int) * i);
    free(ptr);
    return 0;
}

int main(){
    int flags = 0;
    flags = CLONE_VM;
    log("Program started.\n");
    int *ptr = NULL;
    ptr = malloc(16);
    void *child_stack_start = malloc(CHILD_STACK_SIZE);
    int ret = clone(clone_func, child_stack_start + CHILD_STACK_SIZE, flags, NULL, NULL, NULL, NULL);
    int i;
    for (i = 1; i < 200000; i++)
        ptr = realloc(ptr, sizeof(int) * i);
    free(ptr);
    return 0;
}
The call stack in gdb is:
[pid:13268 tid:13268] Program started.
^Z[New LWP 13269]
Program received signal SIGTSTP, Stopped (user).
0x000000000040ba0e in __lll_lock_wait_private ()
(gdb) bt
#0 0x000000000040ba0e in __lll_lock_wait_private ()
#1 0x0000000000408630 in _L_lock_11249 ()
#2 0x000000000040797f in realloc ()
#3 0x0000000000400515 in main () at test-realloc.c:36
(gdb) i thr
2 LWP 13269 0x000000000040ba0e in __lll_lock_wait_private ()
* 1 LWP 13268 0x000000000040ba0e in __lll_lock_wait_private ()
(gdb) thr 2
[Switching to thread 2 (LWP 13269)]#0 0x000000000040ba0e in __lll_lock_wait_private ()
(gdb) bt
#0 0x000000000040ba0e in __lll_lock_wait_private ()
#1 0x0000000000408630 in _L_lock_11249 ()
#2 0x000000000040797f in realloc ()
#3 0x0000000000400413 in clone_func (arg=0x7fffffffe53c) at test-realloc.c:20
#4 0x000000000040b889 in clone ()
#5 0x0000000000000000 in ?? ()
My OS is Debian (linux-2.6.32-5-amd64), with GNU C Library (Debian EGLIBC 2.11.3-4) stable release version 2.11.3. I deeply suspect that eglibc is the culprit behind this bug.
Is calling the clone() syscall like this not enough before using realloc()?
You cannot use clone with CLONE_VM yourself -- or if you do, you have to at least make sure you restrict yourself from invoking any function from the standard library after calling clone in either the parent or the child. In order for multiple threads or processes to share the same memory, the implementations of any functions which access shared resources (like the heap) need to
1. be aware of the fact that multiple flows of control are potentially accessing them, so they can arrange to perform the appropriate synchronization, and
2. be able to obtain information about their own identity via the thread pointer, usually stored in a special machine register. This is completely implementation-internal, and thus you cannot arrange for a new "thread" which you create yourself via clone to have a properly set-up thread pointer.
The proper solution is to use pthread_create, not clone.
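For illustration, the same test rewritten with pthread_create might look like this (a sketch; libc then sets up the thread pointer and TLS for the new thread, and the program is compiled with -pthread):

#include <pthread.h>
#include <stdlib.h>

static void * thread_func(void * arg)
{
    int *ptr = malloc(10);
    int i;
    for (i = 1; i < 200000; i++)
        ptr = realloc(ptr, sizeof(int) * i);
    free(ptr);
    return NULL;
}

int main(void)
{
    pthread_t tid;
    int *ptr = malloc(16);
    int i;

    pthread_create(&tid, NULL, thread_func, NULL);   /* instead of clone(..., CLONE_VM, ...) */

    for (i = 1; i < 200000; i++)
        ptr = realloc(ptr, sizeof(int) * i);
    free(ptr);

    pthread_join(tid, NULL);
    return 0;
}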
You cannot do this:
for (i=0; i<200000; i++)
    ptr = realloc(ptr, sizeof(int)*i);
free(ptr);
The first time through the loop, i is zero. realloc( ptr, 0 ) is equivalent to free( ptr ), and you cannot free twice.
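A defensive version of the loop (a sketch) sidesteps the size-0 call and also keeps the pointer valid if realloc fails:

for (i = 1; i < 200000; i++) {             /* start at 1 so realloc never sees size 0 */
    int *tmp = realloc(ptr, sizeof(int) * i);
    if (tmp == NULL)
        break;                             /* keep the old ptr so it can still be freed */
    ptr = tmp;
}
free(ptr);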
I added a flag, CLONE_SETTLS, to the clone() syscall. Then the deadlock was gone.
So I think eglibc's realloc() uses some TLS data. When a new thread is created without its own TLS, some locks (kept in TLS) are shared between this thread and its parent, and realloc(), which uses those locks, gets stuck. So, if somebody wants to use clone() directly, the best way is to allocate a new TLS for the new thread.
The code snippet looks like this:
flags = CLONE_VM | CLONE_SETTLS;
struct user_desc* p_tls_desc = malloc(sizeof(struct user_desc));
clone(clone_func, child_stack_start + CHILD_STACK_SIZE, flags, NULL, NULL, p_tls_desc, NULL);

Resources