Stepping through a multithreaded application with GDB - c

This is my first foray into using pthreads to create a multithreaded application.
I'm trying to debug with gdb but am getting some strange, unexpected behaviour.
I'm trying to ascertain whether it's me or gdb at fault.
Scenario:
Main thread creates a child thread.
I place a breakpoint on a line in the child thread fn
gdb stops on that breakpoint no problem
I confirm there are now 2 threads with info threads
I also check that the 2nd thread is starred, i.e. it is the current thread for gdb's purposes
Here is the problem: when I now hit n to step through to the next line in the thread fn, the parent thread (thread 1) simply resumes and completes, and gdb exits.
Is this the correct behaviour?
How can I step through the thread fn code that is being executed in the 2nd thread line by line with gdb?
In other words, even though thread 2 is confirmed as the current thread by gdb, when I hit n, it seems to be the equivalent of hitting c in the parent thread, i.e. the parent thread (thread 1) just resumes execution, completes and exits.
At a loss as to how to debug multiple threads with gdb behaving as it is currently
I am using gdb from within emacs25, i.e. M-x gud-gdb

What GDB does here depends on your settings, and also your system (some vendors patch this area).
Normally, in all-stop mode, when the inferior stops, GDB stops all the threads. This gives you the behavior that you'd "expect" -- you can switch freely between threads and see what is going on in each one.
When the inferior continues, including via next or step, GDB lets all threads run. So, if the other thread doesn't interact in any way with the one you are stepping (no locks, etc.), you may well see it run to completion and exit.
However, you can control this using set scheduler-locking. Setting this to on (set scheduler-locking on) will make it so that only the current thread can be resumed. Setting it to step will make it so that only the current thread is resumed by step and next, while all threads still run freely on continue and the like.
The default mode here is replay, which is basically off, except when using record-and-replay mode. However, the Fedora GDB is built with the default as step; I am not sure if other distros followed this, but you can check what your build uses with show scheduler-locking.

Yes, this is the correct behaviour of gdb. You are only debugging the currently active thread; the other threads are executing normally behind the scenes. Think about it: how else would the other threads make progress?
But your code has a bug. Your parent thread should not exit before the child thread is done. The best way to ensure this is to join the child thread in the main thread before exiting.
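For illustration, here is a minimal sketch of the join fix mentioned above. The names child_fn and run_child_and_join are hypothetical, not taken from the question. Because the main thread waits in pthread_join, the process stays alive while you step through child_fn, even if gdb lets every thread run on each n.

```c
#include <pthread.h>

/* Hypothetical thread function: a few separate lines to step over with n. */
static void *child_fn(void *arg)
{
    int *counter = arg;

    *counter += 1;   /* a convenient line for a breakpoint */
    *counter += 1;
    *counter += 1;
    return NULL;
}

/* Starts the child and waits for it; returns the counter, or -1 on error. */
int run_child_and_join(void)
{
    pthread_t tid;
    int counter = 0;

    if (pthread_create(&tid, NULL, child_fn, &counter) != 0)
        return -1;

    /* Without this join, the caller could return from main and end the
     * whole process while the child is still running. */
    if (pthread_join(tid, NULL) != 0)
        return -1;

    return counter;
}
```

Call run_child_and_join() from main and build with debug info and thread support (for example gcc -g -pthread), then set the breakpoint inside child_fn as before.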

Related

Linux multithreading with own memory space possible?

I have a Linux C program that runs well in a Raspberry 3. When I run it in a low memory situation in another sbc (Raspberry Zero) it runs about 2-3 days then freezes. I believe it's a stack overflow situation.
I've put a thread to check periodically when the main program has frozen. Unfortunately it appears that if the main process crashes, it takes down all of the other threads in the process.
I can avoid this by having another process check up on the first process, but I'd prefer a thread. Is it possible to have a thread that is safe and does not freeze if the main process freezes?
In short, no. By definition, threads share memory and belong to the process that created them, so anything that takes down the main process takes down all of its threads as well.

NetServerEnum blocks when thread is terminated externally

(Working with the Win32 API, in C, with VS2010)
I have a two-thread app. The first thread starts the second, waits for a given interval ('TIMEOUT'), and then calls TerminateThread() on it.
Meanwhile, the second thread calls NetServerEnum().
It appears that when the timeout is reached, whether NetServerEnum returned successfully or not, the first thread gets deadlocked.
I've already noticed that NetServerEnum creates worker threads of its own.
I ultimately end up with one of those threads in deadlock, typically on ntdll.dll!RtlInitializeExceptionChain, unable to exit my process gracefully.
As this is too long for a comment, allow me to use the answer form.
Verbatim from MSDN (emphasis mine):
TerminateThread is a dangerous function that should only be used in the most extreme cases. You should call TerminateThread only if you know exactly what the target thread is doing, and you control all of the code that the target thread could possibly be running at the time of the termination. For example, TerminateThread can result in the following problems:
If the target thread owns a critical section, the critical section will not be released.
If the target thread is allocating memory from the heap, the heap lock will not be released.
*If the target thread is executing certain kernel32 calls when it is terminated, the kernel32 state for the thread's process could be inconsistent.
If the target thread is manipulating the global state of a shared DLL, the state of the DLL could be destroyed, affecting other users of the DLL.
From reading this, it is easy to understand why it is a bad idea to cancel (terminate) a thread that is stuck in a system call.
A possible alternative approach to the OP's design might be to spawn a thread that calls NetServerEnum() and simply let it run until the system call returns.
In the meantime, the main thread could do other things, for example informing the user that scanning the network is taking longer than expected.

Debugging signal handling in multithreaded application

I have a multithreaded application using pthreads. My threads wait for signals using sigwait. I want to debug my application and see which thread receives which signal at what time. Is there any way I can do this? If I run my program directly, signals are generated rapidly and handled by my handler threads. I want to see which handler wakes up from the sigwait call and processes the signal.
The handy strace utility can print out a huge amount of useful information regarding system calls and signals. It would be useful to log timing information or collect statistics regarding the performance of signal usage.
If instead you are interested in getting a breakpoint inside of an event triggered by a specific signal, you could consider stashing enough relevant information to identify the event in a variable and setting a conditional breakpoint.
One of the things you may try with gdb is set breakpoints by thread (e.g. just after return from sigwait), so you know which thread wakes up:
break file.c:line thread [thread_nr]
Don't forget to tell gdb to pass signals to your program e.g.:
handle SIGINT pass
You may want to put all of this into your .gdbinit file to save yourself a lot of typing.
Steven Schlansker is definitely right: if that significantly changes the timing patterns of your program (so that it behaves completely differently under the debugger than "in the wild"), then strace and logging are your last hope.
I hope that helps.

Which thread holds the lock

I am working in C and I have a core dump of a multithreaded (two threads) process that I am debugging.
I see in gdb that the mutex_lock is acquired by both threads in a rare situation. Is there a way I could check in gdb which thread possesses the lock?
I am running a flavor of Linux.
Also, I am not allowed to post the code since it's proprietary.
On every line that gets and releases the lock in question (of course change the printf text), do the following:
break file:line
commands
printf "acquiring lock\n"
info threads
cont
end
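The breakpoint commands above need a live process. For a core dump, one option is to have the program itself record the holder, so the information is already in memory when the dump is taken. The sketch below is one way to do that; struct tracked_mutex and its helpers are hypothetical and not part of pthreads.

```c
#include <pthread.h>

/* Hypothetical wrapper: a mutex plus a record of who holds it. */
struct tracked_mutex {
    pthread_mutex_t lock;
    pthread_t owner;   /* meaningful only while held is nonzero */
    int held;
};

void tracked_lock(struct tracked_mutex *m)
{
    pthread_mutex_lock(&m->lock);
    m->owner = pthread_self();   /* written while holding the lock */
    m->held = 1;
}

void tracked_unlock(struct tracked_mutex *m)
{
    m->held = 0;                 /* cleared before the lock is released */
    pthread_mutex_unlock(&m->lock);
}

/* Returns 1 if the calling thread is recorded as the holder.
 * Intended for the thread that believes it holds the lock. */
int tracked_held_by_me(const struct tracked_mutex *m)
{
    return m->held && pthread_equal(m->owner, pthread_self());
}
```

In gdb you would then print the held and owner fields of the mutex variable and compare owner with the identifiers listed by info threads. On Linux with glibc the two usually match, but treat that as an assumption to verify on your system.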

What makes a pthread defunct?

I'm working with a multi-threaded program (using pthreads) that currently creates a background thread (PTHREAD_CREATE_DETACHED) and then invokes pthread_exit(0). My problem is that the process is then listed as "defunct" and, curiously, does not seem to "really exist" in /proc (which defeats my debugging strategies).
I would like the following requirements to be met:
the program should run function A in a loop and function B once
given the PID of the program /proc/$pid/exe, /proc/$pid/maps and /proc/$pid/fd must be accessible (when the process is defunct, they are all empty or invalid links)
it must be possible to suspend/interrupt the program with CTRL+C and CTRL+Z as usual
edit: I hesitate to change the program's interface to have A in the "main" thread and B in a spawned thread (they are currently the other way round). Would it solve the problem?
You can either suspend the execution of the main thread while it waits for a signal, or not detach the thread (using the default PTHREAD_CREATE_JOINABLE) and wait for its termination with pthread_join().
Is there a reason you can't do things the other way round: have the main thread run the loop, and do the one-off task in the background thread?
Not the most elegant design but maybe you could block the main thread before exiting with:
while(1) {
pause();
}
Then you can install a signal handler for SIGINT and SIGTERM that breaks the loop. The easiest way for that is: exit(0) :-).

Resources