main() does not terminate after successful pthread_join - c

I have a program that starts a pthread and later on waits for the termination of this thread before it returns. The code is something like:
int main(int32_t argc, char* argv[]) {
pthread_t t;
/* initialization and other stuff
...
*/
printf("join result:%d\n", pthread_join(t, 0));
return 0;
}
The program prints, as it is supposed to: join result: 0. So the join works and t is finished. Nevertheless the program does not stop execution. I can only force it to stop if I insert a command exit(0) (or some other number) before the return 0 line.
However, if I remove the line with the pthread_join call the program exits flawlessly.
How is this even possible? What could keep a program from finishing execution after all sub-threads are joined?
EDIT: I just found out that gdb tells me I get a segmentation fault after the execution of the last line with }. Nevertheless I have no idea what is going on behind the scenes:
Program received signal SIGSEGV, Segmentation fault.
0x000000060003aa10 in ?? ()

I think stack corruption in the main thread is a possible cause. On Windows, I know that before main executes, the address of the exit_process function is pushed onto the stack, and return 0 then results in a call to exit_process. If the stack was corrupted in your case, the pointer to exit_process may have been replaced with an invalid pointer.

Related

How to create a program that crashes with a segfault, but without core dump

I am currently writing a UNIX-like shell in C, and to test it I need a program that causes a segmentation fault, but without Core Dumped being printed.
I already wrote a program that segfaults (something like int *a = 0; *a = 3;), and when I run it the terminal prints Segmentation Fault (Core Dumped).
What code should I write, or what command should I use, in order for my terminal to only print Segmentation Fault when I run it, and not Segmentation Fault (Core Dumped) ?
Edit:
I don't just want the output shortened; I also want a program that does not create a core dump when it crashes (so that the WCOREDUMP macro, applied to the exit status returned by waitpid, returns 0).
Solution:
I made a program that only raises the SIGUSR1 signal. Running it, I get a 'crash' but without (core dumped) being printed, which is what I am looking for.
Code, in case someone needs it:
#include <signal.h>
int main()
{
raise(SIGUSR1);
return 0;
}
Terminal output:
User defined signal 1
You could mess with signal.h:
#include <signal.h>
#include <stdlib.h> /* for exit() */
void sig_func(int sig)
{
    (void)sig; /* unused */
    exit(1);
}
int main(void)
{
    signal(SIGSEGV, sig_func); // installs sig_func as the handler for SIGSEGV
    raise(SIGSEGV); // causes the handler to be called
    return 0;
}
In this example, the program should exit with status 1 and print nothing. Change sig_func to whatever you need.
By the way... With signal you're completely overwriting what happens when that signal is raised. If you got rid of the exit call the program would keep running, with whatever undefined behavior that constitutes.
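Another route, not mentioned in the answers above, is to keep the real SIGSEGV but stop the kernel from writing a core file. The sketch below assumes Linux and glibc: prctl(PR_SET_DUMPABLE, 0) is Linux-specific, and crash_without_core is a hypothetical helper name. The child branch is what the test program would do; the parent branch is roughly what a shell does when it decides whether to append (core dumped).

```c
#define _DEFAULT_SOURCE 1   /* glibc feature-test macro so <sys/wait.h> exposes WCOREDUMP */
#include <signal.h>
#include <sys/prctl.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that is killed by SIGSEGV without
 * dumping core. Returns the raw wait status, or -1 on error. */
int crash_without_core(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        struct rlimit no_core = { 0, 0 };
        setrlimit(RLIMIT_CORE, &no_core);  /* ask for a zero-size core file */
        prctl(PR_SET_DUMPABLE, 0);         /* Linux-specific: mark the process non-dumpable */
        raise(SIGSEGV);                    /* default action still terminates with SIGSEGV */
        _exit(127);                        /* only reached if the signal did not kill us */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return status;
}
```

In the parent, WIFSIGNALED(status) and WTERMSIG(status) == SIGSEGV identify the crash, and WCOREDUMP(status) tells the shell whether to print the (core dumped) suffix.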

Print in single Pthread

I'm trying to implement a program using Pthreads in C. Now, I've tried to let a single thread print "Hi":
void * generator(void *arguments){
printf("Hi");
return NULL;
}
int main(int argc, const char* argv[]){
pthread_create(&threads_ids[0], NULL, &generator, NULL);
}
This doesn't work and doesn't print anything. However, when I put the creation of the pthread in a for loop it does print "Hi", but at each execution the occurrence differs.
Is this normal behaviour, and if so; how can I fix it? Thanks in advance!
It's because your main thread returns and thus exits the process. It means the thread you created never gets a chance to run.
Unlike just returning from main(), calling pthread_exit(0) from main(), will let the other thread continue executing.
Alternatively, you can wait for the thread to complete execution by calling pthread_join() on the thread you created.
When you execute in a loop, some of the threads you create probably get to run before the main thread exits, and so it appears to "work" (prints some Hi).
But it does have the same problem as the code you posted.
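A minimal sketch of the pthread_join variant, assuming a single thread. The names start_and_wait and generator_ran are made up for illustration; the flag is only there so a caller can check that the thread really ran.

```c
#include <pthread.h>
#include <stdio.h>

static int generator_ran = 0;  /* written by the thread, read only after the join */

static void *generator(void *arguments)
{
    (void)arguments;           /* unused */
    printf("Hi\n");
    generator_ran = 1;
    return NULL;
}

/* Creates the thread and blocks until it has finished.
 * Returns 0 on success, otherwise the pthread error code. */
int start_and_wait(void)
{
    pthread_t tid;
    int rc = pthread_create(&tid, NULL, generator, NULL);
    if (rc != 0)
        return rc;
    return pthread_join(tid, NULL);  /* the process cannot exit before the thread is done */
}
```

A main function would then be just int main(void) { return start_and_wait(); }. Because the join returns only after generator has finished, "Hi" is printed on every run.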

How to get rid of an error when quitting pthread while it's in sleep()?

First of all, I'd like to apologize for the confusing title. Here's my question:
I have a main function which spawns another thread that only works from time to time, with "sleep(3)" in between.
Inside main.c, I have a while loop that runs indefinitely. So to cancel the program, I have to press Ctrl+C. To catch that, I added a signal handler at the beginning of the main function:
signal(SIGINT, quitProgram);
This is my quitProgram function:
void quitProgram() {
printf("CTRL + C received. Quitting.\n");
running = 0;
return;
}
So when running == 0, the loop is left.
It all seems to work, at least until the thread mentioned started. When I hit Ctrl+C after the thread has started, I get a strange error message:
*** longjmp causes uninitialized stack frame ***: ./cluster_control terminated
======= Backtrace: =========
/lib/i386-linux-gnu/libc.so.6(+0x68e4e)[0xb7407e4e]
/lib/i386-linux-gnu/libc.so.6(__fortify_fail+0x6b)[0xb749a85b]
/lib/i386-linux-gnu/libc.so.6(+0xfb70a)[0xb749a70a]
/lib/i386-linux-gnu/libc.so.6(__longjmp_chk+0x42)[0xb749a672]
./cluster_control[0x8058427]
[0xb76e2404]
[0xb76e2428]
/lib/i386-linux-gnu/libc.so.6(nanosleep+0x46)[0xb7454826]
/lib/i386-linux-gnu/libc.so.6(sleep+0xcd)[0xb74545cd]
./cluster_control[0x804c0e6]
./cluster_control[0x804ae61]
/lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0xb73b8a83]
./cluster_control[0x804a331]
When I try to debug it using gdb I get the following:
(gdb) where
#0 0xb7fdd428 in __kernel_vsyscall ()
#1 0xb7d4f826 in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#2 0xb7d4f5cd in __sleep (seconds=0) at ../sysdeps/unix/sysv/linux/sleep.c:137
#3 0x0804c0e6 in master_main (mastBroad_sock=3, workReady_ptr=0xbffff084) at master.c:150
#4 0x0804ae61 in main () at main.c:84
The line 150 in master.c is this:
sleep(PING_PERIOD);
So my guess at what's happening: the main thread is exiting while the master_main thread is sleeping, and this is causing the error. But how can I fix this? Is there a better way to let the master_main thread run every few seconds? Or to prevent the main thread from exiting while master_main is still in sleep?
I tried to use a mutex, but it didn't work (I locked the mutex before master_main sleeps and unlocked it afterwards, and the exiting main thread needed that mutex to exit).
Additionally, I passed a pointer with a state from main to master_main. The idea was to set the state of master_main to "exit" before leaving the main method, but that didn't work either.
So are there any other ideas? I'm running linux and programming language is C99.
Update 1
Sorry guys, I think I gave you wrong information. The method which causes trouble isn't even inside a thread. Here's an excerpt from my main method:
int main() {
[...]
signal(SIGINT, quitProgram);
while (running)
{
// if system is the current master
if (master_i)
{
master_main(mastBroad_sock, &workReady_condMtx);
pthread_mutex_lock(&(timeCount_mtx_sct.mtx));
master_i = 0;
pthread_mutex_unlock(&(timeCount_mtx_sct.mtx));
}
[...]
}
return 0;
}
And also an excerpt from the master_main which I guess is the problem.
int master_main(int mastBroad_sock, struct cond_mtx *workReady_ptr) {
[...]
while (master_i)
{
// do something
sleep(5); // to perform this loop only every 5 seconds, this is line 150 in master.c
}
}
Update 2
Forgot to add the code which catches Ctrl+C inside the main.c:
void quitProgram() {
printf("CTRL + C received. Quitting.\n");
running = 0;
return;
}
The simplest solution that comes to mind is to have a global flag that tells the thread that the program is shutting down. When the main function wants to shut down, it sets the flag and then waits for the thread to terminate.
See Joining and Detaching Threads. Depending on what the thread is doing, you might also want to take a look at Condition Variables.
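Here is a rough sketch of that flag-plus-join idea. The names (running, worker, run_until_sigint) and the 100 ms polling interval are assumptions for illustration, not taken from the question. The handler only stores to an atomic flag, since printf is not async-signal-safe, and the sleeps are kept short so both loops notice the flag quickly.

```c
#define _POSIX_C_SOURCE 200809L  /* for sigaction() and nanosleep() */
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <time.h>

static atomic_int running = 1;   /* shutdown flag shared by handler, main and worker */
static const struct timespec tick = { .tv_sec = 0, .tv_nsec = 100000000L };

static void on_sigint(int signo)
{
    (void)signo;
    atomic_store(&running, 0);   /* nothing else: keep the handler async-signal-safe */
}

static void *worker(void *arg)
{
    (void)arg;
    while (atomic_load(&running)) {
        /* periodic work would go here */
        nanosleep(&tick, NULL);  /* short sleep so the flag is re-checked often */
    }
    return NULL;
}

int install_sigint_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGINT, &sa, NULL);
}

/* Runs until Ctrl+C clears the flag, then waits for the worker to finish.
 * Returns 0 on success, -1 or a pthread error code otherwise. */
int run_until_sigint(void)
{
    pthread_t tid;
    if (install_sigint_handler() != 0)
        return -1;
    int rc = pthread_create(&tid, NULL, worker, NULL);
    if (rc != 0)
        return rc;
    while (atomic_load(&running))
        nanosleep(&tick, NULL);  /* stand-in for the main loop's own work */
    return pthread_join(tid, NULL);
}
```

A main function can simply return run_until_sigint(). The "Quitting" message can be printed after the loop, outside the handler.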

A strange result in a simple pthread code

I wrote the following code:
#include <pthread.h>
#include <stdio.h>
void* sayHello (void *x){
printf ("Hello, this is %d\n", (int)pthread_self());
return NULL;
}
int main (){
pthread_t thread;
pthread_create (&thread, NULL, &sayHello, NULL);
printf("HERE\n");
return 0;
}
After compiling and running I saw 3 different types of outputs.
Only "Here" was printed.
"Here" and 1 'sayHello' message.
"Here" and 2 'sayHello' messages.
Of course I'm OK with the second option, but I don't understand why the 'sayHello' message can be printed 0 or 2 times if I created only one thread.
You can't say when the thread starts to run. It might not start until after you return from main, which means the process will end and the thread with it.
You have to wait for the thread to finish, with pthread_join, before leaving main.
The third case, with the message from the thread printed twice, might be because the thread executes and the buffer is written to stdout as part of the end-of-line flush, but the thread is preempted before the flush has finished. The process then exits, which means all file streams (like stdout) are flushed, so the text is printed again.
For output 1:
Your main function only creates a pthread and lets it run without waiting for it to finish.
When main returns, the operating system reclaims all the resources assigned to the process. However, the newly created pthread might not have run yet.
That is why you only got HERE.
For output 2:
Your newly created thread finished before main returned. Therefore you can see the output of both the main thread and the created thread.
For output 3:
This should be a bug in glibc. Please refer to Unexpected output in a multithreaded program for details.
To make the program always produce the same output, call pthread_join after pthread_create.

Why can't I ignore SIGSEGV signal?

Here is my code,
#include<signal.h>
#include<stdio.h>
int main(int argc,char ** argv)
{
char *p=NULL;
signal(SIGSEGV,SIG_IGN); //Ignoring the Signal
printf("%d",*p);
printf("Stack Overflow"); //This has to be printed. Right?
return 0;
}
While executing the code, I'm getting a segmentation fault. I ignored the signal using SIG_IGN, so I shouldn't get a segmentation fault. Right? Then the printf() statement after printing the '*p' value must be executed too. Right?
Your code is ignoring SIGSEGV instead of catching it. Recall that the instruction that triggered the signal is restarted after handling the signal. In your case, handling the signal didn't change anything so the next time round the offending instruction is tried, it fails the same way.
If you intend to catch the signal change this
signal(SIGSEGV, SIG_IGN);
to this
signal(SIGSEGV, sighandler);
You should probably also use sigaction() instead of signal(). See relevant man pages.
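As a hedged sketch of the sigaction() form: the handler below reports the fault and exits instead of returning into the faulting code. The helper names and the exit code 42 are arbitrary choices for illustration, and the demo uses raise(SIGSEGV) in a child process rather than a real NULL dereference so that it stays well-defined.

```c
#define _POSIX_C_SOURCE 200809L  /* for sigaction() */
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static void on_segv(int signo)
{
    static const char msg[] = "caught SIGSEGV, exiting\n";
    (void)signo;
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof msg - 1);  /* write() is async-signal-safe */
    (void)ignored;
    _exit(42);  /* leave the process instead of retrying the faulting instruction */
}

int install_segv_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_segv;
    sigemptyset(&sa.sa_mask);    /* block no additional signals while the handler runs */
    return sigaction(SIGSEGV, &sa, NULL);
}

/* Hypothetical demo: returns the child's exit code, or -1 on error. */
int segv_demo_exit_code(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (install_segv_handler() != 0)
            _exit(1);
        raise(SIGSEGV);          /* stands in for the faulting dereference */
        _exit(2);                /* not reached: the handler exits first */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Unlike signal(), whose semantics differ between systems, sigaction() spells out the signal mask and flags explicitly.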
In your case the offending instruction is the one which tries to dereference the NULL pointer.
printf("%d", *p);
What follows is entirely dependent on your platform.
You can use gdb to establish what particular assembly instruction triggers the signal. If your platform is anything like mine, you'll find the instruction is
movl (%rax), %esi
with the rax register holding the value 0, i.e. NULL. One (non-portable!) way to fix this in your signal handler is to use the third argument your signal handler gets, i.e. the user context. Here is an example:
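#define _GNU_SOURCE /* exposes REG_RAX in <ucontext.h>; must come before any include */
#include <signal.h>
#include <stdio.h>
#include <ucontext.h>
int *p = NULL;
int n = 100;
void sighandler(int signo, siginfo_t *si, void *ctx)
{
    ucontext_t *context = ctx; /* the third argument is the user context */
    (void)si;
    printf("Handler executed for signal %d\n", signo); /* printf is not async-signal-safe; demo only */
    context->uc_mcontext.gregs[REG_RAX] = (greg_t)&n; /* assumes the faulting load reads through rax */
}
int main(int argc,char ** argv)
{
    struct sigaction sa = {0};
    sa.sa_sigaction = sighandler;
    sa.sa_flags = SA_SIGINFO; /* required for the three-argument handler form */
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, NULL);
    printf("%d\n", *p); // ... movl (%rax), %esi ...
    return 0;
}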
This program displays:
Handler executed for signal 11
100
It first causes the handler to be executed by attempting to dereference a NULL address. Then the handler fixes the issue by setting rax to the address of variable n. Once the handler returns the system retries the offending instruction and this time succeeds. printf() receives 100 as its second argument.
I strongly recommend against using such non-portable solutions in your programs, though.
You can ignore the signal but you have to do something about it. I believe what you are doing in the code posted (ignoring SIGSEGV via SIG_IGN) won't work at all for reasons which will become obvious after reading the bold bullet.
When you do something that causes the kernel to send you a SIGSEGV:
If you don't have a signal handler, the kernel kills the process and that's that
If you do have a signal handler
Your handler gets called
The kernel restarts the offending operation
So if you don't do anything about it, it will just loop continuously. If you do catch SIGSEGV and you don't exit, thereby interfering with the normal flow, you must:
fix things such that the offending operation doesn't restart, or
fix the memory layout such that what was offending will be OK on the next run
Another option is to bracket the risky operation with setjmp/longjmp, i.e.
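#include <setjmp.h>
#include <signal.h>
#include <stdio.h> /* for printf() */
static jmp_buf jbuf;
static void catch_segv(int sig)
{
    (void)sig; /* unused */
    /* sigsetjmp/siglongjmp would additionally restore the signal mask */
    longjmp(jbuf, 1);
}
int main(void)
{
    int *volatile p = NULL; /* volatile so the load is not optimized away */
    signal(SIGSEGV, catch_segv);
    if (setjmp(jbuf) == 0) {
        printf("%d\n", *p);
    } else {
        printf("Ouch! I crashed!\n");
    }
    return 0;
}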
The setjmp/longjmp pattern here is similar to a try/catch block. It's very risky though, and won't save you if your risky function overruns the stack, or allocates resources but crashes before they're freed. Better to check your pointers and not indirect through bad ones.
Trying to ignore or handle a SIGSEGV is the wrong approach. A SIGSEGV triggered by your program always indicates a bug. Either in your code or code you delegate to. Once you have a bug triggered, anything could happen. There is no reasonable "clean-up" or fix action the signal handler can perform, because it can not know where the signal was triggered or what action to perform. The best you can do is to let the program fail fast, so a programmer will have a chance to debug it when it is still in the immediate failure state, rather than have it (probably) fail later when the cause of the failure has been obscured. And you can cause the program to fail fast by not trying to ignore or handle the signal.
