I am writing a program that sends a signal in one process and receives it in a thread in another. I have the entire program written, with signals being caught and handled and the synchronization issues dealt with. The problem is that I am trying to log the time the signal was sent and the time it was received, but the values differ strangely between the two processes.
Here is how I did it.
I have a header file header.h which includes a shared global extern struct timespec begin, end;. The reason I made these shared was that I would need the beginning time to calculate the time elapsed since the program began.
Here is how I calculate the time elapsed.
I am using the POSIX clock_gettime().
I start the program and begin the timer, then when a signal is sent I run:
clock_gettime(CLOCK_REALTIME, &end);
long seconds = end.tv_sec - begin.tv_sec;
long nanoseconds = end.tv_nsec - begin.tv_nsec;
double elapsed = seconds + nanoseconds * 1e-9;
This all occurs in the main program.
The second process is another program that is started with exec() in a child process, and that is where the signal is caught.
When I catch the signal, I store some data about it in a struct and store it in a buffer for another thread to read and log from.
typedef struct
{
int sig;
double time;
long int tid;
} data;
Here's what I do in one of the threads:
data d;
d.sig = 2;
d.tid = pthread_self();
clock_gettime(CLOCK_REALTIME, &end);
long seconds = end.tv_sec - begin.tv_sec;
long nanoseconds = end.tv_nsec - begin.tv_nsec;
double elapsed = seconds + nanoseconds * 1e-9;
d.time = elapsed;
put(d);
The problem is my outputs are vastly different. In my sentlog.txt the time is represented correctly, with enough precision to see a difference.
SIGUSR2 sent at 1.000286 seconds
SIGUSR2 sent at 1.082671 seconds
SIGUSR2 sent at 1.155440 seconds
SIGUSR1 sent at 1.250770 seconds
SIGUSR1 sent at 1.314637 seconds
SIGUSR2 sent at 1.398995 seconds
SIGUSR1 sent at 1.460559 seconds
SIGUSR2 sent at 1.498223 seconds
SIGUSR2 sent at 1.577555 seconds
SIGUSR1 sent at 1.618036 seconds
SIGUSR2 sent at 1.684488 seconds
SIGUSR2 sent at 1.743165 seconds
SIGUSR2 sent at 1.780100 seconds
SIGUSR2 sent at 1.871603 seconds
SIGUSR1 sent at 1.901293 seconds
SIGUSR2 sent at 1.944139 seconds
SIGUSR1 sent at 1.984142 seconds
SIGUSR1 sent at 2.040130 seconds
The time in receivelog.txt, however, is not.
Here is how I log to the file and stdout
if (d.sig == 1)
{
printf("SIGUSR1 received by thread %ld at time %f\n", d.tid, d.time);
fflush(stdout);
fprintf(fpRecieve, "Thread %ld received SIGUSR1 at %f seconds\n", d.tid, d.time);
fflush(fpRecieve);
}
else if (d.sig == 2)
{
printf("SIGUSR2 received by thread %ld at time %f\n", d.tid, d.time);
fflush(stdout);
fprintf(fpRecieve, "Thread %ld received SIGUSR2 at %f seconds\n", d.tid, d.time);
fflush(fpRecieve);
}
Thread 139995363964672 received SIGUSR2 at 1670008328.531628 seconds
Thread 139995363964672 received SIGUSR2 at 1670008328.613999 seconds
Thread 139995363964672 received SIGUSR2 at 1670008328.686767 seconds
Thread 139995372357376 received SIGUSR1 at 1670008328.782099 seconds
Thread 139995372357376 received SIGUSR1 at 1670008328.845975 seconds
Thread 139995363964672 received SIGUSR2 at 1670008328.930328 seconds
Thread 139995372357376 received SIGUSR1 at 1670008328.991889 seconds
Thread 139995363964672 received SIGUSR2 at 1670008329.029554 seconds
Thread 139995363964672 received SIGUSR2 at 1670008329.108883 seconds
Thread 139995372357376 received SIGUSR1 at 1670008329.149364 seconds
Thread 139995363964672 received SIGUSR2 at 1670008329.215814 seconds
Thread 139995363964672 received SIGUSR2 at 1670008329.274493 seconds
Thread 139995363964672 received SIGUSR2 at 1670008329.311425 seconds
Thread 139995363964672 received SIGUSR2 at 1670008329.402932 seconds
Thread 139995372357376 received SIGUSR1 at 1670008329.432621 seconds
Thread 139995363964672 received SIGUSR2 at 1670008329.475466 seconds
Why can I not simply just use the same operation as before?
I have a header file header.h which includes a shared global extern struct timespec begin, end;. The reason I made these shared was that I would need the beginning time to calculate the time elapsed since the program began.
The end does not need to be global (and should not be). Only begin needs to be global.
When end is global, multiple threads can access it at the same time. This is a race condition and is UB (undefined behavior).
Make end a function scoped variable.
You're not showing the code for put or the [ring?] queue definition.
Access to the queue should be protected with a mutex or stdatomic.h primitives.
Although a bit trickier to implement, I usually prefer the atomic functions.
Also, I agree that the code should be using CLOCK_MONOTONIC.
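The two suggestions above (a function-scoped end and CLOCK_MONOTONIC) can be sketched roughly as follows. The names start_clock, elapsed_seconds and timespec_diff are made up for illustration and are not from the question. One further observation, offered as an assumption about the setup described: a global declared in a header is not shared between processes, so a program started with exec() begins with its own zero-initialized begin, which would be consistent with the epoch-sized values in receivelog.txt. The start time would have to be handed to the second program explicitly.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Start time: written once, before any other thread is created. */
static struct timespec begin;

/* Difference b - a in seconds. A negative tv_nsec difference is fine,
   because it is scaled and added to the whole-second difference. */
static double timespec_diff(const struct timespec *a, const struct timespec *b)
{
    return (double)(b->tv_sec - a->tv_sec)
         + (double)(b->tv_nsec - a->tv_nsec) * 1e-9;
}

/* Hypothetical helper: record the start of the run. */
void start_clock(void)
{
    clock_gettime(CLOCK_MONOTONIC, &begin);
}

/* Hypothetical helper: safe to call from several threads at once,
   because `now` lives on each caller's stack instead of in a global. */
double elapsed_seconds(void)
{
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    return timespec_diff(&begin, &now);
}
```

In the receiving thread, d.time = elapsed_seconds(); would then replace the four lines that use the global end.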
Related
I want to work on signal handlers in the context of two independent processes, a writer and a reader, for notification. The writer sends a first signal, SIGUSR1, to the reader, which loops until it receives the second signal, SIGUSR2, from the writer.
reader.c
#include <signal.h>
#include <stdio.h>
#include <unistd.h>
static volatile sig_atomic_t done_waiting;
/* signal handlers return void, not int */
void handler1(int signal){
done_waiting = 0;
while( !done_waiting ){
(void)fprintf(stdout, " reader waiting for sigusr2: done_waiting = %d\n", done_waiting );
}
(void)fprintf(stdout, " reader received sigusr2 \n");
}
void handler2 (int signal){
done_waiting = 1;
}
int main(void){
signal(SIGUSR1, handler1);
signal(SIGUSR2, handler2);
sleep(5); // sleep till we start writer
}
In writer.c, signals are sent to the reader as
main(){
kill(pid_reader, SIGUSR1);
sleep(5);
kill (pid_reader, SIGUSR2);
}
When I execute the reader first, followed by the writer, the program quits at the while loop, and the writer prints "No matching processes belonging to you were found".
Is nesting signal handlers allowed and, if yes, is it recommended? Also, is there an alternative mechanism for the writer to notify the reader that it is ready?
Did you perhaps mean nested signals rather than nested signal handlers? That is, what happens if SIGUSR2 is received while the handler for SIGUSR1 is executing? I assume so.
I tested your code with some modifications. To get the pid of the reader process into the writer process, I passed it as an argument to main.
The results I get are:
First reader is quiet
After receiving SIGUSR1 it starts continuously writing that it waits for SIGUSR2
When receiving SIGUSR2, it prints "reader received SIGUSR2"
This indicates that it is possible to have nested signals. However I would not say it is recommended as an intentional design.
As mentioned in the comments, you should do as little as possible in the signal handlers, definitely not loop in a while-loop.
And as also mentioned in the comments, be very careful which functions you call in signal context. printf() is not async-signal-safe, so it is not OK there, even though it may seem to work fine.
Tested on Linux, with the ancient kernel 3.16 and gcc 4.9
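As a rough sketch of the "do as little as possible in the handler" advice: each handler below only sets a flag, and the waiting happens in normal code with sigsuspend(). The names install_handlers and wait_for_both are invented for this example, and it assumes a single-threaded reader (sigprocmask is only specified for single-threaded processes).

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t got_usr1;
static volatile sig_atomic_t got_usr2;

/* Handlers only record that the signal arrived. */
static void on_usr1(int sig) { (void)sig; got_usr1 = 1; }
static void on_usr2(int sig) { (void)sig; got_usr2 = 1; }

/* Hypothetical helper: install both handlers. Returns 0 on success. */
int install_handlers(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = on_usr1;
    if (sigaction(SIGUSR1, &sa, NULL) == -1)
        return -1;
    sa.sa_handler = on_usr2;
    return sigaction(SIGUSR2, &sa, NULL);
}

/* Hypothetical helper: sleep until both signals have been seen. */
void wait_for_both(void)
{
    sigset_t block, old;
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    sigaddset(&block, SIGUSR2);
    /* Block the signals while testing the flags, so a signal cannot
       slip in between the test and the call to sigsuspend(). */
    sigprocmask(SIG_BLOCK, &block, &old);
    while (!got_usr1 || !got_usr2)
        sigsuspend(&old);   /* atomically restore the old mask and sleep */
    sigprocmask(SIG_SETMASK, &old, NULL);
}
```

The reader's main would call install_handlers() and then wait_for_both(); any printing happens after that returns, outside signal context.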
I am writing a signal handler in a server where I need to receive all the signals sent by clients before returning to the main program of the server. The signals are real-time in nature. My code is given below:
static void handlerB(int signum, siginfo_t *info, void *context){
id[count] = info->si_pid;
printf("A's process ID: %d \n", (int)id[count]);
count++;
kill (pid, SIGCONT);
}
I need to wait for all the sent signals to be received by the handler, so that "count" is set to the total number of signals received, before sending SIGCONT to the main program. In other words, the handler will be called only once during execution of the program. Does anyone know how to do this? Thanks in advance!
I have a capture program which, in addition to capturing data and writing it into a file, also prints some statistics. The function that prints the statistics
static void report(void)
{
/*Print statistics*/
}
is called roughly every second using an alarm that expires every second. So the program looks like this:
void capture_program()
{
pthread_t report_thread;
while(1)
{
if(pthread_create(&report_thread,NULL,report,NULL)){
fprintf(stderr,"Error creating reporting thread! \n");
}
/*
Capturing code
--------------
--------------
*/
if(doreport) {
/*wakeup the sleeping thread.*/
}
}
}
void *report(void *param)
{
//access some register from hardware
//sleep for a second
}
The expiry of the timer sets the doreport flag. If this flag is set, report() is called, which clears the flag.
How do I wake up the sleeping thread (that runs the report()) when the timer goes off in the main thread?
You can sleep a thread using sigwait, and then signal that thread to wake up with pthread_kill. Kill sounds bad, but it doesn't kill the thread, it sends a signal. This method is very fast. It was much faster than condition variables. I am not sure it is easier, harder, safer or more dangerous, but we needed the performance so we went this route.
in startup code somewhere:
sigemptyset(&fSigSet);
sigaddset(&fSigSet, SIGUSR1);
sigaddset(&fSigSet, SIGSEGV);
to sleep, the thread does this:
int nSig;
sigwait(&fSigSet, &nSig);
to wake up (done from any other thread)
pthread_kill(pThread, SIGUSR1);
or to wake up you could do this:
tgkill(nPid, nTid, SIGUSR1);
Our code calls this on the main thread before creating child threads. I'm not sure why this would be required.
pthread_sigmask(SIG_BLOCK, &fSigSet, NULL);
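Putting the pieces of this answer together, a minimal sketch might look like the following. It also hints at why the pthread_sigmask call matters: sigwait() expects the signals in its set to be blocked, and threads inherit the signal mask of the thread that creates them, so blocking before pthread_create covers every thread. The names sleeper and sigwait_demo are invented here, and SIGSEGV is left out of the set to keep the example small.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>

static sigset_t wake_set;   /* signals the sleeping thread waits for */
static int wakeups;         /* read by the caller only after pthread_join */

static void *sleeper(void *arg)
{
    int sig;
    (void)arg;
    /* Sleeps until a signal from wake_set is pending for this thread. */
    if (sigwait(&wake_set, &sig) == 0 && sig == SIGUSR1)
        wakeups++;
    return NULL;
}

/* Hypothetical demo: returns 1 if the thread was woken, -1 on error. */
int sigwait_demo(void)
{
    pthread_t tid;

    sigemptyset(&wake_set);
    sigaddset(&wake_set, SIGUSR1);
    /* Block before creating threads: the new thread inherits the mask,
       so SIGUSR1 stays pending until sigwait() collects it instead of
       triggering the default action (terminating the process). */
    if (pthread_sigmask(SIG_BLOCK, &wake_set, NULL) != 0)
        return -1;
    if (pthread_create(&tid, NULL, sleeper, NULL) != 0)
        return -1;
    if (pthread_kill(tid, SIGUSR1) != 0)   /* wake the sleeper */
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;
    return wakeups;
}
```

Because the signal is blocked, it does not matter whether pthread_kill runs before or after the thread reaches sigwait(): the signal simply stays pending until it is collected.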
How do I wake up the sleeping thread (that runs the report()) when the
timer goes off in the main thread?
I think a condition variable is the mechanism you are looking for. Have the report-thread block on the condition variable, and the main thread signal the condition variable whenever you want the report-thread to wake up (see the link for more detailed instructions).
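A minimal sketch of that arrangement, assuming the flag is called doreport as in the question (report_thread, request_report and condvar_demo are names made up for this example):

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  wake = PTHREAD_COND_INITIALIZER;
static int doreport;       /* protected by lock */
static int reports_done;   /* protected by lock */

static void *report_thread(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    while (!doreport)                     /* loop guards against spurious wake-ups */
        pthread_cond_wait(&wake, &lock);  /* releases lock while sleeping */
    doreport = 0;
    reports_done++;                       /* stand-in for printing statistics */
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Hypothetical helper: called by the main thread once it has noticed
   that the timer expired. */
void request_report(void)
{
    pthread_mutex_lock(&lock);
    doreport = 1;
    pthread_cond_signal(&wake);
    pthread_mutex_unlock(&lock);
}

/* Hypothetical demo: returns the number of reports produced. */
int condvar_demo(void)
{
    pthread_t tid;
    if (pthread_create(&tid, NULL, report_thread, NULL) != 0)
        return -1;
    request_report();
    if (pthread_join(tid, NULL) != 0)
        return -1;
    return reports_done;
}
```

Note that request_report() should be called from normal thread context, not from inside the SIGALRM handler, since the pthread mutex and condition functions are not async-signal-safe.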
I had a similar issue when coding a UDP chat server: thread_1 only does work when an alarm interrupt arrives (a timeout to check whether the client is still alive) or when another thread, thread_2 (which serves client requests), signals it. What I did was put thread_1 to sleep with sleep(n*TICK_TIMER), where TICK_TIMER is the alarm expiration value and n is some integer > 1, and wake it up with a SIGALRM signal. See the sleep() documentation.
The alarm handler (to use it you have to install it first: "signal(SIGALRM, tick_handler); alarm(5);")
void tick_handler(int sig){ (void)sig; tick_flag++; alarm(5); }
re-arms the alarm, so a SIGALRM is sent each time the timeout expires.
And the command to wake the sleeping thread_1 from another thread_2 is:
pthread_kill(X,SIGALRM);
where X is a pthread_t type. If your thread_1 is your main thread, you can get this number by pthread_t X = pthread_self();
I'm trying to do an assignment for one of my classes and no professors/fellow classmates are getting back to me. So before you answer, please don't give me any exact answers! Only explanations!
What I have to do is write a c program (timeout.c) that takes in two command line arguments, W and T, where W is the amount of time in seconds the child process should take before exiting, and T is the amount of time the parent process should wait for the child process, before killing the child process and printing out a "Time Out" message. Basically, if W > T, there should be a timeout. Otherwise, the child should finish its work and then no timeout message is printed.
What I wanted to do was just have the parent process sleep for T seconds, then kill the child process and print out the timeout message. However, the timeout message would then be printed in both cases. How do I check whether the child process has terminated? I was told to use alarm() for this, but I have no idea what purpose that function would serve.
Here's my code in case anyone wants to take a look:
void handler (int sig) {
return;
}
int main(int argc, char* argv[]){
if (argc != 3) {
printf ("Please enter values W and T, where W\n");
printf ("is the number of seconds the child\n");
printf ("should do work for, and T is the number\n");
printf ("of seconds the parent process should wait.\n");
printf ("-------------------------------------------\n");
printf ("./executable <W> <T>\n");
return 1; /* do not fall through and read missing arguments */
}
pid_t pid;
unsigned int work_seconds = (unsigned int) atoi(argv[1]);
unsigned int wait_seconds = (unsigned int) atoi(argv[2]);
if ((pid = fork()) == 0) {
/* child code */
sleep(work_seconds);
printf("Child done.\n");
exit(0);
}
sleep(wait_seconds);
kill(pid, SIGKILL);
printf("Time out.");
exit(0);
}
Although waitpid would get you the return status of the child, its default usage would force the parent to wait until the child terminates.
But your requirement (if I understood correctly) is only for the parent to wait for a certain time, and alarm() can be used to do that.
Then you should use waitpid() with a specific option that returns immediately if the child has not exited yet (study the API's parameters). If the child has not exited, you can kill it; otherwise you have already received its return status.
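Since the question asked for explanations rather than a finished solution, the following is only a sketch of the non-blocking check described above, using a plain sleep() in the parent rather than alarm(). The option in question is WNOHANG. The function name run_with_timeout is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that "works" for `work` seconds and
   give it `limit` seconds. Returns 0 if the child finished on its own,
   1 if it had to be killed (timeout), -1 on error. */
int run_with_timeout(unsigned work, unsigned limit)
{
    int status;
    pid_t r;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {            /* child */
        sleep(work);
        _exit(0);
    }

    sleep(limit);              /* parent waits out the full limit */

    /* WNOHANG: return at once; 0 means "child exists, not exited yet". */
    r = waitpid(pid, &status, WNOHANG);
    if (r == pid)
        return 0;              /* child already done, status collected */
    if (r == 0) {
        kill(pid, SIGKILL);
        waitpid(pid, &status, 0);   /* reap the killed child */
        return 1;
    }
    return -1;
}
```

The drawback of this version is that the parent always sleeps for the whole limit, even if the child finishes early.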
You want the timeout program to stop more or less as soon as the command finishes, so if you say timeout -t 1000 sleep 1 the protecting program stops after about 1 second, not after 1000 seconds.
The way to do that is to set an alarm of some sort — classically, with the alarm() system call and a signal handler for SIGALRM — and then have the main process execute wait() or waitpid() so that when the child dies, it wakes up and collects the corpse. If the parent process gets the alarm signal, it can print its message and send death threats of some sort to its child. It might be sensible to try SIGTERM and/or SIGHUP before resorting to SIGKILL; the SIGTERM and SIGHUP signals give the errant child a chance to clean up whereas SIGKILL does not.
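A rough sketch of that approach, with an invented function name (wait_with_alarm) and SIGTERM as the only signal sent. The handler is installed without SA_RESTART on purpose, so that the blocking waitpid() is interrupted and fails with EINTR when the alarm fires. A small race remains if the alarm fires just before waitpid() is entered, which a production version would need to close.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t timed_out;

static void on_alarm(int sig) { (void)sig; timed_out = 1; }

/* Hypothetical helper: returns 0 if the child finished in time, 1 if it
   was terminated after `limit` seconds, -1 on error. `limit` must be
   nonzero, because alarm(0) cancels the alarm instead of arming it. */
int wait_with_alarm(unsigned work, unsigned limit)
{
    struct sigaction sa;
    int status, killed = 0;
    pid_t pid;

    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = on_alarm;      /* sa_flags == 0: no SA_RESTART */
    if (sigaction(SIGALRM, &sa, NULL) == -1)
        return -1;

    timed_out = 0;
    pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {                /* child */
        sleep(work);
        _exit(0);
    }

    alarm(limit);
    for (;;) {
        if (timed_out && !killed) {
            kill(pid, SIGTERM);    /* polite first; SIGKILL could follow */
            killed = 1;
        }
        if (waitpid(pid, &status, 0) == pid)
            break;                 /* child collected */
        if (errno != EINTR) {
            alarm(0);
            return -1;
        }
    }
    alarm(0);                      /* cancel the alarm if still pending */
    return killed;
}
```

Unlike a plain sleep() in the parent, this returns as soon as the child exits, so a short job under a long timeout does not keep the parent waiting.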
If you know how to manage signals, you could catch SIGALRM and SIGCHLD in your parent process. SIGCHLD will be raised when the child terminates, and SIGALRM when the timer expires. If the first signal raised is SIGALRM, the timeout expired; otherwise, if the first signal the parent catches is SIGCHLD, the child stopped before the timeout expired.
wait() or waitpid() would still be necessary to collect the terminated child.
My question is similar to "How do I check if a thread is terminated when using pthread?", but I did not quite get an answer.
My problem is this: I create a certain number of threads, say n. As soon as main detects the exit of any one thread, it creates another thread, thus keeping the degree of concurrency at n, and so on.
How does the main thread detect the exit of a thread? pthread_join waits for a particular thread to exit, but in my case it can be any one of the n threads.
Thanks
Most obvious, without restructuring your code as aix suggests, is to have each thread set something to indicate that it has finished (probably a value in an array shared between all threads, one slot per worker thread), and then signal a condition variable. Main thread waits on the condition variable and each time it wakes up, handle all threads that have indicated themselves finished: there may be more than one.
Of course that means that if the thread is cancelled you never get signalled, so use a cancellation handler or don't cancel the thread.
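A sketch of this scheme, with one "finished" slot per worker and a single condition variable. All names (NWORKERS, finished, reap_workers) are made up for illustration, and the worker body is left empty.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>

enum { NWORKERS = 4 };

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  done_cv = PTHREAD_COND_INITIALIZER;
static int finished[NWORKERS];        /* 1 = exited, not yet joined */
static pthread_t workers[NWORKERS];
static int slots[NWORKERS];           /* each worker's own slot index */

static void *worker(void *arg)
{
    int slot = *(int *)arg;
    /* ... the real work would go here ... */
    pthread_mutex_lock(&lock);
    finished[slot] = 1;               /* announce completion */
    pthread_cond_signal(&done_cv);
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Hypothetical demo: start NWORKERS threads and join each one as it
   reports completion. Returns the number of threads joined. */
int reap_workers(void)
{
    int joined = 0;

    for (int i = 0; i < NWORKERS; i++) {
        slots[i] = i;
        if (pthread_create(&workers[i], NULL, worker, &slots[i]) != 0)
            return -1;
    }

    pthread_mutex_lock(&lock);
    while (joined < NWORKERS) {
        int found = 0;
        /* Handle every finished thread: there may be more than one. */
        for (int i = 0; i < NWORKERS; i++) {
            if (finished[i]) {
                finished[i] = 0;
                pthread_join(workers[i], NULL);
                joined++;             /* a replacement could be started here */
                found = 1;
            }
        }
        if (!found)
            pthread_cond_wait(&done_cv, &lock);
    }
    pthread_mutex_unlock(&lock);
    return joined;
}
```

Joining while holding the lock is safe here because a worker sets its slot as its last locked action, so by the time the main thread sees the flag the worker has already released the mutex.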
There are several ways to solve this.
One natural way is to have a thread pool of fixed size n and have a queue into which the main thread would place tasks and from which the workers would pick up tasks and process them. This will maintain a constant degree of concurrency.
An alternative is to have a semaphore with the initial value set to n. Every time a worker thread is created, the value of the semaphore would need to be decremented. Whenever a worker is about to terminate, it would need to increment ("post") the semaphore. Now, waiting on the semaphore in the main thread will block until there's fewer than n workers left; a new worker thread would then be spawned and the wait resumed. Since you won't be using pthread_join on the workers, they should be detached (pthread_detach).
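The semaphore variant might look roughly like this, assuming a platform with unnamed POSIX semaphores (sem_init), such as Linux. MAX_ACTIVE, TOTAL_JOBS, job and run_jobs are hypothetical names, and the job body only counts itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

enum { MAX_ACTIVE = 3, TOTAL_JOBS = 8 };

static sem_t free_slots;              /* counts unused worker slots */
static pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;
static int jobs_done;

static void *job(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&count_lock);
    jobs_done++;                      /* stand-in for real work */
    pthread_mutex_unlock(&count_lock);
    sem_post(&free_slots);            /* last action: give the slot back */
    return NULL;
}

/* sem_wait() can be interrupted by a signal; retry in that case. */
static int sem_wait_retry(sem_t *s)
{
    while (sem_wait(s) == -1)
        if (errno != EINTR)
            return -1;
    return 0;
}

/* Hypothetical demo: run TOTAL_JOBS jobs with at most MAX_ACTIVE alive
   at any time. Returns the number of jobs completed, -1 on error. */
int run_jobs(void)
{
    pthread_attr_t attr;

    if (sem_init(&free_slots, 0, MAX_ACTIVE) != 0)
        return -1;
    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);

    for (int i = 0; i < TOTAL_JOBS; i++) {
        pthread_t tid;
        if (sem_wait_retry(&free_slots) != 0)   /* blocks while all slots are busy */
            return -1;
        if (pthread_create(&tid, &attr, job, NULL) != 0)
            return -1;
    }
    /* Take every slot back: once this loop ends, all jobs have posted. */
    for (int i = 0; i < MAX_ACTIVE; i++)
        if (sem_wait_retry(&free_slots) != 0)
            return -1;

    pthread_attr_destroy(&attr);
    return jobs_done;
}
```

The threads are created detached through the attribute object, which matches the remark above about not using pthread_join on the workers.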
If you want to be informed of a thread exiting (via pthread_exit or cancellation), you can use a handler with pthread_cleanup_push to inform the main thread of the child exiting (via a condition variable, semaphore or similar) so it can either wait on it, or simply start a new one (assuming the child is detached first).
Alternately, I'd suggest having the threads wait for more work (as suggested by #aix), rather than ending.
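The cleanup-handler idea from this answer can be sketched as below. The handler posts a semaphore, so the main thread is told about the exit whether the worker returns normally, calls pthread_exit, or is cancelled. The names notify_exit and cleanup_demo are invented, and unnamed semaphores (sem_init) are assumed to be available.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <unistd.h>

static sem_t exited;                  /* posted once per terminating worker */

static void notify_exit(void *arg)
{
    (void)arg;
    sem_post(&exited);
}

static void *worker(void *arg)
{
    int wait_for_cancel = *(int *)arg;

    pthread_cleanup_push(notify_exit, NULL);
    while (wait_for_cancel)
        sleep(1);                     /* sleep() is a cancellation point */
    pthread_cleanup_pop(1);           /* nonzero: run the handler on the normal path too */
    return NULL;
}

/* Hypothetical demo: one worker ends normally, one is cancelled.
   Returns the number of exit notifications received. */
int cleanup_demo(void)
{
    static int normal = 0, blocked = 1;
    pthread_t a, b;
    int seen = 0;

    if (sem_init(&exited, 0, 0) != 0)
        return -1;
    if (pthread_create(&a, NULL, worker, &normal) != 0)
        return -1;
    if (pthread_create(&b, NULL, worker, &blocked) != 0)
        return -1;
    pthread_detach(a);
    pthread_detach(b);

    pthread_cancel(b);                /* handler still runs for the cancelled thread */

    while (seen < 2) {
        if (sem_wait(&exited) == 0)
            seen++;                   /* a replacement thread could be started here */
        else if (errno != EINTR)
            return -1;
    }
    return seen;
}
```

pthread_cleanup_push and pthread_cleanup_pop are macros that must appear as a matched pair in the same block, which is why the pop sits at the end of worker() rather than in a helper.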
If your parent thread needs to do other things, then it can't just constantly block on pthread_join. You will need a way to send a message to the main thread from the child thread to tell it to call pthread_join. There are a number of IPC mechanisms that you could use for this.
When a child thread has done its work, it would then send some sort of message to the main thread through IPC saying "I completed my job" and also pass its own thread id, so the main thread knows to call pthread_join on that thread id.
One easy way is to use a pipe as a communication channel between the (worker) threads and your main thread. When a thread terminates it writes its result (thread id in the following example) to the pipe. The main thread waits on the pipe and reads the thread result from it as soon as it becomes available.
Unlike a mutex or semaphore, a pipe file descriptor can easily be handled by the application's main event loop (such as libevent). Writes from different threads to the same pipe are atomic as long as each write is PIPE_BUF bytes or fewer (4096 on my Linux).
Below is a demo that creates ten threads each of which has a different life span. Then the main thread waits for any thread to terminate and prints its thread id. It terminates when all ten threads have completed.
$ cat test.cc
#include <iostream>
#include <pthread.h>
#include <unistd.h>
#include <stdlib.h>
#include <time.h>
void* thread_fun(void* arg) {
// do something
unsigned delay = rand() % 10;
usleep(delay * 1000000);
// notify termination
int* thread_completed_fd = static_cast<int*>(arg);
pthread_t thread_id = pthread_self();
if(sizeof thread_id != write(*thread_completed_fd, &thread_id, sizeof thread_id))
abort();
return 0;
}
int main() {
int fd[2];
if(pipe(fd))
abort();
enum { THREADS = 10 };
time_t start = time(NULL);
// start threads
for(int n = THREADS; n--;) {
pthread_t thread_id;
if(pthread_create(&thread_id, NULL, thread_fun, fd + 1))
abort();
std::cout << time(NULL) - start << " sec: started thread " << thread_id << '\n';
}
// wait for the threads to finish
for(int n = THREADS; n--;) {
pthread_t thread_id;
if(sizeof thread_id != read(fd[0], &thread_id, sizeof thread_id))
abort();
if(pthread_join(thread_id, NULL)) // detached threads don't need this call
abort();
std::cout << time(NULL) - start << " sec: thread " << thread_id << " has completed\n";
}
close(fd[0]);
close(fd[1]);
}
$ g++ -o test -pthread -Wall -Wextra -march=native test.cc
$ ./test
0 sec: started thread 140672287479552
0 sec: started thread 140672278759168
0 sec: started thread 140672270038784
0 sec: started thread 140672261318400
0 sec: started thread 140672252598016
0 sec: started thread 140672243877632
0 sec: started thread 140672235157248
0 sec: started thread 140672226436864
0 sec: started thread 140672217716480
0 sec: started thread 140672208996096
1 sec: thread 140672208996096 has completed
2 sec: thread 140672226436864 has completed
3 sec: thread 140672287479552 has completed
3 sec: thread 140672243877632 has completed
5 sec: thread 140672252598016 has completed
5 sec: thread 140672261318400 has completed
6 sec: thread 140672278759168 has completed
6 sec: thread 140672235157248 has completed
7 sec: thread 140672270038784 has completed
9 sec: thread 140672217716480 has completed