I want to write a C program that runs for a specified number of seconds,
say 10 seconds, and then exits. The code should set up an interrupt to go
off after the specified amount of time has elapsed.
Here is my attempt. But I am not sure if SIGALRM is the correct way to do it.
Can SIGALRM be called an interrupt?
#include <stdio.h>
#include <signal.h>
#include <unistd.h>
#include <stdlib.h>
void handler()
{
_exit(0);
}
int main()
{
signal(SIGALRM, handler);
alarm(10);
for (;;); /* Dummy code: whatever work goes here should stop after 10 seconds. */
return 0;
}
Any suggestions/alternatives/better way to achieve this?
The wording "signal" vs. "interrupt" is not fully clear. Signals can interrupt system calls, so a signal is an interrupt in this sense. But a signal is not a hardware interrupt. When you use an operating system, normal programs often don't have direct access to hardware interrupts.
Calling _exit from the signal handler might be problematic if your program needs to finish a task or to clean up something.
I suggest implementing a graceful end by setting a flag. I also suggest using sigaction instead of signal, because the semantics of signal, and of signal handlers set up with it, are implementation-dependent.
#include <stdio.h>
#include <signal.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
static volatile sig_atomic_t timeout = 0;
void handler(int sig)
{
(void) sig;
timeout = 1;
}
int main(void)
{
struct sigaction act;
memset(&act, 0, sizeof(act));
act.sa_handler = handler;
if(sigaction(SIGALRM, &act, NULL) < 0)
{
// handle error
}
alarm(10);
while(!timeout /* and maybe other conditions */)
{
// do something, handle error return codes and errno (EINTR)
// check terminate flag as necessary
}
// clean up if necessary
return 0;
}
Explanation (as requested in a comment)
static volatile sig_atomic_t timeout = 0;
sig_atomic_t is a type that guarantees atomic access even in the presence of asynchronous interrupts made by signals. That means an access to the variable cannot be interrupted in between, i.e. the software will never see a partially modified value. (see https://en.cppreference.com/w/c/program/sig_atomic_t)
volatile informs the compiler not to optimize access to the variable. This is necessary because the signal handler may modify the value while the main function is running the loop that is intended to check the flag. Otherwise the compiler might optimize the access out of the loop condition and do it only once before the loop because the variable is never modified inside the loop. (see https://en.cppreference.com/w/c/language/volatile)
I have some code written in C (working on ubuntu 17):
void sig_stop(int sig_num) {
/* Some cleanup that needs to be done */
}
void some_routine(const char *array[], const int length) {
/* Initialization */
signal(SIGTERM, sig_stop);
while (true) {
/* Some function */
/* I have this sleep to minimize the load on the CPU
as I don't need to check the conditions here
all the time. */
sleep(5);
}
}
Whenever I include the 5 minute sleep (sleep(5)), it appears sig_stop isn't called. However, when I comment out the sleep(5), the sig_stop cleanup works just fine. Have I got something wrong with my understanding of how to catch SIGTERM?
If I can't use the sleep function, is there a better way to "sleep" the program so that it only runs the loop every x minutes, or in a way that minimizes the CPU load?
sleep() and signals
sleep() should not prevent the signal from being caught and the signal handler being executed. From the manpage for sleep() (emphasis mine):
sleep() causes the calling thread to sleep either until the number of real-time seconds specified in seconds have elapsed or until a signal arrives which is not ignored.
Take the following example ...
#include <signal.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>
static volatile sig_atomic_t flag = 0;
static void sig_stop(int signum) { flag = 1; }
int main(void) {
int secs_remaining = 0;
signal(SIGTERM, sig_stop);
while (!flag) {
printf("Sleeping at time %ld\n", (long) time(NULL));
secs_remaining = sleep(5);
}
printf(
"Flag raised. Exiting at time %ld. sleep() was interrupted %d seconds "
"early ...\n",
(long) time(NULL), secs_remaining);
return 0;
}
Note that - in the case where it was interrupted by a signal - sleep() returns the number of seconds left to sleep. E.g., if it is interrupted 3 seconds early it will return 3. It will return 0 if it is not interrupted.
Compile as gcc -o test test.c and run. Then from another terminal run
pkill -15 test
You will see output similar to the following ...
Sleeping at time 1532273709
Flag raised. Exiting at time 1532273711. sleep() was interrupted 2 seconds early ...
By the way ... sleep(x) sleeps for x seconds - not minutes.
signal() vs sigaction()
Due to portability issues associated with signal(), it is often recommended to use sigaction() instead. The use of sigaction() would be something like the following.
int main(void) {
struct sigaction sa;
sa.sa_flags = 0;
sigemptyset(&sa.sa_mask);
sa.sa_handler = sig_stop;
if (sigaction(SIGTERM, &sa, NULL) == -1) {
perror("sigaction");
return 1;
}
// Etc.
}
As you can see the usage of sigaction() is a little more verbose than that of signal(). Perhaps that's why people still sometimes use signal().
I am trying to get the memory consumed by an algorithm, so I have created a group of functions that would stop the execution in periods of 10 milliseconds to let me read the memory using the getrusage() function. The idea is to set a timer that will raise an alarm signal to the process which will be received by a handler medir_memoria().
However, the program stops in the middle with this message:
[1] 3267 alarm ./memory_test
The code for reading the memory is:
#include "../include/rastreador_memoria.h"
#if defined(__linux__) || defined(__APPLE__) || (defined(__unix__) && !defined(_WIN32))
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <signal.h>
#include <sys/resource.h>
static long max_data_size;
static long max_stack_size;
void medir_memoria (int sig)
{
struct rusage info_memoria;
if (getrusage(RUSAGE_SELF, &info_memoria) < 0)
{
perror("Not reading memory");
}
max_data_size = (info_memoria.ru_idrss > max_data_size) ? info_memoria.ru_idrss : max_data_size;
max_stack_size = (info_memoria.ru_isrss > max_stack_size) ? info_memoria.ru_isrss : max_stack_size;
signal(SIGALRM, medir_memoria);
}
void rastrear_memoria ()
{
struct itimerval t;
t.it_interval.tv_sec = 0;
t.it_interval.tv_usec = 10;
t.it_value.tv_sec = 0;
t.it_value.tv_usec = 10;
max_data_size = 0;
max_stack_size = 0;
setitimer(ITIMER_REAL, &t,0);
signal(SIGALRM, medir_memoria);
}
void detener_rastreo ()
{
signal(SIGALRM, SIG_DFL);
printf("Data: %ld\nStack: %ld\n", max_data_size, max_stack_size);
}
#else
#endif
The main() function works calling all of them in this order:
rastrear_memoria()
Function of the algorithm I am testing
detener_rastreo()
How can I solve this? What does that alarm message mean?
First, setting an itimer to ring every 10 µs is optimistic, since ten microseconds is really a small interval of time. Try with 500 µs (or perhaps even 20 milliseconds, i.e. 20000 µs) instead of 10 µs first.
stop the execution in periods of 10 milliseconds
You have coded for a period of 10 microseconds, not milliseconds!
Then, you should exchange the two lines and code:
signal(SIGALRM, medir_memoria);
setitimer(ITIMER_REAL, &t,0);
so that a signal handler is set before the first itimer rings.
I guess your first itimer rings before the signal handler was installed. Read carefully signal(7) and time(7). The default handling of SIGALRM is termination.
BTW, a better way to measure the time used by some function is clock_gettime(2) or clock(3). Thanks to vdso(7) tricks, clock_gettime is able to get some clock in less than 50 nanoseconds on my i5-4690S desktop computer.
trying to get the memory consumed
You could consider using proc(5) e.g. opening, reading, and closing quickly /proc/self/status or /proc/self/statm etc....
(I guess you are on Linux)
BTW, your measurements will disappoint you: notice that quite often free(3) doesn't release memory to the kernel (through munmap(2)...) but simply marks and manages that zone so that future calls to malloc(3) can reuse it. You might consider mallinfo(3) or malloc_info(3), but notice that they are not async-signal-safe and so cannot be called from inside a signal handler.
(I tend to believe that your approach is deeply flawed)
For some reason I thought that calling pthread_exit(NULL) at the end of a main function would guarantee that all running threads (at least created in the main function) would finish running before main could exit. However when I run this code below without calling the two pthread_join functions (at the end of main) explicitly I get a segmentation fault, which seems to happen because the main function has been exited before the two threads finish their job, and therefore the char buffer is not available anymore. However when I include these two pthread_join function calls at the end of main it runs as it should. To guarantee that main will not exit before all running threads have finished, is it necessary to call pthread_join explicitly for all threads initialized directly in main?
#include <stdlib.h>
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>
#include <assert.h>
#include <semaphore.h>
#define NUM_CHAR 1024
#define BUFFER_SIZE 8
typedef struct {
pthread_mutex_t mutex;
sem_t full;
sem_t empty;
char* buffer;
} Context;
void *Reader(void* arg) {
Context* context = (Context*) arg;
for (int i = 0; i < NUM_CHAR; ++i) {
sem_wait(&context->full);
pthread_mutex_lock(&(context->mutex));
char c = context->buffer[i % BUFFER_SIZE];
pthread_mutex_unlock(&(context->mutex));
sem_post(&context->empty);
printf("%c", c);
}
printf("\n");
return NULL;
}
void *Writer(void* arg) {
Context* context = (Context*) arg;
for (int i = 0; i < NUM_CHAR; ++i) {
sem_wait(&context->empty);
pthread_mutex_lock(&(context->mutex));
context->buffer[i % BUFFER_SIZE] = 'a' + (rand() % 26);
float ranFloat = (float) rand() / RAND_MAX;
if (ranFloat < 0.5) sleep(0.2);
pthread_mutex_unlock(&(context->mutex));
sem_post(&context->full);
}
return NULL;
}
int main() {
char buffer[BUFFER_SIZE];
pthread_t reader, writer;
Context context;
srand(time(NULL));
int status = 0;
status = pthread_mutex_init(&context.mutex, NULL);
status = sem_init(&context.full,0,0);
status = sem_init(&context.empty,0, BUFFER_SIZE);
context.buffer = buffer;
status = pthread_create(&reader, NULL, Reader, &context);
status = pthread_create(&writer, NULL, Writer, &context);
pthread_join(reader,NULL); // This line seems to be necessary
pthread_join(writer,NULL); // This line seems to be necessary
pthread_exit(NULL);
return 0;
}
If that is the case, how could I handle the case where plenty of identical threads (like in the code below) would be created using the same thread identifier? In that case, how can I make sure that all the threads will have finished before main exits? Do I really have to keep an array of NUM_STUDENTS pthread_t identifiers to be able to do this? I guess I could do this by letting the Student threads signal a semaphore and then let the main function wait on that semaphore, but is there really no easier way to do this?
int main()
{
pthread_t thread;
for (int i = 0; i < NUM_STUDENTS; i++)
pthread_create(&thread,NULL,Student,NULL); // Threads
// Make sure that all student threads have finished
exit(0);
}
pthread_exit() is a function called by a thread to terminate its own execution. For the situation you've given it is not to be called from your main program thread.
As you have figured out, pthread_join() is the correct means to wait for the completion of a joinable thread from main().
Also as you've figured out, you need to keep the thread ID that pthread_create() stores through its first argument so that you can pass it to pthread_join().
What this means is that you cannot use the same pthread_t variable for all the threads you create if you intend to use pthread_join().
Rather, build an array of pthread_t so that you have a copy of each thread's ID.
Quite aside from whether the program should or should not terminate when the main thread calls pthread_exit, the documentation for pthread_exit() says
The pthread_exit() function terminates the calling thread
And also:
After a thread has terminated, the result of access to local (auto) variables of the thread is undefined.
Since the context is an automatic variable of main(), your code can fall over before it even gets to the point of testing what you want it to test...
A mini saga
You don't mention the environment in which you are running the original code. I modified your code to use nanosleep() (since, as I mentioned in a comment to the question, sleep() takes an integer and therefore sleep(0.2) is equivalent to sleep(0)), and compiled the program on MacOS X 10.6.4.
Without error checking
It works fine; it took about 100 seconds to run with the 0.5 probability factor (as you'd expect; I changed that to 0.05 to reduce the runtime to about 10 seconds), and generated a random string - some of the time.
Sometimes I got nothing, sometimes I got more and sometimes I got less data. But I didn't see a core dump (not even with 'ulimit -c unlimited' to allow arbitrarily large core dumps).
Eventually, I applied some tools and got to see that I always got 1025 characters (1024 generated plus a newline), but quite often, I got 1024 ASCII NUL characters. Sometimes they'd appear in the middle, sometimes at the beginning, etc:
$ ./pth | tpipe -s "vis | ww -w64" "wc -c"
1025
\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000
\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000
\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000
\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000
\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000
\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000
\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000
\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000
\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000
\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000
\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000
\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000
\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000
\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000
\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000
\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000
\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000
\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000
\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000
\000\000\000\000\000\000\000\000ocriexffwgdvdvyfitjtvlzcoffhusjo
zyacniffpsfswesgrkuxycsubufamxxzkrkqnwvsxcbmktodessyohixsmuhdovt
hhertqjjinzoptcuqzertybicrzaeyqlyublbfgutcdvftwkuvxhouiuduoqrftw
xjkgqutpryelzuaerpsbotwyskaflwofseibfqntecyseufqxvzikcyeeikjzsye
qxhjwrjmunntjwhohqovpwcktolcwrvmfvdfsmkvkrptjvslivbfjqpwgvroafzn
fkjumqxjbarelbrdijfrjbtiwnajeqgnobjbksulvcobjkzwwifpvpmpwyzpwiyi
cdpwalenxmocmtdluzouqemmjdktjtvfqwbityzmronwvulfizpizkiuzapftxay
obwsfajcicvcrrjehjeyzsngrwusbejiovaaatyzouktetcerqxjsdpswixjpege
blxscdebfsptxwvwsllvydipovzmnrvoiopmqotydqaujwdykidmwzitdsropguv
vudyfiaaaqueyllnwudfpplcfbsngqqeyucdawqxqzczuwsnaquofreilzvdwbjq
ksrouwltvaktpdrvjnqahpdqdshmmvntspglexggshqbjrvxceaqlfnukedxzlms
cnapdtgtcoyhnglojbjnplowericrzbfulvrobfn
$
(The 'tpipe' program is like 'tee' but it writes to pipes instead of files (and to standard output unless you specify the '-s' option); 'vis' comes from 'The UNIX Programming Environment' by Kernighan & Pike; 'ww' is a 'word wrapper' but there aren't any words here so it brute force wraps at width 64.)
The behaviour I was seeing was highly indeterminate - I'd get different results on each run. I even replaced the random characters with the alphabet in sequence ('a' + i % 26), and was still getting odd behaviour.
I added some debug printing code (and a counter to the context), and it was clear that the semaphore context->full was not working properly for the reader - it was being allowed to go into the mutual exclusion before the writer had written anything.
With error checking
When I added error checking to the mutex and semaphore operations, I found that:
sem_init(&context.full) failed (-1)
errno = 78 (Function not implemented)
So, the weird outputs are because MacOS X does not implement sem_init(). It's odd; the sem_wait() function failed with errno = 9 (EBADF 'Bad file descriptor'); I added the checks there first. Then I checked the initialization...
Using sem_open() instead of sem_init()
The sem_open() calls succeed, which looks good (names "/full.sem" and "/empty.sem", flags O_CREAT, mode values of 0444, 0600, 0700 at different times, and initial values 0 and BUFFER_SIZE, as with sem_init()). Unfortunately, the first sem_wait() or sem_post() operation fails with errno = 9 (EBADF 'Bad file descriptor') again.
Morals
It is important to check error conditions from system calls.
The output I see is non-deterministic because the semaphores don't work.
That doesn't alter the 'it does not crash without the pthread_join() calls' behaviour.
MacOS X does not have a working POSIX semaphore implementation.
There is no need for calling pthread_join(reader,NULL); at all if Context and buffer are declared with static storage duration (as already pointed out by Steve Jessop, caf and David Schwartz).
Declaring Context and buffer static also makes it necessary to change Context *context to Context *contextr or Context *contextw respectively.
In addition, the following rewrite called pthread_exit.c replaces sem_init() with sem_open() and uses nanosleep() (as suggested by Jonathan Leffler).
pthread_exit was tested on Mac OS X 10.6.8 and did not output any ASCII NUL characters.
/*
cat pthread_exit.c (sample code to test pthread_exit() in main())
source:
"pthreads in C - pthread_exit",
http://stackoverflow.com/questions/3330048/pthreads-in-c-pthread-exit
compiled on Mac OS X 10.6.8 with:
gcc -ansi -pedantic -std=gnu99 -Os -Wall -Wextra -Wshadow -Wpointer-arith -Wcast-qual -Wstrict-prototypes \
-Wmissing-prototypes -Wformat=2 -l pthread -o pthread_exit pthread_exit.c
test with:
time -p bash -c './pthread_exit | tee >(od -c 1>&2) | wc -c'
*/
#include <stdlib.h>
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>
#include <assert.h>
#include <semaphore.h>
#include <time.h>
#include <fcntl.h> /* O_CREAT */
void *Reader(void* arg);
void *Writer(void* arg);
// #define NUM_CHAR 1024
#define NUM_CHAR 100
#define BUFFER_SIZE 8
typedef struct {
pthread_mutex_t mutex;
sem_t *full;
sem_t *empty;
const char *semname1;
const char *semname2;
char* buffer;
} Context;
static char buffer[BUFFER_SIZE];
static Context context;
void *Reader(void* arg) {
Context *contextr = (Context*) arg;
for (int i = 0; i < NUM_CHAR; ++i) {
sem_wait(contextr->full);
pthread_mutex_lock(&(contextr->mutex));
char c = contextr->buffer[i % BUFFER_SIZE];
pthread_mutex_unlock(&(contextr->mutex));
sem_post(contextr->empty);
printf("%c", c);
}
printf("\n");
return NULL;
}
void *Writer(void* arg) {
Context *contextw = (Context*) arg;
for (int i = 0; i < NUM_CHAR; ++i) {
sem_wait(contextw->empty);
pthread_mutex_lock(&(contextw->mutex));
contextw->buffer[i % BUFFER_SIZE] = 'a' + (rand() % 26);
float ranFloat = (float) rand() / RAND_MAX;
//if (ranFloat < 0.5) sleep(0.2);
if (ranFloat < 0.5)
nanosleep((struct timespec[]){{0, 200000000L}}, NULL);
pthread_mutex_unlock(&(contextw->mutex));
sem_post(contextw->full);
}
return NULL;
}
int main(void) {
pthread_t reader, writer;
srand(time(NULL));
int status = 0;
status = pthread_mutex_init(&context.mutex, NULL);
context.semname1 = "Semaphore1";
context.semname2 = "Semaphore2";
context.full = sem_open(context.semname1, O_CREAT, 0777, 0);
if (context.full == SEM_FAILED)
{
fprintf(stderr, "%s\n", "ERROR creating semaphore semname1");
exit(EXIT_FAILURE);
}
context.empty = sem_open(context.semname2, O_CREAT, 0777, BUFFER_SIZE);
if (context.empty == SEM_FAILED)
{
fprintf(stderr, "%s\n", "ERROR creating semaphore semname2");
exit(EXIT_FAILURE);
}
context.buffer = buffer;
status = pthread_create(&reader, NULL, Reader, &context);
status = pthread_create(&writer, NULL, Writer, &context);
// pthread_join(reader,NULL); // This line seems to be necessary
// pthread_join(writer,NULL); // This line seems to be necessary
sem_unlink(context.semname1);
sem_unlink(context.semname2);
pthread_exit(NULL);
return 0;
}
pthread_join() is the standard way to wait for the other thread to complete, I would stick to that.
Alternatively, you can create a thread counter and have all child threads increment it by 1 at start, then decrement it by 1 when they finish (with proper locking of course), then have your main() wait for this counter to hit 0. (pthread_cond_wait() would be my choice).
Per normal pthread semantics, as taught e.g. here, your original idea does seem to be confirmed:
If main() finishes before the threads it has created, and exits with pthread_exit(), the other threads will continue to execute. Otherwise, they will be automatically terminated when main() finishes.
However I'm not sure whether that's part of the POSIX threads standard or just a common but not universal "nice to have" add-on tidbit (I do know that some implementations don't respect this constraint -- I just don't know whether those implementations are nevertheless to be considered standard compliant!-). So I'll have to join the prudent chorus recommending the joining of every thread you need to terminate, just to be on the safe side -- or, as Jon Postel put it in the context of TCP/IP implementations:
Be conservative in what you send; be liberal in what you accept.
a "principle of robustness" that should be used way more broadly than just in TCP/IP;-).
pthread_exit(3) exits the thread that calls it (but not the whole process if other threads are still running). In your example other threads use variables on main's stack, thus when main's thread exits and its stack is destroyed they access unmapped memory, thus the segfault.
Use proper pthread_join(3) technique as suggested by others, or move shared variables into static storage.
When you pass a thread a pointer to a variable, you need to ensure that the lifetime of that variable is at least as long as the thread will attempt to access that variable. You pass the threads pointers to buffer and context, which are allocated on the stack inside main. As soon as main exits, those variables cease to exist. So you cannot exit from main until you confirm that those threads no longer need access to those pointers.
95% of the time, the fix for this problem is to follow this simple pattern:
1) Allocate an object to hold the parameters.
2) Fill in the object with the parameters.
3) Pass a pointer to the object to the new thread.
4) Allow the new thread to deallocate the object.
Sadly, this doesn't work well for objects shared by two or more threads. In that case, you can put a use count and a mutex inside the parameter object. Each thread can decrement the use count under protection of the mutex when it's done. The thread that drops the use count to zero frees the object.
You would need to do this for both buffer and context. Set the use count to 2 and then pass a pointer to this object to both threads.
pthread_join does the following :
The pthread_join() function suspends execution of the calling thread until the target thread terminates, unless the target thread has already terminated. On return from a successful pthread_join() call with a non-NULL value_ptr argument, the value passed to pthread_exit() by the terminating thread is made available in the location referenced by value_ptr. When a pthread_join() returns successfully, the target thread has been terminated. The results of multiple simultaneous calls to pthread_join() specifying the same target thread are undefined. If the thread calling pthread_join() is canceled, then the target thread will not be detached.
However, you can achieve the same effect with a lightweight loop that keeps the executable from exiting. In GLib this is done by creating a GMainLoop; in GTK+ you can use gtk_main.
After the threads complete, you have to quit the main loop or call gtk_exit.
Alternatively, you can build your own wait functionality from a combination of sockets, pipes, and the select system call, but this is not required and can be treated as a practice exercise.
For the example code of section 10.6, the expected result is:
after several iterations, the static structure used by getpwnam will be corrupted, and the program will terminate with SIGSEGV signal.
But on my platform, Fedora 11, gcc (GCC) 4.4.0, the result is
[Langzi#Freedom apue]$ ./corrupt
in sig_alarm
I can see the output from sig_alarm only once, and the program seems to hang for some reason, but the process still exists and is still running.
But when I run the program under gdb, it seems OK: I see the output from sig_alarm at regular intervals.
My manual says the signal handler is reset to SIG_DFL after the signal is handled, and the system will not block the signal. So at the beginning of my signal handler I reinstall the signal handler.
Maybe I should use sigaction instead, but I only want to know the reason for the difference between a normal run and a run under gdb.
Any advice and help will be appreciated.
following is my code:
#include "apue.h"
#include <pwd.h>
void sig_alarm(int signo);
int main()
{
struct passwd *pwdptr;
signal(SIGALRM, sig_alarm);
alarm(1);
for(;;) {
if ((pwdptr = getpwnam("Zhijin")) == NULL)
err_sys("getpwnam error");
if (strcmp("Zhijin", pwdptr->pw_name) != 0) {
printf("data corrupted, pw_name: %s\n", pwdptr->pw_name);
}
}
}
void sig_alarm(int signo)
{
signal(SIGALRM, sig_alarm);
struct passwd *rootptr;
printf("in sig_alarm\n");
if ((rootptr = getpwnam("root")) == NULL)
err_sys("getpwnam error");
alarm(1);
}
According to the standard, you're really not allowed to do much in a signal handler. All you are guaranteed to be able to do in the signal-handling function, without causing undefined behavior, is to call signal, and to assign a value to a volatile static object of the type sig_atomic_t.
The first few times I ran this program, on Ubuntu Linux, it looked like your call to alarm in the signal handler didn't work, so the loop in main just kept running after the first alarm. When I tried it later, the program ran the signal handler a few times, and then hung. All this is consistent with undefined behavior: the program fails, sometimes, and in various more or less interesting ways.
It is not uncommon for programs that have undefined behavior to work differently in the debugger. The debugger is a different environment, and your program and data could for example be laid out in memory in a different way, so errors can manifest themselves in a different way, or not at all.
I got the program to work by adding a variable:
volatile sig_atomic_t got_interrupt = 0;
And then I changed your signal handler to this very simple one:
void sig_alarm(int signo) {
got_interrupt = 1;
}
And then I inserted the actual work into the infinite loop in main:
if (got_interrupt) {
got_interrupt = 0;
signal(SIGALRM, sig_alarm);
struct passwd *rootptr;
printf("in sig_alarm\n");
if ((rootptr = getpwnam("root")) == NULL)
perror("getpwnam error");
alarm(1);
}
I think the "apue" you mention is the book "Advanced Programming in the UNIX Environment", which I don't have here, so I don't know if the purpose of this example is to show that you shouldn't mess around with things inside of a signal handler, or just that signals can cause problems by interrupting the normal work of the program.
According to the spec, the function getpwnam is not reentrant and is not guaranteed to be thread safe. Since you are accessing the structure in two different threads of control (signal handlers are effectively running in a different thread context), you are running into this issue. Whenever you have concurrent or parallel execution (as when using pthreads or when using a signal handler), you must be sure not to modify shared state (e.g. the structure owned by 'getpwnam'), and if you do, then appropriate locking/synchronization must be used.
Additionally, the signal function is discouraged in favor of the sigaction function, because its semantics vary between implementations. To ensure portable behavior when registering signal handlers, you should always use sigaction.
With sigaction, you can use the SA_RESETHAND flag to restore the default handler once the handler has been invoked. You can also use the sigprocmask function to block and unblock the delivery of signals without modifying their handlers.
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <unistd.h>
void sigalrm_handler(int);
int main()
{
signal(SIGALRM, sigalrm_handler);
alarm(3);
while(1)
{
}
return 0;
}
void sigalrm_handler(int sign)
{
printf("I am alive. Catch the sigalrm %d!\n",sign);
alarm(3);
}
For example, my code runs in main doing nothing, and every 3 seconds my program says I'm alive x)
I think that if you call alarm with the value 3 in the handler function, as I did, the problem is resolved :)