C program causes memory leak? - c

I use this very simple C program to make a system call to php every second, in order to run a PHP script that sends the pending push notifications in my database to APNS (the Apple notification service).
However, the program runs out of memory after about 10 hours. To investigate, I reduced the sleep time between thread creations from 1 s to 10000 us, and I could see in real time with htop that memory usage kept increasing and never dropped. Here is the program:
#include <stdlib.h>
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>
typedef struct {
char* script_path ;
} arg_for_script ;
static void *start_instance(void *_args)
{
int id = abs(pthread_self());
arg_for_script* args = _args ;
printf("[SERVICE] start php script on thread %d\n",id);
fflush(stdout);
char cmd[200] ;
sprintf(cmd, "php -f %s %d", args->script_path, id );
system(cmd);
printf("[SERVICE] end of script on thread %d\n", id);
fflush(stdout);
pthread_exit(NULL);
}
int main(int argc, char* argv[])
{
if(argc < 2)
{
fprintf(stderr, "[SERVICE] Path of php notification script must be filled\n");
fflush(stderr);
return EXIT_FAILURE;
}
arg_for_script args ;
args.script_path = argv[1];
pthread_attr_t tattr ;
struct sched_param param;
param.sched_priority = 1 ;
pthread_attr_init(&tattr);
pthread_attr_setinheritsched(&tattr, PTHREAD_EXPLICIT_SCHED);
pthread_attr_setschedpolicy(&tattr, SCHED_FIFO);
pthread_attr_setschedparam(&tattr, &param);
while(1) {
pthread_t thrd;
// if(pthread_create(&thrd, &tattr, start_instance, (void *)&args) == -1) {
if(pthread_create(&thrd, NULL, start_instance, (void *)&args) == -1)
{
fprintf(stderr, "[SERVICE] Unable to create thread\n");
fflush(stderr);
return EXIT_FAILURE;
}
usleep( 10000);
}
// pthread_attr_destroy(&tattr);
return EXIT_SUCCESS ;
}
Here, I don't dynamically allocate any RAM with malloc. Why would this program increase its memory usage? What pointer should I free here?

You call neither pthread_join() nor pthread_detach(), so the resources allocated for each thread are never freed. In particular, each thread has its own stack, which is probably what causes the rising memory consumption.
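One way to let the threads clean up after themselves is to create them detached. The following is only a minimal sketch: spawn_detached() and demo_worker() are hypothetical names, with demo_worker() standing in for start_instance() from the question.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical worker, standing in for start_instance() from the question. */
void *demo_worker(void *arg)
{
    return arg;
}

/* Start fn(arg) on a detached thread, so that its stack and bookkeeping
 * are released when it finishes, without anyone calling pthread_join().
 * Returns 0 on success or a pthread error number. */
int spawn_detached(void *(*fn)(void *), void *arg)
{
    pthread_attr_t attr;
    pthread_t tid;
    int err;

    assert(fn != NULL);
    err = pthread_attr_init(&attr);
    if (err != 0)
        return err;
    err = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    if (err == 0)
        err = pthread_create(&tid, &attr, fn, arg);
    pthread_attr_destroy(&attr);
    return err;
}
```

Note that pthread_create() reports failure by returning a nonzero error number, not -1, so a check such as `== -1` never fires; comparing against 0 as above is the safer test.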
Some remarks about your implementation: since you only execute a PHP script with system() and don't need to work on shared variables or file descriptors, it's better to use fork() and one of the exec() variants. This spawns a new process without the intermediate step of creating a thread. Using system() is also discouraged, because it passes the command through a shell and can make the program exploitable when the input isn't properly sanitized. In this case it might be fine if you only ever call it manually.
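The fork()/exec() suggestion could look roughly like this. It is a sketch under assumptions: run_argv() is a hypothetical helper name, and it waits for the child synchronously, which may or may not suit a service that fires once per second.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] with the given arguments in a child process and wait for it.
 * Returns the child's exit status (127 if exec failed), or -1 if fork or
 * wait failed or the child was terminated by a signal. */
int run_argv(char *const argv[])
{
    pid_t pid;
    int status;

    assert(argv != NULL && argv[0] != NULL);
    pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);   /* only returns on failure */
        _exit(127);
    }
    /* Reap the child; retry if a signal interrupts the wait. */
    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For the script in the question, the argument vector would be something like `{ "php", "-f", script_path, NULL }`. Since no shell is involved, the script path does not need quoting or sanitizing against shell metacharacters.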

Related

Accessing to return value of a thread by another one other than creating one

Why have copyfilepass return a pointer to the number of bytes copied when callcopypass can access this value as args[2]?
#include <unistd.h>
#include "restart.h"
void *copyfilepass(void *arg) {
int *argint;
argint = (int *)arg;
/* copyfile copies from a descriptor to another */
argint[2] = copyfile(argint[0], argint[1]);
close(argint[0]);
close(argint[1]);
return argint + 2;
}
callcopypass.c
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#define PERMS (S_IRUSR | S_IWUSR)
#define READ_FLAGS O_RDONLY
#define WRITE_FLAGS (O_WRONLY | O_CREAT | O_TRUNC)
void *copyfilepass(void *arg);
int main (int argc, char *argv[]) {
int *bytesptr;
int error;
int targs[3];
pthread_t tid;
if (argc != 3) {
fprintf(stderr, "Usage: %s fromfile tofile\n", argv[0]);
return 1;
}
if (((targs[0] = open(argv[1], READ_FLAGS)) == -1) ||
((targs[1] = open(argv[2], WRITE_FLAGS, PERMS)) == -1)) {
perror("Failed to open the files");
return 1;
}
if (error = pthread_create(&tid, NULL, copyfilepass, targs)) {
fprintf(stderr, "Failed to create thread: %s\n", strerror(error));
return 1;
}
if (error = pthread_join(tid, (void **)&bytesptr)) {
fprintf(stderr, "Failed to join thread: %s\n", strerror(error));
return 1;
}
printf("Number of bytes copied: %d\n", *bytesptr);
return 0;
}
The authors answer that
if a thread other than the creating thread joins with copyfilepass, it
has access to the number of bytes copied through the parameter to
pthread_join.
I don't even understand the answer. How can a thread other than the creating one access (i.e. change?) the return value? Could you explain it, if possible with an example?
The crux of the answer is that you could foreseeably want to read the result of the copyfilepass thread (in this case the number of bytes copied) from a thread other than the thread which created it. Assume, for the sake of example, we have a third thread, monitorcopy, and tid is a global instead of a local variable. monitorcopy is spawned after copyfilepass from the main method.
void* monitorcopy(void* params) {
void *result;
pthread_join(tid, &result);
/* Point A: Attempt to read result */
}
Assume copyfilepass returned NULL, or a meaningless value. At Point A, result is NULL and we have no way of retrieving the number of bytes copied, as it is stored in targs[2] in the main method, which is out of scope.
Assume instead copyfilepass returned argint + 2. result is now a pointer to the number of bytes copied, even though we are not in the same scope as targs. Thus, in the absence of any memory lifetime issues, we can access the number of bytes copied as follows:
void* monitorcopy(void* params) {
void *result;
pthread_join(tid, &result);
int bytesCopied = *((int*) result);
}
The problem isn't that a different thread would want to "change the return value", it's whether a different thread will have access to the input parameters (targs). Generally, pthread_join allows you to get the result value from a certain thread, from any place in your program, as long as you have the thread id. So isn't it sensible to use this value to return the result of the async operation?
However, this example is rather poorly written (as an example for good multithreading practices), for a number of reasons:
There is only a single function, and the scope of all variables extends through main. Written this way, everyone has access to the input args anyway. You are right when you say that reading the result through pthread_join is unnecessary in this case.
Passing a stack variable (targs) to a thread is a bad idea. The variable goes out of scope when the function ends, so the only safe way to keep the program from crashing is to join the thread immediately, preventing targs from going out of scope. That means you don't get any benefit from multithreading (unless main does some extra work before joining). The arguments should either be made global or allocated on the heap (a malloc/free pair).
Files are opened inside main, but closed inside copyfilepass. This shift of responsibility is unnecessary, although not uncommon. I would either pass the file names to the function and handle the opening there, or close the handles outside the thread, after the files are copied.
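To illustrate the heap-allocation point, here is a minimal sketch in which the thread's inputs and its result live in one malloc'ed record that the thread returns through pthread_join(). The names struct job, job_thread(), start_job() and finish_job() are hypothetical, and adding two integers stands in for the actual copyfile() call.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical job record: inputs and result live together on the heap. */
struct job {
    int a, b;     /* inputs (stand-ins for the two file descriptors) */
    int result;   /* filled in by the thread */
};

void *job_thread(void *arg)
{
    struct job *job = arg;

    job->result = job->a + job->b;   /* stand-in for copyfile() */
    return job;                      /* handed to whoever joins */
}

/* Start a job; afterwards the caller only needs to keep the thread id. */
int start_job(pthread_t *tid, int a, int b)
{
    struct job *job = malloc(sizeof *job);
    int err;

    if (job == NULL)
        return ENOMEM;
    job->a = a;
    job->b = b;
    job->result = 0;
    err = pthread_create(tid, NULL, job_thread, job);
    if (err != 0)
        free(job);
    return err;
}

/* Join from any place that has the thread id; frees the record. */
int finish_job(pthread_t tid, int *result)
{
    void *ret;
    int err = pthread_join(tid, &ret);

    if (err != 0)
        return err;
    assert(ret != NULL);
    *result = ((struct job *)ret)->result;
    free(ret);
    return 0;
}
```

Because the record is on the heap, the function that started the thread can return before the thread finishes, and the joiner needs nothing but the thread id.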
Anyway, the point the author of the code was making is that you don't need access to the input arguments at the place where you join the thread:
// removed all error checks for simplicity
int main (int argc, char *argv[]) {
pthread_t tid;
// removed all error checks for simplicity
pthread_create(&tid, NULL, copy_file, argv);
// note that this function only accepts the thread id
wait_until_copied(tid);
return 0;
}
void wait_until_copied(pthread_t tid)
{
int *bytesptr;
// no way to access input args here
pthread_join(tid, (void **)&bytesptr);
printf("Number of bytes copied: %d\n", *bytesptr);
}

Thread not printing out in correct order

I'm fairly new to threads in C. For this program I need to create threads in a for loop, and each thread is meant to print its own message with printf.
I can't seem to get it to print in correct order. Here's my code:
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#define NUM_THREADS 16
void *thread(void *thread_id) {
int id = *((int *) thread_id);
printf("Hello from thread %d\n", id);
return NULL;
}
int main() {
pthread_t threads[NUM_THREADS];
for (int i = 0; i < NUM_THREADS; i++) {
int code = pthread_create(&threads[i], NULL, thread, &i);
if (code != 0) {
fprintf(stderr, "pthread_create failed!\n");
return EXIT_FAILURE;
}
}
return EXIT_SUCCESS;
}
//gcc -o main main.c -lpthread
This is the classic example for understanding multithreading.
The threads run concurrently, scheduled by the OS scheduler.
There is no such thing as a "correct order" when code runs in parallel.
There is also stdout buffering to consider: when you printf something, it is not guaranteed to appear immediately; it may only be written once the buffer fills up or is flushed.
If you want the work done in the "correct order", that is, waiting until the first thread finishes its work before starting the next one, consider using "join":
http://man7.org/linux/man-pages/man3/pthread_join.3.html
UPD:
Passing a pointer to i as thread_id is also incorrect in this case: the loop keeps changing i, so a thread may print an id that doesn't belong to it (thanks Kevin).
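A minimal sketch of both fixes follows: every thread gets a pointer to its own slot in an ids array, and all threads are joined before returning, so none of them is cut off when the program exits. The names hello(), run_all() and the seen array are assumptions added for illustration; the output order is still up to the scheduler.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

#define NUM_THREADS 16

int seen[NUM_THREADS];   /* seen[i] is set to 1 by thread i, for checking */

void *hello(void *arg)
{
    int id = *(int *)arg;   /* points at this thread's own slot */

    assert(id >= 0 && id < NUM_THREADS);
    seen[id] = 1;
    printf("Hello from thread %d\n", id);
    return NULL;
}

/* Returns 0 if all threads were created and joined, else an error number. */
int run_all(void)
{
    pthread_t threads[NUM_THREADS];
    int ids[NUM_THREADS];
    int created = 0, err = 0;

    for (int i = 0; i < NUM_THREADS; i++) {
        ids[i] = i;   /* one slot per thread, never modified afterwards */
        err = pthread_create(&threads[i], NULL, hello, &ids[i]);
        if (err != 0)
            break;
        created++;
    }
    /* Join everything that started, so ids[] stays valid while in use. */
    for (int i = 0; i < created; i++)
        pthread_join(threads[i], NULL);
    return err;
}
```

To force strictly sequential output instead, each pthread_join() would have to come right after its pthread_create(), at which point the threads no longer run concurrently.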

Strange behavior of clone

This is fairly simple application which creates a lightweight process (thread) with clone() call.
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <errno.h>
#include <stdlib.h>
#include <time.h>
#define STACK_SIZE 1024*1024
int func(void* param) {
printf("I am func, pid %d\n", getpid());
return 0;
}
int main(int argc, char const *argv[]) {
printf("I am main, pid %d\n", getpid());
void* ptr = malloc(STACK_SIZE);
printf("I am calling clone\n");
int res = clone(func, ptr + STACK_SIZE, CLONE_VM, NULL);
// works fine with sleep() call
// sleep(1);
if (res == -1) {
printf("clone error: %d", errno);
} else {
printf("I created child with pid: %d\n", res);
}
printf("Main done, pid %d\n", getpid());
return 0;
}
Here are results:
Run 1:
➜ LFD401 ./clone
I am main, pid 10974
I am calling clone
I created child with pid: 10975
Main done, pid 10974
I am func, pid 10975
Run 2:
➜ LFD401 ./clone
I am main, pid 10995
I am calling clone
I created child with pid: 10996
I created child with pid: 10996
I am func, pid 10996
Main done, pid 10995
Run 3:
➜ LFD401 ./clone
I am main, pid 11037
I am calling clone
I created child with pid: 11038
I created child with pid: 11038
I am func, pid 11038
I created child with pid: 11038
I am func, pid 11038
Main done, pid 11037
Run 4:
➜ LFD401 ./clone
I am main, pid 11062
I am calling clone
I created child with pid: 11063
Main done, pid 11062
Main done, pid 11062
I am func, pid 11063
What is going on here? Why "I created child" message is sometimes printed several times?
Also I noticed that adding a delay after clone call "fixes" the problem.
You have a race condition, i.e. you don't have the implied thread safety of stdio.
The problem is even more severe: you can also get duplicate "func" messages.
The problem is that clone does not give the same guarantees as pthread_create. That is, you do not get the thread-safe variants of printf.
I don't know for sure, but, IMO the verbiage about stdio streams and thread safety, in practice, only applies when using pthreads.
So, you'll have to handle your own interthread locking.
Here is a version of your program recoded to use pthread_create. It seems to work without incident:
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <errno.h>
#include <stdlib.h>
#include <time.h>
#include <pthread.h>
#define STACK_SIZE 1024*1024
void *func(void* param) {
printf("I am func, pid %d\n", getpid());
return (void *) 0;
}
int main(int argc, char const *argv[]) {
printf("I am main, pid %d\n", getpid());
void* ptr = malloc(STACK_SIZE);
printf("I am calling clone\n");
pthread_t tid;
pthread_create(&tid,NULL,func,NULL);
//int res = clone(func, ptr + STACK_SIZE, CLONE_VM, NULL);
int res = 0;
// works fine with sleep() call
// sleep(1);
if (res == -1) {
printf("clone error: %d", errno);
} else {
printf("I created child with pid: %d\n", res);
}
pthread_join(tid,NULL);
printf("Main done, pid %d\n", getpid());
return 0;
}
Here is a test script I've been using to check for errors [it's a little rough, but should be okay]. Run against your version and it will abort quickly. The pthread_create version seems to pass just fine
#!/usr/bin/perl
# clonetest -- clone test
#
# arguments:
# "-p0" -- suppress check for duplicate parent messages
# "-c0" -- suppress check for duplicate child messages
# 1 -- base name for program to test (e.g. for xyz.c, use xyz)
# 2 -- [optional] number of test iterations (DEFAULT: 100000)
master(@ARGV);
exit(0);
# master -- master control
sub master
{
my(@argv) = @_;
my($arg,$sym);
while (1) {
$arg = $argv[0];
last unless (defined($arg));
last unless ($arg =~ s/^-(.)//);
$sym = $1;
shift(@argv);
$arg = 1
if ($arg eq "");
$arg += 0;
${"opt_$sym"} = $arg;
}
$opt_p //= 1;
$opt_c //= 1;
printf("clonetest: p=%d c=%d\n",$opt_p,$opt_c);
$xfile = shift(@argv);
$xfile //= "clone1";
printf("clonetest: xfile='%s'\n",$xfile);
$itermax = shift(@argv);
$itermax //= 100000;
$itermax += 0;
printf("clonetest: itermax=%d\n",$itermax);
system("cc -o $xfile -O2 $xfile.c -lpthread");
$code = $? >> 8;
die("master: compile error\n")
if ($code);
$logf = "/tmp/log";
for ($iter = 1; $iter <= $itermax; ++$iter) {
printf("iter: %d\n",$iter)
if ($opt_v);
dotest($iter);
}
}
# dotest -- perform single test
sub dotest
{
my($iter) = @_;
my($parcnt,$cldcnt);
my($xfsrc,$bf);
system("./$xfile > $logf");
open($xfsrc,"<$logf") or
die("dotest: unable to open '$logf' -- $!\n");
while ($bf = <$xfsrc>) {
chomp($bf);
if ($opt_p) {
while ($bf =~ /created/g) {
++$parcnt;
}
}
if ($opt_c) {
while ($bf =~ /func/g) {
++$cldcnt;
}
}
}
close($xfsrc);
if (($parcnt > 1) or ($cldcnt > 1)) {
printf("dotest: fail on %d -- parcnt=%d cldcnt=%d\n",
$iter,$parcnt,$cldcnt);
system("cat $logf");
exit(1);
}
}
UPDATE:
Were you able to recreate OPs problem with clone?
Absolutely. Before I created the pthreads version, in addition to testing OP's original version, I also created versions that:
(1) added setlinebuf to the start of main
(2) added fflush just before the clone and __fpurge as the first statement of func
(3) added an fflush in func before the return 0
Version (2) eliminated the duplicate parent messages, but the duplicate child messages remained
If you'd like to see this for yourself, download OP's version from the question, my version, and the test script. Then, run the test script on OP's version.
I posted enough information and files so that anyone can recreate the problem.
Note that due to differences between my system and OP's, I couldn't at first reproduce the problem on just 3-4 tries. So, that's why I created the script.
The script does 100,000 test runs and usually the problem will manifest itself within 5000-15000.
I can't recreate OP's issue, but I don't think the printf's are actually a problem.
glibc docs:
The POSIX standard requires that by default the stream operations are
atomic. I.e., issuing two stream operations for the same stream in two
threads at the same time will cause the operations to be executed as
if they were issued sequentially. The buffer operations performed
while reading or writing are protected from other uses of the same
stream. To do this each stream has an internal lock object which has
to be (implicitly) acquired before any work can be done.
Edit:
Even though the above is true for threads, as rici points out, there is a comment on sourceware:
Basically, there's nothing you can safely do with CLONE_VM unless the
child restricts itself to pure computation and direct syscalls (via
sys/syscall.h). If you use any of the standard library, you risk the
parent and child clobbering each other's internal states. You also
have issues like the fact that glibc caches the pid/tid in userspace,
and the fact that glibc expects to always have a valid thread pointer
which your call to clone is unable to initialize correctly because it
does not know (and should not know) the internal implementation of
threads.
Apparently, glibc isn't designed to work with clone if CLONE_VM is set but CLONE_THREAD|CLONE_SIGHAND are not.
Your processes both use the same stdout (that is, the C standard library FILE struct), which includes an accidentally shared buffer. That's undoubtedly causing problems.
As everyone suggests, it really seems to be a problem with (how shall I put it in the case of clone()?) process-safety. With a rough sketch of a locking version of printf (using write(2)), the output is as expected.
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <errno.h>
#include <stdlib.h>
#include <time.h>
#define STACK_SIZE 1024*1024
// VERY rough attempt at a thread-safe printf
#include <stdarg.h>
#define SYNC_REALLOC_GROW 64
int sync_printf(const char *format, ...)
{
int n, all = 0;
int size = 256;
char *p, *np;
va_list args;
if ((p = malloc(size)) == NULL)
return -1;
for (;;) {
va_start(args, format);
n = vsnprintf(p, size, format, args);
va_end(args);
if (n < 0) {
free(p);
return -1;
}
// n is the full formatted length, so assign rather than accumulate across retries
all = n;
if (n < size)
break;
size = n + SYNC_REALLOC_GROW;
if ((np = realloc(p, size)) == NULL) {
free(p);
return -1;
} else {
p = np;
}
}
// write(2) should be threadsafe, so just in case
flockfile(stdout);
n = (int) write(fileno(stdout), p, all);
fflush(stdout);
funlockfile(stdout);
free(p);
return n;
}
int func(void *param)
{
sync_printf("I am func, pid %d\n", getpid());
return 0;
}
int main()
{
sync_printf("I am main, pid %d\n", getpid());
void *ptr = malloc(STACK_SIZE);
sync_printf("I am calling clone\n");
int res = clone(func, ptr + STACK_SIZE, CLONE_VM, NULL);
// works fine with sleep() call
// sleep(1);
if (res == -1) {
sync_printf("clone error: %d", errno);
} else {
sync_printf("I created child with pid: %d\n", res);
}
sync_printf("Main done, pid %d\n\n", getpid());
return 0;
}
For the third time: it's only a sketch, no time for a robust version, but that shouldn't hinder you to write one.
As evaitl points out printf is documented to be thread-safe by glibc's documentation. BUT, this typically assumes that you are using the designated glibc function to create threads (that is, pthread_create()). If you do not, then you are on your own.
The lock taken by printf() is recursive (see flockfile). This means that if the lock is already taken, the implementation checks the owner of the lock against the locker. If the locker is the same as the owner, the locking attempt succeeds.
To distinguish between different threads, TLS needs to be set up properly, which pthread_create() does and your code does not. My guess is that in your case the TLS variable that identifies the thread is the same for both threads, so both end up taking the lock.
TL;DR: please use pthread_create()

How to read a counter from a linux C program to a bash test script?

I have a large C/C++ program on a Suse linux system. We do automated testing of it with a bash script, which sends input to the program, and reads the output. It's mainly "black-box" testing, but some tests need to know a few internal details to determine if a test has passed.
One test in particular needs to know how many times the program runs a certain function (which parses a particular response message). When that function runs, it writes a log message and increments a counter variable. The automated test currently determines the number of invocations by grepping the log file for the log message and counting the number of occurrences before and after the test. This isn't ideal, because the logs (syslog-ng) aren't guaranteed, and they're frequently turned off by configuration, because they're basically debug logs.
I'm looking for a better alternative. I can change the program to enhance the testability, but it shouldn't be heavy impact to normal operation. My first thought was, I could just read the counter after each test. Something like this:
gdb --pid=$PID --batch -ex "p numServerResponseX"
That's slow when it runs, but it's good because the program doesn't need to be changed at all. With a little work, I could probably write a ptrace command to do this a little more efficiently.
But I'm wondering if there isn't a simpler way to do this. Could I write the counter to shared memory (with shm_open / mmap), and then read /dev/shm in the bash script? Is there some simpler way I could setup the counter to make it easy to read, without making it slow to increment?
Edit:
Details: The test setup is like this:
testScript <-> sipp <-> programUnderTest <-> externalServer
The bash testScript injects sip messages with sipp, and it generally determines success or failure based on the completion code from sipp. But in certain tests it needs to know the number of responses the program received from the external server. The function "processServerResponseX" processes certain responses from the external server. During the testing there isn't much traffic running, so the function is only invoked perhaps 20 times over 10 seconds. When each test ends and we want to check the counter, there should be essentially no traffic. However during normal operation, it might be invoked hundreds of times a second. The function is roughly:
unsigned long int numServerResponseX;
int processServerResponseX(DMsg_t * dMsg, AppId id)
{
if (DEBUG_ENABLED)
{
syslog(priority, "%s received %d", __func__, (int) id);
}
myMutex->getLock();
numServerResponseX++;
doLockedStuff(dMsg, id);
myMutex->releaseLock();
return doOtherStuff(dMsg, id);
}
The script currently does:
grep processServerResponseX /var/log/logfile | wc -l
and compares the value before and after. My goal is to have this work even if DEBUG_ENABLED is false, and not have it be too slow. The program is multi-threaded, and it runs on an x86_64 SMP machine, so adding any long blocking function would not be a good solution.
I would have that certain function "(which parses a particular response message)" write (probably using fopen then fprintf then fclose) some textual data somewhere.
That destination could be a FIFO (see fifo(7) ...) or a temporary file in a tmpfs file system (which is a RAM file system), maybe /run/
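As a rough sketch of the temporary-file variant: the helper below writes the counter as text and uses rename() so that a reader never sees a half-written number. publish_counter() is a hypothetical name and the path is whatever you choose on a tmpfs mount; it assumes calls are serialized (for example made while the existing mutex is held), since all calls share one temporary file name.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>

/* Write value as text to path, going through a temporary file and
 * rename() so a reader never sees a partially written number.
 * Returns 0 on success, -1 on failure. */
int publish_counter(const char *path, unsigned long value)
{
    char tmp[4096];
    FILE *f;
    int n;

    assert(path != NULL);
    n = snprintf(tmp, sizeof tmp, "%s.tmp", path);
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;
    f = fopen(tmp, "w");
    if (f == NULL)
        return -1;
    if (fprintf(f, "%lu\n", value) < 0) {
        fclose(f);
        remove(tmp);
        return -1;
    }
    if (fclose(f) != 0 || rename(tmp, path) != 0) {
        remove(tmp);
        return -1;
    }
    return 0;
}
```

The bash test script can then read the value with a plain cat of that path. Whether a file write per invocation is cheap enough at hundreds of calls per second is something to measure rather than assume.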
If your C++ program is big and complex enough, you could consider adding some probing facilities (some means for an external program to query about the internal state of your C++ program) e.g. a dedicated web service (using libonion in a separate thread), or some interface to systemd, or to D-bus, or some remote procedure call service like ONC/RPC, JSON-RPC, etc etc...
You might be interested by POCOlib. Perhaps its logging framework should interest you.
As you mentioned, you might use Posix shared memory & semaphores (see shm_overview(7) and sem_overview(7) ...).
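For the shared-memory route, a minimal sketch might look like the following. map_counter() is a hypothetical helper and the object name is up to you; the sketch assumes the counter is still incremented under the program's existing mutex, and it leaves out semaphores entirely.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map a POSIX shared-memory object holding one unsigned long and return
 * a pointer to it, or NULL on failure. The object is created if missing. */
unsigned long *map_counter(const char *name)
{
    int fd;
    void *p;

    assert(name != NULL);
    fd = shm_open(name, O_CREAT | O_RDWR, 0644);
    if (fd == -1)
        return NULL;
    if (ftruncate(fd, sizeof(unsigned long)) == -1) {
        close(fd);
        return NULL;
    }
    p = mmap(NULL, sizeof(unsigned long), PROT_READ | PROT_WRITE,
             MAP_SHARED, fd, 0);
    close(fd);   /* the mapping stays valid after the descriptor is closed */
    return p == MAP_FAILED ? NULL : p;
}
```

The program would call map_counter() once at startup and increment through the returned pointer, which costs no more than incrementing an ordinary variable. On Linux the object appears as a file under /dev/shm, so a script can probably read it with something like `od -An -tu8 /dev/shm/NAME`, assuming an 8-byte unsigned long. Older glibc versions need -lrt at link time.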
Perhaps the Linux specific eventfd(2) is what you need.... (you could code a tiny C program to be invoked by your testing bash scripts....)
You could also try to change the command line (I forgot how to do that, maybe libproc or write to /proc/self/cmdline see proc(5)...). Then ps would show it.
I personally do usually use the methods Basile Starynkevitch outlined for this, but I wanted to bring up an alternative method using realtime signals.
I am not claiming this is the best solution, but it is simple to implement and has very little overhead. The main downside is that the size of the request and response are both limited to one int (or technically, anything representable by an int or by a void *).
Basically, you use a simple helper program to send a signal to the application. The signal has a payload of one int your application can examine, and based on it, the application responds by sending the same signal back to the originator, with an int of its own as payload.
If you don't need any locking, you can use a simple realtime signal handler. When it catches a signal, it examines the siginfo_t structure. If sent via sigqueue(), the request is in the si_value member of the siginfo_t structure. The handler answers to the originating process (si_pid member of the structure) using sigqueue(), with the response. This only requires about sixty lines of code to be added to your application. Here is an example application, app1.c:
#define _POSIX_C_SOURCE 200112L
#include <unistd.h>
#include <signal.h>
#include <errno.h>
#include <string.h>
#include <time.h>
#include <stdio.h>
#define INFO_SIGNAL (SIGRTMAX-1)
/* This is the counter we're interested in */
static int counter = 0;
static void responder(int signum, siginfo_t *info,
void *context __attribute__((unused)))
{
if (info && info->si_code == SI_QUEUE) {
union sigval value;
int response, saved_errno;
/* We need to save errno, to avoid interfering with
* the interrupted thread. */
saved_errno = errno;
/* Incoming signal value (int) determines
* what we respond back with. */
switch (info->si_value.sival_int) {
case 0: /* Request loop counter */
response = *(volatile int *)&counter;
break;
/* Other codes? */
default: /* Respond with -1. */
response = -1;
}
/* Respond back to signaler. */
value.sival_ptr = (void *)0L;
value.sival_int = response;
sigqueue(info->si_pid, signum, value);
/* Restore errno. This way the interrupted thread
* will not notice any change in errno. */
errno = saved_errno;
}
}
static int install_responder(const int signum)
{
struct sigaction act;
sigemptyset(&act.sa_mask);
act.sa_sigaction = responder;
act.sa_flags = SA_SIGINFO;
if (sigaction(signum, &act, NULL))
return errno;
else
return 0;
}
int main(void)
{
if (install_responder(INFO_SIGNAL)) {
fprintf(stderr, "Cannot install responder signal handler: %s.\n",
strerror(errno));
return 1;
}
fprintf(stderr, "PID = %d\n", (int)getpid());
fflush(stderr);
/* The application follows.
* This one just loops at 100 Hz, printing a dot
* about once per second or so. */
while (1) {
struct timespec t;
counter++;
if (!(counter % 100)) {
putchar('.');
fflush(stdout);
}
t.tv_sec = 0;
t.tv_nsec = 10000000; /* 10ms */
nanosleep(&t, NULL);
/* Note: Since we ignore the remainder
* from the nanosleep call, we
* may sleep much shorter periods
* when a signal is delivered. */
}
return 0;
}
The above responder responds to query 0 with the counter value, and with -1 to everything else. You can add other queries simply by adding a suitable case statement in responder().
Note that locking primitives (except for sem_post()) are not async-signal safe, and thus should not be used in a signal handler. So, the above code cannot implement any locking.
Signal delivery can interrupt a thread in a blocking call. In the above application, the nanosleep() call is usually interrupted by the signal delivery, causing the sleep to be cut short. (Similarly, read() and write() calls may return -1 with errno == EINTR, if they were interrupted by signal delivery.)
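If the shortened sleep matters, the usual pattern is to restart nanosleep() with the remaining time. A minimal sketch, where sleep_fully() is a hypothetical helper name:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Sleep for the whole interval, resuming after signal interruptions.
 * Returns 0 on success, -1 on any error other than EINTR. */
int sleep_fully(time_t sec, long nsec)
{
    struct timespec req, rem;

    assert(nsec >= 0 && nsec < 1000000000L);
    req.tv_sec = sec;
    req.tv_nsec = nsec;
    while (nanosleep(&req, &rem) == -1) {
        if (errno != EINTR)
            return -1;
        req = rem;   /* continue with the time that was left */
    }
    return 0;
}
```

In app1.c, the nanosleep(&t, NULL) call in the main loop could be replaced by sleep_fully(0, 10000000L) to keep the loop close to 100 Hz even while queries arrive.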
If that is a problem, or you are not sure that all your code handles errno == EINTR correctly, or your counters need locking, you can use a separate thread dedicated to signal handling instead.
The dedicated thread will sleep unless a signal is delivered, and only requires a very small stack, so it really does not consume any significant resources at run time.
The target signal is blocked in all threads, with the dedicated thread waiting in sigwaitinfo(). If it catches any signals, it processes them just like above -- except that since this is a thread and not a signal handler per se, you can freely use any locking etc., and do not need to limit yourself to async-signal safe functions.
This threaded approach is slightly longer, adding almost a hundred lines of code to your application. (The differences are contained in the responder() and install_responder() functions; even the code added to main() is exactly the same as in app1.c.)
Here is app2.c:
#define _POSIX_C_SOURCE 200112L
#include <unistd.h>
#include <signal.h>
#include <errno.h>
#include <pthread.h>
#include <string.h>
#include <time.h>
#include <stdio.h>
#define INFO_SIGNAL (SIGRTMAX-1)
/* This is the counter we're interested in */
static int counter = 0;
static void *responder(void *payload)
{
const int signum = (long)payload;
union sigval response;
sigset_t sigset;
siginfo_t info;
int result;
/* We wait on only one signal. */
sigemptyset(&sigset);
if (sigaddset(&sigset, signum))
return NULL;
/* Wait forever. This thread is automatically killed, when the
* main thread exits. */
while (1) {
result = sigwaitinfo(&sigset, &info);
if (result != signum) {
if (result != -1 || errno != EINTR)
return NULL;
/* A signal was delivered using *this* thread. */
continue;
}
/* We only respond to sigqueue()'d signals. */
if (info.si_code != SI_QUEUE)
continue;
/* Clear response. We don't leak stack data! */
memset(&response, 0, sizeof response);
/* Question? */
switch (info.si_value.sival_int) {
case 0: /* Counter */
response.sival_int = *(volatile int *)(&counter);
break;
default: /* Unknown; respond with -1. */
response.sival_int = -1;
}
/* Respond. */
sigqueue(info.si_pid, signum, response);
}
}
static int install_responder(const int signum)
{
pthread_t worker_id;
pthread_attr_t attrs;
sigset_t mask;
int retval;
/* Mask contains only signum. */
sigemptyset(&mask);
if (sigaddset(&mask, signum))
return errno;
/* Block signum, in all threads. */
if (sigprocmask(SIG_BLOCK, &mask, NULL))
return errno;
/* Start responder() thread with a small stack. */
pthread_attr_init(&attrs);
pthread_attr_setstacksize(&attrs, 32768);
retval = pthread_create(&worker_id, &attrs, responder,
(void *)(long)signum);
pthread_attr_destroy(&attrs);
return errno = retval;
}
int main(void)
{
if (install_responder(INFO_SIGNAL)) {
fprintf(stderr, "Cannot install responder signal handler: %s.\n",
strerror(errno));
return 1;
}
fprintf(stderr, "PID = %d\n", (int)getpid());
fflush(stderr);
while (1) {
struct timespec t;
counter++;
if (!(counter % 100)) {
putchar('.');
fflush(stdout);
}
t.tv_sec = 0;
t.tv_nsec = 10000000; /* 10ms */
nanosleep(&t, NULL);
}
return 0;
}
For both app1.c and app2.c the application itself is the same.
The only modifications needed to the application are making sure all the necessary header files get #included, adding responder() and install_responder(), and a call to install_responder() as early as possible in main().
(app1.c and app2.c only differ in responder() and install_responder(); and in that app2.c needs pthreads.)
Both app1.c and app2.c use the signal SIGRTMAX-1, which should be unused in most applications.
The app2.c approach also has a useful side effect you might wish to use in general. Suppose you use other signals in your application but don't want them to interrupt blocking I/O calls et cetera (perhaps you need signals, but also depend on a third-party library that does not handle EINTR correctly). You can then simply block those signals after the install_responder() call in your application. The only thread where the signals are not blocked is then the responder thread, and the kernel will use that thread to deliver the signals. Therefore, the only thread that will ever get interrupted by signal delivery is the responder thread, more specifically the sigwaitinfo() call in responder(), and it ignores any interruptions. If you use for example async I/O or timers, or this is a heavy math or data processing application, this might be useful.
Both application implementations can be queried using a very simple query program, query.c:
#define _POSIX_C_SOURCE 200112L
#include <unistd.h>
#include <signal.h>
#include <string.h>
#include <errno.h>
#include <time.h>
#include <stdio.h>
int query(const pid_t process, const int signum,
const int question, int *const response)
{
sigset_t prevmask, waitset;
struct timespec timeout;
union sigval value;
siginfo_t info;
int result;
/* Value sent to the target process. */
value.sival_int = question;
/* Waitset contains only signum. */
sigemptyset(&waitset);
if (sigaddset(&waitset, signum))
return errno = EINVAL;
/* Block signum; save old mask into prevmask. */
if (sigprocmask(SIG_BLOCK, &waitset, &prevmask))
return errno;
/* Send the signal. */
if (sigqueue(process, signum, value)) {
const int saved_errno = errno;
sigprocmask(SIG_SETMASK, &prevmask, NULL);
return errno = saved_errno;
}
while (1) {
/* Wait for a response within five seconds. */
timeout.tv_sec = 5;
timeout.tv_nsec = 0L;
/* Set si_code to an uninteresting value,
* just to be safe. */
info.si_code = SI_KERNEL;
result = sigtimedwait(&waitset, &info, &timeout);
if (result == -1) {
/* Some other signal delivered? */
if (errno == EINTR)
continue;
/* No response; fail. */
sigprocmask(SIG_SETMASK, &prevmask, NULL);
return errno = ETIMEDOUT;
}
/* Was this an interesting signal? */
if (result == signum && info.si_code == SI_QUEUE) {
if (response)
*response = info.si_value.sival_int;
/* Return success. */
sigprocmask(SIG_SETMASK, &prevmask, NULL);
return errno = 0;
}
}
}
int main(int argc, char *argv[])
{
pid_t pid;
int signum, question, response;
long value;
char dummy;
if (argc < 3 || argc > 4 ||
!strcmp(argv[1], "-h") || !strcmp(argv[1], "--help")) {
fprintf(stderr, "\n");
fprintf(stderr, "Usage: %s [ -h | --help ]\n", argv[0]);
fprintf(stderr, " %s PID SIGNAL [ QUERY ]\n", argv[0]);
fprintf(stderr, "\n");
return 1;
}
if (sscanf(argv[1], " %ld %c", &value, &dummy) != 1) {
fprintf(stderr, "%s: Invalid process ID.\n", argv[1]);
return 1;
}
pid = (pid_t)value;
if (pid < (pid_t)1 || value != (long)pid) {
fprintf(stderr, "%s: Invalid process ID.\n", argv[1]);
return 1;
}
if (sscanf(argv[2], "SIGRTMIN %ld %c", &value, &dummy) == 1)
signum = SIGRTMIN + (int)value;
else
if (sscanf(argv[2], "SIGRTMAX %ld %c", &value, &dummy) == 1)
signum = SIGRTMAX + (int)value;
else
if (sscanf(argv[2], " %ld %c", &value, &dummy) == 1)
signum = value;
else {
fprintf(stderr, "%s: Unknown signal.\n", argv[2]);
return 1;
}
if (signum < SIGRTMIN || signum > SIGRTMAX) {
fprintf(stderr, "%s: Not a realtime signal.\n", argv[2]);
return 1;
}
/* Parse the optional query value; default to 0. */
if (argc > 3) {
if (sscanf(argv[3], " %d %c", &question, &dummy) != 1) {
fprintf(stderr, "%s: Invalid query.\n", argv[3]);
return 1;
}
} else
question = 0;
if (query(pid, signum, question, &response)) {
switch (errno) {
case EINVAL:
fprintf(stderr, "%s: Invalid signal.\n", argv[2]);
return 1;
case EPERM:
fprintf(stderr, "Signaling that process was not permitted.\n");
return 1;
case ESRCH:
fprintf(stderr, "No such process.\n");
return 1;
case ETIMEDOUT:
fprintf(stderr, "No response.\n");
return 1;
default:
fprintf(stderr, "Failed: %s.\n", strerror(errno));
return 1;
}
}
printf("%d\n", response);
return 0;
}
Note that I did not hardcode the signal number here; use SIGRTMAX-1 on the command line for app1.c and app2.c. (You can change it. query.c does understand SIGRTMIN+n too. You must use a realtime signal, SIGRTMIN+0 to SIGRTMAX-0, inclusive.)
You can compile all three programs using
gcc -Wall -O3 app1.c -o app1
gcc -Wall -O3 app2.c -lpthread -o app2
gcc -Wall -O3 query.c -o query
Both ./app1 and ./app2 print their PIDs, so you don't need to look them up. (You can find the PID using e.g. ps -o pid= -C app1 or ps -o pid= -C app2, though.)
If you run ./app1 or ./app2 in one shell (or both in separate shells), you can see them outputting the dots at about once per second. The counter increases every 1/100th of a second. (Press Ctrl+C to stop.)
If you run ./query PID SIGRTMAX-1 in another shell in the same directory on the same machine, you can see the counter value.
An example run on my machine:
A$ ./app1
PID = 28519
...........
B$ ./query 28519 SIGRTMAX-1
11387
C$ ./app2
PID = 28522
...
B$ ./query 28522 SIGRTMAX-1
371
As mentioned, the downside of this mechanism is that the response is limited to one int (or technically an int or a void *). There are ways around that, however, by also using some of the methods Basile Starynkevitch outlined. Typically, the signal is then just a notification for the application that it should update the state stored in a file, shared memory segment, or wherever. I recommend using the dedicated thread approach for that, as it has very little overhead and minimal impact on the application itself.
Any questions?
A hard-coded systemtap solution could look like:
% cat FOO.stp
global counts
probe process("/path/to/your/binary").function("CertainFunction") { counts[pid()] <<< 1 }
probe process("/path/to/your/binary").end { printf("pid %d count %d\n", pid(), @count(counts[pid()]))
delete counts[pid()] }
# stap FOO.stp
pid 42323 count 112
pid 2123 count 0
... etc, until interrupted
Thanks for the responses. There is lots of good information in the other answers. However, here's what I did. First I tweaked the program to add a counter in a shm file:
struct StatsCounter {
char counterName[8];
unsigned long int counter;
};
StatsCounter * stats;
void initStatsCounter()
{
int fd = shm_open("/TestStats", O_RDWR|O_CREAT, 0644);
if (fd == -1)
{
syslog(priority, "%s:: Initialization Failed", __func__);
stats = (StatsCounter *) malloc(sizeof(StatsCounter));
}
else
{
// For now, just one StatsCounter is used, but it could become an array.
ftruncate(fd, sizeof(StatsCounter));
stats = (StatsCounter *) mmap(NULL, sizeof(StatsCounter),
PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
}
// Initialize names. Pad them to 7 chars (save room for \0).
snprintf(stats[0].counterName, sizeof(stats[0].counterName), "nRespX ");
stats[0].counter = 0;
}
And changed processServerResponseX to increment stats[0].counter in the locked section. Then I changed the script to parse the shm file with "hexdump":
hexdump /dev/shm/TestStats -e ' 1/8 "%s " 1/8 "%d\n"'
This will then show something like this:
nRespX 23
This way I can extend this later if I want to also look at response Y, ...
I am not sure whether there are mutual exclusion problems if hexdump reads the file while it is being changed. In my case I don't think it matters, because the script only calls it before and after the test, so it should never run in the middle of an update.
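If parsing with hexdump ever becomes limiting, the same shared memory object could also be read from a small C helper. The following is only a sketch: read_stats_counter() is a hypothetical name, the struct layout is copied from the snippet above, and the object name is passed in by the caller. It takes a plain snapshot with memcpy() and does no locking, so like hexdump it could in principle observe the counter in the middle of an update if the writer is active.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Same layout as the StatsCounter used by the writer. */
struct StatsCounter {
    char counterName[8];
    unsigned long int counter;
};

/* Hypothetical helper: copy the first StatsCounter stored in the POSIX
 * shared memory object 'name' into *out.
 * Returns 0 on success, -1 on failure. No locking is performed. */
int read_stats_counter(const char *name, struct StatsCounter *out)
{
    struct stat st;
    void *p;
    int fd = shm_open(name, O_RDONLY, 0);

    if (fd == -1)
        return -1;

    /* Refuse objects too small to hold one counter (avoids SIGBUS). */
    if (fstat(fd, &st) == -1 || st.st_size < (off_t)sizeof *out) {
        close(fd);
        return -1;
    }

    p = mmap(NULL, sizeof *out, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);  /* the mapping stays valid after close() */
    if (p == MAP_FAILED)
        return -1;

    memcpy(out, p, sizeof *out);
    munmap(p, sizeof *out);
    return 0;
}
```

On older glibc versions, programs using shm_open() may need to be linked with -lrt.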

Creating a thread which writes into a file

I am learning threads and trying to write code that creates a thread which writes to a file. pthread_create() returns 0 if the thread has been created. My code does return 0, and it does enter the write() function, but nothing is written to the file. I added a printf() statement just to check that the function is entered. I want the input to be taken from the command line, but that does not work either, so to keep things simple I only write "hello world" to the file.
Here is the code:
#include<stdio.h>
#include<stdlib.h>
#include<pthread.h>
void *write(void *arg)
{
printf("HI \n");
FILE *fp;
fp = fopen("file.txt", "a");
if (fp == NULL) {
printf("error\n");
} else {
fprintf(fp, "hello world");
}
}
int main()
{
pthread_t thread;
int tid;
tid = pthread_create(&thread, NULL, write, NULL);
printf("thread1 return %d \n", tid);
exit(0);
}
I suspect what's happening is the exit() call is executing before the fprintf() gets to the point of putting content into the buffer.
pthread_create() returns after creating the thread, not after the thread finishes, and then both threads run simultaneously. Maybe this is your first "race condition"?
void *result; pthread_join(thread, &result); will wait for the function running in the other thread to return (and get its return value). Note that the first argument is the pthread_t handle, not the int returned by pthread_create().
Correction: I forgot that the file is never explicitly closed, which can thwart you as well. Call fflush() or fclose() after the fprintf().
You need to join with the thread to wait for it to finish before exiting your main program.
tid=pthread_create(&thread,NULL,write,NULL);
printf("thread1 return %d \n",tid);
pthread_join(thread, NULL);
exit(0);
Your thread function should return a value since it is declared to do so. Returning NULL is fine.
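Putting the points from the answers above together (join the thread, close the file, return a value from the thread function), a minimal sketch might look like the following. The names write_line() and write_in_thread() are illustrative and not from the question; the thread function is deliberately not called write(), to avoid confusion with the POSIX write() function.

```c
#include <pthread.h>
#include <stdio.h>

/* Thread body: append one line to the file whose path is passed in arg.
 * Returns a non-NULL pointer on success, NULL on failure. */
static void *write_line(void *arg)
{
    const char *path = arg;
    FILE *fp = fopen(path, "a");

    if (fp == NULL) {
        perror("fopen");
        return NULL;
    }
    fprintf(fp, "hello world\n");
    fclose(fp);           /* flushes the buffered data to the file */
    return (void *)path;  /* any non-NULL value signals success */
}

/* Start the writer thread and wait for it to finish.
 * Returns 0 on success, nonzero on failure. */
int write_in_thread(const char *path)
{
    pthread_t thread;
    void *result = NULL;
    int err;

    err = pthread_create(&thread, NULL, write_line, (void *)path);
    if (err != 0)
        return err;

    /* Without this join, main could exit before the thread has run. */
    err = pthread_join(thread, &result);
    if (err != 0)
        return err;

    return result != NULL ? 0 : -1;
}
```

From main(), calling write_in_thread("file.txt") and checking its return value would replace the pthread_create() and exit(0) pair in the question.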
I think you should use this code:
#include <thread>
#include <fstream>
#include <string>
using namespace std;
void write(string filename)
{
ofstream outfile(filename);
outfile<<"Hello World!"<<endl;
outfile.close();
}
int main()
{
thread t(write, "file.txt");
t.join();
}
Use this command to compile the code: g++ -g -std=c++11 test.cpp -lpthread
