I have a multi-threaded application, and I have a way to telnet/ssh into it. In my application, I restart one of the init scripts using the custom system() call below. It seems the child process is still active afterwards. I say this because if I log out from the telnet session, the process hangs, i.e. it cannot log out. This happens only when I restart the script using this system call. Is there something wrong with my system() function?
int system(const char *command)
{
    int wait_val, pid;
    struct sigaction sa, save_quit, save_int;
    sigset_t save_mask;

    syslog(LOG_ERR, "SJ.. calling this system function\r\n");
    if (command == 0)
        return 1;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = SIG_IGN;
    /* __sigemptyset(&sa.sa_mask); - done by memset() */
    /* sa.sa_flags = 0; - done by memset() */

    sigaction(SIGQUIT, &sa, &save_quit);
    sigaction(SIGINT, &sa, &save_int);
    __sigaddset(&sa.sa_mask, SIGCHLD);
    sigprocmask(SIG_BLOCK, &sa.sa_mask, &save_mask);

    if ((pid = vfork()) < 0) {
        perror("vfork fails: ");
        wait_val = -1;
        goto out;
    }
    if (pid == 0) {
        sigaction(SIGQUIT, &save_quit, NULL);
        sigaction(SIGINT, &save_int, NULL);
        sigprocmask(SIG_SETMASK, &save_mask, NULL);

        struct sched_param param;
        param.sched_priority = 0;
        sched_setscheduler(0, SCHED_OTHER, &param);
        setpriority(PRIO_PROCESS, 0, 5);

        execl("/bin/sh", "sh", "-c", command, (char *) 0);
        _exit(127);
    }

#if 0
    __printf("Waiting for child %d\n", pid);
#endif

    if (wait4(pid, &wait_val, 0, 0) == -1)
        wait_val = -1;

out:
    sigaction(SIGQUIT, &save_quit, NULL);
    sigaction(SIGINT, &save_int, NULL);
    sigprocmask(SIG_SETMASK, &save_mask, NULL);
    return wait_val;
}
Any ideas on how to debug whether this system() call is hanging or not?
I realized this happens because file descriptors are inherited upon fork.
My custom system() is nothing but fork() and exec(), and there are plenty of sockets in my application, so those socket file descriptors get inherited by the child process.
My assumption here is that the child process can't exit because it is waiting for the parent process to close the file descriptors, or because those file descriptors are in a state where they cannot be closed. Not sure what those states are, though.
So, here is the interesting link I found -
Call system() inside forked (child) process, when parent process has many threads, sockets and IPC
Solution -
linux fork: prevent file descriptors inheritance
I'm not sure I can do this in a big application where sockets are opened in thousands of places. So, here is what I did.
My Solution -
I created a separate process/daemon that listens for commands from the parent application. This communication is socket-based. Since it is a separate application/daemon, it doesn't affect the main application, which runs multiple threads and has a lot of open sockets. This worked for me.
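For reference, here is a minimal sketch of that helper-daemon idea, assuming a Unix-domain socket at a made-up path (/tmp/cmd-daemon.sock) and a trivial one-command-per-connection protocol; both are placeholders rather than the real design:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

int main(void)
{
    int s = socket(AF_UNIX, SOCK_STREAM, 0);
    struct sockaddr_un addr = { .sun_family = AF_UNIX };
    strncpy(addr.sun_path, "/tmp/cmd-daemon.sock", sizeof addr.sun_path - 1);
    unlink(addr.sun_path);
    if (s < 0 || bind(s, (struct sockaddr *) &addr, sizeof addr) < 0 || listen(s, 5) < 0) {
        perror("socket/bind/listen");
        return 1;
    }
    for (;;) {
        int c = accept(s, NULL, NULL);
        if (c < 0)
            continue;
        char cmd[1024];
        ssize_t n = read(c, cmd, sizeof cmd - 1);
        if (n > 0) {
            cmd[n] = '\0';
            system(cmd); /* safe here: this process holds none of the app's sockets */
        }
        close(c);
    }
}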
I believe that this problem will be fixed once I do -
fcntl(fd, F_SETFD, FD_CLOEXEC);
Any comments are welcome here.
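One caveat: fcntl(fd, F_SETFD, FD_CLOEXEC) overwrites any descriptor flags that were already set, so the usual idiom reads the flags first. A minimal sketch:

#include <fcntl.h>

/* Mark an fd close-on-exec without clobbering other descriptor flags. */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
}

On Linux you can also request the flag atomically at creation time, e.g. socket(domain, type | SOCK_CLOEXEC, protocol) or open(path, flags | O_CLOEXEC), which closes the window in which another thread could fork() between creating the descriptor and the fcntl() call.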
Is this a fundamental property of Linux/C, i.e.
that all file descriptors are inherited by default?
Why does the Linux kernel allow this? What advantage do we get out of it?
Related
I am doing a simple server/client program in C which listens on a network interface and accepts clients. Each client is handled in a forked process.
The goal I have is to let the parent process know once a client has disconnected from the child process.
Currently my main loop looks like this:
for (;;) {
    /* 1. [network] Wait for new connection... (BLOCKING CALL) */
    fd_listen[client] = accept(fd_listen[server], (struct sockaddr *)&cli_addr, &clilen);
    if (fd_listen[client] < 0) {
        perror("ERROR on accept");
        exit(1);
    }

    /* 2. [process] Call socketpair */
    if (socketpair(AF_LOCAL, SOCK_STREAM, 0, fd_comm) != 0) {
        perror("ERROR on socketpair");
        exit(1);
    }

    /* 3. [process] Call fork */
    pid = fork();
    if (pid < 0) {
        perror("ERROR on fork");
        exit(1);
    }

    /* 3.1 [process] Inside the Child */
    if (pid == 0) {
        printf("[child] num of clients: %d\n", num_client + 1);
        printf("[child] pid: %ld\n", (long) getpid());
        close(fd_comm[parent]);   // Close the parent socket file descriptor
        close(fd_listen[server]); // Close the server socket file descriptor
        // Tasks that the child process should be doing for the connected client
        child_processing(fd_listen[client]);
        exit(0);
    }
    /* 3.2 [process] Inside the Parent */
    else {
        num_client++;
        close(fd_comm[child]);    // Close the child socket file descriptor
        close(fd_listen[client]); // Close the client socket file descriptor
        printf("[parent] num of clients: %d\n", num_client);
        while ((w = waitpid(-1, &status, WNOHANG)) > 0) {
            printf("[EXIT] child %d terminated\n", w);
            num_client--;
        }
    }
} /* end of for */
It all works well; the only problem I have is (probably) due to the blocking accept call.
When I connect to the above server, a new child process is created and child_processing is called.
However when I disconnect with that client, the main parent process does not know about it and does NOT output printf("[EXIT] child %d terminated\n", w);
But, when I connect with a second client after the first client has disconnected, the main loop is able to finally process the while ( (w = waitpid(-1, &status, WNOHANG)) > 0) part and tell me that the first client has disconnected.
If only ever one client connects and then disconnects, my main parent process will never be able to tell that it has disconnected.
Is there any way to tell the parent process that my client already left?
UPDATE
As I am a real beginner with C, it would be nice if you could provide some short snippets with your answer so I can actually understand it :-)
Your waitpid usage is not correct. It is a non-blocking call, so if no child has finished, the call returns 0:
waitpid(): on success, returns the process ID of the child whose state
has changed; if WNOHANG was specified and one or more child(ren)
specified by pid exist, but have not yet changed state, then 0 is
returned. On error, -1 is returned.
So you immediately fall out of the while loop. Of course this can be caught later, when the first child terminates and a second one lets you process the waitpid again.
Since you need a non-blocking wait, I suggest that you not manage termination directly, but through the SIGCHLD signal, which lets you catch the termination of any child and then call waitpid appropriately in the handler:
void handler(int signal) {
    /* reap every child that has already exited;
       WNOHANG keeps the loop non-blocking */
    while (waitpid(-1, NULL, WNOHANG) > 0) {
    }
}
...
struct sigaction act;
act.sa_flags = 0;
sigemptyset(&(act.sa_mask));
act.sa_handler = handler;
sigaction(SIGCHLD, &act, NULL);
... // now ready to receive SIGCHLD when at least one child changes its state
If I understand correctly, you want to be able to service multiple clients at once, and therefore your waitpid call is correct in that it does not block if no child has terminated.
However, the problem you then have is that you need to be able to process asynchronous child termination while waiting for new clients via accept.
Assuming that you're dealing with a POSIXy system, merely having a SIGCHLD handler established and having the signal unmasked (via sigprocmask, though IIRC it is unmasked by default), should be enough to cause accept to fail with EINTR if a child terminates while you are waiting for a new client to connect - and you can then handle EINTR appropriately.
The reason for this is that a SIGCHLD signal will be automatically sent to the parent process when a child process terminates. In general, system calls such as accept will return an error of EINTR ("interrupted") if a signal is received while they are waiting.
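As a rough sketch of that pattern (the name accept_and_reap is mine, and it assumes the SIGCHLD handler was installed without SA_RESTART, since SA_RESTART would make accept resume instead of failing with EINTR):

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/wait.h>

int accept_and_reap(int listen_fd)
{
    for (;;) {
        int fd = accept(listen_fd, NULL, NULL);
        if (fd >= 0)
            return fd; /* got a new client connection */
        if (errno == EINTR) {
            /* a signal (presumably SIGCHLD) interrupted us: reap children */
            while (waitpid(-1, NULL, WNOHANG) > 0)
                ;
            continue;  /* then go back to waiting for clients */
        }
        perror("accept");
        exit(1);
    }
}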
However, there would still be a race condition, where a child terminates just before you call accept (i.e. in between where you already have waitpid and accept). There are two main possibilities to overcome this:
Do all the child termination processing in your SIGCHLD handler, instead of the main loop. This may not be feasible, however, since there are significant limits to what you are allowed to do within a signal handler. You may not call printf for example (though you may use write).
I do not suggest you go down this path; although it may seem simpler at first, it is the least flexible option and may prove unworkable later.
Write to one end of a non-blocking pipe in your SIGCHLD signal handler. Within the main loop, instead of calling accept directly, use poll (or select) to look for readiness on both the socket and the read end of the pipe, and handle each appropriately.
On Linux (and OpenBSD, I'm not sure about others) you can use ppoll (man page) to avoid the need to create a pipe (and in this case you should leave the signal masked, and have it unmasked during the poll operation; if ppoll fails with EINTR, you know that a signal was received, and you should call waitpid). You still need to set a signal handler for SIGCHLD, but it doesn't need to do anything.
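A rough sketch of that ppoll approach, under the assumptions above (listen_fd is the listening socket, a do-nothing SIGCHLD handler is already installed, and SIGCHLD is unblocked in the saved mask):

#define _GNU_SOURCE
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <sys/wait.h>

void serve(int listen_fd)
{
    sigset_t blocked, wait_mask;
    sigemptyset(&blocked);
    sigaddset(&blocked, SIGCHLD);
    /* keep SIGCHLD blocked everywhere except inside ppoll() itself */
    sigprocmask(SIG_BLOCK, &blocked, &wait_mask);

    struct pollfd pfd = { .fd = listen_fd, .events = POLLIN };
    for (;;) {
        int n = ppoll(&pfd, 1, NULL, &wait_mask); /* unblocks + waits atomically */
        if (n < 0 && errno == EINTR) {
            while (waitpid(-1, NULL, WNOHANG) > 0)
                ; /* reap every child that exited */
            continue;
        }
        if (n > 0 && (pfd.revents & POLLIN)) {
            /* accept() the new client here */
        }
    }
}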
Another option on Linux is to use signalfd (man page) to avoid both the need to create a pipe and set up a signal handler (I think). You should mask the SIGCHLD signal (using sigprocmask) if you use this. When poll (or equivalent) indicates that the signalfd is active, read the signal data from it (which clears the signal) and then call waitpid to reap the child.
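A sketch of the signalfd variant (Linux-specific; helper names here are mine):

#include <signal.h>
#include <sys/signalfd.h>
#include <sys/wait.h>
#include <unistd.h>

/* Create a pollable fd that becomes readable when SIGCHLD is raised. */
int make_sigchld_fd(void)
{
    sigset_t mask;
    sigemptyset(&mask);
    sigaddset(&mask, SIGCHLD);
    sigprocmask(SIG_BLOCK, &mask, NULL); /* SIGCHLD must be blocked for signalfd */
    return signalfd(-1, &mask, 0);       /* include this fd in your poll() set */
}

/* Call this when poll() reports the fd readable. */
void drain_sigchld(int sfd)
{
    struct signalfd_siginfo si;
    read(sfd, &si, sizeof si);           /* consumes the pending SIGCHLD */
    while (waitpid(-1, NULL, WNOHANG) > 0)
        ;                                /* reap every exited child */
}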
On various BSD systems you can use kqueue (OpenBSD man page) instead of poll and watch for signals without needing to establish a signal handler.
On other POSIX systems you may be able to use pselect (documentation) in a similar way to ppoll as described above.
There is also the option of using a library such as libevent to abstract away the OS-specifics.
The Glibc manual has an example of using select. Consult the manual pages for poll, ppoll, pselect for more information about those functions. There is an online book on using Libevent.
Rough example for using select, borrowed from Glibc documentation (and modified):
/* Set up a pipe and set signal handler for SIGCHLD */
int pipefd[2]; /* must be a global variable */
pipe(pipefd);  /* TODO check for error return */
fcntl(pipefd[1], F_SETFL, O_NONBLOCK); /* set write end non-blocking */

/* signal handler */
void sigchld_handler(int signum)
{
    char a = 0; /* write anything, doesn't matter what */
    write(pipefd[1], &a, 1);
}

/* set up signal handler */
signal(SIGCHLD, sigchld_handler);
Where you currently have accept, you need to check the status of both the server socket and the read end of the pipe:
fd_set set, readset;

/* Initialize the file descriptor set. */
FD_ZERO(&set);
FD_SET(fd_listen[server], &set);
FD_SET(pipefd[0], &set);

for (;;) {
    /* select() modifies the set passed to it, so wait on a fresh copy */
    readset = set;
    if (select(FD_SETSIZE, &readset, NULL, NULL, NULL /* no timeout */) < 0) {
        /* TODO check for error return.
           EINTR should just continue the loop. */
        continue;
    }
    if (FD_ISSET(fd_listen[server], &readset)) {
        /* now do accept() etc */
    }
    if (FD_ISSET(pipefd[0], &readset)) {
        /* now do waitpid(), and read a byte from the pipe */
    }
}
Using other mechanisms is generally simpler, so I leave those as an exercise :)
I have a C function to do a fork and exec that will be called twice.
The first call executes a shell script (call it setenv.sh), which can be any kind of shell (bash/korn/c/perl etc.) that sets environment variables. The envp array will be NULL for this call, but the intent is that the function returns a populated array based on environ from the child process after setenv.sh has run.
The second call will be a C or java program that needs a certain environment to run so for this call, the envp array will be the populated one returned from the first call.
int execute(char **args, int argc, char **envp)
{
    char *function = "execute";
    int status, i;
    pid_t p, pid;
    extern int errno;
    sigset_t mask, savemask;
    struct sigaction ignore, saveint, savequit;
    int fd[2];

    pipe(fd);
    sigemptyset(&ignore.sa_mask);
    ignore.sa_handler = SIG_IGN;
    ignore.sa_flags = 0;
    sigaction(SIGINT, &ignore, &saveint);
    sigaction(SIGQUIT, &ignore, &savequit);
    sigemptyset(&mask);
    sigaddset(&mask, SIGCHLD);
    sigprocmask(SIG_BLOCK, &mask, &savemask);

    if ((pid = fork()) < 0) status = -1;
    if (pid == 0) {
        /* Child */
        close(fd[0]);
        sigaction(SIGINT, &saveint, (struct sigaction *) 0);
        sigaction(SIGQUIT, &savequit, (struct sigaction *) 0);
        sigprocmask(SIG_SETMASK, &savemask, (sigset_t *) 0);
        printf("Command Line Parameters\n");
        printf("-----------------------\n");
        for (i = 0; i < argc; i++) {
            printf("[%d]: %s\n", (i + 1), args[i]);
        }
        if (execve(*args, args, envp) < 0) {
            sprintf(err_data, "Failed to execute %s", args[0]);
            perror(err_data);
            return (FAILED);
        }
        write(fd[1], &environ, sizeof(environ));
        close(fd[1]);
    }

    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) {
            status = -1;
            break;
        }
    }
    if (status == 0) {
        read(fd[0], &envp, sizeof(envp));
    }
    close(fd[0]);
    sigaction(SIGINT, &saveint, (struct sigaction *) 0);
    sigaction(SIGQUIT, &savequit, (struct sigaction *) 0);
    sigprocmask(SIG_SETMASK, &savemask, (sigset_t *) 0);
    return (status);
}
This function works fine without the pipe code: it executes a real program passed in, and I can also pass it a set of environment variables in an envp array, and it runs in that environment fine.
However, in testing with the pipe included, I find that after the exec of setenv.sh, the child process never executes the write of environ to the pipe, and the parent then just blocks on the read from the pipe.
I understand why it doesn't work: the exec of the shell script replaces the original C code in the child. The question is, is there a way to achieve the aim of running a shell script with exec and capturing the resulting environment back in the parent (not the same as capturing stdin/stdout/stderr)? Assume you cannot change the contents of setenv.sh because it may be provided by a third party.
No need to nitpick over error handling etc.; this is a work in progress, so I'm just after some input on how to achieve the aim.
An alternative I considered was parsing the setenv.sh script in the parent to obtain the variables into an array which could then be passed to the real program. The problem with this is that the setenv.sh script might contain if-statement blocks and includes of other shell scripts, so I really want to capture the environment at the end of the run of setenv.sh (by exec'ing it) and pass it back to the parent.
Any suggestions appreciated?
You basically can't solve this generally without using the debugging facilities of your operating system and digging into the memory of your child process, which basically requires you to write half of a debugger.
The closest you can get with a third-party script is something like this. Let's say that the script is for /bin/bash. You write your own wrapper script like this:
#!/bin/bash
. setenv.sh
env >&3
Where 3 is the file descriptor number of your pipe. You can write equivalent scripts for other shells. The only reason this works though is because the "setenv.sh" script is executed inside your wrapper script without creating a child process. Environment variables can only be communicated to children of a process.
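For completeness, here is a minimal sketch of the parent side, assuming the wrapper above is saved at a placeholder path /path/to/wrapper.sh: the parent makes the pipe's write end appear as fd 3 in the child, then reads the KEY=VALUE lines back.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int fd[2];
    if (pipe(fd) < 0) { perror("pipe"); return 1; }

    pid_t pid = fork();
    if (pid < 0) { perror("fork"); return 1; }
    if (pid == 0) {
        close(fd[0]);            /* child keeps only the write end */
        if (fd[1] != 3) {
            dup2(fd[1], 3);      /* expose it as fd 3 for "env >&3" */
            close(fd[1]);
        }
        execl("/bin/bash", "bash", "/path/to/wrapper.sh", (char *) 0);
        _exit(127);
    }

    close(fd[1]);                /* parent keeps only the read end */
    FILE *f = fdopen(fd[0], "r");
    char line[4096];
    while (f != NULL && fgets(line, sizeof line, f) != NULL) {
        line[strcspn(line, "\n")] = '\0';
        printf("got: %s\n", line); /* parse into an envp array as needed */
    }
    if (f != NULL)
        fclose(f);
    waitpid(pid, NULL, 0);
    return 0;
}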
In a system I use at work we have environment variables that need to be unified between many different programs that come from various scripts, many of which we don't have any control over. The way we resolved that mess is that instead of environment variables we require those scripts to output "KEY=VALUE\n" lines and then import them into scripts, makefiles, etc. through simple scripts (if required). That's probably the best you can do.
You can use:
extern char **environ;
environ is defined as a global variable in the Glibc source file posix/environ.c.
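For illustration, environ is just a NULL-terminated array of "KEY=VALUE" strings, so you can walk it directly:

#include <stdio.h>

extern char **environ;

int main(void)
{
    /* print every variable in the current environment */
    for (char **e = environ; *e != NULL; e++)
        puts(*e);
    return 0;
}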
I have one simple program that's using Qt Framework.
It uses QProcess to execute RAR and compress some files. In my program I am catching SIGINT and doing something in my code when it occurs:
signal(SIGINT, &unix_handler);
When SIGINT occurs, I check whether the RAR process is done, and if it isn't, I wait for it... The problem is that (I think) the RAR process also gets the SIGINT that was meant for my program, and it quits before it has compressed all files.
Is there a way to run RAR process so that it doesn't receive SIGINT when my program receives it?
Thanks
If you are generating the SIGINT with Ctrl+C on a Unix system, then the signal is being sent to the entire process group.
You need to use setpgid or setsid to put the child process into a different process group so that it will not receive the signals generated by the controlling terminal.
[Edit:]
Be sure to read the RATIONALE section of the setpgid page carefully. It is a little tricky to plug all of the potential race conditions here.
To guarantee 100% that no SIGINT will be delivered to your child process, you need to do something like this:
#define CHECK(x) if (!(x)) { perror(#x " failed"); abort(); /* or whatever */ }

/* Block SIGINT. */
sigset_t mask, omask;
sigemptyset(&mask);
sigaddset(&mask, SIGINT);
CHECK(sigprocmask(SIG_BLOCK, &mask, &omask) == 0);

/* Spawn child. */
pid_t child_pid = fork();
CHECK(child_pid >= 0);
if (child_pid == 0) {
    /* Child */
    CHECK(setpgid(0, 0) == 0);
    execl(...);
    abort();
}

/* Parent */
if (setpgid(child_pid, child_pid) < 0 && errno != EACCES)
    abort(); /* or whatever */

/* Unblock SIGINT */
CHECK(sigprocmask(SIG_SETMASK, &omask, NULL) == 0);
Strictly speaking, every one of these steps is necessary. You have to block the signal in case the user hits Ctrl+C right after the call to fork. You have to call setpgid in the child in case the execl happens before the parent has time to do anything. You have to call setpgid in the parent in case the parent runs and someone hits Ctrl+C before the child has time to do anything.
The sequence above is clumsy, but it does handle 100% of the race conditions.
What are you doing in your handler? There are only certain Qt functions that you can call safely from a Unix signal handler. This page in the documentation identifies which ones they are.
The main problem is that the handler will execute outside of the main Qt event thread. That page also proposes a method to deal with this. I prefer getting the handler to "post" a custom event to the application and handle it that way. I posted an answer describing how to implement custom events here.
Just make the subprocess ignore SIGINT:
child_pid = fork();
if (child_pid == 0) {
    /* child process */
    signal(SIGINT, SIG_IGN);
    execl(...);
}
man sigaction:
During an execve(2), the dispositions of handled signals are reset to the default;
the dispositions of ignored signals are left unchanged.
"No functions registered by atexit() in the calling process image are registered in the new process image".
Here is code:
pid = fork();
if (pid == 0) {
    atexit(check_mem);
    return execv(...);
}
The check_mem function is not getting called after execv(), because of the line quoted above. Are there any hacks to get the function registered after the execv call?
Thanks in advance for your help.
atexit handlers will not execute when you exec* something.
execv replaces the current process image, including any atexit handlers you've registered, so there's really not a lot you can do - your code is gone.
A little tricky but doable: create a shared library (let's call it check_mem.so) with a function like so:
__attribute__((constructor)) void runs_first(void) {
    atexit(check_mem);
}
Note that check_mem needs to be defined in the library, not your program.
Now in execve, put LD_PRELOAD=/path/to/check_mem.so into the environment variables passed to the program (last argument to execve).
What will happen is that when the new program runs, it will load your check_mem library and run the runs_first function before (almost) any other code.
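As a sketch (the paths are placeholders), the environment for the execve call could be built like this:

/* Pass LD_PRELOAD through execve's envp; add any other variables the
   target program needs to the same array. */
char *child_env[] = { "LD_PRELOAD=/path/to/check_mem.so", NULL };
char *child_argv[] = { "/path/to/prog", NULL };
execve("/path/to/prog", child_argv, child_env);
perror("execve failed"); /* only reached if execve itself fails */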
It will only work if the program you are execve'ing is dynamically linked, but AFAIK that is the only limitation.
EDIT: as a comment rightfully stated, it won't work on setuid programs either. I still think there is a good chance it'll cover your use case, though.
A heavier-weight solution is to use ptrace(), as below:
pid = fork();
if (pid == 0) {
    ptrace(PTRACE_TRACEME, 0, 0, 0); /* let the parent trace us */
    return execve(...);
}
wait(NULL); /* child stops at the execve */
ptrace(PTRACE_SETOPTIONS, pid, 0, PTRACE_O_TRACEEXIT);
ptrace(PTRACE_CONT, pid, 0, (void *) 0);
while (1) {
    waitpid(pid, &status, 0);
    /* stop at the PTRACE_EVENT_EXIT reported just before the child exits */
    if ((WSTOPSIG(status) == SIGTRAP) && (status & (PTRACE_EVENT_EXIT << 8)))
        break;
    ptrace(PTRACE_CONT, pid, 0, WSTOPSIG(status));
}
check_mem(); /* child is stopped but not yet gone */
ptrace(PTRACE_CONT, pid, 0, 0);
Acknowledgments : www.wienand.org/junkcode/linux/stopper.c
Like nos said, exec replaces your process. You might try using <stdlib.h>'s int system(const char *s) function instead to start a program with args. Unlike execve, system returns when the spawned process exits, e.g.
pid = fork();
if (pid == 0) {
    atexit(check_mem);
    system("program arg1 arg2 ...");
    exit(0); /* Calls atexit handlers. */
}
I'm creating background processes in C using fork().
When I create one of these processes, I add its pid to an array so I can keep track of background processes.
pid = fork();
if (pid == -1)
{
    printf("error: fork()\n");
}
else if (pid == 0)
{
    execvp(*args, args);
    exit(0);
}
else
{
    // add process to tracking array
    addBGroundProcess(pid, args[0]);
}
I have a handler for reaping zombies
void childHandler(int signum)
{
    pid_t pid;
    int status;

    /* loop as long as there are children to process */
    while (1) {
        /* get zombie pids */
        pid = waitpid(-1, &status, WNOHANG);
        if (pid == -1)
        {
            if (errno == EINTR)
            {
                continue;
            }
            break;
        }
        else if (pid == 0)
        {
            break;
        }

        /* Remove this child from tracking array */
        if (pid != mainPid)
            cleanUpChild(pid);
    }
}
When I create a background process, the handler is executing and attempting to clean up the child before I can even make the call to addBGroundProcess.
I'm using commands like emacs& which should not be exiting immediately.
What am I missing?
Thanks.
You're right, there is a race condition there. I suggest that you block the delivery of SIGCHLD using the sigprocmask function. When you have added the new PID to your data structure, unblock the signal again. When a signal is blocked, if that signal is received, the kernel remembers that it needs to deliver that signal, and when the signal is unblocked, it's delivered.
Here's what I mean, specifically:
sigset_t mask, prevmask;

// Initialize mask with just the SIGCHLD signal
sigemptyset(&mask);
sigaddset(&mask, SIGCHLD);
sigprocmask(SIG_BLOCK, &mask, &prevmask); /* block SIGCHLD, get previous mask */

pid = fork();
if (pid == -1)
{
    printf("error: fork()\n");
}
else if (pid == 0)
{
    // The blocked mask is inherited across fork and preserved across exec,
    // so restore the previous mask in the child before exec'ing
    sigprocmask(SIG_SETMASK, &prevmask, NULL);
    execvp(*args, args);
    exit(0);
}
else
{
    // add process to tracking array
    addBGroundProcess(pid, args[0]);
    // Unblock SIGCHLD again
    sigprocmask(SIG_SETMASK, &prevmask, NULL);
}
Also, I think there's a possibility that execvp could be failing. (It's good to handle this in general, even if it's not happening in this case.) It depends exactly how it's implemented, but I don't think that you're allowed to put a & on the end of a command to get it to run in the background. Running emacs by itself is probably what you want in this case anyway, and putting & on the end of a command line is a feature provided by the shell.
Edit: I saw your comments about how you don't want emacs to run in the current terminal session. How do you want it to run, exactly - in a separate X11 window, perhaps? If so, there are other ways of achieving that.
A fairly easy way of handling execvp's failure is to do this:
execvp(*args, args);
perror("execvp failed");
_exit(127);
Your code just catches the exit of the child process it fork'ed, which is not to say that another process wasn't fork'ed by that child first. I'm guessing that emacs in your case is doing another fork() on itself for some reason, and then allowing the initial process to exit (that's a trick daemons will do).
The setsid() function might also be worth looking at, although without writing up some code myself to check it I'm not sure if that's relevant here.
You should not be using the shell with & to run background processes. If you do that, they come out as grandchildren which you cannot track and wait on. Instead you need to either mimic what the shell does to run background processes in your own code, or it would probably work just as well to close the terminal (or rather stdin/out/err) and open /dev/null in its place in the child processes so they don't try to write to the terminal or take control of it.
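As a sketch of that last suggestion (args is the argument vector from the question's code; assumes <fcntl.h> and <unistd.h> are included):

pid = fork();
if (pid == 0) {
    setsid();                         /* new session: no controlling terminal */
    int devnull = open("/dev/null", O_RDWR);
    if (devnull >= 0) {
        dup2(devnull, STDIN_FILENO);  /* stdin  -> /dev/null */
        dup2(devnull, STDOUT_FILENO); /* stdout -> /dev/null */
        dup2(devnull, STDERR_FILENO); /* stderr -> /dev/null */
        if (devnull > STDERR_FILENO)
            close(devnull);
    }
    execvp(*args, args);
    _exit(127);                       /* exec failed */
}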