how to correctly wait for execve to finish? - c

A C program (compiled and running on Linux CentOS 6.3) has the line:
execve(cmd, argv, envp);
execve does not return, but I want to modify the code to know when it is finished. So I do this:
if (child = fork()) {
waitpid(child, NULL, 0);
/*now I know execve is finished*/
exit(0);
}
execve(cmd, argv, envp);
When I do this, the resulting program works 99% of the time, but very rarely it exhibits strange errors.
Is anything wrong with the above?? I expect the above code to run precisely (except a little slower) as before. Am I correct?
If you want to know the background, the modified code is dash. The execve call is used to run a simple command, after dash has figured out the string to run. When I modify precisely as above (without even running anything after waiting) and recompile and run programs under the modified dash, most of the time they run fine. However, a recompilation of one particular kernel module called "biosutility" gives me this error
cc1: error: unrecognized command line option "-mfentry"

Read the documentation of execve(2), fork(2), and waitpid(2) carefully. Both execve and fork are tricky, and fork is difficult to understand. I strongly suggest reading Advanced Linux Programming (freely available online, though you could also buy the paper book), which has several chapters on these questions.
(Don't be afraid of spending a few days reading and understanding these system calls; they are tricky.)
Some important points.
every system call can fail and you should always handle its failure, at least by showing some error message with perror(3) and immediately exit(3)-ing.
the execve(2) syscall normally does not return: it returns only on failure (on success the calling program has been replaced, so there is nothing left to return to). Hence most calls to it (and to the similar exec(3) functions) look like:
if (execve(cmd, argv, envp)) { perror (cmd); exit(127); };
/* else branch cannot be reached! */
it is customary to use a weird exit code like 127 (usually unused, except like above) on execve failure, and very often you could not do anything else.
When used (almost always) with fork you'll often call execve in the child process.
the fork(2) syscall returns twice on success (once in parent process, once in child process). This is tricky to understand, read the references I gave. It returns once only on failure. So you always keep the result of fork, so typical code would be:
pid_t pid = fork ();
if (pid<0) { // fork has failed
perror("fork"); exit(EXIT_FAILURE);
}
else if (pid==0) { // successful fork in the child process
// very often you call execve in child, so you don't continue here.
// example code:
if (execve(cmd, argv, envp)) { perror (cmd); exit(127); };
// not reached!
};
// here pid is positive, we are in the parent and fork succeeded....
/// do something sensible, at some point you need to call waitpid and use pid
Suggestion: use strace(1) on some programs, perhaps try strace -f bash -c 'date; pwd' and study the output. It mentions many syscalls(2)....
Your sample code might (sometimes) work by just adding some else like
// better code, but still wrong because of unhandled failures....
if ((child = fork())>0) {
waitpid(child, NULL, 0);
/*now I know execve is finished*/
exit(0);
}
/// missing handling of `fork` failure!
else if (!child) {
execve(cmd, argv, envp);
/// missing handling of `execve` failure
}
but that code is still incorrect because failures are not handled.
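As a minimal sketch of what handling those failures could look like: the helper name spawn_and_wait is hypothetical (it is not part of dash), and it assumes argv[0] holds the full path of the program. It checks fork, reports an execve failure and leaves the child with _exit(127) (so the parent's stdio buffers are not flushed a second time), retries waitpid on EINTR, and returns the raw wait status so the caller can inspect or propagate it.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with the given arguments and
 * environment, wait for it, and return the raw wait status.
 * Returns -1 if fork() or waitpid() itself failed. */
int spawn_and_wait(char *const argv[], char *const envp[])
{
    pid_t pid = fork();
    if (pid < 0) {              /* fork failed: no child exists */
        perror("fork");
        return -1;
    }
    if (pid == 0) {             /* child */
        execve(argv[0], argv, envp);
        /* Only reached if execve failed. */
        perror(argv[0]);
        _exit(127);
    }
    /* parent: wait for exactly this child, retrying on EINTR */
    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) {
            perror("waitpid");
            return -1;
        }
    }
    return status;
}
```

The caller can then test the result with WIFEXITED and WEXITSTATUS instead of assuming success.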

Here's one possibility.
dash does, in fact, need to know when a child process terminates. It must reap the child (by waiting for it) to avoid filling the process table with zombies, and anyway it cares about the exit status of the process.
Now, it knows what the PID of the process it started was, and it can use that when it does a wait to figure out which process terminated and therefore what to do with the exit status.
But you are doing an extra fork. So dash thinks it started some process with PID, say, 368. But you fork a new child, say PID 723. Then you wait for that child, but you ignore the status code. Finally, your process terminates successfully. So then dash notices that process 368 terminated successfully. Even if it didn't.
Now suppose dash was actually executing a script like
do_something && do_something_else
The programmer has specified that the shell definitely shouldn't do_something_else if do_something failed. Terrible things could happen. Or at least mysterious things. Yet, you have hidden that failure. So dash cheerfully fires up do_something_else. Et voilà.
Well, it's just a theory. I have no idea, really, but it shows the sort of thing that can happen.
The bottom line is that dash has some mechanism which lets it know when child processes have finished, and if you want to hook into the exit handling of a child process, you'd be much better off figuring out how that mechanism works so that you can hook into it. Trying to add your own additional mechanism is almost certain to end in tears.

Following Rici's excellent comments and answer, I found the root cause of the problem.
The original code exits with whatever cmd exited. I changed that to exit with 0 always. That is why the code behaves differently.
The following fix does not exhibit the error:
int status;
if (child = fork()) {
waitpid(child, &status, 0);
/*now we know execve is finished*/
if (WIFEXITED(status))
exit(WEXITSTATUS(status));
exit(1);
}
execve(cmd, argv, envp);

for this question: "is anything wrong with the above??"
and regarding this code:
if (child = fork()) {
waitpid(child, NULL, 0);
/*now I know execve is finished*/
exit(0);
}
execve(cmd, argv, envp);
the fork() function has three kinds of returned value:
-1 means an error occurred
=0 means fork() was successful and the child process is running
>0 means fork() was successful and the parent process is running
the call to execve() needs to be followed
(for the rare case of the call failing) with
perror( "execve failed" );
exit( EXIT_FAILURE );
the call to fork() returns a pid_t.
After the call, the code needs to be similar to: (using child as the pid variable)
if( 0 > child )
{
perror( "fork failed");
exit( EXIT_FAILURE );
}
else if( 0 == child )
{ // then child process
execve(cmd, argv, envp);
perror( "execve failed" );
exit( EXIT_FAILURE );
}
//else
//{ // else parent process
waitpid(child, NULL, 0);
exit( EXIT_SUCCESS );
for your second question, about the error message:
cc1: error: unrecognized command line option "-mfentry"
the word: unrecognized is mis-spelled, so this is not the actual error message.
This error message is not related to the changes you made to dash.
dash does not directly invoke any compile operations, so I suspect the two issues are unrelated.
I suggest looking at the makefile for biosutility to find out why an invalid parameter is being passed to cc1.

Related

Why does a process create a zombie if execv fails, but not if execv is successful and terminates?

So I am confused by the behavior of my C program. I am using the construct,
int pid = fork();
if (pid == 0) {
if(file_upload_script_path) {
rc = execv(file_upload_script_path, args);
if(rc == -1) {
printf("Error has occured when starting file_upload.exp!\n");
exit(0);
}
} else {
printf("Error with memory allocation!\n");
}
}
else {
printf("pid=%d\n", pid);
}
To fork the process and run a script for doing file upload. The script will by itself terminate safely, either by finishing the upload or failing.
Now, there was a problem with the script path, causing execv to fail. Here I noted that the child process terminates cleanly if execv succeeds, but if it fails (rc == -1) and I exit the process, it becomes a zombie. Does anyone know why this happens?
Note that I know why the child process becomes a zombie. What I am confused about is why the process does not become a zombie if execv works.
EDIT:
I got a question about errno and the cause of the error. The cause of the error is known. There was a problem with the build process, so the path of the script was different than expected.
However, this may happen again and I want to make sure my program does not start spawning zombies when it does. The behavior where zombies are created in some situations and not in others is very confusing.
BR
Patrik
If you don't want to create zombies, your program has to reap its child processes regardless of whether they call execv and regardless of whether that call succeeds. To reap zombie processes "automagically", handle the SIGCHLD signal:
void handle_sigchld(int sig) {
int saved_errno = errno;
while (waitpid((pid_t)(-1), 0, WNOHANG) > 0) {}
errno = saved_errno;
}
int main() {
signal(SIGCHLD, handle_sigchld);
// rest of your program....
}
Inspired (no... ripped off) from: this link.
Or maybe you want to reap only this specific child, because later you want to call fork() again and handle that child's return value yourself. Then pass the pid returned by fork() in your parent to the signal handler and wait on this pid in the SIGCHLD handler if needed (with some checking, e.g. if the pid has already finished, ignore future SIGCHLD signals, etc.).
In this scenario, when the execv fails, the child process is killed. The fun part, I think is what happens when you call exec family of functions.
The exec family of functions replaces the current image of the process with the new image of the binary you are about to exec.
So, whatever code was will not remain - and the error in your script would cause its death.
Here, the parent needs to listen on the death of the child process using wait flavour of functions (read: waitpid).
When you say that there's problem in the script, it means that the execv actually succeeded in creating the new image; but the latter failed of its own accord.
This is what I think is happening...
If the printf of if (rc==-1) is being executed, then perhaps changing exit(0) to _exit(0) should take care of it.

Calling kill on a child process with SIGTERM terminates parent process, but calling it with SIGKILL keeps the parent alive

This is a continuation of How to prevent SIGINT in child process from propagating to and killing parent process?
In the above question, I learned that SIGINT wasn't being bubbled up from child to parent, but rather, is issued to the entire foreground process group, meaning I needed to write a signal handler to prevent the parent from exiting when I hit CTRL + C.
I tried to implement this, but here's the problem. Regarding specifically the kill syscall I invoke to terminate the child, if I pass in SIGKILL, everything works as expected, but if I pass in SIGTERM, it also terminates the parent process, showing Terminated: 15 in the shell prompt later.
Even though SIGKILL works, I want to use SIGTERM because, from what I've read, it seems like a better idea in general: it gives the process being signaled a chance to clean up after itself.
The below code is a stripped down example of what I came up with
#include <stdio.h>
#include <signal.h>
#include <stdlib.h>
#include <unistd.h>
pid_t CHILD = 0;
void handle_sigint(int s) {
(void)s;
if (CHILD != 0) {
kill(CHILD, SIGTERM); // <-- SIGKILL works, but SIGTERM kills parent
CHILD = 0;
}
}
int main() {
// Set up signal handling
char str[2];
struct sigaction sa = {
.sa_flags = SA_RESTART,
.sa_handler = handle_sigint
};
sigaction(SIGINT, &sa, NULL);
for (;;) {
printf("1) Open SQLite\n"
"2) Quit\n"
"-> "
);
scanf("%1s", str);
if (str[0] == '1') {
CHILD = fork();
if (CHILD == 0) {
execlp("sqlite3", "sqlite3", NULL);
printf("exec failed\n");
} else {
wait(NULL);
printf("Hi\n");
}
} else if (str[0] == '2') {
break;
} else {
printf("Invalid!\n");
}
}
}
My educated guess as to why this is happening would be something intercepts the SIGTERM, and kills the entire process group. Whereas, when I use SIGKILL, it can't intercept the signal so my kill call works as expected. That's just a stab in the dark though.
Could someone explain why this is happening?
As a side note, I'm not thrilled with my handle_sigint function. Is there a more standard way of killing an interactive child process?
You have too many bugs in your code (from not clearing the signal mask on the struct sigaction) for anyone to explain the effects you are seeing.
Instead, consider the following working example code, say example.c:
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <signal.h>
#include <string.h>
#include <stdio.h>
#include <errno.h>
/* Child process PID, and atomic functions to get and set it.
* Do not access the internal_child_pid, except using the set_ and get_ functions.
*/
static pid_t internal_child_pid = 0;
static inline void set_child_pid(pid_t p) { __atomic_store_n(&internal_child_pid, p, __ATOMIC_SEQ_CST); }
static inline pid_t get_child_pid(void) { return __atomic_load_n(&internal_child_pid, __ATOMIC_SEQ_CST); }
static void forward_handler(int signum, siginfo_t *info, void *context)
{
const pid_t target = get_child_pid();
if (target != 0 && info->si_pid != target)
kill(target, signum);
}
static int forward_signal(const int signum)
{
struct sigaction act;
memset(&act, 0, sizeof act);
sigemptyset(&act.sa_mask);
act.sa_sigaction = forward_handler;
act.sa_flags = SA_SIGINFO | SA_RESTART;
if (sigaction(signum, &act, NULL))
return errno;
return 0;
}
int main(int argc, char *argv[])
{
int status;
pid_t p, r;
if (argc < 2 || !strcmp(argv[1], "-h") || !strcmp(argv[1], "--help")) {
fprintf(stderr, "\n");
fprintf(stderr, "Usage: %s [ -h | --help ]\n", argv[0]);
fprintf(stderr, " %s COMMAND [ ARGS ... ]\n", argv[0]);
fprintf(stderr, "\n");
return EXIT_FAILURE;
}
/* Install signal forwarders. */
if (forward_signal(SIGINT) ||
forward_signal(SIGHUP) ||
forward_signal(SIGTERM) ||
forward_signal(SIGQUIT) ||
forward_signal(SIGUSR1) ||
forward_signal(SIGUSR2)) {
fprintf(stderr, "Cannot install signal handlers: %s.\n", strerror(errno));
return EXIT_FAILURE;
}
p = fork();
if (p == (pid_t)-1) {
fprintf(stderr, "Cannot fork(): %s.\n", strerror(errno));
return EXIT_FAILURE;
}
if (!p) {
/* Child process. */
execvp(argv[1], argv + 1);
fprintf(stderr, "%s: %s.\n", argv[1], strerror(errno));
return EXIT_FAILURE;
}
/* Parent process. Ensure signals are reflected. */
set_child_pid(p);
/* Wait until the child we created exits. */
while (1) {
status = 0;
r = waitpid(p, &status, 0);
/* Error? */
if (r == -1) {
/* EINTR is not an error. Occurs more often if
SA_RESTART is not specified in sigaction flags. */
if (errno == EINTR)
continue;
fprintf(stderr, "Error waiting for child to exit: %s.\n", strerror(errno));
status = EXIT_FAILURE;
break;
}
/* Child p exited? */
if (r == p) {
if (WIFEXITED(status)) {
if (WEXITSTATUS(status))
fprintf(stderr, "Command failed [%d]\n", WEXITSTATUS(status));
else
fprintf(stderr, "Command succeeded [0]\n");
} else
if (WIFSIGNALED(status))
fprintf(stderr, "Command exited due to signal %d (%s)\n", WTERMSIG(status), strsignal(WTERMSIG(status)));
else
fprintf(stderr, "Command process died from unknown causes!\n");
break;
}
}
/* This is a poor hack, but works in many (but not all) systems.
Instead of returning a valid code (EXIT_SUCCESS, EXIT_FAILURE)
we return the entire status word from the child process. */
return status;
}
Compile it using e.g.
gcc -Wall -O2 example.c -o example
and run using e.g.
./example sqlite3
You'll notice that Ctrl+C does not interrupt sqlite3 -- but then again, it does not even if you were to run sqlite3 directly --; instead, you just see ^C on screen. This is because sqlite3 sets up the terminal in such a way that Ctrl+C does not cause a signal, and is just interpreted as normal input.
You can exit from sqlite3 using the .quit command, or pressing Ctrl+D at the start of a line.
You'll see that the original program will output a Command ... [] line afterwards, before returning you to the command line. Thus, the parent process is not killed/harmed/bothered by the signals.
You can use ps f to look at a tree of your terminal processes, and that way find out the PIDs of the parent and child processes, and send signals to either one to observe what happens.
Note that because SIGSTOP signal cannot be caught, blocked, or ignored, it would be nontrivial to reflect the job control signals (as in when you use Ctrl+Z). For proper job control, the parent process would need to set up a new session and a process group, and temporarily detach from the terminal. That too is quite possible, but a bit beyond the scope here, as it involves quite detailed behaviour of sessions, process groups, and terminals, to manage correctly.
Let's deconstruct the above example program.
The example program itself first installs some signal reflectors, then forks a child process, and that child process executes the command sqlite3. (You can specify any executable and any parameter strings to the program.)
The internal_child_pid variable, and set_child_pid() and get_child_pid() functions, are used to manage the child process ID atomically. The __atomic_store_n() and __atomic_load_n() are compiler-provided built-ins; for GCC, see here for details. They avoid the problem of a signal occurring while the child pid is only partially assigned. On some common architectures this cannot occur, but this is intended as a careful example, so atomic accesses are used to ensure only a complete (old or new) value is ever seen. We could avoid using these completely, if we blocked the related signals temporarily during the transition instead. Again, I decided the atomic accesses are simpler, and might be interesting to see in practice.
The forward_handler() function obtains the child process PID atomically, then verifies it is nonzero (that we know we have a child process), and that we are not forwarding a signal sent by the child process (just to ensure we don't cause a signal storm, the two bombarding each other with signals). The various fields in the siginfo_t structure are listed in the man 2 sigaction man page.
The forward_signal() function installs the above handler for the specified signal signum. Note that we first use memset() to clear the entire structure to zeros. Clearing it this way ensures future compatibility, if some of the padding in the structure is converted to data fields.
The .sa_mask field in the struct sigaction is an unordered set of signals. The signals set in the mask are blocked from delivery in the thread that is executing the signal handler. (For the above example program, we can safely say that these signals are blocked while the signal handler is run; it's just that in multithreaded programs, the signals are only blocked in the specific thread that is used to run the handler.)
It is important to use sigemptyset(&act.sa_mask) to clear the signal mask. Simply setting the structure to zero does not suffice, even if it works (probably) in practice on many machines. (I don't know; I haven't even checked. I prefer robust and reliable over lazy and fragile any day!)
The flags used include SA_SIGINFO because the handler uses the three-argument form (and uses the si_pid field of the siginfo_t). The SA_RESTART flag is only there because the OP wished to use it; it simply means that, if possible, the C library and the kernel try to avoid failing with errno == EINTR when a signal is delivered to a thread currently blocked in a syscall (like wait()). You can remove the SA_RESTART flag, and add a debugging fprintf(stderr, "Hey!\n"); in a suitable place in the loop in the parent process, to see what happens then.
The sigaction() function will return 0 if there is no error, or -1 with errno set otherwise. The forward_signal() function returns 0 if the forward_handler was assigned successfully, but a nonzero errno number otherwise. Some do not like this kind of return value (they prefer just returning -1 for an error, rather than the errno value itself), but I've for some unreasonable reason grown fond of this idiom. Change it if you want, by all means.
Now we get to main().
If you run the program without parameters, or with a single -h or --help parameter, it'll print a usage summary. Again, doing it this way is just something I'm fond of -- getopt() and getopt_long() are more commonly used to parse command-line options. For this kind of trivial program, I just hardcoded the parameter checks.
In this case, I intentionally left the usage output very short. It would really be much better with an additional paragraph about exactly what the program does. These kinds of texts -- and especially comments in the code (explaining the intent, the idea of what the code should do, rather than describing what the code actually does) -- are very important. It's been well over two decades since the first time I got paid to write code, and I'm still learning how to comment -- describe the intent of -- my code better, so I think the sooner one starts working on that, the better.
The fork() part ought to be familiar. If it returns -1, the fork failed (probably due to limits or some such), and it is a very good idea to print out the errno message then. The return value will be 0 in the child, and the child process ID in the parent process.
The execvp() function takes two arguments: the name of the binary file (the directories specified in the PATH environment variable will be used to search for such a binary), as well as an array of pointers to the arguments to that binary. The first element of that array will be argv[0] in the new binary, i.e. the command name itself.
The execvp(argv[1], argv + 1); call is actually quite simple to parse, if you compare it to the above description. argv[1] names the binary to be executed. argv + 1 is basically equivalent to (char **)(&argv[1]), i.e. it is an array of pointers that starts with argv[1] instead of argv[0]. Once again, I'm simply fond of the execvp(argv[n], argv + n) idiom, because it allows one to execute another command specified on the command line without having to worry about parsing a command line, or executing it through a shell (which is sometimes downright undesirable).
The man 7 signal man page explains what happens to signal handlers at fork() and exec(). In short, the signal handlers are inherited over a fork(), but reset to defaults at exec(). Which is, fortunately, exactly what we want, here.
If we were to fork first, and then install the signal handlers, we'd have a window during which the child process already exists, but the parent still has default dispositions (mostly termination) for the signals.
Instead, we could just block these signals using e.g. sigprocmask() in the parent process before forking. Blocking a signal means it is made to "wait"; it will not be delivered until the signal is unblocked. In the child process, the signals should be unblocked again just before the exec(): the signal dispositions are reset to defaults over an exec(), but the signal mask is preserved, so signals left blocked would stay blocked in the new program. In the parent process, we could then -- or before forking, it does not matter -- install the signal handlers, and finally unblock the signals. This way we would not need the atomic stuff, nor even check if the child pid is zero, since the child pid will be set to its actual value well before any signal can be delivered!
The while loop is basically just a loop around the waitpid() call, until the exact child process we started exits, or something funny happens (the child process vanishes somehow). This loop contains pretty careful error checking, as well as the correct EINTR handling if the signal handlers were to be installed without the SA_RESTART flag.
If the child process we forked exits, we check the exit status and/or reason it died, and print a diagnostic message to standard error.
Finally, the program ends with a horrible hack: instead of returning EXIT_SUCCESS or EXIT_FAILURE, we return the entire status word we obtained with waitpid when the child process exited. The reason I left this in, is because it is sometimes used in practice, when you want to return the same or as similar exit status code as a child process returned with. So, it's for illustration. If you ever find yourself to be in a situation when your program should return the same exit status as a child process it forked and executed, this is still better than setting up machinery to have the process kill itself with the same signal that killed the child process. Just put a prominent comment there if you ever need to use this, and a note in the installation instructions so that those who compile the program on architectures where that might be unwanted, can fix it.

C-program does not return from wait-statement

I have to migrate a C-program from OpenVMS to Linux, and have now difficulties with a program generating subprocesses. A subprocess is generated (fork works fine), but execve fails (which is correct, as the wrong program name is given).
But to reset the number of active subprocesses, I afterwards call a wait() which does not return. When I look at the process via ps, I see that there are no more subprocesses, but wait() does not return ECHILD as I had thought.
while (jobs_to_be_done)
{
if (running_process_cnt < max_process_cnt)
{
if ((pid = vfork()) == 0)
{
params[0] = param1 ;
params[1] = NULL ;
if ((cstatus = execv(command, params)) == -1)
{
perror("Child - Exec failed") ; // this happens
exit(EXIT_FAILURE) ;
}
}
else if (pid < 0)
{
printf("\nMain - Child process failed") ;
}
else
{
running_process_cnt++ ;
}
}
else // no more free process slot, wait
{
if ((pid = wait(&cstatus)) == -1) // does not return from this statement
{
if (errno != ECHILD)
{
perror("Main: Wait failed") ;
}
anz_sub = 0 ;
}
else
{
...
}
}
}
Is there anything that has to be done to tell the wait-command that there are no more subprocesses?
With OpenVMS the program works fine.
Thanks a lot in advance for your help
I don't recommend using vfork these days on Linux, since fork(2) is efficient enough, thanks to lazy copy-on-write techniques in the Linux kernel.
You should check the result of fork. Unless it is failing, a process has been created, and wait (or waitpid(2), perhaps with WNOHANG if you don't want to really wait, but just find out about already ended child processes ...) should not fail (even if the exec function in the child has failed, the fork did succeed).
You might also carefully use the SIGCHLD signal, see signal(7). A defensive way of using signals is to set some volatile sig_atomic_t flag in signal handlers, and test and clear these flags inside your loop. Recall that only async-signal-safe functions (and there are quite few of them) can be called, even indirectly, inside a signal handler. Read also about POSIX signals.
Take time to read Advanced Linux Programming to get a wider picture in your mind. Don't try to mimic OpenVMS on POSIX, but think in a POSIX or Linux way!
You probably want to always call waitpid in your loop, perhaps (sometimes or always) with WNOHANG. So waitpid should not be called only in the else part of your if (running_process_cnt < max_process_cnt), but probably in every iteration of your loop.
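As a sketch of that idea, the helper below (its name is hypothetical) reaps every child that has already terminated without blocking and reports how many it collected, so the caller can adjust its own counter.

```c
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: collect all children that have already exited.
 * Never blocks; returns the number of children reaped (possibly 0). */
int reap_finished_children(void)
{
    int reaped = 0;
    int status;
    pid_t pid;

    /* waitpid() returns 0 if children exist but none has exited yet,
     * and -1 (ECHILD) if there are no children at all. */
    while ((pid = waitpid((pid_t)-1, &status, WNOHANG)) > 0)
        reaped++;
    return reaped;
}
```

In the loop from the question this could be used at the top of every iteration, for example running_process_cnt -= reap_finished_children();, before deciding whether a free slot is available.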
You might want to compile with all warnings & debug info (gcc -Wall -Wextra -g) then use the gdb debugger. You could also strace(1) your program (probably with -f)
You might want to learn about memory overcommitment. I dislike this feature and usually disable it (e.g. by running echo 2 > /proc/sys/vm/overcommit_memory as root; mode 0 is the default heuristic overcommit). See also proc(5), which is very useful to know about...
From man vfork:
The child must not return from the current function or call exit(3), but may call _exit(2)
You must not call exit() when the call to execv (after vfork) fails - you must use _exit() instead. It is quite possible that this alone is causing the problem you see with wait not returning.
I suggest you use fork instead of vfork. It's much easier and safer to use.
If that alone doesn't solve the problem, you need to do some debugging or reduce the code down until you find the cause. For example the following should run without hanging:
#include <sys/wait.h>
int main(int argc, char ** argv)
{
pid_t pid;
int cstatus;
pid = wait(&cstatus);
return 0;
}
If you can verify that this program doesn't hang, then it must be some aspect of your program that is causing a hang. I suggest putting in print statements just before and after the call to wait.

Network Programming in C with execve system calls

I am trying to create a simple client/server program that allows the client to connect to the server using a TCP socket and then allows the user to issue system calls from the client side to the server side and returns the reply to the user. For example:
Client issues: ls
Server will find ls in /usr/bin or wherever and then execute it using execve()
I will also have something like lls or lmkdir, etc., which will issue the system calls on the client side.
The problem is my execve() is not appearing to run correctly because 'ls' or any other command is not actually being called. I have done this same kind of program before with only a local side (no server or anything) and execve() worked fine. Here is some code:
pid = fork();
if(pid){ // Child
printf("child wait");
pid = wait(&status);
printf("Child dead\n");
}else{ // Parent
if(execPath){
execve(execPath, arglist, env);
printf("Command Complete\n");
}
}
For some reason the printfs in the child section of the PID statement are not executing at all. I do not think the system is actually ever forking a process. Is there something special I would have to do to make this work since it is a client/server type of program or should it work exactly the same?
Thanks
Exactly, execve does not fork. It replaces the current process image with the one specified as its argument and starts from its beginning (i.e. main()). It never returns to your original program.
You probably want to use system() in your use case.
There are several problems in the code:
fork() returns pid for the parent and zero for the child. So parent runs the true branch of the if. And child runs the else branch. Swap those comments.
stdout is line buffered. Add a newline (\n) to the printf before the wait; otherwise you won't see that output until the wait is over and the second printf runs.
Make sure the child also exits in error cases; otherwise the child will go on to run the parent's code while the parent is still waiting for the child to exit.
execve does not return if it succeeds. It returns only if it fails.
So, fixed code could be something like that:
pid = fork();
if(pid){ // Parent
printf("child wait\n");
pid = waitpid(pid, &status, 0);
printf("Child dead\n");
}else{ // Child
if(execPath){
execve(execPath, arglist, env);
printf("execve failed!\n");
}
_exit(1);
}
Or you could use system(3).
Since the child process has not spawned any children of its own, the wait() call is unlikely to return without some other external event (like a signal interrupting the call). You should have the parent wait on the child process instead.
Note that fork() may fail, and you should account for that. Also note that if execve succeeds, it won't return. So, the print statement after it should indicate failure if it is to print anything at all.
Using system() probably would not save you the fork, since you are likely to want the output of the command to be directed to the socket associated with the connected client. But, your code is missing the steps that would allow the output to flow to the client.
switch ((pid = fork())) {
case -1: /* todo: handle error */
break;
case 0: /* child */
dup2(socket, 0); /* todo: check return value */
dup2(socket, 1); /* todo: check return value */
dup2(socket, 2); /* todo: check return value */
close(socket); /* todo: check return value */
execve(...);
/* todo: handle failure */
exit(EXIT_FAILURE);
default: /* parent */
if (pid != waitpid(pid, 0, 0)) {
/* todo: handle error */
}
}
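Here is a sketch of the same switch with the todo items filled in, wrapped in a hypothetical helper called run_with_fd. It uses execvp rather than execve purely to keep the example short (an assumption, not what the question uses), and fd stands for the connected client socket, though any writable descriptor works.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] (searched in PATH) with stdin,
 * stdout and stderr all redirected to fd, then wait for it.
 * Returns the raw wait status, or -1 if fork/waitpid failed. */
int run_with_fd(int fd, char *const argv[])
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* Child: point the three standard descriptors at fd. */
        if (dup2(fd, STDIN_FILENO) == -1 ||
            dup2(fd, STDOUT_FILENO) == -1 ||
            dup2(fd, STDERR_FILENO) == -1)
            _exit(126);
        if (fd > STDERR_FILENO)
            close(fd);          /* the duplicates keep it open */
        execvp(argv[0], argv);
        _exit(127);             /* only reached if execvp failed */
    }
    /* Parent: wait for this particular child. */
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return status;
}
```

A server would call this once per received command. Note that this simple version does not retry waitpid() after EINTR.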

How to cancel an alarm() signal via a child process?

For an assignment, I am working on creating a time aware shell. The shell forks and executes commands and kills them if they run for more than a set amount of time. For example.
input# /bin/ls
a.out code.c
input# /bin/cat
Error - Expired After 10 Seconds.
input#
Now, my question is: is there a way to prevent the alarm from starting if an error occurs while launching the program, that is, when execve returns -1?
Since the child-process runs separately and after hours of experimenting and research I have yet to find anything that discusses or even hints at this type of task, I have a feeling it may be impossible. If it is indeed impossible, how can I prevent something like the following from happening...
input# /bin/fgdsfgs
Error executing program
input# Error - Expired After 10 Seconds.
For context, here is the code I am currently working with, with my attempt at doing this myself removed. Thanks for the help in advance!
while(1){
write(1, prompt, sizeof(prompt)); //Prompt user
byteCount = read(0, cmd, 1024); //Retrieve command from user, and count bytes
cmd[byteCount-1] = '\0'; //Prepare command for execution
//Create Thread
child = fork();
if(child == -1){
write(2, error_fork, sizeof(error_fork));
}
if(child == 0){ //Working in child
if(-1 == execve(cmd,arg,env)){ //Execute program or error
write(2, error_exe, sizeof(error_exe));
}
}else if(child != 0){ //Working in the parent
signal(SIGALRM, handler); //Handle the alarm when it goes off
alarm(time);
wait();
alarm(0);
}
}
According to the man page:
Description
The alarm() function shall cause the system to generate a SIGALRM signal for the process after the number of realtime seconds specified by seconds have elapsed. Processor scheduling delays may prevent the process from handling the signal as soon as it is generated.
If seconds is 0, a pending alarm request, if any, is canceled.
Alarm requests are not stacked; only one SIGALRM generation can be scheduled in this manner. If the SIGALRM signal has not yet been generated, the call shall result in rescheduling the time at which the SIGALRM signal is generated.
Interactions between alarm() and any of setitimer(), ualarm(), or usleep() are unspecified.
So, to cancel an alarm: alarm(0). It is even present in your sample code.
The main problem
By the way, you're missing an important piece here:
if(child == 0){ //Working in child
if(-1 == execve(cmd,arg,env)){ //Execute program or error
write(2, error_exe, sizeof(error_exe));
_exit(EXIT_FAILURE); // EXIT OR A FORKED SHELL WILL KEEP GOING
}
}else if(child != 0){ //Working in the parent
The wait() system call takes an argument; why aren't you compiling with the right headers in the source file and with compiler warnings (preferably errors) for undeclared functions? Or, if you are getting such warnings, pay heed to them before submitting code for review on places like StackOverflow.
You don't need to test the return status from execve() (or any of the exec*() functions); if it returns, it failed.
It is good to write an error on failure. It would be better if the child process exited as well, so that it doesn't go back into the while (1) loop, competing with your main shell for input data.
if (child == 0)
{
execve(cmd, arg, env);
write(2, error_exe, sizeof(error_exe));
exit((errno == ENOEXEC) ? 126 : 127);
}
In fact, the non-exiting of your child is the primary cause of your problem; the wait doesn't return until the alarm goes off because the child hasn't exited. The exit statuses shown are intended to match the POSIX shell specification.

Resources