I want to return a unique status code to a waiting parent process from a child process through exit(), based on the execution of child's code. If execvp fails, then the exit() is used. I assume that if execvp is successful, the command executed will send its status code.
pid=fork();
if(pid==0)
{
    if(execvp(cmdName,cmdArgs)==-1)
    {
        printf("Exec failed!\n");
        exit(K); //K?
    }
}
waitpid(pid,&status,0);
Suppose the command passed to execvp() is "ls"; its man page says that it may return 0 (success), 1, or 2 (failure).
What safe unique value K can I use to indicate the return status of a child process, which won't clash with any value returned by the command executed by execvp()?
For obvious reasons, there cannot be such a value of K that will never clash with the return status of any other program.
Proof: Suppose there was such a K, and you make your program call itself...
There is no safe unique value as every program chooses its return values of which there are only a limited number.
You have to document your program and say what it returns and also provide some form of log to give more details.
I believe anything above 127 (or negative, if you use signed byte) is reserved for OS (on Unix) for reporting segfaults, parity errors and such (any exit due to signal handler and some other things besides). All other exit codes you can use.
Update: found a link for Linux: http://www.tldp.org/LDP/abs/html/exitcodes.html
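As a minimal sketch of the convention mentioned in the other threads here (exit status 127 when the exec itself fails, which is also what the shell and system() use), the helper below forks, execs, and decodes the wait status. The name run_command and its return-value scheme are hypothetical, chosen only for illustration. As the answers above point out, 127 is a convention and not a guaranteed-unique value: the executed command could return 127 itself.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs argv[0] with the argument vector argv.
 * Returns:
 *   0..255  the exit status of the command (127 if execvp() failed),
 *   128+N   if the command was killed by signal N (shell convention),
 *   -1      if fork() or waitpid() failed. */
int run_command(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid == -1) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        execvp(argv[0], argv);
        /* Only reached when execvp() failed. */
        perror("execvp");
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) == -1) {
        perror("waitpid");
        return -1;
    }
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);
    return -1;
}
```

Calling run_command with the vector {"ls", "-l", NULL} would return whatever ls exits with, while a command that cannot be found yields 127.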
So I've been struggling with this exercise. I must get all of the system calls made by any given Linux command of my choice (e.g. ls or cd), list them in a .txt file, and have their unique IDs listed beside them.
So far, here's what I've got:
strace -o filename.txt ls
This when executed in the Linux shell gives me a "filename.txt" file containing all the system calls of the ls command. Now in my C script:
#include <stdio.h>
#include <stdlib.h>
int main(){
    system("strace -o filename.txt ls");
    return 0;
}
This should do the same as the previous command, but it's not returning anything to me, although the code successfully compiles. How would I go about fixing this, and then get the IDs? I'm using the "stdlib" library because in my research I found that it has some relation to system call IDs, but I haven't found any indication of how to get them. Basically, I must read the file I created and give each system call its ID.
The exercise is obviously designed to be solved by using the ptrace() facility, because the strace utility does not have an option to print the syscall number (as far as I know).
Technically, you can use something like
printf '#include <sys/syscall.h>\n' | gcc -dD -E - | awk '$1 == "#define" { m[$2] = $3 } END { for (name in m) if (name ~ /^SYS_/) { v = name; while (v in m) v = m[v]; sub(/^SYS_/, "", name); printf "%s %s\n", v, name } }'
to generate a number of syscall-number syscall-name lines, to be used for mapping syscall names back to syscall numbers, but this would be silly and error-prone. Silly, because being able to use ptrace() gives you much more control than using the strace utility, and using a "clever hack" like above just means you avoid learning how to do that, which in my opinion is by definition self-defeating and therefore utterly silly; and error-prone, because there is absolutely no guarantee that the installed headers match the running architecture. This is especially problematic on multiarch architectures, where you can use -m32 and -m64 compiler options to switch between 32-bit and 64-bit architectures. They typically have completely different syscall numbers.
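For illustration only, the same name-to-number mapping can be sketched in C with the SYS_ macros from <sys/syscall.h>. The table below is a small hand-picked subset, and the names syscall_entry, syscall_table, and syscall_number are hypothetical. It has the same weakness as the shell pipeline: the numbers are those of the architecture the program was compiled for, which need not match the traced process.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/syscall.h>

struct syscall_entry {
    const char *name;
    long number;
};

/* Illustrative subset; a real table would cover every SYS_ macro. */
static const struct syscall_entry syscall_table[] = {
    { "read",       SYS_read },
    { "write",      SYS_write },
    { "close",      SYS_close },
    { "execve",     SYS_execve },
    { "exit_group", SYS_exit_group },
};

/* Returns the syscall number for name, or -1 if it is not in the table. */
long syscall_number(const char *name)
{
    assert(name != NULL);

    size_t count = sizeof syscall_table / sizeof syscall_table[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(syscall_table[i].name, name) == 0)
            return syscall_table[i].number;
    }
    return -1;
}
```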
Essentially, your program should:
fork() a child process.
In the child process:
Enable ptracing by calling prctl(PR_SET_DUMPABLE, 1L)
Make parent process the tracer by calling ptrace(PTRACE_TRACEME, (pid_t)0, (void *)0, (void *)0)
Optionally, set tracing options. For example, call ptrace(PTRACE_SETOPTIONS, getpid(), PTRACE_O_TRACECLONE | PTRACE_O_TRACEEXEC | PTRACE_O_TRACEEXIT | PTRACE_O_TRACEFORK) so that you catch at least clone(), fork(), and exec() family of syscalls.
If you do not set the PTRACE_O_TRACEEXEC option, you should stop the child process at this point using e.g. raise(SIGSTOP);, so that the parent process can start tracing this child.
Execute the command to be traced using e.g. execv(). In particular, if the first command line parameter is the command to run, optionally followed by its options, you can use execvp(argv[1], argv + 1);.
If you set the PTRACE_O_TRACEEXEC option above, then the kernel will auto-pause the child process just before executing the new binary.
If the exec fails, the child process should exit. I like to use exit(127);, to return exit status 127.
In the parent process, use waitpid(childpid, &status, WUNTRACED | WCONTINUED) in a loop, to catch events in the child process.
The very first event should be the initial pause, i.e. WIFSTOPPED(status) being true. (If not, something else went wrong.)
There are three different reasons why waitpid(childpid, &status, WUNTRACED | WCONTINUED) may return:
When the child exits (WIFEXITED(status) will be true).
This should obviously end the tracing, and have the parent tracer process exit, too.
When the child resumes execution (WIFCONTINUED(status) will be true).
You cannot assume that a PTRACE_SYSCALL, PTRACE_SYSEMU, PTRACE_CONT etc. commands have actually caused the child process to continue, until the parent gets this signal. In other words, you cannot just fire ptrace() commands to the child process, and expect them to take place in an orderly fashion! The ptrace() facility is asynchronous, and the call will return immediately; you need to waitpid() for the WIFCONTINUED(status) type of event to know that the child process heeded the command.
When the kernel stopped the child (with SIGTRAP) because the child process is about to execute a syscall. (In the parent, WIFSTOPPED(status) will be true.)
Whenever the child process gets stopped because it is about to execute a syscall, you need to use ptrace(PTRACE_GETREGS, childpid, (void *)0, &regs) to obtain the CPU register state in the child process at the point of syscall execution.
regs is of type struct user_regs_struct, defined in <sys/user.h>. For Intel/AMD architectures, regs.orig_eax (for 32-bit) or regs.orig_rax (for 64-bit) contains the syscall number (SYS_foo as defined in <sys/syscall.h>).
You then need to call ptrace(PTRACE_SYSCALL, childpid, (void *)0, (void *)0) to tell the kernel to execute that syscall, and waitpid() again to wait for the WIFCONTINUED(status) event notifying that it did.
The next WIFSTOPPED(status) type event from waitpid() will occur when the syscall is completed. If you want, you can use PTRACE_GETREGS again to examine regs.eax or regs.rax, which contains the syscall return value; on Intel/AMD, if an error occurred, it will be a negative errno value (i.e. -EACCES, -EINVAL, or similar).
You need to call ptrace(PTRACE_SYSCALL, childpid, (void *)0, (void *)0) to tell the kernel to continue running the child, until the next syscall.
There are quite a few examples on-line showing some of the details above, although most that I have personally seen are pretty lax on error checking, and occasionally omit checking the WIFCONTINUED(status) waitpid() events. I've even written an answer detailing how to stop and continue individual threads on StackOverflow. Since the technique can be used as a very powerful custom debugging tool, I do recommend you try to learn the facility so you can leverage it in your work, rather than just copy-paste some existing code to get a passing grade on the exercise.
I'm using system("./foo 1 2 3") within C to call an external application. I use it inside a for cycle and I want to wait for the foo execution to complete (each execution takes 20/30 seconds) before going into the next cycle iteration. This is a MUST.
The returned system() value only tells me if the process was successfully started or not. So how can I do this?
I looked into fork() and wait() already but didn't manage to do what I want.
Edit: Here's my fork and wait code:
for(i=0;i<64;i++){
    if((pid=fork()==-1)){
        perror("fork error");
        return -1;
    }
    else if(pid==0){
        status=system("./foo 1 2 3"); //THESE 1 2 3 PARAMETERS CHANGE WITHIN EACH ITERATION
    }
    else{ /* start of parent process */
        printf("Parent process started.\n");
        if ((pid = wait(&status)) == -1)/* Wait for child process. */
            printf("wait error");
        else { /* Check status. */
            if (WIFSIGNALED(status) != 0)
                printf("Child process ended because of signal %d.\n",
                       WTERMSIG(status));
            else if (WIFEXITED(status) != 0)
                printf("Child process ended normally; status = %d.\n",
                       WEXITSTATUS(status));
            else
                printf("Child process did not end normally.\n");
        }
    }
}
What happens when I do this is that the PC gets extremely slow, to the point that I need to manually reboot. So I guess what this is doing is starting 64 simultaneous child processes, causing the computer to become really slow.
On a POSIX system, the system function should already be waiting for the command to finish.
http://pubs.opengroup.org/onlinepubs/009695399/functions/system.html
If command is not a null pointer, system() shall return the termination status of the command language interpreter in the format specified by waitpid(). The termination status shall be as defined for the sh utility; otherwise, the termination status is unspecified. If some error prevents the command language interpreter from executing after the child process is created, the return value from system() shall be as if the command language interpreter had terminated using exit(127) or _exit(127). If a child process cannot be created, or if the termination status for the command language interpreter cannot be obtained, system() shall return -1 and set errno to indicate the error.
The one thing to watch out for is if you're starting the program in the background within the command (i.e. if you're doing "./foo &") - the obvious answer is just don't do that.
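A minimal sketch of what the quoted specification implies: because system() returns only after the shell has finished, its return value can be decoded with the <sys/wait.h> macros. The helper name shell_exit_status and its return-value scheme are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Hypothetical helper: runs command through the shell, waits for it,
 * and returns its exit status (128+N if killed by signal N,
 * -1 if system() itself failed). */
int shell_exit_status(const char *command)
{
    assert(command != NULL);

    int status = system(command);   /* blocks until the shell has finished */
    if (status == -1)
        return -1;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);
    return -1;
}
```

Called once per iteration of a for loop with a command string such as "./foo 1 2 3" (no trailing &), each call would finish before the next iteration starts.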
After you call fork, the child calls system which starts another child that foo runs in. Once it completes, the child continues the next iteration of the for loop. So after the first loop iteration you have 2 processes, then 4 after the next, and so forth. You're spawning off processes at an exponential rate which causes the system to grind to a halt.
There are a few ways to address this:
After the call to system, you have to call exit so the forked off child quits.
Use exec instead of system. This will start foo in the same process as the child. A successful call to exec does not return, however if it fails you still want to print an error and call exit after exec.
Don't bother with fork or wait at all and just call system in a loop, since system doesn't return until the command is completed.
EDIT:
This loop is exhibiting some strange behavior. Here is the culprit:
if((pid=fork()==-1)){
You've got some misplaced parentheses here. The innermost expression is pid=fork()==-1. Because == has higher precedence than =, it first evaluates fork()==-1. If fork was successful, this evaluates to false, i.e. 0, and then pid=0 is evaluated. So after this conditional, both the parent and the child have pid==0.
After applying one of the above changes, put the parentheses in the right place:
if((pid=fork())==-1){
And everything should work fine.
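Putting the exec-based option and the parenthesis fix together, a sketch of a strictly sequential loop might look as follows. The function name run_sequentially and the choice to count zero exit statuses are assumptions made for illustration; the original program would rebuild the argument vector in each iteration.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs argv `times` times, one run after another.
 * Returns how many runs exited with status 0, or -1 if fork() or
 * waitpid() failed. */
int run_sequentially(char *const argv[], int times)
{
    assert(argv != NULL && argv[0] != NULL);

    int succeeded = 0;
    for (int i = 0; i < times; i++) {
        pid_t pid;
        int status;

        if ((pid = fork()) == -1) {   /* parentheses around the assignment */
            perror("fork");
            return -1;
        }
        if (pid == 0) {
            execvp(argv[0], argv);    /* returns only on failure */
            perror("execvp");
            _exit(127);               /* the child must not re-enter the loop */
        }
        /* Parent: block until this child is done before iterating again. */
        if (waitpid(pid, &status, 0) == -1) {
            perror("waitpid");
            return -1;
        }
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            succeeded++;
    }
    return succeeded;
}
```

At most one child exists at any time, so the process count stays constant instead of doubling with each iteration.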
wait(2)
All of these system calls are used to wait for state changes in a
child of the calling process, and obtain information about the child
whose state has changed. A state change is considered to be: the
child terminated; the child was stopped by a signal; or the child was
resumed by a signal. In the case of a terminated child, performing a
wait allows the system to release the resources associated with the
child; if a wait is not performed, then the terminated child remains
in a "zombie" state (see NOTES below).
If a child has already changed state, then these calls return
immediately. Otherwise, they block until either a child changes
state or a signal handler interrupts the call (assuming that system
calls are not automatically restarted using the SA_RESTART flag of
sigaction(2)). In the remainder of this page, a child whose state
has changed and which has not yet been waited upon by one of these
system calls is termed waitable.
I found out what the problem was.
I saw here: https://askubuntu.com/questions/420981/how-do-i-save-terminal-output-to-a-file
that in order to save the stderr to file I needed to do &>output.txt.
So I was doing "./foo 1 2 3 &>output.txt" but that & causes the system process to go into background.
+1 to @Random832 for guessing it (even though I never said I was using &>; sorry guys, my bad).
Btw, if you want the stderr to be exported to a file you can use 2>output.txt
What's the general meaning of an exit code 11 in C? I've looked around and cannot find a definitive answer, so I thought I would ask here. It comes up when I try to add an element to a vector.
You didn't find a definitive answer because there isn't one. It's up to the author of the program to decide what exit codes they wish to use. Standard C only says that exit(0) or exit(EXIT_SUCCESS) indicates that the program was successful, and that exit(EXIT_FAILURE) indicates an error of some kind. (Returning a value from main is equivalent to calling exit with that value.) Most common operating systems, including Windows, Linux, and OS X, use 0 for success and values from 1 to 255 to indicate errors. Even so, choosing between error codes is up to the application writer; the value 11 isn't anything special.
Under Linux and most other Unix variants, the signal number 11 indicates a segmentation fault, as remarked by Kerrek SB. A segmentation fault happens when a program makes some kind of invalid memory access, so it's a plausible consequence of accessing an array out of bounds, or an error in pointer arithmetic, or trying to access a null pointer, or other pointer-related errors. Signal 11 is not the same thing as exit code 11: when a program dies due to a signal, it's marked as having been killed by a signal, rather than having exited normally. Unix shells report signals by reporting an exit code which is the signal number plus 128, so 139 for a segmentation fault.
The other answers have missed a possible ambiguity in the phrase "exit code". I suspect what you meant by "exit code" is the status code retrieved with the wait family of syscalls, as in:
/* assume a child process has already been created */
int status;
wait(&status);
printf("exit code %d\n", status);
If you do something like that you may very well see "exit code 11" if the child process segfaults. If the child process actually called exit(11) you might see "exit code 2816" instead.
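The 2816 figure can be reproduced with a small experiment. This is a sketch assuming the Linux encoding of the wait status, where the exit value sits in bits 8 to 15 (11 << 8 == 2816); POSIX does not specify the layout, which is one more reason to use the macros. The helper names are hypothetical.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Waits for pid and returns the raw, undecoded wait status (-1 on error). */
static int wait_raw(pid_t pid)
{
    int status;
    if (pid == -1 || waitpid(pid, &status, 0) == -1)
        return -1;
    return status;
}

/* Raw wait status of a child that calls _exit(code). */
int raw_status_after_exit(int code)
{
    assert(code >= 0 && code <= 255);

    pid_t pid = fork();
    if (pid == 0)
        _exit(code);
    return wait_raw(pid);
}

/* Raw wait status of a child that is killed by signal sig. */
int raw_status_after_signal(int sig)
{
    pid_t pid = fork();
    if (pid == 0) {
        signal(sig, SIG_DFL);   /* make sure the default action applies */
        raise(sig);
        _exit(0);               /* not reached for fatal signals */
    }
    return wait_raw(pid);
}
```

For a child killed by SIGSEGV, the low bits of the raw value hold the signal number, possibly with an extra core-dump flag set, so the raw integer is not guaranteed to be exactly 11.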
It would be better to call those things "wait code" or "wait status" instead of "exit code", to avoid confusion with the value passed to exit. A wait code contains several pieces of information packed together into a single integer. Normally, you should not look at the integer directly (like I did above in that printf). You should instead use the W* macros from <sys/wait.h> to analyze it.
Start with the WIF* macros to find out what kind of thing happened, then use that information to decide which other W* macros to use to get the details.
if(WIFEXITED(status)) {
    /* The child process exited normally */
    printf("Exit value %d\n", WEXITSTATUS(status));
} else if(WIFSIGNALED(status)) {
    /* The child process was killed by a signal. Note the use of strsignal
       to make the output human-readable. */
    printf("Killed by %s\n", strsignal(WTERMSIG(status)));
} else {
    /* ... you might want to handle "stopped" or "continued" events here */
}
There is no standard that defines which exit codes an application has to set in certain situations. It is entirely up to the programmer which exit codes represent which error, or even success!
Sometimes programmers decide that any value different from zero signals an error, and sometimes this value equals one of the operating system's error codes.
On Windows exit code 11 might be used because of problems with a file. If you want the description of this error code (which is specific to Windows and not necessarily your application) run net helpmsg 11.
Consider:
int main()
{
    if (fork() == 0){
        printf("a");
    }
    else{
        printf("b");
        waitpid(-1, NULL, 0);
    }
    printf("c");
    exit(0);
}
(from Computer Systems, Bryant - O'Hallaron).
We are asked for all the possible output sequences.
I answered: acbc, abcc, bacc.
However, I am missing one output compared to the solution (bcac). I thought this output was not possible because the parent process waits for its child to return before printing c (waitpid). Is this not true? Why? And, in that case, what is the difference between the code above and the same without the waitpid line?
I don't see any way bcac is possible. At first I expected some trickery based on the stdio buffers being flushed in an unexpected order. But even then:
The child won't output c until after it has output a. Therefore the first c in bcac must have come from the parent.
The parent won't output c until after the waitpid completes. But that can't happen until after the child is finished, including the final stdio flush that happens during exit(). Therefore the first c is always from the child.
Proof by contradiction has been achieved... the output can't be bcac.
Well, there is one thing you could do to mess up the order. You could exec the program inside a process that already has a child which is about to exit. If the pre-existing child exits before the new child prints a, then the main process will detect that exit with waitpid, and go ahead and print its stuff and possibly exit before the child prints anything.
This is something to watch out for in setuid programs: don't assume that because your program only created one child process that it only has one child process. If you're in an advanced defensive-code learning context this answer makes sense. In a unix-newbie context it doesn't seem relevant, and it's probably better to just say bcac is impossible, even though it's technically not true.
It's tricky, but the call to waitpid can be interrupted (returns -1 and errno is EINTR). In this case the parent can output c before the child outputs anything and bcac is possible.
To prevent bcac from occurring, either a signal mask needs to be set or, better, the return value of waitpid needs to be checked and the call repeated if it was interrupted.
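A minimal sketch of the retry approach: the wrapper below repeats waitpid() for as long as it fails with EINTR. The names waitpid_retry and spawn_and_wait are hypothetical. Waiting for the specific pid returned by fork(), rather than -1, also avoids picking up an unrelated pre-existing child.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Like waitpid(), but restarts the call when a signal handler interrupts it. */
pid_t waitpid_retry(pid_t pid, int *status, int options)
{
    pid_t result;
    do {
        result = waitpid(pid, status, options);
    } while (result == -1 && errno == EINTR);
    return result;
}

/* Hypothetical demo: forks a child that exits with code, waits for exactly
 * that child, and returns the exit status seen by the parent (-1 on error). */
int spawn_and_wait(int code)
{
    assert(code >= 0 && code <= 255);

    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(code);

    int status;
    if (waitpid_retry(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```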
if(pid == 0)
{
    execvp(cmd, args);
    // printf("hello"); // apparently, putting this or not does not work.
    _exit(-1);
}
else
{
    // parent process work
}
"execvp()" replaces the current program with the to-be-execed program (of course in the same process context). So, putting, say, any printf() calls after execvp() won't work. That is what the docs say, and I have verified it as well.
But then, why is _exit() needed..? Does it so happen that the control DOES return to statements post execvp() ?
I will be grateful for any pointers.
Thanks
The function will return if it has failed.
If one of the exec functions returns to the calling process image, an error has occurred; the return value shall be -1, and errno shall be set to indicate the error.
The _exit() allows terminating the process properly and returning an exit code, even if exec fails.
The execve() syscall can fail. The classic reason for this would be that the file isn't there or isn't executable. execvp() wraps around execve() to add path searching and default environment handling (virtually always what you want!) and so it adds in another few failure modes, notably trying to run something with a simple name that's not on the user's path. In any case, failure is failure and there's not a lot you can do when it happens except report that it has gone wrong and Get the (now useless) child process Out Of Dodge. (The simplest error reporting method is to print an error message, perhaps with perror(), but there are others.)
The reason why you need _exit() as opposed to the more normal exit() is because you want to quit the child process but you do not want to run any registered cleanup code associated with the parent process. OK, a lot of it might be harmless, but doing things like writing goodbye messages to a socket or something would be bad, and it's often not at all obvious what has been registered with atexit(). Let the parent process worry about its resources; the child basically owns nothing other than its stack frame!
If execvp fails, _exit will be called.
execvp's man page says:
Return Value
If any of the exec() functions returns, an error will have occurred. The return value is -1, and the global variable errno will be set to indicate the error.
One thing to note: you generally don't want a process' exit status to be signed (if portability matters). While exec() is free to return -1 on failure, it's returning that so you can handle that failure within the child code.
The actual _exit() status of the child should be 0 - 255, depending on what errno was raised.
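To illustrate why a signed status is a poor choice, the sketch below shows what the parent observes: only the low 8 bits of the value passed to _exit() survive, so _exit(-1) is reported as 255. The helper name observed_exit_status is hypothetical.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that calls _exit(code) and returns the exit status the
 * parent obtains through WEXITSTATUS(), or -1 on error. */
int observed_exit_status(int code)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(code);

    assert(pid > 0);   /* only the parent gets here */

    int status;
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);   /* always in the range 0..255 */
}
```

Under these assumptions, a child that calls _exit(-1) is seen as 255 and one that calls _exit(256) is seen as 0, which would look like success to the parent.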