Need clarification regarding vfork use - c

I want the child process to run before the parent process. I only want to call execv from the child process, so I am using vfork instead of fork.
But suppose execv fails and returns; in that case I want to return a non-zero value from the function in which I call vfork.
Something like this:
int start_test()
{
    int err = 0;
    pid_t pid;
    pid = vfork();
    if (pid == 0) {
        execv(APP, ARGS);
        err = -1;
        _exit(1);
    } else if (pid > 0) {
        //Do something else
    }
    return err;
}
Is the above code proper, or should I use some other mechanism to run the child earlier than the parent?
I have read in several places that we should not modify any parent resource in a child process created through vfork (I am modifying the err variable).

Is the above code proper?
No. The err variable is already in the child's memory, thus parent will not see its modification.
We should not modify any parent resource in child process created through vfork (I am modifying err variable).
The resource is outdated. Some *nix systems in the past tried to play tricks with the fork()/exec() pair, but those optimizations have largely backfired, resulting in unexpected, hard-to-reproduce problems. That is why vfork() was removed from recent versions of POSIX.
Generally, you can make this assumption about vfork() on modern systems that support it: if the child does something the OS does not expect, the child is upgraded from a vfork()ed process to a normal fork()ed one.
For example, on Linux the only difference between fork() and vfork() is that in the latter case more of the child's data are made into lazy COW data. The effect is that vfork() is slightly faster than fork(), but the child is likely to pay some extra performance penalty if it tries to access the not-yet-copied data, since they still have to be duplicated from the parent. (Notice the subtle problem: the parent can also modify the data. That would also trigger the COW and duplication of the data between the parent and child processes.)
I just want to use the execv call from the child process, so I am using vfork instead of fork.
The handling should be the same regardless of whether you use vfork() or fork(): the child should exit with a special exit code, and the parent should call waitpid() as usual and check the exit status.
Otherwise, if you want to write a portable application, do not use vfork(): it is not part of the POSIX standard.
P.S. This article might be also of interest. If you still want to use the vfork(), then read this scary official manpage.

Yes. You should not modify the variable err. Use waitpid in the parent process to check the exit code of the child process. Also check the return value of execv and the errno variable (see man execv) to determine why your execv fails.
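A minimal sketch of the waitpid() approach suggested in the answers, using plain fork(). The function name start_test_checked and the exit code 127 (borrowed from the shell's convention for a command that could not be run) are assumptions for illustration and are not part of the original code. Note that with this scheme the parent cannot tell an execv failure apart from a program that itself exits with 127.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical replacement for start_test(): returns 0 only if the program
 * was started and exited with status 0, and -1 in every other case. */
int start_test_checked(const char *app, char *const args[])
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;              /* could not create a child at all */

    if (pid == 0) {
        execv(app, args);
        /* Only reached when execv failed. Do not touch parent state;
         * report the failure through the exit status instead. */
        _exit(127);
    }

    /* Parent: collect the child's exit status. */
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
        return 0;
    return -1;
}
```

Unlike the code in the question, this version blocks in waitpid() until the child has finished. If the parent has other work to do first, it can do that work between fork() and waitpid().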

Related

Is it possible to use fork without exec if both processes are executing the same program?

Here is a code sample where the fork library call is used to create a child process which shares the parent's address space. The child process executes its code without using the exec system call. My question is: is the exec system call not required in the case that both the parent and child processes are executing the same program?
#include <stdio.h>
#include <unistd.h>   /* fork() */
int main()
{
    int count;
    count = fork();
    if (count == 0)
        printf("\nHi I'm child process and count =%d\n", count);
    else
        printf("\nHi I'm parent process and count =%d\n", count);
    return 0;
}
The answer to this question may be different depending on the operating system. The man page for fork on OS X contains this ominous warning (bold portion is a paraphrase of the original):
There are limits to what you can do in the child process. To be
totally safe you should restrict yourself to only executing
async-signal safe operations until such time as one of the exec
functions is called. All APIs, including global data symbols, in any
framework or library should be assumed to be unsafe after a fork()
unless explicitly documented to be safe or async-signal safe. If you
need to use these frameworks in the child process, you must exec. In
this situation it's reasonable to exec another copy of the same executable.
The list of async-signal safe functions can be found in the man page for sigaction(2).
Is it possible to use fork without exec
Yes, it is possible.
is the exec system call not required in the case that both the parent and child processes are executing the same program
Yes, it is not required in that case.
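As a rough illustration of fork without exec, here is a sketch in which the child keeps running the same program and hands a small result back through its exit status. The function name add_in_child is made up for this example. The child calls only _exit(), which is async-signal safe, so it stays within the limits described in the man page quote above.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical example: compute a + b in a forked child without exec.
 * Returns the result (0..255) or -1 on error. */
int add_in_child(int a, int b)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: works on its own copy of a and b. Only the low 8 bits
         * of the exit status reach the parent. */
        _exit((a + b) & 0xff);
    }

    /* Parent: wait for the child and read its exit status. */
    int status;
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Anything larger than one byte would need a pipe or some other IPC channel, since parent and child do not share variables after fork().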

Implementing posix_spawn on Linux

I am curious to see if it would be possible to implement posix_spawn in Linux using a combination of vfork+exec. In a very simplified way (leaving out most optional arguments) this could look more or less like this:
int my_posix_spawn(pid_t *ppid, char **argv, char **env)
{
    pid_t pid;
    pid = vfork();
    if (pid == -1)
        return errno;
    if (pid == 0)
    {
        /* Child */
        execve(argv[0], argv, env);
        /* If we got here, execve failed. How to communicate this to
         * the parent? */
        _exit(-1);
    }
    /* Parent */
    if (ppid != NULL)
        *ppid = pid;
    return 0;
}
However, I am wondering how to cope with the case where vfork succeeds (so the child process is created) but the exec call fails. There seems to be no way to communicate this to the parent, which would only see that it could apparently create a child process successfully (as it would get a valid pid back).
Any ideas?
As others have noted in the comments, posix_spawn is permitted to create a child process that immediately dies due to exec failure or other post-fork failures; the calling application needs to be prepared for this. But of course it's preferable not to do so.
The general procedure for communicating exec failure to the parent is described in an answer I wrote on this question: What can cause exec to fail? What happens next?.
Unfortunately, however, some of the operations you need to perform are not legal after vfork due to its nasty returns-twice semantics. I've covered this topic in the past in an article on ewontfix.com. The solution for making a posix_spawn that avoids duplicating the VM seems to be using clone with CLONE_VM (and possibly CLONE_VFORK) to get a new process that shares memory but doesn't run on the same stack. However, this still requires a lot of care to avoid making any calls to libc functions that might modify memory used by the parent. My current implementation is here:
http://git.musl-libc.org/cgit/musl/tree/src/process/posix_spawn.c?id=v1.1.4
and as you can see it's rather complicated. Reading the git history may be informative regarding some of the design decisions that were made.
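One widely used way to report exec failure to the parent is a pipe whose write end is marked close-on-exec. I am assuming this is the kind of procedure meant above, since the linked answer is not reproduced here. The sketch below uses fork() rather than vfork() or clone() to stay clear of the restrictions just discussed, so it does duplicate the address space. The name spawn_report_errors is hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 and stores the child's pid on success,
 * or the errno value of the step that failed (including execve). */
int spawn_report_errors(pid_t *ppid, char *const argv[], char *const envp[])
{
    int fds[2];
    if (pipe(fds) == -1)
        return errno;

    /* Close-on-exec: a successful execve closes the write end for us. */
    if (fcntl(fds[1], F_SETFD, FD_CLOEXEC) == -1) {
        int err = errno;
        close(fds[0]);
        close(fds[1]);
        return err;
    }

    pid_t pid = fork();
    if (pid == -1) {
        int err = errno;
        close(fds[0]);
        close(fds[1]);
        return err;
    }

    if (pid == 0) {
        close(fds[0]);
        execve(argv[0], argv, envp);
        /* execve failed: send errno to the parent, then exit. */
        int err = errno;
        ssize_t ignored = write(fds[1], &err, sizeof err);
        (void)ignored;
        _exit(127);
    }

    /* Parent: read until the child either reports an error or execs. */
    close(fds[1]);
    int child_err = 0;
    ssize_t n;
    do {
        n = read(fds[0], &child_err, sizeof child_err);
    } while (n == -1 && errno == EINTR);
    close(fds[0]);

    if (n == (ssize_t)sizeof child_err) {
        waitpid(pid, NULL, 0);      /* reap the child that failed to exec */
        return child_err;
    }

    /* End of file without data: the write end was closed by execve. */
    if (ppid != NULL)
        *ppid = pid;
    return 0;
}
```

In a multithreaded program another thread could fork between pipe() and fcntl() and leak the write end. On systems that have it, pipe2() with O_CLOEXEC closes that window.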
I don't think there's any good way to do this with the current set of system calls. You've correctly identified the biggest problem -- the absence of any reliable way to report failure after the vfork. Other problems include race conditions in setting child state, and Linux's lack of interest in picking up closefrom.
Several years ago I sketched a new system-level API that would solve this problem: the key addition is a system call, which I called egg(), that creates a process without giving it an address space, and inheriting no state from the parent. Obviously, an egg process can't execute code; but you can (with a whole bunch more new system calls) set all of its kernelside state, and then (with yet another system call, hatch()) load an executable into it and set it going. Crucially, all of the new system calls report failure in the parent. For instance, there's a dup_into(pid, to_fd, from_fd) call that copies parent file descriptor from_fd to egg-state process pid's file descriptor to_fd; if it fails, the parent gets the failure code.
I never had time to flesh all of that out into a coherent API specification and code it up (and I'm not a kernel hacker, anyway) but I still think the concept has legs and I would be happy to work with someone to get it done.

What to do if exec() fails?

Let's suppose we have a code doing something like this:
int pipes[2];
pipe(pipes);
pid_t p = fork();
if(0 == p)
{
    dup2(pipes[1], STDOUT_FILENO);
    execv("/path/to/my/program", NULL);
    ...
}
else
{
    //... parent process stuff
}
As you can see, it's creating a pipe, forking and using the pipe to read the child's output (I can't use popen here, because I also need the PID of the child process for other purposes).
Question is, what should happen if in the above code, execv fails? Should I call exit() or abort()? As far as I know, those functions close the open file descriptors. Since fork-ed process inherits the parent's file descriptors, does it mean that the file descriptors used by the parent process will become unusable?
UPD
I want to emphasize that the question is not about the executable loaded by exec() failing, but about exec itself failing, e.g. when the file referred to by the first argument is not found or is not executable.
You should use exit(int), since the low byte of the argument can be read by the parent process using waitpid(). This lets you handle the error appropriately in the parent process. Depending on what your program does, you may want to use _exit instead of exit. The difference is that _exit will not run functions registered with atexit, nor will it flush stdio streams.
There are about a dozen reasons execv() can fail and you might want to handle each differently.
The child failing is not going to affect the parent's file descriptors. They are, in effect, reference counted.
You should call _exit(). It terminates the process like exit() does, but it skips any registered atexit() functions and does not flush stdio buffers. Calling _exit() means that the parent will be able to get your failed child's exit status and take any necessary steps.
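To make the point about file descriptors concrete, here is a small sketch in which the child exits right away, as it would after a failed execv, and the parent then keeps using its own ends of the pipe. The function name pipe_survives_child_exit and the exit code 127 are assumptions made for this example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns the child's exit status if the parent's pipe
 * descriptors still work after the child has exited, -1 otherwise. */
int pipe_survives_child_exit(void)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Pretend execv just failed. _exit closes only the child's own
         * copies of the descriptors. */
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;

    /* The parent's copies are still open: push a byte through the pipe. */
    char out = 'x', in = 0;
    int ok = write(fds[1], &out, 1) == 1 &&
             read(fds[0], &in, 1) == 1 &&
             in == 'x';
    close(fds[0]);
    close(fds[1]);

    if (!ok || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The parent sees the child's status (127 here) through waitpid(), and its descriptors behave as if the child had never existed.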

question about Fork()

When a parent process creates a child process with fork(), according to me,
the child process is in a Running state whereas the parent process is in a Ready state, i.e. waiting for the child to end.
Am I right?
No, the fork creates a copy of the parent.
Then you generally test the return value of fork: 0 means "I am the child"; any other non-negative value means "I am the parent", and that value is the child's PID.
If the parent has to wait for the child to end, you need to use the wait function.
Edit:
see http://linux.die.net/man/2/fork and http://linux.die.net/man/2/wait for the fork() in C.
Here is something from
After a fork(), it is indeterminate
which process—the parent or the
child—next has access to the CPU.
Applications that implicitly or
explicitly rely on a particular
sequence of execution in order to
achieve correct results are open to
failure due to race conditions.
It goes on to point out different behaviors in different kernels. The bottom line is that it's implementation-defined and not to be relied upon.
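If a particular order is needed, it has to be enforced explicitly instead of left to the scheduler. A rough sketch, assuming a pipe is used as the synchronization channel (the function name child_step_then_parent is hypothetical): the parent blocks in read() until the child has finished its step and written one byte.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: the parent does not proceed until the child has
 * completed its step. Returns 0 on success, -1 on error. */
int child_step_then_parent(void)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        close(fds[0]);
        /* ... the step that must happen first would go here ... */
        char token = 'c';
        ssize_t n = write(fds[1], &token, 1);
        _exit(n == 1 ? 0 : 1);
    }

    /* Parent: read() blocks until the child has written its token. */
    close(fds[1]);
    char token = 0;
    ssize_t n = read(fds[0], &token, 1);
    close(fds[0]);

    /* ... from here on the child's step is known to be done ... */

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return (n == 1 && token == 'c') ? 0 : -1;
}
```

This works the same way regardless of which process the kernel happens to schedule first after fork().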
Also if you do want to rely on it, on Linux since 2.6.32 "there's a sysctl for that"
kernel.sched_child_runs_first
Cheers

How do I spawn a daemon in uClinux using vfork?

This would be easy with fork(), but I've got no MMU. I've heard that vfork() blocks the parent process until the child exits or executes exec(). How would I accomplish something like this?:
pid_t pid = vfork();
if (pid == -1) {
    // fail
    exit(-1);
}
if (pid == 0) {
    // child
    while(1) {
        // Do my daemon stuff
    }
    // Let's pretend it exits sometime
    _exit(0); // a vfork() child must use _exit(), not exit()
}
// Continue execution in parent without blocking.....
It seems there is no way to do this exactly as you have it here. exec or _exit have to get called for the parent to continue execution. Either put the daemon code into another executable and exec it, or use the child to spawn the original task. The second approach is the sneaky way, and is described here.
daemon() function for uClinux systems without MMU and fork(), by Jamie Lokier, in patch format
You can't do daemon() with vfork(). To create something similar to a daemon on !MMU using vfork(), the parent process doesn't die (so there are extra processes), and you should start your daemon in the background (i.e. by appending & to the command line in the shell).
On the other hand, Linux provides clone(). Armed with that, knowledge and care, it's possible to implement daemon() for !MMU. Jamie Lokier has a function to do just that on ARM and i386, get it from here.
Edit: made the link to Jamie Lokier's daemon() for !MMU Linux more prominent.
I would have thought that this would be the type of problem that many others had run into before, but I've had a hard time finding anyone talking about the "kill the parent" problems.
I initially thought that you should be able to do this with a (not quite so, but sort of) simple call to clone, like this:
pid_t new_vfork(void) {
return clone(child_func, /* child function */
child_stack, /* child stack */
SIGCHLD | CLONE_VM, /* flags */
NULL, /* argument to child */
NULL, /* pid of the child */
NULL, /* thread local storage for child */
NULL); /* thread id of child in child's mem */
}
Except that determining the child_stack and the child_func to work the way that it does with vfork is pretty difficult since child_func would need to be the return address from the clone call and the child_stack would need to be the top of the stack at the point that the actual system call (sys_clone) is made.
You could probably try to call sys_clone directly with
pid_t new_vfork(void) {
return sys_clone( SIGCHLD | CLONE_VM, NULL);
}
Which I think might get what you want. Passing NULL as the second argument, which is the child_stack pointer, causes the kernel to do the same thing as it does in vfork and fork, which is to use the same stack as the parent.
I've never used sys_clone directly and haven't tested this, but I think it should work. I believe that:
sys_clone( SIGCHLD | CLONE_VM | CLONE_VFORK, NULL);
is equivalent to vfork.
If this doesn't work (and you can't figure out how to do something similar), then you may be able to use the regular clone call along with setjmp and longjmp calls to emulate it, or you may be able to get around the need for the "returns twice" semantics of fork and vfork.

Resources