Difference between clone and fork+unshare - c

Somehow it's easier to call fork and then unshare, because many things are inherited automatically through fork that would otherwise have to be manually wrapped up and passed to clone. My question is: what is the difference between (1) calling clone, which creates a new process in separate namespaces, and (2) fork+unshare, which creates a new process and then leaves the parent's namespaces? Assume the namespace flags passed to clone and unshare are the same.
auto flag = CLONE_NEWUSER | CLONE_NEWUTS | CLONE_NEWIPC | CLONE_NEWPID | CLONE_NEWNS | CLONE_NEWNET | SIGCHLD;
So, with fork, it's very easy for the child to reuse data inherited from the parent.
int mydata; // this could be even more complicated, class, struct, etc.
auto pid = fork();
if(pid == 0) {
// reuse inherited data in a copy-on-write manner
unshare(flag);
}
For clone, we sometimes have to wrap the data into another struct and pass it as a void * to the clone call.
int mydata;
// stack and stack_size must be allocated and set up by hand for clone
clone(func, stack + stack_size, flag, &wrapper_of_data);
It seems to me that in the above example the only difference is performance, where fork+unshare can be a bit more expensive. Beyond that, fork saves me a lot of effort, and both approaches create a process in new namespaces.

Somehow it's easier to call fork and then unshare, because many things are inherited automatically through fork that would otherwise have to be manually wrapped up and passed to clone.
This is only true for the libc wrapper function.
The underlying system call is more like fork(), and would probably work better in this case:
long flags = CLONE_NEWUSER | ... | SIGCHLD;
long pid = syscall(__NR_clone, flags, 0, 0, 0, 0);
if(pid == 0) {
/* child with unshared namespaces */
}
Since about the time clone syscall was introduced, fork() is just a specific case of clone with flags=SIGCHLD and 0s for all other arguments.
There is probably no reason to prefer fork/unshare pair over just passing the same flags to clone.

Related

Is it possible that one shared memory segment will be attached multiple times to the same parent pid?

I have created 200 child processes of a parent, which communicate through a shared memory IPC mechanism:
Parent <-> SHM <-> child
But the observation is strange.
I found 4 processes attached to the same SHM id, of which 2 are the parent pid and 2 are a child pid (unexpected behaviour),
and elsewhere 2 processes attached to one SHM id (expected behaviour).
Please find the output below:
-bash-4.2# grep 123076652 /proc/*/maps
/proc/27750/maps:7f1323576000-7f1323577000 rw-s 00000000 00:04 123076652 /SYSV2c006eff (deleted)
/proc/27750/maps:7f1323676000-7f1323677000 rw-s 00000000 00:04 123076652 /SYSV2c006eff (deleted)
/proc/27827/maps:7f87ac3c0000-7f87ac3c1000 rw-s 00000000 00:04 123076652 /SYSV2c006eff (deleted)
/proc/28090/maps:7f9d33b8b000-7f9d33b8c000 rw-s 00000000 00:04 123076652 /SYSV2c006eff (deleted)
As we can see, PID 27750 (the parent) is attached twice to one segment.
How is that possible?
Is it a CentOS bug?
To answer your question: Yes of course. You have the evidence right there in your question.
How does it happen? If you call mmap() on the same file multiple times it maps it multiple times.
To avoid that happening the answer is: Don't do that.
I'm purely guessing, but my bet is that one of your fork() calls failed, you never did any error checking, and the code continued on to execute the child code in the parent process. That would explain two mappings in one PID.
I found the problem: I was using the same id to generate the ftok() key for two child processes.
A key_t is 32 bits, and can have any value we want.
ftok is just a "convenience" function to generate a unique key_t value to be used in a call to shmget (See below).
If we use IPC_PRIVATE for this key_t value, we get a private descriptor that any child of a single parent process can use. It shows up in ipcs as a key_t of 0, with a unique shmid [which is like a file descriptor].
So, if we have a single parent and we're just doing fork, we can use this, because the children will inherit it from the parent. This is the preferred method, and in this case ftok isn't needed.
With private keys, when all attached processes terminate, the shared areas are automatically removed by the kernel.
If we use a non-zero key_t value, we are creating a permanent area. It will remain (with the data still there).
To remove this, the final process (i.e. the parent process) must do shmctl(shmid,IPC_RMID,NULL) for all shmid gotten from the shmget calls.
If the parent process dies before doing this, the area remains. Such areas will still show up in ipcs
Here is some sample code that illustrates the use of IPC_PRIVATE. It can be adapted to use a non-zero key_t value, but for your application, that may not be warranted:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>
#define NCHILD 200
#define SIZE 4096
// child process control
typedef struct {
int tsk_cldidx; // child index
int tsk_shmid; // shmid for process
pid_t tsk_pid; // pid of process
void *tsk_shmadr; // address of attached area
} tsk_t;
tsk_t cldlist[NCHILD];
void
dofork(int cldidx)
{
tsk_t *tskself = &cldlist[cldidx];
do {
tskself->tsk_pid = fork();
// parent
if (tskself->tsk_pid != 0)
break;
// detach from all areas that are not ours
for (cldidx = 0; cldidx < NCHILD; ++cldidx) {
tsk_t *tsk = &cldlist[cldidx];
if (tsk->tsk_cldidx != tskself->tsk_cldidx)
shmdt(tsk->tsk_shmadr);
}
// do something useful ...
exit(0);
} while (0);
}
int
main(void)
{
int cldidx;
tsk_t *tsk;
// create private shared memory ids for each child
for (cldidx = 0; cldidx < NCHILD; ++cldidx) {
tsk = &cldlist[cldidx];
tsk->tsk_cldidx = cldidx;
tsk->tsk_shmid = shmget(IPC_PRIVATE,SIZE,IPC_CREAT | 0600);
tsk->tsk_shmadr = shmat(tsk->tsk_shmid,NULL,0);
}
// start up all children
for (cldidx = 0; cldidx < NCHILD; ++cldidx)
dofork(cldidx);
// do something useful with children ...
// wait for all children
while (wait(NULL) >= 0);
// remove all segments
// NOTE: with IPC_PRIVATE, this may not be necessary -- it may happen
// automatically when we exit
for (cldidx = 0; cldidx < NCHILD; ++cldidx) {
tsk = &cldlist[cldidx];
shmctl(tsk->tsk_shmid,IPC_RMID,NULL);
}
return 0;
}
If we have separate programs that have no parent/child relationship, we need a non-zero key_t. It can be hard to generate a unique key_t that does not collide/conflict with another, possibly from a totally unrelated group of programs (e.g. another user)
Can you please explain the maximum number of unique keys that can be generated using ftok()?
AFAIK only the low 8 bits are significant. Can we use a 2-byte integer like "300" to generate the key? What is the chance of duplicate keys here?
For a given [existing] file and eight bit proj_id, there are [as you've noticed] only 256 unique keys that can be generated. We'd need a different file argument to get the next 256.
It might be better to dispense with ftok altogether [I've never used it when using shmget]. I've used 0xAA000000 as the base key_t value and replaced the zeroes with whatever unique sub-key value I need (a 24-bit sub-key gives ~16.7 million possible combinations).
If we control all programs that will access the shared memory areas, it isn't necessary to have multiple areas.
It may be sufficient, and more desirable, to have a single shared memory area. In that case, we only do one shmget and one shmat, and ftok(myfile,0) could produce a suitable key.
If the size needed to communicate with a child is (e.g.) a page (PER_CHILD = 4096), and we have NCHILD children to create, we can just create a single area of TOTAL_SIZE = PER_CHILD * NCHILD. Then, for a given child X, its area pointer is shmaddr + (X * PER_CHILD).
UPDATE:
Can I use the IPC_CREAT flag and do an exec() call in the child?
I think what you mean is using the non-zero key with shmget in conjunction with this.
An exec call will close any mappings: shared memory after exec()
It will also close the file descriptor returned by shmget [or shm_open].
So, using the non-zero key is the only [practical] way to ensure that it works across execvp et al.
AFAIK if we use exec() then a process will have a different address space. Will that cause any problem?
The child program will have to (re)establish its own mapping via shmget and shmat.
But, if we use shm_open [instead of shmget], we can keep the file descriptor open if we use fcntl to clear the FD_CLOEXEC flag on the descriptor before calling execvp.
However, this may be of little use as the child program (target of execvp) will [probably] not know the file descriptor number that the parent opened with shm_open, so it's a bit of a moot point.

Fill in an array with fork()

First of all, I surely know there are faster and less overkill solutions to this, but I absolutely need to fill in an array with child processes only.
Let's say I have 3 children:
int pos = 0;
for (i = 0; i<3 ; i++){
switch (fork()){
case -1: //fork error;
printf("[ERROR] - fork()\n");
exit(EXIT_FAILURE);
case 0: //child
fill(&req, pos);
pos++;
exit(EXIT_SUCCESS);
default:
break;
}
}
where fill basically works like this:
void fill (request *req, int pos){
req->array[pos] = 1;
}
I realized this method of course doesn't work, since every child has its own copy of pos = 0 and just increments that copy, so the array always gets modified at index 0.
The struct request is a simple struct with a pid and an int array to send through a fifo.
typedef struct request {
int cpid; //client pid
int array[SIZE]; //array
} request;
What can I do to fill in this array with the child processes only? I have to repeat: I can't use workarounds, just fork() and children.
Thanks!
If the children are the ones who have to fill the array, then their modifications cannot be seen by the parent or by any other child, unless the parent and the child share some memory (shmget).
Other workarounds include sending all the data to a central process using pipes or any other communication mechanism.
You cannot alter some data (to be shared) after a fork, because each process has -by definition- its own address space, hence any changes to data is private to that process (unless you use shared memory).
You could use shared memory, which you have to set up before the call to fork(2). Then you have synchronization issues. So read shm_overview(7) and sem_overview(7); in your case, I feel it is overkill.
You might also use threads instead of processes. A process can have several threads, all sharing -by definition- the same address space. Again, synchronization is an issue (e.g. with mutexes). Read some pthread tutorial.
You could use some other IPC, e.g. pipe(7)-s. You'll probably want a multiplexing syscall like poll(2).
(I guess, perhaps incorrectly, that the whole point of this homework is to teach you about pipes and event loops; if you use pipes, better adopt some textual protocol)
Read Advanced Linux Programming.
BTW, on fork and other syscall errors, you generally should call perror(3) -not a plain printf like you do- and then exit(EXIT_FAILURE).

Why does GNU script use two forks instead of select and one fork?

I just realised that the "script" binary on GNU/Linux uses two forks instead of one.
It could simply use select instead of doing the first fork(). Why would it use two forks?
Is it simply because select did not exist at the time it was coded and nobody had the motivation to recode it, or is there a valid reason?
man 1 script: http://linux.die.net/man/1/script
script source: http://pastebin.com/raw.php?i=br8QXRUT
The clue is in the code, which I have added some comments to.
child = fork();
sigprocmask(SIG_SETMASK, &unblock_mask, NULL);
if (child < 0) {
warn(_("fork failed"));
fail();
}
if (child == 0) {
/* child of first fork */
sigprocmask(SIG_SETMASK, &block_mask, NULL);
subchild = child = fork();
sigprocmask(SIG_SETMASK, &unblock_mask, NULL);
if (child < 0) {
warn(_("fork failed"));
fail();
}
if (child) {
/* parent of second fork (the first child) runs 'dooutput' */
if (!timingfd)
timingfd = fdopen(STDERR_FILENO, "w");
dooutput(timingfd);
} else
/* child of second fork runs 'doshell' */
doshell();
} else {
sa.sa_handler = resize;
sigaction(SIGWINCH, &sa, NULL);
}
/* parent of first fork runs doinput */
doinput();
There are thus three process running:
dooutput()
doshell()
doinput()
I think you are asking why use three processes, rather than one process and select(). select() has existed since ancient UNIX history, so the answer is unlikely to be that select() did not exist. The answer is more prosaic.

doshell() needs to be in a separate process anyway, as what it does is exec the shell with appropriately piped fds, so you need at least one fork. Writing dooutput() and doinput() within a select() loop looks perfectly possible to me, but it is actually easier to use blocking I/O than to worry about select etc.

As fork() is relatively lightweight (given UNIX's CoW semantics) and there is little need for communication between the two processes, why use select() when fork() is perfectly good and produces smaller code? I.e., the real answer is 'why not?'

At what point does a fork() child process actually begin?

Does the process begin when fork() is declared? Is anything being killed here?
pid_t child;
child = fork();
kill (child, SIGKILL);
Or do you need to declare actions for the fork process in order for it to actually "begin"?
pid_t child;
child = fork();
if (child == 0) {
// do something
}
kill (child, SIGKILL);
I ask because what I am trying to do is create two children, wait for the first to complete, and then kill the second before exiting:
pid_t child1;
pid_t child2;
child1 = fork();
child2 = fork();
int status;
if (child1 == 0) { //is this line necessary?
}
waitpid(child1, &status, 0);
kill(child2, SIGKILL);
The C function fork is defined in the standard C library (glibc on linux). When you call it, it performs an equivalent system call (on linux its name is clone) by the means of a special CPU instruction (on x86 sysenter). This causes the CPU to switch to a privileged mode and start executing instructions of the kernel. The kernel then creates a new process (a record in a list and accompanying structures), which inherits a copy of memory mappings of the original process (text, heap, stack, and others), file descriptors and more.
The memory areas are marked as non-writable, so that when the new or the original process tries to overwrite them, the kernel gets to handle a CPU exception and perform a copy-on-write (therefore delaying the need to copy a memory page until absolutely necessary). That's because the mappings initially point to the same pages (pieces of physical memory) in both processes.
The kernel then gives execution to the scheduler, which decides which process to run next. It could be the original process, the child process, or any other process running in the system.
Note: The Linux kernel actually puts the child process in front of the parent process in the run queue, so it is run earlier than the parent. This is deemed to give better performance when the child calls exec right after forking.
When execution is given to the original process, the CPU is switched back to nonprivileged mode and starts executing the next instruction. In this case it continues with the fork function of the standard library, which returns the PID of the child process (as returned by the clone system call).
Similarly, the child process continues execution in the fork function, but here it returns 0 to the calling function.
After that, the program continues in both cases normally. The child process has the original process as its parent (this is noted in a structure in the kernel). When it exits, the parent process is supposed to do the cleanup (receiving the exit status of the child) by calling wait.
Note: The clone system call is rather complicated, because it unifies fork with the creation of threads, as well as linux namespaces. Other operating systems have different implementation of fork, e.g. FreeBSD has fork system call by itself.
Disclaimer: I am not a kernel developer. If you know better, please correct the answer.
See Also
clone (2)
The Design and Implementation of the FreeBSD Operating System (Google Books)
Understanding the Linux Kernel (Google Books)
Is it true that fork() calls clone() internally?
"Declare" is the wrong word to use in this context; C uses that word to talk about constructs that merely assert the existence of something, e.g.
extern int fork(void);
is a declaration of the function fork. Writing that in your code (or having it written for you as a consequence of #include <unistd.h>) does not cause fork to be called.
Now, the statement in your sample code, child = fork(); when written inside a function body, does (generate code to) make a call to the function fork. That function, assuming it is in fact the system primitive fork(2) on your operating system, and assuming it succeeds, has the special behavior of returning twice, once in the original process and once in a new process, with different return values in each so you can tell which is which.
So the answer to your question is that in both of the code fragments you showed, assuming the things I mentioned in the previous paragraph, all of the code after the child = fork(); line is at least potentially executed twice, once by the child and once by the parent. The if (child == 0) { ... } construct (again, this is not a "declaration") is the standard idiom for making parent and child do different things.
EDIT: In your third code sample, yes, the child1 == 0 block is necessary, but not to ensure that the child is created. Rather, it is there to ensure that whatever you want child1 to do is done only in child1. Moreover, as written (and, again, assuming all calls succeed) you are creating three child processes, because the second fork call will be executed by both parent and child! You probably want something like this instead:
pid_t child1, child2;
int status;
child1 = fork();
if (child1 == -1) {
perror("fork");
exit(1);
}
else if (child1 == 0) {
execlp("program_to_run_in_child_1", "program_to_run_in_child_1", (char *)0);
/* if we get here, exec failed */
_exit(127);
}
child2 = fork();
if (child2 == -1) {
perror("fork");
kill(child1, SIGTERM);
exit(1);
}
else if (child2 == 0) {
execlp("program_to_run_in_child_2", "program_to_run_in_child_2", (char *)0);
/* if we get here, exec failed */
_exit(127);
}
/* control reaches this point only in the parent and only when
both fork calls succeeded */
if (waitpid(child1, &status, 0) != child1) {
perror("waitpid");
kill(child1, SIGTERM);
}
/* only use SIGKILL as a last resort */
kill(child2, SIGTERM);
FYI, this is only a skeleton. If I were writing code to do this for real (which I have: see for instance https://github.com/zackw/tbbscraper/blob/master/scripts/isolate.c ) there would be a whole bunch more code just to comprehensively detect and report errors, plus the additional logic required to deal with file descriptor management in the children and a few other wrinkles.
The fork call spawns a new process identical to the old one and returns in both processes.
This happens automatically so you don't have to take any actions.
But nevertheless, it is cleaner to check if the call indeed succeeded:
A value below 0 indicates failure. In this case, it is not good to call kill().
A value == 0 indicates that we are the child process. In this case, it is not very clean to call kill().
A value > 0 indicates that we are the parent process. In this case, the return value is our child. Here it is safe to call kill().
In your case, you even end up with 4 processes:
Your parent calls fork(), leaving 2 processes.
Both of them call fork() again, resulting in a new child process for each of them.
You should move the 2nd fork() call into the branch where the parent code runs.
The child process begins some time after fork() has been called (there is some setup which happens in the context of the child).
You can be sure that the child is running when fork() returns.
So the code
pid_t child = fork();
kill (child, SIGKILL);
will kill the child (in the parent, where child holds the new pid). In the child, child is 0, and kill(0, SIGKILL) does not do nothing: it sends SIGKILL to every process in the caller's process group, including the parent, so the code is unsafe if the child ever reaches that line.
There is no way to tell whether the child might ever live long enough to execute its kill. Most likely it won't, since the Linux kernel will set up the process structure for the child and let the parent continue; the child will just be waiting in the ready list of processes, and the parent's kill will then remove it again.
EDIT If fork() returns a value <= 0, then you shouldn't wait or kill.

Differences between fork and exec

What are the differences between fork and exec?
The use of fork and exec exemplifies the spirit of UNIX in that it provides a very simple way to start new processes.
The fork call basically makes a duplicate of the current process, identical in almost every way. Not everything is copied over (for example, resource limits in some implementations) but the idea is to create as close a copy as possible.
The new process (child) gets a different process ID (PID) and has the PID of the old process (parent) as its parent PID (PPID). Because the two processes are now running exactly the same code, they can tell which is which by the return code of fork - the child gets 0, the parent gets the PID of the child. This is all, of course, assuming the fork call works - if not, no child is created and the parent gets an error code.
The exec call is a way to basically replace the entire current process with a new program. It loads the program into the current process space and runs it from the entry point.
So, fork and exec are often used in sequence to get a new program running as a child of a current process. Shells typically do this whenever you try to run a program like find - the shell forks, then the child loads the find program into memory, setting up all command line arguments, standard I/O and so forth.
But they're not required to be used together. It's perfectly acceptable for a program to fork itself without execing if, for example, the program contains both parent and child code (you need to be careful what you do, each implementation may have restrictions). This was used quite a lot (and still is) for daemons which simply listen on a TCP port and fork a copy of themselves to process a specific request while the parent goes back to listening.
Similarly, programs that know they're finished and just want to run another program don't need to fork, exec and then wait for the child. They can just load the child directly into their process space.
Some UNIX implementations have an optimized fork which uses what they call copy-on-write. This is a trick to delay the copying of the process space in fork until the program attempts to change something in that space. This is useful for those programs using only fork and not exec in that they don't have to copy an entire process space.
If the exec is called following fork (and this is what happens mostly), that causes a write to the process space and it is then copied for the child process.
Note that there is a whole family of exec calls (execl, execle, execve and so on) but exec in context here means any of them.
The following diagram illustrates the typical fork/exec operation where the bash shell is used to list a directory with the ls command:
+--------+
| pid=7 |
| ppid=4 |
| bash |
+--------+
|
| calls fork
V
+--------+ +--------+
| pid=7 | forks | pid=22 |
| ppid=4 | ----------> | ppid=7 |
| bash | | bash |
+--------+ +--------+
| |
| waits for pid 22 | calls exec to run ls
| V
| +--------+
| | pid=22 |
| | ppid=7 |
| | ls |
V +--------+
+--------+ |
| pid=7 | | exits
| ppid=4 | <---------------+
| bash |
+--------+
|
| continues
V
fork() splits the current process into two processes. Or in other words, your nice linear easy to think of program suddenly becomes two separate programs running one piece of code:
int pid = fork();
if (pid == 0)
{
printf("I'm the child");
}
else
{
printf("I'm the parent, my child is %i", pid);
// here we can kill the child, but that's not very parently of us
}
This can kind of blow your mind. Now you have one piece of code with pretty much identical state being executed by two processes. The child process inherits all the code and memory of the process that just created it, including starting from where the fork() call just left off. The only difference is the fork() return code to tell you if you are the parent or the child. If you are the parent, the return value is the id of the child.
exec is a bit easier to grasp: you just tell exec to execute the target executable, and you don't have two processes running the same code or inheriting the same state. Like #Steve Hawkins says, exec can be used after you fork to execute the target executable in the current process.
I think some concepts from "Advanced Unix Programming" by Marc Rochkind were helpful in understanding the different roles of fork()/exec(), especially for someone used to the Windows CreateProcess() model:
A program is a collection of instructions and data that is kept in a regular file on disk. (from 1.1.2 Programs, Processes, and Threads)
In order to run a program, the kernel is first asked to create a new process, which is an environment in which a program executes. (also from 1.1.2 Programs, Processes, and Threads)
It’s impossible to understand the exec or fork system calls without fully understanding the distinction between a process and a program. If these terms are new to you, you may want to go back and review Section 1.1.2. If you’re ready to proceed now, we’ll summarize the distinction in one sentence: A process is an execution environment that consists of instruction, user-data, and system-data segments, as well as lots of other resources acquired at runtime, whereas a program is a file containing instructions and data that are used to initialize the instruction and user-data segments of a process. (from 5.3 exec System Calls)
Once you understand the distinction between a program and a process, the behavior of fork() and exec() function can be summarized as:
fork() creates a duplicate of the current process
exec() replaces the program in the current process with another program
(this is essentially a simplified 'for dummies' version of paxdiablo's much more detailed answer)
Fork creates a copy of a calling process.
generally follows the structure
int cpid = fork( );
if (cpid == 0)
{
    //child code
    exit(0);
}
//parent code
waitpid(cpid, NULL, 0);
// end
(for the child process, the text (code), data, and stack are the same as the calling process's)
The child process executes the code in the if block.
exec replaces the current process's text, data, and stack with the new program's.
generally follows the structure
int cpid = fork( );
if (cpid == 0)
{
    //child code
    exec(foo);
    exit(0);
}
//parent code
waitpid(cpid, NULL, 0);
// end
(after the exec call, the unix kernel clears the child process's text, data, and stack and fills them with foo's text/data)
Thus the child process runs different code (foo's code, not the same as the parent's).
They are used together to create a new child process. First, calling fork creates a copy of the current process (the child process). Then, exec is called from within the child process to "replace" its copy of the parent's program with the new program.
The process goes something like this:
child = fork(); //Fork returns a PID for the parent process, or 0 for the child, or -1 for Fail
if (child < 0) {
std::cout << "Failed to fork GUI process...Exiting" << std::endl;
exit (-1);
} else if (child == 0) { // This is the Child Process
// Call one of the "exec" functions to replace this child's image with the new program
execvp (argv[0], const_cast<char**>(argv));
} else { // This is the Parent Process
//Continue executing parent process
}
The main difference between fork() and exec() is that,
The fork() system call creates a clone of the currently running program. The original program continues execution with the next line of code after the fork() function call. The clone also starts execution at the next line of code.
Look at the following code that i got from http://timmurphy.org/2014/04/26/using-fork-in-cc-a-minimum-working-example/
#include <stdio.h>
#include <unistd.h>
int main(int argc, char **argv)
{
printf("--beginning of program\n");
int counter = 0;
pid_t pid = fork();
if (pid == 0)
{
// child process
int i = 0;
for (; i < 5; ++i)
{
printf("child process: counter=%d\n", ++counter);
}
}
else if (pid > 0)
{
// parent process
int j = 0;
for (; j < 5; ++j)
{
printf("parent process: counter=%d\n", ++counter);
}
}
else
{
// fork failed
printf("fork() failed!\n");
return 1;
}
printf("--end of program--\n");
return 0;
}
This program declares a counter variable, set to zero, before fork()ing. After the fork call, we have two processes running in parallel, both incrementing their own version of counter. Each process will run to completion and exit. Because the processes run in parallel, we have no way of knowing which will finish first. Running this program will print something similar to what is shown below, though results may vary from one run to the next.
--beginning of program
parent process: counter=1
parent process: counter=2
parent process: counter=3
child process: counter=1
parent process: counter=4
child process: counter=2
parent process: counter=5
child process: counter=3
--end of program--
child process: counter=4
child process: counter=5
--end of program--
The exec() family of system calls replaces the currently executing code of a process with another piece of code. The process retains its PID but it becomes a new program. For example, consider the following code:
#include <stdio.h>
#include <string.h>
#include <unistd.h>
int main(void) {
    char program[80], *args[3];
    int i;
    printf("Ready to exec()...\n");
    strcpy(program, "date");
    args[0] = "date";
    args[1] = "-u";
    args[2] = NULL;
    i = execvp(program, args);
    printf("i=%d ... did it work?\n", i);
    return 0;
}
This program calls the execvp() function to replace its code with the date program. If the code is stored in a file named exec1.c, then executing it produces the following output:
Ready to exec()...
Tue Jul 15 20:17:53 UTC 2008
The program outputs the line "Ready to exec()..." and, after calling the execvp() function, replaces its code with the date program. Note that the line "... did it work?" is not displayed, because at that point the code has been replaced. Instead, we see the output of executing "date -u".
fork() creates a copy of the current process, with execution in the new child starting from just after the fork() call. After the fork(), they're identical, except for the return value of the fork() function. (RTFM for more details.) The two processes can then diverge still further, with one unable to interfere with the other, except possibly through any shared file handles.
exec() replaces the current process with a new one. It has nothing to do with fork(), except that an exec() often follows fork() when what's wanted is to launch a different child process, rather than replace the current one.
fork():
It creates a copy of running process. The running process is called parent process & newly created process is called child process. The way to differentiate the two is by looking at the returned value:
fork() returns the process identifier (pid) of the child process in the parent
fork() returns 0 in the child.
exec():
It initiates a new process within a process. It loads a new program into the current process, replacing the existing one.
fork() + exec():
To launch a new program, we first fork(), creating a new process, and then have the child exec() (i.e. load into memory and execute) the program binary it is supposed to run.
int main( void )
{
    char *args[] = { "find", ".", NULL };
    int pid = fork();
    if ( pid == 0 )
    {
        execvp( "find", args );
        _exit( 127 ); // reached only if exec fails
    }
    // Parent waits until the child has finished executing
    wait( NULL );
    return 0;
}
The prime example for understanding the fork() and exec() concepts is the shell, the command interpreter that users typically execute after logging into the system. The shell interprets the first word of the command line as a command name.
For many commands, the shell forks, and the child process execs the command associated with that name, treating the remaining words on the command line as parameters to the command.
The shell allows three types of commands. First, a command can be an executable file that contains object code produced by compilation of source code (a C program, for example). Second, a command can be an executable file that contains a sequence of shell command lines. Finally, a command can be an internal shell command (instead of an executable file, e.g. cd, ls, etc.).
