Where does the process start to execute after fork()

Where does the process start to execute after fork() - c

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>
int main(void) {
for (int i = 1; i < 4; i++) {
printf("%d", i);
int id = fork();
if (id == 0) {
printf("Hello\n");
exit(0);
} else {
exit(0);
}
}
return 0;
}
For this code, it prints 11Hello on my computer. It seems counter-intuitive to me because "1" is printed twice but it's before the fork() is called.

The fork() system call forks a new process and executes the instruction that follows it in each process parallelly. After your child process prints the value of i to the stdout, it gets buffered which then prints the value of 'i' again because stdout was not flushed.
Use the fflush(stdout); so that 'i' gets printed only once per fork.
Alternately, you could also use printf("%d\n", i); where the new line character at the end does the job.

Where does the process start to execute after fork()
fork() duplicates the image of the process and it's context. It will run the next line of code pointed by the instruction pointer.
It seems counter-intuitive to me because "1" is printed twice but it's before the fork() is called.
Read printf anomaly after "fork()"

To begin with, the for loop is superfluous in your example.
Recall that the child copies the caller's memory(that of its parent) (code, globals, heap and stack), registers, and open files. To be performant or there may be some other reason, the printf call may not flush the buffer and put the things passed to that except for some cases such as appending new-line-terminator.
Before forking, the parent(main process) is on the way.
Let's assume we're on a single core system and the child first preempts the core.
1 is in the buffer because its parent put it into that before forking. Then, the child reaches second print statement, a caveat here is that the child can be orphaned at that time(no matter for this moment), passing "Hello\n" string including new-line character giving rise to dump the buffer/cache(whatever you call.) Since it sees \n character, it flushes the buffer including prior 1 added by its parent, that is 11Hello.
Let's assume the parent preempts the core at first,
It surrenders after calling exit statement, bringing on the child to be orphaned, causing memory leak. After that point, the boss(init possessing process id as 1) whose newly name I forget(it may be sys-something) should handle this case. However, nothing is changed as to the printing-steps. So you run into again 11Hello except if not the buffer is flushed automagically.
I don't have much working experience with them but university class(I failed at the course 4 times). However, I can advise you whenever possible use stderr while coping with these tings since it is not buffered, in lieu of stdout or there is some magical way(I forget it again, you call it at the beginning in main()) you can opt for to disable buffering for stdout as well.
To be more competent over these topics, you should glance at The Linux Programming Interface of Michael Kerrisk and the topics related to William Pursell,
Jonathan Leffler,
WhozCraig,
John Bollinger, and
Nominal Animal. I have learnt a plethora of information from them even if the information almost wholly is useless in Turkey borders.
*Magic means needing a lot of details to explain.

Related

C unnnamed pipes and fork for calculation

So I'm trying to create program that accepts user input (price for example 50) and then first child passes it to second, second one add 10 (price is now 60), third one then 50 (price is now 110) and 4 one just prints/returns final price. I have fork in loop and I'm creating pipes, but price is always the same, only 10 is added in each child. What is wrong or how to fix so that it's going to work as I want to.
My code:
int main(int argc,char *argv[])
{
int anon_pipe[2];
int n,N=4;
char value_price[100];
if(argc>1)
{
int price=atoi(argv[1]);
printf("%d\n",price);
if(pipe(anon_pipe)==-1){
perror("Error opening pipe");
return -1;
}
for(n = 0; n < N; n++){
switch(fork()){
case -1:
perror("Problem calling fork");
return -1;
case 0:
close(anon_pipe[1]);
read(anon_pipe[0],value_price,100);
price+=10;
sprintf(value_price,"%d \n",price);
printf("Price: %d\n",atoi(value_price));
write(anon_pipe[1],value_price,sizeof(value_price));
_exit(0);
}
}
close(anon_pipe[0]);
sleep(1);
close(anon_pipe[1]);
}
return 0;
}

You seem to think that forking makes the child start from the beginning of the program. This is not the case, forking makes the child start at the same line when the fork() was called
For instance look at this code here:
read(anon_pipe[0],value_price,100);
price+=10;
sprintf(value_price,"%d \n",price);
printf("Price: %d\n",atoi(value_price));
See you increase the value of price but you never read that value form the pipe. So all children will always output +10 to their respective pipe.

You should check the return values of your function calls for error codes. If you had done, you would have detected the error arising from this combination of calls:
close(anon_pipe[1]);
// ...
write(anon_pipe[1],value_price,sizeof(value_price));
Very likely, you would also have detected that many of these calls ...
read(anon_pipe[0],value_price,100);
... signal end-of-file without reading anything. At the very least, you need read()'s return value to determine where to place the needed string terminator (which you fail to place before using the buffer as a string).
As a general rule, it is mandatory to handle the return values of read() and write(), for in addition to the possibility of errors / EOF, these functions may perform short data transfers instead of full ones. The return value tells you how many bytes were transferred, which you need to know to determine whether to loop to attempt to transfer more bytes.
Moreover, you have all of your processes using the same pipe to communicate with each other. You might luck into that working, but it is probable that at least sometimes you'll end up with garbled communication. You really ought to create a separate pipe for each pair of communicating processes (including the parent process).
Furthermore, do not use sleep() to synchronize processes. It doesn't work reliably. Instead, the parent should wait() or waitpid() for each of its child processes, but only after starting them all and performing all needed pipe-end handling. Waiting for the child processes also prevents them from remaining zombies for any significant time after they exit. That doesn't much matter when the main process exits instead of proceeding to any other work, as in this case, but otherwise it constitutes a resource leak (file descriptors). You should form the good habit of waiting for your child processes.
Of course, all of that is moot if you don't actually write the data you mean to write; #SanchkeDellowar explains in his answer how you fail to do that.

Fork() call process?

Suppose I have this code:
int main () {
int i, r;
i = 5;
printf("%d\n", i);
r = fork();
if (r > 0) {
i = 6;
}
else if (r == 0) {
i = 4;
}
printf("%d\n", i);
}
I was wondering does the forked child process start executing either from the beginning or from where it was called. The reason I ask this is because on my own system I get the output 5,6,4 which means that is starts from where it is called but typing it in http://ideone.com/rHppMp I get 5,6,5,4?

One process calls fork, two processes return (errors notwithstanding). That's how it works. So the child starts from the next "line" (technically it starts with the assignment to r).
What you're seeing here has to do with buffering. In the online case, you'll find that it's using full buffering for standard output, meaning the initial 5 hasn't yet been flushed to the output device.
Hence, at the fork, both parent and child have it and both will flush at some point.
For the line buffered case, the parent flushes on the newline so the line is no longer in the buffer at the fork.
The rules are explicit. Standard output is set to line buffered only if the output device is known to be a terminal. In cases where you redirect to a file, or catch the output in an online environment so you can sanitise it for browser output, it'll be fully buffered.
Hence why you're seeing a difference.
It's generally a good idea to flush all output handles (with fflush) before forking.

You can't judge either parent will execute first or child, in most of the cases child executes. The output should not be "5 6 5 4" might be some garbage value from buffer. You can use fflush(NULL) to flush of the buffer before fork() and try again.

Generally, your application (with its two process threads) has little to no control over how the process threads are organized on the run-queue of your OS. After fork(), there is no specific order you should expect. In fact, if your OS supports multiple CPUs, the may actually both run at the same time; which can result in some unexpected output as both processes compete for stdout.

C close STDOUT running forever

I am writing some C code that involves the use of pipes. To make a child process use my pipe instead of STDOUT for output, I used the following lines:
close(STDOUT);
dup2(leftup[1], STDOUT);
However, it seems to go into some sort of infinite loop or hang on those lines. When I get rid of close, it hangs on dup2.
Curiously, the same idea works in the immediately preceding line for STDIN:
close(STDIN);
dup2(leftdown[0], STDIN);
What could be causing this behavior?
Edit: Just to be clear...
#define STDIN 0
#define STDOUT 1
Edit 2: Here is a stripped-down example:
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#define STDIN 0
#define STDOUT 1
main(){
pid_t child1 = 0;
int leftdown[2];
if (pipe(leftdown) != 0)
printf("ERROR");
int leftup[2];
if (pipe(leftup) != 0)
printf("ERROR");
printf("MADE PIPES");
child1 = fork();
if (child1 == 0){
close(STDOUT);
printf("TEST 1");
dup2(leftup[1], STDOUT);
printf("TEST 2");
exit(0);
}
return(0);
}
The "TEST 1" line is never reached. The only output is "MADE PIPES".

At a minimum, you should ensure that the dup2 function returns the new file descriptor rather than -1.
There's always a possibility that it will give you an error (for example, if the pipe() call failed previously). In addition, be absolutely certain that you're using the right indexes (0 and 1) - I've been bitten by that before and it depends on whether you're in the parent or child process.
Based on your edit, I'm not the least bit surprised that MADE PIPES is the last thing printed.
When you try to print TEST 1, you have already closed the STDOUT descriptor so that will go nowhere.
When you try to print TEST 2, you have duped the STDOUT descriptor so that will go to the parent but your parent doesn't read it.
If you change your forking code to:
child1 = fork();
if (child1 == 0){
int count;
close(STDOUT);
count = printf("TEST 1\n");
dup2(leftup[1], STDOUT);
printf("TEST 2 (%d)\n", count);
exit(0);
} else {
char buff[80];
read (leftup[0], buff, 80);
printf ("%s\n", buff);
sleep (2);
}
you'll see that the TEST 2 (-1) line is output by the parent because it read it via the pipe. The -1 in there is the return code from the printf you attempted in the child after you closed the STDOUT descriptor (but before you duped it), meaning that it failed.
From ISO C11 7.20.6.3 The printf function:
The printf function returns the number of characters transmitted, or a negative value if an output or encoding error occurred.

Multiple thing to mention,
When you use fork, it causes almost a complete copy of parent process. That also includes the buffer that is set up for stdout standard output stream as well. The stdout stream will hold the data till buffer is full or explicitly requested to flush the data from buffer/stream. Now because of this , now you have "MADE PIPES" sitting in buffer. When you close the STDOUT fd and use printf for writing data out to terminal, it does nothing but transfers your "TEST 1" and "TEST 2" into the stdout buffer and doesn't cause any error or crash (due to enough buffer). Thus even after duplicating pipe fd on STDOUT, due to buffered output printf hasn't even touched pipe write end. Most important, please use only one set of APIs i.e. either *NIX or standard C lib functions. Make sure you understand the libraries well, as they often play tricks for some sort of optimization.
Now, another thing to mention, make sure that you close the appropriate ends of pipe in appropriate process. Meaning that if say, pipe-1 is used to communicate from parent to child then make sure that you close the read end in parent and write end in child. Otherwise, your program may hung, due to reference counts associated with file descriptors you may think that closing read end in child means pipe-read end is closed. But as when you don't close the read end in parent, then you have extra reference count for read end of pipe and ultimately the pipe will never close.
There are many other things about your coding style, better you should get hold on it :)
Sooner you learn it better it will save your time. :)
Error checking is absolutely important, use at least assert to ensure that your assumptions are correct.
While using printf statements to log the error or as method of debugging and you are changing terminal FD's (STDOUT / STDIN / STDERR) its better you open a log file with *NIX open and write errors/ log entries to it.
At last, using strace utility will be a great help for you. This utility will allow you to track the system calls executed while executing your code. It is very straight forward and simple. You can even attach this to executing process, provided you have right permissions.

Why do I have different output between a terminal and a file when forking?

I'm learning to work with fork(), and I have some questions.
Consider the following code:
#include <stdio.h>
#include <unistd.h>
int main()
{
int i;
for(i = 0; i < 5; i++)
{
printf("%d", i);
if((i%2)==0)
if(fork())
fork();
}
}
When I output to a terminal, I get the result I expect (i.e.: 0,1,1,1,2,2,2,...). But when I output to a file, the result is completely different:
Case 1: (output to terminal, e.g.: ./a.out):
Result is: 0,1,1,1,2,2,2,...
Case 2: (output to file, e.g.: ./a.out > output_file)
Result is: 0,1,2,3,4,0,1,2,3,4,0,1,2,3,4,...
Why it is like this?

When you output to a file, the stdio library automatically block-buffers the outbound bits.
When a program calls exit(2) or returns from main(), any remaining buffered bits are flushed.
In a program like this that doesn't generate much output, all of the I/O will occur after the return from main(), when the destination is not a tty. This will often change the pattern and order of I/O operations all by itself.
In this case, the result is further complicated by the series of fork() calls. This will duplicate the partially filled and as-yet-unflushed I/O buffers in each child image.
Before a program calls fork(), one might first flush I/O using fflush(3). If this flush is not done, then you may want all processes except one (typically: the children) to _exit(2) instead of exit(3) or return from main(), to prevent the same bits from being output more than once. (_exit(2) just does the exit system call.)

The fork() inside if block in your program is executed twice, because once fork is successful, the program is controlled by two processes(child and parent processes).So fork() inside if block, is executed by both child and parent processes. So it will have different output than expected since it is controlled by two different process and their order of execution is not known. ie. either child or parent may execute first after each fork()
For the difference in behaviour between the output and the file. this is the reason.
The contents you write to the buffer(to be written to file(disk) eventually) is not guaranteed to be written to the file (disk) immediatley. It is mostly flushed to the disk only after the execution of main() is complete. Whereas, it is output to terminal, during the execution of main().
Before writing to file in disk, the kernel actually copies the data into a buffer and later in the background, the kernel gathers up all of the dirty buffers, sorts them optimally and writes them out to file(disk).This is called writeback. It also allows the kernel to defer writes to more idle periods and batch many writes together.
To avoid such behaviour, it is always good to have three different condition checks in program using fork()
int pid;
if((pid = fork()) == -1 )
{ //fork unsuccessful
}
else if ( pid > 0)
{ //This is parent
}
else
{//This is child
}

buffered streams can produce some strange results sometimes... especially when you have multiple processes using the same buffered stream. Force the buffer to be flushed and you'll see different results:
int main()
{
int i;
FILE * fd = fopen(yourfile, "w");
for(i = 0; i < 5; i++)
{
fprintf(fd, "%d", i);
fflush(fd);
if((i%2)==0)
if(fork())
fork();
}
}
Also, for your debugging purposes, it might be nice to dump the process' IDs so you can see which process spawns which, and have a better idea of what's going on. getpid() can help you with that.

Why do I have different output between a terminal and a file when
forking?
C standard library functions use internal buffering for speed up. Most implementations use fully buffered IO for file streams, line buffered for stdin/stdout and unbuffered for stderr.
So your problem can be solved in number of ways:
Use explicit buffer flush before fork via fflush(3)
Set buffer type manually via setvbuf(3)
Use write(2) instead of stdlib's printf(3)
Output to stderr by default via fprintf(3) *****
Exit with _exit(2) in forked processes instead of exit(3) ******
Last two may not work as expected if:
* your implementation does not use unbuffered writes to stderr by default (Which is required by ISO C)
** you have written more than default buffer size in child and if was automatically flushed.
PS. Yet again, if you need deeper knowledge of standard library functions and buffering I recommend reading Advanced Programming in the UNIX Environment (2nd Edition) by W. Richard Stevens and Stephen A. Rago.
PPS. btw, your question is a very popular interview question for C/C++ programmer position.

vfork never ends

The following code never ends. Why is that?
#include <sys/types.h>
#include <stdio.h>
#include <unistd.h>
#define SIZE 5
int nums[SIZE] = {0, 1, 2, 3, 4};
int main()
{
int i;
pid_t pid;
pid = vfork();
if(pid == 0){ /* Child process */
for(i = 0; i < SIZE; i++){
nums[i] *= -i;
printf(”CHILD: %d “, nums[i]); /* LINE X */
}
}
else if (pid > 0){ /* Parent process */
wait(NULL);
for(i = 0; i < SIZE; i++)
printf(”PARENT: %d “, nums[i]); /* LINE Y */
}
return 0;
}
Update:
This code is just to illustrate some of the confusions I have regarding to vfork(). It seems like when I use vfork(), the child process doesn't copy the address space of the parent. Instead, it shares the address space. In that case, I would expect the nums array get updated by both of the processes, my question is in what order? How the OS synchronizes between the two?
As for why the code never ends, it is probably because I don't have any _exit() or exec() statement explicitly for exit. Am I right?
UPDATE2:
I just read: 56. Difference between the fork() and vfork() system call?
and I think this article helps me with my first confusion.
The child process from vfork() system call executes in the parent’s
address space (this can overwrite the parent’s data and stack ) which
suspends the parent process until the child process exits.

To quote from the vfork(2) man page:
The vfork() function has the same effect as fork(), except that the behaviour is undefined if the process created by vfork() either modifies any data other than a variable of type pid_t used to store the return value from vfork(), or returns from the function in which vfork() was called, or calls any other function before successfully calling _exit() or one of the exec family of functions.
You're doing a whole bunch of those things, so you shouldn't expect it to work. I think the real question here is: why you're using vfork() rather than fork()?

Don't use vfork. That's the simplest advice you can get. The only thing that vfork gives you is suspending the parent until the child either calls exec* or _exit. The part about sharing the address space is incorrect, some operating systems do it, other choose not to because it's very unsafe and has caused serious bugs.
Last time I looked at how applications use vfork in reality the absolute majority did it wrong. It was so bad that I threw away the 6 character change that enabled address space sharing on the operating system I was working on at that time. Almost everyone who uses vfork at least leaks memory if not worse.
If you really want to use vfork, don't do anything other than immediately call _exit or execve after it returns in the child process. Anything else and you're entering undefined territory. And I really mean "anything". You start parsing your strings to make arguments for your exec call and you're pretty much guaranteed that something will touch something it's not supposed to touch. And I also mean execve, not some other function from the exec family. Many libc out there do things in execvp, execl, execle, etc. that are unsafe in a vfork context.
What is specifically happening in your example:
If your operating system shares address space the child returning from main means that your environment cleans things up (flush stdout since you called printf, free memory that was allocated by printf and such things). This means that there are other functions called that will overwrite the stack frame the parent was stuck in. vfork returning in the parent returns to a stack frame that has been overwritten and anything can happen, it might not even have a return address on the stack to return to anymore. You first entered undefined behavior country by calling printf, then the return from main brought you into undefined behavior continent and the cleanup run after the return from main made you travel to undefined behavior planet.

From the official specification:
the behavior is undefined if the process created by vfork() either modifies any data other than a variable of type pid_t used to store the return value from vfork(),
In your program you modify data other than the pid variable, meaning the behavior is undefined.
You also have to call _exit to end the process, or call one of the exec family of functions.

The child must _exit rather than returning from main. If the child returns from main, then the stack frame does not exist for the parent when it returns from vfork.

just call the _exit instead of calling return or insert _exit(0) to the last line in "child process". return 0 calls exit(0) while close the stdout, so when another printf follows, the program crashes.

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight