C unnamed pipes and fork for calculation

I'm trying to write a program that accepts user input (a price, for example 50) and passes it through a chain of child processes: the first child passes it to the second, the second adds 10 (price is now 60), the third adds 50 (price is now 110), and the fourth just prints/returns the final price. I call fork() in a loop and create a pipe, but the price is always the same: each child only adds 10 to the original value. What is wrong, and how can I fix it so that it works as I want?
My code:
int main(int argc,char *argv[])
{
    int anon_pipe[2];
    int n,N=4;
    char value_price[100];
    if(argc>1)
    {
        int price=atoi(argv[1]);
        printf("%d\n",price);
        if(pipe(anon_pipe)==-1){
            perror("Error opening pipe");
            return -1;
        }
        for(n = 0; n < N; n++){
            switch(fork()){
                case -1:
                    perror("Problem calling fork");
                    return -1;
                case 0:
                    close(anon_pipe[1]);
                    read(anon_pipe[0],value_price,100);
                    price+=10;
                    sprintf(value_price,"%d \n",price);
                    printf("Price: %d\n",atoi(value_price));
                    write(anon_pipe[1],value_price,sizeof(value_price));
                    _exit(0);
            }
        }
        close(anon_pipe[0]);
        sleep(1);
        close(anon_pipe[1]);
    }
    return 0;
}

You seem to think that forking makes the child start from the beginning of the program. This is not the case: the child continues from the point where fork() returned, with a copy of the parent's variables as they were at that moment.
For instance look at this code here:
read(anon_pipe[0],value_price,100);
price+=10;
sprintf(value_price,"%d \n",price);
printf("Price: %d\n",atoi(value_price));
You increase the value of price, but you never parse the value you read from the pipe into it. Every child therefore adds 10 to its own copy of the original price and writes that same result.

You should check the return values of your function calls for error codes. If you had done, you would have detected the error arising from this combination of calls:
close(anon_pipe[1]);
// ...
write(anon_pipe[1],value_price,sizeof(value_price));
Very likely, you would also have detected that many of these calls ...
read(anon_pipe[0],value_price,100);
... signal end-of-file without reading anything. At the very least, you need read()'s return value to determine where to place the needed string terminator (which you fail to place before using the buffer as a string).
As a general rule, it is mandatory to handle the return values of read() and write(), for in addition to the possibility of errors / EOF, these functions may perform short data transfers instead of full ones. The return value tells you how many bytes were transferred, which you need to know to determine whether to loop to attempt to transfer more bytes.
Moreover, you have all of your processes using the same pipe to communicate with each other. You might luck into that working, but it is probable that at least sometimes you'll end up with garbled communication. You really ought to create a separate pipe for each pair of communicating processes (including the parent process).
Furthermore, do not use sleep() to synchronize processes. It doesn't work reliably. Instead, the parent should wait() or waitpid() for each of its child processes, but only after starting them all and performing all needed pipe-end handling. Waiting for the child processes also prevents them from remaining zombies for any significant time after they exit. That doesn't much matter when the main process exits instead of proceeding to any other work, as in this case, but otherwise it constitutes a resource leak (process-table entries). You should form the good habit of waiting for your child processes.
Of course, all of that is moot if you don't actually write the data you mean to write; @SanchkeDellowar explains in his answer how you fail to do that.
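Putting these points together (a separate pipe for each pair of processes, checked read()/write() calls, and wait() instead of sleep()), a chain of adding stages could be sketched as follows. This is only an illustration: run_price_chain, read_full and write_full are made-up names, the value travels as a raw int rather than as text, and -1 doubles as the error return.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Read exactly len bytes, looping over short reads; -1 on error or early EOF. */
static int read_full(int fd, void *buf, size_t len) {
    char *p = buf;
    while (len > 0) {
        ssize_t r = read(fd, p, len);
        if (r <= 0) return -1;
        p += r;
        len -= (size_t)r;
    }
    return 0;
}

/* Write exactly len bytes, looping over short writes; -1 on error. */
static int write_full(int fd, const void *buf, size_t len) {
    const char *p = buf;
    while (len > 0) {
        ssize_t w = write(fd, p, len);
        if (w < 0) return -1;
        p += w;
        len -= (size_t)w;
    }
    return 0;
}

/* Hypothetical helper: sends price through n child stages; stage i adds add[i].
 * Returns the final price, or -1 on error. */
int run_price_chain(int price, const int *add, int n) {
    int in[2], out[2];
    if (pipe(in) == -1) return -1;
    int first_write = in[1];  /* parent feeds the first stage through this end */
    int prev_read = in[0];    /* read end that the next child will consume */

    for (int i = 0; i < n; i++) {
        if (pipe(out) == -1) return -1;
        pid_t pid = fork();
        if (pid == -1) return -1;
        if (pid == 0) {                   /* child: one stage of the chain */
            int v;
            close(first_write);
            close(out[0]);
            if (read_full(prev_read, &v, sizeof v) == -1) _exit(1);
            v += add[i];
            if (write_full(out[1], &v, sizeof v) == -1) _exit(1);
            _exit(0);
        }
        close(prev_read);                 /* parent no longer needs these ends */
        close(out[1]);
        prev_read = out[0];
    }

    int result = -1;
    if (write_full(first_write, &price, sizeof price) == 0 &&
        read_full(prev_read, &result, sizeof result) == -1)
        result = -1;
    close(first_write);
    close(prev_read);
    while (wait(NULL) > 0) {}             /* reap every child; no sleep() needed */
    return result;
}
```

With the increments from the question, a call such as run_price_chain(50, (int[]){0, 10, 50, 0}, 4) passes 50 through four children, and the parent prints or returns whatever comes out of the last pipe.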

Related

Where does the process start to execute after fork()

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>
int main(void) {
    for (int i = 1; i < 4; i++) {
        printf("%d", i);
        int id = fork();
        if (id == 0) {
            printf("Hello\n");
            exit(0);
        } else {
            exit(0);
        }
    }
    return 0;
}
For this code, it prints 11Hello on my computer. It seems counter-intuitive to me because "1" is printed twice but it's before the fork() is called.

fork() creates a new process, and both processes continue with the instruction that follows the call. The duplicate output comes from stdio buffering: printf("%d", i) leaves "1" in the stdout buffer, fork() copies that unflushed buffer into the child, and each process then flushes its own copy when it exits, so "1" appears twice.
Call fflush(stdout); before fork() so that 'i' is printed only once.
Alternatively, you could use printf("%d\n", i);. When stdout is line-buffered, as it normally is on a terminal, the newline at the end flushes the buffer.
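To see the effect without depending on how stdout happens to be buffered, here is a small experiment. It is a sketch under assumptions: count_ones is a made-up name, and a tmpfile() stream (which is fully buffered) stands in for stdout.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical demo: returns how many times '1' ends up in a temporary file. */
int count_ones(int flush_before_fork) {
    FILE *fp = tmpfile();            /* fully buffered stream */
    if (fp == NULL) return -1;
    fprintf(fp, "1");                /* stays in the stdio buffer for now */
    if (flush_before_fork) fflush(fp);

    pid_t pid = fork();              /* the buffer is copied along with the process */
    if (pid == -1) { fclose(fp); return -1; }
    if (pid == 0) {
        fprintf(fp, "Hello\n");
        fflush(fp);                  /* child writes out its copy of the buffer */
        _exit(0);
    }
    waitpid(pid, NULL, 0);
    fflush(fp);                      /* parent writes whatever it still holds */

    int ones = 0, c;
    rewind(fp);
    while ((c = fgetc(fp)) != EOF)
        if (c == '1') ones++;
    fclose(fp);
    return ones;
}
```

On a typical Linux system I would expect count_ones(0) to report two copies of '1' and count_ones(1) to report one, which matches the 11Hello output in the question.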

Where does the process start to execute after fork()
fork() duplicates the process image and its context. Both processes then continue from the instruction that follows the fork() call.
It seems counter-intuitive to me because "1" is printed twice but it's before the fork() is called.
Read printf anomaly after "fork()"

To begin with, the for loop is superfluous in your example.
Recall that the child gets a copy of the caller's memory (code, globals, heap, and stack), registers, and open files. For performance, printf does not necessarily write its output immediately; the text stays in a buffer until something flushes it, for example a newline when the stream is line-buffered.
Before the fork, only the parent (the main process) is running.
Suppose we are on a single-core system and the child gets the CPU first.
"1" is already in the buffer because the parent put it there before forking. The child then reaches the second printf (it may already be orphaned by then, but that does not matter here) and passes "Hello\n". The newline triggers a flush of the buffer, including the "1" inherited from the parent, so the child writes 1Hello; together with the "1" that the parent flushes when it exits, the output is 11Hello.
Now suppose the parent gets the CPU first.
It finishes by calling exit, which leaves the child orphaned. From that point init (process ID 1; systemd on many Linux systems) adopts the child. The printing steps do not change, however, so you again get 11Hello unless the buffer is flushed automagically.
My experience with these topics comes mostly from a university class (I failed the course 4 times). Still, my advice is to use stderr whenever possible while experimenting with these things, since it is not buffered, instead of stdout. Alternatively, there is a magical way to disable buffering for stdout as well: call setvbuf at the beginning of main().
To become more competent in these topics, look at The Linux Programming Interface by Michael Kerrisk and at posts by William Pursell, Jonathan Leffler, WhozCraig, John Bollinger, and Nominal Animal. I have learnt a plethora of information from them, even if that information is almost wholly useless within Turkey's borders.
*Magic means needing a lot of details to explain.

Averaging the output of multiple processes in C

I've been searching around about the topic, but it's kinda confusing for me with how many different techniques there are and I'm not sure how to approach my problem.
I have a function that computes some value, but it's based on random numbers and I want to compute that value multiple times, let's say few dozen or hundred times and take the average of it, but since it takes quite a while I've wanted to use multiprocessing, with each process executing that function, saving the result and then I'd simply sum the results and divide by the amount of worker processes in the main process.
Quite simple in theory, but I have no idea how to do it - it seems that a simple way would be to just do something like
loop that creates pipes
if (fork())
loop that reads the outputs of pipes
else
code of function that computes the desired value
but that somehow seems wrong? I'm really not sure how to do it
EDIT:
To address the comments, I've been thinking about something like this:
for (int i = 0; i < n_children; ++i) {
    if (fork() == 0) { //child process
        x += estimation();
    }
}
for (int i = 0; i < n_children; ++i) //waiting for each process to end
    wait(NULL);
x /= n_children;
but I know that it won't work properly, I don't know how to store/synchronize the results

As William Pursell mentioned in the comments, a single pipe is what you want. The parent will close the write end, and each forked child will close the read end. Each child writes its result to the pipe. The parent calls wait(2) on each child and, if the status indicates data was written to the pipe, reads the pipe and updates the average.
It could also be done with POSIX anonymous shared memory. Allocate an array of results in shared memory. Each child will have a unique value of the loop variable i when its process is created. The child writes to array[i]. The parent waits for each child. When they have all completed, iterate over the array and compute the average.
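The single-pipe approach might look like the sketch below. parallel_average and sample_estimation are made-up names (sample_estimation stands in for the question's estimation()), and results travel as raw double values, which is fine only because reader and writers are the same program on the same machine. Unlike the description above, the parent drains the pipe until end-of-file before it reaps the children, so a child can never block on a full pipe.

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical signature: the estimator receives the worker index. */
typedef double (*estimator_fn)(int worker);

/* Hypothetical stand-in for the asker's estimation(). */
double sample_estimation(int worker) {
    return worker + 1.0;
}

/* Runs the estimator in n_children processes and averages what they report. */
double parallel_average(int n_children, estimator_fn estimation) {
    int fd[2];
    if (n_children <= 0 || pipe(fd) == -1) return 0.0;

    int started = 0;
    for (int i = 0; i < n_children; i++) {
        pid_t pid = fork();
        if (pid == -1) break;
        if (pid == 0) {                       /* child: compute, report, leave */
            close(fd[0]);
            double x = estimation(i);
            /* writes of at most PIPE_BUF bytes are atomic, so results do not interleave */
            ssize_t w = write(fd[1], &x, sizeof x);
            _exit(w == (ssize_t)sizeof x ? 0 : 1);
        }
        started++;
    }
    close(fd[1]);                             /* parent must close its write end to see EOF */

    double sum = 0.0, x;
    int got = 0;
    while (read(fd[0], &x, sizeof x) == (ssize_t)sizeof x) {
        sum += x;
        got++;
    }
    close(fd[0]);
    for (int i = 0; i < started; i++)
        wait(NULL);                           /* reap the children */
    return got > 0 ? sum / got : 0.0;
}
```

One caveat for the random-number case: children forked from the same parent inherit the same rand() state, so each child should seed its generator differently (for example from its index or PID), otherwise every worker computes the same value.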

How to wait for 2 types of events in a loop (C)?

I am trying to wait on waitpid() and read() in a while-true loop. Specifically, I want to wait for either one of these two events and then process it in each iteration of the loop. Currently, I have the following implementation (which is not what I want).
while (true) {
    pid_t pid = waitpid(...);
    process_waitpid_event(...);
    ssize_t sz = read(socket, ....);
    process_read_event(...);
}
The problem with this implementation is that the processing of the second event depends on the completion of the first event. Instead of processing these two events sequentially, I wish to process whichever event comes first in each iteration of the loop. How should I do this?

If you don't want to touch threading, you can pass WNOHANG in the options of the call to waitpid (here pid is the child's process ID):
pid_t result = waitpid(pid, &status, WNOHANG);
As from the manpage for waitpid:
WNOHANG - return immediately if no child has exited.
As such, if waitpid isn't ready, it won't block and the program will just keep going to the next line.
As for the read, if it is blocking you might want to have a look at poll(2). You can essentially check to see if your socket is ready every set interval, e.g. 250ms, and then call read when it is. This will allow it to not block.
Your code might look a bit like this:
// Creating the struct for file descriptors to be polled.
struct pollfd poll_list[1];
poll_list[0].fd = socket_fd;
poll_list[0].events = POLLIN|POLLPRI;
// POLLIN  There is data to be read
// POLLPRI There is urgent data to be read
/* poll_res > 0:  Something ready to be read on the target fd/socket.
** poll_res == 0: Nothing ready to be read on the target fd/socket.
** poll_res < 0:  An error occurred. */
int poll_res = poll(poll_list, 1, POLL_INTERVAL); // POLL_INTERVAL: timeout in ms, e.g. 250
This is just assuming that you're reading from a socket, judging from the variable names in your code. As others have said, your problem might require something a bit more heavy duty like threading.
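Combining the two pieces, a loop that handles whichever event is ready could be sketched like this. The names poll_and_reap and demo_poll_and_reap are made up for the example, a pipe stands in for the socket, and the loop simply counts bytes instead of calling the question's process_..._event functions.

```c
#include <errno.h>
#include <poll.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

#define POLL_INTERVAL 250 /* ms; assumed value, as in the text above */

/* Hypothetical loop: returns 0 once the child is reaped and fd reached EOF. */
int poll_and_reap(int fd, pid_t child, size_t *total_read, int *exit_code) {
    bool child_done = false, fd_done = false;
    char buf[256];
    *total_read = 0;
    *exit_code = -1;
    while (!child_done || !fd_done) {
        if (!child_done) {
            int status;
            /* Block in waitpid only when there is nothing left to read. */
            pid_t r = waitpid(child, &status, fd_done ? 0 : WNOHANG);
            if (r == -1) return -1;
            if (r == child) {
                child_done = true;
                if (WIFEXITED(status)) *exit_code = WEXITSTATUS(status);
            }
        }
        if (!fd_done) {
            struct pollfd pfd = { .fd = fd, .events = POLLIN };
            int ready = poll(&pfd, 1, POLL_INTERVAL);
            if (ready == -1) {
                if (errno == EINTR) continue;
                return -1;
            }
            if (ready > 0) {                 /* data, hang-up, or error on fd */
                ssize_t n = read(fd, buf, sizeof buf);
                if (n == -1) return -1;
                if (n == 0) fd_done = true;  /* EOF: writer closed its end */
                else *total_read += (size_t)n;
            }
        }
    }
    return 0;
}

/* Small driver: the child writes two bytes and exits with status 3. */
int demo_poll_and_reap(void) {
    int p[2];
    if (pipe(p) == -1) return -1;
    pid_t pid = fork();
    if (pid == -1) return -1;
    if (pid == 0) {
        close(p[0]);
        if (write(p[1], "hi", 2) != 2) _exit(1);
        _exit(3);
    }
    close(p[1]);
    size_t n;
    int code;
    if (poll_and_reap(p[0], pid, &n, &code) == -1) return -1;
    close(p[0]);
    return (n == 2 && code == 3) ? 0 : 1;
}
```

The trade-off of this design is latency: a child exit is noticed at most POLL_INTERVAL milliseconds late, because the loop only looks at waitpid between polls.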

The answer of @DanielPorteous should work too if you don't want to use threads in your program.
The idea is simple: neither waitpid nor read should block while it has nothing to do. With a timeout mechanism, waitpid returns immediately when no child has changed state, and the same goes for the read operation.
If read takes a very long time to consume the whole buffer, you can limit it manually so that it doesn't read everything at once; rather, it reads for about 2 milliseconds and then hands the cycle over to waitpid.
But it's safe to use threading for your purpose, and it's pretty easy to implement. Here's a nice guideline about how you can implement threading.
In your case you need to declare two threads.
pthread_t readThread;
pthread_t waitpidThread;
Now you need to create the threads and pass each one its function as a parameter.
pthread_create(&(waitpidThread), NULL, &waitpidFunc, NULL);
pthread_create(&(readThread), NULL, &readFunc, NULL);
Now you have to write your waitpidFunc and readFunc functions. They might look like this.
void* waitpidFunc(void *arg)
{
    while(true) {
        pid_t pid = waitpid(...);
        // This is to put an exit condition somewhere.
        // So that you can finish the thread
        int rc = process_waitpid_event(...);
        if(rc == 0) break;
    }
    return NULL;
}

I think that the right tool in this situation is select or poll. Both do essentially the same job: they let you wait until input is available on any of a set of descriptors. Hence you can wait simultaneously on two sockets, for example. However, that is not directly usable in your case, as you want to wait for a process and a socket. The solution is to create a pipe which will receive something when waitpid finishes.
You can launch a new thread and connect it with the original one with a pipe. The new thread will invoke waitpid and when it finished it will write its result to the pipe. The main thread will wait either for the socket or pipe using select.

Why do I have different output between a terminal and a file when forking?

I'm learning to work with fork(), and I have some questions.
Consider the following code:
#include <stdio.h>
#include <unistd.h>
int main()
{
    int i;
    for(i = 0; i < 5; i++)
    {
        printf("%d", i);
        if((i%2)==0)
            if(fork())
                fork();
    }
}
When I output to a terminal, I get the result I expect (i.e.: 0,1,1,1,2,2,2,...). But when I output to a file, the result is completely different:
Case 1: (output to terminal, e.g.: ./a.out):
Result is: 0,1,1,1,2,2,2,...
Case 2: (output to file, e.g.: ./a.out > output_file)
Result is: 0,1,2,3,4,0,1,2,3,4,0,1,2,3,4,...
Why is it like this?

When you output to a file, the stdio library automatically block-buffers the outbound bits.
When a program calls exit(3) or returns from main(), any remaining buffered bits are flushed.
In a program like this that doesn't generate much output, all of the I/O will occur after the return from main(), when the destination is not a tty. This will often change the pattern and order of I/O operations all by itself.
In this case, the result is further complicated by the series of fork() calls. This will duplicate the partially filled and as-yet-unflushed I/O buffers in each child image.
Before a program calls fork(), one might first flush I/O using fflush(3). If this flush is not done, then you may want all processes except one (typically: the children) to _exit(2) instead of exit(3) or return from main(), to prevent the same bits from being output more than once. (_exit(2) just does the exit system call.)
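The effect of exit(3) versus _exit(2) on a duplicated buffer can be observed with a small experiment. This is a sketch, not part of the original answer: bytes_written is a made-up name, and a tmpfile() stream stands in for stdout redirected to a file.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical demo: returns the number of bytes that reach a temporary file
 * when the child leaves with exit() (use_exit != 0) or with _exit(). */
long bytes_written(int use_exit) {
    FILE *fp = tmpfile();
    if (fp == NULL) return -1;
    fputs("01234", fp);          /* buffered, not yet written */
    fflush(stdout);              /* keep the standard streams out of the experiment */
    fflush(stderr);

    pid_t pid = fork();          /* the child gets a copy of the unflushed buffer */
    if (pid == -1) { fclose(fp); return -1; }
    if (pid == 0) {
        if (use_exit) exit(0);   /* stdio cleanup flushes the child's copy */
        _exit(0);                /* no stdio cleanup, so the copy is discarded */
    }
    waitpid(pid, NULL, 0);
    fflush(fp);                  /* the parent writes its own copy */
    fseek(fp, 0, SEEK_END);
    long size = ftell(fp);
    fclose(fp);
    return size;
}
```

On a typical Linux system I would expect bytes_written(1) to report 10 (the five characters twice) and bytes_written(0) to report 5.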

In your program, the fork() inside the if block runs only in the parent, because the outer fork() returns 0 in the child and the child's PID in the parent. Each even iteration therefore turns one process into three, and all of them continue the loop. The order in which they run is not known: either the child or the parent may execute first after each fork().
The difference in behaviour between terminal output and file output has the following reason.
The content you write to the buffer (to be written to the file on disk eventually) is not guaranteed to be written immediately. For a file it is mostly flushed only after main() has finished, whereas for a terminal it is output while main() is still running.
Before writing to a file on disk, the kernel also copies the data into a buffer, and later, in the background, it gathers up all of the dirty buffers, sorts them optimally, and writes them out to disk. This is called writeback. It allows the kernel to defer writes to more idle periods and to batch many writes together.
To avoid such behaviour, it is always good to have three different condition checks in program using fork()
int pid;
if((pid = fork()) == -1 )
{   //fork unsuccessful
}
else if ( pid > 0)
{   //This is parent
}
else
{   //This is child
}

Buffered streams can produce some strange results sometimes... especially when you have multiple processes using the same buffered stream. Force the buffer to be flushed and you'll see different results:
int main()
{
    int i;
    FILE * fd = fopen(yourfile, "w");
    for(i = 0; i < 5; i++)
    {
        fprintf(fd, "%d", i);
        fflush(fd);
        if((i%2)==0)
            if(fork())
                fork();
    }
}
Also, for your debugging purposes, it might be nice to dump the process' IDs so you can see which process spawns which, and have a better idea of what's going on. getpid() can help you with that.

Why do I have different output between a terminal and a file when forking?
C standard library functions use internal buffering to speed up I/O. Most implementations fully buffer file streams, line-buffer stdin/stdout when they refer to a terminal, and leave stderr unbuffered.
So your problem can be solved in number of ways:
Use explicit buffer flush before fork via fflush(3)
Set buffer type manually via setvbuf(3)
Use write(2) instead of stdlib's printf(3)
Output to stderr by default via fprintf(3) *
Exit with _exit(2) in forked processes instead of exit(3) **
The last two may not work as expected if:
* your implementation does not use unbuffered writes to stderr by default (ISO C requires only that stderr not be fully buffered)
** you have written more than the default buffer size in the child, so that it was automatically flushed.
PS. Yet again, if you need deeper knowledge of standard library functions and buffering I recommend reading Advanced Programming in the UNIX Environment (2nd Edition) by W. Richard Stevens and Stephen A. Rago.
PPS. btw, your question is a very popular interview question for C/C++ programmer position.

Can the order of execution of fork() be determined?

I'm working on an exercise in the textbook "Operating System Concepts 7th Edition", and I'm a bit confused about how fork() works. From my understanding, fork() creates a child process which runs concurrently with its parent. But then, how do we know exactly which process runs first? I mean the order of execution.
Problem
Write a C program using fork() system call that generates the Fibonacci sequence in the child process. The number of sequence will be provided in the command line.
This is my solution:
#include <sys/types.h>
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
void display_fibonacci_sequence( int n ) {
    int i = 0;
    int a = 1;
    int b = 1;
    int value;
    printf( "%d, %d, ", a, b );
    for( ;i < n - 2; ++i ) {
        value = a + b;
        printf( "%d, ", value );
        a = b;
        b = value;
    }
    printf( "\n" );
}
int main( int argc, char** argv ) {
    int n;
    pid_t pid;
    pid = fork();
    if( argc != 2 ) {
        fprintf( stderr, "Invalid arguments" );
        exit( -1 );
    }
    n = atoi( argv[1] );
    if( pid < 0 ) {
        fprintf( stderr, "Fork failed" );
        exit( -1 );
    }
    else if( pid == 0 ) {
        display_fibonacci_sequence( n );
    }
    else { // parent process
        // what do we need to do here?
    }
}
To be honest, I don't see any difference between using fork and not using fork. Besides, if I want the parent process to handle the input from user, and let the child process handle the display, how could I do that?

You are asking many questions, I'll try to answer them in a convenient order.
First question
To be honest, I don't see any difference between using fork and not using fork.
That's because the example is not a very good one. In your example the parent doesn't do anything so the fork is useless.
Second
else {
// what do we need to do here?
}
You need to wait(2) for the child to terminate. Make sure you read that page carefully.
Third
I want the parent process to handle the input from user, and let the child process handle the display
Read the input before the fork and "handle" the display inside if (pid == 0)
Fourth
But then, how do we know exactly which process runs first?
Very few programs should concern themselves with this. You can't know the order of execution, it's entirely dependent on the environment. TLPI says this:
After a fork(), it is indeterminate which process—the parent or the child—next has access to the CPU. On a multiprocessor system, they may both simultaneously get access to a CPU.
Applications that implicitly or explicitly rely on a particular sequence of execution in order to achieve correct results are open to failure due to race conditions
That said, an operating system can allow you to control this order. For instance, Linux has /proc/sys/kernel/sched_child_runs_first.

We don't know which runs first, the parent or the child. This is why the parent generally has to wait for the child process to complete if there is some dependency on order of execution between them.
In your specific problem, there isn't any particular reason to use fork(). Your professor probably gave you this just for a trivial example.
If you want the parent to handle input and the child to calculate, all you have to do is move the call to fork() below the point at which you handle the command-line args. Using the same basic logic as above, have the child call display_fibonacci_sequence, and have the parent simply wait.
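As a sketch of that arrangement: the parent validates the input first and only then forks; the child displays the sequence and the parent waits. The helper name display_in_child is made up for this example, and display_fibonacci_sequence is condensed from the question.

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Same output as the question's function (which assumes n >= 2). */
static void display_fibonacci_sequence(int n) {
    int a = 1, b = 1;
    printf("%d, %d, ", a, b);
    for (int i = 0; i < n - 2; ++i) {
        int value = a + b;
        printf("%d, ", value);
        a = b;
        b = value;
    }
    printf("\n");
}

/* Hypothetical helper: n was already parsed and checked by the caller (the parent).
 * Returns the child's exit status, or -1 on failure. */
int display_in_child(int n) {
    fflush(stdout);                  /* do not hand buffered output to the child */
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {                  /* child: handle the display */
        display_fibonacci_sequence(n);
        fflush(stdout);
        _exit(0);
    }
    int status;                      /* parent: simply wait for the child */
    if (waitpid(pid, &status, 0) == -1) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In main() you would check argc, convert argv[1] with atoi (or strtol, which lets you detect bad input), and then call display_in_child(n).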

The process which is selected by your system scheduler is chosen to run, not unlike any other application running on your operating system. The process spawned is treated like any other process where the scheduler assigns a priority or spot in queue or whatever the implementation is.

But then, how do we know exactly which process runs first? I meant the order of execution.
There is no guarantee as to which one runs first. fork returns 0 in the child and the child's pid in the parent. In theory they could run at exactly the same time on a multiprocessor system. If you actually wanted to determine which ran first, you could have a shared lock between the two processes. The one that acquires the lock first could be said to have run first.
As for what to do in your else branch: you'll want to wait for the child process to exit using wait or waitpid.
To be honest, I don't see any difference between using fork and not using fork.
The difference is that you create a child process. Another process on the system doing computation. For this simple problem the end user experience is the same. But fork is very different when you are writing systems like servers that need to deal with things concurrently.
Besides, if I want the parent process to handle the input from user, and let the child process handle the display, how could I do that?
You appear to have that set up already. The parent process just needs to wait for the child process to finish. The child process will printf the results to the terminal. And the parent process currently gets user input from the command line.

While you cannot control which process (parent or child) gets scheduled first after the fork (in fact on SMP/multicore it might be both!) there are many ways to synchronize the two processes, having one wait until the other reaches a certain point before it performs any nontrivial operations. One classic, extremely portable method is the following:
Prior to fork, call pipe to create a pipe.
Immediately after fork, the process that wants to wait should close the writing end of the pipe and call read on the reading end of the pipe.
The other process should immediately close the reading end of the pipe, and wait to close the writing end of the pipe until it's ready to let the other process run. (read will then return 0 in the other process)
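The three steps above can be sketched as follows. The function name parent_runs_first and the second pipe (report) are additions for this example only; report exists just to record the order in which the two processes did their work.

```c
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical demo: returns 0 if the parent's step ran before the child's. */
int parent_runs_first(void) {
    int gate[2], report[2];
    if (pipe(gate) == -1 || pipe(report) == -1) return -1;

    pid_t pid = fork();
    if (pid == -1) return -1;
    if (pid == 0) {                         /* child: the process that waits */
        char c;
        close(gate[1]);                     /* otherwise read() never sees EOF */
        close(report[0]);
        while (read(gate[0], &c, 1) > 0) {} /* blocks until the parent closes gate[1] */
        if (write(report[1], "C", 1) != 1) _exit(1);
        _exit(0);
    }

    close(gate[0]);
    if (write(report[1], "P", 1) != 1) return -1; /* the work that must happen first */
    close(report[1]);
    close(gate[1]);                         /* release the child */

    char order[2] = {0, 0};
    waitpid(pid, NULL, 0);
    ssize_t n = read(report[0], order, 2);
    close(report[0]);
    return (n == 2 && order[0] == 'P' && order[1] == 'C') ? 0 : 1;
}
```

No data is ever written to the gate pipe; closing its last write end is the signal, which is why every process that should not hold the gate open has to close its copy of gate[1].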
