Implementing pipe using shared memory - c

I'm implementing a pipe using shared memory.
I should write and touch only the library, and not the main().
I encountered a problem:
Let's say this is the main() of some user who uses my library shared_memory_pipe.h:
#include <unistd.h>
#include "shared_memory_pipe.h"
int main() {
    int fd[2];
    shared_memory_pipe(fd);
    if (fork()) {
        while (1) {}
    }
    shared_memory_close(fd[0]);
    shared_memory_close(fd[1]);
}
In this example the child closes both of its fds, but the parent is stuck in an infinite loop and never closes its own. In this case my pipe should still exist (as opposed to the cases where all writing fds are closed, all reading fds are closed, or all fds are closed, in which the pipe should die).
As I said before, I write only the library (shared_memory_pipe.h).
So, inside the library, how can I know whether a fork() has been made?
How can I know that another process holds a reading/writing end of my shared memory pipe, so that I know whether or not to close the pipe?
I heard something about a function that can tell that a fork() has happened, or something like that, but I couldn't find it and I don't know what it is called.
Thanks ahead!
Please ask if you need more information.

Before any fork, the parent can store the result of getpid() in a global pid_t pid_parent.
Then, at a later point in time, the process can compare pid_parent against a fresh call to getpid().
If getpid()'s result differs from pid_parent, the process is at least one fork() away from the parent.
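A minimal sketch of that check, assuming the library keeps the creator's PID in a static variable. The names remember_creator(), is_forked_copy() and child_detects_fork() are made up for illustration and are not part of any real API. Note that this only lets a process find out that it is itself a forked copy; it does not tell the parent that a child exists.

```c
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical library state: PID of the process that created the pipe. */
static pid_t pid_parent;

/* Call once from the pipe-creation function, before any fork(). */
void remember_creator(void) {
    pid_parent = getpid();
}

/* True if the caller is not the process that called remember_creator(),
 * i.e. it is at least one fork() away from it. */
bool is_forked_copy(void) {
    return getpid() != pid_parent;
}

/* Demo: returns 1 if a forked child sees itself as a copy, 0 if it does
 * not, and -1 if fork() or waitpid() failed. */
int child_detects_fork(void) {
    remember_creator();
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(is_forked_copy() ? 1 : 0); /* child reports via its exit code */
    int status = 0;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

In the library, remember_creator() would be called when the pipe is created, and is_forked_copy() in the close function.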

Which part of the code is responsible for closing the fd's?
If it is the user's code then a fork() is not your problem. After all, the caller could do an execve to a different program (a common use of anonymous pipes) so your library code is now gone from the process, even though the fd's are still open, so there is no way you can handle that.
If you have a library API to close the FDs, then that's all you can do. An exec'ed program would not call your library anyhow.

Related

when a child process is created using fork() system call, where the child process starts execution? [duplicate]

fork() creates a new process and the child process starts to execute from the current state of the parent process.
This is the thing I know about fork() in Linux.
So, accordingly the following code:
#include <stdio.h>
#include <unistd.h>
int main() {
    printf("Hi");
    fork();
    return 0;
}
should print "Hi" only once, as per the above.
But when I run it on Linux, compiled with gcc, it prints "Hi" twice.
Can someone explain what actually happens when fork() is used, and whether I have understood how fork() works?
(Incorporating some explanation from a comment by user @Jack)
When you print something to the "Standard Output" stdout (computer monitor usually, although you can redirect it to a file), it gets stored in temporary buffer initially.
Both sides of the fork inherit the unflushed buffer, so when each side of the fork hits the return statement and ends, it gets flushed twice.
Before you fork, you should fflush(stdout); which will flush the buffer so that the child doesn't inherit it.
stdout to the screen (as opposed to when you're redirecting it to a file) is actually line-buffered, so if you'd done printf("Hi\n"); you wouldn't have had this problem because the newline would have flushed the buffer.
printf("Hi"); doesn't actually immediately print the word "Hi" to your screen. What it does do is put the word "Hi" into the stdout buffer, which will then be shown once the buffer is 'flushed'. In this case, stdout is (presumably) pointing to your monitor. In that case, the buffer will be flushed when it is full, when you force it to flush, or (most commonly) when you print out a newline ("\n") character. Since the buffer still holds "Hi" when fork() is called, both parent and child process inherit it, and therefore both will print out "Hi" when they flush the buffer. If you call fflush(stdout); before calling fork() it should work:
#include <stdio.h>
#include <unistd.h>
int main() {
    printf("Hi");
    fflush(stdout);
    fork();
    return 0;
}
Alternatively, as I said, if you include a newline in your printf it should work as well:
#include <stdio.h>
#include <unistd.h>
int main() {
    printf("Hi\n");
    fork();
    return 0;
}
In general, it's very unsafe to have open handles / objects in use by libraries on either side of fork().
This includes the C standard library.
fork() makes two processes out of one, and no library can detect it happening. Therefore, if both processes continue to run with the same file descriptors / sockets etc, they now have differing states but share the same file handles (technically they have copies, but the same underlying files). This makes bad things happen.
Examples of cases where fork() causes this problem:
stdio e.g. tty input/output, pipes, disc files
Sockets used by e.g. a database client library
Sockets in use by a server process, which can get strange effects when a child created to service one socket happens to inherit a file handle for another. Getting this kind of programming right is tricky; see Apache's source code for examples.
How to fix this in the general case:
Either
a) Immediately after fork(), call exec(), possibly on the same binary (with necessary parameters to achieve whatever work you intended to do). This is very easy.
b) after forking, don't use any existing open handles or library objects which depend on them (opening new ones is ok); finish your work as quickly as possible, then call _exit() (not exit() ). Do not return from the subroutine that calls fork, as that risks calling C++ destructors etc which may do bad things to the parent process's file descriptors. This is moderately easy.
c) After forking, somehow clear up all the objects and make them all in a sane state before having the child continue. e.g. close underlying file descriptors without flushing data which are in a buffer which is duplicated in the parent. This is tricky.
c) is approximately what Apache does.
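Option b) can be sketched roughly as follows. This is only an illustration under simple assumptions: the helper run_in_child() and the job sample_work() are made-up names, and the job is assumed not to touch any handles inherited from the parent.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs work() in a forked child and returns the child's exit code,
 * or -1 if fork() or waitpid() failed. */
int run_in_child(int (*work)(void)) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: do the job and leave with _exit(), never by returning,
         * so atexit() handlers and inherited stdio buffers are left alone. */
        _exit(work());
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}

/* Hypothetical job: does nothing but return a recognisable code. */
int sample_work(void) {
    return 7;
}
```

Calling run_in_child(sample_work) in the parent yields the child's exit code, here 7.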
printf() does buffering. Have you tried printing to stderr?
Technical answer:
when using fork() you need to make sure that exit() is not called twice (falling off of main is the same as calling exit()). The child (or rarely the parent) needs to call _exit instead. Also, don't use stdio in the child. That's just asking for trouble.
Some libraries have a fflushall() you can call before fork() that makes stdio in the child safe. In this particular case it would also make exit() safe but that is not true in the general case.
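In standard C, fflush(NULL) flushes all open output streams, so a portable version of "flush everything before forking" might look like the sketch below. The wrapper name fork_flushed() is made up for illustration.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Flush every open output stream, then fork, so the child does not start
 * with a copy of unwritten stdio data. Returns whatever fork() returns. */
pid_t fork_flushed(void) {
    fflush(NULL);
    return fork();
}
```

This only deals with output that was buffered before the fork; anything either process buffers afterwards is still its own responsibility.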

Wait for child process without using wait()

When using fork(), is it possible to ensure that the child process executes before the parent without using wait() in the parent?
This is related to a homework problem in the Process API chapter of Operating Systems: Three Easy Pieces, a free online operating systems book.
The problem says:
Write another program using fork(). The child process should
print "hello"; the parent process should print "goodbye". You should
try to ensure that the child process always prints first; can you do
this without calling wait() in the parent?
Here's my solution using wait():
#include <stdio.h>
#include <stdlib.h> // exit
#include <sys/wait.h> // wait
#include <unistd.h> // fork
int main(void) {
    int f = fork();
    if (f < 0) { // fork failed
        fprintf(stderr, "fork failed\n");
        exit(1);
    } else if (f == 0) { // child
        printf("hello\n");
    } else { // parent
        wait(NULL);
        printf("goodbye\n");
    }
}
After thinking about it, I decided the answer to the last question was "no, you can't", but then a later question seems to imply that you can:
Now write a program that uses wait() to wait for the child process
to finish in the parent. What does wait() return? What happens if
you use wait() in the child?
Am I interpreting the second question wrong? If not, how do you do what the first question asks? How can I make the child print first without using wait() in the parent?
I hope this answer is not too late.
A few minutes ago I emailed Remzi (the book's author) and got this reply (excerpted):
Without calling wait() is hard, and not really the main point.
What you did -- learning about signals on your own -- is a good sign,
showing you will seek out deeper knowledge. Good for you!
Later, you'll be able to use a shared memory segment, and
either condition variables or semaphores, to solve this problem.
Create a pipe in the parent. After fork, close the write half in the parent and the read half in the child.
Then, poll for readability. Since the child never writes to it, it will wait until the child (and all grandchildren, unless you take special care) no longer exists, at which time poll will give a "read with hangup" response. (Alternatively, you could actually communicate over the pipe).
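A rough sketch of the pipe-and-poll idea, with no wait() call in the parent. The function name child_prints_first() is made up, and the child uses write() instead of printf() so that no stdio buffer is involved. The child is never reaped here, so it stays a zombie until the parent exits.

```c
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 0 once the child has exited and the parent has printed, -1 on error. */
int child_prints_first(void) {
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);                          /* child keeps only the write end */
        if (write(STDOUT_FILENO, "hello\n", 6) < 0)
            _exit(1);
        _exit(0);                               /* exiting closes the write end */
    }

    close(fds[1]);                              /* parent keeps only the read end */
    struct pollfd pfd = { .fd = fds[0], .events = POLLIN };
    int ready;
    do {
        ready = poll(&pfd, 1, -1);              /* returns once the write end is gone */
    } while (ready < 0 && errno == EINTR);
    close(fds[0]);
    if (ready < 0)
        return -1;

    if (write(STDOUT_FILENO, "goodbye\n", 8) < 0)
        return -1;
    return 0;
}
```

Closing the write end in the parent is the important step: while any process still holds it open, poll() will not report the hangup.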
You should read about O_CLOEXEC. As a general rule, that flag should always be set unless you have a good reason to clear it.
I can't see why the second question would imply that the answer to the first is "yes".
Yes, there are plenty of ways to obtain what is asked, but I suspect that none of them is in the "spirit" of the problem, where the focus is on the fork/wait primitives. The point to remember is that you can't assume anything after a fork about the order in which the processes run relative to each other.
To ensure the child process prints first you need some kind of synchronization between the two processes, and there are a lot of system primitives with "communication" semantics between processes (for example locks, semaphores, signals, etc). I doubt one of these is meant to be used here, as they are generally introduced slightly later in such a course.
Any other attempt that relies only on timing assumptions (like using sleep or loops to "slow down" the parent) can fail, meaning that you will not be able to prove that it always succeeds. Testing would probably suggest that it is correct, because most of the runs you try will not have the bad characteristics that lead to failure. Remember that, except in realtime OSes, scheduling is only an approximation of fair concurrency.
NOTE:
As Jonathan Leffler commented, I also suppose that using other wait-like primitives (wait4, waitpid, etc.) is forbidden, by the same "spirit" argument.
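I'm not sure whether this is against the spirit of the question, but I think that calling the pause system call in the parent branch will put the parent to sleep, so the scheduler runs the child process (if it didn't already run). Note that pause() only returns after a signal handler has run, so the parent would need to install a handler for SIGCHLD to be woken when the child exits.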

C - execvp() interprocess communication

Hi all I am new to C so sorry if I am very lost. I am having trouble with this multi-threaded web server I am trying to create. I am attempting to...
have a thread create a new thread
have that new thread execute execvp() to call a different C program on my machine
have that new thread return streams of data from the execvp()
I was thinking about using pthreads to spawn a new process to run execvp() and have it return the data through a pipe. But is that even necessary? Don't pthreads share memory?
Also, I was maybe thinking about using fork() instead of a pthread and have the child send data back to the parent through a pipe.
Can you please help guide me in the correct direction.
What you're looking for is a combination of fork(), one of the exec functions, and pipe() (or maybe socketpair() or something, but pipes work too).
Threads share memory, but execvp() replaces the whole calling process, all of its threads included, with the new program. That program starts with a fresh address space, so it shares no memory with the code that launched it, and even if it did, it wouldn't know how to use that memory.
The proper way is to open a pipe when you still have one process, fork() into two processes (parent and child), and have the child call execvp(). The child can now write into its end of the pipe, and the parent can read from the other end.
Remember to wait() for the child to end.
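Put together, the steps above might look like this sketch. The function name capture_output() and its buffer-based interface are assumptions made for the example; a real server would more likely stream the data than collect it in one buffer.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs the program argv[0] with arguments argv and copies up to cap-1 bytes
 * of its standard output into buf (NUL-terminated).
 * Returns the program's exit code, or -1 on error. */
int capture_output(char *const argv[], char *buf, size_t cap) {
    int fds[2];
    if (cap == 0 || pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);                       /* child only writes */
        if (fds[1] != STDOUT_FILENO) {
            if (dup2(fds[1], STDOUT_FILENO) < 0)
                _exit(127);
            close(fds[1]);
        }
        execvp(argv[0], argv);
        _exit(127);                          /* only reached if execvp() failed */
    }

    close(fds[1]);                           /* or read() would never see end-of-file */
    size_t used = 0;
    char discard[256];
    for (;;) {
        /* Keep draining once buf is full so the child cannot block on the pipe. */
        int full = used >= cap - 1;
        char *dst = full ? discard : buf + used;
        size_t room = full ? sizeof discard : cap - 1 - used;
        ssize_t n = read(fds[0], dst, room);
        if (n < 0 && errno == EINTR)
            continue;
        if (n <= 0)
            break;
        if (!full)
            used += (size_t)n;
    }
    buf[used] = '\0';
    close(fds[0]);

    int status = 0;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

For example, calling it with the argument vector { "echo", "hi", NULL } should leave "hi\n" in the buffer, assuming an echo program is on the PATH.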
Have you written your non-blocking, single-threaded web-server yet? How would you expect to measure the benefits of multithreading if you don't have something to compare it against? It's far easier to determine where the best performance gains are if you expose a single-threaded project to concurrency, than it is to guess and suffer with a poor framework for the rest of the project's life.
Creating threads is easy, but you really need to read the pthread_create manual first. How else can you trust that your project is handling errors correctly? I also suggest reading about the other pthread functionality. I'm happy to help you resolve issues if you show me that you're trying to resolve them yourself, by the way. I won't bother spoonfeeding you.
As mentioned by aaaaaa123456789, you wouldn't want to call execvp() from a thread created with pthread_create, as this would replace your entire process (including all of your threads) with the new program.

What to do if exec() fails?

Let's suppose we have code doing something like this:
int pipes[2];
pipe(pipes);
pid_t p = fork();
if (0 == p)
{
    dup2(pipes[1], STDOUT_FILENO);
    execv("/path/to/my/program", NULL);
    ...
}
else
{
    //... parent process stuff
}
As you can see, it's creating a pipe, forking and using the pipe to read the child's output (I can't use popen here, because I also need the PID of the child process for other purposes).
Question is, what should happen if in the above code, execv fails? Should I call exit() or abort()? As far as I know, those functions close the open file descriptors. Since fork-ed process inherits the parent's file descriptors, does it mean that the file descriptors used by the parent process will become unusable?
UPD
I want to emphasize that the question is not about the executable loaded by exec() failing, but exec itself, e.g. in case the file referred by the first argument is not found or is not executable.
You should use exit(int) since the low byte of the argument can be read by the parent process using waitpid(). This lets you handle the error appropriately in the parent process. Depending on what your program does you may want to use _exit instead of exit. The difference is that _exit will not run functions registered with atexit nor will it flush stdio streams.
There are about a dozen reasons execv() can fail and you might want to handle each differently.
The child failing is not going to affect the parent's file descriptors. They are, in effect, reference counted.
You should call _exit(). It terminates the process like exit() does, but it avoids invoking any registered atexit() functions and does not flush stdio buffers. Calling _exit() still lets the parent get your failed child's exit status and take any necessary steps.
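A small sketch of that advice: the child calls _exit() when execv() returns, and the parent reads the status with waitpid(). The function name run_program() is made up, and the exit code 127 is just the shell's convention for "could not run the command", used here as an assumption.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs the program at path with no extra arguments.
 * Returns its exit code, 127 if execv() itself failed in the child,
 * or -1 if fork()/waitpid() failed or the child was killed by a signal. */
int run_program(const char *path) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        char *const argv[] = { (char *)path, NULL };
        execv(path, argv);
        /* execv() only returns on failure. Leave without running atexit()
         * handlers or flushing stdio buffers copied from the parent. */
        _exit(127);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

If the parent needs to know why execv() failed, the child could map errno to different exit codes before calling _exit().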

fork starts executing from where?

I got very useful answers to my previous question about a segmentation fault. Thanks to those who responded.
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>
int main(void)
{
    printf("hello");
    int pid = fork();
    wait(NULL);
}
output: hellohello.
So it looks as if the child process starts executing from the beginning.
If I am not wrong, then how does the program work if I put the sem_open before fork()?
(ref. answers to my previous questions)
I need a clear explanation of the segmentation fault, which happens occasionally and not always. And why not always? If there is an error in the code, it should occur every time, right?
fork creates a clone of your process. Conceptually speaking, all state of the parent also ends up in the child. This includes:
CPU registers (including the instruction pointer, which defines where in the code your program is)
Memory (as an optimization your kernel will most likely mark all pages as copy-on-write, but semantically speaking it should be the same as copying all memory.)
File descriptors
Therefore... Your program will not "start running" from anywhere... All the state that you had when you called fork will propagate to the child. The child will return from fork just as the parent will.
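To see the "copied state" idea in action, here is a small sketch. The function name fork_copies_state() is made up for the example. A variable set before fork() has the same value in both processes, and a change made in the child afterwards is not visible in the parent.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stores the value of the counter as seen by the parent and by the child.
 * Returns 0 on success, -1 if fork() or waitpid() failed. */
int fork_copies_state(int *parent_value, int *child_value) {
    int counter = 1;            /* set before fork(): both processes start with 1 */
    pid_t pid = fork();         /* both processes continue from this point */
    if (pid < 0)
        return -1;
    if (pid == 0) {
        counter += 41;          /* changes only the child's private copy */
        _exit(counter);         /* report it through the exit status */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    *child_value = WEXITSTATUS(status);
    *parent_value = counter;    /* the child's write is not visible here */
    return 0;
}
```

After a successful call the parent's value is still 1 while the child reported 42.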
As for what you can do after a fork... I'm not sure about what POSIX says, but I wouldn't rely on semaphores doing the right thing after a fork. You might need an inter-process semaphore (see man sem_open, or the pshared parameter of sem_init). In my experience cross-process semaphores aren't really well supported on free Unix type OS's... (Example: Some BSDs always fail with ENOSYS if you ever try to create one.)
@GregS mentions the duplicated "hello" strings after a fork. He is correct to say that stdio (i.e. FILE*) will buffer in user-space memory, and that a fork leads to the string being buffered in two processes. You might want to call fflush(stdout); fflush(stderr); and flush any other important FILE* handles before a fork.
No, it starts from the fork(), which returns 0 in the child or the child's process ID in the parent.
You see "hello" twice because the standard output is buffered, and has not actually been written at the point of the fork. Both parent and child then actually write the buffered output. If you fflush(stdout); after the printf(), you should see it only once.

Resources