Executing the program after the "fork part" - c

In my program, main uses fork to create two processes. The child process does some work, and the parent process forks again and its child calls another function. Both functions write to one file, and everything works fine.
What I need is to write something to the end of the file after both functions and all processes (both functions create processes) have finished.
I tried putting the fprintf call everywhere in main, and it always writes somewhere in the middle of the file, so I think main probably runs in parallel with the two functions.
I tried to use semaphore
s = sem_open(s1, O_CREAT, 0666, 0);
in this way: at the end of each function I wrote sem_post(s), and in main I put sem_wait(s); sem_wait(s); followed by the fprintf call, but that didn't work either.
Is there a way to solve this?
Thanks

I think you're looking for the wait function. Each call to wait(NULL) waits for one child process to finish (thanks Jonathan Leffler), so call wait in a loop to wait for all child processes to finish. Do that right before you write to the file in your parent process.
You can also read about the waitpid function if you want to wait for a specific process instead of for all the processes.
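As a rough illustration of the wait loop, here is a minimal sketch. The helper name run_children_then_append and the text written to the file are made up for this example; the only real calls are fork, wait, and the stdio functions.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks n children that each append one line to the
 * file at path, reaps every child, and only then appends a final line.
 * Returns the number of children reaped, or -1 on error. */
int run_children_then_append(const char *path, int n)
{
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0) {
            /* Child: do its work, then leave without returning to the caller. */
            FILE *f = fopen(path, "a");
            if (f != NULL) {
                fprintf(f, "child %d\n", i);
                fclose(f);
            }
            _exit(0);
        }
    }

    /* Parent: wait() returns once per terminated child and fails with
     * ECHILD when no children are left. */
    int reaped = 0;
    for (;;) {
        pid_t r = wait(NULL);
        if (r > 0)
            reaped++;
        else if (errno == EINTR)
            continue;   /* interrupted by a signal, try again */
        else
            break;      /* no more children */
    }

    /* Every child has exited, so this line ends up last in the file. */
    FILE *f = fopen(path, "a");
    if (f == NULL)
        return -1;
    fprintf(f, "all children finished\n");
    fclose(f);
    return reaped;
}
```

Note that wait only reaps direct children. If the functions themselves fork further processes, each of them has to wait for its own children before it exits.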
Edit:
Alternatively, you can use semaphores across processes, but it takes a little more work. See this Stack Overflow answer. The basic idea is to call sem_open with the O_CREAT flag. sem_open has two signatures:
sem_t *sem_open(const char *name, int oflag);
sem_t *sem_open(const char *name, int oflag, mode_t mode, unsigned int value);
From the sem_open man page:
If O_CREAT is specified in oflag, then two additional arguments must
be supplied. The mode argument specifies the permissions to be
placed on the new semaphore, as for open(2). (Symbolic definitions
for the permissions bits can be obtained by including <sys/stat.h>.)
The permissions settings are masked against the process umask. Both
read and write permission should be granted to each class of user
that will access the semaphore. The value argument specifies the
initial value for the new semaphore. If O_CREAT is specified, and a
semaphore with the given name already exists, then mode and value are
ignored.
In your parent process, call sem_open with the mode and value parameters, giving it the permissions you need. In the child process(es), call sem_open("YOUR_SEMAPHORE_NAME", 0) to open that semaphore for use.
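A minimal sketch of that idea, under some assumptions: the semaphore name "/gr_demo_children_done", the 0600 mode, and the helper name wait_for_children_with_semaphore are all invented for this example. On older glibc versions you may need to link with -pthread.

```c
#include <errno.h>
#include <fcntl.h>
#include <semaphore.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical semaphore name, chosen for this sketch. */
#define DEMO_SEM_NAME "/gr_demo_children_done"

/* Forks n children; each one posts the semaphore when its work is done.
 * The parent blocks until it has collected one post per child.
 * Returns 0 on success, -1 on error. */
int wait_for_children_with_semaphore(int n)
{
    sem_unlink(DEMO_SEM_NAME);  /* remove a stale semaphore from an earlier run */
    sem_t *s = sem_open(DEMO_SEM_NAME, O_CREAT | O_EXCL, 0600, 0);
    if (s == SEM_FAILED)
        return -1;

    int started = 0;
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;
        if (pid == 0) {
            /* Child: open the existing semaphore by name (no O_CREAT). */
            sem_t *cs = sem_open(DEMO_SEM_NAME, 0);
            if (cs == SEM_FAILED)
                cs = s;  /* the handle inherited across fork() is also valid */
            /* ... the child's real work would go here ... */
            sem_post(cs);
            _exit(0);
        }
        started++;
    }

    /* Parent: one sem_wait() per child that was started. */
    for (int i = 0; i < started; i++)
        while (sem_wait(s) == -1 && errno == EINTR)
            ;  /* retry if interrupted by a signal */

    while (wait(NULL) > 0)  /* reap the children */
        ;
    sem_close(s);
    sem_unlink(DEMO_SEM_NAME);
    return started == n ? 0 : -1;
}
```

The semaphore only tells the parent that each child reached its sem_post call. If the final write must come after the children have fully exited, a wait loop is still the simpler tool.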

Related

What is the difference between threads and forked processes in Unix?

I know fork process does not share memory, and threads do, but then how can forked processes communicate one another?
Here is an example in which the version using a thread is commented out (that version ends), while the other version, using fork, never ends. The code relies on the global variable done:
#include <stdio.h>
#include <stdbool.h>
#include <signal.h>
#include <unistd.h>
#include <pthread.h>
bool done = false;
void *foo(void *arg){
    sleep(1);
    done = true;
    return 0;
}
int main(){
    //pthread_t t1;
    //pthread_create(&t1, NULL, foo, NULL);
    //
    //printf("waiting...\n");
    //while(!done){}
    //printf("Ok. Moving on.\n");
    printf("waiting...\n");
    if(!fork()){
        foo(NULL);
    } else {
        while(!done){}
        printf("OK. moving on.\n");
    }
}
So if forked processes do not share data (i.e. global variables?) unlike threads, how do they otherwise communicate in unix?
EDIT:
This is definitely not a duplicate, as I have already seen similar topics like Forking vs Threading and other documents about fork/threads in *nix. I just want to know the use cases of both (e.g. Windows has no fork, only threads, so they probably had different use cases in mind?).
fork() copies the current process. Without special preparation, almost no data is shared between child and parent afterwards. The new process starts out identical to the old one, but as soon as either process writes to a variable, the affected memory region is copied and the writer gets its own physical memory for that data (copy-on-write). This means that setting a variable in the child is not visible to the parent, and vice versa.
You can use shared memory, pipes, files, sockets, signals, and probably other IPC methods to communicate between child and parent. For your specific case you can use wait() or waitpid() to wait until your child exits. But I assume you want to know how to exchange data.
Shared memory
You can use the mmap() call to reserve memory that is shared between parent and child.
void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);
You can pass the flag MAP_SHARED | MAP_ANONYMOUS to flags to create a memory region that is shared. There you can place the shared variable and both can access it. Here is an example.
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

//creates a region of shared memory to store a bool
static bool *reserveSharedMemory(void)
{
    void *data = mmap(NULL, sizeof(bool), PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if(MAP_FAILED==data)
    {
        //do some error handling here
        return NULL;
    }
    bool *p=data;
    *p=false;
    return p;
}
Sockets
Sockets let you send data to and receive data from another endpoint. With socketpair() you can create two connected socket file descriptors and communicate by writing to one of them and reading from the other, or vice versa. This way, communicating with the child process becomes almost the same as communicating over a network socket.
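A minimal sketch of the socketpair() approach, assuming a stream socket in the AF_UNIX domain. The helper name receive_from_child is invented for this example, and it assumes a short message that arrives in a single read.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the child sends msg over one end of a socket pair,
 * the parent reads it from the other end into buf (NUL-terminated).
 * buflen must be at least 1. Returns the bytes received, or -1 on error. */
ssize_t receive_from_child(const char *msg, char *buf, size_t buflen)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {
        close(sv[0]);                       /* child keeps sv[1] only */
        if (write(sv[1], msg, strlen(msg)) < 0)
            _exit(1);
        close(sv[1]);
        _exit(0);
    }

    close(sv[1]);                           /* parent keeps sv[0] only */
    ssize_t n = read(sv[0], buf, buflen - 1);
    if (n >= 0)
        buf[n] = '\0';
    close(sv[0]);
    waitpid(pid, NULL, 0);
    return n;
}
```

Unlike a pipe, both ends of a socket pair can be read from and written to, so the parent could answer the child over the same descriptor.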
When you call fork() you create a copy of the running process with a different PID. Variables declared before the fork() call exist in both processes. fork() returns 0 in the "child" process and the PID of the "child" process in the "parent" process (with a switch on the return value, you can control the behavior of both processes).
If you want processes created by fork() to communicate, declare an array of file descriptors such as int fd[2] BEFORE the fork and call pipe(fd). If pipe() does not return -1, you have created a one-way channel: data written to fd[1] can be read from fd[0].
Here you can see an example on how this can work
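A minimal sketch of the pipe approach might look like this. The helper name read_all_from_child is made up for this example; the real calls are pipe, fork, read, write, and close.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the child writes msg into the pipe, the parent
 * reads until end-of-file into buf (NUL-terminated, buflen >= 1).
 * Returns the number of bytes read, or -1 on error. */
ssize_t read_all_from_child(const char *msg, char *buf, size_t buflen)
{
    int fd[2];
    if (pipe(fd) == -1)          /* fd[0] is the read end, fd[1] the write end */
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (pid == 0) {
        close(fd[0]);            /* the child only writes */
        if (write(fd[1], msg, strlen(msg)) < 0)
            _exit(1);
        close(fd[1]);
        _exit(0);
    }

    /* The parent must close its copy of the write end, otherwise read()
     * never reports end-of-file. */
    close(fd[1]);

    size_t total = 0;
    ssize_t n;
    while (total < buflen - 1 &&
           (n = read(fd[0], buf + total, buflen - 1 - total)) > 0)
        total += (size_t)n;
    buf[total] = '\0';

    close(fd[0]);
    waitpid(pid, NULL, 0);
    return (ssize_t)total;
}
```

Because the pipe is created before the fork, both processes hold both ends. Each side closes the end it does not use.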
As you probably already know, a forked process is a child of the process that called fork(). It is initialized with a copy of the parent's address space and file descriptor table, while it shares the underlying open file table entries with the parent. As someone already said, it doesn't really make sense to fork when you can just create a new thread, because you end up with two copies of the same program state.
One more note: vfork() creates a child that shares the parent's memory until it calls _exit() or one of the exec functions, but it is deprecated (POSIX.1-2008 removed it). I mention it only as additional information.
You can use pipes, sockets, etc. to communicate between parent and child if you want to, and you can tell whether you are in the parent or the child by checking the pid returned by fork().

Implementing pipe using shared memory

I'm implementing a pipe using shared memory.
I should write and touch only the library, and not the main().
I encountered a problem:
Let's say this is the main() of some user who uses my library shared_memory_pipe.h:
#include "shared_memory_pipe.h"
int main() {
    int fd[2];
    shared_memory_pipe(fd);
    if (fork()) {
        while(1) {}
    }
    shared_memory_close(fd[0]);
    shared_memory_close(fd[1]);
}
In this example the child closes both of its fd's, but the parent is stuck in an infinite loop and never closes its fd's. In this case my pipe should still exist (as opposed to the case where all writing fd's are closed, or all reading fd's are closed, or all are closed, in which case the pipe should die).
As I said before, I write only the library, (shared_memory_pipe.h).
So, inside the library, how can I know if a fork() has been made?
How can I know that there is another process who has a reading/writing end to my shared memory pipe, so I'll know to close/not close my shared memory pipe?
I heard something about a function that can tell whether a fork() has happened, or something like that, but I couldn't find it.
Thanks ahead!
Please ask if you need more information.
Before any fork, the parent can store the result of getpid() in a global pid_t pid_parent.
Then, at a later point in time, the process can test against pid_parent by calling getpid() again.
If getpid()'s result then differs from pid_parent, the process is at least one fork() away from the parent.
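A minimal sketch of this check, assuming the library records the PID when the pipe is created. The names remember_creator, in_forked_process, and fork_detection_demo are made up for this example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* PID of the process that created the pipe (hypothetical library state). */
static pid_t pid_parent;

/* Would be called from the library's pipe-creation function. */
void remember_creator(void)
{
    pid_parent = getpid();
}

/* Returns nonzero if the calling process is not the one that created
 * the pipe, i.e. it is at least one fork() away from it. */
int in_forked_process(void)
{
    return getpid() != pid_parent;
}

/* Demo: returns 1 if the child detects that it is a fork while the
 * parent does not, 0 otherwise, -1 on error. */
int fork_detection_demo(void)
{
    remember_creator();

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(in_forked_process() ? 0 : 1);

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    int child_detected = WIFEXITED(status) && WEXITSTATUS(status) == 0;
    return child_detected && !in_forked_process();
}
```

This only tells a process that it is a fork of the creator. It does not tell the creator how many other processes still hold an end of the pipe; for that, a reference count kept in the shared memory itself would be one option.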
Which part of the code is responsible for closing the fd's?
If it is the user's code then a fork() is not your problem. After all, the caller could do an execve to a different program (a common use of anonymous pipes) so your library code is now gone from the process, even though the fd's are still open, so there is no way you can handle that.
If you have a library API to close the FDs, then that's all you can do. An exec'ed program would not call your library anyhow.

setuid() before calling execv() in vfork() / clone()

I need to fork and exec from a server. Since my server's memory footprint is large, I intend to use vfork() / Linux clone(). I also need to open pipes for stdin / stdout / stderr. Is this allowed with clone() / vfork()?
From the standard:
[..] the behaviour is undefined if the process created by vfork() either modifies any data other than a variable of type pid_t used to store the return value from vfork(), or returns from the function in which vfork() was called, or calls any other function before successfully calling _exit() or one of the exec family of functions.
The problem with calling functions like setuid or pipe is that they could affect memory in the address space shared between the parent and child processes. If you need to do anything before exec, the best way is to write a small shim process that does whatever you need it to and then execs to the eventual child process (perhaps arguments supplied through argv).
shim.c
======
#include <unistd.h>

enum {
    /* initial arguments */
    ARGV_FILE = 5, ARGV_ARGS
};
int main(int argc, char *argv[]) {
    /* consume instructions from argv */
    /* setuid, pipe() etc. */
    /* execvp only returns (with -1) if it fails */
    return execvp(argv[ARGV_FILE], argv + ARGV_ARGS);
}
I'd use clone() instead, using CLONE_VFORK|CLONE_VM flags; see man 2 clone for details.
Because CLONE_FILES is not set, the child process has its own file descriptors, and can close and open standard descriptors without affecting the parent at all.
Because the cloned process is a separate process, it has its own user and group ids, so setting them via setresgid() and setresuid() (perhaps calling setgroups() or initgroups() first to set the additional groups -- see man 2 setresuid, man 2 setgroups, and man 3 initgroups for details) will not affect the parent at all.
The CLONE_VFORK|CLONE_VM flags mean this clone() should behave like vfork(), with the child process running in the same memory space as the parent process up till the execve() call.
This approach avoids the latency of an intermediate executable, which is pretty significant, but it is completely Linux-specific.

What to do if exec() fails?

Let's suppose we have a code doing something like this:
int pipes[2];
pipe(pipes);
pid_t p = fork();
if(0 == p)
{
    dup2(pipes[1], STDOUT_FILENO);
    execv("/path/to/my/program", NULL);
    ...
}
else
{
    //... parent process stuff
}
As you can see, it's creating a pipe, forking and using the pipe to read the child's output (I can't use popen here, because I also need the PID of the child process for other purposes).
Question is, what should happen if in the above code, execv fails? Should I call exit() or abort()? As far as I know, those functions close the open file descriptors. Since fork-ed process inherits the parent's file descriptors, does it mean that the file descriptors used by the parent process will become unusable?
UPD
I want to emphasize that the question is not about the executable loaded by exec() failing, but exec itself, e.g. in case the file referred by the first argument is not found or is not executable.
You should use exit(int), since the low byte of the argument can be read by the parent process using waitpid(). This lets you handle the error appropriately in the parent process. Depending on what your program does, you may want to use _exit instead of exit. The difference is that _exit will not run functions registered with atexit, nor will it flush stdio streams.
There are about a dozen reasons execv() can fail and you might want to handle each differently.
The child failing is not going to affect the parent's file descriptors. They are, in effect, reference counted.
You should call _exit(). It terminates the process like exit() does, but it avoids invoking any registered atexit() functions and does not flush the stdio buffers that the child inherited from the parent. Calling _exit() means that the parent will still be able to get your failed child's exit status and take any necessary steps.
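A minimal sketch of that pattern follows. The helper name spawn_and_get_status is invented for this example, and the exit codes 127 (file not found) and 126 (any other exec failure) are an assumption borrowed from the convention shells use, not something exec itself prescribes.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs the program at path with no extra arguments.
 * Returns the child's exit status, or -1 if fork/waitpid failed or the
 * child was killed by a signal. */
int spawn_and_get_status(const char *path)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        char *const argv[] = { (char *)path, NULL };
        execv(path, argv);
        /* Only reached if execv itself failed. Leave with _exit() so the
         * parent's atexit handlers and stdio buffers are left alone. */
        _exit(errno == ENOENT ? 127 : 126);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The parent's own file descriptors stay usable whichever way the child exits, because the child closes only its own copies.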

Clone-equivalent of fork?

I'd like to use the namespacing features of the clone function. Reading the manpage, it seems like clone has lots of intricate details I need to worry about.
Is there an equivalent clone invocation to good ol' fork()?
I'm already familiar with fork, and believe that if I have a starting point in clone, I can add flags and options from there.
I think that this will work, but I'm not entirely certain about some of the pointer arguments.
pid_t child = clone( child_f, child_stack,
                     /* int flags */ SIGCHLD,
                     /* argument to child_f */ NULL,
                     /* pid_t *ptid */ NULL,
                     /* struct user_desc *tls */ NULL,
                     /* pid_t *ctid */ NULL );
The low byte of the flags parameter specifies which signal is sent to notify the parent when the child does things like dying or stopping. I believe that all of the actual flags turn on behavior that differs from fork. Looking at the kernel code suggests this is the case.
If you really want something close to fork, you may want to invoke the raw clone system call (sys_clone), which does not take a function pointer and instead returns twice like fork.
You could fork a normal child process using fork(), then use unshare() to create a new namespace.
Namespaces are a bit weird, I can't see a lot of use-cases for them.
clone() is used to create a thread. The big difference between clone() and fork() is that clone() is meant to start executing at a separate entry point, a function, whereas fork() just continues from the point in the code where it was invoked. int (*fn)(void *) in the manpage definition is that function; its int return value becomes the exit status.
The closest call to clone is pthread_create() which is essentially a wrapper for clone().
This does not get you a way to get fork() behavior.

Resources