vfork VS fork in MPI mutiple thread

vfork VS fork in MPI mutiple thread - c

My program is very very large. So, I can't list it here. My program uses openMPI & mutiple_thread.
The problem has been solved. (using vfork() instead of fork()) But I don't know why it works. So, could anyone give me an explaination about it?
The problem is caused by free().
There are some segments of code in my program. All these segments are in threads which is created by pthread_create. The logic of these segments are like:
{
*p = malloc();
fun(p);
free(p);
}
All errors are at free(). It report a segment fault error. I ran the program more than 100 times. I found that there is always a fork() being called before each corruption at free.
The logic of fork segment is like(in thread):
{
MPI_program_code...
if(!fork())
{
execv(exe_file,arg);
}
MPI_program_code...
}
(Note that, in exe_file no MPI_function is used.)
When I use vfork() instead of fork(), there is no problem at all. But I don't know why it works.
So, could anyone explain why it works?

You might find the Open MPI FAQ topic on forking child processes very useful. Also an explanation on why using fork() with InfiniBand is dangerous can be found here.
vfork(2) differs from fork(2) in that it is specifically designed to be as lightweight as possible and is only meant to be used together with an immediately following execve(2) (or any of its wrappers from the C library) or _exit(2) call. The reason for that is that vfork(2) creates a child process that shares all memory with the parent instead of having it copy-on-write mapped, i.e. the new child is more like a thread than like a full-blown process. Since the child also uses the stack of the original thread, the parent is blocked until the child has either execve'd another executable or exited.
Open MPI registers a fork() handler using pthread_atfork(). The handler is not called when vfork() is used on modern Linux systems, therefore no actions are taken by the parent process upon forking.

Related

Are threads copied when calling fork?

If I have a program running with threads and call fork() on a unix-based system, are the threads copied? I know that the virtual memory for the current process is copied 1:1 to the new process spawned. I know that threads have their own stack in the virtual memory of a process. Thus, at least the stack of threads should be copied too. However, I do not know if there is anything more to threads that does not reside in virtual memory and is thus NOT copied over. If there is not, do the two processes share the threads or are they independent copies?

No.
Threads are not copied on fork(). POSIX specification says (emphasize is mine):
fork - create a new process
A process shall be created with a single thread. If a multi-threaded process calls fork(), the new process shall contain a replica of the calling thread and its entire address space, possibly including the states of mutexes and other resources. Consequently, to avoid errors, the child process may only execute async-signal-safe operations until such time as one of the exec functions is called.
To circumvent this problem, there exists a pthread_atfork() function to help.

man fork:
The child process is created with a single thread—the one that called fork(). The entire virtual address space of the parent is replicated in the child, including the states of mutexes, condition variables, and other pthreads objects; the use of pthread_atfork(3) may be helpful for dealing with problems that this can cause.

From The Open Group Base Specifications Issue 7, 2018 edition's fork:
A process shall be created with a single thread. If a multi-threaded process calls fork(), the new process shall contain a replica of the calling thread and its entire address space, possibly including the states of mutexes and other resources. Consequently, to avoid errors, the child process may only execute async-signal-safe operations until such time as one of the exec functions is called.
When the application calls fork() from a signal handler and any of the fork handlers registered by pthread_atfork() calls a function that is not async-signal-safe, the behavior is undefined.

Originally, "fork" was achieved by writing the task to disk and then, rather than reading in a different thread (which would be done if swapping the task with a different one), modifying the task ID of the image still in memory and continuing with its execution (as the new task). This was a very simple modification to the basic task switching mechanism, where only one task would occupy RAM memory at a time.
Of course, as memory management got more elaborate this scheme was modified to suit the new environment.

Lighter weight alternatives to fork() in POSIX C?

In the man pages I've been reading, it seems popen, system, etc. tend to call fork(). In turn, fork() copies the process's entire memory state. This seems really heavy, especially when in many situations a child from a call to fork() uses little if any of the memory allocated for the parent.
So, my question is, can I get fork() like behavior without duplicating the whole memory state of the parent process? Or is there something I am missing, such that fork() is not as heavy as it appears (like, maybe calls tend to be optimized to avoid unnecessary memory duplication)?

fork(2) is, as all syscalls, a primitive operation (but some C libraries use clone(2) for it), from the point of view of user-space application. It is mostly a single machine instruction SYSCALL or SYSENTER to switch from user-mode to kernel-mode, then the (recent version of) Linux kernel is doing quite significant processing.
It is in practice quite efficient (e.g. less than a millisecond, and sometimes even less than a tenth of it) because the kernel is extensively using lazy copy-on-write techniques to share pages between parent & child processes. The actual copying would happen later, on page faults, when overwriting a shared page.
And forking has a huge advantage, since the starting of some other program is delegated to execve(2): it is conceptually simple: the only difference between the parent & child processes is the result of fork
BTW on POSIX systems such as Linux, fork(2) or the suitable clone(2) equivalent is the only way to create a process (there are some few weird exceptions that you should generally ignore: the kernel is making some processes like /sbin/init etc...), since vfork(2) is obsolete.

The problem is that to run the main function of a standardly linked executable, you need to call execve, and exec replaces the whole process image and so you need a new address space, which is what fork is for.
You can get around this by having your calee expose its main functionality in a shared library (but then it must not be called main), and then you can load the function with the main functionality without having to fork (provided there are no symbol conflicts).
That would be a more efficient alternative to system (basically with the efficiency of a function call).
Now popen involves pipes and to use pipes you need to have the pipe ends in different schedulable units. Threads, which use the same address space, can be used here as a lighter alternative to separate processes.

As you alluded to fork() is a bit of a mad syscall that has kind of stuck around for historical reasons. There's a great article about its flaws here, and also this post goes into some details and potential workarounds.
Although on Linux fork() is optimised to use copy-on-write for the memory, it's still not "free" because:
It still has to do some memory-related admin (new page tables, etc.)
If you're using RAII (e.g. in C++ or possibly Rust) then all the objects that are copied will be cleaned up twice. That might even lead to logic errors (e.g. deleting temporary files twice).
It's likely that the parent process will keep running, probably modifying lots of its memory, and then it will have to be copied.
The alternatives appear to be:
vfork()
clone()
posix_spawn()
vfork() was created for the common use case of doing fork() and then execve() to run a program. execve() replaces all of the memory of the current process with a new set, so there's no point copying the parent process's memory if your just about to obliterate it.
So vfork() doesn't do that. Instead it runs in the same memory space as the parent process and pauses it until it gets to execve(). The Linux man page for vfork() says that doing just about anything except vfork() then execve() is undefined behaviour.
posix_spawn() is basically a nice wrapper around vfork() and then execve().
clone() is similar to fork() but allows you to exactly specify what is copied (file descriptors, memory, etc.). It has a load of options, including one (CLONE_VM) which lets the child process run in the same address space as the parent, which is pretty wild! I guess that is the lightest weight way to make a new process because it doesn't involve any copying of memory at all!
But in practice I think in most situations you should either:
Use threads, or
Use posix_spawn().
(Note, I am just researching this now; I'm not an expert so I might have got some things wrong.)

fork() in C; which should be parent process which should be child process

This may seem to be a dumb question but I don't really have a good understanding of fork() other than knowing that this is about multi-threading. Child process is like a thread. If a task needs to be processed via fork(), how to correctly assign tasks to parent process and child process?

Check the return value of fork. The child process will receive the value of 0. The parent will receive the value of the process id of the child.

Read Advanced Linux Programming which has an entire chapter dedicated to processes (because fork is difficult to explain);
then read documentation of fork(2); fork is not about multi-threading, but about creating processes. Threads are generally created with pthread_create(3) (which is implemented above clone(2), a Linux specific syscall). Read some pthreads tutorial to learn more about threads.
PS. fork is difficult to understand (you'll need hours of reading, some experimentation, perhaps using strace(1), till you reach the "AhAh" insight moment when you have understood it) since it returns twice on success. You need to keep its result, and you need to test the result for the three cases : <0 (failure), ==0 (child), >0 (parent). Don't forget to later call waitpid(2) (or something similar) in the parent, to avoid having zombie processes.

How to handle a fork error for a multithreaded process?

I am working on a multithreaded process that forks to execute another process. Sometimes, the fork may error if the execution file does not exist. Since this process has multiple threads running prior to fork I have a couple questions:
Are threads copied over to the forked process.
What is the best practice to handling an error from fork with a multithreaded process. For example:
/* in a multithreaded process */
pid = fork();
if(pid == 0)
{
execlp(filename, filename, NULL);
fprintf(stderr, "filename doesn't exist");
/* what do i do here if there's multiple threads running
from the fork? */
exit(-1);
}

Well, the fork doesn't error if the executable file doesn't exist. The exec errors in that case. But, to your actual question, POSIX states that fork creates a new process with a single thread, a copy of the thread that called fork. See here for details:
A process shall be created with a single thread. If a multi-threaded process calls fork(), the new process shall contain a replica of the calling thread and its entire address space, possibly including the states of mutexes and other resources.
Consequently, to avoid errors, the child process may only execute async-signal-safe operations until such time as one of the exec functions is called.
So what you have is okay, if a little sparse :-)
A single thread will be running in the child and, if you cannot exec another program, log a message and exit.
And, in the rationale section, it explains why it was done that way:
There are two reasons why POSIX programmers call fork(). One reason is to create a new thread of control within the same program (which was originally only possible in POSIX by creating a new process); the other is to create a new process running a different program. In the latter case, the call to fork() is soon followed by a call to one of the exec functions.
The general problem with making fork() work in a multi-threaded world is what to do with all of the threads. There are two alternatives. One is to copy all of the threads into the new process. This causes the programmer or implementation to deal with threads that are suspended on system calls or that might be about to execute system calls that should not be executed in the new process. The other alternative is to copy only the thread that calls fork(). This creates the difficulty that the state of process-local resources is usually held in process memory. If a thread that is not calling fork() holds a resource, that resource is never released in the child process because the thread whose job it is to release the resource does not exist in the child process.
When a programmer is writing a multi-threaded program, the first described use of fork(), creating new threads in the same program, is provided by the pthread_create() function. The fork() function is thus used only to run new programs, and the effects of calling functions that require certain resources between the call to fork() and the call to an exec function are undefined.

fork starts executing form where?

To my previous question about segmentation fault ,I got very useful answers.Thanks for those who have responded.
#include<stdio.h>
main()
{
printf("hello");
int pid = fork();
wait(NULL);
}
output: hellohello.
In this the child process starts executing form the beginning.
If Iam not wrong , then how the program works if I put the sem_open before fork()
(ref answers to :prev questions)
I need a clear explanation about segmentation fault which happens occasionally and not always. And why not always... If there is any error in coding then it should occur always right...?

fork creates a clone of your process. Conceptually speaking, all state of the parent also ends up in the child. This includes:
CPU registers (including the instruction pointer, which defines where in the code your program is)
Memory (as an optimization your kernel will most likely mark all pages as copy-on-write, but semantically speaking it should be the same as copying all memory.)
File descriptors
Therefore... Your program will not "start running" from anywhere... All the state that you had when you called fork will propagate to the child. The child will return from fork just as the parent will.
As for what you can do after a fork... I'm not sure about what POSIX says, but I wouldn't rely on semaphores doing the right thing after a fork. You might need an inter-process semaphore (see man sem_open, or the pshared parameter of sem_init). In my experience cross-process semaphores aren't really well supported on free Unix type OS's... (Example: Some BSDs always fail with ENOSYS if you ever try to create one.)
#GregS mentions the duplicated "hello" strings after a fork. He is correct to say that stdio (i.e. FILE*) will buffer in user-space memory, and that a fork leads to the string being buffered in two processes. You might want to call fflush(stdout); fflush(stderr); and flush any other important FILE* handles before a fork.

No, it starts from the fork(), which returns 0 in the child or the child's process ID in the parent.
You see "hello" twice because the standard output is buffered, and has not actually been written at the point of the fork. Both parent and child then actually write the buffered output. If you fflush(stdout); after the printf(), you should see it only once.