I was always a bit hazy on this little bit of C magic. When you call execv, you're "replacing the process image." What exactly does that mean? Just the DATA segment? Everything allocated to the process? The stack? The heap?
My question is about what happens to the storage used by the parameters that you pass to execv? If they were local variables to the function that called execv, then they're on the stack. But if you replace the process image, and call the new process's main() function, bad things would happen when main() returned, because the stack information that points to the return location from the main call was replaced by the new process image.
Same thing for variables, yes? And what if those variables were allocated on the heap?
Inquiring minds are inquiring to anybody who knows.
The exec family of functions replaces the process wholesale - data, stack, text, heap, everything. Some file descriptors can stay open (those opened by the original process without FD_CLOEXEC set). But apart from that, you pretty much get a whole new process - see the link for all the details.
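To make the FD_CLOEXEC point concrete, here is a minimal sketch, assuming a POSIX system with /bin/sh and /dev/null. The helper name fd_survives_exec and the choice of descriptor 3 are made up for illustration. The child opens a descriptor, optionally marks it close-on-exec, then execs a shell that tries to write to it.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if descriptor 3 is still open in the
 * exec'd shell, 0 if exec closed it, -1 on error. */
int fd_survives_exec(int set_cloexec)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        int fd = open("/dev/null", O_WRONLY);
        if (fd == -1)
            _exit(126);
        if (fd != 3) {                  /* move it to a known number */
            if (dup2(fd, 3) == -1)
                _exit(126);
            close(fd);
        }
        if (fcntl(3, F_SETFD, set_cloexec ? FD_CLOEXEC : 0) == -1)
            _exit(126);

        /* The shell silences its own stderr, then writes to descriptor 3.
         * The write fails if exec closed the descriptor. */
        execl("/bin/sh", "sh", "-c", "exec 2>/dev/null; echo x >&3",
              (char *)NULL);
        _exit(127);                     /* only reached if exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    int code = WEXITSTATUS(status);
    if (code >= 126)                    /* setup or exec failure in child */
        return -1;
    return code == 0;
}
```

With the flag clear the shell should be able to write to the descriptor; with FD_CLOEXEC set the descriptor is gone by the time the shell starts.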
What happens to the parameters you passed in is the OS's problem - it has to make sure they're passed to the new process's main function in a way that complies with the standard, but I don't think POSIX dictates exactly how it does that.
For Linux, you can look at the fs/exec.c file to see the implementation. Jump near the end (line 1484 as I post this) to look at the do_execveat_common function which is the main part of the implementation. You'll see the arguments are copied into the new address space (calls to copy_strings near the end of the function).
Just the DATA segment?
No, all memory mappings are erased and re-created for the new executable.
Everything allocated to the process? The stack? The heap?
Yes, all memory. Some kernel resources, documented here, are carried over to the new process image though, such as file descriptors. These resources are managed by the kernel, and are not part of the process memory. All of this is quite operating system specific, though: the OS can accomplish this through various means as long as it complies with the mentioned exec() documentation.
what happens to the storage used by the parameters that you pass to execv?
Typically the kernel makes a copy of those arguments, and injects them into the memory of the new executable.
But if you replace the process image, and call the new process's main() function, bad things would happen when main() returned,
No, when main() returns, that process ends. The code and memory of the original process that called exec() doesn't exist any more, there's nothing to return to.
I am writing a rudimentary shell program in C which uses a parent process to handle shell events and fork() to create child processes that call execv on another executable (also C).
I am trying to keep a process counter on the parent process. And as such I thought of the possibility of creating a pointer to a variable that keeps track of how many processes are running.
However, that seems to be impossible since the arguments execv (and the program executed by it) takes are of type char * const argv[].
I have tried to keep track of the number of processes using mmap for shared memory between processes, but couldn't get that to work since after the execv call the process simply dies and doesn't let me update the process counter.
In summary, my question is: Is there a way for me to pass a pointer to an integer on an execv call to another program?
Thank you in advance.
You cannot meaningfully pass a pointer from one process to another because the pointer is meaningless in the other process. Each process has its own memory, and the address is relative to that memory space. In other words, the virtual memory manager lets every process pretend it has the entire machine's memory; other processes are simply invisible.
However, you do have a few options for setting up communications between related processes. The most obvious one is a pipe, which you've presumably already encountered. That's more work, though, because you need to make sure that some process is always listening for pipe communications.
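For the pipe option, a minimal sketch (POSIX assumed; receive_from_child is a made-up name) looks something like this. The child writes one int into the pipe and the parent reads it back, which is roughly how a child could report "I'm done" to a parent that keeps the counter itself.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that sends `value` through a pipe.
 * Returns what the parent received, or -1 on error. */
int receive_from_child(int value)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        close(fds[0]);                          /* child only writes */
        ssize_t n = write(fds[1], &value, sizeof value);
        _exit(n == (ssize_t)sizeof value ? 0 : 1);
    }

    close(fds[1]);                              /* parent only reads */
    int received = -1;
    ssize_t n = read(fds[0], &received, sizeof received);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return n == (ssize_t)sizeof received ? received : -1;
}
```

The data is copied through the kernel, so no pointer ever crosses the process boundary.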
Another simple possibility is to just leave a file descriptor open when you fork and exec (see the close-on-exec flag to see how to accomplish the latter); although mmap is not preserved by exec, you can remap the memory to the open fd in the child process. If you don't want to pass the fd, you can mmap the memory to a temporary file, and use an environment variable to record the name of the temporary file.
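Here is a rough sketch of the temporary-file variant, assuming a POSIX system with a writable /tmp. The environment variable name DEMO_COUNTER_FILE and both function names are invented for the example. To keep it in one file the child does not actually exec; it just does what an exec'd program would have to do, namely look up the file name in the environment and map the file itself.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define COUNTER_ENV "DEMO_COUNTER_FILE"     /* hypothetical variable name */

/* Map one shared int backed by the file at `path`; NULL on failure. */
static int *map_counter(const char *path)
{
    if (path == NULL)
        return NULL;
    int fd = open(path, O_RDWR);
    if (fd == -1)
        return NULL;
    void *p = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    close(fd);                              /* the mapping stays valid */
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: returns the counter value the parent sees after
 * one child has incremented it, or -1 on error. */
int counter_after_child(void)
{
    char path[] = "/tmp/counterXXXXXX";
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    if (ftruncate(fd, sizeof(int)) == -1) { /* make room for one int */
        close(fd);
        unlink(path);
        return -1;
    }
    close(fd);

    setenv(COUNTER_ENV, path, 1);           /* inherited across fork and exec */
    int *counter = map_counter(path);
    if (counter == NULL) {
        unlink(path);
        return -1;
    }
    *counter = 0;

    pid_t pid = fork();
    if (pid == 0) {
        /* An exec'd program would start from here: find the file by name. */
        int *mine = map_counter(getenv(COUNTER_ENV));
        if (mine == NULL)
            _exit(1);
        *mine += 1;
        _exit(0);
    }
    if (pid > 0)
        waitpid(pid, NULL, 0);

    int result = (pid > 0) ? *counter : -1;
    munmap(counter, sizeof(int));
    unlink(path);
    return result;
}
```

With a single child there is nothing to race against; with several children incrementing at once the plain += would not be safe.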
Another possibility is POSIX shared memory. Again, you might want to communicate the shm name through an environment variable, rather than hard-coding it into the application.
Note that neither shared mmaps nor shared memory are atomic. If you're incrementing a counter, you'll need to use some locking mechanism to avoid race conditions.
For possibly a lot more information than you really wanted, you can read ESR's overview of interprocess communication techniques in Chapter 7 of The Art of Unix Programming.
In the man pages I've been reading, it seems popen, system, etc. tend to call fork(). In turn, fork() copies the process's entire memory state. This seems really heavy, especially when in many situations a child from a call to fork() uses little if any of the memory allocated for the parent.
So, my question is, can I get fork() like behavior without duplicating the whole memory state of the parent process? Or is there something I am missing, such that fork() is not as heavy as it appears (like, maybe calls tend to be optimized to avoid unnecessary memory duplication)?
From the point of view of a user-space application, fork(2) is, like all syscalls, a primitive operation (although some C libraries implement it with clone(2)). It is mostly a single machine instruction, SYSCALL or SYSENTER, that switches from user mode to kernel mode; the Linux kernel (in recent versions) then does quite significant processing.
It is in practice quite efficient (e.g. less than a millisecond, and sometimes even less than a tenth of it) because the kernel is extensively using lazy copy-on-write techniques to share pages between parent & child processes. The actual copying would happen later, on page faults, when overwriting a shared page.
And forking has a huge advantage, since starting some other program is delegated to execve(2): it is conceptually simple. The only difference between the parent and child processes is the result of fork.
BTW on POSIX systems such as Linux, fork(2) or the suitable clone(2) equivalent is the only way to create a process (there are a few weird exceptions that you should generally ignore: the kernel itself creates some processes, like /sbin/init, etc.), since vfork(2) is obsolete.
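A tiny sketch of "the only difference is the result of fork", assuming POSIX (fork_and_collect is a made-up name). Both processes continue from the same point with the same state and tell themselves apart only by the return value.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the child exits with `child_code`; the parent
 * returns the code it collected, or -1 on error. */
int fork_and_collect(int child_code)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;              /* no child was created */

    if (pid == 0)
        _exit(child_code);      /* fork returned 0: this is the child */

    /* fork returned the child's pid: this is the parent */
    int status;
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```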
The problem is that to run the main function of a standardly linked executable, you need to call execve, and exec replaces the whole process image and so you need a new address space, which is what fork is for.
You can get around this by having your callee expose its main functionality in a shared library (but then it must not be called main), and then you can load the function with the main functionality without having to fork (provided there are no symbol conflicts).
That would be a more efficient alternative to system (basically with the efficiency of a function call).
Now popen involves pipes and to use pipes you need to have the pipe ends in different schedulable units. Threads, which use the same address space, can be used here as a lighter alternative to separate processes.
As you alluded to, fork() is a bit of a mad syscall that has kind of stuck around for historical reasons. There's a great article about its flaws here, and also this post goes into some details and potential workarounds.
Although on Linux fork() is optimised to use copy-on-write for the memory, it's still not "free" because:
It still has to do some memory-related admin (new page tables, etc.)
If you're using RAII (e.g. in C++ or possibly Rust) then all the objects that are copied will be cleaned up twice. That might even lead to logic errors (e.g. deleting temporary files twice).
It's likely that the parent process will keep running, probably modifying lots of its memory, and then it will have to be copied.
The alternatives appear to be:
vfork()
clone()
posix_spawn()
vfork() was created for the common use case of doing fork() and then execve() to run a program. execve() replaces all of the memory of the current process with a new set, so there's no point copying the parent process's memory if you're just about to obliterate it.
So vfork() doesn't do that. Instead it runs in the same memory space as the parent process and pauses it until it gets to execve(). The Linux man page for vfork() says that doing just about anything except vfork() then execve() is undefined behaviour.
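For what it's worth, the vfork() pattern looks roughly like this. It's a sketch assuming a system that still provides vfork() and has /bin/sh; the name vfork_run_shell is hypothetical. The argument array is built before vfork() so that the child does nothing except exec or _exit.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs `command` via /bin/sh and returns its exit
 * status, or -1 on error. */
int vfork_run_shell(const char *command)
{
    /* Prepared up front: the child must not touch the parent's memory. */
    char *const argv[] = { "sh", "-c", (char *)command, NULL };

    pid_t pid = vfork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* The parent is suspended until one of these two calls happens. */
        execv("/bin/sh", argv);
        _exit(127);             /* exec failed; never return from here */
    }

    int status;
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```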
posix_spawn() is basically a nice wrapper around vfork() and then execve().
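A minimal posix_spawn() sketch, assuming a POSIX system with /bin/sh (spawn_shell is a made-up name). Passing NULL for the file actions and attributes asks for the defaults, and environ hands the child the current environment.

```c
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: runs `command` via /bin/sh and returns its exit
 * status, or -1 on error. */
int spawn_shell(const char *command)
{
    char *const argv[] = { "sh", "-c", (char *)command, NULL };
    pid_t pid;

    /* posix_spawn returns 0 on success and an error number otherwise. */
    if (posix_spawn(&pid, "/bin/sh", NULL, NULL, argv, environ) != 0)
        return -1;

    int status;
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Compared with fork() followed by exec, there is no window in which your own code runs in the child, which is where most of the fork() pitfalls live.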
clone() is similar to fork() but allows you to exactly specify what is copied (file descriptors, memory, etc.). It has a load of options, including one (CLONE_VM) which lets the child process run in the same address space as the parent, which is pretty wild! I guess that is the lightest weight way to make a new process because it doesn't involve any copying of memory at all!
But in practice I think in most situations you should either:
Use threads, or
Use posix_spawn().
(Note, I am just researching this now; I'm not an expert so I might have got some things wrong.)
I allocate memory for the parameter lpCommandLine in CreateProcess function, either with malloc or on the stack.
Can I free/release that memory immediately after the call, or do I have to wait until the process finishes?
The buffer referred to by lpCommandLine needs to be valid only for the duration of the call to CreateProcess. Once CreateProcess returns, it will not refer to that buffer again.
Imagine if you did have to keep that buffer alive. Were that the case, then all parent processes would have to outlive all of their children. That's clearly a ridiculous proposition and I'm sure you will know from experience that there is no such requirement.
There is a general principle here. By and large, API functions will not refer to their arguments after the function returns. If they do need to do so, then it will be explicitly called out in the documentation, or it will be blatantly obvious from the intent of the function. As an example of the latter I am thinking of passing a window procedure to RegisterClass. It is quite clear that the window procedure must remain valid for as long as there exists a window of that class.
I am following an example FUSE tutorial to understand how FUSE works on Linux. In the example all the dynamic data is allocated using malloc, and passed in as user data to the fuse_main function. This data is later accessible for any FUSE calls. These calls need not be from the same process. How does this work?
To make the question clearer:
I run the main bbfs program with ../src/bbfs rootdir mountdir to mount the file system. It is in the main() of bbfs.c that malloc is called. The bbfs program also defines several FUSE functions. But this program exits after the filesystem is mounted.
How can other programs (or the kernel) that call read() or open() on the mounted filesystem
1. access the memory allocated using malloc by the bbfs program if it has already exited? Wouldn't the OS free the memory allocated using malloc after bbfs exited?
2. access the defined functions, if the process that defined them has already exited? Where would the object code of the FUSE functions reside after the process exited?
I am a bit confused about the lifetimes of the object code and the heap memory objects here and how other programs (or the kernel) use it later. Any help or pointers would be appreciated.
Most of your question is based on a false assumption:
… But [the FUSE server] exits after the filesystem is mounted.
It's not actually exiting at all. It's forking into the background and continuing to run as long as the filesystem is mounted.
While it's running, everything works as normal.
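Here is a sketch of the general fork-into-the-background pattern, not of libfuse's actual code; the name value_seen_by_background and the pipe used for reporting are my own assumptions. A "launcher" process allocates some state with malloc, forks, and exits right away, which is what the shell sees. The background copy keeps running with its own copy of the heap and the code.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the heap value as seen by the background
 * process, or -1 on error. */
int value_seen_by_background(int value)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t launcher = fork();
    if (launcher == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (launcher == 0) {
        /* Plays the role of the program started from the shell. */
        close(fds[0]);
        int *state = malloc(sizeof *state);
        if (state == NULL)
            _exit(1);
        *state = value;

        pid_t background = fork();
        if (background != 0)
            _exit(0);           /* foreground copy is done (or fork failed) */

        /* Background copy: its heap does not depend on the foreground
         * copy, so `state` is still valid here. */
        if (write(fds[1], state, sizeof *state) != (ssize_t)sizeof *state)
            _exit(1);
        _exit(0);
    }

    close(fds[1]);
    waitpid(launcher, NULL, 0);     /* the foreground program has exited */

    int seen = -1;
    if (read(fds[0], &seen, sizeof seen) != (ssize_t)sizeof seen)
        seen = -1;
    close(fds[0]);
    return seen;
}
```

From the caller's side the program "exited", yet a process holding the malloc'd data and the function code is still alive to serve requests.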
Hi, I created a server program that forks a new process after it accepts a socket connection.
There are several statically allocated global variables defined in the program. My question is are these static buffers allocated twice after the fork? Or does the fork only duplicate address space on the heap and the call stack?
The entire address space is duplicated, including all global variables and the program text.
The whole address space is "duplicated" during fork(2). It's often done with copy-on-write and there are more details about sharing program text and the libraries, but that is not relevant here. Both parent and child processes end up with their own copy of the static data.
fork() duplicates the entire process image. All of it. As such, are they allocated twice... no, they're allocated once per executable image of which there are now two, and no, if you refer to one in the parent, it will not hold the same content as that of the child unless you use shared memory.
On static, that keyword means this (from ISO C99):
An object whose identifier is declared with external or internal linkage, or with the storage-class specifier static has static storage duration. Its lifetime is the entire execution of the program and its stored value is initialized only once, prior to program startup.
Which basically means your buffer will be initialised once as part of the CRT startup routine and that space only disappears when you exit. In this case, that storage disappears when each child exits.
Linux uses a mechanism called copy-on-write. That basically means that as long as a variable is not modified, the parent and the new process share it (strictly, they share the memory page that holds it). But before the variable is modified, that page is copied, and the process doing the write gets its own copy. This is done for performance reasons, and the technique is a form of lazy optimization. So you don't need to worry that changing a variable in one process will change it in another.
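A quick way to convince yourself, as a sketch assuming POSIX (the buffer and function names are made up): the child overwrites a static buffer and the parent checks that its own copy is untouched.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static char buffer[16] = "parent";  /* static storage, one per process */

/* Hypothetical helper: returns 1 if the parent's buffer is unchanged
 * after the child wrote to its own copy, 0 if not, -1 on error. */
int parent_buffer_unchanged(void)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        strcpy(buffer, "child");    /* the write triggers a private copy */
        _exit(0);
    }

    waitpid(pid, NULL, 0);          /* the child has finished writing */
    return strcmp(buffer, "parent") == 0;
}
```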