I'm trying to run an application in C, but the only way I could find that is reasonably easy to use works like this:
system("command here");
It works, of course, but it's really slow (especially when repeating this a lot). I'm just wondering if there is a way of running a program without having to interact with a shell, something like python's subprocess module.
I have heard of execl, and I would use that (forking it first, of course), but I'm wondering if there is a simpler way that wouldn't require forking first.
EDIT: I also want to be able to know the return code of the program
As I'm sure you already know, system already employs the fork/exec strategy. I understand you want to circumvent the shell and are looking for a simple approach; I'm just saying you could just as easily write a function to wrap the fork/exec pattern, as is done in system. Indeed, it would probably be most straightforward to just do that. An alternative, as Gabe mentioned in the comments, is posix_spawn.
A faster alternative is vfork() followed by exec, but it is generally discouraged and is obsolete in the latest POSIX standards:
4.3BSD; POSIX.1-2001 (but marked OBSOLETE). POSIX.1-2008 removes the specification of vfork().
vfork() is meant to be immediately followed by an exec or _exit. Otherwise all kinds of weird bugs can arise, since the virtual memory pages and page tables aren't duplicated (the child uses the same data/heap/stack segments). The parent/calling process blocks until the child execs or _exits. Modern implementations of regular fork have copy-on-write semantics, which approach the speed of vfork without the potential bugs incurred by vfork's memory-sharing semantics.
If you want even further control over memory-sharing semantics and process inheritance, and the consequent potential speed-up (and are on Linux), look into clone() (wrapper for system-call sys_clone()) which is what some process-creating system calls delegate their work to. Be sure to carefully comb over all of the various flags.
You can use waitpid to get the exit status of the process.
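A minimal sketch of such a wrapper, assuming a POSIX system. The name run_program is hypothetical, and the 127 and 128 + signal conventions are borrowed from what shells do rather than required by anything above.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] (looked up on PATH) without a shell.
 * Returns the child's exit code, 128 + signal number if a signal killed
 * it, or -1 if fork() or waitpid() failed. */
int run_program(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);              /* only reached if execvp() failed */
    }

    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)      /* retry if a signal interrupted the wait */
            return -1;
    }

    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);
    return -1;
}
```

The argument vector must end with a NULL pointer, for example `char *args[] = {"ls", "-l", NULL};` followed by `run_program(args)`.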
If neither system() nor popen() provides the mechanism you need, then the easy way to do it is with fork() and execv() (or, perhaps, execl(), but the argument list must be fixed at compile time, not variable, to use it). Really! It is not hard to do fork() and exec(), and any alternative will encapsulate that processing.
The Python subprocess module is simply hiding fork() and exec() for you behind a convenient interface. That's probably appropriate for a high-level language like Python. C is a lower-level language and doesn't really need the complexity.
The hard way to do it is with posix_spawn(). You have to create arguments to describe all the actions you want done in the child between the fork() and the exec(), which is far harder to set up than it is to simply do the fork(), make the changes, and then use exec() after all. This (posix_spawn()) is what you get when you design the code to spawn a child process without visibly using fork() and exec() and ensure that it can handle almost any reasonable circumstance.
You'll need to consider whether you need to use wait() or waitpid() or a variant to determine when the child is complete. You may need to consider whether to handle the SIGCHLD signal (which will notify you when a child dies).
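For comparison, here is a sketch of the simplest posix_spawn() case, where nothing has to happen in the child before the exec, so both the file-actions and attributes arguments are NULL. The setup effort described above appears once those arguments are needed (redirections, signal masks and so on). The name spawn_program is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: spawn argv[0] (looked up on PATH) and wait for it.
 * Returns the exit code, or -1 on failure or abnormal termination. */
int spawn_program(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid;
    /* NULL file actions and NULL attributes: the child inherits the
     * parent's descriptors, signal mask, and so on unchanged. */
    int err = posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ);
    if (err != 0) {
        errno = err;             /* the error number is the return value */
        return -1;
    }

    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

How a failed exec is reported (an error return from posix_spawnp() or a child that exits with status 127) differs between implementations, so the sketch does not rely on either.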
Related
In the man pages I've been reading, it seems popen, system, etc. tend to call fork(). In turn, fork() copies the process's entire memory state. This seems really heavy, especially when in many situations a child from a call to fork() uses little if any of the memory allocated for the parent.
So, my question is, can I get fork() like behavior without duplicating the whole memory state of the parent process? Or is there something I am missing, such that fork() is not as heavy as it appears (like, maybe calls tend to be optimized to avoid unnecessary memory duplication)?
From the point of view of a user-space application, fork(2) is, like all syscalls, a primitive operation (though some C libraries implement it with clone(2)). It is mostly a single machine instruction, SYSCALL or SYSENTER, that switches from user mode to kernel mode; recent versions of the Linux kernel then do quite significant processing.
It is in practice quite efficient (e.g. less than a millisecond, and sometimes even less than a tenth of it) because the kernel is extensively using lazy copy-on-write techniques to share pages between parent & child processes. The actual copying would happen later, on page faults, when overwriting a shared page.
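Copy-on-write itself is not directly observable from user code, but the semantics it has to preserve are: a write in the child must never show up in the parent. The sketch below checks just that; the function name is made up for illustration, and it says nothing about timing.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the parent's buffer is unchanged after the child has
 * overwritten its own view of it, 0 if it changed, -1 on error. */
int child_write_stays_private(size_t size)
{
    char *buf = malloc(size);
    if (buf == NULL)
        return -1;
    memset(buf, 'p', size);

    pid_t pid = fork();          /* pages are shared, not copied, here */
    if (pid < 0) {
        free(buf);
        return -1;
    }
    if (pid == 0) {
        memset(buf, 'c', size);  /* each first write to a page faults and
                                    gives the child a private copy */
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) {
        free(buf);
        return -1;
    }

    int unchanged = 1;
    for (size_t i = 0; i < size; i++) {
        if (buf[i] != 'p') {
            unchanged = 0;
            break;
        }
    }
    free(buf);
    return unchanged;
}
```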
Forking also has a huge advantage: because starting some other program is delegated to execve(2), it is conceptually simple. The only difference between the parent and child processes is the result of fork.
BTW, on POSIX systems such as Linux, fork(2) or the suitable clone(2) equivalent is the only way to create a process, since vfork(2) is obsolete. (There are a few weird exceptions that you should generally ignore: the kernel itself creates some processes, such as /sbin/init.)
The problem is that to run the main function of a standardly linked executable, you need to call execve, and exec replaces the whole process image and so you need a new address space, which is what fork is for.
You can get around this by having your callee expose its main functionality in a shared library (but then it must not be called main), and then you can load the function with the main functionality without having to fork (provided there are no symbol conflicts).
That would be a more efficient alternative to system (basically with the efficiency of a function call).
Now popen involves pipes and to use pipes you need to have the pipe ends in different schedulable units. Threads, which use the same address space, can be used here as a lighter alternative to separate processes.
As you alluded to, fork() is a bit of a mad syscall that has kind of stuck around for historical reasons. There's a great article about its flaws here, and also this post goes into some details and potential workarounds.
Although on Linux fork() is optimised to use copy-on-write for the memory, it's still not "free" because:
It still has to do some memory-related admin (new page tables, etc.)
If you're using RAII (e.g. in C++ or possibly Rust) then all the objects that are copied will be cleaned up twice. That might even lead to logic errors (e.g. deleting temporary files twice).
It's likely that the parent process will keep running, probably modifying lots of its memory, and then it will have to be copied.
The alternatives appear to be:
vfork()
clone()
posix_spawn()
vfork() was created for the common use case of doing fork() and then execve() to run a program. execve() replaces all of the memory of the current process with a new set, so there's no point copying the parent process's memory if you're just about to obliterate it.
So vfork() doesn't do that. Instead it runs in the same memory space as the parent process and pauses it until it gets to execve(). The Linux man page for vfork() says that doing just about anything except vfork() then execve() is undefined behaviour.
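A sketch of the only pattern that is considered safe with vfork(): the child does nothing except execve() or, if that fails, _exit(). The name run_with_vfork is hypothetical, and the _DEFAULT_SOURCE define is an assumption about glibc's feature-test macros.

```c
#define _DEFAULT_SOURCE          /* assumption: glibc needs this to declare vfork() */
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: run the program at `path` and wait for it.
 * Returns the exit code, or -1 on failure or abnormal termination. */
int run_with_vfork(const char *path, char *const argv[])
{
    assert(path != NULL && argv != NULL);

    pid_t pid = vfork();         /* parent is suspended until the child
                                    calls execve() or _exit() */
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* The child borrows the parent's memory: touch nothing here. */
        execve(path, argv, environ);
        _exit(127);              /* never exit() or return from a vfork child */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```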
posix_spawn() is basically a nice wrapper around vfork() and then execve().
clone() is similar to fork() but allows you to exactly specify what is copied (file descriptors, memory, etc.). It has a load of options, including one (CLONE_VM) which lets the child process run in the same address space as the parent, which is pretty wild! I guess that is the lightest weight way to make a new process because it doesn't involve any copying of memory at all!
But in practice I think in most situations you should either:
Use threads, or
Use posix_spawn().
(Note, I am just researching this now; I'm not an expert so I might have got some things wrong.)
I've been looking at creating Unix dæmons, and there seem to be two methods. The long-winded one, which seems to come up when searching, is to call fork(), setsid(), fork() again, chdir() to somewhere safe, set umask() and, finally, close() stdin, stdout and stderr.
Running man daemon, however, brings up information on a daemon() function, which seems to do all the same stuff as above. Are there any differences between the two approaches or is daemon() just a convenience function that does the same thing as the long-winded method? Is either one better, especially for a novice C programmer?
The daemon function is not defined in POSIX, so its implementation (if any) could behave differently on different platforms.
On Linux with glibc, daemon only does one fork, optionally chdirs (but only to /, you can't specify a path), does not touch umask, and does not close the std* descriptors (it optionally reopens them to /dev/null though). (source)
So it depends on the platform, and at least one implementation does less than what you do. If you need all of what you're doing, stick with that (or stick to a platform where the daemon function does exactly that).
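For reference, the long-winded sequence from the question can be sketched as below. The name daemonize is hypothetical, and there is one deliberate deviation: instead of only closing descriptors 0, 1 and 2, the sketch points them at /dev/null so that a later open() cannot land on one of those numbers. The choice of / as the directory and 0 as the umask is also an assumption.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 0 in the resulting daemon process and -1 on failure.
 * The original process and the intermediate child exit inside. */
int daemonize(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid > 0)
        _exit(0);                /* original process exits */

    if (setsid() < 0)            /* new session, no controlling terminal */
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid > 0)
        _exit(0);                /* session leader exits, so the daemon can
                                    never reacquire a controlling terminal */

    if (chdir("/") < 0)          /* do not pin any mounted directory */
        return -1;
    umask(0);

    int fd = open("/dev/null", O_RDWR);
    if (fd < 0)
        return -1;
    if (dup2(fd, STDIN_FILENO) < 0 || dup2(fd, STDOUT_FILENO) < 0 ||
        dup2(fd, STDERR_FILENO) < 0)
        return -1;
    if (fd > STDERR_FILENO)
        close(fd);
    return 0;
}
```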
Note that daemon does not conform to any standard. It is better to use standard-conforming functions (like the POSIX-defined fork and setsid).
The daemon call summarizes the long-winded fork procedure, and I don't recall any implementation that does anything more.
Since daemon() is a high-level concept, it's definitely to be preferred for novice and experienced programmers.
I'm not sure if the title accurately describes what I want to do but here's the rub:
We have a large and hairy codebase (not-invented-here courtesy of Elbonian Code Slaves) which currently compiles as one big binary which internally creates several pthreads for various specific tasks, communicating through IPC messages.
It's not ideal for a number of reasons, and several of the threads would be better as independent autonomous processes as they are all individual specific "workers" rather than multiple instances of the same piece of code.
I feel a bit like I'm missing some trick, is our only option to split off the various thread code and compile each as a standalone executable invoked using system() or exec() from the main blob of code? It feels clunky somehow.
If you want to take a part of your program that currently runs as a thread, and instead run it as a separate process launched by your main program, then you have two main options:
Instead of calling pthread_create(), fork() and in the child process call the thread-start function directly (do not use any of the exec-family functions).
Compile the code that the thread executes as a separate executable. Launch that executable at need by the standard fork / exec sequence. (Or you could use system() instead of fork/exec, but don't. Doing so needlessly brings the shell into it, and also gives you much less control.)
The former has the disadvantage that each process image contains a lot of code that it will never use, since each is a complete copy of everything. Inasmuch as under Linux fork() uses copy-on-write, however, that's mostly an address-space issue, not a resource-wastage issue.
The latter has the disadvantage that the main program needs to be able to find the child programs on the file system. That's not necessarily a hard problem, mind you, but it is substantially different from already having the needed code at hand. If there is any way that any of the child programs would be independently useful, however, then breaking them out as separate programs makes a fair amount of sense.
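The first option can be sketched as a drop-in counterpart to pthread_create(). The names spawn_worker and example_worker are hypothetical, and ignoring the start routine's return value is an assumption made for brevity.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork and run a pthread-style start routine in the child.
 * Returns the child's pid in the parent, or -1 if fork() failed. */
pid_t spawn_worker(void *(*start)(void *), void *arg)
{
    assert(start != NULL);

    pid_t pid = fork();
    if (pid == 0) {
        start(arg);              /* same entry point the thread used */
        _exit(0);                /* skip the parent's atexit handlers and
                                    do not flush its stdio buffers twice */
    }
    return pid;
}

/* Example start routine: writes one byte to the pipe descriptor that
 * arg points to, standing in for real worker code. */
void *example_worker(void *arg)
{
    int fd = *(int *)arg;
    if (write(fd, "w", 1) != 1)
        _exit(1);
    return NULL;
}
```

The parent still has to reap each worker with waitpid(), where it previously called pthread_join().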
Do note, by the way, that I do not in general accept your premise that it is inappropriate to implement specific for-purpose workers as threads. If you want to break out such tasks, however, then the above are your available alternatives.
Edited to add:
As @EOF pointed out, if you intend that after the revamp your main process will still be multi-threaded (that is, if you intend to convert only some threads to child processes) then you need to be aware of a significant restriction placed by POSIX:
If a multi-threaded process calls fork(), [...] to avoid errors, the child process may only execute async-signal-safe operations until such time as one of the exec functions is called.
On the other hand, I'm pretty sure the relevant definition of "multi-threaded" is that the process has multiple live threads at the time fork() is called. It should not present a problem if the child processes are all forked off before any additional threads are created, or after all but one thread is joined.
I'm writing a program that spawns child processes. For security reasons, I want to limit what these processes can do. I know of security measures from outside the program such as chroot or ulimit, but I want to do something more than that. I want to limit the system calls done by the child process (for example preventing calls to open(), fork() and such things). Is there any way to do that? Optimally, the blocked system calls should return with an error but if that's not possible, then killing the process is also good.
I guess it can be done with ptrace() but from the man page I don't really understand how to use it for this purpose.
It sounds like SECCOMP_FILTER, added in kernel version 3.5, is what you're after. The libseccomp library provides an easy-to-use API for this functionality.
By the way, chroot() and setrlimit() are both system calls that can be called within your program - you'd probably want to use one or both of these in addition to seccomp filtering.
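A sketch of a seccomp filter written against the raw kernel interface rather than libseccomp, so it needs no extra library. It makes one chosen system call fail with EPERM and allows everything else. The function names are hypothetical. A real filter should also check the arch field of struct seccomp_data, which is omitted here for brevity, and would normally use an allow-list rather than a single denied call.

```c
#define _GNU_SOURCE              /* for syscall() */
#include <errno.h>
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Make syscall number `nr` fail with EPERM in the calling process.
 * Returns 0 on success, -1 if the filter could not be installed. */
int deny_syscall(int nr)
{
    struct sock_filter filter[] = {
        /* Load the syscall number. */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
        /* If it matches, fall through to ERRNO; otherwise skip to ALLOW. */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, (unsigned int)nr, 0, 1),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ERRNO | (EPERM & SECCOMP_RET_DATA)),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len = (unsigned short)(sizeof(filter) / sizeof(filter[0])),
        .filter = filter,
    };

    /* Required so an unprivileged process may install a filter. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;
    if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog) != 0)
        return -1;
    return 0;
}

/* Demo: deny getppid() in a forked child. Returns 1 if the call failed
 * with EPERM, 0 if it still succeeded, -1 if the filter or the fork
 * was not available. */
int getppid_is_denied_in_child(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (deny_syscall(SYS_getppid) != 0)
            _exit(2);
        errno = 0;
        long r = syscall(SYS_getppid);
        _exit((r == -1 && errno == EPERM) ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    switch (WEXITSTATUS(status)) {
    case 0:  return 1;
    case 1:  return 0;
    default: return -1;
    }
}
```

In a spawner, deny_syscall() would be called in the child between fork() and exec; an installed filter stays in force across execve().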
If you want to do it the ptrace way, you have some options (and some are really simple). First of all, I recommend following the tutorial explained here. With it you can learn how to find out which system calls are being made, along with the basics of ptrace (don't worry, it's a very short tutorial). The options (that I know of) are the following:
The easiest one would be to kill the child, that is this exact code here.
Secondly, you could make the child fail, just by changing the registers with PTRACE_SETREGS and putting wrong values in them; you can also change the return value of the system call if you want (again, with PTRACE_SETREGS).
Finally, you could skip the system call. For that you need to know the address of the instruction after the system call, and make the instruction pointer register point there (again, with PTRACE_SETREGS).