Just wondering if it's possible to execute another program in a thread and send information to it and get information back from it. Essentially the same concept as a child process that communicates over pipes - however I don't want to use fork.
I can't seem to find whether it's possible to do this, any help would be appreciated.
Thanks
You cannot use the exec family of functions to load another executable file within a thread; the exec functions replace the entire process image, including every thread in it, with the program loaded from the executable. Thus fork() is necessary if you want your original process to keep running.
In theory you could replicate most of the behaviour of the exec system call in userspace, and run an executable within a thread - but as the thread would share the open file table, signal handlers and so on with the rest of the process, it would likely destructively interfere with the main process. It would also be a lot of work.
If you're not using fork (directly or indirectly), then it's not really another process. Of course, you can communicate between threads within a process. That's essential to most multithreading.
My question is about more philosophical than technical issues.
The objective is to write a multiprocess (not multithreaded) program with one "master" process and N "worker" processes. The program is a Linux-only, asynchronous, event-based web server, like nginx. So the main problem is how to spawn the "worker" processes.
In the Linux world there are two ways:
1). fork()
2). fork() + exec*() family
A short description of each way and what confuses me about each of them.
The first way, plain fork(), is dirty, because the forked process gets a copy (copy-on-write, I know) of the parent's memory: signal handlers, variables, file and socket descriptors, the environment and everything else, e.g. stack and heap. So after fork I need to... hmm... "clear memory": for example, reset signal handlers, close socket connections and undo other horrible things inherited from the parent, because the child holds a lot of data that was never intended for it. That breaks encapsulation and makes many side effects possible.
The general approach in this case is to run an infinite loop in the forked process to handle data, and to do some magic with a socket pair, pipes or shared memory to create a communication channel between parent and child around the fork(), because the descriptors stay open in the child and refer to the same sockets as in the parent.
Also, this is the nginx way: it has one executable binary that uses fork() to spawn child processes.
The second way is similar to the first, except that the child calls one of the exec*() functions after fork() to run an external binary. The important thing is that exec*() loads the binary into the current (forked) process, automatically discards the old stack and heap and does all the other nasty work, so the forked process looks like a clean new instance of a program, without a copy of the parent's memory or any other trash.
This raises another problem with establishing communication between parent and child: because the forked process drops all data inherited from the parent after exec*(), I need to somehow create a socket pair between parent and child. For example, create an additional listening socket (a Unix domain socket, or one on another port) in the parent and wait for child connections; the child then connects to the parent after initialization.
The first way is simple, but it bothers me that the result is not a clean process, just a copy of the parent's memory with many possible side effects and trash, and I need to keep in mind that the forked process depends heavily on the parent's code. The second way takes more time because two binaries must be maintained, and it is not as elegant as a single-file solution. Maybe the best way is to use fork() to create the process and something else to clear its memory without calling exec*(), but I can't find any solution for the second step.
In conclusion, I need help deciding which way to use: create a single executable like nginx and use fork(), or create two separate files, a "server" and a "worker", and call fork() + exec*(worker) N times from the "server". I'd like to know the pros and cons of each way; maybe I missed something.
For a multiprocess solution, both options, fork and fork+exec, are almost equivalent; the choice depends on the context of the child and parent processes. If the child executes the parent's text (binary) and needs all or part of the parent's stuff (descriptors, signals etc.), that is a sign to use fork. If the child should execute a new binary and needs nothing from the parent, fork+exec seems much more suitable.
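For the fork-only case, here is a minimal sketch of a master that creates a socket pair before fork() and keeps one end per worker. The names spawn_worker and worker_loop are made up for illustration, and the upper-casing "work" is only a stand-in for real request handling.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical worker body: echo every message back in upper case until
 * the master closes its end of the socket pair. */
static void worker_loop(int fd)
{
    char buf[256];
    ssize_t n;

    while ((n = read(fd, buf, sizeof buf)) > 0) {
        for (ssize_t i = 0; i < n; i++)
            if (buf[i] >= 'a' && buf[i] <= 'z')
                buf[i] -= 'a' - 'A';
        if (write(fd, buf, (size_t)n) != n)
            break;
    }
}

/* Fork one worker. Returns the master's end of the socket pair and stores
 * the worker's pid in *pid_out, or returns -1 on error. */
int spawn_worker(pid_t *pid_out)
{
    int sv[2];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {
        close(sv[0]);          /* the child keeps only its own end */
        worker_loop(sv[1]);
        _exit(0);              /* never return into the master's code */
    }
    close(sv[1]);              /* the master keeps only its own end */
    *pid_out = pid;
    return sv[0];
}
```

The master would call spawn_worker() N times, talk to each worker through the returned descriptor, and close that descriptor to make the worker exit. The same descriptor would also survive an exec*() in the child unless it is marked close-on-exec, so a fork+exec worker could inherit the channel instead of connecting back to the master.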
There is also a useful function in the pthread library: pthread_atfork().
It lets you register handlers that are called before and after fork.
These handlers may perform all the necessary work (closing file descriptors, for example).
As a Linux programmer, you have a rich set of multithreading facilities available. Look at pthread and friends.
If you need a process per request, then fork and friends have been the most widely used since time immemorial.
I would like to know if there is any good way to execute an external command in a Linux environment from C without using system(), popen(), fork() or exec().
The reason I cannot use these functions is that my main application has used up most of the system resources (i.e. memory) on my embedded board. If I do a fork, the board won't be able to create a duplicate of my main application. From what I read in a book, both system() and popen() actually use fork() underneath, so I cannot use them either.
The only idea I currently have is to create a process before I run my main application and use IPC (a pipe or socket) to tell the new process which external commands it needs to run with system() or popen(), and have it return the results to my application when it is done.
You cannot do this. Linux creates a new process with a call to fork() followed by a call to exec(). No other way of creating a process exists.
But fork() itself is quite efficient. It uses copy-on-write for the child process, so fork() does not copy memory until it is really needed. So if you call exec() right after fork(), your system won't eat too much memory.
UPD. I was wrong about process creation. In fact, there is the clone() call, which fork() uses internally. This call provides more control over process creation, but it can be complicated to use. Read man 2 fork and man 2 clone for more information.
I'm going to write a program in which the main thread creates a new thread and then the new thread creates a child process. Since I have a hard time keeping track of the new thread and the forked process, I'd like some advice from someone experienced.
My question is
1. Does a process created in a thread start executing the code after pthread_create?
2. If not, where does the forked process start from when fork is called in a thread?
Thank you for reading my question.
Some of this is a bit OS-dependent, as different systems have different POSIX thread implementations and this can expose internals.
POSIX offers pthread_atfork as a somewhat blunt instrument for dealing with some of the issues, but it still looks pretty messy to me.
If your system uses a one-to-one map between "user land thread" and "kernel thread" using clone or rfork to achieve proper user-space sharing of data between threads, then fork will merely duplicate the (single) thread that calls it. However, if your system has a many-to-many style mapping (so that one user process is handling multiple threads, at least before they enter into blocking syscalls), fork may internally duplicate multiple threads. POSIX says it should look like it only duplicated one thread, so that's not supposed to be visible, but I'm not sure how well all systems implement this.
There's some general advice at http://www.linuxprogrammingblog.com/threads-and-fork-think-twice-before-using-them (Linux-centric, obviously, but still useful).
Is there some particular reason you want to fork inside a thread but not exec? In general, if you just want to run more code in parallel, you just spin off yet another thread (i.e., once you choose to run any threads, you do everything in threads, except if you have to fork for exec; if the exec fails, just _exit).
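To make the starting point concrete, here is a minimal sketch, assuming a POSIX system where fork() duplicates only the calling thread. The names fork_inside_thread and thread_main are made up for illustration. The child does not restart at pthread_create or at the top of the thread function; it continues from the return of fork() as the only thread in the new process.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int report_fd[2];           /* pipe: child -> parent */

static void *thread_main(void *arg)
{
    int steps = 0;

    (void)arg;
    steps++;                       /* executed once, before the fork */

    pid_t pid = fork();
    if (pid == 0) {
        /* Child: resumes here with a copy of this thread's stack.
         * The main thread does not exist in this process. */
        steps++;
        if (write(report_fd[1], &steps, sizeof steps) != (ssize_t)sizeof steps)
            _exit(1);
        _exit(0);
    }
    if (pid > 0)
        waitpid(pid, NULL, 0);     /* parent side of the same thread */
    return NULL;
}

/* Returns the step count reported by the child, or -1 on error. */
int fork_inside_thread(void)
{
    pthread_t t;
    int steps = -1;

    if (pipe(report_fd) == -1)
        return -1;
    if (pthread_create(&t, NULL, thread_main, NULL) != 0) {
        close(report_fd[0]);
        close(report_fd[1]);
        return -1;
    }
    pthread_join(t, NULL);
    close(report_fd[1]);
    if (read(report_fd[0], &steps, sizeof steps) != (ssize_t)sizeof steps)
        steps = -1;
    close(report_fd[0]);
    return steps;
}
```

fork_inside_thread() returns 2: the child inherited steps == 1 from before the fork and incremented its own copy once, which shows that it picked up exactly where fork() returned.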
Hi all I am new to C so sorry if I am very lost. I am having trouble with this multi-threaded web server I am trying to create. I am attempting to...
have a thread create a new thread
have that new thread execute execvp() to call a different C program on my machine
have that new thread return streams of data from the execvp()
I was thinking about using pthreads to spawn a new process to run execvp() and have it return the data through a pipe. But is that even necessary? Don't pthreads share memory?
Also, I was maybe thinking about using fork() instead of a pthread and have the child send data back to the parent through a pipe.
Can you please help guide me in the correct direction.
What you're looking for is a combination of fork(), one of the exec functions, and pipe() (or maybe socketpair() or something, but pipes work too).
Threads share memory, but execvp() does not start anything alongside the caller: it replaces the calling process, including all of its threads, with the new program. The old address space is discarded, and even if the new program could see the old memory, it wouldn't know how to use it.
The proper way is to open a pipe when you still have one process, fork() into two processes (parent and child), and have the child call execvp(). The child can now write into its end of the pipe, and the parent can read from the other end.
Remember to wait() for the child to end.
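The sequence described above (pipe, then fork, then execvp in the child, then read and wait in the parent) can be sketched as follows. The function name run_and_capture and the choice to redirect the child's standard output into the pipe are assumptions made for this example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the program named by argv[0] and capture its standard output in buf
 * (always NUL-terminated, truncated to cap - 1 bytes).
 * Returns the child's exit status, or -1 on error. */
int run_and_capture(char *const argv[], char *buf, size_t cap)
{
    int fds[2];

    if (cap == 0 || pipe(fds) == -1)       /* open the pipe before forking */
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);                     /* child only writes */
        if (dup2(fds[1], STDOUT_FILENO) == -1)
            _exit(127);
        close(fds[1]);
        execvp(argv[0], argv);
        _exit(127);                        /* reached only if exec failed */
    }

    close(fds[1]);                         /* parent only reads */
    size_t used = 0;
    char discard[256];
    ssize_t n;
    for (;;) {
        if (used < cap - 1)
            n = read(fds[0], buf + used, cap - 1 - used);
        else
            n = read(fds[0], discard, sizeof discard);  /* drain overflow */
        if (n <= 0)
            break;
        if (used < cap - 1)
            used += (size_t)n;
    }
    buf[used] = '\0';
    close(fds[0]);

    int status;
    if (waitpid(pid, &status, 0) == -1)    /* reap the child */
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In a multi-threaded server, each worker thread could call a function like this; only the forked child is replaced by execvp(), while the server and its threads keep running.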
Have you written your non-blocking, single-threaded web-server yet? How would you expect to measure the benefits of multithreading if you don't have something to compare it against? It's far easier to determine where the best performance gains are if you expose a single-threaded project to concurrency, than it is to guess and suffer with a poor framework for the rest of the project's life.
Creating threads is easy, but you really need to read the pthread_create manual first. How else can you trust that your project is handling errors correctly? I also suggest reading about the other pthread functionality. I'm happy to help you resolve issues if you show me that you're trying to resolve them yourself, by the way. I won't bother spoonfeeding you.
As mentioned by aaaaaa123456789, you wouldn't want to spawn using pthread_create/execvp, as this would replace your entire program (including all of your threads) with the new program.
On an embedded platform (with no swap partition), I have an application whose main process occupies most of the available physical memory. The problem is that I want to launch an external shell script from my application, but using fork() requires that there be enough memory for 2x my original process before the child process (which will ultimately execl itself to something much smaller) can be created.
So is there any way to invoke a shell script from a C program without incurring the memory overhead of a fork()?
I've considered workarounds such as having a secondary smaller process which is responsible for creating shells, or having a "watcher" script which I signal by touching a file or somesuch, but I'd much rather have something simpler.
Some UNIX implementations will give you a vfork (it was part of older versions of the Single UNIX Specification and was removed in POSIX.1-2008), which is like fork except that the child shares the parent's address space and the parent is suspended until the child calls exec or _exit.
With vfork, there are a very limited number of things you can do in the child before calling exec to overwrite the address space with another program - that's basically what vfork was built for, a minimal copy version of fork for the fork/exec sequence.
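A minimal sketch of the vfork/exec sequence, assuming the target provides vfork() and has a shell at /bin/sh. The function name run_script_vfork is made up for illustration. The child does nothing except call execl() or _exit().

```c
#define _DEFAULT_SOURCE            /* makes glibc declare vfork() */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run a shell script through /bin/sh without duplicating the caller's
 * address space. Returns the script's exit status, or -1 on error. */
int run_script_vfork(const char *path)
{
    pid_t pid = vfork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* Child: borrows the parent's memory, so touch nothing here. */
        execl("/bin/sh", "sh", path, (char *)NULL);
        _exit(127);                /* exec failed; never return or call exit() */
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The parent does not run again until the child has called execl() or _exit(), which is why the child must not modify variables or return from the function.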
If your system has an MMU, then usually fork() is implemented using copy-on-write, which doesn't actually allocate more memory at the time fork() is called. Additional memory would only be allocated if you write to any of the pages shared with the parent process. An exec() would then discard those pages.
If you know you don't have an MMU, then perhaps fork() is indeed implemented using an actual copy. Another approach might be to have a helper process that is responsible for spawning subshells, which you communicate with using a pipe.
I see you've already accepted an answer, but you may want to read about posix_spawn and use it if it's available on your target:
http://www.opengroup.org/onlinepubs/9699919799/functions/posix_spawn.html
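A minimal sketch of launching a script with posix_spawn, assuming a shell at /bin/sh; the function name run_script_spawn is made up for illustration. Whether this actually avoids the memory cost of fork depends on how the C library on your target implements posix_spawn, so it is worth checking there.

```c
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Run a shell script through /bin/sh using posix_spawn().
 * Returns the script's exit status, or -1 on error. */
int run_script_spawn(const char *path)
{
    pid_t pid;
    /* posix_spawn() takes a non-const argv but does not modify it. */
    char *const argv[] = { "sh", (char *)path, NULL };

    /* No file actions and no spawn attributes: pass NULL for both.
     * posix_spawn() returns an error number, not -1, on failure. */
    if (posix_spawn(&pid, "/bin/sh", NULL, NULL, argv, environ) != 0)
        return -1;

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The file-actions argument (left NULL here) is the place to redirect descriptors in the child, for example with posix_spawn_file_actions_adddup2().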
It sounds as if the prudent move in this case is to port your shell script (if possible) to C and execute it within the process, so you don't have to fork at all.
Then again; I don't know what you are actually trying to do.
Instead of forking your process to spawn a shell, launch a shell within your process (in foreground) then fork it within the shell.
system("/bin/ash /scripts/bgtask");
with /scripts/bgtask being:
/bin/ash /scripts/propertask &
This way you double only the memory used by the shell, not by the main program. Your main program goes busy for duration of spawning the two shells: original to start bgtask and the background clone launched by it, then the memory allocated by the first shell is free again.