My university professor has asked me to develop a project in C for Unix machines.
I need to write a soccer championship emulator with one parent process and several child processes (one child for every match between two teams).
The parent must create the matches, and the matches must tell the end result to the parent.
I think the best thing to do is to use fork() syscall and unnamed pipes.
What do you think?
Thanks
Your suggestion above is valid. That approach would work. It might be easier to use a chunk of shared memory and mutex instead, but it's ultimately your call. I've included a working example that uses pthread_mutex calls and mmap in the references below that should get you up and running. Good luck!
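For comparison, here is a minimal sketch of the fork() and unnamed pipe approach from the question. The names play_match and struct match_result are made up for illustration, and the "simulation" is a placeholder that derives a score from a seed; a real emulator would fork all the matches first and collect the results afterwards, rather than running one match at a time as this sketch does.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical record a match process sends back to the parent. */
struct match_result {
    int home_goals;
    int away_goals;
};

/* Plays one match in a child process and reports the score through an
 * unnamed pipe. Returns 0 on success, -1 on failure. */
int play_match(unsigned seed, struct match_result *out)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {                      /* child: the match */
        close(fds[0]);                   /* child only writes */
        /* Placeholder simulation: the score is derived from the seed. */
        struct match_result r = { (int)(seed % 5), (int)((seed / 5) % 5) };
        ssize_t n = write(fds[1], &r, sizeof r);
        close(fds[1]);
        _exit(n == (ssize_t)sizeof r ? 0 : 1);
    }

    /* parent: read the result, then reap the child */
    close(fds[1]);                       /* parent only reads */
    ssize_t n = read(fds[0], out, sizeof *out);
    close(fds[0]);

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (n != (ssize_t)sizeof *out || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return 0;
}
```

Each side closes the pipe end it does not use, so the parent's read() sees end-of-file if the child dies before writing anything.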
References
C procs, fork(), and mutexes, Accessed 2014-04-29, <https://stackoverflow.com/questions/19172541/procs-fork-and-mutexes>
Related
I'm not sure if the title accurately describes what I want to do but here's the rub:
We have a large and hairy codebase (not-invented-here courtesy of Elbonian Code Slaves) which currently compiles as one big binary which internally creates several pthreads for various specific tasks, communicating through IPC messages.
It's not ideal for a number of reasons, and several of the threads would be better as independent autonomous processes as they are all individual specific "workers" rather than multiple instances of the same piece of code.
I feel a bit like I'm missing some trick. Is our only option to split off the various thread code and compile each as a standalone executable invoked using system() or exec() from the main blob of code? It feels clunky somehow.
If you want to take a part of your program that currently runs as a thread, and instead run it as a separate process launched by your main program, then you have two main options:
Instead of calling pthread_create(), fork() and in the child process call the thread-start function directly (do not use any of the exec-family functions).
Compile the code that the thread executes as a separate executable. Launch that executable at need by the standard fork / exec sequence. (Or you could use system() instead of fork/exec, but don't. Doing so needlessly brings the shell into it, and also gives you much less control.)
The former has the disadvantage that each process image contains a lot of code that it will never use, since each is a complete copy of everything. Inasmuch as under Linux fork() uses copy-on-write, however, that's mostly an address-space issue, not a resource-wastage issue.
The latter has the disadvantage that the main program needs to be able to find the child programs on the file system. That's not necessarily a hard problem, mind you, but it is substantially different from already having the needed code at hand. If there is any way that any of the child programs would be independently useful, however, then breaking them out as separate programs makes a fair amount of sense.
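The first option can be sketched roughly as follows. The names spawn_worker and example_worker are hypothetical; example_worker stands in for whatever function is currently passed to pthread_create(), and the convention that a NULL return means success is an assumption made only so the parent has something to observe.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Same signature pthread_create() expects for a start routine. */
typedef void *(*start_routine)(void *);

/* Runs start(arg) in a forked child instead of a new thread.
 * Returns the child's PID in the parent, or -1 if fork() fails.
 * Assumed convention: a NULL return from start means success. */
pid_t spawn_worker(start_routine start, void *arg)
{
    pid_t pid = fork();
    if (pid == 0) {
        void *result = start(arg);       /* no exec: reuse the code already loaded */
        _exit(result == NULL ? 0 : 1);   /* _exit avoids flushing the parent's stdio twice */
    }
    return pid;
}

/* Hypothetical worker standing in for an existing thread-start function. */
void *example_worker(void *arg)
{
    int job = *(int *)arg;
    return job >= 0 ? NULL : arg;        /* negative job id is treated as a failure */
}
```

The parent would then collect each worker with waitpid() on the returned PID, much as it would have called pthread_join() before.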
Do note, by the way, that I do not in general accept your premise that it is inappropriate to implement specific for-purpose workers as threads. If you want to break out such tasks, however, then the above are your available alternatives.
Edited to add:
As @EOF pointed out, if you intend that after the revamp your main process will still be multi-threaded (that is, if you intend to convert only some threads to child processes) then you need to be aware of a significant restriction placed by POSIX:
If a multi-threaded process calls fork(), [...] to avoid errors, the child process may only execute async-signal-safe operations until such time as one of the exec functions is called.
On the other hand, I'm pretty sure the relevant definition of "multi-threaded" is that the process has multiple live threads at the time fork() is called. It should not present a problem if the child processes are all forked off before any additional threads are created, or after all but one thread is joined.
I'm trying to run an application in C, but the only way I could find that is reasonably easy to use works like this:
system("command here");
It works, of course, but it's really slow (especially when repeating this a lot). I'm just wondering if there is a way of running a program without having to interact with a shell, something like python's subprocess module.
I have heard of execl, and I would use that (forking it first, of course), but I'm wondering if there is a simpler way that wouldn't require forking first.
EDIT: I also want to be able to know the return code of the program
As I'm sure you already know, system already employs the fork/exec strategy. I understand you want to circumvent the shell and are looking for a simple approach; I'm just saying you could just as easily write a function to wrap the fork/exec pattern as is done in system. Indeed it would probably be most straightforward to just do that. An alternative, as Gabe mentioned in the comments, is posix_spawn.
A faster alternative is vfork() / exec, but it is generally discouraged and is obsolete in the latest POSIX standards:
4.3BSD; POSIX.1-2001 (but marked OBSOLETE). POSIX.1-2008 removes the specification of vfork().
It's meant to be immediately followed by an exec or _exit. Otherwise all kinds of weird bugs can arise since the virtual memory pages and page tables aren't duplicated (child uses same data/heap/stack segments). The parent/calling process blocks until the child execs or _exits. Regular fork's modern implementations have copy-on-write semantics which approach the speed of vfork, without the potential bugs incurred by vfork's memory sharing semantics.
If you want even further control over memory-sharing semantics and process inheritance, and the consequent potential speed-up (and are on Linux), look into clone() (wrapper for system-call sys_clone()) which is what some process-creating system calls delegate their work to. Be sure to carefully comb over all of the various flags.
You can use waitpid to get the exit status of the process.
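A wrapper along those lines might look like the sketch below. The name run_program is hypothetical, and returning 127 when the exec fails is an assumption borrowed from the shell convention, not something the functions themselves require.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs argv[0] with a NULL-terminated argument vector, without a shell.
 * Returns the program's exit code, or -1 if the child could not be
 * created or did not exit normally (for example, killed by a signal). */
int run_program(char *const argv[])
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        execvp(argv[0], argv);   /* only returns on failure */
        _exit(127);              /* assumed convention: 127 means "could not exec" */
    }

    int status;
    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR)      /* retry only if interrupted by a signal */
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A call such as run_program((char *[]){"ls", "-l", NULL}) would then run ls directly and hand back its exit code.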
If neither system() nor popen() provides the mechanism you need, then the easy way to do it is with fork() and execv() (or, perhaps, execl(), but the argument list must be fixed at compile time, not variable, to use it). Really! It is not hard to do fork() and exec(), and any alternative will encapsulate that processing.
The Python subprocess module is simply hiding fork() and exec() for you behind a convenient interface. That's probably appropriate for a high-level language like Python. C is a lower-level language and doesn't really need the complexity.
The hard way to do it is with posix_spawn(). You have to create arguments to describe all the actions you want done in the child between the fork() and the exec(), which is far harder to set up than it is to simply do the fork(), make the changes, and then use exec() after all. This (posix_spawn()) is what you get when you design the code to spawn a child process without visibly using fork() and exec() and ensure that it can handle almost any reasonable circumstance.
You'll need to consider whether you need to use wait() or waitpid() or a variant to determine when the child is complete. You may need to consider whether to handle the SIGCHLD signal (which will notify you when a child dies).
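For comparison, here is a rough sketch of the posix_spawn() route. The name spawn_quiet is hypothetical. It uses posix_spawnp() (the PATH-searching variant) and a single file action that redirects the child's standard output to /dev/null, to show how the work normally done between fork() and exec() has to be described up front.

```c
#include <fcntl.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Spawns argv[0] with stdout redirected to /dev/null and waits for it.
 * Returns the exit code, or -1 on error or abnormal termination. */
int spawn_quiet(char *const argv[])
{
    posix_spawn_file_actions_t actions;
    if (posix_spawn_file_actions_init(&actions) != 0)
        return -1;

    /* Describe the redirection instead of performing it ourselves:
     * "open /dev/null for writing as file descriptor 1 in the child". */
    int err = posix_spawn_file_actions_addopen(&actions, STDOUT_FILENO,
                                               "/dev/null", O_WRONLY, 0);
    pid_t pid = -1;
    if (err == 0)
        err = posix_spawnp(&pid, argv[0], &actions, NULL, argv, environ);

    posix_spawn_file_actions_destroy(&actions);
    if (err != 0)
        return -1;

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With plain fork() the same redirection would be an open() and a dup2() in the child, which is the contrast the answer above is drawing.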
Hi all I am new to C so sorry if I am very lost. I am having trouble with this multi-threaded web server I am trying to create. I am attempting to...
have a thread create a new thread
have that new thread execute execvp() to call a different C program on my machine
have that new thread return streams of data from the execvp()
I was thinking about using pthreads to spawn a new process to run execvp() and have it return the data through a pipe. But is that even necessary? Don't pthreads share memory?
Also, I was maybe thinking about using fork() instead of a pthread and have the child send data back to the parent through a pipe.
Can you please help guide me in the correct direction.
What you're looking for is a combination of fork(), one of the exec functions, and pipe() (or maybe socketpair() or something, but pipes work too).
Threads share memory, but execvp() would create a completely new process replacing the caller process -- and even if this process shared memory with its parent (which I'm not sure it does!), the newly run program wouldn't know how to use that memory.
The proper way is to open a pipe when you still have one process, fork() into two processes (parent and child), and have the child call execvp(). The child can now write into its end of the pipe, and the parent can read from the other end.
Remember to wait() for the child to end.
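Put together, those steps might look something like this sketch. The name capture_output is hypothetical, and reading into a fixed-size buffer is a simplification; a web server would more likely stream the data onward as it arrives.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs argv[0] and copies up to cap-1 bytes of its standard output
 * into buf (always NUL-terminated). Returns the number of bytes
 * stored, or -1 on error. */
ssize_t capture_output(char *const argv[], char *buf, size_t cap)
{
    int fds[2];
    if (cap == 0 || pipe(fds) == -1)     /* open the pipe while there is one process */
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {                      /* child */
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);     /* the new program's stdout is the pipe */
        close(fds[1]);
        execvp(argv[0], argv);
        _exit(127);                      /* reached only if execvp failed */
    }

    /* parent */
    close(fds[1]);                       /* otherwise read() never sees EOF */
    size_t used = 0;
    ssize_t n;
    while (used < cap - 1 &&
           (n = read(fds[0], buf + used, cap - 1 - used)) > 0)
        used += (size_t)n;
    close(fds[0]);
    buf[used] = '\0';

    waitpid(pid, NULL, 0);               /* remember to wait for the child */
    return (ssize_t)used;
}
```

Closing the write end in the parent matters: as long as any process holds it open, the parent's read() will block instead of returning 0 at end of output.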
Have you written your non-blocking, single-threaded web-server yet? How would you expect to measure the benefits of multithreading if you don't have something to compare it against? It's far easier to determine where the best performance gains are if you expose a single-threaded project to concurrency, than it is to guess and suffer with a poor framework for the rest of the project's life.
Creating threads is easy, but you really need to read the pthread_create manual first. How else can you trust that your project is handling errors correctly? I also suggest reading about the other pthread functionality. I'm happy to help you resolve issues if you show me that you're trying to resolve them yourself, by the way. I won't bother spoonfeeding you.
As mentioned by aaaaaa123456789, you wouldn't want to spawn using pthread_create/execvp as this would replace your entire program environment (including all of your threads) with the new process.
Is there something equivalent to SIGSTOP and SIGCONT for threads? I am using pthreads.
Thanks
An edit:
I am implementing a crude form of file access synchronization among threads. So if a file is already opened by a thread, and another thread wants to open it again, I need to halt or pause the second thread at that point of its execution. When the first thread has completed its work, it will check which other threads wanted to use the file it released and "wake" them up. The second thread then resumes execution from exactly that point. I use my own bookkeeping data structures.
I'm going to tell you how to do things instead of answering the question. (Look up the "X Y problem".)
You are trying to prevent two threads from accessing the same file at the same time. In other words, access is MUTually EXclusive. A "mutex" is designed to do this. In general, it is easier to find help if you search for what you are trying to do (prevent two threads from accessing the same resource simultaneously) rather than searching for how you want to do it (make one thread wait for the other).
Edit: It sounds like you actually want many readers but one writer. This is probably the second most common synchronization problem (after the "producer-consumer" problem). Use a pthread_rwlock: readers call pthread_rwlock_rdlock and writers call pthread_rwlock_wrlock.
If you're doing something this sophisticated, you really should start reading the relevant literature. If you think you can do multithreaded programming without some serious reading, you are much smarter than me and you don't need my help. I recommend "The Little Book of Semaphores" which is a free download (source). It's not about pthreads, but it's good stuff. The readers-writers problem you are asking about is found under §4.2 in the chapter "Classical Synchronization Problems" (heck, this problem is even mentioned in the blurb).
Multithreaded programming is HARD with capital letters and a bold font.
Well, there is pthread_kill.
But you almost certainly do not want to do this. What if the other thread holds (e.g.) a mutex for the heap, and you try to call new while it is stopped?
Since you do not know what the runtime is doing with mutexes, there is no way to avoid this kind of problem in general unless you completely avoid the standard library.
[edit]
Actually, come to think of it, I am not sure what happens if you target a specific thread with SIGSTOP, since that signal usually affects the whole process.
So to update my answer, I do not believe there is any standard mechanism for suspending a thread asynchronously... And for the reason mentioned above, I do not think you want one.
Depending on your application, Pthreads supports what can be considered more refined mechanisms, such as http://www.unix.com/man-page/all/3t/pthread_suspend/ and mutex mechanisms.
I am experienced at multithreaded programming in Java and C#, and am starting to learn how to do it in C on Linux. I "grew up" in the programming sense on Linux, so I understand its memory philosophy, process handling, etc. at a high level.
My question is not how to do threading. I would like to know how pthread actually does it. Does it fork a process and handle your interprocess communication for you somehow? Or does it just manage the address space? I want nitty-gritty details :) Googling has only produced "how to do it" questions, not "how it works".
The details are probably too complex to really get into (without posting a link to the glibc source code), but I can give you better things to look up:
Pthread uses sys_clone() to create new threads, which the kernel sees as a new task that happens to share many data structures with other threads.
To do synchronization, pthread relies heavily on futexes in the kernel.
On Linux, both fork() and pthreads use the same syscall, clone(), which creates a new process. The only difference between them is the parameters they pass to clone(): when creating a new thread, it simply makes both processes use the same memory mappings.
Remember, in Linux (and other modern Unixes), memory mappings, stacks, processor state, PIDs, and others are orthogonal features of a process; so you can create a new process with just a new stack and process state (sharing everything else), and call it a thread.
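The sketch below does not call clone() directly; it only shows the observable consequence of the two sets of parameters, using the ordinary fork() and pthread_create() interfaces. The function names are hypothetical. A write made by a forked child lands in the child's private copy of memory, while a write made by a thread lands in the mappings it shares with its creator.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* One global, written once by a child process and once by a thread. */
static int shared_value;

static void *thread_body(void *arg)
{
    (void)arg;
    shared_value = 2;        /* same memory mappings as the creating thread */
    return NULL;
}

/* Returns what the parent sees after a forked child wrote 1, or -1 on error. */
int value_after_fork_write(void)
{
    shared_value = 0;
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        shared_value = 1;    /* copy-on-write: changes the child's copy only */
        _exit(0);
    }
    if (waitpid(pid, NULL, 0) == -1)
        return -1;
    return shared_value;
}

/* Returns what the creator sees after a thread wrote 2, or -1 on error. */
int value_after_thread_write(void)
{
    shared_value = 0;
    pthread_t tid;
    if (pthread_create(&tid, NULL, thread_body, NULL) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;
    return shared_value;
}
```

The first function returns 0 and the second returns 2. Depending on the toolchain, this may need to be built with -pthread.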
Here is the source of pthread.c. This may help you answer your question.