When to use forking or threading? [closed] - c

Closed. This question is opinion-based. It is not currently accepting answers. Closed 8 years ago.
I have the following problem statement:
Implement a function substLinesMany ... . All the specified files should be processed concurrently. If any of the files results in an error, then substLinesMany returns false; otherwise it returns true.
Would you use threading or forking here? (I have to pick one.)

I would use threading over forking. Creating a new thread consumes fewer resources than creating a new process. Threads share the same address space, while forking a process requires creating a new process with a new address space. Given the nature of the function (substituting lines in a file), having a separate address space per file is not necessary.
The only drawback is that there is likely a per-process limit on the number of simultaneously open files, which might be hit when using threads. That is a manageable problem, though.
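To make that concrete, here is a minimal sketch of the threaded approach; substLinesOne() is an assumed helper that substitutes lines in a single file and returns 0 on success (it is not part of the original problem statement):
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

int substLinesOne(const char *path); /* assumed helper: 0 on success */

struct job { const char *path; int err; };

static void *worker(void *arg)
{
    struct job *j = arg;
    j->err = substLinesOne(j->path); /* each thread handles one file */
    return NULL;
}

bool substLinesMany(const char **paths, size_t n)
{
    pthread_t tid[n];     /* C99 VLAs, for brevity */
    struct job jobs[n];
    bool ok = true;

    for (size_t i = 0; i < n; i++) {
        jobs[i] = (struct job){ paths[i], 0 };
        pthread_create(&tid[i], NULL, worker, &jobs[i]);
    }
    for (size_t i = 0; i < n; i++) {
        pthread_join(tid[i], NULL);  /* wait for every file to finish */
        if (jobs[i].err != 0)
            ok = false;              /* any failure means return false */
    }
    return ok;
}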

A different opinion, just in case. Threading seems tempting due to the advantages mentioned by @CraigAnderson.
But don't forget the following facts:
Threading is very difficult. Shared memory means you'll have to protect critical code and data sections with locks, etc. It's a nightmare to debug.
Anecdotal evidence suggests that in most cases your parallel solution will be slower than the sequential one. The reason: cache misses. Memory is orders of magnitude slower than the CPU. If all your threads access all of the process's memory all the time, all your CPUs will spend their time refreshing their caches.
The overhead of fork is much less than you think. Linux is copy-on-write, so the child process starts with the same physical memory pages as the parent. Only changed data will trigger writing new physical pages.
As you compare threading with forking, you implicitly assume a Unix OS. Threading is very useful on non-Unix systems (Windows), where process creation is a huge overhead.
Parallel programs need careful design, where each thread/process works on its own chunk of memory only, to minimize cache misses. So you'll find that using forking and some Unix stream IPC is very efficient, has minimum overhead and is much easier to debug.
The semantics of Unix IPC, especially pipes, provide an excellent and easy-to-use way to communicate between processes. For instance, a read from a worker child's pipe blocks until results are available. Add a select loop in the parent and you have data exchange AND syncing with a simple read (see the sketch after this list).
Threading is less portable than forking. If you work on a multi-core embedded system with the uClibc library instead of glibc, well, uClibc has no threading.
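To illustrate the pipe point above, a minimal sketch of the fork + pipe + select pattern; do_work() is a stand-in for the real worker:
#include <sys/select.h>
#include <unistd.h>

#define NWORKERS 4

static char do_work(int id) { return (char)id; } /* stand-in worker */

int main(void)
{
    int fds[NWORKERS];

    for (int i = 0; i < NWORKERS; i++) {
        int p[2];
        pipe(p);
        if (fork() == 0) {            /* child: work, then report */
            close(p[0]);
            char status = do_work(i);
            write(p[1], &status, 1);  /* the result doubles as "done" */
            _exit(0);
        }
        close(p[1]);                  /* parent keeps the read end */
        fds[i] = p[0];
    }

    fd_set rset;                      /* block until a worker reports */
    FD_ZERO(&rset);
    int maxfd = -1;
    for (int i = 0; i < NWORKERS; i++) {
        FD_SET(fds[i], &rset);
        if (fds[i] > maxfd) maxfd = fds[i];
    }
    select(maxfd + 1, &rset, NULL, NULL, NULL);
    /* ...read from the descriptors still set in rset... */
    return 0;
}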
References:
http://esr.ibiblio.org/?p=6364
http://www.catb.org/esr/writings/taoup/html/ch07s03.html#id2923889
http://web.stanford.edu/~ouster/cgi-bin/papers/threads.pdf
http://shop.oreilly.com/product/9780596000271.do - the Perl Camel Book; its threading chapter starts with 4 pages of warnings and discouragement
https://brendaneich.com/2007/02/threads-suck/
http://blog.codinghorror.com/threading-concurrency-and-the-most-powerful-psychokinetic-explosive-in-the-univ/
As you can see in the references, they are all big names: ESR, Ousterhout, Larry Wall, Brendan Eich, a Stack Overflow founder. Orders of magnitude more intelligent than me, and still scared stiff of threads.


Modern System Architecture?

What could happen if we used Peterson's solution to the critical section problem on a modern computer? It is my understanding that systems with multiple CPUs can run into difficulty because of the ordering of memory reads and writes with respect to other reads and writes in memory, but is this the problem with most modern systems? Are there any advantages to using semaphores VS mutex locks?
Hey, interesting question! To understand what you're asking, you first have to pin down the terms. The critical section is just the part of a program that must not be executed concurrently by more than one of that program's processes or threads at a time. Multiple concurrent accesses are not allowed; only one process may be inside the critical section at a time. Typically this "critical section" accesses a resource like a data structure or a network connection.
Mutual exclusion (from which "mutex" gets its name) describes the requirement that only one concurrent process be in the critical section at a time, so concurrent access to shared data must preserve this "mutual exclusion".
So this introduces the problem! How do we ensure that processes run completely independently of other processes; in other words, how do we ensure "atomic access" to the various critical sections by the threads?
There are a few solutions to the "critical-section problem" but the one you mention is Peterson's solution so we will discuss that.
Peterson's algorithm is designed for mutual exclusion and allows two tasks to share a single-use resource. They use shared memory for communicating.
In the algorithm, two tasks compete for the critical section; you'll have to look into mutual exclusion, bounded waiting and the other properties a bit more for a full understanding, but the gist of it is that in Peterson's method a process waits at most one turn to enter the critical section: it gives priority to the other task, and if that task wants the resource it runs its critical section to completion, thereby allowing the waiting process to enter.
That is the original solution proposed.
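For reference, a sketch of Peterson's algorithm for two tasks; C11 atomics with sequentially consistent ordering are used here as a hedge against the reordering problems discussed next (the textbook version uses plain variables):
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool flag[2]; /* flag[i]: task i wants to enter */
static atomic_int turn;     /* whose turn it is to yield */

void peterson_lock(int id)          /* id is 0 or 1 */
{
    int other = 1 - id;
    atomic_store(&flag[id], true);  /* I want in */
    atomic_store(&turn, other);     /* ...but you may go first */
    while (atomic_load(&flag[other]) && atomic_load(&turn) == other)
        ;                           /* busy-wait at most one turn */
}

void peterson_unlock(int id)
{
    atomic_store(&flag[id], false); /* leaving the critical section */
}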
However, this has no guarantee of working on today's multiprocessing architectures, and it only works for two concurrent tasks. Modern computers make it messy when it comes to reading and writing because they use out-of-order execution: operations the algorithm assumes happen sequentially can be reordered, which breaks its guarantees. I suggest you also take a look at locks. Hope that helps :)
Can anyone else think of anything to add that I might have missed?
It is my understanding that systems with multiple CPUs can run into difficulty because of the ordering of memory reads and writes with respect to other reads and writes in memory, but is this the problem with most modern systems?
No. Any modern systems with "less strict" memory ordering will have ways to make the memory ordering more strict where it matters (e.g. fences).
Are there any advantages to using semaphores VS mutex locks?
Mutexes are typically simpler and faster (in the same way that a boolean is simpler than a counter); but ignoring overhead a mutex is equivalent to a semaphore with "resource count = 1".
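For example, a POSIX semaphore initialised with a count of 1 behaves like a (non-owner-tracking) mutex:
#include <semaphore.h>

int main(void)
{
    sem_t s;
    sem_init(&s, 0, 1); /* count = 1: a binary semaphore */

    sem_wait(&s);       /* "lock": 1 -> 0, or block if already 0 */
    /* ...critical section... */
    sem_post(&s);       /* "unlock": 0 -> 1 */

    sem_destroy(&s);
    return 0;
}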
What could happen if we used Peterson's solution to the critical section problem on a modern computer?
The big problem here is that most modern operating systems support some kind of multi-tasking (e.g. multiple processes, where each process can have multiple threads), there are usually 100 other processes (just for the OS alone), and modern hardware has power management (which tries to reduce power consumption by putting CPUs to sleep when they can't do useful work). This means that (unbounded) spinning/busy waiting is a horrible idea (e.g. you can have N CPUs wasted spinning/trying to acquire a lock while the task that currently holds the lock isn't running on any CPU, because the scheduler decided that 1234 other tasks should get 10 ms of CPU time each).
Instead; to avoid (excessive) spinning you want to ask the scheduler to block your task until/unless the lock actually can be acquired; and (especially for heavily contended locks) you probably want "fairness" (to avoid the risk of timing problems that lead to some tasks being repeatedly lucky while other tasks starve and make no progress).
This ends up being "no spinning", or "brief spinning" (to avoid scheduler overhead in cases where the task holding the lock actually can/does release it quickly); followed by the task being put on a FIFO queue and the scheduler giving the CPU to a different task or putting the CPU to sleep; then, when the lock is released, the scheduler wakes the first task on the FIFO queue. Of course it's never that simple (e.g. for performance you want to do as much as you can in user-space, and you need special care and cooperation between user-space and kernel to avoid race conditions - the lock being released before a task is put on the wait queue).
Fortunately modern systems also provide simpler ways to implement locks (e.g. "atomic compare and swap"), so there's no need to resort to Peterson's algorithm (even if it's just for insertion/removal of tasks from the real lock's FIFO queue).
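A minimal sketch of a compare-and-swap lock using C11 atomics; a real lock would fall back to blocking in the scheduler (e.g. via futex on Linux) instead of spinning forever:
#include <stdatomic.h>

static atomic_int lock_word; /* 0 = free, 1 = held */

void spin_lock(void)
{
    int expected = 0;
    /* atomically: if lock_word == 0, make it 1; otherwise retry */
    while (!atomic_compare_exchange_weak(&lock_word, &expected, 1))
        expected = 0;        /* a failed CAS overwrote expected */
}

void spin_unlock(void)
{
    atomic_store(&lock_word, 0);
}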

Which approach is better for talking between two processes in C? [closed]

Closed. This question is opinion-based. It is not currently accepting answers. Closed 3 years ago.
I have two different applications. I have to inform application_1 from application_2 that application_1 has to do some operation (it should call a function).
I can write a message 'take_action' to a text file from application_2, and application_1 will check that text file for 'take_action' at a regular interval. After seeing 'take_action' it will call the corresponding function and write 'action_taken' back to the text file.
I can use a pipe or shared memory instead of a file.
I can send a signal from application_2 via the kill command (kill -SIGHUP) and call the required function when the SIGHUP signal arrives in application_1.
Sample code for approach 1 and 3 are as follows.
approach_1:
in application_2:
fprintf(fp, "take_action");
in application_1:
int rd = read(filedReading, lineText, 13);
if (rd > 0)
    lineText[rd] = '\0'; /* read() does not NUL-terminate, so strcmp needs this */
if (rd > 0 && strcmp(lineText, "take_action") == 0)
{
    reloadRule(); //calling required function
}
approach_3:
in application_2:
kill(app1_pid, SIGHUP); /* app1_pid must hold application_1's PID; a bare "kill -SIGHUP" has no target */
in application_1:
void sig_handler(int signo)
{
    switch (signo) {
    case SIGUSR1:
        opt_debug = opt_debug ? 0 : 1;
        break;
    case SIGHUP:
        log_msg(LOG_NOTICE, "SIGHUP SIGNAL RECEIVED");
        reloadRule(); //calling required function
        break; /* without this, SIGHUP falls through to cleanexit() */
    default:
        cleanexit(0);
    }
}
Which approach is best for this kind of problem?
Opinion as always:
A temporary file is almost always the wrong thing to do if only 1 machine is involved, and RPC methods exist if you need multiple machines to cooperate.
When it is easy to establish the handle, anonymous pipes work well. You can use socketpair and mkfifo to extend the pipe model to a wider set of scenarios.
Shared memory is the way forward when the pipe bandwidth is an issue. You can get a lot of data down a pipe, but it still involves a number of copies of the memory. Setting up a shared memory pool is a pain, but it gives both processes (almost) direct access to an agreed shared memory area, that is incredibly fast for data transfers. Of course you have to get this set up and there are potential synchronisation issues, but you can use your pipes to easily establish the connection or to synchronise the memory pool at a much lower bandwidth.
Signals are very limiting. You can only easily send a single flag, they all already have established meanings, and what happens when "R" decides to use USR1 and USR2 for memory management, so you can't use your code with "R" programs, etc.? Message queues extend signals to carry a small payload, of course.
If you can use an anonymous pipe, then that's probably the best option.
This is because your ability to use a pipe means that your process instances are intrinsically linked and started together; otherwise it would be hard to open a pipe between them.
If the processes are started together as a pair, they presumably intend to talk to exactly each other and not any other instances of the same programs, and they are probably expected to exit together. Pipes make this very simple, safe, and straight forward.
If the processes were started independently and you wanted to play matchmaker between various instances that weren't started strictly in pairs, then pipes would not have been an option, and sockets would have been a better fit.
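A minimal sketch of the pipe variant of the asker's approach 1, assuming the two applications are started as a fork() pair; reloadRule() stands in for the asker's function:
#include <string.h>
#include <unistd.h>

static void reloadRule(void) { /* stand-in for the asker's function */ }

int main(void)
{
    int p[2];
    pipe(p);

    if (fork() == 0) {          /* application_1 (the child here) */
        close(p[1]);
        char buf[32];
        ssize_t n = read(p[0], buf, sizeof buf - 1); /* blocks: no polling */
        if (n > 0) {
            buf[n] = '\0';
            if (strcmp(buf, "take_action") == 0)
                reloadRule();
        }
        _exit(0);
    }

    close(p[0]);                /* application_2 (the parent here) */
    write(p[1], "take_action", 11);
    return 0;
}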

having database in memory - C

I am programming a server daemon in C from which users can query data. The data can also be modified by clients.
I thought about keeping the data in memory.
For every new connection I do a fork().
My first thought was that this will generate a copy of the db every time a connection takes place, which is a waste of memory.
The second problem is that I don't know how to modify the database in the parent process.
What concepts are there to solve these problems?
Shared memory and multi-threading are two ways of sharing memory between multiple execution units. Check out POSIX Threads for multi-threading, and don't forget to use mutexes and/or semaphores to lock memory areas against writes while someone is reading.
All this is part of the bigger problem of concurrency. There are multiple books and entire university courses about the problems of concurrency so maybe you need to sit down and study it a bit if you find yourself lost. It's very easy to introduce deadlocks and race conditions into concurrent C programs if you are not careful.
What concepts are there to solve these problems?
Just a few observations:
fork() only clones the memory of the process as it exists at the time of the call. If you haven't opened or loaded your database at that stage, it won't be cloned into the child processes.
Shared memory - that is, memory mapped with mmap() and MAP_SHARED will be shared between processes and will not be duplicated.
The general term for communicating between processes is Interprocess communication of which there are several types and varieties, depending on your needs.
Aside: On modern Linux systems, fork() implements copy-on-write copying of process memory. You won't actually end up with two copies of the process in memory - you'll end up with one copy that believes it has been copied twice. If you write to any of the memory, a copy of the affected pages is then made. This is an efficiency saving that exploits the fact that most processes alter only a small fraction of their memory as they run, so even if you went for the copy-the-whole-database approach, you might find the memory usage less than you expect - although of course that wouldn't fix your synchronisation problems!
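A minimal sketch of the MAP_SHARED approach, assuming a fixed-size region created before the fork() (DB_SIZE and the layout are illustrative):
#include <sys/mman.h>
#include <unistd.h>

#define DB_SIZE (1024 * 1024) /* illustrative fixed size */

int main(void)
{
    /* One region created before fork(): every child sees the same
       physical pages, so writes by any process are visible to all. */
    char *db = mmap(NULL, DB_SIZE, PROT_READ | PROT_WRITE,
                    MAP_SHARED | MAP_ANONYMOUS, -1, 0);

    if (fork() == 0) {  /* child handling one connection */
        db[0] = 'x';    /* this change is visible to the parent */
        _exit(0);
    }
    /* parent: access must still be synchronised (mutex/semaphore) */
    return 0;
}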

Shared memory and IPC [closed]

Closed. This question needs to be more focused. It is not currently accepting answers. Closed 3 years ago.
I was reading a tutorial about shared memory and found the following statement: "If a process wishes to notify another process that new data has been inserted to the shared memory, it will have to use signals, message queues, pipes, sockets, or other types of IPC.". So what is the main advantage of using shared memory plus another type of IPC just for notification, instead of using an IPC mechanism that doesn't need any other IPC type, like a message queue or a socket?
The distinction here is IPC mechanisms for signalling versus shared state.
Signalling (signals, message queues, pipes, etc.) is appropriate for information that tends to be short, timely and directed. Events over these mechanisms tend to wake up or interrupt another program. The analogy would be, "what would one program SMS to another?"
Hey, I added a new entry to the hash table!
Hey, I finished that work you asked me to do!
Hey, here's a picture of my cat. Isn't he cute?
Hey, would you like to go out, tonight? There's this new place called the hard drive.
Shared memory, compared with the above, is more effective for sharing relatively large, stable objects that change in small parts or are read repeatedly. Programs might consult shared memory from time to time or after receiving some other signal. Consider, what would a family of programs write on a (large) whiteboard in their home's kitchen?
Our favorite recipes.
Things we know.
Our friends' phone numbers and other contact information.
The latest manuscript of our family's illustrious history, organized by prison time served.
With these examples, you might say that shared memory is closer to a file than to an IPC mechanism in the strictest sense, with the obvious exceptions that shared memory is
Random access, whereas files are sequential.
Volatile, whereas files tend to survive program crashes.
An example of where you want shared memory is a shared hash table (or btree or other compound structure). You could have every process receive update messages and update a private copy of the structure, or you can store the hash table in shared memory and use semaphores for locking.
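A sketch of the locking side of that, assuming a hypothetical shared_table struct placed in POSIX shared memory (the name "/table" is illustrative), guarded by a process-shared semaphore:
#include <fcntl.h>
#include <semaphore.h>
#include <sys/mman.h>
#include <unistd.h>

struct shared_table {
    sem_t lock;     /* the lock lives next to the data it guards */
    /* ...hash table buckets would follow... */
};

int main(void)
{
    int fd = shm_open("/table", O_CREAT | O_RDWR, 0600);
    ftruncate(fd, sizeof(struct shared_table));
    struct shared_table *t = mmap(NULL, sizeof *t, PROT_READ | PROT_WRITE,
                                  MAP_SHARED, fd, 0);

    sem_init(&t->lock, 1, 1); /* pshared = 1: usable across processes;
                                 do this once, in the creating process */

    sem_wait(&t->lock);       /* lock before touching shared data */
    /* ...update the shared structure... */
    sem_post(&t->lock);
    return 0;
}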
Shared memory is very fast - that is the main advantage and reason you would use it. You can use part of the memory to keep flags/timestamps regarding the data validity, but you can use other forms of IPC for signaling if you want to avoid polling the shared memory.
Shared memory is used to transfer data between processes (and also to read/write disk files fast). If you don't need to transfer data and only need to notify the other process, don't use shared memory - use other notification mechanisms (semaphores, events, etc.) instead.
Depending on the amount of data to be passed from process to process, shared memory would be more efficient because you would minimize the number of times that data would be copied from userland memory to kernel memory and back to userland memory.

if using shared memory, are there still advantages for processes over threading?

I have written a Linux application in which the main 'consumer' process forks off a bunch of 'reader' processes (~16) which read data from the disk and pass it to the 'consumer' for display. The data is passed over a socket which was created before the fork using socketpair.
I originally wrote it with this process boundary for 3 reasons:
The consumer process has real-time constraints, so I wanted to avoid any memory allocations in the consumer. The readers are free to allocate memory as they wish, or even be written in another language (e.g. with garbage collection), and this doesn't interrupt the consumer, which has FIFO priority. Also, disk access or other IO in the reader process won't interrupt the consumer. I figured that with threads I couldn't get such guarantees.
Using processes will stop me, the programmer, from doing stupid things like using global variables and clobbering other processes' memory.
I figured forking off a bunch of workers would be the best way to utilize multiple CPU architectures, and I figured using processes instead of threads would generally be safer.
Not all readers are always active; however, those that are active constantly send large amounts of data. Lately I have been thinking of optimizing this by avoiding the memory copies associated with writing and reading the socket: it would be nice to read the data directly into a shared memory buffer (shm_open/mmap). Then only an index into this shared memory would be passed over the socket, and the consumer would read directly from it before marking it as available again.
Anyways, one of the biggest benefits of processes over threads is to avoid clobbering another thread's memory space. Do you think that switching to shared memory would destroy any advantages I have in this architecture? Is there still any advantage to using processes in this context, or should I just switch my application to using threads?
Your assumption that you cannot meet your realtime constraints with threads is mistaken. IO or memory allocation in the reader threads cannot stall the consumer thread as long as the consumer thread is not using malloc itself (which could of course lead to lock contention). I would recommend reading what POSIX has to say on the matter if you're unsure.
As for the other reasons to use processes instead of threads (safety, possibility of writing the readers in a different language, etc.), these are perfectly legitimate. As long as your consumer process treats the shared memory buffer as potentially-unsafe external data, I don't think you lose any significant amount of safety by switching from pipes to shared memory.
Yes, exactly for the reason you gave. It's better to have each process's memory protected and to share only what really needs to be shared. That way each process can allocate and use its resources without bothering with locking.
As for the index communication between your tasks, note that you could use an area in your shared memory for that, with a mutex guarding the accesses, as it is likely lighter-weight than the socket communication. File descriptor communication (sockets, pipes, files, etc.) always involves the kernel; shared memory with mutex locks or semaphores involves it only when there is contention.
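For reference, a pthread mutex can live in that shared area too if it is marked process-shared; a minimal sketch (error handling omitted):
#include <pthread.h>

/* mu must itself live in the shared area (e.g. inside the mmap'd pool) */
void init_shared_mutex(pthread_mutex_t *mu)
{
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_mutex_init(mu, &attr);
    pthread_mutexattr_destroy(&attr);
}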
One point to be aware of when programming with shared memory in a multiprocessor environment is false sharing. This happens when two unrelated objects share the same cache line: when one is modified it "dirties" the other as well, which means that if another processor accesses the other object, it triggers a cache synchronisation between the CPUs. This can lead to bad scaling. By aligning the objects to the cache-line size (usually 64 bytes, but it differs from architecture to architecture) you can easily avoid this.
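In C11 that alignment can be requested directly; a small sketch (64 bytes is the usual x86 cache-line size, as noted):
#include <stdalign.h>

/* Each worker's counter gets its own cache line, so one process
   updating its counter does not dirty the others' lines. */
struct counter {
    alignas(64) unsigned long value;
};

struct counter per_worker[16]; /* sizeof(struct counter) == 64 here */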
The main reason I have come across in my experience to replace processes with threads is efficiency.
If your processes use a lot of code or unshared memory that could be shared under multithreading, then you could gain a lot of performance on highly threaded CPUs, such as Sun SPARC CPUs with 64 or more hardware threads per chip. In that case the CPU cache, especially for code, will be much more effective with a multithreaded process (the cache is small on SPARC).
If you see that your software is not running faster when running on new hardware with more CPU threads, then you should consider multi-threading. Otherwise, your arguments to avoid it seem good to me.
I have not yet run into this on Intel processors, but it could happen in the future as they add more cores per CPU.
