are posix pipes lightweight? - c

In a linux application I'm using pipes to pass information between threads.
The idea behind using pipes is that I can wait for multiple pipes at once using poll(2). That works well in practice, and my threads are sleeping most of the time. They only wake up if there is something to do.
In user-space the pipes look just like two file-handles. Now I wonder how much resources such a pipes use on the OS side.
Btw: In my application I only send single bytes every now and then. Think about my pipes as simple message queues that allow me to wake-up receiving threads, tell them to send some status-data or to terminate.

No, I would not consider pipes "lightweight", but that doesn't necessarily mean they're the wrong answer for your application either.
Sending a byte over a pipe is going to require a minimum of 3 system calls (write,poll,read). Using an in-memory queue and pthread operations (mutex_lock, cond_signal) involves much less overhead. Open file descriptors definitely do consume kernel resources; that's why processes are typically limited to 256 open files by default (not that the limit can't be expanded where appropriate).
Still, the pipe/poll solution for inter-thread communication does have advantages too: particularly if you need to wait for input from a combination of sources (network + other threads).

As you are using Linux you can investigate and compare pipe performance with eventfd's. They are technically faster and lighter weight but you'll be very lucky to actually see the gains in practice.
http://www.kernel.org/doc/man-pages/online/pages/man2/eventfd.2.html

Measure and you'll know. Full processes with pipes are sufficiently lightweight for lots of applications. Other applications require something lighter weight, like OS threads (pthreads being the popular choice for many Unix apps), or superlightweight, like a user-level threads package that never goes into kernel mode except to handle I/O. While the only way to know for sure is to measure, pipes are probably good enough for up to a few tens of threads, whereas you probably want user-level threads once you get to a few tens of thousands of threads. Exactly where the boundaries should be drawn using today's codes, I don't know. If I wanted to know, I would measure :-)

Related

Is it better to use select or multi-threading (or both) when creating a server that will simultaneously send/receive tasks for 200+ clients

I am creating a server that will be sending and receiving tasks from over 200 clients simultaneously (potentially more client in the future). There will also be background engines on the clients that will perform tasks and send responses to the server without asking first. I expect there to be a high volume of information transferred both ways. I've been doing research into multi-threading and using the select function, and I'm wondering given some of the parameters of the project which option (or a combination) would be the most efficient scalable solution based on the amount of traffic that might occur.
Any suggestions would be greatly appreciated. I'd be glad to answer any questions to provide more clarity.
Either approach will work; as far is which is "better", that's going to depend a lot on how you define the word "better".
The single-threaded approach avoids any chance of problems with race conditions or deadlocks, because those problems inherently can't occur in a single-threaded program. In a multithreaded program you have to be extremely careful about data-locking patterns, or else you will find yourself trying to debug very mysterious malfunctions that only occur once every few days/weeks/months.
On the other hand, the single-threaded approach limits you to using a single core; it won't be able to take advantage of a modern multi-core CPU to give you a parallelism speedup.
On the third hand, the multi-threaded approach can get hairy (and lose its speedup potential) if the various threads/connections often need to access any shared/mutable data structures. In that "shared data bottleneck" scenario, the threads may spend a lot of their time blocked waiting to lock a mutex, and then you're mostly back to using a single core anyway. If each connection operates independently of the others (e.g. as part of a simple web server) and doesn't need to interact with the other threads, then this shouldn't be a concern.
Multithreading allows you to use blocking I/O (which is simpler to implement than non-blocking I/O), but blocking I/O limits your control over the threads (e.g. how do you get a thread to exit cleanly, or take some other non-client-initiated action, if it is blocked indefinitely inside a recv() call? There aren't any good solutions to that problem, only poor ones)
Single-threading requires you to use non-blocking I/O (otherwise a single unresponsive client can halt service to all the other clients while the server is blocked inside a send() or recv() call), and non-blocking I/O is tricky to do correctly, since you have to handle partial-reads and partial-writes gracefully.
If your program ever needs to do a non-trivial amount of computation or file I/O, note that a single-threaded design will force all clients to wait while the computation (or I/O) for any client completes. In a multithreaded design, OTOH, clients B through Z can continue to be serviced on other cores/threads while client A's is busy reading from the disk or crunching numbers.
The overhead of spawning and maintaining threads will vary from one OS to another. If you're going to be running hundreds of threads simultaneously, you might want to verify first that your target OS (and hardware) will be able to handle that load efficiently. (You can reduce the overhead of spawning and reaping threads via a thread-pool, at some expense of increased RAM usage)
I personally prefer the single-threaded/non-blocking-I/O approach, because blocking I/O is problematic if you want your program to be able to shut down cleanly and reliably (which you should want, if only so you can do e.g. memory-leak testing under valgrind). If single-core performance turns out to be insufficient, it's often fairly straightforward extend the handle-N-sockets-on-1-thread design to a more powerful handle-N-sockets-on-each-of-M-threads design, and then you can play around with different values of N and M until you find the one that gives you the best performance (e.g. by setting M to the number of cores on the host machine, and handing out newly-accepted sockets to whichever thread is currently handling the smallest number of sockets)
I once made a program in Java, a chat application, that each connection with the server that was established, represented a new Thread in the server, to manage the client in question.
Inside the Server class, there was a static variable, to manage which clients were connected.
I don't know if recommend different technologies is the right way to answer you question, but i think, that for your case, would be a good idea to take a look at Erlang/Elixir platform, the premise is the is able to hold a lot of clients at the same time.
Currently, big companies, like Whatsapp uses Erlang and Discord Elixir.
I hope that my answer was helpful.

should I use processes or threads for my application?

I have an ARM device running a Linux 2.6 Kernel, with total ram of 64 MB RAM.
There is a data source, which consists of a meter that is queried by the Linux box, through RS485 and ModBus as app protocol.
There is another task, that consists of reading these values and making a json object, then HTTP POST to a specific server.
Network operation might be slower than serial, especially on low GPRS Coverage.
I need concurrency, program is written in C.
Which way would you have concurrency? Using select() or using pthreads?
When analyzing this particular application there's really only one question relevant to choosing pthreads:
Do the sensor reader and network writer need to share an address space?
In this instance I think the answer is clearly "no". Of course that isn't the only possible question, but the only germane one. There are reasons to prefer separate processes:
the two halves of the application have no common code; RS485 is wildly different from HTTP/JSON
segregation of responsibility: if the RS485 side is waiting on a UART, do you really want to block the HTTP side?
letting the OS do its job so you don't have to: if using pthreads, you have to handle a lot of the synchronization and preemption that the kernel does for you for free and code that you don't have to write has no new bugs.
Further analysis would require more detail than you've given, but here is one additional way to think about the choice: threads were invented to mitigate some limitations of the process model. Unless you know that you are going to hit those limitations, use separate processes.
added in response to comments:
I half agree with psusi's suggested design. There need only be two processes, one (let's say the sensor reader, that's a fine choice) which forks one and only one http sender. The two processes can communicate using traditional IPC like a pipe. The sensor process sends data down the pipe when it has some and the child (http) process packs it up in json and sends it on its way.
It only takes two long-lived processes, it uses probably about the same amount of core as would a pthread implementation and it is far, far easier to get right.
select() is more efficient, because it avoids the context switching that comes with multiple threads. And threads would be more efficient than separate processes, because you avoid having to copy the data (unless you setup shared memory, but at that point you might as well have gone with threads). However, writing non-blocking I/O, as with select(), is harder to do and get right, and doesn't enjoy the multitasking that comes with multiple threads. And multiple processes is likely to be the easiest implementation, especially because you can use curl rather than writing the HTTP POST half yourself.
Why you need concurrency? Is the meter has to be polled in a strict time interval?
If the answer is YES: Just use two processes, one poll the meter data and write to a ring buffer in nand storage, the other read the data from the ring buffer and send HTTP data.
If the answer is NO: You don't need concurrency and non-block at all. Use a big loop in main() is enough.

Whats the advantages and disadvantages of using Socket in IPC

I have been asked this question in some recent interviews,Whats the advantages and disadvantages of using Socket in IPC when there are other ways to perform IPC.Have not found exact answer .
Any help would be much appreciated.
Compared to pipes, IPC sockets differ by being bidirectional, that is, reads and writes can be done on the same descriptor. Pipes, unlike sockets, are unidirectional. You have to keep a pair of descriptors if you want to do both reads and writes.
Pipes, on the other hand, guarantee atomicity when reading or writing under a certain amount of bytes. Writing something less than PIPE_BUF bytes at once is guaranteed to be delivered in one chunk and never observed partial. Sockets do require more care from the programmer in that respect.
Shared memory, when used for IPC, requires explicit synchronisation from the programmer. It may be the most efficient and most flexible mechanism, but that comes at an increased complexity cost.
Another point in favour of sockets: an app using sockets can be easily distributed - ie. it can be run on one host or spread across several hosts with little effort. This depends of course on the nature of the app.
Perhaps this is too simplified an answer, yet it is an important detail. Sockets are not supported on all OS's. Recently, I have been aware of a project that used sockets for IPC all over the place only to find that they were forced to change from Linux to a proprietary OS which was POSIX, but did not support sockets the same way as Linux.
Sockets allow you a few benefits...
You can connect a simple client to them for testing (manually enter data, see the response).
This is very useful for debugging, simulating and blackbox testing.
You can run the processes on different machines. This can be useful for scalability and is very helpful in debugging / testing if you work in embedded software.
It becomes very easy to expose your process as a service
But there are drawbacks as well
Overhead is greater than IPC optimized for a single machine. Shared memory in particular is better if you need the performance, and you know your processes are all on the same machine.
Security - if your client apps can connect so can anyone else, if you're not careful about authentication. Data can also be sniffed if you're not encrypting, and modified if you're not at least signing data sent over the wire.
Using a true message queue tends to leave you with fixed sized messages. If you have a large number of messages of wildly varying sizes this can become a performance problem. Using a socket can be a way around this, though you're then left trying to wrap this functionality to become identical to a queue, which is tricky to get the detail right on, particularly aspects like blocking/non-blocking and atomicity.
Shared memory is quick but requires management (you end up writing a version of malloc to manage the SHM) plus you have to synchronise and lock it in some way. Though you can use libraries to help with this the availability depends on your environment and language.
Queues are easy but have the downsides listed as pros to my socket discussion.
Pipes have been covered by Blagovests answer to this question.
As is ever the case with this kind of stuff I would suggest reading the W. Richard Stevens books on IPC and sockets. There is no better explanation than his! :-)

How Blocking IO Affects A Multithreaded Application/Service In Linux

Am exploring with several concepts for a web crawler in C on Linux. To decide if i'll use blocking IO, multiplexed OI, AIO, a certain combination, etc., I esp need to know (I probably should discover it for myself practically via some test code, but for expediency I prefer to know from others) when a call to IO in blocking mode is made, is it the particular thread (assuming a multithreaded app/svc) or the whole process itself that is blocked? Even more specifically, in a multitheaded (POSIX) app/service can a thread dedicated to remote read/writes block the entire process? If so, how can I unblock such a thread without terminating the entire process?
NB: Whether or not I should use blocking/nonblocking is not really the question here.
Kindly
Blocking calls block only the thread that made them, not the entire process.
Whether to use blocking I/O (with one socket per thread) or non-blocking I/O (with each thread managing multiple sockets) is something you are going to have to benchmark. But as a rule of thumb...
Linux handles multiple threads reasonably efficiently. So if you are only handling a few dozen sockets, using one thread for each is easy to code and should perform well. If you are handling hundreds of sockets, it is a closer call. And for thousands of sockets, you are almost certainly better off using one thread (or process) to manage large groups.
In the latter case, for optimal performance you probably want to use epoll, even though it is Linux-specific.

Non-blocking native files access - single-threaded daemon in C?

I've found out that native files access has no "non-blocking" state. (I'm correct?)
I've been googling for daemons which are "non-blocking", and I've found one which achieved said behavior by threading file access operations, so that the daemon won't block.
My question is, wouldn't threading and IPC'ing such operations be rather expensive? wouldn't it make more sense to either:
A) Pre-thread pool, simply have each client at a thread and let it block for which ever blocking operations it might need. Or,
B) In case of file access blocking, use a relatively small buffer, that way it's still blocking - but one would assume that a tiny buffer for multiple operations would make more sense than paying the price of threading each operation and IPC it?
If you use threading, little IPC overhead is needed. You have the same memory space for all your threads, so a simple mutex or semaphore may be all you need. Now, if you are blocking on a mutex or semaphore too long or too often, why use async I/O in the first place?
As to the actual computation performed by threads doing I/O, they are waiting for the kernel to wake them up most of the time, so I wouldn't worry.
If your application is going to revolve around reading files and other I/O sources, you may want to read up on Reactor patterns, and event-driven programming.
Also, you mentioned a daemon, and servicing clients. If the service you provide is reading files, the computational cost of spawning a new thread to serve each client is minimal, since each individual thread will take "long" to complete requests, and block most of the time anyway. There may be a memory problem if your client count is in the thousands, but otherwise I think you'll do okay.
Give us a little more detail about what you want to do, maybe there are more straightforward ways.

Resources