Application-level back-pressure VS TCP native flow-control - akka-stream

I am investigating why some systems implement application-level back-pressure given that TCP provides native flow control.
I was reading, in particular, about akka-streams and (as a higher-level discussion) reactive streams.
Is it only to abstract out the idea of asynchronous communication out of the network and out of the TCP protocol?
Other, more precise, questions:
If the application (say akka-streams app) ends up communicating over TCP, will it fall back to TCP's native back-pressure?
Does reactive streams implement application-level back-pressure on top of TCP by simply leaving TCP handle it?
Any help and pointer would be appreciated!
Thanks :)

I think the underlying assumption of your question is that TCP is the only external communication within a stream pipeline.
Suppose your stream communicates over several IO channels such as file IO, database querying, and standard output to the console:
//Read Data from File --> DB Query --> TCP Server Query --> Slow Function --> Console
An akka-stream implementation would provide asynchronous, back-pressured support through the entire pipeline. Therefore akka-stream has to provide application-level back-pressure for calling long-running functions, querying databases, reading files, writing to the console, etc.
You are correct that akka's implementation of the TCP Server section of the stream relies on TCP's "native backpressure". From the documentation: "using Akka Streams you are freed of having to manually react to back-pressure signals, as the library does it transparently for you."
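To make the idea of application-level back-pressure concrete, here is a minimal sketch in Python rather than Akka. It is not how akka-stream is implemented; it only assumes a bounded `asyncio.Queue` between two stages, so that a fast producer is suspended whenever a slow consumer falls behind. The names `producer`, `slow_consumer` and `run_pipeline` are made up for this illustration.

```python
import asyncio


async def producer(queue, n_items, log):
    """Fast stage: emits items as quickly as the queue allows."""
    for i in range(n_items):
        # put() suspends while the queue is full: this is the back-pressure signal.
        await queue.put(i)
        log.append(("produced", i))
    await queue.put(None)  # sentinel: no more items


async def slow_consumer(queue, log, delay=0.01):
    """Slow stage: stands in for a DB query, a slow function, console output, etc."""
    while True:
        item = await queue.get()
        if item is None:
            break
        await asyncio.sleep(delay)  # simulated slow work
        log.append(("consumed", item))


async def run_pipeline(n_items=5, capacity=1):
    """Connect the two stages with a bounded queue and return the event log."""
    queue = asyncio.Queue(maxsize=capacity)
    log = []
    await asyncio.gather(producer(queue, n_items, log), slow_consumer(queue, log))
    return log
```

With `capacity=1` the producer can never be more than two items ahead of the consumer (one queued, one being processed), no matter how slow the consumer is and without any TCP connection being involved.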


Why is there no extra channel for error or status?

I have a question on Client-Server-Computing.
Why is there only one connection from the server back to the client? In UNIX you normally have stdout and stderr.
Database-queries might take a much longer time than you expected.
Then you wonder if there is something wrong. Maybe the server is stuck in an endless loop. This can easily be the case because servers nowadays can be extended via procedures, triggers etc.
If there was an extra port for sending status messages from the server to the client the user could get the information "everything ok" e.g. via "executing node number 7 of the query execution plan".
Users who would only be puzzled by such information could keep the message window closed.
Is there a real technical problem, or do those responsible for TCP standardisation just need a hint?
TCP is a generic transport protocol and does not distinguish between different semantics, like status, error, data, ... Such semantics are added by the application protocol on top of TCP.
To provide different semantics it is not necessary to have different TCP connections. One could easily define an application protocol which allows messages with different semantics to be transferred over the same TCP connection. And such protocols exist, for example TLS (handshake messages, application data, alerts ...). But one could also use multiple TCP connections, like FTP does with separate TCP connections for control and data.
So the question should be instead why a specific server application does not have the capability for status updates in parallel to queries. It is definitely not because of limitations from using TCP as transport layer, but because of limitations in the application itself.
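As an illustration of carrying different semantics over one connection, here is a minimal sketch of a made-up framing format: each message is a 1-byte type, a 4-byte big-endian length, and the payload. The type codes and the function names `encode_frame` and `decode_frames` are assumptions for this example and do not correspond to any real database protocol.

```python
import struct

# Hypothetical message types multiplexed over a single TCP byte stream.
DATA, STATUS, ERROR = 0, 1, 2

# Header layout: 1-byte type, 4-byte unsigned length, network byte order.
HEADER = struct.Struct("!BI")


def encode_frame(kind, payload):
    """Prefix the payload with its type and length."""
    return HEADER.pack(kind, len(payload)) + payload


def decode_frames(buffer):
    """Split received bytes into complete frames.

    Returns (frames, rest), where rest holds the bytes of an incomplete
    trailing frame that must be kept until more data arrives.
    """
    frames = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        kind, length = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # payload not fully received yet
        frames.append((kind, buffer[offset + HEADER.size:end]))
        offset = end
    return frames, buffer[offset:]
```

A server using such a format could interleave `STATUS` frames ("executing node number 7 ...") with `DATA` frames on the same socket, and the client could route them to a message window and the result set respectively.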

Why is UDP preferred over TCP, in making remote procedure call?

I was reading about RPC. The blog recommends using UDP over TCP for remote procedure calls. Why is UDP preferred over TCP?
UDP is not generally preferred to TCP when doing remote procedure calls. In fact, most implementations of RPC technologies like CORBA, XML-RPC, SOAP, Java RMI, ... use TCP and not UDP as the underlying transport. TCP is preferred here because, contrary to UDP, it already takes care of reliability (dealing with packet loss, duplication, reordering) and can also easily and transparently handle arbitrarily sized messages.
The blog you cite refers to classic Sun-RPC as used with NFS and which was primarily used in a local network - contrary to current RPC technologies which are often used in more complex network environments. In this kind of environment and at this time (long ago) UDP offered a smaller overhead and a faster recovery from network problems than TCP since there is no initial handshake and necessary retransmits, reordering ... are in full control of the RPC layer and can be tuned to the specific use case. So while preferring UDP for specific RPC in this environment made sense it cannot be said that UDP should be preferred for any kind of RPC.
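To show what "retransmits are in full control of the RPC layer" can look like, here is a minimal sketch of a request/reply call over a connected datagram socket with a tunable retry count and timeout. This is not the Sun-RPC wire format; the 4-byte request id prefix and the function name `rpc_call` are assumptions made for this example.

```python
import socket
import struct


def rpc_call(sock, request_id, payload, retries=3, timeout=0.2):
    """Send one request on a connected datagram socket and wait for its reply.

    The caller, not the transport, decides how long to wait and how often
    to retransmit. The request is sent at most retries + 1 times.
    """
    message = struct.pack("!I", request_id) + payload
    sock.settimeout(timeout)
    for _attempt in range(retries + 1):
        sock.send(message)
        try:
            while True:
                reply = sock.recv(65535)
                # Accept only the reply matching our id; late or duplicate
                # replies to earlier requests are ignored.
                if len(reply) >= 4 and struct.unpack("!I", reply[:4])[0] == request_id:
                    return reply[4:]
        except socket.timeout:
            continue  # nothing arrived in time: retransmit
    raise TimeoutError("no reply after %d attempts" % (retries + 1))
```

With TCP, the equivalent timeouts and retransmission behaviour are decided by the kernel's TCP stack rather than by parameters such as `retries` and `timeout` here.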

select() equivalence in I/O Completion Ports

I am developing a proxy server using WinSock 2.0 in Windows. If I wanted to develop it in the blocking model, select() would be the way to wait for either the client or the remote server to have data to receive. Is there an applicable way to do this using I/O Completion Ports?
I used to have two Contexts for the two directions of data using I/O Completion Ports. But a pending WSARecv couldn't receive any data from the remote server! I couldn't find the problem.
Thanks in advance.
EDIT. Here's the WorkerThread Code on currently developed I/O Completion Ports. But I am asking about how to implement select() equivalence.
I/O Completion Ports provide an indication of when an I/O operation completes, they do not indicate when it is possible to initiate an operation. In many situations this doesn't actually matter. Most of the time the overlapped I/O model will work perfectly well if you assume it is always possible to initiate an operation. The underlying operating system will, in most cases, simply do the right thing and queue the data for you until it is possible to complete the operation.
However, there are some situations when this is less than ideal. For example you can always send to a socket using overlapped I/O. You can do this even when the remote peer is not reading and the TCP stack has started to use flow control and has filled the TCP window... This simply uses resources on your local machine in a completely uncontrolled manner (not entirely uncontrolled, but controlled by the peer, which is not ideal). I write about this here and in many situations you DO need to actively manage this kind of thing by tracking how many outstanding I/O write requests you have and using that as an indication of 'readiness to send'.
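The bookkeeping for tracking outstanding writes can be sketched independently of Winsock. The following is a minimal, language-neutral illustration written in Python; the class `SendWindow` and its method names are hypothetical, and in a real IOCP server `on_write_issued` would be called when an overlapped send is posted and `on_write_completed` when its completion is dequeued.

```python
class SendWindow:
    """Counts in-flight writes per connection to derive 'readiness to send'."""

    def __init__(self, max_outstanding):
        self.max_outstanding = max_outstanding
        self.outstanding = 0

    def ready_to_send(self):
        # Ready as long as we are below the configured limit.
        return self.outstanding < self.max_outstanding

    def on_write_issued(self):
        # Call when a write is posted; refuse to exceed the limit.
        if not self.ready_to_send():
            raise RuntimeError("too many outstanding writes")
        self.outstanding += 1

    def on_write_completed(self):
        # Call when the completion for a write is dequeued.
        self.outstanding -= 1
        return self.ready_to_send()
```

In a proxy, a connection whose window is not ready would stop issuing reads on the opposite side until `on_write_completed` reports readiness again, so a slow peer cannot make the proxy buffer unbounded amounts of data.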
Likewise if you want a 'readiness to recv' indication you could issue a 'zero byte' read on the socket. This is a read which is issued with a zero length buffer. The read returns when there is data to read but no data is returned. This would give you the indication that there is data to be read on the connection but is, IMHO, pointless unless you are suffering from the very unlikely situation of hitting the I/O page lock limit, as you may as well read the data when it becomes available rather than forcing multiple kernel to user mode transitions.
In summary, you don't really need an answer to your question. You need to look at how the API works and write your code to work with it rather than trying to force the API to work in a way that other APIs that you are familiar with work.

HTTP Persistent connection

While trying to implement a simple HTTP server in C using the Linux socket interface, I have encountered some difficulties with a feature I'd like it to have, namely persistent connections. It is relatively easy to send one file at a time over separate TCP connections, but that doesn't seem to be a very efficient solution (considering multiple handshakes, for instance). Anyway, the server should handle several requests (HTML, CSS, images) during one TCP connection. Could you give me some clues on how to approach the problem?
It is pretty easy - just don't close the TCP connection after you write the reply.
There are two ways to do this, pipelined, and non pipelined.
In a non-pipelined implementation you read one HTTP request from the socket, process it, write the response back to the socket, and then try to read another one. Keep doing that until the remote party closes the socket, or close it yourself once no further request has arrived on the socket for about 10 seconds.
In a pipelined implementation, read as many requests as are on the socket, process them all in parallel, and then write them all back out on the socket, in the same order as you received them. You have one thread reading requests in all the time, and another one writing them out again.
You don't have to do it, but you can advertise that you support persistent connections and pipelining by adding the following header in your replies:
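The non-pipelined loop described above can be sketched as follows. The question is about C, but the structure is the same in any language, so this sketch uses Python for brevity. It assumes requests without a body (for example plain GETs) that end at the blank line, and the function name `handle_connection` and the echoed response body are invented for the example.

```python
import socket


def handle_connection(conn, idle_timeout=10.0):
    """Serve requests on one connection until the peer closes or goes idle.

    Returns the number of requests served. The caller closes the socket.
    """
    conn.settimeout(idle_timeout)
    buffer = b""
    served = 0
    while True:
        # Read until a full request head (terminated by a blank line) is buffered.
        while b"\r\n\r\n" not in buffer:
            try:
                chunk = conn.recv(4096)
            except socket.timeout:
                return served  # idle for too long
            if not chunk:
                return served  # peer closed the connection
            buffer += chunk
        head, buffer = buffer.split(b"\r\n\r\n", 1)
        request_line = head.split(b"\r\n", 1)[0]
        body = b"You asked for: " + request_line + b"\n"
        # Content-Length lets the client find the end of this response
        # without the server having to close the connection.
        response = (
            b"HTTP/1.1 200 OK\r\n"
            b"Content-Length: " + str(len(body)).encode() + b"\r\n"
            b"Connection: Keep-Alive\r\n\r\n" + body
        )
        conn.sendall(response)
        served += 1
```

The important detail is that every response carries a `Content-Length` (or uses chunked encoding); otherwise the client can only detect the end of a response when the connection closes.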
Connection: Keep-Alive
Read this:
By the way, in practice there aren't huge advantages to persistent connections. The overhead of managing the handshake is very small compared to the time taken to read and write data to network sockets. There is some debate about the performance advantages of persistent connections. On the one hand under heavy load, keeping connections open means many fewer sockets on your system in TIME_WAIT. On the other hand, because you keep the socket open for 10 seconds, you'll have many more sockets open at any given time than you would in non-persistent mode.
If you're interested in improving performance of a self written server - the best thing you can do to improve performance of the network "front-end" of your server is to implement an event based socket management system. Look into libev and eventlib.

Server Architecture for Embedded Device

I am working on a server application for an embedded ARM platform. The ARM board is connected to various digital IOs, ADCs, etc that the system will consistently poll. It is currently running a Linux kernel with the hardware interfaces developed as drivers. The idea is to have a client application which can connect to the embedded device and receive the sensory data as it is updated and issue commands to the device (shutdown sensor 1, restart sensor 2, etc). Assume the access to the sensory devices is done through typical ioctl.
Now my question relates to the design/architecture of this server application running on the embedded device. At first I was thinking to use something like libevent or libev, lightweight C event handling libraries. The application would prioritize the sensor polling event (and then send the information to the client after the polling is done) and process client commands as they are received (over a typical TCP socket). The server would typically have a single connection but may have up to a dozen or so, but not something like thousands of connections. Is this the best approach to designing something like this? Of the two event handling libraries I listed, is one better for embedded applications or are there any other alternatives?
The other approach under consideration is a multi-threaded application in which the sensor polling is done in a prioritized/blocking thread which reads the sensory data, and each client connection is handled in a separate thread. The sensory data is written into some sort of buffer/data structure, and the connection threads handle sending the data out to the client and processing client commands (I suppose you would still need an event loop of some sort in these threads to monitor for incoming commands). Are there any libraries or typical packages which facilitate designing an application like this, or is this something you have to start from scratch?
How would you design what I am trying to accomplish?
I would use a unix domain socket -- and write the library myself. I can't see any advantage to using libevent, since the application is tied to Linux and libevent is meant for hundreds of connections. You can do all of what you are trying to do with a single thread in your daemon. KISS.
You don't need a dedicated master thread for priority queues; you just need to write your thread so that it always processes high-priority events before anything else.
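A minimal sketch of "always process high-priority events first" in a single thread, using a heap as the pending-event queue. It is written in Python for brevity; the priority constants and the `EventQueue` class are hypothetical, and a real daemon would push events from its select/poll loop.

```python
import heapq
import itertools

# Assumed convention for this sketch: lower number = higher priority.
SENSOR_POLL = 0
CLIENT_COMMAND = 1


class EventQueue:
    """Single-threaded queue that hands out the highest-priority event first."""

    def __init__(self):
        self._heap = []
        # The counter breaks ties so events of equal priority stay in FIFO order.
        self._counter = itertools.count()

    def push(self, priority, handler):
        heapq.heappush(self._heap, (priority, next(self._counter), handler))

    def run_pending(self):
        """Run all queued handlers, highest priority first; return their results."""
        results = []
        while self._heap:
            _, _, handler = heapq.heappop(self._heap)
            results.append(handler())
        return results
```

For example, if a client command, a sensor poll and another client command are queued in that order, `run_pending()` runs the sensor poll first and then the two commands in arrival order.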
In terms of libraries, you will possibly benefit from Google's protocol buffers (for serialization and representing your protocol) -- however it only has first-class support for C++, and the over-the-wire (serialization) format does a bit of simple bit shifting on numeric data. I doubt it will add any serious overhead. However, an alternative is ASN.1 (asn1c).
My suggestion would be a modified form of your 2nd proposal. I would create a server that has two threads: one thread polling the sensors, and another for ALL of your client connections. I have used the boost::asio library on embedded devices (MIPS) with great results.
A single thread that handles all sockets connections asynchronously can usually handle the load easily (of course, it depends on how many clients you have). It would then serve the data it has on a shared buffer. To reduce the amount and complexity of mutexes, I would create two buffers, one 'active' and another 'inactive', and a flag to indicate the current active buffer. The polling thread would read data and put it in the inactive buffer. When it finished and had created a 'consistent' state, it would flip the flag and swap the active and inactive buffers. This could be done atomically and should therefore not require anything more complex than this.
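The two-buffer scheme can be sketched like this. The sketch is in Python, where a single attribute assignment is atomic under CPython's global interpreter lock; in C or C++ the flag would need to be an atomic variable. The class name `DoubleBuffer` is made up, and as a simplification the inactive slot is replaced wholesale instead of being filled in place, so a reader still holding the previous snapshot is never affected.

```python
class DoubleBuffer:
    """Two slots plus a flag naming the active one.

    One writer (the polling thread) publishes complete snapshots;
    any number of readers (the connection thread) read the active slot.
    """

    def __init__(self, initial):
        self._slots = [initial, initial]
        self._active = 0  # index of the slot readers should use

    def publish(self, snapshot):
        """Store a consistent snapshot in the inactive slot, then flip the flag."""
        inactive = 1 - self._active
        self._slots[inactive] = snapshot
        self._active = inactive  # the flip is a single assignment

    def read(self):
        """Return the most recently published snapshot."""
        return self._slots[self._active]
```

The polling thread would call `publish` once per polling cycle with a freshly built dictionary of readings, and the connection thread would call `read` whenever it needs to send data to a client.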
This would all be very simple to set up since you would pretty much have only two threads that know nothing about the other.
