How to synchronize read and multiple writes on sockets - c

I am asking some theoretical questions, since it would be really hard for me to post the code of the project involved, which is composed of too many files.
I am writing code for a server program, which has to communicate with several clients that send a variety of different requests and expect answers for each of them.
The server is multi-threaded, so every single thread mutually accesses a client connections list and performs all the operations within the request.
The two parts communicate via AF_UNIX sockets, and for mutual exclusion I have used locks and condition variables.
Now my issue is this: with certain interleavings of execution, the server side ends up making two simultaneous writes to the client (actually, two worker threads within the server both send a message to the same client), which is expecting just one. With unlucky interleavings, the client only gets to read one of the messages sent by the server. Sometimes this doesn't happen and everything works fine, because the two messages to the client happen to be spaced apart in time.
The issue happens even when one server thread does a write before the client calls read, and in between those two events, another server thread calls another write to the same client. In this case, only the most recent write is received by the client.
From what I have understood regarding blocking mode (which is what I am using), read() and write() should block when no one is receiving on the other side. Now I don't understand why the second write from the server worker gets completely lost. Shouldn't it block if no one is receiving and then resume when the client calls read?
Should I use mutual exclusion on the socket so that the second write waits for the previous one to be completely finished?
I hope that my issue is clear even though I am not showing any code; if necessary, please tell me and I will try to post some pieces of code. I think that the issue might just be a conceptual thing regarding my understanding of read(), write() and mutual exclusion, but I understand that the issue might be somewhere else and that without code it would be hard to figure out! Thank you!

Related

No threads and blocking sockets - is it possible to handle several connections?

I have a program that needs to:
Handle 20 connections. My program will act as client in every connection, each client connecting to a different server.
Once connected my client should send a request to the server every second and wait for a response. If no request is sent within 9 seconds, the server will time out the client.
It is unacceptable for one connection to cause problems for the rest of the connections.
I do not have access to threads and I do not have access to non-blocking sockets. I have a single-threaded program with blocking sockets.
Edit: The reason I cannot use threads and non-blocking sockets is that I am on a non-standard system. I have a single RTOS (Real-Time Operating System) task available.
To solve this, use of select is necessary but I am not sure if it is sufficient.
Initially I connect to all clients. But select can only be used to see if a read or write will block or not, not if a connect will.
So when I have connected to say 2 clients and they are all waiting to be served, what if the 3rd does not work, the connection will block causing the first 2 connections to time out as well.
Can this be solved?
I think the connection-issue can be solved by setting a timeout for the connect-operation, so that it will fail fast enough. Of course that will limit you if the network really is working, but you have a very long (slow) path to some of the server(s). That's bad design, but your requirements are pretty harsh.
See this answer for details on connection-timeouts.
It seems you need to isolate the connections. Well, if you cannot use threads you can always resort to good-old-processes.
Spawn each client by forking your server process and use traditional IPC mechanisms if communication between them is required.
If you cannot use a multiprocess approach either, I'm afraid you'll have a hard time doing that.

select() equivalence in I/O Completion Ports

I am developing a proxy server using WinSock 2.0 on Windows. If I were developing it with the blocking model, select() would be the way to wait for data from either the client or the remote server. Is there an equivalent way to do this using I/O Completion Ports?
I set up two contexts, one for each direction of data, using I/O Completion Ports. But the pending WSARecv never received any data from the remote server, and I couldn't find the problem.
Thanks in advance.
EDIT. Here's the WorkerThread Code on currently developed I/O Completion Ports. But I am asking about how to implement select() equivalence.
I/O Completion Ports provide an indication of when an I/O operation completes, they do not indicate when it is possible to initiate an operation. In many situations this doesn't actually matter. Most of the time the overlapped I/O model will work perfectly well if you assume it is always possible to initiate an operation. The underlying operating system will, in most cases, simply do the right thing and queue the data for you until it is possible to complete the operation.
However, there are some situations when this is less than ideal. For example you can always send to a socket using overlapped I/O. You can do this even when the remote peer is not reading and the TCP stack has started to use flow control and has filled the TCP window... This simply uses resources on your local machine in a completely uncontrolled manner (not entirely uncontrolled, but controlled by the peer, which is not ideal). I write about this here and in many situations you DO need to actively manage this kind of thing by tracking how many outstanding I/O write requests you have and using that as an indication of 'readiness to send'.
Likewise if you want a 'readiness to recv' indication you could issue a 'zero byte' read on the socket. This is a read which is issued with a zero length buffer. The read returns when there is data to read but no data is returned. This would give you the indication that there is data to be read on the connection but is, IMHO, pointless unless you are suffering from the very unlikely situation of hitting the I/O page lock limit, as you may as well read the data when it becomes available rather than forcing multiple kernel to user mode transitions.
In summary, you don't really need an answer to your question. You need to look at how the API works and write your code to work with it rather than trying to force the API to work in a way that other APIs that you are familiar with work.

In C on Linux, how would I go about using 2 programs, the latter sending text data to the first displayed using stdout?

I am writing a simple instant messenger program in C on Linux.
Right now I have a program that binds a socket to a port on the local machine, and listens for text data being sent by another program that connected to my local machine IP and port.
Well, I can have this client send text data to my program, and have it displayed using stdout on my local machine; however, I cannot program a way to send data back to the client machine, because my program is busy listening and displaying the text sent by the client machine.
How would I go about either creating a new process (that listens and displays the text sent to it by the client machine, then takes that text and sends it to the other program's stdout, while the other program takes care of stdin being sent to the client machine) or create 2 programs that do the separate jobs (sending, receiving, and displaying), and sends the appropriate data to one another?
Sorry if that is weirdly worded, and I will clarify if need be. I looked into exec, execve, fork, etc. but am confused as to whether this is the appropriate path to look in to, or if there is a simpler way that I am missing.
Any help would be greatly appreciated, Thank you.
EDIT: In retrospect, I figured that this would be much easier accomplished with 2 separate programs. One, the IM server, and the others, the IM clients.
The IM Clients would connect to the IM server program, and send whatever text they wanted to the IM server. Then, the IM server would just record the data sent to it in a buffer/file with the names/ip's of the clients appended to the text sent to it by each client, and send that text (in format of name:text) to each client that is connected.
This would remove the need for complicated inter-process/program communication for stdin and stdout, and instead, use a simple client/server way of communicating, with the client programs displaying text sent to it from server via stdout, and using stdin to send whatever text to the server.
With this said, I am still interested in someone answering my original question: for science. Thank you all for reading, and hopefully someone will benefit from my mental brainstorming, or whatever answers come from the community.
however, i cannot program a way to send data back to the client machine, because my program is busy listening and displaying the text sent by the client machine.
The same socket that was returned from a listening-socket by accept() can be used for both sending and receiving data. So your socket is never "busy" just because you're reading from it ... you can write back on the same socket.
If you need to both read and write concurrently, then share the socket returned from accept() across two different threads. Since two different buffers are being used by the networking stack for sending and receiving on the socket, a dedicated thread for reading and another dedicated thread for writing to the socket will be thread-safe without the use of mutexes.
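The two-thread arrangement described above might look roughly like the following sketch. It assumes a connected stream socket and that each direction is used by exactly one thread; the names pump, pump_main and run_chat are hypothetical. The input and output descriptors are parameters so that the same code works with stdin/stdout or with pipes.

```c
#include <pthread.h>
#include <sys/socket.h>
#include <unistd.h>

/* One direction of traffic: copy bytes from from_fd to to_fd. */
struct pump {
    int from_fd;
    int to_fd;
};

/* Thread body: relay data until EOF or an error on either side. */
static void *pump_main(void *arg)
{
    struct pump *p = arg;
    char buf[512];
    ssize_t n;

    while ((n = read(p->from_fd, buf, sizeof buf)) > 0) {
        ssize_t off = 0;
        while (off < n) {
            ssize_t w = write(p->to_fd, buf + off, (size_t)(n - off));
            if (w < 0)
                return NULL;
            off += w;
        }
    }
    return NULL;
}

/* Run both directions at once on the same socket:
 * socket -> out_fd in one thread, in_fd -> socket in another. */
int run_chat(int sock_fd, int in_fd, int out_fd)
{
    struct pump rx = { sock_fd, out_fd };
    struct pump tx = { in_fd, sock_fd };
    pthread_t rx_thread, tx_thread;

    if (pthread_create(&rx_thread, NULL, pump_main, &rx) != 0)
        return -1;
    if (pthread_create(&tx_thread, NULL, pump_main, &tx) != 0) {
        shutdown(sock_fd, SHUT_RDWR);   /* unblock the receive thread */
        pthread_join(rx_thread, NULL);
        return -1;
    }

    pthread_join(tx_thread, NULL);      /* local input reached EOF */
    shutdown(sock_fd, SHUT_WR);         /* tell the peer we are done sending */
    pthread_join(rx_thread, NULL);      /* wait until the peer closes too */
    return 0;
}
```

For the instant messenger case, a call such as run_chat(conn_fd, STDIN_FILENO, STDOUT_FILENO) would use the socket returned by accept() for both directions.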
I would go with fork() - create a child process and now you have two different processes that can do two different things on two different sockets- one can receive and the other can send. I have no personal experience with coding a client/server like this yet, but that would be my first stab at solving your issue...
As @bdonlan mentioned in a comment, you definitely need a multiplexing call like select or preferably poll (or related syscalls like pselect, ppoll ...). These multiplexing calls are the primitive to wait on several channels at once (with pselect and ppoll able to atomically wait for both I/O events and signals). Read also the select tutorial man page. Of course, you can wait for several file descriptors, and you can wait for both reading & writing abilities (even on the same socket, if needed), in the same select or poll syscall.
All event-based loops and frameworks are using these multiplexing calls (like poll or select). You could also use libevent, or even (particularly when coding a graphical user interface application) some GUI toolkit like Gtk or Qt, which are all based around a central event loop.
I don't think that having a multi-process or multi-threaded application is useful in your case. You just need some event loop.
You might also ask to get a SIGIO signal when data arrives on your socket using fcntl with F_SETOWN, but this is not very useful for you. Then you often want to have your socket non-blocking.

Arbitrary two-way UNIX socket communication

I've been working on a complex server-client system in C and I'm not sure how to implement the socket communication.
In a nutshell, the system is a server application which communicates with a database and uses a UNIX socket to communicate with one or more child processes created with fork(). The purpose of the children is to run game servers. The process of launching a game server is like this:
The server/"manager" identifies a game server in the database that is to be made. (Assume database communication is already sorted.)
The manager forks a child (the "game controller").
The game controller sets up two pipe pairs, then forks, replacing its child's stdin with a pipe, and its stdout and stderr with another pipe.
The game controller's child then runs execlp() to begin running the actual game server executable.
My experience with sockets is fairly minimal. I have used select() on a server application before to 'multiplex' numerous clients, as demonstrated by the simple example in the GNU C documentation here.
I now have a new challenge, as the system must be able to do more: the manager needs to be able to arbitrarily send commands to the game controller children (that it will find by periodically checking the database) and get replies, but also expect incoming arbitrary commands/errors from them and send replies back.
So, I need a sort-of "context" system, where sockets are meaningful only between themselves. In other words, when a command is sent from the manager to the game controller, each party needs to be aware of who is asking and know what the reply is (and, therefore, which command it is a reply to).
Because select() is only useful for knowing when we have incoming data, and a thread should block on it, would I need another thread that sends data and gets the replies? Will this require each game controller, although technically a 'client', to use a listening socket and use select() as well?
I hope I've explained the system and the problem concisely; I will add more detail if required. Thanks!
Ok, I am still not really sure I understand exactly where your trouble is, so I will just spout off some things about writing a client/server app. If I am off track, just let me know.
The way that the server will know which client corresponds to which socket is that the clients will tell the server. Essentially, you need to have a log-in protocol. When the game controller connects to the server, it will send a message that says "Hi, I am registering as controller foo1 on host xyz, port abc..." and whatever else the server needs to know about its clients. The server will keep a data structure that maps sockets to client metadata, state, etc. Whenever it gets a new message, it can easily map from the incoming host/port to its metadata. Or your protocol can require that on each incoming message, the client will send the name it registered with as a field.
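The data structure that maps sockets to client metadata could be as simple as the fixed-size table sketched below. All names here (registry, client_info, MAX_CLIENTS, NAME_LEN) are hypothetical, and a real server would likely store more state per client than just the registered name.

```c
#include <stddef.h>
#include <string.h>

#define MAX_CLIENTS 64
#define NAME_LEN    32

/* What the server remembers about one connected client. */
struct client_info {
    int  fd;               /* connected socket, or -1 if the slot is free */
    char name[NAME_LEN];   /* name sent in the registration message */
};

struct registry {
    struct client_info slots[MAX_CLIENTS];
};

/* Mark every slot as free. */
void registry_init(struct registry *r)
{
    for (size_t i = 0; i < MAX_CLIENTS; i++) {
        r->slots[i].fd = -1;
        r->slots[i].name[0] = '\0';
    }
}

/* Record a newly registered client. Returns 0, or -1 if the table is full. */
int registry_add(struct registry *r, int fd, const char *name)
{
    if (fd < 0)
        return -1;
    for (size_t i = 0; i < MAX_CLIENTS; i++) {
        if (r->slots[i].fd == -1) {
            r->slots[i].fd = fd;
            strncpy(r->slots[i].name, name, NAME_LEN - 1);
            r->slots[i].name[NAME_LEN - 1] = '\0';
            return 0;
        }
    }
    return -1;
}

/* Look up the metadata for the socket a message arrived on. */
const struct client_info *registry_find(const struct registry *r, int fd)
{
    if (fd < 0)
        return NULL;
    for (size_t i = 0; i < MAX_CLIENTS; i++)
        if (r->slots[i].fd == fd)
            return &r->slots[i];
    return NULL;
}

/* Forget a client, for example after its socket was closed. */
void registry_remove(struct registry *r, int fd)
{
    if (fd < 0)
        return;
    for (size_t i = 0; i < MAX_CLIENTS; i++)
        if (r->slots[i].fd == fd)
            r->slots[i].fd = -1;
}
```

When select() reports a readable descriptor, registry_find() on that descriptor gives the server the identity of the game controller that sent the message.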
Handling the request/response can be done several ways. First let's deal with the networking part of it on the server side. One way to manage this, as you mentioned, is by using select (or poll, or epoll) to multiplex the sockets. This is actually usually considered the more complicated way to do things. Another way is to spawn off a thread (or fork a process, which is less common these days) for each incoming client. Each spawned thread can read its own assigned socket, responding to messages one at a time without worrying about the fact that there are other clients besides the one it is dealing with. This simple one-to-one thread-to-socket model breaks down if there are many clients, but if that is not the case, then it is worth consideration.
Part 2 really covers only the client sending the server a message, and the server replying. What happens when the server wants to initiate communication? How does it do it and how does the client handle it? Also, how do you model the communication at the application level, meaning, assuming we have the read/write part down, how do we know what to send? You will probably want to model things in terms of state machines. There is also a lot more to deal with, like what happens when a client crashes? What about when the server crashes? Also, what if you really have your heart set on using select, perhaps because you expect many clients? I will try to add more to this answer tomorrow.

Is a server an infinite loop running as a background process?

Is a server essentially a background process running an infinite loop listening on a port? For example:
while(1){
    command = read(127.0.0.1:xxxx);
    if(command){
        execute(command);
    }
}
When I say server, I obviously am not referring to a physical server (computer). I am referring to a MySQL server, or Apache, etc.
Full disclosure - I haven't had time to poke through any source code. Actual code examples would be great!
That's more or less what server software generally does.
Usually it gets more complicated because the infinite loop "only" accepts the connection and each connection can often handle multiple "commands" (or whatever they are called in the used protocol), but the basic idea is roughly this.
There are three kinds of 'servers' - forking, threading and single threaded (non-blocking). All of them generally loop the way you show, the difference is what happens when there is something to be serviced.
A forking service is just that. For every request, fork() is invoked creating a new child process that handles the request, then exits (or remains alive, to handle subsequent requests, depending on the design).
A threading service is like a forking service, but instead of a whole new process, a new thread is created to serve the request. Like forks, sometimes threads stay around to handle subsequent requests. The difference in performance and footprint is simply the difference of threads vs forks. Depending on how much of the process's memory is unrelated to servicing a client (and prone to changing), it's usually better not to clone the entire address space. The only added complexity here is synchronization.
A single process (aka single threaded) server will fork only once to daemonize. It will not spawn new threads, it will not spawn child processes. It will continue to poll() the socket to find out when the file descriptor is ready to receive data, or has data available to be processed. Data for each connection is kept in its own structure, identified by various states (writing, waiting for ACK, reading, closing, etc). This can be an extremely efficient design, if done properly. Instead of having multiple children or threads blocking while waiting to do work, you have a single process and event loop servicing requests as they are ready.
There are instances where single threaded services spawn multiple threads; however, the additional threads aren't working on servicing incoming requests. One might (for instance) set up a local socket in a thread that allows an administrator to obtain a status of all connections.
A little googling for non blocking http server will yield some interesting hand rolled web servers written as code golf challenges.
In short, the difference is what happens once the endless loop is entered, not just the endless loop :)
In a manner of speaking, yes. A server is simply something that "loops forever" and serves. However, typically you'll find that "daemons" do things like open STDOUT and STDERR onto file handles or /dev/null along with double forks among other things. Your code is a very simplistic "server" in a sense.

Resources