ZeroMQ "lazy pirate pattern" fairly servicing multiple clients - c

I need an architecture for a single server reliably servicing multiple clients, with clients responding to an unresponsive server along the lines of the lazy pirate pattern from the 0MQ guide (i.e., they use zmq_poll to poll for replies; if the timeout elapses, disconnect and reconnect the client socket and resend the request).
I took the "lazy pirate pattern" as a starting point, from the ZMQ C language examples directory (lpclient.c and lpserver.c). Removed the simulated failure stuff from lpserver.c so that it would run normally without simulating crashes, as follows:
Server has a simple loop:
Read next message from the socket
Do some simulated work (1 second sleep)
Reply that it has serviced the request
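Roughly, the stripped-down server looks like this (a sketch rather than the exact lpserver.c; the endpoint and buffer size are placeholders):

// Minimal REP server sketch: receive a request, do 1 second of
// simulated work, send a reply. Endpoint and buffer size are placeholders.
#include <zmq.h>
#include <unistd.h>

int main(void)
{
    void *ctx = zmq_ctx_new();
    void *server = zmq_socket(ctx, ZMQ_REP);
    zmq_bind(server, "tcp://*:5555");

    while (1) {
        char request[256];
        int len = zmq_recv(server, request, sizeof(request), 0);
        if (len < 0)
            break;                          /* context terminated or error */
        if (len > (int)sizeof(request))
            len = sizeof(request);          /* message was truncated */

        sleep(1);                           /* simulated work */

        zmq_send(server, request, len, 0);  /* echo the request back as the reply */
    }

    zmq_close(server);
    zmq_ctx_destroy(ctx);
    return 0;
}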
Client has a simple loop:
Send request to server
Run zmq_poll to check for response with some set timeout value
If the timeout has elapsed, disconnect and reconnect to reset the connection, and resend the request at the start of the next iteration of the loop
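And the client loop, roughly (again a sketch rather than the exact lpclient.c; the endpoint is a placeholder, the 2.5 second timeout is the value I'm using, and for brevity it retries forever instead of giving up after a fixed number of retries):

// Lazy-pirate-style client sketch: send a request, poll for the reply
// with a timeout, and rebuild the REQ socket on timeout before resending.
#include <zmq.h>
#include <stdio.h>
#include <string.h>

#define ENDPOINT   "tcp://localhost:5555"
#define TIMEOUT_MS 2500

static void *connect_socket(void *ctx)
{
    void *s = zmq_socket(ctx, ZMQ_REQ);
    int linger = 0;
    zmq_setsockopt(s, ZMQ_LINGER, &linger, sizeof(linger));
    zmq_connect(s, ENDPOINT);
    return s;
}

int main(void)
{
    void *ctx = zmq_ctx_new();
    void *client = connect_socket(ctx);
    const char *request = "hello";

    while (1) {
        zmq_send(client, request, strlen(request), 0);

        zmq_pollitem_t items[] = { { client, 0, ZMQ_POLLIN, 0 } };
        zmq_poll(items, 1, TIMEOUT_MS);

        if (items[0].revents & ZMQ_POLLIN) {
            char reply[256];
            int len = zmq_recv(client, reply, sizeof(reply) - 1, 0);
            if (len > (int)sizeof(reply) - 1)
                len = sizeof(reply) - 1;         /* reply was truncated */
            if (len >= 0) {
                reply[len] = '\0';
                printf("got reply: %s\n", reply);
            }
            break;      /* in the real client, send the next request instead */
        }

        /* Timed out: the REQ socket is now stuck mid-conversation, so
           discard it and reconnect before resending on the next pass. */
        zmq_close(client);
        client = connect_socket(ctx);
    }

    zmq_close(client);
    zmq_ctx_destroy(ctx);
    return 0;
}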
This worked great for one or two clients. I then tried to service 20 clients by running them like:
$ ./lpserver &
$ for i in {1..20}
do
./lpclient &
done
The behaviour I get is:
Clients all send their requests and begin polling for replies.
Server does one second work on first message it gets, then replies
First client gets its response back and sends a new request
Server does one second work on second message it gets, then replies
Second client gets its response back and sends a new request
Server receives third client's request, but third client times out before work completes (2.5 second timeout, server work period is 1 second, so on the third request clients start dropping out).
Multiple clients (fourth through Nth) timeout and resend their requests.
Server keeps processing the now-defunct requests from the incoming message queue and doing the work for them, which hogs the server: with all of the defunct messages it takes 20 seconds to get through each round of the queue, so eventually every client times out.
Eventually all clients are dead and the server is still spitting out responses to defunct connections. This is terrible, because the server keeps responding to requests the client has given up on (and therefore shouldn't assume the work has been done), and spending all this time servicing dead requests guarantees that all future client requests will time out.
This example was presented as a way to handle multiple clients and a single server, but it simply doesn't work (I mean, if you did very quick work and had a long timeout, you would have some illusion of reliability, but it's pretty easy to envision this catastrophic collapse rearing its head under this design).
So what's a good alternative? Yes, I could shorten the time required to do work (spinning off worker threads if needed) and increase the timeout period, but this doesn't really address the core shortcoming - just reduces its likelihood - which isn't a solution.
I just need a simple request / reply pattern that handles multiple clients and a single server that processes requests serially, in the order they're received, but in which clients can time out reliably in the event that the server is taking too long and the server doesn't waste resources responding to defunct requests.

Related

Make clients wait until the rest of the clients write to server

I have a multithreaded TCP server that waits for a number of predefined clients to write something to the server and, based on those requests, writes a message back to all the clients. I'm stuck at the part where the clients that have already sent a message must wait until all the clients have sent their respective messages. How can I do this? I attempted to write two different thread functions, the first one calling the second, but I'm not sure if this is the right way. Is there a way to make the clients wait until the server writes to all of them?

Failed Publish when subscribed to same topic as publisher?

I am currently working on an embedded C project using MQTT 3.1.1 and Mosquitto broker 1.4.3. The issue I have is that when the client board is publishing to, and subscribed to, the same topic, the client blocks after a random number of messages and the connection times out.
I am trying to send a 25-byte string message over a 3G network, using QoS 2 on both pub and sub. I have tried different keepalive settings on the client (15s <-> 120s) and a delay between each message (2000ms <-> 300000ms), and I have tried different settings on the broker too, but nothing seems to work. Is it possible to send messages using QoS 2 over a 3G network, or am I expecting too much?
We want to guarantee the transfer of some data that is critical, so if this is not possible with MQTT, is there a better alternative?
A keepalive of 120ms sounds bogus.
Keepalive is there for the broker to detect that a client may have gone missing, without having to wait for the TCP connection to time out. You would typically use a keepalive in the range of seconds, if not minutes.
With a keepalive of 120ms, you have to send a PING packet at least every 100ms or so (or do any other MQTT exchange in that time frame), so it might explain why you are introducing so much latency in your scenario – and probably killing your 3G data plan too ;-)
I suggest you start using a keep-alive of 30s to see if that improves things.
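For illustration, a rough sketch of a QoS 2 publish with a 30 s keepalive using libmosquitto (which may or may not be the client library on your board; the broker host, client id, topic and payload are placeholders, and note that the keepalive argument to mosquitto_connect() is in seconds):

// Sketch: QoS 2 publish with a 30-second keepalive using libmosquitto.
#include <mosquitto.h>
#include <string.h>

int main(void)
{
    mosquitto_lib_init();
    struct mosquitto *mosq = mosquitto_new("board-client", true, NULL);

    /* The last argument is the keepalive in seconds, not milliseconds. */
    mosquitto_connect(mosq, "broker.example.com", 1883, 30);

    const char *payload = "25-byte status message";
    mosquitto_publish(mosq, NULL, "devices/board1/status",
                      (int)strlen(payload), payload, 2 /* QoS 2 */, false);

    /* Run the network loop long enough for the QoS 2 handshake
       (PUBLISH/PUBREC/PUBREL/PUBCOMP) and keepalive pings to be handled. */
    for (int i = 0; i < 50; i++)
        mosquitto_loop(mosq, 100 /* ms */, 1);

    mosquitto_disconnect(mosq);
    mosquitto_destroy(mosq);
    mosquitto_lib_cleanup();
    return 0;
}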

Minimising client processing - c socket programming

I am working on a client/server model based on Berkeley sockets and have almost finished, but I'm stuck on a way to know that all of the data has been received whilst minimising the processing executed on the client side.
The client I am working with has very little memory and battery and is to be deployed in remote conditions. This means that wherever possible I am trying to avoid processing (and therefore battery loss) on the client side. The following conditions on the client are outside of my control:
The client sends its data 1056 bytes at a time until it has run out of data to send (I have no idea where the number 1056 came from, but if you think you know, I would be very interested)
The client is very unpredictable in when it will send the data (it is attached to a wild animal and sends data determined by connection strength and battery life)
The client has an unknown amount of data to send at any given time
The data is transmitted through a GRSM enabled phone tag (not sure that this is relevant, but I'm assuming extra information can only help)
(I am emulating the data I expect to receive from the client over localhost; if it seems to work I will ask the company where I am interning to invest in a static IP address to allow "real" TCP transfers, and if it doesn't, I won't. I don't think this is relevant but, again, I would rather provide too much information than too little.)
At the moment I am using a while loop and incrementing the number of bytes received in order to recv() each of the 1056-byte sections. My problem is that the server needs to receive an unknown number of these. To me, the most obvious solutions are to send the number of sections to be received in an initial header from the client, or to mark the last section being sent in some way. However, both of these approaches would require processing on the client side. I was wondering whether there is a way to check from the server side that the client has closed its socket? Or whether something like closing the connection from the server after a pre-determined period without data from the client would be feasible? If these aren't possible then I would love to hear any other suggestions.
TLDR: What condition can I use here to minimise client-side processing?
while (!(/* client has run out of data to send */)) {
    receive1056Section();
}
Also, I know that it is bad practice to make a Stack Overflow account and immediately ask a question; I didn't know what else to do, I'm sorry. Please don't hesitate to be mean if I've missed something very obvious.
Here is a suggestion for how to do the interaction:
The client:
Client connects to server via tcp.
Client sends chunks of data until all data has been sent. Flush the send buffer after each chunk.
When it is done the client issues a shutdown on the socket, sleeps for a couple of seconds and then closes the connection.
The client then sleeps until the next transmission. If the transmission was unsuccessful, the sleep time should be shorter to prevent unsent data from overflowing the available memory.
If the client is unable to connect for an extended period of time, you would have to discard data that doesn't fit in the memory.
I am assuming that sleep reduces power consumption.
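A rough sketch of that client side, assuming plain Berkeley sockets (the server address, port, chunk size and data source are placeholders):

// Client sketch: connect, send the buffered data in fixed-size chunks,
// then shut down the write side so the server sees a clean end-of-stream.
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

#define CHUNK 1056

int send_all_data(const char *data, size_t total)
{
    int sock = socket(AF_INET, SOCK_STREAM, 0);

    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(5000);
    inet_pton(AF_INET, "192.0.2.10", &addr.sin_addr);

    if (connect(sock, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(sock);
        return -1;                  /* failed: keep the data and retry later */
    }

    size_t sent = 0;
    while (sent < total) {
        size_t n = total - sent < CHUNK ? total - sent : CHUNK;
        if (send(sock, data + sent, n, 0) < 0)
            break;
        sent += n;
    }

    shutdown(sock, SHUT_WR);        /* tell the server there is no more data */
    sleep(2);                       /* brief pause before closing, as suggested above */
    close(sock);
    return sent == total ? 0 : -1;
}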
The server:
The server program can be single-threaded unless you need massive scalability. It listens for incoming connections on the agreed port.
Whenever a client connects, a new socket is created.
Use select() to see which sockets have data (don't forget to include the listening socket!), and non-blocking reads to read from the sockets.
When you get the appropriate indication (no more data to read and the other side has shut down its side of the connection, i.e. recv() returns 0), you can close that socket.
This should work fine up to a couple of thousand simultaneous connections.
Example that handles many of the difficulties of implementing a server
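A rough sketch of the select() loop described above (port and buffer size are placeholders, error handling is mostly omitted):

// Server sketch: select() over the listening socket plus all client
// sockets; recv() returning 0 means that client has shut down its side
// and the transfer is complete.
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/select.h>
#include <netinet/in.h>
#include <unistd.h>

int main(void)
{
    int listener = socket(AF_INET, SOCK_STREAM, 0);

    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(5000);
    bind(listener, (struct sockaddr *)&addr, sizeof(addr));
    listen(listener, 16);

    fd_set master;
    FD_ZERO(&master);
    FD_SET(listener, &master);
    int maxfd = listener;

    for (;;) {
        fd_set readable = master;
        if (select(maxfd + 1, &readable, NULL, NULL, NULL) < 0)
            break;

        for (int fd = 0; fd <= maxfd; fd++) {
            if (!FD_ISSET(fd, &readable))
                continue;

            if (fd == listener) {
                /* New client: add its socket to the watched set. */
                int client = accept(listener, NULL, NULL);
                FD_SET(client, &master);
                if (client > maxfd)
                    maxfd = client;
            } else {
                char buf[1056];
                ssize_t n = recv(fd, buf, sizeof(buf), 0);
                if (n <= 0) {
                    /* 0 = the client shut down its side, <0 = error;
                       either way this transfer is over. */
                    close(fd);
                    FD_CLR(fd, &master);
                } else {
                    /* store/process the n bytes received here */
                }
            }
        }
    }
    return 0;
}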

How to handle when a Client or Server is Down in a UDP Application

I am developing a Windows application for client/server communication using UDP, but since UDP is connectionless, whenever a client goes down the server does not know that the client is off and keeps sending data. The same applies when the server is down.
How can I cater for the condition that whenever either the client or the server is down, the other party knows it and can handle it?
Waiting for reply.
What you are asking is beyond the scope of UDP. You'd need to implement your own protocol, over UDP, to achieve this.
One simple idea could be to periodically send keepalive messages (TCP on the other hand has this feature).
You can have a simple implementation as follows:
Have a background thread keep sending those messages and waiting for replies.
Upon receiving replies, you can populate some sort of data structure or a file with a list of alive devices.
Your other main thread (or threads) can have the following changes:
Before sending any data, check if the client you're going to send to is present in that file/data structure.
If not, skip this client.
Repeat the above for all remaining clients in the populated file/data structure.
One problem I can see in the above implementation is analogous to the RAW hazard from the main thread's perspective.
For the RAW hazard, use the following analogy instead of the usual example:
i1 = Your background thread which sends the keepalive messages.
i2 = Your main thread (or threads) which send/receive data and do your other tasks.
The RAW hazard here would be when i2 tries to read the data structure/file which is populated by i1 before i1 has updated it.
This means (worst case), i2 will not get the updated list and it can miss out a few clients this way.
If this loss would be critical, I suggest having some mechanism whereby i1 signals i2 when it completes any ongoing writing (one such mechanism is sketched below).
If this loss is not critical, then you can skip the above mechanism to make your program faster.
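As a sketch, one such mechanism is simply to guard the alive list with a mutex, so i2 never reads a half-written update (the list layout and the names here are hypothetical):

// i1 rewrites the alive list under a mutex; i2 reads it under the same
// mutex, so it never observes a partially updated list.
#include <pthread.h>
#include <stdbool.h>
#include <netinet/in.h>

#define MAX_CLIENTS 64

static struct sockaddr_in alive_list[MAX_CLIENTS];
static int alive_count = 0;
static pthread_mutex_t alive_lock = PTHREAD_MUTEX_INITIALIZER;

/* i1: publish the results of a keepalive round */
void publish_alive_list(const struct sockaddr_in *found, int n)
{
    pthread_mutex_lock(&alive_lock);
    alive_count = n < MAX_CLIENTS ? n : MAX_CLIENTS;
    for (int i = 0; i < alive_count; i++)
        alive_list[i] = found[i];
    pthread_mutex_unlock(&alive_lock);
}

/* i2: check whether a client replied to the last keepalive round */
bool client_is_alive(const struct sockaddr_in *client)
{
    bool alive = false;
    pthread_mutex_lock(&alive_lock);
    for (int i = 0; i < alive_count; i++)
        if (alive_list[i].sin_addr.s_addr == client->sin_addr.s_addr &&
            alive_list[i].sin_port == client->sin_port)
            alive = true;
    pthread_mutex_unlock(&alive_lock);
    return alive;
}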
Explanation for Keepalive Messages:
You just need to send a very lightweight message (usually has no data. Just the header information). Make sure this message is unique. You do not want another message being interpreted as a keepalive message.
You can send this message using a sendto() call to a broadcast address. After you finish sending, wait for replies for a certain timeout using recv().
Log every reply in a data structure/file. After the timeout expires, have the thread go to sleep for some time. When that time expires, repeat the above process.
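A rough sketch of one keepalive round, assuming IPv4 and BSD-style sockets (the Winsock calls are close equivalents; the port, probe contents and timings are placeholders):

// Keepalive sender: broadcast a small probe, collect replies until the
// receive timeout fires, then sleep before the next round.
#include <sys/socket.h>
#include <sys/time.h>
#include <netinet/in.h>
#include <unistd.h>

void *keepalive_thread(void *arg)
{
    (void)arg;
    int sock = socket(AF_INET, SOCK_DGRAM, 0);

    int on = 1;
    setsockopt(sock, SOL_SOCKET, SO_BROADCAST, &on, sizeof(on));

    struct timeval tv = { .tv_sec = 2, .tv_usec = 0 };      /* reply window */
    setsockopt(sock, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

    struct sockaddr_in bcast = {0};
    bcast.sin_family = AF_INET;
    bcast.sin_port = htons(9999);
    bcast.sin_addr.s_addr = htonl(INADDR_BROADCAST);

    for (;;) {
        const char probe[] = "KEEPALIVE";                    /* unique marker */
        sendto(sock, probe, sizeof(probe), 0,
               (struct sockaddr *)&bcast, sizeof(bcast));

        /* Collect replies until the 2 s receive timeout expires. */
        for (;;) {
            char buf[64];
            struct sockaddr_in from;
            socklen_t fromlen = sizeof(from);
            if (recvfrom(sock, buf, sizeof(buf), 0,
                         (struct sockaddr *)&from, &fromlen) < 0)
                break;                       /* timeout: this round is done */
            /* record 'from' in the alive list / file here */
        }

        sleep(10);                           /* sleep before the next round */
    }
    return NULL;
}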
To help you get started writing good, robust networking code, please go through Beej's Guide to Network Programming. It is absolutely wonderful. It explains many concepts.

When SQL_ATTR_QUERY_TIMEOUT is exceeded, does SQL Server stop work?

When calling SQL Server from a client using ODBC, if a long-running query is run causing the time specified in SQL_ATTR_QUERY_TIMEOUT to be exceeded, I see that control is returned to the application. My question is does the work continue within the SQL Server engine. If it does continue what can be done to abort/cancel/stop the request on the server? Is there a best practice consideration to keep in mind?
The client sends an Attention signal to the server:
The client can interrupt and cancel the current request by sending an Attention message. This is also known as out-of-band data, but any TDS packet that is currently being sent MUST be finished before sending the Attention message. After the client sends an Attention message, the client MUST read until it receives an Attention acknowledgment.
The engine will abort the batch at the first opportunity (for all practical purposes, right away) and send back the Attention ack. In certain states a batch cannot be interrupted, e.g. while rolling back a transaction. In such a case a client may request an abort, but the response will come only after the server finishes the non-interruptible work.
The above is true for any SQL Server client stack; e.g., SqlCommand.CommandTimeout works in exactly the same way.
The KILL command works in a very similar manner, except that it is not a client-server communication but a killing-SPID -> victim-SPID communication.
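On the ODBC side specifically, here is a rough sketch of setting the timeout and detecting that it fired (the query text is a placeholder; HYT00 is the generic ODBC "timeout expired" SQLSTATE):

// Sketch: set SQL_ATTR_QUERY_TIMEOUT on a statement handle. When the
// driver hits the timeout it sends the Attention for you and the call
// fails with SQLSTATE HYT00; SQLCancel() triggers the same Attention
// explicitly.
#include <sql.h>
#include <sqlext.h>

SQLRETURN run_with_timeout(SQLHSTMT hstmt)
{
    /* 30-second timeout; 0 (the default) means wait indefinitely. */
    SQLSetStmtAttr(hstmt, SQL_ATTR_QUERY_TIMEOUT, (SQLPOINTER)30, 0);

    SQLRETURN rc = SQLExecDirect(hstmt,
        (SQLCHAR *)"EXEC dbo.LongRunningProc", SQL_NTS);

    if (!SQL_SUCCEEDED(rc)) {
        SQLCHAR state[6], msg[256];
        SQLINTEGER native;
        SQLSMALLINT len;
        SQLGetDiagRec(SQL_HANDLE_STMT, hstmt, 1, state, &native,
                      msg, sizeof(msg), &len);
        /* state == "HYT00" means the timeout fired; by the time the error
           is returned the driver has already sent the Attention and read
           the Attention ack, so the batch is cancelled on the server. */
    }
    return rc;
}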
