Reliable UDP implementation design issue - C

I have been working on a customization around UDP to make it reliable. I hit this design problem only after my entire program was ready and I started sending packets from source to sink.
Scenario:
I created a single thread for receiving packets; the parent thread does the sending. Since this is just a POC, I have kept the buffer and common data structures as global pointers whose memory is allocated on the heap by the parent. I protect the critical sections with a mutex.
As part of the reliability scheme I send some control packets in addition to the data packets. At any time, the client sends data packets and receives control packets from the server, whereas the server receives data packets and sends out control packets. I have used a single socket, as my understanding is that send and recv work simultaneously on a single socket and block by default.
Problem:
For testing, I send 100 packets from source to sink. Unfortunately, the receiving thread on the server side stays busy receiving packets and storing them in the buffer; the server doesn't deliver packets to the application until the parent thread gets a context switch. This adds an unacceptable delay to the overall communication.
Please help me understand what the issue is and what can be changed to improve the performance.
Thanks in advance, Kedar

Since you're using a mutex, once the receiving thread releases it after storing the packets, the other thread should be able to consume them. Perhaps you are not releasing the mutex soon enough: if the receiver holds the lock while it blocks in recv(), the parent thread never gets a chance to deliver the buffered packets.
Alternatively, let select() on the socket handle the unblock-on-receive for you.
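A minimal sketch of that select() approach, assuming a single UDP socket and hypothetical handle_control()/send_next_data() helpers standing in for your own receive handling and send logic:

/* Minimal sketch of the select() approach: one UDP socket, a short
 * timeout, and the same loop alternating between draining control
 * packets and doing the sending work. handle_control() and
 * send_next_data() are hypothetical stand-ins for your own code. */
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>

void handle_control(const char *buf, ssize_t n);   /* hypothetical */
void send_next_data(int sock);                     /* hypothetical */

void run_loop(int sock)
{
    char buf[1500];

    for (;;) {
        fd_set rfds;
        struct timeval tv = { 0, 10000 };          /* 10 ms poll interval */

        FD_ZERO(&rfds);
        FD_SET(sock, &rfds);

        if (select(sock + 1, &rfds, NULL, NULL, &tv) > 0 &&
            FD_ISSET(sock, &rfds)) {
            ssize_t n = recv(sock, buf, sizeof buf, 0);
            if (n > 0)
                handle_control(buf, n);            /* deliver right away */
        }
        send_next_data(sock);                      /* then send more data */
    }
}

With a loop like this you don't need the receive thread at all, so the mutex contention disappears along with it.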

Related

Synchronizing between UDP and TCP

I'm currently implementing a daemon that acts as 2 servers. One of the servers is receiving logs via UDP from a collection of producers. The second server broadcasts every log received from a producer to every consumer currently connected via TCP.
These are 2 separate sockets. My current (pretty basic) implementation is to select() on both sockets and handle each read event accordingly, so my code is basically (note: this is pseudocode):
for (;;) {
    FD_SET(consumers_server);
    FD_SET(producers_server);
    select();
    if consumers_server is set:
        add new client to the consumers array
    if producers_server is set:
        broadcast the log to every consumer in the array
}
This works just fine; the problem occurs when this code is put under stress. When multiple producers are sending logs (UDP), the real bottleneck is the consumers, which are TCP. Sending a log to the consumers can block, which I can't afford.
I've tried using non-blocking sockets and select()ing on the consumers' write fds; the problem is that this means saving the unsent logs in a buffer until they can be sent. That results in very inelegant, massive code, and the system is also low on resources (mainly RAM).
I'm running on a Linux distro.
An alternative approach to synchronizing these UDP and TCP connections would be welcome.
This is doomed to failure. Sooner or later you will be unable to send to the TCP consumers. Whether that manifests itself as blocking or EAGAIN/EWOULDBLOCK isn't really relevant to the underlying problem, which is that the producers are overrunning the consumers. You have to decide what to do about that. You can have a certain amount of internal buffering, but at some point you will have to stop reading from the UDP producers. At that point UDP datagrams will be dropped and your system will lose data, and of course it is liable to lose data anyway by virtue of using UDP.
Don't do this. Use TCP for the producers, or else just accept the data loss and use blocking mode. Non-blocking mode only moves the problem slightly and complicates your code.
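If you do take the "accept the data loss and use blocking mode" route, here is a sketch of what that looks like; consumers[] and nconsumers are hypothetical names for your client list:

/* Sketch of the "accept the data loss" approach: blocking send() to
 * each TCP consumer. While a send() blocks, the kernel keeps queueing
 * incoming UDP datagrams up to SO_RCVBUF and silently drops the rest,
 * which is exactly the loss being accepted here. */
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

extern int consumers[];      /* hypothetical array of connected TCP fds */
extern int nconsumers;

void broadcast_log(int udp_sock)
{
    char log[2048];
    ssize_t n = recv(udp_sock, log, sizeof log, 0);

    if (n <= 0)
        return;

    for (int i = 0; i < nconsumers; i++) {
        /* Blocking write: a slow consumer stalls the loop right here.
         * MSG_NOSIGNAL avoids SIGPIPE when a consumer has gone away. */
        if (send(consumers[i], log, n, MSG_NOSIGNAL) < 0) {
            close(consumers[i]);             /* drop a dead consumer */
            consumers[i] = consumers[--nconsumers];
            i--;
        }
    }
}

While the loop is stalled in send(), incoming datagrams simply accumulate in (and eventually overflow) the UDP socket's receive buffer, which is the bounded, code-free buffering you get in exchange for accepting the loss.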

How does a non-blocking TCP socket notify the application of packets that fail to get sent?

I'm working with non-blocking C TCP sockets on a Linux system. I've read that in non-blocking mode, send() will return the number of bytes sent immediately if there is no error. I'm guessing this return value does not actually mean that the data has been delivered to the destination, but rather that the data has been passed to kernel memory, which handles it further and sends it.
If that is the case, how would my application know which packets were really sent out by the kernel to the other end, assuming the network connection has problems and the kernel decides to give up after several retries spanning a few minutes?
I'm asking because I would want my application to resend those failed packets at a later time.
If that is the case, how would my application know which packets were really sent out by the kernel to the other end, assuming the network connection has problems and the kernel decides to give up after several retries spanning a few minutes?
Your application won't know, unless it can recontact the receiving application and ask it what data it has received.
Keep in mind that even with blocking I/O your application doesn't block until the data is received by the remote application -- it only blocks until there is some room in the kernel's outgoing-data buffer to hold the bytes you asked the TCP stack to send(). So even with blocking I/O you would face the same issue.
Also keep in mind that the byte arrays you pass to send() do not have a guaranteed 1-to-1 correspondence to the TCP packets that the TCP stack sends out. The TCP stack is free to pack your bytes into TCP packets any way it likes (e.g. the data from multiple send() calls can end up in a single TCP packet, or the data from a single send() call can end up in multiple TCP packets, or any other combination you can think of). Depending on network conditions, TCP stacks can and do pack things in various ways; the only promise is that the bytes will be received in FIFO order (if they are received at all).
Anyway, the answer to your question is: you can't know, unless you later ask the receiving program about what it got (or didn't get).
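To make the "accepted by the kernel, not delivered" distinction concrete, here is a sketch of how a non-blocking sender typically tracks its own progress; send_some() is an invented helper, not a standard call:

/* Sketch: what a non-blocking send() return value actually tells you.
 * It only reports how many bytes the kernel accepted into its outgoing
 * buffer -- nothing about delivery. The caller tracks its own offset
 * and re-calls when select()/poll() reports the fd writable. */
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

ssize_t send_some(int fd, const char *buf, size_t len, size_t *off)
{
    while (*off < len) {
        ssize_t n = send(fd, buf + *off, len - *off, 0);
        if (n < 0) {
            if (errno == EAGAIN || errno == EWOULDBLOCK)
                break;          /* kernel buffer full: retry later */
            return -1;          /* real error */
        }
        *off += n;              /* bytes accepted, not delivered */
    }
    return (ssize_t)*off;
}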
TCP takes care of retrying internally; the application doesn't need any special handling for it. If you wish to confirm that a packet reached the other end of the TCP stack, you can set the send socket buffer (setsockopt(SOL_SOCKET, SO_SNDBUF)) to zero. In that case the kernel uses your application's buffer to send the data, and the buffer is only released after TCP receives the acknowledgement for that data. This way you can confirm that the data was pushed to the receiving end of the TCP stack. It doesn't confirm that the application has received the data; you need an application-layer acknowledgement in your protocol to confirm that the data reached the receiving application.
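A minimal sketch of what such an application-layer acknowledgement might look like; the header layout and the ack_message() helper are invented for illustration and are not part of any standard API:

/* Sketch of an application-layer acknowledgement: the sender numbers
 * each message, and the receiver echoes the number back once the
 * application has actually consumed the data. The framing here is
 * invented for illustration. */
#include <arpa/inet.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>

struct msg_hdr {
    uint32_t seq;   /* sender-assigned sequence number (network order) */
    uint32_t len;   /* payload length in bytes (network order) */
};

/* Receiver side: call after processing a message, so the sender can
 * mark that sequence number as truly delivered (and stop resending). */
int ack_message(int fd, uint32_t seq)
{
    uint32_t ack = htonl(seq);
    return send(fd, &ack, sizeof ack, 0) == (ssize_t)sizeof ack ? 0 : -1;
}

The sender would keep each message buffered until the matching sequence number comes back, and only then consider it delivered to the application.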

Socket control to accept multiple UDP connections

I'm racking my brain trying to understand how to make a client/server I wrote myself accept multiple socket connections.
The connection is datagram-based (UDP); for now it is implemented following the getaddrinfo(3) man page and works nicely, but each client has to wait for earlier connections to be processed.
I've heard about select, but its man page says:
select() can be used to solve many problems in a portable and efficient way that naive programmers try to solve in a more complicated manner using threads, forking, IPCs, signals, memory sharing, and so on.
and more:
The Linux-specific epoll(7) API provides an interface that is more efficient than select(2) and poll(2) when monitoring large numbers of file descriptors.
So, is it? Is epoll simply better than select? Or does it depend? If it depends, on what?
The epoll man page has a partial example, and I'm trying to understand it.
For now, I think the server needs one thread to listen and another to write. But how do I track the completion of a partial message? If two clients send partial messages interleaved, how do I tell them apart? By the sockaddr? If that's all it takes, I can manage it without a pool, so why use epoll?
Can anyone explain to me how to build, or where to learn about, a multi-connection client/server UDP app?
I think there is a misunderstanding here about UDP. UDP is not a connection-oriented protocol, which means there is no permanent connection like TCP. UDP just binds to an address/port and waits for packets from everyone. On the server there is only one socket listening per address/port number. When a packet is received you can find out who the sender is from the packet's source address, and you can reply to the sender through that same address.
As I see it, there is no need for poll() or select(): you bind to an address/port and receive packets asynchronously. That is, when a packet is received you get a signal/message alerting your asynchronous handler. This handler should be reentrant, meaning that in the middle of one reception another signal could arrive, so care must be taken when accessing or modifying global state (variables/objects). An incoming packet should be processed as soon as possible; if processing takes too long, you are better off keeping the packet in a spool and processing it in another, lower-priority thread.
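A minimal sketch of that single-socket UDP server, assuming IPv4 and an arbitrary port; the "reply" here just echoes each datagram back to whoever sent it:

/* Sketch of a single-socket UDP server: recvfrom() hands you each
 * datagram together with its sender's address, and sendto() with that
 * same address is all a reply takes. Port 5000 is an arbitrary choice
 * for illustration; error handling is trimmed for brevity. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>

void serve(void)
{
    int sock = socket(AF_INET, SOCK_DGRAM, 0);
    struct sockaddr_in addr = { 0 };

    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(5000);
    bind(sock, (struct sockaddr *)&addr, sizeof addr);

    for (;;) {
        char buf[1500];
        struct sockaddr_in peer;
        socklen_t plen = sizeof peer;

        ssize_t n = recvfrom(sock, buf, sizeof buf, 0,
                             (struct sockaddr *)&peer, &plen);
        if (n < 0)
            continue;
        /* 'peer' identifies the client: use it as the key for any
         * per-client state, and to address the reply. */
        sendto(sock, buf, n, 0, (struct sockaddr *)&peer, plen);
    }
}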
For UDP packet size read this question.
For UDP fragmentation read this
For UDP packet header read this

How to handle when a Client or Server is Down in a UDP Application

I am developing a Windows application for client/server communication using UDP. Since UDP is connectionless, whenever a client goes down the server does not know the client is off and keeps sending data; the same happens when the server goes down.
How can I handle this condition so that whenever either the client or the server is down, the other party knows it and can deal with it?
Waiting for reply.
What you are asking is beyond the scope of UDP. You'd need to implement your own protocol, over UDP, to achieve this.
One simple idea could be to periodically send keepalive messages (TCP on the other hand has this feature).
You can have a simple implementation as follows:
Have a background thread keep sending those messages and waiting for replies.
Upon receiving replies, you can populate some sort of data structure or a file with a list of alive devices.
Your other main thread (or threads) can have the following changes:
Before sending any data, check if the client you're going to send to is present in that file/data structure.
If not, skip this client.
Repeat the above for all remaining clients in the populated file/data structure.
One problem I can see in the above implementation is analogous to a RAW (read-after-write) hazard, from the main thread's perspective.
Using the two threads as the analogy for the RAW hazard:
i1 = Your background thread which sends the keepalive messages.
i2 = Your main thread (or threads) which send/receive data and do your other tasks.
The RAW hazard here would be when i2 tries to read the data structure/file which is populated by i1 before i1 has updated it.
This means (worst case), i2 will not get the updated list and it can miss out a few clients this way.
If this loss would be critical, I suggest adding a mechanism whereby i1 signals i2 when it completes any ongoing write.
If this loss is not critical, then you can skip the above mechanism to make your program faster.
Explanation for Keepalive Messages:
You just need to send a very lightweight message (it usually carries no data, just the header information). Make sure this message is unique: you do not want another message being interpreted as a keepalive message.
You can send this message using a sendto() call to a broadcast address. After you finish sending, wait for replies for a certain timeout using recv().
Log every reply in a data structure/file. After the timeout expires, have the thread go to sleep for some time. When that time expires, repeat the above process.
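A sketch of one keepalive round along those lines; the magic string, the timeouts and the mark_alive() helper are invented for illustration, and the socket needs SO_BROADCAST enabled beforehand:

/* Sketch of one keepalive round: broadcast a magic datagram, collect
 * replies until a receive timeout expires, then sleep until the next
 * round. All constants here are illustrative assumptions. */
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

#define KEEPALIVE_MAGIC "KA?1"  /* must not collide with real data */

void mark_alive(const struct sockaddr_in *peer);   /* hypothetical */

void keepalive_round(int sock, const struct sockaddr_in *bcast)
{
    struct timeval tv = { 1, 0 };                  /* 1 s reply window */
    setsockopt(sock, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);

    sendto(sock, KEEPALIVE_MAGIC, sizeof KEEPALIVE_MAGIC, 0,
           (const struct sockaddr *)bcast, sizeof *bcast);

    for (;;) {
        char buf[64];
        struct sockaddr_in peer;
        socklen_t plen = sizeof peer;

        if (recvfrom(sock, buf, sizeof buf, 0,
                     (struct sockaddr *)&peer, &plen) < 0)
            break;                                 /* timeout: round over */
        mark_alive(&peer);                         /* record as alive */
    }
    sleep(5);                                      /* doze until next round */
}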
To help you get started writing good, robust networking code, please go through Beej's Guide to Network Programming. It is absolutely wonderful. It explains many concepts.

Send same info to multiple threads/sockets?

I am writing a server application that simply connects a local serial port to multiple network-connected clients. I am using Linux and C for the server application because the target equipment is a router with limited memory.
I have everything set up for multiple clients to connect and send data to the serial port, using a fork()ed process for each connection.
My problem lies in getting data incoming on the serial port out to the multiple (varying in number) client connections: I need a way for each active socket to get all of the incoming data, and to get it only once. Any help?
Sounds like you need a data queue (buffer) for each connected client. Each time data comes in on the port, you post it to the back of each client's queue. The clients then read the data from the front of their respective queues. Since all the clients will probably read at different rates/times, this will ensure all of them get a copy of the data only once, and you won't get hung up waiting for any one client while more data comes in. Of course, you'll need to allocate a certain amount of memory for each connected client's queue (I'm not sure how many clients you're expecting, and you did say your available memory is limited), and you need to consider what to do if a queue gets full before the client reads all of it.
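As a sketch of the per-client queue idea under tight memory, here is a fixed-size ring buffer per client; the queue size and the overwrite-oldest policy are assumptions, and you may prefer to disconnect an overflowing client instead:

/* Sketch of a per-client queue as a fixed-size ring buffer, since RAM
 * is tight. On overflow this drops the oldest data; disconnecting the
 * client is the other common policy. */
#include <stddef.h>

#define QSIZE 4096                      /* bytes buffered per client */

struct client_q {
    char buf[QSIZE];
    size_t head, tail;                  /* head: read pos, tail: write pos */
};

/* Called once per serial read, for every connected client. */
void q_push(struct client_q *q, const char *data, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        q->buf[q->tail] = data[i];
        q->tail = (q->tail + 1) % QSIZE;
        if (q->tail == q->head)         /* full: overwrite oldest byte */
            q->head = (q->head + 1) % QSIZE;
    }
}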
Presumably you keep a list or some other reference to the connected clients; why not just loop over it for each piece of data and send it to all of them?
A thread-per-socket design might not be the best way to solve this; an event-driven, asynchronous approach would be a much better fit. However, if you must do it with threads, and given that serial ports are slow anyway, building a pipe between the thread listening on the serial port and all the threads talking to the network clients is the most practical approach. You could do fancy things with rwlocks to move the data, but you would still need a way for the network threads to wait on both the socket and the data from the serial port, so you need file descriptors for both and something like poll.
But seriously, this would likely be much easier and would perform better without the threads. Think of it as a main loop which waits in poll(), watching the network and the serial port, determines which event occurred, and distributes data accordingly. It should be easier all around once you get the idea.
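A sketch of that poll()-driven main loop; the listening socket and error handling are omitted to keep the shape visible, and the 64-client cap is an arbitrary assumption:

/* Sketch of a single-threaded poll() loop: watch the serial fd and
 * every client socket in one array, fan serial data out to all
 * clients, and forward client data to the serial port. The caller is
 * assumed to keep nclients <= 64. */
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

void main_loop(int serial_fd, int *clients, int nclients)
{
    struct pollfd fds[1 + 64];          /* slot 0: serial port */

    for (;;) {
        fds[0].fd = serial_fd;
        fds[0].events = POLLIN;
        for (int i = 0; i < nclients; i++) {
            fds[1 + i].fd = clients[i];
            fds[1 + i].events = POLLIN;
        }

        if (poll(fds, 1 + nclients, -1) < 0)
            continue;

        if (fds[0].revents & POLLIN) {  /* serial data: copy to everyone */
            char buf[256];
            ssize_t n = read(serial_fd, buf, sizeof buf);
            for (int i = 0; n > 0 && i < nclients; i++)
                write(clients[i], buf, n);
        }
        for (int i = 0; i < nclients; i++) {
            if (fds[1 + i].revents & POLLIN) {
                char buf[256];          /* client data -> serial port */
                ssize_t n = read(clients[i], buf, sizeof buf);
                if (n > 0)
                    write(serial_fd, buf, n);
            }
        }
    }
}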
