Efficient way of validating a UDP communication protocol with a single server and multiple clients - C

I have developed a single-server/multiple-client UDP application, where the server can handle x clients at a time. The server has x threads, each thread dedicated to one client.
The code works perfectly fine. Now I want to check my application for all possible scenarios, i.e. validate my application. For this purpose, I need to design a test bed.
Initial Design:
The test bed I initially designed has the following functionality:
The Server GUI has a button on it. When the button is clicked, each thread in the server reads a text file, picks up a few bytes of it, and sends that chunk to its respective client. The thread then picks the next chunk of bytes from the text file, sends it to the client, and so on until EOF is reached.
The client on the other side keeps receiving these chunks of bytes, creates a text file, and keeps storing the chunks in that file.
When EOF is received from the server, the client starts sending the completely received text file back to the server over its socket.
When the file is completely received back (echoed), the server compares the two text files, the sent file and the echoed one. If both files are the same, the communication has occurred without any fault and the communication protocol is validated.
The above validation technique (sending the text file, receiving the echoed file, and then comparing the two) checks the following things:
The number of bytes sent equals the number of bytes received.
No data is corrupted.
The data is received in the proper order.
If any of these three conditions is not fulfilled, there is some error in the communication.
Now I have been asked to make changes to this test bed and add more functionality to it. Can the procedure I am using actually check the above three conditions in all scenarios?
Are there other conditions that must be checked besides the three mentioned above?
What could be other methods of checking the communication protocol besides the one I designed, i.e. sending a text file, getting it echoed back, and comparing?
I have to add more functionality to this test bed to make the validation system more efficient, or completely replace it with some better option.
Please help me with your suggestions.
Thanks in advance :)

The first two of your conditions are guaranteed by UDP. Picking "a few bytes", i.e. anything less than 65535 bytes (64 KiB isn't really "a few" bytes), will result in a single datagram being sent, and anything larger than that will fail. You will not want to max out the largest possible datagram size, though, as it will incur IP fragmentation (staying below 1280 bytes is a good idea).
You will be able to receive exactly the amount you sent or nothing at all, never more or less. UDP does not guarantee that any datagram that is sent out arrives (it cannot guarantee that, since IP does not), but it does guarantee that the entire datagram arrives as-is -- or nothing. Never anything in between.
It further guarantees that the data inside the datagram matches its checksum (the underlying protocols, including IP/Ethernet/ATM, further do their own checksumming) and thus arrives in the same binary representation as it was sent. In other words, data arrives in order (inside the datagram) and is not corrupted.
It is of course in theory possible that a bit error passes all three layers of checksums, but this is extremely unlikely and will not happen in practice. Unless you need to guard against someone maliciously tampering with packets, you do not need to worry. The kinds of bit errors that happen accidentally are reliably picked up by the checksums used in the protocols.
If, on the other hand, you do need to guard against malicious modification of your data, you must add a MAC (or a checksum and encrypt the entire packet -- adding a checksum alone is useless).
To ensure that data spanning several datagrams arrives in order, you must add sequence numbers to your packets (in the same manner TCP does). And with that, you may as well use TCP, which is likely more efficient and less error-prone. One of the main reasons to use UDP is normally that in-order delivery and reliability are not needed, or sometimes reliability is needed but in-order delivery is not.
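As a rough illustration (not from your code; the names send_chunk, recv_chunk, and CHUNK_SIZE are made up here), a sequence-number header for UDP could look like this:

#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <arpa/inet.h>

#define CHUNK_SIZE 1024   /* keeps datagrams well below the 1280-byte figure above */

/* Sender: prepend a 32-bit sequence number to each chunk. */
ssize_t send_chunk(int sock, uint32_t seq, const unsigned char *chunk,
                   size_t len, const struct sockaddr *dst, socklen_t dstlen)
{
    unsigned char buf[sizeof(uint32_t) + CHUNK_SIZE];
    uint32_t nseq = htonl(seq);   /* network byte order on the wire */

    if (len > CHUNK_SIZE)
        return -1;
    memcpy(buf, &nseq, sizeof nseq);
    memcpy(buf + sizeof nseq, chunk, len);
    return sendto(sock, buf, sizeof nseq + len, 0, dst, dstlen);
}

/* Receiver: a mismatch against the expected sequence number means a
 * datagram was lost, duplicated, or reordered. */
ssize_t recv_chunk(int sock, uint32_t *expected, unsigned char *out, size_t max)
{
    unsigned char buf[sizeof(uint32_t) + CHUNK_SIZE];
    uint32_t seq;
    ssize_t n = recv(sock, buf, sizeof buf, 0);

    if (n < (ssize_t)sizeof seq)
        return -1;
    memcpy(&seq, buf, sizeof seq);
    if (ntohl(seq) != *expected)
        return -1;   /* out of order: a real protocol would buffer or resync */
    (*expected)++;
    n -= (ssize_t)sizeof seq;
    if ((size_t)n > max)
        return -1;
    memcpy(out, buf + sizeof seq, (size_t)n);
    return n;
}

Detecting a gap is the easy part; recovering from it (buffering, acknowledgements, retransmission) is where you end up re-implementing TCP.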
In-order delivery is the main cause of TCP's latency during packet loss (in absence of packet loss, TCP is exactly as "fast" as UDP), so if this is needed, there is no sane reason not to use TCP in the first place. It is a protocol that has been fine-tuned and worked reliably for literally billions of people for 4 decades.
Also, using one socket and one thread per client is possibly not the best approach. The disk won't read any faster, and the network card won't send any faster either. UDP doesn't need a socket per client, either. With TCP you'll have no choice but to use one socket per client, but multiplexing with a readiness notification system will still give you much better performance and fewer opportunities for threading errors.
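For illustration, a minimal readiness-notification loop over a single UDP socket could be sketched with poll(2) as below; the echo is a placeholder for real dispatch logic:

#include <poll.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

void serve(int sock)
{
    struct pollfd pfd = { .fd = sock, .events = POLLIN };
    unsigned char buf[2048];

    for (;;) {
        if (poll(&pfd, 1, -1) < 0) {   /* block until the socket is readable */
            perror("poll");
            break;
        }
        if (pfd.revents & POLLIN) {
            struct sockaddr_in peer;
            socklen_t plen = sizeof peer;
            ssize_t n = recvfrom(sock, buf, sizeof buf, 0,
                                 (struct sockaddr *)&peer, &plen);
            if (n > 0) {
                /* dispatch on the peer address; echoing is a placeholder */
                sendto(sock, buf, (size_t)n, 0,
                       (struct sockaddr *)&peer, plen);
            }
        }
    }
}

With a single UDP socket, recvfrom() tells you which client each datagram came from, so per-client sockets and threads are unnecessary.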
Also, sending back a checksum such as one of the SHA family (or a MAC, if it needs to be secure) may be more efficient than echoing back the whole lot of data. The likelihood that the checksum matches while the data accidentally doesn't is negligible.
Entire revision control systems that manage millions of lines of code for millions of people (such as git) rely on the fact that this just doesn't happen to identify files (well, it does happen of course, you just won't live to see it).
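A sketch of that checksum idea, assuming OpenSSL 1.1+ is available (the helper name sha256_file is made up): the client hashes the file it received and sends back the 32-byte digest, which the server compares against its own digest with memcmp instead of diffing whole files.

#include <stdio.h>
#include <openssl/evp.h>

/* Hash a file with SHA-256; on success, digest holds 32 bytes. */
int sha256_file(const char *path, unsigned char digest[EVP_MAX_MD_SIZE],
                unsigned int *digest_len)
{
    unsigned char buf[4096];
    size_t n;
    int err;
    FILE *f = fopen(path, "rb");
    EVP_MD_CTX *ctx;

    if (!f)
        return -1;
    ctx = EVP_MD_CTX_new();
    if (!ctx) {
        fclose(f);
        return -1;
    }
    EVP_DigestInit_ex(ctx, EVP_sha256(), NULL);
    while ((n = fread(buf, 1, sizeof buf, f)) > 0)
        EVP_DigestUpdate(ctx, buf, n);
    err = ferror(f);                    /* check the stream before closing */
    EVP_DigestFinal_ex(ctx, digest, digest_len);
    EVP_MD_CTX_free(ctx);
    fclose(f);
    return err ? -1 : 0;
}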

I have a question here: why UDP, and not TCP, especially when you are worried about packet order and data corruption? As I see it (I may be wrong), UDP is good only when the data is time-sensitive, like a video stream.
Secondly, yes, there are other methods of checking the integrity of transmitted data. The simplest may be checking MD5 or SHA-1 checksums.

Can the procedure I am using actually check the above three conditions in all scenarios?
Yes.
What could be other methods of checking the communication protocol besides the one I designed, i.e. sending a text file, getting it echoed back, and comparing?
It doesn't have to be a file, but it has to be something you can check once you get the response. You could just generate some random data and hold on to it until you get the response.
You'd have to tell us what you really want to test. If you are trying to make sure that UDP doesn't give you bad data or out-of-order data, you're using the wrong protocol. By checking whether you get the exact data in the exact order you sent it over UDP, you're not testing anything except the networking infrastructure you have in place.
You say you want to test your application for "all possible scenarios", but that doesn't even mean anything. You're testing to see whether a behavior that is part of the UDP specification exists, hoping to find that it doesn't? Well, it does, even if you never see it.

Related

Sending multiple packets very fast in C

I'm trying to write a multiplayer game in C, but when I send multiple packets like "ARV 2\n\0" and "POS 2 0 0\n\0" from the server to the client (with send()) and then try to read them with recv(), the client only finds one packet, which appears to be the two packets combined into one.
So I'm asking: is that normal? And if so, how can I force my client to read the packets one by one (or my server to send them one by one, if the problem comes from the call to send())?
Thanks!
Short answer: yes, this is normal. You are using TCP/IP, I assume. It is a byte-stream protocol; there are no "packets". The network and the OS on either end may combine and split the data you send in any way that fits into buffers or parts of the network. The only thing guaranteed is that you get the same bytes in the same order.
You need to do your own packet framing. For a text protocol, separate packets with, for example, '\0' bytes or newlines. Also note that the network or OS may give you partial packets per single read, so you need to handle that in your code as well. This is easiest if the packet separator is a single byte.
For a binary protocol, where there are no "unused" byte values to mark packet boundaries, you can write the length of the packet as binary data, then that many data bytes, then again length, data, and so on. Note that the stream may get split across different read calls even in the middle of the length field (unless the length is a single byte), so you may need a few more lines of code to handle receiving split packets, as sketched below.
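A minimal sketch of that length-prefix scheme (the 4-byte big-endian header and the helper names read_exact/read_message are assumptions, not a standard):

#include <stdint.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <arpa/inet.h>

/* Loop because recv() may return fewer bytes than requested. */
static int read_exact(int sock, void *buf, size_t len)
{
    unsigned char *p = buf;
    while (len > 0) {
        ssize_t n = recv(sock, p, len, 0);
        if (n <= 0)
            return -1;   /* error, or peer closed mid-message */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Read one length-prefixed message; returns payload length or -1. */
ssize_t read_message(int sock, unsigned char *payload, size_t max)
{
    uint32_t netlen, len;

    if (read_exact(sock, &netlen, sizeof netlen) < 0)
        return -1;
    len = ntohl(netlen);
    if (len > max)
        return -1;   /* message too large for the caller's buffer */
    if (read_exact(sock, payload, len) < 0)
        return -1;
    return (ssize_t)len;
}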
Another option would be to use the UDP protocol, which indeed sends packets. But UDP packets may get lost or delivered in the wrong order (and have a few other problems), so you need to handle that somehow, and this often results in you re-inventing TCP, poorly. So unless you notice that TCP/IP just won't cut it, stick with it.

One Socket Multiple Threads

I'm coding part of a somewhat complex communication protocol to control multiple medical devices from a single computer terminal. The terminal needs to manage about 20 such devices. Every device uses the same communication protocol, called DEP. I've created a loop that multiplexes between the different devices to send requests and receive the patient data associated with a particular device. The structure of this loop, in general, is something like this:
Begin Loop
    Select Device i
        if Device.Socket has Data
            Strip Header
            Copy Data to Queue
        end if
        rem_time = TIMEOUT - (CurrentTime - Device.Session.LastRequestTime)
        if rem_time <= 0
            Send Re-association Request to Device
        else
            Sort Pending Requests According to Time
            Select First Request
            Send the Request
            Set Request Priority Least
        end if
    end Select
end Loop
I might have made some mistakes in the above pseudo-code, but I hope I've made clear what this loop is trying to do. I have a priority-list structure that selects the device and the pending request for that device, so that all the requests and devices are serviced at good, optimal intervals.
I forgot to mention that the above loop does not actually parse the received data; it only strips off the header and puts the data in a queue. The data in the queue is parsed in a different thread and recorded in a file or database.
I wish to add a feature so that other computers may also import the data and control the devices attached to the computer terminal remotely. For this, I would need to create a socket that listens for commands in this INFINITE LOOP and sends the data to a different thread where PARSING is performed.
Now, my question to all the concurrency experts is that:
Is it a good design to use a single socket for reading and writing in two different threads, where each thread is strictly involved in either reading or writing, not both? Also, I believe a socket is synchronized at the process level, so do I need locks to synchronize reads and writes over one socket from different threads?
There is nothing inherently wrong with having multiple threads handle a single socket; however, there are many good and bad designs based around this one very general idea. If you do not want to rediscover the problems as you code your application, I suggest you search around for designs that best fit the particular style of packet handling you plan.
There is also nothing inherently wrong with having a single thread handle a single socket; however, if you put the logic handling on that thread, then you have selected a bad design, as then that thread cannot handle requests while it is "working" on the last request.
In your particular code, you might have an issue. If your packets support fragmentation, or even if your algorithm gets a little ahead of the hardware due to timing issues, you might have just part of a packet "received" in the buffer. In that case, your algorithm will fail in two ways:
It will process a partial packet, one which has only the first part of its data.
It will mis-process the subsequent packet, as the information in the buffer will not start with a valid packet header.
Such failures are difficult to conceive of and diagnose until they are encountered. Perhaps your library already buffers and splits messages, perhaps not.
In short, your design is not dictated by how many threads are accessing your socket: how many threads access your socket is dictated by your design.
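For illustration only (the split itself, not your DEP protocol), a bare-bones reader/writer pair on one connected socket might look like the sketch below. On common POSIX systems each recv()/send() call is atomic with respect to other calls on the same socket, so with exactly one reading thread and one writing thread no lock on the socket itself is needed; only the queue between the threads must be synchronized (omitted here).

#include <stdint.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/socket.h>

static void *reader(void *arg)
{
    int sock = (int)(intptr_t)arg;
    unsigned char buf[1024];
    ssize_t n;

    while ((n = recv(sock, buf, sizeof buf, 0)) > 0) {
        /* strip the header, push the payload onto a locked queue ... */
    }
    return NULL;
}

static void *writer(void *arg)
{
    int sock = (int)(intptr_t)arg;
    const char req[] = "REQ";   /* placeholder for a pending request */

    /* pop pending requests from a queue and send them ... */
    send(sock, req, sizeof req - 1, 0);
    return NULL;
}

void start_io_threads(int sock)
{
    pthread_t rt, wt;
    pthread_create(&rt, NULL, reader, (void *)(intptr_t)sock);
    pthread_create(&wt, NULL, writer, (void *)(intptr_t)sock);
}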

Using send() twice for sending different types of data

I have a client which connects to a server and tries to send() some data. However, there are two types of data that I need to send, let's say information about the weather and the current time (just examples).
The problem is: in the client I'm calling send() twice, once to send the weather info and once to send the current time, and in the server I'm looping on recv().
What I expected (and built my code around) is that the first time the server calls recv() it would get only the weather info, and on the second call to recv() the time; however, a single call to recv() is enough for both pieces of data to be received into the same buffer.
While that may not be a problem in itself, I've built my program around that assumption, and I just wanted to know if there is a way to achieve what I want (I thought of a sleep() between the two send() calls, but that may be unreliable), so that I can save time rewriting code.
If anyone knows a way, it would save me quite some time, so I appreciate any help.
There is no alternative to a proper message protocol on top of TCP. TCP only transfers a stream of octets (bytes). TCP cannot transfer messages, structs, or objects.
If you've built a large program around the assumption that TCP can transfer messages on its own, you are in trouble.
Sleep() and timer bodges will just not work in any sort of reliable or performant way. You must do it properly and implement a protocol on top of TCP, e.g. by sending a header containing the data length, or by using start/end bytes and escaping either byte wherever it appears inside the data. A sender-side sketch of the length-header variant follows.
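This sketch assumes a 4-byte big-endian length header; the names send_all and send_message are made up for illustration:

#include <stdint.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <arpa/inet.h>

/* Loop because send() may accept fewer bytes than asked. */
static int send_all(int sock, const void *buf, size_t len)
{
    const unsigned char *p = buf;
    while (len > 0) {
        ssize_t n = send(sock, p, len, 0);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* One logical message = 4-byte big-endian length + payload. */
int send_message(int sock, const void *payload, uint32_t len)
{
    uint32_t netlen = htonl(len);
    if (send_all(sock, &netlen, sizeof netlen) < 0)
        return -1;
    return send_all(sock, payload, len);
}

With this, sending the weather and the time as two separate messages keeps them separate for the receiver, no matter how the stream is segmented in transit.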

Can one call of recv() receive data from 2 consecutive send() calls?

I have a client which sends data to a server with 2 consecutive send() calls:
send(_sockfd,msg,150,0);
send(_sockfd,msg,150,0);
and the server starts receiving once the first send() call has been made (let's say I'm using select()):
recv(_sockfd,buf,700,0);
Note that the buffer I'm receiving into is much bigger than the messages.
My question is: is there any chance that buf will contain both msgs? Or do I need 2 recv() calls to get both msgs?
Thank you!
TCP is a stream-oriented protocol, not a message/record/chunk-oriented one. That is, all that is guaranteed is that if you send a stream, the bytes will get to the other side in the order you sent them. No provision is made by RFC 793 or any other document about the number of segments/packets involved.
This is in stark contrast with UDP. As @R.. correctly said, in UDP an entire message is sent in one operation (notice the change in terminology: message). Try to send a giant message (several times larger than the MTU) with TCP? It's okay, it will be split for you.
When running on local networks or on localhost you will certainly notice that (generally) one send == one recv. Don't assume that. There are factors that can change this dramatically, among them:
Nagle
Underlying MTU
Memory usage (possibly)
Timers
Many others
Of course, not having a correspondence between a send and a recv is a nuisance, and you can't always fall back on UDP. That is one of the reasons for SCTP. SCTP is a really, really interesting protocol, and it is message-oriented.
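For what it's worth, opening an SCTP socket looks almost like TCP. This sketch assumes a Linux system with SCTP support (e.g. the lksctp kernel module); availability varies by platform:

#include <sys/socket.h>
#include <netinet/in.h>

int open_sctp_socket(void)
{
    /* SOCK_STREAM with IPPROTO_SCTP gives a one-to-one association:
     * connected and reliable like TCP, but message boundaries are kept. */
    return socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP);
}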
Back to TCP, this is a common nuisance. An equally common solution is this:
Establish that all packets begin with a fixed-length sequence (say 32 bytes)
These 32 bytes contain (possibly among other things) the size of the message that follows
When you read any amount of data from the socket, append it to a buffer specific to that connection. Once you have the 32 header bytes, they tell you the size of the message, so keep reading until you have that many more bytes.
It is really important to notice that there are no messages on the wire, only bytes. Once you understand that, you will have made a giant leap towards writing network applications.
The answer depends on the socket type, but in general, yes, it's possible. For TCP it's the norm; for UDP I believe it cannot happen, but I'm not an expert on network protocols/programming.
Yes, it can and often does. There is no way of matching up send and receive calls when using TCP/IP. Your program logic should test the return values of both send and recv calls in a loop, which terminates when everything has been sent or received.

Socket client/server input/output polling vs. read/write in Linux

Basically I set up a test to see which method is the fastest way to get data from another computer on my network, for a server with only a few clients (10 at most, 1 at minimum).
I tried two methods, both done in a thread-per-client fashion, and looped the read 10000 times. I timed the loop from the creation of the threads to the joining of the threads right after. In my threads I used these two methods; both used standard read(2)/write(2) calls and SOCK_STREAM/AF_INET:
In one, I polled for data in my client, reading (non-blocking) whenever data was available; in my server, I sent data immediately whenever I got a connection. My thread returned on a read of the correct number of bytes (which happened every time).
In the other, my client sent a message to the server on connect, and my server sent a message to my client on a read (both sides blocked here to make this more turn-based and synchronous). My thread returned after my client's read.
I was pretty sure polling would be faster. I made a histogram of thread completion times and, as expected, polling was faster by a slight margin, but two things about the read/write method were unexpected. First, the read/write method gave me two distinct time spikes, i.e. some event occasionally occurred which slowed the read/write down by about .01 microseconds. I ran this test on a switch initially and thought this might be packet collisions, but then I ran the server and client on the same computer and still got these two different time spikes. Does anyone know what event may be occurring?
The other: my read function sometimes returned too many bytes, and some bytes were garbage. I know streams don't guarantee you'll get all the data correctly, but why would the read function return extra garbage bytes?
It seems you are confusing the purpose of these two alternatives:
The connection-per-thread approach does not need polling (unless your protocol allows a random sequence of messages either way, which would be very confusing to implement). Blocking reads and writes will always be faster here, since you skip one extra system call to select(2)/poll(2)/epoll(7).
The polling approach lets you multiplex I/O over many sockets/files in a single-threaded or fixed-number-of-threads setup. This is how web servers like nginx handle thousands of client connections with very few threads. The idea is that waiting on any given file descriptor does not block the others: you wait on all of them at once.
So I would say you are comparing apples and goblins :) Take a look here:
High Performance Server Architecture
The C10K problem
libevent
As for the spikes: check whether TCP gets into retransmission mode, i.e. one of the sides is not reading fast enough to drain the receive buffers; play with the SO_RCVBUF and SO_SNDBUF socket options, as sketched below.
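A quick sketch of tuning those options with setsockopt(2); the 1 MiB value is an arbitrary number to experiment with, and the kernel may clamp or adjust it (see socket(7) on Linux):

#include <stdio.h>
#include <sys/socket.h>

void grow_buffers(int sock)
{
    int sz = 1 << 20;   /* 1 MiB: an arbitrary value to experiment with */

    if (setsockopt(sock, SOL_SOCKET, SO_RCVBUF, &sz, sizeof sz) < 0)
        perror("SO_RCVBUF");
    if (setsockopt(sock, SOL_SOCKET, SO_SNDBUF, &sz, sizeof sz) < 0)
        perror("SO_SNDBUF");
}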
Too many bytes is definitely wrong; it looks like API misuse. Check whether you are comparing signed and unsigned numbers, and compile with a high warning level.
Edit:
Looks like you are dealing with two separate issues: data corruption and data-transfer performance. I would strongly recommend focusing on the first before tackling the second. Reduce the test to a minimum and try to figure out what you are doing wrong with the sockets, i.e. where that garbage data comes from. Do you check the return values of the read(2) and write(2) calls? Do you share buffers between threads? Paste a reduced code sample into the question (or provide a link to it) if you're really stuck.
Hope this helps.
I know streams don't guarantee you'll get all the data correctly, but why would the read function return extra garbage bytes?
Actually, streams do guarantee that you get all the data, correctly and in order. Datagrams (UDP, SOCK_DGRAM) are what you were thinking of, and that is not what you are using. Within AF_INET, SOCK_STREAM means TCP, and TCP means reliable.
