socket send() repeatable - c

I have a question regarding send() on TCP sockets.
Is there a difference between:
char *text = "Hello world";
char buffer[150];
int i;
for (i = 0; i < 10; i++)
    send(fd_client, text, strlen(text), 0);
and
char *text = "Hello world";
char buffer[150];
int i;
buffer[0] = '\0';
for (i = 0; i < 10; i++)
    strcat(buffer, text);
send(fd_client, buffer, strlen(buffer), 0);
Is there a difference for the receiver side using recv?
Are both going to be one TCP packet?
Even if TCP_NODELAY is set?

There's really no way to know; it depends on the implementation of TCP. If it were a UDP socket, the two would definitely have different results: several packets in the first case and one in the second.
TCP is free to split up packets as it sees fit; it emulates a stream and abstracts its packet mechanics away from the user. This is by design.

TCP is a stream-based protocol. When you call send(), the data is put into the OS TCP layer's buffer, and the OS transmits it periodically. If you call send() quickly enough, several chunks may be queued in the TCP layer before the previous ones have been transmitted; the TCP layer then sends whatever it has, effectively treating it all as one big byte array.
Transmission is also segmented by the OS TCP layer, and Nagle's algorithm may additionally hold back small amounts of data until the OS buffer contains enough to fill one segment.
So yes, there is a difference.
TCP is a stream-based protocol: you cannot rely on a single send() corresponding to a single recv() with the same amount of data.
Data may get merged together, and you have to keep that in mind at all times.
By the way, based on your examples: in the single-big-buffer case the client will receive all the bytes together or nothing at all. If that one big segment is dropped somewhere along the way, the sender's OS will automatically resend it; the drop probability is higher for bigger packets, so resending big segments can waste some traffic. But this depends on the percentage of dropped packets and may not matter in your case at all.
In the ten-separate-sends case you might receive everything together, each part separately, or some of them merged. You never know, so you should implement your network reading so that you know how many bytes you expect to receive and read exactly that amount. That way, even if some unread bytes are left over, they will be read on the next recv().
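For illustration, a minimal sketch of such a reading helper, assuming a blocking TCP socket (the name recv_exact is purely illustrative):

#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Keep calling recv() until exactly len bytes have been read.
 * Returns 0 on success, -1 on error or if the peer closed early. */
static int recv_exact(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = recv(fd, p, len, 0);
        if (n == 0)
            return -1;               /* peer closed the connection */
        if (n < 0) {
            if (errno == EINTR)
                continue;            /* interrupted by a signal, retry */
            return -1;
        }
        p += n;                      /* recv may return fewer bytes than asked */
        len -= (size_t)n;
    }
    return 0;
}

With a helper like this, the receiver can pull out the ten 11-byte "Hello world" messages one by one no matter how the sender's bytes were segmented.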

Related

Abstracting UDP and TCP send/receive procedures

Good day.
Intro.
Recently I've started to study some 'low-level' network programming as well as networking protocols in Linux. For this purpose I decided to create a small library for networking.
Now I have a few questions; I will ask one of them here.
As you know, there are at least two widely used protocols built on top of IP: TCP and UDP. Their implementations in the OS differ because of the connection-oriented nature of TCP.
According to man 7 udp, every receive operation on a UDP socket returns only one packet. That is reasonable, since different datagrams may come from different sources.
On the other hand, the packet sequence of a TCP connection can be treated as a continuous byte stream.
Now, about the problem itself.
Say, I have an API for TCP connection socket and for UDP socket like:
void tcp_connection_recv(endpoint_t *ep, buffer_t *b);
void udp_recv(endpoint_t *ep, buffer_t *b);
The endpoint_t type describes the endpoint (remote for the TCP connection, local for UDP). The buffer_t type describes some kind of vector- or array-based buffer.
It is quite possible that the buffer is already allocated by the user, and for UDP I'm not sure it would be right to leave the buffer size unchanged. So, to abstract the code for TCP and UDP operations, I think the library will need to allocate as much buffer space as is needed to hold the whole received data.
Also, to avoid resizing the user's buffer, each socket could be mapped to its own buffer (a userspace buffer, but hidden from the user). Then, on the user's request, data would be copied from that 'inner' buffer to the user's one, or read from the socket if not enough is buffered.
Any suggestions or opinions?
If you want to create such an API, it will depend on the service you want to provide. For TCP it will be different than for UDP, since TCP is stream-oriented.
For TCP, instead of having tcp_connection_recv reallocate a buffer when the buffer passed by the user is not big enough, you can fill the whole buffer and then return, perhaps with an output parameter indicating that there is more data waiting to be read. Basically you can use the receive buffer that the TCP connection already provides in the kernel; there is no need to create another one.
For UDP, you can ask the user for a number indicating the maximum datagram size it is waiting for. When you read from a UDP socket with recvfrom, if you read less data than what arrived in the datagram, the rest of the datagram is lost. You can read first with the MSG_PEEK flag in order to find out how much data is available.
In general I wouldn't handle the buffer for the application, since the application (actually the application-layer protocol) is the one that knows how it expects to receive its data.
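As a rough sketch of that MSG_PEEK idea on Linux (the MSG_TRUNC behaviour relied on here is Linux-specific, and the helper name peek_datagram_size is just illustrative):

#include <sys/socket.h>
#include <sys/types.h>

/* Return the full size of the next queued datagram without removing it
 * from the receive queue. On Linux, recv() with MSG_TRUNC reports the
 * real datagram length even when the supplied buffer is smaller. */
static ssize_t peek_datagram_size(int fd)
{
    char dummy;    /* we only care about the return value, not the byte */
    return recv(fd, &dummy, 1, MSG_PEEK | MSG_TRUNC);
}

The caller could then grow its buffer_t to at least that size before doing the real recvfrom.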

writing data to a socket that is sent in 2 frames

My application sends small messages over the wire using a socket. Each message is around 200 bytes of data. I would like to see my data sent in 2 frames instead of 1. My questions are:
How to do that i.e. is there a way to cause TCP to automatically split the buffer in 2 frames?
Do I get the same if I send my buffer in 2 separate writes?
I am using Linux and C.
How to do that i.e. is there a way to cause TCP to automatically split the buffer in 2 frames?
TCP is a stream communication protocol; all data is continuous. You should split your data with delimiters.
For example, in the HTTP protocol requests are delimited by two consecutive \n's (a blank line).
Do I get the same if I send my buffer in 2 separate writes?
No, you will receive them as one continuous data stream. Frames are meaningless here.
Note: before your application receives any TCP data, the packets are indeed separate, but the OS collects and reassembles them. This process is transparent to your application.
Here are a few things you can consider.
TCP does have the PSH flag, which you can set on a segment to make TCP push out any buffered data. But this works somewhat unreliably, because in theory the data can get combined again on the receiving side. In practice, though, you will usually see the data delivered separately.
You can't really use "\n" as a delimiter, because it can occur naturally in your data. You have to come up with some kind of escape sequence to use, and escape all occurrences of "\n" in the data. This can be painful.
If you need message boundaries, consider a protocol that supports it. Like UDP. But with UDP you lose guaranteed delivery. You will have to roll your own confirmations, retries and what not.
Finally, there is SCTP. It is a less commonly used protocol, but it is available in the Linux stack at least. It gives you the best of both worlds: message boundaries, guaranteed delivery, guaranteed sequencing.
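If the goal is simply that each application-level write goes out promptly (which in practice often, though not always, shows up as separate frames on the wire), a minimal sketch is to disable Nagle's algorithm and issue two send() calls; the function and its parameters here are illustrative assumptions:

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stddef.h>
#include <sys/socket.h>

/* Send a small message as two writes with Nagle's algorithm disabled,
 * so the stack does not hold the first write back while waiting for an
 * ACK. The receiver may still see the two halves merged into one recv(). */
static void send_in_two_writes(int fd, const char *msg, size_t len)
{
    int one = 1;
    setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));

    send(fd, msg, len / 2, 0);                     /* first half */
    send(fd, msg + len / 2, len - len / 2, 0);     /* second half */
}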

How long does a UDP packet stay at a socket?

If data is sent to the client but the client is busy executing something else, how long will the data be available to read using recvfrom()?
Also, what happens if a second packet is sent before the first one is read? Is the first one lost and the next one sitting there waiting to be read?
(windows - udp)
If data is sent to the client but the client is busy executing something else, how long will the data be available to read using recvfrom()?
Forever, or not at all, or until you close the socket or read as much as a single byte.
The reason for that is:
UDP delivers datagrams, or it doesn't. This sounds like nonsense, but it is exactly what it is.
A single UDP datagram relates to either exactly one or several "fragments", which are IP packets (further encapsulated in some "on the wire" protocol, but that doesn't matter). The network stack collects all fragments for a datagram. If the checksum on any of the fragments is not good, or any other thing that makes the network stack unhappy, the complete datagram is discarded, and you get nothing, not even an error. You simply don't know anything happened.
If all goes well, a complete datagram is placed into the receive buffer. Never anything less, and never anything more. If you try to recvfrom later, that is what you'll get.
The receive buffer is necessarily large enough to hold at least one maximum-size datagram (65535 bytes), but since datagrams will usually not be maximum size, rather something below 1280 bytes (or 1500 if you will), it can usually hold quite a few of them (on most platforms the buffer defaults to something around 128-256 kB, and it is configurable).
If there is not enough room left in the buffer, the datagram is discarded, and you get nothing (well, you do still get the ones that are already in the buffer). Again, you don't even know something happened.
Each time you call recvfrom, a complete datagram is removed from the buffer (important detail!), and you get up to the number of bytes that you requested. Which means that if you naively try to read a few bytes and then a few bytes again, it just won't work. The first read will discard the rest of the datagram, and the subsequent ones read the first bytes of some future datagrams (and possibly block)!
This is very different from how TCP works. There you can actually read a few bytes and a few bytes again, and it will just work, because the network layer simulates a data stream. You don't need to care how it works, because the network stack makes sure it works.
Also, what happens if a second packet is sent before the first one is read, is the first one lost and the next one sitting there waiting to be read?
You probably meant to say "received" rather than "sent". Send and receive have different buffers, so that would not matter at all. About receiving another packet while one is still in the buffer, see the above explanation. If the buffer can hold the second datagram, it will store it, otherwise it silently goes * poof *.
This does not affect any datagrams already in the buffer.
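A small sketch of the practical consequence: always hand recvfrom a buffer big enough for a whole datagram, because whatever does not fit is silently discarded (65535 bytes is the theoretical UDP maximum; your own protocol's limit is likely much smaller):

#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Read exactly one whole datagram per call. Size buf for the largest
 * datagram you expect; any excess bytes would simply be thrown away. */
static ssize_t read_one_datagram(int fd, char *buf, size_t buflen,
                                 struct sockaddr_storage *from)
{
    socklen_t fromlen = sizeof(*from);
    return recvfrom(fd, buf, buflen, 0,
                    (struct sockaddr *)from, &fromlen);
}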
Normally, the data will be buffered until it's read. I suppose if you wait long enough that the driver completely runs out of space, it'll have to do something, but assuming your code works halfway reasonably, that shouldn't be a problem.
A typical network driver will be able to buffer a number of packets without losing any.
If data is sent to the client but the client is busy executing something else, how long will the data be available to read using recvfrom()?
This depends on the OS. On Windows, I believe the default for each UDP socket is 8192 bytes; this can be raised with setsockopt() (see the Winsock documentation). So, as long as the buffer isn't full, the data will stay there until the socket is closed or the data is read.
Also, what happens if a second packet is sent before the first one is read, is the first one lost and the next one sitting there waiting to be read?
If the buffer has room, they are both stored; if not, one of them gets discarded. I believe it's the newest one, but I'm not 100% sure.

can one call of recv() receive data from 2 consecutive send() calls?

I have a client which sends data to a server with 2 consecutive send calls:
send(_sockfd,msg,150,0);
send(_sockfd,msg,150,0);
and the server starts receiving as soon as the first send call has been made (let's say I'm using select):
recv(_sockfd,buf,700,0);
Note that the buffer I'm receiving into is much bigger.
My question is: is there any chance that buf will contain both msgs, or do I need 2 recv() calls to get both msgs?
Thank you!
TCP is a stream oriented protocol. Not message / record / chunk oriented. That is, all that is guaranteed is that if you send a stream, the bytes will get to the other side in the order you sent them. There is no provision made by RFC 793 or any other document about the number of segments / packets involved.
This is in stark contrast with UDP. As @R.. correctly said, in UDP an entire message is sent in one operation (notice the change in terminology: message). Try to send a giant message (several times larger than the MTU) with TCP? It's okay, it will split it up for you.
When running on local networks or on localhost you will certainly notice that (generally) one send == one recv. Don't assume that. There are factors that change it dramatically. Among these
Nagle
Underlying MTU
Memory usage (possibly)
Timers
Many others
Of course, not having a correspondence between a send and a recv is a nuisance, and you can't always fall back on UDP. That is one of the reasons for SCTP. SCTP is a really, really interesting protocol, and it is message-oriented.
Back to TCP, this is a common nuisance. An equally common solution is this:
Establish that all packets begin with a fixed-length sequence (say 32 bytes)
These 32 bytes contain (possibly among other things) the size of the message that follows
Whenever you read any amount of data from the socket, append it to a buffer specific to that connection. Once 32 bytes have accumulated, you know the length of the body, so keep reading until you have the whole message, as sketched below.
It is really important to notice how there are really no messages on the wire, only bytes. Once you understand it you will have made a giant leap towards writing network applications.
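A minimal sketch of the receiving side of such a scheme, assuming (for brevity) a 4-byte big-endian length header rather than a full 32-byte one:

#include <arpa/inet.h>      /* ntohl */
#include <stdint.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Keep calling recv() until exactly len bytes have arrived. */
static int recv_full(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = recv(fd, p, len, 0);
        if (n <= 0)
            return -1;       /* error or peer closed the connection */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Read one length-prefixed message: a 4-byte big-endian length, then the
 * body. The caller must free() the returned buffer. */
static char *recv_message(int fd, uint32_t *out_len)
{
    uint32_t netlen;
    if (recv_full(fd, &netlen, sizeof(netlen)) < 0)
        return NULL;
    *out_len = ntohl(netlen);

    char *body = malloc(*out_len);
    if (body == NULL || recv_full(fd, body, *out_len) < 0) {
        free(body);
        return NULL;
    }
    return body;
}

Allocating the body per message keeps the sketch short; a real implementation would sanity-check the length and reuse buffers.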
The answer depends on the socket type, but in general, yes it's possible. For TCP it's the norm. For UDP I believe it cannot happen, but I'm not an expert on network protocols/programming.
Yes, it can and often does. There is no way of matching up send and receive calls when using TCP/IP. Your program logic should test the return values of both send and recv calls in a loop, which terminates when everything has been sent or received.
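For the sending side, such a loop might look like this sketch (send() may accept fewer bytes than requested, so the return value has to be fed back into the loop):

#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Keep calling send() until all len bytes have been handed to the kernel.
 * Returns 0 on success, -1 on error. */
static int send_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = send(fd, p, len, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;    /* interrupted by a signal, retry */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}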

Winsock UDP packets being dropped?

We have a client/server communication system over UDP set up on Windows. The problem we are facing is that when the throughput grows, packets start getting dropped. We suspect that this is due to the UDP receive buffer being continuously polled, causing the buffer to block and drop any incoming packets. Is it possible that reading this buffer can cause incoming packets to be dropped? If so, what are the options to correct this? The system is written in C. Please let me know if this is too vague and I can try to provide more info. Thanks!
The default socket buffer size in Windows sockets is 8k, or 8192 bytes. Use the setsockopt Windows function to increase the size of the buffer (refer to the SO_RCVBUF option).
But beyond that, increasing the size of your receive buffer will only delay the time until packets get dropped again if you are not reading the packets fast enough.
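For reference, a sketch of that setsockopt call with Winsock (the 1 MB figure is only an example, and the stack may grant less than requested):

#include <winsock2.h>

/* Ask for a larger receive buffer on the UDP socket. Query SO_RCVBUF
 * with getsockopt() afterwards to see what was actually granted. */
static int grow_recv_buffer(SOCKET sock)
{
    int bufsize = 1 * 1024 * 1024;    /* 1 MB, example value */
    return setsockopt(sock, SOL_SOCKET, SO_RCVBUF,
                      (const char *)&bufsize, (int)sizeof(bufsize));
}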
Typically, you want two threads for this kind of situation.
The first thread exists solely to service the socket. In other words, the thread's sole purpose is to read a packet from the socket, add it to some kind of properly-synchronized shared data structure, signal that a packet has been received, and then read the next packet.
The second thread exists to process the received packets. It sits idle until the first thread signals a packet has been received. It then pulls the packet from the properly-synchronized shared data structure and processes it. It then waits to be signaled again.
As a test, try short-circuiting the full processing of your packets and just write a message to the console (or a file) each time a packet has been received. If you can successfully do this without dropping packets, then breaking your functionality into a "receiving" thread and a "processing" thread will help.
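A rough pthreads sketch of that split (pthreads is used here just for brevity; the same structure applies with Windows threads, and the names, the 2048-byte datagram limit, and the unbounded queue are all simplifying assumptions):

#include <pthread.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>

#define MAX_DGRAM 2048                 /* largest datagram we expect */

struct packet {
    size_t len;
    char   data[MAX_DGRAM];
    struct packet *next;
};

/* A minimal synchronized FIFO shared by the two threads. */
static struct packet *head, *tail;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  nonempty = PTHREAD_COND_INITIALIZER;

/* Receiving thread: does nothing but drain the socket into the queue. */
static void *receiver(void *arg)
{
    int sock = *(int *)arg;
    for (;;) {
        struct packet *p = malloc(sizeof(*p));
        if (p == NULL)
            break;
        ssize_t n = recv(sock, p->data, MAX_DGRAM, 0);
        if (n < 0) { free(p); break; }
        p->len = (size_t)n;
        p->next = NULL;

        pthread_mutex_lock(&lock);
        if (tail) tail->next = p; else head = p;
        tail = p;
        pthread_cond_signal(&nonempty);    /* wake the processing thread */
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

/* Processing thread: sits idle until a packet is queued, then handles it. */
static void *processor(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&lock);
        while (head == NULL)
            pthread_cond_wait(&nonempty, &lock);
        struct packet *p = head;
        head = p->next;
        if (head == NULL) tail = NULL;
        pthread_mutex_unlock(&lock);

        /* ... process p->data, p->len here ... */
        free(p);
    }
    return NULL;
}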
Yes, the stack is allowed to drop packets — silently, even — when its buffers get too full. This is part of the nature of UDP, one of the bits of reliability you give up when you switch from TCP. You can either reinvent TCP — poorly — by adding retry logic, ACK packets, and such, or you can switch to something in-between like SCTP.
There are ways to increase the stack's buffer size, but that's largely missing the point. If you aren't reading fast enough to keep buffer space available already, making the buffers larger is only going to put off the time it takes you to run out of buffer space. The proper solution is to make larger buffers within your own code, and move data from the stack's buffers into your program's buffer ASAP, where it can wait to be processed for arbitrarily long times.
Is it possible that reading this buffer will cause incoming packets to be dropped?
Packets can be dropped if they're arriving faster than you read them.
If so, what are the options to correct this?
One option is to change the network protocol: use TCP, or implement some acknowledgement + 'flow control' using UDP.
Otherwise you need to see why you're not reading fast/often enough.
If the CPU is 100% utilized, then you need to do less work per packet or get a faster CPU (or use multithreading and more CPUs if you aren't already).
If the CPU is not 100%, then perhaps what's happening is:
You read a packet
You do some work, which takes x msec of real-time, some of which is spent blocked on some other I/O (so the CPU isn't busy, but it's not being used to read another packet)
During those x msec, a flood of packets arrive and some are dropped
A cure for this would be to change the threading.
Another possibility is to do several simultaneous reads from the socket (each of your reads provides a buffer into which a UDP packet can be received).
Another possibility is to see whether there's a (O/S-specific) configuration option to increase the number of received UDP packets which the network stack is willing to buffer until you try to read them.
First step: increase the receive buffer size; Windows pretty much grants all reasonable size requests.
If that doesn't help, your consuming code probably has some fairly slow spots. I would use threading, e.g. with pthreads, and a producer-consumer pattern: put the incoming datagrams into a queue from one thread and consume them from another, so your receive calls don't block and the buffer does not fill up.
Third step: modify your application-level protocol to allow for batched packets, and batch packets at the sender to reduce the UDP header overhead of sending a lot of small packets.
Fourth step: check your network gear; switches etc. can give you detailed output about their traffic statistics, buffer overflows, and so on. If that is an issue, get faster switches or possibly swap out a faulty one.
Just FYI, I'm running UDP multicast traffic on our backend continuously at an average of ~30 Mbit/s with peaks of 70 Mbit/s, and my drop rate is essentially nil.
Not sure about this, but on Windows it's not possible for polling the socket to cause a packet to drop. Windows collects the packets separately from your polling, so it shouldn't cause any drops.
I am assuming you're using select() to poll the socket? As far as I know, that can't cause a drop.
The packets could be lost due to an increase in unrelated network traffic anywhere along the route, or full receive buffers. To mitigate this, you could increase the receive buffer size in Winsock.
Essentially, UDP is an unreliable protocol in the sense that packet delivery is not guaranteed and no error is returned to the sender on delivery failure. If you are worried about packet loss, it would be best to implement acknowledgment packets into your communication protocol, or to port it to a more reliable protocol like TCP. There really aren't any other truly reliable ways to prevent UDP packet loss.
