UDP non-blocking write failure - C

I have worked with non-blocking TCP, where both read and write can fail in the non-blocking case: a non-blocking TCP read can fail if no data is available, and a TCP write can fail if the peer side's TCP buffer is full (I believe the TCP buffer size is 64K).
Similarly, a UDP read (recvfrom) can fail if no data is available. But what is the failure case for a UDP write (sendto)? I think a UDP write will never give a non-blocking error, because a TCP write sends data and waits for the ACK from the other side. But this is not the case for a UDP write: it just sends and returns, and doesn't wait for any ACK from the peer side. If the data doesn't reach the other side, that's simply packet loss.
Is my understanding of UDP non-blocking writes correct? Please explain.

The most likely reason why a UDP non-blocking send would fail is that the UDP socket's in-kernel outgoing-data buffer is full. In this case, send()/sendto() would return -1 and errno would be set to EWOULDBLOCK.
Note that a non-blocking send()/sendto() doesn't actually send the data out the network device before it returns; rather it copies the data into an in-kernel buffer and returns immediately, and thereafter it is the kernel's responsibility to move that data out to the network as quickly as it can. The outgoing-data buffer can become full if your program tries to send a lot of data at once, because the CPU can add your new data to the buffer much faster than the network hardware can forward the buffer's data out to the network.
If you get a -1/EWOULDBLOCK error, usually the most graceful way to handle it is to stop trying to send on that socket until select() (or poll(), etc.) reports the socket as ready-for-write. When that happens, you know that the in-kernel buffer has been at least partially drained, and you can try the send()/sendto() call again.
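For illustration, here is a minimal sketch of that pattern, assuming a non-blocking UDP socket sock and a destination dest/destlen (these names are hypothetical, not from the question):

#include <errno.h>
#include <stdio.h>
#include <sys/select.h>
#include <sys/socket.h>

/* Sketch: retry a non-blocking sendto() after waiting for writability.
 * 'sock', 'buf', 'len', 'dest', and 'destlen' are assumed to be set up
 * by the caller. */
for (;;) {
    ssize_t n = sendto(sock, buf, len, 0, (struct sockaddr *)&dest, destlen);
    if (n >= 0)
        break;                          /* datagram queued in the kernel */
    if (errno != EWOULDBLOCK && errno != EAGAIN) {
        perror("sendto");               /* some other error; inspect errno */
        break;
    }
    fd_set writefds;
    FD_ZERO(&writefds);
    FD_SET(sock, &writefds);
    if (select(sock + 1, NULL, &writefds, NULL, NULL) < 0) {
        perror("select");
        break;
    }
    /* select() returned: the send buffer has at least partially drained,
     * so loop around and try the sendto() again. */
}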
Another (less likely) cause of an error from send() would be if the IP address you are trying to send to is invalid. In any case, you should check errno and find out what the errno value is, as that will give you better insight into what is going wrong.
Btw the behavior described above is not unique to UDP, either; you can and will have the same problem with a non-blocking TCP socket (even if the remote peer's receive window is not full) if you try to send() data on the socket faster than the local network card can drain the socket's in-kernel buffer.

Because a TCP write sends data and waits for the ACK from the other side.
No it doesn't. It copies your data into the socket send buffer, and if that is full it either blocks or returns -1/EWOULDBLOCK/EAGAIN.
But this is not the case for a UDP write: it just sends and returns, and doesn't wait for any ACK from the peer side.
No it doesn't. It copies your data into the socket send buffer, and if that is full it either blocks or returns -1/EWOULDBLOCK/EAGAIN.
In both cases the actual putting of bytes onto the wire is asynchronous to your program.

Related

TCP Sockets in C with bad network

I am doing some tests with a TCP client application on a Raspberry Pi (server on the PC), over PPP (Point-to-Point Protocol) using an LTE modem. I used a C program with sockets, checking each system call's return value. I wanted to test how the socket behaves in a bad coverage area, so I ran some tests with the antenna removed.
I followed these steps:
Connect to server --> OK
Start sending data (write system call) --> OK (I also checked on the server)
I removed the LTE modem's antenna (there is no network; it can't even ping)
Continued sending data (write system call) --> OK (but the server receives nothing!!!)
It finished sending the data and closed the socket --> OK (the connection is still open and no data has arrived since the antenna was removed)
Program finished
I reattached the antenna
Some time later, the data was uploaded and the connection closed. But I did another test following these steps with more data, and that data was never uploaded...
I do not know if there is any way to ensure that the data written to the TCP socket is actually received by the server (I thought the TCP layer ensured this...). I could do it manually with an application-level ACK, but I suspect there must be a better way.
Sending part code:
while (i < 100)
{
    sprintf(buf, "Message %d\n", i);
    Return = write(Sock_Fd, buf, strlen(buf));
    if (Return != strlen(buf))
    {
        printf("Error sending data to TCP server. \n");
        printf("Error str: %s \n", strerror(errno));
    }
    else
    {
        printf("write successful %d\n", i);
        i++;
    }
    sleep(2);
}
Many thanks for your help.
The write() syscall succeeds because the kernel buffers the data and puts it in the socket's out-queue. It is removed from that queue only once the data has been sent and ACKed by the peer. When the out-queue is full, the write() syscall will block.
To determine whether data has not yet been ACKed by the peer, you have to look at the size of the out-queue. On Linux, you can use an ioctl() for this:
ioctl(fd, SIOCOUTQ, &outqlen);
However, it would be cleaner and more portable to use an in-band method to determine whether the data has been received.
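As a sketch of the Linux-specific approach (the helper name wait_for_drain is hypothetical, not from the answer):

#include <stdio.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/sockios.h>   /* defines SIOCOUTQ (Linux-specific) */

/* Sketch: poll the socket's out-queue until the kernel reports that no
 * unacknowledged bytes remain. 'fd' is assumed to be a connected TCP socket. */
static int wait_for_drain(int fd)
{
    int outq = 0;
    do {
        if (ioctl(fd, SIOCOUTQ, &outq) < 0)
            return -1;              /* ioctl failed; check errno */
        if (outq > 0)
            usleep(100 * 1000);     /* crude 100 ms poll; tune as needed */
    } while (outq > 0);
    return 0;                       /* out-queue empty: all data ACKed */
}

Note that an empty out-queue only tells you the peer's TCP stack ACKed the bytes, not that the peer application read them, which is why the in-band method is preferred.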
TCP/IP is rather primitive technology. The Internet may sound newish, but this is really antique stuff. TCP is needed because IP gives almost no guarantees, but TCP doesn't actually add that many guarantees. Its chief function is to turn a packet protocol into a stream protocol; that means TCP guarantees byte order: no bytes will arrive out of order. Don't count on more than that.
You see that protocols on top of TCP add extra checks. E.g. HTTP has the famous HTTP error codes, precisely because it can't rely on the error state from TCP. You probably have to do the same - or you can consider implementing your service as an HTTP service. "RESTful" refers to an API design methodology which closely follows the HTTP philosophy; this might be relevant to you.
The short answer to your 4th and 5th points is taken as a shortcut from this answer (read the whole answer for more info):
A socket has a send buffer, and if a call to the send() function succeeds, it does not mean that the requested data has actually really been sent out; it only means the data has been added to the send buffer. For UDP sockets, the data is usually sent pretty soon, if not immediately, but for TCP sockets, there can be a relatively long delay between adding data to the send buffer and having the TCP implementation really send that data.
As a result, when you close a TCP socket, there may still be pending data in the send buffer, which has not been sent yet but your code considers as sent, since the send() call succeeded. If the TCP implementation was closing the socket immediately on your request, all of this data would be lost and your code wouldn't even know about that. TCP is said to be a reliable protocol and losing data just like that is not very reliable. That's why a socket that still has data to send will go into a state called TIME_WAIT when you close it. In that state it will wait until all pending data has been successfully sent or until a timeout is hit, in which case the socket is closed forcefully.
The amount of time the kernel will wait before it closes the socket, regardless of whether it still has pending send data or not, is called the Linger Time.
BTW: that answer also refers to the docs where you can see more detailed info
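As a hedged sketch of controlling that linger time (assuming sock is a connected TCP socket; this snippet is illustrative, not from the linked answer):

#include <stdio.h>
#include <sys/socket.h>

/* Sketch: make close() block until pending data is sent and ACKed,
 * or until the timeout expires. */
struct linger lg;
lg.l_onoff  = 1;    /* enable lingering on close() */
lg.l_linger = 10;   /* wait up to 10 seconds for the send buffer to drain */
if (setsockopt(sock, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg)) < 0)
    perror("setsockopt(SO_LINGER)");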

What does readable/writable mean for a socket file descriptor? And why don't regular files bother with that?

I've recently started learning libev, and there's a readable/writable concept in an io_watcher that I don't quite understand. To my knowledge, there's a flag described in Linux System Programming:
O_ASYNC
A signal (SIGIO by default) will be generated when the specified file becomes readable or writable. This flag is available only for terminals and sockets, not for regular files.
So, since a regular file doesn't bother with readable/writable, what do readable/writable really mean in socket programming? And what does the kernel do to determine whether a socket file descriptor is readable?
Considering the everything-is-a-file philosophy, does every socket descriptor with a different descriptor number actually point to the same file? If so, can I consider the readable/writable problem to be caused by synchronisation?
OK, it seems I've asked a silly question. What I really meant is that both sockets and regular files are read and written via file descriptors, so why does a socket descriptor have a readable/writable concept while a regular file doesn't? Since EJP told me that this is because of the buffers, and each descriptor has its own pair of buffers, here's my conclusion: the readable/writable concept applies to the buffers; if a buffer is empty, it's unreadable, and while it is full, it's unwritable. Readable and writable have nothing to do with synchronisation, and since a regular file doesn't have such buffers, it is always readable and writable.
And one more question: when we say 'receive buffer', that buffer is not the same thing as the buf argument in int recv(SOCKET socket, char FAR* buf, int len, int flags);, right?
This question is specifically addressed in Unix Network Programming, Volume 1: The Sockets Networking API (3rd Edition) [W. Richard Stevens, Bill Fenner, Andrew M. Rudoff] (see it here. I'll add some minor edits for enhanced readability):
Under What Conditions Is a Descriptor Ready?
[...]
The conditions that cause select to return "ready" for sockets [are]:
1. A socket is ready for reading if any of the following four conditions is true:
- The number of bytes of data in the socket receive buffer is greater than or equal to the current size of the low-water mark for the socket receive buffer. A read operation on the socket will not block and will return a value greater than 0 (i.e., the data that is ready to be read). [...]
- The read half of the connection is closed (i.e., a TCP connection that has received a FIN). A read operation on the socket will not block and will return 0 (i.e., EOF).
- The socket is a listening socket and the number of completed connections is nonzero. [...]
- A socket error is pending. A read operation on the socket will not block and will return an error (–1) with errno set to the specific error condition. [...]
2. A socket is ready for writing if any of the following four conditions is true:
- The number of bytes of available space in the socket send buffer is greater than or equal to the current size of the low-water mark for the socket send buffer and either: (i) the socket is connected, or (ii) the socket does not require a connection (e.g., UDP). This means that if we set the socket to nonblocking, a write operation will not block and will return a positive value (e.g., the number of bytes accepted by the transport layer). [...]
- The write half of the connection is closed. A write operation on the socket will generate SIGPIPE.
- A socket using a non-blocking connect has completed the connection, or the connect has failed.
- A socket error is pending. A write operation on the socket will not block and will return an error (–1) with errno set to the specific error condition. [...]
3. A socket has an exception condition pending if there is out-of-band data for the socket or the socket is still at the out-of-band mark.
[Notes:]
Our definitions of "readable" and "writable" are taken directly from the kernel's soreadable and sowriteable macros on pp. 530–531 of TCPv2. Similarly, our definition of the "exception condition" for a socket is from the soo_select function on these same pages.
Notice that when an error occurs on a socket, it is marked as both readable and writable by select.
The purpose of the receive and send low-water marks is to give the application control over how much data must be available for reading or how much space must be available for writing before select returns a readable or writable status. For example, if we know that our application has nothing productive to do unless at least 64 bytes of data are present, we can set the receive low-water mark to 64 to prevent select from waking us up if less than 64 bytes are ready for reading.
As long as the send low-water mark for a UDP socket is less than the send buffer size (which should always be the default relationship), the UDP socket is always writable, since a connection is not required.
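A small sketch of setting that receive low-water mark (assuming sock is an already-created socket; this snippet is illustrative, not from the book):

#include <stdio.h>
#include <sys/socket.h>

/* Sketch: tell select()/poll() not to report the socket readable until
 * at least 64 bytes are queued in the receive buffer. */
int lowat = 64;
if (setsockopt(sock, SOL_SOCKET, SO_RCVLOWAT, &lowat, sizeof(lowat)) < 0)
    perror("setsockopt(SO_RCVLOWAT)");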
A related read, from the same book: TCP socket send buffer and UDP socket (pseudo) send buffer
Readable means there is data or a FIN present in the socket receive buffer.
Writable means there is space available in the socket send buffer.
Files don't have socket send or receive buffers.
Considering the everything-is-a-file philosophy
What philosophy is that?
does every socket descriptor with different descriptor number actually point to the same file?
What file? Why would they point to the same anything? Question doesn't make sense.
I'm confused about one thing: when a socket is created, the descriptor actually points to the receive and send buffers of the socket,
It 'points to' a lot of things: a source address, a target address, a source port, a target port, a pair of buffers, a set of counters and timers, ...
not to a file representing the net hardware.
There is no such thing as 'the file represent[ing] the net hardware', unless you're talking about the device driver entry in /dev/..., which is barely relevant. A TCP socket is an endpoint of a connection. It is specific to that connection, to TCP, to the source and target addresses and ports, ...

For how long do the recv() functions buffer in UDP?

My program contains a thread that waits for UDP messages, and when a message is received it runs some functions before going back to listening. I am worried about missing a message, so my question is something along the lines of: how long after a message has been sent is it still possible to read it? For example, if the message was sent while the thread was running those functions, could it still be read if the functions are short enough? I am looking for guidelines here, but an answer in microseconds would also be appreciated.
When your computer receives a UDP packet (and there is at least one program listening on the UDP port specified in that packet), the networking stack will add that packet's data to a fixed-size buffer that is associated with that socket and kept in the kernel's memory space. The packet's data will stay in that buffer until your program calls recv() to retrieve it.
The gotcha is that if your computer receives the UDP packet and there isn't enough free space left inside the buffer to fit the new UDP packet's data, the computer will simply throw the UDP packet away -- it's allowed to do that, since UDP doesn't make any guarantees that a packet will arrive.
So the amount of time your program has to call recv() before packets start getting thrown away will depend on the size of the socket's in-kernel packet buffer, the size of the packets, and the rate at which the packets are being received.
Note that you can ask the kernel to make its receive-buffer size larger by calling something like this (SO_RCVBUF expects an int):
int bufSize = 64*1024; // Dear kernel: I'd like the buffer to be 64kB please!
setsockopt(mySock, SOL_SOCKET, SO_RCVBUF, &bufSize, sizeof(bufSize));
… and that might help you avoid dropped packets. If that's not sufficient, you'll need to either make sure your program goes back to recv() quickly, or possibly do your network I/O in a separate thread that doesn't get held off by processing.
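As a rough sketch of the separate-thread approach (the names udp_sock and handle_packet are hypothetical, and error handling is minimal):

#include <pthread.h>
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>

void handle_packet(const char *data, ssize_t len);  /* hypothetical callback */

/* Sketch: a dedicated thread that does nothing but drain the socket, so
 * slow processing elsewhere can't let the kernel buffer overflow.
 * Start it with: pthread_create(&tid, NULL, recv_loop, &udp_sock); */
static void *recv_loop(void *arg)
{
    int udp_sock = *(int *)arg;
    char buf[1500];
    for (;;) {
        ssize_t n = recvfrom(udp_sock, buf, sizeof(buf), 0, NULL, NULL);
        if (n < 0) {
            perror("recvfrom");
            break;
        }
        handle_packet(buf, n);  /* hand off quickly, or push onto a queue */
    }
    return NULL;
}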

blocking recv() that receives no data (TCP)

I'm attempting to write a simple server using C system calls that takes unknown byte streams from unknown clients and executes specific actions depending on client input. For example, the client will send a command "multiply 2 2" and the server will multiply the numbers and return the result.
In order to avoid errors where the server reads before the client has written, I have a blocking recv() call that waits for any data using MSG_PEEK. When recv() detects data to be read, I move on to non-blocking recv()s that read the stream byte by byte.
Everything works except in the corner case where the client sends no data (i.e. write(socket, "", 0);). I was wondering how exactly I would detect that a message with no data has been sent. In this case, recv() blocks forever.
Also, this post pretty much sums up my problem, but it doesn't suggest a way to detect a size-0 packet.
What value will recv() return if it receives a valid TCP packet with payload sized 0
When using TCP at the send/recv level you are not privy to the packet traffic that goes into making the stream. When you send a nonzero number of bytes over a TCP stream the sequence number increases by the number of bytes. That's how both sides know where the other is in terms of successful exchange of data. Sending multiple packets with the same sequence number doesn't mean that the client did anything (such as your write(s, "", 0) example), it just means that the client wants to communicate some other piece of information (for example, an ACK of data flowing the other way). You can't directly see things like retransmits, duplicate ACKs, or other anomalies like that when operating at the stream level.
The answer you linked says much the same thing.
Everything works except in the corner case where the client sends no data (i.e. write(socket, "", 0);).
write(socket, "", 0) isn't even a send in the first place. It's just a local API call that does nothing on the network.
I was wondering how exactly I would detect that a message with no data has been sent.
No message is sent, so there is nothing to detect.
In this case, recv() blocks forever.
I agree.
I have a blocking recv() call that waits for any data using MSG_PEEK. When recv() detects data to be read, I move on to non-blocking recv()s that read the stream byte by byte.
Instead of using recv(MSG_PEEK), you should be using select(), poll(), or epoll() to detect when data arrives, then call recv() to read it.
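A minimal sketch of that approach, assuming sock is the connected TCP socket from the question:

#include <stdio.h>
#include <sys/select.h>
#include <sys/socket.h>

/* Sketch: block in select() until the socket is readable (data, FIN, or a
 * pending error), then read whatever is available with one recv() call. */
fd_set readfds;
FD_ZERO(&readfds);
FD_SET(sock, &readfds);
if (select(sock + 1, &readfds, NULL, NULL, NULL) < 0) {
    perror("select");
} else if (FD_ISSET(sock, &readfds)) {
    char buf[4096];
    ssize_t n = recv(sock, buf, sizeof(buf), 0);
    if (n == 0)
        printf("peer closed the connection\n");    /* FIN received */
    else if (n < 0)
        perror("recv");
    else
        printf("received %zd bytes\n", n);
}

This also removes the need to read the stream byte by byte; one recv() into a buffer retrieves everything currently available.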

Can a Non Blocking UDP write return with fewer bytes than requested?

I have an application that sends data point-to-point from a sender to a receiver over a link that can operate in simplex (one-way transmission) or duplex (two-way) mode. In simplex mode the application sends data using UDP, and in duplex mode it uses TCP. Since a write on a TCP socket may block, we are using non-blocking I/O (ioctl with FIONBIO - O_NONBLOCK and fcntl are not supported on this distribution) and the select() system call to determine when data can be written. Non-blocking I/O is used so that we can abort a send early after a timeout if network conditions deteriorate. I'd like to use the same basic code to do the sending, but switch between TCP and UDP at a higher abstraction. This works great for TCP.
However, I am concerned about how non-blocking I/O works for a UDP socket. I may be reading the man pages incorrectly, but since write() may return indicating fewer bytes sent than requested, does that mean that a client will receive fewer bytes in its datagram? To send a given buffer of data, multiple writes may be needed, which may well be the case since I am using non-blocking I/O. I am concerned that this would translate into multiple UDP datagrams received by the client.
I am fairly new to socket programming, so please forgive me if I have some misconceptions here. Thank you.
Assuming a correct (not broken) UDP implementation, then each send/sendmsg/sendto will correspond to exactly one whole datagram sent and each recv/recvmsg/recvfrom will correspond to exactly one whole datagram received.
If a UDP message cannot be transmitted in its entirety, you should receive an EMSGSIZE error. A sent message might still fail due to size at some point in the network, in which case it will simply not arrive. But it will not be delivered in pieces (unless the IP stack is severely buggy).
A good rule of thumb is to keep your UDP payload size to at most 1400 bytes. That is very approximate and leaves a lot of room for various forms of tunneling so as to avoid fragmentation.
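To illustrate, a hedged sketch of handling the size-related outcomes (sock, payload, payload_len, dest, and destlen are all hypothetical names):

#include <errno.h>
#include <stdio.h>
#include <sys/socket.h>

/* Sketch: a UDP sendto() is all-or-nothing, so the only size-related
 * failure to handle locally is EMSGSIZE. */
ssize_t n = sendto(sock, payload, payload_len, 0,
                   (struct sockaddr *)&dest, destlen);
if (n < 0) {
    if (errno == EMSGSIZE)
        fprintf(stderr, "datagram too large; keep payloads <= ~1400 bytes\n");
    else if (errno == EWOULDBLOCK || errno == EAGAIN)
        fprintf(stderr, "send buffer full; wait for writability and retry\n");
    else
        perror("sendto");
} else {
    /* n == payload_len: UDP never queues a partial datagram. */
}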
