Isn't recv() in C socket programming blocking? - c

In Receiver, I have
recvfd=accept(sockfd,&other_side,&len);
while(1)
{
recv(recvfd,buf,MAX_BYTES-1,0);
buf[MAX_BYTES]='\0';
printf("\n Number %d contents :%s\n",counter,buf);
counter++;
}
In Sender , I have
send(sockfd,mesg,(size_t)length,0);
send(sockfd,mesg,(size_t)length,0);
send(sockfd,mesg,(size_t)length,0);
MAX_BYTES is 1024 and length of mesg is 15. Currently, It calls recv only one time. I want recv function to be called three times for each corresponding send. How do I achieve it?

In short: yes, it is blocking. But not in the way you think.
recv() blocks until any data is readable. But you don't know the size in advance.
In your scenario, you could do the following:
call select() and put the socket where you want to read from into the READ FD set
when select() returns with a positive number, your socket has data ready to be read
then, check if you could receive length bytes from the socket:
recv(recvfd, buf, MAX_BYTES-1, MSG_PEEK), see man recv(2) for the MSG_PEEK param or look at MSDN, they have it as well
now you know how much data is available
if there's less than length available, return and do nothing
if there's at least length available, read length and return (if there's more than length available, we'll continue with step 2 since a new READ event will be signalled by select()

To send discrete messages over a byte stream protocol, you have to encode messages into some kind of framing language. The network can chop up the protocol into arbitrarily sized packets, and so the receives do not correlate with your messages in any way. The receiver has to implement a state machine which recognizes frames.
A simple framing protocol is to have some length field (say two octets: 16 bits, for a maximum frame length of 65535 bytes). The length field is followed by exactly that many bytes.
You must not even assume that the length field itself is received all at once. You might ask for two bytes, but recv could return just one. This won't happen for the very first message received from the socket, because network (or local IPC pipe, for that matter) segments are never just one byte long. But somewhere in the middle of the stream, it is possible that the fist byte of the 16 bit length field could land on the last position of one network frame.
An easy way to deal with this is to use a buffered I/O library instead of raw operating system file handles. In a POSIX environment, you can take an open socket handle, and use the fdopen function to associate it with a FILE * stream. Then you can use functions like getc and fread to simplify the input handling (somewhat).
If in-band framing is not acceptable, then you have to use a protocol which supports framing, namely datagram type sockets. The main disadvantage of this is that the principal datagram-based protocol used over IP is UDP, and UDP is unreliable. This brings in a lot of complexity in your application to deal with out of order and missing frames. The size of the frames is also restricted by the maximum IP datagram size which is about 64 kilobytes, including all the protocol headers.
Large UDP datagrams get fragmented, which, if there is unreliability in the network, adds up to greater unreliability: if any IP fragment is lost, the entire packet is lost. All of it must be retransmitted; there is no way to just get a repetition of the fragment that was lost. The TCP protocol performs "path MTU discovery" to adjust its segment size so that IP fragmentation is avoided, and TCP has selective retransmission to recover missing segments.

I bet you've created a TCP socket using SOCK_STREAM, which would cause the three messages to be read into your buffer during the first recv call. If you want to read the messages one-by-one, create a UPD socket using SOCK_DGRAM, or develop some type of message format which allows you to parse your messages when they arrive in a stream (assuming your messages will not always be fixed length).

First send the length to be received in a fixed format regarding the size of length in bytes you use to transmit this length, then make recv() loop until length bytes had been received.
Note the fact (as also already mentioned by other answers), that the size and number of chunks received do not necessarly need to be the same as sent. Only the sum of all bytes received shall be the same as the sum of all bytes sent.
Read the man pages for recvand send. Especially read the sections on what those functions RETURN.

recv will block until the entire buffer is filled, or the socket is closed.
If you want to read length bytes and return, then you must only pass to recv a buffer of size length.
You can use select to determine if
there are any bytes waiting to be read,
how many bytes are waiting to be read, then
read only those bytes
This can avoid recv from blocking.
Edit:
After re-reading the docs, the following may be true: your three "messages" may be being read all-at-once since length + length + length < MAX_BYTES - 1.
Another possibility, if recv is never returning, is that you may need to flush your socket from the sender-side. The data may be waiting in a buffer to actually be sent to the receiver.

Related

Sending integer atomically over a socket in c

I have a simple client which accepts a single uint32_t from the server through a socket. Using the solution that appeared here many times (e.g. transfer integer over a socket in C) seems to work, but:
When calling "read" on files I know that the system is not guaranteed to read the entire content of the message at once, and it therefore returns the number of bytes read. Couldn't the same happen when accepting 4 bytes over a network socket?
If this can't happen, why is that? and if it can, how is it possible to make sure the send is "atomic", or is it necessary to piece back the bytes myself?
Depending on the socket type, different protocols can be used. SOCK_STREAM (correspond to TCP on network sockets) is a stream oriented protocol, so packets may be re-combined by the sender, the receiver or any equipement in the middle.
But SOCK_DGRAM (UDP) or SOCK_SEQPACKET actually send packets that cannot be changed. In that case 4 bytes in the same packet are guaranteed be be available in the same read operation, except if the receive buffer is too small. From man socket:
If a message is too long to fit in the supplied buffer, excess
bytes may be discarded depending on the type of socket the message is
received from
So if you want to have atomic blocs, use a packet protocol and not a stream one, and make sure to have large enough receive buffers.
When calling "read" on files I know that the system is not guaranteed
to read the entire content of the message at once
That is wrong, if the requested number of bytes is available they are read:
POSIX read manual says: The value returned may be less than nbyte if the number of bytes left
in the file is less than nbyte
This is at least correct for regular files, for pipes and alike it is a different story.
Couldn't the same happen when accepting 4 bytes over a network socket?
(I suppose you are talking about TCP sockets.) That may happen with socket because underlying protocol may transport your byte in any suitable manner (read about TCP fragmentation for example), the only thing ensured is that if received bytes are received in the same order that they have been sent. So, to read a given number of bytes you have to try to read those bytes eventually with several reads. This is usually made by looping over the read until needed bytes are received and read.
If the underlying protocol is TCP/IP, which is stream-oriented (there are no "packets" or "messages", just two byte streams), then yes.
You need to take care to manage the amount of read data so that you can know where each "message" (in your case, a single integer) begins and ends.

C: Sending long strings with send()

I'm using Linux and trying to send a long message through send(). The message is 1270 bytes, but my client application is only receiving 1024 bytes.
Since 1024 bytes is such a convenient number, I'm guessing that send() can only send 1024 bytes at a time.
I looked at the man page for send, but all it says about long messages is:
When the message does not fit into the send buffer of the socket, send()
normally blocks, unless the socket has been placed in nonblocking I/O
mode. In nonblocking mode it would fail with the error EAGAIN or EWOULD-
BLOCK in this case. The select(2) call may be used to determine when it
is possible to send more data.
I'm using blocking mode, and the man page doesn't say what to do.
My exact call to send looks like this:
send(socket, message, strlen(message), 0);
Would I need to split up the string into 1024 byte chunks and send them separately? And how would my client handle this? If my client needs to do anything, I'll just mention that it's in Java and it uses InputStreamReader to receive data.
There is a misunderstanding on your part as to how send actually works. It does not send whatever you throw at it, it sends at most the number of bytes you pass in as a parameter. That means you have to check the return value of send, remove the number of bytes that actually got send from your send queue and try to send the remaining stuff with another call to send. Same holds true for recv, by the way.
In addition to what Jim Brissom said, it is worth pointing out that SOCK_STREAM sockets (ie. TCP) do not have a concept of message boundaries - the connection is just one long unstructured stream of bytes.
Just because you sent 100 bytes (say) with one send() call does not mean it will arrive as 100 bytes one recv() call. In the extreme, it may arrive in 100 separate recv()s of 1 byte each - or at the other, it may arrive as part of a larger recv(), along with prior or following data.
Your receiver must be able to handle this - if you need to define individual messages within the stream, you must impose some sort of structure upon the stream yourself, at the application layer.

How large should my recv buffer be when calling recv in the socket library

I have a few questions about the socket library in C. Here is a snippet of code I'll refer to in my questions.
char recv_buffer[3000];
recv(socket, recv_buffer, 3000, 0);
How do I decide how big to make recv_buffer? I'm using 3000, but it's arbitrary.
what happens if recv() receives a packet bigger than my buffer?
how can I know if I have received the entire message without calling recv again and have it wait forever when there is nothing to be received?
is there a way I can make a buffer not have a fixed amount of space, so that I can keep adding to it without fear of running out of space? maybe using strcat to concatenate the latest recv() response to the buffer?
I know it's a lot of questions in one, but I would greatly appreciate any responses.
The answers to these questions vary depending on whether you are using a stream socket (SOCK_STREAM) or a datagram socket (SOCK_DGRAM) - within TCP/IP, the former corresponds to TCP and the latter to UDP.
How do you know how big to make the buffer passed to recv()?
SOCK_STREAM: It doesn't really matter too much. If your protocol is a transactional / interactive one just pick a size that can hold the largest individual message / command you would reasonably expect (3000 is likely fine). If your protocol is transferring bulk data, then larger buffers can be more efficient - a good rule of thumb is around the same as the kernel receive buffer size of the socket (often something around 256kB).
SOCK_DGRAM: Use a buffer large enough to hold the biggest packet that your application-level protocol ever sends. If you're using UDP, then in general your application-level protocol shouldn't be sending packets larger than about 1400 bytes, because they'll certainly need to be fragmented and reassembled.
What happens if recv gets a packet larger than the buffer?
SOCK_STREAM: The question doesn't really make sense as put, because stream sockets don't have a concept of packets - they're just a continuous stream of bytes. If there's more bytes available to read than your buffer has room for, then they'll be queued by the OS and available for your next call to recv.
SOCK_DGRAM: The excess bytes are discarded.
How can I know if I have received the entire message?
SOCK_STREAM: You need to build some way of determining the end-of-message into your application-level protocol. Commonly this is either a length prefix (starting each message with the length of the message) or an end-of-message delimiter (which might just be a newline in a text-based protocol, for example). A third, lesser-used, option is to mandate a fixed size for each message. Combinations of these options are also possible - for example, a fixed-size header that includes a length value.
SOCK_DGRAM: An single recv call always returns a single datagram.
Is there a way I can make a buffer not have a fixed amount of space, so that I can keep adding to it without fear of running out of space?
No. However, you can try to resize the buffer using realloc() (if it was originally allocated with malloc() or calloc(), that is).
For streaming protocols such as TCP, you can pretty much set your buffer to any size. That said, common values that are powers of 2 such as 4096 or 8192 are recommended.
If there is more data then what your buffer, it will simply be saved in the kernel for your next call to recv.
Yes, you can keep growing your buffer. You can do a recv into the middle of the buffer starting at offset idx, you would do:
recv(socket, recv_buffer + idx, recv_buffer_size - idx, 0);
If you have a SOCK_STREAM socket, recv just gets "up to the first 3000 bytes" from the stream. There is no clear guidance on how big to make the buffer: the only time you know how big a stream is, is when it's all done;-).
If you have a SOCK_DGRAM socket, and the datagram is larger than the buffer, recv fills the buffer with the first part of the datagram, returns -1, and sets errno to EMSGSIZE. Unfortunately, if the protocol is UDP, this means the rest of the datagram is lost -- part of why UDP is called an unreliable protocol (I know that there are reliable datagram protocols but they aren't very popular -- I couldn't name one in the TCP/IP family, despite knowing the latter pretty well;-).
To grow a buffer dynamically, allocate it initially with malloc and use realloc as needed. But that won't help you with recv from a UDP source, alas.
For SOCK_STREAM socket, the buffer size does not really matter, because you are just pulling some of the waiting bytes and you can retrieve more in a next call. Just pick whatever buffer size you can afford.
For SOCK_DGRAM socket, you will get the fitting part of the waiting message and the rest will be discarded. You can get the waiting datagram size with the following ioctl:
#include <sys/ioctl.h>
int size;
ioctl(sockfd, FIONREAD, &size);
Alternatively you can use MSG_PEEK and MSG_TRUNC flags of the recv() call to obtain the waiting datagram size.
ssize_t size = recv(sockfd, buf, len, MSG_PEEK | MSG_TRUNC);
You need MSG_PEEK to peek (not receive) the waiting message - recv returns the real, not truncated size; and you need MSG_TRUNC to not overflow your current buffer.
Then you can just malloc(size) the real buffer and recv() datagram.
There is no absolute answer to your question, because technology is always bound to be implementation-specific. I am assuming you are communicating in UDP because incoming buffer size does not bring problem to TCP communication.
According to RFC 768, the packet size (header-inclusive) for UDP can range from 8 to 65 515 bytes. So the fail-proof size for incoming buffer is 65 507 bytes (~64KB)
However, not all large packets can be properly routed by network devices, refer to existing discussion for more information:
What is the optimal size of a UDP packet for maximum throughput?
What is the largest Safe UDP Packet Size on the Internet
16kb is about right; if you're using gigabit ethernet, each packet could be 9kb in size.

Confusion on recvfrom() in an application level protocol design

Assume Linux and UDP is used.
The manpage of recvfrom says:
The receive calls normally return any data available, up to the requested amount, rather than waiting for receipt of the full amount requested.
If this is the case, then it is highly possible to return partial application level protocol data from the socket, even if desired MAX_SIZE is set.
Should a subsequent call to recvfrom be made?
In another sense, it's also possible to have more than the data I want, such as two UDP packets in the socket's buffer. If a recvfrom() is called in this case, would it return both of them(assume within MAX_SIZE)?
I suppose there should be some application protocol level size info at the start of each UDP msg so that it won't mess up.
I think the man page you want is this one. It states that the extra data will be discarded. If there are two packets, the recvfrom call will only retrieve data from the first one.
Well..I got a better answer after searching the web:
Don't be afraid of using a big buffer and specifying a big datagram size when reading... recv() will only read ONE datagram even if there are many of them in the receive buffer and they all fit into your buffer... remember, UDP is datagram oriented, all operations are on those packets, not on bytes...
A different scenario would be faced if you used TCP sockets.... TCP doesn't have any boundary "concept", so you just read as many bytes as you want and recv() will return a number of bytes equals to MIN(bytes_in_buffer, bytes_solicited_by_your_call)
REF: http://www.developerweb.net/forum/archive/index.php/t-3396.html

How Do Sockets Work in C?

I am a bit confused about socket programming in C.
You create a socket, bind it to an interface and an IP address and get it to listen. I found a couple of web resources on that, and understood it fine. In particular, I found an article Network programming under Unix systems to be very informative.
What confuses me is the timing of data arriving on the socket.
How can you tell when packets arrive, and how big the packet is, do you have to do all the heavy lifting yourself?
My basic assumption here is that packets can be of variable length, so once binary data starts appearing down the socket, how do you begin to construct packets from that?
Short answer is that you have to do all the heavy lifting yourself. You can be notified that there is data available to be read, but you won't know how many bytes are available. In most IP protocols that use variable length packets, there will be a header with a known fixed length prepended to the packet. This header will contain the length of the packet. You read the header, get the length of the packet, then read the packet. You repeat this pattern (read header, then read packet) until communication is complete.
When reading data from a socket, you request a certain number of bytes. The read call may block until the requested number of bytes are read, but it can return fewer bytes than what was requested. When this happens, you simply retry the read, requesting the remaining bytes.
Here's a typical C function for reading a set number of bytes from a socket:
/* buffer points to memory block that is bigger than the number of bytes to be read */
/* socket is open socket that is connected to a sender */
/* bytesToRead is the number of bytes expected from the sender */
/* bytesRead is a pointer to a integer variable that will hold the number of bytes */
/* actually received from the sender. */
/* The function returns either the number of bytes read, */
/* 0 if the socket was closed by the sender, and */
/* -1 if an error occurred while reading from the socket */
int readBytes(int socket, char *buffer, int bytesToRead, int *bytesRead)
{
*bytesRead = 0;
while(*bytesRead < bytesToRead)
{
int ret = read(socket, buffer + *bytesRead, bytesToRead - *bytesRead);
if(ret <= 0)
{
/* either connection was closed or an error occurred */
return ret;
}
else
{
*bytesRead += ret;
}
}
return *bytesRead;
}
So, the answer to your question depends a fair bit on whether you are using UDP or TCP as your transport.
For UDP, life gets a lot simpler, in that you can call recv/recvfrom/recvmsg with the packet size you need (you'd likely send fixed-length packets from the source anyway), and make the assumption that if data is available, it's there in multiples of packet-length sizes. (I.E. You call recv* with the size of your sending side packet, and you're set.)
For TCP, life gets a bit more interesting - for the purpose of this explanation, I will assume that you already know how to use socket(), bind(), listen() and accept() - the latter being how you get the file descriptor (FD) of your newly made connection.
There are two ways of doing the I/O for a socket - blocking, in which you call read(fd, buf, N) and the read sits there and waits until you've read N bytes into buf - or non-blocking, in which you have to check (using select() or poll()) whether the FD is readable, and THEN do your read().
When dealing with TCP-based connections, the OS doesn't pay attention to the packet sizes, since it's considered a continual stream of data, not seperate packet-sized chunks.
If your application uses "packets" (packed or unpacked data structures that you're passing around), you ought to be able to call read() with the proper size argument, and read an entire data structure off the socket at a time. The only caveat you have to deal with, is to remember to properly byte-order any data that you're sending, in case the source and destination system are of different byte endian-ness. This applies to both UDP and TCP.
As far as *NIX socket programming is concerned, I highly recommend W. Richard Stevens' "Unix Network Programming, Vol. 1" (UNPv1) and "Advanced Programming in an Unix Environment" (APUE). The first is a tome regarding network-based programming, regardless of the transport, and the latter is a good all-around programming book as it applies to *NIX based programming. Also, look for "TCP/IP Illustrated", Volumes 1 and 2.
When you do a read on the socket, you tell it how many maximum bytes to read, but if it doesn't have that many, it gives you however many it's got. It's up to you to design the protocol so you know whether you've got a partial packet or not. For instance, in the past when sending variable length binary data, I would put an int at the beginning that said how many bytes to expect. I'd do a read requesting a number of bytes greater than the largest possible packet in my protocol, and then I'd compare the first int against however many bytes I'd received, and either process it or try more reads until I'd gotten the full packet, depending.
Sockets operate at a higher level than raw packets - it's like a file you can read/write from. Also, when you try to read from a socket, the operating system will block (put on hold) your process until it has data to fulfill the request.

Resources