tcp send and recv: always in loops? - c

What are the best practices when sending (or writing) and recving (or reading) to/from a TCP socket ?
Assume usual blocking I/O on sockets. From what I understand :
writing (sending) should be fine without a loop, because it will block if the write buffer of the socket is full, so something like
if ((nbytes_w = write(sock, buf, nb)) < nb)
/* something bad happened : error or interrupted by signal */
should always be correct ?
on the other hand, there is no guarantee that one will read a full message, so one should read with
while ((nbytes_r = read(sock, buf, MAX)) > 0) {
/* do something with these bytes */
/* break if encounter specific application protocol end of message flag
or total number of bytes was known from previous message
and/or application protocol header */
}
Am I correct ? Or is there some "small message size" or other conditions allowing to read safely outside a loop ?
I am confused because I have seen examples of "naked reads", for instance in Tanenbaum-Wetherall:
read(sa, buf, BUF_SIZE); /* read file name in socket */

Yes, you must loop on the receive.
Once a week I answer a question where someone's TCP app stops working for this very reason. The real killer is that they developed the client and server on the same machine, so they get a loopback connection. Almost all the time, a loopback connection delivers the sent messages in the same blocks as they were sent. This makes it look like the code is correct.
The really big challenge is that you need to know, before the loop, how big the message you are going to receive is. Possibilities:
Send a fixed-size length field first (i.e., you know it is, say, 4 bytes).
Have a recognizable end sequence (like the double CRLF at the end of HTTP request headers).
Have a fixed-size message.
I would always have a 'pull the next n bytes' function.
Writing should loop too, but that is easy; it's just a matter of looping.
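A 'pull the next n bytes' helper and its counterpart for writing could be sketched as follows. This is only a minimal sketch for blocking sockets; the names recv_exact and send_all are made up for illustration and do not come from any library.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: read exactly n bytes from a blocking socket.
   Returns n on success, a short count (possibly 0) if the peer closed
   the connection first, or -1 on error with errno set. */
ssize_t recv_exact(int fd, void *buf, size_t n)
{
    assert(buf != NULL || n == 0);
    char *p = buf;
    size_t got = 0;
    while (got < n) {
        ssize_t r = recv(fd, p + got, n - got, 0);
        if (r == 0)
            break;              /* orderly shutdown by the peer */
        if (r < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;
        }
        got += (size_t)r;
    }
    return (ssize_t)got;
}

/* Hypothetical helper: write exactly n bytes.
   Returns 0 on success, -1 on error with errno set. */
int send_all(int fd, const void *buf, size_t n)
{
    assert(buf != NULL || n == 0);
    const char *p = buf;
    size_t sent = 0;
    while (sent < n) {
        ssize_t w = send(fd, p + sent, n - sent, 0);
        if (w < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;
        }
        sent += (size_t)w;
    }
    return 0;
}
```

A caller that expects a 4-byte length field would call recv_exact(fd, &len, 4) and treat any return value other than 4 as a closed or failed connection.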

Related

What is the correct way to use send() on sockets when the full message has not been sent in one go?

I am writing a simple C server that may sometimes not send nor receive the full message. I have looked at the beej guide and the linux man page among other resources. I cannot figure out how I can send and receive when multiple send and receive calls are necessary. This is what I have tried to do for send:
char* buffer [4096];
int client_socket, buffer_len, message_len, position;
....
while (position < message_len) {
position = send(client_socket, buffer, message_len, 0);
}
I am not sure if I should be doing that or..
while (position < message_len) {
position = send(client_socket, buffer+position, message_len-position, 0);
}
The docs do not address this and I cannot find a usage example that has send within a while loop. Some C functions can track state between function calls (such as strtok) but I am not sure if send does. What I don't want to do is repeatedly send from the beginning of the message until it completes in one go.
It is necessary that I send files that are up to 50MB at a time and so there will likely be more than one call to send in this scenario.
send() returns the number of bytes sent, or -1 if an error occurred. If you keep track of how many bytes you have sent, you can use that as an offset in the buffer you send from. The length of the message that remains to be sent of course decreases by the same amount.
int bytes_sent_total = 0;
int bytes_sent_now = 0;
while (bytes_sent_total < message_len)
{
bytes_sent_now = send(client_socket, &buffer[bytes_sent_total], message_len - bytes_sent_total, 0);
if (bytes_sent_now == -1)
{
// Handle error
break;
}
bytes_sent_total += bytes_sent_now;
}
Assuming you're using a stream socket (the question doesn't say), it doesn't matter how many calls to send() your program makes. The socket library offers the abstraction of sending data as writing to a file. The network layer divides the data into small packets to send them across the network.
On the client side, the network layer reassembles the received packets and offers a similar abstraction for the client, so that receiving data is like reading from a file. So you don't have to read the entire buffer in a single call.
For the client side, this introduces a small gimmick: when to stop reading? Common idioms are:
Knowing beforehand how much data to expect (by protocol design).
Iterating reads of small chunks (say: 1k or so) with a reasonable timeout, stop on timeout.
Prepending the data with a field containing its size.
Closing the socket right after sending the data (that's what HTTP/1.0 does by default).
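The "prepend a size field" idiom from the list above might be sketched like this. The function names (xfer_all, send_msg, recv_msg), the 4-byte big-endian length field, and the max_len limit are all assumptions chosen for illustration, not part of any standard protocol.

```c
#include <arpa/inet.h>   /* htonl, ntohl */
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Loop until all n bytes are sent (sending != 0) or received.
   Returns 0 on success, -1 on error or if the peer closed early. */
static int xfer_all(int fd, void *buf, size_t n, int sending)
{
    char *p = buf;
    while (n > 0) {
        ssize_t r = sending ? send(fd, p, n, 0) : recv(fd, p, n, 0);
        if (r < 0 && errno == EINTR)
            continue;            /* interrupted by a signal: retry */
        if (r <= 0)
            return -1;
        p += r;
        n -= (size_t)r;
    }
    return 0;
}

/* Send a 4-byte big-endian length, then the payload. */
int send_msg(int fd, const void *data, uint32_t len)
{
    uint32_t be_len = htonl(len);
    if (xfer_all(fd, &be_len, sizeof be_len, 1) < 0)
        return -1;
    /* The cast drops const only because xfer_all is shared with the
       receive path; the data is never written when sending. */
    return xfer_all(fd, (void *)data, len, 1);
}

/* Receive one message into a malloc'd buffer that the caller frees.
   max_len rejects implausible lengths from a faulty peer. */
int recv_msg(int fd, void **out, uint32_t *out_len, uint32_t max_len)
{
    assert(out != NULL && out_len != NULL);
    uint32_t be_len;
    if (xfer_all(fd, &be_len, sizeof be_len, 0) < 0)
        return -1;
    uint32_t len = ntohl(be_len);
    if (len > max_len)
        return -1;
    void *buf = malloc(len ? len : 1);
    if (buf == NULL)
        return -1;
    if (xfer_all(fd, buf, len, 0) < 0) {
        free(buf);
        return -1;
    }
    *out = buf;
    *out_len = len;
    return 0;
}
```

With this framing the receiver never needs a timeout or a closed connection to find the end of a message, which is why a size prefix is often the simplest of the listed idioms to get right.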

TCP Client - Receive message from unknown / unlimited size

I am currently sitting at a university task and am facing a problem that cannot be solved for me. I'm developing a TCP client which connects to a server and gets a message from there.
The client should be able to work with strings of any length and output all received characters until the server closes the connection.
My client works and with a fixed string length, I can also receive messages from e.g. djxmmx.net port 17. However, I have no idea how to map this arbitrary length.
My C knowledge is really poor, which is why I need some suggestions, ideas or tips on how to implement my problem.
Actual this is my code for receiving messages:
// receive data from the server
char server_response[512];
recv(client_socket, &server_response, sizeof(server_response), 0);
If you're going to work with input of essentially unlimited length, you will need to call recv() several times in a loop to get each succeeding section of the input. If you can deal with each section at a time, and then discard it and move onto the next section, that's one approach. If you are going to need to process all the input in one go, you're going to have to find a way of storing arbitrarily large amounts of data, probably using dynamic memory allocation.
With recv() you will probably want to loop, reading content until it returns 0, which indicates that the peer has performed an orderly shutdown. That might look something like this:
char server_response[512];
ssize_t bytes_read;
while ((bytes_read = recv(client_socket, server_response,
                          sizeof(server_response), 0)) > 0) {
    /* do something with the data of length bytes_read
       in server_response[] (it is not NUL-terminated) */
}
/* bytes_read == 0: the peer closed the connection; -1: error, check errno */
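If all the input has to be processed in one go, the dynamic-allocation approach mentioned above could look roughly like the following. This is a sketch under the assumption that the server closes the connection when it is done; recv_until_eof is a hypothetical name, and the starting capacity of 512 is arbitrary.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Read from fd until the peer closes the connection.
   On success returns a malloc'd buffer (caller frees), NUL-terminated for
   convenience, and stores the byte count in *len. Returns NULL on error. */
char *recv_until_eof(int fd, size_t *len)
{
    assert(len != NULL);
    size_t cap = 512, used = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;
    for (;;) {
        if (used + 1 >= cap) {            /* keep one byte for the NUL */
            char *bigger = realloc(buf, cap * 2);
            if (bigger == NULL) {
                free(buf);
                return NULL;
            }
            buf = bigger;
            cap *= 2;
        }
        ssize_t r = recv(fd, buf + used, cap - used - 1, 0);
        if (r == 0)
            break;                        /* orderly shutdown: all data read */
        if (r < 0) {
            if (errno == EINTR)
                continue;                 /* interrupted by a signal: retry */
            free(buf);
            return NULL;
        }
        used += (size_t)r;
    }
    buf[used] = '\0';
    *len = used;
    return buf;
}
```

Doubling the capacity keeps the number of realloc() calls logarithmic in the message size; a real client would probably also cap the total size it is willing to accept.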

Getting two messages from receive when only one is sent

I wrote a server that should wait for messages from a client after opening a connection to it:
while(1){
if(recv(mySocket, buffer, 1000, 0) < 1){
continue;
}
printf("Message received: %s", buffer);
}
I checked with wireshark which packets were sent to this server, but for every packet sent there were 2 printf outputs.
My question is now where did I get this additional message from.
(The additional message are some random bytes. But every time the same.)
Your apparent expectations for the behavior of recv() are not justified. As @KarolyHorvath observed in comments, stream sockets (among which TCP-based sockets fall) have no sense whatever of "messages". In particular, network packets do not correspond to messages on a stream socket. POSIX has this to say about the behavior of recv(), in fact:
For stream-based sockets, [...] message boundaries shall be ignored.
Although that's more likely to have the effect of combining multiple "messages", it can also mean that a single message (as dispatched by a single send() call) is split over multiple recv() calls. It certainly will mean that if the buffer length you specify to recv() is less than the number of bytes actually received on the socket, but there are other circumstances in which that result could be obtained, too.
On success, recv() returns the number of bytes copied into the receive buffer. If you are genuinely trying to implement some sort of "message" exchange, then you can use that to help you split incoming data on message boundaries. Do recognize, however, that that constitutes implementing a message-passing protocol on top of a stream, so sender and receiver need to cooperate, at least implicitly, for it to work.
John Bollinger's answer is accurate and provides insight into what you should do to create a reliable client / server application.
Regarding your question, there is another problem that explains the actual output you see. The packet is most probably sent and received in a single chunk, as you observe with Wireshark. The bug is in your server: you receive the data in a char array and print it directly as a string with printf. I suspect the packet does not contain the terminating '\0' needed to make the buffer a proper string for "%s". printf will output the packet contents plus whatever the buffer contains after them until it reaches a '\0' byte, possibly invoking undefined behaviour. If the packet is split into several chunks, you may see the same contents several times, and random characters too.
Here is how you should fix your code:
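char buffer[2000];
...
for (;;) {
    /* leave room for the terminating '\0' */
    ssize_t count = recv(mySocket, buffer, sizeof(buffer) - 1, 0);
    if (count <= 0)
        break; /* 0: peer closed the connection; -1: error */
    buffer[count] = '\0'; /* make the received bytes a valid string */
    printf("Message received: |%s|", buffer);
}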
Note that the buffer must be at least 1 byte longer than the maximum packet size, and this tracing method cannot handle embedded '\0' bytes in the packets.
Of course the packets can be sliced and diced on the way between the client and the server, so you must deal with this appropriately to implement a proper protocol.

tcp indicating end of stream

I am having trouble ending tcp stream. I am writing a simple server and client where the client connects to the server and the server displays a welcome message asking the client for a username.
The problem is, when the server writes the message, the client's read() gets blocked. It only gets unblocked when I call shutdown().
Server:
if (FD_ISSET(tcp_listenfd, &rset)) {
len = sizeof(cliaddr);
if ((new_confd = accept(tcp_listenfd, (struct sockaddr *) &cliaddr, &len)) < 0) {
perror("accept");
exit(1);
}
/* Send connection message asking for handle */
writen(new_confd, handle_msg, strlen(handle_msg));
/* Fork here or shutdown fd is inherited */
shutdown(new_confd, SHUT_WR);
Clients:
if ((connect(sock, (struct sockaddr *) server, sizeof(struct sockaddr_in))) < 0) {
perror("inet_wstream:connect");
exit(1);
}
s_welcome_msg[19] = '\0';
readn(sock, s_welcome_msg, 20); //Blocks here if shutdown() is not called in server
The readn() and writen() functions are adapted from "The Socket Networking API" by Stevens found here: http://www.informit.com/articles/article.aspx?p=169505&seqNum=9
How do I write a welcome message from the server without calling shutdown() and not having the client block? If more context is needed, I will post more code.
Note that readn() is designed to read() in a loop until either 20 bytes are read or there's EOF or an error on the socket. If the message the server sends is less than 20 bytes long, the client will block waiting for more data.
To prevent it from blocking, you could do a normal read() (or recv()) on the socket instead. In this case, that is likely to do what you want.
In general, you can't rely on being able to pair up write()s and read()s for TCP connections though. A single write() of a string "bar" could split the data up arbitrarily. As an extreme example, three successive read()s might return "b", "a", and "r". That particular example is unlikely, but for larger write()s and read()s you have to take this into account (and for smaller transmissions too, if you want to be perfectly safe).
To work around this issue, you will have to do your own buffering on the receiving end. The simplest solution in this case is to read() one character at a time (or to use readn() with exactly the amount of data you expect, if it is known). A more general solution is to read() as much data as is currently available (make sure to check the return value of read() to see how much data you get back!) into a buffer and only acting on the data whenever you've collected enough of it. A plain read() will not block as long as there's some data available to be read, but you might get back less data than you requested.
"Enough of it" would usually be a full "message" in your protocol. You will need some way to determine message boundaries. Two alternatives are length fields (usually the best solution in my experience) or message terminators. Both would be sent along with the rest of the data.
Update:
You have a bug in your null-termination logic by the way. Reading twenty bytes into s_welcome_msg will set s_welcome_msg[19] to the last byte read, overwriting your null terminator. If you want to read a 20-byte non-null-terminated string into s_welcome_msg and null-terminate it, s_welcome_msg will need to be 21 bytes long, and you will need to do s_welcome_msg[20] = '\0'.
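The "do your own buffering" approach with a message terminator might be sketched as below. It assumes, purely for illustration, that each message ends with '\n'; struct line_reader, read_line, and LINE_BUF_SIZE are hypothetical names, not part of the Stevens code.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

#define LINE_BUF_SIZE 4096

struct line_reader {
    int fd;
    size_t used;                 /* bytes currently buffered */
    char buf[LINE_BUF_SIZE];
};

/* Copy the next '\n'-terminated message (without the '\n') into out as a
   NUL-terminated string. Returns 1 when a message was produced, 0 on EOF,
   and -1 on error or when a message does not fit. */
int read_line(struct line_reader *lr, char *out, size_t out_size)
{
    assert(lr != NULL && out != NULL && out_size > 0);
    for (;;) {
        char *nl = memchr(lr->buf, '\n', lr->used);
        if (nl != NULL) {
            size_t len = (size_t)(nl - lr->buf);
            if (len >= out_size)
                return -1;                   /* message too long for out */
            memcpy(out, lr->buf, len);
            out[len] = '\0';
            /* Keep whatever followed the terminator for the next call. */
            lr->used -= len + 1;
            memmove(lr->buf, nl + 1, lr->used);
            return 1;
        }
        if (lr->used == sizeof lr->buf)
            return -1;                       /* no terminator within limit */
        /* A plain recv() returns as soon as some data is available. */
        ssize_t r = recv(lr->fd, lr->buf + lr->used,
                         sizeof lr->buf - lr->used, 0);
        if (r <= 0)
            return (int)r;                   /* 0: EOF, -1: error */
        lr->used += (size_t)r;
    }
}
```

A reader is set up with struct line_reader lr = { .fd = sock }; and each call then yields one complete message, however the bytes were split or merged in transit.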
readn() will block until it receives the requested number of bytes (and if the receiving buffer is only 19 bytes long, reading 20 bytes into it is a buffer overflow, which is undefined behaviour and can result in a segmentation fault).
I suggest, as one possible fix, a loop around a select() call with a timeout:
When select() indicates that some data is available, read only one byte and append it to the s_welcome_msg[] buffer, always checking that the buffer does not overflow (which generally means reading at most 18 bytes, so that the result is a valid string).
Make the read() non-blocking so it will not hang.
After reading the byte, if the input buffer is not full (fewer than 18 bytes read), loop back to the select() call.
If the select() timeout occurs, assume all the data has been read and proceed to the code after the select()/read() loop.
Also remember to always reset the timeout value passed to select() before each call, because select() may modify it.

how to correctly use recv() sys call

I am in the process of writing a TCP server using the Berkeley sockets API under Linux. The clients follow a set of specifications, and each message sent by a client corresponds to one of the many structs defined by those specs. The server doesn't know which message the client is going to send at what time. Once we get the message we can figure out what it is, but not before that. The messages sent by the clients have variable lengths, so we cannot know in advance what message we are going to get. To solve this I have used the following method:
const char *buf[4096] = { 0 };
if ( recv (connected, buf, 4096, 0) == -1)
{
printf ("Error in recvng message\n");
exit (-1);
}
That is I use a default buffer of size 4096 (no message from the client can be larger than this size). receive in that buffer and then afterwards I Check the message type and take the corresponding action as follows:
struct ofp_header *oph;
oph=(struct ofp_header *)buf;
switch (oph->type)
{
case example_pkt:
handle_example_pkt();
break;
}
This works fine but I just wanted to confirm that is it an appropriate method or is there something else that could be better than this. All help much appreciated.
Thanks.
TCP is stream-based. This means if you use a buffer larger than your message you may receive part of the next message as well.
This means you will need to know the size of the message and incorporate any additional data into the next message. There are two obvious ways to do this:
Modify the protocol to send the size of each message as the first few bytes of the message. Read the size first, then only read that many bytes.
Since you know the size of each message, keep track of how many bytes you read. Process the first message in the buffer then subtract the size of that message from the remaining bytes in the buffer. Continue to repeat this process until you either A. Don't have enough bytes left to identify the message type or B. Don't have enough bytes for a message of the detected type. Save any remaining bytes and call recv again to read more data.
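The second approach, processing complete messages from a buffer and keeping the remainder, might be sketched like this. The 4-byte header layout used here (1-byte type, 1 unused byte, 2-byte big-endian total length including the header) is an assumption for illustration and is not the asker's actual struct ofp_header; peek_message and consume_message are hypothetical names.

```c
#include <arpa/inet.h>   /* ntohs */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed wire header: type (1 byte), unused (1 byte),
   total message length in network byte order (2 bytes). */
#define HDR_SIZE 4

/* If buf[0..used) starts with a complete message, store its total length
   in *len and return 1. Return 0 if more bytes are needed, or -1 if the
   header carries an impossible length. */
int peek_message(const unsigned char *buf, size_t used, size_t *len)
{
    assert(buf != NULL && len != NULL);
    if (used < HDR_SIZE)
        return 0;                    /* not enough bytes to identify it */
    uint16_t be_len;
    memcpy(&be_len, buf + 2, sizeof be_len);   /* avoid unaligned access */
    size_t total = ntohs(be_len);
    if (total < HDR_SIZE)
        return -1;
    if (used < total)
        return 0;                    /* body not fully received yet */
    *len = total;
    return 1;
}

/* Drop the first len bytes and slide the rest to the front.
   Returns the number of bytes still buffered. */
size_t consume_message(unsigned char *buf, size_t used, size_t len)
{
    assert(len <= used);
    memmove(buf, buf + len, used - len);
    return used - len;
}
```

The receive loop would append each recv() result at buf + used, then repeat peek_message, dispatch on buf[0], and consume_message until peek_message returns 0, at which point it calls recv() again.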

Resources