select()ed socket fails to recv() complete data


With the following pseudo-Python script for sending data to a local socket:
s = socket.socket(AF_UNIX, SOCK_STREAM)
s.connect("./sock.sock")
s.send("test\n")
s.send("aaa\0")
s.close()
My C program will randomly end up recving the following buffers:
test\n
test\n<random chars>
test\naaa (as expected)
The socket is recv()'d after select() reports that it is readable. The question is: how do I avoid the first two cases?
And side question: Is it possible to send the following two messages from that script:
asd\0
dsa\0
and have select() report the socket as readable for each of those sends? Or will it only do that if I run the script again (restarting the client connection) and send one message per connection?

At a guess, the len argument to recv specifies a maximum amount of data to read, not the precise amount to be returned. recv is free to return any amount of data up to len bytes instead.
If you want to read a specific number of bytes, call recv in a loop.
int bytes = 0;
while (bytes < len) {
int remaining = len - bytes;
int read = recv(sockfd, buf+bytes, remaining, 0);
if (read < 0) {
// error: inspect errno
break;
}
if (read == 0) {
// peer closed the connection before len bytes arrived
break;
}
bytes += read;
}
As noted by junix, if you'll need to send unpredictable amounts of data, consider defining a simple protocol that either starts each message with a note of its length or ends with a particular byte or sequence of bytes.
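The delimiter variant could look roughly like the sketch below. It assumes every message ends with a '\0' byte, as in the question's script. The names `struct inbuf`, `inbuf_fill` and `inbuf_next` are hypothetical helpers, not part of any library. Bytes are accumulated across recv() calls and a message is handed out only once its terminator has arrived, so a short read just means "call recv() again after the next select()".

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

#define INBUF_SIZE 4096

/* Hypothetical accumulator for bytes received but not yet consumed. */
struct inbuf {
    char   data[INBUF_SIZE];
    size_t len;
};

/* Append whatever one recv() call delivers. Returns recv()'s result:
 * >0 bytes appended, 0 peer closed, -1 error (or accumulator full). */
static ssize_t inbuf_fill(struct inbuf *in, int fd)
{
    assert(in != NULL);
    if (in->len == sizeof in->data)
        return -1;                      /* no room left: message too long */
    ssize_t n = recv(fd, in->data + in->len, sizeof in->data - in->len, 0);
    if (n > 0)
        in->len += (size_t)n;
    return n;
}

/* Copy the first complete '\0'-terminated message into out and remove it
 * from the accumulator. Returns 1 on success, 0 if no complete message
 * is buffered yet (or out is too small to hold it). */
static int inbuf_next(struct inbuf *in, char *out, size_t outsz)
{
    char *end = memchr(in->data, '\0', in->len);
    if (end == NULL)
        return 0;
    size_t msglen = (size_t)(end - in->data) + 1;   /* include the '\0' */
    if (msglen > outsz)
        return 0;
    memcpy(out, in->data, msglen);
    memmove(in->data, in->data + msglen, in->len - msglen);
    in->len -= msglen;
    return 1;
}
```

After each readable event, call `inbuf_fill()` once and then `inbuf_next()` in a loop until it returns 0, because one recv() may deliver several messages or only part of one.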

Related

Unix Socket sending/receiving long messages

I am writing a simple application-layer protocol on top of TCP and have run into a problem. I want to fragment messages when sending because they are very long (approximately 4 MB). But I cannot synchronize the two sides: the client reads an empty buffer before the server has written the data. How can I write these methods?
For client
void send_message(string message);
string receive_message()
For server
void send_message(int sock,string message)
string receive_message(int sock)
My functions are below
void send_fragment(char* buffer,int length){
int n = write(sockfd, buffer, length);
if (n < 0)
{
perror("ERROR writing to socket");
exit(1);
}
}
string receive_fragment(){
char buffer[FRAGMENT_LENGTH];
bzero(buffer,FRAGMENT_LENGTH);
int n = read(sockfd, buffer, FRAGMENT_LENGTH-1);
if (n < 0)
{
perror("ERROR reading from socket");
exit(1);
}
return string(buffer);
}
void send_message(string message){
char buffer[FRAGMENT_LENGTH];
bzero(buffer,FRAGMENT_LENGTH);
int message_length = message.length();
//computes the number of fragment
int number_of_fragment = ceil((double)message_length / FRAGMENT_LENGTH);
sprintf(buffer,"%d",number_of_fragment);
//sends the number of fragment
send_fragment(buffer,strlen(buffer));
for(int i=0;i<number_of_fragment;++i){
bzero(buffer,FRAGMENT_LENGTH);
//fragment interval
int start = i*FRAGMENT_LENGTH;
int end = (i+1)*FRAGMENT_LENGTH;
if(i==number_of_fragment-1){
end = min(end,message_length);
}
//creates a fragment
const char* fragment = message.substr(start,end).c_str();
sprintf(buffer,"%s",fragment);
//sends the fragment
send_fragment(buffer,strlen(buffer));
}
}
string receive_message(){
//receive and computes the number of fragment
string number_of_fragment_string = receive_fragment();
int number_of_fragment = atoi(number_of_fragment_string.c_str());
string message ="";
for(int i=0;i<number_of_fragment;++i){
//concatenating fragments
message += receive_fragment();
}
return message;
}

You have to implement the framing in your own code. TCP is a "stream" meaning it just sends bytes without any sort of start/end indication. (UDP is packet-based but not suitable for packets of your size.)
The simplest method would be to write a 4-byte length to the socket and have the receiving side read those bytes, remembering that endianness is an issue (use htonl() and ntohl() to convert between the local representation and "network order").
Then proceed to read that number of bytes. When that is done, you've received your message.
If you use blocking reads, it'll be fairly simple: keep reading until you have the full count, and if a read returns 0 before that, the connection has been closed. If you use non-blocking reads, you have to assemble the pieces you get back from each read call (you could even get the length in pieces, though that is unlikely).
There are other ways of framing your data but this is the simplest.
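A minimal sketch of this length-prefix framing, assuming blocking sockets. The helper names (`write_all`, `read_all`, `send_frame`, `recv_frame`) are made up for illustration, and EINTR handling is left out for brevity.

```c
#include <assert.h>
#include <arpa/inet.h>   /* htonl, ntohl */
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Write exactly len bytes; returns 0 on success, -1 on error. */
static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = send(fd, p, len, 0);
        if (n < 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Read exactly len bytes; returns 0 on success, -1 on error or early EOF. */
static int read_all(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = recv(fd, p, len, 0);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Send one frame: 4-byte length in network order, then the payload. */
static int send_frame(int fd, const void *payload, uint32_t len)
{
    uint32_t be_len = htonl(len);
    if (write_all(fd, &be_len, sizeof be_len) < 0)
        return -1;
    return write_all(fd, payload, len);
}

/* Receive one frame into a malloc'd buffer (caller frees).
 * Returns 0 on success and stores the payload and its length. */
static int recv_frame(int fd, char **payload, uint32_t *len)
{
    uint32_t be_len;
    assert(payload != NULL && len != NULL);
    if (read_all(fd, &be_len, sizeof be_len) < 0)
        return -1;
    *len = ntohl(be_len);
    *payload = malloc(*len ? *len : 1);
    if (*payload == NULL)
        return -1;
    if (read_all(fd, *payload, *len) < 0) {
        free(*payload);
        return -1;
    }
    return 0;
}
```

A real receiver would probably also reject lengths above some sane maximum before allocating, since the length comes from the peer.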

You're ignoring the count returned by recv(). Instead of constructing a string with the entire buffer, construct it from only that many bytes of the buffer.

1) Create send_message() and receive_message() using send() and recv().
2) Select appropriate flags for recv(); see the recv() man page for the available flags: http://linux.die.net/man/2/recv
3) Use a delimiter at the start and end of each transmitted message to mark its beginning and end, so that the receiver can check for them.

Maximum data size that can be sent and received using sockets, at once?(TCP socket)

I am designing a game which has master and multiple players. They send and receive data using TCP sockets.
Players transfer character strings between themselves via TCP sockets. The programs run on Red Hat Linux 6.
The character string transferred between players is of the type
char chain[2*hops+10];
The player code on sender side is
len = send(to,chain,sizeof(chain),0);
if (len != sizeof(chain)) {
perror("send");
exit(1);}
The code where player receives the data is like this :
char chain[2*hops+10];
len = recv(current,chain,sizeof(chain),0);
The value of hops is same for both the players.
For hops values up to around 8000 it works fine, but once hops crosses some point the same program stops working. I believe the data is not transferred in one go.
Is there a maximum buffer size for send and recv buffer?
Note: The sockets between them are opened using this code:
s = socket(AF_INET, SOCK_STREAM, 0);
and then the usual connect and bind sockets on both sides.

TCP is a stream-oriented protocol (as implied by SOCK_STREAM). Data that an application sends or receives (in [maximum-sized] chunks) is not received or sent in same-sized chunks. Thus one should read from a socket until enough data to be processed have been received, then attempt to process said data, and repeat:
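while (true) {
unsigned char buffer [4096] = {0};
size_t nbuffer = 0;
while (nbuffer < sizeof buffer) {
/* Append after the bytes already received; never ask for more than the space left */
ssize_t len = recv (sockd, buffer + nbuffer, sizeof buffer - nbuffer, 0);
if (len <= 0)
break; /* 0: peer closed the connection; -1: error, check errno */
nbuffer += (size_t) len;
}
if (nbuffer < sizeof buffer)
break; /* Incomplete chunk: stop instead of processing it */
/* We have a whole chunk, process it: */
;
}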
You can also handle partial sends on the other side as described here, much better than I ever would.

(C Socket Programming) Separate send() calls from server ending up in same client recv() buffer

I was wondering if anyone could shed any light on why two separate send() calls would end up in the same recv() buffer when using the loopback address for testing, yet require two recv() calls once switched to two remote machines. I have been looking at the Wireshark captures but can't make sense of why this is occurring. Perhaps someone could critique my code and tell me where I'm going wrong. The two incoming messages from the server are of undetermined length to the client. By the way, I'm using BSD sockets in C on Ubuntu.
In the example shown below I'm parsing the entire buffer to extract the two separate messages from it, which I'll admit isn't an ideal approach.
-------SERVER SIDE--------
// Send greeting string and receive again until end of stream
ssize_t numBytesSent = send(clntSocket, greeting, greetingStringLen, 0);
if (numBytesSent < 0)
DieWithSystemMessage("send() failed");
//-----------------------------Generate "RANDOM" Message -----------------------
srand(time(NULL)); //seed random number from system clock
size_t randomStringLen = rand() % (RANDOMMSGSIZE-3); //generates random num
// betweeen 0 and 296
char randomMsg [RANDOMMSGSIZE] = "";
// declare and initialize allowable characteer set for the
const char charSet[] = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
if (randomStringLen) {
--randomStringLen;
for (size_t i = 0; i < randomStringLen; i++) {
int p = rand() % (int) (sizeof charSet - 1);
randomMsg[i] = charSet[p];
}
randomStringLen = strlen(randomMsg);
printf("Random String Size Before newline: %d\n", (int)randomStringLen);
strcat(randomMsg,"\r\n");
}
randomStringLen = strlen(randomMsg);
printf("Random String: %s\n", randomMsg);
//-----------------------------Send "RANDOM" Message ---------------------------
// Send greeting string and receive again until end of stream
numBytesSent = send(clntSocket, randomMsg, randomStringLen, 0);
if (numBytesSent < 0)
DieWithSystemMessage("send() failed");
//------------------------------------------------------------------------------
------CLIENT SIDE-------
//----------------------------- Receive Server Greeting ---------------------------
char buffer[BUFSIZE] = ""; // I/O buffer
// Receive up to the buffer size (minus 1 to leave space for
// a null terminator) bytes from the sender
ssize_t numBytesRcvd = recv(sock, buffer, BUFSIZE - 1, 0);
if (numBytesRcvd < 0)
DieWithSystemMessage("recv() failed");
buffer[numBytesRcvd] = '\0'; //terminate the string after calling recv()
printf("Buffer contains: %s\n",buffer); // Print the buffer
//printf("numBytesRecv: %d\n",(int)numBytesRcvd); // Print the buffer
//------------------------ Extracts the random message from buffer ---------------------------
char *randomMsg = strstr(buffer, "\r\n"); // searches from first occurance of substring
char randomMessage [BUFSIZE] = "";
strcat(randomMessage, randomMsg+2);
int randomStringLen = strlen(randomMessage)-2;
printf("Random Message: %s\n",randomMessage); // Print the buffer
char byteSize [10];
sprintf(byteSize,"%d", randomStringLen);
printf("ByteSize = %s\n",byteSize);
//----------------------- Send the number for random bytes recieved -------------------------
size_t byteStringLen = strlen(byteSize); // Determine input length
numBytes = send(sock, byteSize, byteStringLen, 0);
if (numBytes < 0)
DieWithSystemMessage("send() failed");
else if (numBytes != byteStringLen)
DieWithUserMessage("send()", "sent unexpected number of bytes");
shutdown(sock,SHUT_WR); // further sends are disallowed yet recieves are still possible
//----------------------------------- Recieve Cookie ----------------------------------------

On Unix systems, recv and send are just special cases of read and write that accept additional flags. (Windows also emulates this with Winsock.)
You shouldn't assume that one recv corresponds to one send, because that generally isn't true (just as you can read a file in multiple parts even if it was written in a single write). Instead, if it's important to know where the separate messages begin and end, start each "message" with a header that tells you how long the message is; if it's not important, just read the stream like a normal file.

TCP is a byte-stream protocol, not a message protocol. There is no guarantee that what you write with a single send() will be received via a single recv(). If you need message boundaries you must implement them yourself, e.g. with a length-word prefix, a type-length-value protocol, or a self-describing protocol like XML.

You're experiencing a TCP congestion avoidance optimization commonly referred to as the Nagle algorithm (named after John Nagle, its inventor).
The purpose of this optimization is to reduce the number of small TCP segments circulating over a socket by combining them together into larger ones. When you write()/send() on a TCP socket, the kernel may not transmit your data immediately; instead it may buffer the data for a very short delay (typically a few tens of milliseconds), in case another request follows.
You may disable Nagle's algorithm on a per-socket basis, by setting the TCP_NODELAY option.
It is customary to disable Nagle in latency-sensitive applications (remote control applications, online games, etc.).
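For reference, setting the option could be sketched as follows. The wrapper name `disable_nagle` is hypothetical; `setsockopt()` with `IPPROTO_TCP` and `TCP_NODELAY` is the standard call. Note that this only affects when small segments are sent. It does not turn TCP into a message protocol, so the receiver still cannot rely on one recv() per send().

```c
#include <assert.h>
#include <netinet/in.h>    /* IPPROTO_TCP */
#include <netinet/tcp.h>   /* TCP_NODELAY */
#include <sys/socket.h>
#include <unistd.h>

/* Disable Nagle's algorithm on a TCP socket.
 * Returns 0 on success, -1 on error (errno is set by setsockopt). */
static int disable_nagle(int sockfd)
{
    int on = 1;
    assert(sockfd >= 0);
    return setsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof on);
}
```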

How recv() function works when looping?

I read in MSDN about the send() and recv() function, and there is one thing that I'm not sure I understand.
If I send a buffer of size 256, for example, and receive only the first 5 bytes, will the next call to recv() continue from the 6th byte?
for example :
char buff[256];
memcpy(buff, "hello world", 12);
send(sockfd, buff, 100, 0); // sending 100 bytes
// server side:
char buff[256];
recv(sockfd, buff, 5, 0); // now buff contains "Hello"?
recv(sockfd, buff, 5, 0); // now I override the data and buff contains "World"?
thanks!

The correct way to receive into a buffer in a loop from TCP in C is as follows:
char buffer[8192]; // or whatever you like, but best to keep it large
int count = 0;
int total = 0;
while ((count = recv(socket, &buffer[total], sizeof buffer - total, 0)) > 0)
{
total += count;
// At this point the buffer is valid from 0..total-1, if that's enough then process it and break, otherwise continue
}
if (count == -1)
{
perror("recv");
}
else if (count == 0)
{
// EOS on the socket: close it, exit the thread, etc.
}

You have left out the key detail: what kind of socket is used and which protocol is requested. With TCP, data has byte (octet) granularity, so yes: if 256 bytes were sent and you have read only 5, the remaining 251 wait in the socket buffer (assuming the buffer is large enough, which is true for any non-embedded system) and you can get them with the next recv(). With UDP and without MSG_PEEK, the unread remainder of a datagram is lost; if MSG_PEEK is specified, the next recv() returns the same datagram from the very beginning. With SCTP or another "sequential packet" protocol, the behavior is, AFAIK, the same as with UDP, but I'm unsure about the specifics of the Windows implementation.
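The stream case can be tried out locally with a small sketch like this one, assuming a connected SOCK_STREAM socket. The helper name `recv_some` is made up for the example.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Receive at most `want` bytes and NUL-terminate them in out
 * (out must have room for want + 1 bytes). Returns recv()'s result. */
static ssize_t recv_some(int fd, char *out, size_t want)
{
    assert(out != NULL);
    ssize_t n = recv(fd, out, want, 0);
    out[n > 0 ? n : 0] = '\0';
    return n;
}
```

If the peer has sent the 11 bytes of "hello world" and they have all arrived, three calls with `want` set to 5 yield "hello", then " worl", then "d". Each call picks up where the previous one stopped, and the last one returns only the single byte that is left.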

Read from socket

I need to read from an AF_UNIX socket to a buffer using the function read from C, but I don't know the buffer size.
I think the best way is to read N bytes until the read returns 0 (no more writers in the socket). Is this correct? Is there a way to guess the size of the buffer being written on the socket?
I was thinking that a socket is a special file. Would opening the file in binary mode and getting its size tell me the correct size to give to the buffer?
I'm very new to C, so please keep that in mind.

One common way is to use ioctl(..) to query FIONREAD on the socket, which returns how much data is currently available to read.
int len = 0;
ioctl(sock, FIONREAD, &len);
if (len > 0) {
len = read(sock, buffer, len);
}

One way to read an unknown amount from the socket while avoiding blocking could be to poll() a non-blocking socket for data.
E.g.
char buffer[1024];
int ptr = 0;
ssize_t rc;
struct pollfd fd = {
.fd = sock,
.events = POLLIN
};
poll(&fd, 1, 0); // Doesn't wait for data to arrive.
while ( fd.revents & POLLIN )
{
rc = read(sock, buffer + ptr, sizeof(buffer) - ptr);
if ( rc <= 0 )
break;
ptr += rc;
poll(&fd, 1, 0);
}
printf("Read %d bytes from sock.\n", ptr);

I think the best way is to read N bytes until the read returns 0 (no more writers in the socket). Is this correct?
0 means EOF: the other side has closed the connection. So if the other side closes the connection when it is done, your approach is correct.
If the connection isn't closed (multiple transfers over the same connection, a chatty protocol), the situation is a bit more complicated, and the behavior generally depends on whether you have a SOCK_STREAM or a SOCK_DGRAM socket.
Datagram sockets are already delimited for you by the OS.
Stream sockets do not delimit messages (all data is an opaque byte stream), so if you want message boundaries you have to implement them at the application level: for example, by defining a size field in a message header structure or by using a delimiter (e.g. '\n' for single-line text messages). In the first case you would first read the header, extract the length, and use that length to read the rest of the message. In the other case, read the stream into a partial buffer, search for the delimiter, and extract the message, including the delimiter, from the buffer (you might need to keep the partial buffer around because, depending on the protocol, several commands can be received with a single recv()/read()).
Is there a way to guess the size of the buffer being written on the socket?
For stream sockets there is no reliable way, as the other side might still be in the process of writing the data. Imagine a quite normal case: the socket buffer is 32K and 128K is being written. The writing application blocks inside send()/write() while the OS waits for the reading application to read out the data and thus free space for the next chunk of written data.
For datagram sockets, one normally knows the size of the message beforehand. Or one can try (I never did that myself) recvmsg() with MSG_PEEK and, if MSG_TRUNC is set in the returned msghdr.msg_flags, increase the buffer size and try again.
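The MSG_PEEK idea might be sketched as below. This is only an illustration under the assumption that the socket is a datagram socket whose implementation reports MSG_TRUNC in `msg_flags` for a peeked, truncated datagram. The function name `recv_whole_datagram` and the starting size of 64 bytes are arbitrary choices, not an established API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Receive one whole datagram into a malloc'd buffer (caller frees).
 * Peeks with a growing buffer until MSG_TRUNC is no longer reported,
 * then consumes the datagram. Returns its length, or -1 on error. */
static ssize_t recv_whole_datagram(int fd, char **out)
{
    size_t cap = 64;                 /* arbitrary starting size */
    char *buf = NULL;
    assert(out != NULL);

    for (;;) {
        char *tmp = realloc(buf, cap);
        if (tmp == NULL) { free(buf); return -1; }
        buf = tmp;

        struct iovec iov = { .iov_base = buf, .iov_len = cap };
        struct msghdr msg;
        memset(&msg, 0, sizeof msg);
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;

        /* MSG_PEEK leaves the datagram queued so it can be re-read. */
        if (recvmsg(fd, &msg, MSG_PEEK) < 0) { free(buf); return -1; }
        if (!(msg.msg_flags & MSG_TRUNC))
            break;                   /* the datagram fits in cap bytes */
        cap *= 2;
    }

    ssize_t n = recv(fd, buf, cap, 0);   /* now actually consume it */
    if (n < 0) { free(buf); return -1; }
    *out = buf;
    return n;
}
```

The peek-then-read sequence assumes a single reader on the socket; with several readers, another one could take the datagram between the two calls.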

You are correct. If you don't know the size of the input, you can just read one byte at a time and append it to a larger buffer.

read N bytes until the read returns 0
Yes!
One added detail: if the sender doesn't close the connection, read() on a blocking socket will simply block instead of returning 0. A nonblocking socket will return -1 (with errno == EAGAIN or EWOULDBLOCK) when there's nothing to read; that's another case.
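The three outcomes can be told apart as in the following sketch, which assumes the socket has been switched to non-blocking mode. The helper names and the `read_status` enum are invented for the example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Put fd into non-blocking mode. Returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Result of one read attempt on a non-blocking socket. */
enum read_status { READ_DATA, READ_EOF, READ_WOULD_BLOCK, READ_ERROR };

/* Try to read up to cap bytes (cap must be > 0); *got receives the count. */
static enum read_status read_once(int fd, char *buf, size_t cap, size_t *got)
{
    assert(buf != NULL && got != NULL && cap > 0);
    ssize_t n = read(fd, buf, cap);
    *got = n > 0 ? (size_t)n : 0;
    if (n > 0)
        return READ_DATA;
    if (n == 0)
        return READ_EOF;                /* peer closed the connection */
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return READ_WOULD_BLOCK;        /* nothing to read right now */
    return READ_ERROR;
}
```

Only `READ_EOF` means that no more data will ever arrive; `READ_WOULD_BLOCK` just means "try again later", typically after select() or poll() reports the socket as readable.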
Opening the file in binary mode and getting the size would help me in knowing the correct size to give to the buffer?
Nope. Sockets don't have a size. Suppose you sent two messages over the same connection: How long is the file?