I have two programs that use socket programming to communicate. Initially I will specify the no. of hops as to how many time they have to exchange messages between each other. Each time it receives a message, it will append its id to it. Hence the string grows in size every time. My program is working fine for 8000 hops, but after it crosses 8000, although program p1 sends a string of length 16388, p2 identifies that there are only 16385 in the socket ready to be read. I use ioctl() to determine the amount of characters ready to recv() in the socket, and then recv it in a char * variable...
Is it because there is a delay in the send () in p1 and recv() in p2 , that p2 identifies only 16385 characters in the socket ?
For ex:
If P1 sends length(16388)
P2 receives only the following length(16385)
Say I'm trying to send you 8 pumpkins. I put 6 of them on the table. You think, "I'm expecting 8 pumpkins, not 6. I'll wait until he puts the last two on the table." I think, "I don't want too many pumpkins 'in flight' at once. I'll wait until he takes 2 of these 6 before I put the last 2 on the table." We're stuck. We're each waiting for the other. We'll wait forever.
You are not permitted to wait until more bytes are received before accepting the bytes that have already been received. The reason for this is simple: No network protocol can allow each side to wait for the other. Since TCP permits the sending side to wait in this context, it cannot permit the receiving side to wait as well.
So accept the bytes as they are received. Don't wait for the other side to send all of them before accepting any of them. Otherwise, what happens if the other side is waiting for you to accept the first one before it sends any more?
You're probably hitting a kernel buffer limit. You can probably increase SO_RCVBUF on the receiver and it will work as you expect: SIOCINQ will eventually return the full size of the unread data.
But you shouldn't do that to ensure proper Functioning. messing with buffers should only be done when you want to tweak performance.
You should restructure the code so that you never have to ask the kernel how many bytes are available. Just read up to a reasonable limit(like 4096) and deal with one application-level message being broken up in multiple pieces. If you need message lengths/boundaries then you MUST implement them yourself on top of TCP.
Here's some silly code to read a message with a length header:
int ret, len = 0, have_read;
have_read = 0;
while (have_read < sizeof(len)) {
// This will likely always return sizeof(len) the first time.
ret = read(fd, ((char*)&len) + have_read, sizeof(len) - have_read);
if (ret <= 0) {
// Handle error.
}
have_read += ret;
}
char* buf = malloc(len);
if (!buf) {
// Handle error.
}
have_read = 0;
while (have_read < len) {
ret = read(fd, buf + have_read, len - have_read);
if (ret <= 0) {
// Handle error.
}
have_read += ret;
}
// Handle message in buf.
Related
I'm building a multi-client<->server messaging application over TCP.
I created a non blocking server using epoll to multiplex linux file descriptors.
When a fd receives data, I read() /or/ recv() into buf.
I know that I need to either specify a data length* at the start of the transmission, or use a delimiter** at the end of the transmission to segregate the messages.
*using a data length:
char *buffer_ptr = buffer;
do {
switch (recvd_bytes = recv(new_socket, buffer_ptr, rem_bytes, 0)) {
case -1: return SOCKET_ERR;
case 0: return CLOSE_SOCKET;
default: break;
}
buffer_ptr += recvd_bytes;
rem_bytes -= recvd_bytes;
} while (rem_bytes != 0);
**using a delimiter:
void get_all_buf(int sock, std::string & inStr)
{
int n = 1, total = 0, found = 0;
char c;
char temp[1024*1024];
// Keep reading up to a '\n'
while (!found) {
n = recv(sock, &temp[total], sizeof(temp) - total - 1, 0);
if (n == -1) {
/* Error, check 'errno' for more details */
break;
}
total += n;
temp[total] = '\0';
found = (strchr(temp, '\n') != 0);
}
inStr = temp;
}
My question: Is it OK to loop over recv() until one of those conditions is met? What if a client sends a bogus message length or no delimiter or there is packet loss? Wont I be stuck looping recv() in my program forever?
Is it OK to loop over recv() until one of those conditions is met?
Probably not, at least not for production-quality code. As you suggested, the problem with looping until you get the full message is that it leaves your thread at the mercy of the client -- if a client decides to only send part of the message and then wait for a long time (or even forever) without sending the last part, then your thread will be blocked (or looping) indefinitely and unable to serve any other purpose -- usually not what you want.
What if a client sends a bogus message length
Then you're in trouble (although if you've chosen a maximum-message-size you can detect obviously bogus message-lengths that are larger than that size, and defend yourself by e.g. forcibly closing the connection)
or there is packet loss?
If there is a reasonably small amount of packet loss, the TCP layer will automatically retransmit the data, so your program won't notice the difference (other than the message officially "arriving" a bit later than it otherwise would have). If there is really bad packet loss (e.g. someone pulled the Ethernet cable out of the wall for 5 minutes), then the rest of the message might be delayed for several minutes or more (until connectivity recovers, or the TCP layer gives up and closes the TCP connection), trapping your thread in the loop.
So what is the industrial-grade, evil-client-and-awful-network-proof solution to this dilemma, so that your server can remain responsive to other clients even when a particular client is not behaving itself?
The answer is this: don't depend on receiving the entire message all at once. Instead, you need to set up a simple state-machine for each client, such that you can recv() as many (or as few) bytes from that client's TCP socket as it cares to send to you at any particular time, and save those bytes to a local (per-client) buffer that is associated with that client, and then go back to your normal event loop even though you haven't received the entire message yet. Keep careful track of how many valid received-bytes-of-data you currently have on-hand from each client, and after each recv() call has returned, check to see if the associated per-client incoming-data-buffer contains an entire message yet, or not -- if it does, parse the message, act on it, then remove it from the buffer. Lather, rinse, and repeat.
I'm reading from one socket, but I don't know how many elements will arrive, so my socket remains stuck on the recv call also if the server has written all the elements of the struct array.
Client
for(;;){
...
if(strcmp(buf,"!who")==0){
while(recv(sd,name,50,0)>0)
{
printf("%s\n",name);
}
continue;
}
}
Server
while(recv(sock,msg,1024,0)>0){
if(strcmp(msg,"!who")==0){
if(i==0){
strcpy(msg2,"Nessun giocatore collegato.");
write(sock,msg2,strlen(msg2));
}
else
for(int c=0;c<=i;c++)
write(sock,players[c].name,strlen(players[c].name));
}
}
I don't know how many elements will arrive
You have to know. You have 3 options:
Sender prefixes the data with a count, telling the receiver how much to expect.
Sender concludes the data with some kind of "done" marker. In your case, a 0 would do (indicating a zero-length name).
Sender closes the connection, and receiver sees EOF.
I would change the protocol to send strlen first, then the name.
Note also that TCP has no message boundaries. recv(2) returns N bytes, where N is whatever was convenient for the kernel. If the call was interrupted by a signal, or the internal buffers were unable to accommodate the sender's output, N will be less than what was sent, and the next recv will get more of it (perhaps all of it).
As written, your code retrieves a name, followed by its length. But there is no assurance that the last bytes received constitute the length; if more data are pending, they will just be part of the name.
I'm very new to C++, but I'm trying to learn some basics of TCP socket coding. Anyway, I've been able to send and receive messages, but I want to prefix my packets with the length of the packet (like I did in C# apps I made in the past) so when my window gets the FD_READ command, I have the following code to read just the first two bytes of the packet to use as a short int.
char lengthBuffer[2];
int rec = recv(sck, lengthBuffer, sizeof(lengthBuffer), 0);
short unsigned int toRec = lengthBuffer[1] << 8 | lengthBuffer[0];
What's confusing me is that after a packet comes in the 'rec' variable, which says how many bytes were read is one, not two, and if I make the lengthBuffer three chars instead of two, it reads three bytes, but if it's four, it also reads three (only odd numbers). I can't tell if I'm making some really stupid mistake here, or fundamentally misunderstanding some part of the language or the API. I'm aware that recv doesn't guarantee any number of bytes will be read, but if it's just two, it shouldn't take multiple reads.
Because you cannot assume how much data will be available, you'll need to continuously read from the socket until you have the amount you want. Something like this should work:
ssize_t rec = 0;
do {
int result = recv(sck, &lengthBuffer[rec], sizeof(lengthBuffer) - rec, 0);
if (result == -1) {
// Handle error ...
break;
}
else if (result == 0) {
// Handle disconnect ...
break;
}
else {
rec += result;
}
}
while (rec < sizeof(lengthBuffer));
Streamed sockets:
The sockets are generally used in a streamed way: you'll receive all the data sent, but not necessarily all at once. You may as well receive pieces of data.
Your approach of sending the length is hence valid: once you've received the length, you cann then load a buffer, if needed accross successive reads, until you got everything that you expected. So you have to loop on receives, and define a strategy on how to ahandle extra bytes received.
Datagramme (packet oriented) sockets:
If your application is really packet oriented, you may consider to create a datagramme socket, by requesting linux or windows socket(), the SOCK_DGRAM, or better SOCK_SEQPACKET socket type.
Risk with your binary size data:
Be aware that the way you send and receive your size data appers to be assymetric. You have hence a major risk if the sending and receiving between machine with CPU/architectures that do not use the same endian-ness. You can find here some hints on how to ame your code platform/endian-independent.
TCP socket is a stream based, not packet (I assume you use TCP, as to send length of packet in data does not make any sense in UDP). Amount of bytes you receive at once does not have to much amount was sent. For example you may send 10 bytes, but receiver may receive 1 + 2 + 1 + 7 or whatever combination. Your code has to handle that, be able to receive data partially and react when you get enough data (that's why you send data packet length for example).
I was wondering if anyone could shed any light as to why two seperate send() calls would end up in the same recv() buffer using the loopback address for testing yet once switched to two remote machines they would require two recv() calls instead? I have been looking at the wireshark captures yet cant seem to make any sense as to why this would be occuring. Perhaps someone could critique my code and tell me where im going wrong. The two incoming messages from the server is of an undetermined length to the client. By the way i'm using BSD sockets using C in Ubuntu.
In the example shown below im parsing the entire buffer to extract the two seperate messages from it which i'll admit isn't an ideal approach.
-------SERVER SIDE--------
// Send greeting string and receive again until end of stream
ssize_t numBytesSent = send(clntSocket, greeting, greetingStringLen, 0);
if (numBytesSent < 0)
DieWithSystemMessage("send() failed");
//-----------------------------Generate "RANDOM" Message -----------------------
srand(time(NULL)); //seed random number from system clock
size_t randomStringLen = rand() % (RANDOMMSGSIZE-3); //generates random num
// betweeen 0 and 296
char randomMsg [RANDOMMSGSIZE] = "";
// declare and initialize allowable characteer set for the
const char charSet[] = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
if (randomStringLen) {
--randomStringLen;
for (size_t i = 0; i < randomStringLen; i++) {
int p = rand() % (int) (sizeof charSet - 1);
randomMsg[i] = charSet[p];
}
randomStringLen = strlen(randomMsg);
printf("Random String Size Before newline: %d\n", (int)randomStringLen);
strcat(randomMsg,"\r\n");
}
randomStringLen = strlen(randomMsg);
printf("Random String: %s\n", randomMsg);
//-----------------------------Send "RANDOM" Message ---------------------------
// Send greeting string and receive again until end of stream
numBytesSent = send(clntSocket, randomMsg, randomStringLen, 0);
if (numBytesSent < 0)
DieWithSystemMessage("send() failed");
//------------------------------------------------------------------------------
------CLIENT SIDE-------
//----------------------------- Receive Server Greeting ---------------------------
char buffer[BUFSIZE] = ""; // I/O buffer
// Receive up to the buffer size (minus 1 to leave space for
// a null terminator) bytes from the sender
ssize_t numBytesRcvd = recv(sock, buffer, BUFSIZE - 1, 0);
if (numBytesRcvd < 0)
DieWithSystemMessage("recv() failed");
buffer[numBytesRcvd] = '\0'; //terminate the string after calling recv()
printf("Buffer contains: %s\n",buffer); // Print the buffer
//printf("numBytesRecv: %d\n",(int)numBytesRcvd); // Print the buffer
//------------------------ Extracts the random message from buffer ---------------------------
char *randomMsg = strstr(buffer, "\r\n"); // searches from first occurance of substring
char randomMessage [BUFSIZE] = "";
strcat(randomMessage, randomMsg+2);
int randomStringLen = strlen(randomMessage)-2;
printf("Random Message: %s\n",randomMessage); // Print the buffer
char byteSize [10];
sprintf(byteSize,"%d", randomStringLen);
printf("ByteSize = %s\n",byteSize);
//----------------------- Send the number for random bytes recieved -------------------------
size_t byteStringLen = strlen(byteSize); // Determine input length
numBytes = send(sock, byteSize, byteStringLen, 0);
if (numBytes < 0)
DieWithSystemMessage("send() failed");
else if (numBytes != byteStringLen)
DieWithUserMessage("send()", "sent unexpected number of bytes");
shutdown(sock,SHUT_WR); // further sends are disallowed yet recieves are still possible
//----------------------------------- Recieve Cookie ----------------------------------------
On Unix systems recv and send are just special cases of the read and write that accepts additional flags. (Windows also emulates this with Winsock).
You shouldn't assume that one recv corresponds to one send because that's generally isn't true (just like you can read a file in multiple parts, even if it was written in a single write). Instead you should start each "message" with a header that tells you how long the message is, if it's important to know what were the separate messages, or just read the stream like a normal file, if it's not important.
TCP is a byte-stream protocol, not a message protocol. There is no guarantee that what you write with a single send() will be received via a single recv(). If you need message boundaries you must implement them yourself, e.g. with a length-word prefix, a type-length-value protocol, or a self-describing protocol like XML.
You're experiencing a TCP congestion avoidance optimization commonly referred to as the Nagle algorithm (named after John Nagle, its inventor).
The purpose of this optimization is to reduce the number of small TCP segments circulating over a socket by combining them together into larger ones. When you write()/send() on a TCP socket, the kernel may not transmit your data immediately; instead it may buffer the data for a very short delay (typically a few tens of milliseconds), in case another request follows.
You may disable Nagle's algorithm on a per-socket basis, by setting the TCP_NODELAY option.
It is customary to disable Nagle in latency-sensitive applications (remote control applications, online games, etc..).
With the following pseudo-Python script for sending data to a local socket:
s = socket.socket(AF_UNIX, SOCK_STREAM)
s.connect("./sock.sock")
s.send("test\n")
s.send("aaa\0")
s.close()
My C program will randomly end up recving the following buffers:
test\n
test\n<random chars>
test\naaa (as expected)
The socket is being recv()'d after select() points that the socket is readable. Question is, how to avoid the first two cases?
And side question: Is it possible to send the following two messages from that script:
asd\0
dsa\0
And have select() to show the socket as readable on each of those sends, or will it only do that if I run the script again (restarting the socket client connection) and sending a message for each connect?
At a guess, the len argument to recv specifies a maximum amount of data to read, not the precise amount to be returned. recv is free to return any amount of data up to len bytes instead.
If you want to read a specific number of bytes, call recv in a loop.
int bytes = 0;
while (bytes < len) {
int remaining = len - bytes;
int read = recv(sockfd, buf+bytes, remaining, 0);
if (read < 0) {
// error
break;
}
bytes += read;
}
As noted by junix, if you'll need to send unpredictable amounts of data, consider defining a simple protocol that either starts each message with a note of its length or ends with a particular byte or sequence of bytes.