Sockets - Reading and writing [duplicate] - c

I'm very new to C++, but I'm trying to learn some basics of TCP socket coding. Anyway, I've been able to send and receive messages, but I want to prefix my packets with the length of the packet (like I did in C# apps I made in the past) so when my window gets the FD_READ command, I have the following code to read just the first two bytes of the packet to use as a short int.
char lengthBuffer[2];
int rec = recv(sck, lengthBuffer, sizeof(lengthBuffer), 0);
short unsigned int toRec = lengthBuffer[1] << 8 | lengthBuffer[0];
What's confusing me is that after a packet comes in, the 'rec' variable, which says how many bytes were read, is one, not two. If I make lengthBuffer three chars instead of two, it reads three bytes, but if it's four, it also reads three (only odd numbers). I can't tell if I'm making some really stupid mistake here, or fundamentally misunderstanding some part of the language or the API. I'm aware that recv doesn't guarantee any number of bytes will be read, but if it's just two, it shouldn't take multiple reads.

Because you cannot assume how much data will be available, you'll need to continuously read from the socket until you have the amount you want. Something like this should work:
int rec = 0; // bytes of the length prefix received so far
do {
    int result = recv(sck, &lengthBuffer[rec], sizeof(lengthBuffer) - rec, 0);
    if (result == -1) {
        // Handle error ...
        break;
    }
    else if (result == 0) {
        // Handle disconnect ...
        break;
    }
    else {
        rec += result;
    }
}
while (rec < (int)sizeof(lengthBuffer));
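Once both bytes have arrived, there is a second thing to watch in the question's decoding line: plain char may be signed, so a byte of 0x80 or above gets sign-extended before the OR. Here is a minimal sketch of decoding through unsigned char instead. The name decode_len_le is made up for illustration, and little-endian order is assumed only because that is what the question's expression implies.

```c
#include <stdint.h>

/* Hypothetical helper: decode a 2-byte little-endian length prefix.
 * Reading through unsigned char avoids sign extension for bytes >= 0x80. */
static uint16_t decode_len_le(const char *buf)
{
    const unsigned char *b = (const unsigned char *)buf;
    return (uint16_t)(b[0] | (b[1] << 8));
}
```

It would be called as decode_len_le(lengthBuffer) after the loop has collected both bytes.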

Streamed sockets:
Sockets are generally used in a streamed way: you'll receive all the data sent, but not necessarily all at once. You may just as well receive it in pieces.
Your approach of sending the length is hence valid: once you've received the length, you can then fill a buffer, across successive reads if needed, until you have everything you expected. So you have to loop on receives and define a strategy for handling any extra bytes received.
Datagram (packet-oriented) sockets:
If your application is really packet oriented, you may consider creating a datagram socket by passing the SOCK_DGRAM, or better SOCK_SEQPACKET, socket type to socket() on Linux or Windows.
Risk with your binary size data:
Be aware that the way you send and receive your size data appears to be asymmetric. You hence run a major risk if the sending and receiving machines have CPUs/architectures that do not use the same endianness. You can find here some hints on how to make your code platform/endian-independent.
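One common way to remove the endianness risk is to put the length on the wire in network byte order. This is only a sketch: put_len16 and get_len16 are made-up names, and it assumes a 16-bit length as in the question.

```c
#include <arpa/inet.h>  /* htons, ntohs */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store a 16-bit length in network (big-endian) order. */
static void put_len16(unsigned char out[2], uint16_t len)
{
    uint16_t wire = htons(len);
    memcpy(out, &wire, sizeof wire);
}

/* Hypothetical helper: read back a length written by put_len16. */
static uint16_t get_len16(const unsigned char in[2])
{
    uint16_t wire;
    memcpy(&wire, in, sizeof wire);
    return ntohs(wire);
}
```

Both sides then agree on the byte layout regardless of their CPU. On Windows the same functions come from winsock2.h rather than arpa/inet.h.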

A TCP socket is stream based, not packet based (I assume you use TCP, as sending the length of the packet in the data would not make sense in UDP). The number of bytes you receive at once does not have to match the number that was sent. For example, you may send 10 bytes, but the receiver may receive 1 + 2 + 1 + 6 or whatever combination. Your code has to handle that: it must be able to receive data partially and react once it has enough data (that's why you send the packet length, for example).

Related

Using C POSIX sockets, can you determine how many bytes a socket contains without extracting?

I'm working with POSIX sockets in C.
Given X, I have a need to verify that the socketfd contains at least X bytes before proceeding to perform an operation with it.
With that being said, I don't want to receive X bytes and store it into a buffer using recv as X has the potential of being very large.
My first idea was to use MSG_PEEK...
int X = 9999999;
char buffer[1];
int num_bytes = recv(socketfd, buffer, X, MSG_PEEK);
(num_bytes == X) ? good : bad;
...
...
...
// Do some operation
But I'm concerned that X > 1 corrupts memory. The MSG_TRUNC flag seems to resolve the memory concern but removes X bytes from socketfd.
There's a big difference between e.g. TCP and UDP in this regard.
UDP is packet based: you send and receive discrete datagrams, and each receive call returns exactly one of them.
TCP is a streaming protocol, where data begins to stream on connection and stops at disconnection. There are no message boundaries or delimiters in TCP, other than what you add at the application layer. It's simply a stream of bytes without any meaning (in TCP's point of view).
That means there's no way to tell how much will be received with a single recv call.
You need to come up with an application-level protocol (on top of TCP) that tells the receiver where each message ends. For example, there might be a fixed-size header that contains the size of the data that follows, or you could have a specific delimiter between messages, something that can't occur in the stream of bytes.
Then you receive in a loop until you have either received all the data or received the delimiter. But note that with a delimiter you may also receive the beginning of the next message, so you need to be able to handle a partial next message after the current message has been fully received.
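To make the delimiter case concrete, here is a rough sketch of an accumulation buffer that hands out one '\n'-terminated message at a time and keeps whatever follows for the next call. The struct and function names and the 4096-byte capacity are assumptions for illustration. The bytes passed to line_buf_append would come from recv().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical accumulation buffer for one connection. */
struct line_buf {
    char   data[4096];
    size_t used;        /* number of valid bytes in data */
};

/* Append freshly received bytes. Returns 0 on success, -1 if they don't fit. */
static int line_buf_append(struct line_buf *lb, const char *src, size_t n)
{
    if (n > sizeof lb->data - lb->used)
        return -1;
    memcpy(lb->data + lb->used, src, n);
    lb->used += n;
    return 0;
}

/* If a complete '\n'-terminated message is buffered, copy it (without the
 * delimiter) into out as a C string and return 1. Returns 0 if no complete
 * message is available yet, -1 if out is too small. */
static int line_buf_pop(struct line_buf *lb, char *out, size_t out_size)
{
    char *nl = memchr(lb->data, '\n', lb->used);
    if (nl == NULL)
        return 0;
    size_t len = (size_t)(nl - lb->data);
    if (len + 1 > out_size)
        return -1;
    memcpy(out, lb->data, len);
    out[len] = '\0';
    /* Keep the partial beginning of the next message at the front. */
    lb->used -= len + 1;
    memmove(lb->data, nl + 1, lb->used);
    return 1;
}
```

After each recv() you would append the new bytes and then call line_buf_pop in a loop until it returns 0.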
int num_bytes = recv(socketfd, buffer, X, MSG_PEEK);
This will copy up to X bytes into buffer and return the count without removing the data from the socket. But your buffer is only 1 byte large, so any X > 1 can write past its end. Increase your buffer.
Have you tried this?
ssize_t available = recv(socketfd, NULL, 0, MSG_PEEK | MSG_TRUNC);
Or this?
int available = 0; // FIONREAD stores its result in an int
ioctl(socketfd, FIONREAD, &available);

Reading from Socket in a loop?

I am creating a server/client TCP in C.
The idea is for the server to send a relatively large amount of information. However, the buffer in the client has a size of only 512 (I don't want to increase this size), and obviously, the information sent by the server is larger than this. Let's imagine 812 bytes.
What I want to do is, in the client, read 512 bytes, print them on the client's console, and then read the remaining bytes, and print them as well.
Here's what should happen:
1) Create server, and block in the read() system call (waiting for the client to write something);
2) Create the client, write something to the socket, and then block on read(), waiting for the server to respond;
3) The server's read() call returns, and now server has to send that large amount of data, using the following code (after creating a new process):
dup2(new_socketfd, STDOUT_FILENO); // Redirect to socket
execlp("/application", "application", NULL); // Application that prints the information to send to the client
Let's imagine "application" printed 812 bytes of data to the socket.
4) Now the client has to read 812 bytes, with a buffer size of 512. That's my problem.
How can I approach this problem? I was wondering if I could make a loop, and read until there's nothing to read, 512 by 512 bytes. But as soon as there's nothing to read, client will block on read().
Any ideas?
recv will block when there is no data in the stream. When data is available, recv returns the number of bytes it extracted from the stream.
You can write a simple function to extract the full data just by using an offset variable and checking the return value.
A simple function like this will do.
ssize_t readfull(int descriptor, char* buffer, ssize_t sizetoread){
    ssize_t offset = 0;
    while (offset < sizetoread) {
        ssize_t read = recv(descriptor, buffer+offset, sizetoread-offset, 0);
        if(read < 1){
            return offset;
        }
        offset += read;
    }
    return offset;
}
Also, servers typically signal in some way when the data is finished. Either the server first sends the length of the message as a fixed-size field, say four or eight bytes, and then sends the data, so you know ahead of time how much to read; or, as in HTTP for example, there is a Content-Length field as well as '\r\n' delimiters.
Realistically, there is no practical way to know how much data the server has available to send you. The server has to tell you how much data there is through some kind of indicator.
Since you're writing the server yourself, you can first send a four byte message which can be an int value of how much data the client should read.
So your server can look like this:
int sizetosend = arbitrarysize;
send(descriptor,(char*)&sizetosend,sizeof(int),0);
send(descriptor,buffer,sizetosend,0);
Then on your client side, read four bytes then the buffer.
int sizetoread = 0;
// Even a four-byte read can come back short, so use readfull here too
ssize_t read = readfull(descriptor,(char*)&sizetoread,sizeof(int));
if(read < (ssize_t)sizeof(int))
return;
//Now just follow the code I posted above
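If you would rather not change what the server sends, another option for the 512-byte buffer case is to read chunk by chunk until recv returns 0. This sketch assumes the server side closes the socket once "application" has finished (so the parent must not keep its own copy of the descriptor open). copy_until_eof is a made-up name.

```c
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: read from fd in chunks of at most 512 bytes and write
 * each chunk to out until the peer closes the connection.
 * Returns the total number of bytes read, or -1 on error. */
static long copy_until_eof(int fd, FILE *out)
{
    char buf[512];
    long total = 0;
    for (;;) {
        ssize_t n = recv(fd, buf, sizeof buf, 0);
        if (n == -1)
            return -1;            /* errno describes the failure */
        if (n == 0)
            break;                /* peer closed: no more data will come */
        fwrite(buf, 1, (size_t)n, out);
        total += n;
    }
    return total;
}
```

Called as copy_until_eof(sockfd, stdout), it would print the 812 bytes in however many pieces they arrive and then return instead of blocking forever.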

Is it OK to loop over recv / read to read all data from socket

I'm building a multi-client<->server messaging application over TCP.
I created a non blocking server using epoll to multiplex linux file descriptors.
When a fd receives data, I read() /or/ recv() into buf.
I know that I need to either specify a data length* at the start of the transmission, or use a delimiter** at the end of the transmission to segregate the messages.
*using a data length:
char *buffer_ptr = buffer;
do {
switch (recvd_bytes = recv(new_socket, buffer_ptr, rem_bytes, 0)) {
case -1: return SOCKET_ERR;
case 0: return CLOSE_SOCKET;
default: break;
}
buffer_ptr += recvd_bytes;
rem_bytes -= recvd_bytes;
} while (rem_bytes != 0);
**using a delimiter:
void get_all_buf(int sock, std::string & inStr)
{
int n = 1, total = 0, found = 0;
char c;
char temp[1024*1024];
// Keep reading up to a '\n'
while (!found) {
n = recv(sock, &temp[total], sizeof(temp) - total - 1, 0);
if (n == -1) {
/* Error, check 'errno' for more details */
break;
}
total += n;
temp[total] = '\0';
found = (strchr(temp, '\n') != 0);
}
inStr = temp;
}
My question: Is it OK to loop over recv() until one of those conditions is met? What if a client sends a bogus message length or no delimiter, or there is packet loss? Won't I be stuck looping recv() in my program forever?
Is it OK to loop over recv() until one of those conditions is met?
Probably not, at least not for production-quality code. As you suggested, the problem with looping until you get the full message is that it leaves your thread at the mercy of the client -- if a client decides to only send part of the message and then wait for a long time (or even forever) without sending the last part, then your thread will be blocked (or looping) indefinitely and unable to serve any other purpose -- usually not what you want.
What if a client sends a bogus message length
Then you're in trouble (although if you've chosen a maximum message size, you can detect obviously bogus message lengths that are larger than that size, and defend yourself by e.g. forcibly closing the connection).
or there is packet loss?
If there is a reasonably small amount of packet loss, the TCP layer will automatically retransmit the data, so your program won't notice the difference (other than the message officially "arriving" a bit later than it otherwise would have). If there is really bad packet loss (e.g. someone pulled the Ethernet cable out of the wall for 5 minutes), then the rest of the message might be delayed for several minutes or more (until connectivity recovers, or the TCP layer gives up and closes the TCP connection), trapping your thread in the loop.
So what is the industrial-grade, evil-client-and-awful-network-proof solution to this dilemma, so that your server can remain responsive to other clients even when a particular client is not behaving itself?
The answer is this: don't depend on receiving the entire message all at once. Instead, you need to set up a simple state-machine for each client, such that you can recv() as many (or as few) bytes from that client's TCP socket as it cares to send to you at any particular time, and save those bytes to a local (per-client) buffer that is associated with that client, and then go back to your normal event loop even though you haven't received the entire message yet. Keep careful track of how many valid received-bytes-of-data you currently have on-hand from each client, and after each recv() call has returned, check to see if the associated per-client incoming-data-buffer contains an entire message yet, or not -- if it does, parse the message, act on it, then remove it from the buffer. Lather, rinse, and repeat.
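As a rough illustration of that per-client state machine, here is a sketch for length-prefixed messages. All names are made up, and it assumes a 4-byte length in network byte order and a maximum body size of 1024 bytes. The bytes given to client_feed are whatever a single non-blocking recv() returned for that client.

```c
#include <arpa/inet.h>   /* ntohl */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_MSG 1024     /* assumed upper bound on a message body */

/* Hypothetical per-client receive state. */
struct client_buf {
    unsigned char data[4 + MAX_MSG];
    size_t        used;               /* valid bytes in data */
};

/* Store bytes just returned by recv(). Returns -1 if they don't fit. */
static int client_feed(struct client_buf *c, const void *src, size_t n)
{
    if (n > sizeof c->data - c->used)
        return -1;
    memcpy(c->data + c->used, src, n);
    c->used += n;
    return 0;
}

/* Try to extract one message into out (at least MAX_MSG bytes).
 * Returns 1 if a message was extracted, 0 if more bytes are needed,
 * -1 if the announced length is bogus and the client should be dropped. */
static int client_next(struct client_buf *c, unsigned char *out, size_t *out_len)
{
    uint32_t wire;
    if (c->used < sizeof wire)
        return 0;
    memcpy(&wire, c->data, sizeof wire);
    uint32_t len = ntohl(wire);
    if (len > MAX_MSG)
        return -1;
    if (c->used < sizeof wire + len)
        return 0;
    memcpy(out, c->data + sizeof wire, len);
    *out_len = len;
    /* Remove the consumed message, keep any bytes of the next one. */
    c->used -= sizeof wire + len;
    memmove(c->data, c->data + sizeof wire + len, c->used);
    return 1;
}
```

In the epoll loop you would recv() at most the free space left in the buffer, call client_feed, then call client_next repeatedly until it returns 0, and go back to epoll_wait either way.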

How socket send data?

I got a snippet from the internet for sending data through a socket.
Here is the code.
u32_t nLength = 0;
u32_t nOffset = 0;
do {
nLength = nFullLength - nOffset;
status = Socket->Send(((u8_t*) buff) + nOffset, &nLength);
if (status != ERROR_SUCCESS) {
break;
}
nOffset += nLength;
} while (nOffset < nFullLength);
My doubts are:
When the send(sock_fd, buf+bytes, buflen-bytes, flags); function runs, will it send the entire data?
Let's assume I have a buffer that is 45 bytes long. So it will send like
send(buf+0, 45-0) = send(buf+0, 45);
So it will send the complete data with length 45? What is the use of the length here? Initially it will be 45, won't it?
Well, no. There's no guarantee that it will send all the data you ask it to send, that's why the code looks the way it does.
The manual page for send() states this pretty clearly:
Return Value
On success, these calls return the number of characters sent. On error, -1
is returned, and errno is set appropriately.
The same is true for e.g. a regular write() to a local file, by the way. It might never happen, but the way the interface is designed you're supposed to handle partial sends (and writes) if they do happen.
TCP is a streaming transport. There is no guarantee that a given send() operation will accept all of the bytes given to it at one time. It depends on the available kernel buffer space, the I/O mode of the socket (blocking vs non-blocking), etc. send() returns the number of bytes it actually accepted and put into the kernel buffer for subsequent transmission.
In the code example shown, it appears that Socket->Send() expects nLength to be initially set to the total number of bytes to send, and it will then update nLength with the number of bytes actually sent. The code is adjusting its nOffset variable accordingly, looping just in case send() returns fewer bytes than requested, so it can call send() as many times as it takes to send the full number of bytes.
So, for example, let's assume the kernel accepts up to 20 bytes at a time. The loop would call send() 3 times:
send(buf+0, 45-0) // returns 20
send(buf+20, 45-20) // returns 20
send(buf+40, 45-40) // returns 5
// done
This is typical coding practice for TCP programming, given the streaming nature of TCP.
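With the plain POSIX send() call rather than the Socket->Send() wrapper from the question, the same loop might look like the sketch below. send_all is a made-up name, and a blocking socket is assumed (on a non-blocking socket you would also have to handle EAGAIN/EWOULDBLOCK).

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: keep calling send() until all len bytes have been
 * accepted by the kernel. Returns 0 on success, -1 on error (errno is set). */
static int send_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t sent = 0;
    while (sent < len) {
        ssize_t n = send(fd, p + sent, len - sent, 0);
        if (n == -1) {
            if (errno == EINTR)
                continue;         /* interrupted by a signal: retry */
            return -1;
        }
        sent += (size_t)n;        /* advance past the bytes accepted so far */
    }
    return 0;
}
```

For the 45-byte buffer in the question the call would be send_all(sock_fd, buf, 45), and it makes however many send() calls turn out to be necessary.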

Writing messages of different types through sockets in C

I want to send a message to my Binder class via TCP socket connection in C. I have to pass in a request type (char*), ip address (int), argTypes(int array) etc. through this connection using the write() method. What's the best method to send all of the information in one single message?
There's no guarantee that you can send/receive all your data in a single read/write operation;
too many factors may influence the quality/packet-size/connection-stability/etc.
This question/answer explains it.
Some C-examples here.
A good explanation of socket programming in C.
A quick overview of TCP/IP.
About sending different types of messages:
The data you send from your server app is received by your client app, which can then interpret this data any way it likes.
If your data is related, you can create a struct in a separate header and use it in both the client and server code and send a variable of this struct across. If it is not related, then I am not sure why you would need to send them across as one single message.
If you want to transmit and receive your data with a single write and single read, then you have to use a datagram socket. Since a datagram socket is connectionless, you normally use sendto/recvfrom or sendmsg/recvmsg instead of write/read (plain write/read only work once the socket has been given a default peer with connect()). Since a datagram socket is unreliable, you will have to implement your own protocol to tolerate out-of-order data delivery and data loss.
If you don't want to deal with the unreliable nature of a datagram socket, then you want a stream socket. Since a stream socket is connected, transmitted data are guaranteed and are in order. If the data you send always has the same size, then you can mimic a datagram by using blocking mode for your send call and then passing in MSG_WAITALL in the recv call.
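#include <errno.h>
#include <sys/socket.h>

#define MY_MSG_SIZE 9876

/* Send exactly MY_MSG_SIZE bytes; returns bytes sent, or send()'s result if none were sent */
int my_msg_send (int sock, const void *msg) {
    int r, sent = 0;
    do {
        r = send(sock, (const char *)msg + sent, MY_MSG_SIZE - sent,
                 MSG_NOSIGNAL);
        if (r <= 0) {
            if (r < 0 && errno == EINTR) continue; /* interrupted: retry */
            break;
        }
        sent += r;
    } while (sent < MY_MSG_SIZE);
    if (sent) return sent;
    return r;
}

/* Receive exactly MY_MSG_SIZE bytes; returns bytes received, or recv()'s result if none were received */
int my_msg_recv (int sock, void *msg) {
    int r, rcvd = 0;
    do {
        r = recv(sock, (char *)msg + rcvd, MY_MSG_SIZE - rcvd, MSG_WAITALL);
        if (r <= 0) {
            if (r < 0 && errno == EINTR) continue; /* interrupted: retry */
            break;
        }
        rcvd += r;
    } while (rcvd < MY_MSG_SIZE);
    if (rcvd) return rcvd;
    return r;
}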
Notice that the software still has to deal with certain error cases. In the case of EINTR, the I/O operation needs to be retried. For other errors, the delivery or retrieval of data may be incomplete. But generally, for blocking sockets, we expect only one iteration for the do-while loops above.
If your messages are not always the same size, then you need a way to frame the messages. A framed message means you need a way to detect the start of a message, and the end of a message. Perhaps the easiest way to frame a message over a streaming socket is to precede a message with its size. Then the receiver would first read out the size, and then read the rest of the message. You should be able to easily adapt the sample code for my_msg_send and my_msg_recv to do that.
Finally, there is the question of your message itself. If the messages are not always the same size, this likely means there are one or more variable length records within the message. Examples are an array of values, or a string. If both the sender and receiver agree to the order of the records, then it is enough to precede each variable length record with its length. So suppose your message had the following structure:
struct my_data {
const char *name;
int address;
int *types;
int number_of_types;
};
Then you could represent an instance of struct my_data like this:
NAME_LEN : 4 bytes
NAME : NAME_LEN bytes
ADDRESS : 4 bytes
TYPES_LEN : 4 bytes
TYPES : TYPES_LEN * 4 bytes
NAME_LEN would be obtained from strlen(msg->name), and TYPES_LEN would obtain its value from msg->number_of_types. When you send the message, it would be preceded by the total length of the message above.
We have been using 32-bit quantities to represent the length, which is likely sufficient for your purposes. When transmitting a number over a socket, the sender and receiver have to agree on the byte order of the number. That is, whether the number 1 is represented as 0.0.0.1 or as 1.0.0.0. This can typically be handled using network byte ordering, which uses the former. The socket header files provide the macro htonl, which converts a 32-bit value from the host's native byte order to network byte order. This is used when storing a value in, say, NAME_LEN. The receiver would use the corresponding macro ntohl to restore the transmitted value back to a representation used by the host. The macros could of course be no-ops if the host's native byte ordering matches network byte order already. Using these macros is of particular importance when sending and receiving data in a heterogeneous environment, since the sender and receiver may have different host byte orderings.
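Putting the layout and the byte-order advice together, a serializer could look roughly like this. It is a sketch under assumptions: put_u32 and my_data_pack are made-up names, int is assumed to be 32 bits wide, and the total-length prefix mentioned above is left to the caller, who learns the total from the return value.

```c
#include <arpa/inet.h>   /* htonl */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct my_data {
    const char *name;
    int address;
    int *types;
    int number_of_types;
};

/* Hypothetical helper: write v in network byte order, return the next position. */
static unsigned char *put_u32(unsigned char *p, uint32_t v)
{
    uint32_t wire = htonl(v);
    memcpy(p, &wire, sizeof wire);
    return p + sizeof wire;
}

/* Hypothetical helper: serialize msg as NAME_LEN, NAME, ADDRESS, TYPES_LEN,
 * TYPES. Returns the number of bytes written, or 0 if out is too small. */
static size_t my_data_pack(const struct my_data *msg,
                           unsigned char *out, size_t out_size)
{
    size_t name_len = strlen(msg->name);
    size_t n_types = (size_t)msg->number_of_types;
    size_t need = 4 + name_len + 4 + 4 + 4 * n_types;
    if (need > out_size)
        return 0;
    unsigned char *p = out;
    p = put_u32(p, (uint32_t)name_len);
    memcpy(p, msg->name, name_len);
    p += name_len;
    p = put_u32(p, (uint32_t)msg->address);
    p = put_u32(p, (uint32_t)n_types);
    for (size_t i = 0; i < n_types; i++)
        p = put_u32(p, (uint32_t)msg->types[i]);
    return (size_t)(p - out);
}
```

The sender would then transmit htonl(total) followed by the packed bytes, and the receiver would mirror each step with ntohl.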

Resources