Reading wrong data from TCP socket

I'm trying to send data blockwise over a TCP socket. The server code does the following:
#define CHECK(n) if((r=n) <= 0) { perror("Socket error\n"); exit(-1); }
int r;
//send the number of blocks
CHECK(write(sockfd, &(storage->length), 8)); //storage->length is uint64_t
for(p=storage->first; p!=NULL; p=p->next) {
    //send the size of this block
    CHECK(write(sockfd, &(p->blocksize), 8)); //p->blocksize is uint64_t
    //send data
    CHECK(write(sockfd, &(p->data), p->blocksize));
}
On the client side, I read the size and then the data (same CHECK macro):
CHECK(read(sockfd, &block_count, 8));
for(i=0; i<block_count; i++) {
    uint64_t block_size;
    CHECK(read(sockfd, &block_size, 8));
    uint64_t read_in=0;
    while(read_in < block_size) {
        r = read(sockfd, data+read_in, block_size-read_in); //assume data was previously allocated as char*
        read_in += r;
    }
}
This works perfectly fine as long as both client and server run on the same machine, but as soon as I try this over the network, it fails at some point. In particular, the first 300-400 blocks (about 587 bytes each) or so work fine, but then I get an incorrect block_size reading:
received block #372 size : 586
read_in: 586 of 586
received block #373 size : 2526107515908
And then it crashes, obviously.
I was under the impression that the TCP protocol ensures no data is lost and everything is received in correct order, but then how is this possible and what's my mistake here, considering that it already works locally?

There's no guarantee that a single read call returns all 8 bytes of block_count or block_size in one go.

"I was under the impression that the TCP protocol ensures no data is lost and everything is received in correct order"
Yes, but that's all that TCP guarantees. It does not guarantee that the data arrives in the same chunks in which it was written: the bytes of a single write may be spread across several reads. You need to accumulate the bytes in a buffer until you have the full amount you expect (8 bytes for a size field, block_size bytes for a block) before using them.
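The gathering loop described here could be sketched roughly as follows. This is only an illustration: read_full is a hypothetical helper name, it assumes a blocking descriptor, and it treats end of stream before len bytes as a failure.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep calling read() until exactly len bytes have
 * arrived. Returns 0 on success, -1 on error or if the peer closes early. */
int read_full(int fd, void *buf, size_t len)
{
    unsigned char *p = buf;
    assert(buf != NULL || len == 0);
    while (len > 0) {
        ssize_t r = read(fd, p, len);
        if (r < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;             /* real error, errno is set */
        }
        if (r == 0)
            return -1;             /* stream ended before len bytes arrived */
        p += r;                    /* short read: advance and ask for the rest */
        len -= (size_t)r;
    }
    return 0;
}
```

In the client from the question, both 8-byte size fields and each block body would go through such a helper, for example read_full(sockfd, &block_size, 8), instead of a bare read.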

Perhaps the read calls are returning without reading the full 8 bytes. I'd check what length they report they've read.
You might also find valgrind or strace informative for better understanding why your code is behaving this way. If you're getting short reads, strace will tell you what the syscalls returned, and valgrind will tell you that you're reading uninitialized bytes in your length variables.

The reason why it works on the same machine is that the block_size and block_count are sent as binary values and when they are received and interpreted by the client, they have same values.
However, if two machines communicating have different byte order for representing integers, e.g. x86 versus SPARC, or sizeof(int) is different, e.g. 64 bit versus 32 bit, then the code will not work correctly.
You need to verify that sizeof(int) and the byte order are identical on both machines. On the server side, print out sizeof(int) and the values of storage->length and p->blocksize. On the client side, print out sizeof(int) and the values of block_count and block_size.
When it doesn't work correctly, I think you will find that they are not the same. If so, the contents of data will also be misinterpreted if it contains any binary data.
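If byte order does turn out to be a concern, one way to remove the dependency is to put the 8-byte lengths on the wire in a fixed order. A minimal sketch, assuming big-endian (network) order is chosen for the wire format; put_u64_be and get_u64_be are hypothetical helper names, not library functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store v into out[0..7], most significant byte first. */
void put_u64_be(unsigned char out[8], uint64_t v)
{
    assert(out != NULL);
    for (int i = 7; i >= 0; i--) {
        out[i] = (unsigned char)(v & 0xff);
        v >>= 8;
    }
}

/* Rebuild the value from 8 bytes, independent of the host's byte order. */
uint64_t get_u64_be(const unsigned char in[8])
{
    uint64_t v = 0;
    assert(in != NULL);
    for (int i = 0; i < 8; i++)
        v = (v << 8) | in[i];
    return v;
}
```

The sender would write the 8 bytes produced by put_u64_be, and the receiver would call get_u64_be once all 8 bytes have arrived.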

Related

Using C POSIX sockets, can you determine how many bytes a socket contains without extracting?

I'm working with POSIX sockets in C.
Given X, I have a need to verify that the socketfd contains at least X bytes before proceeding to perform an operation with it.
With that being said, I don't want to receive X bytes and store it into a buffer using recv as X has the potential of being very large.
My first idea was to use MSG_PEEK...
int X = 9999999;
char buffer[1];
int num_bytes = recv(socketfd, buffer, X, MSG_PEEK);
(num_bytes == X) ? good : bad;
...
...
...
// Do some operation
But I'm concerned that X > 1 corrupts memory. The MSG_TRUNC flag seems to resolve the memory concern, but it removes X bytes from socketfd.

There's a big difference between e.g. TCP and UDP in this regard.
UDP is packet based: each send produces one datagram and each receive returns one whole datagram, so message boundaries are preserved.
TCP is a streaming protocol, where data begins to stream on connection and stops at disconnection. There are no message boundaries or delimiters in TCP, other than what you add at the application layer. It's simply a stream of bytes without any meaning (in TCP's point of view).
That means there's no way to tell how much will be received with a single recv call.
You need to come up with an application-level protocol (on top of TCP) that tells the receiver where each message ends. For example, a fixed-size header can carry the size of the data that follows, or a specific delimiter that cannot occur in the payload can separate the messages.
Then you receive in a loop until you have either received all the data or received the delimiter. But note that with a delimiter you may also receive the beginning of the next message, so you need to be able to handle a partial next message after the current message has been fully received.
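The delimiter case, including the leftover bytes of the next message, might be handled along these lines. This is a sketch under assumptions: '\n' is the delimiter, the caller appends whatever recv returns to buf and tracks the fill level in *used, and take_message is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* buf holds *used bytes received so far. If it contains a complete
 * '\n'-terminated message, copy it (without the delimiter) into out as a
 * C string, shift the leftover bytes to the front of buf and return 1.
 * Return 0 if no full message has arrived yet, or -1 if the message
 * does not fit into out. */
int take_message(char *buf, size_t *used, char *out, size_t outcap)
{
    assert(buf != NULL && used != NULL && out != NULL);
    char *nl = memchr(buf, '\n', *used);
    if (nl == NULL)
        return 0;                              /* keep receiving */
    size_t msglen = (size_t)(nl - buf);
    if (msglen + 1 > outcap)
        return -1;
    memcpy(out, buf, msglen);
    out[msglen] = '\0';
    size_t consumed = msglen + 1;              /* message plus delimiter */
    memmove(buf, buf + consumed, *used - consumed);
    *used -= consumed;                         /* start of next message stays */
    return 1;
}
```

After each recv, the caller would call take_message repeatedly while it returns 1, since one recv can deliver several messages at once.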

int num_bytes = recv(socketfd, buffer, X, MSG_PEEK);
This will copy up to X bytes into buffer and return the number of bytes copied, without removing them from the socket. But your buffer is only 1 byte large, so any X > 1 overflows it. Make the buffer at least X bytes.
Have you tried this?
ssize_t available = recv(socketfd, NULL, 0, MSG_PEEK | MSG_TRUNC);
Or this?
int available; // FIONREAD stores its result in an int
ioctl(socketfd, FIONREAD, &available);
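A minimal sketch of the FIONREAD variant, assuming a Linux-style ioctl that stores the count in an int; bytes_available is a hypothetical wrapper name. The value is only a snapshot of what is currently queued, and it is bounded by the socket's receive buffer, so it cannot confirm a very large X.

```c
#include <assert.h>
#include <sys/ioctl.h>
#include <sys/socket.h>   /* for the socket calls made by the caller */

/* Returns the number of bytes currently queued for reading on fd,
 * or -1 if the ioctl fails. Nothing is removed from the socket. */
int bytes_available(int fd)
{
    int n = 0;
    assert(fd >= 0);
    if (ioctl(fd, FIONREAD, &n) < 0)
        return -1;
    return n;
}
```

A caller could compare the result against X before deciding to recv, keeping in mind that more data may still be in flight.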

Reading from Socket in a loop?

I am creating a server/client TCP in C.
The idea is for the server to send a relatively large amount of information. However, the buffer in the client has a size of only 512 (I don't want to increase this size), and obviously, the information sent by the server is larger than this. Let's imagine 812 bytes.
What I want to do is, in the client, read 512 bytes, print them on the client's console, and then read the remaining bytes, and print them as well.
Here's what should happen:
1) Create server, and block in the read() system call (waiting for the client to write something);
2) Create the client, and write something in the socket, and then blocks on read(), waiting for the server to respond;
3) The server's read() call returns, and now server has to send that large amount of data, using the following code (after creating a new process):
dup2(new_socketfd, STDOUT_FILENO); // Redirect to socket
execlp("/application", "application", NULL); // Application that prints the information to send to the client
Let's imagine "application" printed 812 bytes of data to the socket.
4) Now the client has to read 812 bytes, with a buffer size of 512. That's my problem.
How can I approach this problem? I was wondering if I could make a loop, and read until there's nothing to read, 512 by 512 bytes. But as soon as there's nothing to read, client will block on read().
Any ideas?

recv blocks while there is no data in the stream. When data is available, recv returns the number of bytes it extracted, which may be less than you asked for.
You can write a simple function to extract the full data just by using an offset variable and checking the return value.
A simple function like this will do.
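ssize_t readfull(int descriptor, char *buffer, ssize_t sizetoread) {
    ssize_t offset = 0;
    while (offset < sizetoread) {
        // recv may return fewer bytes than requested; keep asking for the rest
        ssize_t n = recv(descriptor, buffer + offset, sizetoread - offset, 0);
        if (n < 1) {
            // 0 means the peer closed the connection, -1 means an error
            return offset;
        }
        offset += n;
    }
    return offset;
}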
Also, servers typically signal in some way that the data is finished. The server might first send the length of the message as a fixed-size field of four or eight bytes and then send the data, so you know ahead of time how much to read. Or, as in HTTP, there is the Content-Length field as well as the '\r\n' delimiters.
Realistically, the client has no way of knowing on its own how much data the server is going to send. The server has to tell you how much data there is through some kind of indicator.
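For the setup in the question, where the server hands the socket to a program that prints its output and exits, the end of the stream can itself serve as that indicator. The sketch below assumes that every copy of the socket on the server side gets closed once the program exits (including the parent's copy of new_socketfd), so that the client's read eventually returns 0; copy_until_eof and CHUNK are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

#define CHUNK 512   /* the client's fixed buffer size from the question */

/* Read from in_fd in pieces of at most CHUNK bytes until the sender closes
 * the stream (read() returns 0), copying each piece to out_fd.
 * Returns the total number of bytes copied, or -1 on error. */
long copy_until_eof(int in_fd, int out_fd)
{
    char buf[CHUNK];
    long total = 0;
    assert(in_fd >= 0 && out_fd >= 0);
    for (;;) {
        ssize_t n = read(in_fd, buf, sizeof buf);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            break;                      /* end of stream: sender closed */
        ssize_t off = 0;
        while (off < n) {               /* write() can be partial as well */
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;
        }
        total += n;
    }
    return total;
}
```

On the client this could be called as copy_until_eof(sockfd, STDOUT_FILENO); with 812 bytes pending it would typically loop at least twice, never holding more than 512 bytes at a time.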
Since you're writing the server yourself, you can first send a four byte message which can be an int value of how much data the client should read.
So your server can look like this:
int sizetosend = arbitrarysize;
send(descriptor,(char*)&sizetosend,sizeof(int),0);
send(descriptor,buffer,sizetosend,0);
Then on your client side, read four bytes then the buffer.
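int sizetoread = 0;
// The four-byte length can itself arrive in pieces, so use readfull for it too
if (readfull(descriptor, (char*)&sizetoread, sizeof(int)) < (ssize_t)sizeof(int))
    return;
// Now read sizetoread bytes of payload with readfull as posted above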
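A possible variation on this length-prefix scheme, sketched under a few assumptions: the length is a fixed-width uint32_t in network byte order rather than a native int, and both the header and the payload go through loops that tolerate partial transfers. The names send_all, recv_all, send_frame and recv_frame are hypothetical, and EINTR handling is left out for brevity.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Loop until all len bytes are sent; returns 0 on success, -1 on error. */
int send_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = send(fd, p, len, 0);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Loop until exactly len bytes are received; -1 on error or early close. */
int recv_all(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = recv(fd, p, len, 0);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Send a 4-byte length in network byte order, then the payload. */
int send_frame(int fd, const void *data, uint32_t len)
{
    uint32_t wire = htonl(len);
    if (send_all(fd, &wire, sizeof wire) < 0)
        return -1;
    return send_all(fd, data, len);
}

/* Receive one frame into buf (capacity cap); returns payload length or -1. */
long recv_frame(int fd, void *buf, uint32_t cap)
{
    uint32_t wire;
    assert(buf != NULL);
    if (recv_all(fd, &wire, sizeof wire) < 0)
        return -1;
    uint32_t len = ntohl(wire);
    if (len > cap)
        return -1;                 /* frame too large for the caller's buffer */
    if (recv_all(fd, buf, len) < 0)
        return -1;
    return (long)len;
}
```

With a 512-byte client buffer, a larger payload would have to be split into several frames by the sender, or the receiver would need to consume one frame in 512-byte pieces.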

Sockets - Reading and writing [duplicate]

I'm very new to C++, but I'm trying to learn some basics of TCP socket coding. Anyway, I've been able to send and receive messages, but I want to prefix my packets with the length of the packet (like I did in C# apps I made in the past) so when my window gets the FD_READ command, I have the following code to read just the first two bytes of the packet to use as a short int.
char lengthBuffer[2];
int rec = recv(sck, lengthBuffer, sizeof(lengthBuffer), 0);
short unsigned int toRec = lengthBuffer[1] << 8 | lengthBuffer[0];
What's confusing me is that after a packet comes in the 'rec' variable, which says how many bytes were read is one, not two, and if I make the lengthBuffer three chars instead of two, it reads three bytes, but if it's four, it also reads three (only odd numbers). I can't tell if I'm making some really stupid mistake here, or fundamentally misunderstanding some part of the language or the API. I'm aware that recv doesn't guarantee any number of bytes will be read, but if it's just two, it shouldn't take multiple reads.

Because you cannot assume how much data will be available, you'll need to continuously read from the socket until you have the amount you want. Something like this should work:
ssize_t rec = 0;
do {
    int result = recv(sck, &lengthBuffer[rec], sizeof(lengthBuffer) - rec, 0);
    if (result == -1) {
        // Handle error ...
        break;
    }
    else if (result == 0) {
        // Handle disconnect ...
        break;
    }
    else {
        rec += result;
    }
}
while (rec < sizeof(lengthBuffer));

Streamed sockets:
Sockets are generally used in a streamed way: you'll receive all the data sent, but not necessarily all at once. You may just as well receive it in pieces.
Your approach of sending the length is hence valid: once you've received the length, you can then fill a buffer, if needed across successive reads, until you have everything you expected. So you have to loop on receives, and define a strategy for how to handle extra bytes received.
Datagram (packet oriented) sockets:
If your application is really packet oriented, you may consider creating a datagram socket by requesting the SOCK_DGRAM, or better SOCK_SEQPACKET, socket type from socket() on Linux or Windows.
Risk with your binary size data:
Be aware that the way you send and receive your size data appears to be asymmetric. You hence run a major risk when sending and receiving between machines whose CPU architectures do not use the same endianness. It is worth making your code platform- and endian-independent.
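One concrete form of that risk sits in the question's own decoding line: lengthBuffer is an array of plain char, which is signed on many platforms, so a length byte of 0x80 or above is sign-extended before the shift and OR. A hedged sketch of an alternative, assuming the two length bytes are sent least significant byte first as the question's expression implies; decode_u16_le is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a 16-bit length that was sent least significant byte first.
 * Using unsigned char keeps each byte in the range 0..255, so no sign
 * extension can leak into the result. */
uint16_t decode_u16_le(const unsigned char b[2])
{
    assert(b != NULL);
    return (uint16_t)(b[0] | (b[1] << 8));
}
```

Because the byte positions are spelled out, the result does not depend on the byte order of the receiving machine.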

A TCP socket is stream based, not packet based (I assume you use TCP, as sending the packet length in the data would not make any sense with UDP). The number of bytes you receive at once does not have to match the amount that was sent. For example, you may send 10 bytes, but the receiver may receive 1 + 2 + 1 + 6 or any other combination. Your code has to handle that: it must be able to receive data partially and react once it has enough data (that's why you send the packet length, for example).

Writing messages of different types through sockets in C

I want to send a message to my Binder class via TCP socket connection in C. I have to pass in a request type (char*), ip address (int), argTypes(int array) etc. through this connection using the write() method. What's the best method to send all of the information in one single message?

There's no guarantee that you can send/receive all your data in a single read/write operation; too many factors (link quality, packet size, connection stability, etc.) may influence it.
This question/answer explains it.
Some C-examples here.
A good explanation of socket programming in C.
A quick overview of TCP/IP.
About sending different types of messages:
The data you send from your server app is received by your client app, which can then interpret this data any way it likes.

If your data is related, you can create a struct in a separate header and use it in both the client and server code and send a variable of this struct across. If it is not related, then I am not sure why you would need to send them across as one single message.

If you want to transmit and receive your data with a single write and single read, then you have to use a datagram socket. Since a datagram socket is connectionless, you cannot use write/read on it unless you first connect() it to a peer; otherwise, you use sendto/recvfrom or sendmsg/recvmsg. Since a datagram socket is unreliable, you will have to implement your own protocol to tolerate out-of-order delivery and data loss.
If you don't want to deal with the unreliable nature of a datagram socket, then you want a stream socket. Since a stream socket is connected, transmitted data are guaranteed and are in order. If the data you send always has the same size, then you can mimic a datagram by using blocking mode for your send call and then passing in MSG_WAITALL in the recv call.
#include <errno.h>
#include <sys/socket.h>
#define MY_MSG_SIZE 9876
int my_msg_send (int sock, const void *msg) {
    int r, sent = 0;
    do {
        r = send(sock, (const char *)msg + sent, MY_MSG_SIZE - sent,
                 MSG_NOSIGNAL);
        if (r <= 0) {
            if (r < 0 && errno == EINTR) continue;
            break;
        }
        sent += r;
    } while (sent < MY_MSG_SIZE);
    if (sent) return sent;
    return r;
}
int my_msg_recv (int sock, void *msg) {
    int r, rcvd = 0;
    do {
        r = recv(sock, (char *)msg + rcvd, MY_MSG_SIZE - rcvd, MSG_WAITALL);
        if (r <= 0) {
            if (r < 0 && errno == EINTR) continue;
            break;
        }
        rcvd += r;
    } while (rcvd < MY_MSG_SIZE);
    if (rcvd) return rcvd;
    return r;
}
Notice that the software still has to deal with certain error cases. In the case of EINTR, the I/O operation needs to be retried. For other errors, the delivery or retrieval of data may be incomplete. But generally, for blocking sockets, we expect only one iteration for the do-while loops above.
If your messages are not always the same size, then you need a way to frame the messages. A framed message means you need a way to detect the start of a message, and the end of a message. Perhaps the easiest way to frame a message over a streaming socket is to precede a message with its size. Then the receiver would first read out the size, and then read the rest of the message. You should be able to easily adapt the sample code for my_msg_send and my_msg_recv to do that.
Finally, there is the question of your message itself. If the messages are not always the same size, this likely means there are one or more variable length records within the message. Examples are an array of values, or a string. If both the sender and receiver agree to the order of the records, then it is enough to precede each variable length record with its length. So suppose your message had the following structure:
struct my_data {
const char *name;
int address;
int *types;
int number_of_types;
};
Then you could represent an instance of struct my_data like this:
NAME_LEN  : 4 bytes
NAME      : NAME_LEN bytes
ADDRESS   : 4 bytes
TYPES_LEN : 4 bytes
TYPES     : TYPES_LEN * 4 bytes
NAME_LEN would be obtained from strlen(msg->name), and TYPES_LEN would obtain its value from msg->number_of_types. When you send the message, it would be preceded by the total length of the message above.
We have been using 32 bit quantities to represent the length, which is likely sufficient for your purposes. When transmitting a number over a socket, the sender and receiver have to agree on the byte order of the number. That is, whether the number 1 is represented as 0.0.0.1 or as 1.0.0.0. This can typically be handled using network byte ordering, which uses the former. The socket header files provide the macro htonl, which converts a 32 bit value from the host's native byte order to network byte order. This is used when storing a value in, say, NAME_LEN. The receiver would use the corresponding macro ntohl to restore the transmitted value back to the representation used by the host. The macros could of course be no-ops if the host's native byte ordering already matches network byte order. Using these macros is of particular importance when sending and receiving data in a heterogeneous environment, since the sender and receiver may have different host byte orderings.
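Putting the layout and htonl together, serializing one struct my_data into a byte buffer might look like the sketch below. It is an illustration only: pack_my_data and put_u32 are hypothetical names, int is assumed to be 32 bits wide, and the caller is assumed to provide a large enough buffer.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct my_data {
    const char *name;
    int address;
    int *types;
    int number_of_types;
};

/* Append one 32-bit value in network byte order; returns the new offset. */
static size_t put_u32(unsigned char *out, size_t off, uint32_t v)
{
    uint32_t wire = htonl(v);
    memcpy(out + off, &wire, 4);
    return off + 4;
}

/* Serialize m using the NAME_LEN / NAME / ADDRESS / TYPES_LEN / TYPES
 * layout. Returns the number of bytes written to out, or 0 if the
 * message does not fit into cap bytes. */
size_t pack_my_data(const struct my_data *m, unsigned char *out, size_t cap)
{
    assert(m != NULL && out != NULL && m->number_of_types >= 0);
    size_t name_len = strlen(m->name);
    size_t ntypes = (size_t)m->number_of_types;
    size_t need = 4 + name_len + 4 + 4 + 4 * ntypes;
    if (need > cap)
        return 0;
    size_t off = put_u32(out, 0, (uint32_t)name_len);   /* NAME_LEN */
    memcpy(out + off, m->name, name_len);               /* NAME */
    off += name_len;
    off = put_u32(out, off, (uint32_t)m->address);      /* ADDRESS */
    off = put_u32(out, off, (uint32_t)ntypes);          /* TYPES_LEN */
    for (size_t i = 0; i < ntypes; i++)                 /* TYPES */
        off = put_u32(out, off, (uint32_t)m->types[i]);
    return off;
}
```

The returned byte count is the total message length that would be sent ahead of the buffer; the receiver would walk the same fields in the same order, applying ntohl to each 32-bit value.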

C unix socket programming read() issue

I'm using C to implement a client server application. The client sends info to the server and the server uses it to send information back. I'm currently in the process of writing the code to handle the receiving of data to ensure all of it is, in fact, received.
The issue I'm having is best explained after showing some code:
int totalRead = 0;
char *pos = pBuffer;
while(totalRead < 6){
if(int byteCount = read(hSocket, pos, BUFFER_SIZE - (pos-pBuffer)>0)){
printf("Read %d bytes from client\n", byteCount);
pos += byteCount;
totalRead += byteCount;
}else return -1;
}
The code above runs on the server side and will print out "Read 1 bytes from client" 6 times and the program will continue working fine. I've hard-coded 6 here knowing I'm writing 6 bytes from the client side but I'll make my protocol require the first byte sent to be the length of rest of the buffer.
int byteCount = read(hSocket, pBuffer, BUFFER_SIZE);
printf("Read %d bytes from client", byteCount);
The code above, used in place of the first code segment, will print "Read 6 bytes from client" and continue working fine but it doesn't guarantee I've received every byte if only 5 were read for instance.
Can anyone explain to me why this is happening and a possible solution? I guess the first method ensures all bytes are being delivered but it seems inefficient reading one byte at a time...
Oh and this is taking place in a forked child process and I'm using tcp/ip.
Note: My goal is to implement the first code segment successfully so I can ensure I'm reading all bytes, I'm having trouble implementing it correctly.

Basically the right way to do this is a hybrid of your two code snippets. Do the first one, but don't just read one byte at a time; ask for all the bytes you're expecting. But look at byteCount, and if it's less than you expected, adjust your destination pointer, adjust the number of bytes you still expect, and call read() again. This is just how it works: sometimes the data you're expecting is split across packets and isn't all available at the same time.
Reading your comment below and looking at your code, I was puzzled, because yeah, that is what you're trying to do. But then I looked very closely at your code:
read(hSocket, pos, BUFFER_SIZE - (pos-pBuffer)>0)){
                                              ^
                                              |
                                THIS ---------|
That "> 0" is inside the parentheses enclosing read's arguments, not outside; that means it's part of the arguments! In fact, your last argument is interpreted as
(BUFFER_SIZE - (pos-pBuffer)) > 0
which is 1, until the end, when it becomes 0.

Your code isn't quite right. read and write may transfer less than the total amount of data you requested. Instead, you should advance the read or write pointer after each call and keep count of how many bytes are still left to transfer.
If you get back negative one from either read or write, you've gotten an error. A zero from read indicates that the peer has closed the connection (there are no more bytes to receive), and any number above zero indicates how many bytes were transferred by that call to read or write.
