I'm using read and write functions to communicate between client and server.
If the server calls write twice, I can see in Wireshark that two packets were sent, but my read function concatenates the two packets into one buffer.
Question:
Is it possible for my read function to read only one payload at a time?
I don't want to reduce the buffer size.
Ex:
Current situation:
Send(8 bytes), Send(8 bytes)
Read reads 16 bytes
What I want:
Send(8 bytes), Send(8 bytes)
Read reads 8 bytes (first packet)
Read reads 8 bytes (second packet)
TCP/IP gives you an ordered byte stream. Reads and writes are not guaranteed to have the same boundaries, as you have seen.
To see where messages begin and end, you need to add that information to your protocol. A simple, workable approach is to put a byte count at the start of each message. Read the byte count first; then you know exactly how many more bytes to read to get the complete message and none of the next one.
If you want to synchronize the server and client, use something like semaphores, or have the receiver acknowledge each message so the sender doesn't send the next one before the previous one has been read. Alternatively, if you know the exact length of each message, you can separate the received bytes yourself: either have the server wait for an acknowledgement before sending the next message, or use a larger buffer and split apart the multiple messages it may contain. (Note that with TCP, bytes that don't fit in your buffer are not lost; they stay in the socket's receive buffer until the next read.)
How to determine buffer size at the server using socket?
Client:
I've used write() to send data from client -> server.
I need to determine the file size at the server, which is the 3rd argument in write(sockfd, buffer, strlen(buffer)).
Should I use writev() to send the data from client -> server as separate buffers, i.e. writev(sockfd, iov, iovcnt)? Is this the right approach?
write(sockfd, buffer, strlen(buffer));
Server:
On the server side, I'm reading the data using read().
Should I use readv() to obtain the file size?
It doesn't matter. TCP sends bytes. You send some bytes, some more bytes and some more bytes. It doesn't know about buffers. Buffers are just how you tell TCP which bytes to send.
When you use writev with 3 buffers, you're telling TCP to send the bytes in the first buffer and then the bytes in the second buffer and then the bytes in the third buffer. They all get joined together. Same as if you told it to write one big buffer.
If you want to send two things at once (like a file size and then file data) then writev can be more convenient, or not. Note that writev can decide to stop writing at any point in any buffer, and you have to call it again to write the rest. That makes it not very convenient.
And it has no relevance to the server either. The server is allowed to read the bytes into one buffer and then a second buffer if the first one fills up and then a third buffer if the second one fills up. Or it can read them into one big buffer. TCP doesn't care - they're the same bytes either way.
readv has the same problem as writev where it might only read the first one and a half buffers, instead of all 3 at once, and then you have to call it again and tell it to read the second half of the second buffer and the entire third buffer.
I have written a program in Linux using C/C++ that reads multicast packets and tries to determine, as quickly as possible, whether a specific event has occurred. Latency is the key point here.
In the protocol, first two bytes represent the message type.
In my current implementation, I read the first two bytes and, from the message type, decide how many bytes to read for the payload. That is, I perform 2 read operations per packet: one for the header that determines the length, and one for the payload. So there are 2 I/O operations.
Alternatively, I could read as much as I can at once, check the first 2 bytes (say they imply a length of N), consume N bytes to form packet1, and repeat for packet2. If bytes remain after packet1 and packet2, I read more and process the buffer the same way. In this method I do only 1 I/O operation, but I have to traverse the byte buffer.
Which one is faster theoretically? I know I must implement and measure both but I just wanted to hear your suggestions.
Thanks
The fastest method I know of is:
Open a raw packet socket (AF_PACKET)
Implement a BPF filter that selects the packets you need as specifically as possible
Switch to a memory-mapped ringbuffer (PACKET_MMAP/PACKET_RX_RING)
Read the packets directly from memory instead of using recv(). This can be done using poll() or, alternatively, by busy-looping over the in-memory packet meta-data to avoid the poll() syscall.
Process the packet directly in the ring-buffer (zero-copy)
Mark the buffer as "free for reuse"
This way, no syscalls at all are necessary, the path through the kernel is short and the latency should be minimal.
For more information, see the packet mmap kernel documentation
I have made a multi-client server, which uses select() to determine which clients are currently sending. However, I want to send data that is larger than my buffer size (e.g. text from a file) while keeping the server non-blocking.
At first I found solutions that place the send/recv in while loops, with the loop condition based on the number of bytes sent, but wouldn't that block the server for some amount of time, especially if the file contents are large?
I was thinking of sending, say, 1024 bytes in one iteration of my server's main loop, then the next 1024 bytes to the client on the following iteration, and so on. But this would have consequences on the client side. Possibly the client could ask the server for the next x bytes per query?
Please let me know if there is a standard way to go about this. Thanks.
You don't need to do anything special for this. Your sockets are presumably already configured as non-blocking, so when you write to them, pass as much data as you have, and check the return value to see how much was actually sent. Then keep the rest of the data in a buffer, and wait until the file descriptor is ready again before attempting to write more.
This refers to C sockets. Say I have written some data into a socket and then call read: will the read system call read up to the buffer size (say 4096 bytes) of information and delete all the information in the socket? Basically, does read just move a seek pointer forward by however many bytes it read, or does it delete all the information from the socket, so that the next read starts from the 0th index?
Or say I write into the socket again without read being called anywhere else. Will the data be replaced or appended?
If there is more data available on a socket than the amount that you read(), the extra data will be kept in the socket's buffer until you read it. No data is lost during a short read.
Writing works similarly. If you call write() multiple times, each write will append data to the buffer on the remote host. Again, no data is lost.
(Eventually, the buffer on the remote host will fill up. When this happens, write() will block -- the local host will wait for the buffer to empty before sending more data.)
Conceptually, each direction in a socket pair behaves like a pipe between the two peers. The overall stream of data sent will be received in the same order it was sent, regardless of how much data was read/written at a time.
I'm having some doubts about the number of bytes I should write/read through a socket in C on Unix. I'm used to sending 1024 bytes, but this is really too much sometimes when I send short strings.
I read a string from a file, and I don't know how many bytes long it is; it varies every time and could be 10, 20 or 1000 bytes. I only know for sure that it's < 1024. So when I write the code, I don't know how many bytes to read on the client side (on the server I can use strlen()). Is the only solution to always read the maximum number of bytes (1024 in this case), regardless of the length of the string I read from the file?
For instance, with this code:
read(socket, stringBuff, SIZE);
wouldn't it be better if SIZE is 10 instead of 1024 if I want to read a 10 byte string?
In the code in your question, if there are only 10 bytes to be read, then it makes no difference whether SIZE is 10 bytes, 1,024 bytes, or 1,000,024 bytes - it'll still just read 10 bytes. The only difference is how much memory you set aside for it, and if it's possible for you to receive a string up to 1,024 bytes, then you're going to have to set aside that much memory anyway.
However, regardless of how many bytes you are trying to read in, you always have to be prepared for the possibility that read() will actually read a different number of them. Particularly on a network, when you can get delays in transmission, even if your server is sending a 1,024 byte string, less than that number of bytes may have arrived by the time your client calls read(), in which case you'll read less than 1,024.
So, you always have to be prepared for the need to get your input in more than one read() call. This means you need to be able to tell when you're done reading input - you can't rely alone on the fact that read() has returned to tell you that you're done. If your server might send more than one message before you've read the first one, then you obviously can't hope to rely on this.
You have three main options:
Always send messages which are the same size, perhaps padding smaller strings with zeros if necessary. This is usually suboptimal for a TCP stream. Just read until you've received exactly this number of bytes.
Have some kind of sentinel mechanism for telling you when a message is over. This might be a newline character, a CRLF, a blank line, or a single dot on a line followed by a blank line, or whatever works for your protocol. Keep reading until you have received this sentinel. To avoid making inefficient system calls of one character at a time, you need to implement some kind of buffering mechanism to make this work well. If you can be sure that your server is sending you lines terminated with a single '\n' character, then using fdopen() and the standard C I/O library may be an option.
Have your server tell you how big the message is (either in an initial fixed length field, or using the same kind of sentinel mechanism from point 2), and then keep reading until you've got that number of bytes.
The read() system call blocks until it can read one or more bytes, or until an error occurs.
It DOESN'T guarantee that it will read the number of bytes you request! With TCP sockets, it's very common that read() returns less than you request, because it can't return bytes that are still propagating through the network.
So, you'll have to check the return value of read() and call it again to get more data if you didn't get everything you wanted, and again, and again, until you have everything.