Will read (socket) block until the buffer is full?

I wrote a simple C socket program that sends an INIT packet to the server to tell it to prepare a text transfer. The server does not send any data back at that point.
After sending the INIT packet, the client sends a GET packet and waits for chunks of data from the server.
So every time the server receives a GET packet, it sends a chunk of data to the client.
So far so good. The buffer is 512 bytes; a chunk is 100 bytes plus a little overhead.
But my problem is that the client does not receive the second message.
So my guess is that read() will block until the buffer is full. Is that right, or what else might be the reason?

It depends. For TCP sockets, read may return before the buffer is full, and you may need to receive in a loop to get a whole message. For UDP sockets, each read returns a single packet (datagram): read blocks until a datagram arrives and then returns it, even if it is smaller than the buffer.

The answer is no: read() on a TCP/IP socket will not block until the buffer holds the amount of data you requested. read() returns immediately whenever any data is available, even if your socket is blocking and you've requested more data than is available.
Keep in mind that TCP/IP is a byte stream protocol and you must treat it as such. The interface is under no obligation to transmit your data together in a single packet, as long as it is presented to you in the order you placed it in the socket.
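Because a single read() may return only part of what was sent, a receiver that needs a fixed number of bytes usually loops. A minimal sketch, assuming a blocking POSIX descriptor; the helper name read_full is hypothetical and not part of any library:

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Read exactly `len` bytes from `fd` unless EOF or an error occurs first.
 * Returns the number of bytes read (less than `len` only on EOF),
 * or -1 on error with errno set by read(). */
ssize_t read_full(int fd, void *buf, size_t len)
{
    unsigned char *p = buf;
    size_t got = 0;

    while (got < len) {
        ssize_t n = read(fd, p + got, len - got);
        if (n > 0) {
            got += (size_t)n;      /* partial read: keep going */
        } else if (n == 0) {
            break;                 /* peer closed the connection */
        } else if (errno != EINTR) {
            return -1;             /* real error */
        }
        /* EINTR: interrupted by a signal, so retry */
    }
    return (ssize_t)got;
}
```

This only helps when the receiver knows how many bytes to expect; the protocol still has to say where one message ends and the next begins.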

The answer is no: read() does not block until the buffer is full. The points below may help you track down the error.
Things to check:
Find out what read() returns the second time.
memset the buffer on every loop iteration before calling recv.
Call fflush(stdout) if the output does not appear.
Make sure all three are in place. If the problem is still not solved, please post the source code here.

Related

Reading all available bytes via socket using blocking I/O

When reading from a socket using read(2) and blocking I/O, when do I know that the other side (the client) has no more data to send? (by "no more data to send" I mean that, as an example, the client is waiting for a response). At first, I thought that this point is reached when less than count bytes are returned by read (as in read(fd, *buf, count)).
But what if the client sends the data fragmented? Reading until read returns 0 would be a solution, but as far as I know 0 is only returned when the client closes the connection - otherwise, read would just block until the connection is closed. I thought of using non-blocking I/O and a timeout for select(2), but this does not seem to be a tidy solution to me.
Are there any known best practices?
The concept of "the other side has no more data to send", without either a timeout or some semantics in the transmitted data, is quite pointless. Normally, code on the client/server will be able to process data faster than the network can transmit it. So if there's no data in the receive buffer when you're trying to read() it, this just means the network has not yet transmitted everything, but you have no way to tell if the next packet will arrive within a millisecond, a second, or a day. You'd probably consider the first case as "there is more data to send", the third as "no more data to send", and the second depends on your application.
If the other side doesn't close the connection, you probably don't know when it's ready to send the next data packet either.
So unless you have specific semantics and knowledge about what the client sends, using select() and non-blocking I/O is the best you can do.
In specific cases, there might be other ways - for example, if you know the client will send an XML tag, some data, and a closing tag, every n seconds. In that case you could start reading n seconds after the last packet you received, then just read on until you receive the closing tag. But as I said, this isn't a general approach since it requires semantics on the channel.
TCP is a byte-stream protocol, not a message protocol. If you want messages you really have to implement them yourself, e.g. with a length-word prefix, lines, XML, etc. You can guess with the FIONREAD option of ioctl(), but guessing is all it is, as you can't know whether the client has paused in the middle of transmission of the message, or whether the network has done so for some reason.
The protocol needs to give you a way to know when the client has finished sending a message.
Common approaches are to send the length of each message before it, or to send a special terminator after each message (similar to the NUL character at the end of strings in C).
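The terminator approach might look roughly like this. It is a sketch only: read_message and READ_MESSAGE_EOF are hypothetical names, and reading one byte per call is chosen for simplicity (it never consumes bytes belonging to the next message), not for speed.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define READ_MESSAGE_EOF (-2)   /* peer closed cleanly between messages */

/* Read one terminator-delimited message from `fd` into `buf`.
 * The terminator is consumed but not stored; the result is NUL-terminated.
 * Returns the message length (>= 0), READ_MESSAGE_EOF on a clean close,
 * or -1 on error, on EOF in mid-message, or if the message does not fit. */
ssize_t read_message(int fd, char *buf, size_t cap, char terminator)
{
    size_t len = 0;

    if (cap == 0)
        return -1;

    for (;;) {
        char c;
        ssize_t n = read(fd, &c, 1);   /* one byte at a time: simple, slow */
        if (n < 0) {
            if (errno == EINTR)
                continue;              /* interrupted by a signal, retry */
            return -1;
        }
        if (n == 0)
            return len == 0 ? READ_MESSAGE_EOF : -1;
        if (c == terminator) {
            buf[len] = '\0';
            return (ssize_t)len;
        }
        if (len + 1 >= cap)
            return -1;                 /* message too long for the buffer */
        buf[len++] = c;
    }
}
```

A terminator only works if the payload can never contain that byte, or if the sender escapes it; otherwise a length prefix is the safer choice.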

recv() windows socket function not receiving whole buffer

I am working on a client-server application on Windows.
My program works perfectly fine when the messages sent are small (around 20-30 KB).
But the moment the size of the buffer sent is greater than 50 KB, the program doesn't work.
I noticed that the send() function sends the exact number of bytes I ask it to, but the recv() function doesn't receive that much. I am assuming that the network might have fragmented the buffer internally. But shouldn't I be able to receive the whole buffer after multiple recv() calls?
I am sending the buffer through TCP, so it should ideally guarantee that I receive the whole message.
When I send a buffer of 50 KB or greater, I get around 16 KB in the first recv(), but the next time recv() is called I get an empty buffer. Why does the message get lost? And is there any way to get it? It happens only for buffers greater than 50 KB.
I code in C using the Windows socket library.

UDP - Read data from the queue in chunks

I'm implementing a small application using UDP (in C). A server sends to a client the data from a given file in chunks of given amount (ex. 100 bytes / call). The client downloads the file and saves it somewhere. The catch is that the client can receive a parameter saying how many bytes to read / call.
My problem is when the server sends 100 bytes / call, and the client is set to read only 15 bytes / call. The other 85 bytes are lost, because the message is removed from the UDP queue.
Is there a way to read these messages in chunks without removing them from the queue until they're completely read?
UDP does not allow for chunked reading like TCP does. Reading a UDP message is an all-or-nothing operation, you either read the whole message in full or none of it at all. There is no in-between. Because of that, UDP-based protocols either use fixed-sized messages, or require both parties to dynamically negotiate the message sizes (like TrivialFTP does, for example).
There is no reason for a UDP protocol to require sending a byte size for each message. The message size itself implicitly dictates the size of the data inside of the message.
If you absolutely must determine the message size before actually reading the message, you could try calling recvfrom() with the MSG_PEEK flag, and give it a large buffer to copy data into (at least 64K, which a UDP message will never exceed, unless you are using IPv6 Jumbograms, but that is a separate issue). The output will tell you the actual size of the message that is still in the queue. However, if you go this route, then you may as well just drop the MSG_PEEK flag and always read using 64K buffers so there is no possibility of dropping data due to insufficient buffer sizes.
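A rough sketch of the MSG_PEEK idea, assuming POSIX sockets. The helper name peek_datagram_size and the MAX_DATAGRAM constant are hypothetical; the static scratch buffer keeps the example short but makes it unsuitable for multithreaded use.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* 64 KiB is large enough for any ordinary UDP datagram. */
#define MAX_DATAGRAM 65536

/* Return the size of the next queued datagram without removing it from
 * the queue, or -1 on error. Blocks until a datagram is available. */
ssize_t peek_datagram_size(int fd)
{
    static unsigned char scratch[MAX_DATAGRAM];  /* not thread-safe */

    /* MSG_PEEK copies the datagram out but leaves it in the queue. */
    return recv(fd, scratch, sizeof scratch, MSG_PEEK);
}
```

After peeking, a normal recv() or recvfrom() with a buffer of at least that size returns the same datagram in full. As the answer notes, this costs an extra copy, so reading directly into a 64K buffer is usually simpler.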
You can create a thread that reads data from the UDP buffer in an endless loop and stores it in a circular buffer. The client then consumes the data at its own speed. If the buffer overflows, there is nothing you can do, because the server is sending faster than the client can consume.

Padding data over TCP

I am working on a client-server project and need to implement a logic where I need to check whether I have received the last data over a TCP socket connection, before I proceed.
To make sure that I have received all the data, I am planning to add a flag to the last packet sent. I had the two options below in mind, each with a related problem.
i. Use a struct like the one below, populate vst_pad for the last packet sent, and check for its presence on the receiving side. The advantage over option two is that I don't have to remove the flag from the actual data before writing it to a file; I just check the first member of the struct.
typedef struct
{
    /* String holding padding for last packet when socket is changed */
    char vst_pad[10];
    /* Pointer to data being transmitted */
    char *vst_data;
    //unsigned char vst_data[1];
} st_packetData;
The problem is that I have to serialize the struct on every send call. I am also not sure whether I will receive the entire struct over TCP in one recv call, so I would have to add logic (and overhead) to check this every time. I have implemented this so far, but realized later that stream-based TCP does not guarantee receiving the entire struct in one call.
ii. Use a function like strncat to append the flag to the end of the last data being sent.
The problem is that on every receive call I have to check for the presence of the flag, using either regex functions or a function like strstr, and if it is there, remove it from the data.
This application is going to be used for large data transfers, so I want to add minimal overhead to every send/recv/read/write call. I would really appreciate knowing whether there is a better option than the two above, or any other way to check for receipt of the last packet. The program is multithreaded.
Edit: I do not know the total size of the file I am going to send, but I am sending a fixed amount of data at a time. That is, fgets reads until the specified size minus 1 is reached or until a newline is encountered.
Do you know the size of the data in advance, and is it a requirement that you implement an end-of-message flag?
Because I would simplify the design and add a 4-byte header (assuming you're not sending more than 4 GB of data per message) that contains the expected size of the message.
Thus you parse out the first 4 bytes, calculate the size, then continue calling recv until you get that much data.
You'll need to handle the case where your recv call gets data from the next message, and obviously error handling.
Another issue not raised with your 10-byte pad solution is what happens if the actual message contains 10 zero bytes, assuming you're padding it with zeros. You'd need to escape those 10 zero bytes, otherwise you may mistakenly truncate the message.
Using a fixed sized header and a known size value will alleviate this problem.
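A minimal sketch of the 4-byte length header, assuming POSIX blocking sockets. All function names here (read_all, write_all, send_msg, recv_msg) are hypothetical. The receiver reads exactly the header and then exactly the payload, so it never consumes bytes from the next message.

```c
#include <arpa/inet.h>   /* htonl, ntohl */
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Read exactly `len` bytes; 0 on success, -1 on error or early EOF. */
static int read_all(int fd, void *buf, size_t len)
{
    unsigned char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n == 0)
            return -1;                    /* peer closed mid-message */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Write exactly `len` bytes; 0 on success, -1 on error. */
static int write_all(int fd, const void *buf, size_t len)
{
    const unsigned char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Send a 4-byte big-endian length followed by the payload. */
int send_msg(int fd, const void *data, uint32_t len)
{
    uint32_t be_len = htonl(len);
    if (write_all(fd, &be_len, sizeof be_len) < 0)
        return -1;
    return write_all(fd, data, len);
}

/* Receive one message into `buf`. Returns the payload length, or -1 on
 * error, EOF, or a payload larger than `cap`. */
ssize_t recv_msg(int fd, void *buf, size_t cap)
{
    uint32_t be_len;
    if (read_all(fd, &be_len, sizeof be_len) < 0)
        return -1;
    uint32_t len = ntohl(be_len);
    if (len > cap)
        return -1;                        /* caller's buffer is too small */
    if (read_all(fd, buf, len) < 0)
        return -1;
    return (ssize_t)len;
}
```

For large transfers the two small reads per message are usually negligible next to the payload; a buffered reader would remove them at the cost of tracking leftover bytes.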
For a message (data packet) first send a short (in network order) of the size, followed by the data. This can be achieved in one write system call.
On the receiving end, just read the short and convert it back into host order (this will let you use different processors at a later stage). You can then read the rest of the data.
In such cases, it's common to block up the data into chunks and provide a chunk header as well as a trailer. The header contains the length of the data in the chunk and so the peer knows when the trailer is expected - all it has to do is count rx bytes and then check for a valid trailer. The chunks allow large data transfers without huge buffers at both ends.
It's no great hassle to add a 'status' byte in the header that can identify the last chunk.
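One possible shape for such a chunk header is sketched below. The layout (a 4-byte big-endian length followed by one status byte) and every name in it are assumptions made for illustration, and the trailer is left out for brevity.

```c
#include <stdint.h>

#define CHUNK_HEADER_SIZE 5      /* 4-byte length + 1-byte status */
#define CHUNK_FLAG_LAST   0x01   /* status bit: this is the final chunk */

struct chunk_header {
    uint32_t length;   /* payload bytes that follow the header */
    uint8_t  status;   /* bit flags, e.g. CHUNK_FLAG_LAST */
};

/* Serialize into exactly CHUNK_HEADER_SIZE bytes, length in big-endian. */
void chunk_header_encode(const struct chunk_header *h, unsigned char *out)
{
    out[0] = (unsigned char)(h->length >> 24);
    out[1] = (unsigned char)(h->length >> 16);
    out[2] = (unsigned char)(h->length >> 8);
    out[3] = (unsigned char)(h->length);
    out[4] = h->status;
}

/* Parse CHUNK_HEADER_SIZE bytes back into a header. */
void chunk_header_decode(const unsigned char *in, struct chunk_header *h)
{
    h->length = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
                ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
    h->status = in[4];
}

/* Nonzero if the header marks the last chunk of the transfer. */
int chunk_is_last(const struct chunk_header *h)
{
    return (h->status & CHUNK_FLAG_LAST) != 0;
}
```

Encoding byte by byte avoids sending the in-memory struct directly, whose padding and byte order differ between compilers and processors.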
An alternative is to open another data connection, stream the entire serialization and then close this data connection, (like FTP does).
Could you make use of an open source network communication library written in C#? If so, check out networkComms.net.
If this is truly the last data sent by your application, use shutdown(socket, SHUT_WR); on the sender side.
This will set the FIN TCP flag, which signals that the sender->receiver stream is over. The receiver will know this because his recv() will return 0 (just like an EOF condition) when everything has been received. The receiver can still send data afterward, and the sender can still listen for them, but it cannot send more using this connection.
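The half-close can be sketched as follows, assuming POSIX sockets. The names send_all_and_finish and recv_until_eof are hypothetical helpers written for this illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Sender side: write everything, then half-close to signal end of stream.
 * Returns 0 on success, -1 on error. */
int send_all_and_finish(int sock, const void *data, size_t len)
{
    const unsigned char *p = data;
    while (len > 0) {
        ssize_t n = send(sock, p, len, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return shutdown(sock, SHUT_WR);   /* no more sends; receiving still works */
}

/* Receiver side: read until recv() returns 0 (end of stream) or `buf` is
 * full. Returns the total number of bytes stored, or -1 on error. */
ssize_t recv_until_eof(int sock, void *buf, size_t cap)
{
    unsigned char *p = buf;
    size_t total = 0;
    while (total < cap) {
        ssize_t n = recv(sock, p + total, cap - total, 0);
        if (n == 0)
            break;                    /* sender has shut down its side */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        total += (size_t)n;
    }
    return (ssize_t)total;
}
```

If the return value equals cap, the caller cannot tell whether the stream ended exactly there, so a real receiver would write each piece to the file as it arrives instead of collecting everything in one buffer.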

How long does a UDP packet stay at a socket?

If data is sent to the client but the client is busy executing something else, how long will the data be available to read using recvfrom()?
Also, what happens if a second packet is sent before the first one is read, is the first one lost and the next one sitting there waiting to be read?
(windows - udp)
If data is sent to the client but the client is busy executing something else, how long will the data be available to read using recvfrom()?
Forever, or not at all, or until you close the socket or read as much as a single byte.
The reason for that is:
UDP delivers datagrams, or it doesn't. This sounds like nonsense, but it is exactly what it is.
A single UDP datagram relates to either exactly one or several "fragments", which are IP packets (further encapsulated in some "on the wire" protocol, but that doesn't matter). The network stack collects all fragments for a datagram. If the checksum on any of the fragments is not good, or any other thing that makes the network stack unhappy, the complete datagram is discarded, and you get nothing, not even an error. You simply don't know anything happened.
If all goes well, a complete datagram is placed into the receive buffer. Never anything less, and never anything more. If you try to recvfrom later, that is what you'll get.
The receive buffer is obviously necessarily large enough to hold at least one max-size datagram (65535 bytes), but since usually datagrams will not be maximum size, but rather something below 1280 bytes (or 1500 if you will), it can usually hold quite a few of them (on most platforms, the buffer defaults to something around 128-256k, and is configurable).
If there is not enough room left in the buffer, the datagram is discarded, and you get nothing (well, you do still get the ones that are already in the buffer). Again, you don't even know something happened.
Each time you call recvfrom, a complete datagram is removed from the buffer (important detail!), and you get up to the number of bytes that you requested. This means that if you naively try to read a few bytes and then a few bytes again, it just won't work. The first read will discard the rest of the datagram, and the subsequent ones will read the first bytes of some future datagrams (and possibly block)!
This is very different from how TCP works. There you can actually read a few bytes and then a few bytes again, and it will just work, because the network layer simulates a data stream. You don't need to care how it works, because the network stack makes sure it works.
Also, what happens if a second packet is sent before the first one is read, is the first one lost and the next one sitting there waiting to be read?
You probably meant to say "received" rather than "sent". Send and receive have different buffers, so that would not matter at all. About receiving another packet while one is still in the buffer, see the above explanation. If the buffer can hold the second datagram, it will store it, otherwise it silently goes * poof *.
This does not affect any datagrams already in the buffer.
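The "rest of the datagram is discarded" behavior can be observed with recvmsg(), which reports truncation through MSG_TRUNC in msg_flags. A minimal sketch assuming POSIX sockets; the helper name recv_datagram is hypothetical.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Receive one datagram into `buf`. Sets *truncated to 1 if the datagram was
 * larger than `cap` and its tail was discarded, else 0.
 * Returns the number of bytes stored in `buf`, or -1 on error. */
ssize_t recv_datagram(int fd, void *buf, size_t cap, int *truncated)
{
    struct iovec iov;
    struct msghdr msg;

    iov.iov_base = buf;
    iov.iov_len = cap;
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    ssize_t n = recvmsg(fd, &msg, 0);
    if (n < 0)
        return -1;

    /* The kernel sets MSG_TRUNC when the datagram did not fit in the buffer. */
    *truncated = (msg.msg_flags & MSG_TRUNC) != 0;
    return n;
}
```

Reading a 100-byte datagram with cap set to 15 stores 15 bytes and sets the truncated flag; the next call returns the following datagram, not the missing 85 bytes.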
Normally, the data will be buffered until it's read. I suppose if you wait long enough that the driver completely runs out of space, it'll have to do something, but assuming your code works halfway reasonably, that shouldn't be a problem.
A typical network driver will be able to buffer a number of packets without losing any.
If data is sent to the client but the client is busy executing something else, how long will the data be available to read using recvfrom()?
This depends on the OS. On Windows, I believe the default receive buffer for each UDP socket is 8192 bytes; this can be raised with setsockopt() (see the Winsock documentation). So, as long as the buffer isn't full, the data will stay there until the socket is closed or the data is read.
Also, what happens if a second packet is sent before the first one is read, is the first one lost and the next one sitting there waiting to be read?
If the buffer has room, they are both stored; if not, one of them gets discarded. I believe it's the newest one, but I'm not 100% sure.
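Raising the receive buffer with setsockopt() might look like the sketch below. It is written against the POSIX socket API as an assumption; Winsock uses the same SOL_SOCKET and SO_RCVBUF names but its own headers and pointer types. The helper names are hypothetical.

```c
#include <sys/socket.h>

/* Ask the OS for a receive buffer of `bytes`. Returns 0 on success, -1 on
 * error. The OS may round, cap, or otherwise adjust the value. */
int set_receive_buffer(int sock, int bytes)
{
    return setsockopt(sock, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof bytes);
}

/* Query the receive buffer size currently in effect, or -1 on error. */
int get_receive_buffer(int sock)
{
    int bytes = 0;
    socklen_t len = sizeof bytes;

    if (getsockopt(sock, SOL_SOCKET, SO_RCVBUF, &bytes, &len) < 0)
        return -1;
    return bytes;
}
```

Since the OS is free to adjust the request, reading the value back with getsockopt() is the only way to know what size was actually applied.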
