I am trying to read a file from a socket.
I use select() with a timeout to exit after reading:
select(maxfdp1, &rset, NULL, NULL, &timeout);
But if I knew the size of the file being sent up front, I could exit as soon as I had received the right number of bytes.
Can I get the full file size before the transfer starts?
Or what should I use to exit immediately once the transfer is complete?
Because TCP is a stream-oriented protocol, it has no concept of the size of an application-layer message. If you're defining your own application-layer protocol on top of TCP, you could have your sender first transmit the size of the data that follows, for example as four bytes in network byte order (big endian).
Once you've received all of the data you want, you can call close() on the socket.
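For concreteness, here is a minimal sketch of that framing on the sending side. It assumes sock is an already-connected TCP socket; send_sized is a name invented here, and production code should also loop on partial sends:

#include <stdint.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <arpa/inet.h>   /* htonl() */

/* Sketch: length-prefixed framing. Send a 4-byte payload size in
 * network byte order (big endian), then the payload itself, so the
 * receiver knows exactly how many bytes to expect. */
int send_sized(int sock, const void *data, uint32_t len)
{
    uint32_t be_len = htonl(len);            /* length prefix */
    if (send(sock, &be_len, sizeof be_len, 0) != (ssize_t)sizeof be_len)
        return -1;
    if (send(sock, data, len, 0) != (ssize_t)len)
        return -1;
    return 0;
}

The receiver first reads exactly four bytes, converts them back with ntohl(), and then loops on recv() until that many payload bytes have arrived, so it knows precisely when the file is complete and can stop without waiting for a timeout.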
I'm using a Raspberry Pi B+ and building a TCP server/client connection in C.
I have a few questions about the client side.
How long does Linux queue incoming packets for the client? Once a packet has been received by Linux, what happens if the client is not ready to process it, or if the select/epoll call inside the loop sleeps for a minute? If there is a timeout, is there a way to adjust it in code or with a script?
What is the internal path inside Linux when it receives a packet (i.e., Ethernet port -> kernel -> RAM -> application)?
The Raspberry Pi (running Linux), and indeed any Linux (or non-Linux) TCP/IP stack, works roughly like this:
You have a kernel buffer in which the kernel stores all the data arriving from the other side that has not yet been read by the user process. The kernel normally acknowledges all of this data to the other side (the acknowledgement states the last byte received and stored in that buffer). The sender side also has a buffer, where it stores all the sent data that has not yet been acknowledged by the receiver (this data must be resent in case of a timeout), plus data that does not yet fit in the window advertised by the receiver. If this buffer fills, the sender is blocked, or a partial write is reported to the user process (depending on options).
That kernel buffer (the reading buffer) allows the kernel to keep the data available for the user process even while the process is not reading it. If the user process cannot read it, the data stays there until the process makes a read() system call.
The amount of buffer space still available for receiving (known as the window size) is sent to the other end with each acknowledgement, so the sender knows the maximum amount of data it is authorized to send. When the buffer is full, the window size drops to zero and the receiver announces that it cannot receive more data. This allows a slow receiver to stop a fast sender from flooding the network with data that cannot be delivered.
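As an aside, the size of that kernel receive buffer (which bounds the window the receiver can advertise) can be inspected and resized from user space. A sketch, assuming sock is a TCP socket descriptor; note that Linux typically reports double the value you request, to account for bookkeeping overhead:

#include <stdio.h>
#include <sys/socket.h>

/* Sketch: query and optionally enlarge the kernel receive buffer
 * that backs the advertised TCP window. */
void show_rcvbuf(int sock)
{
    int size = 0;
    socklen_t len = sizeof size;
    if (getsockopt(sock, SOL_SOCKET, SO_RCVBUF, &size, &len) == 0)
        printf("receive buffer: %d bytes\n", size);

    int wanted = 256 * 1024;   /* hypothetical new size */
    setsockopt(sock, SOL_SOCKET, SO_RCVBUF, &wanted, sizeof wanted);
}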
From then on (the zero-window situation), the sender periodically sends a segment with no data at all (or with just one byte of data, depending on the implementation) to check whether the window has opened enough to allow it to send more data. The acknowledgement of that probe will allow it to start communicating again.
Everything is stopped now, but no timeout happens; both TCPs continue talking this way until some window becomes available (meaning the receiver has read() part of the buffer).
This situation can be maintained for days without any problem: the reading process is busy and cannot read the data, and the writing process is blocked in the write call until the kernel on the sending side has buffer space to accommodate the data to be written.
When the reading process reads the data:
An ACK of the last byte received is sent, announcing a new window size larger than zero (by the amount freed when the reader process read the data).
The sender receives this acknowledgement and sends that amount of data from its buffer. If this makes room for the data the writer has requested to write, the writer will be woken up and allowed to continue sending.
Again, timeouts normally only occur if data is lost in transit.
But...
If you are behind a NAT device, your connection state can be lost simply from not exercising the connection (the NAT device maintains a cache of local address/port pairs with connections to the outside), and on the next data transfer coming from the remote device, the NAT device may (or may not) send an RST, because the packet refers to a connection it no longer knows about (the cache entry expired).
Or, if the packet comes from the internal device, the connection may be re-cached and continue; what happens depends on which side is the first to send a packet.
Nothing in the specification requires an implementation to provide a timeout for data waiting to be sent, but some implementations do, aborting the connection with an error if some data remains unacknowledged for a long time. TCP itself specifies no timeout for this case, so it is the process's responsibility to cope with it.
TCP is specified in RFC 793, which must be obeyed by all implementations if they want communication to succeed. You can read it if you like; I think you'll get a better explanation there than the one I'm giving you here.
So, to answer your first question: the kernel will keep the data in its buffer for as long as your process wants to wait for it. By default, you just call read() on the socket, and the kernel waits for as long as you (the user) don't decide to stop the process and abort the operation; in that case the kernel will probably close the connection or reset it. The resources are tied to the life of the process, so as long as the process is alive and holding the connection, the kernel will keep waiting for it.
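As for adjusting the timeout in code: the kernel imposes no read timeout by itself, but you can ask for one per socket. A minimal sketch, assuming sock is an already-connected socket descriptor:

#include <sys/socket.h>
#include <sys/time.h>

/* Sketch: give a blocking socket a receive timeout. After it expires,
 * a blocked recv()/read() returns -1 with errno set to
 * EAGAIN/EWOULDBLOCK instead of waiting forever. */
int set_recv_timeout(int sock, long seconds)
{
    struct timeval tv = { .tv_sec = seconds, .tv_usec = 0 };
    return setsockopt(sock, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);
}

Alternatively, the timeout argument to select() or epoll_wait() serves the same purpose when you are already multiplexing.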
I'm learning about C socket programming and I came across this piece of code in an online tutorial.
Server.c:
//some server code up here
recv(sock_fd, buf, 2048, 0);
//some server code below
Client.c:
//some client code up here
send(cl_sock_fd, buf, 2048, 0);
//some client code below
Will the server receive all 2048 bytes in a single recv call, or can the send be broken up into multiple recv calls?
TCP is a streaming protocol, with no message boundaries or packets visible to the application. A single send might need multiple recv calls, or multiple send calls could be combined into a single recv call.
You need to call recv in a loop until all the data has been received.
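A minimal sketch of such a loop, assuming a blocking, connected TCP socket (recv_all is a name invented here):

#include <sys/types.h>
#include <sys/socket.h>

/* Sketch: keep calling recv() until exactly `len` bytes have arrived.
 * Returns 0 on success, -1 on error or if the peer closed early. */
int recv_all(int sock, char *buf, size_t len)
{
    size_t got = 0;
    while (got < len) {
        ssize_t n = recv(sock, buf + got, len - got, 0);
        if (n == 0)      /* peer closed the connection */
            return -1;
        if (n < 0)       /* error; check errno (retry on EINTR) */
            return -1;
        got += (size_t)n;
    }
    return 0;
}

In the tutorial's Server.c, recv(sock_fd, buf, 2048, 0) would become recv_all(sock_fd, buf, 2048), provided the protocol really guarantees a 2048-byte message.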
Technically, the data is ultimately handled by the operating system, which programs the physical network interface to send it across a wire, over the air, or however else is applicable. And since TCP/IP doesn't define particulars such as how many packets, and of what size, should compose your data, the operating system is free to decide, which means your 2048 bytes of data may be sent in fragments, over a period of time.
Practically, this means that by calling send you may merely be causing your 2048 bytes of data to be buffered for sending, much like an e-mail in a queue, except that your 2048 bytes aren't even a single piece of anything to the system that sends them -- they're just 2048 more bytes to chop into packets the network will accept, marked with a destination address and port, among other things. The job of TCP is only to make sure they're the same bytes when they arrive, in the same order relative to each other and to the other data sent through the connection.
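The mirror image holds on the sending side: send() may accept fewer bytes than you asked for (notably on non-blocking sockets, or when interrupted by a signal), so robust code loops there too. A sketch under the same assumptions as the recv loop above:

#include <sys/types.h>
#include <sys/socket.h>

/* Sketch: keep calling send() until all `len` bytes have been queued
 * into the kernel's send buffer. */
int send_all(int sock, const char *buf, size_t len)
{
    size_t sent = 0;
    while (sent < len) {
        ssize_t n = send(sock, buf + sent, len - sent, 0);
        if (n < 0)       /* error; check errno (retry on EINTR) */
            return -1;
        sent += (size_t)n;
    }
    return 0;
}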
The important thing at the receiving end is that, again, the arriving data is merely queued, and no information is retained about how it was partitioned when it was sent. Everything that was ever sent through the connection is now either part of a consumable stream or has already been consumed and removed from the stream.
For a TCP connection, a fitting analogy would be an open water keg with a spout (tap) at the bottom. The sender can pour water into the keg (as much as it can contain, anyway) and the receiver can open the spout to drain the water from the keg into, say, a cup (an analogy for the buffer in an application that reads from a TCP socket). Both sender and receiver can be doing their thing at the same time, or either may be doing so alone. The sender will have to wait (the send call will block) if the keg is full, and the receiver will have to wait (the recv call will block) if the keg is empty.
Another, shorter analogy: the sender and receiver each sit at their own end of an opaque pipe, with the former pushing stuff into one end and the latter pulling stuff out of the other.
I wrote a simple C socket program that sends an INIT packet to the server to tell it to prepare a text transfer. The server does not send any data back at that time.
After sending the INIT packet, the client sends a GET packet and waits for chunks of data from the server.
So every time the server receives a GET packet, it sends a chunk of data to the client.
So far so good. The buffer has a size of 512 bytes; a chunk is 100 bytes plus a little overhead.
But my problem is that the client does not receive the second message.
So my guess is that read() blocks until the buffer is full. Is that right, or what else might be the reason?
It depends. For TCP sockets, read may return before the buffer is full, and you may need to receive in a loop to get a whole message. For UDP sockets, each read returns a single packet (datagram), and read blocks only until a datagram arrives.
The answer is no: read() on a TCP/IP socket will not block until the buffer has the amount of data you requested. read() returns as soon as any data is available, even if your socket is blocking and you've requested more data than is available.
Keep in mind that TCP/IP is a byte-stream protocol and you must treat it as such. The interface is under no obligation to transmit your data together in a single packet; it only guarantees that the bytes are presented to you in the order you placed them into the socket.
The answer is no: read does not block until the buffer is full. You can work through the points below to track down the error.
Several checkpoints you can check:
1. Find out what read is returning the second time.
2. memset the buffer each time in the loop before recv.
3. Use fflush(stdout) if you are not seeing any output.
Make sure all three are in place; a sketch tying them together follows. If the problem is still not solved, please post your source code here.
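Here is a sketch tying the three checkpoints together, assuming sock_fd is the connected socket from your code:

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Sketch: clear the buffer, log exactly what recv() returns, and flush
 * stdout so the output is visible even if the program blocks later. */
void debug_recv(int sock_fd)
{
    char buf[512];
    memset(buf, 0, sizeof buf);                      /* checkpoint 2 */
    ssize_t n = recv(sock_fd, buf, sizeof buf - 1, 0);
    if (n < 0)
        printf("recv returned %zd, errno %d (%s)\n", /* checkpoint 1 */
               n, errno, strerror(errno));
    else
        printf("recv returned %zd bytes\n", n);
    fflush(stdout);                                  /* checkpoint 3 */
}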
I have worked with non-blocking TCP, where both read and write can fail in the non-blocking case. A non-blocking TCP read can fail if there is no data available, and a TCP write can fail if the peer side's TCP buffer is full (I believe the TCP buffer size is 64K).
Similarly, a UDP read (recvfrom) can fail if no data is available. But what is the failure case for a UDP write (sendto)? I think a UDP write can never produce a non-blocking error, because a TCP write sends data and waits for the ACK from the other side, whereas a UDP write just sends, returns, and doesn't wait for any ACK from the peer side. If it doesn't make it to the other side, that's simply packet loss.
Is my understanding of non-blocking UDP writes correct? Please explain.
The most likely reason why a UDP non-blocking send would fail is that the UDP socket's in-kernel outgoing-data buffer is full. In this case, send()/sendto() would return -1 and errno would be set to EWOULDBLOCK.
Note that a non-blocking send()/sendto() doesn't actually send the data out the network device before it returns; rather it copies the data into an in-kernel buffer and returns immediately, and thereafter it is the kernel's responsibility to move that data out to the network as quickly as it can. The outgoing-data buffer can become full if your program tries to send a lot of data at once, because the CPU can add your new data to the buffer much faster than the network hardware can forward the buffer's data out to the network.
If you get a -1/EWOULDBLOCK error, usually the most graceful way to handle it is to stop trying to send on that socket until the socket indicates ready-for-write via select() (or poll(), etc.). When that happens, you know that the in-kernel buffer has been at least partially drained, and you can try the send()/sendto() call again.
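A sketch of that pattern, assuming sock is a non-blocking UDP socket and dest/destlen describe the destination (send_when_ready is a name invented here):

#include <errno.h>
#include <sys/types.h>
#include <sys/select.h>
#include <sys/socket.h>

/* Sketch: retry a non-blocking sendto(), waiting via select() whenever
 * the in-kernel outgoing-data buffer is full. */
ssize_t send_when_ready(int sock, const void *buf, size_t len,
                        const struct sockaddr *dest, socklen_t destlen)
{
    for (;;) {
        ssize_t n = sendto(sock, buf, len, 0, dest, destlen);
        if (n >= 0)
            return n;                    /* datagram queued by kernel */
        if (errno != EWOULDBLOCK && errno != EAGAIN)
            return -1;                   /* a real error */

        fd_set wset;
        FD_ZERO(&wset);
        FD_SET(sock, &wset);
        /* Block until the kernel buffer has drained enough. */
        if (select(sock + 1, NULL, &wset, NULL, NULL) < 0)
            return -1;
    }
}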
Another (less likely) cause of an error from send() would be if the IP address you are trying to send to is invalid. In any case, you should check errno and find out what the errno value is, as that will give you better insight into what is going wrong.
By the way, the behavior described above is not unique to UDP; you can and will have the same problem with a non-blocking TCP socket (even if the remote peer's receive window is not full) if you try to send() data on the socket faster than the local network card can drain the socket's in-kernel buffer.
Because a TCP write sends data and waits for the ACK from the other side.
No it doesn't. It copies your data into the socket send buffer, and if that is full it either blocks or returns -1/EWOULDBLOCK/EAGAIN.
But this is not the case for a UDP write; it will just send, return, and not wait for any ACK from the peer side.
No it doesn't. It copies your data into the socket send buffer, and if that is full it either blocks or returns -1/EWOULDBLOCK/EAGAIN.
In both cases the actual putting of bytes onto the wire is asynchronous to your program.
I'm attempting to write a simple server, using C system calls, that takes unknown byte streams from unknown clients and executes specific actions depending on the client's input. For example, the client sends the command "multiply 2 2" and the server multiplies the numbers and returns the result.
In order to avoid errors where the server reads before the client has written, I have a blocking recv() call that waits for any data using MSG_PEEK. When recv() detects data to be read, I move on to non-blocking recv()s that read the stream byte by byte.
Everything works except in the corner case where the client sends no data (i.e., write(socket, "", 0);). I was wondering how exactly I would detect that a message with no data was sent. In this case, recv() blocks forever.
Also, this post pretty much sums up my problem, but it doesn't suggest a way to detect a zero-size packet:
What value will recv() return if it receives a valid TCP packet with payload sized 0
When using TCP at the send/recv level, you are not privy to the packet traffic that goes into making the stream. When you send a nonzero number of bytes over a TCP stream, the sequence number increases by that number of bytes; that's how both sides know where the other is in terms of successful exchange of data. Sending multiple packets with the same sequence number doesn't mean the client did anything (such as your write(s, "", 0) example); it just means the client wants to communicate some other piece of information (for example, an ACK of data flowing the other way). You can't directly see things like retransmits, duplicate ACKs, or other anomalies like that when operating at the stream level.
The answer you linked says much the same thing.
Everything works except in the corner case where the client sends no data (i.e. write(socket, "", 0); ).
write(socket, "", 0) isn't even a send in the first place. It's just a local API call that does nothing on the network.
I was wondering how exactly I would detect that a message with no data is sent.
No message is sent, so there is nothing to detect.
In this case, recv() blocks forever.
I agree.
I have a blocking recv() call to wait for any data using MSG_PEEK. When recv detects data to be read, I move onto non-blocking recv()'s that read the stream byte by byte.
Instead of using recv() with MSG_PEEK, you should be using select(), poll(), or epoll to detect when data arrives, then call recv() to read it.
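A minimal sketch of that approach with select(), assuming sock is the connected socket (recv_with_wait is a name invented here):

#include <sys/types.h>
#include <sys/time.h>
#include <sys/select.h>
#include <sys/socket.h>

/* Sketch: wait until data is readable (or a timeout expires), then do
 * one blocking recv(). A return of 0 from recv() means the peer closed
 * the connection; -2 is a hypothetical sentinel for a timeout. */
ssize_t recv_with_wait(int sock, char *buf, size_t len, int timeout_sec)
{
    fd_set rset;
    FD_ZERO(&rset);
    FD_SET(sock, &rset);

    struct timeval tv = { .tv_sec = timeout_sec, .tv_usec = 0 };
    int r = select(sock + 1, &rset, NULL, NULL, &tv);
    if (r < 0)
        return -1;                       /* select() failed */
    if (r == 0)
        return -2;                       /* timed out, no data yet */

    return recv(sock, buf, len, 0);
}

Since a zero-byte write() never reaches the wire, the only end-of-conversation signal you can rely on is recv() returning 0 when the client closes its end of the connection.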