I have always believed TCP is reliable: if write(socket, buf, buf_len) and close(socket) return without error, the receiver will receive exactly the data in buf, all buf_len bytes of it.
But this article says TCP is not reliable.
A:
sock = socket(AF_INET, SOCK_STREAM, 0);
connect(sock, &remote, sizeof(remote));
write(sock, buffer, 1000000); // returns 1000000
close(sock);
B:
int sock = socket(AF_INET, SOCK_STREAM, 0);
bind(sock, &local, sizeof(local));
listen(sock, 128);
int client=accept(sock, &local, locallen);
write(client, "220 Welcome\r\n", 13);
int bytesRead = 0, res;
for (;;) {
    res = read(client, buffer, 4096);
    if (res < 0) {
        perror("read");
        exit(1);
    }
    if (!res)
        break;
    bytesRead += res;
}
printf("%d\n", bytesRead);
Quiz question - what will program B print on completion?
A) 1000000
B) something less than 1000000
C) it will exit reporting an error
D) could be any of the above
The right answer, sadly, is "D". But how could this happen? Program A reported that all data had been sent correctly!
If this article is correct, I have to change my mind, but I am not sure whether it is.
Is this article true?
TCP is reliable (at least when the lower level protocols are), but programmers may use it in an unreliable way.
The problem here is that a socket should not be closed before all sent data has been correctly received by the other side: closing it too early may tear down the connection while the last of the data is still in transit.
The correct way to ensure proper reception by the peer is a graceful shutdown.
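For example, a minimal sketch of what the sending side in program A could do instead (assuming the peer closes its own end once it has consumed everything; the graceful_close name is only illustrative):
// A minimal sketch, not taken from the article: perform a graceful
// shutdown instead of an immediate close(). Assumes `sock` is a
// connected TCP socket and all data has already been written.
#include <unistd.h>
#include <sys/socket.h>

static void graceful_close(int sock)
{
    char tmp[256];

    shutdown(sock, SHUT_WR);               // send FIN: "no more data from me"

    // Drain until the peer closes its side; read() returning 0 means the
    // peer has consumed our stream and closed its end.
    while (read(sock, tmp, sizeof tmp) > 0)
        ;                                  // discard anything the peer sends

    close(sock);                           // now it is safe to close
}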
TCP/IP is reliable, for a very particular (and limited) meaning of the word "reliable".
Specifically, when write() returns 1000000, it is making you the following promises:
Your 1000000 bytes of data have been copied into the TCP socket's outgoing-data-buffer, and from now on, the TCP/IP stack is responsible for delivering those bytes to the remote program.
If those bytes can be delivered (with a reasonable amount of effort), then they will be delivered, even if some of the transmitted TCP packets get dropped during delivery.
If the bytes do get delivered, they will be delivered accurately and in-order (relative to each other and also relative to the data passed to previous and subsequent calls to write() on that same socket).
But there are also some guarantees that write() doesn't (and, in general, can't) provide. In particular:
write() cannot guarantee that the receiving application won't exit or crash before it calls recv() to get all 1000000 of those bytes.
write() cannot guarantee that the receiving application will do the right thing with the bytes it does receive (i.e. it might just ignore them or mishandle them, rather than acting on them)
write() cannot guarantee that the receiving application will ever call recv() to read those bytes at all (i.e. it might just keep the socket open but never call recv() on it)
write() cannot guarantee that the network infrastructure between your computer and the remote computer will work to deliver the bytes -- i.e. if some packets are dropped, no problem, the TCP layer will resend them, but e.g. if someone has pulled the power cable out of your cable modem, then there's simply no way for the bytes to get to their destination, and after a few minutes of trying and failing to get an ACK, the TCP layer will give up and error out the connection.
write() can't guarantee that the receiving application handles socket-shutdown issues 100% correctly (if it doesn't, then it's possible for any data you send just before the remote program closes the socket to be silently dropped, since there's no receiver left to receive it)
write() doesn't guarantee that the bytes will be received by the receiving application in the same segment-sizes that you sent them in, i.e. just because you sent 1000000 bytes in a single call to write() doesn't mean that the receiving app can receive 1000000 bytes in a single call to recv(). It might instead receive the data in any-sized chunks, e.g. 1-byte-per-recv()-call, or 1000-bytes-per-recv()-call, or any other size the TCP layer feels like providing.
Note that in all of the cases listed above, the problems occur after write() has returned, which is why write() can't just return an error code when these happen. (A later call to write() might well return an error code, of course, but that won't particularly help you know where the line between delivered-bytes and non-delivered bytes was located)
TLDR: TCP/IP is more reliable than UDP, but not a 100% iron-clad guarantee that nothing will ever go wrong. If you really want to make sure your bytes got processed on the receiving end, you'll want to program your receiving application to send back some sort of higher-level acknowledgement that it received (and successfully handled!) the bytes you sent it.
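One simple way to get such an acknowledgement, sketched under the assumption that the receiver writes back a single 'A' byte once it has processed everything (the one-byte protocol and the helper name are made up for illustration):
// Hypothetical sketch: after sending, wait for a one-byte application-level
// acknowledgement from the receiver before treating the data as processed.
#include <unistd.h>

static int send_and_wait_for_ack(int sock, const char *data, size_t len)
{
    size_t off = 0;
    while (off < len) {                       // write() may accept less than asked
        ssize_t n = write(sock, data + off, len - off);
        if (n <= 0)
            return -1;                        // connection error
        off += (size_t)n;
    }

    char ack;
    if (read(sock, &ack, 1) != 1 || ack != 'A')
        return -1;                            // peer never confirmed processing
    return 0;
}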
TCP/IP offers reliability, which means it allows for the retransmission of lost packets, thereby making sure that all data transmitted is (eventually) received or you get a timeout error. Either your data gets delivered or you are presented with a timeout error.
What is the correct way of reading from a TCP socket in C/C++?
And by the way, TCP/IP is reliable only to some extent: it guarantees delivery assuming the network is working... if you unplug the cable, TCP/IP will not deliver your packets.
P.S. You should at least advance the pointer inside the buffer:
void ReadXBytes(int socket, unsigned int x, void* buffer)
{
    unsigned int bytesRead = 0;
    ssize_t result;
    while (bytesRead < x)
    {
        // read() needs a byte offset; void* arithmetic is not standard C
        result = read(socket, (char*)buffer + bytesRead, x - bytesRead);
        if (result < 1)
        {
            // 0 means the peer closed the connection, -1 means an error:
            // throw/handle your error here instead of looping forever.
            break;
        }
        bytesRead += (unsigned int)result;
    }
}
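A hypothetical usage sketch built on top of ReadXBytes(): read a 4-byte length prefix first, then exactly that many payload bytes. The wire format (a uint32_t length in network byte order) and the helper name are assumptions, not something the answer above prescribes.
// Hypothetical usage sketch: length-prefixed messages over ReadXBytes().
#include <stdint.h>
#include <stdlib.h>
#include <arpa/inet.h>

char *read_length_prefixed(int socketFd, uint32_t *out_len)
{
    uint32_t length = 0;
    ReadXBytes(socketFd, sizeof(length), &length); // first the size...
    length = ntohl(length);                        // network to host byte order

    char *payload = malloc(length);
    if (payload != NULL)
        ReadXBytes(socketFd, length, payload);     // ...then exactly that many bytes

    *out_len = length;
    return payload;                                // caller frees
}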
Program A reported that all data had been sent correctly!
No. The return value of write in program A only means that the returned number of bytes has been handed over to the kernel of the operating system. It does not claim that the data have been sent by the local system, have been received by the remote system, or have even been processed by the remote application.
I have always believed TCP is reliable.
TCP provides reliability at the transport layer only. This means only that it will make sure that data loss gets detected (and packets re-transmitted), data duplication gets detected (and duplicates discarded), and packet reordering gets detected and fixed.
TCP does not claim any reliability for higher layers. If the application needs this, it has to be provided by the application itself.
Related
I am trying to understand some of the concepts regarding TCP data transmission.
Suppose we are using a socket (in C) to send and receive HTTP GET requests for a website. Regarding the receiving end, what I have seen is the implementation below. Basically, a response buffer is created and filled up iteratively.
memset(response, 0, sizeof(response));
total = sizeof(response) - 1;
received = 0;
while (received < total) {
    bytes = recv(sockfd, response + received, total - received, 0);
    if (bytes < 0)
        error("ERROR reading response from socket");
    if (bytes == 0)
        break;
    received += bytes;
}
The following two things weren't very clear to me:
Where does the response get its data from, a cache in the OS or directly from the website? I haven't learnt operating systems, which makes it difficult for me to comprehend the buffering process.
Also, TCP is supposed to check for transmission loss; where does this happen? When I do socket programming I don't see any handlers for transmission loss, so is it handled automatically by the socket?
Where does the response get its data from, a cache in the OS or directly from the website? I haven't learnt operating systems, which makes it difficult for me to comprehend the buffering process.
The TCP packets are received from the network device (Ethernet card, Wi-Fi adapter, etc), and their payload-data is placed into a temporary buffer inside the TCP/IP stack. When your program calls recv(), some or all of that data is copied from the temporary buffer into your program's buffer (response).
The TCP/IP stack will not do any caching of data other than what is described above. (e.g. if a web browser wants to cache a local copy of the web page so that it won't have to download it a second time, it will be up to the web browser itself to do that at the application level; the TCP/IP stack and the OS will not do that on their own)
Also, TCP is supposed to check for transmission loss; where does this happen? When I do socket programming I don't see any handlers for transmission loss, so is it handled automatically by the socket?
It is handled transparently inside the TCP stack. In particular, each TCP packet has a checksum and a sequence-number included in its header, and the TCP stack checks the sequence-number of each packet it receives, to make sure that it matches the next number in the sequence (relative to the previous packet it received from the same TCP stream). If it is not the expected next-packet-number, then the TCP stack knows that a packet was lost somehow, and it responds by sending a request to the remote computer that the dropped packet(s) be resent. Note that the TCP stack may have to drop subsequent packets as necessary until the originally-expected sequence can be resumed, because it is required to deliver the payload data to your application in the exact order in which it was sent (i.e. it isn't allowed to deliver "later" bytes before "earlier" bytes, even if some of the "earlier" bytes got lost and had to be retransmitted).
Where does the response get its data from, a cache in the OS? I haven't learnt operating systems, which makes it difficult for me to comprehend the buffering process.
Maybe, maybe not; you don't care, as you are the user of the feature. All of this is handled for you by the TCP socket implementation. However, if you forget to read from the socket, its receive buffer will fill up; there is a limit to the amount of data the kernel will queue.
Also, TCP is supposed to check for transmission loss; where does this happen? When I do socket programming I don't see any handlers for transmission loss, so is it handled automatically by the socket?
Yes. You don't need to worry about that. Beautiful, isn't it?
With respect to your code, I anticipate some problems:
// not necessary, as you should only read the bytes affected by recv()
// memset(response, 0, sizeof(response));

// declare total and received here
size_t total = sizeof(response) - 1;
size_t received = 0;

while (received < total) {
    // declare bytes only here
    ssize_t bytes = recv(sockfd, response + received, total - received, 0);
    if (bytes < 0)
        error("ERROR reading response from socket");
    if (bytes == 0)
        break;
    received += (size_t)bytes;
}

// as you want to read a string, don't forget the terminator
response[received] = '\0';
I've been working with sockets (in C, with no prior experience in socket programming) for the past few days.
I have to collect WiFi packets on a Raspberry Pi, do some processing, and send the formatted information to another device over sockets (both devices are connected to the same network).
The challenge I'm facing is in receiving the data over the sockets.
The data is sent successfully from the sending side, but on the receiving side I sometimes receive junk or previous data.
On Sending Side (client):
int server_socket = socket(AF_INET, SOCK_STREAM, 0);
//connecting to the server with connect function
send(server_socket, &datalength, sizeof(datalength),0); //datalength is an integer containing the number of bytes that are going to be sent next
send(server_socket, actual_data, sizeof(actual_data),0); //actual data is a char array containing the actual character string data
On Receiving Side (Server Side):
int server_socket = socket(AF_INET, SOCK_STREAM, 0);
//bind the socket to the ip and port with bind function
//listen to the socket for any clients
//int client_socket = accept(server_socket, NULL, NULL);
int bytes;
recv(client_socket, &bytes, sizeof(bytes),0);
char* actual_message = malloc(bytes);
int rec_bytes = recv(client_socket, actual_message, bytes,0);
*The above lines of code are not the actual lines of code, but the flow and procedure would be similar (with exception handling and comments).
Sometimes I get the actual data for all the packets quickly (without any errors or packet loss). But sometimes the bytes value (the integer sent to tell the size of the byte stream for the next transaction) is received as junk, so my code breaks at that point.
Also, sometimes the number of bytes I receive on the receiving side is less than the number expected (known from the received integer). In that case I check for that condition and retrieve the remaining bytes.
The rate at which packets arrive is very high (around 1000 packets in less than a second, and I have to dissect, format and send each one over the socket). I have tried different ideas (using SOCK_DGRAM, but there is some packet loss; inserting some delay between transactions; opening and closing a new socket for each packet; adding an acknowledgement after receiving each packet) but none of them meets my requirement (quick transfer of packets with zero packet loss).
Kindly suggest a way to send and receive varying-length packets at a quick rate over sockets.
I see a few main issues:
I think your code ignores the possibility of a full buffer in the send function.
It also seems to me that your code ignores the possibility of partial data being collected by recv. (nevermind, I just saw the new comment on that)
In other words, you need to manage a user-land buffer for send and handle fragmentation in recv.
The code uses sizeof(int) which might be a different length on different machines (maybe use uint32_t instead?).
The code doesn't translate to and from network byte order. This means that you're sending the memory structure of the int instead of an integer that can be read by different machines (some machines store the bytes backwards, some forward, some mix and match).
Notice that when you send larger data using TCP/IP, it will be fragmented into smaller packets.
This depends, among other things, on the network's MTU value (which often runs at ~500 bytes in the wild and usually around ~1500 bytes in your home network).
To handle these cases you should probably use an evented network design rather than blocking sockets.
Consider routing the send through something similar to this (if you're going to use blocking sockets):
int send_complete(int fd, const void *data, size_t len) {
    size_t act = 0;
    while (act < len) {
        ssize_t tmp = send(fd, (const char *)data + act, len - act, 0);
        if (tmp <= 0) {
            if (errno != EWOULDBLOCK && errno != EAGAIN && errno != EINTR)
                return (int)tmp; // connection error
            // add `select` here to poll the socket before retrying
            continue;
        }
        act += (size_t)tmp;
    }
    return (int)act;
}
As for the sizeof issues, I would replace the int with a specific byte length integer type, such as int32_t.
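Putting the two points together (a fixed-width integer plus network byte order), here is a hedged sketch of how the length prefix could be written, reusing the send_complete() helper above; the send_message name and the uint32_t prefix are illustrative choices, not requirements:
// Hypothetical framing sketch: fixed-width, network-byte-order length prefix
// followed by the payload, both sent via send_complete() above.
#include <stdint.h>
#include <arpa/inet.h>

int send_message(int fd, const void *payload, uint32_t len) {
    uint32_t wire_len = htonl(len);             // fixed size, fixed byte order
    if (send_complete(fd, &wire_len, sizeof wire_len) < 0)
        return -1;
    return send_complete(fd, payload, len);     // then the payload itself
}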
A few more details
Please notice that sending the integer separately doesn't guarantee that it will be received separately or that the integer itself won't be fragmented.
The send function writes to the system's buffer for the socket, not to the network (just like recv reads from the available buffer and not from the wire).
You can't control where fragmentation occurs or how the TCP packets are packed (unless you implement your own TCP/IP stack).
I'm sure it's clear to you that the "junk" value is data that was actually sent by the other side. This means that the code isn't reading the integer you sent, but reading another piece of data.
It's probably a question of alignment to the message boundaries, caused by an incomplete read or an incomplete send.
P.S.
I would consider using the Websocket protocol on top of the TCP/IP layer.
This guarantees a binary packet header that works across different CPU architectures (endianness) and offers a wider variety of client connectivity (such as connecting from a browser, etc.).
It will also solve the packet alignment issue you're experiencing (not because it won't exist, but because it was resolved in whatever Websocket parser you will adopt).
I am doing some tests with a TCP client application on a Raspberry Pi (the server runs on a PC), over PPP (Point-to-Point Protocol) using an LTE modem. I have written a C program with sockets, checking each system call's return value. I wanted to test how the socket behaves in a bad coverage area, so I did some tests with the antenna removed.
I followed these steps:
Connect to server --> OK
Start sending data (write system call) --> OK (I also check it on the server)
Remove the LTE modem's antenna (there is no network; it cannot even ping)
Continue sending data (write system call) --> OK (the server does not receive anything!!!)
Finish sending data and close the socket --> OK (the connection is still open and no data has arrived since the antenna was removed)
The program finishes
I put the antenna back on
Some time later, the data was uploaded and the connection closed. But I did another test following the same steps with more data, and that data never arrived...
I do not know if there is any way to ensure that the data written to the TCP socket is actually received by the server (I thought the TCP layer ensured this). I could do it manually with an application-level ACK, but I guess there has to be a better way.
Sending part code:
while (i < 100)
{
    sprintf(buf, "Message %d\n", i);
    Return = write(Sock_Fd, buf, strlen(buf));
    if (Return != strlen(buf))
    {
        printf("Error sending data to TCP server. \n");
        printf("Error str: %s \n", strerror(errno));
    }
    else
    {
        printf("write successful %d\n", i);
        i++;
    }
    sleep(2);
}
Many thanks for your help.
The write() syscall returns successfully because the kernel buffers the data and puts it in the socket's out-queue. The data is removed from this queue only when it has been sent and acked by the peer. When the out-queue is full, the write() syscall will block.
To determine whether data has not yet been acked by the peer, you have to look at the size of the out-queue. On Linux, you can use an ioctl() for this:
ioctl(fd, SIOCOUTQ, &outqlen);
However, it would be cleaner and more portable to use an in-band method for determining whether the data has been received.
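For illustration only, here is a Linux-specific sketch of the ioctl approach (the helper name, retry count and one-second poll interval are arbitrary choices, and this still cannot tell you what the receiving application did with the data):
// Linux-only sketch: poll SIOCOUTQ until the socket's out-queue is empty
// (or we give up) before closing.
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/sockios.h>   // SIOCOUTQ

static int wait_until_sent(int fd, int max_tries)
{
    int outqlen = 0;
    for (int i = 0; i < max_tries; i++) {
        if (ioctl(fd, SIOCOUTQ, &outqlen) < 0)
            return -1;          // ioctl failed
        if (outqlen == 0)
            return 0;           // everything sent and acked by the peer
        sleep(1);               // still unacked data: wait and re-check
    }
    return -1;                  // gave up: data still pending
}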
TCP/IP is rather primitive technology. Internet may sound newish, but this is really antique stuff. TCP is needed because IP gives almost no guarantees, but TCP doesn't actually add that many guarantees. Its chief function is to turn a packet protocol into a stream protocol. That means TCP guarantees a byte order; no bytes will arrive out of order. Don't count on more than that.
You see that protocols on top of TCP add extra checks. E.g. HTTP has the famous HTTP error codes, precisely because it can't rely on the error state from TCP. You probably have to do the same - or you can consider implementing your service as an HTTP service. "RESTful" refers to an API design methodology which closely follows the HTTP philosophy; this might be relevant to you.
The short answer to your 4th and 5th topics was taken as a shortcut from this answer (read the whole answer to get more info)
A socket has a send buffer and if a call to the send() function succeeds, it does not mean that the requested data has actually really been sent out, it only means the data has been added to the send buffer. For UDP sockets, the data is usually sent pretty soon, if not immediately, but for TCP sockets, there can be a relatively long delay between adding data to the send buffer and having the TCP implementation really send that data.
As a result, when you close a TCP socket, there may still be pending data in the send buffer, which has not been sent yet but your code considers it as sent, since the send() call succeeded. If the TCP implementation was closing the socket immediately on your request, all of this data would be lost and your code wouldn't even know about that.
TCP is said to be a reliable protocol and losing data just like that is not very reliable. That's why a socket that still has data to send will go into a state called TIME_WAIT when you close it. In that state it will wait until all pending data has been successfully sent or until a timeout is hit, in which case the socket is closed forcefully.
The amount of time the kernel will wait before it closes the socket, regardless if it still has pending send data or not, is called the Linger Time.
BTW: that answer also refers to the docs where you can see more detailed info
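If you want close() itself to block until pending data has been handed off (or a timeout expires), the standard SO_LINGER option can be set beforehand; a small sketch, with the timeout value left up to you:
// Sketch: configure a linger timeout so that close() blocks until pending
// data has been transmitted or the timeout expires.
#include <sys/socket.h>

static int set_linger(int fd, int seconds)
{
    struct linger lin;
    lin.l_onoff  = 1;        // enable lingering on close()
    lin.l_linger = seconds;  // how long close() may block
    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lin, sizeof lin);
}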
Let's imagine the following data sequence that was sent from the server to the client:
[data] [data] [data] [FIN] [RST]
And let's imagine that I'm doing the following on the client side (sockets are non-blocking):
char buf[sizeof(data)];
for (;;)
{
    rlen = recv(..., buf, sizeof(buf), ...);
    rerr = errno;
    slen = send(..., "a", 1, ...);
    serr = errno;
}
When will I see the ECONNRESET error?
I'm particularly curious about the following edge case. Let's imagine that all IP frames for the imagined sequence above have already been received and ACKed by the TCP stack, but my client application hasn't called send() or recv() yet. Will the first call to send() return ECONNRESET? If so, will the next call to recv() succeed and let me read everything in the internal buffers (since the stack received the data and has it) before it starts reporting ECONNRESET (or returning 0 because of the FIN)? Or will something different happen?
I would especially appreciate a link to documentation that explains this situation. I'm trying to grok the Linux TCP implementation to figure it out, but it's not that clear...
Will the first call to send() return an ECONNRESET?
Not unless it blocks for long enough for the peer to detect the incoming packet for the broken connection and return an RST. Most of the time, send will just buffer the data and return.
will the next call to recv() succeed
It depends entirely on (a) whether there is incoming data to be read and (b) whether an RST has been received yet.
and allow me to read everything it has in its internal buffers (since it received the data and has it) before starting to report ECONNRESET (or returning 0 because of FIN)?
If an RST is received, all buffered data will be thrown away.
It all depends entirely on the timing and on the size of the buffers at both ends.
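As a rough sketch (not a guarantee of any particular timing), here is how a client reading from a non-blocking socket might tell these outcomes apart after a recv() call; the helper name is illustrative:
// Sketch (assumes `fd` is a non-blocking socket): distinguish normal EOF
// (FIN), a reset connection, and "no data yet" after recv().
#include <errno.h>
#include <stdio.h>
#include <sys/socket.h>

static void handle_recv(int fd, char *buf, size_t len)
{
    ssize_t rlen = recv(fd, buf, len, 0);
    if (rlen > 0) {
        // got rlen bytes of buffered data; process them
    } else if (rlen == 0) {
        printf("peer sent FIN: orderly shutdown\n");
    } else if (errno == EWOULDBLOCK || errno == EAGAIN) {
        // nothing to read right now on a non-blocking socket
    } else if (errno == ECONNRESET) {
        printf("connection reset: buffered data may have been discarded\n");
    } else {
        perror("recv");
    }
}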
I'm writing a UDP server/client application in which the server sends data and the client receives it. When a packet is lost, the client should send a NACK to the server. I set the socket to O_NONBLOCK so that I can notice when the client does not receive a packet:
if ((bytes = recvfrom(....)) != -1) {
    // do something
} else {
    // send nack
}
My problem is that if the server does not start sending packets, the client behaves as if a packet were lost and starts sending NACKs to the server (recvfrom fails when no data is available). I would like some advice on how I can distinguish between these cases: the server has not started sending packets yet, versus the server has sent a packet but it was really lost.
You are using UDP. For this protocol it's perfectly OK to throw away packets if there is a need to do so, so it is not reliable in the sense of "what is sent will arrive". What you have to do in your client is check whether all the packets you need have arrived, and if not, politely ask your server to resend the ones you did not receive. Implementing this is not that easy.
If you have to use UDP to transfer a largish chunk of data, then design a small application-level protocol that would handle possible packet loss and re-ordering (that's part of what TCP does for you). I would go with something like this:
Datagrams less than the MTU (plus IP and UDP headers) in size (say 1024 bytes) to avoid IP fragmentation.
Fixed-length header for each datagram that includes the data length and a sequence number, so you can stitch the data back together and detect missed, duplicate, and re-ordered parts (see the header sketch after this list).
Acknowledgements from the receiving side of what has been successfully received and put together.
Timeout and retransmission on the sending side when these acks don't come within appropriate time.
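As a concrete illustration of the fixed-length header mentioned above, one possible layout might be the following; the field names, sizes and the roughly-1024-byte budget are assumptions, not requirements (multi-byte fields should be converted with htons()/htonl() before sending):
// Hypothetical header layout for such an application-level protocol.
#include <stdint.h>

#define CHUNK_PAYLOAD_MAX 1016   // keeps the whole datagram near 1024 bytes

struct chunk_header {
    uint32_t seq;       // sequence number, for re-ordering and loss detection
    uint16_t len;       // number of valid payload bytes that follow
    uint16_t flags;     // e.g. a "last chunk" marker
};

struct chunk {
    struct chunk_header hdr;
    uint8_t payload[CHUNK_PAYLOAD_MAX];
};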
You have a loop calling either select() or poll() to determine whether data has arrived; if so, you then call recvfrom() to read the data.
You can set a timeout for receiving data as follows:
ssize_t
recv_timeout(int fd, void *buf, size_t len, int flags)
{
    ssize_t ret;
    struct timeval tv;
    fd_set rset;

    // initialize the descriptor set and add our socket to it
    FD_ZERO(&rset);
    FD_SET(fd, &rset);

    // timeout, e.g. 60 seconds taken from the configuration
    tv.tv_sec = config.idletimeout;
    tv.tv_usec = 0;

    // wait until the socket becomes readable or the timeout expires
    ret = select(fd + 1, &rset, NULL, NULL, &tv);
    if (ret == 0) {
        log_message(LOG_INFO,
                    "Idle Timeout (after select)");
        return 0;
    } else if (ret < 0) {
        log_message(LOG_ERR,
                    "recv_timeout: select() error \"%s\". Closing connection (fd:%d)",
                    strerror(errno), fd);
        return -1;
    }

    ret = recvfrom(fd, buf, len, flags);
    return ret;
}
The documentation tells us that, normally, read() should return up to the maximum number of bytes that you've specified, which possibly includes zero bytes (this is actually a valid thing to happen!), but it should never block after previously having reported readiness.
Under Linux, select() may report a socket file descriptor as "ready for reading", while nevertheless a subsequent read blocks. This could for example happen when data has arrived but upon examination has wrong checksum and is discarded. There may be other circumstances in which a file descriptor is spuriously reported as ready. Thus it may be safer to use O_NONBLOCK on sockets that should not block.
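Setting O_NONBLOCK as the man page excerpt suggests is a small fcntl() call; a sketch:
// Sketch: put a socket into non-blocking mode with fcntl().
#include <fcntl.h>

static int make_nonblocking(int fd)
{
    int fl = fcntl(fd, F_GETFL, 0);   // read the current file status flags
    if (fl < 0)
        return -1;
    return fcntl(fd, F_SETFL, fl | O_NONBLOCK);
}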
Look up sliding window protocol here.
The idea is that you divide your payload into packets that fit in a physical UDP packet, then number them. You can visualize the buffers as a ring of slots, numbered sequentially in some fashion, e.g. clockwise.
Then you start sending from 12 o'clock, moving to 1, 2, 3... In the process, you may (or may not) receive ACK packets from the server that contain the slot number of a packet you sent.
If you receive an ACK, you can remove that packet from the ring and place there the next unsent packet that is not already in the ring.
If you receive a NAK for a packet you sent, it means that packet was received by the server with data corruption, and you then resend it from the ring slot reported in the NAK.
This protocol class allows transmission over channels with data or packet loss (like RS232, UDP, etc). If your underlying data transmission protocol does not provide checksums, then you need to add a checksum for each ring packet you send, so the server can check its integrity, and report back to you.
ACK and NAK packets from the server can also be lost. To handle this, you need to associate a timer with each ring slot, and if you don't receive either an ACK or a NAK for a slot before the timer reaches a timeout limit you set, then you retransmit the packet and reset the timer.
Finally, to detect fatal connection loss (i.e. server went down), you can establish a maximum timeout value for all your packets in the ring. To evaluate this, you just count how many consecutive timeouts you have for single slots. If this value exceeds the maximum you have set, then you can consider the connection lost.
Obviously, this protocol class requires dataset assembly on both sides based on packet numbers, since packets may not be sent or received in sequence. The 'ring' helps with this, since packets are removed only after successful transmission, and on the receiving side, only when the previous packet number has already been removed and appended to the growing dataset. However, this is only one strategy; there are others.
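To make the ring-of-slots idea a bit more concrete, here is one possible sender-side data structure (purely illustrative; the field names and the slot count are assumptions):
// Illustrative sketch of the sender-side ring described above: one slot per
// in-flight packet, with a retransmission timestamp and timeout counter.
#include <stdint.h>
#include <time.h>

#define RING_SLOTS 16

struct ring_slot {
    uint32_t seq;            // sequence number of the packet in this slot
    uint16_t len;            // payload length
    uint8_t  payload[1024];  // the packet data, kept until ACKed
    time_t   sent_at;        // when it was last (re)transmitted
    int      timeouts;       // consecutive timeouts; too many = link lost
    int      in_use;         // slot occupied and awaiting ACK/NAK
};

struct ring_slot ring[RING_SLOTS];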
Hope this helps.