I am working on a client-server application in C on Linux. What I am trying to achieve is to switch the socket used for a TCP transfer on both client and server without data loss, where the client sends the data from a file to the server in the main thread. The application is multithreaded; the other threads change which socket is used based on some global flags.
Problem: The application has two TCP connections established, one over an IPv4 path and one over an IPv6 path. I am transferring a file over the TCP-IPv4 connection first, in the main thread. The other thread checks some global flags and shares the socket IDs created for each protocol in the main thread. The send and recv calls use a pointer variable to select the socket ID to be used for the data transfer. The data is transferred initially over TCP-IPv4. Once the global flags are set and a few other checks are made, the other thread changes the socket ID used in the send call to point to the IPv6 socket; this thread also takes care of communicating the change between the two hosts. All the data sent over IPv4 before the switch arrives completely, and data sent over IPv6 right after the switch also arrives. But further into the transfer there is loss of data over the IPv6 connection. (I am using a pointer variable in the send call on the server side, send(*p_dataSocket.socket_id, sentence, p_size, 0);, to change the pointer to the IPv6 socket ID on the fly.)
The error after the recv and send calls on each side respectively is ESPIPE: Illegal seek, but this error exists even before switching, so I am pretty sure it has nothing to do with the data loss.
I am using pselect() to check for available data on each socket. I can somehow understand data loss while switching (if it is not handled properly), but I cannot figure out why data loss occurs later in the transfer, after the switch. I hope I am clear on what the issue is. I have also checked sending the data individually over each protocol without switching, and there is no data loss. If I initially transfer the data over IPv6 and then switch to IPv4, there is no data loss either. I would also really appreciate advice on how to investigate this issue, apart from using errno or netstat.
When you are using TCP to send data you just can't lose part of the information in between. You either receive the byte stream the way it was sent or receive nothing at all, provided that you are using the socket-related functions correctly.
There are several points you may want to investigate.
First of all you must make sure that you are really sending the data which appears to be lost. Add some logging to the server-side application: dump everything that you transmit with send() into some file. Include some extra info as well, like:
Data packet no.==1234, *p_dataSocket.socket_id==11, Data=="data_contents_here", 22 bytes total; send() return==22
The important thing here is to watch the contents of *p_dataSocket.socket_id. Make sure that you are using a mutex or something like that, because you have one thread which regularly reads socket_id and another thread which occasionally changes it. You are not guaranteed to read a correct value from that address unless your threads have exclusive access to it while reading/writing. This is important both for normal program operation and for generating the debugging information.
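For illustration only, here is a minimal sketch of that kind of locking in C; the names (g_active_fd, send_locked, switch_to) are made up and not taken from the poster's code:

#include <pthread.h>
#include <sys/socket.h>

static int g_active_fd;                       /* currently used socket descriptor */
static pthread_mutex_t g_fd_lock = PTHREAD_MUTEX_INITIALIZER;

/* sender thread: read the descriptor under the lock, then use the local copy */
ssize_t send_locked(const void *buf, size_t len)
{
    pthread_mutex_lock(&g_fd_lock);
    int fd = g_active_fd;                     /* consistent snapshot */
    pthread_mutex_unlock(&g_fd_lock);
    return send(fd, buf, len, 0);
}

/* switching thread: change the descriptor under the same lock */
void switch_to(int new_fd)
{
    pthread_mutex_lock(&g_fd_lock);
    g_active_fd = new_fd;
    pthread_mutex_unlock(&g_fd_lock);
}

Note that the lock only protects reading and writing the shared variable; a send already in progress on the old descriptor still has to complete before the switch is announced to the peer.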
Another possible problem here is the logic which selects the sentence to send. Corruption of this variable may be hard to track down in a multithreaded program. The logging of transmitted information will help you here too.
Use any TCP sniffer to check what the TCP stack really transmits. Do packets carrying the missing data actually appear on the wire? If they don't, try to find out which send() call was responsible for sending that data. If they do, check the receiving side for bugs.
An errno value should not be used alone. Its value has meaning only when you get an erroneous return from a function. Try to find out exactly when errno becomes ESPIPE. That may happen when one of the API functions returns something like -1 (it depends on the function). When you find out where it happens, you should work out what is wrong in that particular piece of code (the debugger is your friend). Bear in mind that errno behaviour in a multithreaded environment depends on your system implementation. Make sure that you use the -pthread option (gcc) or at least compile with -D_REENTRANT to minimize the risks.
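As a tiny sketch of that rule (the wrapper name is made up), inspect errno only when the call itself reports failure:

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Send wrapper that looks at errno only when send() actually fails. */
static ssize_t send_checked(int fd, const void *buf, size_t len)
{
    ssize_t n = send(fd, buf, len, 0);
    if (n == -1)                              /* errno is meaningful only now */
        fprintf(stderr, "send(fd=%d) failed: %s\n", fd, strerror(errno));
    return n;
}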
Check this question for some info about the possible cause of your situation with errno == ESPIPE, and try some of the debugging techniques suggested there. An errno value of ESPIPE hints that you are using file descriptors incorrectly somewhere in your program; maybe somewhere you are using a socket fd as a regular file, or something like that. This may be caused by a race condition (simultaneous access to one object from several threads).
I have a socket programming situation where the client shuts down the writing end of the socket to let the server know input is finished (via receiving EOF), but keeps the reading end open to read back a result (one line of text). It would be useful for the server to know that the client has successfully read the result and closed the socket (or at least shut down the reading end). Is there a good way to check/wait for such status?
No. All you can know is whether your sends succeeded, and some of them will succeed even after the peer read shutdown, because of TCP buffering.
This is poor design. If the server needs to know that the client received the data, the client needs to acknowledge it, which means it can't shutdown its write end. The client should:
send an in-band termination message, as data.
read and acknowledge all further responses until end of stream occurs.
close the socket.
The server should detect the in-band termination message and:
stop reading requests from the socket
send all outstanding responses and read the acknowledgements
close the socket.
OR, if the objective is only to ensure that client and server end at the same time, each end should shutdown its socket for output and then read input until end of stream occurs, then close the socket. That way the final closes will occur more or less simultaneously on both ends.
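A minimal sketch of that "shut down for output, then drain input" sequence in C might look like this, assuming fd is the connected socket:

#include <sys/socket.h>
#include <unistd.h>

void finish_connection(int fd)
{
    char buf[4096];
    ssize_t n;

    shutdown(fd, SHUT_WR);                    /* no more data from this end */
    while ((n = recv(fd, buf, sizeof buf, 0)) > 0)
        ;                                     /* discard (or process) remaining data */
    close(fd);                                /* peer has also finished writing */
}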
getsockopt with TCP_INFO seems the most obvious choice, but it's not cross-platform.
Here's an example for Linux:
import socket
import time
import struct
import pprint

def tcp_info(s):
    rv = dict(zip("""
        state ca_state retransmits probes backoff options snd_rcv_wscale
        rto ato snd_mss rcv_mss unacked sacked lost retrans fackets
        last_data_sent last_ack_sent last_data_recv last_ack_recv
        pmtu rcv_ssthresh rtt rttvar snd_ssthresh snd_cwnd advmss reordering
        rcv_rtt rcv_space
        total_retrans
        pacing_rate max_pacing_rate bytes_acked bytes_received segs_out segs_in
        notsent_bytes min_rtt data_segs_in data_segs_out""".split(),
        struct.unpack("BBBBBBBIIIIIIIIIIIIIIIIIIIIIIIILLLLIIIIII",
                      s.getsockopt(socket.IPPROTO_TCP, socket.TCP_INFO, 160))))
    wscale = rv.pop("snd_rcv_wscale")
    # bit field layout is up to the compiler
    # FIXME: test the order of nibbles
    rv["snd_wscale"] = wscale >> 4
    rv["rcv_wscale"] = wscale & 0xf
    return rv

for i in range(100):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect(("localhost", 7878))
    s.recv(10)
    pprint.pprint(tcp_info(s))
I doubt a true cross-platform alternative exists.
Fundamentally there are quite a few states:
you wrote data to socket, but it was not sent yet
data was sent, but not received
data was sent and lost (relies on timer)
data was received, but not acknowledged yet
acknowledgement not received yet
acknowledgement lost (relies on timer)
data was received by remote host but not read out by application
data was read out by application, but socket still alive
data was read out, and app crashed
data was read out, and app closed the socket
data was read out, and app called shutdown(WR) (almost same as closed)
FIN was not sent by remote yet
FIN was sent by remote but not received yet
FIN was sent and got lost
FIN received by your end
Obviously your OS can distinguish quite a few of these states, but not all of them. I can't think of an API that would be this verbose...
Some systems allow you to query remaining send buffer space. Perhaps if you did, and socket was already shut down, you'd get a neat error?
The good news is that just because the socket is shut down doesn't mean you can't interrogate it. I can get all of TCP_INFO after shutdown, with state=7 (closed); in some cases it reports state=8 (close wait).
http://lxr.free-electrons.com/source/net/ipv4/tcp.c#L1961 has all the gory details of Linux TCP state machine.
TL;DR:
Don't rely on the socket state for this; it can bite you in many error cases. You need to bake the acknowledgement/receipt facility into your communications protocol. Reserving the first character of each line for status/ack works really well for text-based protocols.
On many, but not all, Unix-like/POSIXy systems, one can use the TIOCOUTQ (also SIOCOUTQ) ioctl to determine how much data is left in the outgoing buffer.
For TCP sockets, even if the other end has shut down its write side (and therefore will send no more data to this end), all transmissions are acknowledged. The data in the outgoing buffer is only removed when the acknowledgement from the recipient kernel is received. Thus, when there is no more data in the outgoing buffer, we know that the kernel at the other end has received the data.
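On Linux, a sketch of that check could use the SIOCOUTQ/TIOCOUTQ ioctl (the helper name here is made up; fd is assumed to be a connected TCP socket):

#include <sys/ioctl.h>
#include <linux/sockios.h>   /* SIOCOUTQ; on Linux it equals TIOCOUTQ */

/* Returns how many bytes are still unacknowledged in the send buffer,
 * or -1 on error. Zero means the peer's kernel has ACKed everything. */
int pending_output_bytes(int fd)
{
    int pending = 0;
    if (ioctl(fd, SIOCOUTQ, &pending) == -1)
        return -1;
    return pending;
}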
Unfortunately, this does not mean that the application has received and processed the data. This same limitation applies to all methods that rely on socket state; this is also the reason why fundamentally, the acknowledgement of receipt/acceptance of the final status line must come from the other application, and cannot be automatically detected.
This, in turn, means that neither end can shut down their sending sides before the very final receipt/acknowledge message. You cannot rely on TCP -- or any other protocols' -- automatic socket state management. You must bake in the critical receipts/acknowledgements into the stream protocol itself.
In OP's case, the stream protocol seems to be simple line-based text. This is quite useful and easy to parse. One robust way to "extend" such a protocol is to reserve the first character of each line for the status code (or alternatively, reserve certain one-character lines as acknowledgements).
For large in-flight binary protocols (i.e., protocols where the sender and receiver are not really in sync), it is useful to label each data frame with an increasing (cyclic) integer, and have the other end respond, occasionally, with an update to let the sender know which frames have been completely processed, and which ones received, and whether additional frames should arrive soon/not-very-soon. This is very useful for network-based appliances that consume a lot of data, with the data provider wishing to be kept updated on the progress and desired data rate (think 3D printers, CNC machines, and so on, where the contents of the data changes the maximum acceptable data rate dynamically).
Okay so I recall pulling my hair out trying to solve this very problem back in the late 90's. I finally found an obscure doc that stated that a read call to a disconnected socket will return a 0. I use this fact to this day.
You're probably better off using ZeroMQ. That will send a whole message, or no message at all. If you set its send buffer length to 1 (the shortest it will go) you can test whether the send buffer is full. If it isn't, the message was probably transferred successfully. ZeroMQ is also really nice if you have an unreliable or intermittent network connection as part of your system.
That's still not entirely satisfactory. You're probably even better off implementing your own send-acknowledge mechanism on top of ZeroMQ. That way you have absolute proof that a message was received. You don't have proof that a message was not received (something can go wrong between emitting and receiving the ack, and you cannot solve the Two Generals Problem). But that's the best that can be achieved. What you'll have done then is implement a Communicating Sequential Processes architecture on top of ZeroMQ's Actor Model, which is itself implemented on top of TCP streams. Ultimately it's a bit slower, but your application has more certainty about what's gone on.
I am an experienced network programmer and am faced with a situation where I need some advice.
I am required to distribute some data over several outgoing interfaces (via different TCP socket connections, each corresponding to one interface). However, the important part is that I should be able to send more/most of the data over the interface with better bandwidth, i.e. the one that can send faster.
The idea I had was to use the select API (both Unix and Windows) for this purpose. I have used select, poll and even epoll in the past, but always for READING from multiple sockets whenever data was available.
Here I intend to write successive packets to several interfaces in sequence, then monitor each of them for write descriptors (the select parameters); whichever becomes available first (meaning it was able to send its packet first), I would keep sending more packets via that descriptor.
Will I be able to achieve my intention here? I.e. if I have an interface with a 10 Mbps link and another with a 1 Mbps link, I hope to get most of the packets out via the faster interface.
Update 1: I was wondering what select's behaviour would be in this case. When you call select on read descriptors, the one on which data is available is returned. However, in my scenario, when we are writing to the descriptors and waiting for select to return the one that finished writing first, does select ensure it returns only when the entire packet has been written? Say I tried writing 1200 bytes in one go: will it return only when the entire 1200 bytes are written, or when there is a permanent error? I am not sure how select behaves here and have failed to find any documentation describing it.
I'd adapt the producer/consumer pattern. In this case one producer and several consumers.
Let the main thread handle your source (be the producer) and spawn off one thread for each connection (being the consumers).
The threads pull chunks of the source in parallel and send them over their connections one by one.
The thread holding the fastest connection is expected to send the most chunks in this setup.
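A rough sketch of that setup in C (names such as consumer, g_data, g_off and CHUNK are hypothetical) could look like this:

#include <pthread.h>
#include <stddef.h>
#include <sys/socket.h>

#define CHUNK 1400

static const char      *g_data;     /* whole source, loaded by the producer */
static size_t           g_len;
static size_t           g_off;      /* next unclaimed offset */
static pthread_mutex_t  g_lock = PTHREAD_MUTEX_INITIALIZER;

/* One consumer thread per connection; each claims the next chunk and sends it
 * on its own socket, so the fastest link naturally drains the most chunks. */
static void *consumer(void *arg)
{
    int fd = *(int *)arg;
    for (;;) {
        pthread_mutex_lock(&g_lock);
        size_t off = g_off;
        size_t n   = (g_len - off < CHUNK) ? g_len - off : CHUNK;
        g_off += n;
        pthread_mutex_unlock(&g_lock);

        if (n == 0)
            break;                            /* nothing left to claim */
        /* a real version would loop on short sends and handle errors */
        send(fd, g_data + off, n, 0);
    }
    return NULL;
}

Note that chunks sent over different connections will not, in general, arrive in their original order relative to each other, so in practice each chunk would need a small sequence header so the receiver can reassemble the stream.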
Using poll/epoll/select for writing is rather tricky. The reason is that sockets are mostly ready for writing unless their socket send buffer is full. So, polling for 'writable' is apt to just spin without ever waiting.
You need to proceed as follows:
When you have something to write to a socket, write it, in a loop that terminates when all the data has been written or write() returns -1 with errno == EAGAIN/EWOULDBLOCK.
At that point you have a full socket send buffer. So, you need to register this socket with the selector/poll/epoll for writability.
When you have nothing else to do, select/poll/epoll and repeat the writes that caused the associated sockets to be polled for writability.
Do those writes the same way as at (1) but this time, if the write completes, deregister the socket for writability.
In other words, you must only select/poll for writability if you already know the socket's send buffer is full, and you must stop doing so as soon as you know it isn't.
How you fit all this into your application is another question.
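For illustration, a sketch of that write-until-EAGAIN pattern using non-blocking sockets and epoll might look like the following. try_write is a made-up helper, and the socket is assumed to be already registered with the epoll instance epfd:

#include <errno.h>
#include <sys/epoll.h>
#include <sys/socket.h>

/* Returns 1 when everything was written, 0 when the send buffer filled up
 * (resume on EPOLLOUT), -1 on a real error. *off tracks progress in buf. */
int try_write(int epfd, int fd, const char *buf, size_t len, size_t *off)
{
    while (*off < len) {
        ssize_t n = send(fd, buf + *off, len - *off, 0);
        if (n > 0) {
            *off += (size_t)n;
            continue;
        }
        if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            /* send buffer full: now (and only now) ask to be told when writable */
            struct epoll_event ev = { .events = EPOLLOUT, .data.fd = fd };
            epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev);
            return 0;
        }
        return -1;
    }
    /* everything written: stop watching for writability */
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
    epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev);
    return 1;
}

With edge-triggered epoll the bookkeeping differs slightly, but the principle of only watching for EPOLLOUT while the buffer is known to be full stays the same.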
So the basic premise of my program is that I'm supposed to create a tcp session, direct traffic through it, and detect any connection losses. If the connection does break, I need to close the sockets and reopen them (using the same ports) in such a way that it will seem like the connection (almost) never died. It should also be noted that the two programs will be treated as proxies (data gets sent to them, if the connection breaks it gets stored until connection is fixed, then data is sent off).
I've done some research and gone ahead and used setsockopt() with the SO_REUSEADDR option to set the socket options so that I can reuse the address.
Here's the basic algorithm I do to detect a connection break using signals:
After initial setup of sockets, begin sending data
After x seconds, set a flag to false, which will prevent all other data from being sent
Send a single piece of data to let the other program know the connection is still open, reset timer to x seconds
If I receive same piece of data from the program, set the flag to true to continue sending
If I don't receive the data after x seconds, close the socket and attempt to reconnect
(step 5 is where I'm getting the error).
Essentially one program is a client(on one VM) and one program is a server(on another VM), each sending and receiving data to/from each other and to/from another program on each VM.
My question is: given that I'm still getting this error after setting the socket options, why am I not allowed to re-bind the address once a connection break has been detected?
The server is the one complaining when a disconnect is detected (I close the socket, open a new one, set the option, and attempt to bind the port with the same information).
One other thing of note is the way I'm receiving the data from the sockets. If I have a socket open, I'm basically reading it by doing the following:
while ((x = recv(socket, buff, 1, 0)) >= 0) {
    // add to buffer
    // send out to other program if connection is alive
}
Since I'm using the timer to close/reopen the socket, and this is in a different thread, will this prevent the socket from closing?
SO_REUSEADDR only allows limited reuse of ports. Specifically, it does not allow reuse of a port that some other socket is currently actively listening for incoming connections on.
There seems to be an epidemic here of people calling bind() and then setsockopt() and wondering why the setsockopt() doesn't fix an error that had already happened on bind().
You have to call setsockopt() first.
But I don't understand your problem. Why do you think you need to use the same ports? Why are you setting a flag preventing you from sending data? You don't need any of this. Just handle the errors on send() when and if they arise, creating a new connection when necessary. Don't try to out-think TCP. Many have tried, few if any have succeeded.
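For reference, a minimal sketch of the correct ordering (the port number 7878 is just a placeholder):

#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

int make_listener(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    int yes = 1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes);  /* before bind() */

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family      = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port        = htons(7878);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) == -1)
        return -1;
    listen(fd, 16);
    return fd;
}

Setting SO_REUSEADDR before bind() lets the server re-bind while the old connection lingers in TIME_WAIT, which is a common cause of "Address already in use" in this kind of reconnect scenario.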
I am learning to use SO_SNDTIMEO and SO_RCVTIMEO to check for timeouts.
They are easy to use for reads, but when I want to check a write timeout, write always returns successfully. Here is what I did (all in blocking mode):
close the client read socket and exit before server start write
terminate the client before server start write
unplug the cable of server after accept but before write
Well, it seems that in all these cases write just returns successfully.
I think the reason is that the port is a resource managed by the OS, and on the client side, after the program is gone, the TCP connection still shows the FIN_WAIT2 state.
So, is there any convenient way to simulate cases where write receives errors such as EPIPE or EAGAIN?
How to get the error EAGAIN?
To get the error EAGAIN, you need to be using non-blocking sockets. With non-blocking sockets, you need to write huge amounts of data (and stop receiving data on the peer side), so that your internal TCP buffer gets filled up and returns this error.
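A small sketch of that approach (the helper name is made up): switch the socket to non-blocking mode and keep writing until the buffer fills up:

#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>

void fill_until_eagain(int fd)
{
    static char junk[65536];                  /* zero-filled payload */
    fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);
    for (;;) {
        if (send(fd, junk, sizeof junk, 0) == -1) {
            if (errno == EAGAIN || errno == EWOULDBLOCK)
                break;                        /* send buffer full: the EAGAIN case */
            break;                            /* some other error */
        }
    }
}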
How to get the error EPIPE?
To get the error EPIPE, you need to send a large amount of data after closing the socket on the peer side. You can get more info about the EPIPE error from this SO link. I had asked a question about the Broken Pipe error in the link provided, and the accepted answer gives a detailed explanation. It is important to note that to get the EPIPE error you should have set the flags parameter of send to MSG_NOSIGNAL. Without that, an abnormal send can generate the SIGPIPE signal instead.
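As a sketch of that scenario (assuming the peer has already closed its socket): the first send() is typically accepted locally and provokes an RST from the peer, and a subsequent send() then fails with EPIPE. The answer above suggests large writes, but the same effect usually shows up with small ones once the RST has come back:

#include <errno.h>
#include <stdio.h>
#include <sys/socket.h>

void provoke_epipe(int fd)
{
    const char msg[] = "hello";
    send(fd, msg, sizeof msg, MSG_NOSIGNAL);             /* peer already closed */
    if (send(fd, msg, sizeof msg, MSG_NOSIGNAL) == -1 && errno == EPIPE)
        fprintf(stderr, "got EPIPE as expected\n");
}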
Additional Note
Please note that it is difficult to simulate a write failure, as TCP generally stores the data that you are trying to write in its internal buffer. So, if the internal buffer has sufficient space, you won't get an error immediately. The best way is to try to write huge amounts of data. You can also try setting a smaller send buffer size by using the setsockopt function with the SO_SNDBUF option.
You can simulate errors using fault injection. For example, libfiu is a fault injection library that comes with an example project that allows you to simulate errors from POSIX functions. Basically it uses LD_PRELOAD to inject a wrapper around the regular system calls (including write), and then the wrapper can be configured to either pass through to the real system call, or return whatever error you like.
You could set the receive buffer size to be really small on one side, and send a large buffer on the other. Or on the one side set the send buffer small and try to send a large message.
Otherwise the most common test (I think) is to let the server and client talk for a while, and then remove a network cable.
I have a client which connects to a server and tries to send() some data. However there are two types of data that I need to send, lets say information about the weather and the current time (just examples).
The problem is: in the client I'm calling send() twice, once to send the weather info and once to send the current time, and in the server I'm looping on recv().
What I expected (and built my code around) is that the first time the server calls recv() it would only get the weather info and at the second call to recv() the time, however only one call to recv() is enough for both of the data to be received on the same buffer.
While that may not be a problem in itself, the thing is I've built my program around that assumption, and I just wanted to know if there is a way to achieve what I want (I thought of a sleep() between the two send() calls, but that seems unreliable), so that I can save time rewriting code.
If anyone knows a way it would save me quite some time, so I'm appreciating any help.
There is no alternative to a proper message protocol on top of TCP. TCP only transfers a stream of octets (bytes); it cannot transfer messages, structs, or objects.
If you've built a large program around the assumption that TCP can transfer messages on its own, you are in trouble.
Sleep() and timer bodges will just not work in any reliable or performant way. You must do it properly and implement a protocol on top of TCP, e.g. by sending a header containing the data length, or by using start/end bytes and escaping either byte wherever it appears inside the data.
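For example, a minimal sketch of the length-header approach (not a complete implementation; a robust version would also loop on short sends):

#include <arpa/inet.h>
#include <stdint.h>
#include <sys/socket.h>

/* Prefix each message with its length in network byte order, so the receiver
 * knows where one message ends regardless of how recv() splits the stream. */
int send_message(int fd, const void *data, uint32_t len)
{
    uint32_t hdr = htonl(len);
    if (send(fd, &hdr, sizeof hdr, 0) != sizeof hdr)
        return -1;
    return (send(fd, data, len, 0) == (ssize_t)len) ? 0 : -1;
}

int recv_message(int fd, void *buf, uint32_t maxlen)
{
    uint32_t hdr, len, got = 0;
    if (recv(fd, &hdr, sizeof hdr, MSG_WAITALL) != sizeof hdr)
        return -1;
    len = ntohl(hdr);
    if (len > maxlen)
        return -1;
    while (got < len) {                       /* recv may return partial data */
        ssize_t n = recv(fd, (char *)buf + got, len - got, 0);
        if (n <= 0)
            return -1;
        got += (uint32_t)n;
    }
    return (int)len;
}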