How to flush Input Buffer (if such thing exists at all) of an UDP Socket in C ?
I'm working on an embedded Linux environment and using C to create some native application. There are several of these embedded machines on the same network, and when an event occurs on one of them (lets call it the WHISTLE-BLOWER), WHISTLE-BLOWER should send a network message to the network broadcast address, so that all machines on the network (including the WHISTLE-BLOWER) knows about the event and executes some actions according to it. I'm using UDP socket by the way...
Here's the pseudo-code for it:
main
{
startNetworkListenerThread( networkListenerFunction );
while( not received any SIGTERM or such )
{
localEventInfo = checkIfTheLocalEventOccured();
broadcastOnNetwork( localEventInfo );
}
}
networkListenerFunction
{
bindSocket;
while( not SIGTERM )
{
// THIS IS WHERE I WANT TO FLUSH THE RECV BUFFER...
recv_data = recvfrom( socket );
if( validate recv data )
{
startExecuteLocalAction;
sleep( 5 );
stopExecuteLocalAction;
}
}
}
The way I expect and want to work this code is:
1. LOCAL_EVENT occured
2. Broadcasted LOCAL_EVENT_INFO on network
3. All machines received EVENT_INFO, including the original broadcaster
4. All machines started executing the local action, including the original broadcaster
5. All machines' network listener(thread)s are sleeping
6. Another LOCAL_EVENT2 occured
7. Since all machines' listener are sleeping, LOCAL_EVENT2 is ignored
8. All machines' network listener(thread)s are now active again
9. GO BACK TO 1 / RESTART CYCLE
RESULT = TOTAL 2 EVENTS, 1 IGNORED
The way it actually works is:
1. LOCAL_EVENT occured
2. Broadcasted LOCAL_EVENT_INFO on network
3. All machines received EVENT_INFO, including the original broadcaster
4. All machines started executing the local action, including the original broadcaster
5. All machines' network listener(thread)s are sleeping
6. Another LOCAL_EVENT2 occured
7. Eventhough all machines' listener are sleeping; LOCAL_EVENT2 is queued SOMEHOW
8. All machines' network listener(thread)s are now active again
9. All machines received EVENT_INFO2 and executed local actions again, slept and reactivated
10. GO BACK TO 1 / RESTART CYCLE
RESULT = TOTAL 2 EVENTS, 0 IGNORED
tl,dr: The packets/messages/UDP Broadcasts sent to an already binded socket, whoose parent thread is sleeping at the delivery-moment; are somehow queued/buffered and delivered at the next 'recvfrom' call on the said socket.
I want those UDP broadcasts to be ignored so I was thinking of flushing the receive buffer (obviously not the one i'm giving as parameter to the recvfrom method) if it exists before calling recvfrom. How can I do that? or what path should I follow?
Please note that the notion of "flushing" only applies to output. A flush empties the buffer and ensures everything in it was sent to its destination. Regarding an input buffer, the data is already at its destination. Input buffers can be read from or cleared out, but not "flushed".
If you just want to make sure you have read everything in the input buffer, what you are looking for is a non-blocking read operation. If you try it and there's no input, it should return an error.
A socket has a single receive buffer inside the TCP/IP stack. It's essentially a FIFO of the datagrams received. TCP and UDP handle that queue differently though. When you call recv(2) on a UDP socket you dequeue a single datagram from that buffer. TCP arranges the datagrams into a stream of bytes according to the sequence numbers. When receive buffer overflows the datagram is dropped by the stack. TCP tries to resend in this case. UDP doesn't. There's no explicit "flush" function for the receive buffer other then reading the socket or closing it.
Edit:
You have an inherent race condition in your application and it looks like you are trying to solve it with a wrong tool (TCP/IP stack). What I think you should be doing is defining a clean state machine for the app. Handle the events that make sense at current state, ignore events that don't.
One other thing to look at is using multicast instead of broadcast. It's a bit more involved but you would have more control of the "subscriptions" by joining/leaving multicast groups.
Related
If a C socket's recvmsg() has a queue, how can I find out how many items are backlogged in the queue?
My problem is that the speed of the code after I receive something from recvmsg() is sometimes slower than the rate of data sent to recvmsg(), which would logically result in a queue? What happens if the queue becomes too big?
For example if this is my recv() code:
while (recvmsg(SocketA,...) > 0)
{
...
...something that takes 1.5 seconds to execute/complete...
...
}
and if the following function somewhere else gets called every 1 second:
// gets called every 1 second
int send_to_sock()
{
...
send(SocketA, ...);
...
}
I checked the reference pages for recvmsg() but none of them mention anything about a queue, but it seems like there is a queue because I am observing delays that incrementally add up. But the code stops working if the queue gets too long, so I want to know if there is a way to check length of queue.
It's not a queue, it's a buffer associated with most of I/O devices, not only sockets. For example, when you read from stdin with something like scanf, data goes to your program when you press enter. Where do you think all keystrokes are stored in the meantime?
The specific details of these buffers are implementation defined, but usually what happens with sockets is that new packets are discarded if the buffer is full. You can find more information about querying the state of the buffer in How to find the socket buffer size of linux and How can I tell if a socket buffer is full?.
Yes. There is a socket receive buffer. Its maximum size can be got and set via getsockopt() and setsockopt() using the SO_RCVBUF option. When it fills, in the case of TCP the sender is told to stop sending; in the case of UDP further incoming data is discarded.
I have a socket programming situation where the client shuts down the writing end of the socket to let the server know input is finished (via receiving EOF), but keeps the reading end open to read back a result (one line of text). It would be useful for the server to know that the client has successfully read the result and closed the socket (or at least shut down the reading end). Is there a good way to check/wait for such status?
No. All you can know is whether your sends succeeded, and some of them will succeed even after the peer read shutdown, because of TCP buffering.
This is poor design. If the server needs to know that the client received the data, the client needs to acknowledge it, which means it can't shutdown its write end. The client should:
send an in-band termination message, as data.
read and acknowledge all further responses until end of stream occurs.
close the socket.
The server should detect the in-band termination message and:
stop reading requests from the socket
send all outstanding responses and read the acknowledgements
close the socket.
OR, if the objective is only to ensure that client and server end at the same time, each end should shutdown its socket for output and then read input until end of stream occurs, then close the socket. That way the final closes will occur more or less simultaneously on both ends.
getsockopt with TCP_INFO seems the most obvious choice, but it's not cross-platform.
Here's an example for Linux:
import socket
import time
import struct
import pprint
def tcp_info(s):
rv = dict(zip("""
state ca_state retransmits probes backoff options snd_rcv_wscale
rto ato snd_mss rcv_mss unacked sacked lost retrans fackets
last_data_sent last_ack_sent last_data_recv last_ack_recv
pmtu rcv_ssthresh rtt rttvar snd_ssthresh snd_cwnd advmss reordering
rcv_rtt rcv_space
total_retrans
pacing_rate max_pacing_rate bytes_acked bytes_received segs_out segs_in
notsent_bytes min_rtt data_segs_in data_segs_out""".split(),
struct.unpack("BBBBBBBIIIIIIIIIIIIIIIIIIIIIIIILLLLIIIIII",
s.getsockopt(socket.IPPROTO_TCP, socket.TCP_INFO, 160))))
wscale = rv.pop("snd_rcv_wscale")
# bit field layout is up to compiler
# FIXME test the order of nibbles
rv["snd_wscale"] = wscale >> 4
rv["rcv_wscale"] = wscale & 0xf
return rv
for i in range(100):
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(("localhost", 7878))
s.recv(10)
pprint.pprint(tcp_info(s))
I doubt a true cross-platform alternative exists.
Fundamentally there are quite a few states:
you wrote data to socket, but it was not sent yet
data was sent, but not received
data was sent and losts (relies on timer)
data was received, but not acknowledged yet
acknowledgement not received yet
acknowledgement lost (relies on timer)
data was received by remote host but not read out by application
data was read out by application, but socket still alive
data was read out, and app crashed
data was read out, and app closed the socket
data was read out, and app called shutdown(WR) (almost same as closed)
FIN was not sent by remote yet
FIN was sent by remote but not received yet
FIN was sent and got lost
FIN received by your end
Obviously your OS can distinguish quite a few of these states, but not all of them. I can't think of an API that would be this verbose...
Some systems allow you to query remaining send buffer space. Perhaps if you did, and socket was already shut down, you'd get a neat error?
Good news is just because socket is shut down, doesn't mean you can't interrogate it. I can get all of TCP_INFO after shutdown, with state=7 (closed). In some cases report state=8 (close wait).
http://lxr.free-electrons.com/source/net/ipv4/tcp.c#L1961 has all the gory details of Linux TCP state machine.
TL;DR:
Don't rely on the socket state for this; it can cut you in many error cases. You need to bake the acknowledgement/receipt facility into your communications protocol. First character on each line used for status/ack works really well for text-based protocols.
On many, but not all, Unix-like/POSIXy systems, one can use the TIOCOUTQ (also SIOCOUTQ) ioctl to determine how much data is left in the outgoing buffer.
For TCP sockets, even if the other end has shut down its write side (and therefore will send no more data to this end), all transmissions are acknowledged. The data in the outgoing buffer is only removed when the acknowledgement from the recipient kernel is received. Thus, when there is no more data in the outgoing buffer, we know that the kernel at the other end has received the data.
Unfortunately, this does not mean that the application has received and processed the data. This same limitation applies to all methods that rely on socket state; this is also the reason why fundamentally, the acknowledgement of receipt/acceptance of the final status line must come from the other application, and cannot be automatically detected.
This, in turn, means that neither end can shut down their sending sides before the very final receipt/acknowledge message. You cannot rely on TCP -- or any other protocols' -- automatic socket state management. You must bake in the critical receipts/acknowledgements into the stream protocol itself.
In OP's case, the stream protocol seems to be simple line-based text. This is quite useful and easy to parse. One robust way to "extend" such a protocol is to reserve the first character of each line for the status code (or alternatively, reserve certain one-character lines as acknowledgements).
For large in-flight binary protocols (i.e., protocols where the sender and receiver are not really in sync), it is useful to label each data frame with an increasing (cyclic) integer, and have the other end respond, occasionally, with an update to let the sender know which frames have been completely processed, and which ones received, and whether additional frames should arrive soon/not-very-soon. This is very useful for network-based appliances that consume a lot of data, with the data provider wishing to be kept updated on the progress and desired data rate (think 3D printers, CNC machines, and so on, where the contents of the data changes the maximum acceptable data rate dynamically).
Okay so I recall pulling my hair out trying to solve this very problem back in the late 90's. I finally found an obscure doc that stated that a read call to a disconnected socket will return a 0. I use this fact to this day.
You're probably better off using ZeroMQ. That will send a whole message, or no message at all. If you set it's send buffer length to 1 (the shortest it will go) you can test to see if the send buffer is full. If not, the message was successfully transferred, probably. ZeroMQ is also really nice if you have an unreliable or intermittent network connection as part of your system.
That's still not entirely satisfactory. You're probably even better off implementing your own send acknowledge mechanism on top of ZeroMQ. That way you have absolute proof that a message was received. You don't have proof that a message was not received (something can go wrong between emitting and receiving the ack, and you cannot solve the Two Generals Problem). But that's the best that can be achieved. What you'll have done then is implement a Communicating Sequential Processes architecture on top of ZeroMQ's Actor Model which is itself implemented on top of TCP streams.. Ultimately it's a bit slower, but your application has more certainty of knowing what's gone on.
I'm attempting to write a simple server using C system calls that takes unknown byte streams from unknown clients and executes specific actions depending on client input. For example, the client will send a command "multiply 2 2" and the server will multiply the numbers and return the result.
In order to avoid errors where the server reads before the client has written, I have a blocking recv() call to wait for any data using MSG_PEEK. When recv detects data to be read, I move onto non-blocking recv()'s that read the stream byte by byte.
Everything works except in the corner case where the client sends no data (i.e. write(socket, "", 0); ). I was wondering how exactly I would detect that a message with no data is sent. In this case, recv() blocks forever.
Also, this post pretty much sums up my problem, but it doesn't suggest a way to detect a size 0 packet.
What value will recv() return if it receives a valid TCP packet with payload sized 0
When using TCP at the send/recv level you are not privy to the packet traffic that goes into making the stream. When you send a nonzero number of bytes over a TCP stream the sequence number increases by the number of bytes. That's how both sides know where the other is in terms of successful exchange of data. Sending multiple packets with the same sequence number doesn't mean that the client did anything (such as your write(s, "", 0) example), it just means that the client wants to communicate some other piece of information (for example, an ACK of data flowing the other way). You can't directly see things like retransmits, duplicate ACKs, or other anomalies like that when operating at the stream level.
The answer you linked says much the same thing.
Everything works except in the corner case where the client sends no data (i.e. write(socket, "", 0); ).
write(socket, "", 0) isn't even a send in the first place. It's just a local API call that does nothing on the network.
I was wondering how exactly I would detect that a message with no data is sent.
No message is sent, so there is nothing to detect.
In this case, recv() blocks forever.
I agree.
I have a blocking recv() call to wait for any data using MSG_PEEK. When recv detects data to be read, I move onto non-blocking recv()'s that read the stream byte by byte.
Instead of using recv(MSG_PEEK), you should be using select(), poll(), or epoll() to detect when data arrives, then call recv() to read it.
I have a client-server application where each side communicate with the other via TCP socket.
I properly establish the connection and then I crash the server BEFORE any data is written on the socket by the client.
What I see is that the first write() attempt (client-side) is successful and it returns the actual number of written bytes, while the following ones return (as I expected) -1 (receiving a SIGPIPE) and errno=EPIPE.
Why the first write() is successful even if the socket is already closed?
EDIT
Sometimes also the following write() have a positive return values, as if everything goes well.
You're confused by what the return value of write() means. It doesn't mean, "the peer got the data and acknowledged it". Instead, it means, "I buffered so-many bytes to send to the peer and they're my responsibility now, so you can forget about them (and I don't have any pending errors)".
That is, if the TCP stack accepts the write and returns n bytes, that doesn't mean they've been written yet, just queued for writing. It'll take some time, perhaps 30s after it starts sending network traffic, before the stack gives up and returns an error to you. During that time, you could have done several calls to write() which were successful at queueing data for sending. (The write error will be returned in c.30s if the peer has vanished, or immediately if the peer can be contacted and sends a RST packet straight away to indicate the connection is dead.)
This has to do with how TCP/IP works, that can be roughly described as two mostly independent half-connections. When you close the socket at the server, the client is told that it will not receive further data from the C<-S half-connection, waking up read() immediatly, but not about the C->S direction. It only gets a reply resetting the connection after it tries to send some data. I recommend the TCP/IP Guide for further details.
The reason why sometimes you can write() twice is that you write faster than the round-trip time and can squeeze a second write() before the reply to the first one.
I'm using the following method to detect a disconnected server condition:
After getting the select() timeout on a socket (nothing was received, though was supposed to),
the 'system("ping -c 1 -w 1 server");' command is activated.
If the server is up and just lagging, the ping command will return in less than 0.1 seconds.
Otherwise (the server is down), the ping command will return in 1 second.
First, a little bit of context to explain why I am on the "UDP sampling" route:
I would like to sample data produced at a fast rate for an unknown period of time. The data I want to sample is on another machine than the one consuming the data. I have a dedicated Ethernet connection between the two so bandwidth is not an issue. The problem I have is that the machine consuming the data is much slower than the one producing it. An added constraint is that while it's ok that I don't get all the samples (they are just samples), it is mandatory that I get the last one.
My 1st solution was to make the data producer send a UDP datagram for each produced sample and let the data consumer try to get the samples it could and let the others be discarded by the socket layer when the UDP socket is full. The problem with this solution is that when new UDP datagrams arrive and the socket is full, it is the new datagrams that get discarded and not the old ones. Therefore I am not guarantueed to have the last one!
My question is: is there a way to make a UDP socket replace old datagrams when new arrive?
The receiver is currently a Linux machine, but that could change in favor of another unix-like OS in the future (windows may be possible as it implements BSD sockets, but less likely)
The ideal solution would use widespread mecanisms (like setsockopt()s) to work.
PS: I thought of other solutions but they are more complex (involve heavy modification of the sender), therefore I would like first to have a definite status on the feasability of what I ask! :)
Updates:
- I know that the OS on the receiving machine can handle the network load + reassembly of the traffic generated by the sender. It's just that its default behaviour is to discard new datagrams when the socket buffer is full. And because of the processing times in the receiving process, I know it will become full whatever I do (wasting half of the memory on a socket buffer is not an option :)).
- I really would like to avoid having an helper process doing what the OS could have done at packet-dispatching time and waste resource just copying messages in a SHM.
- The problem I see with modifying the sender is that the code which I have access to is just a PleaseSendThisData() function, it has no knowledge that it can be the last time it is called before a long time, so I don't see any doable tricks at that end... but I'm open to suggestions! :)
If there are really no way to change the UDP receiving behaviour in a BSD socket, then well... just tell me, I am prepared to accept this terrible truth and will start working on the "helper process" solution when I go back to it :)
Just set the socket to non-blocking, and loop on recv() until it returns < 0 with errno == EAGAIN. Then process the last packet you got, rinse and repeat.
I agree with "caf".
Set the socket to a non-blocking mode.
Whenever you receive something on the socket - read in a loop until nothing more is left. Then handle the last read datagram.
Only one note: you should set a large system receive buffer for the socket
int nRcvBufSize = 5*1024*1024; // or whatever you think is ok
setsockopt(sock, SOL_SOCKET, SO_RCVBUF, (char*) &nRcvBufSize, sizeof(nRcvBufSize));
This will be difficult to get completely right just on the listener side since it could actually miss the last packet in the Network Interface Chip, which will keep your program from ever having had a chance at seeing it.
The operating system's UDP code would be the best place to try to deal with this since it will get new packets even if it decides to discard them because it already has too many queued up. Then it could make the decision of dropping an old one or dropping a new one, but I don't know how to go about telling it that this is what you would want it to do.
You can try to deal with this on the receiver by having one program or thread that always tries to read in the newest packet and another that always tries to get that newest packet. How to do this would differ based on if you did it as two separate programs or as two threads.
As threads you would need a mutex (semaphore or something like it) to protect a pointer (or reference) to a structure used to hold 1 UDP payload and whatever else you wanted in there (size, sender IP, sender port, timestamp, etc).
The thread that actually read packets from the socket would store the packet's data in a struct, acquire the mutex protecting that pointer, swap out the current pointer for a pointer to the struct it just made, release the mutex, signal the processor thread that it has something to do, and then clear out the structure that it just got a pointer to and use it to hold the next packet that comes in.
The thread that actually processed packet payloads should wait on the signal from the other thread and/or periodically (500 ms or so is probably a good starting point for this, but you decide) and aquire the mutex, swap its pointer to a UDP payload structure with the one that is there, release the mutex, and then if the structure has any packet data it should process it and then wait on the next signal. If it did not have any data it should just go ahead and wait on the next signal.
The processor thread should probably run at a lower priority than the UDP listener so that the listener is less likely to ever miss a packet. When processing the last packet (the one you really care about) the processor will not be interrupted because there are no new packets for the listener to hear.
You could extend this by using a queue rather than just a single pointer as the swapping place for the two threads. The single pointer is just a queue of length 1 and is very easy to process.
You could also extend this by attempting to have the listener thread detect if there are multiple packets waiting and only actually putting the last of those into the queue for the processor thread. How you do this will differ by platform, but if you are using a *nix then this should return 0 for sockets with nothing waiting:
while (keep_doing_this()) {
ssize_t len = read(udp_socket_fd, my_udp_packet->buf, my_udp_packet->buf_len);
// this could have been recv or recvfrom
if (len < 0) {
error();
}
int sz;
int rc = ioctl(udp_socket_fd, FIONREAD, &sz);
if (rc < 0) {
error();
}
if (!sz) {
// There aren't any more packets ready, so queue up the one we got
my_udp_packet->current_len = len;
my_udp_packet = swap_udp_packet(my_ucp_packet);
/* swap_udp_packet is code you would have to write to implement what I talked
about above. */
tgkill(this_group, procesor_thread_tid, SIGUSR1);
} else if (sz > my_udp_packet->buf_len) {
/* You could resize the buffer for the packet payload here if it is too small.*/
}
}
A udp_packet would have to be allocated for each thread as well as 1 for the swapping pointer. If you use a queue for swapping then you must have enough udp_packets for each position in the queue -- since the pointer is just a queue of length 1 it only needs 1.
If you are using a POSIX system then consider not using a real time signal for the signaling because they queue up. Using a regular signal will allow you to treat being signaled many times the same as being signaled just once until the signal is handled, while real time signals queue up. Waking up periodically to check the queue also allows you to handle the possibility of the last signal arriving just after you have checked to see if you had any new packets but before you call pause to wait on a signal.
Another idea is to have a dedicated reader process that does nothing but loops on the socket and reads incoming packets into circular buffer in shared memory (you'll have to worry about proper write ordering). Something like kfifo. Non-blocking is fine here too. New data overrides old data. Then other process(es) would always have access to latest block at the head of the queue and all the previous chunks not yet overwritten.
Might be too complicated for a simple one-way reader, just an option.
I'm pretty sure that this is a provably insoluble problem closely related to the Two Army Problem.
I can think of a dirty solution: establish a TCP "control" sideband connection which carries the last packet which is also a "end transmission" indication. Otherwise you need to use one of the more general pragmatic means noted in Engineering Approaches.
This is an old question, but you are basically wanting to turn the socket queue (FIFO) into a stack (LIFO). It's not possible, unless you want to fiddle with the kernel.
You'll need to move the datagrams from kernel space to user space and then process. Easiest approach would be a loop like this...
Block until there is data on the socket (see select, poll, epoll)
Drain the socket, storing datagrams per your own selection policy
Process the stored datagrams
Repeat