Is sending data via UDP sockets on the same machine reliable? - c

If i use UDP sockets for interprocess communication, can i expect that all send data is received by the other process in the same order?
I know this is not true for UDP in general.

No. I have been bitten by this before. You may wonder how it can possibly fail, but you'll run into issues of buffers of pending packets filling up, and consequently packets will be dropped. How the network subsystem drops packets is implementation-dependent and not specified anywhere.

In short, no. You shouldn't be making any assumptions about the order of data received on a UDP socket, even over localhost. It might work, it might not, and it's not guaranteed to.

No, there is no such guarantee, even with local sockets. If you want an IPC mechanism that guraantees in-order delivery you might look into using full-duplex pipes with popen(). This opens a pipe to the child process that either can read or write arbitrarily. It will guarantee in-order delivery and can be used with synchronous or asynchronous I/O (select() or poll()), depending on how you want to build the application.
On unix there are other options such as unix domain sockets or System V message queues (some of which may be faster) but reading/writing from a pipe is dead simple and works. As a bonus it's easy to test your server process because it is just reading and writing from Stdio.
On windows you could look into Named Pipes, which work somewhat differently from their unix namesake but are used for precisely this sort of interprocess communication.

Loopback UDP is incredibly unreliable on many platforms, you can easily see 50%+ data loss. Various excuses have been given to the effect that there are far better transport mechanisms to use.
There are many middleware stacks available these days to make IPC easier to use and cross platform. Have a look at something like ZeroMQ or 29 West's LBM which use the same API for intra-process, inter-process (IPC), and network communications.

The socket interface will probably not flow control the originator of the data, so you will probably see reliable transmission if you have higher level flow control but there is always the possibility that a memory crunch could still cause a dropped datagram.
Without flow control limiting kernel memory allocation for datagrams I imagine it will be just as unreliable as network UDP.

Related

Message Passing between Processes without using dedicated functions

I'm writing a C program in which I need to pass messages between child processes and the main process. But the thing is, I need to do it without using the functions like msgget() and msgsnd()
How can I implement? What kind of techniques can I use?
There are multiple ways to communicate with children processes, it depends on your application.
Very much depends on the level of abstraction of your application.
-- If the level of abstraction is low:
If you need very fast communication, you could use shared memory (e.g. shm_open()). But that would be complicated to synchronize correctly.
The most used method, and the method I'd use if I were in your shoes is: pipes.
It's simple, fast, and since pipes file descriptors are supported by epoll() and those kind of asynchronous I/O APIs, you can take advantage from this fact.
Another plus is that, if your application grows, and you need to communicate with remote processes (processes that are not in your local machine), adapting pipes to sockets is very easy, basically it's still the same reading/writing from/to a file descriptor.
Also, Unix-domain sockets (which in other platforms are called "named pipes") let you to have a server process that creates a listening socket with a very well known name (e.g. an entry in the filesystem, such as /tmp/my_socket) and all clients in the local machine can connect to that.
Pipes, networking sockets, or unix-domain sockets are very interchangeable solutions, because - as said before - all involve reading/writing data from/to a file descriptor, so you can reuse the code.
The disadvantage with a file descriptor is that you're writing data to a stream of bytes, so you need to implement the "message streaming protocol" of your messages by yourself, to "unstream" your messages (marshalling/unmarshalling), but that's not so complicated in the most of the cases, and that also depends on the kind of messages you're sending.
I'd pass on other solutions such as memory mapped files and so on.
-- If the level of abstraction is higher:
You could use a 3rd party message passing system, such as RabbitMQ, ZMQ, and so on.

Best Way To Receive/Process High Amounts Of Packets/Traffic Via AF_PACKET Socket + EPoll Questions

I've made a test C program that creates an AF_PACKET socket, creates x amount of threads via pthreads, and within each thread performs epoll on the socket's file descriptor. This program was made for Linux and I've compiled it using GCC on Ubuntu 18.04. I've submitted a GitHub Gist of the program here since it's 200+ lines of code. I am still fairly new to C and network programming. Therefore, I'm sure there are many improvements I can make to the code. I am open to suggestions!
I have two main questions:
Is there a better way to receive and process high amounts of packets/traffic in a user space program than the above? I've read using pthreads along with epoll would be the best option, but I've also looked into select and standard poll.
When the program above is executed without any debug output via fprintf(), each thread consumes 100% CPU on the epoll_wait() function within the while loop. Is this normal behavior or am I using epoll incorrectly? I've looked at some other examples and I use epoll the same way as the examples do. I've taken a look at the manual page for epoll and I believe I'm using it correctly in my case. I've also tried setting a timeout for the epoll_wait() function, but it was still consuming 100% CPU per thread (which I'd expect due to the while loop).
I plan to make a program that will redirect traffic after inspecting the traffic and I expect a lot of incoming packets which is why I wanted to see if there is a better way to receive and process high amounts of packets. I also understand I could just use standard SOCK_DGRAM or SOCK_STREAM sockets and bind them to an IP and port. However, I do want to process and inspect all incoming traffic to an interface and forward traffic if necessary (e.g. if the destination address matches a forwarding rule). I also wasn't sure if I should make multiple sockets in this case (perhaps a socket per thread). I did do this initially, but it resulted in unexpected behavior and it was only ever reading from one socket descriptor anyways. Perhaps I wasn't creating the new sockets properly.
Any help is highly appreciated and if you need any more information, please let me know.
Thank you for your time.

Reading multiple UDP messages without polling

I would like to use the recvmmsg call to read multiple UDP messages from ONE single socket at once. I'm reading data from a single multicast group.
When I read TCP data, I usually use poll/select with a non-blocking socket (and timeout) to be notified when that is ready to be read. I follow this approach as I am aware of the issue of spurious wakeup and potential troubles of having a blocking socket.
As my application must be very quick, if I follow the same approach with recvmmsg I will introduce an extra system call (poll/select) that might slow down the execution.
So my two questions are the following:
With UDP, can I safely read from BLOCKING sockets using recvmmsg without poll/select or do I have to apply the same principle I've used for TCP (non-blocking+poll)?
Suppose I have a huge amount of multicast traffic, would you go for non-blocking socket + recvmmsg only (no poll) and burn a lot of CPU?
I am using Linux: CentOS 7 and Oracle Linux.
You can always use blocking mode, with both TCP and UDP sockets.
If you want to impose a read timeout there is setsockopt() with the SO_RCVTIMEO option.
I follow this approach as I am aware of the issue of spurious wakeup
What spurious wakeup? Never seen it in 25 years of network programming.
and potential troubles of having a blocking socket.
Never heard of those either.
Using select() and non-blocking mode with a single socket is pointless unless your platform doesn't support SO_RCVTIMEO. It's an extra system call, for a start.
The option of using blocking or non-blocking depends on what is the final purpose of the application.
- Say it's just a sample chat application showing the usage of UDP combined with TCP then you can use either.
- But if you are planning to make this module a part of highly used application with lots of data flowing then probably creating multiple threads/processes to handle different tasks will come in handy. The parent thread will to wait for the message but for processing it will spawn a different child thread and hence make the parent available for the next message.
But in a nutshell I don't see any issue with your first option of using a blocking socket without poll/select for a UDP application considering it's just for homework purposes.

Does recv remove packets from pcaps buffer?

Say there are two programs running on a computer (for the sake of simplification, the only user programs running on linux) one of which calls recv(), and one of which is using pcap to detect incoming packets. A packet arrives, and it is detected by both the program using pcap, and by the program using recv. But, is there any case (for instance recv() returning between calls to pcap_next()) in which one of these two will not get the packet?
I really don't understand how the buffering system works here, so the more detailed explanation the better - is there any conceivable case in which one of these programs would see a packet that the other does not? And if so, what is it and how can I prevent it?
AFAIK, there do exist cases where one would receive the data and the other wouldn't (both ways). It's possible that I've gotten some of the details wrong here, but I'm sure someone will correct me.
Pcap uses different mechanisms to sniff on interfaces, but here's how the general case works:
A network card receives a packet (the driver is notified via an interrupt)
The kernel places that packet into appropriate listening queues: e.g.,
The TCP stack.
A bridge driver, if the interface is bridged.
The interface that PCAP uses (a raw socket connection).
Those buffers are flushed independently of each other:
As TCP streams are assembled and data delivered to processes.
As the bridge sends the packet to the appropriate connected interfaces.
As PCAP reads received packets.
I would guess that there is no hard way to guarantee that both programs receive both packets. That would require blocking on a buffer when it's full (and that could lead to starvation, deadlock, all kinds of problems). It may be possible with interconnects other than Ethernet, but the general philosophy there is best-effort.
Unless the system is under heavy-load however, I would say that the loss rates would be quite low and that most packets would be received by all. You can decrease the risk of loss by increasing the buffer sizes. A quick google search tuned this up, but I'm sure there's a million more ways to do it.
If you need hard guarantees, I think a more powerful model of the network is needed. I've heard great things about Netgraph for these kinds of tasks. You could also just install a physical box that inspects packets (the hardest guarantee you can get).

Best way to pass data between two servers in C?

I wrote a program that creates a TCP and UDP socket in C and starts both servers up. The goal of the application is to monitor requests over the TCP socket as to what UDP packets to send it (i.e. monitor for something like "0x01 0x02" and if I see it, then have the UDP server parse the payload, and forward it over to the TCP server for processing). The problem is, the UDP server will be busy keeping another device up, literally sending thousands of packets back and forth with this device. So what is the best way to continuously monitor requests from the TCP server, but send it certain payloads from the UDP server when requested since the UDP server will be busy?
I looked into pthreads with semaphores and/or mutex (not sure all the socket operations are thread safe, though, and if this is the right way to approach it) as well as fork / pipe. Forking the UDP server off as a child process seems easy enough, but I don't see exactly how I would be passing the kind of data I need among both servers (need request data from TCP and payload data from the UDP).
Firstly, would it make sense to put these two servers into one program? If so, you won't have to communicate between processes, and the whole logic becomes substantially easier. You will have to think about doing asynchronous input and output, and the select() function is designed for just this. There will be many explanations around on how to do this, and a quick look finds this page.
However, if you must have two separate processes, then you will need to choose a mechanism for inter-process communication, of which there are several, and your choice will be affected by your operating system. A pipe, if available, might be suitable, as might a Unix named pipe. Or you could look into third-party message passing frameworks, or just use shared memory and/or semaphores (but be very careful!).
What you should look at is libevent, anything else you are reinventing the wheel writing this low level code yourself. Here is a Tutorial, Google, Krugle
Also you should use some predefined protocol between the servers. There are lots to choose from. Ranging from the extremely simple XDR to Protocol Buffers.
You could use pipes on Unix. See http://tldp.org/LDP/lpg/node11.html
Well, you certainly picked an interesting introduction to C!
You might try shared memory. What OS?

Resources