For many reasons, I would like to use unix domain sockets for IPC between two processes.
Each process reacts to asynchronous events of some specific kind from the outside world by writing to the socket, communicating the event to the second process, and, at the same time, each process also needs to read the data coming from the other process to do its own work. In essence, in this model there would be one socket and two threads per process: one for possibly blocking reads, and one for writes.
I would like to know whether it is possible to use unix domain sockets for concurrently reading and writing from/to each process independently, without making use of any explicit locking, on the assumption that safety is implicitly guaranteed by this kind of socket. If so, I'd also like to know where this guarantee is officially stated.
The only relevant difference between an AF_LOCAL socket and an AF_INET socket is that AF_LOCAL sockets are local to the current computer. Creating an AF_LOCAL socket and binding it is no different from creating an AF_INET socket and binding it to localhost.
The path used for binding AF_LOCAL sockets is only used for connecting to the socket, nothing else.
So if you create a connection-oriented AF_LOCAL socket (using SOCK_STREAM or SOCK_SEQPACKET) then each connection is unique and you can have multiple processes connecting through the same listening (passive) AF_LOCAL socket.
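To make that concrete, here is a minimal sketch of one process's side of the two-thread model from the question, assuming a SOCK_STREAM AF_LOCAL socket and an assumed rendezvous path /tmp/ipc.sock. The two directions of a connected stream socket are independent, and with one thread reading and one thread writing on the same descriptor no application-level lock is needed (POSIX specifies read() and write() as thread-safe):

    /* One process's side of the model: a reader thread blocks on read()
     * while the main thread writes. The socket path is an assumption. */
    #include <pthread.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/un.h>
    #include <unistd.h>

    static void *reader(void *arg)
    {
        int fd = *(int *)arg;
        char buf[256];
        ssize_t n;
        while ((n = read(fd, buf, sizeof buf)) > 0)  /* possibly blocking reads */
            printf("received %zd bytes\n", n);
        return NULL;
    }

    int main(void)
    {
        int fd = socket(AF_LOCAL, SOCK_STREAM, 0);
        struct sockaddr_un sa = { .sun_family = AF_LOCAL };
        strcpy(sa.sun_path, "/tmp/ipc.sock");        /* assumed rendezvous path */
        if (connect(fd, (struct sockaddr *)&sa, sizeof sa) < 0)
            return 1;

        pthread_t t;
        pthread_create(&t, NULL, reader, &fd);

        /* The writer needs no lock against the reader: they use the two
         * independent directions of the same connected socket. */
        const char msg[] = "event";
        write(fd, msg, sizeof msg);

        pthread_join(t, NULL);
        close(fd);
        return 0;
    }

The peer process runs the mirror image of this: its writer feeds this process's reader, and vice versa.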
I am implementing a reliable, connection-oriented application over UDP (uni assignment). It is required that the server, after it receives an initial packet from the client and acknowledges it, creates a new worker process. The worker process shall have a socket dedicated to this particular client.
Now I see three ways of implementing this.
The server has a socket, let's call it listener_socket, which is bound to the specified port and waits for requests. I could, then, in the child process, connect() this very socket to the client's address.
Or, I could close listener_socket in the child altogether, then open a brand new socket connection_socket, bind it to the port and connect it to the client.
Or, open a new socket on a new port and code the client to deal with this new port for the remaining duration of the connection instead.
What I am uncertain about (regarding option 1) is how connecting the listener_socket to the client affects the original listener_socket in the parent server. Will it prevent the parent from receiving further messages from other clients? If not, why? Don't they both ultimately refer to the same socket?
As for option 2, this gives me, quite expectedly, "address already in use". Is there some functionality like in routers (i.e. longest matching prefix) that delivers datagrams to the "most fitting" socket? (Well, TCP already does this in accept(), so it's quite unlikely that this logic could be replicated for UDP.)
Regarding option 3, I think it's quite inefficient, because it implies that I should use a new port for each client.
So could someone please advise on which method to use, or if there is another way I'm not yet aware of?
Thanks.
What I am uncertain about (regarding option 1) is how connecting the listener_socket to the client affects the original listener_socket in the parent server. Will it prevent the parent from receiving further messages from other clients?
Yes.
If not, why? Don't they both refer ultimately to the same socket?
Yes, when the parent forks the child, the child receives copies of all the parent's open file descriptors, and these refer to open file descriptions managed by the kernel. Those open file descriptions are thus shared between parent and child. Therefore, if the child connect()s the socket to a client, then both parent and child will see the effect that
If the socket sockfd is of type SOCK_DGRAM, then [the specified address] is the address to which datagrams are sent by default, and the only address from which datagrams are received.
(Linux manual page; emphasis added)
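A compressed sketch of that interaction (assumed port, error handling omitted): after fork(), the child's connect() acts on the shared open file description, so it also changes which datagrams the parent will subsequently receive.

    /* After fork(), parent and child share one open file description,
     * so the child's connect() narrows the parent's socket as well. */
    #include <netinet/in.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = socket(AF_INET, SOCK_DGRAM, 0);
        struct sockaddr_in srv = { .sin_family = AF_INET,
                                   .sin_port = htons(5000) };  /* assumed port */
        bind(fd, (struct sockaddr *)&srv, sizeof srv);

        char buf[512];
        struct sockaddr_in cli;
        socklen_t len = sizeof cli;
        recvfrom(fd, buf, sizeof buf, 0, (struct sockaddr *)&cli, &len);

        if (fork() == 0) {
            /* Child: this connect() acts on the shared file description,
             * so from now on the PARENT also receives datagrams only
             * from cli and sends to cli by default. */
            connect(fd, (struct sockaddr *)&cli, len);
            _exit(0);
        }
        /* Parent: recvfrom() here no longer sees other clients. */
        return 0;
    }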
As for option 2, this gives me, quite expectedly, "address already in use". Is there some functionality like in routers (i.e. longest matching prefix) that delivers datagrams to the "most fitting" socket?
What would make one socket more fitting than the other? In any case, no. A UDP endpoint is identified by address and port alone.
Regarding option 3, I think it's quite inefficient, because it implies that I should use a new port for each client.
Inefficient relative to what? Yes, your option (3) would require designating a separate port for each client, but you haven't presented any other viable alternatives.
But that's not to say that there aren't other viable alternatives. If you don't want to negotiate a separate port and open a separate socket per client (which is one of the ways that FTP can operate, for example) then you cannot rely on per-client UDP sockets. In that case, all incoming traffic to the service must go to the same socket. But you can have the process receiving messages on that socket dispatch each one to an appropriate cooperating process, based on the message's source address and port. And you should be able to have those processes all send responses via the same socket.
There are numerous ways that the details of such a system could be set up. They all do have the limitation that the one process receiving all the messages could be a bottleneck, but in practice, we're talking about I/O. It is very likely that the main bottleneck would be at the network level, not in the receiving process.
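As a sketch of that dispatch pattern: find_worker() and forward_to_worker() below are hypothetical helpers (say, a hash table keyed on the source sockaddr, and a pipe to the chosen worker); the receive-and-route loop is the point.

    /* find_worker() and forward_to_worker() are hypothetical helpers:
     * the first maps a client (address, port) pair to a worker process,
     * the second hands the datagram over, e.g. through a pipe. */
    #include <netinet/in.h>
    #include <sys/socket.h>
    #include <sys/types.h>

    struct worker;                                           /* opaque per-client state */
    struct worker *find_worker(const struct sockaddr_in *src);            /* hypothetical */
    void forward_to_worker(struct worker *w, const char *buf, ssize_t n); /* hypothetical */

    void dispatch_loop(int sockfd)
    {
        char buf[2048];
        struct sockaddr_in src;

        for (;;) {
            socklen_t len = sizeof src;
            ssize_t n = recvfrom(sockfd, buf, sizeof buf, 0,
                                 (struct sockaddr *)&src, &len);
            if (n < 0)
                continue;
            /* Route by source address and port; the workers can send
             * their replies with sendto() on this same sockfd. */
            forward_to_worker(find_worker(&src), buf, n);
        }
    }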
The socket in the child process after a fork refers to the same kernel socket as the one in the parent. Since binding and connecting are done on the kernel socket, any changes made through the child's socket will affect the parent's socket too. It is therefore necessary to create a new (independent) connected socket.
The new socket needs to be bound to the same local address and port as the listener socket, though, since this is where the peer expects packets to come from in order to keep the same UDP association.
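A sketch of that approach, assuming IPv4 and that the listener socket also sets SO_REUSEADDR (on some systems the duplicate bind needs SO_REUSEPORT instead):

    /* server_addr is the local address/port the listener is bound to,
     * client_addr the peer recorded from recvfrom(). */
    #include <netinet/in.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int make_client_socket(const struct sockaddr_in *server_addr,
                           const struct sockaddr_in *client_addr)
    {
        int fd = socket(AF_INET, SOCK_DGRAM, 0);
        int one = 1;
        setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof one);

        /* Bind to the SAME local address/port as the listener, so the
         * peer keeps seeing replies from the address it first contacted. */
        if (bind(fd, (const struct sockaddr *)server_addr, sizeof *server_addr) < 0 ||
            /* connect() narrows this socket to one peer; the kernel then
             * prefers it over the unconnected listener for that peer. */
            connect(fd, (const struct sockaddr *)client_addr, sizeof *client_addr) < 0) {
            close(fd);
            return -1;
        }
        return fd;
    }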
I'm trying to write a server/client pair to run over UDP, and the only way I've been able to get it going is having the server aware of the client's IP and port before the connection starts. My new design involves waiting for a packet to come in, recording the sender address, and forking a child process (the parent loops around and continues listening), which then connect()s to the client that transmitted the packet. The child should then only receive packets from the associated client, but the documentation is unclear on whether the parent socket will continue to receive traffic from that client. I'm working on a program to try it, but I figured I could ask the question at the same time.
EDIT: It seems that when the child's socket is connect()ed, it connects the parent's socket too.
UDP does not operate with connections; it is a connectionless protocol. It is enough for one side to listen and the other side to just send datagrams for the data channel to work.
On the question (sorry, I didn't get the point before): forking is not the way out when working with UDP. That technique is widely used with connection-based protocols. It is possible there because:
you can fork right after listen()
the first process that accepts a connection works with it (and only that process possesses the newly created connected socket).
When you work with UDP you don't have such a gap (like the one before accept() with TCP) in which to fork, especially when you have an intensive datagram flow.
So, when you design a UDP service, you need to either
use non-blocking I/O with an event loop (sketched below), or
design a threaded solution.
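For the first option, a minimal sketch of an echo loop on one non-blocking UDP socket driven by epoll (Linux-specific, purely illustrative):

    #include <fcntl.h>
    #include <netinet/in.h>
    #include <sys/epoll.h>
    #include <sys/socket.h>

    void udp_event_loop(int sockfd)
    {
        fcntl(sockfd, F_SETFL, fcntl(sockfd, F_GETFL, 0) | O_NONBLOCK);

        int ep = epoll_create1(0);
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = sockfd };
        epoll_ctl(ep, EPOLL_CTL_ADD, sockfd, &ev);

        for (;;) {
            struct epoll_event events[16];
            int ready = epoll_wait(ep, events, 16, -1);
            for (int i = 0; i < ready; i++) {
                char buf[2048];
                struct sockaddr_in src;
                socklen_t len = sizeof src;
                ssize_t n;
                /* Drain everything that is ready: each datagram carries
                 * its own source address, so one socket serves all clients. */
                while ((n = recvfrom(events[i].data.fd, buf, sizeof buf, 0,
                                     (struct sockaddr *)&src, &len)) > 0) {
                    sendto(events[i].data.fd, buf, n, 0,   /* echo it back */
                           (struct sockaddr *)&src, len);
                    len = sizeof src;
                }
            }
        }
    }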
I'm racking my brain trying to understand how to make a client/server I wrote myself accept multiple socket connections.
The connection is datagram-based (UDP); for now it is implemented following the getaddrinfo(3) man page and works nicely, but each client has to wait for earlier connections to finish being processed.
I've heard about select, but its man page says:
select() can be used to solve many problems in a portable and efficient way that naive programmers try to solve in a more complicated manner using threads, forking, IPCs, signals, memory sharing, and so on.
and more:
The Linux-specific epoll(7) API provides an interface that is more efficient than select(2) and poll(2) when monitoring large numbers of file descriptors.
So, is it? Is epoll simply better than select? Or does it depend? If it depends, on what?
The epoll man page has a partial sample, so I'm trying to understand it.
Right now, on the server, I think I need one thread to listen and another to write. But how do I track the completion of a partial message? If two clients send partial messages interleaved, how do I tell them apart? By the sockaddr? If that's all it takes, I can manage it without a pool, so why use epoll?
Can anyone try to explain to me how to build, or where to learn about, a multi-connection client/server UDP app?
I think there is a misunderstanding here about UDP. UDP is not a connection-oriented protocol, which means there is no permanent connection like in TCP. UDP just binds to an address/port and waits for packets from everyone. On the server there is only one socket listening per address/port. When a packet is received you can find out who the sender is from the packet's source IP, and you can reply to the sender through that IP.
As I see it, there is no need for poll() or select(): you bind to an address/port and receive packets asynchronously. That is, when a packet is received you get a signal/message alerting your asynchronous function. This function should be reentrant, meaning that in the middle of one reception another signal could be received, so care must be taken when accessing or modifying global state (variables/objects). An incoming packet should be processed as soon as possible or, if processing takes too long, kept in a packet spool and handled in another, lower-priority thread.
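A sketch of that signal-driven setup on Linux, with an assumed port: the socket is put into O_ASYNC mode so SIGIO fires when a packet arrives, and the handler drains the queue with non-blocking recvfrom() (recvfrom() and sendto() are async-signal-safe, which is what makes this workable):

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <netinet/in.h>
    #include <signal.h>
    #include <sys/socket.h>
    #include <unistd.h>

    static int sockfd;

    static void on_sigio(int sig)
    {
        (void)sig;
        char buf[2048];
        struct sockaddr_in src;
        socklen_t len = sizeof src;
        ssize_t n;
        /* Async-signal-safe calls only; drain all queued datagrams. */
        while ((n = recvfrom(sockfd, buf, sizeof buf, MSG_DONTWAIT,
                             (struct sockaddr *)&src, &len)) > 0) {
            sendto(sockfd, buf, n, 0,                 /* reply to the sender */
                   (struct sockaddr *)&src, len);
            len = sizeof src;
        }
    }

    int main(void)
    {
        sockfd = socket(AF_INET, SOCK_DGRAM, 0);
        struct sockaddr_in addr = { .sin_family = AF_INET,
                                    .sin_port = htons(7000) }; /* assumed port */
        bind(sockfd, (struct sockaddr *)&addr, sizeof addr);

        signal(SIGIO, on_sigio);
        fcntl(sockfd, F_SETOWN, getpid());            /* deliver SIGIO to us */
        fcntl(sockfd, F_SETFL, O_ASYNC | O_NONBLOCK); /* enable signal-driven I/O */

        for (;;)
            pause();                                  /* work happens in the handler */
    }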
For UDP packet size, read this question.
For UDP fragmentation, read this.
For UDP packet headers, read this.
I have been building a multi-threaded server, with each thread having a single epoll fd to manage incoming TCP connections.
For inter-thread communication I used unix domain sockets, with the intention of leveraging the existing per-thread epoll.
But it seems that epoll stops returning network socket events if the unix domain socket is also added.
My question is: can one epoll instance be used to track events on both TCP sockets and unix domain sockets? Is this the expected behaviour? I didn't come across any literature suggesting so. Or do I need separate epoll instances to track these two different types of sockets?
epoll, poll and select were designed to monitor multiple file descriptors. They are not limited to monitoring only one type of file or socket descriptor at a time.
Can one epoll instance be used to track events on both TCP sockets and unix domain sockets?
Yes, there is no specific limit on using epoll.
Have a look at the sample epoll program at "Could you recommend some guides about Epoll on Linux".
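For completeness, a minimal sketch of one epoll instance watching a TCP listener and a unix domain listener side by side; the port and socket path are assumptions:

    #include <netinet/in.h>
    #include <string.h>
    #include <sys/epoll.h>
    #include <sys/socket.h>
    #include <sys/un.h>
    #include <unistd.h>

    int main(void)
    {
        int tcp = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in in_addr = { .sin_family = AF_INET,
                                       .sin_port = htons(8080) }; /* assumed port */
        bind(tcp, (struct sockaddr *)&in_addr, sizeof in_addr);
        listen(tcp, 16);

        int uds = socket(AF_UNIX, SOCK_STREAM, 0);
        struct sockaddr_un un_addr = { .sun_family = AF_UNIX };
        strcpy(un_addr.sun_path, "/tmp/thread.sock");              /* assumed path */
        unlink(un_addr.sun_path);
        bind(uds, (struct sockaddr *)&un_addr, sizeof un_addr);
        listen(uds, 16);

        int ep = epoll_create1(0);
        struct epoll_event ev = { .events = EPOLLIN };
        ev.data.fd = tcp;
        epoll_ctl(ep, EPOLL_CTL_ADD, tcp, &ev);
        ev.data.fd = uds;
        epoll_ctl(ep, EPOLL_CTL_ADD, uds, &ev);     /* same instance, both types */

        struct epoll_event events[8];
        for (;;) {
            int n = epoll_wait(ep, events, 8, -1);
            for (int i = 0; i < n; i++) {
                int conn = accept(events[i].data.fd, NULL, NULL);
                /* handle conn ... */
                close(conn);
            }
        }
    }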
I want to create one server and one client (two separate programs) where the server creates two named pipes (I guess that's the minimum required for a bidirectional flow of traffic) and then the client is started. Client and server should be able to send and receive data both ways all the time (full-duplex style). I think that would require me to have non-blocking named pipes. I would like some help, as I have been able to create half-duplex communication but am struggling to make a continuous, seamless transfer of data between client and server happen.
Thanks
Possible options:
Local domain sockets: the AF_LOCAL family with SOCK_STREAM, SOCK_DGRAM, or SOCK_SEQPACKET type. The socket can be "in-memory" (on Linux, the abstract namespace), meaning you connect to it using a unique string, or it can be a socket file in the filesystem. It works just like any network socket, full-duplex.
Two pipes: one for reading, one for writing (and vice versa for the other process). It might be a bit more complicated keeping track of two pipes, as opposed to a single local domain socket.
Helpful link: check out the part on Pipes and the one on Unix Sockets.
Have you considered using select() to handle the reading named pipe?
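Along those lines, here is a sketch of the server side of the two-FIFO option, using select() so that waiting for input never blocks the loop; the FIFO paths are assumptions:

    #include <fcntl.h>
    #include <sys/select.h>
    #include <unistd.h>

    int main(void)
    {
        /* Server side: reads from c2s, writes to s2c (the client swaps them).
         * Opening the write end blocks until the client opens its read end. */
        int rfd = open("/tmp/c2s.fifo", O_RDONLY | O_NONBLOCK);
        int wfd = open("/tmp/s2c.fifo", O_WRONLY);

        char buf[256];
        for (;;) {
            fd_set rset;
            FD_ZERO(&rset);
            FD_SET(rfd, &rset);
            if (select(rfd + 1, &rset, NULL, NULL, NULL) < 0)
                break;
            if (FD_ISSET(rfd, &rset)) {
                ssize_t n = read(rfd, buf, sizeof buf);
                if (n > 0)
                    write(wfd, buf, n);      /* echo back to the client */
            }
        }
        close(rfd);
        close(wfd);
        return 0;
    }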