How to bind/connect multiple UDP sockets - C

My initial UDP socket is bound to 127.0.0.1:9898.
The first time I get notified of incoming data by epoll/kqueue, I call recvfrom() and fill a struct sockaddr called peer_name that contains the peer information (ip:port).
Then I create a new UDP socket using socket(),
then I bind() this newly created socket to the same ip:port (127.0.0.1:9898) as my original socket,
then I connect my newly created socket using connect() to the peer that just sent me something. I have its address in the struct sockaddr called peer_name.
I then add my newly created socket to my epoll/kqueue vector and wait for notification.
I would expect to ONLY receive UDP frames from the peer I'm "connected to".
1/ Is netstat -a -p udp supposed to show me the IP:PORT of the peer my newly created socket is "connected to"?
2/ I'm probably doing something wrong, since after creating my new socket, this socket receives all incoming UDP packets destined for the IP:PORT I'm bound to, regardless of the source peer IP:PORT.
I would like to see a working example of what I'm trying to do :)
or any hint on what I'm doing wrong.
Thanks!

http://www.softlab.ntua.gr/facilities/documentation/unix/unix-socket-faq/unix-socket-faq-5.html
"Does doing a connect() call affect the receive behaviour of the socket?
Yes, in two ways. First, only datagrams from your "connected peer" are returned. All others arriving at your port are not delivered to you.
But most importantly, a UDP socket must be connected to receive ICMP errors. Pp. 748-749 of "TCP/IP Illustrated, Volume 2" give all the gory details on why this is so."
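To check the first point of the FAQ on your own system, here is a minimal sketch, assuming IPv4 loopback and a stack that filters on the connected peer (Linux does). The helper names udp_loopback_socket and connected_udp_filter_demo are made up for this example. A stranger and the connected peer each send one datagram; only the peer's should arrive.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a UDP socket bound to 127.0.0.1 on a kernel-chosen port and
 * store the resulting address in *out. Returns the descriptor or -1. */
static int udp_loopback_socket(struct sockaddr_in *out)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0; /* let the kernel pick an ephemeral port */
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *out = addr;
    return fd;
}

/* A stranger and the connected peer each send one datagram to a socket
 * that has been connect()ed to the peer. Returns how many datagrams the
 * socket receives (-1 on setup failure) and stores the first byte of the
 * first one in *first. */
int connected_udp_filter_demo(char *first)
{
    struct sockaddr_in rx_addr, peer_addr, stranger_addr;
    struct timeval tv = { 0, 200000 }; /* 200 ms receive timeout */
    char buf[16];
    int received = -1;
    int rx = udp_loopback_socket(&rx_addr);
    int peer = udp_loopback_socket(&peer_addr);
    int stranger = udp_loopback_socket(&stranger_addr);

    if (rx < 0 || peer < 0 || stranger < 0)
        goto out;
    if (setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv) < 0)
        goto out;
    /* From here on, rx is "connected" to peer_addr. */
    if (connect(rx, (struct sockaddr *)&peer_addr, sizeof peer_addr) < 0)
        goto out;

    sendto(stranger, "s", 1, 0, (struct sockaddr *)&rx_addr, sizeof rx_addr);
    sendto(peer, "p", 1, 0, (struct sockaddr *)&rx_addr, sizeof rx_addr);

    /* Drain rx until the receive timeout expires. */
    received = 0;
    while (recv(rx, buf, sizeof buf, 0) > 0) {
        if (received == 0)
            *first = buf[0];
        received++;
    }
out:
    if (rx >= 0)
        close(rx);
    if (peer >= 0)
        close(peer);
    if (stranger >= 0)
        close(stranger);
    return received;
}
```

Calling connected_udp_filter_demo(&c) on a stack that behaves as the FAQ describes should report a single datagram whose first byte is 'p'.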

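connect(2) on a UDP socket sets the default destination address of the socket (where the data will be sent if you use write(2) or send(2) on the socket). On Linux you can still send packets to other addresses with sendto(2) or sendmsg(2). For receiving, the Linux connect(2) man page says the connected address also becomes the only address from which datagrams are received, but that filter applies only to the connected socket itself: any other socket bound to the same ip:port remains eligible for incoming datagrams, and which socket the kernel picks is implementation-dependent.
So opening a new socket on the same port for each peer is fragile. The simpler design is a single socket: for every packet received, you look at the source address to see if it comes from an address you've seen already (and thus belongs to that logical stream) or is a new address (a new logical stream).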
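A minimal sketch of that per-packet check, assuming IPv4 peers and a small fixed-size table. The names peer_table, MAX_PEERS and peer_lookup_or_add are hypothetical, chosen only for this example.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>

#define MAX_PEERS 64 /* arbitrary limit for the sketch */

struct peer_table {
    struct sockaddr_in peers[MAX_PEERS];
    size_t count;
};

/* Two IPv4 endpoints belong to the same logical stream when both the
 * address and the port match. */
static int same_endpoint(const struct sockaddr_in *a,
                         const struct sockaddr_in *b)
{
    return a->sin_addr.s_addr == b->sin_addr.s_addr &&
           a->sin_port == b->sin_port;
}

/* Return the index of src in the table, adding it when it has not been
 * seen before. *is_new is set to 1 for a new stream, 0 for a known one.
 * Returns -1 when the table is full. */
int peer_lookup_or_add(struct peer_table *t, const struct sockaddr_in *src,
                       int *is_new)
{
    size_t i;

    for (i = 0; i < t->count; i++) {
        if (same_endpoint(&t->peers[i], src)) {
            *is_new = 0;
            return (int)i;
        }
    }
    if (t->count == MAX_PEERS)
        return -1;
    t->peers[t->count] = *src;
    *is_new = 1;
    return (int)t->count++;
}
```

After each recvfrom() on the single bound socket, pass the source address it filled in to peer_lookup_or_add() and dispatch the payload to the state kept for the returned index.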

Related

What is the reason for using UNIX sockets "zero-length datagrams"?

In recv()'s man page I found that return value can be zero in such case:
Datagram sockets in various domains (e.g., the UNIX and Internet
domains) permit zero-length datagrams. When such a datagram is
received, the return value is 0.
When or why should one use zero-length datagrams in UNIX socket intercommunication? What's its purpose?
One such use is to unblock a recvfrom() call when you wish to close a UDP service thread - set a 'terminate' flag and send the zero-length datagram on the localhost stack.
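A rough sketch of that shutdown trick, assuming an IPv4 UDP service on loopback. The flag terminate_requested and both function names are invented for the example; a real service would also have to decide what an empty datagram means when the flag is not set.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdatomic.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

static atomic_int terminate_requested; /* set by the thread that wants a shutdown */

/* Receive loop of the service thread. Returns the number of non-empty
 * datagrams handled before the shutdown request, or -1 on error. */
int serve_until_terminated(int fd)
{
    char buf[1500];
    int handled = 0;

    for (;;) {
        ssize_t n = recvfrom(fd, buf, sizeof buf, 0, NULL, NULL);
        if (n < 0)
            return -1;
        /* recvfrom() returning 0 means a zero-length datagram arrived. */
        if (n == 0 && atomic_load(&terminate_requested))
            return handled;
        if (n > 0)
            handled++; /* real payload processing would go here */
    }
}

/* Set the flag, then wake the blocked recvfrom() with an empty datagram
 * sent to the service's own address. Returns 0 on success. */
int request_termination(const struct sockaddr_in *service_addr)
{
    ssize_t rc;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    atomic_store(&terminate_requested, 1);
    rc = sendto(fd, "", 0, 0, (const struct sockaddr *)service_addr,
                sizeof *service_addr);
    close(fd);
    return rc == 0 ? 0 : -1;
}
```

The controlling thread calls request_termination() with the address the service socket is bound to, then joins the service thread.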
One example I stumbled upon yesterday while researching the answer to another question.
In the old RFC 868 protocol for getting the current time of a remote server, the workflow for using UDP looks like:
When used via UDP the time service works as follows:
S: Listen on port 37 (45 octal).
U: Send an empty datagram to port 37.
S: Receive the empty datagram.
S: Send a datagram containing the time as a 32 bit binary number.
U: Receive the time datagram.
The server listens for a datagram on port 37. When a datagram
arrives, the server returns a datagram containing the 32-bit time
value. If the server is unable to determine the time at its site, it
should discard the arriving datagram and make no reply.
In this case, the server has to receive a datagram to be alerted that a user is requesting the time (and to know what address to send a reply to), due to UDP's connectionless nature (the TCP version just needs to connect to the server). The contents of that datagram are ignored, so the protocol might as well specify that it should be empty.
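The exchange can be sketched in C roughly as follows. This is only an illustration: the function names are made up, and the caller decides which port the server socket is bound to (port 37 normally requires privileges). RFC 868 counts seconds from 1900-01-01, hence the 2208988800-second offset from the Unix epoch.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>

#define RFC868_EPOCH_OFFSET 2208988800ULL /* seconds from 1900 to 1970 */

/* U: send an empty datagram to the server. */
int time_client_request(int fd, const struct sockaddr_in *server)
{
    return sendto(fd, "", 0, 0, (const struct sockaddr *)server,
                  sizeof *server) == 0 ? 0 : -1;
}

/* S: receive any datagram (contents ignored) and reply with the time as
 * a 32-bit big-endian number. */
int time_server_handle_one(int fd)
{
    char ignored[1];
    struct sockaddr_storage from;
    socklen_t fromlen = sizeof from;
    uint32_t wire;

    if (recvfrom(fd, ignored, sizeof ignored, 0,
                 (struct sockaddr *)&from, &fromlen) < 0)
        return -1;
    wire = htonl((uint32_t)((uint64_t)time(NULL) + RFC868_EPOCH_OFFSET));
    if (sendto(fd, &wire, sizeof wire, 0,
               (struct sockaddr *)&from, fromlen) != (ssize_t)sizeof wire)
        return -1;
    return 0;
}

/* U: receive the time datagram and convert it to Unix time. */
int time_client_read(int fd, time_t *unix_time)
{
    uint32_t wire;

    if (recv(fd, &wire, sizeof wire, 0) != (ssize_t)sizeof wire)
        return -1;
    *unix_time = (time_t)((uint64_t)ntohl(wire) - RFC868_EPOCH_OFFSET);
    return 0;
}
```

The only thing the server learns from the empty datagram is the sender's address, which is exactly what it needs for the reply.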

Raw sockets receiving messages sent by itself

I was trying to write some codes using raw sockets, while I observed some strange phenomenon. Consider the code:
int rsfd = socket(AF_INET,SOCK_RAW,253);
if(rsfd<0)
{
    perror("Raw socket not created");
}
else
{
    struct sockaddr_in addr2;
    memset(&addr2,0,sizeof(addr2));
    addr2.sin_family = AF_INET;
    addr2.sin_addr.s_addr = inet_addr("127.0.0.2");
    /* if(connect(rsfd,(struct sockaddr*)&addr2,sizeof(addr2))<0)
    {
        perror("Could not connect");continue;
    } */
}
Now, if I leave the connect() call commented out, whatever message I send through this rsfd is also received by the same socket. On the other end I have already bound a socket to the IP address 127.0.0.2. When I print the IP address of the sender, it shows 127.0.0.1, yet the socket still receives packets that are meant for 127.0.0.2. The problem went away when I enabled the connect() call shown in the commented portion. This seems weird because on the other side no one is accepting or listening on this address, and moreover I am using sendto() and recvfrom(), which are meant for connectionless sockets. My question is: why is this happening? How does the connect() call solve the problem here?
Now if I [don't connect() the socket], whatever message I am sending through this rsfd is also being received by itself.
I note first that raw sockets are an extension to POSIX. Linux offers them, and I think other systems do too, but details of their behavior are not certain to be consistent across implementations.
With that said, the problem seems likely to be that you are not bind()ing your socket to any address. On Linux, for example, the docs for raw sockets note that
A raw socket can be bound to a specific local address using the
bind(2) call. If it isn't bound, all packets with the specified IP
protocol are received.
(Emphasis added.) On a system where raw sockets have that behavior, if you're sending packets to an IP loopback address via a raw IP socket that is neither bound nor connected then yes, the source socket will receive them, or at least may do.
It's unclear why connecting the socket solves the problem, or why it is even successful at all. The behavior of connect() is unspecified for socket types other than the standard ones, SOCK_DGRAM, SOCK_STREAM, and SOCK_SEQPACKET. However, the behavior you observe is consistent with connect() having an effect on raw sockets like the one it has on datagram sockets, which are also connectionless:
If the socket sockfd is of type SOCK_DGRAM, then addr is the address
to which datagrams are sent by default, and the only address from
which datagrams are received.
Instead of relying on that discovered behavior, however, I suggest following the documented (at least on Linux) procedure of binding the socket to a local address with bind() and communicating with it at that address. Raw IP sockets have no port numbers, so only the address matters.
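A minimal sketch of the bind() suggestion, assuming Linux raw(7) semantics. The helper open_bound_raw_socket is hypothetical, and creating a raw socket normally requires root or CAP_NET_RAW, so the call is expected to fail with EPERM for an unprivileged process.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Open a raw IPv4 socket for the given IP protocol number and bind it to
 * local_ip, so that it is only handed packets addressed to local_ip.
 * Returns the descriptor, or -1 with errno set. */
int open_bound_raw_socket(const char *local_ip, int protocol)
{
    struct sockaddr_in local;
    int saved;
    int fd = socket(AF_INET, SOCK_RAW, protocol);

    if (fd < 0)
        return -1; /* typically EPERM without CAP_NET_RAW */

    memset(&local, 0, sizeof local);
    local.sin_family = AF_INET;
    local.sin_port = 0; /* raw sockets have no ports */
    if (inet_pton(AF_INET, local_ip, &local.sin_addr) != 1) {
        close(fd);
        errno = EINVAL;
        return -1;
    }
    if (bind(fd, (struct sockaddr *)&local, sizeof local) < 0) {
        saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}
```

With the sender opened as open_bound_raw_socket("127.0.0.1", 253), packets it sends to 127.0.0.2 should no longer be delivered back to it on a system that follows the quoted documentation.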

How to set priority to sockets

I have two udp sockets
*:161
1.1.2.2:161
When there is a packet destined for 1.1.2.2:161, I want the packet to be received by *:161 and not by 1.1.2.2:161. Is there an option I can set on the 1.1.2.2:161 socket when creating it that makes it not receive any packets?
The question does not make sense. Only one socket can be bound to a given (address, port) pair at any given time, so the situation presented cannot arise. There is never any question as to which socket should receive an incoming packet.

When the client socket should be bound in order to receive UDP messages from a server?

I have seen two examples that illustrate how the client socket can receive messages from server.
Example 1:
server code
http://man7.org/tlpi/code/online/book/sockets/ud_ucase_sv.c.html
client code
http://man7.org/tlpi/code/online/book/sockets/ud_ucase_cl.c.html
The client program creates a socket and binds the socket to an address, so that the server can send its reply.
if (bind(sfd, (struct sockaddr *) &claddr, sizeof(struct sockaddr_un)) == -1)
    errExit("bind"); // snippet from ud_ucase_cl.c
Example 2:
server code
http://man7.org/tlpi/code/online/book/sockets/i6d_ucase_sv.c.html
client code
http://man7.org/tlpi/code/online/book/sockets/i6d_ucase_cl.c.html
In example 2, the client code doesn't bind its socket to an address.
Question:
Is it necessary for the client code to bind the socket to an address in order to receive messages from the server?
Why do we have to bind the client socket to an address in the first example, but not in the second?
The difference is the socket family: the first example uses AF_UNIX, while the second uses AF_INET6. According to Stevens' UNP, you need to explicitly bind a pathname to a Unix domain client socket so that the server has a pathname to which it can send its reply:
... sending a datagram to an unbound Unix domain datagram socket does not implicitly bind a pathname to the socket. Therefore, if we omit this step, the server's call to recvfrom ... returns a null pathname ...
This is not required for INET{4,6} sockets since they are "auto-bound" to an ephemeral port.
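The auto-binding can be observed with getsockname(). A small sketch, assuming IPv4 UDP; udp_autobind_ports is a hypothetical helper that reports the local port before and after the first sendto() on a socket that was never bound.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Send one datagram to dest from a fresh, unbound UDP socket and report
 * the socket's local port (host byte order) before and after the send.
 * Returns 0 on success, -1 on failure. */
int udp_autobind_ports(const struct sockaddr_in *dest,
                       unsigned short *before, unsigned short *after)
{
    struct sockaddr_in local;
    socklen_t len = sizeof local;
    int rc = -1;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;

    memset(&local, 0, sizeof local);
    if (getsockname(fd, (struct sockaddr *)&local, &len) < 0)
        goto out;
    *before = ntohs(local.sin_port); /* no port assigned yet */

    if (sendto(fd, "x", 1, 0, (const struct sockaddr *)dest,
               sizeof *dest) != 1)
        goto out;

    len = sizeof local;
    if (getsockname(fd, (struct sockaddr *)&local, &len) < 0)
        goto out;
    *after = ntohs(local.sin_port); /* ephemeral port chosen by the kernel */
    rc = 0;
out:
    close(fd);
    return rc;
}
```

The server sees that ephemeral port as the source of the datagram, which is why it can reply without the client ever calling bind().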
For the client (TCP) or sender (UDP), calling bind() is optional; it is a way to specify the interface. Suppose you have two interfaces, which are both routable to your destination:
eth0: 10.1.1.100/24
eth1: 10.2.2.100/24
route: 10.1.1.0/24 via 10.2.2.254 # router for eth1
       0.0.0.0 via 10.1.1.254 # general router
Now if you just say connect() to 12.34.56.78, you don't know which local interface furnishes the local side of the connection. By calling bind() first, you make this specific.
The same is true for UDP traffic: without bind()ing, your sendto() will use a source address and an ephemeral port picked by the kernel, but with bind() you make the source specific.
If you have not bound an AF_INET/AF_INET6 client socket before connecting or sending something, the TCP/IP stack will automatically bind it to an ephemeral port on the outbound address.
Unlike this, UNIX domain sockets (AF_UNIX) do not automatically bind when sending, so you can send messages via SOCK_DGRAM but can't get any replies.
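To see the AF_UNIX difference in practice, here is a sketch that sends one datagram to a Unix domain datagram server and checks whether the server's recvfrom() got a sender pathname. The function name and the socket paths used with it are placeholders for this example.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

static void fill_unix_addr(struct sockaddr_un *a, const char *path)
{
    memset(a, 0, sizeof *a);
    a->sun_family = AF_UNIX;
    snprintf(a->sun_path, sizeof a->sun_path, "%s", path);
}

/* Send one datagram from a client to a server bound at server_path.
 * If client_path is NULL the client stays unbound, otherwise it is bound
 * to client_path first. Returns 1 if the server's recvfrom() reported a
 * sender pathname it could reply to, 0 if not, -1 on setup failure. */
int unix_sender_has_pathname(const char *server_path, const char *client_path)
{
    struct sockaddr_un srv_addr, cli_addr, from;
    socklen_t fromlen = sizeof from;
    char buf[8];
    int result = -1;
    int srv = socket(AF_UNIX, SOCK_DGRAM, 0);
    int cli = socket(AF_UNIX, SOCK_DGRAM, 0);

    if (srv < 0 || cli < 0)
        goto out;
    fill_unix_addr(&srv_addr, server_path);
    unlink(server_path); /* remove a stale socket file, if any */
    if (bind(srv, (struct sockaddr *)&srv_addr, sizeof srv_addr) < 0)
        goto out;
    if (client_path != NULL) {
        fill_unix_addr(&cli_addr, client_path);
        unlink(client_path);
        if (bind(cli, (struct sockaddr *)&cli_addr, sizeof cli_addr) < 0)
            goto out;
    }
    if (sendto(cli, "hi", 2, 0, (struct sockaddr *)&srv_addr,
               sizeof srv_addr) != 2)
        goto out;
    memset(&from, 0, sizeof from);
    if (recvfrom(srv, buf, sizeof buf, 0,
                 (struct sockaddr *)&from, &fromlen) < 0)
        goto out;
    /* An unbound sender yields an empty or missing pathname. */
    result = fromlen > offsetof(struct sockaddr_un, sun_path) &&
             from.sun_path[0] != '\0';
out:
    if (srv >= 0)
        close(srv);
    if (cli >= 0)
        close(cli);
    unlink(server_path);
    if (client_path != NULL)
        unlink(client_path);
    return result;
}
```

With a NULL client_path the server has nothing to pass to sendto() for the reply; with a bound client it gets the client's pathname back.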

How can non-exclusive ports exist when tcp identify applications using 16-bit port number?

I can't come up with the exact socket API, but I remember that there is a socket option that makes a port exclusive or non-exclusive.
If it's not exclusive,how can TCP know which application it should forward to for a specific destination port?
I think you might be referring to the SO_REUSEPORT option available on some systems.
From the BSD man page:
SO_REUSEPORT allows completely duplicate bindings by multiple processes
if they all set SO_REUSEPORT before binding the port. This option permits multiple instances of a program to each receive UDP/IP multicast or
broadcast datagrams destined for the bound port.
Implementations vary a lot for this (from non-existent, to restricted to UDP, to allowing TCP also). In the cases where TCP is allowed, the connections are distinguished by both source and target (ip,port) pairs. This is sufficient to allow the implementation to decide which app needs which packet. (see Trek - Socket options for instance.)
With multiple apps bound to the same TCP port, you could only have one socket accepting on the port. The others would use the port to initiate outbound connections. The TCP stack always knows where to send the packets to.
incoming packets that initiate (SYN) a connection go to the only accepting socket
incoming packets for a connected stream are routed to the socket they belong to
Note: the sockets themselves (including, I believe the accepting socket) can be shared across multiple processes. See Is there a way for multiple processes to share a listening socket? for example.
Here's how it could work, deliberately simplifying TCP (the three-way handshake is glossed over). Let's note the socket information held by the TCP stack like this:
(socketname)[owner app, (local IP, local port), (state, remote IP, remote port)]
With that, let's set up three apps A, B and C:
App A -> bind (localhost,12345,SO_REUSEPORT)
TCP stack: create socket (s1)[belongs to A, (localhost,12345), (not connected)]
App A <- s1
App B -> bind (localhost,12345,SO_REUSEPORT)
TCP stack: create socket (s2)[belongs to B, (localhost,12345), (not connected)]
App B <- s2
App C -> bind (localhost,12345,SO_REUSEPORT)
TCP stack: create socket (s3)[belongs to C, (localhost,12345), (not connected)]
App C <- s3
App C -> s3.listen()
TCP stack: update socket (s3)[belongs to C, (localhost,12345), (open for business)]
App C -> s3.accept()
At this point, no data has been sent, but the kernel knows exactly what socket belongs to what application. Let's have A actually try to do something with its socket:
App A -> s1.connect(otherhost,54321)
TCP stack: update socket (s1)[belongs to A, (localhost,12345), (connecting, otherhost,54321)]
TCP stack: send SYN
Here, three things can happen:
incoming SYN/ACK packet from (otherhost,54321) with correct sequence: this is fine, it's for s1, no other possibility
TCP stack: update socket (s1)[belongs to A, (localhost,12345), (connected,otherhost,54321)]
TCP stack: send ACK
TCP stack: notify App A that socket is connected
App A <- connect succeeded, you can start doin' stuff
incoming SYN packet from (client,4343): can only be for s3, only socket ready for SYN
TCP stack: create new socket (s4)[belongs to C, (localhost,12345), (connected,client,4343)]
TCP stack: send SYN/ACK to client
TCP stack: notify App C that (s4) has been accepted
App C <- s4 returned from accept()
incoming packet from somewhere else:
TCP stack: drop or reject, there are no matching sessions
Let's imagine the two things above happened. The kernel information is now:
(s1)[belongs to A, (localhost,12345), (connected,otherhost,54321)]
(s2)[belongs to B, (localhost,12345), (not connected)]
(s3)[belongs to C, (localhost,12345), (open for business)]
(s4)[belongs to C, (localhost,12345), (connected,client,4343)]
Now four kinds of packets can come in:
incoming normal packet from (otherhost,54321): this matches s1, hand it over to App A
incoming normal packet from (client,4343): this matches s4, hand it over to App C
incoming SYN packet from (otherclient,2398): this matches s3, same as before
anything else: drop or reject, invalid state
The TCP stack always knows which socket a packet belongs to. So it knows where to deliver the data.
SO_REUSEADDR is different: it only allows binding to a port that is still in TIME_WAIT state, i.e. the port is already closing down and the socket that had opened it has been issued a close already (more or less, there are other conditions). With SO_REUSEADDR, only one socket at a time holds the port open.
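The lookup walked through above can be modelled in a few lines of C. This is a toy simulation of the simplified scheme, not how any real kernel is written; all type and function names are invented for the example, and the local (localhost,12345) part is left out because every socket in the table shares it.

```c
#include <stddef.h>
#include <string.h>

enum sock_state { NOT_CONNECTED, LISTENING, CONNECTING, CONNECTED };
enum pkt_kind { PKT_SYN, PKT_OTHER };

struct sock_entry {
    const char *name;        /* s1, s2, ... */
    const char *owner;       /* owning application */
    enum sock_state state;
    const char *remote_host; /* NULL unless connecting or connected */
    int remote_port;
};

/* Return the socket that should get a packet coming from
 * (src_host, src_port), or NULL when it must be dropped or rejected. */
const struct sock_entry *demux(const struct sock_entry *table, size_t n,
                               enum pkt_kind kind,
                               const char *src_host, int src_port)
{
    const struct sock_entry *listener = NULL;
    size_t i;

    for (i = 0; i < n; i++) {
        const struct sock_entry *e = &table[i];

        /* An exact match on the remote endpoint always wins. */
        if ((e->state == CONNECTED || e->state == CONNECTING) &&
            e->remote_port == src_port &&
            strcmp(e->remote_host, src_host) == 0)
            return e;
        if (e->state == LISTENING)
            listener = e;
    }
    /* No session matched: only a SYN may go to the listening socket. */
    return kind == PKT_SYN ? listener : NULL;
}
```

Fed with the four-entry table above, a packet from (otherhost,54321) resolves to s1, one from (client,4343) to s4, a SYN from (otherclient,2398) to s3, and anything else to NULL.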
A socket is identified by the IP address and the port - both. Together, in the BSD socket API, they are called a "name". You can bind your socket to a name and, if that name comprises a specific address, you can bind another socket to another name with the same port.
Also: a connection is a pair of two sockets, so a single address/port pair can be used in several connections - meaning more than one connected socket can have the same name (but not the same fd).
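The last point is easy to verify on loopback. A sketch under the assumption that two blocking connect() calls complete against a listening socket before accept() is called (the kernel finishes the handshake using the listen backlog); the function name is made up for the example.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Accept two loopback connections on one listener. Returns 1 when both
 * accepted sockets have the listener's local address and port while
 * their peers differ, 0 when they do not, -1 on failure. */
int accepted_sockets_share_name(void)
{
    struct sockaddr_in addr, n1, n2, p1, p2;
    socklen_t len = sizeof addr;
    int lst = -1, c1 = -1, c2 = -1, a1 = -1, a2 = -1, result = -1;

    lst = socket(AF_INET, SOCK_STREAM, 0);
    if (lst < 0)
        goto out;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(lst, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(lst, 4) < 0 ||
        getsockname(lst, (struct sockaddr *)&addr, &len) < 0)
        goto out;

    c1 = socket(AF_INET, SOCK_STREAM, 0);
    c2 = socket(AF_INET, SOCK_STREAM, 0);
    if (c1 < 0 || c2 < 0 ||
        connect(c1, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        connect(c2, (struct sockaddr *)&addr, sizeof addr) < 0)
        goto out;

    a1 = accept(lst, NULL, NULL);
    a2 = accept(lst, NULL, NULL);
    if (a1 < 0 || a2 < 0)
        goto out;

    len = sizeof n1;
    if (getsockname(a1, (struct sockaddr *)&n1, &len) < 0)
        goto out;
    len = sizeof n2;
    if (getsockname(a2, (struct sockaddr *)&n2, &len) < 0)
        goto out;
    len = sizeof p1;
    if (getpeername(a1, (struct sockaddr *)&p1, &len) < 0)
        goto out;
    len = sizeof p2;
    if (getpeername(a2, (struct sockaddr *)&p2, &len) < 0)
        goto out;

    /* Same local name on both connections, different remote ports. */
    result = n1.sin_port == addr.sin_port &&
             n2.sin_port == addr.sin_port &&
             n1.sin_addr.s_addr == n2.sin_addr.s_addr &&
             p1.sin_port != p2.sin_port;
out:
    if (a1 >= 0) close(a1);
    if (a2 >= 0) close(a2);
    if (c1 >= 0) close(c1);
    if (c2 >= 0) close(c2);
    if (lst >= 0) close(lst);
    return result;
}
```

Both accepted descriptors report the same local address and port as the listener; only the remote half of each connection tells them apart.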

Resources