IP_DROP_MEMBERSHIP behaviour inconsistency - c

I have a use case in which I join a multicast group using IP_ADD_MEMBERSHIP, later leave it with IP_DROP_MEMBERSHIP (the last packet received before the drop had seq id 1), and then join the same group again using IP_ADD_MEMBERSHIP. After rejoining, I receive the next packet (seqid = 2). I think this shouldn't happen: as I understand it, IP_DROP_MEMBERSHIP should stop delivery of the UDP packets and flush the socket, so after rejoining I should get only the latest available packet. The behaviour is also inconsistent: sometimes I do get only the latest packet.
Please note that I do not wish to close the socket; I want to keep using the existing one.
Please help. I am using CentOS 7.4.

Try setting IP_MULTICAST_ALL to 0. It defaults to 1.
Explanation: With IP_MULTICAST_ALL 0 your OS will filter the incoming UDP packets to the groups you currently joined, which is what you expect in your description.
But this is not the default behavior on Linux.
The default (with IP_MULTICAST_ALL=1) is to receive any UDP packets that arrive at your socket. When you bind to 0.0.0.0, this is all UDP packets the machine receives for that port, multicast and unicast, regardless of whether you joined any multicast group or not. This means you will see every artifact of the gap between joining or leaving a multicast group and the actual IGMP message sent by your machine, as well as the artifacts and bugs of the routers and switches in your local network. For example, when you leave a multicast group your OS may decide not to send the corresponding IGMP message at all, because some other socket is still listening on that multicast address, or because the OS decides to leave with a delay. This is all perfectly allowed.
BTW, when you bind to the multicast address on Linux, this only acts as a filter and has no binding function at all. You will then only receive UDP packets targeted at that particular multicast IP, regardless of whether you also joined other multicast groups.
Regarding "flushing" a socket: The packet queues behind a socket are completely out of scope for your application. You cannot influence the queue state or behavior (except for reading from the queue or not) and you cannot expect any particular behavior.
In practice I would suggest:
- Bind to 0.0.0.0.
- Join and leave multicast groups appropriately.
- Inspect the target address of each UDP packet you receive and do any filtering yourself. Use IP_PKTINFO to get the destination address for each packet.
- Do not rely on routers and switches having any obvious and deterministic multicast routing behavior. Most of them have minute-long timeouts for leaving multicast groups. This means that even after you have left a multicast group (and even if you never joined) you may continue to receive multicast traffic for a couple of minutes. This masks bugs in your code and will cause headaches when you try to debug it.
This way you will not have to rely on any OS dependent behavior and you have full control of what you receive and what not.
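A minimal sketch of the pieces mentioned above, assuming Linux and IPv4. The helper names (set_multicast_all, change_membership, drain_socket) are made up for illustration. Since there is no "flush" call, the only way to get rid of stale datagrams after rejoining is to read them out of the queue, which is what drain_socket does with non-blocking reads.

```c
#define _GNU_SOURCE
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* on == 0: deliver only groups joined on this socket (Linux-specific option). */
int set_multicast_all(int fd, int on)
{
    return setsockopt(fd, IPPROTO_IP, IP_MULTICAST_ALL, &on, sizeof(on));
}

/* Join (add != 0) or leave (add == 0) a group; INADDR_ANY lets the kernel
 * pick the interface. Returns 0 on success, -1 on failure. */
int change_membership(int fd, const char *group, int add)
{
    struct ip_mreq mreq;
    memset(&mreq, 0, sizeof(mreq));
    if (inet_pton(AF_INET, group, &mreq.imr_multiaddr) != 1)
        return -1;
    mreq.imr_interface.s_addr = htonl(INADDR_ANY);
    return setsockopt(fd, IPPROTO_IP,
                      add ? IP_ADD_MEMBERSHIP : IP_DROP_MEMBERSHIP,
                      &mreq, sizeof(mreq));
}

/* Read and discard everything currently queued on the socket.
 * Returns the number of datagrams discarded, or -1 on a real error. */
int drain_socket(int fd)
{
    char buf[65536];
    int dropped = 0;
    while (recv(fd, buf, sizeof(buf), MSG_DONTWAIT) >= 0)
        dropped++;
    return (errno == EAGAIN || errno == EWOULDBLOCK) ? dropped : -1;
}
```

A rejoin would then look like change_membership(fd, group, 0), later change_membership(fd, group, 1), followed by drain_socket(fd). Packets that are still in flight in the network can of course arrive after the drain, so per-packet filtering is still needed.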

Related

How to filter a multicast receiving socket by interface?

I need to create two sockets listening on the same IP:port but on different interfaces:
socket0 receives UDP traffic sent to 224.2.2.2:5000 on interface eth0
socket1 receives UDP traffic sent to 224.2.2.2:5000 on interface eth1
It seemed pretty straightforward until I realized that Linux merges all of that into the same traffic. For example, say there's only traffic on eth1 and there's no activity on eth0. When I first create socket0 it won't be receiving any data but as soon as I create socket1 (and join the multicast group) then socket0 will also start receiving the same data. I found this link that explains this.
Now this actually makes sense to me because the only moment when I specify the network interface is when joining the multicast group setsockopt(socket,IPPROTO_IP,IP_ADD_MEMBERSHIP,...) with ip_mreq.imr_interface.s_addr. I believe this specifies which interface joins the group but has nothing to do with from which interface your socket will receive from.
What I tried so far is binding the sockets to the multicast address and port, which behaves like mentioned above. I've tried binding to the interface address but that doesn't work on Linux (it seems to do so on Windows though), you don't receive any traffic on the socket. And finally, I've tried binding to INADDR_ANY but this isn't what I want since I will receive any other data sent to the port regardless of the destination IP, say unicast for example, and it will still not stop multicast data from other interfaces.
I cannot use SO_BINDTODEVICE since it requires root privileges.
So what I want to know is if this is possible at all. If it can't be done then that's fine, I'll take that as an answer and move on, I just haven't been able to find any way around it. Oh, and I've tagged the question as C because that's what we're using, but I'm thinking it really might not be specific to the language.
I haven't included the code for this because I believe it's more of a theoretical question rather than a problem with the source code. We've been working with sockets (multicast or otherwise) for a while now without any problems, it's just this is the first time we've had to deal with multiple interfaces. But if you think it might help I can write some minimal working example.
Edit about the possible duplicate:
I think the usecase I'm trying to achieve here is different. The socket is supposed to receive data from the same multicast group and port (224.2.2.2:5000 in the example above) but only from one specific interface. To put it another way, both interfaces are receiving data from the same multicast group (but different networks, so data is different) and I need each socket to only listen on one interface.
I think that question is about multiple groups on same port, rather than same group from different interfaces. Unless there's something I'm not seeing there that might actually help me with this.
Yes, you can do what you want on Linux, without root privileges:
Bind to INADDR_ANY and set the IP_PKTINFO socket option. You then have to use recvmsg() to receive your multicast UDP packets and to scan for the IP_PKTINFO control message. This gives you some side band information of the received UDP packet:
struct in_pktinfo {
unsigned int ipi_ifindex; /* Interface index */
struct in_addr ipi_spec_dst; /* Local address */
struct in_addr ipi_addr; /* Header Destination address */
};
The ipi_ifindex is the index of the interface the packet was received on. (You can turn this into an interface name using if_indextoname(), or go the other way round with if_nametoindex().)
As you said on Windows the same network functions have different semantics, especially for UDP and even more for multicast.
The Linux bind() semantics for the IP address for UDP sockets are mostly useless. It is essentially just a destination address filter. You will almost always want to bind to INADDR_ANY for UDP sockets since you either do not care to which address a packet was sent or you want to receive packets for multiple addresses (e.g. receiving unicast and multicast).
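As a rough sketch of the recvmsg() approach described above (Linux, IPv4 only): enable IP_PKTINFO, receive with recvmsg(), and walk the control messages to find the in_pktinfo. The function names are made up for illustration, and arrived_on is just one possible filtering policy.

```c
#define _GNU_SOURCE
#include <arpa/inet.h>
#include <net/if.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Ask the kernel to attach an IP_PKTINFO control message to each datagram. */
int enable_pktinfo(int fd)
{
    int on = 1;
    return setsockopt(fd, IPPROTO_IP, IP_PKTINFO, &on, sizeof(on));
}

/* Receive one datagram and copy its in_pktinfo into *info.
 * Returns the payload length, or -1 on error or if no pktinfo was attached. */
ssize_t recv_with_pktinfo(int fd, void *buf, size_t len, struct in_pktinfo *info)
{
    union {
        char raw[CMSG_SPACE(sizeof(struct in_pktinfo))];
        struct cmsghdr align;   /* forces correct alignment of the buffer */
    } control;
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    struct msghdr msg;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.raw;
    msg.msg_controllen = sizeof(control.raw);

    ssize_t n = recvmsg(fd, &msg, 0);
    if (n < 0)
        return -1;
    for (struct cmsghdr *c = CMSG_FIRSTHDR(&msg); c != NULL; c = CMSG_NXTHDR(&msg, c)) {
        if (c->cmsg_level == IPPROTO_IP && c->cmsg_type == IP_PKTINFO) {
            memcpy(info, CMSG_DATA(c), sizeof(*info));
            return n;
        }
    }
    return -1;
}

/* Example policy: keep only datagrams that arrived on the named interface. */
int arrived_on(const struct in_pktinfo *info, const char *ifname)
{
    unsigned int idx = if_nametoindex(ifname);
    return idx != 0 && (unsigned int)info->ipi_ifindex == idx;
}
```

For the two-interface case in the question, socket0 would keep a datagram only if arrived_on(&info, "eth0") is true and info.ipi_addr is the multicast group; socket1 would do the same with "eth1".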

Joining a source specific multicast on same multicast address

I am trying to set up multicast sources for an application on linux using source specific multicast (SSM) and the code is going ok (using the C interface) but I would like to verify that the system will behave as I expect it to.
Setup:
Multicast address - 233.X.X.X:9876
Source1 - 192.X.X.1
Source2 - 192.X.X.2
Interface1 - 192.X.X.100
Interface2 - 192.X.X.101
Steps
Configure so that only Source1 is sending to the multicast address
Start a reader (reader1) that binds to the multicast address and joins the multicast with ssm src as Source1 and interface as Interface1
Observe that data is seen on reader1
Do the same (reader2) but using Source2 and Interface2
Desired Outcome:
Reader1 can see the data from the multicast.
Reader2 can't see the data from the multicast.
I am concerned that the above will not be the case as in my testing using non source specific multicast an IP_ADD_MEMBERSHIP has global effect. So reader2's socket sees data because it is bound to the unique multicast address which has been joined to an interface seeing data. The info at this link under "Joining a Multicast" matches up with my observations.
It may well be that IP_ADD_SOURCE_MEMBERSHIP behaves differently to IP_ADD_MEMBERSHIP but the documentation is sparse and not specific in this regard.
Specific questions:
Is a multicast join using IP_ADD_SOURCE_MEMBERSHIP global i.e. will that cause any socket bind()'d to the multicast address to receive packets from that source.
How is SSM supposed to be used in general? does it make sense to have one multicast address with N sources?
I am inexperienced with network programming so please forgive any shortcomings in my understanding.
Thanks for any assistance.
I've worked through this and after obtaining a copy of Unix Network Programming the behaviour at least seems clear and understandable.
The answer is yes: all multicast joins are global, whether they are SSM or otherwise. The reason is that the join actually takes effect a couple of layers below the process issuing the join request. Basically, it tells the IP layer to accept multicast packets from the specified source and to deliver them to any process that has a socket bound to the multicast address.
SSM was actually introduced because of the limited address space of IPv4. On the internet there are not nearly enough unique multicast addresses for everyone who wants one to have their own. SSM pairs a source address with a multicast address, and the pair forms a globally unique identifier, e.g. shared multicast address 239.10.5.1 and source 192.168.1.5. So SSM exists purely to facilitate multicast in a limited address space. In the environment that our software is working in (Cisco), SSM is being used for redundancy and convenience of transmission: multiple streams of data are stacked on the same IP:port combination and downstream clients select the stream they want. This all works just fine until a given host wants access to more than one stream in the multicast. Because the streams are all on the same multicast address, all subscribed processes get all the data, and this is unavoidable due to the way the network stack works.
Final solution
Now that the behaviour has been understood the solution is straightforward, but it does require additional code in each running process. Each process must filter the incoming data from the multicast address and only read data from the source(s) it is interested in. I had hoped that there was some "magic" built into SSM to do this automatically, but there is not. recvfrom() already provides the sender's address, so doing this is relatively low cost.
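Roughly what that looks like in code, as a sketch only (Linux, IPv4; the function names are made up for illustration). join_ssm issues the source-specific join with IP_ADD_SOURCE_MEMBERSHIP, and recv_from_source does the per-process filtering on the sender address returned by recvfrom().

```c
#define _GNU_SOURCE
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Source-specific join. group, source and iface are dotted-quad strings;
 * iface is the local address of the interface to join on. */
int join_ssm(int fd, const char *group, const char *source, const char *iface)
{
    struct ip_mreq_source mreq;
    memset(&mreq, 0, sizeof(mreq));
    if (inet_pton(AF_INET, group, &mreq.imr_multiaddr) != 1 ||
        inet_pton(AF_INET, source, &mreq.imr_sourceaddr) != 1 ||
        inet_pton(AF_INET, iface, &mreq.imr_interface) != 1)
        return -1;
    return setsockopt(fd, IPPROTO_IP, IP_ADD_SOURCE_MEMBERSHIP, &mreq, sizeof(mreq));
}

/* Returns 1 if the sender address filled in by recvfrom() is the wanted source. */
int sender_matches(const struct sockaddr_in *from, const char *wanted_source)
{
    struct in_addr wanted;
    if (inet_pton(AF_INET, wanted_source, &wanted) != 1)
        return 0;
    return from->sin_family == AF_INET && from->sin_addr.s_addr == wanted.s_addr;
}

/* Block until a datagram from wanted_source arrives; datagrams from any
 * other sender are read and discarded. Returns the length, or -1 on error. */
ssize_t recv_from_source(int fd, void *buf, size_t len, const char *wanted_source)
{
    for (;;) {
        struct sockaddr_in from;
        socklen_t fromlen = sizeof(from);
        ssize_t n = recvfrom(fd, buf, len, 0, (struct sockaddr *)&from, &fromlen);
        if (n < 0 || sender_matches(&from, wanted_source))
            return n;
    }
}
```

With the setup from the question, reader2 would call recv_from_source() with Source2's address and so never hand Source1's data to the application, even though the kernel delivers it to the socket.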

Raw sockets / BPF - filtering done once or multiple times?

Context
I am studying the Berkeley Packet Filter on 64-bit Debian Linux to filter the packets received by an open socket.
I use AF_PACKET, so I handle even layer 2 of the packets.
So far it works beautifully. But I have to filter every packet on every socket myself, which is not efficient. Hence I use BPF.
Question
Since I have my applications set the filters by themselves with
setsockopt(sd, SOL_SOCKET, SO_ATTACH_FILTER, &filter, sizeof(filter))
I would like to know :
if the kernel will filter and direct the packets to the right socket (filtering happens once per packet on the system at the kernel level)
or
if the kernel will send all the packets as before and the BPF filter will run on every socket (each packet will be analyzed and filtered as many times as there are open sockets on the system, because every application will see the packet coming in, as in promiscuous mode; this is not efficient).
I am not sure.
Thanks
Late answer, but the question shows a fundamental misunderstanding of AF_PACKET sockets vs. promiscuous mode, and I would like to point out that BPF filtering on AF_PACKET sockets in Linux is implemented efficiently (for the usual use case).
About the general issue with the question:
Using an AF_PACKET socket does not mean that the NIC is switched to promiscuous mode. It just forwards all frames directed to the host to user space, so the filter on the L2 address still applies (in contrast to a NIC in promiscuous mode, which happily ignores a non-matching destination MAC). This should largely settle your concern, as the usual frame/packet distribution process still applies even if there is an AF_PACKET socket.
About efficiency:
Only AF_PACKET sockets will see all frames directed to the host. A filter attached to a socket is evaluated per socket. There is no central spot in the kernel that handles all the filters and distributes the frames to their destinations. Usually an AF_PACKET socket is used to implement a protocol (handler) in user space. The people who implemented AF_PACKET therefore assumed that most frames directed to an AF_PACKET socket will be filtered out, because the user wants to handle only a very specific subset of the frames.
The filter is applied to a socket buffer (skb, a container that holds the frame and its associated control/status data) that is shared by all entities taking part in the frame processing. Only if the filter matches is a clone of this buffer created and handed over to the user. There is even a comment above the function responsible for processing the skb in the context of an AF_PACKET socket that says:
* This function makes lazy skb cloning in hope that most of packets
* are discarded by BPF.
For further information on packet filter on AF_PACKET sockets see the kernel doc for network filter.
The bpf program will sit in the kernel. It will process the data going to the particular socket identified in the setsockopt call. If a particular packet passes the filter, it will get delivered, else it is filtered out.
I mean to emphasize that two parallel invocations of the API with different socket do not affect the other, and should work correctly.
Regarding internal implementation in the kernel, I am not sure.
tx

Few queries regarding raw sockets in C

I want to make a chat room using raw socket in C. I have following problems:
Q 1 : Can I use the select function to handle multiple connections with raw sockets?
Q 2 : Are port numbers in sockets real ports, or are they logically implemented for various applications at the transport layer?
Q 3 : I have only one computer, so I am using lo (loopback) as my interface. The process that initiates the chat calls send first and then receive, so it receives its own data. How can I prevent this?
Any help would be appreciated since that would help me in increasing my confidence on raw sockets.
Thanks :)
If you want this to be a real, usable chat system, stop. Don't use raw sockets. Huge mistake.
If you are just playing around because you want to put “raw sockets” under the “Experience” section of your résumé, you may read on.
You can use the select function to detect when a raw socket has a packet available to receive, and when it can accept a packet to transmit. You can pass multiple file descriptors to a single call to select if you want to check multiple raw sockets (or whatever) simultaneously.
Port numbers are part of the TCP and UDP protocols (and some other transport layer protocols). The kernel doesn't look for port numbers when receiving packets for raw sockets.
The raw(7) man page states:
All packets or errors matching the protocol number specified for the raw socket are passed to this socket.
And it also states:
A raw socket can be bound to a specific local address using the bind(2) call. If it isn't bound, all packets with the specified IP protocol are received.
Therefore you probably want to at least use different IP addresses for each end of the “connection”, and bind each end to its address.
“But!” you say, “I'm using loopback! I can only use the 127.0.0.1 address!” Not so, my friend. The entire 127.0.0.0/8 address block is reserved for loopback addresses; 127.0.0.1 is merely the most commonly-used loopback address. Linux (but perhaps not other systems) responds to every address in the loopback block. Try this in one window:
nc -v -l 10150
And then in another window:
nc -s 127.0.0.1 127.0.0.2 10150
You will see that you have created a TCP connection from 127.0.0.1 to 127.0.0.2. I think you can also bind your raw sockets to separate addresses. Then, when you receive a packet, you can check whether it's from the other end's IP address to decide whether to process or discard it.
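A sketch of both ideas, with made-up helper names: open_raw_bound creates a raw IPv4 socket bound to one local address (this needs CAP_NET_RAW, i.e. usually root, and the protocol number is up to you), and wait_readable uses select() to wait on several descriptors at once. select() does not care whether the descriptors are raw sockets or anything else.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Open a raw IPv4 socket for `protocol`, bound to local_ip (e.g. "127.0.0.2").
 * Requires CAP_NET_RAW; returns the descriptor or -1 on failure. */
int open_raw_bound(const char *local_ip, int protocol)
{
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    if (inet_pton(AF_INET, local_ip, &addr.sin_addr) != 1)
        return -1;
    int fd = socket(AF_INET, SOCK_RAW, protocol);
    if (fd < 0)
        return -1;
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Wait up to timeout_ms for any descriptor to become readable.
 * Returns the index of the first readable one, or -1 on timeout or error. */
int wait_readable(const int *fds, int count, int timeout_ms)
{
    fd_set readable;
    int maxfd = -1;
    FD_ZERO(&readable);
    for (int i = 0; i < count; i++) {
        FD_SET(fds[i], &readable);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    struct timeval tv = { .tv_sec = timeout_ms / 1000,
                          .tv_usec = (timeout_ms % 1000) * 1000 };
    if (select(maxfd + 1, &readable, NULL, NULL, &tv) <= 0)
        return -1;
    for (int i = 0; i < count; i++)
        if (FD_ISSET(fds[i], &readable))
            return i;
    return -1;
}
```

One end of the chat could bind to 127.0.0.1 and the other to 127.0.0.2; each then only receives packets addressed to its own address.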
Just curious, why do you want to use raw sockets? Raw sockets (AF_INET, SOCK_RAW) allow you to send out "raw" packets, where you are responsible for crafting everything but the MAC and IP layers.
A1: There are no "connections" with raw sockets. Just packets.
A2: There are no "ports" with raw sockets. Just packets. "Port numbers" as we know them are part of the TCP or UDP protocols, both of which are above the level at which we work with raw sockets.
A3: This is not specific to raw sockets - you would have this issue regardless of your protocol selection. To really answer this, we would need to know much more about your proposed protocol, since right now, you're simply blasting out raw IP packets.

UDP based chat in C

I'm supposed to make a communicator in C, based on datagrams. I don't know what arguments I should pass to the bind() function. I skimmed through most UDP-chat questions and code here on StackOverflow but I still can't find any specific information on the issue.
What type of address structure should I use?
What port can I use? Any number bigger than 1024 ?
What IP address do I bind my socket with? (most people put INADDR_ANY but isn't it for receiving only?)
Also, do I need multiple sockets? One for receiving & another for sending messages.
What type of address structure should I use?
If you are using IPv4, use a sockaddr_in. If you want to use IPv6 instead, use a sockaddr_in6.
What port can I use? Any number bigger than 1024 ?
Yes, assuming no other program is already using that port number for its own UDP socket. (If another program is using the port number you chose, it will cause bind() to fail with errno EADDRINUSE)
What IP address do I bind my socket with? (most people put INADDR_ANY but isn't it for receiving only?)
INADDR_ANY is what you generally want to use. It tells the OS that you want to receive incoming UDP packets on any of the computer's network interfaces. (If you only wanted to receive UDP packets from a particular network interface, OTOH, e.g. only on WiFi, you could specify that network interface's IP address instead.)
Also, do I need multiple sockets? One for receiving & another for sending messages.
You can have multiple sockets if you want, but it's not necessary to do it that way. You can instead use a single socket for both sending and receiving UDP packets. One common pattern is to use a single socket, set to non-blocking mode, and something like select() or poll() to multiplex the input and output needs of your program. An alternative pattern would be to use two threads (one for sending and one for receiving), blocking I/O, and either one or two sockets (depending on whether you prefer to have the two threads share a socket, or give each thread its own socket). I prefer the single-threaded/single-socket/select() solution myself, as I think it is the least error-prone approach.
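A minimal sketch of the single-socket pattern, using poll() instead of select() and blocking reads that are only issued once poll() reports data. The names open_udp_socket and chat_step are made up for this example, and the peer address is assumed to be known in advance.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a UDP socket bound to INADDR_ANY:port (port 0 lets the OS choose). */
int open_udp_socket(unsigned short port)
{
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* One loop iteration: forward input from in_fd to peer and print anything
 * received on sock. Returns the number of ready descriptors (0 on timeout),
 * or -1 on error. */
int chat_step(int sock, int in_fd, const struct sockaddr_in *peer, int timeout_ms)
{
    char buf[1024];
    struct pollfd pfds[2] = {
        { .fd = in_fd, .events = POLLIN },
        { .fd = sock,  .events = POLLIN },
    };
    int ready = poll(pfds, 2, timeout_ms);
    if (ready <= 0)
        return ready;
    if (pfds[0].revents & POLLIN) {
        ssize_t n = read(in_fd, buf, sizeof(buf));
        if (n > 0)
            sendto(sock, buf, (size_t)n, 0,
                   (const struct sockaddr *)peer, sizeof(*peer));
    }
    if (pfds[1].revents & POLLIN) {
        ssize_t n = recvfrom(sock, buf, sizeof(buf), 0, NULL, NULL);
        if (n > 0)
            fwrite(buf, 1, (size_t)n, stdout);
    }
    return ready;
}
```

A chat program would call open_udp_socket() once and then run chat_step(sock, STDIN_FILENO, &peer, -1) in a loop, so the same socket both sends what the user types and receives what the other side sends.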
