I'm using Linux kernel 2.6.32 (x86_64) and can get TCP statistics by passing TCP_INFO to getsockopt and receiving a tcp_info struct, which is defined in /usr/include/netinet/tcp.h.
Can I get similar statistics for UDP? (possibly fewer because there's no built-in congestion control and retransmission etc. but I'm satisfied with any statistics that I can get)
TCP_INFO literally means info for TCP. The reason there is no equivalent for UDP is that UDP is stateless: there is no guaranteed delivery, no RTT, no window size, and not much else to provide information about.
If you really want to grab some extra info, take a look at man 2 recvmsg, especially this:
Ancillary data should only be accessed by the macros defined in cmsg(3)
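For example (just a sketch, not specific to your application): on Linux you can enable IP_RECVTTL and then read the TTL of each incoming datagram from the ancillary data using the cmsg(3) macros. The port below is a placeholder and error handling is mostly omitted:

#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <arpa/inet.h>
#include <netinet/in.h>

int main(void)
{
    int sock = socket(AF_INET, SOCK_DGRAM, 0);

    struct sockaddr_in addr = { 0 };
    addr.sin_family = AF_INET;
    addr.sin_port = htons(9999);                 /* placeholder port */
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    bind(sock, (struct sockaddr *)&addr, sizeof(addr));

    /* Ask the kernel to attach the received TTL as ancillary data. */
    int on = 1;
    setsockopt(sock, IPPROTO_IP, IP_RECVTTL, &on, sizeof(on));

    char payload[1500];
    char cbuf[CMSG_SPACE(sizeof(int))];          /* room for one int-sized cmsg */
    struct iovec iov = { .iov_base = payload, .iov_len = sizeof(payload) };
    struct msghdr msg = { 0 };
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = cbuf;
    msg.msg_controllen = sizeof(cbuf);

    ssize_t n = recvmsg(sock, &msg, 0);
    if (n < 0)
        return 1;

    /* Walk the ancillary data with the cmsg(3) macros only. */
    for (struct cmsghdr *c = CMSG_FIRSTHDR(&msg); c; c = CMSG_NXTHDR(&msg, c)) {
        if (c->cmsg_level == IPPROTO_IP && c->cmsg_type == IP_TTL) {
            int ttl;
            memcpy(&ttl, CMSG_DATA(c), sizeof(ttl));
            printf("got %zd bytes, TTL %d\n", n, ttl);
        }
    }
    return 0;
}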
Related
If I want to sniff packets in Linux without setting any filters, I see 2 options.
Use libpcap
Use raw socket myself like https://www.binarytides.com/packet-sniffer-code-in-c-using-linux-sockets-bsd-part-2/
Why is libpcap better than using raw sockets myself?
Three reasons:
1) It's way easier to correctly set up.
2) It's portable, even to Windows, which uses a quite similar yet different API for sockets.
3) It's MUCH faster.
1 and 2, IMO, don't need much explanation. I'll dive into 3.
To understand why libpcap is (generally) faster, we need to understand the bottlenecks in the socket API.
The two biggest bottlenecks that libpcap tends to avoid are syscalls and copies.
How it does so is platform-specific.
I'll tell the story for Linux.
Linux, since 2.0 IIRC, implements what it calls the AF_PACKET socket family, and later added PACKET_MMAP on top of it. I don't exactly recall the benefits of the former on its own, but the latter is critical for avoiding both copies from kernel to userspace (there are still a few copies kernel-side) and syscalls.
In PACKET_MMAP you allocate a big ring buffer in userspace, and then associate it to an AF_PACKET socket. This ring buffer will contain a bit of metadata (most importantly, a marker that says if a region is ready for user processing) and the packet contents.
When a packet arrives on a relevant interface (generally one you bind your socket to), the kernel makes a copy in the ring buffer and marks the location as ready for userspace*.
If the application was waiting on the socket, it gets notified*.
So, why is this better than raw sockets? Because you can get by with few or no syscalls after setting up the socket, depending on whether you want to busy-poll the buffer yourself or wait with poll until a few packets are ready, and because you don't need the copy from the socket's internal RX buffer to your user buffers, since that buffer is shared with you.
libpcap does all of that for you. And it does so on macOS, *BSD, and pretty much any other platform that provides faster capture methods.
*It's a bit more complex with version 3 (TPACKET_V3), where the granularity is "blocks" instead of individual packets.
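For the curious, here is a rough sketch of the kind of setup libpcap does for you on Linux, using TPACKET_V2 with an arbitrary ring geometry; error handling and teardown are omitted, and this is only an illustration of the mechanism, not the code libpcap actually uses:

#include <stdio.h>
#include <sys/socket.h>
#include <sys/mman.h>
#include <poll.h>
#include <arpa/inet.h>
#include <linux/if_packet.h>
#include <linux/if_ether.h>

int main(void)
{
    /* Raw layer-2 capture socket; left unbound, so it sees all interfaces. */
    int sock = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));

    /* Pick the memory-mapped ring format. */
    int ver = TPACKET_V2;
    setsockopt(sock, SOL_PACKET, PACKET_VERSION, &ver, sizeof(ver));

    /* Arbitrary geometry: 8 blocks of 16 KiB, split into 2 KiB frames. */
    struct tpacket_req req = {
        .tp_block_size = 1 << 14,
        .tp_block_nr   = 8,
        .tp_frame_size = 1 << 11,
        .tp_frame_nr   = ((1 << 14) / (1 << 11)) * 8,
    };
    setsockopt(sock, SOL_PACKET, PACKET_RX_RING, &req, sizeof(req));

    /* Map the ring once; captured frames then show up here without a copy to userspace. */
    size_t ring_len = (size_t)req.tp_block_size * req.tp_block_nr;
    unsigned char *ring = mmap(NULL, ring_len, PROT_READ | PROT_WRITE, MAP_SHARED, sock, 0);

    unsigned int frame = 0;
    for (;;) {
        /* Frames are contiguous here because the block size is an exact multiple of the frame size. */
        struct tpacket2_hdr *hdr =
            (struct tpacket2_hdr *)(ring + (size_t)frame * req.tp_frame_size);

        if (!(hdr->tp_status & TP_STATUS_USER)) {
            /* Nothing ready: sleep in poll() instead of burning a syscall per packet. */
            struct pollfd pfd = { .fd = sock, .events = POLLIN };
            poll(&pfd, 1, -1);
            continue;
        }

        printf("captured %u bytes\n", hdr->tp_len);

        /* Hand the slot back to the kernel and move on. */
        hdr->tp_status = TP_STATUS_KERNEL;
        frame = (frame + 1) % req.tp_frame_nr;
    }
}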
Context
I am studying the Berkeley Packet Filter on 64-bit Debian Linux to filter the packets received by an open socket.
I use AF_PACKET so I can handle even layer 2 of the packets.
So far it works beautifully, but I have to filter every packet on every socket, which is not efficient. Hence I use BPF.
Question
Since I have my applications set the filters by themselves with
if (setsockopt(sd, SOL_SOCKET, SO_ATTACH_FILTER, &filter, sizeof(filter)) < 0)
I would like to know :
if the kernel will filter and direct the packets to the right socket (filtering happens once per packet on the system at the kernel level)
or
if the kernel will deliver all the packets as before and the BPF filter will run in every socket (each packet would be analyzed and filtered as many times as there are open sockets on the system, because every application would see the packet, much like promiscuous mode; this would not be efficient).
I am not sure.
Thanks
The question shows a fundamental misunderstanding of AF_PACKET sockets vs. promiscuous mode, and I would like to point out that using BPF filters on AF_PACKET sockets in Linux is implemented in an efficient way (for the usual use case).
About the general issue with the question:
Using an AF_PACKET socket does not mean that the NIC is switched to promiscuous mode - it just forwards all frames directed to the host to user space (so filtering based on the L2 address is still applied, in contrast to a NIC in promiscuous mode, which happily accepts frames with a non-matching destination MAC). This should settle that part of the question, as the usual frame/packet distribution process is applied even if there is an AF_PACKET socket.
About efficiency:
Only AF_PACKET sockets will see all frames directed to the host. A filter attached to a socket is evaluated per socket; there is no central spot in the kernel that handles all the filters and distributes the frames to their destinations. Usually an AF_PACKET socket is used to implement a protocol (handler) in user space, so those old wise guys who implemented AF_PACKET assumed that most frames directed to an AF_PACKET socket will be filtered out/dropped, because the user typically wants to handle only a very specific subset of the frames.
The filter is applied to a socket buffer (skb - a container that holds the frame and its associated control/status data) that is shared by all entities taking part in processing the frame. Only if the filter matches is a clone of this buffer created and handed over to the user. There is even a comment on top of the function responsible for processing the skb in the context of an AF_PACKET socket that says:
* This function makes lazy skb cloning in hope that most of packets
* are discarded by BPF.
For further information on packet filtering on AF_PACKET sockets, see the kernel documentation on network filters.
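To make the per-socket behaviour concrete, here is a minimal sketch of attaching a classic BPF program with SO_ATTACH_FILTER. The filter simply keeps IPv4/UDP frames and drops everything else; it is only an example, not the filter from the question, and it assumes plain Ethernet framing (no VLAN tag):

#include <stdio.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <linux/if_ether.h>
#include <linux/filter.h>

int main(void)
{
    int sd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (sd < 0) { perror("socket"); return 1; }

    /* Classic BPF program: accept only IPv4 UDP frames, drop the rest. */
    struct sock_filter code[] = {
        BPF_STMT(BPF_LD  | BPF_H | BPF_ABS, 12),                /* load EtherType         */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, ETH_P_IP, 0, 3),    /* not IPv4 -> drop       */
        BPF_STMT(BPF_LD  | BPF_B | BPF_ABS, 23),                /* load IP protocol field */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, IPPROTO_UDP, 0, 1), /* not UDP  -> drop       */
        BPF_STMT(BPF_RET | BPF_K, 0x40000),                     /* accept, up to 256 KiB  */
        BPF_STMT(BPF_RET | BPF_K, 0),                           /* drop                   */
    };
    struct sock_fprog filter = {
        .len = sizeof(code) / sizeof(code[0]),
        .filter = code,
    };

    /* The program lives in the kernel but is evaluated only for this socket. */
    if (setsockopt(sd, SOL_SOCKET, SO_ATTACH_FILTER, &filter, sizeof(filter)) < 0) {
        perror("setsockopt(SO_ATTACH_FILTER)");
        return 1;
    }

    char buf[2048];
    ssize_t n = recv(sd, buf, sizeof(buf), 0);   /* only matching frames are delivered here */
    printf("received %zd bytes of IPv4/UDP\n", n);
    return 0;
}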
The bpf program will sit in the kernel. It will process the data going to the particular socket identified in the setsockopt call. If a particular packet passes the filter, it will get delivered, else it is filtered out.
I mean to emphasize that two parallel invocations of the API with different sockets do not affect each other, and should work correctly.
Regarding the internal implementation in the kernel, I am not sure.
Thanks
I'm trying to use libpcap to sniff some "network interfaces" (loopback included).
In my example application, I have packets coming from the loopback on ports 1234, 1235 and 1236. I have already found a way to make libpcap filter only packets coming from these ports, using pcap_setfilter(): my goal is to forward these packets according to the address/port from which they came (for example, packets coming from 127.0.0.1/1234 could go out through the eth0 interface; packets coming from 127.0.0.1/1235 could be forwarded through eth1; and the ones coming from 127.0.0.1/1236 could be forwarded through eth2).
My question is: is there any way to know exactly which port these packets came from without having to look at their content? Can I, for example, set many filters and somehow know which filter was the one that matched my packet?
I've already read a lot of documentation and tutorials, but none have seemed useful so far. I would also be OK with the answer being "it is not possible".
Thanks in advance.
The capture mechanisms atop which libpcap runs support only one filter, so libpcap has no APIs to set multiple filters.
You could, however, open multiple pcap_t's for the same network interface and apply different filters to them. Reading from multiple pcap_t's, however, is potentially platform-dependent. I infer from the "eth0", "eth1", and "eth2" that this is Linux, so you should be able to use select() or poll() or... on the return values from pcap_get_selectable_fd() on the pcap_t's and, if select() or poll() or... says a given descriptor is readable, call pcap_dispatch() on the corresponding pcap_t to process packets for that pcap_t.
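A rough sketch of that approach, using the ports from your example; the filter strings, the choice of "lo", and the handling logic are assumptions, and error checking is mostly omitted:

#include <stdio.h>
#include <sys/select.h>
#include <pcap/pcap.h>

#define NHANDLES 3

/* 'user' carries the index of the pcap_t whose filter matched, which tells us the source port. */
static void on_packet(u_char *user, const struct pcap_pkthdr *h, const u_char *bytes)
{
    (void)bytes;
    printf("handle %d captured %u bytes\n", *(int *)user, h->caplen);
}

int main(void)
{
    const char *filters[NHANDLES] = {
        "udp src port 1234", "udp src port 1235", "udp src port 1236"
    };
    char errbuf[PCAP_ERRBUF_SIZE];
    pcap_t *handles[NHANDLES];
    int ids[NHANDLES];

    for (int i = 0; i < NHANDLES; i++) {
        handles[i] = pcap_open_live("lo", 65535, 0, 100, errbuf);
        if (!handles[i]) { fprintf(stderr, "%s\n", errbuf); return 1; }

        /* One pcap_t per filter on the same interface. */
        struct bpf_program prog;
        pcap_compile(handles[i], &prog, filters[i], 1, PCAP_NETMASK_UNKNOWN);
        pcap_setfilter(handles[i], &prog);
        pcap_freecode(&prog);
        ids[i] = i;
    }

    for (;;) {
        fd_set rfds;
        FD_ZERO(&rfds);
        int maxfd = -1;
        for (int i = 0; i < NHANDLES; i++) {
            int fd = pcap_get_selectable_fd(handles[i]);
            FD_SET(fd, &rfds);
            if (fd > maxfd) maxfd = fd;
        }
        if (select(maxfd + 1, &rfds, NULL, NULL, NULL) < 0)
            break;
        for (int i = 0; i < NHANDLES; i++)
            if (FD_ISSET(pcap_get_selectable_fd(handles[i]), &rfds))
                pcap_dispatch(handles[i], -1, on_packet, (u_char *)&ids[i]);
    }
    return 0;
}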
I need to perform data filtering based on the source unicast IPv4 address of datagrams arriving to a Linux UDP socket.
Of course, it is always possible to manually perform the filtering based on the information provided by recvfrom, but I am wondering if there could be another more intelligent/efficient approach (if possible, not using libpcap).
Any ideas?
If it's a single source you need to allow, then just use connect(2) and the kernel will do the filtering for you. As a bonus, connected UDP sockets are more efficient. This, of course, does not work for more than one source.
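A minimal sketch of the connect(2) approach; the addresses and ports are placeholders and error handling is skipped:

#include <stdio.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <netinet/in.h>

int main(void)
{
    int sock = socket(AF_INET, SOCK_DGRAM, 0);

    /* Bind the local end as usual (port 5000 is a placeholder). */
    struct sockaddr_in local = { 0 };
    local.sin_family = AF_INET;
    local.sin_port = htons(5000);
    local.sin_addr.s_addr = htonl(INADDR_ANY);
    bind(sock, (struct sockaddr *)&local, sizeof(local));

    /* "Connect" to the only peer we want to hear from (placeholder address/port). */
    struct sockaddr_in peer = { 0 };
    peer.sin_family = AF_INET;
    peer.sin_port = htons(6000);
    inet_pton(AF_INET, "192.0.2.10", &peer.sin_addr);
    connect(sock, (struct sockaddr *)&peer, sizeof(peer));

    /* The kernel now silently discards datagrams from any other source,
       so a plain recv() is enough; no recvfrom() plus manual address check. */
    char buf[1500];
    ssize_t n = recv(sock, buf, sizeof(buf), 0);
    printf("got %zd bytes from the connected peer\n", n);
    return 0;
}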
As already stated, NetFilter (the Linux firewall) can help you here.
You could also use the UDP options of xinetd and tcpd to perform filtering.
What proportion of datagrams are you expecting to discard? If it is very high, then you may want to review your application design (for example, to make the senders not send so many datagrams which are to be discarded). If it is not very high, then you don't really care about how much effort you spend discarding them.
Suppose discarding a packet takes the same amount of (runtime) effort as processing it normally; if you discard 1% of packets, you will only be spending 1% of time discarding. However, realistically, discarding is likely to be much easier than processing messages.
I have a doubt about socket programming which I could not clear up by reading the man pages.
In C, the declaration of the socket function is int socket(int domain, int type, int protocol);
The Linux man page says that while type decides the communication semantics that will be followed, the protocol number is the one that decides the protocol being used.
So my question is: suppose I give the type parameter as SOCK_STREAM, which is reliable, and pass the protocol number for UDP; would that give me a reliable UDP, i.e., the same as TCP but without flow control and congestion control?
Unfortunately I can't test this out, as I have only a single machine, so there is no packet loss happening.
Could anyone clear this doubt? Thanks a lot...
UDP cannot be made reliable. Transmission of the packets is done on a "best effort" basis, but any router/host along the chain is free to drop the packet in the garbage and NOT inform the sender that it has done so.
That's not to say you can't impose extra semantics on the sending and receiving ends to expect packets within a certain time frame and say "hey, we didn't get anything in the last X seconds". But that can't be done at the protocol level. UDP is a "dump it into the outbox and hope it gets there" protocol.
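As an illustration of that kind of application-level semantics (not something UDP itself provides): a sender can set a receive timeout with SO_RCVTIMEO, wait a bounded time for a reply, and retransmit a few times before giving up. The address, port, timeout, and retry count below are all placeholders:

#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <arpa/inet.h>
#include <netinet/in.h>

int main(void)
{
    int sock = socket(AF_INET, SOCK_DGRAM, 0);

    struct sockaddr_in peer = { 0 };
    peer.sin_family = AF_INET;
    peer.sin_port = htons(7000);                      /* placeholder */
    inet_pton(AF_INET, "192.0.2.20", &peer.sin_addr); /* placeholder */

    /* Give up waiting for a reply after 2 seconds. */
    struct timeval tv = { .tv_sec = 2, .tv_usec = 0 };
    setsockopt(sock, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

    const char *req = "ping";
    char reply[64];
    for (int attempt = 1; attempt <= 3; attempt++) {
        sendto(sock, req, strlen(req), 0, (struct sockaddr *)&peer, sizeof(peer));
        ssize_t n = recv(sock, reply, sizeof(reply), 0);
        if (n >= 0) {
            printf("got a reply on attempt %d\n", attempt);
            return 0;
        }
        /* recv timed out: the request or its reply was lost somewhere; retry. */
    }
    fprintf(stderr, "no reply after 3 attempts\n");
    return 1;
}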
No. For an IPv4 or IPv6 protocol stack, SOCK_STREAM is going to get you TCP and the type SOCK_DGRAM is going to give you UDP. The protocol parameter is not used for either of those choices, and the socket library typically expects a value of 0 to be specified there.
If you do this:
socket(AF_INET, SOCK_STREAM, IPPROTO_UDP);
socket() will return -1 and set errno to
EPROTONOSUPPORT
The protocol type or the specified protocol is not supported within this domain.
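A quick way to see this for yourself on a single machine, since it only exercises the socket() call (the helper function and labels are just for this demonstration):

#include <stdio.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

static void try_combo(const char *label, int domain, int type, int protocol)
{
    int fd = socket(domain, type, protocol);
    if (fd < 0)
        printf("%-45s -> -1 (%s)\n", label, strerror(errno));
    else
        printf("%-45s -> fd %d\n", label, fd);
}

int main(void)
{
    try_combo("socket(AF_INET, SOCK_STREAM, 0)",           AF_INET, SOCK_STREAM, 0);           /* TCP */
    try_combo("socket(AF_INET, SOCK_DGRAM, 0)",            AF_INET, SOCK_DGRAM, 0);            /* UDP */
    try_combo("socket(AF_INET, SOCK_STREAM, IPPROTO_UDP)", AF_INET, SOCK_STREAM, IPPROTO_UDP); /* EPROTONOSUPPORT */
    return 0;
}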