Ipv6 Source Address selection in linux - c

To get destination derive from an host name or to apply destination address selection algorithm (as per RFC 3484) we have an library api getaddrinfo(). If you search on net you would find that the same API could be used for Source address selection. But when i tested it practically it's not happening.
When i did some homework on it found that in linux the kernel itself decides the appropriate source address depending upon the destination address by applying those rules (as per RFC 3484). This is done by kernel in fib6_rule_action() method this is done while sending the data (e.g. in sendto()).
My question is there any library API or system call which could do this for me at earlier stage meaning before sending data.

You can get that information via linux routing sockets aka rtnetlink. Specifically it is the RTA_SRC you are looking for.
A word of warning, (rt)netlink sockets are not the easiest protocol to use and there is not much up to date documentation besides the source code. The wikipedia page for netlink might get you started. Some of the External Links seem good and the linked paper contains more references.
I suggest using a library, if you can find one and your netlink related code is any longer than the single source address query. Libnl or libmnl might be good. The former also has a nice page about routing sockets.
As a test, you can get the same functionality with the user space command ip -6 route get <dst_addr> eg ip -6 route get 2a00:1450:4010:c04::63.

Related

View - but not intercept - all IPv4 traffic to Linux computer

Is there a way to view all the IPv4 packets sent to a Linux computer?
I know I can capture the packets at the ethernet level using libpcap. This can work, but I don't really want to defragment the IPv4 packets. Does libpcap provide this functionality and I'm just missing it?
One thing that kinda works is using a tun device. I can capture all the IPv4 traffic by routing all traffic to the tun device via something like ip route add default via $TUN_IP dev $TUNID. This also stops outbound traffic though, which is not what I want.
I just want to see the IPv4 packets, not intercept them. (Or, even better, optionally intercept them.)
Edit: I'm specifically looking for a programmatic interface to do this. E.g. something I can use from within a C program.
Yes, you can see all the packets that arrive at your network interface. There are several options to access or view them. Here a small list of possible solutions, where the first one is the easiest and the last one the hardest to utilize:
Wireshark
I'd say this is pretty much the standard when it comes to protocol analyzers with a GUI (uses libpcap). It has tons of options, a nice GUI, great filtering capabilities and reassembles IP datagrams. It uses libpcap and can also show the raw ethernet frame data. For example it allows you to see layer 2 packets like ARP. Furthermore you can capture the complete data arriving at your network interface in a file that can later be analyzed (also in Wireshark).
tcpdump
Very powerful, similar features like Wireshark but a command line utility, which also uses libpcap. Can also capture/dump the complete interface traffic to a file. You can view the dumped data in Wireshark since the format is compatible.
ngrep
This is known as the "network grep" and is similar to tcpdump but supports regular expressions (regex) to filter the payload data. It allows to save captured data in the file format supported by Wireshark and tcpdump (also uses libpcap).
libnids
Quotation from the official git repository:
"Libnids is a library that provides a functionality of one of NIDS
(Network Intrusion Detection System) components, namely E-component. It means
that libnids code watches all local network traffic [...] and provides convenient information on them to
analyzing modules of NIDS. Libnids performs:
assembly of TCP segments into TCP streams
IP defragmentation
TCP port scan detection"
libpcap
Of course you can also write your own programs by using the library directly. Needless to say, this requires more efforts.
Raw or Packet Sockets
In case you want to do all the dirty work yourself, this is the low level option, which of course also allows you to do everything you want. The tools listed above use them as a common basis. Raw sockets operate on OSI layer 3 and packet sockets on layer 2.
Note: This is not meant to be a complete list of available tools or options. I'm sure there are much more but these are the most common ones I can think of.
Technically you have to make a copy of the received packet via libpcap. To be more specific, what you can do is to get packets with libpcap, that way the packets will be kind of blocked, so you need to re send them to the destination. Lets say that you want to make a Fire-Wall or something, what you should do is to have a layer that can work like getting the package and then send it to the destination, in between you can make a copy of what you got for further processes. In order to make the intercept option, you need to create some predefined rules, i.e. the ones that violates the rules will not be send again to their destination.
But that needs a lot of efforts and I don't think you want to waist your life on it.
Wire-shark as mentioned by #Barmar can do the job already.
If you need some kind of command line interface option I would say that "tcpdump" is one of the best monitoring tools. for example for capturing all ipv4 HTTP packets to and from port 80 the command will be:
tcpdump 'tcp port 80 and (((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0)'
for more information and options see tcpdump
Please be specific if you need to write a program for it, then we can help about how to do it.

BSD Packet Interception (Not Copying)

I want to get in the middle of packet forwarding (Not routing). For example, the system is a layer 2 bridge between hosts and their gateway. I want to check the layer 7 for string or whatever "foo" and forward/drop/delay the packet based on the result. What I am having trouble with is intercepting the packet.
What I have read so far:
I know I can get the copy of packet from BPF device (Usenix paper by Steven McCanne and Van Jacobson http://www.tcpdump.org/papers/bpf-usenix93.pdf ). that's good for sniffing but not for me.
I can access the PF device and set the filtering rules which is good for forwarding or dropping decisions, but not for inspection. man pf (4)
I can get packets into the ALTQ queues, BUT I do not know how to access the individual packets located in the queue. man altq(9)
I have also looking into the source code for PF(/usr/src/sys/contrib/pf/net ), PFCTL (/usr/src/contrib/pf/pfctl) and ALTQ(/usr/src/sys/contrib/altq/altq).
On FreeBSD 9.1 machine
I am not C expert, but I am good with it.
Maybe I am getting tired today with all the reading and missed something trivial. Please forgive me if so. Plus, this will be a very good find fro those looking into the subject.
P.S. There is a way of controlling the flow of "foo", by detecting "foo" in packet and denying the answer to that from coming back by setting up the filter for answer to that request. This is NOT what I am trying to achieve. I do not want the packet to leave the system if it should not.
EDIT 2 P.S. There is a great way of doing this on Linux. I can achieve everything I mentioned here on Linux with libnetfilter_queue. I will not bother posting solution here because there are many many many tutorials on how to do it on Linux.
In conclusion, I am still looking for answer on how to do this on BSD. As far as I can understand, I need to write a wrapper/library based on pf (because there is no such thing on the net - otherwise I should have found it already), that does the same thing as libnetfilter with it's libnetfilter_queue library. Or I could somehow dig into libnetfilter and port it to FreeBSD, but since it is based on iptables, only thing I can get from digging into libnetfilter library is logic and algorithms not the actual code itself, which by itself could prove to be of no use to me.
FreeBSD 9.1 has an userspace framework for packet access called netmap. It was recently introduced and has an amazing performance scale. It does very simple but powerful thing - just mmaps the NIC buffers to userspace portion of memory and detaches the packet processing from host stack, this was exactly what I needed the rest is on me.
If anyone needs any goods reference for this, please refer to man netmap (4)
Have a look at OpenDPI or nDPI.
Check out the "Divert Sockets" in BSD implementation as well. Unlike Netmap, it is not zero-copy (IMHO) however it can work with ipfw in order to implement the necessary filters in order to filter packages you want to process.

Why is separate getaddrinfo-like() + connect() not refactored into a (theoretical) connect_by_name()?

Most of the applications I've seen that use TCP, do roughly the following to connect to remote host:
get the hostname (or address) from the configuration/user input (textual)
either resolve the hostname into address and add the port, or use getaddrinfo()
from the above fill in the sockaddr_* structure with one of the remote addresses
use the connect() to get the socket connected to the remote host.
if fails, possibly go to (3) and retry - or just complain about the error
(2) is blocking in the stock library implementation, and the (4) seems to be most frequently non-blocking, which seems to give a room for a lot of somewhat similar yet different code that serves the purpose to asynchronously connect to a remote host by its hostname.
So the question: what are the good reasons not to have the additional single call like following:
int sockfd = connect_by_name(const char *hostname, const char *servicename)
?
I can come up with three:
historic: because that's what the API is
provide for custom per-application policy mechanism for address selection/connection retry: this seems a bit superficial, since for the common case ("get me a tube to talk to remote host") the underlying OS should know better
provide the visual feedback to the user about the exact step involved ("name resolution" vs "connection attempt"): this seems rather important, lookup+connection attempt may take time
Only the last of them seems to be compelling enough to rewrite the resolve/connect code for every client app (as opposed to at least having and using a widely used library that would implement the connect_by_name() semantics in addition to the existing sockets API), so surely there should be some more reasons that I am missing ?
(one of the reasons behind the question is that this kind of API would appear to help the portability to IPv6, as well as possibly to other stream transport protocols significantly)
Or, maybe such a library exists and my google-fu failed me ?
(edited: corrected the definition to look like it was meant to look, thanks LnxPrgr3)
Implementing such an API with non-blocking characteristics within the constraints of the standard library (which, crucially, isn't supposed to start its own threads or processes to work asynchronously) would be problematic.
Both the name lookup and connecting part of the process require waiting for a remote response. If either of these are not to block, then that requires a way of doing asychronous work and signalling the change in state of the socket to the calling application. connect is able to do this, because the work of the connect call is done in the kernel, and the kernel can mark the socket as readable when the connect is done. However, name lookup is not able to do this, because the work of a name lookup is done in userspace - and without starting a new thread (which is verboten in the standard library), giving that name lookup code a way to be woken up to continue work is a difficult problem.
You could do it by having your proposed call return two file descriptors - one for the socket itself, and another that you are told "Do nothing with this file descriptor except to check regularly if it is readable. If this file descriptor becomes readable, call cbn_do_some_more_work(fd)". That is clearly a fairly uninspiring API!
The usual UNIX approach is to provide a set of simple, flexible tools, working on a small set of object types, that can be combined in order to produce complex effects. That applies to the programming API as much as it does to the standard shell tools.
Because you can build higher level APIs such as the one you propose on top of the native low level APIs.
The socket API is not just for TCP, but can also be used for other protocols that may have different end point conventions (i.e. the Unix-local protocol where you have a name only and no service). Or consider DNS which uses sockets to implement itself. How does the DNS code connect to the server if the connection code relies on DNS?
If you would like a higher level abstraction, one library to check out is ACE.
There are several questions in your question. For instance, why not
standardizing an API with such connect_by_name? That would certainly
be a good idea. It would not fit every purpose (see the DNS example
from R Samuel Klatchko) but for the typical network program, it would
be OK. A paper exploring such APIs is "Simplifying Internet Applications Development
With A Name-Oriented Sockets Interface" by Christian Vogt. Note
that another difficulty for such an API would be "callback"
applications, for instance a SIP client asking to be called back: the
application has no easy way to know its own name and therefore often
prefer to be called back by address, despite the problems it make, for
instance with NAT.
Now, another question is "Is it possible to build such
connect_by_name subroutine today?" Partly yes (with the caveats
mentioned by caf) but, if written in userspace, in an ordinary
library, it would not be completely "name-oriented" since the Unix
kernel still manages the connections using IP addresses. For instance,
I would expect a "real" connect_by_name routine to be able to
survive renumbering (for instance because a mobile host renumbered),
which is quite difficult to do in userspace.
Finally, yes, it already exists a lot of libraries with similar
semantics. For a HTTP client (the most common case for a program whose
network abilities are not the main feature, for instance a XML
processor), you have Neon and libcURL. With libcURL, you can
simply write things like:
#define URL "http://www.velib.paris.fr/service/stationdetails/42"
...
curl_easy_setopt(curl, CURLOPT_URL, URL);
result = curl_easy_perform(curl);
which is even higher-layer than connect_by_name since it uses an
URL, not a domain name.

How to get the address of the DNS server that getaddrinfo queries

I'm newbie to BSD socket programming in C. I can query a web address to get its associated ip addresses with "getaddrinfo" function. But i want to know which dns server getaddrinfo queries this information from.
If you are on linux or a unix platform, try looking at man -k resolver and look for the resolver man page or a page for functions like res_init, res_search, et. al. Those are the unix APIs to DNS, and it looks like, while there's no direct way to do what you want to do, one could glean the information through a combination of the functions and what they return, and doing a few other massaging of data.
With regard to wireshark knowing what's going on, it doesn't really know. It's just monitoring packets as they flow to and fro and printing out what it sees. The resolver is what knows, and that's the API I suggested.
I don't think you can find out which it used, but it uses one from /etc/resolv.conf
If you are on Linux, you can look at the source to 'dig'. Based on it's ability to print out the server address, I think there must be some means to do this other than just parsing the /etc/resolv.conf.
On Windows, there is a very convoluted API for the purpose.

Is there a notification mechanism for when getifaddrs() results change?

On startup, my program calls getifaddrs() to find out what network interfaces are available for link-local IPv6 multicasting. This works as far as it goes, but it doesn't handle the case where the set of available network interfaces changes after getifaddrs() has returned.
Is there some way for the OS to notify my program when the network interfaces have changed, so I can call getifaddrs() again and update my list? Or am I doomed to poll getifaddrs() every few seconds, forever?
(Note: on Windows, I call GetAdaptersAddresses() instead of getifaddrs(), but the same issue exists there)
Also, the Linux way to implement this is by opening a socket of family AF_NETLINK and subtype NETLINK_ROUTE and reading the messages that arrive on it from the kernel, as shown in the example code included in "man 7 netlink". (Thanks to Rob Searce for pointing me to that!)
In case anyone is interested, I found the following document on Apple's developer site that describes how to get notified when the network configuration changes. It's non-trivial, but I did get the technique to work for me. See Listing 8 in particular.
Technical Note TN1145 - Living in a Dynamic TCP/IP Environment"
You probably want to have a look at the NotifyAddrChange and NotifyIpInterfaceChange functions.

Resources