Unix Sockets : AF_LOCAL vs AF_INET - c

I'm just starting with socket programming in UNIX and I was reading the man pages for the socket system call. I'm a bit confused about the AF_LOCAL argument and when it is used. The manual just says local communication. Wouldn't an AF_INET format also work for local communication?

AF_LOCAL (a synonym for AF_UNIX) selects UNIX domain sockets, which are typically addressed by filesystem paths and can only connect processes on the same machine. AF_INET is an IP socket; it also works for local communication over the loopback interface, but AF_LOCAL avoids some of the performance penalties of sending data through the IP stack. See this old but very nice discussion of the topic.
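To make the difference concrete, here is a minimal sketch of a round trip over an AF_LOCAL stream socket inside a single process. The function name local_roundtrip and the socket path passed to it are made up for illustration; the point is only that the address is a filesystem path (struct sockaddr_un) rather than an IP address and port.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Sends msg from a client socket to a server socket over AF_LOCAL
 * (AF_UNIX) within one process. Returns the number of bytes received,
 * or -1 on error. path is a hypothetical location chosen by the caller. */
ssize_t local_roundtrip(const char *path, const char *msg,
                        char *out, size_t outlen)
{
    struct sockaddr_un addr;
    ssize_t n = -1;
    int srv = -1, cli = -1, conn = -1;

    memset(&addr, 0, sizeof addr);
    addr.sun_family = AF_LOCAL;            /* same value as AF_UNIX */
    strncpy(addr.sun_path, path, sizeof addr.sun_path - 1);
    unlink(path);                          /* remove a stale socket file */

    srv = socket(AF_LOCAL, SOCK_STREAM, 0);
    cli = socket(AF_LOCAL, SOCK_STREAM, 0);
    if (srv < 0 || cli < 0)
        goto done;

    /* The address is a filesystem path, not an IP address and port. */
    if (bind(srv, (struct sockaddr *)&addr, sizeof addr) < 0)
        goto done;
    if (listen(srv, 1) < 0)
        goto done;
    if (connect(cli, (struct sockaddr *)&addr, sizeof addr) < 0)
        goto done;
    if ((conn = accept(srv, NULL, NULL)) < 0)
        goto done;

    if (send(cli, msg, strlen(msg), 0) < 0)
        goto done;
    n = recv(conn, out, outlen, 0);

done:
    if (conn >= 0) close(conn);
    if (cli >= 0) close(cli);
    if (srv >= 0) close(srv);
    unlink(path);
    return n;
}
```

Switching this to AF_INET would mostly mean replacing struct sockaddr_un with a struct sockaddr_in holding the loopback address and a port; the socket calls themselves stay the same.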

Related

Difference between Unix domain SOCK_DGRAM and SOCK_SEQPACKET?

According to the Linux man pages for Unix sockets, "Valid socket types in the UNIX domain are . . . SOCK_DGRAM, for a datagram-oriented socket that preserves message boundaries (as on most UNIX implementations, UNIX domain datagram sockets are always reliable and don't reorder datagrams); and (since Linux 2.6.4) SOCK_SEQPACKET, for a sequenced-packet socket that is connection-oriented, preserves message boundaries, and delivers messages in the order that they were sent." (http://man7.org/linux/man-pages/man7/unix.7.html).
I thought "always reliable and don't reorder datagrams" is the same as "delivers messages in the order that they were sent."
What's the practical difference between SOCK_DGRAM and SOCK_SEQPACKET?
In the context of UNIX domain sockets, the main difference between the two is "datagram-oriented" vs "connection-oriented".
With SOCK_DGRAM you don't create a connection (to a server, for example); you just send packets to the server socket. If the server needs to reply, you have to bind your own socket to an address and make the server aware of it, so that the server can send a reply to it. This is very inconvenient if you really need a connection, but it can be useful when you only need one-way communication, e.g. to send notifications.
SOCK_SEQPACKET is the way to go when you need a connection-oriented approach.
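A small sketch of what "connection-oriented with message boundaries" looks like in practice, assuming Linux (where SOCK_SEQPACKET is available for UNIX domain sockets). The function name seqpacket_demo is hypothetical. It uses socketpair() to get an already connected pair, sends two messages, and shows that each recv() returns exactly one message and that closing one end is reported to the other as end-of-file.

```c
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Sends two messages over a connected SOCK_SEQPACKET pair and records
 * what each recv() returns. lens[0] and lens[1] receive the sizes of the
 * two messages; lens[2] receives the result of recv() after the peer has
 * closed its end. Returns 0 on success, -1 on error. */
int seqpacket_demo(ssize_t lens[3])
{
    int sv[2];
    char buf[64];

    if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sv) < 0)
        return -1;

    if (send(sv[0], "hello", 5, 0) != 5 || send(sv[0], "hi", 2, 0) != 2) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* The buffer is large enough for both messages, yet each recv()
     * returns only one of them: boundaries are preserved. */
    lens[0] = recv(sv[1], buf, sizeof buf, 0);
    lens[1] = recv(sv[1], buf, sizeof buf, 0);

    /* Because there is a connection, closing one end is visible to the
     * other as end-of-file (recv() returns 0). */
    close(sv[0]);
    lens[2] = recv(sv[1], buf, sizeof buf, 0);

    close(sv[1]);
    return 0;
}
```

With SOCK_STREAM instead, the first recv() could return both messages merged into one run of bytes; with SOCK_DGRAM the boundaries would be kept, but the receiver would get no end-of-file when the sender goes away.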
The difference is easier to understand by analogy with UDP and TCP.
A connectionless protocol like UDP uses SOCK_DGRAM.
A connection-oriented protocol like TCP uses SOCK_STREAM. SOCK_SEQPACKET is close to SOCK_STREAM in that it is also connection-oriented, but it preserves message boundaries the way SOCK_DGRAM does, so it is something of a hybrid of the two.
SCTP is a use case for SOCK_SEQPACKET. Sequenced packets layered over ordinary TCP are explained in this article: http://urchin.earth.li/~twic/Sequenced_Packets_Over_Ordinary_TCP.html
Here's a post that has discussed this in detail.

What is the internal mechanics of socket() function?

I am trying to use the BlueZ HCI function:
int hci_open_dev(int dev_id) {...}
which internally tries to create a socket like this:
socket(AF_BLUETOOTH, SOCK_RAW | SOCK_CLOEXEC, BTPROTO_HCI);
I tried to understand the Linux kernel code for socket() but feel lost.
I'd like to know what exactly it means to create a socket for the given domain (AF_BLUETOOTH), data transmission type (SOCK_RAW) and protocol (BTPROTO_HCI).
The man page just states that it takes these params, creates a socket and returns a file descriptor.
But I'd like to understand what exactly happens and the exact kernel steps involved in creating a socket.
Here is a very broad description (hope that helps understanding the main scheme).
Kernel developers will probably be horrified...
A socket is a common abstract interface for many different means of communication.
It provides many generic operations, such as closing, sending/receiving data, setting/retrieving options, which can be used on almost any kind of socket.
Creating a socket implies specifying the exact properties of this means of communication.
It's a bit like the instantiation of a concrete type implementing an interface.
These properties are first organised by protocol families; this is the first argument to the socket() call.
For example:
PF_INET is used for communications relying on IPv4,
PF_INET6 is used for communications relying on IPv6,
PF_LOCAL is used for inter-process communication inside the system (kind of pipe),
PF_NETLINK is used for communication with the OS kernel,
PF_PACKET is used for direct communication with network interfaces,
... (there exist many of them)
Once a protocol family is chosen, you have to specify which kind of communication (the socket type) you want amongst those provided by this family; this is the second argument to the socket() call. Together with the family, it usually determines the protocol.
For example:
SOCK_DGRAM is used for UDP over IPv4 or IPv6, or distinct messages in PF_LOCAL,
SOCK_STREAM is used for TCP over IPv4 or IPv6, or a continuous byte stream in PF_LOCAL,
SOCK_RAW gives direct access to the raw underlying protocol in the family, if any (IPv4 or IPv6, for example),
... (each family can provide many of them)
Some family/type combinations accept several protocols, variants or restrictions; selecting one is the job of the third argument to the socket() call.
Often 0 is sufficient, but for example we can find:
PF_PACKET, SOCK_RAW, htons(ETH_P_ALL) to capture any kind of network packet received on a network interface,
PF_PACKET, SOCK_RAW, htons(ETH_P_ARP) to capture only ARP frames,
When we ask for the creation of a socket with these three arguments, the operating system creates an internal resource associated with the socket handle (a file descriptor) that is returned.
Of course, the exact structure of this resource depends on the chosen family/type/protocol, and it is associated with kernel callbacks which are specific to it.
Each time an operation is invoked on this socket (through a system call), the specific callback will be called.
Please look here; it's a good high-level description of the BlueZ Linux implementation of the Bluetooth stack: Linux Without Wires The Basics of Bluetooth. Specifically, it gives you a good overview of these BlueZ kernel drivers:
bluetooth.ko, which contains core infrastructure of BlueZ. It exports sockets of the Bluetooth family AF_BLUETOOTH. All BlueZ modules utilise its services.
Bluetooth HCI packets are transported over UART or USB. The corresponding BlueZ HCI implementation is hci_uart.ko and hci_usb.ko.
The L2CAP layer of Bluetooth, which is responsible for segmentation, reassembly and protocol multiplexing, is implemented by l2cap.ko.
With the help of bnep.ko, TCP/IP applications can run over Bluetooth. This emulates an Ethernet port over the L2CAP layer. The kernel thread named kbnepd is responsible for BNEP connections.
rfcomm.ko is responsible for running serial port applications like the terminal. This emulates serial ports over the L2CAP layer. The kernel thread named krfcommd is responsible for RFCOMM connections.
hidp.ko implements the HID (human interface device) layer. The user mode daemon hidd allows BlueZ to handle input devices like Bluetooth mice.
sco.ko implements the synchronous connection oriented (SCO) layer to handle audio. SCO connections do not specify a channel to connect to a remote host; only the host address is specified.
Another excellent resource is the BlueZ project page:
http://www.bluez.org/

How bind works internally in kernel space?

Can anyone help me trace the bind() system call in socket programming? I would like to know what happens in kernel space when bind() is called: which structures it updates and which lower-level functions are invoked.
The bind(2) system call just sets the local address that a socket will use when you later connect(2), listen(2) or sendto(2). If you don't call it, the kernel selects a default local address, depending on the underlying protocol.
The exact procedure bind(2) follows depends on the protocol family you are working with, as bind will behave differently depending on whether you are using PF_UNIX, PF_INET, PF_PACKET, PF_XNS, etc.
For example, with Unix sockets, your socket gets associated with an inode in the filesystem (on a filesystem that supports Unix sockets, of course), so clients have a path to connect to (in Unix sockets, addresses are paths in the filesystem). With TCP/IP sockets, you can fix the local IP address and/or the local port your socket listens on (to accept connections), or you can force the IP address and/or port that an outgoing connection to a server comes from.
For a deeper understanding of networking socket internals, I recommend reading the excellent book by G.R. Wright and W.R. Stevens, "TCP/IP Illustrated, Vol. 2: The Implementation", which describes the implementation of BSD sockets in 4.4BSD-Lite. It's old, but still the best explanation ever made. For a good introduction to using the BSD socket system calls, there's also an excellent book by W.R. Stevens (for a long time it was indeed also the best reference for BSD Unix system calls): "UNIX Network Programming, Vol. 1 (2nd Ed.): The sockets API." Both are jewels everyone should have available at work.

When using a PF_PACKET type of socket, what does PACKET_ADD_MEMBERSHIP do?

When using a PF_PACKET type of socket with protocol type ETH_P_IP, the man packet documentation talks about a socket option for multicast. The socket option is PACKET_ADD_MEMBERSHIP.
Assuming you use the PACKET_ADD_MEMBERSHIP socket option on a PF_PACKET socket correctly, what features, benefits and use cases is this socket option intended for?
Right now I receive all incoming IP packets so I look at each packet to see if it has the correct IP dst-address and UDP dst-port and I skip over all the other packets. Would using PACKET_ADD_MEMBERSHIP socket option mean I don't need to do my own filter because the kernel or driver would filter for me?
I dug into the linux-kernel source and traced down the code a little bit. I found that the ethernet-mac-address you pass in via setsockopt() is added to a list of ethernet-mac-addresses. And then the list is sent to the network-device hardware to do something... but I can't find any authoritative documentation telling me what happens next.
My educated guess is that the ethernet-mac-address list is used by the hardware to filter at the layer-2 ethernet protocol (i.e. the hardware only accepts packets that have a destination ethernet-address that matches one on the list). If there is some good documentation I would welcome that.
(I'm more familiar with TCP/UDP sockets and so this looks very similar to AF_INET type of socket's IP_ADD_MEMBERSHIP socket option... so I was expecting IGMP reports to be generated which would start multicast traffic from the router... but I found out experimentally that no IGMP reports are generated when you use this socket option.)
Your guess is correct. PACKET_ADD_MEMBERSHIP should add addresses to the NIC's hardware filter. As you've surmised, it's intended to allow you to receive multicasts for a number of different addresses without incurring the load(*) of full promiscuous mode.
(* With modern full duplex ethernet, there's generally not a lot of traffic coming to the NIC that it wouldn't want to receive anyway, unless it's in a virtualized environment.)
Note that there is also a separate PACKET_MR_UNICAST membership type (a value for the mr_type field), which does not appear in the packet(7) man page but works analogously. I would use the appropriate one (unicast vs multicast) for the type of address you're filtering on, as it's conceivable (though unlikely) that a driver would refuse to put a unicast address into the multicast filtering table.
All that being said, you'll still need to keep your software filtering as a backup. There are some older drivers that don't implement MAC filtering at all (particularly for multiple unicast addresses). The core kernel or the driver handles this by simply turning on promiscuous mode if the feature isn't available.
As for the relationship with IP_ADD_MEMBERSHIP, the IP_ADD_MEMBERSHIP code will automatically construct the appropriate multicast MAC address and add it to the interface. See ip_mc_filter_add.
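For reference, a request of this kind might be built roughly as follows. This is a sketch, not code from the kernel or from the question: the helper names fill_multicast_mreq and add_mac_membership are invented, and the interface index is assumed to come from something like if_nametoindex(). The struct packet_mreq fields and the SOL_PACKET / PACKET_ADD_MEMBERSHIP constants are the ones described in packet(7).

```c
#include <linux/if_packet.h>  /* struct packet_mreq, PACKET_ADD_MEMBERSHIP */
#include <string.h>
#include <sys/socket.h>

/* Fills a membership request for one 6-byte Ethernet multicast address
 * on the interface identified by ifindex. */
void fill_multicast_mreq(struct packet_mreq *mr, int ifindex,
                         const unsigned char mac[6])
{
    memset(mr, 0, sizeof *mr);
    mr->mr_ifindex = ifindex;
    mr->mr_type = PACKET_MR_MULTICAST;  /* PACKET_MR_UNICAST for unicast */
    mr->mr_alen = 6;                    /* length of an Ethernet address */
    memcpy(mr->mr_address, mac, 6);
}

/* Asks the kernel to add the address to the interface's receive filter.
 * fd must be a PF_PACKET socket, which needs CAP_NET_RAW to create.
 * Returns the result of setsockopt(): 0 on success, -1 on error. */
int add_mac_membership(int fd, int ifindex, const unsigned char mac[6])
{
    struct packet_mreq mr;

    fill_multicast_mreq(&mr, ifindex, mac);
    return setsockopt(fd, SOL_PACKET, PACKET_ADD_MEMBERSHIP, &mr, sizeof mr);
}
```

For example, 01:00:5e:00:00:01 is the Ethernet address that the IPv4 group 224.0.0.1 maps to. Unlike IP_ADD_MEMBERSHIP, this call only touches the link-layer filter, which matches the observation above that no IGMP reports are sent.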

WINAPI: CreateFile to Network Adapter to Read Raw Bytes

Is it possible to read a Network Adapter similar to a Serial Port? I know that Serial Ports can be read with CreateFile WINAPI Function. Is there a similar way to read raw bytes from a Network Adapter?
I am aware of the WiFi/Network Functions but the WiFi Examples are fairly sparse.
You can pass SOCK_RAW as the socket type when you create the socket using WSASocket() (or socket(), as your tastes run). This is described in further detail under TCP/IP Raw Sockets on MSDN.
From that page --
Once an application creates a socket of type SOCK_RAW, this socket may be used to send and receive data. All packets sent or received on a socket of type SOCK_RAW are treated as datagrams on an unconnected socket.
Of note, Microsoft crippled their raw sockets implementation after Windows XP SP2; the details are described on the MSDN page in the section Limitations on Raw Sockets:
TCP data cannot be sent over raw sockets.
UDP datagrams with an invalid source address cannot be sent over raw sockets.
A call to the bind function with a raw socket is not allowed.
If these limitations are too restrictive, you can fall back to the previously recommended WinPcap library.
If you want to capture raw packets, you need a support driver like WinPcap to do that.
