To set a network interface to promiscuous mode, for example, one can use setsockopt as
struct packet_mreq opt;
memset(&opt, 0, sizeof(opt));
opt.mr_ifindex = the_very_interface_index;
opt.mr_type = PACKET_MR_PROMISC;  /* the membership type must be set */
setsockopt(socket_fd, SOL_PACKET, PACKET_ADD_MEMBERSHIP, &opt, sizeof(opt));
On the other hand, as packet(7) suggests, one can also use ioctl with the SIOCSIFFLAGS request, like
struct ifreq req;
memset(&req, 0, sizeof(req));
strncpy(req.ifr_name, the_very_interface_name, IFNAMSIZ - 1);
ioctl(socket_fd, SIOCGIFFLAGS, &req);
req.ifr_flags |= IFF_PROMISC;
ioctl(socket_fd, SIOCSIFFLAGS, &req);
My understanding is that these two ways are equivalent, but is there any difference? If not, why are there two ways to do the same thing?
There is very little difference between the two, as can be seen by perusing the source. Specifically, consider callers of __dev_set_promiscuity.
The setsockopt interface eventually calls dev_set_promiscuity.
The ioctl interface ends up calling dev_change_flags.
There has always been some duplication of mechanisms due to the file descriptor interface to devices, for example send() vs. write(). To be honest, I have never thought too deeply about it. I imagine ioctl interfaces are a natural thing to add for generic devices, and setsockopt interfaces are natural to add for sockets, so both end up existing. You could think of sockets as a higher level abstraction over the network device, so a higher level interface to modify options would not be an unreasonable addition.
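For reference, here is a minimal sketch of the setsockopt route, assuming Linux and a packet socket opened elsewhere (which needs CAP_NET_RAW). The helper names fill_promisc_request and set_promisc_membership are made up for illustration. The detail that is easy to miss is that mr_type has to be PACKET_MR_PROMISC.

```c
#include <string.h>
#include <net/if.h>            /* if_nametoindex */
#include <sys/socket.h>        /* setsockopt, SOL_PACKET */
#include <linux/if_packet.h>   /* struct packet_mreq, PACKET_* */

/* Fill a membership request for promiscuous mode on the named interface.
 * Returns 0 on success, -1 if the interface does not exist. */
int fill_promisc_request(struct packet_mreq *mreq, const char *ifname)
{
    unsigned int index = if_nametoindex(ifname);
    if (index == 0)
        return -1;
    memset(mreq, 0, sizeof(*mreq));
    mreq->mr_ifindex = (int)index;
    mreq->mr_type = PACKET_MR_PROMISC;
    return 0;
}

/* Add (on != 0) or drop (on == 0) the promiscuous membership held by
 * this packet socket. Returns 0 on success, -1 on error. */
int set_promisc_membership(int packet_fd, const char *ifname, int on)
{
    struct packet_mreq mreq;
    if (fill_promisc_request(&mreq, ifname) == -1)
        return -1;
    return setsockopt(packet_fd, SOL_PACKET,
                      on ? PACKET_ADD_MEMBERSHIP : PACKET_DROP_MEMBERSHIP,
                      &mreq, sizeof(mreq));
}
```

One practical difference I am aware of: the kernel tracks these memberships per socket and drops them when the socket is closed, while a flag set through SIOCSIFFLAGS stays set until something clears it.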
Linux's iptables and iproute2 allow us to mark packets and match the mark later (fwmark), allowing for great flexibility in configuring routes and firewalls.
Is there a way to set those marks while sending the packet from a C program, either via the ordinary sockets interface or via specific Linux system calls?
I found the SO_MARK socket option in socket(7) man page:
SO_MARK (since Linux 2.6.25)
Set the mark for each packet sent through this socket (similar
to the netfilter MARK target but socket-based). Changing the
mark can be used for mark-based routing without netfilter or
for packet filtering. Setting this option requires the
CAP_NET_ADMIN capability.
It is not per packet, as I originally asked, but it suits my purpose. You can set it with setsockopt():
int fwmark;
//fwmark = <some value>;
if (setsockopt(sockfd, SOL_SOCKET, SO_MARK, &fwmark, sizeof fwmark) == -1)
perror("failed setting mark for socket packets");
I'm using Linux kernel 2.6.32 (x86_64) and can get TCP statistics by passing TCP_INFO to getsockopt and receiving a tcp_info struct, which is defined in /usr/include/netinet/tcp.h.
Can I get similar statistics for UDP? (possibly fewer because there's no built-in congestion control and retransmission etc. but I'm satisfied with any statistics that I can get)
TCP_INFO literally means info for TCP. There is no equivalent for UDP because UDP is stateless: there is no guaranteed delivery, no RTT, no window size, so there is not much to report.
If you really want to grab some extra information, take a look at man 2 recvmsg, especially this part:
Ancillary data should only be accessed by the macros defined in cmsg(3).
I'm wondering how feasible it is to convert an AF_INET socket to use AF_UNIX instead. The reason is that I've got a program which opens a TCP socket, but which we cannot change. To reduce overhead we would like to tie this socket to an AF_UNIX one for its communication.
So far, my idea has been to use LD_PRELOAD to achieve this, intercepting bind() and accept(). However, it is not clear how best to do this, or even whether this is the best approach.
So far, in bind(), if the socket is AF_INET and its IP/port match the socket I wish to convert to AF_UNIX, I close the sockfd there and open an AF_UNIX one. However, this seems to cause problems later in accept(), because I am unsure what to do when the sockfd passed to accept() is the one I want to use an AF_UNIX socket.
Any help kindly appreciated.
Jason
Your idea sounds perfectly feasible. In fact, I think it sounds like the best way to achieve what you want. I wouldn't expect very different, or even measurably different, overhead/performance though.
Of course you'd also have to intercept socket() in addition to bind() and accept(). In bind(), you could, for example, convert the requested TCP port to a fixed pathname /tmp/converted_socket.<port-number> or something like that.
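The address mapping itself could look roughly like this. It is only a sketch: inet_to_unix_addr is a made-up helper name, and the path prefix is just the naming scheme suggested above. An interposed bind() would call it and then bind the AF_UNIX socket to the resulting address.

```c
#include <stdio.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Naming scheme from the suggestion above; purely illustrative. */
#define CONVERTED_PREFIX "/tmp/converted_socket."

/* Translate an AF_INET address into an AF_UNIX address whose path
 * encodes the TCP port. Returns 0 on success, -1 if the input is not
 * AF_INET or the path does not fit into sun_path. */
int inet_to_unix_addr(const struct sockaddr_in *in, struct sockaddr_un *un)
{
    int n;

    if (in->sin_family != AF_INET)
        return -1;
    memset(un, 0, sizeof(*un));
    un->sun_family = AF_UNIX;
    n = snprintf(un->sun_path, sizeof(un->sun_path), "%s%u",
                 CONVERTED_PREFIX, (unsigned)ntohs(in->sin_port));
    return (n > 0 && (size_t)n < sizeof(un->sun_path)) ? 0 : -1;
}
```

Note that this ignores the IP part of the address entirely, so a program listening on the same port on two different addresses would collide on one path.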
I had a similar problem and came up with unsock, a shim library that does what you describe, and more.
unsock supports other address types like AF_VSOCK and AF_TIPC, and the Firecracker VM multiplexing proxy as well.
There are three key insights I want to share:
Since sockets are created for a particular address family using socket(2), and then later connected or bound using connect(2)/bind(2), you may be tempted to simply intercept socket and fix the address there.
One problem is that you may want to selectively intercept certain addresses only, which you don't know at the time of the call.
The other problem is that file descriptors may be passed to you from another process (e.g., via AF_UNIX ancillary messages), so you may not be able to intercept socket(2) in the first place.
In other words, you need to intercept connect(2), bind(2), and sendto(2).
When you intercept connect(2), bind(2), and sendto(2), you need to retroactively change the address family for socket(2). Thankfully, you can just create a new socket and use dup3(2) to reassign the new socket to the existing file descriptor. This saves a lot of housekeeping!
accept(2) and recvfrom(2) also need to be intercepted, and the returned addresses converted back to something the caller understands. This will inevitably break certain assumptions unless you do maintain a mapping back to the actual, non-AF_INET address.
Why does struct sockaddr contain an address family field? Isn't the address family already fixed with the call to socket()?
sockaddr is used in more places than just connect and bind, including places where you don't have some external knowledge of the address family involved - getaddrinfo being one.
Additionally, whilst I don't believe the following equates to practice anywhere, I can see it having been in the eye of whoever designed this stuff originally: the call to socket() defines the protocol family. sockaddr contains the address family. In practice, I believe these are always the same, but you could theoretically have a protocol capable of supporting two different address types.
EDIT: There's another way that the parameter is useful. If you're using datagram (UDP) sockets and you have a socket in a "connected" state with a default destination address, you can clear out that address by calling connect() with a sockaddr with sa_family set to AF_UNSPEC.
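A small sketch of that AF_UNSPEC trick, assuming Linux behaviour and a working loopback interface. The helper names are invented for the example, and port 9 is arbitrary, since connect() on a UDP socket does not send anything.

```c
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Dissolve the default destination of a connected datagram socket. */
int udp_disconnect(int fd)
{
    struct sockaddr sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_family = AF_UNSPEC;
    return connect(fd, &sa, sizeof(sa));
}

/* Returns 1 if the socket currently has a peer address, 0 otherwise. */
int udp_has_peer(int fd)
{
    struct sockaddr_storage ss;
    socklen_t len = sizeof(ss);

    return getpeername(fd, (struct sockaddr *)&ss, &len) == 0;
}

/* Demo: connect a UDP socket to 127.0.0.1:9, then clear the association.
 * Returns 0 if the peer was set and then removed. */
int udp_disconnect_demo(void)
{
    struct sockaddr_in peer;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    int ok;

    if (fd < 0)
        return -1;
    memset(&peer, 0, sizeof(peer));
    peer.sin_family = AF_INET;
    peer.sin_port = htons(9);
    peer.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    ok = connect(fd, (struct sockaddr *)&peer, sizeof(peer)) == 0
         && udp_has_peer(fd)
         && udp_disconnect(fd) == 0
         && !udp_has_peer(fd);
    close(fd);
    return ok ? 0 : -1;
}
```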
If you look at the getaddrinfo interface, which is the only modern correct way to convert between interchange representations of addresses (names or numeric addresses) and the sockaddr structures, I think you'll see why it's needed.
With that said, the whole struct sockaddr stuff is a huge mess of misdesigns, especially the userspace endian conversion.
Another good instance of why the sa_family field is needed is the getsockname and getpeername interfaces. If the program inherited the file descriptor from another program, and doesn't already know what type of socket it is, it needs to be able to determine that in order to make new connections or even convert the address to a representation suitable for interchange.
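For the inherited-descriptor case, a minimal sketch: ask getsockname for the local address into a sockaddr_storage and read the family field back. socket_family is a hypothetical helper name. On Linux this works even for sockets that have not been bound yet.

```c
#include <string.h>
#include <sys/socket.h>

/* Return the address family (AF_INET, AF_UNIX, ...) of the socket
 * behind fd, or -1 if fd is not a usable socket. */
int socket_family(int fd)
{
    struct sockaddr_storage ss;
    socklen_t len = sizeof(ss);

    memset(&ss, 0, sizeof(ss));
    if (getsockname(fd, (struct sockaddr *)&ss, &len) == -1)
        return -1;
    return ss.ss_family;
}
```

Once the family is known, the program can cast the storage to sockaddr_in, sockaddr_in6 or sockaddr_un as appropriate.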
If you look at the network code for 4.2BSD, where the sockets interface originated, you'll see that the sockaddr is passed to the network interface drivers but the socket fd is not.
The sa_family field is used to tell what type of address will be in the sa_data field. In a lot of applications, the address family is assumed to be IPv4. However, many applications also support IPv6.
What are socket options, i.e. setsockopt() and getsockopt(), used for in socket programming?
For example, you may want to set or query the receive buffer size:
1)
int skt;                               /* an open socket */
int rcvsize = 65536;                   /* example value, in bytes */
int sockbufsize = 0;
socklen_t size = sizeof(sockbufsize);
int err;
err = setsockopt(skt, SOL_SOCKET, SO_RCVBUF, &rcvsize, sizeof(rcvsize));
err = getsockopt(skt, SOL_SOCKET, SO_RCVBUF, &sockbufsize, &size);
2) Reuse address
int on = 1;
if (setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0)
    perror("setsockopt(SO_REUSEADDR)");
They are used for many different things, including changing the size of the send and receive buffers, setting timeouts, multicasting, keeping the connection alive, disabling Nagle's algorithm, etc.
There are levels of options depending on which network layer you want to interact with: the socket itself, IP, TCP, and so forth.
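To make the levels concrete, here is a short sketch that sets one option at each of three levels on a TCP socket. The particular options and the TTL value of 64 are just picked for illustration, and the helper names are made up.

```c
#include <netinet/in.h>    /* IPPROTO_IP, IPPROTO_TCP, IP_TTL */
#include <netinet/tcp.h>   /* TCP_NODELAY */
#include <sys/socket.h>    /* SOL_SOCKET, SO_KEEPALIVE */

/* Set one option per level. Returns 0 on success, -1 on the first error. */
int tune_tcp_socket(int fd)
{
    int on = 1;
    int ttl = 64;

    /* socket level: keep idle connections alive */
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) == -1)
        return -1;
    /* IP level: time-to-live of outgoing packets */
    if (setsockopt(fd, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl)) == -1)
        return -1;
    /* TCP level: disable Nagle's algorithm */
    if (setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on)) == -1)
        return -1;
    return 0;
}

/* Read back an integer-valued option, or -1 on error. */
int get_int_option(int fd, int level, int optname)
{
    int value = -1;
    socklen_t len = sizeof(value);

    if (getsockopt(fd, level, optname, &value, &len) == -1)
        return -1;
    return value;
}
```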
As already mentioned they are used for setting/getting various options for a socket.
For example, if you are testing a server application that crashes, you don't want to wait a certain number of minutes before the kernel lets you reuse the port, running into "Address already in use" errors in the meantime. This can be avoided with the SO_REUSEADDR option, which lets other sockets bind to the same port unless there is an active listener bound already.
You can also retrieve data about a socket, such as the number of lost packets or retransmissions, by using the TCP_INFO option on Linux machines.
Basically, you can configure all the fine settings.
Options for setsockopt(2) and getsockopt(2).
Superficially, sockets look like a bidirectional pipe which is useful because standard system calls such as write, read, close can be used on them just like on normal pipes or even files. Even if you add socket-specific calls (listen, connect, bind, accept), there is a useful level of abstraction that hides away details in favor of the notion of streaming or datagram sockets.
But as soon as protocol-specific details come into play and specific settings need to be tuned (for example send/receive buffers, timeout settings), a very generic interface is needed to account for the different settings and their specific data formats. getsockopt, setsockopt are part of this generic interface.
int getsockopt(int sockfd, int level, int optname,
void *optval, socklen_t *optlen);
int setsockopt(int sockfd, int level, int optname,
const void *optval, socklen_t optlen);
The protocol-specific options are selected using level and optname and the protocol-specific data is hidden in a buffer, so the two system calls do not need to know anything about the settings of every protocol the OS may support -- it's enough if your application and the actual protocol implementation know about those details.
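A small sketch of what "hidden in a buffer" means in practice: the same two calls carry a plain int for SO_TYPE and a struct linger for SO_LINGER, and only the caller and the protocol code know which is which. The wrapper names are hypothetical.

```c
#include <sys/socket.h>

/* SO_TYPE carries a plain int (SOCK_STREAM, SOCK_DGRAM, ...). */
int socket_type(int fd)
{
    int type;
    socklen_t len = sizeof(type);

    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &type, &len) == -1)
        return -1;
    return type;
}

/* SO_LINGER carries a struct linger through the same void* buffer. */
int set_linger(int fd, int seconds)
{
    struct linger lg;

    lg.l_onoff = 1;
    lg.l_linger = seconds;
    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}

/* Read the linger setting back into lg. Returns 0 on success, -1 on error. */
int get_linger(int fd, struct linger *lg)
{
    socklen_t len = sizeof(*lg);

    return getsockopt(fd, SOL_SOCKET, SO_LINGER, lg, &len);
}
```

The optlen argument is what lets the kernel check that the buffer is large enough for the option in question.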