Purpose of sendto address for C raw socket?

I'm sending some ping packets via a raw socket in C, on my Linux machine.
int sock_fd = socket(AF_INET, SOCK_RAW, IPPROTO_RAW);
This means that I specify the IP packet header when I write to the socket (IP_HDRINCL is implied).
Writing to the socket with send fails, telling me I need to specify an address.
If I use sendto then it works. For sendto I must specify a sockaddr_in struct to use, which includes the fields sin_family, sin_port and sin_addr.
However, I have noticed a few things:
The sin_family is AF_INET - which was already specified when the socket was created.
The sin_port is naturally unused (ports are not a concept for IP).
It doesn't matter what address I use, so long as it is an external address (the IP packet specifies 8.8.8.8 and the sin_addr specifies 1.1.1.1).
It seems none of the extra fields in sendto are actually used to great extent. So, is there a technical reason why I have to use sendto instead of send or is it just an oversight in the API?

Writing to the socket with send fails, telling me I need to specify an address.
It fails, because the send() function can only be used on connected sockets (as stated here). Usually you would use send() for TCP communication (connection-oriented) and sendto() can be used to send UDP datagrams (connectionless).
Since you want to send "ping" packets, or more correctly ICMP datagrams, which are clearly connectionless, you have to use the sendto() function.
It seems none of the extra fields in sendto are actually used to great
extent. So, is there a technical reason why I have to use sendto
instead of send or is it just an oversight in the API?
Short answer:
When you are not allowed to use send(), then there is only one option left, called sendto().
Long answer:
It is not just an oversight in the API. If you want to send a UDP datagram by using an ordinary socket (e.g. SOCK_DGRAM), sendto() needs the information about the destination address and port, which you provided in the struct sockaddr_in, right? The kernel will insert that information into the resulting IP header, since the struct sockaddr_in is the only place where you specified who the receiver will be. Or in other words: in this case the kernel has to take the destination info from your struct as you don't provide an additional IP header.
Because sendto() is not only used for UDP but also raw sockets, it has to be a more or less "generic" function which can cover all the different use cases, even when some parameters like the port number are not relevant/used in the end.
For instance, by using IPPROTO_RAW (which automatically implies IP_HDRINCL), you show your intention that you want to create the IP header on your own. Thus the last two arguments of sendto() are largely redundant, because the destination is already included in the data buffer you pass to sendto() as the second argument. Note that, even when you use IP_HDRINCL with your raw socket, the kernel still fills in parts of the IP header: on Linux, the header checksum and total length are always filled in, and the source address and packet ID are filled in if you set those fields to 0.
If you want to write your own ping program, you could also change the last argument in your socket() function from IPPROTO_RAW to IPPROTO_ICMP and let the kernel create the IP header for you, so you have one thing less to worry about. Now you can easily see how the two sendto()-parameters *dest_addr and addrlen become significant again because it's the only place where you provide a destination address.
The language and APIs are very old and have grown over time. Some APIs can look weird from today's perspective, but you can't change the old interfaces without breaking a huge amount of existing code. Sometimes you just have to get used to things that were defined or designed many years or decades ago.
Hope that answers your question.
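To illustrate the IPPROTO_ICMP variant described above, here is a minimal sketch, assuming Linux with glibc (struct icmphdr from <netinet/ip_icmp.h>). The function names inet_checksum, build_echo_request and send_echo are hypothetical, chosen only for this example. The kernel builds the IP header, so the address passed to sendto() is the only place the destination is given. Creating the raw socket normally requires root or CAP_NET_RAW, so send_echo simply returns -1 when that fails.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <netinet/ip_icmp.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Internet checksum (RFC 1071) over len bytes, returned in network byte order. */
uint16_t inet_checksum(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint32_t sum = 0;

    while (len > 1) {                      /* add 16-bit big-endian words */
        sum += (uint32_t)((p[0] << 8) | p[1]);
        p += 2;
        len -= 2;
    }
    if (len == 1)                          /* odd trailing byte, padded with 0 */
        sum += (uint32_t)(p[0] << 8);
    while (sum >> 16)                      /* fold carries back in */
        sum = (sum & 0xFFFF) + (sum >> 16);
    return htons((uint16_t)~sum);
}

/* Writes an ICMP echo request (header only, no payload) into buf.
 * Returns the packet length, or 0 if buf is too small. */
size_t build_echo_request(uint8_t *buf, size_t cap, uint16_t id, uint16_t seq)
{
    struct icmphdr hdr;

    if (cap < sizeof hdr)
        return 0;
    memset(&hdr, 0, sizeof hdr);
    hdr.type = ICMP_ECHO;
    hdr.code = 0;
    hdr.un.echo.id = htons(id);
    hdr.un.echo.sequence = htons(seq);
    hdr.checksum = inet_checksum(&hdr, sizeof hdr);  /* field was 0 while summing */
    memcpy(buf, &hdr, sizeof hdr);
    return sizeof hdr;
}

/* Sends one echo request to the dotted-quad address ip.
 * Returns 0 on success, -1 on any failure (bad address, no privilege, ...). */
int send_echo(const char *ip)
{
    struct sockaddr_in dst;
    uint8_t pkt[sizeof(struct icmphdr)];
    size_t n;
    ssize_t sent;
    int fd;

    memset(&dst, 0, sizeof dst);
    dst.sin_family = AF_INET;              /* sin_port stays 0: ICMP has no ports */
    if (inet_pton(AF_INET, ip, &dst.sin_addr) != 1)
        return -1;

    fd = socket(AF_INET, SOCK_RAW, IPPROTO_ICMP);
    if (fd < 0)
        return -1;

    n = build_echo_request(pkt, sizeof pkt, 1, 1);
    sent = sendto(fd, pkt, n, 0, (struct sockaddr *)&dst, sizeof dst);
    close(fd);
    return sent == (ssize_t)n ? 0 : -1;
}
```

A caller would use it as send_echo("8.8.8.8") and check the return value. Reading the echo reply with recvfrom() is left out to keep the sketch short.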

The send() call can be used only when the socket is in a connected state, as is typical for TCP (SOCK_STREAM) sockets.
From the man page:
the send() call may be used only when the socket is in a connected
state (so that the intended recipient is known).
Since your application obviously does not connect with any other socket, we cannot expect send() to work.
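The rule quoted from the man page can be observed without root privileges by using a UDP socket in place of the raw socket, which is an assumption made here only so the sketch runs unprivileged. The helper name send_on_udp is hypothetical. On Linux, send() on an unconnected datagram socket is expected to fail with EDESTADDRREQ, while the same call succeeds once connect() has set a default peer.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Sends one byte with send() on a UDP socket, optionally after connect().
 * Returns 0 on success, or the errno value of the call that failed. */
int send_on_udp(int do_connect)
{
    struct sockaddr_in peer;
    socklen_t len = sizeof peer;
    int result = 0;

    int rx = socket(AF_INET, SOCK_DGRAM, 0);   /* receiver on loopback */
    int tx = socket(AF_INET, SOCK_DGRAM, 0);   /* sender under test */
    if (rx < 0 || tx < 0) {
        result = errno;
        goto out;
    }

    /* Bind the receiver to 127.0.0.1, port 0 = let the kernel pick a port. */
    memset(&peer, 0, sizeof peer);
    peer.sin_family = AF_INET;
    peer.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(rx, (struct sockaddr *)&peer, sizeof peer) < 0 ||
        getsockname(rx, (struct sockaddr *)&peer, &len) < 0) {
        result = errno;
        goto out;
    }

    /* connect() on a datagram socket only records the default destination. */
    if (do_connect && connect(tx, (struct sockaddr *)&peer, sizeof peer) < 0) {
        result = errno;
        goto out;
    }

    if (send(tx, "x", 1, 0) < 0)
        result = errno;

out:
    if (rx >= 0)
        close(rx);
    if (tx >= 0)
        close(tx);
    return result;
}
```

Calling send_on_udp(0) shows the failure the question describes, and send_on_udp(1) shows that the send goes through once a peer address is known.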

In addition to InvertedHeli's answer, the dest_addr passed to sendto() will be used by the kernel to determine which network interface to use.
For example, if dest_addr has ip 127.0.0.1 and the raw packet has dest address 8.8.8.8, your packet will still be routed to the lo interface.

Related

Raw sockets receiving messages sent by itself

I was trying to write some code using raw sockets when I observed a strange phenomenon. Consider the code:
int rsfd = socket(AF_INET, SOCK_RAW, 253);
if (rsfd < 0)
{
    perror("Raw socket not created");
}
else
{
    struct sockaddr_in addr2;
    memset(&addr2, 0, sizeof(addr2));
    addr2.sin_family = AF_INET;
    addr2.sin_addr.s_addr = inet_addr("127.0.0.2");
    /* if (connect(rsfd, (struct sockaddr *)&addr2, sizeof(addr2)) < 0)
    {
        perror("Could not connect"); continue;
    } */
}
Now if I leave the connect() call commented out, whatever message I send through this rsfd is also received by the same socket. On the other end I have already bound a socket to the IP address 127.0.0.2. When I print the IP address of the sender socket, it shows 127.0.0.1, but it still receives packets that are meant for 127.0.0.2. The problem went away when I added the connect() call shown in the commented portion. This seems weird because on the other side no one is accepting or listening on this address, and moreover I am using sendto and recvfrom for sending and receiving packets, which are meant for connectionless sockets. My question is: why is this happening? How does this connect() call solve the problem here?
Now if I [don't connect() the socket], whatever message I am sending through this rsfd is also being received by itself.
I note first that raw sockets are an extension to POSIX. Linux offers them, and I think other systems do too, but details of their behavior are not certain to be consistent across implementations.
With that said, the problem seems likely to be that you are not bind()ing your socket to any address. On Linux, for example, the docs for raw sockets note that
A raw socket can be bound to a specific local address using the
bind(2) call. If it isn't bound, all packets with the specified IP
protocol are received.
(Emphasis added.) On a system where raw sockets have that behavior, if you're sending packets to an IP loopback address via a raw IP socket that is neither bound nor connected then yes, the source socket will receive them, or at least may do.
It's unclear why connecting the socket solves the problem, or why it is even successful at all. The behavior of connect() is unspecified for socket types other than the standard ones, SOCK_DGRAM, SOCK_STREAM, and SOCK_SEQPACKET. However, the behavior you observe is consistent with connect() having an effect on raw sockets similar to the one it has on datagram sockets, which are also connectionless:
If the socket sockfd is of type SOCK_DGRAM, then addr is the address
to which datagrams are sent by default, and the only address from
which datagrams are received.
Instead of relying on that discovered behavior, however, I suggest following the documented (at least on Linux) procedure of binding the socket to an address (including a port), and communicating with it at that address.

What is the HOPOPT protocol and how does socket() work?

I'm messing with sockets in C and this protocol continues to show up, I couldn't find anything about it, so what is it used for? What's the difference between HOPOPT and IP?
Also, I don't get why the last argument of the socket() function should be 0. According to the man page:
The protocol specifies a particular protocol to be used with the socket. Normally only a single protocol exists to support a particular socket type within a given protocol family, in which case protocol can be specified as 0. However, it is possible that many protocols may exist, in which case a particular protocol must be specified in this manner. The protocol number to use is specific to the “communication domain” in which communication is to take place; see protocols(5). See getprotoent(3) on how to map protocol name strings to protocol numbers.
As far as I understand, setting the last argument to 0 lets the standard library decide which protocol to use, but in which case would one use a number other than 0?
HOPOPT is the acronym of the Hop-by-Hop Options IPv6 extension header. It is a header that allows additional options to be added to an IPv6 packet, and it is normal for IPv6 packets to include it. Its assigned IP protocol number is 0, which is why the name can show up when protocol number 0 is looked up.
socket() is the system call that BSD and others (Linux et al.) provide to create a new socket, that is the internal representation of a network connection. When creating a new socket, the desired protocol must be specified: TCP, UDP, etc., which may go over IPv4, IPv6, etc.
The paragraph that you are citing explains that one or many protocols may exist for each socket type.
If only one exists, the protocol argument can be zero. For instance, for AF_INET the default SOCK_STREAM protocol is TCP:
int sk = socket(AF_INET, SOCK_STREAM, 0);
If more exist, then you must specify which protocol in particular you want to use. For example, the SOCK_SEQPACKET type can be implemented with the SCTP protocol:
int sk = socket(AF_INET, SOCK_SEQPACKET, IPPROTO_SCTP);
So, in conclusion:
If you want to create a socket, choose what protocol to use, for instance TCP over IPv4.
HOPOPT is totally normal in an IPv6 packet. If you see it appear in your traces, because you created an IPv6 socket (using AF_INET6), it is OK.

How to filter a multicast receiving socket by interface?

I need to create two sockets listening on the same IP:port but on different interfaces:
socket0 receives UDP traffic sent to 224.2.2.2:5000 on interface eth0
socket1 receives UDP traffic sent to 224.2.2.2:5000 on interface eth1
It seemed pretty straightforward until I realized that Linux merges all of that into the same traffic. For example, say there's only traffic on eth1 and there's no activity on eth0. When I first create socket0 it won't be receiving any data, but as soon as I create socket1 (and join the multicast group) socket0 will also start receiving the same data. I found this link that explains this.
Now this actually makes sense to me, because the only moment when I specify the network interface is when joining the multicast group with setsockopt(socket,IPPROTO_IP,IP_ADD_MEMBERSHIP,...) and ip_mreq.imr_interface.s_addr. I believe this specifies which interface joins the group but has nothing to do with which interface your socket will receive from.
What I tried so far is binding the sockets to the multicast address and port, which behaves like mentioned above. I've tried binding to the interface address but that doesn't work on Linux (it seems to do so on Windows though), you don't receive any traffic on the socket. And finally, I've tried binding to INADDR_ANY but this isn't what I want since I will receive any other data sent to the port regardless of the destination IP, say unicast for example, and it will still not stop multicast data from other interfaces.
I cannot use SO_BINDTODEVICE since it requires root privileges.
So what I want to know is if this is possible at all. If it can't be done then that's fine, I'll take that as an answer and move on, I just haven't been able to find any way around it. Oh, and I've tagged the question as C because that's what we're using, but I'm thinking it really might not be specific to the language.
I haven't included the code for this because I believe it's more of a theoretical question rather than a problem with the source code. We've been working with sockets (multicast or otherwise) for a while now without any problems, it's just this is the first time we've had to deal with multiple interfaces. But if you think it might help I can write some minimal working example.
Edit about the possible duplicate:
I think the usecase I'm trying to achieve here is different. The socket is supposed to receive data from the same multicast group and port (224.2.2.2:5000 in the example above) but only from one specific interface. To put it another way, both interfaces are receiving data from the same multicast group (but different networks, so data is different) and I need each socket to only listen on one interface.
I think that question is about multiple groups on same port, rather than same group from different interfaces. Unless there's something I'm not seeing there that might actually help me with this.
Yes, you can do what you want on Linux, without root privileges:
Bind to INADDR_ANY and set the IP_PKTINFO socket option. You then have to use recvmsg() to receive your multicast UDP packets and to scan for the IP_PKTINFO control message. This gives you some side band information of the received UDP packet:
struct in_pktinfo {
    unsigned int   ipi_ifindex;  /* Interface index */
    struct in_addr ipi_spec_dst; /* Local address */
    struct in_addr ipi_addr;     /* Header Destination address */
};
The ipi_ifindex is the interface index the packet was received on. (You can turn this into an interface name using if_indextoname(), or go the other way round with if_nametoindex().)
As you said on Windows the same network functions have different semantics, especially for UDP and even more for multicast.
The Linux bind() semantics for the IP address for UDP sockets are mostly useless. It is essentially just a destination address filter. You will almost always want to bind to INADDR_ANY for UDP sockets since you either do not care to which address a packet was sent or you want to receive packets for multiple addresses (e.g. receiving unicast and multicast).
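The IP_PKTINFO approach from this answer can be sketched as follows, assuming Linux with glibc. The function names open_pktinfo_socket and recv_with_ifindex are hypothetical, and joining the multicast group with IP_ADD_MEMBERSHIP is left out to keep the example short.

```c
#define _GNU_SOURCE /* makes struct in_pktinfo visible in glibc headers */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Opens a UDP socket bound to INADDR_ANY:port with IP_PKTINFO enabled.
 * Returns the descriptor, or -1 on error. */
int open_pktinfo_socket(uint16_t port)
{
    struct sockaddr_in addr;
    int on = 1;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (setsockopt(fd, IPPROTO_IP, IP_PKTINFO, &on, sizeof on) < 0 ||
        bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Receives one datagram and reports the index of the interface it arrived on.
 * Returns the payload length, or -1 on error or if no IP_PKTINFO was attached. */
ssize_t recv_with_ifindex(int fd, void *buf, size_t cap, unsigned int *ifindex)
{
    /* Control buffer sized for one in_pktinfo message; the union aligns it. */
    union {
        char raw[CMSG_SPACE(sizeof(struct in_pktinfo))];
        struct cmsghdr align;
    } ctl;
    struct iovec iov = { .iov_base = buf, .iov_len = cap };
    struct msghdr msg;
    struct cmsghdr *c;
    ssize_t n;

    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.raw;
    msg.msg_controllen = sizeof ctl.raw;

    n = recvmsg(fd, &msg, 0);
    if (n < 0)
        return -1;

    /* Scan the ancillary data for the IP_PKTINFO control message. */
    for (c = CMSG_FIRSTHDR(&msg); c != NULL; c = CMSG_NXTHDR(&msg, c)) {
        if (c->cmsg_level == IPPROTO_IP && c->cmsg_type == IP_PKTINFO) {
            struct in_pktinfo info;
            memcpy(&info, CMSG_DATA(c), sizeof info);
            *ifindex = (unsigned int)info.ipi_ifindex;
            return n;
        }
    }
    return -1;
}
```

A receiver that should only handle traffic from one interface would compare the reported index with if_nametoindex() for that interface (declared in <net/if.h>) and discard datagrams that arrived elsewhere.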

UDP based chat in C

I'm supposed to make a communicator in C, based on dgrams. I don't know what arguments should I pass to bind() function. I skimmed through most UDP-chat questions & codes here on StackOverflow but I still can't find any specific information on the issue.
What type of address structure should I use?
What port can I use? Any number bigger than 1024 ?
What IP address do I bind my socket with? (most people put INADDR_ANY but isn't it for receiving only?)
Also, do I need multiple sockets? One for receiving & another for sending messages.
What type of address structure should I use?
If you are using IPv4, use a sockaddr_in. If you want to use IPv6 instead, use a sockaddr_in6.
What port can I use? Any number bigger than 1024 ?
Yes, assuming no other program is already using that port number for its own UDP socket. (If another program is using the port number you chose, it will cause bind() to fail with errno EADDRINUSE)
What IP address do I bind my socket with? (most people put
INADDR_ANY but isn't it for receiving only?)
INADDR_ANY is what you generally want to use. It tells the OS that you want to receive incoming UDP packets on any of the computer's network interfaces. (If you only wanted to receive UDP packets from a particular network interface, OTOH, e.g. only on WiFi, you could specify that network interface's IP address instead.)
Also, do I need multiple sockets? One for receiving & another for
sending messages.
You can have multiple sockets if you want, but it's not necessary to do it that way. You can instead use a single socket for both sending and receiving UDP packets. One common pattern is to use a single socket, set to non-blocking mode, and something like select() or poll() to multiplex the input and output needs of your program. An alternative pattern would be to use two threads (one for sending and one for receiving), blocking I/O, and either one or two sockets (depending on whether you prefer to have the two threads share a socket, or give each thread its own socket). I prefer the single-threaded/single-socket/select() solution myself, as I think it is the least error-prone approach.
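The single-socket pattern can be sketched like this, using poll() and IPv4. The function names open_chat_socket and wait_readable are hypothetical. The same descriptor is passed to both sendto() and recvfrom().

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <poll.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Opens one UDP socket bound to INADDR_ANY:port, used for sending and receiving.
 * Passing port 0 lets the kernel choose a free port. Returns -1 on error. */
int open_chat_socket(uint16_t port)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);              /* e.g. EADDRINUSE if the port is taken */
        return -1;
    }
    return fd;
}

/* Waits up to timeout_ms for a datagram to arrive on fd.
 * Returns 1 if recvfrom() would not block, 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd p = { .fd = fd, .events = POLLIN };
    int rc = poll(&p, 1, timeout_ms);

    if (rc < 0)
        return -1;
    return (rc > 0 && (p.revents & POLLIN)) ? 1 : 0;
}
```

A chat loop would typically add STDIN_FILENO as a second pollfd entry, call sendto() when the user has typed a line, and call recvfrom() when wait_readable() reports incoming data.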

Making a reliable UDP by socket function in c

I am having this doubt in socket programming which I could not get cleared by reading the man pages.
In C the declaration of the socket function is int socket(int domain, int type, int protocol);
The Linux man page says that while type decides the stream that will be followed, the protocol number is the one that decides the protocol being followed.
So my question is: suppose I give the type parameter as SOCK_STREAM, which is reliable, and add the protocol number for UDP. Would it give me a reliable UDP which is the same as TCP but without flow control and congestion control?
Unfortunately I can't test this out as I have a single machine so there is no packet loss happening.
Could anyone clear this doubt? Thanks a lot...
UDP cannot be made reliable. Transmission of the packets is done on a "best effort" capacity, but any router/host along the chain is free to drop the packet in the garbage and NOT inform the sender that it has done so.
That's not to say you can't impose extra semantics on the sending and receiving ends to expect packets within a certain time frame and say "hey, we didn't get anything in the last X seconds". But that can't be done at the protocol level. UDP is a "dump it into the outbox and hope it gets there" protocol.
No. For an IPv4 or IPv6 protocol stack, SOCK_STREAM is going to get you TCP and the type SOCK_DGRAM is going to give you UDP. The protocol parameter does not change that: it has to be either 0 or the protocol that matches the type, and the socket library typically expects a value of 0 there.
If you do this:
socket(AF_INET, SOCK_STREAM, IPPROTO_UDP);
socket() will return -1 and set errno to
EPROTONOSUPPORT
The protocol type or the specified protocol
is not supported within this domain.
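This can be checked on a single machine, since the failure happens when the socket is created and no packets are involved. A minimal sketch, with the helper name try_socket being hypothetical:

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Tries to create an AF_INET socket with the given type and protocol.
 * Returns 0 if it could be created, otherwise the errno from socket(). */
int try_socket(int type, int protocol)
{
    int fd = socket(AF_INET, type, protocol);

    if (fd < 0)
        return errno;
    close(fd);
    return 0;
}
```

On Linux, try_socket(SOCK_STREAM, IPPROTO_UDP) is expected to return EPROTONOSUPPORT, while try_socket(SOCK_STREAM, 0) and try_socket(SOCK_DGRAM, IPPROTO_UDP) return 0.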
