UDP based chat in C

UDP based chat in C - c

I'm supposed to make a communicator in C, based on dgrams. I don't know what arguments should I pass to bind() function. I skimmed through most UDP-chat questions & codes here on StackOverflow but I still can't find any specific information on the issue.
What type of address structure should I use?
What port can I use? Any number bigger than 1024 ?
What IP adress do I bind my socket with? (most of people put INADDR_ANY but isn't it for receiving only?)
Also, do I need multiple sockets? One for receiving & another for sending messages.

What type of address structure should I use?
If you are using IPv4, use a sockaddr_in. If you want to use IPv6 instead, use a sockaddr_in6.
What port can I use? Any number bigger than 1024 ?
Yes, assuming no other program is already using that port number for its own UDP socket. (If another program is using the port number you chose, it will cause bind() to fail with errno EADDRINUSE)
What IP adress do I bind my socket with? (most of people put
INADDR_ANY but isn't it for receiving only?)
INADDR_ANY is what you generally want to use. It tells the OS that you want to receive incoming UDP packets on any of the computers network interfaces. (If you only wanted to receive UDP packets from a particular network interface, OTOH, e.g. only on WiFi, you could specify that network interface's IP address instead)
Also, do I need multiple sockets? One for receiving & another for
sending messages.
You can have multiple sockets if you want, but it's not necessary to do it that way. You can instead use a single socket for both sending and receiving UDP packets. One common pattern is to use a single socket, set to non-blocking mode, and something like select() or poll() to multiplex the input and output needs of your program. An alternative pattern would be to use two threads (one for sending and one for receiving), blocking I/O, and either one or two sockets (depending on whether you prefer to have the two threads share a socket, or give each thread its own socket). I prefer the single-threaded/single-socket/select() solution myself, as I think it is the least error-prone approach.

Related

How to filter a multicast receiving socket by interface?

I need to create two sockets listening on the same IP:port but on different interfaces:
socket0 receives UDP traffic sent to 224.2.2.2:5000 on interface eth0
socket1 receives UDP traffic sent to 224.2.2.2:5000 on interface eth1
It seemed pretty straight forward until I realized that Linux merges all of that into the same traffic. For example, say there's only traffic on eth1 and there's no activity on eth0. When I first create socket0 it won't be receiving any data but as soon as I create socket1 (and join the multicast group) then socket0 will also start receiving the same data. I found this link that explains this.
Now this actually makes sense to me because the only moment when I specify the network interface is when joining the multicast group setsockopt(socket,IPPROTO_IP,IP_ADD_MEMBERSHIP,...) with ip_mreq.imr_interface.s_addr. I believe this specifies which interface joins the group but has nothing to do with from which interface your socket will receive from.
What I tried so far is binding the sockets to the multicast address and port, which behaves like mentioned above. I've tried binding to the interface address but that doesn't work on Linux (it seems to do so on Windows though), you don't receive any traffic on the socket. And finally, I've tried binding to INADDR_ANY but this isn't what I want since I will receive any other data sent to the port regardless of the destination IP, say unicast for example, and it will still not stop multicast data from other interfaces.
I cannot use SO_BINDTODEVICE since it requires root privileges.
So what I want to know is if this is possible at all. If it can't be done then that's fine, I'll take that as an answer and move on, I just haven't been able to find any way around it. Oh, and I've tagged the question as C because that's what we're using, but I'm thinking it really might not be specific to the language.
I haven't included the code for this because I believe it's more of a theoretical question rather than a problem with the source code. We've been working with sockets (multicast or otherwise) for a while now without any problems, it's just this is the first time we've had to deal with multiple interfaces. But if you think it might help I can write some minimal working example.
Edit about the possible duplicate:
I think the usecase I'm trying to achieve here is different. The socket is supposed to receive data from the same multicast group and port (224.2.2.2:5000 in the example above) but only from one specific interface. To put it another way, both interfaces are receiving data from the same multicast group (but different networks, so data is different) and I need each socket to only listen on one interface.
I think that question is about multiple groups on same port, rather than same group from different interfaces. Unless there's something I'm not seeing there that might actually help me with this.

Yes, you can do what you want on Linux, without root privileges:
Bind to INADDR_ANY and set the IP_PKTINFO socket option. You then have to use recvmsg() to receive your multicast UDP packets and to scan for the IP_PKTINFO control message. This gives you some side band information of the received UDP packet:
struct in_pktinfo {
unsigned int ipi_ifindex; /* Interface index */
struct in_addr ipi_spec_dst; /* Local address */
struct in_addr ipi_addr; /* Header Destination address */
};
The ipi_ifindex is the interface index the packet was received on. (You can turn this into an interface name using if_indextoname() or the other way round with if_nametoindex().
As you said on Windows the same network functions have different semantics, especially for UDP and even more for multicast.
The Linux bind() semantics for the IP address for UDP sockets are mostly useless. It is essentially just a destination address filter. You will almost always want to bind to INADDR_ANY for UDP sockets since you either do not care to which address a packet was sent or you want to receive packets for multiple addresses (e.g. receiving unicast and multicast).

Purpose of sendto address for C raw socket?

I'm sending some ping packets via a raw socket in C, on my linux machine.
int sock_fd = socket(AF_INET, SOCK_RAW, IPPROTO_RAW);
This means that I specify the IP packet header when I write to the socket (IP_HDRINCL is implied).
Writing to the socket with send fails, telling me I need to specify an address.
If I use sendto then it works. For sendto I must specify a sockaddr_in struct to use, which includes the fields sin_family, sin_port and sin_addr.
However, I have noticed a few things:
The sin_family is AF_INET - which was already specified when the socket was created.
The sin_port is naturally unused (ports are not a concept for IP).
It doesn't matter what address I use, so long as it is an external address (the IP packet specifies 8.8.8.8 and the sin_addr specifies 1.1.1.1).
It seems none of the extra fields in sendto are actually used to great extent. So, is there a technical reason why I have to use sendto instead of send or is it just an oversight in the API?

Writing to the socket with send fails, telling me I need to specify an address.
It fails, because the send() function can only be used on connected sockets (as stated here). Usually you would use send() for TCP communication (connection-oriented) and sendto() can be used to send UDP datagrams (connectionless).
Since you want to send "ping" packets, or more correctly ICMP datagrams, which are clearly connectionless, you have to use the sendto() function.
It seems none of the extra fields in sendto are actually used to great
extent. So, is there a technical reason why I have to use sendto
instead of send or is it just an oversight in the API?
Short answer:
When you are not allowed to use send(), then there is only one option left, called sendto().
Long answer:
It is not just an oversight in the API. If you want to send a UDP datagram by using an ordinary socket (e.g. SOCK_DGRAM), sendto() needs the information about the destination address and port, which you provided in the struct sockaddr_in, right? The kernel will insert that information into the resulting IP header, since the struct sockaddr_in is the only place where you specified who the receiver will be. Or in other words: in this case the kernel has to take the destination info from your struct as you don't provide an additional IP header.
Because sendto() is not only used for UDP but also raw sockets, it has to be a more or less "generic" function which can cover all the different use cases, even when some parameters like the port number are not relevant/used in the end.
For instance, by using IPPROTO_RAW (which automatically implies IP_HDRINCL), you show your intention that you want to create the IP header on your own. Thus the last two arguments of sendto() are actually redundant information, because they're already included in the data buffer you pass to sendto() as the second argument. Note that, even when you use IP_HDRINCL with your raw socket, the kernel will fill in the source address and checksum of your IP datagram if you set the corresponding fields to 0.
If you want to write your own ping program, you could also change the last argument in your socket() function from IPPROTO_RAW to IPPROTO_ICMP and let the kernel create the IP header for you, so you have one thing less to worry about. Now you can easily see how the two sendto()-parameters *dest_addr and addrlen become significant again because it's the only place where you provide a destination address.
The language and APIs are very old and have grown over time. Some APIs can look weird from todays perspective but you can't change the old interfaces without breaking a huge amount of existing code. Sometimes you just have to get used to things that were defined/designed many years or decades ago.
Hope that answers your question.

The send() call is used when the sockets are in a TCP SOCK_STREAM connected state.
From the man page:
the send() call may be used only when the socket is in a connected
state (so that the intended recipient is known).
Since your application obviously does not connect with any other socket, we cannot expect send() to work.

In addition to InvertedHeli's answer, the dest_addr passed in sendto() will be used by kernel to determine which network interface to used.
For example, if dest_addr has ip 127.0.0.1 and the raw packet has dest address 8.8.8.8, your packet will still be routed to the lo interface.

C `sendto` versus `write`

Correct me if I'm wrong, but my understanding of sending a raw packet inevitably is defined as buffering an array of bytes in an array, and writing it to a socket. However, most example code I've seen so far tend towards sendto, rarely is send used, and I've never seen code other than my own use write. Am I missing something? What is with this apparent preoccupation with complicating code like this?
Why use send and sendto when write seems to me to be the obvious choice when dealing with raw sockets?

sendto is typically used with unconnected UDP sockets or raw sockets. It takes a parameter specifying the destination address/port of the packet. send and write don't have this parameter, so there's no way to tell the data where to go.
send is used with TCP sockets and connected UDP sockets. Since a connection has been established, a destination does not need to be specified, and in fact this function doesn't have a parameter for one.
While the write function can be used in places where send can be used, it lacks the flags parameter which can enable certain behaviors on TCP sockets. It also doesn't return the same set of error codes as send, so if things go wrong you might not get a meaningful error code. In theory you could also use write on a raw socket if the IP_HDRINCL socket option is set, but again it's not preferable since it doesn't support the same error codes as send.

Few queries regarding raw sockets in C

I want to make a chat room using raw socket in C. I have following problems:
Q 1 : Can I use select function to handle multiple connections in case of raw sockets ?
Q 2 : Port nos in sockets are real ports or logically implemented for various applications on transport layer??
Q 3 : I am having one computer only so using lo ( local loop) as my interface. So the process which is initiating the chat has send first and then receive call, so it's receiving it's own data. How to restrict it?
Any help would be appreciated since that would help me in increasing my confidence on raw sockets.
Thanks :)

If you want this to be a real, usable chat system, stop. Don't use raw sockets. Huge mistake.
If you are just playing around because you want to put “raw sockets” under the “Experience” section of your résumé, you may read on.
You can use the select function to detect when a raw socket has a packet available to receive, and when it can accept a packet to transmit. You can pass multiple file descriptors to a single call to select if you want to check multiple raw sockets (or whatever) simultaneously.
Port numbers are part of the TCP and UDP protocols (and some other transport layer protocols). The kernel doesn't look for port numbers when receiving packets for raw sockets.
The raw(7) man page‚ states:
All packets or errors matching the protocol number specified for the raw socket are passed to this socket.
And it also states:
A raw socket can be bound to a specific local address using the bind(2) call. If it isn't bound, all packets with the specified IP protocol are received.
Therefore you probably want to at least use different IP addresses for each end of the “connection”, and bind each end to its address.
“But!” you say, “I'm using loopback! I can only use the 127.0.0.1 address!” Not so, my friend. The entire 127.0.0.0/8 address block is reserved for loopback addresses; 127.0.0.1 is merely the most commonly-used loopback address. Linux (but perhaps not other systems) responds to every address in the loopback block. Try this in one window:
nc -v -l 10150
And then in another window:
nc -s 127.0.0.1 127.0.0.2 10150
You will see that you have created a TCP connection from 127.0.0.1 to 127.0.0.2. I think you can also bind your raw sockets to separate addresses. Then, when you receive a packet, you can check whether it's from the other end's IP address to decide whether to process or discard it.

Just curious, why do you want to use raw sockets? Raw sockets (AF_INET, SOCK_RAW) allow you to send out "raw" packets, where you are responsible for crafting everything but the MAC and IP layers.
A1: There are no "connections" with raw sockets. Just packets.
A2: There are no "ports" with raw sockets. Just packets. "Port numbers" as we know them are part of the TCP or UDP protocols, both of which are above the level at which we work with raw sockets.
A3: This is not specific to raw sockets - you would have this issue regardless of your protocol selection. To really answer this, we would need to know much more about your proposed protocol, since right now, you're simply blasting out raw IP packets.

Get random port for UDP socket

I need to create a program that will communicate with other programs on the same computer via UDP sockets. It will read commands from stdin, and some of this commands will make it to send/receive packets without halting execution. I've read some information out there, but since I'm not familiar with socket programming and need to get this done quickly, I have the following questions:
I need to get a random unused port for the program to listen to, and reserve it so other programs can communicate with this and also the port isn't reserved by another program. I also need to store the port number on a variable for future usage.
Since the communication is across processes on the same machine, I'm wondering if I can use PF_LOCAL.
Also a code sample of the setup of such socket would be welcome, as well as an example of sending/receiving character strings.

Call bind() specifying port 0. That will allow the OS to pick an unused port. You can then use getsockname() to retreive the chosen port.

Answer by Remy Lebeau is good if you need a temporary port. It is not so good if you need a persistent reserved port because other software also uses the same method to get a port (including OS TCP stack that needs a new temporary port for each connection).
So the following might happen:
You call bind with 0 and getsockname() to get a port;
then save it into config (or into several configs) for future uses;
software that needs this port starts and binds the port.
Then you need to e.g. restart the software:
software stops and unbinds the port: the port can now be returned by bind(0) and getsockname() again;
e.g. TCP stack needs a port and binds your port;
software can't start because port is already bound.
So for "future uses" you need a port that is not in ephemeral port range (that's the range from which bind(host, 0) returns a port).
My solution for this issue is the port-for command-line utility.

If it being a random port is actually important, you should do something like:
srand(time(NULL));
rand() % NUM_PORTS; // NUM_PORTS isn't a system #define
Then specify that port in bind. If it fails, pick a new one (no need to re-seed the random generator. If the random port isn't important, look at Remy Lebeau's answer.

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight