I'm writing a program in C on GNU/Linux that uses UDP to communicate messages between various instances of the program, either on a single machine or across a network. Each instance of the program has its own unique internal application layer address that it uses to differentiate between instances that run on a single machine (and thus share an IP address). Currently, the whole system communicates on a single UDP port.
This works fine between instances of the program running on separate machines, as these all have unique IP addresses, and thus unique socket addresses. The problem is running multiple instances on a single machine. In this case, only the first instance of the program manages to bind its socket, and the others fail since the port is already in use.
Is there a way to bind multiple datagram sockets to a single port? I realize this is not normally advisable, but since I have unique application layer addresses that I can use to resolve the ambiguity, it would be helpful in this case. Essentially, I want to be able to do the following:
Bind all instances of the program on a single machine to the same common protocol port
When a message is received, each instance will use recv with the MSG_PEEK flag set to determine if the message's application layer address matches the instance's internal address.
For the single instance on a given machine where the addresses match, a regular call to recv will remove the message from the input queue for processing by the appropriate instance.
Essentially, I wish to use UDP as a common communication medium with more specific addressing occurring at the application layer.
Is there a standard way of doing this in GNU C? I realize that I could write a top level governing program to listen to all messages on the socket and reroute them to the appropriate instance, but this seems unnecessarily complicated, and breaks the program operating identically with multiple instances across a network vs across a shared single IP. I also know I could use multiple ports, but this adds the need to assign each instance a separate free port and keep track of these across the entire network of instances.
Essentially, I wish to "Broadcast" a message to a group of instances sharing a single IP address and let them sort out who the message belongs to at the application layer.
Thoughts?
You can do such binding with setsockopt(SO_REUSEPORT), but I think it would not help. You will have several sockets, each with its own packet queue, and each packet will go in one queue only. MSG_PEEK will do no good.
A top-level instance that reroutes messages to the different consumers looks like the right solution.
You can't usefully bind multiple sockets to the same IP/port combination.
Use some message queue / message passing interface, and forget about UDP.
For example, see 0MQ (zeromq) http://www.zeromq.org/
If it's a client/server style app, the client side need not bind.
When the server responds to a client that hasn't bound, it responds to the datagram's source port, which the OS chooses when the client first sends (without bind).
The client then reads from the unbound port.
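A minimal sketch of that pattern, assuming IPv4 UDP on Linux. The names send_unbound and echo_once are invented for illustration. The client never calls bind(), and the server replies to whatever source address recvfrom() reports.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Client side: create a UDP socket, do NOT bind it, and send one datagram.
 * The kernel assigns an ephemeral source port on the first sendto().
 * Returns the socket so the caller can read the reply from it, or -1. */
int send_unbound(const struct sockaddr_in *server, const void *msg, size_t len)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    if (sendto(fd, msg, len, 0,
               (const struct sockaddr *)server, sizeof *server) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Server side: receive one datagram and echo it back to its source.
 * Returns the sender's port in host byte order, or -1 on error. */
int echo_once(int server_fd)
{
    char buf[512];
    struct sockaddr_in src;
    socklen_t src_len = sizeof src;

    ssize_t n = recvfrom(server_fd, buf, sizeof buf, 0,
                         (struct sockaddr *)&src, &src_len);
    if (n < 0)
        return -1;
    if (sendto(server_fd, buf, (size_t)n, 0,
               (struct sockaddr *)&src, src_len) < 0)
        return -1;
    return ntohs(src.sin_port);
}
```

Because every client gets its own ephemeral port, several clients on one machine can talk to the same server port without colliding.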
I don't clearly understand the difference between using a TCP socket, with the client connecting to a server at 127.0.0.1, and other IPC mechanisms such as message queues. Since both are used for communication within the same host, why would anyone choose sockets over message queues, given that sockets incur more overhead than queues?
The one difference I can see:
With sockets we can inspect the traffic in Wireshark; with queues there is no such way.
The point of the loopback interface / address is not that you write programs to use it specifically.
The point is that it lets you talk to network services running on the local computer in the same way that you would talk to network services running on a remote host. For instance, if I'm developing a website, I can start up a test instance of its server on my local computer and then point my browser at http://127.0.0.1/ and there it is. I don't have to modify the code of my browser to talk over AF_UNIX sockets or whatever first. Similarly, if I am writing an application that needs a database, I might start out with the database running on the same computer as the application, talking to it over loopback, but then later when the database gets bigger I can move it to a dedicated host and I don't have to change anything other than the connection configuration.
You are absolutely correct that local IPC has lower overhead, and should be used when the two processes that need to communicate will always be on the same machine.
TCP and local IPC are both approaches we use for inter-process communication in a distributed architecture. If the processes run on the same machine, we would normally go for a message queue rather than TCP. But if one application runs on one box and another application runs on a different box, we have to go for TCP for inter-process communication. Even web services use TCP internally to communicate with a remote application.
We may still want TCP-based communication between two processes on the same machine when synchronous communication is a must. For example, if you send a request for a client's account information and wait for the response, you need this approach. But if you only need to send client information to a server to store in a table, and you don't need an answer telling you whether the record was stored successfully, you can use a queue and simply drop the message there.
Firstly, please excuse my probable butchering of the technical terminology. I've been thrown into socket IO with little formal education and I know that I am bungling words left and right.
I'm trying to build a client and server in C that enables multiple clients to connect to one another. The general procedure goes something like this:
1) Server has one port that is constantly listening and accepting connections
2) A client connects on that port
3) Server creates a new socket (same address, different port number), tells the client to connect to that socket, and closes the connection with the client.
4) The client connects to the designated socket and provides the server with a channel it would like to be on
5) Server places that socket on the designated channel
6) Repeat steps 2 through 5 for each client that connects to the server
/* all of the above has been coded already */
7) Once a channel has 2 or more members, I'd like to have each member port be able to broadcast to all other ports in the same channel (and thus the clients communicate with each other)
In this situation, all involved sockets on the same channel have the same address and DIFFERENT port numbers. Everything I've read and researched about broadcasting and multicasting revolves around each communicator having the same port number and different addresses.
Is there a way to do the communication that I'm hoping to do, in C?
I would think you want to use the listen() and accept() functions for TCP. You can do what you describe and have clients talk to each other, but all traffic will run through the server as a hub.
If you want all clients to be able to talk to every other client, you have a few options:
Server is the hub for all data and passes it between clients for you
Clients maintain direct connections to the other clients and pass data to each other in order to facilitate the hub. This means lots of data copying.
Broadcast or multicast (UDP). This is only possible over a local network, as internet routers will block multicast and broadcast traffic.
I would probably go with #1.
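Option #1 might look roughly like this on the server. It is only a sketch, assuming the server already keeps an array of connected socket descriptors per channel; relay_to_channel and its parameters are hypothetical names.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Forward `len` bytes to every descriptor in `members` except `sender`.
 * Returns how many members received the complete message. */
size_t relay_to_channel(const int *members, size_t count, int sender,
                        const void *buf, size_t len)
{
    size_t delivered = 0;

    for (size_t i = 0; i < count; i++) {
        if (members[i] == sender)
            continue; /* do not echo the message back to its author */

        /* MSG_NOSIGNAL: a dead peer yields an error instead of SIGPIPE. */
        ssize_t n = send(members[i], buf, len, MSG_NOSIGNAL);
        if (n == (ssize_t)len)
            delivered++;
    }
    return delivered;
}
```

A real server would also need to handle short writes and remove members whose send fails; both are left out here to keep the sketch short.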
Remember that each client has its own IP address, so for a client to communicate with another client without involving the server, it would need to open a new connection with the other client, send data and then close the connection. While doable, I do not think this idea would scale very well.
I do agree with Syplex that having the server act as a relay hub is probably the best, and certainly has the potential to scale well. So the data-flow would be something like this:
a client receives a message that is to be retransmitted to all the other clients.
this message is passed to all other instances of your server process
each of these instances of your server process sends out the message.
The issue becomes how you are implementing your server, and there are two models that fit what you describe:
(1) you are using a multi-threaded server, in which each new connection causes a thread to be spawned to handle the communication between the client and the server.
(2) you are using a forking server, in which the server forks a new process to communicate with the client.
In case (1) you would be interested in intra-process communication (message queue for example) while in case (2) you would be interested in inter-process communication (named pipes or shared memory for example).
At this point there are too many variables to give a concise answer. I hope this helps get you started and at least gives you somewhere to start looking.
I've got a little program that needs to communicate between two computers on the same LAN. I'm rather new to networking, but from what I've read it sounds like I want UDP multicasting so that the two computers can discover each other, after which I can establish a TCP connection for actual data communication. I've found this little example for UDP multicasting which I can follow. However, I'm wondering about the multicast group (HELLO_GROUP in that example, which is 225.0.0.37).
How can I know the group I should use? This program will be running on various networks, so I can't hard code one (as far as I know). Do I get the group from the router, and if so, how do I do that?
You can choose any multicast address (224.0.0.0 to 239.255.255.255) that isn't listed as reserved by IANA.
It's possible (if unlikely) that another program will also be using the same address. You can minimise the chances of this causing any confusion by making the announcement messages your program sends out suitably specific, e.g.
CORNSTALKS-DISCOVERY
HOST: {address:port}
[newline]
This would inform your recipients of the address to use for their TCP connection but should find its first line rejected by any other recipients.
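Building and checking such an announcement could be sketched like this. The exact wire format (a "CORNSTALKS-DISCOVERY" first line, then "HOST: address:port", then a blank line) is taken from the example above; the function names are invented, and the parser assumes an IPv4 dotted-quad or hostname, since IPv6 literals contain colons.

```c
#include <stdio.h>
#include <string.h>

#define DISCOVERY_MAGIC "CORNSTALKS-DISCOVERY\n"

/* Write an announcement into `out`. Returns its length, or -1 if it
 * does not fit in `cap` bytes. */
int build_announcement(char *out, size_t cap, const char *addr, unsigned port)
{
    int n = snprintf(out, cap, DISCOVERY_MAGIC "HOST: %s:%u\n\n", addr, port);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/* Parse a NUL-terminated announcement. Returns 0 and fills `addr`/`port`
 * on success, or -1 if the first line is not ours or the rest is malformed. */
int parse_announcement(const char *msg, char *addr, size_t addr_cap,
                       unsigned *port)
{
    size_t magic_len = strlen(DISCOVERY_MAGIC);
    char host[64];

    /* Reject traffic from other programs sharing the multicast group. */
    if (strncmp(msg, DISCOVERY_MAGIC, magic_len) != 0)
        return -1;
    if (sscanf(msg + magic_len, "HOST: %63[^:]:%u", host, port) != 2)
        return -1;
    if (strlen(host) >= addr_cap)
        return -1;
    strcpy(addr, host);
    return 0;
}
```

A receiver would copy each datagram into a buffer, add a terminating NUL, and call parse_announcement before attempting the TCP connection.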
You have misunderstood.
What you are talking about is broadcasting. A broadcast UDP datagram is sent to every computer in the subnet. (Technically you send a datagram to the address 255.255.255.255.)
UDP broadcasts work inside a specific subnet, but don't cross the subnet boundaries. That is, most routers are configured not to route broadcast datagrams (to reduce spamming).
OTOH multicast is something completely different. The purpose of multicast is to avoid using TCP (or any other unicast transport) for data transmission. It's good when you need to send something to many recipients "at once". The machines agree in advance on a specific multicast address (like 225.0.0.37 in your example) and "join" this multicast group. Within a specific subnet everything works much like broadcast, but in contrast to broadcast, multicast may also cross subnet boundaries. This is because when machines join a multicast group the appropriate routers are notified, and they are able to route multicast datagrams appropriately.
EDIT:
Conclusion (for clarification).
In order to use a multicast one has to pick a multicast address. This is like choosing a port for the application.
The main purpose of multicast is to deliver content (transmit data) to a number of recipients. It's more efficient than unicast in this case.
A "network discovery" is usually done via broadcast. A multicast can theoretically be used for this as well, but this is like killing a fly with a cannon (because routers should also track the lifetime of the multicast session).
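For the broadcast-based discovery mentioned above, the sending side needs the SO_BROADCAST option before the kernel will accept a broadcast destination. A minimal sketch, assuming IPv4 on Linux; the function names and the idea of a fixed discovery port are illustrative assumptions.

```c
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a UDP socket that is allowed to send broadcast datagrams.
 * Returns the descriptor, or -1 on error. */
int make_broadcast_socket(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    int on = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_BROADCAST, &on, sizeof on) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Send one discovery probe to 255.255.255.255 on `port` (hypothetical
 * discovery port chosen by the application). Returns bytes sent or -1. */
ssize_t send_probe(int fd, uint16_t port, const void *msg, size_t len)
{
    struct sockaddr_in dst;
    memset(&dst, 0, sizeof dst);
    dst.sin_family = AF_INET;
    dst.sin_addr.s_addr = htonl(INADDR_BROADCAST);
    dst.sin_port = htons(port);
    return sendto(fd, msg, len, 0, (struct sockaddr *)&dst, sizeof dst);
}
```

Listeners simply bind a UDP socket to the same port and answer the probe with a unicast reply.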
I would suggest you don't use multicast directly.
Rather, use zero-configuration networking. This, in its mDNS/DNS-SD incarnation, is available through Apple's Bonjour library on OS X and Windows, and Avahi on unices (and possibly on OS X and Windows too, not sure).
With DNS-SD, you define a name for your service, then use the library to advertise its availability on a given host, or to browse for hosts where it's available. This is how Macs discover printers, file shares, etc - exactly your use case, I believe. It's a simple but very effective technology. And it's an open standard with a good open source implementation, so it's not some proprietary Apple scarytime.
I'm new to the Linux environment and any help/feedback would be appreciated. I'm trying to develop a client-server (multicast) program, so I would like to test one client sending information to different servers (a one-to-many relationship). Thus, I would like to simulate different servers with different IP addresses on one Linux computer.
Did you try using different ports instead? I didn't try it myself, but perhaps that can help you in the meantime.
If you're really multicasting, you don't need to worry about physical host-specific IPs; all you should need to do is make sure all the programs (clients and servers) are using the same multicast group addresses. Then they should all see each other's traffic automatically.
There's nothing stopping you from running multiple clients on the same machine that also runs the server, in this case.
It sounds like you want to test your code with different IPs. You can create IP aliases on your interface and simulate multiple IPs on one computer.
For example, if eth0 is your active interface with IP, say, 192.168.5.11, you can assign another IP to eth0:0 (an alias of eth0) as below.
ifconfig eth0:0 192.168.5.12 netmask 255.255.255.0 up
ifconfig eth0:1 192.168.5.13 netmask 255.255.255.0 up
run your server on one of the IP's and distribute clients to all your aliases
Use either of the following when you do not have sufficient hardware:
Multicast loop which has the IP stack redirect outbound packets to local receivers.
Virtual machines.
Be aware that semantics of the socket option for #1 change depending on the operating system; for #2 only some virtual machines support multicast, refer to the vendor for details.
http://msdn.microsoft.com/en-us/library/windows/desktop/ms739161(v=vs.85).aspx
Ultimately though you must test with different machines due to specific artifacts of how hosts manage multicast groups. You can for instance create send-only membership which will block every other application on the host. Also consider that an internet, lower case 'I', will introduce further artifacts regarding group joining and propagation delays and drops that your application may need to be aware of.
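The multicast loop option from #1 can be toggled as in the sketch below. It assumes Linux and IPv4, where IP_MULTICAST_LOOP is set on the sending socket; as noted above, other operating systems interpret the option differently. The helper name is made up.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Enable (nonzero) or disable (0) local loopback of IPv4 multicast
 * datagrams sent through `fd`. Returns 0 on success, -1 on error. */
int set_multicast_loop(int fd, int enabled)
{
    int flag = enabled ? 1 : 0;
    return setsockopt(fd, IPPROTO_IP, IP_MULTICAST_LOOP, &flag, sizeof flag);
}
```

With the option enabled on the sender, receivers on the same host that joined the group should see the sender's datagrams, which is what makes single-machine testing possible.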
You can create multiple IPs for the same machine with IP aliases, as mentioned above.
But to run multiple servers on one PC, each server needs a different port if you want to simulate how all the servers behave on a real network.
I mean that with one port, multicast traffic always goes to that port; some process on the PC receives the packet and has to serve it to all the servers on the PC. That means there is only one packet, and all the servers receive it through local manipulation.
A real simulation would have multiple servers on one PC, all receiving multicast traffic from the network rather than from a local process.
My solution: keep the number of servers equal to the number of ports on the PC. The client sends the multicast traffic to all ports simultaneously, and each server on the PC receives multicast packets on its own port from the network.
Please correct me if my understanding is wrong.
How, in C, can I detect whether a program is connecting to itself.
For example, I've set up a listener on port 1234, then I set up another socket to connect to an arbitrary address on port 1234. I want to detect whether I'm connecting to my own program. Is there any way?
Thanks,
Dave
Linux provides tools that I think can solve this problem. If the connection is to the same machine, you can run
fuser -n tcp <port-number>
and get back a list of processes listening to that port. You can then look in /proc and find out if there is a process with a pid not your own which is running the same binary you are. A bit of chewing gum and baling wire will help keep the whole contraption together.
I don't think you can easily ask questions about a process on another machine.
One of the parameters to the accept() function is a pointer to a struct sockaddr.
When you call accept() on the server side it will fill in the address of the remote machine connecting to your server socket.
If that address matches the address of any of the interfaces on that machine then that indicates that the client is on the same machine as the server.
You could send a sequence of magic packets upon connection, which is calculated in a deterministic way. The trick is how to do this in a way that sender and receiver will always calculate the same packet contents if they are from the same instance of the program. A little more information on what your program is would be helpful here, but most likely you can do some sort of hash on a bunch of program state and come up with something fairly unique to that instance of the program.
I assume you mean not just the same program, but the same instance of it running on the same machine.
Do you care about the case where you're connecting back to yourself via the network (perhaps you have two network cards, or a port-forwarding router, or some unusual routing out on the internet somewhere)?
If not, you could check whether the arbitrary address resolves to loopback (127.0.0.1), or to any of the other IP addresses you know are yours. I'm not a networking expert, so I may have missed some possibilities.
If you do care about that "indirect loopback" case, do some handshaking including a randomly-generated number which the two endpoints share via memory. I don't know whether there are security concerns in your situation: if so bear in mind that this is almost certainly subject to MITM unless you also secure the connection.
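A sketch of such a handshake, assuming Linux with glibc 2.25 or later for getrandom(). All names are hypothetical. The random number lives in process memory, the connecting side sends it first, and the accepting side compares it with its own copy. As noted above, this is not secure against an attacker who can observe or alter the connection.

```c
#include <stdint.h>
#include <sys/random.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Per-process random token; 0 means "not generated yet". */
static uint64_t instance_nonce;

/* Return this process's nonce, generating it on first use (0 on failure). */
uint64_t get_instance_nonce(void)
{
    while (instance_nonce == 0) {
        uint64_t v;
        if (getrandom(&v, sizeof v, 0) != (ssize_t)sizeof v)
            return 0;
        instance_nonce = v; /* loop again in the unlikely case v == 0 */
    }
    return instance_nonce;
}

/* Connecting side: send our nonce as the first bytes on the connection. */
int send_nonce(int fd)
{
    uint64_t n = get_instance_nonce();
    if (n == 0)
        return -1;
    return send(fd, &n, sizeof n, MSG_NOSIGNAL) == (ssize_t)sizeof n ? 0 : -1;
}

/* Accepting side: read the peer's nonce. Returns 1 if it equals ours
 * (we connected to ourselves), 0 if it differs, -1 on error. */
int peer_is_self(int fd)
{
    uint64_t theirs;
    uint64_t ours = get_instance_nonce();

    if (ours == 0)
        return -1;
    if (recv(fd, &theirs, sizeof theirs, MSG_WAITALL) != (ssize_t)sizeof theirs)
        return -1;
    return theirs == ours;
}
```

Byte order does not matter here because a match is only meaningful when both ends are the same process.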