Detect whether a socket program is connecting to itself - c

How, in C, can I detect whether a program is connecting to itself.
For example, I've set up a listener on port 1234, then I set up another socket to connect to an arbitrary address on port 1234. I want to detect whether I'm connecting to my own program. Is there any way?
Thanks,
Dave

Linux provides tools that I think can solve this problem. If the connection is to the same machine, you can run
fuser -n tcp <port-number>
and get back a list of processes listening to that port. You can then look in /proc and found out if there is a process with a pid not your own which is running the same binary you are. A bit of chewing gum and baling wire will help keep the whole contraption together.
I don't think you can easily ask questions about a process on another machine.

One of the parameters to the accept() function is a pointer to a struct sockaddr.
When you call accept() on the server side it will fill in the address of the remote machine connecting to your server socket.
If that address matches the address of any of the interfaces on that machine then that indicates that the client is on the same machine as the server.

You could send a sequence of magic packets upon connection, which is calculated in a deterministic way. The trick is how to do this in a way that sender and receiver will always calculate the same packet contents if they are from the same instance of the program. A little more information on what your program is would be helpful here, but most likely you can do some sort of hash on a bunch of program state and come up with something fairly unique to that instance of the program.

I assume you mean not just the same program, but the same instance of it running on the same machine.
Do you care about the case where you're connecting back to yourself via the network (perhaps you have two network cards, or a port-forwarding router, or some unusual routing out on the internet somewhere)?
If not, you could check whether the arbitrary address resolves to loopback (127.0.0.1), or any of the other IP addresses you know are you. I'm not a networking expert, so I may have missed some possibilities.
If you do care about that "indirect loopback" case, do some handshaking including a randomly-generated number which the two endpoints share via memory. I don't know whether there are security concerns in your situation: if so bear in mind that this is almost certainly subject to MITM unless you also secure the connection.

Related

What port(s) can I use for a messenger application

Please forgive me for being naive on the subject, however, I do not understand ports and how they work in the slightest. I am trying to make a program where two computers could communicate given their IP addresses and I am using TCP protocol. I don't, however, know what port(s) I would be able to use for this application, because when I look up TCP ports I get a list of ports each with their own function. Any help is useful.
P.S. I am using C to create my program
The short answer is you can choose any port you like - although the safe range is generally considered to be between 1024 and 65535. The only problem that you will encounter is when some other program installed on the device is already listening on that port. Unfortunately, there is no port that is guaranteed to be available to listen on.
One possible solution to this is to have a primary listening port and a fallback secondary port. You can then first try to connect on the primary port and, if a satisfactory response is not received, try to connect on the secondary port. However, even this is not infallible, as there is a chance that the secondary post could also be in use.
The easiest approach is to try to create your listener on the port that you have chosen, and if the port fails to create, let the user know that some other application is preventing execution of your application.

Assign a new socket to client after receving request from 8080 in server code

C Language TCP server/client.. I want to assign a new socket for a particular client which requested my server from 8080 lets say the new socket is 8081 to get further request, and want to free the previous socket(8080) so that the other clients will request my server from 8080. is there any way of doing it in C language. (OS Ubuntu) Thanks
Your problem statement is incorrect. You can't do this even if you wanted to. The way that TCP sockets work is that accept() gives you a new socket for the incoming client connection, on the same port you are listening to. That's all you need and it's all you can get. You can't 'allocate a new socket' to the client on a new port without engaging in another TCP handshake with him, which would be nothing but a complete waste of time when you already have a connection to him. This does not preclude another connection bring accepted while this one is open. You need to read a TCP Sockets networking tutorial.
Mat and EJP have said the pertinent things above, but I thought it might help others to describe the situation more verbosely.
A TCP/IP connection is identified by a four-tuple: target IP address, target TCP port number, source IP address, and source TCP port number. The kernel will keep track of established connections based on these four things. A single server port (and IP address) can be connected to thousands of clients at the same time, limited in practice only by the resources available.
When you have a listening TCP socket, it is bound to some IP address (or wildcard address) and TCP port. Such a socket does not receive data, only new connections. When accept() is called, the server notes the new four-tuple of the connection, and hands off the file descriptor that represents that connection (as the accept() return value). The original socket is free to accept new connections. Heck, you can even have more than one thread accepting new connections if you want to, although establishing new connections in Linux is so fast you shouldn't bother; it's just too insignificant to worry about.
If establishing the connection at application level is resource-intensive -- this is true for for example encrypted connections, where agreeing to an encryption scheme and preparing the data structures needed takes typically several orders of magnitude more CPU resources than a simple TCP connection --, then it is natural to wish to avoid that overhead. Let's assume this is the point in OP's question: to avoid unnecessary application-level connection establishment when a recent client needs another connection.
The preferred solution is connection multiplexing. Simply put, the application-level protocol is designed to allow multiple data streams via a single TCP connection.
The OP noted that it would be necessary/preferable to keep the existing application protocol intact, i.e. that the optimization should be completely on the server side, transparent to the clients.
This turns the recommended solution to a completely new direction. We should not talk about application protocols, but how to efficiently implement the existing one.
Before we get to that, let's take a small detour.
Technically, it is possible to use the kernel packet filtering facilities to modify incoming packets to use a different port based on the source IP address, redirecting requests from specific IP addresses to separate ports, and making those separate ports otherwise inaccessible. Technically possible, but quite complex to implement, and with very questionable benefits.
So, let's ignore the direction OP assumed would bring the desired benefits, and look at the alternatives. Or, actually, the common approach used.
Structurally, your application has
- A piece of code accepting new connections
- A piece of code establishing the application-level resources needed for that connection
- A piece of code doing the communication with the client (serving the response to the client, per the client's request)
There is no reason for these three pieces to be consecutive, or even part of the same code flow. Use data structures to your advantage.
Instead of treating new incoming connections (accept()ed) as equal, they can be simply thrown into separate pools based on their source IP addresses. (Or, if you are up to it, have a data structure which clusters source IP addresses together, but otherwise keeps them in the order they were received.)
Whenever a worker completes a request by a client, it checks if that same client has new incoming connections. If yes, it can avoid most if not all of the application-level connection establishment by checking that the new connection matches the application-level parameters of the old one. (You see, it is possible that even if the source IP address is the same, it could be a completely different client, for example if the clients are under the same VPN or NATted subnet.)
There are quite a few warts to take care of, for example how to keep the priorities, and avoid starving new IP addresses if known clients try to hog the service.
For protocols like HTTP, where the client sends the request information as soon as the server accepts the connection, there is an even better pattern to apply: instead of connection pools, have request pools. A single thread or a thread pool can receive the requests (they may span multiple packets in most protocols), without acting on them; only detecting when the request itself is complete. (A careful server will limit the number of pending requests, and the number of incomplete request, to avoid vulnerability to DOS.)
When the requests are complete, they are grouped, so that the same "worker" who serves one request, can serve another similar request with minimal overhead. Again, some careful thought is needed to avoid the situation where a prolific client hogs the server resources by sending a lot of requests, but it's nothing some careful thought and testing won't resolve.
One question remains:
Do you need to do this?
I'd wager you do not. Apache, which is one of the best HTTP servers, does not do any of the above. The performance benefits are not considered worth the extra code complexity. Could you write a new HTTP server (or a server for whatever protocol you're working with), and use a scheme similar to above, to make sure you can use your hardware as efficiently as possible? Sure. You don't even need to be a wizard, just do some research and careful planning, and avoid getting caught in minute details, keeping the big picture in mind at all times.
I firmly believe that code maintainability and security is more important than efficiency, especially when writing an initial implementation. The information gained from the first implementation has thus far always changed how I perceive the actual "problem"; similar to opening new eyes. It has always been worth it to create a robust, easy to develop and maintain, but not necessarily terribly efficient implementation, for the first generation. If there is someone willing to support the development of the next generation, you not only have the first generation implementation to compare (and verify and debug) against, but also all the practical knowledge gained.
That is also the reason old hands warn so often against premature optimization. In short, you end up optimizing resource waste and personal pain, not the implementation you're developing.
If I may, I'd recommend the OP back up a few steps, and actually describe what they intend to implement, what the observed problem with the implementation is, and suggestions on how to fix and avoid the problem. The current question is like asking how to better freeze a banana, as it keeps shattering when you hammer nails with it.

Create Random Usable Port

I'm writing a program in C in which the server listens on a well known port, waits for client to connect, and then creates a random port for the client to use and send this port number back to the client. My main difficulty is how to create a "random" port. Should I just be using srand and create a random 4 digit port is the usable range? Or is there a better way to do this? I know that if I use port 0 a port will be chosen for me but the problem here is the fact that I don't think i can "get/see" the actual value of the port so that I can send this port number back to the client.
Thanks...
Binding port 0 is the solution. It gives you an arbitrary port, not a random port, but this is what many applications do (e.g. FTP etc).
After binding, you can use getsockname to figure out which port you got.
What you do is bind() with port set to 0. The system will assign one. Then use getsockname() to discover what port the system assigned. Send that back to the client. That way there is no race condition and you follow any system rules for port assignment.
A random 4-digit port checked to make sure it's not in use is OK for that purpose..
Technically speaking, it sounds like you're trying to implement this for added security (some kind of primitive port knocking routine)? It might be worth mentioning that this approach is generally not considered too secure. It also imposes some artificial constraints on how many clients you can serve at a time and actually adds unnecessary load on the server. Why not just listen on the single well-known port for all clients?
I'm guessing TCP considering your description of listening and automatic port assignment by the OS. In this case, you don't need to worry about it. Once you accept the TCP connection, the OS on both sides takes care of all that you're trying to do and you're left with a working connection, ready for use. Unless you have a particular reason for doing any of this yourself, it's already done for you.

Neighbor discovery C

I need to discover all network neighbors in Linux(they are running Linux too) and I need to get theirs IP addresses(3rd layer). Any ideas how to do that?
Btw, I need to do that in C, not in shell
Many thanks in advance!
What you should do is, have the neighbours run a daemon which responds (with a unicast response to the sender) to UDP multicasts.
Then send a UDP multicast with a TTL of 1 (so it will not be routed) and listen to see who responds. You will only receive responses from the neighbours which are running the agent.
Another possibility is to use an existing protocol which already does this, for example, mDNS.
There is no guaranteed way to do this if the machines in question aren't co-operating.
The best you can do is to scan likely addresses and probe each one to see if you can get a response - that probe could be anything from a simple ICMP echo request (a ping) up to a sophisticated malformed packet that attempts to elicit a response from the remote host.
The level of sophistication required, and whether it will work at all, depends entirely on how heavily firewalled etc the host in question is.
As a commenter has already observed, there are entire programs like nmap dedicated to attempting to discover this information, which gives some idea of how non-trivial this can be.
At the other extreme, if the hosts are co-operating, then a simple broadcast ICMP echo request might be enough.
If your segment uses reasonably decent switch, you can discover the link-layer neighbours by inspecting the forwarding database of one of the switches. You should be able to obtain this fairly automatically via SNMP, check your switch's documentation.
Once you have a list of link neighbours, you can try and find out their IP addresses, but remember that they may have many or none at all. For this you'd need some sort of reverse-ARP. Perhaps your router maintains a list of MAC-to-IP associations and you can query it (again SNMP would be the most convenient solution).

Unix sockets: when to use bind() function?

I've not a clear idea about when I have to use the bind() function.
I guess it should be used whenever I need to receive data (i.e. recv() or recvfrom() functions) whether I'm using TCP or UDP, but somebody told me this is not the case.
Can anyone clarify a bit?
EDIT I've read the answers but actually I'm not so clear. Let's take an example where I have an UDP client which sends the data to the server and then has to get a response. I have to use bind here, right?
This answer is a little bit long-winded, but I think it will help.
When we do computer networking, we're really just doing inter-process communication. Lets say on your own computer you had two programs that wanted to talk to each other. You might use pipe to send the data from one program to another. When you say ls | grep pdf you are taking the output of ls and feeding it into grep. In this way, you have unidirectional communication between the two separate programs ls and grep.
When you do this, someone needs to keep track of the Process ID (PID) of each process. That PID is a unique identifier for each process and it helps us track who the "source" and "destination" processes are for the data we want to transfer.
So now lets say you have data from a webserver than you want to transfer to a browser. Well, this is the same scenario as above - interprocess communication between two programs, the "server" and "browser".
Except this time those two programs are on different computers. The mechanism for interprocess communication across two computers are called "sockets".
So great. You take some data, lob it over the wire, and the other computer receives it. Except that computer doesn't know what to do with that data. Remember we said we need a PID to know which processes are communicating? The same is true in networking. When your computer receives HTML data, how does it know to send it to "firefox" rather than "pidgin"?
Well when you transmit network data, you specify that it goes on a specific "port". Port 80 is usually used for web, port 25 for telnet, port 443 for HTTPS, etc.
And that "port" is bound to a specific process ID on the machine. This is why we have ports. This is why we use bind(). In order to tell the sender which process should receive our data.
This should explain the answers people have posted. If you are a sender, you don't care what the outgoing port is, so you usually don't use bind() to specify that port. If you are a receiver, well, everyone else has to know where to look for you. So you bind() your program to port 80 and then tell everyone to make sure to transmit data there.
To answer your hw question, yes, your probably want to use bind() for your server. But the clients don't need to use bind() - they just need to make sure they transmit data to whatever port you've chosen.
After reading your updated question. I would suggest not to use bind() function while making client calls. The function is used, while writing your own server, to bind the socket (created after making a call to socket()) to a physical address.
For further help look at this tutorial
bind() is useful when you are writing a server which awaits data from clients by "listening" to a known port. With bind() you are able to set the port on which you will listen() with the same socket.
If you are writing the client, it is not needed for you to call bind() -- you can simply call recv() to obtain the data sent from the server. Your local port will be set to an "ephemeral" value when the TCP connection is established.
You use bind whenever you want to bind to a local address. You mostly use this for opening a listening socket on a specific address/port, but it can also be used to fix the address/port of an outgoing TCP connection.
you need to call bind() only in your server. It's needed especially for binding a #port to your socket.

Resources