How do I get parameters from bind() in C?

I'm writing a program in C (on 32-bit Windows) that listens on a specific port. (using this guide)
The client connects like this: "http://127.0.0.1:port/?param1=a&param2=b..."
As the server, I want to get all of the parameters the client entered.
How can I do that?

The bind() function does not receive the parameters, or for that matter anything the client specifies during your communication; it merely binds the socket to the port. Once the port is bound and a connection is eventually established, your application protocol (presumably HTTP, in your case) takes over: it decides what data each side will write() to and read() from the connection (or pass to other, higher-level functions).
With that in mind, your question has little to do with sockets themselves. Rather, it is about understanding the application protocol you're using. In HTTP, the query string arrives as part of the first line of the request the client sends. I would suggest you either Google how the HTTP protocol works, review one of the myriad open source HTTP libraries available, or for that matter just printf() the input you receive from your client.
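As a rough illustration of that last suggestion, here is a minimal sketch of pulling one parameter out of the request text once it has been read from the connected socket. The helper name get_query_param is made up for this example, it assumes the whole request line is already in a NUL-terminated buffer, and it does no percent-decoding, so treat it as a starting point rather than a real HTTP parser.

```c
#include <string.h>

/* Hypothetical helper (not part of any library): copies the value of query
 * parameter `name` from an HTTP request such as
 * "GET /?param1=a&param2=b HTTP/1.1" into `out`.
 * Returns 1 if the parameter was found, 0 otherwise.
 * Assumption: no percent-decoding is needed. */
int get_query_param(const char *request, const char *name,
                    char *out, size_t outlen)
{
    size_t namelen = strlen(name);
    size_t linelen = strcspn(request, "\r\n");   /* look at the first line only */
    const char *q = memchr(request, '?', linelen);
    const char *end;

    if (q == NULL || outlen == 0)
        return 0;
    q++;                                         /* skip the '?' itself */
    end = q + strcspn(q, " \r\n");               /* query ends before " HTTP/1.1" */

    while (q < end) {
        const char *amp = memchr(q, '&', (size_t)(end - q));
        const char *pair_end = amp ? amp : end;
        const char *eq = memchr(q, '=', (size_t)(pair_end - q));

        if (eq != NULL && (size_t)(eq - q) == namelen &&
            strncmp(q, name, namelen) == 0) {
            size_t vallen = (size_t)(pair_end - eq - 1);
            if (vallen >= outlen)
                vallen = outlen - 1;             /* truncate to fit the buffer */
            memcpy(out, eq + 1, vallen);
            out[vallen] = '\0';
            return 1;
        }
        if (amp == NULL)
            break;                               /* that was the last pair */
        q = amp + 1;
    }
    return 0;
}
```

The server would call this on the buffer filled by recv(), for example get_query_param(buf, "param1", value, sizeof value).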

Related

Retrieving local IP before a connection is made

I am trying to determine which local IP would be used on a socket for a TCP connection towards a given host on Linux, using C.
Let me make an example. I could connect my socket and use getsockname() on the file descriptor to get the local ip (and local TCP port); but can I do this without opening the connection?
I could read the routing table and make a decision based on that - but the networking subsystem must have that algorithm already, for when the connection is actually open. In short, I'd like to know if there is an API to access the routing algorithms without having to parse the rules myself or opening an actual connection. The solution - if any - will probably be Linux only but that's OK.
EDIT: someone on IRC suggested I create a UDP socket and use connect() on it. No network traffic is generated at that point, but I should be able to use getsockname() on it.
The only solution I know to this is what traceroute does. Send a packet with a TTL of 1 and see which interface the ICMP return comes in on. As I recall there are lots of incompatibilities between different hosts, so there are probably several different types of messages you might need to send/receive to get the data you need.
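The UDP suggestion from the EDIT can be sketched as follows. This is only an illustration for IPv4: the helper name local_ip_for and the destination port 9 are arbitrary choices made for the example, and since connect() on a datagram socket only records the peer, no packet is sent.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: writes into `out` the local IPv4 address the kernel
 * would select for traffic towards `dest_ip`.
 * Returns 0 on success, -1 on error. */
int local_ip_for(const char *dest_ip, char *out, socklen_t outlen)
{
    struct sockaddr_in dest, local;
    socklen_t len = sizeof local;
    int fd, rc = -1;

    memset(&dest, 0, sizeof dest);
    dest.sin_family = AF_INET;
    dest.sin_port = htons(9);            /* arbitrary port; nothing is sent */
    if (inet_pton(AF_INET, dest_ip, &dest.sin_addr) != 1)
        return -1;

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    /* connect() on UDP performs the route lookup and fixes the local address. */
    if (connect(fd, (struct sockaddr *)&dest, sizeof dest) == 0 &&
        getsockname(fd, (struct sockaddr *)&local, &len) == 0 &&
        inet_ntop(AF_INET, &local.sin_addr, out, outlen) != NULL)
        rc = 0;

    close(fd);
    return rc;
}
```

Called with a buffer of INET_ADDRSTRLEN bytes, it fills in the dotted-quad source address; for a loopback destination that is 127.0.0.1.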

C check what service is running on an open port

I'm writing a port scanner in C and I want to detect what service is running on an open port, and its version. I've already written the scanner code, but now I have no idea how to detect the running service.
What can I do?
If you are determined to do it in your own code, you can connect to the port and see if you get any data on it; if nothing arrives, send a few bytes and check again.
Then match that against the expected response.
To get an idea of what you are looking for, you can connect manually to the port with telnet and poke at it. In many cases (a web server is an easy example) you must send some correctly formatted data in order to get a usable response.
nmap has done all this and much more (e.g. extensive checks such as looking for byte order and timing of arp traffic).
UPDATE: several people have mentioned well known ports, but that won't help you discover standard services running on nonstandard ports, such as ssh or http servers running on custom ports.
If the server sends something first, use that to identify the protocol.
If not, send something according to some protocol, such as HTTP, and see what the server sends back (a valid response or an error). You may need to make several attempts with different protocols, and a good order is important to minimize the connection count.
Some protocols may be very hard to identify, and it is easy to make a custom server with a unique protocol you don't know about, or even to hide the real server behind a simple fake server speaking another protocol such as HTTP.
If you just want to know what the port is usually used for, check the "well known ports" and official reserved ports.
Also check the nmap source code.
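The "match against the expected response" step could look roughly like the sketch below. The function name guess_service and the short prefix table are assumptions made for illustration, and the first bytes read from the port are taken to be in a NUL-terminated buffer already. A real scanner such as nmap uses a far larger signature set.

```c
#include <string.h>

/* Hypothetical helper: guesses the service from the first bytes a server
 * sent, either unprompted (a banner) or in reply to a probe.
 * The prefixes are the standard greetings of each protocol. */
const char *guess_service(const char *banner)
{
    if (strncmp(banner, "SSH-", 4) == 0)
        return "ssh";            /* e.g. "SSH-2.0-..." identification string */
    if (strncmp(banner, "HTTP/", 5) == 0)
        return "http";           /* status line of an HTTP response */
    if (strncmp(banner, "220", 3) == 0)
        return "smtp-or-ftp";    /* both greet with reply code 220 */
    if (strncmp(banner, "+OK", 3) == 0)
        return "pop3";
    if (strncmp(banner, "* OK", 4) == 0)
        return "imap";
    return "unknown";            /* silent server or custom protocol */
}
```

The version, when the server reveals it, is usually in the rest of the same line, so printing the banner next to the guess is often enough.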

UDP C Sockets: Multiple Sockets Sharing Single Port

I'm writing a program in C on GNU/Linux that uses UDP to communicate messages between various instances of the program, either on a single machine, or across a network. Each instance of the program has its own unique internal application layer address that it uses to differentiate between instances that run on a single machine (and thus share an IP address). Currently, the whole system communicates on a single UDP port.
This works fine between instances of the program running on separate machines, as these all have unique IP addresses, and thus unique socket addresses. The problem is running multiple instances on a single machine. In this case, only the first instance of the program manages to bind its socket, and the others fail since the port is already in use.
Is there a way to bind multiple datagram sockets to a single port? I realize this is not normally advisable, but since I have unique application layer addresses that I can use to resolve the ambiguity, it would be helpful in this case. Essentially, I want to be able to do the following:
Bind all instances of the program on a single machine to the same common protocol port
When a message is received, each instance will use recv with the MSG_PEEK flag set to determine if the message's application layer address matches the instance's internal address.
For the single instance on a given machine where the addresses match, a regular call to recv will remove the message from the input queue for processing by the appropriate instance.
Essentially, I wish to use UDP as a common communication medium with more specific addressing occurring at the application layer.
Is there a standard way of doing this in GNU C? I realize that I could write a top level governing program to listen to all messages on the socket and reroute them to the appropriate instance, but this seems unnecessarily complicated, and breaks the program operating identically with multiple instances across a network vs across a shared single IP. I also know I could use multiple ports, but this adds the need to assign each instance a separate free port and keep track of these across the entire network of instances.
Essentially, I wish to "Broadcast" a message to a group of instances sharing a single IP address and let them sort out who the message belongs to at the application layer.
Thoughts?
You can do such binding with setsockopt(SO_REUSEPORT), but I think it would not help. You will have several sockets, each with its own packet queue, and each packet will go into one queue only. MSG_PEEK will do no good.
A top-level instance rerouting messages to the different consumers looks like the right solution.
You can't have multiple sockets bound to the same IP/port combination.
Use some message queue / message passing interface, and forget about UDP.
For example, see 0MQ (zeromq) http://www.zeromq.org/
If it's a client/server style app, the client side need not bind.
When the server responds to the client that hasn't bound it will respond to the source port which will be randomly chosen by the OS when the client sends (without bind).
The client then reads from the unbound port.

Unix sockets: when to use bind() function?

I've not a clear idea about when I have to use the bind() function.
I guess it should be used whenever I need to receive data (i.e. recv() or recvfrom() functions) whether I'm using TCP or UDP, but somebody told me this is not the case.
Can anyone clarify a bit?
EDIT: I've read the answers but I'm still not so clear. Let's take an example where I have a UDP client which sends data to the server and then has to get a response. I have to use bind here, right?
This answer is a little bit long-winded, but I think it will help.
When we do computer networking, we're really just doing inter-process communication. Let's say on your own computer you had two programs that wanted to talk to each other. You might use a pipe to send the data from one program to another. When you say ls | grep pdf you are taking the output of ls and feeding it into grep. In this way, you have unidirectional communication between the two separate programs ls and grep.
When you do this, someone needs to keep track of the Process ID (PID) of each process. That PID is a unique identifier for each process and it helps us track who the "source" and "destination" processes are for the data we want to transfer.
So now let's say you have data from a webserver that you want to transfer to a browser. Well, this is the same scenario as above: interprocess communication between two programs, the "server" and "browser".
Except this time those two programs are on different computers. The mechanism for interprocess communication across two computers is called "sockets".
So great. You take some data, lob it over the wire, and the other computer receives it. Except that computer doesn't know what to do with that data. Remember we said we need a PID to know which processes are communicating? The same is true in networking. When your computer receives HTML data, how does it know to send it to "firefox" rather than "pidgin"?
Well, when you transmit network data, you specify that it goes to a specific "port". Port 80 is usually used for web, port 25 for SMTP (mail), port 443 for HTTPS, etc.
And that "port" is bound to a specific process on the machine. This is why we have ports, and this is why we use bind(): to tell the system which process should receive the data arriving on that port.
This should explain the answers people have posted. If you are a sender, you don't care what the outgoing port is, so you usually don't use bind() to specify that port. If you are a receiver, well, everyone else has to know where to look for you. So you bind() your program to port 80 and then tell everyone to make sure to transmit data there.
To answer your question: yes, you probably want to use bind() for your server. But the clients don't need to use bind(); they just need to make sure they transmit data to whatever port you've chosen.
After reading your updated question, I would suggest not using the bind() function when making client calls. The function is used, when writing your own server, to bind the socket (created by a call to socket()) to a local address and port.
For further help look at this tutorial
bind() is useful when you are writing a server which awaits data from clients by "listening" to a known port. With bind() you are able to set the port on which you will listen() with the same socket.
If you are writing the client, it is not needed for you to call bind() -- you can simply call recv() to obtain the data sent from the server. Your local port will be set to an "ephemeral" value when the TCP connection is established.
You use bind whenever you want to bind to a local address. You mostly use this for opening a listening socket on a specific address/port, but it can also be used to fix the address/port of an outgoing TCP connection.
You need to call bind() only in your server. It's needed specifically for binding a port number to your socket.
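For the UDP case raised in the EDIT, the following sketch runs both roles in one process on the loopback interface. Only the server calls bind(), and the client still receives the reply because its first sendto() gives it an ephemeral port. The function name and the "ping"/"pong" payloads are invented for this illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical demo: a bound UDP server answers a client that never calls
 * bind(). On success returns 0 and stores in *client_port the ephemeral
 * port the kernel assigned to the client; returns -1 on error. */
int udp_reply_without_client_bind(unsigned short *client_port)
{
    struct sockaddr_in srv, peer;
    socklen_t len = sizeof srv, peerlen = sizeof peer;
    struct timeval tv = { 1, 0 };        /* 1 s receive timeout, so we never hang */
    char buf[16];
    int rc = -1;
    int server = socket(AF_INET, SOCK_DGRAM, 0);
    int client = socket(AF_INET, SOCK_DGRAM, 0);

    if (server < 0 || client < 0)
        goto out;
    setsockopt(server, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);
    setsockopt(client, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);

    /* Server: bind() so clients know where to send. Port 0 = any free port. */
    memset(&srv, 0, sizeof srv);
    srv.sin_family = AF_INET;
    srv.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(server, (struct sockaddr *)&srv, sizeof srv) < 0 ||
        getsockname(server, (struct sockaddr *)&srv, &len) < 0)
        goto out;

    /* Client: no bind(); the first sendto() assigns an ephemeral port. */
    if (sendto(client, "ping", 4, 0, (struct sockaddr *)&srv, sizeof srv) != 4)
        goto out;

    /* Server: recvfrom() reports the client's address and port. */
    if (recvfrom(server, buf, sizeof buf, 0,
                 (struct sockaddr *)&peer, &peerlen) != 4)
        goto out;
    if (sendto(server, "pong", 4, 0, (struct sockaddr *)&peer, peerlen) != 4)
        goto out;

    /* Client: the reply arrives on the socket that was never bound. */
    if (recv(client, buf, sizeof buf, 0) == 4 && memcmp(buf, "pong", 4) == 0) {
        *client_port = ntohs(peer.sin_port);
        rc = 0;
    }
out:
    if (server >= 0) close(server);
    if (client >= 0) close(client);
    return rc;
}
```

In a real program the two halves live in separate processes, and the server binds a fixed, published port instead of port 0.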

Why is separate getaddrinfo-like() + connect() not refactored into a (theoretical) connect_by_name()?

Most of the applications I've seen that use TCP, do roughly the following to connect to remote host:
1. get the hostname (or address) from the configuration/user input (textual)
2. either resolve the hostname into address and add the port, or use getaddrinfo()
3. from the above fill in the sockaddr_* structure with one of the remote addresses
4. use the connect() to get the socket connected to the remote host.
5. if fails, possibly go to (3) and retry - or just complain about the error
(2) is blocking in the stock library implementation, and (4) seems to be most frequently non-blocking, which leaves room for a lot of somewhat similar yet different code that serves the purpose of asynchronously connecting to a remote host by its hostname.
So the question: what are the good reasons not to have the additional single call like following:
int sockfd = connect_by_name(const char *hostname, const char *servicename)
?
I can come up with three:
historic: because that's what the API is
provide for custom per-application policy mechanism for address selection/connection retry: this seems a bit superficial, since for the common case ("get me a tube to talk to remote host") the underlying OS should know better
provide the visual feedback to the user about the exact step involved ("name resolution" vs "connection attempt"): this seems rather important, lookup+connection attempt may take time
Only the last of them seems compelling enough to justify rewriting the resolve/connect code for every client app (as opposed to at least having and using a widely used library that would implement the connect_by_name() semantics in addition to the existing sockets API), so surely there must be some more reasons that I am missing?
(one of the reasons behind the question is that this kind of API would appear to significantly help portability to IPv6, as well as possibly to other stream transport protocols)
Or maybe such a library exists and my google-fu failed me?
(edited: corrected the definition to look like it was meant to look, thanks LnxPrgr3)
Implementing such an API with non-blocking characteristics within the constraints of the standard library (which, crucially, isn't supposed to start its own threads or processes to work asynchronously) would be problematic.
Both the name lookup and the connecting part of the process require waiting for a remote response. If either of these is not to block, then that requires a way of doing asynchronous work and signalling the change in state of the socket to the calling application. connect is able to do this, because the work of the connect call is done in the kernel, and the kernel can mark the socket as writable when the connect is done. However, name lookup is not able to do this, because the work of a name lookup is done in userspace, and without starting a new thread (which is verboten in the standard library), giving that name lookup code a way to be woken up to continue work is a difficult problem.
You could do it by having your proposed call return two file descriptors - one for the socket itself, and another that you are told "Do nothing with this file descriptor except to check regularly if it is readable. If this file descriptor becomes readable, call cbn_do_some_more_work(fd)". That is clearly a fairly uninspiring API!
The usual UNIX approach is to provide a set of simple, flexible tools, working on a small set of object types, that can be combined in order to produce complex effects. That applies to the programming API as much as it does to the standard shell tools.
Because you can build higher level APIs such as the one you propose on top of the native low level APIs.
The socket API is not just for TCP, but can also be used for other protocols that may have different end point conventions (e.g. the Unix-local protocol, where you have a name only and no service). Or consider DNS, which uses sockets to implement itself. How does the DNS code connect to the server if the connection code relies on DNS?
If you would like a higher level abstraction, one library to check out is ACE.
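A blocking version of the proposed call is short to build on top of the existing API, which is the point being made here. The sketch below is one possible reading of the connect_by_name() signature from the question. It tries each address returned by getaddrinfo() in order, and it does not address the asynchronous case discussed above.

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Sketch of the connect_by_name() proposed in the question (blocking).
 * Returns a connected TCP socket, or -1 if resolution or every
 * connection attempt failed. */
int connect_by_name(const char *hostname, const char *servicename)
{
    struct addrinfo hints, *res, *ai;
    int fd = -1;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;         /* IPv4 or IPv6, whichever resolves */
    hints.ai_socktype = SOCK_STREAM;

    if (getaddrinfo(hostname, servicename, &hints, &res) != 0)
        return -1;

    /* Try each returned address in order until one connects. */
    for (ai = res; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0)
            break;                       /* success: keep this socket */
        close(fd);
        fd = -1;
    }

    freeaddrinfo(res);
    return fd;
}
```

A caller would write something like connect_by_name("example.org", "http"). Because the address family comes from getaddrinfo(), the same code covers IPv4 and IPv6.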
There are several questions in your question. For instance, why not standardize an API with such a connect_by_name? That would certainly be a good idea. It would not fit every purpose (see the DNS example from R Samuel Klatchko) but for the typical network program, it would be OK. A paper exploring such APIs is "Simplifying Internet Applications Development With A Name-Oriented Sockets Interface" by Christian Vogt. Note that another difficulty for such an API would be "callback" applications, for instance a SIP client asking to be called back: the application has no easy way to know its own name and therefore often prefers to be called back by address, despite the problems this causes, for instance with NAT.
Now, another question is "Is it possible to build such a connect_by_name subroutine today?" Partly yes (with the caveats mentioned by caf) but, if written in userspace, in an ordinary library, it would not be completely "name-oriented" since the Unix kernel still manages the connections using IP addresses. For instance, I would expect a "real" connect_by_name routine to be able to survive renumbering (for instance because a mobile host renumbered), which is quite difficult to do in userspace.
Finally, yes, there are already a lot of libraries with similar semantics. For an HTTP client (the most common case for a program whose network abilities are not the main feature, for instance an XML processor), you have Neon and libcURL. With libcURL, you can simply write things like:
#define URL "http://www.velib.paris.fr/service/stationdetails/42"
...
curl_easy_setopt(curl, CURLOPT_URL, URL);
result = curl_easy_perform(curl);
which is even higher-layer than connect_by_name since it uses a URL, not a domain name.
