I am trying to learn socket programming. My question: if you fork a child on the client after connecting, i.e. after the connect() call, can the child and the parent then read and write independently using the socket descriptor? The server only knows about one socket. If the child is reading faster than the parent, will there be data loss at the parent?
Yes: if two processes try to act on the same connection, they will compete. Each byte is delivered to whichever process happens to read it first, so the data is not lost as such, but the other process never sees it.
fork() duplicates the connection's file descriptor (just as dup()/dup2()/dup3() do), but the two file descriptors are merely two counted references to the same connection.
Practically, that means most fd-taking syscalls (read(), write(), ...) go through to the shared target (the actual connection), while close() only decrements the reference count; a connection shutdown (as with the shutdown() syscall) is initiated only when the refcount reaches 0.
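A quick illustration of that competition, as a minimal sketch: a socketpair() stands in for a real TCP connection, and shared_fd_demo is an invented name for this demo. After fork(), each byte written to the connection is read by exactly one of the two processes, never both.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* A socketpair() stands in for a real TCP connection here.  After fork(),
 * parent and child share the same underlying socket: each byte is
 * delivered to exactly one reader, never duplicated to both. */
int shared_fd_demo(void)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    if (write(sv[1], "ab", 2) != 2)   /* two bytes queued on the "connection" */
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {                   /* child: consume exactly one byte */
        char c;
        ssize_t n = read(sv[0], &c, 1);
        _exit(n == 1 && (c == 'a' || c == 'b') ? 0 : 1);
    }

    char c;
    ssize_t n = read(sv[0], &c, 1);   /* parent gets the remaining byte */
    int status;
    waitpid(pid, &status, 0);
    close(sv[0]);
    close(sv[1]);

    /* both processes read one byte each: the data was split, not lost */
    if (n != 1 || !(c == 'a' || c == 'b'))
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Which byte each process gets is a race; all that is guaranteed is that the two one-byte reads consume the two bytes between them.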
Related
I am writing a cross-platform, multiprocess and multithreaded server using the "pre-forking" model and the C language.
Depending on the mode (multiprocess or multithreaded), the server creates at startup a pool of processes/threads whose task is to handle the client requests accepted by the main server.
Because child processes are created before accepting a socket, they obviously do not inherit the accepted socket.
On Win32 I solved this by duplicating the socket. How can I do the same in C on Linux?
Use a Unix domain socket instead of a pipe for any control communication between the parent and the child. Unlike pipes, they are bidirectional. If you use a datagram socket, each send() corresponds to one recv() and vice versa (i.e. message boundaries are retained), which makes passing structures and such easier.
The point is, you can pass descriptors between processes using a Unix domain socket. The cmsg man page has example code.
Essentially, before you fork the child processes, you create a Unix domain socket pair, unique to each child process, for control communication between the parent and that child. I recommend using a Unix domain datagram socket.
When the parent process wishes to hand off a connection to a child process, it sends the child a message, with an SCM_RIGHTS ancillary message containing the connected socket descriptor. (The kernel will handle the details of copying the descriptor over; just note that the descriptor number may differ in the receiving process.)
This approach works not only on Linux, but also on the BSDs.
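For reference, a minimal sketch of such descriptor passing, modeled on the example in the cmsg man page; send_fd()/recv_fd() are illustrative names, not a standard API:

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Descriptor passing over a Unix domain socket, modeled on man 3 cmsg.
 * send_fd()/recv_fd() are illustrative names, not a standard API. */
int send_fd(int chan, int fd)
{
    char byte = 0;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union { struct cmsghdr h; char buf[CMSG_SPACE(sizeof(int))]; } u;
    struct msghdr msg = {
        .msg_iov = &iov, .msg_iovlen = 1,
        .msg_control = u.buf, .msg_controllen = sizeof u.buf,
    };
    struct cmsghdr *cm = CMSG_FIRSTHDR(&msg);
    cm->cmsg_level = SOL_SOCKET;
    cm->cmsg_type  = SCM_RIGHTS;     /* "I am passing access rights" */
    cm->cmsg_len   = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cm), &fd, sizeof(int));
    return sendmsg(chan, &msg, 0) == 1 ? 0 : -1;
}

int recv_fd(int chan)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union { struct cmsghdr h; char buf[CMSG_SPACE(sizeof(int))]; } u;
    struct msghdr msg = {
        .msg_iov = &iov, .msg_iovlen = 1,
        .msg_control = u.buf, .msg_controllen = sizeof u.buf,
    };
    if (recvmsg(chan, &msg, 0) < 0)
        return -1;
    struct cmsghdr *cm = CMSG_FIRSTHDR(&msg);
    if (cm == NULL || cm->cmsg_level != SOL_SOCKET || cm->cmsg_type != SCM_RIGHTS)
        return -1;
    int fd;
    memcpy(&fd, CMSG_DATA(cm), sizeof(int));
    return fd;   /* note: likely a different number than in the sender */
}
```

The received descriptor may carry a different number than the one sent, but it refers to the same open connection.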
I am doing some network testing, and I am connecting between Linux boxes with two small C programs that just use the function:
connect()
After connecting, some small calculations are made and recorded to a local file, and I instruct one of the programs to close the connection and then run a netcat listener on the same port. The first program then retries the connection and connects to netcat.
I wondered if someone could advise whether it is possible to maintain the initial connection while freeing the port, and to pass that connection to netcat on the same port (so that the initial connection is not closed).
Each TCP connection is defined by the four-tuple (target IP address, target port, source IP address, source port), so there is no need to "free up" the port on either machine.
It is very common for a server process to fork() immediately after accept()ing a new connection. The parent process closes its copy of the connection descriptor (returned by accept()), and waits for a new connection. The child process closes the original socket descriptor, and executes the desired program or script that should handle the actual connection. In many cases the child moves the connection descriptor to standard input and standard output (using dup2()), so that the executed script or program does not even need to know it is connected to a remote client: everything it writes to standard output is sent to the remote client, and everything the remote client sends is readable from standard input.
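That fork-after-accept pattern can be sketched roughly as follows; the helper name serve_with_cat and the choice of /bin/cat as the handler program are assumptions for illustration (cat simply echoes everything the peer sends straight back):

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

/* fork-after-accept sketch: the child moves the connection onto stdin and
 * stdout with dup2(), then execs an external program.  /bin/cat simply
 * echoes everything the peer sends back to the peer. */
pid_t serve_with_cat(int conn)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        dup2(conn, STDIN_FILENO);    /* remote data arrives on stdin   */
        dup2(conn, STDOUT_FILENO);   /* stdout goes back to the remote */
        close(conn);                 /* the duplicates keep it open    */
        execl("/bin/cat", "cat", (char *)0);
        _exit(127);                  /* only reached if exec failed    */
    }
    close(conn);                     /* parent drops its copy and      */
    return pid;                      /* goes back to accept()          */
}
```

The executed program needs no socket code at all: it just reads standard input and writes standard output.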
If there is an existing process that should handle the connection, and there is a Unix domain socket connection (stream, datagram or seqpacket; it makes no difference) between the two processes, it is possible to transfer the connection descriptor as an SCM_RIGHTS ancillary message. See man 2 sendmsg, man 2 recvmsg, man 3 cmsg, and man 7 unix for details. This only works between processes on the same machine over a Unix domain socket, because the kernel actually duplicates the descriptor from one process into the other.
If your server-side logic is something like
For each incoming connection:
Do some calculations
Store calculations into a file
Store incoming data from the connection into a file (or standard output)
then I recommend using pthreads. Just create the desired number of threads, have all of them wait for an incoming connection by calling accept() on the listening socket, and have each thread handle its connection by itself. You can even use stdio.h I/O for the file I/O. For more complex output (multiple statements per chunk), you'll need a pthread_mutex_t per output stream, and remember to fflush() it before releasing the mutex. I suspect a single multithreaded program that does all that, and exits cleanly when interrupted (SIGINT, i.e. Ctrl+C), should not exceed three hundred lines of C.
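A minimal sketch of that design, with invented names (worker, make_listener) and most error handling omitted; every worker blocks in accept() on the same listening socket, and the kernel hands each incoming connection to exactly one of them:

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <pthread.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

static pthread_mutex_t out_lock = PTHREAD_MUTEX_INITIALIZER;
static int handled;                  /* demo counter, protected by out_lock */

static void *worker(void *arg)
{
    int listenfd = *(int *)arg;
    for (;;) {
        int conn = accept(listenfd, NULL, NULL);
        if (conn < 0)
            break;
        /* serialize output to the shared stream, flushing before unlock */
        pthread_mutex_lock(&out_lock);
        printf("handled one connection\n");
        fflush(stdout);
        handled++;
        pthread_mutex_unlock(&out_lock);
        close(conn);
    }
    return NULL;
}

/* bind a loopback listener on an ephemeral port, report the chosen port */
static int make_listener(unsigned short *port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in sa = { .sin_family = AF_INET };
    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (fd < 0 || bind(fd, (struct sockaddr *)&sa, sizeof sa) < 0 ||
        listen(fd, 8) < 0)
        return -1;
    socklen_t len = sizeof sa;
    getsockname(fd, (struct sockaddr *)&sa, &len);
    *port = ntohs(sa.sin_port);
    return fd;
}
```

A real server would handle the client inside worker() instead of just counting, and would install a SIGINT handler that closes the listening socket so the threads exit.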
If you only need to output data from a stream socket, or if you only need to write input to it, you can treat it as a file handle.
So in a POSIX program you can use dup2() to duplicate the socket descriptor onto descriptor 0 (standard input), close the original descriptor, and then use exec() to overwrite your program with "cat", which will copy everything arriving on the socket (its standard input) to its standard output. Duplicate the socket onto descriptor 1 (standard output) instead if you want the program's output sent to the socket.
I am trying to program the server side of a groupchat system using C whilst my friend is programming the client side. For each client connection the server receives, it forks a child process so as to handle the client and continue accepting any possibly further clients.
The server is required to send a list of all online users (connected clients) to each of the currently connected clients, and for this reason I have used pipes. Basically, when a child process is created, it receives some information from the client through a socket and sends that information to the parent, which keeps a list of all the clients, through a pipe. This list has to be updated every time a client makes a change, such as starting a chat or disconnecting. For example, if a client disconnects, the child sends a message to the parent through the pipe and the parent updates the list accordingly. Note that a new pipe is created for each new connection.
My problem is that if, for example, I receive 3 connections one after another and the 2nd child disconnects, the parent does not read the information from that child's pipe, since the parent is now on a different pipe than the 2nd child's (remember that a new pipe was created when the 3rd connection was made). How can I go about solving this problem?
I have also tried creating one common pipe, but if I don't close the unused pipe ends before reading/writing I get an error, and if I do close them I get a segmentation fault when the second client connects, since the pipe has already been closed.
Any help would be greatly appreciated because I have been searching for hours to no avail.
Thanks.
The parent server process knows when a child is created, because it creates the child. It can tell when a child dies by setting a SIGCHLD signal handler, so it is notified when a child does die. The Nth child has N-1 pipes to close: those going to the other children (unless some of the children have died). The parent process closes the write end of the pipes it creates; the child process closes the read end of the pipes it inherits (which leaves it with a socket to the client and the write end of the pipe created for it to communicate with the parent).
If you need to know when a child starts communicating with a client, then you need to send a message down the pipe from the child to the parent. It is not so obvious how to tell when the child stops communicating — how long needs to elapse before you declare that the child is idle again?
In the parent, you end up polling in some shape or form (select(), poll(), epoll()) on the listening socket and all the read pipes. When some activity occurs, the parent wakes up and responds appropriately. It's a feasible design as long as it doesn't have to scale to thousands or more clients. It requires some care, notably in closing enough file descriptors.
You say:
My problem is that if, for example, I receive 3 connections one after another and the 2nd child disconnects, the parent does not read the information from that child's pipe, since the parent is now on a different pipe than the 2nd child's (remember that a new pipe was created when the 3rd connection was made). How can I go about solving this problem?
The parent should have an array of open file descriptors (pipes open for reading to various children), along with an indication of which child (PID) is on the other end of the pipe. The parent will close the pipe when it gets EOF on the pipe, or when it is notified that the child has died (via waitpid() or a relative). The polling mechanism will tell you when a pipe is closed, at least indirectly (you will be told the file descriptor won't block, and then you get the EOF — zero bytes read).
In your scenario, the parent has one listening socket open, plus 3 read file descriptors for the pipes to the 3 children (plus standard input, output and error, and maybe syslog).
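The EOF-driven bookkeeping described here can be sketched with select(); drain_pipes() is an illustrative helper, not part of any API:

```c
#include <assert.h>
#include <sys/select.h>
#include <unistd.h>

/* Read from n pipe read-ends until each has reached EOF (read() == 0,
 * i.e. the child on the other end closed its write end or died).
 * Returns the total number of bytes drained, or -1 on error. */
int drain_pipes(int *fds, int n)
{
    int open_count = n, total = 0;
    while (open_count > 0) {
        fd_set rd;
        FD_ZERO(&rd);
        int maxfd = -1;
        for (int i = 0; i < n; i++)
            if (fds[i] >= 0) {
                FD_SET(fds[i], &rd);
                if (fds[i] > maxfd)
                    maxfd = fds[i];
            }
        if (select(maxfd + 1, &rd, NULL, NULL, NULL) < 0)
            return -1;
        for (int i = 0; i < n; i++) {
            if (fds[i] < 0 || !FD_ISSET(fds[i], &rd))
                continue;
            char buf[256];
            ssize_t got = read(fds[i], buf, sizeof buf);
            if (got <= 0) {          /* EOF (or error): forget this pipe */
                close(fds[i]);
                fds[i] = -1;
                open_count--;
            } else {
                total += (int)got;   /* a real server would parse this */
            }
        }
    }
    return total;
}
```

A real server would also keep the listening socket in the read set and map each pipe's index back to the child's PID.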
Although you could use a single pipe from all the children, it is much trickier to handle. You'd have to identify in each message which child wrote it, and ensure that each message is written atomically by the child. The parent has to be able to tell how much to read at any point so as not to get confused. The advantage of a single pipe is that there is less file descriptor manipulation for the polling system call, and it scales indefinitely (no running out of file descriptors).
In neither case should you run into problems with core dumps.
I am implementing a web server where I need to make the parent process do the following:
fork() new worker processes (a pool) at the beginning.
Looping forever, listening for incoming requests (via socket communication).
Putting the socket descriptor (returned by accept() function) into a queue.
and the worker process will do the following:
Once created, loops forever watching the queue for any passed socket descriptors.
If he takes the socket descriptor, he handles the request and serves the client accordingly.
After looking around and searching the internet, I found that I can send a file descriptor between different processes via a Unix domain socket or pipes. But unfortunately, I can do this only synchronously (I can send one fd at a time, and I cannot put it in a waiting queue).
So, my question is:
How can I make the parent process put the socket descriptor into a waiting queue, so that the request stays pending until one of the worker processes finishes its previous request?
File descriptors are just integers. They are used to index into a per-process table of file information, maintained by the kernel. You can't expect a file descriptor to be "portable" to other processes.
It works (somewhat) if you create the files before calling fork(), since the file descriptor table is part of the process and thus clone()d when the child is created. For file descriptors allocated after the processes have split, such as when using accept() to get a new socket, you can't do this.
UPDATE: It seems there is a way, using sendmsg() with AF_UNIX sockets and SCM_RIGHTS ancillary data; see man 3 cmsg for details. I did not know that; it sounds a bit "magical", but apparently it's a well-established mechanism, so why not go ahead and implement it:
put the fd on an internal queue (lock-free if you want, but probably not necessary)
have a thread in the parent process which just reads an fd from the internal queue, and sends it via the pipe
all child processes inherit the other end of the pipe, and compete to read the next fd when they finish their current job
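Under those assumptions, the internal queue might look like this mutex/condvar-protected ring buffer; fd_queue and the fdq_* names are invented for this sketch:

```c
#include <assert.h>
#include <pthread.h>

/* A fixed-size ring of descriptors: the accept() loop pushes, a sender
 * thread pops and forwards each fd to a worker over the control socket. */
#define QCAP 64

struct fd_queue {
    int buf[QCAP];
    int head, tail, count;
    pthread_mutex_t lock;
    pthread_cond_t nonempty;
};

void fdq_init(struct fd_queue *q)
{
    q->head = q->tail = q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->nonempty, NULL);
}

/* called from the accept loop; fails if the queue is full (a real server
 * would block here, or stop accepting until there is room) */
int fdq_push(struct fd_queue *q, int fd)
{
    pthread_mutex_lock(&q->lock);
    if (q->count == QCAP) {
        pthread_mutex_unlock(&q->lock);
        return -1;
    }
    q->buf[q->tail] = fd;
    q->tail = (q->tail + 1) % QCAP;
    q->count++;
    pthread_cond_signal(&q->nonempty);
    pthread_mutex_unlock(&q->lock);
    return 0;
}

/* called from the sender thread; blocks until an fd is available */
int fdq_pop(struct fd_queue *q)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == 0)
        pthread_cond_wait(&q->nonempty, &q->lock);
    int fd = q->buf[q->head];
    q->head = (q->head + 1) % QCAP;
    q->count--;
    pthread_mutex_unlock(&q->lock);
    return fd;
}
```

After the sender thread pops an fd and passes it down the pipe (e.g. with an SCM_RIGHTS message), it should close its own copy of the descriptor.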
How is select() for reading handled on Linux systems when the process was forked after opening a UDP socket?
Especially - is it possible that in this kind of program:
so = open socket
fork
for(;;) {
select() for reading on socket so
recv from so
}
two packets will wake up only one of the processes (in case both arrive before the waiting process is notified / exits select), and the second of those packets will not be received?
Or can I assume that for UDP, every packet will always wake up a process or leave the flag set?
Each process, the parent and the child, has a file descriptor for the same socket. Per-descriptor operations are independent: for example, each process can close its descriptor without affecting the other's.
In your scenario it is indeed possible (and legal) for one of the processes, for example, to be woken up and read the data from the socket before the other one gets into select().
Your question is not actually affected by the fork() at all.
select() returns if one of the file descriptors in the read set is readable. If you don't read from it and call select() again, it will still be readable. It will remain readable until there is no more data to read from it.
In other words, select() is level-triggered, not edge-triggered.
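A small self-check of that level-triggered behaviour; readable_now() is an illustrative helper that polls one descriptor with a zero timeout:

```c
#include <assert.h>
#include <sys/select.h>
#include <unistd.h>

/* Poll a single descriptor with a zero timeout: returns 1 if select()
 * reports it readable right now, 0 otherwise. */
int readable_now(int fd)
{
    fd_set rd;
    FD_ZERO(&rd);
    FD_SET(fd, &rd);
    struct timeval tv = { 0, 0 };
    return select(fd + 1, &rd, NULL, NULL, &tv) > 0;
}
```

Calling readable_now() repeatedly without reading keeps returning 1 until the pending data is consumed, which is exactly what level-triggered means.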