I am implementing a web server where I need the parent process to do the following:
- fork() new worker processes (a pool) at startup;
- loop forever, listening for incoming requests (via socket communication);
- put the socket descriptor (returned by accept()) into a queue.
Each worker process will do the following:
- once created, loop forever, watching the queue for any passed socket descriptors;
- when it takes a socket descriptor, it handles the request and serves the client accordingly.
After looking around and searching the internet, I found that I can send a file descriptor between processes via a Unix domain socket or a pipe. But unfortunately, I can only do this synchronously (I can send one fd at a time, and I cannot put it in a waiting queue).
So, my question is:
How can I make the parent process put the socket descriptor into a waiting queue, so that the request stays pending until one of the worker processes finishes its previous request?
File descriptors are just integers. They are used to index into a per-process table of file information, maintained by the kernel. You can't expect a file descriptor to be "portable" to other processes.
It works (somewhat) if you create the files before calling fork(), since the file descriptor table is part of the process and thus clone()d when the child is created. For file descriptors allocated after the processes have split, such as when using accept() to get a new socket, you can't do this.
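That pre-fork inheritance can be demonstrated in a few lines. This is just an illustrative sketch (the function name is made up): a pipe created before fork() is usable in the child because the fd table was cloned.

```c
/* A tiny sketch of the point above: a descriptor created before fork() is
 * inherited, because the fd table is copied into the child. */
#include <sys/wait.h>
#include <unistd.h>

/* Parent writes one byte into a pipe made before fork(); the child, which
 * inherited the read end, reads it back and reports it via its exit
 * status. Returns the byte the child saw, or -1 on error. */
int inherited_pipe_demo(void) {
    int p[2];
    if (pipe(p) == -1)
        return -1;
    pid_t pid = fork();
    if (pid == 0) {               /* child: inherited p[0] from the parent */
        char c = 0;
        read(p[0], &c, 1);
        _exit((unsigned char)c);
    }
    write(p[1], "A", 1);
    int status;
    waitpid(pid, &status, 0);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```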
UPDATE: It seems there is a way, using sendmsg() with SCM_RIGHTS ancillary data over AF_UNIX sockets, as mentioned in this question. I did not know that; it sounds a bit "magical", but apparently it's a well-established mechanism, so why not go ahead and implement it:
- put the fd on an internal queue (lock-free if you want, but probably not necessary);
- have a thread in the parent process that just pops an fd from the internal queue and sends it over the Unix domain socket (a plain pipe cannot carry descriptors on Linux, so use socketpair(AF_UNIX, ...));
- all child processes inherit the other end of the socket, and compete to read the next fd when they finish their current job.
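The internal queue in the first step can be a plain mutex-and-condvar bounded queue of ints. A minimal sketch, with made-up names (fdq_*) and error handling trimmed for brevity:

```c
/* A thread-safe queue of file descriptors: the accept() loop pushes,
 * the sender thread pops (blocking while the queue is empty). */
#include <pthread.h>

#define FDQ_CAP 128

typedef struct {
    int fds[FDQ_CAP];
    int head, tail, count;
    pthread_mutex_t lock;
    pthread_cond_t not_empty;
} fdq_t;

void fdq_init(fdq_t *q) {
    q->head = q->tail = q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
}

/* Called by the accept() loop: enqueue a connected socket. */
void fdq_push(fdq_t *q, int fd) {
    pthread_mutex_lock(&q->lock);
    q->fds[q->tail] = fd;
    q->tail = (q->tail + 1) % FDQ_CAP;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Called by the sender thread: block until an fd is available. */
int fdq_pop(fdq_t *q) {
    pthread_mutex_lock(&q->lock);
    while (q->count == 0)
        pthread_cond_wait(&q->not_empty, &q->lock);
    int fd = q->fds[q->head];
    q->head = (q->head + 1) % FDQ_CAP;
    q->count--;
    pthread_mutex_unlock(&q->lock);
    return fd;
}
```

A real version would also handle a full queue (here pushes past FDQ_CAP would silently overwrite); whether you block, drop, or grow at that point is a design choice.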
Related
I am trying to create a web server in C that uses epoll for multiplexing IO. I was trying to make it capable of generating PHP pages.
What I did: for each connection I read the path, created an unnamed pipe, called fork(), then redirected the output of the child process to the pipe and used execvp("php", (char *const *) argv);. In the parent process I added the pipe to epoll with EPOLLIN and waited for it in the main loop. When it was signaled, I used io_prep_pread from libaio to read asynchronously, and when the read finished I would send the buffer to the client.
The problem is that the correct result is output only about 5-10% of the time. Is the logic I presented correct, or should I wait for the child process to send SIGCHLD and then start reading the pipe?
I am writing a cross-platform, multi-process and multi-threaded server in C, using the pre-forking model.
Depending on the mode (multi-process or multi-threaded), the server, once started, creates a set of processes/threads whose task is to handle the client requests accepted by the main server.
Because the child processes are created before any connection is accepted, they obviously do not inherit the accepted sockets.
On Win32 I solved this by duplicating the socket.
How can I do this in C on Linux?
Use a Unix domain socket instead of a pipe for any control communication between the parent and the child. Unlike pipes, they are bidirectional. If you use a datagram socket, each send() corresponds to one recv() and vice versa (i.e. message boundaries are retained), which makes passing structures and such easier.
The point is, you can pass descriptors between processes using a Unix domain socket. The cmsg man page has example code.
Essentially, before you fork the child processes, you create a Unix domain socket pair, unique for each child process, for control communication between the parent and the child. I recommend using a Unix domain datagram socket.
When the parent process wishes to hand off a connection to a child process, it sends the child a message, with an SCM_RIGHTS ancillary message containing the connected socket descriptor. (The kernel will handle the details of copying the descriptor over; just note that the descriptor number may differ in the receiving process.)
This approach works not only on Linux, but also on the BSDs.
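The SCM_RIGHTS machinery boils down to two helpers, loosely following the cmsg man page example. The names send_fd()/recv_fd() are illustrative, not from any library:

```c
/* Pass a file descriptor over a Unix domain socket using SCM_RIGHTS
 * ancillary data. chan is one end of a socketpair(AF_UNIX, ...). */
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

int send_fd(int chan, int fd) {
    struct msghdr msg = {0};
    char dummy = 'x';                       /* must send at least 1 byte */
    struct iovec iov = { &dummy, 1 };
    char buf[CMSG_SPACE(sizeof(int))];
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = buf;
    msg.msg_controllen = sizeof(buf);
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));
    return sendmsg(chan, &msg, 0) == 1 ? 0 : -1;
}

int recv_fd(int chan) {
    struct msghdr msg = {0};
    char dummy;
    struct iovec iov = { &dummy, 1 };
    char buf[CMSG_SPACE(sizeof(int))];
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = buf;
    msg.msg_controllen = sizeof(buf);
    if (recvmsg(chan, &msg, 0) <= 0)
        return -1;
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    int fd;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;          /* note: may be a different number than sent */
}
```

The kernel installs the descriptor into the receiver's fd table, which is why the received number can differ from the one sent.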
I am trying to learn socket programming. My question is this: if you fork a child on the client after connecting to a socket, i.e. after the connect() call, can you then read and write independently using the socket descriptor in both the child and the parent? The server only knows about one socket. So if you are reading faster in the child than in the parent, will there be data loss in the parent?
Yes. If two processes try to act on the same connection, they will compete.
Forking duplicates the connection's file descriptor (as dup()/dup2()/dup3() would), but those two file descriptors are just two counted references to the same connection.
Practically, that means most fd-taking syscalls (read(), write(), ...) go through to the shared target (the actual connection); close(), however, only decrements the reference count, and a connection shutdown (as with the shutdown() syscall) is initiated only when the refcount reaches 0.
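The same reference-counting behaviour can be seen without a network, using a pipe and dup() in place of a TCP connection and fork(). A small sketch (the function name is made up):

```c
/* Duplicated descriptors are just counted references to one underlying
 * object: closing one copy does not tear the object down. */
#include <unistd.h>

int refcount_demo(char *out) {
    int p[2];
    if (pipe(p) == -1)
        return -1;
    int rd_copy = dup(p[0]);     /* second reference, as fork() would create */
    if (write(p[1], "ok", 2) != 2)
        return -1;
    close(p[0]);                 /* refcount drops to 1: pipe stays alive */
    if (read(rd_copy, out, 2) != 2)   /* still readable via the copy */
        return -1;
    close(rd_copy);
    close(p[1]);
    return 0;
}
```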
I am trying to program the server side of a group-chat system in C, whilst my friend is programming the client side. For each client connection the server receives, it forks a child process to handle that client while continuing to accept any further clients.
The server is required to send a list of all online users (connected clients) to each currently connected client, and for this reason I have used pipes. Basically, when a child process is created, it receives some information from the client through a socket and sends that information through a pipe to the parent, which keeps a list of all the clients. This list has to be updated every time a client makes a change, like starting to chat or disconnecting. For example, if a client disconnects, the child sends a message to the parent through the pipe and the parent updates the list accordingly. Note that a new pipe is created for each new connection.
My problem is that if, for example, I receive 3 connections one after another and the 2nd child disconnects, the parent does not read the information from that pipe, since the parent is by then holding a different pipe from the 2nd child's. (Remember that a new pipe has been created because a 3rd connection has been made.) How can I go about solving this problem?
I have also tried creating one common pipe, but if I don't close the unused pipe ends before reading/writing I get an error, and if I do close them I get a segmentation fault when the second client connects, since the pipe is already closed.
Any help would be greatly appreciated because I have been searching for hours to no avail.
Thanks.
The parent server process knows when a child is created because it creates the child. It can tell when a child dies by setting a SIGCHLD handler so it is notified when a child does die. The Nth child has N-1 pipes to close, namely those going to the other children (unless some of the children have died). The parent process closes the write end of each pipe it creates; the child process closes the read ends of the pipes it inherits (which leaves it with a socket to the client and the write end of the pipe created for it to communicate with the parent).
If you need to know when a child starts communicating with a client, then you need to send a message down the pipe from the child to the parent. It is not so obvious how to tell when the child stops communicating — how long needs to elapse before you declare that the child is idle again?
In the parent, you end up polling in some shape or form (select(), poll(), epoll()) on the listening socket and all the read pipes. When some activity occurs, the parent wakes up and responds appropriately. It's a feasible design as long as it doesn't have to scale to thousands or more clients. It requires some care, notably in closing enough file descriptors.
You say:
My problem is that if, for example, I receive 3 connections one after another and the 2nd child disconnects, the parent does not read the information from that pipe, since the parent is by then holding a different pipe from the 2nd child's. (Remember that a new pipe has been created because a 3rd connection has been made.) How can I go about solving this problem?
The parent should have an array of open file descriptors (pipes open for reading to various children), along with an indication of which child (PID) is on the other end of the pipe. The parent will close the pipe when it gets EOF on the pipe, or when it is notified that the child has died (via waitpid() or a relative). The polling mechanism will tell you when a pipe is closed, at least indirectly (you will be told the file descriptor won't block, and then you get the EOF — zero bytes read).
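That EOF-detection loop can be sketched with poll(). This is a simplified illustration (the function name is made up; real code would also watch the listening socket and record which child owns which pipe):

```c
/* Poll the read ends of the per-child pipes once; a zero-byte read is
 * EOF, meaning that child closed its write end (e.g. it died). */
#include <poll.h>
#include <unistd.h>

/* Returns the number of pipes that hit EOF this round; their entries
 * are closed and set to -1 (poll() ignores negative fds). */
int reap_closed_pipes(struct pollfd *pfds, int nfds, int timeout_ms) {
    int eofs = 0;
    if (poll(pfds, nfds, timeout_ms) <= 0)
        return 0;
    for (int i = 0; i < nfds; i++) {
        if (!(pfds[i].revents & (POLLIN | POLLHUP)))
            continue;
        char buf[256];
        ssize_t n = read(pfds[i].fd, buf, sizeof buf);
        if (n == 0) {                 /* EOF: child closed its write end */
            close(pfds[i].fd);
            pfds[i].fd = -1;
            eofs++;
        }
        /* n > 0: a status message from the child; process buf here. */
    }
    return eofs;
}
```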
In your scenario, the parent has one listening socket open, plus 3 read file descriptors for the pipes to the 3 children (plus standard input, output and error, and maybe syslog).
Although you could use a single pipe shared by all the children, it is much trickier to handle. You'd have to embed in each message the identity of the child that wrote it, and ensure that each message is written atomically by the child. The parent has to be able to tell how much to read at any point so as not to get confused. The advantage of a single pipe is that there is less file descriptor manipulation for the polling system call; it also scales indefinitely (no risk of running out of file descriptors).
In neither case should you run into problems with core dumps.
How is select() for reading handled on Linux systems when the process was forked after opening a UDP socket?
Especially - is it possible that in this kind of program:
so = open socket
fork()
for (;;) {
    select() for reading on socket so
    recv from so
}
two packets will wake up only one of the processes (in case they arrive before the waiting process is notified / exits select()), and the second of those packets will not be received?
Or can I assume that for UDP, every packet will always wake up a process or leave the flag set?
Each process, the parent and the child, has a file descriptor for the same socket. The per-file-descriptor attributes are independent (e.g. blocking mode, being able to close the socket).
In your scenario it is indeed possible for one of the processes, for example, to be woken and to read the data from the socket before the other one enters select().
Your question is not actually affected by the fork() at all.
select() returns if one of the file descriptors in the read set is readable. If you don't read from it and call select() again, it will still be readable. It will remain readable until there is no more data to read from it.
In other words, select() is level-triggered, not edge-triggered.
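The level-triggered behaviour is easy to verify: leave data unread on a descriptor and every select() call reports it readable again. A small sketch (the function name is made up):

```c
/* Count how many of n_calls consecutive select() calls report fd
 * readable, using a zero timeout so each call is a non-blocking poll. */
#include <sys/select.h>
#include <unistd.h>

int count_readable(int fd, int n_calls) {
    int hits = 0;
    for (int i = 0; i < n_calls; i++) {
        fd_set rfds;
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        struct timeval tv = { 0, 0 };
        if (select(fd + 1, &rfds, NULL, NULL, &tv) > 0 && FD_ISSET(fd, &rfds))
            hits++;
    }
    return hits;
}
```

With one unread byte on a pipe, every call counts it as readable; once the byte is consumed, none do.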