I have a program that needs to:
Handle 20 connections. My program acts as the client in every connection, with each client connecting to a different server.
Once connected, each client should send a request to its server every second and wait for a response. If no request is sent within 9 seconds, the server will time out the client.
It is unacceptable for one connection to cause problems for the rest of the connections.
I do not have access to threads and I do not have access to non-blocking sockets. I have a single-threaded program with blocking sockets.
Edit: The reason I cannot use threads and non-blocking sockets is that I am on a non-standard system. I have a single RTOS (Real-Time Operating System) task available.
To solve this, use of select is necessary but I am not sure if it is sufficient.
Initially I connect to all the servers. But select can only be used to see whether a read or write will block, not whether a connect will.
So say I have connected to the first 2 servers and they are waiting to be served; if the 3rd connect does not work, it will block and cause the first 2 connections to time out as well.
Can this be solved?
I think the connection issue can be solved by setting a timeout for the connect operation, so that it fails fast enough. Of course that will limit you if the network really is working but you have a very long (slow) path to some of the servers. That's bad design, but your requirements are pretty harsh.
See this answer for details on connection-timeouts.
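For illustration, a minimal sketch of such a fail-fast connect, assuming your socket layer honours SO_SNDTIMEO for a blocking connect() (Linux does; whether the RTOS stack does is something you would have to verify):

    #include <string.h>
    #include <sys/socket.h>
    #include <sys/time.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <unistd.h>

    /* Connect with a bounded wait so one dead server cannot stall the
     * whole loop. Returns the connected fd, or -1 on failure/timeout. */
    static int connect_with_timeout(const char *ip, unsigned short port, int seconds)
    {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        if (fd < 0)
            return -1;

        struct timeval tv = { .tv_sec = seconds, .tv_usec = 0 };
        setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv));

        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_port = htons(port);
        inet_pton(AF_INET, ip, &addr.sin_addr);

        if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
            close(fd);
            return -1;      /* connection failed or timed out: give up fast */
        }
        return fd;
    }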
It seems you need to isolate the connections. Well, if you cannot use threads you can always resort to good-old-processes.
Spawn each client by forking your server process and use traditional IPC mechanisms if communication between them is required.
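A rough sketch of that fork-per-connection layout; run_client is a hypothetical routine that performs the blocking connect/send/receive loop for one server:

    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    /* Hypothetical per-connection routine: blocking connect, then a
     * request/response loop against one server. */
    extern void run_client(int client_index);

    int main(void)
    {
        for (int i = 0; i < 20; i++) {
            pid_t pid = fork();
            if (pid == 0) {              /* child: one isolated connection */
                run_client(i);
                _exit(0);
            } else if (pid < 0) {
                perror("fork");
            }
            /* parent keeps spawning; a blocked child cannot stall the rest */
        }

        while (wait(NULL) > 0)           /* reap children as they finish */
            ;
        return 0;
    }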
If you cannot use a multi-process approach either, I'm afraid you'll have a hard time.
While trying to implement a simple HTTP server in C using the Linux socket interface, I have encountered some difficulties with a certain feature I'd like it to have, namely persistent connections. It is relatively easy to send one file at a time over separate TCP connections, but it doesn't seem to be a very efficient solution (considering the multiple handshakes, for instance). Anyway, the server should handle several requests (HTML, CSS, images) during one TCP connection. Could you give me some clues on how to approach the problem?
It is pretty easy - just don't close the TCP connection after you write the reply.
There are two ways to do this: pipelined and non-pipelined.
In a non-pipelined implementation you read one HTTP request from the socket, process it, write the reply back out on the socket, and then try to read another one. Keep doing that until the remote party closes the socket, or close it yourself if you stop getting requests for about 10 seconds.
In a pipelined implementation, read as many requests as are on the socket, process them all in parallel, and then write them all back out on the socket in the same order as you received them. You have one thread reading requests all the time and another one writing the replies out again.
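As a rough sketch of the non-pipelined loop, assuming hypothetical read_http_request and write_http_reply helpers standing in for your own parsing and reply code, with the ~10-second idle timeout handled by select on the single connection:

    #include <stdbool.h>
    #include <sys/select.h>
    #include <unistd.h>

    /* Placeholders for your own request parsing and reply generation. */
    extern bool read_http_request(int fd);   /* false on EOF or bad request */
    extern void write_http_reply(int fd);    /* writes one complete reply   */

    static void serve_connection(int fd)
    {
        for (;;) {
            fd_set rfds;
            FD_ZERO(&rfds);
            FD_SET(fd, &rfds);
            struct timeval idle = { .tv_sec = 10, .tv_usec = 0 };

            /* Close the connection if the client stays quiet for ~10 s. */
            if (select(fd + 1, &rfds, NULL, NULL, &idle) <= 0)
                break;

            if (!read_http_request(fd))       /* peer closed or sent garbage */
                break;
            write_http_reply(fd);             /* then loop for the next one  */
        }
        close(fd);
    }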
You don't have to do this, but you can advertise that you support persistent connections and pipelining by adding the following header to your replies:
Connection: Keep-Alive
Read this:
http://en.wikipedia.org/wiki/HTTP_persistent_connection
By the way, in practice there aren't huge advantages to persistent connections. The overhead of managing the handshake is very small compared to the time taken to read and write data to network sockets. There is some debate about the performance advantages of persistent connections. On the one hand under heavy load, keeping connections open means many fewer sockets on your system in TIME_WAIT. On the other hand, because you keep the socket open for 10 seconds, you'll have many more sockets open at any given time than you would in non-persistent mode.
If you're interested in improving the performance of a self-written server, the best thing you can do for the network "front end" is to implement an event-based socket management system. Look into libev and libevent.
I am in the middle of a multi-threaded TCP server design using the Berkeley sockets API under Linux, in system-independent C. The server has to perform I/O multiplexing, as it is a centralized controller that manages the clients (which maintain a persistent connection with the server forever, unless the machine a client runs on fails, etc.). The server needs to handle a minimum of 500 clients.
I have a 16-core machine, and what I want is to spawn 16 threads (one per core) plus a main thread. The main thread will listen() for connections and then dispatch each connection on the queue to a thread, which will then call accept() and use the select() syscall to perform I/O multiplexing. Now the problem is: how do I know when to dispatch a thread to call accept()? I mean, how do I find out in the main thread that there is a connection pending on the listening socket, so that I can assign a thread to handle it? All help much appreciated.
Thanks.
The listen() function call prepares a socket to accept incoming connections. You then use select() on that socket and get a notification that a new connection has arrived. You then call accept on the server socket and a new socket id will be returned. If you like you can then pass that socket id onto your thread.
What I would do is have a single thread for accepting connections and receiving data which then dispatches the data to a queue as a work item for processing.
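A minimal sketch of that main-thread loop, assuming a hypothetical dispatch_to_worker that hands the accepted descriptor to the thread pool:

    #include <sys/select.h>
    #include <sys/socket.h>

    /* Hypothetical hand-off to one of the worker threads. */
    extern void dispatch_to_worker(int client_fd);

    static void accept_loop(int listen_fd)
    {
        for (;;) {
            fd_set rfds;
            FD_ZERO(&rfds);
            FD_SET(listen_fd, &rfds);

            /* Blocks until at least one connection is pending in the
             * listen() backlog, i.e. accept() will not block. */
            if (select(listen_fd + 1, &rfds, NULL, NULL, NULL) < 0)
                continue;

            if (FD_ISSET(listen_fd, &rfds)) {
                int client_fd = accept(listen_fd, NULL, NULL);
                if (client_fd >= 0)
                    dispatch_to_worker(client_fd);
            }
        }
    }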
Note that if each of your 16 threads is going to be running select (or poll, or whatever) anyway, there is no problem with them all adding the server socket to their select sets.
More than one may wake when the server socket has an incoming connection, but only one will successfully call accept, so it should work.
Pro: easy to code.
Con:
- a naive implementation doesn't balance load (you would need, e.g., global stats on the number of accepted sockets handled by each thread, with high-load threads removing the server socket from their select sets)
- thundering-herd behaviour could be problematic at high accept rates
epoll or aio/asio. I suspect you got no replies to your earlier post because you didn't specify Linux when you asked for a scalable, high-performance solution. Asynchronous solutions on different OSes are implemented with substantial kernel support, and Linux AIO, Windows IOCP, etc. are different enough that 'system independent' does not really apply; nobody could give you an answer.
Now that you have narrowed the OS down to Linux, look up the appropriate asynchronous solutions.
I am currently experimenting with building an HTTP server. The server is multi-threaded: one listening thread using select(...) and four worker threads managed by a thread pool. I'm currently managing around 14k-16k requests per second with a document length of 70 bytes and a response time of 6-10 ms, on a Core i3 330M. But this is without keep-alive; any socket I serve is closed immediately when the work is done.
EDIT: The worker threads process 'jobs' that are dispatched when activity on a socket is detected, i.e. service requests. After a 'job' is completed, if there are no more 'jobs' we sleep until more 'jobs' get dispatched, or, if some are already available, we start processing one of them.
My problems started when I began trying to implement keep-alive support. With keep-alive activated I only manage 1.5k-2.2k requests per second with 100 open sockets. This number grows to around 12k with 1000 open sockets. In both cases the response time is somewhere around 60-90 ms. I find this quite odd, since my assumption was that requests per second should go up, not down, and response time should hopefully go down, but definitely not up.
I've tried several different strategies for fixing the low performance:
1. Call select(...)/pselect(...) with a timeout value so that we can rebuild our FD_SET structure and listen to any additional sockets that arrived after we blocked, and service any detected socket activity.
(aside from the low performance, there's also the problem of sockets being closed while we're blocking, resulting in select(...)/pselect(...) reporting bad file descriptor.)
2. Have one listening thread that only accept new connections and one keep-alive thread that is notified via a pipe of any new sockets that arrived after we blocked and any new socket activity, and rebuild the FD_SET.
(same additional problem here as in '1.').
3. select(...)/pselect(...) with a timeout, when new work is to be done, detach the linked-list entry for the socket that has activity, and add it back when the request has been serviced. Rebuilding the FD_SET will hopefully be faster. This way we also avoid trying to listen to any bad file descriptors.
4. Combined (2.) and (3.).
-. Probably a few more, but they escape me atm.
The keep-alive sockets are stored in a simple linked list whose add/remove methods are protected by a pthread_mutex lock; the function responsible for rebuilding the FD_SET also takes this lock.
I suspect that the constant locking/unlocking of the mutex is the main culprit here. I've tried to profile the problem, but neither gprof nor google-perftools has been very cooperative, either introducing extreme instability or plainly refusing to gather any data at all (this could be me not knowing how to use the tools properly, though). But removing the locks risks putting the linked list in an inconsistent state and will probably crash the program or put it into an infinite loop.
I've also suspected the select(...)/pselect(...) timeout when I've used it, but I'm pretty confident that this was not the problem since the low performance is maintained even without it.
I'm at a loss as to how I should handle keep-alive sockets, and I'm therefore wondering whether you have any suggestions for fixing the low performance, or any alternative methods I could use to support keep-alive sockets.
If you need any more information to be able to answer my question properly, don't hesitate to ask for it and I shall try my best to provide you with the necessary information and update the question with this new information.
Try to get rid of select completely. You can find some kind of event notification mechanism on every popular platform: kqueue/kevent on FreeBSD, epoll on Linux, etc. This way you do not need to rebuild the FD_SET and can add/remove watched fds at any time.
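A minimal epoll sketch (Linux) to illustrate the point: each socket is registered once with epoll_ctl and removed when it closes, so there is no per-iteration set rebuild. handle_request is a placeholder for your own request processing:

    #include <sys/epoll.h>
    #include <sys/socket.h>
    #include <unistd.h>

    /* Hypothetical handler: returns -1 when the connection is finished. */
    extern int handle_request(int fd);

    static void event_loop(int listen_fd)
    {
        int ep = epoll_create1(0);
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = listen_fd };
        epoll_ctl(ep, EPOLL_CTL_ADD, listen_fd, &ev);

        struct epoll_event events[64];
        for (;;) {
            int n = epoll_wait(ep, events, 64, -1);
            for (int i = 0; i < n; i++) {
                int fd = events[i].data.fd;
                if (fd == listen_fd) {
                    int c = accept(listen_fd, NULL, NULL);
                    if (c >= 0) {
                        struct epoll_event cev = { .events = EPOLLIN, .data.fd = c };
                        epoll_ctl(ep, EPOLL_CTL_ADD, c, &cev);  /* registered once */
                    }
                } else if (handle_request(fd) < 0) {
                    epoll_ctl(ep, EPOLL_CTL_DEL, fd, NULL);     /* no set rebuild */
                    close(fd);
                }
            }
        }
    }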
The time increase will be more visible when the client uses your socket for more than one request. If you are merely opening and closing, yet still telling the client to keep alive, then you have the same scenario as you did without keep-alive, but now with the overhead of the sockets sticking around.
If, however, you are using the sockets multiple times from the same client for multiple requests, then you will lose the TCP connection overhead and gain performance that way.
Make sure your client is using keep-alive properly, and look for a better way to get notification of the sockets' state and data. Perhaps a poll device, or queuing the requests.
http://www.techrepublic.com/article/using-the-select-and-poll-methods/1044098
This page has a patch for Linux that adds a poll device. With some understanding of how it works, you could use the same technique in your application rather than rely on a device that may not be installed.
There are many alternatives:
Use processes instead of threads, and pass file descriptors via Unix sockets (see the sketch after this list).
Maintain per-thread lists of sockets. You could even accept() directly on the worker threads.
etc...
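For the first alternative, the accepted descriptor can be handed to a worker process over a Unix-domain socket with an SCM_RIGHTS control message. A sketch of the sending side (the receiving side mirrors it with recvmsg and a matching control buffer):

    #include <string.h>
    #include <sys/socket.h>
    #include <sys/uio.h>

    /* Send an accepted socket to a worker process over a Unix-domain
     * socket (e.g. one end of socketpair(AF_UNIX, SOCK_STREAM, 0, sv)).
     * Returns 0 on success, -1 on failure. */
    static int send_fd(int unix_sock, int fd_to_send)
    {
        char dummy = '\0';                    /* must carry at least one byte */
        struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
        char cbuf[CMSG_SPACE(sizeof(int))];

        struct msghdr msg;
        memset(&msg, 0, sizeof(msg));
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;
        msg.msg_control = cbuf;
        msg.msg_controllen = sizeof(cbuf);

        struct cmsghdr *cm = CMSG_FIRSTHDR(&msg);
        cm->cmsg_level = SOL_SOCKET;
        cm->cmsg_type = SCM_RIGHTS;
        cm->cmsg_len = CMSG_LEN(sizeof(int));
        memcpy(CMSG_DATA(cm), &fd_to_send, sizeof(int));

        return sendmsg(unix_sock, &msg, 0) == 1 ? 0 : -1;
    }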
Are your test clients reusing the sockets? Are they handling keep-alive correctly?
I could see a case where you make the minimum possible change to your benchmarking code by just passing the keep-alive header, but then do not change the client code, so the socket is still closed at the client end once the payload is received.
This would incur all the costs of keep-alive with none of the benefits.
What you are trying to do has been done before. Consider reading about the Leader-Follower network server pattern, http://www.kircher-schwanninger.de/michael/publications/lf.pdf
I'm learning about TCP/IP and am trying to use it to execute different commands on my server.
I thought I'd start small and build up. I've got an example running in which a server and a client connect, and then the server sends the current time to the client.
Now I want to make it so that the server can handle multiple clients.
How can I do this? I think I could use fork, but is there a way to do it without having multiple processes to worry about?
Are there any good primers on this sort of thing, or could you provide some instructions on how to modify my existing code?
Thanks,
The glibc manual has a nice example. The missing code bits can be found earlier in the chapter. The nice thing about the example is that you do not need multiple threads.
I would recommend the use of threads:
One server thread has the sole purpose of listening at the server socket for incoming connections. Once a connection is received, it is passed off to a worker thread, while the server keeps listening.
One or more worker threads. These threads will do the majority of the work. You can choose to use one thread per socket, or you can use the select function to allow one thread to handle multiple sockets.
I don't know any primers off the top of my head, sorry.
Take a look at Erik's answer on this other question. You don't really need to do multithreading.
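If you want to stay in a single process, a sketch of the time server extended with select to juggle several clients could look like this (listen_fd is assumed to already be bound and listening):

    #include <string.h>
    #include <sys/select.h>
    #include <sys/socket.h>
    #include <time.h>
    #include <unistd.h>

    static void time_server(int listen_fd)
    {
        fd_set all;
        FD_ZERO(&all);
        FD_SET(listen_fd, &all);
        int maxfd = listen_fd;

        for (;;) {
            fd_set ready = all;                 /* select() modifies its copy */
            if (select(maxfd + 1, &ready, NULL, NULL, NULL) < 0)
                continue;

            for (int fd = 0; fd <= maxfd; fd++) {
                if (!FD_ISSET(fd, &ready))
                    continue;
                if (fd == listen_fd) {          /* new client connecting */
                    int c = accept(listen_fd, NULL, NULL);
                    if (c >= 0) {
                        FD_SET(c, &all);
                        if (c > maxfd)
                            maxfd = c;
                    }
                } else {                        /* existing client sent something */
                    char buf[128];
                    ssize_t n = recv(fd, buf, sizeof(buf), 0);
                    if (n <= 0) {               /* client closed or errored */
                        FD_CLR(fd, &all);
                        close(fd);
                    } else {
                        time_t now = time(NULL);
                        const char *s = ctime(&now);
                        send(fd, s, strlen(s), 0);
                    }
                }
            }
        }
    }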
I'd appreciate it if anyone can help me find a better solution...
In my application there is a TCP client (C) and a TCP server (S) on a Linux machine.
In the production environment, under high load, the server sometimes stops receiving requests from the client, creating a bottleneck for the client since the client side uses a blocking socket. To recreate the problem locally, I apply load and stop the server under GDB; this way the problem is reproduced.
Can anyone suggest some other mechanism to block the socket without disturbing the process?
What exactly would you like to hear? If the server is busy, that is, other processes are being serviced because they too get a share of the timeslices from the scheduler, there is not much you can do except raise your program's priority/timeslice length or lower theirs.
Note that TCP implementations generally use a socket buffer so that some transfers can continue to happen while a process is currently busy dealing with data, or while waiting for the next timeslice.
Do you have some code to show?
Can anyone suggest some other mechanism to block the socket without disturbing the process?
Do the connection in a separate thread so that you do not block the whole process.
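A minimal pthread sketch of that idea, where talk_to_server is a hypothetical routine doing the blocking connect/send/receive work for one peer:

    #include <pthread.h>
    #include <stdlib.h>

    /* Hypothetical blocking routine: connect, send, receive for one peer. */
    extern void talk_to_server(int peer_index);

    static void *connection_thread(void *arg)
    {
        int idx = *(int *)arg;
        free(arg);
        talk_to_server(idx);         /* may block, but only this thread waits */
        return NULL;
    }

    static int start_connection(int peer_index)
    {
        int *arg = malloc(sizeof(int));
        if (arg == NULL)
            return -1;
        *arg = peer_index;

        pthread_t tid;
        if (pthread_create(&tid, NULL, connection_thread, arg) != 0) {
            free(arg);
            return -1;
        }
        return pthread_detach(tid);  /* detached: no join required */
    }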
I didn't really get the point, but I guess you have implemented a blocking TCP server.
If so, there are a few things that may help:
Use multithreading or event-driven architecture to improve I/O efficiency.
Separate the I/O code from the rest of the logic.
Multiprocessing may be required to avoid per-process system limits, such as the limit on the number of open files.