HTTP Persistent connection - c

While implementing a simple HTTP server in C using the Linux socket interface, I have run into difficulties with one feature I'd like it to have: persistent connections. It is relatively easy to send one file per TCP connection, but that doesn't seem to be a very efficient solution (considering the multiple handshakes, for instance). The server should handle several requests (HTML, CSS, images) over one TCP connection. Could you give me some clues on how to approach the problem?

It is pretty easy - just don't close the TCP connection after you write the reply.
There are two ways to do this: pipelined and non-pipelined.
In a non-pipelined implementation you read one HTTP request from the socket, process it, write the response back to the socket, and then try to read another one. Keep doing that until the remote party closes the socket, or close it yourself once no request has arrived on it for about 10 seconds.
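The non-pipelined loop described above can be sketched roughly as follows. This is a minimal sketch, not a full HTTP implementation: serve_connection is a hypothetical name, the reply is a fixed two-byte body, request bodies are ignored, and a request head is assumed to fit in 4 KB. MSG_NOSIGNAL assumes Linux.

```c
#include <assert.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Serve requests on one connected socket until the peer closes it, an error
 * occurs, or no data arrives for idle_timeout_ms. Returns the number of
 * requests answered. The caller closes fd afterwards. (Hypothetical helper.) */
int serve_connection(int fd, int idle_timeout_ms)
{
    static const char reply[] =
        "HTTP/1.1 200 OK\r\n"
        "Content-Length: 2\r\n"
        "Connection: keep-alive\r\n"
        "\r\n"
        "ok";
    char buf[4096];
    size_t len = 0;
    int served = 0;

    assert(fd >= 0);
    for (;;) {
        char *end;
        buf[len] = '\0';
        /* Answer every complete request head already in the buffer. */
        while ((end = strstr(buf, "\r\n\r\n")) != NULL) {
            /* A real server would parse the request and loop on short sends. */
            if (send(fd, reply, sizeof reply - 1, MSG_NOSIGNAL) < 0)
                return served;
            served++;
            end += 4;
            len -= (size_t)(end - buf);
            memmove(buf, end, len);   /* keep any bytes of the next request */
            buf[len] = '\0';
        }
        if (len == sizeof buf - 1)
            return served;            /* request head too large for this sketch */

        /* Wait for the next request, but give up after the idle timeout. */
        struct pollfd p = { .fd = fd, .events = POLLIN };
        if (poll(&p, 1, idle_timeout_ms) <= 0)
            return served;            /* idle timeout or poll error */

        ssize_t n = recv(fd, buf + len, sizeof buf - 1 - len, 0);
        if (n <= 0)
            return served;            /* peer closed the socket, or error */
        len += (size_t)n;
    }
}
```

A caller would accept() a connection, call serve_connection(client_fd, 10000) for the 10-second idle limit mentioned above, and then close(client_fd).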
In a pipelined implementation, read as many requests as are on the socket, process them all in parallel, and then write the responses back out on the socket in the same order as you received the requests. You have one thread reading requests in all the time, and another one writing responses out again.
You don't have to do it, but you can advertise that you support persistent connections by adding the following header to your replies (HTTP/1.1 connections are persistent by default, so the header matters mainly to HTTP/1.0 clients):
Connection: Keep-Alive
Read this:
http://en.wikipedia.org/wiki/HTTP_persistent_connection
By the way, in practice there aren't huge advantages to persistent connections. The overhead of managing the handshake is very small compared to the time taken to read and write data to network sockets. There is some debate about the performance advantages of persistent connections. On the one hand under heavy load, keeping connections open means many fewer sockets on your system in TIME_WAIT. On the other hand, because you keep the socket open for 10 seconds, you'll have many more sockets open at any given time than you would in non-persistent mode.
If you're interested in improving the performance of a self-written server, the best thing you can do for the network "front-end" is to implement an event-based socket management system. Look into libev and libevent.

Related

Minimising client processing - c socket programming

I am working on a client/server model based on Berkeley sockets and have almost finished but I'm stuck with a way to know that all of the data has been received whilst minimising the processing being executed on the client side.
The client I am working with has very little memory and battery and is to be deployed in remote conditions. This means that wherever possible I am trying to avoid processing (and therefore battery loss) on the client side. The following conditions on the client are outside of my control:
The client sends its data 1056 bytes at a time until it has run out of data to send (I have no idea where the number 1056 came from, but if you think you know I would be very interested)
The client is very unpredictable in when it will send the data (it is attached to a wild animal and sends data determined by connection strength and battery life)
The client has an unknown amount of data to send at any given time
The data is transmitted through a GRSM enabled phone tag (Not sure that this is relevant but I'm assuming that extra information could only help)
(I am emulating the data I am expecting to receive from the client through localhost, if it seems to work I will ask the company where I am interning to invest in a static ip address to allow "real" tcp transfers, if it doesn't I won't. I don't think this is relevant but, again, I would rather provide too much information than too little)
At the moment I am using a while loop and incrementing the number of bytes received in order to recv() each of the 1056-byte sections. My problem is that the server needs to receive an unknown number of these. To me, the most obvious solutions are to send the number of sections in an initial header from the client, or to mark the last section in some way. However, both of these approaches would require processing on the client side. Is there a way to check from the server side whether the client has closed its socket? Or would something like closing the connection from the server after a predetermined period without data from the client be feasible? If neither is possible, I would love to hear any other suggestions.
TLDR: What condition can I use here to minimise client-side processing?
while(!(/* Client has ran out of data to send*/)) {
receive1056Section();
}
Also, I know that it is bad practise to make a Stack Overflow account and immediately ask a question. I didn't know what else to do, I'm sorry. Please don't hesitate to be mean if I've missed something very obvious.
Here is a suggestion for how to do the interaction:
The client:
Client connects to server via tcp.
Client sends chunks of data until all data has been sent. Flush the send buffer after each chunk.
When it is done the client issues a shutdown on the socket, sleeps for a couple of seconds and then closes the connection.
The client then sleeps until the next transmission. If the transmission was unsuccessful, the sleep time should be shorter to prevent unsent data from overflowing the available memory.
If the client is unable to connect for an extended period of time, you would have to discard data that doesn't fit in the memory.
I am assuming that sleep reduces power consumption.
The server:
The server program can be single-threaded unless you need massive scalability. It listens for incoming connections on the agreed port.
Whenever a client connects, a new socket is created.
Use select() to see which sockets have data (don't forget to include the listening socket!), and non-blocking reads to read from the sockets.
When a read returns 0 (no more data to read and the other side has shut down its side of the connection), you can close that socket.
This should work fine up to a couple of thousand simultaneous connections.
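As a rough illustration of the server side for a single connection, the sketch below waits with select() and reads 1056-byte chunks until recv() returns 0 or the client has been silent for idle_sec seconds. The names receive_all and rx_end are hypothetical, the received data is only counted rather than stored, and EINTR handling is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>

enum rx_end { RX_PEER_CLOSED, RX_IDLE_TIMEOUT, RX_ERROR };

/* Receive from fd until the client shuts down its side, goes silent for
 * idle_sec seconds, or an error occurs. *total gets the byte count. */
enum rx_end receive_all(int fd, int idle_sec, size_t *total)
{
    char chunk[1056];   /* chunk size taken from the question */

    assert(fd >= 0 && fd < FD_SETSIZE && total != NULL);
    *total = 0;
    for (;;) {
        fd_set rd;
        struct timeval tv = { .tv_sec = idle_sec, .tv_usec = 0 };

        FD_ZERO(&rd);
        FD_SET(fd, &rd);
        int ready = select(fd + 1, &rd, NULL, NULL, &tv);
        if (ready == 0)
            return RX_IDLE_TIMEOUT;   /* nothing arrived within idle_sec */
        if (ready < 0)
            return RX_ERROR;

        ssize_t n = recv(fd, chunk, sizeof chunk, 0);
        if (n == 0)
            return RX_PEER_CLOSED;    /* client called shutdown() or close() */
        if (n < 0)
            return RX_ERROR;
        *total += (size_t)n;          /* a real server would store chunk[0..n) */
    }
}
```

On the client, calling shutdown(fd, SHUT_WR) after the last chunk is what makes recv() return 0 here, so the client needs no length header or end marker. TCP does not preserve chunk boundaries, so a single recv() may return less than 1056 bytes.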
Example that handles many of the difficulties of implementing a server

No threads and blocking sockets - is it possible to handle several connections?

I have a program that needs to:
Handle 20 connections. My program will act as client in every connection, each client connecting to a different server.
Once connected my client should send a request to the server every second and wait for a response. If no request is sent within 9 seconds, the server will time out the client.
It is unacceptable for one connection to cause problems for the rest of the connections.
I do not have access to threads and I do not have access to non-blocking sockets. I have a single-threaded program with blocking sockets.
Edit: The reason I cannot use threads and non blocking sockets is that I am on a non-standard system. I have a single RTOS(Real-Time Operating System) task available.
To solve this, use of select is necessary but I am not sure if it is sufficient.
Initially I connect to all servers. But select can only be used to see if a read or write will block or not, not if a connect will.
So when I have connected to, say, 2 servers and both connections are waiting to be served, what if the 3rd server does not respond? The connect will block, causing the first 2 connections to time out as well.
Can this be solved?
I think the connection issue can be solved by setting a timeout for the connect operation, so that it will fail fast enough. Of course that will limit you if the network really is working but you have a very long (slow) path to some of the server(s). That's bad design, but your requirements are pretty harsh.
See this answer for details on connection-timeouts.
It seems you need to isolate the connections. Well, if you cannot use threads you can always resort to good-old-processes.
Spawn each client by forking your server process and use traditional IPC mechanisms if communication between them is required.
If you cannot use a multiprocess approach either, I'm afraid you'll have a hard time doing that.

Send same info to multiple threads/sockets?

I am writing a server application that simply connects a local serial port to multiple network connected clients. I am using linux and C for the server application because the equipment for the program is a router with limited memory.
I have everything setup for multiple clients to connect and send data to the serial port using a fork() process for each connection.
My problem lies in getting data incoming on the serial port out to the multiple (varying number of) client connections. I need a way for each active socket to get all of the incoming data, and to get it only once. Any help?
Sounds like you need a data queue (buffer) for each connected client. Each time data comes in on the port, you post it to the back of each client's queue. The clients then read the data from the front of their respective queues. Since all the clients will probably read at different rates/times, this will ensure all of them get a copy of the data only once, and you won't get hung up waiting for any one client while more data comes in. Of course, you'll need to allocate a certain amount of memory for each connected client's queue (I'm not sure how many clients you're expecting, and you did say your available memory is limited), and you need to consider what to do if a queue gets full before the client reads all of it.
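A per-client queue of this kind could look roughly like the fixed-size ring buffer below. It is only a sketch: client_queue, queue_push, queue_pop and broadcast are hypothetical names, QUEUE_CAP is an arbitrary size, and the policy for a full queue (drop the bytes that do not fit) is just one of the options mentioned above. It also assumes all queues live in one process, which is not the case with the fork()-per-connection design from the question unless shared memory is used.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAP 256   /* bytes buffered per client; an arbitrary choice */

struct client_queue {
    unsigned char data[QUEUE_CAP];
    size_t head;        /* index of the oldest unread byte */
    size_t len;         /* number of unread bytes */
};

/* Append as much of src as fits; returns the number of bytes accepted.
 * Bytes that do not fit are dropped (one possible full-queue policy). */
size_t queue_push(struct client_queue *q, const unsigned char *src, size_t n)
{
    assert(q != NULL && src != NULL);
    size_t space = QUEUE_CAP - q->len;
    if (n > space)
        n = space;
    for (size_t i = 0; i < n; i++)
        q->data[(q->head + q->len + i) % QUEUE_CAP] = src[i];
    q->len += n;
    return n;
}

/* Remove up to n bytes from the front of the queue into dst. */
size_t queue_pop(struct client_queue *q, unsigned char *dst, size_t n)
{
    assert(q != NULL && dst != NULL);
    if (n > q->len)
        n = q->len;
    for (size_t i = 0; i < n; i++)
        dst[i] = q->data[(q->head + i) % QUEUE_CAP];
    q->head = (q->head + n) % QUEUE_CAP;
    q->len -= n;
    return n;
}

/* Post one block of serial data to the back of every client's queue. */
void broadcast(struct client_queue *qs, size_t nclients,
               const unsigned char *src, size_t n)
{
    for (size_t i = 0; i < nclients; i++)
        (void)queue_push(&qs[i], src, n);
}
```

Each client then drains its own queue at its own pace with queue_pop, so a slow client never holds up the serial port or the other clients.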
Presumably you keep a list or some other reference of/to connected clients, why not just loop over that for each bit of information and send it to all of them?
A thread per socket design might not be the best way to solve this. An event driven asynchronous approach should be a much better fit. However, if you must do it with threads, and given that serial ports are slow anyway, building a pipe between the thread listening to the serial port and all the threads talking to the network clients is the most practical. You could do fancy things with rwlocks to move the data, but you'll still need a way for the network threads to wait on both the socket and the data from the serial port, so you need to use file descriptors for both and something like poll.
But seriously, this would likely be much easier and would perform better without the threads. Think of it as a main loop which waits on poll which is watching the network and the serial port, determines which event occurred, and distributes data accordingly. It should be easier all around once you get the idea.
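One pass of such a main loop might look like the sketch below, which a server would call repeatedly. It assumes the serial port is already open as a file descriptor and the clients are connected sockets; relay_once and MAX_CLIENTS are hypothetical, writes are blocking, and short or failed writes are simply not counted. MSG_NOSIGNAL assumes Linux.

```c
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_CLIENTS 16   /* arbitrary limit for this sketch */

/* Wait for activity on the serial port or any client, then move the data:
 * serial input goes to every client, client input goes to the serial port.
 * Returns the number of bytes delivered, 0 on timeout, -1 on poll error. */
long relay_once(int serial_fd, const int *clients, int nclients, int timeout_ms)
{
    struct pollfd pfds[MAX_CLIENTS + 1];
    unsigned char buf[512];
    long moved = 0;

    assert(nclients >= 0 && nclients <= MAX_CLIENTS);
    pfds[0].fd = serial_fd;
    pfds[0].events = POLLIN;
    for (int i = 0; i < nclients; i++) {
        pfds[i + 1].fd = clients[i];
        pfds[i + 1].events = POLLIN;
    }

    int ready = poll(pfds, (nfds_t)nclients + 1, timeout_ms);
    if (ready <= 0)
        return ready;                 /* 0 = timeout, -1 = error */

    if (pfds[0].revents & POLLIN) {   /* serial data: copy to every client */
        ssize_t n = read(serial_fd, buf, sizeof buf);
        for (int i = 0; n > 0 && i < nclients; i++)
            if (send(clients[i], buf, (size_t)n, MSG_NOSIGNAL) == n)
                moved += n;
    }
    for (int i = 0; i < nclients; i++) {   /* client data: copy to serial */
        if (pfds[i + 1].revents & POLLIN) {
            ssize_t n = recv(clients[i], buf, sizeof buf, 0);
            if (n > 0 && write(serial_fd, buf, (size_t)n) == n)
                moved += n;
        }
    }
    return moved;
}
```

A fuller version would also watch the listening socket in the same poll() call and remove clients whose recv() returns 0.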

Problem supporting keep-alive sockets on a home-grown http server

I am currently experimenting with building an HTTP server. The server is multi-threaded, with one listening thread using select(...) and four worker threads managed by a thread pool. I'm currently managing around 14k-16k requests per second with a document length of 70 bytes and a response time of 6-10ms, on a Core i3 330M. But this is without keep-alive: I immediately close every socket I serve when the work is done.
EDIT: The worker threads process 'jobs' that are dispatched when activity on a socket is detected, i.e. service requests. After a 'job' is completed, if there are no more 'jobs', we sleep until more 'jobs' get dispatched; if some are already available, we start processing one of them.
My problems started when I began to try to implement keep-alive support. With keep-alive activated I only manage 1.5k-2.2k requests per second with 100 open sockets. This number grows to around 12k with 1000 open sockets. In both cases the response time is somewhere around 60-90ms. I feel that this is quite odd, since my current assumptions say that requests per second should go up, not down, and response time should hopefully go down, but definitely not up.
I've tried several different strategies for fixing the low performance:
1. Call select(...)/pselect(...) with a timeout value so that we can rebuild our FD_SET structure and listen to any additional sockets that arrived after we blocked, and service any detected socket activity.
(aside from the low performance, there's also the problem of sockets being closed while we're blocking, resulting in select(...)/pselect(...) reporting bad file descriptor.)
2. Have one listening thread that only accept new connections and one keep-alive thread that is notified via a pipe of any new sockets that arrived after we blocked and any new socket activity, and rebuild the FD_SET.
(same additional problem here as in '1.').
3. select(...)/pselect(...) with a timeout, when new work is to be done, detach the linked-list entry for the socket that has activity, and add it back when the request has been serviced. Rebuilding the FD_SET will hopefully be faster. This way we also avoid trying to listen to any bad file descriptors.
4. Combined (2.) and (3.).
-. Probably a few more, but they escape me atm.
The keep-alive sockets are stored in a simple linked List, whose add/remove methods are surrounded by a pthread_mutex lock, the function responsible for rebuilding the FD_SET also has this lock.
I suspect that the constant locking/unlocking of the mutex is the main culprit here. I've tried to profile the problem, but neither gprof nor google-perftools has been very cooperative, either introducing extreme instability or plainly refusing to gather any data at all (this could be me not knowing how to use the tools properly, though). But removing the locks risks putting the linked list in a non-sane state and would probably crash the program or put it into an infinite loop.
I've also suspected the select(...)/pselect(...) timeout when I've used it, but I'm pretty confident that this was not the problem since the low performance is maintained even without it.
I'm at a loss as to how I should handle keep-alive sockets, and I'm therefore wondering if you people out there have any suggestions on how to fix the low performance, or on any alternative methods I can use to support keep-alive sockets.
If you need any more information to be able to answer my question properly, don't hesitate to ask for it and I shall try my best to provide you with the necessary information and update the question with this new information.
Try to get rid of select completely. You can find some kind of event notification on every popular platform: kqueue/kevent on FreeBSD, epoll on Linux, etc. This way you do not need to rebuild the FD_SET and can add/remove watched fds at any time.
The performance gain will be more visible when the client uses your socket for more than one request. If you are merely opening and closing yet still telling the client to keep alive then you have the same scenario as you did without keepalive. But now you have the overhead of the sockets sticking around.
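On Linux, the epoll calls involved are roughly the ones sketched below. The three wrapper names are hypothetical, only readability (EPOLLIN) is watched, and one event is fetched per call to keep the example short; a real server would fetch a batch of events.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Start watching fd for readability. No fd set has to be rebuilt. */
int watch_fd(int epfd, int fd)
{
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
    assert(epfd >= 0 && fd >= 0);
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Stop watching fd and close it, e.g. after the keep-alive timeout. */
int drop_fd(int epfd, int fd)
{
    int rc = epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL);
    close(fd);
    return rc;
}

/* Return one readable fd, or -1 on timeout or error. */
int next_ready_fd(int epfd, int timeout_ms)
{
    struct epoll_event ev;
    int n = epoll_wait(epfd, &ev, 1, timeout_ms);
    return n == 1 ? ev.data.fd : -1;
}
```

The epoll instance itself comes from epoll_create1(0). The kernel holds the set of watched descriptors, so the accepting thread can call watch_fd while another thread is blocked in epoll_wait.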
If however you are using the sockets multiple times from the same client for multiple requests then you will lose the TCP connection overhead and gain performance that way.
Make sure your client is using keepalive properly. You will also likely want a better way to get notified of socket state and data, perhaps a poll device or queuing the requests.
http://www.techrepublic.com/article/using-the-select-and-poll-methods/1044098
This page has a patch for Linux to handle a poll device. With some understanding of how it works, you can use the same technique in your application rather than rely on a device that may not be installed.
There are many alternatives:
Use processes instead of threads, and pass file descriptors via Unix sockets.
Maintain per-thread lists of sockets. You could even accept() directly on the worker threads.
etc...
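For the first alternative, passing a file descriptor over a Unix domain socket uses sendmsg()/recvmsg() with an SCM_RIGHTS control message. The sketch below follows the usual pattern; send_fd and recv_fd are hypothetical helper names, and chan is assumed to be an already connected AF_UNIX socket between the two processes.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Send the open descriptor fd over the Unix socket chan. 0 on success. */
int send_fd(int chan, int fd)
{
    char dummy = 'F';   /* at least one byte of ordinary data must be sent */
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union { struct cmsghdr align; char buf[CMSG_SPACE(sizeof(int))]; } u;
    struct msghdr msg;

    assert(chan >= 0 && fd >= 0);
    memset(&msg, 0, sizeof msg);
    memset(&u, 0, sizeof u);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof u.buf;

    struct cmsghdr *c = CMSG_FIRSTHDR(&msg);
    c->cmsg_level = SOL_SOCKET;
    c->cmsg_type = SCM_RIGHTS;
    c->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(c), &fd, sizeof(int));

    return sendmsg(chan, &msg, 0) == 1 ? 0 : -1;
}

/* Receive a descriptor sent with send_fd. Returns the new fd, or -1. */
int recv_fd(int chan)
{
    char dummy;
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union { struct cmsghdr align; char buf[CMSG_SPACE(sizeof(int))]; } u;
    struct msghdr msg;
    int fd;

    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof u.buf;

    if (recvmsg(chan, &msg, 0) != 1)
        return -1;
    struct cmsghdr *c = CMSG_FIRSTHDR(&msg);
    if (c == NULL || c->cmsg_level != SOL_SOCKET || c->cmsg_type != SCM_RIGHTS)
        return -1;
    memcpy(&fd, CMSG_DATA(c), sizeof fd);
    return fd;
}
```

The receiver gets its own descriptor number referring to the same open socket, so the sender can close its copy once send_fd has returned.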
Are your test clients reusing the sockets? Are they correctly handling keep alive?
I could see the case where you make the minimum possible change to your benchmarking code by just passing the keep-alive header, without changing the code that closes the socket at the client end once the payload is received.
This would incur all the costs of keep-alive with none of the benefits.
What you are trying to do has been done before. Consider reading about the Leader-Follower network server pattern, http://www.kircher-schwanninger.de/michael/publications/lf.pdf

Best way to pass data between two servers in C?

I wrote a program that creates a TCP and UDP socket in C and starts both servers up. The goal of the application is to monitor requests over the TCP socket as to what UDP packets to send it (i.e. monitor for something like "0x01 0x02" and if I see it, then have the UDP server parse the payload, and forward it over to the TCP server for processing). The problem is, the UDP server will be busy keeping another device up, literally sending thousands of packets back and forth with this device. So what is the best way to continuously monitor requests from the TCP server, but send it certain payloads from the UDP server when requested since the UDP server will be busy?
I looked into pthreads with semaphores and/or mutex (not sure all the socket operations are thread safe, though, and if this is the right way to approach it) as well as fork / pipe. Forking the UDP server off as a child process seems easy enough, but I don't see exactly how I would be passing the kind of data I need among both servers (need request data from TCP and payload data from the UDP).
Firstly, would it make sense to put these two servers into one program? If so, you won't have to communicate between processes, and the whole logic becomes substantially easier. You will have to think about doing asynchronous input and output, and the select() function is designed for just this. There will be many explanations around on how to do this, and a quick look finds this page.
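With both servers in one program, the core of the select() approach is a single wait on both descriptors, roughly as sketched here. wait_tcp_or_udp and the two flag names are hypothetical, and the sketch does not distinguish a timeout from a select() error.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>   /* for the socket descriptors being watched */

enum { TCP_READY = 1, UDP_READY = 2 };

/* Block until the TCP or the UDP socket is readable, or until timeout_ms
 * has passed. Returns a bitmask of TCP_READY / UDP_READY, or 0 on timeout
 * or error. */
int wait_tcp_or_udp(int tcp_fd, int udp_fd, long timeout_ms)
{
    fd_set rd;
    struct timeval tv = {
        .tv_sec = timeout_ms / 1000,
        .tv_usec = (timeout_ms % 1000) * 1000
    };
    int maxfd = tcp_fd > udp_fd ? tcp_fd : udp_fd;

    assert(tcp_fd >= 0 && udp_fd >= 0 && maxfd < FD_SETSIZE);
    FD_ZERO(&rd);
    FD_SET(tcp_fd, &rd);
    FD_SET(udp_fd, &rd);

    if (select(maxfd + 1, &rd, NULL, NULL, &tv) <= 0)
        return 0;
    return (FD_ISSET(tcp_fd, &rd) ? TCP_READY : 0) |
           (FD_ISSET(udp_fd, &rd) ? UDP_READY : 0);
}
```

The main loop would call this repeatedly, handle the busy UDP traffic whenever UDP_READY is set, and read a new request from the TCP side whenever TCP_READY is set, with no threads or locking involved.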
However, if you must have two separate processes, then you will need to choose a mechanism for inter-process communication, of which there are several, and your choice will be affected by your operating system. A pipe, if available, might be suitable, as might a Unix named pipe. Or you could look into third-party message passing frameworks, or just use shared memory and/or semaphores (but be very careful!).
What you should look at is libevent; with anything else you are reinventing the wheel by writing this low-level code yourself. Here is a Tutorial, Google, Krugle
Also, you should use some predefined protocol between the servers. There are lots to choose from, ranging from the extremely simple XDR to Protocol Buffers.
You could use pipes on Unix. See http://tldp.org/LDP/lpg/node11.html
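As a small illustration of the fork / pipe idea, the sketch below forks a child that writes one payload into a pipe and lets the parent read it back. collect_from_child is a hypothetical helper; in the scenario from the question the child would be the UDP server and the payload would come from the device rather than from a string argument.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that writes payload into a pipe; the parent reads it into
 * out (NUL-terminated, at most cap - 1 bytes). Returns bytes read or -1. */
ssize_t collect_from_child(const char *payload, char *out, size_t cap)
{
    int p[2];

    assert(payload != NULL && out != NULL && cap > 0);
    if (pipe(p) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    if (pid == 0) {                   /* child: writer side */
        close(p[0]);
        ssize_t w = write(p[1], payload, strlen(payload));
        _exit(w < 0 ? 1 : 0);
    }

    close(p[1]);                      /* parent: reader side */
    size_t len = 0;
    ssize_t n;
    while (len < cap - 1 && (n = read(p[0], out + len, cap - 1 - len)) > 0)
        len += (size_t)n;             /* read() returns 0 once the child exits */
    out[len] = '\0';
    close(p[0]);
    waitpid(pid, NULL, 0);
    return (ssize_t)len;
}
```

A pipe carries data in one direction only, so sending requests from the TCP side to the UDP child as well would need a second pipe or a socketpair().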
Well, you certainly picked an interesting introduction to C!
You might try shared memory. What OS?
