This means, for example, a module can start compressing the response from a backend server and stream it to the client before the module has received the entire response from the backend.
Nice!
I know it's some kind of asynchronous I/O, but an answer that simple isn't enough for me.
Does anyone know?
Without looking at the source code of an actual implementation, I'm speculating here:
It's most likely some kind of stream (abstract buffered IO) that is passed from one module to the other ("chaining"). One module (maybe a servlet container) writes to a stream that is read by another module (the compression module in your example), which then writes its output to another stream. The contents of that stream may then be processed further or transmitted to the client.
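To make the chaining idea concrete, here is a tiny hypothetical sketch in C. Again, I'm speculating: the filter type and function names are invented purely for illustration and bear no relation to any real module API.

    #include <stddef.h>

    /* Hypothetical filter chain: each module processes a chunk and hands the
     * result to the next module, so data flows through as it arrives. */
    typedef struct filter filter_t;
    typedef int (*filter_fn)(filter_t *self, const char *buf, size_t len);

    struct filter {
        filter_fn  process;  /* this module's work, e.g. compress the chunk  */
        filter_t  *next;     /* downstream module, or NULL = write to client */
    };

    /* A compression module would transform `buf` and immediately forward the
     * compressed bytes, without waiting for the whole response. */
    static int forward_chunk(filter_t *self, const char *buf, size_t len)
    {
        if (self->next)
            return self->next->process(self->next, buf, len);
        return 0; /* end of chain: this is where the client socket write goes */
    }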
The backend may need to wait on IO before it can fully produce the page. Modules can begin compressing the start of the message before the backend is entirely done writing it.
To understand why this is useful, you need to understand how nginx is structured. nginx is a server that relies on non-blocking input and output. Normally, a server uses blocking input and output: it listens for a connection, and when a connection comes in, it processes the page. In order to increase throughput, multiple threads are spawned, called 'workers'.
Contrast this with nginx: it continually asks the kernel, "Are any of my I/O requests ready?" This allows it to handle the same number of pages with 1) less overhead from all the different processes, and 2) lower memory usage. It has some downsides, however. For extremely low-volume applications, nginx may use more CPU than a blocking server. Second, it's much less portable: Windows uses an entirely different model for non-blocking I/O.
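As a rough illustration of that readiness loop, here is a minimal epoll sketch (Linux-specific, error handling omitted; nginx itself abstracts over epoll, kqueue and friends, and as noted Windows works differently):

    #include <sys/epoll.h>

    /* Minimal readiness loop: assumes listen_fd is a non-blocking listening
     * socket that has already been created, bound and set listening. */
    void event_loop(int listen_fd)
    {
        int epfd = epoll_create1(0);
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = listen_fd };
        epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);

        struct epoll_event ready[64];
        for (;;) {
            /* "Are any of my I/O requests ready?" -- sleeps until one is. */
            int n = epoll_wait(epfd, ready, 64, -1);
            for (int i = 0; i < n; i++) {
                if (ready[i].data.fd == listen_fd) {
                    /* accept() the new connection, make it non-blocking,
                     * and add it to epfd with EPOLL_CTL_ADD. */
                } else {
                    /* read/write whatever is ready without blocking. */
                }
            }
        }
    }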
Getting back to your original question, compressing the beginning of a page as soon as it arrives is useful because that part can already be compressed and on its way to the client while the backend is still accessing a database or reading from a disk or what-have-you for the rest of the page.
I am writing a data logging application which reads some values from an external device and saves them to a file periodically. Also, I would like for the application to have a server component that would make current readings accessible over TCP/IP.
The application is (being) written in C in a unix-like environment.
I am not sure whether the server should run as a separate process (fork itself away after start) and use some IPC to obtain the data or whether it would be better off as a separate thread only?
What ingredients go into such a decision?
Thanks!
If you are after real-time, stay away from "another" process as this just introduces another hop in the data path, which slows transmission down.
Have one process that instantiates a reader thread, which pulls data from the device and pushes it into an internal buffer, probably with double-buffering, depending on the device's capabilities.
Then have a logger thread and a sender thread reading from this internal buffer.
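A bare-bones sketch of that layout with POSIX threads; it uses a single mutex-protected slot instead of real double-buffering, and the device read is stubbed out, so treat it only as a shape to start from.

    #include <pthread.h>
    #include <string.h>

    /* Shared buffer filled by the reader thread and drained by the logger
     * and sender threads. A real implementation would double-buffer. */
    typedef struct {
        pthread_mutex_t lock;
        pthread_cond_t  updated;
        char            sample[256];
        unsigned long   seq;          /* bumped on every new reading */
    } shared_buf_t;

    static shared_buf_t buf = {
        PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, "", 0
    };

    static void *reader_thread(void *arg)   /* pulls data from the device */
    {
        (void)arg;
        for (;;) {
            char reading[256] = "";
            /* read_from_device(reading);  -- hypothetical device call */
            pthread_mutex_lock(&buf.lock);
            strncpy(buf.sample, reading, sizeof buf.sample - 1);
            buf.seq++;
            pthread_cond_broadcast(&buf.updated);
            pthread_mutex_unlock(&buf.lock);
        }
        return NULL;
    }

    /* The logger thread and the sender thread both wait on buf.updated,
     * copy the sample out under the lock, then write to the log file or
     * to the connected TCP client outside of it. */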
I am developing a proxy server using WinSock 2.0 on Windows. If I were developing it with the blocking model, select() would be the way to wait for data from the client or from the remote server. Is there an equivalent way to do this using I/O Completion Ports?
I used two contexts for the two directions of data with I/O Completion Ports, but a pending WSARecv never received any data from the remote server, and I couldn't find the problem.
Thanks in advance.
EDIT: Here's the worker-thread code for the I/O Completion Ports version I'm currently developing. But what I'm asking is how to implement a select() equivalent.
I/O Completion Ports provide an indication of when an I/O operation completes; they do not indicate when it is possible to initiate an operation. In many situations this doesn't actually matter. Most of the time the overlapped I/O model will work perfectly well if you assume it is always possible to initiate an operation. The underlying operating system will, in most cases, simply do the right thing and queue the data for you until it is possible to complete the operation.
However, there are some situations when this is less than ideal. For example, you can always send to a socket using overlapped I/O. You can do this even when the remote peer is not reading and the TCP stack has started to use flow control and has filled the TCP window... This simply uses resources on your local machine in a completely uncontrolled manner (well, not entirely uncontrolled, but controlled by the peer, which is not ideal). I write about this here; in many situations you DO need to actively manage this kind of thing, by tracking how many outstanding I/O write requests you have and using that as an indication of 'readiness to send'.
Likewise, if you want a 'readiness to recv' indication you could issue a 'zero byte' read on the socket. This is a read issued with a zero-length buffer; it completes when there is data to read, but returns no data. That gives you an indication that there is data to be read on the connection, but it is, IMHO, pointless unless you are suffering from the very unlikely situation of hitting the I/O page-lock limit, as you may as well read the data when it becomes available rather than forcing extra kernel-to-user-mode transitions.
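If you do go that route, a zero-byte read is just a normal overlapped WSARecv with a zero-length buffer. The sketch below assumes the socket is already associated with your completion port and that the OVERLAPPED/context management from your worker thread is in place; error handling is trimmed.

    #include <winsock2.h>

    /* Post a zero-byte overlapped read: its completion on the IOCP signals
     * "data is available to read" without consuming any of it. */
    int post_zero_byte_read(SOCKET s, WSAOVERLAPPED *ov)
    {
        WSABUF wbuf;
        DWORD  flags = 0;

        wbuf.buf = NULL;   /* zero-length buffer */
        wbuf.len = 0;

        if (WSARecv(s, &wbuf, 1, NULL, &flags, ov, NULL) == SOCKET_ERROR &&
            WSAGetLastError() != WSA_IO_PENDING) {
            return -1;     /* real failure; WSA_IO_PENDING just means queued */
        }
        return 0;          /* completion will arrive on the associated IOCP */
    }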
In summary, you don't really need an answer to your question. You need to look at how the API works and write your code to work with it rather than trying to force the API to work in a way that other APIs that you are familiar with work.
I am writing an HTTP reverse-proxy in C using Libevent and I would like to implement multithreading to make use of all available CPU cores. I had a look at this example: http://roncemer.com/software-development/multi-threaded-libevent-server-example/
In this example it appears that one thread is used for the full duration of a connection, but for HTTP 1.1 I don't think this will be the most effective solution, as connections are kept alive by default after each request so that they can be reused later. I have noticed that even a single browser tab can open several connections to one server and keep them open until the tab is closed, which would quickly exhaust the thread pool. For an HTTP 1.1 proxy there will be many open connections, but only very few of them actively transferring data at a given moment.
So I was thinking of an alternative, to have one event base for all incoming connections and have the event callback functions delegate to worker threads. This way we could have many open connections and make use of a thread only when data arrives on a connection, returning it back to the pool once the data has been dealt with.
My question is: is this a suitable implementation of threads with Libevent?
Specifically – is there any need to have one event base per connection as in the example or is one for all connections sufficient?
Also – are there any other issues I should be aware of?
Currently the only problem I can see is with burstiness, when data is received in many small chunks triggering many read events per HTTP response which would lead to a lot of handing-off to worker threads. Would this be a problem? If it would be, then it could be somewhat negated using Libevent's watermarking, although I'm not sure how that works if a request arrives in two chunks and the second chunk is sufficiently small to leave the buffer size below the watermark. Would it then stay there until more data arrives?
Also, I would need to implement scheduling so that a chunk is only sent once the previous chunk has been fully sent.
The second problem I thought of is when the thread pool is exhausted, i.e. all threads are currently doing something, and another read event occurs – this would lead to the read event callback blocking. Does that matter? I thought of putting these into another queue, but surely that's exactly what happens internally in the event base. On the other hand, a second queue might be a good way to organise scheduling of the chunks without blocking worker threads.
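To make the idea concrete, here is roughly the shape I have in mind. It is only a sketch, not working code: the job struct, queue and callback names are made up, listener setup is omitted, and I'm assuming evthread_use_pthreads() is called before creating the base and the bufferevents are created with BEV_OPT_THREADSAFE so workers can write back safely.

    #include <event2/buffer.h>
    #include <event2/bufferevent.h>
    #include <pthread.h>
    #include <stdlib.h>

    /* One job per read event: the callback copies out whatever has arrived
     * and a worker processes it, so the event loop itself never blocks. */
    struct job { struct bufferevent *bev; char *data; size_t len; struct job *next; };

    static struct job *q_head, *q_tail;
    static pthread_mutex_t q_lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  q_cond = PTHREAD_COND_INITIALIZER;

    static void enqueue(struct job *j)
    {
        pthread_mutex_lock(&q_lock);
        if (q_tail) q_tail->next = j; else q_head = j;
        q_tail = j;
        pthread_cond_signal(&q_cond);
        pthread_mutex_unlock(&q_lock);
    }

    /* Read callback: runs on the single event_base thread. */
    static void on_read(struct bufferevent *bev, void *ctx)
    {
        (void)ctx;
        struct evbuffer *in = bufferevent_get_input(bev);
        struct job *j = malloc(sizeof *j);
        j->bev  = bev;
        j->len  = evbuffer_get_length(in);
        j->data = malloc(j->len);
        j->next = NULL;
        evbuffer_remove(in, j->data, j->len);   /* drain what has arrived */
        enqueue(j);                             /* hand off; don't block here */
    }

    static void *worker(void *arg)
    {
        (void)arg;
        for (;;) {
            pthread_mutex_lock(&q_lock);
            while (!q_head) pthread_cond_wait(&q_cond, &q_lock);
            struct job *j = q_head;
            q_head = j->next;
            if (!q_head) q_tail = NULL;
            pthread_mutex_unlock(&q_lock);
            /* Process j->data (parse, forward upstream, ...); any reply can
             * go out with bufferevent_write(j->bev, ...), which is only safe
             * with the threading setup mentioned above. */
            free(j->data);
            free(j);
        }
        return NULL;
    }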
I wonder what the most efficient file logging strategy would be in a server written in C?
I can see the following options:
fopen() in append mode, then fwrite() the data for a time frame of, say, 1 hour, then fclose()?
Caching the data in memory, then occasionally open() in append mode, write(), and close()?
Using a thread is usually a good solution; we adopted it with interesting results.
The main thread that needs to log prepares the log string and passes it to a second thread. To feed the second thread we use a lockless queue plus a circular buffer, in order to minimize the amount of alloc/free and wait time.
The second thread waits on the lockless queue. When it finds there is work to do, it consumes a slot from the queue and logs the data.
Using a separate thread saves a great amount of time.
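As a rough sketch of the kind of structure I mean (the details here, such as C11 atomics, fixed-size slots and a single producer/single consumer, are simplifications for illustration, not our actual code):

    #include <stdatomic.h>
    #include <string.h>

    /* The main thread writes at `head`, the logging thread reads at `tail`;
     * each index has exactly one writer, so no locks are needed. */
    #define RING_SLOTS 1024
    #define SLOT_LEN   256

    static char          ring[RING_SLOTS][SLOT_LEN];
    static atomic_size_t head;   /* next slot the producer will fill  */
    static atomic_size_t tail;   /* next slot the consumer will drain */

    /* Called by the main thread: returns 0 if the ring is full. */
    int log_push(const char *msg)
    {
        size_t h = atomic_load_explicit(&head, memory_order_relaxed);
        size_t t = atomic_load_explicit(&tail, memory_order_acquire);
        if (h - t == RING_SLOTS)
            return 0;                               /* full: drop or retry */
        strncpy(ring[h % RING_SLOTS], msg, SLOT_LEN - 1);
        ring[h % RING_SLOTS][SLOT_LEN - 1] = '\0';
        atomic_store_explicit(&head, h + 1, memory_order_release);
        return 1;
    }

    /* Called by the logging thread: returns 0 when there is nothing to do. */
    int log_pop(char out[SLOT_LEN])
    {
        size_t t = atomic_load_explicit(&tail, memory_order_relaxed);
        size_t h = atomic_load_explicit(&head, memory_order_acquire);
        if (t == h)
            return 0;                               /* empty */
        memcpy(out, ring[t % RING_SLOTS], SLOT_LEN);
        atomic_store_explicit(&tail, t + 1, memory_order_release);
        return 1;
    }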
After we decided to use a second thread, we had to face another problem: many instances of the same program (a full-text search engine) must all log to the same file, so the resource has to be shared properly among every instance of the server.
We could have used a semaphore or some other synchronization method, but we found another solution: the second thread sends a UDP packet to a local log server that listens on a known port. This server reads each message and logs it to the file (the server is actually the only one that owns the file while it is being written). The UDP socket itself guarantees serialization of the logs.
I've been using this solution for more than 10 years and have never lost a single line of my log files. Using the second thread I also saved a great percentage of time on every operation (we log a lot of information for every single command the server receives).
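The UDP hand-off itself is just a sendto() to a local port; something along these lines (the port number is a placeholder, and in practice you keep one socket open for the lifetime of the thread instead of opening it per line):

    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/types.h>
    #include <unistd.h>

    /* Send one log line as a datagram to the local log server. */
    int send_log_line(const char *line)
    {
        int fd = socket(AF_INET, SOCK_DGRAM, 0);
        if (fd < 0)
            return -1;

        struct sockaddr_in dst = { 0 };
        dst.sin_family = AF_INET;
        dst.sin_port   = htons(5140);                  /* placeholder port */
        inet_pton(AF_INET, "127.0.0.1", &dst.sin_addr);

        ssize_t n = sendto(fd, line, strlen(line), 0,
                           (struct sockaddr *)&dst, sizeof dst);
        close(fd);
        return n < 0 ? -1 : 0;
    }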
HTH
Why don't you directly log your data when the events occur?
If your server crashes, you want the data from right up to the moment it crashed. If you only flush your buffered logs once an hour, you'll miss the interesting logs.
File streams are usually buffered by the OS.
If you believe it makes your server slow, due to hard drive writes, you might consider logging from a separate thread. But I wonder whether that is really the problem. Premature optimization?
Unless you've benchmarked and found that it's a bottleneck, use fopen and fprintf. There's no reason to put your own complex buffering layer on top unless stdio is too slow for you (and if it is too slow, you might need to rethink the OS/C library your server is running on).
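For example, something as plain as the following, relying on stdio's own buffering, is usually enough (the file name and message are just placeholders):

    #include <stdio.h>

    int main(void)
    {
        FILE *log = fopen("server.log", "a");   /* append mode */
        if (!log)
            return 1;

        fprintf(log, "client connected from %s\n", "203.0.113.7");
        fflush(log);    /* optional: only if you need the line on disk now */

        fclose(log);
        return 0;
    }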
The slowest part of writing a system log is the output operation to the physical disks.
Buffering the log records is necessary so that you don't lose any log data, and checksumming them is necessary so that the log data can't be tampered with after the fact.
The user, administrators and support staff need detailed runtime and monitoring information from a daemon developed in C.
In my case this information includes, for example:
the current system health, such as throughput (MB/s), the amount of data already written, ...
the current configuration
I would use JMX in the Java world and the procfs (or sysfs) interface for a kernel module. A log file doesn't seem to be the best way.
What is the best way for such a information interface for a C daemon?
I thought about opening a socket and implementing a bare-bones HTTP or XML-RPC server, but that seems to be overkill. What are the alternatives?
You can use a signal handler in your daemon that reacts to, say, USR1 and dumps information to the screen/log/network. This way, you can just send the process a USR1 signal whenever you need the info.
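A minimal sketch of that pattern (the counters printed are placeholders; the handler only sets a flag, because very little is safe to do inside a signal handler):

    #include <signal.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    static volatile sig_atomic_t dump_requested;

    static void on_usr1(int sig) { (void)sig; dump_requested = 1; }

    int main(void)
    {
        struct sigaction sa;
        memset(&sa, 0, sizeof sa);
        sa.sa_handler = on_usr1;
        sigaction(SIGUSR1, &sa, NULL);

        for (;;) {
            /* ... normal daemon work ... */
            if (dump_requested) {
                dump_requested = 0;
                /* placeholder values standing in for the real counters */
                fprintf(stderr, "throughput=12.3 MB/s written=42 GB\n");
            }
            sleep(1);
        }
    }

Then "kill -USR1 <pid>" triggers the dump whenever you want it.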
You could listen on a UNIX-domain socket and regularly write the current status (say, once a second) to anyone who connects to it. You don't need to implement a protocol like HTTP or XML-RPC; since the communication is one-way, just write a single line of plain text containing the state at each interval.
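A minimal sketch of that (the socket path and status fields are placeholders; it sends one line and closes, where the real daemon would loop once a second in its own thread, and error handling is omitted):

    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/un.h>
    #include <unistd.h>

    int serve_status(void)
    {
        int fd = socket(AF_UNIX, SOCK_STREAM, 0);
        struct sockaddr_un addr = { 0 };
        addr.sun_family = AF_UNIX;
        strncpy(addr.sun_path, "/tmp/mydaemon.status", sizeof addr.sun_path - 1);

        unlink(addr.sun_path);                    /* remove a stale socket file */
        bind(fd, (struct sockaddr *)&addr, sizeof addr);
        listen(fd, 4);

        for (;;) {
            int client = accept(fd, NULL, NULL);
            if (client < 0)
                continue;
            dprintf(client, "throughput_mb_s=12.3 written_gb=42.0\n");
            close(client);
        }
    }

A client can then be as simple as "nc -U /tmp/mydaemon.status".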
If you are using a relational database anyway, create another table and fill it with the current status as frequently as necessary. If you don't have a relational database, write the status to a file, and implement some rotation scheme to avoid overwriting a file that somebody is reading at that very moment.
Write to a file. Use a file locking protocol to force atomic reads and writes. Anything you agree on will work. There's probably a UUCP locking library floating around that you can use. In a previous life I found one for Linux. I've also implemented it from scratch. It's fairly trivial to do that too.
Check out the lockdev(3) library on Linux. It's for devices, but it may work for plain files too.
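For plain files, flock(2) is one simple locking protocol that fits the "anything you agree on" rule; a minimal sketch of the writer side (a reader mirrors it with LOCK_SH):

    #include <fcntl.h>
    #include <string.h>
    #include <sys/file.h>
    #include <unistd.h>

    /* Take an exclusive lock before rewriting the status file, so a reader
     * holding a shared lock never sees a half-written or truncated file. */
    int write_status(const char *path, const char *status)
    {
        int fd = open(path, O_WRONLY | O_CREAT, 0644);
        if (fd < 0)
            return -1;

        flock(fd, LOCK_EX);                 /* blocks until readers are done */
        ftruncate(fd, 0);
        write(fd, status, strlen(status));
        flock(fd, LOCK_UN);

        close(fd);
        return 0;
    }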
I like the socket idea best. There's no need to support HTTP or any RPC protocol. You can create a simple application specific protocol that returns requested information. If the server always returns the same info, then handling incoming requests is trivial, though the trivial approach may cause problems down the line if you ever want to expand on the possible queries. The main reason to use a pre-existing protocol is to leverage existing libraries and tools.
Speaking of leveraging, another option is to use SNMP and access the daemon as a managed component. If you need to query/manage the daemon remotely, this option has its advantages, but otherwise can turn out to be greater overkill than an HTTP server.