I was reading a journal article that stated:
Lighttpd is an asynchronous server, and Apache2 is a process-based
server.
What does this actually mean?
Which server would you recommend for a Raspberry Pi used for monitoring purposes?
Thanks.
See this website for a detailed explanation.
In the traditional thread-based (synchronous) model, each client gets one thread that is completely separate and dedicated to serving that client. This can cause I/O blocking problems: while a thread waits for an operation to complete, it keeps holding its resources (memory, CPU). Also, creating separate processes consumes more resources.
Asynchronous servers do not create a new process or thread for each new request. Instead, a worker process accepts the requests and handles thousands of them using a highly efficient event loop. Asynchronous means that requests can be handled concurrently without blocking each other, so resources are shared rather than dedicated to, and blocked by, a single client.
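As a rough illustration of the event-loop idea, here is a minimal Python sketch using the standard selectors module. The function names (serve_ready_events, demo), the upper-casing reply and the loopback test clients are made up for this example; it is not how Lighttpd is implemented, only the same general technique.

```python
import selectors
import socket


def serve_ready_events(sel, listener, timeout=0.5):
    """Run one pass of the event loop; return how many sockets were ready."""
    events = sel.select(timeout)
    for key, _mask in events:
        sock = key.fileobj
        if sock is listener:
            # New client: register it with the same loop, no new thread.
            conn, _addr = listener.accept()
            conn.setblocking(False)
            sel.register(conn, selectors.EVENT_READ)
        else:
            data = sock.recv(4096)
            if data:
                # Simplification: assumes the small reply fits in the send buffer.
                sock.send(data.upper())
            else:
                # An empty read means the client closed its end.
                sel.unregister(sock)
                sock.close()
    return len(events)


def demo(n_clients=3):
    """Hypothetical demo: several loopback clients served by one thread."""
    sel = selectors.DefaultSelector()
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen()
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ)

    clients = [socket.create_connection(listener.getsockname())
               for _ in range(n_clients)]
    for i, client in enumerate(clients):
        client.sendall(f"hello {i}".encode())

    while serve_ready_events(sel, listener):  # loop until nothing is ready
        pass

    replies = [client.recv(4096).decode() for client in clients]
    for client in clients:
        client.close()
    for key in list(sel.get_map().values()):
        sel.unregister(key.fileobj)
        key.fileobj.close()
    sel.close()
    return replies
```

All three clients are accepted and answered by the same thread; nothing is spawned per connection.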
I have a network application on a gateway. It receives and sends packets. For most of them, my gateway acts as a router, but in some cases, it can receive packets too.
Should I have:
only one main thread
a main thread + a dispatch thread in charge of giving it to the correct flow handler
as many threads as there are flows
something else.
?
Doing multithreading correctly is no simple matter; in many cases a solution based on select and friends will be a whole lot easier to create.
Your case sounds a lot like a typical Unix service daemon. The popular solution to your problem is not to use threads, but forks.
The idea is that your program listens on the socket and waits for connections. As soon as a connection arrives, it forks. The child process then handles that connection, while the parent process just continues in the loop and waits for further incoming connections.
Advantages over threading:
Very simple program design
No problems with concurrency
Established method for Unix/Linux systems
Disadvantages:
Things get complicated when several connections interact with each other (your use case doesn't sound like they would)
Performance penalty on Windows systems (not on Unix systems!)
You can find many code examples online.
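For what it's worth, the fork-per-connection pattern described above could look roughly like this in Python (Unix only, since it relies on os.fork). The handle function, the upper-casing "protocol" and the demo clients are assumptions for illustration; a real daemon would loop forever and typically reap children from a SIGCHLD handler.

```python
import os
import socket


def handle(conn):
    """Child-side work: read one request and answer it (made-up protocol)."""
    data = conn.recv(1024)
    conn.sendall(data.upper())


def serve(listener, n_connections):
    """Accept n_connections clients, forking one child per connection."""
    children = []
    for _ in range(n_connections):  # a real daemon would loop forever
        conn, _addr = listener.accept()
        pid = os.fork()
        if pid == 0:
            # Child: it does not need the listening socket.
            listener.close()
            try:
                handle(conn)
            finally:
                conn.close()
                os._exit(0)  # never fall back into the parent's loop
        # Parent: the child owns the connection now.
        conn.close()
        children.append(pid)
    for pid in children:
        os.waitpid(pid, 0)  # reap children to avoid zombies
    return children


def demo():
    """Hypothetical demo: two loopback clients, one child process each."""
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))
    listener.listen()
    clients = []
    for word in (b"alpha", b"beta"):
        client = socket.create_connection(listener.getsockname())
        client.sendall(word)
        clients.append(client)
    serve(listener, len(clients))
    listener.close()
    replies = [client.recv(1024) for client in clients]
    for client in clients:
        client.close()
    return replies
```

Because each child has its own address space, handle needs no locking, which is the "no problems with concurrency" advantage listed above.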
I don't know much about networking applications, but I think it's like this:
If you can react to requests asynchronously, you would probably use just one single thread (as in Node.js). If you cannot, the main thread would block on one request and hold up all other actions.
If you cannot react to requests asynchronously, you have to use more than one thread. You can do that in several ways: create a thread for every request, or create a limited number of threads and assign requests to them.
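A minimal sketch of the "limited number of threads" variant, assuming Python's standard ThreadPoolExecutor; handle_client, the pool size and the demo clients are hypothetical and only there to make the example runnable.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def handle_client(conn):
    """Blocking handler: runs on whichever pool thread is free."""
    with conn:
        data = conn.recv(1024)  # blocks only this worker thread
        conn.sendall(data.upper())


def serve(listener, n_connections, max_workers=2):
    """Main thread accepts; a fixed-size pool does the blocking work."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for _ in range(n_connections):  # a real server would loop forever
            conn, _addr = listener.accept()
            pool.submit(handle_client, conn)
    # Leaving the with-block waits for all submitted handlers to finish.


def demo(n_clients=5):
    """Hypothetical demo: more clients than worker threads."""
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))
    listener.listen()
    clients = []
    for i in range(n_clients):
        client = socket.create_connection(listener.getsockname())
        client.sendall(f"req{i}".encode())
        clients.append(client)
    serve(listener, n_clients)
    listener.close()
    replies = [client.recv(1024) for client in clients]
    for client in clients:
        client.close()
    return replies
```

With max_workers=2 and five clients, the extra connections simply wait in the pool's queue until a thread is free, so the thread count stays bounded.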
My personal preference is to use one main thread and one worker thread per connection. No cap whatsoever. I am assuming that your server will be stateless, like an HTTP server.
For stateful servers you will have to figure out some way to control the number of threads.
I'm writing a TCP server/client application on Windows, to become familiar with the Winsock API. I come from a Unix background and would like to know which of these would be the best approach to implement the application:
First the specification
Must scale well on multiprocessor and single-processor systems.
No hard-coded limit on the number of connections.
Application can both listen for connections, acting as server, and act as client.
Multi threaded.
First approach:
Non-blocking select-like socket for listening, in the 'server' thread.
for each client connecting we spawn a separate thread.
Second approach:
Blocking socket for listening, in the 'server' thread.
for each client connecting we spawn a separate thread.
Third approach:
Non-blocking select-like socket for listening, in the 'server' thread.
No separate thread for each incoming connection, the protocol would need state information kept across sessions I suppose.
I wonder what is the most efficient and scalable approach, and especially if it can work with a UDP socket too.
Note: I'm writing the application in plain old C. No .NET or C++ involved, and C++ exceptions are disabled too.
As Gary says, I/O Completion Ports are the most efficient way to manage multiple network connections in a non-blocking/async manner on Windows platforms.
With IOCP you get notified when your networking operations complete and you can process these completions with a small number of threads. You get to decide how many threads you allocate to process the completions and the kernel decides when to use the threads that you're providing. It uses them in LIFO order to reduce context switching, so at any point you are using only the minimal number of threads required, and reusing the same threads rather than cycling through all of the threads that you have available.
The asynchronous nature of IOCP programming can be a little confusing to start with, but once you get the hang of it, it's fairly straightforward.
I have some free IOCP server code which demonstrates the basics and provides some example servers that are pretty easy to build on. You can find the code here: http://www.serverframework.com/products---the-free-framework.html. That page also links to some articles that I wrote to explain the code.
Relating this to the detail of your question. You should be looking at a variation on your third approach. Use AcceptEx() to accept new connections, this can be used in an asynchronous manner and so you don't need a separate thread for connection acceptance and can use the threads that are also processing your overlapped/async read and write operations.
I've written an asynchronous client which does not use blocking sockets, so if you're interested in that approach, then take a look at my client: http://codesprout.blogspot.com/2011/04/asynchronous-http-client.html
It's an HTTP client, but I've shown very little HTTP protocol processing in there; it's all just .NET sockets. The server would work in a similar way: you can take advantage of the *Async methods such as AcceptAsync.
Under Windows, the best performance is achieved by using I/O completion ports.
This is because the lists and queuing mechanism are handled in the kernel, far from the heavy user-mode overhead (which drags your code down if you dare to do the hard work yourself).
Unfortunately, Windows I/O completion ports need to allocate many threads to scale, and this quickly kills performance (as compared to Linux epoll, which can scale independently of the number of worker threads you decide to involve in the task).
Recently, I discovered http://gwan.com/, a Web server which came from Windows and was then ported to Linux. Its authors describe the problem in detail on their forum.
Is a server essentially a background process running an infinite loop listening on a port? For example:
while(1){
    command = read(127.0.0.1:xxxx);
    if(command){
        execute(command);
    }
}
When I say server, I obviously am not referring to a physical server (computer). I am referring to a MySQL server, or Apache, etc.
Full disclosure - I haven't had time to poke through any source code. Actual code examples would be great!
That's more or less what server software generally does.
Usually it gets more complicated because the infinite loop "only" accepts the connection and each connection can often handle multiple "commands" (or whatever they are called in the used protocol), but the basic idea is roughly this.
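Since code examples were requested, here is a small Python sketch of that shape: the outer loop only accepts connections, and each connection then carries several "commands". The line-based protocol, the "quit" command and the "executed ..." replies are invented for illustration; they do not correspond to MySQL, Apache or any real server.

```python
import socket


def handle_commands(conn):
    """Serve one connection: read newline-terminated commands until 'quit'."""
    with conn, conn.makefile("rwb") as stream:
        for line in stream:
            command = line.strip()
            if command == b"quit":
                break
            # Stand-in for execute(command) in the question's pseudocode.
            stream.write(b"executed " + command + b"\n")
            stream.flush()


def serve(listener, max_connections):
    """The 'infinite' loop, bounded here so that the example terminates."""
    for _ in range(max_connections):  # a real server: while True
        conn, _addr = listener.accept()
        handle_commands(conn)


def demo():
    """Hypothetical demo: one client sending three commands."""
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen()
    client = socket.create_connection(listener.getsockname())
    client.sendall(b"ping\nstatus\nquit\n")
    serve(listener, 1)
    listener.close()
    chunks = []
    while True:
        chunk = client.recv(1024)
        if not chunk:  # server closed the connection after 'quit'
            break
        chunks.append(chunk)
    client.close()
    return b"".join(chunks)
```

This version serves one connection at a time, so a slow client holds up everyone else; that limitation is what forking, threading and event-loop designs address.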
There are three kinds of 'servers' - forking, threading and single threaded (non-blocking). All of them generally loop the way you show, the difference is what happens when there is something to be serviced.
A forking service is just that. For every request, fork() is invoked creating a new child process that handles the request, then exits (or remains alive, to handle subsequent requests, depending on the design).
A threading service is like a forking service, but instead of a whole new process, a new thread is created to serve the request. Like forks, sometimes threads stay around to handle subsequent requests. The difference in performance and footprint is simply the difference between threads and forks. Depending on how much of the process's memory is unrelated to servicing a client (and prone to changing), it's usually better not to clone the entire address space. The only added complexity here is synchronization.
A single process (aka single threaded) server will fork only once to daemonize. It will not spawn new threads, it will not spawn child processes. It will continue to poll() the socket to find out when the file descriptor is ready to receive data, or has data available to be processed. Data for each connection is kept in its own structure, identified by various states (writing, waiting for ACK, reading, closing, etc). This can be an extremely efficient design, if done properly. Instead of having multiple children or threads blocking while waiting to do work, you have a single process and event loop servicing requests as they are ready.
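The per-connection structure and its states can be sketched like this in Python. The Connection class, the two states and the newline-terminated "ok ..." protocol are assumptions made for the example (a real server would have more states, as listed above); a socketpair stands in for an accepted client so the sketch stays short.

```python
import selectors
import socket
from dataclasses import dataclass, field
from enum import Enum, auto


class State(Enum):
    READING = auto()
    WRITING = auto()


@dataclass
class Connection:
    """Everything the loop needs to remember about one client."""
    sock: socket.socket
    state: State = State.READING
    inbuf: bytearray = field(default_factory=bytearray)
    outbuf: bytearray = field(default_factory=bytearray)


def step(sel, conn):
    """Advance one connection's state machine after its socket became ready."""
    if conn.state is State.READING:
        chunk = conn.sock.recv(4096)
        if not chunk:  # peer closed its end
            sel.unregister(conn.sock)
            conn.sock.close()
            return
        conn.inbuf += chunk
        if conn.inbuf.endswith(b"\n"):  # full request received
            conn.outbuf += b"ok " + conn.inbuf
            conn.inbuf.clear()
            conn.state = State.WRITING
            sel.modify(conn.sock, selectors.EVENT_WRITE, conn)
    elif conn.state is State.WRITING:
        sent = conn.sock.send(conn.outbuf)
        del conn.outbuf[:sent]  # keep whatever could not be sent yet
        if not conn.outbuf:
            conn.state = State.READING
            sel.modify(conn.sock, selectors.EVENT_READ, conn)


def run_until_idle(sel, timeout=0.2):
    """Service ready connections until a poll finds nothing to do."""
    while True:
        events = sel.select(timeout)
        if not events:
            return
        for key, _mask in events:
            step(sel, key.data)


def demo():
    """Hypothetical demo: a request that arrives in two pieces."""
    sel = selectors.DefaultSelector()
    server_end, client_end = socket.socketpair()
    server_end.setblocking(False)
    sel.register(server_end, selectors.EVENT_READ, Connection(server_end))
    client_end.sendall(b"pi")  # incomplete request: stays in inbuf
    run_until_idle(sel)
    client_end.sendall(b"ng\n")  # completes the request
    run_until_idle(sel)
    reply = client_end.recv(1024)
    client_end.close()
    run_until_idle(sel)  # loop notices the close and cleans up
    sel.close()
    return reply
```

Nothing ever blocks waiting for the second half of the request: the partial data sits in that connection's inbuf while the loop is free to service other sockets.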
There are instances where single threaded services spawn multiple threads; however, the additional threads aren't servicing incoming requests. One might (for instance) set up a local socket in a thread that allows an administrator to obtain the status of all connections.
A little googling for non blocking http server will yield some interesting hand rolled web servers written as code golf challenges.
In short, the difference is what happens once the endless loop is entered, not just the endless loop :)
In a manner of speaking, yes. A server is simply something that "loops forever" and serves. However, typically you'll find that "daemons" do things like redirect STDOUT and STDERR to file handles or /dev/null, and double fork, among other things. Your code is a very simplistic "server" in a sense.
I have a daemon to write in C that will need to handle 20-150K TCP connections simultaneously. They are long-running connections and rarely ever tear down. They have a very small amount of data in transit at any given time (rarely even exceeding the MTU; it's a stimulus/response protocol), but response times are critical. I'm wondering what the current UNIX community is using to handle large numbers of sockets while minimizing response latency. I've seen designs revolving around multiplexing connections to forked worker pools, threads (per connection), and statically sized thread pools. Any suggestions?
The easiest suggestion is to use libevent; it makes it easy to write a simple non-blocking single-threaded server that would comply with your requirements.
If the processing for each response takes some time, or if it uses some blocking API (like almost anything from a DB), then you'll need some threading.
One answer is worker threads: you spawn a set of threads, each waiting on some queue for work. They can be separate processes instead of threads, if you like. The main difference would be the communication mechanism used to tell the workers what to do.
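The worker-queue arrangement might be sketched as follows, in Python for brevity rather than C with libevent. slow_lookup is a hypothetical stand-in for a blocking database call, and the (connection id, payload) job format is an assumption; in a real server the event loop would push jobs and later write the results back to the matching sockets.

```python
import queue
import threading
import time


def slow_lookup(payload):
    """Hypothetical blocking call, e.g. a database query."""
    time.sleep(0.01)
    return payload.upper()


def worker(jobs, results):
    """Pull jobs until a None sentinel arrives; push results back."""
    while True:
        job = jobs.get()
        if job is None:
            break
        conn_id, payload = job
        results.put((conn_id, slow_lookup(payload)))


def process_all(requests, n_workers=3):
    """Fan requests out to a fixed set of workers and collect the answers."""
    jobs, results = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(jobs, results))
               for _ in range(n_workers)]
    for thread in threads:
        thread.start()
    for request in requests:  # in a server, the event loop would do this
        jobs.put(request)
    for _ in threads:
        jobs.put(None)  # one sentinel per worker
    for thread in threads:
        thread.join()
    answers = {}
    while not results.empty():
        conn_id, answer = results.get()
        answers[conn_id] = answer
    return answers
```

Swapping threads for processes mostly means replacing the two queues with pipes or sockets, which is the "communication mechanism" difference mentioned above.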
A different way to do it is to use several threads and give each of them a portion of those 150K connections. Each will have its own processing loop and work mostly like the single-threaded server, except for the listening port, which will be handled by a single thread. This helps spread the load between cores, but if you use a blocking resource, it will block all the connections handled by that specific thread.
libevent lets you use the second way if you're careful, but there's also an alternative: libev. It's not as well known as libevent, but it specifically supports the multi-loop scheme.
If performance is critical then you'll really want to go for a multithreaded event loop solution - i.e. a pool of worker threads to handle your connections. Unfortunately, there is no abstraction library to do this that works on most Unix platforms (note that libevent is only single-threaded as are most of these event-loop libraries), so you'll have to do the dirty work yourself.
On Linux that means using edge-triggered epoll with a pool of worker threads (Windows would have I/O completion ports which also works fine in a multithreaded environment - I am not sure about other Unixes).
BTW, I have done some work trying to abstract edge-triggered epoll on Linux and Windows I/O completion ports on http://nginetd.cmeerw.org (it is work in progress, but might provide some ideas).
If you have system configuration access, don't overdo it: set up iptables/pf/etc. to load-balance connections across n daemon instances (processes), as this will work out of the box. Depending on how much the daemon blocks, n should range from the number of cores on the system to several times higher. This approach looks crude, but it can handle broken daemons and even restart them if necessary. Migration would also be smooth, as you could start diverting new connections to another set of processes (for example, a new release or a new box) instead of interrupting service. On top of that you get several features like source affinity, which can significantly help with caching and with contention from problematic sessions.
If you don't have system access (or ops can't be bothered), you can use a load balancer daemon (there are plenty of open source ones) instead of iptables/pf/etc., again with n service daemons as above.
This approach also helps with separating port privileges. If the external service needs to listen on a low port (<1024), only the load balancer needs to run privileged (as admin/root, or in the kernel).
I've written several IP load balancers in the past, and it can be very error-prone in production. You don't want to support and debug that. Also, operations and management will tend to second-guess your code more than external code.
I think javier's answer makes the most sense. If you want to test the theory out, then check out the Node JavaScript project.
Node is based on Google's V8 engine, which compiles JavaScript to machine code and is as fast as C for certain tasks. It is also based on libev and is designed to be completely non-blocking, meaning you don't have to worry about context switching between threads (everything runs on a single event loop). It is very similar to Erlang in that respect.
Writing high performance servers in JavaScript is now really, really easy with Node. You could also, with a little bit of effort, write your custom code in C and create bindings for Node to call into it to do your actual processing (look at the Node source to see how to do this; documentation is a little sketchy at the moment). As an uglier alternative, you could build your custom C code as an application and use stdin/stdout to communicate with it.
I've tested node myself with upwards of 150k connections with absolutely no issues (of course you will need some serious hardware if all these connections are going to be communicating at once). A TCP connection in node.js on average uses only 2-3k of memory so you could theoretically handle 350-500k connections per 1GB of RAM.
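The per-gigabyte estimate can be checked with a quick calculation. This sketch assumes "2-3k" means 2 to 3 KiB per connection and "1GB" means 1 GiB, and it ignores kernel socket buffers and any other overhead; connections_per_gib is just a helper name for this example.

```python
GIB = 1024 ** 3
KIB = 1024


def connections_per_gib(kib_per_connection):
    """How many connections fit in 1 GiB at the given per-connection cost."""
    return GIB // (kib_per_connection * KIB)


for cost in (2, 3):
    print(f"{cost} KiB per connection: {connections_per_gib(cost)} connections")
```

That gives roughly 524k connections at 2 KiB and 350k at 3 KiB, in line with the figure quoted above.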
Note - Node.js is not currently supported on Windows, but it is only at an early stage of development and I'd imagine it will be ported at some stage.
Note 2 - you will have to ensure the code you are calling into from Node does not block
Several systems have been developed to improve on select(2) performance: kqueue, epoll, and /dev/poll. In all these systems, you can have a pool of worker threads waiting for tasks; you will not be forced to set up all file handles over and over again when done with one of them.
Do you have to start from scratch? You could use something like gearman.
As part of an experiment, I want to write an OpenGL-based UI server for applications, similar to X11 or Quartz in architecture: a core process renders objects into a single viewport, but all graphical objects are controlled by remote processes.
The idea is that the view's stability depends only on the core process. If a client process segfaults, its allocated resources would be safely freed. A requirement for that feature is being able to reliably find out whether a client process has crashed.
What is the best practice here?
I think this should be detected as an event on the connection to the client, just as with any other client/server architecture.
If you use sockets, the socket will eventually register that one side has closed the socket (as the process crashes, its end of the socket will be closed), and you can detect that, look up the owning client in the server's records, and clean out all resources.
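A minimal Python sketch of that detection, assuming the server keeps a dictionary of resources per client. The names (reap_dead_clients, the client ids, the "texture" entries) are hypothetical, and closing one end of a socketpair stands in for a crash, since the kernel closes a dead process's sockets in the same way.

```python
import selectors
import socket


def reap_dead_clients(sel, resources, timeout=0.2):
    """Poll once; free the resources of clients whose socket hit EOF or reset."""
    dead = []
    for key, _mask in sel.select(timeout):
        client_id = key.data
        try:
            data = key.fileobj.recv(4096)
        except ConnectionError:  # e.g. connection reset by peer
            data = b""
        if not data:
            # Empty read: the other end is gone, so clean up after it.
            sel.unregister(key.fileobj)
            key.fileobj.close()
            resources.pop(client_id, None)
            dead.append(client_id)
        # Non-empty data would be dispatched as a normal request (omitted here).
    return dead


def demo():
    """Hypothetical demo: two clients, one of which 'crashes'."""
    sel = selectors.DefaultSelector()
    resources = {"client-a": ["texture 1"], "client-b": ["texture 2"]}
    peers = {}
    for client_id in resources:
        server_end, client_end = socket.socketpair()
        sel.register(server_end, selectors.EVENT_READ, client_id)
        peers[client_id] = client_end
    peers["client-a"].close()  # stands in for the client process dying
    dead = reap_dead_clients(sel, resources)
    remaining = sorted(resources)
    peers["client-b"].close()
    for key in list(sel.get_map().values()):
        sel.unregister(key.fileobj)
        key.fileobj.close()
    sel.close()
    return dead, remaining
```

The same check works unchanged for TCP sockets, so the server is not tied to clients running on the same machine.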
It would be very weird for the server to directly (through process IDs or whatever) look for the clients, and that would also needlessly limit your architecture to only run locally, and not across a network.