When is it better to write single-threaded event servers, etc.? - c

When is it better to write a single-threaded event server, a multithreaded server, or a forking server? Which approach is more appropriate for:
a web server that serves static files
a web server that serves static files and proxies requests to other HTTP servers
the previous, plus the server runs some logic in C without touching the hard drive or network
the previous, plus it makes some requests to MySQL?
Assume the language is C/C++. I really want to understand this question. Thank you.

Given that modern processors are multicore, you should always write code under the assumption that it may at some point be multithreaded (make things reentrant wherever possible, and where not possible, use interfaces that would make it easy to drop-in an implementation that does the necessary locking).
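As a tiny illustration of the reentrancy point above (both functions are made up for this example):

```c
/* Non-reentrant: one hidden, shared counter for the whole process.
   Calling this from two threads at once is a data race. */
int next_id_static(void)
{
    static int id = 0;
    return ++id;
}

/* Reentrant: the caller supplies the state. Each thread can use its
   own counter, or the caller can add whatever locking it needs --
   this is the "easy to drop in locking later" interface style. */
int next_id_r(int *state)
{
    return ++*state;
}
```

The second form costs nothing in the single-threaded case but becomes trivially thread-safe later.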
If you have a fairly small number of queries-per-second (qps), then you might be able to get away with a single threaded event server. I would still recommend writing the code in a way that allows for multithreading, though, in the event that you increase the qps and need to handle more events.
A single threaded server can be scaled up by converting it into a forked server; however, forked servers use more resources -- any memory that is common to all requests ends up duplicated in memory when using forked servers, whereas only one copy of this data is required in a single multithreaded server. Also, a multithreaded server can deliver lower latency if it is taking advantage of true parallelism and can actually perform multiple parts of a single request processing in parallel.
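For concreteness, here is a minimal sketch of the single-threaded event approach using epoll on Linux; the function name and shape are illustrative only, not taken from any real server:

```c
/* Minimal single-threaded event-loop sketch (Linux epoll). */
#include <sys/epoll.h>
#include <unistd.h>

/* Wait for one readable event on `fd` and read up to `buflen` bytes.
   Returns bytes read, or -1 on error/timeout. A real server would
   register every client socket with one long-lived epoll instance
   and loop forever, dispatching on each ready descriptor. */
ssize_t event_loop_read_once(int fd, char *buf, size_t buflen, int timeout_ms)
{
    int epfd = epoll_create1(0);
    if (epfd < 0)
        return -1;

    struct epoll_event ev = {0};
    ev.events = EPOLLIN;
    ev.data.fd = fd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
        close(epfd);
        return -1;
    }

    struct epoll_event ready;
    int n = epoll_wait(epfd, &ready, 1, timeout_ms);
    ssize_t got = -1;
    if (n == 1 && (ready.events & EPOLLIN))
        got = read(ready.data.fd, buf, buflen);
    close(epfd);
    return got;
}
```

Because one thread handles every ready descriptor in turn, there is no shared-state locking, but a slow handler stalls all other clients.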

There is a nice discussion of single-threaded vs. multithreaded servers on joelonsoftware.

Related

Simulating multiple Policy Decision Points (PDPs) in distributed environment

Let's take a scenario where subjects request access to many objects per second. A heavy load on a single PDP would mean an increase in wait and read/write times per request.
So far I have used the AuthzForce Core project to set up a single PDP, to which a for loop sends multiple requests (this can be done simultaneously using threads). However, this does not seem like a suitable setup for evaluating my policies in a distributed environment.
Is there any way that it can be done? Perhaps using AuthzForce Server?
Edit:
I am running a Java application which uses Authzforce Core. The program creates an instance of a PDP which loads a single policy document, and then a for loop executes multiple requests. This is all done locally within the program itself.
It is difficult to help improve performance here without looking at the code or the architecture, but I can give a few general tips (some of them may be obvious to you, but just to be thorough):
Since the PDP is embedded in your Java app, I assume you are using (or make sure you use) AuthzForce's native Java API (example in the README), which is the most efficient way to evaluate requests.
I also assume you are (re-)using the same PDP (BasePdpEngine) instance throughout the lifetime of your application. It should be thread-safe.
In order to evaluate multiple requests at once, you may try the PDP engine's evaluate(List) method (javadoc) instead of the usual evaluate(DecisionRequest); it is faster in some cases.
If by "distributed environment" you mean that you may have multiple instances of your Java app deployed in different places, and therefore multiple PDPs, the right setup depends on where/how you load the policy document: local file, remote database, etc. See my last comment. As mentioned in Rafael Sisto's answer, you can reuse some guidelines from the High Availability section of the AuthzForce Server installation guide.
AuthzForce Server has an option for high availability:
https://github.com/authzforce/fiware/blob/master/doc/InstallationAndAdministrationGuide.rst#high-availability
You could follow the same guidelines to implement this with your single PDP.

Do SQL queries block the Node.js event loop?

I'm going to create a web API using pure Node.js that does CRUD operations on SQL Server and returns results to clients. The queries are fairly long-running (around 3 seconds) and the request rate is high (around 30 rps). I'm using the mssql package with a callback function to return each result once it's ready.
I've already read a lot about Node, and I know it's a good fit for I/O-intensive rather than CPU-intensive apps, and that the event loop shouldn't be blocked because it's single-threaded...
My question: Is Node.js suitable for this (SQL-intensive) scenario? Are there any performance issues with using Node.js in this case?
Thanks
Node.js has gone all-in on non-blocking code to the degree that pretty much any function that's blocking in the Node.js API will be labelled as Sync.
Every database driver I've seen follows the model of requiring callbacks, using Promises, or in some cases both.
As a Node.js developer you must read the documentation carefully to look for any potentially blocking calls, and you need to employ the correct concurrency method to handle asynchronous operations. Normally you don't need to concern yourself too much with how long any given operation takes, but you should still be careful when doing things that are slow. Process data in smaller chunks (e.g. row by row) instead of all at once.
Even though it's just single threaded, Node.js can perform very well under load because it's very quick to switch between asynchronous operations. You can also scale up by having multiple Node.js processes working in parallel quite easily, especially if you're using a message bus or HTTP-type fan-out through a load balancer.

Using PPM on top of Aerys

There is a "Don't use any blocking I/O functions in Aerys."
warning at https://amphp.org/aerys/io#blocking-io. Should I use PPM instead of Aerys if I need to use PDO (e.g., Prooph components) and want to reuse an initialized application instance for handling different requests?
I'm not bound to any existing PPM adapter (e.g., Symfony). Is there a way to reuse Aerys code (e.g., the Router) for request-response logic when using PPM on top of Aerys (https://github.com/php-pm/php-pm/pull/267)?
You can just increase the worker count with the -w command-line switch if you want to use blocking functions. It's definitely not optimal, but with enough workers the blocking shouldn't be too noticeable, apart from some increase in latency.
Another possibility is to move the blocking calls into one or multiple worker threads with amphp/parallel.
As long as the responses are relatively fast, everything should be fine. Problems begin when there's a lot of load and things get slower and start to time out, because then those blocks become very long.
PHP-PM doesn't offer much benefit over using Aerys directly. It redirects requests to a currently free worker, but under high enough load the kernel's load balancing will probably be good enough, and not all slow requests will be routed to one worker. In fact, using Aerys directly will probably be better, because it's production-ready and has multiple independent workers instead of one master that might be a bottleneck. PHP-PM could solve that in a better way, but it's currently not implemented. Additionally, Aerys supports keep-alive connections, which PHP-PM currently does not.

Implementing a multithreaded application in C

I am implementing a small database like MySQL. It's part of a larger project.
Right now I have designed the core database, by which I mean I have implemented a parser and can now execute some basic SQL queries against my database. It can store, update, delete, and retrieve data from files. That much is fine; however, I now want to make it work over a network.
I want more than one user to be able to access my database server and execute queries on it at the same time. I am working under Linux, so portability is not an issue right now.
I know I need to use sockets, which is fine. I also know I need a concept like a thread pool, where I create a maximum number of threads initially and then, for each client request, wake up a thread and assign it to that client.
What I am unable to figure out is how all of this actually fits together. Where should I implement multithreading: on the client side or the server side? How should my parser be configured to take input from each of the clients separately (mostly via files, I think)?
If anyone has an idea about how I can implement this, please let me know, because I am stuck at this point in the project.
Thanks. :)
If you haven't already, take a look at Beej's Guide to Network Programming to get your hands dirty in some socket programming.
Next I would take his example of a stream client and server and just use that as a single-threaded query system. Once you have that down, you'll need to choose whether you're actually going to use threads or use select(). My gut says your on-disk database doesn't yet support parallel writes (maybe reads), so a single server thread servicing requests is likely your best bet for starters!
In the multiple client model, you could use a simple per-socket hashtable of client information and return any results immediately when you process their query. Once you get into threading with the networking and db queries, it can get pretty complicated. So work up from the single client, add polling for multiple clients, and then start reading up on and tackling threaded (probably with pthreads) client-server models.
Server side, as it is the only place that can understand the information. You need to design locking, or come up with your own model, to make sure that modifications don't affect the clients currently being served.
As an alternative to multithreading, you might consider event-based single threaded approach (e.g. using poll or epoll). An example of a very fast (non-SQL) database which uses exactly this approach is redis.
This design has two obvious disadvantages: you only ever use a single CPU core, and a lengthy query will block other clients for a noticeable time. However, if queries are reasonably fast, nobody will notice.
On the other hand, the single-thread design has the advantage of automatically serializing requests. There are no ambiguities and no locking needed. No write can come in between a read (or another write); it just can't happen.
If you don't have something like a robust, working MVCC built into your database (or are at least working on it), knowing that you need not worry can be a huge advantage. Concurrent reads are not so much an issue, but concurrent reads and writes are.
Alternatively, you might consider doing the input/output and syntax checking in one thread, and running the actual queries in another (with queries passed via a queue). That, too, will remove the synchronization woes, and it will at least offer some latency hiding and some multi-core utilization.
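That queue-based, two-thread design can be sketched with a tiny pthread producer/consumer queue; all names and the fixed capacity are illustrative assumptions:

```c
/* Sketch of the two-thread design: one thread parses queries and
   pushes them; a single worker thread pops and executes them, so
   execution stays serialized. Overflow handling is omitted for brevity. */
#include <pthread.h>

#define QCAP 64

typedef struct {
    const char *items[QCAP];
    int head, tail, count;
    pthread_mutex_t lock;
    pthread_cond_t not_empty;
} query_queue;

void queue_init(query_queue *q)
{
    q->head = q->tail = q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
}

/* Called by the I/O/parser thread once a statement is syntax-checked. */
void queue_push(query_queue *q, const char *sql)
{
    pthread_mutex_lock(&q->lock);
    q->items[q->tail] = sql;
    q->tail = (q->tail + 1) % QCAP;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Called by the single executor thread; blocks until work arrives. */
const char *queue_pop(query_queue *q)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == 0)
        pthread_cond_wait(&q->not_empty, &q->lock);
    const char *sql = q->items[q->head];
    q->head = (q->head + 1) % QCAP;
    q->count--;
    pthread_mutex_unlock(&q->lock);
    return sql;
}
```

Because only the executor thread touches the storage layer, the on-disk database still never sees concurrent writes.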

Is opening too many threads in an application bad?

I have a C# WinForms application. It has many forms with different functionalities. These forms wrap calls to a WCF service. For example:
form1 calls serviceMethod1 continuously and updates the results
form2 calls serviceMethod2 continuously and updates the results
The calls are made in a different thread per form, but this ends up creating too many threads, as we have many forms. Is this bad, and why? Is there a way to avoid it given my scenario?
Regards
How many threads are you talking about? If you have a lot of threads, you'll lose a bit of performance due to context switching - but in practice I wouldn't expect this to become a significant problem until you have an awful lot of them.
One alternative would be to use a Timer though (it sounds like a System.Timers.Timer or System.Threading.Timer would be most appropriate) - schedule each service call to be made on a regular basis, and the timer will use the threadpool to fire the calls. I suspect that although you say you're calling the services "continuously" you actually mean you're doing it regularly - which is exactly the kind of situation a timer is good for.
To answer the question frankly: It depends entirely on the OS and app design, but this question may indicate a shortcoming in the program's design.
Detail:
You want to learn the allocation requirements of a thread on your target architecture/OS, as well as keep your threads relatively busy/avoid polling, and to configure priorities correctly if you really do have a lot of threads. 'Many' threads may be 8 (or fewer, if busy), or 100+ if they have relatively little work to do, it ultimately depends on your needs and design.
In tests of some objects/operations, I have used more than 100, and occasionally more than 1000, working threads. No explosions happened, though I have never had a true need for those operations to be that parallel in a shipping app (unless the aforementioned programs are being used in very unusual circumstances), and it made more sense to put the actual implementation into some centralized task manager. If you have time-critical/real-time requirements, those tasks may be best on another thread. If they are short-lived, consider a thread pool. Well, there are many ways to attack many problem classes...
You can use a WCF asynchronous proxy.
In Visual Studio, when you add a Web Reference you can check "Generate Asynchronous operations" to generate an asynchronous proxy.
As long as the threads spend most of their time waiting for a server response, even hundreds of threads are unlikely to degrade performance (CPU-wise). Otherwise, use a thread pool and queue "request and update form once" tasks, starting each when the previous update completes.
More important problem might be loading service with too many simultaneous requests.
As a general rule, you won't gain anything by having more threads than you have CPU cores. There are exceptions to the general rule, but I doubt they apply to your case.
From the OS's point of view, threads are no longer the lightweight things they used to be; they are almost as costly as full processes. Implementing thread synchronization correctly is not a simple task, and debugging a multithreaded application is a lot harder than debugging a single-threaded one.
With green threads, this is not an issue. Green threads are a sort of virtual thread, which is generally what you get with Java and C#.
The benefit of threads in many apps is not to crunch more numbers but to allow lots of things to go on at once with good responsiveness, so having a lot of threads can be very useful for some things and will not always have any real cost.
