Erlang: are lots of timers OK?

I have a gen_server process that maintains a pool. For each incoming request, I need to examine the pool to see whether there is a match for it. If there is one, the matched request is removed from the pool and replies are sent to both requests; if there is none, the new request is put into the pool for later examination.
The business logic requires that if a request R sits in the pool for T seconds without being matched, I reply to R with something like "I cannot find a match for you".
Ideally, I want to do this with timers. Specifically, for each incoming request with no match, I put it into the pool as before, but also start a timer that tells the gen_server to remove it when time is up. Of course, if the request is matched later, the timer should be cancelled.
My concern is that if there are lots of unmatched requests in the pool, there will be lots of running timers. Will having this many timers become a problem?

There were big improvements to the timer implementation in R18:
Besides the API changes and time warp modes, a lot of scalability and performance improvements regarding time management have been made internally in the runtime system. Examples of such improvements are scheduler-specific timer wheels, scheduler-specific BIF timer management, parallel retrieval of monotonic time and system time on systems with primitives that are not buggy.
Scheduler-specific timer wheels are exactly what matters in your scenario. I doubt you would come up with a better-performing solution to your problem in Erlang or in any other language or environment, so your solution should be fine as long as you are using R18 or newer.
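To make the pattern from the question concrete, here is a minimal sketch of a matching pool with one timer per unmatched request. It is a Python asyncio analogue, not Erlang code: in a gen_server the corresponding roles would be played by erlang:send_after/3 and erlang:cancel_timer/1. The class name MatchPool, the string keys used for matching and the replies list (a stand-in for actually replying to callers) are all assumptions made for illustration.

```python
import asyncio


class MatchPool:
    """Pool of unmatched requests, each guarded by its own timeout timer."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.waiting = {}   # match key -> (request_id, timer handle)
        self.replies = []   # (request_id, outcome); stands in for real replies

    def submit(self, request_id, key):
        """Match against a waiting request, or park this one with a timer."""
        if key in self.waiting:
            other_id, timer = self.waiting.pop(key)
            timer.cancel()  # matched in time, so the timeout must not fire
            self.replies.append((other_id, "matched"))
            self.replies.append((request_id, "matched"))
        else:
            loop = asyncio.get_running_loop()
            timer = loop.call_later(self.timeout, self._expire, key)
            self.waiting[key] = (request_id, timer)

    def _expire(self, key):
        """Timer callback: the request waited too long without a match."""
        request_id, _ = self.waiting.pop(key)
        self.replies.append((request_id, "no match"))


async def demo():
    pool = MatchPool(timeout=0.05)
    pool.submit("r1", "chess")
    pool.submit("r2", "chess")  # matches r1 and cancels r1's timer
    pool.submit("r3", "go")     # never matched, so its timer fires
    await asyncio.sleep(0.2)
    return pool.replies


def run_demo():
    return asyncio.run(demo())
```

Each parked request costs one timer handle plus one pool entry, and cancelling on a match keeps stale timers from firing later.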

Related

Using PPM on top of Aerys

The Aerys documentation carries the warning "Don’t use any blocking I/O functions in Aerys." at https://amphp.org/aerys/io#blocking-io. Should I use PPM instead of Aerys if I need PDO (e.g., for Prooph components) and want to reuse an initialized application instance to handle different requests?
I'm not bound to any existing PPM adapter (e.g., Symfony). Is there a way to reuse Aerys code (e.g., Router) for request-response logic when using PPM on top of Aerys (https://github.com/php-pm/php-pm/pull/267)?
If you want to use blocking functions, you can simply raise the worker count with the -w switch of the command line script. It's definitely not optimal, but with enough workers the blocking shouldn't be too noticeable, apart from some increase in latency.
Another possibility is to move the blocking calls into one or more worker threads with amphp/parallel.
As long as the responses are relatively fast, everything should be fine. The problems begin under heavy load, when things get slower and might time out, because the blocking periods then become very long.
PHP-PM doesn't offer much benefit over using Aerys directly. It redirects requests to a currently free worker, but under high enough load the kernel load balancing will probably be good enough, and not all long-running requests will be routed to one worker. In fact, using Aerys will probably be better, because it's production ready and has multiple independent workers instead of one master that might be a bottleneck. PHP-PM could solve that in a better way, but that is currently not implemented. Additionally, Aerys supports keep-alive connections, which PHP-PM currently does not.
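The worker-thread option mentioned above follows a general pattern: keep the event loop free and hand each blocking call to a thread. The sketch below shows that pattern in Python with asyncio.to_thread (Python 3.9+). It is only an analogue and does not use or mirror the amphp/parallel API; blocking_query is a hypothetical stand-in for a blocking call such as a PDO query.

```python
import asyncio
import time


def blocking_query(n):
    """Hypothetical stand-in for a blocking call (e.g. a synchronous DB query)."""
    time.sleep(0.05)
    return n * 2


async def handle_request(n):
    # Run the blocking function in a worker thread so the event loop
    # stays free to serve other requests in the meantime.
    return await asyncio.to_thread(blocking_query, n)


async def serve(batch):
    # Handle several requests concurrently; results keep the input order.
    return await asyncio.gather(*(handle_request(n) for n in batch))


def run_batch(batch):
    return asyncio.run(serve(batch))
```

With this structure the three calls in run_batch([1, 2, 3]) overlap instead of running back to back, at the cost of one worker thread per in-flight blocking call.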

Parallel calls to google.appengine.api.channel.send_message

I am using send_message(client_id, message) in google.appengine.api.channel to fan out messages. The most common use case is two users. A typical trace looks like the following:
The two calls to send_message are independent. Can I perform them in parallel to save latency?
Well, there's no async API available, so you might need to implement a custom solution.
Have you already tried native threading? It could work in theory, but because of the GIL it only helps if the XMPP API blocks on I/O, and I'm not sure it does.
A custom implementation will invariably come with some overhead, so it might not be the best idea for your simple case, unless the latency breaks the experience in the cases with more than two users.
There is, however, another problem that might make it worth your while: what happens if the instance crashes after sending only the first message? The API isn't transactional, so you should have some kind of safeguard. Maybe a simple recovery mode will suffice, given how infrequently this will happen, but I'm willing to bet a transactional message channel sounds more appealing, right?
Two ways you could go about it, off the top of my head:
Push a task for every message. Tasks are transactional and guaranteed to run, and they will execute in parallel with roughly identical run times. This will increase the time it takes for the first message to go out, but will keep it consistent across all of them.
Use a service built for this exact use case, like Firebase (though it might even be too powerful lol). In my experience the channel API is not very consistent and its performance is underwhelming for gaming, so this might make your system even better.
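The native-threading idea raised above could look roughly like the following sketch. It assumes threads are available in your runtime and that the send call releases the GIL while it waits on I/O, which the answer itself is unsure about. fake_send_message is a hypothetical stand-in so the example runs outside App Engine; in real code you would pass channel.send_message as the send argument.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_send_message(client_id, message):
    """Hypothetical stand-in for channel.send_message; sleeps to mimic I/O."""
    time.sleep(0.05)
    return (client_id, message)


def fan_out(client_ids, message, send=fake_send_message):
    """Call send once per recipient, each call on its own worker thread."""
    if not client_ids:
        return []
    with ThreadPoolExecutor(max_workers=len(client_ids)) as pool:
        futures = [pool.submit(send, cid, message) for cid in client_ids]
        # result() re-raises any exception from the worker, so a failed
        # send is not silently lost.
        return [f.result() for f in futures]
```

For the common two-user case this overlaps the two sends, but it does nothing about the crash-between-sends problem described above.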
Fixed that for you
I just posted a patch on googleappengine issue 9157, adding:
channel.send_message_async for asynchronously sending a message to a recipient.
channel.send_message_multi_async for asynchronously broadcasting a single message to multiple recipients.
Some helper methods to make those possible.
Until the patch is pushed into the SDK, you'll need to include the channel_async.py file (that's attached on that thread).
Usage
import channel_async as channel
# this is synchronous D:
channel.send_message(<client-id>, <message>)
# this is asynchronous :D
channel.send_message_async(<client-id>, <message>)
# this is good for broadcasting a single message to multiple recipients
channel.send_message_multi_async([<client-id>, <client-id>], <message>)
# or
channel.send_message_multi_async(<list-of-client-ids>, <message>)
Benefits
Speed comparison on production:
Synchronous model: 2 - 80 ms per recipient (and blocking -.-)
Asynchronous model: 0.15 - 0.25 ms per recipient

Silverlight Threading and its usage

Scenario: I am working on an LOB application. In Silverlight every call to a service is async, so the UI is automatically not blocked while the request is processed on the server side.
Silverlight also supports threading. As I understand it, in an LOB application threads are most useful for I/O operations, but since mine is not an OOB application it cannot access client resources, and all server requests are async by default.
In the above scenario, is there any use for threading? Can anyone provide a good example where using threading improves performance?
I have searched a lot on this topic, but everywhere I have found only simple threading examples, from which it is very difficult to understand the real benefit.
Thanks for the help.
Tomasz Janczuk has also pointed out that if the UI thread is fairly busy, you can significantly improve the performance even of async WCF calls by marshaling them onto a separate thread. And I should note that the UI thread can get awfully busy doing things that you wouldn't anticipate would chew up cycles, like calculating drop-shadows and what-not, so this might be worth investigating (and measuring) for your application.
That said, I've been writing LOB apps for the better part of two decades, and synchronous IO aside, I haven't found a lot of scenarios where adding multiple threads in an LOB application was worth the additional complexity.
Edit 4/2/10: I had lunch with Tomasz Janczuk and some other folks from the WCF team the other day, and they clarified a few issues for me about how WCF works with Silverlight background threads. There are two things to be concerned with: sending data, and receiving it (say, from duplex callbacks or async call completions). When you send data, the call will always be made from the thread that actually makes the call. So if you have a lot of data that needs to be serialized, you might get a small performance boost by marshaling the outgoing call onto a background thread (say, by using ThreadPool.QueueUserWorkItem). But it's not likely to be a substantial performance boost.
However, when you receive data, either through a duplex callback, or through an async xxxCompleted method, the data is always received on the thread on which the connection was originally opened. This means that if you're opening the connection explicitly, it will receive data on that thread; but if you're opening the connection implicitly, it will receive data on the thread on which you made your first outbound connection. This won't make a lot of difference if you need to update the UI on every callback, since you'd just have to marshal the call back onto the UI thread. But if there are times when you just need to store the data for future reference or processing, you can get yourself a significant performance boost by opening your connection on a separate thread, so that you can receive and process callbacks without waiting on the UI thread.
Hope this helps. Thought I'd write it down while I still have it reasonably fresh in my head.
The same advantages apply to Silverlight as to other applications. If you are doing a long-running calculation on the client and don't want to tie up the main/UI thread, then threading is an obvious choice.
Also, I haven't researched it, but I would imagine if you are running a multi-core machine, you could improve performance by splitting work into multiple separate threads.

Is opening too many threads in an application bad?

I have a C# WinForms application. It has many forms with different functionality. These forms wrap a WCF service. For example:
form1 calls serviceMethod1 continuously and updates the results
form2 calls serviceMethod2 continuously and updates the results
The calls are made on a separate thread for each form, but this ends up creating too many threads, as we have many forms. Is this bad, and why? Is there a way to avoid it in my scenario?
Regards
How many threads are you talking about? If you have a lot of threads, you'll lose a bit of performance due to context switching - but in practice I wouldn't expect this to become a significant problem until you have an awful lot of them.
One alternative would be to use a Timer though (it sounds like a System.Timers.Timer or System.Threading.Timer would be most appropriate) - schedule each service call to be made on a regular basis, and the timer will use the threadpool to fire the calls. I suspect that although you say you're calling the services "continuously" you actually mean you're doing it regularly - which is exactly the kind of situation a timer is good for.
To answer the question frankly: It depends entirely on the OS and app design, but this question may indicate a shortcoming in the program's design.
Detail:
You want to learn the allocation requirements of a thread on your target architecture/OS, keep your threads relatively busy and avoid polling, and configure priorities correctly if you really do have a lot of threads. 'Many' threads may be 8 (or fewer, if busy), or 100+ if they have relatively little work to do; it ultimately depends on your needs and design.
When testing some objects and operations, I have used more than 100, and occasionally more than 1000, working threads. No explosions happened, though I have never had a true need for those operations to be that parallel in a shipping app (unless the aforementioned programs are being used in very unusual circumstances), and it made more sense to put the actual implementation into some centralized task manager. If you have time-critical/real-time applications, then these tasks may be best on another thread. If they are short-lived, consider a thread pool... well, there are many ways to attack many problem classes.
You can use a WCF asynchronous proxy.
In Visual Studio, when you add a Web Reference you can check "Generate Asynchronous operations" to generate an asynchronous proxy.
As long as the threads spend most of their time waiting for the server's response, even hundreds of threads are unlikely to degrade performance (CPU-wise). Otherwise, use a thread pool and queue a "request and update form once" task each time the previous update completes.
A more important problem might be overloading the service with too many simultaneous requests.
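The "queue a task when the previous update completes" idea can be sketched as follows. This is a Python analogue and not WinForms or WCF code: poll_forms, its parameters and the zero-argument callables standing in for serviceMethod1 and serviceMethod2 are all hypothetical. A small shared pool serves every form, and each form re-queues its next call only after the previous one has finished, so the thread count stays fixed no matter how many forms exist.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def poll_forms(service_calls, rounds, max_workers=2):
    """Run each form's service call `rounds` times on one small shared pool.

    service_calls maps a form name to a zero-argument callable.
    Sketch only: a call that raises would leave the wait below hanging.
    """
    results = {name: [] for name in service_calls}
    if not service_calls or rounds < 1:
        return results

    lock = threading.Lock()
    done = threading.Event()
    remaining = [len(service_calls) * rounds]  # calls still outstanding

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        def run_once(name, left):
            value = service_calls[name]()      # the "request" step
            with lock:
                results[name].append(value)    # the "update form" step
                remaining[0] -= 1
                finished = remaining[0] == 0
            if left > 1:
                # Queue the next call only now that this one is complete.
                pool.submit(run_once, name, left - 1)
            if finished:
                done.set()

        for name in service_calls:
            pool.submit(run_once, name, rounds)
        done.wait()
    return results
```

Because a form never has more than one call queued or in flight, the number of simultaneous requests hitting the service is bounded by max_workers.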
As a general rule, you won't gain anything by having more threads than you have CPU cores. There are exceptions to the general rule, but I doubt they apply to your case.
From the OS's point of view, threads are no longer the lightweight things they used to be, but are almost as costly as full processes. Implementing thread synchronization correctly is not a simple task, and debugging a multi-threaded application is a lot harder than debugging a single-threaded one.
With green threads, it is not an issue. Green threads being sort of a virtual thread, which is what you will generally get with Java and C#.
The benefit of threads in many apps is not to crunch more numbers but to allow lots of things to go on at once with good responsiveness, so having a lot of threads can be very useful for some things and will not always have any real cost.

Why is it bad practice to make multiple database connections in one request?

A discussion about Singletons in PHP has me thinking about this issue more and more. Most people instruct that you shouldn't make a bunch of DB connections in one request, and I'm just curious as to what your reasoning is. My first thought is the expense to your script of making that many requests to the DB, but then I counter myself with the question: wouldn't multiple connections make concurrent querying more efficient?
How about some answers (with evidence, folks) from some people in the know?
Database connections are a limited resource. Some DBs have a very low connection limit, and wasting connections is a major problem. By consuming many connections, you may be blocking others from using the database.
Additionally, throwing a ton of extra connections at the DB doesn't help anything unless there are resources on the DB server sitting idle. If you've got 8 cores and only one is being used to satisfy a query, then sure, making another connection might help. More likely, though, you are already using all the available cores. You're also likely hitting the same hard drive for every DB request, and adding additional lock contention.
If your DB has anything resembling high utilization, adding extra connections won't help. That'd be like spawning extra threads in an application with the blind hope that the extra concurrency will make processing faster. It might in some certain circumstances, but in other cases it'll just slow you down as you thrash the hard drive, waste time task-switching, and introduce synchronization overhead.
It is the cost of setting up the connection, transferring the data and then tearing it down. It will eat up your performance.
Evidence is harder to come by but consider the following...
Let's say it takes x microseconds to make a connection.
Now you want to make several requests and get data back and forth. Let's say that the difference in transport time is negligible between one connection and many (just for the sake of argument).
Now let's say it takes y microseconds to close the connection.
Opening one connection will take x+y microseconds of overhead. Opening many will take n * (x+y). That will delay your execution.
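The arithmetic above can be checked with a few lines of Python. The timings used in the usage note (2000 microseconds to open, 500 to close) are made-up illustrative values, not measurements, and the function name is hypothetical.

```python
def connection_overhead_us(n, open_us, close_us):
    """Total open/close overhead, in microseconds, for n connections.

    Implements n * (x + y) from the text, with x = open_us and y = close_us.
    """
    return n * (open_us + close_us)


def extra_cost_us(n, open_us, close_us):
    """Overhead added by using n connections instead of a single one."""
    many = connection_overhead_us(n, open_us, close_us)
    one = connection_overhead_us(1, open_us, close_us)
    return many - one
```

With the assumed timings, connection_overhead_us(3, 2000, 500) gives 7500 microseconds against 2500 for a single connection, so extra_cost_us(3, 2000, 500) is 5000 microseconds of pure setup and teardown.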
Setting up a DB connection is usually quite heavy. A lot of things are going on backstage (DNS resolution/TCP connection/Handshake/Authentication/Actual Query).
I once had an issue with a weird DNS configuration that made every TCP connection take a few seconds to come up. My login procedure (because of a complex architecture) needed 3 different DB connections to complete. With that issue, it was taking forever to log in. We then refactored the code to make it go through one connection only.
We access Informix from .NET and use multiple connections. Unless we're starting a transaction on each connection, it is often handled in the connection pool. I know that's very brand-specific, but most(?) database systems' client access will pool connections to the best of its ability.
As an aside, we did have a problem with connection count because of cross-database connections. Informix supports synonyms, so we synonymed the common offenders and the multiple connections were handled server-side, saving a lot in transfer time, connection creation overhead, and (the real crux of our situation) license fees.
I would assume that it is because your requests are not being sent asynchronously. Since they are made one after another on the server, blocking each time, you pay the overhead of creating a connection every time, when you only have to do it once...
In Flex, all web service calls are made asynchronously, so it is common to see multiple connections, or queued-up requests on the same connection.
Asynchronous requests mitigate the connection cost through faster request/response times. Because you cannot easily achieve this in PHP without some threading, the performance hit is greater than that of simply reusing the same connection.
that's my 2 cents...
