I have a C# winform application. it has many forms with different functionalities. These forms wrap to a WCF service. for example
form1 calls serviceMethod1 continuously and updates the results
form2 calls serviceMethod2 continuously and updates the results
The calls are made in a different thread per each form, but this is ending up with too many threads as we have many forms. Is this bad and why? and is there a way to avoid this given my scenario?
Regards
How many threads are you talking about? If you have a lot of threads, you'll lose a bit of performance due to context switching - but in practice I wouldn't expect this to become a significant problem until you have an awful lot of them.
One alternative would be to use a Timer though (it sounds like a System.Timers.Timer or System.Threading.Timer would be most appropriate) - schedule each service call to be made on a regular basis, and the timer will use the threadpool to fire the calls. I suspect that although you say you're calling the services "continuously" you actually mean you're doing it regularly - which is exactly the kind of situation a timer is good for.
To answer the question frankly: It depends entirely on the OS and app design, but this question may indicate a shortcoming in the program's design.
Detail:
You want to learn the allocation requirements of a thread on your target architecture/OS, as well as keep your threads relatively busy/avoid polling, and to configure priorities correctly if you really do have a lot of threads. 'Many' threads may be 8 (or fewer, if busy), or 100+ if they have relatively little work to do, it ultimately depends on your needs and design.
As tests for some tests/objects/operations, I have used more than 100, and occasionally more than 1000 working threads. No explosions happened, though I have never had a true need for those operations to be that parallel in a shipping app (unless the aforementioned programs are being used in very unusual circumstances), and it made more sense to put the actual implementation into some centralized task manager. If you have time-critical/real time applications, then these tasks may be best on another thread. If they are short lived, consider a thread pool.. well, there are many ways to attack many problem classes...
You can use WCF asynchronious proxy
In Visual Studio, when you add Web Reference you can check "Generate Asynchronous operations" to generate an asynchronious proxy.
While the threads spend most of their time waiting for server response - even hundreds of threads are unlikely to degrade performance (CPU-wise). Otherwise, use thread pool and queue "request and update form once" tasks when previous update completes.
More important problem might be loading service with too many simultaneous requests.
As a general rule, you won't gain anything by having more threads than you have CPU cores. There are exceptions to the general rule, but I doubt they apply to your case.
From the OS' point of view, threads are no longer the lightweight things they used to be, but are almost as costly as full processes. Implementing thread synchronization correctly is not a simple task, debugging multi-threaded applications is a lot harder than a single threaded one.
With green threads, it is not an issue. Green threads being sort of a virtual thread, which is what you will generally get with Java and C#.
The benefit of threads in many apps is not to crunch more numbers but to allow lots of things to go on at once with good responsiveness, so having a lot of threads can be very useful for some things and will not always have any real cost.
Related
How can you practically test a synchronized data structure (in C)?
Firing a couple of threads and have them compete for access to the structure for a while to see if anything goes wrong doesn't sound very safe.
EDIT in response to comments: I mean that there are several threads running functions that operate on the same set of data, with some kind of synchronization strategy (flags/semaphores/lock-free CAS/etc) to presumably eliminate race conditions and deadlocks. The problem is programatically testing for the correct synchronization of the workers.
No-one really knows how to do this with 100% reliability. Here is just one example of of a testing tool to find concurrency bugs.
When its better to write single threaded event or multithreaded or fork servers? Which approach more approriate for:
web server who serves statics
web server who serves statics and proxy requests to other http servers
previous plus this servers makes some logic in C without attending to harddrive or network
previous plus makes some requests to MySQL?
For example, the language is C/C++. I want to undestand this question very much. Thank you.
Given that modern processors are multicore, you should always write code under the assumption that it may at some point be multithreaded (make things reentrant wherever possible, and where not possible, use interfaces that would make it easy to drop-in an implementation that does the necessary locking).
If you have a fairly small number of queries-per-second (qps), then you might be able to get away with a single threaded event server. I would still recommend writing the code in a way that allows for multithreading, though, in the event that you increase the qps and need to handle more events.
A single threaded server can be scaled up by converting it into a forked server; however, forked servers use more resources -- any memory that is common to all requests ends up duplicated in memory when using forked servers, whereas only one copy of this data is required in a single multithreaded server. Also, a multithreaded server can deliver lower latency if it is taking advantage of true parallelism and can actually perform multiple parts of a single request processing in parallel.
There is a nice discussion on Single Thread vs. MultiThreaded server on joelonsoftware,
I am implementing a small database like MySQL.. Its a part of a larger project..
Right now i have designed the core database, by which i mean i have implemented a parser and i can now execute some basic sql queries on my database.. it can store, update, delete and retrieve data from files.. As of now its fine.. however i want to implement this on network..
I want more than one user to be able to access my database server and execute queries on it at the same time... I am working under Linux so there is no issue of portability right now..
I know i need to use Sockets which is fine.. I also know that i need to use a concept like Thread Pool where i will be required to create a maximum number of threads initially and then for each client request wake up a thread and assign it to the client..
As for now what i am unable to figure out is how all this is actually going to be bundled together.. Where should i implement multithreading.. on client side / server side.? how is my parser going to be configured to take input from each of the clients separately?(mostly via files i think?)
If anyone has idea about how i can implement this pls do tell me bcos i am stuck here in this project...
Thanks.. :)
If you haven't already, take a look at Beej's Guide to Network Programming to get your hands dirty in some socket programming.
Next I would take his example of a stream client and server and just use that as a single threaded query system. Once you have got this down, you'll need to choose if you're going to actually use threads or use select(). My gut says your on disk database doesn't yet support parallel writes (maybe reads), so likely a single server thread servicing requests is your best bet for starters!
In the multiple client model, you could use a simple per-socket hashtable of client information and return any results immediately when you process their query. Once you get into threading with the networking and db queries, it can get pretty complicated. So work up from the single client, add polling for multiple clients, and then start reading up on and tackling threaded (probably with pthreads) client-server models.
Server side, as it is the only person who can understand the information. You need to design locks or come up with your own model to make sure that the modification/editing doesn't affect those getting served.
As an alternative to multithreading, you might consider event-based single threaded approach (e.g. using poll or epoll). An example of a very fast (non-SQL) database which uses exactly this approach is redis.
This design has two obvious disadvantages: you only ever use a single CPU core, and a lengthy query will block other clients for a noticeable time. However, if queries are reasonably fast, nobody will notice.
On the other hand, the single thread design has the advantage of automatically serializing requests. There are no ambiguities, no locking needs. No write can come in between a read (or another write), it just can't happen.
If you don't have something like a robust, working MVCC built into your database (or are at least working on it), knowing that you need not worry can be a huge advantage. Concurrent reads are not so much an issue, but concurrent reads and writes are.
Alternatively, you might consider doing the input/output and syntax checking in one thread, and running the actual queries in another (query passed via a queue). That, too, will remove the synchronisation woes, and it will at least offer some latency hiding and some multi-core.
Scenario : I am working on LOB application, as in silverlight every call to service is Async so automatically UI is not blocked when the request is processed at server side.
Silverlight also supports threading as per my understanding if you are developing LOB application threads are most useful when you need to do some IO operation but as i am not using OOB application it is not possible to access client resource and for all server request it is by default Async.
In above scenario is there any usage of Threading or can anyone provide some good example where by using threading we can improve performance.
I have tried to search a lot on this topic but everywhere i have identified some simple threading example from which it is very difficult to understand the real benefit.
Thanks for help
Tomasz Janczuk has also pointed out that if the UI thread is fairly busy, you can significantly improve the performance even of async WCF calls by marshaling them onto a separate thread. And I should note that the UI thread can get awfully busy doing things that you wouldn't anticipate would chew up cycles, like calculating drop-shadows and what-not, so this might be worth investigating (and measuring) for your application.
That said, I've been writing LOB apps for the better part of two decades, and synchronous IO aside, I haven't found a lot of scenarios where adding multiple threads in an LOB application was worth the additional complexity.
Edit 4/2/10: I had lunch with Tomasz Janczuk and some other folks from the WCF team the other day, and they clarified a few issues for me about how WCF works with Silverlight background threads. There are two things to be concerned with: sending data, and receiving it (say, from duplex callbacks or async call completions). When you send data, the call will always be made from the thread that actually makes the call. So if you have a lot of data that needs to be serialized, you might get a small performance boost by marshaling the outgoing call onto a background thread (say, by using ThreadPool.QueueUserWorkItem). But it's not likely to be a substantial performance boost.
However, when you receive data, either through a duplex callback, or through an async xxxCompleted method, the data is always received on the thread on which the connection was originally opened. This means that if you're opening the connection explicitly, it will receive data on that thread; but if you're opening the connection implicitly, it will receive data on the thread on which you made your first outbound connection. This won't make a lot of difference if you need to update the UI on every callback, since you'd just have to marshal the call back onto the UI thread. But if there are times when you just need to store the data for future reference or processing, you can get yourself a significant performance boost by opening your connection on a separate thread, so that you can receive and process callbacks without waiting on the UI thread.
Hope this helps. Thought I'd write it down while I still have it reasonably fresh in my head.
The same advantages apply to Silverlight as to other applications. If your are doing a long running calculation on the client and don't want to tie up the main/ui thread, then threading is an obvious choice.
Also, I haven't researched it, but I would imagine if you are running a multi-core machine, you could improve performance by splitting work into multiple separate threads.
Note: This is not for unit testing or integration testing. This is for when the application is running.
I am working on a system which communicates to multiple back end systems, which can be grouped into three types
Relational database
SOAP or WCF service
File system (network share)
Due to the environment this will run in, there are no guarantees that any of those will be available at run time. In fact some of them seem pretty brittle and go down multiple times a day :(
The thinking is to have a small bit of test code which runs before the actual code. If there is a problem then persist the request and poll until the target system until it is available. Tests could possibly be rerun within the code to check it is still available at logical points. The ultimate goal is to have a very stable system, regardless of the stability (or lack thereof) of the systems it communicates to.
My questions around this design are:
Are there major issues with it? (small things like the fact it may fail between the test completing and the code running are understandable)
Are there better ways to implement this sort of design?
Would using traditional exception handling and/or transactions be better?
Updates
The system needs to talk to the back end systems in a coordinated way.
The system is very async in nature so using things like queuing technologies is fine.
The system must run even if one or more backend systems are down as others may be up and processing of some information is possible.
You will be needing that traditional exception handling no matter what, since as you point out there's always the chance that things'll fail between your last check and the actual request. So I really think any solution you find should try to interact smoothly with this.
You are not stating if these flaky resources need to interact in some kind of coordinated manner, which would indicate that you should probably be using a transaction manager of some sort to do this. I do not believe you want to get into the footwork of transaction management in application code for most needs.
Sometimes I have also seen people use AOP to encapsulate retry logic to back-end systems that fail (for instance due to time-out issues). Used sparingly this may be a decent solution.
In some cases you can also use message queuing technology to alleviate unstable back-ends. You could for instance commit to a message queue as part of a transaction, and only pop off the queue when successful. But this design is normally only possible when you're able to live with an asynchronous process.
And as always, real stability can only be achieved by attacking the root cause of the problem. I had a 25-year old bug fixed in a mainframe TCP/IP stack fixed because we were overrunning it, so it is possible.
The Microsoft Smartclient framework provides a ConnectionMonitor class. Should be easy to use or duplicate.
Our approach to this kind of issue was to run a really basic 'sanity tester' prior to bringing up our main application. This was thick client so we could run the test every time the app started. This sanity test would go out and check things like database availability, and external network (extranet) access, and it could have been extended to do webservices as well.
If there was a failure, the user was informed, and crucially an email was also sent to the support/dev team. These emails soon became unweildy as so many were being created, but we then setup filters, so we knew when somethings really bad was happening. Overall the approach worked pretty well, our biggest win was being able to tell users that the system was down, before they had entered data, and got part way through a long winded process. They absolutely loved it.
At a technica level the sanity was written in C#, it used exception handling in a conventional way not to find the problems it was looking for. The sanity program became a mini app in its own right, and it was standalone from the main app. If I were doing it again I'd using a logging framework to capture issues, which is more flexible then our hard coded approach.