Rich database frontend - how to correctly handle low quality networks? - database

I have very limited experience with database programming, and my applications that access databases have been simple ones :). Until now :(. I need to create a medium-size desktop application (a "rich client", I think it's called?) that will use a database on the network to share data between multiple users. Most probably I will use C# and MSSQL/MySQL/SQLite.
I have performed a few test runs and discovered that on low-quality networks database access is not so smooth. On one company's LAN a lot of data is transferred over the network and the servers are under constant load, so it is common for a simple INSERT or SELECT query to take 1-2 minutes or even fail with a timeout or network error.
Are there any best practices for handling such situations? Of course I can split my app into a GUI thread and a DB thread so that network problems do not freeze the GUI. But what should I do about lots of network errors? Displaying them to the user too often would not be very good :(. I'm thinking about automatically creating a local copy of the database on each computer my app runs on: update the local database first, synchronize it in the background, and simply retry on network errors. This would allow the app to function even if the network has great lags or problems.
Any hints or buzzwords I can look into? Maybe there are best practices already available that I don't know about :)

Sorry, this is probably not the answer you are looking for, but you mention that a simple insert / update could take 1-2 minutes or even fail with a timeout / network error.
To me this sounds like there may be a problem other than the network itself. If you're working on a corporate network, there would have to be insane levels of traffic for this sort of behavior. I would do everything in your power to look at improving the network before proceeding. Can you post the result of a ping to the db box?
If you're going to architect your application around this type of network, it will significantly alter the end product and may even result in a poor-quality product for other clients.
Depending upon the nature of the application maybe look at implementing an async persistence queue and caching data on startup or even embedding a copy of the db into your application.
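To make the "async persistence queue" idea concrete, here is a minimal Python sketch (the question mentions C#, but the pattern carries over). Everything in it is an assumption made for illustration: `TransientDbError` stands in for whatever timeout or connection exception your database driver raises, and `write_fn` is a hypothetical function that performs one write. The GUI thread only calls `submit`, a worker thread retries with exponential backoff, and a record is reported as failed once rather than once per attempt.

```python
import queue
import threading
import time


class TransientDbError(Exception):
    """Stand-in (assumption) for a driver's timeout / dropped-connection error."""


def with_retry(operation, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Run operation(), retrying transient failures with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except TransientDbError:
            if attempt == attempts - 1:
                raise  # out of attempts; let the caller decide what to report
            sleep(base_delay * 2 ** attempt)  # 0.5 s, 1 s, 2 s, ...


class PersistenceQueue:
    """Accepts writes from the GUI thread and persists them on a worker thread."""

    def __init__(self, write_fn, sleep=time.sleep):
        self._write_fn = write_fn  # hypothetical: performs one database write
        self._sleep = sleep
        self._pending = queue.Queue()
        self.failed = []  # records that still failed after all retries
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, record):
        """Queue a record and return immediately, so the GUI never blocks."""
        self._pending.put(record)

    def wait_until_idle(self):
        """Block until every queued record has been written or given up on."""
        self._pending.join()

    def _run(self):
        while True:
            record = self._pending.get()
            try:
                with_retry(lambda: self._write_fn(record), sleep=self._sleep)
            except TransientDbError:
                self.failed.append(record)  # surfaced once, not per attempt
            finally:
                self._pending.task_done()
```

A real application would also persist the queue to disk so that pending writes survive a restart; this sketch keeps them in memory only.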

Even though async behaviour, queues, caching, copying the database to each local instance and so on will help with the symptoms, the underlying problem will still remain. If the network really is that bad, then I'd raise it with their I.T. department or the project manager, and build some performance requirement from their side of things into the contract.

Related

load balancing webserver / IIS / MSSQL - many clients

I have to say I'm not an administrator of any sort and have never needed to distribute load on a server before, but now I'm in a situation where I can see that I might have a problem.
This is the scenario and my problem:
I have IIS running on a server with MSSQL. A client can send off a request that retrieves a data package with a single request to the MSSQL database; that data is then sent back to the client.
This package of data can vary in length, but is generally <10 MB.
This is all working fine, but I'm now facing a what-if: if I have 10,000 clients pounding on the server simultaneously, I can see my bandwidth probably getting smashed, and I also imagine that both IIS and MSSQL will be dying of exhaustion.
So my question is: I guess the bandwidth issue is only about hosting? But how can I distribute this so that IIS and MSSQL can perform without being exhausted?
I'd really appreciate an explanation of how this can be achieved. It's probably standard knowledge, but for me it's a bit of a mystery. I know it can be done when I look at Dropbox and the like; the big question is just how I can do it.
Thanks a lot.
You will need to consider some form of Load Balancing. Since you are using IIS, I'm assuming that you are hosting on Windows Server, which provides a software based Network Load Balancer. See Network Load Balancing Overview
You need to identify the performance bottlenecks, then plan to reduce them. A sledgehammer approach here might not be the best idea.
Set up performance counters and record a day or two's worth of data. See this link on how to do SQL Server performance troubleshooting.
The bandwidth might be just one of the problems. By setting up performance counters and doing an analysis of what is actually happening, you will be able to plan a better solution with the right data.

Profiling and output caching in ASP.NET MVC

So I was recently hired by a big department of a Fortune 50 company, straight out of college. I'll be supporting a brand new ASP.NET MVC app - over a million lines of code written by contractors over 4 years. The system works great with up to 3 or 4 simultaneous requests, but becomes very slow with more. It's supposed to go live in 2 weeks ... I'm looking for practical advice on how to drastically improve the scalability.
The advice I was given in Uni is to always run a profiler first. I've already secured a sizeable tools budget with my manager, so price wouldn't be a problem. What is a good or even the best profiler for ASP.NET MVC?
I'm also looking at adding caching. There is currently no second-level or query cache configured for NHibernate. My current thinking is to use Redis for that purpose. I'm also looking at output caching, but unfortunately the majority of the users will log in to the site. Is there a way to still cache parts of the pages served by MVC?
Do you have any monitoring or instrumentation set up for the application? If not, I would highly recommend starting there. I've been using New Relic for a few years with ASP.NET apps and have been very happy with it.
Right off the bat you get a nice graph of request response times broken down into three kinds of tasks that contribute to the response time:
.NET CLR - Time spent running .NET code
Database - Time spent waiting on SQL requests
Request Queue - Time spent waiting for application workers to become available
It also breaks down performance by MVC action so you can see which ones are the slowest. You also get a breakdown of performance per database query. I've used this many times to detect procedures that were way too slow for heavy production loads.
If you want to, you can have New Relic add some unobtrusive JavaScript to your page that allows you to instrument browser load times. This helps you figure out things like "my users outside North America spend on average 500ms loading images. I need to move my images to a CDN!"
I would highly recommend you use some instrumentation software like this. It will definitely get you pointed in the right direction and help you keep your app available and healthy.
Profiler is a handy tool to watch how apps communicate with your database and debug odd behaviour. It's not a long-term solution for performance instrumentation given that it puts a load on your server and the results require quite a bit of laborious processing and digestion to paint a clear picture for you.
Random thought: check your application pool configuration and keep an eye out in the event log for too many recycling events. When an application pool recycles, it takes a long time to become responsive again. It's just one of those things that can kill performance, and you can rip your hair out trying to track it down. Improper recycling settings bit me recently, so that's why I mention it.
For NHibernate analysis (session queries, caching, execution time) you could use HibernatingRhinos Profiler. It's developed by the guys that developed NHibernate, so you know it will work really well with it.
Here is the URL for it:
http://hibernatingrhinos.com/products/nhprof
You could give it a try and decide if it helps you or not.

Central data management for custom desktop applications

I have a background in web programming, where both the data and the code live on the server. Web hosts with MySQL or the like are plentiful and cheap, so using the application from multiple PCs was never a problem.
However, I'm considering switching to building desktop applications, and the only factor that annoys me is the syncing of data across the many PCs I use. I was thinking of perhaps setting up a light Amazon EC2 instance with PostgreSQL on it and having my desktop applications use that.
I have a few questions:
1. I'm curious as to what latency I might expect from running the database on EC2 instead of on the local network; any experience or insight is appreciated.
2. Are there better/more obvious/cheaper solutions?
3. I've looked at the pricing and it seems to come down to $24.48 per month on a yearly contract. Whilst not really expensive, it is not exactly cheap either. At what point does it become more interesting to run a local server?
4. I'm obviously not using my applications for large parts of the day (sleep, work, ...). I was wondering if I can have the Amazon server go into a sort of "sleep" mode and wake up when poked. An initial delay for the first desktop application is acceptable. The reason behind this behavior would be to save money on the instance if it is only actually needed for 10% of the day.
I welcome any feedback at all on how this problem is best tackled.
1. This could get ugly. Every single query you do will have latency associated with it. If you have a lot of queries, this can add up very fast. So keep your query count low, and try to pre-fetch and cache data when possible.
2. Not enough information to answer that question.
3. Depends on the cost of your local server. Keep in mind that you will need to pay for electricity to keep it on.
4. You can stop your instance when you don't need it. With the exception of high-utilization reservations, you won't get billed while it's in the stopped state. With high-utilization reservations you will still pay the full cost.
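To put rough numbers on the latency point in the first answer, here is a small Python sketch. The 80 ms round trip is purely an assumed figure for illustration (measure your own), and `fetch_fn` is a hypothetical function that runs one remote query. The first function shows how sequential queries add up; the class shows one simple way to cache lookups so that repeated reads skip the network.

```python
import time


def total_latency_ms(query_count, round_trip_ms):
    """Lower bound on time spent waiting when queries run one after another."""
    return query_count * round_trip_ms


class CachedLookup:
    """Caches the results of a slow remote fetch for ttl_seconds."""

    def __init__(self, fetch_fn, ttl_seconds=300.0, clock=time.monotonic):
        self._fetch_fn = fetch_fn  # hypothetical: performs one remote query
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # served locally, no network round trip
        value = self._fetch_fn(key)  # cache miss or expired entry: go remote
        self._entries[key] = (now + self._ttl, value)
        return value


# 200 sequential queries at an assumed 80 ms each: 16 seconds of pure waiting.
print(total_latency_ms(200, 80))
```

The time-to-live is the trade-off knob: a longer value saves more round trips but lets the desktop application show staler data.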

.NET CF mobile device application - best methodology to handle potential offline-ness?

I'm building a mobile application in VB.NET (Compact Framework), and I'm wondering what the best way is to approach potential offline interactions on the device. Basically, the devices have cellular and 802.11, but may still be offline (where there's poor reception, etc). A driver will scan boxes as they leave his truck, and I want to update the new location: immediately if there's network signal, or queued and handled later if it's offline. It made me think, though, about how to handle offline-ness in general.
Do I cache as much data to the device as I can so that I use it if it's offline - Essentially, each device would have a copy of the (relevant) production data on it? Or is it better to disable certain functionality when it's offline, so as to avoid the headache of synchronization later? I know this is a pretty specific question that depends on my app, but I'm curious to see if others have taken this route.
Do I build the application itself to act as though it's always offline, submitting everything to a local queue of sorts that's owned by a local class (essentially abstracting away the online/offline thing), and then have the class submit things to the server as it can? What about data lookups - how can those be handled in a "Semi-live" fashion?
Or should I have the application attempt to submit requests to the server directly, in real time, and handle it if the request itself fails? I can see a potential problem of making the user wait for the timeout, but is this the most reliable way to do it?
I'm not looking for a specific solution, but really just stories of how developers accomplish this with the smoothest user experience possible, with a link to a how-to or heres-what-to-consider or something like that. Thanks for your pointers on this!
We can't give you a definitive answer because there is no "right" answer that fits all usage scenarios. For example if you're using SQL Server on the back end and SQL CE locally, you could always set up merge replication and have the data engine handle all of this for you. That's pretty clean. Using the offline application block might solve it. Using store and forward might be an option.
You could store locally and then roll your own synchronization, with a direct connection, web service or WCF service used when a network is detected. You could use MSMQ for delivery.
What you have to think about is not what the "right" way is, but how your implementation will affect application usability. If you disable features due to lack of connectivity, is the app still usable? If you have stale data, is that a problem? Maybe some critical data needs to be transferred when you have GSM/GPRS (which typically isn't free) and more would be done when you have 802.11. Maybe you can run all day with lookup tables pulled down in the morning and upload only transactions, with the device tracking what changes it's made.
Basically it really depends on how it's used, the nature of the data, the importance of data transactions between fielded devices, the effect of data latency, and probably other factors I can't think of offhand.
So the first step is to determine how the app needs to be used, then determine the infrastructure and architecture to provide the connectivity and data access required.
I haven't used it myself, but have you looked into the "store and forward" capabilities of the CF? It may suit your needs. I believe it uses an Exchange mailbox as a message queue to send SOAP packets to and from the device.
The best way to approach this is to always work offline, then use message queues to handle sending changes to and from the device. When the driver marks something as delivered, for example, update the item as delivered in your local store and also place a message in an outgoing queue to tell the server it's been delivered. When the connection is up, send any queued items back to the server and get any messages that have been queued up from the server.
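The "always work offline, then queue" approach from this answer can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python with SQLite rather than VB.NET with SQL CE, the table names and the `send_fn` callback are made up, and it covers only the outgoing direction (device to server). The key point is that the local update and the outgoing message are written in one transaction, so neither can exist without the other.

```python
import json
import sqlite3


class OfflineStore:
    """Local store plus an outbox of messages waiting for connectivity."""

    def __init__(self, path=":memory:"):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS items (id TEXT PRIMARY KEY, status TEXT)")
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS outbox "
            "(seq INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT)")

    def mark_delivered(self, item_id):
        """Update the local copy and queue a message, atomically."""
        message = json.dumps({"type": "delivered", "item_id": item_id})
        with self._db:  # commits both statements together, or neither
            self._db.execute(
                "INSERT OR REPLACE INTO items (id, status) VALUES (?, ?)",
                (item_id, "delivered"))
            self._db.execute(
                "INSERT INTO outbox (payload) VALUES (?)", (message,))

    def status(self, item_id):
        row = self._db.execute(
            "SELECT status FROM items WHERE id = ?", (item_id,)).fetchone()
        return row[0] if row else None

    def pending(self):
        return self._db.execute("SELECT COUNT(*) FROM outbox").fetchone()[0]

    def flush(self, send_fn):
        """Send queued messages oldest first; stop at the first failure."""
        sent = 0
        rows = self._db.execute(
            "SELECT seq, payload FROM outbox ORDER BY seq").fetchall()
        for seq, payload in rows:
            try:
                send_fn(json.loads(payload))  # hypothetical network call
            except OSError:
                break  # still offline; keep this and later messages queued
            with self._db:
                self._db.execute("DELETE FROM outbox WHERE seq = ?", (seq,))
            sent += 1
        return sent
```

Because a message is deleted only after `send_fn` returns, a crash between sending and deleting can deliver the same message twice, so the server side should treat messages as idempotent.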

Decision making in distributed applications

With a distributed application, where you have lots of clients and one main server, should you:
Make the clients dumb and the server smart: clients are fast and non-invasive. Business rules are needed in only 1 place
Make the clients smart and the server dumb: take as much load as possible off of the server
Additional info:
Clients collect tons of data about the computer they are on. The server must analyze all of this info to determine the health of these computers
The owners of the client computers are temperamental and will shut down the clients if the client starts to consume too many resources (thus negating the purpose of the distributed app in helping diagnose problems)
You should do as much client-side processing as possible. This will enable your application to scale better than doing processing server-side. To solve your temperamental user problem, you could look into making your client processes run at a very low priority so there's no noticeable decrease in performance on the part of the user.
In a client-server setting, if you care about security, you should always program on the assumption that the client may have been compromised. Even if it hasn't, there is always the risk of somebody using an old version of the client, using a competing or modified version of the client, or just of the net connection being a bit screwy.
So while you do as much work on the client as possible, processing and marshalling information into the right form, the server then needs to do a thorough sanity check on anything the client gives it.
So the answer I guess is "both".
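As an illustration of the server-side sanity check described in this answer, here is a minimal Python sketch. The report fields (`client_id`, `cpu_percent`, `free_disk_mb`) and their limits are invented for the example; the question does not say what the clients actually collect. The idea is simply that the server treats every incoming report as untrusted and rejects anything malformed before analyzing it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HealthReport:
    """A validated report; field names are assumptions for this example."""
    client_id: str
    cpu_percent: float
    free_disk_mb: int


def parse_report(raw):
    """Validate an untrusted, already-decoded report; raise ValueError if bad."""
    if not isinstance(raw, dict):
        raise ValueError("report must be an object")

    client_id = raw.get("client_id")
    if not isinstance(client_id, str) or not 1 <= len(client_id) <= 64:
        raise ValueError("client_id must be a string of 1-64 characters")

    cpu = raw.get("cpu_percent")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(cpu, bool) or not isinstance(cpu, (int, float)):
        raise ValueError("cpu_percent must be a number")
    if not 0 <= cpu <= 100:  # also rejects NaN, which fails every comparison
        raise ValueError("cpu_percent must be between 0 and 100")

    disk = raw.get("free_disk_mb")
    if isinstance(disk, bool) or not isinstance(disk, int) or disk < 0:
        raise ValueError("free_disk_mb must be a non-negative integer")

    return HealthReport(client_id, float(cpu), disk)
```

The client can still do the heavy processing and marshalling; the server only has to check that the result is well-formed and plausible before trusting it.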
"The server must analyze all of this info to determine the health of these computers"
That is probably the biggest clue so far explaining what your application is about. Are you able to provide a more elaborate briefing on what this application is seeking to achieve in this distributed environment? We do not even know whether the client-side processing is disk-I/O or processor intensive. How you design the solution depends on the nature of what needs to be done to help the users/business accomplish their jobs and objectives.

Resources