Central data management for custom desktop applications - database

I have a background in web programming, where both the data and the code live on the server. Web hosts with MySQL or the like are plentiful and cheap, so using an application from multiple PCs was never a problem.
However, I'm considering switching to building desktop applications, and the one factor that bothers me is syncing data across the many PCs I use. I was thinking of setting up a light Amazon EC2 instance with PostgreSQL on it and having my desktop applications use that.
I have a few questions:
I'm curious what latency I might expect from running the database on EC2 instead of on the local network; any experience or insight is appreciated.
Are there better/more obvious/cheaper solutions?
I've looked at the pricing and it seems to come down to $24.48 per month on a yearly contract. While not really expensive, it is not exactly cheap either. At what point does it become more attractive to run a local server?
I'm obviously not using my applications for large parts of the day (sleep, work, ...). I was wondering if I can have the Amazon server go into a sort of "sleep" mode and wake up when poked. An initial delay for the first desktop application is acceptable. The reason behind this behavior would be to save money on the instance if it is only actually needed for 10% of the day.
I welcome any feedback at all on how this problem is best tackled.

This could get ugly. Every single query you run will have that latency attached to it. If you issue a lot of queries, it adds up very fast. So keep your query count low, and try to pre-fetch and cache data when possible.
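To illustrate the pre-fetch-and-cache advice, here is a minimal read-through cache sketch in C# (assuming a .NET desktop client; CountryCache and loadFromDatabase are invented names standing in for your own data-access code):
    using System;
    using System.Collections.Generic;

    // Minimal read-through cache for rarely changing reference data, so the
    // desktop app pays the EC2 round-trip once instead of on every lookup.
    public class CountryCache
    {
        private List<string> _countries;
        private DateTime _fetchedAt;
        private static readonly TimeSpan Ttl = TimeSpan.FromMinutes(10);

        public List<string> GetCountries(Func<List<string>> loadFromDatabase)
        {
            // Serve from memory while the cached copy is still fresh.
            if (_countries != null && DateTime.UtcNow - _fetchedAt < Ttl)
                return _countries;

            // Otherwise pay the latency once and remember the result.
            _countries = loadFromDatabase();
            _fetchedAt = DateTime.UtcNow;
            return _countries;
        }
    }
The delegate would run the actual SELECT against the remote database; the same idea applies to any lookup data that changes rarely.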
Not enough information to answer that question.
Depends on the cost of your local server. Keep in mind that you will need to pay for electricity to keep it on.
You can stop your instance when you are not using it. With the exception of high-utilization reservations, you won't get billed while it is in the stopped state; with high-utilization reservations you still pay the full cost.

Related

Profiling and output caching in ASP.NET MVC

So I was recently hired by a big department of a Fortune 50 company, straight out of college. I'll be supporting a brand new ASP.NET MVC app - over a million lines of code written by contractors over 4 years. The system works great with up to 3 or 4 simultaneous requests, but becomes very slow with more. It's supposed to go live in 2 weeks ... I'm looking for practical advice on how to drastically improve the scalability.
The advice I was given in Uni is to always run a profiler first. I've already secured a sizeable tools budget with my manager, so price wouldn't be a problem. What is a good or even the best profiler for ASP.NET MVC?
I'm also looking at adding caching. There is currently no second-level or query cache configured for NHibernate. My current thinking is to use Redis for that purpose. I'm also looking at output caching, but unfortunately the majority of the users will log in to the site. Is there a way to still cache parts of the pages served by MVC?
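One common technique for caching fragments of pages that are otherwise rendered per-user is child-action output caching in ASP.NET MVC 3+; a rough sketch, with invented controller and view names:
    using System.Web.Mvc;

    public class NavigationController : Controller
    {
        // ChildActionOnly prevents direct requests; OutputCache keeps the rendered
        // partial for 10 minutes, so the lookup behind the menu runs only occasionally
        // even though the page around it is rendered per-user.
        [ChildActionOnly]
        [OutputCache(Duration = 600, VaryByParam = "none")]
        public ActionResult CategoryMenu()
        {
            var categories = new[] { "Books", "Music" }; // stand-in for the real query
            return PartialView("_CategoryMenu", categories);
        }
    }
The layout would render it with Html.Action("CategoryMenu", "Navigation"), while the user-specific parts of the page stay uncached.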
Do you have any monitoring or instrumentation set up for the application? If not, I would highly recommend starting there. I've been using New Relic for a few years with ASP.NET apps and have been very happy with it.
Right off the bat you get a nice graph of request response times, broken down into the three kinds of task that contribute to the response time:
.NET CLR - Time spent running .NET code
Database - Time spent waiting on SQL requests
Request Queue - Time spent waiting for application workers to become available
It also breaks down performance by MVC action so you can see which ones are the slowest. You also get a breakdown of performance per database query. I've used this many times to detect procedures that were way too slow for heavy production loads.
If you want to, you can have New Relic add some unobtrusive Javascript to your page that allows you to instrument browser load times. This helps you figure things out like "my users outside North America spend on average 500ms loading images. I need to move my images to a CDN!"
I would highly recommend you use some instrumentation software like this. It will definitely get you pointed in the right direction and help you keep your app available and healthy.
Profiler is a handy tool for watching how apps communicate with your database and for debugging odd behaviour. It's not a long-term solution for performance instrumentation, given that it puts load on your server and the results take quite a bit of laborious processing and digestion to paint a clear picture for you.
Random thought: check your application pool configuration and keep an eye out in the event log for too many recycling events. When an application pool recycles, it takes a long time to become responsive again. It's just one of those things that can kill performance, and you can rip your hair out trying to track it down. Improper recycling settings bit me recently, which is why I mention it.
For NHibernate analysis (session queries, caching, execution time) you could use the HibernatingRhinos NHibernate Profiler. It's developed by the people who develop NHibernate, so you know it will work really well with it.
Here is the URL for it:
http://hibernatingrhinos.com/products/nhprof
You could give it a try and decide if it helps you or not.

How to create a reliable mobile service

I have developed a mobile application which makes extensive use of web services. It connects to my shared hosting server to get real-time information, so making sure the server is up is extremely important; otherwise I am going to lose customers.
Some background: I changed no fewer than 3 hosting providers because they were not very reliable in terms of uptime. My current hosting is way better than those previous three. I have used it for over a year now, they have a 99.9% uptime guarantee and all, but today I had about 3 hours of downtime, which is why I am creating this post.
Not all of us small developers can afford expensive dedicated hosting, or have our own servers at home (which is no guarantee it will never be down). In my case, shared hosting at a very reasonable $10-15/month is fine, except for those few hours it might be down.
One idea I have to deal with this is the following: get a second (different) shared hosting account with another provider, and make the app default to this second host when my primary host is down. It's very unlikely that both will be down at the same time. I would pay only a few dollars extra per month for this, not 10 times more per month as I would for dedicated hosting.
I am sure I am not the first person in this situation. Has anyone found a good way to deal with this problem that doesn't require deep pockets? We are, after all, talking only about short periods of downtime on the primary server.
Thanks in advance for your suggestions.
If you are relying on a third-party host and don't want to pay for greater reliability, then a second server is the way to go. Depending on your application and budget, you will also need to consider:
Database access and synchronization
Hosts in different physical locations
Multiple domain names and/or load balancing
If you opt to use multiple hosts and switch to a backup host when the first fails, then you should aim to have both (all) of them in use at all times. This way you won't get caught out having to switch over to a "backup" server, and by always using both (all) you can be sure they are up to date and working.
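A sketch of what the client-side fallback might look like, assuming a .NET client and placeholder URLs (the same pattern works on any platform):
    using System;
    using System.Net;

    // Try the primary host first and fall back to the secondary on failure.
    // Both hosts are assumed to expose the same API; the URLs are placeholders.
    public static class ServiceClient
    {
        private static readonly string[] Hosts =
        {
            "https://api-primary.example.com",
            "https://api-backup.example.com"
        };

        public static string Get(string path)
        {
            Exception lastError = null;
            foreach (var host in Hosts)
            {
                try
                {
                    using (var client = new WebClient())
                    {
                        return client.DownloadString(host + path);
                    }
                }
                catch (WebException ex)
                {
                    lastError = ex; // remember the failure and try the next host
                }
            }
            throw new ApplicationException("All hosts failed", lastError);
        }
    }
Writes are the harder part: if both hosts accept writes you need a plan for synchronizing the databases afterwards, which is another reason to keep both in active, tested use.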
If your service is so critical that a couple of hours down time would be unacceptable to your users, then it should be easy to get the users to pay for that kind of reliability. This could fund hosting with a provider who can provide a greater level of up time or a second site. This will also help fund the time and effort to set all this up. ;)

How to gear towards scalability for a start up e-commerce portal?

I want to scale an e-commerce portal based on LAMP. Recently we've seen a huge traffic surge.
What would be the steps (please mention them in order) in scaling it:
Should I consider moving to Amazon EC2 or similar? What could be the potential problems in switching servers?
Do we need to redesign the database? I read that Facebook switched from MySQL to Cassandra. What kind of code changes are required if we switch to Cassandra? Would Cassandra be a better option than MySQL?
Is Hadoop a possibility? I'm not even sure.
Any other things which need to be thought of?
I found this post helpful, and this blog has nice articles as well. What I want to know is the list of steps I should consider in scaling this app.
First, I would suggest making sure every resource served by your server sets appropriate cache control headers. The goal is to make sure truly dynamic content gets served fresh every time and any stable or static content gets served from somebody else's cache as much as possible. Why deliver a product image to every AOL customer when you can deliver it to the first and let AOL deliver it to all the others?
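For illustration, a static asset served with headers along these lines (the values are only examples) can be re-served by browser and proxy caches without touching your server again:
    HTTP/1.1 200 OK
    Content-Type: image/jpeg
    Cache-Control: public, max-age=86400
    Expires: Thu, 15 Dec 2011 10:00:00 GMT
    Last-Modified: Wed, 14 Dec 2011 10:00:00 GMT
Truly dynamic pages, by contrast, should send Cache-Control: private (or no-cache) so intermediaries don't serve stale or user-specific content to others.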
If you currently run your webserver and dbms on the same box, you can look into moving the dbms onto a dedicated database server.
Once you have done the above, you need to start measuring the specifics. What resource will hit its capacity first?
For example, if the webserver is running at or near capacity while the database server sits mostly idle, it makes no sense to switch databases or to implement replication etc.
If the webserver sits mostly idle while the dbms chugs away constantly, it makes no sense to look into switching to a cluster of load-balanced webservers.
Take care of the simple things first.
If the dbms is the likely bottle-neck, make sure your database has the right indexes so that it gets fast access times during lookup and doesn't waste unnecessary time during updates. Make sure the dbms logs to a different physical medium from the tables themselves. Make sure the application isn't issuing any wasteful queries etc. Make sure you do not run any expensive analytical queries against your transactional database.
If the webserver is the likely bottle-neck, profile it to see where it spends most of its time and reduce the work by changing your application or implementing new caching strategies etc. Make sure you are not doing anything that will prevent you from moving from a single server to multiple servers with a load balancer.
If you have taken care of the above, you will be much better prepared for making the move to multiple webservers or database servers. You will be much better informed for deciding whether to scale your database with replication or to switch to a completely different data model etc.
1) First, measure how many requests per second your server can handle for your most-visited pages. For well-written PHP sites on average hardware it should be in the 200-400 requests per second range. If you are not there, you have to optimize the code by reducing the number of database requests, caching rarely changed data in memcached/shared memory, and using a PHP accelerator. If you are at some 10-20 requests per second, you need to get rid of your bulky framework.
2) Second, if you are still on Apache2, switch to lighttpd or nginx+apache2. Personally, I like the second option.
3) Then move all your static data to a separate server or CDN. Make sure it is served with "Expires" headers of at least 24 hours.
4) Only after all these things should you start thinking about going to EC2/Hadoop, building multiple servers and balancing the load (nginx would also help you there).
After steps 1-3 you should be able to serve some 10,000,000 hits per day easily.
If you need just 1.5-3 times more, I would go for a single more powerful server (8-16 cores, lots of RAM for caching and the database).
With step 4 and multiple servers you are on your way to 0.1-1 billion hits per day (but with significantly larger hardware and support expenses).
Find out where issues are happening (or are likely to happen if you don't have them now). Knowing what your biggest resource usage is matters when evaluating any solution. Stick to the solutions that will give you the biggest improvement.
Consider:
- Higher-than-needed bandwidth use per user is something you want to address regardless of moving to EC2. It will cost you money either way, so it's worth taking a look at things like this: http://developer.yahoo.com/yslow/
- Don't invest in changing databases if that's a non-issue. Find out first whether that's really the problem; even if you are having issues with the database, it might be a code issue, i.e. hitting the database many times per request.
- Unless we are talking about very big numbers, you shouldn't have high CPU usage issues. If you do, find out where they are happening; optimization is worth it where specific code has a high impact on your overall resource usage.
- After making sure the above is reasonable, you might get big improvements from caching: in bandwidth (making sure browsers/proxies can play their part in caching) and in local resource usage (avoiding re-processing/re-retrieving the same information all the time).
I'm not saying you should go all out with the above, just far enough to make sure you won't hit the same issues elsewhere within a very few months, and far enough to find out where your biggest gains are and whether you will get enough value from any of the scaling options. This will also let you come back and ask questions about specific problems, and about how those scaling options relate to them.
You should prepare by choosing a flexible framework and accepting that things are going to change along the way. In some situations it's difficult to predict your users' behavior.
If you have seen an explosion of traffic recently, analyze which pages are the slowest.
You can move to the cloud, but EC2 is not the best-performing option. Again, make sure there's no other optimization you can do first.
The database might need redesigning, but I doubt all of it does. Again, look at the problem points.
Both Hadoop and Cassandra are pretty nifty, but they might be overkill.

Calculate server requirements based on programming specs

Have you ever encountered something that was easy to develop but made you stop for a while to think about the server requirements for your project? That's my case.
I want to compete with a gaming site that has multiplayer Flash games like poker, rummy, backgammon, and other card games, 8 games in total. For each game they have rooms and tables.
I'll use Silverlight with sockets. I have already managed to develop the policy server, the socket server app using WinForms, and the client socket app in Silverlight. I own a VPS for tests, so there is no problem developing what I want; the problem is how to calculate server requirements (RAM, bandwidth, internet speed) based on the following:
The server should support 24,000 users/day, or 1,000 users/hour
Each game room should have its own tables where users can play
Users should not lose scores, and game speed should be fast in general
I just wonder how to handle the following situation: if 1,000 users are connected through socket connections to a room full of tables and one user leaves a table, all 1,000 users must be updated and the UI should reflect the change. Let's say I update the clients by sending a small message of 100 bytes to each user; this will eat 100 bytes * 1,000 users = 100 KB, and that's just for one UI change, for one game and one room, not counting all my other games and rooms. Also, 1,000 iterations that send bytes to clients could be very time-consuming. I am a developer, but not experienced in these situations. Please advise; numbers would be great.
Until you've built -- and optimized -- your actual applications, you cannot predict much about the hardware required for some level of performance.
You have to finish the apps first. Then you can measure their performance under load. Then you can decide how much to spend on what levels of performance.
The best answer I can offer you is to run stress tests and see how much load a single server can support. While running those tests, monitor memory, IO, CPU and disk activity (if relevant) to understand which resource is running out first.
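A crude load test can be as simple as opening many concurrent connections and timing the round-trips while you watch the server's counters. The sketch below assumes a plain TCP request/reply endpoint; the host, port and payload are placeholders for your real game protocol:
    using System;
    using System.Diagnostics;
    using System.Net.Sockets;
    using System.Threading.Tasks;

    // Crude load generator: N concurrent clients each send one 100-byte message
    // and wait for a reply. Watch CPU, memory and network on the server while
    // this runs, then raise N until something saturates.
    class LoadTest
    {
        static void Main()
        {
            const int clients = 1000;
            var stopwatch = Stopwatch.StartNew();
            var tasks = new Task[clients];
            for (int i = 0; i < clients; i++)
            {
                tasks[i] = Task.Run(() =>
                {
                    var payload = new byte[100];
                    using (var client = new TcpClient("game-server.example.com", 4530))
                    {
                        var stream = client.GetStream();
                        stream.Write(payload, 0, payload.Length);
                        stream.Read(payload, 0, payload.Length); // wait for the server's reply
                    }
                });
            }
            Task.WaitAll(tasks);
            Console.WriteLine("{0} round-trips in {1} ms", clients, stopwatch.ElapsedMilliseconds);
        }
    }
Dedicated tools can generate heavier and more realistic load, but even a throwaway harness like this will show you which resource gives out first.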
We deploy our applications on Amazon's EC2 cloud infrastructure. That lets us easily (within minutes) add or remove capacity as needed. Perhaps it's worth considering for your situation.
Always follow these two rules
“The First Rule of Program Optimization: Don't do it. The Second Rule of Program Optimization (for experts only!): Don't do it yet.” - Michael A. Jackson
First of all, you should think more about how and when to send which information to which clients. Not every client needs to be informed about every table change.
There is only so much information that a client needs, and you need to decide when and how it will be transmitted. You should also pack the information into meaningful packets. What's happening at a table is only interesting to that table.
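A sketch of that idea, assuming the WinForms socket server mentioned in the question; the types and locking here are purely illustrative:
    using System.Collections.Generic;
    using System.Net.Sockets;

    // Keep a subscriber list per table and notify only the clients sitting at
    // (or watching) that table, instead of all 1,000 users in the room.
    public class GameTable
    {
        private readonly List<TcpClient> _subscribers = new List<TcpClient>();

        public void Join(TcpClient client)  { lock (_subscribers) { _subscribers.Add(client); } }
        public void Leave(TcpClient client) { lock (_subscribers) { _subscribers.Remove(client); } }

        public void Broadcast(byte[] message)
        {
            lock (_subscribers)
            {
                foreach (var client in _subscribers)
                {
                    // Error handling and disconnect cleanup omitted for brevity.
                    client.GetStream().Write(message, 0, message.Length);
                }
            }
        }
    }
Room-level events (a table filling up or closing) can then be sent far less often, or batched, instead of broadcasting every seat change to all 1,000 users.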
You also need to profile your application to make sure you know what resources it consumes. Card games should not eat up that many resources. But the important point is to FIRST build it, and when you HAVE a bottleneck, then try to fix it.
It's very difficult to guess at these things at this point.
From a pragmatic standpoint, you may eventually want to look into a) a cloud-hosting type service for better bandwidth price-scaling for you, or b) a very experienced full-service hosting company that can help you calculate your needs based on prior experience.
Disclaimer: I work for Rackspace Hosting which provides both of the above.

Rich database frontend - how to correctly handle low quality networks?

I have very limited experience of database programming, and my applications that access databases have been simple ones :). Until now :(. I need to create a medium-sized desktop application (is that called a rich client?) that will use a database on the network to share data between multiple users. Most probably I will use C# and MSSQL/MySQL/SQLite.
I have performed a few test runs and discovered that on low-quality networks database access is not so smooth. On one company's LAN a lot of data is transferred over the network and the servers are under constant load, so it's a common situation that a simple INSERT or SELECT SQL query will take 1-2 minutes or even fail with a timeout / network error.
Are there any best practices for handling such situations? Of course I can split my app into a GUI thread and a DB thread so network problems don't lead to a frozen GUI. But what should I do about frequent network errors? Displaying them to the user too often would not be very good :(. I'm thinking about automatically creating a local copy of the database on each computer my app runs on: update the local database first, synchronize it in the background, and simply retry on network errors. This would allow the app to function even if the network has big lags / problems.
Any hints or buzzwords for what I can look into? Maybe there are established best practices that I don't know about :)
Sorry, this is probably not the answer you are looking for, but you mention that a simple INSERT/UPDATE could take 1-2 minutes or even fail with a timeout / network error.
To me this sounds like there may be a problem other than the network itself. If you're working on a corporate network, there would have to be insane levels of traffic for this sort of behavior. I would do everything in your power to improve the network before proceeding. Can you post the result of a ping to the database box?
If you're going to architect your application around this type of network, it will significantly alter the end product and possibly even result in a poorer-quality product for other clients.
Depending upon the nature of the application, maybe look at implementing an async persistence queue and caching data on startup, or even embedding a copy of the database in your application.
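A minimal sketch of such an async persistence queue, assuming a C# client; the names and the retry interval are just placeholders:
    using System;
    using System.Collections.Concurrent;
    using System.Threading;

    // The GUI thread enqueues database work and returns immediately; a background
    // worker keeps retrying each item until the server is reachable again.
    public class PersistenceQueue
    {
        private readonly BlockingCollection<Action> _pending = new BlockingCollection<Action>();

        public PersistenceQueue()
        {
            var worker = new Thread(ProcessQueue) { IsBackground = true };
            worker.Start();
        }

        public void Enqueue(Action databaseWork)
        {
            _pending.Add(databaseWork); // e.g. () => ExecuteInsert(row)
        }

        private void ProcessQueue()
        {
            foreach (var work in _pending.GetConsumingEnumerable())
            {
                while (true)
                {
                    try
                    {
                        work(); // the actual INSERT/UPDATE against the remote server
                        break;
                    }
                    catch (Exception) // timeout or network error
                    {
                        Thread.Sleep(TimeSpan.FromSeconds(30)); // back off, then retry
                    }
                }
            }
        }
    }
Reads are the trickier half: for those, a local cache or embedded copy of the data, as suggested above, is usually what keeps the UI usable while the network misbehaves.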
Even though async behaviour, queues, caching, and copying the database to each local instance will help ease the symptoms, the underlying problem will remain. If the network really is that bad, I'd raise it with their IT department or the project manager and build some performance requirements for their side of things into the contract.
