Updating the production IIS WebSite and SQL Server Database without stopping - sql-server

There is a testing server that uses the testing database. We test the website on the testing server. If it is okay, we update the website and the database schema from the testing server to the production server. But this method is very painful and risky.
First, we have to redirect the users to a maintenance page, so the website is paused for a while.
Second, if something goes wrong when updating, we have to back to old website, because we can't put the website in a maintenance mode for a long time.
So I'm seeking a solid solution to update an IIS website and an Sql Server Database without data loss and using maintenance mode. Is there any way to do this? How the big websites do this without data loss and pausing.
We've thought using a release candidate website. We've planned to use this RC website for temporary. First, we update the RC site, then swap the bindings between RC and production website. But this time the database is problem. Because we can change the database schema, and the old one can't work with new database. So, if we use a temp site with temp db, there will be data loss. Also, the updated website won't work with old database if the temp site uses old production database. So I need a solid and practical solution for this problem.

This is orders of magnitude more complicated than you imagine. This specifically is not about HA nor about contiguous integration. Neither of those will provide what you need, they're only pieces of the much more complex puzzle.
There simply isn't possible to write code changes in a manner that is transparent/oblivious to schema changes as they occur. At best you can write the code in a manner that supports the schema at v. N and at v. N+1, which in itself is a big challenge. But is impossible to write the code in a manner that supports the schema as it transitions from v. N to v. N+1. The schema change induced by a deployment has to be atomic for the code operating on the schema. Since the schema change itself cannot be atomic, it follow that the upgrade has two possible avenues:
take the code offline during the schema change. This is what you're doing now and is the safest approach. Of course, it implies service availability down time and runs the risks you already experienced (rollback of failed upgrade, lengthy upgrade, etc). A variant of this approach is to redirect the service to a read-only copy of the data and offer a degraded service experience(no changes are possible during the downtime) which may or may not be acceptable, depending on the business specifics.
standby upgrade. This implies that you take a snapshot of the service data (various HA solutions may provide a standby snapshot out-of-the-box, eg. log shipping). Upgrade the snapshot, then apply all the transactions that occurred on the real service data to the upgraded snapshot. This is always tricky, because it requires a technology to detect, capture and apply the changes (eg. change tracking, replication, custom solution etc) and requires to transform each change to the new, upgraded, schema. Once the upgraded schema is up to date with changes from the main service, the service can be redirected to the upgraded schema. This redirection is also much more complex than it sounds. For one choosing the moment when to cut-off the old schema and stop accepting new changes, while making sure all changes were applied to the new upgraded schema DB is a challenge in itself. Another challenge is to resolve the conflict of the code understanding pre-upgrade and post-upgrade schema versions. Developing code that handles both is, as I said, problematic and error prone, so one solution is to, again, take the service offline for a short period and replace the code. Another solution is to have a standby service, running code that handles the post-upgrade DB schema and is connected to the post-upgrade DB, and redirect the live requests to your standby, upgraded, service.
And I did not even touch the thorny subject of service interaction, when a particular service of a much larger deployed solution has to be upgraded. This is when service API protocol back compatibility plays the major role to allow the post-upgrade service to play along with its peer services.
Ultimately there just isn't any silver bullet. I've witnessed single machine large DB deployments that took weeks to roll out version N+1, with transcriptional replication contiguously feeding the post-upgrade DB schema with changes from the pre-upgrade DB. And I witnessed deployments of thousands of machines deploying version N+1 in stages, as a complicated dance of enabling code and data changes over the course of several days to reach the full functionality of the post-upgrade. This problem is just plain hard.

This is what Azure is fantastic for. The Azure Cloud platform allows for staging servers and production servers. You can set it up so once you commit your changes to Git or TFS it is pushed to either a staging or production server automatically. You can also set up to manually push changes. Most of the ORM libraries like Entity Framework have migration support.
There is a lot more information out there on this topic like:
Azure seamless upgrade when database schema changes
Staging or Production Instance?

Youre describing a high availability solution (HA). HA solutions are expensive and can quickly be overkill. you would need redundancy for your app and db servers, which means setting up a db cluster. All this increases the amount of time you would spend deploying changes, but the tradeoff is that your app is always available.
The main thing with deployment is having a repeatable process. So my best recommendation is to script or automate as much as possible.

Related

Flyway: how to prevent accidentally running migrations from developer on a central database?

When I debug against a (central) database, Flyway can update the database schema. This happens when my local application runs on a development branch which is further ahead than the deployed application.
Running the local application will then invoke migration scripts on the central database. In the worst case, this might update a production database of course.
Another scenario is one where 2 developers work against 1 development database with test-data. Both developers are working on different features and both are modifying the schema. When one developer updates the db, the other is (probably) confronted with a checksum issue, and otherwise surprised by the changes made.
I'm thinking about a solution to such problems.
Of course, there are some solutions outside of Flyway, like only connecting to throw-away databases, or preventing access to important db's.
I'm interested in which options Flyway offers.
There are some options, which seems to be useful in your cases.
For the first one, you can set the target property with the same version your DB has, to prevent Flyway to update it.
For the second case with 2 developers working simultaneously with the same DB instance, you can try to turn off the validateOnMigrate property to avoid validation failures or the ignoreMissingMigrations to ignore the migrations some of the developers doesn't have yet.
Here you can find all available via console properties for migration task. You didn't specify, haw exactly you run the Flyway, if it's done vie Spring Boot, then not all this properties are available, just some of them - you can find it here under the Flyway section.
But for most cases, I think, the best solution is to simply turn Flyway migrations off during the development and debugging if it's possible and use for the delivery of ready features.

How hard to migrate from IaaS to PaaS on Azure

So I'm thinking of dipping a toe in the Azure pool
Our web App Suite will soon be a pure ASP.Net + SQL Server affair
For various reasons it will be simpler to initially create a SQL VM and run everything from there initially.
How hard will it be to ...
...migrate SQL off the VM and into either "Cloud Services" or "Data Management"?
...migrate the suite of WebApps off the VM and into "Websites"?
It is my understanding that having achieved this migration, the OS level updates will no longer be my concern as they will be handled by the service. Thus at this point I'll be able to throw the original VM away :)
This isn't exactly answering your questions, but it might help educate you on more questions to ask and giving you a boost out of the gate. These were all lessons learned before, during, or after our migration of our systems to Azure. Now that we're up there, we have a ~50GB database with ~6 services running across ~30 instances. As long as our database backup behaves, total amount of effort in upgrading all of this is less than an hour (and could be much less if we didn't have many safeguards meant to force us to be aware of what's going on during the migration process - we don't want it to be too easy to deploy just to protect us from ourselves).
Preparing to migrate your system to Azure:
If you're planning to go to Azure, you first need to make sure your architectures and technologies are compatible. This isn't to say you have to code everything specific to Azure. This means some of the following things:
You should realize that "high availability" does not mean "error-free". In fact, high-availability environments usually have more errors that you have to handle and manage. For example, if you have a request going over the network to a server that just had a motherboard fry and was taken offline, that network request will be unsuccessful. That's not typically a problem you code for in "standard" server apps. To take it even further, what if that failed network is for a Database Connection that gets put back into a connection pool? That will cause that connection to be poisoned and broken the next time somebody pulls it out regardless of the future state of that server that went poof! There are just some extra things to worry about here because you're no longer depending on just 1 network with 4 servers on it but are now depending on hundreds of networks with thousands of servers on them. That 0.05% error scenario will happen MUCH more often to you than you've ever experienced in the past and you really have to be aware of this!
You should use dependency-injection to easily change things around. Proper separation of concerns will changes that seem very difficult become very easy in Azure.
You should use architectures for "high-availability". For example, a web application that would break when ran in a web farm would also break in Azure but a web application designed to work in a web farm would be very easy to run in Azure.
You should have automated deployments and configuration transforms for all of your applications. Anything else is just unsustainable unless it's nothing more than one little web site or something like that.
Depending on your needs, you can do it in phases. If database latency is something that isn't a big deal, perhaps a hybrid approach (over VPN from Azure to your data center) is acceptable to get your apps in Azure first while you later migrate your database. Or perhaps the opposite. What we did was keep primary apps and database in our data center but put secondary apps up in Azure first. Then some primary apps (that took a performance hit for a month until) later our database and critical apps. That final migration sure made for a very long weekend and not much sleep, but it is SOO much nicer now that we're done!
Migrating your applications to Azure:
This ultimately depends very heavily on what your application is or does, and every scenario has different steps/issues/benefits. I'm not going to cover this deeply other than to say, "Use Google, it's your friend!". Beyond that, for us, getting our applications up into Azure was the largest payoff when compared to our data. The ROI on our app migration was less than a month between hosting costs, licensing, and management effort. Instead of taking a couple days to setup a server, I can now take a day to setup an entirely new and duplicated environment of all of our SaaS applications/databases/etc and have them running on ~25 different Cloud instances!
Instead of trying to tell you how to migrate these, let me give you a few words of caution so you know sooner rather than later:
If you have app problems in Windows 2012, humor me and try it in Windows 2008 R2. There are a couple bugs in some of the 2012 images that they've prepared. It's incredibly trivial to switch back and forth!
Go make your logging 1000x's better than what it is now. If you don't do that now, you'll regret it.
Don't depend 100% on the easy-to-implement "Azure Logging". It works well enough but it more-or-less requires your applications to start successfully and is absolutely useless in debugging startup problems. If you don't have an alternative, then you will waste many, many hours just debugging stupid little problems when your app starts up. By the time you're done with it, you could have easily added 5 other logging frameworks and had an amazingly awesome logging system in place plus a running app instead of nothing but a running app to show for the same amount of time. Really, do this! (Profiling is a good idea as well, although Mini-Profiler has load-balancing issues if you have multiple instances.)
If you add new endpoints to your deployment (ports, etc), you cannot simply "Upgrade" an existing deployment. You must delete it (the deployment, not the service) and install from scratch. You can delete the Staging one, deploy to Staging, then swap.
If you have WCF apps, pretend you don't know about Windows Activation Services. They're disabled in Azure by default. You can either hack them to turn them on (startup scripts) or create your own self-hosting application. We self-host so we can more easily tweak service configurations once we're deployed (it's not easy to edit web.config files in a way that sticks in Azure). Web services work in "IIS" in Azure but TCP, named pipes, etc. do not.
Go learn about and add the Transient Errors Application Block (or something equivalent) to anything you communicate with. If you don't do it now, you'll regret it.
Go make your logging better! Really, really, REALLY do this!
Migrating your SQL Server Database to Azure:
Getting your database up into Azure is a bit of a painful process. There isn't a quick and easy way to just get it up there and making it work. Some people have to make some major changes while others just have to tweak a few ignorable things here and there. However, no matter how large or small your database is, you really REALLY must devote a lot of time to testing it. Test your migration process. Test your scripts to prepare your database. Test the performance and stability of the database up in the cloud. Test your backup procedures. Test your upgrade procedures. Test your backup restoration procedures. Test ALL of this because I guarantee you that you will find some surprises!
Schema:
Go learn about all of the limitations of SQL Azure. No Heaps, etc. Learn them before you start! Go learn them now! They're all mostly to very reasonable.
Be aware of the 2GB T-Log limitation! This means some very large indexes can never be rebuilt! (that said, our 30GB table isn't yet hitting this)
To deploy your schema, go into SSMS for your local db and use the "Tasks -> Extract Data-tier Application..." feature (it's in different areas of the menu in different versions of SSMS). Take this file and go into SSMS for your Azure database and use the "Deploy Data-tier Application" feature. (This will help you catch some of the Azure limitations you aren't honoring if this process fails.) This is, by far, the easiest way to get an empty version of your database up into Azure.
Use a tool like Redgate SQL Compare to verify your work (you'll have to tweak a couple options, like WITH NOCHECK to get a clean comparison).
You'll have to cleanup users, schemas, broken sprocs, etc. before you succeed at this. (this is a good thing!)
Data:
Go learn about all of the limitations of SQL Azure. Learn them before you start! Go learn them now! They're all mostly to very reasonable.
Go download the Azure Database Migration Wizard from Codeplex (or wherever the latest version is). It's not the most amazing software (kinda unstable) but even if it crashes once or twice on you, it'll still save you a LOT of time!
I strongly recommend RedGate's SQL Data Compare. The previously-mentioned migration wizard will help you identify problems (it's on you to fix those) and will get you ~98% migrated but you'll want to come back and clean up after it. It has some bugs that misses nullable BIT fields (and upper ascii characters) and some other things that a tool like SQL Data Compare can easily identify and fix. It can also give you the peace-of-mind that you can depend on your database.
If your database is large, consider spinning up a temporary VM in Azure (they have them with SQL Server pre-installed and available in ~20 minutes) to do your migrations from. If you do this, it's best to upload a compressed database backup to Blob storage (Cerebrata's storage too is great for this) and then grab it and restore to SQL Server in that VM. Then stage your migrations all from there!
Test, test, TEST!!!
Be careful running SQL on a VM, it's not a high availability solution. Azure VMs are prone to restarting from time to time. Unless you have multiple VMs running SQL Server in an availability group, or you have some sort of mirroring and load balancing setup, you won't have a high availability solution. I too originally favoured the IaaS to PaaS route, in the end it seemed to be a false economy as migrating IaaS to PaaS is about as much work as migrating on-premise to PaaS. In the end I decided to take the time to optimise my application for PaaS, i.e. moving durable storage to blobs, implementing transient error handling and retry logic, etc.
What you're proposing is certainly possible but having a multi VM arrangement to deliver high availability SQL takes a bit of work and is expensive! Have a read of the following guide, it was really helpful to me when I started the migration process:
Top 7 Concerns of Migrating a .NET Application to Azure
Just yesterday Microsoft announced their plan to host also Iaas solutions and not only Saas solution on their Azure platform.
http://weblogs.asp.net/scottgu/archive/2013/04/16/windows-azure-general-availability-of-infrastructure-as-a-service-iaas.aspx
About migration, it really depends. We work with a distribution mechanism: TFS + Octopus so the deployment is very easy and it works on Iaas or SQL Azure, it doesn't really matter.
There are also other things to keep into consideration when moving into Saas. Probably your code should be refactored if it's not Saas oriented or your application may have a very high hosting cost over Azure.

How do you manage databases during development?

My development team of four people has been facing this issue for some time now:
Sometimes we need to be working off the same set of data. So while we develop on our local computers, the dev database is connected to remotely.
However, sometimes we need to run operations on the db that will step on other developers' data, ie we break associations. For this a local db would be nice.
Is there a best practice for getting around this dilemma? Is there something like an "SCM for data" tool?
In a weird way, keeping a text file of SQL insert/delete/update queries in the git repo would be useful, but I think this could get very slow very quickly.
How do you guys deal with this?
You may find my question How Do You Build Your Database From Source Control useful.
Fundamentally, effective management of shared resources (like a database) is hard. It's hard because it requires balancing the needs of multiple people, including other developers, testers, project managers, etc.
Often, it's more effective to give individual developers their own sandboxed environment in which they can perform development and unit testing without affecting other developers or testers. This isn't a panacea though, because you now have to provide a mechanism to keep these multiple separate environments in sync with one another over time. You need to make sure that developers have a reasonable way of picking up each other changes (both data, schema, and code). This isn't necesarily easier. A good SCM practice can help, but it still requires a considerable level of cooperation and coordination to pull it off. Not only that, but providing each developer with their own copy of an entire environment can introduce costs for storage, and additional DBA resource to assist in the management and oversight of those environments.
Here are some ideas for you to consider:
Create a shared, public "environment whiteboard" (it could be electronic) where developers can easily see which environments are available and who is using them.
Identify an individual or group to own database resources. They are responsible for keeping track of environments, and helping resolve the conflicting needs of different groups (developers, testers, etc).
If time and budgets allow, consider creating sandbox environments for all of your developers.
If you don't already do so, consider separating developer "play areas", from your integration, testing, and acceptance testing environments.
Make sure you version control critical database objects - particularly those that change often like triggers, stored procedures, and views. You don't want to lose work if someone overwrites someone else's changes.
We use local developer databases and a single, master database for integration testing. We store creation scripts in SCM. One developer is responsible for updating the SQL scripts based on the "golden master" schema. A developer can make changes as necessary to their local database, populating as necessary from the data in the integration DB, using an import process, or generating data using a tool (Red Gate Data Generator, in our case). If necessary, developers wipe out their local copy and can refresh from the creation script and integration data as needed. Typically databases are only used for integration testing and we mock them out for unit tests so the amount of work keeping things synchronized is minimized.
I recommend that you take a look at Scott AllenĀ“s views on this matter. He wrote a series of blogs which are, in my opinion, excellent.
Three Rules for Database Work,
The Baseline,
Change scripts,
Views, stored procs etc,
Branching and Merging.
I use these guidelines more or less, with personal changes and they work.
In the past, I've dealt with this several ways.
One is the SQL Script repository that creates and populates the database. It's not a bad option at all and can keep everything in sync (even if you're not using this method, you should still maintain these scripts so that your DB is in Source Control).
The other (which I prefer) was having a single instance of a "clean" dev database on the server that nobody connected to. When developers needed to refresh their dev databases, they ran a SSIS package that copied the "clean" database onto their dev copy. We could then modify our dev databases as needed without stepping on the feet of other developers.
We have a database maintenance tool that we use that creates/updates our tables and our procs. we have a server that has an up-to-date database populated with data.
we keep local databases that we can play with as we choose, but when we need to go back to "baseline" we get a backup of the "master" from the server and restore it locally.
if/when we add columns/tables/procs we update the dbMaintenance tool which is kept in source control.
sometimes, its a pain, but it works reasonably well.
If you use an ORM such as nHibernate, create a script that generate both the schema & the data in the LOCAL development database of your developers.
Improve that script during the development to include typical data.
Test on a staging database before deployment.
We do replicate production database to UAT database for the end users. That database is not accessible by developers.
It takes less than few seconds to drop all tables, create them again and inject test data.
If you are using an ORM that generates the schema, you don't have to maintain the creation script.
Previously, I worked on a product that was data warehouse-related, and designed to be installed at client sites if desired. Consequently, the software knew how to go about "installation" (mainly creation of the required database schema and population of static data such as currency/country codes, etc.).
Because we had this information in the code itself, and because we had pluggable SQL adapters, it was trivial to get this code to work with an in-memory database (we used HSQL). Consequently we did most of our actual development work and performance testing against "real" local servers (Oracle or SQL Server), but all of the unit testing and other automated tasks against process-specific in-memory DBs.
We were quite fortunate in this respect that if there was a change to the centralised static data, we needed to include it in the upgrade part of the installation instructions, so by default it was stored in the SCM repository, checked out by the developers and installed as part of their normal workflow. On reflection this is very similar to your proposed DB changelog idea, except a little more formalised and with a domain-specific abstraction layer around it.
This scheme worked very well, because anyone could build a fully working DB with up-to-date static data in a few minutes, without stepping on anyone else's toes. I couldn't say if it's worthwhile if you don't need the install/upgrade functionality, but I would consider it anyway because it made the database dependency completely painless.
What about this approach:
Maintain a separate repo for a "clean db". The repo will be a sql file with table creates/inserts, etc.
Using Rails (I'm sure could be adapted for any git repo), maintain the "clean db" as a submodule within the application. Write a script (rake task, perhaps) that queries a local dev db with the SQL statements.
To clean your local db (and replace with fresh data):
git submodule init
git submodule update
then
rake dev_db:update ......... (or something like that!)
I've done one of two things. In both cases, developers working on code that might conflict with others run their own database locally, or get a separate instance on the dev database server.
Similar to what #tvanfosson recommended, you keep a set of SQL scripts that can build the database from scratch, or
On a well defined, regular basis, all of the developer databases are overwritten with a copy of production data, or with a scaled down/deidentified copy of production, depending on what kind of data we're using.
I would agree with all the LBushkin has said in his answer. If you're using SQL Server, we've got a solution here at Red Gate that should allow you to easily share changes between multiple development environments.
http://www.red-gate.com/products/sql_source_control/index.htm
If there are storage concerns that make it hard for your DBA to allow multiple development environments, Red Gate has a solution for this. With Red Gate's HyperBac technology you can create virtual databases for each developer. These appear to be exactly the same as ordinary database, but in the background, the common data is being shared between the different databases. This allows developers to have their own databases without taking up an impractical amount of storage space on your SQL Server.

Getting the production database to work with locally

I'm nearing the stage of saying "lets go live" of a client's system I've been working on for the past few months, basically a few autonomous services including a public facing website, an intranet website, an hourly exporter from a legacy OLTP DB based on modified flags/triggers etc.. and few others services that compose the system, with each their own specialized database etc.. , communicating with each other via messages (NServiceBus).
When I started I tried to keep a local replication of everything but its proving more and more difficult and on reflection probably a major friction point of the past few weeks, I like to keep regularly up to date as the legacy database is growing and causing hundreds events daily. Having high latency & mediocre bandwidth (between myself and the client's site, I'm in SE Asia where bandwidth is generally crap anyway) is also an issue for RDP, SQL tools, remote connection strings etc.. Tracking down integration bugs and understanding scenarios they present during feedback/integration/QA is also difficult as my data doesn't reflect the current state of the client's DB (the clients' staff have been working and evolving the data their end) and means another break, coffee and lengthy sync again. It would be ideal to do it all locally and then deploy at the end but I have to deliver parts incrementally (to get the check), and some parts are even in use (although not critical) so bug fixing on in use pats needs turn around quick, and with it being a small company, incremental feedback, it helps flush out some of the more vague requirements along the way (curse me).
I was thinking it would be good to have twice daily sync between the environments (their DBs to mine), I somewhat have design control over everything apart from the legacy SQL server database.
What are the best options SO users?
I was thinking of setting up a Windows 2003 light VM on my dev box. And in this install the same setup of the client sites (but not spread across multiple servers obviously). And then for syncing the databases I was thinking about SQL Server replication? or batch scripts? Or is there any better tools - ones that are fast and good compressions? I don't want my changes to go back to production (I have a separate CI & deployment procedure), I just want (I think I want.. tell me if better idea) my databases to be refreshed every night or twice a day (maybe whilst I'm at lunch bandwidth permitting).
How does everyone approach this?
I would recommend two ways to do this:
Snapshot Replication
Backing up the transaction log and manually (or batch) applying it
Snapshot replication can be difficult to get working, but it is possible even in offline situations such as physcially carrying snapshots to another location.
The transaction log method can be used as part of your standard backup procedures. ie: A full backup twice a week with transaction log backups more regularly.
Remember that best practice is to cleanse the data before using it in a test environment. At the very least this should be changing all personal data, especially email addresses, passwords and any other method which could result in some automated process making contact with the user in your database.

Caching to a local SQL instance on a web server

I run a very high traffic(10m impressions a day)/high revenue generating web site built with .net. The core meta data is stored on a SQL server. My team and I have a unique caching strategy that involves querying the database for new meta data at regular intervals from a middle tier server, serializing the data to files and sending those to the web nodes. The web application uses the data in these files (some are actually serialized objects) to instantiate objects and caches those in memory to use for real time requests.
The advantage of this model is that it:
Allows the web nodes to cache all data in memory and not incur any IO overhead querying a database.
If the database ever goes down either unexpectedly or for maintenance windows, the web servers will continue to run and generate revenue. You can even fire up a web server without having to retrieve its initial data from the DB because all the data it needs are in files on its own disks.
Allows us to be completely horizontally scalable. If throughput suffers, we can just add a web server.
The disadvantages are that this caching and persistense layers adds complexity in the code that queries the database, packages the data and unpackages it on the web server. Any time our domain model requires us to add entities, more of this "plumbing" has to be coded. This architecture has been in place for four years and there are probably better ways to tackle this.
One strategy I have been considering is using replication to replicate our master sql server database to local database instances installed on each web server. The web server application would use normal sql/ORM techniques to instantiate objects. Here, we can still sustain a master database outage and we would not have to code up specialized caching code and could instead use nHibernate to handle the persistence.
This seems like a more elegant solution and would like to see what others think or if anyone else has any alternatives to suggest.
I think you're overthinking this. SQL Server already has mechanisms available to you to handle these kinds of things.
First, implement a SQL Server cluster to protect your main database. You can fail over from node to node in the cluster without losing data, and downtime is a matter of seconds, max.
Second, implement database mirroring to protect from a cluster failure. Depending on whether you use synchronous or asynchronous mirroring, your mirrored server will either be updated in realtime or a few minutes behind. If you do it in realtime, you can fail over to the mirror automatically inside your app - SQL Server 2005 & above support embedding the mirror server's name in the connection string, so you don't even have to lift a finger. The app just connects to whatever server's live.
Between these two things, you're protected from just about any main database failure short of a datacenter-wide power outage or network outage, and there's none of the complexity of the replication stuff. That covers your high availability issue, and lets you answer the scaling question separately.
My favorite starting point for scaling is using three separate connection strings in your application, and choose the right one based on the needs of your query:
Realtime - Points directly at the one master server. All writes go to this connection string, and only the most mission-critical reads go here.
Near-Realtime - Points at a load balanced pool of read-only SQL Servers that are getting updated by replication or log shipping. In your original design, these lived on the web servers, but that's dangerous practice and a maintenance nightmare. SQL Server needs a lot of memory (not to mention money for licensing) and you don't want to be tied into adding a database server for every single web server.
Delayed Reporting - In your environment right now, it's going to point to the same load-balanced pool of subscribers, but down the road you can use a technology like log shipping to have a pool of servers 8-24 hours behind. These scale out really well, but the data's far behind. It's great for reporting, search, long-term history, and other non-realtime needs.
If you design your app to use those 3 connection strings from the start, scaling is a lot easier, and doesn't involve any coding complexity - just pick the right connection string.
Have you considered memcached? Since it is:
in memory
can run locally
fully scalable horizontally
prevents the need to re-cache on each web server
It may fit the bill. Check out Google for lots of details and usage stories.
Just some addition to what RickNZ proposed above..
Since your master data which you are caching currently won't change so frequently and probably over some maintenance window, here is what should you do first on database side:
Create a SNAPSHOT replication for the master tables which you want to cache. Adding new entities will be equally easy.
On all the webservers, install SQL Express and subscribe to this Publication.
Since, this is not a frequently changing data, you can rest assure, no much server resource usage issue minus network trips for master data.
All your caching which was available via previous mechanism is still availbale minus all headache which comes when you add new entities.
Next, you can leverage .NET mechanisms as suggested above. You won't face memcached cluster failure unless your webserver itself goes down. There is a lot availble in .NET which a .NET pro can point out after this stage.
It seems to me that Windows Server AppFabric is exactly what you are looking for. (AKA "Velocity"). From the introductory documentation:
Windows Server AppFabric provides a
distributed in-memory application
cache platform for developing
scalable, available, and
high-performance applications.
AppFabric fuses memory across multiple
computers to give a single unified
cache view to applications.
Applications can store any
serializable CLR object without
worrying about where the object gets
stored. Scalability can be achieved by
simply adding more computers on
demand. The cache also allows for
copies of data to be stored across the
cluster, thus protecting data against
failures. It runs as a service
accessed over the network. In
addition, Windows Server AppFabric
provides seamless integration with
ASP.NET that enables ASP.NET session
objects to be stored in the
distributed cache without having to
write to databases. This increases
both the performance and scalability
of ASP.NET applications.
Have you considered using SqlDependency caching?
You could also write the data to the local disk at the web tier, if you're concerned about initial start-up time or DB outages. But at least with a SqlDependency, you shouldn't have to poll the DB to look for changes. It can also be made relatively transparent.
In my experience, adding a DB instance on web servers generally doesn't work out too well from a scalability or performance perspective.
If you're concerned about performance and scalability, you might consider partitioning your data tier. The specifics depend on your app, but as an example, you could move read-only data onto a couple of SQL Express servers that are populated with replication.
In case it helps, I talk about this subject at length in my book (Ultra-Fast ASP.NET).

Resources