How to minimize application downtime when updating the database and application ORM

We currently run an ecommerce solution for a leisure and travel company. Every time we have a release, we must bring the ecommerce site down while we update the database schema and the data access code. We are using a custom-built ORM where each data entity is responsible for its own CRUD operations. This is accomplished by dynamically generating the SQL based on attributes in the data entity.
For example, the data entity for an address would be...
[TableName("address")]
public class Address : DataEntity
{
    [Column("address1")]
    public string Address1;

    [Column("city")]
    public string City;
}
So, if we add a new column to the database, we must update the schema of the database and also update the data entity.
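In other words, a single new column touches two places at once - the database schema and the mapped entity (the column name below is only an example):
-- database side:
ALTER TABLE address ADD postal_code VARCHAR(20) NULL;
-- plus a matching [Column("postal_code")] field on the entity class above.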
As you can imagine, the business people are not too happy about this outage, as it puts a crimp in their cash flow. The operations people are not happy as they have to deal with a high-pressure window when the database and applications are upgraded. The programmers are upset as they are constantly getting in trouble for the legacy system they inherited.
Do any of you smart people out there have some suggestions?

The first answer is obviously, don't use an ORM. Only application programmers think they're good. Learn SQL like everyone else :)
OK, so back to reality. What's to stop you restricting all schema changes to additions only? Then you can update the DB schema any time you like, and only install the recompiled application at a safe time (6am works best, I find) after the DB is updated. If you must remove things, perform the steps the other way round - install the new app leaving the schema unchanged, and then remove the bits from the schema.
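For example, an additive change like a new nullable column can be applied to the live database ahead of the new build, because the old application simply never reads it. A rough sketch, with illustrative column names:
ALTER TABLE address ADD country_code CHAR(2) NULL;   -- old app ignores this
-- Removals go the other way round: deploy the build that no longer reads
-- the column first, then drop it afterwards:
ALTER TABLE address DROP COLUMN fax_number;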
You're always going to have a high-pressure time as you roll out changes, but at least you can manage it better by doing it in two easier-to-understand pieces. Your DBAs will be OK with updating the schema for the existing application.
The downside is that you have to be a lot more organised, but that's not a bad thing when dealing with production servers, and you should already be seriously organised about it.

Supporting this scenario will add significant complexity to your environment and/or process and/or application.
You can run a complex update process where your application code is smart enough to run correctly on both the old schema and the new schema at the same time. Then you can update the application first and the schema second. A third step may be to migrate any data, which again, the application has to be able to work with. In that case, you only need to "tombstone" the application for the time it takes to upgrade the application, which could just be seconds, depending on how many files and machines are involved in the upgrade.
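For example, if a column is being renamed, the "expand" step can add the new column alongside the old one so that both application versions keep working during the cut-over (a sketch with illustrative names):
ALTER TABLE address ADD address_line1 VARCHAR(200) NULL;
-- The old version keeps reading and writing address1, untouched.
-- The new version prefers the new column and falls back for rows that
-- have not been backfilled yet:
SELECT COALESCE(address_line1, address1) AS address_line1, city
  FROM address;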
In most cases, it's best to leave the application/environment/process simple and live with the downtime during a slow time of the day/week/month. Pretty much all applications need to be "taken down" from time to time for "regularly scheduled maintenance".

Related

Altering database tables on updating website

This seems to be an issue that keeps coming back in every web application: you're improving the back-end code and need to alter a table in the database in order to do so. No problem doing this manually on the development system, but when you deploy your updated code to the production servers, it will need to alter the database tables automatically too.
I've seen a variety of ways to handle these situations; each comes with its own benefits and problems. Roughly, I've come to the following two possibilities:
Dedicated update script. Requires manually initiating the update. Requires all table alterations to be done in a predefined order (rigid release planning, no easy quick fixes on the database). Typically requires maintaining a separate updating process and some way to record and manage version numbers. Benefit is that it doesn't impact running code.
Checking table properties at runtime and altering them if needed. No manual interaction required, and table alterations may happen in any order (so a quick fix on the database is easy to deploy). Another benefit is that the code is typically a lot easier to maintain. The obvious problem is that it checks table properties far more often than it needs to.
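A rough sketch of what that runtime check might look like, assuming the DBMS exposes INFORMATION_SCHEMA (Oracle uses USER_TAB_COLUMNS instead; the names are illustrative):
-- The application (or a startup routine) first checks whether the column exists...
SELECT COUNT(*) AS col_exists
  FROM information_schema.columns
 WHERE table_name = 'address' AND column_name = 'discount_code';
-- ...and only when col_exists = 0 does it issue the alteration:
ALTER TABLE address ADD discount_code VARCHAR(32) NULL;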
Are there any other general possibilities or ways of dealing with altering database tables upon application updates?
I'll share what I've seen work best. It's just expanding upon your first option.
The steps I've usually seen when updating schemas in production:
1. Take down the front-end applications. This prevents any data from being written during the schema update. We don't want writes to fail because relationships are messed up or a table is suddenly out of sync with the application.
2. Potentially disconnect the database so no connections can be made. Sometimes there is code out there using your database that you don't even know about!
3. Run the scripts as you described in your first option. It definitely takes careful planning, and you're right that you need a pre-defined order in which to apply the changes. I'd also note that you often need two sets of scripts, one for schema updates and one for data updates. As an example, if you want to add a field that is not nullable, you might add a nullable field first and then run a script to put in a default value (see the sketch after this list).
4. Have rollback scripts on hand. This is crucial because you might make all the changes you think you need (since it all worked great in development) and then discover the application doesn't work before you bring it back online. It's good to have an exit strategy so you aren't in that horrible place of "oh crap, we broke the application and we've been offline for hours and hours and what do we do?!"
5. Make sure you have backups ready to go in case (4) goes really bad.
6. Coordinate the application update with the database updates. Usually you do the database updates first and then roll out the new code.
7. (Optional) A lot of companies do partial roll-outs to test. I've never done this, but if you have 5 application servers and 5 database servers, you can first roll out to 1 application server/1 database server and see how it goes. Then, if it's good, you continue with the rest of the production machines.
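A minimal sketch of the two-script split from step 3, using illustrative table and column names (the exact NOT NULL syntax varies by DBMS):
ALTER TABLE orders ADD delivery_method VARCHAR(20) NULL;         -- schema script
UPDATE orders SET delivery_method = 'standard'
 WHERE delivery_method IS NULL;                                  -- data script
ALTER TABLE orders MODIFY delivery_method VARCHAR(20) NOT NULL;  -- tighten once populated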
It definitely takes time to find out what works best for you. From my experience doing lots of production database updates, there is no silver bullet. The most important thing is taking your time and being disciplined in tracking changes (versioning like you mentioned).

CMS Database design - Master database or Multi-Db per site

I am in the process of designing the CMS that I am about to create. I was thinking about the database and how I want to approach it.
Do you think it's best to create one master database for all my clients' websites? Or should I have one database per site?
What are the benefits and negatives of both approaches? I am always thinking about the future, so I was considering implementing memcache or APC caching in the project, to offer an option to my clients.
Just trying to learn the best practices and what other developers' approaches would be.
I've run both. My business chooses to separate client-specific data into separate tables so that if one happens to go corrupt, not all are taken down. In an ideal world this might never happen, but Murphy's law... It does seem very easy to find things with them separated. You will know with 100% certainty that one client's content will never show up on another's page.
If you do go down that route, be prepared to create scripts that build and configure the databases for you. There's nothing fun about building a great system and having demand for it, only to spend your time manually setting up DBs and installs all day long. Also, setting DB names is an additional step that's not part of using a single database; it's a headache that will repeat itself seemingly over and over again.
Develop the single master DB. It will take a small amount of additional effort and add a little bit more complexity to the database design, but will give you a few nice features. The biggest is being able to share data between sites.
Designing for a master database means that you have the option to combine sites when it makes sense, but also lets you install a master per site. Best of both worlds.
It depends greatly upon the amount of customization each client will require. If you foresee clients asking for many one-off features specific to their deployment, separate databases based on a single core structure might make sense. I would highly recommend trying to make any customizations usable by all clients, though, and keeping all structure defined in one place/database instead of duplicating it across multiple databases. By using one database, you make updating the structure straightforward and the implementation consistent across all sites, so they can all use the same CMS code.
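For what it's worth, a minimal sketch of the single master database approach, assuming a design where every row carries the site it belongs to (table and column names are illustrative):
CREATE TABLE site (
    site_id INT PRIMARY KEY,
    name    VARCHAR(100) NOT NULL
);
CREATE TABLE page (
    page_id INT PRIMARY KEY,
    site_id INT NOT NULL REFERENCES site(site_id),
    title   VARCHAR(200),
    body    TEXT
);
-- Every query is scoped by site_id, which is what keeps one client's
-- content from ever appearing on another client's page.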

Track changes on a database

I am not sure whether this has been asked before; I did a few searches but nothing appropriate showed up.
OK, now my problem:
I want to migrate an old application to a different programming language. The only requirement we have is to keep the database structure stable. So no changes in my database schema. For the rest of the application I am basically reimplementing everything from scratch without reusing old code.
My idea for verifying the new code was to let users perform certain actions or workflows, capture the state of the database before and after, and then maybe create unit tests with the help of this data. Does anyone know an elegant solution for keeping track of these changes? Copying the database (>10 GB) is pretty expensive. I also can't modify the code of the old application in which the users will be performing these sample actions, so I have to keep it at the database level.
My database is Oracle 10g.
You could capture the old application's behavior with a trace and then validate the changes against your new code. But, honestly, trying to write a new application by capturing the data modifications the old one makes and then imitating them will be a very difficult task, as the inputs and outputs of the original application are not guaranteed to be stateless (that is, the old application might do the same thing the first 1,000,000 times it is given a certain set of inputs and do something completely different on the 1,000,001st run).
Your best bet is to start over with the business requirements and use the old application as a functional reference.
Take a look at Oracle Flashback Queries.
It lets you execute queries that return past data. The timeframe is limited, but it can be very useful.
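A minimal example of a Flashback Query (the orders table is illustrative):
SELECT *
  FROM orders
    AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '15' MINUTE)
 WHERE order_id = 42;
-- Comparing this with the current row shows what the old application
-- changed while the sample workflow ran.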
In 10g the only way to do this is with Flashback queries; in 11g you can do it with RAT (Real Application Testing). RAT is quite useful for these scenarios and also for load and volume testing.

Using a common database for collaborative development

Some of the people on my project seem to think that using a common development database, with everyone connecting to it, is the best approach. I think it isn't, and that each developer having his own database (with periodically updated data dumps) is better. Am I right or wrong? Have you encountered any problems with either of these approaches?
Disk space and CPU should be cheap enough that every developer can run their own instance of the database, with an automated build under version control. This is needed to allow developers to be bold in hacking on the database, in isolation from any other developer's concurrent hacking.
The caveat being, of course, that any changes they make to their private instance are useless to anyone else unless it can be automatically applied during the build process. So there needs to be a firm policy that application code can't depend on any database state unless that state is represented by version-controlled, unit-tested changes to the DDL.
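One common way to make that concrete is numbered migration scripts plus a version table that the automated build consults before applying anything; a sketch, with illustrative names:
CREATE TABLE schema_version (
    version    INT PRIMARY KEY,
    applied_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- migrations/0042_add_discount_code.sql
ALTER TABLE address ADD discount_code VARCHAR(32) NULL;
INSERT INTO schema_version (version) VALUES (42);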
For an excellent guide on the theory and practice of treating the database definition as another part of the project code, and coordinating changes and refactorings, see Refactoring Databases: Evolutionary Database Design by Scott W. Ambler and Pramod Sadalage.
I like having my own copy of the database for development, because it gives you the flexibility to rapidly change things without worrying how it will impact others.
However, if all the developers are hacking away on their own copy of the database, it becomes more and more difficult to merge everyone's work together in the end.
I think you can get the best of both worlds by letting developers work on a local copy during day-to-day development, but each developer should probably merge their work into a common copy on a pretty regular basis. Writing a lot of unit tests helps too.
We share a single database amongst all our developers (20-odd), but we've got it structured so that everyone has their own tables.
You don't need a separate database per developer if you structure the application right. It should be configurable which database or table-prefix it uses anyway so you can easily move it between instances (unit test, system test, acceptance test, production, disaster recovery and so on).
The advantage to using a single database is that the cost of maintenance is amortized. You don't have your DBAs trying to handle a lot of databases (or, if you're a small-DB shop, you don't have every developer trying to maintain their own database when they're better utilized in developing).
Having a single point of failure is not a good thing, is it?
I prefer a single, shared database. But it's very dependent on the situation and the applications being developed.
What works for me may not work for you. Go with your gut.
If you are working with Hibernate, or any Hibernate-based platform, you can configure your database schema to be created when you start your server (the create-drop option). This is very useful when you are adding new attributes to your classes. If this is the case, each developer must have his own copy of the DB.
If you are not changing the DB structure at all then you can use a single shared DB.
In this second case it is not a must. I prefer to have my own DB where I can do whatever I want. On the other hand, remember that some queries can take a lot of time, and that will affect your whole team if you are sharing a DB.

Managing the migration of breaking database changes to a database shared by old version of the same application

One of my goals is to be able to deploy a new version of a web application that runs side by side with the old version. The catch is that everything shares a database - a database that, in the new version, tends to include significant refactoring of database tables. I would like to be able to roll out the new version of the application to users over time and to be able to switch them back to the old version if I need to.
Oren had a good post setting up the issue, but it ended with:
"We are still in somewhat muddy water with regards to deploying to production with regards to changes that affects the entire system, to wit, breaking database changes. I am going to discuss that in the next installment, this one got just a tad out of hand, I am afraid."
The follow-on post never came ;-). How would you go about managing the migration of breaking database changes to a database shared by an old version of the same application? How would you keep the data synced up?
Read Scott Ambler's book "Refactoring Databases"; take with a pinch of salt, but there are quite a lot of good ideas in there.
The details of the solutions available depend on the DBMS you use. However, you can do things like:
- create a new table (or several new tables) for the new design
- create a view with the old table name that collects data from the new table(s)
- create 'instead of' triggers on the view to update the new tables instead of the view
In some circumstances, you don't need a new table - you may just need triggers.
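A sketch of that pattern in Oracle-flavoured SQL (assuming the refactoring moved the data into a new address_v2 table; all names are illustrative):
CREATE VIEW address AS
    SELECT address_id, line1 AS address1, city
      FROM address_v2;

CREATE OR REPLACE TRIGGER address_ioi
INSTEAD OF INSERT ON address
FOR EACH ROW
BEGIN
    -- writes through the old table name land in the new structure
    INSERT INTO address_v2 (address_id, line1, city)
    VALUES (:NEW.address_id, :NEW.address1, :NEW.city);
END;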
If the old version has to be maintained, the changes simply can't be breaking. That also helps when deploying a new version of a web app - if you need to roll back, it really helps if you can leave the database as it is.
Obviously this comes with significant architectural handicaps, and you will almost certainly end up with a database which shows its lineage, so to speak - but the deployment benefits are usually worth the headaches, in my experience.
It helps if you have a solid collection of integration tests for each old version involved. You should be able to run them against your migrated test database for every version which is still deemed to be "possibly live" - which may well be "every version ever" in some cases. If you're able to control deployment reasonably strictly, you may get away with only having compatibility for three or four versions - in which case you can plan to phase out obsolete tables/columns etc. if there's a real need. Just bear in mind the complexity of such planning against the benefits accrued.
Assuming only 2 versions of your client, I'd keep only one copy of the data, in the new tables.
You can maintain the contract between the old and new apps behind views on top of the new tables.
Use before/instead of triggers to handle writes into the "old" views that actually write into the new tables.
You are maintaining 2 versions of code and must still develop your old app, but that is unavoidable.
This way there are no synchronisation issues; otherwise you'd have to deal with replication conflicts between the "old" and "new" schemas.
More than 2 versions becomes complicated as mentioned...
First, I would like to say that this problem is very hard and you might not find a complete answer.
Lately I've been involved in maintaining a legacy line-of-business application, which might soon evolve into a new version. Maintenance includes solving bugs, optimizing old code and adding new features that sometimes cannot fit easily into the current application architecture. The main problem with our application is that it was poorly documented, there is no trace of changes, and we are basically the 5th rotation team working on this project (we are fairly new to it).
Leaving the outer details aside (code, layers, etc.), I will try to explain a little about how we are currently managing the database changes.
We have at this moment two rules that we are trying to follow:
The first is that old code (SQL, stored procedures, functions, etc.) works as is and should be kept as is, without modifying it too much unless there is a reason to (a bug or a feature change), and, of course, we try to document it as much as possible (especially the problems, like: "WTF! Why did he do that instead of this?").
The second is that every new feature that comes in should use the best practices known at this moment and modify the old database structure as little as possible. This introduces some database refactoring options, like using editable views on top of the old structure, introducing new extension tables for already existing ones, normalizing the structure and providing the older structure through views, etc.
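As an illustration of the extension-table idea (all names are hypothetical): new attributes go into a companion table keyed to the existing one, so the legacy structure, and the old code that reads it, stay untouched.
CREATE TABLE customer_ext (
    customer_id  INT PRIMARY KEY REFERENCES customer(customer_id),
    loyalty_tier VARCHAR(20),
    newsletter   CHAR(1) DEFAULT 'N'
);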
Also, we are trying to write as many unit tests as we can, with the business analysts working side by side with us and documenting the business rules.
Database refactoring is too complex a field to be covered in a short answer. There are a lot of books that address these problems, one of them, http://databaserefactoring.com/, being pointed out in one of the answers.
Later Edit: Hopefully the second rule will also answer the handling of breaking changes.
