How will EF Code First modeling affect already existing data in the database?

It's clear to me that I can customize the behavior of the model/DB schema syncing process. I am using the DropCreateDatabaseIfModelChanges<> class to do so.
Assume that I have a working project and site, and the DB is filling up with data. Everything is working fine.
One day I decide that some functionality needs to be changed. The changes will affect the properties of my models (they can be renamed/deleted/added, some models will be new, some models will be deleted).
My question: what will happen to the already existing data on my deployed site when I check in all of my changes?
Will I lose it? If so, how can I avoid that?

Yes, you will lose your data if your model changes and you are using DropCreateDatabaseIfModelChanges<T>.
To avoid this:
Don't use DB initializers in production (except perhaps CreateDatabaseIfNotExists<T>). DB initializers are there to smooth the development experience, not for production use.
What you need is the new Migrations feature of Entity Framework 4.3 (currently in Beta 1), which provides support for automatic and code-based DB schema migration.
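For example, once the EntityFramework 4.3 package is installed, migrations are enabled and applied from the Package Manager Console along these lines (the migration name here is just a placeholder):

Enable-Migrations
Add-Migration RenameCustomerProperty
Update-Database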
Also, you can now set the DB initializer from the *.config file, so you can easily switch between the development-time DropCreateDatabaseIfModelChanges and no initializer in production configurations.
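At the time of EF 4.3, that config switch was an appSettings entry; a rough sketch, where MyNamespace.MyContext and MyAssembly stand in for your own context type and assembly, and the value Disabled turns the initializer off:

<appSettings>
  <add key="DatabaseInitializerForType MyNamespace.MyContext, MyAssembly"
       value="Disabled" />
</appSettings>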

Related

Can we store database triggers just like database migrations in modern web based frameworks?

Sometimes we need to define triggers in the database for some use case. In such a scenario, how do we version control the source code of triggers so that it can be replicated across all environments, e.g. development, pre-staging, staging, production?
An approach along the lines of database migrations would be good. Does any such thing exist?
There are a ton of migration tools available for almost every major framework in any language. Do a Google search for whatever language and framework you use.
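As one concrete illustration (using EF Code First Migrations, which this page already covers; the trigger name and body below are made up), raw trigger DDL can be versioned inside a migration via the Sql() helper:

using System.Data.Entity.Migrations;

public partial class AddOrdersAuditTrigger : DbMigration
{
    public override void Up()
    {
        // The trigger DDL is versioned together with the rest of the schema changes.
        Sql(@"CREATE TRIGGER trg_orders_audit ON dbo.Orders AFTER UPDATE AS
              INSERT INTO dbo.OrdersAudit (OrderId) SELECT Id FROM inserted;");
    }

    public override void Down()
    {
        Sql("DROP TRIGGER trg_orders_audit;");
    }
}

The same idea works with any migration tool that allows raw SQL steps (Flyway, Liquibase, Rails migrations, and so on).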

asp.net code first automatic database updates

I am creating an application in C#/ASP.NET using Code First Entity Framework that will be using a different database for each customer (in other words, every customer has their own database, which will be generated on first use).
I am trying to figure out a way to update all these databases automatically whenever I apply changes to my objects. In other words, how would I approach a cleanstep system in Code First EF?
Currently I am using DropCreateDatabaseIfModelChanges to define a simple database that allows me to test my application whenever a schema change occurs. However, this method drops the database, which is obviously unacceptable for customer databases.
I must assume hundreds of customers, so updating all the databases by hand is not an option.
I do not mind writing code that copies the data into a new database.
I think the best solution would be a way to back up a database somehow and then reinsert all the data into the newly created database. Even better would be a way that automatically updates the schema without dropping the database. However, I have no idea how to approach this. Can anyone point me in the right direction?
The link posted by Joakim was helpful. It requires you to update to EF 4.3.1 (don't forget your references in other projects if you have them), after which you can run the command that enables migrations. To automatically update the schema from code you can use:
using System.Data.Entity;
using System.Data.Entity.Migrations;

// Configuration is generated by Enable-Migrations
Configuration configuration = new Configuration();
DbMigrator migrator = new DbMigrator(configuration);
migrator.Update(); // applies any pending migrations
Database.SetInitializer<MyContext>(null); // use your concrete context type, not DbContext itself
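Note that for DbMigrator to pick up model changes without explicit Add-Migration steps, automatic migrations must be enabled in the generated Configuration class. A minimal sketch (MyContext is a placeholder for your own context type):

using System.Data.Entity.Migrations;

internal sealed class Configuration : DbMigrationsConfiguration<MyContext>
{
    public Configuration()
    {
        AutomaticMigrationsEnabled = true;
        // Opt in only if losing columns/data during automatic upgrades is acceptable:
        // AutomaticMigrationDataLossAllowed = true;
    }
}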

how to minimize application downtime when updating database and application ORM

We currently run an ecommerce solution for a leisure and travel company. Every time we have a release, we must bring the ecommerce site down while we update the database schema and the data access code. We are using a custom-built ORM where each data entity is responsible for its own CRUD operations. This is accomplished by dynamically generating the SQL based on attributes on the data entity.
For example, the data entity for an address would be...
[TableName("address")]
public class Address : DataEntity
{
    [Column("address1")]
    public string Address1;

    [Column("city")]
    public string City;
}
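(As a hedged aside, not the company's actual ORM code: with attribute types like those above, the attribute-driven SQL generation can be sketched via reflection, roughly as follows; TableNameAttribute, ColumnAttribute, and SqlBuilder are all stand-in names.)

using System;
using System.Linq;
using System.Reflection;

[AttributeUsage(AttributeTargets.Class)]
public class TableNameAttribute : Attribute
{
    public string Name { get; }
    public TableNameAttribute(string name) { Name = name; }
}

[AttributeUsage(AttributeTargets.Field)]
public class ColumnAttribute : Attribute
{
    public string Name { get; }
    public ColumnAttribute(string name) { Name = name; }
}

public static class SqlBuilder
{
    // Build "SELECT col1, col2 FROM table" from the entity's mapping attributes.
    public static string SelectAll(Type entityType)
    {
        string table = entityType.GetCustomAttribute<TableNameAttribute>().Name;
        var columns = entityType.GetFields()
            .Select(f => f.GetCustomAttribute<ColumnAttribute>())
            .Where(a => a != null)
            .Select(a => a.Name);
        return $"SELECT {string.Join(", ", columns)} FROM {table}";
    }
}

// Usage: SqlBuilder.SelectAll(typeof(Address)) yields "SELECT address1, city FROM address"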
So, if we add a new column to the database, we must update the schema of the database and also update the data entity.
As you can expect, the business people are not too happy about this outage as it puts a crimp in their cash-flow. The operations people are not happy as they have to deal with a high-pressure time when database and applications are upgraded. The programmers are upset as they are constantly getting in trouble for the legacy system that they inherited.
Do any of you smart people out there have some suggestions?
The first answer is obviously, don't use an ORM. Only application programmers think they're good. Learn SQL like everyone else :)
OK, so back to reality. What's to stop you restricting all schema changes to additions only? Then you can update the DB schema anytime you like, and install the recompiled application at a safe time (6am works best, I find) after the DB is updated. If you must remove things, perform the steps the other way round: install the new app leaving the schema unchanged, then remove the bits from the schema.
You're always going to have a high-pressure time as you roll out changes, but at least you can manage it better by splitting it into two easier-to-understand pieces. Your DBAs will be OK with updating the schema for the existing application.
The downside is that you have to be a lot more organised, but that's not a bad thing when dealing with production servers, and you should already be seriously organised about that.
Supporting this scenario will add significant complexity to your environment and/or process and/or application.
You can run a complex update process where your application code is smart enough to run correctly on both the old schema and the new schema at the same time. Then you can update the application first and the schema second. A third step may be to migrate any data, which again, the application has to be able to work with. In that case, you only need to "tombstone" the application for the time it takes to upgrade the application, which could just be seconds, depending on how many files and machines are involved in the upgrade.
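As a small, hypothetical illustration of "smart enough to run on both schemas" (the loyalty_tier column is made up), here is a defensive ADO.NET read that works whether or not a new column has been added yet:

using System;
using System.Data;

static class SchemaTolerantRead
{
    // Returns a default while the old schema is still live, the real value afterwards.
    public static string GetLoyaltyTier(IDataRecord record)
    {
        int ordinal;
        try { ordinal = record.GetOrdinal("loyalty_tier"); }
        catch (IndexOutOfRangeException) { return "standard"; } // column not there yet

        return record.IsDBNull(ordinal) ? "standard" : record.GetString(ordinal);
    }
}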
In most cases, it's best to leave the application/environment/process simple and live with the downtime during a slow time of the day/week/month. Pretty much all applications need to be taken down from time to time for regularly scheduled maintenance.

Interacting with external DB via Django

I'm working on a Django app that interacts with an existing database (think ERP/transaction-type data) to perform analysis. There will be minimal to no updating of the existing database, mainly reading data in. It's just a small, simple setup, so there are no replication issues to think about with regard to updating.
The analysis would result in new records created within the Django Model.
Currently the existing DB runs on PostgreSQL.
I am aware of Alex Gaynor's GSoC multi-db code which, from what I gather, is ticket #1142, which has no patch in trunk yet.
So there are three options I can see:
1) Point Django to the same DB as the ERP and let it create the tables it needs within it (all the ERP tables have a prefix, so there would be no collisions). However, this strikes me as hacky and a recipe for disaster.
2) Create a new DB for Django and automatically copy over the required tables. Better, but I can't update, though I can probably live with that.
3) Try out the multidb patch.
Are there other better ideas out there? I'm leaning towards at least trying out the multidb patch but I'm a little worried about stability and forwards compatibility.
How about not using Django's ORM layer at all for that DB? If the interaction is minimal, you might do it faster by just using direct SQL with the appropriate PostgreSQL Python library.

Generating database tables from object definitions

I know that there are a few (automatic) ways to create a data access layer to manipulate an existing database (LINQ to SQL, Hibernate, etc.). But I'm getting kind of tired (and I believe there should be a better way of doing things) of stuff like:
Creating/altering tables in Visio
Using Visio's "Update Database" to create/alter the database
Importing the tables into a "LINQ to SQL classes" object
Changing the code accordingly
Compiling
What about a way to generate the database schema from the objects/entities definition? I can't seem to find good references for tools like this (and I would expect some kind of built-in support in at least some frameworks).
It would be perfect if I could just:
Change the object definition
Change the code that manipulates the object
Compile (the database changes are done auto-magically)
Check out DataObjects.Net - it is designed to support exactly this case. Code only, and nothing else. Its schema upgrade layer is probably the most fully featured one you can find, and it really does fully abstract the schema upgrade SQL.
Check out the product video - you'll notice that nothing extra is done to sync the schema. The schema upgrade sample shows the intended usage of this feature.
You may be looking for an Object Database.
I believe this is the problem that the Microsoft Entity Framework is trying to address. Whilst not specifically designed to "Compile (the database changes are done auto-magically)", it does address the issue of handling changes to the domain model without a huge dependence on the underlying data model.
As Jason suggested, object db might be a good choice. Take a look at db4objects.
What you described is GORM. It is part of the Grails framework and is built to work with Hibernate (maybe JPA in the future). When I was first using Grails it seemed backwards: I was more comfortable with a Rails-style workflow of making the tables and letting the framework generate scaffolding from the database schema. GORM persists your domain objects for you, so you create and change the objects and it manages database create/update. This makes more sense now that I have gotten used to it. Sorry to tease you if you aren't looking for a new framework, but making GORM available standalone is on the roadmap for release 1.1.
When we built the first version of our own framework (Inon Datamanager) I had it read pre-existing SQL tables and autogenerate Java objects from them.
When my colleagues who came from a Smalltalkish background built the second version, they started from the objects and then autogenerated the tables.
Actually, they forgot about the SQL part altogether until I came back in and added it. But nowadays we just run a trigger on application startup which iterates over the object model, checks if the tables and all the right columns exist, and creates them if not. Very convenient.
This turned out to be a lot easier than you might expect - if your favourite tool doesn't support a similar process, you could probably write it in a couple of hours - assuming the relational to object mapping is relatively simple.
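A minimal sketch of that startup check, assuming standard ADO.NET, an already-open connection, a one-class-per-table convention, and string-only fields (all names here are hypothetical, not Inon Datamanager's code):

using System;
using System.Collections.Generic;
using System.Data;
using System.Data.Common;
using System.Linq;
using System.Reflection;

static class SchemaSync
{
    // On startup, create a table per mapped class if it does not exist yet.
    public static void EnsureTables(DbConnection conn, Assembly modelAssembly)
    {
        // The "TABLE_NAME" column is provider-specific (SQL Server shown here).
        var existing = new HashSet<string>(
            conn.GetSchema("Tables").Rows.Cast<DataRow>().Select(r => (string)r["TABLE_NAME"]),
            StringComparer.OrdinalIgnoreCase);

        foreach (var type in modelAssembly.GetTypes().Where(t => t.IsClass && t.IsPublic))
        {
            if (existing.Contains(type.Name)) continue; // column-level checks omitted

            // Naive convention: every public string property becomes a VARCHAR column.
            var columns = type.GetProperties()
                .Where(p => p.PropertyType == typeof(string))
                .Select(p => p.Name + " VARCHAR(255)")
                .ToList();
            if (columns.Count == 0) continue;

            using (var cmd = conn.CreateCommand())
            {
                cmd.CommandText = "CREATE TABLE " + type.Name + " (" + string.Join(", ", columns) + ")";
                cmd.ExecuteNonQuery();
            }
        }
    }
}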
But the point is, it seems to depend on whether you're culturally an object person or a database person - you can regard either one as the authoritative source.
Some of the really big dogs, such as ERwin Data Modeler, will go object to DB. You need to have the big bucks to afford the product though.
I kept digging around some of the "major" frameworks and it seems that Django does exactly what I was talking about. Or so it seems from this screencast.
Does anyone have any remark to make about this? Does it work well?
Yes, Django works well.
Yes, it will generate your SQL tables from your data model definitions (written in Python).
It won't always alter existing tables if you update your structure; you might have to run an ALTER TABLE manually.
Ruby on Rails has an even more advanced version of these features (Rails migrations), but I don't like the framework as much; I find Ruby and Rails pretty idiosyncratic.
Kind of a late answer, but here it goes:
I faced the exact same problem and ended up writing my own solution for it, working with .NET and SQL Server only, however. It basically implements the process you describe:
All DB objects are kept as embedded CREATE scripts as part of the source code
DB Objects are set up automatically (or on request) when using the data access functionality
All non-table changes are also performed automatically (or on request) at the same time
Table changes, which may require special attention to migrate data, are performed via (manually created) change scripts, also upon upgrading the database
Even manual changes made to any database object can be detected, so that schema integrity can be verified and rectified
An optional lightweight ORM can map stored procedures and objects as well as result sets (even multiple)
A command-line application helps keeping the SQL source files in sync with a development database
The library, including the database, is free under an LGPL license.
http://code.google.com/p/bsn-modulestore/
