Play, Hibernate and Evolutions - database

I have no previous experience with tools like Liquibase and the like. Up to now, the way I've usually managed deployment into production for apps using Hibernate was manual SQL to modify the tables, as they were quite simple apps (the complex ones didn't use it... don't ask, please :P).
I wanted to use Evolutions in Play, but I see it clashes heavily with Hibernate in development, making it a pain and not a realistic option. In development Hibernate manages everything easily, so there is no point in using Evolutions there, but we wanted to keep the structure (the files) to make it easier to migrate the app in production mode. Due to the clashes, though, it doesn't seem worth it.
Liquibase had a Play module, but it seems to have been discontinued since Evolutions was released (I wonder why, as I believe it would work wonders with Hibernate).
The question(s) would be:
How do you manage database migrations of apps in production?
What's the usual procedure/steps you use when your model changes between releases and you have to deploy to production?
Any specific tool or feature of Hibernate we are overlooking, or is it just old-faithful SQL ALTER TABLE and the like?
Focusing on Play Framework, how do you manage this?

What is often the case is that an application has two phases in its life cycle: initial development and post-production "maintenance". My experience is that the big database changes tend to happen in the first phase. Let yourself be flexible there by relying on Hibernate; then, when you go to production, take a schema dump, roll that out on production with Evolutions, and manage your DDL manually from there.
In the second "phase" (I'm an agile guy, I hate the word ;-)), schema changes often include DML as well because you have to calculate initial values for new columns, etcetera. Also, you'll usually be spending more time on coding than on schema changes, so the whole manual experience becomes a bit less painful :).
(Having said that - I'd love a better integration between Evolutions and Play/Hibernate, like having the option to record the DDL that Hibernate spits out to the evolutions directory)
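That kind of recording can be approximated with Hibernate's SchemaExport tool, which can write the generated DDL to a file instead of executing it. A minimal sketch, assuming the Hibernate 3.x API and a hypothetical output path; the !Ups/!Downs markers Evolutions expects would still have to be added by hand:

import org.hibernate.cfg.Configuration;
import org.hibernate.tool.hbm2ddl.SchemaExport;

public class DumpDdl {
    public static void main(String[] args) {
        // Build the same Configuration the application itself uses
        // (hibernate.cfg.xml plus the annotated classes).
        Configuration cfg = new Configuration().configure();

        SchemaExport export = new SchemaExport(cfg);
        export.setOutputFile("conf/evolutions/default/1.sql"); // hypothetical path
        export.setDelimiter(";");

        // script=true writes the DDL out, export=false ensures nothing
        // is executed against the database.
        export.create(true, false);
    }
}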

Well, you ask a very good question. I struggled with this problem on Grails, so I don't really have a solution, only some thoughts. I will start with a comparison of Evolutions with Liquibase:
Liquibase is a mature solution; even if the plugin isn't under development any more, the underlying library is. So I think it's an acceptable option.
If you use Evolutions you have one big disadvantage compared with Liquibase: you must write your SQL directly, so the scripts depend on your database system. Think about booleans and their representation in different databases. So you lose a benefit Hibernate gives you.
Now to the general problem. I think you have two options:
Let Hibernate handle the database structure for you, and fall back to Liquibase or Evolutions only in the cases Hibernate can't handle. Unfortunately you can run into trouble if you have complicated update scenarios, but you gain faster development.
Ignore all the DDL features of Hibernate and do everything with Liquibase or Evolutions. This is the most reliable and robust solution, but obviously it means much more work during development.
So what is my recommendation? I would try the following approach: develop with a distributed version control system, like bzr or git, and use feature branches. On feature branches, always use the Hibernate functionality. Before you merge the work into trunk, create a Liquibase script (these scripts can be generated by Liquibase, with some manual customizing). That way you can develop a feature very quickly, while trunk always has the robust solution 2.
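For completeness, applying such a generated changelog doesn't need a framework plugin; the Liquibase Java API can do it directly. A rough sketch, assuming the Liquibase 3.x API and a hypothetical changelog path and JDBC URL:

import java.sql.Connection;
import java.sql.DriverManager;

import liquibase.Liquibase;
import liquibase.database.Database;
import liquibase.database.DatabaseFactory;
import liquibase.database.jvm.JdbcConnection;
import liquibase.resource.FileSystemResourceAccessor;

public class ApplyChangelog {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:h2:./devdb", "sa", "")) { // hypothetical connection
            Database db = DatabaseFactory.getInstance()
                    .findCorrectDatabaseImplementation(new JdbcConnection(conn));
            // Point Liquibase at the changelog created on the feature branch.
            Liquibase liquibase = new Liquibase(
                    "db/changelog.xml", // hypothetical path
                    new FileSystemResourceAccessor(),
                    db);
            liquibase.update(""); // empty string = no contexts
        }
    }
}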
However, be aware that this isn't a proven approach in a big project; I have only tested this strategy with Hibernate and Liquibase on a small project, where it worked fine.
So it would be great to get feedback.

Regarding having Hibernate spit out the SQL to get you started with Evolutions or Flyway, take a look at this: http://docs.jboss.org/hibernate/orm/3.3/reference/en/html/toolsetguide.html#toolsetguide-s1-6
EDIT: I actually made a plugin to bootstrap your migration script. I think it might be useful to most people who come across this thread:
http://web.ist.utl.pt/~joao.a.p.antunes/2014/08/09/play-2-2-x-jpa-hibernate-database-migration
Cheers!

Related

Database Agnostic Application

The database for one of the applications that I am working on has not been confirmed by the business yet.
The best guesses are Oracle and DB2.
What I've heard is that initially the project will go live on DB2 V9 and then move to Oracle 11g.
We are using Spring 3.0.5, Hibernate 3.5, JPA2 and JBoss 5 for this project.
So what are the best practices here going into the build phase and test phase?
Shall I build using DB2 first and worry about Oracle later (this doesn't sound right)?
Or shall I write using JPA (Hibernate) and then generate the database schema?
Or something else?
PS: I have no control over the choice of the DB, what and when, as these are strategic decisions made by people sitting in nice rooms getting fat cheques and big bonuses.
Thanks,
Adi
Obviously you lose access to specific features of the database if you write your application database-agnostically. The database is, except for automatic optimizations done by JPA and Hibernate, reduced to its common features. You have to set some things to automatic and trust JPA/Hibernate to do the right thing where you could have configured them specifically if you knew the database (e.g. id generator strategies).
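As an illustration of the id generator point, a portable JPA2 mapping leaves the strategy choice to the provider. A minimal sketch (the entity itself is made up):

import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;

@Entity
public class Customer {

    // AUTO lets the provider pick whatever the current database supports
    // (a sequence on Oracle, identity or a table elsewhere) instead of
    // hard-coding a vendor-specific generator.
    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    private Long id;

    private String name;

    // getters and setters omitted
}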
But it seems that the specific developer features of the database are not relevant to the decision, so they can't be relevant to the application. The other reasons that may influence the decision (like price, money, cash, personal relations, management tools, hardware requirements, existing knowledge and personnel) can only be speculated about.
So IMHO you don't have a choice: strictly avoid anything database-specific. That includes letting JPA/Hibernate generate the schema (your point #2); in this project setup you shouldn't tinker with the database manually.
Well... sadly there ARE some hidden traps in JPA/Hibernate development that make it database-dependent (e.g. logarithmic functions are not mapped consistently). So you should run all your tests against all the possible databases from day one. As you write "Best guess is...", just grab whatever candidate databases are available and test against them. That should be easy to set up with the given stack.
And you should try to accelerate the decision about the database used, if possible.
Just "write using JPA (Hibernate)" develop it to be de database agnostic. Put all you business logic in java code not stored procedures.
If you are using spring you don't need jboss you could use just tomcat, about a quarter of the foot print, and much simpler imho.
Spring vs Jboss and jboss represents all that is bad, while spring represents all that is good in Java enterprise development
We had this issue and had to migrate late in the project, leading to a lot of extra work, frustration and delays.
My advice is to define an abstraction layer. Go to the point where you could have a data model without any database at all, say with plain tables or text files.
Then, when you have to switch to some database, you can optimize for it while staying free to continue application development against the already developed model. That way you don't delay the developers working on the app while one of them is tuning the DB2 layer. When everything is duly validated, the team can switch over to it.
I will disagree with the currently accepted answer's suggestion to avoid database-specific things. From a performance perspective that would be a pity, and supporting both vendors is definitely doable.
JPA/Hibernate and also jOOQ can abstract over a lot of things, and if you're using the query builder APIs of either technology (criteria query in JPA, or jOOQ for more advanced SQL), you can get very far in a vendor-agnostic way without removing all the vendor-specific stuff. For example, you can easily create a vendor-specific predicate like this:
.where(oracle ? oracleCondition() : db2Condition())
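Fleshing that fragment out a little: with jOOQ, the vendor-specific part can be isolated in a single Condition. A hedged sketch, where the table name and the Oracle predicate are purely illustrative:

import static org.jooq.impl.DSL.condition;

import org.jooq.Condition;
import org.jooq.DSLContext;

public class VendorSpecificQuery {

    // 'create' is a configured jOOQ DSLContext, assumed to exist.
    static void findByNamePattern(DSLContext create, boolean oracle, String pattern) {
        // Only this predicate differs between the two vendors.
        Condition nameMatches = oracle
                ? condition("regexp_like(name, ?)", pattern) // Oracle-specific
                : condition("name LIKE ?", pattern);         // portable fallback

        create.select()
              .from("customer")
              .where(nameMatches)
              .fetch();
    }
}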
What you should do from the very beginning of such a project, once you know you'll have to support both dialects, is run integration tests on both database products. For this I recommend Testcontainers, which makes running such tests quite simple. If you have to add support for another dialect, and you're using one of the above abstractions, you can simply add another Testcontainers configuration, check whether your application still works, tweak 2-3 things, and you're set.
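For example, with the Testcontainers JUnit 5 support, a DB2 integration test could look roughly like the sketch below (the image tag and the test body are placeholders, and an analogous OracleContainer test would cover the other dialect):

import org.junit.jupiter.api.Test;
import org.testcontainers.containers.Db2Container;
import org.testcontainers.junit.jupiter.Container;
import org.testcontainers.junit.jupiter.Testcontainers;

@Testcontainers
class Db2IntegrationTest {

    // A throwaway DB2 instance, started in Docker for this test class.
    @Container
    private static final Db2Container DB2 =
            new Db2Container("ibmcom/db2:11.5.0.0a").acceptLicense();

    @Test
    void schemaAndQueriesWorkOnDb2() {
        String url = DB2.getJdbcUrl();
        String user = DB2.getUsername();
        String password = DB2.getPassword();
        // Point your JPA/jOOQ configuration at this URL and run the same
        // assertions you run against the Oracle container.
    }
}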
Disclaimer: I work for the company behind jOOQ.

What is the best database to use with Grails in an enterprise application?

I realize this has flame potential, please refrain. That being said, I'm interested in what databases people have used with Grails. What positive experiences and what horror stories are out there?
I love MySQL, but there are a few significant bugs that are impacting me between Hibernate and MySQL, particularly as it pertains to index creation. So I guess my question is really, what is the most stable database for integration with Grails? Or what database has the fewest bugs with respect to Grails?
Or what database has the widest use in conjunction with Grails? I also realize that these questions are somewhat orthogonal and opposing. Anyway, I'd like to open it up to discussion.
I have used Hibernate in enterprise apps since version 1. My personal ranking is:
Oracle: fast and stable; lots of DBAs know it and how to tune performance, handle backups and so on.
SQL Server: same as above (but not as fast as Oracle).
DB2: not so easy to use with Hibernate (I hit several issues with date and char datatypes).
MySQL: not so easy to manage or to find professional support for (may be different for you), but the Hibernate side works great.
As Gregg said, this is a Hibernate question - Grails does all its DB interaction via that (except for any custom SQL you write).
The only problem you might hit is with the GORM DSL not correctly creating any tricky hibernate mappings you require for a particular DB (especially if it's a legacy one). But GORM is pretty mature these days and I personally haven't hit any issues lately.
We run MySQL in production on a public web application and it has been fine. We've also deployed 'enterprisey' apps on top of Oracle, which also went well, except for a couple of issues with id generator configuration, if I recall correctly. But I think those have been fixed in the latest Grails version.
In summary, go with your gut feel based on previous experience with hibernate.
cheers
Lee

Using a common database for collaborative development

Some of the people in my project seem to think that using a common development database that everyone connects to is the best thing. I think it isn't, and that each developer having his own database (with periodically updated data dumps) is best. Am I right or wrong? Have you encountered any problems with either approach?
Disk space and CPU should be cheap enough that every developer can run their own instance of the database, with an automated build under version control. This is needed to allow developers to be bold in hacking on the database, in isolation from any other developer's concurrent hacking.
The caveat being, of course, that any changes they make to their private instance are useless to anyone else unless it can be automatically applied during the build process. So there needs to be a firm policy that application code can't depend on any database state unless that state is represented by version-controlled, unit-tested changes to the DDL.
For an excellent guide on the theory and practice of treating the database definition as another part of the project code, and coordinating changes and refactorings, see Refactoring Databases: Evolutionary Database Design by Scott W. Ambler and Pramod Sadalage.
I like having my own copy of the database for development, because it gives you the flexibility to rapidly change things without worrying how it will impact others.
However, if all the developers are hacking away on their own copy of the database, it becomes more and more difficult to merge everyone's work together in the end.
I think you can get the best of both worlds by letting developers work on a local copy during day-to-day development, but each developer should probably merge their work into a common copy on a pretty regular basis. Writing a lot of unit tests helps too.
We share a single database amongst all our developers (20-odd), but we've structured it so that everyone has their own tables.
You don't need a separate database per developer if you structure the application right. It should be configurable which database or table-prefix it uses anyway so you can easily move it between instances (unit test, system test, acceptance test, production, disaster recovery and so on).
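One way to get that configurable table prefix with Hibernate is a custom naming strategy. A minimal sketch, assuming the Hibernate 3.x NamingStrategy API and a made-up db.prefix system property:

import org.hibernate.cfg.ImprovedNamingStrategy;

public class PrefixNamingStrategy extends ImprovedNamingStrategy {

    // Hypothetical per-developer setting, e.g. -Ddb.prefix=alice_
    private final String prefix = System.getProperty("db.prefix", "");

    @Override
    public String tableName(String tableName) {
        // Prepend the developer's prefix to every mapped table name.
        return prefix + super.tableName(tableName);
    }
}

It would then be registered on the Hibernate Configuration with setNamingStrategy(new PrefixNamingStrategy()).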
The advantage to using a single database is that the cost of maintenance is amortized. You don't have your DBAs trying to handle a lot of databases (or, if you're a small-DB shop, you don't have every developer trying to maintain their own database when they're better utilized in developing).
Having a single point of failure is not a good thing, is it?
I prefer a single, shared database. But it's very dependent on the situation and the applications being developed.
What works for me may not work for you. Go with your gut.
If you are working with Hibernate or any Hibernate-based platform, you can configure your database to be created when you start your server (the create-drop option). This is very useful when you are adding new attributes to your classes. In that case, each developer must have his own copy of the DB.
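For reference, the create-drop option is just a property on the Hibernate configuration. A minimal sketch, assuming a classic hibernate.cfg.xml setup:

import org.hibernate.SessionFactory;
import org.hibernate.cfg.Configuration;

public class DevSessionFactory {

    static SessionFactory build() {
        Configuration cfg = new Configuration().configure();
        // Recreate the schema on startup and drop it on shutdown --
        // handy on a private dev DB, destructive on a shared one.
        cfg.setProperty("hibernate.hbm2ddl.auto", "create-drop");
        return cfg.buildSessionFactory();
    }
}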
If you are not changing the DB structure at all then you can use a single shared DB.
Even in this second case it is not a must. I prefer to have my own DB where I can do whatever I want. On the other hand, remember that some queries can take a lot of time, and that will affect your whole team if you are sharing a DB.

Django database scalability

We have a new Django-powered project which is expected to see heavy traffic (meaning heavy DB interaction), so we need to consider database scalability in advance. After some research, the following questions are still not clear to us:
Coarse-grained: how to assign one DB table (a Django model) to a specific DB (possibly on another server)?
Fine-grained: how to assign a group of table rows to a specific DB (so-called sharding, possibly also on another DB server)?
How to direct writes and reads to different DBs? (This will be helpful for future MySQL master/slave replication.)
We are looking for a solution that would:
be transparent to the application code (meaning no additional code in views.py);
work at the ORM level (meaning it only needs to be specified in models.py);
be compatible with the current (or future) Django release (to keep changes minimal when upgrading Django).
I'm still researching, and will share in this thread later if it bears fruit.
I hope anyone with experience can answer. Thanks.
Don't forget about caching either. Using memcached to relieve your DB of load is key to building a high performance site.
As Alex said, Django core doesn't support your specific requests for those features, though they are definitely on the to-do list.
If you don't do this in the application layer, you're basically asking for performance trouble. There aren't any really good open source automation layers for this sort of task, since it tends to break SQL axioms. If you're really concerned about it, you should be coding the entire application for it, not simply hoping that your ORM will take care of it.
There is a GSoC project by Alex Gaynor that will in the future allow using multiple databases in one Django project. But right now there is no working cross-RDBMS solution.
There is no solution for this right now either.
And again, there is no cross-RDBMS solution. But if you are using MySQL you can try an excellent third-party Django application called mysql_replicated. It makes it easy to set up a master/slave replication scenario.
Here, for some reason, we are using Django with SQLAlchemy. Maybe the combination of Django and SQLAlchemy also works for your needs.

How can I get my database under version control with Perl?

I've been looking at the options for getting our database schemas under version control. It seems that Ruby folks have got Rails Migrations, and .NET folks have got a few options (for instance this, this, and this). What about Perl?
I've seen this thread on PerlMonks, which doesn't have much, although it mentions DBIx::Migration::Directories. Is anyone actually using this module, or some other module? Or do you roll your own DB migration solutions?
Gratuitous details:
We don't use DBIx::Class for the most part
We use MySQL
We use SVN
At work, we use a modified version of DBIx::Migration (it has some limitations, such as no more than 10 migrations). Then, you have a core schema that you've dumped from your database and when the version number is too low, you upgrade your database using the migrations from the migration schema directory.
I also highly recommend the Database Refactoring book. Amongst other things, it will give you excellent techniques for managing migrations safely in such a way that if you need to roll back, you don't lose data (such as when you drop a column you think you don't need).
To help with the automatic deprecation schedules it suggests, I've written Devel::Deprecate so that you don't need to remember when to do the deprecations. Your code will complain loudly for you (and only in testing, not in production).
Important: You'll periodically find that you're applying so many database migration levels with this technique that you'll sometimes need to "bump up" your minimum base migration because it takes too long to rebuild the database. Just take a new dump of the database at the desired migration level and remove all migrations less than or equal to that level.
Update: Fast forward a few years and today I recommend sqitch. It's designed from the ground up to handle the case of putting a database under version control without tying you to a particular programming language or VCS.
One very interesting project that's still probably a little young to rely on is Adam Kennedy's ORLite::Migrate, which takes its inspiration from Rails migrations. He wrote a very interesting journal entry over at use.perl.org about his plans, and I hope to keep an eye on it in the future.
It does appear that this package only works with SQLite at the moment, but I think Adam is planning to build it out to be more database-agnostic in the future.
In POPFile we use our own solution. We store a schema version number in the db and if the program detects that there is a newer schema, it will update the db accordingly. This is not exactly the best and most fun part of our code.
To be honest, I fail to see the advantage of using DBIx::Migration::Directories if you aren't already using DBIx::Class. You have to provide the SQL, the version numbers and the database handle. You might as well provide a little more code to find the SQL file and feed it to the database.
Of course, having the schema in version control is a great bonus.
We use a system similar to what Manni described. The two big disadvantages are:
You can't roll back schema changes (typically rollback is rare, not well tested and hard anyway, so having to do it manually isn't a big deal IMO).
Using a sequential version number is a pain when you develop in multiple branches -- since you are using SVN this isn't as likely to be an issue as if you were using git though. :-)
The script I use is here: database_update, and there's a small example data file.
How about sqitch? It advertises itself as a "database change management application".
There is an interesting CPAN module (Database::Migrator). I have used it, and it works fine for handling your project's migrations.
Each migration goes into its own directory. Migrations are applied in sorted order; typically you name them starting with a number prefix. A migration directory can contain either SQL or Perl files.
