Background:
I am using GitHub to store a ZF2 application.
The database schema and the actual data stored in it are not kept in version control. At the moment I am in development mode, so I have some database dump scripts that I load into the database when I need to. I also tweak entries in the database via phpMyAdmin when I need granular control for immediate testing purposes. I am also looking into using Doctrine ORM, so my schema will be part of my code via annotations, and that will be checked into GitHub. Doctrine ORM will generate the actual schema for me, although it is still a separate step in the deployment process. The actual data, however, will still be outside of the application and outside of the repository; it currently has to be dealt with separately and is not automated.
Goal:
I want to be able to deploy the ZF2 application, the database schema, and the data onto Zend Server and have it "just work" in the most automated, least manual way possible.
Question:
What is a recommended, best-practice way to deploy every aspect of a ZF2 application in the most automated, least manual way possible and have it "just work"? Let's focus on Development and Testing mode here, as in Production it may be good to have separate deployment steps to protect against accidental overwrites of live data.
You can try Phing (http://www.phing.info/) for deploying your PHP application, adjusting directory permissions, running database migrations, running unit tests, etc. I have used Phing in a couple of my projects with great success.
Related
I have a flask app that is deployed on Google's App Engine. I have noticed a minor bug and I would like to fix it but my database is already populated.
How can I make this minor code change and push/deploy back to my app without losing all my data? (This is probably a basic question, but I'm not finding much. All the tutorials online focus on creating and deploying the app, not updating it.)
Thus far, I have been dropping and re-creating the tables whenever I redeploy, mostly out of ignorance. Here are the steps I have followed:
1) Make the change in my app
2) Commit and push changes to Bitbucket source code
3) In Google Cloud SDK: git pull
4) In Google Cloud SDK: gcloud app deploy
These steps result in an empty database because the directory I am pushing from my local computer has an empty database. Is this where I should be using git merge?
Is this a database "migration" or is this a "git merge"? I'm not sure what the right terms are to use to research this further. Thanks.
There are a couple of angles to your question. I'm going to try to give you some information, but let me warn you: this isn't going to be a trivial change to your workflow, and you'll have to change some things.
First of all, based on the way you worded your question I get the idea that you commit your database to git along with your code. If I got this right, then this is something that you need to stop doing. The database is not code, so it should not be committed to source control.
You should have a completely independent database on each installation of your application. For example, you will have a database on your own machine to do development. You will also need another database in your gcloud deployment. You may need more databases if you have other uses for your application. A very common third database for many people is one that is used for automated tests, which could also be located in your local development machine, but is not the same database that you use for day to day development.
To make changes to your database schema you will not drop and recreate tables anymore; you have clearly already realized that this needs improvement. A good approach is to use a database migration framework. These tools allow you to generate short scripts that make the changes to the database in a more focused way, without destroying and recreating everything, so in general the data is not lost. For Flask-SQLAlchemy, the best option for database migrations is Flask-Migrate, which is a lightweight wrapper around the Alembic migration framework. (I might be biased here, as I'm the author of the Flask-Migrate extension!)
Documentation for Flask-Migrate: https://flask-migrate.readthedocs.io/en/latest/.
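To illustrate what a migration does, independent of any framework, here is a minimal sketch that uses Python's built-in sqlite3 module rather than Flask-Migrate or Alembic. The users table, the email column and the function name are made up for the example. The only point is that an incremental ALTER TABLE keeps the rows that dropping and recreating the table would throw away.

```python
import sqlite3


def add_email_column(conn: sqlite3.Connection) -> None:
    """Hypothetical migration: add a nullable column without touching rows."""
    columns = [row[1] for row in conn.execute("PRAGMA table_info(users)")]
    if "email" not in columns:  # skip if the migration already ran
        conn.execute("ALTER TABLE users ADD COLUMN email TEXT")
        conn.commit()


# Stand-in for an already populated database (in memory for the example).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('alice'), ('bob')")
conn.commit()

add_email_column(conn)
rows = conn.execute("SELECT name, email FROM users ORDER BY id").fetchall()
print(rows)  # existing rows survive; the new column is NULL for them
```

A migration framework generates and orders scripts like this for you and records which ones each database has already received.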
Twist to the standard “SQL database change workflow best practices”
Background
ASP.NET/C# Web App
MS SQL
Environments
Production
UAT
Test
Dev
We create patch scripts (XML and SQL) that are source controlled in Mercurial. We have a command-line utility that installs patches to the DB (utility.exe install –patch) from a Release folder that the build packages. Patches have metadata that helps determine when a patch should run, and we log installed patches in a table in the target DB. All of this was covered in the 3-year-old question:
SQL Server database change workflow best practices
Our Problem/Twist
I think this works well for tables, views, functions and stored procedures. We struggle with application configuration data. Here are some touch points on application configurations.
New client. The BA performs a system study and fit analysis. Out of this comes a configuration Word document listing which application configurations need to be set up. Note that some of these may also come in phases over time. We need to get these new configurations into the system for the developer and for client UAT.
Developer works on feature request or bug fix. A new configuration change comes out of that change. The configuration needs to make it into the system for testing and promotion to UAT and up.
QA finds that the developer missed an associated configuration change. That configuration needs to make it into the system for promotion to UAT and up.
Build goes to UAT. The client performs acceptance testing but finds they really want to change another, unassociated configuration and have it promoted with the changes. In other words, they found they want to change a business process through a configuration. The configuration needs to make it into the system for promotion to PRD.
As the client operates in PRD they may tweak application settings. These configurations need to make it into the system for future development and testing.
The general issue is making sure we account for all the configurations and do not accidentally miss any during promotions, which causes grief.
Our Attempts At A Process
a. We had a member of the QA team write patches (XML and SQL) and check those in. This requires a build to make sure those get into the package. This approach really only took care of item 1 above, and we fell apart on the other items. The nice thing is that, for the items that made it into the patches, it was just an install with the utility.
b. A developer threw together a Config page in the application. All the configurations could be uploaded and downloaded via an XML document, but it requires the app to be running. For item 1, a member of the QA team would manually set up configurations in the application and then download the Config.xml file. This XML file would be used to upload configurations in other environments. We would use a text diff tool to look at differences between config.xml files from different environments. This addressed item 1 and the other items but had problems: not all configurations made it into the XML document (this just needs to be fixed by a developer); some of the configurations didn’t have a UI in the application, so you still had to go to the database manually for some; comparing the XML documents with a text diff was difficult at times (it looked mostly due to sorting, but I’m sure there are other issues); the XML was not very human readable; and finally, the XML document did not allow for deleting existing incorrect or outdated configs.
c. Recently we went with option b, but over time, for a new client, we just started tracking configs manually and promoting them by hand (UI and DB) through the environments. Needless to say, this led to lots of human errors.
So we have been looking at solutions. Eventually it would be great to get as much automation in as possible. I’m looking at going with the scripting approach and focusing on process and documentation, and at using Redgate data compare in addition to the comparison we had been doing on config.xml. With Redgate we have to create views, though, and there is no way to create update scripts from that approach other than updating the scripts manually. It does at least allow a comparison without the app running. I’m also looking at pulling the configs out of our normal patches and making them a system independent of the build (utility.exe –patch –config). By focusing on process I mean things like this: if we compare and find a config change, whether or not it was reported by the client, we still script it, which means we need a process in place to quickly revalidate the config install before promoting to the next level. As for documentation, I’m looking at making the original QA document a living document instead of just an upfront document. The goal is to enhance clarity and reduce missing configurations during promotion. Unfortunately, it doesn’t improve speed of delivery.
Does anyone have any recommendations or best practices to pass along? Thanks.
Can I ask exactly what you mean by application configuration? I'm interpreting that as both:
Config files in the web application
Static reference data inside the database
Full disclosure: I work for Red Gate. You might be interested in taking a look at Deployment Manager, a deployment tool that deploys applications, databases and configuration. It's free for up to 5 projects and target servers.
The approach it uses is to package application code and the database state into packages. These packages can be deployed into dev, test, staging and production environments. The same package is deployed to each environment.
Any application configuration that needs to change between environments is handled in one of the ways below:
Variable substitution in web.config. The tool allows you to specify override values for variables in these files, and set these per environment/server
Substituting the web.config file per environment.
Custom powershell scripts that are run pre/post deploy. You could use these to execute custom SQL based on the environment or server.
Static data within the database, using SQL Source Control's static data feature. I've written a blog post about how to supply different sets of static data to different environments/customers.
This allows you to source control the application configurations and deploy them to different environments.
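To make the variable-substitution idea concrete, here is a rough sketch in Python using only the standard library. It is not how Deployment Manager implements it. The template line, the variable names and the server names are all invented for illustration.

```python
from string import Template

# Hypothetical config template; ${...} marks values that differ per environment.
CONFIG_TEMPLATE = Template(
    '<add key="ConnectionString" value="Server=${db_server};Database=${db_name}" />'
)

# Hypothetical per-environment overrides, kept in source control with the code.
OVERRIDES = {
    "test": {"db_server": "test-sql01", "db_name": "App_Test"},
    "uat": {"db_server": "uat-sql01", "db_name": "App_UAT"},
}


def render_config(environment: str) -> str:
    """Fill the template with the overrides for one environment."""
    # substitute() raises KeyError if a variable has no value, so a missing
    # setting fails the deployment instead of being silently left blank.
    return CONFIG_TEMPLATE.substitute(OVERRIDES[environment])


print(render_config("uat"))
```

Because the overrides live next to the code, a configuration change shows up in a diff and is promoted through the environments together with the build.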
I'm trying to implement automated integration tests for my application. It's a very complex monster. You could say that its database and part of the filesystem are part of its state, because it saves image files in the hard drive, and references to those in the DB. The software needs all those, in a coherent state, to work properly.
Back to writing tests: to run any relevant test, I need some image files in the filesystem and certain records filled in the database. I thought of putting all of these in a separate folder called TestEnvironmentData in the repository and retrieving them from the Continuous Integration server (TeamCity), but a colleague said the repo is quite full as it is, and that I should set up a special directory, and databases, only on the Continuous Integration server. I don't like that because the tests' success depends on me manually maintaining stuff on the server, and restoring the initial state before every test becomes cumbersome.
What do you guys do when you need to write integration tests for an app like this? The main goal is having an automated test harness to approach a large scale refactoring. There's lots of spaghetti code and the app's current architecture is hardly unit testable, that's why I decided on integration tests first.
Any alternative approach is welcome.
Developer repeatability is key when setting up a Continuous Integration server. I have set one up for my last three employers, and I have found the key to success is the developers being able to run the same tests on their dev system and get the same results as the CI server.
The easiest way to do this would be to check the test artifacts into source control, but you could also use Dropbox or a network share that you copy them from in one of the build steps.
For a .NET solution I have always used MSBuild, as you can most easily replicate the build process of Visual Studio and get the same binaries/deployables. As for keeping your database in sync so that tests are repeatable, in the past I used the MbUnit test framework and its [Rollback] attribute, which rolls back any changes to SQL Server that happened in the test. I believe that NUnit now has this attribute as well.
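The rollback idea is not specific to MbUnit. As a rough, language-neutral sketch, here it is in Python with the built-in sqlite3 module standing in for SQL Server. The images table and the rolled_back helper are invented for the example, and it only works as long as the code under test does not commit on its own.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def rolled_back(conn: sqlite3.Connection):
    """Run a test body, then undo every uncommitted change it made."""
    try:
        yield conn
    finally:
        conn.rollback()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (id INTEGER PRIMARY KEY, path TEXT)")
conn.execute("INSERT INTO images (path) VALUES ('baseline.png')")
conn.commit()  # the committed rows are the known starting state

with rolled_back(conn) as db:
    db.execute("INSERT INTO images (path) VALUES ('created-by-test.png')")
    count_during_test = db.execute("SELECT COUNT(*) FROM images").fetchone()[0]

count_after_test = conn.execute("SELECT COUNT(*) FROM images").fetchone()[0]
print(count_during_test, count_after_test)
```

Files written to disk by the test are not covered by a database rollback, so they would still need their own cleanup step.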
The CI server is great for finding code that breaks existing functionality but unless developers can reproduce the error on their machine they won't trust the CI server for some time.
First of all, we use Maven to build our code. It's like Ant, but it relies on convention instead of configuration for many things, like Ruby on Rails does. One of those conventions is a standardized directory structure:
(project)----src----main----(language)
         |      |       \--resources
         |      \--test----(language)
         |              \--resources
         \--target---...
Using a directory structure like this makes it easy to keep your application resources and testing resources near each other, yet still be able to build for test or build for production, or just build both but just package up the application parts after running the tests.
As far as resetting the database between tests, how you do that depends greatly on the DBMS you're using. For instance, if you're using MySQL, it's very easy to get the test data the way you want it and do a mysqldump to a file that you then load before the test. With other DBMSs you may have to drop and recreate the tables and reload the data, or make separate tables for the starting point and use a CREATE/SELECT SQL statement to duplicate them each time.
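The "load a dump before the test" approach can be sketched like this, with Python's sqlite3 module standing in for MySQL and an inline string standing in for the dump file. The table and the fresh_database helper are hypothetical.

```python
import sqlite3

# Hypothetical baseline dump; with MySQL this would be the mysqldump output.
BASELINE_SQL = """
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO customers (name) VALUES ('first'), ('second');
"""


def fresh_database() -> sqlite3.Connection:
    """Return a new database loaded from the baseline dump."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(BASELINE_SQL)
    return conn


# Each test starts from its own copy, so changes cannot leak between tests.
first = fresh_database()
first.execute("DELETE FROM customers")
first.commit()

second = fresh_database()
remaining = second.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
print(remaining)
```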
There really is no reliable way around the "reset the database between tests" step.
We are working on a project where the database requirement is not clear, so we are building a database-agnostic application.
See my previous question here: Database Agnostic Application
Now I want to test my Spring application's DAOs with multiple databases. I've written a number of test cases using TestNG and DBUnit.
When I run these tests in a CI environment, I want them to test the application against all the configured databases. I've installed the databases on the 'test server'.
e.g. I want something like this:
for ( each database configured ) {
    run each dao test
}
I'm not sure what the best way of doing this is. Any help is welcome.
Thanks,
Adi
If you want to be database independent, you have to test against every single database system you want to support. There are very fine differences which leak through Hibernate.
What I did in the past was to make the tests retrieve their database configuration through a system property, typically by using hibernate_.property instead of the default hibernate.property. Then set up CI jobs that set the property to different values, and provide one hibernate_xxx.property for every database to test against. I did this using JUnit Rules, to have the logic in one place. I don't know the appropriate tool for TestNG.
I'm not too fond of the loop construct you are hinting at, because it might make it difficult to run a test suite against a single specific database.
I'm also not too fond of DBUnit, because it seems to make maintaining test data rather painful. In most cases I prefer a handcrafted DSL. Have a look at some articles I wrote about it:
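The selection logic itself is tiny. Here is a rough sketch of it in Python rather than as a JUnit Rule, using an environment variable in place of a Java system property. The variable name TEST_DATABASE and the list of database names are assumptions made for the example.

```python
import os

# Hypothetical names; each one is expected to have a matching properties file.
SUPPORTED_DATABASES = ("oracle", "db2")


def properties_file(environ=os.environ) -> str:
    """Pick the properties file for the database named in TEST_DATABASE."""
    name = environ.get("TEST_DATABASE", "")  # assumed variable name
    if name not in SUPPORTED_DATABASES:
        raise ValueError(f"unknown or missing test database: {name!r}")
    return f"hibernate_{name}.property"


# A CI job for Oracle would export TEST_DATABASE=oracle before running the suite.
print(properties_file({"TEST_DATABASE": "oracle"}))
```

Keeping the choice outside the tests means a developer can still run the whole suite against one specific database by setting a single value.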
http://blog.schauderhaft.de/2011/03/13/testing-databases-with-junit-and-hibernate-part-1-one-to-rule-them/
http://blog.schauderhaft.de/2011/03/20/testing-databases-with-junit-and-hibernate-part-2-the-mother-of-all-things/
http://blog.schauderhaft.de/2011/03/27/testing-databases-with-junit-and-hibernate-part-3-cleaning-up-and-further-ideas/
If you're building a database agnostic application and not using any of the inherent features of a specific database vendor, then the scope of your test cases should be to test the setup, manipulation, and accessing of the data through the DAO objects and less with the testing of the actual database backend. Hibernate 3.5 has dialects available for both Oracle 11g and DB2, so if you were writing test cases that tested the integration of the database agnostic application with a specific database vendor, then really what you are doing is testing that the hibernate dialects do as they say they do (which I'm sure has been covered by test cases in the hibernate project).
In other words, in your case I would think that the testing should focus more on the DAO retrieving the data that you think that it will retrieve after you've set that data up, and in-memory databases are fine for that.
Now all that said, both DB2 and Oracle have very good documentation related to setup. Indeed, both of them have "wizards" to do that. If you still think that it's prudent to test adding data to the database and retrieving it from the physical, non-in-memory database, then I would recommend setting up a "test database" environment and pointing your datasource to that during your continuous integration tests. If you're using Hudson or Jenkins for CI, you can set it up to run a script after the build completes that will truncate the database tables so that the next round of tests work from a blank slate.
EDIT:
I just saw the updates that you posted to your question, so let me address them. Since you already have the databases set up and configured, what you really want to do is dynamically select which database to use. One way to do this would be to set up your datasource using system properties that can be inherited from a properties file, and to run your tests in a "DB2-test" environment and an "Oracle-test" environment. Using this method, you'll have to set up the datasource programmatically and have it read system environment variables to determine which database it connects to. This would essentially require you to change your CI script to run the DB2-test environment first and then the Oracle-test environment, so your test suites will run twice.
Hope this helps!
JUnit 4.9 has a new feature: TestRule.
You should be able to write a rule that repeats a test for different databases.
There is this stack overflow question: How to Re-run failed JUnit tests immediately?
It is a slightly different question, but the solution should be the same technique.
I'm in the process of starting up a web site project. My plan is to roll out the site in a somewhat rudimentary form first and then add to the site functionality along the way.
I'm using Subsonic 3 for my DAL, and I'm expecting the database will go through multiple versions as the sites evolve. This means I'll need some kind of versioning and migration tools. I understand that Subsonic has built in migration possibilities, but I'm having difficulties grasping how to use these tools, in my scenario.
First there's the SimpleRepository model, where SubSonic "automagically" handles the migrations as I develop my site. I can see how this works on my dev machine, but I'm not sure how to handle deployments with this.
Would Subsonic run the necessary migrations on my live site as the appropriate methods are called?
Is there some way I can force all necessary migrations on a site while taking the site offline, when using the SimpleRepository model? (Otherwise I would expect random users to experience severe performance hits as the migration routines kick in.)
Would I be better off using the ActiveRecord model, and then handling migrations with the Subsonic.Schema.Migrator? (I suspect so)
Do you know of any good resources explaining how to handle this situation with the migrator? (I read the doc, but I can't piece together how I would use this in practice)
Thanks for listening/replying.
Regards
Jesper Hauge
I would advise against ever running migrations against a live site. SubSonic's migrations are really there to make development simpler and should never be used against a live environment. To be honest even using SubSonic.Schema.Migrator you're still going to bump into the fact that refactoring databases is an incredibly hard problem. For example renaming a column in a table using management studio is trivial, but what happens in the background involves creating an entirely new table and migrating all the constraints, data etc. before renaming the new table.
The most effective way I've found for dealing with this is:
Script all database changes as you make them in your development environment (SQL Server Management Studio will do this for you) and add these scripts to your source control.
As part of deployment (obviously back up first), run the migration scripts and then deploy the updated application on success.
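The two steps above can be sketched roughly as follows. This uses Python's sqlite3 module as a stand-in for SQL Server, and the script names, the applied_scripts tracking table and the apply_pending function are all made up for the illustration. Treat it as a sketch of the idea, not as something SubSonic provides.

```python
import sqlite3

# Hypothetical change scripts, in the order they were written. In practice
# these would be .sql files read from source control.
SCRIPTS = [
    ("001_create_pages", "CREATE TABLE pages (id INTEGER PRIMARY KEY, title TEXT)"),
    ("002_add_slug", "ALTER TABLE pages ADD COLUMN slug TEXT"),
]


def apply_pending(conn: sqlite3.Connection, scripts=SCRIPTS) -> list:
    """Run scripts that are not yet recorded as applied; return their names."""
    conn.execute("CREATE TABLE IF NOT EXISTS applied_scripts (name TEXT PRIMARY KEY)")
    done = {row[0] for row in conn.execute("SELECT name FROM applied_scripts")}
    applied = []
    for name, sql in scripts:
        if name in done:
            continue
        conn.execute(sql)
        conn.execute("INSERT INTO applied_scripts (name) VALUES (?)", (name,))
        conn.commit()  # record each script as soon as it succeeds
        applied.append(name)
    return applied


conn = sqlite3.connect(":memory:")  # stand-in for the real target database
print(apply_pending(conn))  # first deployment runs both scripts
print(apply_pending(conn))  # a repeat run finds nothing left to do
```

Recording applied scripts in the target database is what makes the deployment step safe to repeat on every environment.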
Whether you use ActiveRecord or SimpleRepository is then down to whether you want the extra features/complexity of ActiveRecord.
Hope this helps
I would use ActiveRecord: it is easy to use, and for any changes you just re-run the TT files, then build or publish your solution and you're done. SVN will keep multiple versions of your build, so if you make a mess of it you can just drop back a revision.