How to merge Drupal database changes - database

We currently use an SVN repository to ensure everyone's local environments are kept up-to-date. However, Drupal website development is somewhat trickier in that any custom code you write (for instance, PHP code written for a node body) is stored in the DB and the changes aren't recognized by the SVN working copy.
There are a couple of developers who are presently working on the same area of a Drupal site, but we're uncertain about how best to merge our local Drupal database changes together. Committing patches of database dumps seems clumsy at best and is most likely inefficient and error-prone for this purpose.
Any suggestions about how to approach this issue are appreciated!

Unfortunately, database deployment/update is one of Drupal's weak spots. See this question & answers as well as this one for some suggestions on how to deal with it.
As for CCK, you could find some hints here.
As for PHP code in content, I agree with googletorp that you should avoid doing this. However, if for some reason you absolutely have to do it, you could try to reduce the code to a simple function call. That way the function itself lives in a module (and is tracked via SVN). But then you are only a small step from removing the need for the inline code anyway ...

If you are putting PHP code into your database then you are doing it wrong. Some things do live in the database, such as views and CCK fields plus some settings. But if you put PHP code inside the node body you are creating a big code maintenance problem. You should really use the API and hooks instead. Create modules instead of ugly hacks with eval etc.

All that has been said above is true and good advice. To answer your practical question, there are a number of recent modules that you could use to transport the changes made by the various developers.
The "Features" module is a cure for the issue described: Drupal often provides nice features but stores lots of their configuration and structure in the DB. This module enables you to capture a feature and output it as a pseudo-module (it qualifies as a module, with .info and code files and all). Here is how it works:
1) Select the functionality/feature to export
2) The module analyses the modules, files and DB content that are required to rebuild that feature elsewhere
3) The module creates a pseudo-module that contains the instructions from step 2 and outputs everything (even SQL to rebuild the stuff in the DB) into a module package (and sets dependencies on the other modules required)
4) Install the pseudo-module on your new site and enable it
5) The pseudo-module replicates the feature you exported, rebuilding DB data and all
And you can tell your boss you did it all manually with razor focus to avoid even 1 error ;)
I hope this helps - http://drupal.org/project/features

By committing patches of database dumps, do you mean taking an entire extract of the db and committing it after each change?
How about a master copy of the database? Extract all tables, views, stored procedures, etc. into individual files, put them into SVN and do your merge edits on the individual objects?
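A minimal sketch of the "one file per object" idea, in Python. It uses SQLite's `sqlite_master` catalogue purely as a stand-in; a Drupal site would normally sit on MySQL, where you would query that server's own catalogue instead. The function name and the directory layout are my own assumptions.

```python
import sqlite3
from pathlib import Path


def dump_schema_objects(conn, out_dir):
    """Write each table, view, index and trigger definition to its own .sql file.

    Files are laid out as <out_dir>/<object type>/<object name>.sql so that
    the version control system can diff and merge them one object at a time.
    """
    out = Path(out_dir)
    written = []
    rows = conn.execute(
        "SELECT type, name, sql FROM sqlite_master "
        "WHERE sql IS NOT NULL ORDER BY type, name"
    )
    for obj_type, name, sql in rows:
        target = out / obj_type / f"{name}.sql"
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(sql.strip() + ";\n", encoding="utf-8")
        written.append(target)
    return written
```

Running this against each developer's database and committing the output would turn a schema conflict into an ordinary file conflict.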

Related

SSDT Circular reference: Complex project

I have a fairly complex setup with eight databases on a server each referencing each other (about every database referencing each other), giving way to quite a complex web. The design is far from ideal, but unfortunately this is something we have to work with.
We need to create an SSDT solution to facilitate CI/CD.
The whole project needs to be deployed from scratch on a new instance and I am trying to get my head around this, as I have limited SSDT knowledge for a project this scale.
The approaches I consider are as follows:
1) Split objects into shared objects, and reference the shared objects. This seems to be a nightmare to implement, as we would require different layers because of the complex web of references. (shared object referencing other shared objects). Also how do we deploy such a project on a blank server?
2) Create stubs for each object in a project being referenced by other objects, and make a database reference to these. This seems to be the easiest option, although it seems that if the object the stub is based on gets changed, the stubs also needs to be maintained otherwise the project will break. Is this the right assumption?
3) Only create stubs for projects required to compile (e.g. tables referenced by views in other databases), and ignore reference warnings. I am leaning towards this route as the stubs will be much smaller and the project easier to maintain, but I hate to ignore reference warnings.
If we deploy using the stubs option, do we need to deploy the stubs first and then delete them after successful deployment?
Another (more straightforward) question: what is the best way to deploy logins, users and object permissions?
Thanks for replying.
The question is too broad, but here are a few suggestions:
You can't do anything with a circular reference. There are some ways to work around it, but all of them are "hacky" and will most probably introduce more problems than they solve. So try to move objects in such a way that dependencies run in one direction only;
Use synonyms for ALL cross-database objects, so that there are no direct references outside the database;
I agree with Peter Schott that it is better to ignore logins and users for now, as handling them in SSDT is a bit of a pain and you need good expertise in SSDT to make it work properly.
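To make the synonym suggestion concrete, here is a small sketch in Python that renders the T-SQL for one synonym. The helper and all database/object names are hypothetical; only the `CREATE SYNONYM ... FOR ...` statement itself is standard T-SQL.

```python
def synonym_sql(schema, synonym, target_db, target_schema, target_object):
    """Render a CREATE SYNONYM statement pointing at an object in another database.

    Code inside the database then refers to [schema].[synonym] only, so the
    three-part cross-database name appears in exactly one place.
    """
    return (
        f"CREATE SYNONYM [{schema}].[{synonym}] "
        f"FOR [{target_db}].[{target_schema}].[{target_object}];"
    )


if __name__ == "__main__":
    # Hypothetical example: expose SalesDb.dbo.Orders locally as dbo.Sales_Orders.
    print(synonym_sql("dbo", "Sales_Orders", "SalesDb", "dbo", "Orders"))
```

A view that previously selected from the three-part name would select from `[dbo].[Sales_Orders]` instead.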

Is it possible to use Git as source control for code stored in a database?

I work on Labware LIMS, which has both configuration, and customization via its own programming language and internal code editor, and stores this customization code in database records. (Note, not the source code of the actual application itself, just the customization code a.k.a. LIMS Basic.) Almost everything in LIMS is stored in the database.
We want to investigate the possibility of using source control to protect this code but we don't know much more than the theory of using something like Git. (I have worked as a junior QA and used git but not as a dev and my knowledge is limited!)
Of particular use would be the merging tools, as currently we have to manually merge code in a text editor, if we even notice there is a conflict (checking content between dev and live is time-consuming and involves using multiple tools, some of which are 3rd party tools we have developed ourselves, which are hit and miss). I personally find it easiest to cut and paste into a text file and then use Beyond Compare.
There is no notification that the code is different when moving it from dev to live (no deployment as such, you just import an xml file) so we often have things going live that someone was working on unbeknownst to each other. I.e. dev 1 is working on the code in object 1, dev 2 gets a ticket to make a change to object 1, does so and puts their change Live, whatever dev 1 was doing is now also Live in whatever state it was in. (Because we don't always have time to thoroughly check what state each object is in between up to 3 different databases.)
Is it possible to use source control just on the code within the database, but not necessarily the database itself? (We have backups and such for that, but it's easy for some aspects of the system to get overwritten by multiple devs working on overlapping areas at the same time.)
If anyone reading this has any specific knowledge of LW LIMS, we are referring to the Subroutines mostly, we have versioned Analyses which stands in for source control for the moment and is somewhat effective but no way to control who is doing what on the subroutines other than a comment log at the top. I have tried to find any information on how other teams source control their code in LIMS but to no avail.
The structure of one of these tables can range from as simple as the code just existing in one field as a straight text dump with a few other fields such as changed_on, changed_by and name (Subroutines), or more complex with code relating to one record being sprinkled around in multiple rows on another table entirely (Analyses) but even if it could just deal with the simple scenario to start with that would be great!
TL;DR: Could the contents of the Code field in a database record be treated like a regular code object in other dev environments somehow and source controlled using Git? (And is anyone willing to explain it simply for me to follow?)
You need to version-control the table fields that hold the subroutines, but LW LIMS doesn't have IDE support for version control (such as Git, SVN, etc.), so the direct answer is no.
If you really want version control for the code in the database, you can create a Git repository and put only the code into it. When a file is updated, you can commit and push the changes. It is then easy to compare the differences between versions.
For more detail about Git, you can refer to the Git book.
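As a rough sketch of that export step, in Python: the script below writes the code field of each record to its own text file so that Git can track it. SQLite is only a stand-in for the real LIMS database, and the table name `subroutine` and the columns `name` and `code` are assumptions based on the fields mentioned in the question, not LabWare's actual schema.

```python
import sqlite3
from pathlib import Path


def export_subroutines(conn, repo_dir):
    """Write each subroutine's code to <repo_dir>/subroutines/<name>.txt.

    Returns the names whose file content changed, i.e. what Git would
    report as new or modified. Assumes names are valid file names.
    """
    target_dir = Path(repo_dir) / "subroutines"
    target_dir.mkdir(parents=True, exist_ok=True)
    changed = []
    query = "SELECT name, code FROM subroutine ORDER BY name"
    for name, code in conn.execute(query):
        path = target_dir / f"{name}.txt"
        # Normalise line endings so diffs show real changes only.
        new_text = code.replace("\r\n", "\n")
        old_text = path.read_text(encoding="utf-8") if path.exists() else None
        if new_text != old_text:
            path.write_text(new_text, encoding="utf-8")
            changed.append(name)
    return changed
```

After each export, `git add` and `git commit` in the repository directory record a snapshot, and `git diff` shows what changed between dev and live exports.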
LabWare LIMS has a number of options for version control. You COULD version the Subroutine table by adding a SUBROUTINE.VERSION field to the table, this works the same way as other versioned tables in LabWare where it asks you if you would like to create a new version of the object before saving. There are a few customers I work with that have done this.
Alternatively, (and possibly our more recommended method prior to LEM) there is the Snapshot capability where the system automatically takes a "snapshot" of objects as they are saved - when viewing these you have the ability to view them side by side in a comparison dialogue - it will show < or > for lines which are different.
Another approach is, if you have auditing turned on you are able to view the audit history for changes to specific objects - this includes subroutines.
One other approach is to use configuration packages, which can record version AND build numbers. Individual subroutines are probably a bit too granular for their intended design, though.
Lastly, since this question was originally posted we have developed a product called LabWare Environment Manager (LEM) which has some good change control functionality built-in.
For more information on the suggestions above, please have a look at the LabWare Technical manual for the version you are on. We also have a mailing list for questions like this to be posted. You might find an answer there. If you have access to our Support webpage you're able to search previous questions that have been asked. I'd also suggest that you get in touch with your Account Manager at LabWare who can help you answer some of your questions.
HTH

Database - Version Control - Managing dropped/deleted objects

We want to clean up our database schema and drop/delete objects which are no longer being used.
We suspect that sometime in the future we'll want to resurrect the removed functionality.
We've discussed the following options for dealing with dropped objects in version control:
1) Deleting the .sql files from source control once they are gone from the database and relying on the version history to store the definitions. Our concern with this approach is that sometime over the years source control will be moved and we will lose the history. It also seems difficult to know what to look for to recover if we can't see all the dropped objects.
2) Leaving the .sql files in source control but updating the definitions to "drop proc {someproc}". With this approach we are concerned about leaving objects in version control which no longer exist, and also about the risk of losing the history if the VCS was moved.
3) Creating a new repo for dropped objects and migrating .sql files to this repo once they have been dropped from SQL Server.
We're working in a Windows environment and are fairly new to working with VCS for databases. Currently Git + SSDT.
Currently option 3 is our preferred approach.
I see this a lot with database code: over time people end up with stuff in the database that is either not used or just does not work (think of a proc that references a table, where the table is modified but the proc is not).
The thing to do is to get everything into source control (which it looks like you have) and then create a tag or branch of all the code before and after deleting it, so you can get it back.
One of two things normally transpires: either the code was genuinely never used, or it was used at year end, and when you find out, the world is about to fall on your head, so you had better have a quick way to get it back.
Of course if you had a full suite of tests then even the year end process would be safe :)
I personally wouldn't use option 3, I would just keep the history in the main branch so you keep the history with it.
ed
There are a lot of good tools for versioning database changes. There is a good chance this question will be closed as "Too broad", but I'll try to suggest that you:
Read about, understand and try adding Liquibase to your development toolbox.
Adapt your workflow to use this additional layer. Technically it will be one more file (a changelog, in Liquibase terms) in the changesets where you change DDL and/or data.
These changelogs provide a good, smooth way of moving back and forth through a linear history of database changes. They are not so good (or I don't know The Right Way) for jumping directly between nodes of a diverged history, but that does not seem to be your case.
Of the options on your list, this is closest to option 1 (but it stores the changes to the database in version control, not its states).
Just to note another option, in SSDT you can mark the file property as Build Action = None. The file won't be included in the dacpac when this build option is selected. But I tend to agree with the idea that you should rely on your VCS to handle history.

Is there any good way to refactor a MEAN stack project?

Since the parts of a MEAN stack project are separate, it's really hard to refactor the whole project. I'm trying to do the following things:
Modify mongoose schemas
Reorganize server code
Rename some api calls and parameters
Modify Angular code to adapt to the new APIs
Are there any good ways to do these?
There is no tool dedicated to the MEAN stack for any of these. Yeoman has some generators, but the existing ones are only for creating, not for refactoring. You can still create your own Yeoman generator with custom actions, or any other server-side script that looks for patterns and changes names according to a given configuration file. This can also be automated with the gulp or grunt task runners, but it's really time-consuming ;)
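A minimal sketch of such a pattern-based rename script, in Python. The function name and the shape of the rename mapping are my own assumptions; in practice the mapping would be loaded from your configuration file and the function applied to every source file on both the server and the Angular side.

```python
import re


def apply_renames(source, renames):
    """Replace whole-identifier occurrences of each old name with its new name.

    `renames` maps old identifier -> new identifier. The lookarounds stop
    "getUser" from matching inside a longer name such as "getUserList".
    """
    if not renames:
        return source
    alternatives = "|".join(re.escape(name) for name in renames)
    pattern = re.compile(r"(?<![\w$])(" + alternatives + r")(?![\w$])")
    return pattern.sub(lambda match: renames[match.group(1)], source)


if __name__ == "__main__":
    # Hypothetical mapping; it would normally come from a JSON config file.
    mapping = {"getUser": "fetchUser", "userId": "id"}
    print(apply_renames("getUser(userId); getUserList()", mapping))
```

This is plain text substitution, so it will also rewrite matches inside strings and comments; review the resulting diff before committing.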

Altering database tables in Django

I'm considering using Django for a project I'm starting (fyi, a browser-based game) and one of the features I'm liking the most is using syncdb to automatically create the database tables based on the Django models I define (a feature that I can't seem to find in any other framework).
I was already thinking this was too good to be true when I saw this in the documentation:
Syncdb will not alter existing tables
syncdb will only create tables for models which have not yet been installed. It will never issue ALTER TABLE statements to match changes made to a model class after installation. Changes to model classes and database schemas often involve some form of ambiguity and, in those cases, Django would have to guess at the correct changes to make. There is a risk that critical data would be lost in the process.
If you have made changes to a model and wish to alter the database tables to match, use the sql command to display the new SQL structure and compare that to your existing table schema to work out the changes.
It seems that altering existing tables will have to be done "by hand".
What I would like to know is the best way to do this. Two solutions come to mind:
As the documentation suggests, make the changes manually in the DB;
Do a backup of the database, wipe it, create the tables again (with syncdb, since now it's creating the tables from scratch) and import the backed-up data (this might take too long if the database is big)
Any ideas?
Manually doing the SQL changes and dump/reload are both options, but you may also want to check out some of the schema-evolution packages for Django. The most mature options are django-evolution and South.
EDIT: And hey, here comes dmigrations.
UPDATE: Since this answer was originally written, django-evolution and dmigrations have both ceased active development and South has become the de-facto standard for schema migration in Django. Parts of South may even be integrated into Django within the next release or two.
UPDATE: A schema-migrations framework based on South (and authored by Andrew Godwin, author of South) is included in Django 1.7+.
As noted in other answers to the same topic, be sure to watch the DjangoCon 2008 Schema Evolution Panel on YouTube.
Also, two new projects on the map: Simplemigrations and Migratory.
One good way to do this is via fixtures, particularly the initial_data fixtures.
A fixture is a collection of files that contain the serialized contents of the database. So it's like having a backup of the database but as it's something Django is aware of it's easier to use and will have additional benefits when you come to do things like unit testing.
You can create a fixture from the data currently in your DB using django-admin.py dumpdata. By default the data is in JSON format, but other options such as XML are available. A good place to store fixtures is a fixtures sub-directory of your application directories.
You can load a fixture using django-admin.py loaddata but, more significantly, if your fixture has a name like initial_data.json it will be automatically loaded when you do a syncdb, saving you the trouble of importing it yourself.
Another benefit is that when you run manage.py test to run your Unit Tests the temporary test database will also have the Initial Data Fixture loaded.
Of course, this will work when you're adding attributes to models and columns to the DB. If you drop a column from the database you'll need to update your fixture to remove the data for that column, which might not be straightforward.
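For JSON fixtures, removing a dropped column's data can be scripted. A small sketch in Python, relying on the usual fixture layout (a list of records with "model", "pk" and "fields" keys); the function name and the model/field names in the usage are hypothetical.

```python
import json


def drop_field(fixture_text, model, field):
    """Return fixture JSON with `field` removed from every record of `model`.

    Records of other models are left untouched, as are records of `model`
    that never had the field.
    """
    records = json.loads(fixture_text)
    for record in records:
        if record.get("model") == model:
            record["fields"].pop(field, None)
    return json.dumps(records, indent=2)


if __name__ == "__main__":
    fixture = '[{"model": "game.player", "pk": 1, "fields": {"name": "Ann", "score": 3}}]'
    print(drop_field(fixture, "game.player", "score"))
```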
This works best when doing lots of little database changes during development. For updating production DBs a manually generated SQL script can often work best.
I've been using django-evolution. Caveats include:
Its automatic suggestions have been uniformly rotten; and
Its fingerprint function returns different values for the same database on different platforms.
That said, I find the custom schema_evolution.py approach handy. To work around the fingerprint problem, I suggest code like:
BEFORE = 'fv1:-436177719'  # first fingerprint
BEFORE64 = 'fv1:-108578349625146375'  # same, but on 64-bit Linux
AFTER = 'fv1:-2132605944'
AFTER64 = 'fv1:-3559032165562222486'
fingerprints = [
    BEFORE, AFTER,
    BEFORE64, AFTER64,
]
CHANGESQL = """
/* put your SQL code to make the changes here */
"""
evolutions = [
    ((BEFORE, AFTER), CHANGESQL),
    ((BEFORE64, AFTER64), CHANGESQL)
]
If I had more fingerprints and changes, I'd re-factor it. Until then, making it cleaner would be stealing development time from something else.
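For what it's worth, one possible shape for that re-factoring is sketched below. It only rebuilds the same `fingerprints` and `evolutions` values as the code above from a single list of (before, after) pairs; the `PAIRS` name is my own invention, not part of django-evolution.

```python
CHANGESQL = """
/* put your SQL code to make the changes here */
"""

# One (before, after) fingerprint pair per platform.
PAIRS = [
    ('fv1:-436177719', 'fv1:-2132605944'),                          # first platform
    ('fv1:-108578349625146375', 'fv1:-3559032165562222486'),        # 64-bit Linux
]

# Flatten the pairs into the list of all known fingerprints.
fingerprints = [fingerprint for pair in PAIRS for fingerprint in pair]

# Every platform applies the same SQL for its own before/after pair.
evolutions = [(pair, CHANGESQL) for pair in PAIRS]
```

Supporting another platform then means adding one line to `PAIRS`.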
EDIT: Given that I'm manually constructing my changes anyway, I'll try dmigrations next time.
django-command-extensions is a Django library that adds some extra commands to manage.py. One of them is sqldiff, which should give you the SQL needed to update the database to your new model. It is, however, listed as 'very experimental'.
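To illustrate the general idea of a schema diff (this is not how sqldiff is implemented, just a sketch of the concept), the Python below compares the columns a model wants with the columns a table has and emits the missing ALTER TABLE statements. It uses SQLite as a stand-in, handles added columns only, and the function and table names are hypothetical.

```python
import sqlite3


def missing_column_sql(conn, table, wanted):
    """Return ALTER TABLE statements for columns in `wanted` that `table` lacks.

    `wanted` maps column name -> SQL type, e.g. {"score": "INTEGER"}.
    `table` must be a trusted, plain identifier: it is interpolated as-is.
    """
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    return [
        f"ALTER TABLE {table} ADD COLUMN {name} {sql_type};"
        for name, sql_type in wanted.items()
        if name not in existing
    ]
```

Dropped or renamed columns are exactly the ambiguous cases the documentation warns about, so a sketch like this deliberately leaves them to a human.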
So far in my company we have used the manual approach. What works best for you depends very much on your development style.
We generally don't have many schema changes in production systems, and we have somewhat formalized rollouts from development to production servers. Whenever we roll out (10-20 times a year) we do a full diff of the current and the upcoming production branch, reviewing all the code and noting what has to be changed on the production server. The required changes might be additional dependencies, changes to the settings file and changes to the database.
This works very well for us. Having it all automated is a nice vision but too difficult for us. Maybe we could manage migrations, but we would still need to handle additional library, server and other dependencies.
Django 1.7 (currently in development) is adding native support for schema migration with manage.py migrate and manage.py makemigrations (migrate deprecates syncdb).

Resources