What is the ClearCase equivalent of git's no-fast-forward (--no-ff) switch? Or how can I achieve the same functionality?
BACKGROUND
The situation is we are running multiple release branches and we need to be able to pull a feature or defect fix out of the release branch if required. Currently (and I'm not the one managing ClearCase) all defect work is checked directly into the release branch, so backing out changes is time-consuming and potentially error-prone.
When using git with --no-ff, I can back out a feature or defect very quickly with minimal chance of causing an issue.
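Concretely, that back-out looks like this in git (branch name and commit reference are placeholders):
git merge --no-ff featureX        # forces a merge commit even when a fast-forward is possible
git revert -m 1 <merge-commit>    # backs out the whole feature in one new commit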
In ClearCase, you would instead cancel an activity (if you are using ClearCase UCM), or cancel the merges for a set of files with negative merges.
But there is no notion of "fast-forward": there is no HEAD to move, only versions (file by file) to merge. So if you know the merged versions, you can create new versions which cancel them (that is what the negative merge does).
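As a sketch (file name, branch, and version number below are placeholders), cancelling a single merged version looks like:
cleartool checkout -nc foo.c
cleartool merge -to foo.c -delete -version \main\release_1\7
cleartool checkin -c "negative merge: cancel version 7" foo.c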
There is no fast-forward, but there is also no HEAD in ClearCase in the sense of git. The workflow is different: you always check in your code as a new version (a commit, in git terminology), so it is as if you are always doing a merge commit.
A bit complex to describe, but I'll do my best. Basically we're using the Git workflow, meaning we have the following branches:
production, which is the live branch. Everything in production is running in the live web environment.
integration, in which all new functionality is integrated. This branch is merged to production every week.
one or more feature branches, in which developers or development teams develop new functionality. After this is done, developers merge their feature branch to integration.
So, nothing really complex here. But since our application is a web application running against a MySQL database, new functionality often requires changes to the database schema. To automate this, we're using dbdeploy, which lets us create numbered alter scripts, e.g. 00001.sql, 00002.sql, etc. Upon merging to the integration branch, dbdeploy checks which alter scripts have a higher number than the last one executed on that specific database, and executes those.
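To illustrate the mechanism (a rough sketch of the idea, not dbdeploy's actual implementation; the changelog table and directory names are assumptions):
# run every alter script numbered higher than the last one applied to this database
last=$(mysql -N -e "SELECT MAX(change_number) FROM changelog" mydb)
for f in alters/*.sql; do
    n=$(basename "$f" .sql)
    if [ "$((10#$n))" -gt "$last" ]; then mysql mydb < "$f"; fi
done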
Now assume the following.
- integration has alter scripts up until 00200.sql. All of these are executed on the integration database.
- developer John has a feature branch featureX, which was created when integration still had 00199.sql as the highest alter script.
John creates 00200.sql because of some required db schema changes.
Now, at some point John will merge his modifications back into the integration branch. He will get a merge conflict and will see that a 00200.sql already exists in integration. This means he needs to open the conflicting file, extract his contents, restore the file to the version in integration, and put his own contents in a new file.
Now, since we're working with ten developers, we get this situation daily. And while we do understand the reasons behind this, it's sometimes very cumbersome. John renames his script, does a merge commit to integration, and pushes the changes upstream, only to see that somebody else has already created a 00201.sql, requiring John to go through the whole process again.
Surely there must be more teams using the Git workflow and using a database change management tool for automating database schema changes?
So, in short, my questions are:
How do you automate database schema changes when working on different feature branches that operate on different instances of the same db?
How do you prevent these constant merge conflicts while still keeping a fixed execution order for the alter scripts? E.g. 00199.sql must be executed before 00200.sql, because 00200.sql might depend on something done in 00199.sql.
Any other tips are most welcome, of course.
Rails used to do this, with exactly the problems you describe. It changed to the following scheme: the files (Rails calls them migrations) are labelled with a UTC timestamp of when the file was created, e.g.
20140723065701_add_foo_to_bar
(The second part of the name doesn't contribute to the ordering).
Rails records the timestamps of all the migrations that have been run. When you ask it to run pending migrations it selects all the migration files whose timestamp isn't in the list of already run migrations and runs them in numerical order.
You'll no longer get merge conflicts unless two people create one at exactly the same point in time.
Files still get executed in the order you wrote them, but possibly interleaved with someone else's work. In theory you can still have problems, e.g. developer A decides to rename a table that I had decided to add a column to. But that is much less common than two developers both making any change at all to the db, and you would have problems even without considering the schema changes: presumably I have just written code that queries a no-longer-existent table. At some point, developers working on related things will have to talk to each other!
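A minimal sketch of that scheme outside Rails (the file layout and applied-migrations list are assumptions, not Rails' actual bookkeeping):
# name a new migration with a UTC timestamp
touch "db/migrate/$(date -u +%Y%m%d%H%M%S)_add_foo_to_bar.sql"
# pending = timestamps present on disk but not yet recorded as applied
comm -23 <(ls db/migrate | cut -d_ -f1 | sort) <(sort applied.txt)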
A few suggestions:
1 - have a look at Liquibase: each version gets a changelog file that references the changes that need to happen, and the change files can be named with a meaningful string rather than a number.
2 - have a central location for reserving the next available number, so each developer takes the next free one.
I've used Liquibase in the past, pretty successfully, and we didn't have the problem you describe.
As Frederick Cheung suggested, use timestamps rather than a serial number. Applying schema changes by order of datestamp should work, because schema changes can only depend on changes of a prior date.
In addition, include the name of the developer in the name of the alter script. This will prevent merge conflicts 100%.
Your merge hook should just look for newly added alter scripts (present in the merged branch but not in the upstream branch) and execute them by order of timestamp.
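A sketch of such a hook (the paths, database name, and use of ORIG_HEAD are assumptions about your setup):
# run the alter scripts introduced by the merge, oldest timestamp first
git diff --name-only --diff-filter=A ORIG_HEAD HEAD -- db/alters/ \
  | sort \
  | while read -r f; do mysql mydb < "$f"; done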
I've used two different approaches to overcome your problem in the past.
The first is to use an ORM which can handle the schema updates.
The other approach is to create a script which incrementally builds the database schema. This way, if a developer needs an additional row in a table, he should add the appropriate SQL statement after the table is created. Likewise, if he needs a new table, he should add the SQL statement for that. Then merging becomes a question of making sure things happen in the correct order. This is basically what the database update process in an ORM does. Such a script needs to be coded very defensively, and each statement should check whether its prerequisites exist.
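For example, a defensive step in such a script could consult the information schema before altering anything (database, table, and column names are placeholders):
# add the column only if it does not exist yet
exists=$(mysql -N -e "SELECT COUNT(*) FROM information_schema.COLUMNS \
  WHERE TABLE_SCHEMA='mydb' AND TABLE_NAME='bar' AND COLUMN_NAME='foo'")
if [ "$exists" -eq 0 ]; then
    mysql mydb -e "ALTER TABLE bar ADD COLUMN foo INT"
fi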
For the dbvc command-line tool, I use git log to determine the order of the update scripts.
# list the update scripts in the order their commits added them (oldest first)
git log --no-merges --pretty="format:" --name-status dev/db/updates/ | \
grep '^A' | awk '{print $2}' | tac
In this case the order of your commits will determine the sequence in which the updates are run, which is most likely what you want.
If you run git merge B, the updates from master will be run first and then those from B.
If you run git rebase B, the updates from B will be run first and then those from master.
We use ClearCase as our version control system.
In our system we sometimes make releases without some developers' commits because of time constraints.
For example, I made changes in six classes, but another user also changed all or some of them, and I have to deliver my code without his changes. So I compare my files with their previous versions so that I can revert his changes, but it's a slow and boring process.
Is there another way to do that? Maybe an extension or a script?
The only way to automate that process is through:
a subtractive merge or negative merge (as described in this IBM article):
cleartool merge -to filename -delete -ver \main\branch\version_number
cset.pl, which can take all the checkins of a UCM activity and cancel them.
See "Clearcase: how to rollback all changes on specific branch?".
But this is for UCM only (which might not be your case).
In both cases, the idea is to create a new version which cancels the version of your other developer.
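To make finding those versions less tedious, something like this should list the other developer's versions on the branch (branch type and login below are placeholders):
cleartool find . -version "brtype(release_1) && created_by(jdoe)" -print
Each version it lists can then be cancelled with a subtractive merge as above.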
We have delivered a set of packages to the testing team, and they completed testing.
In one of the packages they reported a defect; it was fixed and delivered to the integration stream. But the deliver operation asked for a rebase, so the delivery contained a rebase activity.
In that rebase activity, due to merging issues, one of the files was modified in a package which had no defect.
As testing was already completed and the change in the delivery is not required, our team wants to delete the latest version of that file (which was added inadvertently) in the integration stream.
If I delete that version of the file, will it have any ill effects (for example, while doing a rebase again)?
Deleting a version is almost never a good idea.
If that version has any hyperlink: don't delete it!
(You can see it by looking at its version tree: look for any red arrow coming to or going from that version)
If that version has any label: don't delete it.
That label is probably the result of a baseline, and that would break the integrity of said baseline.
I would recommend checking out that file and replacing its content with the right one, before checking it back in to ClearCase.
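In a dynamic view that can be as simple as (Unix-style version paths; branch name and version number are placeholders):
cleartool checkout -nc file.sql
cp file.sql@@/main/integration/5 file.sql
cleartool checkin -c "restore content of version 5" file.sql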
See also:
"How do I undo a checkin in ClearCase remote client": rmver is dangerous
"How do I roll back a file checked in to Clearcase?": a subtractive merge is preferable to restore the right content in a new version
WARNING: LONG QUESTION.
[QUESTION]
If the strategy is to have a branch per database, as described in the problem below, where the scripts are version controlled, how do you manage the data migration issues when trying to consolidate to fewer branches?
Is it just a cost you incur as part of data migration?
Essentially transform scripts will have to be created at the time of migration.
Is there a better way?
Can we have both issues resolved at the same time?
What is the best practice?
[BACKGROUND]
At my work place we have a product which has 3 branches, with Mainline having the "LATEST AND GREATEST" changes, which are not necessarily ready for release.
Version B (names have been changed to protect the guilty)
Version A (names have been changed to protect the guilty)
Mainline
Because of these branches there are effectively 3 versions of the database.
Code version control is fairly easy; however, database version control seems difficult.
Having read Do you use source control for your database items?
it seems the best way is to export all the create scripts for each object/table.
NOTE: How you manage it, in one big script or multiple scripts or a hybrid, is your preference according to the article.
I agree with this and have inquired as to why it's not done.
Currently the DBAs refuse to branch the scripts.
Aside from laziness as an excuse, the reason given is to save time on data migration.
Effectively the database changes are forcibly maintained across all versions.
All the scripts are version controlled and maintained only in mainline.
Version A and Version B each have their own special file that states which change scripts to run on their respective branch. The problem arises when a change script is applied, for instance, to Version A, but Version B only requires part of its changes. It is up to the developer to inform the DBAs to update the file which indicates which patches apply to each branch. For change scripts which do too much, manual intervention is needed to apply only part of the change script.
To update a database on Version A, all patches are extracted according to Version A's which-patches-to-apply file.
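As I understand it, an update then amounts to something like this (the file and database names are my guesses at the setup):
# apply, in order, only the patches listed for Version A
while read -r p; do
    mysql version_a_db < "patches/$p"
done < version_a_patches.txt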
[SCENARIO]
The 3 versions above exist.
Database changes occur to Version A.
Branch consolidation where the code is merged from Version B to A so that Version B can be removed.
The same needs to happen with the database.
Hope this makes sense.
Take a look at Chapter 8 in Eric Sink's Source Control HOW TO. It's a great resource for understanding the ins and outs of source control.