I have a project where I need to perform a number of operations on a dynamic view. If any of those operations fails, or some error comes up in the program, I need to be able to back out the commits.
The straightforward way seems to be to simply put the commands into a queue and then, when my program finishes processing, execute the queue. However, I am concerned about some exceptional event interrupting the commits and leaving an inconsistent dataset on the server.
Or, in other words, I'm looking for a way to create a svn-style 'changeset' in Clearcase dynamic views. The script language I'm using is Perl, if that matters.
Ideas?
Since the atomicity of operations in ClearCase is at the file level, there is no strict equivalent of an svn changeset (i.e. a "revision").
The closest thing to a changeset in ClearCase is the notion of an activity (in UCM), or a label set on a collection of files (a UCM baseline is actually closer, since it represents labels you cannot move, set on a pre-defined set of files -- a UCM component).
Now, UCM or not, I would recommend:
locking the branch on which you will make checkins
(that way, the vob is still accessible, and nobody is trying to add other versions on that particular branch during your "atomic" operation)
do your checkins
unlock the branch
In case of trouble, while the branch is still locked, you can 'ct rmver' the versions added. (Note: use with care: an rmver cannot be undone.)
Note 1: if you are not working in UCM, you will have to record all checked-in versions in order to be able to rmver them (the sketch below does exactly that).
Note 2: when I said "lock the branch", I meant of course "lock for everyone except you" (-nusers yourLogin). That way, only you can make checkins. That applies to all files at LATEST on the branch on which you are working (main or another).
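For illustration, here is a minimal Perl sketch of that lock / checkin / rollback cycle. The branch, user, and file names are placeholders, the files are assumed to be already checked out, and everything runs inside the vob:

#!/usr/bin/perl
use strict;
use warnings;

# Placeholders: adapt the branch, user, and file list to your setup.
my $branch = 'myBranch';
my $user   = 'yourLogin';                  # as in Note 2
my @files  = ('src/foo.c', 'src/bar.c');   # assumed already checked out
my @checked_in;                            # versions created, for rollback

sub ct { system('cleartool', @_) == 0 or die "cleartool @_ failed: $?" }

# Lock the branch type for everyone except you.
ct('lock', '-nusers', $user, "brtype:$branch");

eval {
    for my $f (@files) {
        ct('checkin', '-nc', $f);
        # Record the extended pathname of the version just created.
        chomp(my $ver = `cleartool describe -fmt "%En\@\@%Vn" $f`);
        push @checked_in, $ver;
    }
};
if ($@) {
    warn "commit failed, rolling back: $@";
    # rmver cannot be undone -- this is the "use with care" part.
    ct('rmver', '-force', $_) for reverse @checked_in;
}

ct('unlock', "brtype:$branch");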
The problem with this approach is what the clients (the other users, with their dynamic views at LATEST on the branch) will see during your atomic transaction.
Since those are dynamic views, they will see each file as soon as it is checked in, one by one. That may not be good, especially if there are 200 files and the whole process takes more than a minute.
One solution would be to have those client views set their config spec to the following:
element * .../myBranch/FREEZED_LATEST
element * .../myBranch/LATEST
If you are not doing an atomic changeset commit, the label FREEZED_LATEST does not exist, and all the client views display LATEST, as they should. Any checkin is immediately seen by all.
But during your atomic commit, you could:
first set a label FREEZED_LATEST on all the current files (currently in LATEST, that is)
That means, all the clients will only see those specific versions during the atomic commit
do your process (all the way, or roll back: either way, the branch is locked, and the config spec of the clients still shows the same "frozen" content)
delete the label FREEZED_LATEST (all the clients then see the new LATEST resulting from your atomic operation, and can make new versions with checkouts of their own)
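A possible Perl sketch of that label dance (the vob path is made up, and do_atomic_commit() stands for the checkin loop of the previous sketch):

#!/usr/bin/perl
use strict;
use warnings;

my $vob = '/vobs/myvob';   # placeholder vob tag

sub ct { system('cleartool', @_) == 0 or die "cleartool @_ failed: $?" }
sub do_atomic_commit { ... }   # your lock/checkin/rollback cycle from above

# Freeze what the clients see: label everything currently at LATEST.
ct('mklbtype', '-nc', "FREEZED_LATEST\@$vob");
ct('mklabel', '-recurse', 'FREEZED_LATEST', $vob);

eval { do_atomic_commit() };
warn "commit failed (clients still see the frozen labels): $@" if $@;

# Remove the label type and all its instances: the client config specs
# fall through to their LATEST rule again.
ct('rmtype', '-rmall', '-force', "lbtype:FREEZED_LATEST\@$vob");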
With v7.1.1, ClearCase supports atomic commits. You will be able to treat a set of files as one unit and check them in or roll them back based on given criteria. For more info, see
https://publib.boulder.ibm.com/infocenter/cchelp/v7r1m0/index.jsp?topic=/com.ibm.rational.clearcase.relnotes.doc/topics/c_cc_relnotes_features.htm
Lock out all other users.
Do a backup of your server.
Do your commits.
If something goes horribly wrong, restore ClearCase from the backup.
I haven't used ClearCase in years, so here are a few stray and naive thoughts.
Look ahead and determine if files are out of sync.
I would lock all the files you're about to check in before checking them in, and if you fail to lock one, abort the whole mess, with a useful message.
Can you "delete" a check in? Or revert, so HEAD looks at a previous version? Define your undo of a check in.
Can you make a temporary branch, check in, then merge/rebase (my terminology is loose here)?
That way your rollback is to kill the branch. Though I remember coworkers cursing ClearCase because of its branching.
In general, queuing actions is great, but use the queue to identify potential problems before they occur. In addition, define your actions and their UNDO criteria, so if they want to do something that isn't pseudo-atomic, you can warn them, "This might get messy".
As you know, sometimes a UCM activity depends on another activity, and sometimes other activities depend on that activity. I'm wondering how I can get this information easily.
Assuming my input is an activity ID - how do I get these two outputs easily?
Thank you
The activity dependency is determined in the context of a deliver or rebase.
See "About activity dependencies in the deliver operation
So maybe the easiest way to see what activities are involved is to do a deliver -preview.
But beside that, there is no easy way to list those dependencies because they involve:
version dependency (the same file has versions in both activities, making one depend on the other)
timelines (see for instance "ClearCase: Making new baseline with old baseline activities"): a baseline made by a deliver/rebase will link together (that is the second form of dependency) all the activities in a given stream, even if they don't have any file in common.
REM output is a space-delimited list
view-context> cleartool lsact -fmt "%[contrib_acts]p" activity:activityID@\pvob
act1 act2
If you make (mkbl) and compare (diffbl -act) baselines, you can obtain the same information as well as recursive delivery information.
From the ClearCase GUIs (Project Explorer and ClearCase Explorer -> My Activities), you can also right-click on an activity and select "Show Contributing Activities".
This answer only addresses one direction for activities. Using baselines with other %[xxx]p format specifiers for baselines should allow forward and reverse resolution.
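If you go the baseline route, a small sketch along these lines (the baseline names and pvob tag are made up) lists the activities that differ between two baselines, including ones brought in by delivers:

#!/usr/bin/perl
use strict;
use warnings;

# Placeholders: replace the baseline names and the pvob tag.
my @acts = `cleartool diffbl -activities baseline:BL_OLD\@/vobs/pvob baseline:BL_NEW\@/vobs/pvob`;
print @acts;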
I'm working on a basic syncing algorithm for a user's notes. I've got most of it figured out, but before I start programming it, I want to run it by here to see if it makes sense. Usually I end up not realizing one huge important thing that someone else easily saw that I couldn't. Here's how it works:
I have a table in my database where I insert objects called SyncOperation. A SyncOperation is a sort of metadata on the nature of what every device needs to perform to be up to date. Say a user has 2 registered devices, firstDevice and secondDevice. firstDevice creates a new note and pushes it to the server. Now, a SyncOperation is created with the note's Id, operation type, and processedDeviceList. I create a SyncOperation with type "NewNote", and I add the originating device ID to that SyncOperation's processedDeviceList. So now secondDevice checks in to the server to see if it needs to make any updates. It makes a query to get all SyncOperations where secondDeviceId is not in the processedDeviceList. It finds out its type is NewNote, so it gets the new note and adds itself to the processedDeviceList. Now this device is in sync.
When I delete a note, I find the already created SyncOperation in the table with type "NewNote". I change the type to Delete, remove all devices from processedDevicesList except for the device that deleted the note. So now when new devices call in to see what they need to update, since their deviceId is not in the processedList, they'll have to process that SyncOperation, which tells their device to delete that respective note.
And that's generally how it'd work. Is my solution too complicated? Can it be simplified? Can anyone think of a situation where this wouldn't work? Will this be inefficient on a large scale?
Sounds very complicated - the central database shouldn't be responsible for determining which devices have received which updates. Here's how I'd do it:
The database keeps a table of SyncOperations for each change. Each SyncOperation has a change_id numbered in ascending order (that is, change_id INTEGER PRIMARY KEY AUTOINCREMENT).
Each device keeps a current_change_id number representing what change it last saw.
When a device wants to update, it does SELECT * FROM SyncOperations WHERE change_id > current_change_id. This gets it the list of all changes it needs to be up-to-date. Apply each of them in chronological order.
This has the charming feature that, if you wanted to, you could initialise a new device simply by creating a new client with current_change_id = 0. Then it would pull in all updates.
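For what it's worth, here is a minimal sketch of that scheme using Perl DBI with SQLite (the table and column names are illustrative, not prescriptive):

#!/usr/bin/perl
use strict;
use warnings;
use DBI;   # assumes the DBD::SQLite driver is installed

my $dbh = DBI->connect('dbi:SQLite:dbname=sync.db', '', '', { RaiseError => 1 });

$dbh->do(q{
    CREATE TABLE IF NOT EXISTS SyncOperations (
        change_id INTEGER PRIMARY KEY AUTOINCREMENT,
        note_id   INTEGER NOT NULL,
        op_type   TEXT    NOT NULL   -- 'NewNote', 'Edit', 'Delete', ...
    )
});

# Server side: every mutation just appends a row.
sub record_change {
    my ($note_id, $op_type) = @_;
    $dbh->do('INSERT INTO SyncOperations (note_id, op_type) VALUES (?, ?)',
             undef, $note_id, $op_type);
}

# Client side: fetch everything newer than this device's current_change_id.
sub changes_since {
    my ($current_change_id) = @_;
    return $dbh->selectall_arrayref(
        'SELECT change_id, note_id, op_type FROM SyncOperations
         WHERE change_id > ? ORDER BY change_id',
        { Slice => {} }, $current_change_id);
}

record_change(1, 'NewNote');
# A brand-new device starts at 0 and therefore pulls the whole history.
my $pending = changes_since(0);
printf "%d: %s on note %d\n", $_->{change_id}, $_->{op_type}, $_->{note_id}
    for @$pending;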
Note that this won't really work if two users can be doing concurrent edits (which edit "wins"?). You can try and merge edits automatically, or you can raise a notification to the user. If you want some inspiration, look at the operation of the git version control system (or Mercurial, or CVS...) for conflicting edits.
You may want to take a look at SyncML for ideas on how to handle sync operations (http://www.openmobilealliance.org/tech/affiliates/syncml/syncml_sync_protocol_v11_20020215.pdf). SyncML has been around for a while, and as a public standard, has had a fair amount of scrutiny and review. There are also open source implementations (Funambol comes to mind) that can also provide some coding clues. You don't have to use the whole spec, but reading it may give you a few "ahah" moments about syncing data - I know it helped to think through what needs to be done.
Mark
P.S. A later version of the protocol - http://www.openmobilealliance.org/technical/release_program/docs/DS/V1_2_1-20070810-A/OMA-TS-DS_Protocol-V1_2_1-20070810-A.pdf
I have seen the basic idea of keeping track of operations in a database elsewhere, so I dare say it can be made to work. You may wish to think about what should happen if different devices are in use at much the same time, and end up submitting conflicting changes - e.g. two different attempts to edit the same note. This may surface as a change to the user interface, to allow them to intervene to resolve such conflicts manually.
What are the standard guidelines for activity creation?
In our team, all team members create activities on their own; they are not assigned by the team leader. Is it possible for the team leader to create an activity and then assign it to members?
How to achieve it?
Two ways you could go.
ClearCase (stand alone):
A trigger can enforce the activity or the naming of the activity, but this requires initial development of the trigger and script, and also their maintenance. You may also go part way, in which you only enforce the prefix to be ENH_*, DEF_*, or CR_*. You can even check that the full activity name is in a list of strings you specify... limited only by your need.
Alternative (ClearCase with integration):
What you may be looking for is a higher-level order. I had created such a system with ClearCase integrated with ClearQuest. Developers are assigned "WorkRequests" (e.g. Defects / Enhancements). These can be directly assigned, tracked, and added to builds.
In essence, you use the record ID as the node that holds all the activities checked in by a developer. (You can report/slice/dice with activities and checkin refs as you want.)
In this model you control the assigned record, not the activity (but they can be the same! i.e. raise records with known activities in advance and assign them).
Regards
Jim2
No, the usual practice is that one selects an activity he/she created when checking in new versions.
The "setactivity" doesn't list any restriction in term of Identity when selecting the activity to use.
An activity is here to group some tightly linked changes together, changes being new versions on files or directories for a given component on a given stream.
There is no real "standard guidelines" except to keep linked changes together.
You could prevent the creation of activities (except for a project manager) with a pre-op trigger, though.
I suppose another trigger might be able to enforce the selection of an activity only by a specific resource, emulating that way the "assignment" process.
But I have rarely seen that implemented (and only when used with a link to ClearQuest).
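For illustration, a sketch of such a trigger script in Perl. This assumes a pre-op trigger can be attached to activity creation, as hinted above; the trigger name, script path, and 'pm_login' user are all made up:

#!/usr/bin/perl
# Hypothetical pre-op trigger: only the project manager may create activities.
# Installed once, by an admin, with something like:
#   cleartool mktrtype -ucmobject -all -preop mkactivity \
#       -execunix '/usr/bin/perl /triggers/act_guard.pl' ACT_GUARD
use strict;
use warnings;

my $user = $ENV{CLEARCASE_USER} || '';   # set by ClearCase when the trigger fires
exit 0 if $user eq 'pm_login';           # made-up login of the project manager
print STDERR "Only the project manager may create activities.\n";
exit 1;                                  # a nonzero exit blocks the operation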
We use clearcase UCM with 15 vobs.
We use cleartool lshistory -all -since "time" -nco vob1/ vob2/src/ vob3/tests/ ...many more... to detect changes since last time. This gives correct results, but takes too long on streams with a lot of history.
Is there a way to return early if there is 'any change' on a stream, without detailing that change? One option is to limit the lshistory to individual vobs, but that does not look elegant. I guess there is a better way to do this?
MultiSite is of course not an option, due to the huge license costs.
You cannot make one vob MultiSite without also making its adminvob/pvob MultiSite, which in turn means other vobs associated with said adminvob, while not always multisited themselves, would need MultiSite licenses as well!
Depending on the level of information you are after, a simple and regular update on a snapshot view is enough to detect any changes, with the results in the update.20xx-yy-zzT123456-0x.updt file.
You can set up a cron job in charge of:
updating the snapshot UCM views (set on the streams you want to monitor), instead of running a lshistory after any modification on any stream
concatenating the results of the various updt files.
Whenever you need to check for changes, read/parse the concatenated result made by your job (and have it reset/create new concatenated ones).
This is a bit of scripting work, but for large histories, this will be much more efficient than the slow 'lshistory -all'.
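A possible shape for that cron job, in Perl (the view roots and log path are made up; cleartool update -log names the update record explicitly instead of the timestamped default):

#!/usr/bin/perl
use strict;
use warnings;

# Placeholders: one snapshot view root per stream to monitor.
my @views = ('/views/stream1_snap', '/views/stream2_snap');
my $log   = '/var/log/cc_changes.log';

open my $out, '>>', $log or die "cannot open $log: $!";
for my $view (@views) {
    chdir $view or die "cannot chdir to $view: $!";
    system('cleartool', 'update', '-log', 'latest.updt', '.') == 0
        or warn "update failed in $view: $?";
    if (open my $in, '<', 'latest.updt') {
        print {$out} "== $view ==\n", <$in>;
        close $in;
    }
}
close $out;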
The following suggestion should be super fast compared to lshistory, but it does not support a generic "time" reference, only comparison with earlier, manually saved entries. It also depends on MultiSite.
If you only want to check whether there is any (local) change made to each of the individual vobs, you could perhaps use the multitool lsepoch command to compare the epoch numbers with the previous ones.
Edit: since I have no experience with UCM, I did not notice at first, but as noted, this answer only considers changes to the whole vob, not the individual streams the question asks for.
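If MultiSite turns out to be available after all, a rough sketch of that epoch comparison (assuming multitool lsepoch, run in a vob context, prints the epoch table of the current replica; all paths are made up):

#!/usr/bin/perl
use strict;
use warnings;

my $state = '/var/tmp/myvob.epoch';   # where the previous epoch table is kept

chdir '/vobs/myvob' or die "cannot chdir: $!";
my $now = `multitool lsepoch`;

my $before = '';
if (open my $fh, '<', $state) { local $/; $before = <$fh>; }

print "vob has changed since the last check\n" if $now ne $before;

open my $out, '>', $state or die "cannot write $state: $!";
print {$out} $now;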
What is difference between branches and streams in ClearCase?
A branch is a classic versioning way to parallelize the history of versions for a given file: See "When should you branch"
A Stream is not a branch: it is just metadata that records which baselines any view referencing that Stream will see.
When you create a Stream, nothing happens (no branch is created).
But the Stream name will be used when a file is checked out: any view on that Stream will set its config spec so that a branch named after the Stream is created, isolating the development effort in said branch.
(See "How do I create a snapshot view of some project or stream in ClearCase?")
This is why it is important to adequately name a Stream: If I create a Stream named "VonC", you will eventually see (in the version tree for any modified file) a branch named "VonC": what is the purpose of a branch "VonC"?
If I create a Stream named "REL2.2_FIX", you will see branches named "REL2.2_FIX" and will infer that any view referencing that Stream is there to produce fixes on the release 2.2: a much more useful name. (This is why I don't like the "one stream per developer model")
So if you have any writable component, a Stream could be considered as a template for branches:
You declare what you need in a stream (what baseline you want to see)
You create a view on that stream
Any checkout will create a branch named after the Stream.
(And that is why so many UCM users mix or equate "Stream" with "branch")
But if you have only non-writable components in your project, then a Stream is just the list of baselines (labels on components) that you want to see in any view you will create on said Stream.
That becomes a visualization mechanism, useful for testing environment where you only need to access precise versions of a set of components in order to test your system.
In that case, no branches will ever be created, since no checkout will ever be made on any file: the components are declared non-writable in the UCM project.
The other major difference between a Stream and a branch is the organization of Stream in a hierarchy (parent Stream / sub-Streams).
That hierarchy simply doesn't exist for branches: when you have 3 branches A, B, C:
you don't know where to merge from branch A once you have finished your work on it.
any merge you do has the same meaning: A->B, or C->A, or B->C, or ...
With Stream, you would have:
MyProject_Int
|
-- MyProject_Dev
   |
   -- MyProject_Feature1
The hierarchy of Streams is there to:
introduce a possible workflow of merges: you know where you should merge from one Stream, namely to its parent. It is not mandatory, but at least you have a visual way of knowing that:
Feature1, once fully developed, will get back to (be merged into) MyProject_Dev (its parent Stream), and that:
MyProject_Dev, once a stable state is reached, can be merged into its parent Stream MyProject_Int, where integration tests can be conducted while development goes on uninterrupted in MyProject_Dev.
add a meaning to those merges:
merging from a sub-stream to its parent or any other parent stream (for instance, you can merge directly from MyProject_Feature1 to MyProject_Int if you have to) is called a deliver.
merging from a parent Stream (like MyProject_Dev) to an immediate sub-Stream (like MyProject_Feature1) is called a rebase.
Its purpose is to ensure that Feature1 is developed with the latest changes of Dev, in order to make the final deliver as painless as possible: with regular rebases, the common set of code would not have diverged too much between the two parallelized histories of those two branches derived from those two Streams.
Keep in mind that those two UCM operations deliver and rebase are, at their core, no more than simple merges between two branches A and B.
However, because of their names, you know that you don't merge just between any two branches, but between a sub-Stream and a parent Stream (deliver), or between a parent Stream and a sub-Stream (rebase).
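To make those two directions concrete, here is a sketch of the corresponding cleartool calls from Perl (the stream and pvob names are made up):

#!/usr/bin/perl
use strict;
use warnings;

sub ct { system('cleartool', @_) == 0 or die "cleartool @_ failed: $?" }

# Deliver: merge "up", from a view attached to the sub-Stream
# MyProject_Feature1 to its parent Stream (the default deliver target).
ct('deliver', '-stream', 'stream:MyProject_Feature1@/vobs/pvob',
   '-complete', '-force');

# Rebase: merge "down", picking up the recommended baselines of the
# parent Stream into the current view's sub-Stream.
ct('rebase', '-recommended', '-complete', '-force');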