How to merge files between two different VOBs with different branches
You can't, through a regular ClearCase merge.
ClearCase merges are 3-way merges between a common ancestor, a source version and a destination version of the same file.
And by definition, files from different VOBs won't share a common history.
So the only way is a manual merge through a third-party external diff tool (kdiff3, for instance), in order to compare and merge two different trees of files.
The fact that those trees are managed by ClearCase VOBs won't be relevant for that tool.
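As an illustration, here is a minimal Java sketch of launching such a tool on two trees exposed through ClearCase views; the view paths and the kdiff3 executable name are assumptions about your setup, not something ClearCase provides:

    import java.io.IOException;

    public class CrossVobDiff {
        public static void main(String[] args) throws IOException, InterruptedException {
            // Hypothetical paths: two directory trees seen through ClearCase views,
            // one per VOB/branch; adjust to your own view and VOB layout.
            String treeA = "M:/view_branchA/vobA/src";
            String treeB = "M:/view_branchB/vobB/src";

            // Launch the external diff/merge tool on the two trees.
            // ClearCase is not involved at this point: kdiff3 just sees two folders.
            Process p = new ProcessBuilder("kdiff3", treeA, treeB)
                    .inheritIO()
                    .start();
            System.exit(p.waitFor());
        }
    }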
Our application has an MS Access 2010 database (I know.. I would much prefer SQL Server, but that's another topic).
Since MS Access stores its data in a single, mysterious, monolithic binary file rather than in scripts, my team is thinking of creating several extra tables corresponding to different versions of the software and maintaining these versions inside one master database.
I suggest simply placing the binary file under the same source control tool as the software source code. The vast majority of the database content would then be duplicated across versions, but at least it puts the version control tool in charge of the software source and the database simultaneously, in a synced fashion.
The application uses XML files that are exported from the database (doesn't tie into the database directly).
What are the pros and cons of these two approaches?
I'm familiar with version control methods for SQL Server, but MS Access seems cumbersome to manage for applications with lots of branches.
To put it shortly: you are pushing Access toward something it is not intended for.
You do have the commands SaveAsText and LoadFromText, which can export and import most objects as discrete text files. This has been used by Visual SourceSafe to create some sort of source control, but it doesn't work 100% reliably.
Also, you can just as well import and export objects "as is" to another (archive) database, building some kind of version control.
I once worked with a team in a very large corporation that had every imaginable resource from MS at hand, and still we ended up with a simple system of zip files whose names included the date and time.
We had a master accdb file that we pulled as a copy to a local folder, did our assigned work on, and then copied back, leaving a note about which objects were altered. One person had the task of collecting the altered objects and "rebuilding" a new master; at minimum one per day, but often we also created one at the lunch break.
It worked better than you might imagine, because we typically operated in different corners - one with some reports, one with other reports, one with some forms, and one (typically me) with some code modules. Of course, mistakes happened, but as we had the zip files, it was always fast and safe to pull an old copy of an object if in doubt.
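Just to illustrate the mechanics of that archive step (the paths and the naming pattern below are made up for the example), creating such a date-and-time-stamped zip of the master is only a few lines of code:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.time.LocalDateTime;
    import java.time.format.DateTimeFormatter;
    import java.util.zip.ZipEntry;
    import java.util.zip.ZipOutputStream;

    public class ArchiveMaster {
        public static void main(String[] args) throws IOException {
            // Hypothetical locations of the master database and the archive folder.
            Path master = Paths.get("S:/team/master.accdb");
            Path archiveDir = Paths.get("S:/team/archive");

            // Zip name carrying the date and time, e.g. master_2015-06-01_1230.zip
            String stamp = LocalDateTime.now()
                    .format(DateTimeFormatter.ofPattern("yyyy-MM-dd_HHmm"));
            Path zipFile = archiveDir.resolve("master_" + stamp + ".zip");

            try (ZipOutputStream zos = new ZipOutputStream(Files.newOutputStream(zipFile))) {
                zos.putNextEntry(new ZipEntry(master.getFileName().toString()));
                Files.copy(master, zos);   // stream the accdb into the zip entry
                zos.closeEntry();
            }
        }
    }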
I have an app that keeps a database of files located on the user's machine or perhaps on networked volumes that may or may not be online. This database can potentially be several thousand files located in different folders. What is the best way to monitor them to receive notification when a file's name is changed, or it moves or is deleted?
I have used FSEvents before for a single directory but I am guessing that it does not scale well to a few thousand individual files. What about using kqueues?
I might be able to maintain a dynamic list of folders that encompasses all the files with as few folders as possible, but this means reading through the full list and trying to figure out common ancestors, etc.
Thoughts or suggestions?
From Apple's docs:
If you are monitoring a large hierarchy of content, you should use file system events instead, however, because kernel queues are somewhat more complex than kernel events, and can be more resource intensive because of the additional user-kernel communication involved.
https://developer.apple.com/library/mac/documentation/Darwin/Conceptual/FSEvents_ProgGuide/KernelQueues/KernelQueues.html#//apple_ref/doc/uid/TP40005289-CH5-SW2
I am working on a Java-based backup client that scans for files on the file system and populates a SQLite database with the directories and file names that it finds to back up. Would it make sense to use Neo4j instead of SQLite? Will it be more performant and easier to use for this application? I was thinking that because a filesystem is a tree (or a graph, if you consider symbolic links), a graph database may be suitable. The SQLite database schema defines only 2 tables, one for directories (full path and other info) and one for files (name only, with a foreign key to the containing directory in the directory table), so it's relatively simple.
The application needs to index many millions of files so the solution needs to be fast.
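For reference, a minimal sketch of that two-table SQLite schema set up through JDBC (the table and column names here are my guesses at your layout, not anything prescribed):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;
    import java.sql.Statement;

    public class BackupIndexSchema {
        public static void main(String[] args) throws SQLException {
            // Requires the sqlite-jdbc driver on the classpath.
            try (Connection con = DriverManager.getConnection("jdbc:sqlite:backup-index.db");
                 Statement st = con.createStatement()) {

                // One row per directory, storing its full path once.
                st.execute("CREATE TABLE IF NOT EXISTS directory ("
                         + " id INTEGER PRIMARY KEY,"
                         + " full_path TEXT NOT NULL UNIQUE)");

                // One row per file, holding only the name plus a foreign key
                // to the containing directory.
                st.execute("CREATE TABLE IF NOT EXISTS file ("
                         + " id INTEGER PRIMARY KEY,"
                         + " name TEXT NOT NULL,"
                         + " directory_id INTEGER NOT NULL REFERENCES directory(id))");
            }
        }
    }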
As long as you can perform the DB operations essentially using string matching on the stored file system paths, using a relational database makes sense. The moment the data model gets more complex and you actually can't do your queries with string matching but need to traverse a graph, using a graph database will make this much easier.
As I understand it, one of the earliest uses of Neo4j was to do exactly this, as part of the CMS system that Neo4j originated from.
Lucene, the indexing backend for Neo4j, will allow you to build any indexes you might need.
You should read up on that and ask them directly.
I am considering a similar solution to index a data store on a filesystem. The remark about the queries above is right.
Examples of worst case queries:
For sqlite:
if you have a large number of subdirectories somewhere deep in the filesystem, your space usage in sqlite will not be optimal: you save the full path for each small subdirectory (think of a code project, for instance)
if you need to move a directory, the closer it is to the root, the more work you will have to do, so that will not be O(1) as it would be with neo4j
can you do multithreading on sqlite to scale?
For neo4j:
each time you search for a full path, you need to split it into components and build a cypher query with all the elements of the path (see the sketch after this list)
the data model will probably be more complex than 2 tables: all the different objects, then the dir-in-dir relationship, the file-in-dir relationship and the symlink relationship
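To make the first neo4j point concrete, here is a rough Java sketch of turning a full path into such a Cypher query; the :Root label and :CONTAINS relationship type are assumptions about the data model:

    public class PathToCypher {
        // Builds a Cypher MATCH that walks one :CONTAINS relationship per path component.
        static String toCypher(String fullPath) {
            String[] parts = fullPath.replaceAll("^/|/$", "").split("/");
            StringBuilder q = new StringBuilder("MATCH (n0:Root)");
            for (int i = 0; i < parts.length; i++) {
                q.append("-[:CONTAINS]->(n").append(i + 1)
                 .append(" {name: '").append(parts[i]).append("'})");
            }
            q.append(" RETURN n").append(parts.length);
            return q.toString();
        }

        public static void main(String[] args) {
            // Prints e.g.:
            // MATCH (n0:Root)-[:CONTAINS]->(n1 {name: 'home'})-[:CONTAINS]->(n2 {name: 'user'})
            //       -[:CONTAINS]->(n3 {name: 'project'}) RETURN n3
            System.out.println(toCypher("/home/user/project"));
        }
    }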
Greetings, hj
Which one is better? UCM or base ClearCase?
For parallel development, do we need UCM? Is manual branching error-prone on base ClearCase?
Is serial development (working on the same branch) not meaningful?
One is not better than the other, UCM represents a different set of best practices that you can choose to apply on top of base ClearCase.
UCM is great at defining a coherent set of files (the UCM "component") that will be:
branched in the same branch
labelled ("baseline") as a all (all the files receive an immutable label)
referenced by other streams (list of baselines)
Parallel development can benefit from UCM because of the streams you can set up in advance in order to define your merge workflow. You don't impose anything, but if you finish a development effort on a sub-stream, the natural merge to do is a "deliver" to the parent stream. (As opposed to base ClearCase, where there is no "hierarchical organization" for branches: once you finish a task in a branch, you can merge your work to any other branch; there is nothing to remind you which branch would be a natural candidate for your merge.)
But the other advantage is the definition of a configuration, i.e. the exact list of baselines (labels) you need to get in order to "work" (compile, or develop a new feature, or deploy, or refactor, or...).
Depending on the number of components you have to deal with, you will then adopt:
a system approach: every component is modifiable
a component approach: one component is modifiable, the others are non-modifiable: you only develop in one set of files, and use the others at a fixed label for your compilations.
May I just add...
It really depends on the size and complexity of the development team working on the same project. For example, we have a large dev team consisting of 100s of devs from all over the world, and this team really benefits from all the features that UCM provides (as mentioned by master VonC above).
On the other hand, most of the teams in my organization are around 10 people, all co-located at one site, and these teams really do not want to mess around with deliveries and merges, so they choose to simply use base CC with a basic branching strategy, e.g. one Integration/Release branch, plus a dev branch for each release, or personal dev branches. For smaller teams, we usually recommend base CC because it is easier to manage.
Hope this helps.
At my company, we save each database object (stored proc, view, etc) as an individual SQL file, and place them under source control that way.
Up until now, we've had a very flat storage model in our versioned file structure:
DatabaseProject
    Functions
        (all functions here; no further nesting)
    StoredProcedures
        (all stored procs in here; no further nesting)
    Views
        (ditto)
For a big new project, another idea has occurred to me: why not store these files by subject instead of in these prefab flat lists?
For example:
DatabaseProject
    Reports
        (individual stored procs, views, etc.)
        SpecificReport
            (more objects here, further nesting as necessary)
    SpecificApplication
        (all types of DB objects, with arbitrarily deep nesting)
    et cetera....
The obvious flaw is that this folder structure doesn't impose any kind of namespace hierarchy on the database objects; it's for organization only. Thus, it would be very easy to introduce objects with duplicate names. You'd need some kind of build tool to survey the database project and die on naming conflicts.
What I'd like to know is: has anyone tried this method of organizing SQL files by application subject in their versioned file structure? Was it worth it? Did you create a build tool that would police the project as I have described?
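For what it's worth, a minimal sketch of such a check, which walks the project folder and fails the build on duplicate object file names (the project root and the .sql extension are assumptions about the layout):

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.stream.Collectors;
    import java.util.stream.Stream;

    public class NamingConflictCheck {
        public static void main(String[] args) throws IOException {
            // Hypothetical root of the versioned database project.
            Path root = Paths.get("DatabaseProject");
            Map<String, Path> seen = new HashMap<>();
            boolean conflict = false;

            try (Stream<Path> walk = Files.walk(root)) {
                List<Path> sqlFiles = walk
                        .filter(p -> p.toString().toLowerCase().endsWith(".sql"))
                        .collect(Collectors.toList());
                for (Path p : sqlFiles) {
                    // Object name taken from the file name, compared case-insensitively.
                    String name = p.getFileName().toString().toLowerCase();
                    Path first = seen.putIfAbsent(name, p);
                    if (first != null) {
                        System.err.println("Naming conflict: " + p + " vs " + first);
                        conflict = true;
                    }
                }
            }
            if (conflict) System.exit(1);   // "die" on naming conflicts
        }
    }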
I like to have my SQL scripts organized by topic, rather than by name. As a rule, I even group related items into single files. The main advantages of this are:
You do not clutter your filesystem/IDE with files (many of them being a few lines long).
The overall database structure shows more directly.
On the other hand, it may be more difficult to find the source code related to a specific object...
As for duplicate names: they can never happen, because you obviously have automated scripts to build your database. Relying on your filesystem for this is looking for trouble...
As a conclusion, I would say that your current rules are much better than no rule at all.
You should define a naming scheme for your database objects, so that it's clear where a view or SP is being used.
This can be done either with prefixes that describe the app modules (for example, a rpt_ prefix for report objects), or with separate schema names for modules/functionality.
No nesting required, and names in the VCS show up the same as in the database, and sort properly depending on the naming scheme.
We save our SQL files in a "SQL" solution folder with each project. That way, each project is "installed" separately.