ClearCase config spec to select the greater revision of two labels

I am trying to write a ClearCase config spec that selects a file based on the greater revision number when that file carries both of the labels I want.
Example:
file1.c; rev 1 ---> PR438
file1.c; rev 2
file1.c; rev 3 ---> PR433
The "basic" config spec of:
element * PR438
element * PR433
would choose file1.c; rev 1, since that label is specified first in the config spec.
What I want is to choose file1.c; rev 3 without having to analyze the label ordering of every file to properly order a config spec.
Basically, I want a rule that says choose PR438 and PR433 and if a file has both labels, use the file with the highest revision number.

"Basically, I want a rule that says choose PR438 and PR433 and if a file has both labels, use the file with the highest revision number."
This isn't how a config spec uses selection rules.
If the naming convention of those labels is properly done, the highest (most recent) version will always be PR438.
That means selecting PR438 first and then, as a fallback, selecting PR433 is enough.
What you could have tried is to first select the versions which have both labels.
Even if the config spec syntax doesn't specify AND or OR operators, that would be:
element * {lbtype(PR438)&&lbtype(PR433)}

This is a can of worms. In this case, if the NEWER label is attached to the OLDER version, you can't use the age of the label type to solve the problem. You're wandering into "create file-specific configspecs" territory.
So, you'd have to start with the output of something like this:
cleartool find -all -version "lbtype(PR438) || lbtype(PR433)" -print
From there, you would have to:
Parse it to locate all the duplicate element names (stripping out the version IDs)
Take the later version of each duplicated file
Put those versions, starting on the SECOND line, into a configspec based on the labels (unless you're OK with not checking out those files, in which case the "element * CHECKEDOUT" line isn't that important anyway)
Since you're already this far down the path, you could also just build the configspec based entirely on the find output. But that can get unwieldy and unreadable.
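The parsing steps above can be sketched in Python. This is only a rough sketch under assumptions that are not in the original answer: each line of the find output is taken to look like "<element>@@<branch>/<number>", both labelled versions of an element are assumed to sit on the same branch (so comparing the trailing number is meaningful), and the function name is made up for illustration.

```python
import re
from collections import defaultdict

# Assumed line shape (not confirmed by the thread): <element>@@<branch>/<number>,
# e.g. "/vobs/p/file1.c@@/main/3".
VERSION_RE = re.compile(r"^(?P<element>.+)@@(?P<branch>.+)/(?P<num>\d+)$")


def duplicate_element_rules(find_output):
    """Return one file-specific rule per element that has several labelled versions."""
    versions = defaultdict(list)
    for line in find_output.splitlines():
        match = VERSION_RE.match(line.strip())
        if match:
            versions[match["element"]].append((int(match["num"]), match["branch"]))

    rules = []
    for element in sorted(versions):
        if len(versions[element]) > 1:
            # Hypothetical tie-break: the highest trailing version number wins.
            num, branch = max(versions[element])
            rules.append(f"element {element} {branch}/{num}")
    return rules
```

The returned lines would then be pasted after "element * CHECKEDOUT" and before the two label rules, so the label rules still cover every element that carries only one of the labels.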

Multiple file renaming based on regex pattern + renaming within files based on new file names

How to achieve this? I have a folder with over 1000 code files, mostly xml.
Most of the files have a common pattern:
abbb
accc
addd
a should be replaced by z:
zbbb
zccc
zddd
However, there are also files that do not start with a:
efff
ghhh
These names should then simply be preceded by z.
zefff
zghhh
Many of the file names also show up inside the files. Hence, all original file names should be replaced by the new names within the files, too.
My idea was to put the original names in column 1 of a table and the new names in column 2 next to them, then loop over this table and, wherever an original name is found within a file (it can show up multiple times in the code lines), replace it with the new name. Any tips?
This solved it for me:
File renaming e.g. as described here: https://superuser.com/questions/16007/how-can-i-mass-rename-files
Batch replacement with this VS Code plugin: https://marketplace.visualstudio.com/items?itemName=angelomollame.batch-replacer
Since I had the said table (old vs. new name) prepared, a simple regex replacement over the table entries produced the input the plugin expects, i.e. each old name became replace "old_file_name" and each new name became with "new_file_name". Then I just copied and pasted everything into the plugin as described on its page, and everything was replaced.
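For anyone who would rather script it, the table idea from the question can be sketched in Python. This is a minimal sketch, not what I actually ran: the function names are made up, file names are assumed to be the part before the extension, files are assumed to be UTF-8 text, and names are only replaced where they appear as whole words.

```python
import re
from pathlib import Path


def new_name(old):
    """Replace a leading 'a' with 'z'; otherwise put 'z' in front of the name."""
    return "z" + old[1:] if old.startswith("a") else "z" + old


def build_table(names):
    """Old name -> new name, i.e. the two-column table from the question."""
    return {old: new_name(old) for old in names}


def replace_names(text, table):
    """Replace every whole-word occurrence of an old name in a single pass."""
    if not table:
        return text
    # Longest names first, so a name that is a prefix of another never wins.
    alternatives = "|".join(re.escape(n) for n in sorted(table, key=len, reverse=True))
    pattern = re.compile(r"\b(" + alternatives + r")\b")
    return pattern.sub(lambda m: table[m.group(1)], text)


def rename_folder(folder):
    """Rewrite references inside each file, then rename the file itself."""
    files = [p for p in Path(folder).iterdir() if p.is_file()]
    table = build_table(p.stem for p in files)
    for path in files:
        text = path.read_text(encoding="utf-8")
        path.write_text(replace_names(text, table), encoding="utf-8")
        path.rename(path.with_name(table[path.stem] + path.suffix))
```

Doing the substitution in one pass avoids renaming an already renamed string a second time. The sketch does not check for collisions (two old names mapping to the same new name), so try it on a copy of the folder first.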

What does the + mean in software versioning

In libraries or packages I often see something like 0.5.4+6 or maybe 1.12.4+2, etc. I know the first number is the major version, the next one minor version, the next one maybe build number or revision. But what does the +2 or +6 signify?
Usually it is used to provide build metadata (e.g. a build number or date).
For more detailed info, see the Semantic Versioning spec.
The trailing part after MAJOR.MINOR.PATCH is not strictly defined in SemVer (AFAICR), so anybody can put any useful information in it. The most common usage, when a VCS is involved, is to provide a unique (but readable) id that identifies the exact changeset in the source used to build the artifact.
Because tags (or their equivalent) are mainly what is used for naming or numbering versions in a VCS, and (internal) builds between tags (releases) are possible, such ids appear. In plain words, they mean something like "N commits after version X".
Sample from my labeling (don't try to grok hg-templating, I'll explain it):
semver = "{latesttag}{ifeq(latesttagdistance,0,'','+{latesttagdistance}')}"
Find the latest tag in history
If there are commits after it, add a "+" sign and the number of those commits
It's just a human-friendly type of id, which also allows (rather) fast detection of the commit in question if it's needed. And it's a lot more readable, memorable and pronounceable than, e.g., b800644fcbe2
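If the template is hard to read, the same two steps look roughly like this in Python. This is just an illustrative sketch: the function names are made up, and it only handles labels of the form "tag" or "tag+N" as in my scheme, not arbitrary SemVer build metadata.

```python
def build_label(latest_tag, distance):
    """Return the tag alone at distance 0, otherwise '<tag>+<distance>'."""
    return latest_tag if distance == 0 else f"{latest_tag}+{distance}"


def split_label(label):
    """Split '<tag>+<distance>' back into (tag, distance); no '+' means distance 0."""
    tag, _, distance = label.partition("+")
    return tag, int(distance) if distance else 0
```

So a build made six commits after the tag 0.5.4 would be labelled 0.5.4+6, and a build made exactly on the tag keeps the plain 0.5.4.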

Using org-mode as a flat file database and sanitizing input

I'd like to use an org-mode file as a flat file database that can be edited both programmatically and by hand. An example follows showing a list of bookmarks.
* Somebody's blog :: I like org-mode
:url: http://somebody.com/org
** Quotation 1
:date: 2013/01/13 08:32:11 EST
Very interesting observations here.
** Quotation 2
:date: 2013/01/13 08:33:46 EST
A marvelous code snippets
* Man bites dog
:url: http://newssite.com/today
I'd like emacs or a webserver cgi-script or similar to edit such a file (in the example above, to add more bookmarks or more quotations to existing bookmarks).
The problem is when, e.g., accepting arbitrary selections from websites to insert under an org-mode heading, it becomes necessary to sanitize the input so that, at minimum, quoted lines starting with asterisks don't affect the file's structure: if a quotation starts with "* this is pathological example", and is inserted into the file under some heading, when I open the file in emacs, it'll appear as a new first-level (h1) heading.
How can I meet the twin goals of (i) an editable org-mode flat file database (this rules out escaping and all our XML tricks) and (ii) isolating arbitrary inputs?
anti-solution: #+BEGIN_QUOTE wouldn't work because lines starting with "* " are rendered as new headings.
possibility 1: box/rebox everything from the outside world: http://www.emacswiki.org/emacs/BoxQuote this seems excessive though.

PDFBox adding white spaces within words

When I try to extract text from my PDF files, PDFBox seems to insert white spaces inside several words at random.
I am using pdfbox-app-1.6.0.jar (latest version) on the following sample file, found in the Downloads section of this page:
http://www.sheffield.gov.uk/roads/children/parents/6-11/pedestrian-training
I've tried with several other PDF files and it seems to do the same on several pages.
I do the following:
java -jar pdfbox-app-1.6.0.jar ExtractText -force -console ~/Desktop/"ped training pdf.pdf"
on the downloaded file, and you will see spaces wrongly inserted in the following lines of the console output:
"• If ch ildren are able to walk to
schoo l safely this could reduce the
congestion. "
"• Develops good hab its for later life."
"www.sheff ield.gov.uk"
"Think Ahead!, wh ich is based on the"
etc etc.
As you can see, several of the words above have spaces inserted into them for no reason I can fathom.
I am on Ubuntu and running Sun's JDK 1.6.
I've tried this on several different PDF files and searched the forums for a solution. There were similar bugs, but all of them seemed to have been resolved.
Any help is appreciated, and if anyone else has the same problem, please comment. This is causing a big problem in indexing the content properly for searching.
Unfortunately there is currently no easy solution for this.
Internally PDF documents simply contain instructions like "place characters 'abc' in position X" and "place characters 'def' in position Y", and PDFBox tries to reason whether the resulting extracted text should be "abc def" or "abcdef" based on things like the distance between X and Y. These heuristics are generally pretty accurate, but as you can see they don't always produce the correct result.
One way to improve the quality of the extracted text is to try a dictionary lookup on each extracted word or token. If the lookup fails, try combining the token with the next one. If a dictionary lookup on the combined token succeeds, then it's fairly likely that the text extractor has mistakenly added an extra space inside the word. Unfortunately such a feature does not yet exist in PDFBox. See https://issues.apache.org/jira/browse/PDFBOX-1153 for the feature request filed for this. Patches welcome!
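As a post-processing step outside PDFBox, the dictionary idea could look roughly like the Python sketch below. It is not a PDFBox feature or API: the function name and the word set are hypothetical, tokens are assumed to be whitespace-separated words without punctuation, and only pairs of adjacent tokens are considered.

```python
def merge_split_words(tokens, dictionary):
    """Join two adjacent tokens when the first is unknown but the pair is a known word."""
    merged = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        has_next = i + 1 < len(tokens)
        if (token.lower() not in dictionary and has_next
                and (token + tokens[i + 1]).lower() in dictionary):
            # The lookup failed alone but succeeded combined: likely a spurious space.
            merged.append(token + tokens[i + 1])
            i += 2
        else:
            merged.append(token)
            i += 1
    return merged
```

With a word set containing "children" and "school", the tokens "ch", "ildren" would be joined back into "children" and "schoo", "l" into "school". The result is only as good as the dictionary, and a real version would have to strip punctuation before the lookup.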
The class org.apache.pdfbox.util.PDFTextStripper (pdfbox-1.7.1) lets you adjust how readily it decides whether two strings are part of the same word.
Increasing spacingTolerance will reduce the number of inserted spaces:
/**
 * Set the space width-based tolerance value that is used
 * to estimate where spaces in text should be added. Note that the
 * default value for this has been determined from trial and error.
 * Setting this value larger will reduce the number of spaces added.
 *
 * @param spacingToleranceValue tolerance / scaling factor to use
 */
public void setSpacingTolerance(float spacingToleranceValue) {
    this.spacingTolerance = spacingToleranceValue;
}

Can ClearCase's findmerge tool ignore a predefined conflict?

There are 2 branches of a file. I have to merge from one branch to the other. The automatic merge fails because there are conflicts. The conflicts are due to the output of the date command stored in the file. Can the findmerge tool ignore some conflicts using some filter options? I want findmerge to ignore the Date: .* lines and auto-merge the rest of the file.
As there are many such files, manually merging every file whose only difference is "Date: .*" takes too much time. How can I automate such a merge?
The date is different in all 3 files, so there is a conflict:
file1.txt@@/main/branch1/LATEST
Date: 03/03/2010 11:00PM
Some information1
file1.txt@@/main/branch2/LATEST
Date: 11/11/2009 10:30AM
Some information1
New information2
New information3
Base file: file1.txt@@/main/main/20
Date: 07/07/2005 05:30AM
Some information1
Thanks
Deepak
Keyword expansions in ClearCase have been debated before: without the right type manager, it isn't supported.
(Not to mention the fact that it doesn't bring much value in a VCS.)
The crux of the issue is that the findmerge algorithm has a case where the actual file content is compared. Unfortunately, findmerge does not use the type manager's compare function but something hard-coded, so it will think the files are different even though the only difference is in the keywords.
In theory you still have a way out: develop a type manager, combined with a trigger, as discussed here. This isn't trivial by any means, so the best solution is to:
either avoid modifying that section in both branches (the merge will then be trivial for that section)
or avoid keyword expansion entirely (for instance, metadata like a date should be associated with the revision itself, and not stored as textual metadata within the data)
