How can I change my authors mapping file in SubGit?

I have used SubGit to create a standalone mirror of an SVN repository as a Git repository using subgit configure --layout auto SVN_REPO. The main functionality is working great, but I am trying to map SVN authors to Git user names and vice versa.
Running subgit configure --layout auto SVN_REPO created the mapping file authors.txt automatically, as described in the documentation:
user1 = user1 <user1@localhost>
user2 = user2 <user2@localhost>
...
I have tried to modify the file authors.txt so that it looks like this:
user1 = peter <peter@localhost>
user2 = rob <rob2@localhost>
...
After that I issued subgit fetch in the SubGit installation folder. However, I still see the old mapping (user1, user2, ...) instead of the new names in my cloned Git repository.
What should I do to make SubGit use my authors mapping? Should I run subgit configure again to import the whole SVN repository with the new authors.txt file?

Changing names in the authors file is not enough: SubGit sets authors during the actual translation from SVN to Git (or back), so changes made to the authors file don't change authors in already translated commits or revisions; they only affect commits created after the mapping is changed. If you want the new authors mapping to be applied to all commits, you need to re-translate the whole repository from scratch, which can be done with the 'subgit install --rebuild' command.
If you mean that new commits are still being created with authors from the old mapping, that is most probably caused by an incorrect path to the authors file; check the core.authorsFile setting.
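A minimal sketch of both steps, assuming the translated Git repository lives in GIT_REPO and that SubGit keeps its settings in GIT_REPO/subgit/config (both paths are placeholders for your setup):
grep authorsFile GIT_REPO/subgit/config   # confirm the path SubGit reads, e.g. authorsFile = subgit/authors.txt
subgit install --rebuild GIT_REPO         # re-translate the whole history so the new mapping applies everywhere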

Related

How do I stop Git from overwriting my db connection file?

I have a file "db-connection.php" that has to be different for each version of my server (localhost, Dev and Production). At first I thought .gitignore was the answer, but after much pain and research I realized that .gitignore only works on untracked files, i.e. files NOT already in the repo.
For obvious reasons, the localhost version I'm using with xampp requires that the db file be within the repo. Of course, this means that every time I push it to Dev, it ruins the Dev db connection.
Is there a way to tell .git "Yes, I realize this file exists, but leave it alone anyway"?
This is a common problem, and there are two solutions, depending on your needs.
First, if you are always going to have the same configuration files and they change depending only on the environment (not on the developer machine), then simply create the three versions of the file in your repository (e.g., in a config directory) and copy the appropriate one into place, either with a script or manually. You then remove the db-connection.php file from the repository and ignore it.
If this file actually needs to depend on the user's system (say, it contains personal developer credentials or system-specific paths), then you should ship a template file and copy it into place with a script (which may fill out the relevant details for the user). In this case, too, the db-connection.php would be ignored and removed from the repository.
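A minimal sketch of the second approach; the template name db-connection.sample.php is just an example:
git rm --cached db-connection.php                 # stop tracking the real config, keep the local copy
echo "db-connection.php" >> .gitignore            # ignore it from now on
cp db-connection.php db-connection.sample.php     # ship a template with placeholder values instead
git add .gitignore db-connection.sample.php
git commit -m "Replace db-connection.php with an ignored template"
cp db-connection.sample.php db-connection.php     # on each environment: copy the template, fill in real values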
There are two things people try to do that don't work. One of them is to try to keep multiple branches each with their own copy of the file. This doesn't work because Git doesn't really provide a way to not merge certain files between branches.
The other thing people try to do is to just ignore changes to a tracked file using some invocation of git update-index. That breaks in various cases because doing so isn't supported; the Git FAQ entry and the git update-index manual page explain why.
You can use the skip-worktree option with git update-index when you don't want Git to manage changes to that file.
git update-index --skip-worktree db-connection.php
Reference: Skip worktree bit
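To see which files have the bit set, and to undo it later:
git ls-files -v | grep ^S                                # skip-worktree paths are listed with an "S"
git update-index --no-skip-worktree db-connection.php    # start tracking changes to the file again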

How to capture all information from ClearCase?

I need some help on how to collect all information from ClearCase, tar or zip it, and store it in a provided space. We have migrated major baselines from ClearCase to a different SCM tool, but we still have ClearCase. We want to capture all versions, changes, baselines, etc. (basically capture everything, but not the SCM tool itself) and zip it or put it in a flat file or so. This is just for historical purposes, so that tomorrow if someone wants to know what was in ClearCase they can see it. Is there any way to do this?
The reason this doesn't exist (as far as I know) lies in the nature of ClearCase compared to revision-based VCS tools.
It is a file-based VCS:
You create a new version for each file you change (instead of a unique repository-wide revision)
You create a label on each file you want to label (instead of a tag referring to a revision or a commit)
You create a branch for each file modified in that branch (instead of a single directory for SVN, or branch for other VCS)
...
That means you wouldn't simply export revisions/labels/branches with ClearCase. You would export them for each file: it doesn't scale well and would take too much time and space.
Migrating major baselines is a sensible course of action that I have recommended before.
But for the rest, I always keep a ClearCase instance around as a way to explore the full history/events in case it is needed, while the recent history is managed in the new VCS tool.
Storing that history as a flat file you could read without ClearCase isn't, again as far as I know, available.
Hence my previous "vobstore-reformatvob" proposition.
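If a rough, metadata-only flat file is still wanted for the archive, the closest approximation I know of is a formatted cleartool lshistory dump per VOB; a sketch (the VOB path and the format string are only examples):
cleartool lshistory -recurse -fmt "%d | %u | %e | %n | %Nc\n" /vobs/myvob > myvob-history.txt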

What's the correct way to deal with databases in Git?

I am hosting a website on Heroku, and using an SQLite database with it.
The problem is that I want to be able to pull the database from the repository (mostly for backups), but whenever I commit & push changes to the repository, the database should never be altered. This is because the database on my local computer will probably have completely different (and irrelevant) data in it; it's a test database.
What's the best way to go about this? I have tried adding my database to the .gitignore file, but that leaves the database completely unversioned, which prevents me from pulling it when I need to.
While Git (just like most other version control systems) supports tracking binary files like databases, it really only works well for text files. In other words, you should never use a version control system to track constantly changing binary database files (unless they are created once and almost never change).
One popular method to still track databases in Git is to track text database dumps. For example, an SQLite database can be dumped into a *.sql file using the sqlite3 utility (the .dump command). However, even when using dumps, it is only appropriate to track template databases which do not change very often, and to create the binary database from such dumps using scripts as part of standard deployment.
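For example (file names are placeholders):
sqlite3 app.db .dump > db/template.sql    # text dump that diffs and merges like normal source
sqlite3 fresh.db < db/template.sql        # recreate a binary database from the tracked dump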
You could add a pre-commit hook to your local repository that will unstage any files that you don't want to push.
For example, add the following to .git/hooks/pre-commit:
git reset ./file/to/database.db
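On a Unix-like system the hook file also has to be executable before Git will run it:
chmod +x .git/hooks/pre-commit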
When working on your code (potentially modifying your database), you will at some point end up with:
$ git status --porcelain
M file/to/database.db
M src/foo.cc
$ git add .
$ git commit -m "fixing BUG in foo.cc"
M file/to/database.db
.
[master 12345] fixing BUG in foo.cc
1 file changed, 1 deletion(-)
$ git status --porcelain
M file/to/database.db
This way you can never accidentally commit changes made to your database.db.
Is it the schema of your database you're interested in versioning? But making sure you don't version the data within it?
I'd exclude your database from git (using the .gitignore file).
If you're using an ORM and migrations (e.g. Active Record) then your schema is already tracked in your code and can be recreated.
However if you're not then you may want to take a copy of your database, then save out the create statements and version them.
Heroku doesn't recommend using SQLite in production and suggests using their Postgres system instead. That also lets you perform many tasks on the remote DB.
If you want to pull the live database from Heroku the instructions for Postgres backups might be helpful.
https://devcenter.heroku.com/articles/pgbackups
https://devcenter.heroku.com/articles/heroku-postgres-import-export
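With the current Heroku CLI, a backup round trip looks roughly like this (the app name and local database name are placeholders; the linked docs are authoritative):
heroku pg:backups:capture --app my-app                               # snapshot the remote Postgres database
heroku pg:backups:download --app my-app                              # downloads it as latest.dump
pg_restore --clean --no-acl --no-owner -d my_local_db latest.dump    # load the dump into a local database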

Changing event record information in Clearcase

At our work, we are forced to use Clearcase UCM as our central repository (specifically for labelling/baselining, builds and code reviews), but our team wants to use Git as our real SCM system.
What we want to achieve is essentially a scraping service that takes the commits as they are pushed to our central Git repo, and push them on to a Clearcase VOB that is read-only as far as the development team is concerned, including important information such as the comment and the user name (exact date/time matching is not important, but getting the user correct is).
Our centralized Git server has been configured (using the excellent scm-manager) to accept Windows domain users and passwords, and our Clearcase servers use Windows domain accounts, but I am unsure how a scraper service would "impersonate" the correct user so this information is duplicated correctly in Clearcase.
I thought the chevent command might hold some promise, but that only gives access to the comment.
Is there any way to amend the details of a Clearcase event record once it is in the database, in particular the user-name? Or is there a better way to do this?
Again, we don't need a bi-directional bridge - all access to the Clearcase VOB as far as code commits is concerned would be through the scraper.
ClearCase is a file-by-file SCM, not a revision-based SCM.
(See "What are the basic ClearCase concepts every developer should know?" for a more detailed comparison between ClearCase and git)
That means, for each git commit, you need to:
clearfsimport into ClearCase any file included in the git commit.
Create a specific UCM activity for that import.
As a ClearCase admin, run cleartool protect -chown on the activity: see "Why is the owner of the clearcase activity 'nobody'" (as well as protect -chgrp, if the CLEARCASE_PRIMARY_GROUP environment variable wasn't correctly set at the time of the import).
Note that cleartool protect affects the entire "element" (file or directory), not just one version, so you cannot record the user id that way: the next import would overwrite that id with the id of the new committer whose content is imported.
Plus, you cannot change the initial creator (see "Changing the name of the original creator of an element").
That means you should record that information (the author and creator Git ids) in an attribute:
see cleartool mkattr.
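A rough sketch of one iteration of such a scraper, assuming a UCM view on a Unix host, a VOB at /vobs/myvob, and the Git commit exported to /tmp/export-abc123; the attribute type GIT_AUTHOR and the activity name are hypothetical:
cleartool mkattype -vtype string -nc GIT_AUTHOR                       # one-time setup: declare the attribute type
cleartool mkactivity -headline "git abc123: fix bug in foo.c" import_abc123
cleartool setactivity import_abc123
clearfsimport -recurse -nsetevent /tmp/export-abc123/* /vobs/myvob/src
cleartool mkattr -nc GIT_AUTHOR '"Jane Doe <jane@example.com>"' /vobs/myvob/src/foo.c@@/main/LATEST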
If I did want to accurately reflect the Git user as the "creator" of the new version of the file, does that mean I would need a way to run clearfsimport as that user - impersonate them?
Yes: for each commit, you would need to run clearfsimport "as" that user (runas in Windows, as mentioned in this thread), in order for ClearCase to properly set the creator (if this is a new element) or the version author (if this is a new version of an existing element).
The reason I didn't mention that possibility in the first place is that I don't have access to the credentials of another user to switch to for each clearfsimport.
Other import tools (CVS, PCVS, RCS, SCCS, SSafe) simply:
ignore that creator/author information entirely.
add attributes of their own for tool-specific information (like the promotion group PVCS_GROUP, or RCS_REVISION).
Each time, you will find a limitation similar to:
clearexport_sccs ignores information in SCCS files that is not related to version-tree structure; this includes flags, ID keywords, user lists, and Modification Request numbers
most of our other systems that need the Clearcase history use the creator to reflect who made that change
That means your other systems can rely on the version's user ID, except when it is the one used for the import (in which case they would consult the special attribute recording that data from the import).

What is the difference between a reserved checkout and an unreserved checkout?

When I check out a file in ClearCase it asks me if I want to check out the file "Reserved" or "Unreserved". What are the differences between these types of checkouts and when are the appropriate times to use them?
As mentioned in "What are the basic clearcase concepts every developer should know?", ClearCase support a locking mechanism which is both:
"pessimistic": reserved checkout doesn't actually prevent other people to do their own checkout, but they will have to wait for the person who has the file checked out as "reserved" to do the check in: nobody can check-in until that person does the first check-in (then each other user will have to merge his/her version with the latest checked-in file)
Note: a "reserved" checkout can release its lock and be made unreserved, either by the owner or the administrator;
"optimistic": unreserved checkout which means (if nobody use a reserved checkout on the same file): the first one to check-in can do it without any other operation, the other ones will have to merge their work with the latest checked-in file.
In term if usage policy:
Usually, reserved checkout is fine since it allows you to make your changes with a "high prioritization": they have to be taken into account first.
For local modifications which don't have to be checked-in right away, an unreserved checkout is enough.
For local modifications which don't have to be checked in at all, hijacked or eclipsed files are enough (so, no checkout at all).
Notes:
A cleartool checkout/checkin is not the same as:
svn checkout/git checkout, which update a working copy/tree with the content of a revision/commit, as opposed to checking out a version of a single file: a set of files vs. one file.
"checkin": svn commit/git commit, which register changes to possibly multiple files in the repo (remote for SVN, local for Git), as opposed to creating a new version of one file.
Git itself does not have "file locking" (reserved checkouts). Only systems built around Git might offer that feature, like Git LFS.
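For example, with Git LFS file locking (the pattern and path are assumptions, and the remote server must support the LFS locking API):
git lfs track --lockable "*.psd"     # mark the pattern as lockable in .gitattributes
git lfs lock design/logo.psd         # take the lock, roughly a "reserved checkout"
git lfs unlock design/logo.psd       # release it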
