How to run a SubGit import for trunk and one specific branch

Hi, I have a question about SubGit version 3.2.1 ('Bobique') build #3593.
The SVN repo I want to import doesn't have a standard layout.
I can't find documentation on how to configure a 'subgit import' for just trunk and one specific branch that sits on the same level as trunk. The branches are not in a branches folder.
In other words:
In SVN we have a trunk, which should be mapped to master.
In SVN we have a branch x, which should be mapped to develop.
I hope I have been clear. Can anybody help me?
Greetings

First of all, you can run
$ subgit configure --svn-url PROJECT_ROOT repo.git
Then edit repo.git/subgit/config. What goes in there depends on one question: if you want continuous synchronization, should other branches ever be translated (e.g. when someone pushes a new branch under refs/heads/, should it be translated to SVN)? If yes, use the following configuration:
trunk = trunk:refs/heads/master
branches = x:refs/heads/develop
branches = *:refs/heads/*
#it's up to you whether you want to have tags/shelves or not
#shelves = shelves/*:refs/shelves/*
#tags = tags/*:refs/tags/*
If not, you can map just that one branch:
trunk = trunk:refs/heads/master
branches = x:refs/heads/develop
#it's up to you whether you want to have tags/shelves or not
#shelves = shelves/*:refs/shelves/*
#tags = tags/*:refs/tags/*
In the second case, when you push refs/heads/branch, SubGit will ignore it.
If you need just a one-time translation, either configuration works.
Finally, run
$ subgit install repo.git
If you don't need continuous synchronization, you can then run
$ subgit uninstall repo.git


How do I commit/push my build folder into another git repository and not into the main repository?

So, I've recently made a React app which I have posted on GitHub. However, I would like to post the output (build folder after I run npm run build) to a Glitch application. Since all Glitch applications have a git repository, I thought that would be the best way to go about doing this. Here is my desired structure:
My main git repo, which pushes to GitHub. This repository ignores the build folder.
Another "sub" git repository, which only pushes the contents of build to Glitch.
I've seen people using submodules, but I can't figure out how to make my main git repo ignore the build folder and have the submodule just push the build folder.
I'm also confused on how to setup a submodule in general, so an example/explanation for that as well would be appreciated.
~ Ayush
I'm not entirely sure that you want a submodule here, but submodules will let you do what you are describing. Submodules are tricky, though. There's a reason people call them sob-modules. 😀
Long
First, it will help a great deal if you get your definitions—actors and actions—straight:
A repository does not push anything. It's just a collection of commits (plus some names; see the last point below).
Git (the software suite) creates and manipulates repositories, including the commits inside them.
The git push command pushes commits.
A commit is a thingy (technically, a commit object, but people use the term pretty loosely, hence the loose "thingy" term here 😀) with the following features:
It has a unique hash ID.
It stores files. Note that commits do not store folders, just files. These files have path names that include embedded slashes (always forward slashes, even if you extract the commit's files on a Windows system with its backward slashes). This becomes important later, but if you like, you can think of them as folders-full-of-files, as long as you remember that Git can't store an empty folder properly (because it only stores files). The files are stored as full snapshots, although they get compressed and, importantly, de-duplicated across all commits in a repository. So, the fact that typically some new commit re-uses 30,000 files from some previous commit doesn't matter: the re-used files take no space, because they're literally re-used.
It stores metadata, or information about the commit itself. This includes stuff like who made the commit and when, and a log message, and so on; and, crucial for Git's own operation, it also includes the raw hash ID of some set of earlier commits. Most commits store just one earlier-commit hash ID, which we (and Git) call the parent. This is how history works, in a Git repository: each commit remembers its parent.
It is completely read-only. No part of any commit can ever be changed. (This is what allows the de-duplication, and a lot of other Git magic.)
A repository also contains names—such as branch and tag names—that allow Git to find commits. This works by having one name store exactly one hash ID. For branch names, that stored hash ID is, by definition, the last commit in the branch. Since commits store parent hash IDs, Git can work backwards from whichever commit we decide to call "last in branch X": X~1 is the second-to-last in X, X~2 is the third-to-last, and so on.
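To make the name-to-hash mechanics concrete, here is a minimal sketch in a throwaway repository. The branch name main, the commit messages, and the Demo identity are illustrative assumptions, not anything from the question.

```shell
# Throwaway repository; every name here is illustrative.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main      # call the first branch "main"
git commit -q --allow-empty -m "first"
git commit -q --allow-empty -m "second"
git commit -q --allow-empty -m "third"

tip=$(git rev-parse main)                  # the single hash ID stored under the name
parent=$(git log -1 --format=%P main)      # parent hash recorded in the tip's metadata
second_to_last=$(git rev-parse main~1)     # Git steps back one parent from the tip

echo "main   -> $tip"
echo "main~1 -> $second_to_last"
```

The two lookups agree: main~1 is simply "read the parent out of the commit that main names".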
The act of adding a new commit to a branch consists of the following steps:
1. You check out that commit (with git checkout or git switch) by checking out that branch (with the same command), so that this is now the current branch. This action fills in both Git's index, which holds your proposed next commit, and your working tree, where Git copies out all the files into a usable form. The internal, de-duplicated form is generally unusable to everything except Git itself.
2. You do some stuff in your working tree. Git has zero control or influence over this part, a lot of the time, since you'll be using your own editor or compiler or whatever. You can use Git commands here and then Git will be able to see what you did, but mostly, Git doesn't have to care, because we move on to step 3:
3. You run git add. This instructs Git to take a look at the updated working tree files. Git will copy these updated files back into Git's index (aka the staging area), in their updated form, re-compressing and de-duplicating them and generally making them ready for the next commit.
4. You run git commit. This packages up new metadata (your name, the current date and time, a log message, and so on) and adds the current commit's hash ID to make up the metadata for the new commit. The new commit's parent will thus be the current commit. Git then snapshots everything in the index at this time (which is why git checkout filled it in, in step 1, and then git add updated it in step 3), along with the metadata, to make the new commit. This gives the new commit its new hash ID, which is actually just a cryptographic checksum of the entire data set here.
It's at this point that the magic happens: git commit writes the new commit's hash ID into the current branch name. So now, the last commit on the branch is your new commit. This is how a branch grows, one commit at a time. No existing commit changes—none can change—but the new commit points back to what was the last commit, and is now the second-to-last commit. The branch name moves.
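The branch-name movement can be observed directly. The following is a small sketch under the same kind of assumptions as before (a scratch repository, a branch called main, a file called file.txt, none of which come from the question):

```shell
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
echo "v1" > file.txt
git add file.txt
git commit -q -m "first"
before=$(git rev-parse main)               # last commit on the branch, for now

echo "v2" > file.txt                       # edit the working tree
git add file.txt                           # copy the updated file into the index
git commit -q -m "second"                  # snapshot the index, record the parent
after=$(git rev-parse main)                # the name now holds the new hash ID
new_parent=$(git log -1 --format=%P main)  # and the new commit points back

echo "before: $before"
echo "after:  $after (parent $new_parent)"
```

Nothing about the first commit changed; only the hash ID stored under main did.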
You really need to have all of these down pretty cold to make submodules work, because submodules actually use all of this stuff, but then violate some rules. Now it starts to get tricky. We also need to look more closely at git push, just for a moment.
git push: cross-connecting one Git repository with another
Making a new Git commit, in some Git repository, just makes a new snapshot-plus-metadata. The next trick is to get that commit into some other Git repository.
If we start with two otherwise-identical Git repositories, each has some set of commits and some branch names identifying the same last commit:
... <-F <-G <-H <--branch-name [in Repo A]
and the same in Repo B. But then, over in Repo A, we do:
git checkout branch-name
<do stuff>
git commit
which causes repo A to contain:
...--F--G--H--I <-- branch-name
(I get lazy and don't bother drawing the commit-to-commit arrows correctly here). New commit I—I, like H and G and F, stands in for some big ugly random-looking hash ID—points back to existing commit H. You might even make more than one new commit:
...--F--G--H--I--J <-- branch-name
Now you run git push origin branch-name, to send your new commits, in your repository, back to the "origin" repo (which we were calling "repo B" before, but let's call it origin now).
Your Git software suite ("your Git") calls up theirs. Your Git lists out the hash ID of your latest commit, i.e., commit J. Their Git checks in their repository, to see if they have J, by hash ID. They don't (because you just made it). So their Git tells your Git: OK, gimme! Your Git is now obligated to offer J's parent I. They check and don't have I either, so they ask for that one too. Your Git is now obligated to offer commit H. They check and—hey!—this time they do have commit H already, so they say: no thanks, I have that one already.
Your Git now knows not only that you must send commits J and I, but also which files they already have. They have commit H, so they must have commit G too, and commit F, and so on. They have all the de-duplicated files that go with those commits. So your Git software suite can now compute a minimal set of stuff to send them so that they can reconstruct commits I-J.
Your Git does so; that's the "counting" and "compressing" and so on that you see. Their Git receives this stuff, unpacks it, and adds the new commits to their repository. They now have:
...--F--G--H   <-- branch-name
            \
             I--J
in their Git repository. Now we hit a really tricky bit: How does a Git, in general, find a commit? The answer is always, ultimately, by its hash ID—but that just brings another question, which is: how does a Git find a hash ID? They look random.
We already said this earlier though: a Git (the software suite) often finds some specific commit in some specific repository through the use of a branch name. The branch name branch-name, in your repository, finds the last commit, which is now J. We'd like the same name in their repository to find the same last commit.
So, your Git software now asks their Git to set their repository's branch name branch-name to identify commit J. They will do this if you are allowed to do this. The "allowed" part can get arbitrarily complicated—sites like GitHub and Bitbucket add all kinds of permissions and rules here—but if we assume that it's OK, and that they'll do that, then they will end up with:
...--F--G--H--I--J <-- branch-name
in their repository, and your Git repository and their Git repository will be in sync again, at least for this particular branch name.
So that's how git push normally works: you make new commits, adding them on to the end of your branch, and then you send your new commits to some other Git, and ask their software to add the same commits to the end of a branch of the same name in their repository. (Whew!)
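The whole exchange can be tried locally, with a bare repository standing in for the remote. This is only a sketch: origin.git, mine, and the commit messages I and J are illustrative stand-ins for the repositories and commits in the drawings above.

```shell
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"
git init -q --bare origin.git                  # plays the part of "their" repository
git clone -q origin.git mine 2>/dev/null       # empty clone; Git warns about that
cd mine
git symbolic-ref HEAD refs/heads/branch-name
git commit -q --allow-empty -m "I"
git commit -q --allow-empty -m "J"
git push -q origin branch-name                 # send the commits, then ask them to set the name

ours=$(git rev-parse branch-name)
theirs=$(git -C ../origin.git rev-parse branch-name)
echo "ours:   $ours"
echo "theirs: $theirs"
```

After the push, both repositories resolve branch-name to the same hash ID, which is what "in sync" means here.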
Submodules
A submodule, in Git, is little more than two separate, mostly-independent Git repositories. This of course needs a lot of explanation:
Why are they only "mostly" independent? (What does that even mean?)
If they're little more, what more are they?
First, like any repository, a submodule repository is a collection of commits, each with a unique hash ID. We, or Git at least, like to refer to one of the two repositories as the superproject and the other as the submodule. Both of these start with the letter S, which is annoying, and both words are long and klunky, so here I'll use R as the superproject Repository, and S as the Submodule.
(Side note: the hash IDs in R and S are independent from each other. Git tries pretty hard—and usually succeeds—at making hash IDs globally unique across every Git repository everywhere in the universe. So there's no need to worry about "contaminating" R with S IDs or vice versa. In any case we can just treat every commit hash ID as if it's totally unique. Normally, with a normal non-R non-S repository, we don't even have to care about IDs, as we just use names. But submodules make you have to be more aware of the IDs.)
What makes R a superproject in the first place is that it lists raw hash IDs from S. It also has to list instructions: if we've done a git clone of R, we don't even have a clone of S yet. So R needs to contain the instructions so that your Git software can make a clone of S.
The instructions you give to git clone are pretty simple:
git clone <url> <path>
(where the path part is even optional, but here, R will always specify a path—using those forward slash path names we mentioned earlier). This set of instructions goes into a file named .gitmodules. The git submodule add command will set up this file in R for you. It's important to use it, to set up the .gitmodules file. Git will still make a submodule even if you don't set this up, but without the cloning instructions, the submodule won't actually work.
Note that there's no proper place to put authentication (user names and passwords) in here. That's a generic submodule issue. (You can put them in as plaintext in the .gitmodules file, but don't: it's a very bad idea, since they're not encrypted or protected.) As long as you have open access to cloning the submodule, this doesn't normally present any real problem. If you don't, you'll have to solve this problem somehow.
In any case, you will need, just once, to run:
git submodule add ...
(filling in the ... part) in what will thus become superproject R, so as to create the .gitmodules file. You then need to commit the resulting .gitmodules file, so that people who clone R and check out a commit that contains that file, get that file, so that their Git software can run the git clone command to create S on their system.
You'll also need to put S somewhere they can clone it. This, of course, means that first you need to create a Git repository to hold S. You do this the way you make any Git repository:
git init
or:
git clone
(locally, on your machine) along with whatever you do on whatever hosting site that creates the repository there.
Now that you have a local repository S, you need to put some commit(s) into it. What goes into these commits?
Well, you already said that you'd like your R to have a build/ directory (folder) in it, but not actually store any of the built files in any of the commits made in R. This is where submodules actually work. A submodule, in R, for S, works by saying: create me a folder here, then clone the submodule into the folder. Or, if the submodule repository already exists—as it will when you're setting all this up in the first place, with you just now having created S—you simply put that entire repository into your working tree for R, under the name build.
Note that build/.git will exist in R's working tree at this point. That's because a Git repository hides all the Git files in the .git directory (folder) at the top level of the working tree. So your new, empty S repository consists of just a .git/ containing Git files.
You can now run that git submodule add command in R, because now you have the submodule in place:
git submodule add <url> build
(You might want to wait just a little bit, but you can definitely do it at this point—and this is the earliest point at which you can do it, since up until now, S didn't exist or was not in the right place yet.)
You can now fill the build/ directory that lives in R's working tree with files, e.g., by running npm run build, or whatever it is that populates the build/ directory. Then you can:
(cd build; git add .)
or equivalent, so as to add the build output in S. You can now create the first commit in S, or maybe as the second commit in S if you like to create a README.md and LICENSE and such as your initial commit. You can now have branches in S as well, since you now have at least one commit in S.
Now that you're back in R though, it's time to git add build—or, if you chose to delay it, run that first git submodule add. In the future you'll use git add build. This directs the Git that is manipulating the index / staging-area for R to enter the repository S and run:
git rev-parse HEAD
to find the raw hash ID of the current commit in S.
The superproject's Git repository's index now acquires a new gitlink entry. A gitlink entry is like a regular file, except that instead of git checkout checking it out as a file, it provides a raw hash ID. That's basically all it is: a pathname—in this case, build/—and a raw hash ID.
This gitlink is like one of those read-only, compressed, and de-duplicated files that goes in a commit. It's just that instead of storing file data, it stores a commit hash ID. That hash ID is that of some commit in S, not some commit in R itself. But now that you've updated the index (or staging area) for R, you will need to make a new commit in R. The new commit will contain any updated files, plus the right hash ID for S, as found just now by the git add you ran (or that git submodule add ran for you).
The next commit you make in R (not in S) will list the hash ID of the current commit in S. So once you've committed the built files in S, you can git add them in R and git commit in R.
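The gitlink entry can be inspected with git ls-files --stage. The sketch below deliberately skips git submodule add and .gitmodules, so it shows only the index entry, not a working submodule setup; the directory names R and build and the file index.html are assumptions made for the demo.

```shell
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"
git init -q R
git init -q R/build                            # S lives inside R's working tree
echo "<html></html>" > R/build/index.html      # stand-in for real build output
git -C R/build add index.html
git -C R/build commit -q -m "built files"

git -C R add build 2>/dev/null                 # records a gitlink; Git prints a hint about embedded repositories
entry=$(git -C R ls-files --stage build)       # "<mode> <hash> <stage><TAB>build"
mode=${entry%% *}
s_head=$(git -C R/build rev-parse HEAD)
echo "$entry"
echo "S is at $s_head"
```

Mode 160000 marks the entry as a gitlink, and the hash next to it is the current commit of S rather than any object in R.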
The last and trickiest part
Now comes the last part, which—if you thought all of the above was complicated and tricky—is the trickiest:
You have to git push the submodule commit in S so that it's generally available. In general, you should do this first, though you don't actually have to.
Then you have to git push the superproject commit in R so that others can get it. When others get this commit from the other clone of R, they'll be able to see the right hash ID from S.
Then, if someone else—let's say your co-worker Bob—wants to get both the built files and the sources, they have to:
Obtain your new R commit.
Instruct their Git to check out the new R commit.
Instruct their Git to use the new checked out R commit to run git fetch in S so as to obtain the new S commit.
Instruct their Git to actually enter their clone of S and git checkout the correct commit.
They can do this all at once with git checkout --recurse-submodules, or by setting the submodule.recurse configuration option. Note what can go wrong though:
They might obtain your new R commit and check it out, but forget to update their S at all.
Or, they might obtain your new R commit and check it out and then try to check out the new commit in S without first running git fetch in their clone of S, so that they don't have the new commit.
Or, they might remember everything they should do, but someone forgot to push the new S commit to the shared repository people can get it from. They'll get an error about their submodule Git being unable to find the requested commit.
You can see how this can get pretty messy. It's very easy for the various separate commits to get de-synchronized in various ways. Once you have the procedures down, and have scripts around everything that make sure that all the steps happen at the right times, it can work pretty well. But there are many ways for things to go wrong.
To ignore specific folders or files, use a .gitignore file.
To push a specific folder as a "sub" Git repository to a different remote, initialize a new Git repository inside that folder by running "git init" there, then point it at the other remote with "git remote add".
Example:
git init
git add somefile
git commit -m "initial commit"
git remote add origin https://github.com/username/new_repo
git push -u origin master
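Put together, ignoring build/ in the main repository and giving build/ its own repository might look like the sketch below. The directory name app and the files index.js and main.js are invented for the demo, and the Glitch URL is left as a placeholder because the question does not give one.

```shell
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"
git init -q app && cd app
echo "build/" > .gitignore               # the main repository never sees build/
echo "console.log('hi')" > index.js
mkdir build
echo "bundled" > build/main.js           # stand-in for `npm run build` output
git add .
git commit -q -m "sources only"
tracked=$(git ls-files)                  # files recorded by the main repository

git -C build init -q                     # separate repository just for the output
git -C build add .
git -C build commit -q -m "built"
# git -C build remote add glitch <your Glitch git URL>
# git -C build push glitch HEAD
build_tracked=$(git -C build ls-files)   # files recorded by the build repository
```

One caveat to check for your setup: if the build tool empties the output folder before each build, build/.git would be removed along with it, and the inner repository would have to be recreated or kept elsewhere.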
Stop thinking about repository boundaries as anything substantial. The only important structure in Git is the history.
rm -rf build
git branch build $(git commit-tree -m 'Glitch project' `git mktree <&-`)
git worktree add build
git add build # you'll get a newbie warning here
git remote add glitch u://r/l
git config remote.glitch.push +refs/heads/build:refs/heads/build
and don't push the build branch to your main repo, you don't want that history there so don't push it there.
git config remote.origin.push : # this is "matching", see the docs
git config --add remote.origin.push ^refs/heads/build # don't match this
and now after a build completes and you like it enough to publish,
git -C build add .
git -C build commit -m "built $(date)"
git add build
git push glitch
When you clone from your github repo you'll get the history with a build entry in it, and checkout will create an empty directory there, but you won't have the build history itself. That's okay: if you want it, you can fetch it from someplace that does have it and then git worktree add it, or you can just not bother, git init build and redo the build locally.
1 "only" might seem a bit strong, but it's really not. Everything else is support, scaffolding, infrastructure, just there to help with inspecting, analyzing, extending history.
Use a .gitignore file in the repository root and list the files you don't want to push to GitHub.

git for-each-ref not showing default branch

I'm using this command in a bash script to build an array of current local branches:
for branch in $(git for-each-ref --no-merged dev --format='%(refname:short)' refs/heads/); do
    branches+=("$branch")
done
it returns all my local branches EXCEPT the default branch of the repo. What am I missing?
I have tried various other "patterns" according to the documentation (https://git-scm.com/docs/git-for-each-ref)
including refs/heads/* but none return the default branch. I confirmed it doesn't matter what branch I am checked out on, I can't get it to show up in any situation.
Any help is appreciated!
You're asking git which branches are not merged in dev.
It means that only branches with at least one commit not on dev will show up. dev can never be on that list, no matter what.
To list all your branches without a filter, drop the --no-merged condition:
git for-each-ref --format='%(refname:short)' refs/heads/
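The difference is easy to reproduce in a scratch repository. In this sketch the branch names main (playing the default branch), dev, and feature are assumptions chosen for the demo:

```shell
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main          # "main" plays the default branch
git commit -q --allow-empty -m "base"
git branch dev                                 # dev points at the same commit as main
git checkout -q -b feature
git commit -q --allow-empty -m "work not yet on dev"

# With the filter: only branches that have commits dev lacks.
filtered=$(git for-each-ref --no-merged dev --format='%(refname:short)' refs/heads/)
# Without it: every local branch.
all=$(git for-each-ref --format='%(refname:short)' refs/heads/)
echo "not merged into dev: $filtered"
echo "all branches:" $all
```

Here main is fully contained in dev, so the filtered list drops it, exactly as it drops dev itself.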

How to import tags from multiple folders?

I successfully ran subgit import on a large, old repository.
Later I discovered that there were two directories for tags: the default tags and tag.
I tried to edit the SubGit config file following the advice in "Does subgit support multiple 'branches' directories?"
Currently I have tags configured the following way:
tags = tags/*:refs/tags/*
tags = tag/*:refs/tags/tag/*
But now when I try to run the import command again, SubGit does nothing, as if everything were already up to date. What did I do wrong, or do I need to run subgit import from scratch?
Indeed, you have to start importing from scratch. You can run
$ subgit configure --svn-url SVN_URL repo.git
Then edit repo.git/subgit/config to specify
tags = tags/*:refs/tags/*
tags = tag/*:refs/tags/tag/*
Then
$ subgit install repo.git
and finally
$ subgit uninstall repo.git
to stop continuous synchronization. You can also use the "subgit import" command as a shortcut for "subgit install" + "subgit uninstall".
As a bonus, you'll have all SVN revision numbers saved in the refs/svn/map reference. To see revision numbers in "git log" output, you can set up your Git clients as recommended in the SubGit book or run the following command on the server:
$ git update-ref refs/notes/commits refs/svn/map

subgit import and multiple branches directories

I'm trying to do an import using subgit. Just a one-time migration. My SVN structure contains:
branches
    branch1
    features
        branch2
hotfixes
    branch3
I'd like to convert all three to branches in git. I've tried:
proj=myproject; subgit import --svn-url <correctPath>/$proj --authors-file \
    ~/authors --branches branches --branches branches/features \
    --branches hotfixes --tags tags $i
This seems to just use "hotfixes" as the only place to import from. (I'm using SubGit version 2.0.2 ('Patrick') build #2731.) I also tried using:
--branches "branches;branches/features;hotfixes"
But that completely failed (it was probably looking for a directory with semi-colons in it).
Any suggestion for the one-time import?
(Note, I saw this related question.)
You can use a combination of the 'configure' + 'install' + 'uninstall' commands. I assume your repository has the following structure:
$ svn ls --depth infinity <SVN_URL>
branches/
branches/branch1/
branches/branch2/
branches/features/
branches/features/feature1/
branches/features/feature2/
hotfixes/
hotfixes/hotfix1/
hotfixes/hotfix2/
tags/
tags/tag1/
tags/tag2/
trunk/
Then do the following. Run 'configure' command:
$ subgit configure --svn-url <SVN_URL> repo
Edit the repo/subgit/config file to match this repository structure (or you can invent your own refs/heads/ namespaces; the only requirement is that they shouldn't be the same for different kinds of branches. If you need a one-time import with everything under refs/heads/*, you can rename them later with a script):
trunk = trunk:refs/heads/master
branches = branches/*:refs/heads/*
branches = branches/features/*:refs/heads/features/*
branches = hotfixes/*:refs/heads/hotfixes/*
tags = tags/*:refs/tags/*
shelves = shelves/*:refs/shelves/*
Run 'install' command:
$ subgit install repo
Then if you run "git branch -a" from the "repo" directory, you'll see something like this:
$ git branch -a
branch1
branch2
features/feature1
features/feature2
hotfixes/hotfix1
hotfixes/hotfix2
* master
Optionally, you can run the 'uninstall' command to disable synchronization temporarily or permanently (with the --purge option):
$ subgit uninstall [--purge] repo
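The answer mentions renaming branches with a script after a one-time import. A minimal sketch of such a script follows, assuming you want refs/heads/features/* and refs/heads/hotfixes/* flattened to top-level branch names. The helper flatten_prefix is hypothetical (it is not part of SubGit or Git) and uses only git for-each-ref and git branch -m.

```shell
# flatten_prefix PREFIX: rename refs/heads/PREFIX/<name> to refs/heads/<name>.
# Hypothetical helper; run it from inside the imported Git repository.
# It stops at the first rename that fails, e.g. because <name> already exists.
flatten_prefix() {
  prefix=$1
  git for-each-ref --format='%(refname:short)' "refs/heads/$prefix/" |
  while read -r branch; do
    new=${branch#"$prefix"/}              # strip the leading "PREFIX/"
    git branch -m "$branch" "$new" || return 1
  done
}

# Usage, inside the repository:
#   flatten_prefix features
#   flatten_prefix hotfixes
```

Doing this after uninstalling SubGit keeps the renames from being translated back to SVN; with synchronization still enabled, the mapping in subgit/config would no longer match the new names.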

How to set up subgit to mirror an svn repo that looks like a Windows Explorer hierarchy?

Being windows users, we created one svn repo with a hierarchy of folders. The bottom nodes contain the svn standard layout:
ProjectA/
    ApplicationOne/
        ModuleX/
            trunk/
            branches/
            tags/
    ApplicationTwo/
        ModuleY/
            trunk/
            branches/
            tags/
... and so on ad infinitum. The repo now contains around 100+ real svn projects with the trunk/branches/tags structure, but almost none of them at the top level.
How would I configure subgit to handle this?
SubGit can work in two different modes: local mirror mode and remote mirror mode. Below you can find a general overview of these modes and some recommendations for your particular case.
Local Mirror Mode
In this mode both Subversion and Git repositories reside on the same host, so SubGit has local access to both SVN and Git sides.
Below I've provided basic instructions. Please find detailed documentation and common pitfalls in SubGit 'Local Mode' Book.
Configuration
subgit configure <SVN_REPO>
SubGit version <VERSION> build #<BUILD_NUMBER>
Detecting paths eligible for translation... done.
Subversion to Git mapping has been configured:
/ProjectA/ApplicationOne/ModuleX : <SVN_REPO>/git/ProjectA/ApplicationOne/ModuleX.git
/ProjectA/ApplicationTwo/ModuleY : <SVN_REPO>/git/ProjectA/ApplicationTwo/ModuleY.git
...
CONFIGURATION SUCCESSFUL
...
This command tries to auto-detect the repository layout and generates a configuration file at <SVN_REPO>/conf/subgit.conf. It may take a while for a big Subversion repository like yours.
Please make sure that the auto-generated configuration file looks as follows, and adjust it if necessary:
...
[git "ProjectA/ApplicationOne/ModuleX"]
translationRoot = /ProjectA/ApplicationOne/ModuleX
repository = git//ProjectA/ApplicationOne/ModuleX.git
pathEncoding = UTF-8
trunk = trunk:refs/heads/master
branches = branches/*:refs/heads/*
shelves = shelves/*:refs/shelves/*
tags = tags/*:refs/tags/*
...
Authors mapping
At this stage you have to create the <SVN_REPO>/conf/authors.txt file that maps existing SVN usernames to Git authors. Please refer to the documentation for more details.
Installation
Finally you have to import your Subversion repository to Git and enable synchronization by running subgit install command:
subgit install repo
SubGit version <VERSION> build #<BUILD_NUMBER>
Subversion to Git mapping has been found:
/ProjectA/ApplicationOne/ModuleX : <SVN_REPO>/git/ProjectA/ApplicationOne/ModuleX.git
/ProjectA/ApplicationTwo/ModuleY : <SVN_REPO>/git/ProjectA/ApplicationTwo/ModuleY.git
...
Processing '/ProjectA/ApplicationOne/ModuleX'
Translating Subversion revisions to Git commits...
Processing '/ProjectA/ApplicationTwo/ModuleY'
Translating Subversion revisions to Git commits...
...
Subversion revisions translated: <REVISIONS_NUMBER>.
Total time: <TIME_SPENT> seconds.
INSTALLATION SUCCESSFUL
Git Server
When the installation is over and synchronization between the Subversion and Git repositories is enabled, you can set up a Git server (or reuse an existing Apache HTTP server). Please refer to the documentation on that and see a couple of posts on this topic in our blog:
VisualSVN Server and SubGit
Gitolite and SubGit
Remote Mirror Mode
In this mode, SubGit is installed into the Git repository only and keeps that repository synchronized with a remote Subversion server hosted on a different machine.
Below you can find some basic instructions. Please refer to SubGit 'Remote Mode' Book for more details.
Configuration
In remote mirror mode SubGit does not try to auto-detect the repository layout, so you have to run the subgit configure --svn-url <SVN_URL> command for every module within the Subversion repository:
subgit configure --svn-url <SVN_ROOT_URL>/ProjectA/ApplicationOne/ModuleX <GIT_ROOT_DIR>/ProjectA/ApplicationOne/ModuleX.git
SubGit version <VERSION> build #<BUILD_NUMBER>
Configuring writable Git mirror of remote Subversion repository:
Subversion repository URL : <SVN_ROOT_URL>/ProjectA/ApplicationOne/ModuleX
Git repository location : <GIT_ROOT_DIR>/ProjectA/ApplicationOne/ModuleX.git
CONFIGURATION SUCCESSFUL
...
As a result, SubGit generates a configuration file <GIT_REPO>/subgit/config for every Git repository. For your case this configuration file should look as follows:
...
[svn]
url = <SVN_ROOT_URL>/ProjectA/ApplicationOne/ModuleX
trunk = trunk:refs/heads/master
branches = branches/*:refs/heads/*
tags = tags/*:refs/tags/*
shelves = shelves/*:refs/shelves/*
fetchInterval = 60
connectTimeout = 30
readTimeout = 60
keepGitCommitTime = false
auth = default
[auth "default"]
passwords = subgit/passwd
useDefaultSubversionConfigurationDirectory = false
subversionConfigurationDirectory = <SVN_CONFIG_DIR>
...
Authors mapping
At this stage you have to create the <GIT_REPO>/subgit/authors.txt file that maps existing SVN usernames to Git authors. Please refer to the documentation for more details.
SVN credentials
If you're not using the file:// protocol, you have to provide the necessary credentials so that SubGit is able to authenticate against the Subversion server. For more information on that, please read the corresponding chapter in the SubGit Book.
We also recommend enabling the pre-revprop-change hook on the Subversion side, which makes further installation and maintenance a bit easier; see the SubGit Book.
Installation
Finally you have to import your Subversion repository to Git and enable synchronization by running subgit install command:
subgit install git
SubGit version <VERSION> build #<BUILD_NUMBER>
Translating Subversion revisions to Git commits...
Subversion revisions translated: <REVISIONS_NUMBER>.
Total time: <TIME_SPENT> seconds.
INSTALLATION SUCCESSFUL
This command also launches a background process that polls the SVN server and fetches new revisions when they appear there. Basically, that means that SubGit uses a dedicated process for every Git repository. Sometimes it makes sense to avoid running such processes and use a job scheduler instead.
Git server
Those links I've provided above are relevant for remote mode as well.
However, if you're going to use Atlassian Stash for Git hosting, you can use SVN Mirror Plugin which is based on SubGit engine and provides some better experience with regards to UI and maintenance.
We have the following guideline which is based on our experience:
In case of many independent Subversion repositories, it's better to use SubGit in local mirror mode as it doesn't require SVN polling and maintaining additional process(es) for that.
In case of one giant Subversion repository with many modules, it's better to use remote mirror mode with file:// protocol and also adjust basic setup slightly.
It definitely doesn't make sense to run 100+ background processes in your case, instead we recommend installing additional post-commit SVN hook that checks what particular modules were modified by a given revision and then triggers synchronization for corresponding Git repositories.
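The "which modules were modified" half of such a hook can be sketched in plain shell. This is an illustration under stated assumptions, not SubGit's own hook: it assumes the usual svnlook changed output (status flags in the first four columns, then the path) and a layout where every module root sits directly above trunk/, branches/, or tags/. The function name changed_modules and the trigger_sync call in the usage comment are hypothetical; the command that actually starts synchronization is not given in the answer.

```shell
# changed_modules: read `svnlook changed` output on stdin and print each
# affected module root once (the path prefix above trunk/, branches/ or tags/).
changed_modules() {
  awk '{
    path = substr($0, 5)                  # drop the status columns
    n = split(path, part, "/")
    prefix = ""
    for (i = 1; i <= n; i++) {
      if (part[i] == "trunk" || part[i] == "branches" || part[i] == "tags") {
        if (prefix != "" && !(prefix in seen)) { seen[prefix] = 1; print prefix }
        break
      }
      prefix = (prefix == "" ? part[i] : prefix "/" part[i])
    }
  }'
}

# In a post-commit hook, Subversion passes the repository path and the revision:
#   svnlook changed -r "$2" "$1" | changed_modules | while read -r module; do
#     trigger_sync "$module"   # hypothetical: start sync for that module's Git repository
#   done
```

Paths that lie outside any trunk/branches/tags directory produce no output, so commits touching only such paths would trigger nothing.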
If you have any other questions, feel free to ask us here at Stack Overflow, at our issue tracker or contact us via email: support#subgit.com.
