How to filter git hook post-commit to a specific file? - githooks

Currently my post-commit hook fires on every commit, whatever files it contains. How can I make it fire only if a specific file, e.g. example.txt, has been committed? I'm currently looking at git diff for file filtering, though I believe there could be something more elegant.

git diff remains the normal approach for that.
See for instance "In a git post-commit hook how do I get a list of the files that were changed?"
git diff-tree -r --name-only --no-commit-id <tree-ish>
Once you have the list, you can grep for the file.
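A minimal sketch of how that could look inside the hook itself. The helper names list_has_path and changed_files are made up for this example, and the echo is only a placeholder for whatever the hook should really do:

```shell
#!/bin/sh
# Sketch of .git/hooks/post-commit. "example.txt" is the path from the question;
# list_has_path and changed_files are hypothetical helper names.

# Succeeds when the exact path $1 is one of the lines read from stdin.
list_has_path() {
    grep -qxF -- "$1"
}

# Names of the files touched by the commit that was just made.
# --root makes the very first commit of a repository list its files too.
changed_files() {
    git diff-tree -r --root --name-only --no-commit-id HEAD
}

if changed_files | list_has_path "example.txt"; then
    echo "example.txt was committed"   # placeholder for the real action
fi
```

The paths printed by git diff-tree are relative to the top of the repository, so a file in a subdirectory has to be given as dir/example.txt. Matching whole lines with grep -x avoids false hits such as example.txt.bak.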

"fatal: bad tree object" error when pull from remote branch [duplicate]

Whenever I pull from my remote, I get the following error about compression. When I run the manual compression, I get the same:
$ git gc
error: Could not read 3813783126d41a3200b35b6681357c213352ab31
fatal: bad tree object 3813783126d41a3200b35b6681357c213352ab31
error: failed to run repack
Does anyone know what to do about that?
From cat-file I get this:
$ git cat-file -t 3813783126d41a3200b35b6681357c213352ab31
error: unable to find 3813783126d41a3200b35b6681357c213352ab31
fatal: git cat-file 3813783126d41a3200b35b6681357c213352ab31: bad file
And from git fsck I get this (I don't know if it's actually related):
$ git fsck
error: inflate: data stream error (invalid distance too far back)
error: corrupt loose object '45ba4ceb93bc812ef20a6630bb27e9e0b33a012a'
fatal: loose object 45ba4ceb93bc812ef20a6630bb27e9e0b33a012a (stored in .git/objects/45/ba4ceb93bc812ef20a6630bb27e9e0b33a012a) is corrupted
Can anyone help me decipher this?
I had the same problem (don't know why).
This fix requires access to an uncorrupted remote copy of the repository, and will keep your local working copy intact.
But it has some drawbacks:
You will lose the record of any commits that were not pushed, and will have to recommit them.
You will lose any stashes.
The fix
Execute these commands from the parent directory above your repo (replace 'foo' with the name of your project folder):
Create a backup of the corrupt directory:
cp -R foo foo-backup
Make a new clone of the remote repository to a new directory:
git clone git@www.mydomain.de:foo foo-newclone
Delete the corrupt .git subdirectory:
rm -rf foo/.git
Move the newly cloned .git subdirectory into foo:
mv foo-newclone/.git foo
Delete the rest of the temporary new clone:
rm -rf foo-newclone
On Windows you will need to use:
xcopy /E /H /I instead of cp -R (plain copy does not recurse into subdirectories; /H includes hidden files)
rmdir /S instead of rm -rf
move instead of mv
Now foo has its original .git subdirectory back, but all the local changes are still there. git status, commit, pull, push, etc. work again as they should.
Your best bet is probably to simply re-clone from the remote repository (i.e., GitHub or other). Unfortunately you will lose any unpushed commits and stashed changes, however your working copy should remain intact.
First make a backup copy of your local files. Then do this from the root of your working tree:
rm -fr .git
git init
git remote add origin [your-git-remote-url]
git fetch
git reset --mixed origin/master
git branch --set-upstream-to=origin/master master
Then commit any changed files as necessary.
I was working in a VM on my notebook when the battery died, and I got this error:
error: object file .git/objects/ce/theRef is empty
error: object file .git/objects/ce/theRef is empty
fatal: loose object theRef (stored in .git/objects/ce/theRef) is corrupt
I managed to get the repo working again with only 2 commands and without losing my work (modified files/uncommitted changes)
find .git/objects/ -size 0 -exec rm -f {} \;
git fetch origin
After that I ran a git status, the repo was fine and there were my changes (waiting to be committed, do it now..).
git version 1.9.1
Remember to back up your changes first, just in case this solution doesn't work and a more radical approach is needed.
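Before deleting anything, it may help to look at which files would be removed. A small sketch, where list_empty_objects is a made-up helper name and not a Git command:

```shell
#!/bin/sh
# list_empty_objects is a hypothetical helper, not a Git command.
# Prints every zero-length file below the given object directory
# (default: .git/objects) without deleting anything.
list_empty_objects() {
    find "${1:-.git/objects}" -type f -size 0
}

# Review the list first, then remove the same files once it looks right:
#   list_empty_objects
#   find .git/objects -type f -size 0 -exec rm -f {} \;
```

If the list is much longer than expected, that is probably a sign to fall back to one of the re-clone approaches instead.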
Looks like you have a corrupt tree object. You will need to get that object from someone else. Hopefully they will have an uncorrupted version.
You could actually reconstruct it if you can't find a valid version from someone else by guessing at what files should be there. You may want to see if the dates & times of the objects match up to it. Those could be the related blobs. You could infer the structure of the tree object from those objects.
Take a look at Scott Chacon's Git Screencasts regarding git internals. This will show you how git works under the hood and how to go about doing this detective work if you are really stuck and can't get that object from someone else.
My computer crashed while I was writing a commit message. After rebooting, the working tree was as I had left it and I was able to successfully commit my changes.
However, when I tried to run git status I got
error: object file .git/objects/xx/12345 is empty
fatal: loose object xx12345 (stored in .git/objects/xx/12345) is corrupt
Unlike most of the other answers, I wasn't trying to recover any data. I just needed Git to stop complaining about the empty object file.
Overview
The "object file" is Git's hashed representation of a real file that you care about. Git thinks it should have a hashed version of some/file.whatever stored in .git/objects/xx/12345, and fixing the error turned out to be mostly a matter of figuring out which file the "loose object" was supposed to represent.
Details
Possible options seemed to be
Delete the empty file
Get the file into a state acceptable to Git
Approach 1: Remove the object file
The first thing I tried was just moving the object file
mv .git/objects/xx/12345 ..
That didn't work - Git began complaining about a broken link. On to Approach 2.
Approach 2: Fix the file
Linus Torvalds has a great writeup of how to recover an object file that solved the problem for me. Key steps are summarized here.
$> # Find out which file the blob object refers to
$> git fsck
broken link from tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
to blob xx12345
missing blob xx12345
$> git ls-tree 2d926
...
100644 blob xx12345 your_file.whatever
This tells you what file the empty object is supposed to be a hash of. Now you can repair it.
$> git hash-object -w path/to/your_file.whatever
After doing this I checked .git/objects/xx/12345, it was no longer empty, and Git stopped complaining.
Try
git stash
This worked for me. It stashes anything you haven't committed and that got around the problem.
A garbage collection fixed my problem:
git gc --aggressive --prune=now
It takes a while to complete, but every loose object and/or corrupted index was fixed.
Simply running git prune fixed this issue for me.
I encountered this after my system crashed. What I did is this:
(Please note your corrupt commits are lost, but changes are retained. You might have to recreate those commits at the end of this procedure)
Backup your code.
Go to your working directory and delete the .git folder.
Now clone the remote in another location and copy the .git folder from it.
Paste it in your working directory.
Commit as you wanted.
I just experienced this - my machine crashed whilst writing to the Git repo, and it became corrupted. I fixed it as follows.
I started with looking at how many commits I had not pushed to the remote repo, thus:
gitk &
If you don't use this tool it is very handy - available on all operating systems as far as I know. This indicated that my remote was missing two commits. I therefore clicked on the label indicating the latest remote commit (usually this will be /remotes/origin/master) to get the hash (the hash is 40 chars long, but for brevity I am using 10 here - this usually works anyway).
Here it is:
14c0fcc9b3
I then click on the following commit (i.e. the first one that the remote does not have) and get the hash there:
04d44c3298
I then use both of these to make a patch for this commit:
git diff 14c0fcc9b3 04d44c3298 > 1.patch
I then did likewise with the other missing commit, i.e. I used the hash of the commit before and the hash of the commit itself:
git diff 04d44c3298 fc1d4b0df7 > 2.patch
I then moved to a new directory, cloned the repo from the remote:
git clone git@github.com:username/repo.git
I then moved the patch files into the new folder, and applied them and committed them with their exact commit messages (these can be pasted from git log or the gitk window):
patch -p1 < 1.patch
git commit
patch -p1 < 2.patch
git commit
This restored things for me (and note there's probably a faster way to do it for a large number of commits). However I was keen to see if the tree in the corrupted repo can be repaired, and the answer is it can. With a repaired repo available as above, run this command in the broken folder:
git fsck
You will get something like this:
error: object file .git/objects/ca/539ed815fefdbbbfae6e8d0c0b3dbbe093390d is empty
error: unable to find ca539ed815fefdbbbfae6e8d0c0b3dbbe093390d
error: sha1 mismatch ca539ed815fefdbbbfae6e8d0c0b3dbbe093390d
To do the repair, I would do this in the broken folder:
rm .git/objects/ca/539ed815fefdbbbfae6e8d0c0b3dbbe093390d
cp ../good-repo/.git/objects/ca/539ed815fefdbbbfae6e8d0c0b3dbbe093390d .git/objects/ca/539ed815fefdbbbfae6e8d0c0b3dbbe093390d
i.e. remove the corrupted file and replace it with a good one. You may have to do this several times. Finally there will be a point where you can run fsck without errors. You will probably have "dangling commit" and "dangling blob" lines in the report, these are a consequence of your rebases and amends in this folder, and are OK. The garbage collector will remove them in due course.
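Since this may have to be repeated for several objects, the rm/cp pair can be wrapped in a small helper. This is only a sketch: the function names are invented, and it assumes the good clone still has the object as a loose file (if it has been packed there, the copy will not be found):

```shell
#!/bin/sh
# Hypothetical helpers; the "good" repository is whatever clone you trust.

# Print the loose-object path for a full object ID: the first two hex
# characters name the directory, the remaining ones name the file.
loose_object_path() {
    id=$1
    rest=${id#??}            # everything after the first two characters
    dir=${id%"$rest"}        # the first two characters
    printf '.git/objects/%s/%s\n' "$dir" "$rest"
}

# Replace one corrupt loose object in $1 (broken repo) with the copy
# from $2 (good repo). Fails if the good repo has no loose copy, for
# example because the object is stored in a pack there.
restore_loose_object() {
    broken=$1 good=$2
    path=$(loose_object_path "$3")
    [ -f "$good/$path" ] || { echo "no loose copy of $3 in $good" >&2; return 1; }
    rm -f "$broken/$path"
    cp "$good/$path" "$broken/$path"
}
```

For the example above this would be called as restore_loose_object . ../good-repo ca539ed815fefdbbbfae6e8d0c0b3dbbe093390d, followed by another git fsck.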
Thus (at least in my case) a corrupted tree does not mean unpushed commits are lost.
The solution offered by Felipe Pereira (above) in addition to Stephan's comment to that answer with the name of the branch I was on when the objects got corrupted is what worked for me.
find .git/objects/ -size 0 -exec rm -f {} \;
git fetch origin
git symbolic-ref HEAD refs/heads/${BRANCH_NAME}
The answer of user1055643 is missing the last step:
rm -fr .git
git init
git remote add origin your-git-remote-url
git fetch
git reset --hard origin/master
git branch --set-upstream-to=origin/master master
Running git stash; git stash pop fixed my problem.
I followed many of the other steps here; Linus' description of how to look at the git tree/objects and find what's missing was especially helpful (see "git-git recover corrupted blob").
But in the end, for me, I had loose/corrupt tree objects caused by a partial disk failure, and tree objects are not so easily recovered/not covered by that doc.
In the end, I moved the conflicting objects/<ha>/<hash> out of the way, and used git unpack-objects with a pack file from a reasonably up to date clone. It was able to restore the missing tree objects.
That still left me with a lot of dangling blobs, which can be a side effect of unpacking previously archived stuff, and which is addressed in other questions here.
I was getting a corrupt loose object error as well.
./objects/x/x
I successfully fixed it by going into the directory of the corrupt object. I saw that the owner of that object was not my git user. I don't know how it happened, but I ran chown git:git on that file and then it worked again.
This may be a potential fix for some people's issues, but not necessarily all of them.
To me this happened due to a power failure while doing a git push.
The messages looked like this:
$ git status
error: object file .git/objects/c2/38824eb3fb602edc2c49fccb535f9e53951c74 is empty
error: object file .git/objects/c2/38824eb3fb602edc2c49fccb535f9e53951c74 is empty
fatal: loose object c238824eb3fb602edc2c49fccb535f9e53951c74 (stored in .git/objects/c2/38824eb3fb602edc2c49fccb535f9e53951c74) is corrupt
I tried things like git fsck but that didn't help.
Since the crash happened during a git push, it obviously happened during rewrite on the client side which happens after the server is updated. I looked around and figured that c2388 in my case was a commit object, because it was referred to by entries in .git/refs. So I knew that I would be able to find c2388 when I look at the history (through a web interface or second clone).
On the second clone I did a git log -n 2 c2388 to identify the predecessor of c2388. Then I manually modified .git/refs/heads/master and .git/refs/remotes/origin/master to be the predecessor of c2388 instead of c2388.
Then I could do a git fetch.
The git fetch failed a few times for conflicts on empty objects. I removed each of these empty objects until git fetch succeeded. That has healed the repository.
We just had this case here. It turned out that the problem was the ownership of the corrupt file: it belonged to root instead of our normal user. This was caused by a commit done on the server after someone had done a "sudo su --".
First, identify your corrupt file with:
$> git fsck --full
You should receive an answer like this one:
fatal: loose object 11b25a9d10b4144711bf616590e171a76a35c1f9 (stored in .git/objects/11/b25a9d10b4144711bf616590e171a76a35c1f9) is corrupt
Go in the folder where the corrupt file is and do a:
$> ls -la
Check the ownership of the corrupt file. If that's different, just go back to the root of your repo and do a:
$> sudo chown -R YOURCORRECTUSER:www-data .git/
Hope it helps!
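To check the whole .git directory in one go rather than one folder at a time, something like the following could be used. list_foreign_files is a made-up helper name, and the sketch assumes the user running it is the one who should own the repository:

```shell
#!/bin/sh
# list_foreign_files is a hypothetical helper, not a Git command.
# Prints everything below the given directory (default: .git) that is
# not owned by the user running the script.
list_foreign_files() {
    find "${1:-.git}" ! -user "$(id -u)"
}
```

An empty result means ownership is probably not the cause; otherwise the printed paths are the ones the chown above needs to cover.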
I solved it this way:
I decided to simply copy the uncorrupted object file from the backup's clone to my original repository. This worked just as well. (By the way: if you can't find the object in .git/objects/ by its name, it has probably been packed to conserve space.)
I got this error after my (Windows) machine decided to reboot itself.
Thankfully my remote repository was up to date, so I just did a fresh Git clone...
This seems to be an issue with Dropbox or symlinking folders out of Dropbox for me. Probably the same for any of the other similar services. When I go to git push I'd get the Corrupt loose object error. For me, on macOS Big Sur, the fix was simply to make a recursive copy of the repo to a directory outside of Dropbox. I believe this caused Dropbox to pull the live files for the broken dynamic references. After the copy I was immediately able to git push without error.
I have had the same issue before.
I simply got past it by removing the object file from the .git/objects directory.
For the error below:
$ git fsck
error: inflate: data stream error (invalid distance too far back)
error: corrupt loose object '45ba4ceb93bc812ef20a6630bb27e9e0b33a012a'
fatal: loose object 45ba4ceb93bc812ef20a6630bb27e9e0b33a012a (stored in .git/objects/45/ba4ceb93bc812ef20a6630bb27e9e0b33a012a) is corrupted
Solution:
Go to your top directory and unhide the .git folder
On Windows, you can do this by running this command in cmd: attrib -s -h .git
Then go to the .git/objects folder.
As the error message above says (stored in .git/objects/45/ba4ceb93bc812ef20a6630bb27e9e0b33a012a), the object is in a directory called "45". Therefore, go to the directory .git/objects/45/
Finally find the object named ba4ceb93bc812ef20a6630bb27e9e0b33a012a and delete it.
Now, you can go ahead and check with git status or git add . your change and proceed.
I had this same problem in my bare remote git repo. After much troubleshooting, I figured out one of my coworkers had made a commit in which some files in .git/objects had permissions of 440 (r--r-----) instead of 444 (r--r--r--). After asking the coworker to change the permissions with "chmod 444 -R objects" inside the bare git repo, the problem was fixed.
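A slightly more cautious variant, as a sketch: the recursive chmod above also changes the mode of the directories, which need their execute bit to be entered, so the hypothetical helper below only touches regular files that are missing a read bit. It assumes, as in the case described, that every object file should be readable by everyone:

```shell
#!/bin/sh
# make_objects_readable is a hypothetical helper. It touches regular
# files only, so directories keep the execute bit they need to be entered.
make_objects_readable() {
    find "${1:-objects}" -type f ! -perm -444 -exec chmod a+r {} +
}
```

Run inside the bare repository it defaults to the objects directory; a different path can be passed as the first argument.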
I just had a problem like this. My particular problem was caused by a system crash that corrupted the most recent commit (and hence also the master branch). I hadn't pushed, and wanted to re-make that commit. In my particular case, I was able to deal with it like this:
Make a backup of .git/: rsync -a .git/ git-bak/
Check .git/logs/HEAD, and find the last line with a valid commit ID. For me, this was the second most recent commit. This was good, because I still had the working directory versions of the files, and so every version I wanted.
Make a branch at that commit: git branch temp <commit-id>
re-do the broken commit with the files in the working directory.
git branch -f master temp (run while temp is checked out) to move the master branch to the new commit you just made.
git checkout master and check that it looks right with git log.
git branch -d temp.
git fsck --full, and it should now be safe to delete any corrupted objects that fsck finds.
If it all looks good, try pushing. If that works,
That worked for me. I suspect that this is a reasonably common scenario, since the most recent commit is the most likely one to be corrupted, but if you lose one further back, you can probably still use a method like this, with careful use of git cherry-pick and the reflog in .git/logs/HEAD.
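For the step of reading .git/logs/HEAD, a small sketch that pulls out the candidate commit IDs. recent_reflog_ids is an invented helper name; it relies on each reflog line starting with the old commit ID followed by the new one:

```shell
#!/bin/sh
# recent_reflog_ids is a hypothetical helper. Each line of .git/logs/HEAD
# starts with the old commit ID followed by the new one; this prints the
# new ID from the last N lines (default 5), oldest first.
recent_reflog_ids() {
    tail -n "${2:-5}" "${1:-.git/logs/HEAD}" | awk '{ print $2 }'
}

# Usage idea: test each candidate, newest last, until one is readable:
#   git cat-file -e <commit-id> && echo "<commit-id> is intact"
```

The newest ID that passes the check is the one to hand to git branch temp.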
When I had this issue I backed up my recent changes (as I knew what I had changed) then deleted that file it was complaining about in .git/location. Then I did a git pull. Take care though, this might not work for you.
Create a backup and clone the repository into a fresh directory
cp -R foo foo-backup
git clone git@url:foo foo-new
(optional) If you are working on a branch other than master, switch to it.
cd foo-new
git checkout -b branch-name origin/branch-name
Sync changes excluding the .git directory
rsync -aP --exclude=.git foo-backup/ foo-new
This problem usually occurs when using various Git clients with different versions on the same Git checkout. Think of:
Command line
IDE built-in Git
Inside a Docker / VM container
Git GUI tool
Make sure you push with the same client that created the commits.
What I did to avoid losing other unpushed branches:
A reference to the broken object should be in refs/heads/<current_branch>. If you go to .git\logs\refs\heads\<current_branch> you can see that the last commit has the exactly same value. I copied the one from the previous commit to the first file and it solved the problem.
I had a similar issue on a Windows 10 computer with OneDrive backing up my Documents folder, where I have my git repositories.
Looking at the object in the git object directory I did not see a green checkmark but the blue sync icon for that file. All other object files appeared to have the green checkmark. Playing around, trying things, I tried selecting the option always keep this folder on this device but got an error: error 0x80071129 the tag present in the reparse point buffer is invalid.
This link (https://answers.microsoft.com/en-us/msoffice/forum/all/error-0x80071129-the-tag-present-in-the-reparse/b8011cee-98c5-4c33-ba99-d0eec7c535a0) suggests to run chkdsk /r /f as an admin to fix the issue (have to reboot computer). I did that and it fixed my issue.
I had the same issue after my Linux Mint machine crashed and I pressed the power button to shut down my laptop; that is why my .git was corrupt.
find .git/objects/ -empty -delete
After that, I got the error fatal: Bad object head. I just reinitialized my repository:
git init
And fetch from remote repo
git fetch
To check your git, use
git status
And it works again. I didn't lose my local changes, so I could commit without rewriting code.
I had the exact same error and managed to get my repo back without losing my changes.
I do not know if it could work for others as corruption reason can be multiple, but it's worth trying
I:
Made several backups of the corrupt git repository just in case
Cloned the latest pushed version from the remote repository
Copied all the files from the corrupt .git folder EXCEPT all files related to HEAD, FETCH_HEAD, ORIG_HEAD, etc.; the most important are the refs, objects, and index
Ended up with a valid history, but corrupt index, applied the solution from this post How to resolve "Error: bad index – Fatal: index file corrupt" when using Git
And my repository was back working ...
To make sure I did not push anything wrong, I cloned again from the remote, checked out the changes I wanted to save from the restored repository, and committed them fresh.

How do I commit/push my build folder into another git repository and not into the main repository?

So, I've recently made a React app which I have posted on GitHub. However, I would like to post the output (build folder after I run npm run build) to a Glitch application. Since all Glitch applications have a git repository, I thought that would be the best way to go about doing this. Here is my desired structure:
My main git repo, which pushes to GitHub. This repository ignores the build folder.
Another "sub" git repository, which only pushes the contents of build to Glitch.
I've seen people using submodules, but I can't figure out how to make my main git repo ignore the build folder and have the submodule just push the build folder.
I'm also confused about how to set up a submodule in general, so an example/explanation for that as well would be appreciated.
~ Ayush
I'm not entirely sure that you want a submodule here, but submodules will let you do what you are describing. Submodules are tricky, though. There's a reason people call them sob-modules. 😀
Long
First, it will help a great deal if you get your definitions—actors and actions—straight:
A repository does not push anything. It's just a collection of commits (plus some names; see the last point below).
Git (the software suite) creates and manipulates repositories, including the commits inside them.
The git push command pushes commits.
A commit is a thingy (technically, a commit object, but people use the term pretty loosely, hence the loose "thingy" term here 😀) with the following features:
It has a unique hash ID.
It stores files. Note that commits do not store folders, just files. These files have path names that include embedded slashes (always forward slashes, even if you extract the commit's files on a Windows system with its backward slashes). This becomes important later, but if you like, you can think of them as folders-full-of-files, as long as you remember that Git can't store an empty folder properly (because it only stores files). The files are stored as full snapshots, although they get compressed and, importantly, de-duplicated across all commits in a repository. So, the fact that typically some new commit re-uses 30,000 files from some previous commit doesn't matter: the re-used files take no space, because they're literally re-used.
It stores metadata, or information about the commit itself. This includes stuff like who made the commit and when, and a log message, and so on; and, crucial for Git's own operation, it also includes the raw hash ID of some set of earlier commits. Most commits store just one earlier-commit hash ID, which we (and Git) call the parent. This is how history works, in a Git repository: each commit remembers its parent.
It is completely read-only. No part of any commit can ever be changed. (This is what allows the de-duplication, and a lot of other Git magic.)
A repository also contains names—such as branch and tag names—that allow Git to find commits. This works by having one name store exactly one hash ID. For branch names, that stored hash ID is, by definition, the last commit in the branch. Since commits store parent hash IDs, Git can work backwards from whichever commit we decide to call "last in branch X": X~1 is the second-to-last in X, X~2 is the third-to-last, and so on.
The act of adding a new commit to a branch consists of the following steps:
You check out that commit (with git checkout or git switch) by checking out that branch (with the same command), so that this is now the current branch. This action fills in both Git's index—which holds your proposed next commit—and your working tree, where Git copies out all the files into a usable form. The internal, de-duplicated form is generally unusable to everything except Git itself.
You do some stuff in your working tree. Git has zero control or influence over this part, a lot of the time, since you'll be using your own editor or compiler or whatever. You can use Git commands here and then Git will be able to see what you did, but mostly, Git doesn't have to care, because we move on to step 3:
You run git add. This instructs Git to take a look at the updated working tree files. Git will copy these updated files back into Git's index (aka the staging area), in their updated form, re-compressing and de-duplicating them and generally making them ready for the next commit.
You run git commit. This packages up new metadata—your name, the current date and time, a log message, and so on—and adds the current commit's hash ID to make up the metadata for the new commit. The new commit's parent will thus be the current commit. Git then snapshots everything in the index at this time (which is why git checkout filled it in, in step 1, and then git add updated it in step 3), along with the metadata, to make the new commit. This gives the new commit its new hash ID, which is actually just a cryptographic checksum of the entire data set here.
It's at this point that the magic happens: git commit writes the new commit's hash ID into the current branch name. So now, the last commit on the branch is your new commit. This is how a branch grows, one commit at a time. No existing commit changes—none can change—but the new commit points back to what was the last commit, and is now the second-to-last commit. The branch name moves.
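The steps above can be tried out in a throwaway repository. This is only a sketch: the temporary directory, the file name and the Demo identity are all made up for the example, and it assumes git is installed:

```shell
#!/bin/sh
# Throwaway demo in a temporary directory; the file name and identity
# are made up for the example.
demo=$(mktemp -d)
git init -q "$demo"
cd "$demo" || exit 1

commit() {   # wrapper so the demo does not depend on a configured identity
    git -c user.name=Demo -c user.email=demo@example.com commit -q -m "$1"
}

echo one > file.txt
git add file.txt                  # copy the file into the index
commit "first"
first=$(git rev-parse HEAD)       # hash ID the branch name points to now

echo two > file.txt               # work in the working tree
git add file.txt                  # update the index
commit "second"                   # new snapshot plus metadata

second=$(git rev-parse HEAD)      # the branch name has moved
parent=$(git rev-parse HEAD~1)    # parent recorded in the new commit
echo "first=$first second=$second parent=$parent"
```

The hash IDs differ on every run, but parent should always equal first: the new commit points back at the old one, and only the branch name has moved.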
You really need to have all of these down pretty cold to make submodules work, because submodules actually use all of this stuff, but then violate some rules. Now it starts to get tricky. We also need to look more closely at git push, just for a moment.
git push: cross-connecting one Git repository with another
Making a new Git commit, in some Git repository, just makes a new snapshot-plus-metadata. The next trick is to get that commit into some other Git repository.
If we start with two otherwise-identical Git repositories, each has some set of commits and some branch names identifying the same last commit:
... <-F <-G <-H <--branch-name [in Repo A]
and the same in Repo B. But then, over in Repo A, we do:
git checkout branch-name
<do stuff>
git commit
which causes repo A to contain:
...--F--G--H--I <-- branch-name
(I get lazy and don't bother drawing the commit-to-commit arrows correctly here). New commit I—I, like H and G and F, stands in for some big ugly random-looking hash ID—points back to existing commit H. You might even make more than one new commit:
...--F--G--H--I--J <-- branch-name
Now you run git push origin branch-name, to send your new commits, in your repository, back to the "origin" repo (which we were calling "repo B" before, but let's call it origin now).
Your Git software suite ("your Git") calls up theirs. Your Git lists out the hash ID of your latest commit, i.e., commit J. Their Git checks in their repository, to see if they have J, by hash ID. They don't (because you just made it). So their Git tells your Git: OK, gimme! Your Git is now obligated to offer J's parent I. They check and don't have I either, so they ask for that one too. Your Git is now obligated to offer commit H. They check and—hey!—this time they do have commit H already, so they say: no thanks, I have that one already.
Your Git now knows not only that you must send commits J and I, but also which files they already have. They have commit H, so they must have commit G too, and commit F, and so on. They have all the de-duplicated files that go with those commits. So your Git software suite can now compute a minimal set of stuff to send them so that they can reconstruct commits I-J.
Your Git does so; that's the "counting" and "compressing" and so on that you see. Their Git receives this stuff, unpacks it, and adds the new commits to their repository. They now have:
...--F--G--H   <-- branch-name
            \
             I--J
in their Git repository. Now we hit a really tricky bit: How does a Git, in general, find a commit? The answer is always, ultimately, by its hash ID—but that just brings another question, which is: how does a Git find a hash ID? They look random.
We already said this earlier though: a Git (the software suite) often finds some specific commit in some specific repository through the use of a branch name. The branch name branch-name, in your repository, finds the last commit, which is now J. We'd like the same name in their repository to find the same last commit.
So, your Git software now asks their Git to set their repository's branch name branch-name to identify commit J. They will do this if you are allowed to do this. The "allowed" part can get arbitrarily complicated—sites like GitHub and Bitbucket add all kinds of permissions and rules here—but if we assume that it's OK, and that they'll do that, then they will end up with:
...--F--G--H--I--J <-- branch-name
in their repository, and your Git repository and their Git repository will be in sync again, at least for this particular branch name.
So that's how git push normally works: you make new commits, adding them on to the end of your branch, and then you send your new commits to some other Git, and ask their software to add the same commits to the end of a branch of the same name in their repository. (Whew!)
Submodules
A submodule, in Git, is little more than two separate, mostly-independent Git repositories. This of course needs a lot of explanation:
Why are they only "mostly" independent? (What does that even mean?)
If they're little more, what more are they?
First, like any repository, a submodule repository is a collection of commits, each with a unique hash ID. We (or Git at least) like to refer to one of the two repositories as the superproject and the other as the submodule. Both of these start with the letter S, which is annoying, and both words are long and klunky, so here I'll use R as the superproject Repository, and S as the Submodule.
(Side note: the hash IDs in R and S are independent from each other. Git tries pretty hard—and usually succeeds—at making hash IDs globally unique across every Git repository everywhere in the universe. So there's no need to worry about "contaminating" R with S IDs or vice versa. In any case we can just treat every commit hash ID as if it's totally unique. Normally, with a normal non-R non-S repository, we don't even have to care about IDs, as we just use names. But submodules make you have to be more aware of the IDs.)
What makes R a superproject in the first place is that it lists raw hash IDs from S. It also has to list instructions: if we've done a git clone of R, we don't even have a clone of S yet. So R needs to contain the instructions so that your Git software can make a clone of S.
The instructions you give to git clone are pretty simple:
git clone <url> <path>
(where the path part is even optional, but here, R will always specify a path—using those forward slash path names we mentioned earlier). This set of instructions goes into a file named .gitmodules. The git submodule add command will set up this file in R for you. It's important to use it, to set up the .gitmodules file. Git will still make a submodule even if you don't set this up, but without the cloning instructions, the submodule won't actually work.
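For orientation, this is roughly what an entry in .gitmodules looks like. The helper below is hypothetical and only prints the layout; the URL is a placeholder, and in practice git submodule add writes the real entry:

```shell
#!/bin/sh
# print_gitmodules_entry is a hypothetical helper that only illustrates the
# layout; in practice "git submodule add <url> <path>" writes the real entry.
print_gitmodules_entry() {   # $1 = path inside R, $2 = URL to clone S from
    printf '[submodule "%s"]\n\tpath = %s\n\turl = %s\n' "$1" "$1" "$2"
}

# Example with a placeholder URL:
print_gitmodules_entry build https://example.com/your-build-repo.git
```

An existing file can be read back with git config -f .gitmodules --get submodule.build.url, which is a handy way to check what a clone of R will be told to fetch.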
Note that there's no proper place to put authentication (user names and passwords) in here. That's a generic submodule issue. (You can put them in as plaintext in the .gitmodules file, but don't: it's a very bad idea, since they're not encrypted or protected.) As long as you have open access to cloning the submodule, this doesn't normally present any real problem. If you don't, you'll have to solve this problem somehow.
In any case, you will need, just once, to run:
git submodule add ...
(filling in the ... part) in what will thus become superproject R, so as to create the .gitmodules file. You then need to commit the resulting .gitmodules file, so that people who clone R and check out a commit that contains that file, get that file, so that their Git software can run the git clone command to create S on their system.
You'll also need to put S somewhere they can clone it. This, of course, means that first you need to create a Git repository to hold S. You do this the way you make any Git repository:
git init
or:
git clone
(locally, on your machine) along with whatever you do on whatever hosting site that creates the repository there.
Now that you have a local repository S, you need to put some commit(s) into it. What goes into these commits?
Well, you already said that you'd like your R to have a build/ directory (folder) in it, but not actually store any of the built files in any of the commits made in R. This is where submodules actually work. A submodule, in R, for S, works by saying: create me a folder here, then clone the submodule into the folder. Or, if the submodule repository already exists—as it will when you're setting all this up in the first place, with you just now having created S—you simply put that entire repository into your working tree for R, under the name build.
Note that build/.git will exist in R's working tree at this point. That's because a Git repository hides all the Git files in the .git directory (folder) at the top level of the working tree. So your new, empty S repository consists of just a .git/ containing Git files.
You can now run that git submodule add command in R, because now you have the submodule in place:
git submodule add <url> build
(You might want to wait just a little bit, but you can definitely do it at this point—and this is the earliest point at which you can do it, since up until now, S didn't exist or was not in the right place yet.)
You can now fill the build/ directory that lives in R's working tree with files, e.g., by running npm run build, or whatever it is that populates the build/ directory. Then you can:
(cd build; git add .)
or equivalent, so as to add the build output in S. You can now create the first commit in S, or maybe the second commit in S if you like to create a README.md and LICENSE and such as your initial commit. You can now have branches in S as well, since you now have at least one commit in S.
Now that you're back in R though, it's time to git add build—or, if you chose to delay it, run that first git submodule add. In the future you'll use git add build. This directs the Git that is manipulating the index / staging-area for R to enter the repository S and run:
git rev-parse HEAD
to find the raw hash ID of the current commit in S.
The superproject's Git repository's index now acquires a new gitlink entry. A gitlink entry is like a regular file, except that instead of git checkout checking it out as a file, it provides a raw hash ID. That's basically all it is: a pathname—in this case, build/—and a raw hash ID.
This gitlink is like one of those read-only, compressed, and de-duplicated files that goes in a commit. It's just that instead of storing file data, it stores a commit hash ID. That hash ID is that of some commit in S, not some commit in R itself. But now that you've updated the index (or staging area) for R, you will need to make a new commit in R. The new commit will contain any updated files, plus the right hash ID for S, as found just now by the git add you ran (or that git submodule add ran for you).
The next commit you make in R (not in S) will list the hash ID of the current commit in S. So once you've committed the built files in S, you can git add build in R and git commit in R.
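To see a gitlink for yourself, here is a minimal sketch using two throwaway repositories in a temporary directory. It uses a plain git add build rather than git submodule add, so no .gitmodules file is created; the repository names and the empty commit are just for illustration.

```shell
set -eu
tmp=$(mktemp -d)
cd "$tmp"

git init -q R                  # throwaway superproject R
cd R
git init -q build              # throwaway repository S, living at build/
git -C build -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "first commit in S"

# Stage the nested repository. Git prints an "embedded repository" hint
# on stderr, which is silenced here.
git add build 2>/dev/null

# The index entry for build is a gitlink: mode 160000 plus a commit hash ID.
entry=$(git ls-files --stage build)
mode=$(printf '%s\n' "$entry" | cut -d' ' -f1)
link=$(printf '%s\n' "$entry" | cut -d' ' -f2)
sub_head=$(git -C build rev-parse HEAD)

echo "index entry in R: $entry"
echo "HEAD commit in S: $sub_head"
```

The hash ID in the index entry is the one git rev-parse HEAD reports inside build, and the 160000 mode is what marks the entry as a gitlink rather than a file.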
The last and trickiest part
Now comes the last part, which—if you thought all of the above was complicated and tricky—is the trickiest:
You have to git push the submodule commit in S so that it's generally available. In general, you should do this first, though you don't actually have to.
Then you have to git push the superproject commit in R so that others can get it. When others get this commit from the other clone of R, they'll be able to see the right hash ID from S.
Then, if someone else—let's say your co-worker Bob—wants to get both the built files and the sources, they have to:
Obtain your new R commit.
Instruct their Git to check out the new R commit.
Instruct their Git to use the new checked out R commit to run git fetch in S so as to obtain the new S commit.
Instruct their Git to actually enter their clone of S and git checkout the correct commit.
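They can do most of this at once with git checkout --recurse-submodules, or by setting the submodule.recurse configuration option; running git submodule update --init --recursive after the checkout will also fetch a missing S commit. Note what can go wrong though: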
They might obtain your new R commit and check it out, but forget to update their S at all.
Or, they might obtain your new R commit and check it out and then try to check out the new commit in S without first running git fetch in their clone of S, so that they don't have the new commit.
Or, they might remember everything they should do, but someone forgot to push the new S commit to the shared repository people can get it from. They'll get an error about their submodule Git being unable to find the requested commit.
You can see how this can get pretty messy. It's very easy for the various separate commits to get de-synchronized in various ways. Once you have the procedures down, and have scripts around everything that make sure that all the steps happen at the right times, it can work pretty well. But there are many ways for things to go wrong.
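As a rough sketch of "scripts around everything", the two steps could be wrapped in small shell functions. The function names are made up; the Git options used (git push --recurse-submodules=check and git submodule update --init --recursive) are standard, but whether this exact flow fits your setup is an assumption.

```shell
# Author side: push R, but refuse if the R commits being pushed refer to
# submodule commits that have not been pushed to S's remote yet.
publish_superproject() {
    git push --recurse-submodules=check "$@"
}

# Consumer side (Bob): obtain and check out the new R commit, then fetch
# and check out the S commit that this R commit records.
sync_superproject() {
    git pull --ff-only "$@" &&
    git submodule update --init --recursive
}
```

Run both from the top level of R's working tree. With --recurse-submodules=check, a forgotten push in S shows up as an error on the author's machine instead of as a missing commit on Bob's.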
Yes, you can ignore the specific folder or files by using a .gitignore file.
To push that folder as its own "sub" repository to a different remote, initialize a new Git repository inside the folder by running "git init" and "git remote add".
Example:
git init
git add somefile
git commit -m "initial commit"
git remote add origin https://github.com/username/new_repo
git push -u origin master
Stop thinking about repository boundaries as anything substantial. The only important structure in Git is the history.
rm -rf build
git branch build $(git commit-tree -m 'Glitch project' `git mktree <&-`)
git worktree add build
git add build # you'll get a newbie warning here
git remote add glitch u://r/l
git config remote.glitch.push +refs/heads/build:refs/heads/build
and don't push the build branch to your main repo, you don't want that history there so don't push it there.
git config remote.origin.push : # this is "matching", see the docs
git config remote.origin.push --add ^refs/heads/build # don't match this
and now after a build completes and you like it enough to publish,
git -C build add .
git -C build commit -m "built $(date)"
git add build
git push glitch
When you clone from your github repo you'll get the history with a build entry in it, and checkout will create an empty directory there, but you won't have the build history itself. That's okay: if you want it, you can fetch it from someplace that does have it and then git worktree add it, or you can just not bother, git init build and redo the build locally.
1 "only" might seem a bit strong, but it's really not. Everything else is support, scaffolding, infrastructure, just there to help with inspecting, analyzing, extending history.
Use a .gitignore file in the repository root and list in it the files you don't want to push to GitHub.

I returned from 'detached HEAD' state and Git deleted my .env file from the local directory

I was checking my commit history to fix a bug.
I used
git checkout 4a3cf4ebfc8c3d4a7f8b055b3f38cc90acf2a0cd to switch between commits.
I was in 'detached HEAD' state.
In that state, I used some commands like:
git add .
git commit -m "massage"
Then I switched to a different commit with git checkout e255fb94e967c4c1463c25e44090bfc5a40b8463.
I returned from 'detached HEAD' state by using the command: git checkout master.
Now, my .env file is gone from the local directory.
I am creating a react app. I am using .gitignore file to ignore my API key in .env file.
Given some particular conditions (which must be the case here), this is normal.
We start with these facts:
Git stores commits. Git does not store files, nor branches, but rather entire commits. (Each commit does store files, but you work with an entire commit at a time. Branch names help you—and Git—find commits, because the commit numbers are so random-looking.)
Each commit has a unique number, which looks random, but isn't: it's the commit's hash ID. The hash ID is actually a cryptographic checksum of the contents of the commit. No part of any commit can ever be changed.
Each commit actually stores two things: a snapshot—a complete set of all of your files—and some metadata. The entire commit is completely read-only, and all the files in the commit are stored in a read-only, Git-only, compressed and de-duplicated form.
This in turn means that you cannot actually work with the files inside a commit. The files that you see and work with are copied out of some commit, into an area Git calls your working tree or work-tree.
Hence the git checkout command, when given a name like master or a raw commit hash like e255fb94e967c4c1463c25e44090bfc5a40b8463, has to copy the files that are stored (forever) in that commit (but are not usable by programs like ReactJS) into your work-tree (where they are usable by programs like ReactJS).
This means that the files you see and work with, in your work-tree, are not actually in Git in the first place. They're just copies of files from some commit in Git.
Now, suppose that the last commit of your master branch is a123456.... I made up this hash ID, but we can be pretty sure that whatever the actual hash ID is, it's not e255fb94e967c4c1463c25e44090bfc5a40b8463. So your Git has, inside it, two different commits:
e255fb94e967c4c1463c25e44090bfc5a40b8463: this commit has a .env file in it.
a123456...: this commit does not have a .env file in it.
When you check out commit e255fb94e967c4c1463c25e44090bfc5a40b8463, Git must extract the saved .env file to your work-tree.
When you switch back to a123456..., Git must remove that saved .env file to get it out of the way.
I am using .gitignore file to ignore my API key in .env file.
Unfortunately, the .gitignore file does not actually tell Git to ignore files. It can't: commit e255fb94e967c4c1463c25e44090bfc5a40b8463 has the .env file in it, and no part of any existing commit can ever be changed.
So, when you extract commit e255fb94e967c4c1463c25e44090bfc5a40b8463, Git copies the .env file out of that commit (into both Git's index or staging area, and your work-tree). If your .env file has valuable data in it, this copying-out process could destroy the valuable data.
Usually, when git checkout might destroy the contents of a file, Git will warn you about this, and refuse to do the git checkout until you have saved those file contents elsewhere. Unfortunately, one of the side effects of listing a file in .gitignore is that it sometimes (not always, but sometimes) gives Git permission to destroy the file's contents.
Perhaps, though, the .env file contents you had that were in your work-tree, but not in commit a123456..., matched the contents in commit e255fb94e967c4c1463c25e44090bfc5a40b8463. If that was the case, git checkout e255fb94e967c4c1463c25e44090bfc5a40b8463 left the contents in place, without actually destroying anything. Unfortunately Git will still remove the file when switching back, but you can, at this point, instruct Git to retrieve the contents of that one particular file from that one particular commit:
git show e255fb94e967c4c1463c25e44090bfc5a40b8463:.env > .env
for instance.
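The whole sequence can be reproduced in a throwaway repository. This is only a sketch of the mechanism: the file contents and commit messages are invented, and unlike your situation there is no untracked .env in the work-tree when the old commit is checked out.

```shell
set -eu
tmp=$(mktemp -d)
cd "$tmp"
git init -q demo
cd demo
git config user.name demo
git config user.email demo@example.com

# An old commit that (mistakenly) tracks .env.
echo "API_KEY=secret" > .env
git add .env
git commit -q -m "old commit that tracks .env"
old=$(git rev-parse HEAD)

# The branch tip no longer has .env and ignores it from now on.
git rm -q .env
echo ".env" > .gitignore
git add .gitignore
git commit -q -m "stop tracking .env"
branch=$(git symbolic-ref --short HEAD)

git checkout -q "$old"           # detached HEAD: Git extracts .env
git ls-tree --name-only "$old"   # lists .env, so this commit tracks it
git checkout -q "$branch"        # back on the branch: Git removes .env
if [ ! -e .env ]; then
    echo ".env was removed on switching back"
fi

# Retrieve that one file from that one commit.
git show "$old:.env" > .env
cat .env
```

After the final git show, .env is back in the work-tree as an untracked, ignored file, so git status stays clean.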

How does ClearCase identify hijacked files?

It is said that a hijacked file is a file whose "Read Only" flag has been removed.
I tried removing the "Read Only" flag (Windows), and ClearCase did not recognize the file as hijacked. Then I tried to touch the file using Cygwin without actually changing any mode flags. This time ClearCase warned me that the file was hijacked!
It seems ClearCase only looks at the timestamps of files, not at their content or their read-only flags. This mechanism has a very bad side effect when working in parallel with git. For example, if I do this:
git checkout bar
git checkout master
It would be the same as:
touch foo
Thus, ClearCase will think foo was hijacked, which is not the case. For huge projects this is a serious problem, and unfortunately I always use git to quickly switch back and forth in my snapshot view.
What would be a good solution in my case?
EDIT
A much more dangerous example would be this one:
stat -c 'touch --no-create -d "%y" "%n"' foo > restore_timestamp
echo "ClearCase will not see this" >> foo
source restore_timestamp
rm restore_timestamp
When I work in parallel between ClearCase and Git, I don't touch the git repo within ClearCase: I clone it elsewhere and work from there.
Actually, I don't create a git repo in the ClearCase view directly: I create it outside, adding to it all the files from the ClearCase view (using, just for the initial add: git --work-tree=/path/to/CC/view add .)
When it is time to synchronize the ClearCase snapshot view with the git working tree, I do a clearfsimport (as in this answer) from that working tree to the ClearCase view: only the modified files are checked out/updated and checked in.
That way, I completely bypass the "hijacked/not hijacked" issue.

Hook to create/add a database dump file to repository on git pull

My aim is to minimize the steps needed to locally clone my website + database.
I have a central git repository on a webserver and a local clone. When I pull updates on my local machine, I should not only get the latest file versions from the remote repository; a script should also run on the webserver to dump the live database and add the dump to the repository before the pull is delivered.
My guess is that I need the following actions to happen on the remote machine when I fire git pull on the local machine prior to delivering the repository:
Create database dump file, e.g. dump.sql (by executing mysqldump)
Add dump.sql to repository
Commit dump.sql to repository
… and only then deliver the pull to the local machine.
What kind of git hook should I use for this?
I'd also appreciate any additional experience with such a scenario.
git help hooks lists the types of hooks and how they work, but there isn't a hook that you can use to do what you want (you'd need something like pre-upload that would be executed by git-upload-pack).
However, you could create a wrapper script around git-upload-pack on the server that performs the necessary actions and then executes the real git-upload-pack command:
find the git-upload-pack executable
rename it to git-upload-pack.real
create a new script called git-upload-pack somewhere in PATH that does the following:
    use the arguments to find the Git repository
    cd into the Git repository
    if hooks/pre-upload exists and is an executable file:
        run it
        if the hook exited with a non-zero status:
            print an error message to standard error
            exit with a non-zero return value
    run git-upload-pack.real with the original command-line arguments
create a hooks/pre-upload script in your Git repository that does whatever you want
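The steps above could be sketched as the following shell function. Treat it as a starting point under some assumptions, not a tested server setup: pre-upload is not a hook Git knows about, REAL_UPLOAD_PACK is a made-up variable pointing at the renamed binary, and the repository is taken to be the last command-line argument. The hook's standard output is redirected to standard error so that it cannot mix with the data git-upload-pack sends to the client.

```shell
#!/bin/sh
# Hypothetical wrapper around git-upload-pack. REAL_UPLOAD_PACK and the
# pre-upload hook name are illustrative assumptions, not part of Git.
REAL_UPLOAD_PACK=${REAL_UPLOAD_PACK:-git-upload-pack.real}

upload_pack_wrapper() {
    [ "$#" -gt 0 ] || return 1

    # Assume the repository directory is the last argument.
    for repo in "$@"; do :; done

    # Resolve the Git directory (works for bare and non-bare repositories).
    git_dir=$(git -C "$repo" rev-parse --absolute-git-dir) || return 1
    hook="$git_dir/hooks/pre-upload"

    if [ -f "$hook" ] && [ -x "$hook" ]; then
        # Run the hook inside the repository; keep its stdout off the wire.
        if ! (cd "$git_dir" && "$hook" >&2); then
            echo "pre-upload hook failed for $repo" >&2
            return 1
        fi
    fi

    # Hand over to the real program with the original arguments.
    "$REAL_UPLOAD_PACK" "$@"
}

# When installed as the git-upload-pack script, end the file with:
#   upload_pack_wrapper "$@"
```

The hooks/pre-upload script itself would then run mysqldump and commit dump.sql, exiting non-zero if the dump fails so that the fetch is refused rather than served stale.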

Resources