Undoing git rebase mistakes - rebase

I never had too much trouble with rebase, mainly because I tend to be careful about the amount of code and scope of each commit. But while working with my peers to merge changes from some legacy projects, we had a major problem using a rebase-first approach (because of the large sets of changes in the commits). So this got me thinking about how to solve some problems that seem very common in this situation.
Ok now, please consider that I'm currently doing a rebase and I have applied half of my commits so far. I'm now applying this next commit and resolving some conflicts. I have three main questions:
1) How do I redo the rebase for this single wrongly merged file?
2) How do I redo the rebase for all files inside the commit I'm currently applying, if I made more than one mistake merging, deleting, or adding files?
3) How do I go back to commits already applied in this rebase if I realize I made a mistake merging a file or two some commits back?
PS: I'm aware of git reflog and the ORIG_HEAD pointer. I want to make this work while preserving the state of the git rebase operation. I don't know if there is an easier way around this.

OK, just thinking out loud: I guess you might --abort the rebase operation, go back to the rebased revision that you would like to correct, and then run rebase again, specifying a new segment of revisions to apply. Let's suppose that you have branch A that was started from master and has 20 revisions. You are already rebasing A~10, and you just noticed that A~15 was not correctly rebased. This is what I would do:
git rebase --abort # stop rebase
git reflog # find the rebased revision of A~15 on top of master
git checkout rebased-revision-for-A~15
# correct the revision
git add .
git commit --amend --no-edit # correct the rebased revision
# continue with the process
git rebase --onto HEAD A~15 A
That way you can continue as you were doing, only with a detour.

This is (a) a hard problem in general, and (b) shot through with personal preferences, which makes it really tough to have a good general solution.
The way to think about solving it is to remember that rebase copies commits. We need a way to establish some kind of user friendly mapping between multiple copies of commits.
That is, suppose we have:
       O1--O2--O3    <-- branch#{1} (as originally developed with original commits)
      /
...--M1--M2--M3--M4    <-- mainline
               \   \
                \   S1--S2    <-- HEAD (in middle of second rebase)
                 \
                  R1--R2--R3    <-- branch (after first rebase)
The mapping here is that O1, R1, and S1 are all somehow "equivalent", even if their patch IDs don't match and/or there's a mistake in R1 and/or S1. Similarly, O2, R2, and S2 are "equivalent" and O3 and R3 are "equivalent" (there is no S3).
Git does not offer a mechanism to go back to S1. You can fuss with S2 all you like, using git commit --amend to make an S2a whose parent is S1, but your only built-in option is to keep going or abort entirely. If you keep going, eventually the name branch will be peeled off R3 and pasted onto S3, and branch#{1} becomes branch#{2}, with branch#{1} and ORIG_HEAD remembering R3.
Git also does not offer a solid mechanism to mark any of the O/R/S commits as "equivalent". The closest you get is git patch-id. And of course, if you've used squash or fixup operations during the rebase, what you really want is something fancier, e.g., "R2 is equivalent to Ox squashed with Oy" or whatever.
You can use reflogs, or set a branch or tag name, to be able to recover the hash ID of commit S2 from which you can find S1. That allows you to make your own rebase-like command that works by cherry-picking S1 and stopping for amend, then cherry-picking S2 and going on to cherry-pick R3. But you'll only know how to do this in general with an equivalence mapping.
How to proceed from here is up to you: you'll be building your own tools. You can use git rev-list to get the hash IDs of selected commits. Just be sure that if there are branch-and-merge operations within the sequence of commits to be cherry-picked, you have used --topo-order to get consistent ordering.
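A very rough sketch of such a detour, with placeholder hash IDs (M4_hash, S1_hash, S2_hash, R3_hash are assumptions you would recover from the reflog or from a tag set beforehand):
# sketch only: every *_hash below is a placeholder, not a real ID
git checkout --detach M4_hash       # start again on top of the new mainline tip
git cherry-pick S1_hash             # replay the first copied commit ...
git commit --amend                  # ... and fix the mistake in it
git cherry-pick S2_hash R3_hash     # replay the remaining commits in order
git branch -f branch                # finally move the branch name here
For longer ranges you would generate the list of hash IDs with git rev-list --reverse --topo-order instead of typing them by hand.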

First of all: READ THE ENTIRE ANSWER AND UNDERSTAND IT BEFORE RUNNING THE COMMANDS. There is a git reset --hard in the middle of the answer, and if you run it blindly YOU MAY LOSE YOUR WORK.
What I usually do is
Create patches for each commit that you want to pick. Suppose that you're 3 commits ahead of master; I would do something like this:
# Generate patch for the committed code so we don't lose code
git format-patch -3
Check the patches and make sure they contain the code that you expect. The above command will generate three files, 0001-something.patch, 0002-something.patch and 0003-something.patch, where something is the commit message of each commit. With this code living on the filesystem I'm sure that I will not lose it. Then I do a hard reset.
** THIS IS DANGEROUS, MAKE SURE THAT THE PATCHES ARE OKAY **
git reset --hard origin/master
Then I apply the patches
git apply 0001-something.patch
git apply 0002-something.patch
git apply 0003-something.patch
A better solution would be to check out the master commit and cherry-pick your commits, but at this point I don't know how to override my branch's commits; if someone knows how to do it, that would be better.
Sometimes it is easier to create another branch and cherry-pick the right commits, but if you already have a pull request open and there is discussion on it, you may want to keep the same branch and override its commits. I don't know how to do this with git checkout + git cherry-pick.
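A hedged sketch of that checkout + cherry-pick route (the names rebuilt, mybranch and origin, and the placeholder commit IDs, are assumptions): build the clean history on a temporary branch, then force your existing branch to point at it. This keeps the same branch (and the open pull request), at the cost of a forced push.
# sketch only: commitA/commitB/commitC are placeholders for the commits you want to keep
git checkout -b rebuilt origin/master          # start fresh from master
git cherry-pick commitA commitB commitC        # replay only the commits you want
git branch -f mybranch                         # make the old branch point at the rebuilt history
git checkout mybranch
git push --force-with-lease origin mybranch    # overwrite the branch behind the pull request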

Related

How to avoid running Snakemake rule after input or intermediary output file was updated

Even if the output files of a Snakemake build already exist, Snakemake wants to rerun my entire pipeline only because I have modified one of the first input or intermediary output files.
I figured this out by doing a Snakemake dry run with -n, which gave the following report for an updated input file:
Reason: Updated input files: input-data.csv
and this message for updated intermediary files:
reason: Input files updated by another job: intermediary-output.csv
How can I force Snakemake to ignore the file update?
You can use the option --touch to mark them up to date:
--touch, -t
Touch output files (mark them up to date without
really changing them) instead of running their
commands. This is used to pretend that the rules were
executed, in order to fool future invocations of
snakemake. Fails if a file does not yet exist.
Beware that this will touch all your files and thus modify the timestamps to put them back in order.
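For example, something like this should work (the --cores value is an assumption; recent Snakemake versions require it to be given):
snakemake --touch --cores 1    # mark the existing outputs as up to date
snakemake --cores 1            # later runs no longer rebuild them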
In addition to Eric's answer, see also the ancient flag to ignore timestamps on input files.
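A minimal Snakefile sketch using ancient (the file names come from the question; the rule body is an assumption): marking the input with ancient() makes Snakemake ignore its timestamp, so the rule is not re-run just because input-data.csv was modified.
rule make_intermediary:
    input:
        ancient("input-data.csv")
    output:
        "intermediary-output.csv"
    shell:
        "cp {input} {output}"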
Also note that the Unix command touch can be used to modify the timestamp of an existing file and make it appear older than it actually is:
touch --date='2004-12-31 12:00:00' foo.txt
ls -l foo.txt
-rw-rw-r-- 1 db291g db291g 0 Dec 31 2004 foo.txt
In case --touch didn't work out as expected (the official documentation says it has to be combined with --force, --forceall or --forcerun if it doesn't work by itself), ancient is not an option or would require modifying too much of the workflow file, or you hit https://github.com/snakemake/snakemake/issues/823 (that's what happened to me when I tried --force and --force*), here is what I did to solve it:
I noticed that there were jobs that shouldn't be running, since I had already put files in the expected paths.
I identified the input and output files of the rules that I didn't want to run.
Following the order in which those rules would be executed, I ran touch on the input files and then on the output files (taking the order of the rules into account; see the commands sketched below).
That's it. Since the timestamps now follow the rule order and the input/output relationships, Snakemake will not detect any "updated" files.
This is the manual method, and I think it is the last resort if the methods mentioned in the other answers don't work or are not an option for some reason.
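Using the file names from the question as an illustration, that manual ordering boils down to:
touch input-data.csv             # inputs first
touch intermediary-output.csv    # then the outputs, following the order in which the rules run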

Patch edited file having original and new file

Basically I have three files: the original file, a file edited by me, and a new file (the original file edited by its owner). I need to apply the changes made in the new file to my edited file without losing my changes. Can I do this?
Note: Running linux.
Suppose you have a text file original_file
There is one true candidate
and it is A
And you have copied it to my_file and added a line to
my_file so it looks like
There is one true candidate
and it is A
B would not cut it
Now you learned that the owner of original_file has also edited it and you have copied the new version to new_file that looks like
There is one true candidate
and it is C
Fortunately the change did not add new lines where you have added yours, so the conflict between my_file and new_file can be trivially resolved.
You create a patch using the diff command. The options -Naur are commonly used: -u selects the unified patch format, -N treats missing files as empty, -a treats all files as text, and -r recurses into directories.
diff -Naur original_file my_file > my_file.patch
Now you apply a patch to the new_file using patch
patch new_file my_file.patch
The console output would be something like
patching file new_file
Hunk #1 succeeded at 1 with fuzz 2.
This updates the new_file so it now looks like
There is one true candidate
and it is C
B would not cut it
By default the file new_file.orig is also created, which is an unchanged backup copy of new_file.
When it fails
Patch does a good job of trying to make a sensible change while accounting for minor modifications. Sometimes it fails. Sometimes it produces inconsistent results.
Suppose the new_file was
There is one true candidate
and it is C
B is also good
Applying your patch to this file would also succeed resulting in
There is one true candidate
and it is C
B would not cut it
B is also good
This does not look consistent. It is your responsibility to check for the inconsistencies and fix them when they appear. Fortunately they do not appear often.

How to take a recursive snapshot of a btrfs subvol?

Assume that a btrfs subvol named "child-subvol" lies inside another subvol, say "root-subvol". If we take a snapshot of "root-subvol", then "child-subvol" should also get snapshotted.
Since recursive snapshot support is not yet available in the btrfs file system, how can this be achieved alternatively?
Step 1:
Get all the residing btrfs sub-volumes, preferably in sorted order as produced by the command below.
$ btrfs subvolume list --sort=-path <top_subvol>
Step 2:
In the order obtained, perform the delete/snapshot operation on each sub-volume (a scripted sketch follows below).
$ btrfs subvolume delete <subvol-name>
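A rough sketch of the snapshot variant of that loop (the mount point /mnt/top and the destination /mnt/snap are assumptions, and it assumes subvolume paths contain no spaces):
# sketch only: snapshot every subvolume under /mnt/top into /mnt/snap, keeping relative paths
btrfs subvolume list --sort=path /mnt/top | awk '{print $NF}' | while read -r sub; do
  mkdir -p "/mnt/snap/$(dirname "$sub")"
  btrfs subvolume snapshot "/mnt/top/$sub" "/mnt/snap/$sub"
done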
I've been wondering this too and haven't been able to find any recommended best practices online. It should be possible to write a script to create a snapshot that handles the recursion.
As Peter R suggests, you can write a script. However, if you want to send the subvolume it must be marked as readonly, and you can't snapshot recursively into readonly volumes.
To solve that you can use btrfs-property (found through this answer) in the script that handles the recursion, making it mark the snapshots read-only after all snapshots are taken, so you can send them.
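For example (the snapshot and backup paths are assumptions), after the recursive snapshots have been taken:
btrfs property set -ts /mnt/snap/child-subvol ro true        # flip the snapshot to read-only
btrfs send /mnt/snap/child-subvol | btrfs receive /backup/   # now it can be sent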
Alternatively, you can do
cp -a --reflink=always /path/to/root_subvol/ /path/to/child_subvol/
(--reflink=auto never worked for me before; using =always can also help you catch errors, since it fails instead of silently falling back to a full copy)
It should be fast, and afaik with the same advantages as a snapshot, although you don't keep the old subvolume structure.

ClearCase: Is it possible to deliver or rebase selectively?

When delivering stream A to stream B, is it possible only to deliver selected elements (directories to be precise) from A to B?
When rebasing a stream A from a baseline B, is it possible only to rebase selected elements (directories to be precise) from B to A?
With ClearCase UCM:
what you are delivering are baselines or activities
what you are rebasing are baselines only.
(and only baselines coming from the direct parent Stream, at that).
So if your directories or elements are the only items in an activity, and that activity doesn't depend on other activities (which can happen when a deliver to another Stream has already been done: all present activities are "linked together" by a technical baseline), then you can deliver just those items (by delivering only that activity).
If your directories and files are the only difference between the source baseline and the foundation baseline you are about to change on the Stream you are rebasing, you can rebase just those items.
But the fact is: it is difficult to make partial deliveries or rebases with ClearCase.
cleartool findmerge does exactly what you are looking for. You'd need to build a wrapper (ANT/Perl) around it if your list is long.
So go to the target stream/view context and run findmerge srcdir -type d -merge -print to test, then replace -print with -exec, -gmerge, -abort, etc. as you need. Just replace srcdir with your directory, or with an iterative list/variable/array in your script.
For complete details, look at http://www.ipnom.com/ClearCase-Commands/findmerge.html
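A minimal wrapper sketch along those lines (the directory names are assumptions; run it from the target stream's view context, and depending on your setup you may also need -ftag or -fversion to name the merge source):
# sketch only: preview the merges for each directory, then switch -print to -merge
for dir in comp/dirA comp/dirB; do
  cleartool findmerge "$dir" -type d -print
done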

Clear case exclusive checkout

Is there a way to enable exclusive checkouts on clear case?
I want that when I work on a file, no one else will be able to check it out.
TY
You just check out "reserved". Anyone else who checks out the same file will get an "unreserved" version. You will then be guaranteed the right to check in a version which creates the successor to the current version, whereas anyone else with an "unreserved" checkout will not. This is actually a much better system than exclusive checkouts.
ClearCase supports both:
"soft" pessimistic lock: checkout reserved
optimistic lock: (unreserved checkouts)
The advantage of checkout reserved is that it does not prevent another person from working on the same file; he/she will simply have to wait for your checkin before merging his/her work with your new version.
See about reserved/unreserved checkouts
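For example (the file name and comments are assumptions):
cleartool checkout -reserved -c "fixing parser" foo.c       # you take the reserved checkout
cleartool checkout -unreserved -c "also editing" foo.c      # a colleague's later checkout falls back to unreserved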
That said, you could add a post-op trigger (post-checkout) which would check whether the file already has a checked-out version, undo the checkout, and exit with a message, preventing the second user from checking out the same file at all.
cleartool mktrtype -element -all -postop checkout \
-execwin "\\path\to\checkIfNotCo.pl" \
-execunix "/path/to/checkIfNotCo.pl" \
-c "check if not CheckedOut" notco_trigger
You would still need to write checkIfNotCo.pl, but as Paul mentions in his answer, this is not really needed.
If it is a really sensitive file, you could lock it.
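For example (the element and user names are assumptions), this locks the element for everyone except you, so no one else can check it out or create new versions:
cleartool lock -nusers your_login sensitive_file.c@@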
