Basically I have three files: the original file, my edited copy of it, and a new file (the original file edited by its owner). I need to apply the changes made in the new file to my edited file without losing my own changes. Can I do this?
Note: I'm running Linux.
Suppose you have a text file original_file
There is one true candidate
and it is A
And you have copied it to my_file and added a line to
my_file so it looks like
There is one true candidate
and it is A
B would not cut it
Now you learned that the owner of original_file has also edited it and you have copied the new version to new_file that looks like
There is one true candidate
and it is C
Fortunately the change did not add new lines where you have added yours, so the conflict between my_file and new_file can be trivially resolved.
You create a patch using the diff command. The options -Naur are commonly used: -u produces the unified patch format, -a treats all files as text, -N treats missing files as empty, and -r recurses into directories (harmless for single files).
diff -Naur original_file my_file > my_file.patch
Now you apply the patch to the new_file using patch
patch new_file my_file.patch
The console output would be something like
patching file new_file
Hunk #1 succeeded at 1 with fuzz 2.
This updates the new_file so it now looks like
There is one true candidate
and it is C
B would not cut it
The file new_file.orig is also created; it is an unchanged backup copy of new_file (by default, patch keeps a backup of a file whose hunks did not apply exactly, as happened here with the fuzz).
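If you want to review exactly what the patch changed, you can compare that backup against the patched file:
diff -u new_file.orig new_file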
When it fails
Patch does a good job of trying to make a sensible change while accounting for minor modifications. Sometimes it fails. Sometimes it produces inconsistent results.
Suppose the new_file was
There is one true candidate
and it is C
B is also good
Applying your patch to this file would also succeed resulting in
There is one true candidate
and it is C
B would not cut it
B is also good
This does not look consistent. It is your responsibility to check for such inconsistencies and fix them when they appear. Fortunately, they do not appear often.
Related
I have a set of configuration files (10 or more). If a user opens any of these files with any editor (e.g. vim, vi, geany, qt, leafpad, ...), how can I find out from C code which file was opened and, if something was written to it, whether it was saved or not?
For the 1st part of your question, please refer e.g. to How to check if a file has been opened by another application in C++?
One way described there is to use a system tool like lsof and invoke it via a system() call.
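As a rough sketch (assuming lsof is installed and that the path contains no shell metacharacters; file_is_open is just a made-up helper name):
#include <stdio.h>
#include <stdlib.h>

/* Sketch: returns nonzero if lsof reports that some process currently has the file open. */
static int file_is_open(const char *path)
{
    char cmd[512];
    snprintf(cmd, sizeof cmd, "lsof -- '%s' > /dev/null 2>&1", path);
    return system(cmd) == 0;  /* lsof exits with status 0 when it finds at least one open instance */
}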
For the 2nd part, about knowing whether a file has been modified, you will have to create a backup file to check against. Most editors already do that, but their naming schemes differ, so you might want to take care of it yourself. How? Automatically create a (hidden) file .mylogfile.txt, if it does not exist, by simply copying mylogfile.txt. If .mylogfile.txt exists, has an older timestamp than mylogfile.txt, and differs in size and/or hash value (using e.g. md5sum), your file was modified.
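A minimal sketch of that check using stat() (the hash comparison is omitted; the function name is made up):
#include <sys/stat.h>

/* Sketch: 1 if 'file' looks modified compared to its backup copy, 0 if not, -1 on error. */
static int file_was_modified(const char *file, const char *backup)
{
    struct stat f, b;
    if (stat(file, &f) != 0 || stat(backup, &b) != 0)
        return -1;                       /* one of the files does not exist */
    return f.st_mtime > b.st_mtime &&    /* the backup is older ... */
           f.st_size  != b.st_size;      /* ... and the size changed (same-size edits need a hash check) */
}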
But before re-implementing this, take a look at How do I make my program watch for file modification in C++?
Even if the output files of a Snakemake build already exist, Snakemake wants to rerun my entire pipeline just because I have modified one of the initial input files or intermediary output files.
I figured this out by doing a Snakemake dry run with -n, which gave the following report for an updated input file:
Reason: Updated input files: input-data.csv
and this message for updated intermediary files:
reason: Input files updated by another job: intermediary-output.csv
How can I force Snakemake to ignore the file update?
You can use the option --touch to mark them up to date:
--touch, -t
Touch output files (mark them up to date without
really changing them) instead of running their
commands. This is used to pretend that the rules were
executed, in order to fool future invocations of
snakemake. Fails if a file does not yet exist.
Beware that this will touch all your files and thus modify the timestamps to put them back in order.
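For example, something like this should mark the existing file from the question as up to date instead of rebuilding it (recent Snakemake versions may also require --cores 1):
snakemake --touch intermediary-output.csv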
In addition to Eric's answer, see also the ancient() marker, which makes Snakemake ignore the timestamps of the input files it is applied to.
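A minimal sketch of a rule using it (the rule name and shell command are made up; the file names are the ones from the question):
rule make_intermediary:
    input:
        ancient("input-data.csv")    # changes to this file's timestamp no longer trigger a rerun
    output:
        "intermediary-output.csv"
    shell:
        "cp {input} {output}"        # placeholder command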
Also note that the Unix command touch can be used to modify the timestamp of an existing file and make it appear older than it actually is:
touch --date='2004-12-31 12:00:00' foo.txt
ls -l foo.txt
-rw-rw-r-- 1 db291g db291g 0 Dec 31 2004 foo.txt
In case --touch didn't work out as expected (the official documentation says it needs to be combined with --force, --forceall or --forcerun if it doesn't work by itself), ancient() is not an option or would require changing too much of the workflow file, or you ran into https://github.com/snakemake/snakemake/issues/823 (that's what happened to me when I tried --force and --force*), here is what I did to solve the problem:
I noticed that there were jobs that shouldn't be running, since I had already put files in the expected paths.
I identified the input and output files of the rules that I didn't want to run.
Following the order in which those rules would be executed, I ran touch on the input files first and then on the output files (taking the order of the rules into account!), as sketched below.
That's it. Since the timestamps are now consistent with the rule order and with the input/output relationships, Snakemake will not detect any "updated" files.
This is the manual method, and I think it is the last resort if the methods mentioned by the others don't work or are not an option for some reason.
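With the file names from the question, that boils down to something like (inputs first, then the outputs derived from them):
touch input-data.csv
touch intermediary-output.csv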
I'm using text files as a database for saving users' information for a game which I made using SWI-Prolog. The information is saved like this: user(Name,Password,Age,Points). What I want to do is to change a user's Points without having to rewrite the entire db. In other words, I am looking for something that will work like retractall(user(Name,_,_,_)), but with the text file. I know how to find the specific user using read/2, and how to assert a new fact using write/2, but I don't know how to delete one specific line in the text file.
Thank you for helping.
Take a look at SWI-Prolog's library(persistency). It removes a fact by appending a line stating that the fact has been removed. If the file gets too big with added/removed lines, it provides db_sync/1 to write a clean file. OS file system operations do not allow removing part of a file (except truncating at the end). The normal way to do this is to write a new file and, if that succeeds, rename it over the existing one, so nothing is lost if you crash while writing the new file.
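A minimal sketch of how that could look for the user/4 facts from the question (the journal file name and the field types are just an assumption):
:- use_module(library(persistency)).

:- persistent user(name:atom, password:atom, age:integer, points:integer).

attach_user_db(File) :-                 % call once at startup, e.g. attach_user_db('users.journal')
    db_attach(File, []).

set_points(Name, NewPoints) :-          % update a user's points without rewriting the whole file
    user(Name, Password, Age, _),
    retractall_user(Name, _, _, _),     % recorded as a "retract" line in the journal file
    assert_user(Name, Password, Age, NewPoints).

compact :-                              % occasionally rewrite a clean copy of the database
    db_sync(gc).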
I am working with a text file, which contains a list of processes under my programs control, along with relevant data.
At some point, one of the processes will finish, and thus will need to be removed from the file (as its no longer under control).
Here is a sample of the file contents (which has entries added "randomly"):
PID=25729 IDLE=0.200000 BUSY=0.300000 USER=-10.000000
PID=26416 IDLE=0.100000 BUSY=0.800000 USER=-20.000000
PID=26522 IDLE=0.400000 BUSY=0.700000 USER=-30.000000
So for example, if I wanted to remove the line that says PID=26416.... how could I do that, without writing the file over again?
I can use external unix commands, however I am not very familiar with them so please if that is your suggestion, give an example.
Thanks!
Either you keep the contents of the file in memory and then rewrite the file, or you have one file per PID with the relevant information in it and simply delete that file when the process is no longer running, or you use a database for this instead.
As others have already pointed out, your only real choice is to rewrite the file.
The obvious way to do that with "external UNIX commands" would be grep -v "PID=26416" (or whatever PID you want to remove, obviously).
Edit: It is probably worth mentioning that if the lines are all the same length (as you've shown here) and order doesn't matter, you could delete a line more efficiently by copying the last line into the space being vacated, then shortening the file to eliminate what had been the last line. This will only work if they really are all the same length, though (e.g., if you got a PID of '1', you'd need to pad it to the same length as the others in the file).
The only way is by copying each character that comes after the deleted line down over the characters that are deleted.
It is far more efficient to simply rewrite the file.
how could I do that, without writing the file over again?
You cannot. Filesystems (perhaps besides more esoteric record-based ones) do not support insertion or deletion in the middle of a file.
So you'll have to write the lines to a temporary file up to the line you want to delete, skip over that line, and write the rest of the lines to the temporary file as well. When done, rename/copy the temp file to the original filename.
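A rough C sketch of that approach (the file name and the PID to drop are hard-coded for illustration):
#include <stdio.h>
#include <string.h>

int main(void)
{
    FILE *in  = fopen("myfile", "r");
    FILE *out = fopen("myfile.tmp", "w");
    char line[256];

    if (in == NULL || out == NULL)
        return 1;

    while (fgets(line, sizeof line, in) != NULL)
        if (strstr(line, "PID=26416") == NULL)    /* copy every line except the one to delete */
            fputs(line, out);

    fclose(in);
    fclose(out);
    return rename("myfile.tmp", "myfile") != 0;   /* replace the original in one step */
}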
Why are you maintaining these in a text file? That's not the best model for such a task. But, if you're stuck with it ... if these lines are guaranteed to all be the same length (it appears that way from the sample), and if the order of the lines in the file doesn't matter, then you can write the last line over the line for the process that has died and then shorten the file by one line with the (f)truncate() call if you're on a POSIX system: see Jonathan Leffler's answer in How to truncate a file in C?
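A sketch of that fixed-length trick (the record length and helper name are assumptions; the stream must be opened in "r+" mode, and error handling is kept minimal):
#include <stdio.h>
#include <unistd.h>          /* ftruncate, fileno (POSIX) */

#define RECORD_LEN 54        /* hypothetical fixed line length, including the '\n' */

/* Sketch: overwrite record number 'victim' with the last record, then drop the last record. */
static int delete_record(FILE *f, long victim, long record_count)
{
    char last[RECORD_LEN];

    fseek(f, (record_count - 1) * RECORD_LEN, SEEK_SET);
    if (fread(last, 1, RECORD_LEN, f) != RECORD_LEN)
        return -1;

    fseek(f, victim * RECORD_LEN, SEEK_SET);
    fwrite(last, 1, RECORD_LEN, f);       /* copy the last record over the one being deleted */
    fflush(f);

    return ftruncate(fileno(f), (record_count - 1) * RECORD_LEN);  /* shorten by one record */
}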
But note carefully netrom's answer, which gives three different better ways to maintain this info.
Also, if you stick with a text file (preferably written from scratch each time from data structures you maintain, as per netrom's first suggestion), and you want to be sure that the file is always well formed, then write the new data into a temp file on the same device (putting it in the same directory is easiest) and then do a rename() call, which is an atomic operation.
You can use sed:
sed -i.bak -e '/PID=26416/d' test
-i is for editing in place. The suffix .bak makes it keep a backup copy of the original file with the extension .bak
-e is for specifying the expression. The d after the pattern indicates that all lines matching the pattern should be deleted.
test is the filename
The unix command for it is:
grep -v "PID=26416" myfile > myfile.tmp
mv myfile.tmp myfile
The grep -v part outputs the file without the rows with the search term.
The > myfile.tmp part creates a new temp file for this output.
The mv part renames the temp file to the original file.
Note that we are rewriting the file here and, moreover, we can lose data if someone writes something to the file between the two commands.
For a particular piece of homework, I'm implementing a basic data storage system using sequential files under standard C, which cannot load more than 1 record at a time. So, the basic part is creating a new file where the results of whatever we do with the original records are stored. The previous file's renamed, and a new one under the working name is created. The code's compiled with MinGW 5.1.6 on Windows 7.
Problem is, this particular version of the code (I've got nearly-identical versions of this floating around my functions) doesn't always remove the old file, so the rename fails and hence the stored data gets wiped by the fopen().
FILE *archivo, *antiguo;
remove("IndiceNecesidades.old"); // This randomly fails to work in time.
rename("IndiceNecesidades.dat", "IndiceNecesidades.old"); // So rename() fails.
antiguo = fopen("IndiceNecesidades.old", "rb");
// But apparently it still gets deleted, since this turns out null (and I never find the .old in my working folder after the program's done).
archivo = fopen("IndiceNecesidades.dat", "wb"); // And here the data gets wiped.
Basically, any time the .old already exists, there's a chance it is not removed in time for the rename() to succeed. There are no possible name conflicts, either internally or externally.
The weird thing is that it only happens with this particular file. Identical snippets, except with the name changed to Necesidades.dat (which appear in 3 different functions), work perfectly fine.
// I'm yet to see this snippet fail.
FILE *antiguo, *archivo;
remove("Necesidades.old");
rename("Necesidades.dat", "Necesidades.old");
antiguo = fopen("Necesidades.old", "rb");
archivo = fopen("Necesidades.dat", "wb");
Any ideas on why would this happen, and/or how can I ensure the remove() command has taken effect by the time rename() is executed? (I thought of just using a while loop to force call remove() again so long as fopen() returns a non-null pointer, but that sounds like begging for a crash due to overflowing the OS with delete requests or something.)
So suddenly, after reading Scott's mention of permissions, I thought about "Permission denied" and applied some Google. It turned out to be a pretty common, if obscure, error.
caf was right, it was in another piece of code. Namely, I had forgotten to fclose that same file in the function meant to show the contents. Since I wasn't tracking that particular detail, it appeared to be random.
Disclaimer: Weekly math assignments make for very little sleep time. ¬¬
That sounds quite strange, and even more so when you say that the same code works OK with a different filename - I would strongly suspect a bug elsewhere in your code. However, you should be able to work around it by renaming the file you want to remove:
rename("IndiceNecesidades.old", "IndiceNecesidades.older");
remove("IndiceNecesidades.older");
rename("IndiceNecesidades.dat", "IndiceNecesidades.old");
It would probably be a good idea to check the remove() function for errors. man remove says that the function returns 0 on success and -1 on failure, setting errno to record the error. Try replacing the call with
if (remove("IndiceNecesidades.old") != 0){
perror("remove(\"IndiceNecesidades.old\") failed");
}
which should give an error message saying what failed.
Further, it doesn't appear that the remove() is necessary. From man rename:
The rename() system call causes the link named old to be renamed as new. If new exists, it is first removed. Both old and new must be of the same type (that is, both must be either directories or non-directories) and must reside on the same file system.
The rename() system call guarantees that an instance of new will always exist, even if the system should crash in the middle of the operation.
If the final component of old is a symbolic link, the symbolic link is renamed, not the file or directory to which it points.
EPERM will be returned if:
[EPERM] The directory containing old is marked sticky, and neither the containing directory nor old are owned by the effective user ID.
[EPERM] The new file exists, the directory containing new is marked sticky, and neither the containing directory nor new are owned by the effective user ID.
So the next step would be to check that you have permissions on the containing directory.
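In the same spirit as the remove() check above, you can make rename() report its own failure reason (a minimal sketch):
if (rename("IndiceNecesidades.dat", "IndiceNecesidades.old") != 0) {
    perror("rename(\"IndiceNecesidades.dat\", \"IndiceNecesidades.old\") failed");
}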