I am writing a simple encryption program that takes a given file and writes the encrypted data to a temporary file. I am now looking for the most efficient way to replace the original file with its encrypted counterpart.
I know I could just fopen the original with "w" and copy the encrypted file into it line by line, but I was wondering if there is a more efficient way, like repointing the original file's hard link at the ciphered file, sparing me the need to rewrite the entire file?
On Linux, you could use mv.
And if the two files are not in the same directory, mv would be the better choice for several reasons, one being that mv can be given an option so there is no prompt when a file is overwritten, i.e.:
mv -f tempfile original_newfile
The result will be that tempfile no longer exists and its contents are now available under the original file's name.
Note: mv manipulates the hard links to do its work.
As suggested by @Chris-Turner and explained by @Jabberwocky, using rename() works fine.
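For reference, a minimal sketch of doing the same thing from within a C program using rename() (the paths are placeholders):

#include <stdio.h>

int main(void)
{
    /* On POSIX systems, rename() atomically replaces the target if it
       already exists, provided both paths are on the same filesystem. */
    if (rename("tempfile", "original_newfile") != 0) {
        perror("rename");
        return 1;
    }
    return 0;
}

This is essentially what mv does under the hood when source and target live on the same filesystem.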
I want to read a .gz file (text.gz) of about 300 MB and search for a pattern in it. I opened the file in binary mode using fopen with "rb" and stored its contents in a buffer. When I search for a pattern that I know exists in the text, the result is wrong. When I debug the program, the elements of the buffer differ from what I expect. Do I have to read and store this kind of file in some other way?
You might try using zlib and gzread to read the file.
http://zlib.net/manual.html
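A minimal sketch of what that looks like (assuming the data is text; note that a search this naive misses matches that straddle a read boundary, and "pattern" stands in for whatever you are searching for):

#include <stdio.h>
#include <string.h>
#include <zlib.h>

int main(void)
{
    /* gzopen/gzread decompress transparently, so the buffer holds the
       original text rather than the raw deflate bytes you get from fopen. */
    gzFile f = gzopen("text.gz", "rb");
    if (f == NULL) {
        fprintf(stderr, "cannot open text.gz\n");
        return 1;
    }

    char buf[4096 + 1];
    int n;
    while ((n = gzread(f, buf, sizeof buf - 1)) > 0) {
        buf[n] = '\0';                     /* terminate for strstr */
        if (strstr(buf, "pattern") != NULL)
            puts("found");
    }
    gzclose(f);
    return 0;
}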
Try this.
gunzip -c file.gz | grep <pattern>
If the program exits without managing to read the file, a very common problem is that the file is still open in Notepad or whatever else is using it, and the file I/O fails because it cannot access the file. Make sure nothing else has that file open before you test your program.
There are n files of varying sizes. How can the contents of all the files be efficiently appended into a single file?
What techniques or algorithms would help? Basically, I am looking for an efficient way to achieve this in C.
Start simple. Multithreading will introduce significant complexity, and won't necessarily make things run any faster. Pseudocode time (a C sketch follows it):
Create a new file "dest" in write-only mode.
For each file "source" you want to append:
    Open "source" in read-only mode
    For each line "L" in "source":
        Write "L" to "dest"
    Close "source"
Close "dest"
BTW, this is dead simple (and near-optimal) to implement using simple command-line Linux tools (cat, etc.), though of course that isn't exactly portable to Windows. One-liner example:
for i in `find . -type f -name "*.txt"`; do cat "$i" >> result.out; done
(Find every .txt file in the current directory and below, and append it to result.out.)
Go through and find the total size of all of the files.
Then allocate an output file of that size, go through them again and write the data to your output.
Since I don't know what the contents of the files are or the purpose of appending them, this solution might not be the best if it's just text or something. However, I'd probably find a zip library to use (either licensed or open source), then just zip all the files into a single archive.
zlib looks interesting: http://www.zlib.net/
Get the size Sn of each file and calculate the total size T of all the files.
Create the dest file.
Use mmap to map the dest file with size T; you will get a pointer P to the start address of the memory-mapped region.
mmap each source file into memory and copy its data into the region above, in order.
After that, you will have the dest file holding all the data from all the files.
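A rough Linux-oriented sketch of those steps (error handling kept minimal, and it assumes no file changes size between the two passes):

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    /* Usage: mmap_concat dest source1 [source2 ...] */
    if (argc < 3)
        return 1;

    /* Pass 1: compute the total size T of all source files. */
    off_t total = 0;
    struct stat st;
    for (int i = 2; i < argc; i++) {
        if (stat(argv[i], &st) != 0) { perror(argv[i]); return 1; }
        total += st.st_size;
    }

    /* Create the destination and extend it to T so it can be mapped. */
    int dfd = open(argv[1], O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (dfd < 0 || ftruncate(dfd, total) != 0) { perror("dest"); return 1; }
    if (total == 0) { close(dfd); return 0; }   /* nothing to copy */

    char *out = mmap(NULL, total, PROT_READ | PROT_WRITE, MAP_SHARED, dfd, 0);
    if (out == MAP_FAILED) { perror("mmap dest"); return 1; }

    /* Pass 2: map each source read-only and memcpy it into place. */
    off_t off = 0;
    for (int i = 2; i < argc; i++) {
        int fd = open(argv[i], O_RDONLY);
        if (fd < 0 || fstat(fd, &st) != 0) { perror(argv[i]); return 1; }
        if (st.st_size > 0) {               /* mmap of length 0 fails */
            char *in = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
            if (in == MAP_FAILED) { perror("mmap src"); return 1; }
            memcpy(out + off, in, st.st_size);
            off += st.st_size;
            munmap(in, st.st_size);
        }
        close(fd);
    }

    munmap(out, total);
    close(dfd);
    return 0;
}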
I am working with a text file which contains a list of processes under my program's control, along with relevant data.
At some point, one of the processes will finish and will thus need to be removed from the file (as it's no longer under control).
Here is a sample of the file contents (entries are added "randomly"):
PID=25729 IDLE=0.200000 BUSY=0.300000 USER=-10.000000
PID=26416 IDLE=0.100000 BUSY=0.800000 USER=-20.000000
PID=26522 IDLE=0.400000 BUSY=0.700000 USER=-30.000000
So, for example, if I wanted to remove the line that says PID=26416..., how could I do that, without writing the file over again?
I can use external UNIX commands; however, I am not very familiar with them, so if that is your suggestion, please give an example.
Thanks!
Either you keep the contents of the file in memory and then rewrite the file, or you have a file for each PID with the relevant information in it and simply delete that file when the process is no longer running. Or you could use a database instead.
As others have already pointed out, your only real choice is to rewrite the file.
The obvious way to do that with "external UNIX commands" would be grep -v "PID=26416" (or whatever PID you want to remove, obviously).
Edit: It is probably worth mentioning that if the lines are all the same length (as you've shown here) and order doesn't matter, you can delete a line more efficiently by copying the last line into the space being vacated, then shortening the file to eliminate what had been the last line. This only works if the lines really are all the same length, though (e.g., if you got a PID of '1', you'd need to pad it to the same length as the others in the file).
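A rough sketch of that trick, assuming fixed-length records (rec_len includes the newline) and POSIX ftruncate(); delete_record and the 256-byte cap are placeholders of mine, and error checks are trimmed:

#include <stdio.h>
#include <unistd.h>

/* Delete record idx from a file of fixed-size records by moving the
   last record into its slot and cutting one record off the end.
   Record order is not preserved. Returns 0 on success. */
static int delete_record(const char *path, long idx, long rec_len)
{
    FILE *f = fopen(path, "rb+");
    if (f == NULL)
        return -1;

    fseek(f, 0, SEEK_END);
    long nrec = ftell(f) / rec_len;

    char buf[256];                               /* assumes rec_len <= 256 */
    fseek(f, (nrec - 1) * rec_len, SEEK_SET);    /* read the last record   */
    fread(buf, 1, rec_len, f);
    fseek(f, idx * rec_len, SEEK_SET);           /* overwrite the victim   */
    fwrite(buf, 1, rec_len, f);
    fflush(f);                                   /* flush before truncating */

    int rc = ftruncate(fileno(f), (nrec - 1) * rec_len);
    fclose(f);
    return rc;
}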
The only way is by copying each character that comes after the deleted line down over the characters that are deleted.
It is far more efficient to simply rewrite the file.
how could I do that, without writing the file over again?
You cannot. Filesystems (perhaps apart from more esoteric record-based ones) do not support insertion or deletion.
So you'll have to write the lines to a temporary file up to the line you want to delete, skip over that line, and write the rest of the lines to the temp file. When done, rename/copy the temp file to the original filename.
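As a sketch (remove_matching_lines and the fixed temp name are placeholders, and lines are assumed to fit in 512 bytes):

#include <stdio.h>
#include <string.h>

/* Copy path to a temp file, skipping every line that contains needle,
   then rename the temp file over the original. */
static int remove_matching_lines(const char *path, const char *needle)
{
    FILE *in = fopen(path, "r");
    FILE *out = fopen("tmpfile", "w");
    if (in == NULL || out == NULL) {
        if (in) fclose(in);
        if (out) fclose(out);
        return -1;
    }

    char line[512];
    while (fgets(line, sizeof line, in) != NULL)
        if (strstr(line, needle) == NULL)
            fputs(line, out);

    fclose(in);
    fclose(out);
    return rename("tmpfile", path);   /* atomic on POSIX, same filesystem */
}

Calling remove_matching_lines("myfile", "PID=26416") would then drop the finished process's line.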
Why are you maintaining these in a text file? That's not the best model for such a task. But if you're stuck with it: if these lines are guaranteed to all be the same length (it appears that way from the sample), and if the order of the lines in the file doesn't matter, then you can write the last line over the line for the process that has died and then shorten the file by one line with the (f)truncate() call if you're on a POSIX system; see Jonathan Leffler's answer in "How to truncate a file in C?"
But note carefully netrom's answer, which gives three better ways to maintain this info.
Also, if you stick with a text file (preferably written from scratch each time from data structures you maintain, as per netrom's first suggestion), and you want to be sure that the file is always well formed, then write the new data into a temp file on the same device (putting it in the same directory is easiest) and then do a rename() call, which is an atomic operation.
You can use sed:
sed -i.bak -e '/PID=26416/d' test
-i is for editing in place; it also creates a backup file with the extension .bak.
-e is for specifying the expression; the trailing d deletes every line matching the pattern.
test is the filename
The unix command for it is:
grep -v "PID=26416" myfile > myfile.tmp
mv myfile.tmp myfile
The grep -v part outputs the file without the rows with the search term.
The > myfile.tmp part creates a new temp file for this output.
The mv part renames the temp file to the original file.
Note that we are rewriting the file here, and moreover, we can lose data if someone writes to the file between the two commands.
I have a program that accepts two file names as arguments: it reads the first file in order to create the second file. How can I ensure that the program won't overwrite the first file?
Restrictions:
The method must keep working when the file system supports (soft or hard) links
File permissions are fixed and it is only required that the first file is readable and the second file writeable
It should preferably be platform-neutral (although Linux is the primary target)
On Linux, open both files and use fstat to check whether st_ino and st_dev are the same for both. open will follow symbolic links. Don't use stat on the pathnames directly, to prevent race conditions.
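For example (POSIX; error handling abbreviated). Note that O_TRUNC is deliberately left off the output open so the check runs before any data could be destroyed:

#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    if (argc != 3)
        return 2;

    int in = open(argv[1], O_RDONLY);
    if (in < 0) { perror(argv[1]); return 2; }

    int out = open(argv[2], O_WRONLY | O_CREAT, 0644);
    if (out < 0) { perror(argv[2]); return 2; }

    struct stat si, so;
    fstat(in, &si);
    fstat(out, &so);

    /* Same device + same inode = same file, no matter how it was
       reached (symlinks, hard links, different relative paths...). */
    if (si.st_dev == so.st_dev && si.st_ino == so.st_ino) {
        fprintf(stderr, "refusing to overwrite the input file\n");
        return 1;
    }

    ftruncate(out, 0);   /* now it is safe to clear the output */
    /* ... read from in, write to out ... */
    return 0;
}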
The best bet is not to use filenames as identities. Instead, when you open the file for reading, lock it, using whatever mechanism your OS supports. When you then also open the file for writing, also lock it - if the lock fails, report an error.
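A sketch of that idea on Linux with flock(), which is advisory: it only guards against opens that also take the lock, so both of your opens must go through it. open_locked is a name I made up:

#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Open path and take an exclusive, non-blocking lock on it.
   A second call on the same file fails even within one process,
   because flock locks belong to the open file description. */
int open_locked(const char *path, int flags)
{
    int fd = open(path, flags, 0644);
    if (fd < 0)
        return -1;
    if (flock(fd, LOCK_EX | LOCK_NB) != 0) {
        close(fd);   /* already locked: presumably the same file */
        return -1;
    }
    return fd;
}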
If possible, open the first file read-only (O_RDONLY) on Linux. Then, if you try to open it again to write to it, you will get an error.
You can use stat to get the file status, and check if the inode numbers are the same.
Maybe you could use the system() function in order to invoke some shell commands?
In bash, you would simply call:
stat -c %i filename
This displays the inode number of a file. You can compare two files this way: if their inodes are identical, they are hard links to the same file. The following call:
stat -c %N filename
will display the file's name, and if it's a symbolic link, it will also print the name of the file it links to. It prints only one name, even if the target has other hard links, so checking a symbolic link requires comparing the inode numbers of the second file and of the file the symbolic link points to.
You could redirect stat output to a text file and then parse the file in your program.
If you mean the same inode, in bash, you could do
[ FILE1 -ef FILE2 ] && echo equal || echo different
Combined with realpath/readlink, that should handle the soft-links as well.
When you open a .txt file with fopen, is there any way to delete some strings in the file without rewriting it?
For example, this is the txt file that I will open with fopen():
-------------
1 some string
2 SOME string
3 some STRING
-------------
I want to delete the line whose first character is 2, changing the file into:
-------------
1 some string
3 some STRING
-------------
My solution is:
First read all the data and keep it in string variables. Then fopen the same file in "w" mode and write the data back, except line 2. (But this feels clumsy; I am searching for an easier way in C...)
(I hope my English wasn't a problem.)
The easiest way might be to memory-map the whole file using mmap. With mmap you get access to the file as one long memory buffer, with changes being reflected on disk. Then you can find the offset of that line and move the whole tail of the file that many bytes back to overwrite the line.
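A sketch of that approach (POSIX mmap; the caller must first compute the byte offsets of the line to remove, and error handling is trimmed):

#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Delete the bytes [start, end) from path by mapping the file,
   sliding the tail back with memmove, and truncating the excess. */
static int delete_span(const char *path, size_t start, size_t end)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    struct stat st;
    fstat(fd, &st);
    size_t len = st.st_size;

    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { close(fd); return -1; }

    memmove(p + start, p + end, len - end);   /* pull the tail forward */
    munmap(p, len);

    int rc = ftruncate(fd, len - (end - start));
    close(fd);
    return rc;
}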
You should not overwrite the file; it is better to open another (temp) file, write the contents into it, then delete the old file and rename the temp file. That way it is safer if problems occur.
I think the easiest way is to
read whole file
modify contents in memory
write back to a temp file
delete original file
rename temp file to original file
Sounds not too illogical to me..
For sequential files, no matter what technique you use to delete line 2, you still have to write the file back to disk.