C: reading a file backwards: why is this particular method not considered good?

Hi, I have a question about an exercise in the OS textbook "Operating Systems in Depth" by Thomas W. Doeppner. One of the chapter exercises asks us to find fault with the given code for reading a file's contents backwards, and also asks for a better way to do it. I have come across many ways to do that, but I can't work out why the following is not considered a good way of doing it.
I appreciate your time and help, thank you!
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
int main() {
int fd;
off_t fptr;
fd = open("./file.txt", O_RDONLY);
char buf[3];
/* go to last char in file */
fptr = lseek(fd, (off_t)-1, SEEK_END);
while (fptr != -1) {
read(fd, buf, 1);
write(1, buf, 1);
fptr = lseek(fd, (off_t)-2, SEEK_CUR);
}
return 0;
}

The method illustrated in your code is inefficient because you make 3 system calls for each byte in the file. Furthermore, you do not check the return values of the read() and write() function calls, nor that the file was opened successfully.
To improve efficiency, you should buffer the input/output operations.
Using putchar() instead of write() would be both more efficient and more reliable.
Reading a chunk of file contents (from a few kilobytes to several megabytes) at a time would be more efficient too.
As always, benchmark the resulting code to measure actual performance improvements.
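As a rough sketch of the chunked approach described above (not the textbook's reference solution), the file can be walked from the end in fixed-size blocks, with each block reversed in memory before it is written. The helper name print_reversed and the 4096-byte CHUNK are assumptions made for illustration. This costs one lseek, one read and one write per block instead of three system calls per byte.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define CHUNK 4096 /* assumed chunk size; benchmark to tune it */

/* Hypothetical helper: writes the bytes of the file at `path` to out_fd
 * in reverse order. Returns 0 on success, -1 on any error. */
int print_reversed(const char *path, int out_fd)
{
    char buf[CHUNK];
    int rc = -1;
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    off_t pos = lseek(fd, 0, SEEK_END); /* offset of the end == file size */
    if (pos == (off_t)-1)
        goto done;

    while (pos > 0) {
        size_t want = pos < CHUNK ? (size_t)pos : CHUNK;
        pos -= (off_t)want;

        /* Position at the start of the chunk that ends where the last one began. */
        if (lseek(fd, pos, SEEK_SET) == (off_t)-1)
            goto done;

        /* read() may return fewer bytes than requested, so loop. */
        size_t got = 0;
        while (got < want) {
            ssize_t n = read(fd, buf + got, want - got);
            if (n <= 0)
                goto done; /* error, or the file shrank while we were reading */
            got += (size_t)n;
        }

        /* Reverse the chunk in place. */
        for (size_t i = 0, j = want - 1; i < j; i++, j--) {
            char tmp = buf[i];
            buf[i] = buf[j];
            buf[j] = tmp;
        }

        /* write() may also be partial, so loop until the chunk is out. */
        size_t put = 0;
        while (put < want) {
            ssize_t n = write(out_fd, buf + put, want - put);
            if (n <= 0)
                goto done;
            put += (size_t)n;
        }
    }
    rc = 0;
done:
    close(fd);
    return rc;
}
```

For the program in the question, the call would be print_reversed("./file.txt", STDOUT_FILENO), with the return value checked.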

Related

How to read file permission bits using only the open and read system calls?

I can examine a file's permission bits using the stat() system call, which returns a struct, which contains a field that in turn contains the file type and mode. Is there a way to do the same using nothing but the open and read syscalls? I.e. by analyzing each bit? For example the following code reads a file (the first four bytes) and determines whether it's an ELF file or not ..
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <stdlib.h>
#include <fcntl.h>
int main(int argc, char *argv[])
{
int fd = open("main", O_RDONLY);
char *buf = malloc(sizeof (char) * 4);
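/* The ELF magic number is the byte 0x7f followed by "ELF". buf is not NUL-terminated, so compare raw bytes. */
if (read(fd, buf, 4) == 4 && memcmp(buf, "\x7f" "ELF", 4) == 0)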
printf("It is an ELF file.\n");
free(buf);
return 0;
}
Is there a similar way to read a file to extract the information bit-by-bit?
File permissions are not part of the file's contents; they are stored in the file's metadata (its inode on typical Unix filesystems). Therefore you can't read the permissions using open or read on the file.
Using stat is the proper way to do this.
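For reference, a minimal sketch of the stat approach might look like this. The helper names permission_bits and print_file_mode are made up for illustration; stat(), S_ISREG() and S_ISDIR() are the standard POSIX interfaces.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: returns the permission bits of `path`
 * (for example 0644), or -1 if stat() fails. */
int permission_bits(const char *path)
{
    struct stat st;
    if (stat(path, &st) == -1)
        return -1;
    return (int)(st.st_mode & 07777); /* mask off the file-type bits */
}

/* Hypothetical helper: prints the file type and mode of `path`.
 * Returns 0 on success, -1 if stat() fails. */
int print_file_mode(const char *path)
{
    struct stat st;
    if (stat(path, &st) == -1) {
        perror(path);
        return -1;
    }
    const char *kind = S_ISREG(st.st_mode) ? "regular file"
                     : S_ISDIR(st.st_mode) ? "directory"
                     : "other";
    printf("%s: %s, mode %04o\n", path, kind,
           (unsigned)(st.st_mode & 07777));
    return 0;
}
```

Calling print_file_mode("main") on the file from the question would print its type and octal mode, none of which comes from the file's contents.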
I saw that you mentioned in another comment that you're doing this for learning purposes only. Anyone else reading this for production work: DON'T. It'll be non-portable! You probably just want to use stat.
You're going to want to take a look at your system's definition of the stat function. Here is one example of the stat function implementation. It's definitely not as easy as just calling stat. But if you study this source and follow the links in it, you'll get an idea of how it works.
Unfortunately I'm sane enough to not study the source, and am unsure if it can be done with just combinations of open and read. My guess is no, though (just a guess)

shred and remove files in linux from a C program

I want to shred some temp files produced by my C program before the files are removed.
Currently I am using
system("shred /tmp/datafile");
system("rm /tmp/datafile");
from within my program, but I think calling the system function is not the best way (correct me if I am wrong). Is there any other way I can do it? How do I shred the file from within my code itself? With a library, or anything else? Also, about the deletion part, is this answer good?
Can I ask why you think this is not the best way to achieve this? It looks like a good solution to me, if it is genuinely necessary to destroy the file contents irretrievably.
The advantages of doing it this way are:
the program already exists (so it's faster to develop); and
the program is already trusted.
The second is an important point. It's possible to overstate the necessity of elaborately scrubbing files (Peter Gutmann, in a remark quoted on the relevant wikipedia page, has described some uses of his method as ‘voodoo’), but that doesn't matter: in any security context, using a pre-existing tool is almost always more defensible than using something home-made.
About the only criticism I'd make of your current approach, using system(3), is that since it looks up the shred program in the PATH, it would be possible in principle for someone to play games with that and get up to mischief. But that's easily dealt with: use fork(2) and execve(2) to invoke a specific binary using its full path.
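A minimal sketch of that fork(2)/execve(2) idea follows. The helper name run_absolute is hypothetical, and passing an empty environment is a choice made for this example rather than something the approach requires.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs the program at the absolute path `path` with
 * argument vector `argv` (argv[0] is the program name, the last element
 * is NULL). No PATH lookup takes place. Returns the child's exit status,
 * or -1 if the child could not be started or did not exit normally. */
int run_absolute(const char *path, char *const argv[])
{
    char *const envp[] = { NULL }; /* assumption: run with an empty environment */

    if (path == NULL || path[0] != '/')
        return -1; /* insist on a full path */

    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        execve(path, argv, envp);
        _exit(127); /* reached only if execve() failed */
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With char *const args[] = { "shred", "/tmp/datafile", NULL }, a call such as run_absolute("/usr/bin/shred", args) would replace the first system() call, assuming shred is installed at that location on your system. The file can then be removed with unlink("/tmp/datafile") instead of running rm.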
That said, if this is just a low-impact bit of tidying up, then it might be still more straightforward to simply mmap the file and quickly write zeros into it.
You can use the following code:
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <stdio.h>
#define BUF_SIZE 4096
#define ABS_FILE_PATH "/tmp/aaa"
int main()
{
//get file size
struct stat stat_buf;
if (stat(ABS_FILE_PATH, &stat_buf) == -1)
return errno;
off_t fsize = stat_buf.st_size;
//get file for writing
int fd = open(ABS_FILE_PATH, O_WRONLY);
if (fd == -1)
return errno;
//fill file with 0s
void *buf = malloc(BUF_SIZE);
memset(buf, 0, BUF_SIZE);
ssize_t ret = 0;
off_t shift = 0;
while((ret = write(fd, buf,
((fsize - shift >BUF_SIZE)?
BUF_SIZE:(fsize - shift)))) > 0)
shift += ret;
close(fd);
free(buf);
if (ret == -1)
return errno;
//remove file
if (remove(ABS_FILE_PATH) == -1)
return errno;
return 0;
}

How do I read a file backwards using read() in C?

I am supposed to create a program that takes a given file and creates a file with the text reversed. Is there a way I can start the read() from the end of the file and copy it to the first byte of the created file if I don't know the exact size of the file?
I have also googled this and came across many examples with fread, fopen, etc. However, I can't use those for this project; I can only use read, open, lseek, write, and close.
Here is my code so far. It's not much, but just for reference:
#include<stdio.h>
#include<unistd.h>
int main (int argc, char *argv[])
{
if(argc != 2)/*argc should be 2 for correct execution*/
{
printf("usage: %s filename",argv[0[]);}
}
else
{
int file1 = open(argv[1], O_RDWR);
if(file1 == -1){
printf("\nfailed to open file.");
return 1;
}
int reversefile = open(argv[2], O_RDWR | O_CREAT);
int size = lseek(argv[1], 0, SEEK_END);
char *file2[size+1];
int count=size;
int i = 0
while(read(file1, file2[count], 0) != 0)
{
file2[i]=*read(file1, file2[count], 0);
write(reversefile, file2[i], size+1);
count--;
i++;
lseek(argv[2], i, SEEK_SET);
}
I doubt that most filesystems are designed to support this operation effectively. Chances are, you'd have to read the whole file to get to the end. For the same reasons, most languages probably don't include any special feature for reading a file backwards.
Just come up with something. Try to read the whole file in memory. If it is too big, dump the beginning, reversed, into a temporary file and keep reading... In the end combine all temporary files into one. Also, you could probably do something smart with manual low-level manipulation of disk sectors, or at least with low-level programming directly against the file system. Looks like this is not what you are after, though.
Why don't you try fseek to navigate inside the file? This function is contained in stdio.h, just like fopen and fclose.
Another idea would be to implement a simple stack...
This has no error checking == really bad
get file size using stat
create a buffer with malloc
fread the file into the buffer
set a pointer to the end of the file
print each character going backwards thru the buffer.
If you get creative with google you can get several examples just like this.
IMO the assistance you are getting so far is not really even good hints.
This appears to be schoolwork, so beware of copying. Do some reading about the calls used here. stat (fstat) fread (read)
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
int main(int argc, char **argv)
{
struct stat st;
char *buf;
char *p;
FILE *in=fopen(argv[1],"r");
fstat(fileno(in), &st); // get file size in bytes
buf=malloc(st.st_size +2); // buffer for file
memset(buf, 0x0, st.st_size +2 );
fread(buf, st.st_size, 1, in); // fill the buffer
p=buf+st.st_size; // one past the last byte of the file
while(p>buf) // print traversing backwards, starting at the last byte
putchar(*--p);
fclose(in);
return 0;
}
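Since the question only allows open, read, lseek, write and close, here is a hedged sketch of the same whole-file-buffer steps using just those calls, plus malloc and free for the buffer. The file size comes from lseek to SEEK_END rather than stat. The function name reverse_file is hypothetical, and the whole file is assumed to fit in memory.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: writes the bytes of `src` to `dst` in reverse order.
 * Returns 0 on success, -1 on any error. */
int reverse_file(const char *src, const char *dst)
{
    int rc = -1, out = -1;
    char *buf = NULL;
    int in = open(src, O_RDONLY);
    if (in == -1)
        return -1;

    /* Seeking to the end returns the file size; then rewind to the start. */
    off_t size = lseek(in, 0, SEEK_END);
    if (size == (off_t)-1 || lseek(in, 0, SEEK_SET) == (off_t)-1)
        goto done;

    buf = malloc(size > 0 ? (size_t)size : 1);
    if (buf == NULL)
        goto done;

    /* read() may return fewer bytes than requested, so loop. */
    for (off_t got = 0; got < size; ) {
        ssize_t n = read(in, buf + got, (size_t)(size - got));
        if (n <= 0)
            goto done;
        got += n;
    }

    /* Reverse the buffer in place. */
    for (off_t i = 0, j = size - 1; i < j; i++, j--) {
        char tmp = buf[i];
        buf[i] = buf[j];
        buf[j] = tmp;
    }

    out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out == -1)
        goto done;
    for (off_t put = 0; put < size; ) {
        ssize_t n = write(out, buf + put, (size_t)(size - put));
        if (n <= 0)
            goto done;
        put += n;
    }
    rc = 0;
done:
    free(buf);
    if (out != -1 && close(out) == -1)
        rc = -1;
    close(in);
    return rc;
}
```

In the asker's program this would be called as reverse_file(argv[1], argv[2]) after checking that argc is 3.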

Using fseek and fread

I am working on a project that reads data from bin files and processes the data. The bin file is huge and is about 150MB. I am trying to use fseek to skip unwanted processing of data.
I am wondering if the processing time of fseek is the same as fread.
Thanks!
fseek just repositions the internal file pointer whereas fread actually reads data. So I guess fseek should be much faster than fread
If you are really curious to see what's happening behind the screen, download glibc from here and check for yourself :)
I am wondering if the processing time of fseek is the same as fread.
Probably not though, of course, it's implementation-dependent.
Most likely, fseek will only set an in-memory "file pointer" without going out to the disk to read any information. fread, on the other hand, will read information.
An fseek to file position 149M followed by a 1M fread will probably be faster than 150 different 1M fread calls, throwing away all but the last.
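To make the seek-then-read pattern concrete, here is a small sketch. The helper name read_at is made up for this example, and offsets are limited to what fits in a long because that is what fseek takes.

```c
#include <stdio.h>

/* Hypothetical helper: reads up to n bytes starting at byte `offset` of the
 * file at `path` into buf. Returns the number of bytes read (possibly fewer
 * than n near the end of the file), or -1 if the file cannot be opened,
 * the seek fails, or a read error occurs. */
long read_at(const char *path, long offset, void *buf, size_t n)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    /* Skip the unwanted data without reading it. */
    if (fseek(f, offset, SEEK_SET) != 0) {
        fclose(f);
        return -1;
    }

    size_t got = fread(buf, 1, n, f);
    int failed = ferror(f);
    fclose(f);
    return failed ? -1 : (long)got;
}
```

For the 150MB file, read_at(path, 149L * 1024 * 1024, buf, 1024 * 1024) would fetch only the last megabyte. Whether that is measurably faster than reading everything depends on the implementation and on caching, so it is worth timing.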
I think fseek might be a bit faster than fread, as fseek only changes the file position to the new offset that you have specified, and no data is read.
If you are processing huge files have you considered alternatives to read/write?
You may find that mmap() (UNIX) or MapViewOfFile (Windows) is a more suitable alternative.
The following UNIX example demonstrates opening a file for reading and counting the occurrences of the ASCII character 'Q'. NOTE: all error checking has been omitted to make the example shorter.
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
int main(int argc, char **argv)
{
int i, fd, len, total;
char *map, *ptr;
fd = open("/tmp/mybigfile", O_RDONLY);
len = lseek(fd, 0, SEEK_END); /* offset first, then whence: returns the file size */
map = (char *)mmap(0, len, PROT_READ, MAP_SHARED, fd, 0);
total = 0;
for (i=0; i<len; i++) {
if (map[i] == 'Q') total++;
}
printf("Found %d instances of 'Q'\n", total);
munmap(map, len);
close(fd);
}

Problem in code with File Descriptors. C (Linux)

I've written code that should ideally take in data from one document, encrypt it and save it in another document.
But when I try executing the code it does not put the encrypted data in the new file. It just leaves it blank. Someone please spot what's missing in the code. I tried but I couldn't figure it out.
I think there is something wrong with the read/write function, or maybe I'm implementing the do-while loop incorrectly.
#include <stdio.h>
#include <stdlib.h>
#include <termios.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
int main (int argc, char* argv[])
{
int fdin,fdout,n,i,fd;
char* buf;
struct stat fs;
if(argc<3)
printf("USAGE: %s source-file target-file.\n",argv[0]);
fdin=open(argv[1], O_RDONLY);
if(fdin==-1)
printf("ERROR: Cannot open %s.\n",argv[1]);
fdout=open(argv[2], O_WRONLY | O_CREAT | O_EXCL, 0644);
if(fdout==-1)
printf("ERROR: %s already exists.\n",argv[2]);
fstat(fd, &fs);
n= fs.st_size;
buf=malloc(n);
do
{
n=read(fd, buf, 10);
for(i=0;i<n;i++)
buf[i] ^= '#';
write(fd, buf, n);
} while(n==10);
close(fdin);
close(fdout);
}
You are using fd instead of fdin in fstat, read and write system calls. fd is an uninitialized variable.
// Here...
fstat(fd, &fs);
// And here...
n=read(fd, buf, 10);
for(i=0;i<n;i++)
buf[i] ^= '#';
write(fd, buf, n);
You're reading and writing to fd instead of fdin and fdout. Make sure you enable all warnings your compiler will emit (e.g. use gcc -Wall -Wextra -pedantic). It will warn you about the use of an uninitialized variable if you let it.
Also, if you checked the return codes of fstat(), read(), or write(), you'd likely have gotten errors from using an invalid file descriptor. They are most likely erroring out with EBADF (bad file descriptor) errors.
fstat(fd, &fs);
n= fs.st_size;
buf=malloc(n);
And since we're here: allocating enough memory to hold the entire file is unnecessary. You're only reading 10 bytes at a time in your loop, so you really only need a 10-byte buffer. You could skip the fstat() entirely.
// Just allocate 10 bytes.
buf = malloc(10);
// Or heck, skip the malloc() too! Change "char *buf" to:
char buf[10];
Everything said above is true; one more tip.
You should use a larger buffer that matches the filesystem block size or a multiple of it, such as 8192 bytes.
This can speed up your program significantly, since reading 8192 bytes at a time instead of 10 means roughly 800 times fewer read and write calls. As you know, each of those calls is expensive in terms of time.
Another option is to use the stdio functions fread, fwrite, etc., which already take care of buffering, although you'll still have the function call overhead.
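Putting the corrections together, the copy loop could be sketched like this with the 8192-byte buffer suggested above. The function name xor_copy is hypothetical; it takes the already opened descriptors, so the fdin/fdout mix-up cannot occur inside it.

```c
#include <sys/types.h>
#include <unistd.h>

#define BUF_SIZE 8192 /* buffer size suggested above; worth benchmarking */

/* Hypothetical helper: copies everything from in_fd to out_fd, XOR-ing each
 * byte with `key`. Returns 0 on success, -1 on a read or write error. */
int xor_copy(int in_fd, int out_fd, char key)
{
    char buf[BUF_SIZE];
    ssize_t n;

    /* read() returns 0 at end of file and -1 on error. */
    while ((n = read(in_fd, buf, sizeof buf)) > 0) {
        for (ssize_t i = 0; i < n; i++)
            buf[i] ^= key;

        /* write() may be partial, so loop until the block is written. */
        for (ssize_t put = 0; put < n; ) {
            ssize_t w = write(out_fd, buf + put, (size_t)(n - put));
            if (w <= 0)
                return -1;
            put += w;
        }
    }
    return n == 0 ? 0 : -1;
}
```

In the program from the question this would replace the do-while loop with a single call, xor_copy(fdin, fdout, '#'), whose return value should be checked.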
Roni
