I want to shred some temp files produced by my C program before the files are removed.
Currently I am using
system("shred /tmp/datafile");
system("rm /tmp/datafile");
from within my program, but I think calling the system function is not the best way (correct me if I am wrong). Is there any other way I can do it? How do I shred the file from within my code itself? Is there a library, or anything? Also, about the deletion part, is this answer good?
Can I ask why you think this is not the best way to achieve this? It looks like a good solution to me, if it is genuinely necessary to destroy the file contents irretrievably.
The advantages of doing it this way are:
the program already exists (so it's faster to develop); and
the program is already trusted.
The second is an important point. It's possible to overstate the necessity of elaborately scrubbing files (Peter Gutmann, in a remark quoted on the relevant Wikipedia page, has described some uses of his method as ‘voodoo’), but that doesn't matter: in any security context, using a pre-existing tool is almost always more defensible than using something home-made.
About the only criticism I'd make of your current approach, using system(3), is that since it looks up the shred program in the PATH, it would be possible in principle for someone to play games with that and get up to mischief. But that's easily dealt with: use fork(2) and execve(2) to invoke a specific binary using its full path.
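A minimal sketch of that fork/execve approach might look like the following. The helper names run_program and shred_file are made up for illustration, and the location /usr/bin/shred is an assumption, so check where shred lives on your system. The child gets an empty environment, so nothing is looked up through PATH.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs the binary at the absolute path prog with the
   given argument vector and an empty environment.
   Returns the child's exit status, or -1 on failure. */
int run_program(const char *prog, char *const argv[])
{
    char *const envp[] = { NULL };      /* no inherited environment */
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        execve(prog, argv, envp);
        _exit(127);                     /* reached only if execve failed */
    }
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Hypothetical wrapper: overwrite path, then unlink it (shred -u).
   The path /usr/bin/shred is an assumption. */
int shred_file(const char *path)
{
    char *const argv[] = { "shred", "-u", "--", (char *)path, NULL };
    return run_program("/usr/bin/shred", argv);
}
```

With this, shred_file("/tmp/datafile") would replace both system() calls, and a non-zero return value tells you the file may not have been scrubbed.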
That said, if this is just a low-impact bit of tidying up, then it might be still more straightforward to simply mmap the file and quickly write zeros into it.
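As a rough sketch of the mmap idea (the function name zero_file is made up for illustration): map the file read/write, memset it to zero, and flush it with msync. Keep in mind that overwriting in place is only a best effort; journaling or copy-on-write file systems and SSDs may keep old copies of the data elsewhere.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: overwrites the whole file at path with zero bytes.
   Returns 0 on success, -1 on failure. */
int zero_file(const char *path)
{
    int fd = open(path, O_RDWR);
    if (fd == -1)
        return -1;
    struct stat sb;
    if (fstat(fd, &sb) == -1) {
        close(fd);
        return -1;
    }
    int rc = 0;
    if (sb.st_size > 0) {               /* mmap rejects a zero length */
        size_t len = (size_t)sb.st_size;
        void *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (map == MAP_FAILED) {
            close(fd);
            return -1;
        }
        memset(map, 0, len);
        if (msync(map, len, MS_SYNC) == -1)   /* flush the zeros to the file */
            rc = -1;
        if (munmap(map, len) == -1)
            rc = -1;
    }
    if (close(fd) == -1)
        rc = -1;
    return rc;
}
```

After zero_file() succeeds you would still call remove() or unlink() on the path.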
You can use the following code:
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <stdio.h>
#define BUF_SIZE 4096
#define ABS_FILE_PATH "/tmp/aaa"
int main(void)
{
    //get file for writing
    int fd = open(ABS_FILE_PATH, O_WRONLY);
    if (fd == -1)
        return errno;
    //get file size from the open descriptor, so both calls refer to the same file
    struct stat stat_buf;
    if (fstat(fd, &stat_buf) == -1) {
        int saved = errno;
        close(fd);
        return saved;
    }
    off_t fsize = stat_buf.st_size;
    //fill file with 0s
    char buf[BUF_SIZE];
    memset(buf, 0, sizeof buf);
    int err = 0;
    off_t shift = 0;
    while (shift < fsize) {
        size_t chunk = (fsize - shift > BUF_SIZE) ? BUF_SIZE : (size_t)(fsize - shift);
        ssize_t ret = write(fd, buf, chunk);
        if (ret == -1) {
            err = errno; //save errno before close() can overwrite it
            break;
        }
        shift += ret;
    }
    //push the zeros to the device before the file is unlinked
    if (err == 0 && fsync(fd) == -1)
        err = errno;
    close(fd);
    if (err != 0)
        return err;
    //remove file
    if (remove(ABS_FILE_PATH) == -1)
        return errno;
    return 0;
}
I can examine a file's permission bits using the stat() system call, which returns a struct, which contains a field that in turn contains the file type and mode. Is there a way to do the same using nothing but the open and read syscalls? I.e. by analyzing each bit? For example the following code reads a file (the first four bytes) and determines whether it's an ELF file or not ..
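#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
int main(void)
{
    int fd = open("main", O_RDONLY);
    if (fd == -1) {
        perror("open");
        return 1;
    }
    unsigned char buf[4];
    /* An ELF file starts with the four bytes 0x7f 'E' 'L' 'F'.
       The buffer is not NUL-terminated, so compare with memcmp. */
    if (read(fd, buf, sizeof buf) == (ssize_t)sizeof buf &&
        memcmp(buf, "\177ELF", 4) == 0)
        printf("It is an ELF file.\n");
    close(fd);
    return 0;
}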
Is there a similar way to read a file to extract the information bit-by-bit?
File permissions are not part of the file's contents; they are metadata stored in the file's inode. Therefore you can't read the permissions using open or read on the file.
Using stat is the proper way to do this.
I saw that you mentioned in another comment that you're doing this for learning purposes only. Anyone else reading this for production work... DON'T. It'll be non-portable! You almost certainly just want to use stat.
You're going to want to take a look at your system's definition of the stat function. Here is one example of the stat function implementation. It's definitely not as easy as just calling stat. But if you study this source and follow the links in it, you'll get an idea of how it works.
Unfortunately I'm sane enough to not study the source, and am unsure if it can be done with just combinations of open and read. My guess is no, though (just a guess)
My problem is to deal with sparse file reads and to understand where the extents of the file are, so that I can perform some logic around them.
Since there is no direct API call to figure this out, I decided to use the ioctl API. I got the idea from reading how the cp command handles copying sparse files, and ended up at this code:
https://github.com/coreutils/coreutils/blob/df88fce71651afb2c3456967a142db0ae4bf9906/src/extent-scan.c#L112
So I tried to do the same thing in my sample program running in user space, and it errors out with "Invalid argument". I am not sure what I am missing, or whether this is even possible from user space. I am running Ubuntu 14.04 on an ext4 file system. Could this be a problem with the device driver underneath not supporting these request modes?
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <fcntl.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>
#include "fiemap.h" //This is from https://github.com/coreutils/coreutils/blob/df88fce71651afb2c3456967a142db0ae4bf9906/src/fiemap.h
int main(int argc, char* argv[]) {
int input_fd;
if(argc != 2){
printf ("Usage: ioctl file1");
return 1;
}
/* Create input file descriptor */
input_fd = open (argv [1], O_RDWR);
if (input_fd < 0) {
perror ("open");
return 2;
}
union { struct fiemap f; char c[4096]; } fiemap_buf;
struct fiemap *fiemap = &fiemap_buf.f;
int s = ioctl(input_fd, FS_IOC_FIEMAP, fiemap);
if (s == 0) {
printf("ioctl success\n");
} else {
printf("ioctl failure\n");
char * errmsg = strerror(errno);
printf("error: %d %s\n", errno, errmsg);
}
/* Close file descriptors */
close (input_fd);
return s;
}
As you're not properly setting the fiemap_buf.f parameters before invoking ioctl(), it is likely that the EINVAL comes from the invalid contents of fiemap rather than from a lack of support for the FS_IOC_FIEMAP request itself.
For instance, ioctl_fiemap() (in the kernel) checks fiemap.fm_extent_count and returns -EINVAL if it is greater than FIEMAP_MAX_EXTENTS. Since fiemap is neither zeroed nor initialized, this is very likely the root cause of the problem.
Note that from the coreutils code you referenced, it performs the correct parameterization of fiemap before calling ioctl():
fiemap->fm_start = scan->scan_start;
fiemap->fm_flags = scan->fm_flags;
fiemap->fm_extent_count = count;
fiemap->fm_length = FIEMAP_MAX_OFFSET - scan->scan_start;
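Applied to a standalone program, the same initialization could be sketched roughly as below. This is not the coreutils code: the function name count_extents is made up, it uses the kernel's own <linux/fiemap.h> instead of a copied header, and it only looks at the first batch of extents that fits in the 4096-byte buffer.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>       /* FS_IOC_FIEMAP */
#include <linux/fiemap.h>   /* struct fiemap, FIEMAP_FLAG_SYNC */

/* Hypothetical helper: returns the number of extents the kernel reported
   in the first batch for path, or -1 on any error. */
int count_extents(const char *path)
{
    union { struct fiemap f; char c[4096]; } buf;
    struct fiemap *fm = &buf.f;

    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    memset(&buf, 0, sizeof buf);        /* no stale stack contents */
    fm->fm_start = 0;                   /* scan from the beginning */
    fm->fm_length = ~0ULL;              /* ... to the end of the file */
    fm->fm_flags = FIEMAP_FLAG_SYNC;    /* flush dirty data first */
    fm->fm_extent_count =
        (sizeof buf - sizeof *fm) / sizeof(struct fiemap_extent);

    int rc = ioctl(fd, FS_IOC_FIEMAP, fm);
    close(fd);
    return rc == -1 ? -1 : (int)fm->fm_mapped_extents;
}
```

The individual extents are then available in fm->fm_extents[0] up to fm->fm_extents[fm->fm_mapped_extents - 1]. File systems that do not implement FIEMAP will still fail, but with a different error than EINVAL.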
Note that fiemap is not recommended, as you have to be sure to pass FIEMAP_FLAG_SYNC, which has side effects. The lseek() interface with SEEK_DATA and SEEK_HOLE is the recommended one, though note that, depending on the file system, it will represent unwritten extents (allocated zeros) as holes.
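A sketch of walking a file with SEEK_DATA and SEEK_HOLE could look like this. The function name data_bytes is made up for illustration; on glibc the two constants need _GNU_SOURCE, and the fallback values 3 and 4 are the Linux ones, so treat that part as Linux-specific.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE     /* exposes SEEK_DATA and SEEK_HOLE in glibc */
#endif
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SEEK_DATA       /* Linux values, in case the feature macro came too late */
#define SEEK_DATA 3
#define SEEK_HOLE 4
#endif

/* Hypothetical helper: returns how many bytes of path lie in data
   regions (as opposed to holes), or -1 on error. */
off_t data_bytes(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return (off_t)-1;

    off_t total = 0, pos = 0;
    for (;;) {
        off_t data = lseek(fd, pos, SEEK_DATA);   /* start of next data region */
        if (data == (off_t)-1) {
            if (errno != ENXIO)                   /* ENXIO: no data past pos */
                total = (off_t)-1;
            break;
        }
        off_t hole = lseek(fd, data, SEEK_HOLE);  /* end of that region */
        if (hole == (off_t)-1) {
            total = (off_t)-1;
            break;
        }
        total += hole - data;
        pos = hole;
    }
    close(fd);
    return total;
}
```

Each (data, hole) pair found in the loop is one data range, so the same loop can drive whatever per-range logic you need. File systems without real support report the whole file as a single data region.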
Hi, I have a question about the following exercise. In the OS textbook "Operating Systems in Depth" by Thomas W. Doeppner, one of the chapter exercises asks us to find fault with the given code for reading a file's contents backwards, and also asks for a better way to do it. I have come across many ways to do that, but I can't really work out why the following is not considered a good way of doing it.
I appreciate your time and help, thank you!
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
int main() {
int fd;
off_t fptr;
fd = open("./file.txt", O_RDONLY);
char buf[3];
/* go to last char in file */
fptr = lseek(fd, (off_t)-1, SEEK_END);
while (fptr != -1) {
read(fd, buf, 1);
write(1, buf, 1);
fptr = lseek(fd, (off_t)-2, SEEK_CUR);
}
return 0;
}
The method illustrated in your code is inefficient because you make 3 system calls for each byte in the file. Furthermore, you do not check the return values of the read() and write() function calls, nor that the file was opened successfully.
To improve efficiency, you should buffer the input/output operations.
Using putchar() instead of write() would be both more efficient and more reliable.
Reading a chunk of file contents (from a few kilobytes to several megabytes) at a time would be more efficient too.
As always, benchmark the resulting code to measure actual performance improvements.
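One way to put these suggestions together is sketched below: read the file in chunks starting from the end, reverse each chunk in memory, and write it through stdio's buffered output. The function name reverse_copy and the 64 KiB chunk size are arbitrary choices for illustration.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

#define CHUNK 65536     /* arbitrary chunk size */

/* Hypothetical helper: writes the bytes of in_fd to out in reverse order.
   Returns 0 on success, -1 on error. */
int reverse_copy(int in_fd, FILE *out)
{
    static unsigned char buf[CHUNK];
    off_t pos = lseek(in_fd, 0, SEEK_END);        /* file size */
    if (pos == (off_t)-1)
        return -1;

    while (pos > 0) {
        size_t want = pos > CHUNK ? CHUNK : (size_t)pos;
        pos -= (off_t)want;

        /* read the chunk [pos, pos + want), allowing for short reads */
        size_t got = 0;
        while (got < want) {
            ssize_t n = pread(in_fd, buf + got, want - got, pos + (off_t)got);
            if (n <= 0)
                return -1;
            got += (size_t)n;
        }

        /* reverse the chunk in place */
        for (size_t i = 0, j = want - 1; i < j; i++, j--) {
            unsigned char t = buf[i];
            buf[i] = buf[j];
            buf[j] = t;
        }

        if (fwrite(buf, 1, want, out) != want)
            return -1;
    }
    return fflush(out) == 0 ? 0 : -1;
}
```

Calling reverse_copy(fd, stdout) after a checked open() would replace the loop in the question, with roughly one read per 64 KiB instead of three system calls per byte.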
There are probably several problems with the code below. I found it online after searching for a way to get keyboard input in Linux. I've verified the correct event for keyboard input. The reason it seems fishy to me is that, regardless of what I put in the file path, it always seems to pass the error check (the open call returns something greater than 0). Something is obviously wrong, so suggestions are welcome.
This won't run correctly unless you run the executable as root.
When I want to read in my keystrokes, do I just use something like fgets on the file descriptor in an infinite while loop (would that even work)? I want it to be constantly polling for keyboard input. Any tips on decoding the input from the keyboard event?
Thanks again! This project of mine may be overly ambitious, as it's been a really long time since I've done any coding.
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>
#include <fcntl.h>
#include <linux/input.h>
#include <unistd.h>
// Edit this line to reflect your filepath
#define FILE_PATH "/dev/input/event4"
int main()
{
printf("Starting KeyEvent Module\n");
size_t file; //will change this to int file; to make it possible to be negative
const char *str = FILE_PATH;
printf("File Path: %s\n", str);
// error check here
if((file = open(str, O_RDONLY)) < 0)
{
printf("ERROR:File can not open\n");
exit(0);
}
struct input_event event[64];
size_t reader;
reader = read(file, event, sizeof(struct input_event) * 64);
printf("DO NOT COME HERE...\n");
close(file);
return 0;
}
the problem is here:
size_t file;
size_t is unsigned, so it will always be >=0
it should have been:
int file;
the open call returns something greater than 0
open returns an int, but you put it in an unsigned variable (size_t is always unsigned), so you fail to detect when it is < 0
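As for reading and decoding: fgets won't work on an int file descriptor, and the device delivers fixed-size struct input_event records rather than text, so read() is the call to use. A rough sketch follows; the function name next_key_press is made up, and fd is assumed to come from a checked open() on the event device.

```c
#include <linux/input.h>
#include <unistd.h>

/* Hypothetical helper: blocks until the next key-press event arrives on fd
   and returns its key code (KEY_A, KEY_ENTER, ...).
   Returns -1 on end of file, read error, or a short read. */
int next_key_press(int fd)
{
    struct input_event ev;
    for (;;) {
        ssize_t n = read(fd, &ev, sizeof ev);
        if (n != (ssize_t)sizeof ev)
            return -1;
        /* value: 1 = press, 0 = release, 2 = autorepeat */
        if (ev.type == EV_KEY && ev.value == 1)
            return ev.code;
        /* other event types (EV_SYN, EV_MSC, ...) are skipped */
    }
}
```

Calling this in a while loop gives you one key code per key press; the KEY_* constants in <linux/input.h> name the codes.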
I am working on an application in which I need to compare 10^8 entries (alphanumeric entries). Retrieving the entries from a file (file size is 1.5 GB) and then comparing them needs to take less than 5 minutes. What would be an effective way to do that, given that retrieval alone currently exceeds 5 minutes? I need to work on the file only. Please suggest a way out.
I am working on Windows with 3 GB of RAM and a 100 GB hard disk.
Read a part of the file, sort it, write it to a temporary file.
Merge-sort the resulting files.
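The merge step could be sketched like this for two already-sorted temporary files. The assumptions here are mine, not the asker's: one entry per newline-terminated line, entries shorter than MAX_LINE, and plain strcmp ordering. The name merge_sorted is made up for illustration.

```c
#include <stdio.h>
#include <string.h>

#define MAX_LINE 256    /* assumed upper bound on entry length, newline included */

/* Hypothetical helper: merges two files of sorted, newline-terminated
   lines into out. Returns 0 on success, -1 on a write error. */
int merge_sorted(FILE *a, FILE *b, FILE *out)
{
    char la[MAX_LINE], lb[MAX_LINE];
    char *pa = fgets(la, sizeof la, a);
    char *pb = fgets(lb, sizeof lb, b);

    while (pa != NULL || pb != NULL) {
        /* take from a when b is exhausted or a's line sorts first */
        if (pb == NULL || (pa != NULL && strcmp(la, lb) <= 0)) {
            if (fputs(la, out) == EOF)
                return -1;
            pa = fgets(la, sizeof la, a);
        } else {
            if (fputs(lb, out) == EOF)
                return -1;
            pb = fgets(lb, sizeof lb, b);
        }
    }
    return 0;
}
```

With more than two temporary files you would either merge them pairwise or keep one current line per file and repeatedly pick the smallest.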
Error handling is minimal. You need to provide DataType and cmpfunc; samples are provided. You should be able to deduce the core workings from this snippet:
#define _GNU_SOURCE // for O_LARGEFILE
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>
typedef char DataType; // is this alphanumeric?
// qsort requires a comparator taking two const void pointers
int cmpfunc(const void *left, const void *right)
{
    const DataType *l = left, *r = right;
    return *r - *l; // descending order; swap the operands for ascending
}
int main(int argc, char **argv)
{
    if (argc != 2)
        return 1;
    int fd = open(argv[1], O_RDWR|O_LARGEFILE);
    if (fd == -1)
        return 1;
    struct stat st;
    if (fstat(fd, &st) != 0)
        return 1;
    DataType *data = mmap(NULL, st.st_size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
    if (data == MAP_FAILED) // mmap reports failure with MAP_FAILED, not NULL
        return 1;
    // qsort takes the element count and the element size as separate arguments
    qsort(data, st.st_size / sizeof(*data), sizeof(*data), cmpfunc);
    if (0 != msync(data, st.st_size, MS_SYNC))
        return 1;
    if (-1 == munmap(data, st.st_size))
        return 1;
    if (0 != close(fd))
        return 1;
    return 0;
}
I can't imagine you can get much faster than this. Be sure you have enough virtual address space (1.5 GB is pushing it but will probably just work on 32-bit Linux; you'll be able to manage this on any 64-bit OS). Note that this code is "limited" to working on a POSIX-compliant system.
In terms of C and efficiency, this approach puts the entire operation in the hands of the OS, and the excellent qsort algorithm.
If retrieving time is exceeding 5 min it seems that you need to look at how you are reading this file. One thing that has caused bad performance for me is that a C implementation sometimes uses thread-safe I/O operations by default, and you can gain some speed by using thread-unsafe I/O.
What kind of computer will this be run on? Many computers nowadays have several gigabytes of memory, so perhaps it will work to just read it all into memory and then sort it there (with, for example, qsort)?