checking a file from a certain offset without reading the entire file - c

I want to open a file (which is huge) and read a given number of bytes starting at a certain offset.
In C# this can be done with:
File.ReadAllBytes(file).Skip(50).Take(10).ToArray();
The problem is that this reads the entire file, and since my files can be huge, it takes a long time. Is there a way to read part of a file like this WITHOUT reading the entire file? Preferably in C.

Yes, use the fseek() standard library function to move ("seek") to the desired position:
FILE *in = fopen("myfancyfile.dat", "rb");
if(in != NULL)
{
    if(fseek(in, 50, SEEK_SET) == 0)
    {
        char buf[10];
        if(fread(buf, sizeof buf, 1, in) == 1)
        {
            /* got the data, process it here */
        }
    }
    fclose(in);
}
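The same idea can be wrapped in a small reusable function. This is only a sketch: the name read_range and its return convention are made up for illustration. It passes an item size of 1 to fread() so that the return value is a byte count, which also reports short reads near the end of the file.

```c
#include <stdio.h>

/* Hypothetical helper: read up to `count` bytes starting at byte `offset`
 * of the file at `path` into `buf`. Returns the number of bytes actually
 * read (possibly fewer than `count` near end of file), or -1 on error. */
long read_range(const char *path, long offset, void *buf, size_t count)
{
    FILE *in = fopen(path, "rb");
    if (in == NULL)
        return -1;

    long result = -1;
    if (fseek(in, offset, SEEK_SET) == 0) {
        /* Item size 1, so fread() returns a byte count. */
        size_t got = fread(buf, 1, count, in);
        if (!ferror(in))
            result = (long)got;
    }
    fclose(in);
    return result;
}
```

For the case in the question, read_range("myfancyfile.dat", 50, buf, 10) would fill buf with bytes 50 through 59 without touching the rest of the file.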

Related

Writing to binary files at offset zeroes all previous bytes

I'm trying to write to a new file opened in "wb" mode at a given offset using the function owrite below, but every time it overwrites all bytes before the offset.
I'm using Windows 10 and Visual Studio 2019 16.0.3.
The offset is a positive number outside the file's bounds (since it's a new file).
count == 64000 == size of buf.
I've tried lseek/_lseek and write/_write (with fileno) but got a similar result. owrite doesn't return -1, and I also checked the return value of fwrite; everything seems fine. What is the right way to perform this operation?
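int owrite(FILE* fd, char* buf, size_t count, int offset)
{
    if (fseek(fd, offset, SEEK_SET) != 0) {
        return -1;
    }
    /* fwrite returns the number of items written; fewer than count means failure */
    if (fwrite(buf, sizeof(char), count, fd) != count) {
        return -1;
    }
    /* return to the start of the file */
    if (fseek(fd, 0, SEEK_SET) != 0) {
        return -1;
    }
    return 0;
}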
Here is the function that calls owrite:
// byte is unsigned char
void insert_chunk(byte* buffer, int len, char* filename, long offset)
{
    FILE* builded_file = fopen(filename, "wb");
    owrite(builded_file, buffer, len, offset);
    fclose(builded_file);
}
You are telling it to discard the existing contents every time you open the file. You want "r+", not "w" (or, in your case, "r+b" rather than "wb").
From http://www.cplusplus.com/reference/cstdio/fopen/:
"w" write: Create an empty file for output operations. If a file with the same name already exists, its contents are discarded and the file is treated as a new empty file.
Note that "r+" only works if the file already exists. If you don't know whether the file exists, you may need to check that first, and open with "w" or "w+" if it doesn't exist.
If you really want to add to the end of the file, and not to an offset in the middle, you could use "a" or "a+", which will create the file if it does not exist.
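A minimal sketch of that advice: try "r+b" first and fall back to "w+b" only when the file cannot be opened. The names open_for_update and write_at are hypothetical. Note the assumption that a failed "r+b" means the file is missing; it can also fail for other reasons (permissions, for example), and there is a small window between the two fopen() calls in which another process could create the file.

```c
#include <stdio.h>

/* Hypothetical helper: open `path` for reading and writing without
 * discarding existing contents; create it only if "r+b" fails. */
FILE *open_for_update(const char *path)
{
    FILE *fp = fopen(path, "r+b");   /* fails if the file is missing */
    if (fp == NULL)
        fp = fopen(path, "w+b");     /* creates a new, empty file */
    return fp;
}

/* Write `count` bytes from `buf` at byte `offset`, keeping everything
 * already stored in the file. Returns 0 on success, -1 on error. */
int write_at(const char *path, const void *buf, size_t count, long offset)
{
    FILE *fp = open_for_update(path);
    if (fp == NULL)
        return -1;
    int ok = fseek(fp, offset, SEEK_SET) == 0
          && fwrite(buf, 1, count, fp) == count;
    if (fclose(fp) != 0)             /* fclose flushes; it can fail too */
        ok = 0;
    return ok ? 0 : -1;
}
```

With this, chunks can be written in any order: a later call at offset 0 no longer wipes out a chunk written earlier at a higher offset.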

Reading a text file full with null characters and texts using fread

I am trying to design a small file system.
I have created a text file to store the files' data in.
int kufs_create_disk(char* disk_name, int disk_size){
    FILE* file_ptr = fopen(disk_name, "w");
    if (file_ptr == NULL)
        return -1;
    fseek(file_ptr, disk_size * 1024 - 1, SEEK_SET);
    fwrite("", 1, sizeof(char), file_ptr); // write one '\0' at the last position to give the file its size
    fclose(file_ptr);
    DiskName = disk_name;
    return 0;
}
After calling the function, I get a file of the size I specified.
kufs_create_disk("test.txt", 5);
This creates a 5 KB file, padded to that size with '\0' bytes.
I have another function that writes to different places in this file. It works fine, so I won't paste the code, for simplicity.
When I try to read from the file using fread(), I don't get all the data I have written; I get only some of it.
My read implementation is:
int kufs_read(int fd, void* buf, int n){
    FILE *file_ptr = fopen("test.txt", "a+");
    if (file_ptr == NULL)
        return -1;
    fseek(file_ptr, FAT[fd].position, SEEK_SET); // FAT[fd].position is where the read starts; fd is an index into FAT
    size_t got = fread(buf, 1, n, file_ptr); // n is the number of bytes to read
    fclose(file_ptr);
    FAT[fd].position = FAT[fd].position + got; // advance by the bytes actually read
    return (int)got;
}
The problem is that the read returns some of the characters written but not the rest. As a test, I looped over the whole file to check whether everything is being read: fread reads everything, but in buf I only see some of the characters I've written.
The text file looks something like this:
0\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00written string1written string2 0\00\00\00\00\00\00\00\00\00\00\00\000\00\00\00\00\00\00\00\00\00\00\00\00writtenstring 3 \00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00
I get written string1 and written string2 in the buffer, but I don't get writtenstring 3, for example.
Can you explain why?
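One possible explanation, which the question does not give enough code to confirm: fread() copies null bytes like any other byte, but anything that treats buf as a C string (printf with %s, strlen, strcpy) stops at the first '\0', so text after a run of nulls appears to be missing. The sketch below uses a hypothetical helper, dump_bytes, that walks the buffer by length instead and shows each null byte as '.'.

```c
#include <stdio.h>

/* Hypothetical helper: print `n` bytes from `buf`, showing each null byte
 * as '.' so that text after an embedded '\0' stays visible.
 * Returns the number of non-null bytes seen. */
size_t dump_bytes(const unsigned char *buf, size_t n)
{
    size_t non_null = 0;
    for (size_t i = 0; i < n; i++) {
        if (buf[i] != '\0')
            non_null++;
        putchar(buf[i] != '\0' ? buf[i] : '.');
    }
    putchar('\n');
    return non_null;
}
```

If the output of dump_bytes(buf, n) shows the third string where %s did not, the data was read correctly and only the way it was inspected was at fault.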

How to duplicate an image file? [duplicate]

I am designing an image decoder, and as a first step I tried to simply copy the file using C, i.e. open the file and write its contents to a new file. Below is the code that I used.
while((c=getc(fp))!=EOF)
fprintf(fp1,"%c",c);
where fp is the source file and fp1 is the destination file.
The program runs without any error, but the image file (".bmp") is not copied properly. The copied file is smaller, and only 20% of the image is visible; the rest is black. When I tried with simple text files, the copy was complete.
Do you know what the problem is?
Make sure that the type of the variable c is int, not char. In other words, post more code.
This is because the value of the EOF constant is typically -1, and if you store what getc() returns in a char, every 0xff byte can compare equal to EOF. With the extra bits of an int, there is room to tell the two apart.
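As a sketch of the byte-by-byte loop from the question with c declared as int (copy_bytes is a made-up name, and both streams are assumed to have been opened in binary mode by the caller):

```c
#include <stdio.h>

/* Copy every byte from `src` to `dst` one at a time.
 * Returns the number of bytes copied, or -1 on a read or write error. */
long copy_bytes(FILE *src, FILE *dst)
{
    long n = 0;
    int c;  /* int, not char: getc() returns 0..255 or EOF, so the
               byte 0xFF (255) and EOF (negative) stay distinct */
    while ((c = getc(src)) != EOF) {
        if (putc(c, dst) == EOF)
            return -1;
        n++;
    }
    return ferror(src) ? -1 : n;
}
```

Using putc() rather than fprintf(fp1, "%c", c) avoids format-string parsing for every byte; the result is the same.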
Did you open the files in binary mode? What are you passing to fopen?
It's one of the most "popular" C gotchas.
You should use fread and fwrite, copying a block at a time:
FILE *fd1 = fopen("source.bmp", "rb");
FILE *fd2 = fopen("destination.bmp", "wb");
if(!fd1 || !fd2) {
    // handle open error
}
size_t l1;
unsigned char buffer[8192];
// Data to be read
while((l1 = fread(buffer, 1, sizeof buffer, fd1)) > 0) {
    size_t l2 = fwrite(buffer, 1, l1, fd2);
    if(l2 < l1) {
        if(ferror(fd2)) {
            // handle error
        } else {
            // handle media full
        }
    }
}
fclose(fd1);
fclose(fd2);
It's substantially faster to read in bigger blocks. Opening both files in binary mode ("rb" and "wb") matters as well: in text mode, \n may be transformed to \r\n in the output (on Windows and DOS) or to \r (on old Macs), which corrupts binary data.
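Put together as a function, a block copy might look like the following. This is a sketch, not a hardened implementation: copy_file is a hypothetical name, the 8192-byte buffer size is carried over from the snippet above, and all failures are collapsed into a single -1 return.

```c
#include <stdio.h>

/* Copy the file at `src_path` to `dst_path` in binary mode, a block at a
 * time. Returns 0 on success, -1 on any open, read, or write error. */
int copy_file(const char *src_path, const char *dst_path)
{
    FILE *src = fopen(src_path, "rb");
    if (src == NULL)
        return -1;
    FILE *dst = fopen(dst_path, "wb");
    if (dst == NULL) {
        fclose(src);
        return -1;
    }

    unsigned char buffer[8192];
    size_t got;
    int ok = 1;
    while ((got = fread(buffer, 1, sizeof buffer, src)) > 0) {
        if (fwrite(buffer, 1, got, dst) != got) {
            ok = 0;              /* short write: error or disk full */
            break;
        }
    }
    if (ferror(src))             /* loop ended on an error, not on EOF */
        ok = 0;
    fclose(src);
    if (fclose(dst) != 0)        /* the final flush can fail as well */
        ok = 0;
    return ok ? 0 : -1;
}
```

A call such as copy_file("source.bmp", "destination.bmp") should produce a byte-for-byte identical file, including any 0xFF, \n, or \r bytes in the image data.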

Proper way to get file size in C

I am working on a socket programming assignment in which I have to send a file between a SPARC machine and a Linux machine. Before sending the file as a char stream, I have to get the file size and tell the client. Here are some of the ways I tried to get the size, but I am not sure which one is the proper one.
For testing purposes, I created a file with the content " test" (a space followed by the string test).
Method 1 - Using fseeko() and ftello()
This is a method I found at https://www.securecoding.cert.org/confluence/display/c/FIO19-C.+Do+not+use+fseek()+and+ftell()+to+compute+the+size+of+a+regular+file
While fseek() has the problem that "Setting the file position indicator to end-of-file, as with fseek(file, 0, SEEK_END), has undefined behavior for a binary stream", fseeko() is said to have tackled this problem, but it only works on POSIX systems (which is fine, because my environments are SPARC and Linux).
fd = open(file_path, O_RDONLY);
fp = fopen(file_path, "rb");
/* Ensure that the file is a regular file */
if ((fstat(fd, &st) != 0) || (!S_ISREG(st.st_mode))) {
/* Handle error */
}
if (fseeko(fp, 0 , SEEK_END) != 0) {
/* Handle error */
}
file_size = ftello(fp);
fseeko(fp, 0, SEEK_SET);
printf("file size %zu\n", file_size);
This method works fine and gets the size correctly. However, it is limited to regular files only. I tried googling the term "regular file", but I still don't quite understand it. And I do not know whether this approach is reliable for my project.
Method 2 - Using strlen()
Since the maximum file size in my project is 4 MB, I can just calloc a 4 MB buffer. The file is then read into the buffer, and I use strlen to get the file size (or, more correctly, the length of the content). Since strlen() is portable, can I use this method instead? The code snippet looks like this:
fp = fopen(file_path, "rb");
fread(file_buffer, 1024*1024*4, 1, fp);
printf("strlen %zu\n", strlen(file_buffer));
This method works too and returns
strlen 8
However, I couldn't find any similar approach on the Internet. So I think I may have missed something, or there are limitations of this approach that I haven't realized.
A regular file is one that is nothing special like a device, socket, pipe, etc., just a "normal" file.
From your task description, it seems you need to retrieve the size of a normal file before sending it.
So your way is right:
FILE* fp = fopen(...);
if(fp) {
    fseek(fp, 0 , SEEK_END);
    long fileSize = ftell(fp);
    fseek(fp, 0 , SEEK_SET); // needed so the next read starts at the beginning of the file
    ...
    fclose(fp);
}
But you can also do it without opening the file:
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
struct stat buffer;
int status;
status = stat("path to file", &buffer);
if(status == 0) {
    // the size of the file is in the member buffer.st_size
}
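The stat() approach can be combined with the regular-file check from the question in one small POSIX-only function. The name regular_file_size and the choice of -1 as the error value are assumptions made for this sketch.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Return the size in bytes of the regular file at `path`, or -1 if
 * stat() fails or the path is not a regular file (a directory, device,
 * socket, pipe, and so on). POSIX only. */
long long regular_file_size(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    if (!S_ISREG(st.st_mode))
        return -1;
    return (long long)st.st_size;
}
```

For the test file from the question, whose content is " test", this returns 5, since st_size counts every byte regardless of content.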
OP can do it the easy way, since "max. size of a file in my project is 4MB".
Rather than using strlen(), use the return value from fread(). strlen() stops at the first null character, so it may report too small a value. @Sami Kuhmonen: Also, we do not know whether the data read contains any null character at all, so it may not be a string. Append a null character (and allocate +1) if the code needs to use the data as a string. But in that case, I'd expect the file to be opened in text mode.
Note that many OSes do not actually use allocated memory until it is written to.
See: Why is malloc not "using up" the memory on my computer?
fp = fopen(file_path, "rb");
if (fp) {
    #define MAX_FILE_SIZE 4194304
    char *buf = malloc(MAX_FILE_SIZE);
    if (buf) {
        size_t numread = fread(buf, sizeof *buf, MAX_FILE_SIZE, fp);
        // shrink if desired; skipped when numread is 0, since realloc(buf, 0) may free buf
        if (numread > 0) {
            char *tmp = realloc(buf, numread);
            if (tmp) {
                buf = tmp;
            }
        }
        // Use buf with numread char
        free(buf);
    }
    fclose(fp);
}
Note: Reading the entire file into memory may not be the best idea to begin with.

Ansi C: Attempting to count total chars in a file

The task is simple but I am having an issue with the method returning 0.
This means my loop:
int getCharCount(FILE *fp) {
    int c;
    int i = 0;
    while( (c = fgetc(fp)) != EOF) {
        i++;
        printf("Loop ran");
    }
    return i;
}
Did not run.
In my testing I found that the loop never runs, because "Loop ran" never prints. I am new to C and not sure whether I am doing something wrong when trying to count the chars in the file.
I should mention that the file is opened in "wb+" mode and that a few long methods edit the file. Essentially, before getCharCount() is used, the text file is cleared of all previous data, then the user enters a number of 44-char strings one at a time, and I use the method above to calculate the total number of chars, which is then used to navigate in my display-data method.
I am at a library working on this, so if anything extra needs to be posted or clarified I will try to respond quickly. I don't want to post my whole code because that would create a chance to cheat, and I need to get this done myself.
Thanks ahead.
If you write to the file and then call your method on the same file handle, the file position is already at the end of the file, so it will see EOF immediately. We would need to see more of the code to be sure, I think.
So, you could rewind the file handle at the start of your function.
Or you could just call ftell to find out your offset in the file, which is the same as the number of bytes written if you truncate, write and do not rewind.
Why do you have to read all the bytes one by one to count them? It is much easier to call fseek(fp, 0, SEEK_END) to jump to the end of the file and get the current position (the file length) with ftell(fp).
You are opening with the mode w+, which will truncate the file. From the fopen man page:
w+ Open for reading and writing. The file is created if it does not exist, otherwise it is truncated. The stream is positioned at the beginning of the file.
You will want to open it with rb+ instead of wb+ if you are reading from an existing file and still want to write to it. If you have already written to it and want to read what was written, you will need to seek back to the start of the file.
// Both of these seek to the start
rewind(fp);
fseek(fp, 0, SEEK_SET);
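A sketch of a counting function that does not depend on where earlier writes left the file position. The name count_from_start is hypothetical; the function assumes fp is open for both reading and writing (for example "wb+" or "rb+") and restores the original position before returning.

```c
#include <stdio.h>

/* Count the bytes in `fp` from the start of the file, then seek back to
 * wherever the stream was positioned on entry. Returns -1 on error. */
long count_from_start(FILE *fp)
{
    long saved = ftell(fp);
    /* Seeking also satisfies the rule that a positioning call (or fflush)
     * must come between a write and a following read on the same stream. */
    if (saved < 0 || fseek(fp, 0, SEEK_SET) != 0)
        return -1;

    long n = 0;
    while (fgetc(fp) != EOF)
        n++;

    /* fseek() clears the end-of-file indicator, so writing can resume. */
    if (ferror(fp) || fseek(fp, saved, SEEK_SET) != 0)
        return -1;
    return n;
}
```

Called right after the user's strings have been written, this returns the total written so far, and later writes continue from the same place as before.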
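If the file is not open, you could use:
off_t getFileSize(const char *filepath) {
    struct stat fileStat;
    if (stat(filepath, &fileStat) != 0)
        return -1; /* stat failed, e.g. the file does not exist */
    return fileStat.st_size;
}
If the file is open (use fileno(fp) to get the descriptor of a FILE *):
off_t getFileSize(int fd) {
    struct stat fileStat;
    if (fstat(fd, &fileStat) != 0)
        return -1; /* fstat failed, e.g. fd is not a valid descriptor */
    return fileStat.st_size;
}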

Resources