I want to start at a particular offset, write and then at some point read from that same offset and confirm I read what I wrote to the file. The file is in binary. I'm confident I need to use fseek but I may want to call a write several times prior to reading the whole file.
write(unsigned long long offset, void* pvsrc, unsigned long long nbytes)
pFile = fopen("D:\\myfile.bin","wb");
fseek(pFile,offset,SEEK_SET);
WriteResult = fwrite (pvsrc, 1, nbytes, pFile);
fclose(pFile);
Does anyone see any issue with this?
You can use ftell() to tell you your current position in the file, perform some writes and then fseek() to the starting position you got with ftell() to read the data that you wrote.
If you are on Linux, you can use the pread() and pwrite() functions: http://linux.die.net/man/2/pread
If you are on Windows, you can use the ReadFile() and WriteFile() functions with the lpOverlapped parameter: http://msdn.microsoft.com/en-us/library/aa365467%28VS.85%29.aspx
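For the fseek()-based approach from the question, here is a minimal sketch of writing at an offset and reading the same bytes back. Note that "wb" truncates the file on every fopen() and does not allow reading, so the sketch opens the file in update mode instead. The helper names (open_for_update, write_at, read_at) are made up for illustration, and the offset is a plain long because that is what fseek() accepts; offsets beyond LONG_MAX would need a platform-specific call.

```c
#include <stdio.h>

/* Hypothetical helper: open for reading and writing without truncating.
   Falls back to creating the file if it cannot be opened. */
FILE *open_for_update(const char *path)
{
    FILE *fp = fopen(path, "rb+");   /* keeps existing contents */
    if (fp == NULL)
        fp = fopen(path, "wb+");     /* assumed missing: create it */
    return fp;
}

/* Write nbytes from src at byte `offset`. Returns 0 on success, -1 on failure. */
int write_at(FILE *fp, long offset, const void *src, size_t nbytes)
{
    if (fseek(fp, offset, SEEK_SET) != 0)
        return -1;
    if (fwrite(src, 1, nbytes, fp) != nbytes)
        return -1;
    return fflush(fp) == 0 ? 0 : -1;
}

/* Read nbytes starting at byte `offset` into dst. Returns 0 on success, -1 on failure. */
int read_at(FILE *fp, long offset, void *dst, size_t nbytes)
{
    /* A positioning call such as fseek() is required between a write
       and a following read on an update stream. */
    if (fseek(fp, offset, SEEK_SET) != 0)
        return -1;
    return fread(dst, 1, nbytes, fp) == nbytes ? 0 : -1;
}
```

You can call write_at() as many times as you like on the same FILE * before calling read_at(); the file is only closed once, when you are done with it.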
This question is somewhat related to another question I posted before (I'm posting it here as a new question because I didn't want to interrupt the ongoing discussion in the other thread). I'm trying to write my own read() implementation using (among others) the pread system call. Note that this is not intended to provide better performance whatsoever, but just to check how pread() could be used to achieve the same as read().
I'm intercepting the read() execution, and forward it to my own read_handler(). There I will extract the parameters from their respective registers and execute my mapping.
static volatile int read_handler(void){
register void *rdi asm ("rdi");
register void *rsi asm ("rsi");
register void *rdx asm ("rdx");
int fd = rdi;
char *buf = rsi;
int count = rdx;
printf("[OWN_READ] Got read(%d, %p, %d)\n", fd,buf, count);
int current_offset = lseek(fd, 0, SEEK_CUR);
int pread_return = pread(fd,buf,count,current_offset);
//set fd offset, as pread will NOT change it automatically
lseek(fd, current_offset+pread_return, SEEK_CUR);
return pread_return;
}
So, first I extract the parameters from the registers, which works fine. Next I get the current file offset (as pread will not change the offset according to the man page). I call pread using the same parameters, in addition to the current offset defined by the file itself. Next, I update the file offset using lseek and return the number of bytes.
As stated in my previous question, a read() call within fseek will somehow break my read implementation. I had a function to get the current file size, which was as follows:
long get_file_size(const char *name)
{
FILE *temp_file = fopen(name, "rb");
if (temp_file == NULL)
{
return -1;
}
fseek(temp_file, 0L, SEEK_END);
long sz = ftell(temp_file);
fclose(temp_file);
return sz;
}
When I execute this function using the reference read() implementation, it returns the correct file size. With my implementation, on the other hand, the get_file_size function always returns double the actual size.
My understanding of read() and pread() is that the main difference (regarding the functionality to read from a file) is that pread will not update the file offset, which I added in my implementation using lseek. Thus (ignoring corner cases for now), my implementation should work just like the reference implementation.
Additionally (if it is helpful), this "get_file_size" function works just fine:
unsigned long get_file_size()
{
const char *text_file = "/tmp/syscalltest/tests/truncate_test.txt";
int fd = open(text_file, O_RDONLY);
unsigned int size = lseek(fd, 0 , SEEK_END);
printf("[FILE SIZE] Current file size: %d\n",size );
close(fd);
return size;
}
My goal with this test is to check whether pread() can produce the same output as a read() call. I tried to verify this by running various test files, including one that uses the above get_file_size. In my previous question it was hinted that my read() implementation may have an error that makes get_file_size() produce wrong results. I'm trying to understand whether I have to check my read() implementation, or whether the error is caused by the undefined behaviour of fseek in the function. Thanks for any hints on which part might cause an error here.
I have this sample code where I'm trying to implement for my operating systems assignment a program that copies the contents of an input file to an output file. I'm only allowed to use POSIX system calls, stdio is forbidden.
I've thought about storing the contents in a buffer but in my implementation I must know the file descriptor contents size. I googled a little and found about
off_t fsize;
fsize = lseek (input, 0, SEEK_END);
But in this case my file descriptor (input) gets messed up and I can't rewind it to the start. I played around with the parameters but I can't figure a way to rewind it back to the first character in the file after using lseek. That's the only thing I need, having that I can loop byte by byte and copy all the contents of input to output.
My code is here; it's very short, in case any of you want to take a look:
https://github.com/lucas-sartm/OSAssignments/blob/master/copymachine.c
I figured it out by trial and error. All that was needed was to read the documentation and take a look at read() return values... This loop solved the issue.
while (read (input, &content, sizeof(content)) > 0){ //copies sizeof(content) bytes per pass until read() returns 0 (end of file)
write (output, &content, sizeof(content));
}
In C, we can find the size of a file using the fseek() function, like this:
if (fseek(fp, 0L, SEEK_END) != 0)
{
// Handle repositioning error
}
So my question is: is using fseek() and ftell() a recommended method for computing the size of a file?
If you're on Linux or some other UNIX like system, what you want is the stat function:
struct stat statbuf;
int rval;
rval = stat(path_to_file, &statbuf);
if (rval == -1) {
perror("stat failed");
} else {
printf("file size = %lld\n", (long long)statbuf.st_size);
}
On Windows under MSVC, you can use _stati64:
struct _stati64 statbuf;
int rval;
rval = _stati64(path_to_file, &statbuf);
if (rval == -1) {
perror("_stati64 failed");
} else {
printf("file size = %lld\n", (long long)statbuf.st_size);
}
Unlike using fseek, this method doesn't involve opening the file or seeking through it. It just reads the file metadata.
The fseek()/ftell() approach works sometimes.
if (fseek(fp, 0L, SEEK_END) == 0) {
printf("Size: %ld\n", ftell(fp));
}
Problems.
If the file size exceeds about LONG_MAX, long int ftell(FILE *stream) response is problematic.
If the file is opened in text mode, the return value from ftell() may not correspond to the file length. "For a text stream, its file position indicator contains unspecified information," C11dr §7.21.9.4 2
If the file is opened in binary mode, fseek(fp, 0L, SEEK_END) is not well defined. "Setting the file position indicator to end-of-file, as with fseek(file, 0, SEEK_END), has undefined behavior for a binary stream (because of possible trailing null characters) or for any stream with state-dependent encoding that does not assuredly end in the initial shift state." C11dr footnote 268. @Evert This most often applies to earlier platforms than today, but it is still part of the spec.
If the file is a stream like a serial input or stdin, fseek(file, 0, SEEK_END) makes little sense.
The usual solution to finding file size is a non-portable, platform-specific one. Example good answer: @dbush.
Note: If code attempts to allocate memory based on file size, the memory available can easily be exceeded by the file size.
Due to these issues, I do not recommend this approach.
Typically the problem should be re-worked to not need to find the file size, but to grow the data as more input is processed.
LL disclaimer: Note that C spec footnotes are informative and so not necessarily normative.
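To illustrate the "grow the data as more input is processed" suggestion, here is a rough sketch that reads a stream to the end without ever asking for its size. The function name read_all, the 4096-byte starting capacity, and the doubling strategy are my assumptions for the example, not something the standard prescribes.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read the rest of `fp` into a malloc'd buffer, doubling capacity as needed.
   On success returns the buffer (caller frees) and stores the length in *len.
   Returns NULL on read or allocation failure. */
char *read_all(FILE *fp, size_t *len)
{
    size_t cap = 4096, used = 0;
    char *data = malloc(cap);
    if (data == NULL)
        return NULL;

    for (;;) {
        used += fread(data + used, 1, cap - used, fp);
        if (used < cap)
            break;                          /* short read: end of file or error */
        if (cap > (size_t)-1 / 2) {         /* doubling would overflow size_t */
            free(data);
            return NULL;
        }
        char *bigger = realloc(data, cap * 2);
        if (bigger == NULL) {
            free(data);
            return NULL;
        }
        data = bigger;
        cap *= 2;
    }
    if (ferror(fp)) {
        free(data);
        return NULL;
    }
    *len = used;
    return data;
}
```

This also works for stdin, pipes, and serial input, where seeking to the end makes little sense.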
The best method in my opinion is fstat(): https://linux.die.net/man/2/fstat
Well, you can estimate the size of a file in several ways:
You can read(2) the file from the beginning to the end, and the number of chars read is the size of the file. This is a tedious way of getting the size of a file, as you have to read the whole file to get the size. But if the operating system doesn't allow you to position the file pointer arbitrarily, then this is the only way to get the file size.
Or you can move the pointer to the end of the file. This is the lseek(2) call you showed in the question. lseek returns the resulting offset, so seeking to the end yields the size directly, but be careful: the pointer is then left at the end of the file, so you have to make a second call to move it back to where you want to continue reading.
Or you can use the stat(2) system call, that will tell you all the administrative information of the file, like the owner, group, permissions, size, number of blocks the file occupies in the disk, disk this file belongs to, number of directory entries pointing to it, etc. This allows you to get all this information with only one syscall.
Other methods you mention (like the ftell(3) stdio library call) will also work (with the same drawback of needing two calls to set and then restore the file pointer), but they involve a library that you are probably not using for anything else. It would be cumbersome to get a FILE * pointer (e.g. with fdopen(3)) from an int file descriptor just to be able to use the ftell(3) function on it, and then fclose(3) it again.
I am simulating multithreaded file downloading. My strategy is that each thread receives small file pieces (each piece has a piece_length, a piece_size, and a start_writing_pos).
Each thread then writes to the same buffer. How do I implement this? Do I have to worry about collisions?
//=================== follow up ============//
So I wrote a small demo as follows:
#include <stdio.h>
int main(){
char* tempfilePath = "./testing";
FILE *fp;
fp = fopen(tempfilePath,"w+");//w+: for reading and writing
fseek( fp, 9, SEEK_SET);//starting in 10-th bytes
fwrite("----------",sizeof(char), 10, fp);
fclose(fp);
}
Before running it, I set the content of "./testing" to "XXXXXXXXXXXXXXXXXXX". After running the code above I get "^@^@^@^@^@^@^@^@^@----------". I wonder where the problem is.
Do what most torrent clients do. Create a file with the final size having an extension .part. Then allocate non-overlapping parts of the file to each thread, who shall have their own file-descriptors. Thus collisions are avoided. Rename to final name when finished.
Unless you want to use a mutex, you can't use fwrite(). FILE *-based IO using fopen(), fwrite(), and all related functions simply isn't reentrant: the FILE uses a SINGLE buffer, a SINGLE offset, etc.
You can't even use open() and lseek()/write() - multiple threads will interfere with each other, modifying the one offset an open file descriptor has.
Use open() to open the file, and use pwrite() to write data to exact offsets.
pwrite() man page:
pwrite() writes up to count bytes from the buffer starting at buf to
the file descriptor fd at offset offset. The file offset is not
changed.
The task is simple but I am having an issue with the method returning 0.
This means my loop:
int getCharCount(FILE *fp) {
int c;
int i = 0;
while( (c = fgetc(fp)) != EOF) {
i++;
printf("Loop ran");
}
return i;
}
Did not run.
In my testing I found that the loop never runs, because "Loop ran" never prints. I am new to C and not sure if I am doing something wrong when trying to count the chars in the file.
I feel like I should mention that the file is opened with "wb+" mode and that there are a few long methods that edit the file. Essentially before using this getCharCount() method the text file is cleared of all previous data, then user enters a number of 44 char length strings at a time and I use this method I just posted to calculate the total number of chars which will be used to navigate my display data method.
I am in a library working on this so if anything extra is needed to be posted or if anything needs to be clarified I will try to be quick with my responses. I don't want to post my whole code because there would be a chance to cheat and I need to get this done myself.
Thanks ahead.
If you write to the file and then call your method on the same file handle, the file handle is already at the end of the file so it will see EOF immediately. We would need to see more of the code to be sure I think.
So, you could rewind the file handle at the start of your function.
Or you could just call ftell to find out your offset in the file, which is the same as the number of bytes written if you truncate, write and do not rewind.
Why do you have to read all the bytes one by one to count them? It is much easier to call fseek(fp, 0, SEEK_END) to jump to the end of the file and get the current position (the file length) with ftell(fp).
You are opening with the mode w+ which will truncate the file. From the fopen man page:
w+ Open for reading and writing. The file is created if it does not exist, otherwise it is truncated. The stream is positioned at the beginning of the file.
You will want to open it with rb+ instead of wb+ if you are reading from an existing file and still want to write to it. If you have already written to it and want to read what was written, you will need to seek back to the start of the file.
// Both of these seek to the start
rewind(fp);
fseek(fp, 0, SEEK_SET);
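Applied to the counting function from the question, that could look something like the sketch below. This is only my guess at how it fits into the rest of the program, which we have not seen; count_chars is a made-up name. It rewinds before counting and then restores the position the caller had, so later writes still go where they would have gone.

```c
#include <stdio.h>

/* Count the bytes in an open, seekable stream by reading from the start.
   The caller's file position is restored afterwards. Returns -1 on error. */
long count_chars(FILE *fp)
{
    long saved = ftell(fp);          /* where the caller was, e.g. the end after writing */
    if (saved < 0)
        return -1;
    rewind(fp);                      /* back to the start; also clears EOF/error flags */

    long n = 0;
    while (fgetc(fp) != EOF)
        n++;

    if (ferror(fp) || fseek(fp, saved, SEEK_SET) != 0)
        return -1;
    return n;
}
```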
If the file is not open, you could use:
off_t getFileSize(const char *filepath) {
struct stat fileStat;
if (stat(filepath, &fileStat) != 0) {
return -1; // stat failed; errno says why
}
return fileStat.st_size;
}
If the file is open:
off_t getFileSize(int fd) {
struct stat fileStat;
if (fstat(fd, &fileStat) != 0) {
return -1; // fstat failed; errno says why
}
return fileStat.st_size;
}