How does read() remember the last offset of a file? - c

How does the read function know the next position to read from in a file?
Or how can I write a function that remembers the last offset of a file even after I open another file and change its file descriptor?
Is there a way to know that a file descriptor is already opened and pointed to a file?
Like this:
int main()
{
int fd;
char *file;
file = (char *)malloc(sizeof(char) * 32);
fd = open("file.txt", O_RDONLY);
read_file(fd, *file); /* reading the first line from file.txt */
fd = open("file1.txt", O_RDONLY);
read_file(fd, *file); /* reading the first line from file1.txt */
fd = open("file.txt", O_RDONLY);
read_file(fd, *file); /* Now it should read the second line from file file.txt, how can I manage to do that*/
close(fd);
return (0);
}

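The current offset in the file is maintained by the kernel; the file descriptor serves as the key to all the information associated with the open file. Every call to open() starts again at offset 0, which is why reopening file.txt makes you read its first line again.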
If you need to open and read from two files at the same time, they should of course not share the file descriptor. Just use two, one per file.
const int fd1 = open("file.txt", O_RDONLY);
const int fd2 = open("file1.txt", O_RDONLY);
The treatment of char *file in your code makes no sense, but at this point you can mix accesses to fd1 and fd2.
Remember to close the files when you're done:
close(fd2);
close(fd1);
In real code you would also check that the open-calls succeeded, before trying to do I/O from the file(s), of course.

Is there a way to know that a file descriptor is already opened and
pointed to a file?
If you can lseek(fd, 0, SEEK_CUR) successfully, that means that fd is opened and seekable (so probably a file, but remember that "file" includes directories and device files as well as regular files).
If it returns (off_t)-1 and errno == EBADF, then the descriptor is not open; if it returns (off_t)-1 and errno == ESPIPE, then it's a pipe, socket, or FIFO.
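The check described above can be sketched as a small function. The enum and the name classify_fd() are made up for this illustration:

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical classification of a descriptor, based only on lseek(). */
enum fd_kind { FD_SEEKABLE, FD_NOT_SEEKABLE, FD_NOT_OPEN, FD_UNKNOWN };

enum fd_kind classify_fd(int fd)
{
    /* Seeking 0 bytes from the current position does not move the offset. */
    if (lseek(fd, 0, SEEK_CUR) != (off_t)-1)
        return FD_SEEKABLE;         /* open and seekable */
    if (errno == EBADF)
        return FD_NOT_OPEN;         /* not an open descriptor */
    if (errno == ESPIPE)
        return FD_NOT_SEEKABLE;     /* pipe, socket, or FIFO */
    return FD_UNKNOWN;              /* some other error */
}
```

This only tells you what kind of object the descriptor refers to, not which file it is; for that you would compare fstat() results.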

Related

fstat st_size says file size is 0 when unlinking file

Somewhere online I've seen a technique to immediately unlink a temporary file after opening it, since you will discard it anyway. As per my understanding of the man-page for unlink, the file will not be unlinked as long as it has an open file descriptor.
When I execute the following piece of code:
char *file, *command;
asprintf(&file, "/tmp/tempXXXXXX");
int fd = mkstemp(file);
unlink(file);
asprintf(&command, "some_command > %s", file); /*Writes n bytes in temp file*/
FILE *f = popen(command, "re");
pclose(f);
struct stat sbuf;
fstat(fd, &sbuf);
printf("%lld\n", (long long)sbuf.st_size);
close(fd);
free(file);
free(command);
exit(0);
It will print a size of 0. However, if I comment unlink(file), it will show the correct file size. I would expect both scenarios to show the correct size of the file, since unlink should wait till no processes have the file open anymore. What am I missing here?
You're missing the fact that the file referred to by your fd is not the same file as that created by your call to popen().
In a POSIX-like shell, some_command > some_file will create some_file if it does not already exist, otherwise it will truncate some_file.
Your call to popen() invokes a shell, which in turn creates or truncates the output file before invoking some_command as per POSIX.
Since you have unlinked some_file before the call to popen(), the file is created anew: that is, the output file set up by your popen() shell is allocated under a different inode than the (now anonymous) file created by your previous call to mkstemp().
You can see that the files are different if you compare st_ino values from your fstat() (by fd) and a separate call to stat() (by name) after the popen().

Use opened file descriptor

I've got 2 programs. In one I'm opening a file for reading, and in the other I'm trying to read from that file:
first program
fd = open("test.txt",O_RDONLY);
printf("%d\n",fd);
while(1);
second program :
char sir[100];
int fd, result;
scanf("%d",&fd);
result = read(fd,((void*)sir), 2);
In the second program I read the number that the first program printed. Why doesn't this code work, and how can I read from that file descriptor in program nr 2?
File descriptors are unique to the process. Also you need to write to the file descriptor.
There are several problems:
fd = open("test.txt", O_RDONLY) opens the file for reading. If I understand what you are trying to do, you want to create the file and open it for writing. That would be fd = open("test.txt", O_CREAT | O_WRONLY, 0644); the third argument (the file mode) is required whenever O_CREAT is used.
printf("%d\n",fd) displays the value of the file handle. While that might be useful for debugging, I think you want something which writes to the file handle. write (fd, "hello", 5) is closer to that.
while(1); is an infinite CPU busy loop. This is not very useful.
Similarly the second program has issues:
scanf("%d",&fd) is peculiar: the number it reads is only meaningful inside the first process. I think you want to open the file just written, no? Instead, use fd = open("test.txt", O_RDONLY).
With that corrected, the program can then read the content into the buffer with read(fd, sir, sizeof sir).
See if those help you.
If you are not primarily working with binary data in the files, the fopen() and fprintf() library calls are more convenient.
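A minimal sketch of the corrected approach, in which each program opens the file by path itself instead of passing a descriptor number around. The function names write_text() and read_text() are invented for this example:

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* What the first program could do: create the file and write to it.
 * Returns 0 on success, -1 on failure. */
int write_text(const char *path, const char *text)
{
    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd == -1)
        return -1;
    size_t len = strlen(text);
    ssize_t n = write(fd, text, len);
    close(fd);
    return (n == (ssize_t)len) ? 0 : -1;
}

/* What the second program could do: open the same path and read from it.
 * Stores at most cap - 1 bytes plus a terminating NUL in buf.
 * Returns the number of bytes read, or -1 on failure. */
ssize_t read_text(const char *path, char *buf, size_t cap)
{
    if (cap == 0)
        return -1;
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    ssize_t n = read(fd, buf, cap - 1);
    close(fd);
    if (n >= 0)
        buf[n] = '\0';
    return n;
}
```

The descriptor numbers returned by open() in the two programs may even be equal, but they are unrelated: each process has its own descriptor table.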

Read a file in a while loop line-by-line while another function updates it

I am reading a file in a while loop from start to end:
FILE *file;
file = fopen(path_to_file, "r");
char *line = NULL;
size_t len = 0;
while (getline(&line, &len, file) > 0) {
delete_line_from_file(line);
}
fclose(file);
The function delete_line_from_file() removes the line passed to it from the file. It reads in the whole file via open(fd, O_RDONLY | O_CLOEXEC) + read() + close(), then removes the line from the buffer and writes the whole buffer to the same file via open(fd, O_WRONLY | O_TRUNC | O_CLOEXEC) + write() + close(). The read() is locked in an advisory read-lock via struct flock lk and the write() is locked in an advisory write-lock.
When I read the file there are lines that get missed which has something to do with me reading the file from start to finish in one loop while writing to it. If I read in the whole file and go through the buffer line-by-line no lines get missed. (This is my preferred solution so far.) There are also no mistakes made when truncating and writing the file. The missed lines are still in the file after the loop finishes.
Can I make sure that my while-loop does not miss a line and cleanly empties the file? The file needs to be emptied line-by-line. It cannot be just truncated.
Here is one possible solution I had in mind. Mirror the file via fstat(file, &fbuf) and check its size with if (fbuf.st_size != 0) fseek(file, 0, SEEK_SET); but that seems inefficient.
So is the goal to empty the file completely?
Why don't you open the file as such:
open("file", O_TRUNC | O_WRONLY);
This will open the file with truncation. Alternatively, and perhaps a better solution, you can do this:
fopen("file", "w");
fopen() with the "w" mode truncates the existing file "file" to zero length, or creates it if it does not exist.
Use fseek and ftell inside your loop.
Two processes modifying the same file is a recipe for problems. Maybe you need to use a pipe(2).
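The approach the question itself prefers (read the whole file once, then walk the buffer line by line) can be sketched as follows. Both function names are hypothetical, the file is assumed to be text without embedded NUL bytes, and the lines are passed to the callback without their trailing newline, unlike getline():

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read the whole file into a NUL-terminated heap buffer.
 * Returns NULL on failure; the caller frees the result. */
char *read_snapshot(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return NULL;

    size_t cap = 4096, len = 0;
    char *buf = malloc(cap);
    while (buf != NULL) {
        len += fread(buf + len, 1, cap - 1 - len, f);
        if (len < cap - 1)
            break;                      /* short read: end of file or error */
        char *bigger = realloc(buf, cap * 2);
        if (bigger == NULL) {
            free(buf);
            buf = NULL;
            break;
        }
        buf = bigger;
        cap *= 2;
    }
    if (buf != NULL) {
        if (ferror(f)) {
            free(buf);
            buf = NULL;
        } else {
            buf[len] = '\0';
        }
    }
    fclose(f);
    return buf;
}

/* Split the snapshot in place and call handle(line) for each line.
 * handle may be NULL, in which case the lines are only counted.
 * Returns the number of lines seen. */
size_t for_each_line(char *snapshot, void (*handle)(const char *line))
{
    size_t count = 0;
    char *line = snapshot;

    while (*line != '\0') {
        char *end = strchr(line, '\n');
        if (end != NULL)
            *end = '\0';                /* terminate this line */
        if (handle != NULL)
            handle(line);
        count++;
        if (end == NULL)
            break;                      /* last line had no newline */
        line = end + 1;
    }
    return count;
}
```

Because the loop iterates over a private copy, rewriting the file from inside the callback cannot make the loop skip lines.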

dup() and cache flush

I am a C beginner, trying to use dup(), I wrote a program to test this function, the result is a little different from what I expected.
Code:
// unistd.h, dup() test
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
extern void dup_test();
int main() {
dup_test();
}
// dup()test
void dup_test() {
// open a file
FILE *f = fopen("/tmp/a.txt", "w+");
int fd = fileno(f);
printf("original file descriptor:\t%d\n",fd);
// duplicate file descriptor of an opened file,
int fd_dup = dup(fd);
printf("duplicated file descriptor:\t%d\n",fd_dup);
FILE *f_dup = fdopen(fd_dup, "w+");
// write to file, use the duplicated file descriptor,
fputs("hello\n", f_dup);
fflush(f_dup);
// close duplicated file descriptor,
fclose(f_dup);
close(fd_dup);
// allocate memory
int maxSize = 1024; // 1 kb
char *buf = malloc(maxSize);
// move to beginning of file,
rewind(f);
// read from file, use the original file descriptor,
fgets(buf, maxSize, f);
printf("%s", buf);
// close original file descriptor,
fclose(f);
// free memory
free(buf);
}
The program tries to write via the duplicated fd, then closes the duplicated fd, then tries to read via the original fd.
I expected that closing the duplicated fd would flush the I/O cache automatically, but it does not: if I remove the fflush() call, the original fd cannot read the content written through the duplicated fd, which is already closed.
My question is:
Does this mean that closing the duplicated fd does not flush automatically?
#Edit:
I am sorry, my mistake, I found the reason, in my initial program it has:
close(fd_dup);
but don't have:
fclose(f_dup);
After using fclose(f_dup); to replace close(fd_dup);, it works.
So the stream on the duplicated fd is flushed automatically if it is closed in the proper way. write() and close() are a pair, fwrite() and fclose() are a pair, and they should not be mixed.
Actually, in the code I could have used the duplicated fd_dup directly with write() and close(), and there is no need to create a new FILE at all.
So, the code could simply be:
// unistd.h, dup() test
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#define BUF_SIZE 1024 // 1 kb
extern void dup_test();
int main() {
dup_test();
}
// dup()test
void dup_test() {
// open a file
FILE *f = fopen("/tmp/a.txt", "w+");
int fd = fileno(f);
printf("original file descriptor:\t%d\n",fd);
// duplicate file descriptor of an opened file,
int fd_dup = dup(fd);
printf("duplicated file descriptor:\t%d\n",fd_dup);
// write to file, use the duplicated file descriptor,
write(fd_dup, "hello\n", 6); // 6 = length of "hello\n"; passing BUF_SIZE would read past the end of the string
// close duplicated file descriptor,
close(fd_dup);
// allocate memory
char *buf = malloc(BUF_SIZE);
// move to beginning of file,
rewind(f);
// read from file, use the original file descriptor,
fgets(buf, BUF_SIZE, f);
printf("%s", buf);
// close original file descriptor,
fclose(f);
// free memory
free(buf);
}
From dup man pages:
After a successful return from one of these system calls, the old and new file descriptors may be used interchangeably. They refer to the same open file description (see open(2)) and thus share file offset and file status flags; for example, if the file offset is modified by using lseek(2) on one of the descriptors, the offset is also changed for the other.
It means the file offset advances for both descriptors when you write through the duplicate, so a read through the original descriptor continues after the written data unless you move the offset back first (the rewind(f) call in your code does that).
fdopen() wraps fd_dup in a second stdio stream with its own user-space buffer. The descriptor underneath is still a duplicate and still shares the offset, but fputs() only fills that buffer; the data reaches the kernel when the stream is flushed or closed with fclose(). That's why you can read the data after flushing and closing the stream.
If you skip the flush and close only the descriptor with close(fd_dup), whatever is still in the stream's buffer has not been written, so there is nothing for the original descriptor to read.
After all, if you need an I/O buffer, you might be using the wrong mechanism; check named pipes and the other buffering mechanisms the OS provides.
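The shared offset described in the man page excerpt can be observed directly. A small sketch, with a made-up function name, that moves the offset through a duplicate and queries it through the original:

```c
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if moving the offset through a dup()ed descriptor is visible
 * through the original descriptor, 0 if not, -1 on error.
 * Side effect: on success the offset of fd is left at 3. */
int dup_shares_offset(int fd)
{
    int copy = dup(fd);
    if (copy == -1)
        return -1;

    off_t moved = lseek(copy, 3, SEEK_SET);   /* move via the duplicate */
    off_t seen = lseek(fd, 0, SEEK_CUR);      /* query via the original */
    close(copy);

    if (moved == (off_t)-1 || seen == (off_t)-1)
        return -1;
    return seen == moved;
}
```

For a descriptor that refers to a regular file this returns 1. Two descriptors obtained from two separate open() calls would not behave this way, since each call creates its own open file description.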
I cannot really understand your problem. I tested it under Microsoft VC2008 (had to replace unistd.h with io.h) and gcc 4.2.1.
I commented out fflush(f_dup), because it is of no use right before fclose(), and close(fd_dup);, because the file descriptor was already closed by fclose(f_dup), so the piece of code now looks like:
// write to file, use the duplicated file descriptor,
fputs("hello\n", f_dup);
// fflush(f_dup);
// close duplicated file descriptor,
fclose(f_dup);
// close(fd_dup);
And it works correctly. I get on both systems :
original file descriptor: 3
duplicated file descriptor: 4
hello

open() what happens if I open twice the same file?

If I open the same file twice, will it give an error, or will it create two different file descriptors? For example
a = open("teste.txt", O_RDONLY);
b = open("teste.txt", O_RDONLY);
To complement what @Drew McGowen has said,
In fact, in this case, when you call open() twice on the same file, you get two different file descriptors pointing to the same file (same physical file). BUT, the two file descriptors are independent in that they point to two different open file descriptions (an open file description is an entry in the system-wide table of open files).
So read operations performed later on the two file descriptors are independent. If you call read() to read one byte from the first descriptor and then call read() again on the second descriptor, both read the same byte, since their offsets are not shared.
#include <fcntl.h>
#include <unistd.h>
int main()
{
    // have the kernel open two connections to the file alphabet.txt, which contains the letters from a to z
    int fd1 = open("alphabet.txt", O_RDONLY);
    int fd2 = open("alphabet.txt", O_RDONLY);
    // read a char and write it to stdout, alternating between connections fd1 and fd2
    while (1)
    {
        char c;
        if (read(fd1, &c, 1) != 1) break;
        write(1, &c, 1);
        if (read(fd2, &c, 1) != 1) break;
        write(1, &c, 1);
    }
    return 0;
}
This will output aabbccddeeffgghhiijjkkllmmnnooppqqrrssttuuvvwwxxyyzz
See here for details, especially the examples programs at the end.
In this case, since you're opening the same file twice as read-only, you will get two different file descriptors that refer to the same file. See the man page for open for more details.
It will create a new entry in the file descriptor table and the file table. But both the entries (old and new) in the file table will point to the same entry in the inode table.

Resources