I am trying to understand how a file position indicator moves after I read some bytes from a file. I have a file named "filename.dat" with a single line: "abcdefghijklmnopqrstuvwxyz" (without the quotes).
#include <stdio.h>
#include <unistd.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int main () {
int fd = open("filename.dat", O_RDONLY);
FILE* fp = fdopen(fd,"r");
printf("ftell(fp): %ld, errno = %d\n", ftell(fp), errno);
fseek(fp, 5, SEEK_SET); // advance 5 bytes from beginning of file
printf("file position indicator: %ld, errno = %d\n", ftell(fp), errno);
char buffer[100];
int result = read(fd, buffer, 4); // read 4 bytes
printf("result = %d, buffer = %s, errno = %d\n", result, buffer, errno);
printf("file position indicator: %ld, errno = %d\n", ftell(fp), errno);
fseek(fp, 3, SEEK_CUR); // advance 3 bytes
printf("file position indicator: %ld, errno = %d\n", ftell(fp), errno);
result = read(fd, buffer, 6); // read 6 bytes
printf("result = %d, buffer = %s, errno = %d\n", result, buffer, errno);
printf("file position indicator: %ld\n", ftell(fp));
close(fd);
return 0;
}
ftell(fp): 0, errno = 0
file position indicator: 5, errno = 0
result = 4, buffer = fghi, errno = 0
file position indicator: 5, errno = 0
file position indicator: 8, errno = 0
result = 0, buffer = fghi, errno = 0
file position indicator: 8
I do not understand why the second time I try to use read, I get no bytes from the file. Also, why does the file position indicator not move when I read contents from the file using read? On the second fseek, advancing 4 bytes instead of 3 did not work either. Any suggestions?
Use fseek and fread, or lseek and read, but do not mix the two APIs; it won't work.
A FILE* has its own internal buffer. fseek may or may not move the internal buffer pointer only. It is not guaranteed that the real file position indicator (one that lseek is responsible for) changes, and if it does, it is not known by how much.
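As a minimal sketch of the "lseek and read only" route, the helper below never creates a FILE* at all, so there is no second position to get out of sync. The name read_at_offset and its signature are made up for illustration; it assumes a POSIX system.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to `count` bytes starting at byte `offset`
 * of the file at `path`, using only descriptor-level calls.
 * `out` must have room for count + 1 bytes.
 * Returns the number of bytes read, or -1 on error. */
ssize_t read_at_offset(const char *path, off_t offset, char *out, size_t count)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    ssize_t n = -1;
    if (lseek(fd, offset, SEEK_SET) != (off_t)-1) {
        n = read(fd, out, count);
        if (n >= 0)
            out[n] = '\0';   /* read() does not terminate the buffer */
    }
    close(fd);
    return n;
}
```

With the 26-letter file from the question, read_at_offset("filename.dat", 5, buf, 4) should leave "fghi" in buf.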
The first thing to note is that the read calls read chars into a raw buffer, but printf() expects to be handed null-terminated strings for %s parameters. You're not explicitly adding a null-terminator byte, so your program might print garbage after the first 4 bytes of the buffer. You've been lucky: the uninitialized buffer happened to contain zeroes (nothing guarantees this for a local array), so you haven't noticed this problem.
The essential problem in this program is that you're mixing high-level buffering FILE * calls with low level file descriptor calls, which will result in unpredictable behavior. FILE structs contain a buffer and a couple of ints to support more efficient and convenient access to the file behind a file descriptor.
Basically all f*() calls (fopen(), fread(), fseek(), fwrite()) expect that all I/O is going to be done by f*() calls using a FILE struct, so the buffer and index values in the FILE struct will be valid. The low-level calls (read(), write(), open(), close(), lseek()) completely ignore the FILE struct.
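For comparison, here is a rough sketch of the same kind of access done purely through the FILE struct, so that ftell() stays truthful. The helper name fread_at_offset is an assumption of mine, not something from the original program.

```c
#include <stdio.h>

/* Hypothetical helper: seek to `offset`, read up to `count` bytes through
 * the FILE stream, and report the stream position afterwards.
 * `out` must have room for count + 1 bytes.
 * Returns the position reported by ftell(), or -1 on error. */
long fread_at_offset(FILE *fp, long offset, char *out, size_t count)
{
    if (fseek(fp, offset, SEEK_SET) != 0)
        return -1;

    size_t n = fread(out, 1, count, fp);
    if (n < count && ferror(fp))
        return -1;

    out[n] = '\0';        /* fread() does not terminate the buffer either */
    return ftell(fp);     /* consistent, because only f*() calls were used */
}
```

Seeking to 5 and reading 4 bytes this way should report position 9, which is what the original program expected to see.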
I ran strace on your program. The strace utility logs all system calls made by a process. I've omitted all the uninteresting stuff up to your open() call.
Here is your open call:
open("filename.dat", O_RDONLY) = 3
Here is where fdopen() is happening. The brk calls are evidence of memory allocation, presumably for something like malloc(sizeof(FILE)).
fcntl64(3, F_GETFL) = 0 (flags O_RDONLY)
brk(0) = 0x83ea000
brk(0x840b000) = 0x840b000
fstat64(3, {st_mode=S_IFREG|0644, st_size=26, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7728000
This might be the effect of ftell() or just the last part of fdopen, I'm not sure.
_llseek(3, 0, [0], SEEK_CUR) = 0
Here is the first printf.
write(1, "ftell(fp): 0, errno = 0\n", 24) = 24
Here is the first fseek, which has decided the easiest way to get to position 5 in the file is to just read in 5 bytes and ignore them.
_llseek(3, 0, [0], SEEK_SET) = 0
read(3, "abcde", 5) = 5
Here is the second printf. Notice that there is no evidence of a ftell() call. ftell() uses the information in the FILE struct, which claims to be accurate, so no system call is necessary.
write(1, "file position indicator: 5, errn"..., 38) = 38
Here is your read() call. Now, the operating system file handle is at position 9, but the FILE struct thinks it is still at position 5.
read(3, "fghi", 4) = 4
The third and fourth printf, with ftell indicating position 5.
write(1, "result = 4, buffer = fghi, errno"..., 37) = 37
write(1, "file position indicator: 5, errn"..., 38) = 38
Here is the fseek(fp, 3, SEEK_CUR) call. fseek() has decided to just SEEK_SET back to the beginning of the file and read the whole thing into the FILE struct's 4k buffer. Since it "knew" it was at position 5, it "knows" it must be at position 8 now. Since the file is only 26 bytes long, the os file position is now at eof.
_llseek(3, 0, [0], SEEK_SET) = 0
read(3, "abcdefghijklmnopqrstuvwxyz", 4096) = 26
The fifth printf.
write(1, "file position indicator: 8, errn"..., 38) = 38
Here is your second read() call. Since the file handle is at eof, it reads 0 bytes. It doesn't change anything in your buffer.
read(3, "", 6) = 0
The sixth and seventh printf calls.
write(1, "result = 0, buffer = fghi, errno"..., 37) = 37
write(1, "file position indicator: 8\n", 27) = 27
Your close() call, and the process exit.
close(3) = 0
exit_group(0) = ?
I use fwrite to write a buffer of 100,000 chars to file, but the return value from fwrite is only 4096.
char buffer [100000];
memset(buffer,0x00,100000);
FILE *f = fopen("<path>","ab+");
if(f ==NULL)
{
return;
}
int ret =fwrite(buffer,1,100000,f);
printf("ret = %d",ret);
ret = 4096
Why does this code write only 4096 bytes instead of 100,000?
This is an embedded Linux system.
From man pages:
RETURN VALUE
[...] If an error occurs, or the end of the file is
reached, the return value is a short item count (or zero).
In this case you should use ferror(f) to see if the file handle is in error condition. Also, you can zero the errno before the call, and print the error message with perror:
errno = 0;
int ret = fwrite(buffer, 1, 100000, f);
if (ret != 100000) {
printf("Stream error indication %d", ferror(f));
perror("Short item count");
}
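A slightly fuller sketch of the same check, wrapped in a helper. The name write_fully is hypothetical. Note that fwrite() returns a size_t, so the sketch keeps the count in a size_t and prints it with %zu rather than storing it in an int.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write `len` bytes to `f` and flush them.
 * Returns 0 on success, -1 after reporting the problem on stderr. */
int write_fully(FILE *f, const void *data, size_t len)
{
    errno = 0;
    size_t written = fwrite(data, 1, len, f);
    if (written != len) {
        fprintf(stderr, "short write: %zu of %zu bytes, ferror=%d (%s)\n",
                written, len, ferror(f), strerror(errno));
        return -1;
    }
    /* Buffered data can still fail when it is pushed to the device. */
    if (fflush(f) != 0) {
        fprintf(stderr, "fflush failed: %s\n", strerror(errno));
        return -1;
    }
    return 0;
}
```

On an embedded target, a failure here often points at a full or read-only filesystem, but that is only a guess until the reported error is inspected.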
The maximum length for any write call is defined by SSIZE_MAX, which can be found in limits.h.
This holds for every POSIX-compliant system. SSIZE_MAX may differ for different implementations.
Try out the following example to determine the maximum write length on your system:
#include <stdio.h>
#include <limits.h>
int main(void) {
printf("SSIZE_MAX : %ld\n", SSIZE_MAX);
return 0;
}
It prints SSIZE_MAX : 9223372036854775807 on my machine.
EDIT: You can also try to locate your limits.h file, but compiling might be the easier option
I got confused about lseek()'s return value (which is the new file offset).
I have a text file named prwtest. Its contents are the letters a to z.
The code I wrote is the following:
1 #include <unistd.h>
2 #include <fcntl.h>
3 #include <stdlib.h>
4 #include <stdio.h>
5 #include <string.h>
6
7 #define BUF 50
8
9 int main(void)
10 {
11 char buf1[]="abcdefghijklmnopqrstuvwxyz";
12 char buf2[BUF];
13 int fd;
14 int read_cnt;
15 off_t cur_offset;
16
17 fd=openat(AT_FDCWD, "prwtest", O_CREAT | O_RDWR | O_APPEND);
18 cur_offset=lseek(fd, 0, SEEK_CUR);
19 //pwrite(fd, buf1, strlen(buf1), 0);
20 //write(fd, buf1, strlen(buf1));
21 //cur_offset=lseek(fd, 0, SEEK_END);
22
23 printf("current offset of file prwtest: %d \n", cur_offset);
24
25 exit(0);
26 }
On line 17, I use the flag O_APPEND, so I expected prwtest's current file offset to be taken from the i-node's current file size (which is 26).
On line 18, I call lseek() with SEEK_CUR and an offset of 0.
But the resulting cur_offset is 0. (I assumed it would be 26, because SEEK_CUR indicates the current file offset.)
However, SEEK_END gives me what I expected: cur_offset is 26.
Why does lseek(fd, 0, SEEK_CUR); return 0, not 26?
O_APPEND takes effect before each write to the file, not when the file is opened.
Therefore, right after the open the position remains 0, but once you invoke write, lseek with SEEK_CUR will return the value you expect.
Your issue is with open() / openat(), not lseek().
From the open() manpage, emphasis mine:
O_APPEND
The file is opened in append mode. Before each write(2), the file offset is positioned at the end of the file, as if with lseek(2).
Since you don't write to the file, the offset is never repositioned to the end of the file.
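A small sketch that makes the repositioning visible: it records the offset right after opening with O_APPEND and again after one write(). The helper name append_offsets and the scratch path under /tmp are assumptions for illustration only.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: creates a scratch file holding 26 bytes, reopens it
 * with O_APPEND, and stores the offset seen before and after one write().
 * Returns 0 on success, -1 on error. */
int append_offsets(off_t *before, off_t *after)
{
    char path[] = "/tmp/append_demo_XXXXXX";   /* assumes /tmp is writable */
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    if (write(fd, "abcdefghijklmnopqrstuvwxyz", 26) != 26) {
        close(fd);
        unlink(path);
        return -1;
    }
    close(fd);

    fd = open(path, O_RDWR | O_APPEND);
    if (fd == -1) {
        unlink(path);
        return -1;
    }
    *before = lseek(fd, 0, SEEK_CUR);   /* nothing written yet */
    ssize_t w = write(fd, "!", 1);      /* O_APPEND seeks to the end first */
    *after = lseek(fd, 0, SEEK_CUR);    /* now just past the appended byte */
    close(fd);
    unlink(path);
    return (w == 1) ? 0 : -1;
}
```

With a 26-byte file, *before should come back as 0 and *after as 27.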
While we're at it, you should be closing the file before ending the program...
Actually, while we're really at it, if you do #include <stdio.h> already, why not use the standard's file I/O (fopen() / fseek() / fwrite()) instead of the POSIX-specific stuff? ;-)
Also, on Linux, your commented-out code won't work as you expect. This code:
17 fd=openat(AT_FDCWD, "prwtest", O_CREAT | O_RDWR | O_APPEND);
18 cur_offset=lseek(fd, 0, SEEK_CUR);
19 pwrite(fd, buf1, strlen(buf1), 0);
will fail to write the contents of buf1 at the beginning of the file (unless the file is empty).
pwrite on Linux is buggy:
BUGS
POSIX requires that opening a file with the O_APPEND flag should
have no effect on the location at which pwrite() writes data.
However, on Linux, if a file is opened with O_APPEND, pwrite()
appends data to the end of the file, regardless of the value of
offset.
Let's have a look at this Hello World program
#include <stdio.h>
int main(int argc, char ** argv) {
printf("Hello, World!");
const char* sFile = "/dev/stdout"; // or /proc/self/fd/0
const char* sMode = "w";
FILE * output = fopen(sFile, sMode);
//fflush(stdout) /* forces `correct` order */
putc('!', output); // Use output or stdout from stdio.h
return 0;
}
When compiled to use the output stream, the output is:
!Hello, World!
When compiled to use the stdout stream provided by stdio.h, the output is as expected:
Hello, World!!
I figure that when calling putc with the latter, it will print directly to stdout, and when using the stream opened on /dev/stdout it will open a pipe and print into that. I'm not certain though.
The behavior is even more interesting, as it does not overwrite the first character of the 'Hello' but rather pushes itself into the first position of the line buffer in front of the already pushed string.
From a logical point of view this is quite unexpected.
Can anyone explain what exactly is going on here?
I'm using
cc (Ubuntu 4.8.2-19ubuntu1) 4.8.2 and a 3.13.0-52 linux kernel compiled w/ gcc 4.8.2
Edit: I've done an strace of both programs, and here is the important part:
The output (fopen("/dev/stdout", "w")) without fflush(stdout) scenario produces:
...
open("/dev/stdout", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
fstat(3, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f62f21e9000
write(3, "!", 1!) = 1
write(1, "Hello, World!", 13Hello, World!) = 13
exit_group(0) = ?
using fflush(stdout) produces and enforces correct order:
...
open("/dev/stdout", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
write(1, "Hello, World!", 13Hello, World!) = 13
fstat(3, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5ad4557000
write(3, "!", 1!) = 1
exit_group(0) = ?
The stdout (from stdio.h) scenario produces:
...
write(1, "Hello, World!!", 14Hello, World!!) = 14
exit_group(0) = ?
So it seems the FILE * output = fopen("/dev/stdout") stream uses a different file descriptor than stdout
Also as it seems printf uses stdout
Thus in the third scenario the string is assembled before it is pushed on the stream.
Both of the streams (stdout and output) are buffered. Nothing actually gets written until they are flushed. Since you are not explicitly flushing them, nor arranging for them to be automatically flushed, they are only being auto-flushed when they are closed.
You are also not explicitly closing them, so they're being closed (and flushed) by the standard library when the program exits. And as William Pursell correctly pointed out, the order in which the buffered I/O streams are closed is not specified.
Look at fflush(3), fclose(3), and setbuf(3) manual pages for more information on controlling when and how your output gets flushed.
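To illustrate the point about explicit flushing, here is a sketch with two buffered streams that share one file offset (via dup()), standing in for stdout and the second stream from the question. The helper name ordered_output and the scratch path under /tmp are hypothetical. Because each stream is flushed right after it is written, the bytes land in program order.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical demo: writes through two buffered streams that share one
 * file offset, flushing each explicitly, then copies the file content
 * into `out` (NUL-terminated; `outsize` must be at least 1).
 * Returns 0 on success, -1 on error. */
int ordered_output(char *out, size_t outsize)
{
    char path[] = "/tmp/flush_demo_XXXXXX";    /* assumes /tmp is writable */
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    int fd2 = dup(fd);                         /* shares the file offset */
    if (fd2 == -1) {
        close(fd);
        unlink(path);
        return -1;
    }

    FILE *first = fdopen(fd, "w");
    FILE *second = fdopen(fd2, "w");
    if (first == NULL || second == NULL) {
        if (first) fclose(first); else close(fd);
        if (second) fclose(second); else close(fd2);
        unlink(path);
        return -1;
    }

    fputs("Hello, World!", first);
    fflush(first);                             /* push it out before going on */
    fputc('!', second);
    fflush(second);
    fclose(first);
    fclose(second);

    FILE *in = fopen(path, "r");
    if (in == NULL) {
        unlink(path);
        return -1;
    }
    size_t n = fread(out, 1, outsize - 1, in);
    out[n] = '\0';
    fclose(in);
    unlink(path);
    return 0;
}
```

Dropping the first fflush() would leave the order to whenever each buffer happens to be written out, which is the effect seen in the question.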
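/dev/stdout is not the same as /proc/self/fd/0: descriptor 0 is standard input, while standard output is descriptor 1. On Linux, /dev/stdout is normally a symbolic link to /proc/self/fd/1 rather than a character device of its own, so opening it gives you a new descriptor for whatever standard output currently refers to. If the link does not exist, an attempt to fopen it with the "w" option will try to create an actual regular file in that directory, which fails unless you have enough privileges. If what you want is the terminal itself, the character device you are looking for is /dev/tty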
In C language, stdout is an initialized global variable of type FILE * which points to the standard output file, that is, the file whose descriptor is 1. stdout only exists in the C namespace and doesn't relate to any actual file named "stdout"
This code allocates just 10 bytes for line buffering and reads a file whose first line is 45 bytes long.
When it runs, the program reads all 45 bytes, not just the first 10 as I expected. So what did setvbuf actually do?
#include <stdio.h>
#include <stdlib.h>
int main() {
FILE *tst;
tst = fopen("x.log","r");
char *buff = malloc(10); //Just 10 characters
setvbuf(tst, buff, _IOLBF, 10);
char *mystring = malloc(45); //First line of x.log is 45 characters exactly
if ( fgets (mystring, 45, tst) != NULL )
puts(mystring);
fclose (tst);
free(buff);
}
fgets() uses getc() internally, to read one character at a time, until it reads a newline or reaches the limit it was given. Whenever getc() reaches the end of the I/O buffer, it will refill the buffer, so it's not limited to the size set by setvbuf(). Setting a small buffer size just makes it less efficient, but doesn't change the amount of data that can be read.
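A quick way to check this is a sketch like the one below: it gives the stream a 10-byte buffer and still gets a 45-byte line back from fgets(). The helper name line_length_with_small_buffer and the scratch path under /tmp are assumptions; the buffer is static so that it outlives the stream.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical demo: reads the first line of a scratch file through a
 * stream whose buffer is only 10 bytes. Returns the length of the line
 * that fgets() delivered, or -1 on error. */
int line_length_with_small_buffer(void)
{
    static char iobuf[10];                     /* must outlive the stream */
    char line[64];
    char path[] = "/tmp/setvbuf_demo_XXXXXX";  /* assumes /tmp is writable */
    int result = -1;

    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    FILE *out = fdopen(fd, "w");
    if (out == NULL) {
        close(fd);
        unlink(path);
        return -1;
    }
    /* 44 characters plus '\n' makes a 45-byte line */
    fputs("0123456789" "0123456789" "0123456789" "0123456789" "abcd\n", out);
    fclose(out);

    FILE *in = fopen(path, "r");
    if (in != NULL) {
        /* setvbuf() has to come before any other operation on the stream */
        if (setvbuf(in, iobuf, _IOFBF, sizeof iobuf) == 0 &&
            fgets(line, sizeof line, in) != NULL)
            result = (int)strlen(line);
        fclose(in);
    }
    unlink(path);
    return result;
}
```

The small buffer only changes how many read() calls happen underneath, not how much fgets() hands back.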
setvbuf associates with the file a buffer of size 10.
Why reads all 45 bytes?
You are reading the file with fgets and asking for up to 45 bytes. Since the file buffer has size 10 (with the _IOLBF option), the read is done this way:
Read bytes 0-9 from the file into mystring
Read bytes 10-19...
Read bytes 20-29...
Read bytes 30-39...
Read bytes 40-44 and stop at \n
instead of using a default buffer and probably reading all the bytes at once (without refilling the buffer).
The difference without and with setvbuf is:
open("file.txt", O_RDONLY) = 3
read(3, "Hickanckdnckncksckscskcnnacnckad"..., 4096) = 65
Vs
open("file.txt", O_RDONLY) = 3
read(3, "Hickanckdn", 10) = 10
read(3, "cknckscksc", 10) = 10
read(3, "skcnnacnck", 10) = 10
read(3, "adjsnccnad", 10) = 10
read(3, "ncacsjcadj", 10) = 10
By default the stream is read in chunks of 4096 bytes at a time. setvbuf is the way to control how large the buffer is while reading.
setvbuf(tst, buff, _IOLBF, csize * 10);
You set the buffering mode to _IOLBF (line buffered). According to the
man page of setvbuf, "...when it is line buffered characters are saved up until a newline..."
setvbuf(tst, buff, _IOFBF, csize * 10);
This should buffer only 10 bytes, but fgets would still read the full line.
Buffering means that, internally, the data is read into buff; when buff is full (or, in line-buffered mode, also when a newline is read), the buffer is overwritten.
I used this code to read a file, but the fread function always returns 0. What is my mistake?
FILE *file = fopen(pathToSourceFile, "rb");
if(file!=NULL)
{
char aByte[50000];
int ret = fread(aByte, sizeof(aByte), 1, file);
if(ret != 0)
{
// execution never gets here
fseek(file, 0, SEEK_SET);
fwrite(aByte, ret, 1, file);
}
}
fclose(file);
Are you sure that your file is at least 50000 bytes long? Otherwise you could try:
fread(aByte,1, sizeof(aByte), file);
ferror() will tell when something is wrong.
You can print the actual error message using perror().
You can't fwrite to a file open in rb mode.
Your statement that ret is always zero is false. If you had properly instrumented your code, you'd not be making false claims:
#include <stdio.h>
#include <stdlib.h>
int main() {
FILE *file = fopen("junk.dat", "rb");
if(file!=NULL)
{
char aByte[50000];
int ret = fread(aByte, sizeof(aByte), 1, file);
fprintf(stderr, "fread returned %d\n", ret);
if(ret != 0)
{
int fs = fseek(file, 0, SEEK_SET);
if(fs == -1) {
perror("fseek");
exit(1);
}
fs = fwrite(aByte, ret, 1, file);
if(fs != ret) {
perror("fwrite");
exit(1);
}
}
}
fclose(file);
return 0;
}
Yields:
fread returned 1
fwrite: Bad file descriptor
when run.
In my case I wanted to read a file of size 6553600 bytes (an mp3), and it was returning 0 bytes read. It drove me crazy, till, I tried to manually hardcode 30 bytes, and it did read 30 bytes.
I started playing with it and see how much can it read, and it turns out that it can read exactly 262144 (2^18) bytes, if you ask it to read 262145 bytes it reads 0.
Conclusion: at least with this function you can't load the whole file in one go.
In case someone else runs into this: I just ran into a similar issue. The 2nd argument to fread should be the size of each element in the buffer. In OP's code it is the size of the whole array (sizeof(aByte) is 50000), so the entire buffer counts as a single item.
This should work provided aByte has at least 1 element (it reads a single element; raise the third argument to read more):
int ret = fread(aByte, sizeof(aByte[0]), 1, file);
Please check man fread
man fread(3)
size_t fread(void *restrict ptr, size_t size, size_t nmemb, FILE *restrict stream);
RETURN VALUE
On success, fread() and fwrite() return the number of items read
or written. This number equals the number of bytes transferred
only when size is 1. If an error occurs, or the end of the file
is reached, the return value is a short item count (or zero).
As your file is smaller than 50000 bytes, that is, the size of one item, the count of items read is 0.
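The two ways of phrasing the request can be put side by side in a short sketch. Both helper names below are made up for illustration; they only differ in which argument carries the buffer size.

```c
#include <stdio.h>

/* Hypothetical helper: element size 1, so the return value is the number
 * of bytes actually read (possibly fewer than `cap`). */
size_t read_bytes(FILE *f, void *buf, size_t cap)
{
    return fread(buf, 1, cap, f);
}

/* Hypothetical helper: one element of `cap` bytes, so the return value is
 * 1 only if the whole element could be read, and 0 otherwise. */
size_t read_one_item(FILE *f, void *buf, size_t cap)
{
    return fread(buf, cap, 1, f);
}
```

On a 26-byte file with a 50000-byte buffer, read_one_item() should return 0 while read_bytes() should return 26.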
In my case,
fseek(rFile, 0, SEEK_END);
iTotalSize = ftell(rFile);
fseek(rFile, 0, SEEK_SET); // <-- I wrote SEEK_END, not SEEK_SET
so it read 0 bytes (nothing at all).
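For reference, a sketch of the whole size-then-read sequence with the rewind done correctly. The helper name load_file is hypothetical, and it assumes a regular file opened in binary mode, where ftell() after seeking to the end gives the size in bytes.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: loads the whole file behind `f` into a freshly
 * allocated, NUL-terminated buffer and stores its size in *size.
 * Returns NULL on error; the caller frees the result. */
char *load_file(FILE *f, long *size)
{
    if (fseek(f, 0, SEEK_END) != 0)
        return NULL;
    long total = ftell(f);
    if (total < 0)
        return NULL;
    if (fseek(f, 0, SEEK_SET) != 0)   /* back to the start: SEEK_SET, not SEEK_END */
        return NULL;

    char *data = malloc((size_t)total + 1);
    if (data == NULL)
        return NULL;

    size_t n = fread(data, 1, (size_t)total, f);
    if (n != (size_t)total) {
        free(data);
        return NULL;
    }
    data[n] = '\0';
    *size = total;
    return data;
}
```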
Did you:
#include <stdio.h>
If not, and if you compile without -Wall, an older C compiler can silently assume an implicit declaration of fread() in which the size arguments are int rather than size_t, which can mess up the function call. Your code snippet doesn't show any #include statements, so please make sure you're including everything that you're using.