How to use read and write past BUFSIZ in C

For an assignment, I'm supposed to create two methods: method one will read() and write() the input file to an empty output file, one byte at a time (slowly).
The other method will instead use char buf[BUFSIZ]; where BUFSIZ is from <stdio.h>. We are supposed to read() and write() BUFSIZ bytes at a time, which should make things a lot faster.
The input file we test each method on is just a linux dictionary (/dict/linux.words).
I've correctly implemented method one, where I call read() and write() on one character at a time, copying the input file to the output file. Although it's very slow, it at least copies everything over.
My code for this looks like this:
// assume we have a valid, opened fd_in and fd_out file.
char buf;
while (read(fd_in, &buf, 1) != 0)
    write(fd_out, &buf, 1);
For method two however, where I use BUFSIZ, I am not able to transfer every single entry into the output file. It fails in the z entries, and doesn't write anymore.
So, my first try:
// assume we have a valid, opened fd_in and fd_out file
char buf[BUFSIZ];
while (read(fd_in, buf, BUFSIZ) != 0)
    write(fd_out, buf, BUFSIZ);
doesn't work.
I understand that read() will return either the number of bytes read or 0 if it is at the end of the file. The problem I'm having is understanding how to compare the return value of read() to BUFSIZ, then loop around and have read() pick up where it left off until I reach the real end of the file.

Since your file will most likely not be an exact multiple of BUFSIZ in size, you need to check the actual number of bytes read so that the last (short) block is written correctly, e.g.
char buf[BUFSIZ];
ssize_t n;
while ((n = read(fd_in, buf, BUFSIZ)) > 0)
    write(fd_out, buf, n);

This code:
// assume we have a valid, opened fd_in and fd_out file
char buf[BUFSIZ];
while (read(fd_in, buf, BUFSIZ) != 0)
    write(fd_out, buf, BUFSIZ);
leaves much to be desired: it does not handle a short read at the end of the file, it does not handle errors, etc.
A much better code block would be:
// assume we have a valid, opened fd_in and fd_out file
char buf[BUFSIZ];
int readCount;  // number of bytes read
int writeCount; // number of bytes written

while (1)
{
    if( 0 > (readCount = read(fd_in, buf, BUFSIZ) ) )
    {   // then, read failed
        perror( "read failed" );
        exit( EXIT_FAILURE );
    }
    // implied else, read successful

    if( 0 == readCount )
    {   // then assume end of file
        break;  // exit while loop
    }
    // implied else, readCount > 0

    if( readCount != (writeCount = write( fd_out, buf, readCount ) ) )
    {   // then, error occurred
        perror( "write failed" );
        exit( EXIT_FAILURE );
    }
    // implied else, write successful
} // end while
Note: I did not include statements to close the input/output files before each call to exit(); however, that does need to be added.
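For example, a minimal sketch of that cleanup for the read-failure branch (assuming fd_in and fd_out are the only descriptors that need closing) could look like this; the other exit path would get the same treatment:
if( 0 > (readCount = read(fd_in, buf, BUFSIZ) ) )
{   // then, read failed
    perror( "read failed" );
    close( fd_in );   // added for illustration: release both descriptors
    close( fd_out );  // before exiting, so nothing is leaked
    exit( EXIT_FAILURE );
}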

Related

Read from stdin and fill buffer until EOF

I need to read from stdin and fill a buffer of _SC_PAGESIZE (from sysconf()) until stdin is at EOF. This program is supposed to be a wc clone, so I would be expecting something like the contents of a regular file to be passed in. If the buffer isn't big enough for stdin, then I have to keep filling it, process it for information, then clear it and continue to fill the buffer again from the file offset in stdin. I'm just having a problem with tracking the EOF of stdin, and I'm getting an infinite loop. Here's what I have:
int pSize = sysconf(_SC_PAGESIZE);
char *buf = calloc(pSize, sizeof(char));
assert(buf);
if (argc < 2) {
    int fd;
    while (!feof(stdin)) {
        fd = read(0, buf, pSize);
        if (fd == -1)
            err_sys("Error reading from file\n");
        lseek(0, pSize, SEEK_CUR);
        if (fd == -1)
            err_sys("Error reading from file\n");
        processBuffer(buf);
        buf = calloc(pSize, sizeof(char));
    }
    close(fd);
}
I'm assuming the problem has to do with the test condition (while (!feof(stdin))), so I guess what I need is a correct test condition to exit the loop.
Why are you using a low-level read instead of opening a FILE *stream and using fgets (or POSIX getline)? Further, you leak memory every time you call:
buf = calloc(pSize, sizeof(char));
in your loop, because you overwrite the address contained in buf, losing the reference to the previous block of memory and making it impossible to free.
Instead, allocate your buffer once, then continually fill the buffer passing the filled buffer to processBuffer. You can even use a ternary operator to determine whether to open a file or just read from stdin, e.g.
int pSize = sysconf(_SC_PAGESIZE);
char *buf = calloc(pSize, sizeof(char));
assert(buf);

FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) {
    perror ("fopen failed");
    return 1;
}

while (fgets (buf, pSize, fp))
    processBuffer(buf);       /* do not call calloc again -- memory leak */

if (fp != stdin) fclose (fp); /* close file if not stdin */
(note: since fgets will read a line-at-a-time, you can simply count the number of iterations to obtain your line count -- provided your lines are not longer than _SC_PAGESIZE)
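As a rough sketch of that line-counting idea (the lines counter below is not in the original answer, just an illustration, and it assumes no line exceeds pSize):
size_t lines = 0;
while (fgets (buf, pSize, fp)) {
    processBuffer(buf);   /* reuse the same buffer every iteration */
    lines++;              /* fgets returns once per line */
}
printf ("%zu lines\n", lines);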
If you want to use exact pSize chunks, then you can use fread instead of fgets. The only effect would be to reduce the number of calls to processBuffer marginally, but it is completely up to you. The only thing that you would need to do is change the while (...) line to:
while (fread (buf, (size_t)pSize, 1, fp) == 1)
    processBuffer(buf);  /* do not call calloc again -- memory leak */

if (ferror(fp))          /* you can test ferror to ensure the loop exited on EOF */
    perror ("fread ended in error");
(note: like read, fread does not ensure a nul-terminated string in buf, so make sure that processBuffer does not pass buf to a function expecting a string, or iterate over buf expecting to find a nul-terminating character at the end.)
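If processBuffer does expect a string, one way to guard against that (not part of the original answer; it assumes the buffer is allocated one byte larger than pSize) is a sketch like:
char *buf = calloc(pSize + 1, 1);            /* one spare byte for the terminator */
size_t got;
while ((got = fread (buf, 1, pSize, fp)) > 0) {
    buf[got] = '\0';                         /* make the chunk a valid C string */
    processBuffer(buf);
}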
Look things over and let me know if you have further questions.
You can write the loop like
int n;
do {
    n = read(0, buf, pSize);
    // process it
} while (n > 0);
Remember that EOF is just one exit condition, and some other error condition may occur before it. The true check for whether the loop should keep running is a healthy return value from read. Also note that whether while (n > 0) is enough depends on where you are reading from. In the case of stdin it may be enough, but for sockets, for example, the condition may need to be written as while (n > 0 || errno == EAGAIN).
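As an illustration of that last point, here is a sketch only (sockfd is a hypothetical socket descriptor, not something from the question; buf and pSize are as above, and <errno.h> is needed for errno, EAGAIN and EINTR):
ssize_t n;
do {
    n = read(sockfd, buf, pSize);
    if (n > 0) {
        /* process exactly the n bytes now sitting in buf */
    }
} while (n > 0 || (n < 0 && (errno == EAGAIN || errno == EINTR)));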

Program gets stuck while trying to read a file using read() system call

Here is my code snippet:
int fd;
bufsize = 30;
char buf[bufsize];
char cmd[100] = "file.txt";
int newfd = 1;
if (fd = open(cmd,O_RDONLY) >=0){
    puts("wanna read");
    while (read(fd,&bin_buf,bufsize)==1){
        puts("reading");
        write(newfd,&bin_buf,bufsize);
    }
    close(fd);
}
So here the program prints "wanna read" but never prints "reading". I have also tried opening using nonblock flag, but no use. Can anybody help me? I must use open() and read() system calls only. Thanks.
Edit: I have made some clarifications in the code. Actually the newfd that I'm writing to is a socket descriptor, but I don't think that is important for this problem because it sticks on the read which is before the write.
The first problem is your if statement. You forgot to use enough parentheses, so if the open() works, the read tries to read from file descriptor 1, aka standard output. If that's your terminal (it probably is) on a Unix box, then that works — surprising though that may be; the program is waiting for you to type something.
Fix: use parentheses!
if ((fd = open(cmd, O_RDONLY)) >= 0)
With the extra parentheses, the assignment is done before, not after, the comparison.
I observe in passing that you don't show how you set cmd, but if you see the 'wanna read' message, it must be OK. You don't show how newfd is initialized; maybe that's 1 too.
You also have the issue with 'what the read() call returns'. You probably need:
int fd;
char buf[bufsize];
int newfd = 1;

if ((fd = open(cmd, O_RDONLY)) >= 0)
{
    puts("wanna read");
    int nbytes; // ssize_t if you prefer
    while ((nbytes = read(fd, buf, sizeof(buf))) > 0)
    {
        puts("reading");
        write(newfd, buf, nbytes);
    }
    close(fd);
}
You can demonstrate my primary observation by typing something ('Surprise', or 'Terminal file descriptors are often readable and writable' or something) with your original if but my loop body and then writing that somewhere.
Your read() call attempts to read bufsize bytes and returns the number of bytes actually read. Unless bufsize == 1, it is quite unlikely that read() will return exactly 1, so the loop body is almost always skipped and nothing gets written.
Also note that if (fd = open(cmd, O_RDONLY) >= 0) is incorrect and would set fd to 1, the file descriptor for standard output, if the file exists, so the subsequent read() reads from standard output instead of from the file.
Note that reading with the read system call is tricky in some environments, because a return value of -1 with errno set to EINTR means the call was interrupted and can be restarted.
Here is an improved version:
int catenate_file(const char *cmd, int newfd, size_t bufsize) {
    int fd;
    char buf[bufsize];

    if ((fd = open(cmd, O_RDONLY)) >= 0) {
        puts("wanna read");
        ssize_t nc;
        while ((nc = read(fd, buf, bufsize)) != 0) {
            if (nc < 0) {
                if (errno == EINTR)
                    continue;
                else
                    break;
            }
            printf("read %zd bytes\n", nc);
            write(newfd, buf, nc);
        }
        close(fd);
        return 0;
    }
    return -1;
}
read returns the number of bytes read from the file, which can be bufsize or less (less when the remainder of the file still to be read is shorter than bufsize).
In your case bufsize is most probably bigger than 1 and the file is bigger than 1 byte, so the condition of the while loop evaluates to false and execution skips straight to the point where the file is closed.
You should instead check whether there are more bytes to be read:
while( read(fd,&bin_buf,bufsize) > 0 ) {

Read all characters written in FIFO using open() system call

I have a FIFO pipe, which is opened at both ends using open() in O_RDWR mode. At the reading end, read() is not reading all the characters, but fewer than the number specified in the call. Is there a way to ensure that all characters are read?
Thanks in advance
if (p != NULL){
    printf("Inside p not null!\n");
    if((fd = open(p, O_RDWR)) < 0){
        perror("File could not be opened!");
        exit(EXIT_FAILURE);
    }
    //FILE *rdptr = fopen(p,"r");
    memset(buf,0,file_len);
    rc = read(fd, buf, file_len);
    printf("Number of bytes read: %d\n", rc);
    printf("Data detected on FIFO\n");
    buf[rc] = '\0';
    char base[20] = "output.txt";
    char name[20];
    sprintf(name, "%d%s", suffix, base);
    FILE *fptr = fopen(name,"ab+");
    fd_wr = open(name,O_WRONLY);
    charnum = write(fd_wr,buf,rc);
    kill(id_A, SIGKILL);
    //printf("No. of characters written: %d\n",charnum);
    //FD_CLR(fd, &rdfs);
}
First minor comment: you should use O_RDONLY to open the file: don't use more permissions than necessary.
Second issue: if file_len is very large, it's possible that the writer has blocked trying to write the entire chunk of data (since a FIFO can only hold a certain amount of unread data). If that's the case, then read will only read the data that has been stored in the FIFO, and will immediately return with whatever it could read. This will allow the writer to write more bytes, which will then be read in the next read.
You should loop reads, adjusting an offset into the buffer, until the entire file_len bytes are read. Something like this:
size_t offset = 0;
while (offset < file_len) {
    rc = read(fd, buf + offset, file_len - offset);
    if (rc < 0) {
        /* handle I/O error or something... */
    } else {
        offset += rc;
    }
}
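One caveat (not covered in the answer above): depending on how the FIFO ends up being opened, read may also return 0 once no more data will ever arrive, and the loop above would then spin forever because offset never advances. A defensive sketch that treats 0 as end-of-data and retries on EINTR might look like this; read_full is a hypothetical helper name:
#include <errno.h>
#include <unistd.h>

/* Read up to len bytes into buf, stopping early on end-of-data.
   Returns the number of bytes actually read, or -1 on error. */
ssize_t read_full(int fd, char *buf, size_t len)
{
    size_t offset = 0;
    while (offset < len) {
        ssize_t rc = read(fd, buf + offset, len - offset);
        if (rc < 0) {
            if (errno == EINTR)
                continue;   /* interrupted, just retry */
            return -1;      /* real I/O error */
        }
        if (rc == 0)
            break;          /* no data left: end of data */
        offset += rc;
    }
    return (ssize_t)offset;
}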

C read() and write() while loop

Here is example code:
int nbajt;
int buf[];
// we opened file and get descriptor fd
while ((nbajt = read(fd, buf, 5)) > 0) {
    if (write(fd2, buf, nlbajt) == -1) {
        perror("ERROR");
        exit(1);
    }
}
I don't understand how this works when we use a while loop. How many times will the loop run (the length of buf times?)? Will nbajt only ever hold the values 1 or 0, with the position in buf advancing one place after each loop step? So in the first step we have nbajt = 1, we take the first character of buf and write it to fd2? And at the end nbajt == 0, which means end of file? I would be grateful if someone could check whether I have this right. My main concern is how the value of nbajt changes, and how this approach differs from:
nbajt = read(fd, buf, 5);
write(fd2, buf, sizeof(a));
read() has the following POSIX prototype:
ssize_t read(int fd, void *buf, size_t nbyte);
It returns the number of bytes successfully read, 0 when EOF is reached, and -1 when there is an error.
Yes, nbajt == 0 means EOF here.
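For illustration, here is a minimal sketch of the loop with consistent names (the mismatch between nbajt and nlbajt in the question looks like a typo, and char buf[5] is an assumption, since int buf[] as written will not compile; fd and fd2 are assumed open as in the question):
char buf[5];
ssize_t nbajt;
// read up to 5 bytes per iteration; nbajt is how many bytes were actually read
while ((nbajt = read(fd, buf, sizeof buf)) > 0) {
    // write exactly the bytes just read, not sizeof(buf)
    if (write(fd2, buf, nbajt) == -1) {
        perror("ERROR");
        exit(1);
    }
}
// the loop runs roughly (file size / 5) times, with one short final chunk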

Strange behaviour of fgets

I've got a function here that blocks on fgets, but when I print something before fgets it doesn't block.
int exec_command(char *command, char *output_buf, int buf_size)
{
    FILE* pipe = NULL;
    char buffer[BUFFER_SIZE];
    char tmp[SMALL_BUFFER_SIZE];
    unsigned total_read = 0;

    pipe = popen( command, "r");
    if( !pipe )
    {
        //Error
        return -1;
    }

    memset(buffer, 0, sizeof(buffer));

    while( !feof(pipe) )
    {
        //printf("reading"); //If I uncomment this fgets doesnt block
        if( fgets(tmp, sizeof(tmp), pipe) != NULL )
        {
            // check that it'll fit:
            size_t len = strlen(tmp);
            if (total_read + len >= sizeof(buffer))
                break;
            // and add it to the big buffer if it fits
            strcat(buffer, tmp);
            total_read += len;
        }
    }

    //Is there anything to copy
    if ( total_read )
        strncpy (output_buf, buffer, buf_size);

    return pclose(pipe);
}
Is there anything wrong on my function above?
It's because whatever is writing to your pipe isn't flushing its output buffer. When you print, it ends up getting flushed (though that's not guaranteed to happen). When you don't print, the pipe isn't actually being written to, because the data is still sitting in the writer's output buffer until the buffer fills and gets written out. Call fflush on the stream (or fsync on the file descriptor) in the process that is writing to the pipe to make sure the buffered data is flushed.
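As a sketch of that flushing suggestion on the writer side (this assumes the command launched by popen is itself a C program writing to stdout, which the question does not say):
#include <stdio.h>

int main(void)
{
    printf("partial output with no newline");
    fflush(stdout);   /* push the buffered bytes into the pipe right away;  */
                      /* output to a pipe is fully buffered, so stdio would */
                      /* otherwise hold them until the buffer fills         */
    /* ... long-running work would otherwise delay the reader here ... */
    return 0;
}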
Add a newline to the print statement. The output stream is (probably) line-buffered, and doesn't actually print until a newline is encountered or you call fflush.
