Read from stdin and fill buffer until EOF - c

I need to read from stdin and fill a buffer of _SC_PAGESIZE (from sysconf()) until stdin is at EOF. This program is supposed to be a wc clone, so I would be expecting something like the contents of a regular file to be passed in. If the buffer isn't big enough for stdin, then I have to keep filling it, process it for information, then clear it and continue to fill the buffer again from the file offset in stdin. I'm just having a problem with tracking the EOF of stdin, and I'm getting an infinite loop. Here's what I have:
int pSize = sysconf(_SC_PAGESIZE);
char *buf = calloc(pSize, sizeof(char));
assert(buf);
if (argc < 2) {
    int fd;
    while (!feof(stdin)) {
        fd = read(0, buf, pSize);
        if (fd == -1)
            err_sys("Error reading from file\n");
        lseek(0, pSize, SEEK_CUR);
        if (fd == -1)
            err_sys("Error reading from file\n");
        processBuffer(buf);
        buf = calloc(pSize, sizeof(char));
    }
    close(fd);
}
I'm assuming the problem has to do with the test condition (while (!feof(stdin))), so I guess what I need is a correct test condition to exit the loop.

Why are you using a low-level read instead of opening a FILE *stream and using fgets (or POSIX getline)? Further, you leak memory every time you call:
buf = calloc(pSize, sizeof(char));
in your loop, because you overwrite the address stored in buf, losing the reference to the previous block of memory and making it impossible to free.
Instead, allocate your buffer once, then continually fill the buffer passing the filled buffer to processBuffer. You can even use a ternary operator to determine whether to open a file or just read from stdin, e.g.
int pSize = sysconf(_SC_PAGESIZE);
char *buf = calloc(pSize, sizeof(char));
assert(buf);
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) {
    perror ("fopen failed");
    return 1;
}
while (fgets (buf, pSize, fp))
    processBuffer(buf);          /* do not call calloc again -- memory leak */
if (fp != stdin) fclose (fp);    /* close file if not stdin */
(note: since fgets will read a line-at-a-time, you can simply count the number of iterations to obtain your line count -- provided your lines are not longer than _SC_PAGESIZE)
If you want to use exact pSize chunks, then you can use fread instead of fgets. The only effect would be to reduce the number of calls to processBuffer marginally, but it is completely up to you. The only thing that you would need to do is change the while (...) line to:
while (fread (buf, (size_t)pSize, 1, fp) == 1)
    processBuffer(buf);    /* do not call calloc again -- memory leak */
if (ferror(fp))            /* you can test ferror to ensure the loop exited on EOF */
    perror ("fread ended in error");
(note: like read, fread does not ensure a nul-terminated string in buf, so make sure that processBuffer does not pass buf to a function expecting a string, or iterate over buf expecting to find a nul-terminating character at the end.)
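If processBuffer does need a nul-terminated string, one hedged workaround (my own variation, not part of the answer above) is to read at most pSize - 1 bytes per chunk and terminate manually:
size_t n;
while ((n = fread(buf, 1, (size_t)pSize - 1, fp)) > 0) {
    buf[n] = '\0';        /* buf was allocated with pSize bytes, so this stays in bounds */
    processBuffer(buf);
}
if (ferror(fp))
    perror("fread ended in error");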
Look things over and let me know if you have further questions.

You can write the loop like
int n;
do {
    n = read(0, buf, pSize);
    // process it
} while (n > 0);
Remember that EOF is just one exit condition; an error condition may occur before you ever reach it. The real test for whether the loop should keep running is a healthy return value from read. Also note that whether while (n > 0) is enough depends on where you are reading from. For stdin it usually is, but for non-blocking sockets, for example, the condition can be written as while (n > 0 || errno == EAGAIN).
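As a hedged sketch of that idea (assuming <unistd.h> and <errno.h> are included, and that buf, pSize, processBuffer and err_sys come from the question; the EINTR/EAGAIN retry policy is my own addition):
ssize_t n;
for (;;) {
    n = read(0, buf, pSize);
    if (n > 0)
        processBuffer(buf);                     /* n valid bytes are in buf */
    else if (n == 0)
        break;                                  /* end of file */
    else if (errno == EINTR || errno == EAGAIN)
        continue;                               /* interrupted / no data yet -- retry */
    else
        err_sys("Error reading from file\n");   /* genuine error */
}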

Related

Infinite Loop While Reading and Writing from Files - C

I'm trying to read from an input file and ultimately reverse the buffer it reads from and write it to an output file. For now, though, I'm testing to see if a buffer I read from would even make it to the output file; so far it isn't, and I'm getting an infinite loop. The buffer should read in PAGESIZE bytes (from a call to sysconf()), and if the input file is larger than the buffer, then the buffer should be written to the output file first, then flushed and reused to get the rest of the input until the file descriptor returns 0 for no data left. This is what I have so far:
int fdRead = open(inputFile, O_RDONLY);
if (fdRead == -1)
    err_sys("Error reading input file '%s', check spelling?\n", inputFile);
int fdWrite = open(outputFile, O_WRONLY | O_CREAT | O_TRUNC, 0644); //overwrites file if it exists
if (fdWrite == -1)
    err_sys("Error creating output file '%s'\n", outputFile);
while (1) {
    read(fdRead, buf, size);
    if (fdRead == 0)
        break;
    if (fdRead == -1)
        err_sys("Error reading from input file '%s'\n", inputFile);
    lseek(fdRead, size, SEEK_CUR);
    if (fdRead == -1)
        err_sys("Error reading from input file '%s'\n", inputFile);
    write(fdWrite, buf, size);
    if (fdWrite == -1)
        err_sys("Error writing to output file '%s'\n", outputFile);
    lseek(fdWrite, size, SEEK_CUR);
    if (fdWrite == -1)
        err_sys("Error writing to output file '%s'\n", outputFile);
    memset(buf, '\0', size);
}
close(fdRead);
close(fdWrite);
I suppose that fdRead is never returning 0, and thus not exiting the loop. My question is how do I fix that?
P.S.: size is the result of the sysconf() call that gets the PAGESIZE, e.g.
size = sysconf(_SC_PAGESIZE);
And inputFile and outputFile are both char * and I've tested that they return and store good strings.
Transcribing comments into an answer.
You need to capture the return value from read() — you're ignoring it and testing whether the file descriptor is 0 or negative after the read().
So if I did something like int bytesRead then tested for if (bytesRead == 0) instead of if (fdRead == 0), then that should solve my problem?
Yes, you need something like:
int nbytes = read(fdRead, buf, size);
if (nbytes <= 0) break;
You should use the positive nbytes in the write() call; you might not get all size bytes filled by the read().
Testing the file descriptor after the read is wrong (it won't have changed under normal circumstances), and ignoring the value returned by read() is wrong, and not using the value returned by read() in the call to write() is wrong.
OK so it should be write(fdWrite, buf, nbytes)?
Yes, it should be
int obytes;
if ((obytes = write(fdWrite, buf, nbytes)) != nbytes)
{
    /* ...oops -- short write ... */
}
You get to decide what's the appropriate response to a short write (a positive value, but not the number of bytes you expected to write). If you're writing to a socket, it might be appropriate to try writing the unwritten section of the data again (that's why obytes is used to capture the number of bytes successfully written). If you're writing to a disk file, it probably means there's no space left, so there's no point (little point) in trying again. If obytes is negative, you've had a write error; there is usually little point in trying to continue.
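For the socket case, a minimal retry sketch (the helper name write_all is hypothetical, not part of the discussion above; it assumes <unistd.h> and <errno.h> are included):
/* Hypothetical helper: keep writing until all len bytes have been sent,
   retrying after short writes and EINTR.  Returns 0 on success, -1 on error. */
static int write_all(int fd, const char *buf, size_t len)
{
    size_t done = 0;
    while (done < len) {
        ssize_t w = write(fd, buf + done, len - done);
        if (w < 0) {
            if (errno == EINTR)
                continue;            /* interrupted -- try again */
            return -1;               /* genuine write error */
        }
        done += (size_t)w;           /* short write -- loop to send the rest */
    }
    return 0;
}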
This all has helped out a lot and it seems to be working okay. I've only run into one other problem. I tested this on a large file (Alice in Wonderland text file) and the output file is almost the whole thing, but cuts off the last two paragraphs or so. …
You need to review why you have the lseek() operations in the code. Neither of them should be necessary, and both are dubious. I think the lseek() on fdRead() means you miss chunks of text of size bytes each; I think the lseek() on fdWrite() means you insert size null bytes into the output file.
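Putting those comments together, a hedged sketch of the corrected loop (no lseek calls, since read() and write() already advance the file offsets; buf, size and err_sys are assumed from the question):
for (;;) {
    ssize_t nbytes = read(fdRead, buf, size);
    if (nbytes == 0)
        break;                                                        /* end of file */
    if (nbytes < 0)
        err_sys("Error reading from input file '%s'\n", inputFile);
    if (write(fdWrite, buf, nbytes) != nbytes)                        /* write only what was read */
        err_sys("Error writing to output file '%s'\n", outputFile);
}
close(fdRead);
close(fdWrite);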
/* Here's the code which may help with that. I have modified it for your needs. */
int fdRead = open(inputFile, O_RDONLY);
int Ret_Val;
if (fdRead == -1)
    err_sys("Error reading input file '%s', check spelling?\n", inputFile);
int fdWrite = open(outputFile, O_WRONLY | O_CREAT | O_TRUNC, 0644); //overwrites file if it exists
if (fdWrite == -1)
    err_sys("Error creating output file '%s'\n", outputFile);
while (1) {
    Ret_Val = read(fdRead, buf, size);
    if (Ret_Val == 0)
        break;
    if (Ret_Val == -1)
        err_sys("Error reading from input file '%s'\n", inputFile);
    /* write only the bytes actually read; no lseek is needed --
       read() and write() already advance the file offsets */
    if (write(fdWrite, buf, Ret_Val) != Ret_Val)
        err_sys("Error writing to output file '%s'\n", outputFile);
    memset(buf, '\0', size);
}
close(fdRead);
close(fdWrite);

How to use read and write past BUFSIZ in C

For an assignment, I'm supposed to create two methods: Method one will read() and write() the input file to an empty output file, one byte at a time (slowly).
The other method will instead use char buf[BUFSIZ]; where BUFSIZ is from <stdio.h>. We are supposed to read() and write() with the BUFSIZ which will make things a lot faster.
The input file we test each method on is just a linux dictionary (/dict/linux.words).
I've correctly implemented method one, where I call read() and write() on one character at a time, copying the input file to the output file. Although it's very slow, it at least copies everything over.
My code for this looks like this:
// assume we have a valid, opened fd_in and fd_out file.
char buf;
while (read(fd_in, &buf, 1) != 0)
    write(fd_out, &buf, 1);
For method two however, where I use BUFSIZ, I am not able to transfer every single entry into the output file. It fails in the z entries, and doesn't write anymore.
So, my first try:
// assume we have a valid, opened fd_in and fd_out file
char buf[BUFSIZ];
while (read(fd_in, buf, BUFSIZ) != 0)
    write(fd_out, buf, BUFSIZ);
doesn't work.
I understand that read() will return either the number of bytes read or 0 if it is at the end of a file. The problem I'm having is understanding how I can compare read() to BUFSIZ, and then loop around and start read() at where it left off until I reach the real end of file.
Since your file will most likely not be an exact multiple of BUFSIZ you need to check for the actual number of bytes read, so that the last block will be written correctly, e.g.
char buf[BUFSIZ];
ssize_t n;
while ((n = read(fd_in, buf, BUFSIZ)) > 0)
    write(fd_out, buf, n);
this code:
// assume we have a valid, opened fd_in and fd_out file
char buf[BUFSIZ];
while (read(fd_in, buf, BUFSIZ) != 0)
    write(fd_out, buf, BUFSIZ);
leaves much to be desired,
does not handle a short remaining char count at the end of the file,
does not handle errors, etc.
a much better code block would be:
// assume we have a valid, opened fd_in and fd_out file
char buf[BUFSIZ];
int readCount;  // number of bytes read
int writeCount; // number of bytes written

while (1)
{
    if ( 0 > (readCount = read(fd_in, buf, BUFSIZ)) )
    {   // then, read failed
        perror( "read failed" );
        exit( EXIT_FAILURE );
    }
    // implied else, read successful

    if ( 0 == readCount )
    {   // then assume end of file
        break;  // exit while loop
    }
    // implied else, readCount > 0

    if ( readCount != (writeCount = write( fd_out, buf, readCount )) )
    {   // then, error occurred
        perror( "write failed" );
        exit( EXIT_FAILURE );
    }
    // implied else, write successful
} // end while
Note: I did not include the statements to close the input/output files before each call to exit(); however, that does need to be added (one way to do it is sketched below).
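One hedged way to add that without repeating the close() calls is the common goto-cleanup idiom (a sketch only; the status variable and the cleanup label are my own, and <stdlib.h> is assumed for EXIT_SUCCESS/EXIT_FAILURE):
int status = EXIT_SUCCESS;
while (1)
{
    if ( 0 > (readCount = read(fd_in, buf, BUFSIZ)) )
    {
        perror( "read failed" );
        status = EXIT_FAILURE;
        goto cleanup;                 /* close the files on the way out */
    }
    if ( 0 == readCount )
        break;                        /* end of file */
    if ( readCount != (writeCount = write( fd_out, buf, readCount )) )
    {
        perror( "write failed" );
        status = EXIT_FAILURE;
        goto cleanup;
    }
}
cleanup:
close(fd_in);
close(fd_out);
return status;                        /* or call exit(status) if not inside main() */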

Buffer is not reading in string properly

void download(char *file)
{
    int size = getsize(file);
    printf("Got size %d\n", size);
    sprintf(buff, "GET %s\n", file);
    send(sockfd, buff, strlen(buff), 0);
    rsize = recv(sockfd, buff, 1000, 0);
    sscanf(buff, "%d", &resultcode);
    printf("%s", buff);
    if (strcmp(buff, "+OK\n") != 0)
    {
        printf("download failed\n");
    }
    FILE *dlfile = NULL;
    if ((dlfile = fopen(file, "r")) != NULL)
    {
        dlfile = fopen(file, "w");
        do
        {
            rsize = recv(sockfd, buff, 1000, 0);
            for (int i = 0; i < rsize; i++)
            {
                fprintf(dlfile, "%c", buff[i]);
            }
            size = size - rsize;
        } while (size != 0);
    }
    fclose(dlfile);
}
I am trying to make the download function print out contents of file user typed, then save it to their current directory. I did a debug line printf("%s", buff); and it prints out +OK\n(filename). It is supposed to print out +OK\n. It also prints out download failed then a segmentation fault error. What am I missing?
Several things going on here. First, recv and send basically operate on arrays of bytes so they do not know about line endings and such. Also note that recv is not guaranteed to fill the buffer - it generally reads what is available up to the limit of the buffer. For your strcmp against "+OK\n", you could use strncmp with a length of 4 but that is a bit direct (see below). Next note that the buff string is not null terminated by recv so your printf could easily crash.
When you go into your loop, the buffer already has part of the rest of your I/O in it. It may include other fields or parts of the file, and you need to process that as well. It is not clear to me what getsize does, but using that size to drive your loop seems off. Also, your loop of fprintf calls can be replaced by a single call to fwrite.
Overall, you need to properly buffer and then parse the incoming stream of data. If you want to do it yourself, you could look at fdopen to get a FILE object.
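To illustrate a couple of those points, a hedged sketch (buff, sockfd, rsize and dlfile are assumed from the question; the 999/1000 sizes just leave room for a terminator):
/* point 1: terminate what recv() returned before treating it as a string */
rsize = recv(sockfd, buff, 999, 0);          /* leave one byte spare */
if (rsize > 0) {
    buff[rsize] = '\0';                      /* now safe for printf and strncmp */
    if (strncmp(buff, "+OK\n", 4) != 0)
        printf("download failed\n");
}

/* point 2: inside the download loop, one fwrite replaces the fprintf loop */
rsize = recv(sockfd, buff, 1000, 0);
if (rsize > 0)
    fwrite(buff, 1, (size_t)rsize, dlfile);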

Writing to an output file in C

I am writing a program in C that takes a command line argument that represents the name of an output file. I then opened the file to write to it. The problem is that when I type something on the command line, it never shows up in the file I am writing to, and any text in the file is erased. Here is the code I have for writing to the file from stdin.
(fdOut is the FILE * stream that was specified)
while (fread(buf, 1, 1024, stdin))
{
    fwrite(buf, 1, 1024, fdOut);
}
try this code.
#include "stdio.h"
int main()
{
char buf[1024];
FILE *fdOut;
if((fdOut = fopen("out.txt","w")) ==NULL)
{ printf("fopen error!\n");return -1;}
while (fgets(buf, 1024, stdin) != NULL)
{
// int i ;
// for(i = 0;buf[i]!=NULL; ++i)
// fputc(buf[i],fdOut);
fputs(buf,fdOut);
// printf("write error\n");
}
fclose(fdOut);
return 0;
}
Note: use Ctrl+D to stop input.
Let's assume that there is data coming in on stdin, e.g. you are using your program like:
cat infile.txt | ./myprog /tmp/outfile.txt
then data written with fwrite() will be buffered, so it won't appear immediately in the output file, but only when your OS decides that it's time to flush the buffer.
you can manually force writing to disk by using
fflush(fdOut);
(probably you don't want to do this all the time, as buffering allows for great speedups, esp when writing to slow media)
size_t nbytes;
while ((nbytes = fread(buf, 1, 1024, stdin)) > 0)
{
    if (fwrite(buf, 1, nbytes, fdOut) != nbytes)
        ...handle short write...out of space?...
}
As you wrote it, you mishandle a short read, writing garbage that was not read to the output.

Strange behaviour of fgets

I've got a function here that blocks on fgets, but when I print something before fgets it doesn't block.
int exec_command(char *command, char *output_buf, int buf_size)
{
    FILE* pipe = NULL;
    char buffer[BUFFER_SIZE];
    char tmp[SMALL_BUFFER_SIZE];
    unsigned total_read = 0;
    pipe = popen( command, "r");
    if( !pipe )
    {
        //Error
        return -1;
    }
    memset(buffer, 0, sizeof(buffer));
    while( !feof(pipe) )
    {
        //printf("reading"); //If I uncomment this fgets doesnt block
        if( fgets(tmp, sizeof(tmp), pipe) != NULL )
        {
            // check that it'll fit:
            size_t len = strlen(tmp);
            if (total_read + len >= sizeof(buffer))
                break;
            // and add it to the big buffer if it fits
            strcat(buffer, tmp);
            total_read += len;
        }
    }
    //Is there anything to copy
    if ( total_read )
        strncpy (output_buf, buffer, buf_size);
    return pclose(pipe);
}
Is there anything wrong on my function above?
It's because whatever is writing to your pipe isn't flushing its output buffer. When you print, that happens to get flushed (though it isn't guaranteed to). When you don't print, the pipe isn't actually being written to, because the data sits in a buffer until it fills, and only then does it get written out. Call fsync() or fflush() in the process that is writing to the pipe to make sure its buffer is flushed.
Add a newline to the print statement. stdout is (probably) line-buffered, and doesn't print until a newline is encountered or you call fflush().
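A minimal sketch of that suggestion (either variant should make the diagnostic appear immediately on a line-buffered stdout):
printf("reading\n");       /* the newline flushes a line-buffered stdout */

/* or keep the original text and flush explicitly: */
printf("reading");
fflush(stdout);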
