Using fread() hangs until killed - c

This is the general structure of my code:
if (contentLength > 0)
{
// Send POST data
size_t sizeRead = 0;
char buffer[1024];
while ((sizeRead < contentLength) && (!feof(stream)))
{
size_t diff = contentLength - sizeRead;
if (diff > 1024)
diff = 1024;
// Debugging
fprintf(stderr, "sizeRead: %zu\n", sizeRead);
fprintf(stderr, "contentLength: %lu\n", (unsigned long)contentLength);
fprintf(stderr, "diff: %zu\n", diff);
size_t read = fread(buffer, 1, diff, stream);
sizeRead += read;
exit(1);
// Write to pipe
fwrite(buffer, 1, read, cgiPipePost);
exit(1);
}
}
However, the program hangs when it hits the fread() line. If I add an exit() before that line, the program exits. If I add it after, the program hangs until I send a SIGINT signal.
Any help would be appreciated, I have been stuck on this for quite some time now.
Thanks

fread tries to fill an internal buffer. Depending on the implementation, you may be able to stop or limit it by setting the buffering mode (in particular, setting _IONBF with setvbuf, or equivalently calling setbuf(stream, NULL), should work for all implementations). The general rule, though, is to avoid mixing counted I/O on sockets with stdio at all, and to use raw read calls instead.
Also, while it's not biting you here, a !feof(stream) test is almost always wrong: people mean this to be predictive (EOF is about to occur), but feof is only "post-dictive": after a read operation fails (getc or fgetc returns EOF), the feof and ferror indicators allow you to discover why the previous failure occurred.
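The raw-read approach recommended above could be sketched as follows. This is only an illustration: copy_n_bytes is a hypothetical helper name, and it assumes you pass plain descriptors (for example obtained with fileno()) on which nothing has yet been read through stdio, since bytes already sitting in a stdio buffer would be missed.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: forward exactly `count` bytes from in_fd to out_fd.
 * Returns the number of bytes forwarded, which is less than `count` only
 * if end-of-file arrives early, or -1 on a read/write error. */
long copy_n_bytes(int in_fd, int out_fd, size_t count)
{
    char buf[1024];
    size_t done = 0;

    while (done < count) {
        size_t want = count - done;
        if (want > sizeof buf)
            want = sizeof buf;

        /* read() returns as soon as some data is there, possibly < want */
        ssize_t got = read(in_fd, buf, want);
        if (got < 0) {
            if (errno == EINTR)
                continue;               /* interrupted by a signal: retry */
            return -1;
        }
        if (got == 0)
            break;                      /* peer closed before `count` bytes */

        /* write() may be partial too, so loop until `got` bytes are out */
        ssize_t off = 0;
        while (off < got) {
            ssize_t put = write(out_fd, buf + off, (size_t)(got - off));
            if (put < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += put;
        }
        done += (size_t)got;
    }
    return (long)done;
}
```

Because the loop never asks for more than the remaining count, it cannot block waiting for bytes beyond the announced content length.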

Related

How to properly fread & fwrite from & to a pipe

I have this code which acts as a pipe between two shell invocations.
It reads from a pipe, and writes into a different one.
#include <stdio.h>
#include <stdlib.h>
#define BUFF_SIZE (0xFFF)
/*
* $ cat /tmp/redirect.txt |less
*/
int main(void)
{
FILE *input;
FILE *output;
int c;
char buff[BUFF_SIZE];
size_t nmemb;
input = popen("cat /tmp/redirect.txt", "r");
output = popen("less", "w");
if (!input || !output)
exit(EXIT_FAILURE);
#if 01
while ((c = fgetc(input)) != EOF)
fputc(c, output);
#elif 01
do {
nmemb = fread(buff, 1, sizeof(buff), input);
fwrite(buff, 1, nmemb, output);
} while (nmemb);
#elif 01
while (feof(input) != EOF) {
nmemb = fread(buff, 1, sizeof(buff), input);
fwrite(buff, 1, nmemb, output);
}
/*
* EDIT: The previous implementation is incorrect:
* feof() returns non-zero if EOF is set
* EDIT2: Forgot the !. This solved the problem.
*/
#elif 01
while (feof(input)) {
nmemb = fread(buff, 1, sizeof(buff), input);
fwrite(buff, 1, nmemb, output);
}
#endif
pclose(input);
pclose(output);
return 0;
}
I want it to be efficient, so I want to implement it with fread() & fwrite(). These are the three ways I tried.
The first one is implemented with fgetc()&fputc() so it will be very slow. However it works fine because it checks for EOF so it will wait until cat (or any shell invocation I use) finishes its job.
The second one is faster, but I'm concerned that I don't check for EOF so if there is any moment when the pipe is empty (but the shell invocation hasn't finished, so may not be empty in the future), it will close the pipe and end.
The third implementation is what I would like to do, and it relatively works (all the text is received by less), but for some reason it gets stuck and doesn't close the pipe (seems like it never gets the EOF).
EDIT: Third implementation is buggy. Fourth tries to solve the bug, but now less doesn't receive anything.
How should this be properly done?
First of all, I think your problems have more to do with buffering than with efficiency. That is a common problem when first dealing with the stdio package.
Second, the best (and simplest) implementation of a simple data copier from input to output is the following snippet (adapted from K&R, first edition).
while((c = fgetc(input)) != EOF)
fputc(c, output);
(Well, not a literal copy: K&R use stdin and stdout as the FILE * streams, with the simpler getchar() and putchar(c) calls.) When you try to do better than this, you usually end up relying on false assumptions, such as believing that stdio does no buffering or that every call costs a system call.
stdio uses full buffering when standard output is a pipe (indeed, it always uses full buffering unless isatty(3) returns true for the file descriptor). So if you want to see the output as soon as it is available, you should either disable output buffering (with something like setbuf(out, NULL);) or call fflush() on your output at some point, so that data does not sit in the output buffer while you are waiting on the input for more.
What seems to be happening is that the output is not visible in less(1) because it is being buffered inside your program. Suppose your program (which, despite handling individual characters, is doing full buffering) gets no input until a full input buffer (BUFSIZ characters) has been fed to it. Then BUFSIZ fgetc() calls and BUFSIZ fputc() calls run in a loop, and the output buffer fills up. But this buffer is not written yet, because it needs one more character to force a flush. So until you have received the first two BUFSIZ chunks of data, nothing is written to less(1).
A simple and efficient approach is to check, after fputc(c, out);, whether the character is a \n and, if so, flush the output with fflush(out);. That way you write one line of output at a time.
fputc(c, out);
if (c == '\n') fflush(out);
If you don't do something like this, output is written in BUFSIZ chunks, and normally not before that much data has accumulated on the output side. And always remember to close your streams (with pclose() for streams opened by popen(); stdio flushes open streams on a normal exit), or you can lose output if your process gets interrupted.
IMHO the code you should use is:
while ((c = fgetc(input)) != EOF) {
fputc(c, output);
if (c == '\n') fflush(output);
}
pclose(input); /* streams opened with popen() are closed with pclose() */
pclose(output);
for the best performance without holding output data in the buffer unnecessarily.
BTW, doing fread() and fwrite() of one char is a waste of time and an error-prone way to complicate things. fwrite() of one char will not avoid the use of buffers, so you won't get more performance than with fputc(c, output);.
BTW (bis), if you want to do your own buffering, don't call stdio functions; just use read(2) and write(2) calls on normal system file descriptors. A good approach is:
int input_fd = fileno(input); /* input is your old FILE * given by popen() */
int output_fd = fileno(output);
ssize_t n; /* signed, so that read()'s -1 error return is representable */
while ((n = read(input_fd, your_buffer, sizeof your_buffer)) > 0) {
write(output_fd, your_buffer, n);
}
switch (n) {
case 0: /* we got EOF */
...
break;
default: /* we got an error */
fprintf(stderr, "error: read(): %s\n", strerror(errno));
...
break;
} /* switch */
Note that read(2) returns as soon as some data is available, which may be fewer bytes than you asked for, or when there is no more data, so this loop forwards data in whatever chunks arrive.
If you would rather stay with stdio and feed your data to less(1) as soon as you have one line for it, you can completely disable the input buffer with:
setbuf(input, NULL);
int c; /* int, never char, see manual page */
while((c = fgetc(input)) != EOF) {
putc(c, output);
if (c == '\n') fflush(output);
}
And you'll get less(1) working as soon as you have produced a single line of output text.
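An alternative to flushing by hand is to ask stdio for line buffering on the output stream. The following is a minimal sketch, not a definitive implementation: copy_line_buffered is a hypothetical helper name, and it assumes setvbuf() is called before any other operation on the output stream, as the C standard requires.

```c
#include <stdio.h>

/* Hypothetical helper: copy src to dst one character at a time.
 * dst is switched to line buffering first, so each complete line is
 * handed on without an explicit fflush(). Returns 0 on success, -1 on
 * failure. */
int copy_line_buffered(FILE *src, FILE *dst)
{
    int c; /* int, not char, so that EOF can be told apart from data */

    /* Must happen before any other operation on dst */
    if (setvbuf(dst, NULL, _IOLBF, BUFSIZ) != 0)
        return -1;

    while ((c = getc(src)) != EOF) {
        if (putc(c, dst) == EOF)
            return -1;
    }
    return ferror(src) ? -1 : 0;
}
```

With the streams from the question this would be called as copy_line_buffered(input, output) right after the two popen() calls.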
What are you exactly trying to do? (This would be nice to know, as you seem to be reinventing the cat(1) program, but with reduced functionality)
Simplest solution:
while (1) {
nmemb = fread(buff, 1, sizeof buff, input);
if (nmemb < 1) break;
fwrite(buff, 1, nmemb, output);
}
Similarly, for the getc() case:
while (1) {
c = getc(input);
if (c == EOF) break;
putc(c, output);
}
Replacing fgetc() by getc() will give performance equivalent to the fread() case. (getc() will often be a macro, avoiding function-call overhead; just take a look at the generated assembly.)
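The fread()/fwrite() loop above does not distinguish end-of-file from a read error and does not check the write. A hedged sketch of the same loop with those checks added; copy_stream is a hypothetical name and the 4096-byte buffer size is an arbitrary choice:

```c
#include <stdio.h>

/* Hypothetical helper: copy everything from `in` to `out` in blocks.
 * Returns the number of bytes copied, or -1 on a read or write error. */
long copy_stream(FILE *in, FILE *out)
{
    char buf[4096];
    size_t n;
    long total = 0;

    /* fread() returns 0 both at EOF and on error; ferror() tells which */
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n)
            return -1;          /* short write means an output error */
        total += (long)n;
    }
    if (ferror(in))
        return -1;              /* the loop ended on an error, not EOF */
    if (fflush(out) == EOF)
        return -1;              /* push out whatever is still buffered */
    return total;
}
```

Note that feof() is never used to control the loop; the return value of fread() does that, and ferror() is consulted only afterwards.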

Incorrect fprintf results

In the code below, I am trying to read from a socket and store the results in a file.
What actually happens, is that my client sends a GET request to my server for a file.html. My server finds the file and writes the contents of it to the socket. Lastly my client reads the content from thread_fd and recreates the file.
For some reason the recreated file has less content than the original. I have located the problem: some lines at the end are missing. When I use printf("%s", buffer) inside the while loop everything seems fine on stdout, but my fprintf misses about 3,000 bytes for a file of 81,000 bytes.
#define MAXSIZE 1000
int bytes_read, thread_fd;
char buffer[MAXSIZE];
FILE* new_file;
memset(buffer, 0, MAXSIZE);
if((new_file = fopen(path, "wb+")) == NULL)
{
printf("can not open file \n");
exit(EXIT_FAILURE);
}
while ((bytes_read = read(thread_fd, buffer, MAXSIZE)) > 0)
{
fprintf(new_file, "%s", buffer);
if(bytes_read < MAXSIZE)
break;
memset(buffer, 0, MAXSIZE);
}
You read binary data from the socket, which may or may not contain a \0 byte. When you then pass that data to fprintf with "%s", it stops at the first \0 it encounters; in your case that leaves the file about 3,000 bytes short. If a chunk contains no \0 byte at all, fprintf keeps reading past the end of the buffer until it happens to find one or the program crashes.
Use write() to write the data to the file and check for errors. Don't forget to close() the file and check that for errors too.
Your code should/could look like:
int readfile(int thread_fd, char *path)
{
int bytes_read; /* signed, so that a -1 from read() ends the loop */
char buffer[MAXSIZE];
int new_file;
if ((new_file = open(path, _O_WRONLY|_O_CREAT|_O_TRUNC|_O_BINARY, _S_IWRITE)) == -1) return -1;
while ((bytes_read = read(thread_fd, buffer, MAXSIZE)) > 0)
{
if (write(new_file, buffer, bytes_read)!= bytes_read) {
close(new_file);
return -2;
}
}
close(new_file);
return 0;
}
There are a few issues with your code that can cause this.
The most likely cause is this :
if(bytes_read < MAXSIZE)
break;
This ends the loop when read returns fewer bytes than requested. That is perfectly normal behavior, however, and can happen e.g. when not enough bytes are available at the time of the read call (it is reading from a network socket, after all). Just let the loop continue as long as read returns a value > 0 (assuming the socket is a blocking socket - if not, you'll also have to check for EAGAIN and EWOULDBLOCK).
Additionally, if the file you're receiving contains binary data, then it's not a good idea to use fprintf with "%s" to write to the target file. This will stop writing as soon as it finds a '\0' byte (which is not uncommon in binary data). Use fwrite instead.
Even if you're receiving text (suggested by the html file extension), it's still not a good idea to use fprintf with "%s", since the received data won't be '\0' terminated.
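The difference can be seen in a small sketch. Both function names are hypothetical and exist only for this illustration; the point is that "%s" measures the data by looking for a '\0', while fwrite is told the length explicitly.

```c
#include <stdio.h>

/* Writes buf with "%s": output stops at the first '\0' found in buf.
 * Returns the number of bytes fprintf reports having written. */
int save_as_string(FILE *fp, const char *buf)
{
    return fprintf(fp, "%s", buf);
}

/* Writes exactly n bytes, '\0' bytes included.
 * Returns the number of bytes written. */
size_t save_as_bytes(FILE *fp, const char *buf, size_t n)
{
    return fwrite(buf, 1, n, fp);
}
```

In the receive loop, n would be the value returned by read(), so only the bytes that actually arrived are written and the memset calls become unnecessary.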
This worked!
ps: I don't know if I should be doing this, since I am new here, but really there is no reason for negativity. Any question is a good question. Just answer it if you know it. Do not judge it.
#define MAXSIZE 1000
int bytes_read, thread_fd, new_file;
char buffer[MAXSIZE];
memset(buffer, 0, MAXSIZE);
if((new_file = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644)) < 0) /* O_CREAT needs a mode argument */
{
printf("can not open file \n");
exit(EXIT_FAILURE);
}
while ((bytes_read = read(thread_fd, buffer, MAXSIZE)) > 0)
write(new_file, buffer, bytes_read);
close(new_file);

stdin to stdout after processing

I have a utility that is supposed to optimize files by transforming them into an alternate file-format. If it cannot make the files smaller, I would like the original file returned.
The design is to use stdin in and stdout for input and output. This is for a case where the processed size is larger than the original file size. All other branches are tested as working.
char readbuffer[65536];
ssize_t readinbytes;
while ((readinbytes = fread(readbuffer, sizeof(char), insize, stdin)) > 0) {
if (fwrite(readbuffer, sizeof(char), readinbytes, stdout) != readinbytes) {
fatal("can't write to stdout, please smash and burn the computer\n");
}
}
Problem: this results in a file of size 0.
Right, this question has a strange answer. Essentially I had to read stdin into a buffer (inbuf), then output the contents of that buffer. There were several reasons why I was getting no output.
Firstly, I'd failed to spot a branch which already determined whether the input buffer was smaller than the output buffer:
if((readinbytes < outbuffersize) || force) {
// inside this is where the code was...
Secondly, because stdout was being used for the output, it looks like there was a section containing a log statement that was not output in the matching else block. The inherited code was terribly formatted, so this was never picked up.
Outputting error messages does not fulfil the purpose of the utility (always output a valid output file if a valid input file is provided), so here is the solution (stdin is read into inbuf at the start of the program):
set_filemode_binary(stdout);
if (fwrite(inbuf, 1, readinbytes, stdout) != readinbytes) {
fprintf(stderr, "error writing to stdout\n");
free(inbuf);
exit(3);
}
errata (reading in stdin)
unsigned char * inbuf = NULL;
size_t readinbytes;
long insize = 0;
// elsewhere...
// die if no stdin
insize = getFileSize(stdin);
if (insize < 0) {
fprintf(stderr, "no input to stdin\n");
exit(2);
}
// read stdin to buffer
inbuf = createBuffer(insize); // wrapper around malloc handling OOM
readinbytes = fread(inbuf, sizeof(char), insize, stdin);
if (readinbytes < (size_t)insize && ferror(stdin)) { // fread returns size_t, so it is never negative
fprintf(stderr, "error reading from stdin\n");
free(inbuf);
exit(3);
}
Also don't forget to free(inbuf).
if(inbuf){ free(inbuf); }
I hope this helps someone.
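The getFileSize and createBuffer helpers are not shown above. When stdin is a pipe its size generally cannot be queried up front, so one alternative is to grow the buffer while reading. This is a sketch under that assumption; read_all is a hypothetical name and the 64 KiB starting size is arbitrary.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read `fp` to end-of-file into a malloc'd buffer
 * that doubles in size as needed. On success returns the buffer (the
 * caller frees it) and stores its length in *out_len. Returns NULL on a
 * read error or when memory runs out. */
unsigned char *read_all(FILE *fp, size_t *out_len)
{
    size_t cap = 65536, len = 0;
    unsigned char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        len += fread(buf + len, 1, cap - len, fp);
        if (len < cap)
            break;                      /* short read: EOF or error */

        /* Buffer is full: double it and keep reading */
        unsigned char *bigger = realloc(buf, cap * 2);
        if (bigger == NULL) {
            free(buf);
            return NULL;
        }
        buf = bigger;
        cap *= 2;
    }
    if (ferror(fp)) {
        free(buf);
        return NULL;
    }
    *out_len = len;
    return buf;
}
```

The result can then be written back unchanged with a single fwrite(buf, 1, len, stdout) when the processed output turns out to be larger than the input.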

Getting characters past a certain point in a file in C

I want to take all characters past location 900 from a file called WWW, and put all of these in an array:
//Keep track of all characters past position 900 in WWW.
int Seek900InWWW = lseek(WWW, 900, 0); //goes to position 900 in WWW
printf("%d \n", Seek900InWWW);
if(Seek900InWWW < 0)
printf("Error seeking to position 900 in WWW.txt");
char EverythingPast900[appropriatesize];
int NextRead;
char NextChar[1];
int i = 0;
while((NextRead = read(WWW, NextChar, sizeof(NextChar))) > 0) {
EverythingPast900[i] = NextChar[0];
printf("%c \n", NextChar[0]);
i++;
}
I create a char array of length 1 because the read system call requires a pointer, so I cannot use a regular char. The above code does not work. In fact, it does not print any characters to the terminal as the loop should. I think my logic is correct, but perhaps a misunderstanding of what's going on behind the scenes is what is making this hard for me. Or maybe I missed something simple (hope not).
If you already know how many bytes to read (e.g. in appropriatesize) then just read in that many bytes at once, rather than reading in bytes one at a time.
char everythingPast900[appropriatesize];
ssize_t bytesRead = read(WWW, everythingPast900, sizeof everythingPast900);
if (bytesRead > 0 && bytesRead != appropriatesize)
{
// only everythingPast900[0] to everythingPast900[bytesRead - 1] is valid
}
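If the seek and the read are only ever used together, POSIX pread() can do both in one call. A minimal sketch, assuming a POSIX system and a regular file (pread does not work on pipes or sockets); read_from_offset is a hypothetical helper name.

```c
#define _XOPEN_SOURCE 700   /* ask for the pread() declaration */
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to `cap` bytes starting at byte `offset`
 * of fd into buf. Returns the number of bytes stored (0 if offset is at
 * or past end-of-file), or -1 on error. The descriptor's own file
 * position is left untouched. */
ssize_t read_from_offset(int fd, off_t offset, char *buf, size_t cap)
{
    size_t total = 0;

    while (total < cap) {
        ssize_t n = pread(fd, buf + total, cap - total,
                          offset + (off_t)total);
        if (n < 0)
            return -1;
        if (n == 0)
            break;              /* end of file reached */
        total += (size_t)n;
    }
    return (ssize_t)total;
}
```

For the question's case this would be called as read_from_offset(WWW, 900, everythingPast900, sizeof everythingPast900), with the return value giving the number of valid bytes.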
I made a test version of your code and added bits you left out. Why did you leave them out?
I also made a file named www.txt that has a hundred lines of "This is a test line." in it.
And I found a potential problem, depending on how big your appropriatesize value is and how big the file is. If you write past the end of EverythingPast900, your program can crash before it ever produces any output to display. That might happen on Windows, where stdout may not be line buffered, depending on which libraries you use.
See the MSDN setvbuf page, in particular "For some systems, this provides line buffering. However, for Win32, the behavior is the same as _IOFBF - Full Buffering."
This seems to work:
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <stdio.h>
int main()
{
int WWW = open("www.txt", O_RDONLY);
if(WWW < 0) {
printf("Error opening www.txt\n");
return 1; /* nothing to seek or read without a valid descriptor */
}
//Keep track of all characters past position 900 in WWW.
int Seek900InWWW = lseek(WWW, 900, 0); //goes to position 900 in WWW
printf("%d \n", Seek900InWWW);
if(Seek900InWWW < 0)
printf("Error seeking to position 900 in WWW.txt");
int appropriatesize = 1000;
char EverythingPast900[appropriatesize];
int NextRead;
char NextChar[1];
int i = 0;
while(i < appropriatesize && (NextRead = read(WWW, NextChar, sizeof(NextChar))) > 0) {
EverythingPast900[i] = NextChar[0];
printf("%c \n", NextChar[0]);
i++;
}
return 0;
}
As stated in another answer, read more than one byte. The theory behind "buffers" is to reduce the amount of read/write operations due to how slow disk I/O (or network I/O) is compared to memory speed and CPU speed. Look at it as if it is code and consider which is faster: adding 1 to the file size N times and writing N bytes individually, or adding N to the file size once and writing N bytes at once?
Another thing worth mentioning is the fact that read may read fewer than the number of bytes you requested, even if there is more to read. The answer written by @dreamlax illustrates this fact. If you want, you can use a loop to read as many bytes as possible, filling the buffer. Note that I used a function, but you can do the same thing in your main code:
#include <sys/types.h>
#include <unistd.h>
/* Read from a file descriptor, filling the buffer with the requested
 * number of bytes. If the end-of-file is encountered, the number of
 * bytes returned may be less than the requested number of bytes.
 * On error, -1 is returned. See read(2) or read(3) for possible
 * values of errno.
 * Otherwise, the number of bytes read is returned.
 */
ssize_t
read_fill (int fd, char *readbuf, ssize_t nrequested)
{
    ssize_t nread = 0, nsum = 0;
    while (nrequested > 0
           && (nread = read (fd, readbuf, nrequested)) > 0)
    {
        nsum += nread;
        nrequested -= nread;
        readbuf += nread;
    }
    if (nread < 0)
        return -1; /* report the error as documented; errno is set by read */
    return nsum;
}
Note that the buffer is not null-terminated as not all data is necessarily text. You can pass buffer_size - 1 as the requested number of bytes and use the return value to add a null terminator where necessary. This is useful primarily when interacting with functions that will expect a null-terminated string:
char readbuf[4096];
ssize_t n;
int fd;
fd = open ("WWW", O_RDONLY);
if (fd == -1)
{
perror ("unable to open WWW");
exit (1);
}
n = lseek (fd, 900, SEEK_SET);
if (n == -1)
{
fprintf (stderr,
"warning: seek operation failed: %s\n"
" reading 900 bytes instead\n",
strerror (errno));
n = read_fill (fd, readbuf, 900);
if (n < 900)
{
fprintf (stderr, "error: fewer than 900 bytes in file\n");
close (fd);
exit (1);
}
}
/* Read a file, printing its contents to the screen.
*
* Caveat:
* Not safe for UTF-8 or other variable-width/multibyte
* encodings since required bytes may get cut off.
*/
while ((n = read_fill (fd, readbuf, (ssize_t) sizeof readbuf - 1)) > 0)
{
readbuf[n] = 0;
printf ("Read\n****\n%s\n****\n", readbuf);
}
if (n == -1)
{
close (fd);
perror ("error reading from WWW");
exit (1);
}
close (fd);
I could also have avoided the null termination operation and filled all 4096 bytes of the buffer, electing to use the precision part of the format specifiers of printf in this case, changing the format specification from %s to %.4096s. However, this may not be feasible with unusually large buffers (perhaps allocated by malloc to avoid stack overflow) because the buffer size may not be representable with the int type.
Also, you can use a regular char just fine:
char c;
nread = read (fd, &c, 1);
Apparently you didn't know that the unary & operator gets the address of whatever variable is its operand, creating a value of type pointer-to-{typeof var}? Either way, it takes up the same amount of memory, but reading 1 byte at a time is something that normally isn't done as I've explained.
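Mixing declarations and code is a no-no in C89, although C99 and later allow it. Likewise, an array declared with a non-constant size is a variable-length array: a C89 compiler should reject it, and VLAs are only guaranteed in C99 (they are optional from C11 on).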
What you want is dynamically allocating the memory for your char buffer[]. You'll have to use pointers.
http://www.ontko.com/pub/rayo/cs35/pointers.html
Then read this one.
http://www.cprogramming.com/tutorial/c/lesson6.html
Then research a function called memcpy().
Enjoy.
Read through that guide, then you should be able to solve your problem in an entirely different way.
Pseudocode:
declare a buffer of char(pointer related)
allocate memory for said buffer(dynamic memory related)
Find location of where you want to start at
point to it(pointer related)
Figure out how much you want to store(technically a part of allocating memory^^^)
Use memcpy() to store what you want in the buffer
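The pseudocode steps can be sketched as follows, assuming the file contents are already in memory as `data` with length `len`. copy_tail is a hypothetical name used only for this illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a freshly allocated copy of
 * data[start .. len-1] and store its length in *out_len.
 * Returns NULL if start is past the end or if malloc fails.
 * The caller frees the result. */
char *copy_tail(const char *data, size_t len, size_t start, size_t *out_len)
{
    if (start > len)
        return NULL;

    size_t n = len - start;             /* how much to store */
    char *tail = malloc(n ? n : 1);     /* avoid a zero-size request */
    if (tail == NULL)
        return NULL;

    memcpy(tail, data + start, n);      /* copy from the start position */
    *out_len = n;
    return tail;
}
```

For the question's case, start would be 900. The result is not null-terminated, so it should be handled with its length rather than as a C string.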

Socket Read/Write error

I would install valgrind to tell me what the problem is, but unfortunately I can't install any new programs on this computer... Could anyone tell me if there's an obvious problem with this "echo" program? I'm doing this for a friend, so I'm not sure what the layout of the client is on the other side, but I know that both reads and writes are valid socket descriptors, and I've tested that n = write(writes,"I got your message \n",20); and n = write(reads,"I got your message \n",20); both work, so I can confirm that it's not a case of an invalid fd. Thanks!
int
main( int argc, char** argv ) {
int reads = atoi(argv[1]) ;
int writes = atoi(argv[3]) ;
int n ;
char buffer[MAX_LINE];
memset(buffer, 0, sizeof(buffer));
int i = 0 ;
while (1) {
read(reads, buffer, sizeof(buffer));
n = write(writes,buffer,sizeof(buffer));
if (n < 0) perror("ERROR reading from socket");
}
There are a few problems, the most pressing of which is that you're likely pushing garbage data down the write socket by using sizeof(buffer) when writing. Let's say you read data from the reads socket and it's shorter than MAX_LINE. When you go to write that data, you'll be writing whatever you read plus the garbage at the end of the buffer (even though you memset at the very beginning, continual use of the same buffer without reacting to different read sizes will probably generate some garbage).
Try getting the return value from read and using it in your write. If the read return indicates an error, clean up and either exit or try again, depending on how you want your program to behave.
int n, size;
while (1) {
size = read(reads, buffer, sizeof(buffer));
if (size > 0) {
n = write(writes, buffer, size);
if (n != size) {
// write error, do something
}
} else {
// size == 0 is end-of-file, size < 0 is a read error: stop either way
break;
}
}
This, of course, assumes your writes and reads are valid file descriptors.
These two lines look very suspicious:
int reads = atoi(argv[1]) ;
int writes = atoi(argv[3]) ;
Do you really get file/socket descriptor numbers on the command line? From where?
Check the return value of your read(2) and write(2), and then the value of errno(3) - they probably tell you that your file descriptors are invalid (EBADF).
One point not made thus far: Although you know that the file descriptors are valid, you should include some sanity checking of the command line.
if (argc < 4) { /* the program reads argv[1] and argv[3] */
printf("usage: foo: input output\n");
exit(0);
}
Even with this sanity checking passing parameters like this on a command line can be dangerous.
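One way to tighten the check is to validate each argument before using it as a descriptor. This is a sketch, not a complete defence: parse_fd is a hypothetical helper, and it only confirms that the number refers to some open descriptor, not that it is the socket you expect.

```c
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse `arg` as a decimal descriptor number and
 * check that it refers to an open descriptor. Returns the descriptor,
 * or -1 if the text is not a clean non-negative integer or the
 * descriptor is not open (fcntl fails with EBADF). */
int parse_fd(const char *arg)
{
    char *end;
    errno = 0;
    long value = strtol(arg, &end, 10);

    if (errno != 0 || end == arg || *end != '\0')
        return -1;                      /* empty, trailing junk, or overflow */
    if (value < 0 || value > INT_MAX)
        return -1;
    if (fcntl((int)value, F_GETFD) == -1)
        return -1;                      /* not an open descriptor */
    return (int)value;
}
```

Unlike atoi(), strtol() lets the caller tell "0" apart from text that is not a number at all.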
The memset() is not needed, provided you change the following (which you should do nevertheless).
read() has a result, telling you how much it has actually read. This you should give to write() in order to write only what you actually have, removing the need for zeroing.
MAX_LINE should be at least 512, if not more.
There probably are some more issues, but I think I have the most important ones.

Resources