stdin to stdout after processing - c

I have a utility that is supposed to optimize files by transforming them into an alternate file-format. If it cannot make the files smaller, I would like the original file returned.
The design is to use stdin and stdout for input and output. The code below handles the case where the processed size is larger than the original file size; all other branches have been tested and work.
char readbuffer[65536];
ssize_t readinbytes;
while ((readinbytes = fread(readbuffer, sizeof(char), insize, stdin)) > 0) {
if (fwrite(readbuffer, sizeof(char), readinbytes, stdout) != readinbytes) {
fatal("can't write to stdout, please smash and burn the computer\n");
}
}
Problem: this results in an output file of size 0.

Right, this question has a strange answer. Essentially I had to read stdin into a buffer (inbuf), then output the contents of that buffer. There were several reasons I was getting no output.
Firstly, I'd failed to spot a branch which already determined whether the input buffer was smaller than the output buffer:
if((readinbytes < outbuffersize) || force) {
// inside this is where the code was...
Because stdout was being used for the output file, the matching else block contained only a log statement and never wrote the file out. The inherited code was terribly formatted, so this was never picked up on.
Printing a log message there does not fulfil the purpose of the utility (always output a valid output file if a valid input file is provided).
Solution (stdin is read into inbuf at the start of the program):
set_filemode_binary(stdout);
if (fwrite(inbuf, 1, readinbytes, stdout) != readinbytes) {
fprintf(stderr, "error writing to stdout\n");
free(inbuf);
exit(3);
}
Addendum (reading stdin into the buffer):
unsigned char * inbuf = NULL;
size_t readinbytes;
long insize = 0;
// elsewhere...
// die if no stdin
insize = getFileSize(stdin);
if (insize < 0) {
fprintf(stderr, "no input to stdin\n");
exit(2);
}
// read stdin to buffer
inbuf = createBuffer(insize); // wrapper around malloc handling OOM
readinbytes = fread(inbuf, sizeof(char), insize, stdin);
if (ferror(stdin)) { // fread returns a size_t, so testing it for < 0 can never fire
fprintf(stderr, "error reading from stdin\n");
free(inbuf);
exit(3);
}
Also don't forget to free(inbuf).
if(inbuf){ free(inbuf); }
I hope this helps someone.
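For comparison, the loop from the question can be made to work as a plain streaming copy. The sketch below assumes stdin has not already been consumed (in the solution above it has been read into inbuf, so this only applies to a design that skips that step). copy_stream is a made-up name, not part of the utility. The two changes from the question's loop are that fread is asked for at most sizeof buf bytes rather than insize, and that fwrite is given exactly the count fread returned.

```c
#include <stdio.h>

/* Copy everything from `in` to `out` in fixed-size chunks.
 * Returns the number of bytes copied, or -1 on a read or write error. */
long copy_stream(FILE *in, FILE *out)
{
    char buf[65536];
    size_t n;
    long total = 0;

    /* Request at most sizeof buf bytes, never the whole file size. */
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        /* Write exactly the n bytes that were just read. */
        if (fwrite(buf, 1, n, out) != n)
            return -1;
        total += (long)n;
    }
    if (ferror(in))          /* fread returned 0: was it EOF or an error? */
        return -1;
    if (fflush(out) != 0)    /* push buffered bytes out before reporting success */
        return -1;
    return total;
}
```

In the utility this would be called as copy_stream(stdin, stdout) after switching both streams to binary mode where the platform needs it.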

Related

Incorrect fprintf results

In the code below, I am trying to read from a socket and store the results in a file.
What actually happens, is that my client sends a GET request to my server for a file.html. My server finds the file and writes the contents of it to the socket. Lastly my client reads the content from thread_fd and recreates the file.
For some reason the recreated file has less content than the original. I have traced the problem to some lines at the end that are missing. When I use printf("%s", buffer) inside the while loop everything looks fine on stdout, but my fprintf drops about 3,000 bytes of an 81,000-byte file.
#define MAXSIZE 1000
int bytes_read, thread_fd;
char buffer[MAXSIZE];
FILE* new_file;
memset(buffer, 0, MAXSIZE);
if((new_file = fopen(path, "wb+")) == NULL)
{
printf("can not open file \n");
exit(EXIT_FAILURE);
}
while ((bytes_read = read(thread_fd, buffer, MAXSIZE)) > 0)
{
fprintf(new_file, "%s", buffer);
if(bytes_read < MAXSIZE)
break;
memset(buffer, 0, MAXSIZE);
}
You read binary data from the socket, which may or may not contain a \0 byte. When you then pass that data to fprintf with "%s", fprintf stops at the first \0 it encounters; in your case that leaves the file 3000 bytes short. If a full buffer contains no \0 byte at all, fprintf keeps reading past the end of the buffer, which is undefined behavior and may crash.
Use write() to write the data back to the file and check for errors. Don't forget to close() the file and check that for errors too.
Your code should/could look like:
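int readfile(int thread_fd, char *path)
{
    int bytes_read; /* signed, so a -1 error return from read() ends the loop */
    char buffer[MAXSIZE];
    int new_file;

    /* open for writing; without _O_WRONLY the file would be opened read-only */
    if ((new_file = open(path, _O_WRONLY|_O_CREAT|_O_TRUNC|_O_BINARY, _S_IWRITE)) == -1) return -1;
    while ((bytes_read = read(thread_fd, buffer, MAXSIZE)) > 0)
    {
        /* write exactly the bytes that were read, embedded \0 bytes included */
        if (write(new_file, buffer, bytes_read) != bytes_read) {
            close(new_file);
            return -2;
        }
    }
    if (close(new_file) == -1) return -3;
    return (bytes_read < 0) ? -4 : 0; /* -4: read() itself failed */
}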
There are a few issues with your code that can cause this.
The most likely cause is this :
if(bytes_read < MAXSIZE)
break;
This ends the loop when read returns fewer bytes than requested. This is however perfectly normal behavior, and can happen e.g. when not enough bytes are available at the time of the read call (it's reading from a network socket after all). Just let the loop continue as long as read returns a value > 0 (assuming the socket is a blocking socket; if not, you'll also have to check for EAGAIN and EWOULDBLOCK).
Additionally, if the file you're receiving contains binary data, then it's not a good idea to use fprintf with "%s" to write to the target file. This will stop writing as soon as it finds a '\0' byte (which is not uncommon in binary data). Use fwrite instead.
Even if you're receiving text (suggested by the html file extension), it's still not a good idea to use fprintf with "%s", since the received data won't be '\0' terminated.
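To make the difference concrete, here is a minimal sketch of the fwrite approach next to a helper that reports how many bytes a "%s" conversion would have picked up from the same buffer. Both function names are made up for illustration. The second helper is deliberately bounded by len; a real "%s" has no such bound and would keep reading past the buffer if no '\0' were present.

```c
#include <stdio.h>
#include <string.h>

/* Write `len` bytes of `data` to `fp`. The count comes from the caller
 * (for example the return value of read()), not from a '\0' terminator. */
int write_chunk(FILE *fp, const char *data, size_t len)
{
    return fwrite(data, 1, len, fp) == len ? 0 : -1;
}

/* How many bytes a "%s"-style write would emit for the same buffer:
 * everything before the first '\0' (bounded at len here for safety). */
size_t bytes_seen_by_percent_s(const char *data, size_t len)
{
    const char *nul = memchr(data, '\0', len);
    return nul ? (size_t)(nul - data) : len;
}
```

In the receive loop this would be used as write_chunk(new_file, buffer, bytes_read), so every byte that read() delivered ends up in the file.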
This worked!
ps: I don't know if I should be doing this, since I am new here, but really there is no reason for negativity. Any question is a good question. Just answer it if you know it. Do not judge it.
#define MAXSIZE 1000
int bytes_read, thread_fd, new_file;
char buffer[MAXSIZE];
memset(buffer, 0, MAXSIZE);
if((new_file = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644)) < 0)
{
printf("can not open file \n");
exit(EXIT_FAILURE);
}
while ((bytes_read = read(thread_fd, buffer, MAXSIZE)) > 0)
write(new_file, buffer, bytes_read);
close(new_file);

C - Print lines from file with getline()

I am trying to write a simple C program that loads a text-file, prints the first line to screen, waits for the user to press enter and then prints the next line, and so on.
As only argument it accepts a text-file that is loaded as a stream "database". I use the getline()-function for this, according to this example. It compiles fine, successfully loads the text-file, but the program never enters the while-loop and then exits.
#include <stdio.h>
#include <stdlib.h>
FILE *database = NULL; // input file
int main(int argc, char *argv[])
{
/* assuming the user obeyed syntax and gave input-file as first argument*/
char *input = argv[1];
/* Initializing input/database file */
database = fopen(input, "r");
if(database == NULL)
{
fprintf(stderr, "Something went wrong with reading the database/input file. Does it exist?\n");
exit(EXIT_FAILURE);
}
printf("INFO: database file %s loaded.\n", input);
/* Crucial part printing line after line */
char *line = NULL;
size_t len = 0;
ssize_t read;
while((read = getline(&line, &len, database)) != -1)
{
printf("INFO: Retrieved line of length %zd :\n", read);
printf("%s \n", line);
char confirm; // wait for user keystroke to proceed
scanf("%c", &confirm);
// no need to do anything with "confirm"
}
/* tidy up */
free(line);
fclose(database);
exit(EXIT_SUCCESS);
}
I tried it with fgets() -- I can also post that code --, but same thing there: it never enters the while-loop.
It might be something very obvious; I am new to programming.
I use the gcc-compiler on Kali Linux.
Replace your scanf call with a line-reading call such as getline() or fgets(), passing stdin as the stream argument.
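One way to read that suggestion: scanf("%c") consumes a single character per call, so if the user types a few characters before pressing Enter, several lines get printed at once. A small helper that swallows the whole input line avoids this. This is only a sketch, and wait_for_enter is a hypothetical name.

```c
#include <stdio.h>

/* Consume input up to and including the next newline.
 * Returns 0 when a newline was seen, -1 on EOF or error. */
int wait_for_enter(FILE *in)
{
    int c;

    while ((c = fgetc(in)) != EOF) {
        if (c == '\n')
            return 0;
    }
    return -1;
}
```

Inside the printing loop, wait_for_enter(stdin) would take the place of the scanf call.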
You should step through this in a debugger, to make sure your claim that it never enters the while loop is correct.
If it truly never enters the while loop, it is necessarily because getline() has returned -1. Either the file is truly empty, or you have an error reading the file.
man getline says:
On success, getline() and getdelim() return the number of characters read, including the delimiter character, but not including the terminating null byte ('\0'). This value can be used to handle embedded null bytes in the line read.
Both functions return -1 on failure to read a line (including end-of-file condition). In the event of an error, errno is set to indicate the cause.
Therefore, you should enhance your code to check for stream errors and deal with errno -- you should do this even when your code works, because EOF is not the only reason for the function to return -1.
ssize_t nread = getline(&line, &len, database); // len stays the size_t buffer size
if (nread == -1 && ferror(database)) {
    perror("Error reading database");
}
You can write more detailed code to deal with errno in more explicit ways.
Unfortunately handling this thoroughly can make your code a bit more verbose -- welcome to C!
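Put together, a full read loop with that check might look like the sketch below. count_lines is a made-up helper that stands in for the printing loop. It assumes a POSIX system, since getline() is not part of ISO C.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Read every line of `fp` with getline().
 * Returns the number of lines read, or -1 if the stream reported an error. */
long count_lines(FILE *fp)
{
    char *line = NULL;   /* getline allocates and grows this buffer */
    size_t cap = 0;      /* current capacity of `line` */
    ssize_t nread;       /* signed: getline returns -1 on EOF or error */
    long lines = 0;

    while ((nread = getline(&line, &cap, fp)) != -1)
        lines++;

    free(line);

    /* -1 means either EOF or an error; ferror tells the two apart. */
    if (ferror(fp)) {
        perror("getline");
        return -1;
    }
    return lines;
}
```

A return value of 0 then means the file really is empty, while -1 means the read itself failed and perror has already printed the reason.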

why is this not writing (receiving) the correct number of Bytes?

I receive data from an ftp socket connection. The connection seems to be fine but for some reason, I don't get the correct number of Bytes written to my destination file.
My source file has a size of 18735004 Bytes and my algorithm writes 19713024 Bytes to the file. How can this be?
The code I have:
if (ftpXfer ("3.94.213.53", "**", "******", NULL,
"RETR %s", "/home/ge", "ngfm.bin",
&ctrlSock, &dataSock) == ERROR)
return (ERROR);
pFile = fopen( "flash:/ngfm.bin", "wb" );
if ( pFile == NULL ) {
printf("fopen() failed!\n");
status = ERROR;
}
while ((nBytes = read (dataSock, buf, sizeof (buf))) > 0) {
cnt++;
n+=fwrite (buf , sizeof(char), sizeof(buf), pFile);
if(cnt%100==0)
printf(".");
}
fclose( pFile );
printf("%d Bytes written to flash:/ngfm.bin\n",n);
The screen output ended with:
19713024 Bytes written to flash:/ngfm.bin
What's wrong here?
You are ignoring the nBytes return value from read() and instead always writing sizeof buf bytes to the output. That's wrong: for partial reads (where nBytes is less than sizeof buf) you are injecting junk into the written stream.
The write should of course use nBytes, too.
Also: the write can fail, and write less than you requested, so you need to loop it until you know that all bytes have been written, or you get an error from it.
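Both points can be sketched together. The code below is an assumption-laden illustration rather than a drop-in fix: it uses POSIX read() and write() on file descriptors instead of the fwrite() call from the question, and write_all and copy_fd are made-up names.

```c
#include <errno.h>
#include <unistd.h>

/* Write all `len` bytes, retrying after short writes and EINTR.
 * Returns 0 on success, -1 on error. */
int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t w = write(fd, buf, len);
        if (w < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing: try again */
            return -1;
        }
        buf += w;               /* skip past what was written */
        len -= (size_t)w;
    }
    return 0;
}

/* Copy from in_fd to out_fd until EOF.
 * Returns the number of bytes copied, or -1 on error. */
long copy_fd(int in_fd, int out_fd)
{
    char buf[4096];
    ssize_t n;
    long total = 0;

    while ((n = read(in_fd, buf, sizeof buf)) != 0) {
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        /* Write n bytes (what read returned), not sizeof buf. */
        if (write_all(out_fd, buf, (size_t)n) != 0)
            return -1;
        total += n;
    }
    return total;
}
```

With this shape the byte count printed at the end is the sum of what read() delivered, so it should match the source size whenever the transfer itself is byte-exact.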
It seems you aren't putting your FTP server in binary mode, and it's transferring in ascii. This replaces every \n with a \r\n sequence.
Additionally, unwind's reply is correct as well.

Using fread() hangs until killed

This is the general structure of my code:
if (contentLength > 0)
{
// Send POST data
size_t sizeRead = 0;
char buffer[1024];
while ((sizeRead < contentLength) && (!feof(stream)))
{
size_t diff = contentLength - sizeRead;
if (diff > 1024)
diff = 1024;
// Debugging
fprintf(stderr, "sizeRead: %zu\n", sizeRead);
fprintf(stderr, "contentLength: %ul\n", contentLength);
fprintf(stderr, "diff: %zu\n", diff);
size_t read = fread(buffer, 1, diff, stream);
sizeRead += read;
exit(1);
// Write to pipe
fwrite(buffer, 1, read, cgiPipePost);
exit(1);
}
}
However, the program hangs when it hits the fread() line. If I add an exit() before that line, the program exits. If I add it after, the program hangs until I send a SIGINT signal.
Any help would be appreciated, I have been stuck on this for quite some time now.
Thanks
fread tries to fill an internal buffer. Depending on the implementation, you may be able to stop or limit it by setting the buffering mode (in particular, setting _IONBF, see setbuf, should work for all implementations). The general rule, though, is to avoid mixing counted I/O on sockets with stdio at all and to use raw read calls instead.
Also, while it's not biting you here, a !feof(stream) test is almost always wrong: people mean this to be predictive (EOF is about to occur), but feof is only "post-dictive": after a read operation fails (getc or fgetc returns EOF), the feof and ferror indicators allow you to discover why the previous failure occurred.
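A loop that is driven by fread's return value rather than by feof could be sketched like this. copy_n is a hypothetical helper. On a socket-backed stream fread can still block while it waits for the requested count, so this addresses the feof point only, not the hang itself.

```c
#include <stdio.h>

/* Copy up to `want` bytes from `in` to `out`, stopping early at EOF or error.
 * Returns the number of bytes actually copied. */
size_t copy_n(FILE *in, FILE *out, size_t want)
{
    char buf[1024];
    size_t done = 0;

    while (done < want) {
        size_t chunk = want - done;
        if (chunk > sizeof buf)
            chunk = sizeof buf;

        size_t got = fread(buf, 1, chunk, in);
        if (got == 0)
            break;      /* nothing read: the caller asks feof/ferror why */
        if (fwrite(buf, 1, got, out) != got)
            break;      /* write error on `out` */
        done += got;
    }
    return done;
}
```

After the call, a return value smaller than want tells the caller something stopped the copy early; feof(in) and ferror(in) then say whether it was end of input or a read error.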

c read() causing bad file descriptor error

Context for this is that the program is basically reading through a filestream, 4K chunks at a time, looking for a certain pattern. It starts by reading in 4k, and if it doesn't find the pattern there, it starts a loop which reads in the next 4k chunk (rinse and repeat until EOF or pattern is found).
On many files the code is working properly, but some files are getting errors.
The code below is obviously highly redacted, which I know might be annoying, but it includes ALL lines that reference the file descriptor or the file itself. I know you don't want to take my word for it, since I'm the one with the problem...
Having done a LITTLE homework before crying for help, I've found:
The file descriptor happens to always = 6 (it's also 6 for the files that are working), and that number isn't getting changed through the life of execution. Don't know if that's useful info or not.
By inserting print statements after every operation that accesses the file descriptor, I've also found that successful files go through the following cycle "open-read-close-close" (i.e. the pattern was found in the first 4K)
Unsuccessful files go "open-read-read ERROR (Bad File Descriptor)-close." So no premature close, and it's getting in the first read successfully, but the second read causes the Bad File Descriptor error.
int function(char *file)
{
int len, fd, go = 0;
char buf[4096];
if((fd = open(file, O_RDONLY)) <= 0)
{
my_error("Error opening file %s: %s", file, strerror(errno));
return NULL;
}
//first read
if((len = read(fd, buf, 4096)) <= 0)
{
my_error("Error reading from file %s: %s", file, strerror(errno));
close(fd); return NULL;
}
//pattern-searching
if(/*conditions*/)
{
/* we found it, no need to keep looking*/
close(fd);
}
else
{
//reading loop
while(!go)
{
if(/*conditions*/)
{
my_error("cannot locate pattern in file %s", file);
close(fd); return NULL;
}
//next read
if((len = read(fd, buf, 4096)) <= 0) /**** FAILS HERE *****/
{
my_error("Error reading from file, possible bad message %s: %s",
file, strerror(errno));
close(fd); return NULL;
}
if(/*conditions*/)
{
close(fd);
break;
}
//pattern searching
if(/*conditions*/)
{
/* found the pattern */
go++; //break us out of the while loop
//stuff
close(fd);
}
else
{
//stuff, and we will loop again for the next chunk
}
} /*end while loop*/
}/*end else statement*/
close(fd);
}
Try not to worry about the pattern-reading logic - all operations are done on the char buffer, not on the file, so it ought to have no impact on this problem.
At end of file, read() returns 0 (which falls into the if ... <= 0 branch) but does not set errno, so errno may still hold an out-of-date code from an earlier call.
Try testing for 0 (EOF) and negative (error, -1) values separately.
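Testing the two cases separately could look like the sketch below. read_chunk and its name parameter are made up for illustration, and the messages are placeholders for the question's my_error calls.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Read the next chunk. Returns >0 for data, 0 at end of file, -1 on error.
 * errno is consulted only in the error case, where read() actually set it. */
ssize_t read_chunk(int fd, char *buf, size_t cap, const char *name)
{
    ssize_t len = read(fd, buf, cap);

    if (len < 0) {
        fprintf(stderr, "Error reading %s: %s\n", name, strerror(errno));
        return -1;
    }
    if (len == 0) {
        /* Not an error: the file simply ended before the pattern was found. */
        fprintf(stderr, "Reached end of %s\n", name);
        return 0;
    }
    return len;
}
```

With this split, a file that ends before the pattern shows up is reported as end of file instead of with whatever stale message errno happened to map to.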
Regarding "strace": I've used it a little at home, and in previous jobs. Unfortunately, it's not installed in my current work environment. It is a useful tool, when it's available. Here, I took the "let's read the fine manual" (man read) approach with the questioner :-)
