Failure to write subsequent compressed data to an output file in C

I am reading data from an input file and compressing it in C with the bzip2 library function BZ2_bzCompress. The compression itself succeeds, but I cannot write all of the compressed data to the output file: only the first compressed line is written. Am I missing something here?
int main()
{
bz_stream bz;
FILE* f_d;
FILE* f_s;
BZFILE* b;
int bzerror = -10;
unsigned int nbytes_in;
unsigned int nbytes_out;
char buf[3000] = {0};
int result = 0;
char buf_read[500];
char file_name[] = "/path/file_name";
long int save_pos;
f_d = fopen ( "myfile.bz2", "wb+" );
f_s = fopen(file_name, "r");
if ((!f_d) && (!f_s)) {
printf("Cannot open files");
return(-1);
}
bz.opaque = NULL;
bz.bzalloc = NULL;
bz.bzfree = NULL;
result = BZ2_bzCompressInit(&bz, 1, 2, 30);
while (fgets(buf_read, sizeof(buf_read), f_s) != NULL)
{
bz.next_in = buf_read;
bz.avail_in = sizeof(buf_read);
bz.next_out = buf;
bz.avail_out = sizeof(buf);
printf("%s\n", buf_read);
save_pos = ftell(f_d);
fseek(f_d, save_pos, SEEK_SET);
while ((result == BZ_RUN_OK) || (result == 0) || (result == BZ_FINISH_OK))
{
result = BZ2_bzCompress(&bz, (bz.avail_in) ? BZ_RUN : BZ_FINISH);
printf("2 result:%d,in:%d,outhi:%d, outlo:%d \n",result, bz.total_in_lo32, bz.total_out_hi32, bz.total_out_lo32);
fwrite(buf, 1, bz.total_out_lo32, f_d);
}
if (result == BZ_STREAM_END)
{
result = BZ2_bzCompressEnd(&bz);
}
printf("3 result:%d, out:%d\n", result, bz.total_out_lo32);
result = BZ2_bzCompressInit(&bz, 1, 2, 30);
memset(buf, 0, sizeof(buf));
}
fclose(f_d);
fclose(f_s);
return(0);
}

TL;DR: there are multiple problems, but the main one that explains the problem you asked about is likely that you compress each line of the file independently, instead of the whole file as a unit.
According to the docs of BZ2_bzCompressInit, the bz_stream argument should be allocated and initialized before the call. Yours is (automatically) allocated, but not (fully) initialized. It would be clearer and easier to change to
bz_stream bz = { 0 };
and then skip the assignments to bz.opaque, bz.bzalloc, and bz.bzfree.
You store the return value of your BZ2_bzCompressInit call but never really check it. It does eventually get tested in the condition of the inner while loop, but that condition only recognizes the success and normal-completion values; it does not detect error conditions.
Your handling of the input buffer is significantly flawed.
In the first place, you set the number of available input bytes incorrectly:
bz.avail_in = sizeof(buf_read);
Since you're using fgets() to read data into the buffer, under no circumstances is the full size of the buffer occupied by input data, because fgets() ensures that a string terminator is written into the array. In fact, it could be worse, because fgets() stops after a newline, so it may provide as few as one input byte on a successful read.
If you want to stick with fgets() then you need to use strlen() to determine the number of bytes available from each read, but I would suggest that you instead switch to fread(), which will more reliably fill the buffer, indicate with its return value how many bytes were read, and correctly handle inputs containing null bytes.
In the second place, you use BZ2_bzCompress() to compress each buffer of input as if it were a complete file. When you come to the end of the buffer, you finish a compression run and reinitialize the bz_stream. This will definitely interfere with decompressing, and it may explain why your program (seems to) compress only the first line of its input. You should be reading the whole content of the file (in suitably-sized chunks) and feeding all of it to BZ2_bzCompress(... BZ_RUN) before you finish up. There should be one sequence of calls to BZ2_bzCompress(... BZ_FINISH) and finally one call to BZ2_bzCompressEnd() for the whole file, not per line.
You do not perform error detection or handling for any of your calls to standard library or bzip functions. You do handle the expected success-case return values for some of these, but you need to be prepared for errors, too.
There are some additional oddities:
you have unused variables nbytes_in, nbytes_out, bzerror, and b.
you open the input file as a text file, though whether that makes any difference is platform-dependent.
the ftell() / fseek() pair has no overall effect other than setting save_pos, which is not otherwise used.
although it is not harmful, it also is not useful to memset() the output buffer to all-zeroes at the end of each line (or initially).
Given that you're compressing the input, it's odd (but again not harmful) that you provide six times as much output buffer as you do input buffer.

Related

Why the fread() function is not reaching the end-of-file and the loop continues infinitely, while reading structures from a file?

The code snippet given below is part of a Customer Billing System project that I'm writing in C. I read the customer data into a Customer structure and save the entire structure to the "CustomerRecord.dat" file using the writefile() function defined in my code. After successfully writing the customer data to the file, I try to traverse the file and assign a customer number to each record, using the setCustomerNo() function defined in my code.
The problem is that the while loop inside setCustomerNo() runs forever and keeps reading even past the end of the file.
Please help me solve this issue.
I've tried replacing:
while(!feof(fp)) { // LOOP BODY }
while(!feof(fp) && !ferror(fp)) { // LOOP BODY }
while(fread(&x, sizeof(Customer), 1, fp) || !feof(fp)) { // LOOP BODY }
But none of these above replacements worked for me.
#define RECORD CustomerRecord.dat
typedef struct
{
int customerNo;
unsigned int phoneNo;
char name[20];
char address[32];
float balance;
}Customer;
void writefile(Customer x)
{
FILE *fp;
fp = fopen("CustomerRecord.dat", "ab");
fwrite(&x, sizeof(Customer), 1, fp);
fclose(fp);
}
Customer setCustomerNo()
{
FILE *fp;
Customer x, y;
int k = 1;
fp = fopen(RECORD, "rb+");
while(!feof(fp) && !ferror(fp))
{
fread(&x, sizeof(Customer), 1, fp);
x.customerNo = k++;
y = x;
fseek(fp, -sizeof(Customer), SEEK_CUR);
fwrite(&x, sizeof(Customer), 1, fp);
}
fclose(fp);
return y;
}
The expected result is that the function setCustomerNo() will read all the structures stored in file from the beginning and update customerNo(s) starting from 1 and so on to the structures stored in the file. At the end return the last updated structure.

You definitely don't want your loop to test feof(fp) -- indeed, that is practically never right -- because feof(fp) will always be false at the end of the loop body, so the loop will never end. Why? The fseek manpage spells it out:
A successful call to the fseek() function clears the end-of-file indicator for the stream.
Anyway, if you think about it, you want to terminate the loop exactly when fread reports that it got no data. You certainly don't want to process the non-existent data that fread didn't read. So testing whether there was an EOF after you handle the possibly unread data can't be right, even if feof(fp) actually returned a useful value.
A couple of other notes:
You're better off using ftell to record the read position before the read, and then fseeking back to the saved point. That avoids a bunch of corner cases, and is easily upgraded to use the more general fseeko/ftello functions, which work properly with large files. (Or you could just go directly to using fseeko and ftello.)
Think about what you want to do if the last read is partial. Of course, that "can't happen" if you always create fixed-length records. But everything that "can't happen" eventually does, for one reason or another. Best to deal with it.
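As a rough sketch of those suggestions, the loop can be driven by the return value of fread() instead of feof(), with ftell() saving the record position before each read. The function name renumber_records and its return convention (a record count rather than the last structure) are my own choices for illustration. The extra fseek(fp, 0L, SEEK_CUR) after the write is there because the C standard requires a positioning call or fflush() before switching an update stream from writing back to reading.

```c
#include <stdio.h>

typedef struct {
    int customerNo;
    unsigned int phoneNo;
    char name[20];
    char address[32];
    float balance;
} Customer;

/* Renumber every complete record in a stream opened with "rb+" (or similar),
   starting from 1. Returns the number of records updated, or -1 on I/O error.
   A trailing partial record is left untouched. */
long renumber_records(FILE *fp)
{
    Customer x;
    long count = 0;

    for (;;) {
        long pos = ftell(fp);                  /* where this record starts */
        if (pos < 0)
            return -1;
        if (fread(&x, sizeof x, 1, fp) != 1)
            break;                             /* EOF, partial record, or error */
        x.customerNo = (int)(count + 1);
        if (fseek(fp, pos, SEEK_SET) != 0)     /* back to the record start */
            return -1;
        if (fwrite(&x, sizeof x, 1, fp) != 1)
            return -1;
        if (fseek(fp, 0L, SEEK_CUR) != 0)      /* required between write and read */
            return -1;
        count++;
    }
    return ferror(fp) ? -1 : count;
}
```

Because the loop stops as soon as fread() fails to deliver a whole record, no nonexistent data is ever processed, and the end-of-file indicator being cleared by fseek() no longer matters.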

How could I "manually" signify EOF when writing a file?

I have this function :
int cipher_file(char *file_path, uint8_t *key, int key_size){
FILE *file;
size_t read_char_count, wrote_char_count;
fpos_t *pos = malloc(sizeof(fpos_t));
char *block = malloc(16*sizeof(uint8_t));
if ( !(file = fopen(file_path, "rb+")) ) {
return EXIT_FAILURE;
}
while(!feof(file)){
while( ( read_char_count = fread(block, 1, 16*sizeof(uint8_t), file) ) > 0 ) {
block = cipher_block(block, key, key_size);
fseek(file, -read_char_count, SEEK_CUR);
wrote_char_count = fwrite(block , 1, 16*sizeof(uint8_t), file);
}
}
fclose(file);
return EXIT_SUCCESS;
}
(I know ECB mode is not safe btw)
The function takes a file, breaks it into blocks of 128 bits, encrypts each block with AES, and writes the blocks back to the file, effectively replacing the plaintext with ciphertext.
I also wrote a function decipher_file() to decipher the file.
The issue is that if the file size is not a multiple of 128 bits, the final fread() only partially replaces the content of "block" (which is 16 bytes long) with the characters it read, leaving a bunch of garbage from the previous ciphered block.
When deciphering, decipher_file() has no way of knowing the size of the original file, so it deciphers all the content, including the garbage characters, and writes it back to the file.
I also tried reinitializing "block" with zeros on each round but, unsurprisingly, the zeros were added to the file too, which can be very problematic.
So my question is, is there a way (like a function) to signify where the file ends, or tell fwrite() to stop writing?

You can't use a special character because the encrypted data might end up looking like it. It would not be an elegant solution anyway.
There are multiple solutions:
Prefix the file with the decrypted content length. That's very clean and easy to implement.
Use a scheme that retains length information, which plain ECB does not: either add a padding scheme, or use a mode that preserves the length, such as counter mode.
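To illustrate the padding option, here is a small sketch of PKCS#7-style padding, which is one common padding scheme and my own choice of example; the answer above does not name a specific one. The function names pkcs7_pad and pkcs7_unpad and the BLOCK_SIZE macro are hypothetical. Every padding byte stores the number of padding bytes, so the decrypting side can tell how much of the final block is real data.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 16

/* Pad the final block in place before encrypting it.
   `used` is the number of real data bytes in the block (0..15).
   If the file length is an exact multiple of BLOCK_SIZE, the caller
   appends one extra block and pads it with used == 0. */
void pkcs7_pad(uint8_t block[BLOCK_SIZE], size_t used)
{
    uint8_t pad = (uint8_t)(BLOCK_SIZE - used);
    memset(block + used, pad, pad);
}

/* Inspect the decrypted final block.
   Returns the number of real data bytes (0..15), or -1 if the
   padding is malformed. */
int pkcs7_unpad(const uint8_t block[BLOCK_SIZE])
{
    uint8_t pad = block[BLOCK_SIZE - 1];
    if (pad == 0 || pad > BLOCK_SIZE)
        return -1;
    for (size_t i = BLOCK_SIZE - pad; i < BLOCK_SIZE; i++) {
        if (block[i] != pad)
            return -1;
    }
    return BLOCK_SIZE - pad;
}
```

With this scheme the encrypted file is always a whole number of blocks, and after decrypting you would truncate the output to drop the padding bytes of the last block.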

Fread Abort 6 error

In my code I am sending packets of 128 bytes each, read from a text file (I can't just allocate a buffer and read all of it before sending, because the file will be extremely large). For some reason I am getting an Abort 6 error even though I have allocated memory.
sendIndex starts at 0 and the program aborts on the first send, so that shouldn't be the problem.
The problem occurs during strcpy; I just don't know why.
I'm really confused, so I would appreciate the help.
struct packet packingT;
packingT.header = mpHeaderT;
packingT.data = (char*) calloc(512,sizeof(char));
char* sendString = (char*)calloc(128,sizeof(char));
FILE *file = fopen(receivedStruct->fileTitle, "rb");
if(file == NULL) {
printf("Error - Can't Open File\n");
exit(0);
}
fseek(file, 128*sendIndex, SEEK_SET);
fread(sendString, 128, 1,file);
fclose(file);
// sendString[128] = '\0'; <--- Still don't know if this is needed
packingT.header->seq_num = receivedStruct->nextSeqNum;
strcpy(packingT.data, sendString);

I think all you need to do is replace the final strcpy with memcpy instead. That is, the last line should be memcpy(packingT.data, sendString, 128);
(Edit: The reason being that strcpy determines the length of the thing to be copied by scanning for a zero at the end. You're reading arbitrary data, which may have zeros in the middle, and may not always end in a zero)
(Edit2: please be aware that the content of packingT.data is not terminated, so you can't use string functions on it. Depending on what you're doing, you might need to add a terminator, or ensure one gets written to the file)
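A possible shape for this, as a sketch only: let fread() report how many bytes it actually delivered and hand exactly that count to memcpy(). The helper name load_chunk, the CHUNK_SIZE macro, and the dst_size parameter are assumptions of mine, not part of the original code.

```c
#include <stdio.h>
#include <string.h>

#define CHUNK_SIZE 128

/* Copy chunk number `index` of the file into dst.
   Returns the number of bytes copied (0..CHUNK_SIZE); the last chunk
   of a file may be short. dst is NOT null-terminated. */
size_t load_chunk(FILE *file, long index, char *dst, size_t dst_size)
{
    char chunk[CHUNK_SIZE];

    if (fseek(file, index * CHUNK_SIZE, SEEK_SET) != 0)
        return 0;

    /* Element size 1, so the return value is a byte count. */
    size_t n = fread(chunk, 1, sizeof chunk, file);
    if (n > dst_size)
        n = dst_size;

    memcpy(dst, chunk, n);  /* copies exactly n bytes, zeros included */
    return n;
}
```

The returned count is also what you would store as the payload length in the packet, since the receiver cannot rely on a terminator either.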

How to duplicate an image file? [duplicate]

I am designing an image decoder, and as a first step I tried to just copy the file using C, i.e. open the file and write its contents to a new file. Below is the code that I used.
while((c=getc(fp))!=EOF)
fprintf(fp1,"%c",c);
where fp is the source file and fp1 is the destination file.
The program executes without any error, but the image file (".bmp") is not copied properly. The copied file is smaller than the original and only 20% of the image is visible; the rest is black. When I tried with simple text files, the copy was complete.
Do you know what the problem is?

Make sure that the type of the variable c is int, not char. In other words, post more code.
This is because the value of the EOF constant is typically -1, and if you read characters as char-sized values, every byte that is 0xff will look like the EOF constant. With the extra bits of an int, there is room to separate the two.
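For illustration, here is a minimal sketch of the same character-at-a-time copy with c declared as int. The function name copy_bytes and its return convention are my own. For image data, both streams would also need to be opened in binary mode ("rb" and "wb").

```c
#include <stdio.h>

/* Copy src to dst one byte at a time.
   Returns the number of bytes copied, or -1 on a read or write error. */
long copy_bytes(FILE *src, FILE *dst)
{
    int c;  /* int, not char: must hold every byte value AND the EOF marker */
    long count = 0;

    while ((c = getc(src)) != EOF) {
        if (putc(c, dst) == EOF)
            return -1;
        count++;
    }
    return ferror(src) ? -1 : count;
}
```

With c as an int, a data byte of 0xff comes back from getc() as 255, which compares unequal to EOF, so the loop does not stop early.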

Did you open the files in binary mode? What are you passing to fopen?

It's one of the most "popular" C gotchas.

You should use fread and fwrite, copying a block at a time:
FILE *fd1 = fopen("source.bmp", "rb");       // binary mode: no newline translation
FILE *fd2 = fopen("destination.bmp", "wb");
if (!fd1 || !fd2) {
    // handle open error
}
size_t l1;
unsigned char buffer[8192];
// Read a block, then write exactly as many bytes as were read
while ((l1 = fread(buffer, 1, sizeof buffer, fd1)) > 0) {
    size_t l2 = fwrite(buffer, 1, l1, fd2);
    if (l2 < l1) {
        if (ferror(fd2)) {
            // handle error
        } else {
            // handle media full
        }
    }
}
fclose(fd1);
fclose(fd2);
It's substantially faster to read in bigger blocks, and with both files opened in binary mode there is no problem with \n being transformed to \r\n in the output (on Windows and DOS) or to \r (on old Macs).

reading from a binary file in C

I am currently working on a project in which I have to read from a binary file and send it through sockets and I am having a hard time trying to send the whole file.
Here is what I wrote so far:
FILE *f = fopen(line,"rt");
//size = lseek(f, 0, SEEK_END)+1;
fseek(f, 0L, SEEK_END);
int size = ftell(f);
unsigned char buffer[MSGSIZE];
FILE *file = fopen(line,"rb");
while(fgets(buffer,MSGSIZE,file)){
sprintf(r.payload,"%s",buffer);
r.len = strlen(r.payload)+1;
res = send_message(&r);
if (res < 0) {
perror("[RECEIVER] Send ACK error. Exiting.\n");
return -1;
}
}
I think it has something to do with the size of the buffer that I read into, but I don't know what the correct formula for it is.
One more thing: is the sprintf done correctly?

If you are reading binary files, a NUL character may appear anywhere in the file.
Thus, using string functions like sprintf and strlen is a bad idea.
If you really need to use a second buffer (buffer), you could use memcpy.
You could also directly read into r.payload (if r.payload is already allocated with sufficient size).
You are looking for fread for a binary file.
The return value of fread tells you how many bytes were read into your buffer.
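If you seek to the end of the file to determine its size, remember to call fseek again to return to the beginning before reading.
See also the question "How can I get a file's size in C?"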
Maybe your code could look like this:
#include <stdint.h>
#include <stdio.h>
#define MSGSIZE 512
struct r_t {
uint8_t payload[MSGSIZE];
int len;
};
int send_message(struct r_t *t);
int main() {
struct r_t r;
FILE *f = fopen("test.bin","rb");
if (f == NULL) {
perror("fopen");
return -1;
}
fseek(f, 0L, SEEK_END);
size_t size = ftell(f);
fseek(f, 0L, SEEK_SET);
do {
r.len = fread(r.payload, 1, sizeof(r.payload), f);
if (r.len > 0) {
int res = send_message(&r);
if (res < 0) {
perror("[RECEIVER] Send ACK error. Exiting.\n");
fclose(f);
return -1;
}
}
} while (r.len > 0);
fclose(f);
return 0;
}

No, the sprintf is not done correctly. It is prone to buffer overflow, a very serious security problem.
I would consider sending the file as e.g. 1024-byte chunks instead of as line-by-line, so I would replace the fgets call with an fread call.
Why are you opening the file twice? Apparently to get its size, but you could open it only once and jump back to the beginning of the file. And, you're not using the size you read for anything.

Is it a binary file or a text file? fgets() assumes you are reading a text file -- it stops on a line break -- but you say it's a binary file and open it with "rb" (actually, the first time you opened it with "rt", I assume that was a typo).
IMO you should never ever use sprintf. The number of characters written to the buffer depends on the parameters that are passed in, and in this case if there is no '\0' in buffer then you cannot predict how many bytes will be copied to r.payload, and there is a very good chance you will overflow that buffer.
I think sprintf() would be the first thing to fix. Use memcpy() and you can tell it exactly how many bytes to copy.
