I have been looking for a way to brute-force search for an int64_t in a file in C.
I have written the following code:
int64_t readbyte = 0, totalreadbytes = 0;
int64_t totalfound = 0;
const int64_t magic = MAGIC_NUMBER;
char *buffer = (char *)malloc(BUFFER_SIZE);
int64_t *offsets = (int64_t *)malloc(sizeof(int64_t) * (1 << 24));
if (buffer == NULL || offsets == NULL)
{
return -3;
}
while ((readbyte = fread(buffer, 1, BUFFER_SIZE, inptr)) > 0)
{
for (int i = 0; i <= readbyte - 8; i++)
{
if (memcmp(buffer + i, &magic, sizeof(magic))==0)
{
offsets[totalfound++] = totalreadbytes + i;
}
}
totalreadbytes += readbyte - 8;
fseek(inptr, -8, SEEK_CUR);
}
// Do something to those offsets found
free(offsets);
free(buffer);
I have been wondering if there is a better way to find that int64_t, because my goal is to search files as large as 60 GB, and there may be several hundred thousand occurrences in such a file.
Backing up and re-reading data is going to slow things down quite a bit.
Building on melpomene's comment, here's a very simple way to do it with mmap():
uint64_t needle;
struct stat sb;
int fd = open( filename, O_RDONLY );
fstat( fd, &sb );
unsigned char *haystack = mmap( NULL, sb.st_size,
PROT_READ, MAP_PRIVATE, fd, 0 );
close( fd );
off_t bytesToSearch = sb.st_size - sizeof( needle );
// <= so the last bytes get searched
for ( off_t ii = 0; ii <= bytesToSearch; ii++ )
{
if ( 0 == memcmp( haystack + ii, &needle, sizeof( needle ) ) )
{
// found it!
}
}
Error checking and proper headers omitted for clarity.
There are a lot of ways to improve the performance of that. This IO pattern - read every byte in the file just once, then throw the mappings away - is the worst possible use of mmap() with regard to performance, because mapping a file isn't all that fast in the first place, and it impacts the entire machine.
It'd probably be a lot faster to just use open() and read() with direct IO in large page-sized chunks into page-aligned memory, especially if the file is a significant fraction of the system's RAM. That would make the code more complex, as the comparisons would have to span buffers, but it's almost certainly much faster to use two buffers and copy a few bytes to search across the break between buffers than it is to back up and do a non-aligned read.
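Not the direct-IO version, but here is a minimal sketch of that two-buffer/overlap idea using plain fread(), reusing BUFFER_SIZE, MAGIC_NUMBER and inptr from the question (it prints the offsets instead of storing them, just to keep the sketch short): the last sizeof(int64_t)-1 bytes of each read are kept at the front of the buffer, so a value straddling a read boundary is still found and nothing is ever read twice.
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* BUFFER_SIZE and MAGIC_NUMBER are assumed to be defined as in the question */
#define OVERLAP (sizeof(int64_t) - 1)
static int64_t search_stream(FILE *inptr)
{
    const int64_t magic = MAGIC_NUMBER;
    char *buffer = malloc(BUFFER_SIZE + OVERLAP);
    if (buffer == NULL)
        return -1;
    int64_t found = 0;      /* number of matches                          */
    int64_t fileoff = 0;    /* total bytes read from the file so far      */
    size_t carried = 0;     /* bytes carried over from the previous read  */
    size_t got;
    while ((got = fread(buffer + carried, 1, BUFFER_SIZE, inptr)) > 0) {
        size_t avail = carried + got;
        for (size_t i = 0; i + sizeof(magic) <= avail; i++) {
            if (memcmp(buffer + i, &magic, sizeof(magic)) == 0) {
                /* buffer[0] corresponds to file offset fileoff - carried */
                printf("match at %lld\n", (long long)(fileoff - (int64_t)carried + (int64_t)i));
                found++;
            }
        }
        /* keep the last OVERLAP bytes so a value straddling the boundary is
         * still seen on the next pass; a full 8-byte match can never lie
         * entirely inside those 7 bytes, so nothing is counted twice */
        carried = avail < OVERLAP ? avail : OVERLAP;
        memmove(buffer, buffer + avail - carried, carried);
        fileoff += (int64_t)got;
    }
    free(buffer);
    return found;
}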
Related
For a course assignment I have to implement the Huffman coding (compression & decompression) algorithm, first in the classic serial way, and then try to parallelize it using various methods (OpenMP, MPI, pthreads). The goal of the project is not necessarily to make it faster, but to analyze the results and discuss why they turn out the way they do.
The serial version works perfectly. However, for the parallel version, I am stuck on a file-reading problem. In the serial version, I have a piece of code that looks like this:
char *buffer = calloc(1, MAX_BUFF_SZ);
while ((bytes_read = fread(buffer, 1, MAX_BUFF_SZ, input)) > 0) {
compress_chunk(buffer, t, output);
memset(buffer, 0, MAX_BUFF_SZ);
}
This reads at most MAX_BUFF_SZ bytes from the input file and then encodes them. I used the memset call to handle the case when bytes_read < MAX_BUFF_SZ (maybe a cleaner solution exists, though).
However, for the parallel version (using OpenMP, for example), I want each thread to analyze only a portion of the file, but the reading still to be done in chunks. Knowing that each thread has an id thread_id and that there are total_threads threads in total, I calculate the start and end positions as follows:
int slice_size = (file_size + total_threads - 1) / total_threads;
int start = slice_size * thread_id;
int end = min((thread_id + 1) * slice_size, file_size);
I can move to the start position with a simple fseek(input, start, SEEK_SET) operation. However, I am not able to read the content in chunks. I tried the following code (just to make sure the operation is okay):
int total_bytes = 0;
while ((bytes_read = fread(buffer, 1, MAX_BUFF_SZ, input)) > 0) {
total_bytes += bytes_read;
if (total_bytes >= end) {
int diff = total_bytes - end;
buffer[diff] = '\0';
break;
}
fwrite(buffer, 1, bytes_read, output);
memset(buffer, 0, MAX_BUFF_SZ);
}
output is a different file for each thread. Even when I try with just 2 threads, some characters are missing from them. I think I am close to the right solution and have something like an off-by-one error.
So the question is: how can I read a slice of a file, but in chunks? Can you please help me identify the bug in the above code and make it work?
Edit:
If MAX_BUFF_SZ were bigger than the size of the input and I had, for example, 4 threads, what would clean code look like that ensures T0 does all the work while T1, T2 and T3 do nothing?
Some simple code that can be used to test the behavior is the following (note that it is not from the Huffman code; it is some auxiliary code to test things):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <omp.h>
#define MAX_BUFF_SZ 32
#define min(a, b) \
({ __typeof__ (a) _a = (a); \
__typeof__ (b) _b = (b); \
_a < _b ? _a : _b; })
int get_filesize(char *filename) {
FILE *f = fopen(filename, "r");
fseek(f, 0L, SEEK_END);
int size = ftell(f);
fclose(f);
return size;
}
static void compress(char *filename, int id, int tt) {
int total_bytes = 0;
int bytes_read;
char *newname;
char *buffer;
FILE *output;
FILE *input;
int fsize;
int slice;
int start;
int end;
newname = (char *) malloc(strlen(filename) + 12); /* room for "-<id>" and the NUL */
sprintf(newname, "%s-%d", filename, id);
fsize = get_filesize(filename);
buffer = calloc(1, MAX_BUFF_SZ);
input = fopen(filename, "r");
output = fopen(newname, "w");
slice = (fsize + tt - 1) / tt;
end = min((id + 1) * slice, fsize);
start = slice * id;
fseek(input, start, SEEK_SET);
while ((bytes_read = fread(buffer, 1, MAX_BUFF_SZ, input)) > 0) {
total_bytes += bytes_read;
printf("%s\n", buffer);
if (total_bytes >= end) {
int diff = total_bytes - end;
buffer[diff] = '\0';
break;
}
fwrite(buffer, 1, bytes_read, output);
memset(buffer, 0, MAX_BUFF_SZ);
}
fclose(output);
fclose(input);
}
int main() {
omp_set_num_threads(4);
#pragma omp parallel
{
int tt = omp_get_num_threads();
int id = omp_get_thread_num();
compress("test.txt", id, tt);
}
}
You can compile it with gcc test.c -o test -fopenmp. You may generate a file test.txt with some random characters, more than 32 (or change the max buffer size).
Edit 2:
Again, my problem is reading a slice of a file in chunks, not the analysis per se; I know how to do that. It's a university course, so I can't just say "IO bound, end of story, analysis complete".
Apparently I just had to take pen and paper and make a little diagram. After playing around with some indices, I came up with the following code (encbuff and written_bits are auxiliary variables I use, since I am actually writing bits to a file and use an intermediate buffer to limit the writes):
while ((bytes_read = fread(buffer, 1, MAX_BUFF_SZ, input)) > 0) {
total_bytes += bytes_read;
if (start + total_bytes > end) {
int diff = start + total_bytes - end;
buffer[bytes_read - diff] = '\0';
compress_chunk(buffer, t, output, encbuff, &written_bits);
break;
}
compress_chunk(buffer, t, output, encbuff, &written_bits);
memset(buffer, 0, MAX_BUFF_SZ);
}
I also finished implementing the OpenMP version. For small files the serial one is faster, but starting from about 25 MB, the parallel one starts to beat the serial one by about 35-45%. Thank you all for the advice.
Cheers!
I am trying to write multi-threaded code that reads a file in fixed-size chunks using mmap(2) and counts the words. Each thread works on a separate portion of the file to speed up processing. I am able to read the file with mmap(2) single-threaded, but when the number of threads is more than one, it fails with a segmentation fault.
for( unsigned long cur_pag_num = 0; cur_pag_num < total_blocks; cur_pag_num++ ) {
mmdata = mmap(
NULL, PAGE_SIZE, PROT_READ, MAP_PRIVATE, fd, (fileOffset + (cur_pag_num * PAGE_SIZE))
);
if (mmdata == MAP_FAILED) printf(" mmap error ");
unsigned long wc = getWordCount( mmdata );
parserParam->wordCount +=wc;
munmap( mmdata, PAGE_SIZE );
}
unsigned long getWordCount(char *page){
unsigned long wordCount=0;
for(long i = 0 ; page[i] ;i++ ){
if(page[i]==' ' || page[i]=='\n')
wordCount++;
}
return wordCount;
}
I have figured out that the code fails inside getWordCount(mmdata). What am I doing wrong here?
Note: the size of the file is larger than main memory, hence reading in fixed-size chunks of PAGE_SIZE.
getWordCount is accessing outside the mapped page, because the loop stops when it finds a null byte. But mmap() doesn't add a null byte after the mapped page. You need to pass the size of the mapped page to the function. It should stop when it reaches either that index or a null byte (if the file isn't long enough to fill the page, the rest of the page will be zeros).
for( unsigned long cur_pag_num = 0; cur_pag_num < total_blocks; cur_pag_num++ ) {
mmdata = mmap(
NULL, PAGE_SIZE, PROT_READ, MAP_PRIVATE, fd, (fileOffset + (cur_pag_num * PAGE_SIZE))
);
if (mmdata == MAP_FAILED) printf(" mmap error ");
unsigned long wc = getWordCount( mmdata, PAGE_SIZE );
parserParam->wordCount +=wc;
munmap( mmdata, PAGE_SIZE );
}
unsigned long getWordCount(char *page, long size){
unsigned long wordCount=0;
for(long i = 0 ; i < size && page[i] ;i++ ){
if(page[i]==' ' || page[i]=='\n')
wordCount++;
}
return wordCount;
}
BTW, there's another problem with your approach: a word that spans page boundaries will be counted twice.
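If that matters for your use case, one option (a sketch, separate from the code above; getWordCountCarry is a made-up name) is to count word starts rather than separators and carry a single flag from one chunk to the next, so a word split across a chunk boundary is counted exactly once. The boundaries between different threads' ranges would still need the same treatment.
#include <stddef.h>
/* *in_word survives across chunks: nonzero if the previous chunk ended
 * in the middle of a word */
unsigned long getWordCountCarry(const char *page, size_t size, int *in_word)
{
    unsigned long wordCount = 0;
    for (size_t i = 0; i < size; i++) {
        int sep = (page[i] == ' ' || page[i] == '\n');
        if (!sep && !*in_word)
            wordCount++;        /* a new word starts here */
        *in_word = !sep;
    }
    return wordCount;
}
/* per thread:
 *     int in_word = 0;
 *     for each mapped chunk: wc += getWordCountCarry(mmdata, PAGE_SIZE, &in_word);
 */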
I've no doubt there is an answer to this somewhere, I just can't find it.
I have just returned to C after a long break and am very rusty, so please excuse dumb errors. I need to generate a large string (maybe the equivalent of 10 MB). I don't know how long it's going to be until it's built.
I tried the following two approaches to test speed:
int main() {
#if 1
size_t message_len = 1; /* + 1 for terminating NULL */
char *buffer = (char*) malloc(message_len);
for (int i = 0; i < 200000; i++)
{
int size = snprintf(NULL, 0, "%d \n", i);
char * a = malloc(size + 1);
sprintf(a, "%d \n", i);
message_len += 1 + strlen(a); /* 1 + for separator ';' */
buffer = (char*) realloc(buffer, message_len);
strncat(buffer, a, message_len);
}
#else
FILE *f = fopen("test", "w");
if (f == NULL) return -1;
for (int i = 0; i < 200000; i++)
{
fprintf(f, "%d \n", i);
}
fclose(f);
FILE *fp = fopen("test", "r");
fseek(fp, 0, SEEK_END);
long fsize = ftell(fp);
fseek(fp, 0, SEEK_SET);
char *buffer = malloc(fsize + 1);
fread(buffer, fsize, 1, fp);
fclose(fp);
buffer[fsize] = 0;
#endif
char substr[56];
memcpy(substr, buffer, 56);
printf("%s", substr);
return 1;
}
The first solution of concatenating strings each time took 3.8s, the second of writing to a file then reading took 0.02s.
Surely there is a fast way to build a big string in C without resorting to reading and writing a file? Am I just doing something very inefficient? If not, can I write to some kind of file object, then read it at the end and never save it?
In C# you would use a StringBuilder to avoid the slow concatenation; what's the equivalent in C?
Thanks in advance.
You are making life pretty rough with these lines:
for (int i = 0; i < 200000; i++)
{
int size = snprintf(NULL, 0, "%d \n", i); // format the number once just to measure it...
char * a = malloc(size + 1); // ...allocate a fresh temporary for it...
sprintf(a, "%d \n", i); // ...and format it a second time; a is never freed
message_len += 1 + strlen(a); /* 1 + for separator ';' */
buffer = (char*) realloc(buffer, message_len);
strncat(buffer, a, message_len);
}
You format every number twice (once with snprintf just to measure it, once with sprintf for real), you malloc a fresh temporary a on every iteration and never free it - a sizeable memory leak - and, worst of all, you realloc buffer on every single append while strncat has to scan the whole, ever-growing string just to find its end. The work per iteration keeps growing, which is what makes this loop so slow.
The solution, in C, is to pre-allocate plenty of memory - and only reallocate in an emergency. If you know roughly how big your string will be, allocate all that memory at once; keep track of how big it is, and add more if you run short. At the end you can always give back what you didn't use. Too many calls to realloc keep moving memory around (since you often don't have enough contiguous memory available where the block currently sits). As Matt clarified in his comment: there is a real risk that every call to realloc moves the entire block of memory, and as the block gets bigger that becomes a quadratically increasing load on the system. Here is a possibly better solution (complete, and tested with small N and BLOCK just to show the principle; you will want a large N (your value of 200000) and a larger BLOCK, and to get rid of the printf statements that were there to show things are working):
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <time.h>
#define N 2000000
#define BLOCK 32
int main(void) {
size_t message_len = BLOCK; //
char *buffer = (char*) malloc(message_len);
int bb;
int i, n=0;
char* a = buffer;
clock_t start, stop;
for(bb = 1; bb < 128; bb *= 2) {
int rCount = 0;
start = clock();
for (i = 0; i < N; i++)
{
a = buffer + n;
n += sprintf(a, "%d \n", i);
if ((message_len - n) < BLOCK*bb) {
rCount++;
message_len += BLOCK*bb;
//printf("increasing buffer\n");
//printf("increased buffer to %ld\n", (long int)message_len);
buffer = realloc(buffer, message_len);
}
}
stop = clock();
printf("\nat the end, buffer length is %d; rCount = %d\n", strlen(buffer), rCount);
// buffer = realloc(buffer, strlen(buffer+1));
//printf("buffer is now: \n%s\n", buffer);
printf("time taken with blocksize = %d: %.1f ms\n", BLOCK*bb, (stop - start) * 1000.0 / CLOCKS_PER_SEC);
}
}
You will want to use a fairly large value for BLOCK - this will limit the number of calls to realloc. I would use something like 100000; you get rid of the space at the end anyway.
EDIT: I modified the code I had posted to allow timing of the loop, increasing N to 2 million to get "reasonable" times. I also minimized the initial memory allocation (to force a lot of calls to realloc) and fixed a bug (when realloc had to move memory, a was no longer pointing into buffer; that is fixed now by keeping track of the string length so far in n).
This is pretty fast - 450 ms for the smallest block, dropping to 350 ms for larger blocks (2 million numbers). That is comparable (within the resolution of my measurement) to your file read/write operation. But yes - file I/O streaming and associated memory management are highly optimized...
I have left out some details, but my approach is generally like this
create a structure like this one
typedef struct {
char *curr ;
char *start ;
char *end ;
} VBUF ;
write some functions along these lines:
void vbuf_alloc(VBUF *v,int n)
{
v->start = malloc(n) ;
v->end = v->start + n ;
v->curr = v->start ;
}
int vbuf_add(VBUF *v,char *s,int length)
{
if (v->end - v->curr < length) {
vbuf_realloc(v,(v->end - v->start) * 2) ;
}
memcpy(v->curr,s,length) ;
v->curr += length ;
return length ;
}
int vbuf_adds(VBUF *v,char *s)
{
return vbuf_add(v,s,strlen(s)) ;
}
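vbuf_realloc is used by vbuf_add above but not shown; a minimal sketch, assuming the same VBUF layout, might be:
void vbuf_realloc(VBUF *v,int n)
{
    int used = v->curr - v->start ;   /* how much is already filled */
    char *p = realloc(v->start,n) ;
    if (p == NULL)
        abort() ;                     /* a real implementation would report this */
    v->start = p ;                    /* realloc may have moved the block */
    v->curr = p + used ;
    v->end = p + n ;
}
One caveat: vbuf_add doubles only once, so if a single append can be larger than the whole current buffer, the growth needs to loop (or grow straight to used + length).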
You can extend this suite of functions as much as you like.
C has no objects, so there's no direct equivalent of the C# StringBuilder (though in C++ you would use std::string).
You would get a performance boost by not calling realloc on every append, and never calling malloc the way you are.
You can avoid your malloc completely by simply declaring a char[] large enough to print the largest int into; this would avoid the snprintf too, and the size is fairly small.
Instead of constantly calling realloc, you should grow your buffer by some reasonable size - say 4 KB (a nice size to correspond with the page size) - and only grow it again when it comes close to being exhausted (that is, when the remaining space is smaller than the fixed char[] described above).
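A minimal sketch of both suggestions together (a small stack array for each number, a buffer grown in fixed steps, and an explicit length so nothing needs strlen/strncat); the loop count and format string are taken from the question, everything else is illustrative:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define GROW_STEP 4096
int main(void)
{
    size_t cap = GROW_STEP, used = 0;
    char *buffer = malloc(cap);
    if (buffer == NULL)
        return 1;
    for (int i = 0; i < 200000; i++) {
        char num[32];                               /* plenty for any int */
        int len = snprintf(num, sizeof num, "%d \n", i);
        if (used + (size_t)len + 1 > cap) {         /* grow only when needed */
            cap += GROW_STEP;
            char *tmp = realloc(buffer, cap);
            if (tmp == NULL) { free(buffer); return 1; }
            buffer = tmp;
        }
        memcpy(buffer + used, num, (size_t)len);    /* append without scanning the string */
        used += (size_t)len;
    }
    buffer[used] = '\0';
    printf("%.56s", buffer);                        /* same spot check as the question */
    free(buffer);
    return 0;
}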
I suggest instead of realloc on every successive string, attempt to intelligently realloc ahead of time if the length is too short. In other words, avoid realloc whenever possible.
A naive implementation in pseudocode might be something like
Initialize an int/long to "written so far"
Initialize an int/long to remember "buffer size"
Alloc memory for a string up to the "buffer size"
Read in the next chunk into a temporary buffer
Get the "chunk size" from the temporary buffer
If "written so far" + "chunk size" > "buffer size"
Reallocate the chunk to be much bigger (double "buffer size"?)
Set the new "buffer size"
Copy the data from the temporary buffer to "buffer address" + "written so far" + 1
Set "written so far" to "written so far" + "chunk size"
I just threw this together, so there may be indexing errors, but you get the idea: only allocate and copy when you have to, instead of every time through the loop.
I am trying to compress a file stream with LZO and not getting very far. Specifically, I get a segmentation fault when extracting the archive file created by my compressFileWithLzo1x function.
My main function and prototype declarations are:
#include <stdio.h>
#include <stdlib.h>
#include "lzo/include/lzo/lzo1x.h"
#define LZO_IN_CHUNK (128*1024L)
#define LZO_OUT_CHUNK (LZO_IN_CHUNK + LZO_IN_CHUNK/16 + 64 + 3)
int compressFileWithLzo1x(const char *inFn, const char *outFn);
int extractFileWithLzo1x(const char *inFn);
int main(int argc, char **argv) {
const char *inFilename = "test.txt";
const char *outFilename = "test.txt.lzo1x";
if ( compressFileWithLzo1x(inFilename, outFilename) != 0 )
exit(EXIT_FAILURE);
if ( extractFileWithLzo1x(outFilename) != 0 )
exit(EXIT_FAILURE);
return 0;
}
Here is the implementation of my compression function:
int compressFileWithLzo1x(const char *inFn, const char *outFn) {
FILE *inFnPtr = fopen(outFn, "r");
FILE *outFnPtr = fopen(outFn, "wb");
int compressionResult;
lzo_bytep in;
lzo_bytep out;
lzo_voidp wrkmem;
lzo_uint out_len;
size_t inResult;
if (lzo_init() != LZO_E_OK)
return -1;
in = (lzo_bytep)malloc(LZO_IN_CHUNK);
out = (lzo_bytep)malloc(LZO_OUT_CHUNK);
wrkmem = (lzo_voidp)malloc(LZO1X_1_MEM_COMPRESS);
do {
inResult = fread(in, sizeof(lzo_byte), LZO_IN_CHUNK, inFnPtr);
if (inResult == 0)
break;
compressionResult = lzo1x_1_compress(in, LZO_IN_CHUNK, out, &out_len, wrkmem);
if ((out_len >= LZO_IN_CHUNK) || (compressionResult != LZO_E_OK))
return -1;
if (fwrite(out, sizeof(lzo_byte), (size_t)out_len, outFnPtr) != (size_t)out_len || ferror(outFnPtr))
return -1;
fflush(outFnPtr);
} while (!feof(inFnPtr) && !ferror(inFnPtr));
free(wrkmem);
free(out);
free(in);
fclose(inFnPtr);
fclose(outFnPtr);
return 0;
}
Here is the implementation of my decompression function:
int extractFileWithLzo1x(const char *inFn) {
FILE *inFnPtr = fopen(inFn, "rb");
lzo_bytep in = (lzo_bytep)malloc(LZO_IN_CHUNK);
lzo_bytep out = (lzo_bytep)malloc(LZO_OUT_CHUNK);
int extractionResult;
size_t inResult;
lzo_uint new_length;
if (lzo_init() != LZO_E_OK)
return -1;
do {
new_length = LZO_IN_CHUNK;
inResult = fread(in, sizeof(lzo_byte), LZO_IN_CHUNK, inFnPtr);
extractionResult = lzo1x_decompress(out, LZO_OUT_CHUNK, in, &new_length, NULL);
if ((extractionResult != LZO_E_OK) || (new_length != LZO_IN_CHUNK))
return -1;
fprintf(stderr, "out: [%s]\n", (unsigned char *)out);
} while (!feof(inFnPtr) && (!ferror(inFnPtr));
free(in);
free(out);
fclose(inFnPtr);
return 0;
}
The segmentation fault occurs here:
extractionResult = lzo1x_decompress(out, LZO_OUT_CHUNK, in, &new_length, NULL);
What is wrong with this approach that is causing the segmentation fault?
I hope I haven't left any code out this time. Feel free to let me know if I need to add more information. Thanks in advance for your advice.
You're compressing independent blocks. The LZO decompressor needs the byte length of the compressed data because when it decodes EOF it checks whether it has consumed all the input bytes (and returns an error if it hasn't) so you need to store the length of each compressed chunk as well. Thus you need a more complex file format. For example:
# compressing, in python-like pseudocode
ifile = open("data", "rb")
ofile = open("data.mylzo", "wb")
input, input_len = ifile.read(65536)
while input_len > 0:
compressed, compressed_len = lzo1x(input, input_len)
compressed_len -= 1 # store len-1 of next block
if compressed_len < 65536 - 1:
ofile.write(compressed_len & 255) # be sure of endianess in file formats!
ofile.write(compressed_len >> 8)
ofile.write(compressed)
else:
ofile.write(255) # incompressible block stored it as-is (saves space & time).
ofile.write(255)
ofile.write(input)
input, input_len = ifile.read(65536)
ofile.close()
ifile.close()
# decompressing, in python-like pseudocode
ifile = open("data.mylzo", "rb")
ofile = open("data", "wb")
compressed_len_s = ifile.read(2)
while len(compressed_len_s) == 2:
compressed_len = (compressed_len_s[0] | (compressed_len_s[1] << 8)) + 1
if compressed_len == 65536:
ofile.write(ifile.read(65536)) # this can be done without copying
else:
compressed = ifile.read(compressed_len)
decompressed = lzo1x_decompress(compressed, compressed_len)
ofile.write(decompressed)
compressed_len_s = ifile.read(2)
ofile.close()
ifile.close()
If you want to be able to decompress the chunks without skipping (either for decompression in parallel or random access) you should place the lengths of compressed chunks at the beginning, before the first chunk. Precede them with the number of chunks.
The last chunk can be shorter than 64k; it may be incompressible, but we still store the compressed form even though it is longer than the uncompressed form, because only full 64k blocks are stored as-is. If the entire file is shorter than 64k, it will grow.
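For reference, here is a rough C sketch of the compression side of that scheme, using the same lzo1x_1_compress call as the question and assuming lzo_init() has already succeeded. For simplicity it always stores the compressed form and uses a 4-byte little-endian length prefix instead of the 2-byte trick above, which sidesteps the stored-as-is special case (writeChunkedLzo is a made-up name, and error-path cleanup is omitted). The reader does the mirror image: read 4 length bytes, read that many compressed bytes, then lzo1x_decompress into a 64 KiB output buffer.
#define RAW_CHUNK 65536UL
static int writeChunkedLzo(FILE *inFnPtr, FILE *outFnPtr)
{
    lzo_bytep in  = (lzo_bytep)malloc(RAW_CHUNK);
    lzo_bytep out = (lzo_bytep)malloc(RAW_CHUNK + RAW_CHUNK / 16 + 64 + 3);
    lzo_voidp wrk = (lzo_voidp)malloc(LZO1X_1_MEM_COMPRESS);
    if (in == NULL || out == NULL || wrk == NULL)
        return -1;
    size_t got;
    while ((got = fread(in, 1, RAW_CHUNK, inFnPtr)) > 0) {
        lzo_uint out_len = 0;
        if (lzo1x_1_compress(in, got, out, &out_len, wrk) != LZO_E_OK)
            return -1;
        /* 4-byte little-endian length prefix: the reader reads these four
         * bytes, then exactly out_len compressed bytes */
        unsigned char hdr[4] = {
            (unsigned char)( out_len        & 0xFF),
            (unsigned char)((out_len >> 8)  & 0xFF),
            (unsigned char)((out_len >> 16) & 0xFF),
            (unsigned char)((out_len >> 24) & 0xFF)
        };
        if (fwrite(hdr, 1, 4, outFnPtr) != 4 ||
            fwrite(out, 1, out_len, outFnPtr) != out_len)
            return -1;
    }
    free(wrk);
    free(out);
    free(in);
    return ferror(inFnPtr) ? -1 : 0;
}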
The code you've given won't compile (spurious = in the #defines; inFilePtr instead of inFnPtr in various places, etc.). But:
When compressing, you are not taking account of the actual amount of data returned by the fread(), which might well be less than LZO_IN_CHUNK.
compressionResult = lzo1x_1_compress(in, LZO_IN_CHUNK, out, &out_len, wrkmem);
should probably be
compressionResult = lzo1x_1_compress(in, inResult, out, &out_len, wrkmem);
(This is unlikely to be the problem, but will add bogus junk at the end of the file.)
When decompressing, you have a similar problem, and the in / out arguments are the wrong way round, which is likely to be the cause of your segfault.
extractionResult = lzo1x_decompress(out, LZO_OUT_CHUNK, in, &new_length, NULL);
should probably be
extractionResult = lzo1x_decompress(in, inResult, out, &new_length, NULL);
I think you are opening the wrong file in compressFileWithLzo1x:
FILE *inFnPtr = fopen(outFn, "r");
it should be
FILE *inFnPtr = fopen(inFn, "r");
On occasion, the following code works, which probably means good concept, but poor execution. Since this crashes depending on where the bits fell, this means I am butchering a step along the way. I am interested in finding an elegant way to fill bufferdata with <=4096 bytes from buffer, but admittedly, this is not it.
EDIT: the error I receive is illegal access on bufferdata
unsigned char buffer[4096] = {0};
char *bufferdata;
bufferdata = (char*)malloc(4096 * sizeof(*bufferdata));
if (! bufferdata)
return false;
while( ... )
{
// int nextBlock( voidp _buffer, unsigned _length );
read=nextBlock( buffer, 4096);
if( read > 0 )
{
memcpy(bufferdata+bufferdatawrite,buffer,read);
if(read == 4096) {
// let's go for another chunk
bufferdata = (char*)realloc(bufferdata, ( bufferdatawrite + ( 4096 * sizeof(*bufferdata)) ) );
if (! bufferdata) {
printf("failed to realloc\n");
return false;
}
}
}
else if( read<0 )
{
printf("error.\n");
break;
}
else {
printf("done.\n");
break;
}
}
free(bufferdata);
It's hard to tell where the error is; there's some code missing here and there.
if(read == 4096) { looks like a culprit: what if nextBlock returned 4000 on one iteration and 97 on the next? Now you need to store 4097 bytes, but you don't reallocate the buffer to accommodate them.
You need to accumulate the bytes, and realloc whenever you pass a 4096 boundary.
something like:
#define CHUNK_SIZE 4096
int total_read = 0;
int buffer_size = CHUNK_SIZE ;
char *bufferdata = malloc(CHUNK_SIZE );
char buffer[CHUNK_SIZE];
while( ... )
{
// int nextBlock( voidp _buffer, unsigned _length );
read=nextBlock( buffer, CHUNK_SIZE );
if( read > 0 )
{
total_read += read;
if(buffer_size < total_read) {
// let's go for another chunk
char *tmp_buf;
tmp_buf= (char*)realloc(bufferdata, buffer_size + CHUNK_SIZE );
if (! tmp_buf) {
free(bufferdata);
printf("failed to realloc\n");
return false;
}
bufferdata = tmp_buf;
buffer_size += CHUNK_SIZE ;
}
memcpy(bufferdata+total_read-read,buffer,read);
}
...
}
A few comments:
Please #define or const that 4096 - you will get burned if you ever need to change it. realloc chaining is an extremely inefficient way to get a buffer; is there any way you could prefetch the size and grab it all at once? Perhaps not, but I always cringe when I see realloc(). I'd also like to know what kZipBufferSize is and whether it's in bytes like the rest of your counts. Also, what exactly is bufferdatawrite? I'm assuming it's source data, but I'd like to see its declaration to make sure this isn't a memory alignment issue - which is kind of what this feels like - or a buffer overrun due to bad sizing.
Finally, are you sure that nextBlock isn't overrunning memory somehow? This is another potential point of weakness in your code.