Please mind this code:
#define CHUNK 0x4000
z_stream strm;
unsigned char out[CHUNK];
int ret;
strm.zalloc = Z_NULL;
strm.zfree = Z_NULL;
strm.opaque = Z_NULL;
int windowsBits = 15;
int GZIP_ENCODING = 16;
ret = deflateInit2(&strm, Z_BEST_SPEED, Z_DEFLATED, windowsBits | GZIP_ENCODING, 1,
Z_DEFAULT_STRATEGY);
if(ret == Z_OK) {
strm.next_in = (z_const unsigned char *)answer;
strm.avail_in = strlen(answer);
do {
strm.avail_out = CHUNK;
strm.next_out = out;
ret = deflate(&strm, Z_FINISH);
} while (strm.avail_out == 0);
}
/* clean up and return */
(void)deflateEnd(&strm);
Here answer (an unsigned char array of 200 elements, the last one being \0) is filled in between the four declarations and the rest of the code.
It fails in deflateInit2 with Z_MEM_ERROR.
I'm working on an STM32F4 (microcontroller). My RAM was almost full (~87%) before I tried to implement the compression.
I got this part working once with different parameters, but then I hit an error later in the program, when sending the gzip'ed string as HTTP output. The error was:
unrecognized encoding.
I have about 30 KB of free RAM.
zlib's deflate normally needs about 256K of RAM. See zlib technical details. 30K is a bit restrictive, but you can still get deflate to work using the memLevel and windowBits parameters to reduce the memory footprint. From that page:
deflate memory usage (bytes) = (1 << (windowBits+2)) + (1 << (memLevel+9))
So you can get there with a memLevel of 5, and a windowBits of 11, taking about 24K (plus some other structures). This will reduce the compression effectiveness somewhat, but at least it will work. (You can still add 16 to windowBits for gzip encoding.)
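The arithmetic is easy to check in code. The following is only a sketch of the formula quoted above; the helper names deflate_mem_bytes and print_deflate_mem are made up for illustration, and the result does not include the "other structures" mentioned.

```c
#include <stdio.h>

/* Hypothetical helper: evaluates the formula quoted from the zlib
 * technical details page. It does not include the fixed-size state
 * structures that deflate allocates in addition. */
static unsigned long deflate_mem_bytes(int windowBits, int memLevel)
{
    return (1UL << (windowBits + 2)) + (1UL << (memLevel + 9));
}

/* Print the estimate for one parameter pair. */
static void print_deflate_mem(int windowBits, int memLevel)
{
    printf("windowBits=%d memLevel=%d -> %lu bytes\n",
           windowBits, memLevel, deflate_mem_bytes(windowBits, memLevel));
}
```

Calling print_deflate_mem(15, 8) reports 262144 bytes (the 256K default), and print_deflate_mem(11, 5) reports 24576 bytes (the 24K configuration suggested here). Pass the plain windowBits value, without the 16 added for gzip.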
I am trying to decode the header bits based on the output byte of deflate compression output.
char a[50] = "Hello";
z_stream defstream;
defstream.zalloc = Z_NULL;
defstream.zfree = Z_NULL;
defstream.opaque = Z_NULL;
defstream.avail_in = (uInt)strlen(a)+1;
defstream.next_in = (Bytef *)a;
defstream.avail_out = (uInt)sizeof(b);
defstream.next_out = (Bytef *)b;
deflateInit(&defstream, Z_BEST_COMPRESSION);
deflate(&defstream, Z_FINISH);
deflateEnd(&defstream);
for (int i=0; i<strlen(b); i++) {
printf("--- byte[%d]=%hhx\n", i, b[i]);
}
The result:
--- byte[0]=78
--- byte[1]=da
--- byte[2]=f3
and so on.
I just want to understand which bits are the 3-bit block header as described in deflate specification. First bit specifies the block final/BFINAL. Next two bits specify the BTYPE.
Based on this result, 0x78 - the first 3 bits are 000, which means BFINAL=0 and BTYPE=00 (no compression). But this does not seem right to me. The BTYPE should be either 01 or 10.
Am I missing something here? Can someone please help?
Reference:
deflate specification
You are making a zlib stream, not a raw deflate stream. So the 78 da is the zlib header, not deflate compressed data. The deflate data starts with f3. The low three bits of that are 011. The low 1 is BFINAL (this is the last block), and the 01 is BTYPE (fixed Huffman codes).
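To make the bit layout concrete, here is a small sketch that picks apart the first three bytes of a zlib stream. The struct and function names are invented for this example, and it assumes the stream has no preset dictionary, so the deflate data starts at the third byte.

```c
#include <stdio.h>

/* Fields of the 2-byte zlib header (RFC 1950) and of the 3-bit
 * deflate block header (RFC 1951) that follows it. */
struct zlib_prefix {
    int cm;        /* compression method, 8 = deflate */
    int cinfo;     /* log2(window size) - 8 */
    int check_ok;  /* 1 if (CMF*256 + FLG) is a multiple of 31 */
    int bfinal;    /* 1 if the first block is the last block */
    int btype;     /* 0 = stored, 1 = fixed Huffman, 2 = dynamic Huffman */
};

/* Hypothetical helper: decodes the first three bytes of a zlib stream
 * that has no preset dictionary (FDICT clear). */
static struct zlib_prefix parse_zlib_prefix(const unsigned char *p)
{
    struct zlib_prefix r;
    r.cm = p[0] & 0x0f;
    r.cinfo = p[0] >> 4;
    r.check_ok = ((p[0] * 256 + p[1]) % 31) == 0;
    r.bfinal = p[2] & 1;          /* bit 0 of the first deflate byte */
    r.btype = (p[2] >> 1) & 3;    /* bits 1-2 of the first deflate byte */
    return r;
}
```

For the bytes 78 da f3 from the question this gives cm=8, cinfo=7, a valid header check, bfinal=1 and btype=1. Deflate packs bits starting from the least significant bit of each byte, which is why the block header comes from the low bits of f3.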
I am trying to use zlib to deflate (compress?) data from a textfile.
It seems to work when I compress a file, but I am trying to prepend
the zlib compressed file with custom header. Both the file and header
should be compressed. However, when I add the header, the length of
the compressed (deflated) file is much shorter than expected and comes
out as an invalid zlib compressed object.
The code works great, until I add the header block of code between the
XXX comments below.
The "FILE *source" variable is a sample file, I typically use
/etc/passwd and the "char *header" is "blob 2172\0".
Without the header block, the output is 904 bytes and deflatable
(decompressable), but with the header it comes out to only 30 bytes.
It also comes out as an invalid zlib object with the header block of
code.
Any ideas where I am making a mistake, specifically why the output is
invalid and shorter with the header?
If it's relevant, I am writing this on FreeBSD.
#define Z_CHUNK 16384
#define HEX_DIGEST_LENGTH 257
int
zcompress_and_header(FILE *source, char *header)
{
int ret, flush;
z_stream strm;
unsigned int have;
unsigned char in[Z_CHUNK];
unsigned char out[Z_CHUNK];
FILE *dest = stdout; // This is a temporary test
strm.zalloc = Z_NULL;
strm.zfree = Z_NULL;
strm.opaque = Z_NULL;
ret = deflateInit(&strm, Z_BEST_SPEED);
//ret = deflateInit2(&strm, Z_BEST_SPEED, Z_DEFLATED, 15 | 16, 8, Z_DEFAULT_STRATEGY);
if (ret != Z_OK)
return ret;
/* XXX Beginning of writing the header */
strm.next_in = (unsigned char *) header;
strm.avail_in = strlen(header) + 1;
do {
strm.avail_out = Z_CHUNK;
strm.next_out = out;
if (deflate (& strm, Z_FINISH) < 0) {
fprintf(stderr, "returned a bad status of.\n");
exit(0);
}
have = Z_CHUNK - strm.avail_out;
fwrite(out, 1, have, stdout);
} while(strm.avail_out == 0);
/* XXX End of writing the header */
do {
strm.avail_in = fread(in, 1, Z_CHUNK, source);
if (ferror(source)) {
(void)deflateEnd(&strm);
return Z_ERRNO;
}
flush = feof(source) ? Z_FINISH : Z_NO_FLUSH;
strm.next_in = in;
do {
strm.avail_out = Z_CHUNK;
strm.next_out = out;
ret = deflate(&strm, flush);
have = Z_CHUNK - strm.avail_out;
if (fwrite(out, 1, have, dest) != have || ferror(dest)) {
(void)deflateEnd(&strm);
return Z_ERRNO;
}
} while(strm.avail_out == 0);
} while (flush != Z_FINISH);
} // End of function
deflate is not an archiver. It only compresses a stream. Once the stream is exhausted, your options are very limited. The manual clearly says that
If the parameter flush is set to Z_FINISH, pending input is processed, pending output is flushed and deflate returns with Z_STREAM_END if there was enough output space. If deflate returns with Z_OK or Z_BUF_ERROR, this function must be called again with Z_FINISH and more output space (updated avail_out) but no more input data, until it returns with Z_STREAM_END or an error. After deflate has returned Z_STREAM_END, the only possible operations on the stream are deflateReset or deflateEnd.
However, you are calling deflate for the file after you Z_FINISH the header, and zlib behaves unpredictably. The likely fix is to not use Z_FINISH for the header at all, and let the other side understand that the first line in the decompressed string is a header (or impose some archiving protocol understood by both sides).
Your first calls of deflate() should use Z_NO_FLUSH, not Z_FINISH. Z_FINISH should only be used when the last of the data to be compressed is provided with the deflate() call.
I'm currently building an HTTP server in C.
Please mind this piece of code :
#define CHUNK 0x4000
z_stream strm;
unsigned char out[CHUNK];
int ret;
char buff[200];
strm.zalloc = Z_NULL;
strm.zfree = Z_NULL;
strm.opaque = Z_NULL;
int windowsBits = 15;
int GZIP_ENCODING = 16;
ret = deflateInit2(&strm, Z_BEST_SPEED, Z_DEFLATED, windowsBits | GZIP_ENCODING, 1,
Z_DEFAULT_STRATEGY);
fill(buff); //fill buff with infos
do {
strm.next_in = (z_const unsigned char *)buff;
strm.avail_in = strlen(buff);
do {
strm.avail_out = CHUNK;
strm.next_out = out;
ret = deflate(&strm, Z_FINISH);
} while (strm.avail_out == 0);
send_to_client(out); //sending a part of the gzip encoded string
fill(buff);
}while(strlen(buff)!=0);
The idea is: I'm trying to send gzip'ed buffers, one by one, that together (once concatenated) form a whole response body.
BUT: for now, my client (a browser) only gets the contents of the first buffer. No errors at all though.
How do I achieve this? How can I gzip buffers inside a loop so that I can send each one as it is produced?
First off, you need to do something with the generated deflate data after each deflate() call. Your code discards the compressed data generated in the inner loop. From this example, after the deflate() you would need something like:
have = CHUNK - strm.avail_out;
if (fwrite(out, 1, have, dest) != have || ferror(dest)) {
(void)deflateEnd(&strm);
return Z_ERRNO;
}
That's where your send_to_client needs to be, sending have bytes.
In your case, your CHUNK is so much larger than your buff, that loop is always executing only once, so you are not discarding any data. However that is only happening because of the Z_FINISH, so when you make the next fix, you will be discarding data with your current code.
Second, you are finishing the deflate() stream each time after no more than 199 bytes of input. This will greatly limit how much compression you can get. Furthermore, you are sending individual gzip streams, for which most browsers will only interpret the first one. (This is actually a bug in those browsers, but I don't imagine they will be fixed.)
You need to give the compressor at least tens to hundreds of kilobytes to work with in order to get decent compression. You need to use Z_NO_FLUSH instead of Z_FINISH until you get to the last buff you want to send. Then you use Z_FINISH. Again, take a look at the example and read the comments.
I want to decompress data which is (or is supposed to be, according to the specification I'm referring to) in the DEFLATE compression format specified in RFC 1951. I'm using the zlib library in C.
I referred to this example in github :
https://gist.github.com/gaurav1981/9f8d9bb7542b22f575df
And modified it just to decompress my data:
char dData[MAX_LENGTH];
char cData[MAX_LENGTH];
for(i=0; i < (size-4); i++)
{
cData[i] = *(data + i);
}
//cData[i] = '\0';
printf("Compressed size is: %lu\n", strlen(cData));
z_stream infstream;
infstream.zalloc = Z_NULL;
infstream.zfree = Z_NULL;
infstream.opaque = Z_NULL;
// setup "b" as the input and "c" as the compressed output
//infstream.avail_in = (uInt)((char*)defstream.next_out - b); // size of input
//infstream.avail_in = (uInt)((char*)defstream.next_out - cData);
infstream.avail_in = (uInt)(size - 4);
infstream.next_in = (Bytef *)cData; // input char array
infstream.avail_out = (uInt)sizeof(dData); // size of output
infstream.next_out = (Bytef *)dData; // output char array
// the actual DE-compression work.
inflateInit(&infstream);
inflate(&infstream, Z_NO_FLUSH);
inflateEnd(&infstream);
printf("Uncompressed size is: %lu\n", strlen(dData));
size = strlen(dData);
My uncompressed size is 0. Can someone tell me what's wrong with my code?
I even wrote the data into a file and saved it as .gz and .zip, but I got an error when I tried to extract it (I'm running Ubuntu 14.04).
Could someone also be kind enough to analyse my data and extract it, if that is possible? My data:
6374 492d 2d29 4ece c849 cc4b
294a 4cc9 cc57 f02e cd29 292d 6292 7780
30f2 1293 338a 3293 334a 52f3 98c4 0b9c
4a93 33b2 8b32 4b32 b399 d405 4212 d353
8b4b 320b 0a00
Instead of inflateInit(), you need to call inflateInit2() with the second argument being -15, in order to decompress raw deflate data.
Your uncompressed data starts with \0, so strlen returns 0, and that is what you print as the length of the uncompressed data.
I have a gzip file that is in memory, and I would like to uncompress it using zlib, version 1.1.3. uncompress() is returning -3, Z_DATA_ERROR, indicating the source data is corrupt.
I know that my in-memory buffer is correct - if I write the buffer out to a file, it is the same as my source gzip file.
The gzip file format indicates that there is a 10-byte header, optional headers, the data, and a footer. Is it possible to determine where the data starts, and strip that portion out? I searched on this topic, and a couple of people have suggested using inflateInit2(). However, in my version of zlib, that function is oddly commented out. Are there any other options?
I came across the same problem with a different zlib version (1.2.7).
I don't know why inflateInit2() is commented out.
Without calling inflateInit2 you can do the following:
err = inflateInit(&d_stream);
err = inflateReset2(&d_stream, 31);
inflateReset2 is also called by inflateInit. Inside inflateInit, windowBits is set to 15 (1111 binary), but you have to set it to 31 (11111 binary) to get gzip working.
The reason is this:
Inside inflateReset2, the following is done:
wrap = (windowBits >> 4) + 1;
This yields 1 if windowBits is 15 (1111 binary) and 2 if windowBits is 31 (11111 binary).
Now, when you call inflate(), the following line in the HEAD state checks state->wrap along with the gzip magic number:
if ((state->wrap & 2) && hold == 0x8b1f) { /* gzip header */
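The effect of the two quoted lines can be reproduced in isolation. This sketch only mirrors the expressions quoted from the 1.2.7 source; the function names are made up, and other zlib versions may compute wrap differently.

```c
/* Mirrors the quoted line from the zlib 1.2.7 inflate source.
 * This is an illustration, not a copy of the library's function. */
static int wrap_for_window_bits(int windowBits)
{
    return (windowBits >> 4) + 1;
}

/* Returns nonzero if the gzip header check (wrap & 2) would be enabled. */
static int gzip_check_enabled(int windowBits)
{
    return (wrap_for_window_bits(windowBits) & 2) != 0;
}
```

With windowBits 15 the wrap value is 1, so the gzip branch is skipped; with 31 it is 2, so the gzip magic number is tested.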
So with the following code I was able to do in-memory gzip decompression:
(Note: this code presumes that the complete data to be decompressed is in memory and that the buffer for decompressed data is large enough)
int err;
z_stream d_stream; // decompression stream
d_stream.zalloc = (alloc_func)0;
d_stream.zfree = (free_func)0;
d_stream.opaque = (voidpf)0;
d_stream.next_in = deflated; // where deflated is a pointer to the compressed data buffer
d_stream.avail_in = deflatedLen; // where deflatedLen is the length of the compressed data
d_stream.next_out = inflated; // where inflated is a pointer to the resulting uncompressed data buffer
d_stream.avail_out = inflatedLen; // where inflatedLen is the size of the uncompressed data buffer
err = inflateInit(&d_stream);
err = inflateReset2(&d_stream, 31);
err = inflate(&d_stream, Z_FINISH); // the actual decompression; returns Z_STREAM_END on success
err = inflateEnd(&d_stream);
The other solution is simply to uncomment inflateInit2(), which lets you set windowBits directly.
Is it possible to determine where the data starts, and strip that portion out?
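A gzip stream begins with the magic bytes 0x1f 0x8b. The next byte, 0x08, is the deflate compression method, and the 0x00 after it is a flags byte with no optional fields set, so the following pattern matches only headers without optional fields: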
static const unsigned char gzipMagicBytes[] = { 0x1f, 0x8b, 0x08, 0x00 };
You can read through a file stream and look for these bytes:
const char *fn = "foo.bar";
FILE *fp = fopen(fn, "rb");
unsigned char testMagicBuffer[sizeof(gzipMagicBytes)] = {0};
unsigned long long testMagicOffset = 0ULL;
if (fp != NULL) {
    /* read a window of bytes at each offset and compare it with the magic bytes */
    while (fseek(fp, (long)testMagicOffset, SEEK_SET) == 0 &&
           fread(testMagicBuffer, 1, sizeof(testMagicBuffer), fp) == sizeof(testMagicBuffer)) {
        if (memcmp(testMagicBuffer, gzipMagicBytes, sizeof(gzipMagicBytes)) == 0) {
            /* we found gzip magic bytes, do stuff here... */
            fprintf(stdout, "gzip stream found at byte offset: %llu\n", testMagicOffset);
            break;
        }
        testMagicOffset++; /* slide the window forward by one byte */
    }
    fclose(fp); /* only close the file if it was opened */
}
Once you have the offset, you could do copy and paste operations, or overwrite other bytes, etc.
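Since the question is about a buffer that is already in memory, the start of the compressed data can also be found by walking the header fields as laid out in RFC 1952. The following is a sketch with an invented function name, gzip_data_offset, and it handles a single gzip member only.

```c
#include <stddef.h>

/* Hypothetical helper: returns the offset of the first deflate byte in a
 * gzip member laid out as described in RFC 1952, or -1 if the buffer
 * does not hold a valid, complete gzip header. */
static long gzip_data_offset(const unsigned char *p, size_t len)
{
    size_t pos = 10;                       /* fixed part of the header */
    unsigned flags;

    if (len < 10 || p[0] != 0x1f || p[1] != 0x8b || p[2] != 0x08)
        return -1;
    flags = p[3];

    if (flags & 0x04) {                    /* FEXTRA: 2-byte little-endian length + data */
        if (pos + 2 > len) return -1;
        pos += 2 + (p[pos] | ((size_t)p[pos + 1] << 8));
    }
    if (flags & 0x08) {                    /* FNAME: zero-terminated string */
        while (pos < len && p[pos] != 0) pos++;
        pos++;
    }
    if (flags & 0x10) {                    /* FCOMMENT: zero-terminated string */
        while (pos < len && p[pos] != 0) pos++;
        pos++;
    }
    if (flags & 0x02)                      /* FHCRC: 2-byte header CRC */
        pos += 2;

    return pos <= len ? (long)pos : -1;
}
```

The bytes between that offset and the 8-byte trailer (CRC-32 and uncompressed size) are raw deflate data, which is the format that inflateInit2() with a windowBits of -15 expects.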