Can a stream contain some fixed Huffman compressed blocks and some dynamic Huffman compressed blocks? - zlib

Is it possible to compress a stream with some blocks compressed with static Huffman encoding and some blocks compressed with dynamic Huffman encoding? If so, is it decompressible?

Yes, and yes. A stream can contain a mix of all three block types, with the third type being stored.
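For example, zlib can be made to do exactly this: the Z_FIXED strategy forces fixed Huffman codes, and deflateParams() can switch strategies mid-stream. A minimal sketch (error handling omitted, output buffer assumed large enough):

    #include <string.h>
    #include <zlib.h>

    /* Sketch: one deflate stream whose first part is compressed with
       fixed Huffman codes (Z_FIXED) and whose second part may use
       dynamic codes (Z_DEFAULT_STRATEGY). */
    int mixed_deflate(const unsigned char *a, unsigned alen,
                      const unsigned char *b, unsigned blen,
                      unsigned char *out, unsigned *outlen)
    {
        z_stream s;
        memset(&s, 0, sizeof(s));
        if (deflateInit2(&s, 6, Z_DEFLATED, 15, 8, Z_FIXED) != Z_OK)
            return -1;

        s.next_out = out;
        s.avail_out = *outlen;

        s.next_in = (unsigned char *)a;   /* first part: fixed Huffman */
        s.avail_in = alen;
        deflate(&s, Z_BLOCK);             /* emit and close the current block */

        /* Switch strategy mid-stream; the stream stays valid. */
        deflateParams(&s, 6, Z_DEFAULT_STRATEGY);

        s.next_in = (unsigned char *)b;   /* second part: dynamic allowed */
        s.avail_in = blen;
        deflate(&s, Z_FINISH);

        *outlen = (unsigned)s.total_out;
        return deflateEnd(&s);
    }

The decompressor needs no special handling: every deflate block starts with a 3-bit header (BFINAL plus a 2-bit BTYPE), and inflate() decodes each block according to its own type.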

Related

zlib get gzip-file-size before compressing

I have the following problem:
I'm using zlib in C to gzip files. The compression itself (using z_stream, deflateInit2, ...) is no problem, but I need to know the size of the gzipped file before I compress it. Is this possible, or is the only option I have to count the bytes while compressing?
Thanks in advance!
but I need to know the size of the gzipped file before I compress it
If you mean that you need it in order to compress (perhaps to allocate a buffer to hold the data), then you are mistaken: the whole point of z_stream is to let you consume input in chunks and produce output in chunks.
is the only option I have to count the bytes while compressing
Yes, you need to apply the compression algorithm to know the resulting size.
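Counting is easy, though: the z_stream counters track it for you, so you can just run the deflate loop and read total_out. A minimal sketch (gzip wrapper requested via windowBits = 15 + 16; the compressed bytes are thrown away because only the size is wanted):

    #include <string.h>
    #include <zlib.h>

    /* Sketch: gzip-compress a buffer and report the compressed size.
       The size is only known once the data has actually been compressed. */
    unsigned long gzip_size(const unsigned char *in, unsigned long inlen)
    {
        z_stream s;
        unsigned char buf[16384];
        memset(&s, 0, sizeof(s));
        /* windowBits = 15 + 16 asks zlib to emit a gzip wrapper */
        if (deflateInit2(&s, Z_DEFAULT_COMPRESSION, Z_DEFLATED,
                         15 + 16, 8, Z_DEFAULT_STRATEGY) != Z_OK)
            return 0;

        s.next_in = (unsigned char *)in;
        s.avail_in = (uInt)inlen;
        do {
            s.next_out = buf;              /* output is discarded */
            s.avail_out = sizeof(buf);
            deflate(&s, Z_FINISH);
        } while (s.avail_out == 0);

        unsigned long total = s.total_out;
        deflateEnd(&s);
        return total;
    }

If all you actually need is a safe upper bound for allocating an output buffer, deflateBound() provides one without compressing anything.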

Deflate Format: differences between block types

I am currently trying to write a compressor and decompressor with the same purpose as the RFC Deflate specification.
I'm not able to understand the difference in how blocks are composed when compressing with fixed tables versus dynamic tables. The file is processed by LZ77, generating (distance, length) pairs and literals.
How do I know the type of block?
Do I have to compress this data?
Given that fixed compression doesn't require sending the tables, how does the encoder know how to encode the data?
Moreover, do I have to send data before the actual compression executes?
I am confused about the difference between the fixed tables and the table we send in dynamic mode, and about how the two block types use them to encode data.
I'm currently reading Data Compression: The Complete Reference. Any advice will be helpful.
Since you are trying to compress, you would pick the smaller of the two. zlib's deflate computes what the size of a fixed block, a dynamic block, and a stored block would be, and emits the smallest of the three.
If you are encoding a fixed block, you encode using the fixed code for literal/lengths and distances. This code is provided in the RFC.
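That fixed code is fully specified in section 3.2.6 of RFC 1951, so encoder and decoder can both construct it with no table in the stream. A sketch of the literal/length code lengths (the actual codes follow from these lengths by the canonical construction in section 3.2.2):

    /* Code lengths of the fixed literal/length code, RFC 1951 sec. 3.2.6. */
    void fixed_litlen_lengths(unsigned char len[288])
    {
        int sym;
        for (sym = 0;   sym <= 143; sym++) len[sym] = 8; /* 00110000..10111111   */
        for (sym = 144; sym <= 255; sym++) len[sym] = 9; /* 110010000..111111111 */
        for (sym = 256; sym <= 279; sym++) len[sym] = 7; /* 0000000..0010111     */
        for (sym = 280; sym <= 287; sym++) len[sym] = 8; /* 11000000..11000111   */
    }
    /* Distances use a flat 5-bit code for symbols 0..29. */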

Is there any maximum limit in file size for Huffman compression?

As far as Huffman compression goes, it can compress any file containing ASCII symbols, etc. However, is there any maximum limit on the file size?
No, there is no limit. You can string together Huffman codes for as long as you like.

Array Compression Algorithm

I have an array of 10345 bytes that I want to compress and then decompress; kindly suggest a compression algorithm that can reduce the size of the array. I am using the C language, and the array is of unsigned char type.
Rephrased: Can someone suggest a general-purpose compression algorithm (or library) for C/C++?
zlib
Lossless Compression Algorithms
This post's a community wiki. I don't want any points for this -- I've already voted to close the question.
The number of bytes to compress has very little to do with the choice of compression algorithm, although it does affect the implementation. For example, when you have fewer than 2^15 bytes to compress and you are using zlib, you will want to pass a windowBits value of less than 15 to deflateInit2(). It is windowBits, not the compression level, that controls the size of the "look-back" window (the LZ77 dictionary), which is 2^windowBits bytes. If your input is shorter than 16 KB, a 32 KB look-back window will never even half-fill, so a smaller window buys you a real memory saving at no cost in compression compared to running zlib at its maximum.
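A sketch of how that knob is set, assuming zlib's deflateInit2():

    #include <string.h>
    #include <zlib.h>

    /* Sketch: initialize deflate with a 16 KB window (windowBits = 14)
       instead of the default 32 KB (windowBits = 15); plenty when no
       input exceeds 16 KB. */
    int init_small_window(z_stream *s)
    {
        memset(s, 0, sizeof(*s));
        return deflateInit2(s, Z_DEFAULT_COMPRESSION, Z_DEFLATED,
                            14,   /* window = 2^14 = 16 KB */
                            8,    /* memLevel: the default */
                            Z_DEFAULT_STRATEGY);
    }

The zlib header records the window size, so the decompressor's inflateInit() needs no matching change.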
The content of the data is what matters. If you are sending images with mostly background, then you might want Run-Length Encoding (used by Windows .BMP, for example); a minimal sketch follows this list of cases.
If you are sending mostly English text, then you will want something like zlib, which combines Huffman coding with LZ77-style look-back dictionary compression.
If your data has been encrypted, then attempting to compress it will almost certainly not succeed: well-encrypted data is statistically indistinguishable from random data, which is incompressible.
If your data is a particular type of signal, and you can tolerate some loss of detail, then you may want to transform it into frequency space and send only the principal components. (e.g., JPEG, MP3)
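To make the RLE case concrete, here is a minimal byte-oriented sketch (a hypothetical (count, value) format, not the actual .BMP variant):

    #include <stddef.h>

    /* Sketch: emit each run as a (count, value) pair, runs capped at 255.
       Returns the number of output bytes; worst case is 2*n when the
       input has no runs at all. */
    size_t rle_encode(const unsigned char *in, size_t n, unsigned char *out)
    {
        size_t i = 0, o = 0;
        while (i < n) {
            unsigned char v = in[i];
            size_t run = 1;
            while (i + run < n && in[i + run] == v && run < 255)
                run++;
            out[o++] = (unsigned char)run;  /* count */
            out[o++] = v;                   /* value */
            i += run;
        }
        return o;
    }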

LZO Decompression Buffer Size

I am using MiniLZO on a project for some really simple compression tasks. I am compressing with one program, and decompressing with another. I'd like to know how much space to allocate for the decompression buffer. I am fine with over-allocating space, if it can save me the trouble of having to annotate my output file with an integer declaring how much space the decompressed data should take. How would I figure out how much space it could possibly take?
After some consideration, I think this question boils down to the following: What is the maximum compression ratio of lzo1x compression?
Since you control both the compressor and the decompressor, I suggest you compress the input in fixed-size blocks. In my application I compress up to 64KB in each block, then emit the size of the compressed block and the compressed data itself, so the compressed stream actually looks like a series of compressed blocks:
length_of_block_1
block_1
length_of_block_2
block_2
...
The decompressor just reads each compressed block and decompresses it into a 64KB buffer, since I know each block was produced by compressing at most 64KB of input.
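A minimal sketch of this block-framing scheme, assuming the MiniLZO API (lzo1x_1_compress, lzo1x_decompress_safe) and 4-byte little-endian length headers:

    #include <stdio.h>
    #include "minilzo.h"

    #define BLOCK_SIZE (64 * 1024)
    /* LZO's documented worst case: output can grow to len + len/16 + 64 + 3. */
    #define MAX_COMPRESSED (BLOCK_SIZE + BLOCK_SIZE / 16 + 64 + 3)

    static unsigned char wrkmem[LZO1X_1_MEM_COMPRESS];
    static unsigned char raw[BLOCK_SIZE];
    static unsigned char packed[MAX_COMPRESSED];

    /* Writer: stdin -> [4-byte compressed length][compressed block]... */
    static int compress_stream(void)
    {
        size_t n;
        while ((n = fread(raw, 1, BLOCK_SIZE, stdin)) > 0) {
            lzo_uint plen = sizeof(packed);
            if (lzo1x_1_compress(raw, (lzo_uint)n, packed, &plen, wrkmem) != LZO_E_OK)
                return -1;
            unsigned char hdr[4] = { (unsigned char)plen,
                                     (unsigned char)(plen >> 8),
                                     (unsigned char)(plen >> 16),
                                     (unsigned char)(plen >> 24) };
            fwrite(hdr, 1, 4, stdout);
            fwrite(packed, 1, plen, stdout);
        }
        return 0;
    }

    /* Reader: each record decompresses into at most BLOCK_SIZE bytes,
       so a single 64KB buffer always suffices. */
    static int decompress_stream(void)
    {
        unsigned char hdr[4];
        while (fread(hdr, 1, 4, stdin) == 4) {
            lzo_uint plen = hdr[0] | (hdr[1] << 8)
                          | ((lzo_uint)hdr[2] << 16) | ((lzo_uint)hdr[3] << 24);
            lzo_uint rlen = BLOCK_SIZE;
            if (plen > sizeof(packed) || fread(packed, 1, plen, stdin) != plen)
                return -1;
            if (lzo1x_decompress_safe(packed, plen, raw, &rlen, NULL) != LZO_E_OK)
                return -1;
            fwrite(raw, 1, rlen, stdout);
        }
        return 0;
    }

    int main(int argc, char **argv)
    {
        (void)argv;
        if (lzo_init() != LZO_E_OK)
            return 1;
        return (argc > 1) ? decompress_stream() : compress_stream();
    }

The worst-case bound in MAX_COMPRESSED also answers the maximum-expansion question above: lzo1x can expand incompressible input by at most len/16 + 64 + 3 bytes.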
Hope that helps,
Eric Melski
The max size of the decompressed data is clearly the same as the max size of the data you compressed in the first place.
If there is an upper bound on your input size then I guess you can use it, but I have to say the usual way of doing this is to add a header to your compressed buffer which specifies the uncompressed size.
