Should zlib Type 0 header data be included in the Adler checksum calculation? - zlib

When calculating the Adler-32 checksum of the uncompressed data in the zlib format, should it include Type 0 (uncompressed) data as well?

The zlib format does not support "type 0". The only type supported by the zlib format is type 8, deflate. Since purely stored data does not have a means to detect when it ends, it cannot be used as a zlib data type. The type used must be self-terminating.
The deflate format internally supports a stored mode, which precedes chunks of uncompressed data with counts.
If the zlib format ever supports compression types other than 8, then yes, the Adler-32 would be computed over the uncompressed result of those methods of compression.
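As a quick check of which bytes the checksum covers, here is a minimal sketch using Python's standard zlib module (Python is an assumption here; the question names no language). It compresses the same input once at level 6 and once at level 0, which makes deflate emit only stored blocks, and compares the four-byte trailer with the Adler-32 of the original input. The helper name trailer_adler is made up for illustration.

```python
import zlib

data = b"hello, hello, hello, zlib" * 40

def trailer_adler(stream: bytes) -> int:
    # A zlib stream ends with the Adler-32 stored as 4 big-endian bytes.
    return int.from_bytes(stream[-4:], "big")

compressed = zlib.compress(data, 6)   # normal deflate blocks
stored = zlib.compress(data, 0)       # level 0: stored blocks only
expected = zlib.adler32(data)         # checksum of the uncompressed input

# The low 4 bits of the first header byte are the compression method.
method_compressed = compressed[0] & 0x0F
method_stored = stored[0] & 0x0F
```

In both cases the method field is 8 (deflate) and the trailer matches the Adler-32 of the input, whether or not the deflate data inside happens to use stored blocks.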

Related

Deflate Format: differences between type blocks

I am currently trying to write a compressor and decompressor with the same purpose as the RFC Deflate specification.
I'm not able to understand the difference between how blocks are composed in the compression with fixed tables and dynamic tables. The file is processed by LZ77 generating (distance, length) + literal.
How do I know the type of block?
Do I have to compress this data?
Given that I use a fixed compression and don't have to send the tables, how would the encoder know how to encode data?
Moreover, do I have to send data before the actual compression executes?
I am confused on the difference between fixed tables and the table we send in the dynamic mode, and how the two blocks use them to encode data.
I'm currently reading Data Compression: The Complete Reference. Any advice will be helpful.
Since you are trying to compress, you would pick the smaller of the two. zlib's deflate computes what the size of a fixed block, a dynamic block, and a stored block would be, and emits the smallest of the three.
If you are encoding a fixed block, you encode using the fixed code for literal/lengths and distances. This code is provided in the RFC.
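The fixed code never has to be sent because both sides can rebuild it from the code lengths listed in RFC 1951 (section 3.2.6) with the canonical construction from section 3.2.2. A dynamic block uses the same construction, but on code lengths that are transmitted in the block header. A rough Python sketch follows; the function and variable names are illustrative and not taken from any library.

```python
def canonical_codes(lengths):
    """Assign canonical Huffman codes from a list of code lengths (RFC 1951, 3.2.2)."""
    max_len = max(lengths)
    bl_count = [0] * (max_len + 1)
    for n in lengths:
        if n:
            bl_count[n] += 1
    # Smallest code value for each code length.
    next_code = [0] * (max_len + 1)
    code = 0
    for bits in range(1, max_len + 1):
        code = (code + bl_count[bits - 1]) << 1
        next_code[bits] = code
    codes = []
    for n in lengths:
        if n:
            codes.append((next_code[n], n))
            next_code[n] += 1
        else:
            codes.append((0, 0))  # symbol not used
    return codes

# Fixed literal/length code lengths from RFC 1951, section 3.2.6.
FIXED_LITLEN_LENGTHS = [8] * 144 + [9] * 112 + [7] * 24 + [8] * 8
FIXED_LITLEN = canonical_codes(FIXED_LITLEN_LENGTHS)
# Fixed distance codes are simply the 5-bit values 0..31.

def bits(symbol):
    code, length = FIXED_LITLEN[symbol]
    return format(code, "0{}b".format(length))
```

For example, bits(0) gives 00110000 and bits(256), the end-of-block symbol, gives 0000000, which agrees with the table in the RFC.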

Can I change zlib compression level in the same file

Will I corrupt the output if I:
1. Write data to a file with compression level (say) 6.
2. Close that zstream and open a new zstream, calling deflateInit with a different compression level (say, 1), and append that data to the same file?
Yes, you will corrupt the output, in the sense that zlib decoders are not expecting concatenated zlib streams.
It doesn't matter though, since you don't need to end the zlib stream to change the compression level. The deflateParams() function allows you to change the compression level and compression strategy mid-stream. Please read the documentation in zlib.h.
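To illustrate the decoder side of this, here is a small sketch with Python's standard zlib module (an assumption; the question is about the C API). It concatenates two complete zlib streams, as the appended file would, and shows that a single decompressor stops at the end of the first one. It does not demonstrate deflateParams() itself.

```python
import zlib

first = zlib.compress(b"written at level 6 " * 20, 6)
second = zlib.compress(b"written at level 1 " * 20, 1)
combined = first + second  # what the appended file would contain

d = zlib.decompressobj()
out = d.decompress(combined)
# The decoder stops at the end of the first stream; the rest is left over.
leftover = d.unused_data

# A reader that knows about the concatenation has to restart explicitly.
out2 = zlib.decompressobj().decompress(leftover)
```

So the second stream is only recoverable by a reader written to expect it, which is why staying in one stream is the better route.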

How to insert type 0 block along with data in zlib

I have to insert uncompressed data in between the compressed data bytes. The Type 0 header in zlib allows me to do that, but how can I do it? Any clues?
There is no type 0 allowed in a zlib header. There is a stored block type in the deflate format used within a zlib stream. zlib will automatically use stored blocks if they are smaller than the same data compressed.
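One way to see the automatic choice of stored blocks is to feed deflate data it cannot shrink and look at the block header. The sketch below uses Python's standard zlib module (an assumption, the question does not name a language) with seeded pseudo-random bytes as the incompressible input, and raw deflate output so that the first byte is the block header.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable
noise = bytes(rng.getrandbits(8) for _ in range(4096))

# Negative wbits asks for a raw deflate stream without the zlib header.
c = zlib.compressobj(6, zlib.DEFLATED, -15)
raw = c.compress(noise) + c.flush()

bfinal = raw[0] & 1          # 1 if this is the last block
btype = (raw[0] >> 1) & 3    # 0 = stored, 1 = fixed, 2 = dynamic
```

With this input, btype should come out as 0 (stored) and the original bytes should appear verbatim inside raw, even though the compression level requested was 6.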

CRC-32 field in zip

I am designing a zip-unzip utility using C. There is a crc-32 code field. Is it of compressed data or uncompressed data?
It is the CRC-32 of the uncompressed data. In other words, it would be the CRC-32 of the file's original contents before being compressed. Zlib has a minizip contribution which is a small zip/unzip implementation written in C. In zip.c you can see in the function zipWriteInFileInZip that it is generating the crc of the buffer passed in that should contain the file's original contents.
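If you want to confirm this against an existing implementation before writing your own in C, a short check with Python's standard zipfile and zlib modules (used here only as a reference, not as part of minizip) could look like this. The file name example.txt is arbitrary.

```python
import io
import zipfile
import zlib

original = b"contents of the file before compression\n" * 50

# Write a one-entry zip archive into memory using deflate.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("example.txt", original)

# Read the entry's metadata back from the archive.
with zipfile.ZipFile(buf) as zf:
    info = zf.getinfo("example.txt")
    stored_crc = info.CRC
    compressed_size = info.compress_size
```

Here stored_crc equals zlib.crc32(original), the CRC-32 of the uncompressed contents, even though the entry itself is stored compressed.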

Array Compression Algorithm

I have an array of 10345 bytes. I want to compress the array and then decompress it. Please suggest a compression algorithm that can reduce the size of the array. I am using C, and the array is of type unsigned char.
Rephrased: Can someone suggest a general-purpose compression algorithm (or library) for C/C++?
zlib
Lossless Compression Algorithms
This post's a community wiki. I don't want any points for this -- I've already voted to close the question.
The number of bytes to compress has very little to do with the choice of compression algorithm, although it does affect the implementation. For example, when you have fewer than 2^15 bytes to compress with zlib, you can specify a windowBits value of less than 15. windowBits (a parameter of deflateInit2(), separate from the compression level) sets the size of the "look-back" window as a power of two. If your input is shorter than 16K bytes, then a 32K window will never half-fill; in that case a windowBits of 14 finds the same matches while using less memory. Do not expect the output to shrink noticeably, though: deflate's distance codes do not depend on the window size.
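The look-back window is chosen through the windowBits argument of deflateInit2() in C; Python's standard zlib module exposes the same setting as wbits, which makes for a quick experiment. The sketch below is only an illustration: the 10345-byte test input is synthetic and the helper name deflate is made up.

```python
import zlib

# Synthetic, repetitive stand-in for the 10345-byte array in the question.
data = bytes(i % 251 for i in range(10345))

def deflate(payload: bytes, wbits: int) -> bytes:
    c = zlib.compressobj(9, zlib.DEFLATED, wbits)
    return c.compress(payload) + c.flush()

small_window = deflate(data, 14)  # 16K window, enough for this input
full_window = deflate(data, 15)   # 32K window, the zlib default

# The high 4 bits of the first header byte hold log2(window size) - 8.
small_bits = (small_window[0] >> 4) + 8
full_bits = (full_window[0] >> 4) + 8
```

Both streams decompress with a plain zlib.decompress() call, and comparing len(small_window) with len(full_window) shows how much, if anything, the smaller window changes the output for your own data.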
The content of the data is what matters. If you are sending images with mostly background, then you might want Run Length Encoding (used by Windows .BMP, for example).
If you are sending mostly English text, then you may want something like zlib, which combines Huffman coding with LZ77-style look-back dictionary compression.
If your data has been encrypted, then attempting to compress it will not reduce its size, because well-encrypted data looks random.
If your data is a particular type of signal, and you can tolerate some loss of detail, then you may want to transform it into frequency space and send only the principal components. (e.g., JPEG, MP3)
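The effect of content on the result is easy to measure. As a rough sketch with Python's standard zlib module, the snippet below compresses repetitive English text and the same number of seeded pseudo-random bytes, the latter standing in for encrypted data (an assumption made for the demonstration; no real cipher is involved).

```python
import random
import zlib

text = ("the quick brown fox jumps over the lazy dog " * 100).encode("ascii")

rng = random.Random(1)  # fixed seed so the run is repeatable
noise = bytes(rng.getrandbits(8) for _ in range(len(text)))

# Compressed size divided by original size; below 1.0 means it shrank.
text_ratio = len(zlib.compress(text, 9)) / len(text)
noise_ratio = len(zlib.compress(noise, 9)) / len(noise)
```

The text shrinks to a small fraction of its size, while the random bytes come out slightly larger than they went in because of the stream overhead.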

Resources