Can I change the zlib compression level in the same file?

Will I corrupt the output if I:
1. Write data to a file with compression level (say) 6.
2. Close that zstream, open a new zstream by calling deflateInit with a different compression level (say, 1), and append that data to the same file.

Yes, you will corrupt the output, in the sense that zlib decoders are not expecting concatenated zlib streams.
It doesn't matter though, since you don't need to end the zlib stream to change the compression level. The deflateParams() function allows you to change the compression level and compression strategy mid-stream. Please read the documentation in zlib.h.
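To make that concrete, here is a minimal sketch (not from the original answer): it compresses two small illustrative buffers into a single zlib stream, starting at level 6 and switching to level 1 with deflateParams(); error handling is abbreviated.

    /* Sketch: compress two buffers into one zlib stream, switching from
     * level 6 to level 1 with deflateParams(). Buffers are illustrative
     * and error checking is minimal. */
    #include <stdio.h>
    #include <string.h>
    #include <zlib.h>

    int main(void)
    {
        unsigned char part1[] = "first chunk, compressed at level 6";
        unsigned char part2[] = "second chunk, compressed at level 1";
        unsigned char out[4096];
        z_stream strm;

        memset(&strm, 0, sizeof strm);
        if (deflateInit(&strm, 6) != Z_OK)
            return 1;

        strm.next_out  = out;
        strm.avail_out = sizeof out;

        /* Compress the first part at level 6. */
        strm.next_in  = part1;
        strm.avail_in = sizeof part1 - 1;
        deflate(&strm, Z_NO_FLUSH);

        /* Switch to level 1 mid-stream; zlib compresses any pending input
         * with the old parameters before the change takes effect. */
        deflateParams(&strm, 1, Z_DEFAULT_STRATEGY);

        /* Compress the second part at level 1 and end the stream. */
        strm.next_in  = part2;
        strm.avail_in = sizeof part2 - 1;
        deflate(&strm, Z_FINISH);

        fwrite(out, 1, sizeof out - strm.avail_out, stdout);
        deflateEnd(&strm);
        return 0;
    }

The output is one valid zlib stream, decodable by any ordinary inflate loop, even though different parts were compressed at different levels.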

Can I create a FILE instance from byte[] data without writing a file?

Can I create a FILE instance (FILE *) from byte[] data in memory, without writing a file? (C, Linux)
I need to parse 'MiniSEED' format data with the official MiniSEED library. The library supports parsing 'MiniSEED' packet data that has been written to a file, but I need to parse 'MiniSEED' data directly from a byte[] array, without creating a real file, because I receive the 'MiniSEED' data continuously over TCP in real time and the library only documents reading the data from a written file.
So I am trying to solve the problem by creating a FILE instance directly from byte[] data. I think this is the easiest solution that avoids changing the library.
You can create a FILE handle from in-memory data in Linux, because the Linux C libraries do support fmemopen() from POSIX.1-2008.
Calling fmemopen(buffer, size, "r") yields a read-only FILE handle to an in-memory object containing size bytes at buffer.
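As a rough illustration (the buffer contents here are placeholder text, not real Mini-SEED data), this is how fmemopen() turns a byte buffer into a FILE * that ordinary stdio calls can read:

    /* Sketch: wrap an in-memory buffer in a read-only FILE * with
     * fmemopen() (POSIX.1-2008). The buffer contents are illustrative. */
    #define _POSIX_C_SOURCE 200809L
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        char data[] = "pretend this is a Mini-SEED record";
        FILE *fp = fmemopen(data, strlen(data), "r");
        if (fp == NULL) {
            perror("fmemopen");
            return 1;
        }

        char buf[64];
        size_t n = fread(buf, 1, sizeof buf - 1, fp);  /* read from memory as if from a file */
        buf[n] = '\0';
        printf("read %zu bytes: %s\n", n, buf);

        fclose(fp);
        return 0;
    }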
However, I don't understand why you'd need such a thing.
The official Mini-SEED library does provide function msr_unpack() (and msr_unpack_data()) to parse Mini-SEED data records.
The functions you are probably looking at, ms_readmsr() and ms_readtraces() (or their thread-safe variants ms_readmsr_r() and ms_readtraces_r()), just read each record from the file and pass it to msr_unpack() (and, in the case of traces, to mst_addmsrtogroup() or mstl_addmsr()).
In other words, the library does support parsing in-memory data. Your assertion that it only supports parsing files is clearly incorrect.
The man pages describing the library functions do not seem to be available on the net, but if you download libmseed sources, you can read the library function man pages using man -l libmseed/doc/[function].3.
As a compromise, you might use mmap to create a direct mapping between the memory and the file. This will allow you to update the contents directly (by accessing the memory) and the library may access the same data through the file interface. Under Unix systems, depending upon the size of the data, the file may not actually need to be written to disk. It may reside in the kernel's cache structure for faster access (this happens by default: nothing extra you need to do).
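A minimal sketch of that compromise, with an illustrative path and size: the buffer is a shared mapping of a small file, so whatever is copied into the memory is also visible through the file interface.

    /* Sketch: back a file with a shared mapping so data written to memory
     * is visible through the file interface. Path and size are illustrative. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
        const size_t size = 4096;
        int fd = open("/tmp/mseed-buffer.tmp", O_RDWR | O_CREAT | O_TRUNC, 0600);
        if (fd < 0 || ftruncate(fd, (off_t)size) != 0) {
            perror("open/ftruncate");
            return 1;
        }

        unsigned char *map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (map == MAP_FAILED) {
            perror("mmap");
            return 1;
        }

        /* Fill the mapping with incoming packet bytes; the same bytes can now
         * be read back through the file, e.g. by a library that expects a
         * filename or a FILE *. */
        memcpy(map, "raw packet bytes would go here", 30);
        msync(map, size, MS_SYNC);   /* optional: flush to the file explicitly */

        munmap(map, size);
        close(fd);
        return 0;
    }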
No, there's no portable, standard way of creating a FILE * that represents an in-memory stream of bytes.
The typical solution is to instead make the read and write function(s) hookable, so that instead of hard-coding e.g. read() you make the library call an (optionally) application-supplied function.
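A minimal sketch of that hook approach; the names (parse_stream, mem_read, membuf) are made up for illustration and are not part of any real library:

    /* Sketch: the "parser" pulls bytes through a caller-supplied callback
     * instead of calling read()/fread() directly. */
    #include <stddef.h>
    #include <stdio.h>
    #include <string.h>

    typedef size_t (*read_fn)(void *dst, size_t len, void *ctx);

    /* A toy parser that pulls bytes through the callback. */
    static size_t parse_stream(read_fn rd, void *ctx)
    {
        unsigned char buf[16];
        size_t total = 0, n;
        while ((n = rd(buf, sizeof buf, ctx)) > 0)
            total += n;                /* a real parser would decode buf here */
        return total;
    }

    /* Application-supplied callback that reads from an in-memory buffer. */
    struct membuf { const unsigned char *data; size_t size, pos; };

    static size_t mem_read(void *dst, size_t len, void *ctx)
    {
        struct membuf *m = ctx;
        size_t n = m->size - m->pos;
        if (n > len) n = len;
        memcpy(dst, m->data + m->pos, n);
        m->pos += n;
        return n;
    }

    int main(void)
    {
        static const unsigned char packet[] = "bytes received over TCP";
        struct membuf m = { packet, sizeof packet - 1, 0 };
        printf("parsed %zu bytes\n", parse_stream(mem_read, &m));
        return 0;
    }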

zlib compression big buffer vs. many small?

I'm compressing a data structure that has many fields. Which is the better approach: to use gzwrite to compress and write each field to the file, or to write all of the fields to a buffer and compress that?
Separate calls of gzwrite won't make field compression separate: they'll be in a single compressed stream, as if you've written them with one call. If you wanted to gzclose and reopen in between, then there would be a difference.
(I think you know the tradeoffs for separate streams vs. single stream: with a single one, compression is better but you are unable to decompress only the fields you need. But again, there is no such tradeoff in your question: call gzwrite as it's convenient for you, the result will be the same).
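For example (the field contents and output path are illustrative), two gzwrite() calls on the same gzFile end up in one compressed stream:

    /* Sketch: several gzwrite() calls produce one compressed stream, exactly
     * as a single gzwrite() of the concatenated fields would. */
    #include <string.h>
    #include <zlib.h>

    int main(void)
    {
        const char *field1 = "field one\n";
        const char *field2 = "field two\n";

        gzFile gz = gzopen("record.gz", "wb6");   /* level 6 */
        if (gz == NULL)
            return 1;

        /* Two writes, one compressed stream. */
        gzwrite(gz, field1, (unsigned)strlen(field1));
        gzwrite(gz, field2, (unsigned)strlen(field2));

        gzclose(gz);
        return 0;
    }

Decompressing record.gz yields the two fields back to back, with no boundary between them.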

how to read pcm samples from a file using fread and fwrite?

I want to read PCM samples from a file using fread and determine the signal strength of the samples. How do I go about it?
For reading, how many bytes constitute one PCM sample? Can I read more than one PCM sample at a time?
This is for WAV and AAC files.
You have to understand that WAV files (and even more so AAC files) are not all the same. I will only explain WAV files; hopefully you will then see how it is with AAC files. As you pointed out, a WAV file contains PCM-encoded data. However, that data can be 8-bit, 16-bit, 32-bit, ...; mono, stereo, 5.1, ...; 8 kHz, 16 kHz, 44.1 kHz, etc. Depending on these values you have to interpret the data (e.g. when reading it with fread()) differently. That is why WAV files have a header. You have to read that header first, in the standard way (I do not know the details), and then you know how to read the actual data. Since it is not that easy, I suggest you use one of the libraries out there that read WAV files for you, e.g. http://www.mega-nerd.com/libsndfile/ . Of course you can also google or search SO for others. Or you do it the hard way: find out how WAV file headers are laid out, decode that data first, then move on to the actual PCM-encoded data.
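As a rough sketch only, assuming the simplest common case of a canonical 44-byte header with 16-bit little-endian mono PCM (a robust reader must parse the RIFF chunks instead of hard-coding these values), reading samples with fread() and computing an RMS "signal strength" might look like this:

    /* Sketch: read 16-bit PCM samples with fread() and compute RMS level.
     * Assumes a canonical 44-byte header and 16-bit little-endian mono
     * samples; real code must parse the header to learn the format. */
    #include <math.h>
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        FILE *fp = fopen("input.wav", "rb");
        if (fp == NULL)
            return 1;

        fseek(fp, 44, SEEK_SET);          /* skip the (assumed) canonical header */

        int16_t samples[1024];            /* one 16-bit sample = 2 bytes */
        size_t n;
        double sum = 0.0;
        long count = 0;

        /* fread() can return many samples per call; n is the number read. */
        while ((n = fread(samples, sizeof samples[0], 1024, fp)) > 0) {
            for (size_t i = 0; i < n; i++)
                sum += (double)samples[i] * samples[i];
            count += (long)n;
        }

        if (count > 0)
            printf("RMS: %f\n", sqrt(sum / count));

        fclose(fp);
        return 0;
    }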
I have no experience with WAV files, but I once read data from an MP3 file. In an MP3 file, every 576 PCM samples are encoded into a frame, and the frames are stored in the file along with some side information. To process the encoded data, I read binary data from the MP3 file into a buffer, decoded the buffered data and extracted what was meaningful to me.
I think processing a WAV file (which, as I understand it, stores PCM samples) is not that different. You can read the binary data from the file directly and transform it according to the WAV encoding specification.
The file itself does not know what kind of data, or even what format of data, is in it. You can treat everything in a file as bytes (even plain text): read the bytes from the file and interpret the binary data yourself.

What happens to a piece of data if you use zlib to decompress it, but it isn't compressed in the first place?

If you decompress data with zlib that isn't compressed, does anything happen?
If it does in fact change the data, how do you check if data is zlib zipped in the first place?
For zlib to decompress it, the data would need to start with a valid zlib header. It is extremely unlikely that uncompressed data would happen to form an accurately structured (compressed) stream, so inflating it simply fails with an error; the original data is not changed.
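As a rough illustration of the header part, here is a quick zlib-header check based on RFC 1950. It is only a heuristic, since a small fraction of random inputs can pass it; only inflating the whole stream (which also verifies the Adler-32 checksum) is conclusive.

    /* Sketch: heuristic check for the two-byte zlib header (RFC 1950):
     * the method nibble must be 8 (deflate) and CMF*256 + FLG must be
     * divisible by 31. */
    #include <stddef.h>
    #include <stdio.h>

    static int has_zlib_header(const unsigned char *buf, size_t len)
    {
        if (len < 2)
            return 0;
        if ((buf[0] & 0x0F) != 8)                  /* compression method must be deflate */
            return 0;
        return ((buf[0] << 8) | buf[1]) % 31 == 0; /* FCHECK makes the pair a multiple of 31 */
    }

    int main(void)
    {
        const unsigned char plain[] = "this text is not compressed";
        const unsigned char zhdr[]  = { 0x78, 0x9C };   /* typical zlib header */

        printf("plain: %d\n", has_zlib_header(plain, sizeof plain - 1));
        printf("zhdr:  %d\n", has_zlib_header(zhdr, sizeof zhdr));
        return 0;
    }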

Reading tag data for Ogg/Flac files

I'm working on a C library that reads tag information from music files. I've already got ID3v2 taken care of, but I can't figure out how Ogg files are structured.
I opened a .ogg file in a hex editor and I could find the tag data, because that was all human-readable, but everything from the beginning of the file to the tag data looked like garbage. How is this data encoded?
I don't need any help with the actual code; I just need help visualizing what an Ogg header looks like and what encoding it uses, so that I can read it. I'd like to use a non-hacky approach to reading Ogg files.
I've been looking at the FLAC format, which has been helpful.
The FLAC file I'm looking at has about 350 bytes between the "fLaC" identifier and the human-readable comments section, and none of it is human-readable in my hex editor, so I'm sure there has to be something important in there.
I'm using Linux, and I have no intention of porting to Windows or OS X. So if I need to use a glibc-only function to convert the encoding, I'm fine with that.
The Ogg file format is documented here. There is a very nice graphical visualization, as you requested, together with a detailed written description.
You may also want to look at libogg, which is an open-source, BSD-licensed library for reading and writing Ogg files.
As is described in the link you provided, the following metadata blocks can occur between the "fLaC" marker and the VORBIS_COMMENT metadata block.
STREAMINFO: This block has information about the whole stream, like sample rate, number of channels, total number of samples, etc. It must be present as the first metadata block in the stream. Other metadata blocks may follow, and ones that the decoder doesn't understand, it will skip.
APPLICATION: This block is for use by third-party applications. The only mandatory field is a 32-bit identifier. This ID is granted upon request to an application by the FLAC maintainers. The remainder of the block is defined by the registered application. Visit the registration page if you would like to register an ID for your application with FLAC.
PADDING: This block allows for an arbitrary amount of padding. The contents of a PADDING block have no meaning. This block is useful when it is known that metadata will be edited after encoding; the user can instruct the encoder to reserve a PADDING block of sufficient size so that when metadata is added, it will simply overwrite the padding (which is relatively quick) instead of having to insert it into the right place in the existing file (which would normally require rewriting the entire file).
SEEKTABLE: This is an optional block for storing seek points. It is possible to seek to any given sample in a FLAC stream without a seek table, but the delay can be unpredictable since the bitrate may vary widely within a stream. By adding seek points to a stream, this delay can be significantly reduced. Each seek point takes 18 bytes, so 1% resolution within a stream adds less than 2k. There can be only one SEEKTABLE in a stream, but the table can have any number of seek points. There is also a special 'placeholder' seekpoint which will be ignored by decoders but which can be used to reserve space for future seek point insertion.
Just after the above description, there's also the specification of the format of each of those blocks. The link also says
All numbers used in a FLAC bitstream are integers; there are no floating-point representations. All numbers are big-endian coded. All numbers are unsigned unless otherwise specified.
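As a rough sketch of how those pieces fit together (assuming a well-formed file and with abbreviated error handling), this walks the metadata block headers after the "fLaC" marker, each being one byte of last-block flag plus block type followed by a 24-bit big-endian length, until it reaches the VORBIS_COMMENT block (type 4):

    /* Sketch: skip the "fLaC" marker and walk the metadata block headers
     * until the VORBIS_COMMENT block (type 4) is found. */
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        FILE *fp = fopen("input.flac", "rb");
        unsigned char hdr[4];

        if (fp == NULL || fread(hdr, 1, 4, fp) != 4 || memcmp(hdr, "fLaC", 4) != 0) {
            fprintf(stderr, "not a FLAC file\n");
            return 1;
        }

        for (;;) {
            if (fread(hdr, 1, 4, fp) != 4)
                break;

            int last = hdr[0] & 0x80;                 /* last-metadata-block flag */
            int type = hdr[0] & 0x7F;                 /* 4 = VORBIS_COMMENT       */
            long len = ((long)hdr[1] << 16) | ((long)hdr[2] << 8) | hdr[3];

            if (type == 4) {
                printf("VORBIS_COMMENT block: %ld bytes at offset %ld\n",
                       len, ftell(fp));
                break;
            }
            fseek(fp, len, SEEK_CUR);                 /* skip blocks we don't want */
            if (last)
                break;
        }

        fclose(fp);
        return 0;
    }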
So, what are you missing? You say
I'd like a non-hacky approach to reading Ogg files.
Why rewrite a library to do that when such libraries already exist?
