zlib format : beginning 2 bytes - zlib

I read https://www.ietf.org/rfc/rfc1950.txt. Still I have some doubts.
Its mentioned that CINFO value can't be more than 7, but in one of my input to zlib inflate() function I have 0x68de as the first two bytes. I am getting uncompressed data without any error from zlib. Here 0x68 first four bits is 0110 second four bits are 1000 which means CINFO is 8. I think I am missing something here. Can anyone explain me about this beginning two bytes (0x68de) clearly.

No, for 0x68, CINFO is 6 and CM is 8. CM is bits 0 to 3, which are the low four bits, and CINFO is bits 4 to 7, which are the high four bits. Section 2.1 clearly describes the notation used in the document, and which bits are which.

Related

Writing integer values to a file in binary using C

I am trying to write 9 bit numbers to a binary file.
For example, i want to write the integer value: 275 as 100010011 and so on. fwrite only allows one byte to be written at a time and I am not sure how to manipulate the bits to be able to do this.
You have to write a minimum of two bytes to store a 9-bits value. An easy solution is to use 16 bits per 9 bits value
Choose a 16 bits unsigned type, eg uint16_t and store the 2 bytes
uint16_t w = 275;
fwrite(&w, 1, 2, myfilep);
Reading the word w, ensure it actually uses only its 9 first bits (bits 0~8)
w &= 0x1FF;
Note that you might have endianness issues if you read the file on another system that doesn't have the same endianness as the system that wrote the word.
You could also optimize that solution using 9 bits of a 16 bits word, then using the remaining 7 bits to store the first 7 bits of the next 9 bits value etc...
See this answer that explains how to work with bit shifting in C.

C - Type Name : Number?

I was wondering what the following code is doing exactly? I know it's something to do with memory alignment but when I ask for the sizeof(vehicle) it prints 20 but the struct's actual size is 22. I just need to understand how this works, thanks!
struct vehicle {
short wheels:8;
short fuelTank : 6;
short weight;
char license[16];
};
printf("\n%d", sizeof(struct vehicle));
20
Memory will be allocated as (assuming memory word size is of 8 bits)
struct vehicle {
short wheels:8; // 1 byte
short fuelTank : 6;
// padd 2 bits to make fuelTank of 1 byte.
short weight; // 2 bytes.
char license[16]; // 16 bytes.
};
1 + 1 + 2 + 16 = 20 bytes.
Consider a machine with a word size of 32bit. The two first fields fit in a whole 16bit word as they occupy 8 + 6 = 14 bits. The second field, while not a bitfield (doesn't have the :<number> thing to allocate space in bits) can fit another 16 bits word to complete a 32 bit word, so the three first fields can pack in a 32bit word (4 bytes) if the architecture allows to access the memory in 16 bit quantities. Finaly, if you add 16 characters to that, this gives the 20 bytes that sizeof operator sends to printf.
Why do you assume the sizeof (struct vehicle) is 22 bytes? You allowed the compiler to print it and it said it's 20. Compilers are free to pad (or not) the structures to achieve better performance. That's an architecture dependency, and as you have not said architecture and compiler used, it is not possible to go further.
For example, 32bit intel arch allows to pad words at even boundaries without performance penalties, so this is a good selection in order to save memory. On other architectures, perhaps it's not allowed to use 16bit integers and data must be padded to fit the third field (leading to 22 bytes for the whole structure)
The only warranty you have when sizing data is that the compiler must allocate enough space to fit everything in an efficient way, so the only thing you can assume from that declaration is that it will occupy at least the minimum space to represent one field of 8 bit, other of 6, a complete short (I'll assume a short is 16 bit) and 16 characters (assuming 8 bits per char) it ammounts to 8 + 6 + 16 + 16*8 = 158 bits minimum.
Suppose we are writing a compiler for D. Knuth MIX machine. As it's stated in his book Fundamental Algorithms, this machine has an unspecified byte size of 64..100 bytes, requiring five to construct one addressable word (plus a binary sign). If you had a byte size independent compiler (one that compiles for any MIX machine, without assumptions of byte size) you have to use no more than 64 possible values per byte, leading to 6 bit per byte. You then would assume the second field fills one complete byte (and the sign drawn from the word it belongs to) and the first field needs two complete bytes (using half of the values for negative values) The third field might be in the second word, filling three complete bytes (6*3 = 18) and the sign of that word. The next 16 chars can begin on the next word, summing up to five complete words, so the whole structure will have 1 + 1 + 4 = 6 words, or 30 bytes. But if you want to handle effectively three signed fields, you'll need three complete words for the three fields (as each has a sign field only) leading to 7 words or 35 bytes.
I have suggested this example because of the particular characteristics of this architecture, that makes one to think on not so uncommon architectures that some time ago where in common use (the first machines ever built where not binary based, like some of these MIX machines)
Note
You can try to print the actual offsets of the fields, to see where in the structure are located and see where the compiler is padding.
#define OFFSET(Typ, field) ((int)&((Typ *)0)->field)
(Note, edited)
This macro will tell you the offset as an int. Use it as OFFSET(struct vehicle, weight) or OFFSET(struct vehicle, license[3])
Note
I had to edit the last macro definition as it complains on some architectures as the conversion of pointer -> int is not always possible (on 64bit architectures, it looses some bits) so it's better to compute the difference of two pointers, which is a proper size_t value, than to convert it directly from pointer.
#define OFFSET(Typ, field) ((char *)&((Typ *)0)->field - (char *)0)

Write 9 bits binary data in C

I am trying to write to a file binary data that does not fit in 8 bits. From what I understand you can write binary data of any length if you can group it in a predefined length of 8, 16, 32,64.
Is there a way to write just 9 bits to a file? Or two values of 9 bits?
I have one value in the range -+32768 and 3 values in the range +-256. What would be the way to save most space?
Thank you
No, I don't think there's any way using C's file I/O API:s to express storing less than 1 char of data, which will typically be 8 bits.
If you're on a 9-bit system, where CHAR_BIT really is 9, then it will be trivial.
If what you're really asking is "how can I store a number that has a limited range using the precise number of bits needed", inside a possibly larger file, then that's of course very possible.
This is often called bitstreaming and is a good way to optimize the space used for some information. Encoding/decoding bitstream formats requires you to keep track of how many bits you have "consumed" of the current input/output byte in the actual file. It's a bit complicated but not very hard.
Basically, you'll need:
A byte stream s, i.e. something you can put bytes into, such as a FILE *.
A bit index i, i.e. an unsigned value that keeps track of how many bits you've emitted.
A current byte x, into which bits can be put, each time incrementing i. When i reaches CHAR_BIT, write it to s and reset i to zero.
You cannot store values in the range –256 to +256 in nine bits either. That is 513 values, and nine bits can only distinguish 512 values.
If your actual ranges are –32768 to +32767 and –256 to +255, then you can use bit-fields to pack them into a single structure:
struct MyStruct
{
int a : 16;
int b : 9;
int c : 9;
int d : 9;
};
Objects such as this will still be rounded up to a whole number of bytes, so the above will have six bytes on typical systems, since it uses 43 bits total, and the next whole number of eight-bit bytes has 48 bits.
You can either accept this padding of 43 bits to 48 or use more complicated code to concatenate bits further before writing to a file. This requires additional code to assemble bits into sequences of bytes. It is rarely worth the effort, since storage space is currently cheap.
You can apply the principle of base64 (just enlarging your base, not making it smaller).
Every value will be written to two bytes and and combined with the last/next byte by shift and or operations.
I hope this very abstract description helps you.

How to allocate a 32-byte aligned memory in C

Came across this question in one of the interview samples. A 16-byte aligned allocation has already been answered in How to allocate aligned memory only using the standard library?
But, I have a specific question in the same regarding the mask used to zero down the last 4 bits. This mask "~0F" has been used such that the resulting address is divisible by 16. What should be done to achieve the same for 32-byte alignment/divisibility?
First, the question you referred to is 16-byte alignment, not 16-bit alignment.
Regarding your actual question, you just want to mask off 5 bits instead of 4 to make the result 32-byte aligned. So it would be ~0x1F.
To clarify a bit:
To align a pointer to a 32 byte boundary, you want the last 5 bits of the address to be 0. (Since 100000 is 32 in binary, any multiple of 32 will end in 00000.)
0x1F is 11111 in binary. Since it's a pointer, it's actually some number of 0's followed by 11111 - for example, with 64-bit pointers, it would be 59 0's and 5 1's. The ~ means that these values are inverted - so ~0x1F is 59 1's followed by 5 0's.
When you take ptr & ~0x1F, the bitwise & causes all bits that are &'ed with 1 to stay the same, and all bits that are &'ed with 0 to be set to 0. So you end up with the original value of ptr, except that the last 5 bits have been set to 0. What this means is that we've subtracted some number between 0 and 31 in order to make ptr a multiple of 32, which was the goal.

Efficient way to create/unpack large bitfields in C?

I have one microcontroller sampling from a lot of ADC's, and sending the measurements over a radio at a very low bitrate, and bandwidth is becoming an issue.
Right now, each ADC only give us 10 bits of data, and its being stored in a 16-bit integer. Is there an easy way to pack them in a deterministic way so that the first measurement is at bit 0, second at bit 10, third at bit 20, etc?
To make matters worse, the microcontroller is little endian, and I have no control over the endianness of the computer on the other side.
EDIT: So far, I like #MSN's answer the best, but I'll respond to the comments
#EvilTeach: I'm not sure if the exact bit pattern would be helpful, or how to best format it with text only, but I'll think about it.
#Jonathan Leffler: Ideally, I'd pack 8 10-bit values into 10 8-bit bytes. If it makes processing easier, I'd settle for 3 values in 4 bytes or 6 values in 8 bytes (although the 2 are equivalent to me, same amount of 'wasted' bits)
Use bit 0 and 31 to determine endianness and pack 3 10-bit values in the middle. One easy way to test matching endianness is to set bit 0 to 0 and bit 31 to 1. On the receiving end, if bit 0 is 1, assert that bit 31 is 0 and swap endianness. Otherwise, if bit 0 is 0, assert that bit 31 is 1 and extract the 3 values.
You can use bitfields, but the ordering within machine words is not defined:
That said, it would look something like:
struct adc_data {
unsigned first :10;
unsigned second :10;
unsigned third :10;
};
EDIT: Corrected, thanks to Jonathan.
The simplest thing to do about endian-ness is to simply pick one for your transmission. To pack the bits in the transmission stream, use an accumulator (of at least 17 bits in your case) in which you shift in 10 bits at a time and keep track of how many bits are in it. When you transmit a byte, you pull a byte out of the accumulator, subtract 8 from your count, and shift the accumulator by 8. I use "transmit" here loosely, your probably storing into a buffer for later transmission.
For example, if transmission is little endian, you shift in your 10 bits at the top of the acccumator (in its MS bits), and pull your bytes from the bottom. For example: for two values a and b:
Accumulator Count
(MS to LS bit)
aaaaaaaaaa 10 After storing a
aa 2 After sending first byte
bbbbbbbbbbaa 12 After storing b
bbbb 4 After sending second byte
Reception is a similar unpacking operation.

Resources