Read binary data (from file) into a struct - c

I'm reading binary data from a file, specifically from a zip file. (To know more about the zip format structure see http://en.wikipedia.org/wiki/ZIP_%28file_format%29)
I've created a struct that stores the data:
typedef struct {
/*Start Size Description */
int signature; /* 0 4 Local file header signature = 0x04034b50 */
short int version; /*  4 2 Version needed to extract (minimum) */
short int bit_flag; /*  6 2 General purpose bit flag */
short int compression_method; /*  8 2 Compression method */
short int time; /* 10 2 File last modification time */
short int date; /* 12 2 File last modification date */
int crc; /* 14 4 CRC-32 */
int compressed_size; /* 18 4 Compressed size */
int uncompressed_size; /* 22 4 Uncompressed size */
short int name_length; /* 26 2 File name length (n) */
short int extra_field_length; /* 28 2 Extra field length (m) */
char *name; /* 30 n File name */
char *extra_field; /*30+n m Extra field */
} ZIP_local_file_header;
The size returned by sizeof(ZIP_local_file_header) is 40, but if the sum of each field is calculated with sizeof operator the total size is 38.
If we have the next struct:
typedef struct {
short int x;
int y;
} FOO;
sizeof(FOO) returns 8 because memory appears to be allocated 4 bytes at a time. So 4 bytes are reserved for x (although its real size is 2 bytes). If the next member were another short int, it would fill the remaining 2 bytes of that allocation. But since the next member is an int, another 4 bytes are allocated and the 2 empty bytes are wasted.
To read data from file, we can use the function fread:
ZIP_local_file_header p;
fread(&p,sizeof(ZIP_local_file_header),1,file);
But as there're empty bytes in the middle, it isn't read correctly.
What can I do to sequentially and efficiently store data with ZIP_local_file_header wasting no bytes?

In order to meet the alignment requirements of the underlying platform, structs may have "padding" bytes between members so that each member starts at a properly aligned address.
There are several ways around this: one is to read each element of the header separately using the appropriately-sized member:
fread(&p.signature, sizeof p.signature, 1, file);
fread(&p.version, sizeof p.version, 1, file);
...
Another is to use bit fields in your struct definition; many compilers pack adjacent bit fields without padding, although the exact layout is implementation-defined, so check sizeof on your platform. The downside is that bit fields must be unsigned int or int or, as of C99, _Bool; you may have to cast the raw data to the target type to interpret it correctly:
typedef struct {
unsigned int signature : 32;
unsigned int version : 16;
unsigned int bit_flag : 16;
unsigned int compression_method : 16;
unsigned int time : 16;
unsigned int date : 16;
unsigned int crc : 32;
unsigned int compressed_size : 32;
unsigned int uncompressed_size : 32;
unsigned int name_length : 16;
unsigned int extra_field_length : 16;
} ZIP_local_file_header;
You may also have to do some byte-swapping in each member if the file was written in big-endian but your system is little-endian.
Note that name and extra field aren't part of the struct definition; when you read from the file, you're not going to be reading pointer values for the name and extra field, you're going to be reading the actual contents of the name and extra field. Since you don't know the sizes of those fields until you read the rest of the header, you should defer reading them until after you've read the structure above. Something like
ZIP_local_file_header p;
char *name = NULL;
char *extra = NULL;
...
fread(&p, sizeof p, 1, file);
if (name = malloc(p.name_length + 1))
{
fread(name, p.name_length, 1, file);
name[p.name_length] = 0;
}
if (extra = malloc(p.extra_field_length + 1))
{
fread(extra, p.extra_field_length, 1, file);
extra[p.extra_field_length] = 0;
}

C structs are just about grouping related pieces of data together; they do not specify a particular layout in memory. (Just as the width of an int isn't defined either.) Endianness is also not defined and depends on the processor.
Different compilers, or the same compiler on different architectures or operating systems, may lay out structs differently.
As the file format you want to read is defined in terms of which bytes go where, a struct, although it looks very convenient and tempting, isn't the right solution. You need to treat the file as a char[] and pull out the bytes you need and shift them in order to make numbers composed of multiple bytes, etc.
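A minimal sketch of that byte-by-byte approach, assuming the 30 fixed bytes of the local file header have already been read into a buffer. ZIP stores multi-byte fields in little-endian order; the names rd16le, rd32le, zip_lfh and parse_zip_lfh are made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers: assemble little-endian integers from raw bytes. */
static uint16_t rd16le(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* In-memory form of the fixed part; its layout need not match the file. */
typedef struct {
    uint32_t signature;
    uint16_t version, bit_flag, compression_method, time, date;
    uint32_t crc, compressed_size, uncompressed_size;
    uint16_t name_length, extra_field_length;
} zip_lfh;

enum { ZIP_LFH_FIXED_SIZE = 30 };

/* Returns 1 on success, 0 if the buffer is too short or the signature is wrong. */
static int parse_zip_lfh(const unsigned char *buf, size_t len, zip_lfh *out)
{
    if (len < ZIP_LFH_FIXED_SIZE)
        return 0;
    out->signature = rd32le(buf + 0);
    if (out->signature != 0x04034b50u)
        return 0;
    out->version            = rd16le(buf + 4);
    out->bit_flag           = rd16le(buf + 6);
    out->compression_method = rd16le(buf + 8);
    out->time               = rd16le(buf + 10);
    out->date               = rd16le(buf + 12);
    out->crc                = rd32le(buf + 14);
    out->compressed_size    = rd32le(buf + 18);
    out->uncompressed_size  = rd32le(buf + 22);
    out->name_length        = rd16le(buf + 26);
    out->extra_field_length = rd16le(buf + 28);
    return 1;
}
```

After a successful parse, name_length and extra_field_length say how many more bytes to read for the file name and the extra field.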

The solution is compiler-specific, but for instance in GCC, you can force it to pack the structure more tightly by appending __attribute__((packed)) to the definition. See http://gcc.gnu.org/onlinedocs/gcc-3.2.3/gcc/Type-Attributes.html.
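As a rough sketch of the packed approach, assuming GCC or Clang (other compilers spell this differently) and fixed-width types from <stdint.h>; the type name zip_lfh_packed is made up for this example:

```c
#include <stddef.h>
#include <stdint.h>

/* Only the fixed 30-byte part of the header; name and extra field follow it
   in the file and are read separately. */
typedef struct __attribute__((packed)) {
    uint32_t signature;
    uint16_t version;
    uint16_t bit_flag;
    uint16_t compression_method;
    uint16_t time;
    uint16_t date;
    uint32_t crc;
    uint32_t compressed_size;
    uint32_t uncompressed_size;
    uint16_t name_length;
    uint16_t extra_field_length;
} zip_lfh_packed;

/* Fails the build if the compiler still inserted padding. */
_Static_assert(sizeof(zip_lfh_packed) == 30, "zip_lfh_packed is not 30 bytes");
```

Packing removes the padding but does nothing about byte order, so an fread into this struct only gives correct values on a little-endian host.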

It's been a while since I worked with zip-compressed files, but I do remember the practice of adding my own padding to hit the 4-byte alignment rules of PowerPC arch.
At best you simply need to define each element of your struct to the size of the piece of data you want to read in. Don't just use 'int' as that may be platform/compiler defined to various sizes.
Do something like this in a header:
typedef unsigned long unsigned32;
typedef unsigned short unsigned16;
typedef unsigned char unsigned8;
typedef unsigned char byte;
Then instead of just int use an unsigned32 where you have a known 4-byte value, and unsigned16 for any known 2-byte values. (On platforms where unsigned long is not 4 bytes, adjust the typedefs, or use uint32_t and uint16_t from <stdint.h>.)
This will help you see where you can add padding bytes to hit 4-byte alignment, or where you can group two 2-byte elements to make up a 4-byte alignment.
Ideally you can use a minimum of padding bytes (which can be used to add additional data later as you expand the program) or none at all if you can align everything to 4-byte boundaries with variable-length data at the end.
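To illustrate the idea of placing the padding yourself, here is a small sketch using the <stdint.h> types. The struct name foo_padded and its pad member are invented for the example, and the checks assume a typical ABI where uint32_t needs 4-byte alignment.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t x;
    uint16_t pad;   /* explicit filler where the compiler would otherwise pad */
    uint32_t y;
} foo_padded;

/* With the filler in place, the compiler has no reason to add hidden padding. */
_Static_assert(offsetof(foo_padded, y) == 4, "y is not at offset 4");
_Static_assert(sizeof(foo_padded) == 8, "foo_padded is not 8 bytes");
```

This only helps for formats you design yourself; the ZIP header puts 4-byte fields at offsets such as 14, so it cannot be matched this way.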

Also, the name and extra_field will not contain any meaningful data, most likely. At least not between runs of the program, since these are pointers.

Related

passing a struct pointer to a function and struct padding in c programming

In build_uart_frame() , I call calcFCS() which calculates an XOR of all the bytes in the struct members(len, cmd0, cmd1 and data).
I do not think the struct is padded therefore will calling calcFCS() be an issue? Could somebody explain what is the issue in relation to struct padding as I don't understand its role here and secondly how can I do this operation correctly?
Thank you
typedef struct uart_frame {
uint8_t sof; /* 1 byte */
uint8_t len; /* 1 bytes */
uint8_t cmd0; /* 1 byte */
uint8_t cmd1;
char data[11]; /* 0 -250 byte */
unsigned char fcs; /* 1 byte */
} uart_frame_t;
//-------------------------------------------------------------------------
// Global uart frame
uart_frame_t rdata;
//-------------------------------------------------------------------------
unsigned char calcFCS(unsigned char *pMsg, unsigned char len) {
unsigned char result = 0;
while(len--) {
result ^= *pMsg++;
}
return(result);
}
//-------------------------------------------------------------------------
// Worker code to populate the frame
int build_uart_frame() {
uart_frame_t *rd = &rdata; //pointer variable 'rd' of type uart_frame
// common header codes
rd->sof = 0xFE;
rd->len = 11;
rd->cmd0 = 0x22;
rd->cmd1 = 0x05;
snprintf(rd->data, sizeof(rd->data), "%s", "Hello World");
rd->fcs = calcFCS((unsigned char *)rd, sizeof(uart_frame_t) - 1); //issue with struct padding
return 0;
}
Given your very specific example, it is unlikely that padding will be an issue, since all data types are bytes. Padding is mostly an issue when you use larger data types, because those should typically not be allocated at misaligned addresses.
Yet that is no guarantee: the compiler could in theory decide to replace a char with an int if it thinks that will get faster code. It is free to insert any amount of padding anywhere in a struct, except before the first member.
This is why structs are unsuitable to describe memory maps or data protocols. You will have to ensure that no padding is present and preferably do so portably. The best way to ensure this is a standard C compile-time assert:
_Static_assert(sizeof(uart_frame_t) == offsetof(uart_frame_t, fcs)+sizeof(unsigned char),
"Padding detected");
Here the size of the whole struct is checked against the byte position of the last struct member + the size of that member. If they are the same, there was no padding.
Now of course this only prevents your code from compiling and misbehaving, it doesn't solve the actual problem. Unfortunately there is no portable way to block padding. #pragma pack(1) is common but non-standard. __attribute__((packed)) is another compiler-specific command for this.
Ensuring that no padding is present on the given system where the code is compiled is usually enough.
Also, some of the more exotic systems (MIPS, SPARC etc) don't even support misaligned reads, meaning that misaligned access will not just mean slower code, but a run-time bus error crash.
The only way to safely ensure maximum portability of code using structs is to write serialize/de-serialize routines that manually copy every member to/from a raw byte array:
void uart_serialize (const uart_frame_t* frame, uint8_t* raw)
{
raw[0] = frame->sof;
raw[1] = frame->len;
...
memcpy(&raw[4], frame->data, 11);
...
}
The downside of such methods is that they obviously add some execution time, so I would only use them for code that I know needs to be ported to all kinds of different systems.
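A fuller sketch of such a pair of routines for the frame in the question might look like this. It assumes the checksum covers len, cmd0, cmd1 and data (as the question describes) and is computed over the raw bytes rather than over the struct; calc_fcs, uart_deserialize and the two enum constants are names introduced here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { UART_DATA_LEN = 11, UART_RAW_LEN = 4 + UART_DATA_LEN + 1 };

typedef struct {
    uint8_t sof, len, cmd0, cmd1;
    char    data[UART_DATA_LEN];
    uint8_t fcs;
} uart_frame_t;

/* XOR of n bytes starting at p. */
static uint8_t calc_fcs(const uint8_t *p, size_t n)
{
    uint8_t result = 0;
    while (n--)
        result ^= *p++;
    return result;
}

/* Copies each member to a fixed offset and appends the checksum, so struct
   padding (if any) never reaches the wire. */
static void uart_serialize(const uart_frame_t *frame, uint8_t raw[UART_RAW_LEN])
{
    raw[0] = frame->sof;
    raw[1] = frame->len;
    raw[2] = frame->cmd0;
    raw[3] = frame->cmd1;
    memcpy(&raw[4], frame->data, UART_DATA_LEN);
    raw[4 + UART_DATA_LEN] = calc_fcs(&raw[1], 3 + UART_DATA_LEN);
}

/* Fills *frame from raw; returns 1 if the stored checksum matches, else 0. */
static int uart_deserialize(const uint8_t raw[UART_RAW_LEN], uart_frame_t *frame)
{
    frame->sof  = raw[0];
    frame->len  = raw[1];
    frame->cmd0 = raw[2];
    frame->cmd1 = raw[3];
    memcpy(frame->data, &raw[4], UART_DATA_LEN);
    frame->fcs  = raw[4 + UART_DATA_LEN];
    return frame->fcs == calc_fcs(&raw[1], 3 + UART_DATA_LEN);
}
```

Because the checksum is taken over the byte array, it does not matter whether the compiler pads uart_frame_t.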

C misreading resolution of .bmp file

I have a project that involves reading a .bmp file into a C program, putting a mask on it, and printing the version with the mask back to a different file. The part I'm having an issue with seems to be the actual process of reading in the file. The first big red flag I'm seeing is that it keeps reading in the wrong resolution. I've searched quite a bit and seen a few scripts to read in a .bmp file as answers to various questions here but using the logic from those scripts hasn't helped.
The primary issue seems to be that rather than reading in the proper dimensions of 200 x 300 on the example image given by my professor, it reads in 13107200 x 65536. However, if I were to include the part of the code that prints to a different file, you would see that the output file has the appropriate resolution. This tells me that I am likely reading in the information properly but not storing it in the way that I think I am.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
struct HEADER {
unsigned short int Type; // Magic identifier
unsigned int Size; // File size in bytes
unsigned short int Reserved1, Reserved2;
unsigned int Offset; // Offset to data (in B)
} Header; // -- 14 Bytes
struct INFOHEADER {
unsigned int Size; // Header size in bytes
int Width, Height; // Width / height of image
unsigned short int Planes; // Number of colour planes
unsigned short int Bits; // Bits per pixel
unsigned int Compression; // Compression type
unsigned int ImageSize; // Image size in bytes
int xResolution, yResolution; // Pixels per meter
unsigned int Colors; // Number of colors
unsigned int ImportantColors; // Important colors
} InfoHeader; // -- 40 Bytes
struct PIXEL {
unsigned char Red, Green, Blue; // Intensity of Red, Green, and Blue
}; // -- 3 Bytes
int getFileSize(FILE *input);
int getHexVal(FILE *input);
struct HEADER *getHeader(FILE *input);
struct INFOHEADER *getInfoHeader(FILE *input);
struct PIXEL *getPixel(FILE *input, struct PIXEL *loc);
struct HEADER *printHeader(FILE *output);
struct INFOHEADER *printInfoHeader(FILE *output);
struct PIXEL *printPixel(FILE *output, struct PIXEL *loc);
int main(int argc, char const *argv[]) {
if (argc == 3) {
if (!strcmp(argv[1], argv[2])) {
printf("The input and output file must be different. Please try again.\n");
return 1;
}
// char Matrix[3][3] =
// { { 0, -1, 0 },
// { -1, 4, -1 },
// { 0, -1, 0 }
// };
FILE *input = fopen(argv[1], "rb");
if (!input) return 1;
int i, j;
// getHeader(input);
fread(&Header, sizeof(struct HEADER), 1, input);
if (Header.Type != 0x4D42) {
printf("The specified input file was not a bitmap. Please try again.");
fclose(input);
return 1;
}
// getInfoHeader(input);
fread(&InfoHeader, sizeof(struct INFOHEADER), 1, input);
fseek(input, Header.Offset, SEEK_SET);
struct PIXEL arr[InfoHeader.Width][InfoHeader.Height];
printf("%d %d\n", InfoHeader.Width, InfoHeader.Height);
for (i = 0; i < InfoHeader.Width; i++) {
for (j = 0; j < InfoHeader.Height; j++) {
getPixel(input, arr[i] + j);
printf("%d %d %d\n", arr[i][j].Red, arr[i][j].Green, arr[i][j].Blue);
}
}
fclose(input);
}
}
I can see a lot of problems with your code:
1. Inconsistent sizes of data types
On different platforms, types like int and short can have different sizes. So, int might be one size on one platform, and another size on a different platform. You may need to use exact sized types like uint32_t.
2. Padding and alignment
The headers stored in the files are packed. Your structs are aligned. That means that the compiler inserts padding between members to ensure that members are always aligned for optimal memory access.
There are a variety of ways of dealing with this. You could declare your structs to be packed. That would get you so far, but see the next point.
3. Endianness
If you are reading a Windows bitmap on a big endian system, you have to convert from little endian data in the file to big endian data for your system.
4. xResolution, yResolution are the wrong members
These are meant to indicate a physical size of the pixels. In practice they are seldom specified. You meant to read Width and Height.
5. The VLA (gah!)
You are using a variable length array: struct PIXEL arr[InfoHeader.xResolution][InfoHeader.yResolution]. That is liable to lead to stack overflows for large bitmaps. You really need to use dynamically allocated memory for the pixel array.
How would I deal with these issues?
Use exact sized types.
Declare packed structs.
Read the structs from file, and then perform endian correction if needed.
Allocate the pixel array with malloc.
The types int and short and so on are only guaranteed to have certain minimum sizes. They can vary between implementations. Even if we assume that int and short are four and two octets respectively, you will still run into problems when reading and writing your structures.
For example:
struct HEADER {
unsigned short int Type;
unsigned int Size;
unsigned short int Reserved1, Reserved2;
unsigned int Offset;
} Header;
In order to make Size suitably aligned for the processor, the compiler will (typically) insert padding between Type and Size which places Size at offset +4 instead of +2 (assuming the sizes mentioned above).
The best way to read (and write) binary formats is to read the file into an unsigned char * buffer, and then extract the fields from there. Eg.
unsigned long Size = buffer[2] +
buffer[3] * 0x100UL +
buffer[4] * 0x10000UL +
buffer[5] * 0x1000000UL;
or similar.
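Applied to the whole BMP header, a sketch of this approach could look as follows. It assumes the common layout of a 14-byte file header followed by a 40-byte info header, with the first 54 bytes of the file already in memory; le16, le32, bmp_info and bmp_parse are names made up here.

```c
#include <stdint.h>

/* Assemble little-endian values from raw bytes. */
static uint16_t le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Just the fields needed to locate and size the pixel data. */
typedef struct {
    uint32_t file_size;
    uint32_t data_offset;
    int32_t  width, height;
    uint16_t bits_per_pixel;
} bmp_info;

/* Returns 1 if hdr starts with "BM", else 0. Offsets are file offsets. */
static int bmp_parse(const unsigned char hdr[54], bmp_info *out)
{
    if (hdr[0] != 'B' || hdr[1] != 'M')
        return 0;
    out->file_size      = le32(hdr + 2);
    out->data_offset    = le32(hdr + 10);
    /* Width and height are signed in the file; the casts assume the usual
       two's-complement conversion. */
    out->width          = (int32_t)le32(hdr + 18);
    out->height         = (int32_t)le32(hdr + 22);
    out->bits_per_pixel = le16(hdr + 28);
    return 1;
}
```

Since every field is taken from an explicit file offset, neither struct padding nor the host's byte order can shift the values.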
I suspect you mixed up some of the fields.
After looking at http://en.wikipedia.org/wiki/BMP_file_format , I think instead of this
struct PIXEL arr[InfoHeader.xResolution][InfoHeader.yResolution];
You really meant this:
struct PIXEL arr[InfoHeader.Width][InfoHeader.Height];

Continuous memory allocation with different data type in C?

I'm trying to compose a string (char array exactly) containing a fixed 14 starting characters and ending with varying content. The varying bit contains 2 floats and 1 32-bit integer that's to be individually treated as 4 1-byte characters in the array separated by commas. It can be illustrated by the following piece of code, which doesn't compile for some obvious reasons (*char can't assign to *float). So, what can I do to get around it?
char *const comStr = "AT+UCAST:0000=0760,0020,0001\r"; // command string
float *pressure;
float *temperature;
uint32_t *timeStamp;
pressure = comStr + 14; // pressure in the address following the '=' in command string
temperature = comStr + 18; // temperature in the address following the 1st ',' in command string
timeStamp = comStr + 22; // time stamp in the address following the 2nd ',' in command string
I have an unclear memory about something like struct and union in the C language which reserves strictly the memory allocation order in which the variables are defined within the "structure". Maybe something like this:
typedef struct
{
char[14] command;
float *pressure;
char comma1;
float *temperature;
char comma2;
uint32_t *time_stamp;
char CR;
}comStr;
Does this structure guarantee that comStr-> command[15] gives me the first/last byte (depends on the endian) of *pressure? Or is there any other special structure do the trick hiding from me?
(Note: comStr-> command[15] isn't going to be evaluated in future code, so exceeding index boundary is not a concern here. The only important thing here is just whether the memory is allocated continuously so that a hardware fetch lasting for 29 bytes starting from the memory address (comStr-> command) gives me exactly the string I want).
p.s. As I am writing this, I came up with an idea. Can I possibly just use memcpy() for the purpose ;) memcpy has parameters of void* type, hopefully it works! I am going to try it now! All hail stackOverflow anyway!
EDIT: I should have made myself clearer, sorry for any misunderstanding! The character array I want to construct is to be sent through UART byte by byte. To do this, a DMA system is to be used to transfer the array to the transmit buffer byte by byte automatically, given the character array's starting memory address and length. So the character array must be stored contiguously in memory. I hope this makes the question clearer.
This proposed structure:
typedef struct
{
char[14] command;
float *pressure;
char comma;
float *temperature;
char comma;
uint32_t *time_stamp;
char CR;
}comStr;
Is not going to help you with your requirement:
The only important thing here is just whether the memory is allocated continuously so that a hardware fetch lasting for 29 bytes starting from the memory address (comStr->command) gives me exactly the string I want.
Note you can't have two members with the same name; you'd need to use comma1 and comma2 for example. Also, the array dimension is in the wrong place.
One problem is that there will be padding bytes within the structure.
Another problem is that the pointers will be holding addresses of something outside the structure (since there is nothing valid inside the structure for them to point at).
It is not clear what you're after. Only a very limited range of floating point values can be represented by 4 bytes in a string. If you're after binary data I/O, then you can drop the pointers and the commas:
typedef struct
{
char command[14];
float pressure;
float temperature;
uint32_t time_stamp;
}comStr;
If you want the commas present, then you're going to have to work harder:
typedef struct
{
char command[14];
char pressure[4];
char comma1;
char temperature[4];
char comma2;
char time_stamp[4];
char CR;
} comStr;
You will have to load the data carefully:
comStr com;
float pressure = ...;
float temperature = ...;
uint32_t time_stamp = ...;
assert(sizeof(float) == 4);
...
memmove(&com.pressure, &pressure, sizeof(pressure));
memmove(&com.temperature, &temperature, sizeof(temperature));
memmove(&com.time_stamp, &time_stamp, sizeof(time_stamp));
You have to unpack with a similar set of memory copies. Note that you won't be able to use simple string manipulation on the structure; there could be zero bytes in any or all of the pressure, temperature and time_stamp sections of the structure.
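Putting the char-array structure and the copies together, a self-contained sketch might look like this. The header text comes from the question; com_fill and com_pressure are helper names invented for the example, and the static assertions record the assumptions that float is 4 bytes and that the all-char struct has no padding.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    char command[14];
    char pressure[4];
    char comma1;
    char temperature[4];
    char comma2;
    char time_stamp[4];
    char CR;
} comStr;

_Static_assert(sizeof(float) == 4, "float is not 4 bytes");
_Static_assert(sizeof(comStr) == 29, "unexpected padding in comStr");

/* Fills every byte of *c; the numeric fields hold raw binary, not text. */
static void com_fill(comStr *c, float pressure, float temperature,
                     uint32_t time_stamp)
{
    memcpy(c->command, "AT+UCAST:0000=", sizeof c->command);
    memcpy(c->pressure, &pressure, sizeof c->pressure);
    c->comma1 = ',';
    memcpy(c->temperature, &temperature, sizeof c->temperature);
    c->comma2 = ',';
    memcpy(c->time_stamp, &time_stamp, sizeof c->time_stamp);
    c->CR = '\r';
}

/* Unpacking is the same copy in the other direction. */
static float com_pressure(const comStr *c)
{
    float value;
    memcpy(&value, c->pressure, sizeof value);
    return value;
}
```

The address of the struct and sizeof(comStr) can then be handed to the DMA transfer as one contiguous 29-byte block.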
Structure padding
#include <stddef.h>
#include <stdio.h>
#include <stdint.h>
typedef struct
{
char command[14];
float *pressure;
char comma1;
float *temperature;
char comma2;
uint32_t *time_stamp;
char CR;
} comStr;
int main(void)
{
static const struct
{
char *name;
size_t offset;
} offsets[] =
{
{ "command", offsetof(comStr, command) },
{ "pressure", offsetof(comStr, pressure) },
{ "comma1", offsetof(comStr, comma1) },
{ "temperature", offsetof(comStr, temperature) },
{ "comma2", offsetof(comStr, comma2) },
{ "time_stamp", offsetof(comStr, time_stamp) },
{ "CR", offsetof(comStr, CR) },
};
enum { NUM_OFFSETS = sizeof(offsets)/sizeof(offsets[0]) };
printf("Size of comStr = %zu\n", sizeof(comStr));
for (int i = 0; i < NUM_OFFSETS; i++)
printf("%-12s %2zu\n", offsets[i].name, offsets[i].offset);
return 0;
}
Output on Mac OS X:
Size of comStr = 64
command 0
pressure 16
comma1 24
temperature 32
comma2 40
time_stamp 48
CR 56
Note how large the structure is on a 64-bit machine. Pointers are 8-bytes each and are 8-byte aligned.
There are various issues to cover in your question. I'll take a shot at some of them.
The order of members in a structure is guaranteed to be the same as the order in which you declared them. But there is a different issue here: padding.
See http://c-faq.com/struct/padding.html and follow the other links and questions there.
Next thing is that you are mistaken in thinking that something like "125" is an integer or something like "1.25" is a float - it's not - it's a string. i.e.
char * p = "125";
p[0] will not contain 1. It will contain '1'; if the encoding is ASCII, that is 49, not 1. p[1] will contain 50 and p[2] will contain 53. Something similar applies to a float.
The opposite will also happen.
i.e. if you have a float at an address and you treat it as a char array, the char array will not contain the text of the float you think it will.
Try this program to see this
#include <stdio.h>
struct A
{
char c[4];
float * p;
int i;
};
int main()
{
float x = 1.25;
struct A a;
a.p = &x;
a.i = 0; // to make sure the 'presumed' string starting at p gets null terminate after the float
printf("%s\n", &a.c[4]);
}
For me, it prints "╪·↓". And this has nothing to do with endianness.
Another thing you need to remember, while assigning values to your structure object - you need to remember that comStr.pressure & comStr.temperature are pointers. You cannot assign values to them directly. You need to either give them the address of an existing float or allocate memory dynamically to which they can point to.
Also are you trying to create the char array or to parse the char array which already exists. If you are trying to create it, a better way to do this will be to use snprintf to do what you want. snprintf uses format specifiers similar to printf but prints to a char array. You can create your char array that way. A bigger question remains - what do you plan to do with this char array you create - that will determine if endianness is relevant for you.
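If the receiver expects decimal text like the sample string in the question (an assumption; the question does not say which form the device wants), a sketch with snprintf could look like this. The function name format_command and the choice of unsigned parameters are illustrative only.

```c
#include <stdio.h>

/* Writes "AT+UCAST:0000=PPPP,TTTT,SSSS\r" into buf.
   Returns the number of characters written (excluding the terminating NUL),
   or -1 if the output did not fit or formatting failed. */
static int format_command(char *buf, size_t size, unsigned pressure,
                          unsigned temperature, unsigned time_stamp)
{
    int n = snprintf(buf, size, "AT+UCAST:0000=%04u,%04u,%04u\r",
                     pressure, temperature, time_stamp);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}
```

With values of at most four digits the result is 29 characters long, matching the length of the command string in the question.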
If you are trying to read from the char array you have been given and trying to split into floats and commas and whatever, then one way to do this will be sscanf but may be difficult for your particular string format.
At last, I found an easy way round but I don't know if there is any drawback for this method. I did:
char commandStr[27];
char *commandHeader = "AT+UCAST:0000=";
float pressure = 760.0;
float temperature = 20.0;
uint32_t timeStamp = 0;
memcpy(commandStr, commandHeader, 14);
commandStr[26] = '\r';
memcpy((void*)(commandStr+14), (void*)(&pressure), 4);
memcpy((void*)(commandStr+18), (void*)(&temperature), 4);
memcpy((void*)(commandStr+22), (void*)(&timeStamp), 4);
Does this code have any security issues or performance issues or whatever?

How does gcc calculate the required space for a structure?

struct {
integer a;
struct c b;
...
}
In general how does gcc calculate the required space? Is there anyone here who has ever peeked into the internals?
I have not "peeked at the internals", but it's pretty clear, and any sane compiler will do it exactly the same way. The process goes like:
Begin with size 0.
For each element, round size up to the next multiple of the alignment for that element, then add the size of that element.
Finally, round size up to the least common multiple of the alignments of all members.
Here's an example (assume int is 4 bytes and has 4 byte alignment):
struct foo {
char a;
int b;
char c;
};
Size is initially 0.
Round to alignment of char (1); size is still 0.
Add size of char (1); size is now 1.
Round to alignment of int (4); size is now 4.
Add size of int (4); size is now 8.
Round to alignment of char (1); size is still 8.
Add size of char (1); size is now 9.
Round to lcm(1,4) (4); size is now 12.
Edit: To address why the last step is necessary, suppose instead the size were just 9, not 12. Now declare struct foo myfoo[2]; and consider &myfoo[1].b, which is 13 bytes past the beginning of myfoo and 9 bytes past &myfoo[0].b. This means it's impossible for both myfoo[0].b and myfoo[1].b to be aligned to their required alignment (4).
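The procedure above can be written down as a short function. This is only a sketch of the rule of thumb, not of GCC's internals; it assumes power-of-two alignments, so the least common multiple is simply the largest alignment. The names member_t, round_up and struct_size are made up for the example.

```c
#include <stddef.h>

/* Size and alignment of one member. */
typedef struct {
    size_t size;
    size_t align;
} member_t;

/* Rounds n up to the next multiple of align. */
static size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) / align * align;
}

/* Predicts sizeof for a struct with the given members, in declaration order. */
static size_t struct_size(const member_t *m, size_t count)
{
    size_t size = 0, max_align = 1;
    for (size_t i = 0; i < count; i++) {
        size = round_up(size, m[i].align) + m[i].size;
        if (m[i].align > max_align)
            max_align = m[i].align;
    }
    return round_up(size, max_align);
}

/* The struct from the worked example, for comparison with sizeof. */
struct foo {
    char a;
    int  b;
    char c;
};
```

Feeding it the members {1,1}, {4,4}, {1,1} reproduces the walk-through: 1, 4, 8, 8, 9, then 12 after the final rounding.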
There's no truly standardized way of aligning a struct, but the rule of thumb goes like this: The entire struct is aligned at a 4 or 8 byte boundary (depending on the platform). Within the struct, each member is aligned by its size. So the following packs with no padding:
char // 1
char
char
char
short int // 2
short int
int // 4
This will have a total size of 12. However, this next one will cause padding:
char // 1, + 1 bytes padding
short // 2
int // 4
char // 1, + 1 byte padding
short // 2
char // 1
char // 1, + 2 bytes padding
Now the structure takes up 16 bytes.
This is just a typical example; the details will depend on your platform. Sometimes you can tell a compiler to never add any padding; this causes more expensive memory access (possibly introducing concurrency problems) but will save space.
To lay out aggregates as efficiently as possible, order the members by size, starting with the biggest.
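A small sketch of the effect of member order, with struct names invented for the example. The sizes in the comments are what a typical ABI with 4-byte-aligned uint32_t produces, not a guarantee.

```c
#include <stdint.h>

/* Declaration order forces padding after a and after c: typically 12 bytes. */
struct unsorted {
    char     a;
    uint32_t b;
    char     c;
    uint16_t d;
};

/* Same members, largest first: typically 8 bytes with no padding at all. */
struct sorted {
    uint32_t b;
    uint16_t d;
    char     a;
    char     c;
};
```

Printing sizeof for both structs on your own compiler shows how much the reordering saves there.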
The size of a structure is implementation defined, but it is hard to say what the size of your structure will be without more information (it is incomplete). For instance, given this struct:
struct MyStruct {
int abc;
int def;
char temp;
};
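This yields a size of 9 on my compiler: 4 bytes for each int and 1 byte for the char. Many compilers will instead report 12, because they pad the struct to a multiple of the alignment of int.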
I have modified your code so that it compiles and ran it on an Eclipse/Microsoft C compiler platform:
struct c {
int a;
struct c *b;
};
struct c d;
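printf("\nsizeof c=%d, sizeof a=%d, sizeof b=%d",
(int)sizeof(d), (int)sizeof(d.a), (int)sizeof(d.b));
/* %08x with an unsigned cast assumes 32-bit pointers, as in the output below;
use %p with a (void *) cast on other platforms. */
printf("\naddrof c =%08x", (unsigned)&d);
printf("\naddrof c.a=%08x", (unsigned)&d.a);
printf("\naddrof c.b=%08x", (unsigned)&d.b);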
The above code fragment produced the following output:
sizeof c=8, sizeof a=4, sizeof b=4
addrof c =0012ff38
addrof c.a=0012ff38
addrof c.b=0012ff3c
Do something like this so you can see (WITHOUT GUESSING) exactly how your compiler formats a structure.

storing in unsigned char array

I have a question about unsigned char arrays.
How can I store integers in the array contiguously?
For example, I need to store 01011 in the array first. Then I need to store 101. How can I end up with 01011101 stored in the array?
Thanks for your help!
Store 01011 first. You'll get value 00001011. Then when you want to store three more bits, perform a left shift by three positions (you'll get 01011000) and make OR with 00000101, you'll get 01011101. However, doing it this way you have to know definitely that you had only five bits filled after first assignment.
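For a single byte, the shift-and-OR step can be sketched like this. The function name append_bits is made up, and the sketch assumes nbits is less than 8 and that the result still fits in one byte.

```c
#include <stdint.h>

/* Shifts the accumulated bits left by nbits and places the low nbits of
   value in the space that opens up. */
static uint8_t append_bits(uint8_t acc, uint8_t value, unsigned nbits)
{
    uint8_t mask = (uint8_t)((1u << nbits) - 1u);
    return (uint8_t)((acc << nbits) | (value & mask));
}
```

Calling append_bits(0x0B, 0x05, 3) takes 00001011, shifts it to 01011000 and ORs in 101, giving 01011101 (0x5D).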
Obviously, you'll need to resize the array as it grows. Dynamic memory allocation/reallocation is a way to go, probably. Pay attention to choosing a right reallocation strategy.
Apart from that, you may want to look at C++ STL containers if you are not limited only to C.
You should write an abstract data type called bitstream for this purpose. It could have the following interface:
This is file bitstream.h:
#ifndef BITSTREAM_H
#define BITSTREAM_H
typedef struct bitstream_impl bitstream;
struct bitstream_impl;
/**
* Creates a bit stream and allocates memory for it. Later, that memory
* must be freed by calling bitstream_free().
*/
extern bitstream *bitstream_new();
/**
* Writes the lower 'bits' bits from the 'value' into the stream, starting with
* the least significant bit ("little endian").
*
* Returns nonzero if the writing was successful, and zero if the writing failed.
*/
extern int bitstream_writebits_le(bitstream *bs, unsigned int value, unsigned int bits);
/**
* Writes the lower 'bits' bits from the 'value' into the stream, starting with
* the most significant bit ("big endian").
*
* Returns nonzero if the writing was successful, and zero if the writing failed.
*/
extern int bitstream_writebits_be(bitstream *bs, unsigned int value, unsigned int bits);
/**
* Returns a pointer to the buffer of the bitstream.
*
* The returned pointer remains valid until the next time that one of the
* bitstream_write* functions is called. The returned buffer must not be
* modified. All bits in the buffer that have not yet been written are zero.
* (This applies only to the last byte of the buffer.) Each byte of the buffer
* contains at most 8 bits of data, even if CHAR_BIT is larger.
*/
extern unsigned char *bitstream_getbuffer(const bitstream *bs);
/**
* Returns the number of bits that have been written to the stream so far.
*/
extern unsigned int bitstream_getsize(const bitstream *bs);
/**
* Frees all the memory that is associated with this bitstream.
*/
extern void bitstream_free(bitstream *bs);
#endif
Then you need to write an implementation for this interface. It should be in a file called bitstream.c. This is left as an exercise.
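One possible sketch of such an implementation follows. It is written as a single file so that it compiles on its own; in a real project the typedef would come from bitstream.h. The struct fields, the helper put_bit, the MSB-first order inside each byte and the doubling growth strategy are all choices made for this example, and the write functions assume bits does not exceed the width of unsigned int.

```c
#include <stdlib.h>
#include <string.h>

typedef struct bitstream_impl bitstream;

struct bitstream_impl {
    unsigned char *buf;    /* storage; unwritten bits are kept at zero */
    size_t         cap;    /* capacity of buf in bytes */
    unsigned int   nbits;  /* number of bits written so far */
};

bitstream *bitstream_new(void)
{
    return calloc(1, sizeof(bitstream));   /* empty stream, no buffer yet */
}

/* Appends one bit, filling each byte from its most significant bit down. */
static int put_bit(bitstream *bs, unsigned int bit)
{
    size_t byte = bs->nbits / 8;
    if (byte >= bs->cap) {
        size_t ncap = bs->cap ? bs->cap * 2 : 16;
        unsigned char *nbuf = realloc(bs->buf, ncap);
        if (nbuf == NULL)
            return 0;
        memset(nbuf + bs->cap, 0, ncap - bs->cap);
        bs->buf = nbuf;
        bs->cap = ncap;
    }
    if (bit)
        bs->buf[byte] |= (unsigned char)(0x80u >> (bs->nbits % 8));
    bs->nbits++;
    return 1;
}

int bitstream_writebits_be(bitstream *bs, unsigned int value, unsigned int bits)
{
    while (bits-- > 0)                       /* most significant bit first */
        if (!put_bit(bs, (value >> bits) & 1u))
            return 0;
    return 1;
}

int bitstream_writebits_le(bitstream *bs, unsigned int value, unsigned int bits)
{
    for (unsigned int i = 0; i < bits; i++)  /* least significant bit first */
        if (!put_bit(bs, (value >> i) & 1u))
            return 0;
    return 1;
}

unsigned char *bitstream_getbuffer(const bitstream *bs)
{
    return bs->buf;                          /* NULL until something is written */
}

unsigned int bitstream_getsize(const bitstream *bs)
{
    return bs->nbits;
}

void bitstream_free(bitstream *bs)
{
    if (bs != NULL) {
        free(bs->buf);
        free(bs);
    }
}
```

With this bit order, writing 0x0b as 5 bits and then 0x05 as 3 bits through bitstream_writebits_be leaves 0x5d in the first byte of the buffer.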
Using the bitstream should then be pretty simple:
#include <stdio.h>
#include <stdlib.h>
#include "bitstream.h"
static void
die(const char *msg) {
perror(msg);
exit(EXIT_FAILURE);
}
int main(void)
{
bitstream *bs;
unsigned char *buf;
bs = bitstream_new();
if (bs == NULL)
die("bitstream_new");
if (!bitstream_writebits_be(bs, 0x000b, 5))
die("write 5 bits");
if (!bitstream_writebits_be(bs, 0x0005, 3))
die("write 3 bits");
if (bitstream_getsize(bs) != 8)
die("FAIL: didn't write exactly 8 bits.");
buf = bitstream_getbuffer(bs);
if (buf[0] != 0x005dU)
die("FAIL: didn't write the expected bits.");
bitstream_free(bs);
return 0;
}

Resources