How to handle portability issues in a binary file format

I'm designing a binary file format to store strings (without a terminating null, to save space) and binary data.
i. What is the best way to deal with little- and big-endian systems?
i.a Would converting everything to network byte order and back with ntohl()/htonl() work?
ii. Will the packed structures be the same size on x86, x64 and ARM?
iii. Are there any inherent weaknesses with this approach?
struct __attribute__((packed)) Header {
uint8_t magic;
uint8_t flags;
};
struct __attribute__((packed)) Record {
uint64_t length;
uint32_t crc;
uint16_t year;
uint8_t day;
uint8_t month;
uint8_t hour;
uint8_t minute;
uint8_t second;
uint8_t type;
};
Tester code I'm using to develop the format:
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <limits.h>
#include <strings.h>
#include <stdint.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
struct __attribute__((packed)) Header {
uint8_t magic;
uint8_t flags;
};
struct __attribute__((packed)) Record {
uint64_t length;
uint32_t crc;
uint16_t year;
uint8_t day;
uint8_t month;
uint8_t hour;
uint8_t minute;
uint8_t second;
uint8_t type;
};
int main(void)
{
int fd = open("test.dat", O_RDWR|O_APPEND|O_CREAT, 0644); /* mode must be octal; decimal 444 means 0674 */
struct Header header = {1, 0};
write(fd, &header, sizeof(header));
char msg[] = {"BINARY"};
struct Record record = {strlen(msg), 0, 0, 0, 0, 0, 0, 0};
write(fd, &record, sizeof(record));
write(fd, msg, record.length);
close(fd);
fd = open("test.dat", O_RDWR|O_APPEND|O_CREAT, 444);
read(fd, &header, sizeof(struct Header));
read(fd, &record, sizeof(struct Record));
int len = record.length;
char c;
while (len != 0) {
read(fd, &c, 1);
len--;
printf("%c", c);
}
close(fd);
}

i. Defining the file to be in one order and converting to and from "internal" order, if necessary, when reading/writing (perhaps with ntohl and the like) is, in my opinion, the best approach.
ii. I do not trust packed structures. They might work for this approach for those platforms, but there are no guarantees.
iii. Reading and writing binary files using fread and fwrite on whole structs is (again in my opinion) an inherently weak approach. You maximize the likelihood that you will be bitten by word size problems, padding and alignment problems, and byte order problems.
What I like to do is write little functions like get16() and put32() that read and write a byte at a time and so are inherently insensitive to word size and byte order difficulties. Then I write straightforward putHeader and getRecord functions (and the like) in terms of these.
unsigned int get16(FILE *fp)
{
unsigned int r;
r = getc(fp);
r = (r << 8) | getc(fp);
return r;
}
void put32(unsigned long int x, FILE *fp)
{
putc((int)((x >> 24) & 0xff), fp);
putc((int)((x >> 16) & 0xff), fp);
putc((int)((x >> 8) & 0xff), fp);
putc((int)(x & 0xff), fp);
}
[P.S. As @Olaf correctly points out in one of the comments, production code would need to handle EOF and errors in these functions. I've left that out for simplicity of presentation.]

Related

Clear just bit field members of a struct?

I have a struct like the following:
struct Foo {
unsigned int id;
unsigned int flag_1 : 1;
unsigned int flag_2 : 1;
unsigned int flag_3 : 1;
// Some arbitrary number of further flags. Code is
// automatically generated and number will vary.
// Notably, it may be more than an int's worth.
int some_data;
float some_more_data;
// ...
};
From time to time, I need to reset all the flags to zero while preserving the rest of the struct. One way is obviously to set each flag to 0 individually, but it feels like there ought to be a way to do it in one fell swoop. Is that possible?
(Note that I am open to not using bit fields, but this is code that will sometimes run on memory-constrained systems, so the memory savings are very appealing.)
Edit:
There is a similar question here: Reset all bits in a c bitfield
However, the struct in that question is entirely bitfields. I cannot simply memset the entire struct to zero here, and the other answer involving unions is not guaranteed to work, especially if there are more than an int's worth of flags.

Just use a separate struct for the flags:
struct Foo_flags {
unsigned int flag_1 : 1;
unsigned int flag_2 : 1;
unsigned int flag_3 : 1;
// ...
};
struct Foo {
unsigned int id;
struct Foo_flags flags;
int some_data;
float some_more_data;
// ...
};
Or even a simpler nested struct:
struct Foo {
unsigned int id;
struct {
unsigned int flag_1 : 1;
unsigned int flag_2 : 1;
unsigned int flag_3 : 1;
// ...
} flags;
int some_data;
float some_more_data;
// ...
};
Then, later in your code:
struct Foo x;
// ...
x.flags.flag_1 = 1;
// ...
memset(&x.flags, 0, sizeof(x.flags));
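As a variation on the memset call, a nested flags struct can also be cleared by assigning a zero-initialised compound literal (C99 or later). This is only a sketch: the function name foo_clear_flags is made up for illustration, and the struct is the separate-struct version from above with the elided members left out.

```c
struct Foo_flags {
    unsigned int flag_1 : 1;
    unsigned int flag_2 : 1;
    unsigned int flag_3 : 1;
};

struct Foo {
    unsigned int id;
    struct Foo_flags flags;
    int some_data;
    float some_more_data;
};

/* Clear every flag while leaving the other members untouched.
   The compound literal {0} zero-initialises all bit fields,
   however many flags the generator emits. */
void foo_clear_flags(struct Foo *x)
{
    x->flags = (struct Foo_flags){0};
}
```

Either form works whether the flags fit in one int or span several, because the compiler knows the full size of struct Foo_flags.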

With some minor adjustments, you can use the offsetof macro to find the beginning and end of the "flag" data within the structure, then use memset to clear the relevant memory. (Note that you cannot use offsetof directly on bitfields, hence the addition of the flag_beg member!)
Here's a working example:
#include <stdio.h>
#include <stddef.h> // defines offsetof
#include <string.h> // declares memset
struct Foo {
unsigned int id;
unsigned int flag_beg; // Could be unsigned char to save space
unsigned int flag_1 : 1;
unsigned int flag_2 : 1;
unsigned int flag_3 : 1;
unsigned int flag_end; // Could be unsigned char to save space
// Some arbitrary number of further flags. Code is
// automatically generated and number will vary.
// Notably, it may be more than an int's worth.
int some_data;
float some_more_data;
// ...
};
#define FBEG (offsetof(struct Foo, flag_beg))
#define FEND (offsetof(struct Foo, flag_end))
int main()
{
struct Foo f;
f.id = 3; f.flag_1 = 1; f.flag_2 = 0; f.flag_3 = 1;
f.some_data = 33; f.some_more_data = 16.2f;
printf("%u %u %u %u %d %f\n", f.id, f.flag_1, f.flag_2, f.flag_3, f.some_data, f.some_more_data);
memset((char*)(&f) + FBEG, 0, FEND - FBEG);
printf("%u %u %u %u %d %f\n", f.id, f.flag_1, f.flag_2, f.flag_3, f.some_data, f.some_more_data);
return 0;
}

Generating packets in C

I am not receiving anything in buffer. Wherever I printf my buffer, it is always empty or shows garbage value. Can anyone help?
I defined header, packet and called them in my main, but buffer still shows garbage.
#include <stdint.h>
struct header {
uint16_t f1;
uint16_t f2;
uint32_t f3;
};
struct data {
uint16_t pf1;
uint64_t pf2;
};
#include <arpa/inet.h>
#include <string.h>
#include <stdint.h>
#include "packet.h"
void htonHeader(struct header h, char buffer[8]) {
uint16_t u16;
uint32_t u32;
u16 = htons(h.f1);
memcpy(buffer+0, &u16, 2);
printf("Value of buff is: %hu\n",buffer);
u16 = htons(h.f2);
memcpy(buffer+2, &u16, 2);
u32 = htonl(h.f3);
memcpy(buffer+4, &u32, 4);
}
void htonData(struct data d, char buffer[10]) {
uint16_t u16;
uint32_t u32;
u16 = htons(d.pf1);
memcpy(buffer+0, &u16, 2);
u32 = htonl(d.pf2>>32);
memcpy(buffer+2, &u32, 4);
u32 = htonl(d.pf2);
memcpy(buffer+6,&u32, 4);
}
void HeaderData(struct header h, struct data d, char buffer[18]) {
htonHeader(h, buffer+0);
htonData(d, buffer+8);
printf("buff is: %s\n",buffer);
}
#include <stdio.h>
#include "packet.c"
#include <string.h>
#include<stdlib.h>
int main(){
struct header h;
struct data d;
char buff[18];
//printf("Packet is: %s\n",buff);
printf("Generating Packets..... \n");
h.f1=1;
d.pf1=2;
h.f2=3;
d.pf2=4;
h.f3=5;
HeaderData(h,d,buff);
strcat(buff,buff+8);
printf("Packet is: %s\n",buff);
return 0;
}

The problem is that your printf() calls are wrong: printf("%hu", ...) expects an unsigned short argument but you pass a pointer, and printing buff with "%s" cannot work because the content is binary, not text. What you could do instead is print some kind of hex dump, like:
int i;
for( i=0; i<sizeof( buff ); i++ ) {
printf( "%x ", buff[i] & 0xff );
}
puts( "" ); // terminate the line
Please note that using sizeof this way works in main() only, where buff is a real array; in the other functions buffer is just a pointer, so you have to pass the buffer size in some other way.
Besides: because of the binary content of buff, you can't use strcat(). Even if you have made sure that there is a '\0' after the last value you copied (I haven't checked whether you have), the integer values themselves may contain a '\0' byte before that one, and strcat() would overwrite everything from that point on.
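Since sizeof only helps where the array itself is visible, one option is a small helper that takes the length explicitly and can be called from any of the functions. The sketch below is one possible shape; the name hex_format and the choice to format into a caller-supplied string (rather than print directly) are my own assumptions.

```c
#include <stddef.h>
#include <stdio.h>

/* Format len bytes of buf as two-digit hex values separated by spaces.
   out must have room for 3 * len bytes (or 1 byte when len is 0).
   Returns the number of characters written, excluding the terminator. */
size_t hex_format(const void *buf, size_t len, char *out)
{
    const unsigned char *p = buf;
    size_t n = 0;

    for (size_t i = 0; i < len; i++)
        n += (size_t)sprintf(out + n, i + 1 < len ? "%02x " : "%02x",
                             (unsigned)p[i]);
    out[n] = '\0';
    return n;
}
```

Inside HeaderData() you could then declare char text[3 * 18], call hex_format(buffer, 18, text) and print text with puts(). Reading through unsigned char also avoids the sign-extension problem that the & 0xff in the loop above works around.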

Creating bmp file in C

I am trying to create .bmp file (filled with one colour for testing purposes).
Here is code that I'm using:
#include <stdio.h>
#define BI_RGB 0
typedef unsigned int UINT;
typedef unsigned long DWORD;
typedef long int LONG;
typedef unsigned short WORD;
typedef unsigned char BYTE;
typedef struct tagBITMAPFILEHEADER {
UINT bfType;
DWORD bfSize;
UINT bfReserved1;
UINT bfReserved2;
DWORD bfOffBits;
} BITMAPFILEHEADER;
typedef struct tagBITMAPINFOHEADER {
DWORD biSize;
LONG biWidth;
LONG biHeight;
WORD biPlanes;
WORD biBitCount;
DWORD biCompression;
DWORD biSizeImage;
LONG biXPelsPerMeter;
LONG biYPelsPerMeter;
DWORD biClrUsed;
DWORD biClrImportant;
} BITMAPINFOHEADER;
typedef struct COLORREF_RGB
{
BYTE cRed;
BYTE cGreen;
BYTE cBlue;
}COLORREF_RGB;
int main(int argc, char const *argv[])
{
BITMAPINFOHEADER bih;
bih.biSize = sizeof(BITMAPINFOHEADER);
bih.biWidth = 600;
bih.biHeight = 600;
bih.biSizeImage = bih.biWidth * bih.biHeight * 3;
bih.biPlanes = 1;
bih.biBitCount = 24;
bih.biCompression = BI_RGB;
bih.biXPelsPerMeter = 2835;
bih.biYPelsPerMeter = 2835;
bih.biClrUsed = 0;
bih.biClrImportant = 0;
COLORREF_RGB rgb;
rgb.cRed = 0;
rgb.cGreen = 0;
rgb.cBlue = 0;
BITMAPFILEHEADER bfh;
bfh.bfType = 0x424D;
bfh.bfReserved1 = 0;
bfh.bfReserved2 = 0;
bfh.bfOffBits = sizeof(BITMAPFILEHEADER) + bih.biSize;
bfh.bfSize = sizeof(BITMAPFILEHEADER) + sizeof(BITMAPINFOHEADER) +
bih.biWidth * bih.biHeight * 4;
FILE *f;
f = fopen("test.bmp","wb");
fwrite(&bfh, sizeof(BITMAPFILEHEADER), 1, f);
fwrite(&bih, sizeof(BITMAPINFOHEADER), 1, f);
int i,j;
for(i = 0; i < bih.biHeight; i++)
{
for(j = 0; j < bih.biWidth; j++)
{
fwrite(&rgb,sizeof(COLORREF_RGB),1,f);
}
}
fclose(f);
return 0;
}
and yet every time I compile and run it I get an error saying that it's not a valid BMP image. I double checked all values in multiple references and still can't find the error.
Did I misunderstand something, and what am I doing wrong here?
Also not sure if important but I am using Ubuntu 14.04 to compile this.
EDIT
Found one more issue :
bfh.bfType = 0x424D;
should be
bfh.bfType = 0x4D42;
But still can't see image.

First of all, you set:
bfSize Specifies the size of the file, in bytes.
to an invalid value: your code produces 1440112, while the size of the file is actually 1080112. It should be:
bfh.bfSize = sizeof(BITMAPFILEHEADER) + sizeof(BITMAPINFOHEADER) +
bih.biWidth * bih.biHeight * sizeof(COLORREF_RGB);
because sizeof(COLORREF_RGB) is actually 3, not 4.
Another mistake is the size of your structs and types:
                                  // expected size   actual size*
typedef unsigned int UINT;        // 2               4
typedef unsigned long DWORD;      // 4               8
typedef long int LONG;            // 4               8
typedef unsigned short WORD;      // 2               2
typedef unsigned char BYTE;       // 1               1
* I'm using gcc on the x86_64 architecture
Your offsets just don't match the offsets given on Wikipedia. The reference you are using was probably written for a 16-bit compiler on a 16-bit OS (as pointed out by cup in a comment), so it assumes int to be a 2-byte type.
Using the types from stdint.h worked for me (guide on stdint.h for Visual Studio here):
#include <stdint.h>
typedef uint16_t UINT;
typedef uint32_t DWORD;
typedef int32_t LONG;
typedef uint16_t WORD;
typedef uint8_t BYTE;
And last but not least, you have to turn off struct padding (that is, pack the structs), as suggested by Weather Vane.
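Putting the fixed-width types and the packing together, the headers could be declared roughly as below. This is a sketch rather than a drop-in fix: it assumes a compiler that understands #pragma pack(push, 1) (GCC, Clang and MSVC do) and C11 for _Static_assert. The size checks make the build fail if the layout ever drifts from the 14-byte and 40-byte headers the format requires.

```c
#include <stdint.h>

#pragma pack(push, 1)   /* no padding between members */
typedef struct {
    uint16_t bfType;
    uint32_t bfSize;
    uint16_t bfReserved1;
    uint16_t bfReserved2;
    uint32_t bfOffBits;
} BITMAPFILEHEADER;

typedef struct {
    uint32_t biSize;
    int32_t  biWidth;
    int32_t  biHeight;
    uint16_t biPlanes;
    uint16_t biBitCount;
    uint32_t biCompression;
    uint32_t biSizeImage;
    int32_t  biXPelsPerMeter;
    int32_t  biYPelsPerMeter;
    uint32_t biClrUsed;
    uint32_t biClrImportant;
} BITMAPINFOHEADER;
#pragma pack(pop)       /* restore the default alignment */

/* Compile-time checks against the sizes the BMP format requires. */
_Static_assert(sizeof(BITMAPFILEHEADER) == 14, "file header must be 14 bytes");
_Static_assert(sizeof(BITMAPINFOHEADER) == 40, "info header must be 40 bytes");
```

Writing these structs with fwrite still assumes a little-endian host, which holds on x86_64 but is not guaranteed elsewhere.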

I believe your field sizes are incorrect; try this:
#pragma pack(push, 1)
typedef struct tagBITMAPFILEHEADER {
WORD bfType;
DWORD bfSize;
WORD bfReserved1;
WORD bfReserved2;
DWORD bfOffBits;
} BITMAPFILEHEADER;
#pragma pack(pop)

You are trying to map a C structure to some externally-defined binary format. There are a number of problems with this:
Size
typedef unsigned int UINT;
typedef unsigned long DWORD;
typedef long int LONG;
typedef unsigned short WORD;
typedef unsigned char BYTE;
The C language only specifies the minimum sizes of these types. Their actual sizes can and do vary between different compilers and operating systems. The sizes and offsets of the members in your structures may not be what you expect. The only size that you can rely on (on almost any system that you are likely to encounter these days) is char being 8 bits.
Padding
typedef struct tagBITMAPFILEHEADER {
UINT bfType;
DWORD bfSize;
UINT bfReserved1;
UINT bfReserved2;
DWORD bfOffBits;
} BITMAPFILEHEADER;
It is common for DWORD to be twice as large as a UINT, and in fact your program depends on it. This usually means that the compiler will introduce padding between bfType and bfSize to give the latter appropriate alignment for its type. The .bmp format has no such padding.
Order
BITMAPFILEHEADER bfh;
bfh.bfType = 0x424D;
...
fwrite(&bfh, sizeof(BITMAPFILEHEADER), 1, f);
Another problem is that the C language does not specify the endianness of the types. The .bfType member may be stored as [42][4D] (Big-Endian) or [4D][42] (Little-Endian) in memory. The .bmp format specifically requires the values to be stored in Little-Endian order.
Solution
You might be able to solve some of these problems by using compiler-specific extensions (such as #pragmas or compiler switches), but probably not all of them.
The only way to properly write an externally-defined binary format is to use an array of unsigned char and write the values a byte at a time. Personally, I would write a set of helper functions for specific types:
void w32BE (unsigned char *p, unsigned long ul)
{
p[0] = (ul >> 24) & 0xff;
p[1] = (ul >> 16) & 0xff;
p[2] = (ul >> 8) & 0xff;
p[3] = (ul ) & 0xff;
}
void w32LE (unsigned char *p, unsigned long ul)
{
p[0] = (ul ) & 0xff;
p[1] = (ul >> 8) & 0xff;
p[2] = (ul >> 16) & 0xff;
p[3] = (ul >> 24) & 0xff;
}
/* And so on */
and some functions for writing the entire .bmp file or sections of it:
int write_header (FILE *f, BITMAPFILEHEADER bfh)
{
unsigned char buf[14];
w16LE (buf , bfh.bfType);
w32LE (buf+ 2, bfh.bfSize);
w16LE (buf+ 6, bfh.bfReserved1);
w16LE (buf+ 8, bfh.bfReserved2);
w32LE (buf+10, bfh.bfOffBits);
return fwrite (buf, sizeof buf, 1, f);
}
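Extending the same idea to both headers, here is a sketch that fills a 54-byte buffer for an uncompressed 24-bit image like the one in the question. The names bmp_row_size and bmp_build_headers are hypothetical, the helpers take fixed-width types instead of unsigned long, and the 2835 pixels-per-metre value is simply copied from the question.

```c
#include <stdint.h>
#include <string.h>

static void w16LE(unsigned char *p, uint16_t v)
{
    p[0] = v & 0xff;
    p[1] = (v >> 8) & 0xff;
}

static void w32LE(unsigned char *p, uint32_t v)
{
    p[0] = v & 0xff;
    p[1] = (v >> 8) & 0xff;
    p[2] = (v >> 16) & 0xff;
    p[3] = (v >> 24) & 0xff;
}

/* Bytes per pixel row for 24-bit pixels, padded to a multiple of 4
   as the BMP format requires. */
uint32_t bmp_row_size(uint32_t width)
{
    return (width * 3u + 3u) & ~3u;
}

/* Fill out[54] with the 14-byte file header followed by the 40-byte
   BITMAPINFOHEADER. Offsets are absolute positions in the file. */
void bmp_build_headers(unsigned char out[54], uint32_t width, uint32_t height)
{
    uint32_t image_size = bmp_row_size(width) * height;

    memset(out, 0, 54);                 /* reserved and unused fields stay 0 */
    out[0] = 'B';                       /* bfType, already in file order */
    out[1] = 'M';
    w32LE(out + 2, 54u + image_size);   /* bfSize */
    w32LE(out + 10, 54u);               /* bfOffBits */
    w32LE(out + 14, 40u);               /* biSize */
    w32LE(out + 18, width);             /* biWidth */
    w32LE(out + 22, height);            /* biHeight (positive: bottom-up rows) */
    w16LE(out + 26, 1);                 /* biPlanes */
    w16LE(out + 28, 24);                /* biBitCount */
    w32LE(out + 34, image_size);        /* biSizeImage */
    w32LE(out + 38, 2835u);             /* biXPelsPerMeter */
    w32LE(out + 42, 2835u);             /* biYPelsPerMeter */
}
```

The buffer can then be written with a single fwrite(out, 1, 54, f), followed by the pixel rows. Each pixel is stored as blue, green, red, and each row must be padded out to bmp_row_size(width) bytes.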

Reading and writing bitmaps in c

I'm trying to create an application that inverts the colors of a bitmap file, but I am having some trouble with actually gathering the data from the bitmap. I'm using structures to keep the data for the bitmap and its header. Right now I have:
struct
{
uint16_t type;
uint32_t size;
uint32_t offset;
uint32_t header_size;
int32_t width;
int32_t height;
uint16_t planes;
uint16_t bits;
uint32_t compression;
uint32_t imagesize;
int32_t xresolution;
int32_t yresolution;
uint32_t ncolours;
uint32_t importantcolours;
} header_bmp
struct {
header_bmp header;
int data_size;
int width;
int height;
int bytes_per_pixel;
char *data;
} image_bmp;
Now for actually reading and writing the bitmap I have the following:
image_bmp* startImage(FILE* fp)
{
header_bmp* bmp_h = (struct header_bmp*)malloc(sizeof(struct header_bmp));
ReadHeader(fp, bmp_h, 54);
}
void ReadHeader(FILE* fp, char* header, int dataSize)
{
fread(header, dataSize, 1, fp);
}
From here how do I extract the header information into my header structure?
Also if anyone has any good resources over reading and writing bitmaps, please let me know. I have been searching for hours and can't find much useful information over the topic.

You should already have all the data in the correct places, provided the struct layout matches the file layout byte for byte. The remaining thing that could go wrong is endianness, e.g. is the number 256 represented in a "short" as
0x01 0x00 or 0x00 0x01?
EDIT: there is also something wrong with the struct syntax in the question:
struct name_of_definition { int a; int b; short c; short d; };
struct name_of_def_2 { struct name_of_definition instance; int a; int b; }
*ptr_to_instance; // or one can directly allocate the instance it self by
// by omitting the * mark.
struct { int b; int c; } instance_of_anonymous_struct;
ptr_to_instance = malloc(sizeof(struct name_of_def_2));
also:
ReadHeader(fp, (char*)&ptr_to_instance->instance, sizeof(struct name_of_definition));
// ^ don't forget to cast to the type accepted by ReadHeader
In this way you can directly read data into the middle of the struct, but the possible issue of endianness still lurks around.
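If the struct layout or the host byte order cannot be trusted, an alternative is to read the 54 header bytes into a plain unsigned char array and decode the fields from their documented offsets. The sketch below does that for a few fields only; struct bmp_info and the function names are invented for illustration and are not the question's types.

```c
#include <stdint.h>

static uint16_t r16LE(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t r32LE(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical subset of the header fields, decoded into native types. */
struct bmp_info {
    uint32_t file_size;
    uint32_t data_offset;
    uint32_t header_size;
    int32_t  width;
    int32_t  height;
    uint16_t bits;
};

/* Decode the first 54 bytes of a BMP file.
   Returns 0 on success, -1 if the "BM" signature is missing. */
int bmp_parse_header(const unsigned char raw[54], struct bmp_info *out)
{
    if (raw[0] != 'B' || raw[1] != 'M')
        return -1;
    out->file_size   = r32LE(raw + 2);
    out->data_offset = r32LE(raw + 10);  /* bytes 6..9 are reserved */
    out->header_size = r32LE(raw + 14);
    /* Width and height are signed; the casts assume two's complement. */
    out->width       = (int32_t)r32LE(raw + 18);
    out->height      = (int32_t)r32LE(raw + 22);
    out->bits        = r16LE(raw + 28);
    return 0;
}
```

Note the four reserved bytes between the file size and the data offset: a struct that omits them, or that lets the compiler pad after the 2-byte type field, will not line up with the file.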

I get segmentation fault reading infoheader on a BMP using fread. How do I fix this please?

This has got me pretty stuck; how do I fix this? I know I haven't got error checking, but I'd guess it isn't required since the program is restricted to my desktop. It obviously can't be EOF. The fault is with the infoheader struct; the fileheader works fine. Do I need to take a new line or something?
#include <stdio.h>
#include <stdlib.h>
typedef struct
{
unsigned char fileMarker1; /* 'B' */
unsigned char fileMarker2; /* 'M' */
unsigned int bfSize;
unsigned short unused1;
unsigned short unused2;
unsigned int imageDataOffset; /* Offset to the start of image data */
}FILEHEADER;
typedef struct
{
unsigned int biSize;
int width; /* Width of the image */
int height; /* Height of the image */
unsigned short planes;
unsigned short bitPix;
unsigned int biCompression;
unsigned int biSizeImage;
int biXPelsPerMeter;
int biYPelsPerMeter;
unsigned int biClrUsed;
unsigned int biClrImportant;
}INFOHEADER;
typedef struct
{
unsigned char b; /* Blue value */
unsigned char g; /* Green value */
unsigned char r; /* Red value */
}IMAGECOMPONENT;
int fileheadfunc(FILE *image);
int infoheadfunc(FILE *image);
int main( int argc, char *argv[] )
{
char *filename; /* *threshholdInput = argv[2]; */
FILE *image;
int filehead, infohead;
filename = argv[1];
/* int threshhold = atoi(threshholdInput); */
if (argc != 2)
{
printf(" Incorrect Number Of Command Line Arguments\n");
return(0);
}
image = fopen( filename, "r");
if (image == NULL)
{
fprintf(stderr, "Error, cannot find file %s\n", filename);
exit(1);
}
filehead = fileheadfunc(image);
infohead = infoheadfunc(image);
fclose(image);
return(0);
}
int fileheadfunc(FILE *image)
{
FILEHEADER *header;
long pos;
fseek (image , 0 , SEEK_SET);
fread( (unsigned char*)header, sizeof(FILEHEADER), 1, image );
if ( (*header).fileMarker1 != 'B' || (*header).fileMarker2 != 'M' )
{
fprintf(stderr, "Incorrect file format");
exit(1);
}
printf("This is a bitmap!\n");
pos = ftell(image);
printf("%ld\n", pos);
printf("%zu\n", sizeof(FILEHEADER));
return(0);
}
int infoheadfunc(FILE *image)
{
INFOHEADER *iheader;
fseek (image, 0, SEEK_CUR );
fread( (unsigned int*)iheader, sizeof(INFOHEADER), 1, image );
printf("Width: %i\n", (*iheader).width);
printf("Height: %i\n", (*iheader).height);
return(0);
}

You're not actually allocating any storage for the BMP header data structures, e.g. you need to change this:
int fileheadfunc(FILE *image)
{
FILEHEADER *header;
long pos;
fseek(image, 0, SEEK_SET);
fread((unsigned char*)header, sizeof(FILEHEADER), 1, image);
...
to this:
int fileheadfunc(FILE *image)
{
FILEHEADER header; // <<<
long pos;
fseek(image, 0, SEEK_SET);
fread(&header, sizeof(FILEHEADER), 1, image); // <<<
...
Also, as previously noted in one of the comments above, you need #pragma pack(1) (or equivalent if you're not using gcc or a gcc-compatible compiler) prior to your struct definitions to eliminate unwanted padding. (NB: use #pragma pack() after your struct definitions to restore normal struct padding/alignment.)

There are two problems with the code:
Alignment
For performance reasons the compiler will, unless instructed to do otherwise, arrange struct fields on their "natural boundaries", effectively leaving uninitialised gaps after fields that are smaller than the alignment of the next one. Add
#pragma pack(1)
before the struct definitions and you should be fine. It's also easy to test: just print out the struct size without and with pragma pack in place, and you'll see the difference.
Allocation
As Paul R already said, you should allocate space for the headers, not just provide a pointer to the structures. The fact that fileheadfunc works is a coincidence: there just wasn't anything in the way that got smashed when data got written outside of the allocated space.
A last one, just as a precaution: should you ever want to return the read structures to the calling program, do not just return a pointer to the structure allocated in the function, as that will cause problems similar to the unallocated variables you have now. Allocate them in the calling function, and pass a pointer to that variable to the header read functions.
EDIT clarification regarding the last point:
DON'T
FILEHEADER * fileheadfunc(FILE *image)
{
FILEHEADER header;
...
return &header; // returns an address on the function stack that will
// disappear once you return
}
DO
int fileheadfunc(FILE *image, FILEHEADER *header)
{
...
}
which will be called like this
...
FILEHEADER header;
returnvalue = fileheadfunc(imagefile,&header);
EDIT2: just noticed that the way you read the DIB header is not correct. There are several variations of that header, with different sizes. So after reading the file header you first need to read 4 bytes into an unsigned int and based on the value read select the correct DIB header structure to use (don't forget you already read its first field!) or tell the user you encountered an unsupported file format.
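A rough sketch of that size-first approach follows. The function names are made up, and the table only lists the four header sizes I am sure of (12, 40, 108 and 124 bytes); anything else is reported as unsupported.

```c
#include <stdint.h>
#include <stdio.h>

/* Map the size field that starts every DIB header to the header variant.
   Returns NULL for sizes this sketch does not know about. */
const char *dib_header_name(uint32_t size)
{
    switch (size) {
    case 12:  return "BITMAPCOREHEADER";
    case 40:  return "BITMAPINFOHEADER";
    case 108: return "BITMAPV4HEADER";
    case 124: return "BITMAPV5HEADER";
    default:  return NULL;
    }
}

/* Read the 4-byte little-endian size field at the current file position.
   Returns 0 on success, -1 on a short read. */
int read_dib_size(FILE *fp, uint32_t *size)
{
    unsigned char b[4];

    if (fread(b, 1, sizeof b, fp) != sizeof b)
        return -1;
    *size = (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
            ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
    return 0;
}
```

After read_dib_size() succeeds, the remaining size - 4 bytes of the header can be read into the matching structure, skipping its first field because it has already been consumed.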
