Packed bit fields in C structures - GCC

I am working with structs in C on Linux.
I started using bit-fields and the "packed" attribute, and I came across some weird behavior:
#include <stdio.h>
struct __attribute__((packed)) {
    int a:12;
    int b:32;
    int c:4;
} t1;
struct __attribute__((packed)) {
    int a:12;
    int b;
    int c:4;
} t2;
int main(void)
{
    printf("%zu\n", sizeof(t1)); //output - 6
    printf("%zu\n", sizeof(t2)); //output - 7
    return 0;
}
How come both structures, which are exactly the same, take a different number of bytes?

Your structures are not "exactly the same". The first one has three consecutive bit-fields; the second has one bit-field, a (non-bit-field) int, and then a second bit-field.
This is significant: consecutive (non-zero-width) bit-fields are merged into a single memory location, while a bit-field and a non-bit-field member that follows it are distinct memory locations.
Your first structure has a single memory location; your second has three. You can take the address of the b member in your second struct, but not in your first. Accesses to the b member don't race with accesses to a or c in your second struct, but they do in your first.
Having a non-bit-field (or a zero-width bit-field) right after a bit-field member "closes" it in a sense: what follows is a different, independent memory location/object. The compiler cannot "pack" your b member inside the run of bit-fields the way it does in the first struct.

struct t1 // 6 bytes
{
    int a:12; // bits 0:11
    int b:32; // bits 12:43
    int c:4;  // bits 44:47
}__attribute__((packed));
struct t2 // 7 bytes
{
    int a:12; // bits 0:11
    int b;    // bits 16:47
    int c:4;  // bits 48:51
}__attribute__((packed));
The regular int b must start on a byte boundary, so there are 4 bits of padding before it. If you put c right next to a, this padding is no longer necessary. You should probably do this, as accessing non-byte-aligned integers like int b:32 is slow.

Related

Why is this structures code in C outputs differently? [duplicate]

I've read this about structure padding in C:
http://bytes.com/topic/c/answers/543879-what-structure-padding
and wrote this code after reading the article. I expected it to print 16 bytes as the size of 'struct pad' and 12 bytes as the size of 'struct pad2'.
I compiled this code with gcc at different optimization levels, but the sizeof() operator gives 16 bytes for both.
Why is that?
This matters to me because of PS3 machines, where byte boundaries and exploiting the full DMA transfer are important:
#include <stdio.h>
#include <stdlib.h>
struct pad
{
char c1; // 1 byte
short s1; // 2 byte
short s2; // 2 byte
char c2; // 1 byte
long l1; // 4 byte
char c3; // 1 byte
};
struct pad2
{
long l1;
short s1;
short s2;
char c1;
char c2;
char c3;
};
int main(void)
{
struct pad P1;
printf("%d\n", sizeof(P1));
struct pad P2;
printf("%d\n", sizeof(P2));
return EXIT_SUCCESS;
}
There are two tricks that can be used to overcome this problem.
Use the directive #pragma pack(push, 1) and then #pragma pack(pop) (a pop needs a matching push).
Example:
#pragma pack(push, 1)
struct tight {
    short element_1;
    int *element_2;
};
#pragma pack(pop)
To check at compile time that two structs have the same size, use this trick:
char voidstr[(sizeof(struct1) == sizeof(struct2)) ? 1 : -1]; // negative array size: compile error if the sizes differ
Your structures each include a long, which your platform apparently requires to be on a four-byte boundary. The structure must be at least as aligned as its most aligned member, so it has to be 4-byte aligned, and a structure's size has to be a multiple of its alignment in case it goes into an array.
Extra padding is required to align the long, and the total size is then rounded up to the next multiple of 4, which gives 16.
Two pieces of advice:
You can compute the offset of a field l1 by
printf("Offset of field %s is %zu\n", "l1", offsetof(struct pad, l1));
To get the offsetof macro you will need to #include <stddef.h> (thanks caf!).
If you want to pack data as densely as possible, use unsigned char[4] instead of long and unsigned char[2] instead of short, and do the arithmetic to convert.
EDIT: sizeof(struct pad2) is 12. Your code has a bug; structure P2 is declared with type struct pad. Try this:
#define xx(T) printf("sizeof(" #T ") == %zu\n", sizeof(T))
xx(struct pad);
xx(struct pad2);
P.S. I should definitely stop trying to answer SO questions after midnight.
On PS3, don't guess. Use __attribute__((aligned (16))), or similar. Not only does it guarantee that the start of the structure will be aligned on a proper boundary (if global or static), it also pads the structure to a multiple of your specified alignment.
Your code isn't showing what you think it is, because both P1 and P2 are defined as instances of struct pad. struct pad2 isn't ever used.
If I change the definition of P2 so that it is struct pad2, gcc does indeed decide to make it size 12.
struct pad P1;
printf("%d\n", sizeof(P1));
struct pad P2;
printf("%d\n", sizeof(P2));
P1 and P2 have the same type, "struct pad". Maybe you want to use "struct pad2" for P2.
CPUs generally expect the built-in data types (int, float, char, double) to be stored in memory at their natural boundary, that is, at an address that is a multiple of their size. Structure padding is done so that data can be accessed from memory faster.
For example, if an int is declared, it should be placed in memory at an address that is a multiple of 4, as the size of int is 4 bytes.
Similarly, a double resides in memory at a multiple of 8.
If memory is properly aligned, the CPU can access it more efficiently.
For the following examples, let us assume:
sizeof(int) = 4 bytes
sizeof(float) = 4 bytes
sizeof(char) = 1 byte

What determines the distance between two structure members?

#include <stdio.h>
struct test
{
unsigned int x;
long int y;
unsigned int z;
};
int main()
{
struct test t;
unsigned int *ptr1 = &t.x;
unsigned int *ptr2 = &t.z;
printf("%d", ptr2 - ptr1);
return 0;
}
This program's output is 4 on my system, why do I get this result instead of 2?
Is the ptr2 - ptr1 statement correct as ptr1 and ptr2 come from pointers to members of the same structure item?
The reason this outputs 4 has to do with the size of each type and struct padding and alignment.
On my system, sizeof(unsigned int) is 4, sizeof(long int) is 8, and sizeof(struct test) is 24. So to ensure that the 64-bit field lies on a 64-bit boundary, the structure is physically laid out like this:
struct test
{
unsigned int x; // 4 bytes
// 4 bytes padding
long int y; // 8 bytes
unsigned int z; // 4 bytes
// 4 bytes padding
};
So when you take the difference between the offset of x and the offset of z, there are 16 bytes difference. And since we're doing pointer subtraction, the value of the difference is {byte offset difference} / {element size}. So we have 16 (byte difference) / 4 (sizeof(unsigned int)) == 4.
If sizeof(long int) was 4, then the struct would probably be laid out like this:
struct test
{
unsigned int x; // 4 bytes
long int y; // 4 bytes
unsigned int z; // 4 bytes
};
In which case the output would be 2.
Note that while the ordering of struct members is defined to be sequential, the layout of the padding is implementation-defined. Compilers are free to pad as they see fit.
From section 6.7.2.1 of the C standard:
13 Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared. A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning.
...
15 There may be unnamed padding at the end of a structure or union.

How to add an array to a bitfield struct?

I tried to use bit fields in a struct for some values which only need one or two bits instead of a whole byte.
My code is:
struct s_rdata {
signed int p0:28;
signed int p1:28;
signed int p2:28;
unsigned int d0:17;
unsigned int d1:17;
unsigned int d2:17;
unsigned char data1:1;
unsigned char data2:1;
} rdata;
So, you might see that there are variables named p0 - p2, d0 - d2 and data1 - data2.
I'd now like to have those in an array. However, neither of these lines works:
signed int p[3]:28;
signed int p:28[3];
Can't I add an array to a bit-field list, i.e. an array of signed int needing only 28 bits per entry?
No, you cannot have an array of bitfields, nor a bitfield whose base type is an array type. The latter doesn't even make sense. The former would be counter-productive, as you would lose any space efficiency gained by using bitfields in the first place.
You can have an array of structs with a single bitfield member:
struct container {
signed int bitfield:7;
} array[3];
but again, you would lose any space efficiency associated with the bitfield use.
You can create an array of any struct type, of course, including those with multiple bitfield members. In that case, you may achieve some internal space efficiency within the individual structures, but you will likely see padding at the end of each one that reduces the overall space efficiency of the array.
Ultimately, unless your program's memory consumption is excessive for its target environment, I strongly recommend that you forget about bitfields. They will complicate your life for an uncertain gain of uncertain importance, and that gain will likely be offset by a performance degradation (of uncertain magnitude and importance).
Should you eventually decide that the program really is using too much memory, then be certain to test any change you make to see how much improvement it yields, and at what cost.
I agree with @John's assessment with respect to space inefficiencies, but that notwithstanding, here is an idea for keeping the contents of a bit-field struct alongside an array in one struct:
typedef struct
{
int A : 16;
int B : 16;
} Struct1;
typedef struct
{
Struct1 B; //bit-field struct
int array[10]; //combined with array (of any legal type)
} Struct2; //type name added here so that the typedef is complete
Here is an example of populating array with contents of bit struct:
typedef struct {
signed int p0:28;
signed int p1:28;
signed int p2:28;
unsigned int d0:17;
unsigned int d1:17;
unsigned int d2:17;
unsigned char data1:1;
unsigned char data2:1;
} RDATA;
typedef struct {
int A[3];
unsigned int B[3];
unsigned char C[2];
} ARRAY;
RDATA rdata = {//initialize instance of RDATA with data
28,
28,
28,
17,
17,
17,
1,
1
};
int WriteToArray(ARRAY *a);
int main(void)
{
ARRAY a;
WriteToArray(&a);//populate array with RDATA values
//ARRAY is now populated with bitfield values;
a.A[0];
a.A[1];
//and so on
return 0;
}
int WriteToArray(ARRAY *a)
{
a->A[0]=rdata.p0;//using values initialized above
a->A[1]=rdata.p1;
a->A[2]=rdata.p2;
a->B[0]=rdata.d0;
a->B[1]=rdata.d1;
a->B[2]=rdata.d2;
a->C[0]=rdata.data1;
a->C[1]=rdata.data2;
return 0;
}
You can also experiment with #pragma pack statements to look at the effects on space, but refer to @John's comment under your post for additional considerations regarding pragmas and spacing.

Bit-fields confusion?

I am working with bit-fields in C and do not understand what is going on with them. I wrote this code, but I do not understand why the results differ from what I expect.
struct tB
{
unsigned b1:3;
signed b2:6;
unsigned b3:11;
signed b4:1;
} b;
int main(void)
{
struct tB *p;
printf("%d\n", sizeof(*p));
}
Why do I get 4 when I print sizeof(*p)?
And if I wanted sizeof(b), how would I work out what it should be?
sizeof(b) gives you the size in bytes of a variable of type struct tB, which in this case is 4 (because of padding it is not 3, as you might expect).
sizeof(*p) again gives you the size in bytes of a variable of type struct tB. You should initialize p with the address of a variable of type struct tB before dereferencing it, e.g.:
struct tB *p=&b;
Note, however, that sizeof(p) gives the size of the pointer p, not of the variable pointed to by p. Try this variation of your program:
#include<stdio.h>
struct tB
{
unsigned b1:3;
signed b2:6;
unsigned b3:11;
signed b4:1;
unsigned b5:13;
} b;
int main(void)
{
struct tB *p;
printf("%zu\n%zu\n", sizeof(*p), sizeof(p));
}
Here is another variation that brings the size of struct tB down to 24 bits (3 bytes), as you expect, by removing the padding with the #pragma pack() directive, which is compiler-dependent (I am using CodeBlocks on Windows).
#include<stdio.h>
#pragma pack(1)
struct tB
{
unsigned b1:3;
signed b2:6;
unsigned b3:11;
signed b4:1;
} b;
int main(void)
{
struct tB *p;
printf("%zu\n%zu\n", sizeof(*p), sizeof(p));
}
You have 21 bits; round up to the size of the nearest int and you get 32 bits (i.e. 4 bytes).
It's all about the processor word. The processor typically fetches memory a word at a time rather than 1 or 2 bytes at a time, so the compiler generally aligns structures to suit word access. Usually this alignment matches the size of the processor's registers. In your case the 21-bit structure is aligned to, and occupies, one 32-bit word. If you extend your structure to, say, 33 bits, it will occupy two words and your program will print 8.
See the Wikipedia article on data structure alignment.
