Union data type field behavior - c

I am having trouble figuring out how this piece of code works.
Mainly, I am confused by how does x.br get the value of 516 after x.str.a and x.str.b get their values of 4 and 2, respectively.
I am new to unions, so maybe there is something I am missing, but shouldn't there only be 1 active field in an union at any given time?
#include <stdio.h>
void f(short num, short* res) {
if (num) {
*res = *res * 10 + num % 10;
f(num / 10, res);
}
}
typedef union {
short br;
struct {
char a, b;
} str;
} un;
void main() {
short res = 0; un x;
x.str.a = 4;
x.str.b = 2;
f(x.br, &res);
x.br = res;
printf("%d %d %d\n", x.br, x.str.a, x.str.b);
}
I would be very thankful if somebody cleared this up for me, thank you!

To add to #Deepstop answer, and to correct an important point about your understanding -
shouldn't there only be 1 active field in an union at any given time?
There's no such a thing as active field in unions. All the different fields are referring to the same exact piece of memory (except miss-aligned data.
You can look at the different fields as different ways to interpret the same data, i.e. you can read your union as two fields or 8 bits or one field of 16 bits. But both will always "work" at the same time.

OK short is likely a 16 bit integer. Char a, b are each 8 bit chars.
So you are using the same 16 bit memory location for both.
0000 0010 0000 0100 is the 16 bit representation of 516
0000 0010 is the 8 bit representation of 2
0000 0100 is the 8 bit representation of 4
The CPU you are running this on is 'little-endian' so the low-order byte of a 16 bit integer comes first, which is the 2, and the high order byte, the 4, comes second.
So by writing 2 then 4 into consecutive bytes, and reading them back as a 16 bit integer, you get 516, which is 2 * 256 + 4. If you wrote 3 then 5, you'd get 3 * 256 + 5, which is 783.
The point is that union puts two data structures in exactly the same memory location.

how does x.br get the value of 516 after x.str.a and x.str.b get their
values of 4 and 2
Your union definition
typedef union {
short br;
struct {
char a, b;
} str;
} un;
Specifies that un.br share the same memory address as un.str. This is the whole point of a union. This means that when you modify the value of un.br that you are also modifying the values for un.str.a and un.str.b.
I am new to unions, so maybe there is something I am missing, but
shouldn't there only be 1 active field in an union at any given time?
Not sure what you mean by "only be 1 active field", but the members of a union are all mapped to the same memory address so any time that you write a value to a union member it writes that value to the same memory address as the other members. If you want the members to be mapped to different memory addresses so that when you write the value of a member it only modifies the value of that specific member, then you should use a struct and not a union.

Related

Converting types via unions doesn't work when using structs

I want to be able to "concat" bytes together, so that if I have bytes 00101 and 010 the result will be 00101010
For this task I have written the following code:
#include <stdio.h>
typedef struct
{
unsigned char bits0to5 : 6;
unsigned char bits5to11 : 6;
unsigned char bits11to15 : 4;
}foo;
typedef union
{
foo bytes_in_one_form;
long bytes_in_other_form : 16;
}bar;
int main()
{
bar example;
/*just in case the problem is that bytes_in_other_form wasn't initialized */
example.bytes_in_other_form = 0;
/*001000 = 8, 000101 = 5, 1111 = 15*/
example.bytes_in_one_form.bits0to5 = 8;
example.bytes_in_one_form.bits5to11 = 5;
example.bytes_in_one_form.bits11to15 = 15;
/*sould be 0010000001011111 = 8287*/
/*NOTE: maybe the printf is wrong since its only 2 bytes and from type long?*/
printf("%d", example.bytes_in_other_form);
/*but number printed is 1288*/
return 0;
}
What have I done wrong? Unions should have all their member share the same memory, and both the struct and the long take up exactly 2 bytes.
Note:
For solutions that use entirely different algorithm, please note I need to be able to have the zeros at the start (so for example 8 = 001000 and not 1000, and the solution should match any number of bytes at any distribution (although understanding what I did wrong in my current algorithm is better) I should also mention I use ANSI C.
Thanks!
This answer applies to the original question, which had:
typedef struct
{
unsigned char bits0to5 : 6;
unsigned char bits5to11 : 6;
unsigned char bits11to15 : 4;
}foo;
Here's what's happening in your specific example (note that the results may vary from one platform to another):
The bit fields are being packed into char variables. If the next bit field doesn't fit into the current char, it skips to the next one. Additionally, you have little-endian addressing, so the char values appear right-to-left in the aliased long bit field.
So the layout of the structure fields is:
+--------+--------+--------+
|....cccc|..bbbbbb|..aaaaaa|
+--------+--------+--------+
Where aaaaaa is the first field, bbbbbb is the second field, cccc is the third field, and the . values are padding.
When storing your values, you have:
+--------+--------+--------+
|....1111|..000101|..001000|
+--------+--------+--------+
With zeroes in the pad bits, this becomes:
+--------+--------+--------+
|00001111|00000101|00001000|
+--------+--------+--------+
The other value in the union is aliased to the low-order 16 bits, so the value it picks up is:
+--------+--------+
|00000101|00001000|
+--------+--------+
This is 0x0508, which in decimal is 1288 as you saw.
If the structure instead uses unsigned long for the bit field types, then we have:
typedef struct
{
unsigned long bits0to5 : 6;
unsigned long bits5to11 : 6;
unsigned long bits11to15 : 4;
}foo;
In this case, the fields are packed into an unsigned long as follows:
-----+--------+--------+
.....|11110001|01001000|
-----+--------+--------+
The low-order 16 bits are 0xf148, which is 61768 in decimal.

value of members of unions with bit-fields

I'm trying to learn how memory is allocated for unions containing bit-fields.
I've looked at posts and questions similar to this and have understood that padding is involved most of the times depending on the order in which members are declared in structure.
1.
union Example
{ int i:2;
int j:9;
};
int main(void)
{
union Example ex;
ex.j=15;
printf("%d",ex.i);
}
Output: -1
2.
union Example
{ unsigned int i:2;
int j:9;
};
int main(void)
{
union Example ex;
ex.j=15;
printf("%d",ex.i);
}
Output: 3
Can someone please explain the output? Is it just the standard output for such cases?
I use the built-in gcc compiler in ubuntu
The way your compiler allocates bits for bit-fields in a union happens to overlap the low bits of j with the bits of i. Thus, when j is set to 15, the two bits of i are each set to 1.
When i is a two-bit signed integer, declared with int i:2, your compiler interprets it as a two-bit two’s complement number. With this interpretation, the bits 11 represent −1.
When i is a two-bit unsigned integer, declared with unsigned int i:2, it is pure binary, with no sign bit. With this interpretation, the bits 11 represent 3.
The program below shows that setting a signed two-bit integer to −1 or an unsigned two-bit integer to 3 produce the same bit pattern in the union, at least in your C implementation.
#include <stdio.h>
int main(void)
{
union Example
{
unsigned u;
int i : 2;
int j : 9;
unsigned int k : 2;
} ex;
ex.u = 0;
ex.i = -1;
printf("ex.i = -1: %#x.\n", ex.u);
ex.u = 0;
ex.j = 15;
printf("ex.j = 15: %#x.\n", ex.u);
ex.u = 0;
ex.k = 3;
printf("ex.k = 3: %#x.\n", ex.u);
}
Output:
ex.i = -1: 0x3.
ex.j = 15: 0xf.
ex.k = 3: 0x3.
Note that a compiler could also allocate the bits high-to-low instead of low-to-high, in which case the high bits of j would overlap with i, instead of the low bits. Also, a compiler could use different sizes of storage units for the nine-bit j than it does for the two-bit i, in which case their bits might not overlap at all. (For example, it might use a single eight-bit byte for i but use a 32-bit word for j. The eight-bit byte would overlap some part of the 32-bit word, but not necessarily the part used for j.)
Example1
#include <stdio.h>
union Example
{ int i:2;
int j:9;
};
int main(void)
{
union Example ex;
ex.j=15;
printf("%d",ex.i);
printf("\nSize of Union : %lu\n" , sizeof(ex));
return 0;
}
Output:
-1
Size of Union : 4
Observations:
From this ouput, We can deduce that even if we use 9 bits( max ), Compiler allocates 4 byte(because of padding).
Statement union Example ex; allocates 4 bytes of memory for Union Example object ex.
Statement ex.j=15; asigns 0x0000000F to that 4 Byte of memory. So for j it allocates 0-0-0-0-0-1-1-1-1 for 9 bits.
Statement printf("%d",ex.i); tries to print ex.i( which is 2 bits). We have 1-1 from the previous statement.
Here comes the interesting part, ex.i is of type signed int. Hence first bit is used to assign signed representation and most cpu does signed representation in 2's complement form.
So if we does reverse 2's complement of 1-1 we will get value 1. So the ouput we get is -1.
Example2
#include <stdio.h>
union Example
{ unsigned int i:2;
int j:9;
};
int main(void)
{
union Example ex;
ex.j=15;
printf("%d",ex.i);
return 0;
}
Output:
3
Observations:
Statement union Example ex; allocates 4 bytes of memory for Union Example object ex.
Statement ex.j=15; asigns 0x0000000F to that 4 Byte of memory. So for j it allocates 0-0-0-0-0-1-1-1-1 for 9 bits.
statement printf("%d",ex.i); tries to print ex.i( which is 2 bits). We have 1-1 from the previous statement.
Same as Above example but here ex.i is of type unsigned int. Hence no signed bit is used here for representation.
So whatever is stored in the 2 bit location is the exact value. Hence the ouput is 3.
Hope i cleared your doubt. Please check for 2's and 1's complement in the internet.

different between C struct bitfields on char and on int

When using bitfields in C, I found out differences I did not expect related to the actual type that is used to declare the fields.
I didn't find any clear explanation. Now, the problem is identified, so if though there is no clear response, this post may be useful to anyone facing the same issue.
Still if some can point to a formal explanation, this coudl be great.
The following structure, takes 2 bytes in memory.
struct {
char field0 : 1; // 1 bit - bit 0
char field1 : 2; // 2 bits - bits 2 down to 1
char field2 ; // 8 bits - bits 15 down to 8
} reg0;
This one takes 4 bytes in memory, the question is why ?
struct {
int field0 : 1; // 1 bit - bit 0
int field1 : 2; // 2 bits - bits 2 down to 1
char field2 ; // 8 bits - bits 15 down to 8
} reg1;
In both cases, the bits are organized in memory in the same way: field 2 is always taking bits 15 down to 8.
I tried to find some literarure on the subject, but still can't get a clear explanation.
The two most appropriate links I can found are:
http://www.catb.org/esr/structure-packing/
http://www.msg.ucsf.edu/local/programs/IBM_Compilers/C:C++/html/language/ref/clrc03defbitf.htm
However, none really explains really why the second structure is taking 4 bytes. Actually reading carefully, I would even expect the structure to take 2 bytes.
In both cases,
field0 takes 1 bit
field1 takes 2 bits
field2 takes 8 bits, and is aligned on the first available byte address
Hence, the useful data requires 2 bytes in both cases.
So what is behind the scene that makes reg1 to take 4 bytes ?
Full Code Example:
#include "stdio.h"
// Register Structure using char
typedef struct {
// Reg0
struct _reg0_bitfieldsA {
char field0 : 1;
char field1 : 2;
char field2 ;
} reg0;
// Nextreg
char NextReg;
} regfileA_t;
// Register Structure using int
typedef struct {
// Reg1
struct _reg1_bitfieldsB {
int field0 : 1;
int field1 : 2;
char field2 ;
} reg1;
// Reg
char NextReg;
} regfileB_t;
regfileA_t regsA;
regfileB_t regsB;
int main(int argc, char const *argv[])
{
int* ptrA, *ptrB;
printf("sizeof(regsA) == %-0d\n",sizeof(regsA)); // prints 3 - as expected
printf("sizeof(regsB) == %-0d\n",sizeof(regsB)); // prints 8 - why ?
printf("\n");
printf("sizeof(regsA.reg0) == %-0d\n",sizeof(regsA.reg0)); // prints 2 - as epxected
printf("sizeof(regsB.reg0) == %-0d\n",sizeof(regsB.reg1)); // prints 4 - int bit fields tells the struct to use 4 bytes then.
printf("\n");
printf("addrof(regsA.reg0) == 0x%08x\n",(int)(&regsA.reg0)); // 0x0804A028
printf("addrof(regsA.reg1) == 0x%08x\n",(int)(&regsA.NextReg)); // 0x0804A02A = prev + 2
printf("addrof(regsB.reg0) == 0x%08x\n",(int)(&regsB.reg1)); // 0x0804A020
printf("addrof(regsB.reg1) == 0x%08x\n",(int)(&regsB.NextReg)); // 0x0804A024 = prev + 4 - my register is not at the righ place then.
printf("\n");
regsA.reg0.field0 = 1;
regsA.reg0.field1 = 3;
regsA.reg0.field2 = 0xAB;
regsB.reg1.field0 = 1;
regsB.reg1.field1 = 3;
regsB.reg1.field2 = 0xAB;
ptrA = (int*)&regsA;
ptrB = (int*)&regsB;
printf("regsA.reg0.value == 0x%08x\n",(int)(*ptrA)); // 0x0000AB07 (expected)
printf("regsB.reg0.value == 0x%08x\n",(int)(*ptrB)); // 0x0000AB07 (expected)
return 0;
}
When I first write the struct I expected to get reg1 to take only 2 bytes, hence the next register was at the offset = 2.
The relevant part of the standard is C11/C17 6.7.2.1p11:
An implementation may allocate any addressable storage unit large enough to hold a bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined. The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified.
that, in connection with C11/C17 6.7.2.1p5
A. bit-field shall have a type that is a qualified or unqualified version of _Bool, signed int, unsigned int, or some other implementation-defined type. It is implementation-defined whether atomic types are permitted.
and that you're using char means that there is nothing to discuss in general - for a specific implementation check the compiler manuals. Here's the one for GCC.
From the 2 excerpts it follows that an implementation is free to use absolutely whatever types it wants to to implement the bitfields - it could even use int64_t for both of these cases having the structure of size 16 bytes. The only thing a conforming implementation must do is to pack the bits within the chosen addressable storage unit if enough space remains.
For GCC on System-V ABI on 386-compatible (32-bit processors), the following stands:
Plain bit-fields (that is, those neither signed nor unsigned) always have non- negative values. Although they may have type char, short, int, long, (which can have negative values),
these bit-fields have the same range as a bit-field of the same size
with the corresponding unsigned type. Bit-fields obey the same
size and alignment rules as other structure and union members, with
the following additions:
Bit-fields are allocated from right to left (least to most significant).
A bit-field must entirely reside in a storage unit appropriate for its declared type. Thus a bit-field never crosses its unit boundary.
Bit-fields may share a storage unit with other struct/union members, including members that are not bit-fields. Of course,
struct members occupy different parts of the storage unit.
Unnamed bit-fields' types do not affect the alignment of a structure or union, although individual bit-fields' member offsets obey the
alignment constraints.
i.e. in System-V ABI, 386, int f: 1 says that the bit-field f must be within an int. If entire bytes of space remains, a following char within the same struct will be packed inside this int, even if it is not a bit-field.
Using this knowledge, the layout for
struct {
int a : 1; // 1 bit - bit 0
int b : 2; // 2 bits - bits 2 down to 1
char c ; // 8 bits - bits 15 down to 8
} reg1;
will be
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
|a b b x x x x x|c c c c c c c c|x x x x x x x x|x x x x x x x x|
<------------------------------ int ---------------------------->
and the layout for
struct {
char a : 1; // 1 bit - bit 0
char b : 2; // 2 bits - bits 2 down to 1
char c ; // 8 bits - bits 15 down to 8
} reg1;
will be
1
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
|a b b x x x x x|c c c c c c c c|
<---- char ----><---- char ---->
So there are tricky edge cases. Compare the 2 definitions here:
struct x {
short a : 2;
short b : 15;
char c ;
};
struct y {
int a : 2;
int b : 15;
char c ;
};
Because the bit-field must not cross the unit boundary, the struct x members a and b need to go to different shorts. Then there is not enough space to accommodate the char c, so it must come after that. And the entire struct must be suitably aligned for short so it will be 6 bytes on i386. The latter however, will pack a and b in the 17 lowest bits of the int, and since there is still one entire addressable byte left within the int, the c will be packed here too, and hence sizeof (struct y) will be 4.
Finally, you must really specify whether the int or char is signed or not - the default might be not what you expect! Standard leaves it up to the implementation, and GCC has a compile-time switch to change them.

unable to understand the output of union program in C

I know the basic properties of union in C but still couldn't understand the output, can somebody explain this?
#include <stdio.h>
int main()
{
union uni_t{
int i;
char ch[2];
};
union uni_t z ={512};
printf("%d%d",z.ch[0],z.ch[1]);
return 0;
}
The output when running this program is
02
union a
{
int i;
char ch[2];
}
This declares a type union a, the contents of which (i.e. the memory area of a variable of this type) could be accessed as either an integer (a.i) or a 2-element char array (a.ch).
union a z ={512};
This defines a variable z of type union a and initializes its first member (which happens to be a.i of type int) to the value of 512. (Cantfindname has the binary representation of that.)
printf( "%d%d", z.ch[0], z.ch[1] );
This takes the first character, then the second character from a.ch, and prints their numerical value. Again, Cantfindname talks about endianess and how it affects the results. Basically, you are taking apart an int byte-by-byte.
And the whole shebang is apparently assuming that sizeof( int ) == 2, which hasn't been true for desktop computers for... quite some time, so you might want to be looking at a more up-to-date tutorial. ;-)
What you get here is the result of endianess (http://en.wikipedia.org/wiki/Endianness).
512 is 0b0000 0010 0000 0000 in binary, which in little endian is stored in the memory as 0000 0000 0000 0010. Then ch[0] reads the last 8 bits (0b0000 0010 = 2 in decimal) and ch[1] reads the first 8 bits (0b0000 0000 = 0 in decimal).
Using int will not lead to this output in 32 bit machines as sizeof(int) = 4. This output will occur only if we use a 16 bit system or we use short int having memory size of 2 bytes.
A Union is a variable that may hold (at different times) objects of different types and sizes, with the compiler keeping track of size and alignment requirements.
union uni_t
{
short int i;
char ch[2];
};
This code snippet declares a union having two members- a integer and a character array.
The union can be used to hold different values at different times by simply allocating the values.
union uni_t z ={512};
This defines a variable z of type union uni_t and initializes the integer member ( i ) to the value of 512.
So the value stored in z becomes : 0b0000 0010 0000 0000
When this value is referenced using character array then ch[1] refers to first byte of data and ch[0] refers to second byte.
ch[1] = 0b00000010 = 2
ch[0] = ob00000000 = 0
So printf("%d%d",z.ch[0],z.ch[1]) results to
02

Memory layout of struct having bitfields

I have this C struct: (representing an IP datagram)
struct ip_dgram
{
unsigned int ver : 4;
unsigned int hlen : 4;
unsigned int stype : 8;
unsigned int tlen : 16;
unsigned int fid : 16;
unsigned int flags : 3;
unsigned int foff : 13;
unsigned int ttl : 8;
unsigned int pcol : 8;
unsigned int chksm : 16;
unsigned int src : 32;
unsigned int des : 32;
unsigned char opt[40];
};
I'm assigning values to it, and then printing its memory layout in 16-bit words like this:
//prints 16 bits at a time
void print_dgram(struct ip_dgram dgram)
{
unsigned short int* ptr = (unsigned short int*)&dgram;
int i,j;
//print only 10 words
for(i=0 ; i<10 ; i++)
{
for(j=15 ; j>=0 ; j--)
{
if( (*ptr) & (1<<j) ) printf("1");
else printf("0");
if(j%8==0)printf(" ");
}
ptr++;
printf("\n");
}
}
int main()
{
struct ip_dgram dgram;
dgram.ver = 4;
dgram.hlen = 5;
dgram.stype = 0;
dgram.tlen = 28;
dgram.fid = 1;
dgram.flags = 0;
dgram.foff = 0;
dgram.ttl = 4;
dgram.pcol = 17;
dgram.chksm = 0;
dgram.src = (unsigned int)htonl(inet_addr("10.12.14.5"));
dgram.des = (unsigned int)htonl(inet_addr("12.6.7.9"));
print_dgram(dgram);
return 0;
}
I get this output:
00000000 01010100
00000000 00011100
00000000 00000001
00000000 00000000
00010001 00000100
00000000 00000000
00001110 00000101
00001010 00001100
00000111 00001001
00001100 00000110
But I expect this:
The output is partially correct; somewhere, the bytes and nibbles seem to be interchanged. Is there some endianness issue here? Are bit-fields not good for this purpose? I really don't know. Any help? Thanks in advance!
No, bitfields are not good for this purpose. The layout is compiler-dependant.
It's generally not a good idea to use bitfields for data where you want to control the resulting layout, unless you have (compiler-specific) means, such as #pragmas, to do so.
The best way is probably to implement this without bitfields, i.e. by doing the needed bitwise operations yourself. This is annoying, but way easier than somehow digging up a way to fix this. Also, it's platform-independent.
Define the header as just an array of 16-bit words, and then you can compute the checksum easily enough.
The C11 standard says:
An implementation may allocate any addressable storage unit large
enough to hold a bitfield. If enough space remains, a bit-field that
immediately follows another bit-field in a structure shall be packed
into adjacent bits of the same unit. If insufficient space remains,
whether a bit-field that does not fit is put into the next unit or
overlaps adjacent units is implementation-defined. The order of
allocation of bit-fields within a unit (high-order to low-order or
low-order to high-order) is implementation-defined.
I'm pretty sure this is undesirable, as it means there might be padding between your fields, and that you can't control the order of your fields. Not just that, but you're at the whim of the implementation in terms of network byte order. Additionally, imagine if an unsigned int is only 16 bits, and you're asking to fit a 32-bit bitfield into it:
The expression that specifies the width of a bit-field shall be an
integer constant expression with a nonnegative value that does not
exceed the width of an object of the type that would be specified were
the colon and expression omitted.
I suggest using an array of unsigned chars instead of a struct. This way you're guaranteed control over padding and network byte order. Start off with the size in bits that you want your structure to be, in total. I'll assume you're declaring this in a constant such as IP_PACKET_BITCOUNT: typedef unsigned char ip_packet[(IP_PACKET_BITCOUNT / CHAR_BIT) + (IP_PACKET_BITCOUNT % CHAR_BIT > 0)];
Write a function, void set_bits(ip_packet p, size_t bitfield_offset, size_t bitfield_width, unsigned char *value) { ... } which allows you to set the bits starting at p[bitfield_offset / CHAR_BIT] bit bitfield_offset % CHARBIT to the bits found in value, up to bitfield_width bits in length. This will be the most complicated part of your task.
Then you could define identifiers for VER_OFFSET 0 and VER_WIDTH 4, HLEN_OFFSET 4 and HLEN_WIDTH 4, etc to make modification of the array seem less painless.
Although question was asked long time back, there's no answer with explaination of your result. I'll answer it, hopefully it'll be useful to someone.
I'll illustrate the bug using first 16 bits of your data structure.
Please Note: This explaination is guarranteed to be true only with the set of your processor and compiler. If any of these changes, behaviour may change.
Fields:
unsigned int ver : 4;
unsigned int hlen : 4;
unsigned int stype : 8;
Assigned to:
dgram.ver = 4;
dgram.hlen = 5;
dgram.stype = 0;
Compiler starts assigning bit fields starting with offset 0. This means first byte of your data structure is stored in memory as:
Bit offset: 7 4 0
-------------
| 5 | 4 |
-------------
First 16 bits after assignment look like this:
Bit offset: 15 12 8 4 0
-------------------------
| 5 | 4 | 0 | 0 |
-------------------------
Memory Address: 100 101
You are using Unsigned 16 pointer to dereference memory address 100. As a result address 100 is treated as LSB of a 16 bit number. And 101 is treated as MSB of a 16 bit number.
If you print *ptr in hex you'll see this:
*ptr = 0x0054
Your loop is running on this 16 bit value and hence you get:
00000000 0101 0100
-------- ---- ----
0 5 4
Solution:
Change order of elements to
unsigned int hlen : 4;
unsigned int ver : 4;
unsigned int stype : 8;
And use unsigned char * pointer to traverse and print values.
It should work.
Please note, as others've said, this behavior is platform and compiler specific. If any of these changes, you need to verify that memory layout of your data structure is correct.
For Chinese users, I think you can refer blog for more details, really good.
In summary, due to endianness, there is byte order as well as bit order. Bit order is the order how each bit of one byte saved in memory. Bit order has same rule with byte order in sense of endianness issue.
For your picture, it's designed in network order which is big endian. So your struct defination is actually for big endian. Per your output, your PC is little endian, so you need change struct field orders when use.
The way to show each bits is incorrect since when get by char, the bit order has changed from machine order (little endian in your case) to normal order which we human use. You may change it as following per refered blog.
void
dump_native_bits_storage_layout(unsigned char *p, int bytes_num)
{
union flag_t {
unsigned char c;
struct base_flag_t {
unsigned int p7:1,
p6:1,
p5:1,
p4:1,
p3:1,
p2:1,
p1:1,
p0:1;
} base;
} f;
for (int i = 0; i < bytes_num; i++) {
f.c = *(p + i);
printf("%d%d%d%d %d%d%d%d ",
f.base.p7,
f.base.p6,
f.base.p5,
f.base.p4,
f.base.p3,
f.base.p2,
f.base.p1,
f.base.p0);
}
printf("\n");
}
//prints 16 bits at a time
void print_dgram(struct ip_dgram dgram)
{
unsigned char* ptr = (unsigned short int*)&dgram;
int i,j;
//print only 10 words
for(i=0 ; i<10 ; i++)
{
dump_native_bits_storage_layout(ptr, 1);
/* for(j=7 ; j>=0 ; j--)
{
if( (*ptr) & (1<<j) ) printf("1");
else printf("0");
if(j%8==0)printf(" ");
}*/
ptr++;
//printf("\n");
}
}
#unwind
A typical use case of Bit Fields is interpreting/emulation of byte code or CPU instructions with given layout. "Don't use it, because you cannot control it" is the answer for children.
#Bruce
For Intel/GCC I see a packed LITTLE ENDIAN bit layout, i.e. in struct ip_dgram field ver is represented by bits 0..3, field hlen is represented by bits 4..7 ...
For correctness of operation it is required to verify the memory layout against your design at runtime.
struct ModelIndicator
{
int a:4;
int b:4;
int c:4;
};
union UModelIndicator
{
ModelIndicator i;
int v;
};
// test packed little endian
static bool verifyLayoutModel()
{
UModelIndicator um;
um.v = 0;
um.i.a = 2; // 0..3
um.i.b = 3; // 4..7
um.i.c = 9; // 8..11
return um.v = (9 << 8) + (3 << 4) + 2;
}
int main()
{
if (!verifyLayoutModel())
{
std::cerr << "Invalid memory layout" << std::endl;
return -1;
}
// ...
}
At the earliest, when above test fails, you need to consider compiler pragmas or adjust your structures accordingly, resp. verifyLayoutModel().
I agree with what unwind said. Bit fields are compiler dependent.
If you need the bits to be in a specific order, pack the data into a pointer to a character array. Increment the buffer the size of the element being packed. Pack the next element.
pack( char** buffer )
{
if ( buffer & *buffer )
{
//pack ver
//assign first 4 bits to 4.
*((UInt4*) *buffer ) = 4;
*buffer += sizeof(UInt4);
//assign next 4 bits to 5
*((UInt4*) *buffer ) = 5;
*buffer += sizeof(UInt4);
... continue packing
}
}
Compiler dependant or not, It depends whether you want to write a very fast program or if you want one that works with different compilers. To write for C a fast, compact application, use a stuct with bit fields/. If you want a slow general purpose program , long code it.

Resources