Overlapped bit-field in C - c

I'm a computer science student.
Now, I'm working on a computer architecture project in C, which simulates a processor.
There are many types of instructions such as
 31    27 26    22 21    17 16                               0
---------------------------------------------------------------
|   op   |   ra   |   rb   |              imm17               |
---------------------------------------------------------------
 31    27 26    22 21    17 16             7 6   5 4         0
---------------------------------------------------------------
|   op   |   ra   |   rb   |     imm10      |  m  |   shamt   |
---------------------------------------------------------------
 31    27 26    22 21                                        0
---------------------------------------------------------------
|   op   |   ra   |                   imm22                   |
---------------------------------------------------------------
So, I wanted to define a C structure containing bit-fields corresponding to each element, such as op, ra, and so on.
At first, I thought that I could use unions and nested structs.
For example, I wrote code like:
struct instr_t {
    union {
        uint32_t imm22 : 22;
        struct {
            union {
                uint32_t imm17 : 17;
                struct {
                    uint8_t shamt : 5;
                    uint8_t mode : 2;
                    uint16_t imm10 : 10;
                };
            };
            uint8_t rb : 5;
        };
    };
    uint8_t ra : 5;
    uint8_t op : 5;
};
I expected sizeof(struct instr_t) to be 4, but it was actually 12.
Maybe the nested structs introduced padding.
So, here is my question:
How can one achieve overlapped C bit-fields?
or
Can anybody recommend a better way to implement multiple instruction types in C?
Thank you!

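Bit-field members must be stored in the same storage unit to be laid out contiguously, so give every field the same base type and overlay one complete struct per instruction format in a union:
#include <assert.h>
#include <stdint.h>
struct instr_1_t {
    uint32_t imm17 : 17;
    uint32_t rb : 5;
    uint32_t ra : 5;
    uint32_t op : 5;
};
struct instr_2_t {
    uint32_t shamt : 5;
    uint32_t m : 2;
    uint32_t imm10 : 10;
    uint32_t rb : 5;
    uint32_t ra : 5;
    uint32_t op : 5;
};
struct instr_3_t {
    uint32_t imm22 : 22;
    uint32_t ra : 5;
    uint32_t op : 5;
};
union instr_t {
    struct {
        uint32_t pad : 27; /* everything below op, which sits in the top 5 bits in all formats */
        uint32_t op : 5;
    };
    struct instr_1_t instr_1;
    struct instr_2_t instr_2;
    struct instr_3_t instr_3;
};
static_assert(sizeof(union instr_t) == sizeof(uint32_t), "sizeof(instr_t) != sizeof(uint32_t)");
void handle_instr(union instr_t i) {
    switch (i.op) {
    /* dispatch on the opcode, then read i.instr_1, i.instr_2 or i.instr_3 */
    default:
        break;
    }
}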

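Maxim gave the correct answer.
I also suggest looking over this code to understand why sizeof(struct instr_t) was giving 12 :) Each nested struct or union is a complete object with its own size and alignment, so the fields that follow it cannot share its storage unit.
#include <stdio.h>
#include <stdint.h>
typedef struct s1{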
uint8_t shamt: 5;
uint8_t mode : 2;
uint16_t imm10 : 10;
} s_1;
typedef union u1{
uint32_t imm17: 17;
s_1 member0;
} u_1;
typedef struct s2{
u_1 member1;
uint8_t rb : 5;
} s_2;
typedef union u2{
uint32_t imm22 : 22;
s_2 member3;
} u_2;
typedef struct instr_t {
u_2 member4;
uint8_t ra : 5;
uint8_t op : 5;
} s_instr;
int main(void)
{
    /* sizeof yields a size_t, which must be printed with %zu */
    printf("sizes s_1=%zu, u_1=%zu, s_2=%zu, u_2=%zu, s_instr=%zu\n", sizeof(s_1), sizeof(u_1), sizeof(s_2), sizeof(u_2), sizeof(s_instr));
    printf("uint8_t=%zu, uint16_t=%zu, uint32_t=%zu\n", sizeof(uint8_t), sizeof(uint16_t), sizeof(uint32_t));
    printf("Sizeof instr_t is %zu\n", sizeof(s_instr));
    return 0;
}
Hope this helps!
Cheers!

Bit-fields are not portable: the same bit-field definition is not guaranteed to produce the same layout on two different compilers. Bit-fields also have, let's say, interesting semantics in multi-threaded programs.
Go with C++ and write a class with appropriate inline accessors. I mean, you are a computer science student, you know C++, right?
If for some mad reason your superiors demand that the code be written in C, write a struct with one uint32_t member and individual accessor functions using shift and mask operations. Obviously also inline.
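A minimal sketch of that accessor approach, applied to the three instruction formats from the question. The names instr_word, instr_bits, instr_op and the rest are hypothetical, and bit 0 is assumed to be the least significant bit of the 32-bit word.

```c
#include <stdint.h>

/* Hypothetical wrapper: one 32-bit word holds a whole encoded instruction. */
typedef struct { uint32_t raw; } instr_word;

/* Extract `width` bits starting at bit `lsb` (0 = least significant).
   Valid for widths 1..31, which covers every field in the question. */
static inline uint32_t instr_bits(instr_word w, unsigned lsb, unsigned width)
{
    return (w.raw >> lsb) & ((UINT32_C(1) << width) - 1u);
}

/* Fields shared by the formats, at the positions given in the diagrams. */
static inline uint32_t instr_op(instr_word w)    { return instr_bits(w, 27, 5); }
static inline uint32_t instr_ra(instr_word w)    { return instr_bits(w, 22, 5); }
static inline uint32_t instr_rb(instr_word w)    { return instr_bits(w, 17, 5); }

/* Overlapping low fields: which one is meaningful depends on the opcode. */
static inline uint32_t instr_imm22(instr_word w) { return instr_bits(w, 0, 22); }
static inline uint32_t instr_imm17(instr_word w) { return instr_bits(w, 0, 17); }
static inline uint32_t instr_imm10(instr_word w) { return instr_bits(w, 7, 10); }
static inline uint32_t instr_m(instr_word w)     { return instr_bits(w, 5, 2); }
static inline uint32_t instr_shamt(instr_word w) { return instr_bits(w, 0, 5); }

/* Encode the first format (op | ra | rb | imm17), masking each argument. */
static inline instr_word instr_make_rri(uint32_t op, uint32_t ra,
                                        uint32_t rb, uint32_t imm17)
{
    instr_word w = { ((op & 0x1Fu) << 27) | ((ra & 0x1Fu) << 22) |
                     ((rb & 0x1Fu) << 17) | (imm17 & 0x1FFFFu) };
    return w;
}
```

Because the layout is spelled out in shifts and masks, it does not depend on how a particular compiler allocates bit-fields, and the overlapping views (imm22 versus rb plus imm17) need no union.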

Related

conversion from 'long long unsigned int' to 'long long unsigned int:40' changes value from '0xFFFFFFFFFFFFFFFF' to '0xFFFFFFFFFF' [-Werror=overflow]

I have this example code that throws an error when I try to fix one of the GCC warnings
#include <stdint.h>
//
typedef union someStruct
{
uint64_t All;
struct
{
uint64_t Foo : 40;
uint64_t Bar : 24;
} Field;
} someStruct;
#define bits_64 ((uint64_t)(-1))
//
typedef union bits
{
uint64_t oneBit: 1;
uint64_t twoBits: 2;
uint64_t threeBits: 3;
uint64_t fourBits: 4;
uint64_t fiveBits: 5;
uint64_t sixBits: 6;
uint64_t sevenBits: 7;
uint64_t fourtyBits: 40;
uint64_t All;
} bits;
#define bits_40 (((bits)(-1)).fourtyBits)
//
int main()
{
someStruct x;
someStruct y;
x.Field.Foo = bits_64; //-Woverflow warning
//trying to fix the warning with using the bits union
y.Field.Foo = bits_40; // but this throws the error msg below
/*
<source>:30:19: error: cast to union type from type not present in union
30 | #define bits_40 (((bits)(-1)).fourtyBits)
| ^
*/
return 0;
}
How can I use a union to define any number of bits and assign it to any struct field?
P.S. I cannot use enums and/or define a union variable; I have to use macros this way to fit the codebase.
Your #define for bits_40 should use a compound literal with a designated initializer:
#define bits_40 (((bits){.All = -1}).fourtyBits)
You could also just do:
#define bits_40 ((1ULL << 40) - 1)
and skip the bits union entirely. Or you could define a BIT_MASK macro as follows (valid for widths below 64, since shifting a 64-bit value by 64 is undefined):
#define BIT_MASK(bits) ((1ULL << (bits)) - 1)
:
:
x.Field.Foo = BIT_MASK(40);
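As a small self-contained sketch of the macro approach, the following reuses the someStruct union from the question; the helper make_full_foo is a hypothetical name added only for illustration. It assumes the compiler accepts uint64_t as a bit-field base type, as GCC does.

```c
#include <stdint.h>

/* Mask with the low `bits` bits set; valid for 1 <= bits <= 63. */
#define BIT_MASK(bits) ((UINT64_C(1) << (bits)) - 1)

typedef union someStruct
{
    uint64_t All;
    struct
    {
        uint64_t Foo : 40;
        uint64_t Bar : 24;
    } Field;
} someStruct;

/* Hypothetical helper: returns a value whose Foo field is all ones
   and whose Bar field is zero. */
static inline someStruct make_full_foo(void)
{
    someStruct x = { .All = 0 };
    x.Field.Foo = BIT_MASK(40);  /* the constant fits in 40 bits, so nothing is truncated */
    return x;
}
```

Since the assigned constant already fits in the field, the conversion does not change its value, which is the condition -Woverflow complains about.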

Structure with bit-fields size

I tried to use a structure with bit-fields of different sizes. The total number of bits used is 64. However, when I check the structure size, I get 11 instead of the expected 8. By decomposing the structure, I saw that the difference comes from the day field. If I pack the bits into groups of 8, the day field falls between the "end" of month and the "start" of hour. I don't know if this is a good approach. Can someone explain this to me?
#include <stdio.h>
typedef unsigned char uint8_t;
typedef struct frameHeader_t
{
uint8_t encryption : 2;
uint8_t frameVersion : 2;
uint8_t probeType : 4;
uint8_t dataType : 5;
uint8_t measurePeriod : 3;
uint8_t remontePerdiod : 4;
uint8_t nbrMeasure : 2;
uint8_t year : 7;
uint8_t month : 4;
uint8_t day : 5;
uint8_t hour : 5;
uint8_t minute : 6;
uint8_t second : 6;
uint8_t randomization : 5;
uint8_t status : 4;
}FrameHeader;
int main()
{
FrameHeader my_frameHeader;
printf("%zu\n", sizeof(FrameHeader));
return 0;
}
If you run it through the pahole tool, you should get an explanation:
struct frameHeader_t {
uint8_t encryption:2; /* 0: 6 1 */
uint8_t frameVersion:2; /* 0: 4 1 */
uint8_t probeType:4; /* 0: 0 1 */
uint8_t dataType:5; /* 1: 3 1 */
uint8_t measurePeriod:3; /* 1: 0 1 */
uint8_t remontePerdiod:4; /* 2: 4 1 */
uint8_t nbrMeasure:2; /* 2: 2 1 */
/* XXX 2 bits hole, try to pack */
uint8_t year:7; /* 3: 1 1 */
/* XXX 1 bit hole, try to pack */
uint8_t month:4; /* 4: 4 1 */
/* XXX 4 bits hole, try to pack */
uint8_t day:5; /* 5: 3 1 */
/* XXX 3 bits hole, try to pack */
uint8_t hour:5; /* 6: 3 1 */
/* XXX 3 bits hole, try to pack */
uint8_t minute:6; /* 7: 2 1 */
/* XXX 2 bits hole, try to pack */
uint8_t second:6; /* 8: 2 1 */
/* XXX 2 bits hole, try to pack */
uint8_t randomization:5; /* 9: 3 1 */
/* XXX 3 bits hole, try to pack */
uint8_t status:4; /* 10: 4 1 */
/* size: 11, cachelines: 1, members: 15 */
/* bit holes: 8, sum bit holes: 20 bits */
/* bit_padding: 4 bits */
/* last cacheline: 11 bytes */
};
You're using uint8_t as the base type so the fields are getting padded to groups of 8 bits.
You should be able to eliminate the padding completely, and somewhat more portably than with __attribute__((packed)), by using unsigned long long or uint_least64_t (at least 64 bits wide) as the base type of the bit-fields. Technically, base types other than int and unsigned int aren't guaranteed to be supported for bit-fields, so you could instead use unsigned (guaranteed by the C standard to be at least 16 bits wide) after reorganizing the bit-fields a little, for example into:
typedef struct frameHeader_t
{
//16
unsigned year : 7;
unsigned randomization : 5;
unsigned month : 4;
//16
unsigned second : 6;
unsigned minute : 6;
unsigned status : 4;
//16
unsigned hour : 5;
unsigned dataType : 5;
unsigned probeType : 4;
unsigned encryption : 2;
//16
unsigned day : 5;
unsigned remontePerdiod : 4;
unsigned measurePeriod : 3;
unsigned nbrMeasure : 2;
unsigned frameVersion : 2;
}FrameHeader;
//should be an unpadded 8 bytes as long as `unsigned` is 16,
//32, or 64 bits wide (I don't know of a platform where it isn't)
(The padding or lack thereof isn't guaranteed, but I've never seen an implementation insert it unless it was necessary.)
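The effect of the base type can be seen in isolation with a minimal sketch. DateNarrow and DateWide are hypothetical names; both hold the same three fields from the question and differ only in the declared type. The layout is implementation-defined, so the sizes in the comments are what GCC and Clang are expected to produce on x86-64, not a guarantee.

```c
#include <stdint.h>

/* 8-bit base type: a field may not straddle a byte boundary,
   so each field ends up in a byte of its own. */
typedef struct {
    uint8_t month : 4;
    uint8_t day   : 5;   /* does not fit in the 4 bits left in byte 0 */
    uint8_t hour  : 5;   /* does not fit in the 3 bits left in byte 1 */
} DateNarrow;            /* expected size: 3 bytes */

/* 16-bit base type: 4 + 5 + 5 = 14 bits fit in one storage unit. */
typedef struct {
    uint16_t month : 4;
    uint16_t day   : 5;
    uint16_t hour  : 5;
} DateWide;              /* expected size: 2 bytes */
```

Printing sizeof(DateNarrow) and sizeof(DateWide) with %zu is a quick way to check what a given compiler does.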
To avoid the packing magic, I usually give every bit-field the same fixed-size unsigned base type, one wide enough to hold all the fields:
typedef struct frameHeader_t
{
uint64_t encryption : 2;
uint64_t frameVersion : 2;
uint64_t probeType : 4;
uint64_t dataType : 5;
uint64_t measurePeriod : 3;
uint64_t remontePerdiod : 4;
uint64_t nbrMeasure : 2;
uint64_t year : 7;
uint64_t month : 4;
uint64_t day : 5;
uint64_t hour : 5;
uint64_t minute : 6;
uint64_t second : 6;
uint64_t randomization : 5;
uint64_t status : 4;
}FrameHeader;
https://godbolt.org/z/BX2QsC

Print full uint32_t in hex from struct bitfield

I have a structure like below:
struct myCoolStuff{
uint32_t stuff1 : 4;
uint32_t stuff2 : 4;
uint32_t stuff3 : 24;
uint32_t differentField;
};
How can I combine these fields into a hex format for printing to the screen or writing out to a file? Thank you.
struct myCoolStuff data = {.stuff1=0xFF, .stuff2=0x66, .stuff3=0x112233, .differentField=99};
printf("my combined stuff is: %x\n", <combined stuff>);
printf("My full field is: %x\n", data.differentField);
Expected Output:
my combined stuff is: 0xFF66112233
My different field is: 99
First, you can't get 0xFF out of 0xFF after you put it in a 4-bit variable. 0xFF takes 8 bits. Same for 0x66.
As for reinterpreting the bit-fields as a single integer, you could,
in a very nonportable fashion (there are big-endian/little-endian issues and the possibility of padding bits), use a union.
( This:
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
struct myCoolStuff{
union{
struct {
uint32_t stuff1 : 4;
uint32_t stuff2 : 4;
uint32_t stuff3 : 24;
};
uint32_t fullField;
};
};
struct myCoolStuff data = {.stuff1=0xFF, .stuff2=0x66, .stuff3=0x112233};
int main()
{
printf("My full field is: %" PRIX32 "\n", data.fullField);
}
prints 1122336F on my x86_64. )
To do it portably you can simply take the bitfields and put them together manually:
This:
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
struct myCoolStuff{
uint32_t stuff1 : 4;
uint32_t stuff2 : 4;
uint32_t stuff3 : 24;
};
struct myCoolStuff data = {.stuff1=0xFF, .stuff2=0x66, .stuff3=0x112233};
int main()
{
uint32_t fullfield = (uint32_t)data.stuff1 << 28 | (uint32_t)data.stuff2 << 24 | data.stuff3;
printf("My full field is: %" PRIX32 "\n", fullfield);
}
should print F6112233 anywhere it compiles (uint32_t isn't guaranteed to exist, although on POSIX platforms it will; uint_least32_t would've been more portable).
Be careful to make sure the left operand has enough bits to be shiftable by 28. A narrow bit-field such as data.stuff1 is promoted to int before the shift, and shifting 0xF left by 28 overflows a 32-bit int, so convert it to an unsigned 32-bit type first, e.g., with (uint32_t)data.stuff1 << 28, (data.stuff1 + 0UL) << 28 or (data.stuff1 + UINT32_C(0)) << 28, and do the same for the second shift.
Add a union inside of this struct that you can use to reinterpret the fields.
struct myCoolStuff{
union {
struct {
uint32_t stuff1 : 4;
uint32_t stuff2 : 4;
uint32_t stuff3 : 24;
};
uint32_t stuff;
};
uint32_t fullField;
};
...
printf("my combined stuff is: %x\n", data.stuff);
Shift (using at least uint32_t math) and then print using the matching specifier.
#include <inttypes.h>
struct myCoolStuff{
    uint32_t stuff1 : 4;
    uint32_t stuff2 : 4;
    uint32_t stuff3 : 24;
    uint32_t differentField;
};
uint32_t combined_stuff = ((uint32_t) data.stuff1 << (4 + 24)) |
    ((uint32_t) data.stuff2 << 24) | data.stuff3;
printf("my combined stuff is: 0x%" PRIX32 "\n", combined_stuff);
printf("My full field is: %" PRIx32 "\n", data.differentField);
Maybe something like this will help:
const unsigned char *ptr = (const unsigned char *)&data; // start address of the struct
size_t size = sizeof data; // size of the struct in bytes
while (size--) // for each byte
{
    unsigned char c = *ptr++; // get byte value
    printf(" %02x ", (unsigned)c); // print byte value as two hex digits
}

Bit field memory usage in C

Why does sizeof report 96 bits and not 64?
If I sum the widths of the bit-fields, I get 64.
Edited:
The var variable has 0x3FFFFFFF00FFFFFF and not 0xFFFFFFFFFFFFFFFF.
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
typedef struct{
uint32_t a : 24;
uint32_t b : 20;
uint32_t c : 10;
uint32_t d : 6;
uint32_t e : 4;
}MyType_t;
int main(){
MyType_t test;
test.a = -1;
test.b = -1;
test.c = -1;
test.d = -1;
test.e = -1;
uint64_t var = *((uint64_t*)&test);
printf("MyType_t: %zu bit\n", sizeof(MyType_t) * 8);//96 bit
printf("Var: %#llX\n", (unsigned long long)var);//0x3FFFFFFF00FFFFFF
return 0;
}
This code works correctly:
typedef struct{
uint32_t a : 16;
uint32_t b : 16;
uint32_t c : 16;
uint32_t d : 8;
uint32_t e : 8;
}MyType_t;
The fields a and b cannot both fit into a single uint32_t (24 + 20 = 44 bits), so b starts a new storage unit, and d does the same after b and c:
typedef struct{
uint32_t a : 24; //first 32 bits
uint32_t b : 20; //second 32 bits
uint32_t c : 10; //
uint32_t d : 6; //third 32 bits
uint32_t e : 4; //
}MyType_t;
so the size of the struct is three times the size of uint32_t.
The behavior of uint64_t var = *((uint64_t*)&test); is undefined: it reads a MyType_t object through an lvalue of an incompatible type.
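A minimal sketch of a well-defined way to look at the raw bits, using the evenly divisible layout shown earlier. MyType64 and mytype_to_u64 are hypothetical names; the static_assert documents the assumption that the compiler packs the five fields into exactly two 32-bit units.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* No field crosses a 32-bit boundary: a + b fill the first unit,
   c + d + e fill the second. */
typedef struct {
    uint32_t a : 16;
    uint32_t b : 16;
    uint32_t c : 16;
    uint32_t d : 8;
    uint32_t e : 8;
} MyType64;

/* Guard the copy below: it is only meaningful if the struct is 8 bytes. */
static_assert(sizeof(MyType64) == sizeof(uint64_t), "MyType64 is not 64 bits");

/* Copy the object representation into a uint64_t. memcpy avoids the
   pointer cast and the undefined behavior that comes with it. */
static inline uint64_t mytype_to_u64(const MyType64 *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

The value returned still depends on byte order and on how the compiler orders the fields within a unit, so it is suitable for debugging output rather than for a portable wire format.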

Alignment of the mixed bit fields and fields of structures in big-endian and little-endian

From my previous experience, I understood the following:
// if i have structure in big-endian system, look like this:
typedef struct
{
unsigned long
a: t1,
b: t2,
c: t3,
d: t4,
//...
z: tn;
} TType;
// i can adapt this for little-endian so:
typedef struct
{
unsigned long
z:tn,
//...
d: t4,
c: t3,
b: t2,
a: t1;
} TType;
// and i get identical mapping to memory
or following:
// if i have next structure:
typedef struct
{
unsigned long
a : 2,
b : 5,
c : 6,
d : 3;
} TType2;
// ...
TType2 test;
test.a = 0x2;
test.b = 0x0E;
test.c = 0x3A;
test.d = 0x6;
printf("*(unsigned short *)&test = 0x%04X\n", *(unsigned short *)&test);
// in little-endian system i get: 0xDD3A , or mapping to memory :
// c'-|-----b-----|-a-| |--d--|------c----
// 0 0 1 1 _ 1 0 1 0 _|_ 1 1 0 1 _ 1 1 0 1 _
// in big-endian system i get: 0xD69D , or mapping to memory :
// -----c-----|--d--| |-a-|------b----|-c'
// 1 1 0 1 _ 0 1 1 0 _|_ 1 0 0 1 _ 1 1 0 1 _
If I am wrong, please correct me.
My embedded system is 32-bit little-endian. The device has big-endian hardware, connected via SPI.
The program that I must adapt for my system works with this hardware on a 32-bit big-endian system via a parallel bus.
I have begun adapting the library, which is intended for building and analyzing Ethernet frames and other things.
I came across the following code (which blows my mind):
#pragma pack(1)
//...
typedef unsigned short word;
//...
#ifdef _MOTOROLA_CPU
typedef struct
{
word ip_ver : 4;
word ihl : 4;
word ip_tos : 8;
word tot_len;
word identification;
word flags : 3;
word fragment_ofs: 13;
word time_to_live: 8;
word protocol : 8;
word check_sum;
IP_ADDRESS_T src;
IP_ADDRESS_T dest;
} IP_MSG_HEADER_T, *IP_MSG_HEADER_P;
#else // Intel CPU.
typedef struct
{
word ip_tos : 8;
word ihl : 4;
word ip_ver : 4;
word tot_len;
word identification;
word fragment_ofs: 13;
word flags : 3;
word protocol : 8;
word time_to_live: 8;
word check_sum;
IP_ADDRESS_T src;
IP_ADDRESS_T dest;
} IP_MSG_HEADER_T, *IP_MSG_HEADER_P;
#endif
But I also came across the following:
#ifdef _MOTOROLA_CPU
typedef struct
{
word formid : 5;
word padding_formid : 3;
word TS_in_bundle : 5;
word padding_ts : 3;
word cell_per_frame : 8;
word padding : 8;
} SERVICE_SPEC_OLD_FIELD_T, *SERVICE_SPEC_OLD_FIELD_P;
#else // Intel CPU.
typedef struct
{
word padding : 8;
word cell_per_frame : 8;
word padding_ts : 3;
word TS_in_bundle : 5;
word padding_formid : 3;
word formid : 5;
} SERVICE_SPEC_OLD_FIELD_T, *SERVICE_SPEC_OLD_FIELD_P;
#endif /*_MOTOROLA_CPU*/
Similar ambiguity appears everywhere in this library.
I don't see the logic here. Am I really stupid, or is this code nonsense?
Also, am I right about the following?
// Two structures
typedef struct
{
unsigned long
a : 1,
b : 3,
c : 12;
unsigned short word;
} Type1;
// and
typedef struct
{
unsigned long
a : 1,
b : 3,
c : 12,
word : 16;
} Type2;
// will be identical in memory for little-endian and different for big-endian?
Thanks in advance!
In this particular case, the big- and little-endian variants of the structures are organized in 16-bit groups. If you group each set of fields into 16-bit pieces, you can see that 16 bits is the size of the unit whose fields are reversed for endian compatibility.
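The 16-bit grouping can also be read without bit-fields at all. The sketch below is an alternative to the library's approach, not taken from it: it assembles each 16-bit group from two bytes in network (big-endian) order and extracts the fields by shifting, so the same source works on either kind of CPU. The names load_be16, ip_ver, ip_ihl, ip_tos, ip_flags and ip_fragment_ofs are hypothetical; the offsets follow the field order of the IP_MSG_HEADER_T structure in the question.

```c
#include <stdint.h>

/* Assemble a 16-bit value from two bytes stored most significant first. */
static inline uint16_t load_be16(const uint8_t *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}

/* First 16-bit group: ip_ver (4 bits), ihl (4 bits), ip_tos (8 bits). */
static inline unsigned ip_ver(const uint8_t *hdr) { return load_be16(hdr) >> 12; }
static inline unsigned ip_ihl(const uint8_t *hdr) { return (load_be16(hdr) >> 8) & 0xFu; }
static inline unsigned ip_tos(const uint8_t *hdr) { return load_be16(hdr) & 0xFFu; }

/* Fourth 16-bit group (byte offset 6): flags (3 bits), fragment_ofs (13 bits). */
static inline unsigned ip_flags(const uint8_t *hdr)        { return load_be16(hdr + 6) >> 13; }
static inline unsigned ip_fragment_ofs(const uint8_t *hdr) { return load_be16(hdr + 6) & 0x1FFFu; }
```

With a header that starts with the bytes 0x45 0x10, ip_ver yields 4, ip_ihl yields 5 and ip_tos yields 0x10 regardless of the host's byte order.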

Resources