So, I'm trying to convert 32-bit instructions to machine-code binary.
Take, for example, the instruction add #3, #5, #9:
has an opcode of 0.
rs will have the first operand value (3), rt will have the 2nd value (5) and rd will have the 3rd (9).
funct will be 1.
The remaining 6 bits are unused, just 0s.
#x represents a register.
Therefore, the above R-type instruction will be represented in binary like so:
opCode rs rt rd funct unused
000000 00011 00101 01001 00001 000000 - 32 bits / 4-byte instruction
My attempt to store each instruction:
typedef union __attribute__((__packed__))
{
struct
{
unsigned opcode : 6;
unsigned rs : 5;
unsigned rt : 5;
unsigned rd : 5;
unsigned funct : 5;
unsigned unused : 6;
} r;
struct
{
unsigned first : 8;
unsigned second : 8;
unsigned third : 8;
unsigned fourth : 8;
} bytes;
} r_instruction;
typedef union __attribute__((__packed__))
{
struct
{
unsigned rs : 5;
unsigned rt : 5;
unsigned opcode : 6;
unsigned immed : 16;
} i;
struct
{
unsigned first : 8;
unsigned second : 8;
unsigned third : 8;
unsigned fourth : 8;
} bytes;
} i_instruction;
typedef union __attribute__((__packed__))
{
struct
{
unsigned addr : 25;
unsigned opcode : 6;
unsigned isReg : 1;
} j;
struct
{
unsigned first : 8;
unsigned second : 8;
unsigned third : 8;
unsigned fourth : 8;
} bytes;
} j_instruction;
I need to output each instruction's binary encoding (like in the example) to a file, where each line describes one instruction as 4 bytes, each byte in hexadecimal, with little-endian byte order.
So my input is:
;comment
MAIN: add #3, #5, #9
LOOP: ori #9, -5, #2
la val1
jmp Next
Next: move #20, #4
bgt #0, #2, END
la K
sw #0, 4, #10
bne #31, #9 , LOOP
call val1
jmp #4
END: stop
STR: .asciz "aBcd"
LIST: .db 6, -9
.dh 27056
.dw 5
.entry K
K: .dw 31,-12
.extern val1
Desired output:
(the first line is the hexadecimal representation of the example above)
But when I try to output each byte like so:
r_instruction inst;
inst.r.opcode = 0;
inst.r.rs = 3;
inst.r.rt = 5;
inst.r.rd = 9;
inst.r.funct = 1;
inst.r.unused = 0;
printf("%x %x %x %x", inst.bytes.first,
inst.bytes.second,
inst.bytes.third, inst.bytes.fourth);
>> c0 28 29 0
My main problem is how to convert my existing instruction representation into the requested output format.
Instead of using packed structs with bit-fields, consider using normal structs and generating the raw instruction yourself. For instance, let's take a J-type instruction:
#include <stdint.h>

typedef struct { uint32_t addr; uint32_t opcode; uint32_t isReg; } j_instr;
// this function generates a 32-bit raw instruction (that you can then convert to ASCII hexadecimal or whatever you want)
uint32_t gen_j_instr(j_instr instr) {
uint32_t result = 0;
// let's add the opcode
result += instr.opcode << 26; // 26 is the starting position of opcode
// let's add the isReg
result += instr.isReg << 25; // 25 is the starting position of isReg
// let's add the addr
result += instr.addr; // no shift needed, addr is at position 0
return result;
}
The C11 standard gives implementations a lot of freedom in how structs with bit-fields are laid out, so you can make few assumptions about their exact binary representation. With the method above, on the other hand, you're completely independent of how your struct is represented in memory.
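Once you have the raw 32-bit word, emitting it in the requested format is just masking and shifting. Here is a minimal sketch, assuming a hypothetical write_instruction helper; the exact separator between the hex bytes is an assumption, so adjust the format string to whatever your assignment requires:
#include <stdint.h>
#include <stdio.h>
/* Write one 32-bit raw instruction as four hex bytes,
   least significant byte first (little-endian), one line per instruction. */
void write_instruction(FILE *out, uint32_t raw)
{
    fprintf(out, "%02x %02x %02x %02x\n",
            (unsigned)(raw & 0xFFu),
            (unsigned)((raw >> 8) & 0xFFu),
            (unsigned)((raw >> 16) & 0xFFu),
            (unsigned)((raw >> 24) & 0xFFu));
}
For example, write_instruction(stdout, gen_j_instr(instr)) prints the four bytes of a J-type instruction low byte first, which is exactly the little-endian order the assignment asks for.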
Put them in a union, together with a plain unsigned int:
typedef struct __attribute__((__packed__))
{
unsigned opcode : 6;
unsigned rs : 5;
unsigned rt : 5;
unsigned rd : 5;
unsigned funct : 5;
unsigned : 6;
} r_instruction;
typedef struct __attribute__((__packed__))
{
unsigned rs : 5;
unsigned rt : 5;
unsigned opcode : 6;
unsigned immed : 16;
} i_instruction;
typedef struct __attribute__((__packed__))
{
unsigned addr : 25;
unsigned opcode : 6;
unsigned isReg : 1;
} j_instruction;
union all_meuk {
r_instruction r_meuk;
i_instruction i_meuk;
j_instruction j_meuk;
unsigned all;
};
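Usage would then look roughly like the snippet below (assuming <stdio.h>, the typedefs above, and a 32-bit unsigned). Keep in mind that the order in which the compiler packs the bit-fields into all is still implementation-defined, so check the raw value against the encoding you need:
union all_meuk m;
m.all = 0;                      /* clear all 32 bits, including the unused field */
m.r_meuk.opcode = 0;
m.r_meuk.rs = 3;
m.r_meuk.rt = 5;
m.r_meuk.rd = 9;
m.r_meuk.funct = 1;
printf("%02x %02x %02x %02x\n", /* dump the raw word byte by byte */
       m.all & 0xFFu,
       (m.all >> 8) & 0xFFu,
       (m.all >> 16) & 0xFFu,
       (m.all >> 24) & 0xFFu);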
Related
I have a struct in C like this:
struct RegisterStruct
{
uint64_t b_0 : 64;
uint64_t b_1 : 64;
uint64_t c_0 : 64;
uint64_t c_1 : 64;
uint64_t c_2 : 64;
uint64_t d_0 : 64;
uint64_t d_1 : 64;
};
I would like to concatenate the fields into a uint64_t integer. Each field should occupy a given number of bits, defined as follows:
b_0: 4 bits
b_1: 4 bits
c_0: 8 bits
c_1: 8 bits
c_2: 8 bits
d_0: 16 bits
d_1: 16 bits
The result should be a uint64_t containing the concatenated bit fields (from b_0 to d_1), each occupying the given number of bits.
Here is what I have tried, but I don't think this solution is correct:
struct RegisterStruct Register;
Register.b_0 = 8;
Register.b_1 = 8;
Register.c_0 = 128;
Register.c_1 = 128;
Register.c_2 = 128;
Register.d_0 = 32768;
Register.d_1 = 32768;
uint64_t reg_frame =Register.b_0<<60|Register.b_1<<56|Register.c_0<<48|Register.c_1<<40|Register.c_2<<32|Register.d_0<<16|Register.d_1;
You can put the structure containing the bit fields in a union with the full 64-bit unsigned integer like this:
union RegisterUnion
{
struct
{
uint64_t b_0 : 4;
uint64_t b_1 : 4;
uint64_t c_0 : 8;
uint64_t c_1 : 8;
uint64_t c_2 : 8;
uint64_t d_0 : 16;
uint64_t d_1 : 16;
};
uint64_t val;
};
The main problem with the above is that it is not portable. The C standard leaves the order in which bit fields are packed into their underlying storage unit type (uint64_t in this case) as an implementation defined decision. This is entirely separate from the ordering of bytes within a multi-byte integer, i.e. the little endian versus big endian byte ordering.
In addition, using uint64_t as the base type of a bit-field might not be supported. An implementation is only required to support bit-field members of types _Bool, signed int and unsigned int (or qualified versions thereof). According to the C11 draft 6.7.2.1 paragraph 5:
A bit-field shall have a type that is a qualified or unqualified version of _Bool, signed int, unsigned int, or some other implementation-defined type. It is implementation-defined whether atomic types are permitted.
With those caveats in mind, here is a complete example:
#include <stdint.h>
#include <stdio.h>

typedef union
{
struct
{
uint64_t b_0 : 4;
uint64_t b_1 : 4;
uint64_t c_0 : 8;
uint64_t c_1 : 8;
uint64_t c_2 : 8;
uint64_t d_0 : 16;
uint64_t d_1 : 16;
};
uint64_t u64;
}R_t;
int main()
{
R_t Register;
Register.b_0 = 8;
Register.b_1 = 8;
Register.c_0 = 128;
Register.c_1 = 128;
Register.c_2 = 128;
Register.d_0 = 32768;
Register.d_1 = 32768;
printf("%llx\n", (long long unsigned)Register.u64);
}
https://godbolt.org/z/_dYuz2
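If you want a layout that does not depend on the compiler's bit-field packing at all, you can also assemble the 64-bit value with shifts yourself. A sketch, assuming b_0 is meant to occupy the top 4 bits and d_1 the bottom 16 (swap the shift amounts if your required layout is the other way around):
#include <stdint.h>
/* Pack the seven fields into one 64-bit word: b_0 and b_1 are 4 bits,
   c_0..c_2 are 8 bits, d_0 and d_1 are 16 bits each. */
uint64_t pack_register(uint64_t b_0, uint64_t b_1,
                       uint64_t c_0, uint64_t c_1, uint64_t c_2,
                       uint64_t d_0, uint64_t d_1)
{
    return ((b_0 & 0xF)    << 60) |
           ((b_1 & 0xF)    << 56) |
           ((c_0 & 0xFF)   << 48) |
           ((c_1 & 0xFF)   << 40) |
           ((c_2 & 0xFF)   << 32) |
           ((d_0 & 0xFFFF) << 16) |
            (d_1 & 0xFFFF);
}
Because the operands are plain uint64_t, the shifts by 32 or more bits are well defined, and the resulting layout is exactly what the shift amounts say, on every compiler.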
I tried to use a structure with different-sized bit-fields. The total number of bits used is 64. However, when I check the structure size, I get 11 instead of the expected 8. By trying to decompose the structure, I saw the difference comes from the day field. If I pack the bits into 8-bit groups, the day field straddles the "end" of month and the "start" of hour. I don't know if this is a good approach. Can someone explain that to me?
#include <stdio.h>

typedef unsigned char uint8_t;
typedef struct frameHeader_t
{
uint8_t encryption : 2;
uint8_t frameVersion : 2;
uint8_t probeType : 4;
uint8_t dataType : 5;
uint8_t measurePeriod : 3;
uint8_t remontePerdiod : 4;
uint8_t nbrMeasure : 2;
uint8_t year : 7;
uint8_t month : 4;
uint8_t day : 5;
uint8_t hour : 5;
uint8_t minute : 6;
uint8_t second : 6;
uint8_t randomization : 5;
uint8_t status : 4;
}FrameHeader;
int main()
{
FrameHeader my_frameHeader;
printf("%d\n", sizeof(FrameHeader));
return 0;
}
If you run it through the pahole tool, you should get an explanation:
struct frameHeader_t {
uint8_t encryption:2; /* 0: 6 1 */
uint8_t frameVersion:2; /* 0: 4 1 */
uint8_t probeType:4; /* 0: 0 1 */
uint8_t dataType:5; /* 1: 3 1 */
uint8_t measurePeriod:3; /* 1: 0 1 */
uint8_t remontePerdiod:4; /* 2: 4 1 */
uint8_t nbrMeasure:2; /* 2: 2 1 */
/* XXX 2 bits hole, try to pack */
uint8_t year:7; /* 3: 1 1 */
/* XXX 1 bit hole, try to pack */
uint8_t month:4; /* 4: 4 1 */
/* XXX 4 bits hole, try to pack */
uint8_t day:5; /* 5: 3 1 */
/* XXX 3 bits hole, try to pack */
uint8_t hour:5; /* 6: 3 1 */
/* XXX 3 bits hole, try to pack */
uint8_t minute:6; /* 7: 2 1 */
/* XXX 2 bits hole, try to pack */
uint8_t second:6; /* 8: 2 1 */
/* XXX 2 bits hole, try to pack */
uint8_t randomization:5; /* 9: 3 1 */
/* XXX 3 bits hole, try to pack */
uint8_t status:4; /* 10: 4 1 */
/* size: 11, cachelines: 1, members: 15 */
/* bit holes: 8, sum bit holes: 20 bits */
/* bit_padding: 4 bits */
/* last cacheline: 11 bytes */
};
You're using uint8_t as the base type, so the fields are padded to fit into 8-bit groups.
You should be able to eliminate the padding completely, and somewhat more portably than with __attribute__((packed)), by using unsigned long long / uint_least64_t (at least 64 bits wide) as the base type of the bit-fields; technically, though, base types other than int and unsigned int aren't guaranteed to be supported for bit-fields. Alternatively, you could use unsigned (guaranteed by the C standard to be at least 16 bits) after reorganizing the bit-fields a little, for example into:
typedef struct frameHeader_t
{
//16
unsigned year : 7;
unsigned randomization : 5;
unsigned month : 4;
//16
unsigned second : 6;
unsigned minute : 6;
unsigned status : 4;
//16
unsigned hour : 5;
unsigned dataType : 5;
unsigned probeType : 4;
unsigned encryption : 2;
//16
unsigned day : 5;
unsigned remontePerdiod : 4;
unsigned measurePeriod : 3;
unsigned nbrMeasure : 2;
unsigned frameVersion : 2;
}FrameHeader;
//should be an unpadded 8 bytes as long as `unsigned` is 16,
//32, or 64 bits wide (I don't know of a platform where it isn't)
(The padding or lack thereof isn't guaranteed, but I've never seen an implementation insert it unless it was necessary.)
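A quick way to check this on your own compiler is to print the size; a minimal sketch, assuming the reorganized FrameHeader above:
#include <stdio.h>
int main(void)
{
    /* with the reorganized bit-fields this should print 8 on targets
       where unsigned is 16, 32, or 64 bits wide */
    printf("sizeof(FrameHeader) = %zu\n", sizeof(FrameHeader));
    return 0;
}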
To avoid the packing magic, I usually use fixed-size unsigned types with the same total size:
typedef struct frameHeader_t
{
uint64_t encryption : 2;
uint64_t frameVersion : 2;
uint64_t probeType : 4;
uint64_t dataType : 5;
uint64_t measurePeriod : 3;
uint64_t remontePerdiod : 4;
uint64_t nbrMeasure : 2;
uint64_t year : 7;
uint64_t month : 4;
uint64_t day : 5;
uint64_t hour : 5;
uint64_t minute : 6;
uint64_t second : 6;
uint64_t randomization : 5;
uint64_t status : 4;
}FrameHeader;
https://godbolt.org/z/BX2QsC
I have a function that returns 1 byte:
uint8_t fun();
The function should run 9 times, so I get 9 bytes. I want to interpret the last 8 as 4 short values. Here is what I've done, but I'm not sure the values I get are correct:
char array[9];
.............
for ( i = 0; i< 9 ; i++){
array[i] = fun();
}
printf( " 1. Byte %x a = %d , b=%d c =%d \n" ,
array[0],
*(short*)&(array[1]),
*(short*)&(array[3]),
*(short*)&(array[5]),
*(short*)&(array[7]));
Is that right?
It's better to be explicit and join the 8-bit values into 16-bit values yourself:
uint8_t bytes[9];
uint16_t words[4];
words[0] = bytes[1] | (bytes[2] << 8);
words[1] = bytes[3] | (bytes[4] << 8);
words[2] = bytes[5] | (bytes[6] << 8);
words[3] = bytes[7] | (bytes[8] << 8);
The above assumes little-endian, by the way.
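Putting it together as a complete program; a sketch that assumes fun() from the question supplies one byte per call:
#include <stdint.h>
#include <stdio.h>
extern uint8_t fun(void);   /* assumed from the question: returns one byte per call */
int main(void)
{
    uint8_t bytes[9];
    uint16_t words[4];
    int i;
    for (i = 0; i < 9; i++)
        bytes[i] = fun();
    /* join byte pairs, low byte first (little-endian source data) */
    for (i = 0; i < 4; i++)
        words[i] = (uint16_t)(bytes[1 + 2 * i] | (bytes[2 + 2 * i] << 8));
    printf("1. Byte %x a=%u b=%u c=%u d=%u\n",
           (unsigned)bytes[0], (unsigned)words[0], (unsigned)words[1],
           (unsigned)words[2], (unsigned)words[3]);
    return 0;
}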
You will get alignment problems. Any pointer to a short can be treated as a pointer to char, but the reverse is not guaranteed on machines where short has stricter alignment than char.
IMHO, this would be safer:
struct {
char arr0;
union {
char array[8];
uint16_t sarr[4];
} u;
} s;
s.arr0 = fun();
for ( i = 0; i< 8 ; i++){
s.u.array[i] = fun();
}
printf( " 1. Byte %x a = %d , b=%d c =%d d=%d\n" ,
s.arr0,
s.u.sarr[0],
s.u.sarr[1],
s.u.sarr[2],
s.u.sarr[3]);
But I assume you deal correctly with endianness on your machine and know how the conversion between 2 chars and 1 short works...
Try using a struct to arrange the data and shift operations to handle endianness.
#include <stdio.h>

// The existence of this function is assumed from the question.
extern unsigned char fun(void);
typedef struct
{
unsigned char Byte;
short WordA;
short WordB;
short WordC;
short WordD;
} converted_data;
void ConvertByteArray(converted_data* Dest, unsigned char* Source)
{
Dest->Byte = Source[0];
// The following assume that the Source bytes are MSB first.
// If they are LSB first, you will need to swap the indeces.
Dest->WordA = (((short)Source[1]) << 8) + Source[2];
Dest->WordB = (((short)Source[3]) << 8) + Source[4];
Dest->WordC = (((short)Source[5]) << 8) + Source[6];
Dest->WordD = (((short)Source[7]) << 8) + Source[8];
}
int main(void)
{
unsigned char array[9];
converted_data convertedData;
// Fill the array as per the question.
int i;
for ( i = 0; i< 9 ; i++)
{
array[i] = fun();
}
// Perform the conversion
ConvertByteArray(&convertedData, array);
// Note the use of %hd rather than %d to specify a short in the printf.
printf( " 1. Byte %x a = %hd , b=%hd c =%hd d =%hd\n",
(int)convertedData.Byte, // Cast as int because %x assumes an int.
convertedData.WordA,
convertedData.WordB,
convertedData.WordC,
convertedData.WordD );
return 0;
}
From my previous experience I understood the following:
// if i have structure in big-endian system, look like this:
typedef struct
{
unsigned long
a: t1,
b: t2,
c: t3,
d: t4,
//...
z: tn;
} TType;
// i can adapt this for little-endian so:
typedef struct
{
unsigned long
z:tn,
//...
d: t4,
c: t3,
b: t2,
a: t1;
} TType;
// and i get identical mapping to memory
or the following:
// if i have next structure:
typedef struct
{
unsigned long
a : 2,
b : 5,
c : 6,
d : 3;
} TType2;
// ...
TType2 test;
test.a = 0x2;
test.b = 0x0E;
test.c = 0x3A;
test.d = 0x6;
printf("*(unsigned short *)&test = 0x%04X\n", *(unsigned short *)&test);
// in little-endian system i get: 0xDD3A , or mapping to memory :
// c'-|-----b-----|-a-| |--d--|------c----
// 0 0 1 1 _ 1 0 1 0 _|_ 1 1 0 1 _ 1 1 0 1 _
// in big-endian system i get: 0xD69D , or mapping to memory :
// -----c-----|--d--| |-a-|------b----|-c'
// 1 1 0 1 _ 0 1 1 0 _|_ 1 0 0 1 _ 1 1 0 1 _
If I am not right, please correct me.
My embedded system is 32-bit little-endian. It has big-endian hardware connected via SPI.
The program that I must adapt for my system later works with this hardware on a 32-bit big-endian system via a parallel bus.
I have begun to adapt the library, which is intended for building and analyzing Ethernet frames and other things.
I came across the following code (which blew my mind):
#pragma pack(1)
//...
typedef unsigned short word;
//...
#ifdef _MOTOROLA_CPU
typedef struct
{
word ip_ver : 4;
word ihl : 4;
word ip_tos : 8;
word tot_len;
word identification;
word flags : 3;
word fragment_ofs: 13;
word time_to_live: 8;
word protocol : 8;
word check_sum;
IP_ADDRESS_T src;
IP_ADDRESS_T dest;
} IP_MSG_HEADER_T, *IP_MSG_HEADER_P;
#else // Intel CPU.
typedef struct
{
word ip_tos : 8;
word ihl : 4;
word ip_ver : 4;
word tot_len;
word identification;
word fragment_ofs: 13;
word flags : 3;
word protocol : 8;
word time_to_live: 8;
word check_sum;
IP_ADDRESS_T src;
IP_ADDRESS_T dest;
} IP_MSG_HEADER_T, *IP_MSG_HEADER_P;
#endif
But I also came across the following:
#ifdef _MOTOROLA_CPU
typedef struct
{
word formid : 5;
word padding_formid : 3;
word TS_in_bundle : 5;
word padding_ts : 3;
word cell_per_frame : 8;
word padding : 8;
} SERVICE_SPEC_OLD_FIELD_T, *SERVICE_SPEC_OLD_FIELD_P;
#else // Intel CPU.
typedef struct
{
word padding : 8;
word cell_per_frame : 8;
word padding_ts : 3;
word TS_in_bundle : 5;
word padding_formid : 3;
word formid : 5;
} SERVICE_SPEC_OLD_FIELD_T, *SERVICE_SPEC_OLD_FIELD_P;
#endif /*_MOTOROLA_CPU*/
Similar vagueness appears everywhere in this library.
I don't see the logic here. Am I really missing something, or is this code nonsense?
Also: am I right about the following?
// Two structures
typedef struct
{
unsigned long
a : 1,
b : 3,
c : 12;
unsigned short word;
} Type1;
// and
typedef struct
{
unsigned long
a : 1,
b : 3,
c : 12,
word : 16;
} Type2;
// will be identical in memory for little-endian and different for big-endian?
Thanks in advance!
In this particular case, the structures are built from 16-bit storage units, laid out once for big-endian and once for little-endian targets. If you group the bit-fields into 16-bit pieces, you can see that a 16-bit unit is the granularity of what gets swapped for endian compatibility.
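A small experiment makes that 16-bit grouping visible. This sketch uses unsigned short as the bit-field base type (technically implementation-defined, but widely supported); which bits each field lands in is exactly the allocation order that the two struct variants in the library are compensating for:
#include <stdio.h>
/* three fields sharing one 16-bit storage unit */
struct half {
    unsigned short a : 4;
    unsigned short b : 5;
    unsigned short c : 7;
};
int main(void)
{
    union { struct half h; unsigned short raw; } u;
    u.h.a = 0x1;
    u.h.b = 0x02;
    u.h.c = 0x03;
    /* An LSB-first compiler (e.g. GCC on x86) puts a in bits 0-3,
       b in bits 4-8, c in bits 9-15 and prints 0x0621; an MSB-first
       compiler packs the same fields from the other end of the unit. */
    printf("raw = 0x%04x\n", (unsigned)u.raw);
    return 0;
}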
I'm trying to define a structure that allows setting a byte value directly, and also allows manipulating the bits of that byte without using functions like bit_set(), bit_clear(), etc.
Here's my definition:
typedef union FLAG_WORK {
volatile unsigned char BYTE;
struct {
volatile unsigned char bit0:1;
volatile unsigned char bit1:1;
volatile unsigned char bit2:1;
volatile unsigned char bit3:1;
volatile unsigned char bit4:1;
volatile unsigned char bit5:1;
volatile unsigned char bit6:1;
volatile unsigned char bit7:1;
}BIT;
}FLAG8;
and some sample code:
#include <iostream>
using namespace std;

int main()
{
FLAG8 i;
i.BYTE=(unsigned char)0; // initial the value of i.BYTE
i.BIT.bit0 = 1; // set bit0 of i.BYTE
i.BIT.bit1 = 1;
cout << (int)i.BYTE << endl;
cout << "Hello world!" << endl;
return 0;
}
I just wonder how to modify the structure to allow me to assign a value to "i" in the above code directly.
Any suggestions?
C99 allows initializing members explicitly. Assuming I understand your question correctly, you're looking for:
FLAG8 i = { .BIT = { .bit2 = 1, .bit5 = 1 } };
FLAG8 j = { .BYTE = 42 };
FLAG8 k = { 42 }; // same as j as initializing the first member is default
You can also combine both in one initializer, but the values overlap, because the union holds only 1 byte of memory:
FLAG8 i = { .BIT = { .bit2 = 1, .bit5 = 1 } , .BYTE = 42};
// result: i.BYTE is 42, determined by the later .BYTE initializer
FLAG8 i = { .BYTE = 42 , .BIT = { .bit2 = 1, .bit5 = 1 } };
// result: i.BYTE is 36, determined by the later .BIT initializers
1. As Christoph suggested, you can assign a value to the whole BYTE or bit by bit directly, like this:
FLAG8 i = { .BYTE = 42 };
FLAG8 j = { .BIT = { .bit2 = 1, .bit5 = 1 } };
printf("%d \n\n",(int)i.BYTE );
printf("%d \n\n",(int)j.BYTE );
2. You can assign values bit by bit:
i.BIT.bit0 = 1; // set bit0 of i.BYTE
i.BIT.bit1 = 1;
i.BIT.bit4 = 1;
printf("%d \n\n",(int)i.BYTE );
3. You can assign a BYTE directly with a hexadecimal value:
i.BYTE=0x10;
printf("%d \n\n",(int)i.BYTE );
4. You can assign a BYTE directly with a decimal value:
i.BYTE=100;
printf("%d \n\n",(int)j.BYTE );
5. You can assign directly from another identifier of the same type:
j.BYTE=0x30;
printf("%d \n\n",(int)j.BYTE );
i=j;
printf("%d \n\n",(int)i.BYTE );
6. Assign a decimal value and check it by printing in both decimal and hexadecimal format:
j.BYTE=100;
printf("%d \n\n",(int)j.BYTE );
printf("%x \n\n",(int)j.BYTE );
j.BIT.bit4=0;
j.BIT.bit5=0;
printf("%d \n\n",(int)j.BYTE );
printf("%x \n\n",(int)j.BYTE );