I was wondering if it is possible to force the alignment of a bit-field in C. Using the variables in the code below, I know that writing to _align_bytes and then reading from bits is undefined (and vice versa) because the layout is implementation-dependent. Is the code below a valid method to "persuade" bits to be stored contiguously in something that is the size of an unsigned short? I believe that (minus any endianness issues) this code is correct... but bit-fields and unions are the two C topics I am least familiar with.
I am doing a low-level microcontroller project and would like an easy method of reading configuration bits without a ton of bit masking. Thanks for any tips and suggestions.
Sam
P.S. Please disregard any assumptions I make about endianness as this project I am working on is very low level and not intended to be ported to other devices/platforms.
#include <stdio.h>
#include <assert.h>
typedef union packet {
struct {
unsigned int bit0 : 1;
unsigned int bit1 : 1;
unsigned int bit2 : 1;
unsigned int bit3 : 1;
unsigned int bit4 : 1;
unsigned int bit5 : 1;
unsigned int bit6 : 1;
unsigned int bit7 : 1;
unsigned int bit8 : 1;
unsigned int bit9 : 1;
unsigned int bit10 : 1;
unsigned int bit11 : 1;
unsigned int bit12 : 1;
unsigned int bit13 : 1;
unsigned int bit14 : 1;
unsigned int bit15 : 1;
} bits;
unsigned short _align_bytes;
} packet_t;
int main(int argc, char *argv[]) {
assert(sizeof(unsigned short) == 2);
unsigned short data = 0xA05F;
packet_t *p = (packet_t *)&data;
printf("%u", p->bits.bit15);
printf("%u", p->bits.bit14);
printf("%u", p->bits.bit13);
printf("%u", p->bits.bit12);
printf("%u", p->bits.bit11);
printf("%u", p->bits.bit10);
printf("%u", p->bits.bit9);
printf("%u", p->bits.bit8);
printf("%u", p->bits.bit7);
printf("%u", p->bits.bit6);
printf("%u", p->bits.bit5);
printf("%u", p->bits.bit4);
printf("%u", p->bits.bit3);
printf("%u", p->bits.bit2);
printf("%u", p->bits.bit1);
printf("%u", p->bits.bit0);
return 0;
}
This is a common pattern and as far as I know, the answer is yes: the bit fields will be contiguous and occupy the same memory as the _align_bytes field. That's the whole point of a union, right? Different ways of looking at the same memory.
I'm not sure what you mean by "writing to _align_bytes then reading from bits is undefined". The only issue I see is the endianness: bit0 may be the lsb or the msb of _align_bytes. Since you don't need it to be portable, you just need to do a quick test to figure out which it is, and you should be set.
I am not sure, but wouldn't this violate strict aliasing rules, since two pointers of different types point to the same memory location? In C89 and C99 you are not guaranteed to get anything back correctly.
You may want to test this and, if required, use -fno-strict-aliasing (or your compiler's equivalent) to disable the strict-aliasing optimizations that could cause issues.
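For what it's worth, you can avoid the pointer cast (and the aliasing question) entirely by writing through the union itself: store the raw value in _align_bytes and read the bit-fields from the same object. A minimal sketch of that idea, reusing the packet_t type from the question (which bit of the short each field lands on is still implementation-defined):
#include <stdio.h>

typedef union packet {
    struct {
        unsigned int bit0  : 1, bit1  : 1, bit2  : 1, bit3  : 1;
        unsigned int bit4  : 1, bit5  : 1, bit6  : 1, bit7  : 1;
        unsigned int bit8  : 1, bit9  : 1, bit10 : 1, bit11 : 1;
        unsigned int bit12 : 1, bit13 : 1, bit14 : 1, bit15 : 1;
    } bits;
    unsigned short _align_bytes;
} packet_t;

int main(void)
{
    packet_t p;
    p._align_bytes = 0xA05F;    /* write the raw value through the integer member */
    printf("%u %u %u %u\n",     /* read bits back through the bit-field member    */
           p.bits.bit15, p.bits.bit14, p.bits.bit1, p.bits.bit0);
    return 0;
}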
Related
I was pondering (and therefore am looking for a way to learn this, not a better solution) whether it is possible to get an array of bits in a structure.
Let me demonstrate by an example. Imagine such a code:
#include <stdio.h>
struct A
{
unsigned int bit0:1;
unsigned int bit1:1;
unsigned int bit2:1;
unsigned int bit3:1;
};
int main()
{
struct A a = {1, 0, 1, 1};
printf("%u\n", a.bit0);
printf("%u\n", a.bit1);
printf("%u\n", a.bit2);
printf("%u\n", a.bit3);
return 0;
}
In this code, we have 4 individual bits packed in a struct. They can be accessed individually, leaving the job of bit manipulation to the compiler. What I was wondering is if such a thing is possible:
#include <stdio.h>
typedef unsigned int bit:1;
struct B
{
bit bits[4];
};
int main()
{
struct B b = {{1, 0, 1, 1}};
for (int i = 0; i < 4; ++i)
printf("%u\n", b.bits[i]);
return 0;
}
I tried declaring bits in struct B as unsigned int bits[4]:1 or unsigned int bits:1[4] or similar things, to no avail. My best guess was to typedef unsigned int bit:1; and use bit as the type, yet that still doesn't work.
My question is, is such a thing possible? If yes, how? If not, why not? The 1 bit unsigned int is a valid type, so why shouldn't you be able to get an array of it?
Again, I don't want a replacement for this, I am just wondering how such a thing is possible.
P.S. I am tagging this as C++, although the code is written in C, because I assume the method would exist in both languages. If there is a C++-specific way to do it (by using the language constructs, not the libraries) I would also be interested to know.
UPDATE: I am completely aware that I can do the bit operations myself. I have done it a thousand times in the past. I am NOT interested in an answer that says use an array/vector instead and do bit manipulation. I am only thinking if THIS CONSTRUCT is possible or not, NOT an alternative.
Update: Answer for the impatient (thanks to neagoegab):
Instead of
typedef unsigned int bit:1;
I could use
typedef struct
{
unsigned int value:1;
} bit;
together with proper use of #pragma pack.
NOT POSSIBLE - A construct like that IS NOT possible (here) - NOT POSSIBLE
One could try to do this, but the result will be that one bit is stored in one byte:
#include <cstdint>
#include <iostream>
using namespace std;
#pragma pack(push, 1)
struct Bit
{
//one bit is stored in one BYTE
uint8_t a_:1;
};
#pragma pack(pop)
typedef Bit bit;
struct B
{
bit bits[4];
};
int main()
{
struct B b = {{0, 0, 1, 1}};
for (int i = 0; i < 4; ++i)
cout << (int)b.bits[i].a_ << endl;
cout<< sizeof(Bit) << endl;
cout<< sizeof(B) << endl;
return 0;
}
output:
0 //bit[0] value
0 //bit[1] value
1 //bit[2] value
1 //bit[3] value
1 //sizeof(Bit), one bit is stored in one byte!
4 //sizeof(B), 4 bytes, each bit is stored in one byte
In order to access individual bits from a byte, here is an example (please note that the layout of the bit-fields is implementation-dependent):
#include <iostream>
#include <cstdint>
using namespace std;
#pragma pack(push, 1)
struct Byte
{
Byte(uint8_t value):
_value(value)
{
}
union
{
uint8_t _value;
struct {
uint8_t _bit0:1;
uint8_t _bit1:1;
uint8_t _bit2:1;
uint8_t _bit3:1;
uint8_t _bit4:1;
uint8_t _bit5:1;
uint8_t _bit6:1;
uint8_t _bit7:1;
};
};
};
#pragma pack(pop)
int main()
{
Byte myByte(8);
cout << "Bit 0: " << (int)myByte._bit0 <<endl;
cout << "Bit 1: " << (int)myByte._bit1 <<endl;
cout << "Bit 2: " << (int)myByte._bit2 <<endl;
cout << "Bit 3: " << (int)myByte._bit3 <<endl;
cout << "Bit 4: " << (int)myByte._bit4 <<endl;
cout << "Bit 5: " << (int)myByte._bit5 <<endl;
cout << "Bit 6: " << (int)myByte._bit6 <<endl;
cout << "Bit 7: " << (int)myByte._bit7 <<endl;
if(myByte._bit3)
{
cout << "Bit 3 is on" << endl;
}
}
In C++ you use std::bitset<4>. This will use a minimal number of words for storage and hide all the masking from you. It's really hard to separate the C++ library from the language because so much of the language is implemented in the standard library. In C there's no direct way to create an array of single bits like this, instead you'd create one element of four bits or do the manipulation manually.
EDIT:
The 1 bit unsigned int is a valid type, so why shouldn't you be able to get an array of it?
Actually you can't use a 1 bit unsigned type anywhere other than the context of creating a struct/class member. At that point it's so different from other types it doesn't automatically follow that you could create an array of them.
C++ would use std::vector<bool> or std::bitset<N>.
In C, to emulate std::vector<bool> semantics, you use a struct like this:
struct Bits {
Word *word;
size_t word_count;
};
where Word is an implementation-defined type equal in width to the data bus of the CPU; wordsize, as used later on, is equal to the width of the data bus.
E.g. Word is uint_fast32_t for 32-bit machines, uint_fast64_t for 64-bit machines;
wordsize is 32 for 32-bit machines, and 64 for 64-bit machines.
You use functions/macros to set/clear bits.
To extract a bit, use GET_BIT(bits, bit) ((bits)->word[(bit)/wordsize] & ((Word)1 << ((bit) % wordsize))).
To set a bit, use SET_BIT(bits, bit) ((bits)->word[(bit)/wordsize] |= ((Word)1 << ((bit) % wordsize))).
To clear a bit, use CLEAR_BIT(bits, bit) ((bits)->word[(bit)/wordsize] &= ~((Word)1 << ((bit) % wordsize))).
To flip a bit, use FLIP_BIT(bits, bit) ((bits)->word[(bit)/wordsize] ^= ((Word)1 << ((bit) % wordsize))).
To add resizeability as per std::vector<bool>, make a resize function which calls realloc on Bits.word and changes Bits.word_count accordingly (a sketch is given below). The exact details are left as an exercise.
The same applies for proper range-checking of bit indices.
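A minimal sketch of what such a resize function could look like, assuming Bits.word is a plain pointer as above; the names bits_resize and WORDSIZE, and the choice of unsigned long as the word type, are made up for illustration:
#include <stdlib.h>
#include <string.h>

typedef unsigned long Word;            /* stand-in for the implementation-chosen word type */
#define WORDSIZE (sizeof(Word) * 8)    /* bits per word, i.e. "wordsize" from above */

struct Bits {
    Word  *word;
    size_t word_count;
};

/* Grow or shrink the bit array to hold at least bit_count bits.
 * Returns 0 on success, -1 on allocation failure. */
static int bits_resize(struct Bits *bits, size_t bit_count)
{
    size_t new_count = (bit_count + WORDSIZE - 1) / WORDSIZE;   /* round up to whole words */
    Word *p = realloc(bits->word, new_count * sizeof(Word));
    if (p == NULL && new_count > 0)
        return -1;
    if (new_count > bits->word_count)                           /* zero newly added words   */
        memset(p + bits->word_count, 0,
               (new_count - bits->word_count) * sizeof(Word));
    bits->word = p;
    bits->word_count = new_count;
    return 0;
}

int main(void)
{
    struct Bits bits = { NULL, 0 };
    if (bits_resize(&bits, 100) == 0)      /* room for 100 bits */
        bits.word[0] |= (Word)1 << 5;      /* set bit 5 by hand */
    free(bits.word);
    return 0;
}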
This is abusive, and relies on an extension... but it worked for me:
#include <stdio.h>
struct __attribute__ ((__packed__)) A
{
unsigned int bit0:1;
unsigned int bit1:1;
unsigned int bit2:1;
unsigned int bit3:1;
};
union U
{
struct A structVal;
int intVal;
};
int main()
{
struct A a = {1, 0, 1, 1};
union U u;
u.structVal = a;
for (int i =0 ; i<4; i++)
{
int mask = 1 << i;
printf("%d\n", (u.intVal & mask) >> i);
}
return 0;
}
You can also use an array of integers (ints or longs) to build an arbitrarily large bit mask. The select() system call uses this approach for its fd_set type; each bit corresponds to the numbered file descriptor (0..N). Macros are defined: FD_CLR to clear a bit, FD_SET to set a bit, FD_ISSET to test a bit, and FD_SETSIZE is the total number of bits. The macros automatically figure out which integer in the array to access and which bit in the integer. On Unix, see "sys/select.h"; under Windows, I think it is in "winsock.h". You can use the FD technique to make your own definitions for a bit mask. In C++, I suppose you could create a bit-mask object and overload the [] operator to access individual bits.
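As a small illustration of that technique, the existing FD_* macros from <sys/select.h> can themselves be used as a fixed-size bit set on a POSIX system, as long as the indices stay below FD_SETSIZE:
#include <stdio.h>
#include <sys/select.h>

int main(void)
{
    fd_set bits;
    FD_ZERO(&bits);          /* clear all FD_SETSIZE bits */
    FD_SET(3, &bits);        /* set bit 3                 */
    FD_SET(100, &bits);      /* set bit 100               */
    FD_CLR(3, &bits);        /* clear bit 3 again         */
    printf("bit 3: %d, bit 100: %d, capacity: %d bits\n",
           FD_ISSET(3, &bits) ? 1 : 0, FD_ISSET(100, &bits) ? 1 : 0, FD_SETSIZE);
    return 0;
}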
You can create a bit list by using a struct pointer. This will use much more than one bit of space per bit stored, though, since each element occupies at least a whole byte (typically sizeof(unsigned int) bytes):
struct bitfield{
unsigned int bit : 1;
};
struct bitfield *bitstream;
Then after this:
bitstream=malloc( sizeof(struct bitfield) * numberofbitswewant );
You can access them like so:
bitstream[bitpointer].bit=...
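Putting those pieces together into something compilable (keeping in mind that each element really occupies a whole struct bitfield, not a single bit):
#include <stdio.h>
#include <stdlib.h>

struct bitfield {
    unsigned int bit : 1;
};

int main(void)
{
    size_t numberofbitswewant = 8;
    struct bitfield *bitstream = malloc(sizeof(struct bitfield) * numberofbitswewant);
    if (bitstream == NULL)
        return 1;

    for (size_t i = 0; i < numberofbitswewant; i++)
        bitstream[i].bit = (i % 2);          /* 0,1,0,1,... */

    for (size_t i = 0; i < numberofbitswewant; i++)
        printf("%u", bitstream[i].bit);
    printf("\n");

    free(bitstream);
    return 0;
}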
How can I get the bit position of any member in a structure?
For example:
typedef struct BitExamStruct_
{
unsigned int v1: 3;
unsigned int v2: 4;
unsigned int v3: 5;
unsigned int v4: 6;
} BitExamStruct;
Is there any macro to get the bit position of any member, like GetBitPos(v2, BitExamStruct)?
I thought that the compiler might know each member's location based on the bit lengths in the structure. So I want to know whether I can get it using just a simple macro, without running code.
Thank you in advance.
There is no standard way that I know of to do so, but that doesn't mean you can't find a solution.
The following is not the prettiest code ever; it's a kind of hack to identify where the variable "begins" in memory. Please keep in mind that the following can give different results depending on the endianness:
#include <stdio.h>
#include <string.h>
typedef struct s_toto
{
int a:2;
int b:3;
int c:3;
} t_toto;
int
main()
{
t_toto toto;
unsigned char *c;
int bytes;
int bits;
memset(&toto, 0, sizeof(t_toto));
toto.c = 1;
c = (unsigned char *)&toto;
for (bytes = 0; bytes < (int)sizeof(t_toto); bytes++)
{
if (*c)
break;
c++;
}
for (bits = 0; bits < 8; bits++)
{
if (*c & 0x80)
break;
*c = (*c << 1);
}
printf("position (bytes=%d, bits=%d): %d\n", bytes, bits, (bytes * 8) + bits);
return 0;
}
What I do is that I initialize the whole structure to 0 and I set 1 as value of the variable I want to locate. The result is that only one bit is set to 1 in the structure. Then I read the memory byte per byte until I find one that's not zero. Once found, I can look at its bits until I find the one that's set.
There is no portable (aka standard C) way. But thinking outside the box, if you need full control or need this information badly, bitfields are the wrong approach. The proper solution is shifting and masking. Of course this is feasible only when you are in control of the source code.
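To sketch what shifting and masking looks like for the BitExamStruct layout above, with the bit positions written out by hand (the V2_* macro names are made up for illustration, and this assumes you define the layout yourself rather than leaving it to the compiler):
#include <stdio.h>
#include <stdint.h>

/* Hand-rolled layout: v1 in bits 0-2, v2 in bits 3-6, v3 in bits 7-11, v4 in bits 12-17. */
#define V2_SHIFT 3u
#define V2_MASK  0xFu   /* v2 is 4 bits wide */

#define GET_V2(word)    (((word) >> V2_SHIFT) & V2_MASK)
#define SET_V2(word, x) (((word) & ~(V2_MASK << V2_SHIFT)) | (((uint32_t)(x) & V2_MASK) << V2_SHIFT))

int main(void)
{
    uint32_t reg = 0;
    reg = SET_V2(reg, 5);   /* store 5 in the v2 field */
    printf("v2 = %u, and it starts at bit %u by construction\n", GET_V2(reg), V2_SHIFT);
    return 0;
}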
Trying to pack data into a packet. This packet should be 64 bits. I have this:
typedef union {
uint64_t raw;
struct {
unsigned int magic : 8;
unsigned int parity : 1;
unsigned int stype : 8;
unsigned int sid : 8;
unsigned int mlength : 31;
unsigned int message : 8;
} spacket;
} packet_t;
But it seems that the alignment is not guaranteed, because when I run this:
#include <strings.h>
#include <stdio.h>
#include <stddef.h>
#include <stdint.h>
const char *number_to_binary(uint64_t x)
{
static char b[65];
b[64] = '\0';
uint64_t z;
int w = 0;
for (z = 1; w < 64; z <<= 1, ++w)
{
b[w] = ((x & z) == z) ? '1' : '0';
}
return b;
}
int main(void)
{
packet_t ipacket;
bzero(&ipacket, sizeof(packet_t));
ipacket.spacket.magic = 255;
printf("%s\n", number_to_binary(ipacket.raw));
ipacket.spacket.parity = 1;
printf("%s\n", number_to_binary(ipacket.raw));
ipacket.spacket.stype = 255;
printf("%s\n", number_to_binary(ipacket.raw));
ipacket.spacket.sid = 255;
printf("%s\n", number_to_binary(ipacket.raw));
ipacket.spacket.mlength = 2147483647;
printf("%s\n", number_to_binary(ipacket.raw));
ipacket.spacket.message = 255;
printf("%s\n", number_to_binary(ipacket.raw));
}
I get (bit 0 of raw printed leftmost):
1111111100000000000000000000000000000000000000000000000000000000
1111111110000000000000000000000000000000000000000000000000000000
1111111111111111100000000000000000000000000000000000000000000000
1111111111111111111111111000000000000000000000000000000000000000
1111111111111111111111111000000011111111111111111111111111111110
1111111111111111111111111000000011111111111111111111111111111110
My .mlength field is lost somewhere on the right part although it should be right next to the .sid field.
This page confirms it: alignment of the allocation unit that holds a bit-field is unspecified. But if this is the case, how are people packing data into bit-fields, which is their purpose in the first place?
24 bits seems to be the maximum size the .mlength field is able to take before the .message field is kicked out.
Almost everything about the layout of bit-fields is implementation-defined in the standard, as you'd find from numerous other questions on the subject on SO. (Amongst others, you could look at Questions about bitfields and especially Bit field's memory management in C).
If you want your bit fields to be packed into 64 bits, you'll have to trust that your compiler allows you to use 64-bit types for the fields, and then use:
typedef union {
uint64_t raw;
struct {
uint64_t magic : 8;
uint64_t parity : 1;
uint64_t stype : 8;
uint64_t sid : 8;
uint64_t mlength : 31;
uint64_t message : 8;
} spacket;
} packet_t;
As originally written, under one plausible (common) scheme, your bit fields would be split into new 32-bit words when there isn't space enough left in the current one. That is, magic, parity, stype and sid would occupy 25 bits; there isn't enough room left in a 32-bit unsigned int to hold another 31 bits, so mlength is stored in the next unsigned int, and there isn't enough space left over in that unit to store message so that is stored in the third unsigned int unit. That would give you a structure occupying 3 * sizeof(unsigned int) or 12 bytes — and the union would occupy 16 bytes because of the alignment requirements on uint64_t.
Note that the standard does not guarantee that what I show will work. However, under many compilers, it probably will work. (Specifically, it works with GCC 5.3.0 on Mac OS X 10.11.4.)
Depending on your architecture and/or compiler, your data will be aligned to different sizes. From your observations I would guess that you are seeing the consequences of 32-bit alignment. If you take a look at the sizeof your union and it is more than 8 bytes (64 bits), the data has been padded for alignment.
With 32-bit alignment, mlength and message will only be able to stay next to each other if they sum up to 32 bits or less. This is probably what you see with your 24-bit limit.
If you want your struct to take only 64 bits with 32-bit alignment, you will have to rearrange it a little. The single-bit parity should be next to the 31-bit mlength, and your four 8-bit variables should be grouped together, as sketched below.
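For example, one possible rearrangement along those lines (still implementation-dependent, but it keeps each group of fields within one 32-bit unit) might be:
#include <stdint.h>
#include <stdio.h>

typedef union {
    uint64_t raw;
    struct {
        /* first 32-bit unit: 31 + 1 bits */
        unsigned int mlength : 31;
        unsigned int parity  : 1;
        /* second 32-bit unit: 4 x 8 bits */
        unsigned int magic   : 8;
        unsigned int stype   : 8;
        unsigned int sid     : 8;
        unsigned int message : 8;
    } spacket;
} packet_t;

int main(void)
{
    printf("sizeof(packet_t) = %zu\n", sizeof(packet_t));   /* 8 if the fields packed as hoped */
    return 0;
}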
I've got 2 chars.
Char 128 and Char 2.
How do I turn these chars into the Short 640 in C?
I've tried
unsigned short getShort(unsigned char* array, int offset)
{
short returnVal;
char* a = slice(array, offset, offset+2);
memcpy(&returnVal, a, 2);
free(a);
return returnVal;
}
But that didn't work, it just displays it as 128. What's the preferred method?
Probably the easiest way to turn two chars, a and b, into a short c, is as follows:
short c = (((short)a) << 8) | b;
To fit this into what you have, the easiest way is probably something like this:
unsigned short getShort(unsigned char* array, int offset)
{
return (short)(((short)array[offset]) << 8) | array[offset + 1];
}
I found that the accepted answer was nearly correct, except I'd run into a bug where sometimes the top byte of the result would be 0xFF...
I realized this was because of C sign extension. If the second char is >= 0x80, then converting it to a short sign-extends it, so 0x80 becomes 0xFF80. Performing an OR of 0xFF80 with anything leaves the top byte as 0xFF.
The following solution avoids the issue by zeroing out the top byte of b during its implicit conversion to a short.
short c = (((short)a) << 8) | (0x00ff & b);
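Folded back into a getShort-style helper (a sketch only; it indexes the bytes directly instead of using the slice call from the question, and keeps the assumption that the first byte is the high byte):
#include <stdio.h>

unsigned short getShort(const char *array, int offset)
{
    /* mask each byte to 8 bits so a negative char cannot sign-extend into the high byte */
    return (unsigned short)(((array[offset] & 0xFF) << 8) | (array[offset + 1] & 0xFF));
}

int main(void)
{
    char data[2] = { 2, (char)128 };   /* 0x02, 0x80 -> expect 0x0280 == 640 */
    printf("%hu\n", getShort(data, 0));
    return 0;
}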
I see that there is already an answer, but I'm a bit puzzled about what was going on with your original attempt. The following code shows your way and a technique using a union. Both seem to work just fine. I suppose you might have been running into an endianness problem. Anyway, perhaps this demonstration will be useful even if your problem is already solved.
#include <stdio.h>
#include <string.h>
int main()
{
short returnVal;
char a[2];
union {
char ch[2];
short n;
} char2short;
a[0] = 128;
a[1] = 2;
memcpy(&returnVal, a, 2);
printf("short = %d\n", returnVal);
char2short.ch[0] = 128;
char2short.ch[1] = 2;
printf("short (union) = %d\n", char2short.n);
return 0;
}
Outputs:
short = 640
short (union) = 640
I see that you are not actually trying to shift bits but to assemble the equivalent of hex values together, like you would with color values in CSS.
Give this code a shot:
unsigned char b1 = 128, b2 = 2;
char data[16];
sprintf(data, "%x%02x", b2, b1);   /* %02x keeps a leading zero for low bytes below 0x10 */
short result = strtol(data, NULL, 16);
So I have a 16 bit number. Say that the variable name for it is Bits. I want to make it so that Bits[2:0] = 001, 100, and 000, without changing anything else. I'm not sure how to do it, because all I can think of is ORing the bit I want to be a 1 with 1, but I'm not sure how to clear the other bits so that they're 0. If anyone has advice, I'd appreciate it. Thanks!
To clear certain bits, & with the inverse of the bits to be cleared. Then you can | in the bits you want.
In this case, you want to zero out the lower three bits (111 in binary or 7 in decimal), so we & with ~7 to clear those bits.
Bits = (Bits & ~7) | 1; // set lower three bits of Bits to 001
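Spelled out for all three values you listed, assuming Bits is a 16-bit unsigned variable:
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint16_t Bits = 0xABCD;              /* arbitrary starting value */
    Bits = (Bits & ~7u) | 1u;            /* Bits[2:0] = 001 */
    printf("0x%04X\n", (unsigned)Bits);
    Bits = (Bits & ~7u) | 4u;            /* Bits[2:0] = 100 */
    printf("0x%04X\n", (unsigned)Bits);
    Bits = (Bits & ~7u);                 /* Bits[2:0] = 000 */
    printf("0x%04X\n", (unsigned)Bits);
    return 0;
}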
Unions of bit-field structs allow you to specify the number of bits a variable occupies and to address each bit of a bigger variable individually.
union {
short value;
struct {
unsigned char bit0 : 1;
unsigned char bit1 : 1;
unsigned char bit2 : 1;
unsigned char bit3 : 1;
unsigned char bit4 : 1;
unsigned char bit5 : 1;
unsigned char bit6 : 1;
unsigned char bit7 : 1;
unsigned char bit8 : 1;
unsigned char bit9 : 1;
unsigned char bit10 : 1;
unsigned char bit11 : 1;
unsigned char bit12 : 1;
unsigned char bit13 : 1;
unsigned char bit14 : 1;
unsigned char bit15 : 1;
} bits;
} var;
Now you have a variable named var that holds a 16-bit integer which can be referenced as var.value, and you have access to each individual bit of this variable via var.bits.bit0 through var.bits.bit15.
By setting var.value = 0; all bits are set to 0 too. By setting var.bits.bit0 = 1; you change var.value as well; whether that yields 0x0001 or 0x8000 (binary 1000000000000000) depends on whether the implementation maps bit0 to the least or the most significant bit of the value.
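If you want to know which of the two mappings your compiler uses, a quick non-portable probe along these lines will tell you (the union here is a stripped-down sketch of var above):
#include <stdio.h>

int main(void)
{
    union {
        unsigned short value;
        struct {
            unsigned char bit0 : 1;   /* only the first bit matters for the probe */
        } bits;
    } probe;

    probe.value = 0;
    probe.bits.bit0 = 1;
    printf("bit0 maps to 0x%04X\n", (unsigned)probe.value);   /* typically 0x0001 or 0x8000 */
    return 0;
}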
If your intention is to change only the last 3 bits, you can simplify the structure to something more like this:
union {
short value;
struct {
unsigned short header : 13;
unsigned short bit13 : 1;
unsigned short bit14 : 1;
unsigned short bit15 : 1;
} bits;
} var;
Now you have var.bits.header, which is a 13-bit variable, and 3 other 1-bit variables you can play with.
But note that in C++, reading a union member other than the one most recently written is undefined behavior (unlike in C, where this kind of type punning is allowed), so for the best C-to-C++ portability you might prefer to use bitwise operations instead, as proposed by @nneonneo.