Bitfields and Unions in C giving problems - c

I am implementing a radio standard and have hit a problem with unions in structure and memory size. In the below example I need this structure to located in a single byte of memory (as per the radio standard) but its currently giving me a size of 2 bytes. After much digging I understand that its because the Union's "size" is byte rather than 3 bits...but havent worked out a way around this.
I have looked at:
Bitfields in C with struct containing union of structs; and
Will this bitfield work the way I expect?
But neither seem to give me a solution.
Any ideas?
Thanks!
#ifdef WIN32
#pragma pack(push)
#pragma pack(1)
#endif
typedef struct three_bit_struct
{
unsigned char bit_a : 1;
unsigned char bit_b : 1;
unsigned char bit_c : 1;
}three_bit_struct_T;
typedef union
{
three_bit_struct_T three_bit_struct;
unsigned char another_three_bits : 3;
}weird_union_T;
typedef struct
{
weird_union_T problem_union;
unsigned char another_bit : 1;
unsigned char reserved : 4;
}my_structure_T;
int _tmain(int argc, _TCHAR* argv[])
{
int size;
size = sizeof(my_structure_T);
return 0;
}
#ifdef WIN32
#pragma pack(pop)
#endif

The problem is that the size of three_bit_struct_T will be rounded up to the nearest byte* regardless of the fact that it only contains three bits in its bitfield. A struct simply cannot have a size which is part-of-a-byte. So when you augment it with the extra fields in my_structure_T, inevitably the size will spill over into a second byte.
To cram all that stuff into a single byte, you'll have to put all the bitfield members in the outer my_structure_T rather than having them as an inner struct/union.
I think the best you can do is have the whole thing as a union.
typedef struct
{
unsigned char bit_a : 1;
unsigned char bit_b : 1;
unsigned char bit_c : 1;
unsigned char another_bit : 1;
unsigned char reserved : 4;
} three_bit_struct_T;
typedef struct
{
unsigned char another_three_bits : 3;
unsigned char another_bit : 1;
unsigned char reserved : 4;
} another_three_bit_struct_T;
typedef union
{
three_bit_struct_T three_bit_struct;
another_three_bit_struct_T another_three_bit_struct;
} my_union_T;
(*) or word, depending on alignment/packing settings.

Two good advices: never use struct/union for data protocols, and never use bit-fields anywhere in any situation.
The best way to implement this is through bit masks and bit-wise operators.
#define BYTE_BIT7 0x80u
uint8_t byte;
byte |= BYTE_BIT_7; // set bit to 1
byte &= ~BYTE_BIT_7; // set bit to 0
if(byte & BYTE_BIT_7) // check bit value
This code is portable to every C compiler in the world and also to C++.

Related

I need to write a C program that uses a Union structure to set the bits of an int

Here are the instructions:
Define a union data struct WORD_T for a uint16_t integer so that a value can be assigned to a WORD_T integer in three ways:
(1) To assign the value to each bit of the integer,
(2) To assign the value to each byte of the integer,
(3) To assign the value to the integer directly.
I know I need to do something different with that first struct, but I'm pretty lost in general. Here is what I have:
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/types.h>
#define BIT(n) (1 << n)
#define BIT_SET(var, mask) (n |= (mask) )
#define BIT_CLEAR(var, mask) (n &= ~(mask) )
#define BIT_FLIP(var, mask) (n ^= (mask) )
union WORD_T{
struct{
uint16_t integerVar:16
};
struct{
unsigned bit1:0;
unsigned bit2:0;
unsigned bit3:0;
unsigned bit4:0;
unsigned bit5:0;
unsigned bit6:0;
unsigned bit7:0;
unsigned bit8:0;
unsigned bit9:0;
unsigned bit10:0;
unsigned bit11:0;
unsigned bit12:0;
unsigned bit13:0;
unsigned bit14:0;
unsigned bit15:0;
unsigned bit16:0;
};
void setIndividualBit(unsigned value, int bit) {
mask = BIT(value) | BIT(bit);
BIT_SET(n, mask);
}
};
The most obvious problem is that :0 means "a bit-field of zero bits" which doesn't make any sense. You should change this to 1 and then you can assign individual bits through code like the_union.the_struct.bit1 = 1;.
This format is an alternative to the bit set/clear macros you wrote.
Your union should look like:
typedef union
{
uint16_t word;
uint8_t byte[2];
struct{
uint16_t bit1:1;
...
However. This is a really bad idea and your teacher should know better than to give such assignments. Bit-fields in C are a bag of worms - they are very poorly specified by the standard. Problems:
You can't know which bit that is the MSB, it is not defined.
You can't know if there will be padding bits/bytes somewhere.
You can't even use uint16_t or uint8_t because bit-fields are only specified to work with int.
And so on. In addition you have to deal with endianness and alignment. See Why bit endianness is an issue in bitfields?.
Essentially, code using bit-fields like this will rely heavily on the specific compiler. It will be completely non-portable.
All of these problems could be avoided by dropping the bit-field and use bitwise operators on a uint16_t instead, like you did in your macros. With bitwise operators only, your code will turn deterministic and 100% portable. You can even use them to dodge endianess, by using bit shifts.
Here's the union definition:
union WORD_T
{
uint16_t word;
struct
{
uint8_t byte0;
uint8_t byte1;
};
struct
{
uint8_t bit0:1;
uint8_t bit1:1;
uint8_t bit2:1;
uint8_t bit3:1;
uint8_t bit4:1;
uint8_t bit5:1;
uint8_t bit6:1;
uint8_t bit7:1;
uint8_t bit8:1;
uint8_t bit9:1;
uint8_t bit10:1;
uint8_t bit11:1;
uint8_t bit12:1;
uint8_t bit13:1;
uint8_t bit14:1;
uint8_t bit15:1;
};
};
To assign to the individual components, do something similar to the following:
union WORD_T data;
data.word = 0xF0F0; // (3) Assign to entire word
data.byte0 = 0xAA; // (2) Assign to individual byte
data.bit0 = 0; // (1) Assign to individual bits
data.bit1 = 1;
data.bit2 = 1;
data.bit2 = 0;

atomic variable with bitflags from struct

I have a single atomic variable which I read and write to from with the C11 atomics.
Now I have a struct which contains every flag like this:
typedef struct atomic_container {
unsigned int flag1 : 2;
unsigned int flag2 : 2;
unsigned int flag3 : 2;
unsigned int flag4 : 2;
unsigned int progress : 8;
unsigned int reserved : 16;
}atomic_container;
then I use a function to convert this struct to a unsigned int with a 32 bit width using bitshifts. and then write to it with the atomic functions.
I wonder if I can directly write this struct atomically rather than first bitshifting it into a unsigned int. This does seem to work but I am worried this might be implementation defined and could lead to undefined behavior. the struct in question is 32 bit wide exactly as I want it.

using memcpy for structs

I have a problem when using memcpy on a struct.
Consider the following struct
struct HEADER
{
unsigned int preamble;
unsigned char length;
unsigned char control;
unsigned int destination;
unsigned int source;
unsigned int crc;
}
If I use memcpy to copy data from a receive buffer to this struct the copy is OK, but if i redeclare the struct to the following :
struct HEADER
{
unsigned int preamble;
unsigned char length;
struct CONTROL control;
unsigned int destination;
unsigned int source;
unsigned int crc;
}
struct CONTROL
{
unsigned dir : 1;
unsigned prm : 1;
unsigned fcb : 1;
unsigned fcb : 1;
unsigned function_code : 4;
}
Now if I use the same memcpy code as before, the first two variables ( preamble and length ) are copied OK. The control is totally messed up, and last three variables are shifted one up, aka crc = 0, source = crc, destination = source...
ANyone got any good suggestions for me ?
Do you know that the format in the receive buffer is correct, when you add the control in the middle?
Anyway, your problem is that bitfields are the wrong tool here: you can't depend on the layout in memory being anything in particular, least of all the exact same one you've chosen for the serialized form.
It's almost never a good idea to try to directly copy structures to/from external storage; you need proper serialization. The compiler can add padding and alignment between the fields of a structure, and using bitfields makes it even worse. Don't do this.
Implement proper serialization/deserialization functions:
unsigned char * header_serialize(unsigned char *put, const struct HEADER *h);
unsigned char * header_deserialize(unsigned char *get, struct HEADER *h);
That go through the structure and read/write as many bytes as you feel are needed (possibly for each field):
static unsigned char * uint32_serialize(unsigned char *put, uint32_t x)
{
*put++ = (x >> 24) & 255;
*put++ = (x >> 16) & 255;
*put++ = (x >> 8) & 255;
*put++ = x & 255;
return put;
}
unsigned char * header_serialize(unsigned char *put, const struct HEADER *h)
{
const uint8_t ctrl_serialized = (h->control.dir << 7) |
(h->control.prm << 6) |
(h->control.fcb << 5) |
(h->control.function_code);
put = uint32_serialize(put, h->preamble);
*put++ = h->length;
*put++ = ctrl_serialized;
put = uint32_serialize(put, h->destination);
put = uint32_serialize(put, h->source);
put = uint32_serialize(put, h->crc);
return put;
}
Note how this needs to be explicit about the endianness of the serialized data, which is something you always should care about (I used big-endian). It also explicitly builds a single uint8_t version of the control fields, assuming the struct version was used.
Also note that there's a typo in your CONTROL declaration; fcb occurs twice.
Using struct CONTROL control; instead of unsigned char control; leads to a different alignment inside the struct and so filling it with memcpy() produces a different result.
Memcpy copies the values of bytes from the location pointed by source directly to the memory block pointed by destination.
The underlying type of the objects pointed by both the source and destination pointers are irrelevant for this function; The result is a binary copy of the data.
So if there is any structure padding then you will have messed up results.
Check sizeof(struct CONTROL) -- I think it would be 2 or 4 depending on the machine. Since you are using unsigned bitfields (and unsigned is shorthand of unsigned int), the whole structure (struct CONTROL) would take at least the size of unsigned int -- i.e. 2 or 4 bytes.
And, using unsigned char control takes 1 byte for this field. So, definitely there should be mismatch staring with the control variable.
Try rewriting the struct control as below:-
struct CONTROL
{
unsigned char dir : 1;
unsigned char prm : 1;
unsigned char fcb : 1;
unsigned char fcb : 1;
unsigned char function_code : 4;
}
The clean way would be to use a union, like in.:
struct HEADER
{
unsigned int preamble;
unsigned char length;
union {
unsigned char all;
struct CONTROL control;
} uni;
unsigned int destination;
unsigned int source;
unsigned int crc;
};
The user of the struct can then choose the way he wants to access the thing.
struct HEADER thing = {... };
if (thing.uni.control.dir) { ...}
or
#if ( !FULL_MOON ) /* Update: stacking of bits within a word appears to depend on the phase of the moon */
if (thing.uni.all & 1) { ... }
#else
if (thing.uni.all & 0x80) { ... }
#endif
Note: this construct does not solve endianness issues, that will need implicit conversions.
Note2: and you'll have to check the bit-endianness of your compiler, too.
Also note that bitfields are not very useful, especially if the data goes over the wire, and the code is expected to run on different platforms, with different alignment and / or endianness. Plain unsigned char or uint8_t plus some bitmasking yields much cleaner code. For example, check the IP stack in the BSD or linux kernels.

Is it valid to use bit fields with union?

I have used bit field with a structure like this,
struct
{
unsigned int is_static: 1;
unsigned int is_extern: 1;
unsigned int is_auto: 1;
} flags;
Now i wondered to see if this can be done with a union so i modified the code like,
union
{
unsigned int is_static: 1;
unsigned int is_extern: 1;
unsigned int is_auto: 1;
} flags;
I found the bit field with union works but all those fields in the union are given to a single bit as I understood from output. Now I am seeing it is not erroneous to use bit fields with union, but it seems to me that using it like this is not operationally correct. So what is the answer - is it valid to use bit field with union?
It is valid but as you found out, not useful the way you have done it there.
You might do something like this so you can reset all the bits at the same time using flags.
union {
struct {
unsigned int is_static: 1;
unsigned int is_extern: 1;
unsigned int is_auto: 1;
};
unsigned int flags;
};
Or you might do something like this:
union {
struct {
unsigned int is_static: 1;
unsigned int is_extern: 1;
unsigned int is_auto: 1;
};
struct {
unsigned int is_ready: 1;
unsigned int is_done: 1;
unsigned int is_waiting: 1;
};
};
You are given a gun and bullets. Is it okay to shoot your self in foot with it? Of course not, but nobody can stop you from doing this if you want to.
My point is, just like gun and bullets, union and bit fields are tools and they have their purpose, uses and "abuses". So using bitfields in union, as you have written above, is perfectly valid C but a useless piece of code. All the fields inside union share same memory so all the bitfields you mention are essentially same flag as they share same memory.
Here is an example on how to use bit fields with unions. I'm also showing how to arrange for MSB. In pictures it would look something like this:
MSB LSB
7 0
+------+-------+
| five | three |
| bits | bits |
+------+-------+
// A struct tag definition of
// two bit fields in an 8-bit register
struct fields_tag {
// LSB
unsigned int five:5;
unsigned int three:3;
// MSB
};
// here is a tag and typedef for less typing
// to modify the 8-bit value as a whole
// and read in parts.
typedef union some_reg_tag {
uint8_t raw;
struct fields_tag fields;
} some_reg_t;
Here is how to use the bit fields with Arduino
some_reg_t a_register;
a_register.raw = 0xC2; // assign using raw field.
Serial.print("some reg = "); // dump entire register
Serial.println(a_register.raw, HEX); // dump register by field
Serial.print("some reg.three = ");
Serial.println(a_register.fields.three, HEX);
Serial.print("some reg.five = ");
Serial.println(a_register.fields.five, HEX);
Here is the output showing the results
some reg = C2
some reg.three = 6
some reg.five = 2

Pointer casting problem with struct array member

I've run across this source in a legacy code base and I don't really know why exactly it behaves the way it does.
In the following code, the pData struct member either contains the data or a pointer to the real data in shared memory. The message is sent using IPC (msgsnd() and msgrcv()). Using the pointer casts (that are currently commented out), it fails using GCC 4.4.1 on an ARM target, the member uLen gets modified. When using memcpy() and everything works as expected. I can't really see what is wrong with the pointer casting. What is wrong here?
typedef struct {
long mtype;
unsigned short uRespQueue;
unsigned short uID;
unsigned short uLen;
unsigned char pData[8000];
} message_t;
// changing the pointer in the struct
{
unsigned char *pData = <some_pointer>;
#if 0
*((unsigned int *)pMessage->pData) = (unsigned int)pData;
#else
memcpy(pMessage->pData, &pData, sizeof(unsigned int));
#endif
}
// getting the pointer out
{
#if 0
unsigned char *pData; (unsigned char *)(*((unsigned int *)pMessage->pData));
#else
unsigned char *pData;
memcpy(&pData, pMessage->pData, sizeof(int));
#endif
}
I suspect it's an alignment problem and either GCC or the processor is trying to compensate. The structure is defined as:
typedef struct {
long mtype;
unsigned short uRespQueue;
unsigned short uID;
unsigned short uLen;
unsigned char pData[8000];
} message_t;
Assuming normal alignment restrictions and a 32-bit processor, the offsets of each field are:
mtype 0 (alignment 4)
uRespQueue 4 (alignment 2)
uID 6 (alignment 2)
uLen 8 (alignment 2)
pData 10 (alignment 1)
On all but the most recent versions of the ARM processor, memory access must be aligned on the ARM processor and with the casting:
*((unsigned int *)pMessage->pData) = (unsigned int)pData;
you are attempting to write a 32-bit value on a misaligned address. To correct the alignment, the address appears to have truncated the LSB's of the address to have the proper alignment. Doing so happened to overlap with the uLen field causing the problem.
To be able to handle this correctly, you need to make sure that you write the value to a properly aligned address. Either offset the pointer to align it or make sure pData is aligned to be able to handle 32-bit data. I would redefine the structure to align the pData member for 32-bit access.
typedef struct {
long mtype;
unsigned short uRespQueue;
unsigned short uID;
unsigned short uLen;
union { /* this will add 2-bytes of padding */
unsigned char *pData;
unsigned char rgData[8000];
};
} message_t;
The structure should still occupy the same amount of bytes since it has a 4-byte alignment due to the mtype field.
Then you should be able to access the pointer:
unsigned char *pData = ...;
/* setting the pointer */
pMessage->pData = pData;
/* getting the pointer */
pData = pMessage->pData;
That is a very nasty thing to do (the thing that's compiled out). You're trying basically to hack the code, and instead of using the data copy in the message (in the provided 8000 bytes for it), you try to put a pointer, and pass it through IPC.
The main issue is sharing memory between processes. Who knows what happens to that pointer after you send it? Who knows what happens to the data it points to? That's a very bad habbit to send out a pointer to data that is not under your control (i.e.: not protected/properly shared).
Another thing that might happen, and is probably what you're actually talking about, is the alignment. The array is of char's, the previous member in the struct is short, the compiler might attempt packing them. Recasting char[] to int * means that you take memory area and represent it as something else, without telling the compiler. You're stomping over the uLen by the cast.
memcopy is the proper way to do it.
The point here is the code "int header = (((int)(txUserPtr) - 4))"
Illustration of UserTypes and struct pointer casting is great of help!
typedef union UserTypes
{
SAUser AUser;
BUser BUser;
SCUser CUser;
SDUser DUser;
} UserTypes;
typedef struct AUser
{
int userId;
int dbIndex;
ChannelType ChanType;
} AUser;
typedef struct AUser
{
int userId;
int dbIndex;
ChannelType ChanType;
} AUser;
typedef struct BUser
{
int userId;
int dbIndex;
ChannelType ChanType;
} BUser;
typedef struct CUser
{
int userId;
int dbIndex;
ChannelType ChanType;
} CUser;
typedef struct DUser
{
int userId;
int dbIndex;
ChannelType ChanType;
} DUser;
//this is the function I want to test
void Fun(UserTypes * txUserPtr)
{
int header = (*((int*)(txUserPtr) - 4));
//the problem is here
//how should i set incoming pointer "txUserPtr" so that
//Fun() would skip following lines.
// I don't want to execute error()
if((header & 0xFF000000) != (int)0xAA000000)
{
error("sth error\n");
}
/*the following is the rest */
}

Resources