I have the following structs in a C program.
typedef struct
{
unsigned char msg_id : 8;
unsigned char msg_num : 8;
} Message_Header_Type;
typedef struct
{
Message_Header_Type header;
int msg[32];
} Raw_Message_Type;
What I'm seeing is that the header in the Raw_Message_Type is taking up 4 bytes, but I want it to take only 2. How would I go about doing this?
What you are looking for is struct packing. It is platform- and compiler-dependent; the details are not specified in the C standard.
With GCC, on a 32-bit platform, you could do the following to get a struct of size 2:
typedef struct __attribute__((__packed__))
{
unsigned char msg_id;
unsigned char msg_num;
} Message_Header_Type;
From the GCC documentation:
packed
This attribute, attached to an enum, struct, or union type definition, specifies that the minimum required memory be used to represent the type.
Message_Header_Type inside Raw_Message_Type still takes 2 bytes, as you would expect, but the compiler pads the structure for alignment reasons, so your Raw_Message_Type is 132 bytes long instead of the 130 bytes you expected. This is probably because 32-bit-aligned accesses are more efficient on your processor.
If you are using GCC, you can tell the compiler to align on 2-byte boundaries instead of the 4-byte boundaries it is currently using by means of #pragma pack.
typedef struct
{
unsigned char msg_id : 8;
unsigned char msg_num : 8;
} Message_Header_Type;
#pragma pack(push, 2) // save current alignment and set to 2-byte boundaries
typedef struct
{
Message_Header_Type header;
int msg[32];
} Raw_Message_Type;
#pragma pack(pop) // restore the previous alignment
Special packing like this is often necessary when using fixed-size structures in files, but be aware that there may be a slight performance penalty for using a different packing than what the processor prefers. In this specific case, all 32 of your msg[] ints are now 2 bytes off from the preferred alignment for your platform.
C implementations are free to insert padding into struct layouts between members, at the end, or both. It is common for them to do so for alignment purposes. It is also common for compilers to provide a mechanism to influence or override the padding, but details are necessarily implementation-specific.
With GCC, for example, you could apply the packed attribute to both your structs:
typedef struct __attribute__((__packed__))
{
unsigned char msg_id : 8;
unsigned char msg_num : 8;
} Message_Header_Type;
typedef struct __attribute__((__packed__))
{
Message_Header_Type header;
int msg[32];
} Raw_Message_Type;
Some other implementations borrow GCC's approach; some use pragmas for the same purpose; and some provide other mechanisms. You'll need to check your compiler's documentation.
Structure objects are not guaranteed to be the size of the sum of the bits used in member bit-fields; padding may be added.
In your case, though, Raw_Message_Type has another member, int msg[32];. It is logical for the compiler to use two more bytes for alignment so that msg[0] can be aligned to a 4-byte boundary.
There is a good chance that
Message_Header_Type obj;
printf("Size : %zu\n",sizeof(obj));
would give you 2 as the result.
In my code, I have something like this:
#include <stdint.h>
typedef struct __attribute__((packed)) {
uint8_t test1;
uint16_t test2;
} test_struct_t;
test_struct_t test_struct;
int main(void)
{
uint32_t *ptr = (uint32_t*) &test_struct;
return 0;
}
When I compile this using arm-none-eabi-gcc, I get the warning
.\test.c:11:2: warning: converting a packed 'test_struct_t' pointer
(alignment 1) to a 'uint32_t' {aka 'long unsigned int'} pointer
(alignment 4) may result in an unaligned pointer value
[-Waddress-of-packed-member]
Can anyone tell me why this is happening? Taking the address of a packed struct member is of course dangerous. But the whole struct itself should always be aligned, shouldn't it?
There is an answer in the comments, but since its author didn't post it, I take the liberty of posting it myself. All the credit is due to @Clifford.
By default, when a struct is packed, compilers also change its alignment to 1 byte. However, for your case you need the struct to be both packed and aligned as a 32-bit unsigned integer. This can be done by changing the packing attribute as follows:
#include <stdint.h>
struct __attribute__((packed, aligned(sizeof(uint32_t)))) TestStruct {
uint8_t test1;
uint16_t test2;
};
struct TestStruct test_struct;
uint32_t* p = (uint32_t*)(&test_struct);
This compiles for the ARM platform without any warnings.
In my experience, "packed" structs are almost always a bad idea. They don't always do what people think they do, and they may do other things as well. Depending on the compiler, target processor, options, and so on, you might find the compiler generating code that uses multiple byte accesses for things you expect to be single 16-bit or 32-bit accesses.
Hardware registers are always going to be properly aligned on a microcontroller. (Bitfields may be a different matter, but you are not using bitfields here.) But there might be gaps or padding.
And the whole idea of trying to access this using a pointer to a uint32_t is wrong. Don't access data via pointer casts like this - in fact, if you see a pointer cast at all, be highly suspicious.
So how do you get a structure that matches a hardware structure exactly? You write it out explicitly, and you use compile-time checks to be sure:
#pragma GCC diagnostic error "-Wpadded"
struct TestStruct {
uint8_t test1;
uint8_t padding;
uint16_t test2;
};
_Static_assert(sizeof(struct TestStruct) == 4, "Size check");
The padding is explicit. Any mistakes will be caught by the compiler.
What if you really, really want an unaligned 16-bit field in the middle here, and you haven't made a mistake in reading the datasheets? Use bitfields:
#pragma GCC diagnostic error "-Wpadded"
struct TestStruct2 {
uint32_t test1 : 8;
uint32_t test2 : 16;
uint32_t padding : 8;
};
_Static_assert(sizeof(struct TestStruct2) == 4, "Size check");
Put the padding in explicitly. Tell the compiler to complain about missing padding, and also check the size. The compiler doesn't charge for the extra microsecond of work.
And what if you really, really, really need to access this as a uint32_t? Use a union for type-punning (valid in C, though not in C++):
union TestUnion {
uint32_t raw;
struct TestStruct2 s;
};
Your packed structure has a size of 3 bytes, and there can be no padding in it. Thus, if we were to create an array of such structures, with the first element having a 4-byte-aligned address, then, by the definition of arrays (contiguous memory), the second element would be three bytes (sizeof(test_struct_t)) from that. The second element would therefore have only single-byte alignment, so the alignment requirement of your structure is, by deduction, one byte.
On your ARM platform, a uint32_t requires 4-byte alignment, hence the warning.
I'm studying the basics of the C language. I arrived at the chapter on structures with bit-fields. The book shows an example of a struct with two different types of data: several bools and several unsigned ints.
The book states that the structure has a size of 16 bits and that, without padding, it would measure only 10 bits.
This is the structure that the book uses in the example:
#include <stdio.h>
#include <stdbool.h>
struct test{
bool opaque : 1;
unsigned int fill_color : 3;
unsigned int : 4;
bool show_border : 1;
unsigned int border_color : 3;
unsigned int border_style : 2;
unsigned int : 2;
};
int main(void)
{
struct test Test;
printf("%zu\n", sizeof(Test));
return 0;
}
Why, on my compiler, does the exact same structure instead measure 16 bytes (rather than bits) with padding, and 16 bytes without padding?
I'm using
GCC (tdm-1) 4.9.2 compiler;
Code::Blocks as IDE.
Windows 7 64 Bit
Intel CPU 64 bit
The Microsoft ABI lays out bitfields in a different way than GCC normally does it on other platforms. You can choose to use the Microsoft-compatible layout with the -mms-bitfields option, or disable it with -mno-ms-bitfields. It's likely that your GCC version uses -mms-bitfields by default.
According to the documentation, when -mms-bitfields is enabled:
Every data object has an alignment requirement. The alignment requirement for all data except structures, unions, and arrays is either the size of the object or the current packing size (specified with either the aligned attribute or the pack pragma), whichever is less. For structures, unions, and arrays, the alignment requirement is the largest alignment requirement of its members. Every object is allocated an offset so that:
offset % alignment_requirement == 0
Adjacent bit-fields are packed into the same 1-, 2-, or 4-byte allocation unit if the integral types are the same size and if the next bit-field fits into the current allocation unit without crossing the boundary imposed by the common alignment requirements of the bit-fields.
Since bool and unsigned int have different sizes, they are packed and aligned separately, which increases the struct size substantially. The alignment of unsigned int is 4 bytes, and having to realign three times in the middle of the struct leads to a 16 byte total size.
You can get the same behavior as the book by changing bool to unsigned int, or by specifying -mno-ms-bitfields (though then you can't interoperate with code compiled by Microsoft compilers).
Note that the C standard does not specify how bitfields are laid out. So what your book says may be true for some platforms but not for all of them.
Taken as describing the provisions of the C language standard, your text makes unjustified claims. Specifically, not only does the standard not say that unsigned int is the basic layout unit of structures of any kind, it explicitly disclaims any requirement on the size of the storage units in which bitfield representations are stored:
An implementation may allocate any addressable storage unit large enough to hold a bit-field.
(C2011, 6.7.2.1/11)
The text also makes assumptions about padding that are not supported by the standard. A C implementation is free to include an arbitrary amount of padding after any or all elements and bitfield storage units of a struct, including the last. Implementations typically use this freedom to address data alignment considerations, but C does not limit padding to that use. This is altogether separate from the unnamed bitfields that your text refers to as "padding".
I guess the book should be commended, however, for avoiding the distressingly common misconception that C requires the declared data type of a bit field to have anything to do with the size of the storage unit(s) in which its representation resides. The standard makes no such association.
Why on my compiler instead does the exact same structure measure 16 bytes (rather than bits) with padding and 16 bytes without padding?
To cut the text as much slack as possible, it does distinguish between the number of bits of data occupied by members (16 bits total, 6 belonging to unnamed bitfields) and the overall size of instances of the struct. It seems to be asserting that the overall structure will be the size of an unsigned int, which apparently is 32 bits on the system it is describing, and that would be the same for both versions of the struct.
In principle, your observed sizes could be explained by your implementation using a 128-bit storage unit for the bitfields. In practice, it likely uses one or more smaller storage units, so that some of the extra space in each struct is attributable to implementation-provided padding, such as I touch on above.
It is very common for C implementations to impose a minimum size on all structure types, and therefore to pad representations to that size when necessary. Often this size matches the strictest alignment requirement of any data type supported by the system, though that's, again, an implementation consideration, not a requirement of the language.
Bottom line: only by relying on implementation details and / or extensions can you predict the exact size of a struct, regardless of the presence or absence of bitfield members.
The C standard doesn't describe all the details of how variables are placed in memory. This leaves room for optimization depending on the platform used.
To give yourself an idea of how things are laid out in memory, you can do this:
#include <stdio.h>
#include <stdbool.h>
struct test{
bool opaque : 1;
unsigned int fill_color : 3;
unsigned int : 4;
bool show_border : 1;
unsigned int border_color : 3;
unsigned int border_style : 2;
unsigned int : 2;
};
int main(void)
{
struct test Test = {0};
int i;
printf("%zu\n", sizeof(Test));
unsigned char* p;
p = (unsigned char*)&Test;
for(i=0; i<sizeof(Test); ++i)
{
printf("%02x", *p);
++p;
}
printf("\n");
Test.opaque = true;
p = (unsigned char*)&Test;
for(i=0; i<sizeof(Test); ++i)
{
printf("%02x", *p);
++p;
}
printf("\n");
Test.fill_color = 3;
p = (unsigned char*)&Test;
for(i=0; i<sizeof(Test); ++i)
{
printf("%02x", *p);
++p;
}
printf("\n");
return 0;
}
Running this on ideone (https://ideone.com/wbR5tI) I get:
4
00000000
01000000
07000000
So I can see that opaque and fill_color are both in the first byte.
Running exactly the same code on a Windows machine (using gcc) gives:
16
00000000000000000000000000000000
01000000000000000000000000000000
01000000030000000000000000000000
So here I can see that opaque and fill_color are not both in the first byte; opaque appears to take up 4 bytes.
This explains why you get 16 bytes in total: each bool takes 4 bytes, and each group of bit-fields in between and after takes another 4 bytes.
To my surprise there seems to be a difference between some GCC 4.9.2 online compilers. First, this is my code:
#include <stdio.h>
#include <stdbool.h>
struct test {
bool opaque : 1;
unsigned int fill_color : 3;
unsigned int : 4;
bool show_border : 1;
unsigned int border_color : 3;
unsigned int border_style : 2;
unsigned int : 2;
};
struct test_packed {
bool opaque : 1;
unsigned int fill_color : 3;
unsigned int : 4;
bool show_border : 1;
unsigned int border_color : 3;
unsigned int border_style : 2;
unsigned int : 2;
} __attribute__((packed));
int main(void)
{
struct test padding;
struct test_packed no_padding;
printf("with padding: %zu bytes = %zu bits\n", sizeof(padding), sizeof(padding) * 8);
printf("without padding: %zu bytes = %zu bits\n", sizeof(no_padding), sizeof(no_padding) * 8);
return 0;
}
And now, results from different compilers.
GCC 4.9.2 from WandBox:
with padding: 4 bytes = 32 bits
without padding: 2 bytes = 16 bits
GCC 4.9.2 from http://cpp.sh/:
with padding: 4 bytes = 32 bits
without padding: 2 bytes = 16 bits
BUT
GCC 4.9.2 from theonlinecompiler.com:
with padding: 16 bytes = 128 bits
without padding: 16 bytes = 128 bits
(to compile cleanly you need to change %zu to %u)
EDIT
#interjay's answer might explain this. When I added -mms-bitfields to GCC 4.9.2 from WandBox, I got this:
with padding: 16 bytes = 128 bits
without padding: 16 bytes = 128 bits
Before the author defines the struct, he says he wants to divide the bit-fields into two bytes: one byte containing the bit-fields for the fill-related information and one byte for the border-related information.
To achieve that, he inserts a few unused bits (an unnamed bit-field):
unsigned int : 4; // padding of the first byte
He also pads the second byte, though there is no need for that.
So before the padding there would be 10 bits in use, and after the padding there are 16 bits defined (but not all of them in use).
Note: the author uses bool to indicate a 1/0 field, assuming the C99 _Bool type is aliased to bool. But it seems some compilers get a bit confused here; replacing bool with unsigned int would solve it. From C99:
6.3.1.1: The following may be used in an expression wherever an int or unsigned int may be used:
— A bit-field of type _Bool, int, signed int, or unsigned int.
You are completely misinterpreting what the book says.
There are 16 bits worth of bit fields declared. 6 bits are unnamed fields that cannot be used for anything - that's the padding mentioned. 16 bits minus 6 bits equals 10 bits. Not counting the padding fields, the struct has 10 useful bits.
How many bytes the struct occupies depends on the quality of the compiler. Apparently you ran into a compiler that doesn't pack bool bit-fields in a struct: it uses 4 bytes for a bool plus some memory for bit-fields plus struct padding (4 bytes total), another 4 bytes for the second bool plus more bit-fields plus padding (another 4 bytes each), adding up to 16 bytes. That's rather sad, really: this struct could quite reasonably be two bytes.
There have historically been two common ways of interpreting the types of bitfield elements:
Examine whether the type is signed or unsigned, but ignore distinctions
between "char", "short", "int", etc. in deciding where an element should be
placed.
Unless a bitfield is preceded by another with the same type, or the
corresponding signed/unsigned type, allocate an object of that type and
place the bitfield within it. Place following bitfields with the same
type in that object if they fit.
I think the motivation behind #2 was that on a platform where 16-bit values need to be word-aligned, a compiler given something like:
struct foo {
char x; // Not a bitfield
someType f1 : 1;
someType f2 : 7;
};
might be able to allocate a two-byte structure, with both fields being placed in the second byte, but if the structure had been:
struct foo {
char x; // Not a bitfield
someType f1 : 1;
someType f2 : 15;
};
it would be necessary that all of f2 fit within a single 16-bit word, which would thus necessitate a padding byte before f1. Because of the Common Initial Sequence rule, f1 must be placed identically in those two structures, which would imply that if f1 could satisfy the Common Initial Sequence rule, it would need padding before it even in the first structure.
As it is, code which wants to allow the denser layout in the first case can say:
struct foo {
char x; // Not a bitfield
unsigned char f1 : 1;
unsigned char f2 : 7;
};
and invite the compiler to put both bitfields into a byte immediately following x. Since the type is specified as unsigned char, the compiler need not worry about the possibility of a 15-bit field. If the layout were:
struct foo {
char x; // Not a bitfield
unsigned short f1 : 1;
unsigned short f2 : 7;
};
and the intention was that f1 and f2 would sit in the same storage, then the compiler would need to place f1 in a way that could support a word-aligned access for its "bunkmate" f2. If the code were:
struct foo {
char x; // Not a bitfield
unsigned char f1 : 1;
unsigned short f2 : 15;
};
then f1 would be placed following x, and f2 in a word by itself.
Note that the C89 Standard added a syntax to force a layout that prevents f1 from being placed in a byte before the storage used for f2:
struct foo {
char x; // Not a bitfield
unsigned short : 0; // Forces next bitfield to start at a "short" boundary
unsigned short f1 : 1;
unsigned short f2 : 15;
};
The addition of the :0 syntax in C89 largely eliminates the need to have compilers regard changing types as forcing alignment, except when processing old code.
I have the following structure:
typedef struct LOG_EVENT
{
time_t time;
uint32_t count;
int32_t error_type;
uint16_t gm_state;
uint8_t gm_mode;
uint8_t mc_state;
} LOG_EVENT;
On a 32-bit system, the structure's strictest member alignment is 4 bytes, so the members are aligned on 4-byte boundaries. No padding is added in this case because the members happen to keep everything 4-byte aligned.
But this is not true on a 64-bit system, where time_t is 64 bits wide and the strictest alignment is therefore 8 bytes.
How can I change the alignment to 4 bytes? I want 4-byte alignment on the 64-bit system because I want to be sure no padding is inserted.
The GCC attributes page, https://gcc.gnu.org/onlinedocs/gcc-3.2/gcc/Variable-Attributes.html , says that "The aligned attribute can only increase the alignment; but you can decrease it by specifying packed as well."
I don't see the packed attribute taking any arguments.
Also, if I use the byte-alignment like below, would it cause any issues compared to 4-byte alignment:
typedef struct __attribute__ ((packed)) LOG_EVENT
{
time_t time;
uint32_t count;
int32_t error_type;
uint16_t gm_state;
uint8_t gm_mode;
uint8_t mc_state;
} LOG_EVENT;
#pragma pack(4) will set alignment to 4 bytes.
Note that this directive is not part of the standard; it was originally introduced in MSVC and later adopted by GCC for compatibility with Microsoft's compiler.
Also note that the sizes of types like time_t, size_t, and all pointer types will vary between those architectures. If your intention is to make a structure that is intelligible to applications running on both architectures, this will be a problem.
Also be aware that, beyond the ability to address more than 4 GB of memory, there is often little benefit in building a 64-bit application; if you don't need that much memory you can stick to 32-bit, there is no sin in that.
The most straightforward thing to do is ... not use time_t. Use fixed-width types only.
typedef struct LOG_EVENT
{
int32_t time;
uint32_t count;
int32_t error_type;
uint16_t gm_state;
uint8_t gm_mode;
uint8_t mc_state;
} LOG_EVENT;
You give up the ability to handle timestamps outside the range of signed 32-bit time_t (usually, but not always, 1901-12-13T20:45:52Z through 2038-01-19T03:14:07Z) but if you're trying to work with an on-disk record array with 32-bit timestamps, which is the most plausible explanation here, that's not an issue.
If I debug the following code then I see the size value is 12 (as expected).
#include <cstdint>
int main(int argc, char *argv[])
{
typedef struct __attribute__((__packed__)) { int8_t value; } time;
typedef struct __attribute__((__packed__)) {
uint8_t msg[8];
// time t1;
uint32_t count;
} theStruct;
theStruct s;
int size = sizeof(s);
return 0;
}
Interestingly, removing the comment at "time t1;", the value of size goes to 16. I was expecting 13.
I know (more or less) that this is explained by the data structure padding story...
But, is there some way to avoid this issue?
What to do in order to read size = 13?
There are some issues with MinGW's emulation of MSVC struct packing.
The workaround is to pass -mno-ms-bitfields flag to the compiler; this will cause it to use its own layout algorithm rather than attempt to emulate MSVC.
See also Struct packing and alignment with mingw (ARM but may be the same issue).
It is clearly an alignment problem, meaning it has nothing to do with the language itself and everything to do with the underlying platform.
If the platform knows (or thinks) that an int32_t needs an alignment of 4, it will add 3 bytes of padding after time.
Anyway, __attribute__((__packed__)) is non-standard C and will only be understood by GCC-compatible compilers (*). Worse, it leads to non-portable programs: according to this answer on another question, it would cause a bus error on a SPARC architecture because of a misaligned int32_t.
I do know that x86 (and derivative) architectures are now the most common, but other architectures still exist...
(*) according to Visual C++ equivalent of GCC's __attribute__ ((__packed__)), the MSVC equivalent is #pragma pack(push, 1)
Supposing I have a code like this:
#include <stdio.h>
#include <stdint.h>
int main(int argc, char *argv[]) {
typedef struct{
uint16_t x : 9;
uint8_t y : 7;
} z;
printf("sizeof(z) = %lu\n",sizeof(z));
}
I get different results: clang on Mac gives 2, and someone told me that on Windows it returns 3. Not sure if I understand it well, but I see that while the first compiler compresses the struct to 9+7 = 16 bits, the other uses 16 bits of uint16_t and 8 of uint8_t. Could you advise?
Not sure if I understand it well, but I see that while first compiler compresses the struct to 9+7 = 16 bits, the other uses 16 bits of uint16_t and 8 of uint8_t. Could you advise?
The first thing to remember about bit-fields is this phrase from K&R, 2nd edition:
(6.9 Bit-fields) "Almost everything about fields is implementation-dependent."
It includes padding, alignment and bit endianness.
There are two possible problems that might be occurring:
Bit-fields are a very loosely standardized part of the ANSI C specification: the compiler chooses how bits are allocated within the bit-field container. When the exact layout matters, it is often better to avoid bit-fields in structures and use shifts and masks with #define or enum constants instead.
The second possible issue is that the compiler lays out the structure in memory by adding padding so that each member is aligned to its own size. It is good practice to order the elements of the struct according to their size:
typedef struct{
uint8_t x : 7;
uint16_t y : 9;
} z;