Supposing I have a code like this:
#include <stdio.h>
#include <stdint.h>
int main(int argc, char *argv[]) {
    typedef struct {
        uint16_t x : 9;
        uint8_t  y : 7;
    } z;
    printf("sizeof(z) = %lu\n", sizeof(z));
}
I get different results: clang on the Mac prints 2, and someone told me that on Windows it prints 3. I'm not sure I understand it well, but it seems that the first compiler compresses the struct to 9 + 7 = 16 bits, while the other uses 16 bits for the uint16_t and 8 bits for the uint8_t. Could you advise?
The first thing to remember about bit-fields is this phrase from K&R, 2nd edition:
(6.9 Bit-fields) "Almost everything about fields is implementation-dependent."
That includes padding, alignment, and bit endianness.
There are two possible problems that might be occurring:
Bit-fields are a very poorly standardized part of the ANSI C specification: the compiler chooses how bits are allocated within the bit-field container. You should avoid using them inside structures; instead you can use #define or enum masks (a mask-and-shift sketch follows the reordered struct below).
The second possible issue is that the compiler lays the structure out in memory by adding padding to ensure that each member is aligned to its own size. It is good practice to place the elements of the struct according to their size:
typedef struct {
    uint8_t  x : 7;
    uint16_t y : 9;
} z;
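As mentioned above, if the exact layout matters (for example for a wire format), the mask-and-shift approach sidesteps the implementation-defined behaviour entirely. A minimal sketch, assuming x is meant to occupy bits 0-8 and y bits 9-15 of one 16-bit word (the macro and function names are illustrative, not part of the question's code):

#include <stdint.h>

/* Assumed layout: x in bits 0..8, y in bits 9..15 of a single 16-bit word. */
#define Z_X_MASK   0x01FFu
#define Z_X_SHIFT  0
#define Z_Y_MASK   0x007Fu
#define Z_Y_SHIFT  9

static inline uint16_t z_pack(uint16_t x, uint16_t y) {
    return (uint16_t)(((x & Z_X_MASK) << Z_X_SHIFT) |
                      ((y & Z_Y_MASK) << Z_Y_SHIFT));
}

static inline uint16_t z_get_x(uint16_t z) { return (z >> Z_X_SHIFT) & Z_X_MASK; }
static inline uint16_t z_get_y(uint16_t z) { return (z >> Z_Y_SHIFT) & Z_Y_MASK; }

With this, the stored value is exactly 16 bits on every compiler, and the bit positions are spelled out rather than left to the implementation.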
In my code, I have something like this:
#include <stdint.h>
typedef struct __attribute__((packed)) {
    uint8_t test1;
    uint16_t test2;
} test_struct_t;
test_struct_t test_struct;
int main(void)
{
    uint32_t *ptr = (uint32_t*) &test_struct;
    return 0;
}
When I compile this using arm-none-eabi-gcc, I get the warning
.\test.c:11:2: warning: converting a packed 'test_struct_t' pointer
(alignment 1) to a 'uint32_t' {aka 'long unsigned int'} pointer
(alignment 4) may result in an unaligned pointer value
[-Waddress-of-packed-member]
Can anyone tell me why this is happening? Taking the address of a packed struct member is of course dangerous. But the whole struct itself should always be aligned, shouldn't it?
There is an answer in the comments, but since its author didn't post it as an answer, I take the liberty of posting it myself. All the credit is due to @Clifford.
By default, when a struct is packed, compilers also change the alignment of the struct to 1 byte. However, in your case you need the struct to be both packed and aligned like a 32-bit unsigned integer. This can be done by changing the packing attribute as follows:
#include <stdint.h>
struct __attribute__((packed, aligned(sizeof(uint32_t)))) TestStruct {
    uint8_t test1;
    uint16_t test2;
};
struct TestStruct test_struct;
int32_t* p = (int32_t*)(&test_struct);
This compiles for the ARM platform without any warnings.
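If you want to be certain those attributes actually took effect on your target, a cheap compile-time check can go right after the struct definition (a C11 sketch; with packed plus aligned(4), GCC also rounds sizeof up to 4):

_Static_assert(_Alignof(struct TestStruct) == _Alignof(uint32_t),
               "TestStruct must be as aligned as uint32_t");
_Static_assert(sizeof(struct TestStruct) == sizeof(uint32_t),
               "TestStruct expected to occupy exactly 4 bytes");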
In my experience, "packed" structs are almost always a bad idea. They don't always do what people think they do, and they might do other things as well. Depending on compilers, target processors, options, etc., you might find the compiler generating code that uses multiple byte accesses to things that you expect to be 16-bit or 32-bit accesses.
Hardware registers are always going to be properly aligned on a microcontroller. (Bitfields may be a different matter, but you are not using bitfields here.) But there might be gaps or padding.
And the whole idea of trying to access this using a pointer to a uint32_t is wrong. Don't access data via pointer casts like this - in fact, if you see a pointer cast at all, be highly suspicious.
So how do you get a structure that matches a hardware structure exactly? You write it out explicitly, and you use compile-time checks to be sure:
#pragma GCC diagnostic error "-Wpadded"
struct TestStruct {
    uint8_t test1;
    uint8_t padding;
    uint16_t test2;
};
_Static_assert(sizeof(struct TestStruct) == 4, "Size check");
The padding is explicit. Any mistakes will be caught by the compiler.
What if you really, really want an unaligned 16-bit field in the middle here, and you haven't made a mistake in reading the datasheets? Use bitfields:
#pragma GCC diagnostic error "-Wpadded"
struct TestStruct2 {
    uint32_t test1 : 8;
    uint32_t test2 : 16;
    uint32_t padding : 8;
};
_Static_assert(sizeof(struct TestStruct2) == 4, "Size check");
Put the padding in explicitly. Tell the compiler to complain about missing padding, and also check the size. The compiler doesn't charge for the extra microsecond of work.
And what if you really, really, really need to access this as a uint32_t? You use a union for type-punning (though not in C++):
union TestUnion {
    uint32_t raw;
    struct TestStruct2 s;
};
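Usage is then straightforward. Which bit-field lands in which bits of raw is still implementation-defined, so treat the exact numeric value as target-specific (an illustrative sketch, not part of the original answer):

void example(void)
{
    union TestUnion u = { .raw = 0 };

    u.s.test1 = 0x12;              /* write through the bit-field view      */
    u.s.test2 = 0x3456;
    uint32_t word = u.raw;         /* read the same bytes back as one value */

    u.raw = word ^ 0xFFFFFFFFu;    /* or write the raw word...              */
    unsigned t2 = u.s.test2;       /* ...and pick fields back out           */
    (void)t2;
}

In C, reading a union member other than the one last written is permitted for this kind of type-punning; in C++ it is not, which is why the answer above excludes C++.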
Your packed structure has a size of 3 bytes, and there can be no padding in it. Thus, if we were to create an array of such structures, with the first element having a 4-byte aligned address, then, by the definition of arrays (contiguous memory), the second element would be three bytes (sizeof(test_struct_t)) from that. The second element would therefore have only single-byte alignment; so the alignment requirement of your structure is, by deduction, one byte.
On your ARM platform, a uint32_t requires 4-byte alignment, hence the warning.
I'm studying the basics of the C language. I arrived at the chapter of structures with bit fields. The book shows an example of a struct with two different types of data: various bools and various unsigned ints.
The book declares that the structure has a size of 16 bits and that without using padding the structure would measure 10 bits.
This is the structure that the book uses in the example:
#include <stdio.h>
#include <stdbool.h>
struct test {
    bool opaque : 1;
    unsigned int fill_color : 3;
    unsigned int : 4;
    bool show_border : 1;
    unsigned int border_color : 3;
    unsigned int border_style : 2;
    unsigned int : 2;
};
int main(void)
{
    struct test Test;
    printf("%zu\n", sizeof(Test));
    return 0;
}
Why on my compiler instead does the exact same structure measure 16 bytes (rather than bits) with padding and 16 bytes without padding?
I'm using:
GCC (tdm-1) 4.9.2 compiler;
Code::Blocks as IDE;
Windows 7, 64-bit;
Intel 64-bit CPU.
The Microsoft ABI lays out bitfields in a different way than GCC normally does it on other platforms. You can choose to use the Microsoft-compatible layout with the -mms-bitfields option, or disable it with -mno-ms-bitfields. It's likely that your GCC version uses -mms-bitfields by default.
According to the documentation, when -mms-bitfields is enabled:
Every data object has an alignment requirement. The alignment requirement for all data except structures, unions, and arrays is either the size of the object or the current packing size (specified with either the aligned attribute or the pack pragma), whichever is less. For structures, unions, and arrays, the alignment requirement is the largest alignment requirement of its members. Every object is allocated an offset so that:
offset % alignment_requirement == 0
Adjacent bit-fields are packed into the same 1-, 2-, or 4-byte allocation unit if the integral types are the same size and if the next bit-field fits into the current allocation unit without crossing the boundary imposed by the common alignment requirements of the bit-fields.
Since bool and unsigned int have different sizes, they are packed and aligned separately, which increases the struct size substantially. The alignment of unsigned int is 4 bytes, and having to re-align three times in the middle of the struct leads to a 16-byte total size.
You can get the same behavior as the book by changing bool to unsigned int, or by specifying -mno-ms-bitfields (though this means you can't interoperate with code compiled by Microsoft compilers).
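For example, declaring every field as unsigned int lets adjacent bit-fields share a single allocation unit under the Microsoft rules, so the whole struct typically shrinks to one unsigned int (a sketch; the exact size is still implementation-defined):

struct test_uint {
    unsigned int opaque       : 1;
    unsigned int fill_color   : 3;
    unsigned int              : 4;
    unsigned int show_border  : 1;
    unsigned int border_color : 3;
    unsigned int border_style : 2;
    unsigned int              : 2;
};  /* all 16 data bits share one allocation unit; sizeof is typically 4 */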
Note that the C standard does not specify how bitfields are laid out. So what your book says may be true for some platforms but not for all of them.
Taken as describing the provisions of the C language standard, your text makes unjustified claims. Specifically, not only does the standard not say that unsigned int is the basic layout unit of structures of any kind, it explicitly disclaims any requirement on the size of the storage units in which bitfield representations are stored:
"An implementation may allocate any addressable storage unit large enough to hold a bit-field."
(C2011, 6.7.2.1/11)
The text also makes assumptions about padding that are not supported by the standard. A C implementation is free to include an arbitrary amount of padding after any or all elements and bitfield storage units of a struct, including the last. Implementations typically use this freedom to address data alignment considerations, but C does not limit padding to that use. This is altogether separate from the unnamed bitfields that your text refers to as "padding".
I guess the book should be commended, however, for avoiding the distressingly common misconception that C requires the declared data type of a bit field to have anything to do with the size of the storage unit(s) in which its representation resides. The standard makes no such association.
Why on my compiler instead does the exact same structure measure 16 bytes (rather than bits) with padding and 16 bytes without padding?
To cut the text as much slack as possible, it does distinguish between the number of bits of data occupied by members (16 bits total, 6 belonging to unnamed bitfields) and the overall size of instances of the struct. It seems to be asserting that the overall structure will be the size of an unsigned int, which apparently is 32 bits on the system it is describing, and that would be the same for both versions of the struct.
In principle, your observed sizes could be explained by your implementation using a 128-bit storage unit for the bitfields. In practice, it likely uses one or more smaller storage units, so that some of the extra space in each struct is attributable to implementation-provided padding, such as I touch on above.
It is very common for C implementations to impose a minimum size on all structure types, and therefore to pad representations to that size when necessary. Often this size matches the strictest alignment requirement of any data type supported by the system, though that's, again, an implementation consideration, not a requirement of the language.
Bottom line: only by relying on implementation details and / or extensions can you predict the exact size of a struct, regardless of the presence or absence of bitfield members.
The C standard doesn't describe every detail of how variables are placed in memory, which leaves room for platform-dependent optimization.
To give yourself an idea of how things are laid out in memory, you can do something like this:
#include <stdio.h>
#include <stdbool.h>

struct test {
    bool opaque : 1;
    unsigned int fill_color : 3;
    unsigned int : 4;
    bool show_border : 1;
    unsigned int border_color : 3;
    unsigned int border_style : 2;
    unsigned int : 2;
};

/* Print every byte of the object so we can see where each field ended up. */
static void dump(const struct test *t)
{
    const unsigned char *p = (const unsigned char *)t;
    for (size_t i = 0; i < sizeof *t; ++i)
        printf("%02x", p[i]);
    printf("\n");
}

int main(void)
{
    struct test Test = {0};

    printf("%zu\n", sizeof(Test));
    dump(&Test);

    Test.opaque = true;
    dump(&Test);

    Test.fill_color = 3;
    dump(&Test);

    return 0;
}
Running this on ideone (https://ideone.com/wbR5tI) I get:
4
00000000
01000000
07000000
So I can see that opaque and fill_color are both in the first byte.
Running exactly the same code on a Windows machine (using gcc) gives:
16
00000000000000000000000000000000
01000000000000000000000000000000
01000000030000000000000000000000
So here I can see that opaque and fill_color are not both in the first byte. It seems that opaque takes up 4 bytes.
This explains why you get 16 bytes in total: each bool takes 4 bytes, plus 4 bytes for the fields in between and 4 bytes for the fields after.
To my surprise there seems to be a difference between some GCC 4.9.2 online compilers. First, this is my code:
#include <stdio.h>
#include <stdbool.h>
struct test {
    bool opaque : 1;
    unsigned int fill_color : 3;
    unsigned int : 4;
    bool show_border : 1;
    unsigned int border_color : 3;
    unsigned int border_style : 2;
    unsigned int : 2;
};
struct test_packed {
    bool opaque : 1;
    unsigned int fill_color : 3;
    unsigned int : 4;
    bool show_border : 1;
    unsigned int border_color : 3;
    unsigned int border_style : 2;
    unsigned int : 2;
} __attribute__((packed));
int main(void)
{
    struct test padding;
    struct test_packed no_padding;
    printf("with padding: %zu bytes = %zu bits\n", sizeof(padding), sizeof(padding) * 8);
    printf("without padding: %zu bytes = %zu bits\n", sizeof(no_padding), sizeof(no_padding) * 8);
    return 0;
}
And now, results from different compilers.
GCC 4.9.2 from WandBox:
with padding: 4 bytes = 32 bits
without padding: 2 bytes = 16 bits
GCC 4.9.2 from http://cpp.sh/:
with padding: 4 bytes = 32 bits
without padding: 2 bytes = 16 bits
BUT
GCC 4.9.2 from theonlinecompiler.com:
with padding: 16 bytes = 128 bits
without padding: 16 bytes = 128 bits
(to compile it there you need to change %zu to %u)
EDIT
@interjay's answer might explain this. When I added -mms-bitfields to GCC 4.9.2 on WandBox, I got this:
with padding: 16 bytes = 128 bits
without padding: 16 bytes = 128 bits
Before the author defines the struct, he says he wants to divide the bit-fields into two bytes, so there will be one byte containing the bit-fields for the fill-related information and one byte for the border-related information.
To achieve that, he inserts a few unused bits (an unnamed bit-field):
unsigned int : 4; // padding of the first byte
He also pads the second byte, but there is no need for that.
So before the padding there would be 10 bits in use, and after the padding there are 16 bits defined (but not all of them in use).
Note: the author uses bool to indicate a 1/0 field; bool here is the C99 _Bool type (via stdbool.h). It seems compilers differ in how they pack _Bool bit-fields next to unsigned int ones; replacing bool with unsigned int would solve it. From C99:
6.3.1.1: "The following may be used in an expression wherever an int or unsigned int may be used: [...] A bit-field of type _Bool, int, signed int, or unsigned int."
You are completely misinterpreting what the book says.
There are 16 bits worth of bit fields declared. 6 bits are unnamed fields that cannot be used for anything - that's the padding mentioned. 16 bits minus 6 bits equals 10 bits. Not counting the padding fields, the struct has 10 useful bits.
How many bytes the struct occupies depends on the quality of the compiler. Apparently you ran into a compiler that doesn't pack bool bit-fields together with the others: it uses 4 bytes for the first bool, then some memory for bit-fields plus struct padding (another 4 bytes), then 4 bytes for the second bool, then more bit-fields plus padding (4 more bytes), adding up to 16 bytes. It's rather sad, really. This struct could quite reasonably be two bytes.
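For what it's worth, one way to nudge most compilers toward that two-byte layout is to give every member the same 16-bit declared type. Bit-fields of type unsigned short are an implementation-defined extension, so this is a sketch to verify rather than rely on blindly:

struct test_small {
    unsigned short opaque       : 1;
    unsigned short fill_color   : 3;
    unsigned short              : 4;
    unsigned short show_border  : 1;
    unsigned short border_color : 3;
    unsigned short border_style : 2;
    unsigned short              : 2;
};
_Static_assert(sizeof(struct test_small) == 2, "expected a 2-byte layout"); /* C11 */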
There have historically been two common ways of interpreting the types of bit-field elements:
1. Examine whether the type is signed or unsigned, but ignore distinctions between "char", "short", "int", etc. in deciding where an element should be placed.
2. Unless a bitfield is preceded by another with the same type, or the corresponding signed/unsigned type, allocate an object of that type and place the bitfield within it. Place following bitfields with the same type in that object if they fit.
I think the motivation behind #2 was that on a platform where 16-bit values need to be word-aligned, a compiler given something like:
struct foo {
    char x; // Not a bitfield
    someType f1 : 1;
    someType f2 : 7;
};
might be able to allocate a two-byte structure, with both fields being placed in the second byte, but if the structure had been:
struct foo {
    char x; // Not a bitfield
    someType f1 : 1;
    someType f2 : 15;
};
it would be necessary that all of f2 fit within a single 16-bit word, which would thus necessitate a padding byte before f1. Because of the Common Initial Sequence rule, f1 must be placed identically in those two structures, which would imply that, for f1 to satisfy the Common Initial Sequence rule, it would need padding before it even in the first structure.
As it is, code which wants to allow the denser layout in the first case can say:
struct foo {
    char x; // Not a bitfield
    unsigned char f1 : 1;
    unsigned char f2 : 7;
};
and invite the compiler to put both bitfields into a byte immediately following x. Since the type is specified as unsigned char, the compiler need not worry about the possibility of a 15-bit field. If the layout were:
struct foo {
    char x; // Not a bitfield
    unsigned short f1 : 1;
    unsigned short f2 : 7;
};
and the intention was that f1 and f2 would sit in the same storage, then the compiler would need to place f1 in a way that could support a word-aligned access for its "bunkmate" f2. If the code were:
struct foo {
    char x; // Not a bitfield
    unsigned char f1 : 1;
    unsigned short f2 : 15;
};
then f1 would be placed following x, and f2 in a word by itself.
Note that the C89 Standard added a syntax to force a layout that prevents f1 from being placed in a byte before the storage used by f2:
struct foo {
    char x; // Not a bitfield
    unsigned short : 0; // Forces next bitfield to start at a "short" boundary
    unsigned short f1 : 1;
    unsigned short f2 : 15;
};
The addition of the :0 syntax in C89 largely eliminates the need to have compilers regard changing types as forcing alignment, except when processing old code.
I have the following structs in a C program.
typedef struct
{
    unsigned char msg_id : 8;
    unsigned char msg_num : 8;
} Message_Header_Type;
typedef struct
{
    Message_Header_Type header;
    int msg[32];
} Raw_Message_Type;
What I'm seeing is that the header in Raw_Message_Type takes up 4 bytes, but I want it to take only 2. How would I go about doing this?
What you are looking for is struct packing, which is platform and compiler dependent; the details are not specified in the C standard.
With GCC, on a 32-bit platform, you could do the following to get a struct of size 2:
typedef struct __attribute__((__packed__))
{
    unsigned char msg_id;
    unsigned char msg_num;
} Message_Header_Type;
From the GCC documentation:
packed
This attribute, attached to an enum, struct, or union type definition, specifies that the minimum required memory be used to represent the type.
Message_Header_Type inside Raw_Message_Type is still taking 2 bytes, as you would expect, but the compiler is padding the structure for alignment reasons, so your Raw_Message_Type is 132 bytes long instead of the 130 bytes you expected. This is probably because 32-bit aligned accesses are more efficient on your processor.
If you are using gcc, you can tell the compiler to align on 2-byte boundaries instead of the 4-byte boundaries it is currently using, with #pragma pack:
typedef struct
{
    unsigned char msg_id : 8;
    unsigned char msg_num : 8;
} Message_Header_Type;

#pragma pack(push, 2) // save current alignment and set to 2-byte boundaries
typedef struct
{
    Message_Header_Type header;
    int msg[32];
} Raw_Message_Type;
#pragma pack(pop) // restore the previous alignment
Special packing like this is often necessary when using fixed-size structures in files, but be aware that there may be a (light) performance penalty for using a different packing than what the processor prefers. In this specific case, your 32 msg[] ints are now all 2 bytes off from the preferred alignment for your platform.
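Whichever packing mechanism you use, it is worth asserting the resulting sizes so that a different compiler or flag cannot silently change the layout (a C11 sketch, assuming a 4-byte int on your platform):

_Static_assert(sizeof(Message_Header_Type) == 2,
               "header must be exactly 2 bytes");
_Static_assert(sizeof(Raw_Message_Type) == 2 + 32 * sizeof(int),
               "unexpected padding between header and msg[]");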
C implementations are free to insert padding into struct layouts between members, at the end, or both. It is common for them to do so for alignment purposes. It is also common for compilers to provide a mechanism to influence or override the padding, but details are necessarily implementation-specific.
With GCC, for example, you could apply the packed attribute to both your structs:
typedef struct __attribute__((__packed__))
{
    unsigned char msg_id : 8;
    unsigned char msg_num : 8;
} Message_Header_Type;
typedef struct __attribute__((__packed__))
{
    Message_Header_Type header;
    int msg[32];
} Raw_Message_Type;
Some other implementations borrow GCC's approach; some use pragmas for the same purpose; and some provide other mechanisms. You'll need to check your compiler's documentation.
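If you would rather not rely on any compiler-specific mechanism at all, another option is to keep the structs at their natural layout and copy the fields into a byte buffer explicitly wherever the 2-byte wire layout matters. A minimal sketch (the helper name and buffer layout are assumptions, not part of the question's code):

#include <stddef.h>
#include <stdint.h>

/* Serialize the header into the first two bytes of buf, with no padding. */
static size_t pack_header(uint8_t *buf, const Message_Header_Type *h)
{
    buf[0] = h->msg_id;
    buf[1] = h->msg_num;
    return 2;   /* number of bytes written */
}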
Structure objects are not guaranteed to be the size of the sum of the bits used in their bit-field members. This is due to padding.
In your case, though, Raw_Message_Type has another member, int msg[32];. It seems logical to use two more bytes for alignment purposes, so that msg[0] can be aligned to a 4-byte boundary.
There is a good chance that
Message_Header_Type obj;
printf("Size : %zu\n",sizeof(obj));
would give you 2 as the result.
I am trying to set up the internal registers of an HCS12 processor using unions. Here is the way I currently have the unions:
union byte {
    struct {
        unsigned int B0 : 1;
        unsigned int B1 : 1;
        unsigned int B2 : 1;
        unsigned int B3 : 1;
        unsigned int B4 : 1;
        unsigned int B5 : 1;
        unsigned int B6 : 1;
        unsigned int B7 : 1;
    } idv;
    unsigned int All : 8;
};
union allPurpose {
    struct {
        union byte A;
        union byte B;
    } AB;
    unsigned int D : 16;
};
The issue is that when I initialize A to 170 and B to 187 using the All member, D should be 43,707 but is 170. I know that nested unions work, but for some reason this is not working. If anyone can see something wrong and can help, it would be appreciated.
Edit: Here is the code that uses the union.
HCS12.accumulator.AB.A.All=0xAA;
HCS12.accumulator.AB.B.All=0xBB;
printf("\nReg A: %d",HCS12.accumulator.AB.A.All);
printf("\nReg B: %d",HCS12.accumulator.AB.B.All);
printf("\nReg D: %d",HCS12.accumulator.D);
The union allPurpose is just nested inside another structure.
You cannot assume that union byte A; and union byte B; are packed contiguously in memory.
(In fact, on modern architectures, it is unlikely that they would be, due to their being only one byte in size.)
The specific packing arrangements are down to compiler and platform choice. A typical arrangement is 4-byte packing, meaning that your structure looks like this:
struct {
    union byte A;
    /* Three bytes of packing */
    union byte B;
} AB;
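You can check what your particular compiler actually did by printing the sizes and offsets involved (a quick sketch for a hosted compiler; it assumes the union definitions from the question are in scope):

#include <stdio.h>
#include <stddef.h>

int main(void)
{
    printf("sizeof(union byte)       = %zu\n", sizeof(union byte));
    printf("sizeof(union allPurpose) = %zu\n", sizeof(union allPurpose));
    printf("offset of A = %zu, offset of B = %zu\n",
           offsetof(union allPurpose, AB.A),
           offsetof(union allPurpose, AB.B));
    return 0;
}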
First of all, you should never use bit-fields in embedded systems, for multiple reasons. Most likely your problems come from the bit order being poorly specified for your compiler (which is likely CodeWarrior?).
I would strongly advise using the following 100% portable syntax instead:
#define PEAR (*(volatile uint8_t*)0x000Au)
#define NOACCE 0x80u
#define PIPOE 0x20u
#define NECLK 0x10u
#define LSTRE 0x08u
#define RDWE 0x04u
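Typical usage then looks like this (illustrative; the bit meanings come from the PEAR register definitions above):

PEAR |= (uint8_t)(PIPOE | NECLK);   /* set the PIPOE and NECLK bits */
PEAR &= (uint8_t)~RDWE;             /* clear the RDWE bit           */

if (PEAR & LSTRE) {
    /* react to LSTRE being set */
}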
You shouldn't use the default primitive data types in C (such as unsigned int) when working with embedded systems. Use uint16_t etc. from stdint.h instead.
It doesn't make any sense to make a union of one 8-bit member and one 16-bit member; this could be the cause of the problem.
In general avoid using structs/unions when mapping hardware registers, because the compiler may introduce padding anywhere inside the struct. Since the HCS12 is a good core which doesn't care about alignment, this won't be an issue in your specific case however.
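If you do keep the union approach for the accumulator pair, a fixed-width sketch might look like the following. Which of A and B overlays the high byte of D depends on the target's byte order (the HCS12 is big-endian), so that mapping is an assumption to verify, not a guarantee:

#include <stdint.h>

typedef union {
    struct {
        uint8_t a;   /* high byte of d on a big-endian target (assumption) */
        uint8_t b;   /* low byte of d */
    } bytes;
    uint16_t d;      /* the two bytes viewed as one 16-bit value */
} accumulator_t;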
EDIT
Your printf code looks fishy: printf() promotes its variadic arguments to int. Look at the memory map in the debugger instead. What values does it say your variables have?