memory layout for packed and unpacked structure - c

I have 2 structures defined like below.
#include<stdint.h>
typedef struct
{
uint32_t a;
uint8_t b;
uint8_t pad[3]; //padding here is added intentionally.
uint32_t c;
}A;
typedef struct
{
uint32_t a;
uint8_t b;
uint8_t pad[3];
uint32_t c;
}__attribute__((__packed__)) B;
Are these 2 structs guaranteed to have exactly same memory layout on all the hardware platforms? It can be assumed that the compiler is always gcc.

No. There could still be padding in the unpacked version of this struct. Even if no such implementation exists today, a future architecture could be one whereby all of its types are accessed most efficiently when aligned to offsets evenly divisible by 128 bits, and any member of the unpacked version could then be followed by between 96 and 120 bits of padding. A compiler might take advantage of this. Stackoverflow is forever.

The answer would be "possibly" but not "guaranteed", since "all hardware platforms" covers a very broad range. I can envision systems with memory architectures whose read/write performance favors keeping reads and writes on boundaries larger than uint8_t.
Note that when packed, compilers will use read/write code that is often suboptimal for the particular architecture, but correct with respect to accessing the structure member.

Related

Why is the compiler adding padding to a struct that's already 4-byte aligned?

This issue has been reproduced using multiple 32-bit architectures
I have the following struct in which the sum of all the data member sizes is 76 bytes:
typedef struct
{
uint32_t unused_32;
//Unrolled arrays
uint8_t ar1_1;
uint8_t ar1_2;
uint8_t ar2_1;
uint8_t ar2_2;
uint8_t ar3_1;
uint8_t ar3_2;
uint8_t ar4_1;
uint8_t ar4_2;
uint8_t ar4_3;
uint8_t ar4_4;
uint8_t ar4_5;
uint8_t ar4_6;
uint8_t ex1;
uint8_t ex2;
uint8_t ex3;
uint8_t ex4;
uint8_t fsi_1;
uint8_t fsi_2;
uint8_t fsi_3;
uint8_t fsi_4;
uint64_t ex5;
uint64_t ex6;
uint64_t Unused_1;
uint64_t Unused_2;
uint64_t Unused_3;
uint64_t Unused_4;
uint32_t Crc;
} testStruct;
However, when I do a sizeof() on the struct I get 80 bytes and I'm having trouble figuring out why. When I set the preprocessor to use the "#pragma pack(1)" option and do a sizeof(), it returns 76 bytes, indicating that the compiler is adding padding.
I printed out the addresses of every data member of the structure and everything is in sequential order in memory with no 'holes'. I also added an initialization for a 32-bit integer before the struct and printed out its address, and it came right before the first datamember of the struct, indicating that the padding that's being added is at the end.
76 bytes is already 4-byte aligned, why is the compiler adding an additional 4 bytes onto the end of the struct to make it 80 bytes?
From the GCC documentation:
Note that the alignment of any given struct or union type is required by the ISO C standard to be at least a perfect multiple of the lowest common multiple of the alignments of all of the members of the struct or union in question.
In other words, because your struct contains members of type uint64_t (aligned to 8 bytes), the struct itself must be aligned to an 8-byte boundary, and its size must be a multiple of 8, so 76 bytes is rounded up to 80.
As OP was able to use #pragma pack(1), an implementation specific directive, it implies that the size of the structure could be packed to 76 bytes rather than 80 on OP's platform.
With such packing, an array of testStruct would certainly cause uint64_t members to occur at addresses that are not multiples of 8. By padding the structure with 4 bytes, an array of testStruct starting at an address that is a multiple of 8 has all uint64_t members in each element aligned for potentially optimal code and speed.
This is a common trade-off compilers make to balance code size, speed and memory usage.
Most coding tasks should not assume a packed structure.
For code that needs a precise layout, either an implementation-specific directive like #pragma pack(1) is needed or, for portability, a non-struct approach is needed.
The members up to fsi_4 need at most 4-byte alignment, but the uint64_t members need an 8-byte boundary. That raises the alignment of the whole struct to 8, so its size is rounded up to a multiple of 8, which is why you get a total size of 80 bytes.

Memory - Natural address boundary

Definition
Structure padding is the process of aligning data members of the structure in accordance with the memory alignment rules specified by the processor.
What is the memory alignment rule for the Intel x86 processor?
As per my understanding, the natural address boundary for an Intel x86 processor is 32 bits (i.e., addressOffset%4==0).
So, on an x86 processor,
struct mystruct_A {
char a;
int b;
char c;
};
will be constructed as,
struct mystruct_A {
char a;
char gap_0[3]; /* inserted by compiler: for alignment of b using array */
int b;
char c;
char gap_1[3]; /* for alignment of the whole struct using array */
};
What is the memory alignment rule for the Intel x86-64 processor?
As per my understanding, the natural address boundary for an Intel x86-64 processor is 64 bits (i.e., addressOffset%8==0).
So, on an x86-64 processor,
struct mystruct_A {
char a;
int b;
char c;
};
will be constructed as,
struct mystruct_A {
char a;
char gap_0[7]; /* inserted by compiler: for alignment of b using array */
int b;
char c;
char gap_1[7]; /* for alignment of the whole struct using array */
};
If the above understanding is correct, then I would like to know why an array of int is used for bit operations.
It is recommended to use int-sized data, as mentioned here, because the most cost-efficient access to memory is accessing int-sized data.
Question:
Is it this memory alignment rule that forces one to declare int-sized data for bit operations?
Addendum: this is valid for x86/-64 bit processors, but also for others. I am blindly assuming you're using those. For others, you should check the respective manuals.
If fasm automatically added fillers into my structs I'd go insane. In general, performance is better when memory accesses fall on a boundary corresponding to the size of the element you want to retrieve. That being said, it's not a strict necessity!
This article here might be worth a look: https://software.intel.com/en-us/articles/coding-for-performance-data-alignment-and-structures
Intel's suggestion for optimal layout is to start with the biggest elements and go smaller as the structure grows. That way you stay properly aligned, as long as the first element is aligned properly. There are no three-byte elements, thus misalignment is out of the question, and all the compiler might do is add bytes at the end, which is the best way to make sure it won't ruin things if you choose to do direct memory accesses instead of using variables.
The safest procedure is not to rely on your compiler, but instead to align the data properly yourself.
Fun Fact: loops work the same way. Padding NOPs in your code, before the start of a loop, can make a difference.

structure with bitfields, memory allocation for array

I want to use a structure as given below on my ARM Cortex M0. I am using C programming.
struct path_table_t {
uint8_t lut_index;
uint8_t flag : 1;
};
The flag field is made a single bit by using a bit-field. How will memory be allocated for an array of the above structure?
Will I get the benefit of bit-fields in the form of a saving in total memory?
Most likely sizeof(struct path_table_t) == 2. That is, a structure with 2 elements, each taking 8 bits. 7 bits of flag should remain unused, but are still reserved.
Note that C standard gives a lot of room for compiler implementation to vary the size required by the bit fields. For example (N1570, 6.7.2.1 p11):
An implementation may allocate any addressable storage unit large enough to hold a bit-field.
and
The alignment of the addressable storage unit is unspecified.
So the space being reserved by the flag and its padding might be even larger.
If you wish to be sure, check the size with sizeof. Just remember that changing compiler or compiler settings might affect the result.
To answer your question "Will I get the benefit of bit fields as saving in total memory?": no.
Consider two structs:
struct path_table1_t {
uint8_t lut_index;
uint8_t flag : 1;
}a;
and
struct path_table2_t {
uint8_t lut_index;
uint8_t flag;
}b;
sizeof(a) and sizeof(b) are equal.
But in case you are planning to use all the bits in the given bit-field, then the memory allocated is reduced. Like
struct path_table3_t {
uint8_t lut_index : 1;
uint8_t flag : 1;
}c;
In this case the sizeof(c) is the sizeof(uint8_t) which will be 1 byte.
To summarize:
sizeof(c) < sizeof(a) and sizeof(a) = sizeof(b)
If you are very interested in saving memory, use the GCC attribute __attribute__((__packed__)), which makes the compiler squeeze the memory allocated for this struct down to what the members actually use.
Memory allocation for bit fields is implementation defined.
Let's consider your example:
#include <stdio.h>
#include <stdint.h>
struct path_table_t {
uint8_t lut_index;
uint8_t flag : 1;
};
int main(void)
{
struct path_table_t p;
printf("%zu\n", sizeof(p)); /* sizeof yields size_t, so use %zu */
return 0;
}
Here you use 9 bits in total, so the struct is padded to 2 bytes.
If your structure was like
#include <stdio.h>
#include <stdint.h>
struct path_table_t {
uint8_t lut_index : 1;
uint8_t flag : 1;
};
int main(void)
{
struct path_table_t p;
printf("%zu\n", sizeof(p)); /* sizeof yields size_t, so use %zu */
return 0;
}
Here only 2 bits are required, so sizeof(struct path_table_t) would be 1 byte, which answers your question.
Yes, using bit fields can save memory, as shown above.

Why is struct size a multiple of 8 when it contains an int64 variable on a 32-bit system

I am programming in C on a 32-bit system.
I have a struct whose size I expected to be a multiple of four.
But when I look at the linker map file, its size is a multiple of eight.
Example
typedef struct _str
{
U64 var1;
U32 var2;
} STR;
Size of this struct is 16.
But
typedef struct _str
{
U32 var1;
U32 var2;
U32 var3;
} STR2;
Size of STR2 is 12.
I am working on 32 bit ARM microcontroller.
I don't know why.
The first structure is padded in order to get it aligned on a U64 boundary: it is equivalent to
struct STR
{
U64 var1;
U32 var2;
U8 pad[4]; /* sizeof(U64) - sizeof(U32) */
};
So when used in an array of struct STR [], each U64 is properly aligned with respect to the ABI requirements.
Have a look at Procedure Call Standard for ARM Architecture , 4.1 Fundamental Data Types .
The size of a structure is defined as a multiple of the alignment of a member with the largest alignment requirement. It looks like the alignment requirement of U64 is 8 bytes even in 32-bit mode on your platform.
The following code:
#include <stdio.h>
#include <stdint.h>
int main() { printf("alignof(uint64_t) is %zu\n", __alignof(uint64_t)); }
Produces the same output when compiled in both 32-bit and 64-bit mode on Linux x86_64:
alignof(uint64_t) is 8
Padding.
Suppose you have
typedef struct _str
{
U64 var1;
U32 var2;
} STR;
STR s[2];
Your architecture may demand that s[0].var1 and s[1].var1 lie on the natural alignment for U64, at 8-byte boundaries. Since C does not put padding between array elements, the padding goes inside the structure.
On the other hand,
typedef struct _str
{
U32 var1;
U32 var2;
U32 var3;
} STR2;
STR2 s2[2];
Only a 4-byte alignment is required here.
A bit of background, to help you reason about such issues.
Since primitive memory operations are generally specified in multiples of 8 bits, compiler engineers decide on padding schemes for in-memory data structures.
If a memory retrieval operation (memory -> bus -> CPU) works in 16-bit chunks (on a hypothetical computer) and you put three 8-bit types in your struct, the compiler designer might as well pad it up to a 32-bit struct, since two 16-bit memory retrieval operations are going to take place anyway to pull your struct into the CPU cache for CPU operations.
Of course you can tell the compiler not to do this in exceptional circumstances, such as designing an on-disk or network protocol, where you might want to be space-conscious.
In the real world such issues are more elaborate, but the decisions stem from what the best choices are for general-purpose efficient use of your hardware :D

Reading binary data into memory structures, weird effects

I've been at this for a while now and it really puzzles me. This is a very distilled code fragment that reproduces the problem:
uint8_t dataz[] = { 1, 2, 3, 4, 5, 6 };
struct mystruct {
uint8_t dummy1[1];
uint16_t very_important_data;
uint8_t dummy2[3];
} *mystruct = (void *) dataz;
printf("%x\n", mystruct -> very_important_data);
What would you expect the output to be? I'd say x302, but nope. It gives me x403. The same as when using this structure:
struct mystruct {
uint8_t dummy1[2];
uint16_t very_important_data;
uint8_t dummy2[2];
} *mystruct = (void *) dataz;
How would you explain that?
As others have mentioned, unless your compiler alignment is byte-aligned, your structure is likely to have "holes" in it. The compiler does this because it speeds up memory access.
If you're using gcc, there is a "packed" attribute which will cause the struct to be byte-aligned, and so remove the "holes":
struct __attribute((__packed__)) mystruct {
uint8_t dummy1[1];
uint16_t very_important_data;
uint8_t dummy2[3];
} *mystruct = (void *) dataz;
However, this will not necessarily fix the problem. The 16-bit value may not be set to what you think it should be, depending on the endianness of your machine. You will have to swap the bytes in any multi-byte integers in the struct. There is no general function to do this, as it would require information on the layout of the structure at run-time, which C does not provide.
Mapping structures to binary data is generally non-portable, even if you get it to work on your machine, right now.
Packing. There is no guarantee how members of a struct are physically located inside the struct. They may be word-aligned, leaving gaps.
There are pragmas in some versions of C to explicitly control packing.
Most likely, the compiler has added a byte of padding between dummy1 and very_important_data to align very_important_data on a 16-bit boundary.
In general, the alignment and padding of fields in a struct is implementation-dependent, so you shouldn't rely on it. If you absolutely need a particular behavior, many compilers offer #pragma or other directives to control this. Check your compiler's documentation.
It depends on the compiler, but usually a compiler aligns each member to its natural alignment. In the case you ran into, very_important_data is a uint16_t which probably has a natural alignment of 2 bytes.

Resources