Structs in a 32-bit architecture [duplicate] - c

This question already has answers here:
What is the meaning of "__attribute__((packed, aligned(4))) "
(3 answers)
Closed 9 years ago.
The following code;
struct s1 {
void *a;
char b[2];
int c;
};
struct s2 {
void *a;
char b[2];
int c;
}__attribute__((packed));
If s1 has a size of 12 bytes and s2 has a size of 10 bytes, is this because data is read in 4-byte chunks, and does __attribute__((packed)) reduce the size of void *a to only 2 bytes?
I'm a little confused as to what __attribute__((packed)) does.
Many thanks

It is due to alignment, a process in which the compiler adds hidden "junk" between the fields to make sure they have optimal (for performance) starting addresses.
Using packed forces the compiler to not do that, which often means that accessing the structure becomes slower (or simply impossible, causing e.g. a bus error) if the hardware has problems doing e.g. 32-bit accesses on addresses that are not multiples of 4.

On Intel processors, fetching 32-bit aligned data is considerably faster than fetching unaligned data; on many other processors unaligned fetches might be illegal altogether, or need to be simulated using 2 instructions. Thus, on these 32-bit architectures, the first structure always has c aligned to a byte address divisible by 4. This, however, means that 2 bytes of storage are wasted.
struct s1 {
void *a;
char b[2];
int c;
};
// Byte layout in memory (32-bit little-endian):
// | a0 | a1 | a2 | a3 | b0 | b1 | NA | NA | c0 | c1 | c2 | c3 |
// addresses increasing ====>
On the other hand, sometimes you absolutely need to map some unaligned datastructures (like file formats, or network packets), as is, into C structures; there you can use the __attribute__((packed)) to specify that you want everything without padding bytes:
struct s2 {
void *a;
char b[2];
int c;
} __attribute__((packed));
// Byte layout in memory (32-bit little-endian):
// | a0 | a1 | a2 | a3 | b0 | b1 | c0 | c1 | c2 | c3 |
// addresses increasing ====>

This is due to data structure alignment, which combines two things: data alignment and data padding. The first structure is padded so that its members are aligned to the word, as you said; the second structure is packed, which forces the compiler not to insert that padding.
The second structure is 10 bytes because the 2 bytes of padding after the 2-byte character array are dropped, not because the void pointer shrinks (it remains 4 bytes, like every pointer on this 32-bit target). Packing can hinder performance: saving 2 bytes is, under most circumstances, not worth the efficiency lost in hardware, and dereferencing a pointer to a misaligned member can lead to undefined behaviour.

Related

64-bit compiler struct padding with 32-bit arguments

I've been trying to understand how 64-bit compiler enforces struct alignment, and I can't understand why there is no padding in case a struct has only 32-bit arguments.
I would expect to have padding even in that case as 64-bit CPUs access memory using 64-bit pointers, don't they?
typedef struct {
uint32_t a1;
uint32_t a2;
uint32_t a3;
}tHeader;
typedef struct{
tHeader header;
uint32_t data1;
uint32_t data2;
}tPacket1;
0 7 bytes
+-------+-------+
| a1 a2 |
+-------+-------+
+-------+-------+
| a3 data1 |
+-------+-------+
+-------+-------+
| data2 | <---- Why no padding here?
+-------+-------+
20 bytes.
Padding example when 64-bit argument is present:
typedef struct {
uint32_t a1;
uint32_t a2;
uint32_t a3;
uint64_t a4;
}tHeader;
typedef struct{
tHeader header;
uint32_t data;
}tPacket1;
0 7 bytes
+-------+-------+
| a1 a2 |
+-------+-------+
+-------+-------+
| a3 PADDING|
+-------+-------+
+-------+-------+
| a4 |
+-------+-------+
+-------+-------+
| data PADDING| <---- Why padding here?
+-------+-------+
Total: 8 * 4 = 32 bytes.
Tested with:
$ gcc --version
gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
No padding is needed in the first case because all the members have 4-byte alignment. So if you have two consecutive structures, it can be laid out like:
0 7 bytes
+-------+-------+
| a1 a2 |
+-------+-------+
+-------+-------+
| a3 data1 |
+-------+-------+
+-------+-------+
| data2 a1 |
+-------+-------+
+-------+-------+
| a2 a3 |
+-------+-------+
+-------+-------+
| data1 data2 |
+-------+-------+
But that won't work in the second example because a4 needs 8-byte alignment. If it omitted the padding at the end, you'd have this:
0 7 bytes
+-------+-------+
| a1 a2 |
+-------+-------+
+-------+-------+
| a3 PADDING|
+-------+-------+
+-------+-------+
| a4 |
+-------+-------+
+-------+-------+
| data a1|
+-------+-------+
+-------+-------+
| a2 a3 |
+-------+-------+
+-------+-------+
|PADDING a4 |
+-------+-------+
+-------+-------+
| a4 data |
+-------+-------+
But splitting the second a4 like that is not permitted.
You could use the packed attribute to force it. Then the 64-bit element would be accessed using multiple 32-bit instructions. This would also obviate the padding between a3 and a4.
Padding exists to optimize memory accesses; it is not arbitrary. A 64-bit CPU can make 64-bit memory bus accesses, but there is no benefit in doing so when only a 32-bit quantity is being transferred. To transfer a 32-bit quantity, the CPU addresses the 64-bit word that contains it and reads only the bytes that belong to the quantity you asked for. The compiler pads a structure when a 64-bit field would otherwise not sit at an address that is a multiple of 8, because accessing such a field could take two 64-bit bus accesses to load or store the data. That is the reason for alignment. If a structure contains only 32-bit fields, accessing them requires only 32-bit accesses, so 32-bit alignment is enough for the whole structure.
Let's go to the extreme: imagine you have a char field. Should the compiler require the char to be aligned to a 64-bit address just to transfer one byte of data?
To calculate the padding, you have to ask two questions:
What is the offset of the first free byte after the last field placed in the structure?
What is the alignment required by the next field?
Let's assume the fields placed so far end at offset 13 (say you used a char[13] type). If the next field is a char, it needs one byte and has an alignment of one, so it can be appended directly without any padding.
If it is a short (two bytes), the compiler has to put it at the next even address: offset 14, after one byte of padding. The whole structure then requires an even address too, or the short field would not be aligned.
If it is a 32-bit int, the offset must be a multiple of 4, so 3 bytes of padding are inserted, the field lands at offset 16, and the whole structure gets a 4-byte alignment requirement.
If it is a 64-bit integer, the offset must be a multiple of 8, so the compiler again inserts 3 bytes of padding to place it at offset 16, and the alignment requirement for the whole structure becomes 8 bytes.
When no more fields are left, padding may still be added at the end so that the size of the structure is a multiple of its alignment requirement. That way, when structures follow one another, as in an array of this type, the alignment is preserved. So if we append a final 3-char array to the structure with the 32-bit int, one more pad byte is needed to fill out the 4-byte alignment of the whole structure.
In short, the padding before a field is determined by the alignment that field requires. For an array, that is the alignment of the element type; for a simple type, it is typically the size of the type itself; and for a nested structure, it is the largest alignment requirement among its fields.
In the case you showed, only 32-bit integers are included, so no padding is used between them (if the structure is aligned to 32 bits, all fields will be aligned to 32 bits) and no padding is needed at the end of the structure (to maintain the alignment in an array of structures).

Mapping a number to bit position in C

I'm developing a program running on an Atmel AT90CAN128. Connected to this controller there are 40 devices, each with a status (on/off). As I need to report the status of each of these devices to a PC through serial communication, I have 40 bits, which define whether each device is on or off. In addition, the PC can turn any of these devices on or off.
So, my first attempt was to create the following struct:
typedef struct {
unsigned char length; //!< Data Length
unsigned data_type; //!< Data type
unsigned char data[5]; //!< CAN data array 5 * 8 = 40 bits
} SERIAL_packet;
The problem with this was that the PC will send an unsigned char address telling me the device to turn on/off, so accessing the bit corresponding to that address number turned out to be rather complicated...
So I started looking for options, and I stumbled upon the C99 _Bool type. I thought, great, so now I'll just create a _Bool data[40] and I can access the address bit just by indexing my data array. It turns out that in C (or C++) the smallest addressable unit is a whole byte. So even if I declare a _Bool, its size will be 8 bits, which is a problem: it needs to be as fast as possible, so the more bits I send the slower it gets, and the PC will be expecting only 40 bits. So I started looking into bit fields, and tried the following:
typedef struct {
unsigned char aux : 1;
} arrayData;
typedef struct {
unsigned char length; //!< Data Length
unsigned data_type; //!< Data type
arrayData data[40]; //!< Data array 5 bytes == 40 bits
} SERIAL_packet;
And I wonder, is this going to map that data[40] into a contiguous memory block with a size of 40 bits (5 bytes)?
If not, is there any obvious solution I'm missing? This doesn't seem like a very complicated thing to do (it would be much simpler if there were fewer than 32 devices, so I could use an int and just access it through a bit mask).
Assuming the addresses you get back are in the range 0 - 39 and that a char has 8 bits, you can treat your data array as an array of bits:
| data[0] | data[1] ...
-----------------------------------------------------------------
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10| 11| 12| 13| 14| 15|
-----------------------------------------------------------------
To set bit i:
packet.data[i/8] |= (1 << (i%8));
To clear bit i:
packet.data[i/8] &= (1 << (i%8)) ^ 0xff;
To read bit i:
int flag = (packet.data[i/8] & (1 << (i%8))) != 0;

Sizeof a struct in C

Well, after reading this Size of structure with a char, a double, an int and a t I still don't get the size of my struct which is :
struct s {
char c1[3];
long long k;
char c2;
char *pt;
char c3;
};
And sizeof(struct s) returns 40.
But according to the post I mentioned, I thought that the memory should look like this:
0 1 2 3 4 5 6 7 8 9 a b c d e f
+-------------+- -+---------------------------+- - - - - - - -+
| c1 | |k | |
+-------------+- -+---------------------------+- - - - - - - -+
10 11 12 13 14 15 16 17
+---+- -+- -+- - - - - -+----+
|c2 | |pt | | c3 |
+---+- -+- -+- - - - - -+----+
And I should get 18 instead of 40...
Can someone explain to me what I am doing wrong ? Thank you very much !
Assuming an 8-byte pointer size and alignment requirement on long long and pointers, then:
3 bytes for c1
5 bytes padding
8 bytes for k
1 byte for c2
7 bytes padding
8 bytes for pt
1 byte for c3
7 bytes padding
That adds up to 40 bytes.
The trailing padding is allocated so that arrays of the structure keep all the elements of the structure properly aligned.
Note that the sizes, alignment requirements and therefore padding depend on the machine hardware, the compiler, and the platform's ABI (Application Binary Interface). The rules I used are common rules: an N-byte type (for N in {1, 2, 4, 8, 16 }) needs to be allocated on an N-byte boundary. Arrays (both within the structure and arrays of the structure) also need to be properly aligned. You can sometimes dink with the padding with #pragma directives; be cautious. It is usually better to lay out the structure with the most stringently aligned objects at the start and the less stringently aligned ones at the end.
If you used:
struct s2 {
long long k;
char *pt;
char c1[3];
char c2;
char c3;
};
the size required would be just 24 bytes, with just 3 bytes of trailing padding. Order does matter!
The size of the structure depends upon what compiler is used and what compiler options are enabled. The C language standard makes no promises about how memory is utilized when the compiler creates structures, and different architectures (for example 32-bit WinTel vs 64-bit WinTel) cause different layout decisions even when the same compiler is used.
Essentially, the size of a structure is equal to the sum of the size of the bytes needed by the field elements (which can generally be calculated) plus the sum of the padding bytes injected by the compiler (which is generally not known).
It is because of alignment. gcc has
#pragma pack(push,n)
// declare your struct here
#pragma pack(pop)
to change it. See also __attribute__((__packed__)).
If you declare the struct
struct packed
{
char c1[3];
long long k;
char c2;
char *pt;
char c3;
} __attribute__((__packed__));
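then, compiling with gcc, sizeof(struct packed) is just the sum of the member sizes: 17 with the 4-byte pointer assumed below, or 21 with an 8-byte pointer, since
c1: 3
k : 8
c2: 1
pt: 4 // it depends on the platform (8 on most 64-bit targets)
c3: 1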
Apparently Visual C++ compiler supports #pragma pack(push,n) too.
What is the size of this structure?
#include <stdio.h>
struct {
char a;
char b;
char c;
}st;
int main()
{
printf("%zu", sizeof(st));
return 0;
}
It shows 3 when compiled with gcc.

How to write a byte to register with specific memory address?

I want to write a byte to register with specific memory address (0x1228A432)
But, this register has a following structure:
Bits | Access | Name | Reset | Description |
[31:8] | Read only | -------- | ------ | Reserved |
[7:0] | Read-write | REG[7:0] | 0xXX | ----------- |
Please tell me, how to write a byte to this register without "touching" the Reserved bits?
EDIT1: My target is Cortex A9.
I could successfully read/write to onboard DDR2 memory using 8-bit values (such as 0xFF)
EDIT2: I used to work with DDR2 memory in the following way :
// First stage
static unsigned char *p = 0;
char * argv1="0x60000000";
unsigned long address=strtoul(argv1, 0, 0);
p = (unsigned char *) address;
// Second stage
char * argv4="FF";
int value=strtol(argv4,0,16);
// Third stage
int offset = 9;
p[offset]=value;
EDIT3: I found out the following information:
All registers are 32 bits wide and do not support byte writes.
Write operations must be word-wide and bits marked as reserved must be preserved
using read-modify-write.
One way to preserve bits [31:8], assuming 32-bit wide access, is to read the value, zero-out bits [7:0], bitwise-or it with the value needed and then write it back to the register.
Something like (stealing from RedX a bit ;) ):
uint8_t your_8_bit_value = 0x42;
uint32_t volatile * const mem_map_register = (uint32_t volatile *) 0x1228a432;
*mem_map_register = (*mem_map_register & 0xFFFFFF00) | your_8_bit_value;
Yet I think there should be more info available about your hardware. I've seen several datasheets saying e.g. that you have to write all 1 to reserved bits (meaning that reserved bits are reserved for future use, and 1 is a safe default), etc. So it is not always obvious, that leaving reserved bits untouched is the right thing to do.
You should find more details about your hardware - are byte-wide writes supported, are writes to reserved bits ignored perhaps, or should be all 0/1, etc.
Look up the assembler instruction handbook for an 8 bit writing instruction (not sure if it exists). If it does, use an uint8_t for your assignment to that memory location (uint8_t volatile * const reg = (uint8_t volatile * const) 0x1228a432;).
Else do what Omkant said. Overwriting the bits with the same number should not produce any unwanted results, since they are not "zeroed" before being overwritten.
His code in C (this is the verbose version for better readability):
uint8_t your_8_bit_value = 0x42;
uint32_t volatile * const mem_map_register = (uint32_t volatile *) 0x1228a432;
*mem_map_register = (*mem_map_register & 0xFFFFFF00) | your_8_bit_value;
[register value] = ([register value] | [00 00 00 FF]) & [FF FF FF XX]
Here, XX is the one byte you want to write. The OR sets bits [7:0] to all ones, and the bitwise & with the 24-bit mask plus XX then leaves bits [31:8] as they were and sets bits [7:0] to XX.
I think this should work.

How are the structure members stored on a little endian machine?

struct Dummy {
int x;
char y;
};
int main() {
struct Dummy dum;
dum.x = 10;
dum.y = 'a';
}
How would be the layout of the structure members on a little endian machine?
Would it be something like this?
0x1000 +0 +1 +2 +3
___________________
x: | 10 | 0 | 0 | 0 |
-------------------
y: | 'a'| 0 | 0 | 0 |
-------------------
0x1000 +4 +5 +6 +7
I think you'll find this question useful. Endianness is usually relevant to a word in memory, not to the whole structure.
Structure layout is a compiler implementation detail, affected by the default packing. Endianness normally only affects the order of the bytes in a structure member value, not the layout. Check the small print in the compiler manual or use sizeof and the offsetof macro to experiment.
The layout you documented in your question is indeed very common for a 32-bit LE compiler.
The structure members will be in the order declared, with padding inserted as necessary so each field is properly aligned for its type and with padding inserted as necessary at the end such that in an array each subsequent structure is properly aligned and begins immediately after the end of the previous structure. It is also possible (but unlikely) that additional unnecessary padding will be inserted between any two elements or at the end.
Each field itself will be stored as appropriate for the type on the compiler and architecture, e.g. the int 10 would be stored as the bytes 0a 00 00 00 on a normal little-endian machine with 32-bit ints.
