The PMA memory for the USB peripheral on some STM32s is structured as 256 16-bit words that are 32-bit aligned. This format is demonstrated in the following table:
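| Address + offset | 0x0   | 0x1   | 0x2   | 0x3   |
|:-----------------|:------|:------|:------|:------|
| 0x40006000       | 0x000 | 0x001 | ----- | ----- |
| 0x40006004       | 0x002 | 0x003 | ----- | ----- |
| 0x40006008       | 0x004 | 0x005 | ----- | ----- |
| 0x4000600C       | 0x006 | 0x007 | ----- | ----- |
| 0x40006010       | 0x008 | 0x009 | ----- | ----- |
| ....             | ....  | ....  | ----- | ----- |
| 0x400063F8       | 0x1FC | 0x1FD | ----- | ----- |
| 0x400063FC       | 0x1FE | 0x1FF | ----- | ----- |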
For example, the PMA address 0x006 is accessed by the CPU at address 0x4000600C. Access to any address with an offset of 0x2 or 0x3 is invalid.
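The mapping in the table can be sketched as a small address-translation helper. This is only a sketch, assuming the base address 0x40006000 and the 512-byte PMA size shown in the table; the names `PMA_BASE`, `PMA_SIZE`, and `pma_to_cpu_addr` are hypothetical and do not come from any vendor header.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values taken from the table above; the names are illustrative only. */
#define PMA_BASE 0x40006000u
#define PMA_SIZE 0x200u /* 256 16-bit words = 512 bytes */

/* Translate a PMA byte address (0x000..0x1FF) into the CPU bus address.
 * Each 16-bit PMA word sits in the low half of a 32-bit slot, so the word
 * index is scaled by 4 and the byte-within-word bit is kept unchanged. */
static uint32_t pma_to_cpu_addr(uint32_t pma_addr)
{
    assert(pma_addr < PMA_SIZE);
    return PMA_BASE + (pma_addr >> 1) * 4u + (pma_addr & 1u);
}
```

For instance, `pma_to_cpu_addr(0x006)` gives 0x4000600C, matching the table row above.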
Ideally, data in the PMA could be accessed by defining structs such as the following, but count would occupy offsets 0x2 and 0x3, assuming the struct points to an address with an offset of 0x0.
struct buf_desc_entry {
volatile uint16_t addr;
volatile uint16_t count;
} __attribute__((packed));
One solution would be to pad the struct so that count occupies offsets 0x0 and 0x1. This would be tedious and error-prone, since it would have to be done manually for every struct.
struct buf_desc_entry {
volatile uint16_t addr;
volatile uint16_t _1;
volatile uint16_t count;
volatile uint16_t _2;
} __attribute__((packed));
Another solution for this example would be to 32-bit align all variables, using either __attribute__((aligned (4))) or a linker script. However, this would not work for structs that contain 8-bit variables, since those can occupy offsets 0x0 or 0x1.
Is there a more elegant way of doing this?
ARM Cortex supports bit-banded memory, where individual bits are mapped to "bytes" in certain regions. I believe that only certain parts of RAM are bit-banded. I'd like to use bit-banding from C and C++.
How do I do this? It seems I'd need to:
Tell the compiler to place certain variables in bit-banded regions. How? What if the variables are elements of a struct?
Tell the compiler, when I want to access a bit, to turn if (flags & 0x4) into if (flags_bb_04). Ideally, I'd like this to be automatic, and to fall back to the former if bit banding isn't available.
The simplest solution is to use regular variables and access them through their bit-band address. For that you do not need to "tell the compiler" anything. For example, given:
extern "C" volatile uint32_t* getBitBandAddress( volatile const void* address, int bit )
{
volatile uint32_t* bit_address = 0;
uint32_t addr = reinterpret_cast<uint32_t>(address);
// This bit manipulation makes the function valid for RAM
// and Peripheral bitband regions
uint32_t word_band_base = addr & 0xf0000000;
uint32_t bit_band_base = word_band_base | 0x02000000;
uint32_t offset = addr - word_band_base;
// Calculate bit band address
bit_address = reinterpret_cast<volatile uint32_t*>(bit_band_base + (offset * 32u) + (static_cast<uint32_t>(bit) * 4u));
return bit_address ;
}
you could create a 32-bit "array" of bits thus:
volatile uint32_t word = 0 ;
volatile uint32_t* bits = getBitBandAddress( &word, 0 ) ;
bits[5] = 1 ; // word now == 32 (bit 5 set).
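The address arithmetic inside getBitBandAddress() can be checked on a host machine with plain integers. The following is a minimal sketch in C, assuming the usual Cortex-M layout where each bit-band region covers the first 1 MB above 0x20000000 or 0x40000000; the name `bitband_alias` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Compute the bit-band alias address for one bit of a byte address.
 * Mirrors the arithmetic of getBitBandAddress() above, on plain integers. */
static uint32_t bitband_alias(uint32_t byte_addr, unsigned bit)
{
    uint32_t region_base = byte_addr & 0xF0000000u;  /* 0x20000000 or 0x40000000 */
    uint32_t alias_base = region_base | 0x02000000u; /* 0x22000000 or 0x42000000 */
    uint32_t offset = byte_addr - region_base;

    assert(bit < 32u);
    assert(offset < 0x100000u); /* assumption: only the first 1 MB is bit-banded */

    /* Every bit of the region gets its own 32-bit alias word. */
    return alias_base + offset * 32u + (uint32_t)bit * 4u;
}
```

Bit 5 of the word at 0x20000000 maps to 0x22000014, which is the address that `bits[5]` refers to in the example above when `word` happens to live at the start of SRAM.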
If you have a part with memory that is not bitbandable (external or CCM memory, for example), you do need to ensure that the linker (not the compiler) places the normal memory object in bitbandable memory. How that is done is toolchain specific, but with the GNU toolchain, for example, you might have:
uint32_t word __attribute__ ((section ("ISRAM1"))) = 0 ;
Bitbanding is perhaps most useful for atomically accessing individual bits in peripheral registers, giving fast and thread-safe access.
Some compilers are bitband aware and may automatically optimise single-bit bitfield access using bitbanding. For example:
struct
{
unsigned bit1 : 1 ;
unsigned bit2 : 1 ;
} bits __attribute__ ((section ("BITBANDABLE")));
The compiler (at least armcc v5) may optimise this to utilise the bitband access to bits.bit1 and bits.bit2. YMMV.
I'm writing a little kernel in C for the x86 platform, but I'm having trouble loading the GDT and reloading the segment selectors.
I am using bochs to test my kernel.
The issue is, when I load the GDT but don't reload the segment selectors, I can stop my program, type info gdt and get a nice result:
When I don't load my GDT:
<bochs:2> info gdt
Global Descriptor Table (base=0x00000000000010b0, limit=32):
GDT[0x0000]=??? descriptor hi=0x00000000, lo=0x00000000
GDT[0x0008]=??? descriptor hi=0x00000000, lo=0x00000000
GDT[0x0010]=Code segment, base=0x00000000, limit=0xffffffff, Execute/Read, Non-Conforming, Accessed, 32-bit
GDT[0x0018]=Data segment, base=0x00000000, limit=0xffffffff, Read/Write, Accessed
You can list individual entries with 'info gdt [NUM]' or groups with 'info gdt [NUM] [NUM]'
<bochs:3>
When I load my GDT:
<bochs:2> info gdt
Global Descriptor Table (base=0x00000000001022a0, limit=48):
GDT[0x0000]=??? descriptor hi=0x00000000, lo=0x00000000
GDT[0x0008]=Code segment, base=0x00000000, limit=0xffffffff, Execute/Read, Non-Conforming, 32-bit
GDT[0x0010]=Data segment, base=0x00000000, limit=0xffffffff, Read/Write
GDT[0x0018]=Code segment, base=0x00000000, limit=0x00000fff, Execute-Only, Non-Conforming, 32-bit
GDT[0x0020]=Data segment, base=0x00000000, limit=0x00000fff, Read-Only
GDT[0x0028]=??? descriptor hi=0x00000000, lo=0x00000000
You can list individual entries with 'info gdt [NUM]' or groups with 'info gdt [NUM] [NUM]'
<bochs:3>
So it seems that my GDT is loaded properly.
Now comes the tricky part.
When I try to reload the segment selectors, I get this error:
04641352650e[CPU0 ] fetch_raw_descriptor: GDT: index (ff57) 1fea > limit (30)
04641352650e[CPU0 ] interrupt(): vector must be within IDT table limits, IDT.limit = 0x0
04641352650e[CPU0 ] interrupt(): vector must be within IDT table limits, IDT.limit = 0x0
04641352650i[CPU0 ] CPU is in protected mode (active)
04641352650i[CPU0 ] CS.mode = 32 bit
04641352650i[CPU0 ] SS.mode = 32 bit
04641352650i[CPU0 ] EFER = 0x00000000
04641352650i[CPU0 ] | EAX=0000ff53 EBX=00010000 ECX=001022e0 EDX=00000000
04641352650i[CPU0 ] | ESP=00102294 EBP=001022b0 ESI=00000000 EDI=00000000
04641352650i[CPU0 ] | IOPL=0 id vip vif ac vm RF nt of df if tf sf zf af PF cf
04641352650i[CPU0 ] | SEG sltr(index|ti|rpl) base limit G D
04641352650i[CPU0 ] | CS:0010( 0002| 0| 0) 00000000 ffffffff 1 1
04641352650i[CPU0 ] | DS:0018( 0003| 0| 0) 00000000 ffffffff 1 1
04641352650i[CPU0 ] | SS:0018( 0003| 0| 0) 00000000 ffffffff 1 1
04641352650i[CPU0 ] | ES:0018( 0003| 0| 0) 00000000 ffffffff 1 1
04641352650i[CPU0 ] | FS:0018( 0003| 0| 0) 00000000 ffffffff 1 1
04641352650i[CPU0 ] | GS:0018( 0003| 0| 0) 00000000 ffffffff 1 1
04641352650i[CPU0 ] | EIP=001001d8 (001001d8)
04641352650i[CPU0 ] | CR0=0x60000011 CR2=0x00000000
04641352650i[CPU0 ] | CR3=0x00000000 CR4=0x00000000
(0).[4641352650] [0x0000001001d8] 0010:00000000001001d8 (unk. ctxt): mov ds, ax ; 8ed8
04641352650e[CPU0 ] exception(): 3rd (13) exception with no resolution, shutdown status is 00h, resetting
After that, when I type info gdt again, it prints a huge table
that doesn't even fit in my terminal's scrollback buffer.
Here are the last lines:
GDT[0xffd8]=??? descriptor hi=0x72670074, lo=0x64696c61
GDT[0xffe0]=16-Bit TSS (available) at 0x6c65725f, length 0xc6275
GDT[0xffe8]=Data segment, base=0x5f726700, limit=0x0002636f, Read-Only, Expand-down, Accessed
GDT[0xfff0]=Data segment, base=0x00657266, limit=0x00086572, Read/Write, Accessed
GDT[0xfff8]=Data segment, base=0x675f6275, limit=0x00057267, Read/Write
You can list individual entries with 'info gdt [NUM]' or groups with 'info gdt [NUM] [NUM]'
It tells me that I am trying to access data outside of my GDT.
Here is the code I have written so far:
enum SEG_TYPE {
// Data
SEG_TYPE_DRO = 0b0000,
SEG_TYPE_DRW = 0b0010,
SEG_TYPE_DROE = 0b0100,
SEG_TYPE_DRWE = 0b0110,
// Code
SEG_TYPE_CEO = 0b1000,
SEG_TYPE_CER = 0b1010,
SEG_TYPE_CEOC = 0b1100,
SEG_TYPE_CERC = 0b1110,
};
enum SEG_AC {
SEG_AC_KERNEL = 0b11,
SEG_AC_USER = 0b00,
};
void gdt_entry_init(struct gdt_entry* entry, u32 base, u32 limit, enum SEG_TYPE type, enum SEG_AC access_rights) {
// Base address
entry->base_0_15 = base;
entry->base_16_23 = base >> 16;
entry->base_24_31 = base >> 24;
// Limit
entry->limit_0_15 = limit;
entry->limit_16_19 = limit >> 16;
// Segment type
entry->type = type;
// Access rights
entry->dpl = access_rights;
// AVL
entry->avl = 0;
// Default operation set to 32 bits
entry->db = 1;
// Code segment
entry->l = 0;
// Present (always present)
entry->p = 1;
// Descriptor type (code or data)
entry->s = 1;
// Granularity (enabled with 4KBytes increment)
entry->g = 1;
}
struct gdt_entry {
u32 limit_0_15 : 16;
u32 base_0_15 : 16;
u32 base_16_23 : 8;
u32 type : 4;
u32 s : 1;
u32 dpl : 2;
u32 p : 1;
u32 limit_16_19 : 4;
u32 avl : 1;
u32 l : 1;
u32 db : 1;
u32 g : 1;
u32 base_24_31 : 8;
} __attribute__((packed));
struct gdt_r {
u16 limit;
u32 base;
} __attribute__((packed));
struct gdt_entry gdt[6];
void gdt_init() {
// Null segment
struct gdt_entry null_entry = { 0 };
gdt[0] = null_entry;
// Kernel code segment
gdt_entry_init(gdt + 1, 0x0, 0xFFFFFFFF, SEG_TYPE_CER, SEG_AC_KERNEL);
// Kernel data segment
gdt_entry_init(gdt + 2, 0x0, 0xFFFFFFFF, SEG_TYPE_DRW, SEG_AC_KERNEL);
// User code segment
gdt_entry_init(gdt + 3, 0x0, 0x0, SEG_TYPE_CEO, SEG_AC_USER);
// User data segment
gdt_entry_init(gdt + 4, 0x0, 0x0, SEG_TYPE_DRO, SEG_AC_USER);
// TSS
gdt[5] = null_entry;
struct gdt_r gdtr;
gdtr.base = (u32)gdt;
gdtr.limit = sizeof(gdt);
asm volatile("lgdt %0\n"
: /* no output */
: "m" (gdtr)
: "memory");
// 0x10 is the address of the kernel data segment
asm volatile("movw 0x10, %%ax\n":);
asm volatile("movw %%ax, %%ds\n":);
asm volatile("movw %%ax, %%fs\n":);
asm volatile("movw %%ax, %%gs\n":);
asm volatile("movw %%ax, %%ss\n":);
// 0x8 is the address of the kernel code segment
asm volatile("pushl 0x8\n"
"pushl $1f\n"
"lret\n"
"1:\n"
: /* no output */);
}
Does anyone have any idea what is going on here?
It turned out that a really smart person found the issues:
I missed the $ in front of immediate values when writing the inline asm:
// 0x10 is the address of the kernel data segment
asm volatile("movw $0x10, %%ax\n":);
asm volatile("movw %%ax, %%ds\n":);
asm volatile("movw %%ax, %%fs\n":);
asm volatile("movw %%ax, %%gs\n":);
asm volatile("movw %%ax, %%ss\n":);
// 0x8 is the address of the kernel code segment
asm volatile("pushl $0x8\n"
"pushl $1f\n"
"lret\n"
"1:\n"
: /* no output */);
I swapped the privilege levels for KERNEL and USER; the correct values are:
enum SEG_AC {
SEG_AC_KERNEL = 0b00,
SEG_AC_USER = 0b11,
};
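To see why the missing $ produced that particular Bochs message, it helps to decode the value that ended up in AX. Without the $, `movw 0x10, %ax` reads 16 bits from memory address 0x10 instead of loading the constant, and the register dump shows EAX=0000ff53. The sketch below splits a selector into the fields Bochs prints (index|ti|rpl) and repeats the limit check from the log; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of an x86 segment selector, as printed by Bochs (index|ti|rpl). */
struct selector_fields {
    uint16_t index; /* descriptor index into the table */
    uint8_t ti;     /* table indicator: 0 = GDT, 1 = LDT */
    uint8_t rpl;    /* requested privilege level */
};

static struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;
    f.index = (uint16_t)(sel >> 3);
    f.ti = (uint8_t)((sel >> 2) & 1u);
    f.rpl = (uint8_t)(sel & 3u);
    return f;
}

/* Mirrors the check in the Bochs message: the last byte of the 8-byte
 * descriptor (index * 8 + 7) must not lie beyond the table limit. */
static int selector_within_limit(uint16_t sel, uint32_t limit)
{
    assert(limit <= 0xFFFFu); /* the GDTR limit is a 16-bit field */
    return (uint32_t)(sel >> 3) * 8u + 7u <= limit;
}
```

Decoding 0xff53 gives index 0x1fea, whose descriptor would end at byte 0xff57, far beyond the limit of 0x30 in the log. The intended selector 0x10 decodes to index 2 and passes the check.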
I'm investigating how different compilers handle unaligned access to structure bit-field members, as well as members that cross the primitive types' boundaries, and I think MinGW64 is bugged. My test program is:
#include <stdint.h>
#include <stdio.h>
/* Structure for testing element access
The crux is the ISO C99 6.7.2.1p10 item:
An implementation may allocate any addressable storage unit large enough to hold a bitfield.
If enough space remains, a bit-field that immediately follows another bit-field in a
structure shall be packed into adjacent bits of the same unit. If insufficient space remains,
whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is
implementation-defined. The order of allocation of bit-fields within a unit (high-order to
low-order or low-order to high-order) is implementation-defined. The alignment of the
addressable storage unit is unspecified.
*/
typedef struct _my_struct
{
/* word 0 */
uint32_t first :32; /**< A whole word element */
/* word 1 */
uint32_t second :8; /**< bits 7-0 */
uint32_t third :8; /**< bits 15-8 */
uint32_t fourth :8; /**< bits 23-16 */
uint32_t fifth :8; /**< bits 31-24 */
/* word 2 */
uint32_t sixth :16; /**< bits 15-0 */
uint32_t seventh :16; /**< bits 31-16 */
/* word 3 */
uint32_t eigth :24; /**< bits 23-0 */
uint32_t ninth :8; /**< bits 31-24 */
/* word 4 */
uint32_t tenth :8; /**< bits 7-0 */
uint32_t eleventh :24; /**< bits 31-8 */
/* word 5 */
uint32_t twelfth :8; /**< bits 7-0 */
uint32_t thirteeneth :16; /**< bits 23-8 */
uint32_t fourteenth :8; /**< bits 31-24 */
/* words 6 & 7 */
uint32_t fifteenth :16; /**< bits 15-0 */
uint32_t sixteenth :8; /**< bits 23-16 */
uint32_t seventeenth :16; /**< bits 31-24 & 7-0 */
/* word 7 */
uint32_t eighteenth :24; /**< bits 31-8 */
/* word 8 */
uint32_t nineteenth :32; /**< bits 31-0 */
/* words 9 & 10 */
uint32_t twentieth :16; /**< bits 15-0 */
uint32_t twenty_first :32; /**< bits 31-16 & 15-0 */
uint32_t twenty_second :16; /**< bits 31-16 */
/* word 11 */
uint32_t twenty_third :32; /**< bits 31-0 */
} __attribute__((packed)) my_struct;
uint32_t buf[] = {
0x11223344, 0x55667788, 0x99AABBCC, 0x01020304, /* words 0 - 3 */
0x05060708, 0x090A0B0C, 0x0D0E0F10, 0x12131415, /* words 4 - 7 */
0x16171819, 0x20212324, 0x25262728, 0x29303132, /* words 8 - 11 */
0x34353637, 0x35363738, 0x39404142, 0x43454647 /* words 12 - 15 */
};
uint32_t data[64];
int main(void)
{
my_struct *p;
p = (my_struct*) buf;
data[0] = 0;
data[1] = p->first;
data[2] = p->second;
data[3] = p->third;
data[4] = p->fourth;
data[5] = p->fifth;
data[6] = p->sixth;
data[7] = p->seventh;
data[8] = p->eigth;
data[9] = p->ninth;
data[10] = p->tenth;
data[11] = p->eleventh;
data[12] = p->twelfth;
data[13] = p->thirteeneth;
data[14] = p->fourteenth;
data[15] = p->fifteenth;
data[16] = p->sixteenth;
data[17] = p->seventeenth;
data[18] = p->eighteenth;
data[19] = p->nineteenth;
data[20] = p->twentieth;
data[21] = p->twenty_first;
data[22] = p->twenty_second;
data[23] = p->twenty_third;
if( p->fifth == 0x55 )
{
data[0] = 0xCAFECAFE;
}
else
{
data[0] = 0xDEADBEEF;
}
int i;
for (i = 0; i < 24; ++i) {
printf("data[%d] = 0x%0x\n", i, data[i]);
}
return data[0];
}
And the results I found are:
| Data Member | Type | GCC Cortex M3 | GCC mingw64 | GCC Linux | GCC Cygwin |
|:------------|:-------:|:---------------|:--------------|:--------------|:--------------|
| data[0] | uint32_t| 0x0 | 0xcafecafe | 0xcafecafe | 0xcafecafe |
| data[1] | uint32_t| 0x11223344 | 0x11223344 | 0x11223344 | 0x11223344 |
| data[2] | uint32_t| 0x88 | 0x88 | 0x88 | 0x88 |
| data[3] | uint32_t| 0x77 | 0x77 | 0x77 | 0x77 |
| data[4] | uint32_t| 0x66 | 0x66 | 0x66 | 0x66 |
| data[5] | uint32_t| 0x55 | 0x55 | 0x55 | 0x55 |
| data[6] | uint32_t| 0xbbcc | 0xbbcc | 0xbbcc | 0xbbcc |
| data[7] | uint32_t| 0x99aa | 0x99aa | 0x99aa | 0x99aa |
| data[8] | uint32_t| 0x20304 | 0x20304 | 0x20304 | 0x20304 |
| data[9] | uint32_t| 0x1 | 0x1 | 0x1 | 0x1 |
| data[10] | uint32_t| 0x8 | 0x8 | 0x8 | 0x8 |
| data[11] | uint32_t| 0x50607 | 0x50607 | 0x50607 | 0x50607 |
| data[12] | uint32_t| 0xc | 0xc | 0xc | 0xc |
| data[13] | uint32_t| 0xa0b | 0xa0b | 0xa0b | 0xa0b |
| data[14] | uint32_t| 0x9 | 0x9 | 0x9 | 0x9 |
| data[15] | uint32_t| 0xf10 | 0xf10 | 0xf10 | 0xf10 |
| data[16] | uint32_t| 0xe | 0xe | 0xe | 0xe |
| data[17] | uint32_t| 0x150d | 0x1415 | 0x150d | 0x150d |
| data[18] | uint32_t| 0x121314 | 0x171819 | 0x121314 | 0x121314 |
| data[19] | uint32_t| 0x16171819 | 0x20212324 | 0x16171819 | 0x16171819 |
| data[20] | uint32_t| 0x2324 | 0x2728 | 0x2324 | 0x2324 |
| data[21] | uint32_t| 0x27282021 | 0x29303132 | 0x27282021 | 0x27282021 |
| data[22] | uint32_t| 0x2526 | 0x3637 | 0x2526 | 0x2526 |
| data[23] | uint32_t| 0x29303132 | 0x35363738 | 0x29303132 | 0x29303132 |
GCC Cortex M3 is
arm-none-eabi-gcc (GNU MCU Eclipse ARM Embedded GCC, 32-bit) 8.2.1 20181213 (release) [gcc-8-branch revision 267074]
GCC Mingw is
gcc.exe (i686-posix-dwarf-rev0, Built by MinGW-W64 project) 8.1.0
GCC Linux is
gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23)
GCC Cygwin is
gcc (GCC) 7.4.0
All GCC versions seem to correctly handle unaligned access (like my_struct.thirteeneth).
The problem is not that members who cross the word boundary (my_struct.seventeenth) are different, as the C99 standard quoted above clearly states that the behaviour is implementation-defined. The problem is that all subsequent accesses are clearly incorrect (data[17] and on) even for aligned members (my_struct.nineteenth & my_struct.twenty_third). What's going on here, is this a bug or are these valid values?
It is not a bug; it lays out the bit-fields according to the Windows ABI.
According to gcc docs:
If packed is used on a structure, or if bit-fields are used, it may be that the Microsoft ABI lays out the structure differently than the way GCC normally does.
Compile mingw64 version with -mno-ms-bitfields to fix the difference. Or compile all other versions with -mms-bitfields to lay out the structure the same as mingw.
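Whichever flag is chosen, a compile-time size check can catch a layout mismatch early. The following is a minimal sketch, assuming GCC or a compatible compiler; `layout_probe` is a hypothetical struct made up for illustration, not part of the question's code.

```c
#include <stdint.h>

/* Hypothetical probe struct: 16 + 8 + 16 + 24 = 64 bits of bit-fields. */
struct layout_probe {
    uint32_t a : 16;
    uint32_t b : 8;
    uint32_t c : 16; /* straddles the first 32-bit unit in GCC's native layout */
    uint32_t d : 24;
} __attribute__((packed));

/* With GCC's native layout the fields are packed back to back (8 bytes).
 * If the MS layout is in effect, c and d are expected to start new units
 * and the struct grows, so the build stops here instead of silently
 * reading the wrong bits. */
_Static_assert(sizeof(struct layout_probe) == 8,
               "unexpected bit-field layout; check -mms-bitfields / -mno-ms-bitfields");
```

The check only covers the struct's size, not where each field sits, so it is a tripwire rather than proof that two builds agree.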
The chance that a widely used compiler like GCC has a bug is not zero, but it is really small. Odds are that it is a PEBKAS. ;-)
Anyway, I have compiled your program with "gcc (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 8.1.0" and got the same result as you in the column "mingw64".
A closer look reveals that the compiler aligns the bit-fields on 32-bit boundaries, which happens to be the width of an int. This conforms to chapter 6.7.2.1 of the C17 standard, which states that the "straddling" (the word used in annex J.3.9) is implementation-defined.
The other GCC variants are not aligning the bit fields and support crossing 32-bit boundaries.
It is clearly not a bug; the values are valid. It might be worth researching the reasons and perhaps posting a feature request.
Edit:
Just to clarify, this is the layout with alignment. There is nothing wrong with elements seventeenth and following:
/* 0x11223344: word 0 */
uint32_t first :32;
/* 0x55667788: word 1 */
uint32_t second :8;
uint32_t third :8;
uint32_t fourth :8;
uint32_t fifth :8;
/* 0x99AABBCC: word 2 */
uint32_t sixth :16;
uint32_t seventh :16;
/* 0x01020304: word 3 */
uint32_t eigth :24;
uint32_t ninth :8;
/* 0x05060708: word 4 */
uint32_t tenth :8;
uint32_t eleventh :24;
/* 0x090A0B0C: word 5 */
uint32_t twelfth :8;
uint32_t thirteeneth :16;
uint32_t fourteenth :8;
/* 0x0D0E0F10: word 6 */
uint32_t fifteenth :16;
uint32_t sixteenth :8;
/* 0x12131415: word 7, because "seventeenth" does not fit in the space left */
uint32_t seventeenth :16;
/* 0x16171819: word 8, because "eighteenth" does not fit in the space left */
uint32_t eighteenth :24;
/* 0x20212324: word 9, because "nineteenth" does not fit in the space left */
uint32_t nineteenth :32;
/* 0x25262728: word 10 */
uint32_t twentieth :16;
/* 0x29303132: word 11, because "twenty_first" does not fit in the space left */
uint32_t twenty_first :32;
/* 0x34353637: word 12 */
uint32_t twenty_second :16;
/* 0x35363738: word 13, because "twenty_third" does not fit in the space left */
uint32_t twenty_third :32;
You can not rely at all, in any way, on how bit-fields are arranged in a structure.
Per 6.7.2.1 Structure and union specifiers, paragraph 11 of the C11 standard (bolding mine):
An implementation may allocate any addressable storage unit large enough to hold a bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined. The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified.
You even quoted that. Given that, there is no "incorrect" way for an implementation to lay out a bit-field.
So you can not rely on the size of the bit-field container.
You can not rely on whether or not a bit-field crosses units.
You can not rely on the order of bit-fields within a unit.
Yet your question assumes you can do all that, even using terms such as "correct" when you see what you expected and "clearly incorrect" to describe bit-field layouts you didn't expect.
It's not "clearly incorrect".
If you need to know where a bit is in a structure, you simply can not portably use bit-fields.
In fact, all your effort on this question is a perfect case study in why you can't rely on bit-fields.
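When the position of each field matters, a common alternative is to extract the bits with shifts and masks. This is a minimal sketch, assuming the data is held as an array of 32-bit words and that bits are numbered from the least significant bit of the first word; `extract_bits` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Read `width` bits (1..32) starting `bit_offset` bits into an array of
 * 32-bit words, counting from the least significant bit of words[0].
 * The caller must ensure the field lies inside the array. */
static uint32_t extract_bits(const uint32_t *words, unsigned bit_offset, unsigned width)
{
    assert(width >= 1u && width <= 32u);

    unsigned idx = bit_offset / 32u;
    unsigned shift = bit_offset % 32u;
    uint64_t window = words[idx];

    if (shift + width > 32u) {
        /* Field straddles a word boundary: pull in the next word. */
        window |= (uint64_t)words[idx + 1u] << 32;
    }

    uint64_t mask = (UINT64_C(1) << width) - 1u;
    return (uint32_t)((window >> shift) & mask);
}
```

With the question's buf, `extract_bits(buf, 6 * 32 + 24, 16)` reads the field called seventeenth and yields 0x150d, the value in the GCC Linux column, independent of any bit-field layout flag.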
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 1] .text PROGBITS 00000000 000034 00002a 00 AX 0 0 4
As shown above, the .text section begins at offset 0x34, but its Al value is 4, so I would expect the offset to be divisible by 2**4.
It is not: 0x34 % 16 != 0. Why doesn't the .text section start at an integer multiple of 16?
The section header struct looks like this:
typedef struct {
uint32_t sh_name;
uint32_t sh_type;
uint32_t sh_flags;
Elf32_Addr sh_addr;
Elf32_Off sh_offset;
uint32_t sh_size;
uint32_t sh_link;
uint32_t sh_info;
uint32_t sh_addralign;
uint32_t sh_entsize;
} Elf32_Shdr;
So what you see under the Al column is sh_addralign. Let's look at the description of that member from the elf manpage:
sh_addralign
Some sections have address alignment constraints. If a
section holds a doubleword, the system must ensure
doubleword alignment for the entire section. That is, the
value of sh_addr must be congruent to zero, modulo the
value of sh_addralign. Only zero and positive integral
powers of two are allowed. Values of zero or one mean the
section has no alignment constraints.
TL;DR: The alignment constraint shown in the Al column is for Addr (which is aligned in your case since it's zero), not for Off. In other words, it's an alignment constraint for the address where the image is loaded in memory, not for where it's stored in the ELF file.
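The rule quoted from the manpage can be written as a one-line check. The sketch below uses plain integers rather than the Elf32_Shdr type so that it compiles anywhere; `satisfies_addralign` is a hypothetical name. Note that the manpage compares against the value of sh_addralign itself, so Al = 4 means a multiple of 4 bytes, not of 2**4.

```c
#include <assert.h>
#include <stdint.h>

/* Returns nonzero when addr satisfies an sh_addralign-style constraint.
 * Values 0 and 1 mean "no constraint"; other values must be powers of two. */
static int satisfies_addralign(uint32_t addr, uint32_t addralign)
{
    if (addralign <= 1u) {
        return 1;
    }
    assert((addralign & (addralign - 1u)) == 0u); /* must be a power of two */
    return (addr % addralign) == 0u;
}
```

For the section above, `satisfies_addralign(0x00000000, 4)` holds for Addr. The file offset 0x34 is not subject to the constraint, although it happens to be a multiple of 4 as well.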
I wrote a small program to show the casting problem:
#include <stdlib.h>
struct flags {
u_char flag1;
u_char flag2;
u_short flag3;
u_char flag4;
u_short flag5;
u_char flag7[5];
};
int main(){
char buffer[] = "\x01\x02\x04\x03\x05\x07\x06\xff\xff\xff\xff\xff";
struct flags *flag;
flag = (struct flags *) buffer;
return 0;
}
My problem is that after the cast, flag5 wrongly takes the bytes "\x06\xff", ignoring the "\x07", and flag7 wrongly takes the next four "\xff" bytes plus a NUL, which is the next byte. I also ran gdb:
(gdb) p/x flag->flag5
$1 = 0xff06
(gdb) p/x flag->flag7
$2 = {0xff, 0xff, 0xff, 0xff, 0x0}
(gdb) x/15xb flag
0xbffff53f: 0x01 0x02 0x04 0x03 0x05 0x07 0x06 0xff
0xbffff547: 0xff 0xff 0xff 0xff 0x00 0x00 0x8a
Why is this happening, and how can I handle it correctly?
Thanks.
This looks like a structure member alignment issue. Unless you know how your compiler packs structure members, you should not make assumptions about the positions of those members in memory.
The reason that the 0x07 is apparently lost, is because the compiler is probably aligning the flag5 member on a 16-bit boundary, skipping the odd memory location that holds the 0x07 value. That value is lost in the padding. Also, what you are doing is overflowing the buffer, a big no-no. In other words:
struct flags {
u_char flag1; // 0x01
u_char flag2; // 0x02
u_short flag3; // 0x04 0x03
u_char flag4; // 0x05
// 0x07 is in the padding
u_short flag5; // 0x06 0xff
u_char flag7[5]; // 0xff 0xff 0xff 0xff ... oops, buffer overrun, because your
// buffer was less than the sizeof(flags)
};
You can often control the packing of structure members with most compilers, but the mechanism is compiler specific.
The compiler is free to put some unused padding between members of the structure to (for instance) arrange the alignment to its convenience. Your compiler may provide a #pragma pack directive or a command-line argument to ensure tight structure packing.
How structures are stored is implementation defined, and thus, you can't rely on a specific memory layout for serialization like that.
To serialize your structure to a byte array, write a function which serializes each field in a set order.
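The same idea applies in the reading direction. Here is a minimal sketch of decoding the question's buffer field by field, assuming the 16-bit fields are little-endian in the byte stream (the question does not say; gdb's 0xff06 for the bytes 0x06 0xff suggests a little-endian host). The fixed-width types replace u_char/u_short, and `FLAGS_WIRE_SIZE`, `read_u16_le`, and `flags_parse` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same fields as the question's struct, with fixed-width types. */
struct flags {
    uint8_t flag1;
    uint8_t flag2;
    uint16_t flag3;
    uint8_t flag4;
    uint16_t flag5;
    uint8_t flag7[5];
};

#define FLAGS_WIRE_SIZE 12u /* 1 + 1 + 2 + 1 + 2 + 5 bytes, no padding */

/* Assumption: 16-bit fields are little-endian in the byte stream. */
static uint16_t read_u16_le(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Decode one record; the in-memory layout of struct flags no longer matters. */
static void flags_parse(struct flags *out, const uint8_t *buf, size_t len)
{
    assert(len >= FLAGS_WIRE_SIZE);
    out->flag1 = buf[0];
    out->flag2 = buf[1];
    out->flag3 = read_u16_le(buf + 2);
    out->flag4 = buf[4];
    out->flag5 = read_u16_le(buf + 5);
    memcpy(out->flag7, buf + 7, sizeof out->flag7);
}
```

Applied to the twelve bytes from the question, this reads flag5 from the bytes 0x07 0x06 (0x0607 under the little-endian assumption), so the 0x07 is no longer lost in padding and nothing is read past the end of the buffer.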
You might need to pack the struct:
struct __attribute__ ((__packed__)) flags {
u_char flag1;
u_char flag2;
u_short flag3;
u_char flag4;
u_short flag5;
u_char flag7[5];
};
Note: This is GCC -- I don't know how portable it is.
This has to do with padding. The compiler inserts unused padding bytes into your struct so that each member is aligned in memory for efficient access.
See the following examples:
http://msdn.microsoft.com/en-us/library/71kf49f1(v=vs.80).aspx
http://en.wikipedia.org/wiki/Data_structure_alignment#Typical_alignment_of_C_structs_on_x86