What exactly does ". = 0x7c00" in a linker script do?
More specifically, when I place . = 0x7c00 at the beginning of a linker script, why doesn't the resulting output file begin with 0x7c00 = 31,744 zeros?
I understand that when a PC boots, the BIOS places the 512 byte MBR at memory address 0x7c00. However, I am confused as to how exactly the linker's location counter affects how the output file is laid out.
(For context, I'm trying to thoroughly understand the sample code from the "x86 bare metal" project. https://github.com/cirosantilli/x86-bare-metal-examples. I've included the entire linker script below for context.)
SECTIONS
{
/*
We could also pass the -Ttext 0x7C00 to as instead of doing this.
If your program does not have any memory accesses, you can omit this.
*/
. = 0x7c00;
.text :
{
__start = .;
/*
We are going to stuff everything
into a text segment for now, including data.
Who cares? Other segments only exist to appease C compilers.
*/
*(.text)
/*
Magic bytes. 0x1FE == 510.
We could add this on each Gas file separately with `.word`,
but this is the perfect place to DRY that out.
*/
. = 0x1FE;
SHORT(0xAA55)
*(.stage2)
__stage2_nsectors = ABSOLUTE((. - __start) / 512);
. = ALIGN(512);
__end = .;
__end_align_4k = ALIGN(4k);
}
}
It looks like the ". = 0x7c00" is not meaning a length but an absolute address. It reads to me as 'set the current value of the special variable "." to be the hex value 0x7c00 and then it plans to use that address as an offset later in the script like with the . = ALIGN(512) it is also why it saves that address off as __start so it can then do math on the resulting image. If you manipulate . during the script so it points to the last chunk of memory added to the image then you can use it to determine the total size:
__stage2_nsectors = ABSOLUTE((. - __start) / 512);
in English would be
The difference between the starting place and wherever I ended divided by sector size.
Related
I have a new standard c++ project on an imx rt 1024 (an nxp chip), in which I try to move my vector table to SRAM. It fails, depending on a change I apply in the linker script.
The project is a new project from scratch created by MCUxpresso. I am not looking for answers MCUxpresso related, or c/c++/startup code related. I only want to properly understand the consequences of my changed linker script I show below.
The part that works
My starting point is a small program on my evk board, using a simple FreeRTOS task to blink a led. This works fine, when I put my vector table in flash.
linker script:
/* Not relevant for this question, other than showing there is something
written to flash before my vector table, harmless I think, but didn't want to leave
out of this question
*/
.boot_hdr : ALIGN(4)
{
__boot_hdr_start__ = ABSOLUTE(.) ;
KEEP(*(.boot_hdr.conf))
. = 0x1000 ;
KEEP(*(.boot_hdr.ivt))
. = 0x1020 ;
KEEP(*(.boot_hdr.boot_data))
. = 0x1030 ;
KEEP(*(.boot_hdr.dcd_data))
__boot_hdr_end__ = ABSOLUTE(.) ;
. = 0x2000 ;
} >PROGRAM_FLASH
/*
Here I write my vector table to flash
*/
.vector : ALIGN(4)
{
__vector_table_flash_start__ = ADDR(.vector) ;
__vector_table_itc_start__ = LOADADDR(.vector) ;
KEEP(*(.isr_vector))
__vector_table_flash_end__ = ABSOLUTE(.) ;
. = ALIGN(4) ;
} >PROGRAM_FLASH
Disassembled code for vector table
Disassembled code of reset handler
Note: 0x600022e5 corresponds to 0x600022e4, this has something to do with arm .thumb. I don't exactly know how that works tbh.
When I run this app, it runs fine. If I set a breakpoint in the ResetHandler it breaks and I can step through the startup code and jump to main. When I let the program run, my led will blink every second.
The part which fails
I changed my linker script to put my vector table in SRAM as follows
.vector : ALIGN(4)
{
__vector_table_flash_start__ = ADDR(.vector) ;
__vector_table_itc_start__ = LOADADDR(.vector) ;
KEEP(*(.isr_vector))
__vector_table_flash_end__ = ABSOLUTE(.) ;
. = ALIGN(4) ;
} >SRAM_ITC AT>PROGRAM_FLASH
For reference, the memory section:
MEMORY
{
PROGRAM_FLASH (rx) : ORIGIN = 0x60000000, LENGTH = 0x400000
SRAM_DTC (rwx) : ORIGIN = 0x20000000, LENGTH = 0x10000
SRAM_ITC (rwx) : ORIGIN = 0x0, LENGTH = 0x10000
SRAM_OC (rwx) : ORIGIN = 0x20200000, LENGTH = 0x20000
}
ENTRY(ResetISR)
When I upload, my program doesn't even reach the reset vector. It goes straight into the woods, and crashes somewhere outside program code.
The questions
What EXACTLY happens when I adjust my linker script with >SRAM_ITC AT>PROGRAM_FLASH?
I am pretty sure the produced elf file still contains the entire vector table starting from address 0x60002000. The >SRAM_ITC only tells the linker where certain parts of memory will end up AFTER the startup code copied all parts to their final ram location. Right? So how on earth can the initial jump to 0x60002004 (the address which holds the location of the reset handler) fail? The nxp bootloader always expects the reset vector on that location. I didn't change that. I only told the linker that the memory on that location will finally end up in SRAM. What am I misunderstanding here?
Maybe a stupid question: If I am completely wrong with my above assumptions, is there a way to see this from disassembly? I think objdump only shows the final addresses, but my debug probe will only write to flash as far as I know. So after uploading my code to my target, I still assume that stuff got written to flash, and after reset the built in bootloader will jump to 0x60002004 and set the PC to the address located at 0x60002000. Where can I see the actual blob of bytes which is programmed to flash memory?
Copying the vector table to sram from my custom bootloader solved the problem. That way the "on chip bootloader" from nxp can jump to my custom bootloader.
Before I just to my app from my custom bootloader, I copy the vector table to sram and set SCB->VTOR to the start of sram vector table.
In my linker file, I have the following memory sections.
MEMORY
{
ram (rwx) : ORIGIN = 0x20000000, LENGTH = 64k
mram (rwx) : ORIGIN = 0xA0000010, LENGTH = 1M
}
The actual address of the mram peripheral starts at 0xA0000000. In a C file I can write to a specific memory address as
(*(uint32_t *)(void *)0xA0000000) = 0xaabbccdd;
Will this cause any problems?
The space specified in the linker script defines where and what the linker can locate in the defined memory regions. It will not locate anything to the "hole" you have left at 0xA0000000 to 0xA000000F because it is not aware of it.
In that sense it is "safe" in that the linker will not attempt to use that space. It is entirely in the control of your code - you have taken responsibility for that region by not giving it to the linker. And indeed the statement:
(*(uint32_t *)(void *)0xA0000000) = 0xaabbccdd;
will write a 32 bit value to that location. The point is neither the compiler nor the linker will prevent you from doing what you will in that region.
What is less plausible is LENGTH = 1M. That would make your mram 0x100010 bytes long (i.e. 1M + 0x10). That is a problem because the linker is free to locate objects in the region 0x100000 to 0x10000F. The consequences of that depend on your hardware, but quite possibly it will wrap into the region 0x100000 to 0x10000F that you have attempted to hide from the linker. I would imagine that you need LENGTH = 1M - 0x10 or LENGTH = 0x0FFFF0.
Now while you can absolve the linker from managing that region in order to manage it in your code, it may not be the best approach. It would be better to create a linker symbol at the required absolute address.
So given:
MEMORY
{
ram (rwx) : ORIGIN = 0x20000000, LENGTH = 64k
mram (rwx) : ORIGIN = 0xA0000000, LENGTH = 1M
}
Your would create a linker symbol in mram:
SECTIONS
{
...
.mram :
{
reserved_mram = 0 ; /* address is zero offset from .mram1 */
. += 0x10 ; /* create 16 byte "hole" at address reserved_mram */
... /* other mram linker assignment follows */
} < mram
...
}
Then in your code you should be able to declare:
extern uint32_t reserved_mram[] ; // 4 x 32-bit word array.
And through reserved_mram you can access the memory at 0xA00000000 symbolically and the code is always in sync with the linker script so you can relocate the space easily without introducing a conflict.
Of course there is no bounds checking and no size information - you still need to confine your access to reserved_mram[0] to reserved_mram[3].
You might alternatively create a separate symbol for each location (with meaningful names specific to your application):
.mram :
{
reserved_mram1 = 0 ;
. += 4 ;
reserved_mram2 = . ;
. += 4 ;
reserved_mram3 = . ;
. += 4 ;
reserved_mram4 = . ;
. += 4 ;
... /* other mram linker usage follows */
} < mram
Then in your code:
extern uint32_t reserved_mram1 ;
extern uint32_t reserved_mram2 ;
extern uint32_t reserved_mram3 ;
extern uint32_t reserved_mram4 ;
Another alternative; you might create an independent section for the region, then create variables within it using __attribute__((section(.xxx))) directives in the code. e.g:
MEMORY
{
ram (rwx) : ORIGIN = 0x20000000, LENGTH = 64k
mram1 (rwx) : ORIGIN = 0xA0000000, LENGTH = 0x10
mram2 (rwx) : ORIGIN = 0xA0000010, LENGTH = 0xFFFF0
}
SECTIONS
{
...
.mram1 :
{
*(.mram1)
} < mram1
...
}
Then in your code:
uint32_t my_mram_data __attribute__ ((section (".mram1"))) ;
The variable my_mram_data will be created somewhare in .mram1, but the linker decides where. The advantage here is that you can create arbitrary variables in the code without modifying the linker script, and if you attempt to allocate to .mram1 more data than is available you will get a linker error.
Note that linker script syntax is arcane and varies between linkers - I am unassuming this relates to the GNU linker? But my linker foo is strictly on demand (i.e. I figure it out when I need to) and I make no claim that any of the above is complete or correct, or even the only possible solutions - regard it as illustrative, and refer to the linker documentation for accurate information.
I use a STM32F411RE.
Since I've no more memory in my RAM. I decided to store large variable in my flash. For that I created a section in section.ld.
.large_buffer: ALIGN(4)
{
. = ALIGN(4) ;
*(.large_buffer.large_buffer.*)
. = ALIGN(4) ;
} >FLASH
In the main.c file, I declare the variable as follow :
uint8_t buffer[60 * 200] __attribute__ ((section(".large_buffer"), used));
At this point everything is OK, the buffer is not stocked in the RAM (bss), I can access it and rewrite it.
buffer[25] = 42;
printf("%d\n", buffer[25]); // 42
The problem comes when I want to edit the variable from an other file.
main.c
uint8_t buffer[60 * 200] __attribute__ ((section(".large_buffer"), used));
int main()
{
myFunc(buffer);
}
other.c
myFunc(uint8_t* buffer)
{
buffer[25] = 42;
printf("%d\n", buffer[25]); // 0
}
buffer never change in another file (passed as parameter).
Did I miss something ?
You cannot write to flash memory the same way as you write to RAM, because of physical design of flash memories. To be exact you need to erase sector/page (let's say ~ 1-4kB, it's specified in your MCU datasheet). The reason is that flash are made that they retain state even if they're not powered, whenever you want to change any bit from value 0 -> 1 you need to erase whole sector (After erase all of bits will be set to 1).
So you cannot use Flash as data memory, what you could do is use Flash as storing variables that are const (read-only) value, so any look-up tables will perfectly fit in there (usually compilers when you stat variable to const will put them inside of flash). How to write to flash you can read in Reference Manual of your MCU.
I would like to place a 2kB chunk of memory, aligned 16 bytes before a 1024 bytes alignment.
Platform : arm, bare metal, GNU toolchain. No need for portability
Can I do that with either GCC/attributes pragmas, ld custom linker script or any other solution ?
I would like to avoid wasting 1kB for that (basically placing a 3kB chunk of memory aligned at 1kB & adding 1024-16 bytes of padding).
Forcing a particular address to place the data is possible, but will ld be able to place variables before and after it (or is it just a way to put padding ? )
Context : buffer needs to be at 1k boundary by hardware design , but I'd like to add a bit of room before / after to be able to copy to this buffer with no bounds checking if my source is at most 16B wide.
edit: added example.
Let's say I have RAM starting at 0x2000000. I need a char buf[2048] be placed in it, at 1024*N-16 offset - ie (&buf[16])%1024==0 , hopefully without losing 1008 padding bytes.
(edit2)
So I'd like to have :
0x2000000 - some vars
0x2000010 - some other vars ...
0x2000100 - some other vars ...
0x20003F0 - char buf[2048] : here (int)&buf[16]%1024=0x2000400%1024==0
0x2000BF0 - some other vars ...
ALIGN(exp) is equivalent to (. + exp - 1) & ~(exp - 1). Of course, that expression only works if exp is a power of two. So you can't use ALIGN(), but you can write your own expression that does produce the result you want. Something like ((. + 1024 + 16 - 1) & ~(1024 - 1)) - 16 should do the trick. Plug in various values for . and you see it rounds up like you want.
The problem you'll have is that the linker will place every section you specified to be before your special section before it, and every section specified to be after it after it. It won't cleverly order the .data sections of different files to be before or after so as to produce the minimum amount of padding. It also won't re-order individual variables within an object file and section at all. If you are trying to pack as tightly as possible, I think you'll need to do something like:
.data : {
*(.about1008bytes)
. = ((. + 1024 + 16 - 1) & ~(1024 - 1)) - 16
*(.DMAbuf)
*(.data)
}
Use a section attribute to place your buffer in .DMAbuf and try to find close to but not more than 1008 bytes of other data variables and stick them in section .about1008bytes.
If you want to go crazy, use gcc -fdata-sections to place every data object in its own section, extract the section sizes with readelf, give that to a program you write to sort them for optimal packing that then spits out a chunk of linker script listing them in the optimal order.
You should define a specific section into your linker script, with the required alignment.
Looking at the man
ALIGN(exp)
Return the result of the current location counter (.) aligned to the next exp boundary. exp must be an expression whose value is a power of two. This is equivalent to
(. + exp - 1) & ~(exp - 1)
ALIGN doesn't change the value of the location counter--it just does arithmetic on it. As an example, to align the output .data section to the next 0x2000 byte boundary after the preceding section and to set a variable within the section to the next 0x8000 boundary after the input sections:
SECTIONS{ ...
.data ALIGN(0x2000): {
*(.data)
variable = ALIGN(0x8000);
}
... }
The first use of ALIGN in this example specifies the location of a section because it is used as the optional start attribute of a section definition (see section Optional Section Attributes). The second use simply defines the value of a variable. The built-in NEXT is closely related to ALIGN.
As an example you can define your section
SECTIONS
{
.myBufBlock ALIGN(16) :
{
KEEP(*(.myBufSection))
} > m_data
}
and into your code you can
unsigned char __attribute__((section (".myBufSection"))) buf[2048];
EDIT
SECTIONS
{
.myBufBlock 0x0x7FBF0 :
{
KEEP(*(.myBufSection))
} > m_data
}
and into your code you can
unsigned char __attribute__((section (".myBufSection"))) buf[16+2048+16];
To your DMA you can set the address &buf[16] that will be 1k aligned.
Below is a portion of the C code I am using:
pushbutton_ISR()
{
int press;
int key_pressed;
press = *(KEYS_ptr + 3); // read the pushbutton Edge Det Register interrupt register
*(KEYS_ptr + 3) = 0; // Clear the Edge Det registers.
if (press & 0x1) { // KEY1
key_pressed = KEY1;
//sum = sum + *NEW_NUMBER;
}
else if (press & 0x2) { // KEY2
key_pressed = KEY2;
*GREEN_LEDS = *NEW_NUMBER;
sum = sum + *NEW_NUMBER;
*RED_LEDS = sum;
}
else // i.e. (press & 0x8), which is KEY3
sum = *(NEW_NUMBER); // Read the SW slider switch values; store in pattern
return;
}
The compiler compiles this fine and the code appears to run (on an Altera board) fine. However, when I change the first if statement to:
if (press & 0x1) { // KEY1
//key_pressed = KEY1;
sum = sum + *NEW_NUMBER;
}
the compiler gives the following error messages:
.../nios2-elf/bin/ld.exe: section .data loaded at [00000a00,00000e0f] overlaps section .text loaded at [00000500,00000a0f]
.../nios2-elf/bin/ld.exe: section .ctors loaded at [00000a10,00000a13] overlaps section .data loaded at [00000a00,00000e0f]
.../nios2-elf/bin/ld.exe: Z:/Projects/Altera/3215_W15_LabB/Part2/from_handout.elf: section .data vma 0xa00 overlaps previous sections
.../nios2-elf/bin/ld.exe: Z:/Projects/Altera/3215_W15_LabB/Part2/from_handout.elf: section .ctors vma 0xa10 overlaps previous sections
.../nios2-elf/bin/ld.exe: Z:/Projects/Altera/3215_W15_LabB/Part2/from_handout.elf: section .rodata vma 0xa14 overlaps previous sections
.../nios2-elf/bin/ld.exe: Z:/Projects/Altera/3215_W15_LabB/Part2/from_handout.elf: section .sdata vma 0xe10 overlaps previous sections
.../nios2-elf/bin/ld.exe: Z:/Projects/Altera/3215_W15_LabB/Part2/from_handout.elf: section .sbss vma 0xe18 overlaps previous sections
Could you please advise me about the reasons for these errors, and how to resolve them.
This has nothing to do with your code being incorrect.
These are linker errors (it even tells you ld.exe is the program complaining) about output sections overlapping. This probably means you just ran out of space, but could also mean the linker directive file your project is using has some problems.
When you add in this line, it causes the size of the compiled code to be too big for the memory area that you are loading the code into.
You can see from the first line of the linker error message that .text (the code) is loaded at 0x500, and .data (the non-zero static variables) is loaded at 0xa00. However, the .text section is so long that it is too big to fit in the space between 0x500 and 0xa00.
To fix this you will either need to:
Make your code smaller
Increase the amount of space available for .text
To do the first one, you could use -Os or similar compiler option to compile for minimum code size ; or manually rewrite your code to be smaller.
For the second one you really need to understand the hardware you are loading the code into. Is it a hardware requirement that code goes at 0x500 and data goes at 0xa00? If not, then you may be able to load the code and/or data into different addresses.
These addresses are configured in your linker script (this may be hardcoded into the makefile or it may be an actual file somewhere). Hopefully the hardware device came with documentation that explains how much memory it has and where you're allowed to load your code to.