Is there a separate stack for the FreeRTOS ISR context? Is it fixed or configurable?
#define configMINIMAL_STACK_SIZE ( ( unsigned short ) 256 )
#define configTOTAL_HEAP_SIZE ( ( size_t ) ( 512 * 1024 ) )
From my understanding, this stack size applies only to general tasks and not to ISRs. Any insights would be helpful.
Adding more details: this is an exclusive FreeRTOS port that is not available in the community. The architecture is arm926ej-s (it can support a full-fledged Linux kernel with MMU support, but there was a need to run an RTOS on it).
ISR stack sizes are configured by the startup code in your port. There are two interrupt modes, FIQ and IRQ, and each has its own stack.
I searched an ARM9 FreeRTOS demo for its stack configuration; here is the result:
FreeRTOS/Demo/ARM9_STR91X_IAR$ grep -sri "FIQ_STACK"
91x_init.s: SECTION FIQ_STACK:DATA:NOROOT(3)
91x_init.s: LDR SP, =SFE(FIQ_STACK)
STR91x_FLASH.icf:define block FIQ_STACK with alignment = 8, size = __ICFEDIT_size_fiqstack__ { };
STR91x_FLASH.icf: block CSTACK, block SVC_STACK, block IRQ_STACK, block FIQ_STACK,
91x_init_IAR.s:FIQ_Stack DEFINE USR_Stack-8 ; followed by FIQ stack
91x_init_IAR.s:ABT_Stack DEFINE FIQ_Stack-8 ; followed by ABT stack
91x_init_IAR.s: LDR SP, =FIQ_Stack
FreeRTOS/Demo/ARM9_STR91X_IAR$ grep -sri __ICFEDIT_size_fiqstack__
STR91x_FLASH.icf:define symbol __ICFEDIT_size_fiqstack__ = 0x10;
STR91x_FLASH.icf:define block FIQ_STACK with alignment = 8, size = __ICFEDIT_size_fiqstack__ { };
This means that the stack sizes are defined in the STR91x_FLASH.icf file or in 91x_init_IAR.s in the ARM9_STR91X_IAR demo, depending on the compiler and startup files you use to build.
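To make the "one stack per mode" idea concrete, here is a minimal sketch in C of what the startup code effectively does: it carves one block per processor mode out of RAM, top down, and records where each mode's SP starts. Only the FIQ size (0x10) comes from the .icf file above; the other sizes, the mode list and the function name are assumptions for illustration, and real ports do this in assembly or in the linker configuration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-mode stack sizes in bytes; only the FIQ value (0x10)
 * comes from the demo's .icf file, the rest are made up for illustration. */
enum { MODE_FIQ, MODE_IRQ, MODE_SVC, MODE_COUNT };

static const uint32_t mode_stack_size[MODE_COUNT] = {
    [MODE_FIQ] = 0x10,
    [MODE_IRQ] = 0x100,
    [MODE_SVC] = 0x100,
};

/* Fill initial_sp[] with the starting SP of each mode, carving the stacks
 * downward from ram_top. Stacks are full-descending, so each mode starts
 * at the top of its own block. */
void carve_mode_stacks(uint32_t ram_top, uint32_t initial_sp[MODE_COUNT])
{
    assert((ram_top & 7u) == 0);   /* keep every SP 8-byte aligned */
    uint32_t top = ram_top;
    for (size_t m = 0; m < MODE_COUNT; m++) {
        initial_sp[m] = top;
        top -= (mode_stack_size[m] + 7u) & ~7u;   /* round size up to 8 */
    }
}
```

None of these blocks is sized by configMINIMAL_STACK_SIZE; that setting only affects task stacks allocated by the kernel.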
I am coding a bootloader for Nucleo-F429ZI. I have two different STM32 projects, one for the bootloader itself and an application to jump from the bootloader.
Linker script for bootloader
MEMORY
{
CCMRAM (xrw) : ORIGIN = 0x10000000, LENGTH = 64K
RAM (xrw) : ORIGIN = 0x20000000, LENGTH = 32K
FLASH (rx) : ORIGIN = 0x8000000, LENGTH = 32K
}
Linker script for app
_estack = ORIGIN(RAM) + LENGTH(RAM);
MEMORY
{
CCMRAM (xrw) : ORIGIN = 0x10000000, LENGTH = 64K
RAM (xrw) : ORIGIN = 0x20000000, LENGTH = 192K
FLASH (rx) : ORIGIN = 0x8008000, LENGTH = 64K
}
I did not forget to set the flash offset of the app.
system_stm32f4xx.c (in the app project)
#define VECT_TAB_BASE_ADDRESS FLASH_BASE // 0x8000000
#define VECT_TAB_OFFSET 0x00008000U
The tutorial of STMicroelectronics about bootloaders has the following code to jump
main.c (in bootloader project)
#define FLASH_APP_ADDR 0x8008000
typedef void (*pFunction)(void);
uint32_t JumpAddress;
pFunction Jump_To_Application;
void go2APP(void)
{
JumpAddress = *(uint32_t*)(FLASH_APP_ADDR + 4);
Jump_To_Application = (pFunction) JumpAddress;
__set_MSP(*(uint32_t*)FLASH_APP_ADDR); // in cmsis_gcc.h
Jump_To_Application();
}
cmsis_gcc.h (in bootloader project)
__STATIC_FORCEINLINE void __set_MSP(uint32_t topOfMainStack)
{
__ASM volatile ("MSR msp, %0" : : "r" (topOfMainStack) : );
}
As you can see, __set_MSP function sets the main stack pointer before jumping to FLASH_APP_ADDR + 4.
I found the memory location of the target by debugging. Jumping through FLASH_APP_ADDR + 4 runs the Reset_Handler function of the app project. Let's see what gets executed.
startup_stm32f429zitx.s (in the app project)
.section .text.Reset_Handler
.weak Reset_Handler
.type Reset_Handler, %function
Reset_Handler:
ldr sp, =_estack /* set stack pointer */
/* Copy the data segment initializers from flash to SRAM */
ldr r0, =_sdata
ldr r1, =_edata
ldr r2, =_sidata
movs r3, #0
b LoopCopyDataInit
The first thing Reset_Handler does is set the stack pointer. _estack is defined in the linker script.
If Reset_Handler sets the stack pointer, why do we call the __set_MSP function? I removed the __set_MSP call and the bootloading process still works. However, I examined some other bootloader code and found exactly the same logic.
I tried what I described above and could not find an explanation.
During the boot sequence, the Cortex-M core loads the SP register with the initial value stored at address FLASH_BASE+0. It then jumps to the code entry point (the reset vector) stored at address FLASH_BASE+4. Any bootloader code mimics this core behaviour. Note that FLASH_BASE here is not necessarily the actual flash base, but an abstract value that depends on the processor used and its settings.
The Reset_Handler code shown loads the sp register with the _estack (main stack top) value, but it doesn't have to! The bootloader cannot expect the main program to do it, so it has to perform the same boot sequence as the core does after reset. This way the main code doesn't have to know who started it: the core, a bootloader, JTAG, or something else.
I've seen startup code, that doesn't load SP, but disables interrupts with the first instruction. Or startup code, written in C, which could use stack with the first instruction.
The real question here could be: Why this startup code loads SP if it is already loaded? But perhaps it should be forwarded to the original code author.
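As a rough illustration of "the bootloader mimics the core", here is a minimal sketch in C that reads the same two words the core would fetch and sanity-checks them before a jump. The address ranges are taken from the app linker script in the question; the struct, the function name and the plausibility checks themselves are my own assumptions, not something the ST tutorial does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Address ranges taken from the application's linker script above. */
#define APP_RAM_START   0x20000000u
#define APP_RAM_END     (APP_RAM_START + 192u * 1024u)
#define APP_FLASH_START 0x08008000u
#define APP_FLASH_END   (APP_FLASH_START + 64u * 1024u)

typedef struct {
    uint32_t initial_sp;    /* word 0 of the vector table */
    uint32_t reset_handler; /* word 1 of the vector table */
} boot_words_t;

/* Read the two words the core would fetch after reset and run a few
 * plausibility checks (the checks are this sketch's own addition). */
bool read_boot_words(const uint32_t *vector_table, boot_words_t *out)
{
    assert(vector_table != NULL && out != NULL);
    out->initial_sp    = vector_table[0];
    out->reset_handler = vector_table[1];

    bool sp_ok = out->initial_sp > APP_RAM_START &&
                 out->initial_sp <= APP_RAM_END &&
                 (out->initial_sp & 7u) == 0;
    bool pc_ok = (out->reset_handler & 1u) == 1u &&   /* Thumb bit set */
                 out->reset_handler >= APP_FLASH_START &&
                 out->reset_handler <  APP_FLASH_END;
    return sp_ok && pc_ok;
}
```

On the target, vector_table would be (const uint32_t *)FLASH_APP_ADDR; an erased flash sector reads back as 0xFFFFFFFF in both words and is rejected by these checks.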
Let's see what's happening line by line.
JumpAddress = *(uint32_t*)(FLASH_APP_ADDR + 4);
Okay, so we take FLASH_APP_ADDR, add one word to it, treat the result as a pointer to a word, and dereference it. So it's the content of 0x8008004, which is one word after the start of the vector table (the list of interrupt handler pointers). You can find it in the vector table in the reference manual for your MCU, page 375.
Next,
Jump_To_Application = (pFunction) JumpAddress;
Okay, so we treat reset handler address as a void function(void).
Eventually, you get to the stack
__set_MSP(*(uint32_t*)FLASH_APP_ADDR);
This function, as we can see from its source code, simply sets the main stack pointer to its argument. The argument takes the vector table address, treats it as a pointer to a word, and dereferences it, so it is the first word of that vector table. By definition of the vector table, its first word is the main stack pointer value that the core loads automatically after power-on. You reset the stack to its cold-boot value, the first word of the application's vector table in flash. Your bootloader has used some stack up to this point, but that stack is no longer needed: the bootloader function never returns and never frees it, so you simply reset the stack to its initial value for your program. The program will reuse all of the stack the bootloader used.
So right now you've reset the stack pointer and you assigned reset handler to the function you call. And then you, well, call it.
Your vector table and the program that the bootloader starts are two different entities in memory. If you don't need to remap the interrupt handlers at runtime, don't move the vector table. It will stay at the beginning of the flash and will lead to the default interrupt handlers. Just make sure the address you execute from contains executable code and you run it from the start (well, if you don't, you will hardfault).
I use GNU for ARM and want to define some cell in RAM memory space as following:
#define FOO_LOCATION 0x20000000
#define foo (*((volatile uint32_t *) FOO_LOCATION ))
My question is: will such a declaration prevent the cell at address FOO_LOCATION from being used by the stack or the heap? Which address is preferred to avoid memory fragmentation?
Update
I want to place a variable at a certain memory address and access it after a watchdog reset. I guess that if I declare it as usual
uint32_t foo;
it will have a different physical location after reset. I also read a post that said there is most probably no way to declare a variable's address. So my idea is to tell the GNU toolchain not to use some memory address, in the same way that special registers are not used for ordinary variables.
Update 2
In addition to the previous definitions, I added a section to the linker script
SECTIONS
{
. = 0x20000000;
.fooSection :
{
*(.fooSection)
. = 0x04; /* size = 4 bytes */
}
/* other placements follow here... */
}
Question:
Why are 8 bytes reserved at the "bottom" of kernel stack when it is created?
Background:
We know that struct pt_regs and thread_info share the same 2 consecutive pages (8192 bytes), with pt_regs located at the higher end and thread_info at the lower end.
However, I noticed that 8 bytes are reserved at the highest address of these 2 pages:
in arch/arm/include/asm/thread_info.h
#define THREAD_START_SP (THREAD_SIZE - 8)
This way you can reach the thread_info structure just by reading the stack pointer and masking off its low bits (the offset within the THREAD_SIZE block). Otherwise the initial SP would point into the next THREAD_SIZE block.
static inline struct thread_info *current_thread_info(void)
{
register unsigned long sp asm ("sp");
return (struct thread_info *)(sp & ~(THREAD_SIZE - 1));
}
Eight bytes come from the ARM calling convention that SP needs to be 8-byte aligned.
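The masking argument can be checked with plain integers. Below is a minimal sketch in C, assuming 32-bit addresses and an 8 KiB THREAD_SIZE as in the question; the two function names and the example base address are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define THREAD_SIZE     8192u                /* two 4 KiB pages, as in the text */
#define THREAD_START_SP (THREAD_SIZE - 8u)

/* Same masking as current_thread_info(), but on a plain integer so it can
 * be tried on any host. */
uint32_t thread_info_base(uint32_t sp)
{
    return sp & ~(THREAD_SIZE - 1u);
}

/* Initial kernel SP for a stack block starting at block_base. */
uint32_t initial_kernel_sp(uint32_t block_base)
{
    assert((block_base & (THREAD_SIZE - 1u)) == 0);  /* block must be aligned */
    return block_base + THREAD_START_SP;
}
```

With a block at 0xC1234000, the initial SP is 0xC1235FF8 and masks back to 0xC1234000. An SP of exactly block_base + THREAD_SIZE would mask to 0xC1236000, which is the next block.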
Update:
AAPCS 5.2.1.1 states:
A process may only access (for reading or writing) the closed interval of the entire stack delimited by [SP, stack-base – 1] (where SP is the value of register r13).
Since the stack is full-descending,
THREAD_START_SP (THREAD_SIZE - 8)
would probably enforce this requirement, because an access beyond it lands in the next page and is illegal (segmentation fault).
Why are 8 bytes reserved at the "bottom" of kernel stack when it is created?
If we reserve anything on the stack, it must be a multiple of eight.
If we peek above the stack, we like to make sure it is mapped.
Multiple of eight
The stack and the user registers need to be aligned to 8 bytes. This makes things more efficient, since many ARMs have a 64-bit bus and operations on the kernel stack (such as ldrd and strd) may have these requirements. You can see the protection in the usr_entry macro. Specifically,
#if defined(CONFIG_AEABI) && (__LINUX_ARM_ARCH__ >= 5) && (S_FRAME_SIZE & 7)
#error "sizeof(struct pt_regs) must be a multiple of 8"
#endif
ARMv5 (architecture version 5) adds the ldrd and strd instructions. It is also a requirement of the EABI version of the kernel (versus OABI). So if we reserve anything on the stack, it must be a multiple of 8.
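The same guard can be written in C. This is only a sketch: struct saved_frame is a hypothetical stand-in for the 32-bit struct pt_regs (18 saved words), and keeps_sp_aligned is a helper I made up to show why the size matters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the 32-bit ARM struct pt_regs: 18 saved words. */
struct saved_frame {
    uint32_t uregs[18];
};

/* Compile-time equivalent of the kernel's #error check. */
static_assert(sizeof(struct saved_frame) % 8 == 0,
              "saved frame size must be a multiple of 8");

/* Would SP still be 8-byte aligned after pushing frame_size bytes? */
bool keeps_sp_aligned(uint32_t sp, size_t frame_size)
{
    assert((sp & 7u) == 0);   /* caller starts from an aligned SP */
    return ((sp - (uint32_t)frame_size) & 7u) == 0;
}
```

Pushing the 72-byte frame from an aligned SP keeps it aligned; a 68-byte frame would not, which is the situation the #error rejects.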
Peeking on stack
For the very top frame, we may want to take a peek at previous data. In order not to constantly check that the stack is in the 8K range an extra entry is reserved. Specifically, I think that signals need to peek at the stack.
I've been banging my head with this for the last 3-4 days and I can't find a DECENT explanatory documentation (from ARM or unofficial) to help me.
I've got an ODROID-XU board (big.LITTLE 2 x Cortex-A15 + 2 x Cortex-A7) and I'm trying to understand a bit more about the ARM architecture. In my "experimenting" code I've now arrived at the stage where I want to WAKE UP THE OTHER CORES FROM THEIR WFI (wait-for-interrupt) state.
The missing information I'm still trying to find is:
1. When getting the base address of the memory-mapped GIC I understand that I need to read CBAR; But no piece of documentation explains how the bits in CBAR (the 2 PERIPHBASE values) should be arranged to get to the final GIC base address
2. When sending an SGI through the GICD_SGIR register, what interrupt ID between 0 and 15 should I choose? Does it matter?
3. When sending an SGI through the GICD_SGIR register, how can I tell the other cores WHERE TO START EXECUTION FROM?
4. How does the fact that my code is loaded by the U-BOOT bootloader affect this context?
The Cortex-A Series Programmer's Guide v3.0 (found here: link) states the following in section 22.5.2 (SMP boot in Linux, page 271):
While the primary core is booting, the secondary cores will be held in a standby state, using the
WFI instruction. It (the primary core) will provide a startup address to the secondary cores and wake them using an
Inter-Processor Interrupt(IPI), meaning an SGI signalled through the GIC
How does Linux do that? The documents don't give any other details regarding "It will provide a startup address to the secondary cores".
My frustration is growing and I'd be very grateful for answers.
Thank you very much in advance!
EXTRA DETAILS
Documentation I use:
ARMv7-A&R Architecture Reference Manual
Cortex-A15 TRM (Technical Reference Manual)
Cortex-A15 MPCore TRM
Cortex-A Series Programmer's Guide v3.0
GICv2 Architecture Specification
What I've done by now:
UBOOT loads me at 0x40008000; I've set-up Translation Tables (TTBs), written TTBR0 and TTBCR accordingly and mapped 0x40008000 to 0x8000_0000 (2GB), so I also enabled the MMU
Set-up exception handlers of my own
I've got Printf functionality over the serial (UART2 on ODROID-XU)
All the above seems to work properly.
What I'm trying to do now:
Get the GIC base address => at the moment I read CBAR and I simply AND (&) its value with 0xFFFF8000 and use this as the GIC base address, although I'm almost sure this ain't right
Enable the GIC distributor (at offset 0x1000 from the GIC base address?) by writing the value 0x1 to GICD_CTLR
Construct an SGI with the following params: Group = 0, ID = 0, TargetListFilter = "All CPUs Except Me" and send it (write it) through the GICD_SGIR GIC register
Since I haven't passed any execution start address for the other cores, nothing happens after all this
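For reference, the register arithmetic in the steps above can be sketched in C as follows. The GICD_SGIR field positions are from the GICv2 Architecture Specification; the CBAR layout is how I read the Cortex-A15 TRM (CBAR[31:15] holds PERIPHBASE[31:15] and CBAR[7:0] holds PERIPHBASE[39:32]), so treat that part as an assumption to verify. The function and enum names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

#define GICD_OFFSET      0x1000u   /* distributor, relative to PERIPHBASE */
#define GICD_SGIR_OFFSET 0xF00u    /* GICD_SGIR inside the distributor */

enum sgi_filter {
    SGI_TARGET_LIST     = 0,  /* use the CPU target list */
    SGI_ALL_EXCEPT_SELF = 1,  /* every CPU interface except the sender */
    SGI_SELF_ONLY       = 2,
};

/* Rebuild PERIPHBASE from a Cortex-A15 CBAR value:
 * CBAR[31:15] -> PERIPHBASE[31:15], CBAR[7:0] -> PERIPHBASE[39:32]. */
uint64_t cbar_to_periphbase(uint32_t cbar)
{
    return ((uint64_t)(cbar & 0xFFu) << 32) | (cbar & 0xFFFF8000u);
}

/* Compose a GICD_SGIR value (GICv2 layout, NSATT left at 0). */
uint32_t gicd_sgir_value(enum sgi_filter filter, uint8_t cpu_target_list,
                         uint8_t sgi_id)
{
    assert(sgi_id <= 15);
    return ((uint32_t)filter << 24) |
           ((uint32_t)cpu_target_list << 16) |
           sgi_id;
}
```

Under that reading, masking CBAR with 0xFFFF8000 gives the right base whenever the upper eight address bits are zero, and the "All CPUs Except Me" SGI with ID 0 is the value 0x01000000 written to PERIPHBASE + GICD_OFFSET + GICD_SGIR_OFFSET.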
....UPDATE....
I've started looking at the Linux kernel and QEMU source codes in search for an answer. Here's what I found out (please correct me if I'm wrong):
When powering up the board ALL THE CORES start executing from the reset vector
A software (firmware) component executes WFI on the secondary cores and some other code that will act as a protocol between these secondary cores and the primary core, when the latter wants to wake them up again
For example, the protocol used on the EnergyCore ECX-1000 (Highbank) board is as follows:
**(1)** the secondary cores enter WFI and when
**(2)** the primary core sends an SGI to wake them up
**(3)** they check if the value at address (0x40 + 0x10 * coreid) is non-null;
**(4)** if it is non-null, they use it as an address to jump to (execute a BX)
**(5)** otherwise, they re-enter standby state, by re-executing WFI
**(6)** So, if I had an EnergyCore ECX-1000 board, I should write (0x40 + 0x10 * coreid) with the address I want each of the cores to jump to and send an SGI
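Steps (3) to (5) amount to a small polling routine on each secondary core. Here is a minimal sketch in C that runs against a simulated block of low memory instead of real hardware; the function names and the byte-array model are assumptions, and the real firmware does this in assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PEN_BASE   0x40u
#define PEN_STRIDE 0x10u

/* Address of the jump slot for a given core, per steps (3) and (6). */
uint32_t pen_slot_address(unsigned core_id)
{
    return PEN_BASE + PEN_STRIDE * core_id;
}

/* One pass of the secondary core's loop after it wakes from WFI.
 * mem is a simulated view of low memory indexed in bytes; returns the
 * address to jump to, or 0 if the core should go back to WFI. */
uint32_t pen_poll(const uint8_t *mem, size_t mem_size, unsigned core_id)
{
    uint32_t slot = pen_slot_address(core_id);
    assert(mem != NULL && slot + 4u <= mem_size);
    /* little-endian 32-bit load from the slot */
    return  (uint32_t)mem[slot] |
           ((uint32_t)mem[slot + 1] << 8) |
           ((uint32_t)mem[slot + 2] << 16) |
           ((uint32_t)mem[slot + 3] << 24);
}
```

The primary core's side is then just a write to pen_slot_address(coreid) followed by the SGI.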
Questions:
1. What is the software component that does this? Is it the BL1 binary I've written on the SD Card, or is it U-BOOT?
2. From what I understand, this software protocol differs from board to board. Is it so, or does it only depend on the underlying processor?
3. Where can I find information about this protocol for a pick-one ARM board? - can I find it on the official ARM website or on the board webpage?
Ok, I'm back baby. Here are the conclusions:
The software component that puts the CPUs to sleep is the bootloader (in my case U-Boot)
Linux somehow knows how the bootloader does this (hardcoded in the Linux kernel for each board) and knows how to wake them up again
For my ODROID-XU board the sources describing this process are UBOOT ODROID-v2012.07 and the linux kernel found here: LINUX ODROIDXU-3.4.y (it would have been better if I looked into kernel version from the branch odroid-3.12.y since the former doesn't start all of the 8 processors, just 4 of them but the latter does).
Anyway, here's the source code I've come up with. After it, I'll list the relevant source files from the above source trees that helped me write it:
typedef unsigned int DWORD;
typedef unsigned char BOOLEAN;
#define FAILURE (0)
#define SUCCESS (1)
#define NR_EXTRA_CPUS (3) // actually 7, but this kernel version can't wake them up all -> check kernel version 3.12 if you need this
// Hardcoded in the kernel and in U-Boot; here I've put the physical addresses for ease
// In my code (and in the linux kernel) these addresses are actually virtual
// (thus the 'VA' part in S5P_VA_...); note: mapped with memory type DEVICE
#define S5P_VA_CHIPID (0x10000000)
#define S5P_VA_SYSRAM_NS (0x02073000)
#define S5P_VA_PMU (0x10040000)
#define EXYNOS_SWRESET ((DWORD) S5P_VA_PMU + 0x0400)
// Other hardcoded values
#define EXYNOS5410_REV_1_0 (0x10)
#define EXYNOS_CORE_LOCAL_PWR_EN (0x3)
BOOLEAN BootAllSecondaryCPUs(void* CPUExecutionAddress){
// 1. Get bootBase (the address where we need to write the address where the woken CPUs will jump to)
// and powerBase (we also need to power up the cpus before waking them up (?))
DWORD bootBase, powerBase, powerOffset, clusterID;
asm volatile ("mrc p15, 0, %0, c0, c0, 5" : "=r" (clusterID));
clusterID = (clusterID >> 8);
powerOffset = 0;
if( (*(DWORD*)S5P_VA_CHIPID & 0xFF) < EXYNOS5410_REV_1_0 )
{
if( (clusterID & 0x1) == 0 ) powerOffset = 4;
}
else if( (clusterID & 0x1) != 0 ) powerOffset = 4;
bootBase = S5P_VA_SYSRAM_NS + 0x1C;
powerBase = (S5P_VA_PMU + 0x2000) + (powerOffset * 0x80);
// 2. Power up each CPU, write bootBase and send a SEV (they are in WFE [wait-for-event] standby state)
for (DWORD i = 1; i <= NR_EXTRA_CPUS; i++)
{
// 2.1 Power up this CPU
powerBase += 0x80;
DWORD powerStatus = *(DWORD*)( (DWORD) powerBase + 0x4);
if ((powerStatus & EXYNOS_CORE_LOCAL_PWR_EN) == 0)
{
*(DWORD*) powerBase = EXYNOS_CORE_LOCAL_PWR_EN;
for (DWORD j = 0; j < 10; j++) // 10 millis timeout; separate counter so the outer loop's i is not clobbered
{
powerStatus = *(DWORD*)((DWORD) powerBase + 0x4);
if ((powerStatus & EXYNOS_CORE_LOCAL_PWR_EN) == EXYNOS_CORE_LOCAL_PWR_EN)
break;
DelayMilliseconds(1); // not implemented here, if you need this, post a comment request
}
if ((powerStatus & EXYNOS_CORE_LOCAL_PWR_EN) != EXYNOS_CORE_LOCAL_PWR_EN)
return FAILURE;
}
if ( (clusterID & 0x0F) != 0 )
{
if ( *(DWORD*)(S5P_VA_PMU + 0x0908) == 0 )
do { DelayMicroseconds(10); } // not implemented here, if you need this, post a comment request
while (*(DWORD*)(S5P_VA_PMU + 0x0908) == 0);
*(DWORD*) EXYNOS_SWRESET = (DWORD)(((1 << 20) | (1 << 8)) << i);
}
// 2.2 Write bootBase and execute a SEV to finally wake up the CPUs
asm volatile ("dmb" : : : "memory");
*(DWORD*) bootBase = (DWORD) CPUExecutionAddress;
asm volatile ("isb");
asm volatile ("\n dsb\n sev\n nop\n");
}
return SUCCESS;
}
This successfully wakes 3 of 7 of the secondary CPUs.
And now for that short list of relevant source files in u-boot and the linux kernel:
UBOOT: lowlevel_init.S - notice lines 363-369, how the secondary CPUs wait in a WFE for the value at _hotplug_addr to be non-zeroed and to jump to it; _hotplug_addr is actually bootBase in the above code; also lines 282-285 tell us that _hotplug_addr is to be relocated at CONFIG_PHY_IRAM_NS_BASE + _hotplug_addr - nscode_base (_hotplug_addr - nscode_base is 0x1C and CONFIG_PHY_IRAM_NS_BASE is 0x02073000, thus the above hardcodings in the linux kernel)
LINUX KERNEL: generic - smp.c (look at function __cpu_up), platform specific (odroid-xu): platsmp.c (function boot_secondary, called by generic __cpu_up; also look at platform_smp_prepare_cpus [at the bottom] => that's the function that actually sets the boot base and power base values)
For clarity and future reference, there's a subtle piece of information missing here thanks to the lack of proper documentation of the Exynos boot protocol (n.b. this question should really be marked "Exynos 5" rather than "Cortex-A15" - it's a SoC-specific thing and what ARM says is only a general recommendation). From cold boot, the secondary cores aren't in WFI, they're still powered off.
The simpler minimal solution (based on what Linux's hotplug does), which I worked out in the process of writing a boot shim to get a hypervisor running on the XU, takes two steps:
First write the entry point address to the Exynos holding pen (0x02073000 + 0x1c)
Then poke the power controller to switch on the relevant core(s): That way, they drop out of the secure boot path into the holding pen to find the entry point waiting for them, skipping the WFI loop and obviating the need to even touch the GIC at all.
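Here is a minimal sketch in C of that two-step order. The registers are passed in as pointers so the code can be run against plain variables; on the board they would be the holding-pen word and one core's power configuration/status pair. The function name and the poll limit are my assumptions, and the barrier is only indicated by a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CORE_LOCAL_PWR_EN 0x3u   /* same value as EXYNOS_CORE_LOCAL_PWR_EN above */
#define POWER_POLL_LIMIT  10     /* hypothetical bound on status polls */

/* Registers are passed in as pointers so the sketch can be exercised
 * against plain variables; on hardware they would be the holding-pen word
 * (0x02073000 + 0x1C) and one core's power configuration/status pair. */
bool start_core(volatile uint32_t *holding_pen,
                volatile uint32_t *power_config,
                volatile const uint32_t *power_status,
                uint32_t entry_point)
{
    assert(holding_pen != NULL && power_config != NULL && power_status != NULL);

    /* Step 1: publish the entry point before the core can start running. */
    *holding_pen = entry_point;
    /* On hardware a DSB belongs here so the write lands before power-up. */

    /* Step 2: switch the core on and wait (bounded) for the status bits. */
    *power_config = CORE_LOCAL_PWR_EN;
    for (int i = 0; i < POWER_POLL_LIMIT; i++) {
        if ((*power_status & CORE_LOCAL_PWR_EN) == CORE_LOCAL_PWR_EN)
            return true;
    }
    return false;
}
```

The point of the ordering is that the core finds a valid entry point the first time it looks at the holding pen, so no SEV or SGI is needed afterwards.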
Unless you're planning a full-on CPU hotplug implementation you can skip checking the cluster ID - if we're booting, we're on cluster 0 and nowhere else (the check for pre-production chips with backwards cluster registers should be unnecessary on the Odroid too - certainly was for me).
From my investigation, firing up the A7s is a little more involved. Judging from the Exynos big.LITTLE switcher driver, it seems you need to poke a separate set of power controller registers to enable cluster 1 first (and you may need to mess around with the CCI too, especially to have the MMUs and caches on) - I didn't get further since by that point it was more "having fun" than "doing real work"...
As an aside, Samsung's mainline patch for CPU hotplug on the 5410 makes the core power control stuff rather clearer than the mess in their downstream code, IMO.
QEMU uses PSCI
The ARM Power State Coordination Interface (PSCI) is documented at: https://developer.arm.com/docs/den0022/latest/arm-power-state-coordination-interface-platform-design-document and controls things such as powering on and off of cores.
TL;DR this is the aarch64 snippet to wake up CPU 1 on QEMU v3.0.0 ARMv8 aarch64:
/* PSCI function identifier: CPU_ON. */
ldr w0, =0xc4000003
/* Argument 1: target_cpu */
mov x1, 1
/* Argument 2: entry_point_address */
ldr x2, =cpu1_entry_address
/* Argument 3: context_id */
mov x3, 0
/* Unused hvc args: the Linux kernel zeroes them,
* but I don't think it is required.
*/
hvc 0
and for ARMv7:
ldr r0, =0x84000003
mov r1, #1
ldr r2, =cpu1_entry_address
mov r3, #0
hvc 0
A full runnable example with a spinlock is available on the ARM section of this answer: What does multicore assembly language look like?
The hvc instruction then gets handled by an EL2 handler, see also: the ARM section of: What are Ring 0 and Ring 3 in the context of operating systems?
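The two magic numbers above are not arbitrary. As far as I understand the SMC Calling Convention, bit 31 marks a fast call, bit 30 selects the 64-bit convention, bits 29:24 name the owning service (4 for standard secure services, which PSCI belongs to) and the low bits carry the function number. A small sketch in C, with macro and function names of my own choosing:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SMCCC_FAST_CALL      (1u << 31)
#define SMCCC_64BIT          (1u << 30)
#define SMCCC_OWNER_STANDARD (4u << 24)  /* standard secure service calls */

#define PSCI_FN_CPU_OFF 2u
#define PSCI_FN_CPU_ON  3u

/* Build a PSCI function identifier from its function number. */
uint32_t psci_function_id(bool use_64bit, uint16_t function_number)
{
    assert(function_number <= 0x1Fu);  /* PSCI numbers used here are small */
    return SMCCC_FAST_CALL | (use_64bit ? SMCCC_64BIT : 0u) |
           SMCCC_OWNER_STANDARD | function_number;
}
```

psci_function_id(true, PSCI_FN_CPU_ON) gives the 0xc4000003 used in the aarch64 snippet, and psci_function_id(false, PSCI_FN_CPU_ON) gives the 0x84000003 used for ARMv7.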
Linux kernel
In Linux v4.19, the function identifiers are passed to the kernel through the device tree. QEMU, for example, auto-generates an entry of the form:
psci {
method = "hvc";
compatible = "arm,psci-0.2", "arm,psci";
cpu_on = <0xc4000003>;
migrate = <0xc4000005>;
cpu_suspend = <0xc4000001>;
cpu_off = <0x84000002>;
};
The hvc instruction is called from: https://github.com/torvalds/linux/blob/v4.19/drivers/firmware/psci.c#L178
static int psci_cpu_on(unsigned long cpuid, unsigned long entry_point)
which ends up going to: https://github.com/torvalds/linux/blob/v4.19/arch/arm64/kernel/smccc-call.S#L51
Go to www.arm.com and download the evaluation copy of the DS-5 development suite. Once it is installed, there will be a startup_Cortex-A15MPCore directory under the examples. Look at startup.s.
I am learning device driver and kernel programming. According to Jonathan Corbet's book, we do not have a main() function in device drivers.
#include <linux/init.h>
#include <linux/module.h>
static int my_init(void)
{
return 0;
}
static void my_exit(void)
{
return;
}
module_init(my_init);
module_exit(my_exit);
Here I have two questions :
Why do we not need a main() function in device drivers?
Does the kernel have a main() function?
start_kernel
On 4.2, start_kernel from init/main.c is a considerable initialization process and could be compared to a main function.
It is the first arch independent code to run, and sets up a large part of the kernel. So much like main, start_kernel is preceded by some lower level setup code (done in the crt* objects in userland main), after which the "main" generic C code runs.
How start_kernel gets called in x86_64
arch/x86/kernel/vmlinux.lds.S, a linker script, sets:
ENTRY(phys_startup_64)
and
phys_startup_64 = startup_64 - LOAD_OFFSET;
and:
#define LOAD_OFFSET __START_KERNEL_map
arch/x86/include/asm/page_64_types.h defines __START_KERNEL_map as:
#define __START_KERNEL_map _AC(0xffffffff80000000, UL)
which is the kernel entry address. TODO how is that address reached exactly? I have to understand the interface Linux exposes to bootloaders.
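The relation between the link-time (virtual) address and the load (physical) address is plain subtraction. A tiny sketch in C, where the function name is made up and any address passed in is just an example:

```c
#include <assert.h>
#include <stdint.h>

#define START_KERNEL_MAP 0xffffffff80000000ull  /* __START_KERNEL_map */
#define LOAD_OFFSET      START_KERNEL_MAP

/* Physical (load) address for a symbol linked at virtual address virt,
 * mirroring "phys_startup_64 = startup_64 - LOAD_OFFSET". */
uint64_t link_to_load_address(uint64_t virt)
{
    assert(virt >= LOAD_OFFSET);  /* only kernel-image addresses make sense */
    return virt - LOAD_OFFSET;
}
```

For instance, a symbol linked at the hypothetical address 0xffffffff81000000 would get the load address 0x1000000.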
arch/x86/kernel/vmlinux.lds.S sets the very first bootloader section as:
.text : AT(ADDR(.text) - LOAD_OFFSET) {
_text = .;
/* bootstrapping code */
HEAD_TEXT
include/asm-generic/vmlinux.lds.h defines HEAD_TEXT:
#define HEAD_TEXT *(.head.text)
arch/x86/kernel/head_64.S defines startup_64. That is the very first x86 kernel code that runs. It does a lot of low level setup, including segmentation and paging.
That is then the first thing that runs because the file starts with:
.text
__HEAD
.code64
.globl startup_64
and include/linux/init.h defines __HEAD as:
#define __HEAD .section ".head.text","ax"
so the same as the very first thing of the linker script.
At the end it calls x86_64_start_kernel a bit awkwardly with an lretq:
movq initial_code(%rip),%rax
pushq $0 # fake return address to stop unwinder
pushq $__KERNEL_CS # set correct cs
pushq %rax # target address in negative space
lretq
and:
.balign 8
GLOBAL(initial_code)
.quad x86_64_start_kernel
arch/x86/kernel/head64.c defines x86_64_start_kernel which calls x86_64_start_reservations which calls start_kernel.
arm64 entry point
The very first arm64 instruction that runs on a v5.7 uncompressed kernel is defined at https://github.com/cirosantilli/linux/blob/v5.7/arch/arm64/kernel/head.S#L72 so it is either the add x13, x18, #0x16 or the b stext, depending on CONFIG_EFI:
__HEAD
_head:
/*
* DO NOT MODIFY. Image header expected by Linux boot-loaders.
*/
#ifdef CONFIG_EFI
/*
* This add instruction has no meaningful effect except that
* its opcode forms the magic "MZ" signature required by UEFI.
*/
add x13, x18, #0x16
b stext
#else
b stext // branch to kernel start, magic
.long 0 // reserved
#endif
le64sym _kernel_offset_le // Image load offset from start of RAM, little-endian
le64sym _kernel_size_le // Effective size of kernel image, little-endian
le64sym _kernel_flags_le // Informative flags, little-endian
.quad 0 // reserved
.quad 0 // reserved
.quad 0 // reserved
.ascii ARM64_IMAGE_MAGIC // Magic number
#ifdef CONFIG_EFI
.long pe_header - _head // Offset to the PE header.
This is also the very first byte of an uncompressed kernel image.
Both of those cases jump to stext which starts the "real" action.
As mentioned in the comment, these two instructions are the first 8 bytes of a documented 64-byte header described at: https://github.com/cirosantilli/linux/blob/v5.7/Documentation/arm64/booting.rst#4-call-the-kernel-image
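Laid out as a C struct, the header looks like the sketch below. The field order follows Documentation/arm64/booting.rst; the struct name and the checking function are my own, and the check compares bytes so it does not depend on the host's endianness.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Field layout as listed in Documentation/arm64/booting.rst. */
struct arm64_image_header {
    uint32_t code0;        /* executable code (the add or b stext above) */
    uint32_t code1;        /* executable code */
    uint64_t text_offset;  /* image load offset, little-endian */
    uint64_t image_size;   /* effective image size, little-endian */
    uint64_t flags;        /* kernel flags, little-endian */
    uint64_t res2;
    uint64_t res3;
    uint64_t res4;
    uint32_t magic;        /* "ARM\x64" */
    uint32_t res5;         /* offset to the PE header when CONFIG_EFI is set */
};

static_assert(sizeof(struct arm64_image_header) == 64, "header is 64 bytes");
static_assert(offsetof(struct arm64_image_header, magic) == 56, "magic at 56");

/* Check the magic bytes of a raw image without relying on host endianness. */
bool looks_like_arm64_image(const uint8_t *image, size_t size)
{
    static const uint8_t magic[4] = { 'A', 'R', 'M', 0x64 };
    assert(image != NULL);
    return size >= sizeof(struct arm64_image_header) &&
           memcmp(image + 56, magic, sizeof magic) == 0;
}
```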
arm64 first MMU enabled instruction: __primary_switched
I think it is __primary_switched in head.S:
/*
* The following fragment of code is executed with the MMU enabled.
*
* x0 = __PHYS_OFFSET
*/
__primary_switched:
At this point, the kernel appears to create page tables + maybe relocate itself such that the PC addresses match the symbols of the vmlinux ELF file. Therefore at this point you should be able to see meaningful function names in GDB without extra magic.
arm64 secondary CPU entry point
secondary_holding_pen defined at: https://github.com/cirosantilli/linux/blob/v5.7/arch/arm64/kernel/head.S#L691
Entry procedure further described at: https://github.com/cirosantilli/linux/blob/v5.7/arch/arm64/kernel/head.S#L691
Fundamentally, there is nothing special about a routine being named main(). As alluded to above, main() serves as the entry point for an executable load module. However, you can define different entry points for a load module. In fact, you can define more than one entry point, for example, refer to your favorite dll.
From the operating system's (OS) point of view, all it really needs is the address of the entry point of the code that will function as a device driver. The OS will pass control to that entry point when the device driver is required to perform I/O to the device.
A system programmer defines (each OS has its own method) the connection between a device, a load module that functions as the device's driver, and the name of the entry point in the load module.
Each OS has its own kernel (obviously) and some might/maybe start with main() but I would be surprised to find a kernel that used main() other than in a simple one, such as UNIX! By the time you are writing kernel code you have long moved past the requirement to name every module you write as main().
Hope this helps?
Found this code snippet from the kernel for Unix Version 6. As you can see main() is just another program, trying to get started!
main()
{
extern schar;
register i, *p;
/*
* zero and free all of core
*/
updlock = 0;
i = *ka6 + USIZE;
UISD->r[0] = 077406;
for(;;) {
if(fuibyte(0) < 0) break;
clearsig(i);
maxmem++;
mfree(coremap, 1, i);
i++;
}
if(cputype == 70)
for(i=0; i<62; i=+2) {
UBMAP->r[i] = i<<12;
UBMAP->r[i+1] = 0;
}
// etc. etc. etc.
Several ways to look at it:
Device drivers are not programs. They are modules that are loaded into another program (the kernel). As such, they do not have a main() function.
The fact that all programs must have a main() function is only true for userspace applications. It does not apply to the kernel, nor to device drivers.
By main() you probably mean what main() is to a program, namely its "entry point".
For a module, that is init_module().
From Linux Device Drivers, 2nd Edition:
Whereas an application performs a single task from beginning to end, a module registers itself in order to serve future requests, and its "main" function terminates immediately. In other words, the task of the function init_module (the module's entry point) is to prepare for later invocation of the module's functions; it's as though the module were saying, "Here I am, and this is what I can do." The second entry point of a module, cleanup_module, gets invoked just before the module is unloaded. It should tell the kernel, "I'm not there anymore; don't ask me to do anything else."
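The "register and return" pattern from the quote can be mimicked in ordinary userspace C. This is only a sketch: struct module_ops, load_module and the counter are hypothetical stand-ins for what the kernel does, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel side of module loading. */
struct module_ops {
    int  (*init)(void);   /* plays the role of init_module() */
    void (*exit)(void);   /* plays the role of cleanup_module() */
};

static int registered_handlers;  /* what the "kernel" knows about */

static int demo_init(void)
{
    registered_handlers++;   /* "Here I am, and this is what I can do." */
    return 0;                /* return at once; work happens on later requests */
}

static void demo_exit(void)
{
    registered_handlers--;   /* "I'm not there anymore." */
}

const struct module_ops demo_module = { demo_init, demo_exit };

/* Load: call the entry point once; a non-zero result means the load failed. */
int load_module(const struct module_ops *m)
{
    assert(m != NULL && m->init != NULL);
    return m->init();
}

/* Unload: call the second entry point just before removal. */
void unload_module(const struct module_ops *m)
{
    assert(m != NULL && m->exit != NULL);
    m->exit();
}

int handlers_registered(void) { return registered_handlers; }
```

Nothing here is named main(): the loader only needs the addresses of the two entry points, which is what module_init() and module_exit() provide in a real driver.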
Yes, the Linux kernel has a main function; it is located in the arch/x86/boot/main.c file. But kernel execution starts in the arch/x86/boot/header.S assembly file, and the main() function is called from there by the "calll main" instruction.
Here is that main function:
void main(void)
{
/* First, copy the boot header into the "zeropage" */
copy_boot_params();
/* Initialize the early-boot console */
console_init();
if (cmdline_find_option_bool("debug"))
puts("early console in setup code.\n");
/* End of heap check */
init_heap();
/* Make sure we have all the proper CPU support */
if (validate_cpu()) {
puts("Unable to boot - please use a kernel appropriate "
"for your CPU.\n");
die();
}
/* Tell the BIOS what CPU mode we intend to run in. */
set_bios_mode();
/* Detect memory layout */
detect_memory();
/* Set keyboard repeat rate (why?) and query the lock flags */
keyboard_init();
/* Query Intel SpeedStep (IST) information */
query_ist();
/* Query APM information */
#if defined(CONFIG_APM) || defined(CONFIG_APM_MODULE)
query_apm_bios();
#endif
/* Query EDD information */
#if defined(CONFIG_EDD) || defined(CONFIG_EDD_MODULE)
query_edd();
#endif
/* Set the video mode */
set_video();
/* Do the last things and invoke protected mode */
go_to_protected_mode();
}
While the function name main() is just a common convention (there is no real reason to use it in kernel mode), the Linux kernel does have a main() function for many architectures, and of course user-mode Linux has a main function.
Note that the OS runtime calls the main() function to start an app. When an operating system boots there is no runtime: the kernel is simply loaded to an address by the boot loader, which is loaded by the MBR, which is loaded by the hardware. So while a kernel may contain a function called main, it need not be the entry point.
See Also:
http://msdn.microsoft.com/en-us/library/windows/desktop/ms633559%28v=vs.85%29.aspx
Linux kernel source:
x86: linux-3.10-rc6/arch/x86/boot/main.c
arm64: linux-3.10-rc6/arch/arm64/kernel/asm-offsets.c