How can C allocate memory without an OS? (Cases where OS is written in C) - c

I am following a course where we are writing an OS on a microcontroller.
The OS is written in C and the instructor initializes stack spaces for each thread in the following way.
int32_t TCB_STACK[NUM_OF_THREADS][STACK_SIZE];
How can the memory be allocated if there isn't any OS already running to service this in the first place? Am I missing something?

You don't have to "allocate" anything as such on a bare metal system, you have full access to all of the physical memory and there's no one else to share it with.
On such a system those arrays end up in statically allocated RAM memory known as .bss. Where .bss is located and how large it is, is determined by the linker script. The linker script decides which parts of the memory to reserve for what. Stack, .bss, .data and so on. Similarly the flash memory for the actual program may be divided in several parts.

Related

How the function calls works in terms of Stack and Code memory?

I have been trying to understand how memory works, what happens step by step inside execution of the application in terms of memory, specially in embedded system. More of context in C/C++
Out of Stack, heap, static and Code memory of a application, which is Stored in RAM or volatile memory and which part is stored in non-volatile memory? Or when a application is executed, the whole application is copied to RAM or volatile memory?
When a function is called, does all the assembly instruction of that function gets copied to the stack or only memory is allocated to function?
If only memory is allocated to the function in real time, that means the address of those variable has to be added to the assembly code of the function, how does that happen?
Who does all this in an embedded system stack memory allocation etc in a embedded system when we write code for it in C? There is no OS in the MCU to do memory management for us so who manages this memory allocation during function calls in MCU
Out of Stack, heap, static and Code memory of a application, which is
Stored in RAM or volatile memory and which part is stored in
non-volatile memory? Or when a application is executed, the whole
application is copied to RAM or volatile memory?
When a function is called, does all the assembly instruction of that
function gets copied to the stack or only memory is allocated to
function?
If only memory is allocated to the function in real time, that means
the address of those variable has to be added to the assembly code of
the function, how does that happen?
For a comprehensive and correct understanding of code execution in bare-metal (without OS, and OS environments (embedded systems), different memory structures and all their internals, I would recommend you to cover this book - Extreme C (Auth: Kamran Amini), especially these sections:
Chapter 2: From Source to Binary
Chapter 4: Process Memory Structure
Chapter 5: Stack and Heap
In my experience, heresay and random comments will hurt your understanding instead and you will be barely able to make sense out of it. Consult authentic published content.
Who does all this in an embedded system stack memory allocation etc in a embedded system when we write code for it in C? There is no OS in the MCU to do memory management for us so who manages this memory allocation during function calls in MCU
The programmer's interface in C is the main.c file:
int main (){
// User code here
return 0;
}
But technically, your IDE (Keil uVision or Code Composer Studio etc.) along with it's RTE (Run-time environment) component, places a boilperlate code of initialization and de-initialization files (commonly known as MCU startup code) before and after this main.c file, during building (compilation) phase of your source code for that particular target hardware such as Tiva C TM4C123GH6PM or STM32F400 etc.
The code inside these startup/initialization files initializes your stack and heap memory, clock-gating controls of your mcu, setups up different peripherals (gpio, serial, i2c etc.), interrupt vector table, PC (program counter), SP (stack pointer) etc. and a lot more other things.
You can open a debug session of a simple program for your target MCU, while configuring it to 'stop at first assembly instruction/startup' and then performing stepping operations to go through all the code until you reach main.c file.
The findings are fantastic.
in embedded system
I am answering assuming a system without MMU, assuming some small microcontroller. Overall, note the that answers highly depend on specific configuration of specific system. In the end, you can store everything in volatile memory if you want, and you can configure microcontrollers that way.
Out of Stack, heap, static and Code memory of a application, which is Stored in RAM or volatile memory and which part is stored in non-volatile memory?
There are sections, see like https://www.embeddedtutor.com/2019/07/memory-layout-executable-in-embedded.html . TBH it's really trivial - if something changes, it's volatile, if something does not change and persist between power losses, it's non-volatile. Let's say:
what
where
stack
volatile
heap
volatile
static
initialization of static variables is stored in non-volatile memory, static variables themselves are stored in volatile memory. I.e. data segment
Code
We live on Harvard architectures and assembly instructions do not change, so they go to non-volatile memory
Or when a application is executed, the whole application is copied to RAM or volatile memory?
No, it is not copied.
But, anyway, you can configure a microcontroller to execute instructions from volatile memory, like STM32 code_in_ram. Still there is no copy (there can be, you can write code for that...), the instructions are there. I.e. it all depends on specific configuration.
If only memory is allocated to the function in real time, that means the address of those variable has to be added to the assembly code of the function, how does that happen?
There is a stack pointer that contains the location of the top of the stack. It is incremented or decremented depending on if adding or removing data from the stack (and note, that for example on x86 stack grows towards decreasing memory addresses, like upside down). The functions themselves manage the stack, whereas functions are created by a C compiler and the C compiler creates that code to manage that stack.
There are standards that specify how it specifically works on specific architectures - these standard specify architecture ABI. For example, x86-64 ABI or ARM ABI. They specify what registers are used, how stack is used and passed, etc. Then compilers "implement that ABI" - create assembly code that adheres to the rules given in ABI standards.
Who does all this in an embedded system stack memory allocation etc in a embedded system when we write code for it in C?
C code itself manages memory by itself. Compiler glues it together.
so who manages this memory allocation during function calls in MCU
The code itself, I guess? (Nowadays) C standard libraries are written in C, and some are open source. Newlib is a popular free C standard library implementation for embedded systems - you can inspect it's malloc implementation.

Where memory segments are defined?

I just learned about different memory segments like: Text, Data, Stack and Heap. My question is:
1- Where the boundaries between these sections are defined? Is it in Compiler or OS?
2- How the compiler or OS know which addresses belong to each section? Should we define it anywhere?
This answer is from the point of view of a more special-purpose embedded system rather than a more general-purpose computing platform running an OS such as Linux.
Where the boundaries between these sections are defined? Is it in Compiler or OS?
Neither the compiler nor the OS do this. It's the linker that determines where the memory sections are located. The compiler generates object files from the source code. The linker uses the linker script file to locate the object files in memory. The linker script (or linker directive) file is a file that is a part of the project and identifies the type, size and address of the various memory types such as ROM and RAM. The linker program uses the information from the linker script file to know where each memory starts. Then the linker locates each type of memory from an object file into an appropriate memory section. For example, code goes in the .text section which is usually located in ROM. Variables go in the .data or .bss section which are located in RAM. The stack and heap also go in RAM. As the linker fills one section it learns the size of that section and can then know where to start the next section. For example, the .bss section may start where the .data section ended.
The size of the stack and heap may be specified in the linker script file or as project options in the IDE.
IDEs for embedded systems typically provide a generic linker script file automatically when you create a project. The generic linker file is suitable for many projects so you may never have to customize it. But as you customize your target hardware and application further you may find that you also need to customize the linker script file. For example, if you add an external ROM or RAM to the board then you'll need to add information about that memory to the linker script so that the linker knows how to locate stuff there.
The linker can generate a map file which describes how each section was located in memory. The map file may not be generated by default and you may need to turn on a build option if you want to review it.
How the compiler or OS know which addresses belong to each section?
Well I don't believe the compiler or OS actually know this information, at least not in the sense that you could query them for the information. The compiler has finished its job before the memory sections are located by the linker so the compiler doesn't know the information. The OS, well how do I explain this? An embedded application may not even use an OS. The OS is just some code that provides services for an application. The OS doesn't know and doesn't care where the boundaries of memory sections are. All that information is already baked into the executable code by the time the OS is running.
Should we define it anywhere?
Look at the linker script (or linker directive) file and read the linker manual. The linker script is input to the linker and provides the rough outlines of memory. The linker locates everything in memory and determines the extent of each section.
For your Query :-
Where the boundaries between these sections are defined? Is it in Compiler or OS?
Answer is OS.
There is no universally common addressing scheme for the layout of the .text segment (executable code), .data segment (variables) and other program segments. However, the layout of the program itself is well-formed according to the system (OS) that will execute the program.
How the compiler or OS know which addresses belong to each section? Should we define it anywhere?
I divided your this question into 3 questions :-
About the text (code) and data sections and their limitation?
Text and Data are prepared by the compiler. The requirement for the compiler is to make sure that they are accessible and pack them in the lower portion of address space. The accessible address space will be limited by the hardware, e.g. if the instruction pointer register is 32-bit, then text address space would be 4 GiB.
About Heap Section and limit? Is it the total available RAM memory?
After text and data, the area above that is the heap. With virtual memory, the heap can practically grow up close to the max address space.
Do the stack and the heap have a static size limit?
The final segment in the process address space is the stack. The stack takes the end segment of the address space and it starts from the end and grows down.
Because the heap grows up and the stack grows down, they basically limit each other. Also, because both type of segments are writeable, it wasn't always a violation for one of them to cross the boundary, so you could have buffer or stack overflow. Now there are mechanism to stop them from happening.
There is a set limit for heap (stack) for each process to start with. This limit can be changed at runtime (using brk()/sbrk()). Basically what happens is when the process needs more heap space and it has run out of allocated space, the standard library will issue the call to the OS. The OS will allocate a page, which usually will be manage by user library for the program to use. I.e. if the program wants 1 KiB, the OS will give additional 4 KiB and the library will give 1 KiB to the program and have 3 KiB left for use when the program ask for more next time.
Most of the time the layout will be Text, Data, Heap (grows up), unallocated space and finally Stack (grows down). They all share the same address space.
The sections are defined by a format which is loosely tied to the OS. For example on Linux you have ELF and on Mac OS you have Mach-O.
You do not define the sections explicitly as a programmer, in 99.9% of cases. The compiler knows what to put where.

Does a compiler have consider the kernel memory space when laying out memory?

I'm trying to reconcile a few concepts.
I know of virtual memory is shared (mapped) between the kernel and all user processes, which I read here. I also know that when the compiler generates addresses for code + data, the kernel must load them at the correct virtual addresses for that process.
To constrain the scope of the question, I'll just mean gcc when I mention 'the compiler'.
So does the compiler need to be compliant each new release of an OS, to know not to place code or data at the high memory addresses reserved for the kernel? As in, someone writing that piece of the compiler must know those details of how the kernel plans to load the program (lest the compiler put executable code in high memory)?
Or am I confusing different concepts? I got a bit confused when going through this tutorial, especially at the very bottom where it has OS code in low memory addresses, because I thought Linux uses high memory for the kernel.
The compiler doesn't determine the address ranges in memory at which things are placed. That's handled by the OS.
When the program is first executed, the loader places the various portions of the program and its libraries in memory. For memory that's allocated dynamically, large chunks are allocated from the OS and then sometimes divided into smaller chunks.
The OS loader knows where to load things. And the OS's virtual memory allocation logic how to find safe, empty spaces in the address space the process uses.
I'm not sure what you mean by the "high memory addresses reserved for the kernel". If you're talking about a 2G/2G or 3G/1G split on a 32-bit operating system, that is a fundamental design element of those OSes that use it. It doesn't change with versions.
If you're talking about high physical memory, then no. Compilers don't care about physical memory.
Linux gives each application its own memory space, distinct from the kernel. The page table contains the translations between this memory space and physical RAM, and the kernel sets up the page table so there's no interference.
That said, the compiler usually doesn't even care where the program is loaded in memory. Why would it?

Microcontroller memory allocation

I've been thinking for day about the following question:
In a common pc when you allocate some memory, you ask for it to the OS that keeps track of which memory segments are occupied and which ones are not, and don't let you mess around with other programs memory etc.
But what about a microcontroller, I mean a microcontroller doesn't have an operating system running so when you ask for a bunch of memory what is going on? you cannot simply acess the memory chip and acess a random place cause it may be occupied... who keeps track of which parts of memory are already occupied, and gives you a free place to store something?
EDIT:
I've programmed microcontrollers in C... and I was thinking that the answer could be "language independent". But let me be more clear: supose i have this program running on a microcontroller:
int i=0;
int d=3;
what makes sure that my i and d variables are not stored at the same place in memory?
I think the comments have already covered this...
To ask for memory means you have some operating system managing memory that you are mallocing from (using a loose sense of the term operating system). First you shouldnt be mallocing memory in a microcontroller as a general rule (I may get flamed for that statement). Can be done in cases but you are in control of your memory, you own the system with your application, asking for memory means asking yourself for it.
Unless you have reasons why you cannot statically allocate your structures or arrays or use a union if there are mutually exclusive code paths that might both want much or all of the spare memory, you can try to allocate dynamically and free but it is a harder system engineering problem to solve.
There is a difference between runtime allocation of memory and compile time. your example has nothing to do with the rest of the question
int i=0;
int d=3;
the compiler at compile time allocates two locations in .data one for each of those items. the linker and/or script manages where .data lives and what its limitations are on size, if .data is bigger than what is available you should get a linker warning, if not then you need to fix your linker commands or script to match your system.
runtime allocation is managed at runtime and where and how it manages the memory is determined by that library, even if you have plenty of memory a bad or improperly written library could overlap .text, .data, .bss and/or the stack and cause a lot of problems.
excessive use of the stack is also a pretty serious system engineering problem which coming from non-embedded systems is these days often overlooked because there is so much memory. It is a very real problem when dealing with embedded code on a microcontroller. You need to know your worst case stack usage, leave room for at least that much memory if you are going to have a heap to dynamically allocate, or even if you statically allocate.

How are the different segments like heap, stack, text related to the physical memory?

When a C program is compiled and the object file(ELF) is created. the object file contains different sections such as bss, data, text and other segments. I understood that these sections of the ELF are part of virtual memory address space. Am I right? Please correct me if I am wrong.
Also, there will be a virtual memory and page table associated with the compiled program. Page table associates the virtual memory address present in ELF to the real physical memory address when loading the program. Is my understanding correct?
I read that in the created ELF file, bss sections just keeps the reference of the uninitialised global variables. Here uninitialised global variable means, the variables that are not intialised during declaration?
Also, I read that the local variables will be allocated space at run time (i.e., in stack). Then how they will be referenced in the object file?
If in the program, there is particular section of code available to allocate memory dynamically. How these variables will be referenced in object file?
I am confused that these different segments of object file (like text, rodata, data, bss, stack and heap) are part of the physical memory (RAM), where all the programs are executed.
But I feel that my understanding is wrong. How are these different segments related to the physical memory when a process or a program is in execution?
1. Correct, the ELF file lays out the absolute or relative locations in the virtual address space of a process that the operating system should copy the ELF file contents into. (The bss is just a location and a size, since its supposed to be all zeros, there is no need to actually have the zeros in the ELF file). Note that locations can be absolute locations (like virtual address 0x100000 or relative locations like 4096 bytes after the end of text.)
2. The virtual memory definition (which is kept in page tables and maps virtual addresses to physical addresses) is not associated with a compiled program, but with a "process" (or "task" or whatever your OS calls it) that represents a running instance of that program. For example, a single ELF file can be loaded into two different processes, at different virtual addresses (if the ELF file is relocatable).
3. The programming language you're using defines which uninitialized state goes in the bss, and which gets explicitly initialized. Note that the bss does not contain "references" to these variables, it is the storage backing those variables.
4. Stack variables are referenced implicitly from the generated code. There is nothing explicit about them (or even the stack) in the ELF file.
5. Like stack references, heap references are implicit in the generated code in the ELF file. (They're all stored in memory created by changing the virtual address space via a call to sbrk or its equivalent.)
The ELF file explains to an OS how to setup a virtual address space for an instance of a program. The different sections describe different needs. For example ".rodata" says I'd like to store read-only data (as opposed to executable code). The ".text" section means executable code. The "bss" is a region used to store state that should be zeroed by the OS. The virtual address space means the program can (optionally) rely on things being where it expects when it starts up. (For example, if it asks for the .bss to be at address 0x4000, then either the OS will refuse to start it, or it will be there.)
Note that these virtual addresses are mapped to physical addresses by the page tables managed by the OS. The instance of the ELF file doesn't need to know any of the details involved in which physical pages are used.
I am not sure if 1, 2 and 3 are correct but I can explain 4 and 5.
4: They are referenced by offset from the top of the stack. When executing a function, the top of the stack is increased to allocate space for local variables. Compiler determines the order of local variables in the stack so the compiler nows what is the offset of the variables from the top of the stack.
Stack in physical memory is positioned upside down. Beginning of stack usually has highest memory address available. As programs runs and allocates space for local variables the address of the top of the stack decrements (and can potentially lead to stack overflow - overlapping with segments on lower addresses :-) )
5: Using pointers - Address of dynamically allocated variable is stored in (local) variable. This corresponds to using pointers in C.
I have found nice explanation here: http://www.ualberta.ca/CNS/RESEARCH/LinuxClusters/mem.html
All the addresses of the different sections (.text, .bss, .data, etc.) you see when you inspect an ELF with the size command:
$ size -A -x my_elf_binary
are virtual addresses. The MMU with the operating system performs the translation from the virtual addresses to the RAM physical addresses.
If you want to know these things, learn about the OS, with source code (www.kernel.org) if possible.
You need to realize that the OS kernel is actually running the CPU and managing the memory resource. And C code is just a light weight script to drive the OS and to run only simple operation with registers.
Virtual memory and Physical memory is about CPU's TLB letting the user space process to use contiguous memory virtually through the power of TLB (using page table) hardware.
So the actual physical memory, mapped to the contiguous virtual memory can be scattered to anywhere on the RAM.
Compiled program doesn't know about this TLB stuff and physical memory address stuff. They are managed in the OS kernel space.
BSS is a section which OS prepares as zero filled memory addresses, because they were not initialized in the c/c++ source code, thus marked as bss by the compiler/linker.
Stack is something prepared only a small amount of memory at first by the OS, and every time function call has been made, address will be pushed down, so that there is more space to place the local variables, and pop when you want to return from the function.
New physical memory will be allocated to the virtual address when the first small amount of memory is full and reached to the bottom, and page fault exception would occur, and the OS kernel will prepare a new physical memory and the user process can continue working.
No magic. In object code, every operation done to the pointer returned from malloc is handled as offsets to the register value returned from malloc function call.
Actually malloc is doing quite complex things. There are various implementations (jemalloc/ptmalloc/dlmalloc/googlemalloc/...) for improving dynamic allocations, but actually they are all getting new memory region from the OS using sbrk or mmap(/dev/zero), which is called anonymous memory.
Just do a man on the command readelf to find out the starting addresses of the different segments of your program.
Regarding the first question you are absolutely right. Since most of today's systems use run-time binding it is only during execution that the actual physical addresses are known. Moreover, it's the compiler and the loader that divide the program into different segments after linking the different libraries during compile and load time. Hence, the virtual addresses.
Coming to the second question it is at the run-time due to runtime binding. The third question is true. All uninitialized global variables and static variables go into BSS. Also note the special case: they go into BSS even if they are initialized to 0.
4.
If you look at a assembler code generated by gcc you can see that memory local variables is allocated in stack through command push or through changing value of the register ESP. Then they are initiated with command mov or something like that.

Resources