data segment containing heap data or static variables - c

I was reading (Operating System - Tannenbaum, page 190) about system memory and I found a paragraph that said:
the data segment being used as a heap for the variables that are dynamically allocated and released and a stack segment for the normal local variables and return addresses.
Whereas Data Segment says that it is used for initialized static variables.
Which one of them is correct? Or is there something wrong with my understanding?

From your link itself:
Historically, to be able to support memory address spaces larger than the native size of the internal address register would allow, early CPUs implemented a system of segmentation whereby they would store a small set of indexes to use as offsets to certain areas. The Intel 8086 family of CPUs provided four segments: the code segment, the data segment, the stack segment and the extra segment.
Now, Operating Systems: Design and Implementation was written in 1987, when the data segment was used for all of stack, heap, initialized data, and uninitialized data.
Since then, there were a few important changes:
There's now a lot more memory and many more CPU bits, and segmentation is no longer needed by the hardware.
Segments became more than a mere hardware artifact: they became a memory management design pattern.
The BSS segment was introduced.
Features like mmap() and POSIX shared memory IPC mean that the heap is not a single contiguous segment.
Multi-threading means multiple stacks, in the same memory space, sharing a single heap.
So when the book was written, the "data segment" was a concept defined by the hardware, and it was defined to contain everything that wasn't local: initialized data, uninitialized data, dynamically allocated data etc.
But these days the OS' memory manager defines "data segment" as "memory area containing the program's initialized data".
On CPUs that use a data segment pointer, it points to the beginning of what the OS' memory manager declares to be "the data segment".
But the memory manager has more segments, for BSS and heap, which aren't represented by a CPU pointer, so the memory manager just places them immediately after the data segment.
Stacks are a different story these days. When you create a new thread, it gets a new stack, often limited in size (e.g. 8 MB on some versions of Linux). Most likely, the stack for a thread will be allocated from the same area as the heap, meaning at a higher address than the data segment, and since the stack grows to lower addresses, all stacks will still grow towards the data segment.

Related

Where does malloc() allocate memory? Is it the data section or the heap section of the virtual address space of the process?

Ever since I was introduced to C, I was told that in C dynamic memory allocation is done using the functions in the malloc family. I also learned that memory dynamically allocated using malloc is allocated on the heap section of the process.
Various OS textbooks say that malloc involves system call (though not always but at times) to allocate structures on heap to the process. Now supposing that malloc returns pointer to chunk of bytes allocated on the heap, why should it need a system call. The activation records of a function are placed in the stack section of the process and since the "stack section" is already a part of the virtual address space of the process, pushing and popping of activation records, manipulation of stack pointers, just start from the highest possible address of the virtual address space. It does not even require a system call.
Now on the same grounds since the "heap section" is also a part of the virtual address space of the process, why should a system call be necessary for allocating a chunk of bytes in this section. The routine like malloc could self handle the "free" list and "allocated" list on its own. All it needs to know is the end of the "data section". Certain texts say that system calls are necessary to "attach memory to the process for dynamic memory allocation", but if malloc allocates memory on "heap section" why is it at all required to attach memory to the process during malloc? Could be simply taken from portion already part of the process.
While going through the text "The C Programming Language" [2e] by Kernighan and Ritchie, I came across their implementation of the malloc function [section 8.7 pages 185-189]. The authors say :
malloc calls upon the operating system to obtain more memory as necessary.
Which is what the OS texts say, but counter intuitive to my thought above (if malloc allocates space on heap).
Since asking the system for memory is a comparatively expensive operation, the authors do not do that on every call to malloc, so they create a function morecore which requests at least NALLOC units; this larger block is chopped up as needed. And the basic free list management is done by free.
But the thing is that the authors use sbrk() to ask the operating system for memory in morecore. Now Wikipedia says:
brk and sbrk are basic memory management system calls used in Unix and Unix-like operating systems to control the amount of memory allocated to the data segment of the process.
Where
a data segment (often denoted .data) is a portion of an object file or the corresponding address space of a program that contains initialized static variables, that is, global variables and static local variables.
Which I guess is not the "heap section". [Data section is the second section from bottom in the picture above, while heap is the third section from bottom.]
I am totally confused. I want to know what really happens and how both the concepts are correct? Please help me understand the concept by joining the scattered pieces together...
In your diagram, the section labeled "data" is more precisely called "static data"; the compiler pre-allocates this memory for all the global variables when the process starts.
The heap that malloc() uses is the rest of the process's data segment. This initially has very little memory assigned to it in the process. If malloc() needs more memory, it can use sbrk() to extend the size of the data segment, or it can use mmap() to create additional memory segments elsewhere in the address space.
Why does malloc() need to do this? Why not simply make the entire address space available for it to use? There are historical and practical reasons for this.
The historical reason is that early computers didn't have virtual memory. All the memory assigned to a process was swapped in bulk to disk when switching between processes. So it was important to only assign memory pages that were actually needed.
The practical reason is that this is useful for detecting various kinds of errors. If you've ever gotten a segmentation violation error because you dereferenced an uninitialized pointer, you've benefited from this. Much of the process's virtual address space is not allocated to the process, which makes it likely that uninitialized pointers point to unavailable memory, and you get an error trying to use it.
There's also an unallocated gap between the heap (growing upwards) and the stack (growing downward). This is used to detect stack overflow -- when the stack tries to use memory in that gap, it gets a fault that's translated to the stack overflow signal.
This is the Standard C Library specification for malloc(), in its entirety:
7.22.3.4 The malloc function
Synopsis
#include <stdlib.h>
void *malloc(size_t size);
Description
The malloc function allocates space for an object whose size is
specified by size and whose value is indeterminate.
Returns
The malloc function returns either a null pointer or a pointer to the
allocated space.
That's it. There's no mention of the Heap, the Stack or any other memory location, which means that the underlying mechanisms for obtaining the requested memory are implementation details.
In other words, you don't care where the memory comes from, from a C perspective. A conforming implementation is free to implement malloc() in any way it sees fit, so long as it conforms to the above specification.
I was told that in C dynamic memory allocation is done using the functions in the malloc family. I also learned that memory dynamically allocated using malloc is allocated on the heap section of the process.
Correct on both points.
Now supposing that malloc returns pointer to chunk of bytes allocated on the heap, why should it need a system call.
It needs to request an adjustment to the size of the heap, to make it bigger.
...the "stack section" is already a part of the virtual address space of the process, pushing and popping of activation records, manipulation of stack pointers, [...] does not even require a system call.
The stack segment is grown implicitly, yes, but that's a special feature of the stack segment. There's typically no such implicit growing of the data segment. (Note, too, that the implicit growing of the stack segment isn't perfect, as witness the number of people who post questions to SO asking why their programs crash when they allocate huge arrays as local variables.)
Now on the same grounds since the "heap section" is also a part of the virtual address space of the process, why should a system call be necessary for allocating a chunk of bytes in this section.
Answer 1: because it's always been that way.
Answer 2: because you want accidental stray pointer references to crash, not to implicitly allocate memory.
malloc calls upon the operating system to obtain more memory as necessary.
Which is what the OS texts say, but counter intuitive to my thought above (if malloc allocates space on heap).
Again, malloc does request space on the heap, but it must use an explicit system call to do so.
But the thing is that the authors use sbrk() to ask the operating system for memory in morecore. Now Wikipedia says:
brk and sbrk are basic memory management system calls used in Unix and Unix-like operating systems to control the amount of memory allocated to the data segment of the process.
Different people use different nomenclatures for the different segments. There's not much of a distinction between the "data" and "heap" segments. You can think of the heap as a separate segment, or you can think of those system calls -- the ones that "allocate space on the heap" -- as simply making the data segment bigger. That's the nomenclature the Wikipedia article is using.
Some updates:
I said that "There's not much of a distinction between the 'data' and 'heap' segments." I suggested that you could think of them as subparts of a single, more generic data segment. And actually there are three subparts: initialized data, uninitialized data or "bss", and the heap. Initialized data has initial values that are explicitly copied out of the program file. Uninitialized data starts out as all bits zero, and so does not need to be stored in the program file; all the program file says is how many bytes of uninitialized data it needs. And then there's the heap, which can be thought of as a dynamic extension of the data segment, which starts out with a size of 0 but may be dynamically adjusted at runtime via calls to brk and sbrk.
I said, "you want accidental stray pointer references to crash, not to implicitly allocate memory", and you asked about this. This was in response to your supposition that explicit calls to brk or sbrk ought not to be required to adjust the size of the heap, and your suggestion that the heap could grow automatically, implicitly, just like the stack does. But how would that work, really?
The way automatic stack allocation works is that as the stack pointer grows (typically "downward"), it eventually reaches a point that it points to unallocated memory -- that blue section in the middle of the picture you posted. At that point, your program literally gets the equivalent of a "segmentation violation". But the operating system notices that the violation involves an address just below the existing stack, so instead of killing your program on an actual segmentation violation, it quick-quick makes the stack segment a little bigger, and lets your program proceed as if nothing had happened.
So I think your question was, why not have the upward-growing heap segment work the same way? And I suppose an operating system could be written that worked that way, but most people would say it was a bad idea.
I said that in the stack-growing case, the operating system notices that the violation involves an address "just below" the existing stack, and decides to grow the stack at that point. There's a definition of "just below", and I'm not sure what it is, but these days I think it's typically a few tens or hundreds of kilobytes. You can find out by writing a program that allocates a local variable
char big_stack_array[100000];
and seeing if your program crashes.
Now, sometimes a stray pointer reference -- that would otherwise cause a segmentation violation style crash -- is just the result of the stack normally growing. But sometimes it's a result of a program doing something stupid, like the common error of writing
char *retbuf;
printf("type something:\n");
fgets(retbuf, 100, stdin);
And the conventional wisdom is that you do not want to (that is, the operating system does not want to) coddle a broken program like this by automatically allocating memory for it (at whatever random spot in the address space the uninitialized retbuf pointer seems to point) to make it seem to work.
If the heap were set up to grow automatically, the OS would presumably define an analogous threshold of "close enough" to the existing heap segment. Apparently stray pointer references within that region would cause the heap to automatically grow, while references beyond that (farther into the blue region) would crash as before. That threshold would probably have to be bigger than the threshold governing automatic stack growth. malloc would have to be written to make sure not to try to grow the heap by more than that amount. And true, stray pointer references -- that is, program bugs -- that happened to reference unallocated memory in that zone would not be caught. (Which is, it's true, what can happen for buggy, stray pointer references just off the end of the stack today.)
But, really, it's not hard for malloc to keep track of things, and explicitly call sbrk when it needs to. The cost of requiring explicit allocation is small, and the cost of allowing automatic allocation -- that is, the cost of the stray pointer bugs not caught -- would be larger. This is a different set of tradeoffs than for the stack growth case, where an explicit test to see if the stack needed growing -- a test which would have to occur on every function call -- would be significantly expensive.
Finally, one more complication. The picture of the virtual memory layout that you posted -- with its nice little stack, heap, data, and text segments -- is a simple and perhaps outdated one. These days I believe things can be a lot more complicated. As @chux wrote in a comment, "your malloc() understanding is only one of many ways allocation is handled. A clear understanding of one model may hinder (or help) understanding of the many possibilities." Among those complicating possibilities are:
A program may have multiple stack segments maintaining multiple stacks, if it supports coroutines or multithreading.
The mmap and shm_open system calls may cause additional memory segments to be allocated, scattered anywhere within that blue region between the heap and the stack.
For large allocations, malloc may use mmap rather than sbrk to get memory from the OS, since it turns out this can be advantageous.
See also Why does malloc() call mmap() and brk() interchangeably?
As the bard said, "There are more things in heaven and earth, Horatio, than are dreamt of in your philosophy." :-)
Not all virtual addresses are available at the beginning of a process.
The OS does maintain a virtual-to-physical map, but (at any given time) only some of the virtual addresses are in the map. Reading or writing a virtual address that isn't in the map causes an instruction-level exception. sbrk puts more addresses in the map.
The stack is just like the data section, but it has a fixed size, and there is no sbrk-like system call to extend it. We can say there is no heap section, only a fixed-size stack section and a data section that can be grown upward by sbrk.
The heap section you mention is actually a part of the data section that is managed by malloc and free. The code for heap management is not in the OS kernel but in the C library, executing in CPU user mode.

where does address of variables stored in a memory?

whenever we need to find the address of the variable we use below syntax in C and it prints a address of the variable. what i am trying to understand is the address that returned is actual physical memory location or compiler throwing a some random number. if it is either physical or random, where did it get those number or where it has to be stored in memory. actually does address of the memory location takes space in the memory?
int a = 10;
printf("ADDRESS:%p", (void *)&a);
ADDRESS: 2234xxxxxxxx
This location is from the virtual address space, which is allocated to your program. In other words, this is from the virtual memory, which your OS maps to a physical memory, as and when needed.
It depends on what type of system you've got.
Low-end systems such as microcontroller applications often support only physical addresses.
Mid-range CPUs often come with an MMU (memory management unit), which allows so-called virtual memory to be placed on top of the physical memory. This means that a certain part of the code could be working with addresses 0 to x, though in reality those virtual addresses are just aliases for physical ones.
High-end systems like PCs typically allow only virtual memory access and deny applications direct access to physical memory. They often also use address space layout randomization (ASLR) to produce random address layouts for certain kinds of memory, in order to prevent hacks that exploit hard-coded addresses.
In either case, the actual address itself does not take up space in memory.
Higher abstraction layer concepts such as file systems may however store addresses in look-up tables etc and then they will take up memory.
… is the address that returned is actual physical memory location or compiler throwing a some random number
In general-purpose operating systems, the addresses in your C program are virtual memory addresses. [1]
if it is either physical or random, where did it get those number or where it has to be stored in memory.
The software that loads your program into memory makes the final decisions about what addresses are used [2], and it may inform your program about those addresses in various ways, including:
It may put the start addresses of certain parts of the program in designated processor registers. For example, the start address of the read-only data of your program might be put in R17, and then your program would use R17 as a base address for accessing that data.
It may “fix up” addresses built into your program’s instructions and data. The program’s executable file may contain information about places in your program’s instructions or data that need to be updated when the virtual addresses are decided. After the instructions and data are loaded into memory, the loader will use the information in the file to find those places and update them.
With position-independent code, the program counter itself (a register in the processor that contains the address of the instruction the processor is currently executing or about to execute) provides address information.
So, when your program wants to evaluate &x, it may take the offset of x from the start of the section it is in (that offset is built into the program by the compiler and possibly updated by the linker) and add it to the base address of that section. The resulting sum is the address of x.
actually does address of the memory location takes space in the memory?
The C standard does not require the program to use any memory for the address of x, &x. The result of &x is a value, like the result of 3*x. The only thing the compiler has to do with a value is ensure it gets used for whatever further expression it is used in. It is not required to store it in memory. However, if the program is dealing with many values in a piece of code, so there are not enough processor registers to hold them all, the compiler may choose to store values in memory temporarily.
Footnotes
[1] Virtual memory is a conceptual or “imaginary” address space. Your program can execute with virtual addresses because the hardware automatically translates virtual addresses to physical addresses while it is executing the program. The operating system creates a map that tells the hardware how to translate virtual addresses to physical addresses. (The map may also tell the hardware certain virtual memory is not actually in physical memory at the moment. In this case, the hardware interrupts the program and starts an operating system routine which deals with the issue. That routine arranges for the needed data to be loaded into memory and then updates the virtual memory map to indicate that.)
[2] There is usually a general scheme for how parts of the program are laid out in memory, such as starting the instructions in one area and setting up space for stack in another area. In modern systems, some randomness is intentionally added to the addresses to foil malicious people trying to take advantage of bugs in programs.

Linux heap allocation

In FreeRTOS, the heap is simply a global array with a size (let's call it heapSize) defined in a header file, which the user can change. This array is an uninitialized global array, which makes it part of the BSS section of the image, so it is filled with zeros upon loading. Every allocation of memory is then taken from this array, and every address of allocated memory is an offset into this array.
So, for a maximal utilization of the memory size, we can approximate the size of the Data, Text and BSS areas of our entire program, and define the heap size to something like heapSize = RAM_size - Text_size - Data_size - BSS_size.
I would like to know what the equivalent implementation is in Linux. Can Linux scan a given RAM and decide its size at run time? Does Linux have an equivalent data structure to manage the heap? If so, how does it allocate the memory for this data structure in the first place?
I would like to know what the equivalent implementation is in Linux.
Read "Chapter 8: Allocating Memory" in Linux Device Drivers, Third Edition.
The heap in Linux is dynamic, so it grows whenever you request more memory. It can extend beyond the physical memory size by using swap files, where some unused portions of RAM are written to disk.
So I think you need to think more in terms of "how much memory does my application need" rather than "how much memory is available".

How does the global variable declaration solve the stack overflow in C?

I have some C code.
What it does is simple, get some array from io, then sort it.
#include <stdio.h>
#include <stdlib.h>
#define ARRAY_MAX 2000000
int main(void) {
    int my_array[ARRAY_MAX];
    int w[ARRAY_MAX];
    int count = 0;
    while (count < ARRAY_MAX && 1 == scanf("%d", &my_array[count])) {
        count++;
    }
    merge_sort(my_array, w, count);
    return EXIT_SUCCESS;
}
It works well, but if I actually give it 2000000 numbers, it causes a stack overflow. Yes, it uses up all of the stack. One solution is to use malloc() to allocate memory for these 2 variables, which moves them to the heap, so there is no problem at all.
The other solution is to move the 2 declarations below to the global scope, making them global variables.
int my_array[ARRAY_MAX];
int w[ARRAY_MAX];
My tutor told me that this solution does the same job: to move these 2 variables into the heap.
But I checked some documents online. Global variables without initialisation reside in the bss segment, right?
I checked online, and the size of this section is just a few bytes.
How could it prevent the stack overflow?
Or, because these 2 variables are arrays, are they pointers, do global pointers reside in the data segment, and does that indicate that the size of the data segment can be changed dynamically as well?
The bss (block started by symbol) section is tiny in the object file (4 or 8 bytes) but the value stored is the number of bytes of zeroed memory to allocate after the initialized data.
It avoids the stack overflow by allocating the storage 'not on the stack'. It is normally in the data segment, after the text segment and before the start of the heap segment — but that simple memory picture can be more complicated these days.
Officially, there should be caveats about 'the standard doesn't say that there must be a stack' and various other minor bits'n'pieces, but that doesn't alter the substance of the answer. The bss section is small because it is a single number — but the number can represent an awful lot of memory.
Disclaimer: This is not a guide, it is an overview. It is based on how Linux does things, though I may have gotten some details wrong. Most (desktop) operating systems use a very similar model, with different details. Additionally, this only applies to userspace programs, which is what you're writing unless you're developing for the kernel or working on modules (Linux), drivers (Windows), or kernel extensions (OS X).
Virtual Memory: I'll go into more detail below, but the gist is that each process gets an exclusive 32-/64-bit address space. And obviously a process' entire address space does not always map to real memory. This means A) one process' addresses mean nothing to another process and B) the OS decides which parts of a process' address space are loaded into real memory and which parts can stay on disk, at any given point in time.
Executable File Format
Executable files have a number of different sections. The ones we care about here are .text, .data, .bss, and .rodata. The .text section is your code. The .data and .bss sections are global variables.
The .rodata section is constant-value 'variables' (aka consts). Consts are things like error strings and other messages, or perhaps magic numbers. Values that your program needs to refer to but never change.
The .data section stores global variables that have an initial value. This includes variables defined as <type> <varname> = <value>;. E.g. a data structure containing state variables, with initial values, that your program uses to keep track of itself.
The .bss section records global variables that do not have an initial value, or that have an initial value of zero. This includes variables defined as <type> <varname>; and <type> <varname> = 0;. Since the compiler and the OS both know that variables in the .bss section should be initialized to zero, there's no reason to actually store all of those zeros. So the executable file only stores variable metadata, including the amount of memory that should be allocated for the variable.
Process Memory Layout
When the OS loads your executable, it creates six memory segments. The bss, data, and text segments are all located together. The data and text segments are loaded (not really, see virtual memory) from the file. The bss section is allocated to the size of all of your uninitialized/zero-initialized variables (see VM). The memory mapping segment is similar to the data and text segments in that it consists of blocks of memory that are loaded (see VM) from files. This is where dynamic libraries are loaded.
The bss, data, and text segments are fixed-size. The memory mapping segment is effectively fixed-size, but it will grow when your program loads a new dynamic library or uses another memory mapping function. However, this does not happen often and the size increase is always the size of the library or file (or shared memory) being mapped.
The Stack
The stack is a bit more complicated. A zone of memory, the size of which is determined by the program, is reserved for the stack. The base of the stack (its highest memory address) holds the main function's variables. During execution, more variables may be added to or removed from the other end, the top of the stack. On most common architectures, including x86, pushing data onto the stack grows it down (toward lower memory addresses), decreasing the stack pointer (which holds the address of the top of the stack). Popping data off the stack shrinks it back up, increasing the stack pointer. When a function is called, the address of the next instruction in the calling function (the return address, within the text segment) is pushed onto the stack. When a function returns, it restores the stack to the state it was in before the function was called (everything it pushed onto the stack is popped off) and jumps to the return address.
If the stack grows too large, the result is dependent on many factors. Sometimes you get a stack overflow. Sometimes the run-time (in your case, the C runtime) tries to allocate more memory for the stack. This topic is beyond the scope of this answer.
The Heap
The heap is used for dynamic memory allocation. Memory allocated with one of the alloc functions lives on the heap. All other memory allocations are not on the heap. The heap starts as a block of unused memory. When you allocate memory on the heap, the allocator in the C library tries to find space within the heap for your allocation, and asks the OS for more memory if it cannot. I'm not going to go over how the actual allocation process works.
Virtual Memory
The OS makes your process think that it has the entire 32-/64-bit memory space to play in. Obviously, this is impossible; often this would mean your process had access to more memory than your computer physically has; on a 32-bit processor with 4GB of memory, this would mean your process had access to every bit of memory, with no room left for other processes.
The addresses that your process uses are virtual: they do not directly name locations in physical memory. Additionally, most of the memory in your process' address space is inaccessible, because it refers to nothing (on a 32-bit processor it may not be most). The ranges of usable/valid addresses are partitioned into pages. The kernel maintains a page table for each process.
When your executable is loaded and when your process loads a file, in reality, it is mapped to one or more pages. The OS does not necessarily actually load that file into memory. What it does is create enough entries in the page table to cover the entire file while notating that those pages are backed by a file. Entries in the page table have two flags and an address. The first flag (valid/invalid) indicates whether or not the page is loaded in real memory. If the page is not loaded, the other flag and the address are meaningless. If the page is loaded, the second flag indicates whether or not the page's real memory has been modified since it was loaded and the address maps the page to real memory.
The stack, heap, and bss work similarly, except they are not backed by a 'real' file. If the OS decides that one of your process' pages isn't being used, it will unload that page. Before it unloads the page, if the modified flag is set in the page table for that page, it will save the page to disk somewhere. This means that if a page in the stack or heap is unloaded, a 'file' will be created that now maps to that page.
When your process tries to access a (virtual) memory address, the kernel/memory management hardware uses the page table to translate that virtual address to a real memory address. If the valid/invalid flag is invalid, a page fault is triggered. The kernel pauses your process, locates or makes a free page, loads part of the mapped file (or fake file for the stack or heap) into that page, sets the valid/invalid flag to valid, updates the address, then reruns the original instruction that triggered the page fault.
AFAIK, the bss section is a special page or pages. When a page in this section is first accessed (and triggers a page fault), the page is zeroed before the kernel returns control to your process. This means that the kernel doesn't pre-zero the entire bss section when your process is loaded.
Further Reading
Anatomy of a Program in Memory
How the Kernel Manages Your Memory
Global variables are not allocated on the stack. They are allocated in the data segment (if initialised) or the bss (if they are uninitialised).

How are the different segments like heap, stack, text related to the physical memory?

When a C program is compiled and the object file (ELF) is created, the object file contains different sections such as bss, data, text and others. I understood that these sections of the ELF are part of the virtual memory address space. Am I right? Please correct me if I am wrong.
Also, there will be a virtual memory and page table associated with the compiled program. Page table associates the virtual memory address present in ELF to the real physical memory address when loading the program. Is my understanding correct?
I read that in the created ELF file, the bss section just keeps references to the uninitialised global variables. Does uninitialised global variable here mean a variable that is not initialised at its declaration?
Also, I read that local variables are allocated space at run time (i.e., on the stack). Then how are they referenced in the object file?
If the program has a section of code that allocates memory dynamically, how are those variables referenced in the object file?
I am confused that these different segments of object file (like text, rodata, data, bss, stack and heap) are part of the physical memory (RAM), where all the programs are executed.
But I feel that my understanding is wrong. How are these different segments related to the physical memory when a process or a program is in execution?
1. Correct, the ELF file lays out the absolute or relative locations in the virtual address space of a process that the operating system should copy the ELF file contents into. (The bss is just a location and a size; since it's supposed to be all zeros, there is no need to actually have the zeros in the ELF file.) Note that locations can be absolute (like virtual address 0x100000) or relative (like 4096 bytes after the end of text).
2. The virtual memory definition (which is kept in page tables and maps virtual addresses to physical addresses) is not associated with a compiled program, but with a "process" (or "task" or whatever your OS calls it) that represents a running instance of that program. For example, a single ELF file can be loaded into two different processes, at different virtual addresses (if the ELF file is relocatable).
3. The programming language you're using defines which uninitialized state goes in the bss, and which gets explicitly initialized. Note that the bss does not contain "references" to these variables, it is the storage backing those variables.
4. Stack variables are referenced implicitly from the generated code. There is nothing explicit about them (or even the stack) in the ELF file.
5. Like stack references, heap references are implicit in the generated code in the ELF file. (They're all stored in memory created by changing the virtual address space via a call to sbrk or its equivalent.)
The ELF file explains to an OS how to set up a virtual address space for an instance of a program. The different sections describe different needs. For example, ".rodata" says "I'd like to store read-only data" (as opposed to executable code). The ".text" section means executable code. The "bss" is a region used to store state that should be zeroed by the OS. The virtual address space means the program can (optionally) rely on things being where it expects when it starts up. (For example, if it asks for the .bss to be at address 0x4000, then either the OS will refuse to start it, or it will be there.)
Note that these virtual addresses are mapped to physical addresses by the page tables managed by the OS. The instance of the ELF file doesn't need to know any of the details involved in which physical pages are used.
I am not sure if 1, 2 and 3 are correct but I can explain 4 and 5.
4: They are referenced by offset from the top of the stack. When a function is executed, the stack pointer is moved to allocate space for its local variables. The compiler determines the order of local variables on the stack, so it knows the offset of each variable from the top of the stack.
The stack is usually laid out upside down in memory. The beginning of the stack usually has the highest memory address available. As the program runs and allocates space for local variables, the address of the top of the stack decreases (and can potentially lead to stack overflow - overlapping with segments at lower addresses :-) )
5: Using pointers - Address of dynamically allocated variable is stored in (local) variable. This corresponds to using pointers in C.
I have found nice explanation here: http://www.ualberta.ca/CNS/RESEARCH/LinuxClusters/mem.html
All the addresses of the different sections (.text, .bss, .data, etc.) you see when you inspect an ELF with the size command:
$ size -A -x my_elf_binary
are virtual addresses. The MMU with the operating system performs the translation from the virtual addresses to the RAM physical addresses.
If you want to know these things, learn about the OS, with source code (www.kernel.org) if possible.
You need to realize that the OS kernel is what actually runs the CPU and manages the memory resource. Compared to that, C code is just a lightweight script that drives the OS and performs simple operations on registers.
Virtual versus physical memory is about the CPU's MMU (with the TLB caching page table entries) letting a user-space process use memory that is contiguous in its virtual address space.
So the actual physical memory mapped to a contiguous range of virtual memory can be scattered anywhere in RAM.
A compiled program doesn't know about the TLB or about physical memory addresses. Those are managed in the OS kernel space.
BSS is a section that the OS prepares as zero-filled memory, because its variables were not initialized in the C/C++ source code and were therefore marked as bss by the compiler/linker.
For the stack, the OS prepares only a small amount of memory at first. Every time a function call is made, the stack pointer is pushed down so that there is more space for the local variables, and it is popped back when the function returns.
When that first small amount of memory is full and the stack reaches the bottom of it, a page fault exception occurs, the OS kernel assigns new physical memory to the virtual address, and the user process can continue working.
No magic. In object code, every operation done on the pointer returned from malloc is handled as an offset from the register value returned by the malloc function call.
Actually malloc is doing quite complex things. There are various implementations (jemalloc/ptmalloc/dlmalloc/googlemalloc/...) for improving dynamic allocation, but in the end they all get new memory regions from the OS using sbrk or mmap(/dev/zero), which is called anonymous memory.
Just do a man on the command readelf to find out the starting addresses of the different segments of your program.
Regarding the first question you are absolutely right. Since most of today's systems use run-time binding it is only during execution that the actual physical addresses are known. Moreover, it's the compiler and the loader that divide the program into different segments after linking the different libraries during compile and load time. Hence, the virtual addresses.
Coming to the second question, this happens at run time because of run-time binding. The third point is true: all uninitialized global variables and static variables go into BSS. Also note the special case: they go into BSS even if they are explicitly initialized to 0.
4. If you look at the assembler code generated by gcc, you can see that memory for local variables is allocated on the stack with the push instruction or by changing the value of the ESP register. The variables are then initialized with mov or a similar instruction.

Resources