The heap troubles me because I don't understand who creates it, who maintains it and who decides where it should be... This test shows part of my conundrum:
Source code:
#include <stdio.h>
#include <stdlib.h>

int a;     /* uninitialized global: .bss */
int b = 5; /* initialized global: .data */

int *getMeAPointer(void) {
    int *e = malloc(sizeof *e); /* dynamically allocated: heap */
    *e = 5;
    return e;
}

int main(void) {
    a = 5;
    int c = 5;                /* local variable: stack */
    int *d = (int *)0x405554; /* absolute address; assumes that address is mapped */
    *d = 5;
    int *e = getMeAPointer();
    printf("Address of a located in .bss is %p\n", (void *)&a);
    printf("Address of b located in .data is %p\n", (void *)&b);
    printf("Address of c located in stack is %p\n", (void *)&c);
    printf("Address of d located in stack is %p\n", (void *)&d);
    printf("Address of *d located absolutely is %p\n", (void *)d);
    printf("Address of e located in stack is %p\n", (void *)&e);
    printf("Address of *e located on heap is %p\n", (void *)e);
    printf("Address of getMeAPointer() located in .text is %p\n", (void *)getMeAPointer);
    free(e);
    return 0;
}
Example printouts:
Address of a located in .bss is 0x405068
Address of b located in .data is 0x402000
Address of c located in stack is 0x22ff1c
Address of d located in stack is 0x22ff18
Address of *d located absolutely is 0x405554
Address of e located in stack is 0x22ff14
Address of *e located on heap is 0x541738
Address of getMeAPointer() located in .text is 0x4013b0
Address of a located in .bss is 0x405068
Address of b located in .data is 0x402000
Address of c located in stack is 0x22ff1c
Address of d located in stack is 0x22ff18
Address of *d located absolutely is 0x405554
Address of e located in stack is 0x22ff14
Address of *e located on heap is 0x3a1738
Address of getMeAPointer() located in .text is 0x4013b0
Address of a located in .bss is 0x405068
Address of b located in .data is 0x402000
Address of c located in stack is 0x22ff1c
Address of d located in stack is 0x22ff18
Address of *d located absolutely is 0x405554
Address of e located in stack is 0x22ff14
Address of *e located on heap is 0x351738
Address of getMeAPointer() located in .text is 0x4013b0
....etc....
Now these are my concerns:
Why is the heap moving around and none of the other segments? This is on Windows 7 with MinGW, and the file was compiled with GCC without further flags (I don't believe this is an example of address space layout randomization).
Who decides where the heap should be? I believe the linker reserves a place for the heap (I've seen heap symbols in symbol tables), but when is the exact address decided on? Is it a runtime thing done by the executable itself (C code) after loading, or is it done by the linker / loader / dynamic linker while loading the program, right before execution?
Is there any way to set the heap address in ld? I have understood that I can set all segments except stack (since that's built into the kernel of the OS), but can I set the heap address?
The way I understand it, the heap is not really an assembly language construct and we don't have access to a heap if we choose to just do assembly programming. Therefore it is a C construct, but I'm interested in how that affects the life of the heap (I mean, we speak of the heap as if it's on the same level as the segments and the stack, but if it isn't, that should give it a whole lot of other conditions)... Is this correct, and can anyone tell me something more about it?
I've Googled all day long, to be honest, and I'm hungry for some answers!
Why is the heap moving around and none of the other segments?
Because dynamic memory allocation is, well, dynamic. The address you get back from malloc() depends on where a sufficiently big chunk of free memory can be found at the very moment your program is being executed. Obviously, since there are other programs too, this changes over time.
Who decides where the heap should be?
The developers of your operating system.
I believe the linker reserves a place for the heap
Rarely. On most implementations I've seen, it's entirely a runtime thing. (It's not that it's impossible for the linker to have something to do with it, but still.)
Is there any way to set the heap address in ld?
If there is, it's surely documented. (Assuming that the ld you are referring to is the linker in your toolchain.)
[...] I can set all segments except stack (since that's built into the kernel of the OS)
I don't think I fully understand what you are saying, but "the stack" is not "built in to the kernel". Generally, the stack address is either statically hard-coded into the executable or it is referred to using relative instructions.
the heap is not really an assembly language construct and we don't have access to a heap if we choose to just do assembly programming.
But yes you do. I'm not familiar with Windows, but on most Unixes, you can use the brk() and/or sbrk() syscalls.
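For example, a minimal sketch using the glibc sbrk wrapper (Unix-only; not portable C, and a real allocator does far more bookkeeping than this):

#define _GNU_SOURCE
#include <unistd.h>

int main(void) {
    /* Grab 4096 bytes of "heap" directly from the kernel, no malloc involved. */
    char *p = sbrk(4096);
    if (p == (void *)-1)
        return 1;
    p[0] = 'x';        /* the memory is ours to use */
    sbrk(-4096);       /* shrink the break back down */
    return 0;
}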
Therefore it is a C construct
Your logic is flawed. Just because something is not an assembly thing, it does not automatically mean that it's a C thing. In fact, there's no such thing as "the heap" or "the stack" in C. In C, there is only automatic, static and dynamic storage duration, which are not tied to specific manners of implementation.
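To make those terms concrete, here is a small sketch showing all three storage durations (the standard calls the third one "allocated"):

#include <stdlib.h>

int file_scope;                     /* static storage duration */

void f(void) {
    int local = 0;                  /* automatic storage duration */
    int *dyn = malloc(sizeof *dyn); /* dynamic (allocated) storage duration */
    free(dyn);
    (void)local;
}

int main(void) { f(); return 0; }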
A variety of factors will influence where the heap resides, e.g.:
Base address of the application
Libraries required by the application
What the operating system allocates to the application
The C heap is (simplified) just a huge block of memory which is maintained by the runtime of the application, so the effective address you get when calling malloc() is defined by that runtime. The runtime is different between compiler versions and different vendors. The entire memory block that the runtime uses as a heap is obtained from the operating system at application startup. Here, the operating system may return a different address each time the application is run. Therefore, heap addresses are not predictable. If your application starts allocating and deallocating memory during its run, it will even get "more" random, because now the runtime needs to find free blocks between currently allocated blocks and so on. So unless the sequence of allocations/deallocations is exactly the same between runs, you will get entirely different addresses, even if the base heap address is the same.
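A small sketch of that effect; run it a few times, or change the allocation order, and the printed addresses will generally differ:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    void *a = malloc(16);
    void *b = malloc(16);
    printf("a=%p b=%p\n", a, b);
    free(a);
    void *c = malloc(16);   /* may or may not reuse a's old block */
    printf("c=%p\n", c);
    free(b);
    free(c);
    return 0;
}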
Related
I was reading Operating Systems: Three Easy Pieces. To learn what the virtual address space of a program looks like, I ran the following code.
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    printf("location of code : %p\n", (void *) main);
    printf("location of heap : %p\n", (void *) malloc(1));
    int x = 3;
    printf("location of stack : %p\n", (void *) &x);
    return x;
}
Its output is:
location of code : 0x564eac1266fa
location of heap : 0x564ead8e5670
location of stack : 0x7fffd0e77e54
Why is the code segment's location 0x564eac1266fa? What is such a large (virtual) space before it used for? Why doesn't it start from or near 0x0?
And why is the program's virtual address space so large? (Judging from the stack location, it's 48 bits wide.) What's the point of that?
The possible virtual address space organizations are defined by the hardware you are using, specifically the MMU it supports. The OS may then use any organization that the hardware can be coerced into using, but generally it just uses it directly (possibly with some subsetting), as that is most efficient.
The x86_64 architecture defines a 48-bit virtual address space [1], and most OSes reserve half of that for system use, so user programs see a 47-bit address space. Within that address space, most OSes will randomize the addresses used for any given program, so as to make exploiting bugs in the programs harder.
[1] Strictly speaking, the architecture defines a 64-bit virtual address space, but then reserves all addresses that do not have the top 17 bits all 0 or all 1.
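As a quick illustration of that rule, a pointer can be checked for "canonical" form like this (is_canonical is a made-up helper, not a standard API):

#include <stdbool.h>
#include <stdint.h>

/* True if the top 17 bits of the address are all 0 or all 1. */
bool is_canonical(const void *p) {
    uint64_t u = (uint64_t)(uintptr_t)p;
    uint64_t top17 = u >> 47;
    return top17 == 0 || top17 == 0x1FFFF;
}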
You are barking up the wrong tree with what you are trying to do here. A process has multiple stacks, may have multiple heaps, and main might not be the start of the code. Viewing an address space as a code segment, stack segment, heap segment, ... as horrible operating systems books do is only going to get you confused.
Because of logical addressing, the memory mapped into the address space does not have to be contiguous.
Why is the code segment's location 0x564eac1266fa? What is such a large (virtual) space before it used for? Why doesn't it start from or near 0x0?
The start of code in your process could well be at 0x564eac1266f8. The fact that you have a high address does not mean the lower addresses have been mapped into the process address space.
And why is the program's virtual address space so large? (Judging from the stack location, it's 48 bits wide.) What's the point of that?
Stacks generally start high and grow low.
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    int a, b;
    float e;
    char f;
    /* Print the addresses as decimal numbers, as in the original output.
       (uintptr_t makes the pointer-to-integer conversion well-defined.) */
    printf("int &a = %u\n", (unsigned)(uintptr_t)&a);
    printf("int &b = %u\n", (unsigned)(uintptr_t)&b);
    printf("float &e = %u\n", (unsigned)(uintptr_t)&e);
    printf("char &f = %u\n", (unsigned)(uintptr_t)&f);
    return 0;
}
The output is:
int &a = 2293324
int &b = 2293320
float &e = 2293316
char &f = 2293315
But when I use this code and remove the printf for the float:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    int a, b;
    float e;
    char f;
    printf("int &a = %u\n", (unsigned)(uintptr_t)&a);
    printf("int &b = %u\n", (unsigned)(uintptr_t)&b);
    printf("char &f = %u\n", (unsigned)(uintptr_t)&f);
    return 0;
}
Then the output is:
int &a = 2293324
int &b = 2293320
char &f = 2293319
Here no address seems to be allocated for the float, even though it is declared at the top.
My questions are:
Is memory not allocated to variables that are not used in the program?
Why are addresses allocated in decreasing order, e.g. going from 2293324 to 2293320?
1) Is memory not allocated to variables that are not used in the program?
Yes, that can happen; the compiler is allowed to optimize them out.
2) Why are addresses allocated in decreasing order, e.g. going from 2293324 to 2293320?
That is usual for most implementations of local storage: they use the CPU-supported stack pointer, and the stack grows from the stack top down towards the stack bottom. All those local variables will most probably be allocated on the stack.
1) Is memory not allocated to variables that are not used in the program?
It's an allowed optimization; if an unused variable doesn't affect the program's observable behavior, a compiler may just discard it completely. Note that most modern compilers will warn you about unused variables (so you can either remove them from the code or do something with them).
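For example, with GCC (the exact diagnostic text varies between compiler versions):

$ gcc -Wall main.c
main.c: warning: unused variable 'e' [-Wunused-variable]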
2) Why are addresses allocated in decreasing order, e.g. going from 2293324 to 2293320?
The compiler is not required to allocate storage for separate objects in any particular order, so don't assume that your variables will be allocated in the order they were declared. Also, remember that on x86 and some other systems, the stack grows "downwards" towards decreasing addresses. Remember the top of any stack is simply the location where something was most recently pushed - it has nothing to do with relative address values.
While not specifically required by the standard, local variables are universally located on the program stack.
When you enter a function, one of the first thing done is to decrement the stack pointer to provide space for the local variables.
SUBL #SOMETHING, SP
Where SOMETHING is the amount of space required and SP is the stack pointer register. In your first example, SOMETHING is probably 13. Then the addresses are:
f is 0(SP)
e is 1(SP)
b is 5(SP)
a is 9(SP)
I am assuming your compiler did not align the stack pointer. Often they do, giving something more like:
f is 3(SP)
e is 4(SP)
b is 8(SP)
a is 12(SP)
And SOMETHING would be rounded up to 16 on a 32-bit system.
You might want to generate an assembly listing using your compiler to see what is going on underneath.
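With GCC, for example, something along these lines produces an annotated assembly listing (flag names per GCC; adjust for your toolchain):

$ gcc -S -fverbose-asm main.c -o main.s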
Is memory not allocated to variables that are not used in the program?
Note that for a local variable, memory is not really allocated. A variable is temporarily bound to a location on the program stack (a stack is not required by the standard, but that is how it is done in most cases). That is why the variable's initial value is indeterminate: the location could have been bound to something else previously.
The compiler does not need to reserve space for variables that are not used; they can be optimized away. Usually there are compiler settings that instruct it not to do this, for debugging.
Why are addresses allocated in decreasing order, e.g. going from 2293324 to 2293320?
Program stacks generally grow downward. Back in ye olde days, the program would be at the bottom of the address space, the heap above that, and the stack at the opposite end.
The heap would grow towards higher addresses. The stack would grow towards the heap (lower addresses).
While the address spaces can be more complicated than that these days, the downward growth of stacks has stayed.
There is no particular requirement that the compiler map the variables to the stack in descending order but there's a 50/50 chance it will do it that way.
I ran this program multiple times on the same machine.
#include <stdio.h>

int a = 0;

int main(void) {
    int b = 0;
    printf("%p %p\n", (void *)&a, (void *)&b); /* %p expects void * */
}
Every time it prints the same address for the variable a, but the address of b changes. I know a will go into the .data section, so its address is fixed (correct me if I am wrong), but why does the stack get a different address every time?
All these addresses will be virtual. Is it possible to get the physical address from these variables?
If a global variable is initialized to zero, where will it go, BSS or data?
@JonathanLeffler already mentioned that the stack address changes due to address space layout randomization (ASLR).
To make the addresses of globals change as well compile your executable with -fpie and link with -fpie -pie.
See Position Independent Executables (PIE) for more details.
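A sketch of such a build, following the flags above (adjust for your toolchain):

$ gcc -fpie -pie main.c -o main

Running the resulting executable several times should then print a different address for a on each run, provided ASLR is enabled on your system.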
Not possible from user space. Even if you had the physical address, it would not be of much use in user space, because it may change when pages are swapped out and back in.
Normally zero-initialized data goes into BSS.
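One way to check this yourself, assuming GNU binutils and a compiler that puts zero-initialized globals in .bss (modern GCC does; -fno-common is the default since GCC 10):

$ gcc -c main.c -o main.o
$ nm main.o

The symbol a should be tagged B (.bss); older compilers may show it as C (a "common" symbol) instead.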
I am trying to figure out how the addresses are assigned to variables located on the stack. I ran the little program below:
#include <stdio.h>

int main(void)
{
    long a;
    int b;
    int c;
    printf("&a = %p\n", (void *)&a);
    printf("&b = %p\n", (void *)&b);
    printf("&c = %p\n", (void *)&c);
}
The output I expected was (considering addresses are going down):
&a = 0x7fff6e1acb88
&b = 0x7fff6e1acb80
&c = 0x7fff6e1acb7c
But instead I got:
&a = 0x7fff6e1acb88
&b = 0x7fff6e1acb80
&c = 0x7fff6e1acb84
How come the c variable is located between the a and b variable? Are variables not put on the stack as they are declared?
I tried replacing the type of a from long to int and I got this:
&a = 0x7fff48466a74
&b = 0x7fff48466a78
&c = 0x7fff48466a7c
Here I don't understand why the addresses are going up when they were going down previously.
I compiled the program using gcc version 4.7.2 (Ubuntu/Linaro 4.7.2-11precise2), if that's of any help.
Are variables not put on the stack as they are declared?
No.
why are the addresses going up, while they were going down previously?
They could be going up, but they don't have to.
The compiler is free to rearrange the order of local variables as it sees fit; it can even delete some or add others.
Variables are not necessarily put on the stack in the order in which they are declared. You can't predict where on the stack they will be -- they could be in any order. As glglgl pointed out in the comments, they don't even have to be on the stack and could simply be held in registers.
Even though the stack pointer is counting downwards on your given CPU, the program will be using stack frames. How a certain parameter is allocated inside the stack frame is implementation-defined.
Also note that some CPUs have up-counting stack pointers.
Also note that local variables are not necessarily allocated on the stack. More often, they are allocated in CPU registers. But when you take the address of a variable, you kind of force the compiler to allocate it on the stack, since registers have no addresses.
The direction of stack growth depends on the architecture. On x86, stack grows down (from higher address to lower). How the variables are put on the stack also depends on the OS's Application binary interface (ABI) and how the compiler follows the ABI convention. But, the compiler may not necessarily follow the ABI conventions all the time.
It depends on the compiler.
For example, I tested your code with GNU GCC version 4.8.1, and the compiler allocated all the local variables in order:
&a = 0x7fffe265d118
&b = 0x7fffe265d114
&c = 0x7fffe265d110
Agreeing with @Lundin that you cannot print the address of the registers these variables are most likely sitting in. The best thing you can do to see how they are stored is to dump the object code and investigate what happens when you create more local variables than there are registers to hold them. That is when you'll see the stack activity (in the disassembly):
$ gcc -c main.c -o main.o
$ objdump -D main.o > main.dump
According to the Linux programmer's manual:
brk() and sbrk() change the location of the program break, which
defines the end of the process's data segment.
What does the data segment mean here? Is it just the data segment, or data, BSS, and heap combined?
According to the Wikipedia article Data segment:
Sometimes the data, BSS, and heap areas are collectively referred to as the "data segment".
I see no reason for changing the size of just the data segment. If it is data, BSS, and heap collectively, then it makes sense, as the heap will get more space.
Which brings me to my second question. In all the articles I have read so far, the authors say that the heap grows upward and the stack grows downward. But what they do not explain is what happens when the heap uses up all the space between the heap and the stack?
In the diagram you posted, the "break"—the address manipulated by brk and sbrk—is the dotted line at the top of the heap.
The documentation you've read describes this as the end of the "data segment" because in traditional (pre-shared-libraries, pre-mmap) Unix the data segment was contiguous with the heap; before program start, the kernel would load the "text" and "data" blocks into RAM starting at address zero (actually a little above address zero, so that the NULL pointer genuinely didn't point to anything) and set the break address to the end of the data segment. The first call to malloc would then use sbrk to move the break up and create the heap in between the top of the data segment and the new, higher break address, as shown in the diagram, and subsequent use of malloc would use it to make the heap bigger as necessary.
Meantime, the stack starts at the top of memory and grows down. The stack doesn't need explicit system calls to make it bigger; either it starts off with as much RAM allocated to it as it can ever have (this was the traditional approach) or there is a region of reserved addresses below the stack, to which the kernel automatically allocates RAM when it notices an attempt to write there (this is the modern approach). Either way, there may or may not be a "guard" region at the bottom of the address space that can be used for stack. If this region exists (all modern systems do this) it is permanently unmapped; if either the stack or the heap tries to grow into it, you get a segmentation fault. Traditionally, though, the kernel made no attempt to enforce a boundary; the stack could grow into the heap, or the heap could grow into the stack, and either way they would scribble over each other's data and the program would crash. If you were very lucky it would crash immediately.
I'm not sure where the number 512GB in this diagram comes from. It implies a 64-bit virtual address space, which is inconsistent with the very simple memory map you have there. A real 64-bit address space looks more like this:
[ASCII diagram of a full 64-bit address space omitted. Legend: t = text, d = data, b = BSS]
This is not remotely to scale, and it shouldn't be interpreted as exactly how any given OS does stuff (after I drew it I discovered that Linux actually puts the executable much closer to address zero than I thought it did, and the shared libraries at surprisingly high addresses). The black regions of this diagram are unmapped -- any access causes an immediate segfault -- and they are gigantic relative to the gray areas. The light-gray regions are the program and its shared libraries (there can be dozens of shared libraries); each has an independent text and data segment (and "bss" segment, which also contains global data but is initialized to all-bits-zero rather than taking up space in the executable or library on disk). The heap is no longer necessarily contiguous with the executable's data segment -- I drew it that way, but it looks like Linux, at least, doesn't do that. The stack is no longer pegged to the top of the virtual address space, and the distance between the heap and the stack is so enormous that you don't have to worry about crossing it.
The break is still the upper limit of the heap. However, what I didn't show is that there could be dozens of independent allocations of memory off there in the black somewhere, made with mmap instead of brk. (The OS will try to keep these far away from the brk area so they don't collide.)
Minimal runnable example
What does the brk() system call do?
It asks the kernel to let you read and write to a contiguous chunk of memory called the heap.
If you don't ask, it might segfault you.
Without brk:
#define _GNU_SOURCE
#include <unistd.h>

int main(void) {
    /* Get the first address beyond the end of the heap. */
    void *b = sbrk(0);
    int *p = (int *)b;
    /* May segfault because it is outside of the heap. */
    *p = 1;
    return 0;
}
With brk:
#define _GNU_SOURCE
#include <assert.h>
#include <unistd.h>

int main(void) {
    void *b = sbrk(0);
    int *p = (int *)b;
    /* Move the break 2 ints forward. */
    brk(p + 2);
    /* Use the ints. */
    *p = 1;
    *(p + 1) = 2;
    assert(*p == 1);
    assert(*(p + 1) == 2);
    /* Deallocate back. */
    brk(b);
    return 0;
}
The above might not hit a new page and not segfault even without the brk, so here is a more aggressive version that allocates 16MiB and is very likely to segfault without the brk:
#define _GNU_SOURCE
#include <assert.h>
#include <unistd.h>

int main(void) {
    void *b;
    char *p, *end;
    b = sbrk(0);
    p = (char *)b;
    end = p + 0x1000000;   /* 16 MiB */
    brk(end);
    while (p < end) {
        *(p++) = 1;
    }
    brk(b);
    return 0;
}
Tested on Ubuntu 18.04.
Virtual address space visualization
Before brk:
+------+ <-- Heap Start == Heap End
After brk(p + 2):
+------+ <-- Heap Start + 2 * sizeof(int) == Heap End
| |
| You can now write your ints
| in this memory area.
| |
+------+ <-- Heap Start
After brk(b):
+------+ <-- Heap Start == Heap End
To better understand address spaces, you should make yourself familiar with paging: How does x86 paging work?.
Why do we need both brk and sbrk?
brk could of course be implemented with sbrk + offset calculations; both exist just for convenience.
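A sketch of that direction (my_brk is a made-up name; sbrk(delta) moves the break by delta and returns the previous break):

#define _GNU_SOURCE
#include <unistd.h>

int my_brk(void *new_break) {
    char *cur = sbrk(0);                             /* current break */
    if (sbrk((char *)new_break - cur) == (void *)-1)
        return -1;                                   /* sbrk failed, errno is set */
    return 0;
}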
In the backend, the Linux kernel v5.0 has a single system call brk that is used to implement both: https://github.com/torvalds/linux/blob/v5.0/arch/x86/entry/syscalls/syscall_64.tbl#L23
12 common brk __x64_sys_brk
Is brk POSIX?
brk used to be POSIX, but it was removed in POSIX 2001, thus the need for _GNU_SOURCE to access the glibc wrapper.
The removal is likely due to the introduction of mmap, which is a superset that allows multiple ranges to be allocated, with more allocation options.
I think there is no valid case where you should use brk instead of malloc or mmap nowadays.
brk vs malloc
brk is one old way of implementing malloc.
mmap is the newer, strictly more powerful mechanism, which likely all POSIX systems currently use to implement malloc. Here is a minimal runnable mmap memory allocation example:
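(A sketch along those lines, assuming Linux; MAP_ANONYMOUS is a widely supported extension:)

#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>

int main(void) {
    size_t n = 4096;
    /* Ask the kernel for an anonymous private mapping: the primitive a
       modern malloc uses for its heap areas and large allocations. */
    void *p = mmap(NULL, n, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return 1;
    ((char *)p)[0] = 42;   /* the memory is usable immediately */
    printf("mmap returned %p\n", p);
    munmap(p, n);          /* give it back */
    return 0;
}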
Can I mix brk and malloc?
If your malloc is implemented with brk, I have no idea how mixing the two could possibly not blow things up, since brk only manages a single range of memory.
I could not, however, find anything about it in the glibc docs, e.g.:
https://www.gnu.org/software/libc/manual/html_mono/libc.html#Resizing-the-Data-Segment
Things will likely just work there, I suppose, since mmap is likely used for malloc.
See also:
What's unsafe/legacy about brk/sbrk?
Why does calling sbrk(0) twice give a different value?
More info
Internally, the kernel decides if the process can have that much memory, and earmarks memory pages for that usage.
This explains how the stack compares to the heap: What is the function of the push / pop instructions used on registers in x86 assembly?
You can use brk and sbrk yourself to avoid the "malloc overhead" everyone's always complaining about. But you can't easily use this method in conjunction with malloc, so it's only appropriate when you don't have to free anything, because you can't. Also, you should avoid any library calls which may use malloc internally. E.g. strlen is probably safe, but fopen probably isn't.
Call sbrk just like you would call malloc. It returns a pointer to the current break and increments the break by that amount.
void *myallocate(int n) {
    return sbrk(n);
}
While you can't free individual allocations (because there's no malloc-overhead, remember), you can free the entire space by calling brk with the value returned by the first call to sbrk, thus rewinding the brk.
void *memorypool;

void initmemorypool(void) {
    memorypool = sbrk(0);
}

void resetmemorypool(void) {
    brk(memorypool);
}
You could even stack these regions, discarding the most recent region by rewinding the break to the region's start.
One more thing ...
sbrk is also useful in code golf because it's 2 characters shorter than malloc.
There is a special designated anonymous private memory mapping (traditionally located just beyond the data/bss, but modern Linux will actually adjust the location with ASLR). In principle it's no better than any other mapping you could create with mmap, but Linux has some optimizations that make it possible to expand the end of this mapping (using the brk syscall) upwards with reduced locking cost relative to what mmap or mremap would incur. This makes it attractive for malloc implementations to use when implementing the main heap.
malloc uses the brk system call to allocate memory.

#include <stdlib.h>

int main(void) {
    char *a = malloc(10);
    (void)a;
    return 0;
}

Run this simple program with strace; you will see it call the brk system call.
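Assuming a Linux system with strace installed, for example:

$ gcc main.c -o main
$ strace ./main 2>&1 | grep brk

You should see one or more brk(...) lines in the trace (though a given allocator may serve this particular request from memory obtained via mmap instead).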
I can answer your second question: malloc will fail and return a null pointer. That's why you always check for a null pointer when dynamically allocating memory.
The heap is placed last in the program's data segment. brk() is used to change (expand) the size of the heap. When the heap cannot grow any more, any malloc call will fail.
The data segment is the portion of memory that holds all your static data, read in from the executable at launch and usually zero-filled.
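As mentioned above, always check malloc's return value; a minimal sketch of that check:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int *p = malloc(1000000 * sizeof *p);
    if (p == NULL) {       /* allocation failed */
        fprintf(stderr, "out of memory\n");
        return 1;
    }
    p[0] = 42;             /* safe to use only after the check */
    free(p);
    return 0;
}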