local variable memory and malloced memory location [closed] - c

I was playing around with malloced memory and local variables to see how the stack and heap grow. From my understanding, the heap grows upwards and the stack grows downwards. All memory allocated with malloc lives on the heap, and a function's local variables live on the stack.
In the following program:
#include <stdio.h>
#include <stdlib.h>

#define SIZE 999999999

void tot(void) {
    static int c = 0;
    int k;
    c++;
    int *pp = malloc(sizeof(int) * SIZE);
    if (pp == NULL) {
        fprintf(stderr, "NO memory allocated after call : %d\n", c);
        return;
    }
    printf("%p %d %p\n", (void *)&k, c, (void *)pp);
    tot();
}

int main(int argc, char *argv[]) {
    int k;
    int *pp = malloc(sizeof(int) * 99999);
    if (pp == NULL) {
        fprintf(stderr, "No memory allocated\n");
        return 1;    /* main returns int; a plain 'return;' is invalid here */
    }
    printf("%p %p\n", (void *)&k, (void *)pp);
    tot();
    return 0;
}
In the function tot() I created one local variable k, which takes 4 bytes, and one pointer variable of type int *, so each call should grow the stack by slightly more than 4 bytes. I also allocated memory with malloc and assigned it to the local pointer. Since that malloced memory is supposed to be on the heap, the per-call stack growth should still be just over 4 bytes. But according to my observation in the above program, the stack grows by a large amount on each call, and after doing some maths on that amount it looks as if the stack frame includes the malloced memory created in each call.
Removing the line
int *pp = (int*)malloc(sizeof(int) * SIZE);
which allocates the large block in each call brings the stack frame size back to ~4 bytes, which is what I would expect.
The stack frame grows downwards in both situations, though.
Why am I getting this output? Is my belief that dynamic memory is allocated on the heap wrong? Why is the malloced memory increasing the size of the stack frame?
EDIT :
I also tried accessing memory allocated in one stack frame from another stack frame, by passing the pointer into the next recursive call, to rule out the compiler internally converting the malloc into an alloca (which could have explained the large stack frames). It made no difference; the results are the same.

Many Linux systems have ASLR, so the result of malloc might not be reproducible from one execution to the next; you can disable ASLR through /proc/sys/kernel/randomize_va_space.
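For instance (a sketch, assuming a Linux shell with root access; setarch is part of util-linux):
cat /proc/sys/kernel/randomize_va_space                 # 2 = full ASLR, the usual default
echo 0 | sudo tee /proc/sys/kernel/randomize_va_space   # disable ASLR system-wide
setarch $(uname -m) -R ./a.out                          # or disable it for a single run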
Actually, recent malloc implementations use the mmap(2) syscall more than sbrk or brk, so the heap is actually made of several non-contiguous virtual memory segments. Usually, later calls to free mark the freed region as reusable by malloc (but don't necessarily munmap(2) it).
You can have a look at the address space of some process with pid 1234 by looking into /proc/1234/maps. Read the proc(5) man page (and try cat /proc/self/maps, which shows the address space of the process running that cat).
And your tot function being tail recursive, GCC will typically optimize it into a loop (if you compile with gcc -O1 or more). However, stack frames are usually required to be aligned (e.g. to 16 bytes) and also contain the previous frame pointer, some space to spill registers, etc.
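One way to check for the tail-call optimization (a sketch, assuming gcc on x86-64 and that the source is saved as prog.c) is to inspect the generated assembly:
gcc -O2 -S prog.c -o prog.s
grep -E 'call|jmp' prog.s    # a 'jmp' back into tot instead of 'call tot' means the recursion became a loop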

C is a low-level language, but it still defines things loosely enough that specific details can vary between platforms. For instance, the compiler is free to allocate varying amounts of stack space as it sees fit (typically "extra" space), so long as the results of the running program are not altered.
This may be done in the name of performance [1] or to support certain calling conventions. So even though an int may only need four bytes of space, the compiler may allocate 16 bytes on the stack to preserve a favorable memory alignment.
If the "large" increases in stack size are small compared to SIZE, then the difference is probably just due to padding of this sort. (Note that there is also more data on the stack than just the local variables: the return address of the calling function will usually be stored there, as well as the contents of any caller registers the compiler decided to save.)
If it actually looks like malloc is allocating to the stack (i.e. the stack is increasing in jumps of SIZE), that's surprising, but probably still "correct" on behalf of the compiler. By definition malloc allocates memory; there is no guarantee that this is allocated from the heap, although that is certainly the normal implementation. Since the results of malloc are never accessed outside of the stack frame in which it's invoked, the compiler could in principle optimize this to an alloca call; but, again, I'd be very surprised to see that.
[1] Example: http://software.intel.com/en-us/articles/data-alignment-when-migrating-to-64-bit-intel-architecture
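One quick (non-authoritative) way to see which region an allocation landed in is to compare its address with those of a local and a static variable. This is only a sketch: the layout varies by platform and, with ASLR, from run to run.
#include <stdio.h>
#include <stdlib.h>

static int a_static;

int main(void) {
    int a_local;
    int *p = malloc(sizeof *p);
    /* On a typical Linux layout: &a_local near the (high) stack,
       &a_static in the data/BSS segment, p in the heap between them. */
    printf("local:  %p\nstatic: %p\nmalloc: %p\n",
           (void *)&a_local, (void *)&a_static, (void *)p);
    free(p);
    return 0;
}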

Related

Will malloc round up to the nearest page size?

I'm not sure if I'm asking a noob question here, but here I go. I also searched a lot for a similar question, but I got nothing.
So, I know how mmap and brk work, and that regardless of the length you request, they round it up to the nearest page boundary. I also know malloc uses brk/sbrk or mmap (at least on Linux/Unix systems), but this raises the question: does malloc also round up to the nearest page size? On my system a page is 4096 bytes, so if I want to allocate 16 bytes with malloc, 4096 bytes is... a lot more than I asked for.
The basic job of malloc and friends is to manage the fact that the OS can generally only (efficiently) deal with large allocations (whole pages and extents of pages), while programs often need smaller chunks and finer-grained management.
So what malloc (generally) does, is that the first time it is called, it allocates a larger amount of memory from the system (via mmap or sbrk -- maybe one page or maybe many pages), and uses a small amount of that for some data structures to track the heap use (where the heap is, what parts are in use and what parts are free) and then marks the rest of that space as free. It then allocates the memory you requested from that free space and keeps the rest available for subsequent malloc calls.
So the first time you call malloc for, say, 16 bytes, it will use mmap or sbrk to allocate a large chunk (maybe 4K or maybe 64K or maybe 16MB or even more), initialize it as mostly free, and return you a pointer to 16 bytes somewhere within it. A second call to malloc for another 16 bytes will just return you another 16 bytes from that pool -- no need to go back to the OS for more.
As your program goes ahead mallocing more memory it will just come from this pool, and free calls will return memory to the free pool. If it generally allocates more than it frees, eventually that free pool will run out, and at that point, malloc will call the system (mmap or sbrk) to get more memory to add to the free pool.
This is why if you monitor a process that is allocating and freeing memory with malloc/free with some sort of process monitor, you will generally just see the memory use go up (as the free pool runs out and more memory is requested from the system), and generally will not see it go down -- even though memory is being freed, it generally just goes back to the free pool and is not unmapped or returned to the system. There are some exceptions -- particularly if very large blocks are involved -- but generally you can't rely on any memory being returned to the system until the process exits.
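You can often watch the free pool in action (a sketch; nothing in the standard guarantees address reuse, but typical allocators do recycle recently freed blocks):
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    void *a = malloc(100);
    printf("first:  %p\n", a);
    free(a);                     /* returned to the allocator's free pool, not to the OS */
    void *b = malloc(100);
    printf("second: %p\n", b);   /* commonly the same address, recycled from the pool */
    free(b);
    return 0;
}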
#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>
#include <unistd.h>

int main(void) {
    void *a = malloc(1);
    void *b = malloc(1);
    uintptr_t ua = (uintptr_t)a;
    uintptr_t ub = (uintptr_t)b;
    size_t page_size = getpagesize();
    printf("page size: %zu\n", page_size);
    printf("difference: %zd\n", (ssize_t)(ub - ua));
    printf("offsets from start of page: %zu, %zu\n",
           (size_t)ua % page_size, (size_t)ub % page_size);
}
prints
page size: 4096
difference: 32
offsets from start of page: 672, 704
So clearly it is not rounded to page size in this case, which proves that it is not always rounded to page size.
It will hit mmap if you change the allocation to some arbitrarily large size. For example:
void *a = malloc(10000001);
void *b = malloc(10000003);
and I get:
page size: 4096
difference: -10002432
offsets from start of page: 16, 16
And clearly the starting address is still not page aligned; the bookkeeping must be stored below the returned pointer, and the pointer itself must be sufficiently aligned for the largest alignment generally needed. You can reason this out from free: it is given only a pointer, yet it must figure out the size of the allocation. There are only two feasible places to look: a separate data structure that lists all base pointers with their allocation sizes, or some offset just below the pointer itself. And only one of them is sane.
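On glibc you can ask the allocator how big the block it actually handed out is (a sketch, using the glibc-specific malloc_usable_size; not portable):
#include <stdio.h>
#include <stdlib.h>
#include <malloc.h>   /* glibc-specific header */

int main(void) {
    void *p = malloc(16);
    /* Typically rounded up to the allocator's granularity
       (e.g. 24 or 32 bytes on 64-bit glibc), not to a page. */
    printf("requested 16, usable %zu\n", malloc_usable_size(p));
    free(p);
    return 0;
}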

Where do non-malloc'd variables go? [duplicate]

Today I was helping a friend of mine with some C code, and I found some strange behavior that I couldn't explain to him. We had a TSV file with a list of integers, one int per line. The first line was the number of lines the list had.
We also had a C file with a very simple "readfile". The first line was read into n, the number of lines; then there was an initialization of:
int list[n]
and finally a for loop of n with a fscanf.
For small n's (up to ~100,000), everything was fine. However, we found that when n was big (10^6), a segfault would occur.
Finally, we changed the list initialization to
int *list = malloc(n*sizeof(int))
and everything went well, even with very large n.
Can someone explain why this occurred? What was causing the segfault with int list[n], and why did it stop when we started using list = malloc(n*sizeof(int))?
There are several different pieces at play here.
The first is the difference between declaring an array as
int array[n];
and
int* array = malloc(n * sizeof(int));
In the first version, you are declaring an object with automatic storage duration. This means that the array lives only as long as the function call in which it is declared. In the second version, you are getting memory with dynamic storage duration, which means that it will exist until it is explicitly deallocated with free.
The reason that the second version works here is an implementation detail of how C is usually compiled. Typically, C memory is split into several regions, including the stack (for function calls and local variables) and the heap (for malloced objects). The stack typically has a much smaller size than the heap; usually it's something like 8MB. As a result, if you try to allocate a huge array with
int array[n];
Then you might exceed the stack's storage space, causing the segfault. On the other hand, the heap usually has a huge size (say, as much space as is free on the system), and so mallocing a large object won't cause an out-of-memory error.
In general, be careful with variable-length arrays in C. They can easily exceed stack size. Prefer malloc unless you know the size is small or that you really only want the array for a short period of time.
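Concretely, that means replacing the VLA with the checked malloc/free pattern (a sketch reusing the names from the question):
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    size_t n = 1000000;                  /* ~4 MB as an int VLA: risky on an 8 MB stack */
    int *list = malloc(n * sizeof *list);
    if (list == NULL) {                  /* malloc reports failure; a VLA just segfaults */
        perror("malloc");
        return 1;
    }
    /* ... fill and use list[0] .. list[n-1] ... */
    free(list);
    return 0;
}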
int list[n]
Allocates space for n integers on the stack, which is usually pretty small. Using memory on the stack is much faster than the alternative, but it is quite small and it is easy to overflow the stack (i.e. allocate too much memory) if you do things like allocate huge arrays or do recursion too deeply. You do not have to manually deallocate memory allocated this way, it is done by the compiler when the array goes out of scope.
malloc on the other hand allocates space in the heap, which is usually very large compared to the stack. You will have to allocate a much larger amount of memory on the heap to exhaust it, but it is a lot slower to allocate memory on the heap than it is on the stack, and you must deallocate it manually via free when you are done using it.
int list[n] stores the data in the stack, while malloc stores it in the heap.
The stack is limited, and there is not much space, while the heap is much much bigger.
int list[n] is a VLA, which allocates on the stack instead of on the heap. You don't have to free it (it frees automatically at the end of the function call) and it allocates quickly but the storage space is very limited, as you have discovered. You must allocate larger values on the heap.
This declaration allocates memory on the stack
int list[n]
malloc allocates on the heap.
Stack size is usually smaller than heap, so if you allocate too much memory on the stack you get a stackoverflow.
See also this answer for further information
Assuming you have a typical implementation, it's most likely that:
int list[n]
allocated list on your stack, whereas:
int *list = malloc(n*sizeof(int))
allocated memory on your heap.
In the case of a stack there is typically a limit to how large it can grow (if it can grow at all). In the case of a heap there is still a limit, but it tends to be much larger, (broadly) constrained by your RAM + swap + address space, which is typically at least an order of magnitude bigger, if not more.
If you are on linux, you can set ulimit -s to a larger value and this might work for stack allocation also.
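For example (a sketch, assuming bash; the new limit only applies to processes started from that shell):
ulimit -s           # print the current stack limit in kB, often 8192
ulimit -s 65536     # raise it to 64 MB
./a.out             # a larger VLA may now fit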
When you allocate memory on stack, that memory remains till the end of your function's execution. If you allocate memory on heap(using malloc), you can free the memory any time you want(even before the end of your function's execution).
Generally, heap should be used for large memory allocations.
When you allocate using a malloc, memory is allocated from heap and not from stack, which is much more limited in size.
int array[n];
This is an automatically allocated array: it lives on the stack, and its size must be known when the declaration is reached.
int *array = malloc(sizeof(int) * n);
This is a dynamically allocated array: its size can be decided at run time, and the memory is allocated on the heap.

How do I calculate beforehand how much memory calloc would allocate?

I basically have this piece of code.
char (* text)[1][80];
text = calloc(2821522,80);
The way I calculated it, that calloc should have allocated 215.265045 megabytes of RAM; however, the program in the end exceeded that number and allocated nearly 700 MB of RAM.
So it appears I cannot properly know how much memory that function will allocate.
How does one calculate that properly?
calloc (and malloc for that matter) is free to allocate as much space as it needs to satisfy the request.
So, no, you cannot tell in advance how much it will actually give you, you can only assume that it's given you the amount you asked for.
Having said that, 700 MB seems a little excessive, so I'd investigate whether the calloc was solely responsible by, for example, writing a program that only does the calloc and nothing more.
You might also want to investigate how you're measuring that memory usage.
For example, the following program:
#include <stdio.h>
#include <stdlib.h>
#include <malloc.h>

int main(void) {
    char (*text)[1][80];
    struct mallinfo mi;

    mi = mallinfo(); printf("%d\n", mi.uordblks);
    text = calloc(2821522, 80);
    mi = mallinfo(); printf("%d\n", mi.uordblks);

    return 0;
}
outputs, on my system:
66144
225903256
meaning that the calloc has allocated 225,837,112 bytes, which is only a smidgeon (115,352 bytes, or 0.05%) above the requested 225,721,760.
Well it depends on the underlying implementation of malloc/calloc.
It generally works like this: there's a thing called the heap pointer which points to the top of the heap, the area from which dynamic memory gets allocated. When memory is first allocated, malloc internally requests x amount of memory from the kernel, i.e. the heap pointer is incremented by a certain amount to make that space available. That x may or may not be equal to the size of the block you requested (it might be larger, to account for future mallocs). If it isn't, you're given at least the amount you requested (sometimes more, because of alignment). The rest becomes part of an internal free list maintained by malloc. To sum up, malloc has some underlying data structures, and a lot depends on how they are implemented.
My guess is that the x amount of memory was larger (for whatever reason) than you requested, and hence malloc/calloc was holding on to the rest in its free list. Try allocating some more memory and see if the footprint increases.
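You can watch the break move on the first allocation (a sketch, assuming a glibc-style allocator that serves small requests from the brk heap):
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void) {
    void *before = sbrk(0);   /* current program break */
    void *p = malloc(1000);   /* small request: typically grows the brk heap */
    void *after = sbrk(0);
    /* The break usually jumps by much more than 1000 bytes:
       the allocator grabs a bigger chunk for future requests. */
    printf("break before: %p\nbreak after:  %p\n", before, after);
    free(p);
    return 0;
}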

What does the brk() system call do?

According to the Linux programmer's manual:
brk() and sbrk() change the location of the program break, which
defines the end of the process's data segment.
What does the data segment mean over here? Is it just the data segment or data, BSS, and heap combined?
According to wiki Data segment:
Sometimes the data, BSS, and heap areas are collectively referred to as the "data segment".
I see no reason for changing the size of just the data segment. If it is data, BSS and heap collectively then it makes sense as heap will get more space.
Which brings me to my second question. In all the articles I read so far, the authors say that the heap grows upward and the stack grows downward. But what they do not explain is what happens when the heap grows so far that it uses up all the space between it and the stack?
In the diagram you posted, the "break"—the address manipulated by brk and sbrk—is the dotted line at the top of the heap.
The documentation you've read describes this as the end of the "data segment" because in traditional (pre-shared-libraries, pre-mmap) Unix the data segment was continuous with the heap; before program start, the kernel would load the "text" and "data" blocks into RAM starting at address zero (actually a little above address zero, so that the NULL pointer genuinely didn't point to anything) and set the break address to the end of the data segment. The first call to malloc would then use sbrk to move the break up and create the heap in between the top of the data segment and the new, higher break address, as shown in the diagram, and subsequent use of malloc would use it to make the heap bigger as necessary.
Meantime, the stack starts at the top of memory and grows down. The stack doesn't need explicit system calls to make it bigger; either it starts off with as much RAM allocated to it as it can ever have (this was the traditional approach) or there is a region of reserved addresses below the stack, to which the kernel automatically allocates RAM when it notices an attempt to write there (this is the modern approach). Either way, there may or may not be a "guard" region at the bottom of the address space that can be used for stack. If this region exists (all modern systems do this) it is permanently unmapped; if either the stack or the heap tries to grow into it, you get a segmentation fault. Traditionally, though, the kernel made no attempt to enforce a boundary; the stack could grow into the heap, or the heap could grow into the stack, and either way they would scribble over each other's data and the program would crash. If you were very lucky it would crash immediately.
I'm not sure where the number 512GB in this diagram comes from. It implies a 64-bit virtual address space, which is inconsistent with the very simple memory map you have there. A real 64-bit address space looks more like this:
[ASCII diagram of a 64-bit address space omitted. Legend: t = text, d = data, b = BSS.]
This is not remotely to scale, and it shouldn't be interpreted as exactly how any given OS does stuff (after I drew it I discovered that Linux actually puts the executable much closer to address zero than I thought it did, and the shared libraries at surprisingly high addresses). The black regions of this diagram are unmapped -- any access causes an immediate segfault -- and they are gigantic relative to the gray areas. The light-gray regions are the program and its shared libraries (there can be dozens of shared libraries); each has an independent text and data segment (and "bss" segment, which also contains global data but is initialized to all-bits-zero rather than taking up space in the executable or library on disk). The heap is no longer necessarily contiguous with the executable's data segment -- I drew it that way, but it looks like Linux, at least, doesn't do that. The stack is no longer pegged to the top of the virtual address space, and the distance between the heap and the stack is so enormous that you don't have to worry about crossing it.
The break is still the upper limit of the heap. However, what I didn't show is that there could be dozens of independent allocations of memory off there in the black somewhere, made with mmap instead of brk. (The OS will try to keep these far away from the brk area so they don't collide.)
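You can see that separation directly (a sketch, assuming Linux):
#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>

int main(void) {
    void *brk_top = sbrk(0);                    /* top of the brk heap */
    void *m = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    /* The two addresses are typically gigabytes apart. */
    printf("break: %p\nmmap:  %p\n", brk_top, m);
    munmap(m, 4096);
    return 0;
}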
Minimal runnable example
What does the brk() system call do?
It asks the kernel to let you read and write to a contiguous chunk of memory called the heap.
If you don't ask, it might segfault you.
Without brk:
#define _GNU_SOURCE
#include <unistd.h>

int main(void) {
    /* Get the first address beyond the end of the heap. */
    void *b = sbrk(0);
    int *p = (int *)b;
    /* May segfault because it is outside of the heap. */
    *p = 1;
    return 0;
}
With brk:
#define _GNU_SOURCE
#include <assert.h>
#include <unistd.h>

int main(void) {
    void *b = sbrk(0);
    int *p = (int *)b;
    /* Move the break 2 ints forward. */
    brk(p + 2);
    /* Use the ints. */
    *p = 1;
    *(p + 1) = 2;
    assert(*p == 1);
    assert(*(p + 1) == 2);
    /* Deallocate back. */
    brk(b);
    return 0;
}
GitHub upstream.
The above might not hit a new page and not segfault even without the brk, so here is a more aggressive version that allocates 16MiB and is very likely to segfault without the brk:
#define _GNU_SOURCE
#include <assert.h>
#include <unistd.h>

int main(void) {
    void *b;
    char *p, *end;

    b = sbrk(0);
    p = (char *)b;
    end = p + 0x1000000;
    brk(end);
    while (p < end) {
        *(p++) = 1;
    }
    brk(b);
    return 0;
}
Tested on Ubuntu 18.04.
Virtual address space visualization
Before brk:
+------+ <-- Heap Start == Heap End
After brk(p + 2):
+------+ <-- Heap Start + 2 * sizeof(int) == Heap End
|      |
|  You can now write your ints
|  in this memory area.
|      |
+------+ <-- Heap Start
After brk(b):
+------+ <-- Heap Start == Heap End
To better understand address spaces, you should make yourself familiar with paging: How does x86 paging work?.
Why do we need both brk and sbrk?
brk could of course be implemented with sbrk + offset calculations; both exist just for convenience.
In the backend, the Linux kernel v5.0 has a single system call brk that is used to implement both: https://github.com/torvalds/linux/blob/v5.0/arch/x86/entry/syscalls/syscall_64.tbl#L23
12 common brk __x64_sys_brk
Is brk POSIX?
brk used to be POSIX, but it was removed in POSIX 2001, thus the need for _GNU_SOURCE to access the glibc wrapper.
The removal is likely due to the introduction of mmap, which is a superset that allows multiple ranges to be allocated, with more allocation options.
I think there is no valid case where you should use brk instead of malloc or mmap nowadays.
brk vs malloc
brk is one old possibility of implementing malloc.
mmap is the newer, strictly more powerful mechanism, which likely all POSIX systems currently use to implement malloc. Here is a minimal runnable mmap memory allocation example.
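A sketch of such an allocation (assuming Linux; MAP_ANONYMOUS is not in POSIX but is ubiquitous there):
#define _GNU_SOURCE
#include <assert.h>
#include <sys/mman.h>

int main(void) {
    size_t len = 4096;
    /* Anonymous private mapping: the mechanism modern allocators
       use, at least for large blocks. */
    int *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    assert(p != MAP_FAILED);
    p[0] = 1;
    assert(p[0] == 1);
    /* Unlike brk, each mapping can be released independently. */
    munmap(p, len);
    return 0;
}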
Can I mix brk and malloc?
If your malloc is implemented with brk, I have no idea how that could possibly not blow things up, since brk only manages a single range of memory.
I could not, however, find anything about it in the glibc docs, e.g.:
https://www.gnu.org/software/libc/manual/html_mono/libc.html#Resizing-the-Data-Segment
Things will likely just work there, I suppose, since mmap is likely used for malloc.
See also:
What's unsafe/legacy about brk/sbrk?
Why does calling sbrk(0) twice give a different value?
More info
Internally, the kernel decides if the process can have that much memory, and earmarks memory pages for that usage.
This explains how the stack compares to the heap: What is the function of the push / pop instructions used on registers in x86 assembly?
You can use brk and sbrk yourself to avoid the "malloc overhead" everyone's always complaining about. But you can't easily use this method in conjunction with malloc, so it's only appropriate when you don't have to free anything. Because you can't. Also, you should avoid any library calls which may use malloc internally. For example, strlen is probably safe, but fopen probably isn't.
Call sbrk just like you would call malloc. It returns a pointer to the current break and increments the break by that amount.
void *myallocate(int n) {
    return sbrk(n);
}
While you can't free individual allocations (because there's no malloc-overhead, remember), you can free the entire space by calling brk with the value returned by the first call to sbrk, thus rewinding the brk.
void *memorypool;

void initmemorypool(void) {
    memorypool = sbrk(0);
}

void resetmemorypool(void) {
    brk(memorypool);
}
You could even stack these regions, discarding the most recent region by rewinding the break to the region's start.
One more thing ...
sbrk is also useful in code golf because it's 2 characters shorter than malloc.
There is a special designated anonymous private memory mapping (traditionally located just beyond the data/bss, but modern Linux will actually adjust the location with ASLR). In principle it's no better than any other mapping you could create with mmap, but Linux has some optimizations that make it possible to expand the end of this mapping (using the brk syscall) upwards with reduced locking cost relative to what mmap or mremap would incur. This makes it attractive for malloc implementations to use when implementing the main heap.
malloc uses the brk system call to allocate memory.
#include <stdlib.h>

int main(void) {
    char *a = malloc(10);
    return 0;
}
Run this simple program with strace; you will see it call the brk system call.
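For example (a sketch, assuming strace is installed and the source is saved as prog.c):
cc prog.c -o prog
strace -e trace=brk,mmap ./prog    # shows which syscalls the allocator actually issues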
I can answer your second question: malloc will fail and return a null pointer. That's why you always check for a null pointer when dynamically allocating memory.
The heap is placed last in the program's data segment. brk() is used to change (expand) the size of the heap. When the heap cannot grow any more, any malloc call will fail.
The data segment is the portion of memory that holds all your static data, read in from the executable at launch and usually zero-filled.
