Linux heap allocation

Linux heap allocation - c

In FreeRTOS, the heap is simply a global array with a size (lets call is heapSize) defined in a H file which the user can change. This array is a non-initialized global array which makes it as part of the BSS section of the image, as so it is filled with zeros upon loading, then, every allocation of memory is taken from this array and every address of allocated memory is a an offset of this array.
So, for a maximal utilization of the memory size, we can approximate the size of the Data, Text and BSS areas of our entire program, and define the heap size to something like heapSize = RAM_size - Text_size - Data_size - BSS_size.
I would like to know what is the equivalent implementation is Linux OS. Can Linux scan a given RAM and decide its size in run time? does linux have an equivalent data structure to manage the heap? if so, how does it allocates the memory for this data structure in the first place?

I would like to know what is the equivalent implementation is Linux OS.
Read "Chapter 8: Allocating Memory" in Linux Device Drivers, Third Edition.

Heaps in Linux are dynamic, so it grows whenever you request more memory. This can extend beyond physical memory size by using swap files, where some unused portions of the RAM is written to disk.
So I think you need to think more in terms of "how much memory does my application need" rather than "how much memory is available".

Related

data segment containing heap data or static variables

I was reading (Operating System - Tannenbaum, page 190) about system memory and I found a paragraph that said:
the data segment being used as a heap for the variables that are dynamically allocated and released and a stack segment for the normal local variables and return addresses.
Where as Data Segment says that it is used for initialized static variables.
Which one of them is correct? Or is there something wrong with my understanding?

From your link itself:
Historically, to be able to support memory address spaces larger than the native size of the internal address register would allow, early CPUs implemented a system of segmentation whereby they would store a small set of indexes to use as offsets to certain areas. The Intel 8086 family of CPUs provided four segments: the code segment, the data segment, the stack segment and the extra segment.
Now, Operating Systems: Design and Implementation was written in 1987, When the data segment was used for both stack, heap, initialized data, and uninitialized data.
Since then, there were a few important changes:
There's no a lot more memory and many more CPU bits, and segmentation is no longer needed by the hardware.
Segments because more than a mere hardware artifact - they became a memory management design pattern.
The BSS segment was introduced.
Features like mmap() and POSIX shared memory IPC mean that the heap is not a single contiguous segment.
Multi-threading means multiple stacks, in the same memory space, sharing a single heap.
So when the book was written, the "data segment" was a concept defined by the hardware, and it was defined to contain everything that wasn't local: initialized data, uninitialized data, dynamically allocated data etc.
But these days the OS' memory manager defines "data segement" as "memory area containing the program's initialized data".
On CPUs that use a data segment pointer, it points to the beginning of what the OS' memory manager declares to be "the data segment".
But the memory manager has more segments, for BSS and heap, which aren't represented by a CPU pointer, so the memory manager just places them immediately after the data segment.
Stacks are a different story these days. When you create a new thread, it gets a new stack, often limited in size (e.g. 8 MB on some versions of Linux). Most likely, the stack for a thread will be allocated from the same area as the heap, meaning at a higher address than the data segment, and since the stack grows to lower addresses, all stacks will still grow towards the data segment.

Fortran: insufficient virtual memory

I - not a professional software engineer - am currently extending a quite large scientific software.
At runtime I get an error stating "insufficient virtual memory".
At this point during runtime, the used working memory is about 550mb and the error accurs when a rather big threedimensional array is dynamically allocated. The array - if it would be allocated - would be about a size of 170mb. Adding this to the already used 550mb the program would still be way below the 2gb boundary that is set for 32bit applications. Also there is more than enough working memory available on the system.
Visual Studio is currently set that it allocates arrays on the stack. Allocating them on the heap does not make any difference anyway.
Splitting the array into smaller arrays (being the size of the one big array in sum) results in the program running just fine. So I guess that the dynamically allocated memory has to be available in one adjacent block.
So there I am and I have no clue how to solve this. I can not deallocate some of the already used 550mb as the data is still required. I also can not change very much of the configuration (e.g. the compiler).
Is there a solution for my problem?
Thank you some much in advance and best regards
phroth248

The virtual memory is the memory your program can address. It is usually the sum of the physical memory and the swap space. For example, if you have 16GB of physical memory and 4GB of swap space, the virtual memory will be 20GB. If your Fortran program tries to allocate more than those 20 addressable GB, you will get an "insufficient virtual memory" error.
To get an idea of the required memory of your 3D array:
allocate (A(nx,ny,nz))
You have nx*ny*nz elements and each element takes 8 bytes in double precision or 4 bytes in single precision. I let you do the math.

Some things:
1. It is usually preferable to to allocate huge arrays using operating system services rather than language facilities. That will circumvent any underlying library problems.
You may have a problem with 550MB in a 32-bit system. Usually there is some division of the 4GB address space into dedicated regions.
You need to make sure you have enough virtual memory.
a) Make sure your page file space is large enough.
b) Make sure that your system is not configured to limit processes address space sizes to smaller than what you need.
c) Make sure that your accounts settings are not limiting your process address space to smaller than allowed by the system.

What determines how much memory can be allocated?

This is a follow-up to my previous question about why size_t is necessary.
Given that size_t is guaranteed to be big enough to represent the largest size of a block of memory you can allocate (meaning there can still be some integers bigger than size_t), my question is...
What determines how much you can allocate at once?

The architecture of your machine, the operating system (but the two are intertwined) and your compiler/set of libraries determines how much memory you can allocate at once.
malloc doesn't need to be able to use all the memory the OS could give him. The OS doesn't need to make available all the memory present in the machine (and various versions of Windows Server for example have different maximum memory for licensing reasons)
But note that the OS can make available more memory than the one present in the machine, and even more memory than the one permitted by the motherboard (let's say the motherboard has a single memory slot that accepts only 1gb memory stick, Windows could still let a program allocate 2gb of memory). This is done throught the use of Virtual Memory, Paging (you know, the swap file, your old and slow friend :-) Or, for example, through the use of NUMA.

I can think of three constraints, in actual code:
The biggest unsigned int size_t is able to allocate. size_t should be the same type (same size, etc.) the OS' memory allocation mechanism is using.
The biggest block the operating system is able to handle in RAM (how are block's size represented? how this representation affects the maximum block size?).
Memory fragmentation (largest free block) and the total available free RAM.

C Memory Management in Embedded Systems

I have to use c/asm to create a memory management system since malloc/free don't yet exist. I need to have malloc/free!
I was thinking of using the memory stack as the space for the memory, but this would fail because when the stack pointer shrinks, ugly things happen with the allocated space.
1) Where would memory be allocated? If I place it randomly in the middle of the Heap/Stack and the Heap/Stack expands, there will be conflicts with allocated space!
12 What Is the simplest/cleanest solution for memory management? These are the only options I've researched:
A memory stack where malloc grows the stack and free(p) shrinks the stack by shifting [p..stack_pointer] (this would invalidate the shifted memory addresses though...).
A linked list (Memory Pool) with a variable-size chunk of memory. However I don't know where to place this in memory... should the linked list be a "global" variable, or "static"?
Thanks!

This article provides a good review of memory management techniques. The resources section at the bottom has links to several open source malloc implementations.

For embedded systems the memory is partitioned at link time into several sections or pools, i.e.:
ro (code + constants)
rw (heap)
zi (zero initialised memory for static variables)
You could add a 4th section in the linker configuration files that would effectively allocate a space in the memory map for dynamic allocations.
However once you have created the raw storage for dynamic memory then you need to understand how many, how large and how frequent the dynamic allocations will occur. From this you can build a picture of how the memory will fragment over time.
Typically an application that is running OS free will not use dynamic memory as you don't want to have to deal with the consequences of malloc failing. If at all possible the better solution is design to avoid it. If this is not at all possible try and simplify the dynamic behaviour using a few large structures that have the data pre-allocated before anything needs to use it.
For example say that you have an application that processes 10bytes of data whilst receiving the next 10 bytes of data to process, you could implement a simple buffering solution. The driver will always be requesting buffers of the same size and there would be a need for 3 buffers. Adding a little meta data to a structure:
{
int inUse;
char data[10];
}
You could take an array of three of theses structures (remembering to initialise inUse to 0 and flick between [0] and [1], with [2] reserved for the situations when a few too many interrupts occur and the next buffer is required buffer one is freed (the need for the 3rd buffer). The alloc algorithm would on need to check for the first buffer !inUse and return a pointer to data. The free would merely need to change inUse back to 0.
Depending on the amount of available RAM and machine (physical / virtual addressing) that you're using there are lots of possible algorithms, but the more complex the algorithm the longer the allocations could take.

Declare a huge static char buffer and use this memory to write your own malloc & free functions.
Algorithms for writing malloc and free could be as complex (and optimized) or as simple as you want.
One simple way could be following...
based on the type of memory allocation needs in your application try to find the most common buffer sizes
declare structures for each size with a char buffer of that length
and a boolean to represent whether buffer is occupied or not.
Then declare static arrays of above structures( decide array sizes
based on the total memory available in the system)
now malloc would simply go the most suitable array based on the
required size and search for a free buffer (use some search algo here
or simply use linear search) and return. Also mark the boolean in the
associated structure to TRUE.
free would simply search for buffer and mark the boolean to FALSE.
hope this helps.

Use the GNU C library. You can use just malloc() and free(), or any other subset of the library. Borrowing the design and/or implementation and not reinventing the wheel is a good way to be productive.
Unless, of course, this is homework where the point of the exercise is to implement malloc and free....

How can I reserve memory addresses without allocating them

I would like (in *nix) to allocate a large, contigious address space, but without consuming resources straight away, i.e. I want to reserve an address range an allocate from it later.
Suppose I do foo=malloc(3*1024*1024*1024) to allocate 3G, but on a 1G computer with 1G of swap file. It will fail, right?
What I want to do is say "Give me a memory address range foo...foo+3G into which I will be allocating" so I can guarantee all allocations within this area are contiguous, but without actually allocating straight away.
In the example above, I want to follow the foo=reserve_memory(3G) call with a bar=malloc(123) call which should succeedd since reserve_memory hasn't consumed any resources yet, it just guarantees that bar will not be in the range foo...foo+3G.
Later I would do something like allocate_for_real(foo,0,234) to consume bytes 0..234 of foo's range. At this point, the kernel would allocate some virtual pages and map them to foo...foo+123+N
Is this possible in userspace?
(The point of this is that objects in foo... need to be contiguous and cannot reasonably be moved after they are created.)
Thank you.

Short answer: it already works that way.
Slightly longer answer: the bad news is that there is no special way of reserving a range, but not allocating it. However, the good news is that when you allocate a range, Linux does not actually allocate it, it just reserves it for use by you, later.
The default behavior of Linux is to always accept a new allocation, as long as there is address range left. When you actually start using the memory though, there better be some memory or at least swap backing it up. If not, the kernel will kill a process to free memory, usually the process which allocated the most memory.
So the problem in Linux with default settings gets shifted from, "how much can I allocate", into "how much can I allocate and then still be alive when I start using the memory?"
Here is some info on the subject.

I think, a simple way would be to do that with a large static array.
On any modern system this will not be mapped to existing memory (in the executable file on disk or in RAM of your execution machine) unless you will really access it. Once you will access it (and the system has enough resources) it will be miraculously initialized to all zeros.
And your program will seriously slow down once you reach the limit of physical memory and then randomly crash if you run out of swap.

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight