Determine stack bottom, start and end of data segment of a C program

I am trying to understand how memory is laid out for a C program. To do that, I want to determine the boundaries of the stack and the data segment. Is there a library call or system call that does this? I found that the stack bottom can be determined by reading /proc/self/stat, but I could not work out how to do it. Please help. :)

Processes don't have a single "data segment" anymore. They have a bunch of mappings of memory into their address space. Common cases are:
Shared library or executable code or rodata, mapped shared, without write access.
Glibc heap segments, anonymous segments mapped with rw permissions.
Thread stack areas. They look a lot like heap segments, but are usually separated from each other with some unmapped guard pages.
As Nikolai points out, you can look at the list of these with the pmap tool.

Look into /proc/<pid>/maps and /proc/<pid>/smaps (assuming Linux). Also pmap <pid>.

There is no general method for doing this. In fact, some of the secure computing environments randomize the exact address space allocations and order so that code injection attacks are more challenging to engineer.
However, every C runtime library has to arrange the contributions of data and stack segments so the program works correctly. Reading the runtime startup code is the most direct way of finding the answer.
Which C compiler are you interested in?


What is the main origin of heap and stack memory division?

I have read many explanations of heap and stack memory, but all of them are vague about where the division comes from. I understand how these kinds of memory are used by software, but I don't understand the origin of the division itself. I assume both are the same unspecialized physical memory, but...
For example, say we have a PC without any OS and we want to write a bootable program in x86 assembly language. I assume this is possible (I don't know assembly myself, but people do write operating systems this way). So the main question is: can we already work with a heap and a stack, or must we create some memory management machinery for them first? If so, how is that done in bare-metal programming?
Adding something to the other answer, fairly correct but perhaps not very complete.
Heap and stack are two (software) ways to "manage" memory. The physical memory, normally, is a flat array of cells where a program can read and write. It is up to the running program to use those cells as it wants. But there is more to say.
First point. The heap is purely a software construct, while the stack is also (or mainly) a hardware matter. Most processors have hardware (or CPU instructions) to support the stack, while most (or all?) don't care about the heap. Moreover, there are small embedded processors (or microcontrollers) which have a separate stack area, totally distinct from the other RAM areas where the program could create a "heap".
Second point. When speaking about "programs", one can/should think of the operating system (the OS) as a program, specialized in managing resources (memory included), and extensible with "applications" (which are programs). In such a scenario, stack and heap are managed cooperatively by the OS and the applications.
So, to reply to your main question, the 90% correct answer is: on bare metal you already have a stack; perhaps you have to issue a short instruction to set it up (typically loading the stack pointer), but it is straightforward. But you don't have a heap; you must implement it in your program. First you set aside some memory to be used as a stack, and then you can set aside some more memory to be used as a heap, not forgetting that you must reserve some memory for normal/static data. The part of the program that manages the heap must know where these regions are, so that it can do its job without accidentally overwriting the stack or the static data.

Is the paging file part of the heap? And am I correct in thinking that the heap doesn't have to be a continuous block of memory?

Just two simple questions I could not find a proper answer to. I decided to learn assembly language, since it's one area where my programming skills are lacking.
Also where functions are called recursively, how does the OS determine how large the stack should be for a thread? or does it just place the stack where there is a large amount of memory it can expand into before colliding with the heap?
Is the paging file part of the heap?
Not really. You can't access it directly; the OS may use it transparently to page out parts of your process's virtual address space.
the heap doesn't have to be a continuous block of memory?
Yes, allocations with mmap(MAP_ANONYMOUS) are considered part of the heap, and can be anywhere in your virtual address space. As Weather Vane points out, "heap" is a pretty broad / archaic term. It's not a useful concept for understanding what really happens under the hood.
e.g. you can mmap a file to make its contents part of your virtual address space. Is that part of the "heap"? I don't know and don't care.
Also where functions are called recursively, how does the OS determine how large the stack should be for a thread?
It just sets a max stack size limit when your process starts. Using recursive functions has nothing to do with it. (Although it's the most common way to use up the fixed size limit, the other being large local arrays).

Does a compiler have to consider the kernel memory space when laying out memory?

I'm trying to reconcile a few concepts.
I know that virtual memory is shared (mapped) between the kernel and all user processes, which I read here. I also know that when the compiler generates addresses for code + data, the kernel must load them at the correct virtual addresses for that process.
To constrain the scope of the question, I'll just mean gcc when I mention 'the compiler'.
So does the compiler need to be updated for each new release of an OS, so that it knows not to place code or data at the high memory addresses reserved for the kernel? As in, must whoever writes that piece of the compiler know the details of how the kernel plans to load the program (lest the compiler put executable code in high memory)?
Or am I confusing different concepts? I got a bit confused when going through this tutorial, especially at the very bottom where it has OS code in low memory addresses, because I thought Linux uses high memory for the kernel.
The compiler doesn't determine the address ranges in memory at which things are placed. That's handled by the OS.
When the program is first executed, the loader places the various portions of the program and its libraries in memory. For memory that's allocated dynamically, large chunks are allocated from the OS and then sometimes divided into smaller chunks.
The OS loader knows where to load things, and the OS's virtual memory allocation logic knows how to find safe, empty spaces in the address space the process uses.
I'm not sure what you mean by the "high memory addresses reserved for the kernel". If you're talking about a 2G/2G or 3G/1G split on a 32-bit operating system, that is a fundamental design element of those OSes that use it. It doesn't change with versions.
If you're talking about high physical memory, then no. Compilers don't care about physical memory.
Linux gives each application its own memory space, distinct from the kernel. The page table contains the translations between this memory space and physical RAM, and the kernel sets up the page table so there's no interference.
That said, the compiler usually doesn't even care where the program is loaded in memory. Why would it?

Determine total memory usage of embedded C program

I would like to be able to find out how much total memory is being used by a C program in a resource-limited environment with 256 KB of memory (currently I am testing in an emulator).
I have the ability to print debug statements to a screen, but what method should I use to calculate how much my C program is using (including globals, local variables [from perspective of my main function loop], the program code itself etc..)?
A secondary aspect would be to display the location/ranges of specific variables as opposed to just their size.
-Edit- The CPU is Hitachi SH2, I don't have an IDE that lets me put breakpoints into the program.
In the IDE options, do whatever is needed (probably ticking a checkbox) so that the build process (namely, the linker) generates a map file.
A map file of an embedded system will normally give you the information you need in a detailed fashion: the memory segments, their sizes, how much memory is utilized in each one, program memory, data memory, etc. There is usually a lot of data in the map file, and you might need to write a script to calculate exactly what you need, or copy it to Excel. The map file might also contain summary information for you.
The stack is a bit trickier. If the map file gives that, then there you have it. If not, you need to find it yourself. Embedded compilers usually let you define the stack location and size. Put a breakpoint at the start of your program. When the application stops there, zero the entire stack. Resume the application and let it work for a while. Finally, stop it and inspect the stack memory. You will see non-zero values where the stack has been used. The used part of the stack extends up to the point where the zeros start again.
Generally the generated map file lists the different sections where your code and data go, such as:
.text ... and so on,
along with other attributes such as Base, Size (hex), Size (dec) etc. for each section.
While at any time local variables may take up more or less space (as they go in and out of scope), they are instantiated on the stack. In a single-threaded environment, the stack will be a fixed allocation known at link time. The same is true of all statically allocated data. The only part that varies at run time is dynamically allocated data, but even then such data is allocated from the heap, which in most bare-metal, single-threaded environments is a fixed link-time allocation.
Consequently all the information you need about memory allocation is probably already provided by your linker. Often (depending on your tool-chain and the linker parameters used) basic information is output when the linker runs. You can usually request that a full linker map file be generated, and this will give you detailed information. Some linkers can perform stack usage analysis that will give you the worst-case stack usage for any particular function. In a single-threaded environment, the stack usage from main() will give the worst-case overall usage. Interrupt handlers need separate consideration, though: the linker is not thread or interrupt aware, and some architectures have separate interrupt stacks while others share the main stack.
Although the heap itself is typically a fixed allocation (often all the available memory after the linker has performed static allocation of stack and static data), if you are using dynamic memory allocation, it may be useful at run-time to know how much memory has been allocated from the heap, as well as information about the number of allocations, average size of allocation, and the number of free blocks and their sizes also. Because dynamic memory allocation is implemented by your system's standard library any such analysis facility will be specific to your library, and may not be provided at all. If you have the library source you could implement such facilities yourself.
In a multi-threaded environment, thread stacks may be allocated statically or from the heap, but either way the same analysis methods described above apply. For stack usage analysis, the worst-case for each thread is measured from the entry point of each thread rather than from main().

Memory allocation in C

How do I inspect in what parts of my memory my heap, stack etc lie? I am currently looking at a program in C, and in looking at the .elf file I can see what memory addresses the program is using, but I don't know if it's in the heap or stack.
That's quite hard to know from a static analysis of the compiled code itself. You should be able to see any static initialized data areas, and also static uninitialized (BSS) sections, but exactly how those are loaded with respect to stack, heap and so on is down to the platform's executable loader.
If you are working on an embedded platform, you should probably use a linker script (lcf file) when building the program. With it you can identify in detail all the sections (stack, heap, intvec, bss, text, code), their placement in memory (whether in L1 cache, L2 cache or DDR) and their starting/ending addresses when loaded onto the board.
Have a look at the linker manual (you can find it in the compiler installation directory) for a proper understanding of the keywords used in the lcf.
There is also one more way to analyse the sections: you can create a "map file" for your project and go through it. It will list all the sections in the program and their addresses.
You could try using OllyDbg, which is a free debugger. The one drawback is that it shows everything in assembly form, but it will show you what's in your stack, heap, and even your registers. I'm not sure if this is what you are looking for.
