Is it possible to allocate a single byte of memory at a specific address? - c

Is it possible to allocate a single byte of memory at a specific desired address, say 0x123?
This suggests follow-up questions:
Is it possible to know whether a specific address has already been malloc'ed?
Some complications could be:
The byte at the desired address 0x123 was already malloc'ed. In this case, can I move the byte value elsewhere and notify the compiler (or whatever's keeping track of these things) of the new address of the byte?
The byte at the desired address 0x123 was malloc'ed along with other bytes. E.g. char *str = malloc(8); and str <= 0x123 < str + 8, or in other words, 0x123 overlaps some portion of already malloc'ed memory. In this case, is it possible to move the portion of malloc'ed memory elsewhere and notify the compiler (or whatever's keeping track of these things)?
There are also several variations:
Is this possible if the desired address is known at compile time?
Is this possible if the desired address is known at run time?
I know mmap takes a hint addr, but it allocates in multiples of the pagesize and may or may not allocate at the given hint addr.

It is possible to assign a specific value to a pointer as follows:
unsigned char *p = (unsigned char *)0x123;
However dereferencing such a pointer will almost certainly result in undefined behavior on any hosted system.
The only time such a construct would be valid is on an embedded system where it is allowed to access an arbitrary address and the implementation documents specific addresses for specific uses.
As for trying to manipulate the inner workings of a malloc implementation, such a task is very system specific and not likely to yield any benefit.

There are operating-system-specific ways to do this. On Windows, you can use VirtualAlloc (with the MEM_COMMIT | MEM_RESERVE flags). On Linux, you can use mmap (with the MAP_FIXED_NOREPLACE flag). These are the operating system functions which give you full control over your own address space.
In either case, you can only map entire pages. Addresses only become valid and invalid a page at a time. You can't have a page that is only half valid, and you can't have a page where only one address is valid. This is a CPU limitation.
If the page you want is already allocated, then obviously you can't allocate it again.
On both Windows and Linux, you can't allocate the first page. This is so that accesses to NULL pointers (which point to the first page) will always crash.
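For illustration, here is a minimal Linux-only sketch of the mmap approach (MAP_FIXED_NOREPLACE needs kernel 4.17 or newer; the target address 0x123000 is just an example of a page-aligned value and is not guaranteed to be free):

#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    void *want = (void *)0x123000;                 /* must be page-aligned */
    void *got = mmap(want, 4096, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED_NOREPLACE, -1, 0);
    if (got == MAP_FAILED) {
        perror("mmap");                            /* e.g. EEXIST if something is already mapped there */
        return 1;
    }
    *(unsigned char *)got = 42;                    /* the whole page, not just one byte, is now valid */
    printf("mapped a page at %p\n", got);
    munmap(got, 4096);
    return 0;
}

Without MAP_FIXED_NOREPLACE the address is only a hint; with plain MAP_FIXED the kernel will silently replace any existing mapping, which is rarely what you want.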

Is it possible to allocate a single byte of memory at a specific desired address, say 0x123?
Generally: no. The C language doesn't cover allocation at specific addresses; it only covers how to access a specific address. Many compilers do provide non-standard language extensions for allocating at a fixed address. When sticking to standard C, the actual allocation must be done either:
In hardware, for example by having an MCU which provides a memory-mapped register map, or
By the system-specific linker, through custom linker scripts.
See How to access a hardware register from firmware? for details.
malloc doesn't make any sense in either case, since it exclusively uses heap allocation and the heap sits inside a pre-designated address space.
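As a complement, here is a hedged sketch of the access side that standard C does cover: reading a byte at a fixed address that the platform itself documents as valid (a memory-mapped register on an MCU, say). The address 0x40021000 is purely illustrative; doing this on a hosted OS would be undefined behaviour.

#include <stdint.h>

/* hypothetical register address, as would be taken from a device datasheet */
#define STATUS_REG (*(volatile uint8_t *)0x40021000u)

uint8_t read_status(void)
{
    return STATUS_REG;   /* volatile forces an actual load from that address */
}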

Related

int* 'limits' in C [duplicate]

char *cp = (char *) malloc(1);
strcpy(cp, "123456789");
puts(cp);
output is "123456789" on both gcc (Linux) and Visual C++ Express, does that mean when there is free memory, I can actually use more than what I've allocated with malloc()?
And why doesn't malloc(0) cause a runtime error?
Thanks.
You've asked a very good question and maybe this will whet your appetite about operating systems. You already know you've managed to do something with this code that you wouldn't ordinarily expect to get away with. So you should never do this in code you want to be portable.
To be more specific, and this depends entirely on your operating system and CPU architecture, the operating system allocates "pages" of memory to your program - typically this can be in the order of 4 kilobytes. The operating system is the guardian of pages and will immediately terminate any program that attempts to access a page it has not been assigned.
malloc, on the other hand, is not an operating system function but a C library call. It can be implemented in many ways. It is likely that your call to malloc resulted in a page request from the operating system. Then malloc would have decided to give you a pointer to a single byte inside that page. When you wrote to the memory from the location you were given you were just writing in a "page" that the operating system had granted your program, and thus the operating system will not see any wrong doing.
The real problems, of course, will begin when you continue to call malloc to assign more memory. It will eventually return pointers to the locations you just wrote over. This is called a "buffer overflow": you write to memory locations that are legal (from the operating system's perspective) but may be overwriting memory that another part of the program is also using.
If you continue to learn about this subject you'll begin to understand how programs can be exploited using such "buffer overflow" techniques - even to the point where you begin to write assembly language instructions directly into areas of memory that will be executed by another part of your program.
When you get to this stage you'll have gained much wisdom. But please be ethical and do not use it to wreak havoc in the universe!
PS when I say "operating system" above I really mean "operating system in conjunction with privileged CPU access". The CPU and MMU (memory management unit) trigger particular interrupts or callbacks into the operating system if a process attempts to use a page that has not been allocated to that process. The operating system then cleanly shuts down your application and allows the system to continue functioning. In the old days, before memory management units and privileged CPU instructions, you could practically write anywhere in memory at any time - and then your system would be totally at the mercy of the consequences of that memory write!
No. You get undefined behavior. That means anything can happen, from it crashing (yay) to it "working" (boo), to it reformatting your hard drive and filling it with text files that say "UB, UB, UB..." (wat).
There's no point in wondering what happens after that, because it depends on your compiler, platform, environment, time of day, favorite soda, etc., all of which can do whatever they want as (in)consistently as they want.
More specifically, using any memory you have not allocated is undefined behavior. You get one byte from malloc(1), that's it.
When you ask malloc for 1 byte, it will probably get 1 page (typically 4KB) from the operating system. This page will be allocated to the calling process so as long as you don't go out of the page boundary, you won't have any problems.
Note, however, that it is definitely undefined behavior!
Consider the following (hypothetical) example of what might happen when using malloc:
malloc(1)
If malloc is internally out of memory, it will ask the operating system for some more. It will typically receive a page. Say it's 4KB in size with addresses starting at 0x1000
Your call returns giving you the address 0x1000 to use. Since you asked for 1 byte, it is defined behavior if you only use the address 0x1000.
Since the operating system has just allocated 4KB of memory to your process starting at address 0x1000, it will not complain if you read/write something from/to addresses 0x1000-0x1fff. So you can happily do so but it is undefined behavior.
Let's say you do another malloc(1)
Now malloc still has some memory left so it doesn't need to ask the operating system for more. It will probably return the address 0x1001.
If you had written to more than 1 byte using the address given by the first malloc, you will get into trouble when you use the address from the second malloc, because you will overwrite that data.
So the point is you definitely get 1 byte from malloc, but it might be that malloc internally has more memory allocated to your process.
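If you want to see this spacing for yourself, a small (entirely implementation-dependent) experiment is to print the addresses returned by a few consecutive malloc(1) calls; on a typical allocator they are separated by the alignment/bookkeeping granularity, not by one byte:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    for (int i = 0; i < 8; i++) {
        char *p = malloc(1);
        printf("malloc(1) #%d -> %p\n", i, (void *)p);
        /* deliberately not freed, so the allocator keeps handing out new slots */
    }
    return 0;
}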
No. It means that your program behaves badly. It writes to a memory location that it does not own.
You get undefined behavior - anything can happen. Don't do it and don't speculate about whether it works. Maybe it corrupts memory and you don't see it immediately. Only access memory within the allocated block size.
You may be able to keep using memory beyond the allocation until you reach some other program data, or some point at which your application will most likely crash for accessing protected memory.
So many responses and only one that gives the right explanation. While the page size, buffer overflow and undefined behaviour stories are true (and important), they do not exactly answer the original question. In fact, any sane malloc implementation will allocate at least the size of the alignment requirement of an int or a void *. Why? Because if it allocated only 1 byte, then the next chunk of memory wouldn't be aligned anymore. There's always some bookkeeping data around your allocated blocks, and these data structures are nearly always aligned to some multiple of 4. While some architectures can access words at unaligned addresses (x86), they do incur some penalty for doing so, so allocator implementers avoid that. Even in slab allocators there's no point in having a 1-byte pool, as small-size allocations are rare in practice. So it is very likely that there are 4 or 8 bytes of real room in your malloc'd byte (this doesn't mean you may use that 'feature'; it's still wrong).
EDIT: Besides, most malloc implementations reserve bigger chunks than asked for to avoid too many copy operations when calling realloc. As a test you can use realloc in a loop with a growing allocation size and compare the returned pointer: you will see that it changes only after a certain threshold.
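That test might look like the following sketch; the exact thresholds at which the pointer moves are implementation-specific:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    void *p = malloc(1);
    void *prev = p;
    for (size_t size = 2; p != NULL && size <= 4096; size *= 2) {
        p = realloc(p, size);
        if (p != NULL && p != prev) {
            printf("pointer moved at size %zu: %p -> %p\n", size, prev, p);
            prev = p;
        }
    }
    free(p);
    return 0;
}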
You just got lucky there. You are writing to locations you don't own; this leads to undefined behavior.
On most platforms you cannot allocate just one byte. There is often also a bit of housekeeping done by malloc to remember the amount of allocated memory. This means that you usually "allocate" memory rounded up to the next 4 or 8 bytes. But this is not defined behaviour.
If you use a few bytes more, you'll very likely get an access violation.
To answer your second question, the standard specifically mandates that malloc(0) be legal. Returned value is implementation-dependent, and can be either NULL or a regular memory address. In either case, you can (and should) legally call free on the return value when done. Even when non-NULL, you must not access data at that address.
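A tiny demonstration of that rule (the printed value may be NULL or a non-NULL token, depending on the implementation):

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    void *p = malloc(0);
    printf("malloc(0) returned %p\n", p);   /* NULL or a unique pointer */
    free(p);                                /* always legal; free(NULL) is a no-op */
    return 0;
}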
malloc allocates the amount of memory you ask for on the heap and then returns a pointer to void (void *) that can be cast to whatever you want.
It is the responsibility of the programmer to use only the memory that has been allocated.
Writing (and, in a protected environment, even reading) where you are not supposed to can cause all sorts of random problems at execution time. If you are lucky, your program crashes immediately with an exception and you can quite easily find the bug and fix it. If you aren't lucky, it will crash randomly or produce unexpected behavior.
By Murphy's Law, "Anything that can go wrong, will go wrong", and as a corollary, "It will go wrong at the worst time, producing the largest amount of damage".
It is sadly true. The only way to prevent that is to use a language in which you cannot actually do something like that.
Modern languages do not allow the programmer to write to memory where they are not supposed to (at least in ordinary code). That is how Java got a lot of its traction. I prefer C++ to C. You can still do damage using pointers, but it is less likely. That is the reason why smart pointers are so popular.
In order to fix these kinds of problems, a debug version of the malloc library can be handy. You need to call a check function periodically to detect whether the memory was corrupted.
When I used to work intensively on C/C++ at work, we used Rational Purify, which in practice replaces the standard malloc (new in C++) and free (delete in C++) and can produce a quite accurate report on where the program did something it was not supposed to. However, you will never be 100% sure that you do not have any errors in your code. If you have a condition that happens extremely rarely, you may never hit it when you run the program. It will eventually happen in production, on the busiest day, on the most sensitive data (according to Murphy's Law ;-)
It could be that you're in Debug mode, where a call to malloc will actually call _malloc_dbg. The debug version will allocate more space than you have requested to cope with buffer overflows. I guess that if you ran this in Release mode you might (hopefully) get a crash instead.
You should use the new and delete operators in C++... and a safe pointer class to ensure that operations don't go past the end of the allocated array...
There is no "C runtime". C is glorified assembler. It will happily let you walk all over the address space and do whatever you want with it, which is why it's the language of choice for writing OS kernels. Your program is an example of a heap corruption bug, which is a common security vulnerability. If you wrote a long enough string to that address, you'd eventually overrun the end of the heap and get a segmentation fault, but not before you overwrote a lot of other important things first.
When malloc() doesn't have enough free memory in its reserve pool to satisfy an allocation, it grabs pages from the kernel in chunks of at least 4 kb, and often much larger, so you're probably writing into reserved but un-malloc()ed space when you initially exceed the bounds of your allocation, which is why your test case always works. Actually honoring allocation addresses and sizes is completely voluntary, so you can assign a random address to a pointer, without calling malloc() at all, and start working with that as a character string, and as long as that random address happens to be in a writable memory segment like the heap or the stack, everything will seem to work, at least until you try to use whatever memory you were corrupting by doing so.
strcpy() doesn't check whether the memory it's writing to is allocated. It just takes the destination address and writes the source character by character until it reaches the '\0'. So, if the allocated destination memory is smaller than the source, you just write over whatever memory follows. This is a dangerous bug because it is very hard to track down.
puts() writes the string until it reaches '\0'.
My guess is that malloc(0) just returns NULL and does not cause a run-time error.
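For reference, the well-defined version of the original snippet simply allocates enough room (including the terminating '\0') before copying:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    const char *src = "123456789";
    char *cp = malloc(strlen(src) + 1);   /* 9 characters plus the '\0' */
    if (cp == NULL)
        return 1;
    strcpy(cp, src);                      /* now fits entirely inside the allocation */
    puts(cp);
    free(cp);
    return 0;
}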
My answer is in response to Why does printf not seg fault or produce garbage?
From
The C Programming Language by Dennis Ritchie & Brian Kernighan:
typedef long Align;    /* for alignment to long boundary */

union header {         /* block header */
    struct {
        union header *ptr;    /* next block if on free list */
        unsigned size;        /* size of this block */
    } s;
    Align x;           /* force alignment of blocks */
};

typedef union header Header;
The Align field is never used; it just forces each header to be aligned on a worst-case boundary.
In malloc, the requested size in characters is rounded up to the proper number of header-sized units; the block that will be allocated contains one more unit, for the header itself, and this is the value recorded in the size field of the header.
The pointer returned by malloc points at the free space, not at the header itself.
The user can do anything with the space requested, but if anything is written outside of the allocated space the list is likely to be scrambled.
 ---------------------------------------------------
 |  ptr to   |      |                               |
 | next free | SIZE |      space for the user       |
 |   block   |      |                               |
 ---------------------------------------------------
                     ^
                     |--- address returned to user

 -> a block returned by malloc
In the statement
char *test = malloc(1);
malloc() will look for enough consecutive free bytes in the heap section of RAM; if the requested bytes are available, it returns an address as below:
 ----------------------------------------------------------------
 | free memory | memory allocated for the user (size bytes) | .. |
 ----------------------------------------------------------------
               ^
               |--- 0x100 (assume this is the address returned by malloc)
                    test points here
So when malloc(1) is executed it doesn't allocate just 1 byte; it allocates some extra bytes to maintain the bookkeeping structure/heap table above. With this kind of implementation you can find out how much memory was actually allocated when you requested only 1 byte by printing test[-1], because the location just before the returned block contains the size.
char *test = malloc(1);
printf("memory allocated in bytes = %d\n", test[-1]);  /* implementation-specific: relies on the size being stored just before the block */
(On the related question of zero sizes: for realloc, if the size passed is zero and ptr is not NULL, the call is equivalent to free.)

Dynamic Memory Allocation

How malloc() stores metadata?
void* p;
void* q;
p = malloc(sizeof(char));
q = malloc(sizeof(int));
I know that the returned value p points to the start of the allocated block of memory,
and that if I iterate and print
p[-1], p[-2].... q[-1], q[-2]....
or p[1], p[2], p[3], p[4]....
or q[1], q[2], q[3], q[4]....
I find some values that help malloc() store its data; however, I can't understand precisely what that metadata means. I only know that some of the values hold the block size or the address of the next free block, but I can't find anything more detailed on the web.
Please, can you give me a detailed explanation of those values?
How this metadata works and is used depends entirely on the memory management in your libc. Here are some useful writeups to get you started:
Malloc - Typically, classic malloc.
DLMalloc - Doug Lea's malloc.
GC Malloc - Malloc with garbage collection.
TC Malloc - Thread caching malloc.
Each of these has different aims, benefits and possible deficiencies. For example, perhaps you are concerned about possible heap overflow issues and protections. This may lead to one choice. Perhaps you are looking for better fragment management. This might lead to the selection of Doug Lea's malloc. You really need to specify which library you are using, or research them all, to understand how the metadata is used to maintain bins, coalesce adjacent free regions, etc.
David Hoelzer has an excellent answer, but I wanted to follow it up a touch more:
"Meta Data" for malloc is 100% implementation driven. Here is a quick overview of some implementations I have written or used:
The meta data stores "next" block pointers.
It stores canary values to make sure you haven't written past the end of a block
There is no meta data at all because all the blocks are the same size and they are placed in a stack to allow O(1) access - A Unit Allocator. In this case, you'd hit memory from an adjacent block.
The data stores next block ptr, prev block ptr, alignment id, block size, and an identifier that tells a memory debugger exactly where the allocation occurred. Very useful for leak tracking.
It hits the previous block's info because the data is stored at the tail of the block instead of the head...
Or mix and match all the above. In any case, reading mem[-1] is a bad idea, and in some cases (thinking of embedded systems) could indeed cause a segmentation fault on a mere read, if said read happens to address beyond the current memory page and into a forbidden area of memory.
Update as per OP
The 4th scheme I described is one that has quite a bit of information per block - 16 bytes of information is not uncommon, as that size won't throw off common alignment schemes. It would contain a 4-byte pointer to the next allocated block, a 4-byte pointer to the previous one, an identifier for alignment - a single byte can handle this directly for common sizes of 0-256 byte alignment, but that byte could also represent 256 possible alignment enum values instead - 3 bytes of pad or canary values, and a 4-byte unique identifier to identify where in the code the call was made, though it could be made smaller. This can be a lookup value into a debug table that contains __FILE__, __LINE__, and whatever other info you wish to save with the allocation.
This would be one of the heavier varieties as it has a lot of information that needs to be updated when values are allocated and freed.
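Purely as an illustration of that fourth scheme, a hypothetical header might look like this (the field names and sizes are invented for the sketch, not taken from any particular allocator; on a 32-bit target this works out to the 16 bytes mentioned above):

#include <stdint.h>

struct block_header {
    struct block_header *next;    /* next allocated block                 */
    struct block_header *prev;    /* previous allocated block             */
    uint8_t              align;   /* alignment id (e.g. an enum value)    */
    uint8_t              pad[3];  /* padding / canary bytes               */
    uint32_t             site;    /* lookup key into a debug table with   */
                                  /* __FILE__, __LINE__, etc.             */
};
/* The pointer handed to the user would be the address just past this header. */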
It is illegal to access p[-1], etc. in your example. The results are undefined and may cause memory corruption or a segmentation fault. In general you don't get any control over malloc() at all, or any information about what it is doing.
That said, some malloc() "replacement" libraries will give you finer grained control or information - you link these into your binary "on top" of the system malloc().
The C standard does not specify how malloc stores its metadata (or even if it will have accessible metadata); therefore, the metadata format and location are implementation-dependent. Portable code must therefore never attempt to access or parse the metadata.
Some malloc implementations provide, as an extension, functions like malloc_usable_size or malloc_size which can tell you the size of allocated blocks. While the presence of these functions is also implementation dependent, they are at least reliable and correct ways to get the information you need if they are present.
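A small sketch using the glibc extension (on macOS the rough equivalent is malloc_size() from <malloc/malloc.h>; neither is standard C):

#include <malloc.h>   /* glibc: malloc_usable_size() */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    char *p = malloc(1);
    if (p == NULL)
        return 1;
    printf("asked for 1 byte, usable size is %zu bytes\n",
           malloc_usable_size(p));
    free(p);
    return 0;
}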

Accessing memory below the stack on linux

This program accesses memory below the stack.
I would expect to get a segfault, or just zero bytes, when going out of the stack bounds, but I see actual data. (This is assuming 100 KB below the stack pointer is beyond the stack bounds.)
Or is the system actually letting me see memory below the stack? Weren't there supposed to be kernel level protections against this, or does that only apply to allocated memory?
Edit: With 1024*127 below the char pointer it randomly segfaults or runs, so the stack doesn't seem to be a fixed 8 MB, and there seems to be a bit of randomness to it too.
#include <stdio.h>

int main() {
    char *x;
    int a;
    for (x = (char *)&x - 1024 * 127; x < (char *)(&x + 1); x++) {
        a = *x & 0xFF;
        printf("%p = 0x%02x\n", x, a);
    }
}
Edit: Another weird thing. The first program segfaults at only 1024*127, but if I printf downwards away from the stack I don't get a segfault and all the memory seems to be empty (all 0x00):
#include <stdio.h>

int main() {
    char *x;
    int a;
    for (x = (char *)(&x); x > (char *)&x - 1024 * 1024; x--) {
        a = *x & 0xFF;
        printf("%p = 0x%02x\n", x, a);
    }
}
When you access memory, you're accessing the process address space.
The process address space is divided into pages (typically 4 KB on x86). These are virtual pages: their contents are held elsewhere. The kernel manages a mapping from virtual pages to their contents. Contents can be provided by:
A physical page, for pages that are currently backed by physical RAM. Accesses to these happen directly (via the memory management hardware).
A page that's been swapped out to disk. Accessing this will cause a page fault, which the kernel handles. It needs to fill a physical page with the on-disk contents, so it finds a free physical page (perhaps swapping that page's contents out to disk), reads in the contents from disk, and updates the mapping to state that "virtual page X is in physical page Y".
A file (i.e. a memory mapped file).
Hardware devices (i.e. hardware device registers). These don't usually concern us in user space.
Suppose that we have a 4 GB virtual address space, split into 4 KB pages, giving us 1048576 virtual pages. Some of these will be mapped by the kernel; others will not. When the process starts (i.e. when main() is invoked), the virtual address space will contain, amongst other things:
Program code. These pages are usually readable and executable.
Program data (i.e. for initialised variables). This usually has some read-only pages and some read-write pages.
Code and data from libraries that the program depends on.
Some pages for the stack.
These things are all mapped as pages in the 4 GB address space. You can see what's mapped by looking at /proc/(pid)/maps, as one of the comments has pointed out. The precise contents and location of these pages depend on (a) the program in question, and (b) address space layout randomisation (ASLR), which makes locations of things harder to guess, thereby making certain security exploitation techniques more difficult.
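A quick, Linux-specific way to look at your own mappings from within the program is simply to dump /proc/self/maps:

#include <stdio.h>

int main(void)
{
    FILE *f = fopen("/proc/self/maps", "r");
    if (f == NULL) {
        perror("fopen");
        return 1;
    }
    char line[512];
    while (fgets(line, sizeof line, f) != NULL)
        fputs(line, stdout);     /* one mapped region per line */
    fclose(f);
    return 0;
}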
You can access any particular location in memory by defining a pointer and dereferencing it:
*(unsigned char *)0x12345678
If this happens to point to a mapped page, and that page is readable, then the access will succeed and yield whatever's mapped at that address. If not, then you'll receive a SIGSEGV from the kernel. You could handle that (which is useful in some cases, such as JIT compilers), but normally you don't, and the process will be terminated. As noted above, due to ASLR, if you do this in a program and run the program several times then you'll get non-deterministic results for some addresses.
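If you do want to observe the SIGSEGV rather than die silently, a minimal sketch (Linux/POSIX assumed) installs a handler and reports the faulting address from siginfo_t; real handlers, e.g. in JIT compilers, would remap the page or longjmp out instead of exiting. (snprintf is not strictly async-signal-safe, which is acceptable for a demo but not for production code.)

#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    (void)sig; (void)ctx;
    char msg[64];
    int n = snprintf(msg, sizeof msg, "SIGSEGV at %p\n", info->si_addr);
    if (n > 0)
        write(STDERR_FILENO, msg, (size_t)n);
    _exit(1);
}

int main(void)
{
    struct sigaction sa = { 0 };
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGSEGV, &sa, NULL);

    *(volatile unsigned char *)0x12345678 = 1;   /* almost certainly unmapped */
    return 0;
}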
There is usually quite a bit of accessible memory below the stack pointer, because that memory is used when you grow the stack normally. The stack itself is only controlled by the value of the stack pointer - it is a software entity, not a hardware entity.
However, system code may assume typical stack usage. I. e., on some systems, the stack is used to store state for a context switch, while a signal handler runs, etc. This also depends on whether the hardware automatically switches stack pointers when leaving user mode. If the system does use your stack for this, it will clobber the data you stored there, and that can really happen at every point in your program.
So it is not safe to manipulate stack memory below the stack pointer. It's not even safe to assume that a value that has successfully been written will still be the same in the next line code. Only the portion above the stack pointer is guaranteed not to be touched by the runtime/kernel.
It goes without saying that this code invokes undefined behavior. The pointer arithmetic is invalid, because the address &x - 1024*127 is not within the storage allocated to the variable x, so dereferencing this pointer invokes undefined behavior.
This is undefined behavior in C. You're accessing a random memory address which, depending on the platform, may or may not be on the stack. It may or may not be in memory this user can access; if not you will get a segfault or similar. There are absolutely no promises either way.
Actually, it's not undefined behaviour, it's pretty well defined. Accessing memory locations through pointers is and was always defined since C is as close to the hardware as it can be.
I however agree that accessing hardware through pointers when you don't know exactly what you're doing is a dangerous thing to do.
Don't Do That. (If you're one of the five or six people who has a legitimate reason to do this, you already know it and don't need our advice.)
It would be a poor world with only five or six people legitimately programming operating systems, embedded devices and drivers (although it sometimes appears as if the latter is the case...).

Declare a pointer to an integer at address 0x200 in memory

I have a couple of doubts. I remember reading somewhere that it is not possible to manually put a variable at a particular location in memory, but then I came across this code:
#include <stdio.h>

void main()
{
    int *x;
    x = 0x200;
    printf("Number is %lu", x); // Checkpoint1
    scanf("%d", x);
    printf("%d", *x);
}
Is it that we cannot put it at a particular location, or that we should not put it at a particular location, since we will not know whether it's a valid location or not?
Also, in this code, up to the first checkpoint, I get the output 512.
And then after that, a seg fault.
Can someone explain why? Is 0x200 not a valid memory location?
In the general case - the behavior you will get is undefined - everything can happen.
In Linux, for example, 1GB of the 32-bit virtual address space is reserved for the kernel, so if you try to access it you will get a seg fault because you are trying to access kernel memory from user mode.
No idea how it works in Windows.
Reference for the Linux claim:
Currently the 32 bit x86 architecture is the most popular type of computer. In this architecture, traditionally the Linux kernel has split the 4GB of virtual memory address space into 3GB for user programs and 1GB for the kernel.
Adding to what #amit wrote:
In Windows it is the same. In general it is the same for all protected-mode operating systems. Since DOS etc. are no longer around, it is the same on all systems except for kernel mode (kernel-mode drivers) and embedded systems.
The operating system manages which memory pages you are allowed to write to and places markers that will make the CPU automatically raise access violations if some other page is written to.
Up until the "checkpoint", you haven't accessed memory location 0x200, so everything works fine.
There is a local variable x in the function main. It is of type "pointer to int". x is assigned the value 0x200, and then that value is printed. But the target of x hasn't been accessed, so up to this point it doesn't matter whether x holds a valid memory address or not.
Then scanf tries to write to the memory address you passed in, which is the 0x200 stored in x. Then you get a seg fault, which is certainly a possible result of trying to write to an arbitrary memory address.
So what are your doubts? What makes you think that this might work, when you come across this code that clearly doesn't?
Writing to a particular memory address might work under certain conditions, but is extremely unlikely to in general. Under all modern OSes, normal programs do not have control over their memory layout. The OS decides where initial things like the program's code, stack, and globals go. The OS will probably also be using some memory space, and it is not required to tell you what it's using. Instead you ask for memory (either by making variables or by calling memory allocation routines), and you use that.
So writing to particular addresses is very, very likely to hit either memory that hasn't been allocated, or memory that is being used for some other purpose. Neither of those is good, even if you do manage to hit an address that is actually writable. What if you clobber some piece of data used by one of your program's other variables? Or some other part of your program clobbers the value you just wrote?
You should never be choosing a particular hard-coded memory address, you should be using an address of something you know is a variable, or an address you got from something like malloc.
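A minimal corrected version of the snippet from the question, using storage the program actually owns instead of the hard-coded 0x200:

#include <stdio.h>

int main(void)
{
    int value;
    int *x = &value;                 /* a location we genuinely own */
    printf("Address is %p\n", (void *)x);
    if (scanf("%d", x) == 1)
        printf("%d\n", *x);
    return 0;
}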

How Process Size is determined? [closed]

I am very new to these concepts, but I want to ask you all a question that I think is very basic; I am confused about it, so I am asking.
The question is...
How is the size of a process determined by the OS?
Let me clarify first: suppose I have written a C program and I want to know how much memory it is going to take; how can I determine that? Secondly, I know that a process has many sections, such as the code section, data section, and BSS. Are the sizes of these predetermined? Also, how are the sizes of the stack and heap determined? Do the sizes of the stack and heap also matter when the total size of the process is calculated?
Again, we say that when we load the program, an address space is given to the process (that is done by base and limit registers and controlled by the MMU, I guess), and when the process tries to access a memory location that is not in its address space we get a segmentation fault. How is it possible for a process to even access memory that is not in its address space? According to my understanding, when some buffer overflow happens an address gets corrupted; then, when the process tries to access the corrupted location, we get the segmentation fault. Is there any other way to get an address violation?
And thirdly, why does the stack grow downward and the heap upward? Is this the same on all OSes? How does it affect performance? Why can't we have it the other way around?
Please correct me, if I am wrong in any of the statement.
Thanks
Sohrab
When a process is started it gets its own virtual address space. The size of the virtual address space depends on your operating system. In general, 32-bit processes get 4 GiB (gibibytes) of addresses and 64-bit processes get 16 EiB (exbibytes) of addresses.
You cannot in any way access anything that is not mapped into your virtual address space as by definition anything that is not mapped there does not have an address for you. You may try to access areas of your virtual address space that are currently not mapped to anything, in which case you get a segfault exception.
Not all of the address space is mapped to something at any given time. Also not all of it may be mapped at all (how much of it may be mapped depends on the processor and the operating system). On current generation intel processors up to 256 TiB of your address space may be mapped. Note that operating systems can limit that further. For example for 32 bit processes (having up to 4 GiB addresses) Windows by default reserves 2 GiB for the system and 2 GiB for the application (but there's a way to make it 1 GiB for the system and 3 GiB for the application).
How much of the address space is being used and how much is mapped changes while the application runs. Operating system specific tools will let you monitor what the currently allocated memory and virtual address space is for an application that is running.
Code section, data section, BSS etc. are terms that refer to different areas of the executable file created by the linker. In general, code is separate from static immutable data, which is separate from statically allocated but mutable data. The stack and heap are separate from all of the above. The sizes of the sections are computed by the compiler and the linker. Note that each binary file has its own sections, so any dynamically linked libraries will be mapped into the address space separately, each with its own sections mapped somewhere. The heap and stack, however, are not part of the binary image; there is generally just one stack per process and one heap.
The size of the stack (at least the initial stack) is generally fixed. Compilers and/or linkers generally have some flags you can use to set the size of the stack that you want at runtime. Stacks generally "grow backward" because that's how the processor stack instructions work. Having stacks grow in one direction and the rest grow in the other makes it easier to organize memory in situations where you want both to be unbounded but do not know how much each can grow.
Heap, in general, refers to anything that is not pre-allocated when the process starts. At the lowest level there are several logical operations that relate to heap management (not all are implemented as I describe here in all operating systems).
While the address space is fixed, some OSs keep track of which parts of it are currently reclaimed by the process. Even if this is not the case, the process itself needs to keep track of it. So the lowest level operation is to actually decide that a certain region of the address space is going to be used.
The second low level operation is to instruct the OS to map that region to something. This in general can be
some memory that is not swappable
memory that is swappable and mapped to the system swap file
memory that is swappable and mapped to some other file
memory that is swappable and mapped to some other file in read only mode
the same mapping that another virtual address region is mapped to
the same mapping that another virtual address region is mapped to, but in read only mode
the same mapping that another virtual address region is mapped to, but in copy on write mode with the copied data mapped to the default swap file
There may be other combinations I forgot, but those are the main ones.
Of course the total space used really depends on how you define it. RAM currently used is different than address space currently mapped. But as I wrote above, operating system dependent tools should let you find out what is currently happening.
The sections are predetermined by the executable file.
Besides that one, there may be those of any dynamically linked libraries. While the code and constant data of a DLL is supposed to be shared across multiple processes using it and not be counted more than once, its process-specific non-constant data should be accounted for in every process.
Besides, there can be dynamically allocated memory in the process.
Further, if there are multiple threads in the process, each of them will have its own stack.
What's more, there are going to be per-thread, per-process and per-library data structures in the process itself and in the kernel on its behalf (thread-local storage, command line params, handles to various resources, structures for those resources as well and so on and so forth).
It's difficult to calculate the full process size exactly without knowing how everything is implemented. You might get a reasonable estimate, though.
W.r.t. "According to my understanding when some buffer overflow happens then the address gets corrupted": that is not necessarily true. First of all, the address of what? It depends on what happens to be in the memory near the buffer. If there's an address there, it can get overwritten during a buffer overflow. But if there's another buffer nearby that contains a picture of you, the pixels of the picture can get overwritten instead.
You can get segmentation or page faults when trying to access memory for which you don't have necessary permissions (e.g. the kernel portion that's mapped or otherwise present in the process address space). Or it can be a read-only location. Or the location can have no mapping to the physical memory.
It's hard to tell how the location and layout of the stack and heap are going to affect performance without knowing the performance of what we're talking about. You can speculate, but the speculations can turn out to be wrong.
Btw, you should really consider asking separate questions on SO for separate issues.
"How is it possible for a process to access a memory that is not in its address space?"
Given memory protection it's impossible. But it might be attempted. Consider random pointers or access beyond buffers. If you increment any pointer long enough, it almost certainly wanders into an unmapped address range. Simple example:
char *p = "some string";
while (*p++ != 256) /* Always true. Keeps incrementing p until segfault. */
;
Simple errors like this are not unheard of, to make an understatement.
I can answer to questions #2 and #3.
Answer #2
When you use pointers in C, you are really using a numerical value that is interpreted as an address in memory (a logical address on a modern OS; see the footnotes). You can modify this address at will. If the value points to an address that is not in your address space, you get your segmentation fault.
Consider for instance this scenario: your OS gives to your process the address range from 0x01000 to 0x09000. Then
int *ptr = (int *)0x01000;
printf("%d", ptr[0]);   /* reads 4 bytes (sizeof(int) bytes) from inside your address space */

ptr = (int *)0x09100;
printf("%d", ptr[0]);   /* you are accessing outside your space: segfault */
Mostly the causes of segfaults, as you pointed out, are the use of NULL pointers (NULL is mostly the 0x00 address, but that is implementation dependent) or the use of corrupted addresses.
Note that, on Linux i386, the base and limit registers are not used as you may think. They are not per-process limits; rather, they point to two kinds of segments: user space or kernel space.
Answer #3
The direction of stack growth is hardware dependent, not OS dependent. On i386, assembly instructions like push and pop make the stack grow downwards with respect to the stack-related registers. For instance, the stack pointer automatically decreases when you do a push and increases when you do a pop. The OS cannot change that.
Footnotes
In a modern OS, a process uses so-called logical addresses. These addresses are mapped to physical addresses by the OS. To see this for yourself, compile this simple program:
#include <stdio.h>

int main()
{
    int a = 10;
    printf("%p\n", &a);
    return 0;
}
If you run this program multiple times (even simultaneously) you may see, even for different instances, the same address printed out (although address space layout randomisation, where enabled, will usually change it between runs). Of course this is not the real memory address; it is a logical address that will be mapped to a physical address when needed.
