Disassembling simple C function - c

I'm trying to understand the underlying assembly for a simple C function.
program1.c
void function() {
char buffer[1];
}
=>
push %ebp
mov %esp, %ebp
sub $0x10, %esp
leave
ret
Not sure how it's arriving at 0x10 here? Isn't a character 1 byte, which is 8 bits, so it should be 0x08?
program2.c
void function() {
char buffer[4];
}
=>
push %ebp
mov %esp, %ebp
sub $0x18, %esp
mov ...
mov ...
[a bunch of random instructions]
Not sure how it's arriving at 0x18 here either? Also, why are there so many additional instructions after the SUB instruction? All I did was change the length of the array from 1 to 4.

gcc uses -mpreferred-stack-boundary=4 by default for x86 32 and 64bit ABIs, so it keeps %esp 16B-aligned.
I was able to reproduce your output with gcc 4.8.2 -O0 -m32 on the Godbolt Compiler Explorer
void f1() { char buffer[1]; }
pushl %ebp
movl %esp, %ebp # make a stack frame (`enter` is super slow, so gcc doesn't use it)
subl $16, %esp
leave # `leave` is not terrible compared to mov/pop
ret
You must be using a version of gcc with -fstack-protector enabled by default. Newer gcc isn't usually configured to do that, so you don't get the same sentinel value and check written to the stack. (Try a newer gcc in that godbolt link)
void f4() { char buffer[4]; }
pushl %ebp #
movl %esp, %ebp # make a stack frame
subl $24, %esp # IDK why it reserves 24, rather than 16 or 32B, but prob. has something to do with aligning the stack for the possible call to __stack_chk_fail
movl %gs:20, %eax # load a value from thread-local storage
movl %eax, -12(%ebp) # store it on the stack
xorl %eax, %eax # tmp59
movl -12(%ebp), %eax # D.1377, tmp60
xorl %gs:20, %eax # check that the sentinel value matches what we stored
je .L3 #,
call __stack_chk_fail #
.L3:
leave
ret
Apparently gcc considers char buffer[4] a "vulnerable object", but not char buffer[1]. Without -fstack-protector, there'd be little to no difference in the asm even at -O0.
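The sentinel store and check in f4 can be modelled at the C level. The sketch below is only an illustration under stated assumptions: guard_value, struct frame_model and canary_intact_after_copy are hypothetical names, the guard is a fixed constant rather than the per-thread value gcc loads from %gs:20, and a struct stands in for the real frame so that the overflow stays inside one object and remains well-defined C.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the guard value that gcc reads from %gs:20. */
static const uint32_t guard_value = 0xdeadc0deu;

/* Model of a protected frame: the sentinel sits at a higher address than the buffer. */
struct frame_model {
    char buffer[4];
    uint32_t canary;
};

/* Copies `len` bytes into the modelled frame, starting at buffer[0].
   Returns 1 if the sentinel is intact afterwards, 0 if it was overwritten. */
int canary_intact_after_copy(const char *src, size_t len) {
    struct frame_model frame;
    frame.canary = guard_value;              /* prologue: store the sentinel */
    if (len > sizeof frame)
        len = sizeof frame;                  /* keep the model itself in bounds */
    memcpy(&frame, src, len);                /* bytes past buffer[] land on the sentinel */
    return frame.canary == guard_value;      /* epilogue: compare before returning */
}
```

A 4-byte copy leaves the sentinel alone; an 8-byte copy overwrites it, which is the situation in which the real code would call __stack_chk_fail.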

Isn't a character 1 byte, which is 8 bits, so it should be 0x08?
These values are not bits; they are bytes.
Not sure how it's arriving at 0x10 here?
These lines:
push %ebp
mov %esp, %ebp
sub $0x10, %esp
Are allocating space on the stack, 16 bytes of memory are being reserved for the execution of this function.
All those bytes are needed to store information like:
A 4 byte memory address for the instruction that will be jumped to in the ret instruction
The local variables of the functions
Data structure alignment
Other stuff I can't remember right now :)
In your example, 16 bytes were allocated. 4 of them are for the address of the next instruction that will be called, so we have 12 bytes left. 1 byte is for the char array of size 1, which is probably optimized by the compiler to a single char. The last 11 bytes are probably used to store some of the stuff I can't remember, plus the padding added by the compiler.
Not sure how it's arriving at 0x18 here either?
Each of the additional bytes in your second example increased the stack size by 2 bytes: 1 byte for the char, and 1 likely for memory alignment purposes.
Also, why are there so many additional instructions after the SUB instruction?
Please update the question with the instructions.

This code is just setting up the stack frame. This is used as scratch space for local variables, and will have some kind of alignment requirement.
You haven't mentioned your platform, so I can't tell you exactly what the requirements are for your system, but obviously both values are at least 8-byte aligned (so the size of your local variables is rounded up so %esp is still a multiple of 8).
Search for "c function prolog epilog" or "c function call stack" to find more resources in this area.
Edit - Peter Cordes' answer explains the discrepancy and the mysterious extra instructions.
And for completeness, although Fábio already answered this part:
Not sure how it's arriving at 0x10 here? Isn't a character 1 byte, which is 8 bits, so it should be 0x08?
On x86, %esp is the stack pointer, and pointers store addresses, and these are addresses of bytes. Sub-byte addressing is rarely used (cf. Peter's comment). If you want to examine individual bits inside a byte, you'd usually use bitwise (&,|,~,^) operations on the value, but not change the address.
(You could equally argue that sub-cache-line addressing is a convenient fiction, but we're rapidly getting off-topic).
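As a small illustration of the point about bitwise operations, here is a sketch of testing one bit of a byte without any sub-byte address. The helper name bit_is_set is made up for this example.

```c
#include <limits.h>

/* Returns 1 if bit `n` (0 = least significant) of `byte` is set, else 0.
   The byte has a single address; the bit is selected by shifting and
   masking the value, not by changing the address. */
int bit_is_set(unsigned char byte, unsigned n) {
    if (n >= CHAR_BIT)
        return 0;                  /* a char has no such bit */
    return (byte >> n) & 1u;
}
```

In sub $0x10, %esp the operand 0x10 is the number 16, a count of bytes. Viewed as a bit pattern, only bit 4 of it is set.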

Whenever you allocate memory, your operating system almost never actually gives you exactly that amount, unless you use a function like pvalloc, which gives you a page-aligned amount of bytes (usually 4K). Instead, your operating system assumes that you might need more in the future, so goes ahead and gives you a bit more.
To disable this behavior, use a lower-level system call that doesn't do buffering, like sbrk(). These lecture notes are an excellent resource:
http://web.eecs.utk.edu/~plank/plank/classes/cs360/360/notes/Malloc1/lecture.html

Related

Segmentation fault when calling x86 Assembly function from C program

I am writing a C program that calls an x86 Assembly function which adds two numbers. Below are the contents of my C program (CallAssemblyFromC.c):
#include <stdio.h>
#include <stdlib.h>
int addition(int a, int b);
int main(void) {
int sum = addition(3, 4);
printf("%d", sum);
return EXIT_SUCCESS;
}
Below is the code of the Assembly function (my idea is to code from scratch the stack frame prologue and epilogue, I have added comments to explain the logic of my code) (addition.s):
.text
# Here, we define a function addition
.global addition
addition:
# Prologue:
# Push the current EBP (base pointer) to the stack, so that we
# can reset the EBP to its original state after the function's
# execution
push %ebp
# Move the EBP (base pointer) to the current position of the ESP
# register
movl %esp, %ebp
# Read in the parameters of the addition function
# addition(a, b)
#
# Since we are pushing to the stack, we need to obtain the parameters
# in reverse order:
# EBP (return address) | EBP + 4 (return value) | EBP + 8 (b) | EBP + 4 (a)
#
# Utilize advanced indexing in order to obtain the parameters, and
# store them in the CPU's registers
movzbl 8(%ebp), %ebx
movzbl 12(%ebp), %ecx
# Clear the EAX register to store the sum
xorl %eax, %eax
# Add the values into the section of memory storing the return value
addl %ebx, %eax
addl %ecx, %eax
I am getting a segmentation fault error, which seems strange considering that I think I am allocating memory in accordance with the x86 calling conventions (e.g. allocating the correct memory sections to the function's parameters). Furthermore, if any of you have a solution, it would be greatly appreciated if you could provide some advice as to how to debug an Assembly program embedded with C (I have been using the GDB debugger but it simply points to the line of the C program where the segmentation fault happens instead of the line in the Assembly program).
Your function has no epilogue. You need to restore %ebp and pop the stack back to where it was, and then ret. If that's really missing from your code, then that explains your segfault: the CPU will go on executing whatever garbage happens to be after the end of your code in memory.
You clobber (i.e. overwrite) the %ebx register which is supposed to be callee-saved. (You mention following the x86 calling conventions, but you seem to have missed that detail.) That would be the cause of your next segfault, after you fixed the first one. If you use %ebx, you need to save and restore it, e.g. with push %ebx after your prologue and pop %ebx before your epilogue. But in this case it is better to rewrite your code so as not to use it at all; see below.
movzbl loads an 8-bit value from memory and zero-extends it into a 32-bit register. Here the parameters are int so they are already 32 bits, so plain movl is correct. As it stands your function would give incorrect results for any arguments which are negative or larger than 255.
You're using an unnecessary number of registers. You could move the first operand for the addition directly into %eax rather than putting it into %ebx and adding it to zero. And on x86 it is not necessary to get both operands into registers before adding; arithmetic instructions have a mem, reg form where one operand can be loaded directly from memory. With this approach we don't need any registers other than %eax itself, and in particular we don't have to worry about %ebx anymore.
I would write:
.text
# Here, we define a function addition
.global addition
addition:
# Prologue:
push %ebp
movl %esp, %ebp
# load first argument
movl 8(%ebp), %eax
# add second argument
addl 12(%ebp), %eax
# epilogue
movl %ebp, %esp # redundant since we haven't touched esp, but will be needed in more complex functions
pop %ebp
ret
In fact, you don't need a stack frame for this function at all, though I understand if you want to include it for educational value. But if you omit it, the function can be reduced to
.text
.global addition
addition:
movl 4(%esp), %eax
addl 8(%esp), %eax
ret
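To make the movzbl point concrete, here is a C sketch of what the two load instructions compute on little-endian x86, where the byte stored at an int argument's address is its low byte. The function names are invented for illustration and are not part of the question's code.

```c
#include <stdint.h>

/* Model of the original asm: movzbl keeps only the low byte of each
   argument and zero-extends it before the addition. */
int addition_movzbl_model(int a, int b) {
    uint32_t low_a = (uint8_t)a;   /* movzbl 8(%ebp), ...  */
    uint32_t low_b = (uint8_t)b;   /* movzbl 12(%ebp), ... */
    return (int)(low_a + low_b);
}

/* Model of the corrected asm: movl/addl use all 32 bits. */
int addition_movl_model(int a, int b) {
    return a + b;
}
```

With the arguments 3 and 4 both models return 7, which is why the bug is easy to miss. With 300 and 4 the movzbl model returns 48, because the low byte of 300 is 44, while the movl model returns 304.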
You are corrupting the stack here:
movb %al, 4(%ebp)
To return the value, simply put it in eax. Also, why do you need to clear eax? That's inefficient, as you can load the first value directly into eax and then add to it.
Also EBX must be saved if you intend to use it, but you don't really need it anyway.

Size of local variable in assembly

I have the following C function:
void function(int a) {
char buffer[1];
}
It produces the following assembly code (gcc with no optimization, 64-bit machine):
function:
pushq %rbp
movq %rsp, %rbp
movl %edi, -20(%rbp)
nop
popq %rbp
ret
Questions:
Why does buffer occupy 20 bytes?
If I declare char buffer instead of char buffer[1], the offset is 4 bytes, but I expected to see 8, since the machine is 64-bit and I thought it would use a qword (64 bits).
Thanks in advance, and sorry if the question is a duplicate; I was not able to find the answer.
movl %edi, -20(%rbp) is spilling the function arg from a register into the red-zone below the stack pointer. It's 4 bytes long, leaving 16 bytes of space above it below RSP.
gcc's -O0 (naive anti-optimized) code-gen for your function doesn't actually touch the memory it reserved for buffer[], so you don't know where it is.
You can't infer that buffer[] is using up all 16 bytes above a in the red zone, just that gcc did a bad job of packing locals efficiently (because you compiled with -O0 so it didn't even try). But it's definitely not 20 because there isn't that much space left. Unless it put buffer[] below a, somewhere else in the rest of the 128-byte red-zone. (Hint: it didn't.)
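The offsets can be sanity-checked with a little arithmetic. This sketch assumes the 128-byte red zone of the x86-64 System V ABI and that RBP equals RSP after the prologue (true for the listing above, which has no sub); in_red_zone, bytes_above_slot and RED_ZONE_BYTES are names made up here.

```c
#define RED_ZONE_BYTES 128

/* `offset` is the (negative) displacement from RSP of the slot's lowest byte,
   `size` is the slot size in bytes.
   Returns 1 if the slot lies entirely inside the red zone, else 0. */
int in_red_zone(int offset, int size) {
    if (offset >= 0 || size <= 0)
        return 0;                      /* must start below RSP */
    if (offset < -RED_ZONE_BYTES)
        return 0;                      /* starts beyond the red zone */
    return offset + size <= 0;         /* must not reach RSP or above */
}

/* Bytes between the top of the slot and RSP. */
int bytes_above_slot(int offset, int size) {
    return -(offset + size);
}
```

For the spill at -20(%rbp), a 4-byte slot, bytes_above_slot gives 16, matching the 16 bytes mentioned above, and the slot is well inside the red zone.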
If we add an initializer for the array, we can see where it actually stores the byte.
void function(int a) {
volatile char buffer[1] = {'x'};
}
compiled by gcc8.2 -xc -O0 -fverbose-asm -Wall on the Godbolt compiler explorer:
function:
pushq %rbp
movq %rsp, %rbp # function prologue, creating a traditional stack frame
movl %edi, -20(%rbp) # a, a
movb $120, -1(%rbp) #, buffer
nop # totally useless, IDK what this is for
popq %rbp # tear down the stack frame
ret
So buffer[] is in fact one byte long, right below the saved RBP value.
The x86-64 System V ABI requires 16-byte alignment for automatic storage arrays that are at least 16 bytes long, but that's not the case here so that rule doesn't apply.
I don't know why gcc leaves extra padding before the spilled register arg; gcc often has that kind of missed optimization. It's not giving a any special alignment.
If you add extra local arrays, they will fill up that 16 bytes above the spilled arg, still spilling it to -20(%rbp). (See function2 in the Godbolt link)
I also included clang -O0, and icc -O3 and MSVC optimized output, in the Godbolt link. Fun fact: ICC chooses to optimize away volatile char buffer[1] = {'x'}; without actually storing to memory, but MSVC allocates it in the shadow space. (Windows x64 uses a different calling convention, and has 32B shadow space above the return address instead of a 128B red zone below the stack pointer.)
clang/LLVM -O0 chooses to spill a right below RSP, and put the array 1 byte below that.
With just char buffer instead of char buffer[1]
We get movl %edi, -4(%rbp) # a, a from gcc -O0. It apparently optimizes away the unused and uninitialized local variable entirely, and spills a right below the saved RBP. (I didn't run it under GDB or look at the debug info to see what &buffer would give us.)
So again, you're mixing up a with buffer.
If we initialize it with char buffer = 'x', we're back to the old stack layout, with buffer at -1(%rbp).
Or even if we just make it volatile char buffer; without an initializer, then space for it exists on the stack and a is spilled to -20(%rbp) even with no store done to buffer.
4 bytes for the aligned char, 8 bytes for the pushed rbp, 8 bytes for a = 20. The start address of a is the current stack pointer minus 20.

Motivation for useless prologue in gcc-compiled main(), disabling it?

Given the following minimal test case:
void exit(int);
int main() {
exit(0);
}
GCC 4.9 and later with 32-bit x86 target produces something like:
main:
leal 4(%esp), %ecx
andl $-16, %esp
pushl -4(%ecx)
pushl %ebp
movl %esp, %ebp
pushl %ecx
subl $4, %esp
subl $12, %esp
pushl $0
call exit
Note the convoluted stack-realignment code. With the function renamed to anything but main, however, it gives the (much more reasonable):
xmain:
pushl %ebp
movl %esp, %ebp
subl $8, %esp
subl $12, %esp
pushl $0
call exit
The differences are even more pronounced with -O. As main nothing changes; renamed, it yields:
xmain:
subl $24, %esp
pushl $0
call exit
The above was noticed in answering this question:
How do i get rid of call __x86.get_pc_thunk.ax
Is this behavior (and its motivation) documented anywhere, and is there any way to suppress it? GCC has x86 target-specific options to set the preferred/assumed incoming and outgoing stack alignment and enable/disable realignment for arbitrary functions, but they don't seem to be honored for main.
This answer is based on source diving. I do not know what the developers' intentions or motivations were. All of the code involved seems to date to 2008ish, which is after my own time working on GCC, but long enough ago that people's memories have probably gotten fuzzy. (GCC 4.9 was released in 2014; did you go back any farther than that? If I'm right about when this code was introduced, the clumsy stack alignment for main should start happening in version 4.4.)
GCC's x86 back end appears to have been coded to make extra-conservative assumptions about the stack alignment on entry to main, regardless of command-line options. The function ix86_minimum_incoming_stack_boundary is called to compute the expected stack alignment on entry for each function, and the last thing it does ...
/* Stack at entrance of main is aligned by runtime. We use the
   smallest incoming stack boundary. */
if (incoming_stack_boundary > MAIN_STACK_BOUNDARY
    && DECL_NAME (current_function_decl)
    && MAIN_NAME_P (DECL_NAME (current_function_decl))
    && DECL_FILE_SCOPE_P (current_function_decl))
  incoming_stack_boundary = MAIN_STACK_BOUNDARY;

return incoming_stack_boundary;
... is override the expected stack alignment to a conservative constant, MAIN_STACK_BOUNDARY, if the function being compiled is main. MAIN_STACK_BOUNDARY is 128 (bits) when compiling 64-bit code and 32 when compiling 32-bit code. As far as I can tell, there is no command-line knob that will make it expect the stack to be more aligned than that on entry to main. I can persuade it to skip stack alignment for main by telling it that no additional alignment is needed: compiling your test program with -m32 -mpreferred-stack-boundary=2 gives me
main:
pushl $0
call exit
with GCC 7.3.
The write-only manipulations of %ecx appear to be a missed-optimization bug. They are coming from this part of ix86_expand_prologue:
/* Grab the argument pointer. */
t = plus_constant (Pmode, stack_pointer_rtx, m->fs.sp_offset);
insn = emit_insn (gen_rtx_SET (crtl->drap_reg, t));
RTX_FRAME_RELATED_P (insn) = 1;
m->fs.cfa_reg = crtl->drap_reg;
m->fs.cfa_offset = 0;

/* Align the stack. */
insn = emit_insn (ix86_gen_andsp (stack_pointer_rtx,
                                  stack_pointer_rtx,
                                  GEN_INT (-align_bytes)));
RTX_FRAME_RELATED_P (insn) = 1;
The intention is to save a pointer to the incoming argument area before realigning the stack, so that it is straightforward to access arguments. Either because this happens fairly late in the pipeline (after register allocation), or because the instructions are marked FRAME_RELATED, nothing manages to delete those instructions again when they turn out to be unnecessary.
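The first two instructions of main's prologue can be written out as plain arithmetic. The sketch below models 32-bit register values with uint32_t; struct realigned and realign_for_main are hypothetical names, and any address passed in is just an example value.

```c
#include <stdint.h>

/* Register values after the first two prologue instructions. */
struct realigned {
    uint32_t arg_ptr;   /* %ecx: points at the incoming argument area */
    uint32_t sp;        /* %esp: rounded down to a 16-byte boundary   */
};

struct realigned realign_for_main(uint32_t esp_on_entry) {
    struct realigned r;
    r.arg_ptr = esp_on_entry + 4u;            /* leal 4(%esp), %ecx */
    r.sp = esp_on_entry & (uint32_t)-16;      /* andl $-16, %esp    */
    return r;
}
```

Masking with -16 (0xfffffff0) clears the low four bits, so the stack pointer moves down to the nearest 16-byte boundary, while the saved pointer still locates the arguments at their original address.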
I imagine the GCC devs would at least listen to a bug report about this, but they might reasonably consider it low priority, because these are instructions that are executed only once in the lifetime of the whole program, they're only actually dead when main doesn't use its arguments, and they only happen in the traditional 32-bit ABI, which I have the impression is considered a second-class target nowadays.
main:
leal 4(%esp), %ecx
andl $-16, %esp
pushl -4(%ecx)
pushl %ebp
movl %esp, %ebp
pushl %ecx
subl $4, %esp
The above section replicates the invoking stack frame, which since you haven’t defined any arguments to main() consists of just the return address -4(%ecx) and frame pointer, into a 16-byte-aligned stack; thus my WAG is that this is to accommodate runtimes (crt0.s) that do not align the stack properly.
The push %ebp was a bit of a giveaway -- it establishes a consistent looking backtrace through crt0.s despite this trampoline.
This is just a ‘normal’ call of exit, with the stack properly aligned...
subl $12, %esp
pushl $0
call exit

pushing and changing of %esp frame pointer

I have a small program, written in C, echo():
/* Read input line and write it back */
void echo() {
char buf[8]; /* Way too small! */
gets(buf);
puts(buf);
}
The corresponding assembly code:
1 echo:
2 pushl %ebp //Save %ebp on stack
3 movl %esp, %ebp
4 pushl %ebx //Save %ebx
5 subl $20, %esp //Allocate 20 bytes on stack
6 leal -12(%ebp), %ebx //Compute buf as %ebp-12
7 movl %ebx, (%esp) //Store buf at top of stack
8 call gets //Call gets
9 movl %ebx, (%esp) //Store buf at top of stack
10 call puts //Call puts
11 addl $20, %esp //Deallocate stack space
12 popl %ebx //Restore %ebx
13 popl %ebp //Restore %ebp
14 ret //Return
I have a few questions.
Why does the %esp allocate 20 bytes? The buf is 8 bytes, why the extra 12?
The return address is right above where we pushed %ebp right? (Assuming we draw the stack upside down, where it grows downward) What is the purpose of the old %ebp (which the current %ebp is pointing at, as a result of line 3)?
If I want to change the return address (by inputting anything more than 12 bytes), it would change where echo() returns to. What is the consequence of changing the old %ebp (aka 4 bytes before the return address)? Is there any possibility of changing the return address or where echo returns to by just changing the old %ebp?
What is the purpose of the %ebp? I know it's the frame pointer, but what is that?
Is it ever possible for the compiler to put the buffer somewhere that is not right next to where the old %ebp is stored? Like if we declare buf[8] but it stores it at -16(%ebp) instead of -12(%ebp) on line 6?
*c code and assembly copied from Computer Systems - A programmer's Perspective 2nd ed.
** Using gets() because doing buffer overflows
The reason 20 bytes are allocated is for the purpose of stack alignment. GCC 4.5+ generates code that ensures that the callee's local stack space is aligned to a 16-byte boundary, in order to ensure that compiled code can do aligned SSE loads and stores on the stack in a well-defined manner. For that reason, the compiler in this case needs to throw away some stack-space in order to ensure that gets/puts get a properly aligned frame.
In essence, this is how the stack will look, where each line is a 4-byte word except for --- lines that denote 16-byte address boundaries:
...
Saved EIP from caller
Saved EBP
---
Saved EBX # This is where echo's frame starts
buf
buf
Unused
---
Unused
Parameter to gets/puts
Saved EIP
Saved EBP
---
... # This is where gets'/puts' frame starts
As you can hopefully see from my fantastic ASCII graphics, if it weren't for the "unused" portions, gets/puts would get an unaligned frame. Do note also, however, that not all 12 extra bytes are unused; 4 of them are reserved for the parameter.
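The arithmetic behind the diagram can be checked with a short sketch. It assumes 4-byte pushes on 32-bit x86 and counts everything echo puts on the stack between being called and making its own calls; bytes_below_caller and preserves_16_byte_alignment are names invented for this example.

```c
/* Bytes the stack pointer has moved between the caller's `call` instruction
   and the callee's own `call`: return address + pushed registers + `sub` amount. */
unsigned bytes_below_caller(unsigned pushed_regs, unsigned sub_amount) {
    return 4u + 4u * pushed_regs + sub_amount;
}

/* Whatever 16-byte alignment held at the first call also holds at the second
   if the distance between them is a multiple of 16. */
int preserves_16_byte_alignment(unsigned pushed_regs, unsigned sub_amount) {
    return bytes_below_caller(pushed_regs, sub_amount) % 16u == 0;
}
```

For echo, two pushed registers (%ebp and %ebx) and subl $20 give 4 + 8 + 20 = 32. Reserving only 12 bytes (8 for buf plus 4 for the parameter) would give 24, which is not a multiple of 16.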
Is it ever possible for the compiler to put the buffer somewhere that is not right next to where the old %ebp is stored? Like if we declare buf[8] but it stores it at -16(%ebp) instead of -12(%ebp) on line 6?
Certainly. The compiler is free to organize the stack however it feels like. In order to do buffer overflows predictably, you have to be looking at a specific compiled binary of a program.
As for what the purpose of EBP is (and thus to answer your questions 2, 3 and 5), please see any introductory text to how the call stack is organized, such as the Wikipedia article.

memory layout hack

I have been following this course on YouTube, and it was talking about how some programmers can use their knowledge of how memory is laid out to do clever things.
One of the examples in the lecture was something like this:
#include <stdio.h>
void makeArray();
void printArray();
int main(){
makeArray();
printArray();
return 0;
}
void makeArray(){
int array[10];
int i;
for(i=0;i<10;i++)
array[i]=i;
}
void printArray(){
int array[10];
int i;
for(i=0;i<10;i++)
printf("%d\n",array[i]);
}
The idea is that as long as the two functions have the same activation record size on the stack segment, it will work and print the numbers from 0 to 9. But it actually prints something like this:
134520820
-1079626712
0
1
2
3
4
5
6
7
There are always those two values at the beginning. Can anyone explain that?
I am using gcc on Linux.
The exact lecture URL, starting at 5:15
I'm sorry but there's absolutely nothing clever about that piece of code and people who use it are very foolish.
Addendum:
Or, sometimes, just sometimes, very clever. Having watched the video linked to in the question update, this wasn't some rogue code monkey breaking the rules. This guy understood what he was doing quite well.
It requires a deep understanding of the underlying code generated and can easily break (as mentioned and seen here) if your environment changes (like compilers, architectures and so on).
But, provided you have that knowledge, you can probably get away with it. It's not something I'd suggest to anyone other than a veteran but I can see it having its place in very limited situations and, to be honest, I've no doubt occasionally been somewhat more ... pragmatic ... than I should have been in my own career :-)
Now back to your regular programming ...
It's non-portable between architectures, compilers, releases of compilers, and probably even optimisation levels within the same release of a compiler, as well as being undefined behaviour (reading uninitialised variables).
Your best bet if you want to understand it is to examine the assembler code output by the compiler.
But your best bet overall is to just forget about it and code to the standard.
For example, this transcript shows how gcc can have different behaviour at different optimisation levels:
pax> gcc -o qq qq.c ; ./qq
0
1
2
3
4
5
6
7
8
9
pax> gcc -O3 -o qq qq.c ; ./qq
1628373048
1629343944
1629097166
2280872
2281480
0
0
0
1629542238
1629542245
At gcc's high optimisation level (what I like to call its insane optimisation level), this is the makeArray function. It's basically figured out that the array is not used and therefore optimised its initialisation out of existence.
_makeArray:
pushl %ebp ; stack frame setup
movl %esp, %ebp
; heavily optimised function
popl %ebp ; stack frame tear-down
ret ; and return
I'm actually slightly surprised that gcc even left the function stub in there at all.
Update: as Nicholas Knight points out in a comment, the function remains since it must be visible to the linker - making the function static results in gcc removing the stub as well.
If you check the assembler code at optimisation level 0 below, it gives a clue (it's not the actual reason - see below). Examine the following code and you'll see that the stack frame setup is different for the two functions despite the fact that they have exactly the same parameters passed in and the same local variables:
subl $48, %esp ; in makeArray
subl $56, %esp ; in printArray
This is because printArray allocates some extra space to store the address of the printf format string and the address of the array element, four bytes each, which accounts for the eight bytes (two 32-bit values) difference.
That's the most likely explanation for your array in printArray() being off by two values.
Here's the two functions at optimisation level 0 for your enjoyment :-)
_makeArray:
pushl %ebp ; stack frame setup
movl %esp, %ebp
subl $48, %esp
movl $0, -4(%ebp) ; i = 0
jmp L4 ; start loop
L5:
movl -4(%ebp), %edx
movl -4(%ebp), %eax
movl %eax, -44(%ebp,%edx,4) ; array[i] = i
addl $1, -4(%ebp) ; i++
L4:
cmpl $9, -4(%ebp) ; for all i up to and including 9
jle L5 ; continue loop
leave
ret
.section .rdata,"dr"
LC0:
.ascii "%d\12\0" ; format string for printf
.text
_printArray:
pushl %ebp ; stack frame setup
movl %esp, %ebp
subl $56, %esp
movl $0, -4(%ebp) ; i = 0
jmp L8 ; start loop
L9:
movl -4(%ebp), %eax ; get i
movl -44(%ebp,%eax,4), %eax ; get array[i]
movl %eax, 4(%esp) ; store array[i] for printf
movl $LC0, (%esp) ; store format string
call _printf ; make the call
addl $1, -4(%ebp) ; i++
L8:
cmpl $9, -4(%ebp) ; for all i up to and including 9
jle L9 ; continue loop
leave
ret
Update: As Roddy points out in a comment, that's not the cause of your specific problem since, in this case, the array is actually at the same position in memory (%ebp-44 with %ebp being the same across the two calls). What I was trying to point out was that two functions with the same argument list and same local parameters did not necessarily end up with the same stack frame layout.
All it would take would be for printArray to swap the location of its local variables (including any temporaries not explicitly created by the developer) around and you would have this problem.
Probably GCC generates code that does not push the arguments to the stack when calling a function, instead it allocates extra space in the stack. The arguments to your 'printf' function call, "%d\n" and array[i] take 8 bytes on the stack, the first argument is a pointer and the second is an integer. This explains why there are two integers that are not printed correctly.
Never, ever, ever, ever, ever, ever do anything like this. It will not work reliably. You will get odd bugs. It is far from portable.
Ways it can fail:
.1. The compiler adds extra, hidden code
DevStudio, in debug mode, adds calls to functions that check the stack to catch stack errors. These calls will overwrite what was on the stack, thus losing your data.
.2. Someone adds an Enter/Exit call
Some compilers allow the programmer to define functions to be called on function entry and function exit. Like (1) these use stack space and will overwrite what's already there, losing data.
.3. Interrupts
In main(), if you get an interrupt between the calls to makeArray and printArray, you will lose your data. The first thing that happens when processing an interrupt is to save the state of the cpu. This usually involves pushing the CPU registers and flags onto the stack, and yes, you guessed it, overwrite your data.
.4. Compilers are clever
As you've seen, the array in makeArray is at a different address to the one in printArray. The compiler has placed its local variables in different positions on the stack. It uses a complex algorithm to decide where to put variables (on the stack, in a register, etc.) and it's really not worth trying to figure out how the compiler does it, as the next version of the compiler might do it some other way.
To sum up, these kinds of 'clever tricks' aren't tricks and are certainly not clever. You would not lose anything by declaring the array in main and passing a reference/pointer to it in the two functions. Stacks are for storing local variables and function return addresses. Once your data goes out of scope (i.e. the stack top shrinks past the data) then the data is effectively lost: anything can happen to it.
To illustrate this point more, your results would probably be different if you had different function names (I'm just guessing here, OK).
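The alternative suggested in this answer, declaring the array in the caller and passing a pointer to both functions, might look like the sketch below. The parameter lists and ARRAY_LEN are additions for this example, not part of the lecture's code.

```c
#include <stdio.h>

#define ARRAY_LEN 10

/* Fills the caller-owned array with 0..len-1. */
void makeArray(int *array, int len) {
    for (int i = 0; i < len; i++)
        array[i] = i;
}

/* Prints each element of the caller-owned array on its own line. */
void printArray(const int *array, int len) {
    for (int i = 0; i < len; i++)
        printf("%d\n", array[i]);
}
```

In main, declare int array[ARRAY_LEN]; and call makeArray(array, ARRAY_LEN) followed by printArray(array, ARRAY_LEN). The data then lives in main's frame during both calls, so the output no longer depends on how the compiler lays out either callee's stack.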

Resources