Segmentation fault when byte coding a function? [duplicate] - c

I get a segmentation fault when I run the following C program (compiled with gcc in Ubuntu).
#include <stdio.h>
char f[] = "\x55\x48\x89\xe5\x48\x89\x7d\xf8\x48\x89\x75\xf0\x48\x8b\x45\xf8\x8b\x10\x48\x8b\x45\xf0\x8b\x00\x89\xd1\x29\xc1\x89\xc8\xc9\xc3";
int main()
{
int (*func)();
func = (int (*)()) f;
int x=3,y=5;
printf("%d\n",(int)(*func)(&x,&y));
return 0;
}
The string f contains the machine code of the following function.
int f(int*a, int*b)
{
return *a-*b;
}
c.f.:
f.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <f>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: 48 89 7d f8 mov %rdi,-0x8(%rbp)
8: 48 89 75 f0 mov %rsi,-0x10(%rbp)
c: 48 8b 45 f8 mov -0x8(%rbp),%rax
10: 8b 10 mov (%rax),%edx
12: 48 8b 45 f0 mov -0x10(%rbp),%rax
16: 8b 00 mov (%rax),%eax
18: 89 d1 mov %edx,%ecx
1a: 29 c1 sub %eax,%ecx
1c: 89 c8 mov %ecx,%eax
1e: c9 leaveq
1f: c3 retq
This is compiled using:
gcc test.c -Wall -Werror
./a.out
Segmentation fault
The expected output is -2 - how can I get it to work?

Apparently the suggestion below no longer works with current GCC, because the array data now ends up in a separate read-only ELF segment that is not executable.
I'll leave it here for historical reasons.
Interestingly, neither the compiler nor the linker complains when you cast the array char f[] = "..."; to a function pointer and call it. There is a symbol f linked into the executable, but it names a variable, not a function, so its bytes live in a data section. Calling into them fails, most likely because the system maps data pages as non-executable (an execution prevention mechanism).
To circumvent this, you apparently just need to get the string into the text segment of the process memory. You can achieve this by declaring the string as const char f[].
From Smashing The Stack For Fun And Profit, by Aleph One:
The text region is fixed by the program and includes code (instructions)
and read-only data. This region corresponds to the text section of the
executable file.
As the const char[] is read-only, the compiler puts it together with the code into the text region. Thereby the execution prevention mechanism is circumvented and the machine is able to execute the machine code therein.
Example:
/* test.c */
#include <stdio.h>
const char f[] = "\x55\x48\x89\xe5\x48\x89\x7d\xf8\x48\x89\x75\xf0\x48\x8b\x45\xf8\x8b\x10\x48\x8b\x45\xf0\x8b\x00\x89\xd1\x29\xc1\x89\xc8\xc9\xc3";
int main()
{
int (*func)();
func = (int (*)()) f;
int x=3,y=5;
printf("%d\n",(int)(*func)(&x,&y));
return 0;
}
yields:
$ gcc test.c -Wall && ./a.out
-2
(Fedora 16, gcc 4.6.3)

If I understand you correctly you're trying to run simple code that is not in text space, but instead is in your initialized static storage? If this is failing then there can only be three reasons: either your code was initialized incorrectly (unlikely in this simple case), your data space has been stepped on (doesn't look like it in this simple case), or your system is preventing it as a security measure (quite likely since what you're trying to do is fairly atypical, primarily used for buffer overflow exploitation).

Related

error try to create function by malloc [duplicate]

I am trying to create a function in an unusual way, and I believe there is a way to do it.
I want to create a function fn() that returns 1:
int fn()
{
return 1;
}
Then I compiled it without a main and disassembled it:
gcc -Wall -c fn.c
objdump -d ./fn.o
The result is:
./fn.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <fn>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: b8 01 00 00 00 mov $0x1,%eax
9: 5d pop %rbp
a: c3 retq
Then I wrote my program:
#include <stdio.h>
#include <stdlib.h>
union datas{
char * v;
int (*d)();
}ptr;
int main()
{
int (*f0)();
ptr.v=(char *)malloc(11);
ptr.v[0]=0x55;
ptr.v[1]=0x48;
ptr.v[2]=0x89;
ptr.v[3]=0xe5;
ptr.v[4]=0xb8;
ptr.v[5]=0x01;
ptr.v[6]=0x00;
ptr.v[7]=0x00;
ptr.v[8]=0x00;
ptr.v[9]=0x5d;
ptr.v[10]=0xc3;
printf("ok1\n");//check
f0=ptr.d;
printf("ok2\n");//check
printf("fn=%d\n",f0());
printf("ok3\n");//check
return 0;
}
But the result is:
ok1
ok2
Segmentation fault (core dumped)
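It won't work as written, but the union is not the main problem. Memory returned by malloc is ordinary data memory, and on current systems such pages are normally not mapped executable, so jumping into them raises a segmentation fault. Separately, standard C does not define conversions between object pointers and function pointers, so code like this is inherently platform-specific.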

what is stack smashing (C)?

Code:
int str_join(char *a, const char *b) {
int sz =0;
while(*a++) sz++;
char *st = a -1, c;
*st = (char) 32;
while((c = *b++)) *++st = c;
*++st = 0;
return sz;
}
....
char a[] = "StringA";
printf("string-1 length = %d, String a = %s\n", str_join(&a[0],"StringB"), a);
Output:
string-1 length = 7, char *a = StringA StringB
*** stack smashing detected **** : /T02 terminated
Aborted (core dumped)
I don't understand why it reports stack smashing. What is stack smashing? Or is it my compiler's error?
Well, stack smashing or stack buffer overflow is a rather detailed topic to be discussed here, you can refer to this wiki article for more info.
Coming to the code shown here, the problem is, your array a is not large enough to hold the final concatenated result.
Thereby, by saying
while((c = *b++)) *++st = c;
you're essentially writing to out-of-bounds memory, which invokes undefined behavior. This is why you get the "stack smashing" error: the writes run past the end of a and overwrite adjacent memory on the stack that does not belong to the array.
To solve this, you need to make sure that array a contains enough space to hold both the first and second string concatenated together. You have to provide a larger destination array, in short.
Stack smashing means you've written outside of ("smashed" past/through) the function's storage space for local variables (this area is called the "stack", in most systems and programming languages). You may also see this type of error called a "stack buffer overflow" (writing past the end of a buffer) or "underflow" (writing before its start).
In your code, C is probably putting the string pointed to by a on the stack. In your case, the place that causes the stack "smash" is when you increment st beyond the original a pointer and write to where it points, you're writing outside the area the C compiler guarantees to have reserved for the original string assigned into a.
Whenever you write outside an area of memory that is already properly "reserved" in C, that's "undefined behavior" (which just means that the C language/standard doesn't say what happens): usually, you end up overwriting something else in your program's memory (programs typically put other information right next to your variables on the stack, like return addresses and other internal details), or your program tries writing outside of the memory the operating system has "allowed" it to use. Either way, the program typically breaks, sometimes immediately and obviously (for example, with a "segmentation fault" error), sometimes in very hidden ways that don't become obvious until way later.
In this case, your compiler is building your program with special protections to detect this problem, and so your program exits with an error message. If the compiler didn't do that, your program would try to continue to run, except it might end up doing the wrong thing and/or crashing.
The solution comes down to explicitly telling your code to have enough memory for your combined string. You can do this by explicitly specifying the length of the "a" array to be long enough for both strings, but that's usually only sufficient for simple uses where you know in advance how much space you need. For a general-purpose solution, you'd use a function like malloc to get a pointer to a new chunk of memory that has the size you need once you've calculated what the full size is going to be (just remember to call free on pointers that you get from malloc and similar functions once you're done with them).
Minimal reproduction example with disassembly analysis
main.c
void myfunc(char *const src, int len) {
int i;
for (i = 0; i < len; ++i) {
src[i] = 42;
}
}
int main(void) {
char arr[] = {'a', 'b', 'c', 'd'};
int len = sizeof(arr);
myfunc(arr, len + 1);
return 0;
}
Compile and run:
gcc -fstack-protector-all -g -O0 -std=c99 main.c
ulimit -c unlimited && rm -f core
./a.out
fails as desired:
*** stack smashing detected ***: terminated
Aborted (core dumped)
Tested on Ubuntu 20.04, GCC 10.2.0.
On Ubuntu 16.04, GCC 6.4.0, I could reproduce with -fstack-protector instead of -fstack-protector-all, but it stopped blowing up when I tested on GCC 10.2.0 as per Geng Jiawen's comment. man gcc clarifies that as suggested by the option name, the -all version adds checks more aggressively, and therefore presumably incurs a larger performance loss:
-fstack-protector
Emit extra code to check for buffer overflows, such as stack smashing attacks. This is done by adding a guard variable to functions with vulnerable objects. This includes functions that call "alloca", and functions with buffers larger than or equal to 8 bytes. The guards are initialized when a function is entered and then checked when the function exits. If a guard check fails, an error message is printed and the program exits. Only variables that are actually allocated on the stack are considered, optimized away variables or variables allocated in registers don't count.
-fstack-protector-all
Like -fstack-protector except that all functions are protected.
Disassembly
Now we look at the disassembly:
objdump -D a.out
which contains:
int main (void){
400579: 55 push %rbp
40057a: 48 89 e5 mov %rsp,%rbp
# Allocate 0x10 of stack space.
40057d: 48 83 ec 10 sub $0x10,%rsp
# Put the 8 byte canary from %fs:0x28 to -0x8(%rbp),
# which is right at the bottom of the stack.
400581: 64 48 8b 04 25 28 00 mov %fs:0x28,%rax
400588: 00 00
40058a: 48 89 45 f8 mov %rax,-0x8(%rbp)
40058e: 31 c0 xor %eax,%eax
char arr[] = {'a', 'b', 'c', 'd'};
400590: c6 45 f4 61 movb $0x61,-0xc(%rbp)
400594: c6 45 f5 62 movb $0x62,-0xb(%rbp)
400598: c6 45 f6 63 movb $0x63,-0xa(%rbp)
40059c: c6 45 f7 64 movb $0x64,-0x9(%rbp)
int len = sizeof(arr);
4005a0: c7 45 f0 04 00 00 00 movl $0x4,-0x10(%rbp)
myfunc(arr, len + 1);
4005a7: 8b 45 f0 mov -0x10(%rbp),%eax
4005aa: 8d 50 01 lea 0x1(%rax),%edx
4005ad: 48 8d 45 f4 lea -0xc(%rbp),%rax
4005b1: 89 d6 mov %edx,%esi
4005b3: 48 89 c7 mov %rax,%rdi
4005b6: e8 8b ff ff ff callq 400546 <myfunc>
return 0;
4005bb: b8 00 00 00 00 mov $0x0,%eax
}
# Check that the canary at -0x8(%rbp) hasn't changed after calling myfunc.
# If it has, jump to the failure point __stack_chk_fail.
4005c0: 48 8b 4d f8 mov -0x8(%rbp),%rcx
4005c4: 64 48 33 0c 25 28 00 xor %fs:0x28,%rcx
4005cb: 00 00
4005cd: 74 05 je 4005d4 <main+0x5b>
4005cf: e8 4c fe ff ff callq 400420 <__stack_chk_fail@plt>
# Otherwise, exit normally.
4005d4: c9 leaveq
4005d5: c3 retq
4005d6: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
4005dd: 00 00 00
Notice the handy comments automatically added by objdump's artificial intelligence module.
If you run this program multiple times through GDB, you will see that:
the canary gets a different random value every time
the last loop of myfunc is exactly what modifies the address of the canary
The canary is randomized by setting it from %fs:0x28, which contains a random value as explained at:
https://unix.stackexchange.com/questions/453749/what-sets-fs0x28-stack-canary
Why does this memory address %fs:0x28 ( fs[0x28] ) have a random value?
How to debug it?
See: Stack smashing detected

Segmentation fault when calling a function located in the heap

I'm trying to bend the rules a little bit here: malloc a buffer,
then copy a function into the buffer.
Calling the buffered function works, but the function throws a segmentation fault when it tries to call another function.
Any thoughts why?
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>
#include <stdlib.h>
int foo(int x)
{
printf("%d\n", x);
}
int bar(int x)
{
}
int main()
{
int foo_size = bar - foo;
void* buf_ptr;
buf_ptr = malloc(1024);
memcpy(buf_ptr, foo, foo_size);
mprotect((void*)(((int)buf_ptr) & ~(sysconf(_SC_PAGE_SIZE) - 1)),
sysconf(_SC_PAGE_SIZE),
PROT_READ|PROT_WRITE|PROT_EXEC);
int (*ptr)(int) = buf_ptr;
printf("%d\n", ptr(3));
return 0;
}
This code will throw a segfault unless I change the foo function to:
int foo(int x)
{
//Anything but calling another function.
x = 4;
return x;
}
NOTE:
The code successfully copies foo into the buffer. I know I made some assumptions, but on my platform they're OK.
Your code is not position independent, and even if it were, you don't have the correct relocations to move it to an arbitrary position. Your call to printf (or any other function) will be done with pc-relative addressing (through the PLT, but that's beside the point here). This means that the instruction generated to call printf isn't a call to a static address but rather "call the function X bytes from the current instruction pointer". Since you moved the code, the call goes to a bad address. (I'm assuming i386 or amd64 here, but generally it's a safe assumption; people who are on unusual platforms usually mention that.)
More specifically, x86 has two different instructions for function calls. One is a call relative to the instruction pointer which determines the destination of the function call by adding a value to the current instruction pointer. This is the most commonly used function call. The second instruction is a call to a pointer inside a register or memory location. This is much less commonly used by compilers because it requires more memory indirections and stalls the pipeline. The way shared libraries are implemented (your call to printf will actually go to a shared library) is that for every function call you make outside of your own code the compiler will insert fake functions near your code (this is the PLT I mentioned above). Your code does a normal pc-relative call to this fake function and the fake function will find the real address to printf and call that. It doesn't really matter though. Almost any normal function call you make will be pc-relative and will fail. Your only hope in code like this are function pointers.
You might also run into some restrictions on executable mprotect. Check the return value of mprotect: on my system your code fails for one more reason, namely that mprotect doesn't allow me to do this, probably because the backend memory allocator of malloc has additional restrictions that prevent executable protections on its memory. Which leads me to the next point:
You will break things by calling mprotect on memory that isn't managed by you. That includes memory you got from malloc. You should only mprotect things you've gotten from the kernel yourself through mmap.
Here's a version that demonstrates how to make this work (on my system):
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>
#include <string.h>
#include <err.h>
int
foo(int x, int (*fn)(const char *, ...))
{
fn("%d\n", x);
return 42;
}
int
bar(int x)
{
return 0;
}
int
main(int argc, char **argv)
{
size_t foo_size = (char *)bar - (char *)foo;
int ps = getpagesize();
void *buf_ptr = mmap(NULL, ps, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_ANON|MAP_PRIVATE, -1, 0);
if (buf_ptr == MAP_FAILED)
err(1, "mmap");
memcpy(buf_ptr, foo, foo_size);
int (*ptr)(int, int (*)(const char *, ...)) = buf_ptr;
printf("%d\n", ptr(3, printf));
return 0;
}
Here, I abuse the knowledge of how the compiler will generate the code for the function call. By using a function pointer I force it to generate a call instruction that isn't pc-relative. Also, I manage the memory allocation myself so that we get the right permissions from start and not run into any restrictions that brk might have. As a bonus we do error handling that actually helped me find a bug in the first version of this experiment and I also corrected other minor bugs (like missing includes) which allowed me to enable warnings in the compiler and catch another potential problem.
If you want to dig deeper into this you can do something like this. I added two versions of the function:
int
oldfoo(int x)
{
printf("%d\n", x);
return 42;
}
int
foo(int x, int (*fn)(const char *, ...))
{
fn("%d\n", x);
return 42;
}
Compile the whole thing and disassemble it:
$ cc -Wall -o foo foo.c
$ objdump -S foo | less
We can now look at the two generated functions:
0000000000400680 <oldfoo>:
400680: 55 push %rbp
400681: 48 89 e5 mov %rsp,%rbp
400684: 48 83 ec 10 sub $0x10,%rsp
400688: 89 7d fc mov %edi,-0x4(%rbp)
40068b: 8b 45 fc mov -0x4(%rbp),%eax
40068e: 89 c6 mov %eax,%esi
400690: bf 30 08 40 00 mov $0x400830,%edi
400695: b8 00 00 00 00 mov $0x0,%eax
40069a: e8 91 fe ff ff callq 400530 <printf@plt>
40069f: b8 2a 00 00 00 mov $0x2a,%eax
4006a4: c9 leaveq
4006a5: c3 retq
00000000004006a6 <foo>:
4006a6: 55 push %rbp
4006a7: 48 89 e5 mov %rsp,%rbp
4006aa: 48 83 ec 10 sub $0x10,%rsp
4006ae: 89 7d fc mov %edi,-0x4(%rbp)
4006b1: 48 89 75 f0 mov %rsi,-0x10(%rbp)
4006b5: 8b 45 fc mov -0x4(%rbp),%eax
4006b8: 48 8b 55 f0 mov -0x10(%rbp),%rdx
4006bc: 89 c6 mov %eax,%esi
4006be: bf 30 08 40 00 mov $0x400830,%edi
4006c3: b8 00 00 00 00 mov $0x0,%eax
4006c8: ff d2 callq *%rdx
4006ca: b8 2a 00 00 00 mov $0x2a,%eax
4006cf: c9 leaveq
4006d0: c3 retq
The instruction for the function call in the printf case is "e8 91 fe ff ff". This is a pc-relative function call. 0xfffffe91 bytes in front of our instruction pointer. It's treated as a signed 32 bit value, and the instruction pointer used in the calculation is the address of the next instruction. So 0x40069f (next instruction) - 0x16f (0xfffffe91 in front is 0x16f bytes behind with signed math) gives us the address 0x400530, and looking at the disassembled code I find this at the address:
0000000000400530 <printf@plt>:
400530: ff 25 ea 0a 20 00 jmpq *0x200aea(%rip) # 601020 <_GLOBAL_OFFSET_TABLE_+0x20>
400536: 68 01 00 00 00 pushq $0x1
40053b: e9 d0 ff ff ff jmpq 400510 <_init+0x28>
This is the magic "fake function" I mentioned earlier. Let's not get into how this works. It's necessary for shared libraries to work and that's all we need to know for now.
The second function generates the function call instruction "ff d2". This means "call the function at the address stored inside the rdx register". No pc-relative addressing and that's why it works.
The compiler is free to generate the code the way it wants provided the observable results are correct (the as-if rule). So what you are doing simply invokes undefined behaviour.
Visual Studio sometimes uses relays. That means that the address of a function just points to a relative jump. That's perfectly allowed per the standard because of the as-if rule, but it would definitely break that kind of construction. Another possibility is to have local internal functions called with relative jumps but located outside of the function itself. In that case, your code would not copy them, and the relative calls will just point to random memory. That means that with different compilers (or even different compilation options on the same compiler) it could give the expected result, crash, or directly end the program without error, which is exactly UB.
I think I can explain a bit. First of all, if both your functions have no return statement, undefined behaviour is invoked as per standard §6.9.1/12 once the caller uses the return value. Secondly, and this is common on a lot of platforms, apparently including yours: relative addresses of functions are hardcoded into the binary code of functions. That means that if you have a call to "printf" within "foo" and then you move the code (and execute it) at another location, the address from which "printf" should be called turns bad.

Where does constant local variable array go in memory for a 'C' program

I am using GCC 4.8.1 and it doesn't seem to store const variable local to main in DATA segment. Below is code and memory map for 3 such programs:
Code 1:
int main(void)
{
//char a[10]="HELLO"; //1
//const char a[10] = "HELLO"; //2
return 0;
}
MEMORY MAP FOR ABOVE:
text data bss dec hex filename
7264 1688 1040 9992 2708 a.exe
CODE 2:
int main(void)
{
char a[10]="HELLO";
//const char a[10] = "HELLO";
return 0;
}
MEMORY MAP FOR 2:
text data bss dec hex filename
7280 1688 1040 10008 2718 a.exe
CODE 3:
int main(void)
{
//char a[10]="HELLO";
const char a[10] = "HELLO";
return 0;
}
MEMORY MAP FOR 3 :
text data bss dec hex filename
7280 1688 1040 10008 2718 a.exe
I do not see any difference in data segment between 3 codes. Can someone please explain this result to me.
Thanks in anticipation!
If your array is not used by your program, the compiler is allowed to simply optimize the object out.
From the C Standard:
(C99, 5.1.2.3p1) "The semantic descriptions in this International Standard describe the behavior of an abstract machine in which issues of optimization are irrelevant"
and
(C99, 5.1.2.3p3) "In the abstract machine, all expressions are evaluated as specified by the semantics. An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects are produced (including any caused by calling a function or accessing a volatile object)."
If you compile your program #3 with optimizations disabled (-O0), or depending on your compiler, the object can still be allocated. In your case it appears neither in the data section nor in the rodata section but in the text section, which is why the text section grows.
For example in your third example, in my compiler the resulting code with -O0 is (dumped using objdump -d):
00000000004004d0 <main>:
4004d0: 55 push %rbp
4004d1: 48 89 e5 mov %rsp,%rbp
4004d4: 48 b8 48 45 4c 4c 4f mov $0x4f4c4c4548,%rax
4004db: 00 00 00
4004de: 48 89 45 f0 mov %rax,-0x10(%rbp)
4004e2: 66 c7 45 f8 00 00 movw $0x0,-0x8(%rbp)
4004e8: b8 00 00 00 00 mov $0x0,%eax
4004ed: 5d pop %rbp
4004ee: c3 retq
4004ef: 90 nop
0x4f4c4c4548 holds the ASCII characters of your string; it is loaded into a register and then stored on the stack.
If I compile the same program with -O3, the output is simply:
00000000004004d0 <main>:
4004d0: 31 c0 xor %eax,%eax
4004d2: c3 retq
4004d3: 90 nop
and the string does not appear in data or rodata, it is simply optimized out.
This is what should happen:
Code 1: nothing is stored anywhere.
Code 2: a is stored on the stack. It is not stored in .data.
Code 3: a is either stored on the stack or in .rodata, depending on whether it is initialized with a constant expression or not. The optimizer might also decide to store it in .text (together with the code).
I do not see any difference in data segment between 3 codes.
That's because there should be no difference. .data is used for non-constant variables with static storage duration that are initialized to a value other than zero.

Why does my program not overflow the stack when I allocate a 11MB char array while the stack upper limit is 10MB?

I have two simple C++ programs and two questions here. I'm working in CentOS 5.2 and my dev environment is as follows:
g++ (GCC) 4.1.2 20080704 (Red Hat 4.1.2-50)
"ulimit -s" output: 10240 (kbytes), that is, 10MB
Program #1:
main.cpp:
int main(int argc, char * argv[])
{
char buf[1024*1024*11] = {0};
return 0;
}
(Compiled with "g++ -g main.cpp")
The program allocates 1024*1024*11 bytes(that is, 11MB) on the stack but it doesn't crash. After I change the allocation size to 1024*1024*12(that is, 12MB), the program crashes. I think this should be caused by a stack overflow. But Why does the program not crash when the allocation size is 11MB, which is also greater than the 10MB-upper-limit??
Program #2:
main.cpp:
#include <iostream>
int main(int argc, char * argv[])
{
char buf[1024*1024*11] = {0};
std::cout << "*** separation ***" << std::endl;
char buf2[1024*1024] = {0};
return 0;
}
(Compiled with "g++ -g main.cpp")
This program would result in a program crash because it allocates 12MB bytes on the stack. However, according to the core dump file(see below) the crash occurs on the buf but not buf2. Shouldn't the crash happen to buf2 because we know from program #1 that the allocation of char buf[1024*1024*11] is OK thus after we allocate another 1024*1024 bytes the stack would overflow?
I think there must be some quite fundamental concepts that I didn't build a solid understanding. But what are they??
Appendix: The core-dump info generated by program #2:
Core was generated by `./a.out'.
Program terminated with signal 11, Segmentation fault.
[New process 16433]
#0 0x08048715 in main () at main.cpp:5
5 char buf[1024*1024*11] = {0};
You're wrongly assuming that the stack allocations happen where they appear in your code. Whenever you have local variables whose sizes are known at compile time, space for them is allocated all at once when the function is entered. Only dynamically sized local variables (VLAs and alloca) are allocated later.
Furthermore the error happens as soon as you write to the memory, not when it's first allocated. Most likely buf is located before buf2 on the stack and the overflow thus happens in buf, not buf2.
To analyze this kind of mystery, it is always useful to look at the generated code. My guess is that your particular compiler version is doing something different, because mine segfaults with -O0, but not with -O1.
From your program #1, with g++ a.c -g -O0, and then objdump -S a.out
int main(int argc, char * argv[])
{
8048484: 55 push %ebp
8048485: 89 e5 mov %esp,%ebp
This is the standard stack frame. Nothing to see here.
8048487: 83 e4 f0 and $0xfffffff0,%esp
Align the stack to multiple of 16, just in case.
804848a: 81 ec 30 00 b0 00 sub $0xb00030,%esp
Allocate 0xB00030 bytes of stack space. That is 1024*1024*11 + 48 bytes. No access to the memory yet, so no exception. The extra 48 bytes are of internal use of the compiler.
8048490: 8b 45 0c mov 0xc(%ebp),%eax
8048493: 89 44 24 1c mov %eax,0x1c(%esp) <--- SEGFAULTS
The first time the stack is accessed is beyond the ulimit, so it segfaults.
8048497: 65 a1 14 00 00 00 mov %gs:0x14,%eax
This is the stack protector.
804849d: 89 84 24 2c 00 b0 00 mov %eax,0xb0002c(%esp)
80484a4: 31 c0 xor %eax,%eax
char buf[1024*1024*11] = {0};
80484a6: 8d 44 24 2c lea 0x2c(%esp),%eax
80484aa: ba 00 00 b0 00 mov $0xb00000,%edx
80484af: 89 54 24 08 mov %edx,0x8(%esp)
80484b3: c7 44 24 04 00 00 00 movl $0x0,0x4(%esp)
80484ba: 00
80484bb: 89 04 24 mov %eax,(%esp)
80484be: e8 d1 fe ff ff call 8048394 <memset@plt>
Initialize the array, calling memset
return 0;
80484c3: b8 00 00 00 00 mov $0x0,%eax
}
As you can see, the segfault happens when the internal variables are accessed, because they happen to be below the big array (they have to be, because there is the stack protector, to detect stack smashing).
If you compile with optimizations, the compiler notices that you do nothing useful with the array and optimizes it out, so there is no segfault.
Probably your version of GCC is a bit oversmart in non-optimization mode, and removes the array. We can analyze it further if you post the output of objdump -S a.out.
When defining local variables on the stack, there's no real allocation of memory like it is done in the heap. The stack memory allocation consists more simply in changing the address of the stack pointer (that is going to be used by called functions) to reserve the wanted memory.
I suspect that this operation of changing the stack pointer is done only once, at the beginning of the function, to reserve space for all the local variables used (as opposed to changing it once per local variable). This explains why the error in your program #2 occurs on the first allocation.
Both of your programs should ideally give segfault.
Normally whenever a function is entered, all the variables defined in it are allocated memory onto the stack. This said, it however also depends on the optimization level with which the code is compiled.
Optimization level zero, selected during compilation with -O0, means no optimization at all. It is also the default optimization level. The programs mentioned above segfault when compiled with -O0.
However, when you compile the programs using higher optimization levels, the compiler notices that the variable which is defined is not used in the function. It therefore removes the variable definition from the assembly language code. As a result, your programs won't give any segfault.
