On Linux I have code that uses an array declared inside the main function, with a size of 2 MB + 1 byte:
#include <stdio.h>
#include <stdlib.h>
#define MAX_DATA (2097152) /* 2MB */
int main(int argc, char *argv[])
{
/* Reserve 1 byte for null termination */
char data[MAX_DATA + 1];
printf("Bye\n");
return 0;
}
When I compile it on Linux with gcc, it runs without any problem. But on Windows I get a runtime error. At the moment I run it, I have 5 GB of free memory.
To solve the problem on Windows, I have to specify a different stack size:
gcc -Wl,--stack,2097153 -o test.exe test.c
or declare the data array outside the main function.
Is this because the program compiled on Linux was linked without changing the stack size?
Why does it run OK on Linux but fail on Windows?
I use the same source code and the same gcc command:
gcc -Wall -O source.c -o source
I think the malloc implementation on Linux is not reliable, because it can return a non-null pointer even if the memory is not actually available.
Is it possible that the program running on Linux, which was linked without changing the stack size, does not fail at runtime (unlike on Windows) because it is silently ignoring a stack problem?
Also, why does it work on Windows if I declare the array outside the main function? If that puts it on the heap, why don't I need to free it?
Why does it run OK on Linux but fail on Windows?
Because the default stack size for a process or thread is system-dependent:
On Windows, the default stack reservation size used by the linker is 1 MB.
On Linux/Unix, the maximum stack size can be configured through the ulimit command. In addition, you can configure the stack size when creating a new thread.
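For the thread case, here is a minimal sketch (assuming POSIX threads; the 8 MB figure is just an illustration), compiled with gcc -pthread:
#include <pthread.h>
#include <stdio.h>

static void *worker(void *arg)
{
    (void)arg;
    /* A local array this large would overflow a 1 MB default stack. */
    char big[4 * 1024 * 1024];
    big[sizeof big - 1] = '\0';
    printf("worker ran with its enlarged stack\n");
    return NULL;
}

int main(void)
{
    pthread_attr_t attr;
    pthread_t tid;
    pthread_attr_init(&attr);
    /* Request an 8 MB stack for the new thread. */
    pthread_attr_setstacksize(&attr, 8 * 1024 * 1024);
    pthread_create(&tid, &attr, worker, NULL);
    pthread_join(tid, NULL);
    pthread_attr_destroy(&attr);
    return 0;
}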
I think the malloc implementation on Linux is not reliable, because it can return a non-null pointer even if the memory is not actually available.
I suppose that you are talking about the overcommit issue. To overcome this, you can use calloc and check the return value. If you do this at the very beginning of your application, you can immediately exit with an appropriate error message.
As for declaring the array outside the main function: a global array has static storage duration and is placed in the executable's data/BSS segment, not on the heap or the stack, so it neither counts against the stack limit nor needs to be freed.
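Applied to the buffer from the question, a minimal sketch of the calloc-and-check approach (the error message is just an example) could look like this:
#include <stdio.h>
#include <stdlib.h>
#define MAX_DATA (2097152) /* 2MB */

int main(void)
{
    /* Allocate on the heap instead of the stack and verify the
       allocation succeeded before touching it. */
    char *data = calloc(MAX_DATA + 1, 1);
    if (data == NULL) {
        fprintf(stderr, "not enough memory\n");
        return EXIT_FAILURE;
    }
    /* ... use data ... */
    free(data);
    printf("Bye\n");
    return 0;
}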
Related
I have some example code here which I'm using to understand some C behaviour for a beginner's CTF:
// example.c
#include <stdio.h>
#include <stdlib.h> /* needed for getenv */
void main() {
void (*print)();
print = (void (*)())getenv("EGG");
print();
}
Compile: gcc -z execstack -g -m32 -o example example.c
Usage: EGG=$(echo -ne '\x90\xc3') ./example
If I compile the code with the execstack flag, the program will execute the opcodes I've injected above. Without the flag, the program will crash due to a segmentation fault.
Why exactly is this? Is it because getenv is storing the actual opcodes on the stack, and the execstack flag allows jumps to the stack? Or does getenv push a pointer onto the stack, and there are some other rules about what sections of memory are executable? I read the manpage, but I couldn't work out exactly what the rules are and how they're enforced.
Another issue is that I think I'm also really lacking a good tool to visualise memory whilst debugging, so it's hard to figure this out. Any advice would be really appreciated.
getenv doesn't store the env var's value on the stack. It's already on the stack from process startup, and getenv obtains a pointer to it.
See the i386 System V ABI's description of where argv[] and envp[] are located at process startup: above [esp].
_start doesn't copy them before calling main, just calculates pointers to them to pass as args to main. (Links to the latest version at https://github.com/hjl-tools/x86-psABI/wiki/X86-psABI, where the official current version is maintained.)
Your code is casting a pointer to stack memory (containing the value of an env var) into a function pointer and calling through it. Look at the compiler-generated asm (e.g. on https://godbolt.org/): it'll be something like call getenv / call eax.
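You can convince yourself of this with a small sketch (assuming EGG is set in the environment) that prints both a local variable's address and the pointer getenv returns; the two typically land close together because the environment block sits just above the initial stack:
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int local = 0;
    char *egg = getenv("EGG");
    /* Both values usually fall in the same region of the address space. */
    printf("local var: %p\n", (void *)&local);
    printf("EGG value: %p\n", (void *)egg);
    return 0;
}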
-zexecstack in your kernel version [1] makes all your pages executable, not just the stack. It also applies to .data, .bss, and .rodata sections, and memory allocated with malloc / new.
The exact mechanism on GNU/Linux was a "read-implies-exec" process-wide flag that affects all future allocations, including manual use of mmap. See Unexpected exec permission from mmap when assembly files included in the project for more about the GNU_STACK ELF header stuff.
Footnote 1: Linux after 5.4 or so only makes the stack itself executable, not READ_IMPLIES_EXEC: Linux default behavior of executable .data section changed between 5.4 and 5.9?
Fun fact: taking the address of a nested function that accesses its parent's local variables gets gcc to enable -zexecstack. It stores code for an executable "trampoline" onto the stack that passes a "static chain" pointer to the actual nested function, allowing it to reference its parent's stack-frame.
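A minimal illustration of that (nested functions are a GNU C extension, so this is gcc-only):
#include <stdio.h>

int main(void)
{
    int x = 42;
    /* Nested function reading its parent's local variable. */
    void show(void) { printf("x = %d\n", x); }
    /* Taking its address forces gcc to emit a stack trampoline,
       which requires an executable stack. */
    void (*fp)(void) = show;
    fp();
    return 0;
}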
If you wanted to exec data as code without -zexecstack, you'd use mprotect(PROT_EXEC|PROT_READ|PROT_WRITE) on the page containing that env var. (It's part of your stack so you shouldn't remove write permission; it could be in the same page as main's stack frame for example.)
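A rough sketch of that approach (error handling kept minimal; note that the env string could in principle straddle a page boundary, which this ignores):
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    char *egg = getenv("EGG");
    if (egg == NULL)
        return 1;
    /* Round the pointer down to the start of its page. */
    long pagesize = sysconf(_SC_PAGESIZE);
    void *page = (void *)((uintptr_t)egg & ~(uintptr_t)(pagesize - 1));
    /* Keep the page readable and writable: it may share a page with
       live stack data. */
    if (mprotect(page, (size_t)pagesize,
                 PROT_READ | PROT_WRITE | PROT_EXEC) != 0) {
        perror("mprotect");
        return 1;
    }
    void (*fn)(void) = (void (*)(void))egg;
    fn();
    return 0;
}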
Related:
With GNU/Linux ld from binutils before late 2018 or so, the .rodata section is linked into the same ELF segment as the .text section, and thus const char code[] = {0xc3} or string literals are executable.
Current ld gives .rodata its own segment that's mapped read without exec, so finding ROP / Spectre "gadgets" in read-only data is no longer possible unless you use -zexecstack. And even that doesn't help on current kernels, which only make the stack itself executable; char code[] = ...; as a local inside a function will put the data on the stack, where it actually is executable. See How to get c code to execute hex machine code? for details.
I have a question about Address Space Layout Randomization (ASLR) on macOS. According to Apple (2017), "If you are compiling an executable that targets macOS 10.7 and later or iOS 4.3 and later, the necessary flags [for ASLR] are enabled by default". In the spirit of science, I decided to test this on Xcode 11.3 and macOS Catalina 10.15.2 with the following program:
#include <stdio.h>
int main(int argc, const char * argv[]) {
int stack = 0;
printf("%p\n", &stack);
return 0;
}
According to Arpaci-Dusseau & Arpaci-Dusseau (2018), with ASLR enabled, this program should produce a different virtual address on every run (p. 16). However, every time I run the program in Xcode, the output is the same, for example:
0x7ffeefbff52c
Program ended with exit code: 0
What am I missing?
References
Apple. (2017). Avoiding buffer overflows and underflows. Retrieved from https://developer.apple.com/library/archive/documentation/Security/Conceptual/SecureCodingGuide/Articles/BufferOverflows.html
Arpaci-Dusseau, R. H., & Arpaci-Dusseau, A. C. (2018). Complete virtual memory systems. In Operating systems: Three easy pieces. Retrieved from http://pages.cs.wisc.edu/~remzi/OSTEP/vm-complete.pdf
The apparent ineffectiveness of ASLR is an artifact of running within Xcode. Either its use of the debugger or some other diagnostic feature effectively disables ASLR for the process.
Running the program outside of Xcode will show the ASLR behavior you expect.
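For instance (the exact commands are illustrative), compile and run the program a few times from Terminal:
clang main.c -o aslr_test
./aslr_test
./aslr_test
Once ASLR is actually in effect, each run should print a different address for &stack.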
First, an abstraction of my program:
int main ()
{
My_Struct1 ms1; // sizeof (My_Struct1) is 88712 B -- L1
My_Struct2 ms2; // sizeof (My_Struct2) is 13208 B -- L2
// 1. Invoke parser to fill in the two struct instances. -- L3
printf ("%ul, %ul\n", &ms1, &ms2) // -- **L3b** doesn't produce seg. fault.
my_fun (&ms1, &ms2); // -- L4, does produce seg. fault.
return 0;
}
If I run my program via the makefile, then a segmentation fault occurs at L4 (always).
If I execute my program directly from the shell (./executable), then the segmentation fault does occur sometimes, but not always.
The error is: Segmentation fault: Cannot access memory at address, at L4, for both &ms1 and &ms2. The type and location of the error are what gdb pointed out.
My guess is that the error is because of the size of the structures.
Please explain in detail what is going on.
The error behaviour is the same even after reducing the size of My_Struct1 to 8112 B and My_Struct2 to 1208 B.
I am working on:
Ubuntu 14.04
Intel® Core™ i5-4200M CPU @ 2.50GHz × 4
3.8 GiB memory
gcc - 4.8.4
First, compile with all warnings & debug info, probably with CFLAGS= -g -Wall -Wextra in your Makefile. Perhaps also add some sanitizer instrumentation options such as -fsanitize=address or -fsanitize=undefined (then it could be worthwhile to upgrade your GCC compiler to GCC 5, in March 2016). You might also want the -Wstack-usage=1000 warning and the -fstack-usage developer option.
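For example, a debug build with address checking might look like this (the file name and the exact flag set are just one reasonable choice):
gcc -g -Wall -Wextra -fsanitize=address -o myprog myprog.c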
Be very afraid of undefined behavior.
Then, enable core(5) dumps. Probably some ulimit -c 100000 (or whatever number is realistic) in your ~/.bashrc, then start a new terminal; check with cat /proc/self/limits (a Linux-specific command, related to proc(5)) that the limits are set as intended. See setrlimit(2).
Run your faulty test, e.g. with make test. You'll get a core dump. Check with ls -l core and file core.
At last, do a post-mortem debugging session. If your binary is someprog, run gdb someprog core. Probably the first gdb command you'll type would be bt.
Indeed, you are probably wrong in declaring quite large structs as local variables in main. The rule of thumb is to restrict your call frame to a few kilobytes at most (hence, never have a local variable of more than a kilobyte on the call stack). So I would recommend putting your large structs on the heap (so use malloc and free appropriately; read about C dynamic memory allocation), as sketched below. Note that a typical call stack on Linux can grow to several megabytes, which is why your program sometimes runs without crashing.
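A sketch of that, reusing the question's My_Struct1, My_Struct2, and my_fun (their definitions are assumed to come from your own headers):
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* Heap-allocate the big structs so the call frame stays small. */
    My_Struct1 *ms1 = malloc(sizeof *ms1);
    My_Struct2 *ms2 = malloc(sizeof *ms2);
    if (ms1 == NULL || ms2 == NULL) {
        fprintf(stderr, "out of memory\n");
        return EXIT_FAILURE;
    }
    /* ... invoke the parser to fill in the two structs ... */
    my_fun(ms1, ms2);
    free(ms1);
    free(ms2);
    return 0;
}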
Also, run your program with valgrind.
BTW, the correct format for (void*) pointers is %p, so your added printf should be:
printf("ms1#%p, ms2#%p\n", (void*)&ms1, (void*)&ms2);
I've just implemented a pretty complicated piece of software, but my school's testing system won't take it.
The system uses the so-called mudflap library, which is supposed to catch illegal memory accesses more reliably. As a consequence, my program generates segfaults when run on the school's testing system (I submit the source code and the testing system compiles it for itself, using the mudflap library).
I tried to isolate the problematic code in my program, and it seems that it all boils down to something as simple as pointer arrays. Mudflap doesn't seem to like them.
Below is a piece of very simple code that works with a pointer array:
#include <stdlib.h>
#include <string.h> /* for strcpy */
int main()
{
char** rows;
rows=(char**)malloc(sizeof(char*)*3);
rows[0]=(char*)malloc(sizeof(char)*4);
rows[1]=(char*)malloc(sizeof(char)*4);
rows[2]=(char*)malloc(sizeof(char)*4);
strcpy(rows[0], "abc");
strcpy(rows[1], "abc");
strcpy(rows[2], "abc");
free(rows[0]); free(rows[1]); free(rows[2]);
free(rows);
return 0;
}
This will generate a segfault with mudflap. In my opinion, this is perfectly legal code.
Could you please explain to me what is wrong with it, and why it generates a segfault with mudflap?
Note: The program should be compiled under an amd64 Linux system with g++ using the following commands:
export MUDFLAP_OPTIONS='-viol-segv -print-leaks';
g++ -Wall -pedantic -fmudflap -fmudflapir -lmudflap -g file.cpp
You have at least one problem here:
char** rows;
rows=(char**)malloc(3);
This allocates 3 bytes. On most platforms the allocator probably has a minimum of at least 4 bytes which lets you get away with overwriting the buffer a bit. I'm guessing your mudflap library is more strict in its checking and catches the overwrite.
However, if you want an array of 3 char * pointers, you need 3 * sizeof(char *) bytes, which is 24 bytes on your amd64 target.
Try changing these lines to:
char** rows;
rows=(char**)malloc(3 * sizeof(char *));
EDIT: Based on your modified code, I agree it looks correct now. The only thing I can suggest is that perhaps malloc() is failing and causing a NULL pointer access. If that's not the case, it sounds like a bug or misconfiguration of mudflap.
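To rule that out, a defensive variant of your snippet that checks every allocation (keeping the casts so it still builds with g++) might look like this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main()
{
    char **rows = (char **)malloc(sizeof(char *) * 3);
    if (rows == NULL) {
        fprintf(stderr, "malloc failed\n");
        return EXIT_FAILURE;
    }
    for (int i = 0; i < 3; i++) {
        rows[i] = (char *)malloc(sizeof(char) * 4);
        if (rows[i] == NULL) {
            fprintf(stderr, "malloc failed\n");
            return EXIT_FAILURE;
        }
        strcpy(rows[i], "abc");
    }
    for (int i = 0; i < 3; i++)
        free(rows[i]);
    free(rows);
    return 0;
}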