Overflowing stack with huge local variable? - c

It is said that each process is given 8 MB of stack.
This stack is used to store local variables.
So if I declare arrays larger than the stack, it must overflow?
int main()
{
int arr[88388608];
int arr1[88388608];
int arr2[88388608];
while(1);
return 0;
}
But I am unable to get that result!

Welcome to the world of optimizing compilers!
Because of the as-if rule, the compiler is only required to build something that has the same observable results as your original code.
So the compiler is free to:
remove the unused arrays
remove the empty loop
store the automatic arrays from main outside of the stack - because main is a special function that shall be called only once by the environment
If you want to observe the stack overflow (the bad one, not our nice site :-) ),
you should:
use some code to fill the arrays
compile with all optimizations disabled, preferably in debug mode, to tell the compiler to do what you wrote as literally as it can
The following code crashes with SIGSEGV with Clang 3.4.1 when compiled as cc -g foo.c -o foo
#include <stdio.h>
#define SIZE 88388608
void fill(int *arr, size_t size, int val) {
for (size_t i=0; i<size; i++) {
arr[i] = val;
}
}
int main() {
int arr[SIZE];
int arr1[SIZE];
int arr2[SIZE];
fill(arr, SIZE, 0);
fill(arr1, SIZE, 0);
fill(arr2, SIZE, 0);
printf("%d %d %d\n", arr[12], arr1[15], arr2[18]);
return 0;
}
and even this code works fine when compiled at the -O2 optimization level... Compilers are now too clever for me, and I'm not brave enough to look thoroughly at the assembly code, which would be the only real way to understand what is actually executed!

Related

Effect on uninitialized global variable on executable size

A highly voted previous answer's highly voted comment states:
consider having many uninitialized buffers 4096 bytes in length. Would
you want all of those 4k buffers to contribute to the size of the
binary? That would be a lot of wasted space.
I am building the following two files into an executable on ubuntu:
main.c
int sum(int *a, int n);
int array[2] = {1,2};
int abc;//Comment in case (a) Uncomment in case (b) and (c)
int def;//Comment in case (a) and (b) Uncomment in case (c)
int main(){
int val = sum(array, 2);
return val;
}
sum.c
int sum(int *a, int n){
int i, s = 0;
for(i = 0; i < n; i++)
s += a[i];
return s;
}
The following command is used to create the executable
$gcc -Og -o prog main.c sum.c
There are 3 cases:
(a) has no uninitialized global variable. The executable has size 8648 bytes.
(b) has uninitialized global variable abc. The executable has size 8680 bytes.
(c) has uninitialized global variables abc and def. The executable has size 8704 bytes.
My question is, why does the executable size even change? My understanding (also confirmed by the answer linked to above) was that uninitialized global variables should NOT affect executable size.

skip the next instruction in c code

Hi everyone, I am trying to skip an instruction:
void func(char *str) {
char buffer[24];
int *ret;
strcpy(buffer, str);
}
int main(int argc, char **argv) {
int x;
x = 0;
func(argv[1]);
x = 1;
printf("%d\n", x);
}
How can I use the pointer ret defined in func() to modify the function's return address in such a way that x = 1 is skipped?
It's a bad idea to use this in production code! (Reasons copied from dvnrrs' comment.) This is undefined behavior; it is severely abusing (assumed) knowledge about the way the stack is laid out under the hood, and the size of the compiled instructions. This is doomed to failure, especially if optimization is turned on.
Please note that modifying the memory next to local variables this way is incorrect C code, and it invokes undefined behavior. I think there is no standard C solution to your problem, so if you want to do that, your best bet is architecture-specific assembly code. The following code happens to work for me in C on i386 with my GCC, but it's still undefined behavior, so it's inherently fragile and can cease to work with any change in the compiler or in the ABI.
This prints 42 for me:
#include <stdio.h>
void func(char *str) {
(&str)[-1] += 2;
}
int main(int argc, char **argv) {
(void)argc; (void)argv;
int x;
x = 42;
func(argv[1]);
x = 137;
printf("%d\n",x);
return 0;
}
Compile and run on Linux i386:
$ gcc -m32 -W -Wall -s -O0 t.c && ./a.out
42
You can't do this in C; the closest thing would be the setjmp() and longjmp() functions but those won't work for your exact case.
As others pointed out in comments, the best thing to do might be to give your function a return value, and then check that return value in the caller to decide what code to run next. Your sample code seems like a contrived test case so I'm not sure exactly what you're trying to accomplish.

How can one make Clang optimize away useless array copies

Consider the following C99 code (that uses the alloca extension.)
void print_int_list(size_t size, int x[size]) {
int y[size];
memcpy(y, x, size * sizeof *x);
for (size_t ii = 0; ii < size; ++ii)
printf("%i ", y[ii]);
printf("\n");
}
void print_int_list_2(size_t size, int x[size]) {
for (size_t ii = 0; ii < size; ++ii)
printf("%i ", x[ii]);
printf("\n");
}
void print_int(int x) {
int * restrict const y = alloca(sizeof x);
memcpy(y, &x, sizeof x);
printf("%d\n", *y);
}
void print_int_2(int x) {
printf("%d\n", x);
}
In this code, print_int is optimized to be exactly the same as print_int_2 on Clang version 3.0, but the function print_int_list is not optimized down to print_int_list_2. Instead, the useless array copy is kept.
This sort of thing is not a problem for most people but it is for me. I intend to prototype a compiler by generating C code for use with Clang, (and later port it to LLVM directly), and I want to generate extremely stupid, simple, and obviously correct code, and let LLVM do the work of optimizing the code.
What I need to know is how one can make Clang optimize away useless array copies so that stupid code like print_int_list will get optimized into code like print_int_list_2.
First, I would proceed more carefully. There is a step in between the two cases that you have: arrays of fixed size. I think compilers nowadays can trace array components that are indexed with a compile-time constant.
Also don't forget that memcpy converts your arrays to pointers to the first element and then makes them void*. So it loses all type information.
So I'd go:
try fixed-size arrays
don't use memcpy but an assignment loop
and try to loosen the constraints from there.

How to determine the length of a function?

Consider the following code that takes the function f(), copies the function itself in its entirety to a buffer, modifies its code and runs the altered function. In practice, the original function that returns number 22 is cloned and modified to return number 42.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define ENOUGH 1000
#define MAGICNUMBER 22
#define OTHERMAGICNUMBER 42
int f(void)
{
return MAGICNUMBER;
}
int main(void)
{
int i,k;
char buffer[ENOUGH];
/* Pointer to original function f */
int (*srcfptr)(void) = f;
/* Pointer to hold the manipulated function */
int (*dstfptr)(void) = (void*)buffer;
char* byte;
memcpy(dstfptr, srcfptr, ENOUGH);
/* Replace magic number inside the function with another */
for (i=0; i < ENOUGH; i++) {
byte = ((char*)dstfptr)+i;
if (*byte == MAGICNUMBER) {
*byte = OTHERMAGICNUMBER;
}
}
k = dstfptr();
/* Prints the other magic number */
printf("Hello %d!\n", k);
return 0;
}
The code now relies on just guessing that the function will fit in the 1000 byte buffer. It also violates rules by copying too much to the buffer, since the function f() will be most likely a lot shorter than 1000 bytes.
This brings us to the question: Is there a method to figure out the size of any given function in C? Some methods include looking into intermediate linker output, and guessing based on the instructions in the function, but that's just not quite enough. Is there any way to be sure?
Please note: It compiles and works on my system but doesn't quite adhere to standards because conversions between function pointers and void* aren't exactly allowed:
$ gcc -Wall -ansi -pedantic fptr.c -o fptr
fptr.c: In function 'main':
fptr.c:21: warning: ISO C forbids initialization between function pointer and 'void *'
fptr.c:23: warning: ISO C forbids passing argument 1 of 'memcpy' between function pointer and 'void *'
/usr/include/string.h:44: note: expected 'void * __restrict__' but argument is of type 'int (*)(void)'
fptr.c:23: warning: ISO C forbids passing argument 2 of 'memcpy' between function pointer and 'void *'
/usr/include/string.h:44: note: expected 'const void * __restrict__' but argument is of type 'int (*)(void)'
fptr.c:26: warning: ISO C forbids conversion of function pointer to object pointer type
$ ./fptr
Hello 42!
$
Please note: on some systems executing from writable memory is not possible and this code will crash. It has been tested with gcc 4.4.4 on Linux running on x86_64 architecture.
You cannot do this in C. Even if you knew the length, the address of a function matters, because function calls and accesses to certain types of data will use program-counter-relative addressing. Thus, a copy of the function located at a different address will not do the same thing as the original. Of course there are many other issues too.
The C standard has no notion of introspection or reflection, so you'd need to devise a method yourself, as you have done. Some other, safer methods exist, however.
There are two ways:
Disassemble the function (at runtime) until you hit the final RETN/JMP/etc., while accounting for switch/jump tables. This requires some heavy analysis of the function you disassemble (using an engine like beaEngine). It is the most reliable method, but it's slow and heavy.
Abuse compilation units. This is very risky and not foolproof, but if you know your compiler generates functions sequentially in their compilation unit, you can do something along these lines:
void MyFunc()
{
//...
}
void MyFuncSentinel()
{
}
//somewhere in code
size_t z = (uintptr_t)MyFuncSentinel - (uintptr_t)MyFunc;
uint8_t* buf = (uint8_t*)malloc(z);
memcpy(buf,(char*)MyFunc,z);
This will include some extra padding, but it will be minimal (and unreachable). Although highly risky, it's a lot faster than the disassembly method.
Note: both methods require that the target code has read permissions.
@R.. raises a very good point: your code won't be relocatable unless it's PIC or you reassemble it in place to adjust the addresses, etc.
Here is a standards compliant way of achieving the result you want:
#include <stdio.h>
#define OTHERMAGICNUMBER 42
int f(int magicNumber)
{
return magicNumber;
}
int main(void)
{
int k = f(OTHERMAGICNUMBER);
/* Prints the other magic number */
printf("Hello %d!\n", k);
return 0;
}
Now, you may have lots of uses of f() all over the place with no arguments and not want to go through your code changing every one, so you could do this instead
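#include <stdio.h>
#define MAGICNUMBER 22
#define OTHERMAGICNUMBER 42
int newf(int magicNumber);
/* Existing callers keep using f() unchanged */
int f(void)
{
return newf(MAGICNUMBER);
}
int newf(int magicNumber)
{
return magicNumber;
}
int main(void)
{
int k = newf(OTHERMAGICNUMBER);
/* Prints the other magic number */
printf("Hello %d!\n", k);
return 0;
}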
I'm not suggesting this is a direct answer to your problem but that what you are doing is so horrible, you need to rethink your design.
Well, you can obtain the length of a function at runtime using labels (this relies on GCC's labels-as-values extension, the && operator):
int f()
{
int length;
start:
length = &&end - &&start + 11; // 11 is the length of function prologue
// and epilogue, got with gdb
printf("Magic number: %d\n", MagicNumber);
end:
return length;
}
After executing this function we know its length, so we can malloc a buffer of the right size, copy and edit the code, and then execute it.
int main()
{
int (*pointerToF)(), (*newFunc)(), length, i;
char *buffer, *byte;
length = f();
buffer = malloc(length);
if(!buffer) {
printf("can't malloc\n");
return 0;
}
pointerToF = f;
newFunc = (void*)buffer;
memcpy(newFunc, pointerToF, length);
for (i=0; i < length; i++) {
byte = ((char*)newFunc)+i;
if (*byte == MagicNumber) {
*byte = CrackedNumber;
}
}
newFunc();
}
Now there's another, bigger problem though, the one @R.. mentioned. Calling this function once it has been (correctly) modified results in a segmentation fault when it calls printf, because the call instruction has to specify an offset, which will be wrong. You can see this with gdb, using disassemble f to see the original code and x/15i buffer to see the edited one.
By the way, both my code and yours compile without warnings but crash on my machine (gcc 4.4.3) when calling the edited function.

LLVM Compiler 2.0 bug?

When the following code is compiled with the LLVM compiler, it doesn't operate correctly.
(i doesn't increase.)
It operates correctly when compiled with GCC 4.2.
Is this a bug in the LLVM compiler?
#include <stdio.h>
#include <string.h>
void BytesFromHexString(unsigned char *data, const char *string) {
printf("bytes:%s:", string);
int len = (int)strlen(string);
for (int i=0; i<len; i+=2) {
unsigned char x;
sscanf((char *)(string + i), "%02x", &x);
printf("%02x", x);
data[i] = x;
}
printf("\n");
}
int main (int argc, const char * argv[])
{
// insert code here...
unsigned char data[64];
BytesFromHexString(data, "4d4f5cb093fc2d3d6b4120658c2d08b51b3846a39b51b663e7284478570bcef9");
return 0;
}
For sscanf you'd use %2x instead of %02x. Furthermore, %2x indicates that an extra unsigned int* argument will be passed, but you're passing an unsigned char*. And finally, sscanf takes a const char* as its first argument, so there's no need for that cast.
So give this a try:
unsigned int x;
sscanf(string + i, "%2x", &x);
EDIT: to clarify why this change resolves the issue: in your code, sscanf tried to write sizeof(int) bytes to a memory location (&x) that could only hold sizeof(unsigned char) bytes (i.e. 1 byte). So you were overwriting a certain amount of adjacent memory. This overwritten memory could very well have been (part of) the i variable.
From the compiler side of things, the reason this code behaves differently is that GCC and LLVM (or any other compiler) may lay out the stack differently. With GCC you were likely just clobbering something else on the stack that you didn't need for this example, but with LLVM's different layout you were clobbering something more useful.
This is another good reason to use stack protectors when debugging a problem (-fstack-protector-all/-fstack-protector). It can help flush out these issues.

Resources