How can I find out whether the stack grows upwards or downwards?
This is very platform-dependent, and even application-dependent.
The code posted by Vino only works in targets where parameters are passed on the stack AND local variables are allocated from the stack, in that order. Many compilers will assign fixed memory addresses to parameters, or pass parameters in registers. While common, passing parameters on the stack is one of the least efficient ways to get data into and out of a function.
Look at the disassembly for your compiled app and see what code the compiler is generating. If your target has native stack manipulation commands (like PUSH and POP) that the compiler is using, then the CPU datasheet/reference manual will tell you which direction the stack is growing. However, the compiler may choose to implement its own stack, in which case you'll have to do some digging.
Or, read the stack pointer, push something on the stack, and read the stack pointer again. Compare the results of the first and second read to determine the direction in which the pointer moves.
For future reference: if you include some details about your target architecture (embedded? PC? Linux, Windows? GCC? VC? Watcom? blah blah blah) you'll get more meaningful answers.
One possible way is...
#include <stdio.h>

/* a points to a local of the caller; b is a local of this (deeper) call */
void call(int *a)
{
    int b;
    if (&b > a)
        printf("Stack grows up.\n");
    else
        printf("Stack grows down.\n");
}

int main(void)
{
    int a;
    call(&a);
    return 0;
}
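One caveat with the snippet above: applying > to pointers into two unrelated objects is undefined behavior in C. A slightly more defensive sketch converts both addresses to uintptr_t first. The result is still implementation-defined, and it still assumes that locals live on the machine stack. The names probe and stack_direction are made up for this example, and __attribute__((noinline)) is a GCC/Clang extension, used here only so the callee keeps its own frame.

```c
#include <stdint.h>

/* Hypothetical helper: compares a local of the caller with one of its own.
 * noinline (GCC/Clang extension) keeps this function in a separate frame. */
__attribute__((noinline))
static int probe(volatile char *caller_local)
{
    volatile char callee_local = 0;
    uintptr_t outer = (uintptr_t)caller_local;
    uintptr_t inner = (uintptr_t)&callee_local;
    /* A deeper frame at a lower address suggests the stack grows down. */
    return inner < outer ? -1 : 1;
}

/* Returns -1 if the stack appears to grow down, +1 if it appears to grow up. */
int stack_direction(void)
{
    volatile char caller_local = 0;
    return probe(&caller_local);
}
```

Calling stack_direction() from main and printing the result gives the same answer as the original test on targets where that test works at all.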
A brute-force approach is to fill your memory with a known value, say 0xFF. Push some items on the stack. Do a memory dump. Push some more items on the stack. Do another memory dump. Comparing the two dumps shows in which direction the used region has grown.
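The dump comparison can be sketched in plain C, assuming you already have two dumps of the same memory region as byte arrays. The function names are invented for this example, and the check is only a heuristic: a pushed byte that happens to equal the fill value is invisible to it.

```c
#include <stddef.h>

#define FILL 0xFF  /* the known fill value */

/* Index of the first byte that no longer holds FILL, or n if there is none. */
static size_t first_touched(const unsigned char *dump, size_t n)
{
    size_t i = 0;
    while (i < n && dump[i] == FILL)
        i++;
    return i;
}

/* Index one past the last byte that no longer holds FILL, or 0 if none. */
static size_t last_touched(const unsigned char *dump, size_t n)
{
    size_t i = n;
    while (i > 0 && dump[i - 1] == FILL)
        i--;
    return i;
}

/* Compare two dumps of the same region, taken before and after extra pushes.
 * Returns -1 if the used area extended toward lower addresses (grows down),
 * +1 if it extended toward higher addresses (grows up), 0 if undecided. */
int direction_from_dumps(const unsigned char *before,
                         const unsigned char *after, size_t n)
{
    int low_moved  = first_touched(after, n) < first_touched(before, n);
    int high_moved = last_touched(after, n) > last_touched(before, n);
    if (low_moved && !high_moved)
        return -1;
    if (high_moved && !low_moved)
        return 1;
    return 0;
}
```

On a real target the two arrays would come from your debugger or a memory-dump routine; how you obtain them is platform-specific and not shown here.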
Create a function with many local variables.
Turn off optimizations.
Either print the assembly language listing,
or, when debugging, display mixed source and assembly language.
Note the stack pointer (or register) before the function is executed.
Single-step through the function and watch the stack pointer.
In general, whether a compiler uses incrementing or decrementing stack pointers is a very minor issue as long as the issue is consistent and working. This is one issue that rarely occupies my mind. I tend to concentrate on more important topics, such as quality, correctness and robustness.
I'll trust the compiler to correctly handle stack manipulation. I don't trust recursive functions, especially on embedded or restricted platforms.
When I try to google this, all I find is stuff about getting and setting the stack limit, such as -[NSThread stackSize], but that's NOT what I want. I want to know how much memory is actually in use on the stack in the current thread, or equivalently how much stack space remains available.
I'm hoping to figure out a stack overflow in a crash report submitted by a user. In my previous experience, a stack overflow has usually been caused by an infinite recursion, but not this time. So I'm wondering if some of my C++ functions are really using a heck of a lot more stack space than they should.
A comment suggested that I get the stack pointer at the start of the thread, and compare its value later. I happened across the question Print out value of stack pointer. It has several answers:
1. (The accepted answer) Take the address of a local variable.
2. Use a little assembly language to get the value of the stack pointer register.
3. Use the function __builtin_frame_address(0) in GCC or Clang.
I tried those techniques (Apple Clang, macOS 11.2). Methods 2 and 3 produced similar results, but method 1 produced absurdly different results. For one thing, method 1 gives values that increase as you go deeper into a call chain, while the others give values that decrease. What's up with this, are there two different kinds of stacks?
If you are trying to do that, I guess you want to know how much memory you are using, in order to estimate the optimum number of threads of some kind that you can create.
The answer is not easy, as you normally don't have access to the stack pointer. But I'll try to devise a solution that does not require access to the stack pointer, although it does require one global variable per thread.
The idea is to force a parameter onto the stack. Even if the ABI on your system passes parameters in registers, taking the address of a parameter forces the compiler to give it a memory location. So you save the address of a parameter when the thread starts, and later you call another function that takes a parameter (the type doesn't matter, as you only use its address) and compare the two addresses:
#include <stddef.h>

static char *initial_stack_pseudo_addr;

void save_initial_stack(char dumb)
{
    /* the & operator forces dumb to be implemented in the stack */
    initial_stack_pseudo_addr = &dumb;
}

size_t how_much_stack(char dumb)
{
    /* assumes the stack grows toward lower addresses */
    return (size_t)(initial_stack_pseudo_addr - &dumb);
}
So when you start the thread, you call save_initial_stack(0);. When you want to know how much stack you have consumed, you can just do the following:
size_t stack_size = how_much_stack(0);
printf("at this point I have %zu bytes of stack\n", stack_size);
Basically, what you have done is to calculate how many bytes lie between the address of the parameter in the call to save_initial_stack() and the address of the parameter in the call you make now to get the stack size. This is approximate, but the stack changes too quickly to have a precise idea.
The following example will illustrate the thing. A recursive function is called after setting the initial pointer value, then at each recursive call the current size of the stack (approximate) is computed and printed, and a new recursive call is made. The program should run until the process gets a stack overflow.
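#include <stdio.h>
#include <stddef.h>

char *stack_at_start;

void save_stack_pointer(char dumb)
{
    stack_at_start = &dumb;
}

size_t get_stack_size(char dumb)
{
    /* assumes the stack grows toward lower addresses */
    return (size_t)(stack_at_start - &dumb);
}

void recursive(void)
{
    printf("Stack size: %zu\n", get_stack_size(0));
    /* with optimizations on, this tail call may be turned into a loop */
    recursive();
}

int main(void)
{
    save_stack_pointer(0);
    recursive();
}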
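If you would rather not crash the process to see the numbers, a bounded variant of the same idea could look like the sketch below. Everything here is an assumption layered on the answer above: the function names are invented, __attribute__((noinline)) is a GCC/Clang extension, and the absolute difference is used so that the result does not depend on the growth direction.

```c
#include <stddef.h>
#include <stdint.h>

static uintptr_t stack_at_start;  /* address recorded at the reference point */

/* noinline (GCC/Clang extension) keeps each helper in its own frame. */
__attribute__((noinline))
void mark_stack_start(void)
{
    volatile char marker = 0;
    stack_at_start = (uintptr_t)&marker;
}

__attribute__((noinline))
size_t approx_stack_used(void)
{
    volatile char marker = 0;
    uintptr_t here = (uintptr_t)&marker;
    /* Absolute distance, so the growth direction does not matter. */
    return here < stack_at_start ? stack_at_start - here
                                 : here - stack_at_start;
}

/* Recurse depth levels, then report the approximate usage at the bottom. */
__attribute__((noinline))
size_t usage_at_depth(unsigned depth)
{
    volatile char pad[64];            /* gives each frame a visible size */
    pad[0] = (char)depth;
    size_t used = depth ? usage_at_depth(depth - 1) : approx_stack_used();
    pad[1] = pad[0];                  /* work after the call: not a tail call */
    return used;
}
```

Call mark_stack_start() once near the top of the thread, then compare usage_at_depth(1) with usage_at_depth(100); the difference divided by 99 gives a rough per-frame cost for that function on your compiler and settings.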
Like we do with macros:
#undef SOMEMACRO
Can we also undeclare or delete the variables in C, so that we can save a lot of memory?
I know about malloc() and free(), but I want to delete the variables completely, so that if I use printf("%d", a); I get an error like this:
test.c:4:14: error: ‘a’ undeclared (first use in this function)
No, but you can create small, minimal scopes to achieve this, since all variables local to a scope are destroyed when the scope is exited. Something like this:
void foo() {
    // some codes
    // ...
    { // create an extra minimum scope where a is needed
        int a;
    }
    // a doesn't exist here
}
It's not a direct answer to the question, but it might bring some order and understanding on why this question has no proper answer and why "deleting" variables is impossible in C.
Point #1 What are variables?
Variables are a way for a programmer to assign a name to a memory space. This is important, because this means that a variable doesn't have to occupy any actual space! As long as the compiler has a way to keep track of the memory in question, a defined variable could be translated in many ways to occupy no space at all.
Consider: const int i = 10; A compiler could easily choose to substitute all instances of i into an immediate value. i would occupy 0 data memory in this case (depending on architecture it could increase code size). Alternatively, the compiler could store the value in a register and again, no stack nor heap space will be used. There's no point in "undefining" a label that exists mostly in the code and not necessarily in runtime.
Point #2 Where are variables stored?
After point #1 you already understand that this is not an easy question to answer as the compiler could do anything it wants without breaking your logic, but generally speaking, variables are stored on the stack. How the stack works is quite important for your question.
When a function is called, the machine saves the current location of the CPU's instruction pointer (the return address) and, typically, the caller's frame pointer on the stack, moving the stack pointer to the next free location on the stack. It then jumps into the code of the function being called.
That function knows how many variables it has and how much space they need, so it moves the frame pointer to capture a frame that can hold all the function's variables and then just uses the stack. To simplify things, the function captures enough space for all its variables right from the start, and each variable has a well-defined offset from the beginning of the function's stack frame*. The variables are also stored one after the other.
While you could manipulate the frame pointer after this action, it'll be too costly and mostly pointless - The running code only uses the last stack frame and could occupy all remaining stack if needed (stack is allocated at thread start) so "releasing" variables gives little benefit. Releasing a variable from the middle of the stack frame would require a defrag operation which would be very CPU costly and pointless to recover few bytes of memory.
Point #3: Let the compiler do its job
The last issue here is the simple fact that a compiler can do a much better job at optimizing your program than you probably could. Given the need, the compiler can detect variable scopes and overlap memory that can't be accessed simultaneously, to reduce the program's memory consumption (for example with the -O3 compile flag).
There's no need for you to "release" variables since the compiler could do that without your knowledge anyway.
This is to complement all said before me about the variables being too small to matter and the fact that there's no mechanism to achieve what you asked.
* Languages that support dynamic-sized arrays could alter the stack frame to allocate space for that array only after the size of the array was calculated.
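C itself is one such language: C99 variable-length arrays (optional since C11) are sized only when execution reaches the declaration. A minimal sketch, with a made-up function name, where the frame grows by n * sizeof(double) at run time:

```c
#include <stddef.h>

/* Sum of squares using a C99 variable-length array as scratch space.
 * n must be greater than zero; a very large n can overflow the stack. */
double sum_of_squares(size_t n, const double *x)
{
    double squares[n];            /* VLA: sized when execution reaches here */
    for (size_t i = 0; i < n; i++)
        squares[i] = x[i] * x[i];

    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += squares[i];
    return total;
}
```

The scratch array is not needed for this particular computation; it is only there to show a frame whose size is unknown at compile time.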
There is no way to do that in C, nor in the vast majority of programming languages, and certainly not in any programming language that I know.
And you would not save "a lot of memory". The amount of memory you would save if you did such a thing would be minuscule. Tiny. Not worth talking about.
The mechanism that would facilitate the purging of variables in such a way would probably occupy more memory than the variables you would purge.
The invocation of the code that would reclaim the code of individual variables would also occupy more space than the variables themselves.
So if there was a magic method purge() that purges variables, not only the implementation of purge() would be larger than any amount of memory you would ever hope to reclaim by purging variables in your program, but also, in int a; purge(a); the call to purge() would occupy more space than a itself.
That's because the variables that you are talking about are very small. The printf("%d", a); example that you provided shows that you are thinking of somehow reclaiming the memory occupied by individual int variables. Even if there was a way to do that, you would be saving something of the order of 4 bytes. The total amount of memory occupied by such variables is extremely small, because it is a direct function of how many variables you, as a programmer, declare by hand-typing their declarations. It would take years of typing on a keyboard doing nothing but mindlessly declaring variables before you would declare a number of int variables occupying an amount of memory worth speaking of.
Well, you can use blocks ({ }) and define a variable as late as possible to limit the scope where it exists.
But unless the variable's address is taken, doing so has no influence on the generated code at all, as the compiler's determination of the scope where it has to keep the variable's value is not significantly impacted.
If the variable's address is taken, a failure of escape analysis, mostly due to inlining barriers like separate compilation or allowing semantic interpositioning, can make the compiler assume it has to keep the variable alive until later in the block than strictly necessary. That's rarely significant (don't worry about a handful of ints, and keeping a variable alive for a few more lines of code usually doesn't matter), but it is best to keep it in mind for the rare case where it might matter.
If you are that concerned about the tiny amount of memory that is on the stack, then you're probably going to be interested in understanding the specifics of your compiler as well. You'll need to find out what it does when it compiles. The actual shape of the stack-frame is not specified by the C language. It is left to the compiler to figure out. To take an example from the currently accepted answer:
void foo() {
    // some codes
    // ...
    { // create an extra minimum scope where a is needed
        int a;
    }
    // a doesn't exist here
}
This may or may not affect the memory usage of the function. If you were to do this in a mainstream compiler like gcc or Visual Studio, you would find that they optimize for speed rather than stack size, so they pre-allocate all of the stack space they need at the start of the function. They will do analysis to figure out the minimum pre-allocation needed, using your scoping and variable-usage analysis, but those algorithms literally won't be affected by extra scoping. They're already smarter than that.
Other compilers, especially those for embedded platforms, may allocate the stack frame differently. On these platforms, such scoping may be the trick you needed. How do you tell the difference? The only options are:
Read the documentation
Try it, and see what works
Also, make sure you understand the exact nature of your problem. I worked on a particular embedded project which eschewed the stack for everything except return values and a few ints. When I pressed the senior developers about this silliness, they explained that on this particular application, stack space was at more of a premium than space for globally allocated variables. They had a process they had to go through to prove that the system would operate as intended, and this process was much easier for them if they allocated everything up front and avoided recursion. I guarantee you would never arrive at such a convoluted solution unless you first knew the exact nature of what you were solving.
As another solution you could look at, you could always build your own stack frames. Make a union of structs, where each struct contains the variables for one stack frame. Then keep track of them yourself. You could also look at functions like alloca, which can allow for growing the stack frame during the function call, if your compiler supports it.
Would a union of structs work? Try it. The answer is compiler dependent. If all variables are stored in memory on your particular device, then this approach will likely minimize stack usage. However, it could also substantially confuse register coloring algorithms, and result in an increase in stack usage! Try and see how it goes for you!
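To make the union-of-structs idea concrete, here is a minimal sketch. The two "frames", their fields and the function names are all hypothetical; the only point is that phases which are never live at the same time share one statically allocated block instead of each taking stack space.

```c
#include <stddef.h>

/* Hypothetical per-phase "frames"; names and sizes are illustrative only. */
struct parse_frame  { char line[128]; int field_count; };
struct report_frame { double totals[8]; int rows; };

/* Only one phase is live at a time, so the phases can share storage. */
union scratch {
    struct parse_frame  parse;
    struct report_frame report;
};

static union scratch g_scratch;   /* allocated once, outside the stack */

/* Phase 1: copy the text into the shared frame and count comma fields. */
int count_fields(const char *text)
{
    struct parse_frame *f = &g_scratch.parse;
    size_t i = 0;
    f->field_count = 1;
    for (; text[i] != '\0' && i < sizeof f->line - 1; i++) {
        f->line[i] = text[i];
        if (text[i] == ',')
            f->field_count++;
    }
    f->line[i] = '\0';
    return f->field_count;
}

/* Phase 2: reuses the same bytes, overwriting whatever phase 1 left. */
double total_of(const double *values, int n)
{
    struct report_frame *f = &g_scratch.report;
    f->rows = n;
    f->totals[0] = 0.0;
    for (int i = 0; i < n; i++)
        f->totals[0] += values[i];
    return f->totals[0];
}
```

The union is as large as its biggest member rather than the sum of both, which is where the saving comes from. The cost is that the functions are no longer reentrant, and you have to track by hand which member is currently valid.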
I have read that on Linux systems the stack grows from high memory addresses to low memory addresses. To test this, I have written a small program:
#include <stdio.h>

void func(void) {
    int var1;
    int var2;
    printf("Func: %p %p\n", (void *)&var1, (void *)&var2);
}

int main(void) {
    int var1;
    int var2;
    printf("Main: %p %p\n", (void *)&var1, (void *)&var2);
    func();
    return 0;
}
When I run this on ideone, I get the following output:
Main: 0xbfd958f0 0xbfd958f4
Func: 0xbfd958f8 0xbfd958fc
According to the textbook, Func's variables should be stored at lower memory addresses than Main's, but what is happening here is the complete opposite. Can somebody explain this behaviour to me? Here is the link to ideone.
Thank you.
Typically the stack grows down from high memory and the heap grows up from low memory, so the two grow toward each other and do not "bump into" each other until the free memory between them is used up.
The stack can theoretically grow in either direction, though. x86 supports stacks growing either direction but I've never seen anyone use an upward-growing stack on purpose.
The best part is that Intel refers to downward-growing stacks as "grow up" and upward-growing stacks as "grow down."
NOTE:- You should not assume anything about the ordering of local variables inside the stack frame. The compiler might put the "first" variable "first" in the sense of pushing it at the current location, meaning the "first" variable is at a higher address. Or it could organize the variables upward in memory (more likely) giving the "first" variable a lower address. Or it could arrange the variables completely at random. If optimizing, it may even eliminate variables, or use the same memory location for more than one variable if their lifetimes don't overlap.
You can follow this link: BUFFER OVERFLOW 7. But it's still important to know that the return address is not guaranteed to be arranged in any particular way. If -fomit-frame-pointer is used, then the base pointer will not be on the stack. And as I said before, the ordering of local variables conforms to no specific convention.
Another complication is the presence of more than one calling convention in the same program. It is not generally possible just by looking at code addresses to tell what convention a function conforms to. The stack frame may look very different from what you expect.
If I run a program like this:
#include <stdio.h>

int main(int argc, char *argv[], char *env[]) {
    printf("My references are at %p, %p, %p\n",
           (void *)&argc, (void *)&argv, (void *)&env);
    return 0;
}
We can see that those regions are actually in the stack.
But what else is there? If we ran a loop through all the values in Linux 3.5.3 (for example, until segfault) we can see some weird numbers, and kind of two regions, separated by a bunch of zeros, maybe to try to prevent overwriting the environment variables accidentally.
Anyway, in the first region there must be a lot of numbers, such as all the frames for each function call.
How could we distinguish the end of each frame, where the parameters are, where the canary is (if the compiler added one), the return address, CPU status and such?
Without some knowledge of the overlay, you only see bits, or numbers. While some of the regions are subject to machine specifics, a large number of the details are pretty standard.
If you didn't move too far outside of a nested routine, you are probably looking at the call stack portion of memory. With some generally considered "unsafe" C, you can write up fun functions that access function variables a few "calls" above, even if those variables were not "passed" to the function as written in the source code.
The call stack is a good place to start, as 3rd party libraries must be callable by programs that aren't even written yet. As such, it is fairly standardized.
Stepping outside of your process memory boundaries will give you the dreaded segmentation violation, as memory fencing will detect an attempt by the process to access non-authorized memory. Malloc does a little more than "just" return a pointer: on systems with memory segmentation features, it also "marks" the memory as accessible to that process, and all memory accesses are checked so that the process's assignments are not violated.
If you keep following this path, sooner or later, you'll get an interest in either the kernel or the object format. It's much easier to investigate one way of how things are done with Linux, where the source code is available. Having the source code allows you to not reverse-engineer the data structures by looking at their binaries. When starting out, the hard part will be learning how to find the right headers. Later it will be learning how to poke around and possibly change stuff that under non-tinkering conditions you probably shouldn't be changing.
PS. You might consider this memory "the stack" but after a while, you'll see that really it's just a large slab of accessible memory, with one portion of it being considered the stack...
The contents of the stack are basically:
Whatever the OS passes to the program.
Call frames (also called stack frames, activation areas, ...)
What does the OS pass to the program? A typical *nix will pass the environment, arguments to the program, possibly some auxiliary information, and pointers to them to be passed to main().
In Linux, you'll see:
a NULL
the filename for the program.
environment strings
argument strings (including argv[0])
padding full of zeros
the auxv array, used to pass information from the kernel to the program
pointers to environment strings, ended by a NULL pointer
pointers to argument strings, ended by a NULL pointer
argc
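One practical consequence of this layout, assuming Linux (C itself promises nothing of the kind): the environment pointers sit directly after the NULL that ends argv, so they can be reached from argv alone. A small sketch with made-up function names:

```c
#include <stddef.h>

/* On Linux the environment pointers follow argv's terminating NULL in the
 * initial stack. This is an assumption about the platform, not a C guarantee. */
char **env_after_argv(int argc, char **argv)
{
    return argv + argc + 1;       /* skip argv[0..argc-1] and the NULL */
}

/* Count the entries in a NULL-terminated pointer array. */
size_t count_entries(char **list)
{
    size_t n = 0;
    while (list[n] != NULL)
        n++;
    return n;
}
```

In a program whose main takes the third env parameter, env_after_argv(argc, argv) would be expected to equal env on Linux, and count_entries() on either one gives the number of environment strings.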
Then, below that are stack frames, which contain:
arguments
the return address
possibly the old value of the frame pointer
possibly a canary
local variables
some padding, for alignment purposes
How do you know which is which in each stack frame? The compiler knows, so it just treats each location in the stack frame appropriately. Debuggers can use annotations for each function in the form of debug info, if available. Otherwise, if there is a frame pointer, you can identify things relative to it: local variables are below the frame pointer, arguments are above the frame pointer. Otherwise, you must use heuristics: things that look like code addresses are probably code addresses, but sometimes this results in incorrect and annoying stack traces.
The content of the stack will vary depending on the architecture ABI, the compiler, and probably various compiler settings and options.
A good place to start is the published ABI for your target architecture, then check that your particular compiler conforms to that standard. Ultimately you could analyse the assembler output of the compiler or observe the instruction level operation in your debugger.
Remember also that a compiler need not initialise the stack, and will certainly not "clear it down" when it has finished with it. So when a stack is allocated to a process or thread, it might contain any value. Even at power-on, SDRAM for example will not contain any specific or predictable value. If the physical RAM has previously been used by another process since power-on, or even by an earlier called function in the same process, the content will be whatever that code left in it. So just looking at the raw stack does not tell you much.
Commonly a generic stack frame may contain the address that control will jump to when the function returns, the values of all the parameters passed, and the values of all auto local variables in the function. However, the ARM ABI for example passes the first four arguments to a function in registers R0 to R3, and holds the return address in the LR register, so a leaf function need not store it on the stack at all. So it is not as simple in all cases as the "typical" implementation I have suggested.
The details are very dependent on your environment. The operating system generally defines an ABI, but that's in fact only enforced for syscalls.
Each language (and each compiler even if they compile the same language) in fact may do some things differently.
However there is some sort of system-wide convention, at least in the sense of interfacing with dynamically loaded libraries.
Yet, details vary a lot.
A very simple "primer" could be http://kernelnewbies.org/ABI
A very detailed and complete specification you could look at to get an idea of the level of complexity and details that are involved in defining an ABI is "System V Application Binary Interface AMD64 Architecture Processor Supplement" http://www.x86-64.org/documentation/abi.pdf
I'm worried that I am misunderstanding something about stack behavior in C.
Suppose that I have the following code:
#include <stdio.h>

int main (int argc, const char * argv[])
{
    int a = 20, b = 25;
    {
        int temp1;
        printf("&temp1 is %p\n", (void *)&temp1);
    }
    {
        int temp2;
        printf("&temp2 is %p\n", (void *)&temp2);
    }
    return 0;
}
Why am I not getting the same address in both printouts? I am getting that temp2 is one int away from temp1, as if temp1 was never recycled.
My expectation is for the stack to contain 20, and 25.
Then have temp1 on top, then have it removed, then have temp2 on top, then have it removed.
I am using gcc on Mac OS X.
Note that I am using the -O0 flag for compiling without optimizations.
To those wondering about the background for this question: I am preparing teaching materials on C, and I am trying to show the students that they should not only avoid returning pointers to automatic variables from functions, but also avoid taking the address of variables in nested blocks and dereferencing them outside. I was trying to demonstrate how this causes problems, and couldn't get the screenshot.
The compiler is completely within its rights not to optimize temp1 and temp2 into the same location. It has been many years since compilers generated code for one stack operation at a time; these days the whole stack frame is laid out at one go. (A few years back a colleague and I figured out a particularly clever way to do this.) Naive stack layout probably puts each variable in its own slot, even when, as in your example, their lifetimes don't overlap.
If you're curious, you might get different results with gcc -O1 or gcc -O2.
There is no guarantee what address stack objects will receive regardless of the order they are declared.
The compiler can happily reorder the creation and duration of stack variables providing it does not affect the results of the function.
I believe the C standard just talks about the scope and lifetime of variables defined in a block. It makes no promises about how the variables interact with the stack or if a stack even exists.
I remember reading something about it. All I have now is this obscure link.
Just to let everybody know (and for the sake of the archives), it appears to be our kernel extension is running into a known limitation of GCC. Just to recap, we have a function in a very portable, very lightweight library, that for some reason is getting compiled with a 1600+ byte stack when compiled on/for Darwin. No matter what compiler options I tried, and what optimization levels I used, the stack was no smaller than 1400 "machine check" panic in pretty reproducible (but not frequent) situations.
After a lot of searching on the Web, learning some i386 assembly and talking to some people who are much better at assembly, I have learned that GCC is somewhat notorious for having horrid stack allocation. [...]
Apparently this is gcc's dirty little secret, except it's not much of a secret to some--Linus Torvalds has complained several times on various lists about the gcc stack allocation (search lkml.org for "gcc stack usage"). Once I knew what to search for, there was plenty of griping about gcc's subpar allocation of stack variables, and in particular, its inability to re-use stack space for variables in different scopes.
With that said, my Linux version of gcc properly re-uses stack space: I get the same address for both variables. I'm not sure what the C standard says about it, but strict scope enforcement is only important for code correctness in C++ (due to destruction at the end of the scope), not in C.
There is no standard that sets how variables are placed on the stack. What happens in the compiler is much more complicated. In your code, the compiler may even choose to completely ignore and suppress variables a and b.
During the many stages of the compiler, the code may be converted to its SSA form, and all stack variables lose their addresses and meanings in this form (it may even make it harder for the debugger).
Stack space is very cheap, in the sense that the time to allocate either 2 or 20 variables is constant. Also, stack space is very dynamic for most function calls, since with the exception of a few functions (those nearer main() and thread-entry functions, with long-lived event loops or so), they tend to complete quickly. So, you just don't bother with them.
This is completely dependent on the compiler and how it is configured.