Is it accurate to say that in
void f() {
    int x;
    ...
}
"int x;" means allocating sizeof(int) bytes on the stack?
Are there any specifications for that?
Nothing in the standard mandates that there is a stack. And nothing in the standard mandates that a local variable needs memory allocated for it. The variable could be placed in a register, or even removed altogether as an optimization.
There is no specification about that, and your assumption is often (but not always) false.
Consider some code like
void f() {
    int x;
    for (x = 0; x < 1000; x++)
    {
        // do something with x
    }
    // x is no longer used here
}
First, an optimizing compiler would put x inside some register of the machine and not consume any stack location (unless e.g. you do something with the address &x like storing it in a global).
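For instance, here is a minimal sketch of the kind of code that forces x into memory (the helper use is hypothetical and assumed to live in another translation unit):
void use(int *p);   /* hypothetical helper, compiled in another translation unit */

void f(void)
{
    int x = 42;     /* with optimization, x would normally live in a register */
    use(&x);        /* taking &x forces the compiler to give x an actual memory
                       location (typically a stack slot), since the callee might
                       read or write through the pointer */
}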
Also the compiler could unroll that loop, and remove x from the generated code. For example, many compilers would replace
for (x=0; x<5; x++) g(x);
with the equivalent of
g(0); g(1); g(2); g(3); g(4);
and perhaps replace
for (x=0; x<10000; x++) t[x]=x;
with something like
for (α = 0; α < 10000; α += 4)
{ t[α] = α; t[α+1] = α+1; t[α+2] = α+2; t[α+3] = α+3; };
where α is a fresh variable (or perhaps x itself).
Also, there might be no stack at all. For C that is uncommon, but some other languages have been implemented without one (see e.g. A. Appel's book Compiling with Continuations).
BTW, if using GCC you could inspect its intermediate (Gimple) representations with e.g. the MELT probe (or using gcc -fdump-tree-all which produces hundreds of dump files!).
From the GNU C Library manual:
3.2.1 Memory Allocation in C Programs
Automatic allocation happens when you declare an automatic variable,
such as a function argument or a local variable. The space for an
automatic variable is allocated when the compound statement containing
the declaration is entered, and is freed when that compound statement
is exited. In GNU C, the size of the automatic storage can be an
expression that varies. In other C implementations, it must be a
constant.
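For illustration, here is a minimal sketch of what the quote calls automatic storage whose size varies (a C99 VLA, also a long-standing GNU C extension); the function and variable names are made up:
void fill(int n)
{
    int buf[n];                /* automatic array whose size is only known at run time */
    for (int i = 0; i < n; i++)
        buf[i] = i;
}                              /* buf's storage is released when the block is exited */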
It depends on a lot of factors. The compiler can optimize and remove the variable from the stack entirely, keeping the value in a register, etc.
If you compile in debug mode it almost certainly does allocate some space on the stack, but you never know. This is not specified. The only things specified are the visibility of the variable, its size, and the arithmetic on it. Look at the C99 spec for more information.
I think it depends on the compiler. I used the default compiler for Code::Blocks and Dev-C++, and it looks like memory is allocated during initialization. In the following cout statement, changing n2 to n1 gives the same answer. But if I initialize n1 to some value, or if I display n2 before I display the average, I get a different answer, which is garbage.
Note that VS correctly handles this by giving an error, since the variables are not initialized.
#include <iostream>
using std::cout;
using std::endl;

void getNums();
void getAverage();

int main()
{
    getNums();
    getAverage();
    return 0;
}

void getNums()
{
    int num1 = 4;       // these values may be left behind in the stack frame
    double total = 10;
}

void getAverage()
{
    int counter;        // never initialized: reading them is undefined behaviour,
    double n1, n2;      // but they may happen to reuse getNums' old stack slots
    cout << n2 / counter << endl;
}
I have a function that needs external parameters and afterwards creates variables that are heavily used inside that function. E.g. the code could look like this:
void abc(const int dim);

void abc(const int dim) {
    double arr[dim] = { 0.0 };
    for (int i = 0; i != dim; ++i)
        arr[i] = i;
    // heavy usage of the arr
}

int main() {
    const int par = 5;
    abc(par);
    return 0;
}
But I am getting a compiler error, because allocation on the stack needs compile-time constants. When I tried allocating manually on the stack with _malloca, the time performance of the code got worse (compared to the case where I declare the constant par inside the abc() function). And I don't want the array arr to be on the heap, because it is supposed to contain only a small number of values and it is going to be used quite often inside the function. Is there some way to keep the efficiency while still being able to pass the array's size to the function as a parameter?
EDIT: I am using the MSVC compiler (VC 2017) and I received error C2131: expression did not evaluate to a constant.
If you're using a modern C compiler that implements all of C99, or C11 with the variable-length array extension, this would work with one little modification:
void abc(const int dim);

void abc(const int dim) {
    double arr[dim];
    for (int i = 0; i != dim; ++i)
        arr[i] = i;
    // heavy usage of the arr
}

int main(void) {
    const int par = 5;
    abc(par);
    return 0;
}
I.e. double arr[dim] would work - it doesn't have a compile-time constant size, but it is enough for its size to be known at runtime. However, such a VLA cannot have an initializer.
Unfortunately MSVC is not a modern C compiler in this respect: Microsoft has chosen not to implement VLAs (and I even suspect they are a big part of why VLAs were made optional in C11). So you'd need to define the array in main and then pass a pointer to it to the function abc; or, if the size is globally constant, use an actual compile-time constant, i.e. a #define.
However, you're not showing the actual code that you're having performance problems with. It might very well be that the compiler can produce optimized output if it knows the number of iterations - if that is true, then the "globally defined size" might be the only way to get excellent performance.
Unfortunately the Microsoft Compiler does not support variable length arrays.
If the array is not too large, you could allocate it with the largest size that could ever be needed and pass a pointer to that stack array, along with the dimension actually in use, to the function. This approach could also help limit the number of allocations.
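A rough sketch of that approach, assuming a made-up upper bound MAX_DIM (not taken from the question):
#define MAX_DIM 64                 /* hypothetical upper bound on dim */

void abc(double *arr, int dim)     /* the callee works through a pointer */
{
    for (int i = 0; i != dim; ++i)
        arr[i] = i;
    /* heavy usage of arr */
}

int main(void)
{
    const int par = 5;
    double arr[MAX_DIM];           /* fixed-size array in main's stack frame */
    abc(arr, par);
    return 0;
}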
Another option is to implement a simple heap-allocated global pool for functions of this type to use. The pool allocates one large contiguous chunk on the heap, and you then get a pointer to your reservation within the pool. The benefit of this approach is that you will not have to worry about over-allocation on the stack causing a segmentation fault (which can happen with variable-length arrays).
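A minimal sketch of such a pool (the names pool_alloc/pool_reset and the pool size are invented for illustration; a real version would also need to handle alignment and growth):
#include <stdlib.h>

#define POOL_SIZE (1u << 20)               /* hypothetical pool capacity, in doubles */

static double *pool      = NULL;           /* one large heap block, allocated once */
static size_t  pool_used = 0;

/* Reserve n doubles from the pool; returns NULL if the pool is exhausted. */
static double *pool_alloc(size_t n)
{
    if (pool == NULL)
        pool = malloc(POOL_SIZE * sizeof *pool);
    if (pool == NULL || POOL_SIZE - pool_used < n)
        return NULL;
    double *p = pool + pool_used;
    pool_used += n;
    return p;
}

/* Hand every reservation back at once (simple arena-style reset). */
static void pool_reset(void)
{
    pool_used = 0;
}

void abc(int dim)
{
    double *arr = pool_alloc((size_t)dim);
    if (arr == NULL)
        return;                            /* pool exhausted */
    for (int i = 0; i != dim; ++i)
        arr[i] = i;
    /* heavy usage of arr */
    pool_reset();                          /* everything reserved so far is released */
}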
As I understand it, the following code generates variable-length arrays (via a non-standard extension of C++):
#include <cstdlib>
#include <iostream>
using namespace std;

int main()
{
    int valone = rand();
    int valtwo = rand();
    int array[valone][valtwo];
    // Printing size
    cout << sizeof(array) << endl;
}
Is there any way to check whether it is created on the stack or the heap? The Wikipedia description here says that gcc creates it on the stack, but when I tried the above code, most of the time the array size seems too big to fit into the stack, but it never complains.
Note: This code works only with gcc & clang, and not with Visual Studio.
the array size seems too big to fit into the stack, but it never complains.
By "never complains", I presume you mean that the program doesn't crash.
You never touch the memory that you allocate, and the compiler was smart enough to prove that and didn't actually allocate anything.
Let us take the address of the variable, and send it to a function that is defined elsewhere:
int array[valone][valtwo] = {};
cout << &array << endl;
Now the compiler can't be so sure that the array is never accessed. That's because it can't see into the streaming operator, which is implemented in another translation unit. Perhaps the operator will dereference the pointer; we must make sure that the array exists.
A segfault crashed this program on my first attempt; the stack had overflowed.
I suppose this kind of crash test is one way to check whether the VLA is on the stack.
Mikhail's suggestion in the comments to compare the addresses of ordinary automatic variables with the VLA's address is a decent platform-dependent idea, but it can only work if you allocate a VLA small enough that it doesn't crash the program.
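A rough, platform-dependent sketch of that adjacency check, written in plain C99 for brevity (the same idea applies to the g++/clang extension); keep n small so the program does not crash:
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int marker = 0;                        /* ordinary automatic (stack) variable */
    int n = rand() % 16 + 1;               /* small run-time size, so nothing overflows */
    int vla[n];                            /* the array under test */
    int *heap = malloc(n * sizeof *heap);  /* a known heap allocation, for contrast */

    /* If &vla is numerically close to &marker and far from heap, the VLA is
       almost certainly on the stack. This is not guaranteed by the standard;
       it is only a platform-dependent observation. */
    printf("marker: %p\n", (void *)&marker);
    printf("vla:    %p\n", (void *)vla);
    printf("heap:   %p\n", (void *)heap);

    free(heap);
    return 0;
}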
It might be a tricky question, and I have tried something like this:
#include "iostream"
int Stack_or_heap(void* ptr)
{
int dummy;
return ptr > &dummy;
}
int main(int argc, char** argv)
{
int* i = new int();
int x, y, z;
std::cout << Stack_or_heap(&x) << Stack_or_heap(&y) << Stack_or_heap(&z) << Stack_or_heap(i);
}
First, I assume you know that any memory chunk allocated on the heap using either malloc or new lives for the rest of the program unless it is explicitly deallocated. Thus we can conclude that if a variable-length array only exists within its scope, rather than for the lifetime of the program, it is most probably allocated on the stack.
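As a small illustration of that scoping argument (a sketch only, not code you would actually want to use), a pointer to a VLA dangles as soon as the block ends, exactly like a pointer to any other automatic variable:
int *get_vla(int n)
{
    int a[n];       /* VLA: automatic storage, tied to this block */
    a[0] = n;
    return a;       /* wrong: a's lifetime ends here, so the returned
                       pointer dangles, just as with any stack variable */
}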
I don't know whether there are cases where a chunk of memory is allocated on the heap but behaves as if it were on the stack. It is best to look at the assembly code for that. Such an implementation would probably use dynamic allocation functions/syscalls under the hood.
If it looks like a duck, walks like a duck and talks like a duck, then it should be safe to assume it is a duck (most of the time).
Depending on the version of the C compiler and the compiler flags, it is possible to declare (and initialize) variables at any place in your functions (as far as I am aware).
I'm used to putting all the variables at the top of the function, but a discussion started about the memory use of variables defined elsewhere in the function.
Below I have written two short examples, and I wondered if anyone could explain to me (or verify) how the memory gets allocated.
Example 1: Variable y is defined after a possible return statement, so there is a chance the variable won't be used at all. As far as I'm aware this doesn't matter, and the code (memory allocation) would be the same if the variable were placed at the top of the function. Is this correct?
Example 2: Variable x is declared in a loop, meaning that its scope is limited to that loop, but what about its memory use? Would it be any different if it were placed at the top of the function, or is the space simply set up on the stack at the function call?
Edit: To sum it up in one main question:
Does reducing the scope of a variable, or changing the location of its first use (anywhere other than the top), have any effect on the memory use?
Code example 1
static void Function(void){
    uint8_t x = 0;
    //code changing x
    if(x == 2)
    {
        return;
    }
    uint8_t y = 0;
    //more code changing y
}
Code example 2
static void LoopFunction(void){
    uint8_t i = 0;
    for(i = 0; i < 100; i++)
    {
        uint8_t x = i;
        // do some calculations
        uartTxLine("%d", x);
    }
    //more code
}
I'm used to putting all the variables at the top of the function
This used to be required in older versions of C, but the language dropped that requirement with C99. As long as the compiler knows the type of a variable at the point of its first use, it has all the information it needs.
I wondered if anyone could explain to me how the memory gets allocated.
The compiler decides how to allocate memory in the automatic storage area. Implementations are not limited to the approach that gives each variable you declare a separate location. They are allowed to reuse locations of variables that go out of scope, and also of variables no longer used after a certain point.
In your first example, variable y is allowed to use the space formerly occupied by variable x, because the first point of use of y is after the last point of use of x.
In your second example the space used for x inside the loop can be reused for other variables that you may declare in the // more code area.
Basically, the story goes like this. When calling a function in raw assembler, it is customary to store everything used by the function on the stack upon entering the function, and to clean it up upon leaving. Certain CPUs and ABIs may have a calling convention which involves automatic stacking of parameters.
Likely because of this, C and many other old languages required all variables to be declared at the top of the function (or at the top of the scope), so that the { } braces reflect pushes and pops on the stack.
Somewhere around the 80s/90s, compilers started to optimize such code efficiently, as in they would only allocate room for a local variable at the point where it was first used, and de-allocate it when there was no further use for it. Regardless of where that variable was declared - it didn't matter for the optimizing compiler.
Around the same time, C++ lifted the variable declaration restrictions that C had, and allowed variables to be declared anywhere. However, C did not actually fix this before the year 1999 with the updated C99 standard. In modern C you can declare variables everywhere.
So there is absolutely no performance difference between your two examples, unless you are using an incredibly ancient compiler. It is however considered good programming practice to narrow the scope of a variable as much as possible - though it shouldn't be done at the expense of readability.
Although it is only a matter of style, I would personally prefer to write your function like this:
(note that you are using the wrong printf format specifier for uint8_t)
#include <inttypes.h>

static void LoopFunction (void)
{
    for(uint8_t i = 0; i < 100; i++)
    {
        uint8_t x = i;
        // do some calculations
        uartTxLine("%" PRIu8, x);
    }
    //more code
}
Old C only allowed variables to be declared (and initialized) at the top of a block. You were, however, allowed to open a new block (a pair of { and } characters) anywhere inside a block, so you did have a way to declare variables next to the code that uses them:
... /* inside a block */
{
    int x = 3;
    /* use x */
} /* x is not addressable past this point */
And you were permitted to do this in switch statements, if statements, and while and do statements (anywhere you can open a new block).
Now you are permitted to declare a variable anywhere a statement is allowed, and the scope of that variable runs from the point of declaration to the end of the innermost block in which it is declared.
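For example, a minimal sketch of that point-of-use style in C99 or later (the function name is made up):
#include <stdio.h>

void print_squares(void)
{
    for (int i = 0; i < 10; i++) {   /* i is scoped to the loop itself */
        int sq = i * i;              /* declared exactly where it is first needed */
        printf("%d\n", sq);
    }
    /* neither i nor sq is visible here */
}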
Compilers decide when to allocate storage for local variables: they can allocate all of them when the stack frame is created (this is the gcc way, as it allocates local variables only once), or when the block containing the definition is entered (Microsoft C does it this way, for example). Allocating space at runtime means advancing the stack pointer at runtime, so doing it only once per stack frame saves CPU cycles (but wastes memory locations). The important thing is that you are not allowed to refer to a variable's location outside its scope; if you try, you get undefined behaviour.
I once discovered an old bug in a program that had been circulating on the internet for a long time, because nobody had taken the time to compile it with the Microsoft C compiler (where it failed with a core dump) instead of the usual GCC. The code took the address of a local variable defined in an inner scope (the then-branch of an if statement) and used it in some other part of the code (since everything was in the main function, the stack frame was present the whole time). Microsoft C reclaimed the space upon exiting the if statement, but GCC kept it until main finished. The bug was fixed by just adding a static modifier to the variable declaration (giving it static storage duration, even though it stays scoped), and no further refactoring was necessary.
int main()
{
    struct bla_bla *pointer_to_x;
    ...
    if (something) {
        struct bla_bla x;
        ...
        pointer_to_x = &x;
    }
    /* x no longer exists here (but its storage happened to survive with gcc) */
    do_something_to_bla_bla(pointer_to_x); /* wrong, x doesn't exist */
} /* main */
when changed to:
int main()
{
    struct bla_bla *pointer_to_x;
    ...
    if (something) {
        static struct bla_bla x; /* now has static storage duration, even though it is still scoped */
        ...
        pointer_to_x = &x;
    }
    /* x is not visible here, but it still exists, so pointer_to_x remains valid */
    do_something_to_bla_bla(pointer_to_x); /* correct now */
} /* main */
This is a question for my Programming Langs Concepts/Implementation class. Given the following C code snippet:
#include <stdio.h>

void foo()
{
    int i;
    printf("%d ", i++);
}

void main()
{
    int j;
    for (j = 1; j <= 10; j++)
        foo();
}
The local variable i in foo is never initialized but behaves similarly to a static variable on most systems, meaning the program will print 0 1 2 3 4 5 6 7 8 9. I understand why it does this (the memory location of i never changes), but the homework asks us to modify the code (without changing foo) to alter this behavior. I've come up with a solution that works and makes the program print ten 0's, but I don't know if it's the "right" solution and, to be honest, I don't exactly know why it works.
Here is my solution:
#include <stdlib.h>

void main()
{
    int j;
    void* some_ptr = NULL;
    for (j = 1; j <= 10; j++)
    {
        some_ptr = malloc(sizeof(void*));
        foo();
        free(some_ptr);
    }
}
My original thought process was that i wasn't changing locations because there was no other memory manipulation happening around the calls of foo, so allocating a variable should disrupt that. But since some_ptr is allocated on the heap and i is on the stack, shouldn't the allocation of some_ptr have no effect on i? My thought is that the compiler is playing some games with the optimization of that subroutine call; could anyone clarify?
There cannot be a "right" solution. But there can be a class of solutions which work for a particular CPU architecture, ABI, compiler, and compiler options.
Changing the code to something like this writes just outside main's local array, into the region of the stack that foo()'s frame will occupy, in a way that should have the intended effect in many, if not most, environments.
#include <stdio.h>

void foo()
{
    int i;
    printf("%d ", i++);
}

void main()
{
    int j;
    int a[2];
    for (j = 1; j <= 10; j++)
    {
        foo();
        a[-5] = j * 100;   /* deliberately writes outside a, into the stack slot
                              that foo() will reuse for i (undefined behaviour) */
    }
}
Output (gcc x64 on Linux):
0 100 200 300 400 500 600 700 800 900
The offset -5 reflects the number of words of stack used for overhead and variables spanning the two functions: the return address, the saved stack link value, etc. The stack likely looks like this when main() writes to a[-5] (the same slot foo() then uses for i):
i
saved stack link
return address
main's j
(must be something else)
main's a[]
I guessed -5 on the second try. -4 was my first guess.
When you call foo() from main(), the (uninitialized) variable i is allocated at a memory address. In the original code, it so happens that it is zero (on your machine, with your compiler, and your chosen compilation options, your environment settings, and given the current phase of the moon — it might change when any of these, or a myriad other factors, changes).
By calling another function before calling foo(), you allow the other function to overwrite the memory location that foo() will use for i with a different value. It isn't guaranteed to change; you could, by bad luck, replace the zero with another zero.
You could perhaps use another function:
static void bar(void)
{
    int j;
    for (j = 10; j < 20; j++)
        printf("%d\n", j);
}
and calling that before calling foo() will change the value in i. Calling malloc() changes things too. Calling pretty much any function will probably change it.
However, it must be (re)emphasized that the original code is laden with undefined behaviour, and calling other functions doesn't make it any less undefined. Anything can happen and it is valid.
The variable i in foo is simply uninitialized, and uninitialized variables have indeterminate values upon entering the block. The fact that you saw it print a particular sequence of values is entirely coincidental, and to write standard-conforming C you should never rely on such behavior. You should always initialize automatic variables before using them.
From the C11 standard, 6.2.4p6:
For such an object that does not have a variable length array type, its lifetime extends from entry into the block with which it is associated until execution of that block ends in any way. (Entering an enclosed block or calling a function suspends, but does not end, execution of the current block.) If the block is entered recursively, a new instance of the object is created each time. The initial value of the object is indeterminate. If an initialization is specified for the object, it is performed each time the declaration or compound literal is reached in the execution of the block; otherwise, the value becomes indeterminate each time the declaration is reached.
The reason the uninitialized variable seems to keep its value across calls is that it is on the stack, and the stack pointer happens to have the same value every time the function is called.
The reason your code might be changing the value is that you started calling other functions: malloc and free. Their internal stack variables are using the same location as i in foo().
As for optimization, small programs like this are in danger of disappearing entirely. GCC or Clang might decide that since using an uninitialized variable is undefined behavior, the compiler is within its rights to completely remove the code. Or it might put i in a register set to zero, then decide all the printf calls output zero, then decide that your entire program is simply a single fputs("0 0 0 0 0 0 0 0 0 0 ", stdout) call.
I've been trying to get into the habit of defining trivial variables at the point they're needed. I've been cautious about writing code like this:
while (n < 10000) {
    int x = foo();
    [...]
}
I know that the standard is absolutely clear that x exists only inside the loop, but does this technically mean that the integer will be allocated and deallocated on the stack with every iteration? I realise that an optimising compiler isn't likely to do this, but is that guaranteed?
For example, is it ever better to write:
int x;
while (n < 10000) {
    x = foo();
    [...]
}
I don't mean with this code specifically, but in any kind of loop like this.
I did a quick test with gcc 4.7.2 on a simple loop differing only in this way, and the same assembly was produced. But my question really is: according to the standard, are these two identical?
Note that "allocating" automatic variables like this is pretty much free; on most machines it's either a single-instruction stack pointer adjustment, or the compiler uses registers in which case nothing needs to be done.
Also, since the variable remains in scope until the loop exits, there's absolutely no reason to "delete" (=readjust the stack pointer) it until the loop exits, I certainly wouldn't expect there to be any overhead per-iteration for code like this.
Also, of course the compiler is free to "move" the allocation out of the loop altogether if it feels like it, making the code equivalent to your second example with the int x; before the while. The important thing is that the first version is easier to read and more tightly localized, i.e. better for humans.
Yes, the variable x inside the loop is technically defined on each iteration, and initialized via the call to foo() on each iteration. If foo() produces a different answer each time, this is fine; if it produces the same answer each time, it is an optimization opportunity – move the initialization out of the loop. For a simple variable like this, the compiler typically just reserves sizeof(int) bytes on the stack — if it can't keep x in a register — that it uses for x when x is in scope, and may reuse that space for other variables elsewhere in the same function. If the variable was a VLA — variable length array — then the allocation is more complex.
The two fragments in isolation are equivalent, but the difference is the scope of x. In the example with x declared outside the loop, the value persists after the loop exits. With x declared inside the loop, it is inaccessible once the loop exits. If you wrote:
{
    int x;
    while (n < 10000)
    {
        x = foo();
        ...other stuff...
    }
}
then the two fragments are near enough equivalent. At the assembler level, you'll be hard pressed to spot the difference in either case.
My personal point of view is that once you start worrying about such micro-optimisations, you're doomed to failure. The gain is:
a) Likely to be very small
b) Non-portable
I'd stick with code that makes your intention clear (i.e. declare x inside the loop) and let the compiler care about efficiency.
There is nothing in the C standard that says how the compiler should generate code in either case. It could adjust the stack pointer on every iteration of the loop if it fancies.
That being said, unless you start doing something crazy with VLAs like this:
void bar(char *, char *);

void
foo(int x)
{
    int i;
    for (i = 1; i < x; i++) {      /* start at 1: a zero-length VLA is not allowed */
        char a[i], b[x - i];
        bar(a, b);
    }
}
the compiler will most likely just allocate one big stack frame at the beginning of the function. It's harder to generate code for creating and destroying variables in blocks instead of just allocating all you need at the beginning of the function.