If I type int x is it using sizeof(int) bytes of memory now? Is it not until x has a value?
What if x = b + 6...is x given a spot in memory before b is?
Yes, as soon as you declare a variable like:
int x;
memory is, generally, allocated on the stack. That being said, the compiler is very smart. If it notices you never use that variable, it may optimize it away.
If I type int x is it using sizeof(int) bytes of memory now? Is it not until x has a value?
Once you declare a variable like int x; it will be taking up space in memory (typically 4 bytes in the case of an int). Giving it a value like x = 5 will just modify the memory that is already being taken up.
What if x = b + 6...is x given a spot in memory before b is?
For this statement to be valid, both x and b must have been declared before this statement. As for which one was allocated in memory first, that depends on what you did before this statement.
Example:
int x = 5;
int b = 6;
x = b + 6; //your code
In this case, x was allocated in memory before b.
Actually, when you make a function call, the space required for all locally declared variables is already allocated.
When you compile your C function and convert it to assembly, the compiler adds a procedure prologue, which moves the stack pointer to open up space for the function parameters, the return value, local variables, and a couple more values used to manage the function call.
The order of allocation of local variables is compiler dependent and doesn't necessarily have to be in the order of declaration or usage.
When you use a variable before assigning any value, the CPU just uses whatever was already in the allocated memory. It may be 0, it may be some garbage value, or it may even be a value that a previous function call left there. It totally depends on your program's execution, operating system, and compiler.
So it is one of the best habits to always initialize what you have declared as soon as possible, because you may use the variable by mistake before assigning a value. If it happens to contain the value you intended to assign (say 0, which is the more probable case), then it will work for some time. But later, all of a sudden, your program may change behavior in a way you didn't expect, even though it was working perfectly before. And debugging might be a pain because of your assumptions.
Depending on where the variable is defined it may be assigned space
for global and static variables at compile time;
for local variables at run time when the function is entered (not when the definition is encountered -- as @Faruxx pointed out);
for objects (not variables) which are allocated dynamically via malloc obviously at run time when malloc is executed. Malloc will typically request a lot more memory from the system (a page? 4k?) for the program's address space and slice it up into byte sized bits (cough) during subsequent calls.
A declaration like extern int size; will not allocate or occupy any memory but refers to a memory location which will be resolved only by the linker (if the actual definition is in another translation unit) to some memory reserved at compile time there.
Performance:
Global variables will in principle impact startup performance because global memory is zero initialized. This is obviously a one time penalty and negligible except for extreme cases. A larger executable will also need longer to start.
Variables on the stack are completely performance penalty free. They are not initialized (for exactly these performance reasons), and the cost of adjusting the stack pointer does not depend on the size of the adjustment. The maximum stack size is fixed at startup to a few kB or MB (and the program will typically crash with, you guessed it, a stack overflow when it tries to grow the stack beyond that limit); so there is no potentially expensive interaction with the operating system when the stack grows.
Allocations on the heap carry a comparatively large performance penalty because they always involve a function call (malloc) which then actually needs to do some work, plus a potential operating system call (to put a memory segment into the program's address space), plus the need to pair each malloc with a free in long-running programs which must not leak. (When I say "comparatively large" I mean "compared to local variables"; on your average PC you don't need to think about it except in the innermost loop.)
Related
If I have a for loop:
for (uint8_t x = 0; x < 100; ++x) {
    char f[2000] = {0,};
}
what actually is going on here? Is this reusing the same memory address every time so it only actually allocates it once, or is it using a new memory location each loop? Does it depend on optimization levels? What about the implementation?
There is no guarantee that f is allocated at the same memory address every time around the loop, but it would be very hard to find a compiler that doesn't do it that way. I have seen f get trashed between loop iterations, so yes, it really does go out of scope at the closing brace.
A modern compiler will almost always gather up all the variable declarations within a function, analyze for overlapping runtime scope (not compile-time scope -- it's smart enough now to figure out, most of the time, when a variable isn't used any further down in the function), and allocate all the fixed* memory on the stack at function entry time. This stack adjustment is still much cheaper than malloc().
Stack allocation is done with a single instruction for all the variables: sub rsp, constant. Freeing the memory is done the same way: add rsp, constant. Once the allocation approaches the size of a page (4KB), the compiler has to emit a call to the runtime to verify the stack has enough room; on Windows, this function is called _chkstk.
*Variable-length arrays are allocated with a separate stack adjustment near their first use.
When one iteration of the loop ends and a new one begins, the lifetime of the array f ends and the lifetime of a new array f starts. As far as the C standard is concerned, each is distinct from the other.
In practice, the compiler may use the same memory address for each iteration of f or it might not. This is not something that can be depended on. Certain optimizations may take advantage of the fact that the lifetime of f ends at the end of the loop body and do things that appear strange in non-compliant code.
In short, stick with what the C standard says is valid and don't make any assumptions about f.
I'm trying to learn c programming and can't understand how stacks work.
Everywhere I read, I find that when a function is called, a stack frame is created on the stack which contains all the data for the function call: parameters, return address, and local variables. The stack frame is removed, releasing the memory, when the function returns.
But what if we had a compound statement inside the function which has its own variables? Is the memory for the block's local variables also allocated inside the stack frame when the function is called and released when it returns?
Example
int main(){
    int a = 10;
    if(int a<50){
        int b=9;
    }
    else{
        int c=10;
    }
}
Is the memory for b and c allocated along with a when the function starts executing?
And deallocated when the function returns?
If so, then there is no difference, other than the visibility of the variable, between declaring it at the beginning of the function and declaring it inside another block in the function.
Please explain.
The C standard doesn't specify how such things are to be implemented. The C standard doesn't even mention a stack! A stack is a common way of implementing function calls but nothing in the standard requires a stack. All such things are implementation specific details. For the posted code, the standard only specifies when the variables are in scope.
So there is no general answer to your question. The answer depends on your specific system, i.e. processor, compiler, etc.
Provided that your system uses a stack (which is likely), the compiler may reserve stack space for all 3 variables, or it may reserve space for 2 variables, i.e. one for a, while b and c share the other. Both implementations will be legal. The compiler is even allowed to place the variables directly in some registers so that nothing needs to be reserved on the stack.
You can check your specific system by looking at the generated assembly code.
A C implementation may implement this in multiple ways. Let’s suppose your example objects, a, b, and c, are actually used in your code in some way that results in the compiler actually allocating memory for them and not optimizing them away. Then:
The compiler could allocate stack space (by decreasing the top-of-stack pointer) for all of a, b, and c when the function starts, and release it when the function ends.
The compiler could allocate stack space for a when the function starts, then allocate space (again by decreasing the stack pointer) in the middle of the function when space for b or c is needed, then release that stack space as each block ends.
In a good modern compiler, the compiler is likely to analyze all the active lifetimes of the objects and find a somewhat optimal solution for using stack space in overlapping ways. By “active lifetime”, I mean the time from when the value of an object is set to the last time that value is needed (not the C standard’s definition of “lifetime”). For example, in int a = f(x); … g(a); h(y); a = f(y); … g(a);, there are actually two lifetimes for a, from its initial assignment to the first g(a) and from the assignment a = f(y); to the second g(a);. If the compiler needs memory to store a, it might use different memory for these two lifetimes.
Because of the above, what memory is used for which C object can get quite complicated. A particular memory location might be used for a at one time and for b at another. It may depend on loops and goto statements in your code. It also depends on whether the address of an object is taken—if the address is taken, the compiler may have to keep the object in one place, so that the address is consistent. (It might be able to get away without doing that, depending on how it can see the address is used.)
Basically, the compiler is free to use the stack, other memory, and registers in whatever way it chooses as long as the observable behavior of your program remains as it is defined by the C standard.
(The observable behavior is the input/output interactions of your program, the data written to files, and the accesses to volatile objects.)
Your example as stated is not valid since you have int a < 50 in the if condition. However, in the corrected example below, all variables are typically allocated when the function is entered:
int main(void)
{
    int a = 10;
    if (a < 50) {
        int b = 9;
    } else {
        int c = 10;
    }
}
As mentioned by user "500 - Internal Server Error", this is an implementation issue.
What is the time complexity of declaring and defining, but not initializing, an array in C? For what reason is the answer the case?
I am interested in the time complexity at both compile and run time, but more so run time.
Here is an example of a program with such an array:
int main()
{
    int n[10]; /* n is an array of 10 integers */
    return 0;
}
If it is not O(1), constant time, is there a language that does declare and define arrays in constant time?
The language doesn't specify this. But in typical implementations, space for all local variables in a block is allocated simply by adjusting the stack pointer by the total size of all the variables when entering that block, which is O(1). Arrays are simply included in that total size, and it's calculated at compile time. VLAs (variable-length arrays) are not allocated when the block is entered; the allocation is delayed until execution reaches the declaration (since the size depends on a variable which must be assigned first), but it's still just an O(1) operation of adjusting the SP register.
I think many implementations actually allocate all the space for a function when entering the function, rather than adjusting the SP for each block. But variables that exist in blocks that do not overlap may share the same memory in the stack frame. But this is not really relevant for the question asked, unless you're wondering if there's a difference between
int a[10];
int b[10];
// code that uses a and b
and
int a[10];
{
int b[10];
// code that uses a and b
}
The compile-time complexity is O(1) for each variable (it just needs to look up the size of the datatype, and multiply by the size if it's an array), so O(n) where n is the number of local variables.
This is a strange and probably unanswerable question. Normally complexity analysis and "big O" notation are applied to algorithms, not so much implementations. But here you're essentially asking entirely about implementation, and of the non-algorithmic, "noise" or "overhead" activities of allocating arrays.
Defining and declaring are compile-time concepts, and I've never heard big-O applied to compile-time activities.
At run time, there may be some work to do to cause the array to spring into existence, whether or not it's initialized. If it's a local ("stack") array, the OS may have to allocate and page in memory for the new function's stack frame, which will probably be more or less O(n) in the array's size. But if the stack is already there, it will be O(0), i.e. free.
If the array is static and/or global, on the other hand, it only has to get allocated once. But, again, the OS will have to allocate memory for it, which might be O(n). The OS might or might not have to page the memory in -- depends on whether you do anything with the array, and on the OS's VM algorithm. (And once you start talking about VM performance, it gets very tricky to define and think about, because the overhead might end up getting shared with other processes in various ways.)
If the array is global or static, and if you don't initialize it, C says it's initialized to 0, which the C run-time library and/or OS does for you one way or another, which will almost certainly be O(n) at some level -- although, again, it may end up being overlapped or shared with other activities in various complicated or unmeasurable ways.
In C, the cost of instantiating a variable at run time (whether a scalar or an array) is (usually) down in the noise, although it really depends on the underlying platform. For example, setting aside space for auto variables on an x86 platform is (usually) done by simply adjusting the stack pointer:
subq $X, %rsp
where X is the amount of storage required for all local variables in the function. So it takes the same amount of time whether X is 4 or 4K.¹
Storage for static variables may be allocated from within the program image itself, such that the storage is set aside as soon as the program is loaded into memory (making it effectively zero-cost at runtime).
Or not.
Big O notation doesn't really apply here; the exact mechanisms for allocating storage can vary a lot based on the implementation, and much of it is out of your control. Space is usually the limiting factor here, not time.
¹ Modulo page faults or other memory subsystem functions that are beyond our control.
Take a simple program like this:
int main(void)
{
    char p;
    char *q;
    q = &p;
    return 0;
}
How is &p determined? Does the compiler calculate all such references before-hand or is it done at runtime? If at runtime, is there some table of variables or something where it looks these things up? Does the OS keep track of them and it just asks the OS?
My question may not even make sense in the context of the correct explanation, so feel free to set me straight.
How is &p determined? Does the compiler calculate all such references before-hand or is it done at runtime?
This is an implementation detail of the compiler. Different compilers can choose different techniques depending on the kind of operating system they are generating code for and the whims of the compiler writer.
Let me describe for you how this is typically done on a modern operating system like Windows.
When the process starts up, the operating system gives the process a virtual address space, of, let's say 2GB. Of that 2GB, a 1MB section of it is set aside as "the stack" for the main thread. The stack is a region of memory where everything "below" the current stack pointer is "in use", and everything in that 1MB section "above" it is "free". How the operating system chooses which 1MB chunk of virtual address space is the stack is an implementation detail of Windows.
(Aside: whether the free space is at the "top" or "bottom" of the stack, whether the "valid" space grows "up" or "down" is also an implementation detail. Different operating systems on different chips do it differently. Let's suppose the stack grows from high addresses to low addresses.)
The operating system ensures that when main is invoked, the register ESP contains the address of the dividing line between the valid and free portions of the stack.
(Aside: again, whether the ESP is the address of the first valid point or the first free point is an implementation detail.)
The compiler generates code for main that moves the stack pointer by, let's say, five bytes, subtracting from it if the stack grows "down". It moves it by five because it needs one byte for p and four for q. So the stack pointer changes; there are now five more "valid" bytes and five fewer "free" bytes.
Let's say that q is the memory that is now in ESP through ESP+3 and p is the memory now in ESP+4. To assign the address of p to q, the compiler generates code that copies the four byte value ESP+4 into the locations ESP through ESP+3.
(Aside: Note that it is highly likely that the compiler lays out the stack so that everything that has its address taken is on an ESP+offset value that is divisible by four. Some chips have requirements that addresses be divisible by pointer size. Again, this is an implementation detail.)
If you do not understand the difference between an address used as a value and an address used as a storage location, figure that out. Without understanding that key difference you will not be successful in C.
That's one way it could work but like I said, different compilers can choose to do it differently as they see fit.
The compiler cannot know the full address of p at compile-time because a function can be called multiple times by different callers, and p can have different values.
Of course, the compiler has to know how to calculate the address of p at run-time, not only for the address-of operator, but simply in order to generate code that works with the p variable. On a regular architecture, local variables like p are allocated on the stack, i.e. in a position with fixed offset relative to the address of the current stack frame.
Thus, the line q = &p simply stores into q (another local variable allocated on the stack) the address p has in the current stack frame.
Note that in general, what the compiler does or doesn't know is implementation-dependent. For example, an optimizing compiler might very well optimize away your entire main after analyzing that its actions have no observable effect. The above is written under the assumption of a mainstream architecture and compiler, and a non-static function (other than main) that may be invoked by multiple callers.
This is actually an extraordinarily difficult question to answer in full generality because it's massively complicated by virtual memory, address space layout randomization and relocation.
The short answer is that the compiler basically deals in terms of offsets from some “base”, which is decided by the runtime loader when you execute your program. Your variables, p and q, will appear very close to the “bottom” of the stack (although the stack base is usually very high in VM and it grows “down”).
The address of a local variable cannot be completely calculated at compile time. Local variables are typically allocated on the stack. When called, each function allocates a stack frame: a single contiguous block of memory in which it stores all its local variables. The physical location of the stack frame in memory cannot be predicted at compile time; it only becomes known at run time. The beginning of each stack frame is typically stored at run time in a dedicated processor register, like ebp on the Intel platform.
Meanwhile, the internal memory layout of a stack frame is pre-determined by the compiler at compile-time, i.e. it is the compiler who decides how local variables will be laid out inside the stack frame. This means that the compiler knows the local offset of each local variable inside the stack frame.
Put this all together and we get that the exact absolute address of a local variable is the sum of the address of the stack frame itself (the run-time component) and the offset of this variable inside that frame (the compile-time component).
This is basically exactly what the compiled code for
q = &p;
will do. It will take the current value of the stack frame register, add some compile-time constant to it (offset of p) and store the result in q.
In any function, the function arguments and the local variables are allocated on the stack, just past the caller's saved state (such as its program counter at the point where it called the current function). How these variables get allocated on the stack and then deallocated when returning from the function is taken care of by the compiler at compile time.
For example, in this case, p (1 byte) could be allocated first on the stack, followed by q (4 bytes on a 32-bit architecture). The code assigns the address of p to q. The address of p then is naturally 5 added to or subtracted from the last value of the stack pointer; well, something like that, depending on how the stack pointer is updated and whether the stack grows upwards or downwards.
How the return value is passed back to the calling function is something I'm not certain of, but I'm guessing it is passed through a register and not the stack. So, when return is executed, the underlying assembly code deallocates p and q, places zero into that register, and returns to the caller's last position. Of course, in this case the caller is the main function, so it is more complicated in that it causes the OS to terminate the process; but in other cases, it just goes back to the calling function.
In C89 ("ANSI C"), all local variables must be declared at the top of a block, and they are allocated on the stack once when entering the function and deallocated when returning from it. In C++ and later versions of C, this becomes more complicated, since local variables can also be declared in the middle of blocks (like if-else or while statement blocks). In that case, the local variable is allocated onto the stack when entering the block and deallocated when leaving the block.
In all cases, the address of a local variable is always a fixed number added or subtracted from the stack pointer (as calculated by the compiler, relative to the containing block) and the size of the variable is determined from the variable type.
However, static local variables and global variables are different in C. These are allocated in fixed locations in the memory, and thus there's a fixed address for them (or a fixed offset relative to the process' boundary), which is calculated by the linker.
Yet a third variety is memory allocated on the heap using malloc/new and free/delete. I think this discussion would be too lengthy if we include that as well.
That said, my description is only for a typical hardware architecture and OS. All of these are also dependent on a wide variety of things, as mentioned by Emmet.
p is a variable with automatic storage. It lives only as long as the function it is in lives. Every time its function is called memory for it is taken from the stack, therefore, its address can change and is not known until runtime.
I was recently asked this question by a friend.
In a C program if I declare an integer
int x = 3;
then will it be fetched into the cache?
My opinion:
Yes. The processor will allocate sizeof(int) amount of space in memory. Then, to write 3 to that memory location, it will get x into its registers and then write 3 to it. So, as x is stored in CPU registers (this is how I think it works), it will also be fetched into the cache.
Whereas if we only declare the integer and do not initialize it.
Eg.
int x;
Then the CPU just allocates the memory and does not write anything to it, so in this case x won't be in the cache.
This can be generalized to the question of when a variable is fetched into the cache.
Let me know if my thinking is correct.
Thanks
There is no definitive answer to this... but yes, more than likely. There is even a very good chance it will be put into a register as well. If it can, the compiler will avoid memory and just keep it in a register!
If there's optimization at all, it's fairly likely that 3 will never even make it into a register. Rather, the compiler will recognize that x has a value of 3 and will substitute 3 for the next use of x, possibly with, e.g., an add-immediate instruction that doesn't first place the value in a register.
Or the compiler may optimize x into a register and so the value of x will never be stored in memory and hence never go through cache.
And some processors have what's known as a "store through" cache, meaning that if x is assigned a storage location the value may be placed into that location without first/simultaneously being placed in storage cache.
So we can most definitely say that the value 3 might possibly appear somewhere in cache. Sometimes.
First, the CPU does not allocate memory for your integer variable, at least not by itself. Memory allocation is the combined task of the compiler and OS. The compiler generates code either for the CPU or for the OS to allocate memory either on the stack or in the heap. Then that code executes and reserves memory.
Depending on your code, compiler optimizations and operation of the various caches there are multiple possible fates for the variable:
it doesn't get into the cache at all because there's no cache or it's disabled
it doesn't get into the cache because no code uses this variable or any data immediately adjacent to it, so there's no chance of sucking this variable into the cache
it can get into the instruction cache instead of the data cache if the compiler finds that this variable is a constant and can be directly encoded in an instruction (example: int x=3; y+=x; Here the compiler may simply generate code for y+=3 and that 3 can be an immediate operand of a mov or add instruction)
similarly to the above the compiler may find out how to generate optimized code without ever needing the value of the variable (3) anywhere (example: int x=3; while(x--) printf("*"); Here the compiler may just generate 3 calls to printf("*") or even a single call to printf("***"))
it gets into the cache temporarily and after use is squeezed out of the cache by other data
In addition to being implementation-dependent, it also depends on where that declaration is written. If it's a global variable, the 3 will probably just be stored in the executable's data segment so that it's mapped into the process's address space when the program starts running. The assignment doesn't "happen" at runtime in that case.
If x is allocated on the stack instead of being in a register or optimized entirely away, then it will certainly be in cache. The stack is almost always in cache because the stack is always being used.