Location in memory for integers in c [duplicate] - c

I wonder how memory is allocated in c. In the example below, it looks like the compiler allocates some memory for the program and then goes backward. How and why does it work like that?
Code:
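#include <stdio.h>
int main (void) {
int a;
int b;
int c, d;
/* %p expects a void * argument, hence the casts */
printf("a: %p\nb: %p\nc: %p\nd: %p\n", (void *)&a, (void *)&b, (void *)&c, (void *)&d);
}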
Output:
a: 0x7ffff275351c
b: 0x7ffff2753518
c: 0x7ffff2753514
d: 0x7ffff2753510

This depends on the architecture, but on many systems local variables are stored on the stack. The compiler is not strictly required to populate the stack top down, so the layout depends on the compiler implementation. Usually the compiler reserves enough space on the stack and organizes it as it sees fit.
The compiler may not even put it in memory, but can place the values directly in registers instead, as long as the observable behavior stays the same.

This strongly depends on the specific implementation. The language itself only specifies minimal rules for where and how memory is allocated for different objects (see the C 2011 Online Draft, section 6.2.4., for the complete set of rules).
In most implementations, auto variables (i.e., "local" variables) are created on the stack. While implementations are not required to create variables in the same order in which they are declared, it's not uncommon for them to do so.
On x86/x86-64, the stack grows "downwards" towards decreasing addresses, so each item pushed onto the stack will have a lower address than the previous item.

Related

Is the stack pre-allocated in a process?

My question is as follows. I read somewhere that a Linux process allocates 8 MiB for its stack. If I have a C program that only puts two variables on the stack, is it right to say that I allocated that space, or is it better to say that I just reused it? The process allocates 8 MiB for the stack regardless of how much my program actually uses, as long as I do not exceed that size. So which term is appropriate: do I allocate data on the stack, or do I reuse space that the Linux process has already allocated?
#include <stdio.h>
void f() {
int x = 5;
printf("Value = %d End = %p\n", x, &x);
}
void g() {
int y = 10;
printf("Value = %d End = %p\n", y, &y);
}
int main(){
f();
g();
return 0;
}
Note that the addresses will be the same, because I reused space that had already been allocated; the same would not happen with malloc. In short, is the term "allocate data on the stack" not really correct?
Is the stack pre-allocated in a process?
On a stack-based architecture, a process will have stack space available to it from the beginning of its execution. That could be described as "pre-allocated". However, do note that in some contexts, it may be possible for a process's stack to be extended during the lifetime of the process. Perhaps that changes how you would view it?
In any case, that has little to do with whether the process of assigning storage space for automatic variables should be described as "allocation". Although it has technical implications, it is of little account linguistically that such space may be carved out of the stack, as opposed to out of some other area of memory controlled by the process. The lifetimes of such objects do obey different rules than the lifetimes of mallocated objects, but so what?
if I have a C program for example, that I only allocate two variables on the stack, it is right to say that I allocated or is it better to say that I just reused that space?
People are likely to understand you just fine either way. Although I'm sure there are some who would quibble over whether "allocate" is technically correct for automatic variables, it is nevertheless widely used for them. If you are conversing with people, as opposed to writing technical documentation to which the distinction is important, then I would not hesitate to use "allocate" to describe assigning storage space to automatic variables.

Static Allocation - C language [duplicate]

As far as I know, the C compiler (I am using GCC 6) scans the code in order to:
Find syntax issues;
Allocate memory for the program (the static allocation concept);
So why does this code work?
#include <stdio.h>
int main(){
int integers_amount; // each int has 4 bytes
printf("How many integers do you wanna store? \n");
scanf("%d", &integers_amount);
int array[integers_amount];
printf("Size of array: %zu\n", sizeof(array)); // Should be 4 times integers_amount; sizeof yields a size_t, hence %zu
for(int i = 0; i < integers_amount; i++){
int integer;
printf("Type the integer: \n");
scanf("%d", &integer);
array[i] = integer;
}
for(int j = 0; j < integers_amount; j++){
printf("Integer typed: %d \n", array[j]);
}
return 0;
}
My point is:
How does the C compiler infer the size of the array during compilation time?
I mean, it was declared but its value has not been informed just yet (Compilation time). I really believed that the compiler allocated the needed amount of memory (in bytes) at compilation time - That is the concept of static allocation matter of fact.
From what I could see, the allocation for the variable 'array' is done during runtime, only after the user has informed the 'size' of the array. Is that correct?
I thought that dynamic allocation was used to use the needed memory only (let's say that I declare an integer array of size 10 because I don't know how many values the user will need to hold there, but I ended up only using 7, so I have a waste of 12 bytes).
If during runtime I have those bytes informed I can allocate only the memory needed. However, it doesn't seem to be the case because from the code we can see that the array is only allocated during runtime.
Can I have some help understanding that?
Thanks in advance.
How does the C compiler infer the size of the array during compilation time?
It's what's called a variable length array, or VLA for short. The size is determined at runtime, but it's a one-off: you cannot resize it afterwards. Some compilers even warn you about the usage of such arrays because they are stored on the stack, which has a very limited size, so they can potentially cause a stack overflow.
From what I could see, the allocation for the variable 'array' is done during runtime, only after the user has informed the 'size' of the array. Is that correct?
Yes, that is correct. That's why these can be dangerous: the compiler doesn't know the size of the array at compile time, so if it's too large there is nothing it can do to avoid problems. For that reason C++ forbids VLAs.
let's say that I declare an integer array of size 10 because I don't know how many values the user will need to hold there, but I ended up only using 7, so I have a waste of 12 bytes
Unlike a fixed size array, a variable length array can have its size determined at runtime, but once its size is set you can no longer change it. If you are really set on having the exact size needed, and not one byte more, there is dynamic memory allocation (discussed below).
Anyway, if you are expecting an outside value to set the size of the array, odds are that it is the size you need. If not, there is nothing you can do aside from the mentioned dynamic memory allocation. In any case, it's better to have a little wasted space than too little space.
Can I have some help understanding that?
There are three concepts I find relevant to the discussion:
Fixed size arrays, i.e. int array[10]:
Their size is defined at compile time, they cannot be resized, and they are useful if you already know the size they should have.
Variable length arrays, i.e. int array[size], size being a non constant variable:
Their size is defined at runtime, but can only be set once. They are useful if the size of the array is dependent on external values, e.g. user input or some value retrieved from a file.
Dynamically allocated arrays, i.e. int *array = malloc(sizeof *array * size), size may or may not be a constant:
These are used when your array will need to be resized, or if it's too large to store on the stack, which has limited size. You can change its size at any point in your code using realloc, which may simply resize the array in place or, as @Peter reminded, may allocate a new array and copy the contents of the old one over.
Variables defined inside functions, like array in your snippet (main is a function like any other!), have "automatic" storage duration; typically, this translates to them being on the "stack", a universal concept for a first in/last out storage which gets built and unbuilt as functions are entered and exited.
The "stack" simply is an address which keeps track of the current edge of unused storage available for local variables of a function. The compiler emits code for moving it "forward" when a function is entered in order to accommodate the memory needs of local variables and to move it "backward" when the program flow leaves the function (the double quotes are there because the stack may as well grow towards smaller addresses).
Typically these stack adjustments upon entering into and returning from functions are computed at compile time; after all, the local variables are all visible in the program code. But in principle, nothing keeps a program from changing the stack pointer "on the fly". Very early on, Unixes made use of this and provided a function which dynamically allocates space on the stack, called alloca(). The FreeBSD man page says: "The alloca() function appeared in Version 32V AT&T UNIX" (which was released in 1979).
alloca behaves very much like malloc except that the storage is lost when the current function returns, and that it is subject to the usual stack size restrictions.
So the first part of the answer is that your array does not have static storage duration. The memory where local variables will reside is not known at compile time (for example, a function with local variables in it may or may not be called at all, depending on run-time user input!). If it were, your astonishment would be entirely justified.
The second part of the answer is that array is a variable length array, a fairly new feature of the C programming language which was only added in 1999. It declares an object on the stack whose size is not known until run time (leading to the anti-paradigmatic consequence that sizeof(array) is not a compile time constant!).
One could argue that variable length arrays are only syntactic sugar around an alloca call; but alloca is, although widely available, not part of any standard.

Is memory allocated when the variable is not used in c

#include<stdio.h>
int main()
{
int a,b;
float e;
char f;
printf("int &a = %u\n",&a);
printf("int &b = %u\n",&b);
printf("float &e = %u\n",&e);
printf("char &f = %u\n",&f);
}
The Output is
int &a = 2293324
int &b = 2293320
float &e = 2293316
char &f = 2293315
But when I use this code and remove the printf for the float:
#include<stdio.h>
int main()
{
int a,b;
float e;
char f;
printf("int &a = %u\n",&a);
printf("int &b = %u\n",&b);
printf("char &f = %u\n",&f);
}
Then the Output is
int &a = 2293324
int &b = 2293320
char &f = 2293319
Here no address seems to be assigned to the float, even though it is declared at the top.
My questions are
Is memory not allocated to variables not used in program?
Why addresses allocated in decreasing order. ex- it's going from 2293324 to 2293320?
1) Is memory not allocated to variables not used in program?
Yes that can happen, the compiler is allowed to optimize it out.
2) Why addresses allocated in decreasing order. ex- it's going from 2293324 to 2293320?
That is usual for most implementations of local storage: they use the CPU-supported stack pointer, which moves from the top of the stack region towards the bottom (that is, from higher to lower addresses). All those local variables will most probably be allocated on the stack.
1) Is memory not allocated to variables not used in program?
It's an allowed optimization; if an unused variable doesn't affect the program's observable behavior, a compiler may just discard it completely. Note that most modern compilers will warn you about unused variables (so you can either remove them from the code or do something with them).
2) Why addresses allocated in decreasing order. ex- it's going from 2293324 to 2293320?
The compiler is not required to allocate storage for separate objects in any particular order, so don't assume that your variables will be allocated in the order they were declared. Also, remember that on x86 and some other systems, the stack grows "downwards" towards decreasing addresses. Remember the top of any stack is simply the location where something was most recently pushed - it has nothing to do with relative address values.
While not specifically required by the standard, local variables are almost always located on the program stack.
When you enter a function, one of the first things done is to decrement the stack pointer to provide space for the local variables.
SUBL #SOMETHING, SP
Where SOMETHING is the amount of space required and SP is the stack pointer register. In your first example, SOMETHING is probably 13 (three 4-byte variables plus one char). Then the address of:
f is 0(SP)
e is 1(SP)
b is 5(SP)
a is 9(SP)
I am assuming your compiler did not align the stack pointer. Often compilers do, giving something more like:
f is 3(SP)
e is 4(SP)
b is 8(SP)
a is 12(SP)
And SOMETHING would be rounded up to 16 on a 32-bit system.
You might want to generate an assembly listing using your compiler to see what is going on underneath.
Is memory not allocated to variables not used in program?
Note that for a local variable, memory is not really allocated. The variable is temporarily bound to a location on the program stack (a stack is not required by the standard but is how it is done in most cases). That is why the variable's initial value is indeterminate: the location could have been bound to something else previously.
The compiler does not need to reserve space for variables that are not used. They can be optimized away. Usually, there are compiler settings to instruct not to do this for debugging.
Why addresses allocated in decreasing order. ex- it's going from 2293324 to 2293320?
Program stacks generally grow downward. Starting ye olde days, the program would be at the bottom of the address space, the heap above that and the stack at the opposite end.
The heap would grow towards higher addresses. The stack would grow towards the heap (lower addresses).
While the address spaces can be more complicated than that these days, the downward growth of stacks has stayed.
There is no particular requirement that the compiler map the variables to the stack in descending order, so it may or may not do it that way.

Buffer overflow - order of local variables on stack [duplicate]

I'm quite confused about how the local variables are ordered on the stack. I understand, that (on Intel x86) the local variables are stored from higher to lower address as they go in the code. So it's clear, that this code:
int i = 0;
char buffer[4];
strcpy(buffer, "aaaaaaaaaaaaaaa");
printf("%d", i);
produces something like this:
1633771873
The i variable was overwritten by the overflowed buffer.
However, if I swap the first two lines:
char buffer[4];
int i = 0;
strcpy(buffer, "aaaaaaaaaaaaaaa");
printf("%d", i);
the output is exactly the same.
How is it possible? The i's address is lower than the buffer's one and so an overflow of the buffer should overwrite other data, but not i. Or am I missing something?
There is no rule about the order of local variables, so the compiler is generally free to allocate them the way it likes. On the other hand, compilers use many strategies to reduce the likelihood of exactly what you are deliberately trying to do.
One such safety enhancement is to always allocate a buffer away from scalar variables, because an array can be addressed out of bounds and is more likely to clobber adjacent variables. Another trick is to add some empty trap space after arrays to isolate them from out-of-bounds accesses.
Anyway, you can use the debugger to look at the assembly and confirm how the variables are positioned.
If you want to look at how the local variables are allocated by the compiler, try compiling with gcc -S, which will output the assembly code. In the assembly code you can see how the compiler has chosen to order the variables.
One thing to keep in mind about how the compiler orders local variables is alignment. Each char only needs to align by 1 (which means that it can start at any byte of memory), whereas an int typically has to align by 4 (which means that it can only start at an address evenly divisible by 4). To avoid leaving empty bytes, the compiler has its own logic, which often groups variables of similar type together in a certain order. So even if you define them like this:
int a;
char c;
int b;
char d;
It is likely that the compiler has grouped together the ints and chars in memory, so the memory, going from low memory on top to high memory on bottom, might look something like:
low memory
| unused | unused | char d | char c |
|               int b               |
|               int a               |
high memory
Each cell between two | characters on the first line represents one byte, and an entire line represents 4 bytes.
Try messing around with the assembly code sometime; it is pretty interesting.

Are automatic local variables stored on the stack in C?

Okay I know that main()'s automatic local variables are stored in the stack and also any function automatic local variables too, but when I have tried the following code on gcc version 4.6.3:
#include <stdio.h>
int main(int argc, char *argv[]) {
int var1;
int var2;
int var3;
int var4;
printf("%p\n%p\n%p\n%p\n",(void *)&var1,(void *)&var2,(void *)&var3,(void *)&var4);
}
the results are :
0xbfca41e0
0xbfca41e4
0xbfca41e8
0xbfca41ec
According to the results, var4 is at the top of the stack and var1 at the bottom, with the stack pointer now pointing at the address below var1. But why is var4 at the top of the stack and var1 at the bottom? var4 is declared after var1, so I think that logically var1 should be at the top of the stack and any variable declared after var1 should be below it in memory, so in my example like this:
>>var1 at 0xbfca41ec
>>var2 at 0xbfca41e8
>>var3 at 0xbfca41e4
>>var4 at 0xbfca41e0
>>and stack pointer pointing here
..
..
EDIT 1:
After reading the comment by @AusCBloke I've tried the following code:
#include <stdio.h>
void fun(){
int var1;
int var2;
printf("inside the function\n");
printf("%p\n%p\n",(void *)&var1,(void *)&var2);
}
int main(int argc, char *argv[]) {
int var1;
int var2;
int var3;
int var4;
printf("inside the main\n");
printf("%p\n%p\n%p\n%p\n",(void *)&var1,(void *)&var2,(void *)&var3,(void *)&var4);
fun();
return 0;
}
And the results :
inside the main
0xbfe82d60
0xbfe82d64
0xbfe82d68
0xbfe82d6c
inside the function
0xbfe82d28
0xbfe82d2c
So the variables inside fun()'s stack frame are below the variables inside main()'s stack frame, which matches the nature of the stack, but within the same stack frame they are not necessarily ordered from top to bottom.
Thanks @AusCBloke, your comment helped me a lot.
There is no requirement for these variables to be allocated in the order in which they were declared. They can be moved around by the compiler, or even optimized out entirely. If you need the relative addresses to stay the same, use a struct.
Objects with automatic storage duration are typically stored on the stack, but the language standard doesn't require it. In fact, the standard (the link is to the latest pre-release C11 draft) doesn't even mention the word "stack".
The word "stack", unfortunately, is ambiguous.
In the most abstract sense, a stack is a data structure in which the most recently added items are removed first (last-in first-out, or LIFO). The requirements regarding the lifetime of objects with automatic storage duration (i.e., objects defined within a function with no static keyword) imply some kind of stack-like allocation.
The word "stack" is also commonly used to refer to a contiguous region of memory, typically controlled by a "stack pointer" pointing to the top-most element. The stack grows by moving the stack pointer away from the base, and shrinks by moving it toward the base. (It can grow in either direction, toward higher or lower memory addresses.) Most C compilers use this kind of contiguous stack to implement automatic objects -- but not all do. There have been C compilers for IBM mainframe systems which allocate storage for function calls from a heap-like structure, and the addresses for nested calls need not be uniformly in either increasing or decreasing order.
This is an unusual implementation, and there are very good reasons that this approach is not commonly used (a contiguous stack is simpler, more efficient, and is typically supported by the CPU). But the C standard is carefully written to avoid requiring a specific scheme, and C code that's carefully written to be portable will work correctly regardless of which method a compiler chooses. You don't need to know. All you really need to know about the address of var1 is that it's &var1. If you write if (&var1 < &var2) { ... }, then you're probably doing something wrong (that expression's behavior is undefined, BTW).
That's the standard C answer. I see that your question is tagged gcc. As far as I know, all versions of gcc use a contiguous stack. But even so, there's rarely any benefit in taking advantage of this.
On many (most) modern platforms the stack grows from higher addresses in memory to lower addresses. I.e., when you start your program, the stack pointer is immediately set to some address in memory, which is determined by the maximum stack size of your program. As things get pushed onto the stack, the stack pointer actually moves down.
I could be wrong, but stacks start at lower memory addresses and are then added to. So it is correct for var4 to be on top. It is a stack after all!
edit: the assembly code behind it has the stack pointer at the bottom of the memory stack, and whenever data is added the stack pointer is incremented so that the next variable falls on top.
I'm 99.9999% sure that the answer is Yes. Also, the stack grows downwards on Intel architecture machines, not upwards. The lower area becomes the virtual "top" of the stack (it's upside-down, so to speak).
So technically, the variables are in the correct order in stack memory.
EDIT: This is probably still compiler-specific, though.
