Which memory region is used by a function and its function parameters?
Also, in which region does the memory for an inline function get allocated?
If I am calling the inline function inside a normal function multiple times, will memory be allocated for the inline function multiple times?
Below is a sample program:
inline int add (int a, int b)
{
    return a + b;
}

int calculation(int c, int d)
{
    int ret;
    for (int i = 0; i < 3; i++) {
        ret = add(c, d);
        c++;
        d++;
    }
    return ret;
}
Where will the memory for a & b and c & d be allocated?
Memory regions aren't standardized, though de facto standards like ELF exist, which is a common format both for Unix-like systems and embedded systems.
Assuming an ELF-like system, the region where executable code is stored is called .text. It doesn't matter if a function is inlined or not, its machine code will end up in that segment.
A normal function stores its parameters either in registers or on the stack. This is system-specific and depends on the "ABI" (Application Binary Interface). When such a function gets inlined, it may not be necessary to copy the variables from the caller, in which case they remain in whatever register or region they were already allocated in.
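As a rough illustration of the split (a sketch of typical behaviour on an ELF/Linux-style system, not something the C standard guarantees; converting a function pointer to void * for printing is a common extension):

#include <stdio.h>

static int add(int a, int b)
{
    /* taking &a forces this parameter to get an addressable (stack) slot */
    printf("&a  (parameter, stack):    %p\n", (void *)&a);
    return a + b;
}

int main(void)
{
    int c = 2, d = 3;
    printf("add (machine code, .text): %p\n", (void *)&add);
    printf("&c  (local, stack):        %p\n", (void *)&c);
    return add(c, d) == 5 ? 0 : 1;
}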
As for what will happen in your specific code example: the function has no side effects and the intermediate results aren't stored, so only the last loop iteration is actually relevant. The loop runs 3 times, so by the last assignment c and d have each been incremented twice, and the various ++ operations just boil down to 2+2=4 added on top of c+d.
The generated machine code on an optimizing x86-64 compiler boils down to
lea eax, [rdi+4+rsi]
ret
In equivalent C code, that pretty much means your code was replaced with this:
int calculation(int c , int d)
{
return c + d + 4;
}
This is because the algorithm itself is nonsense, more so than the inlining. The compiler is perfectly able to inline this without the inline keyword and will do so with optimizations enabled.
Inlining is compiler-specific. However, you can think about it as a series of transformations applied to the code.
Starting from the original code:
ret = add(c, d);
First, the function arguments are turned into separate local variables:
int a = c;
int b = d;
ret = add(a, b);
Then the body of the function is inlined:
int a = c;
int b = d;
ret = a + b;
Then all kinds of other optimizations will take place. However, in the worst case (just the above code without any optimization), the variables a and b will sit on the stack right after the ret variable.
The main point is that there will not be many allocations, just one. The int a = ... and int b = ... lines may make it look like a and b are allocated on every loop iteration, but in reality they are allocated once, with the call to the function, as if they had been declared just after the int ret statement.
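To make that concrete, here is a sketch of what calculation() might look like after inlining but before any further optimization (purely illustrative, not the output of any particular compiler):

int calculation(int c, int d)
{
    int ret;
    int a, b;                /* storage reserved once per call, just like ret */
    for (int i = 0; i < 3; i++) {
        a = c;               /* the argument copies from above */
        b = d;
        ret = a + b;         /* body of add() substituted in place */
        c++;
        d++;
    }
    return ret;
}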
Related
In C, when you have a function that returns a pointer to one of its local (on-the-stack) variables, the calling function gets NULL returned instead. Why does that happen?
I can do this in C on my hardware:
#include <stdio.h>

void A() {
    int A = 5;
}

void B() {
    // B will be 5 even when uninitialised, due to B's stack frame reusing
    // the memory that A's frame occupied
    int B;
    printf("%d\n", B);
}

int main() {
    A();
    B();
}
This works because the stack frame memory doesn't get reset and B's frame overlays A's memory record on the stack.
However, I can't do this:
int* C() {
int C = 10;
return &C;
}
int main() {
// D will be null ?
int* D = C();
}
I know I shouldn't write this code: it's UB, it's different on different hardware, compilers could optimize it in ways that change the behaviour of the example, and the value will get clobbered when we next call another function anyway.
But I was wondering why specifically D is NULL when compiled with GCC, and why I get a segmentation fault if I try to access that memory address. Shouldn't the bits still be there?
Is it the compiler doing this?
GCC sees the undefined behaviour (UB) visible at compile time and decides to just return NULL on purpose. This is good: a noisy failure right away on first use of the value is easier to debug. Returning NULL was a new feature somewhere around GCC 5; as @P__J__'s answer shows on Godbolt, GCC 4.9 prints non-null stack addresses.
Other compilers may behave differently, but any decent compiler will warn about this error. See also What Every C Programmer Should Know About Undefined Behavior.
Or with optimization disabled, you could use a tmp variable to hide the UB from the compiler. Like int *p = &C; return p; because gcc -O0 doesn't optimize across statements. (Or with optimization enabled, make that pointer variable volatile to launder a value through it, hiding the source of the pointer value from the optimizer.)
#include <stdio.h>
int* C() {
int C = 10;
int *volatile p = &C; // volatile pointer to plain int
return p; // still UB, but hidden from the compiler
}
int main()
{
int* D = C();
printf("%p\n", (void *)D);
if (D){
printf("%#x\n", *D); // in theory should be passing an unsigned int for %x
}
}
Compiling and running on the Godbolt compiler explorer, with gcc10.1 -O3 for x86-64:
0x7ffcdbf188e4
0x7ffc
Interestingly, the dead store to int C was optimized away, although C does still have an address. Its address is taken, but the variable holding that address doesn't escape the function until int C goes out of scope, at the same moment the address is returned. Thus no well-defined accesses to the value 10 are possible, and it is valid for the compiler to make this optimization. Making int C volatile as well would preserve the store and give us the value.
The asm for C() is:
C:
lea rax, [rsp-12] # address in the red-zone, below RSP
mov QWORD PTR [rsp-8], rax # store to a volatile local var, also in the red zone
mov rax, QWORD PTR [rsp-8] # reload it as return value
ret
The version that actually runs is inlined into main and behaves similarly. It's loading some garbage value from the callstack that was left there, probably the top half of an address. (x86-64's 64-bit addresses only have 48 significant bits. The low half of the canonical range always has 16 leading zero bits).
But it's memory that wasn't written by main, so perhaps an address used by some function that ran before main.
// B will be 5 even when uninitialised due to the B stack frame using
// the old memory layout of A
int B;
Nothing about that is guaranteed. It's just luck that that happens to work out when optimization is disabled. With a normal level of optimization like -O2, reading an uninitialized variable might just read as 0 if the compiler can see that at compile time. Definitely no need for it to load from the stack.
And the other function would have had its dead store optimized away.
GCC also warns about use of uninitialized variables.
It is undefined behaviour (UB), but many modern compilers, when they detect that a function returns the address of an automatic storage variable, return NULL instead as a precaution (for example, newer versions of gcc).
example here:
https://godbolt.org/z/H-zU4C
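A minimal reproduction of the pattern (a sketch; recent gcc versions diagnose this with -Wreturn-local-addr, which is on by default):

int *dangling(void)
{
    int local = 10;
    return &local;   /* UB: local's lifetime ends when the function returns;
                        newer gcc warns here and substitutes NULL for the result */
}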
Consider the following code:
void foo(){
.....
}
int main()
{
int arr[3][3] ;
char string[10];
foo();
return 0;
}
How can the function foo access the locals of my main function without passing them to the function as function arguments? Does the function foo have enough privileges to access and modify the variables in main?
As per the C language specification, a function's local variables are inaccessible to other functions; there is no legal, supported way to do what you're asking. That said, in most (all?) implementations of C, the main function's variables will be stored on the stack, which is easy to locate and can be read and written by any code (it has to be, because every function needs to store local data in it), so it is technically possible, though it is a remarkably bad idea.
void foo(){
int b; // puts a 4 byte word on the stack atop the return address
(&b)[2]; // interpret b as the first entry in an array of integers (called the stack)
// and offset past b and the return address to get to a
// for completeness
(&b)[0]; // gets b
(&b)[1]; // gets the return address
}
int main()
{
int a; // puts a 4 byte word on the stack
foo(); // puts a (sometimes 4 byte) return address on the stack atop a
return 0;
}
This code might, on some systems (like 32-bit x86 systems), access the variables inside the main function, but it is very easily broken: if pointers on this system are 8 bytes, if there's padding on the stack, if stack canaries are in use, if there are multiple variables in each function and the compiler has its own ideas about what order they should be in, etc., this code won't work as expected. So don't use it; use parameters, because there's no reason not to, and they work.
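For completeness, a sketch of the supported way: give foo access to main's locals by passing them as arguments.

#include <stdio.h>

/* foo receives main's locals explicitly instead of digging around in the stack */
void foo(int arr[3][3], char string[10])
{
    arr[0][0] = 42;
    snprintf(string, 10, "hello");
}

int main()
{
    int arr[3][3];
    char string[10];
    foo(arr, string);
    printf("%d %s\n", arr[0][0], string);
    return 0;
}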
I am trying to verify my understanding of the stack memory layout in C by compiling the following code and inspecting the addresses in gdb. I only record the least significant digits; the higher ones are the same. The outputs are generated using the gdb command
print /u &a
Here is a simple test code:
#include <stdio.h>

void test(int a, int b)
{
int c = a;
int d = b;
printf("%d,%d\n",c,d);
}
int main()
{
int x = 1;
int y = 2;
test(x,y);
return 0;
}
If I look at the test function's frame, I have the following results:
&b: 6808
&a: 6812
&c: 6824
&d: 6828
$rbp: 6832 (frame pointer).
I am confused. Shouldn't function parameters sit at higher memory addresses than the local variables? Can someone explain this in detail please? Thanks.
edit:
If I print the addresses out like this:
printf("&a:%p,&b:%p\n",(&a),(&b));
printf("&c:%p,&d:%p\n",(&c),(&d));
I got
&a:0x7fff4737687c,&b:0x7fff47376878
&c:0x7fff47376888,&d:0x7fff4737688c
It turns out to be in b, a, c, d order. There is an 8-byte gap between the end of a and the beginning of c. I guess it must be the return address?
As per the flow of a function, first the arguments are allocated and then the local variables.
Your confusion is based on the presumption that the stack grows upwards (which is not necessarily the case).
Please follow the below link for more understanding:
Does stack grow upward or downward?
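If you want to see the direction on your own machine, a small sketch like this prints addresses from two nested stack frames (just for illustration; the exact layout is implementation-specific):

#include <stdio.h>

void callee(int *caller_local)
{
    int callee_local;
    printf("caller's local: %p\n", (void *)caller_local);
    printf("callee's local: %p\n", (void *)&callee_local);
    /* on a typical downward-growing stack, the callee's address is lower */
}

int main()
{
    int x = 0;
    callee(&x);
    return 0;
}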
Would it be accurate to say that in
void f() {
int x;
...
}
"int x;" means allocating sizeof(int) bytes on the stack?
Are there any specifications for that?
Nothing in the standard mandates that there is a stack. And nothing in the standard mandates that a local variable needs memory allocated for it. The variable could be placed in a register, or even removed altogether as an optimization.
There is no specification about that, and your assumption is often (but not always) false.
Consider some code like
void f() {
int x;
for (x=0; x<1000; x++)
{ // do something with x
}
// x is no more used here
}
First, an optimizing compiler would put x inside some register of the machine and not consume any stack location (unless e.g. you do something with the address &x like storing it in a global).
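For instance, storing &x in a global is enough to force x to have a real memory slot; compare the code generated for these two functions with something like gcc -O2 -S (a sketch; the global escape exists only for illustration):

int *volatile escape;             /* storing &x here makes the address observable */

void stays_in_register(void)
{
    int x;
    for (x = 0; x < 1000; x++) {
        /* nothing takes &x, so x can live entirely in a register (or vanish) */
    }
}

void needs_stack_slot(void)
{
    int x;
    escape = &x;                  /* &x escapes, so x must have an address */
    for (x = 0; x < 1000; x++) {
        /* same loop as above */
    }
}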
Also the compiler could unroll that loop, and remove x from the generated code. For example, many compilers would replace
for (x=0; x<5; x++) g(x);
with the equivalent of
g(0); g(1); g(2); g(3); g(4);
and perhaps replace
for (x=0; x<10000; x++) t[x]=x;
with something like
for (α = 0; α < 10000; α += 4)
{ t[α] = α; t[α+1] = α+1; t[α+2] = α+2; t[α+3] = α+3; };
where α is a fresh variable (or perhaps x itself).
Also, there might be no stack. For C it is uncommon, but some other languages did not have any stack (see e.g. A. Appel's old book Compiling with Continuations).
BTW, if using GCC you could inspect its intermediate (Gimple) representations with e.g. the MELT probe (or using gcc -fdump-tree-all which produces hundreds of dump files!).
From the GNU C Library manual:
3.2.1 Memory Allocation in C Programs
Automatic allocation happens when you declare an automatic variable,
such as a function argument or a local variable. The space for an
automatic variable is allocated when the compound statement containing
the declaration is entered, and is freed when that compound statement
is exited. In GNU C, the size of the automatic storage can be an
expression that varies. In other C implementations, it must be a
constant.
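The "size ... can be an expression that varies" part refers to variable-length arrays; a short sketch (C99 or GNU C):

void fill(int n)
{
    int buf[n];               /* automatic storage whose size is computed
                                 when this block is entered */
    for (int i = 0; i < n; i++)
        buf[i] = i;
}                             /* buf is freed when the block is exited */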
It depends on a lot of factors. The compiler can optimize and remove it from the stack, keeping the value in a register, etc.
If you compile in debug mode it almost certainly does allocate some space on the stack, but you never know; this is not specified. The only things specified are the visibility of the variable, its size, and the arithmetic on it. Look at the C99 spec for more information.
I think it depends on the compiler. I used the default compiler for Code::Blocks and Dev-C++, and it looks like memory is allocated during initialization. In the following cout statement, changing n2 to n1 will give the same answer. But if I initialize n1 to some value, or if I display n2 before I display the average, I will get a different answer, which is garbage.
Note that VS correctly handles this by giving an error, since the variables are not initialized.
#include <iostream>
using namespace std;

void getNums();
void getAverage();
int main()
{
getNums();
getAverage();
return 0;
}
void getNums()
{
int num1 = 4;
double total = 10;
}
void getAverage()
{
int counter;
double n1 , n2;
cout << n2/counter << endl;
}
Imagine I have the following simple C program:
int main() {
int a=5, b= 6, c;
c = a +b;
return 0;
}
Now, I would like to know the address of the expression c = a + b, that is, the program address where this addition is carried out. Is there any way I could do that with printf?
Something along the lines of:
int main() {
int a=5, b= 6, c;
printf("Address of printf instruction in memory: %x", current_address_pointer_or_something)
c = a +b;
return 0;
}
I know how I could find the address out by using gdb and then info line file.c:line. However, I should know if I could also do that directly with the printf.
In gcc, you can take the address of a label using the && operator. So you could do this:
#include <stdio.h>

int main()
{
int a=5, b= 6, c;
sum:
c = a+b;
printf("Address of sum label in memory: %p", &&sum);
return 0;
}
The result of &&sum is the target of the jump instruction that would be emitted if you did a goto sum. So, while it's true that there's no one-to-one address-to-line mapping in C/C++, you can still say "get me a pointer to this code."
Visual C++ has the _ReturnAddress intrinsic, which can be used to get some info here.
For instance:
#include <stdio.h>
#include <intrin.h>

__declspec(noinline) void PrintCurrentAddress()
{
    printf("%p", _ReturnAddress());
}
This will give you an address close to the expression you're looking at. In the event of some optimizations, like tail folding, this will not be reliable.
Tested in Visual Studio 2008:
int addr;
__asm
{
call _here
_here: pop eax
; eax now holds the PC.
mov [addr], eax
}
printf("%x\n", addr);
Credit to this question.
Here's a sketch of an alternative approach:
Assume that you haven't stripped debug symbols, and in particular you have the line number to address table that a source-level symbolic debugger needs in order to implement things like single step by source line, set a break point at a source line, and so forth.
Most tool chains use reasonably well documented debug data formats, and there are often helper libraries that implement most of the details.
Given that and some help from the preprocessor macro __LINE__ which evaluates to the current line number, it should be possible to write a function which looks up the address of any source line.
Advantages are that no assembly is required, portability can be achieved by calling on platform-specific debug information libraries, and it isn't necessary to directly manipulate the stack or use tricks that break the CPU pipeline.
A big disadvantage is that it will be slower than any approach based on directly reading the program counter.
For x86:
#include <stdio.h>

int test()
{
__asm {
mov eax, [esp]
}
}
__declspec(noinline) int main() // or whatever noinline feature your compiler has
{
int a = 5;
int aftertest;
aftertest = test()+3; // aftertest = disasms to 89 45 F8 mov dword ptr [a],eax.
printf("%i", a+9);
printf("%x", test());
return 0;
}
I don't know the details, but there should be a way to make a call to a function that can then crawl the return stack for the address of the caller, and then copy and print that out.
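With GCC or Clang you can do that with the __builtin_return_address builtin, which yields the address the current function will return to; a sketch:

#include <stdio.h>

__attribute__((noinline))      /* keep this a real call so there is a return address */
static void print_caller_address(void)
{
    printf("will return to %p\n", __builtin_return_address(0));
}

int main(void)
{
    int a = 5, b = 6, c;
    print_caller_address();    /* prints an address just past this call */
    c = a + b;
    return c == 11 ? 0 : 1;
}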
Using gcc on i386 or x86-64:
#include <stdio.h>
/* Takes the address of the local asm label "1" as an immediate.
   (On a PIE x86-64 build you may need "lea 1b(%%rip), %0" instead.) */
#define ADDRESS_HERE() ({ void *p; __asm__("1: mov $1b, %0" : "=r" (p)); p; })
int main(void) {
printf("%p\n", ADDRESS_HERE());
return 0;
}
Note that due to the presence of compiler optimizations, the apparent position of the expression might not correspond to its position in the original source.
The advantage of using this method over the &&foo label method is it doesn't change the control-flow graph of the function. It also doesn't break the return predictor unit like the approaches using call :)
On the other hand, it's very much architecture-dependent... and because it doesn't perturb the CFG there's no guarantee that jumping to the address in question would make any sense at all.
If the compiler is any good, this addition happens in registers and is never stored in memory, at least not in the way you are thinking. Actually, a good compiler will see that your program does nothing: manipulating values within a function but never sending them anywhere outside the function can result in no code at all being generated.
If you were to:
c = a+b;
printf("%u\n",c);
Then a good compiler will also never store that value c in memory; it will stay in registers, although this depends on the processor as well. If, for example, compilers for that processor use the stack to pass variables to functions, then the value for c will be computed using registers (a good compiler will see that c is always 11 and just assign it) and the value will be put on the stack while being passed to the printf function. Naturally the printf function may well need temporary storage in memory due to its complexity (it can't fit everything it needs to do in registers).
Where I am heading is that there is no answer to your question. It is heavily dependent on the processor, compiler, etc. There is no generic answer. I have to wonder what the root of the question is; if you were hoping to probe with a debugger, then this is not the question to ask.
Bottom line: disassemble your program and look at it. For that compile, on that day, with those settings, you will be able to see where the compiler has placed intermediate values. Even if the compiler assigns a memory location for the variable, that doesn't mean the program will ever store the variable in that location. It depends on optimizations.
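As a complete version of the printf suggestion above, here is a sketch you can disassemble to see the effect (with optimizations on, a good compiler simply passes the constant 11 to printf):

#include <stdio.h>

int main(void)
{
    int a = 5, b = 6, c;
    c = a + b;
    printf("%u\n", (unsigned)c);   /* forces c's value to exist somewhere */
    return 0;
}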