In C, when a function returns a pointer to one of its local (on the stack) variables, the calling function gets null returned instead. Why does that happen?
I can do this in C on my hardware
void A() {
int A = 5;
}
void B() {
// B will be 5 even when uninitialised due to the B stack frame using
// the old memory layout of A
int B;
printf("%d\n", B);
}
int main() {
A();
B();
}
This works because the stack frame memory doesn't get reset, so B overlays A's memory in the stack.
However I can't do
int* C() {
int C = 10;
return &C;
}
int main() {
// D will be null ?
int* D = C();
}
I know I shouldn't write code like this: it's UB, it behaves differently on different hardware, compilers could optimize it in ways that change the behaviour of the example, and the value will get clobbered when we next call another function in this example anyway.
But I was wondering why specifically D is null when compiled with GCC and why I get a segmentation fault if I try and access that memory address, shouldn't the bits still be there?
Is it the compiler doing this?
GCC sees the undefined behaviour (UB) visible at compile time and decides to just return NULL on purpose. This is good: noisy failure right away on first use of a value is easier to debug. Returning NULL was a new feature somewhere around GCC5; as @P__J__'s answer shows on Godbolt, GCC4.9 prints non-null stack addresses.
Other compilers may behave differently, but any decent compiler will warn about this error. See also What Every C Programmer Should Know About Undefined Behavior.
Or with optimization disabled, you could use a tmp variable to hide the UB from the compiler. Like int *p = &C; return p; because gcc -O0 doesn't optimize across statements. (Or with optimization enabled, make that pointer variable volatile to launder a value through it, hiding the source of the pointer value from the optimizer.)
#include <stdio.h>
int* C() {
int C = 10;
int *volatile p = &C; // volatile pointer to plain int
return p; // still UB, but hidden from the compiler
}
int main()
{
int* D = C();
printf("%p\n", (void *)D);
if (D){
printf("%#x\n", *D); // in theory should be passing an unsigned int for %x
}
}
Compiling and running on the Godbolt compiler explorer, with gcc10.1 -O3 for x86-64:
0x7ffcdbf188e4
0x7ffc
Interestingly, the dead store to int C was optimized away, although it does still have an address. It has its address taken, but the var holding the address doesn't escape the function until int C goes out of scope at the same time that address is returned. Thus no well-defined accesses to the 10 value are possible, and it is valid for the compiler to make this optimization. Making int C volatile as well would give us the value.
The asm for C() is:
C:
lea rax, [rsp-12] # address in the red-zone, below RSP
mov QWORD PTR [rsp-8], rax # store to a volatile local var, also in the red zone
mov rax, QWORD PTR [rsp-8] # reload it as return value
ret
The version that actually runs is inlined into main and behaves similarly. It's loading some garbage value from the callstack that was left there, probably the top half of an address. (x86-64's 64-bit addresses only have 48 significant bits. The low half of the canonical range always has 16 leading zero bits).
But it's memory that wasn't written by main, so perhaps an address used by some function that ran before main.
// B will be 5 even when uninitialised due to the B stack frame using
// the old memory layout of A
int B;
Nothing about that is guaranteed. It's just luck that that happens to work out when optimization is disabled. With a normal level of optimization like -O2, reading an uninitialized variable might just read as 0 if the compiler can see that at compile time. Definitely no need for it to load from the stack.
And the other function would have optimized away a dead store.
GCC also warns for use-uninitialized.
It is undefined behaviour (UB), but many modern compilers (for example newer versions of gcc), when they detect that a function returns a reference to an automatic storage variable, return NULL instead as a precaution.
example here:
https://godbolt.org/z/H-zU4C
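For contrast, here is a minimal sketch of two well-defined ways to get the value out of the function: let the caller own the storage, or allocate it on the heap. The names get_value and make_value are made up for this illustration and do not come from the question.

```c
#include <stdlib.h>

/* Caller owns the storage: the function only fills it in. */
void get_value(int *out)
{
    *out = 10;
}

/* Heap storage outlives the call; the caller must free() it.
   Returns NULL if the allocation fails. */
int *make_value(void)
{
    int *p = malloc(sizeof *p);
    if (p != NULL) {
        *p = 10;
    }
    return p;
}
```

Either way the pointer the caller ends up with refers to an object that is still alive, so there is nothing for the compiler to null out.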
Related
Which memory region is used by a function and its parameters?
Also, in which region does memory for an inline function get allocated?
If I call the inline function inside a normal function multiple times, will memory be allocated for the inline function multiple times?
Below is a sample program:
inline int add (int a, int b)
{
return a + b;
}
int calculation(int c , int d)
{
int ret;
for (int i=0; i < 3; i++) {
ret = add(c, d);
c++;
d++;
}
return ret;
}
Where will the memory for a & b and c & d be allocated?
Memory regions aren't standardized, though de facto standards like ELF exist, which is a common format both for Unix-like systems and embedded systems.
Assuming an ELF-like system, the region where executable code is stored is called .text. It doesn't matter if a function is inlined or not, its machine code will end up in that segment.
A normal function stores its parameters either in registers or on the stack. This is system-specific and depends on the "ABI" (Application Binary Interface). When such a function gets inlined, it may not be necessary to copy the variables from the caller, in which case they remain in whatever register or region they were already allocated in.
As for what will happen in your specific code example: the function has no side effects and intermediate results aren't stored, so only the last lap of the for loop is actually relevant. The loop executes 3 times, and by the last call c and d have each been incremented twice, so the various ++ operations just boil down to 2+2=4.
The generated machine code on an optimizing x86 compiler boils down to
lea eax, [rdi+4+rsi]
ret
Which in the equivalent C code pretty much means that your code was replaced with this:
int calculation(int c , int d)
{
return c + d + 4;
}
This is because the algorithm itself is nonsense, more so than the inlining. The compiler is perfectly able to inline this without the inline keyword and will do so with optimizations enabled.
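As a quick sanity check of that reduction, the loop and the folded form can be put side by side. This is only a sketch; calculation_loop and calculation_folded are names invented here to keep the two versions apart.

```c
/* The helper from the question. */
static int add(int a, int b)
{
    return a + b;
}

/* The original loop: only the third call to add() survives in ret. */
int calculation_loop(int c, int d)
{
    int ret = 0;
    for (int i = 0; i < 3; i++) {
        ret = add(c, d);
        c++;
        d++;
    }
    return ret;
}

/* What the optimizer reduces it to. */
int calculation_folded(int c, int d)
{
    return c + d + 4;
}
```

For small inputs (no signed overflow) both functions return the same value, for example 7 for the arguments 1 and 2.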
Inlining is compiler specific. However you can think about it as a series of transformations to the code.
Starting from the original code:
ret = add(c, d);
First the function arguments are exported:
int a = c;
int b = d;
ret = add(a, b);
Then the body of the function is inlined:
int a = c;
int b = d;
ret = a + b;
Then all kinds of other optimizations will take place. However, in the worst case (just the above code without any optimization), the variables a and b will be on the stack after the ret variable.
The main point is that there will not be many allocations, just one. The int a = ... and int b = ... may look as if they are allocated on every loop iteration, but in reality they are allocated once with the call to the function, as if they were declared just after the int ret statement.
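Applied to the whole function from the question, the hand-inlined result might look like the sketch below. The name calculation_inlined is hypothetical, and where a and b actually end up (stack slot or register) is still the compiler's decision.

```c
/* calculation() with add(c, d) expanded by hand. */
int calculation_inlined(int c, int d)
{
    int ret = 0;
    for (int i = 0; i < 3; i++) {
        /* a and b are the former parameters of add() */
        int a = c;
        int b = d;
        ret = a + b;
        c++;
        d++;
    }
    return ret;
}
```

Although a and b are declared inside the loop body, an unoptimized build typically reserves their slots once in the function's frame rather than once per iteration.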
this is the code :
#include <stdio.h>
#include <stdlib.h>
int main() {
int a = 10;
int b = 20;
//printf("\n&a value %p", &a);
int* x = &b;
x = x + 1;
*x = 5;
printf("\nb value %d", b);
printf("\na value %d", a);
}
I want to overwrite a through b's address to test overflow in C, but when I comment out line 5 (the printf call) I can't write five into a. If I print a's address, I can write five into a.
Why?
Sorry for my English, and thank you.
The reason this occurred is that all normal compilers store objects with automatic storage duration (objects declared inside a block that are not static or extern) on a stack. Your compiler “pushed” a onto the stack, which means it wrote a to the memory location where the stack pointer was pointing and then decremented the pointer. (Decrementing the pointer adds to the stack, because the stack grows in the direction of decreasing memory addresses. Stacks can be oriented in the other direction, but the behavior you observed strongly suggests your system uses the common direction of growing downward.) Then your compiler pushed b onto the stack. So b ended up at a memory address just below a.
When you took the address of b and added one, that produced the memory address where a is. When you used that address to assign 5, that value was written to where a is.
None of this behavior is defined by the C standard. It is a consequence of the particular compiler you used and the switches you compiled with.
You probably compiled with little or no optimization. With optimization turned on, many compilers would simplify the code by removing unnecessary steps (essentially replacing them with shortcuts), so that 20 and 10 are not actually stored on the stack. A possible result with optimization is that “20” and “10” are printed, and your assignment to *x has no effect. However, the C standard does not say what the behavior must be when you use *x in this way, so the results are determined only by the particular compiler you are using, along with the input switches you give it.
After x = x + 1;, x contains an address that you do not own. And by doing *x = 5; you are trying to write to some location that might not be accessible to you. Thus causing UB. Nothing more can be reasoned about.
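If the goal is just to watch a neighbouring object being overwritten through pointer arithmetic, an array gives a layout the C standard does define. A minimal sketch, with overwrite_neighbour as a made-up name:

```c
/* Writes through a pointer derived from the first element and
   returns the value of the second element afterwards. */
int overwrite_neighbour(void)
{
    int v[2] = { 20, 10 };  /* v[0] plays the role of b, v[1] of a */
    int *x = &v[0];
    x = x + 1;              /* still inside the same array: valid */
    *x = 5;
    return v[1];
}
```

Here x + 1 points at v[1] by definition, so the write is well defined and the result does not depend on the compiler or its switches.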
Suppose I need to write to address zero (e.g. I've mmapped something there and want to access it, for whatever reason including curiosity), and the address is known at compile time. Here are some variants I could think of to obtain the pointer; one of these works and the other three don't:
#include <stdint.h>
void testNullPointer()
{
// Obviously UB
unsigned* p=0;
*p=0;
}
void testAddressZero()
{
// doesn't work for zero, GCC detects it as NULL
uintptr_t x=0;
unsigned* p=(unsigned*)x;
*p=0;
}
void testTrickyAddressZero()
{
// works, but the resulting assembly is not as terse as it could be
unsigned* p;
asm("xor %0,%0\n":"=r"(p));
*p=0;
}
void testVolatileAddressZero()
{
// p is updated, but the code doesn't actually work
unsigned*volatile p=0;
*p=0; // because this doesn't dereference p! // EDIT: pointee should also be volatile, then this will work
}
I compile this with
gcc test.c -masm=intel -O3 -c -o test.o
and then objdump -d test.o -M intel --no-show-raw-insn gives me (alignment bytes are skipped here):
00000000 <testNullPointer>:
0: mov DWORD PTR ds:0x0,0x0
a: ud2a
00000010 <testAddressZero>:
10: mov DWORD PTR ds:0x0,0x0
1a: ud2a
00000020 <testTrickyAddressZero>:
20: xor eax,eax
22: mov DWORD PTR [eax],0x0
28: ret
00000030 <testVolatileAddressZero>:
30: sub esp,0x10
33: mov DWORD PTR [esp+0xc],0x0
3b: mov eax,DWORD PTR [esp+0xc]
3f: add esp,0x10
42: ret
Here the testNullPointer obviously has UB since it dereferences what is null pointer by definition.
The approach in testAddressZero would give the expected code for any address other than 0, e.g. 1, but for zero GCC appears to detect that address zero corresponds to the null pointer, so it also generates UD2.
The asm way of getting the zero address certainly inhibits the compiler's checks, but the price is that one has to write different assembly code for each architecture, even where the approach of testAddressZero would otherwise have worked (i.e. the same flat memory model on each arch) were it not for UD2 and similar traps. Also, the code is not as terse as in the above two variants.
The volatile pointer would seem to be the best way, but the generated code does not appear to dereference the address for some reason, so it's also broken.
The question now: if I'm targeting GCC, how can I seamlessly access zero address without any traps or other consequences of UB, and without the need to write in assembly?
As a workaround you can use the GCC option -fno-delete-null-pointer-checks, which stops the compiler from assuming that a dereferenced pointer cannot be null.
The optimization this option turns off is normally there to speed up code, but disabling it is appropriate in specific cases such as this one.
I would put the pointer into a global variable:
const uintptr_t zero = 0;
unsigned* zeroAddress= (unsigned *)zero;
void testZeroAddressPointer()
{
*zeroAddress=0;
}
Provided you expose the address beyond the scope of optimization (so the compiler can't figure out you don't set it somewhere else), that should do the trick, albeit slightly less efficiently.
Edit: make this code independent of implicit zero to null conversion.
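A related sketch, combining the idea of keeping the address out of the optimizer's sight with a volatile pointee: pass the address as a run-time number to a small helper. The name poke32 is hypothetical, and whether a call with address 0 behaves as hoped still depends on the platform and on flags such as -fno-delete-null-pointer-checks; this is not a guarantee.

```c
#include <stdint.h>

/* Hypothetical helper: store value at a numeric address.
   The volatile-qualified pointee forces the store to be emitted. */
void poke32(uintptr_t addr, uint32_t value)
{
    volatile uint32_t *p = (volatile uint32_t *)addr;
    *p = value;
}
```

With an ordinary object, poke32((uintptr_t)&cell, 7) simply stores 7 into cell, which is an easy way to check the helper before pointing it at a mapped page.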
The 0 address is the C99 NULL pointer (actually the "implementation" of the null pointer, which you can often write as 0....) on all the architectures I know about.
The null pointer has a very specific status in hosted C99: when a pointer can be (or was) dereferenced, it is guaranteed (by the language specification) to not be NULL (otherwise, it is undefined behavior).
Hence, the GCC compiler has the right to optimize (and actually will optimize)
int *p = something();
int x = *p;
/// the compiler is permitted to skip the following
/// because p has been dereferenced so cannot be NULL
if (p == NULL) { doit(); return; };
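For comparison, a minimal sketch of the order that keeps the check meaningful: test the pointer before the first dereference. The function name read_or_default is invented for this example.

```c
#include <stddef.h>

/* The null test comes before any dereference, so the compiler
   has no grounds to remove it. */
int read_or_default(const int *p, int fallback)
{
    if (p == NULL) {
        return fallback;
    }
    return *p;
}
```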
In your case, you might want to compile for the freestanding subset of the C99 standard. So compile with gcc -ffreestanding (beware, this option can bring some infelicities).
BTW, you might declare some extern char strange[] __attribute__((weak)); (perhaps even add asm("0") ...) and have some assembler or linker trick to make that strange have a 0 address. The compiler would not know that such a strange symbol is in fact at the 0 address...
My strong suggestion is to avoid dereferencing the 0 address.... If you really need to dereference the address 0, be prepared to suffer.... (so code some asm, lower the optimization, etc...).
(If you have mmap-ed the first page, just avoid using its first byte at address 0; that is often not a big deal.)
(IIRC, you are touching a grey area of GCC optimizations - and perhaps even of the C99 language specification, and you certainly want the free standing flavor of C; notice that -O3 optimization for free standing C is not well tested in the GCC compiler and might have residual bugs....)
You could consider changing the GCC compiler so that the null pointer has the numerical address 42. That would take some work.
I wrote a program in C having dangling pointer.
#include<stdio.h>
int *func(void)
{
int num;
num = 100;
return &num;
}
int func1(void)
{
int x,y,z;
scanf("%d %d",&y,&z);
x=y+z;
return x;
}
int main(void)
{
int *a = func();
int b;
b = func1();
printf("%d\n",*a);
return 0;
}
I am getting the output as 100 even though the pointer is dangling.
I made a single change in the above function func1(). Instead of taking the value of y and z from standard input as in above program, now I am assigning the value during compile time.
I redefined the func1() as follows:
int func1(void)
{
int x,y,z;
y=100;
z=100;
x=y+z;
return x;
}
Now the output is 200.
Can somebody please explain to me the reason for the above two outputs?
Undefined Behavior means anything can happen, including it'll do as you expect. Your stack variables weren't overwritten in this case.
void func3() {
int a=0, b=1, c=2;
}
If you include a call to func3() in between func1 and printf you'll get a different result.
EDIT: What actually happens on some platforms.
int *func(void)
{
int num;
num = 100;
return &num;
}
Let's assume, for simplicity, that the stack pointer is 10 before you call this function, and that the stack grows upwards.
When you call the function, the return address is pushed on stack (at position 10) and the stack pointer is incremented to 14 (yes, very simplified). The variable num is then created on stack at position 14, and the stack pointer is incremented to 18.
When you return, you return a pointer to address 14 - return address is popped from stack and the stack pointer is back to 10.
void func2() {
int y = 1;
}
Here, the same thing happens. The return address is pushed at position 10, y is created at position 14, you assign 1 to y (which writes to address 14), you return, and the stack pointer is back at position 10.
Now, your old int * returned from func points to address 14, and the last modification made to that address was func2's local variable assignment. So, you have a dangling pointer (nothing above position 10 in stack is valid) that points to a left-over value from the call to func2
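The walk-through above can be modelled in well-defined C by treating an array as the stack and an index as the stack pointer. This is only a toy model with invented names (stack_mem, sp, model_func, model_func2, dangling_demo); slots are counted in ints rather than bytes, and a real call stack is managed by the compiler, not by code like this.

```c
#include <stddef.h>

#define SLOTS 8
static int stack_mem[SLOTS];
static size_t sp;               /* index of the next free slot; grows upward */

/* Models func(): push a return-address slot, then num = 100.
   Returns the index of num, i.e. the "dangling pointer". */
static size_t model_func(void)
{
    size_t saved = sp;
    stack_mem[sp++] = -1;       /* stand-in for the return address */
    size_t num = sp++;
    stack_mem[num] = 100;
    sp = saved;                 /* return: frame popped, memory not cleared */
    return num;
}

/* Models func2(): same frame shape, so y = 1 lands in the same slot. */
static void model_func2(void)
{
    size_t saved = sp;
    stack_mem[sp++] = -1;
    size_t y = sp++;
    stack_mem[y] = 1;
    sp = saved;
}

/* Reads the stale slot before and after the second call. */
void dangling_demo(int *before, int *after)
{
    sp = 0;
    size_t a = model_func();
    *before = stack_mem[a];
    model_func2();
    *after = stack_mem[a];
}
```

In this model the stale slot still reads 100 right after model_func returns and reads 1 once model_func2 has reused it, which mirrors the left-over value described above.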
It's because of the way the memory gets allocated.
After calling func and returning a dangling pointer, the part of the stack where num was stored still has the value 100 (which is what you are seeing afterwards). We can reach that conclusion based on the observed behavior.
After the change, it looks like what happens is that the func1 call overwrites the memory location that a points to with the result of the addition inside func1 (the stack space previously used for func is reused now by func1), so that's why you see 200.
Of course, all of this is undefined behavior so while this might be a good philosophical question, answering it doesn't really buy you anything.
It's undefined behavior. It could work correctly on your computer right now, 20 minutes from now, might crash in an hour, etc. Once another object takes the same place on the stack as num, you will be doomed!
Dangling pointers (pointers to locations that have been disassociated) induce undefined behavior, i.e. anything can happen.
In particular, the memory locations get reused by chance* in func1. The result depends on the stack layout, compiler optimization, architecture, calling conventions and stack security mechanisms.
With dangling pointers, the result of a program is undefined. It depends on how the stack and the registers are used. With different compilers, different compiler versions and different optimization settings, you'll get a different behavior.
Returning a pointer to a local variable yields undefined behaviour, which means that anything the program does (anything at all) is valid. If you are getting the expected result, that's just dumb luck.
Please study functions from basic C. Your concept is flawed...main should be
int main(void)
{
int *a = func();
int b;
b = func1();
printf("%d\n%d",*a,func1());
return 0;
}
This will output 100 200
Imagine I have the following simple C program:
int main() {
int a=5, b= 6, c;
c = a +b;
return 0;
}
Now, I would like to know the address of the expression c=a+b, that is the program address
where this addition is carried out. Is there any possibility that I could use printf?
Something along the line:
int main() {
int a=5, b= 6, c;
printf("Address of printf instruction in memory: %x", current_address_pointer_or_something)
c = a +b;
return 0;
}
I know how to find the address using gdb and then info line file.c:line. However, I would like to know if I could also do that directly with printf.
In gcc, you can take the address of a label using the && operator. So you could do this:
int main()
{
int a=5, b= 6, c;
sum:
c = a+b;
printf("Address of sum label in memory: %p", &&sum);
return 0;
}
The result of &&sum is the target of the jump instruction that would be emitted if you did a goto sum. So, while it's true that there's no one-to-one address-to-line mapping in C/C++, you can still say "get me a pointer to this code."
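The same && operator is what GCC's computed goto is built on. A small sketch, assuming GCC or Clang (this is an extension, not standard C); dispatch and its labels are names made up for the illustration.

```c
/* GCC/Clang extension: labels as values and computed goto. */
int dispatch(int which)
{
    /* Table of code addresses taken with the && operator. */
    static void *const targets[] = { &&zero, &&one };

    if (which < 0 || which > 1) {
        return -1;
    }
    goto *targets[which];       /* jump to the stored label address */

zero:
    return 10;
one:
    return 20;
}
```

The values stored in targets are the same kind of address that the printf above prints.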
Visual C++ has the _ReturnAddress intrinsic, which can be used to get some info here.
For instance:
#include <intrin.h>
#include <stdio.h>
__declspec(noinline) void PrintCurrentAddress()
{
printf("%p", _ReturnAddress()); // address this call will return to
}
Which will give you an address close to the expression you're looking at. In the event of some optimizations, like tail folding, this will not be reliable.
Tested in Visual Studio 2008:
int addr;
__asm
{
call _here
_here: pop eax
; eax now holds the PC.
mov [addr], eax
}
printf("%x\n", addr);
Credit to this question.
Here's a sketch of an alternative approach:
Assume that you haven't stripped debug symbols, and in particular you have the line number to address table that a source-level symbolic debugger needs in order to implement things like single step by source line, set a break point at a source line, and so forth.
Most tool chains use reasonably well documented debug data formats, and there are often helper libraries that implement most of the details.
Given that and some help from the preprocessor macro __LINE__ which evaluates to the current line number, it should be possible to write a function which looks up the address of any source line.
Advantages are that no assembly is required, portability can be achieved by calling on platform-specific debug information libraries, and it isn't necessary to directly manipulate the stack or use tricks that break the CPU pipeline.
A big disadvantage is that it will be slower than any approach based on directly reading the program counter.
For x86:
int test()
{
__asm {
mov eax, [esp]
}
}
__declspec(noinline) int main() // or whatever noinline feature your compiler has
{
int a = 5;
int aftertest;
aftertest = test()+3; // aftertest = disasms to 89 45 F8 mov dword ptr [a],eax.
printf("%i", a+9);
printf("%x", test());
return 0;
}
I don't know the details, but there should be a way to make a call to a function that can then crawl the return stack for the address of the caller, and then copy and print that out.
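With GCC or Clang, one such way is the __builtin_return_address builtin. A minimal sketch, assuming one of those compilers; the function name where_was_i_called is invented here.

```c
#include <stddef.h>

/* GCC/Clang builtin: level 0 is the return address of this function,
   i.e. an address inside the caller just after the call.
   noinline keeps the call (and so the return address) real. */
__attribute__((noinline)) void *where_was_i_called(void)
{
    return __builtin_return_address(0);
}
```

A caller can then print the result with printf("%p\n", where_was_i_called()); the address is close to, but not exactly at, the expression of interest.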
Using gcc on i386 or x86-64:
#include <stdio.h>
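#if defined(__x86_64__)
/* x86-64: take the label's address RIP-relative (mov 1b, %0 would load the bytes at the label instead) */
#define ADDRESS_HERE() ({ void *p; __asm__("1: lea 1b(%%rip), %0" : "=r" (p)); p; })
#else
/* i386: load the label's address as an immediate */
#define ADDRESS_HERE() ({ void *p; __asm__("1: mov $1b, %0" : "=r" (p)); p; })
#endif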
int main(void) {
printf("%p\n", ADDRESS_HERE());
return 0;
}
Note that due to the presence of compiler optimizations, the apparent position of the expression might not correspond to its position in the original source.
The advantage of using this method over the &&foo label method is it doesn't change the control-flow graph of the function. It also doesn't break the return predictor unit like the approaches using call :)
On the other hand, it's very much architecture-dependent... and because it doesn't perturb the CFG there's no guarantee that jumping to the address in question would make any sense at all.
If the compiler is any good, this addition happens in registers and is never stored in memory, at least not in the way you are thinking. Actually, a good compiler will see that your program does nothing: manipulating values within a function but never sending those values anywhere outside the function can result in no code.
If you were to:
c = a+b;
printf("%u\n",c);
Then a good compiler will also never store the value of c in memory; it will stay in registers, although it depends on the processor as well. If, for example, compilers for that processor use the stack to pass variables to functions, then the value for c will be computed using registers (a good compiler will see that c is always 11 and just assign it) and the value will be put on the stack while being passed to the printf function. Naturally the printf function may well need temporary storage in memory due to its complexity (it can't fit everything it needs to do in registers).
Where I am heading is that there is no answer to your question. It is heavily dependent on the processor, compiler, etc. There is no generic answer. I have to wonder what the root of the question is, if you were hoping to probe with a debugger, then this is not the question to ask.
Bottom line: disassemble your program and look at it. For that compile on that day with those settings, you will be able to see where the compiler has placed intermediate values. Even if the compiler assigns a memory location for the variable, that doesn't mean the program will ever store the variable in that location. It depends on optimizations.
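If the aim is to have something to look at in the disassembly, one option is to make the variables volatile so the compiler has to keep the loads and the store. This is a sketch only; add_in_memory is a made-up name, and where the objects live is still the compiler's choice.

```c
/* volatile makes every read and write of these objects an observable
   access, so the compiler may not fold the addition away. */
int add_in_memory(void)
{
    volatile int a = 5, b = 6;
    volatile int c;
    c = a + b;                  /* loads of a and b, then a store to c */
    return c;
}
```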