I've been doing some tests with pointers and came across the two following scenario. Can anybody explain to me what's happening?
void t ();
void wrong_t ();
void t () {
int i;
for (i=0;i<1;i++) {
int *p;
int a = 54;
p = &a;
printf("%d\n", *p);
}
}
void wrong_t() {
int i;
for (i=0;i<1;i++) {
int *p;
*p = 54;
printf("%d\n", *p);
}
}
Consider these two versions of main:
int main () {
t();
wrong_t();
}
prints:
54\n54\n, as expected
int main () {
wrong_t();
}
yields:
Segmentation fault: 11
I think that the issue arises from the fact that "int *p" in "wrong_t()" is a "bad pointer" as it's not correctly initialized (cfr.: cslibrary.stanford.edu/102/PointersAndMemory.pdf, page 8). But I don't understand why such problem arises just in some cases (e.g.: it does not happen if I call t() before wrong_t() or if I remove the for loop around the code in wrong_t()).
Because dereferencing an uninitialised pointer (as you correctly guessed) invokes undefined behaviour. Anything could happen.
If you want to understand the precise behaviour you're observing, then the only way is to look at the assembler code that your compiler produced. But this is normally not very productive.
What happens is almost certainly:
In both t and wrong_t, the definition int *p allocates space for p on the stack. When you call only wrong_t, this space contains data left over from previous activity (e.g., from the code that sets up the environment before main is called). It happens to be some value that is not valid as a pointer, so using it to access memory causes a segment fault.
When you call t, t initializes this space for p to contain a pointer to a. When you call wrong_t after this, wrong_t fails to initialize the space for p, but it already contains the pointer to a from when t executed, so using it to access memory results in accessing a.
This is obviously not behavior you may rely on. You may find that compiling with optimization turned on (e.g., -O3 with GCC) alters the behavior.
In wrong_t function, this statement *p = 54; is interesting. You are trying to store a value into a pointer p for which you haven't yet allocated the memory and hence, the error.
Related
Consider the following code.
#include<stdio.h>
int *abc(); // this function returns a pointer of type int
int main()
{
int *ptr;
ptr = abc();
printf("%d", *ptr);
return 0;
}
int *abc()
{
int i = 45500, *p;
p = &i;
return p;
}
Output:
45500
I know according to link this type of behavior is undefined. But why i am getting correct value everytime i run the program.
Every time you call abc it "marks" a region at the top of the stack as the place where it will write all of its local variables. It does that by moving the pointer that indicates where the top of stack is. That region is called the stack frame. When the function returns, it indicates that it does not want to use that region anymore by moving the stack pointer to where it was originally. As a result, if you call other functions afterwards, they will reuse that region of the stack for their own purposes. But in your case, you haven't called any other functions yet. So that region of the stack is left in the same state.
All the above explain the behavior of your code. It is not necessary that all C compilers implement functions that way and therefore you should not rely on that behavior.
Well, undefined behavior is, undefined. You can never rely on UB (or on an output of a program invoking UB).
Maybe, just maybe in your environment and for your code, the memory location allocated for the local variable is not reclaimed by the OS and still accessible, but there's no guarantee that it will have the same behavior for any other platform.
The following application works with both the commented out malloced int and when just using an int pointer to point to the local int 'a.' My question is if this is safe to do without malloc because I would think that int 'a' goes out of scope when function 'doit' returns, leaving int *p pointing at nothing. Is the program not seg faulting due to its simplicity or is this perfectly ok?
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
typedef struct ht {
void *data;
} ht_t;
ht_t * the_t;
void doit(int v)
{
int a = v;
//int *p = (int *) malloc (sizeof(int));
//*p = a;
int *p = &a;
the_t->data = (void *)p;
}
int main (int argc, char *argv[])
{
the_t = (ht_t *) malloc (sizeof(ht_t));
doit(8);
printf("%d\n", *(int*)the_t->data);
doit(4);
printf("%d\n", *(int*)the_t->data);
}
Yes, dereferencing a pointer to a local stack variable after the function is no longer in scope is undefined behavior. You just happen to be unlucky enough that the memory hasn't been overwritten, released back to the OS or turned into a function pointer to a demons-in-nose factory before you try to access it again.
Not every UB (undefined behaviour) results in a segfault.
Normally, the stack's memory won't be released back to the OS (but it might!) so that accessing that memory works. But at the next (bigger) function call, the memory where the pointer points to might be overwritten, so your data you felt like bing safe there is lost.
malloc() inside a function call will work because it stores on heap.
The pointer remains until you free the memory.
And yes, not every undefined behaviour will result in segmentation fault.
It does not matter that p does not point at anything on return from doit.
Because, you know, p isn't there any more either.
It does matter that you are reading the pointer to a no-longer-existing object in main though. That's UB, but harmless on most modern platforms.
Even worse though, you are reading the non-existent object it once pointed to, which is straight Undefined Behavior.
Still, nobody is obliged to catch you:
Can a local variable's memory be accessed outside its scope?
As an aside, Don't cast the result of malloc (and friends).
Also, be aware that not free-ing memory is not harmless on all platforms: Can I avoid releasing allocated memory in C with modern OSes?
Yes, the malloc is needed, otherwise the pointer would point on the 4-byte space of the stack, which would be used by other data, if you called other functions or created now local variables.
You can see that if you call that function afterwards with a value of 10:
void use_memory(int i)
{
int f[128]={};
if(i>0)
{
use_memory(i-1);
}
}
I would like to understand the difference between the following two C programs.
First program:
void main()
{
int *a;
{
int b = 10;
a=&b;
}
printf("%d\n", *a);
}
Second program:
void main()
{
int *a;
a = foo();
printf("%d\n", *a);
}
int* foo()
{
int b = 10;
return &b;
}
In both cases, the address of a local variable (b) is returned to and assigned to a. I know that the memory a is pointing should not be accessed when b goes out of scope. However, when compiling the above two programs, I receive the following warning for the second program only:
warning C4172: returning address of local variable or temporary
Why do I not get a similar warning for the first program?
As you already know that b goes out of scope in each instance, and accessing that memory is illegal, I am only dumping my thoughts on why only one case throws the warning and other doesn't.
In the second case, you're returning the address of a variable stored on Stack memory. Thus, the compiler detects the issue and warns you about it.
The first case, however skips the compiler checking because the compiler sees that a valid initialized address is assigned to a. The compilers depends in many cases on the intellect of the coder.
Similar examples for depicting your first case could be,
char temp[3] ;
strcpy( temp, "abc" ) ;
The compiler sees that the temp have a memory space but it depends on the coder intellect on how many chars, they are going to copy in that memory region.
your foo() function has undefined behavior since it returns a pointer to a part of stack memory that is not used anymore and that will be overwritten soon on next function call or something
it is called "b is gone out of scope".
Sure the memory still exists and probably have not changed so far but this is not guaranteed.
The same applies to your first code since also the scope of b ends with the closing bracket of the block there b is declared.
Edit:
you did not get the warning in first code because you did not return anything. The warning explicitly refers to return. And since the compiler may allocate the stack space of the complete function at once and including all sub-blocks it may guarantee that the value will not be overwritten. but nevertheless it is undefined behavior.
may be you get additional warnings if you use a higher warning level.
In the first code snippet even though you explicitly add brackets the stack space you are using is in the same region; there are no jumps or returns in the code so the code still uses consecutive memory addresses from the stack. Several things happen:
The compiler will not push additional variables on the stack even if you take out the code block.
You are only restricting the visibility of variable b to that code-block; which is more or less the same as if you would declare it at the beginning and only use it once in the exact same place, but without the { ... }
The value for b is most likely saved in a register which so there would be no problem to print it later - but this is speculative.
For the second code snippet, the function call means a jump and a return which means:
pushing the current stack pointer and the context on the stack
push the relevant values for the function call on the stack
jump to the function code
execute the function code
restore the stack pointer to it's value before the function call
Because the stack pointer has been restored, anything that is on the stack is not lost (yet) but any operations on the stack will be likely to override those values.
I think it is easy to see why you get the warning in only one case and what the expected behavior can be...
Maybe it is related with the implementation of a compiler. In the second program,the compiler can identify that return call is a warning because the program return a variable out of scope. I think it is easy to identify using information about ebp register. But in the first program our compiler needs to do more work for achieving it.
Your both programs invoke undefined behaviour. Statements grouped together within curly braces is called a block or a compound statement. Any variable defined in a block has scope in that block only. Once you go out of the block scope, that variable ceases to exist and it is illegal to access it.
int main(void) {
int *a;
{ // block scope starts
int b = 10; // b exists in this block only
a = &b;
} // block scope ends
// *a dereferences memory which is no longer in scope
// this invokes undefined behaviour
printf("%d\n", *a);
}
Likewise, the automatic variables defined in a function have function scope. Once the function returns, the variables which are allocated on the stack are no longer accessible. That explains the warning you get for your second program. If you want to return a variable from a function, then you should allocate it dynamically.
int main(void) {
int *a;
a = foo();
printf("%d\n", *a);
}
int *foo(void) {
int b = 10; // local variable
// returning the address of b which no longer exists
// after the function foo returns
return &b;
}
Also, the signature of main should be one of the following -
int main(void);
int main(int argc, char *argv[]);
In your first program-
The variable b is a block level variable and the visibility is inside that block
only.
But the lifetime of b is lifetime of the function so it lives upto the exit of main function.
Since the b is still allocated space, *a prints the value stored in b ,since a points b.
This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Can a local variable's memory be accessed outside its scope?
returning address of local variable
I have a question, first of all look at the code
#include <stdio.h>
int sum(); /* function declaration */
int main()
{
int *p2;
p2 = sum(); /* Calling function sum and coping its return type to pointer variable p2 */
printf("%d",*p2);
} /* END of main */ `
int sum()
{
int a = 10;
int *p = &a;
return p;
} /* END of sum */
I think the answer is 10 and address of variable a, but my tesacher says that a is local to the function come so a and its
value will be deleted from the memory location when the function returns or is finished executing. I tried this code and the answer is weel of course 10 and address of a, I use the GNU/GCC compiler. Can anyone say what is right and wrong.
Thanks in advance.
Your teacher is absolutely right: even if you fix your program to return int* in place of int, your program still contains undefined behavior. The issue is that the memory in which a used to be placed is ripe for reuse once sum returns. The memory may stay around untouched for you to access, so you might even print ten, but this behavior is still undefined: it may run on one platform and crash on ten others.
You may get the right result but that is just because you are lucky, by the time the sum() is returned, the memory of a is returned to the system, and it can be used by any other variables, so the value may be changed.
For example:
#include <stdio.h>
int* sum(); /* function declaration */
int* sum2(); /* function declaration */
int main()
{
int *p2;
p2 = sum(); /* Calling function sum and coping its return type to pointer variable p2 */
sum2();
printf("%d",*p2);
}
int* sum()
{
int a = 10;
int *p = &a;
return p;
} /* END of sum */
int* sum2()
{
int a = 100;
int *p = &a;
return p;
} /* END of sum */
With this code, the a will be reused by sum2() thus override the memory value with 100.
Here you just return a pointer to int, suppose you are returning an object:
TestClass* sum()
{
TestClass tc;
TestClass *p = &tc;
return p;
}
Then when you dereference tc, weird things would happen because the memory it points to might be totally screwed.
Your pointer still points to a place in memory where your 10 resides. However, from Cs point of view that memory is unallocated and could be reused. Placing more items on the stack, or allocating memory could cause that part of memory to be reused.
The object a in the function sum has automatic lifetime. It's lifetime will be ended as soon as the scope in which it has been declared in (the function body) is left by the program flow (the return statement at the end of the function).
After that, accessing the memory at which a lived will either do what you expect, summon dragons or set your computer on fire. This is called undefined behavior in C. The C standard itself, however, says nothing about deleting something from memory as it has no concept of memory.
Logically speaking, a no longer exists once sum exits; its lifetime is limited to the scope of the function. Physically speaking, the memory that a occupied is still there and still contains the bit pattern for the value 10, but that memory is now available for something else to use, and may be overwritten before you can use it in main. Your output may be 10, or it may be garbage.
Attempting to access the value of a variable outside that variable's lifetime leads to undefined behavior, meaning the compiler is free to handle the situation any way it wants to. It doesn't have to warn you that you're doing anything hinky, it doesn't have to work the way you expect it to, it doesn't have to work at all.
I wrote a program in C having dangling pointer.
#include<stdio.h>
int *func(void)
{
int num;
num = 100;
return #
}
int func1(void)
{
int x,y,z;
scanf("%d %d",&y,&z);
x=y+z;
return x;
}
int main(void)
{
int *a = func();
int b;
b = func1();
printf("%d\n",*a);
return 0;
}
I am getting the output as 100 even though the pointer is dangling.
I made a single change in the above function func1(). Instead of taking the value of y and z from standard input as in above program, now I am assigning the value during compile time.
I redefined the func1() as follows:
int func1(void)
{
int x,y,z;
y=100;
z=100;
x=y+z;
return x;
}
Now the output is 200.
Can somebody please explain me the reason for the above two outputs?
Undefined Behavior means anything can happen, including it'll do as you expect. Your stack variables weren't overwritten in this case.
void func3() {
int a=0, b=1, c=2;
}
If you include a call to func3() in between func1 and printf you'll get a different result.
EDIT: What actually happens on some platforms.
int *func(void)
{
int num;
num = 100;
return #
}
Let's assume, for simplicity, that the stack pointer is 10 before you call this function, and that the stack grows upwards.
When you call the function, the return address is pushed on stack (at position 10) and the stack pointer is incremented to 14 (yes, very simplified). The variable num is then created on stack at position 14, and the stack pointer is incremented to 18.
When you return, you return a pointer to address 14 - return address is popped from stack and the stack pointer is back to 10.
void func2() {
int y = 1;
}
Here, the same thing happens. Return address pushed at position, y created at position 14, you assign 1 to y (writes to address 14), you return and stack pointer's back to position 10.
Now, your old int * returned from func points to address 14, and the last modification made to that address was func2's local variable assignment. So, you have a dangling pointer (nothing above position 10 in stack is valid) that points to a left-over value from the call to func2
It's because of the way the memory gets allocated.
After calling func and returning a dangling pointer, the part of the stack where num was stored still has the value 100 (which is what you are seeing afterwards). We can reach that conclusion based on the observed behavior.
After the change, it looks like what happens is that the func1 call overwrites the memory location that a points to with the result of the addition inside func1 (the stack space previously used for func is reused now by func1), so that's why you see 200.
Of course, all of this is undefined behavior so while this might be a good philosophical question, answering it doesn't really buy you anything.
It's undefined behavior. It could work correctly on your computer right now, 20 minutes from now, might crash in an hour, etc. Once another object takes the same place on the stack as num, you will be doomed!
Dangling pointers (pointers to locations that have been disassociated) induce undefined behavior, i.e. anything can happen.
In particular, the memory locations get reused by chance* in func1. The result depends on the stack layout, compiler optimization, architecture, calling conventions and stack security mechanisms.
With dangling pointers, the result of a program is undefined. It depends on how the stack and the registers are used. With different compilers, different compiler versions and different optimization settings, you'll get a different behavior.
Returning a pointer to a local variable yields undefined behaviour, which means that anything the program does (anything at all) is valid. If you are getting the expected result, that's just dumb luck.
Please study functions from basic C. Your concept is flawed...main should be
int main(void)
{
int *a = func();
int b;
b = func1();
printf("%d\n%d",*a,func1());
return 0;
}
This will output 100 200