#include <stdio.h>
int main()
{
int *ptr;
{
int x = 2;
ptr = &x;
}
printf("%x %d", ptr, *ptr);
return 0;
}
Output: address of x, value of x.
Here, ptr should be a dangling pointer, right? Yet, it still stores the address of x. How is it still pointing the value of x, even after that block is deleted?
#include <stdio.h>
int * func (int n)
{
int temp;
int *ptr = &temp;
temp = n * n;
return ptr;
}
int main()
{
int n = 4;
int *p = func(4);
printf("%x, %d", p, *p);
return 0;
}
Output: address of temp, 16
In this program, the data variable temp and its pointer variable ptr is created in separate function. Why does it produce a correct result?
#include <stdio.h>
int * func (int n)
{
int temp;
int *ptr = &temp;
temp = n * n;
for (int i = 0; i < 10; i++)
printf("%d ", *ptr);
return ptr;
}
int main()
{
int n = 4;
int *p = func(4);
printf("\n%x, %d", p, *p);
for (int i = 0; i < 10; i++)
printf("%d ", *ptr);
*p = 12;
printf("%d\n", *p);
printf("%d\n", *p);
return 0;
}
Output: 16 16 16 16 16 16 16 16 16 16
address of temp, 1
16 16 16 16 16 16 16 16 16 16
12
12
The above program is similar to the second one aside from the for loop. In
the main() function, it gives the correct output every time. Even if I tried to change it to *p = 10, it would still give the correct output no matter how many times I print it.
But in the second program, it only gives the correct output once because of undefined behavior. It gives garbage values after the first printf.
But in third program, how does it still give the correct output every time?
My questions are:
The pointer variable points to a local variable which goes out of scope, but still prints the correct output and is accessible through the pointer variable by changing it's value. Why is it?
Like the temp created in increment(), ptr is also created locally. Why is it printing the values correctly all of the time without any warning or error? If the for loop is not there, it also gives an error after printing once. Why is that so?
When I passed temp I got a warning and segmentation fault error. But why is ptr, a local variable, printing the values correctly?
In the first program, after printing *ptr many times, it gives a correct output, and I was able to change *ptr = 1; after the first printf. Why can I access ptr even though the variable went out of scope?
Thank you everyone for answering. I underatand now from all your answers. Thank you very much.
Both of your programs behaviour is undefined.
In first code, your program is accessing x, via its address, outside of block in which it was declared. x is a local(automatic) non-static variable and its lifetime is limited to its scope1) i.e. the block in which it has been declared. Any attempt to access it outside of its lifetime will result in undefined behaviour2). Same is the case with temp variable in second code.
An undefined behaviour includes it may execute incorrectly (either crashing or silently generating incorrect results), or it may fortuitously do exactly what the programmer intended.
Also, the correct format specifier for printing a pointer is %p.
1). From C11 Standard#6.2.1p4 [emphasis mine]
Every other identifier has scope determined by the placement of its declaration (in a declarator or type specifier). If the declarator or type specifier that declares the identifier appears outside of any block or list of parameters, the identifier has file scope, which terminates at the end of the translation unit. If the declarator or type specifier that declares the identifier appears inside a block or within the list of parameter declarations in a function definition, the identifier has block scope, which terminates at the end of the associated block. ......
2). From C11 Standard#6.2.4p2 [emphasis mine]
2 The lifetime of an object is the portion of program execution during which storage is guaranteed to be reserved for it. An object exists, has a constant address,33) and retains its last-stored value throughout its lifetime.34) If an object is referred to outside of its lifetime, the behavior is undefined. The value of a pointer becomes indeterminate when the object it points to (or just past) reaches the end of its lifetime.
I have disassembled your third program by IDA.
The func() function is compiled as a part of the main() function, not compiled as an independent function.
So, the correct values are remained.
I guess this is the optimization result during compiling.
But, When I add one line to func(), the result of program is different.
In this case, the compiler recognized the 'func()' as a function.
The expected result is occurred and the program is crashed at '*p = 12'.
The 'x' in the first code, and the 'temp' in the second code is a local variables, and thus it is released from the stack when the variables are out of the defined block.
The 'ptr' and the 'p' are pointers to the address of these local variables, but the values stored in these pointers are not valid after the local variables are released from the stack.
After the local variable is released, whether the value remains in the memory or not, is a problem with the development tool and the environment. That is, the stack is released, then emptying the memory of the pointer that occupied the local variable, is being treated within the OS or compiler, and the point is that you can no longer use the value of that address valid.
When I reviewed VC ++ 2008, after the local variable was released, the pointer has no more valid value. It has random value.
Related
Is this valid C code without undefined behaviour?
int main(){
int a;
memset(&a, 5, sizeof(int));
return a;
}
I'm assuming this is equal to just doing int a = 5.
I'm trying to understand if just declaring a variable in the above example (without defining it) is enough to put it on the stack.
Is this valid C code without undefined behaviour?
Yes – Once the a variable has been declared in a given scope (like a function or other { ... } delimited block), it is valid to take its address and access the variable using that address within that scope (as your memset call does). An attempt to use that address when that scope has ended (i.e. is no longer 'active') will cause undefined behaviour; for example, the following is UB:
int main()
{
int* p;
{ // New scope ...
int a;
p = &a; // Pointer to "a" is valid HERE
} // The scope of "a" (and its 'lifetime') ends here
memset(p, 5, sizeof(int)); // INVALID: "p" now points to a dead (invalid) variable
}
However, there's a major caveat in your code sample …
I'm assuming this is equal to just doing int a = 5.
There's the rub: It's assigning 5 to each component byte of the a variable, so it's doing this (assuming a 4-byte int):
int a = 0x05050505;
Which is the same as:
int a = 84215045;
From the C Standard (7.23.6.1 The memset function)
2 The memset function copies the value of c (converted to an unsigned
char) into each of the first n characters of the object pointed to by
s.
So this call
memset(&a, 5, sizeof(int));
does not set the variable a equal to 5. Internally the variable will look like
0x05050505
Here is a demonstrative program
#include <stdio.h>
#include <string.h>
int main(void)
{
int a;
memset( &a, 5, sizeof( int ) );
printf( "%#x\n", ( unsigned )a );
return 0;
}
Its output is
0x5050505
You should use the function memset with integers with caution because in general it can produce a trap value. Also the result depends on how internally integers are stored starting from MSB or LSB.
P.S. You declared a variable inside a block scope with no linkage. It is also a variable definition that has automatic storage duration. As the variable explicitly was not initialized then it has an indeterminate value. You may apply the address of operator & to get the address of the memory extent where the variable is defined.
That's not undefined behavior. The problem is that it doesn't what you expect.
The result of
memset(&a, 5, sizeof(int));
consists in setting to 5 each of the four bytes of your integer a.
Instead of initializing a pointer like this,
int main (){
int *ptr;
int x = 5;
ptr = &x;
}
What happens in memory when you do something like this?
int main (){
int *ptr = 100;
}
Would *ptr be looking for a random address that contains the value 100 or is it storing the value of 100 in ptr?
This is a constraint violation, the compiler should give you a diagnostic message. (If the compiler doesn't say "error" then I would recommend changing compiler flags so that it does). If the compiler generates an executable anyway, then the behaviour of that executable is undefined.
In Standard C, this code does not assign 100 to the pointer, as claimed by several other comments/answers. Integers cannot be assigned to pointers (i.e. using the assignment operator or initialization syntax with integer on the right-hand side), except the special case of constant expression zero.
To attempt to point the pointer to address 100, the code would be int *ptr = (int *)100;.
First of all, as mentioned in other answers, the code int *ptr = 100; is not even valid C. Assigning an integer to a pointer is not a valid form of simple assignment (6.5.16.1) so the code is a so-called "constraint violation", meaning it is a C standard violation.
So your first concern needs to be why your compiler does not follow the standard. If you are using gcc, please note that it is unfortunately not configured to be a conforming compiler per default, you have to tell it to become one by passing -std=c11 -pedantic-errors.
Once that is sorted, you can fix the code to become valid C by converting the integer to a pointer type, through an explicit cast. int *ptr = (int*)100;
This means nothing else but store the address 100 inside a pointer variable. No attempts have been made to access that memory location.
If you would attempt to access that memory by for example *ptr = 0; then it is your job to ensure that 100 is an aligned address for the given system. If not, you invoke undefined behavior.
And that's as far as the C language is concerned. C doesn't know or care what is stored at address 100. It is outside the scope of the language. On some systems this could be a perfectly valid address, on other systems it could be nonsense.
int *ptr = (int*)100; // valid
int *ptr = 100; // invalid as int to int * cast is not automatic
Usage of absolute address is discouraged because in a relocatable program segment, you would never know where a pointer should have as a value of some address, rather it should point to a valid address to avoid surprises.
Compiler won't give you any error. But it's a constraint violation. Pointer variables store addresses of other variables and *ptr is used to alter value stored at that address.
The long and short of it is nothing. Assigning the value 100 to the pointer value p does something; it sets the address for the pointer value to 100, just like assigning the value of the call malloc to it would; but since the value is never used, it doesn't actually do anything. We can take a look at what the value produces:
/* file test.c */
#include <stdlib.h>
#include <stdio.h>
int main() {
int *ptr = (int *) 100;
printf("%p\n", ptr);
return 0;
}
Executing this results in this output:
0x64
Pointers are, in a sense, just an integer. They (sometimes) take up about the same size of an integer (depending on the platform), and can be easily casted to and from an integer. There is even a specific (optional) type defined in C11 for an integer that is the same size as a pointer: intptr_t (see What is the use of intptr_t?).
Going back to the point, attempting to perform any dereferencing of this pointer of any kind can cause Weird Behavior(tm) - the layout of the memory is platform and implementation dependent, and so attempting to grab something at the address 100 will likely cause a Segmentation Fault (if you're lucky):
/* file test.c */
#include <stdlib.h>
#include <stdio.h>
int main() {
int *ptr = (int *) 100;
printf("%i\n", *ptr);
return 0;
}
Executing results in this output:
fish: "./test" terminated by signal SIGSEGV (Address boundary error)
So don't do it unless you know exactly what you're doing, and why you're doing it.
with:
int *ptr = 100;
you have assigned the value 100 to the pointer. Doing anything with it but printing its value will be undefined behavior:
printf ("pointer value is %p\n", (void *)ptr);
void first(){
int x;
int *p;
p= &x;
scanf("%d",p);
printf("The value in x or *p is: %d\n",x);
}
void second(){
int x;
int *ptr;
scanf("%d",&x);
printf("The value in *ptr is: %d\n",*ptr);
}
int main(){
first();
second();
}
In the above code the second() function is miss behaving.What ever value I give for the x variable that value is getting assigned to *ptr as well as to x. Why?
You didn't assign p a value, so it remains uninitialized. Attempting to dereference that pointer invokes undefined behavior.
Give p a value and you'll get the result you expect:
int *p = &x;
The fact that your code is still printing the correct value is part of that undefined behavior. One of the ways undefined behavior can manifest is that the code appears to work properly, but then a seemingly unrelated change will cause it to break.
In this particular situation, the functions first and second each define 2 local variables of the same type and in the same order. After the call to first completes, the memory that contained the values of x and p from that function still contain those values, but there have been no other function calls yet to overwrite them.
When you then call second immediately after first. The variables x and ptr in second end up using the same memory as x and p in first. And because ptr is uninitialized, it still contains the old value which is the address of x in first, which happens to be the same as the address of x in second.
Again, this is undefined behavior, so you can't depend on this happening all the time. If you added another variable to first or called another function between first and second, that will modify the stack memory previously used by first. Then than memory would contain some other value and you'll probably either print a garbage value or core dump.
The same code may give different results if compiled with a different compiler or different compiler options. For example, another compiler may choose to put the variables in each function in a different order on the stack, or it could decide to zero out the stack used by a function after the function returns.
int *p=&x;
The address of x gets stored in p.
printf("%d",*p);
Since p has the address of x, *p basically means that you are going to the location specified by p and picking up the item, which is the value of x.
Output is: 10 and it gives no error.
int main(){
int j=10;
int *i=&j;
printf("%d",*i);
return 0;
}
but it gives me an error:
int main(){
int *i;
int j=10;
*i=&j;
printf("%d",*i);
return 0;
}
I understand that pointer de-referencing is causing the error. But how is that happening?
Because you are using an uninitialized pointer.
Your *i = &j should be i = &j
This defines i as an int * and sets its value to the address of j:
int *i=&j;
This defines i as an int *, then tries to set what i points to to the address of j:
int *i;
int j=10;
*i=&j;
The final *i = ... is trying to dereference an uninitialized variable.
int *i=&j;
Here you're declaring i to be a int *, and assigning the address of j.
*i=&j;
In this case, though, you've already declared i, and you're assigning &j to the location that i points to rather than to i itself. So that's one error. Another is that i doesn't point to anything yet because you haven't initialized it. If you want i to point to j, you should drop the *:
i = &j;
i itself is declared as a "pointer to int". So you should write i = &j; to assign it with the address of j.
In your case, *i = &j dereferences it before the assignment, that is, the value of a pointer is assigned to an int, which resides in a legal or illegal memory block, because i is uninitialised.
Note that accessing an uninitialised variable causes undefined behaviour, not to mention accessing the object an uninitialised pointer points to.
Here is a simple declaration, with initialization, of an integer variable i:
int i = 10;
Normally it is very easy to split up the declaration and the initialization:
int i;
/* ... */
i = 10;
That's fine. But the syntax of pointer declarations in C is unusual, and it leads to a little bit of asymmetry when working with declarations and initializations. You can write
int *i = &j; /* correct */
But if you split it up, the * does not tag along, because it was part of the declaration.
int *i;
/* ... */
i = &j; /* right */
*i = &j; /* WRONG */
int *i;
You've declared i as a pointer to int, but you haven't set it to point to anything yet; the value of i is indeterminate1. It will contain some random string of bits that (most likely) does not correspond to a valid address2. Attempting to dereference i at this point leads to undefined behavior, which can mean anything from an outright crash to corrupted data to working without any apparent issues.
The line
*i = &j;
has two problems, the first one being that i doesn't point anywhere meaningful (and this is where your runtime error is undoubtedly coming from; you're attempting to access an invalid address). The second is that the types of *i and &j don't match; *i has type int, while &j has type int *.
Variables declared locally to a function without the static keyword have automatic storage duration, and are not implicitly initialized to any particular value. Do not assume that any such variable is initially set to 0 or NULL in the absence of an explicit initializer. Variables declared outside of any function body or with the static keyword have static storage duration, and those variables will be initialized to 0 or NULL in the absence of an explicit initializer.
"Valid" meaning the address of an object defined within your program (i.e., another variable, or a chunk of memory allocated via `malloc`, etc.) or a well-known address defined by the platform (such as a fixed hardware input address). NULL is a well-defined invalid address that's easy to test against.
Why does setting the value of a dereferenced pointer raise a Segmentation fault 11? To make what I mean clear look a the follow code:
#include <stdio.h>
int *ptr;
*ptr = 2;
int main(){
printf("%d\n", *ptr);
return 0;
}
I thought that *ptr=2 would set the rvalue that the pointer ptr is point to to 2. Is that not the case? I apologize if for those c expert programmers, this is really easy/obvious.
Are we only allowed to set a dereferenced pointer (i.e. *ptr) to a value if that value had a memory address? i.e. like doing:
int k = 7;
int *ptr = k;
and then:
*ptr = 2;
The problem here is that ptr is not pointing to allocated space. See the following:
#include <stdio.h>
#include <stdlib.h>
int main(void){
// Create a pointer to an integer.
// This pointer will point to some random (likely unallocated) memory address.
// Trying set the value at this memory address will almost certainly cause a segfault.
int *ptr;
// Create a new integer on the heap, and assign its address to ptr.
// Don't forget to call free() on it later!
ptr = malloc(sizeof(*ptr));
// Alternatively, we could create a new integer on the stack, and have
// ptr point to this.
int value;
ptr = &value;
// Set the value of our new integer to 2.
*ptr = 2;
// Print out the value at our now properly set integer.
printf("%d\n", *ptr);
return 0;
}
It's not 'illegal', it's simply implementation defined. In fact, on some platforms (such as DOS), specific memory addresses were necessary, for example to write text to the video buffer which started at 0xB8000, or memory mapped controller I/O on the SNES.
On most current OS's, a feature called ASLR is used, for security reasons, which makes ancient modes of dedicated addresses a thing of the past, in favor of going through driver and kernel layers, which is what makes it 'illegal' for most places you would run it.
The most basic issue here is that you are not assigning ptr to a valid memory address, there are some cases where 0 is a valid memory address but usually not. Since ptr is global variable in your first case, it will be initialized to 0. remyabal asked a great follow-up question and best answer made me realize that this is a redeclaration here:
*ptr = 2;
and you are then setting ptr to have a value of 2 which is except by chance unlikely to point to a valid memory address.
If ptr was a local or automatic variable then it would be uninitialized and it's value would be indeterminate. Using a pointer with an indeterminate value is undefined behavior in both C and C++. It is in most case undefined behavior to use a NULL pointer as well although implementations are allowed to define the behavior.
On most modern system attempting to access memory your process does not own will result in a segmentation fault.
You can assign a valid memory address to ptr in a few ways, for example:
int k = 7;
int *ptr = &k;
^
note the use of of & to take the address of k or you could use malloc to allocate memory dynamically for it.
Your code is invalid, though some C compilers may permit it for compatibility with older versions of the language.
Statements, including assignment statements, are illegal (a syntax error) if they appear outside the body of a function.
You have:
int *ptr;
*ptr = 2;
at file scope. The first line is a valid declaration of an int* object called ptr, implicitly initialized to a null pointer value. The second line looks like an assignment statement, but since it's outside a function, the compiler most likely won't even try to interpret it that way. gcc treats it as a declaration. Old versions of C permitted you to omit the type name in a declaration; C99 dropped the "implicit int" rule. So gcc treats
*ptr = 2;
as equivalent to
int *ptr = 2;
and produces the following warnings:
c.c:4:1: warning: data definition has no type or storage class [enabled by default]
c.c:4:8: warning: initialization makes pointer from integer without a cast [enabled by default]
The first warning is because you omitted the int (or other type name) from the declaration. The second is because 2 is a value of type int, and you're using it to initialize an object of type int*; there is no implicit conversion from int to int* (other than the special case of a null pointer constant).
Once you get past that, you have two declarations of the same object -- but they're compatible, so that's permitted. And the pointer variable is initialized to (int*)2, which is a garbage pointer value (there's not likely to be anything useful at memory address 0x00000002).
In your main function, you do:
printf("%d\n", *ptr);
which attempts to print the value of an int object at that memory address. Since that address is not likely to be one that your program has permission to access, a segmentation fault is not a surprising result. (More generally, the behavior is undefined.)
(This is a fairly common problem in C: minor errors in a program can result in something that still compiles, but is completely different from what you intended. The way I think of it is that C's grammar is relatively"dense"; small random tweaks to a valid program often produce different but syntactically valid programs rather than creating syntax errors.)
So that's what your program actually does; I'm sure it's not what you intended it to do.
Take a deep breath and read on.
Here's something that's probably closer to what you intended:
#include <stdio.h>
int *ptr;
int main(void) {
*ptr = 2;
printf("%d\n", *ptr);
return 0;
}
Since there's now no initializer for ptr, it's implicitly initialized to a null pointer value. (And if ptr were defined inside main, its initial value would be garbage.) The assignment statement attempts to dereference that null pointer, causing a segmentation fault (again, the behavior is undefined; a segmentation fault is a likely result). Execution never reaches the printf call.
I thought that *ptr=2 would set the rvalue that the pointer ptr is point to to 2. Is that not the case?
Not quite. Pointers don't point to rvalues; an "rvalue" is merely the value of an expression. Pointers point to objects (if they point to anything). The assignment
*ptr = 2;
would assign the value 2 to the object that ptr points to -- but ptr doesn't point to an object!
Now let's see a version of your program that actually works:
#include <stdio.h>
int *ptr;
int variable;
int main(void) {
ptr = &variable;
*ptr = 2;
printf("*ptr = %d\n", *ptr);
printf("variable = %d\n", variable);
return 0;
}
Now ptr points to an object, and *ptr = 2 assigns a value to that object. The output is:
*ptr = 2
variable = 2