When I define a pointer without initializing it:
int *pi;
it points to a random part of the memory.
What happens when I define a pointer to pointer without initializing it?
int **ppi;
Where does it point? It should point to another pointer, but I didn't define one, so maybe it points to a random part of the memory? If possible, could you show the difference with an example please?
To make it clear consider the following declaration
T *ptr;
where T is some type specifier. If the declared variable ptr has automatic storage duration then the pointer is initialized neither explicitly nor implicitly.
So the pointer has an indeterminate value.
T can be any type. You can define T as for example
typedef int T;
or
typedef int *T;
Pointers are scalar objects. Thus in this declaration
typedef int *T;
T *ptr;
the pointer ptr has indeterminate value the same way as in the declaration
typedef int T;
T *ptr;
Any local variable that isn't initialized contains an indeterminate value. Regardless of type. There's no obvious difference here between for example an uninitialized int, int* or int**.
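Since the question asked for an example, here is a minimal sketch of the same point from the other direction: every level of the chain has to be given a value before it is read. The function name read_through_chain is made up for illustration.

```c
/* Hypothetical helper: every pointer in the chain is initialized before use. */
int read_through_chain(int value)
{
    int i = value;     /* the int object itself */
    int *pi = &i;      /* pi holds the address of i */
    int **ppi = &pi;   /* ppi holds the address of pi */

    /* An uninitialized "int **bad;" would be indeterminate, exactly like an
       uninitialized "int *"; reading bad, *bad or **bad would all be invalid. */
    return **ppi;      /* ppi -> pi -> i */
}
```

Calling read_through_chain(42) returns 42, because both levels of indirection lead to a real object.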
However, there is a rule in C saying that if you never take the address of such an uninitialized local variable, but use its value, you invoke undefined behavior - meaning a bug, possibly a crash etc. The rationale is likely that such variables may be allocated in registers and not have an addressable memory location. See https://stackoverflow.com/a/40674888/584518 for details.
So the examples below are all bad and wrong, since the address of the local variables themselves is never taken:
{
int i;
int* ip;
int** ipp;
printf("%d\n", i); // undefined behavior
printf("%p\n", (void*)ip); // undefined behavior
printf("%p\n", (void*)ipp); // undefined behavior
}
However, if you take the address of a variable somewhere, C is less strict. In such a case you end up with the variable getting an indeterminate value, which means that it could contain anything and the value might not be consistent if you access it multiple times. This could be a "random address" in case of pointers, but not necessarily so.
An indeterminate value may be what's known as a "trap representation", a forbidden binary sequence for that type. In such cases, accessing the variable (read or write) invokes undefined behavior. This isn't likely to happen for plain int unless you have a very exotic system that is not using 2's complement - because in standard 2's complement systems, all value combinations of an int are valid and there are no padding bits, negative zero etc.
Example (assuming 2's complement):
{
int i;
int* ip = &i;
printf("%d\n", *ip); // unspecified behavior, might print anything
}
Unspecified behavior meaning that the compiler need not document the behavior. You can get any kind of output and it need not be consistent. But at least the program won't crash & burn as might happen in the case of undefined behavior.
But trap representations are more likely to be a thing for pointer variables. A specific CPU could have a restricted address space, or during low-level initialization the MMU could be set up to treat certain regions as virtual, some regions as data-only, some regions as code-only etc. It might be possible that such a CPU generates a hardware exception even when you read an invalid address value into an index register. It is certainly very likely that it does so if you attempt to access memory through an invalid address.
For example, the MMU might block runaway code which tries to execute code from the data segment of the memory, or to access the contents of the code memory as if it was data.
Background info: My program involves creating a hash table and one of my functions is free_hash(struct hash_table *table).
struct hash_table *table points to an array of struct hash_entry pointers. To test my free_hash function in main I have void *test_free = what. The declaration and initialization for what is hash_table *what = new_hash(array_size).
new_hash is declared as struct hash_table *new_hash(); it is a function that returns a pointer to a newly initialized struct hash_table.
My question: After freeing what, e.g. free_hash(what), what happens to test_free? What is its address, and what is the value it points at? And is there any other way I can make sure that what has been destructed/freed?
test_free and what are pointers. The value they have is basically an address. And you assigned the same address to both of them. Nothing happens to either variable once you free that to which they point.
Once you do, the pointers are deemed to be indeterminate, so it becomes undefined behaviour to dereference either one. But there's nothing in either variable that indicates this. The onus is on the programmer to ensure no attempt is made to access a freed structure.
As for checking if everything was properly freed, there's -fsanitize=address, valgrind, etc.
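One common convention, sketched here with hypothetical names (struct table, table_create and table_destroy are not the asker's code), is to have the destroy function take the address of the caller's pointer and set it to NULL. Note that this only resets the one pointer passed in; a copy such as test_free would still hold the stale value.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the asker's hash table type. */
struct table {
    size_t size;
    int *slots;
};

/* Allocates a table with `size` zeroed slots; returns NULL on failure. */
struct table *table_create(size_t size)
{
    struct table *t = malloc(sizeof *t);
    if (!t)
        return NULL;
    t->slots = calloc(size, sizeof *t->slots);
    if (!t->slots) {
        free(t);
        return NULL;
    }
    t->size = size;
    return t;
}

/* Frees the table and nulls the caller's pointer so it can be tested later. */
void table_destroy(struct table **pt)
{
    assert(pt != NULL);
    if (*pt) {
        free((*pt)->slots);
        free(*pt);
    }
    *pt = NULL;
}
```

After table_destroy(&t), the check t == NULL is well defined, and calling table_destroy(&t) a second time is harmless.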
C 2018 6.2.4 2 says “… The value of a pointer becomes indeterminate when the object it points to (or just past) reaches the end of its lifetime.” When you release memory with free, the lifetime of any object in it ends, so any pointers to this memory become indeterminate. “Indeterminate” means the pointer value is not even fixed; it may act as if it has a different value each time it is used.
This rule exists because there have been C implementations in which maintaining pointers required auxiliary information associated with the allocated memory. So the “value” of a pointer was not represented just by the bits directly in the memory used for the pointer object itself. Once the memory, and its auxiliary information, are released, it might no longer be possible to interpret the value of the pointer correctly.
In most modern C implementations, addresses are implemented simply as numbers in a “flat” address space. In this case, no auxiliary information is needed to interpret the value of a pointer or to work with it as by adding offsets to it. However, because the rule exists, optimizers in compilers may treat any pointer to freed memory as indeterminate.
For example, in this code:
void *x;
free(p);
if (SomeTest)
x = p;
else
x = q;
printf("%p\n", x);
the compiler is allowed to optimize this to:
void *x;
free(p);
x = q;
printf("%p\n", x);
even if SomeTest is true. That is, the fact that p is indeterminate after free means it is allowed to have any value, so it could have the value of q, so the if statement would just be:
if (SomeTest)
x = q;
else
x = q;
which of course can be optimized to x = q;.
In short, once you release memory, the C standard does not give you any assurance your program will behave as if a pointer to that memory has any particular value. It may act as if the pointer has a different value each time the program uses it.
#include <stdio.h>
int main(void)
{
int *ptr;
printf("%p", ptr); // Error: uninitialized local variable 'ptr' used
// Output is "0"
}
I'm reading the C-FAQ about null pointers. It says that an uninitialized pointer might point anywhere. Does that mean it points to a random location in memory? Also, if this statement is true, why does an error occur if I try printf("%p",ptr)? Since the uninitialized pointer ptr points to some random location, it seems that it must print out this random location!
The contents of an uninitialized auto variable (pointer type or otherwise) are indeterminate; in practice, it's whatever was last written to that memory location. The odds that this random bit pattern corresponds to a valid address [1] in your program are pretty low; it may even be a trap representation (a bit pattern that does not correspond to a legal value for the type).
Attempting to dereference an invalid pointer value results in undefined behavior; any result is possible. Your code may crash outright, it may run with no apparent issues, it may leave your system in a bad state.
[1] That is, the address of an object or function defined in your program, or a dynamic object allocated with malloc or similar.
Fetching the value of an invalid pointer is an implementation defined behavior in C++ according to this. Now consider the following C program:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
int* p=(int*)malloc(sizeof(int));
*p=3;
printf("%d\n",*p);
printf("%p\n",(void*)p);
free(p);
printf("%p\n",(void*)p); // Is this undefined or implementation defined behaviour in C?
}
But is the behaviour same in C also? Is the behaviour of the above C program undefined or implementation defined? What does the C99/C11 standard say about this?
Please tell me if the behaviour is different in C99 & C11.
Expanding on Andrew Henle's answer:
From the C99 Standard, 6.2.4:
An object has a storage duration that determines its lifetime. There are three storage durations: static, automatic, and allocated. Allocated storage is described in 7.20.3. […] The value of a pointer becomes indeterminate when the object it points to (or just past) reaches the end of its lifetime.
Then, in 7.20.3.2, the standard goes on to describe malloc(), calloc() and free(), mentioning that
The free function causes the space pointed to by ptr to be deallocated.
In 3.17.2:
indeterminate value
either an unspecified value or a trap representation
In 6.2.6.1.5:
Certain object representations need not represent a value of the object type. If the stored value of an object has such a representation and is read by an lvalue expression that does not have character type, the behavior is undefined. […] Such a representation is called a trap representation.
Since the pointer becomes indeterminate, and an indeterminate value can be a trap representation, and you have a variable which is an lvalue, and reading an lvalue trap representation is undefined, therefore yes, the behavior may be undefined.
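If the numeric address is wanted after the free (for logging, say), one way to stay clear of the whole question is to capture it while the pointer is still valid. A minimal sketch, assuming the implementation provides the optional type uintptr_t; the function name free_and_report is made up:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: records the address as an integer, then frees. */
uintptr_t free_and_report(void *p)
{
    uintptr_t addr = (uintptr_t)p;  /* converted while p is still valid */
    free(p);
    /* p is indeterminate from here on and is never read again */
    return addr;                    /* a plain integer, safe to print or log */
}
```

The returned integer can be printed with the PRIxPTR macro from <inttypes.h> without ever reading the freed pointer.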
Per the C standard, section 6.2.4:
The lifetime of an object is the portion of program execution during
which storage is guaranteed to be reserved for it. An object exists,
has a constant address, and retains its last-stored value throughout
its lifetime. If an object is referred to outside of its lifetime,
the behavior is undefined. The value of a pointer becomes
indeterminate when the object it points to (or just past) reaches the
end of its lifetime.
If a compiler correctly determines that code will inevitably fetch a
pointer to an object which has been passed to "free" or "realloc", even if
code will not make any use of the object identified thereby, the Standard will
impose no requirements on what the compiler may or may not do after that point.
Thus, using a construct like:
char *thing = malloc(1000);
int new_size = getData(thing, ...whatever); // Returns needed size
char *new_thing = realloc(thing, new_size);
if (!new_thing)
critical_error("Shrinking allocation failed!");
if (new_thing != thing)
adjust_pointers(thing, new_thing);
thing = new_thing;
might on most implementations allow code to save the effort of recalculating
some pointers in the event that using realloc to shrink an allocated block
doesn't cause the block to move, but there would be nothing illegitimate about
an implementation that unconditionally reported that the shrinking allocation
failed since if it didn't fail code would inevitably attempt a comparison
involving a pointer to a realloc'ed block. For that matter, it would also
be just as legitimate (though less "efficient") for an implementation to
keep the check whether realloc returned null, but allow arbitrary code to
execute if it doesn't.
Personally, I see very little to be gained by preventing programmers from testing when certain steps can be skipped. Skipping unnecessary code if a pointer doesn't change may yield significant efficiency improvements in cases where realloc is used to shrink a memory block (such an action is allowed to move the block but on most implementations it usually won't), but it is currently fashionable for compilers to apply their own aggressive optimizations which will break code that tries to use such techniques.
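One way to sidestep the comparison entirely is to store offsets instead of pointers into the block, so nothing needs adjusting whether or not realloc moves it. A rough sketch with made-up names (struct cursor_buf, cursor_buf_resize), not the code from the answer above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct cursor_buf {
    char *data;     /* start of the allocated block */
    size_t size;    /* current size in bytes */
    size_t cursor;  /* position kept as an offset, not as a pointer */
};

/* Resizes the block; returns 0 on success, -1 on failure (buffer unchanged). */
int cursor_buf_resize(struct cursor_buf *b, size_t new_size)
{
    assert(b != NULL);
    if (new_size == 0)
        return -1;              /* realloc with size 0 is best avoided */

    char *tmp = realloc(b->data, new_size);
    if (!tmp)
        return -1;              /* the old block is still valid */

    b->data = tmp;              /* the old pointer value is never read again */
    b->size = new_size;
    if (b->cursor > new_size)
        b->cursor = new_size;   /* clamp the offset to the new size */
    return 0;
}
```

The trade-off is an extra addition (b->data + b->cursor) on each access in exchange for never having to compare against the pre-realloc pointer.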
Continuing from the comments. I think the confusion over whether it is valid or invalid surrounds what aspect of the pointer is being asked about. Above, free(p); affects the block of memory pointed to by p; it does not affect the address of p itself, which remains valid. The address held by p (as its value) is no longer valid, leaving p indeterminate until reassigned. A short example helps:
#include <stdio.h>
#include <stdlib.h>
int main (void) {
int *p = NULL;
printf ("\n the address of 'p' (&p) : %p\n", (void *)&p);
p = malloc (sizeof *p);
if (!p) return 1;
*p = 3;
printf (" the address of 'p' (&p) : %p p points to %p with value %d\n",
(void *)&p, (void *)p, *p);
free (p);
/* 'address of p' unchanged, p itself indeterminate until reassigned */
printf (" the address of 'p' (&p) : %p\n\n", (void *)&p);
p = NULL; /* p no longer indeterminate and can be allocated again */
return 0;
}
Output
$ ./bin/pointer_addr
the address of 'p' (&p) : 0x7fff79e2e8a0
the address of 'p' (&p) : 0x7fff79e2e8a0 p points to 0x12be010 with value 3
the address of 'p' (&p) : 0x7fff79e2e8a0
The address of p itself is unchanged by either the malloc or free. What is affected is the value of p (or more correctly, the address p stores as its value). Upon free, the memory at the address p stores is released to the system and can no longer be accessed through p. Once you explicitly reassign p = NULL; p is no longer indeterminate and can be used for allocation again.
I have read
Directly assigning values to C Pointers
However, I am trying to understand this different scenario...
int *ptr = 10000;
printf("value: %d\n", ptr);
printf("value: %d\n", *ptr);
I got a segmentation fault on the second printf.
Now, I am under the impression that 10000 is a memory location because pointers point to the address in the memory. I am also aware that 10000 could be anywhere in the memory (which might already be occupied by some other process)
Therefore, I am thinking so the first print is just saying that "ok, just give me the value of the address as some integer value", so, ok, I got 10000.
Then I am saying "ok, now dereference it for me", but I have not put anything in it (or it is uninitialized), so I got a segmentation fault.
Maybe my logic is already totally off the track at this point.
UPDATED::::
Thanks for all the quick responses.. So here is my understanding.
First,
int *ptr = 10000;
is UB because I cannot assign a pointer to a constant value.
Second, the following is also UB because instead of using %p, I am using %d.
printf("value: %d\n", ptr)
Third, I have given an address (although it is UB), but I have not initialized to some value so, the following statement got seg fault.
printf("value: %d\n", *ptr)
Is my understanding correct now ?
thanks.
int *ptr = 10000;
This is not merely undefined behavior. This is a constraint violation.
The expression 10000 is of type int. ptr is of type int*. There is no implicit conversion from int to int* (except for the special case of a null pointer constant, which doesn't apply here).
Any conforming C compiler, on processing this declaration, must issue a diagnostic message. It's permitted for that message to be a non-fatal warning, but once it's issued that message, the program's behavior is undefined.
A compiler could treat it as a fatal error and refuse to compile your program. (In my opinion, compilers should do this.)
If you really wanted to assign ptr to point to address 10000, you could have written:
int *ptr = (int*)10000;
There's no implicit conversion from int to int*, but you can do an explicit conversion with a cast operator.
That's a valid thing to do if you happen to know that 10000 is a valid address for the machine your code will run on. But in general the result of converting an integer to a pointer "is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation" (N1570 section 6.3.2.3). If 10000 isn't a valid address (and it very probably isn't), then your program still has undefined behavior, even if you only try to access the value of the pointer, but especially if you try to dereference it.
This also assumes that converting the integer value 10000 to a pointer type is meaningful. Commonly such a conversion copies the bits of the numeric value, but the C standard doesn't say so. It might do some strange implementation-defined transformation on the number to produce an address.
Addresses (pointer values) are not numbers.
printf("value: %d\n", ptr);
This definitely has undefined behavior. The %d format requires an int argument. On many systems, int and int* aren't even the same size. You might end up printing, say, the high-order half of the pointer value, or even some complete garbage if integers and pointers aren't passed as function arguments in the same way. To print a pointer, use %p and convert the pointer to void*:
printf("value: %p\n", (void*)ptr);
Finally:
printf("value: %d\n", *ptr);
The format string is correct, but just evaluating *ptr has undefined behavior (unless (int*)10000 happens to be a valid address).
Note that "undefined behavior" doesn't mean your program is going to crash. It means that the standard says nothing about what will happen when you run it. (Crashing is probably the best possible outcome; it makes it obvious that there's a bug.)
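Putting the printf points together, a small sketch of the well-defined version: the pointer refers to a real object, %p gets a void *, and %d gets an int. The helper name describe_int is hypothetical, and the text produced by %p is implementation-defined.

```c
#include <stdio.h>

/* Hypothetical helper: writes "<address> holds <value>" into buf and
   returns snprintf's result. ptr must point to a live int object. */
int describe_int(char *buf, size_t n, const int *ptr)
{
    return snprintf(buf, n, "%p holds %d", (void *)ptr, *ptr);
}
```

With int x = 5; a call such as describe_int(buf, sizeof buf, &x) is fine because &x is a valid address; passing (int*)10000 would bring back the undefined behavior described above.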
No, the definition int *ptr = 10000 does not give undefined behaviour.
It converts the literal value 10000 into a pointer, and initialises ptr with that value.
However, in your example
int *ptr = 10000;
printf("value: %d\n", ptr);
printf("value: %d\n", *ptr);
both of the printf() statements give undefined behaviour.
The first gives undefined behaviour because the %d format tells printf() that the corresponding argument is of type int, which ptr is not. In practice (with most compilers/libraries) it will often happily print the value 10000, but that is happenstance. Essentially (and a little over-simplistically), for that to happen, a round-trip conversion (e.g. converting 10000 from int to pointer, and then converting that pointer value to an int) needs to give the same value. Surviving that round trip is NOT guaranteed, although it does happen with some implementations, so the first printf() might APPEAR well behaved, despite involving undefined behaviour.
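The round trip the standard does guarantee goes the other way: a valid pointer converted to uintptr_t (an optional type, assumed to be available here) and back compares equal to the original. A minimal sketch with a made-up function name:

```c
#include <stdint.h>

/* Returns 1 if p survives pointer -> uintptr_t -> pointer. */
int survives_round_trip(int *p)
{
    uintptr_t n = (uintptr_t)(void *)p;  /* pointer to integer */
    int *back = (void *)n;               /* integer back to pointer */
    return back == p;
}
```

Nothing comparable is promised for starting from an arbitrary integer such as 10000.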
Part of the problem with undefined behaviour is that one possible result is code behaving as the programmer expects. That doesn't make the behaviour defined. It simply means that a particular set of circumstances (behaviour of compiler, operating system, hardware, etc) happen to conspire to give behaviour that seems sensible to the programmer.
The second printf() statement gives undefined behaviour because it dereferences ptr. The standard gives no basis to expect that a pointer with value 10000 corresponds to anything in particular. It might be a location in RAM. It might be a location in video memory. It might be a value that does not correspond to any location in memory that exists on your computer. It might be a logical or physical memory location that your operating system deems your process is not allowed to access (which is actually what causes an access violation under several operating systems, which then send a signal to the process running your program directing it to terminate).
A lot of C compilers (if appropriately configured) will give a warning on the initialisation of ptr because of this - an initialisation like this is easier for the compiler to detect, and usually indicates problems in subsequent code.
This may cause undefined behavior since the pointer converted from 10000 may be invalid.
Your OS may not allow your program to access the address 10000, so it will raise Segmentation Fault.
int *x = some numerical value (i.e. 10, whatever)
may be for microcomputers or low-level (example: creating OS).
I am learning C now, and these days I am studying pointers and I just came up with a question!
int *ptr; //declare the ptr
ptr = &var; //init the ptr with the address of the variable var
With these lines, I created a pointer and linked ptr with a variable. My question is this: when I declare a pointer int *ptr; and I don't initialize it with an address, where does this pointer point?
In C, variables are generally not initialized unless you specifically say so:
int a; // not initialized
int b = 1; // initialized
int arr[10]; // not initialized
int brr[4] = { 1 }; // initialized as { 1, 0, 0, 0 }
void * p; // not initialized
void * q = &a; // initialized
(There are exceptions for variables with static or thread-local storage, which are always zero-initialized.)
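A small sketch of that exception; the names are made up for illustration. Both pointers below have static storage duration, so they start out as null pointers without any initializer.

```c
#include <stddef.h>

static int *file_scope_ptr;          /* static storage: starts as a null pointer */

/* Returns 1 when both static pointers compare equal to NULL. */
int statics_start_null(void)
{
    static int **block_scope_ptr;    /* also static storage, also null */
    return file_scope_ptr == NULL && block_scope_ptr == NULL;
}
```

Dropping the static keyword from block_scope_ptr would make it an automatic variable, and the comparison would then read an indeterminate value.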
It is not allowed to try and get at the value of an uninitialized variable. The only thing you can do with an uninitialized variable is assign to it, which does not access its current value, but only assigns a new value to it. Before initialization or assignment, the current value of a variable is "indeterminate" and you must not attempt to access it. Doing so results in undefined behaviour.
This is true for all variables, but in particular it applies to your pointer variable. It simply has no meaningful value until you assign one.
void * p; // not initialized
if (p) { /*...*/ } // undefined behaviour!
printf("%p\n", p); // undefined behaviour!
p = &a; // now p has a well-defined value
The technical term for the action that is causing undefined behaviour is the so-called "lvalue conversion". That is the moment in which you take a named variable (an "lvalue") and use its content. E.g. C11, 6.3.2.1/2 says:
If the lvalue designates an object of automatic storage duration [...] and that object
is uninitialized (not declared with an initializer and no assignment to it has been
performed prior to use), the behavior is undefined.
It is just like any other uninitialized local variable -- it is undefined where it points or what value it contains, and you are not allowed to use it (e.g., dereference it) until it is initialized. As stated in @WhozCraig's comment, almost all other operations are forbidden as well (using the pointer's value at all, including arithmetic and comparisons). Uninitialized non-pointer variables (even those with simple types such as ints) cannot be used for any operations that access their values, either.
Actually, as it has been stated in almost all answers so far, the pointer's value is unknown and consists of the contents of the memory at that location when it was allocated.
Contrary to what some answers state though, no one and nothing is going to stop you from dereferencing it, or doing any kind of operation with this pointer.
As a result, using such a pointer will produce any kind of unpredictable results. It is not only best practice but a requirement for producing less buggy code, to initialize a pointer on declaration to something, even if that something is, simply, NULL.
It points to a random memory location. Dereferencing such a pointer usually leads to a segfault.
In this case, it will point anywhere. You don't know. The contents of the pointer will be whatever was at the memory location before. So this is very dangerous and should be avoided. You should always init a pointer with NULL, then it will point to "nothing" in a defined way.
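As a sketch of why NULL is the useful choice: a null pointer can be tested before use, whereas an indeterminate one cannot. value_or is a hypothetical helper.

```c
#include <stddef.h>

/* Returns *p when p is non-null, otherwise the fallback. */
int value_or(const int *p, int fallback)
{
    return p != NULL ? *p : fallback;
}
```

With int *ptr = NULL; the call value_or(ptr, -1) returns -1 instead of reading through a garbage address.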
Like any other non-static variable in C, it's not initialized automatically. It contains whatever junk data was in the memory slot, and so dereferencing it before assigning a proper value to it is likely to be a bad idea.