Is the printf statement valid? - c

#include <stdio.h>
#include <conio.h>   /* getch(), non-standard */

int main()
{
    struct a
    {
        struct a *next;
        struct a *prev;
    };
    struct a *A[2];
    printf("Address of (&(A[0])->next) = %p",(&(A[0])->next));
    getch();
    return 0;
}
In the above printf statement I'm accessing the "next" pointer of the "struct a" structure, and when I run the program in the Dev compiler it prints what looks like a valid memory address (even though I have not yet allocated any memory for it). An explanation of why this happens would be very helpful.
Is any memory allocated for the "next" & "prev" fields?

Let's think about what this means:
&(A[0])->next
It is the address of the next pointer (not where it points, but the address of the pointer itself). And the next pointer is the first element of struct a, so the address of next is the same as the address of its enclosing a.
Therefore, the expression is the address of the struct a referred to by A[0]. In your original code, you never assign anything there, so it's simply a garbage value being printed. As @alk points out in another answer, you could initialize the two pointers in your variable A and then you would see the first of those values being printed (say, 0x0).
By the way, if you want to quickly initialize A, do it this way, not with the more verbose memset():
struct a *A[2] = {0};
It does the same thing (sets the two pointers to 0).
While the value being printed is garbage, the code may not be illegal. This may seem surprising, but see here: Dereferencing an invalid pointer, then taking the address of the result - you've got something similar, though admittedly you've taken it a step further by dereferencing a member of a struct as opposed to simply using *. So the open question in my mind is: given that &*foo is always legal when foo is a pointer (as shown in the above link), does the same hold true for &foo->bar?

&(A[0])->next
is the address of the next member of the structure that the first pointer in the A array points to.
This can be thought of as (char *)A[0] + offsetof(struct a, next). I.e., this just results in whatever the value of the uninitialized pointer A[0] was, plus the offset of the next member from the base address of the structure (which happens to be zero, since next is the first element of the structure).
According to the C standard, your program invokes undefined behavior because it performs pointer arithmetic on an invalid pointer. However, in practice, this will most likely not crash and will simply print a bogus address (only an addition is performed, nothing accesses the memory behind the pointer). Expect a crash though if you actually dereference the pointer.
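To see the offset relationship without relying on undefined behavior, here is a minimal sketch that first points A[0] at a real object. The helper name next_member_matches_offset is made up for illustration; struct a is the one from the question.

```c
#include <stddef.h>   /* offsetof */
#include <stdlib.h>   /* malloc, free */

struct a {
    struct a *next;
    struct a *prev;
};

/* Hypothetical helper: returns 1 if &A[0]->next equals the address held in
   A[0] plus offsetof(struct a, next), using a properly allocated node. */
int next_member_matches_offset(void)
{
    struct a *A[2] = {0};            /* both pointers start out as null */
    int same;

    A[0] = malloc(sizeof *A[0]);     /* now A[0] points to a real object */
    if (A[0] == NULL)
        return 0;

    /* Only addresses are compared; the members themselves are never read. */
    same = (char *)&A[0]->next == (char *)A[0] + offsetof(struct a, next);

    free(A[0]);
    return same;
}
```

Because next is the first member, the offset is zero and both addresses coincide with the value stored in A[0].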


What happens to a pointer after what it's been pointing to has been freed

Background info: My program involves creating a hash table and one of my functions is free_hash(struct hash_table *table).
struct hash_table *table points to an array of struct hash_entry pointers. To test my free_hash function, in main I have void *test_free = what. The declaration and initialization of what is hash_table *what = new_hash(array_size).
new_hash is declared as struct hash_table *new_hash(); it is a function that returns a pointer to a newly initialized struct hash_table.
My question: after freeing what, e.g. free_hash(what), what happens to test_free? What is its address, and what is the value of what it points to? And is there any other way I can make sure that what has been destroyed/freed?
test_free and what are pointers. The value they have is basically an address. And you assigned the same address to both of them. Nothing happens to either variable once you free that to which they point.
Once you do, the pointers are deemed to be indeterminate, so it becomes undefined behaviour to dereference either one. But there's nothing in either variable that indicates this. The onus is on the programmer to ensure no attempt is made to access a freed structure.
As for checking if everything was properly freed, there's -fsanitize=address, valgrind, etc.
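One common way to reduce the risk, sketched here under the assumption that you funnel every release through a single helper, is to reset the owning pointer to NULL right after freeing it. The name release_buffer is hypothetical, and the helper cannot update copies such as test_free, which still hold the stale address.

```c
#include <stdlib.h>

/* Hypothetical helper: frees *pp and resets it to NULL so later code can
   test the pointer instead of touching freed memory. Copies of the old
   pointer held elsewhere are NOT updated. */
void release_buffer(char **pp)
{
    if (pp != NULL) {
        free(*pp);      /* free(NULL) is a no-op, so a second call is harmless */
        *pp = NULL;
    }
}
```

After release_buffer(&p), a check such as if (p != NULL) is meaningful again for p itself, though not for any other variable that was assigned the same address.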
C 2018 6.2.4 2 says “… The value of a pointer becomes indeterminate when the object it points to (or just past) reaches the end of its lifetime.” When you release memory with free, the lifetime of any objects in it ends, so any pointers to this memory become indeterminate. “Indeterminate” means the pointer value is not even fixed; it may act as if it has a different value each time it is used.
This rule exists because there have been C implementations in which maintaining pointers required auxiliary information associated with the allocated memory. So the “value” of a pointer was not represented just by the bits directly in the memory used for the pointer object itself. Once the memory, and its auxiliary information, are released, it might no longer be possible to interpret the value of the pointer correctly.
In most modern C implementations, addresses are implemented simply as numbers in a “flat” address space. In this case, no auxiliary information is needed to interpret the value of a pointer or to work with it as by adding offsets to it. However, because the rule exists, optimizers in compilers may treat any pointer to freed memory as indeterminate.
For example, in this code:
void *x;
free(p);
if (SomeTest)
    x = p;
else
    x = q;
printf("%p\n", x);
the compiler is allowed to optimize this to:
void *x;
free(p);
x = q;
printf("%p\n", x);
even if SomeTest is true. That is, the fact that p is indeterminate after free means it is allowed to have any value, so it could have the value of q, so the if statement would just be:
if (SomeTest)
    x = q;
else
    x = q;
which of course can be optimized to x = q;.
In short, once you release memory, the C standard does not give you any assurance your program will behave as if a pointer to that memory has any particular value. It may act as if the pointer has a different value each time the program uses it.

Pointer layout in memory in C

I've recently been messing around with pointers and I would like to know a bit more about them, namely how they are organized in memory after using malloc for example.
So this is my understanding of it so far.
int **pointer = NULL;
Since we explicitly set the pointer to NULL it now points to the address 0x00.
Now let's say we do
pointer = malloc(4*sizeof(int*));
Now we have pointer pointing to an address in memory - let's say pointer points to the address 0x0010.
Let's say we then run a loop:
for (i = 0; i<4; i++) pointer[i] = malloc(3*sizeof(int));
Now, this is where it starts getting confusing to me. If we dereference pointer, by doing *pointer what do we get? Do we get pointer[0]? And if so, what is pointer[0]?
Continuing, now supposedly pointer[i] contains stored in it an address. And this is where it really starts confusing me and I will use images to better describe what I think is going on.
In the image you see, if it is correct, is pointer[0] referring to the box that has the address 0x0020 in it? What about pointer[1]?
If I were to print the contents of pointer would it show me 0x0010? What about pointer[0]? Would it show me 0x0020?
Thank you for taking the time to read my question and helping me understand the memory layout.
Pointer Refresher
A pointer is just a numeric value that holds the address of a value of type T. This means that T can also be a pointer type, thus creating pointers-to-pointers, pointers-to-pointers-to-pointers, and crazy things like char********** - which is simply a pointer (T*) where T is a pointer to something else (T = E*) where E is a pointer to something else (and so on...).
Something to remember here is that a pointer itself is a value and thus takes space. More specifically, it's (usually) the size of the addressable space the CPU supports.
So for example, the 6502 processor (commonly found in old gaming consoles like the NES and Atari, as well as the Apple II, etc.) has a 16-bit address bus and can therefore address only 64 KiB of memory, and thus its "pointers" were 16 bits in size.
So regardless of the underlying type, a pointer will (usually) be as large as the addressable space.
Keep in mind that a pointer doesn't guarantee that it points to valid memory - it's simply a numeric value that happens to specify a location in memory.
Array Refresher
An array is simply a series of T elements in contiguously addressable memory. The fact it's a "double pointer" (or pointer-to-a-pointer) is innocuous - it is still a regular pointer.
For example, allocating an array of 3 T's will result in a memory block that is 3 * sizeof(T) bytes long.
When you malloc(...) that memory, the pointer returned simply points to the first element.
T *array = malloc(3 * sizeof(T));
printf("%d\n", (&array[0] == &(*array))); // 1 (true)
Keep in mind that the subscript operator (the [...]) is basically just syntactic sugar for:
(*(array + n)) // array[n]; the compiler scales n by sizeof(*array) for you
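As a small sketch of how subscripting relates to pointer arithmetic: the scaling by the element size is done by the compiler, so it is not written out by hand. Both function names below are made up for illustration.

```c
#include <stddef.h>

/* Returns 1 if array[n] and *(array + n) name the same element. */
int subscript_matches_pointer_sum(const int *array, size_t n)
{
    return &array[n] == array + n;
}

/* Distance in bytes between element n and element 0. */
size_t byte_offset_of_element(const int *array, size_t n)
{
    return (size_t)((const char *)(array + n) - (const char *)array);
}
```

For an int array, byte_offset_of_element(array, 3) yields 3 * sizeof(int), even though the source only ever adds 3 to the pointer.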
Arrays of Pointers
To sum all of this up, when you do
E **array = malloc(3 * sizeof(E*));
You're doing the same thing as
T *array = malloc(3 * sizeof(T));
where T is really E*.
Two things to remember about malloc(...):
It doesn't initialize the memory with any specific values (use calloc for that)
It's not guaranteed (nor really even common) for the memory to be contiguous or adjacent to the memory returned by a previous call to malloc
Therefore, when you fill the previously created array-of-pointers with subsequent calls to malloc(), they might be in arbitrarily random places in memory.
All you're doing with your first malloc() call is simply creating the block of memory required to store n pointers. That's it.
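The two allocation steps described above might look like the following sketch. The names make_rows and free_rows are hypothetical, and calloc is used for the rows (an assumption, not part of the question's code) so the ints start out as zero.

```c
#include <stdlib.h>

/* Hypothetical helper: allocates `rows` pointers, each pointing to its own
   zero-initialized block of `cols` ints. Returns NULL on failure. */
int **make_rows(size_t rows, size_t cols)
{
    int **table = malloc(rows * sizeof *table);   /* block of int* slots */
    if (table == NULL)
        return NULL;

    for (size_t i = 0; i < rows; i++) {
        table[i] = calloc(cols, sizeof **table);  /* separate block per row */
        if (table[i] == NULL) {
            while (i > 0)                         /* undo earlier rows */
                free(table[--i]);
            free(table);
            return NULL;
        }
    }
    return table;
}

/* Frees every row, then the block of row pointers itself. */
void free_rows(int **table, size_t rows)
{
    if (table == NULL)
        return;
    for (size_t i = 0; i < rows; i++)
        free(table[i]);
    free(table);
}
```

With int **pointer = make_rows(4, 3), *pointer and pointer[0] are the same int * value, and the row blocks may sit anywhere in memory relative to each other.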
To answer your questions...
If we dereference pointer, by doing *pointer what do we get? Do we get pointer[0]?
Since pointer is just an int**, and remembering that malloc(...) returns the address of the first byte in the block of memory you allocated, *pointer will indeed evaluate to pointer[0].
And if so, what is pointer[0]?
Again, since pointer has the type int**, pointer[0] will return a value of type int* with the numeric contents of the first sizeof(int*) bytes in the memory block pointed to by pointer.
If I were to print the contents of pointer would it show me 0x0010?
If by "printing the contents" you mean printf("%p\n", (void*) pointer), then no.
Since you malloc()'d the memory block that pointer points to, pointer itself is just a value with the size of sizeof(int**), and thus will hold the address (as a numeric value) where the block of memory you malloc()'d resides.
So the above printf() call will simply print that value out.
What about pointer[0]?
Again assuming you mean printf("%p\n", (void*) pointer[0]), then you'll get a slightly different output.
Since pointer[0] is the equivalent of *pointer, and thus causes pointer to be dereferenced, you'll get a value of int* and thus the pointer value that is stored in the first element.
You would need to further dereference that pointer to get the numeric value stored in the first integer that you allocated (once you have stored one there, since malloc() does not initialize it); for example:
printf("%d\n", **pointer);
// or
printf("%d\n", *pointer[0]);
// or even
printf("%d\n", pointer[0][0]); // though this isn't recommended
// for readability's sake since
// `pointer[0]` isn't an array but
// a pointer to the first `int` of a malloc()'d block.
If I dereference pointer, by doing *pointer what do I get? pointer[0]?
Yes.
And if so, what is pointer[0]?
With your definitions: 0x0020.
In the image you see, if it is correct
It seems correct to me.
is pointer[0] referring to the box that has the address 0x0020 in it?
Still yes.
What about pointer[1]?
At this point, I think you can guess that it would show: 0x002c.
To go further
If you want to check how memory is managed and what pointers look like you can use gdb. It allows running a program step by step and performing various operations such as showing the content of variables. Here is the main page for GNU gdb. A quick internet search should let you find numerous gdb tutorials.
You can also print the value held in a pointer in C by using a printf line:
int *plop = NULL;
fprintf(stdout, "%p\n", (void *)plop);
Note: don't forget to include <stdio.h>

C pointer initialization differences

I am new to C and have some questions about the pointer.
Question 1: What are the differences between the following two? Which way is better to initialize a pointer, and why?
int *p=NULL;
int *p;
#include <stdio.h>
void main()
{
    char *s = "hello";
    printf("%p\t%p",s);
    //printf("%p\t%p",&s) it will give me unpredictable result every time
    //printf("%p\t%p",(void *)&s) it will be fine
    //Question3: why?
}
Question 2: I tried to google what %p does. According to my reading, it is supposed to print the pointer. Does that mean it prints the address of the pointer?
Question 1: these are both definitions of the pointer p. One initializes the pointer to NULL; the other leaves it uninitialized (if it is a local variable in a function; global variables get initialized to 0 by default). Initializing with NULL can be good or bad. It can be bad because the compiler can warn you about uses of uninitialized variables and so help you find bugs, and an initializer silences that warning. On the other hand, the compiler can't detect every possible use of an uninitialized variable, and on typical platforms dereferencing a NULL pointer produces a segmentation fault, which you can then catch and debug very easily with a debugger. Personally, I'd always initialize a variable when it is defined, with the correct value if possible (if the initialization is too complex for a single statement, add a helper function to get the value).
Question 2, %p prints the address value passed to printf. So printf("%p", pointer); gets passed value of variable pointer and it prints that, while printf("%p", &pointer); (note the extra & there) gets passed address of the variable pointer, and it prints that. Exact numeric format of %p is implementation defined, it might be printed just as a plain number.
Question 3 is about undefined behavior, because format string has more items than what you actually pass to printf. Short answer is, behavior is undefined, there is no "why". Longer answer is, run the application with machine code debugger and trace the execution in disassembly view to see what actually happens, to see why. Note that results may be different on different runs, and behavior may be different under debugger and running normally, because memory may have different byte values in different runs for various reasons.
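A minimal sketch of well-defined %p usage, with one argument per conversion and an explicit cast to void *. The helper name format_pointer is hypothetical; it formats into a buffer rather than printing so the result can be inspected.

```c
#include <stdio.h>

/* Hypothetical helper: writes the %p rendering of `p` into `buf`.
   Returns whatever snprintf returns (the number of characters needed). */
int format_pointer(char *buf, size_t size, const void *p)
{
    /* %p expects a void *, so cast explicitly. */
    return snprintf(buf, size, "%p", (void *)p);
}
```

With char *s = "hello", calling format_pointer(buf, sizeof buf, s) renders the address where the string is stored, while format_pointer(buf, sizeof buf, &s) renders the address of the variable s itself. The exact text produced is implementation-defined.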
1) The first is an initialization (to NULL in this case) the second is only a declaration of p as a pointer to int, no initial value is assigned to p in this case. You should always prefer an initialization to prevent undefined behavior.
2) You should cast to void* when using %p to print out a pointer (beware that you are using it too many times in your format specifier). The memory address to which p points is printed.
1)
int *p = NULL
defines and initializes a pointer 'p' to NULL. This is the correct way to initialize pointers in order to get "Seg Fault" if you forget to assign a valid address to this pointer later.
int *p
Only defines a pointer "p" holding an unknown address. If you forget to assign a valid value to this pointer before using it, some compilers will notify you about this mistake while others will not, and you may access an invalid address and get a run-time error or undefined behaviour of the program.
2) "%p" prints the address the pointer points to. Since the pointer holds an address, "%p" prints this address.
printf("%p\t%p",s);
So the first "%p" will print the address where the pointer "s" points, which is the address that stores the string "hello". However, note that you use "%p" twice but provide only one pointer to print!
Not every compiler warns about this by default, but passing fewer arguments than the format string requires is undefined behavior, so avoid it.
Answer1 :
int *p=NULL;
p is a pointer to a int variable initialized with NULL. Here NULL means pointer p is not pointing to any valid memory location.
int *p;
p is a pointer to an int variable. p is uninitialized. Reading uninitialized variables is Undefined Behavior. (One possible outcome of using it is a segmentation fault.)
Answer2:
It prints content of pointer. I mean base address of string "hello"
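The practical benefit of initializing to NULL is that the pointer can be tested before use. A minimal sketch, with value_or as a made-up helper name:

```c
#include <stddef.h>   /* NULL */

/* Hypothetical helper: returns *p when p is non-null, otherwise `fallback`.
   This only works because callers initialize unused pointers to NULL; an
   uninitialized pointer cannot be tested this way. */
int value_or(const int *p, int fallback)
{
    return (p != NULL) ? *p : fallback;
}
```

With int *p = NULL, value_or(p, 7) returns the fallback; with a plain int *p; the call itself would already read an indeterminate value.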
The main difference is that in *p = NULL, NULL is a pre-defined and standard 'place' where the pointer points.
Reading from Wikipedia,
The macro NULL is defined as an implementation-defined null pointer constant,
which in C99 can be portably expressed as the integer value 0
converted implicitly or explicitly to the type void*.
This means that the 'memory cell' called p contains the MACRO value of NULL.
If you just write int *p, you are naming the memory cell with the name p, but its contents are indeterminate (the cell is not guaranteed to be empty or zero).

Confusion about the fact that uninitialized pointer points to anywhere

#include <stdio.h>
int main(void)
{
int *ptr;
printf("%p", ptr); // Error: uninitialized local variable 'ptr' used
// Output is "0"
}
I'm reading C-FAQ about null pointer. And it says that uninitialized pointer might point to anywhere. Does that mean it points to random location in memory? Also if this statement is true, why does error occur if i try printf("%p",ptr)? Since uninitialized pointer ptr points to some random location, it seems that it must print out this random location!
The contents of an uninitialized auto variable (pointer type or otherwise) are indeterminate; in practice, it's whatever was last written to that memory location. The odds that this random bit pattern corresponds to a valid address [1] in your program are pretty low; it may even be a trap representation (a bit pattern that does not correspond to a legal value for the type).
Attempting to dereference an invalid pointer value results in undefined behavior; any result is possible. Your code may crash outright, it may run with no apparent issues, it may leave your system in a bad state.
[1] That is, the address of an object or function defined in your program, or a dynamic object allocated with malloc or similar.

memcpy fails but assignment doesn't on character pointers

Actually, memcpy works just fine when I use pointers to characters, but stops working when I use pointers to pointers to characters.
Can somebody please help me understand why memcpy fails here, or better yet, how I could have figured it out myself. I am finding it very difficult to understand the problems arising in my c/c++ code.
char *pc = "abcd";
char **ppc = &pc;
char **ppc2 = &pc;
setStaticAndDynamicPointers(ppc, ppc2);
char c;
c = (*ppc)[1];
assert(c == 'b'); // assertion doesn't fail.
memcpy(&c,&(*ppc[1]),1);
if(c!='b')
    puts("memcpy didn't work."); // this gets printed out.
c = (*ppc2)[3];
assert(c=='d'); // assertion doesn't fail.
memcpy(&c, &(*ppc2[3]), 1);
if(c != 'd')
    puts("memcpy didn't work again.");
memcpy(&c, pc, 1);
assert(c == 'a'); // assertion doesn't fail, even though used memcpy
void setStaticAndDynamicPointers(char **charIn, char **charIn2)
{
    // sets the first arg to a pointer to static memory.
    // sets the second arg to a pointer to dynamic memory.
    char stat[5];
    memcpy(stat, "abcd", 5);
    *charIn = stat;
    char *dyn = new char[5];
    memcpy(dyn, "abcd", 5);
    *charIn2 = dyn;
}
Your comment implies that char stat[5] should be static, but it isn't. As a result, *charIn points to a block that is allocated on the stack, and once you return from the function that block no longer exists. Did you mean static char stat[5]?
char stat[5];
is a stack variable whose lifetime ends when the function returns; despite the comment "sets the first arg to a pointer to static memory", it is not static. You need to malloc/new some memory and put abcd into it, as you do for charIn2.
Just like what Preet said, I don't think the problem is with memcpy. In your function "setStaticAndDynamicPointers", you are setting a pointer to an automatic variable created on the stack of that function call. By the time the function exits, the memory pointed to by the "stat" variable will no longer exist. As a result, the first argument's target, *charIn, will point to something that's non-existent. It may help to read in greater detail about stack frames (or activation records).
You have effectively created a dangling pointer to a stack variable in that code. If you want to test copying values into a stack var, make sure it's created in the caller function, not within the called function.
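Two ways to avoid the dangling pointer can be sketched as follows. This is plain C, so malloc stands in for the question's new char[5]; both function names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Option 1: the caller owns the storage, so it outlives this call.
   Returns 0 on success, -1 if the buffer is missing or too small. */
int fill_caller_buffer(char *out, size_t size)
{
    static const char text[] = "abcd";
    if (out == NULL || size < sizeof text)
        return -1;
    memcpy(out, text, sizeof text);   /* copies the terminating '\0' too */
    return 0;
}

/* Option 2: heap storage, valid until the caller frees it. */
char *make_heap_copy(void)
{
    char *dyn = malloc(5);
    if (dyn != NULL)
        memcpy(dyn, "abcd", 5);
    return dyn;
}
```

In the first variant the caller declares char buf[5] itself and passes it in; in the second the caller is responsible for calling free on the returned pointer.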
In addition to the definition of 'stat', the main problem in my eyes is that *ppc[3] is not the same as (*ppc)[3]. What you want is the latter (the fourth character from the string pointed to by ppc), but in your memcpy()s you use the former, the first character of the fourth string in the "string array" ppc (obviously ppc is not an array of char*, but you force the compiler to treat it as such).
When debugging such problems, I usually find it helpful to print the memory addresses and contents involved.
Note that the parentheses in the expressions in your assignment statements are in different locations from the parentheses in the memcpy expressions. So it's not too surprising that they do different things.
When dealing with pointers, you have to keep the following two points firmly in the front of your mind:
#1 The pointer itself is separate from the data it points to. The pointer is just a number. The number tells us where, in memory, we can find the beginning of some other chunk of data. A pointer can be used to access the data it points to, but we can also manipulate the value of the pointer itself. When we increase (or decrease) the value of the pointer itself, we are moving the "destination" of the pointer forward (or backward) from the spot it originally pointed to. This brings us to the second point...
#2 Every pointer variable has a type that indicates what kind of data is being pointed to. A char * points to a char; an int * points to an int; and so on. A pointer can even point to another pointer (char **). The type is important, because when the compiler applies arithmetic operations to a pointer value, it automatically accounts for the size of the data type being pointed to. This allows us to deal with arrays using simple pointer arithmetic:
int arr[] = {1,2,3,4};
int *ip = arr; // ip points to the first element
assert( *ip == 1 ); // true
ip += 2; // advances ip by 2 elements, i.e. by
// 2*sizeof(int) bytes (8 bytes if int is 4 bytes)
assert(*ip == 3); // true
This works because the array is just a list of identical elements (in this case ints), laid out sequentially in a single contiguous block of memory. The pointer starts out pointing to the first element in the array. Pointer arithmetic then allows us to advance the pointer through the array, element-by-element. This works for pointers of any type (except arithmetic is not allowed on void *).
In fact, this is exactly how the compiler translates the use of the array indexer operator []. It is literally shorthand for a pointer addition with a dereference operator.
assert( ip[1] == *(ip+1) ); // true (ip[2] would now be past the end of arr)
So, How is all this related to your question?
Here's your setup...
char *pc = "abcd";
char **ppc = &pc;
char **ppc2 = &pc;
For now, I've simplified by removing the call to setStaticAndDynamicPointers. (There's a problem in that function too, so please see @Nim's answer, and my comment there, for additional details about the function.)
char c;
c = (*ppc)[1];
assert(c == 'b'); // assertion doesn't fail.
This works because (*ppc) says "give me whatever ppc points to". That's the equivalent of ppc[0]. It's all perfectly valid.
memcpy(&c,&(*ppc[1]),1);
if(c!='b')
puts("memcpy didn't work."); // this gets printed out.
The problematic part —as others have pointed out— is &(*ppc[1]), which taken literally means "give me a pointer to whatever ppc[1] points to."
First of all, let's simplify... operator precedence says that: &(*ppc[1]) is the same as &*ppc[1]. Then & and * are inverses and cancel each other out. So &(*ppc[1]) simplifies to ppc[1].
Now, given the above discussion, we're now equipped to understand why this doesn't work: In short, we're treating ppc as though it points to an array of pointers, when in fact it only points to a single pointer.
When the compiler encounters ppc[1], it applies the pointer arithmetic described above, reads the memory that immediately follows the variable pc, and treats whatever that memory contains as a char *. (The behavior here is always undefined.)
So the problem isn't with memcpy() at all. Your call to memcpy(&c,&(*ppc[1]),1) is dutifully copying 1 byte (as requested) from the memory that's pointed to by the bogus pointer ppc[1], and writing it into the character variable c.
As others have pointed out, you can fix this by moving your parentheses around:
memcpy(&c,&((*ppc)[1]),1)
I hope the explanation was helpful. Good luck!
Although the previous answers raise valid points, I think the other thing you need to look at is your operator precedence rules when you memcpy:
memcpy(&c, &(*ppc2[3]), 1);
What happens here? It might not be what you're intending. The array subscript takes higher precedence than the dereference operator, so you first perform pointer arithmetic equivalent to ppc2 + 3 and read the pointer stored there. You then dereference that value and pass the address into memcpy. This is not the same as (*ppc2)[3]. The result on my machine is an access violation error (XP/VS2005), but in general this is undefined behaviour. However, if you dereference the same way you did previously:
memcpy(&c, &((*ppc2)[3]), 1);
Then that access violation goes away and I get proper results.
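The difference between the two spellings can be sketched with a genuine array of strings, where both expressions are well-defined. The function names are made up for illustration.

```c
#include <stddef.h>

/* (*pp)[i]: dereference pp first, then index into that string. */
char char_in_first_string(char **pp, size_t i)
{
    return (*pp)[i];
}

/* *pp[i]: index pp first (pp must point into an array of at least i + 1
   pointers), then dereference to get that string's first character. */
char first_char_of_string(char **pp, size_t i)
{
    return *pp[i];
}
```

Given char *words[] = { s0, s1 } with s0 holding "abcd" and s1 holding "wxyz", the first function called with index 1 yields 'b' and the second yields 'w'. With a pointer to a single char *, as in the question, only the first form stays in bounds.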
