So I have this structure:
typedef struct my_structure_s{
uint32_t* label;
uint32_t* data_1;
uint32_t* data_2;
} my_structure_t;
Ultimately, I am working towards allocating a single large block of memory, and moving the pointers to point towards appropriate places in the large block of memory. This block will be big enough so that I can have room for N1 and N2 elements for data_1 and data_2, where N1 and N2 are calculated in the program.
Actually, I am going to have a few of these structures, and I want to eventually point all of them to proper places in this block. I know how to move the pointers to point to the correct places, but I am having trouble understanding where the inner pointers get created in memory.
For instance I define:
my_structure_t* my_struct = malloc( sizeof(my_struct) );
my_structure_t example_struct;
my_struct = &example_struct;
Now I want to see the relative addresses, so I write:
printf("Base Address: %p\n", &my_struct);
printf("First Element: %p\n", &(my_struct->label));
printf("Difference: %x\n\n", (uint32_t)&(my_struct->label) - (uint32_t)&my_struct);
For some reason, the difference is a negative value! Why? Yes, I am using a 32-bit kernel. Shouldn't the first element be next to, but further down, the heap? Also, why is the memory difference not 0x4 but instead -0x8? I think everything should be aligned, seeing how it's a structure of repetitions of the same data type, and that data type has a sizeof() of 4.
Second, what if I don't point the pointer to an instance of a structure? Eg what if I do:
my_structure_t* my_struct_2;
my_structure_t* big_pool = malloc( sizeof(my_structure_t) );
my_struct_2 = big_pool;
printf("Base Address: %p\n", &my_struct_2);
printf("First Element: %p\n", &(my_struct_2->label));
printf("Difference: %x\n", (uint32_t)&(my_struct_2->label) - (uint32_t)&my_struct_2);
Here, the values are totally wacky. Once, I got a difference of 0x49f02aac. Where then is the actual label pointer placed in memory? In this case, how can I make sure the address of the label pointer itself is right after the base address, without pointing the base pointer to a normal structure instance? Can it be assigned somehow?
Ie, if I were to point the label pointer somewhere after creating the structure pointer, where would the label pointer itself be stored? How can I make sure that all of these inner pointers get stored, in the same sequence as in the definition, right after the base pointer?
Yes yes, I could possibly have defined data_1[MAX_N1] or some other workaround like that, then just defined an array of structure pointers, and assigned each one a fixed memory size, but I didn't. There are various reasons. Please stick with the questions I am asking.
Thanks!
First, malloc() does not allocate pointers; it allocates memory for your struct and then returns a pointer to it. It is perfectly possible to have an uninitialized pointer, or a pointer pointing nowhere (NULL).
Second, in C there are three kinds of variables with regard to allocation:
Static variables (globals). They exist all the time the program is running.
Automatic variables (locals). They exist as long as the code block containing them is running.
Dynamic variables (malloc). They exist until they are freed.
With that in mind your examples:
my_structure_t* my_struct = malloc( sizeof(my_struct) );
my_structure_t example_struct;
my_struct = &example_struct;
The first structure is dynamic, the second one is automatic. Then you make the pointer point to the automatic one. Remember, you can have pointers to anything. The dynamic variable is lost (leaked).
printf("Base Address: %p\n", &my_struct);
printf("First Element: %p\n", &(my_struct->label));
That is tricky: my_struct is a pointer, but you are getting its address, so what you get is the address of the automatic variable my_struct of type pointer.
The -8 happens because you are comparing the addresses of two local variables: the local pointer my_struct and the member label of the local structure variable example_struct. Those just happen to be laid out that way in memory.
Try instead:
printf("Base Address: %p\n", (void*)my_struct);
printf("First Element: %p\n", (void*)&(my_struct->label));
(BTW: it is always a good idea to cast to (void*) when printing with printf("%p").)
The other example has a similar error: do print the address of the structure or the value of the pointer, but not the address of the pointer.
my_struct is a pointer to a my_structure_t. There are no guarantees about where the pointer is in relation to the object it points to.
I believe what you are trying to test, based on your comments about the memory location, is
printf("Base Address: %p\n", (void*)my_struct);
printf("First Element: %p\n", (void*)&(my_struct->label));
printf("Difference: %x\n\n", (uint32_t)&(my_struct->label) - (uint32_t)my_struct);
Notice the removal of & before my_struct in 2 places compared to your code.
In answer to the actual question you asked
Also, why is the memory difference not 0x4 but instead -0x8?
Either sizeof(my_structure_t*) is 8, or the compiler decided that example_struct should be 8-byte aligned and added padding when placing the variables on your stack.
int main()
{
int *p;
printf("%p \n", &p);
printf("%p \n", p);
return 0;
}
By executing this code I am receiving the following output:
0x16b77f710
0x104683f4c
I expected to get the same memory address, because &p is not referenced to any other variable.
Why am I getting two different memory addresses?
Thanks,
The pointer is a normal object (in your case, of type int *). It can't point to itself, because a pointer to it would have to have type int **.
A pointer is a variable like any other. It has an address, which is typically the address in memory where that variable sits.
Like any other variable, a pointer variable also has some data in it. For a pointer variable, that data is the address of some other variable, the variable at which the pointer points.
The address of a variable, and the contents of a variable, are two totally different things. They are almost never equal.
Try this program, in which I give your variable p something to point to:
int main()
{
int i = 5;
int *p = &i;
printf("p: %p, p's address: %p\n", p, &p);
printf("i: %d, i's address: %p\n", i, &i);
}
You should notice two things:
As in your first program, "p" and "p's address" will be different.
Whatever value you see for "p", it will be the same as "i's address".
The reason is that a pointer, when declared, does not point to itself by default. Simplified, you can imagine two addresses being involved: the address where the pointer itself is located (&p in your case), and the address stored inside the pointer, which is where it points (p in your case).
Since memory is not cleared when a local variable is created, the cell holding the pointer's value still contains whatever earlier code in the same program left there.
You would have the same phenomenon if you declare a new integer variable and then print its value with printf, you will see that there will already be some number in the new variable that will appear completely random. This is also due to the fact that there is still an obsolete value in the corresponding memory cell.
Let's assume, just for understanding, that there is some memory block called a, and that p points to it.
&p gives the address of the memory block where p itself is stored.
p gives the address of the block a (that is, &a) that p points to.
So of course it will give different memory addresses.
I expected to get the same memory address, because &p is not referenced to any other variable.
Pointer variables do not automatically point to themselves; if you don't explicitly initialize them, then their initial value will be indeterminate (or NULL, depending on how they are declared).
There's nothing magic about pointer variables - like any other scalar, they store some kind of value; it's just that in their case, that value is an address of another object.
If you really want p to store its own address, you'll have to do something like
p = (int *) &p;
The cast is necessary because the type of the expression &p is int **, and you can't assign a pointer value of one type to a variable of a different pointer type. But, pointers to different types are not guaranteed to have the same size, representation, or alignment. On modern commodity hardware like x86 you can probably count on int * and int ** having the same size and representation, just be aware that doesn't have to be the case everywhere.
I've recently been messing around with pointers and I would like to know a bit more about them, namely how they are organized in memory after using malloc for example.
So this is my understanding of it so far.
int **pointer = NULL;
Since we explicitly set the pointer to NULL it now points to the address 0x00.
Now let's say we do
pointer = malloc(4*sizeof(int*));
Now we have pointer pointing to an address in memory - let's say pointer points to the address 0x0010.
Let's say we then run a loop:
for (i = 0; i<4; i++) pointer[i] = malloc(3*sizeof(int));
Now, this is where it starts getting confusing to me. If we dereference pointer, by doing *pointer what do we get? Do we get pointer[0]? And if so, what is pointer[0]?
Continuing, now supposedly pointer[i] contains stored in it an address. And this is where it really starts confusing me and I will use images to better describe what I think is going on.
In the image you see, if it is correct, is pointer[0] referring to the box that has the address 0x0020 in it? What about pointer[1]?
If I were to print the contents of pointer would it show me 0x0010? What about pointer[0]? Would it show me 0x0020?
Thank you for taking the time to read my question and helping me understand the memory layout.
Pointer Refresher
A pointer is just a numeric value that holds the address of a value of type T. This means that T can also be a pointer type, thus creating pointers-to-pointers, pointers-to-pointers-to-pointers, and crazy things like char********** - which is simply a pointer (T*) where T is a pointer to something else (T = E*) where E is a pointer to something else (and so on...).
Something to remember here is that a pointer itself is a value and thus takes space. More specifically, it's (usually) the size of the addressable space the CPU supports.
So for example, the 6502 processor (commonly found in old gaming consoles like the NES and Atari, as well as the Apple II, etc.) could only address 16 bits of memory, and thus its "pointers" were 16-bits in size.
So regardless of the underlying type, a pointer will (usually) be as large as the addressable space.
Keep in mind that a pointer doesn't guarantee that it points to valid memory - it's simply a numeric value that happens to specify a location in memory.
Array Refresher
An array is simply a series of T elements in contiguously addressable memory. The fact it's a "double pointer" (or pointer-to-a-pointer) is innocuous - it is still a regular pointer.
For example, allocating an array of 3 T's will result in a memory block that is 3 * sizeof(T) bytes long.
When you malloc(...) that memory, the pointer returned simply points to the first element.
T *array = malloc(3 * sizeof(T));
printf("%d\n", (&array[0] == &(*array))); // 1 (true)
Keep in mind that the subscript operator (the [...]) is basically just syntactic sugar for:
(*(array + n)) // array[n]; pointer arithmetic already scales by sizeof(*array)
Arrays of Pointers
To sum all of this up, when you do
E **array = malloc(3 * sizeof(E*));
You're doing the same thing as
T *array = malloc(3 * sizeof(T));
where T is really E*.
Two things to remember about malloc(...):
It doesn't initialize the memory with any specific values (use calloc for that)
It's not guaranteed (nor really even common) for the memory to be contiguous or adjacent to the memory returned by a previous call to malloc
Therefore, when you fill the previously created array-of-pointers with subsequent calls to malloc(), they might be in arbitrarily random places in memory.
All you're doing with your first malloc() call is simply creating the block of memory required to store n pointers. That's it.
To answer your questions...
If we dereference pointer, by doing *pointer what do we get? Do we get pointer[0]?
Since pointer is just a int**, and remembering that malloc(...) returns the address of the first byte in the block of memory you allocated, *pointer will indeed evaluate to pointer[0].
And if so, what is pointer[0]?
Again, since pointer has the type int**, pointer[0] will return a value of type int* with the numeric contents of the first sizeof(int*) bytes in the memory block pointed to by pointer.
If I were to print the contents of pointer would it show me 0x0010?
If by "printing the contents" you mean printf("%p\n", (void*) pointer), then yes, assuming malloc() really returned 0x0010 as in your example.
Since you malloc()'d the memory block that pointer points to, pointer itself is just a value with the size of sizeof(int**), and thus will hold the address (as a numeric value) where the block of memory you malloc()'d resides.
So the above printf() call will simply print that value out.
What about pointer[0]?
Again assuming you mean printf("%p\n", (void*) pointer[0]), then you'll get a slightly different output.
Since pointer[0] is the equivalent of *pointer, and thus causes pointer to be dereferenced, you'll get a value of int* and thus the pointer value that is stored in the first element.
You would need to further dereference that pointer to get the numeric value stored in the first integer that you allocated; for example:
printf("%d\n", **pointer);
// or
printf("%d\n", *pointer[0]);
// or even
printf("%d\n", pointer[0][0]); // row 0, element 0: pointer[0] points to
// the first of the 3 ints allocated for
// that row, so it can be indexed like
// an array.
If I dereference pointer, by doing *pointer what do I get? pointer[0]?
Yes.
And if so, what is pointer[0]?
With your definitions: 0x0020.
In the image you see, if it is correct
It seems correct to me.
is pointer[0] referring to the box that has the address 0x0020 in it?
Still yes.
What about pointer[1]?
At this point, I think you can guess that it would show: 0x002c.
To go further
If you want to check how memory is managed and what pointers look like you can use gdb. It allows running a program step by step and performing various operations such as showing the content of variables. Here is the main page for GNU gdb. A quick internet search should let you find numerous gdb tutorials.
You can also show the value of a pointer (the address it holds) in C by using a printf line:
int *plop = NULL;
fprintf(stdout, "%p\n", (void *)plop);
Note: don't forget to include <stdio.h>
#include<stdio.h>
#include<stdlib.h>
#include<malloc.h>
struct node
{
int id;
struct node *next;
};
typedef struct node NODE;
int main()
{
NODE *hi;
printf("\nbefore malloc\n");
printf("\naddress of node is: %p",hi);
printf("\naddress of next is: %p",hi->next);
return 0;
}
The output is:
before malloc
address of node is: 0x7ffd37e99e90
address of next is: 0x7ffd37e9a470
Why are both not the same?
TL;DR
Your code provokes Undefined Behavior, as already mentioned in Morlacke's Answer. Other than that, it seems that you're having trouble understanding how pointers work. See references for tutorials.
First, from your comments
When you say that there's memory allocated for ip in this case:
int i = 10;
int *ip;
ip = &i;
What happens is:
You declare an int variable called i and assign the value 10 to it. Here, the computer allocates memory for this variable on the stack. Say, at address 0x1000. So now, address 0x1000 has content 10.
Then you declare a pointer called ip, of type pointer to int. The computer allocates memory for the pointer. (This is important, see below for explanation). Your pointer is at address, say, 0x2000.
When you assign ip = &i, you're assigning the address of variable i to variable ip. Now the value of variable ip (your pointer) is the address of i. ip doesn't hold the value 10 - i does. Think of this assignment as ip = 0x1000 (don't actually write this code).
To get the value 10 using your pointer you'd have to do *ip - this is called dereferencing the pointer. When you do that, the computer will access the contents of the address held by the pointer, in this case, the computer will access the contents on the address of i, which is 10. Think of it as: get the contents of address 0x1000.
Memory looks like this after that snippet of code:
VALUE : 10 | 0x1000 |
VARIABLE : i | ip |
ADDRESS : 0x1000 | 0x2000 |
Pointers
Pointers are a special type of variable in C. You can think of pointers as typed variables that hold addresses. The space your computer allocates on the stack for pointers depends on your architecture - on 32bit machines, pointers will take 4 bytes; on 64bit machines pointers will take 8 bytes. That's the only memory your computer allocates for your pointers (enough room to store an address).
However, pointers hold memory addresses, so you can make it point to some block of memory... Like memory blocks returned from malloc.
So, with this in mind, let's see your code:
NODE *hi;
printf("\nbefore malloc\n");
printf("\naddress of node is: %p",hi);
printf("\naddress of next is: %p",hi->next);
Declare a pointer to NODE called hi. Let's imagine this variable hi has address 0x1000, and the contents of that address are arbitrary - you didn't initialize it, so it can be anything from zeroes to a ThunderCat.
Then, when you print hi in your printf you're printing the contents of that address 0x1000... But you don't know what's in there... It could be anything.
Then you dereference the hi variable. You tell the computer: access the contents of the ThunderCat and print the value of variable next. Now, I don't know if ThunderCats have variables inside of them, nor if they like to be accessed... so this is Undefined Behavior. And it's bad!
To fix that:
NODE *hi = malloc(sizeof(NODE));
printf("&hi: %p\n", (void *)&hi);
printf(" hi: %p\n", (void *)hi);
Now you have a memory block of the size of your structure to hold some data. However, you still didn't initialize it, so accessing the contents of it is still undefined behavior.
To initialize it, you may do:
hi->id = 10;
hi->next = hi;
And now you may print anything you want. See this:
#include <stdio.h>
#include <stdlib.h>
struct node {
int id;
struct node *next;
};
typedef struct node NODE;
int main(void)
{
NODE *hi = malloc(sizeof(NODE));
if (!hi) return 0;
hi->id = 10;
hi->next = hi;
printf("Address of hi (&hi) : %p\n", (void *)&hi);
printf("Contents of hi : %p\n", (void *)hi);
printf("Address of next(&next): %p\n", (void *)&(hi->next));
printf("Contents of next : %p\n", (void *)hi->next);
printf("Address of id : %p\n", (void *)&(hi->id));
printf("Contents of id : %d\n", hi->id);
free(hi);
return 0;
}
And the output:
$ ./draft
Address of hi (&hi) : 0x7fffc463cb78
Contents of hi : 0x125b010
Address of next(&next): 0x125b018
Contents of next : 0x125b010
Address of id : 0x125b010
Contents of id : 10
The address of variable hi is one thing, and the address it points to is another. There are several things to notice in this output:
hi is on the stack. The block to which it points is on the heap.
The address of id is the same as the memory block (that's because it's the first element of the structure).
The address of next is 8 bytes from id, not 4 (ints are only 4 bytes long here). This is due to memory alignment: the compiler inserts 4 bytes of padding after id so that next, an 8-byte pointer, starts on an 8-byte boundary.
The contents of next is the same block pointed by hi.
The amount of memory "alloced" for the hi pointer itself is 8 bytes, as I'm working on a 64bit. That's all the room it has and needs.
Always free after a malloc. Avoid memory leaks.
Never write code like this for other purposes than learning.
Note: When I say "memory alloced for the pointer" I mean the space the computer separates for it on the stack when the declaration happens after the Stack Frame setup.
References
SO: How Undefined is Undefined Behavior
SO: Do I cast the result of malloc
SO: What and where are the stack and heap?
Pointer Basics
Pointer Arithmetic
C - Memory Management
Memory: Stack vs Heap
Memory Management
The Lost Art of C Structure Packing will tell you about structures, alignment, packing, etc...
You have no malloc here.
hi pointer points to something undefined.
hi->next the same.
As for the question: why should they be the same?
I think you do not understand pointers.
Because next is a pointer, and its value happens to be 0x7ffd37e9a470.
If you print the address of next, &(hi->next), once hi points to a valid NODE, you will see that it differs from hi by the offset of next within the struct (sizeof(int) plus any padding), because int id is the first member.
If I have for example
typedef struct node
{
int numbers[5];
} node;
Whenever I create an instance of such a struct there's gonna be allocation of memory in the stack for the array itself, (in our case 20 bytes for 5 ints(considering ints as 32 bits)), and numbers is gonna be a pointer to the first byte of that buffer. So, I thought that since inside an instance of node, there's gonna be a 20 bytes buffer(for the 5 ints) and a 4 bytes pointer(numbers), sizeof(node) should be 24 bytes. But when I actually print it out it says 20 bytes. Why is this happening? Why is the pointer to the array not taken into account?
I shall be very grateful for any response.
Arrays are not pointers:
int arr[10]:
Amount of memory used is sizeof(int)*10 bytes
The values of arr and &arr are necessarily identical
arr points to a valid memory address, but cannot be set to point to another memory address
int* ptr = malloc(sizeof(int)*10):
Amount of memory used is sizeof(int*) + sizeof(int)*10 bytes
The values of ptr and &ptr are not necessarily identical (in fact, they are mostly different)
ptr can be set to point to both valid and invalid memory addresses, as many times as you like
There is no pointer, just an array. Therefore the struct is of size sizeof( int[5] ) ( plus possible padding ).
The struct node and its member numbers share the same address. If you have a variable of type node or a pointer to that variable, you can access its member.
When you have a variable such as int x; space is set aside for the value. Whenever the identifier x is used, the compiler generates code to access the data in that space in the appropriate manner... there's no need to store a pointer to it to do this (and if there were, wouldn't you need a pointer to that pointer? And a pointer to that? etc.).
When you have an array like int arr[5];, space is set aside the same way, but for 5 ints. When the identifier arr is used, the compiler generates code to access either the relevant array element or give the address of the array (depending on how it's used). The array is not a pointer, and doesn't contain one... but the compiler may use its address instead of its contents in some situations.
An array is said to decay to a pointer to its first element in many situations, but that just means that in those situations the identifier will give its address instead of its contents, much like when you use the address-of operator with a non-array variable. The fact that you can get the address of the int x with &x doesn't mean x contains the address of an int... just that the compiler knows how to figure it out.
Arrays don't work like that. They only allocate space for their elements, but not for a pointer. The "pointer" you are talking about (numbers) is just a placeholder for the address of the array's first element; think of it as a literal, instead of a variable. Therefore, you can not assign a value to it.
int myint;
numbers = &myint;
This won't work, since there is no memory where you could store &myint. numbers will just be converted to an address at compile time.
The size of a structure is determined by the sizes of its members (plus any padding).
So it really doesn't matter whether the members are plain int, char or float, arrays, or even other structures.
If a pointer stores the address of a variable ... then from where do we get the pointer?
What I asked was that if we are using pointer directly, then there must be a location from where we get this pointer?
Yes, a declared pointer has its own location in memory.
In the example above, you have a variable, 'b', which stores the value "17".
int b = 17; /* the value of 'b' is stored at memory location 1462 */
When you create a pointer to that variable, the pointer is stored in its own memory location.
int *a;
a = &b; /* the pointer 'a' is stored at memory location 874 */
It is the compiler's job to know where to "get the pointer." When your source code refers to the pointer 'a', the compiler translates it into -> "whatever address value is stored in memory location 874".
Note: This diagram isn't technically correct since, in 32-bit systems, both pointers and int's use four bytes each.
Yes. Below I have an int and a pointer to an int and code to print out each one's memory address.
int a;
printf("address of a: %p\n", (void *)&a);
int* pA = &a;
printf("address of pA: %p\n", (void *)&pA);
Pointers, on 32bit systems, take up 4 bytes.
In C:
char *p = "Here I am";
p then stores the address where 'H' is stored. p is a variable. You can take a pointer to it:
char **pp = &p;
pp now stores the address of p. If you wanted to get the address of pp that would be &pp etc etc.
Look at this SO post for a better understanding of pointers.
What are the barriers to understanding pointers and what can be done to overcome them?
As far as your question goes, if I understand what you want: an address is basically a numeric index assigned to each unit of memory in the system (typically a byte or a word), and a pointer is a variable that stores one such address. The system then provides an operation to retrieve the value stored in memory at that address.
The compiler deals with translating the variables in our code into memory locations used in machine instructions.
The location of a pointer variable depends on where it is declared in the code, but programmers usually don't have to deal with that directly.
A variable declared inside a function lives on the stack or in a register, (unless it is declared static).
A variable declared at the top level lives in a statically allocated section of memory that exists for the whole run of the program.
A variable declared as part of a dynamically allocated struct or array lives on the heap.
The & operator returns the memory location of the variable, but unlike the * operator, it can't be repeated.
For example, ***i gets the value at the address **i, which is the value at the address *i, which is the value at the address stored in i, which the compiler figures out how to find.
But &&i won't compile. &i is a number, which is the memory location the compiler uses for the variable i. This number is not stored anywhere, so &&i makes no sense.
(Note that if &i is used in the source code, then the compiler can't store i in a register.)