How do arrays work inside a struct? - c

If I have for example
typedef struct node
{
int numbers[5];
} node;
Whenever I create an instance of such a struct there's gonna be allocation of memory in the stack for the array itself, (in our case 20 bytes for 5 ints(considering ints as 32 bits)), and numbers is gonna be a pointer to the first byte of that buffer. So, I thought that since inside an instance of node, there's gonna be a 20 bytes buffer(for the 5 ints) and a 4 bytes pointer(numbers), sizeof(node) should be 24 bytes. But when I actually print it out is says 20 bytes. Why is this happening? Why is the pointer to the array not taken into account?
I shall be very grateful for any response.

Arrays are not pointers:
int arr[10]:
Amount of memory used is sizeof(int)*10 bytes
The values of arr and &arr are necessarily identical
arr points to a valid memory address, but cannot be set to point to another memory address
int* ptr = malloc(sizeof(int)*10):
Amount of memory used is sizeof(int*) + sizeof(int)*10 bytes
The values of ptr and &ptr are not necessarily identical (in fact, they are mostly different)
ptr can be set to point to both valid and invalid memory addresses, as many times as you will

There is no pointer, just an array. Therefore the struct is of size sizeof( int[5] ) ( plus possible padding ).
The struct node and its member numbersshare the address. If you have a variable of type node or a pointer to that variable, you can access its member.

When you have a variable such as int x; space is set aside for the value. Whenever the identifier x is used, the compiler generates code to access the data in that space in the appropriate manner... there's no need to store a pointer to it to do this (and if there were, wouldn't you need a pointer to that pointer? And a pointer to that? etc.).
When you have an array like int arr[5];, space is set aside the same way, but for 5 ints. When the identifier arr is used, the compiler generates code to access either the relevant array element or give the address of the array (depending on how it's used). The array is not a pointer, and doesn't contain one... but the compiler may use its address instead of its contents in some situations.
An array is said to decay to a pointer to its first element in many situations, but that just means that in those situations the identifier will give its address instead of its contents, much like when you use the address-of operator with a non-array variable. The fact that you can get the address of the int x with &x doesn't mean x contains the address of an int... just that the compiler knows how to figure it out.

Arrays don't work like that. They only allocate space for their elements, but not for a pointer. The "pointer" you are talking about (numbers) is just a placeholder for the address of the array's first element; think of it as a literal, instead of a variable. Therefore, you can not assign a value to it.
int myint;
numbers = &myint;
This won't work, since there is no memory where you could store &myint. numbers will just be converted to an address at compile time.

Size of structure is always defined by the size of its members.
So its really doesn't matter whether members are simply int, char, float or arrary or even structure itself.

Related

Changing the array's base-address

Why can't I modify the base address of an array? Is it because the allocated memory would be lost? in that case, I can make an array using a pointer and change what the pointer points to and the allocated memory would be lost too, then what is the difference?
Arrays are objects all on their own, and not pointers. Consider a simpler object:
int a = 0;
Would you expect to be able to change its address? Of course not. An object is a region of storage with a type. The region of storage is identified by its address, so you won't expect to change it. And arrays are objects too. When you declare
int b[8] = {0};
you declare an object, the size of eight integers, that will occupy some storage. You can't change its address any more than you can change the address of any single int.
You have probably been told that arrays are pointers. But they are not! They may be converted, even implicitly, to a pointer more often than not, but they are still object types. Pointers often stand in for arrays because the address of the first element is enough to reach any other element with pointer arithmetic, but the pointer is not the array object itself. The difference becomes apparent when you inspect their object properties. For instance:
sizeof(b) != sizeof(int*)
The object b is not the size of a pointer, indeed it is the size of 8 integers, likely larger than a pointer.

Confusion between "int array[int]" and "int *array"

int array[100];
int *array;
I am confused about the differences between int array[100] and int *array.
Essentially, when I do int array[100] (100 it's just an example of an int), I just reserved space in memory for 100 ints, but I can do int * array and I didn't specify any type of size for this array, but I can still do array[9999] = 30 and that will still make sense.
So what's the difference between these two?
A pointer is a pointer, it points somewhere else (like the first element of an array). The compiler doesn't have any information about where it might point or the size of the data it might point to.
An array is, well, an array of a number of consecutive elements of the same type. The compiler knows its size, since it's always specified (although sometimes the size is only implicitly specified).
An array can be initialized, but not assigned to. Arrays also often decay to pointers to their first element.
Array decay example:
int array[10];
int *pointer = array; // Here the symbol array decays to the expression &array[0]
// Now the variable pointer is pointing to the first element of array
Arrays can't naturally be passed to function. When you declare a function argument like int arr[], the compiler will be translating it as int *arr.
All of this information, and more, should be in any good book, tutorial or class.
A non-technical explanation:
A pointer's contents refer to an address (which may or may not be valid). An array has an address (which must be valid for the array to exist).
You can think of a pointer as being like an envelope - you can put any address you want on it, but if you want it sent to somewhere in particular, that address has to be correct.
An array is like your house - it exists somewhere, so it has an address. Things properly addressed get sent there.
In short:
A pointer holds an address.
An array has an address.
So
int *array;
creates a pointer of indeterminate value (it can point anywhere!).
When you then have
array[9999] = 30;
you're trying to set the 9999th int value from where array points to the value of 30. But you don't know where array points because you didn't give it an actual value.
And that's undefined behavior.
The difference is when you do int array[100], a memory block of 100 * sizeof(int) is allocated on the stack, but when you do int *array, you need to dynamically allocate memory (with malloc function for example) to use the array variable. Dynamically allocated memory is on the heap, not stack.
int array[100] means a variable array which will be able to hold 100 int values this memory will be allocated from the stack. The variablearray will be having the base address of the array and memory will be allocated for the same.
But in the case of int *array since you are declaring this as a local variable, pointer variable array will be having a garbage address. So if you do array[9999] it could cause a segmentation violation since you are trying to access garbage memory location outside your program.
Some points that you can find useful to know:
Via int arr[N] you specify an array of type int which can store N
integers. To get information about how much memory array is taking you can use sizeof operator. Just multiply the number of items in an array by the size of type: N*sizeof(int).
Name of the array points to the first element in an array, e.g. *arr is the same as arr[0], also you may wonder why a[5] == 5[a].
An uninitialized array of non-static storage duration is filled with indeterminate values.
The size of an array may be known at runtime, if you write int arr[] = {1, 2} the size is calculated by a compiler.
Accessing an unexisting element can cause undefined behaivor, which means that anything could happen, and in most cases you'll get garbage values.
Via int *array you specify a pointer array of type int
Unless a value is assigned, a pointer will point to some garbage address by default.
If you don't allocate memory at all or not fully allocate it or access unexisting element but try to use a pointer as an array, you'll get undefined behavior as expected.
After allocating memory (when the pointer is no longer needed) memory should be freed.
int array[100]; defines an array of int.
int *array; defines a pointer to an int. This pointer may point to an int variable or to an element of an array of int, or to nothing at all (NULL), or even to an arbitrary, valid or invalid address in memory, which is the case when it is an uninitialized local variable. It is a tad misleading to call this pointer array, but commonly used when naming a function argument that indeed points to an actual array. The compiler cannot determine the size of the array, if any, from the pointer value.
Here is a topographic metaphor:
Think of an array as a street with buildings. It has GPS coordinates (memory address) a name (but not always) and a fixed number of buildings (at a given time, hard to change). The street name together with the building number specifies a precise building. If you specify a number larger than the last number, it is an invalid address.
A pointer is a very different thing: think of it as a an address label. It is a small piece of paper that can be used to identify a building. If it is blank (a null pointer), it is useless and if you stick it to a letter and send that, the letter will get lost and discarded (undefined behavior, but it is easy to tell that it is invalid). If you write an invalid address on it, the effect is similar, but might cost much more before failing delivery (undefined behavior and difficult to test for).
If a street is razed (if memory was freed), previously written address labels are not modified, but they no longer point the anything useful (undefined behavior if you send the letter, the difficult kind). If a new street is later named with the name on the label, the letter might get delivered, but probably not as intended (undefined behavior again, memory was freed and some other allocated object happens to be at the same memory address).
If you pass a building to a function, you would usually not unearth it and truck it, but merely pass its street address (a pointer to the n-th building of the street, &array[n]). If you don't specify a building and just name the street, it means go to the beginning of the street. Similarly, when passing an array to a function is C, the function receives a pointer to the beginning of the array, we say that arrays decays as pointers.
Without specifying size in int * array, array[9999] = 30 can cause segmentation fault as it may lead to accessing of inaccessible memory
Basically int * array points to a random location. For accessing the 9999th element the array must point to a location having that much sufficient space. But the statement int * array doesn't explicitly creates any space for that.

Size of 2d pointers

I'm trying to improve my knowledge with pointers by making an pointer who points to another pointer that is practically a string.
Now I want to get size who normally I could get fromsizeof(foo[0])/sizeof(foo[0][0])
Pointer form
char** foo;
sizeof(test)/sizeof(*test) doesn't indicate the number of elements anymore with your declaration, because the compiler doesn't know what is the pointer pointing to, because sizeof() is a compile time operation and hence not dynamic.
To find no of elements, you can add a sentinel value:
char **test = {"New York", "Paris", "Cairo", NULL};
int testLen = -1;
while(test[++testLen] != NULL){
//DO NOTHING
}
You will never get the size of a block of memory where a pointer points to... because there can be anything.
test simply points to a place in memory where some other pointers are stored (to the first one). Each pointer will again lead to another place in Memory where some character values are stored. So, your test variable contains a simple number (the index of a place in Memory) and depending on your operating System sizeof(test) will maybe have 4 bytes or 8 bytes as result regardless of the size of the allocated memory.
sizeof() will work as you might have expected when using stack arrays. If test is declared as
char test[10][20];
Then sizeof(test) will in fact return 200.
How I can get it's length (=rows)?
You cannot. Read more in How to get the length of dynamically allocated two dimensional arrays in C
Your attempt:
char** foo;
sizeof(foo[0])/sizeof(foo[0][0])
most probably results in 8, right? That's because you are getting the size of a pointer (which is probably 8 in your system) and then divide by the size of a character, which is always 1.
If you are allocating something large you use malloc() and malloc receives one argument - the size in bytes(e.g malloc(sizeof(int)*20).
malloc also returns a void pointer to the allocated memory. You typically cast this pointer to fit your type.
In other words you can't really get the size. You must store it somewhere and pass it to other functions when its needed.
A pointer to pointer (**) is like adding one additional dimension.
[] these are more of a syntax sugar for pointer arithmetic.
a[i] would be the same as *(a+i).
This may vary on your system but sizof() will give you these values for these types.
int a; //4
int b[5]; //20
int* c; //8
int d[5][5];//100
int** e; //8

Pointer layout in memory in C

I've recently been messing around with pointers and I would like to know a bit more about them, namely how they are organized in memory after using malloc for example.
So this is my understanding of it so far.
int **pointer = NULL;
Since we explicitly set the pointer to NULL it now points to the address 0x00.
Now let's say we do
pointer = malloc(4*sizeof(int*));
Now we have pointer pointing to an address in memory - let's say pointer points to the address 0x0010.
Let's say we then run a loop:
for (i = 0; i<4; i++) pointer[i] = malloc(3*sizeof(int));
Now, this is where it starts getting confusing to me. If we dereference pointer, by doing *pointer what do we get? Do we get pointer[0]? And if so, what is pointer[0]?
Continuing, now supposedly pointer[i] contains stored in it an address. And this is where it really starts confusing me and I will use images to better describe what I think is going on.
In the image you see, if it is correct, is pointer[0] referring to the box that has the address 0x0020 in it? What about pointer[1]?
If I were to print the contents of pointer would it show me 0x0010? What about pointer[0]? Would it show me 0x0020?
Thank you for taking the time to read my question and helping me understand the memory layout.
Pointer Refresher
A pointer is just a numeric value that holds the address of a value of type T. This means that T can also be a pointer type, thus creating pointers-to-pointers, pointers-to-pointers-to-pointers, and crazy things like char********** - which is simply a pointer (T*) where T is a pointer to something else (T = E*) where E is a pointer to something else (and so on...).
Something to remember here is that a pointer itself is a value and thus takes space. More specifically, it's (usually) the size of the addressable space the CPU supports.
So for example, the 6502 processor (commonly found in old gaming consoles like the NES and Atari, as well as the Apple II, etc.) could only address 16 bits of memory, and thus its "pointers" were 16-bits in size.
So regardless of the underlying type, a pointer will (usually) be as large as the addressable space.
Keep in mind that a pointer doesn't guarantee that it points to valid memory - it's simply a numeric value that happens to specify a location in memory.
Array Refresher
An array is simply a series of T elements in contiguously addressable memory. The fact it's a "double pointer" (or pointer-to-a-pointer) is innocuous - it is still a regular pointer.
For example, allocating an array of 3 T's will result in a memory block that is 3 * sizeof(T) bytes long.
When you malloc(...) that memory, the pointer returned simply points to the first element.
T *array = malloc(3 * sizeof(T));
printf("%d\n", (&array[0] == &(*array))); // 1 (true)
Keep in mind that the subscript operator (the [...]) is basically just syntactic sugar for:
(*(array + sizeof(*array) * n)) // array[n]
Arrays of Pointers
To sum all of this up, when you do
E **array = malloc(3 * sizeof(E*));
You're doing the same thing as
T *array = malloc(3 * sizeof(T));
where T is really E*.
Two things to remember about malloc(...):
It doesn't initialize the memory with any specific values (use calloc for that)
It's not guaranteed (nor really even common) for the memory to be contiguous or adjacent to the memory returned by a previous call to malloc
Therefore, when you fill the previously created array-of-pointers with subsequent calls to malloc(), they might be in arbitrarily random places in memory.
All you're doing with your first malloc() call is simply creating the block of memory required to store n pointers. That's it.
To answer your questions...
If we dereference pointer, by doing *pointer what do we get? Do we get pointer[0]?
Since pointer is just a int**, and remembering that malloc(...) returns the address of the first byte in the block of memory you allocated, *pointer will indeed evaluate to pointer[0].
And if so, what is pointer[0]?
Again, since pointer as the type int**, then pointer[0] will return a value type of int* with the numeric contents of the first sizeof(int*) bytes in the memory block pointed to by pointer.
If I were to print the contents of pointer would it show me 0x0010?
If by "printing the contents" you mean printf("%p\n", (void*) pointer), then no.
Since you malloc()'d the memory block that pointer points to, pointer itself is just a value with the size of sizeof(int**), and thus will hold the address (as a numeric value) where the block of memory you malloc()'d resides.
So the above printf() call will simply print that value out.
What about pointer[0]?
Again assuming you mean printf("%p\n", (void*) pointer[0]), then you'll get a slightly different output.
Since pointer[0] is the equivalent of *pointer, and thus causes pointer to be dereferenced, you'll get a value of int* and thus the pointer value that is stored in the first element.
You would need to further dereference that pointer to get the numeric value stored in the first integer that you allocated; for example:
printf("%d\n", **pointer);
// or
printf("%d\n", *pointer[0]);
// or even
printf("%d\n", pointer[0][0]); // though this isn't recommended
// for readability's sake since
// `pointer[0]` isn't an array but
// instead a pointer to a single `int`.
If I dereference pointer, by doing *pointer what do I get? pointer[0]?
Yes.
And if so, what is pointer[0]?
With your definitions: 0x0020.
In the image you see, if it is correct
It seems correct to me.
is pointer[0] referring to the box that has the address 0x0020 in it?
Still yes.
What about pointer[1]?
At this point, I think you can guess that it woud show: 0x002c.
To go further
If you want to check how memory is managed and what pointers look like you can use gdb. It allows running a program step by step and performing various operations such as showing the content of variables. Here is the main page for GNU gdb. A quick internet search should let you find numerous gdb tutorials.
You can also show the address of a pointer in c by using a printf line:
int *plop = NULL;
fprintf(stdout, "%p\n", (void *)pointer);
Note: don't forget to include <stdio.h>

Is an int pointer an array of ints?

I have been following some examples that declare an int pointer
int *myInt;
and then turn that pointer into an array
myInt = (int*)malloc(1024);
this checks out
myInt[0] = 5;
cout << myInt[0]; // prints 5
myInt[1] = 7;
cout << myInt[1]; // prints 7
I thought an int pointer was a pointer to an int and never anything else. I know that pointers to strings just point to the first character of the string but it looks like the same sort of thing is happening here with an array of ints. But then if what we want is an array of ints why not just create an array of ints instead of a pointer to an int?
By the way I am interested in how this works in C not C++. This is in a C++ file but the relevant code is in C.
Is an int pointer an array of ints?
No.
I thought an int pointer was a pointer to an int and never anything else
That's right. Pointers are pointers, arrays are arrays.
What confuses you is that pointers can point to the first element of arrays, and arrays can decay into pointers to their first element. And what's even more confusing: pointers have the same syntax for dereferencing and pointer arithmetic that arrays utilize for indexing. Namely,
ptr[i]
is equivalent with
*(ptr + i)
if ptr is a pointer. Of course, similarly, arr[i] is the ith element of the arr array too. The similarity arises out of the common nature of pointers and arrays: they are both used to access (potentially blocks of) memory indirectly.
The consequence of this strong relation is that in some situations (and with some constraints), arrays and pointers can be used as if they were interchangeable. This still doesn't mean that they are the same, but they exhibit enough common properties so that their usage often appears to be "identical".
There is an alternative syntax for accessing items pointed by a pointer - the square brackets. This syntax lets you access data through pointers as if the pointer were an array (of course, pointers are not arrays). An expression a[i] is simply an alternative form of writing *(a+i)* .
When you allocate dynamic storage and assign it to myInt, you can use the pointer like a dynamic array that can change size at runtime:
myInt = malloc(1024*sizeof(int)); // You do not need a cast in C, only in C++
for (int i = 0 ; i != 1024 ; i++) {
myInt[i] = i; // Use square bracket syntax
}
for (int i = 0 ; i != 1024 ; i++) {
printf("%d ", *(myInt+i)); // Use the equivalent pointer syntax
}
* Incidentally, commutativity of + lets you write 4[array] instead of array[4]; don't do that!
Sort of, and technically no. An int pointer does point to the int. But an array of ints is contiguous in memory, so the next int can be referenced using *(myInt+1). The array notation myInt[1] is equivalent, in that it uses myInt pointer, adds 1 unit to it (the size of an int), and reference that new address.
So in general, this is true:
myInt[i] == *(myint + i)
So you can use an int pointer to access the array. Just be careful to look out for the '\0' character and stop.
An int pointer is not an array of ints. But your bigger question seems to be why both arrays and pointers are needed.
An array represents the actual storage in memory of data. Once that storage is allocated, it makes no significant difference whether you refer to the data stored using array notation or pointer notation.
However, this storage can also be allocated without using array notation, meaning that arrays are not necessarily needed. The main benefit of arrays is convenient allocation of small blocks of memory, i.e., int x[20] and the slightly more convenient notation array[i] rather than *(array+i). Thankfully, this more convenient notation can be used regardless of whether array came from an array declaration or is just a pointer. (Essentially, once an array has been allocated, its variable name from that point onwards is no different than a pointer that has been assigned to point to the location in memory of the first value in the array.)
Note that the compiler will complain if you try to directly allocate too big of a block of memory in an array.
Arrays:
represent the actual memory that is allocated
the variable name of the array is the same as a pointer that references the point in memory where the array begins (and the variable name + 1 is the same as a pointer that references the point in memory where the second element of the array begins (if it exists), etc.)
values in the array can be accessed using array notation like array[i]
Pointers:
are a place to store the location of something in memory
can refer to the memory that is allocated in an array
or can refer to memory that has been allocated by functions like malloc
the value stored in the memory pointed to by the pointer can be accessed by dereferencing the pointer, i.e., *pointer.
since the name of the array is also a pointer, the value of the first element in the array can be accessed by *array, the second element by *(array+1), etc.
an integer can be added or subtracted to a pointer to create a new pointer that points to other values within the same block of memory your program has allocated. For example, array+5 points to the place in memory where the value array[5] is stored.
a pointer can be incremented or decremented to point to other values with the same block of memory.
In many situations one notation will be more convenient than the other, so it is extremely beneficial that both notations are available and so easily interchanged with each other.
They are not the same. Here is the visible difference.
int array[10];
int *pointer;
printf ("Size of array = %d\nSize of pointer = %d\n",
sizeof (array), sizeof (pointer));
The result is,
Size of array = 40
Size of pointer = 4
If You do "array + 1", the resulting address will be address of array[0] + 40. If You do "pointer + 1", resulting address will be address of pointer[0] + 4.
Array declaration results in compile time memory allocation. Pointer declaration does not result in compile time memory allocation and dynamic allocation is needed using calloc() or malloc()
When you do following assignment, it is actually implicit type cast of integer array to integer pointer.
pointer = array;

Resources