Why can't I modify the base address of an array? Is it because the allocated memory would be lost? If so, I can also make an array using a pointer and change what the pointer points to, and the allocated memory would be lost too. So what is the difference?
Arrays are objects all on their own, and not pointers. Consider a simpler object:
int a = 0;
Would you expect to be able to change its address? Of course not. An object is a region of storage with a type. The region of storage is identified by its address, so you won't expect to change it. And arrays are objects too. When you declare
int b[8] = {0};
you declare an object, the size of eight integers, that will occupy some storage. You can't change its address any more than you can change the address of any single int.
You have probably been told that arrays are pointers. But they are not! They may be converted, even implicitly, to a pointer more often than not, but they are still object types. Pointers often stand in for arrays because the address of the first element is enough to reach any other element with pointer arithmetic, but the pointer is not the array object itself. The difference becomes apparent when you inspect their object properties. For instance:
sizeof(b) != sizeof(int*)
The object b is not the size of a pointer; it is the size of 8 integers, which is likely larger than a pointer.
Related
If malloc() returns a pointer to a single block of memory, how can it be used to store multiple values contiguously and allow access to each one using the subscript operator, acting as a pointer to an array?
If I were to try and change the "second element" of an integer by subscripting its address, it would cause undefined behaviour. As malloc() returns the pointer to a single block of memory, shouldn't the pointer it returns refer to the entire block, and thus subscripting it should access the garbage value next to it in memory?
Furthermore, the allocated memory can also be used to store a single value, but only up to the size of the type the pointer is cast to, not to that of the allocated block of memory.
Is all this something to do with the type the pointer is cast to after being returned? Could someone point me in the right direction?
I think your misunderstanding is here:
As malloc() returns the pointer to a single block of memory, shouldn't the pointer it returns refer to the entire block, and thus subscripting it should access the garbage value next to it in memory?
Indeed if you do p = malloc(n) and p has type "pointer to some type of size n", then p[1] is an out-of-bounds array access. However, normally when you do p = malloc(n) to allocate an array, the type of p is not a pointer to the array (of size n), but a pointer to the first element of the array. That is, instead of
char (*p)[500] = malloc(500);
you do:
char *p = malloc(500);
and in this case p[1] is perfectly valid. Note that with the first, unusual, form, you could still do (*p)[1] or p[0][1] and have it be valid.
But be careful: each call to malloc returns a separate block, and the blocks may lie anywhere on the heap. Pointer arithmetic is only valid within a single block, so you can't step from one allocated array into another.
int array[100];
int *array;
I am confused about the differences between int array[100] and int *array.
Essentially, when I do int array[100] (100 is just an example size), I reserve space in memory for 100 ints. But I can also do int *array without specifying any size for this array, and I can still do array[9999] = 30 and that will still make sense.
So what's the difference between these two?
A pointer is a pointer, it points somewhere else (like the first element of an array). The compiler doesn't have any information about where it might point or the size of the data it might point to.
An array is, well, an array of a number of consecutive elements of the same type. The compiler knows its size, since it's always specified (although sometimes the size is only implicitly specified).
An array can be initialized, but not assigned to. Arrays also often decay to pointers to their first element.
Array decay example:
int array[10];
int *pointer = array; // Here the symbol array decays to the expression &array[0]
// Now the variable pointer is pointing to the first element of array
Arrays can't be passed to a function by value. When you declare a function parameter like int arr[], the compiler treats it as int *arr.
All of this information, and more, should be in any good book, tutorial or class.
A non-technical explanation:
A pointer's contents refer to an address (which may or may not be valid). An array has an address (which must be valid for the array to exist).
You can think of a pointer as being like an envelope - you can put any address you want on it, but if you want it sent to somewhere in particular, that address has to be correct.
An array is like your house - it exists somewhere, so it has an address. Things properly addressed get sent there.
In short:
A pointer holds an address.
An array has an address.
So
int *array;
creates a pointer of indeterminate value (it can point anywhere!).
When you then have
array[9999] = 30;
you're trying to set the int at index 9999, counted from wherever array points, to the value 30. But you don't know where array points because you didn't give it an actual value.
And that's undefined behavior.
The difference is that when you write int array[100] inside a function, a memory block of 100 * sizeof(int) bytes is reserved automatically (typically on the stack), but when you write int *array, you have to make it point to storage yourself, for example by dynamically allocating memory with malloc, before you can use the array variable. Dynamically allocated memory typically lives on the heap, not the stack.
int array[100] declares a variable array that can hold 100 int values; for a local variable, this memory typically comes from the stack. In most expressions the name array evaluates to the base address of the array, and memory has already been reserved for it.
But in the case of int *array, since you are declaring it as a local variable, the pointer variable array holds a garbage address. So array[9999] could cause a segmentation violation, since you are trying to access a garbage memory location that may lie outside your program.
Some points that you can find useful to know:
Via int arr[N] you specify an array of type int which can store N integers. To find out how much memory the array takes, use the sizeof operator: sizeof(arr) equals the number of items multiplied by the size of the element type, N*sizeof(int).
The name of the array converts to a pointer to its first element in most expressions, e.g. *arr is the same as arr[0]. This is also why a[5] == 5[a]: both mean *(a + 5).
An uninitialized array of non-static storage duration is filled with indeterminate values.
The size of an array may be left implicit: if you write int arr[] = {1, 2}, the size is calculated by the compiler.
Accessing a nonexistent element causes undefined behavior, which means that anything could happen; in many cases you'll get garbage values.
Via int *array you specify a pointer, named array, to int.
Unless a value is assigned, a local pointer holds some garbage address.
If you use a pointer as an array without allocating memory, without allocating enough of it, or while accessing a nonexistent element, you get undefined behavior as well.
Memory that has been allocated should be freed once it is no longer needed.
int array[100]; defines an array of int.
int *array; defines a pointer to an int. This pointer may point to an int variable or to an element of an array of int, or to nothing at all (NULL), or even to an arbitrary, valid or invalid address in memory, which is the case when it is an uninitialized local variable. It is a tad misleading to call this pointer array, but commonly used when naming a function argument that indeed points to an actual array. The compiler cannot determine the size of the array, if any, from the pointer value.
Here is a topographic metaphor:
Think of an array as a street with buildings. It has GPS coordinates (memory address), a name (but not always), and a fixed number of buildings (at a given time, hard to change). The street name together with the building number specifies a precise building. If you specify a number larger than the last number, it is an invalid address.
A pointer is a very different thing: think of it as an address label. It is a small piece of paper that can be used to identify a building. If it is blank (a null pointer), it is useless, and if you stick it to a letter and send that, the letter will get lost and discarded (undefined behavior, but it is easy to tell that it is invalid). If you write an invalid address on it, the effect is similar, but might cost much more before failing delivery (undefined behavior and difficult to test for).
If a street is razed (if memory was freed), previously written address labels are not modified, but they no longer point to anything useful (undefined behavior if you send the letter, the difficult kind). If a new street is later named with the name on the label, the letter might get delivered, but probably not as intended (undefined behavior again, memory was freed and some other allocated object happens to be at the same memory address).
If you pass a building to a function, you would usually not unearth it and truck it, but merely pass its street address (a pointer to the n-th building of the street, &array[n]). If you don't specify a building and just name the street, it means go to the beginning of the street. Similarly, when passing an array to a function in C, the function receives a pointer to the beginning of the array; we say that arrays decay to pointers.
Because int *array does not reserve any storage for elements, array[9999] = 30 can cause a segmentation fault, as it may access inaccessible memory.
Basically, int *array points to an arbitrary location until it is assigned. To access the element at index 9999, array must point to a location with enough space for it. But the declaration int *array doesn't create any such space.
If I have for example
typedef struct node
{
int numbers[5];
} node;
Whenever I create an instance of such a struct there's gonna be allocation of memory in the stack for the array itself (in our case 20 bytes for 5 ints, considering ints as 32 bits), and numbers is gonna be a pointer to the first byte of that buffer. So, I thought that since inside an instance of node there's gonna be a 20-byte buffer (for the 5 ints) and a 4-byte pointer (numbers), sizeof(node) should be 24 bytes. But when I actually print it out it says 20 bytes. Why is this happening? Why is the pointer to the array not taken into account?
I shall be very grateful for any response.
Arrays are not pointers:
int arr[10]:
Amount of memory used is sizeof(int)*10 bytes
The values of arr and &arr are necessarily the same address, although their types differ (arr converts to int*, while &arr has type int(*)[10])
arr refers to a valid memory address, but cannot be made to refer to another memory address
int* ptr = malloc(sizeof(int)*10):
Amount of memory used is sizeof(int*) + sizeof(int)*10 bytes
The values of ptr and &ptr are not necessarily identical (in fact, they are mostly different)
ptr can be set to point to both valid and invalid memory addresses, as many times as you will
There is no pointer, just an array. Therefore the struct is of size sizeof( int[5] ) ( plus possible padding ).
The struct node and its member numbers share the same address. If you have a variable of type node or a pointer to that variable, you can access its member.
When you have a variable such as int x; space is set aside for the value. Whenever the identifier x is used, the compiler generates code to access the data in that space in the appropriate manner... there's no need to store a pointer to it to do this (and if there were, wouldn't you need a pointer to that pointer? And a pointer to that? etc.).
When you have an array like int arr[5];, space is set aside the same way, but for 5 ints. When the identifier arr is used, the compiler generates code to access either the relevant array element or give the address of the array (depending on how it's used). The array is not a pointer, and doesn't contain one... but the compiler may use its address instead of its contents in some situations.
An array is said to decay to a pointer to its first element in many situations, but that just means that in those situations the identifier will give its address instead of its contents, much like when you use the address-of operator with a non-array variable. The fact that you can get the address of the int x with &x doesn't mean x contains the address of an int... just that the compiler knows how to figure it out.
Arrays don't work like that. They only allocate space for their elements, but not for a pointer. The "pointer" you are talking about (numbers) is just a placeholder for the address of the array's first element; think of it as a literal, instead of a variable. Therefore, you can not assign a value to it.
int myint;
numbers = &myint;
This won't work, since there is no memory where you could store &myint. numbers is simply converted to an address by the compiler.
The size of a structure is always determined by the sizes of its members (plus any padding).
So it really doesn't matter whether the members are simply int, char, float, an array, or even another structure.
I have been following some examples that declare an int pointer
int *myInt;
and then turn that pointer into an array
myInt = (int*)malloc(1024);
this checks out
myInt[0] = 5;
cout << myInt[0]; // prints 5
myInt[1] = 7;
cout << myInt[1]; // prints 7
I thought an int pointer was a pointer to an int and never anything else. I know that pointers to strings just point to the first character of the string but it looks like the same sort of thing is happening here with an array of ints. But then if what we want is an array of ints why not just create an array of ints instead of a pointer to an int?
By the way I am interested in how this works in C not C++. This is in a C++ file but the relevant code is in C.
Is an int pointer an array of ints?
No.
I thought an int pointer was a pointer to an int and never anything else
That's right. Pointers are pointers, arrays are arrays.
What confuses you is that pointers can point to the first element of arrays, and arrays can decay into pointers to their first element. And what's even more confusing: pointers have the same syntax for dereferencing and pointer arithmetic that arrays utilize for indexing. Namely,
ptr[i]
is equivalent with
*(ptr + i)
if ptr is a pointer. Of course, similarly, arr[i] is the ith element of the arr array too. The similarity arises out of the common nature of pointers and arrays: they are both used to access (potentially blocks of) memory indirectly.
The consequence of this strong relation is that in some situations (and with some constraints), arrays and pointers can be used as if they were interchangeable. This still doesn't mean that they are the same, but they exhibit enough common properties so that their usage often appears to be "identical".
There is an alternative syntax for accessing items pointed by a pointer - the square brackets. This syntax lets you access data through pointers as if the pointer were an array (of course, pointers are not arrays). An expression a[i] is simply an alternative form of writing *(a+i)* .
When you allocate dynamic storage and assign it to myInt, you can use the pointer like a dynamic array that can change size at runtime:
myInt = malloc(1024*sizeof(int)); // You do not need a cast in C, only in C++
for (int i = 0 ; i != 1024 ; i++) {
myInt[i] = i; // Use square bracket syntax
}
for (int i = 0 ; i != 1024 ; i++) {
printf("%d ", *(myInt+i)); // Use the equivalent pointer syntax
}
* Incidentally, commutativity of + lets you write 4[array] instead of array[4]; don't do that!
Sort of, and technically no. An int pointer does point to an int. But an array of ints is contiguous in memory, so the next int can be referenced using *(myInt+1). The array notation myInt[1] is equivalent, in that it takes the myInt pointer, adds 1 unit to it (the size of an int), and dereferences that new address.
So in general, this is true:
myInt[i] == *(myInt + i)
So you can use an int pointer to access the array. Just be careful not to run past the end: unlike a C string, an int array has no '\0' terminator, so you have to keep track of its length yourself.
An int pointer is not an array of ints. But your bigger question seems to be why both arrays and pointers are needed.
An array represents the actual storage in memory of data. Once that storage is allocated, it makes no significant difference whether you refer to the data stored using array notation or pointer notation.
However, this storage can also be allocated without using array notation, meaning that arrays are not necessarily needed. The main benefit of arrays is convenient allocation of small blocks of memory, i.e., int x[20], and the slightly more convenient notation array[i] rather than *(array+i). Thankfully, this more convenient notation can be used regardless of whether array came from an array declaration or is just a pointer. (Essentially, once an array has been allocated, its variable name behaves in most expressions like a pointer to the first value in the array; the exceptions are operators such as sizeof and &, which see the whole array.)
Note that the compiler may complain, or the program may crash at run time, if you try to directly allocate too big a block of memory in an array.
Arrays:
represent the actual memory that is allocated
in most expressions, the variable name of the array converts to a pointer that references the point in memory where the array begins (and the variable name + 1 gives a pointer that references the point in memory where the second element of the array begins (if it exists), etc.)
values in the array can be accessed using array notation like array[i]
Pointers:
are a place to store the location of something in memory
can refer to the memory that is allocated in an array
or can refer to memory that has been allocated by functions like malloc
the value stored in the memory pointed to by the pointer can be accessed by dereferencing the pointer, i.e., *pointer.
since the name of the array also converts to a pointer, the value of the first element in the array can be accessed by *array, the second element by *(array+1), etc.
an integer can be added or subtracted to a pointer to create a new pointer that points to other values within the same block of memory your program has allocated. For example, array+5 points to the place in memory where the value array[5] is stored.
a pointer can be incremented or decremented to point to other values with the same block of memory.
In many situations one notation will be more convenient than the other, so it is extremely beneficial that both notations are available and so easily interchanged with each other.
They are not the same. Here is the visible difference.
int array[10];
int *pointer;
printf ("Size of array = %zu\nSize of pointer = %zu\n",
sizeof (array), sizeof (pointer)); /* sizeof yields a size_t, so use %zu */
The result (on a platform with 4-byte int and 4-byte pointers) is,
Size of array = 40
Size of pointer = 4
If you do "&array + 1", the resulting address will be the address of array[0] + 40 bytes, because &array points to the whole array. If you do "array + 1" or "pointer + 1", the resulting address will be the address of array[0] or pointer[0] + 4 bytes, because both expressions have type int*.
An array definition reserves storage for its elements. A pointer definition reserves storage only for the pointer itself, so the pointer must be made to point at existing storage, for example an array or memory obtained dynamically with calloc() or malloc().
When you do the following assignment, the integer array is implicitly converted to a pointer to its first element.
pointer = array;
I am trying to construct an m-way tree and I am having trouble visualizing an array of pointers pointing to different instances of the B_tree node class (this basically creates the array type nodes and includes all functions associated with the tree such as count, insert etc)
Are there any tips/tricks to visualizing an array of pointers for this case? Are there any good links/resources for explanation of array of pointers? (I did not find the common search results on google that helpful)...
Here is a picture of an array of pointers, they aren't pointing to anything, but this is a visualization of an array of pointers. Here is a link explaining arrays of pointers http://ee.hawaii.edu/~tep/EE160/Book/chap9/section2.1.4.html. Enjoy.
An array of pointers is just like a usual array with a fixed maximum size. The difference is that each position of the array does not hold an integer, a float, a char or a struct. It holds a pointer.
What is a pointer?
Imagine the computer's memory as a huge array which holds different kinds of values. A variable that is a pointer actually holds the address of a block of memory. It does not hold the value of an integer: if you have int *a;, it means that the variable a, which is itself stored at some memory address, holds the address of something which is an integer.
A pointer has a fixed size that depends on the platform (commonly 4 or 8 bytes), whatever it points to. An array of pointers therefore holds, at each position, the memory address of something. If you have an array of 10 pointers to int, each position holds the address of a memory block that stores an integer. So the array holds 10 pointers, and each one of them points to an integer.