pointer to pointer malloc and manipulation - c

#include <stdio.h>
#include <stdlib.h>
int main(int argc, char * argv[])
{
char *p[2];
char **pp = calloc(2, 4);
p[0] = "ab";
p[1] = "cd";
// p[2] = "ef";
pp[0] = "ab";
pp[1] = "cd";
pp[2] = "ef";
printf("pp: %s, %s, %s\n", pp[0], pp[1], pp[2]);
printf("size_p: %zu\nsize_pp: %zu\n", sizeof p, sizeof pp);
}
if 'p[2]' is defined and assigned a value - the resulting behavior is a segfault. if 'pp[2]' is assigned - the output is the following: "ab, cd, ef". 'sizeof' returns 8 (2x4 bytes per pointer) for 'p' and only 4 bytes for 'pp'. why am i being able to assign 'pp[2]', even though it should only be in possession of 8 bytes of allocated memory (that should be able to store only 2 pointer addresses)? also, how does 'sizeof' determine the actual memory size in both of the cases?

p is declared to have two elements, so p[2] does not exist - hence the segfault. Since p is a local array (of pointers), sizeof(p) gives the size of the element type (char *, whose size is 4 here) multiplied by the number of elements (2). pp, on the other hand, is a pointer (to a pointer), not an array, so sizeof(pp) is simply the size of a char **, which is the same as the size of any other pointer on a 32-bit machine, namely 4. That the assignment to pp[2] seems to succeed is pure chance - you're writing outside the allocated memory (which only contains space for two char * elements).
By the way, pointers to string literals should be const char *.

With char *p[2]; you allocate 2 char* on the stack. When accessing p[2], you have a buffer overflow and might access any other things belonging to the stack frame of the current function (some compilers check this in debug mode).
With calloc, you allocate memory on the heap. The memory at pp[2] is (probably) free, so you get no segfault here. But this memory could also be used by other objects, so this is absolutely not ok!
For size calculation: sizeof(char**) is 4, as is for every 32-bit pointer. sizeof(char*[2]) is 8, because it is 2x4 bytes.

As Aasmund Eldhuset said, p[2] does not exist. p[0] and p[1] are the two elements of the array.
The reason it segfaults for p[2] and not pp[2] is, as I understand it, because p is stored on the stack and pp is stored on the heap. So although you don't own the memory at pp[2], it doesn't seg fault. Instead, it just overwrites god-knows-what and will probably cause your program to misbehave in unpredictable ways.
In general, overstepping the bounds of dynamically allocated memory (such as pp in your example) will not always segfault, and overstepping a stack array (e.g. p) is not guaranteed to segfault either: both are undefined behavior, so a crash is only one of the possible outcomes.

Related

How does malloc know how much memory space is treated for an index?

When we use malloc, it returns a pointer to the beginning of a fixed sized memory address that was passed to malloc. For example, malloc(40) will throw me some uninitialized piece of memory that is 40 bytes long. The thing is, I have seen examples of code where people index into this piece of memory. My question is, how does malloc define the size of an index?
For example, take this piece of code,
#include <stdio.h>
#include <stdlib.h>
int main()
{
char **array;
array = malloc(3 * sizeof(char *));
for (int i=0; i < 3; i++) {
array[i] = malloc(10);
}
for (int i=0; i<3; i++) {
free(array[i]);
}
free(array);
return 0;
}
I would first like to explain what I believe is happening and would hope someone could correct me about any incorrect ideas that I have.
char **array creates the variable "array", where it will become a pointer, to a character pointer. This means if we dereference this value, it will give us a memory address location of where the char is stored.
array = malloc(3 * sizeof(char *)) . Let's assume here that sizeof(char *) will always return 8. Carrying on, this will create an uninitialized piece of memory that is 24 bytes long. The key point here is that it is 24 bytes long, so how does it know the size of each index?
array[i] = malloc(10) is the part of my confusion here. We have an uninitialized piece of memory that is 24 bytes long, how do we index into it?
I have an idea which I would like to draw out and hope someone could correct any misunderstanding I have.
0x02 0x0A 0x12
[ 0x90 | 0x91 | 0x92 ]
<-- sizeof(char*) -> <-- sizeof(char*) -> <-- sizeof(char*) ->
^
|
|
0x01 (memory address of variable array) (array - points to 0x02)
-- Random memory locations
0x90 -- | Starting from the memory address location 0x90, the next sizeof(char) bytes will representing the value in this memory address location.
['c']
0x91
['a']
0x92
['t']
From my understanding malloc will know the indexable size from the cast we have done on our initial pointer, i.e. the char* inside of char** array. This means that the pointer that was returned from malloc will point to, in this example, a memory address located at 0x02 (the beginning of the array).
Each time we perform the action array[i] we are actually doing 0x02 + sizeof(char*) * i which will push the pointer to the beginning of a new location. This means for example when we do array[1] we are actually doing 0x02 + sizeof(char*) * 1 which would push us to 0x02 + 8 (0x0A). This means that from the memory address location 0x0A the next sizeof(char *) bytes will be read as the index stored in this place in memory. In this example it would be a char *, in my example I have written 0x90, meaning some other place in memory 0x90 the next sizeof(char), i.e. 1 byte will have the actual value. The actual value representing 'c' (for example), but this could be located somewhere else in memory, not related to malloc.
Using this formula we can have, for example, an integer array returned from malloc, by having int* ten_int_array = malloc(10 * sizeof(int)). Now the formula would be adjusted to ten_int_array + sizeof(int) * i. So the index size would not be fixed by malloc; it would follow from the pointer type.
Thank you for any replies, I am trying to verify my assumptions here.
Here is what is happening. Suppose that you are running it on a 64-bit system, where a memory address needs 8 bytes (64 bits).
char *str; declares a pointer variable (denoted by *) which can point to a place in memory. By the above convention it is 8 bytes large. The compiler knows that the object it points to is a char. It has absolutely no idea whether other chars follow or precede this location; only the programmer does.
So, str = malloc(10); allocates enough memory to hold 10 characters, including the terminating '\0'. The address of that memory is assigned to the pointer variable str.
char **array; declares a pointer * to a pointer *. This is yet another 8-byte variable which, as the compiler knows, points to another pointer which itself points to a char. As before, it has no idea whether there are more pointers adjacent to the one it points to.
array = malloc(3 * sizeof(char*)); allocates enough space in memory to keep exactly 3 pointers to char. The result of allocation will be assigned to array.
In C, operator [] applied to a pointer works like the one applied to an array variable. So array[1] returns pointer #2 from the memory allocated above.
array[i] = malloc(10); allocates memory for a string of 10 characters and assigns the result to the pointer at array[i]. free(array[i]) frees this memory.
As a result, you have a two-level dynamic structure.
|-malloc(3 * sizeof (char*)) == 24 bytes
|
V
array --> [0] --> malloc(10) == 10 bytes
[1] --> malloc(10)
[2] --> malloc(10)
So, when you free, you need to free(array[i]) for i in 0..2 first and then free(array), because after freeing array the memory it points to becomes invalid and you cannot use it.

Could a void pointer store a larger array than its dynamically allocated size?

I am still in the learning phase of C and wrote the following function. I was not expecting it to work because the void pointer is only given 2 bytes, which is not enough for my 23 byte char array. Yet it stored the char array, and I could typecast it to another pointer variable.
void main(){
void *p = malloc(2 * sizeof(char));
p = "Unites State of America";
printf("%p length -> %ld, sizeof -> %ld\n", p, strlen(p), sizeof(p)/sizeof(p[0]));
char *pstr = (char*) p;
printf("%s length -> %ld\n", pstr, strlen(pstr));
}
Result:
0x55600dccd008 length -> 23, sizeof -> 8
Unites State of America length -> 23
How did my void pointer exceed the size I initially requested?
You allocated 2 chars worth of memory, so now there's a small chunk of memory in the heap waiting for data to be stored there. However, you then reassign p to point to the string "Unites State of America", which is stored elsewhere. p = "string" does not move a string into the memory pointed to by p, it makes p point to the string.
Could a void pointer store a larger array than its dynamically allocated size?
Pointers of any pointer type store addresses. Pointers themselves have a size determined by their (pointer) type, like any other object, and it's very common for all pointers provided by a given implementation to have the same size as each other. If you allocate memory, then the size is a characteristic of the allocated object, not of any pointer to it.
I was not expecting it to work because the void pointer is only given 2 bytes which is not enough for my 23 byte char array.
Indeed the two bytes you allocated are not enough to accommodate the 24 bytes of your string literal (don't forget to count the string terminator), but
That's irrelevant, because you don't attempt to use the allocated space. Assigning to a pointer changes the value of the pointer itself, not of whatever data, if any, it points to.
Even if you were modifying the pointed-to data, via strcpy(), for example, C does not guarantee that it would fail. Instead, such an attempt would produce undefined behavior, which could manifest in any way at all that is within the power of the program and C implementation. Sometimes that even takes the appearance of what the programmer wanted or supposed.
How did my void pointer exceed the size I initially requested?
It did not. You allocated two bytes, and recorded a pointer to their location in p. Then, you assigned the address of the first character of your string literal to p, replacing its previous value. The contents of the string literal are not copied. The program no longer having a pointer to the allocated two bytes, it has no way to access them or free them, but deallocating them requires calling free, so they remain allocated until the program terminates. This is called a "memory leak".
I should furthermore point out that there is a special consideration here involving string literals. These represent arrays, and in most contexts where an array-valued expression appears in C source code, the array is converted automatically to a pointer to its first element. A popular term for that is that arrays decay to pointers. This is why you end up assigning p to point to the first character of the array. The same would not apply if you were assigning, say, an int or double value to p, and your compiler indeed ought at least to warn in such cases.
Your understanding is not quite correct here. When you do the below two lines, you are actually leaking memory by not using the actually allocated dynamic memory.
void *p = malloc(2 * sizeof(char));
p = "Unites State of America";
Your pointer p initially points to a region on the heap that can store 2 * sizeof(char) bytes, but you then overwrite the pointer itself with the address of a statically allocated string. strlen() then operates on this statically allocated string "Unites State of America", while sizeof(p) only reports the size of the pointer.
You need to use functions like strncpy() or equivalent to copy the string characters to the dynamically allocated location. But you have not allocated enough bytes to hold the whole string, only 2 * sizeof(char) bytes.
Your other pointer assignment is not incorrect: you have just introduced another pointer to the location of the constant string that p refers to.
char *pstr = (char*) p;
So to summarize: even if you had used the right string copy functions, copying beyond the allocated size, i.e. copying 24 bytes (23 characters plus the terminator) to a 2 byte allocated region, is a memory access violation and could lead to undesirable results.

In C, how does the specific type of pointer treat the memory space which point to?

Does a non-void pointer in C only care about the memory space from its address up to the size of its type, or ...?
Example:
typedef struct {...} A;
// the allocated memory space is much larger than sizeof(A)
A* temp = (A*) malloc(sizeof(A) + 256 * 256);
char* charPointer = (char*) temp;
charPointer += sizeof(A);
temp = (A*) charPointer;
In the last line, does temp still point to a new "A variable"? (It seems as if an array of A was allocated.)
Update:
Does the cast in the declaration and initialisation of temp turn the memory space into an array of A? Or does the memory space have no "type", so that temp uses the first sizeof(A) bytes to store an A variable and the rest of the memory space does nothing?
First of all, a void* is implicitly convertible to any other object pointer type (hint: casting the value returned by malloc is superfluous).
Then, memory by itself means nothing; it is how you interpret its contents that gives it a meaning.
So you are basically allocating sizeof(A) + SOME_LENGTH bytes of memory, then you tell the compiler that you want to treat a specific address starting from the allocated memory as an A*.
Nothing prevents you from doing it, and it will work as long as the memory reserved starting from the address is >= sizeof(A).
The only problem is how you release the memory. The derived address charPointer + sizeof(A) is not an address that was returned by malloc. This means that the following code yields undefined behavior:
void* temp = malloc(sizeof(A) + sizeof(A));
A* ptr = (A*)((char*)temp + sizeof(A)); /* arithmetic on void* is not standard C */
free(ptr); /* undefined: ptr was not returned by malloc */
Yes, it points to an "A struct". Allocated memory is only a sequence of bytes; the user can access it through any pointer type.
In such usages the user should understand dynamic memory allocation and the concept of pointers in C, to avoid segmentation faults.
In C, malloc returns the address of the first byte of a new memory allocation.
You then store it in a pointer of some type (a cast is allowed but not required in C).
Depending on your pointer type, incrementing this address jumps the correct number of bytes.
For example:
main.c
int* myPointer = NULL;
myPointer = (int*) malloc(sizeof(int) * 10);
//myPointer is the address of the first int
myPointer++;
//myPointer is now the address of the second int.
The compiler knows how many bytes to jump after the pointer increment, because it knows the type of your pointer (int*).
If an int is 4 bytes, incrementing an int* moves it 4 addresses further in memory.
So yes, it points to the A struct, but if you cast to the wrong type, the pointer will not advance by the right amount (4 in this example), and dereferencing it may cause a segmentation fault.
Hope it helped.

Incrementing pointer to array

I came across this program on HSW:
int *p;
int i;
p = (int *)malloc(sizeof(int[10]));
for (i=0; i<10; i++)
*(p+i) = 0;
free(p);
I don't understand the loop fully.
Assuming the memory is byte addressable, and each integer takes up 4 bytes of memory, and say we allocate 40 bytes of memory to the pointer p from address 0 to 39.
Now, from what I understand, the pointer p initially contains value 0, i.e. the address of first memory location. In the loop, a displacement is added to the pointer to access the subsequent integers.
I cannot understand how the memory addresses up to 39 are accessed with a displacement value of only 0 to 9. I checked and found that the pointer is incremented in multiples of 4. How does this happen? I'm guessing it's because of the integer type pointer, and each pointer is supposedly incremented by the size of its datatype. Is this true?
But what if I actually want to point to memory location 2 using an integer pointer. So, I do this: p = 2. Then, when I try to de-reference this pointer, should I expect a segmentation fault?
Now, from what I understand, the pointer p initially contains value 0
No, the pointer p would not hold the value 0 in case malloc returns successfully.
At the point of declaring it, the pointer is uninitialized and most probably holds a garbage value. Once you assign it to the pointer returned by malloc, the pointer points to a region of dynamically allocated memory that the allocator sees as unoccupied.
I cannot understand how the memory addresses up to 39 are accessed
with a displacement value of only 0 to 9
The actual displacement values are 0, 4, 8, 12 ... 36. Because the pointer p has a type, in this case int *, the offset applied in pointer arithmetic is scaled by sizeof(int), in your case 4. In other words, the displacement multiplier is always based on the size of the type that your pointer points to.
But what if I actually want to point to memory location 2 using an
integer pointer. So, I do this: p = 2. Then, when I try to
de-reference this pointer, should I expect a segmentation fault?
The exact location 2 will most probably be unavailable in the address space of your process because that part would either be reserved by the operating system, or will be protected in another form. So in that sense, yes, you will get a segmentation fault.
The general problem, however, with accessing a data type at locations not evenly divisible by its size is breaking the alignment requirements. Many architectures would insist that ints are accessed on a 4-byte boundary, and in that case your code will trigger an unaligned memory access which is technically undefined behaviour.
Now, from what I understand, the pointer p initially contains value 0
No, it contains the address to the first integer in an array of 10. (Assuming that malloc was successful.)
In the loop, a displacement is added to the pointer to access the subsequent integers.
Umm no. I'm not sure what you mean but that is not what the code does.
I checked and found that the pointer is incremented in multiples of 4. How does this happen?
Pointer arithmetic, that is using the + - ++ -- etc. operators on a pointer, is smart enough to know the type. If you have an int pointer and write p++, then the address that is stored in p will get increased by sizeof(int) bytes.
But what if I actually want to point to memory location 2 using an integer pointer. So, I do this: p = 2.
No, don't do that, it doesn't make any sense. It sets the pointer to point at address 0x00000002 in memory.
Explanation of the code:
int *p; is a pointer to integer. By writing *p = something you change the contents of what p points to. By writing p = something you change the address of where p points.
p = (int *)malloc(sizeof(int[10])); was written by a confused programmer. It doesn't make any sense to cast the result of malloc in C; you can find extensive information about that topic on this site.
Writing sizeof(int[10]) is the same as writing 10*sizeof(int).
*(p+i) = 0; is the very same as writing p[i] = 0;
I would fix the code as follows:
int *p = malloc(sizeof(int[10]));
if(p == NULL) { /* error handling */ }
for (int i=0; i<10; i++)
{
p[i] = 0;
}
free(p);
Since you have a typed pointer, when you perform common operations on it (addition or subtraction), the offset is automatically scaled by the size of the pointed-to type. Here, since on your computer sizeof (int) is 4, p + i will result in the address of p plus sizeof (int) * i bytes, or 4*i bytes in your case.
And you seem to misunderstand the statement *(p+i) = 0. This statement is equivalent to p[i] = 0. Obviously, your malloc() call won't return you 0, except if it fails to actually allocate the memory you asked.
Then, I assume that your last question means "If I shift my malloc-ated address by exactly two bytes, what will occur?".
The answer depends on what you do next and on the endianness of your system. For example:
/*
* Suppose our pointer p is well declared
* And points towards a zeroed 40 bytes area.
* (here, I assume sizeof (int) = 4)
*/
int *p1 = (int *)((char *)p + 2);
*p1 = 0x01020304;
printf("p[0] = %x, p[1] = %x.\n", p[0], p[1]);
Will output
p[0] = 102, p[1] = 3040000.
On a big endian system, and
p[0] = 3040000, p[1] = 102
On a little endian system.
EDIT : To answer to your comment, if you try to dereference a randomly assigned pointer, here is what can happen:
You are lucky: the address you typed corresponds to a memory area which has been allocated for your program. Thus, it is a valid virtual address. You won't get a segfault, but if you modify it, it might corrupt the behavior of your program (and it surely will ...)
You are luckier: the address is invalid, and you get a nice segfault that prevents your program from totally screwing things up.
It is called pointer arithmetic. Adding an integer n to a pointer of type t* moves the pointer by n elements, i.e. by n * sizeof(t) bytes. Therefore, if sizeof(int) is 4 bytes:
p + 1 (in C) == address of p + 1 * sizeof(int) bytes == address of p + 1 * 4 bytes == address of p + 4 bytes
Then it is easier to index your array:
*(p+i) is the i-th integer in the array p.
I don't know if by "memory location 2" you mean your example memory address 2 or the 2nd value in your array. If you mean the 2nd value, that is index 1. To get a pointer to this location you would do int *ptr = &p[1]; or equivalently int *ptr = p + 1;, then you can print this value with printf("%d\n", *ptr);. If you mean index 2, that is the 3rd value in the array, and you'd want p[2] or p + 2. Note that memory addresses are usually written in hex and wouldn't actually start at 0. They would be something like 0x092ef000, 0x092ef004, 0x092ef008, and so on. The other answers don't seem to realize that you are using memory addresses 0 . . . 39 just as example addresses. I don't think you are actually referring to the physical locations starting at address 0x00000000, and if you are, then what everyone else is saying is right.

Difference between pointer and array in terms of memory [duplicate]

char* pointer;
char array[10];
I know that the memory of the second one is already allocated. But I don't know how exactly a pointer works in terms of memory allocation. How much space does a pointer initially take before it is allocated by the programmer with malloc or calloc? Additionally, if I initialize it like this
char* pointer;
pointer = "Hello World!";
If the memory isn't allocated before it's initialized with some random string size, how is this being initialized? Wouldn't there be any error involved?
I was just programming with pointers and arrays mechanically w/o really knowing how it works inside the computer. And, I thought I should understand this perfectly for better programming practice.
A pointer just stores the address of a variable. That is, a char* stores the address of one character. Just as int i=9; means that memory of sizeof(int) is reserved and labelled "i" in your program, char* c; means that memory of sizeof(char*) is reserved and labelled "c". In c="hello"; the characters 'h','e','l','l','o' (plus a terminating '\0') get separate, contiguous memory, and the pointer c points to the first char, 'h'.
Suppose the string "HELLO" happens to be stored in memory just before the string "INDIA":
HELLO\0INDIA\0
for char *c="HELLO"; c[5] is the terminating '\0'. Reading past it (c[6] and beyond) is out of bounds: it is undefined behavior that may happen to return characters of the neighboring string.
for char c[5]="HELLO"; c[5] is out of bounds of the array.
char array[10]
reserves 10 bytes in the stack frame of the function in which you declared it.
Whereas in char *ptr = "hello", ptr gets 4 bytes on the stack on a 32-bit OS; "hello" is a string literal that is stored in a (typically read-only) data section of your executable, not in .bss, and ptr points to it from the stack frame.
This declaration:
char *pointer;
reserves sizeof(char *) bytes for the pointer value. No other memory is allocated.
This declaration:
char array[10];
reserves 10 bytes for the array.
In this case:
char *pointer;
pointer = "Hello World!";
You still have a single pointer (sizeof(char *) in size) that points to a string literal somewhere in memory - no other allocations are taking place. Storage for the string literal is worked out by your toolchain at compile time.
Pointers and arrays are completely different things. Any similarity and confusion between the two is an artifact of the C language.
A pointer is a variable which holds the location of another variable. An array (in C) is an aggregate of values of identical type, consecutively allocated in memory.
Arrays are accessed via arithmetic upon a pointer to the base element, [0]. If an expression which refers to an array is evaluated, the value which emerges is a pointer to element [0].
int array[10];
int *p = array; /* array expression yields pointer to array[0] */
int *q = &array[0]; /* q also points to same place as p */
The array notation in C is a sham which actually works with pointers. The syntax E1[E2] means the same thing as *(E1 + E2) (assuming that E1 and E2 are sufficiently parenthesized that we don't have to be distracted by associativity and precedence.) When we take the address of an element via &E1[E2], this is the same as &*(E1 + E2). The address-of and dereference "cancel out" leaving E1 + E2. Therefore, these are also equivalent:
int *r = array + 3;
int *q = &array[3];
Because array[i] and pointer[i] are both valid syntax, people in the newbie stage (mistaking syntax to be semantics) conclude that arrays and pointers are somehow the same thing.
Spend some time programming in an assembly language. In assembly language, you might define some storage like this:
A: DFS 42 ;; define 42 words, labelled as A.
Then use it like this:
MOV R13, A ;; address of A storage is moved into R13
MOV R1, [R13 + 3] ;; load fourth word, A[3]
R13 points to the storage. That doesn't mean A is a pointer. A is the name of the storage. Of course to use the storage, we need its effective address and so a reference to A resolves to that. When the code is assembled and linked, that MOV instruction will end up loading some numeric address into R13.
C is just a higher level assembly language. Arrays are like named storage which resolves to its effective address (being a pointer data type).
Arrays do not always resolve to their effective address. sizeof a calculates the size of an array a in bytes, whereas sizeof p calculates the size of the pointer data type p.
Another difference is that the name of an array cannot be made to refer to any place other than that array, whereas we can assign values to pointers:
array++; /* invalid */
array = &array[0]; /* invalid */
p = &array2[0]; /* valid */
p++; /* valid: point to the next element in array2 */
A pointer takes up whatever space is required to describe a memory location. In general (but not always), the size of a pointer is the same as the bit-size of the processor/OS/program-mode (e. g. 8 bytes for a 64-bit program on a 64-bit OS, 4 bytes on a 32-bit program, etc.). You can find this out using
sizeof (void *)
In the case of
char* pointer;
pointer = "Hello World!";
You'll have one pointer allocated in R/W memory, plus the space for the string (13 bytes, including the trailing null byte) in R/O memory (perhaps more if the next object in memory is aligned on better than a byte boundary). Note that the same R/O space would be allocated for
printf("Hello World!");
so that actually has nothing to do with the pointer. In fact, most optimizing compilers would notice that the two strings are exactly the same and only allocate it once in R/O memory.
I don't know how exactly pointer works in terms of memory allocation. How much space does pointer initially takes before it is allocated by the programmer with malloc or calloc?
A pointer itself is just a data type, like int, char, etc.
It holds a memory address (or NULL).
It can be assigned the result of malloc, and then it points to the block of memory you asked for.
You have very likely assumed that a pointer is the same thing as malloc'ed memory; it is not.
When you define a pointer in C, e.g. char *c; or int *i;, it reserves memory equal to sizeof(char *) and sizeof(int *) respectively.
The actual amount reserved depends on your system/OS: typically 8 bytes on a 64-bit system and 4 bytes on a 32-bit one.
In the case of char *c = "Hello world";
the string "Hello world" can be stored anywhere in memory, but c points to the first character of the string, that is 'H'.

Resources