Misconception of how memory is created using malloc()/calloc()


My understanding of how malloc()/calloc() create memory has always been that once an item is created, the address of the object stays the same. But a function I often use to create an array of strings, and one that has always seemed to work well, recently made me question that understanding: it appears that the memory addresses of objects can be (and are) moved simply by calling calloc/malloc.
To illustrate, here is the function I have used to create memory for an array of strings - char **:
char ** CreateArrayOfStrings(char **a, int numWords, int maxWordLen)
{
    int i;
    a = calloc(numWords, sizeof(char *)); //create array of pointers
    if(!a) return a; //requires caller to check for NULL
    for(i=0;i<numWords;i++)
    {
        a[i] = calloc(maxWordLen + 1, 1); //create memory for each string
    }
    return a;
}
On my system (Win7, 32-bit compile, ANSI C), the line:
a = calloc(numWords, sizeof(char *)); //create array of pointers
creates a block of contiguous memory, sized for numWords char *, in this case 7, yielding 28 bytes:
The memory spans from address 0x03260080 to 0x03260080 + 0x1C (0x0326009C)
Or:
a[0] is at 0x03260080
a[1] is at 0x03260084
a[2] is at 0x03260088
a[3] is at 0x0326008C
a[4] is at 0x03260090
a[5] is at 0x03260094
a[6] is at 0x03260098
Then, I create memory for each of numWords (7) strings
for(i=0;i<numWords;i++)
{
a[i] = calloc(maxWordLen + 1, 1); //maxWordLen == 5 in this example
}
Which results in the following:
This shows that the memory locations of the pointers a[1] - a[6] have been changed.
Can someone explain how/why this happens in malloc()/calloc()?

It appears that you are comparing apples to oranges:
When you print the "a[i] is at ..." lines, you show the addresses of the elements inside the array a.
However, when you show the memory layout, you show the values stored at these addresses, which are themselves pointers, so the whole picture looks confusing.
If you print the values of a[i] before assigning the calloc results to them, you should get all zeros, because calloc zeroes the memory. After the assignments, you see pointers to 6-byte blocks in each a[i], which makes perfect sense.
To summarize, your initial understanding of what happens when you allocate memory with malloc and calloc is correct: once a chunk of memory is allocated, its address* remains the same.
* On systems with virtual memory, I should say "its virtual address".

The memory of those addresses has not been changed. You are creating a 28-byte large block of space (a) and then at each element, dynamically allocating a second 6-byte block of space with its own address.
In other words, at a[1] (memory address 0x03260084), the value that is stored there is a pointer to memory address 0x32600D0.
To check the memory locations and the values at each one, try this:
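for ( i = 0; i < 7; i++ ) // 7 == numWords; index 7 would be out of bounds
{
    // first %p: address of the slot a[i]; second %p: the pointer stored in it
    printf("a[%d] %p %p\n", i, (void *)&a[i], (void *)a[i]);
}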

When you call calloc(numWords, sizeof(char *)) you ask the allocator for space for numWords pointers, which are 4 bytes each on your system; that is why the resulting block is 4 * numWords bytes. calloc returns the address of the first of them. You can now store in each slot the address of a block that will hold the actual data. The nth call to calloc(maxWordLen + 1, 1) then returns the address of a block of maxWordLen + 1 bytes, and that address is stored in a[n], whose own location is simply the address returned by the first call to calloc plus n * sizeof(char *).
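The address arithmetic described above can be checked directly. This is only a minimal sketch: the helper name element_address_matches is made up for illustration and is not part of the original question.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (not from the original post): returns 1 if the
   address of slot a[n] equals the start of the block plus
   n * sizeof(char *) bytes, and 0 otherwise. */
int element_address_matches(char **a, size_t n)
{
    char *base = (char *)a;   /* view the pointer block byte by byte */
    return (char *)&a[n] == base + n * sizeof(char *);
}
```

Calling it on the block returned by the first calloc should give 1 for every valid index, both before and after the per-string calloc calls, because those later calls only change what is stored in each slot, not where the slot lives.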

Related

How does malloc know how much memory space is treated for an index?

When we use malloc, it returns a pointer to the beginning of a fixed sized memory address that was passed to malloc. For example, malloc(40) will throw me some uninitialized piece of memory that is 40 bytes long. The thing is, I have seen examples of code where people index into this piece of memory. My question is, how does malloc define the size of an index?
For example, take this piece of code,
#include <stdio.h>
#include <stdlib.h>
int main()
{
char **array;
array = malloc(3 * sizeof(char *));
for (int i=0; i < 3; i++) {
array[i] = malloc(10);
}
for (int i=0; i < 3; i++) {
free(array[i]);
}
free(array);
return 0;
}
I would first like to explain what I believe is happening and would hope someone could correct me about any incorrect ideas that I have.
char **array creates the variable "array", where it will become a pointer, to a character pointer. This means if we dereference this value, it will give us a memory address location of where the char is stored.
array = malloc(3 * sizeof(char *)). Let's assume here that sizeof(char *) will always return 8. Carrying on, this will create an uninitialized piece of memory that is 24 bytes long. The key point here is that it is 24 bytes long; how does it know the size of an indexable element?
array[i] = malloc(10) is the part of my confusion here. We have an uninitialized piece of memory that is 24 bytes long; how do we index into it?
I have an idea which I would like to draw out and hope someone could correct any misunderstanding I have.
0x02 0x0A 0x12
[ 0x90 | 0x91 | 0x92 ]
<-- sizeof(char*) -> <-- sizeof(char*) -> <-- sizeof(char*) ->
^
|
|
0x01 (memory address of variable array) (array - points to 0x02)
-- Random memory locations
0x90 -- | Starting from the memory address location 0x90, the next sizeof(char) bytes will represent the value at this memory address location.
['c']
0x91
['a']
0x92
['t']
From my understanding malloc will know the indexable size from the cast we have done on our initial pointer, i.e. the char* inside of char** array. This means that the pointer returned from malloc will point, in this example, to a memory address located at 0x02 (the beginning of the array).
Each time we perform the action array[i] we are actually doing 0x02 + sizeof(char*) * i which will push the pointer to the beginning of a new location. This means for example when we do array[1] we are actually doing 0x02 + sizeof(char*) * 1 which would push us to 0x02 + 8 (0x0A). This means that from the memory address location 0x0A the next sizeof(char *) bytes will be read as the index stored in this place in memory. In this example it would be a char *, in my example I have written 0x90, meaning some other place in memory 0x90 the next sizeof(char), i.e. 1 byte will have the actual value. The actual value representing 'c' (for example), but this could be located somewhere else in memory, not related to malloc.
Using this formula we can have, for example, an integer array returned from malloc, by writing int* ten_int_array = malloc(10 * sizeof(int)). Now the formula would be adjusted to ten_int_array + sizeof(int) * i, which would mean malloc does not have a fixed indexable size.
Thank you for any replies, I am trying to verify my assumptions here.

Here is what is happening. Suppose you are running on a 64-bit system, where a memory address needs 8 bytes (64 bits).
char *str; declares a pointer variable (denoted by *) which can point to a place in memory. By the above convention it is 8 bytes large. The compiler knows that the object it is supposed to point to is a char. It has absolutely no idea whether other chars follow or precede this location; only the programmer does.
So, str = malloc(10); allocates enough memory to hold 10 characters, including the terminating '\0'. The address of that memory is assigned to the pointer variable str.
char **array; declares a pointer * to a pointer *. This is yet another 8-byte variable which, as the compiler knows, points to another pointer which itself points to a char. As before, the compiler has no idea whether there are more pointers adjacent to the one it points to.
array = malloc(3 * sizeof(char*)); allocates enough space in memory to hold exactly 3 pointers to char. The result of the allocation is assigned to array.
In C, the [] operator applied to a pointer works the same way as when applied to an array variable. So array[1] yields the second pointer in the memory allocated above.
array[i] = malloc(10); allocates memory for a string of 10 characters and assigns the result to the pointer stored at array[i]. free(array[i]) frees this memory.
As a result, you have a two-level dynamic structure.
          |-malloc(3 * sizeof (char*)) == 24 bytes
          |
          V
array --> [0] --> malloc(10) == 10 bytes
          [1] --> malloc(10)
          [2] --> malloc(10)
So, when you free, you need to free(array[i]) for i in 0..2 first and then free(array), because after freeing array the memory it points to becomes invalid and you can no longer use it.
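The two-level structure and its matching teardown could be wrapped up roughly as follows. This is a sketch, not code from the question: the names string_table_create and string_table_destroy are invented here, and it assumes count and len are both nonzero (calloc with a zero size may legitimately return NULL, which this sketch would treat as a failure).

```c
#include <stdlib.h>

/* Hypothetical helper: allocates `count` zero-filled buffers of `len`
   bytes each, reachable through a block of `count` pointers.
   Returns NULL on failure, after releasing anything already allocated. */
char **string_table_create(size_t count, size_t len)
{
    char **table = calloc(count, sizeof *table);   /* first level */
    if (table == NULL)
        return NULL;

    for (size_t i = 0; i < count; i++) {
        table[i] = calloc(len, 1);                 /* second level */
        if (table[i] == NULL) {
            while (i > 0)                          /* undo rows 0..i-1 */
                free(table[--i]);
            free(table);
            return NULL;
        }
    }
    return table;
}

/* Frees in the opposite order: every row first, then the pointer block. */
void string_table_destroy(char **table, size_t count)
{
    if (table == NULL)
        return;
    for (size_t i = 0; i < count; i++)
        free(table[i]);
    free(table);
}
```

With the numbers from the question, string_table_create(3, 10) would stand in for the malloc loop, and string_table_destroy(array, 3) for the free loop followed by free(array).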

How to free allocated memory to an array of pointers after assigning value?

A simple beginner's dilemma so it should be quickly apparent.
I am trying to free allocated memory from a variable inside of the array of char pointers
This throws no error:
array= malloc(1*sizeof(char *) + 1);
array[0] = malloc(2*sizeof(char *));
free(array[0]);
Yet if I add some value to it, I get an error:
array= malloc(2*sizeof(char *));
array[0] = malloc(2*sizeof(char *));
array[0] = "a";
free(array[0]);
(malloc: *** error for object 0x10......: pointer being freed was not allocated
malloc: *** set a breakpoint in malloc_error_break to debug)
How could this be explained and how to deal with this?

Assuming your array is to be a list of pointers to strings (each of which will take data such as the 2-character, "a"), then there are a couple of errors in your approach.
First, the array[0] (and other elements) should be allocated as sizeof(char) * 2 (not sizeof(char*) * 2) – that will give pointers to buffers that can each hold up to 2 characters (the a letter and the nul-terminator).
So:
array= malloc(2*sizeof(char *)); // An array of two char* pointers
array[0] = malloc(2*sizeof(char)); // One pointer to a 2-character buffer
//...
Then, when you want to assign a given string to one of the elements, use the strcpy function, as shown below. What your array[0] = "a"; line does is to replace the address of the allocated buffer with the address of the string literal – and, as you didn't allocate that, you can't free it (and you then can't even free your actual allocated buffer, as you've 'lost' its address).
//...
strcpy(array[0], "a"); // Copy the second argument's data to the first
// ... later on ...
free(array[0]); // This will now (still) work, as you didn't change the address
// ...
free(array); // And don't forget to free the array of pointers!
Note, also, that sizeof(char) is defined (by the C Standard) to be 1 byte, so you can omit that in the second line of the first snippet above (but you need it in the sizeof(char*) case – the size of a pointer will vary between platforms and compilers).

@Adrian Mole is totally correct, and I just want to add a few comments about string literals.
String literals typically live in a read-only section of the program image (for example .rodata on ELF systems), which is set up before the program runs. You did not allocate that memory, so you cannot free it, and attempting to modify a string literal is undefined behavior, so
array[0] = "a";
array[0][0] = array[0][0] + 1;
will also typically fail (modifying a string literal is undefined behavior, and on most systems it crashes).
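If what you want is a string you can both modify and free, one option is to copy the literal into memory you own. The following is a sketch under that assumption; copy_string is a made-up name, and it does the same job as the malloc plus strcpy pair shown in the answer above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated, writable copy of src,
   or NULL if the allocation fails. The caller owns the result and is
   responsible for passing it to free(). */
char *copy_string(const char *src)
{
    size_t size = strlen(src) + 1;   /* +1 for the terminating '\0' */
    char *copy = malloc(size);
    if (copy != NULL)
        memcpy(copy, src, size);     /* copies the terminator as well */
    return copy;
}
```

After array[0] = copy_string("a"); it is fine to write array[0][0] = array[0][0] + 1; and later free(array[0]);, because array[0] now holds an address that came from malloc rather than the address of the literal.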

Amount of memory to allocate to array of strings?

I have a char** which is designed to hold an unknown number of strings of unknown length
I've initially allocated 10 bytes using
char **array = malloc(10);
and similarly, before adding strings to this array, I allocate
array[num] = malloc(strlen(source)+1)
I've noticed that my program crashes upon adding the 6th element to the array
My question is, how does memory with these arrays work? When I allocated 20 bytes, nothing happened, yet when I allocated 30, it suddenly could hold 10 elements. These were all strings of 2-3 characters in size. I'm struggling to think of a condition to realloc memory with, e.g
if condition{
memoryofarray += x amount
realloc(array, memoryofarray)
}
What exactly uses the memory in the char**? I was under the impression that each byte corresponds to how many lines they can hold, i.e. malloc(10) would allow the array to hold 10 strings. I need to know this to establish conditions + to know how much to increment the memory allocated to the array by.
Also, curiously, when I malloced
array[num] = malloc(0)
before assigning a string to that array element, it worked without problems. Don't you need to at least have strlen amount of bytes to store strings? This is confusing me massively

This line:
char **array = malloc(10);
allocates 10 bytes, however, remember that a pointer is not the same size as a byte.
Therefore you need to make sure you allocate an array of sufficient size by using the size of the related type:
char **array = malloc(10 * sizeof(char*));
Now that you have an array of 10 pointers you need to allocate memory for each of the 10 strings, e.g.
array[0] = malloc(25 * sizeof(char));
Here sizeof(char) is not needed but I added it to make it more obvious how malloc works.

If you want to hold 10 strings, then you need to allocate memory for 10 char *'s and then allocate memory for each of those char pointers. You allocated 10 bytes (not enough for 10 char *'s). Allocate like this -
char **array = malloc(10*sizeof(char *)); // allocate memory for 10 char *'s
And then do what you were doing -
array[num] = malloc(strlen(source)+1) // allocate desired memory to each pointer
note - take care that num is initialized and does not access out of bound index.

This will allocate enough memory for 10 pointers to char (char*) in array
char **array = malloc(10*sizeof(array[0]));
On a 64bit system the size of a char* is 8 bytes = 64 bits. The size of a char is typically 1 byte = 8 bits.
The advantage of using sizeof(array[0]) instead of sizeof(char*) is that it's easier to change the type of array in the future.
char** is pointer to a pointer to char. It may point to the start of a memory block in the heap with pointers to char. Similarly char* is a pointer to char and it may point to the start of a memory block of char on the heap.
If you write beyond the allocated memory you get undefined behaviour. If you are lucky it may actually appear to behave well! So when you do, for example:
array[num] = malloc(0);
and then copy a string into array[num], you may, purely by (good) luck, not get a segmentation fault.
Your use of realloc is wrong. realloc may have to move the memory block whose size you want to increase in which case it will return a new pointer. Use it like this :
if (condition) {
memoryofarray += amount;
array = realloc(array, memoryofarray);
}
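One caveat with assigning the result of realloc straight back to array: if realloc fails it returns NULL and leaves the old block allocated, so the only pointer to it is lost. A common way around that is to go through a temporary. The sketch below assumes the array is a block of char * and that new_count is nonzero; grow_pointer_array is a name invented for this example.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: resizes *array so that it can hold new_count
   pointers. Returns 1 on success. Returns 0 on failure, in which case
   *array is left untouched and still valid. */
int grow_pointer_array(char ***array, size_t new_count)
{
    if (new_count > SIZE_MAX / sizeof **array)   /* byte count would overflow */
        return 0;

    char **tmp = realloc(*array, new_count * sizeof **array);
    if (tmp == NULL)
        return 0;                                /* old block is still intact */

    *array = tmp;                                /* block may have moved */
    return 1;
}
```

Note that the count here is in elements, not bytes, so a condition such as "the number of stored strings has reached the current capacity" is enough to decide when to call it, for example with double the previous capacity.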

Rather than allocating memory using the fault-prone style
pointer = malloc(n); // or
pointer = malloc(n * sizeof(type_of_pointer));
Use
pointer = malloc(sizeof *pointer * n);
Then
// Bad: certainly fails to allocate memory for 10 `char *` pointers
// char **array = malloc(10);
// Good
char **array = malloc(sizeof *array * 10);
how does memory with these arrays work?
If insufficient memory is allocated, it does not work. So step 1: allocate sufficient memory.
Concerning array[num] = malloc(0). An allocation of 0 may return NULL or a pointer to no writable memory or a pointer to some writable memory. Writing to that pointer memory is undefined behavior (UB) in any of the 3 cases. Code may crash, may "work", it is simply UB. Code must not attempt writing to that pointer.
To be clear: "worked without problems" does not mean code is correct. C is coding without a net. Should code do something wrong (UB), the language is not obliged to catch that error. So follow safe programming practices.

First allocate an array of pointers:
char* (*array)[n] = malloc( sizeof(*array) );
Then for each item in the array, allocate the variable-length strings individually:
for(size_t i=0; i<n; i++)
{
(*array)[i] = malloc( some_string_length );
}

Dynamically allocated 2 dimensional arrays

Does anyone know what the third line "Free(array)" does? array here is just the address of the first element of array(in other words, a pointer to the first element in the array of int * right)? Why do we need the third line to free the "columns" of the 2D array? I basically memorized/understand that a is a pointer to means a holds the address of ____. Is this phrase correct?
For example: int **a; int * b; int c; b = &c = 4;
a = &b; This is correct right? Thankyou!!!
Also, in general, double pointers are basically dynamically allocated arrays right?
"Finally, when it comes time to free one of these dynamically allocated multidimensional ``arrays,'' we must remember to free each of the chunks of memory that we've allocated. (Just freeing the top-level pointer, array, wouldn't cut it; if we did, all the second-level pointers would be lost but not freed, and would waste memory.) Here's what the code might look like:" http://www.eskimo.com/~scs/cclass/int/sx9b.html
for(i = 0; i < nrows; i++)
free(array[i]);
free(array);

Why do we need the third line to free the "columns" of the 2D array?
The number of deallocations should match up with the number of allocations.
If you look at the code at the start of the document:
int **array;
array = malloc(nrows * sizeof(int *));
for(i = 0; i < nrows; i++) {
array[i] = malloc(ncolumns * sizeof(int));
}
you'll see that there is one malloc() for the array itself and one malloc() for each row.
The code that frees this is basically the same in reverse.
Also, in general, double pointers are basically dynamically allocated arrays right?
Not necessarily. Dynamically allocated arrays is one use for double pointers, but it's far from the only use.

Calls to malloc allocate memory on the heap, equal to the number of bytes specified by its argument, and returns the address of this block of memory. Your '2D array' is really a 1D array of int addresses, each pointing to a chunk of memory allocated by malloc. You need to free each of these chunks when you are done, making it available for others to use. But your 1D array is really just another malloc'd chunk of memory to hold these malloc'd addresses, and that needs to be freed also.
Also, when you use printf("%s", array) where array is a char *, the compiler sees array as the address of array[0] but prints the string, right? I'm just curious if I'm understanding it right.
Yes, %s tells printf to go to whatever address you give it (the address of a char, aka a char*, let's say), and start reading and displaying whatever is in memory at that address, one character at a time, until it finds a null character. So in the case of a string, that is the expected behavior, since a string is just an array of chars followed by the '\0' char.

How to get the size of dynamically allocated 2d array

I have dynamically allocated 2D array.
Here is the code
int **arrofptr ;
arrofptr = (int **)malloc(sizeof(int *) * 2);
arrofptr[0] = (int *)malloc(sizeof(int)*6144);
arrofptr[1] = (int *)malloc(sizeof(int)*4800);
Now I need to know how many bytes are allocated for arrofptr, arrofptr[0] and arrofptr[1].
Is there any way to know the size?
If we print
sizeof(arrofptr);
sizeof(arrofptr[0]);
sizeof(arrofptr[1]);
then it will print 4.

You can't find size of arrofptr, because it is only a pointer to pointer. You are defining an array of arrays using that. There's no way to tell the size information with only a pointer, you need to maintain the size information yourself.

The only return value you get from malloc() is a pointer to the first byte of the allocated region (or NULL on failure). There is no portable, standard, way of getting the associated allocation size from such a pointer, so in general the answer is no.
The C way is to represent arrays and buffers in general with a pair of values: a base address and a size. The latter is typically of the type size_t, the same as the argument to malloc(), by the way.
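A minimal sketch of that base-address-plus-size pairing, assuming an int buffer like the ones in the question. The struct and function names (int_buffer, int_buffer_init, int_buffer_bytes) are made up for illustration.

```c
#include <stdlib.h>

/* Hypothetical type: keeps the pointer and its element count together,
   since the pointer alone carries no size information. */
struct int_buffer {
    int    *data;    /* base address returned by malloc */
    size_t  count;   /* number of ints the block can hold */
};

/* Allocates room for `count` ints. Returns 1 on success, 0 on failure. */
int int_buffer_init(struct int_buffer *buf, size_t count)
{
    buf->data  = malloc(count * sizeof *buf->data);
    buf->count = (buf->data != NULL) ? count : 0;
    return buf->data != NULL;
}

/* Size of the allocation in bytes, computed from the recorded count. */
size_t int_buffer_bytes(const struct int_buffer *buf)
{
    return buf->count * sizeof *buf->data;
}
```

For the question's first row, int_buffer_init(&row, 6144) followed by int_buffer_bytes(&row) would report 6144 * sizeof(int), which is the number that sizeof(arrofptr[0]) cannot give you. The block is released with free(row.data).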

If you want to keep track of the size of an allocated block of memory, you can store that information in the block itself, e.g.
// allocate 1000 ints plus one int to store size
int* p = malloc(1000*sizeof(int) + sizeof(int));
*p = (int)(1000*sizeof(int)); // store the payload size in the first int
p += 1; // step over that one int (p += sizeof(int) would skip sizeof(int) ints)
...
void foo(int *p)
{
if (p)
{
--p;
printf( "p size is %d bytes", *p );
}
}
Alternatively, put it in a struct
struct
{
int size;
int *array;
} s;

You can't get the length of dynamically allocated arrays in C (2D or otherwise). If you need that information save it to a variable (or at least a way to calculate it) when the memory is initially allocated and pass the pointer to the memory and the size of the memory around together.
In your test case above, sizeof returns the size of the pointer itself, not of the block it points to. Pointers are 4 bytes on your platform, which is why you got 4, and you will always get that same result no matter how much memory was allocated.