Difference between two methods of malloc - c

I want to create 5*5 2D Matrix. I usually use the following way of memory allocation:
int **M = malloc(5 * sizeof(int *));
for (i = 0; i < 5; i++)
{
M[i] = malloc(5 * sizeof(int));
}
While I was reading a blog, I found also another way to do that:
int **M = malloc(5 * sizeof(int*));
M[0] = malloc((5*5) * sizeof(int));
My question is: What is the difference between both methods? Which one is more efficient?

For the second code, note that you need to initialize the other array members for it to work correctly:
for (int i = 1; i < 5; i++) {
M[i] = M[0] + i * 5;
}
So in the second code the array members (across all rows) are contiguous. Element access is unchanged (e.g., you can still use the M[i][j] syntax). Compared with the first code, it needs only two malloc calls and, as mentioned in the comments, it is friendlier to the cache, which can greatly improve access performance.
But if you plan to dynamically allocate large arrays, the first method is the better choice because of memory fragmentation (a single large contiguous block may not be available, or requesting it may exacerbate fragmentation).
A similar example of this kind of dynamic allocation of arrays of arrays can be found in the c-faq: http://c-faq.com/aryptr/dynmuldimary.html
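To make the second method concrete, here is a minimal sketch of it wrapped in a pair of helpers. The names matrix_alloc and matrix_free are made up for illustration, and the sketch assumes rows * cols * sizeof(int) does not overflow size_t.

```c
#include <stdlib.h>

/* Hypothetical helper: one pointer table plus one contiguous data block,
   as in the second method above. Returns NULL on failure. */
int **matrix_alloc(size_t rows, size_t cols)
{
    if (rows == 0 || cols == 0)
        return NULL;

    int **m = malloc(rows * sizeof *m);          /* row pointer table */
    if (m == NULL)
        return NULL;

    m[0] = malloc(rows * cols * sizeof *m[0]);   /* all elements, contiguous */
    if (m[0] == NULL) {
        free(m);
        return NULL;
    }

    for (size_t i = 1; i < rows; i++)            /* point each row into the block */
        m[i] = m[0] + i * cols;

    return m;
}

/* Matching cleanup: two free calls, mirroring the two malloc calls. */
void matrix_free(int **m)
{
    if (m != NULL) {
        free(m[0]);
        free(m);
    }
}
```

After int **M = matrix_alloc(5, 5); the element following M[0][4] in memory is M[1][0], and matrix_free(M) releases everything.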

After seeing ouah's answer and seeing the example in the C FAQ, I now understand where the second technique comes from, although I personally wouldn't use it where I could help it.
The main problem with the first approach you show is that the rows in the array are not guaranteed to be adjacent in memory; IOW, the object immediately following M[0][4] is not necessarily M[1][0]. If two rows are allocated from different pages, that could degrade runtime performance.
The second approach guarantees that all the rows will be allocated contiguously, but you have to manually assign M[1] through M[4] to get the normal M[i][j] subscripting to work, as in
for ( size_t i = 1; i < 5; i++ )
  M[i] = M[i-1] + 5;
IMO it's a clumsy approach compared to the following:
int (*M)[5] = malloc( sizeof *M * 5 );
This also guarantees that the memory is allocated contiguously, and the M[i][j] subscripting works without any further effort.
However, there is a drawback; on compilers that don't support variable-length arrays, the array size must be known at compile time. Unless your compiler supports VLAs, you can't do something like
size_t cols;
...
int (*M)[cols] = malloc( sizeof *M * rows );
In that case, the M[0] = malloc( rows * cols * sizeof *M[0]) followed by manually assigning M[1] through M[rows - 1] would be a reasonable substitute.

I hope I'm not missing something here but here's my attempt to answer the question "What is the difference...". If I am completely off base, forgive me and I will correct my answer but here goes:
I tried drawing out what is happening in your two mallocs so what I have to say is tied to the picture included which I drew by hand (hand crafted answers?)
First option:
For the first option, you allocate a memory block the size of 5 int*s. M, which is an int** points to the start of that memory block.
Then, you go over each of the memory blocks (the size of int*) and in each block you put in the address of a memory block the size of 5 ints. Note that these are located in some random portion of your memory (the heap) that has enough space to take the size of 5 ints.
This is the key - it's a noncontiguous block of memory. So if you think about memory as an array, you are pointing at different start locations in the array.
Second Option
Your second option allocates the int** block exactly the same way. But instead of one block per row, it allocates a single block the size of 25 ints and places the address of that block in M[0]. Note: you've never placed any address in the memory locations M[1] - M[4].
So, what happens? You have a contiguous block of 25 ints with an address that can be found in M[0]. What happens when you try to read M[1]? You guessed it: it is uninitialized and contains a junk value. Worse, that value most likely does not point to allocated memory, so dereferencing it is undefined behavior and will typically segfault.

If you want to allocate a 5x5 array in contiguous memory, the correct approach would be
int rows = 5;
int cols = 5;
int (*M)[cols] = malloc(rows * sizeof(*M));
You can then access the array with normal array indexing, e.g.
M[3][2] = 6;
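For what it's worth, a small sketch of that approach inside a function. fill_and_sum is a hypothetical name, and the code assumes a compiler with VLA support (C99, optional from C11 on) and cols greater than zero.

```c
#include <stdlib.h>

/* Hypothetical demo: fills a rows x cols matrix with 0, 1, 2, ... and
   returns the sum of its elements, or -1 if allocation fails. */
long fill_and_sum(size_t rows, size_t cols)
{
    int (*M)[cols] = malloc(rows * sizeof *M);   /* one contiguous block */
    if (M == NULL)
        return -1;

    long sum = 0;
    for (size_t i = 0; i < rows; i++) {
        for (size_t j = 0; j < cols; j++) {
            M[i][j] = (int)(i * cols + j);       /* normal 2D indexing */
            sum += M[i][j];
        }
    }

    free(M);   /* a single free releases the whole matrix */
    return sum;
}
```

Only the pointer type is variably modified here; the storage itself comes from malloc, so the matrix does not live on the stack.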

int **M = malloc(5 * sizeof(int *)); allocates memory for 5 pointers to int. M[i] = malloc(5 * sizeof(int)); allocates memory for 5 ints.
Maybe this will help you understand what is going on:
int **M = malloc(5 * sizeof(void *));
/* on most platforms 'void *' and 'int *' have the same size, although the C standard does not guarantee it */
for (i = 0; i < 5; i++)
{
M[i] = malloc(5 * sizeof(int));
}

Another little difference when using malloc((5*5) * sizeof(int));. Certainly a side issue to what OP is looking for, but still a concern.
The two calls below are equivalent: whichever order the two operands are in, the product is computed using size_t math.
#define N 5
malloc(N * sizeof(int));
malloc(sizeof(int) * N);
Consider:
#define N some_large_value
malloc((N*N) * sizeof(int));
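The result of sizeof has type size_t, an unsigned integer type for which SIZE_MAX >= INT_MAX almost certainly holds, and which is possibly far wider than int. In the call above, (N*N) is computed in int math first and can overflow before it is ever multiplied by sizeof(int). So, to avoid an int overflow in a case where size_t math would not overflow, use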
malloc(sizeof(int) * N * N);
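Reordering the operands only removes the int overflow; the size_t product itself can still wrap for a large enough N. A minimal sketch of an explicit check follows (alloc_square is a made-up name, not a standard function):

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocates n * n ints, or returns NULL if the byte
   count would not fit in size_t (or if malloc itself fails). */
int *alloc_square(size_t n)
{
    if (n != 0 && n > SIZE_MAX / n)          /* n * n would wrap */
        return NULL;

    size_t count = n * n;
    if (count > SIZE_MAX / sizeof(int))      /* count * sizeof(int) would wrap */
        return NULL;

    return malloc(count * sizeof(int));
}
```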

Related

Allocate 6xNxN array

I have a variable N. I need a 6xNxN array.
Something like this:
int arr[6][N][N];
But, obviously, that doesn't work.
I'm not sure how I'd go about allocating this so that I can access, e.g. arr[5][4][4] if N is 5, and arr[5][23][23] if N is 24.
Note that N will never change, so I'll never have to reallocate arr.
What should I do? Will int ***arr = malloc(6 * N * N * sizeof(int)); work?
You can allocate your 3-dimensional array on the heap as
int (*arr)[N][N] = malloc(sizeof(int[6][N][N]));
After use, you can free as
free(arr);
Another way of writing the same, as suggested by @StoryTeller, is
int (*arr)[N][N] = malloc(6u * sizeof(*arr));
But here you need to be careful about the u after 6 to prevent signed arithmetic overflow.
Also, there can still be issues on platforms where size_t is narrower than int, as suggested by @chqrlie, but that won't be the case on "most" commonly used platforms and hence you are fine using it.
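Putting the pieces together, here is a rough sketch with N as a run-time value. demo_3d is a hypothetical name; the code assumes VLA support and n greater than zero.

```c
#include <stdlib.h>

/* Hypothetical demo: allocates a 6 x n x n array on the heap, fills it with
   each element's flat position, and returns the last element
   (or -1 if n is 0 or allocation fails). */
int demo_3d(size_t n)
{
    if (n == 0)
        return -1;

    int (*arr)[n][n] = malloc(6 * sizeof *arr);   /* 6 blocks of n x n ints */
    if (arr == NULL)
        return -1;

    for (size_t x = 0; x < 6; x++)
        for (size_t y = 0; y < n; y++)
            for (size_t z = 0; z < n; z++)
                arr[x][y][z] = (int)((x * n + y) * n + z);

    int last = arr[5][n - 1][n - 1];   /* e.g. arr[5][4][4] when n is 5 */
    free(arr);
    return last;
}
```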
int arr[6][N][N]; will work just fine. You merely need to update your compiler and C knowledge to the year 1999 or later, when variable-length arrays (VLA) were introduced to the language.
(If you have an older version of GCC than 5.0, you must explicitly tell it to not use an ancient version of the C standard, by passing -std=c99 or -std=c11.)
Alternatively if you need heap allocation, you can do:
int (*arrptr)[Y][Z] = malloc( sizeof(int[X][Y][Z]) );
You cannot do int ***arr = malloc(6 * N * N * sizeof(int)); since a int*** cannot point at a 3D array. In general, more than two levels of indirection is a certain sign that your program design is completely flawed.
Detailed info here: Correctly allocating multi-dimensional arrays.
What you want can't work directly. For indexing a multi-dimensional array, all but the very first dimension need to be part of the type and here's why:
The indexing operator operates on pointers by first adding an index to the pointer and then dereferencing it. The identifier of an array evaluates to a pointer to its first element (except when e.g. used with sizeof, _Alignof and &), so indexing on arrays works as you would expect.
It's very simple in the case of a single-dimension array. With
int a[42];
a evaluates to a pointer of type int * and indexing works the following way: a[18] => *(a + 18).
Now in a 2-dimensional array, all the elements are stored contiguously ("row" after "row" if you want to understand it as a matrix), and what's making the indexing "magic" work is the types involved. Take for example:
int a[16][42];
Here, the elements of a have the type int [42] (42-element array of int). According to the rules above, evaluating an expression of this type in most contexts again yields an int * pointer. But what about a itself? Well, it's an array of int [42], so a will evaluate to a pointer to a 42-element array of int: int (*)[42]. Then let's have a look at what the indexing operator does:
a[3][18] => *(*(a + 3) + 18)
With a evaluating to a pointer to its first element, of type int (*)[42], this inner addition of 3 can properly advance the address by 3 * 42 * sizeof(int) bytes. This would be impossible if the second dimension weren't known in the type.
I guess it's simple to deduce the example for the n-dimensional case.
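If you want to check the rules above on your own machine, a tiny sketch like this does it (both function names are made up for illustration):

```c
#include <stddef.h>

/* Returns 1 if a[3][18] and *(*(a + 3) + 18) name the same object. */
int same_element(void)
{
    static int a[16][42];
    return &a[3][18] == &*(*(a + 3) + 18);
}

/* Byte distance covered by "a + 3": three whole rows of 42 ints. */
size_t row_step_bytes(void)
{
    static int a[16][42];
    return (size_t)((char *)(a + 3) - (char *)a);
}
```

same_element() should return 1, and row_step_bytes() should equal 3 * 42 * sizeof(int).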
In your case, you have two possibilities to achieve something similar to what you want.
Use a dynamically allocated flat array with size 6*N*N. You can calculate the indices yourself if you save N somewhere.
Somewhat less efficient, but yielding more readable code, you could use an array of pointers to arrays of pointers to arrays of int (multiple indirection). You could e.g. do
int ***a = malloc(6 * sizeof *a);
for (size_t i = 0; i < 6; ++i)
{
    a[i] = malloc(N * sizeof *(a[i]));
    for (size_t j = 0; j < N; ++j)
    {
        a[i][j] = malloc(N * sizeof *(a[i][j]));
    }
}
// add error checking to malloc calls!
Then your accesses will look just like those to a normal 3d array, but it's stored internally as many arrays with pointers to the other arrays instead of in a big contiguous block.
I don't think it's worth using this many indirections, just to avoid writing e.g. a[2*N*N+5*N+4] to access the element at 2,5,4, so my recommendation would be the first method.
Making a simple change to the declaration on this line and keeping the malloc can easily solve your problem.
int ***arr = malloc(6 * N * N * sizeof(int));
However, int *** is unnecessary (and wrong). Use a flat array, which is easy to allocate:
int *flatarr = malloc(6 * N * N * sizeof(int));
This works for three dimensions, and instead of accessing arr[X][Y][Z] as in the question, you access flatarr[(X*N*N) + (Y*N) + Z]. In fact, you could even write a handy macro:
#define arr(X,Y,Z) flatarr[((X)*N*N) + ((Y)*N) + (Z)]
This is basically what I've done in my language Cubically to allow for multiple-size cubes. Thanks to Programming Puzzles & Code Golf user Dennis for giving me this idea.
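If you'd rather not use a macro, the same index arithmetic fits in a small function. idx3 is a hypothetical name, and passing n explicitly is my choice for the sketch, not something from the answer above.

```c
#include <stddef.h>

/* Hypothetical helper: flat index of element (x, y, z) in an array whose
   last two dimensions are both n. Same value as x*n*n + y*n + z. */
static inline size_t idx3(size_t n, size_t x, size_t y, size_t z)
{
    return (x * n + y) * n + z;
}
```

With int *flatarr = malloc(6 * N * N * sizeof *flatarr); you would then write flatarr[idx3(N, x, y, z)]. Unlike the macro, the function evaluates each argument exactly once and gets type checking.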

Dynamic memory allocation for 2D array

I want to allot memory dynamically for a 2D array.
Is there any difference between these two ?
1)
array = (int**)malloc(size * sizeof(int*));
for (i = 0; i < size; i++) {
array[i] = (int *) malloc(size * sizeof(int));
}
2)
array = (int**)malloc(size *size* sizeof(int));
If yes, what is better to use and why ?
In the first case
array = (int**)malloc(size * sizeof(int*));
for (i = 0; i < size; i++) {
array[i] = (int *) malloc(size * sizeof(int));
}
you are allocating size blocks of size * sizeof( int ) bytes each; that is, you are allocating size one-dimensional arrays. In addition, you are allocating size pointers that point to the first elements of these one-dimensional arrays.
In the second case expression
(int**)malloc(size *size* sizeof(int))
means allocating a block of size * size objects of type int, with the returned pointer interpreted as int **. So this expression makes no sense, regardless of what is on the left side of the assignment. Take into account that the size of a pointer can be greater than the size of int.
You could write instead
int ( *array )[size] = ( int (*)[size] )malloc(size *size* sizeof(int));
In this case you are indeed allocating a two dimensional array provided that size is a constant expression.
Those two solutions are very different. The first will give you a vector of pointers to vectors. The second will give you a vector of the requested size. It all depends on your use case. Which do you want?
When it comes to releasing the memory, the first can only be freed by calling free for each pointer in the vector and then a final free on the vector itself. The second can be freed with a single call. Don't have that be your deciding reason to use one or the other. It all depends on your use case.
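As a rough sketch of what releasing the first form looks like, including cleanup when one of the row allocations fails (alloc_rows and free_rows are made-up names):

```c
#include <stdlib.h>

/* Frees the first `count` rows, then the pointer vector itself. */
void free_rows(int **array, size_t count)
{
    if (array == NULL)
        return;
    for (size_t i = 0; i < count; i++)
        free(array[i]);
    free(array);
}

/* Hypothetical helper: size x size "vector of pointers to vectors". */
int **alloc_rows(size_t size)
{
    int **array = malloc(size * sizeof *array);
    if (array == NULL)
        return NULL;

    for (size_t i = 0; i < size; i++) {
        array[i] = malloc(size * sizeof *array[i]);
        if (array[i] == NULL) {
            free_rows(array, i);   /* release only the rows allocated so far */
            return NULL;
        }
    }
    return array;
}
```

The second form from the question needs none of this: whatever pointer malloc returned is passed to free once.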
What is the type of the object you want to allocate? Is it an int **, an int *[] or an int[][]?
I want to allot memory dynamically for a 2 dimensional array.
Then just do
int (*arr)[size] = malloc(size * sizeof *arr);
Is there any difference between these two ?
Yes, they are wrong because of different errors. The first attempt does not allocate a 2D array, it allocates an array of pointers and then a bunch of arrays of ints. Hence the result will not necessarily be contiguous in memory (and anyway, a pointer-to-pointer is not the same thing as a two-dimensional array.)
The second piece of code does allocate a contiguous block of memory, but then you are treating it as if it was a pointer-to-pointer, which is still not the same thing.
Oh, and actually, both snippets have a common error: the act of casting the return value of malloc().

Proper argument for malloc

I have always used the malloc function as, for example,
int size = 10000;
int *a;
a = malloc(size * sizeof(int));
I recently ran into a piece of code that discards the sizeof(int) part, i.e.
int size = 10000;
int *a;
a = malloc(size);
This second code seems to be working fine.
My question is then, which form is correct? If the second form is, am I allocating needless space with the first form?
The argument to malloc is the number of bytes to be allocated. If you need space for an array of n elements of type T, call malloc(n * sizeof(T)). malloc does not know about types, it only cares about bytes.
The only exception is that when you allocate space for (byte/char) strings, the sizeof can be omitted because sizeof(char) == 1 by definition in C. Doing something like
int *a = malloc(10000);
a[9000] = 0;
may seem to work now, but actually invokes undefined behavior wherever sizeof(int) > 1, because a[9000] then lies outside the 10000 bytes that were allocated.
malloc allocates a given number of bytes worth of memory, suitably aligned for any type. If you want to store N elements of type T, you need N * sizeof(T) bytes of aligned storage. Typically, T * p = malloc(N * sizeof(T)) provides that and lets you index the elements as p[i] for i in [0, N).
From the man page:
The malloc() function allocates size bytes and returns a pointer to the allocated memory.
The first form is correct.
Even if sizeof(int) on the machine you are targeting is one (which can happen on some DSPs where char is 16 bits or wider; on 8-bit microcontrollers int is typically 2 bytes) you still want your code to be readable and portable.
The reason the "second code seems to be working fine" is that you are lucky.
The version of malloc you are using might be returning a pointer to an area of memory that is larger than what you requested. No matter what is happening behind the scenes, the behavior may change if you switch to a different compiler, so you do not want to rely on it.
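A common way to keep the byte count right is to let the pointer itself supply the element size. A minimal sketch, with made-up helper names:

```c
#include <stdlib.h>

/* Number of bytes needed for n ints: this is what malloc has to be given. */
size_t bytes_for_ints(size_t n)
{
    return n * sizeof(int);
}

/* Hypothetical helper: room for n ints. sizeof *a follows the type of a,
   so the size stays correct even if the element type changes later. */
int *alloc_ints(size_t n)
{
    int *a = malloc(n * sizeof *a);
    return a;   /* NULL if the allocation failed */
}
```

With alloc_ints(10000), indices 0 through 9999 are valid. With malloc(10000) and a 4-byte int, only indices 0 through 2499 are.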

2-D array in single malloc call

int **arrayPtr;
arrayPtr = malloc(sizeof(int) * rows *cols + sizeof(int *) * rows);
In the above code, we are trying to allocate a 2D array in a single malloc call.
malloc takes a number of bytes and allocates memory for that many bytes,
but in the above case, how does malloc know that first it has to allocate a array of pointers, each of which pointer points to a one-dimensional array?
How does malloc work internally in this particular case?
2D arrays aren't the same as arrays of pointers to arrays.
int **arrayPtr doesn't define a 2D array. 2D arrays look like this:
int array[2][3]
And a pointer to the first element of this array would look like:
int (*array)[3]
which you can point to a block of memory:
int (*array)[3] = malloc(sizeof(int)*5*3);
Note how that's indexed:
array[x] would expand to *(array+x), so "x arrays of 3 ints forward".
array[x][y] would expand to *( *(array+x) + y), so "then y ints forward".
There's no intermediate array of pointers involved here, only one contiguous block of memory.
If you had an array of pointers to arrays (not the same as a 2D array, often done using int** ptr and a series of per-row mallocs), it would go like:
ptr[x] would expand to *(ptr+x), so "x pointers forward".
ptr[x][y] would expand to *( *(ptr+x) + y), so "follow that pointer, then y ints forward".
Mind the difference. Both are indexed with [x][y], but they are represented in a different way in memory and the indexing happens in a different manner.
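The different step sizes are easy to observe. Here is a small sketch (function names are hypothetical) that measures how far one increment moves each kind of pointer:

```c
#include <stddef.h>

/* Step of a pointer to a row of 3 ints: one whole row. */
size_t step_pointer_to_row(void)
{
    static int block[5][3];
    int (*array)[3] = block;
    return (size_t)((char *)(array + 1) - (char *)array);
}

/* Step of a pointer to pointer: one pointer, whatever the row length is. */
size_t step_pointer_to_pointer(void)
{
    static int *table[5];
    int **ptr = table;
    return (size_t)((char *)(ptr + 1) - (char *)ptr);
}
```

The first returns 3 * sizeof(int), the second sizeof(int *).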
how does malloc know that first it has to allocate a array of pointers, each of which pointer points to a one-dimensional array?
It doesn't; malloc simply allocates the number of bytes you specify, it has no working knowledge of how those bytes are structured into an aggregate data type.
If you're trying to dynamically allocate a multidimensional array, you have several choices.
If you're using a C99 or C2011 compiler that supports variable length arrays, you could simply declare the array as
int rows;
int cols;
...
rows = ...;
cols = ...;
...
int array[rows][cols];
There are a number of issues with VLAs, though; they don't work for very large arrays, they can't be declared at file scope, etc.
A secondary approach is to do something like the following:
int rows;
int cols;
...
rows = ...;
cols = ...;
...
int (*arrayPtr)[cols] = malloc(sizeof *arrayPtr * rows);
In this case, arrayPtr is declared as a pointer to an array of int with cols elements, so we're allocating rows arrays of cols elements each. Note that you can access each element simply by writing arrayPtr[i][j]; the rules of pointer arithmetic work the same way as for a regular 2D array.
If you aren't working with a C compiler that supports VLAs, you'll have to take a different approach.
You can allocate everything as a single chunk, but you'll have to access it as a 1-d array, computing the offsets like so:
int *arrayPtr = malloc(sizeof *arrayPtr * rows * cols);
...
arrayPtr[i * cols + j] = ...;
Or you can allocate it in two steps:
int **arrayPtr = malloc(sizeof *arrayPtr * rows);
if (arrayPtr)
{
int i;
for (i = 0; i < rows; i++)
{
arrayPtr[i] = malloc(sizeof *arrayPtr[i] * cols);
if (arrayPtr[i])
{
int j;
for (j = 0; j < cols; j++)
{
arrayPtr[i][j] = some_initial_value();
}
}
}
}
malloc() does not know that it needs to allocate an array of pointers to arrays. It simply returns a chunk of memory of the requested size. You can certainly do the allocation this way, but you'll need to initialize the first "row" (or last, or even a column instead of a row - however you want to do it) that are to be used as pointers so that they point to the appropriate area within that chunk.
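For illustration, a sketch of how the single-malloc layout from the question could be initialized. alloc_2d_single is a made-up name. The code assumes that int needs no stricter alignment than int * (true on common platforms, but not promised by the standard) and that the size computation does not overflow.

```c
#include <stdlib.h>

/* Hypothetical helper: one malloc holding `rows` row pointers followed by
   rows * cols ints. The row pointers are set to point into the int area. */
int **alloc_2d_single(size_t rows, size_t cols)
{
    int **p = malloc(rows * sizeof *p + rows * cols * sizeof **p);
    if (p == NULL)
        return NULL;

    int *data = (int *)(p + rows);     /* int area starts after the pointer table */
    for (size_t i = 0; i < rows; i++)
        p[i] = data + i * cols;

    return p;
}
```

Elements are then reachable as p[i][j], and a single free(p) releases the pointers and the data together.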
It would be better and more efficient to just do:
int *arrayPtr = malloc(sizeof(int)*rows*cols);
The downside to that is that you have to calculate the proper index on every use, but you could write a simple helper function to do that. You wouldn't have the "convenience" of using [] to reference an element, but you could have e.g. element(arrayPtr, x, y).
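One possible shape for such a helper is sketched below. The Grid struct and this particular signature for element are my own assumptions; the helper needs the column count from somewhere, so the sketch bundles it with the data pointer.

```c
#include <stddef.h>

/* Hypothetical wrapper that keeps the dimensions next to the flat buffer. */
typedef struct {
    size_t rows;
    size_t cols;
    int *data;      /* rows * cols ints, stored row after row */
} Grid;

/* Returns a pointer to element (x, y); the caller must keep x < rows
   and y < cols. */
int *element(Grid *g, size_t x, size_t y)
{
    return &g->data[x * g->cols + y];
}
```

Usage would look like *element(&g, 1, 2) = 9; in place of g_array[1][2] = 9;.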
I would re-direct your attention rather to question of "what does [] operator do?".
If you plan to access elements in your array via the [] operator, then you need to realize that it can only do offsetting based on the element's size, unless some array geometry info is supplied.
malloc has no provision for dimension info, and calloc is explicitly one-dimensional (an element count and an element size).
On the other hand, declared arrays (arr[3][4]) explicitly specify the dimensions to the compiler.
So to access dynamically allocated multi-dimensional arrays in the arr[i][j] fashion, you in fact allocate a series of 1D arrays of the target dimension size. You will need a loop to do that.
malloc returns a plain pointer to heap memory, with no information about geometry or data type. Thus [][] won't work on it directly; you'll need to do the offsetting manually.
So it's your call whether []-indexing is your priority, or the bulk allocation.
int **arrayPtr; does not point to a 2D array. It points to an array of pointers to int. If you want to create a 2D array, use:
int (*arrayPtr)[cols] = calloc(rows, sizeof *arrayPtr);

How allocate or free only parts of an array?

See this example:
int *array = malloc(10 * sizeof(int));
Is there a way to free only the first 3 blocks?
Or to have an array with negative indexes, or indexes that don't begin with 0?
You can't directly free the first 3 blocks. You can do something similar by reallocating the array smaller:
/* Shift array entries to the left 3 spaces. Note the use of memmove
* and not memcpy since the areas overlap.
*/
memmove(array, array + 3, 7 * sizeof(int));
/* Reallocate memory. realloc will "probably" just shrink the previously
* allocated memory block, but it's allowed to allocate a new block of
* memory and free the old one if it so desires.
*/
int *new_array = realloc(array, 7 * sizeof(int));
if (new_array == NULL) {
perror("realloc");
exit(1);
}
/* Now array has only 7 items. */
array = new_array;
As to the second part of your question, you can increment array so it points into the middle of your memory block. You could then use negative indices:
array += 3;
int first_int = array[-3];
/* When finished remember to decrement and free. */
free(array - 3);
The same idea works in the opposite direction as well. You can subtract from array to make the starting index greater than 0. But be careful: as @David Thornley points out, this is technically invalid according to the ISO C standard (forming a pointer that points before the start of the allocation is undefined behavior) and may not work on all platforms.
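The shift-and-shrink steps can be folded into one function. A minimal sketch: drop_front is a hypothetical name, and the case k >= n is deliberately left untouched so that realloc is never called with a size of zero.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: drops the first k of n ints and shrinks the block.
   Returns the pointer to use from now on (n - k elements remain). */
int *drop_front(int *array, size_t n, size_t k)
{
    if (array == NULL || k == 0 || k >= n)
        return array;                            /* nothing done in this sketch */

    /* memmove, not memcpy: source and destination overlap. */
    memmove(array, array + k, (n - k) * sizeof *array);

    int *smaller = realloc(array, (n - k) * sizeof *array);
    return smaller != NULL ? smaller : array;    /* old block stays valid on failure */
}
```

Call it as array = drop_front(array, 10, 3); and afterwards treat array as having 7 elements.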
You can't free part of an array - you can only free() a pointer that you got from malloc() and when you do that, you'll free all of the allocation you asked for.
As far as negative or non-zero-based indices, you can do whatever you want with the pointer when you get it back from malloc(). For example:
int *array = malloc(10 * sizeof(int));
array -= 2;
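This makes array usable with indices 2-11, although forming a pointer before the start of the allocation is technically undefined behavior in ISO C. For negative indices: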
int *array = malloc(10 * sizeof(int));
array += 10;
Now you can access this array like array[-1], array[-4], etc.
Be sure not to access memory outside your array. This sort of funny business is usually frowned upon in C programs and by C programmers.
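A variant that stays within the rules keeps the original pointer for free() and offsets a second pointer that stays inside the block. A small sketch (centered_sum is a made-up name):

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical demo: uses indices -5..4 over a 10-int block and returns the
   sum of the stored values (INT_MIN if allocation fails). */
int centered_sum(void)
{
    int *base = malloc(10 * sizeof *base);
    if (base == NULL)
        return INT_MIN;

    int *mid = base + 5;          /* points inside the block: well defined */
    for (int i = -5; i < 5; i++)
        mid[i] = i;               /* mid[-5] is base[0], mid[4] is base[9] */

    int sum = 0;
    for (int i = -5; i < 5; i++)
        sum += mid[i];

    free(base);                   /* free what malloc returned, not mid */
    return sum;
}
```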
