2-D array in single malloc call - c

int **arrayPtr;
arrayPtr = malloc(sizeof(int) * rows *cols + sizeof(int *) * rows);
In the above code, we are trying to allocate a 2D array in a single malloc call.
malloc takes a number of bytes and allocates memory for that many bytes,
but in the above case, how does malloc know that it first has to allocate an array of pointers, each of which points to a one-dimensional array?
How does malloc work internally in this particular case?

2D arrays aren't the same as arrays of pointers to arrays.
int **arrayPtr doesn't define a 2D array. 2D arrays look like this:
int array[2][3]
And a pointer to the first element of this array would look like:
int (*array)[3]
which you can point to a block of memory:
int (*array)[3] = malloc(sizeof(int)*5*3);
Note how that's indexed:
array[x] would expand to *(array+x), so "x arrays of 3 ints forward".
array[x][y] would expand to *( *(array+x) + y), so "then y ints forward".
There's no immediate array of pointers involved here, only one contiguous block of memory.
If you had an array of arrays (not the same as a 2D array, often done using int** ptr and a series of per-row mallocs), it would go like:
ptr[x] would expand to *(ptr+x), so "x pointers forward".
ptr[x][y] would expand to *( *(ptr+x) + y), so "follow that pointer, then y ints forward".
Mind the difference. Both are indexed with [x][y], but they are represented in a different way in memory and the indexing happens in a different manner.
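To make the contiguous case concrete, here is a minimal sketch. The helper name block_offset is made up for illustration; it allocates the same "5 arrays of 3 ints" block as above and reports how far element [x][y] sits from the start of the block, measured in ints.

```c
#include <stddef.h>
#include <stdlib.h>

/* Distance, in ints, from the start of a malloc'ed block viewed as
 * "5 arrays of 3 ints" to element [x][y] (x < 5, y < 3).
 * Returns -1 if the allocation fails. */
long block_offset(size_t x, size_t y)
{
    int (*array)[3] = malloc(sizeof *array * 5);
    if (array == NULL)
        return -1;

    /* Measure the distance in bytes through char pointers, then
     * convert it to a number of ints. Nothing is dereferenced. */
    ptrdiff_t bytes = (char *)&array[x][y] - (char *)array;
    long offset = (long)(bytes / (ptrdiff_t)sizeof(int));

    free(array);
    return offset;
}
```

The result is always x * 3 + y, which is exactly the "x arrays of 3 ints forward, then y ints forward" arithmetic. With an int ** and per-row mallocs there is no such formula, because each row lives wherever its own malloc put it.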

how does malloc know that it first has to allocate an array of pointers, each of which points to a one-dimensional array?
It doesn't; malloc simply allocates the number of bytes you specify, it has no working knowledge of how those bytes are structured into an aggregate data type.
If you're trying to dynamically allocate a multidimensional array, you have several choices.
If you're using a C99 or C2011 compiler that supports variable length arrays, you could simply declare the array as
int rows;
int cols;
...
rows = ...;
cols = ...;
...
int array[rows][cols];
There are a number of issues with VLAs, though; they don't work for very large arrays, they can't be declared at file scope, etc.
A secondary approach is to do something like the following:
int rows;
int cols;
...
rows = ...;
cols = ...;
...
int (*arrayPtr)[cols] = malloc(sizeof *arrayPtr * rows);
In this case, arrayPtr is declared as a pointer to an array of int with cols elements, so we're allocating rows arrays of cols elements each. Note that you can access each element simply by writing arrayPtr[i][j]; the rules of pointer arithmetic work the same way as for a regular 2D array.
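A minimal sketch of that approach, assuming a compiler with VLA support (C99, or C11 without __STDC_NO_VLA__). The function name fill_and_sum and the fill pattern are only there for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Allocates rows x cols ints in one block, stores i * cols + j in
 * element [i][j], and returns the sum of all elements.
 * Returns -1 if the allocation fails. */
long fill_and_sum(size_t rows, size_t cols)
{
    /* Pointer to an array of cols ints; cols is a run-time value. */
    int (*arrayPtr)[cols] = malloc(sizeof *arrayPtr * rows);
    if (arrayPtr == NULL)
        return -1;

    long sum = 0;
    for (size_t i = 0; i < rows; i++) {
        for (size_t j = 0; j < cols; j++) {
            arrayPtr[i][j] = (int)(i * cols + j);
            sum += arrayPtr[i][j];
        }
    }

    free(arrayPtr);   /* one malloc, so one free */
    return sum;
}
```

Only the pointer type is variably modified here; the storage itself comes from malloc, so the size limits that apply to a VLA with automatic storage do not apply.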
If you aren't working with a C compiler that supports VLAs, you'll have to take a different approach.
You can allocate everything as a single chunk, but you'll have to access it as a 1-d array, computing the offsets like so:
int *arrayPtr = malloc(sizeof *arrayPtr * rows * cols);
...
arrayPtr[i * cols + j] = ...;
Or you can allocate it in two steps:
int **arrayPtr = malloc(sizeof *arrayPtr * rows);
if (arrayPtr)
{
    int i;
    for (i = 0; i < rows; i++)
    {
        arrayPtr[i] = malloc(sizeof *arrayPtr[i] * cols);
        if (arrayPtr[i])
        {
            int j;
            for (j = 0; j < cols; j++)
            {
                arrayPtr[i][j] = some_initial_value();
            }
        }
    }
}
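The two-step version also needs a matching cleanup, including the case where a row allocation fails halfway through. A possible sketch follows; alloc_rows and free_rows are hypothetical helper names, and the rows are zero-filled with calloc rather than with some_initial_value().

```c
#include <stddef.h>
#include <stdlib.h>

/* Frees the first `rows` row buffers and then the pointer table. */
void free_rows(int **table, size_t rows)
{
    if (table == NULL)
        return;
    for (size_t i = 0; i < rows; i++)
        free(table[i]);
    free(table);
}

/* Allocates a table of `rows` pointers, each pointing to `cols`
 * zeroed ints. Returns NULL, with nothing leaked, on failure. */
int **alloc_rows(size_t rows, size_t cols)
{
    int **table = malloc(sizeof *table * rows);
    if (table == NULL)
        return NULL;

    for (size_t i = 0; i < rows; i++) {
        table[i] = calloc(cols, sizeof *table[i]);
        if (table[i] == NULL) {
            free_rows(table, i);   /* release rows 0..i-1 and the table */
            return NULL;
        }
    }
    return table;
}
```

The caller indexes the result as table[i][j] and later calls free_rows(table, rows).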

malloc() does not know that it needs to allocate an array of pointers to arrays. It simply returns a chunk of memory of the requested size. You can certainly do the allocation this way, but you'll need to initialize the first "row" (or last, or even a column instead of a row - however you want to do it) that are to be used as pointers so that they point to the appropriate area within that chunk.
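For illustration, here is roughly what that initialization could look like when the pointers are placed at the front of the chunk. This is a sketch, not the only layout: the name alloc_2d_single is made up, and it assumes that int needs no stricter alignment than int *, which holds on common platforms but is not guaranteed by the C standard.

```c
#include <stddef.h>
#include <stdlib.h>

/* One malloc: `rows` row pointers followed by rows * cols ints.
 * Assumption: the int data is suitably aligned right after the
 * pointer table (true when int is no more aligned than int *). */
int **alloc_2d_single(size_t rows, size_t cols)
{
    int **arrayPtr = malloc(sizeof(int *) * rows + sizeof(int) * rows * cols);
    if (arrayPtr == NULL)
        return NULL;

    /* The int data starts right after the last row pointer. */
    int *data = (int *)(arrayPtr + rows);
    for (size_t i = 0; i < rows; i++)
        arrayPtr[i] = data + i * cols;   /* row i begins i * cols ints in */

    return arrayPtr;   /* release with a single free(arrayPtr) */
}
```

After this setup arrayPtr[i][j] works as usual, but only because the loop filled in the pointers; malloc itself did nothing beyond handing back the bytes.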
It would be better and more efficient to just do:
int *arrayPtr = malloc(sizeof(int)*rows*cols);
The downside to that is that you have to calculate the proper index on every use, but you could write a simple helper function to do that. You wouldn't have the "convenience" of using [] to reference an element, but you could have e.g. element(arrayPtr, x, y).
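Such a helper might look like the sketch below. Note one difference from the element(arrayPtr, x, y) call mentioned above: this version takes the column count as an extra parameter, since the flat pointer does not carry it. That signature is my assumption, not part of the original suggestion.

```c
#include <stddef.h>

/* Returns a pointer to element (x, y) of a flat, row-major array
 * that has `cols` ints per row. No bounds checking is done. */
int *element(int *arrayPtr, size_t cols, size_t x, size_t y)
{
    return &arrayPtr[x * cols + y];
}
```

Returning a pointer lets the same helper serve for reads and writes: *element(arrayPtr, cols, x, y) = 5; or int v = *element(arrayPtr, cols, x, y);.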

I would re-direct your attention rather to question of "what does [] operator do?".
If you plan to access elements in your array via [] operator, then you need to realize that it can only do off-setting based on element's size, unless some array geometry info is supplied.
malloc does not have provisions for dimension info, calloc - explicitly 1D.
On the other hand, declared arrays (arr[3][4]) explicitly specify the dimensions to the compiler.
So to access dynamically alloc'ed multi-D arrays in a fashion arr[i][j], you in fact allocate the series of 1D-arrays of the target dimension size. You will need to loop to do that.
malloc returns a plain pointer to heap memory, with no information about geometry or data type. Thus [][] won't work on it directly; you'll need to do the offsetting manually.
So it's your call whether []-indexing is your priority, or the bulk allocation.

int **arrayPtr; does not point to a 2D array. It points to an array of pointers to int. If you want to create a 2D array, use:
int (*arrayPtr)[cols] = calloc(rows, sizeof *arrayPtr);

Related

**array vs array[][]: are they both 2D arrays?

My teacher told me that int **array is not a 2D array, it is just a pointer to a pointer to an integer.
Now, in one of my projects I have to dynamically allocate a 2D array of structs, and this is how it is done:
struct cell **array2d = (struct x **)calloc(rows, sizeof(struct x *));
for (int i = 0; i < columns; i++) {
array2d[i] = (struct x *)calloc(j, sizeof(struct x));
}
But here we return a pointer to a pointer to the struct, so how is this a 2D array?
Before using dynamic allocation, I had a statically allocated array of the form:
array2d[][]
Now that I replaced it by dynamic allocation, I also replaced array2d[][] by **array2d.
Every function that takes array2d[i][j] as an argument now returns an error saying the types don't match.
Could someone explain to me what is happening here? What is the difference between **array and array[m][n] and why is the compiler complaining?
They're thoroughly different things.
An array is a sequence of values of the same type stored one after another in memory.
In C, an array is more or less interchangeable with a pointer to its first element — a[0] is *a,
a[1] is *(a+1), etc. — at least when we're talking about one-dimensional arrays.
But now consider:
int a[3][3];
in this case, a contains nine elements, contiguous in memory. a[0][0] through a[0][2], then a[1][0] immediately after, up until a[2][2].
If you pass a to a function, it would fit into a parameter type of int (*)[3] or int[3][3] or int [][3] (knowing the "stride" of the second dimension is absolutely necessary to doing the math to look up a given element).
On the other hand:
int *b[3];
b[0] = malloc(...);
b[1] = malloc(...);
b[2] = malloc(...);
in this case, b is an array of 3 elements, each of which is a pointer to the first element of a separately allocated array. You still access it like b[0][0] or b[1][2], but something completely different is happening under the hood. The elements aren't all stored contiguously in memory, and *b isn't any of them, it's a pointer. If we were to pass b to a function, we would receive it with a parameter of type int ** or int *[]. Knowing the length of each row in advance isn't necessary, and in fact each row could have a different length from the others. Some of the rows could even be null pointers, with no storage behind them for integers.
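As a rough sketch of how the two parameter types differ in practice (the function names sum_2d and sum_rows are invented for this example): the first function relies on the column count being part of the type, the second has to be told each row's length and has to cope with null rows.

```c
#include <stddef.h>

/* Receives a true 2D array: the column count 3 is part of the type,
 * so the compiler knows the stride between rows. */
int sum_2d(int a[][3], size_t rows)
{
    int sum = 0;
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < 3; j++)
            sum += a[i][j];
    return sum;
}

/* Receives an array of row pointers: the type says nothing about row
 * lengths, so they are passed separately and may differ per row. */
int sum_rows(int **b, size_t rows, const size_t *row_len)
{
    int sum = 0;
    for (size_t i = 0; i < rows; i++) {
        if (b[i] == NULL)
            continue;            /* a row with no storage behind it */
        for (size_t j = 0; j < row_len[i]; j++)
            sum += b[i][j];
    }
    return sum;
}
```

Passing an int a[3][3] to sum_rows, or an int *b[3] to sum_2d, is the kind of type mismatch the compiler complains about in the question.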

Two dimension arrays and pointer representation

Three questions in 1.
If I have a 2-D array -
int array_name[num_rows][num_columns]
So it consists of num_rows arrays -each of which is an array of size = num_columns. Its equivalent representation using an array of pointers is-
int* array_name[num_rows]
-so the index given by [num_rows] still shows the number of 1-D arrays - somewhere using malloc we can then specify the size of each of the 1-D arrays as num_columns. Is that right? I saw some texts telling
int* array_name[num_columns]
will the indices not get switched in this case ?
For a 1-D array I specify size dynamically as-
int *p;
p = (int*) malloc (size * sizeof(int))
For 2 D arrays do I specify the size of entire 2-D array or of one 1-D array in malloc -
int*p [row_count];
p = (int*) malloc (row_count * column_count * sizeof(int))
or
p = (int*) malloc (column_count * sizeof(int))
I think it should be the second since p is a pointer to a 1-D array and p+1 is a pointer to a 1-D array etc. Please clarify.
For ques 2 - what if p was defined as -
int **p; rather than
int * p[row_count]
How will the malloc be used then? I think it should be -
p = (int*) malloc (row_count * column_count * sizeof(int))
Please correct, confirm, improve.
Declaring :
int *array_name[num_rows];
or :
int *array_name[num_columns];
is the same thing. Only the name changes, but your variable still refers to rows because C is row-major, so you should name it after the rows.
Here is how to allocate a 2D array :
int (*p)[column] = malloc(sizeof(int[row][column]));
An int ** can be allocated whereas int [][] is a temporary array defined only in the scope of your function.
Don't forget that a semicolon is needed at the end of nearly every line.
You should read this page for a more complete explanation of the subject
(and 2.)
If I have a 2-D array
int array_name[num_rows][num_columns];
So it consists of num_rows arrays -each of which is an array of size = num_columns.
If both num_rows and num_columns are known at compile time, that line declares an array of num_rows arrays of num_columns ints, which, yes, is commonly referred as a 2D array of int.
Since C99 (and optionally in C11) you can use two variables unknown at compile time and end up declaring a Variable-Length Array, instead.
Its equivalent representation using an array of pointers is
int* array_name[num_rows];
So the index given by [num_rows] still shows the number of 1-D arrays - somewhere using malloc we can then specify the size of each of the 1-D arrays as num_columns. Is that right?
Technically, now array_name is declared as an array of num_rows pointers to int, not arrays. To "complete" the "2D array", one should traverse the array and allocate memory for each row. Note that the rows could have different sizes.
Using this form:
int (*array_name)[num_columns];
//  ^           ^ note the parentheses
array_name = malloc(num_rows * sizeof *array_name);
Here, array_name is declared as a pointer to an array of num_columns ints and then the desired number of rows is allocated.
3.
what if p was defined as int **p;
The other answers show how to allocate memory in this case, but while it is widely used, it isn't always the best solution. See e.g.:
Correctly allocating multi-dimensional arrays

Allocate 6xNxN array

I have a variable N. I need a 6xNxN array.
Something like this:
int arr[6][N][N];
But, obviously, that doesn't work.
I'm not sure how I'd go about allocating this so that I can access, e.g. arr[5][4][4] if N is 5, and arr[5][23][23] if N is 24.
Note that N will never change, so I'll never have to reallocate arr.
What should I do? Will int ***arr = malloc(6 * N * N * sizeof(int)); work?
You can allocate your 3-dimensional array on the heap as
int (*arr)[N][N] = malloc(sizeof(int[6][N][N]));
After use, you can free as
free(arr);
Another way of writing the same, as suggested by @StoryTeller, is:
int (*arr)[N][N] = malloc(6u * sizeof(*arr));
But here you need to be careful about the u after 6 to prevent signed arithmetic overflow.
Also, there can still be issues on platforms where size_t is narrower than int, as pointed out by @chqrlie, but that won't be the case on most commonly used platforms, so you are fine using it.
int arr[6][N][N]; will work just fine. You merely need to update your compiler and C knowledge to the year 1999 or later, when variable-length arrays (VLA) were introduced to the language.
(If you have an older version of GCC than 5.0, you must explicitly tell it to not use an ancient version of the C standard, by passing -std=c99 or -std=c11.)
Alternatively if you need heap allocation, you can do:
int (*arrptr)[Y][Z] = malloc( sizeof(int[X][Y][Z]) );
You cannot do int ***arr = malloc(6 * N * N * sizeof(int)); since a int*** cannot point at a 3D array. In general, more than two levels of indirection is a certain sign that your program design is completely flawed.
Detailed info here: Correctly allocating multi-dimensional arrays.
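A small sketch of the heap version with the 6 x N x N shape from the question, assuming VLA support. The function name last_element_roundtrip is invented; it writes through the 3D view and reads the same int back through a flat pointer to show that both address one contiguous block.

```c
#include <stddef.h>
#include <stdlib.h>

/* Allocates a 6 x n x n block, stores `value` in the last element
 * via arr[5][n-1][n-1], and reads it back through a flat int pointer.
 * Returns the value read, or -1 if n is 0 or the allocation fails. */
int last_element_roundtrip(size_t n, int value)
{
    if (n == 0)
        return -1;

    int (*arr)[n][n] = malloc(sizeof(int[6][n][n]));
    if (arr == NULL)
        return -1;

    arr[5][n - 1][n - 1] = value;

    int *flat = (int *)arr;              /* same memory, seen as 6*n*n ints */
    int read_back = flat[6 * n * n - 1]; /* index of the very last int */

    free(arr);
    return read_back;
}
```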
What you want can't work directly. For indexing a multi-dimensional array, all but the very first dimension need to be part of the type and here's why:
The indexing operator operates on pointers by first adding an index to the pointer and then dereferencing it. The identifier of an array evaluates to a pointer to its first element (except when e.g. used with sizeof, _Alignof and &), so indexing on arrays works as you would expect.
It's very simple in the case of a single-dimension array. With
int a[42];
a evaluates to a pointer of type int * and indexing works the following way: a[18] => *(a + 18).
Now in a 2-dimensional array, all the elements are stored contiguously ("row" after "row" if you want to understand it as a matrix), and what's making the indexing "magic" work is the types involved. Take for example:
int a[16][42];
Here, the elements of a have the type int [42] (42-element array of int). According to the rules above, evaluating an expression of this type in most contexts again yields an int * pointer. But what about a itself? Well, it's an array of int [42], so a will evaluate to a pointer to a 42-element array of int: int (*)[42]. Then let's have a look at what the indexing operator does:
a[3][18] => *(*(a + 3) + 18)
With a evaluating to the address of its first row, with type int (*)[42], the inner addition of 3 advances the address by 3 * 42 * sizeof(int) bytes. This would be impossible if the second dimension wasn't known in the type.
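The role of the types can be checked with a few lines of C. This is only an illustrative sketch with made-up function names: it measures how many bytes adding 1 skips for a (pointer to a row) and for a[0] (pointer to an int), and confirms that the subscript form and the expanded form name the same element.

```c
#include <stddef.h>

/* Bytes skipped by a + 1, where a converts to int (*)[42]. */
size_t row_stride_bytes(void)
{
    int a[16][42];
    return (size_t)((char *)(a + 1) - (char *)a);
}

/* Bytes skipped by a[0] + 1, where a[0] converts to int *. */
size_t element_stride_bytes(void)
{
    int a[16][42];
    return (size_t)((char *)(a[0] + 1) - (char *)a[0]);
}

/* Returns 1 if a[3][18] and *(*(a + 3) + 18) are the same element. */
int index_matches_expansion(void)
{
    static int a[16][42];
    a[3][18] = 7;
    return &a[3][18] == *(a + 3) + 18 && *(*(a + 3) + 18) == 7;
}
```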
I guess it's simple to deduce the example for the n-dimensional case.
In your case, you have two possibilities to achieve something similar to what you want.
Use a dynamically allocated flat array with size 6*N*N. You can calculate the indices yourself if you save N somewhere.
Somewhat less efficient, but yielding better readable code, you could use an array of pointers to arrays of pointers to int (multiple indirection). You could e.g. do
int ***a = malloc(6 * sizeof *a);
for (size_t i = 0; i < 6; ++i)
{
    a[i] = malloc(N * sizeof *(a[i]));
    for (size_t j = 0; j < N; ++j)
    {
        a[i][j] = malloc(N * sizeof *(a[i][j]));
    }
}
// add error checking to malloc calls!
Then your accesses will look just like those to a normal 3d array, but it's stored internally as many arrays with pointers to the other arrays instead of in a big contiguous block.
I don't think it's worth using this many indirections, just to avoid writing e.g. a[2*N*N+5*N+4] to access the element at 2,5,4, so my recommendation would be the first method.
Making a simple change to the declaration on this line and keeping the malloc can easily solve your problem.
int ***arr = malloc(6 * N * N * sizeof(int));
However, int *** is unnecessary (and wrong). Use a flat array, which is easy to allocate:
int *flatarr = malloc(6 * N * N * sizeof(int));
This works for three dimensions, and instead of accessing arr[X][Y][Z] as in the question, you access flatarr[(X*N*N) + (Y*N) + Z]. In fact, you could even write a handy macro:
#define arr(X,Y,Z) flatarr[((X)*N*N) + ((Y)*N) + (Z)]
This is basically what I've done in my language Cubically to allow for multiple-size cubes. Thanks to Programming Puzzles & Code Golf user Dennis for giving me this idea.
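If you would rather not have a macro that silently picks up N and flatarr from the surrounding scope, a plain function does the same job. This is a sketch with hypothetical names (idx3, alloc_cube), and it passes the dimension explicitly.

```c
#include <stddef.h>
#include <stdlib.h>

/* Flat index of element (x, y, z) in a 6 x n x n row-major block.
 * Equivalent to x*n*n + y*n + z. */
size_t idx3(size_t n, size_t x, size_t y, size_t z)
{
    return (x * n + y) * n + z;
}

/* Allocates the zero-initialised flat block of 6 * n * n ints.
 * Returns NULL on failure; release the block with free(). */
int *alloc_cube(size_t n)
{
    return calloc(6 * n * n, sizeof(int));
}
```

With int *flatarr = alloc_cube(N); the element the question calls arr[5][4][4] becomes flatarr[idx3(N, 5, 4, 4)].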

Writing a value at the end of a 2D array

I have a 2D array. The 1st dimension has a fixed size and I dynamically create the 2nd dimension, e.g.
int **arr;
*arr=( int * ) malloc ( X * sizeof ( int )) // X is input from user
What I want to do is create the 2nd dimension and write a value at the end of it.
For example, I create the 2nd dimension for arr[0] using
arr[0]=( int * ) malloc ( 2 * sizeof ( int ))
and then I want to write a value into this 2nd dimension without knowing the index. A lot of programming languages have an array.push method, which pushes an item onto the end of the array without knowing the index or length of the array. How can I achieve such a result in C? Is it possible?
Short answer: NO. You need to know the last index of the memory allocated via a pointer. There is no way of knowing the memory allocated for a pointer, so you need to know the last index for each column. Applying sizeof on a pointer gives you the memory occupied by a pointer (most often 4 or 8 bytes), and not the memory allocated by the pointer. This is a fundamental difference between pointers and arrays. They are not the same, although arrays decay to pointers when passed as arguments to functions.
Assuming your 2D array has NROWS and NCOLS, what you need is:
arr = malloc(NROWS * sizeof(int*)); // allocate memory for first dim, i.e. for rows
then allocate memory for each row, e.g. for the 5-th row:
arr[5] = malloc(NCOLS * sizeof(int)); // allocate NCOLS for the 5-th row
In general you allocate memory for the second dimension in a loop:
for(size_t i = 0; i < NROWS; ++i)
    arr[i] = malloc(NCOLS * sizeof(int)); // allocate NCOLS ints for row i
Then don't forget to release the memory at the end, in reverse order:
for(size_t i = 0; i < NROWS; ++i)
    free(arr[i]); // release memory for the columns of row i
free(arr); // release memory for the row pointers
However, I recommend you use a 1D array instead (in case the dimension is the same for each column), and map from 1D to 2D and vice versa. It's better this way since the data is stored contiguously, which gives better data locality and fewer cache misses.
If you switch to C++, then you can use the standard container std::vector, which "knows" its indexes, and you can add at the end via std::vector::push_back or std::vector::emplace_back member functions.
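If you stay in C, the usual workaround is to store the length next to the pointer yourself. Below is a minimal sketch of a push operation for one row; the struct int_row type, the int_row_push name and the doubling growth policy are all my own choices for illustration, not something from the question.

```c
#include <stddef.h>
#include <stdlib.h>

/* A row that remembers its own length and capacity. */
struct int_row {
    int   *data;
    size_t len;   /* elements in use */
    size_t cap;   /* elements allocated */
};

/* Appends `value` at the end, growing the buffer when it is full.
 * Returns 0 on success, -1 if realloc fails (row left unchanged). */
int int_row_push(struct int_row *row, int value)
{
    if (row->len == row->cap) {
        size_t new_cap = row->cap ? row->cap * 2 : 4;
        int *grown = realloc(row->data, new_cap * sizeof *grown);
        if (grown == NULL)
            return -1;
        row->data = grown;
        row->cap = new_cap;
    }
    row->data[row->len++] = value;
    return 0;
}
```

Start from struct int_row row = {0}; (realloc on a null pointer behaves like malloc), push as often as needed, and call free(row.data) at the end. An array of such structs gives a fixed first dimension with a growable second one.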

Dynamic memory allocation for 2D array

I want to allot memory dynamically for a 2D array.
Is there any difference between these two ?
1)
array = (int**)malloc(size * sizeof(int*));
for (i = 0; i < size; i++) {
array[i] = (int *) malloc(size * sizeof(int));
}
2)
array = (int**)malloc(size *size* sizeof(int));
If yes, what is better to use and why ?
In the first case
array = (int**)malloc(size * sizeof(int*));
for (i = 0; i < size; i++) {
array[i] = (int *) malloc(size * sizeof(int));
}
you are allocating size extents of the size equal to size * sizeof( int ) That is you are allocating size one-dimensional arrays. Accordingly you are allocating size pointers that point to first elements of these one-dimensional arrays.
In the second case expression
(int**)malloc(size *size* sizeof(int))
means allocation of an extent of size * size objects of type int, and the returned pointer is interpreted as int **. So this expression makes no sense regardless of what is placed on the left side of the assignment. Take into account that the size of a pointer can be greater than the size of int.
You could write instead
int ( *array )[size] = ( int (*)[size] )malloc(size *size* sizeof(int));
In this case you are indeed allocating a two dimensional array provided that size is a constant expression.
Those two solutions are very different. The first will give you a vector of pointers to vectors. The second will give you a vector of the requested size. It all depends on your use case. Which do you want?
When it comes to releasing the memory, the first can only be freed by calling free for each pointer in the vector and then a final free on the vector itself. The second can be freed with a single call. Don't have that be your deciding reason to use one or the other. It all depends on your use case.
What is the type of the object you want to allocate? Is it an int **, an int *[] or an int[][]?
I want to allot memory dynamically for a 2 dimensional array.
Then just do
int (*arr)[size] = malloc(size * sizeof *arr);
Is there any difference between these two ?
Yes, they are wrong because of different errors. The first attempt does not allocate a 2D array, it allocates an array of pointers and then a bunch of arrays of ints. Hence the result will not necessarily be contiguous in memory (and anyway, a pointer-to-pointer is not the same thing as a two-dimensional array.)
The second piece of code does allocate a contiguous block of memory, but then you are treating it as if it was a pointer-to-pointer, which is still not the same thing.
Oh, and actually, both snippets have a common error: the act of casting the return value of malloc().
