The basic question is:
For code that expects a pointer to pointer that will be syntactically indexed like a 2-dimensional array, is there a valid way to create such an array using a single allocation?†
While on the surface, it seems I am asking for how to do so (like in this question), I already understand how it could be done (see below). The problem is that there might be an alignment issue.‡
The rest of the text describes some alternatives, and then explains the method under question.
† Olaf points out a pointer to pointer is not a 2-dimensional array. The premise of the question is that 3rd party code expects a pointer to pointer passed in, and the 3rd party code will index it as a 2-dimensional array.
‡ ErikNyquist presented a possible duplicate which explains how one might perform such an allocation, but I am questioning the validity of the technique with regards to alignment of the data.
If I need to dynamically allocate a multi-dimensional array, I typically use a single allocation call to avoid an iteration when I want to free the array later.
If VLA is available, I might code it like this:
int n, m;
n = initialize_n();
m = initialize_m();
double (*array)[m] = malloc(n * sizeof(*array));
array[i][j] = x;
Without VLA, I either rely on a macro on a structure for array accesses, or I add space for the pointer table for code that expects the ** style of two-dimensional array. The macro approach would look like:
struct array_2d {
int n, m;
double data[];
};
// a2d is a struct array_2d *
#define GET_2D(a2d, i, j) (a2d)->data[(i) * (a2d)->m + (j)]
struct array_2d *array = malloc(sizeof(*array) + n * m * sizeof(double));
array->n = n;
array->m = m;
GET_2D(array, i, j) = x;
The pointer table method is more complicated, because it requires a loop to initialize the table.
struct array_p2d {
int n, m;
double *data[];
};
#define GET_P2D(a2d, i, j) (a2d)->data[i][j]
struct array_p2d *array = malloc(sizeof(*array) + n * sizeof(double *)
+ n * m * sizeof(double));
for (k = 0; k < n; ++k) {
array->data[k] = (double *)&array->data[n] + k * m;
}
GET_P2D(array, i, j) = x;
// array->data can also be passed to a function wanting a double **
The problem with the pointer table method is that there might be an alignment issue. As long as whatever type the array is of does not have stricter alignment requirements than a pointer, the code should work.
Is the above expected to always work? If not, is there a valid way to achieve a single allocation for a pointer to pointer style 2-dimensional array?
Well, your malloc allocation is guaranteed to be aligned (unless you're using a non-standard alignment), so all you need to do is to round up the pointer table size to the alignment of the data segment:
const size_t pointer_table_size = n * sizeof(double *);
const size_t data_segment_offset = pointer_table_size +
((_Alignof(double) - (pointer_table_size % _Alignof(double))) % _Alignof(double));
double **array = malloc(data_segment_offset + n * m * sizeof(double));
double *data = (double *)((char *)array + data_segment_offset);
for (int i = 0; i != n; ++i)
array[i] = data + (m * i);
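For reference, here is the same idea as a self-contained sketch. The function name alloc_matrix_single and the overflow checks are my own additions for illustration, not part of the question; it assumes C11 for _Alignof and that both dimensions are non-zero.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate an n-by-m matrix of double as one block.
 * Layout: [n row pointers][padding][n * m doubles].
 * Returns NULL on failure; release the result with a single free(). */
double **alloc_matrix_single(size_t n, size_t m)
{
    const size_t align = _Alignof(double);

    if (n == 0 || m == 0)
        return NULL;
    /* Reject dimensions whose byte counts would overflow size_t. */
    if (n > (SIZE_MAX - align) / sizeof(double *) ||
        m > SIZE_MAX / sizeof(double) / n)
        return NULL;

    size_t table_size = n * sizeof(double *);
    /* Round the table size up to the next multiple of the data alignment. */
    size_t data_offset = (table_size + align - 1) / align * align;
    size_t data_size = n * m * sizeof(double);
    if (data_size > SIZE_MAX - data_offset)
        return NULL;

    double **rows = malloc(data_offset + data_size);
    if (rows == NULL)
        return NULL;

    /* The data segment starts data_offset bytes into the block. */
    double *data = (double *)((char *)rows + data_offset);
    for (size_t i = 0; i < n; ++i)
        rows[i] = data + i * m;
    return rows;
}
```

The returned pointer can be passed to code expecting a double **, indexed as rows[i][j], and released with one call to free(rows).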
I'm using this function to read the input matrix:
void leMatInt(int **M,int linhas,int colunas){
int i, j;
for (i = 0; i < linhas; i++){
for (j = 0; j < colunas; j++){
scanf("%d", &M[i][j]);
//printf("Leu [%d, %d]\n", i, j);
}
}
}
And I'm creating the matrix like this:
scanf("%d", &v1);
int **matriz1=(int **)malloc(v1 * sizeof(int));
for(i = 0;i < v1; i++){
matriz1[i] = (int *)malloc(v1 * sizeof(int));
}
leMatInt(matriz1, v1, v1);
The code works nicely for v1 <= 4, but if I try to input a 5x5 matrix, the code gets a runtime error inside the function.
matriz1 is a double pointer, so when allocating memory for it you should write sizeof(int*), because a ** pointer points to a block that holds single (*) pointers.
int **matriz1 = malloc(v1 * sizeof(int*));
for(i = 0;i < v1; i++){
matriz1[i] = malloc(v1 * sizeof(int));
}
Typecasting the result of malloc() is discouraged.
int **matriz1=malloc(v1 * sizeof(int*));
A double (**) pointer variable holds the address of a pointer to int.
matriz1 is a pointer to a pointer variable (a pointer to an int*).
So the memory it points to will contain int* values.
Also casting the return type of malloc is unnecessary and check the return value of malloc.
It's undefined behavior here. That's why it happens to work for sizes up to 4 but fails for a 5x5 matrix.
An example: (Considering a particular case)-
Although sizeof(int) is implementation-defined, it is usually 4
bytes, while the size of a pointer is usually 8 bytes on a 64-bit platform.
To hold 5 int* variables you therefore need 40 bytes, but you are allocating
only 20. You are allocating less memory than you need and then
accessing memory that you are not permitted to access, which invokes undefined
behavior.
int **matriz1=(int **)malloc(v1 * sizeof(int));
This line is wrong. As already stated by other answers, the elements pointed to by matriz1 are not integers but pointers to integers, so using sizeof with a type name should be sizeof(int *) here.
Also, casting the return value of malloc() is unnecessary and arguably bad style in C.
To avoid errors like that, sizeof can also take an expression as its operand, in which case it uses the type of that expression. This comes in very handy with malloc() -- it should look like this:
int **matriz1 = malloc(v1 * sizeof *matriz1);
If you change the type of matriz1 later, sizeof *matriz1 still gives the correct element size -- it's the size of whatever matriz1 points to.
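Putting the pieces together, a minimal sketch of the allocation with the sizeof *ptr idiom and with the malloc() return values checked might look like this. The names alloc_int_matrix and free_int_matrix are made up for the example, and it assumes non-zero dimensions.

```c
#include <stdlib.h>

/* Hypothetical helper: release a matrix whose first nrows rows were allocated. */
void free_int_matrix(int **rows, size_t nrows)
{
    if (rows == NULL)
        return;
    for (size_t i = 0; i < nrows; ++i)
        free(rows[i]);
    free(rows);
}

/* Hypothetical helper: allocate nrows row pointers, then ncols ints per row. */
int **alloc_int_matrix(size_t nrows, size_t ncols)
{
    int **rows = malloc(nrows * sizeof *rows);        /* element size is sizeof(int *) */
    if (rows == NULL)
        return NULL;

    for (size_t i = 0; i < nrows; ++i) {
        rows[i] = malloc(ncols * sizeof *rows[i]);    /* element size is sizeof(int) */
        if (rows[i] == NULL) {
            free_int_matrix(rows, i);                 /* release the i rows already allocated */
            return NULL;
        }
    }
    return rows;
}
```

With this, the call site from the question would become something like int **matriz1 = alloc_int_matrix(v1, v1); followed by a NULL check before calling leMatInt.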
I am working in C, specifically on creating a matrix using pointers, and one thing that confuses me is that for a 2D array, matrix[i][j] is equal to
*(*(matrix+i)+j)
Does this mean that the element located at, say, position [3][3] is given by *(*(0+3)+3)?
More specifically, I'm coding a matrix in C by using the following code:
double** makeMatrix(unsigned int rows, unsigned int cols)
{
unsigned int i;
double** matrix;
matrix = (double** ) malloc(rows * sizeof(double *));
if (!matrix) { return NULL; }/* failed */
for (i = 0; i < rows; i++)
{
matrix[i] = (double *) malloc(cols*sizeof(double));
if (!matrix[i])
return NULL;
}
return matrix;
}
So, allocating memory to each i'th element within the array - is this the reason that we get *(*(matrix+i)+j) for the [i][j] - due to the fact that each element has its own memory block?
In C, there are no multidimensional arrays in the "true" sense (like in LISP, C# or C++/CLI). Instead, what you can declare is an array of arrays (or an array of pointers, where each pointer is assigned by malloc etc.). For instance:
int matrix[2][3];
defines a two-element array, where each element is of type array of three ints.
Now, when you refer to an individual array element, you first need to dereference into the inner array, then into the int object:
int value = matrix[1][2];
which is equivalent to:
int value = (*(*(matrix + 1) + 2));
a[i] is a syntactic sugar for *(a + i). That means that a[3][3] is equivalent to *(*(a + 3) + 3).
One other thing is notable, 3[a] is equivalent to *(3 + a) which is *(a + 3), which is a[3].
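To make the equivalence concrete, here is a small sketch that compares the addresses produced by the three spellings. The function name same_element and the fixed row length of 3 are choices made for this example only.

```c
#include <stdbool.h>

/* Hypothetical helper: returns true when subscript notation, explicit pointer
 * arithmetic, and the swapped-operand form all name the same element. */
bool same_element(int (*a)[3], int i, int j)
{
    int *by_subscript = &a[i][j];
    int *by_pointer   = *(a + i) + j;   /* address of *(*(a + i) + j) */
    int *by_swapped   = &j[i[a]];       /* i[a] is a[i], j[row] is row[j] */

    return by_subscript == by_pointer && by_pointer == by_swapped;
}
```

Calling same_element(matrix, 1, 2) on an int matrix[2][3] should return true, since all three forms are defined in terms of the same pointer addition.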
I want to use two-dimension array inside of some struct:
typedef struct{
int rows;
int cols;
another_struct *array[][];
}some_struct;
But it seems I can't declare a multidimensional array of incomplete type, so I chose to go with another_struct *array[0][0];
And allocate it this way:
some_struct *allocate_some_struct(int rows, int cols){
some_struct *p;
uint32_t length;
length = sizeof(some_struct) + rows * sizeof(another_struct *[cols]);
p = malloc(length);
p->rows = rows;
p->cols = cols;
return (p);
}
But whenever I try to access it this way: ((another_struct *[p->rows][p->cols])p->array)[i], I get this error: used type 'another_struct *[p->rows][p->cols]' where arithmetic or pointer type is required.
Although (*((another_struct *(*)[p->rows][p->cols])&(p->array)))[i] works perfectly fine.
So my question is: why can't I use the first syntax? Is there a fundamental difference from the second one?
In C typing is static, which means that every type must be completely known when you operate on it (by the time compilation has finished). For a two-dimensional array, this means that all the dimensions must be known for the language to be able to access the individual cells. Array access uses a formula that needs the size of whatever each index steps over: for a single cell that is the cell size, but for an array of cells you must also know how many cells there are in that direction.
But, there's a workaround that allows you to use indexing with the [] brackets, and doesn't need to know any size but the size of an individual cell. You have to use pointers, as in this example:
double **new_matrix(int rows, int cols)
{
double **res = malloc(rows * sizeof(double *));
int i;
for (i = 0; i < rows; i++)
res[i] = malloc(cols * sizeof(double));
return res;
}
void free_matrix(double **matrix, int rows)
{
int i;
for (i = 0; i < rows; i++) free(matrix[i]);
free(matrix);
}
...
double **matrix = new_matrix(24, 3);
matrix[12][1] /* will access correctly row 13 and column 2 element */
...
free_matrix(matrix, 24); /* will free all allocated memory */
There are solutions that allow you to allocate the whole matrix and the pointers in one bunch (and allow you to use free(3) directly on the matrix thing), but I leave this as an exercise to the reader :)
Is this the correct method to define a 5*3 matrix using double pointers?
int **M1;
M1 = (int **)malloc(5 * sizeof(int *));
for (i=0;i<5;i++)
{
M1[i] = (int *)malloc(3 * sizeof(int));
}
If so, how can I assign M1[3][15] = 9 in the code and still get no error? And why am I getting a segmentation error in assigning M1[6][3]=2?
I understood after few such initializations that I created a 5*xx array, i.e. I couldn't go above 5th row but I could assign any value to the number of columns. How should I create just a 5*3 array?
In your code, you're allocating memory for 5 pointers
M1 = (int **)malloc(5 * sizeof(int *));
and later, you're trying to access beyond that, based on an unrelated value of m
for (i=0;i<m;i++)
when m goes beyond 4, you're essentially accessing out of bound memory.
A better way to allocate will be
int m = 5;
M1 = malloc(m * sizeof(*M1));
if (M1)
{
for (i=0;i<m;i++)
{
M1[i] = malloc(3 * sizeof(*M1[i]));
}
}
couldn't go above 5th row but I could assign any value to the number of columns.
No, you cannot. Accessing out-of-bounds memory invokes undefined behaviour, no matter how you do it.
Since Sourav tackled the UB case, I'll answer
How should I create just a 5*3 array?
Why not rely on automatic variables? Unless you've a compelling reason not to, use them
int matrix[5][3];
If you don't know the dimensions in advance, and would rather not do the double pointer manipulation, flatten it like this:
int *m = malloc(sizeof(int) * rows * cols);
// accessing anything from 0 to (rows * cols) - 1 is permitted
// helper to make usage easier
int get_element(int *m, int i, int j, int cols) {
return m[i * cols + j];
}
OTOH, if only the first dimension is unknown at compile time, then you may do:
typedef int (Int5) [5]; // cols known at compile-time
int rows = 3;
Int5 *r = malloc(rows * sizeof(Int5));
r[0][0] = 1; // OK
r[0][5] = 2; // warning: out of bounds access
With this method you get a bit more type safety because the compiler knows the size in advance.
Suppose we want to construct an array of structs, where the definition of the struct cannot be known at compile time.
Here is a SSCCE:
#include <stdlib.h>
int main(int argc, char *argv[]){
if (argc < 3) return 1;
int n = atoi(argv[1]);
int k = atoi(argv[2]);
if ((n < 1) || (k < 1)) return 2;
// define struct dynamically
typedef struct{
int a[n];
short b[k];
}data_point_t;
int m = 10;
// construct array of `m` elements
data_point_t *p = malloc(sizeof(data_point_t)*m);
// do something with the array
for(int i = 0; i < m; ++i) p[i].a[0] = p[i].b[0] = i;
free(p);
return 0;
}
This works fine with gcc (C99), however it doesn't with clang, which yields:
error: fields must have a constant size:
'variable length array in structure' extension will never be supported
So I'm obviously relying on a gcc extension. My question is, how to deal with this kind of problem in standard-conforming C99? (Bonus question: how to do this in C++11?)
Note: Performance matters, when iterating p there should be aligned memory access. Dereferencing pointers in the loop, yielding random memory access, is not an option.
I think your best bet is to drop the idea of wrapping the array in a structure, bite the bullet and allocate a 2D array yourself.
This will mean that you need to do explicit indexing, but that would have to happen under the hood anyway.
When it comes to alignment, if you're going to visit all n elements in each of the m arrays, it probably doesn't matter; it's better to keep them compact to maximize use of the cache.
Something like:
int *array = malloc(m * n * sizeof *array);
Then to index, just do:
// do something with the array
for(int i = 0; i < m; ++i)
{
for(int j = 0; j < n; ++j)
array[i * n + j] = j;
}
If you're very worried about that multiplication, use a temporary pointer. After profiling it, of course.
Sometimes you see this done with a helper macro to do the indexing:
#define INDEX(a, n, i, j) (a)[(i) * (n) + (j)]
then you can write the final line as:
INDEX(array, n, i, j) = j;
it's a bit clumsy since the n needs to go in there all the time, of course.
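As for the temporary pointer mentioned above, a minimal sketch could look like the following. The function name fill_rows is invented for the example, and whether this beats the plain i * n + j form is something only profiling can tell; compilers often hoist the multiplication on their own.

```c
#include <stddef.h>

/* Hypothetical helper: fill an m-by-n row-major array so that element (i, j) holds j. */
void fill_rows(int *array, size_t m, size_t n)
{
    for (size_t i = 0; i < m; ++i) {
        int *row = array + i * n;   /* one multiplication per row, not per element */
        for (size_t j = 0; j < n; ++j)
            row[j] = (int)j;
    }
}
```

The inner loop then touches consecutive ints, which keeps the access pattern sequential.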
First of all, it only makes sense to wrap the array inside a struct in the case there are other struct members present. If there are no other struct members, simply allocate an array.
If there are other struct members, then use a flexible array member to achieve what you want. Flexible array members are well-defined in the C standard and will work on every C99 compiler.
// define struct dynamically
typedef struct{
type_t the_reason_you_need_this_to_be_a_struct_and_not_an_array;
int a[]; // flexible array member
}data_point_t;
// construct array of `m` elements
int m = 10;
size_t obj_size = sizeof(data_point_t) + n*sizeof(int);
data_point_t *p = malloc(m * obj_size);
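One thing to watch with this layout: p[i] steps by sizeof(data_point_t), which does not include the flexible array, so the elements have to be located by hand using the per-object size. The sketch below shows one way to do that; record_size, record_at and the int header member are placeholders invented for the example, and it assumes C11 for _Alignof.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    int header;   /* stand-in for the other struct members */
    int a[];      /* flexible array member */
} data_point_t;

/* Hypothetical helper: size of one record holding n ints, rounded up so that
 * records placed back to back all stay suitably aligned. */
size_t record_size(size_t n)
{
    size_t align = _Alignof(data_point_t);
    size_t raw = sizeof(data_point_t) + n * sizeof(int);
    return (raw + align - 1) / align * align;
}

/* Hypothetical helper: address of record i in a block of equally sized records. */
data_point_t *record_at(void *block, size_t stride, size_t i)
{
    return (data_point_t *)((char *)block + i * stride);
}
```

A block for m records would then be allocated with malloc(m * record_size(n)) and element i reached through record_at(block, record_size(n), i)->a[j].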
In C++ you can of course use pointers much like you do now, but for a "proper" C++ solution the only viable solution is to use std::vector:
#include <cstddef>
#include <vector>
struct data_point_t
{
explicit data_point_t(const size_t sz)
: a(sz) // Construct the vector `a` with `sz` entries,
// each element will be zero initialized (`int()`)
{}
std::vector<int> a;
};
int main(int argc, char *argv[]){
// Read `n`...
int n = 10; // Just example
// Read `m`...
int m = 10; // Just example
// Construct vector of `m` elements
std::vector<data_point_t> p(m, data_point_t(n));
// Here the vector `p` contains `m` elements, where each instance
// has been initialized with a vector `a` with `n` elements
// All fully allocated and initialized
// Do something with the array
// ...
}
This is valid C++03 code, so unless you use something ancient (like Turbo C++) any compiler today should support it.