I'm implementing a matrix library in C. I've already made a vector library, and I have defined a matrix as nothing but a collection of pointers to vectors (each pointer references a vector struct, which is one column of the matrix). I use a pointer to pointers instead of an array of pointers because a) I want jagged matrices to be possible, b) I want vector operations to work on the individual columns of the matrix, and c) I want the matrix itself to be dynamic.
Here is the matrix type definition:
typedef struct mat {
size_t alloc; // num of allocated bytes
size_t w, h; // dimensions for the matrix
vec** cols; // each column is a vector
} mat;
Suppose I want to resize the dimensions of the matrix. The following code works just fine for me:
void resizem(mat* m, size_t w, size_t h) {
m -> alloc = w * VECTOR_SIZE;
m -> cols = realloc(m -> cols, m -> alloc);
// if(w > m -> w) {
// memset(m -> cols + m -> w, init_vec(h), (w - (m -> w)) * VECTOR_SIZE);
// }
if(w > m -> w) {
for(int i = m -> w; i < w; i++) {
m -> cols[i] = init_vec(h);
};
}
for(int i = 0; i < w; i++) {
resizev(m -> cols[i], h);
};
m -> w = w;
m -> h = h;
}
My approach was as follows: 1) recompute the number of bytes to reallocate (number of columns * column size) and store it in the matrix struct, 2) reallocate the memory for the columns, 3) if the matrix grew in width, initialise each new column to a default vector, 4) resize each vector in the matrix.
Note the commented out lines however. Why can't I just add an offset (the size of the former matrix) to the column pointers and use memset on the remaining columns to initialise them to a default vector? When I run this it doesn't work so I use a for loop instead.
Note that if it helps at all here is the github link to the vector library so far: Github link
When you use memset(), you only call init_vec() once, so all the elements would point to the same vector.
Also, memset() assigns the value to each byte in the array. But your column table is supposed to contain pointers, not bytes.
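To make the byte-wise behaviour concrete, here is a minimal sketch. The helper name fill_with_memset is made up for this illustration; it only shows that memset repeats one byte value across the whole object rather than storing a multi-byte value.

```c
#include <string.h>

/* Hypothetical helper: fill one unsigned int with memset and return it. */
unsigned int fill_with_memset(int byte_value) {
    unsigned int x;
    /* memset converts byte_value to unsigned char and writes it into
       EVERY byte of x; it never stores a whole int or pointer. */
    memset(&x, byte_value, sizeof x);
    return x;
}
```

On a platform with a 32-bit unsigned int, fill_with_memset(1) yields 0x01010101, not 1. The same thing happens to a pointer value handed to memset: only its low byte is used, repeated across the table.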
BTW, your code leaks memory whenever the matrix shrinks, because you never free() the vectors in the columns that are dropped. You need to do this before calling realloc().
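Putting the three points together, a resize could look roughly like the sketch below. The vec type and the helpers init_vec, resize_vec and free_vec are simplified stand-ins invented for this example, not the ones from the linked library, and the alloc field is left out.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the library's vector type. */
typedef struct vec { size_t len; double *data; } vec;

static vec *init_vec(size_t len) {
    vec *v = malloc(sizeof *v);
    if (v == NULL) return NULL;
    v->data = calloc(len ? len : 1, sizeof *v->data);
    if (v->data == NULL) { free(v); return NULL; }
    v->len = len;
    return v;
}

static int resize_vec(vec *v, size_t len) {
    double *d = realloc(v->data, (len ? len : 1) * sizeof *d);
    if (d == NULL) return -1;
    for (size_t i = v->len; i < len; i++) d[i] = 0.0;  /* zero the new tail */
    v->data = d;
    v->len = len;
    return 0;
}

static void free_vec(vec *v) {
    if (v != NULL) { free(v->data); free(v); }
}

typedef struct mat { size_t w, h; vec **cols; } mat;

/* Returns 0 on success, -1 if an allocation fails. */
int resize_mat(mat *m, size_t w, size_t h) {
    /* Shrinking: free the vectors that fall off the end BEFORE realloc. */
    for (size_t i = w; i < m->w; i++)
        free_vec(m->cols[i]);
    if (w < m->w)
        m->w = w;

    /* The table holds pointers, so it is sized in units of sizeof(vec *). */
    vec **cols = realloc(m->cols, (w ? w : 1) * sizeof *cols);
    if (cols == NULL)
        return -1;
    m->cols = cols;

    /* Bring the surviving columns to the new height. */
    for (size_t i = 0; i < m->w; i++)
        if (resize_vec(m->cols[i], h) != 0)
            return -1;

    /* Growing: every new slot gets its OWN vector, one init_vec call each. */
    for (size_t i = m->w; i < w; i++) {
        vec *v = init_vec(h);
        if (v == NULL)
            return -1;
        m->cols[i] = v;
        m->w = i + 1;   /* m->w always counts the valid columns */
    }
    m->h = h;
    return 0;
}
```

Assigning the result of realloc to a temporary first keeps the old table reachable if the call fails.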
I can pass int arrays to both of these functions (1d and 2d arrays). What is the difference between them? With the second function you need to specify the size of the array. Why?
void foo(int *a, int cols)
void bar(int (*a)[N])
I have a program where I want to pass 2d int arrays to functions. Which is the better one to use and does it matter?
There are significant differences between these methods for 2D matrices:
In the first method you pass a pointer to int, which is not the proper type for the matrix, even after implicit decaying.
cols is an int argument in the first prototype, whereas it is a constant N in the second one. In both cases, the number of rows is not specified, so it must be implicit, either because it is constant or because the matrix is square.
You will need to write the index computations explicitly in the first case and you can use a[row][col] in the second, which will be simpler, more readable and compile to more efficient code.
Note that since C99, there is a 3rd possibility allowing you to use the array syntax even for a variable number of columns. Here is an example:
#include <stdio.h>
#include <stdlib.h>
void init_matrix(int rows, int cols, int (*a)[cols]) {
int x = 0;
for (int r = 0; r < rows; r++) {
for (int c = 0; c < cols; c++) {
a[r][c] = x++;
}
}
}
void print_matrix(int rows, int cols, const int (*a)[cols]) {
for (int r = 0; r < rows; r++) {
for (int c = 0; c < cols; c++) {
printf("%3d%c", a[r][c], " \n"[c == cols - 1]);
}
}
}
int main() {
int cols = 4, rows = 3;
int (*a)[cols] = malloc(sizeof(*a) * rows);
init_matrix(rows, cols, a);
print_matrix(rows, cols, a);
free(a);
return 0;
}
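For comparison, this is roughly what the first prototype forces you to write: the same fill through a plain int * with the index computed by hand. The function names init_flat and flat_at are made up for this sketch.

```c
#include <stddef.h>

/* Same fill as init_matrix, but through an int * with explicit index math. */
void init_flat(int rows, int cols, int *a) {
    int x = 0;
    for (int r = 0; r < rows; r++) {
        for (int c = 0; c < cols; c++) {
            a[(size_t)r * cols + c] = x++;   /* row-major: r * cols + c */
        }
    }
}

/* Read element [r][c] of a flat buffer that is cols elements wide. */
int flat_at(int cols, const int *a, int r, int c) {
    return a[(size_t)r * cols + c];
}
```

Called with a buffer of rows * cols ints, this produces the same layout as the pointer-to-array version; only the indexing is manual.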
Difference between int* a and int (*a)[N] in functions?
While int* a is a pointer to int, int (*b)[N] is a pointer to an array of N ints. These are different types, and are not compatible.
b can only point to an array of N ints, while a can point to an int whether it's in an array or not. When you pass an array as an argument you are really passing a pointer to its first element; it's commonly said that the array decays to a pointer. Another difference is that if you increment b it will point to the next block of N ints, whereas if you increment a it will just point to the next int.
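The difference in step size can be measured directly. This is a small sketch with made-up function names; N is set to 5 only for the illustration.

```c
#include <stddef.h>

#define N 5   /* assumed row length for this illustration */

/* Bytes between a and a + 1 when a is a pointer to int. */
size_t stride_int_ptr(void) {
    int row[N] = {0};
    int *a = row;
    return (size_t)((char *)(a + 1) - (char *)a);
}

/* Bytes between b and b + 1 when b is a pointer to an array of N ints. */
size_t stride_array_ptr(void) {
    int grid[2][N] = {{0}};
    int (*b)[N] = grid;
    return (size_t)((char *)(b + 1) - (char *)b);
}
```

The first function returns sizeof(int) and the second returns N * sizeof(int), which is exactly the "next int" versus "next block of N ints" behaviour described above.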
I have a program where I want to pass 2d int arrays to functions. Which is the better one to use and does it matter?
It matters. foo hints at a flat array that is cols wide; with some index arithmetic you can of course treat it as a 2D array. bar is the more natural fit for a 2D array.
The memory layout will probably be similar, and using a flat array to store elements in such a way that you can use it as a 2D array is perfectly fine. I personally find it less messy and clearer to use a pointer to array, instead of a pointer to int, when I need a 2D array.
Example:
#include <stdio.h>
#include <stdlib.h>
#define N 5
#define M 5
void bar(int (*a)[N]) {
// prints 20, 4 bytes * 5 (can be different depending on the size of int)
printf("Size of 'a' is %zu\n\n", sizeof *a);
// populate the array
int c = 1;
for (int i = 0; i < M; i++)
for (int j = 0; j < N; j++)
a[i][j] = c++;
for(int i = 0; i < M; i++){
printf("a[%d][0] = %2d\n", i, **a);
// incrementing 'a' will make it point to the next array line
a++;
}
putchar('\n');
}
Output:
Size of 'a' is 20
a[0][0] = 1
a[1][0] = 6
a[2][0] = 11
a[3][0] = 16
a[4][0] = 21
int main() {
int(*a)[N] = malloc(sizeof *a * M);
bar(a);
// prints the complete array
for (int i = 0; i < M; i++) {
for (int j = 0; j < N; j++) {
printf("%2d ", a[i][j]);
}
putchar('\n');
}
free(a);
}
In this code a is a 2D array of M rows by N columns which is passed to bar to be populated. I added some handy prints and comments for clarification. See it live: https://godbolt.org/z/1EfEE5ed7
Output:
1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
16 17 18 19 20
21 22 23 24 25
With the second function you need to specify the size of the array. Why?
You don't have to specify the entire size of the entire 2D array, but you do have to specify the number of columns, i.e. the size of the sub-array.
The reason why the compiler must know the size of the sub-array is that this information is required for offset calculations.
For example, if you define a 2D array like this
int arr[2][4];
then the array elements will be stored in memory in the following order:
arr[0][0]
arr[0][1]
arr[0][2]
arr[0][3]
arr[1][0]
arr[1][1]
arr[1][2]
arr[1][3]
If the compiler wants to for example access arr[1][2], it will need to know how many columns there are per row, i.e. how big each sub-array is. If the compiler does not know that there are 4 columns per row, then the compiler has no way of knowing where to find arr[1][2]. However, if the compiler knows that there are 4 columns per row, it will know that the 5th element of the 2D array is the start of the second row, so it will know that arr[1][2] is the 7th element of the 2D array.
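The offset calculation described above can be checked against what the compiler actually does. The sketch below uses made-up function names and measures the position of arr[1][2] in units of int.

```c
#include <stddef.h>

/* Linear element index of [row][col] in a row-major array with cols columns. */
size_t linear_index(size_t row, size_t col, size_t cols) {
    return row * cols + col;
}

/* Position of arr[1][2] inside int arr[2][4], counted in ints from the start. */
size_t measured_index(void) {
    int arr[2][4];
    return (size_t)((char *)&arr[1][2] - (char *)arr) / sizeof(int);
}
```

Both functions give 6 for this case, i.e. the 7th element, and linear_index cannot be evaluated without the column count.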
I can pass int arrays to both of these functions (1d and 2d arrays).
Although accessing a 2D array as a 1D array may work on most compilers, it is undefined behavior according to the ISO C standard. See this question for further information:
One-dimensional access to a multidimensional array: is it well-defined behaviour?
Which is the better one to use and does it matter?
The first one is better in the sense that it allows you to specify the number of columns at run-time, whereas with the second one, the number of columns must be set at compile-time, so you are less flexible.
However, as stated above, depending on your compiler, it may not be safe to access a 2D array as a 1D array. Therefore, it would probably be best to pass 1D arrays to the first function and 2D arrays to the second function.
However, it is not clear how the second function is supposed to know how many rows the 2D array contains, since this information is not passed to the function as an argument. Maybe the function is assuming a fixed number of rows, or maybe it is not intended to be used for 2D arrays at all. The information that you have provided in the question is insufficient to answer this.
The basic question is:
For code that expects a pointer to pointer that will be syntactically indexed like a 2-dimensional array, is there a valid way to create such an array using a single allocation?†
While on the surface, it seems I am asking for how to do so (like in this question), I already understand how it could be done (see below). The problem is that there might be an alignment issue.‡
The rest of the text describes some alternatives, and then explains the method under question.
† Olaf points out a pointer to pointer is not a 2-dimensional array. The premise of the question is that 3rd party code expects a pointer to pointer passed in, and the 3rd party code will index it as a 2-dimensional array.
‡ ErikNyquist presented a possible duplicate which explains how one might perform such an allocation, but I am questioning the validity of the technique with regards to alignment of the data.
If I need to dynamically allocate a multi-dimensional array, I typically use a single allocation call to avoid an iteration when I want to free the array later.
If VLA is available, I might code it like this:
int n, m;
n = initialize_n();
m = initialize_m();
double (*array)[m] = malloc(n * sizeof(*array));
array[i][j] = x;
Without VLA, I either rely on a macro on a structure for array accesses, or I add space for the pointer table for code that expects the ** style of two-dimensional array. The macro approach would look like:
struct array_2d {
int n, m;
double data[];
};
// a2d is a struct array_2d *
#define GET_2D(a2d, i, j) ((a2d)->data[(i) * (a2d)->m + (j)])
struct array_2d *array = malloc(sizeof(*array) + n * m * sizeof(double));
array->n = n;
array->m = m;
GET_2D(array, i, j) = x;
The pointer table method is more complicated, because it requires a loop to initialize the table.
struct array_p2d {
int n, m;
double *data[];
};
#define GET_P2D(a2d, i, j) ((a2d)->data[(i)][(j)])
struct array_p2d *array = malloc(sizeof(*array) + n * sizeof(double *)
+ n * m * sizeof(double));
for (int k = 0; k < n; ++k) {
array->data[k] = (double *)&array->data[n] + k * m;
}
GET_P2D(array, i, j) = x;
// array->data can also be passed to a function wanting a double **
The problem with the pointer table method is that there might be an alignment issue. As long as whatever type the array is of does not have stricter alignment requirements than a pointer, the code should work.
Is the above expected to always work? If not, is there a valid way to achieve a single allocation for a pointer to pointer style 2-dimensional array?
Well, the memory returned by malloc is guaranteed to be suitably aligned (unless you're using a non-standard alignment), so all you need to do is round the pointer table size up to the alignment of the data segment:
const size_t pointer_table_size = n * sizeof(double *);
const size_t data_segment_offset = pointer_table_size +
    ((_Alignof(double) - (pointer_table_size % _Alignof(double))) % _Alignof(double));
double **array = malloc(data_segment_offset + (n * m * sizeof(double)));
double *data = (double *)(((char *) array) + data_segment_offset);
for (int i = 0; i != n; ++i)
    array[i] = data + (m * i);
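Wrapped into a function, the idea might look like the sketch below. The name alloc_table is made up; the code assumes a C11 compiler for _Alignof and does not guard against overflow in the size computation.

```c
#include <stdlib.h>

/* Allocate an n-by-m table usable as t[i][j]; release it with one free(t). */
double **alloc_table(size_t n, size_t m) {
    size_t table_bytes = n * sizeof(double *);
    size_t align = _Alignof(double);
    /* Round the pointer table up to the next multiple of the data alignment. */
    size_t data_offset = (table_bytes + align - 1) / align * align;

    double **table = malloc(data_offset + n * m * sizeof(double));
    if (table == NULL)
        return NULL;

    /* The data segment starts right after the (padded) pointer table. */
    double *data = (double *)((char *)table + data_offset);
    for (size_t i = 0; i < n; i++)
        table[i] = data + i * m;
    return table;
}
```

The rounding here uses the usual (x + align - 1) / align * align form, which gives the same offset as the modulo expression above.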
I want to use two-dimension array inside of some struct:
typedef struct{
int rows;
int cols;
another_struct *array[][];
}some_struct;
But it seems I can't declare a multidimensional array of incomplete type, so I chose to go with another_struct *array[0][0];
And allocate it this way:
some_struct *allocate_some_struct(int rows, int cols){
some_struct *p;
uint32_t length;
length = sizeof(some_struct) + rows * sizeof(another_struct *[cols]);
p = malloc(length);
p->rows = rows;
p->cols = cols;
return (p);
}
But whenever I try to access it this way: ((another_struct *[p->rows][p->cols])p->array)[i], I get this error: used type 'another_struct *[p->rows][p->cols]' where arithmetic or pointer type is required.
However, (*((another_struct *(*)[p->rows][p->cols])&(p->array)))[i] works perfectly fine.
So my question is: why can't I use the first syntax? Is there a fundamental difference from the second one?
In C, typing is static, which means that every type must be completely known when you operate on it (by the time compilation has finished). For a two-dimensional array, this means that all the dimensions must be known for the language to be able to access the individual cells. Access to an array element is computed with a formula that needs the size of whatever each index steps over: for a single cell that is the cell size, but for a row of cells you must know how many cells there are in that direction.
But, there's a workaround that allows you to use indexing with the [] brackets, and doesn't need to know any size but the size of an individual cell. You have to use pointers, as in this example:
double **new_matrix(int rows, int cols)
{
double **res = malloc(rows * sizeof(double *));
int i;
for (i = 0; i < rows; i++)
res[i] = malloc(cols * sizeof(double));
return res;
}
void free_matrix(double **matrix, int rows)
{
int i;
for (i = 0; i < rows; i++) free(matrix[i]);
free(matrix);
}
...
double **matrix = new_matrix(24, 3);
matrix[12][1] /* will access correctly row 13 and column 2 element */
...
free_matrix(matrix, 24); /* will free all allocated memory */
There are solutions that allow you to allocate the whole matrix (and the pointers) in one block, so that free(3) can be used directly on the matrix, but I leave this as an exercise to the reader :)
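Coming back to the two casts in the question: a cast's target must be a scalar type (arithmetic or pointer), so casting to an array type is rejected, while casting to a pointer-to-array type is allowed. The sketch below uses that with a flexible array member. It assumes the compiler supports variable-length array types, and another_struct is a stand-in defined only for the example, as are set_cell and get_cell.

```c
#include <stdlib.h>

typedef struct { int id; } another_struct;   /* hypothetical stand-in */

typedef struct {
    int rows;
    int cols;
    another_struct *array[];   /* flexible array member: rows * cols pointers */
} some_struct;

some_struct *allocate_some_struct(int rows, int cols) {
    size_t count = (size_t)rows * (size_t)cols;
    some_struct *p = malloc(sizeof *p + count * sizeof p->array[0]);
    if (p == NULL)
        return NULL;
    p->rows = rows;
    p->cols = cols;
    for (size_t k = 0; k < count; k++)
        p->array[k] = NULL;    /* start with every cell empty */
    return p;
}

/* View the flat member as rows of p->cols pointers, then index with [i][j]. */
void set_cell(some_struct *p, int i, int j, another_struct *value) {
    another_struct *(*grid)[p->cols] = (another_struct *(*)[p->cols])p->array;
    grid[i][j] = value;
}

another_struct *get_cell(some_struct *p, int i, int j) {
    another_struct *(*grid)[p->cols] = (another_struct *(*)[p->cols])p->array;
    return grid[i][j];
}
```

Only the column count appears in the pointer type, because the row count is never needed for the index arithmetic.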
I'm trying to implement my own basic version of matrix multiplication in C and based on another implementation, I have made a matrix data type. The code works, but being a C novice, I do not understand why.
The issue: I have a struct with a dynamic array inside it and I am initializing the pointer. See below:
// Matrix data type
typedef struct
{
int rows, columns; // Number of rows and columns in the matrix
double *array; // Matrix elements as a 1-D array
} matrix_t, *matrix;
// Create a new matrix with specified number of rows and columns
// The matrix itself is still empty, however
matrix new_matrix(int rows, int columns)
{
matrix M = malloc(sizeof(matrix_t) + sizeof(double) * rows * columns);
M->rows = rows;
M->columns = columns;
M->array = (double*)(M+1); // INITIALIZE POINTER
return M;
}
Why do I need to initialize the array to (double*)(M+1)? It seems that also (double*)(M+100) works ok, but e.g. (double *)(M+10000) does not work anymore, when I run my matrix multiplication function.
The recommended method for this kind of thing is a flexible array member (an unsized array as the last member of the struct) used in conjunction with offsetof. It ensures correct alignment.
#include <stddef.h>
#include <stdlib.h>
// Matrix data type
typedef struct s_matrix
{
int rows, columns; // Number of rows and columns in the matrix
double array[]; // Matrix elements as a 1-D array
} matrix;
// Create a new matrix with specified number of rows and columns
// The matrix itself is still empty, however
matrix* new_matrix(int rows, int columns)
{
size_t size = offsetof(matrix, array) + sizeof(double) * rows * columns;
matrix* M = malloc(size);
M->rows = rows;
M->columns = columns;
return M;
}
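With a flexible array member there is no pointer to initialize at all; elements are reached directly through M->array. A minimal sketch follows, with a made-up accessor matrix_at, row-major storage assumed, and a NULL check added to the allocation.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct s_matrix {
    int rows, columns;
    double array[];            /* flexible array member, storage follows the header */
} matrix;

matrix *new_matrix(int rows, int columns) {
    size_t size = offsetof(matrix, array)
                + sizeof(double) * (size_t)rows * (size_t)columns;
    matrix *M = malloc(size);
    if (M == NULL)
        return NULL;
    M->rows = rows;
    M->columns = columns;
    return M;
}

/* Address of element [row][col], assuming row-major storage. */
double *matrix_at(matrix *M, int row, int col) {
    return &M->array[(size_t)row * (size_t)M->columns + (size_t)col];
}
```

A caller would write `*matrix_at(M, 1, 2) = 4.0;` and release everything with a single free(M).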
M+1 points to the memory that immediately follows M (i.e. that follows the two int and the double*). This is the memory you've allocated for the matrix data:
matrix M = malloc(sizeof(matrix_t) + sizeof(double) * rows * columns);
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Using M+100 or M+10000 and then attempting to populate the matrix will result in undefined behaviour. This could result in a program that crashes, or a program that appears to work (but in reality is broken), or anything in between.
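The arithmetic behind this can be spelled out. M + k advances by k * sizeof(matrix_t) bytes, so the sketch below (with a made-up helper, fits_in_allocation) checks whether the element storage starting at M + k would still lie inside the block that malloc returned.

```c
#include <stddef.h>

/* Same layout as the question's struct. */
typedef struct {
    int rows, columns;
    double *array;
} matrix_t;

/* Returns 1 if count doubles starting at (M + k) lie inside an allocation
   of alloc_bytes bytes that begins at M, and 0 otherwise. */
int fits_in_allocation(size_t k, size_t count, size_t alloc_bytes) {
    size_t start = k * sizeof(matrix_t);      /* byte offset of M + k */
    return start <= alloc_bytes
        && count * sizeof(double) <= alloc_bytes - start;
}
```

For a 2 x 3 matrix the allocation is sizeof(matrix_t) + 6 * sizeof(double) bytes: k = 1 fits exactly, while k = 100 starts far past the end, so any apparent success there is accidental.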
You need to initialize it because otherwise (wait for it) it's uninitialized!
And you can't use an uninitialized pointer for anything, except to generate undefined behavior.
Initializing it to M + 1 is precisely right, and very good code. Any other value would fail to use the memory you allocated for this exact purpose.
My point is that the double * at the end of struct won't "automatically" point at this memory, which is the implied belief in your question why it should be initialized. Thus, it has to be set to the proper address.
Is it possible to write a function which accept 2-d array when the width is not known at compile time?
A detailed description will be greatly appreciated.
You can't pass a raw two-dimensional array because the routine won't know how to index a particular element. The 2D array is really one contiguous memory segment.
When you write x[a][b] (when x is a 2d array), the compiler knows to look at the address (x + a * width + b). It can't know how to address the particular element if you don't tell it the width.
As an example, check http://www.dfstermole.net/OAC/harray2.html#offset (which has a table showing how to find the linear index for each element in an int[5][4])
There are two ways to work around the limitation:
1) Make your program work with pointer-to-pointers (char **). This is not the same as char[][]. A char ** really points to one memory segment, with each value in it being the memory address of another memory segment.
2) Pass a 1d pointer, and do the referencing yourself. Your function would then have to take a "width" parameter, and you could use the aforementioned formula to reference a particular point.
To give a code example:
#include <stdio.h>
int get2(int *x) { return x[2]; }
int main() {
int y[2][2] = {{11,12},{21,22}};
printf("%d\n", get2((int *)y));
}
This should print out 21, since y is laid out as { 11, 12, 21, 22 } in memory.
C supports variable-length arrays. You must specify the width from a value known at run-time, which may be an earlier parameter in the function declaration:
void foo(size_t width, int array[][width]);
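A complete function using such a parameter could look like the sketch below. The name sum_2d is made up for the example, and the compiler is assumed to support variable-length array types.

```c
#include <stddef.h>

/* Sum every element of a height-by-width array; width is a run-time value
   and must come before the array parameter that refers to it. */
long sum_2d(size_t height, size_t width, int array[][width]) {
    long total = 0;
    for (size_t r = 0; r < height; r++)
        for (size_t c = 0; c < width; c++)
            total += array[r][c];   /* the compiler does the index math */
    return total;
}
```

An ordinary `int y[2][3]` can be passed as `sum_2d(2, 3, y)`.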
One way is to use the good old "pointer to array of pointers to arrays" trick coupled with a single contiguous allocation for the data:
/* Another allocation function
--------------------------- */
double ** AnotherAlloc2DTable(
size_t size1, /*[in] Nb of lines */
size_t size2 /*[in] Nb of values per line */
)
{
double ** ppValues;
size_t const size1x2 = size1*size2;
if(size2 != 0 && size1x2 / size2 != size1)
return NULL; /*size overflow*/
ppValues = malloc(sizeof(*ppValues)*size1);
if(ppValues != NULL)
{
double * pValues = malloc(sizeof(*pValues)*size1x2);
if(pValues != NULL)
{
size_t i;
/* Assign all pointers */
for(i=0 ; i<size1 ; ++i)
ppValues[i] = pValues + (i*size2);
}
else
{
/* Second allocation failed, free the first one */
free(ppValues), ppValues=NULL;
}
}/*if*/
return ppValues;
}
/* Another destruction function
---------------------------- */
void AnotherFree2DTable(double **ppValues)
{
if(ppValues != NULL)
{
free(ppValues[0]);
free(ppValues);
}
}
Then all you have to do is pass a double ** to your function. The matrix is contiguous, and usable as mat[x][y].
Possible accessor functions:
int get_multi(int rows, int cols, int matrix[][cols], int i, int j)
{
return matrix[i][j];
}
int get_flat(int rows, int cols, int matrix[], int i, int j)
{
return matrix[i * cols + j];
}
int get_ptr(int rows, int cols, int *matrix[], int i, int j)
{
return matrix[i][j];
}
An actual multi-dimensional array and a fake one:
int m_multi[5][7];
int m_flat[5 * 7];
Well-defined ways to use the accessor functions:
get_multi(5, 7, m_multi, 4, 2);
get_flat(5, 7, m_flat, 4, 2);
{
int *m_ptr[5];
for(int i = 0; i < 5; ++i)
m_ptr[i] = m_multi[i];
get_ptr(5, 7, m_ptr, 4, 2);
}
{
int *m_ptr[5];
for(int i = 0; i < 5; ++i)
m_ptr[i] = &m_flat[i * 7];
get_ptr(5, 7, m_ptr, 4, 2);
}
Technically undefined usage that works in practice:
get_flat(5, 7, (int *)m_multi, 4, 2);
[Warning - this answer addresses the case where the number of columns - the WIDTH - is known]
When working with 2D arrays, the compiler needs to know the number of columns in your array in order to compute indexing. For instance, if you want a pointer p that points to a range of memory to be treated as a two-dimensional set of values, the compiler cannot do the necessary indexing arithmetic unless it knows how much space is occupied by each row of the array.
Things become clearer with a concrete example, such as the one below. Here, the pointer p is passed in as a pointer to a one-dimensional range of memory. You, the programmer, know that it makes sense to treat this as a 2D array, and you also know (must know) how many columns there are in this array. Armed with this knowledge, you can write code to create q, which is treated by the compiler as a 2D array with an unknown number of rows, where each row has exactly NB columns.
I usually employ this when I want the compiler to do all the indexing arithmetic (why do it by hand when the compiler can do it?). In the past, I've found this construct to be useful to carry out 2D transposes from one shape to another - note though that generalized 2D transposes that transpose an MxN array into an NxM array are rather beastly.
void
WorkAs2D (double *p)
{
double (*q)[NB] = (double (*)[NB]) p;
for (uint32_t i = 0; i < NB; i++)
{
for (uint32_t j = 0; j < ZZZ; j++) /* For as many rows as you have in your 2D array */
q[j][i] = ...
}
}
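A compilable variant of the same idea is sketched below. NB = 4, the rows parameter and the value stored in each cell are all assumptions made for the illustration; the caller must guarantee that p points to at least rows * NB doubles.

```c
#include <stddef.h>

#define NB 4   /* assumed number of columns, fixed at compile time */

/* Treat p as rows rows of NB doubles and store row * 10 + col in each cell. */
void fill_as_2d(double *p, size_t rows) {
    double (*q)[NB] = (double (*)[NB])p;   /* reinterpret the flat range as rows */
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < NB; j++)
            q[i][j] = (double)(i * 10 + j);
}
```

After `fill_as_2d(buf, 3)`, the value for row 2, column 1 sits at flat position 2 * NB + 1.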
I believe a nice solution would be the use of structures.
So I have an example for 1d-Arrays:
Definition of the struct:
struct ArrayNumber {
unsigned char *array;
int size;
};
Definition of a function:
struct ArrayNumber calcMultiply(struct ArrayNumber nra, struct ArrayNumber nrb);
Init the struct:
struct ArrayNumber rs;
rs.array = malloc(1);
rs.array[0] = 0;
rs.size = 1;
//and adding some size:
rs.size++;
rs.array = realloc(rs.array, rs.size);
I hope this could be a solution for you. You just have to adapt it to a 2D array.
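One possible 2D adaptation is sketched below. The struct Array2D and its helper functions are made up for this example; the elements are stored row by row in a single block.

```c
#include <stdlib.h>

struct Array2D {
    size_t rows, cols;
    unsigned char *data;   /* rows * cols elements, stored row by row */
};

/* Returns 0 on success, -1 if the allocation fails. Elements start at 0. */
int array2d_init(struct Array2D *a, size_t rows, size_t cols) {
    size_t count = rows * cols;
    a->data = calloc(count ? count : 1, 1);
    if (a->data == NULL)
        return -1;
    a->rows = rows;
    a->cols = cols;
    return 0;
}

/* Address of element [r][c]; the struct carries the width with it. */
unsigned char *array2d_at(struct Array2D *a, size_t r, size_t c) {
    return &a->data[r * a->cols + c];
}

void array2d_free(struct Array2D *a) {
    free(a->data);
    a->data = NULL;
    a->rows = a->cols = 0;
}
```

Because the dimensions travel with the data, a function taking a struct Array2D does not need separate width and height parameters.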