I'm trying to implement my own basic version of matrix multiplication in C and based on another implementation, I have made a matrix data type. The code works, but being a C novice, I do not understand why.
The issue: I have a struct with a dynamic array inside it and I am initializing the pointer. See below:
// Matrix data type
typedef struct
{
int rows, columns; // Number of rows and columns in the matrix
double *array; // Matrix elements as a 1-D array
} matrix_t, *matrix;
// Create a new matrix with specified number of rows and columns
// The matrix itself is still empty, however
matrix new_matrix(int rows, int columns)
{
matrix M = malloc(sizeof(matrix_t) + sizeof(double) * rows * columns);
M->rows = rows;
M->columns = columns;
M->array = (double*)(M+1); // INITIALIZE POINTER
return M;
}
Why do I need to initialize the array to (double*)(M+1)? It seems that also (double*)(M+100) works ok, but e.g. (double *)(M+10000) does not work anymore, when I run my matrix multiplication function.
The recommended method for this kind of thing is a flexible array member (an array declared without a size as the last member of the struct), used in conjunction with offsetof. It ensures correct alignment.
#include <stddef.h>
#include <stdlib.h>
// Matrix data type
typedef struct s_matrix
{
int rows, columns; // Number of rows and columns in the matrix
double array[]; // Matrix elements as a 1-D array
} matrix;
// Create a new matrix with specified number of rows and columns
// The matrix itself is still empty, however
matrix* new_matrix(int rows, int columns)
{
size_t size = offsetof(matrix, array) + sizeof(double) * rows * columns;
matrix* M = malloc(size);
if (M == NULL) return NULL; // allocation failed
M->rows = rows;
M->columns = columns;
return M;
}
M+1 points to the memory that immediately follows *M (i.e. that follows the two ints and the double*). This is the memory you've allocated for the matrix data:
matrix M = malloc(sizeof(matrix_t) + sizeof(double) * rows * columns);
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Using M+100 or M+10000 and then attempting to populate the matrix will result in undefined behaviour. This could result in a program that crashes, or a program that appears to work (but in reality is broken), or anything in between.
You need to initialize it because otherwise (wait for it) it's uninitialized!
And you can't use an uninitialized pointer for anything, except to generate undefined behavior.
Initializing it to M + 1 is precisely right, and very good code. Any other value would fail to use the memory you allocated for this exact purpose.
My point is that the double * at the end of struct won't "automatically" point at this memory, which is the implied belief in your question why it should be initialized. Thus, it has to be set to the proper address.
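To make the pointer arithmetic concrete, here is a minimal sketch that reuses the question's matrix_t. The helper payload_offset is a hypothetical name added only for illustration, and the NULL check is an addition to the original allocation code. It shows that M + 1 is exactly sizeof(matrix_t) bytes past the start of the block, which is where the extra sizeof(double) * rows * columns bytes begin; M + 100 would be 100 * sizeof(matrix_t) bytes past it, and nothing guarantees that much memory was allocated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Same layout as the struct in the question. */
typedef struct
{
    int rows, columns;
    double *array;
} matrix_t, *matrix;

/* Allocate the header and the elements in one block, as in the question. */
matrix new_matrix(int rows, int columns)
{
    assert(rows > 0 && columns > 0);
    matrix M = malloc(sizeof(matrix_t)
                      + sizeof(double) * (size_t)rows * (size_t)columns);
    if (M == NULL)
        return NULL;
    M->rows = rows;
    M->columns = columns;
    M->array = (double *)(M + 1); /* first byte after the header */
    return M;
}

/* Hypothetical helper: distance in bytes from the start of the block
   to the element storage. */
size_t payload_offset(matrix M)
{
    return (size_t)((char *)M->array - (char *)M);
}
```

Because everything lives in one block, a single free(M) releases both the header and the elements.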
I'm implementing a matrix library in C. I've already made a vector library, and I have defined a matrix to be nothing but a collection of pointers to vectors (so each pointer references a vector struct, which is one column of the matrix). I have a pointer to pointers instead of an array of pointers because a) I want jagged matrices to be possible, b) I want vector operations to work on the individual columns of the matrix, and c) I want the actual matrix to be dynamic.
Here is the matrix type definition:
typedef struct mat {
size_t alloc; // num of allocated bytes
size_t w, h; // dimensions for the matrix
vec** cols; // each column is a vector
} mat;
Suppose I want to resize the dimensions of the matrix. The following code works just fine for me:
void resizem(mat* m, size_t w, size_t h) {
m -> alloc = w * VECTOR_SIZE;
m -> cols = realloc(m -> cols, m -> alloc);
// if(w > m -> w) {
// memset(m -> cols + m -> w, init_vec(h), (w - (m -> w)) * VECTOR_SIZE);
// }
if(w > m -> w) {
for(int i = m -> w; i < w; i++) {
m -> cols[i] = init_vec(h);
};
}
for(int i = 0; i < w; i++) {
resizev(m -> cols[i], h);
};
m -> w = w;
m -> h = h;
}
My approach was as follows: 1) re-compute the amount of bytes to reallocate and store this in the matrix struct (amount of columns * column size) 2) reallocate memory to the columns 3) if the matrix 'grew' in width then each new column needs to be initialised to a default vector. 4) resize the size of each vector in the matrix.
Note the commented out lines however. Why can't I just add an offset (the size of the former matrix) to the column pointers and use memset on the remaining columns to initialise them to a default vector? When I run this it doesn't work so I use a for loop instead.
Note that if it helps at all here is the github link to the vector library so far: Github link
When you use memset() you only call init_vec() once, so all the elements would point to the same vector.
Also, memset() converts its second argument to unsigned char and writes that one byte value into every byte of the range. Your array is supposed to contain pointers, not bytes, so a vec* cannot be stored this way.
BTW, your code leaks memory when the matrix shrinks, because you never free() the vectors in the columns that are dropped. You need to do this before calling realloc().
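A minimal sketch of the points above, under stated assumptions: vec, vec_new and vec_free are hypothetical stand-ins for the question's vector library (its real API is not shown here), and mat_set_width only changes the number of columns. It frees dropped columns before realloc() and gives every new slot its own vector in a loop, which is the part memset() cannot do.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the question's vector type. */
typedef struct vec {
    size_t len;
    double *data;
} vec;

typedef struct mat {
    size_t w, h;   /* number of columns, length of each column */
    vec **cols;    /* one pointer per column */
} mat;

/* Hypothetical constructor: a zero-filled vector of length h. */
static vec *vec_new(size_t h)
{
    vec *v = malloc(sizeof *v);
    if (v == NULL)
        return NULL;
    v->len = h;
    v->data = calloc(h ? h : 1, sizeof *v->data);
    if (v->data == NULL) {
        free(v);
        return NULL;
    }
    return v;
}

static void vec_free(vec *v)
{
    if (v != NULL) {
        free(v->data);
        free(v);
    }
}

/* Change the number of columns. Returns 0 on success, -1 on failure. */
int mat_set_width(mat *m, size_t w)
{
    assert(m != NULL && w > 0);

    /* Shrinking: release the columns that are about to be dropped. */
    if (w < m->w) {
        for (size_t i = w; i < m->w; i++)
            vec_free(m->cols[i]);
        m->w = w;
    }

    vec **cols = realloc(m->cols, w * sizeof *cols);
    if (cols == NULL)
        return -1;              /* the old block is still valid */
    m->cols = cols;

    /* Growing: each new slot gets its own, separately allocated vector. */
    for (size_t i = m->w; i < w; i++) {
        cols[i] = vec_new(m->h);
        if (cols[i] == NULL) {
            m->w = i;           /* only the first i columns are valid */
            return -1;
        }
    }
    m->w = w;
    return 0;
}
```

Starting from a mat with cols set to NULL and w set to 0 works, because realloc(NULL, n) behaves like malloc(n).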
I want to use two-dimension array inside of some struct:
typedef struct{
int rows;
int cols;
another_struct *array[][];
}some_struct;
But it seems I can't declare a multidimensional array of incomplete type, so I chose to go with another_struct *array[0][0];
And allocate it this way:
some_struct *allocate_some_struct(int rows, int cols){
some_struct *p;
uint32_t length;
length = sizeof(some_struct) + rows * sizeof(another_struct *[cols]);
p = malloc(length);
p->rows = rows;
p->cols = cols;
return (p);
}
But whenever I try to access it this way: ((another_struct *[p->rows][p->cols])p->array)[i], I get this error: used type 'another_struct *[p->rows][p->cols]' where arithmetic or pointer type is required.
Although (*((another_struct *(*)[p->rows][p->cols])&(p->array)))[i] works perfectly fine.
So my question is: why can't I use the first syntax? Is there a fundamental difference from the second one?
In C, typing is static, which means that every type must be completely known when you operate on it. For a two-dimensional array, this means the dimensions must be known for the language to be able to access the individual cells. Access to an array element is computed with a formula that needs the size of each part that has already been indexed: for a single cell that is the cell size, but for an array of cells you must also know how many cells there are in that direction.
But, there's a workaround that allows you to use indexing with the [] brackets, and doesn't need to know any size but the size of an individual cell. You have to use pointers, as in this example:
double **new_matrix(int rows, int cols)
{
double **res = malloc(rows * sizeof(double *));
int i;
for (i = 0; i < rows; i++)
res[i] = malloc(cols * sizeof(double));
return res;
}
void free_matrix(double **matrix, int rows)
{
int i;
for (i = 0; i < rows; i++) free(matrix[i]);
free(matrix);
}
...
double **matrix = new_matrix(24, 3);
matrix[12][1] /* will access correctly row 13 and column 2 element */
...
free_matrix(matrix, 24); /* will free all allocated memory */
There are solutions that allow you to allocate the whole matrix and the pointers in one block (and so allow you to call free(3) directly on the matrix), but I leave this as an exercise to the reader :)
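One possible take on that exercise, as a sketch rather than the answerer's own solution: the row pointers and the elements share one malloc() block, so a plain free() on the returned pointer releases everything. The name new_matrix_1block is hypothetical. The pointer table is padded to a multiple of sizeof(double) so that the element storage behind it is suitably aligned.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocate row pointers and elements in a single block. */
double **new_matrix_1block(size_t rows, size_t cols)
{
    assert(rows > 0 && cols > 0);

    /* Reject row counts whose pointer table would overflow size_t. */
    if (rows > (SIZE_MAX - sizeof(double)) / sizeof(double *))
        return NULL;

    /* Pad the pointer table to a multiple of sizeof(double). */
    size_t header = rows * sizeof(double *);
    header = (header + sizeof(double) - 1) / sizeof(double) * sizeof(double);

    /* Reject element counts that would overflow size_t. */
    if (cols > (SIZE_MAX - header) / sizeof(double) / rows)
        return NULL;

    double **m = malloc(header + rows * cols * sizeof(double));
    if (m == NULL)
        return NULL;

    /* Element storage starts right after the padded pointer table. */
    double *data = (double *)((char *)m + header);
    for (size_t i = 0; i < rows; i++)
        m[i] = data + i * cols;
    return m;
}
```

Usage stays the same as with the two-level version (m[12][1] and so on), but cleanup is a single free(m).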
Is this the correct method to define a 5*3 matrix using double pointers?
int **M1;
M1 = (int **)malloc(5 * sizeof(int *));
for (i=0;i<5;i++)
{
M1[i] = (int *)malloc(3 * sizeof(int));
}
If so, how can I assign M1[3][15] = 9 in the code and still get no error? And why am I getting a segmentation error in assigning M1[6][3]=2?
I understood after few such initializations that I created a 5*xx array, i.e. I couldn't go above 5th row but I could assign any value to the number of columns. How should I create just a 5*3 array?
In your code, you're allocating memory for 5 pointers
M1 = (int **)malloc(5 * sizeof(int *));
and later, you're trying to access beyond that, based on an unrelated value of m
for (i=0;i<m;i++)
when m goes beyond 4, you're essentially accessing out of bound memory.
A better way to allocate will be
int m = 5;
M1 = malloc(m * sizeof(*M1));
if (M1)
{
for (i=0;i<m;i++)
{
M1[i] = malloc(3 * sizeof(*M1[i]));
}
}
couldn't go above 5th row but I could assign any value to the number of columns.
NO, you can not. In any way possible, accessing out of bound memory invokes undefined behaviour.
Since Sourav tackled the UB case, I'll answer
How should I create just a 5*3 array?
Why not rely on automatic variables? Unless you've a compelling reason not to, use them
int matrix[5][3];
If you don't know the dimensions in advance, and don't prefer doing the double pointer manipulation, flatten it like this:
int *m = malloc(sizeof(int) * rows * cols);
// accessing anything from 0 to (rows * cols) - 1 is permitted
// helper to make usage easier
int get_element(int *m, int i, int j, int cols) {
return m[i * cols + j];
}
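The flattened layout can also carry its dimensions with it. The following is a small sketch, not part of the original answer: flat_matrix, flat_init and flat_at are hypothetical names, and the assert() calls only catch bad indices in debug builds (they vanish when NDEBUG is defined).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper that keeps the dimensions next to the storage. */
typedef struct {
    size_t rows, cols;
    int *data;          /* rows * cols elements, row-major */
} flat_matrix;

/* Returns 0 on success, -1 on failure. All elements start at 0. */
int flat_init(flat_matrix *m, size_t rows, size_t cols)
{
    assert(m != NULL && rows > 0 && cols > 0);
    if (cols > SIZE_MAX / sizeof *m->data / rows)
        return -1;      /* rows * cols * sizeof(int) would overflow */
    m->data = calloc(rows * cols, sizeof *m->data);
    if (m->data == NULL)
        return -1;
    m->rows = rows;
    m->cols = cols;
    return 0;
}

/* Address of element (i, j), using the same i * cols + j formula. */
int *flat_at(flat_matrix *m, size_t i, size_t j)
{
    assert(i < m->rows && j < m->cols);
    return &m->data[i * m->cols + j];
}
```

With a 5 by 3 matrix, *flat_at(&m, 3, 2) = 9 is fine, while an index such as (3, 15) trips the assertion instead of silently writing outside the allocation.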
OTOH, if you only don't know the first dimension at compile-time, then you may do:
typedef int (Int5) [5]; // cols known at compile-time
int rows = 3;
Int5 *r = malloc(rows * sizeof(Int5));
r[0][0] = 1; // OK
r[0][5] = 2; // warning: out of bounds access
With this method you get a bit more type safety due to the compiler knowing the size in advance.
Is it possible to write a function which accept 2-d array when the width is not known at compile time?
A detailed description will be greatly appreciated.
You can't pass a raw two-dimensional array because the routine won't know how to index a particular element. The 2D array is really one contiguous memory segment.
When you write x[a][b] (when x is a 2d array), the compiler knows to look at the address (x + a * width + b). It can't know how to address the particular element if you don't tell it the width.
As an example, check http://www.dfstermole.net/OAC/harray2.html#offset (which has a table showing how to find the linear index for each element in an int[5][4])
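The same formula can be checked in a few lines. This is only an illustrative sketch for an int[5][4]; linear_index and compiler_offset are hypothetical helper names. The second function measures where the compiler actually places x[a][b], in units of int, so the two results can be compared.

```c
#include <assert.h>
#include <stddef.h>

enum { HEIGHT = 5, WIDTH = 4 };

/* Row-major index of element [a][b] in an array with the given width. */
size_t linear_index(size_t a, size_t b, size_t width)
{
    assert(b < width);
    return a * width + b;
}

/* Offset of x[a][b] from the start of x, counted in elements.
   The compiler needs WIDTH in the parameter type to compute &x[a][b]. */
size_t compiler_offset(int (*x)[WIDTH], size_t a, size_t b)
{
    return (size_t)((char *)&x[a][b] - (char *)x) / sizeof x[0][0];
}
```

For an int x[HEIGHT][WIDTH], both linear_index(2, 3, WIDTH) and compiler_offset(x, 2, 3) give 2 * 4 + 3 = 11.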
There are two ways to work around the limitation:
1) Make your program work with pointers to pointers (char **). This is not the same as char[][]. A char ** points to one memory segment in which each value is the address of another memory segment.
2) Pass a 1d pointer, and do the referencing yourself. Your function would then have to take a "width" parameter, and you could use the aforementioned formula to reference a particular point.
To give a code example:
#include <stdio.h>
int get2(int *x) { return x[2]; }
int main() {
int y[2][2] = {{11,12},{21,22}};
printf("%d\n", get2((int *)y));
}
This should print out 21, since y is laid out as { 11, 12, 21, 22 } in memory.
C99 supports variable-length arrays (they became an optional feature in C11). The width must come from a value known at run time, which may be an earlier parameter in the function declaration:
void foo(size_t width, int array[][width]);
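A short sketch of a function with that kind of signature, assuming a compiler that supports variable-length array parameters. row_sum is a hypothetical name. The same function works for arrays of different widths because the width is part of the parameter's type at run time.

```c
#include <assert.h>
#include <stddef.h>

/* Sum one row of a 2D array whose width is only known at run time. */
long row_sum(size_t width, int array[][width], size_t row)
{
    assert(width > 0);
    long total = 0;
    for (size_t j = 0; j < width; j++)
        total += array[row][j];   /* indexing uses the run-time width */
    return total;
}
```

Given int a[2][3] = {{1, 2, 3}, {4, 5, 6}} and int b[3][2] = {{1, 2}, {3, 4}, {5, 6}}, row_sum(3, a, 1) is 15 and row_sum(2, b, 1) is 7.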
One way is to use the good old "pointer to array of pointers to arrays" trick coupled with a single contiguous allocation:
/* Another allocation function
--------------------------- */
double ** AnotherAlloc2DTable(
size_t size1, /*[in] Nb of lines */
size_t size2 /*[in] Nb of values per line */
)
{
double ** ppValues;
size_t const size1x2 = size1*size2;
if(size1x2 / size2 != size1)
return NULL; /*size overflow*/
ppValues = malloc(sizeof(*ppValues)*size1);
if(ppValues != NULL)
{
double * pValues = malloc(sizeof(*pValues)*size1x2);
if(pValues != NULL)
{
size_t i;
/* Assign all pointers */
for(i=0 ; i<size1 ; ++i)
ppValues[i] = pValues + (i*size2);
}
else
{
/* Second allocation failed, free the first one */
free(ppValues), ppValues=NULL;
}
}/*if*/
return ppValues;
}
/* Another destruction function
---------------------------- */
void AnotherFree2DTable(double **ppValues)
{
if(ppValues != NULL)
{
free(ppValues[0]);
free(ppValues);
}
}
Then all you have to do is pass a double ** to your function. The matrix is contiguous, and usable as mat[x][y].
Possible accessor functions:
int get_multi(int rows, int cols, int matrix[][cols], int i, int j)
{
return matrix[i][j];
}
int get_flat(int rows, int cols, int matrix[], int i, int j)
{
return matrix[i * cols + j];
}
int get_ptr(int rows, int cols, int *matrix[], int i, int j)
{
return matrix[i][j];
}
An actual multi-dimensional array and a fake one:
int m_multi[5][7];
int m_flat[5 * 7];
Well-defined ways to use the accessor functions:
get_multi(5, 7, m_multi, 4, 2);
get_flat(5, 7, m_flat, 4, 2);
{
int *m_ptr[5];
for(int i = 0; i < 5; ++i)
m_ptr[i] = m_multi[i];
get_ptr(5, 7, m_ptr, 4, 2);
}
{
int *m_ptr[5];
for(int i = 0; i < 5; ++i)
m_ptr[i] = &m_flat[i * 7];
get_ptr(5, 7, m_ptr, 4, 2);
}
Technically undefined usage that works in practice:
get_flat(5, 7, (int *)m_multi, 4, 2);
[Warning - this answer addresses the case where the number of columns - the WIDTH - is known]
When working with 2D arrays, the compiler needs to know the number of columns in your array in order to compute indexing. For instance, if you want a pointer p that points to a range of memory to be treated as a two-dimensional set of values, the compiler cannot do the necessary indexing arithmetic unless it knows how much space is occupied by each row of the array.
Things become clearer with a concrete example, such as the one below. Here, the pointer p is passed in as a pointer to a one-dimensional range of memory. You - the programmer - know that it makes sense to treat this as a 2D array and you also know (must know) how many columns are there in this array. Armed with this knowledge, you can write code to create q, that is treated by the compiler as a 2D array with an unknown number of rows, where each row has exactly NB columns.
I usually employ this when I want the compiler to do all the indexing arithmetic (why do it by hand when the compiler can do it?). In the past, I've found this construct to be useful to carry out 2D transposes from one shape to another - note though that generalized 2D transposes that transpose an MxN array into an NxM array are rather beastly.
void
WorkAs2D (double *p)
{
double (*q)[NB] = (double (*)[NB]) p;
for (uint32_t i = 0; i < NB; i++)
{
for (uint32_t j = 0; j < ZZZ; j++) /* For as many rows as you have in your 2D array */
q[j][i] = ...
}
}
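A compilable variant of the same idea, as a sketch: NB is given an assumed value of 4 here, and fill_as_2d is a hypothetical name. The function receives a flat buffer, views it through a pointer to rows of NB doubles, and lets the compiler do the row * NB + col arithmetic.

```c
#include <assert.h>
#include <stddef.h>

#define NB 4   /* number of columns, fixed at compile time (assumed value) */

/* Fill a rows x NB matrix stored in a flat buffer so that the element
   at (row, col) holds row * 10 + col. */
void fill_as_2d(double *p, size_t rows)
{
    assert(p != NULL);
    double (*q)[NB] = (double (*)[NB])p;   /* view the buffer as rows of NB */
    for (size_t row = 0; row < rows; row++)
        for (size_t col = 0; col < NB; col++)
            q[row][col] = (double)(row * 10 + col);
}
```

After fill_as_2d(buf, 3) on a buffer of 3 * NB doubles, buf[1 * NB + 2] holds 12.0, which is the element the function wrote as q[1][2].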
I believe a nice solution would be the use of structures.
So I have an example for 1d-Arrays:
Definition of the struct:
struct ArrayNumber {
unsigned char *array;
int size;
};
Definition of a function:
struct ArrayNumber calcMultiply(struct ArrayNumber nra, struct ArrayNumber nrb);
Init the struct:
struct ArrayNumber rs;
rs.array = malloc(1);
rs.array[0] = 0;
rs.size = 1;
//and adding some size:
rs.size++;
rs.array = realloc(rs.array, rs.size);
Hope this could be a solution for you. You just have to adapt it to a 2D array.
I was asked to write a program that gets a two dimensional array (a matrix), number of columns,and number of rows, and the program will return the transpose matrix (without using a [][], meaning only using pointer arithmetics)
The program I wrote, does indeed transpose the matrix, it is no problem. My problem is understanding how to return. here's my code:
int** transpose_matrix(matrix mat1,int number_of_rows,int number_of_columns)
{
matrix mat2;
int row_index,column_index;
for(row_index=0;row_index<number_of_rows;row_index++)
{
for(column_index=0;column_index<number_of_columns;column_index++)
**(mat2+(column_index*number_of_rows)+row_index)=**(mat1+(row_index*number_of_columns)+column_index);
}
// at this point, mat2 is exactly the transpose of mat1
return mat2;
}
Now here's my problem: I can't return a matrix. The closest thing I can do is return the address of the first value of the matrix, but even if I do that, the rest of the matrix will be unusable as soon as I exit transpose_matrix and return to main. How can I return mat2?
One, a two-dimensional array is not a double pointer.
Two, dynamic allocation. If matrix is a two-dimensional array type, then write something like this:
typedef int matrix[ROWS][COLUMNS];
typedef int (*matrix_ptr)[COLUMNS];
matrix_ptr transpose_matrix(matrix m, int rows, int cols)
{
matrix_ptr transposed = malloc(sizeof(*transposed) * rows);
// transpose, then
return transposed;
}
OK, there are three things here:
1) You cannot return a pointer to a local variable (it will be garbage after the return, because the stack memory where it lived is reused).
2) Arrays decay to a pointer to their first element when passed.
3) Pointer arithmetic: p+1 increments the address in p by sizeof(*p), so p+1 points to the next element, not to the next byte.
The simple fix for your code (this works for any matrix size) :
int* transpose_matrix(int *mat1,int number_of_rows,int number_of_columns)
{
int *mat2=malloc(number_of_rows*number_of_columns*sizeof(int));
int row_index,column_index;
for(row_index=0;row_index<number_of_rows;row_index++)
{
for(column_index=0;column_index<number_of_columns;column_index++)
mat2[column_index*number_of_rows+row_index]=mat1[row_index*number_of_columns+column_index];
}
// at this point, mat2 is exactly the transpose of mat1
return mat2;
}
...
print(m,r,c); // I hope you have a print()
int *t=transpose_matrix(m,r,c);
print (t,c,r);
...
// use t[max: c-1][max: r-1]
free(t);
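An alternative to returning malloc()ed memory, sketched here as a design option rather than as part of the answer above: let the caller own the destination buffer. transpose_into is a hypothetical name. It uses pointer arithmetic only, in keeping with the constraint in the question, and the caller decides whether dst lives on the stack, in static storage, or on the heap.

```c
#include <assert.h>
#include <stddef.h>

/* Write the transpose of the rows x cols matrix src into dst.
   Both are flat, row-major buffers of rows * cols ints and must not overlap. */
void transpose_into(const int *src, int *dst, int rows, int cols)
{
    assert(src != NULL && dst != NULL && src != dst);
    assert(rows >= 0 && cols >= 0);
    for (int r = 0; r < rows; r++)
        for (int c = 0; c < cols; c++)
            *(dst + (size_t)c * rows + r) = *(src + (size_t)r * cols + c);
}
```

With int m[6] = {1, 2, 3, 4, 5, 6} treated as 2 by 3, calling transpose_into(m, t, 2, 3) leaves t holding 1, 4, 2, 5, 3, 6, and there is nothing to free afterwards.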
If we only have fixed-size matrices (well, with C99 we can use variable-length arrays too!):
typedef int Matrix[ROWS][COLUMNS];
typedef int TMatrix[COLUMNS][ROWS];
typedef int (*pMatrix)[COLUMNS];
typedef int (*pTMatrix)[ROWS];
pTMatrix transpose_matrix(Matrix m , int rows, int cols)
{
pTMatrix t = malloc(sizeof(*t)*cols);
for (int r=0; r<rows ; ++r)
for (int c=0; c<cols ; ++c)
t[c][r]=m[r][c];
return t;
}
Well, if rows and cols are fixed you don't need to pass them... hmmm...
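For the variable-length array route mentioned above, here is a minimal sketch, assuming a compiler with VLA support. transpose_vla is a hypothetical name. Both dimensions are ordinary parameters, so they do need to be passed, and the caller supplies the output array.

```c
#include <assert.h>
#include <stddef.h>

/* Transpose a rows x cols matrix m into the cols x rows matrix t.
   The array parameter types depend on the run-time values rows and cols. */
void transpose_vla(size_t rows, size_t cols,
                   int m[rows][cols], int t[cols][rows])
{
    assert(rows > 0 && cols > 0);
    for (size_t r = 0; r < rows; ++r)
        for (size_t c = 0; c < cols; ++c)
            t[c][r] = m[r][c];
}
```

Called as transpose_vla(2, 3, m, t) with int m[2][3] = {{1, 2, 3}, {4, 5, 6}} and int t[3][2], it leaves t[0][1] equal to 4 and t[2][0] equal to 3.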