How can I make multidimensional dynamically allocated arrays in C? - c

Before you mark this as a duplicate, please notice that I'm looking for a more general solution for arrays of arbitrary dimensions. I have read many posts here and in forums about making 2D or 3D arrays of integers, but these are specific solutions for specific dimensions. I want a general solution for an array of any dimension.
First I need to have a type of intlist as defined below:
typedef struct{
int l; // length of the list
int * e; // pointer to the first element of the array
}intlist;
This actually fills the gap in C caused by treating arrays just as pointers. Using this type I can pass arrays to functions without worrying about losing the size.
In the next step I want to have a mdintlist type for multidimensional dynamically allocated arrays. The type definition should be something like this:
typedef struct Mdintlist{
intlist d; // dimension of the array
/* second part */
}mdintlist;
There are several options for the second part. One option is to have a pointer to a mdintlist of lower dimension, like
struct Mdintlist * c;
The other option is to use void pointers:
void * c;
I don't know how to continue it from here.
P.S. One solution could be to allocate just one block of memory and then access the elements using a function. However, I would like to access the elements in array form, something like tmpmdintlist.c[1][2][3]...
Hope I have explained clearly what I want.
P.S. This is an ancient post, but for those who may end up here some of my efforts can be seen in the Cplus repo.

You can't! You can only use the function option in C, because there is no way to alter the language semantics. In C++, however, you can overload the [] operator, and even though I would never do such an ugly thing (x[1][2][3] is already ugly; if you continue adding "dimensions" it gets really ugly), I think it would be possible.
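The function option can be sketched for any number of dimensions by keeping one block of memory and computing the offset from a list of indices. This is only a minimal sketch: the names mdarray, md_create, md_at and md_destroy are made up for illustration, the storage is assumed to be row-major, and overflow of the total element count is not checked.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical type: an N-dimensional int array stored in one block. */
typedef struct {
    size_t ndim;   /* number of dimensions */
    size_t *dims;  /* extent of each dimension; dims[0] varies slowest */
    int *data;     /* dims[0] * dims[1] * ... elements, zero-initialized */
} mdarray;

/* Allocate the array; returns NULL on allocation failure. */
mdarray *md_create(size_t ndim, const size_t *dims)
{
    assert(ndim > 0 && dims != NULL);
    mdarray *m = malloc(sizeof *m);
    if (!m)
        return NULL;
    m->ndim = ndim;
    m->dims = malloc(ndim * sizeof *m->dims);
    size_t total = 1;
    for (size_t k = 0; m->dims && k < ndim; k++) {
        assert(dims[k] > 0);
        m->dims[k] = dims[k];
        total *= dims[k];            /* overflow is not checked in this sketch */
    }
    m->data = m->dims ? calloc(total, sizeof *m->data) : NULL;
    if (!m->dims || !m->data) {
        free(m->dims);
        free(m);
        return NULL;
    }
    return m;
}

/* Pointer to the element at idx[0], idx[1], ..., idx[ndim-1] (row-major). */
int *md_at(const mdarray *m, const size_t *idx)
{
    size_t offset = 0;
    for (size_t k = 0; k < m->ndim; k++) {
        assert(idx[k] < m->dims[k]);
        offset = offset * m->dims[k] + idx[k];
    }
    return &m->data[offset];
}

void md_destroy(mdarray *m)
{
    if (m) {
        free(m->data);
        free(m->dims);
        free(m);
    }
}
```

With dims = {2, 3, 4}, the call *md_at(m, (size_t[]){1, 2, 3}) = 42; plays the role of m[1][2][3] = 42;. The index list is longer to type than bracket notation, but the same code works for any number of dimensions.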

Well, if you separate the pointers and the array lengths, you end up with much less code.
int *one_dem_array;
size_t one_dem_count[1];
int **two_dem_array;
size_t two_dem_count[2];
int ***three_dem_array;
size_t three_dem_count[3];
This way you can still use your preferred notation.
int num_at_pos = three_dem_array[4][2][3];
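One possible way to allocate and release the three_dem_array form together with its count array is sketched here. alloc_3d and free_3d are hypothetical helper names, and the sketch assumes that the zero-filled memory returned by calloc reads as null pointers, which holds on common platforms but is not promised by the C standard.

```c
#include <assert.h>
#include <stdlib.h>

void free_3d(int ***a, const size_t count[3]);

/* Allocate count[0] x count[1] x count[2] ints as nested pointer tables. */
int ***alloc_3d(const size_t count[3])
{
    assert(count != NULL);
    int ***a = calloc(count[0], sizeof *a);
    if (!a)
        return NULL;
    for (size_t i = 0; i < count[0]; i++) {
        a[i] = calloc(count[1], sizeof *a[i]);
        if (!a[i]) {
            free_3d(a, count);
            return NULL;
        }
        for (size_t j = 0; j < count[1]; j++) {
            a[i][j] = calloc(count[2], sizeof *a[i][j]);
            if (!a[i][j]) {
                free_3d(a, count);
                return NULL;
            }
        }
    }
    return a;
}

/* Release everything alloc_3d created; also safe on a partly built table. */
void free_3d(int ***a, const size_t count[3])
{
    if (!a)
        return;
    for (size_t i = 0; i < count[0]; i++) {
        if (!a[i])
            continue;
        for (size_t j = 0; j < count[1]; j++)
            free(a[i][j]);           /* free(NULL) is a no-op */
        free(a[i]);
    }
    free(a);
}
```

The price of the bracket notation is one allocation per row and per plane, and the number of stars is fixed at compile time, so this form does not generalize to an arbitrary number of dimensions.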

Related

Overwriting an existing 2D Array in C

I'm currently writing a project in C, and I need to be able to fill a 2D array with information already stored in another 2D array. In a separate C file, I have this array:
int levelOne[][4] =
{{5,88,128,0},
{153,65,0,0},
{0,144,160,20}}; //First Array
int levelTwo[][4] =
{{5,88,128,0},
{153,65,0,0},
{0,144,160,20}}; //Second Array
And in my main file, I have this variable which I'd like to fill with the information from both of these arrays at different points in my code. (This isn't exactly what I'm doing, but it's the general gist):
#include "arrayFile.c"
void main()
{
int arrayContainer[][4] = levelOne;
while (true)
{
func(arrayContainer);
if(foo)
{
arrayContainer = levelTwo;//Switches to the other array if the conditional is met.
}
}
}
I know this method doesn't work - you can't overwrite items in arrays after they're instantiated. But is there any way to do something like this? I know I'll most likely need to use pointers to do this instead of completely overwriting the array, however there's not a lot of information on the internet about pointers with multidimensional arrays. In this situation, what's best practice?
Also, I don't know exactly how many arrays of 4 there will be, so I wouldn't be able to use a standard 3D array and just switch between indexes, unless there's a way to make a 3D jagged array that I don't know about.
Given the definitions you show, such as they are, all you need is memcpy(arrayContainer, levelTwo, sizeof levelTwo);.
You should ensure that arrayContainer has sufficient memory to contain the copied data and that levelTwo, since it is used as the operand of sizeof, is a designator for the actual array, not a pointer. If it is not, replace sizeof levelTwo with the size of the array.
If you do not need the actual memory filled with data but simply need a way to refer to the contents of the different arrays, make arrayContainer a pointer instead of an array, as with int (*arrayContainer)[4];. Then you can use arrayContainer = levelOne; or arrayContainer = levelTwo; to change which data it points to.
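A small sketch of the pointer variant, under some assumptions: row4, select_level and sum_level are names invented for this example, and the contents of levelTwo are made-up values chosen so the two levels differ.

```c
#include <assert.h>
#include <stddef.h>

typedef int row4[4];   /* hypothetical alias: one row of four int */

static int levelOne[][4] = { {5, 88, 128, 0}, {153, 65, 0, 0}, {0, 144, 160, 20} };
static int levelTwo[][4] = { {1, 2, 3, 4}, {5, 6, 7, 8} };   /* made-up data */

/* Return a pointer to the first row of the chosen level and its row count. */
row4 *select_level(int which, size_t *rows)
{
    assert(rows != NULL);
    if (which == 2) {
        *rows = sizeof levelTwo / sizeof levelTwo[0];
        return levelTwo;
    }
    *rows = sizeof levelOne / sizeof levelOne[0];
    return levelOne;
}

/* Works on whichever level the pointer currently refers to. */
int sum_level(row4 *level, size_t rows)
{
    int sum = 0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < 4; c++)
            sum += level[r][c];
    return sum;
}
```

A caller would declare int (*arrayContainer)[4]; and a size_t rows;, then write arrayContainer = select_level(2, &rows); to switch levels. No data is copied, and the row count travels separately because the pointer itself does not carry it.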
Also, I don't know exactly how many arrays of 4 there will be, so I wouldn't be able to use a standard 3D array and just switch between indexes, unless there's a way to make a 3D jagged array that I don't know about.
It is entirely possible to have a pointer to dynamically allocated memory which is filled with pointers to arrays of four int, and those pointers can be changed at will.

Is it good programming practice in C to use first array element as array length?

Because in C the array length has to be stated when the array is defined, would it be acceptable practice to use the first element as the length, e.g.
int arr[9]={9,0,1,2,3,4,5,6,7};
Then use a function such as this to process the array:
int printarr(int *ARR) {
for (int i=1; i<ARR[0]; i++) {
printf("%d ", ARR[i]);
}
}
I can see no problem with this but would prefer to check with experienced C programmers first. I would be the only one using the code.
Well, it's bad in the sense that you have an array where the elements do not all mean the same thing. Storing metadata with the data is not a good thing. Just to extrapolate your idea a little bit: we could use the first element to denote the element size and then the second for the length. Try writing a function utilizing both ;)
It's also worth noting that with this method, you will have problems if the array is bigger than the maximum value an element can hold, which for char arrays is a very significant limitation. Sure, you can solve it by using the first two elements. And you can also use casts if you have floating point arrays. But I can guarantee you that you will run into hard-to-trace bugs due to this. Among other things, endianness could cause a lot of issues.
And it would certainly confuse virtually every seasoned C programmer. This is not really a logical argument against the idea as such, but rather a pragmatic one. Even if this was a good idea (which it is not) you would have to have a long conversation with EVERY programmer who will have anything to do with your code.
A reasonable way of achieving the same thing is using a struct.
struct container {
int *arr;
size_t size;
};
int arr[10];
struct container c = { .arr = arr, .size = sizeof arr/sizeof *arr };
But in any situation where I would use something like above, I would probably NOT use arrays. I would use dynamic allocation instead:
const size_t size = 10;
int *arr = malloc(sizeof *arr * size);
if(!arr) { /* Error handling */ }
struct container c = { .arr = arr, .size = size };
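However, be aware that the sizeof arr/sizeof *arr expression from the first example only works on a true array. If you initialize the struct that way when arr is a pointer instead of an array, you're in for "interesting" results, because sizeof arr is then the size of the pointer.
You can also use flexible arrays, as Andreas wrote in his answer.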
In C you can use flexible array members. That is you can write
struct intarray {
size_t count;
int data[]; // flexible array member needs to be last
};
You allocate with
size_t count = 100;
struct intarray *arr = malloc( sizeof(struct intarray) + sizeof(int)*count );
arr->count = count;
That can be done for all types of data.
It makes the use of C-arrays a bit safer (not as safe as the C++ containers, but safer than plain C arrays).
Unfortunately, C++ does not support this idiom in the standard.
Many C++ compilers provide it as an extension, but it is not guaranteed.
On the other hand, this C flexible array idiom may be more explicit and perhaps more efficient than C++ containers, as it does not use an extra indirection and/or need two allocations (think of new vector<int>).
If you stick to C, I think this is a very explicit and readable way of handling variable length arrays with an integrated size.
The only drawback is that the C++ guys do not like it and prefer C++ containers.
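The allocation step can be wrapped in a helper so the count is always set together with the storage. A minimal sketch, assuming the hypothetical names intarray_create and intarray_sum (neither appears in the answer above):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct intarray {
    size_t count;
    int data[];    /* flexible array member needs to be last */
};

/* Allocate a zero-filled intarray with room for count ints, or NULL. */
struct intarray *intarray_create(size_t count)
{
    /* Reject counts whose byte size would overflow size_t. */
    if (count > (SIZE_MAX - sizeof(struct intarray)) / sizeof(int))
        return NULL;
    struct intarray *arr = calloc(1, sizeof *arr + count * sizeof(int));
    if (arr)
        arr->count = count;
    return arr;
}

/* The length travels with the data, so no separate size argument is needed. */
long intarray_sum(const struct intarray *arr)
{
    assert(arr != NULL);
    long total = 0;
    for (size_t i = 0; i < arr->count; i++)
        total += arr->data[i];
    return total;
}
```

Because header and elements live in one allocation, a single free(arr) releases everything.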
It is not bad (I mean it will not invoke undefined behavior or cause other portability issues) when the elements of the array are integers, but instead of writing the magic number 9 directly you should have the compiler calculate the length of the array to avoid typos.
#include <stdio.h>
int main(void) {
int arr[9]={sizeof(arr)/sizeof(*arr),0,1,2,3,4,5,6,7};
for (int i=1; i<arr[0]; i++) {
printf("%d ", arr[i]);
}
return 0;
}
Only a few datatypes are suitable for that kind of hack. Therefore, I would advise against it, as this will lead to inconsistent implementation styles across different types of arrays.
A similar approach is used very often with character buffers where in the beginning of the buffer there is stored its actual length.
Many implementations of dynamic memory allocation in C also use this approach: the allocated memory is prefixed with an integer that keeps the size of the allocation.
However, in general this approach is not suitable for arrays. For example, a character array can be much larger than the maximum positive value (127 where char is signed) that can be stored in an object of type char. Moreover, it is difficult to pass a sub-array of such an array to a function. Most functions that are designed to deal with arrays will not work in such a case.
A general approach to declare a function that deals with an array is to declare two parameters. The first one has a pointer type that specifies the initial element of an array or sub-array and the second one specifies the number of elements in the array or sub-array.
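As an illustration of the two-parameter convention, here is a minimal sketch; sum_ints is a hypothetical function name used only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Sum n elements starting at first; first may point into the middle of an array. */
long sum_ints(const int *first, size_t n)
{
    assert(first != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += first[i];
    return total;
}
```

For int arr[] = {0, 1, 2, 3, 4, 5, 6, 7};, the whole array is passed as sum_ints(arr, 8), and the three-element sub-array starting at index 2 as sum_ints(arr + 2, 3). The second call is exactly what a length stored in arr[0] would make awkward.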
C also allows you to declare functions that accept variable length arrays, whose sizes are specified at run time.
It is suitable in rather limited circumstances. There are better solutions to the problem it solves.
One problem with it is that if it is not universally applied, then you would have a mix of arrays that used the convention and those that didn't, and you would have no way of telling whether an array uses the convention or not. For arrays used to carry strings, for example, you would have to continually pass &arr[1] in calls to the standard string library, or define a new string library that uses "Pascal strings" rather than "ASCIZ string" conventions (such a library would be more efficient, as it happens).
In the case of a true array rather than simply a pointer to memory, sizeof(arr) / sizeof(*arr) will yield the number of elements without having to store it in the array in any case.
It only really works for integer type arrays, and for char arrays it would limit the length to something rather short. It is not practical for arrays of other object types or data structures.
A better solution would be to use a structure:
typedef struct
{
size_t length ;
int* data ;
} intarray_t ;
Then:
int data[9] ;
intarray_t array = { sizeof(data) / sizeof(*data), data } ;
Now you have an array object that can be passed to functions and retain the size information, and the data member can be accessed directly for use in third-party or standard library interfaces that do not accept the intarray_t. Moreover, the type of the data member can be anything.
Obviously NO is the answer.
All programming languages have predefined functions stored along with the variable type. Why not use them?
In your case it is more suitable to use a count/length method instead of testing the first value.
An if clause sometimes takes more time than a predefined function.
At first glance it seems OK to store the counter, but imagine you have to update the array. You will have to do two operations: one to insert the element and another to update the counter. So two operations means two variables to be changed.
For static arrays it might be OK to store the counter followed by the list, but for dynamic ones NO NO NO.
On the other hand, please read up on basic programming concepts and you will find that your idea is a bad one that does not comply with programming principles.

How can I understand int *a[5][5]?

I got a task to modify the content of a 2-dimensional array int[5][5], I was given the definition int *a[5][5] and ordered to use a int** (the pointer of a pointer) to handle this task.
I'm now wondering the meaning of this int *a[5][5], how can I understand the meaning of this and similar definitions?
int *a[5][5] is a 2D array of pointers. A pointer-to-pointer can be used to point at any pointer item in this array.
As for how to understand the declaration, everything left of the variable name is the type of each item in the array, in this case int*.
You could also use this site. It works for many C declarations, but not all.
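To make the declaration concrete, here is a hedged sketch of how an int** can be used with int *a[5][5] to modify a separate int[5][5]. The helper names link_cells and fill_through are invented for this example, and the assumption that each pointer in a refers to the matching cell of the int matrix is a guess at what the task intends.

```c
#include <assert.h>

/* Make every pointer in a refer to the matching element of m. */
void link_cells(int *a[5][5], int m[5][5])
{
    assert(a != 0 && m != 0);
    for (int i = 0; i < 5; i++)
        for (int j = 0; j < 5; j++)
            a[i][j] = &m[i][j];
}

/* Store value in every int reachable through the 25 pointers. */
void fill_through(int *a[5][5], int value)
{
    for (int i = 0; i < 5; i++) {
        int **p = a[i];      /* a[i] has type int *[5] and decays to int ** */
        for (int j = 0; j < 5; j++)
            *p[j] = value;   /* p[j] is an int *; dereference it to reach the int */
    }
}
```

The int** is taken per row, so it never walks past the end of one int *[5] row into the next.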
It is nothing but a Matrix of Pointers
In fact there are many questions on Stack Overflow about this. Please refer to cdecl.org.

passing multidimensional array as argument in C

C newbie here, I need some help: can anyone explain to me (and offer a workaround) why this works:
int n=1024;
int32_t data[n];
void synthesize_signal(int32_t *data) {
...//do something with data}
which let me alter data in the function; but this does not?
int n=1024;
int number=1024*16;
int32_t data[n][2][number];
void synthesize_signal(int32_t *data) {
...//do something with data}
The compiler error message is something like it expected int32_t * but got int32_t (*)[2][(sizetype)(number)] instead.
First, arrays in C are effectively passed by reference: the array is converted to a pointer to its first element, so the function can modify the data in the array. You don't have to worry about passing a pointer to the array explicitly. In fact, when passing to a function, C makes no real distinction between a pointer to the beginning of an array and the array itself.
In your first version, you are making a one-dimensional array data[n] and passing it to your function. Inside the function, you use it by writing something like data[i]. This translates directly to *(data + i): the compiler uses the size of the elements, sizeof(int32_t), to find the memory location that is i positions past the beginning of your array.
int n=1024;
int number=1024*16;
int32_t data[n][2][number];
void synthesize_signal(int32_t *data)
In the second case, you're setting up a multi-dimensional array (3D in your case). You set it up correctly. The problem is that when you pass it to the function, the only thing that gets passed is the address of the beginning of the array. When it gets used inside the function, you'll do something like
data[i][1][x] = 5;
Internally C is calculating how far from the beginning of the array this location is. In order to do that, it needs to know the dimensions of the array. (Unlike some newer languages, C does not store any extra data about array lengths or sizes.) You just need to change the function signature so it knows the shape/size of the array to expect. Because of the way it calculates array positions, it doesn't need the first dimension.
In this case, change your function signature to look like this:
void synthesize_signal(int32_t data[][2][number]) { ...
Set up the array the same way you are doing in the second one above, and just call it as you'd expect:
synthesize_signal(data);
This should fix everything for you.
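If number is not a global, the sizes can be passed as parameters ahead of the array so that they are in scope for its declaration. This is a sketch under that assumption, using C99 variably modified parameter types (optional in C11, but supported by common compilers); the values written into the array are placeholders.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* n and number come first so the array parameter can refer to them. */
void synthesize_signal(size_t n, size_t number, int32_t data[n][2][number])
{
    assert(data != NULL);
    for (size_t i = 0; i < n; i++)
        for (size_t ch = 0; ch < 2; ch++)
            for (size_t x = 0; x < number; x++)
                data[i][ch][x] = (int32_t)(i + ch + x);   /* placeholder values */
}
```

The call becomes synthesize_signal(n, number, data);. Note that with n = 1024 and number = 1024*16 the array holds 128 MiB, which is likely too large for the stack; one option is int32_t (*data)[2][number] = malloc(n * sizeof *data);, which keeps the data[i][1][x] notation.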
The comments mention some useful information about using more descriptive variable names, and global vs. local variables. All valid comments to keep in mind. I just addressed the code problem you're having in terms of multi-dimensional arrays.
try
synthesize_signal(int32_t** data)
{
}
Your function also needs to know that data is multi dimensional. You should also consider renaming your data array. I suspect that it is a global variable and using the same name in function can lead to problems.
When you call the function, do it like this:
synthesize_signal(&data[0][0][0]);

Triple pointers in C: is it a matter of style?

I feel like triple pointers in C are looked at as "bad". For me, it makes sense to use them at times.
Starting from the basics, the single pointer has two purposes: to create an array, and to allow a function to change its contents (pass by reference):
char *a;
a = malloc...
or
void foo (char *c) //means I'm going to modify the parameter in foo.
{ *c = 'f'; }
char a;
foo(&a);
The double pointer can be a 2D array (or array of arrays, since each "column" or "row" need not be the same length). I personally like to use it when I need to pass a 1D array:
void foo (char **c) //means I'm going to modify the elements of an array in foo.
{ (*c)[0] = 'f'; }
char *a;
a = malloc...
foo(&a);
To me, that helps describe what foo is doing. However, it is not necessary:
void foo (char *c) //am I modifying a char or just passing a char array?
{ c[0] = 'f'; }
char *a;
a = malloc...
foo(a);
will also work.
According to the first answer to this question, if foo were to modify the size of the array, a double pointer would be required.
One can clearly see how a triple pointer (and beyond, really) would be required. In my case if I were passing an array of pointers (or array of arrays), I would use it. Evidently it would be required if you are passing into a function that is changing the size of the multi-dimensional array. Certainly an array of arrays of arrays is not too common, but the other cases are.
So what are some of the conventions out there? Is this really just a question of style/readability combined with the fact that many people have a hard time wrapping their heads around pointers?
Using triple+ pointers harms both readability and maintainability.
Let's suppose you have a little function declaration here:
void fun(int***);
Hmmm. Is the argument a three-dimensional jagged array, a pointer to a two-dimensional jagged array, or a pointer to a pointer to an array (as in, the function allocates an array and assigns a pointer to int within the function)?
Let's compare this to:
void fun(IntMatrix*);
Surely you can use triple pointers to int to operate on matrices. But that's not what they are. The fact that they're implemented here as triple pointers is irrelevant to the user.
Complicated data structures should be encapsulated. This is one of the core ideas of object-oriented programming. Even in C, you can apply this principle to some extent. Wrap the data structure in a struct (or, very common in C, use "handles", that is, pointers to an incomplete type; this idiom will be explained later in the answer).
Let's suppose that you implemented the matrices as jagged arrays of double. Compared to contiguous 2D arrays, they are worse when iterating over them (as they don't belong to a single block of contiguous memory) but allow for accessing with array notation and each row can have different size.
So the problem is that you can't change representations, as the use of pointers is hard-wired throughout user code, and you're stuck with an inferior implementation.
This wouldn't even be a problem if you encapsulated it in a struct.
typedef struct Matrix_
{
double** data;
} Matrix;
double get_element(Matrix* m, int i, int j)
{
return m->data[i][j];
}
simply gets changed to
typedef struct Matrix_
{
int width;
double data[]; //C99 flexible array member
} Matrix;
double get_element(Matrix* m, int i, int j)
{
return m->data[i*m->width+j];
}
The handle technique works like this: in the header file, you declare an incomplete struct and all the functions that work on a pointer to the struct:
// struct declaration with no body.
struct Matrix_;
// optional: allow people to declare the matrix with Matrix* instead of struct Matrix*
typedef struct Matrix_ Matrix;
Matrix* create_matrix(int w, int h);
void destroy_matrix(Matrix* m);
double get_element(Matrix* m, int i, int j);
double set_element(Matrix* m, double value, int i, int j);
in the source file you declare the actual struct and define all the functions:
typedef struct Matrix_
{
int width;
double data[]; //C99 flexible array member
} Matrix;
double get_element(Matrix* m, int i, int j)
{
return m->data[i*m->width+j];
}
/* definition of the rest of the functions */
The rest of the world doesn't know what struct Matrix_ contains, and it doesn't know its size. This means users can't declare values directly; they can only use a pointer to Matrix and the create_matrix function. However, the fact that the user doesn't know the size means the user doesn't depend on it, which means we can remove or add members to struct Matrix_ at will.
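The header declares create_matrix, destroy_matrix and set_element but leaves their definitions out. One possible source-file implementation is sketched here. It is not the answer's own code: the height member, the bounds checks, and the choice to have set_element return the stored value are all assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct Matrix_ {
    int width;
    int height;       /* assumption: kept so that indices can be checked */
    double data[];    /* C99 flexible array member, row-major */
} Matrix;

/* Allocate a w x h matrix; returns NULL on bad sizes or allocation failure. */
Matrix *create_matrix(int w, int h)
{
    if (w <= 0 || h <= 0)
        return NULL;
    size_t max_cells = (SIZE_MAX - sizeof(Matrix)) / sizeof(double);
    if ((size_t)h > max_cells / (size_t)w)
        return NULL;                      /* byte size would overflow */
    size_t cells = (size_t)w * (size_t)h;
    /* calloc zero-fills, which reads as 0.0 for IEEE 754 doubles. */
    Matrix *m = calloc(1, sizeof *m + cells * sizeof(double));
    if (m) {
        m->width = w;
        m->height = h;
    }
    return m;
}

void destroy_matrix(Matrix *m)
{
    free(m);   /* one allocation holds both header and elements */
}

double get_element(Matrix *m, int i, int j)
{
    assert(m && i >= 0 && i < m->height && j >= 0 && j < m->width);
    return m->data[(size_t)i * m->width + j];
}

/* Assumption: returns the value that was stored. */
double set_element(Matrix *m, double value, int i, int j)
{
    assert(m && i >= 0 && i < m->height && j >= 0 && j < m->width);
    m->data[(size_t)i * m->width + j] = value;
    return value;
}
```

Callers only ever see Matrix*, so adding the height member here does not require changing any user code.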
Most of the time, the use of 3 levels of indirection is a symptom of bad design decisions made elsewhere in the program. Therefore it is regarded as bad practice and there are jokes about "three star programmers" where, unlike the rating for restaurants, more stars means worse quality.
The need for 3 levels of indirection often originates from the confusion about how to properly allocate multi-dimensional arrays dynamically. This is often taught incorrectly even in programming books, partially because doing it correctly was burdensome before the C99 standard. My Q&A post Correctly allocating multi-dimensional arrays addresses that very issue and also illustrates how multiple levels of indirection will make the code increasingly hard to read and maintain.
Though as that post explains, there are some situations where a type** might make sense. A variable table of strings with variable length is such an example. And when that need for type** arises, you might soon be tempted to use type***, because you need to return your type** through a function parameter.
Most often this need arises in a situation where you are designing some manner of complex ADT. For example, let's say that we are coding a hash table, where each index is a 'chained' linked list, and each node in the linked list is an array. The proper solution then is to re-design the program to use structs instead of multiple levels of indirection. The hash table, linked list and array should be distinct, autonomous types without any awareness of each other.
So by using proper design, we will avoid the multiple stars automatically.
But as with every rule of good programming practice, there are always exceptions. It is perfectly possible to have a situation like:
Must implement an array of strings.
The number of strings is variable and may change in run-time.
The length of the strings is variable.
You can implement the above as an ADT, but there may also be valid reasons to keep things simple and just use a char* [n]. You then have two options to allocate this dynamically:
char* (*arr_ptr)[n] = malloc( sizeof(char*[n]) );
or
char** ptr_ptr = malloc( sizeof(char*[n]) );
The former is more formally correct, but also cumbersome. Because it has to be used as (*arr_ptr)[i] = "string";, while the alternative can be used as ptr_ptr[i] = "string";.
Now suppose we have to place the malloc call inside a function and the return type is reserved for an error code, as is custom with C APIs. The two alternatives will then look like this:
err_t alloc_arr_ptr (size_t n, char* (**arr)[n])
{
*arr = malloc( sizeof(char*[n]) );
return *arr == NULL ? ERR_ALLOC : OK;
}
or
err_t alloc_ptr_ptr (size_t n, char*** arr)
{
*arr = malloc( sizeof(char*[n]) );
return *arr == NULL ? ERR_ALLOC : OK;
}
It is quite hard to argue and say that the former is more readable, and it also comes with the cumbersome access needed by the caller. The three star alternative is actually more elegant, in this very specific case.
So it does us no good to dismiss 3 levels of indirection dogmatically. But the choice to use them must be well-informed, with an awareness that they may create ugly code and that there are other alternatives.
So what are some of the conventions out there? Is this really just a question of style/readability combined with the fact that many people have a hard time wrapping their heads around pointers?
Multiple indirection is not bad style, nor black magic, and if you're dealing with high-dimension data then you're going to be dealing with high levels of indirection; if you're really dealing with a pointer to a pointer to a pointer to T, then don't be afraid to write T ***p;. Don't hide pointers behind typedefs unless whoever is using the type doesn't have to worry about its "pointer-ness". For example, if you're providing the type as a "handle" that gets passed around in an API, such as:
typedef ... *Handle;
Handle h = NewHandle();
DoSomethingWith( h, some_data );
DoSomethingElseWith( h, more_data );
ReleaseHandle( h );
then sure, typedef away. But if h is ever meant to be dereferenced, such as
printf( "Handle value is %d\n", *h );
then don't typedef it. If your user has to know that h is a pointer to int (see note 1) in order to use it properly, then that information should not be hidden behind a typedef.
I will say that in my experience I haven't had to deal with higher levels of indirection; triple indirection has been the highest, and I haven't had to use it more than a couple of times. If you regularly find yourself dealing with >3-dimensional data, then you'll see high levels of indirection, but if you understand how pointer expressions and indirection work it shouldn't be an issue.
1. Or a pointer to pointer to int, or pointer to pointer to pointer to pointer to struct grdlphmp, or whatever.
After two levels of indirection, comprehension becomes difficult. Moreover if the reason you're passing these triple (or more) pointers into your methods is so that they can re-allocate and re-set some pointed-to memory, that gets away from the concept of methods as "functions" that just return values and don't affect state. This also negatively affects comprehension and maintainability beyond some point.
But more fundamentally, you've hit upon one of the main stylistic objections to the triple pointer right here:
One can clearly see how a triple pointer (and beyond, really) would be required.
It's the "and beyond" that is the issue here: once you get to three levels, where do you stop? Surely it's possible to have an arbitrary number of levels of indirection. But it's better to just have a customary limit someplace where comprehensibility is still good but flexibility is adequate. Two's a good number. "Three star programming", as it's sometimes called, is controversial at best; it's either brilliant, or a headache for those who need to maintain the code later.
Unfortunately you have misunderstood the concept of pointers and arrays in C. Remember that arrays are not pointers.
Starting from the basics, the single pointer has two purposes: to create an array, and to allow a function to change its contents (pass by reference):
When you declare a pointer, you need to initialize it before using it in the program. This can be done either by assigning it the address of a variable or by dynamic memory allocation.
In the latter case, the pointer can be indexed like an array (but it is not an array).
The double pointer can be a 2D array (or array of arrays, since each "column" or "row" need not be the same length). I personally like to use it when I need to pass a 1D array:
Again wrong. Arrays are not pointers and vice versa. A pointer to pointer is not a 2D array.
I would suggest you read the c-faq section 6, Arrays and Pointers.
