Confusion with malloc - c

I'm trying to understand a piece of C code which is as follows:
#define DIM2( basetype, name, w1 ) basetype (*name)[w1]
int mx = 10; //number of rows per processor
int my = 100; //number of cols
DIM2 (double, f, my);
f = (typeof (f)) malloc (2 * mx * sizeof (*f));
If I'm correct, with DIM2 a 1-d array of (size=100) pointers to double is created.
I don't understand what the malloc call then does. Are both statements necessary?
Is there another way to achieve what the last two lines of code above do?

The macro evaluates to:
double (*f)[my];
which is a pointer to array of double, not an array of pointer to double.
malloc allocates an array of 2 * mx * <whatever f points to> (i.e. an array of double). Not sure why it would allocate twice as many entries as given by mx, but that's what it does.
So, f points to the first array of double afterwards. It effectively allocates a true 2 dimensional array. (not the often confused array of pointers to double).
Note that casting the result of malloc is bad practice in C.
Comment: As the macro saves no typing and adds no specific information, it is actually bad practice. Worse, it hides the pointer semantics and so obfuscates the code. My recommendation is not to use it and to be explicit instead; that is not even more typing.
Update: There is currently an argument about whether sizeof(*f) presents undefined behaviour, because f is used uninitialized here. While I see a flaw in the standard here, which should be more precise, you might better play safe and use an explicit expression:
f = malloc (2 * mx * my * sizeof (double));
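To make the pointer-to-array idea concrete, here is a minimal sketch. The helper name alloc_rows and its parameters are made up for illustration and are not part of the original code. It computes the size from plain integers, so sizeof is never applied to the uninitialized pointer.

```c
#include <stdlib.h>

/* Hypothetical helper (not from the original code): returns storage for
   rows * cols doubles as ONE contiguous block, or NULL on failure.
   This sketch does no overflow checking on the size computation. */
void *alloc_rows(size_t rows, size_t cols)
{
    return malloc(rows * cols * sizeof(double));
}
```

A caller would write double (*f)[my] = alloc_rows(2 * mx, my); and then index f[i][j] as with a true 2-D array. Because everything lives in one block, a single free(f) releases it.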

double (*f)[my] is a VLA type because my is an int. So sizeof (*f) causes undefined behaviour because the argument of sizeof is evaluated if it has VLA type. For more discussion see here.
Unfortunately the sizeof *f idiom can only be used with pointer to array of fixed dimension (or pointer to non-array!). So this entire idea is bogus.
IMHO it is simpler and clearer to do away with the macro and write:
double (*f)[my] = malloc( sizeof(double[mx][my]) );

Related

Can one malloc call be used to allocate two arrays?

If I know that two types T and U have the same alignment, can I use one malloc call like this:
void* allocate_memory(int n, int m) {
return malloc(sizeof(T) * n + sizeof(U) * m);
}
to allocate contiguous memory for arrays of these two types?
If it is okay, what is the correct way to acquire the pointer to the first element of the second array? Conversion void* -> char* -> (+= sizeof(T) * n) -> U* seems fine, but I feel like there might be some kind of undefined behaviour there.
(I'm almost sure it can't be done in C++, rules of pointer arithmetic won't allow this (At no point array of U starts to exist, so you can't perform pointer arithmetic on this storage). Hence my cautiousness about C rules)
edit:
Since P0593R6 got accepted and applied as Defect Report to all C++ standards back to C++98, a call to malloc implicitly creates objects in allocated storage. Because of that, this construction is now valid in C++ too and pointer arithmetic on this range is well-defined as well.
In C, you can perform arithmetic on the full allocated object via its representation array, which has type unsigned char [] but can legally be addressed (less verbosely) via just char *. I'm not sure about in C++ but I would think you could do the same.
If p is the pointer returned, (U *)((char *)p + sizeof(T) * n) is a valid pointer to what you want.
Note that you can get rid of the "same alignment" requirement just by using _Alignof(U) or by using sizeof(U) (or the highest power of two that divides it) as a (not necessarily sharp) estimate for the alignment and working out the necessary padding in between to reach a multiple of the alignment. If you do this make sure to allocate the right total amount including the padding.
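As a rough sketch of the padding approach described above, assume T is int and U is double (chosen only for concreteness). The struct two_arrays and the functions round_up and alloc_two are hypothetical names. _Alignof requires C11.

```c
#include <stddef.h>
#include <stdlib.h>

/* Illustrative pair of arrays sharing one allocation. The names are
   assumptions; T = int and U = double are picked for concreteness. */
typedef struct {
    void   *base;    /* the pointer to hand to free() */
    int    *first;   /* n elements of T */
    double *second;  /* m elements of U */
} two_arrays;

/* Round offset up to the next multiple of align (align must be nonzero). */
static size_t round_up(size_t offset, size_t align)
{
    return (offset + align - 1) / align * align;
}

/* Allocates both arrays in one block. Overflow checks are omitted in this
   sketch. On failure all three members are NULL. */
two_arrays alloc_two(size_t n, size_t m)
{
    two_arrays r = { NULL, NULL, NULL };
    /* Pad the end of the T array so the U array starts suitably aligned. */
    size_t second_off = round_up(n * sizeof(int), _Alignof(double));
    void *p = malloc(second_off + m * sizeof(double));
    if (p == NULL)
        return r;
    r.base   = p;
    r.first  = p;
    r.second = (double *)((char *)p + second_off);
    return r;
}
```

The total size includes the padding, and only base is ever passed to free().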

Allocate 6xNxN array

I have a variable N. I need a 6xNxN array.
Something like this:
int arr[6][N][N];
But, obviously, that doesn't work.
I'm not sure how I'd go about allocating this so that I can access, e.g. arr[5][4][4] if N is 5, and arr[5][23][23] if N is 24.
Note that N will never change, so I'll never have to reallocate arr.
What should I do? Will int ***arr = malloc(6 * N * N * sizeof(int)); work?
You can allocate your 3-dimensional array on the heap as
int (*arr)[N][N] = malloc(sizeof(int[6][N][N]));
After use, you can free as
free(arr);
Another way of writing the same, as suggested by @StoryTeller, is:
int (*arr)[N][N] = malloc(6u * sizeof(*arr));
But here you need to be careful about the u after 6 to prevent signed arithmetic overflow.
Also, there can still be issues on platforms where size_t is narrower than int, as pointed out by @chqrlie, but that won't be the case on "most" commonly used platforms and hence you are fine using it.
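One way to sidestep the overflow worries entirely is to do all size arithmetic in size_t and check it before multiplying. The following is a minimal sketch under that assumption; the helper name alloc_6nn is hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocates storage for int[6][n][n] in one block.
   All arithmetic is done in size_t and checked first, so there is neither
   signed overflow nor silent wraparound. Returns NULL if the size does not
   fit in size_t or if malloc fails. */
void *alloc_6nn(size_t n)
{
    const size_t max_elems = SIZE_MAX / sizeof(int) / 6;
    if (n != 0 && n > max_elems / n)
        return NULL;                 /* 6 * n * n * sizeof(int) would not fit */
    return malloc(6 * n * n * sizeof(int));
}
```

The result can be assigned to int (*arr)[N][N] and indexed as arr[5][4][4] when N is 5.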
int arr[6][N][N]; will work just fine. You merely need to update your compiler and C knowledge to the year 1999 or later, when variable-length arrays (VLA) were introduced to the language.
(If you have an older version of GCC than 5.0, you must explicitly tell it to not use an ancient version of the C standard, by passing -std=c99 or -std=c11.)
Alternatively if you need heap allocation, you can do:
int (*arrptr)[Y][Z] = malloc( sizeof(int[X][Y][Z]) );
You cannot do int ***arr = malloc(6 * N * N * sizeof(int)); since an int*** cannot point at a 3D array. In general, more than two levels of indirection is a strong sign that your program design is flawed.
Detailed info here: Correctly allocating multi-dimensional arrays.
What you want can't work directly. For indexing a multi-dimensional array, all but the very first dimension need to be part of the type and here's why:
The indexing operator operates on pointers by first adding an index to the pointer and then dereferencing it. The identifier of an array evaluates to a pointer to its first element (except when e.g. used with sizeof, _Alignof and &), so indexing on arrays works as you would expect.
It's very simple in the case of a single-dimension array. With
int a[42];
a evaluates to a pointer of type int * and indexing works the following way: a[18] => *(a + 18).
Now in a 2-dimensional array, all the elements are stored contiguously ("row" after "row" if you want to understand it as a matrix), and what's making the indexing "magic" work is the types involved. Take for example:
int a[16][42];
Here, the elements of a have the type int [42] (42-element array of int). According to the rules above, evaluating an expression of this type in most contexts again yields an int * pointer. But what about a itself? Well, it's an array of int [42], so a will evaluate to a pointer to a 42-element array of int: int (*)[42]. Then let's have a look at what the indexing operator does:
a[3][18] => *(*(a + 3) + 18)
With a evaluating to a pointer to its first element, of type int (*)[42], the inner addition of 3 advances the address by 3 * 42 * sizeof(int) bytes. This would be impossible if the second dimension wasn't known in the type.
I guess it's simple to deduce the example for the n-dimensional case.
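The indexing rule above can be checked with a small sketch. The function names get_elem and step_bytes are made up for illustration; the dimension 42 is the one from the example.

```c
#include <stddef.h>

/* Reads a[i][j] using the pointer arithmetic spelled out in the text. */
int get_elem(int (*a)[42], size_t i, size_t j)
{
    return *(*(a + i) + j);
}

/* Number of bytes the address moves when i is added to a pointer of type
   int (*)[42]: each step skips one whole row of 42 ints. */
size_t step_bytes(int (*a)[42], size_t i)
{
    return (size_t)((char *)(a + i) - (char *)a);
}
```

Passing an int a[16][42] to either function works without a cast, because the array expression converts to int (*)[42].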
In your case, you have two possibilities to achieve something similar to what you want.
Use a dynamically allocated flat array with size 6*N*N. You can calculate the indices yourself if you save N somewhere.
Somewhat less efficient, but yielding better readable code, you could use an array of pointers to arrays of pointers to int (multiple indirection). You could e.g. do
int ***a = malloc(6 * sizeof *a);
for (size_t i = 0; i < 6; ++i)
{
    a[i] = malloc(N * sizeof *(a[i]));
    for (size_t j = 0; j < N; ++j)
    {
        a[i][j] = malloc(N * sizeof *(a[i][j]));
    }
}
// add error checking to malloc calls!
Then your accesses will look just like those to a normal 3d array, but it's stored internally as many arrays with pointers to the other arrays instead of in a big contiguous block.
I don't think it's worth using this many indirections, just to avoid writing e.g. a[2*N*N+5*N+4] to access the element at 2,5,4, so my recommendation would be the first method.
Making a simple change to the declaration on this line and keeping the malloc can easily solve your problem.
int ***arr = malloc(6 * N * N * sizeof(int));
However, int *** is unnecessary (and wrong). Use a flat array, which is easy to allocate:
int *flatarr = malloc(6 * N * N * sizeof(int));
This works for three dimensions, and instead of accessing arr[X][Y][Z] as in the question, you access flatarr[(X*N*N) + (Y*N) + Z]. In fact, you could even write a handy macro:
#define arr(X,Y,Z) flatarr[((X)*N*N) + ((Y)*N) + (Z)]
This is basically what I've done in my language Cubically to allow for multiple-size cubes. Thanks to Programming Puzzles & Code Golf user Dennis for giving me this idea.
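If you prefer a function to a macro, the same index calculation could be sketched as below. The name idx3 is hypothetical. It takes N as a parameter instead of relying on a variable in scope, and evaluates each argument exactly once.

```c
#include <stddef.h>

/* Maps (x, y, z) in a 6 x n x n cube to an offset into a flat array.
   Equivalent to x*n*n + y*n + z. No bounds checking is done. */
static inline size_t idx3(size_t n, size_t x, size_t y, size_t z)
{
    return (x * n + y) * n + z;
}
```

An access then reads flatarr[idx3(N, X, Y, Z)].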

Getting length of an array

I've been wondering how to get the number of elements of an array. Somewhere in this website I found an answer which told me to declare the following macro:
#define NELEMS(x) (sizeof(x) / sizeof(x[0]))
It works well for arrays defined as:
type arr[];
but not for the following:
type *arr = (type) malloc(32*sizeof(type));
it returns 1 in that case (it's supposed to return 32).
I would appreciate some hint on that
Pointers do not keep information about whether they point to a single element or the first element of an array
So if you have a statement like this
type *arr = (type) malloc(32*sizeof(type));
then arr here is not an array. It is a pointer to the beginning of the dynamically allocated memory extent.
Or even if you have the following declarations
type arr[10];
type *p = arr;
then again the pointer knows nothing about whether it points to a single object or the first element of an array. You can at any time write, for example,
type obj;
p = &obj;
So when you deal with pointers that point to first elements of arrays you have to keep somewhere (in some other variable) the actual size of the referenced array.
As for arrays themselves then indeed you may use expression
sizeof( arr ) / sizeof( *arr )
or
sizeof( arr ) / sizeof( arr[0] )
But arrays are not pointers, though very often they are converted to pointers to their first elements, with rare exceptions. The sizeof operator is one such exception: arrays used as the operand of sizeof are not converted to pointers to their first elements.
The sizeof operator produces the size of the type of its operand. It does not count the amount of memory allocated to a pointer (representing the array).
To elaborate,
in case of type arr[32];, sizeof (arr) is essentially sizeof(type[32]).
in case of type *arr;, sizeof(arr) is essentially sizeof(type*)
To get the length of a string, you need to use strlen().
Remember, the definition of string is a null-terminated character array.
That said, in your code,
type *arr = (type) malloc(32*sizeof(type));
is very wrong. To avoid this kind of error, we suggest that you do not cast the result of malloc().
And remove the cast. You should not cast the result of malloc and family.
These are the main reasons for not casting the returned value from malloc (and family of functions).
In C, the return type of those functions is void *. A void * can be assigned to any object pointer type.
During debugging and during maintenance the receiving pointer type is often changed. The origin of that change is often not where the malloc function is called. If the returned value is cast, then a bug is introduced to the code. This kind of bug can be very difficult to find.
There is no safe and sound way of finding the length of an array in C since no bookkeeping is done for them.
You will need to use some other data structure that does the bookkeeping for you in order to ensure the correct result every time.
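A minimal sketch of such bookkeeping is to store the element count next to the pointer. The type int_buf and its two functions are hypothetical names; the NELEMS macro is the one from the question, with the argument parenthesized.

```c
#include <stdlib.h>

/* Works for true arrays only; gives a wrong answer for pointers. */
#define NELEMS(x) (sizeof(x) / sizeof((x)[0]))

/* Hypothetical wrapper that keeps the element count with the pointer. */
typedef struct {
    size_t len;
    int   *data;
} int_buf;

/* Allocates len zero-initialized ints. On failure data is NULL and len is 0. */
int_buf int_buf_new(size_t len)
{
    int_buf b = { 0, NULL };
    b.data = calloc(len, sizeof *b.data);
    if (b.data != NULL)
        b.len = len;
    return b;
}

/* Releases the storage and resets the bookkeeping. */
void int_buf_free(int_buf *b)
{
    free(b->data);
    b->data = NULL;
    b->len = 0;
}
```

With this, code that only receives an int_buf can read b.len instead of trying to recover the length from the pointer.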

What are the differences between *ptr and **ptr?

I am coding a 3D array using triple pointers with malloc. I replaced *ptrdate in (a), *ptrdate[i], and *ptrdate[i] with *ptrdate in the code below, since they are all basically pointers of type Date, just accessed at different dimensions. I got the same results both ways.
Question: what's the difference when used as the operand of sizeof?
typedef struct {
int day;
} Date;
int main(){
int i, j, k, count=0;
int row=3, col=4, dep=5;
Date ***ptrdate = malloc(row * sizeof *ptrdate); //(a)
for (i=0; i<row; i++) {
ptrdate[i] = malloc(col * sizeof *ptrdate[i]); //(b)
for (j=0; j<col; j++) {
ptrdate[i][j] = malloc(dep * sizeof *ptrdate[i][j]); //(c)
}
}
I am coding a 3D array using triple pointers with malloc.
First of all, there is no need for any array to be allocated using more than one call to malloc. In fact, it is incorrect to do so, as the word "array" is considered to denote a single block of contiguous memory, i.e. one allocation. I'll get to that later, but first, your question:
Question: what's the difference when used as the operand of sizeof?
The answer, though obvious, is often misunderstood. They're different pointer types, which coincidentally have the same size and representation on your system... but they might have different sizes and representations on other systems. It is important to keep that possibility in mind, so that you can be sure your code is as portable as possible.
Given size_t row=3, col=4, dep=5;, you can declare an array like so: Date array[row][col][dep];. I know you have no use for such a declaration in this question... Bear with me for a moment. If we printf("%zu\n", sizeof array);, it'll print row * col * dep * sizeof (Date). It knows the full size of the array, including all of the dimensions... and this is exactly how many bytes are required when allocating such an array.
printf("%zu\n", sizeof ptrDate); with ptrDate declared as in your code will produce something entirely different, though... It'll produce the size of a pointer (to pointer to pointer to Date, not to be confused with pointer to Date or pointer to pointer to Date) on your system. All of the size information regarding the dimensions (e.g. the row * col * dep multiplication) is lost, because we haven't told our pointers to maintain that size information. We can still find sizeof (Date) by using sizeof ***ptrDate, though, because the element type is part of the pointer type.
What if we could tell our pointers to maintain the other size information (the dimensions), though? What if we could write ptrDate = malloc(row * sizeof *ptrDate);, and have sizeof *ptrDate equal to col * dep * sizeof (Date)? This would simplify allocation, wouldn't it?
This brings us back to my introduction: There is a way to perform all of this allocation using one single malloc. It's a simple pattern to remember, but a difficult pattern to understand (and probably appropriate to ask another question about):
Date (*ptrDate)[col][dep] = malloc(row * sizeof *ptrDate);
Suffice to say, usage is still mostly the same. You can still use this like ptrDate[x][y][z]... There is one thing that doesn't seem quite right, though, and that is sizeof ptrDate still yields the size of a pointer (to array[col][dep] of Date) and sizeof *ptrDate doesn't contain the row dimension (hence the multiplication in the malloc above). I'll leave it as an exercise to you to work out whether a solution is necessary for that...
free(ptrDate); // Ooops! I must remember to free the memory I have allocated!
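To see what sizeof reports for the pointer-to-array version, here is a small sketch. The helper plane_bytes is a hypothetical name and assumes all three dimensions are nonzero. Date is the struct from the question.

```c
#include <stdlib.h>

typedef struct {
    int day;
} Date;

/* Hypothetical helper: allocates Date[row][col][dep] with one malloc and
   reports sizeof *p, i.e. the size of one [col][dep] plane.
   Assumes row, col and dep are all nonzero. Returns 0 if malloc fails. */
size_t plane_bytes(size_t row, size_t col, size_t dep)
{
    Date (*p)[col][dep] = malloc(sizeof(Date[row][col][dep]));
    if (p == NULL)
        return 0;
    size_t bytes = sizeof *p;               /* p already points at valid storage */
    p[row - 1][col - 1][dep - 1].day = 1;   /* last valid element */
    free(p);
    return bytes;
}
```

The value returned is col * dep * sizeof (Date); the row count is the one dimension the type does not carry.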
int *ptr declares a pointer that stores the address of an int variable, and int **ptr declares a pointer that stores the address of such a pointer to int.

How to pass "pointer to a pointer" to a function that expects "pointer to array"

Consider this piece of code:
#define MAX 4
............
............
int** ptr = (int**)malloc(sizeof(int*)*MAX);
*ptr = (int*)malloc(sizeof(int)*MAX);
// Assigned values to the pointer successfully
How can foo() be invoked with ptr as a parameter? foo()'s prototype has been declared as below:
void foo(int arr[][MAX]);
You can't pass ptr as parameter to that foo function.
The memory layout of a 2-dimensional array (array of arrays) is quite different from that of an array of pointers to arrays.
Either change the function signature to accept an int** (and probably also size information), or define ptr to be a 2-dimensional array of the appropriate size.
I am going to assume the function foo in your example actually wants a 2-D array of int, with MAX columns and an unspecified number of rows. This works in C and C++ because of how the rows lay out in memory. All the elements in row N+1 appear contiguously after all the elements in row N.
The syntax int arr[][MAX] asks for a pointer to the first element of such a 2-D array, not an array of pointers to rows. I'll assume you want the 2-D array, and not an array of pointers.
First, you need to correctly allocate your elements. You haven't specified what the leftmost dimension of arr[][MAX] is, or where it comes from. I'll assume it's in the variable dim here:
int (*ptr)[MAX]; /* pointer to the first element of an int[][MAX] array */
/* Allocate a 2-D array of dim * MAX ints */
ptr = (int (*)[MAX]) malloc( dim * MAX * sizeof(int) );
Then, to call your function, just do foo( ptr ) and it'll work without errors or warnings.
To make your code cleaner (especially if you're using many of these 2-D arrays), you might consider wrapping the pointer type in a typedef, and writing a small function to allocate these arrays.
typedef int (*int_array_2d)[MAX];
int_array_2d alloc_int_array_2d( int dim1 )
{
return (int_array_2d) malloc( dim1 * MAX * sizeof(int) );
}
That way, elsewhere in your code, you can say something much simpler and cleaner:
int_array_2d ptr = alloc_int_array_2d( dim );
Use the type system to your advantage. The C and C++ syntax for the type and the typecast are ugly, and unfamiliar to most people. They look strange due to the precedence of * vs. []. If you hide it in a typedef though, it can help keep you focused on what you're trying to do, rather than understanding C/C++'s weird precedence issues that arise when you mix arrays and pointers.
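Putting the pieces together, a minimal sketch might look like this. The question only gives foo()'s prototype, so sum_2d (with its extra rows parameter) is a stand-in with an assumed body, and alloc_ones is a hypothetical allocator. The typedef is the one suggested above.

```c
#include <stdlib.h>

#define MAX 4

typedef int (*int_array_2d)[MAX];

/* Stand-in for foo(): the body and the rows parameter are assumptions,
   since the original only shows the prototype. */
int sum_2d(int arr[][MAX], size_t rows)
{
    int total = 0;
    for (size_t i = 0; i < rows; ++i)
        for (size_t j = 0; j < MAX; ++j)
            total += arr[i][j];
    return total;
}

/* Allocates rows x MAX ints as one block, each set to 1. NULL on failure. */
int_array_2d alloc_ones(size_t rows)
{
    int_array_2d p = malloc(rows * sizeof *p);   /* *p has type int[MAX] */
    if (p == NULL)
        return NULL;
    for (size_t i = 0; i < rows; ++i)
        for (size_t j = 0; j < MAX; ++j)
            p[i][j] = 1;
    return p;
}
```

The pointer returned by alloc_ones can be passed straight to sum_2d without a cast, since int arr[][MAX] as a parameter is the same type as int (*)[MAX].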
