What are the differences between *ptr and **ptr? - c

I am coding a 3D array using triple pointers with malloc. In the code below I replaced *ptrdate[i] in (b) and *ptrdate[i][j] in (c) with *ptrdate, as in (a), since they are all basically pointers of type Date that are just accessed in different dimensions. I got the same results both ways.
Question: what's the difference when used as the operand of sizeof?
#include <stdlib.h>

typedef struct {
    int day;
} Date;

int main(void) {
    int i, j, k, count = 0;
    int row = 3, col = 4, dep = 5;
    Date ***ptrdate = malloc(row * sizeof *ptrdate);               // (a)
    for (i = 0; i < row; i++) {
        ptrdate[i] = malloc(col * sizeof *ptrdate[i]);             // (b)
        for (j = 0; j < col; j++) {
            ptrdate[i][j] = malloc(dep * sizeof *ptrdate[i][j]);   // (c)
        }
    }
}

I am coding a 3D array using triple pointers with malloc.
First of all, there is no need for any array to be allocated using more than one call to malloc. In fact, it is incorrect to do so, as the word "array" is considered to denote a single block of contiguous memory, i.e. one allocation. I'll get to that later, but first, your question:
Question: what's the difference when used as the operand of sizeof?
The answer, though obvious, is often misunderstood. They're different pointer types, which coincidentally have the same size and representation on your system... but they might have different sizes and representations on other systems. It is important to keep that possibility in mind, so that you can be sure your code is as portable as possible.
Given size_t row=3, col=4, dep=5;, you can declare an array like so: Date array[row][col][dep];. I know you have no use for such a declaration in this question... Bear with me for a moment. If we printf("%zu\n", sizeof array);, it'll print row * col * dep * sizeof (Date). It knows the full size of the array, including all of the dimensions... and this is exactly how many bytes are required when allocating such an array.
printf("%zu\n", sizeof ptrDate); with ptrDate declared as in your code will produce something entirely different, though... It'll produce the size of a pointer (to pointer to pointer to Date, not to be confused with pointer to Date or pointer to pointer to Date) on your system. All of the size information regarding the dimensions (e.g. the row * col * dep multiplication) is lost, because we haven't told our pointers to maintain that size information. We can still find sizeof (Date) by using sizeof ***ptrDate, though, because the element type stays associated with the pointer type; sizeof *ptrDate only peels off one level and yields the size of a pointer to pointer to Date.
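To make the levels concrete, here is a small sketch that reports what sizeof sees at each level of indirection of a Date ***. The helper names are made up for illustration. None of the operands are evaluated, so the null pointer is never dereferenced.

```c
#include <stddef.h>

typedef struct {
    int day;
} Date;

/* Hypothetical helpers: each returns the size of the type found at one
 * level of indirection. sizeof does not evaluate these operands, so the
 * null pointer is never actually dereferenced. */
size_t size_of_level0(void) { Date ***p = NULL; return sizeof p;    } /* Date *** */
size_t size_of_level1(void) { Date ***p = NULL; return sizeof *p;   } /* Date **  */
size_t size_of_level2(void) { Date ***p = NULL; return sizeof **p;  } /* Date *   */
size_t size_of_level3(void) { Date ***p = NULL; return sizeof ***p; } /* Date     */
```

On many systems the first three values happen to be equal, which is why swapping the operands in the question appeared to work; only the last one is the size of the structure itself.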
What if we could tell our pointers to maintain the other size information (the dimensions), though? What if we could write ptrDate = malloc(row * sizeof *ptrDate);, and have sizeof *ptrDate equal to col * dep * sizeof (Date)? This would simplify allocation, wouldn't it?
This brings us back to my introduction: There is a way to perform all of this allocation using one single malloc. It's a simple pattern to remember, but a difficult pattern to understand (and probably appropriate to ask another question about):
Date (*ptrDate)[col][dep] = malloc(row * sizeof *ptrDate);
Suffice it to say, usage is still mostly the same. You can still use this like ptrDate[x][y][z]... There is one thing that doesn't seem quite right, though: sizeof ptrDate still yields the size of a pointer (to array[col][dep] of Date), and sizeof *ptrDate doesn't contain the row dimension (hence the multiplication in the malloc above). I'll leave it as an exercise to you to work out whether a solution is necessary for that...
free(ptrDate); // Ooops! I must remember to free the memory I have allocated!
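A minimal sketch of the single-allocation pattern, assuming a compiler with variable length array support (C99, optional since C11). The function name sum_days_single_block and the way the elements are numbered are made up for illustration. It spells the size with a type name rather than with sizeof *p, so the pointer being declared is never touched inside its own initializer.

```c
#include <stdlib.h>

typedef struct {
    int day;
} Date;

/* Hypothetical demo: allocate row x col x dep Dates in ONE block,
 * number them 0, 1, 2, ... in memory order, and return the sum of all
 * day fields (or -1 if the allocation fails). All dimensions must be
 * greater than zero. */
long sum_days_single_block(size_t row, size_t col, size_t dep)
{
    /* One malloc for the whole array; p points at its first "plane". */
    Date (*p)[col][dep] = malloc(sizeof(Date[row][col][dep]));
    if (p == NULL)
        return -1;

    long sum = 0;
    for (size_t i = 0; i < row; i++)
        for (size_t j = 0; j < col; j++)
            for (size_t k = 0; k < dep; k++) {
                p[i][j][k].day = (int)((i * col + j) * dep + k);
                sum += p[i][j][k].day;
            }

    free(p); /* one malloc, one free */
    return sum;
}
```

With row=3, col=4, dep=5 the elements are numbered 0 through 59, so the function returns 1770.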

int *ptr declares a pointer that stores the address of an int variable, while int **ptr declares a pointer that stores the address of another pointer, which in turn stores the address of an int variable.

Related

Problem with dynamic allocation of memory

Until now I have allocated the memory for a matrix like this:
int **p,n;
scanf("%d",&n);
p=malloc(n*sizeof(int));
for(int i=0;i<n;i++)
p[i]=malloc(n*sizeof(int));
but someone told me to do like this :
int **p,n;
scanf("%d",&n);
p=malloc(n*sizeof*p);
for(int i=0;i<n;i++)
p[i]=malloc(n*sizeof*p);
Isn't sizeof(p) equal to 0, because p is not allocated yet?
Which one is correct?
In the first code snippet, this statement is wrong:
p=malloc(n*sizeof(int));
because the type of p is int **, so p points to objects of type int *. It should be:
p = malloc (n * sizeof (int *));
                        ^^^^^
In the second code snippet, allocation to p is correct because of this - sizeof*p. The type of *p is int *. So, sizeof*p is equivalent to sizeof (int *).
But in second code snippet, this is wrong:
p[i]=malloc(n*sizeof*p);
because the type of p[i] is int * i.e. pointer to an int. So, it can point to an integer. Hence, you should allocate memory of n * sizeof (int). It should be
p[i] = malloc (n * sizeof *p[i]);
Here, n * sizeof *p[i] is equivalent to n * sizeof (int) because the type of *p[i] is int.
It's a matter of choice which style you use. What matters is that you have a good understanding of what you are doing and how it works, because a lack of understanding can result in mistakes in any style you choose (as you can see, there is a mistake in both of the code snippets you have shown).
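Putting both corrections together, a minimal sketch of the allocation with error handling might look like this. The helper names matrix_alloc and matrix_free are hypothetical, not part of the question.

```c
#include <stdlib.h>

/* Hypothetical helper: frees the first `rows` rows and then the array of
 * row pointers itself. */
void matrix_free(int **p, size_t rows)
{
    if (p == NULL)
        return;
    for (size_t i = 0; i < rows; i++)
        free(p[i]);
    free(p);
}

/* Hypothetical helper: allocates an n x n matrix as n row pointers plus
 * n separately allocated rows. Returns NULL on failure. The elements
 * are left uninitialised. */
int **matrix_alloc(size_t n)
{
    int **p = malloc(n * sizeof *p);          /* n objects of type int * */
    if (p == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++) {
        p[i] = malloc(n * sizeof *p[i]);      /* n objects of type int */
        if (p[i] == NULL) {
            matrix_free(p, i);                /* only i rows exist so far */
            return NULL;
        }
    }
    return p;
}
```

Because each sizeof operand is derived from the pointer being assigned, the element size stays right even if the element type changes later.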
First of all, p=malloc(n*sizeof(int)); is wrong - you aren't allocating a 2D array but an array of pointers, each pointing to an array of int. This needs to be p=malloc(n*sizeof(int*)); for the first example to be correct.
Apart from that bug, this is a matter of subjective coding style. Some prefer to write malloc(n*sizeof*p); since sizeof *p gives the size of the pointed-at item. This works because the operand of sizeof is not evaluated (unless it has a variable length array type), so no pointer de-referencing actually happens. Here the size is computed at compile-time.
A third style is also possible: p=malloc( sizeof(int*[n]) );. Here you make it more explicit that you are declaring an array. Which of these three styles to use is subjective and mostly a matter of opinion.
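As a quick sanity check, the three spellings can be compared directly. This is only a sketch; the function name is made up, and n must be greater than zero because int *[n] is a variable length array type.

```c
#include <stddef.h>

/* Hypothetical check: returns 1 when all three spellings request the
 * same number of bytes for n pointers to int. n must be > 0. */
int sizeof_styles_agree(size_t n)
{
    int **p = NULL;                        /* never dereferenced below */
    size_t by_type    = n * sizeof(int *); /* explicit type */
    size_t by_operand = n * sizeof *p;     /* derived from the pointer */
    size_t by_array   = sizeof(int *[n]);  /* array type, sized at run time */
    return by_type == by_operand && by_operand == by_array;
}
```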
And in case you want to allocate actual 2D arrays allocated adjacently, you need to do as advised here instead: Correctly allocating multi-dimensional arrays

Why malloc( ) is used ? And why the size of the variable isn't increasing?

According to my faculty, malloc dynamically allocates memory. Then why does the output show the same size for both the normal variable and the one assigned from malloc()? I am a newbie to programming, so please answer in a way that a newbie can understand.
#include<stdio.h>
int main()
{
int a,b;
a = (int *) malloc(sizeof(int)*2);
printf("The size of a is:%d \n",sizeof(a));
printf("The size of b is:%d \n",sizeof(b));
return 0;
}
Output:
The size of a is:4
The size of b is:4
malloc is used with a pointer. You are declaring an integer, int a. This needs to be changed to int *a.
The sizeof operator will not give the number of bytes allocated by malloc. That number needs to be tracked by the programmer and typically cannot be determined from the pointer alone.
For int *a, sizeof(a) will always give the size of the pointer:
int *a;
printf("%zu\n",sizeof(a)); // gives the size of the pointer e.g. 4
a = malloc(100 * sizeof(int));
printf("%zu\n",sizeof(a)); // also gives the size of the pointer e.g. 4
You should always remember to free the memory you have allocated with malloc
free(a);
The printf format specifier for the result of sizeof should be %zu.
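Since the allocated size has to be tracked by the programmer, one common approach is to keep the element count next to the pointer. The following is a minimal sketch; the IntBuf type and its two functions are hypothetical names chosen for this example.

```c
#include <stdlib.h>

/* Hypothetical wrapper: stores the element count beside the pointer,
 * because sizeof on the pointer cannot recover it. */
typedef struct {
    int   *data;
    size_t count;
} IntBuf;

/* Allocates room for `count` ints. On failure data is NULL and count is 0. */
IntBuf intbuf_make(size_t count)
{
    IntBuf b;
    b.data  = malloc(count * sizeof *b.data);
    b.count = (b.data != NULL) ? count : 0;
    return b;
}

/* Releases the memory and resets the bookkeeping. */
void intbuf_free(IntBuf *b)
{
    free(b->data);
    b->data  = NULL;
    b->count = 0;
}
```

After intbuf_make(100), sizeof b.data is still just the size of an int *, while b.count holds the 100 that sizeof cannot see.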
You declare and define both variables as int. Nothing else has an influence on the value of sizeof().
int a,b;
This assigns a very special value to one of those ints, but it does not change the fact that a remains an int (and your cast is misleading and does nothing at all, least of all change anything about a).
a = (int *) malloc(sizeof(int)*2);
In order to change above line to something sensible (i.e. a meaningful use of malloc) it should be like this:
int* a;
a= malloc(sizeof(int)*2);
I.e. a is now a pointer to int and gets the address of an area which can store two ints. No cast needed.
That way, sizeof(a) (on many machines) will still be 4, which is often the size of a pointer. The size of what it is pointing to is irrelevant.
The actual reason for using malloc() is determined by the goal of the larger scope of the program it is used for. That is not visible in this artificially short example. Work through some pointer-related tutorials. Looking for "linked list" or "binary tree" will get you on the right track.
What programs which meaningfully use malloc have in common is that they are dealing with data structures which are not known at compile time and can change during runtime. The unknown attributes could simply be the total size, but especially in the case of trees, the larger structure is usually unknown, too.
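As an illustration of a structure whose size is only known at run time, here is a minimal singly linked list sketch. The type and function names are made up for this example.

```c
#include <stdlib.h>

typedef struct Node {
    int          value;
    struct Node *next;
} Node;

/* Pushes a value onto the front of the list. Returns the new head, or
 * NULL if malloc fails (the old list is left untouched in that case). */
Node *list_push(Node *head, int value)
{
    Node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->value = value;
    n->next  = head;
    return n;
}

/* Counts the nodes by walking the list. */
size_t list_length(const Node *head)
{
    size_t len = 0;
    for (; head != NULL; head = head->next)
        len++;
    return len;
}

/* Frees every node. */
void list_free(Node *head)
{
    while (head != NULL) {
        Node *next = head->next;
        free(head);
        head = next;
    }
}
```

Each call to list_push allocates exactly one more node, so the program never needs to know in advance how many values it will store.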
There is an interesting aspect to note when using malloc():
Do I cast the result of malloc?

Allocate 6xNxN array

I have a variable N. I need a 6xNxN array.
Something like this:
int arr[6][N][N];
But, obviously, that doesn't work.
I'm not sure how I'd go about allocating this so that I can access, e.g. arr[5][4][4] if N is 5, and arr[5][23][23] if N is 24.
Note that N will never change, so I'll never have to reallocate arr.
What should I do? Will int ***arr = malloc(6 * N * N * sizeof(int)); work?
You can allocate your 3-dimensional array on the heap as
int (*arr)[N][N] = malloc(sizeof(int[6][N][N]));
After use, you can free as
free(arr);
Another way of writing the same, as suggested by @StoryTeller, is:
int (*arr)[N][N] = malloc(6u * sizeof(*arr));
But here you need to be careful about the u after 6 to prevent signed arithmetic overflow.
Also, there can still be issues on platforms where size_t is narrower than int, as suggested by @chqrlie, but that won't be the case on most commonly used platforms, so you are fine using it.
int arr[6][N][N]; will work just fine. You merely need to update your compiler and C knowledge to the year 1999 or later, when variable-length arrays (VLA) were introduced to the language.
(If you have an older version of GCC than 5.0, you must explicitly tell it to not use an ancient version of the C standard, by passing -std=c99 or -std=c11.)
Alternatively if you need heap allocation, you can do:
int (*arrptr)[Y][Z] = malloc( sizeof(int[X][Y][Z]) );
You cannot do int ***arr = malloc(6 * N * N * sizeof(int)); since an int*** cannot point at a 3D array. In general, more than two levels of indirection is a certain sign that your program design is completely flawed.
Detailed info here: Correctly allocating multi-dimensional arrays.
What you want can't work directly. For indexing a multi-dimensional array, all but the very first dimension need to be part of the type and here's why:
The indexing operator operates on pointers by first adding an index to the pointer and then dereferencing it. The identifier of an array evaluates to a pointer to its first element (except when e.g. used with sizeof, _Alignof and &), so indexing on arrays works as you would expect.
It's very simple in the case of a single-dimension array. With
int a[42];
a evaluates to a pointer of type int * and indexing works the following way: a[18] => *(a + 18).
Now in a 2-dimensional array, all the elements are stored contiguously ("row" after "row" if you want to understand it as a matrix), and what's making the indexing "magic" work is the types involved. Take for example:
int a[16][42];
Here, the elements of a have the type int [42] (42-element array of int). According to the rules above, evaluating an expression of this type in most contexts again yields an int * pointer. But what about a itself? Well, it's an array of int [42], so a will evaluate to a pointer to 42-element array of int: int (*)[42]. Then let's have a look at what the indexing operator does:
a[3][18] => *(*(a + 3) + 18)
With a evaluating to the address of its first element, with type int (*)[42], this inner addition of 3 can properly advance the address by 3 * 42 * sizeof(int) bytes. This would be impossible if the second dimension wasn't known in the type.
I guess it's simple to deduce the example for the n-dimensional case.
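The arithmetic can be checked with a short sketch using the same int a[16][42]. The two function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: byte distance between a and a + 3 for
 * int a[16][42]. Each step of the int (*)[42] pointer skips a whole row. */
size_t row_step_bytes(void)
{
    static int a[16][42];
    return (size_t)((char *)(a + 3) - (char *)a);
}

/* Returns 1 if a[3][18] and *(*(a + 3) + 18) designate the same object. */
int index_forms_agree(void)
{
    static int a[16][42];
    return &a[3][18] == *(a + 3) + 18;
}
```

row_step_bytes() yields 3 * 42 * sizeof(int), which is only computable because the 42 is part of the pointer's type.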
In your case, you have two possibilities to achieve something similar to what you want.
Use a dynamically allocated flat array with size 6*N*N. You can calculate the indices yourself if you save N somewhere.
Somewhat less efficient, but yielding better readable code, you could use an array of pointers to arrays of pointers to int (multiple indirection). You could e.g. do
int ***a = malloc(6 * sizeof *a);
for (size_t i = 0; i < 6; ++i)
{
    a[i] = malloc(N * sizeof *(a[i]));
    for (size_t j = 0; j < N; ++j)
    {
        a[i][j] = malloc(N * sizeof *(a[i][j]));
    }
}
// add error checking to malloc calls!
Then your accesses will look just like those to a normal 3d array, but it's stored internally as many arrays with pointers to the other arrays instead of in a big contiguous block.
I don't think it's worth using this many indirections, just to avoid writing e.g. a[2*N*N+5*N+4] to access the element at 2,5,4, so my recommendation would be the first method.
Making a simple change to the declaration on this line and keeping the malloc can easily solve your problem.
int ***arr = malloc(6 * N * N * sizeof(int));
However, int *** is unnecessary (and wrong). Use a flat array, which is easy to allocate:
int *flatarr = malloc(6 * N * N * sizeof(int));
This works for three dimensions, and instead of accessing arr[X][Y][Z] as in the question, you access flatarr[(X*N*N) + (Y*N) + Z]. In fact, you could even write a handy macro:
#define arr(X,Y,Z) flatarr[((X)*N*N) + ((Y)*N) + (Z)]
This is basically what I've done in my language Cubically to allow for multiple-size cubes. Thanks to Programming Puzzles & Code Golf user Dennis for giving me this idea.
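For comparison, the same index calculation can be written as a function instead of a macro, which avoids depending on a variable named N being in scope. This is a sketch with made-up names; it deliberately leaves out an overflow check on 6 * n * n.

```c
#include <stdlib.h>

/* Hypothetical helper: offset of element (x, y, z) in a flat 6 x n x n
 * array stored in row-major order. Equal to x*n*n + y*n + z. */
size_t flat_index(size_t n, size_t x, size_t y, size_t z)
{
    return (x * n + y) * n + z;
}

/* Allocates the flat array zero-initialised; returns NULL on failure.
 * No overflow check on 6 * n * n in this sketch. */
int *cube_alloc(size_t n)
{
    return calloc(6 * n * n, sizeof(int));
}
```

An access such as arr[5][4][4] with N equal to 5 becomes cube[flat_index(5, 5, 4, 4)], and the whole array is released with a single free(cube).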

Confusion with malloc

I'm trying to understand a piece of C code which is as follows:
#define DIM2( basetype, name, w1 ) basetype (*name)[w1]
int mx = 10; //number of rows per processor
int my = 100; //number of cols
DIM2 (double, f, my);
f = (typeof (f)) malloc (2 * mx * sizeof (*f));
If I'm correct, with DIM2 a 1-d array of (size=100) pointers to double is created.
I'm not able to understand what happens again with malloc? Is it necessary for two such statements?
Is there any other way to achieve what happens in the last two lines of the code above?
The macro evaluates to:
double (*f)[my];
which is a pointer to array of double, not an array of pointer to double.
malloc allocates an array of 2 * mx * <whatever f points to> (i.e. an array of double). Not sure why it would allocate twice as many entries as given by mx, but that's what it does.
So, f points to the first array of double afterwards. It effectively allocates a true 2 dimensional array. (not the often confused array of pointers to double).
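The difference between the two declarations shows up in their sizes. Here is a small sketch, assuming variable length array support; the function names are hypothetical and my must be greater than zero.

```c
#include <stddef.h>

/* Size of what a pointer-to-array points at: one whole row of my doubles. */
size_t pointee_size_ptr_to_array(size_t my)
{
    double row[my];             /* a real row, so f points at a valid object */
    double (*f)[my] = &row;     /* pointer to array of my doubles */
    return sizeof *f;           /* my * sizeof(double) */
}

/* Size of an array of my pointers to double, for contrast. */
size_t size_array_of_ptrs(size_t my)
{
    double *g[my];              /* array of my pointers to double */
    return sizeof g;            /* my * sizeof(double *) */
}
```

With my equal to 100, the first function returns 100 * sizeof(double), which is why multiplying sizeof (*f) by a row count gives the size of a contiguous 2D block.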
Note that the cast of malloc is bad practise in C.
Comment: As the macro does not save any typing and does not add specific information, it is actually bad practise. Worse, it hides the pointer semantics, obfuscating the code. The recommendation is not to use it and to be explicit instead; that does not even take more typing.
Update: There is currently an argument about whether sizeof(*f) presents undefined behaviour, because f is used uninitialized here. While I see a flaw in the standard here, which should be more precise, you might better play safe and use an explicit expression:
f = malloc (2 * mx * my * sizeof (double));
double (*f)[my] is a VLA type because my is an int. So sizeof (*f) causes undefined behaviour because the argument of sizeof is evaluated if it has VLA type. For more discussion see here.
Unfortunately the sizeof *f idiom can only be used with pointer to array of fixed dimension (or pointer to non-array!). So this entire idea is bogus.
IMHO it is simpler and clearer to do away with the macro and write:
double (*f)[my] = malloc( sizeof(double[mx][my]) );

Questions about pointers and arrays

Sanity-check questions:
I did a bit of googling and discovered the correct way to return a one-dimensional integer array in C is
int * function(args);
If I did this, the function would return a pointer, right? And if the return value is r, I could find the nth element of the array by typing r[n]?
If I had the function return the number "3", would that be interpreted as a pointer to the address "3?"
Say my function was something like
int * function(int * a);
Would this be a legal function body?
int * b;
b = a;
return b;
Are we allowed to just assign arrays to other arrays like that?
If pointers and arrays are actually the same thing, can I just declare a pointer without specifying the size of the array? I feel like
int a[10];
conveys more information than
int * a;
but aren't they both ways of declaring an array? If I use the latter declaration, can I assign values to a[10000000]?
Main question:
How can I return a two-dimensional array in C? I don't think I could just return a pointer to the start of the array, because I don't know what dimensions the array has.
Thanks for all your help!
Yes
Yes but it would require a cast: return (int *)3;
Yes but you are not assigning an array to another array, you are assigning a pointer to a pointer.
Pointers and arrays are not the same thing. int a[10] reserves space for ten ints. int *a is an uninitialized variable pointing to who knows what. Accessing a[10000000] will most likely crash your program as you are trying to access memory you don't have access to or doesn't exist.
To return a 2d array return a pointer-to-pointer: int ** f() {}
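The difference between int a[10] and int *a is visible through sizeof. A minimal sketch with made-up function names:

```c
#include <stddef.h>

/* sizeof applied to a real array gives the size of the whole array. */
size_t bytes_of_array(void)
{
    int a[10];
    return sizeof a;        /* 10 * sizeof(int) */
}

/* The parameter is a pointer (an array argument decays when passed), so
 * sizeof reports the pointer's size, whatever it points at. */
size_t bytes_of_pointer(const int *a)
{
    return sizeof a;        /* sizeof(const int *) */
}
```

Passing a ten-element array to bytes_of_pointer still yields only the size of a pointer, because the array has already been converted to the address of its first element.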
Yes; array indexing is done in terms of pointer arithmetic: a[i] is defined as *(a + i); we find the address of the i'th element after a and dereference the result. So a could be declared as either a pointer or an array.
It would be interpreted as an address, yes (most likely an invalid address). You would need to cast the literal 3 as a pointer, because values of type int and int * are not compatible.
Yes, it would be legal. Pointless, but legal.
Pointers and arrays are not the same thing; in most circumstances, an expression of array type will be converted ("decay") to an expression of pointer type and its value will be the address of the first element of the array. Declaring a pointer by itself is not sufficient, because unless you initialize it to point to a block of memory (either the result of a malloc call or another array) its value will be indeterminate, and may not point to valid memory.
You really don't want to return arrays; remember that an array expression is converted to a pointer expression, so you're returning the address of the first element. However, if the array is local to the function, it no longer exists when the function exits and the pointer value is no longer valid. It's better to pass the array you want to modify as an argument to the function, such as
void foo (int *a, size_t asize)
{
    size_t i;
    for (i = 0; i < asize; i++)
        a[i] = some_value();
}
Pointers contain no metadata about the number of elements they point to, so you must pass that as a separate parameter.
For a 2D array, you'd do something like
void foo(size_t rows, size_t columns, int (*a)[columns])
{
    size_t i, j;
    for (i = 0; i < rows; i++)
        for (j = 0; j < columns; j++)
            a[i][j] = some_value();
}
This assumes you're using a C99 compiler or a C2011 compiler that supports variable length arrays; otherwise the number of columns must be a constant expression (i.e., known at compile time).
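To show the caller's side of this pattern, here is a sketch in which the caller owns the 2D array and the function only fills it. It assumes variable length array support; fill_sequence and last_element_demo are hypothetical names, and the 0, 1, 2, ... fill stands in for whatever value the real code would store.

```c
#include <stddef.h>

/* Fills a rows x columns matrix with 0, 1, 2, ... in row-major order.
 * The caller owns the storage; nothing is returned. */
void fill_sequence(size_t rows, size_t columns, int (*a)[columns])
{
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < columns; j++)
            a[i][j] = (int)(i * columns + j);
}

/* Caller side: the array lives here, so it is still valid after the call. */
int last_element_demo(void)
{
    int m[2][3];
    fill_sequence(2, 3, m);   /* m decays to int (*)[3] */
    return m[1][2];           /* last element of a 2 x 3 matrix */
}
```

The dimensions are passed before the array so that columns is already declared when it appears in the parameter type.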
These answers certainly call for a bit more depth. The better you understand pointers, the less bad code you will write.
An array and a pointer are not the same, EXCEPT when they are. Off the top of my head:
int a[2][2] = { 1, 2, 3, 4 };
int (* p)[2] = a;
ASSERT (p[1][1] == a[1][1]);
Array "a" functions exactly the same way as pointer "p." And the compiler knows just as much from each, specifically an address, and how to calculate indexed addresses. But note that array a can't take on new values at run time, whereas p can. So the "pointer" aspect of a is gone by the time the program runs, and only the array is left. Conversely, p itself is only a pointer, it can point to anything or nothing at run time.
Note that the syntax for the pointer declaration is complicated. (That is why I came to stackoverflow in the first place today.) But the need is simple. You need to tell the compiler how to calculate addresses for elements past the first column. (I'm using "column" for the rightmost index.) In this case, we might assume it needs to increment the address ((2*1) + 1) to index [1][1].
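The offset calculation can be verified with a short sketch built on the same a and p. The function names are made up for illustration.

```c
#include <stddef.h>

/* Returns how many ints lie between a[0][0] and a[1][1] in int a[2][2]. */
size_t offset_of_1_1(void)
{
    int a[2][2] = { { 1, 2 }, { 3, 4 } };
    size_t bytes = (size_t)((char *)&a[1][1] - (char *)&a[0][0]);
    return bytes / sizeof(int);
}

/* Returns 1 if indexing through the pointer reaches the same object as
 * indexing the array directly. */
int pointer_matches_array(void)
{
    int a[2][2] = { { 1, 2 }, { 3, 4 } };
    int (*p)[2] = a;
    return &p[1][1] == &a[1][1] && p[1][1] == 4;
}
```

offset_of_1_1() gives 3, matching (2*1) + 1: one full row of two ints plus one more element.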
However, there are a couple of more things the compiler knows (hopefully), that you might not.
The compiler knows two things: 1) whether the elements are stored sequentially in memory, and 2) whether there really are additional arrays of pointers, or just one pointer/address to the start of the array.
In general, a compile time array is stored sequentially, regardless of dimension(s), with no extra pointers. But to be sure, check the compiler documentation. Thus if the compiler allows you to index a[0][2] it is actually a[1][0], etc. A run time array is however you make it. You can make one dimensional arrays of whatever length you choose, and put their addresses into other arrays, also of whatever length you choose.
And, of course, one reason to muck with any of these is because you are choosing between run time multiplies, shifts, or pointer dereferences to index the array. If pointer dereferences are the cheapest, you might need to make arrays of pointers so there is no need to do arithmetic to calculate row addresses. One downside is that it requires memory to store the additional pointers. And note that if the column length is a power of two, the address can be calculated with a shift instead of a multiply. So this might be a good reason to pad the length up--and the compiler could, at least theoretically, do this without telling you! And it might depend on whether you select optimization for speed or space.
Any architecture that is described as "modern" and "powerful" probably does multiplies as fast as dereferences, and these issues go away completely--except for whether your code is correct.

Resources