My question is both specific to an assignment I'm working on and conceptual about the relationship between pointers and arrays. I'm writing a hash table in the form of an array of pointers to sorted lists. I've created a struct to define a type for the hash table and the number of elements in the table is defined in a macro. Since the size of the table is variable, the struct needs to contain a pointer to the table itself - a pointer to an array of pointers. My problem revolves around the idea that a pointer to some data type is the same as the label for the first element of an array of that data type.
I have a data type SortedList. As I understand things, SortedList* can be interpreted as either a pointer to a single SortedList or as a pointer to the first element of an array of SortedList's. Expanding on that, SortedList** can be an array of SortedList pointers and SortedList*** can be a pointer to that array. This is what I have in my hash table struct. My first question is, is my understanding of this correct?
In the function that creates the hash table I have this:
SortedList** array;
if ((array = calloc(size,sizeof(SortedList*))) == NULL) {
// error allocating memory
printf("Memory Error\n");
return NULL;
}
table->arrayPtr = &array;
So array is intended to be my array of SortedList pointers and arrayPtr is the SortedList*** type in my hash table struct. I'm using calloc because I think it will initialize all of my pointers to NULL. Please let me know if I'm mistaken about that. This all compiles with no errors so, as far as I know, so far so good.
I have a function that inserts data into the table. It first checks whether the pointer at that index has been used already by checking whether it is NULL; if it is still NULL, the function creates a SortedList for it to point to.
int i = index->hashFunc(word);
SortedList*** table = index->arrayPtr;
if (*(table +i) == NULL){
return 0;
}
So it seems to me that dereferencing (table +i) ought to give me a SortedList** - the ith element in an array of SortedList pointers - which I can then check to see if it's set to NULL. Unfortunately, the compiler disagrees. I get this error:
error: invalid operands to binary == (have ‘struct SortedList’ and ‘void *’)
So somewhere along the line my reasoning about all this is wrong.
You might need to read up a little bit more on arrays and pointers in C because I don't think you completely grasp the concept. I could be wrong but I doubt you'd need a three level pointer to achieve what you're trying to do; I think you might be confusing yourself and thinking that if you want to point to an array's data you need to point to the actual array (&array), which is essentially a pointer itself. Drawing a picture can also really help to visualise what's going on in memory.
An array is merely a block of sequential data, where the variable's name in C (without any [ ], which would get an element from the array) points to the first element in the array. The two lines in the example below are equivalent (where array is obviously an array):
int *p_array;
p_array = array; /* equivalent */
p_array = &array[0]; /* equivalent */
You could then use p_array exactly the same way as if you were to use array, ie.
p_array[3] == array[3]
The unnecessary thing that some people might do when they're learning is have a pointer to a pointer to the array data (which I think is what you're doing):
int **p_p_array = &p_array;
And then to access the elements of array they would have to dereference the pointer and then use array notation to specify the element in the array:
(*p_p_array)[3] == array[3]
What we've actually done here is store the memory address of p_array (which itself is a pointer to the first element), which we then have to dereference to get back to that pointer, and then we move 3 positions forward to get to the fourth element (array[3]). The parentheses matter: [] binds more tightly than *, so *p_p_array[3] would mean *(p_p_array[3]). Note that &array is not an int ** at all; its type is pointer to the whole array, int (*)[N] for an N-element array.
In the first example it's so much simpler and more logical, since we're storing a pointer to the first element in the array and having the pointer act in the same way as our initial array variable.
I recommend you draw out what you're trying to do on a piece of paper/whiteboard to see what you're doing wrong and it may then become obvious how to properly implement it the right way. My whiteboard is one of my best tools when I'm coding something.
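Applied to the hash table from the question, a minimal sketch might look like the following. It assumes a placeholder SortedList type and hypothetical names (HashTable, slots, table_create, slot_is_empty, table_destroy) that are not from the original post. The point is only that the struct can store a SortedList ** directly, so no third level of indirection is needed.

```c
#include <assert.h>
#include <stdlib.h>

/* Placeholder list type; the real SortedList from the assignment will differ. */
typedef struct SortedList {
    int dummy;
} SortedList;

/* Hypothetical table type: slots points at the first of `size` list pointers. */
typedef struct {
    size_t size;
    SortedList **slots;
} HashTable;

/* Allocate a table with `size` empty slots; returns NULL on failure. */
HashTable *table_create(size_t size)
{
    HashTable *table = malloc(sizeof *table);
    if (table == NULL)
        return NULL;

    /* calloc zero-fills the block. On mainstream platforms an all-zero
       pointer is a null pointer, although the C standard does not
       strictly guarantee that representation. */
    table->slots = calloc(size, sizeof *table->slots);
    if (table->slots == NULL) {
        free(table);
        return NULL;
    }
    table->size = size;
    return table;
}

/* slots[i] has type SortedList *, so comparing it with NULL is valid. */
int slot_is_empty(const HashTable *table, size_t i)
{
    assert(table != NULL && i < table->size);
    return table->slots[i] == NULL;
}

void table_destroy(HashTable *table)
{
    if (table != NULL) {
        free(table->slots);   /* any lists stored in the slots are freed by the caller first */
        free(table);
    }
}
```

Because the heap block outlives table_create, storing the pointer returned by calloc is safe, whereas storing the address of a local variable (as &array does in the question) leaves a dangling pointer once the function returns.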
int (*mapTerrain)[10] = (int (*)[10])malloc(sizeof(int[10][10]));
free(mapTerrain);
Someone on this site suggested these 2 lines for working with dynamical 2d arrays in C. Dimensions are [10][10]. Problem is, I'm not sure I understand them correctly. If I had to explain these 2 lines I'd say the following:
On the left we have an array of int pointers with size 10. (Can't explain the casting, I myself would expect it to be int *).
What's being passed to malloc is an array of int-s sized [10][10]. (Why isn't it ...malloc(sizeof(int*10*10));?) What allows us to pass an array to malloc instead of size_t size?
As for the free(mapTerrain); line. How come one free is enough? From what I remember you have to call free for every row of a dynamical 2d array.
The cast of the result of malloc is just clutter and not necessary.
The most correct, formal version would be:
int (*mapTerrain)[10][10] = malloc(sizeof(int[10][10]));
This is an array pointer to an int[10][10] array. However, such a pointer is a bit painful to work with in practice, since to get an item we have to do:
*mapTerrain to get the array pointed at from the array pointer.
(*mapTerrain) parenthesis to not trip over operator precedence.
(*mapTerrain)[i][j] to get a single item.
As a trick, we can instead use a pointer to the first element of the int[10][10]. The first element is of type int[10], and so an array pointer to that element is int(*)[10]. With this type, we can do mapTerrain[i][j] as expected, because i means "give me array number i" through pointer arithmetic.
This trick is essentially the same thing as when we do something like
char* ptp = malloc(sizeof("hello"));
Here we don't point to the whole array either, we point at the first element of it.
sizeof(int[10][10]) is 100% equivalent to sizeof(int)*10*10 (or 400 for that matter, on a system where int is 4 bytes), but the first is self-documenting code, showing that we are expecting to use the allocated data as an array int[10][10].
One free is enough because there is just one malloc. You allocate the whole 2D array as a contiguous chunk of memory.
Further reading: Correctly allocating multi-dimensional arrays.
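As a rough sketch of the int (*)[10] approach described above: the function name terrain_demo and the ROWS/COLS constants are made up for illustration, and the fill pattern is arbitrary.

```c
#include <assert.h>
#include <stdlib.h>

enum { ROWS = 10, COLS = 10 };

/* Allocate a ROWS x COLS grid as one block, fill it, read one cell back,
   and release it with a single free. Returns -1 if allocation fails. */
int terrain_demo(int i, int j)
{
    assert(i >= 0 && i < ROWS && j >= 0 && j < COLS);

    /* mapTerrain points at the first row; each row is an int[COLS]. */
    int (*mapTerrain)[COLS] = malloc(sizeof(int[ROWS][COLS]));
    if (mapTerrain == NULL)
        return -1;

    for (int r = 0; r < ROWS; r++)
        for (int c = 0; c < COLS; c++)
            mapTerrain[r][c] = r * COLS + c;

    int value = mapTerrain[i][j];   /* ordinary 2D indexing works */

    free(mapTerrain);               /* one malloc, so one free */
    return value;
}
```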
Suppose we have:
#include <stdint.h>

typedef struct {
    uint8_t someVal;
} Entry;
typedef struct {
    Entry grid[3][3];
} Matrix;
//Make a 3x3 matrix of all 0s
Matrix emptyMatrix(void) {
    Entry zero = {.someVal = 0};
    Matrix matrix;
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++) {
            //Shallow copy empty struct
            matrix.grid[i][j] = zero;
        }
    return matrix;
}
//Inside some function:
Matrix myMatrix = emptyMatrix();
I understand that in C we're allowed to return a struct from a function so this works and I've tested it. However, it's unclear to me HOW the assignment works.
Does the compiler allocate the memory for myMatrix then copy each Entry element of the array in the Matrix struct returned by emptyMatrix()?
I guess it would also be helpful to know the memory map of Matrix - I assumed that since grid is an array that Matrix's memory would contain pointers. However, it apparently stores the value. If this is the case, my guess for how the assignment works makes much more sense to me.
Edit: It seems like people are answering the question incompletely. I want to know whether my guess of how the assignment works is correct.
Each instance of Matrix will contain a 3x3 array of Entry. When you assign one instance of Matrix to another, the contents of the source matrix will be copied to the destination matrix.
Arrays are not pointers. Array expressions will "decay" to pointers if the expression is not the operand of the sizeof or unary & operators, or is not a string literal used to initialize a character array in a declaration.
For example, if you had a function call like
printMatrix( myMatrix.grid );
the expression myMatrix.grid has type "3-element array of 3-element array of Entry"; since it's not the operand of the sizeof or unary & operator, it "decays" to an expression of type "pointer to 3-element array of Entry" (Entry (*)[3]) and the value of the expression is the address of the first element of grid (which will also be the address of the whole Matrix instance).
The ABI for each environment defines how structures are passed and returned by value. A common choice is this:
small structures, up to the size of 2 or 4 registers are returned in registers.
for larger objects, the caller allocates space on its stack frame for the return value and passes a pointer to the function. When returning, the function copies whatever object is being returned into the space to which it received a pointer for the return value. That's it. This simple method allows for recursive calls.
the optimizer tries to minimize the amount of copying, especially if it can expand the function inline or if the returned value is stored into an object, as opposed to discarded or passed by value to another function.
It does not matter if the structure has one or more member arrays. The same method applies to unions as well.
Identifiers bound to arrays may be interpreted as pointers, but that doesn't mean that the array variables are pointers.
The storage for the array is part of the memory layout of the struct itself. The same way as arrays declared on the stack are on the stack itself.
You can test this by yourself by checking sizeof(Matrix).
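A quick way to try this, sketched with two hypothetical helper functions (matrixHoldsArrayInline and copyIsIndependent are not from the question; Entry and Matrix are repeated so the snippet stands alone):

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint8_t someVal; } Entry;
typedef struct { Entry grid[3][3]; } Matrix;

/* If grid were stored as pointers elsewhere, this relation would not be
   tied to the element count. It holds because the nine entries live
   inside the struct itself. */
int matrixHoldsArrayInline(void)
{
    return sizeof(Matrix) >= 9 * sizeof(Entry);
}

/* Struct assignment copies the member array element by element in effect,
   so changing the copy leaves the source untouched. */
int copyIsIndependent(void)
{
    Matrix a = {0};
    assert(sizeof(Matrix) >= sizeof a.grid);
    a.grid[1][1].someVal = 7;

    Matrix b = a;               /* copies all nine entries */
    b.grid[1][1].someVal = 9;

    return a.grid[1][1].someVal == 7 && b.grid[1][1].someVal == 9;
}
```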
So in short:
Does the compiler allocate the memory for myMatrix then copy each Entry element of the array in the Matrix struct returned by emptyMatrix()?
Yes
Matrix is a data type, struct {Entry grid[3][3];}. For every object of type Matrix, memory large enough to hold Entry grid[3][3] will be allocated.
Function emptyMatrix is returning an object of type Matrix and the object will be returned by value just like any other object.
I have a question about locating an index.
Suppose I have a "relative" index in an array (that was allocated with malloc), or basically an index that doesn't tell me where I really am. How can I find the "absolute" index?
I'm trying to use binary search to locate a number in an array but I also need the index, and when I do it with recursion I lose the actual index.
I was thinking that since it is an array maybe I can subtract sizes or something (suppose it's an array of ints) to figure out how many steps I made from the beginning, but I can't quite figure it out. Can you help?
Assuming that by relative index you mean a pointer inside the array, you can get its offset using pointer arithmetic:
int *array = malloc(100*sizeof(int));
// Let's say you've got a pointer to an array element somehow,
// through your recursive search or in any other way.
// I'll assign it directly for simplicity:
int *ptr = &array[23];
int absIndex = ptr - array; // This equals 23
The compiler deals with dividing out the sizeof the array element for you, so the result of the subtraction does not change if your array elements are doubles, chars, structs, or anything else. The pointer types of ptr and array need to match, though.
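For the recursive binary search in the question, one possible sketch is to recurse on pointers and convert to an index only at the end. The function names bsearch_rec and find_index are invented here, and the array is assumed to be sorted in ascending order.

```c
#include <assert.h>
#include <stddef.h>

/* Recursive binary search over the half-open range [lo, hi) of a sorted
   int array. Returns a pointer to a matching element, or NULL. */
const int *bsearch_rec(const int *lo, const int *hi, int key)
{
    if (lo >= hi)
        return NULL;
    const int *mid = lo + (hi - lo) / 2;
    if (*mid == key)
        return mid;
    if (*mid < key)
        return bsearch_rec(mid + 1, hi, key);
    return bsearch_rec(lo, mid, key);
}

/* Returns the absolute index of key in array[0..n), or -1 if absent. */
ptrdiff_t find_index(const int *array, size_t n, int key)
{
    assert(array != NULL);
    const int *hit = bsearch_rec(array, array + n, key);
    return hit != NULL ? hit - array : -1;   /* pointer difference = index */
}
```

The result type of a pointer subtraction is ptrdiff_t, which is why find_index returns that rather than int.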
I'm a bit confused about pointer arrays and I just wanna make sure I'm right.
When I write int *arr it is just a pointer to an int variable, not an array yet. It is only when I initialize it (say with malloc) that it becomes an array. Am I right so far?
Also I have another question: we were given (in school) a little function that is supposed to return an array of grades, with the first cell being the average. The function was deliberately wrong: what they did was to set
int *grades = getAllGrades();
And then they decreased the pointer by one for the average 'cell'
*(grades - 1) = getAverage();
return *(grades - 1);
I know this is wrong because the returned value is not an array, I just don't know how to explain it. When I set a pointer, how does the machine/compiler know if I want just a pointer or an array?
(If I'm not clear, it's because I'm trying to ask about something that is still vague to me; my apologies.)
how does the machine/compiler know if I want just a pointer or an
array?
It doesn't, and it never will. Suppose you write
int *a = malloc(3 * sizeof(int));
You just allocated 12 bytes (assuming int is 4 bytes). But malloc only sees 12. Is that 1 big object or lots of little ones? It doesn't know. The only one who actually knows is you ;)
Now about your particular example,
int *grades = getAllGrades();
At this point, as you said, there's nothing to say whether grades points to an array. But you know it points to an array, and that's what's important. Or, maybe you know it doesn't point to an array. The key is you have to know what getAllGrades does, to know if it's returning an array or a pointer to 1 thing.
*(grades - 1) = getAverage();
return *(grades - 1);
This is not necessarily wrong, but it does look kind of sketchy. If it is an array, you would expect grades[0] == *(grades + 0) to be the first element, so grades[-1] == *(grades - 1) looks like it would be before the first element. Again, it's not necessarily wrong; maybe in getAllGrades they did:
int* getAllGrades() {
int *grades = malloc(sizeof(int) * 10);
return grades + 1;
}
ie they scooched the start up by 1. It's been known to happen (look in Numerical Recipes in C) but it's kind of odd.
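A fuller sketch of that offset trick, under the assumption that the allocator reserves one hidden slot in front of the grades. The names grades_create, grades_set_average and grades_destroy are hypothetical, not the school's actual code.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate `count` grades plus one hidden slot in front for the average.
   Returns a pointer to the first grade, so grades[-1] is the hidden slot. */
int *grades_create(size_t count)
{
    int *block = calloc(count + 1, sizeof *block);
    if (block == NULL)
        return NULL;
    return block + 1;
}

/* Store the integer average of the grades in the hidden slot. */
void grades_set_average(int *grades, size_t count)
{
    assert(grades != NULL && count > 0);
    long sum = 0;
    for (size_t i = 0; i < count; i++)
        sum += grades[i];
    grades[-1] = (int)(sum / (long)count);   /* same as *(grades - 1) */
}

/* free must receive the pointer the allocator returned, so step back by one. */
void grades_destroy(int *grades)
{
    if (grades != NULL)
        free(grades - 1);
}
```

Without that extra slot, writing to grades[-1] would touch memory outside the allocation, which is undefined behavior.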
Arrays are not pointers. Pointers are not arrays.
Perhaps it would be clearer to say that array objects are not pointer objects, and vice versa.
When you declare int *arr, arr is a pointer object. That's all it is; it cannot be, and never will be, an array.
When you execute arr = malloc(10 * sizeof *arr);, (if malloc() doesn't fail, which you should always check), arr now points to an int object. That object happens to be the first element of a 10-element array of int (the one created by the malloc call). Note that there is such a thing as a pointer to an array, but this isn't it.
Arrays, in a very real sense, are not first-class types in C. You can create and manipulate array objects as you can with any other type of objects, but you'll rarely deal with array values directly. Instead, you'll deal with the elements of an array object indirectly, via pointers to those elements. And in the case of the arr declaration above, you can perform arithmetic on the pointer to the first element to obtain pointers to the other elements (and you have to have some other mechanism to remember how many elements there are).
Any expression of array type, in most contexts, is implicitly converted to a pointer to the array's first element (the exceptions are: the operand of a unary & operator, the operand of the sizeof operator, and a string literal in an initializer used to initialize an array (sub)object). That's the rule that makes it seem as if arrays and pointers are interchangeable.
The array indexing operator [] is actually defined to work on pointers, not arrays. a[b] is simply another way of writing *((a)+(b)). If a happens to be the name of an array object, it's first converted to a pointer, as I describe above.
I highly recommend reading section 6 of the comp.lang.c FAQ. (The link is to the front page, not directly to section 6, because I like to encourage people to browse the whole thing.)
I mentioned that there are array pointers. Given int foo[10];, &foo[0] is a pointer to an int, but &foo is a pointer to the entire array. Both point to the same location in memory, but they're of different types, and they behave quite differently under pointer arithmetic.
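The difference under pointer arithmetic can be observed with a short sketch like this one (element_step and array_step are names made up for the illustration):

```c
#include <assert.h>
#include <stddef.h>

/* Bytes advanced by adding 1 to a pointer to the first element. */
size_t element_step(void)
{
    int foo[10];
    int *first = &foo[0];
    return (size_t)((char *)(first + 1) - (char *)first);
}

/* Bytes advanced by adding 1 to a pointer to the whole array. */
size_t array_step(void)
{
    int foo[10];
    int (*whole)[10] = &foo;

    /* Same location as &foo[0], but a different type. */
    assert((void *)whole == (void *)&foo[0]);

    return (size_t)((char *)(whole + 1) - (char *)whole);
}
```

element_step() yields sizeof(int), while array_step() yields 10 * sizeof(int), because the second pointer steps over the entire array at once.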
When I write int *arr it is just a pointer to an int variable, not
an array yet. It is only that I initialize it (say with malloc) that
it becomes an array. Am I right so far?
Well, yes and no :). arr is a pointer to (or the address of) some block of memory. Until arr is initialized its value is indeterminate, so it probably holds an invalid or nonsense memory address. So it may be confusing to think of arr as a pointer to an int variable. For example, before the malloc, you can't store an integer in the location that it is pointing to.
Also, it may be easier to understand if you say that after the malloc, arr points to an array; it does not "become" an array. Before the malloc, arr points to some random nonsense location.
When I set a pointer, how does the machine/compiler know if I want
just a pointer or an array?
If you set a pointer (e.g. arr = <something>) you are just changing where the pointer points. That may be what you want. If you don't want to change where arr points but you want to change the values stored in the memory where it is pointing, you have to do it one element at a time (e.g. with a for loop that iterates over each element in the array).
You are right that nothing in the pointer itself says whether it points to a single int or into an array; that part is convention. Here's a picture of what memory must look like when the getAllGrades() function returns:
| secret malloc() stuff |
+-----------------------+
| average value |
+-----------------------+ ----\
grades* points here ---> | grade at index 0 | | by convention
+-----------------------+ | this stuff is
| grade at index 1 | | called grades[]
+-----------------------+ |
| grade at index 2 | |
+-----------------------+ .
| ... | .
Now, when it comes to indexing, the compiler treats a pointer and an array name the same way. So, when the compiler sees *(grades - 1) it first subtracts 1 from the grades pointer. This is special pointer arithmetic so it knows to go one whole int block upwards, and points at the average value. Then it can operate on this value, for example to set the average with *(grades - 1) = getAverage().
An aside on array indexing: Array indexing gets compiled exactly like pointer arithmetic. For example, grades[2] gets compiled down to *(grades + 2) which does pointer arithmetic to move down 2 blocks to the memory address marked "grade at index 2" in my picture. This means you could change *(grades - 1) = getAverage() to grades[-1] = getAverage() and it would work exactly the same.
If you wanted to experiment you could do (-1)[grades] which compiles down to *(-1 + grades) and works as well, but that's stupid so don't do that :)
I have an array named record[10] whose type is a table structure, say { int, int, long, long,char}
I have a function to which I want to pass the address of this array which gets called in a loop:
for(i = 0 ; i<10; i++)
{
// internal resolution will be *(record + i) will fetch an address
function(record[i]);
}
I'm confused as to why it is not working. I know it is related to basics.
It started working with
for(i = 0 ; i<10; i++)
{
// then why do I need to pass this address of address here
function(&record[i]);
}
*(record + i) is not in fact an address. record is an address, and so is (record + i), but *(record + i) is the value stored at the address represented by (record + i). Therefore, calling function(record[i]) is the same as function(*(record + i)), which will pass the value of the array element to the function, not a pointer.
The syntax &record[i] is not taking the address of an address. It is taking the address of record[i], which is an object. The brackets have a higher precedence than the ampersand, so &record[i] is equivalent to &(record[i]). You can think of it as expanding to &(*(record + i)) and then simplifying to (record + i).
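The equivalence can be checked directly with a sketch like the one below. The struct layout follows the { int, int, long, long, char } description in the question, but the field names and the functions set_a and fill are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Field names are hypothetical; the question only gives the types. */
typedef struct {
    int a, b;
    long c, d;
    char e;
} table;

/* A function that expects a pointer to one element. */
void set_a(table *t, int value)
{
    assert(t != NULL);
    t->a = value;
}

/* Pass each element's address; both spellings name the same address. */
void fill(table *record, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(&record[i] == record + i);
        set_a(&record[i], (int)i);
    }
}
```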
Update:
To address your question from the comment, an array "decays" into a pointer if you reference the name of the array by itself. If you add square brackets [], you will get a value from within the array. So, for your example, say you have an array of structures:
struct A {
...
char abc[10];
...
} record[10];
Then, you would have:
record[i] - an object of type struct A from the record array
record[i].abc - the abc array inside a particular record object, decayed to a pointer
record[i].abc[k] - a specific character from the string
&record[i].abc[0] - one way of creating a pointer to the string
The notation record[i]->abc that you mention in your comment cannot be used, since record[i] is an object and not a pointer.
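A compact sketch of those access forms, using a trimmed-down struct A that keeps only the abc member (the elided members are left out, and set_name and first_letter are hypothetical helpers):

```c
#include <assert.h>
#include <string.h>

/* Trimmed-down version of struct A: only the abc member is kept. */
struct A {
    char abc[10];
};

/* Copy a name into the abc array of element i, always NUL-terminated. */
void set_name(struct A *record, size_t i, const char *name)
{
    assert(record != NULL && name != NULL);
    strncpy(record[i].abc, name, sizeof record[i].abc - 1);
    record[i].abc[sizeof record[i].abc - 1] = '\0';
}

/* record[i].abc decays to a pointer to its first character. */
char first_letter(const struct A *record, size_t i)
{
    const char *s = record[i].abc;
    assert(s == &record[i].abc[0]);   /* two spellings of the same pointer */
    return s[0];
}
```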
Update 2:
In regards to your second comment, the same rules described above apply regardless of how you nest the array within a structure (and whether you access that structure directly or through a pointer). Accessing an array using arrayname[index] notation will give you an item from the array. Accessing an array using arrayname notation (that is, using the array name by itself) will give you a pointer to the first element in the array. If you need more details regarding this phenomenon, here are a couple of links that explain arrays and the way that their names can decay into pointers:
http://boredzo.org/pointers/
http://www.ibiblio.org/pub/languages/fortran/append-c.html
http://c-faq.com/aryptr/index.html
You're saying two different things in your question. First, you say you want to pass the address of the array, but in the code you appear to be trying to pass the address of a particular element. One of the features of C is that an array will automatically turn into a pointer to the array's first element when you use it in certain contexts. That means these two calls are 100% equivalent:
function(array);
function(&array[0]);
To get the address of a particular array element, you can do two things. One is as you've shown above:
function(&array[10]);
And the second is just do the pointer arithmetic directly:
function(array + 10);
In the first case the & is required, since as you mentioned in your question the [] causes the pointer to be dereferenced - the & undoes that operation. What you appear to be confused about are the real semantics of the [] operation. It both does pointer arithmetic and then dereferences the result - you're not getting an address out of that. That's where the & comes in (or just using array + 10 directly).
You are passing by value, which means a copy of the variable is sent to the function. In the second case you are passing a pointer to the element (C's way of emulating pass by reference), so the function can reach the original.
In the second case you are directly modifying the contents at the address of the array plus index.
Check this simple example to know the exact difference.
Your function's signature is probably
void function(table *); // argument's type is pointer to a table
When you pass record[i], you pass a table object.
In order to pass a pointer to a table, you have to pass &record[i], like you did.
Your function is expecting a pointer to the structure. This argument can point to an individual instance of that structure or to an element in an array of the given structure. Like
struct myStruct {
int a, b;
long cL, dL;
char e;
} struc1, struc2, record[20];
and the function's prototype will be
void function( struct myStruct *ptr);
Now you can pass the structure to function:
function( &struc1 );
// or
function( &record[ index] );
Now your confusion arises because of the misconception that syntax array[i] can also be treated as a pointer like we can do with the name of the array.
record (the name of the array) gives the address of the first member of the array (pointers also hold memory addresses), hence it can be passed to the function. But record[index] is different.
Actually, when we write record[ index] it gives us the value placed there which is not a pointer. Hence your function which is accepting a pointer, does not accept it.
To make it acceptable to the function, you will have to pass the address of the elements of the array i.e
function( &record[ index ] );
Here & operator gives the address of the elements of the array.
Alternatively, you can also use:
function( record + index );
Here, as we know record is the address of the first element, and when we add index in it, it gives the address of the respective element using pointer arithmetic.
Hope it was helpful.