About arrays and pointers - c

A brief question to do mainly with understanding how pointers work with arrays in this example:
char *lineptr[MAXLENGTH]
Now I understand this is the same as char **lineptr as an array in itself is a pointer.
My question is how it works in its different forms/ de-referenced states such as:
lineptr
*lineptr
**lineptr
*lineptr[]
In each of those states, whats happening, what does each state do/work as in code?
Any help is much appreciated!

Now I understand this is the same as char **lineptr as an array in itself is a pointer.
No, an array is not the same as a pointer. See the C FAQ: http://c-faq.com/aryptr/index.html.
lineptr
This is the array itself. In most situations, it decays into a pointer to its first element (i.e. &lineptr[0]). So its type is either char *[MAXLENGTH] or char **.
*lineptr
This dereferences the pointer to the first element, so it's the value of the first element (i.e. it's the same as lineptr[0]). Its type is char *.
**lineptr
This dereferences the first element (i.e. it's the same as *lineptr[0]). Its type is char.
*lineptr[]
I don't think this is valid syntax (in this context).

lineptr is the /array/ itself.
*lineptr is the first element of the array, a char *
**lineptr is the char pointed to by the first element of the array
*lineptr[N] is the char pointed to by the Nth element of the array

Ok, first things first.
Arrays are not pointers. They simply decay to pointers when needed. As a rough mental model, think of an array as a block of storage whose address is fixed, rather than a pointer variable that could be pointed somewhere else.
lineptr : This simply returns the array. Not much to say.
*lineptr : This is the same as accessing your array's first location. *lineptr = lineptr[0]. This just happens to return a char *
**lineptr: This is accessing the array's first location, and then dereferencing that location. **lineptr = *(lineptr[0]). Since your array holds char* this will return the char stored at the char * in slot 0 of the array.
*lineptr[i] : This dereferences the char* stored at i. So the char pointed to by lineptr[i] is returned.

Except when it is the operand of the sizeof or unary & operator, or is a string literal being used to initialize another array in a declaration, an expression of type "N-element array of T" will be replaced with ("decay to") an expression of type "pointer to T", and the value of the expression will be the address of the first element.
The expression lineptr has type "MAXLENGTH-element array of char *". Under most circumstances, it will be replaced with an expression of type char **. Here's a handy table showing all the possibilities:
Expression     Type                    Decays to   Evaluates to
----------     ----                    ---------   ------------
lineptr        char *[MAXLENGTH]       char **     address of first element in array
&lineptr       char *(*)[MAXLENGTH]    n/a         address of array
*lineptr       char *                  n/a         value of first element
lineptr[i]     char *                  n/a         value of i'th element
&lineptr[i]    char **                 n/a         address of i'th element
*lineptr[i]    char                    n/a         value of character pointed to by i'th element
**lineptr      char                    n/a         value of character pointed to by zero'th element
Note that lineptr and &lineptr evaluate to the same value (the address of an array is the same as the address of its first element), but their types are different (char ** vs. char *(*)[MAXLENGTH]).

Related

Print the address that stores the pointer points to an array in C

This question may seem like a duplicate; however, I have searched this site thoroughly and still cannot quite understand.
char str[] = "test";
printf("%p\n", str);
printf("%p\n", &str);
I know str itself is a pointer which points to the starting location that stores "t". Therefore I expect that printf("%p\n", str); shows this address (say 000000FFE94FFC34). Next, I wish to know in my OS, at what memory location that stores the exact information 000000FFE94FFC34. (i.e. the address of str itself.)
However, the output of the third line is the same as the previous one. It seems quite a weird behavior. And how can I figure out the address of str itself? I guess this can be achieved by using another new pointer that points to the pointer I want, i.e. char **super_pointer = str; but I think it is an unnecessarily complicated way.
Lets draw it out with the pointers added, to hopefully make it simpler to understand:
+--------+--------+--------+--------+--------+
| str[0] | str[1] | str[2] | str[3] | str[4] |
+--------+--------+--------+--------+--------+
^
|
&str[0]
|
&str
As you can see the pointer to the first element (&str[0] which is what plain str decays to) points to the same location as the array itself (&str).
But (and it's an important but): The types are very different!
The type of &str[0] is char *.
The type of &str is char (*)[5].
On another note you say that
... str itself is a pointer...
This is wrong. str itself is the array, which can decay to a pointer to its first element.
I know str itself is a pointer which points to the starting location
that stores "t".
Contrary to what you wrote, "str itself" is not a pointer; it is an array. But when used in expressions, array designators are, with rare exceptions, converted to pointers to their first elements.
From the C Standard (6.3.2.1 Lvalues, arrays, and function designators)
3 Except when it is the operand of the sizeof operator or the unary &
operator, or is a string literal used to initialize an array, an
expression that has type ‘‘array of type’’ is converted to an
expression with type ‘‘pointer to type’’ that points to the initial
element of the array object and is not an lvalue. If the array object
has register storage class, the behavior is undefined.
In this call
printf("%p\n", str);
the array designator str used as an argument expression is converted to a pointer to its first element. In fact this call is equivalent to
printf("%p\n", &str[0]);
That is this call outputs the starting address of the extent of memory occupied by the array.
This call
printf("%p\n", &str);
which, like the previous call, is better written with an explicit cast, as in
printf("%p\n", ( void * )&str);
also outputs the starting address of the extent of the memory occupied by the array.
What is the difference?
The expression str used in the call of printf has the type char * while the expression &str has the type char ( * )[5].
Both expressions are interpreted by the call as expressions of the type void *.
To make this more clear consider a two dimensional array
char str[2][6] = { "test1", "test2" };
The addresses of the array as a whole, of its first "row", and of the first character of the first "row" coincide.
That is, the expression &str that has the type char ( * )[2][6], the expression &str[0] that has the type char ( * )[6], and the expression &str[0][0] that has the type char * yield the same value: the starting address of the extent of memory occupied by the array.

Two level pointers

Say we have the following array:
char *names[]={"Abc", "Def", "Ghi", "Klm", "Nop"};
If we want to create a pointer that points to the array above, why should we use a two-level pointer as follows?
char **p1 = names;
Thanks.
Your names is an array [], of char *, i.e., an array of pointers to char.
Meanwhile p1 is a pointer which points to a pointer to char, i.e., a pointer to char *. You can assign names to it because the array decays to a pointer to its first element, and the first element of names is a pointer to char, hence names decays to a pointer to char *. This is the same type – char ** – as p1, therefore they are compatible.
(On another note, the element type of names is incorrect; the string literals are constant, and thus it should be const char *names[], and similarly p1 should be const char** – pointer to pointer to const char.)
You can see it this way, when you need to point to an integer array, you need to use a pointer to int.
int arr[5];
int* a;
a = arr;
So, in this context, you need a pointer to the elements in the array names[]. What are the elements in that array? They are char*. So you need a pointer to char*, which translates to a pointer to pointer to char.
That is
char *names[]={"Abc", "Def", "Ghi", "Klm", "Nop"};
char **p1 = names;
You did not create a pointer to an array of pointers. It is actually a pointer to the first pointer of the array.
If you want to create a pointer to array of pointers write something like this -
char *names[MAX];
char* (*p1)[MAX] = &names;
If you want to create a pointer that points to array
char *names[]={"Abc", "Def", "Ghi", "Klm", "Nop"};
then you have to write
char * ( *p1 )[sizeof( names ) / sizeof( *names )] = &names;
If you want to create a pointer that points to the elements of the array then you indeed should write
char **p1 = names;
In this case pointer p1 is initialized with the address of the first element of names, and that element in turn holds the address of the first character of string literal "Abc".
So for example expression *p1 will yield the value of the first element of array names, which in turn (the value) is the address of the first character of string literal "Abc" and has type char *.
If you apply dereferencing a second time, **p1, you will get the first character of the string literal itself, that is 'A'.
For example
printf( "%c\n", **p1 );
To make it more clear let's consider a general situation.
If you have an array with elements of type T like this
T a[N];
then this array in expressions is converted to pointer to its first element that will have type T *. So if you want to declare such a pointer yourself you should write
T *p1 = a;
This record is equivalent to
T *p1 = &a[0];
In your original example type T corresponds to type char * - the type of the elements of array names. So after substituting char * for T you will get
char * *p1 = names;
^^^^^^
T
EDIT: Thanks to the comment, the array is actually bound here: because the compiler can deduce its size from the initializers.
From its declaration, names is: an array of pointers to char. A bound array decays to a pointer to its first element. Elements of names being pointer to char, when used in an expression names is implicitly converted to a pointer to a pointer to char.
This is why you can assign names in your second line:
char **p1 = names;
So p1 points to the first element in names, which for most practical purposes is like pointing to the array (in memory, an array being juxtaposed objects). Technically, you are pointing to the first element though.

Why the output is same in all three cases?

Can somebody please explain why the output is the same in all three snippets below,
and what exactly the 0th element of the array represents?
int main(void) {
char arr[10];
scanf("%s",&arr[0]);
printf("%s",arr);
return 0;
}
int main(void) {
char arr[10];
scanf("%s",&arr[0]);
printf("%s",&arr);
return 0;
}
int main(void) {
char arr[10];
scanf("%s",&arr[0]);
printf("%s",*&arr);
return 0;
}
& ("address of") and * ("dereference pointer") cancel each other out, so *&foo is the same as foo.
Your second snippet is wrong. It passes &arr (a pointer to an array of 10 chars, char (*)[10]) to printf %s, which expects a pointer to char (char *). It just so happens that on your platform those two types have the same size, use the same representation, and are passed the same way to printf. That's why the output looks correct.
As for the difference: arr is an array of chars. Evaluating an array (i.e. using it anywhere other than the operand of & or sizeof) yields a pointer to its first element.
&arr yields a pointer to the whole array. An array has no runtime structure (that is, at runtime an array is its elements), so the address of the array is also the address of its first element. It's just that the first element is smaller than the whole array and the two addresses have different types.
arr[0] represents the first element of array arr. &arr[0] is the address of the first element of the array. In all three snippets, scanf reads a string from standard input and stores it in array arr.
In first snippet
printf("%s",arr);
will print the stored string in array arr. %s expects an argument of char * type and &arr[0] is of that type and so is arr after it will decay to pointer to its first element.
In second snippet, &arr is the address of array arr and is of type char (*)[10]. Using wrong specifier will invoke undefined behavior.
In the third snippet, applying * to &arr yields the array arr again, which then decays to a pointer to its first element, of type char *, as said above.
The first and third snippets are correct and will give the same output for the same input, provided the input string is no longer than 10 characters including '\0'. The second snippet invokes undefined behavior, so nothing can be said about its output.

Reference to Array vs reference to array pointer

void check(void* elemAddr){
char* word = *((char**)elemAddr);
printf("word is %s\n",word);
}
int main(){
char array[10] = {'j','o','h','n'};
char * bla = array;
check(&bla);
check(&array);
}
Output:
word is john
RUN FINISHED; Segmentation fault; core dumped;
First one works, but second not. I don't understand why this happens.
The problem is, when we do &array, we are getting a char (*)[10] from an char [10], instead of a char **.
Before we do our experiment, I will emphasize that, when we pass an array as an argument to a function, C actually converts the array to a pointer to its first element. The big bucket of data is not copied.
Thus, int main(int argc, char **argv) is identical to int main(int argc, char *argv[]) in C.
This made it available for us to print the address of an array with a simple printf.
Let's do the experiment:
char array[] = "john";
printf("array: %p\n", array);
printf("&array: %p\n", &array);
// Output:
array: 0x7fff924eaae0
&array: 0x7fff924eaae0
After knowing this, let's dig into your code:
char array[10] = "john";
char *bla = array;
check(&bla);
check(&array);
bla is char *, and &bla is char **.
However, array is char [10], and &array is char (*)[10] instead of char **.
So when you pass &array as an argument, the char (*)[10] holds the same address as a char * to the first element, as shown above.
Therefore **(char **) &bla == 'j' while *(char *) &array == 'j'. Do some simple experiments and you will prove it.
And you are casting void *elemAddr to a char ** and trying to dereference it. This only works with &bla since it really is a char **. With &array it causes a segfault, because the cast makes the bytes of "john" get interpreted as an address.
For check(&bla); you are sending pointer to pointer
void check(void* elemAddr){
char* word = *((char**)elemAddr); // works fine for pointer to pointer
printf("word is %s\n",word);
}
This is working fine.
But, for check(&array); you are passing pointer only
void check(void* elemAddr){
// char* word = *((char**)elemAddr); // This does not work for a pointer to an array
char* word = *(char (*)[10])(elemAddr); // Try this for check(&array);
printf("word is %s\n",word);
}
Full Code--
Code for check(array);:
void check(void* elemAddr){
char* word = *(char (*)[10])(elemAddr);
printf("word is %s\n",word);
}
int main() {
char array[10] = {'j','o','h','n'};
check((char*)array);
return 0;
}
Code for check(&bla);:
void check(void* elemAddr){
char* word = *((char**)elemAddr);
printf("word is %s\n",word);
}
int main() {
char array[10] = {'j','o','h','n'};
char* bla = array;
check(&bla);
return 0;
}
The C specification says that array and &array are the same pointer address.
Using the name of an array when passing an array to a function will automatically convert the argument to a pointer per the C specification (emphasis mine).
6.3.2.1-3
Except when it is the operand of the sizeof operator or the unary &
operator, or is a string literal used to initialize an array, an
expression that has type ‘‘array of type’’ is converted to an
expression with type ‘‘pointer to type’’ that points to the initial
element of the array object and is not an lvalue. If the array object
has register storage class, the behavior is undefined.
So calling func(array) will cause a pointer to char (pointing at the first element) to be passed to the function. But there is a special case for using the address-of operator on an array. Since array has type "array of type" it falls into the 'Otherwise' category of the specification (emphasis mine).
6.5.3.2-3
The unary & operator yields the address of its operand. If the operand
has type ‘‘type’’, the result has type ‘‘pointer to type’’. If the
operand is the result of a unary * operator, neither that operator nor
the & operator is evaluated and the result is as if both were omitted,
except that the constraints on the operators still apply and the
result is not an lvalue. Similarly, if the operand is the result of a
[] operator, neither the & operator nor the unary * that is implied by
the [] is evaluated and the result is as if the & operator were
removed and the [] operator were changed to a + operator. Otherwise,
the result is a pointer to the object or function designated by its
operand
So calling func(&array) will still cause a single pointer to be passed to the function just like calling func(array) does since both array and &array are the same pointer value.
Common-sense would lead you to believe that &array is a double pointer to the first element of the array because using the & operator typically behaves that way. But arrays are different. So when you de-reference the passed array pointer as a double pointer to the array you get a Segmentation fault.
This is not a direct answer to your question, but it might be helpful to you in the future.
Arrays are not pointers:
type arr[10]:
An amount of sizeof(type)*10 bytes is used
The values of arr and &arr are necessarily identical
arr points to a valid memory address, but cannot be set to point to another memory address
type* ptr = arr:
An additional amount of sizeof(type*) bytes is used
The values of ptr and &ptr are typically different, unless you set ptr = (type*)&ptr
ptr can be set to point to both valid and invalid memory addresses, as many times as you will
As with regards to your question: &bla != bla == array == &array, and therefore &bla != &array.
A side note on null termination: your char array is in fact null-terminated. array is an automatic variable, but because it has an initializer, the elements you do not list are initialized to zero, so the last 6 chars are '\0'. (Only an array declared with no initializer at all would have indeterminate contents.)
The simple answer to your question is that &bla != &array so your check() function is assuming it will find null-terminated character arrays at 2 different addresses.
The following equations are true:
array == &array // while not the same types exactly, these are equivalent pointers
array == bla
&array == bla
*bla == array[0]
&bla is never going to equal anything you want because that syntax references the address of the bla variable on the local stack and has nothing to do with its value (or what it points to).
Hope that helps.

C pointer : array variable

I read this in my book (and many sources on the internet):
The array variable points to the first element in the array.
If this is true, then the array variable and the first element are different. Right?
It means by below code, it will produce two different results:
int main(){
char msg[] = "stack over flow";
printf("the store string is store at :%p\n",&msg);
printf("First element: %p\n",&msg[0]);
}
But I receive the same results for the two cases. So, by this example, I think we should say: the array variable is the first element. (because it has the same address)
I don't know if this is right or wrong. Please teach me.
The array variable signifies the entire memory block the array occupies, not only the array's first element. So array is not the same as array[0] (cf. sizeof array / sizeof array[0]). But the array's first element is located at the same memory address as the array itself.
Saying the array points to the first element is also incorrect. In most circumstances, an array expression decays into a pointer to its first element, but they are different things (again cf. sizeof for example).
They point to the same address, i.e. printf will show the same value but they have different types.
The type of &msg is char(*)[16], pointer to array 16 of char
The type of &msg[0] is char *, pointer to char
A cheap way to test this is to do some pointer arithmetic. Try printing &msg + 1.
This C FAQ might prove useful.
The array variable is the whole array. It decays into a pointer to the first element of the array.
If you look at the types:
msg is of type char [16]
&msg is of type char (*)[16]
&msg[0] is of type char *
So in a context where msg can decay into a pointer, for example when passed as an argument, its value would be equal to &msg[0].
Let me draw this:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+--+
|s|t|a|c|k| |o|v|e|r| |f|l|o|w|\0|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+--+
Imagine the starting point of this array, where 's' is located is address 0x12345678.
msg itself refers to the whole 16 bytes of memory. Like when you say int a;, a refers to sizeof(int) bytes of memory (typically 4).
msg[0] is the first byte of that 16 bytes.
&msg is the address where array begins: 0x12345678
&msg[0] is the address of first element of array: 0x12345678
This is why the values of &msg and &msg[0] are the same, but their types are different.
Now the thing is, msg by itself is not a first class citizen. You cannot for example assign arrays. That is why, in most of the cases, the array will decay into its pointer.
If you know function pointers, this is very similar:
int array[10];
int function(int);
In int *var = array, array decays to a pointer to its first element (&array[0])
In void *var = function, function decays to a pointer (&function)
Note that, in case of function pointers, we like to keep the type, so we write:
int (*var)(int) = function;
Similarly, you can do with arrays:
int (*var)[10] = &array;
char myChar = 'A';
char msg[] = "ABCDEFGH";
When you type myChar you get the value.
But with msg you get a pointer to the first char (for the values you have to use msg[x]):
msg == &msg[0]
This can help you to understand, I think.
Look at it this way:
&msg = 0x0012
&msg[0] = 0x0012
&msg[1] = 0x0013
In this case &msg[1] points to the same location as msg+1. When you reference &msg or &msg[0] you are referring to the same memory address because this is where the array starts. Adding 1 to msg (after it decays to a pointer) advances the address by 1 byte since a char is 1 byte in size.
If you do the same with, say, an int array, the address advances by sizeof(int) bytes, which is typically 4.
When you use an array expression, the compiler converts it to a pointer to the first element. This is an automatic (implicit) conversion specified by the 1999 C standard, in 6.3.2.1 paragraph 3. It is a convenience for you, so that you do not have to write &array[0] to get a pointer to the first element.
The conversion happens in all expressions except when an array expression is the operand of sizeof or the unary & or is a string literal used to initialize an array.
You can see that an array and its first element are different by printing sizeof array and sizeof array[0].
In most circumstances, an expression of array type ("N-element array of T") will be replaced with / converted to / "decay" to an expression of pointer type ("pointer to T"), and the value of the expression will be the address of the first element in the array.
So, assuming the declaration
int a[10];
the type of the expression a is "10-element array of int", or int [10]. However, in most contexts, the type of the expression will be converted to "pointer to int", or int *, and the value of the expression will be equivalent to &a[0].
The exceptions to this rule are when the array expression is the operand of the sizeof or unary & operators, or is a string literal being used to initialize another array in a declaration.
So, based on our declaration above, all of the following are true:
Expression      Type           Decays to   Value
----------      ----           ---------   -----
a               int [10]       int *       address of the first element of a
&a              int (*)[10]    n/a         address of the array, which is the
                                           same as the address of the first
                                           element
&a[0]           int *          n/a         address of the first element of a
*a              int            n/a         value of a[0]
sizeof a        size_t         n/a         number of bytes in the array
                                           (10 * sizeof (int))
sizeof &a       size_t         n/a         number of bytes in a pointer to
                                           an array of int
sizeof *a       size_t         n/a         number of bytes in an int
sizeof &a[0]    size_t         n/a         number of bytes in a pointer to int
Note that the expressions a, &a, and &a[0] all have the same value (address of the first element of a), but the types are different. Types matter. Assume the following:
int a[10];
int *p = a;
int (*pa)[10] = &a;
Both p and pa point to the first element of a, which we'll assume is at address 0x8000. After executing the lines
p++;
pa++;
however, p points to the next integer (0x8004, assuming 4-byte ints), while pa points to the next 10-element array of integers; that is, the first integer after the last element of a (0x8028).
