We can declare a pointer to an integer by writing int*. We have already seen a pointer type, char** argv, which is a pointer to pointers to characters.
It seems that argv is a pointer to multiple pointers, each of which points to chars.
In C, strings are represented by the pointer type char*. Under the hood they are stored as a sequence of characters whose final character is a special character called the null terminator.
Is that also the case for the char** above, with the pointers stored as characters in the string?
A pointer can point to a single object, or it can point to an array of objects.
In the case of the argv parameter to main, which is declared as char *argv[] (or equivalently char **argv, since it is a function parameter), it points to the first element of an array of char *.
In memory it looks something like this:
argv
-----
| .-|----> ------
-----      |    |      ----------------------------------
           |  .-|----> | s | t | r | i | n | g | 1 | \0 |
           |    |      ----------------------------------
           ------
           |    |      ----------------------------------
           |  .-|----> | s | t | r | i | n | g | 2 | \0 |
           |    |      ----------------------------------
           ------
           |    |      ----------------------------------
           |  .-|----> | s | t | r | i | n | g | 3 | \0 |
           |    |      ----------------------------------
           ------
             ...
When we define something like char *argv[], for example:
Example 1:
char *p[5] = {"ali", "reza", "hamid", "saeed", "mohsen"};
for(int i = 0;i < 5;i++)
    printf("%s\n", p[i]); /* p[i] is a char *; *p[i] would be a single char */
Example 2: (here p points to the first of 5 char* pointers)
char **p;
p = new char*[5];
for(int i = 0;i < 5;i++)
p[i] = new char[10];
Is this what happens in memory?
Yes.
A pointer p to type T can point to a single T, or to an array of T. In the latter case you can index into the array with p[n], which is defined in terms of pointer arithmetic. In the same way, the pointees of argv[n] are not single chars but nul-terminated arrays of chars, also known as C-style strings.
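The second example in the question uses C++ new; a rough C counterpart with malloc might look like the sketch below. The names make_table and free_table are hypothetical helpers introduced only for illustration, and zero-filling each row is an assumption, not something the question asked for.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate `rows` buffers of `cols` chars each, reachable through a char **.
   Returns NULL if any allocation fails. */
char **make_table(size_t rows, size_t cols)
{
    assert(rows > 0 && cols > 0);
    char **table = malloc(rows * sizeof *table);   /* array of char * */
    if (table == NULL)
        return NULL;
    for (size_t i = 0; i < rows; i++) {
        table[i] = calloc(cols, 1);   /* zero-filled, so each row starts as "" */
        if (table[i] == NULL) {
            while (i > 0)             /* undo the rows allocated so far */
                free(table[--i]);
            free(table);
            return NULL;
        }
    }
    return table;
}

/* Release everything make_table allocated. */
void free_table(char **table, size_t rows)
{
    if (table == NULL)
        return;
    for (size_t i = 0; i < rows; i++)
        free(table[i]);
    free(table);
}
```

Here table is a char ** that points to the first of rows pointers, and each table[i] points to the first of cols chars, which mirrors the argv picture above.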
A pointer holds a memory address, typically the address of a variable. A pointer to a pointer adds a level of indirection: the first pointer holds the address of a second pointer variable, and the second pointer holds the address where the value is stored.
argv stands for argument vector; it refers to the arguments passed to the program on the command line. As a pointer, argv refers to the first element of an array of char pointers; because the vector is laid out as an array, the other pointers can be reached by indexing from the first.
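As a small illustration of walking an argument vector, the sketch below indexes an argv-style array at both levels. The function names total_length and char_at are made up for this example; in a real program args and count would simply be argv and argc.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sum the lengths of the first `count` strings in an argv-style array.
   `args` points at the first element of an array of char pointers. */
size_t total_length(char **args, int count)
{
    assert(args != NULL && count >= 0);
    size_t total = 0;
    for (int i = 0; i < count; i++)
        total += strlen(args[i]);   /* args[i] is a char * to a nul-terminated string */
    return total;
}

/* Return character j of string i: two levels of indexing. */
char char_at(char **args, int i, size_t j)
{
    return args[i][j];              /* same as *(*(args + i) + j) */
}
```

Inside main, total_length(argv, argc) would give the combined length of the program name and all arguments.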
Memory-Address: |0xA0|0xA1|0xA2|0xA3|0xA4|0xA5|0xA6|0xA7|
Memory-Content: |       0x123       |       0x456       |
                |-------4-Byte------|
                |<- int* = 0x123
A pointer in C contains the address of a specific region in memory (ignoring virtual memory).
The address marks the start position (here 0xA0), and the extent of the region is determined by the size of the pointed-to C type.
The content may itself be a pointer. (Only 32-bit addresses are shown here.)
Memory-Address: |0xA0|0xA1|0xA2|0xA3|0xA4|0xA5|0xA6|0xA7|
Memory-Content: |       0xA4        |       0x123       |
                |-------4-Byte------|
                |<- int** = 0xA4    |<- int* = 0x123
So you can construct any pointer hierarchy in memory.
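A two-level hierarchy like the one in the diagram can be sketched as follows. read_through and write_through are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Follow two levels of indirection and return the int at the end. */
int read_through(int **pp)
{
    assert(pp != NULL && *pp != NULL);
    return **pp;        /* *pp is the inner int *, **pp is the int it points to */
}

/* Store a value through two levels of indirection. */
void write_through(int **pp, int value)
{
    assert(pp != NULL && *pp != NULL);
    **pp = value;
}
```

With int value; int *p = &value; int **pp = &p;, a call to write_through(pp, 7) changes value itself, because both levels are followed before the store.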
I have been working on trying to write a function that does string comparison for a generic binary search function.
However, while writing the function, I realized that my pointer dereferencing does not work.
In essence, this is what doesn't work:
printf("***a[0] = %c\n", (*(char **)(void *)&"a")[0]);
I ran the debugger which tells me EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
However, this extremely similar code (which I believe to be identical to my previous code) does work.
char * stringa = "a";
printf("***stringa[0] = %c\n", (*(char **)(void *)&stringa)[0]);
I don't understand why the second one works but the first one doesn't. My understanding is that "a" and stringa both represent the memory address of the beginning of a character array.
Thank you in advance.
Pointers are not arrays. Arrays are not pointers.
&stringa results in a pointer to pointer of type char**.
&"a" results in an array pointer of type char(*)[2]. It is not compatible with char**.
You try to dereference the char(*)[2] by treating it as a char**, which won't work: the types are not compatible. In practice the array pointer says "at address x there is data", but after the conversion you are saying "at address x there is a pointer".
If you print printf("%p\n", *(char **)(void *)&"a"); you don't get an address but data. I get something like <garbage> 0061, which is a little-endian machine interpreting the bytes of the string as a larger integer. In memory you have 0x61 ('a') then 0x00 (null terminator): the string itself, not an address that you can dereference.
First, check this rule - from C11 Standard#6.3.2.1p3 [emphasis added]:
3 Except when it is the operand of the sizeof operator, the _Alignof operator, or the unary & operator, or is a string literal used to initialize an array, an expression that has type ''array of type'' is converted to an expression with type ''pointer to type'' that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.
From String literals [emphasis added]:
Constructs an unnamed object of specified character array type in-place, used when a character string needs to be embedded in source code.
Let's decode this first:
char * stringa = "a";
printf("***stringa[0] = %c\n", (*(char **)(void *)&stringa)[0]);
In the statement char * stringa = "a";, the string "a" is converted to a pointer to char that points to the initial element of the string "a". So, after initialisation, stringa points to the first element of the string literal "a".
&stringa is of type char **. Dereferencing it gives a char * that points to the string "a", and applying [0] to that gives the character 'a'.
Now, let's decode this:
printf("***a[0] = %c\n", (*(char **)(void *)&"a")[0]);
Because the unary & operator is applied here, in the expression (*(char **)(void *)&"a")[0] the string "a" is not converted to a pointer to its initial element. &"a" yields a pointer to the whole array, of type char (*)[2] in C (written const char (*)[2] in this answer, as it would be in C++), and that pointer is then cast to char **.
Dereferencing this pointer reads the bytes stored at that address, which are the string "a" itself, and treats them as a char * (because of the char ** cast) before applying [0]. That means it tries to do something like ((char *)0x0000000000000061)[0] (0x61 is the hex value of the character 'a'), which results in the EXC_BAD_ACCESS error.
Instead, you should do
printf("***a[0] = %c\n", (*(const char (*)[2])(void *)&"a")[0]);
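The two well-defined accesses can be put side by side in a short sketch. The function names are hypothetical, and the array parameter is written without const because in C the type of &"a" is char (*)[2].

```c
#include <assert.h>
#include <stddef.h>

/* ps really is the address of a char * object, so *ps is a pointer
   and (*ps)[0] is the first character it points to. */
char first_via_pointer(char **ps)
{
    assert(ps != NULL && *ps != NULL);
    return (*ps)[0];
}

/* pa really is the address of a 2-element char array, so *pa is the
   array itself and (*pa)[0] is its first element. */
char first_via_array(char (*pa)[2])
{
    assert(pa != NULL);
    return (*pa)[0];
}
```

Calling first_via_pointer(&stringa) and first_via_array(&"a") both yield 'a'; the crash in the question comes only from handing the second kind of address to code that expects the first kind.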
EDIT:
OP is still confused. This edit is an attempt to explain the expressions (above in the post) in a different way.
From comments:
OP: But you wrote (*(const char (*)[2])(void *)&"a")[0] works! There are two dereferencing operations (* and [0]) going on here!
Not sure whether you are aware of it, but I think it's good to share the definition of the [] operator, from C11 Standard#6.5.2.1p2:
The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2))).
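That equivalence is easy to check directly. The helper below is a hypothetical name used only for this sketch; it compares the three spellings the definition allows.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 when the three spellings of "element i of s" agree.
   i[s] is legal because E1[E2] is defined as (*((E1)+(E2))) and
   addition is commutative. */
int subscript_forms_agree(const char *s, size_t i)
{
    assert(s != NULL);
    return s[i] == *(s + i) && *(s + i) == i[s];
}
```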
Expression (*(char **)(void *)&stringa)[0]:
(*(char **)(void *)&stringa)[0]
| | |
| +----------------------+
| |
| this will result in
| type casting a pointer
| of type char ** to char **
|
|
This dereferencing
will be applied on result of &stringa
i.e. ( * ( &stringa ) )
and result in stringa
i.e. this
|
|
| &stringa (its type is char **)
| +-------+
| | 800 |---+
| +-------+ |
| |
+-------> stringa |
/ +-------+ (pointer stringa pointing to first char of string "a"
/ | 200 |---+ (type of stringa is char *)
| +-------+ |
now apply [0] 800 |
to it |
i.e. stringa[0]. +-------+
stringa[0] is +-> | a | 0 | (string literal - "a")
equivalent to | +-------+
*((stringa) + (0)) | 200 ---> address of "a"
i.e. |
*(200 + 0), |
add 0 to address 200 |
and dereference it. |
*(200 + 0) => *(200) |
dereferencing address |
200 will result in |
value at that address |
which is character |
'a', that means, |
*(200) result in -------+
Expression (*(char **)(void *)&"a")[0]:
(*(char **)(void *)&"a")[0]
| | |
| +------------------+
| |
| this will result in
| type casting a pointer
| of type const char (*)[2] to char **
|
|
this dereferencing will be
applied to pointer of type
char ** which is actually a
pointer of type char (*)[2]
i.e. *(&"a").
It will result in value at address 200
which is nothing but string "a"
but since we are type casting
&"a" with double pointer (char **)
so single dereference result
will be considered as pointer of
type char i.e. char *.
*(char **)(void *)&"a"
|
|
| &"a" (its type is const char (*)[2] because type of "a" is
| +-------+ const char [2] i.e. array of 2 characters)
| | 200 |---+
| +-------+ |
| |
| |
| |
| +-------+
+------------------> | a | 0 | (string literal - "a")
/ +-------+
/ 200 ---> address of "a"
|
|
The content at this location will be
treated as pointer (of type char *)
i.e. the hex of "a" (0x0061) [because the string has character `a` followed by null character]
will be treated as pointer.
Applying [0] to this pointer
i.e. (0x0061)[0], which is
equivalent to (* ((0x0061) + 0)).
(* ((0x0061) + 0)) => *(0x0061)
i.e. trying to dereference 0x0061
Hence, resulting in bad access error.
Expression (*(const char (*)[2])(void *)&"a")[0]:
(*(const char (*)[2])(void *)&"a")[0]
| | |
| +----------------------------+
| |
| this will result in
| type casting a pointer
| of type const char (*)[2] to const char (*)[2]
|
|
this dereferencing will be
applied to pointer of type
const char (*)[2]
i.e. *(&"a")
and result string "a"
whose type is const char [2]
|
|
| &"a" (its type is const char (*)[2] because type of "a" is
| +-------+ const char [2] i.e. array of 2 characters)
| | 200 |---+
| +-------+ |
| |
| |
| |
| +-------+
+------------------> | a | 0 | (string literal - "a")
/ +-------+
/ 200 ---> address of "a"
|
|
Apply [0] to "a"
i.e. "a"[0].
Now, scroll to the top of my post
and check string literal definition -
string literal constructs unnamed object of character array type.....
also, read rule 6.3.2.1p3
(which is applicable for an array of type) -
....an expression that has type 'array of type' is converted
to an expression with type 'pointer to type' that points to
the initial element of the array object. ....
So, "a" (in expression "a"[0]) will be converted to pointer
to initial element i.e. pointer to character `a` which is
nothing but address 200.
"a"[0] -> (* ((a) + (0))) -> (* ((200) + (0)))
-> (* (200)) -> 'a'
From comments:
OP: there is no such thing as an object in C ....
Don't confuse the word object with objects in C++ or other object-oriented languages.
This is how C standard defines an object:
From C11 Standard#3.15p1
1 object
region of data storage in the execution environment, the contents of which can represent values
E.g. - int x; --> x is an object of type int.
Let me know if you have any more questions.
I was testing some code to find out how a 2D array is implemented in C.
Then I ran into the following problem.
The code is:
int main(){
int a[4][4];
printf("a: %p, *a: %p, **a: 0x%x\n",a,*a,**a);
}
I compiled this with gcc on 32-bit Ubuntu.
The result was:
a: 0xbf9d6fdc, *a: 0xbf9d6fdc, **a: 0x0
I expected different values for a and *a, but they are the same.
Why are a and *a the same in this case?
Isn't a of type int **?
Then what is the role of the * operator in *a?
Check the types!!
Given the definition as int a[4][4];
a is of type int [4][4] - array of 4 arrays of 4 integers. It's not the same as int **.
a[n] is of type int [4] - array of 4 integers. It's not the same as int *
a[n][m] is of type int. - integer.
Now, given the fact, that the address of the array is also the address of the first element in the array, the values are same, but they differ in types.
To check it visually
int a[4][4]
+-------+--------+-------+-----------+
| | | | |
|a[0][0]| a[0][1]| a[0][2] a[0][3] |
| | | | |
+------------------------------------+
| | | | |
|a[1][0]| | | |
| | | | |
+------------------------------------+
| | | | |
|a[2][0]| | | |
| | | | |
+------------------------------------+
| | | | |
| | | | |
| | | | |
| | | | |
+-------+--------+-------+-----------+
Then, quoting the C11, §6.3.2.1
Except when it is the operand of the sizeof operator, the _Alignof operator, or the
unary & operator, or is a string literal used to initialize an array, an expression that has
type ‘‘array of type’’ is converted to an expression with type ‘‘pointer to type’’ that points
to the initial element of the array object and is not an lvalue. [...]
So, when an array is passed as a function argument, it decays to a pointer to the first element of the array.
So, let's have a look.
a decays to &(a[0]), the address of the first element; that pointer has type int (*)[4].
*a, which is the same as a[0], decays to an int *, a pointer to its first element.
**a, which is the same as *(*a) == *(a[0]) == a[0][0], is the int value at that index.
Now once again look carefully at the image above - do you see that a[0] and a[0][0] reside at the same address? In other words, their starting addresses are the same.
That's the reason behind the output you see.
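The "same address, different type" point can be sketched in a few lines. The function names are hypothetical; the parameter type int (*)[4] is what int a[4][4] decays to when passed to a function.

```c
#include <assert.h>
#include <stddef.h>

/* `a` is what int a[4][4] decays to: a pointer to its first row. */
int same_start(int (*a)[4])
{
    assert(a != NULL);
    return (void *)a == (void *)*a;   /* row address vs. first int address */
}

/* Size of the thing each pointer steps over (sizeof suppresses decay). */
size_t step_of_a(int (*a)[4])      { return sizeof *a;  }  /* one int[4] row */
size_t step_of_star_a(int (*a)[4]) { return sizeof **a; }  /* one int */
```

So a and *a compare equal as addresses, yet a + 1 advances by a whole row while *a + 1 advances by a single int.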
Program:
#include<stdio.h>
int main(void) {
int x[4];
printf("%p\n", x);
printf("%p\n", x + 1);
printf("%p\n", &x);
printf("%p\n", &x + 1);
}
Output:
$ ./a.out
0xbff93510
0xbff93514
0xbff93510
0xbff93520
$
I expected the output of the above program to be something like this:
x // 0x100
x+1 // 0x104 Because x is an integer array
&x // 0x100 Address of array
&x+1 // 0x104
But the output of the last statement is different from what I expected. &x is also the address of the array, so I expected that adding 1 to it
would print the address incremented by 4. But &x+1 gives the address incremented by 0x10 (16). Why?
x -> Points to the first element of the array.
&x ->Points to the entire array.
Stumbled upon a descriptive explanation here: http://arjunsreedharan.org/post/69303442896/the-difference-between-arr-and-arr-how-to-find
SO link: Why is arr and &arr the same?
In case 4 you get 0x100 + sizeof x, and sizeof x is 4 * sizeof(int) = 4 * 4 = 16 = 0x10.
(On your system, sizeof(int) is 4.)
An easy rule of thumb to evaluate this is:
Incrementing a pointer makes it point to the next object of the type it points to.
The type of &x here is int (*)[4], which is a pointer to an array of 4 integers.
So the next pointer of this type points 16 bytes (assuming int is 4 bytes) past the start of the original array.
Even though x and &x evaluate to the same pointer value, they are different types. Type of x after it decays to a pointer is int* whereas type of &x is int (*)[4].
sizeof(x) is sizeof(int)*4.
Hence the numerical difference between &x and &x + 1 is sizeof(int)*4.
It can be better visualized using a 2D array. Let's say you have:
int array[2][4];
The memory layout for array is:
array
|
+---+---+---+---+---+---+---+---+
| | | | | | | | |
+---+---+---+---+---+---+---+---+
array[0] array[1]
| |
+---+---+---+---+---+---+---+---+
| | | | | | | | |
+---+---+---+---+---+---+---+---+
If you use a pointer to such an array,
int (*ptr)[4] = array;
and look at the memory through the pointer, it looks like:
ptr ptr+1
| |
+---+---+---+---+---+---+---+---+
| | | | | | | | |
+---+---+---+---+---+---+---+---+
As you can see, the difference between ptr and ptr+1 is sizeof(int)*4. That analogy applies to the difference between &x and &x + 1 in your code.
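The distances can also be measured rather than read off printed addresses. bytes_between is a hypothetical helper for this sketch; it only ever receives pointers into, or one past the end of, the same array.

```c
#include <assert.h>
#include <stddef.h>

/* Byte distance from `from` to `to`; both must point into (or one past)
   the same object, as x, x + 1, &x and &x + 1 all do. */
ptrdiff_t bytes_between(const void *from, const void *to)
{
    assert(from != NULL && to != NULL);
    return (const char *)to - (const char *)from;
}
```

For int x[4], bytes_between(x, x + 1) equals sizeof(int), while bytes_between(&x, &x + 1) equals sizeof x, that is, the size of the whole array.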
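Strictly speaking, the behaviour of your program is undefined, although not because of &x + 1 itself.
&x + 1 points just beyond the array, as @i486's answer points out. You don't own that memory. Forming such a one-past-the-end pointer is allowed, but attempting to dereference it is undefined behaviour. What makes the program formally undefined is that %p expects a void * argument, so each pointer passed to printf should be cast to (void *).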
#include <stdio.h>
#include <stdlib.h>
#include <conio.h>
#include <string.h>
struct BOOK{
char name[15];
char author[33];
int year;
};
struct BOOK *books;
int main(){
int i,noBooks;
noBooks=2;
books=malloc(sizeof(struct BOOK)*noBooks);
books[0].year=1986;
strcpy(books[0].name,"MartinEden");
strcpy(books[0].author,"JackLondon");
//asking user to give values
scanf("%d",&books[1].year);
scanf("%s",&books[1].name);
scanf("%s",books[1].author);
printf("%d %s %s\n",books[0].year,books[0].author,books[0].name);
printf("%d %s %s\n",books[1].year,books[1].author,books[1].name);
getch();
return 0;
}
I give 1988, theidiot and dostoyevski as input.
The output is
1986 JackLondon MartinEden
1988 dostoyevski theidiot
In scanf, I used & with books[].name but not with books[].author, and it still did the same thing. For year it did not work without &. Is & useless with a structure?
I mean here
scanf("%d",&books[1].year);
scanf("%s",&books[1].name);
scanf("%s",books[1].author); //no & operator
char name[15];
char author[33];
Here, I can use
char *name[15];
char *author[33];
and nothing changes. Why can't I see the difference?
The name member of the BOOK structure is a char array of size 15. When the name of the array is used in an expression, its value is the address of the array's initial element.
When you take an address of the name member from a struct BOOK, though, the compiler returns the base address of the struct plus the offset of the name member, which is precisely the same as the address of name's initial element. That is why both &books[1].name and books[1].name expressions evaluate to the same value.
Note: you should specify the size of the buffers into which you are going to read the strings; this will prevent potential buffer overruns:
scanf("%14s", books[1].name);
scanf("%32s", books[1].author);
This form is valid:
scanf("%s",books[1].author);
This form is invalid:
scanf("%s", &books[1].author);
The s conversion specifier in scanf expects an argument of type pointer to char, which is true in the first statement but false in the second. Failing to meet this requirement makes the function call invoke undefined behavior.
In the first statement the trailing argument (after conversion) is of type pointer to char; in the second statement the argument is of type pointer to an array of 33 char.
Except when it is the operand of the sizeof or unary & operator, or is a string literal being used to initialize another array in a declaration, an expression of type "N-element array of T" will be converted ("decay") to an expression of type "pointer to T", and the value of the expression will be the address of the first element in the array.
When you write
scanf("%s", books[1].author);
the expression books[1].author has type "33-element array of char". By the rule above, it will be converted to an expression of type "pointer to char" (char *) and the value of the expression will be the address of the first element of the array.
When you write
scanf("%s", &books[1].name);
the expression books[1].name is an operand of the unary & operator, so the conversion doesn't happen; instead, the expression &books[1].name has type "pointer to 15-element array of char" (char (*)[15]), and its value is the address of the array.
In C, the address of the array and the address of the first element of the array are the same, so both expressions result in the same value; however, the types of the two expressions are different, and type always matters. scanf expects the argument corresponding to the %s conversion specifier to have type char *; by passing an argument of type char (*)[15], you invoke undefined behavior, meaning the compiler isn't required to warn you about the type mismatch, nor is it required to handle the expression in any meaningful way. In this particular case, the code "works" (gives you the result you expect), but it isn't required to; it could just as easily have caused a crash, or led to corrupted data, depending on the specific implementation of scanf.
Both calls should be written as
scanf("%s", books[1].name);
scanf("%s", books[1].author);
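A testable variant of the same idea reads from a string with sscanf instead of from standard input. The struct mirrors the question's BOOK; parse_book is a hypothetical helper, and the field widths are the ones suggested in an earlier answer.

```c
#include <assert.h>
#include <stdio.h>

struct book {
    char name[15];
    char author[33];
    int year;
};

/* Parse "year name author" from a line; returns 1 on success, 0 otherwise. */
int parse_book(const char *line, struct book *out)
{
    assert(line != NULL && out != NULL);
    /* out->name and out->author decay to char *, which %s expects;
       &out->year is needed because year is a plain int, not an array.
       The widths 14 and 32 leave room for the terminating '\0'. */
    return sscanf(line, "%d %14s %32s", &out->year, out->name, out->author) == 3;
}
```

Checking the return value against the number of expected conversions also catches malformed input, which the question's code silently ignores.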
Edit
In answer to your comment, a picture may help. Here's what your books array would look like:
+---+ +---+
| | | name[0]
| +---+
| | | name[1]
| +---+
| ...
| +---+
| | | name[14]
| +---+
books[0] | | author[0]
| +---+
| | | author[1]
| +---+
| ...
| +---+
| | | author[32]
| +---+
| | | year
+---+ +---+
| | | name[0] <------ books[1].name
| +---+
| | | name[1]
| +---+
| ...
| +---+
| | | name[14]
| +---+
books[1] | | author[0] <------ books[1].author
| +---+
| | | author[1]
| +---+
| ...
| +---+
| | | author[32]
| +---+
| | | year
+---+ +---+
Each element of the books array contains two arrays plus an integer. books[1].name evaluates to the address of the first element of the name array within books[1]; similarly, the expression books[1].author evaluates to the address of the first element of the author array within books[1].
I'm trying to write a C99 program and I have an array of strings implicitly defined as such:
char *stuff[] = {"hello","pie","deadbeef"};
Since the array dimensions are not defined, how much memory is allocated for each string? Are all strings allocated the same amount of elements as the largest string in the definition? For example, would this following code be equivalent to the implicit definition above:
char stuff[3][9];
strcpy(stuff[0], "hello");
strcpy(stuff[1], "pie");
strcpy(stuff[2], "deadbeef");
Or is each string allocated just the amount of memory it needs at the time of definition (i.e. stuff[0] holds an array of 6 elements, stuff[1] holds an array of 4 elements, and stuff[2] holds an array of 9 elements)?
Pictures can help — ASCII Art is fun (but laborious).
char *stuff[] = {"hello","pie","deadbeef"};
+----------+ +---------+
| stuff[0] |--------->| hello\0 |
+----------+ +---------+ +-------+
| stuff[1] |-------------------------->| pie\0 |
+----------+ +------------+ +-------+
| stuff[2] |--------->| deadbeef\0 |
+----------+ +------------+
The memory allocated for the 1D array of pointers is contiguous, but there is no guarantee that the pointers held in the array point to contiguous sections of memory (which is why the pointer lines are different lengths).
char stuff[3][9];
strcpy(stuff[0], "hello");
strcpy(stuff[1], "pie");
strcpy(stuff[2], "deadbeef");
+---+---+---+---+---+---+---+---+---+
| h | e | l | l | o | \0| x | x | x |
+---+---+---+---+---+---+---+---+---+
| p | i | e | \0| x | x | x | x | x |
+---+---+---+---+---+---+---+---+---+
| d | e | a | d | b | e | e | f | \0|
+---+---+---+---+---+---+---+---+---+
The memory allocated for the 2D array is contiguous. The x's denote uninitialized bytes. Note that stuff[0] decays to a pointer to the 'h' of "hello", stuff[1] to a pointer to the 'p' of "pie", and stuff[2] to a pointer to the first 'd' of "deadbeef" (and stuff[3] would be a non-dereferenceable pointer to the byte just past the null byte after "deadbeef").
The pictures are quite, quite different.
Note that you could have written either of these:
char stuff[3][9] = { "hello", "pie", "deadbeef" };
char stuff[][9] = { "hello", "pie", "deadbeef" };
and you would have the same memory layout as shown in the 2D array diagram (except that the x's would be zeroed).
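The difference between the two layouts shows up in sizeof. In this sketch jagged and grid2d are hypothetical names for the two declarations discussed above, made static and const only so that they can live at file scope.

```c
#include <assert.h>
#include <stddef.h>

static const char *jagged[] = {"hello", "pie", "deadbeef"};     /* 3 pointers */
static const char  grid2d[][9] = {"hello", "pie", "deadbeef"};  /* 3 rows of 9 chars */

size_t jagged_bytes(void) { return sizeof jagged; }     /* 3 * sizeof(char *) */
size_t grid_bytes(void)   { return sizeof grid2d; }     /* 3 * 9 */
size_t row_bytes(void)    { return sizeof grid2d[0]; }  /* 9, whatever the string length */

/* With an initializer, the unused tail of a row is zero-filled. */
int padding_is_zero(void)
{
    assert(jagged[1][0] == 'p');   /* the pointer array reaches "pie" indirectly */
    /* "pie" occupies grid2d[1][0..2]; index 3 onward should all be '\0'. */
    for (size_t i = 3; i < sizeof grid2d[1]; i++)
        if (grid2d[1][i] != '\0')
            return 0;
    return 1;
}
```

sizeof jagged counts only the pointers, not the characters they point to, whereas sizeof grid2d counts every byte of every row.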
char *stuff[] = {"hello","pie","deadbeef"};
Is not a multidimensional array! It is simply an array of pointers.
how much memory is allocated for each string?
The number of characters plus a null terminator. Same as any string literal.
I think you want this:
char foo[][10] = {"hello","pie","deadbeef"};
Here, 10 is the amount of space reserved per string (including the null terminator), and all the strings are in contiguous memory. Thus, shorter strings are followed by padding.
The first example is a jagged array, I suppose.
It declares an array of pointers to char, each pointing at a string literal, so each literal can be as long as you like. The length of a string is independent of the array's dimensions.
In the second one, each row (string) holds at most 9 chars, as specified by your column size, so a string can have at most 8 characters plus the null terminator.
Are all strings allocated the same amount of elements as the largest
string in the definition?
No, only 3 pointers are allocated, and they point to 3 string literals.
char *stuff[] = {"hello","pie","deadbeef"};
and
char stuff[3][9];
are not at all equivalent. The first is an array of 3 pointers, whereas the second is a 2D array.
For the first, only the pointers are allocated in the array itself, and the string literals they point to may be stored in a read-only section. The second, if declared inside a function, is allocated in automatic storage (usually the stack).