I am confused as to how the following passage matches up with the code that follows it:
Since argv is a pointer to an array of pointers, we can manipulate the
pointer rather than index the array. This next variant is based on
incrementing argv, which is a pointer to pointer to char, while argc
is counted down:
#include <stdio.h>
/* echo command-line arguments; 2nd version */
main(int argc, char *argv[])
{
while (--argc > 0)
printf("%s%s", *++argv, (argc > 1) ? " " : "");
printf("\n");
return 0;
}
Isn't char *argv[] just an array of pointers? Wouldn't a pointer to an array of pointers be written as char *(*argv[]) or something similar?
As a side note, is it normal that in general I find declarations that mix arrays and pointers rather confusing?
Such terms as "pointer to array" or "to point to an array" are often treated rather loosely in C terminology. They can mean at least two different things.
In the most strict and pedantic sense of the term, a "pointer to array" has to be declared with "pointer to array" type, as in
int a[10];
int (*p)[10] = &a;
In the above example p is declared as a pointer to array of 10 ints and it is actually initialized to point to such an array.
However, the term is also often used in its less formal meaning. In this example
int a[10];
int *p = a;
p is declared as a mere pointer to int. It is initialized to point to the first element of array a. You can often hear and see people say that p in this case also "points to an array" of ints, even though this situation is semantically different from the previous one. "Points to an array" in this case means "provides access to elements of an array through pointer arithmetic", as in p[5] or *(p + 3).
This is exactly what is meant by the phrase "...argv is a pointer to an array of pointers..." you quoted. argv's declaration in parameter list of main is equivalent to char **argv, meaning that argv is actually a pointer to a char * pointer. But since it physically points to the first element of some array of char * pointers (maintained by the calling code), it is correct to say semi-informally that argv points to an array of pointers.
That's exactly what is meant by the text you quoted.
Where C functions claim to accept arrays, strictly they accept pointers instead. The language does not distinguish between void fn(int *foo) {} and void fn(int foo[]). It doesn't even care if you have void fn(int foo[100]) and then pass that an array of int [10].
int main(int argc, char *argv[])
is the same as
int main(int argc, char **argv)
Consequently, argv points to the first element of an array of char pointers, but it is not itself an array type and it does not (formally) point to a whole array. But we know that array is there, and we can index into it to get the other elements.
In more complex cases, like accepting multi-dimensional arrays, it is only the first [] which drops back to a pointer (and which can be left unsized). The others remain as part of the type that is being pointed to, and they have an influence on pointer arithmetic.
The array-pointer equivalence holds true only for function arguments, so while void fn(const char* argv[]) and void fn(const char** argv) are equivalent, it doesn't hold true when it comes to the variables you might want to pass TO the function.
Consider
void fn(const char** argv)
{
...
}
int main(int argc, const char* argv[])
{
fn(argv); // acceptable.
const char* meats[] = { "Chicken", "Cow", "Pizza" };
// "meats" is an array of const char* pointers, just like argv, so
fn(meats); // acceptable.
const char** meatPtr = meats;
fn(meatPtr); // acceptable: the previous call implicitly converted meats to this.
// an array of character arrays.
const char vegetables[][10] = { "Avocado", "Pork", "Pepperoni" };
fn(vegetables); // does not compile.
return 0;
}
"vegetables" is not a pointer to a pointer, it points directly to the first character in a 3*10 contiguous character sequence. Replace fn(vegetables) in the above to get
int main(int argc, const char* argv[])
{
// an array of character arrays.
const char vegetables[][10] = { "Avocado", "Pork", "Pepperoni" };
printf("*vegetables = %c\n", *(const char*)vegetables);
return 0;
}
and the output is "*vegetables = A": vegetables itself points directly, without indirection, at the characters, and not at intermediate pointers.
The vegetables initialization lays the data out roughly as if you had written this (pseudocode, since an array cannot actually be assigned to):
const char* __vegetablesPtr = "Avocado\0\0\0Pork\0\0\0\0\0\0Pepperoni";
vegetables = __vegetablesPtr;
and
const char* roni = vegetables[2];
translates to
const char* roni = (const char*)vegetables + (sizeof(char) * /*dimension=*/10 * /*index=*/2);
Since argv is a pointer to an array of pointers.
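This is wrong, strictly speaking. argv is declared like an array of pointers, and as a parameter it is a pointer to the first of those pointers.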
Since argv is a pointer to an array of pointers,
No, not even close.
Isn't char *argv[] just an array of pointers?
No, it's a pointer to pointers.
"Pointer to the first element of an array" is a common construct. Every string function uses it, including stdio functions that input and output strings. main uses it for argv.
"Pointer to an array" is a rare construct. I can't find any uses of it in the C standard library or POSIX. grepping all the headers I have installed locally (for '([^)]*\*[^)]) *\[') I find exactly 2 legitimate instances of pointer-to-array, one in libjpeg and one in gtk. (Both are struct members, not function parameters, but that's beside the point.)
So if we stick to official language, we have a rare thing with a short name and a similar but much more common thing with a long name. That's the opposite of the way human language naturally wants to work, so there's tension, which gets resolved in all but the most formal situations by using the short name "incorrectly".
The reason we don't just say "pointer to pointer" is that there's another common use of pointers as function parameters, in which the parameter points to a single object that's not a member of an array. For example, in
long strtol(const char *nptr, char **endptr, int base);
endptr is exactly the same type as argv is in main, both are pointer-to-pointer, but they're used in different ways. argv points to the first char * in an array of char *s; inside main you're expected to use it with indexes like argv[0], argv[optind], etc., or step through the array by incrementing it with ++argv.
endptr points to a single char *. Inside strtol, it is not useful to increment endptr or to refer to endptr[n] for any value of n other than zero.
That semantic difference is expressed by the informal usage of "argv is a pointer to an array". The possible confusion with what "pointer to array" means in formal language is ignored, because the natural instinct to use concise language is stronger than the desire to adhere to a formal definition that tells you not to use the most obvious simple phrase because it's reserved for a situation that will almost never happen.
I am trying hard to understand the difference between char *s[] and char **s initialization.
My char *s[] works fine, whereas my char **s1 throws an error: [Error] scalar object 's1' requires one element in initializer. I don't get the meaning of that error.
How can we initialize char **s1 properly?
#include<stdio.h>
int main(void)
{
char *s[]={"APPLE","ORANGE"};
char **s1={"APPLE","ORANGE"};
return 0;
}
TLDR: char **s1=(char*[]){"apple","orange"};.
You can, of course, initialize pointers with the address of an element in an array. This is more common with simpler data types: given int arr[] = {1,2}; you can say int *p = &arr[0];, a notation which I hate and have only spelled out here in order to make clear what we are doing. Since arrays decay to pointers to their first elements anyway, you can more simply write int *p = arr;. Note how the pointer is of the type "pointer to element type".
Now your array s contains elements of type "pointer to char". You can do exactly the same as before. The element type is pointer to char, so the pointer type must be a pointer to that, a pointer to pointer to char, as you have correctly written:
char **s2= &s[0];, or simpler char **s2= s;.
Now that's a bit pointless because you have s already and don't really need a pointer any longer. What you want is an "array literal". C99 introduced just that with a notation which prefixes the element list with a kind of type cast. With a simple array of ints it would look like this: int *p = (int []){1, 2};. With your char pointers it looks like this:
char **s1=(char*[]){"apple","orange"};.
Caveat: While the string literals have static storage duration (i.e., pointers to them stay valid until the program ends), the array object created by the literal does not: Its lifetime ends with the enclosing block. That's probably OK if the enclosing block is the main function like here, but you cannot, for example, initialize a bunch of pointers in an "initialize" routine and use them later.
Caveat 2: It would be better to declare the arrays and pointers as pointing to const char, since the string literals typically are not writable on modern systems. Your code compiles only for historical reasons; forbidding char *s = "this is constant"; would break too much existing code. (C++ does forbid it. Standard C++ also has no compound literals of this kind, so the program below is not valid C++ anyway.) I adjusted the types accordingly in the complete program below which demonstrates the use of a compound literal. You can even take its address, like that of any other array!
#include<stdio.h>
int main(void)
{
/// array of pointers to char.
const char *arrOfCharPtrs[2]={"APPLE","ORANGE"};
/// pointer to first element in array
const char **ptrToArrElem= &arrOfCharPtrs[0];
/// pointer to element in array literal
const char **ptrToArrLiteralElem=(const char*[]){"apple","orange"};
/// pointer to entire array.
/// Yes, you can take the address of the entire array!
const char *(*ptrToArr)[2] = &arrOfCharPtrs;
/// pointer to entire array literal. Note the parentheses around
/// (*ptrToArrLiteral)- Yes, you can take the address of an array literal!
const char *(*ptrToArrLiteral)[2] = &(const char *[]){"apples", "pears"};
printf("%s, %s\n", ptrToArrElem[0], ptrToArrElem[1]);
printf("%s, %s\n", ptrToArrLiteralElem[0], ptrToArrLiteralElem[1]);
printf("%s, %s\n", (*ptrToArr)[0], (*ptrToArr)[1]);
// In order to access elements in an array pointed to by ptrToArrLiteral,
// you have to dereference the pointer first, yielding the array object,
// which then can be indexed. Note the parentheses around (*ptrToArrLiteral)
// which force dereferencing *before* indexing, here and in the declaration.
printf("%s, %s\n", (*ptrToArrLiteral)[0], (*ptrToArrLiteral)[1]);
return 0;
}
Sample session:
$ gcc -Wall -pedantic -o array-literal array-literal.c && ./array-literal
APPLE, ORANGE
apple, orange
APPLE, ORANGE
apples, pears
Isn't the address of an array and thus of all its elements as well constant anyway?
And if so, in a declaration like:
char *const argv[]
isn't the const qualifier redundant?
No, the const in char *const argv[] is not redundant.
First, const and "constant" are actually two different things in C, even though the const keyword is obviously derived from the word "constant". A constant expression is one that can be evaluated at compile time. const really means "read-only". For example:
const int r = rand();
is perfectly legal.
Yes, the address of an array -- like the address of any object -- is read-only. But that doesn't mean that the value of the array (which consists of the values of its elements) is read-only, any more than any other object is necessarily read-only.
Consider these three declarations:
char *arr1[10];
char *const arr2[10];
const char *arr3[10];
arr1 is a 10-element array of pointers to char. You can modify the char* elements and you can modify the objects that those elements point to.
arr2 is an array of const (read-only) pointers to char. That means that you can't modify the char* elements of the array (once they're initialized) -- but you can still modify the char objects or arrays that those elements point to.
And arr3 is an array of pointers to const char; you can modify the array elements, but you can't modify what they point to.
Now the fact that you used the name argv suggests that you're talking about the second parameter to main, which has some huge effects on this. The language specifies that main's second parameter is
char *argv[]
or, equivalently,
char **argv
There is no const. You can probably get away with adding one, but it's best to follow the form specified by the standard. (Update: I see from your comment that you're asking about the argv parameter of getopt(), which is defined as char * const argv[].)
And since it's a parameter defined as an array, another rule comes into play: a parameter defined as an array of some type is "adjusted" to a pointer to that type. (This rule applies only to parameters.) This isn't a run-time conversion. A function cannot have a parameter of array type.
The relationship between arrays and pointers in C can be confusing -- and there's a lot of misinformation out there. The most important thing to remember is that arrays are not pointers.
Section 6 of the comp.lang.c FAQ is an excellent explanation of the details.
Isn't the address of an array and thus of all its elements as well
constant anyway?
Yes, and it is true for any object in C. Recall that by object here, we mean a location in memory having a value and referenced by an identifier. The identifier is bound to a fixed memory location throughout its scope and you cannot change it. You can change the value of the object though.
int a = 4;
a = 6; // legal. you can change the value of the object
&a = 23456; // illegal. you cannot change the address of the object
Similarly, an array is also an object and each of its elements will have a fixed memory address. However, the value held by an element of the array has nothing to do with the address of the element.
Note that if the declaration appears in a function parameter list, then the following are equivalent
char *const argv[]
char *const *argv
which means that argv is a pointer to an object which is of type char *const, i.e., a constant pointer to a character. It's obvious that char *const *argv and char **argv are different. So let's take another example.
char *const argv[10];
The above statement defines argv to be an array of 10 constant pointers to a character. This means that you have to initialize the array and cannot later change the pointers to point to a different character. However, this has nothing to do with the address of the array elements.
char c = 'A';
char d = 'B';
char *const argv[2] = {&c, &d};
argv = &c; // illegal. you cannot change the address of an object
argv[0] = &d; // illegal. you cannot change the value of the array element
*argv[0] = 'C'; // legal. you change the value pointed to by the element
Without the const qualifier, char *argv[2] means an array of 2 pointers to characters.
This is clearly different from the case when we have the const qualifier as explained above. Therefore, to answer your second question, no, the const qualifier is not redundant. That's because the const qualifier qualifies the type of the array elements.
No, it isn't. char *const argv[] is an array of constant pointers to char. So the const makes the pointers in the array constant (you cannot change them to point to other strings in memory).
I'm learning c programming, and I don't understand what is this asterisk for in the main method.
int main(int argc, char* argv[])
char* a; means that a is a pointer to variable of type char.
In your case argv is a pointer to a pointer (or even to several of them; how many is given by argc) to variable(s) of type char. In other words, it points to the first element of an array holding argc pointers to char variables.
You can even write your code this way: int main(int argc, char** argv), and nothing actually changes, because in a parameter list char* a is the same as char a[].
It means that argv is an array of character pointers.
The declaration char *argv[] declares argv as an array (of unknown size) of pointer to char.
For any type T, the declaration
T *p;
declares p as a pointer to T. Note that the * is bound to the identifier, not the type; in the declaration
T *a, b;
only a is declared as a pointer.
It signifies a pointer. char argv[] declares an array of characters. char* argv[] declares an array of character pointers, or pointers to strings.
Those are parameters passed from the command line to your program. This asterisk marks a pointer.
Basically, char argv[] is an array of characters, and char *argv[] is an array of pointers to characters. So it is here to represent multiple strings, to put it simply!
Note that, as a function parameter, char *argv[] is equivalent to char * * argv, just as char argv[] could be written as char *argv.
Just to go further, you might be amazed that, given int a[5], these two expressions are equivalent:
a[3]
3[a]
This is because a[i] is defined as *(a + i), and in an expression the array name a is converted to a pointer to its first element.
So a[1] can be represented as *(a + 1), a[2] as *(a + 2) etc. Which is equivalent to *(1 + a) or *(2 + a).
Anyway, pointers are one of the most important and most difficult notions to grasp when starting to program in C, so I would suggest taking a serious look at them on Google!
The " * " here specifies a pointer: each element of argv[] (the argument values) is a pointer to the start of one argument string.
You don't know in advance how many parameters the user will pass, which is why main receives both argc [argument count] and argv [argument values]. The array is not given a fixed size in the declaration; the runtime environment builds it before main is called, and argv points to its first element.
Hope this helped, if this didn't I'll be glad to help just let me know :)
This might be a bit of a basic question, but what is the difference between writing char * [] and char **? For example, in main, I can have a char * argv[]. Alternatively I can use char ** argv. I assume there's got to be some kind of difference between the two notations.
Under the circumstances, there's no difference at all. If you try to use an array type as a function parameter, the compiler will "adjust" that to a pointer type instead (i.e., T a[x] as a function parameter means exactly the same thing as: T *a).
Under the right circumstances (i.e., not as a function parameter), there can be a difference between using array and pointer notation though. One common one is in an extern declaration. For example, let's assume we have one file that contains something like:
char a[20];
and we want to make that visible in another file. This will work:
extern char a[];
but this will not:
extern char *a;
If we make it an array of pointers instead:
char *a[20];
...the same remains true -- declaring an extern array works fine, but declaring an extern pointer does not:
extern char *a[]; // works
extern char **a; // doesn't work
Depends on context.
As a function parameter, they mean the same thing (to the compiler), but writing it char *argv[] might help make it obvious to programmers that the char** being passed points to the first element of an array of char*.
As a variable declaration, they mean different things. One is a pointer to a pointer, the other is an array of pointers, and the array is of unspecified size. So you can do:
char * foo[] = {0, 0, 0};
And get an array of 3 null pointers. Three char*s is a completely different thing from a pointer to a char*.
You can use cdecl.org to convert them to English:
char *argv[] = declare argv as array of pointer to char
char **argv = declare argv as pointer to pointer to char
I think this is a little bit more than syntactic sugar, it also offers a way to express semantic information about the (voluntary) contract implied by each type of declaration.
With char*[] you are saying that this is intended to be used as an array.
With char**, you are saying that you CAN use this as an array but that's not the way it's intended to be used.
As it was mentioned in the other answers, char*[] declares an array of pointers to char, char** declares a pointer to a pointer to char (which can be used as array).
One difference is that the array name cannot be modified (it is not a modifiable lvalue), whereas the pointer can.
Example:
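#include <stddef.h>
int main(void)
{
char* apc[] = {NULL};
char** ppc = apc;
ppc++; /* fine: a pointer is a modifiable lvalue */
apc++; /* this won't compile: an array is not a modifiable lvalue */
return 0;
}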
This really depends on the context of where the declarations occur.
Outside of a function parameter definition, the declaration
T a[];
declares a as an unknown-sized array of T; the array type is incomplete, so unless a is defined elsewhere (either in this translation unit or another translation unit that gets linked) then no storage is set aside for it (and you will probably get an "undefined reference" error if you attempt to link, although I think gcc's default behavior is to define the array with 1 element). It cannot be used as an operand to the sizeof operator. It can be used as an operand of the & operator.
For example:
/**
* module1.c
*/
#include <stdio.h>
extern char *a[]; /* non-defining declaration of a */
void foo(void)
{
size_t i;
/* a must end with a NULL sentinel, as it does in module2.c */
for (i = 0; a[i] != NULL; i++)
printf("a[%lu] = %s\n", (unsigned long) i, a[i]);
}
module1.c uses a non-defining declaration of a to introduce the name so that it can be used in the function foo, but since no size is specified, no storage is set aside for it in this translation unit. Most importantly, the expression a is not a pointer type; it is an incomplete array type. It will be converted to a pointer type in the call to printf by the usual rules.
/**
* module2.c
*/
char *a[] = {"foo", "bar", "bletch", "blurga", NULL}; /* defining declaration of a */
int main(void)
{
void foo();
foo();
return 0;
}
module2.c contains a defining declaration for a (the size of the array is computed from the number of elements in the initializer), which causes storage to be allocated for the array.
Style note: please don't ever write code like this.
In the context of a function parameter declaration, T a[] is synonymous with T *a; in both cases, a is a pointer type. This is only true in the context of a function parameter declaration.
As Paul said in the comment above, it's syntactic sugar. As function parameters, char* and char[] are the same data type: in both cases the parameter holds the address of a char.
In that context the array/index notation is equivalent to the pointer notation, both in declaration and in access, but it is sometimes much more intuitive. If you are passing an array of char pointers, you may want to write it one way or the other to clarify your intention.
Edit: didn't consider the case Jerry mentioned in the other answer. Take a look at that.
char *ptr[2]={"good","bad"}; // array of pointers to char
char **str; // pointer to pointer to char
int i;
//str = &ptr[0]; // also works
str = ptr;
for(i=0;i<2;i++) printf("%s %s\n",ptr[i],str[i]);
The output is the same through either name, which makes the relationship easy to see.
For pointers, I'm getting confused with declarations and function parameters on when to use char ** or char * or *array[n], etc. Like if a function takes a (*array[n]) parameter, do I pass it a **type?
I try using the Right-Left rule and know that p would be a pointer to a pointer to a char (char **p), and p is an array of n pointers (*p[n]), but someone said that *p[n] and **p are essentially equivalent. Is that true?
In the correct context (namely, arguments to a function), then the following declarations are equivalent:
int main(int argc, char *argv[]);
int main(int argc, char **argv);
int main(int argc, char *argv[12]); // Very unconventional!
Similar comments apply to the function definitions (which have a block enclosed in braces in place of the semi-colon).
In any other context, there are important differences between the notations. For example:
extern char *list1[];
extern char **list2;
extern char *list3[12];
The first says that somewhere there is an array of indeterminate size containing 'char *' values. The second says that somewhere - possibly here - there is a single value containing a pointer to a char pointer. The third says that somewhere - possibly here - there is an array of 12 character pointers.
However, all the three lists can be referenced in somewhat the same way - assuming that they actually have been defined and initialized.
list1[0][0] = '1';
list2[0][0] = '2';
list3[0][0] = '3';
Further, if they are passed into a function like this:
function(list1, list2, list3);
then the function can be declared as:
void function(char **list1, char **list2, char **list3);
The arrays (list1, list3) decay from the array to the pointer to the first element of the array; list2, of course, is already a pointer to a pointer.
One detail to note in a function such as:
void otherfunction(char *list[12])
{
...
}
The C compiler does not treat that declaration any differently from:
void otherfunction(char **list)
{
...
}
or
void otherfunction(char *list[])
{
...
}
In particular, it does no array bounds checking, and as far as the function is concerned, the 12 may as well be absent.
C99 introduces VLA (variable length array) types and also introduces a notation with 'static' and a size in the array bounds. You would need to read the standard to understand those fully.
Suffice it to say that in a function like the following, the size of the array does matter and is determined at run-time. With two-dimensional arrays in general, all the dimensions except the first need to be specified.
void vla_function(size_t m, int vla[m][m]);
Quoting from the standard (section 6.7.5.3):
void f(double (* restrict a)[5]);
void f(double a[restrict][5]);
void f(double a[restrict 3][5]);
void f(double a[restrict static 3][5]);
(Note that the last declaration also specifies that the argument corresponding to a in any call to f must be a
non-null pointer to the first of at least three arrays of 5 doubles, which the others do not.)
Reading C declarators (that's the part of the declaration with the * and []) is fairly nuanced. There are some websites with tips:
http://www.antlr.org/wiki/display/CS652/How+To+Read+C+Declarations
http://www.ericgiguere.com/articles/reading-c-declarations.html
A char** is a pointer to (possibly multiple) pointer(s) to (possibly multiple) char(s). For example, it might be a pointer to a string pointer, or a pointer to an array of string pointers.
A char*[] is an array of pointers to char. When you have a function that takes this as a parameter, the C compiler makes it "decay" into a char**. This only happens to the first layer... so, taking a complicated example, char*[][4] becomes char*(*)[4]. Read the links above so you can understand what the heck that means.
Or you can do a (very sensible) thing and make a bunch of typedefs. I don't do this, but until you're good at reading declarators, it's a good idea.
typedef char * stringp;
void func(stringp array[]) { ... }
static stringp FOUR_STRINGS[4] = { ... };
If n==0 then they reference the same memory. Array indexing is basically a pointer plus an offset. *(p[n]) would be the same as **(p+n). You can see for yourself how simple this is in C, because array[4] and 4[array] will give you the same thing.