I know that when used as function parameter char* a[] is equivalent to char a[][].
Correction: when used as a function parameter, char* a[] is equivalent to char** a, not to char a[][]. Some call this array-to-pointer conversion.
However, when used in block scope they are not the same, and I'm confused about when I should prefer one over the other, or whether I should skip char a[][] altogether, as I usually see char* a[] in other people's code.
One argument against char a[][] is obviously that you have to give a fixed size for the C-strings it will contain, but does that affect performance in any way?
Should I prefer this:
char* a[] = {"hello", "world"};
Or this:
char a[][10] = {"hello", "world"};
The key to understanding the seemingly strange syntax cases of function parameters is to understand array decay. This is all about one rule in C that says, whenever you pass an array as parameter to a function, it decays into a pointer to the first element of that array (*).
So when you write something like
void func (int a[5]);
then at compile-time, the array gets replaced with a pointer to the first element, making the above equal to:
void func (int* a);
This rule of array decay also applies to multi-dimensional arrays, but only to the outermost dimension. So if you pass a multi-dimensional array to a function:
void func (int a[5][3]);
it still decays to a pointer to the first element. Now as it happens, a 2D array is actually an array of arrays. The first element is therefore a one-dimensional array, with size 3 in this example. You get an array pointer to that array, the type int(*)[3]. Making the above equivalent to
void func (int (*a)[3]);
And this is actually the reason why we can omit the left-most dimension of the array parameter, and only that dimension. Upon doing so we make an array of incomplete type, which you normally wouldn't be able to use: for example you can't write code like int array[]; inside a function's body. But in the parameter case, it doesn't matter that the left-most dimension isn't specified, because that dimension will "decay away" anyway.
(*) Source, C11 6.7.6.3/7:
A declaration of a parameter as ‘‘array of type’’ shall be adjusted to
‘‘qualified pointer to type’’, ...
Adjustment of array type to pointer type works only when it is declared as a parameter of a function.
As a function parameter, char* a[] will be adjusted to char** a and char a[][10] to char (*a)[10]. Otherwise, char* a[] declares a as an array of pointers to char, while char a[][10] declares a as an array of arrays of char.
Preference of
char* a[] = {"hello", "world"};
over this
char a[][10] = {"hello", "world"};
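makes sense when you want to save some bytes of memory. In the latter case, 10 bytes are allocated for each of a[0] and a[1], however short the strings are. Note that with char* a[], the elements of a point to string literals, which must not be modified (attempting to do so is undefined behavior), so const char* a[] is the safer spelling.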
If you want contiguous memory allocation then go with char a[][10].
I was learning about pointers and strings.
I understood that,
Pointers and Arrays/Strings have similar behaviours.
array[] , *array , &array[0]. They all are one and the same.
Why does the three statements in this code work, and char * help one does not ?
#include <stdio.h>
void display(char*help){
    for(int i=0; help[i]!='\0'; i++){
        printf("%c", help[i]);
    }
}
int main(){
    // char help[] = "Help_Me"; //Works
    // char help[] = {'H','e','l','p','_','M','e','\0'}; //Works
    // char *help = "Help_Me"; //Works
    char *help = {'H','e','l','p','_','M','e','\0'}; //Error
    display(help);
}
Error Messages :
warning: initialization of 'char *' from 'int' makes pointer from integer without a cast
warning: excess elements in scalar initializer
Pointers and Arrays/Strings have similar behaviours.
Actually, no, I wouldn't agree with that. It is an oversimplification that hides important details. The true situation is that arrays have almost no behaviors of their own, but in most contexts, an lvalue designating an array is automatically converted to pointer to the first array element. The resulting pointer behaves like a pointer, of course, which is what may present the appearance that pointers and arrays have similar behaviors.
Additionally, arrays are objects, whereas strings are certain configurations of data that char arrays can contain. Although people sometimes conflate strings with the arrays containing them or with pointers to their first elements, that is not formally correct.
array[] , *array , &array[0]. They all are one and the same.
No, not at all, though the differences depend on the context in which those appear:
In a declaration of array (other than in a function prototype),
type array[] declares array as an array of type whose size will be determined from its initializer;
type *array declares array as a pointer to type; and
&array[0] is not part of any valid declaration of array.
In a function prototype,
type array[] is "adjusted" automatically as if it were type *array, and it therefore declares array as a pointer to type;
type *array declares array as a pointer to type; and
&array[0] is not part of any valid declaration of array.
In an expression,
array[] is invalid;
*array is equivalent to array[0], which designates the first element of array; and
&array[0] is a pointer to array[0].
Now, you ask,
Why does the three statements in this code work, and char * help one does not ?
"Help_Me" is a string literal. It designates a statically-allocated array just large enough to contain the specified characters plus a string terminator. As an array-valued expression, in most contexts it is converted to a pointer to its first element, and such a pointer is of the correct type for use in ...
// char *help = "Help_Me"; //Works
But the appearance of a string literal as the initializer of a char array ...
// char help[] = "Help_Me"; //Works
... is one of the few contexts where an array value is not automatically converted to a pointer. In that context, the elements of the array designated by the string literal are used to initialize the array being declared, very much like ...
// char help[] = {'H','e','l','p','_','M','e','\0'}; //Works
There, {'H','e','l','p','_','M','e','\0'} is an array initializer specifying values for 8 array elements. Note well that taken as a whole, it is not itself a value, just a syntactic container for eight values of type int (in C) or char (in C++).
And that's why this ...
char *help = {'H','e','l','p','_','M','e','\0'}; //Error
... does not make sense. There, help is a scalar object, not an array or a structure, so it takes only one value. And that value is of type char *. The warnings delivered by your compiler are telling you that eight values have been presented instead of one, and they have, or at least the one used for the initialization has, type int instead of type char *.
array[] , *array , &array[0]. They all are one and the same.
No. Presuming array names some array, array[] cannot be used in an expression (except where it might appear in some type description, such as a cast).
array by itself in an expression is automatically converted to a pointer to its first element except when it is the operand of sizeof or the operand of unary &. (Also, a string literal, such as "abc", denotes an array, and this array has another exception to when it is converted: When it is used to initialize an array.)
In *array, array will be automatically converted to a pointer, and then * refers to the element it points to. Thus *array refers to an element in an array; it is not a pointer to the array or its elements.
In &array[0], array[0] refers to the first element of the array, and then & takes its address, so &array[0] is a pointer to the first element of the array. This makes it equivalent to array in expressions, with the exceptions noted above. For example, void *p = array; and void *p = &array[0]; will initialize p to the same thing, a pointer to the first element of the array, because of the automatic conversion. However, size_t s = sizeof array; and size_t s = sizeof &array[0]; may initialize s to different values—the first to the size of the entire array and the second to the size of a pointer.
// char help[] = "Help_Me"; //Works
help is an array of char, and character arrays can be initialized with a string literal. This is a special rule for initializations.
// char help[] = {'H','e','l','p','_','M','e','\0'}; //Works
help is an array, and the initializer is a list of values for the elements of the array.
// char *help = "Help_Me"; //Works
help is a pointer, and "Help_Me" is a string literal. Because it is not in one of the exceptions—operand of sizeof, operand of unary &, or used to initialize an array—it is automatically converted to a pointer to its first element. Then help is initialized with that pointer value.
char *help = {'H','e','l','p','_','M','e','\0'}; //Error
help is a pointer, but the initializer is a list of values. There is only one thing to be initialized, a pointer, but there are multiple values listed for it, so that is an error. Also, a pointer should be initialized with a pointer value (an address or a null pointer constant), but the items in that list are integers. (Character literals are integers; their values are the codes for the characters.)
{'H','e','l','p','_','M','e','\0'} is not a syntax that creates a string or an array. It is a syntax that can be used to provide a list of values when initializing an object. So the compiler does not recognize it as a string or array and does not use it to initialize the pointer help.
A pointer is not an array, and it can't be initialized like one. You need to create an array object first; then you can assign its address to the pointer:
char *help = (char[]){'H','e','l','p','_','M','e','\0'};
I am trying hard to understand the difference between the char *s[] and char **s initializations.
My char *s[] works fine, whereas my char **s1 throws an error: [Error] scalar object 's1' requires one element in initializer. I don't get the meaning of that error.
How can we initialize char **s1 properly?
#include<stdio.h>
int main(void)
{
    char *s[]={"APPLE","ORANGE"};
    char **s1={"APPLE","ORANGE"};
    return 0;
}
TLDR: char **s1=(char*[]){"apple","orange"};.
You can, of course, initialize pointers with the address of an element in an array. This is more common with simpler data types: Given int arr[] = {1,2}; you can say int *p = &arr[0]; (a notation which I hate and have only spelled out here in order to make clear what we are doing). Since arrays decay to pointers to their first elements anyway, you can more simply write int *p = arr;. Note how the pointer is of the type "pointer to element type".
Now your array s contains elements of type "pointer to char". You can do exactly the same as before. The element type is pointer to char, so the pointer type must be a pointer to that, a pointer to pointer to char, as you have correctly written:
char **s2= &s[0];, or simpler char **s2= s;.
Now that's a bit pointless because you have s already and don't really need a pointer any longer. What you want is an "array literal". C99 introduced just that with a notation which prefixes the element list with a kind of type cast. With a simple array of ints it would look like this: int *p = (int []){1, 2};. With your char pointers it looks like this:
char **s1=(char*[]){"apple","orange"};.
Caveat: While the string literals have static storage duration (i.e., pointers to them stay valid until the program ends), the array object created by the literal does not: Its lifetime ends with the enclosing block. That's probably OK if the enclosing block is the main function like here, but you cannot, for example, initialize a bunch of pointers in an "initialize" routine and use them later.
Caveat 2: It would be better to declare the arrays and pointers as pointing to const char, since the string literals typically are not writable on modern systems. Your code compiles only for historical reasons; forbidding char *s = "this is constant"; would break too much existing code. (C++ does forbid it, and such code cannot be compiled as C++. Note, though, that standard C++ has no compound literals, so the program below is not valid C++ either.) I adjusted the types accordingly in the complete program below, which demonstrates the use of a compound literal. You can even take its address, like that of any other array!
#include<stdio.h>
int main(void)
{
    /// array of pointers to char.
    const char *arrOfCharPtrs[2]={"APPLE","ORANGE"};
    /// pointer to first element in array
    const char **ptrToArrElem= &arrOfCharPtrs[0];
    /// pointer to element in array literal
    const char **ptrToArrLiteralElem=(const char*[]){"apple","orange"};
    /// pointer to entire array.
    /// Yes, you can take the address of the entire array!
    const char *(*ptrToArr)[2] = &arrOfCharPtrs;
    /// pointer to entire array literal. Note the parentheses around
    /// (*ptrToArrLiteral)- Yes, you can take the address of an array literal!
    const char *(*ptrToArrLiteral)[2] = &(const char *[]){"apples", "pears"};
    printf("%s, %s\n", ptrToArrElem[0], ptrToArrElem[1]);
    printf("%s, %s\n", ptrToArrLiteralElem[0], ptrToArrLiteralElem[1]);
    printf("%s, %s\n", (*ptrToArr)[0], (*ptrToArr)[1]);
    // In order to access elements in an array pointed to by ptrToArrLiteral,
    // you have to dereference the pointer first, yielding the array object,
    // which then can be indexed. Note the parentheses around (*ptrToArrLiteral)
    // which force dereferencing *before* indexing, here and in the declaration.
    printf("%s, %s\n", (*ptrToArrLiteral)[0], (*ptrToArrLiteral)[1]);
    return 0;
}
Sample session:
$ gcc -Wall -pedantic -o array-literal array-literal.c && ./array-literal
APPLE, ORANGE
apple, orange
APPLE, ORANGE
apples, pears
int array[5];
Expressions such as
array[3] gets converted to *(array+3)
Or in
void fun ( int *array[] );
*array[] gets converted to int **array
I was wondering what does the array declaration
int array[5];
Get converted to? Is it
int *(array+5)
If yes, what does this even mean? And how does one interpret it and/or read it?
array[i] gets converted to *(array+i)
Correct, given that array[i] is part of an expression, then array "decays" into a pointer to its first element, which is why the above holds true.
void fun ( int *array[] );
*array[] gets converted to int **array
Yes, because of the rule of function parameter adjustment ("decay"), which is similar to array decay in expressions. The first item of that array is an int*, so after decay you end up with a pointer to such a type, an int**.
This is only true for functions with the specific format you posted, there is otherwise no relation between pointer-to-pointers and arrays.
I was wondering what does the array declaration
int array[5];
Get converted to?
Nothing, declarations don't get converted. It is an array of 5 integers.
To sum this up, you actually list 3 different cases.
When an array is used as part of an expression, it "decays" into a pointer to the first element.
When an array is used as part of a function parameter declaration, it "decays" too - it actually has its type replaced by the compiler at compile-time - into a pointer to the first element. C was deliberately designed this way, so that functions would work together with arrays used in expressions.
When an array is declared normally (not part of a parameter list), nothing happens except you get an array of the specified size.
I think you are confusing two things.
*(array+i)
cannot be used for declaration, only for accessing the memory location (array being the starting address and i the offset)
Also, the following declaration creates an array of 5 integers (with automatic storage, typically on the stack, when it appears inside a function):
int array[5];
You can access any element of the array with either notation, because array[3] is defined to mean *(array+3). The following two lines yield the same result:
int a = *(array+3);
int b = array[3];
if (a == b) printf("Same value");
else printf("Not same value");
Is char a[64][] equivalent to char *a[64]?
If yes, what if I want to declare char a[][64] using a single pointer? How can I do it?
char a[][64]
is similar in pointer representation to
char (*a)[64]
For a 2D array, the second dimension size must be specified.
You're looking for a pointer to an array:
char (*a)[64];
Perhaps you are not sure about how the following two statements are different.
char* a[64];
char (*a)[64];
The first one defines a to be an array of 64 char* objects. Each of those pointers could point to any number of chars. a[0] could point to an array of 10 chars while a[1] could point to an array of 20 chars. You would do that with:
a[0] = malloc(10);
a[1] = malloc(20);
The second one defines a to be a pointer to an array of 64 chars. You can allocate memory for a single such array with:
a = malloc(64);
You can also allocate memory for a with:
a = malloc(64*10);
With malloc(64), you can only use a[0][0] ... a[0][63]. With malloc(64*10), you can use a[0][0] ... a[9][63].
char a[64][] is equivalent to char *a[64].
No, because char a[64][] is an error. It attempts to define a to be an array of 64 elements where each element is of type char[], an incomplete type. You cannot define an array whose elements have incomplete type; the element size must be known. The C99 standard §6.7.5.2 ¶1 says
The element type shall not be an incomplete or function type.
Now if you were to compare char a[][64] and char *a[64], then again they are different. That's because the array subscript operator has higher precedence than *.
// declares an array a whose element type is char[64],
// an array of 64 characters. The array a has incomplete type
// because its size is not specified. Also the array a must have
// external linkage.
extern char a[][64];
// inside a function you cannot define an array of incomplete
// type, therefore the following results in an error there.
char a[][64];
// however you can leave the array size blank if you
// initialize it with an array initializer list. The size
// of the array is inferred from the initializer list.
// size of the array is determined to be 3
char a[][2] = {{'a', 'b'}, {'c', 'd'}, {'x', 'y'}};
// defines an array of 64 elements where each element is
// of type char *, i.e., a pointer to a character
char *a[64];
If you want to declare a pointer to an array in a function parameter, then you can do the following -
void func(char a[][64], int len);
// equivalent to
void func(char (*a)[64], int len);
char (*a)[64] means a is a pointer to an object of type char[64], i.e., an array of 64 characters. When an array is passed to a function, it is implicitly converted to a pointer to its first element. Therefore, the corresponding function parameter must have the type - pointer to array's element type.
In the C program below, I don't understand why buf[0] = 'A' after I call foo. Isn't foo doing pass-by-value?
#include <stdio.h>
#include <stdlib.h>
void foo(char buf[])
{
    buf[0] = 'A';
}
int main(int argc, char *argv[])
{
    char buf[10];
    buf[0] = 'B';
    printf("before foo | buf[0] = %c\n", buf[0]);
    foo(buf);
    printf("after foo | buf[0] = %c\n", buf[0]);
    system("PAUSE");
    return 0;
}
output:
before foo | buf[0] = B
after foo | buf[0] = A
void foo(char buf[])
is the same as
void foo(char* buf)
When you call it, foo(buf), you pass a pointer by value, so a copy of the pointer is made.
The copy of the pointer points to the same object as the original pointer (or, in this case, to the initial element of the array).
C does not have pass by reference semantics in the sense that C++ has pass by reference semantics. Everything in C is passed by value. Pointers are used to get pass by reference semantics.
An array is not a pointer, but when you pass buf to the function it is converted to a pointer to its first element. That pointer is passed by value, and when you dereference it you are still referencing the string it points to.
Array as function parameter is equivalent to a pointer, so the declaration
void foo( char buf[] );
is the same as
void foo( char* buf );
The array argument is then decayed to the pointer to its first element.
Arrays are treated differently than other types; you cannot pass an array "by value" in C.
Online C99 standard (draft n1256), section 6.3.2.1, "Lvalues, arrays, and function designators", paragraph 3:
Except when it is the operand of the sizeof operator or the unary & operator, or is a
string literal used to initialize an array, an expression that has type ‘‘array of type’’ is
converted to an expression with type ‘‘pointer to type’’ that points to the initial element of
the array object and is not an lvalue. If the array object has register storage class, the
behavior is undefined.
In the call
foo(buf);
the array expression buf is not the operand of sizeof or &, nor is it a string literal being used to initialize an array, so it is implicitly converted ("decays") from type "10-element array of char" to "pointer to char", and the address of the first element is passed to foo. Therefore, anything you do to buf in foo() will be reflected in the buf array in main(). Because of how array subscripting is defined, you can use a subscript operator on a pointer type so it looks like you're working with an array type, but you're not.
In the context of a function parameter declaration, T a[] and T a[N] are synonymous with T *a, but this is the only case where that is true.
char buf[] as a parameter actually means char *, so you are passing a pointer.
That means buf is a pointer inside foo(), and it points at the first element of the array buf declared in main().
Because you are passing a pointer to buf (by value). So the content being pointed by buf is changed.
With pointers it's different; you are passing by value, but what you are passing is the value of the pointer, which is not the same as the value of the array.
So, the value of the pointer doesn't change, but you're modifying what it's pointing to.
Arrays and pointers are not the same thing, but they are indexed the same way.
int* foo = malloc(...)
foo[2] is the same as *(foo+2); the compiler scales the offset by sizeof(int) for you.
anecdote: you wrote
int main(int argc, char *argv[])
it is also legal (will compile and work the same) to write
int main(int argc, char **argv)
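but not
int main(int argc, char argv[][])
which does not compile, because the element type char[] is incomplete. The first two are effectively the same. It's slightly more complicated than that for ordinary variables, because sizeof applied to an array gives the size of the whole array, while sizeof applied to a pointer gives only the size of the pointer. But they are indexed the same way.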
in order to pass that by value, the function would need to know the size of the argument. In this case you are just passing a pointer.
You are passing a pointer here, which gives you reference-like behavior. In this example, you can solve the problem by passing a single char at the desired index of the array.
If you want to preserve the contents of the original array, you could copy the string to temporary storage in the function.
edit: What would happen if you wrapped your char array in a structure and passed the struct? I believe that might work too, although I don't know what kind of overhead that might create at the compiler level.
please note one thing,
declaration
void foo(char buf[])
only says that you will use [ ] notation on buf; it does not select an element of the array.
A declaration such as
void foo(char buf[X]); // where X is a constant
does not select element X either: the compiler ignores X and still treats buf as a pointer. If the function needs just one value, pass that value directly:
void foo(char value);
so...
void foo(char buf[])
is a declaration which says which notation you want to use ( [ ] - part ), and it also contains pointer to some data.
Moreover... what would you expect... you sent to function foo a name of array
foo(buf);
which is equivalent to &buf[0]. So... this is a pointer.
Arrays in C are not passed by value. They are not even legitimate function parameters. Instead, the compiler sees that you're trying to pass an array and demotes it to pointer. It does this silently because it's evil. It also likes to kick puppies.
Using arrays in function parameters is a nice way to signal to your API users that this thing should be a block of memory segmented into n-byte sized chunks, but don't expect compilers to care whether you spell it char *foo, char foo[], or char foo[12] in function parameters. They won't.