Why can a 2D character array be initialized as a pointer but not as a 2D integer array? Why does it give an error when I try to do so? Also, what does initializing an array as a pointer mean?
#include<stdio.h>
int main()
{
char* m[] = { "Excellent","Good", "bad" };
int* x[] = { {1,2,3},{4,5,6} };
return 0;
}
In the context of a declaration, { and } just mean “here is a group of things.” They do not represent an object or an address or an array. (Note: Within initializations, there are expressions, and those expressions can contain braces in certain contexts that do represent objects. But, in the code shown in the question, the braces just group things.)
In char* m[] = { "Excellent","Good", "bad" };, three items are listed to initialize m: "Excellent", "Good", and "bad". So each item initializes one element of m.
"Excellent" is a string literal. During compilation, it becomes an array of characters, terminated by a null character. In some situations, an array is kept as an array:
When it is used as the operand of sizeof.
When it is used as the operand of unary & (for taking an address).
When it is a string literal used to initialize an array.
None of these apply in this situation. "Excellent" is not the operand of sizeof, it is not the operand of &, and it is initializing just one element of m, not the entire array. So, the array is not kept as an array: By a rule in C, it is automatically converted to a pointer to its first element. Then this pointer initializes m[0]: m[0] is a pointer to the first element of "Excellent".
Similarly, m[1] is initialized to a pointer to the first element of "Good", and m[2] is initialized to a pointer to the first element of "bad".
In int* x[] = { {1,2,3},{4,5,6} };, two things are listed to initialize x. Each of these things is itself a group (of three things). However, x is an array of int *. Each member of x should be initialized with a pointer. But a group of three things, {1,2,3}, is not a pointer.
The C rules on interpreting groups of things when initializing arrays and structures are a bit complicated, because they are designed to provide some flexibility for omitting braces, so I have to study the standard a bit more to explain how they apply here. Suffice it to say that the compiler interprets the declaration as using 1 to initialize x[0]. Since 1 is an int and x[0] is an int *, the compiler complains that the types do not match.
Supplementary Notes
char *m[] does not declare a two-dimensional array. It is an array of pointers to char. Because of C’s rules, it can generally be used syntactically the same way as a two-dimensional array, so that m[i][j] picks out character j of string i. However, there is a difference between char *m[] and char a[3][4], for example:
In m[i][j], m[i] is a pointer. That pointer is loaded from memory and used as the base address for [j]. Then j is added to that address, and the character there is loaded from memory. There are two memory loads in this evaluation.
In a[i][j], a[i] is an array. The location of this array is calculated by arithmetic from the start of a. Then a[i][j] is a char, and its address is calculated by adding j, and the character there is loaded from memory. There is one memory load in this evaluation.
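The two layouts can be put side by side in a minimal sketch. The helper names and the row width of 10 are assumptions made for this illustration, not part of the question:

```c
#include <stddef.h>

/* Array of three pointers; each points at a separate string literal. */
static const char *const m[] = { "Excellent", "Good", "bad" };

/* True 2D array: three rows of 10 chars stored contiguously.
   10 is assumed because "Excellent" needs 9 chars plus the null. */
static const char a[3][10] = { "Excellent", "Good", "bad" };

/* Loads the pointer m[i], then loads the character at offset j from it. */
char pick_from_pointers(size_t i, size_t j) { return m[i][j]; }

/* Computes the address from the start of a, then loads the character. */
char pick_from_2d(size_t i, size_t j) { return a[i][j]; }

/* Size of one "row" in each representation. */
size_t row_size_pointers(void) { return sizeof m[0]; } /* size of a pointer */
size_t row_size_2d(void) { return sizeof a[0]; }       /* 10 */
```

Both pick functions use the same m[i][j] / a[i][j] syntax, but sizeof shows that one row is a pointer and the other is a 10-byte array.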
There is a syntax for initializing an array of int pointers so that each points to an array of int. It is called a compound literal. This is infrequently used:
int *x[] = { (int []) {1, 2, 3}, (int []) {4, 5, 6} };
A crucial difference between these string literals and compound literals is that string literals define objects which exist for the lifetime of program execution, whereas a compound literal used inside a function has automatic storage duration: it vanishes when the enclosing block ends, which is no later than when the function returns. Novice C programmers should avoid using compound literals until they understand the storage duration rules.
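As a rough sketch of the storage duration point, the following compares compound literals at file scope with the same initializer inside a function. The names x_static, sum_static and sum_automatic are made up for this example:

```c
#include <stddef.h>

/* At file scope, compound literals have static storage duration,
   so these pointers stay valid for the whole program. */
static int *x_static[] = { (int[]){1, 2, 3}, (int[]){4, 5, 6} };

int sum_static(void)
{
    int sum = 0;
    for (size_t i = 0; i < 2; i++)
        for (size_t j = 0; j < 3; j++)
            sum += x_static[i][j];
    return sum;
}

/* Inside a function, the compound literals live only until the
   enclosing block ends, so they are consumed here before returning. */
int sum_automatic(void)
{
    int *x[] = { (int[]){1, 2, 3}, (int[]){4, 5, 6} };
    int sum = 0;
    for (size_t i = 0; i < 2; i++)
        for (size_t j = 0; j < 3; j++)
            sum += x[i][j];
    return sum; /* returning x[0] instead would hand back a dangling pointer */
}
```

Returning the computed sum is safe in both cases; what must not escape sum_automatic is any pointer into its compound literals.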
Related
I was learning about pointers and strings.
I understood that,
Pointers and Arrays/Strings have similar behaviours.
array[] , *array , &array[0]. They all are one and the same.
Why does the three statements in this code work, and char * help one does not ?
#include <stdio.h>
void display(char*help){
for(int i=0; help[i]!='\0'; i++){
printf("%c", help[i]);
}
}
int main(){
// char help[] = "Help_Me"; //Works
// char help[] = {'H','e','l','p','_','M','e','\0'}; //Works
// char *help = "Help_Me"; //Works
char *help = {'H','e','l','p','_','M','e','\0'}; //Error
display(help);
}
Error Messages :
warning: initialization of 'char *' from 'int' makes pointer from integer without a cast
warning: excess elements in scalar initializer
Pointers and Arrays/Strings have similar behaviours.
Actually, no, I wouldn't agree with that. It is an oversimplification that hides important details. The true situation is that arrays have almost no behaviors of their own, but in most contexts, an lvalue designating an array is automatically converted to pointer to the first array element. The resulting pointer behaves like a pointer, of course, which is what may present the appearance that pointers and arrays have similar behaviors.
Additionally, arrays are objects, whereas strings are certain configurations of data that char arrays can contain. Although people sometimes conflate strings with the arrays containing them or with pointers to their first elements, that is not formally correct.
array[] , *array , &array[0]. They all are one and the same.
No, not at all, though the differences depend on the context in which those appear:
In a declaration of array (other than in a function prototype),
type array[] declares array as an array of type whose size will be determined from its initializer;
type *array declares array as a pointer to type; and
&array[0] is not part of any valid declaration of array.
In a function prototype,
type array[] is "adjusted" automatically as if it were type *array, and it therefore declares array as a pointer to type;
type *array declares array as a pointer to type; and
&array[0] is not part of any valid declaration of array.
In an expression,
array[] is invalid;
*array is equivalent to array[0], which designates the first element of array; and
&array[0] is a pointer to array[0].
Now, you ask,
Why does the three statements in this code work, and char * help one does not ?
"Help_Me" is a string literal. It designates a statically-allocated array just large enough to contain the specified characters plus a string terminator. As an array-valued expression, in most contexts it is converted to a pointer to its first element, and such a pointer is of the correct type for use in ...
// char *help = "Help_Me"; //Works
But the appearance of a string literal as the initializer of a char array ...
// char help[] = "Help_Me"; //Works
... is one of the few contexts where an array value is not automatically converted to a pointer. In that context, the elements of the array designated by the string literal are used to initialize the array being declared, very much like ...
// char help[] = {'H','e','l','p','_','M','e','\0'}; //Works
There, {'H','e','l','p','_','M','e','\0'} is an array initializer specifying values for 8 array elements. Note well that taken as a whole, it is not itself a value, just a syntactic container for eight values of type int (in C) or char (in C++).
And that's why this ...
char *help = {'H','e','l','p','_','M','e','\0'}; //Error
... does not make sense. There, help is a scalar object, not an array or a structure, so it takes only one value. And that value is of type char *. The warnings delivered by your compiler are telling you that eight values have been presented instead of one, and they have, or at least the one used for the initialization has, type int instead of type char *.
array[] , *array , &array[0]. They all are one and the same.
No. Presuming array names some array, array[] cannot be used in an expression (except where it might appear in some type description, such as a cast).
array by itself in an expression is automatically converted to a pointer to its first element except when it is the operand of sizeof or the operand of unary &. (Also, a string literal, such as "abc", denotes an array, and this array has another exception to when it is converted: When it is used to initialize an array.)
In *array, array will be automatically converted to a pointer, and then * refers to the element it points to. Thus *array refers to an element in an array; it is not a pointer to the array or its elements.
In &array[0], array[0] refers to the first element of the array, and then & takes its address, so &array[0] is a pointer to the first element of the array. This makes it equivalent to array in expressions, with the exceptions noted above. For example, void *p = array; and void *p = &array[0]; will initialize p to the same thing, a pointer to the first element of the array, because of the automatic conversion. However, size_t s = sizeof array; and size_t s = sizeof &array[0]; may initialize s to different values—the first to the size of the entire array and the second to the size of a pointer.
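A small sketch of that last comparison, assuming an 8-element int array (the size and the function names are chosen only for illustration):

```c
#include <stddef.h>

static int array[8];

/* Both expressions yield a pointer to the first element. */
int same_pointer(void)
{
    void *p = array;      /* array is converted to &array[0] */
    void *q = &array[0];
    return p == q;
}

/* sizeof is one of the exceptions: no conversion happens here. */
size_t size_of_array(void) { return sizeof array; }

/* &array[0] is already a pointer, so this is the size of an int *. */
size_t size_of_pointer(void) { return sizeof &array[0]; }
```

On a typical platform with 4-byte int and 8-byte pointers the two sizes would be 32 and 8, but the exact numbers depend on the implementation.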
// char help[] = "Help_Me"; //Works
help is an array of char, and character arrays can be initialized with a string literal. This is a special rule for initializations.
// char help[] = {'H','e','l','p','_','M','e','\0'}; //Works
help is an array, and the initializer is a list of values for the elements of the array.
// char *help = "Help_Me"; //Works
help is a pointer, and "Help_Me" is a string literal. Because it is not in one of the exceptions—operand of sizeof, operand of unary &, or used to initialize an array—it is automatically converted to a pointer to its first element. Then help is initialized with that pointer value.
char *help = {'H','e','l','p','_','M','e','\0'}; //Error
help is a pointer, but the initializer is a list of values. There is only one thing to be initialized, a pointer, but there are multiple values listed for it, so that is an error. Also, a pointer should be initialized with a pointer value (an address or a null pointer constant), but the items in that list are integers. (Character literals are integers; their values are the codes for the characters.)
{'H','e','l','p','_','M','e','\0'} is not a syntax that creates a string or an array. It is a syntax that can be used to provide a list of values when initializing an object. So the compiler does not recognize it as a string or array and does not use it to initialize the pointer help.
A pointer is not an array, and it can't be initialized like one. You need to create an object first; then you can assign its address to the pointer.
char *help = (char[]){'H','e','l','p','_','M','e','\0'};
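A possible way to check that the pointer really refers to a usable string is sketched below. The function name help_length is hypothetical, and the compound literal is only used while the function is still running:

```c
#include <string.h>

size_t help_length(void)
{
    /* The compound literal creates an unnamed char array, which is
       then converted to a pointer to its first element. */
    char *help = (char[]){'H', 'e', 'l', 'p', '_', 'M', 'e', '\0'};

    /* Unlike a string literal, this unnamed array may be modified. */
    help[0] = 'h';

    return strlen(help); /* counts the 7 characters before '\0' */
}
```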
Why does this work:
char *name = "steven";
but this doesn't:
char **names = {"steven", "randy", "ben"};
Or, why does this work:
char *names[] = {"steven", "randy", "ben"};
but, again, this doesn't:
char **names = {"steven", "randy", "ben"};
A char **p is not a 2D array, it is a pointer to a pointer to a character. However, you can have more pointers and more characters following, resembling a kind of model of a 2D structure of characters.
The C compiler interprets { "steven" } as a 1D array of characters, because the braces are optional (C standard, section 6.7.9, paragraph 14).
As you tried, you can declare an array of pointers to a character by char *p[].
But if you want to have that pointer (to pointers to characters), you need to tell your compiler. The address of an array can be assigned to the pointer.
char **p = (char *[]){ "steven", "randy", "ben", };
Additional note: Since string literals are immutable, you had better add a const for the characters. And since the addresses of these unnamed string literals are constant too, you can add another const.
const char * const *p = (const char * const []){ "steven", "randy", "ben", };
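For illustration, such a pointer can be passed around like any other pointer to pointer. The sketch below assumes two hypothetical helpers, total_length and demo_total, and passes the element count explicitly because the pointer itself does not carry it:

```c
#include <stddef.h>
#include <string.h>

/* Sums the lengths of count strings reached through a pointer to pointer. */
size_t total_length(const char *const *names, size_t count)
{
    size_t total = 0;
    for (size_t i = 0; i < count; i++)
        total += strlen(names[i]);
    return total;
}

size_t demo_total(void)
{
    /* The compound literal is an unnamed array of three pointers;
       it is converted to a pointer to its first element. */
    const char *const *p = (const char *const[]){ "steven", "randy", "ben" };
    return total_length(p, 3); /* 6 + 5 + 3 */
}
```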
I also wondered, what if I could answer you in the simplest way possible.
Why are you confused?
A pointer to int that has 8 cells allocated behind it can be indexed in the same way as an array with 8 cells.
One difference, which you can't see in the indexing syntax, is that the 8 cells reached through a pointer are typically allocated with malloc in the part of memory called the HEAP, while a local variable of type int tab[8] is typically allocated on the STACK.
Indeed, since the cells are linked in memory, it is easy to imagine that a pointer and an array whose first cell address is sent are the same thing.
Why it doesn't work in the other case
However, the analogy breaks down when you try to equate ** with [][].
Let's take the example of an int ** ;
int **tab;
tab = malloc(sizeof(int *) * 4);
//secure malloc do not forget
for (int i = 0; i < 4; i++)
{
tab[i] = malloc(sizeof(int) * 3);
//secure malloc do not forget
}
and an
int[4][3];
You have a problem.
A two-dimensional array is laid out contiguously in memory, because that is the very principle of arrays.
A double pointer, on the other hand, first has 4 cells of type int * allocated (which follow each other in memory), and then each of these 4 pointers points to its own memory area of 3 ints (which also follow each other). But the areas as a whole do not necessarily follow each other in memory!
A way that may interest you
One thing you can do instead is to create an int (*ptr)[3];
which can point to an array of 3 ints, for example the first row of an int [4][3] array.
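A minimal sketch of a pointer to an array of 3 ints, written as int (*ptr)[3], walking over a contiguous 4 by 3 array. The names grid, sum_rows and demo_sum are assumptions for this example:

```c
#include <stddef.h>

/* ptr points to rows of 3 ints, so ptr[i][j] indexes contiguous storage. */
int sum_rows(int (*ptr)[3], size_t rows)
{
    int sum = 0;
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < 3; j++)
            sum += ptr[i][j];
    return sum;
}

int demo_sum(void)
{
    int grid[4][3] = { {1, 2, 3}, {4, 5, 6}, {7, 8, 9}, {10, 11, 12} };
    int (*ptr)[3] = grid; /* grid is converted to a pointer to its first row */
    return sum_rows(ptr, 4);
}
```

Unlike the int ** version, no per-row allocation is needed, because the rows are part of one object.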
The initializer for a scalar object may not contain more than one item.
6.7.9 Initialization
...
Constraints
2 No initializer shall attempt to provide a value for an object not contained within the entity
being initialized.
...
11 The initializer for a scalar shall be a single expression, optionally enclosed in braces. The
initial value of the object is that of the expression (after conversion); the same type
constraints and conversions as for simple assignment apply, taking the type of the scalar
to be the unqualified version of its declared type
C 2011 Online Draft
char **names declares a single, scalar object, not an array, so any initializer for it must only contain a single item. That initializer may be a single string ("steven"), optionally enclosed in braces ({ "steven" }). However, it may not be a list of initializers.
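The rule can be seen in a short sketch that keeps the list initializer on the array and gives the pointer to pointer a single value. The function name is hypothetical:

```c
#include <string.h>

int braced_scalar_matches(void)
{
    /* A scalar takes one initializer, optionally enclosed in braces. */
    const char *name = { "steven" };

    /* An array of pointers takes a list, one item per element. */
    const char *names[] = { "steven", "randy", "ben" };

    /* A pointer to pointer is a scalar: give it one pointer value. */
    const char **p = names;

    return strcmp(name, p[0]) == 0 && strcmp(p[2], "ben") == 0;
}
```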
Why is it so that a struct can be assigned after defining it using a compound literal (case b) in sample code), while an array cannot (case c))?
I understand that case a) does not work as at that point compiler has no clue of the memory layout on the rhs of the assignment. It could be a cast from any type. But going with this line, in my mind case c) is a perfectly well-defined situation.
typedef struct MyStruct {
int a, b, c;
} MyStruct_t;
void function(void) {
MyStruct_t st;
int arr[3];
// a) Invalid
st = {.a=1, .b=2, .c=3};
// b) Valid since C99
st = (MyStruct_t){.a=1, .b=2, .c=3};
// c) Invalid
arr = (int[3]){[0]=1, [1]=2, [2]=3};
}
Edit:
I am aware that I cannot assign to an array - it's how C's been designed. I could use memcpy or just assign values individually.
After reading the comments and answers below, I guess now my question breaks down to the forever-debated conundrum of why you can't assign to arrays.
What's even more puzzling as suggested by this post and M.M's comment below is that the following assignments are perfectly valid (sure, it breaks strict aliasing rules). You can just wrap an array in a struct and do some nasty casting to mimic an assignable array.
typedef struct Arr3 {
int a[3];
} Arr3_t;
void function(void) {
Arr3_t a;
int arr[3];
a = (Arr3_t){{1, 2, 3}};
*(Arr3_t*)arr = a;
*(Arr3_t*)arr = (Arr3_t){{4, 5, 6}};
}
So then what's stopping developers to include a feature like this to, say C22(?)
C does not have assignment of arrays, at all. That is, where array has any array type, array = /* something here */ is invalid regardless of the contents of "something here". Whether it's a compound literal (which you seem to have confused with designated initializer, a completely different concept) is irrelevant. array1 = array2 would be just as invalid.
As to why it's invalid, at some level that's a question of the motivations/rationale of the C language and its design and unanswerable. However, mechanically, arrays in any context except the operand of sizeof or the operand of & "decay" to pointers to their first element. So in the case of:
arr = (int[3]){[0]=1, [1]=2, [2]=3};
you are attempting to assign a pointer to the first element of the compound literal array to a non-lvalue (the rvalue produced when arr decays). And of course that is nonsense.
A compound array literal can be used anywhere that an actual array variable can be used. Since you can't assign one array to another array, it's also not valid to assign a compound literal to an array.
Since you can copy arrays using memcpy(), you could write:
memcpy(arr, (int[3]){[0]=1, [1]=2, [2]=3}, sizeof(arr));
Just like the array variable, the array literal decays to a pointer to its first element.
Compound struct literals can also be used in place of an actual struct variable. But structs can be assigned to each other, so it's valid to assign a compound struct literal to a struct variable.
That's the difference between the two cases.
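Both cases can be put next to each other in a small sketch. It reuses the MyStruct_t type from the question; the function names assign_struct and fill_array are made up here:

```c
#include <string.h>

typedef struct MyStruct {
    int a, b, c;
} MyStruct_t;

MyStruct_t assign_struct(void)
{
    MyStruct_t st;
    /* Struct assignment is allowed, so a compound literal works as the source. */
    st = (MyStruct_t){ .a = 1, .b = 2, .c = 3 };
    return st;
}

void fill_array(int arr[3])
{
    /* Arrays cannot be assigned, so copy the bytes of the unnamed array. */
    memcpy(arr, (int[3]){ 1, 2, 3 }, sizeof(int[3]));
}
```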
Is my understanding of arrays in C correct?
Arrays are nothing more than a syntactic convenience such that, for instance, when you declare in your C code an array:
type my_array[x];
the compiler sees it as something equivalent to:
type *my_array = malloc(sizeof(*my_array) * x);
with a free system call that releases my_array once we leave the scope of my_array.
Once my_array is declared
my_array[y];
is nothing more but:
*(my_array + y)
Transposing this to character strings; I was also wondering what was happening behind the curtain with
char *my_string = "Hello"
and
my_string = "Hello"
No, an array object is an array object. C has some odd rules that make it appear that arrays and pointers are the same thing, or at least very similar, but they very definitely are not.
This declaration:
int my_array[100];
creates an array object; the object's size is 100 * sizeof (int). It does not create a pointer object.
There is no malloc(), even implicitly. Storage for my_array is allocated the same way as storage for any object declared in the same scope.
What may be confusing you is that, in most but not all contexts, an expression of array type is implicitly converted to a pointer to the array's first element. (This gives you a pointer value; there's still no pointer object.) This conversion doesn't happen if the array expression is the operand of a unary & or sizeof. &my_array gives you the address of the array, not of some nonexistent pointer object. sizeof my_array is the size of the entire array (100 * sizeof (int)), not the size of a pointer.
Also, if you define a function parameter with an array type:
void func(int param[]) { ... }
it's adjusted at compile time to a pointer:
void func(int *param) { ... }
This isn't a conversion; in that context (and only in that context), int param[] really means int *param.
Also, array indexing:
my_array[3] = 42;
is defined in terms of pointer arithmetic -- which means that the prefix my_array has to be converted to a pointer before you can index into it.
The most important thing to remember is this: Arrays are not pointers. Pointers are not arrays.
Section 6 of the comp.lang.c FAQ explains all this very well.
Once my_array is declared
my_array[y];
is nothing more but :
*(my_array + y)
Yes, because my_array is converted to a pointer, and the [] operator is defined so that x[y] means *(x+y).
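A quick sketch of that equivalence, using the my_array name from the question (the function name is an assumption):

```c
int indexing_forms_agree(void)
{
    int my_array[100] = {0};
    my_array[3] = 42;

    /* x[y] is defined as *(x + y); since addition commutes,
       3[my_array] names the same element as well. */
    return my_array[3] == *(my_array + 3) && 3[my_array] == 42;
}
```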
Transposing this to character strings; i was also wondering what was
happening behind the curtain with
char *my_string = "Hello"
and
my_string = "Hello"
"Hello" is a string literal. It's an expression of type char[6], referring to an anonymous statically allocated array object. If it appears on the RHS of an assignment or initializer, it's converted, like any array expression, to a pointer. The first line initializes my_string so it points to the first character of "Hello". The second is a pointer assignment that does the same thing.
So what about this?
char str[] = "Hello";
This is the third context in which array-to-pointer conversion doesn't happen. str takes its size from the size of the string literal, and the array is copied to str. It's the same as:
char str[] = { 'H', 'e', 'l', 'l', 'o', '\0' };
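As an illustration, the following sketch checks the size of such an array and shows that the copy can be modified. The function names are hypothetical:

```c
#include <stddef.h>

size_t copied_size(void)
{
    char str[] = "Hello"; /* array of 6 chars, initialized by copying */
    return sizeof str;    /* 5 letters plus the terminating '\0' */
}

char modify_copy(void)
{
    char str[] = "Hello";
    str[0] = 'J';         /* fine: str is the function's own array, not the literal */
    return str[0];
}
```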
No!
type array[n] is a variable stored on stack
type *array is a pointer variable stored on the stack too. But after array = malloc(sizeof(*array) * n); it'll point to some data on the heap
If it walks like a duck, swims like a duck and flies like a duck, then it is a duck.
So, let's see. Arrays and pointers have some common attributes as you correctly described, however, you can see there are some differences. Read more here.
1.char str[] = "hello"; //legal
2.char str1[];
str1 = "hello"; // illegal
I understand that "hello" returns the address of the string literal from the string literal pool which cannot be directly assigned to an array variable. And in the first case the characters from the "hello" literal are copied one by one into the array with a '\0' added at the end.
Is this because the assignment operator "=" is overloaded here to support this?
I would also like to know other interesting cases wherein initialization is different from assignment.
You cannot think of it as overloading (which doesn't exist in C anyway), because the initialization of char arrays with string literals is a special case. The type of a string literal is char[N] in C (const char[N] in C++), so if it were similar to overloading, you'd be able to initialize a char array with any expression of that array type. But you cannot!
const char arr[3];
const char arr1[] = arr; //compiler error. Cannot initialize array with another array.
The language standard simply says that character arrays can be initialized with string literals. Since they say nothing about assignment, the general rules apply, in particular, that an array cannot be assigned to.
As for other cases when initialization is different from assignment: in C++, where there are references and classes, there would be zillions of examples. In C, with no full-fledged classes or references, the only other thing I can think of off the top of my head is const variables:
const int a = 4; //OK;
const int b; //Legal in C (an error in C++), but b can never be given a value afterwards;
b = 4; //Error;
Another example: array initialization with braces
int a[3] = {1,2,3}; //OK
int b[3];
b = {1,2,3}; //error
Same with structs
If you want to think of it as the operator being overloaded (even though C doesn't use the term), you can of course do that.
Do you also consider this to be overloading:
unsigned char x;
double y;
x = 2;
y = 1.243;
Those are assigning totally different types of data, after all, but using the "same operator", right?
It's just different, to be initializing or to be assigning.
Another big difference is that you used to be able to initialize structures, but there was no corresponding "struct literal" syntax for later assignments. This is no longer true as of C99, where we now have compound literals.
char str[] = "hello";
This is array initialization, using syntactic sugar that C defines because string initialization is so common. The compiler allocates some fixed memory in your program and initializes it. The name of the array (str) evaluates to the address of this memory, and it cannot be changed because there is no variable which holds that address.
Grijesh Chauhan explains more details of this.
Other cases depend on what you mean. Extending the current case, you can easily see that other initialized arrays have the same properties, for example
int a[] = { 1, 2, 3, 4 };
An array has a non-modifiable address. You need a pointer if you want a modifiable lvalue.
By trying to assign a constant string literal to the array, you are really using the literal's address. The array's own address cannot be changed to that different address, which is why the assignment is illegal.
"hello" occupies some space in memory and yields an address. In an initialization, the characters stored there are copied into the array.