Just learning C, so bear with me. I understand that char *argv[] is an array whose elements are pointers to strings. So, for example:
char *fruits[] = {"Apple", "Pear"};
represents a "ragged array" of char arrays (i.e. a two-dimensional array whose rows have different lengths). So far, so good.
But when I try to abstract this to int types, it does not seem to work.
int *numbers[] = { {1,2,3}, {4,5,6} };
I get the following warning from the GCC compiler:
warning: braces around scalar initializer.
Can someone help me wrap my brain around this?
int *numbers[] = { {1,2,3}, {4,5,6} } can't work: you are attempting to initialize the elements of an array of pointers with lists of integers.
To initialize an array of pointers, you must give each element the address of the int you want it to point to. Here that is the initial element of an array of int, which gives you access to the beginning of the array and thus, via indexing, to the rest of it:
//have 2 flat arrays of int
int a[] = {1, 2, 3};
int b[] = {4, 5, 6};
// make the elements of the pointer array point to the arrays' initial elements
int *numbers[] = { &a[0], &b[0] };
// access
printf("%d", numbers[1][1]); // 5
You could also use:
int *numbers[] = { a, b };
Why? Because when you use an array name in an expression, for example you pass it as argument of a function or an initializer list like the above one, it decays to a pointer to its first element.
char *fruits[] = {"Apple", "Pear"}; works fine because string literals have type char[] generally, e.g. "Apple" has type char[6], so when you use them in the initializer list expression the same decay process occurs, and you'll end up with a pointer to the first element of the nul terminated array of chars.
Note that, unlike the string literals above (which all end with a nul byte \0), the int arrays have no sentinel value unless you establish one. Without a sentinel, you must keep track of each array's size to stay safely within its bounds.
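One way to keep track of the sizes is to store them next to the pointers. This is only a minimal sketch, assuming a parallel array of lengths; row_sum, ragged_sum and the lens parameter are illustrative names that do not appear in the question.

```c
#include <stddef.h>

/* Hypothetical helper: sums one row, given its explicit length. */
static int row_sum(const int *row, size_t len)
{
    int total = 0;
    for (size_t i = 0; i < len; i++)
        total += row[i];
    return total;
}

/* Sums every row of a ragged array: rows[r] has lens[r] elements. */
static int ragged_sum(int *const rows[], const size_t lens[], size_t nrows)
{
    int total = 0;
    for (size_t r = 0; r < nrows; r++)
        total += row_sum(rows[r], lens[r]);
    return total;
}
```

A caller would build lens with sizeof a / sizeof a[0] for each named array while the array type is still visible, then pass numbers and lens together.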
int *numbers[] is not a "ragged array", only an array of pointers.
int *numbers[] = { (int[]){1,2,3}, (int[]){4,5,6,7,8} };
In this example it has two elements of type pointer to int. Each pointer holds the address of the first element of the array used to initialize it.
Why is it so that a struct can be assigned after defining it using a compound literal (case b) in sample code), while an array cannot (case c))?
I understand that case a) does not work as at that point compiler has no clue of the memory layout on the rhs of the assignment. It could be a cast from any type. But going with this line, in my mind case c) is a perfectly well-defined situation.
typedef struct MyStruct {
int a, b, c;
} MyStruct_t;
void function(void) {
MyStruct_t st;
int arr[3];
// a) Invalid
st = {.a=1, .b=2, .c=3};
// b) Valid since C99
st = (MyStruct_t){.a=1, .b=2, .c=3};
// c) Invalid
arr = (int[3]){[0]=1, [1]=2, [2]=3};
}
Edit:
I am aware that I cannot assign to an array - it's how C's been designed. I could use memcpy or just assign values individually.
After reading the comments and answers below, I guess now my question breaks down to the forever-debated conundrum of why you can't assign to arrays.
What's even more puzzling as suggested by this post and M.M's comment below is that the following assignments are perfectly valid (sure, it breaks strict aliasing rules). You can just wrap an array in a struct and do some nasty casting to mimic an assignable array.
typedef struct Arr3 {
int a[3];
} Arr3_t;
void function(void) {
Arr3_t a;
int arr[3];
a = (Arr3_t){{1, 2, 3}};
*(Arr3_t*)arr = a;
*(Arr3_t*)arr = (Arr3_t){{4, 5, 6}};
}
So then what's stopping developers from adding a feature like this to, say, C22(?)
C does not have assignment of arrays, at all. That is, where array has any array type, array = /* something here */ is invalid regardless of the contents of "something here". Whether it's a compound literal (which you seem to have confused with designated initializer, a completely different concept) is irrelevant. array1 = array2 would be just as invalid.
As to why it's invalid, at some level that's a question of the motivations/rationale of the C language and its design and unanswerable. However, mechanically, arrays in any context except the operand of sizeof or the operand of & "decay" to pointers to their first element. So in the case of:
arr = (int[3]){[0]=1, [1]=2, [2]=3};
you are attempting to assign a pointer to the first element of the compound literal array to a non-lvalue (the rvalue produced when arr decays). And of course that is nonsense.
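The decay rule and its sizeof exception can be seen in a small sketch. The function names are invented for illustration only.

```c
#include <stddef.h>

/* sizeof sees the whole array, because its operand does not decay. */
static size_t whole_array_bytes(void)
{
    int arr[3];
    return sizeof arr;            /* 3 * sizeof(int) */
}

/* In an ordinary expression, arr becomes a pointer to arr[0]. */
static int first_via_decay(void)
{
    int arr[3] = {7, 8, 9};
    int *p = arr;                 /* same as &arr[0] */
    return *p;
}
```

Because arr on the left of = has already turned into a pointer value rather than an object, there is nothing for the assignment to store into.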
A compound array literal can be used anywhere that an actual array variable can be used. Since you can't assign one array to another array, it's also not valid to assign a compound literal to an array.
Since you can copy arrays using memcpy(), you could write:
memcpy(arr, (int[3]){[0]=1, [1]=2, [2]=3}, sizeof(arr));
Just like the array variable, the array literal decays to a pointer to its first element.
Compound struct literals can also be used in place of an actual struct variable. But structs can be assigned to each other, so it's valid to assign a compound struct literal to a struct variable.
That's the difference between the two cases.
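Both routes can be put side by side in a short sketch. Int3, fill_with_memcpy and make_int3 are hypothetical names introduced here, not something from the question.

```c
#include <string.h>

typedef struct { int v[3]; } Int3;   /* hypothetical wrapper type */

/* Copies a compound literal into an existing array with memcpy.
   The parameter is really a pointer, so the size is spelled out. */
static void fill_with_memcpy(int arr[3])
{
    memcpy(arr, (int[3]){1, 2, 3}, sizeof(int[3]));
}

/* Struct assignment copies the wrapped array as a whole. */
static Int3 make_int3(void)
{
    Int3 s;
    s = (Int3){{4, 5, 6}};
    return s;
}
```

Wrapping the array in a struct gives it value semantics without any pointer casts, at the cost of writing s.v[i] instead of s[i].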
Why can a 2D character array be initialized as a pointer but not as a 2D integer array? Why does it give an error when I try to do so? Also, what does initializing an array as a pointer mean?
#include<stdio.h>
int main()
{
char* m[] = { "Excellent","Good", "bad" };
int* x[] = { {1,2,3},{4,5,6} };
return 0;
}
In the context of a declaration, { and } just mean “here is a group of things.” They do not represent an object or an address or an array. (Note: Within initializations, there are expressions, and those expressions can contain braces in certain contexts that do represent objects. But, in the code shown in the question, the braces just group things.)
In char* m[] = { "Excellent","Good", "bad" };, three items are listed to initialize m: "Excellent", "Good", and "bad". So each item initializes one element of m.
"Excellent" is a string literal. During compilation, it becomes an array of characters, terminated by a null character. In some situations, an array is kept as an array:
When it is used as the operand of sizeof.
When it is used as the operand of unary & (for taking an address).
When it is a string literal used to initialize an array.
None of these apply in this situation. "Excellent" is not the operand of sizeof, it is not the operand of &, and it is initializing just one element of m, not the entire array. So, the array is not kept as an array: By a rule in C, it is automatically converted to a pointer to its first element. Then this pointer initializes m[0]: m[0] is a pointer to the first element of "Excellent".
Similarly, m[1] is initialized to a pointer to the first element of "Good", and m[2] is initialized to a pointer to the first element of "bad".
In int* x[] = { {1,2,3},{4,5,6} };, two things are listed to initialize x. Each of these things is itself a group (of three things). However, x is an array of int *. Each member of x should be initialized with a pointer. But a group of three things, {1,2,3}, is not a pointer.
The C rules on interpreting groups of things when initializing arrays and structures are a bit complicated, because they are designed to provide some flexibility for omitting braces, so I have to study the standard a bit more to explain how they apply here. Suffice it to say that the compiler interprets the declaration as using 1 to initialize x[0]. Since 1 is an int and x[0] is an int *, the compiler complains that the types do not match.
Supplementary Notes
char *m[] does not declare a two-dimensional array. It is an array of pointers to char. Because of C’s rules, it can generally be used syntactically the same way as a two-dimensional array, so that m[i][j] picks out character j of string i. However, there is a difference between char *m[] and char a[3][4], for example:
In m[i][j], m[i] is a pointer. That pointer is loaded from memory and used as the base address for [j]. Then j is added to that address, and the character there is loaded from memory. There are two memory loads in this evaluation.
In a[i][j], a[i] is an array. The location of this array is calculated by arithmetic from the start of a. Then a[i][j] is a char, and its address is calculated by adding j, and the character there is loaded from memory. There is one memory load in this evaluation.
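The two layouts can be compared in a small sketch. The contents of a and all function names are made up for illustration.

```c
#include <stddef.h>

static char *m[] = { "Excellent", "Good", "bad" };   /* array of 3 pointers */
static char a[3][4] = { "abc", "def", "ghi" };       /* 3 rows of 4 chars   */

/* m[i] is a pointer object read from memory, then indexed with j. */
static char from_pointer_array(size_t i, size_t j) { return m[i][j]; }

/* a[i] is a 4-char sub-array located by arithmetic inside a. */
static char from_2d_array(size_t i, size_t j) { return a[i][j]; }

/* m stores only pointers; a stores the characters themselves. */
static size_t pointer_array_bytes(void) { return sizeof m; }
static size_t grid_bytes(void) { return sizeof a; }
```

The indexing syntax is identical in both accessors, but sizeof m counts three pointers while sizeof a counts the twelve characters.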
There is a syntax for initializing an array of int pointers to point to arrays of int. It is called a compound literal. This is infrequently used:
int *x[] = { (int []) {1, 2, 3}, (int []) {4, 5, 6} };
A crucial difference between these string literals and compound literals is that string literals define objects which exist for the lifetime of program execution, but compound literals used inside functions have automatic storage duration: they vanish when the function returns, and possibly earlier, depending on where they are used. Novice C programmers should avoid using compound literals until they understand the storage duration rules.
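As a small illustration of the storage duration rules: a compound literal written outside any function has static storage duration, so pointers to it stay valid for the whole program. The sketch below assumes file scope; table and table_at are invented names.

```c
/* At file scope these compound literals have static storage duration,
   so the pointers stored in table never dangle. */
static int *table[] = { (int[]){1, 2, 3}, (int[]){4, 5, 6} };

/* Reads one element through the pointer array. */
static int table_at(int row, int col)
{
    return table[row][col];
}
```

The same initializer inside a function body would create objects that end with the enclosing block, so a pointer to them must not be returned or stored beyond that block.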
I'm trying to assign a compound literal to a variable, but it seems not to work, see:
int *p[] = (int *[]) {{1,2,3},{4,5,6}};
I got an error in gcc.
But if I write only this:
int p[] = (int []) {1,2,3,4,5,6};
Then it's okay.
But that is not what I want.
I don't understand why the error occurs, because if I initialize it like an array, or use it with an array of pointers to char, it's okay, see:
int *p[] = (int *[]) {{1,2,3},{4,5,6}}; //I got a error
int p[][3] = {{1,2,3},{4,5,6}}; //it's okay
char *p[] = (char *[]) {"one", "two"...}; // it's okay!
Note that I don't understand why I got an error in the first one. I can't, or don't want to, write it like the second form, because it needs to be a compound literal and I don't want to tell the compiler how big the array is. I want something like the second one, but for int values.
Thanks in advance.
First, the casts are redundant in all of your examples and can be removed. Secondly, you are using the syntax for initializing a multidimensional array, and that requires the second dimension to be defined in order to allocate a sequential block of memory. Instead, try one of the two approaches below:
Multidimensional array:
int p[][3] = {{1,2,3},{4,5,6}};
Array of pointers to one dimensional arrays:
int p1[] = {1,2,3};
int p2[] = {4,5,6};
int *p[] = {p1,p2};
The latter method has the advantage of allowing for sub-arrays of varying length, whereas the former method ensures that the memory is laid out contiguously.
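The contiguity of the multidimensional form can be checked with a short sketch. sum_2d and rows_are_adjacent are hypothetical helpers written for this illustration.

```c
#include <stddef.h>

/* Sums a true 2D array: rows points at contiguous rows of exactly 3 ints. */
static int sum_2d(int (*rows)[3], size_t nrows)
{
    int total = 0;
    for (size_t r = 0; r < nrows; r++)
        for (size_t c = 0; c < 3; c++)
            total += rows[r][c];
    return total;
}

/* True when row 1 starts exactly where row 0 ends (needs at least 2 rows). */
static int rows_are_adjacent(int (*rows)[3])
{
    return &rows[0][0] + 3 == &rows[1][0];
}
```

With int *p[] = {p1, p2}; no such adjacency is guaranteed, because p1 and p2 are separate objects that the compiler may place anywhere.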
Another approach that I highly recommend that you do NOT use is to encode the integers in string literals. This is a non-portable hack. Also, the data in string literals is supposed to be constant. Do your arrays need to be mutable?
int *p[] = (int *[]) {
"\x01\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00",
"\x04\x00\x00\x00\x05\x00\x00\x00\x06\x00\x00\x00"
};
That example might work on a 32-bit little-endian machine, but I'm typing this from an iPad and cannot verify it at the moment. Again, please don't use that; I feel dirty for even bringing it up.
The casting method you discovered also appears to work with a pointer to a pointer. That can be indexed like a multidimensional array as well.
int **p = (int *[]) { (int[]) {1,2,3}, (int[]) {4,5,6} };
First understand that "Arrays are not pointers".
int p[] = (int []) {1,2,3,4,5,6};
In the above case p is an array of integers, and the elements {1,2,3,4,5,6} are copied to p. Typecasting is not necessary here; both the rvalue and lvalue types match (an integer array), so there is no error.
int *p[] = (int *[]) {{1,2,3},{4,5,6}};
"Note I don't understand why I got a error in the first one,.."
In the above case, p is an array of integer pointers. But {{1,2,3},{4,5,6}} is a two dimensional array ( i.e., [][] ) and cannot be converted to an array of pointers. You need to initialize it as:
int p[][3] = { {1,2,3},{4,5,6} };
// ^^ First index of array is optional because with each column having 3 elements
// it is obvious that array has two rows which compiler can figure out.
But why did this statement compile ?
char *p[] = {"one", "two"...};
String literals are different from integer literals. In this case also, p is an array of character pointers. A string literal such as "one" can either be copied into an array or be pointed to at its own location, which must be treated as read-only.
char cpy[] = "one" ;
cpy[0] = 't' ; // Not a problem
char *readOnly = "one" ;
readOnly[0] = 't' ; // Undefined behavior: no copy is made, so this writes
// to the literal itself, which may be in a read only location.
With string literals, either of the above cases is possible. So, that is the reason the statement compiled. But -
char *p[] = {"one", "two"...}; // All the string literals are stored in
// read only locations and each array element
// stores the starting address of one string literal.
I don't want to say how big is the array to the compiler.
Dynamically allocating the memory using malloc is the solution.
Hope it helps !
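A minimal sketch of the malloc approach, assuming each row is copied from existing data; dup_ints is a hypothetical helper, and the caller is responsible for calling free on every row.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of src (len ints), or NULL
   if the allocation fails. The caller must free() the result. */
static int *dup_ints(const int *src, size_t len)
{
    int *copy = malloc(len * sizeof *copy);
    if (copy != NULL)
        memcpy(copy, src, len * sizeof *copy);
    return copy;
}
```

Each element of int *p[2] can then be set with a call such as p[0] = dup_ints((int[]){1, 2, 3}, 3);, and the rows may have different lengths as long as those lengths are recorded somewhere.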
Since nobody's said it: If you want to have a pointer-to-2D-array, you can (probably) do something like
int (*p)[][3] = &(int[][3]) {{1,2,3},{4,5,6}};
EDIT: Or you can have a pointer to its first element via
int (*p)[3] = (int[][3]) {{1,2,3},{4,5,6}};
The reason why your example doesn't work is because {{1,2,3},{4,5,6}} is not a valid initializer for type int*[] (because {1,2,3} is not a valid initializer for int*). Note that it is not an int[2][3] — it's simply an invalid expression.
The reason why it works for strings is because "one" is a valid initializer for char[] and char[N] (for some N>3). As an expression, it's approximately equivalent to (const char[]){'o','n','e','\0'} except the compiler doesn't complain too much when it loses constness.
And yes, there's a big difference between an initializer and an expression. I'm pretty sure char s[] = (char[]){3,2,1,0}; is a compile error in C99 (and possibly C++ pre-0x). There are loads of other things too, but T foo = ...; is variable initialization, not assignment, even though they look similar. (They are especially different in C++, since the assignment operator is not called.)
And the reason for the confusion with pointers:
Type T[] is implicitly converted to type T* (a pointer to its first element) when necessary.
T arg1[] in a function argument list actually means T * arg1. You cannot pass an array to a function for Various Reasons. It is not possible. If you try, you are actually passing a pointer to its first element. (You can, however, pass a struct containing a fixed-size array to a function.)
They both can be dereferenced and subscripted with identical (I think) semantics.
EDIT: The observant might notice that my first example is roughly syntactically equivalent to int * p = &1;, which is invalid. This works in C99 because a compound literal inside a function "has automatic storage duration associated with the enclosing block" (ISO/IEC 9899:TC3).
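Both pointer forms can be exercised in a short sketch. The function names are illustrative, and each compound literal is used only inside the block that creates it.

```c
/* Pointer to the first row of a 2x3 compound literal. */
static int element_via_row_pointer(int row, int col)
{
    int (*p)[3] = (int[][3]){{1, 2, 3}, {4, 5, 6}};
    return p[row][col];
}

/* Pointer to the whole 2x3 array: dereference once before indexing. */
static int element_via_array_pointer(int row, int col)
{
    int (*q)[2][3] = &(int[][3]){{1, 2, 3}, {4, 5, 6}};
    return (*q)[row][col];
}
```

The first form indexes like an ordinary two-dimensional array, while the second keeps the full array type at the price of the extra (*q).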
The one that you are using is array of int pointers. You should use pointer to array :
int (*p)[3] = (int [][3]) {{1,2,3}, {4,5,6}};
Look at this answer for more details.
It seems you are confusing pointers and arrays. They're not the same thing! An array is the list itself, while a pointer is just an address. Then, with pointer arithmetic you can pretend pointers are arrays, and with the fact that the name of an array usually converts to a pointer to its first element, everything sums up in a mess. ;)
int *p[] = (int *[]) {{1,2,3},{4,5,6}}; //I got a error
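Here, p is an array of pointers, so the compiler expects each initializer to be an address. The braces {1,2,3} and {4,5,6} are not arrays; the integers 1 and 4 end up being treated as the pointer values themselves, which is a type mismatch that the compiler reports. Even if the code were forced to compile, dereferencing such addresses would most likely crash, because you can't access those memory locations.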
int p[][3] = {{1,2,3},{4,5,6}}; //it's okay
This is ok because this is an array of arrays, so this time 1, 2, 3, 4, 5 and 6 aren't addresses but the elements themselves.
char *p[] = (char *[]) {"one", "two"...}; // it's okay!
This is ok because each string literal ("one", "two", ...) is an array of char that is converted to a pointer to its first character, so you're assigning to p[0] the address of the string literal "one".
BTW, compare this with char abc[4]; abc = "abc";. That won't compile, because you can't assign to an array, while char *def; def = "def"; works.
when we write something like this
int arr[5] = 0; or int arr[5] = {0};
there is no problem
but when we do something like this
int arr[5];
arr[5] = {0};
an error occurs. Any explanation for this ?
It is simply a part of the language definition that arrays can be initialised, but not directly assigned. You can do what you want in C99 using memcpy() with a compound literal:
int arr[5];
/* ... */
memcpy(&arr, &(int [5]){ 0 }, sizeof arr);
With GCC's typeof extension, you can add a little more safety:
memcpy(&arr, &(typeof(arr)){ 0 }, sizeof arr);
In C89 you must give the source array a name:
{ static const int zero[5] = { 0 }; memcpy(&arr, &zero, sizeof arr); }
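Wrapped in a function, the C99 variant might look like the sketch below. reset5 is a made-up name, and taking a pointer to the whole array is just one way to keep the size in the type.

```c
#include <string.h>

/* Sets all five elements to zero after the array already exists.
   arr points to the whole array, so sizeof *arr is 5 * sizeof(int). */
static void reset5(int (*arr)[5])
{
    memcpy(arr, &(int[5]){0}, sizeof *arr);
}
```

It would be called as reset5(&arr); for an int arr[5]; passing an array of any other length is a constraint violation that the compiler diagnoses.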
During definition, you can do this initialization.
However, in a later statement arr[5] refers to the element at index 5, which expects a single integer value, not values inside curly braces that indicate array initialization.
int arr[5] means an array of integers with room for 5 values at indexes 0, 1, 2, 3, 4, so arr[5] does not refer to any element of this array.
You could use
int arr[] = {0,0,0,0,0};
at the declaration, knowing that there are 5 elements in your array.
Or maybe memset() could help you.
The syntax of C is such that arr alone decays to &arr[0], the address of the first element, in almost all contexts. So the syntax that would be natural for assigning to arrays, arr = { ... };, can't work. Assignment to an array as a whole is not possible.
The syntax int a[5] = { 0 }; is initialization, and it initializes all elements with 0. The best you can do is to always initialize arrays with this "catch all" initializer to have all elements in a known state. Then, if later in your program you decide that you want different values, assign them directly.
First, please note the linguistic difference between initialization and assignment. The former occurs when a variable is given a value at the same time as it is declared, while the latter covers every other way of giving the variable a value at run time. If you set all elements of an array in a loop, you are doing so at run time, and are therefore assigning values to the array.
The {} is called an initializer list. Initializer lists can only occur as part of the variable declaration, and all items inside the list must be constants (assuming the C90 standard). They can be used to initialize both arrays and structs (a.k.a. "aggregates"); the same rules apply to both.
int arr[5] = {0};
What this does, in detail, is set the first element at index [0] to zero. The remaining elements are initialized according to a rule which states that if only a few elements are given in an initializer list, the elements that weren't explicitly initialized by the programmer shall be set to zero by the compiler (ISO 9899:1999 6.7.8 §19).
So the line above sets arr[0] to zero, and then the compiler sets the remaining elements, arr[1] to arr[4], to zero as well. Had you written
int arr[5] = {1};
then the elements of the array would have been initialized to {1, 0, 0, 0, 0}.
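The zero-fill rule can be checked with a small sketch. sum_partial and last_of_partial are names invented for this example.

```c
/* Only arr[0] is given explicitly; arr[1]..arr[4] are set to zero. */
static int sum_partial(void)
{
    int arr[5] = {1};
    int total = 0;
    for (int i = 0; i < 5; i++)
        total += arr[i];
    return total;
}

/* The last element was never mentioned in the initializer list. */
static int last_of_partial(void)
{
    int arr[5] = {1};
    return arr[4];
}
```

The rule applies only when an initializer is present; a plain int arr[5]; inside a function leaves the elements indeterminate.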