Case 1: When I write
char*str={"what","is","this"};
then str[i]="newstring"; is valid whereas str[i][j]='j'; is invalid.
Case 2: When I write
char str[][5]={"what","is","this"};
then str[i]="newstring"; is not valid whereas str[i][j]='J'; is valid.
Why is it so? I am a beginner who already gets very confused after reading the other answers.
First of all, a suggestion: please read up on why arrays are not pointers, and vice versa!
That said, to enlighten this particular scenario,
In the first case,
char*str={"what","is","this"};
does not do what you think it does. It is a constraint violation, requiring a diagnostic from any conforming C implementation, as per chapter§6.7.9/P2:
No initializer shall attempt to provide a value for an object not contained within the entity
being initialized.
If you enable warnings, you'd (at least) see
warning: excess elements in scalar initializer
char*str={"what","is","this"};
However, any compiler with strict conformance turned on should refuse to compile the code. If the compiler chooses to compile it and produce a binary anyway, the behavior is outside what the C language defines; it is up to the compiler implementation (and thus can vary widely).
In this case, the compiler decided to treat the statement as functionally the same as char*str= "what";
So, here str is a pointer to a char, which points to a string literal.
You can re-assign to the pointer,
str="newstring"; //this is valid
but, a statement like
str[i]="newstring";
would be invalid, as here a pointer type is converted and stored into a char type, and the types are not compatible. The compiler should issue a warning about the invalid conversion in this case.
Thereafter, a statement like
str[i][j]='J'; // compiler error
is invalid (a constraint violation), as you're applying the array subscripting operator [] to something which is not a "pointer to complete object type", like
str[i][j] = ...
^^^------------------- cannot use this
^^^^^^ --------------------- str[i] is of type 'char',
not a pointer to be used as the operand for [] operator.
On the other hand, in second case,
str is an array of arrays. You can change individual array elements,
str[i][j]='J'; // change individual element, good to go.
but you cannot assign to an array.
str[i]="newstring"; // nope, an array is not a modifiable lvalue!!
Finally, considering you meant to write (as seen in comments)
char* str[ ] ={"what","is","this"};
in your first case, the same logic for arrays holds. This makes str an array of pointers. The array members are assignable, so
str[i]="newstring"; // just overwrites the previous pointer
is perfectly OK. However, the pointers stored as array members point to string literals, so you invoke undefined behavior when you try to modify the memory belonging to a string literal:
str[i][j]='j'; // still invalid: modifying a string literal is undefined behavior
The memory layout is different:
char* str[] = {"what", "is", "this"};
str
+--------+ +-----+
| pointer| ---> |what0|
+--------+ +-----+ +---+
| pointer| -------------> |is0|
+--------+ +---+ +-----+
| pointer| ----------------------> |this0|
+--------+ +-----+
In this memory layout, str is an array of pointers to the individual strings. Usually, these individual strings will reside in static storage, and it is an error to try to modify them. In the graphic, I used 0 to denote the terminating null bytes.
char str[][5] = {"what", "is", "this"};
str
+-----+
|what0|
+-----+
|is000|
+-----+
|this0|
+-----+
In this case, str is a contiguous 2D array of characters (located on the stack if it is declared as a local variable). The strings are copied into this memory area when the array is initialized, and the individual strings are padded with zero bytes to give the array a regular shape.
These two memory layouts are fundamentally incompatible with each other. You cannot pass either to a function that expects a pointer to the other. However, access to the individual strings is compatible. When you write str[1], you get a char* to the first character of a memory region containing the bytes is0, i.e. a C string.
In the first case, it is clear that this pointer is simply loaded from memory. In the second case, the pointer is created via array-pointer-decay: str[1] actually denotes an array of exactly five bytes (is000), which immediately decays into a pointer to its first element in almost all contexts. However, I believe that a full explanation of the array-pointer-decay is beyond the scope of this answer. Google array-pointer-decay if you are curious.
With the first you define a variable that is a pointer to a char, which is usually used as just a single string. It initializes the pointer to point to the string literal "what". The compiler should also complain that you have too many initializers in the list.
The second definition makes str an array of three arrays of five char. That is, it's an array of three five-character strings.
A little differently it can be seen something like this:
For the first case:
+-----+ +--------+
| str | --> | "what" |
+-----+ +--------+
And for the second you have
+--------+--------+--------+
| "what" | "is" | "this" |
+--------+--------+--------+
Also note that for the first version, with the pointer to a single string, the expression str[i] = "newstring" should also lead to warnings, as you try to assign a pointer to the single char element str[i].
That assignment is invalid in the second version as well, but for another reason: str[i] is an array (of five char elements) and you can't assign to an array, only copy to it. So you could try doing strcpy(str[i], "newstring") and the compiler will not complain. It's wrong though, because you try to copy 10 characters (remember the terminator) into an array of 5 characters, and that will write out of bounds leading to undefined behavior.
The first declaration
char *str={"what","is","this"};
declares str as a pointer to char, which is a scalar. The standard says that
6.7.9 Initialization (p11):
The initializer for a scalar shall be a single expression, optionally enclosed in braces. [...]
That is, a scalar can have a brace-enclosed initializer, but only with a single expression. In the case of
char *str = {"what","is","this"}; // three expressions in brace enclosed initializer
it is up to the compiler how to handle this. Note that the extra initializers are a bug in the program. A conforming compiler should give a diagnostic message.
[Warning] excess elements in scalar initializer
5.1.1.3 Diagnostics (P1):
A conforming implementation shall produce at least one diagnostic message (identified in an implementation-defined manner) if a preprocessing translation unit or translation unit contains a violation of any syntax rule or constraint, even if the behavior is also explicitly specified as undefined or implementation-defined
You claim "str[i]="newstring"; is valid whereas str[i][j]='j'; is invalid."
str[i] is of char type and can hold only a char value. Assigning "newstring" (which decays to char *) is invalid. The statement str[i][j]='j'; is invalid because the subscript operator can only be applied to an array or pointer type.
You can make str[i]="newstring"; work by declaring str as an array of char *
char *str[] = {"what","is","this"};
In this case str[i] is of char * type and a string literal can be assigned to it, but modifying the string literal str[i] points to will invoke undefined behavior. That is, you can't do str[0][0] = 'W'.
The snippet
char str[][5]={"what","is","this"};
declares str as an array of arrays of chars. str[i] is actually an array, and since arrays are non-modifiable lvalues, you can't use them as the left operand of the assignment operator. This makes str[i]="newstring"; invalid, while str[i][j]='J'; works because elements of an array can be modified.
Since you said the other answers are confusing, let's see what is happening with a simpler example first
char *ptr = "somestring";
Here "somestring" is a string literal, which is typically stored in a read-only data section of memory. ptr is a pointer (allocated just like other variables in the same section of code) which points to the first byte of that allocated memory.
Hence consider these two statements
char *ptr2 = ptr; //statement 1 OK
ptr[1] = 'a'; //statement 2 error
Statement 1 is doing a perfectly valid operation (assigning 1 pointer to another), but statement 2 is not a valid operation (trying to write into a read only location).
On the other hand if we write:
char ptr[] = "somestring";
Here ptr is not actually a pointer, but the name of an array(unlike the pointer it doesn't take extra space in the memory). It allocates the same number of bytes as required by "somestring" (not read only) and that's it.
Hence consider the same two statements and one extra statement
char *ptr2 = ptr; //statement 1 OK
ptr[1] = 'a'; //statement 2 OK
ptr = "someotherstring"; //statement 3 error
Statement 1 is doing a perfectly valid operation (assigning array name to a pointer, array name returns the address of the 1st byte), statement 2 is also valid because the memory is not readonly.
Statement 3 is not a valid operation because here ptr is not a pointer, It can not point to some other memory location.
Now in this code,
char *str[]={"what","is","this"};
str[i] is a pointer (str[i] is the same as *(str+i))
but in this code
char str[][5] = {"what", "is", "this"};
str[i] is not a pointer. It is the name of an array.
The same thing as above follows.
To begin with
char*str={"what","is","this"};
is not even valid C code 1), so discussing it isn't very meaningful. For some reason, the gcc compiler lets this code through with only a warning. Do not ignore compiler warnings. When using gcc, make sure to always compile using -std=c11 -pedantic-errors -Wall -Wextra.
What gcc seems to do when encountering this non-standard code, is to treat it as if you had written char*str={"what"};. Which in turn is the same thing as char*str="what";. This is by no means guaranteed by the C language.
str[i][j] tries to indirect a pointer twice, even though it only has one level of indirection, and therefore you get a compiler error. It makes as little sense as typing
int array [3] = {1,2,3}; int x = array[0][0];.
As for the difference between char* str = ... and char str[] = ..., see FAQ: What is the difference between char s[] and char *s?.
Regarding the char str[][5]={"what","is","this"}; case, it creates an array of arrays (2D array). The inner-most dimension is set to 5 and the outer-most dimension is set automatically by the compiler depending on how many initializers the programmer provided. In this case 3, so the declaration is equivalent to char str[3][5].
str[i] gives you array number i in the array of arrays. You cannot assign to arrays in C, because that's how the language is designed. Furthermore, it would be incorrect to do so for a string anyway, FAQ: How to correctly assign a new string value?
1) This is a constraint violation of C11 6.7.9/2. Also see 6.7.9/11.
To do away with the confusion, you must have proper understanding of pointers, arrays and initializers.
A common misconception amongst C programming beginners is that an array is equivalent to a pointer.
An array is a collection of items of the same type. Consider the following declaration:
char arr[10];
This array contains 10 elements, each of type char.
An initializer list may be used to initialize an array in a convenient manner. The following initializes the array elements with the corresponding values of the initializer list:
char array[10] = {'a','b','c','d','e','f','g','h','i','\0'};
Arrays are not assignable, thus the use of initializer list is valid upon array declaration only.
char array[10];
array = {'a','b','c','d','e','f','g','h','i','\0'}; // Invalid...
char array1[10];
char array2[10] = {'a','b','c','d','e','f','g','h','i','\0'};
array1 = array2; // Invalid...; You cannot copy array2 to array1 in this manner.
After the declaration of an array, assignments to array members must be via the array indexing operator or its equivalent.
char array[10];
array[0] = 'a';
array[1] = 'b';
.
.
.
array[8] = 'i';
array[9] = '\0';
Loops are a common and convenient way of assigning values to array members:
char array[10];
int index = 0;
for(char val = 'a'; val <= 'i'; val++) {
array[index] = val;
index++;
}
array[index] = '\0';
char arrays may be initialized via string literals, which are null-terminated char arrays that must not be modified:
char array[10] = "abcdefghi";
However the following is not valid:
char array[10];
array = "abcdefghi"; // As mentioned before, arrays are not assignable
Now, let us get to pointers...
Pointers are variables that can store the address of another variable, usually of the same type.
Consider the following declaration:
char *ptr;
This declares a variable of type char *, a char pointer. That is, a pointer that may point to a char variable.
Unlike arrays, pointers are assignable. Thus the following is valid:
char var;
char *ptr;
ptr = &var; // Perfectly Valid...
As a pointer is not an array, a pointer may be assigned a single value only.
char var;
char *ptr = &var; // The address of the variable `var` is stored as a value of the pointer `ptr`
Recall that a pointer must be assigned a single value, thus the following is not valid, as the number of initializers is more than one:
char *ptr = {'a','b','c','d','\0'};
This is a constraint violation, but your compiler might just assign 'a' to ptr and ignore the rest. Even then, the compiler will warn you, because character literals such as 'a' have int type, which is incompatible with the type of ptr, char *.
If this pointer is dereferenced at run time, it will most likely cause a run-time error for accessing invalid memory, crashing the program.
In your example:
char *str = {"what", "is", "this"};
again, this is a constraint violation, but your compiler may assign the string what to str and ignore the rest, and simply display a warning:
warning: excess elements in scalar initializer.
Now, here is how we eliminate the confusion regarding pointers and arrays:
In some contexts, an array may decay to a pointer to the first element of the array. Thus the following is valid:
char arr[10];
char *ptr = arr;
By using the array name arr as an rvalue in the initialization, the array decays to a pointer to its first element, which makes the previous declaration equivalent to:
char *ptr = &arr[0];
Remember that arr[0] is of type char, and &arr[0] is its address that is of type char *, which is compatible with the variable ptr.
Recall that string literals are constant null terminated char arrays, thus the following expression is also valid:
char *ptr = "abcdefghi"; // the array "abcdefghi" decays to a pointer to the first element 'a'
Now, in your case, char str[][5] = {"what","is","this"}; is an array of 3 arrays, each containing 5 elements.
Since arrays are not assignable, str[i] = "newstring"; is not valid as str[i] is an array, but str[i][j] = 'j'; is valid since
str[i][j] is an array element that is NOT an array by itself, and is assignable.
Case 1:
When I write
char*str={"what","is","this"};
then str[i]="newstring"; is valid whereas str[i][j]='j'; is invalid.
Part I.I
>> char*str={"what","is","this"};
In this statement, str is a pointer to char type.
When compiling, you must be getting a warning message on this statement:
warning: excess elements in scalar initializer
char*str={"what","is","this"};
^
Reason for the warning is - You are providing more than one initializer to a scalar.
[Arithmetic types and pointer types are collectively called scalar types.]
str is a scalar and from C Standards#6.7.9p11:
The initializer for a scalar shall be a single expression, optionally enclosed in braces. ..
Furthermore, giving more than one initializer to a scalar is undefined behavior.
From C Standards#J.2 Undefined behavior:
The initializer for a scalar is neither a single expression nor a single expression enclosed in braces
Since it is undefined behavior as per the standard, there is no point in discussing it further. Part I.II and Part I.III are discussed under the assumption char *str="somestring", just for a better understanding of the char * type.
It seems that you want to create an array of pointers to strings. I have added a brief note about arrays of pointers to strings below, after discussing both cases.
Part I.II
>> then str[i]="newstring"; is valid
No, this is not valid.
Again, the compiler must be giving a warning message on this statement because of an incompatible conversion.
Since str is a pointer to char, str[i] is the character i places past the object pointed to by str [str[i] --> *(str + i)].
"newstring" is a string literal, and a string literal decays into a pointer of type char * (except when used to initialize an array); here you are trying to assign it to a char. Hence the compiler reports a warning.
Part I.III
>> whereas str[i][j]='j'; is invalid.
Yes, this is invalid.
The [] (subscript operator) can be used with array or pointer operands.
str[i] is a character, and str[i][j] means you are using [] on a char operand, which is invalid. Hence the compiler reports it as an error.
Case 2:
When I write
char str[][5]={"what","is","this"};
then str[i]="newstring"; is not valid whereas str[i][j]='J'; is valid.
Part II.I
>> char str[][5]={"what","is","this"};
This is absolutely correct.
Here, str is a 2D-array. Based on the number of initializers, the compiler will automatically set the first dimension.
The in-memory view of str[][5], in this case, would be something like this:
str
+-+-+-+-+-+
str[0] |w|h|a|t|0|
+-+-+-+-+-+
str[1] |i|s|0|0|0|
+-+-+-+-+-+
str[2] |t|h|i|s|0|
+-+-+-+-+-+
Based on initializer list, the respective elements of 2D-array will be initialized and the rest of the elements are set to 0.
Part II.II
>> then str[i]="newstring"; is not valid
Yes, this is not valid.
str[i] is a one-dimensional array.
As per the C Standards, an array is not a modifiable lvalue.
From C Standards#6.3.2.1p1:
An lvalue is an expression (with an object type other than void) that potentially designates an object; if an lvalue does not designate an object when it is evaluated, the behavior is undefined. When an object is said to have a particular type, the type is specified by the lvalue used to designate the object. A modifiable lvalue is an lvalue that does not have array type, does not have an incomplete type, does not have a const-qualified type, and if it is a structure or union, does not have any member (including, recursively, any member or element of all contained aggregates or unions) with a const-qualified type.
Also, an array name is converted to a pointer to the initial element of the array object, except when it is the operand of the sizeof operator, the _Alignof operator or the unary & operator.
From C Standards#6.3.2.1p3:
Except when it is the operand of the sizeof operator, the _Alignof operator, or the unary & operator, or is a string literal used to initialize an array, an expression that has type ''array of type'' is converted to an expression with type ''pointer to type'' that points to the initial element of the array object and is not an lvalue.
Since str is already initialized, when you assign some other string literal to the ith array of str, the string literal is converted to a pointer, which makes the assignment incompatible: the left operand has type char array and the right operand has type char *. Hence the compiler reports it as an error.
Part II.III
>> whereas str[i][j]='J'; is valid.
Yes, this is valid as long as the i and j are valid values for given array str.
str[i][j] is of type char, so you can assign a character to it.
Beware: C does not check array bounds, and accessing an array out of bounds is undefined behavior. It may fortuitously do exactly what the programmer intended, cause a segmentation fault, silently generate incorrect results, or do anything else.
Assuming that in Case 1 you wanted to create an array of pointers to strings, it should be like this:
char *str[]={"what","is","this"};
^^
The in-memory view of str will be something like this:
str
+----+ +-+-+-+-+--+
str[0]| |--->|w|h|a|t|\0|
| | +-+-+-+-+--+
+----+ +-+-+--+
str[1]| |--->|i|s|\0|
| | +-+-+--+
+----+ +-+-+-+-+--+
str[2]| |--->|t|h|i|s|\0|
| | +-+-+-+-+--+
+----+
"what", "is" and "this" are string literals.
str[0], str[1] and str[2] are pointers to the respective string literal and you can make them point to some other string as well.
So, this is perfectly fine:
str[i]="newstring";
Assuming i is 1, so str[1] pointer is now pointing to string literal "newstring":
+----+ +-+-+-+-+-+-+-+-+-+--+
str[1]| |--->|n|e|w|s|t|r|i|n|g|\0|
| | +-+-+-+-+-+-+-+-+-+--+
+----+
But you should not do this:
str[i][j]='j';
(assuming i=1 and j=0, so str[i][j] is first character of second string)
As per the standard attempting to modify a string literal results in undefined behavior because they may be stored in read-only storage or combined with other string literals.
From C standard#6.4.5p7:
It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.
Additional:
There is no native string type in C language. In C language, a string is a null-terminated array of characters. You should know the difference between arrays and pointers.
I would suggest you read following for better understanding about arrays, pointers, array initialization:
Array Initialization, check this.
Equivalence of pointers and arrays, check this and this.
case 1 :
char*str={"what","is","this"};
First of all, the above statement is not valid; read the warnings properly. str is a single pointer, so it can point to only one char array at a time, not to multiple char arrays.
bounty.c:3:2: warning: excess elements in scalar initializer [enabled by default]
str is a char pointer; the pointer itself is stored in RAM, but the string it points to is stored in the read-only code section (with GCC on Linux), because str is initialized with a string literal, so that content can't be modified.
You stated that str[i]="newstring"; is valid whereas str[i][j]='j'; is invalid.
str="new string" does not modify the code/read-only section; you are simply assigning a new address to str, which is why it is valid, but
*str='j' or str[0]='j' is not valid because here you are modifying the read-only section by trying to change the first letter of the string.
Case 2 :
char str[][5]={"what","is","this"};
Here str is a 2D array, i.e. str and str[0], str[1], str[2] themselves are stored in the stack section of RAM, which means you can change the contents of each str[i].
str[i][j]='w'; is valid because you are modifying stack section contents, which is possible, but
str[i]="new string"; is not possible because str[0] is itself an array, and an array behaves like a constant pointer here (its address can't be changed), so you can't assign a new address to it.
Put simply, in the first case str="new string" is valid because str is a pointer, not an array, and in the second case str[0]="new string" is not valid because str[0] is an array, not a pointer.
I hope it helps.
For example,
I declared variables like this:
char szBuffer[12] = {"Hello"};
char szData[12] = {"Cheese"};
szBuffer = szData;
This is an error, since szBuffer can't be an l-value.
szBuffer has its own address, for example, 0x0012345678, and szBuffer's value is also its address, 0x0012345678.
So I think "array name can't be l-value" means that an array's address and its value have to be equal.
Am I right?
If I'm right, why do they have to be equal?
array name can't be l-value
It means an array can not be used as l-value or left hand side of the assignment operator (not to be confused with initialization). An l-value must be modifiable. You can modify the contents of array but not the array itself.
In C you cannot assign to arrays, though you can initialize them.
You should use strcpy(szBuffer, szData) or memcpy(szBuffer, szData, 12).
Also, there is no need for {} when initializing from a string literal.
If you insist on using operator =, you need to put your string in a struct because struct object copy is allowed in C.
ex:
struct string {
char name[12];
};
struct string szBuffer = {"Hello"};
struct string szData = {"Cheese"};
szBuffer = szData;
No, it doesn't mean such a thing.
An array's address isn't the value of the array in general.
Arrays in expressions, except as operands of the sizeof and unary & operators, are automatically converted to pointers to the first element of the array.
The converted pointer is not an l-value, so you cannot assign to it.
array name can't be l-value means that array names cannot be used on the left side of an =.
To be clearer, you need a modifiable l-value on the left side of an =
Array elements are modifiable l-values when accessed with indices like arr[i].
But array names themselves are not, and hence cannot be used on the left side of an =
An L-value is something that can appear on the left hand side of an assignment. Examples: Scalar variables, array elements, pointer dereferences. An array name is not an L-value in C. Instead, you can do one of two things: (1) a pointer assignment, if you just need a pointer to the array, or (2) an array copy, if you really need to copy the array itself.
I am trying to learn the basics, I would think that declaring a char[] and assigning a string to it would work.
thanks
int size = 100;
char str[size];
str = "\x80\xbb\x00\xcd";
gives error "incompatible types in assignment". what's wrong?
thanks
You can use a string literal to initialize an array of char, but you can't assign an array of char (any more than you can assign any other array). OTOH, you can assign a pointer, so the following would be allowed:
char *str;
str = "\x80\xbb\x00\xcd";
This is actually one of the most difficult parts of learning a programming language. str is an array, that is, a part of memory (size times a char, so size chars) that has been reserved and labeled as str. str[0] is the first character, str[1] the second, and str[size-1] is the last one. str itself, without specifying any character, is converted in most expressions to a pointer to the memory zone that was created when you did
char str[size]
As Jerry so clearly said, in C you cannot assign to arrays that way. You need to copy from one array to the other, so you can do something like this
strncpy(str, "\x80\xbb\x00\xcd", size); /* Copy up to size characters */
str[size-1]='\0'; /* Make sure that the string is null terminated for small values of size */
Summarizing: It's very important to distinguish between pointers, memory areas and arrays.
Good luck - I am pretty sure that in less time than you imagine you will be mastering these concepts :)
A char array can be implicitly converted to a char* when used as an rvalue, but not when used as an lvalue; that's why the assignment won't work.
You cannot assign array contents using the =operator. That's just a fact of the C language design. You can initialize an array in the declaration, such as
char str[size] = "\x80\xbb\x00\xcd";
but that's a different operation from an assignment. And note that in this case, an extra '\0' will be added to the end of the string.
The "incompatible types" error comes from how array expressions are treated by the language. First of all, string literals are stored as arrays of char with static extent (meaning they exist over the lifetime of the program). So the type of the string literal "\x80\xbb\x00\xcd" is "5-element array of char". However, in most circumstances, an expression of array type will implicitly be converted ("decay") from type "N-element array of T" to "pointer to T", and the value of the expression will be the address of the first element in the array. So, when you wrote the statement
str = "\x80\xbb\x00\xcd";
the type of the literal was implicitly converted from "5-element array of char" to "pointer to char", but the target of the assignment is type "100-element array of char", and the types are not compatible (above and beyond the fact that an array expression cannot be the target of the = operator).
To copy the contents of one array to another you would have to use a library function like memcpy, memmove, strcpy, etc. Also, for strcpy to function properly, the source string must be 0-terminated.
Edit per R's comment below, I've struck out the more dumbass sections of my answer.
To assign a string literal to the str array you can use the string copy function strcpy.
char a[100] = "\x80\xbb\x00\xcd"; OR char a[] = "\x80\xbb\x00\xcd";
str is the name of an array. The name of an array is the address of the 0th element. Therefore, str is a pointer constant. You cannot change the value of a pointer constant, just like you cannot change a constant (you can't do 6 = 5, for example).