I am learning about pointers. I don't understand what the difference between the char variables is.
#include <stdio.h>
int main() {
char *cards = "JQK";
char cards2[] = "JQK";
printf("%c\n", cards[2]);
printf("%c\n", cards2[2]);
}
I experimented with them in printf() and they seem to work the same way, except that cards2[] can't be reassigned while cards can. Why is this the case?
The difference is that the second is an array object initialized with the contents of the string literal. The first is a char * that holds the address of the string literal. A string literal is an array; that array is converted to a pointer to its first element, and that pointer is what gets stored in the char *.
An array is a non-modifiable lvalue, so it can't appear on the left side of the = assignment operator. A pointer (one not marked const) can. But because this pointer points to a string literal, you shouldn't try to modify what it points to. Doing so invokes undefined behavior.
Pointers and arrays are not the same thing; arrays decay into pointers in most contexts. That is what happened to the string literal on the right-hand side of the pointer initialization. The second case is different: the language explicitly says the contents of the string literal are copied into the declared array, which is why that array (cards2 here) is modifiable, unlike the literal in the first case.
To clarify a bit, let's first look at what is going on and what the difference between an array and a pointer is.
Here I have said that string literals are arrays. From §6.4.5¶6
In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. For character string literals, the array elements have type char, and are initialized with the individual bytes of the multibyte character sequence. For UTF-8 string literals, the array elements have type char, and are initialized with the characters of the multibyte character sequence, as encoded in UTF-8.
This is what the C11 standard says; it is quoted to show that a string literal is indeed a character array with static storage duration. Now the question is what happens when we write
char *cards = "JQK";
So now we have an array on the right-hand side of a pointer initialization. What will happen?
This is where array decay comes in. In most contexts an array is converted to a pointer to its first element. This may seem strange at first, but it is what happens. For example,
int a[]={1,2,3};
Now if you write a[i], this is equivalent to *(a+i), where a is the decayed pointer to the first element of the array. You are asking to go to position a+i and fetch the value at that address, which is 3 for i=2. The same thing happens here.
The pointer to the first element of the literal array is assigned to cards. So cards points to the first element of the literal array, which is J.
The story doesn't end here. The standard places a restriction on string literals: you are not supposed to change them. If you try, the behavior is undefined, which as the name implies means the standard does not define what happens. You should avoid it. So you shouldn't try to change the string literal, or whatever cards points to. Most implementations put string literals in a read-only section, so trying to write to one will typically fail.
That being said, what happens in the second case? This time we say
char cards2[] = "JQK";
Now cards2 is an array object, and to the right of the = there is a string literal again. What will happen? From §6.7.9¶14
An array of character type may be initialized by a character string literal or UTF-8 string literal, optionally enclosed in braces. Successive bytes of the string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array.
This means you can put a string literal to the right of the = in the declaration, and the char array will be initialized with its contents. That is what is being done here. The array is modifiable, so you can change it, unlike the previous case. That is the key difference.
If you are curious whether there are cases where an array is treated as an array and not as a pointer, the whole rule is stated here.
From §6.3.2.1¶3
Except when it is the operand of the sizeof operator, the _Alignof operator, or the unary & operator, or is a string literal used to initialize an array, an expression that has type 'array of type' is converted to an expression with type 'pointer to type' that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.
That is all there is to it. I hope this gives you a clearer idea than you had before.
I started learning pointers in C. I understood it fine until I came across the topic "Using Pointers to store character arrays".
A sample program to highlight my doubt is as follows
#include <stdio.h>
main()
{
char *string;
string = "good";
printf ("%s", string);
}
This prints the character string, i.e, good.
Pointers are supposed to store memory addresses; in other words, we assign the address of a variable (using the address operator) to a pointer variable.
What I don't understand is how are we able to assign a character string directly to the pointer? That too without address operator?
Also, how are we able to print the string without the indirection operator (*) ?
A literal string like "good" is really stored as a (read-only) array of characters. Also, all strings in C must be terminated with a special "null" character '\0'.
When you do the assignment
string = "good";
what is really happening is that you make string point to the first character in that array.
Functions that handle strings know how to deal with pointers like that: they loop over the array through the pointer, reading characters until they find the terminator.
Looking at it a little differently, the compiler creates its array
char internal_array[] = { 'g', 'o', 'o', 'd', '\0' };
then you make string point to the first element in the array
string = &internal_array[0];
Note that &internal_array[0] is actually equal to internal_array, since arrays naturally decay to pointers to their first element.
"cccccc" is a string literal which is actually the char array stored in the ReadOnly memory. You assign the pointer to the address of the first character of this literal.
If you want a copy of the string literal in writable memory, you need:
char string[] = "fgdfdfgdfgf";
Bear in mind that array initialization (when you declare it) is the only place where you can use = to copy a string literal into a char array (string).
In any other circumstances you need to use the appropriate library function, for example:
strcpy(string, "asdf");
(the array string has to be large enough to hold the new string, including its terminator)
What I don't understand is how are we able to assign a character string directly to the pointer? That too without address operator?
When an array is assigned to something, the array is converted to a pointer.
"good" is a string literal. It has type array 5 of char, which includes a trailing null character. It lives in memory that should not be written to. Attempting to write is undefined behavior (UB). It might "work", it might not. The code may crash, etc.
char *string; declares string as a pointer to char.
string = "good"; is an assignment. The operation takes "good" and converts that array to the address and type (char *) of its first element 'g'. It then assigns that char * to string.
Also, how are we able to print the string without the indirection operator (*) ?
printf() expects a char * - which matches the type of string.
printf ("%s", string); passes string to printf() as a char *; no conversion is made. For %s, printf() expects that "... the argument shall be a pointer to the initial element of an array of character type." and then "Characters from the array are written up to (but not including) the terminating null character." C11 §7.21.6.1 8.
Your first question:
What I don't understand is how are we able to assign a character string directly to the pointer? That too without address operator?
A character string literal is a sequence of zero or more multibyte characters enclosed in double quotes, e.g. "good".
From C Standard#6.4.5 [String literals]:
...The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. For character string literals, the array elements have type char, and are initialized with the individual bytes of the multibyte character sequence.....
In C, an expression that has type array of type is converted to an expression with type pointer to type that points to the initial element of the array object [there are a few exceptions]. Hence, the string literal, which is an array, decays into a pointer that can be assigned to a variable of type char *.
In the statement:
string = "good";
string will point to the initial character in the array where "good" is stored.
Your second question:
Also, how are we able to print the string without the indirection operator (*) ?
From printf():
s
writes a character string
The argument must be a pointer to the initial element of an array of characters...
So the format specifier %s expects a pointer to the initial element, which is what the variable string is: a pointer to the initial character of "good". Hence, you don't need the indirection operator (*).
I'm trying to create a 2D array for multiple data types, but it doesn't seem to accept the char data type. Why is this?
struct {
union {
int ival;
float fval;
char cval[50];
} val;
} as[120][4];
as[0][1].val.cval = "Testtttt"; /* This does not work */
as[1][1].val.ival = 3; /* This works */
You are in C, so you should use string.h when it comes to string handling!
Change this:
as[0][1].val.cval = "Testtttt";
to this:
strcpy(as[0][1].val.cval, "Testtttt");
using strcpy() instead of the assignment operator (assignment would work for a C++ std::string, but not for a char array in C).
Of course, alternative functions exist, such as strncpy()* and memcpy().
Moreover, since C string handling seems new to you, you should read about null-terminated strings in C.
*Credits to #fukanchik who reminded me that
In C, this code
as[0][1].val.cval
can not be assigned to. Per the C Standard, 6.3.2.1 Lvalues, arrays, and function designators:
Except when it is the operand of the sizeof operator, the _Alignof
operator, or the unary & operator, or is a string literal used to
initialize an array, an expression that has type "array of type"
is converted to an expression with type "pointer to type" that
points to the initial element of the array object and is not an
lvalue.
Without getting too in-depth into the C Standard, an lvalue is something you can assign something to. Thus, this code
as[1][1].val.ival
represents an lvalue and you can assign 3 to it.
The reason an array can't be assigned to is that it decays to an expression with type "pointer to type". In other words, a bare array like
as[0][1].val.cval
is treated as the address of the array.
And the address of the array is simply where the array lives; it is not something that can be assigned to.
Your val.cval members are arrays of char. String literals also represent arrays of char. C does not support whole-array assignment, regardless of the type of the array elements.
You can copy the contents of one array to another in various ways. strcpy() will do it for null-terminated arrays of char. memcpy() and / or memmove() will do it more generally, and of course you can always write an element-by-element copy loop.
You cannot copy the contents of one array to another using the = operator; you must use a library function like strcpy (for strings) or memcpy (for anything else), or you must assign each element individually:
as[0][1].val.cval[0] = 'T';
as[0][1].val.cval[1] = 'e';
as[0][1].val.cval[2] = 's';
...
as[0][1].val.cval[7] = 't';
as[0][1].val.cval[8] = 0;
Remember that in C, a string is a sequence of character values terminated by a 0-valued byte. Strings (including string literals like "Testtttt") are stored as arrays of char, but not all arrays of char store a string.
I am trying to understand how a string is passed to a called function and how the elements of the array are modified inside that function.
void foo(char p[]){
p[0] = 'a';
printf("%s",p);
}
void main(){
char p[] = "jkahsdkjs";
p[0] = 'a';
printf("%s",p);
foo("fgfgf");
}
The above code raises an exception. I know that a string in C is immutable, but I would like to know what the difference is between modifying it in main and modifying it in the called function. What happens in the case of other data types?
I know that string in C is immutable
That's not true. The correct version is: modifying a string literal in C is undefined behavior.
In main(), you defined the string as:
char p[] = "jkahsdkjs";
which is a non-literal character array, so you can modify it. But what you passed to foo is "fgfgf", which is a string literal.
Changing it to:
char str[] = "fgfgf";
foo(str);
would be fine.
In the first case:
char p[] = "jkahsdkjs";
p is an array that is initialized with a copy of the string literal. Since you don't specify the size, it will be determined by the length of the string literal plus the terminating null character. This is covered in the draft C99 standard section 6.7.8 Initialization paragraph 14:
An array of character type may be initialized by a character string literal, optionally
enclosed in braces. Successive characters of the character string literal (including the
terminating null character if there is room or if the array is of unknown size) initialize the elements of the array.
In the second case:
foo("fgfgf");
you are attempting to modify a string literal, which is undefined behavior. That means the behavior of the program is unpredictable, and an exception is one possibility. From the C99 draft standard section 6.4.5 String literals paragraph 6 (emphasis mine):
It is unspecified whether these arrays are distinct provided their elements have the
appropriate values. If the program attempts to modify such an array, the behavior is
undefined.
The difference is in how you are initializing p[].
char p[] = "jkahsdkjs";
This initializes a writable array called p, auto-sized to be large enough to contain your string and stored on the stack at runtime.
However, in the case of:
foo("fgfgf");
You are passing in a pointer to the actual string literal, which most implementations place in read-only memory.
What happens in case of other data types?
String literals are a very special case. Other data types, such as int, do not have an analogous issue, since they are passed strictly by value.
I'm sorry for asking something that probably seems a little inane as it is apparently not broken but my (newbie) understanding of how C handles string literals tells me that this should not work...
char some_array_of_strings[3][200];
strcpy(some_array_of_strings[2], "Some garbage");
strcpy(some_array_of_strings[2], "Some other garbage");
I thought that C prevented the direct modification of string literals and that was why pointers were used when dealing with strings. The fact that this works tells me I am misunderstanding something.
Also, if this works, why does...
some_array_of_strings[1]="Some garbage"
some_array_of_strings[1]="Some garbage that causes a compiler error due to reassignment"
not work?
Be careful with the phrase "array of strings". A "string" is not a data type in C; it's a data layout. Specifically, a string is defined as
a contiguous sequence of characters terminated by and including the
first null character
An array of char may contain a string, and a char* pointer may point to (the first character of) a string. (The standard defines a pointer to the first character of a string as a pointer to a string.)
char some_array_of_strings[3][200];
This defines a 3-element array, where each of the elements is a 200-element array of char. (It's a 2-dimensional array, which in C is simply an array of arrays.)
strcpy(some_array_of_strings[2], "Some garbage");
The string literal "Some garbage" refers to an anonymous statically allocated array of char; it exists for the entire execution of your program, and you're not allowed to modify it. The strcpy() call, as the name implies, copies the contents of that array, up to and including the terminating '\0' null character, into some_array_of_strings[2].
strcpy(some_array_of_strings[2], "Some other garbage");
Same thing: this copies the contents "Some other garbage" into some_array_of_strings[2], overwriting what you copied on the previous line. In both cases, there's more than enough room.
You're not modifying a string literal, you're modifying your own array by copying bytes from a string literal (more precisely, from that anonymous array I mentioned above).
some_array_of_strings[1]="Some garbage";
This doesn't just "not work", it's illegal. There is no assignment of arrays in C.
Let's take a simpler example:
char arr[10];
arr = "hello"; /* also illegal */
arr is an object of array type. In most contexts, an expression of array type is implicitly converted to a pointer to the array object's first element. That applies to both sides of the assignment: the object name arr and the string literal "hello".
But the pointer on the left side is just a pointer value. There is no pointer object. In technical terms, it's not an lvalue, so it can't appear on the left side of an assignment (any more than you could write 42 = x;).
(If the array-to-pointer conversion didn't happen, it would still be illegal, because C doesn't permit array assignments.)
Some more detail on the issue of arrays on the left side of assignment:
The contexts where an array expression doesn't decay into a pointer are when the array expression is:
an operand of the unary sizeof operator;
an operand of unary & (address-of) operator; or
a string literal in an initializer used to initialize an array object.
The left side of an assignment isn't any of those contexts, so in:
char array[10];
array = "hello";
the LHS is, in principle, converted to a pointer. But the resulting pointer expression is no longer an lvalue, which makes the assignment a constraint violation.
One way to look at it is that the expression array is converted to a pointer, which then makes the assignment illegal. Another is that since the assignment is illegal, the whole program is not valid C, so it has no defined behavior and it's meaningless to ask whether any conversion does or does not happen.
(I'm playing a little fast and loose with my use of the word "illegal", but this answer is long enough already so I won't get into it.)
Recommended reading: Section 6 of the comp.lang.c FAQ; it does an excellent job of explaining the often bewildering relationship between arrays and pointers in C.
You aren't modifying the string literal; you are using it as a source to copy into your array of characters. Once the copy is finished, the string literal has nothing to do with the copy in your array. You are then free to manipulate the array.
From your definition, char some_array_of_strings[3][200]; indicates that some_array_of_strings is an array of 3 elements, each of which is itself an array of 200 characters, able to hold a string of up to 199 characters plus the terminator.
strcpy(some_array_of_strings[2], "Some garbage");
strcpy(some_array_of_strings[2], "Some other garbage");
In these 2 statements, you are copying the characters from the array behind the string literal into your own array, which is valid. some_array_of_strings[2] has type char[200]; when passed to strcpy it decays to a char * pointing to its first element.
some_array_of_strings[1]="Some garbage";
some_array_of_strings[1]="Some garbage that causes a compiler error due to reassignment";
Here, you are trying to assign a char * (the decayed literal "Some garbage") to a char[200], i.e. some_array_of_strings[1], which is not supported. The difference is between assigning and copying the contents.
some_array_of_strings[1]="Some garbage"
some_array_of_strings[1]="Some garbage that causes a compiler error due to reassignment"
Neither line compiles: some_array_of_strings[1] is an array of 200 chars, and an array cannot be the target of an assignment, so the error comes from the assignment itself rather than from the reassignment.
It is just as Keith and Fred have said: with strcpy you are only copying the characters of the string literal into your array.
some_array_of_strings[2] is an array of 200 chars.
When it's used in most expressions, it "decays" (fancy word for converts) into a pointer to the first element of the array.
strcpy(some_array_of_strings[2], "Some garbage"); then copies "Some garbage" character by character into that array of 200 chars, by making use of the pointer to the first element of the array and advancing it one by one.
In most expressions "Some garbage" is converted to a pointer to the first element of an array of chars containing those respective characters plus a string termination character ('\0').
some_array_of_strings[1]="Some garbage", on the other hand, attempts to assign a pointer (to the string) to the constant/non-modifiable pointer to the first element of 200 chars, which is also illegal (like doing 1=2;).
I am trying to learn the basics, I would think that declaring a char[] and assigning a string to it would work.
thanks
int size = 100;
char str[size];
str = "\x80\xbb\x00\xcd";
gives error "incompatible types in assignment". what's wrong?
thanks
You can use a string literal to initialize an array of char, but you can't assign an array of char (any more than you can assign any other array). OTOH, you can assign a pointer, so the following would be allowed:
char *str;
str = "\x80\xbb\x00\xcd";
This is actually one of the most difficult parts of learning a programming language.... str is an array, that is, a part of memory (size times a char, so size chars) that has been reserved and labeled as str. str[0] is the first character, str[1] the second... str[size-1] is the last one. str itself, without an index, evaluates in most expressions to a pointer to the start of the memory zone that was created when you did
char str[size]
As Jerry so clearly said, in C you cannot assign to arrays that way. You need to copy from one array to the other, so you can do something like this
strncpy(str, "\x80\xbb\x00\xcd", size); /* Copy up to size characters */
str[size-1]='\0'; /* Make sure that the string is null terminated for small values of size */
Summarizing: It's very important to distinguish between pointers, memory areas and arrays.
Good luck - I am pretty sure that in less time than you imagine you will be mastering these concepts :)
A char array can be implicitly converted to a char * when used as an rvalue, but not when used as an lvalue; that's why the assignment won't work.
You cannot assign array contents using the = operator. That's just a fact of the C language design. You can initialize an array in the declaration, such as
char str[size] = "\x80\xbb\x00\xcd";
but that's a different operation from an assignment (and size must be a constant expression here, since a variable-length array cannot have an initializer). Note also that in this case, an extra '\0' will be added to the end of the string.
The "incompatible types" error comes from how array expressions are treated by the language. First of all, string literals are stored as arrays of char with static extent (meaning they exist over the lifetime of the program). So the type of the string literal "\x80\xbb\x00\xcd" is "5-element array of char". However, in most circumstances, an expression of array type will implicitly be converted ("decay") from type "N-element array of T" to "pointer to T", and the value of the expression will be the address of the first element in the array. So, when you wrote the statement
str = "\x80\xbb\x00\xcd";
the type of the literal was implicitly converted from "5-element array of char" to "pointer to char", but the target of the assignment is type "100-element array of char", and the types are not compatible (above and beyond the fact that an array expression cannot be the target of the = operator).
To copy the contents of one array to another you would have to use a library function like memcpy, memmove, strcpy, etc. Also, for strcpy to function properly, the source string must be 0-terminated.
To assign a string literal to the str array you can use the string copy function strcpy. Alternatively, initialize the array when you declare it:
char a[100] = "\x80\xbb\x00\xcd"; OR char a[] = "\x80\xbb\x00\xcd";
str is the name of an array. In most expressions, the name of an array evaluates to the address of its 0th element. In that sense, str behaves like a pointer constant. You cannot change the value of a pointer constant, just like you cannot change a constant (you can't do 6 = 5, for example).