How can a char pointer contain a string in it? - c

If I define a char pointer in a C program and initialize it to "some String"
Can someone explain how a char pointer that's supposed to hold an address in it can hold a string in it?
Isn't it a contradiction to the definition of a pointer? What am I missing here?
For example :
char *pointer=" how is it possible at all ? ";
printf("%s",pointer);

String literals, like "how is it possible at all ? " are really arrays of read-only characters stored somewhere by the compiler.
When you do
char *pointer=" how is it possible at all ? ";
you initialize pointer to point to the first element of that array.
This is very similar to
char string[] = " how is it possible at all ? ";
char *pointer = &string[0]; // Make pointer point to the first character in the array
How pointers themselves work depends on the compiler and the target architecture, but most of the time they are simple integers whose value is the address of the memory they point to. Then the compiler handles them specially and translates usage of the pointers into the correct machine-code instructions to access the memory a pointer is pointing to.
Because string literals are read-only, you should really use const char * when making pointers to them. C allows plain non-constant char *, but then the compiler might not be able to detect attempts to modify the read-only literal, and such a modification is undefined behavior.

In the statement:
char *pointer=" how is it possible at all ? ";
" how is it possible at all ? " is a string literal and a string literal is an array of characters.
From C Standards#6.3.2.1p3
Except when it is the operand of the sizeof operator, the _Alignof operator, or the unary & operator, or is a string literal used to initialize an array, an expression that has type ''array of type'' is converted to an expression with type ''pointer to type'' that points to the initial element of the array object and is not an lvalue.
So the string literal, which is an array of char, decays into a pointer to char that points to its first element.

Can someone explain how a char pointer that's supposed to hold an address in it can hold a string in it?
Another way to look at it, by analogy: the same way a huge building can have a street address such as "123 Main Street".
The address is where it is - not how big it is.

Related

Pointers and Arrays in c, what is the difference? [duplicate]

This question already has answers here:
Why do I get a segmentation fault when writing to a "char *s" initialized with a string literal, but not "char s[]"?
I am learning about pointers. I don't understand what the difference between the char variables is.
#include <stdio.h>
int main() {
char *cards = "JQK";
char cards2[] = "JQK";
printf("%c\n", cards[2]);
printf("%c\n", cards2[2]);
}
I experimented with them in printf() and they seem to work the same way, except that cards2[] can't be reassigned while cards can. Why is this the case?
The difference is that the second is an array object initialized with the contents of the string literal. The first is a char * that holds the address of the string literal. String literals are arrays; this array is converted into a pointer to its first element, and that pointer is then assigned to the char *.
The thing is, an array is a non-modifiable lvalue: it can't appear on the left side of the = assignment operator. A pointer (one not marked const) can. And because the pointer points to a string literal, you shouldn't try to modify what it points to. Doing so invokes undefined behavior.
Pointers and arrays are not the same thing, but arrays decay into pointers in most cases. That is also what happened here with the string literal on the right-hand side of the pointer initialization. The second case is different: the standard says explicitly that the contents of the string literal are copied into the declared array, which is why the array (here cards2) is modifiable, unlike the previous case.
To clarify a bit, let's first look at what is going on and what the difference between an array and a pointer is.
Here I have said that string literals are arrays. From §6.4.5¶6
In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. For character string literals, the array elements have type char, and are initialized with the individual bytes of the multibyte character sequence. For UTF-8 string literals, the array elements have type char, and are initialized with the characters of the multibyte character sequence, as encoded in UTF-8.
This is what the C11 standard says; it is quoted to show that string literals are indeed character arrays of static storage duration. Now the question is what happens when we write
char *cards = "JQK";
So now we have an array on the right-hand side of a pointer initialization. What will happen?
Now comes the concept of array decay. In most cases an array is converted into a pointer to its first element. This may seem strange at first, but this is what happens. For example,
int a[]={1,2,3};
Now if you write a[i] this is equivalent to *(a+i) and a is the decayed pointer to the first element of the array. Now you are asking to go to the position a+i and give the value that is in that address, which is 3 for i=2. Same thing happened here.
The pointer to the first element of the literal array is assigned to cards. Now what happened next? cards points to the first element of the literal array which is J.
The story doesn't end here. The standard imposes a restriction on string literals: you are not supposed to change them. If you try, the behavior is undefined, which, as the name implies, means the standard does not define what happens. You should avoid it. So you shouldn't try to change the string literal or whatever cards points to. Most implementations put string literals in a read-only section, so trying to write to one is an error.
That being said, what happened in the second case? This time we say
char cards2[] = "JQK";
Now cards2 is an array object, and to the right of the = in its initializer there is a string literal again. What will happen? From §6.7.9¶14
An array of character type may be initialized by a character string literal or UTF-8 string literal, optionally enclosed in braces. Successive bytes of the string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array.
What this means is that you can put a string literal on the right side of the initialization, and the char array will be initialized from it. That is what is being done here. The array is modifiable, so you can change it, unlike the previous case. That is the key difference.
Also, if you are curious whether there are cases where an array is treated as an array and not as a pointer, the whole rule is stated here.
From §6.3.2.1¶3
Except when it is the operand of the sizeof operator, the _Alignof operator, or the unary & operator, or is a string literal used to initialize an array, an expression that has type 'array of type' is converted to an expression with type 'pointer to type' that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.
That is all there is to it. I hope this gives you a clearer idea than you had before.

Char not working in multi dimensional array in C. Need clarification

I'm trying to create a 2D array for multiple data types, but it does not seem to accept the char data type. Why is this?
struct {
union {
int ival;
float fval;
char cval[50];
} val;
} as[120][4];
as[0][1].val.cval = "Testtttt"; /* This does not work */
as[1][1].val.ival = 3; /* This works */
You are in C, so you should use the functions in string.h for string handling!
Change this:
as[0][1].val.cval = "Testtttt";
to this:
strcpy(as[0][1].val.cval, "Testtttt");
by using strcpy() instead of the assignment operator (assignment would work with a C++ std::string, but not with a char array in either language).
Of course, alternative functions exist, such as strncpy()* and memcpy().
Moreover, since C string handling seems new to you, you must read about null terminated strings in C.
*Credits to #fukanchik who reminded me that
In C, this code
as[0][1].val.cval
can not be assigned to. Per the C Standard, 6.3.2.1 Lvalues, arrays, and function designators:
Except when it is the operand of the sizeof operator, the _Alignof
operator, or the unary & operator, or is a string literal used to
initialize an array, an expression that has type "array of type"
is converted to an expression with type "pointer to type" that
points to the initial element of the array object and is not an
lvalue.
Without getting too in-depth into the C Standard, an lvalue is something you can assign something to. Thus, this code
as[1][1].val.ival
represents an lvalue and you can assign 3 to it.
The reason an array can't be assigned to is that it decays to "an expression with type 'pointer to type'". In other words, a bare array like
as[0][1].val.cval
is treated as the address of the array.
And the address of the array is where it is and is not something that can be assigned to.
Your val.cval members are arrays of char. String literals also represent arrays of char. C does not support whole-array assignment, regardless of the type of the array elements.
You can copy the contents of one array to another in various ways. strcpy() will do it for null-terminated arrays of char. memcpy() and / or memmove() will do it more generally, and of course you can always write an element-by-element copy loop.
You cannot copy the contents of one array to another using the = operator; you must use a library function like strcpy (for strings) or memcpy (for anything else), or you must assign each element individually:
as[0][1].val.cval[0] = 'T';
as[0][1].val.cval[1] = 'e';
as[0][1].val.cval[2] = 's';
...
as[0][1].val.cval[7] = 't';
as[0][1].val.cval[8] = 0;
Remember that in C, a string is a sequence of character values terminated by a 0-valued byte. Strings (including string literals like "Testtttt") are stored as arrays of char, but not all arrays of char store a string.

Strings in C: Why does this work?

I'm sorry for asking something that probably seems a little inane as it is apparently not broken but my (newbie) understanding of how C handles string literals tells me that this should not work...
char some_array_of_strings[3][200];
strcpy(some_array_of_strings[2], "Some garbage");
strcpy(some_array_of_strings[2], "Some other garbage");
I thought that C prevented the direct modification of string literals and that was why pointers were used when dealing with strings. The fact that this works tells me I am misunderstanding something.
Also, if this works, why does...
some_array_of_strings[1]="Some garbage"
some_array_of_strings[1]="Some garbage that causes a compiler error due to reassignment"
not work?
Be careful with the phrase "array of strings". A "string" is not a data type in C; it's a data layout. Specifically, a string is defined as
a contiguous sequence of characters terminated by and including the
first null character
An array of char may contain a string, and a char* pointer may point to (the first character of) a string. (The standard defines a pointer to the first character of a string as a pointer to a string.)
char some_array_of_strings[3][200];
This defines a 3-element array, where each of the elements is a 200-element array of char. (It's a 2-dimensional array, which in C is simply an array of arrays.)
strcpy(some_array_of_strings[2], "Some garbage");
The string literal "Some garbage" refers to an anonymous statically allocated array of char; it exists for the entire execution of your program, and you're not allowed to modify it. The strcpy() call, as the name implies, copies the contents of that array, up to and including the terminating '\0' null character, into some_array_of_strings[2].
strcpy(some_array_of_strings[2], "Some other garbage");
Same thing: this copies the contents "Some other garbage" into some_array_of_strings[2], overwriting what you copied on the previous line. In both cases, there's more than enough room.
You're not modifying a string literal, you're modifying your own array by copying bytes from a string literal (more precisely, from that anonymous array I mentioned above).
some_array_of_strings[1]="Some garbage";
This doesn't just "not work", it's illegal. There is no assignment of arrays in C.
Let's take a simpler example:
char arr[10];
arr = "hello"; /* also illegal */
arr is an object of array type. In most contexts, an expression of array type is implicitly converted to a pointer to the array object's first element. That applies to both sides of the assignment: the object name arr and the string literal "hello".
But the pointer on the left side is just a pointer value. There is no pointer object. In technical terms, it's not an lvalue, so it can't appear on the left side of an assignment (any more than you could write 42 = x;).
(If the array-to-pointer conversion didn't happen, it would still be illegal, because C doesn't permit array assignments.)
Some more detail on the issue of arrays on the left side of assignment:
The contexts where an array expression doesn't decay into a pointer are when the array expression is:
an operand of the unary sizeof operator;
an operand of unary & (address-of) operator; or
a string literal in an initializer used to initialize an array object.
The left side of an assignment isn't any of those contexts, so in:
char array[10];
array = "hello";
LHS is, in principle, converted to a pointer. But the resulting pointer expression is no longer an lvalue, which makes the assignment a constraint violation.
One way to look at it is that the expression array is converted to a pointer, which then makes the assignment illegal. Another is that since the assignment is illegal, the whole program is not valid C, so it has no defined behavior and it's meaningless to ask whether any conversion does or does not happen.
(I'm playing a little fast and loose with my use of the word "illegal", but this answer is long enough already so I won't get into it.)
Recommended reading: Section 6 of the comp.lang.c FAQ; it does an excellent job of explaining the often bewildering relationship between arrays and pointers in C.
You aren't modifying the string literal; you are using it as a source to copy from into your array of characters. Once the copy is finished, the string literal has nothing more to do with the copy in your array. You are then free to manipulate the array.
From your definition, char some_array_of_strings[3][200]; indicates that some_array_of_strings is an array of 3 elements, each of which is itself an array of 200 characters, able to hold a string of up to 199 characters plus the terminating null.
strcpy(some_array_of_strings[2], "Some garbage");
strcpy(some_array_of_strings[2], "Some other garbage");
In these 2 statements you are copying content from the memory one char pointer points to into the memory another char pointer points to, which is valid. some_array_of_strings[2] has type char[200], which decays to char * when passed to strcpy.
some_array_of_strings[1]="Some garbage";
some_array_of_strings[1]="Some garbage that causes a compiler error due to reassignment";
Here, you are assigning a char * like "Some garbage" to a char[200] i.e. some_array_of_strings[1] which is not supported. The difference lies in assigning and copying the content.
some_array_of_strings[1]="Some garbage"
some_array_of_strings[1]="Some garbage that causes a compiler error due to reassignment"
Neither line compiles, and reassignment has nothing to do with it. some_array_of_strings[1] is an array of 200 char, and an array cannot be made to point at a string literal the way a pointer can, so the very first assignment is already an error.
It is just as Keith and Fred have said: with strcpy you are only copying the characters of the string literal into your array.
some_array_of_strings[2] is an array of 200 chars.
When it's used in most expressions, it "decays" (fancy word for converts) into a pointer to the first element of the array.
strcpy(some_array_of_strings[2], "Some garbage"); then copies "Some garbage" character by character into that array of 200 chars, by making use of the pointer to the first element of the array and advancing it one by one.
In most expressions "Some garbage" is converted to a pointer to the first element of an array of chars containing those characters plus a string termination character ('\0').
some_array_of_strings[1]="Some garbage" on the other hand attempts to assign a pointer (to the string) to the constant/non-modifiable pointer to the first element of 200 chars, which is also illegal (like doing 1=2;).

Char String Assignment

I am working on Microsoft Visual Studio environment. I came across a strange behavior
char *src ="123";
char *des ="abc";
printf("\nThe src string is %c", src[0]);
printf("\tThe dest string is %c",des[0]);
des[0] = src[0];
printf("\nThe src string is %c", src[0]);
printf("\tThe dest string is %c",des[0]);
The result is:
1 a
1 a
That means the des[0] is not being initialized. As src is pointing to the first element of the string. I guess by rules this should work.
This is undefined behavior:
des[0] = src[0];
Try this instead:
char des[] ="abc";
Since src and des are initialized with string literals, their type should actually be const char *, not char *; like this:
const char * src ="123";
const char * des ="abc";
There was never memory allocated for either of them, they just point to the predefined constants. Therefore, the statement des[0] = src[0] is undefined behavior; you're trying to change a constant there!
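In C++, where string literals have type const char[N], any decent compiler should warn you about the implicit conversion from const char * to char *. In C the literal has type char[N], so you only get a warning if you ask for one, for example with GCC's -Wwrite-strings option.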
If using C++, consider using std::string instead of char *, and std::cout instead of printf.
Section 2.13.4 of ISO/IEC 14882 (Programming languages - C++) says:
A string literal is a sequence of characters (as defined in 2.13.2) surrounded by double quotes, optionally beginning with the letter L, as in "..." or L"...". A string literal that does not begin with L is an ordinary string literal, also referred to as a narrow string literal. An ordinary string literal has type “array of n const char” and static storage duration (3.7), where n is the size of the string as defined below, and is initialized with the given characters. ...
Whether all string literals are distinct (that is, are stored in nonoverlapping objects) is implementation defined. The effect of attempting to modify a string literal is undefined.
In C, string literals such as "123" are stored as arrays of char (const char in C++). These arrays are stored in memory such that they are available over the lifetime of the program. Attempting to modify the contents of a string literal results in undefined behavior; sometimes it will "work", sometimes it won't, depending on the compiler and the platform, so it's best to treat string literals as unwritable.
Remember that under most circumstances, an expression of type "N-element array of T" will be converted to an expression of type "pointer to T" whose value is the location of the first element in the array.
Thus, when you write
char *src = "123";
char *des = "abc";
the expressions "123" and "abc" are converted from "4-element array of char" (three characters plus the terminating null) to "pointer to char", and src will point to the '1' in "123", and des will point to the 'a' in "abc".
Again, attempting to modify the contents of a string literal results in undefined behavior, so when you write
des[0] = src[0];
the compiler is free to treat that statement any way it wants to, from ignoring it completely to doing exactly what you expect it to do to anything in between. That means that string literals, or a pointer to them, cannot be used as target parameters to calls like strcpy, strcat, memcpy, etc., nor should they be used as parameters to calls like strtok.
vinaygarg: That means the des[0] is not being initialized. As src is pointing to the first element of the string. I guess by rules this should work.
Firstly you must remember that src and des are defined as pointers, nothing more, nothing less.
So you must then ask yourself what exactly "123" and "abc" are and why they cannot be altered. To cut a long story short, they are stored in application memory, which is read-only. Why? The strings must be stored with the program in order to be available to your code at run time; in theory you should get a compiler warning for assigning a const char * to a non-const char *. Why is it read-only? The memory for exes and DLLs needs to be protected from being overwritten, so it is made read-only to stop bugs and viruses from modifying executing code.
So how can you get this string into modifiable memory?
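// Copying into an array.
enum { BUFFER_SIZE = 256 }; // a true compile-time constant, so buffer is not a VLA
char buffer[BUFFER_SIZE];
strcpy(buffer, "abc"); // fine here: "abc" plus its '\0' fits easily
// Bounded alternative: strncpy does not add '\0' if the source is too long, so terminate by hand.
strncpy(buffer, "abc", BUFFER_SIZE-1);
buffer[BUFFER_SIZE-1] = '\0';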

c string basics, why unassigned?

I am trying to learn the basics, I would think that declaring a char[] and assigning a string to it would work.
thanks
int size = 100;
char str[size];
str = "\x80\xbb\x00\xcd";
gives error "incompatible types in assignment". what's wrong?
thanks
You can use a string literal to initialize an array of char, but you can't assign an array of char (any more than you can assign any other array). OTOH, you can assign a pointer, so the following would be allowed:
char *str;
str = "\x80\xbb\x00\xcd";
This is actually one of the most difficult parts of learning a programming language.... str is an array, that is, a part of memory (size times a char, so size chars) that has been reserved and labeled as str. str[0] is the first character, str[1] the second... str[size-1] is the last one. str itself, without specifying any character, is converted in most expressions to a pointer to the start of the memory zone that was created when you did
char str[size]
As Jerry so clearly said, in C you cannot assign to arrays that way. You need to copy from one array to the other, so you can do something like this
strncpy(str, "\x80\xbb\x00\xcd", size); /* Copy up to size characters */
str[size-1]='\0'; /* Make sure that the string is null terminated for small values of size */
Summarizing: It's very important to make a difference between pointers, memory areas and array.
Good luck - I am pretty sure that in less time than you imagine you will be mastering these concepts :)
A char array can be implicitly converted to a char * when used as an rvalue, but not when used as an lvalue, which is why the assignment won't work.
You cannot assign array contents using the = operator. That's just a fact of the C language design. You can initialize an array in the declaration, provided its size is a constant expression, such as
char str[size] = "\x80\xbb\x00\xcd";
but that's a different operation from an assignment. And note that in this case, an extra '\0' will be added to the end of the string.
The "incompatible types" error comes from how array expressions are treated by the language. First of all, string literals are stored as arrays of char with static extent (meaning they exist over the lifetime of the program). So the type of the string literal "\x80\xbb\x00\xcd" is "5-element array of char". However, in most circumstances, an expression of array type will implicitly be converted ("decay") from type "N-element array of T" to "pointer to T", and the value of the expression will be the address of the first element in the array. So, when you wrote the statement
str = "\x80\xbb\x00\xcd";
the type of the literal was implicitly converted from "5-element array of char" to "pointer to char", but the target of the assignment is type "100-element array of char", and the types are not compatible (above and beyond the fact that an array expression cannot be the target of the = operator).
To copy the contents of one array to another you would have to use a library function like memcpy, memmove, strcpy, etc. Also, for strcpy to function properly, the source string must be 0-terminated.
To copy a string literal into the str array you can use the string copy function strcpy.
char a[100] = "\x80\xbb\x00\xcd"; OR char a[] = "\x80\xbb\x00\xcd";
str is the name of an array. The name of an array is the address of the 0th element. Therefore, str is a pointer constant. You cannot change the value of a pointer constant, just like you cannot change a constant (you can't do 6 = 5, for example).

Resources