Why pointers can't be used to index arrays? [duplicate] - c

This question already has answers here:
Why do I get a segmentation fault when writing to a "char *s" initialized with a string literal, but not "char s[]"?
I am trying to change the elements of a character array through a pointer, but I am not able to do so. Is there a fundamental difference between the two declarations char A[] and char *A?
I tried reading the array with A[0] and it worked, but I am not able to change the values of its elements.
{
char *A = "ab";
printf("%c\n", A[0]); //Works. I am able to access A[0]
A[0] = 'c'; //Segmentation fault. I am not able to edit A[0]
printf("%c\n", A[0]);
}
Expected output:
a
c
Actual output:
a
Segmentation fault

The difference is that char A[] defines an array and char * does not.
The most important thing to remember is that arrays are not pointers.
In this declaration:
char *A = "ab";
the string literal "ab" creates an anonymous array object of type char[3] (2 plus 1 for the terminating '\0'). The declaration creates a pointer called A and initializes it to point to the initial character of that array.
The array object created by a string literal has static storage duration (meaning that it exists through the entire execution of your program) and does not allow you to modify it. (Strictly speaking an attempt to modify it has undefined behavior.) It really should be const char[3] rather than char[3], but for historical reasons it's not defined as const. You should use a pointer to const to refer to it:
const char *A = "ab";
so that the compiler will catch any attempts to modify the array.
In this declaration:
char A[] = "ab";
the string literal does the same thing, but the array object A is initialized with a copy of the contents of that array. The array A is modifiable because you didn't define it with const -- and because it's an array object you created, rather than one implicitly created by a string literal, you can modify it.
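A minimal sketch of the two declarations side by side, assuming a C99 compiler. The function names edit_array_copy and literal_first_char are made up for illustration. The array holds its own writable copy, while the literal is only read through a pointer to const:

```c
#include <assert.h>

/* Hypothetical helper: modifies an array initialized from "ab". */
char edit_array_copy(void)
{
    char a[] = "ab";       /* writable char[3] holding a copy of the literal */
    assert(sizeof a == 3); /* 'a', 'b' and the terminating '\0' */
    a[0] = 'c';            /* fine: a is our own array object */
    return a[0];
}

/* Hypothetical helper: only reads the literal through a pointer to const. */
char literal_first_char(void)
{
    const char *p = "ab";  /* points at the read-only anonymous array */
    /* p[0] = 'c'; would be rejected at compile time because of const */
    return p[0];
}
```

Calling edit_array_copy() yields 'c', and literal_first_char() yields 'a' because the literal is never touched.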
An array indexing expression like A[0] actually requires a pointer as one of its operands (and an integer as the other). Very often that pointer will be the result of an array expression "decaying" to a pointer, but it can also be just a pointer -- as long as that pointer points to an element of an array object.
The relationship between arrays and pointers in C is complicated, and there's a lot of misinformation out there. I recommend reading section 6 of the comp.lang.c FAQ.
You can use either an array name or a pointer to refer to elements of an array object. You ran into a problem with an array object that's read-only. For example:
#include <stdio.h>
int main(void) {
char array_object[] = "ab"; /* array_object is writable */
char *ptr = array_object; /* or &array_object[0] */
printf("array_object[0] = '%c'\n", array_object[0]);
printf("ptr[0] = '%c'\n", ptr[0]);
}
Output:
array_object[0] = 'a'
ptr[0] = 'a'

String literals like "ab" are supposed to be immutable, like any other literal (you can't alter the value of a numeric literal like 1 or 3.1419, for example). Unlike numeric literals, however, string literals require some kind of storage to be materialized. Some implementations (such as the one you're using, apparently) store string literals in read-only memory, so attempting to change the contents of the literal will lead to a segfault.
The language definition leaves the behavior undefined - it may work as expected, it may crash outright, or it may do something else.

String literals are not meant to be overwritten; think of them as read-only. Overwriting one is undefined behavior, and on your system the result was a crash. To get a string you can modify, use an array instead.
char A[3] = "ab";
A[0] = 'c';

Is there a fundamental difference between declaring arrays using the two different methods i.e. char A[] and char *A?
Yes, because the second one is not an array but a pointer.
The type of "ab" is char /*readonly*/ [3]. It is an array with immutable content. So when you want a pointer to that string literal, you should use a pointer to char const:
char const *foo = "ab";
That keeps you from altering the literal by accident. If you however want to use the string literal to initialize an array:
char foo[] = "ab"; // the size of the array is determined by the initializer
// here: 3 - the characters 'a', 'b' and '\0'
The elements of that array can then be modified.
Array-indexing btw is nothing more but syntactic sugar:
foo[bar]; /* is the same as */ *(foo + bar);
That's why one can do funny things like
"Hello!"[2]; /* 'l' but also */ 2["Hello!"]; // 'l'

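The equivalence of the subscript forms can be checked directly. This is a small sketch; index_forms_agree and third_of_hello are hypothetical names, not anything from the answers above:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if all three spellings read the same element. */
int index_forms_agree(const char *s, size_t i)
{
    assert(s != NULL);
    char a = s[i];        /* usual form */
    char b = *(s + i);    /* what the subscript is defined to mean */
    char c = i[s];        /* legal because the addition commutes */
    return a == b && b == c;
}

/* Indexing a string literal directly. */
char third_of_hello(void)
{
    return "Hello!"[2];   /* 'l' */
}
```

The i[s] spelling is shown only to illustrate the definition; s[i] remains the readable choice in real code.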

Confusion about how char *s and char s[] works at low level

I know similar questions, like this question, have been posted and answered here but those answers don't offer me the complete picture, hence I'm posting this as a new question. Hope that is ok.
See following snippets -
char s[9] = "foobar"; //ok
s[1] = 'z'; //also ok
And
char s[9];
s = "foobar"; //doesn't work. Why?
But see following cases -
char *s = "foobar"; //works
s[1] = 'z'; //doesn't work
char *s;
s = "foobar"; //unlike arrays, works here
It is a bit confusing. I mean I have vague understanding that we can't assign values to arrays. But we can modify it. In case of char *s, it seems we can assign values but can't modify it because it is written in read only memory. But still I can't get the full picture.
What exactly is happening at low level?
char s[9] = "foobar"; This is initialization. An array of 9 characters is declared and its contents receive the string "foobar", with any remaining characters set to '\0'.
s = "foobar" is just invalid C. You cannot assign a string to a char array. To make s hold the value foobar, use strcpy(s, "foobar");
char *s = "foobar"; is also initialization; however, this stores the address of the constant string foobar in the pointer variable s. Note that I say "constant string": a string literal is read-only on most platforms, and modifying it is undefined behavior on all of them. A better way of making this clear is to write const char *s = "foobar";
And indeed, your next assignment s[1] = 'z'; will not work, because the string that s points to is constant.
You need to understand what the expressions are actually doing, then it might come clear to you.
char s[9] = "foobar"; -> Initialize the char array s by the string literal "foobar". Correct.
s[1] = 'z' -> Assign the character constant 'z' to the second elem. of char array s. Correct.
char s[9]; s = "foobar"; -> Declare the char array s, then attempt to assign the string literal "foobar" to it. Not permissible. You can't assign arrays in C; you can only initialize an array of char with a string when defining the array itself. That's the difference. If you want to copy a string into an array of char, use strcpy(s, "foobar"); instead.
char *s = "foobar"; -> Define the pointer to char s and initialize it to point to the string literal "foobar". Correct.
s[1] = 'z'; -> Attempt to modify the string literal "foobar" to which s points. Not permissible: modifying a string literal is undefined behavior, and the literal is often stored in read-only memory.
char *s; s = "foobar"; -> Declare the pointer to char s. Then assign the pointer to point to the string literal "foobar". Correct.
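The permitted operations from the list above can be put together in one sketch. The function name update_both and the replacement text "bazqux" are assumptions chosen for illustration:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns 1 if both updates produced the expected text. */
int update_both(void)
{
    char arr[9] = "foobar";      /* initialization copies the literal in */
    const char *ptr = "foobar";  /* ptr refers to the literal itself */

    /* arr = "bazqux"; would not compile: arrays are not assignable */
    assert(strlen("bazqux") < sizeof arr);
    strcpy(arr, "bazqux");       /* copy new contents into the array */
    arr[1] = 'u';                /* elements stay writable: "buzqux" */

    ptr = "bazqux";              /* re-point the pointer at another literal */

    return strcmp(arr, "buzqux") == 0 && strcmp(ptr, "bazqux") == 0;
}
```

The array keeps its own storage and only its contents change, whereas the pointer simply ends up referring to a different literal.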
This declares array s with an initializer:
char s[9] = "foobar"; //ok
But this is an invalid assignment expression with array s on the left:
s = "foobar"; //doesn't work. Why?
Assignment expressions and declarations with initializers are not the same thing syntactically, although they both use an = in their syntax.
The reason that the assignment to the array s doesn't work is that the array decays to a pointer to its first element in the expression, so the assignment is equivalent to:
&(s[0]) = "foobar";
The assignment expression requires an lvalue on the left hand side, but the result of the & address operator is not an lvalue. Although the array s itself is an lvalue, the expression converts it to something that isn't an lvalue. Therefore, an array cannot be used on the left hand side of an assignment expression.
For the following:
char *s = "foobar"; //works
The string literal "foobar" is stored as an anonymous array of char and as an initializer it decays to a pointer to its first element. So the above is equivalent to:
char *s = &(("foobar")[0]); //works
The initializer has the same type as s (char *) so it is fine.
For the subsequent assignment:
s[1] = 'z'; //doesn't work
It is syntactically correct and violates no constraint, so the compiler is not required to diagnose it, but it results in undefined behavior. The rule being broken is that the anonymous arrays created by string literals must not be modified. Assignment to an element of such an array is a modification and not allowed.
The subsequent assignment:
s = "foobar"; //unlike arrays, works here
is equivalent to:
s = &(("foobar")[0]); //unlike arrays, works here
It is assigning a char * value to a variable of type char *, so it is fine.
Contrast the following use of the initializer "foobar":
char *s = "foobar"; //works
with its use in the earlier declaration:
char s[9] = "foobar"; //ok
There is a special initialization rule that allows an array of char to be initialized by a string literal optionally enclosed by braces. That initialization rule is being used to initialize char s[9].
The string literal used to initialize the array also creates an anonymous array of char (at least notionally) but there is no way to access that anonymous array of char, so it may get omitted from the output of the compiler. This is in contrast with the anonymous array of char created by the string literal used to initialize char *s which can be accessed via s.
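A short sketch of the special initialization rule, using the optional braces and checking that the unused tail of the array is zero-filled. The helper name trailing_zeros is hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts the zero bytes that follow the text in s. */
size_t trailing_zeros(void)
{
    char s[9] = { "foobar" };    /* braces are optional here */
    size_t used = strlen(s);     /* 6 characters of text */
    size_t zeros = 0;

    assert(used == 6);
    for (size_t i = used; i < sizeof s; i++) {
        if (s[i] == '\0') {
            zeros++;
        }
    }
    return zeros;                /* 3: elements 6, 7 and 8 */
}
```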
It may help to think of C as not allowing you to do anything with arrays except for assisting in a few special cases. C originated when programming languages did little more than help you move individual bytes and “words” (2 or maybe 4 bytes) around and do simple arithmetic and operations with them. With that in mind, let’s look at your examples:
char s[9] = "foobar"; //ok
This is one of the special cases: When you define an array of characters, the compiler will help you initialize it. In a definition, you may provide a string literal, which represents an array of characters, and the compiler will initialize your array with the contents of the string literal.
s[1] = 'z'; //also ok
Yes, this just moves the value of one character into one array element.
char s[9];
s = "foobar"; //doesn't work. Why?
This does not work because there is no assistance here. s and "foobar" are both arrays, but C has no provision for handling an array as one whole object.
However, although C does not handle an array as a whole object, it does provide some assistance for working with arrays. Since the compiler would not work with whole arrays, programmers needed some other ways to work with arrays. So C was given a feature that, when you used an array in an expression, the compiler would automatically convert it to a pointer to the first element of the array, and that would help the programmer write code to work with elements of the array. We see that in your next example:
char *s = "foobar"; //works
char *s declares s to be a pointer to char. Next, the string literal "foobar" represents an array. Above, we saw that using a string literal to initialize an array was a special case. However, here the string literal is not used to initialize an array. It is used to initialize a pointer, so the special case rules do not apply. In this case, the array represented by the string literal is automatically converted to a pointer to its first element. So s is initialized to be a pointer to the first element of the array containing “f”, “o”, “o”, “b”, “a”, “r”, and a null character.
s[1] = 'z'; //doesn't work
The arrays defined by string literals are intended to be constants. They are “read-only” in the sense that the C standard does not define what happens when you try to modify them. In many C implementations, they are assigned to memory that is read-only because the operating system and the computer hardware do not allow writing to it by normal program means. So s[1] = 'z'; may get an exception (trap) or a warning or error message from the compiler. (Ideally, char *s = "foobar"; would be disallowed because "foobar", being a constant, would have type const char [7]. However, because const did not exist in early C, the types of string literals do not have const.)
char *s;
s = "foobar"; //unlike arrays, works here
Here s is a char *, and the string literal "foobar" is automatically converted to a pointer to its first element, and that pointer is a char *, so the assignment is fine.

Why can't I change char array later?

char myArray[6]="Hello"; //declaring and initializing char array
printf("\n%s", myArray); //prints Hello
myArray="World"; //Compiler says "Error: expression must be a modifiable lvalue"
Why can't I change myArray later? I did not declare it with the const modifier.
When you write char myArray[6]="Hello"; inside a function, you are allocating 6 chars, typically on the stack (including a null terminator).
Yes, you can change individual elements; e.g. myArray[4] = '\0' will turn your string into "Hell" (as far as the C library string functions are concerned), but you can't reassign the array itself, because C does not allow assignment to an array as a whole.
Note that [const] char* myArray = "Hello"; is an entirely different beast: that string may live in read-only memory, and any change to it is undefined behaviour.
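The element-level change mentioned above can be sketched as follows. The function name truncate_hello is an assumption for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: shortens the string in place and returns its new length. */
size_t truncate_hello(void)
{
    char myArray[6] = "Hello";
    assert(sizeof myArray == 6);  /* the storage size never changes */
    myArray[4] = '\0';            /* string functions now see "Hell" */
    return strlen(myArray);
}
```

The array still occupies 6 bytes afterwards; only the position of the terminator, and therefore the string length, has changed.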
Array is a non modifiable lvalue. So you cannot modify it.
If you wish to modify the contents of the array, use strcpy.
Because the name of an array cannot be modified, just use strcpy:
strcpy(myArray, "World");
You can't assign to an array (except when initializing it in its declaration). Instead you have to copy to it, which you do using strcpy.
But be careful not to copy more than five characters to the array, as that's the longest string it can contain. And using strncpy in this case may be dangerous, as it may not add the terminating '\0' character if the source string is too long.
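One way to guard against the overflow described above is to check the length before copying. This is only a sketch under that assumption; copy_if_fits is a hypothetical helper, not a standard library function:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst only if it fits, including the '\0'.
   Returns 1 on success and 0 if src is too long (dst is left unchanged). */
int copy_if_fits(char *dst, size_t cap, const char *src)
{
    assert(dst != NULL && src != NULL);
    size_t len = strlen(src);
    if (len >= cap) {
        return 0;                 /* no room for the terminator */
    }
    memcpy(dst, src, len + 1);    /* copies the terminating '\0' too */
    return 1;
}
```

With char myArray[6], the call copy_if_fits(myArray, sizeof myArray, "World") succeeds, while a six-letter source such as "Worlds" is refused instead of overflowing the array.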
You can't assign strings to variables in C except in initializations. Use the strcpy() function to change values of string variables in C.
Well myArray is the name of the array which you cannot modify. It is illegal to assign a value to it.
Arrays in C are non-modifiable lvalues. There are no operations in C that can modify the array itself (only individual elements can be modifiable).
Well, myArray is of size 6, so care must be taken with strcpy(myArray, "World"): the call would overflow the array if the source string, including its terminating '\0', needed more than the destination's 6 bytes.
A possible and safe method would be
char *ptr = "Hello";
If you want to change
ptr = strdup("World");
NOTE:
Make sure that you free(ptr) at the end (only once it holds the result of strdup, never while it still points to the literal); otherwise it would result in a memory leak.
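strdup comes from POSIX and is not part of every C standard library, so the sketch below uses a hand-written stand-in built on malloc. The name dup_string is hypothetical:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strdup: returns a heap copy of s, or NULL on failure. */
char *dup_string(const char *s)
{
    assert(s != NULL);
    size_t n = strlen(s) + 1;     /* include the terminator */
    char *copy = malloc(n);
    if (copy != NULL) {
        memcpy(copy, s, n);
    }
    return copy;                  /* caller must free() the result */
}
```

The returned buffer is writable, unlike a string literal, and it should be passed to free() exactly once when it is no longer needed.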
You cannot assign naked arrays in C. However you can assign pointers:
char const *myPtr = "Hello";
myPtr = "World";
Or you can assign to the elements of an array:
char myArray[6] = "Hello";
myArray[0] = 'W';
strcpy(myArray, "World");

C: Behaviour of arrays when assigned to pointers

#include <stdio.h>
int main(void)
{
char *ptr;
ptr = "hello";
printf("%p %s", (void *)"hello", ptr); /* %p expects a void * */
getchar();
return 0;
}
Hi, I am trying to understand clearly how arrays get assigned to pointers. I notice that when you assign an array of chars to a pointer to char, as in ptr = "hello";, the array decays to a pointer. But in this case I am assigning an array of chars that is not stored in a variable. Does this kind of assignment set aside a memory address specially for "hello" (which is obviously what is happening), and is it possible to modify each element of "hello" at the memory address where this array is stored? As a comparison, is it fine for me to assign an array of ints to a pointer, with something as vague as int_ptr = 5,3,4,3;, so that the values 5, 3, 4, 3 get placed at a memory address as "hello" did? And if not, why is it possible only with strings? Thanks in advance.
"hello" is a string literal. It is a nameless non-modifiable object of type char [6]. It is an array, and it behaves the same way any other array does. The fact that it is nameless does not really change anything. You can use it with [] operator for example, as in "hello"[3] and so on. Just like any other array, it can and will decay to pointer in most contexts.
You cannot modify the contents of a string literal because it is non-modifiable by definition. It can be physically stored in read-only memory. It can overlap other string literals, if they contain common sub-sequences of characters.
Similar functionality exists for other array types through compound literal syntax
int *p = (int []) { 1, 2, 3, 4, 5 };
In this case the right-hand side is a nameless object of type int [5], which decays to int * pointer. Compound literals are modifiable though, meaning that you can do p[3] = 8 and thus replace 4 with 8.
You can also use compound literal syntax with char arrays and do
char *p = (char []) { "hello" };
In this case the right-hand side is a modifiable nameless object of type char [6].
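Both compound literals can be exercised in a small sketch, assuming a C99 compiler. The function names are hypothetical. Note that a compound literal written inside a function only lives until the enclosing block ends, so the pointers are not returned:

```c
#include <assert.h>

/* Hypothetical helper: modifies a compound literal and returns the changed element. */
int edit_compound_literal(void)
{
    int *p = (int []) { 1, 2, 3, 4, 5 };  /* unnamed int[5] */
    assert(p[3] == 4);
    p[3] = 8;                              /* allowed: compound literals are modifiable */
    return p[3];
}

/* Same idea with characters: the unnamed char[6] is a writable copy of "hello". */
char edit_char_compound_literal(void)
{
    char *s = (char []) { "hello" };
    s[0] = 'j';
    return s[0];
}
```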
The first thing you should do is read section 6 of the comp.lang.c FAQ.
The string literal "hello" is an expression of type char[6] (5 characters for "hello" plus one for the terminating '\0'). It refers to an anonymous array object with static storage duration, initialized at program startup to contain those 6 character values.
In most contexts, an expression of array type is implicitly converted to a pointer to the first element of the array; the exceptions are:
When it's the argument of sizeof (sizeof "hello" yields 6, not the size of a pointer);
When it's the argument of _Alignof (a new feature in C11);
When it's the argument of unary & (&arr yields the address of the entire array, not of its first element; same memory location, different type); and
When it's a string literal in an initializer used to initialize an array object (char s[6] = "hello"; copies the whole array, not just a pointer).
None of these exceptions apply to your code:
char *ptr;
ptr = "hello";
So the expression "hello" is converted to ("decays" to) a pointer to the first element ('h') of that anonymous array object I mentioned above.
So *ptr == 'h', and you can advance ptr through memory to access the other characters: 'e', 'l', 'l', 'o', and '\0'. This is what printf() does when you give it a "%s" format.
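As a rough illustration of the decay and of the sizeof exception, the sketch below walks a string through a pointer, much as "%s" does, and compares that with sizeof applied to the literal. The names walk_length and literal_size are made up:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: walks a string through a pointer and counts its characters. */
size_t walk_length(const char *p)
{
    assert(p != NULL);
    size_t n = 0;
    while (*p != '\0') {   /* stop at the terminator */
        p++;               /* advance to the next char */
        n++;
    }
    return n;
}

/* sizeof is one of the exceptions: no decay, so it sees the whole char[6]. */
size_t literal_size(void)
{
    return sizeof "hello";
}
```

walk_length("hello") counts 5 characters, while literal_size() reports 6 because the terminating '\0' is part of the array.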
That anonymous array object, associated with the string literal, is read-only, but not const. What that means is that any attempt to modify that array, or any of its elements, has undefined behavior (because the standard explicitly says so) -- but the compiler won't necessarily warn you about it. (C++ makes string literals const; doing the same thing in C would have broken existing code that was written before const was added to the language.) So no, you can't modify the elements of "hello" -- or at least you shouldn't try. And to make the compiler warn you if you try, you should declare the pointer as const:
const char *ptr; /* pointer to const char, not const pointer to char */
ptr = "hello";
(gcc has an option, -Wwrite-strings, that causes it to treat string literals as const. This will cause it to warn about some C code that's legal as far as the standard is concerned, but such code should probably be modified to use const.)
#include <stdio.h>
int main(void)
{
char *ptr;
ptr = "hello";
//instead of the two lines above you can write char *ptr = "hello";
printf("%p %s", (void *)"hello", ptr); /* %p expects a void * */
getchar();
return 0;
}
Here you have assigned the string literal "hello" to ptr. A string literal is typically stored in read-only memory, so you can't modify it. If you declare char ptr[] = "hello"; instead, then you can modify the array.
Say what?
Your code allocates 6 bytes of memory and initializes it with the values 'h', 'e', 'l', 'l', 'o', and '\0'.
It then allocates a pointer (the number of bytes for the pointer depends on the implementation) and sets the pointer's value to the start of the 6 bytes mentioned previously.
You can modify the elements of a writable array using syntax such as ptr[1] = 'a', but not here: these 6 bytes belong to a string literal, and modifying them is undefined behavior.
Syntactically, strings are a special case. Since C doesn't have a specific string type to speak of, it does offer some shortcuts to declaring them and such. But you can easily create the same type of structure as you did for a string using int, even if the syntax must be a bit different.

Why is literal string assignment in C not possible for array with specified length? [duplicate]

Possible Duplicate:
What is the difference between char s[] and char *s in C?
Do these statements about pointers have the same effect?
All this time I thought that whenever I need to copy a string (either a literal or one held in a variable) I need to use strcpy(). However, I recently found out this:
char a[]="test";
and this
char *a="test";
From what I understand the second type is unsafe and will print garbage in some cases. Is that correct? What made me even more curious is why the following doesn't work:
char a[5];
a="test";
or this
char a[];
a="test";
but this works however
char *a;
a="test";
I would be grateful if someone could clear things up a bit.
char a[]="test";
This declares and initializes an array of size 5 with the contents of "test".
char *a="test";
This declares and initializes a pointer to the literal "test". Attempting to modify a literal through a is undefined behavior (and probably results in the garbage you are seeing). It's not unsafe, it just can't be modified since literals are immutable.
char a[5];
a="test";
This fails even though both a and "test" have the exact same type, just like any other attempt to copy arrays through assignment.
char a[];
a="test";
This declares an array of unknown size. The declaration should be completed before being used.
char *a;
a="test";
This works just fine since "test" decays to a pointer to the literal's first element. Attempting to modify its contents is still undefined behavior.
Let's examine case by case:
char a[]="test";
This tells the compiler to allocate 5 bytes on the stack, put 't' 'e' 's' 't' and '\0' on it. Then the variable a points to where 't' was written and you have a pointer pointing to a valid location with 5 available spaces. (That is if you view a as a pointer. In truth, the compiler still treats a as a single custom type that consists of 5 chars. In an extreme case, you can imagine it something like struct { char a, b, c, d, e; } a;)
char *a="test";
"test" (which like I said is basically 't' 'e' 's' 't' and '\0') is stored somewhere in your program, say a "literal's area", and a is pointing to it. That area is not yours to modify but only to read. a by itself doesn't have any specific memory (I am not talking about the 4/8 bytes of pointer value).
char a[5];
a = "test";
You are telling the compiler to copy the contents of one string over to another one. This is not a simple operation. In the case of char a[] = "test"; it was rather simple because it was just 5 pushes on the stack. In this case however it is a loop that needs to copy 1 by 1.
Defining char a[]; well, I don't think that's even possible, is it? You are asking for a to be an array of a size that would be determined when initialized. When there is no initialization, it just doesn't make sense.
char *a;
a = "test";
You are defining a as a pointer to char. When you assign "test" to it, a just points to the literal; it doesn't have any specific memory for it though, exactly like the case of char *a = "test";
Like I said, assigning arrays (whether null-terminated arrays of char (string) or any other array) is a non-trivial task that the compiler doesn't do for you, that is why you have functions for it.
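For arrays that are not strings, the usual library function for copying is memcpy. A minimal sketch, with the hypothetical helper name copy_ints_demo:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copies one int array into another and returns the last element. */
int copy_ints_demo(void)
{
    int src[4] = { 5, 3, 4, 3 };
    int dst[4] = { 0 };

    /* dst = src; would not compile, so copy the bytes explicitly */
    assert(sizeof dst == sizeof src);
    memcpy(dst, src, sizeof dst);
    return dst[3];
}
```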
Do not confuse assignment and initialisation in C, they are different.
In C a string is not a data type, it is a convention, utilising an array and a nul terminator. Like any array, when you use its name in an assignment, it resolves to a mere pointer. You can assign a pointer, but that is not the same as assigning a string.

C strings pointer vs. arrays [duplicate]

Possible Duplicate:
What is the difference between char s[] and char *s in C?
Why is:
char *ptr = "Hello!";
different from:
char ptr[] = "Hello!";
Specifically, I don't see why you can use (*ptr)++ to change the value of 'H' in the array, but not the pointer.
Thanks!
You can (in general) use the expression (*ptr)++ to change the value that ptr points to when ptr is a pointer and not an array (ie., if ptr is declared as char* ptr).
However, in your first example:
char *ptr = "Hello!";
ptr is pointing to a literal string, and literal strings are not permitted to be modified (they may actually be stored in memory area which are not writable, such as ROM or memory pages marked as read-only).
In your second example,
char ptr[] = "Hello!";
The array is declared and the initialization actually copies the data in the string literal into the allocated array memory. That array memory is modifiable, so (*ptr)++ works.
Note: for your second declaration, the ptr identifier itself is an array identifier, not a pointer, and is not a modifiable lvalue, so it can't be changed (even though it converts readily to a pointer in most situations). For example, the expression ++ptr would be invalid. I think this is the point that some other answers are trying to make.
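The array case can be sketched as below. The function name bump_first is hypothetical, and the comment about the resulting letter assumes an ASCII-compatible character set:

```c
#include <assert.h>

/* Hypothetical helper: increments the first character of a writable array. */
char bump_first(void)
{
    char ptr[] = "Hello!";   /* writable copy of the literal */
    assert(ptr[0] == 'H');
    (*ptr)++;                /* ptr decays to &ptr[0]; 'H' becomes 'I' in ASCII */
    /* ++ptr; would not compile: an array is not a modifiable lvalue */
    return ptr[0];
}
```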
When pointing to a string literal, you should not declare the chars to be modifiable, and some compilers will warn you about this:
char *ptr = "Hello!"; /* WRONG, missing const! */
The reason, as noted by others, is that string literals may be stored in an immutable part of the program's memory.
The correct "annotation" is to make sure you have a pointer to constant char:
const char *ptr = "Hello!";
And now you see directly that you can't modify the text the pointer refers to.
Arrays automatically allocate space and can't be relocated or resized, while pointers are explicitly assigned to point to allocated space and can be made to point elsewhere.
Array names are read only!
If you use a string literal "Hello!", the literal itself becomes an array of 7 characters and gets stored somewhere in data memory. That memory may be read-only.
The statement
char *ptr = "Hello!";
defines a pointer to char and initializes it by storing in it the address of the beginning of the literal (that array of 7 characters mentioned earlier). Changing the contents of the memory pointed to by ptr is undefined behavior.
The statement
char ptr[] = "Hello!";
defines a char array (char ptr[7]) and initializes it, by copying characters from the literal to the array. The array can be modified.
In C, strings are arrays of characters.
A pointer is a variable that contains the memory location of another variable.
An array is a set of ordered data items.
When you apply (*ptr)++ to the pointer version, you get a segmentation fault.
In both versions (*ptr)++ adds 1 to the first character, not to the whole string; the difference is that the pointer refers to the string literal, which must not be modified, while the array holds a writable copy.
