Simple modification of C strings using pointers - c

I have two pointers to the same C string. If I increment the second pointer by one, and assign the value of the second pointer to that of the first, I expect the first character of the first string to be changed. For example:
#include "stdio.h"
int main() {
char* original_str = "ABC"; // Get pointer to "ABC"
char* off_by_one = original_str; // Duplicate pointer to "ABC"
off_by_one++; // Increment duplicate by one: now "BC"
*original_str = *off_by_one; // Set 1st char of one to 1st char of other
printf("%s\n", original_str); // Prints "ABC" (why not "BBC"?)
*original_str = *(off_by_one + 1); // Set 1st char of one to 2nd char of other
printf("%s\n", original_str); // Prints "ABC" (why not "CBC"?)
return 0;
}
This doesn't work. I'm sure I'm missing something obvious - I have very, very little experience with C.
Thanks for your help!

You are attempting to modify a string literal. String literals are not modifiable (i.e., they are read-only).
A program that attempts to modify a string literal exhibits undefined behavior: the program may be able to "successfully" modify the string literal, the program may crash (immediately or at a later time), a program may exhibit unusual and unexpected behavior, or anything else might happen. All bets are off when the behavior is undefined.
Your code declares original_str as a pointer to the string literal "ABC":
char* original_str = "ABC";
If you change this to:
char original_str[] = "ABC";
you should be good to go. This declares an array of char that is initialized with the contents of the string literal "ABC". The array is automatically given a size of four elements (at compile-time), because that is the size required to hold the string literal (including the null terminator).

The problem is that you can't modify the literal "ABC", which is read-only.
Try char original_str[] = "ABC", which declares an array holding a copy of the string that you can modify.

Related

Why pointers can't be used to index arrays? [duplicate]

I am trying to change value of character array components using a pointer. But I am not able to do so. Is there a fundamental difference between declaring arrays using the two different methods i.e. char A[] and char *A?
I tried accessing arrays using A[0] and it worked. But I am not able to change values of the array components.
{
char *A = "ab";
printf("%c\n", A[0]); //Works. I am able to access A[0]
A[0] = 'c'; //Segmentation fault. I am not able to edit A[0]
printf("%c\n", A[0]);
}
Expected output:
a
c
Actual output:
a
Segmentation fault
The difference is that char A[] defines an array and char *A does not.
The most important thing to remember is that arrays are not pointers.
In this declaration:
char *A = "ab";
the string literal "ab" creates an anonymous array object of type char[3] (2 plus 1 for the terminating '\0'). The declaration creates a pointer called A and initializes it to point to the initial character of that array.
The array object created by a string literal has static storage duration (meaning that it exists through the entire execution of your program) and does not allow you to modify it. (Strictly speaking an attempt to modify it has undefined behavior.) It really should be const char[3] rather than char[3], but for historical reasons it's not defined as const. You should use a pointer to const to refer to it:
const char *A = "ab";
so that the compiler will catch any attempts to modify the array.
In this declaration:
char A[] = "ab";
the string literal does the same thing, but the array object A is initialized with a copy of the contents of that array. The array A is modifiable because you didn't define it with const -- and because it's an array object you created, rather than one implicitly created by a string literal, you can modify it.
An array indexing expression, like A[0], actually requires a pointer as one of its operands (and an integer as the other). Very often that pointer will be the result of an array expression "decaying" to a pointer, but it can also be just a pointer -- as long as that pointer points to an element of an array object.
The relationship between arrays and pointers in C is complicated, and there's a lot of misinformation out there. I recommend reading section 6 of the comp.lang.c FAQ.
You can use either an array name or a pointer to refer to elements of an array object. You ran into a problem with an array object that's read-only. For example:
#include <stdio.h>
int main(void) {
char array_object[] = "ab"; /* array_object is writable */
char *ptr = array_object; /* or &array_object[0] */
printf("array_object[0] = '%c'\n", array_object[0]);
printf("ptr[0] = '%c'\n", ptr[0]);
}
Output:
array_object[0] = 'a'
ptr[0] = 'a'
String literals like "ab" are supposed to be immutable, like any other literal (you can't alter the value of a numeric literal like 1 or 3.1419, for example). Unlike numeric literals, however, string literals require some kind of storage to be materialized. Some implementations (such as the one you're using, apparently) store string literals in read-only memory, so attempting to change the contents of the literal will lead to a segfault.
The language definition leaves the behavior undefined - it may work as expected, it may crash outright, or it may do something else.
String literals are not meant to be overwritten; think of them as read-only. Overwriting one is undefined behavior, and in your case the result was a crash. To modify the string, use an array instead:
char A[3] = "ab";
A[0] = 'c';
Is there a fundamental difference between declaring arrays using the two different methods i.e. char A[] and char *A?
Yes, because the second one is not an array but a pointer.
The type of "ab" is char /*readonly*/ [3]. It is an array with immutable content. So when you want a pointer to that string literal, you should use a pointer to char const:
char const *foo = "ab";
That keeps you from altering the literal by accident. If you however want to use the string literal to initialize an array:
char foo[] = "ab"; // the size of the array is determined by the initializer
// here: 3 - the characters 'a', 'b' and '\0'
The elements of that array can then be modified.
Array indexing, by the way, is nothing more than syntactic sugar:
foo[bar]; /* is the same as */ *(foo + bar);
That's why one can do funny things like
"Hello!"[2]; /* 'l' but also */ 2["Hello!"]; // 'l'

How to change character in a string? [duplicate]

I want a function that receives a string as a parameter and changes characters in it:
char *strChanger(char *str);
I tried to implement it like this:
char *strChanger(char *str) {
if(str[0] != '\0') {
str[0] = 'a';
}
return str;
}
In the program it should be called like char *newstr = strChanger("hi");
But when I try to change a character in the string, the program crashes.
I did some experiments and found that:
// Works fine
char str[] = "hi";
str[0] = 'a';
// Crashes
char *str = "hi";
str[0] = 'a';
I don't understand the difference. Why doesn't the second block of code work?
Because modifying a string literal is undefined behavior, which in your case leads to a crash.
With char *newstr = strChanger("hi"), you pass the string literal directly and the function then tries to modify it.
Likewise, char *str = "hi"; makes str point to the string literal. More specifically, the string literal is an array that is converted to a pointer to its first element, and that pointer is assigned to str. Modifying the array through str is undefined behavior.
In the first case, char str[] = "hi"; declares a char array and initializes it with a copy of the string literal's contents. That copy is modifiable, so changing it works.
If your platform provides the POSIX function strdup, you can do this:
char *newstr = strChanger(strdup("hi"));
However, strdup returns NULL if it fails to allocate memory for the copy, so strChanger must check the pointer it receives before dereferencing it. The copy is heap-allocated, so once you are done with it you must release it with free(newstr).
From standard 6.4.5p7 under string literal
It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.
From 6.7.9p14 under initialization
An array of character type may be initialized by a character string literal or UTF-8 string literal, optionally enclosed in braces. Successive bytes of the string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array.

Looping char* gives me Access violation writing location 0x00CFB310

I am trying to print a simple C string like this:
char *cc = "HEllo";
for (char* inputPtr = cc; inputPtr[0]; inputPtr++) {
char c = inputPtr[0]++;
printf("%s",c);
}
but I'm getting:
Access violation writing location 0x00CFB310.
on:
char c = inputPtr[0]++;
What is wrong here?
It looks like you are experimenting with inputPtr[0] as a substitution for *inputPtr. In many contexts, the two expressions produce the same result.
However, the expression inputPtr[0]++ is not the same as *inputPtr++, because [0] has higher precedence than unary *, but the same precedence as suffix ++. Operators within this precedence level are applied left-to-right, so the first expression post-increments inputPtr[0], a character inside a string literal. This is undefined behavior, hence the crash.
If you replace inputPtr[0]++ with *inputPtr++ and remove inputPtr++ from loop header, your code is going to work fine:
for (char* inputPtr = cc; inputPtr[0]; ) {
char c = *inputPtr++;
printf("%c", c); // Replace %s with %c to print one character
}
inputPtr points at "HEllo", which is a string literal.
Modifying a string literal isn't allowed, and trying to do so invokes undefined behavior.
inputPtr[0]++ tries to modify the string literal. If the string literal's data is stored in a read-only location, this may cause a segmentation fault.
String literals in C are read-only; attempting to modify their characters leads to undefined behavior.
And with inputPtr[0]++ you do modify the string, since you increment the character inputPtr[0].
This is the reason you should always use const char * when pointing to string literals.
If you want to modify the contents of the string you have to create an array:
char cc[] = "HEllo";
As the others have written, you can't modify a string literal. Change your declaration to char cc[] = "HEllo"; and see what happens. The suggested declaration declares a string buffer that is modifiable.

Why does this small C program crash?

The program is:
#include <stdio.h>
#include <stdlib.h>
int main(void) {
char *a="abc",*ptr;
ptr=a;
ptr++;
*ptr='k';
printf("%c",*ptr);
return 0;
}
The problem is in the
*ptr='k';
line, when I remove it program works normally. But I can't figure out the reason.
The problem is that you are trying to change the string literal "abc" with:
char *a="abc",*ptr;
ptr=a; // ptr points to the 'a'.
ptr++; // now it points to the 'b'.
*ptr='k'; // now you try to change the 'b' to a 'k'.
That's undefined behaviour. The standard explicitly states that you are not permitted to change string literals as per section 6.4.5 String literals of C99:
It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.
It will work if you replace:
char *a="abc",*ptr;
with:
char a[]="abc",*ptr;
since that copies the string literal to a place that's safe to modify.
Because "abc" is a string literal. You point ptr at it and then try to modify it, which is undefined behaviour. Typically, string literals are put in a memory section that gets mapped as read-only, hence the access violation.
See also this question: String literals: Where do they go?
The reason is that your string "abc" typically lives in a read-only area of memory, placed there by the linker. You try to change it in your program, and all bets are off.
This:
char *a="abc";
should be treated as if it were:
const char *a="abc";
You can't modify what ptr points to, because ptr points into the same string literal as a.

Different string initialization yields different behavior?

How come, when I use the following loop to convert all the characters in a string to uppercase,
while (*postcode) {
*postcode = toupper(*postcode);
postcode++;
}
Using the following argument works,
char wrong[20];
strcpy(wrong, "la1 4yt");
But the following, doesn't, despite them being the same?
char* wrong = "la1 4yt";
My program crashes in an attempt to write to an illegal address (a segfault, I presume). Is it an issue with not mallocing? Not being null-terminated? It shouldn't be...
Through debugging I notice it crashes on the attempt to assign the first character as its uppercase.
Any help appreciated!
char* wrong = "la1 4yt";
This declares a pointer to a string constant. The constant cannot be modified, which is why your code crashes. If you wrote the more pedantic
const char* wrong = "la1 4yt"; // Better
then the compiler would catch the mistake. You should probably do this any time you declare a pointer to a string literal rather than creating an array.
This, on the other hand, allocates read/write storage for twenty characters so writing to the space is fine.
char wrong[20];
If you wanted to initialize it to the string above you could do so and then would be allowed to change it.
char wrong[20] = "la1 4yt"; // Can be modified
char wrong[] = "la1 4yt"; // Can be modified; only as large as required
char * whatever = "some const string";
Is read-only.
In the second variant, "la1 4yt" is a constant and is therefore typically placed in a read-only segment. Only the pointer (wrong) to the constant is writable. That's why you get the segfault. In the first example, however, everything is writable.
This one might be interesting: http://eli.thegreenplace.net/2009/10/21/are-pointers-and-arrays-equivalent-in-c/
See Question 8.5 in the C FAQ list.
When you do
char wrong[20] = "la1 4yt";
the compiler copies the elements of the string literal {'l', 'a', '1', ' ', '4', 'y', 't', '\0'} to the corresponding elements of the wrong array; when you do
char *wrong = "la1 4yt";
the compiler assigns to wrong the address of the string literal.
String literals are char[] (arrays of char), not const char[] ... but you cannot change them!!
Quote from the Standard:
6.4.5 String literals
6 It is unspecified whether these arrays are distinct provided
their elements have the appropriate values. If the program
attempts to modify such an array, the behavior is undefined.
When I use a string literal to initialize a char *, I usually also tell the compiler I will not be changing the contents of that string literal by adding a const to the definition.
const char *wrong = "la1 4yt";
Edit
Suppose you had
char *test1 = "example test";
char *test2 = "test";
And the compiler created 1 single string literal and used that single string literal to initialize both test1 and test2. If you were allowed to change the string literal ...
test1[10] = 'x'; /* attempt to change the 's' */
printf("%s\n", test2); /* print "text", not "test"! */

Resources