Queries on declaring & initialising string and also string memory region [duplicate] - c

This question already has answers here:
Why do I get a segmentation fault when writing to a "char *s" initialized with a string literal, but not "char s[]"?
(19 answers)
What is the difference between char s[] and char *s?
(14 answers)
Closed 3 years ago.
My question comes in three part.
char str[] = "hello!";
char *str1 = "heello!";
puts(str);
str1[1] = 1
puts(str1);
printf("%s\n", str1);
printf("%p\n", str1);
Ouput of code:
hello!
h1ello!
heello!
0x10735ef9b
1) I understand that when you're declaring a char * type, you're declaring a pointer that points to char type. And usually for pointer, since it contains the address of the variable its pointing to, when you print the pointer itself, you'll get the address stored in the pointer and only when you dereference the pointer, then you'll get the content stored in the variable the pointer is pointing to.
However for a char * type, why is it that it doesn't function like a regular pointer? You can print the content of the variable which the 'char *' pointer is pointing to without dereferencing it and also use it like a string variable(rather than a pointer) by changing the value of the str1[1] as shown above.
2) I was taught that you shouldn't declare a pointer variable as such in the topic of pointer:
char *str1 = "heello!";
because the pointer variable would be pointing to whatever address happens to be in the memory at that time, thus running a risk of changing the value in a memory location that you don't mean to and resulting in segfault.
Hence you should do the following:
char *str1;
char c;
str1 = &c;
*addr = "hello!"
By doing the above, you'll make sure that your pointer is pointing to the right location before dereferencing and writing to the location.
However in the notes on string, it says to declare and initialise as such:
char *str1 = "heello!";
I'm quite confused as to what I should follow now when declaring a char * pointer. Please help.
3) It says in my notes that the location of string is stored in a read only region of the memory called the text region.
Hence doing this:
char *str1 = "hello!";
str[1] = '.';
Will crash your program.
But when you do this:
char str2[7] = "hello!";
str2[1] = '.';
This is okay and will successfully change the second element in the string from 'e' to '.'
The explanation to this was that str1 points to a read only region in the memory, while str2 contains a copy of the string on the stack.
I do not understand how the above 2 different ways of declaring the string variable will make such a big difference when executing the code. Please help.
Thank you!

String literals like "heello!" are really read-only arrays of characters (including the string null-terminator).
With
char *str1 = "heello!";
you make str1 point to the first character of such an array.
And when you do str1[1] = 1 you attempt to modify the read-only array, which is forbidden and leads to undefined behavior (which can but doesn't have to lead to crashes).
That's why you either should use your own arrays for strings, or use const char * to point to literal strings.

Related

Why pointers can't be used to index arrays? [duplicate]

This question already has answers here:
Why do I get a segmentation fault when writing to a "char *s" initialized with a string literal, but not "char s[]"?
(19 answers)
Closed 3 years ago.
I am trying to change value of character array components using a pointer. But I am not able to do so. Is there a fundamental difference between declaring arrays using the two different methods i.e. char A[] and char *A?
I tried accessing arrays using A[0] and it worked. But I am not able to change values of the array components.
{
char *A = "ab";
printf("%c\n", A[0]); //Works. I am able to access A[0]
A[0] = 'c'; //Segmentation fault. I am not able to edit A[0]
printf("%c\n", A[0]);
}
Expected output:
a
c
Actual output:
a
Segmentation fault
The difference is that char A[] defines an array and char * does not.
The most important thing to remember is that arrays are not pointers.
In this declaration:
char *A = "ab";
the string literal "ab" creates an anonymous array object of type char[3] (2 plus 1 for the terminating '\0'). The declaration creates a pointer called A and initializes it to point to the initial character of that array.
The array object created by a string literal has static storage duration (meaning that it exists through the entire execution of your program) and does not allow you to modify it. (Strictly speaking an attempt to modify it has undefined behavior.) It really should be const char[3] rather than char[3], but for historical reasons it's not defined as const. You should use a pointer to const to refer to it:
const char *A = "ab";
so that the compiler will catch any attempts to modify the array.
In this declaration:
char A[] = "ab";
the string literal does the same thing, but the array object A is initialized with a copy of the contents of that array. The array A is modifiable because you didn't define it with const -- and because it's an array object you created, rather than one implicitly created by a string literal, you can modify it.
An array indexing expression, like A[0] actually requires a pointer as one if its operands (and an integer as the other). Very often that pointer will be the result of an array expression "decaying" to a pointer, but it can also be just a pointer -- as long as that pointer points to an element of an array object.
The relationship between arrays and pointers in C is complicated, and there's a lot of misinformation out there. I recommend reading section 6 of the comp.lang.c FAQ.
You can use either an array name or a pointer to refer to elements of an array object. You ran into a problem with an array object that's read-only. For example:
#include <stdio.h>
int main(void) {
char array_object[] = "ab"; /* array_object is writable */
char *ptr = array_object; /* or &array_object[0] */
printf("array_object[0] = '%c'\n", array_object[0]);
printf("ptr[0] = '%c'\n", ptr[0]);
}
Output:
array_object[0] = 'a'
ptr[0] = 'a'
String literals like "ab" are supposed to be immutable, like any other literal (you can't alter the value of a numeric literal like 1 or 3.1419, for example). Unlike numeric literals, however, string literals require some kind of storage to be materialized. Some implementations (such as the one you're using, apparently) store string literals in read-only memory, so attempting to change the contents of the literal will lead to a segfault.
The language definition leaves the behavior undefined - it may work as expected, it may crash outright, or it may do something else.
String literals are not meant to be overwritten, think of them as read-only. It is undefined behavior to overwrite the string and your computer chose to crash the program as a result. You can use an array instead to modify the string.
char A[3] = "ab";
A[0] = 'c';
Is there a fundamental difference between declaring arrays using the two different methods i.e. char A[] and char *A?
Yes, because the second one is not an array but a pointer.
The type of "ab" is char /*readonly*/ [3]. It is an array with immutable content. So when you want a pointer to that string literal, you should use a pointer to char const:
char const *foo = "ab";
That keeps you from altering the literal by accident. If you however want to use the string literal to initialize an array:
char foo[] = "ab"; // the size of the array is determined by the initializer
// here: 3 - the characters 'a', 'b' and '\0'
The elements of that array can then be modified.
Array-indexing btw is nothing more but syntactic sugar:
foo[bar]; /* is the same as */ *(foo + bar);
That's why one can do funny things like
"Hello!"[2]; /* 'l' but also */ 2["Hello!"]; // 'l'

How can i copy to string literal in C [duplicate]

This question already has answers here:
Why do I get a segmentation fault when writing to a "char *s" initialized with a string literal, but not "char s[]"?
(19 answers)
Closed 4 years ago.
char temp1[4] = "abc";
char *temp2 = "123";
strcpy(temp1,temp2);
if I want to copy a string literal to an array, it works well, but if I do it in the opposite way, I get an error:
char temp1[4] = "abc";
char *temp2 = "123";
strcpy(temp2,temp1);
The feedback from compiler is "Segmentation fault".
So what's the differences? Is there anyway to copy a string to a string literal?
Thx.
You need to understand the subtle difference between these 2 lines
char temp1[4] = "abc";
char *temp2 = "123";
The first one creates a 4 character variable and copies "abc\0" to it.
You can overwrite that if you want to. You can do e.g. temp1[0] = 'x' if you want.
The second one creates a pointer that points to the constant literal "123\0".
You cannot overwrite that, its typically in memory that is declared read only to the OS.
What you have is more complicated than a string literal and what you attempt to do cannot be described as "copying to string literal". Which is good, because copying to a string literal is literally impossible. (Excuse the pun.)
First, what you are successfully doing in the first code quote is copying from a string literal into an array of chars of size 4 (you knew that). You are however doing this with the added detail of copying via a pointer to that string literal (temp2). Also note that what the pointer is pointing to is not a variable which can be edited in any way. It is "just a string which the linker knows about".
In the second code quote you attempt to copy a string (strictly speaking a zero-terminated sequence of chars which is stored in an array, temp1, but not a string literal) to a place where a pointer to char (temp2) points to, but which happens not to be a variable which is legal to write to.
The types of the involved variables allow such an operation basically, but in this case it is forbidden/impossible; which causes the segmentation fault.
Now what IS possible and might be what you actually attempt, is to repoint temp2 to the address at the beginning of temp1. I believe that is what gives you the desired effect:
char temp1[4] = "abc";
char *temp2 = "123";
/* Some code, in which temp2 is used with a meaningful
initialisation value, which is represented by "123".
Then, I assume you want to change the pointer, so that it points
to a dynamically determined string, which is stored in a changeable
variable.
To do that: */
temp2=temp1;
/* But make sure to keep the variable, the name you chose makes me worry. */
Note that an array identifier can be used as a pointer to the type of the array entries.

Bus error 10 in C: string literals [duplicate]

This question already has answers here:
Why do I get a segmentation fault when writing to a "char *s" initialized with a string literal, but not "char s[]"?
(19 answers)
Closed 7 years ago.
Can someone explain why
char *s1 = "abcd";
char *s2 = s1;
s1[0] = "z";
s1[2] = "\0";
gives me a bus error 10 BUT
char s1[] = "abcd";
char *s2 = s1;
s1[0] = "z";
s1[2] = "\0";
doesn't?
are char *s1 and char s1[] not equivalent? Please explain, thanks.
Be free (of historical lazyness), be wise! Pointers are not arrays! Tutorials have lied you!
In the first example, you're modifying a pointer to a constant string literal, and that's undefined behaviour. Anything can happen then!
Meanwhile, in the second case, the string itself is stored inside the array, and the array itself is in the stack. Thus, the second example exposes more than a plain innocent array that's modifiable.
The s2 pointers make no difference in all this. IMHO, the fact that the first case is compilable is just historical lazyness, otherwise known as backwards compatibility.
BTW: Are you assigning string literals to chars? That's undefined behaviour too!
In the first case you set a s1 pointer to the address const string. Const string are stored in read -only area and you cannot modify it. This means that you cannot modify a character s[x]. It is UB
In the second case you declare a local array inited with a string. In this case only the init value is read-only and after init you use an allocated array that can be modified.

Segmentation Fault related to Pointer in c [duplicate]

This question already has answers here:
Segmentation fault with strcpy() [duplicate]
(7 answers)
Closed 9 years ago.
int main()
{
char *s="Hello";
*s="World";
printf("%s\n",s);
}
Why does the above program result in a segmentation fault?
int main()
{
char *s="Hello"; // makes 's' point to a constant
*s="World"; // modifies what 's' points to
printf("%s\n",s);
}
The first line of code makes s point to a constant. The second line tries to modify what s points to. So you are trying to modify a constant, which you can't do because a constant is ... well ... constant.
because *s is a char not a char*(string)
char *s="Hello";
declares a pointer to a string literal "Hello". This may exist in read-only memory so the line
*s="World";
is results in undefined behaviour. A crash is a valid (and useful) form of undefined behaviour.
Either of the following alternatives would work
const char* s = "Hello";
s="World";
printf("%s\n",s);
char s[16] = "Hello";
strcpy(s, "World";)
printf("%s\n",s);
s points to static (global) memory when it is created. You cannot reassign to that memory at run-time, hence, the crash.
*s is the first char of the string, so assigning string to character makes error.
If you want to assing string use s = "world"
int main()
{
char *s="Hello";
s="World";
printf("%s\n",s);
}
now try it will work.
char *s="hello"; Here s is in readonly location. So we can assign another string, but cannot rewrite new string.
s = "hello"; //work
strcpy(s, "hello"); //segmentation fault
There are two problems here.
The statement
*s = "World";
dereferences s, which gives you the first character of the string "Hello", or 'H'. So you're trying to assign a pointer value (the address of the string "World") to a single char object (the first character of the "Hello" string literal).
But...
On some systems (such as yours, apparently), string literals are stored in a read-only data segment, and attempting to modify read-only memory will lead to a runtime error on some systems. Hence the crash.
To change s to point to the "World" string literal, simply drop the dereference:
s = "World";
*s is the same as s[0]. s[0] has room to store a single character; in this case a 'W'.
There's not room to store the location of "World" in that character.
That's why you're getting a segmentation fault.

Confusion in basic concept of pointers and strings [duplicate]

This question already has answers here:
What is the difference between char s[] and char *s?
(14 answers)
Closed 9 years ago.
In my GCC 32-bit compiler, the following code gives the output
char *str1="United";
printf("%s",str1);
output:
United
then should I consider char *str1="United"; the same as char str1[]="United"?
The two are not the same: char *str1="United" gives you a pointer to a literal string, which must not be modified (else it's undefined behavior). The type of a literal string should really be const but it's non-const for historical reasons. On the other hand, char str1[]="United" gives you a modifiable local string.
char* str = "United"; is read-only. You wouldn't be able to reach inside the string and change parts of it:
*str = 'A';
Will most likely give you a segmentation fault.
On the other hand char str1[] = "United"; is an array and so it can be modified as long as you don't exceed the space allocated for it (arrays cannot be resized). For example this is perfectly legal:
char str[] = "United";
str[0] = 'A';
printf("%s\n", str);
This will print Anited.
See the comp.lang.c.faq, question 1.32. It basically boils down to the fact that declaring the string in array form (char str[] = "foo") is identical to char str[] = {'f','o','o'}, which is identical to char str[] = {102, 111, 111}; that is, it is a normal array on the stack. But when you use a string literal in any other context it becomes "an unnamed, static array of characters, [which] may be stored in read-only memory, and which therefore cannot necessarily be modified." (And trying to modify it results in undefined behavior wherever it happens to be stored, so don't).

Resources