Does this...
char* myString = "hello";
... have the same effect as this?
char actualString[] = "hello";
char* myString = actualString;
No.
char str1[] = "Hello world!"; //char-array on the stack; string can be changed
char* str2 = "Hello world!"; //char-array in the data-segment; it's READ-ONLY
The first example creates an array of size 13*sizeof(char) on the stack and copies the string "Hello world!" into it.
The second example creates a char* on the stack and points it to a location in the data-segment of the executable, which contains the string "Hello world!". This second string is READ-ONLY.
str1[1] = 'u'; //Valid
str2[1] = 'u'; //Invalid - MAY crash program!
No. The first one gives you a pointer to const data, and if you change any character via that pointer, it's undefined behavior. The second one copies the characters into an array, which isn't const, so you can change any characters (either directly in array, or via pointer) at will with no ill effects.
No. In the first one, you can't modify the string pointed by myString, in the second one you can. Read more here.
It isn't the same, because the unnamed array pointed to by myString in the first example is read-only and has static storage duration, whereas the named array in the second example is writeable and has automatic storage duration.
On the other hand, this is closer to being equivalent:
static const char actualString[] = "hello";
char* myString = (char *)actualString;
It's still not quite the same though, because the unnamed arrays created by string literals are not guaranteed to be unique, whereas explicit arrays are. So in the following example:
static const char string_a[] = "hello";
static const char string_b[] = "hello";
const char *ptr_a = string_a;
const char *ptr_b = string_b;
const char *ptr_c = "hello";
const char *ptr_d = "hello";
ptr_a and ptr_b are guaranteed to compare unequal, whereas ptr_c and ptr_d may be either equal or unequal - both are valid.
Related
I read a few similar questions, C: differences between char pointer and array, What is the difference between char s[] and char *s?, What is the difference between char array[] and char *array? but none of them seem to clear my doubt.
I'm aware that
char *s = "Hello world";
makes the string immutable whereas
char s[] = "Hello world";
can be modified.
My doubt is if I do char stringA[LEN]; and char* stringB[LEN]; Are they any different? Or does stringB again becomes immutable as in the case before?
Let me give you a visual explanation:
As you can see, a is an array of 4 characters, whereas b is an array of 4 character pointers, each pointing to the beginning of a C string. Each one of those strings can have a different length.
Are they any different?
Yes.
Both variables stringA and stringB are arrays. stringA is an array of char of size LEN and stringB is an array of char * of size LEN.
char and char * are two different types. stringA can hold only one character string of length LEN while elements of stingB can point to LEN number of strings.
Or does stringB again becomes immutable as in the case before?
Whether strings pointed by elements of stringB is mutable or not will depend on how memory is allocated. If they are initialized with string literals
char* stringB[LEN] = { "Apple", "Bapple", "Capple"};
then they are immutable. In case of
for(int i = 0; i < LEN; i++)
stringB[i] = malloc(30) // Allocating 30 bytes for each element
strcpy(stringB[0], "Apple");
strcpy(stringB[1], "Bapple");
strcpy(stringB[2], "Capple");
they are mutable.
They are not the same.
Here stringA is an array of char, that means printing stringA[0] will show the letter S:
char stringA[] = "Something";
Whereas printing stringB[0] here will show Something (array of pointer to char):
char* stringB[] = { "Something", "Else" };
They are not having the same datatype, less being comparable.
char stringA[LEN]; is a char array of LEN length. (Array of chars)
char* stringB[LEN]; is a char * array of LEN length. (Array of char pointers)
FWIW, in case of char *s = "Hello world"; s is a pointer which points to a string literal which is non-modifiable. The pointer itself can certainly be changed. Only the content it points to (values) cannot be changed.
The reason
char* s = "Hello World!";
is immutable is because "Hello World!" is stored in RO(Read Only) memory, sow hen you try to change it, it throws an error. Declaring it as a pointer to the first element of an "array" is NOT the reason it's immutable. You seem to be slightly confused as well. Assuming your questions is exactly what you mean,
char stringA[LEN] = "ABC";
is a string in traditional C style, but stringB as you've defined isn't a string, it's an array of strings -
char* stringB[LEN] = {"ABC", "DEF"};
Assuming you mean what I think you mean,
char stringA[LEN] = "Hello World!";
char *stringB = malloc(LEN);
strcpy(stringB, stringA);
In this case, stringB IS mutable, since it refers to writable memory.
Here is my question. in C, i saw code like this:
char *s = "this is a string";
but then, s is not actually pointing to an actual memory right?
and if you try to use s to modify the string, the result is undefined.
my question is, what is the point of assigning a string to the pointer
then?
thanks.
char *s = "this is a string";
This is a string literal. So the string is stored in read-only location and that memory address is returned to s . So when you try to write to the read-only location you see undefined behavior and might see a crash.
Q1:s is not actually pointing to an actual memory right?
You are wrong s is holding the memory address where this string is stored.
Q2:what is the point of assigning a string to the pointer then?
http://en.wikipedia.org/wiki/String_literal
When you do a char *s = "this is a string";, the memory is automatically allocated and populated with this string and a pointer to that memory is returned back to the caller (you). So, you do not need to explicitly put the string to some memory.
s is not actually pointing to an actual memory right?
Wrong, it does point to an actual memory whose allocation implementation is hidden from you. And this memory lies in the Read-Only sector of memory, so that it can't be changed/modified. Hence the keyword const as these literals are called constant literals.
if you try to use s to modify the string, the result is undefined.
Because, you are trying to modify memory which is marked as Read-Only.
what is the point of assigning a string to the pointer then?
Another way to achieve the same is,
char temp[260] = {0} ;
char *s ;
strcpy (temp, "this is a string");
s = temp ;
Here the memory temp is managed by you.
char *s = "this is a string" ;
Here the memory is managed by the OS.
By using const char * instead of a char [] the string will be stored in read only memory space. This allows the compiler to eliminate string duplication.
Try running this program:
#include <stdio.h>
int main()
{
const char *s1 = "This is a string";
const char *s2 = "This is a string";
if (s1 == s2) {
puts("s1 == s2");
} else {
puts("s1 != s2");
}
}
For the me it outputs s1 == s2 which means that the string pointers point to the same memory location.
Now try replacing const char * with char []:
int main()
{
const char s1[] = "This is a string";
const char s2[] = "This is a string";
if (s1 == s2) {
puts("s1 == s2");
} else {
puts("s1 != s2");
}
}
This outputs s1 != s2 which means that the compiler had to duplicate the string memory.
By using char * instead of char [] the compiler can do these optimizations that will decrease the size of you executable file.
Also note that you should not use char *s = "string". You should use const char *s = "string" instead. char *s is deprecated and unsafe. By using const char * you avoid the mistake of passing the string to a function that tries to modify the string.
s is not actually pointing to an actual memory right?
Technically, it is pointing to read-only memory. But the compiler is allowed to do whatever it wants as long as if follows the as-if rule. For example, if you never use s, it can be removed from your code completely.
Since it is read-only, any attempt to modify it is undefined behaviour. You can and should use const to indicate that the target of a pointer is immutable:
const char* s = "Hello const";
my question is, what is the point of assigning a string to the pointer then?
Just like storing a constant to any other type. You don't always need to modify strings. But you may want to pass a pointer to a string around to functions that don't care whether they point to a literal or to an array you own:
void foo(const char* str) {
// I won't modify the target of str. I don't care who owns it.
printf("foo: %s", str);
}
void bar(const char* str) {}
char* a = "Hello, this is a literal";
char b[] = "Hello, this is a char array and I own it";
foo(a);
bar(a);
foo(b);
Look at this code:
char *s = "this is a string";
printf("%s",s);
as you can see,I used "assigning a string to the pointer".Is that clear?
And know that s is pointing to an actual memory,but it is read-only.
If you are assigning like this,
char *s = "this is a string";
It will stored in the read only memory. So this is the reason for the undefined behavior. In this s will pointing to the some memory in the read-only area.
If you print the address like this you can get the some memory address.
printf("%p",s);
So in this case, if you allocate a memory and the copy the value to that pointer, you can access that pointer like array.
Everybody else has told you about the read-only memory and potential for undefined behavior if you attempt to modify the string, so I'll skip that part and answer the question, "what is the point of assigning a string to a pointer then?".
There are two reasons why
1) Just for brevity. After assigning the string to the pointer you can refer to the string as s instead of repeatedly typing "this is a string". This assumes of course that you intend to use the string in multiple function calls.
2) Because you may want to change the string that the pointer references. For example, in the following code, s is initialized assuming that the code will succeed, and is subsequently changed if there's a failure. At the end, the string that s points to is printed.
const char *s = "Yay, it worked!!!";
if ( openTheFile() == FAILED )
s = "Dang, couldn't open the file";
else if ( readTheFile() == FAILED )
s = "Oops, there's nothing in the file";
printf( "%s\n", s );
Note that const char * means that the string that s points to cannot be changed. It doesn't mean that s itself can't be changed.
In your case the pointer s is just pointing to the location that carries the first literal of the string. So if we want to change the string, it creates confusion as pointer s is pointing to the previous location. And if you want to change string using pointer, you should take care of the previous string's ending (i.e NULL).
Lately I've been learning all about the C language, and am confused as to when to use
char a[];
over
char *p;
when it comes to string manipulation. For instance, I can assign a string to them both like so:
char a[] = "Hello World!";
char *p = "Hello World!";
and view/access them both like:
printf("%s\n", a);
printf("%s\n", p);
and manipulate them both like:
printf("%c\n", &a[6]);
printf("%c\n", &p[6]);
So, what am I missing?
char a[] = "Hello World!";
This allocates modifiable array just big enough to hold the string literal (including terminating NUL char). Then it initializes the array with contents of string literal. If it is a local variable, then this effectively means it does memcpy at runtime, every time the local variable is created.
Use this when you need to modify the string, but don't need to make it bigger.
Also, if you have char *ap = a;, when a goes out of scope ap becomes a dangling pointer. Or, same thing, you can't do return a; when a is local to that function, because return value will be dangling pointer to now destroyed local variables of that function.
Note that using exactly this is rare. Usually you don't want an array with contents from string literal. It's much more common to have something like:
char buf[100]; // contents are undefined
snprintf(buf, sizeof buf, "%s/%s.%d", pathString, nameString, counter);
char *p = "Hello World!";
This defines pointer, and initializes it to point to string literal. Note that string literals are (normally) non-writable, so you really should have this instead:
const char *p = "Hello World!";
Use this when you need pointer to non-modifiable string.
In contrast to a above, if you have const char *p2 = p; or do return p;, these are fine, because pointer points to the string literal in program's constant data, and is valid for the whole execution of the program.
The string literals themselves, text withing double quotes, the actual bytes making up the strings, are created at compile time and normally placed with other constant data within the application. And then string literal in code concretely means address of this constant data blob.
char * strings are read-only. They cannot be modified while char[] strings can be.
char *str = "hello";
str[0] = 't'; // This is an illegal operation
Whereas
char str[] = "hello"; str[0] = 't'; // Legal, string becomes tello
I am trying to understand why the following code is illegal:
int main ()
{
char *c = "hello";
c[3] = 'g'; // segmentation fault here
return 0;
}
What is the compiler doing when it encounters char *c = "hello";?
The way I understand it, its an automatic array of char, and c is a pointer to the first char. If so, c[3] is like *(c + 3) and I should be able to make the assignment.
Just trying to understand the way the compiler works.
String constants are immutable. You cannot change them, even if you assign them to a char * (so assign them to a const char * so you don't forget).
To go into some more detail, your code is roughly equivalent to:
int main() {
static const char ___internal_string[] = "hello";
char *c = (char *)___internal_string;
c[3] = 'g';
return 0;
}
This ___internal_string is often allocated to a read-only data segment - any attempt to change the data there results in a crash (strictly speaking, other results can happen as well - this is an example of 'undefined behavior'). Due to historical reasons, however, the compiler lets you assign to a char *, giving you the false impression that you can modify it.
Note that if you did this, it would work:
char c[] = "hello";
c[3] = 'g'; // ok
This is because we're initializing a non-const character array. Although the syntax looks similar, it is treated differently by the compiler.
there's a difference between these:
char c[] = "hello";
and
char *c = "hello";
In the first case the compiler allocates space on the stack for 6 bytes (i.e. 5 bytes for "hello" and one for the null-terminator.
In the second case the compiler generates a static const string called "hello" in a global area (aka a string literal, and allocates a pointer on the stack that is initialized to point to that const string.
You cannot modify a const string, and that's why you're getting a segfault.
You can't change the contents of a string literal. You need to make a copy.
#include <string.h>
int main ()
{
char *c = strdup("hello"); // Make a copy of "hello"
c[3] = 'g';
free(c);
return 0;
}
void reverse(char *str){
int i,j;
char temp;
for(i=0,j=strlen(str)-1; i<j; i++, j--){
temp = *(str + i);
*(str + i) = *(str + j);
*(str + j) = temp;
printf("%c",*(str + j));
}
}
int main (int argc, char const *argv[])
{
char *str = "Shiv";
reverse(str);
printf("%s",str);
return 0;
}
When I use char *str = "Shiv" the lines in the swapping part of my reverse function i.e str[i]=str[j] dont seem to work, however if I declare str as char str[] = "Shiv", the swapping part works? What is the reason for this. I was a bit puzzled by the behavior, I kept getting the message "Bus error" when I tried to run the program.
When you use char *str = "Shiv";, you don't own the memory pointed to, and you're not allowed to write to it. The actual bytes for the string could be a constant inside the program's code.
When you use char str[] = "Shiv";, the 4(+1) char bytes and the array itself are on your stack, and you're allowed to write to them as much as you please.
The char *str = "Shiv" gets a pointer to a string constant, which may be loaded into a protected area of memory (e.g. part of the executable code) that is read only.
char *str = "Shiv";
This should be :
const char *str = "Shiv";
And now you'll have an error ;)
Try
int main (int argc, char const *argv[])
{
char *str = malloc(5*sizeof(char)); //4 chars + '\0'
strcpy(str,"Shiv");
reverse(str);
printf("%s",str);
free(str); //Not needed for such a small example, but to illustrate
return 0;
}
instead. That will get you read/write memory when using pointers. Using [] notation allocates space in the stack directly, but using const pointers doesn't.
String literals are non-modifiable objects in both C and C++. An attempt to modify a string literal always results in undefined behavior. This is exactly what you observe when you get your "Bus error" with
char *str = "Shiv";
variant. In this case your 'reverse' function will make an attempt to modify a string literal. Thus, the behavior is undefined.
The
char str[] = "Shiv";
variant will create a copy of the string literal in a modifiable array 'str', and then 'reverse' will operate on that copy. This will work fine.
P.S. Don't create non-const-qualified pointers to string literals. You first variant should have been
const char *str = "Shiv";
(note the extra 'const').
String literals (your "Shiv") are not modifiable.
You assign to a pointer the address of such a string literal, then you try to change the contents of the string literal by dereferencing the pointer value. That's a big NO-NO.
Declare str as an array instead:
char str[] = "Shiv";
This creates str as an array of 5 characters and copies the characters 'S', 'h', 'i', 'v' and '\0' to str[0], str[1], ..., str[4]. The values in each element of str are modifiable.
When I want to use a pointer to a string literal, I usually declare it const. That way, the compiler can help me by issuing a message when my code wants to change the contents of a string literal
const char *str = "Shiv";
Imagine you could do the same with integers.
/* Just having fun, this is not C! */
int *ptr = &5; /* address of 5 */
*ptr = 42; /* change 5 to 42 */
printf("5 + 1 is %d\n", *(&5) + 1); /* 6? or 43? :) */
Quote from the Standard:
6.4.5 String literals
...
6 ... If the program attempts to modify such an array [a string literal], the behavior is undefined.
char *str is a pointer / reference to a block of characters (the string). But its sitting somewhere in a block of memory so you cannot just assign it like that.
Interesting that I've never noticed this. I was able to replicate this condition in VS2008 C++.
Typically, it is a bad idea to do in-place modification of constants.
In any case, this post explains this situation pretty clearly.
The first (char[]) is local data you can edit
(since the array is local data).
The second (char *) is a local pointer to
global, static (constant) data. You
are not allowed to modify constant
data.
If you have GNU C, you can compile
with -fwritable-strings to keep the
global string from being made
constant, but this is not recommended.