Lately I've been learning all about the C language, and am confused as to when to use
char a[];
over
char *p;
when it comes to string manipulation. For instance, I can assign a string to them both like so:
char a[] = "Hello World!";
char *p = "Hello World!";
and view/access them both like:
printf("%s\n", a);
printf("%s\n", p);
and manipulate them both like:
printf("%c\n", &a[6]);
printf("%c\n", &p[6]);
So, what am I missing?
char a[] = "Hello World!";
This allocates modifiable array just big enough to hold the string literal (including terminating NUL char). Then it initializes the array with contents of string literal. If it is a local variable, then this effectively means it does memcpy at runtime, every time the local variable is created.
Use this when you need to modify the string, but don't need to make it bigger.
Also, if you have char *ap = a;, when a goes out of scope ap becomes a dangling pointer. Or, same thing, you can't do return a; when a is local to that function, because return value will be dangling pointer to now destroyed local variables of that function.
Note that using exactly this is rare. Usually you don't want an array with contents from string literal. It's much more common to have something like:
char buf[100]; // contents are undefined
snprintf(buf, sizeof buf, "%s/%s.%d", pathString, nameString, counter);
char *p = "Hello World!";
This defines pointer, and initializes it to point to string literal. Note that string literals are (normally) non-writable, so you really should have this instead:
const char *p = "Hello World!";
Use this when you need pointer to non-modifiable string.
In contrast to a above, if you have const char *p2 = p; or do return p;, these are fine, because pointer points to the string literal in program's constant data, and is valid for the whole execution of the program.
The string literals themselves, text withing double quotes, the actual bytes making up the strings, are created at compile time and normally placed with other constant data within the application. And then string literal in code concretely means address of this constant data blob.
char * strings are read-only. They cannot be modified while char[] strings can be.
char *str = "hello";
str[0] = 't'; // This is an illegal operation
Whereas
char str[] = "hello"; str[0] = 't'; // Legal, string becomes tello
Related
I have been struggling for a few hours with all sorts of C tutorials and books related to pointers but what I really want to know is if it's possible to change a char pointer once it's been created.
This is what I have tried:
char *a = "This is a string";
char *b = "new string";
a[2] = b[1]; // Causes a segment fault
*b[2] = b[1]; // This almost seems like it would work but the compiler throws an error.
So is there any way to change the values inside the strings rather than the pointer addresses?
When you write a "string" in your source code, it gets written directly into the executable because that value needs to be known at compile time (there are tools available to pull software apart and find all the plain text strings in them). When you write char *a = "This is a string", the location of "This is a string" is in the executable, and the location a points to, is in the executable. The data in the executable image is read-only.
What you need to do (as the other answers have pointed out) is create that memory in a location that is not read only--on the heap, or in the stack frame. If you declare a local array, then space is made on the stack for each element of that array, and the string literal (which is stored in the executable) is copied to that space in the stack.
char a[] = "This is a string";
you can also copy that data manually by allocating some memory on the heap, and then using strcpy() to copy a string literal into that space.
char *a = malloc(256);
strcpy(a, "This is a string");
Whenever you allocate space using malloc() remember to call free() when you are finished with it (read: memory leak).
Basically, you have to keep track of where your data is. Whenever you write a string in your source, that string is read only (otherwise you would be potentially changing the behavior of the executable--imagine if you wrote char *a = "hello"; and then changed a[0] to 'c'. Then somewhere else wrote printf("hello");. If you were allowed to change the first character of "hello", and your compiler only stored it once (it should), then printf("hello"); would output cello!)
No, you cannot modify it, as the string can be stored in read-only memory. If you want to modify it, you can use an array instead e.g.
char a[] = "This is a string";
Or alternately, you could allocate memory using malloc e.g.
char *a = malloc(100);
strcpy(a, "This is a string");
free(a); // deallocate memory once you've done
A lot of folks get confused about the difference between char* and char[] in conjunction with string literals in C. When you write:
char *foo = "hello world";
...you are actually pointing foo to a constant block of memory (in fact, what the compiler does with "hello world" in this instance is implementation-dependent.)
Using char[] instead tells the compiler that you want to create an array and fill it with the contents, "hello world". foo is the a pointer to the first index of the char array. They both are char pointers, but only char[] will point to a locally allocated and mutable block of memory.
The memory for a & b is not allocated by you. The compiler is free to choose a read-only memory location to store the characters. So if you try to change it may result in seg fault. So I suggest you to create a character array yourself. Something like: char a[10]; strcpy(a, "Hello");
It seems like your question has been answered but now you might wonder why char *a = "String" is stored in read-only memory. Well, it is actually left undefined by the c99 standard but most compilers choose to it this way for instances like:
printf("Hello, World\n");
c99 standard(pdf) [page 130, section 6.7.8]:
The declaration:
char s[] = "abc", t[3] = "abc";
defines "plain" char array objects s and t whose elements are initialized with character string literals.
This declaration is identical to char
s[] = { 'a', 'b', 'c', '\0' }, t[] = { 'a', 'b', 'c' };
The contents of the arrays are modifiable. On the other hand, the declaration
char *p = "abc";
defines p with type "pointer to char" and initializes it to point to an object with type "array of char" with length 4 whose elements are initialized with a character string literal. If an attempt is made to use p to modify the contents of the array, the behavior is undefined.
You could also use strdup:
The strdup() function returns a pointer to a new string which is a duplicate of the string s.
Memory for the new string is obtained with malloc(3), and can be freed with free(3).
For you example:
char *a = strdup("stack overflow");
All are good answers explaining why you cannot modify string literals because they are placed in read-only memory. However, when push comes to shove, there is a way to do this. Check out this example:
#include <sys/mman.h>
#include <unistd.h>
#include <stddef.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
int take_me_back_to_DOS_times(const void *ptr, size_t len);
int main()
{
const *data = "Bender is always sober.";
printf("Before: %s\n", data);
if (take_me_back_to_DOS_times(data, sizeof(data)) != 0)
perror("Time machine appears to be broken!");
memcpy((char *)data + 17, "drunk!", 6);
printf("After: %s\n", data);
return 0;
}
int take_me_back_to_DOS_times(const void *ptr, size_t len)
{
int pagesize;
unsigned long long pg_off;
void *page;
pagesize = sysconf(_SC_PAGE_SIZE);
if (pagesize < 0)
return -1;
pg_off = (unsigned long long)ptr % (unsigned long long)pagesize;
page = ((char *)ptr - pg_off);
if (mprotect(page, len + pg_off, PROT_READ | PROT_WRITE | PROT_EXEC) == -1)
return -1;
return 0;
}
I have written this as part of my somewhat deeper thoughts on const-correctness, which you might find interesting (I hope :)).
Hope it helps. Good Luck!
You need to copy the string into another, not read-only memory buffer and modify it there. Use strncpy() for copying the string, strlen() for detecting string length, malloc() and free() for dynamically allocating a buffer for the new string.
For example (C++ like pseudocode):
int stringLength = strlen( sourceString );
char* newBuffer = malloc( stringLength + 1 );
// you should check if newBuffer is 0 here to test for memory allocaton failure - omitted
strncpy( newBuffer, sourceString, stringLength );
newBuffer[stringLength] = 0;
// you can now modify the contents of newBuffer freely
free( newBuffer );
newBuffer = 0;
char *a = "stack overflow";
char *b = "new string, it's real";
int d = strlen(a);
b = malloc(d * sizeof(char));
b = strcpy(b,a);
printf("%s %s\n", a, b);
Let's say I have a char pointer called string1 that points to the first character in the word "hahahaha". I want to create a char[] that contains the same string that string1 points to.
How come this does not work?
char string2[] = string1;
"How come this does not work?"
Because that's not how the C language was defined.
You can create a copy using strdup() [Note that strdup() is not ANSI C]
Refs:
C string handling
strdup() - what does it do in C?
1) pointer string2 == pointer string1
change in value of either will change the other
From poster poida
char string1[] = "hahahahaha";
char* string2 = string1;
2) Make a Copy
char string1[] = "hahahahaha";
char string2[11]; /* allocate sufficient memory plus null character */
strcpy(string2, string1);
change in value of one of them will not change the other
What you write like this:
char str[] = "hello";
... actually becomes this:
char str[] = {'h', 'e', 'l', 'l', 'o'};
Here we are implicitly invoking something called the initializer.
Initializer is responsible for making the character array, in the above scenario.
Initializer does this, behind the scene:
char str[5];
str[0] = 'h';
str[1] = 'e';
str[2] = 'l';
str[3] = 'l';
str[4] = 'o';
C is a very low level language. Your statement:
char str[] = another_str;
doesn't make sense to C.
It is not possible to assign an entire array, to another in C. You have to copy letter by letter, either manually or using the strcpy() function.
In the above statement, the initializer does not know the length of the another_str array variable. If you hard code the string instead of putting another_str, then it will work.
Some other languages might allow to do such things... but you can't expect a manual car to switch gears automatically. You are in charge of it.
In C you have to reserve memory to hold a string.
This is done automatically when you define a constant string, and then assign to a char[].
On the other hand, when you write string2 = string1,
what you are actually doing is assigning the memory addresses of pointer-to-char objects. If string2 is declares as char* (pointer-to-char), then it is valid the assignment:
char* string2 = "Hello.";
The variable string2 now holds the address of the first character of the constanta array of char "Hello.".
It is fine, also, to write string2 = string1 when string2 is a char* and string1 is a char[].
However, it is supposed that a char[] has constant address in memory. Is not modifiable.
So, it is not allowed to write sentences like that:
char string2[];
string2 = (something...);
However, you are able to modify the individual characters of string2, because is an array of characters:
string2[0] = 'x'; /* That's ok! */
Does this...
char* myString = "hello";
... have the same effect as this?
char actualString[] = "hello";
char* myString = actualString;
No.
char str1[] = "Hello world!"; //char-array on the stack; string can be changed
char* str2 = "Hello world!"; //char-array in the data-segment; it's READ-ONLY
The first example creates an array of size 13*sizeof(char) on the stack and copies the string "Hello world!" into it.
The second example creates a char* on the stack and points it to a location in the data-segment of the executable, which contains the string "Hello world!". This second string is READ-ONLY.
str1[1] = 'u'; //Valid
str2[1] = 'u'; //Invalid - MAY crash program!
No. The first one gives you a pointer to const data, and if you change any character via that pointer, it's undefined behavior. The second one copies the characters into an array, which isn't const, so you can change any characters (either directly in array, or via pointer) at will with no ill effects.
No. In the first one, you can't modify the string pointed by myString, in the second one you can. Read more here.
It isn't the same, because the unnamed array pointed to by myString in the first example is read-only and has static storage duration, whereas the named array in the second example is writeable and has automatic storage duration.
On the other hand, this is closer to being equivalent:
static const char actualString[] = "hello";
char* myString = (char *)actualString;
It's still not quite the same though, because the unnamed arrays created by string literals are not guaranteed to be unique, whereas explicit arrays are. So in the following example:
static const char string_a[] = "hello";
static const char string_b[] = "hello";
const char *ptr_a = string_a;
const char *ptr_b = string_b;
const char *ptr_c = "hello";
const char *ptr_d = "hello";
ptr_a and ptr_b are guaranteed to compare unequal, whereas ptr_c and ptr_d may be either equal or unequal - both are valid.
I have been struggling for a few hours with all sorts of C tutorials and books related to pointers but what I really want to know is if it's possible to change a char pointer once it's been created.
This is what I have tried:
char *a = "This is a string";
char *b = "new string";
a[2] = b[1]; // Causes a segment fault
*b[2] = b[1]; // This almost seems like it would work but the compiler throws an error.
So is there any way to change the values inside the strings rather than the pointer addresses?
When you write a "string" in your source code, it gets written directly into the executable because that value needs to be known at compile time (there are tools available to pull software apart and find all the plain text strings in them). When you write char *a = "This is a string", the location of "This is a string" is in the executable, and the location a points to, is in the executable. The data in the executable image is read-only.
What you need to do (as the other answers have pointed out) is create that memory in a location that is not read only--on the heap, or in the stack frame. If you declare a local array, then space is made on the stack for each element of that array, and the string literal (which is stored in the executable) is copied to that space in the stack.
char a[] = "This is a string";
you can also copy that data manually by allocating some memory on the heap, and then using strcpy() to copy a string literal into that space.
char *a = malloc(256);
strcpy(a, "This is a string");
Whenever you allocate space using malloc() remember to call free() when you are finished with it (read: memory leak).
Basically, you have to keep track of where your data is. Whenever you write a string in your source, that string is read only (otherwise you would be potentially changing the behavior of the executable--imagine if you wrote char *a = "hello"; and then changed a[0] to 'c'. Then somewhere else wrote printf("hello");. If you were allowed to change the first character of "hello", and your compiler only stored it once (it should), then printf("hello"); would output cello!)
No, you cannot modify it, as the string can be stored in read-only memory. If you want to modify it, you can use an array instead e.g.
char a[] = "This is a string";
Or alternately, you could allocate memory using malloc e.g.
char *a = malloc(100);
strcpy(a, "This is a string");
free(a); // deallocate memory once you've done
A lot of folks get confused about the difference between char* and char[] in conjunction with string literals in C. When you write:
char *foo = "hello world";
...you are actually pointing foo to a constant block of memory (in fact, what the compiler does with "hello world" in this instance is implementation-dependent.)
Using char[] instead tells the compiler that you want to create an array and fill it with the contents, "hello world". foo is the a pointer to the first index of the char array. They both are char pointers, but only char[] will point to a locally allocated and mutable block of memory.
The memory for a & b is not allocated by you. The compiler is free to choose a read-only memory location to store the characters. So if you try to change it may result in seg fault. So I suggest you to create a character array yourself. Something like: char a[10]; strcpy(a, "Hello");
It seems like your question has been answered but now you might wonder why char *a = "String" is stored in read-only memory. Well, it is actually left undefined by the c99 standard but most compilers choose to it this way for instances like:
printf("Hello, World\n");
c99 standard(pdf) [page 130, section 6.7.8]:
The declaration:
char s[] = "abc", t[3] = "abc";
defines "plain" char array objects s and t whose elements are initialized with character string literals.
This declaration is identical to char
s[] = { 'a', 'b', 'c', '\0' }, t[] = { 'a', 'b', 'c' };
The contents of the arrays are modifiable. On the other hand, the declaration
char *p = "abc";
defines p with type "pointer to char" and initializes it to point to an object with type "array of char" with length 4 whose elements are initialized with a character string literal. If an attempt is made to use p to modify the contents of the array, the behavior is undefined.
You could also use strdup:
The strdup() function returns a pointer to a new string which is a duplicate of the string s.
Memory for the new string is obtained with malloc(3), and can be freed with free(3).
For you example:
char *a = strdup("stack overflow");
All are good answers explaining why you cannot modify string literals because they are placed in read-only memory. However, when push comes to shove, there is a way to do this. Check out this example:
#include <sys/mman.h>
#include <unistd.h>
#include <stddef.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
int take_me_back_to_DOS_times(const void *ptr, size_t len);
int main()
{
const *data = "Bender is always sober.";
printf("Before: %s\n", data);
if (take_me_back_to_DOS_times(data, sizeof(data)) != 0)
perror("Time machine appears to be broken!");
memcpy((char *)data + 17, "drunk!", 6);
printf("After: %s\n", data);
return 0;
}
int take_me_back_to_DOS_times(const void *ptr, size_t len)
{
int pagesize;
unsigned long long pg_off;
void *page;
pagesize = sysconf(_SC_PAGE_SIZE);
if (pagesize < 0)
return -1;
pg_off = (unsigned long long)ptr % (unsigned long long)pagesize;
page = ((char *)ptr - pg_off);
if (mprotect(page, len + pg_off, PROT_READ | PROT_WRITE | PROT_EXEC) == -1)
return -1;
return 0;
}
I have written this as part of my somewhat deeper thoughts on const-correctness, which you might find interesting (I hope :)).
Hope it helps. Good Luck!
You need to copy the string into another, not read-only memory buffer and modify it there. Use strncpy() for copying the string, strlen() for detecting string length, malloc() and free() for dynamically allocating a buffer for the new string.
For example (C++ like pseudocode):
int stringLength = strlen( sourceString );
char* newBuffer = malloc( stringLength + 1 );
// you should check if newBuffer is 0 here to test for memory allocaton failure - omitted
strncpy( newBuffer, sourceString, stringLength );
newBuffer[stringLength] = 0;
// you can now modify the contents of newBuffer freely
free( newBuffer );
newBuffer = 0;
char *a = "stack overflow";
char *b = "new string, it's real";
int d = strlen(a);
b = malloc(d * sizeof(char));
b = strcpy(b,a);
printf("%s %s\n", a, b);
void reverse(char *str){
int i,j;
char temp;
for(i=0,j=strlen(str)-1; i<j; i++, j--){
temp = *(str + i);
*(str + i) = *(str + j);
*(str + j) = temp;
printf("%c",*(str + j));
}
}
int main (int argc, char const *argv[])
{
char *str = "Shiv";
reverse(str);
printf("%s",str);
return 0;
}
When I use char *str = "Shiv" the lines in the swapping part of my reverse function i.e str[i]=str[j] dont seem to work, however if I declare str as char str[] = "Shiv", the swapping part works? What is the reason for this. I was a bit puzzled by the behavior, I kept getting the message "Bus error" when I tried to run the program.
When you use char *str = "Shiv";, you don't own the memory pointed to, and you're not allowed to write to it. The actual bytes for the string could be a constant inside the program's code.
When you use char str[] = "Shiv";, the 4(+1) char bytes and the array itself are on your stack, and you're allowed to write to them as much as you please.
The char *str = "Shiv" gets a pointer to a string constant, which may be loaded into a protected area of memory (e.g. part of the executable code) that is read only.
char *str = "Shiv";
This should be :
const char *str = "Shiv";
And now you'll have an error ;)
Try
int main (int argc, char const *argv[])
{
char *str = malloc(5*sizeof(char)); //4 chars + '\0'
strcpy(str,"Shiv");
reverse(str);
printf("%s",str);
free(str); //Not needed for such a small example, but to illustrate
return 0;
}
instead. That will get you read/write memory when using pointers. Using [] notation allocates space in the stack directly, but using const pointers doesn't.
String literals are non-modifiable objects in both C and C++. An attempt to modify a string literal always results in undefined behavior. This is exactly what you observe when you get your "Bus error" with
char *str = "Shiv";
variant. In this case your 'reverse' function will make an attempt to modify a string literal. Thus, the behavior is undefined.
The
char str[] = "Shiv";
variant will create a copy of the string literal in a modifiable array 'str', and then 'reverse' will operate on that copy. This will work fine.
P.S. Don't create non-const-qualified pointers to string literals. You first variant should have been
const char *str = "Shiv";
(note the extra 'const').
String literals (your "Shiv") are not modifiable.
You assign to a pointer the address of such a string literal, then you try to change the contents of the string literal by dereferencing the pointer value. That's a big NO-NO.
Declare str as an array instead:
char str[] = "Shiv";
This creates str as an array of 5 characters and copies the characters 'S', 'h', 'i', 'v' and '\0' to str[0], str[1], ..., str[4]. The values in each element of str are modifiable.
When I want to use a pointer to a string literal, I usually declare it const. That way, the compiler can help me by issuing a message when my code wants to change the contents of a string literal
const char *str = "Shiv";
Imagine you could do the same with integers.
/* Just having fun, this is not C! */
int *ptr = &5; /* address of 5 */
*ptr = 42; /* change 5 to 42 */
printf("5 + 1 is %d\n", *(&5) + 1); /* 6? or 43? :) */
Quote from the Standard:
6.4.5 String literals
...
6 ... If the program attempts to modify such an array [a string literal], the behavior is undefined.
char *str is a pointer / reference to a block of characters (the string). But its sitting somewhere in a block of memory so you cannot just assign it like that.
Interesting that I've never noticed this. I was able to replicate this condition in VS2008 C++.
Typically, it is a bad idea to do in-place modification of constants.
In any case, this post explains this situation pretty clearly.
The first (char[]) is local data you can edit
(since the array is local data).
The second (char *) is a local pointer to
global, static (constant) data. You
are not allowed to modify constant
data.
If you have GNU C, you can compile
with -fwritable-strings to keep the
global string from being made
constant, but this is not recommended.