I'm quite confused because from what I've learned, pointers store the addresses of the data they point to. But in some code, I often see strings assigned to pointers during initialization.
What exactly happens to the string?
Does the pointer automatically assign an address to store the string and point itself to that address?
How does "dereferencing" work with pointers to strings?
In case of
char *p = "String";
the compiler allocates memory for "String" (most likely in a read-only data section) and sets the pointer p to point to the first byte of that memory.
p --------------+
|
|
V
+------+------+------+------+------+------+------+
| | | | | | | |
| 'S' | 't' | 'r' | 'i' | 'n' | 'g' | '\0' |
| | | | | | | |
+------+------+------+------+------+------+------+
x100 x101 x102 x103 x104 x105 x106
Q: I see strings often assigned to pointers during initialization.
I think what you are calling a string is actually a string literal.
According to C11 standard, chapter §6.4.5
A character string literal is a sequence of zero or more multibyte characters enclosed in
double-quotes, as in "xyz". [...]
The expression "xyz" produces the address of the first element of the string literal, which is then stored in the pointer, as you've seen at initialization time.
Q: Does the pointer automatically assign an address to store the string and point itself to that address?
A: No, the memory for storing the string literal is allocated at compile time by the compiler. Whether a string literal is stored in read-only or read-write memory is compiler dependent. The standard only says that any attempt to modify a string literal results in undefined behavior.
Q: How does "dereferencing" work with pointers to strings?
A: Just the same way as with a pointer to any other variable.
hello to all programmers, I can't understand something
1. char a[]="hello";
2. char* b="salam";
The first question is: why can't we modify 2, for example with b[0]='m'? I know that 2 gets stored as a compile-time constant, but I can't understand what that means. What is the nature of 2?
and second question:
3.
char a[]="hello";
char* c=a;
c[0]='r';
Now we can modify and then print c, but we couldn't modify 2! Why?
I can't understand the concept behind these pointers. Please explain it to me.
char a[] = "hello"; is a null-terminated array of characters. The array is initialized with the characters you specify, and its size is deduced by the compiler; in this case it has space for 6 characters. The characters are copied into the array and are mutable, so you can change them at will, e.g. a[0] = 'x' will change hello to xello.
char* c = a; just makes the pointer c point to a. The same operations can be performed through c, as you are really operating on a.
char* b = "salam" is a different animal: b is a pointer to a string literal. String literals are not meant to be modified; they don't get copied into an array like a, and they are usually stored in some read-only section of memory. Either way, the behavior of modifying the string through b is undefined, i.e. b[0] = 'x' is not valid as per the language rules.
char a[]="hello";
This creates an array like this:
+---+---+---+---+---+----+
a: | h | e | l | l | o | \0 |
+---+---+---+---+---+----+
The array is modifiable and you can write other characters to it later if you like (although you cannot write more than 5 or 6 of them).
char* b="salam";
This uses a string literal to create a constant string somewhere, which the variable b then points to. I like to draw it like this:
+-------+
b: | * |
+---|---+
|
V
+---+---+---+---+---+----+
| s | a | l | a | m | \0 |
+---+---+---+---+---+----+
There are two differences here: (1) b is a pointer, not an array as a was. (2) the string here (that b points to) is probably in nonwritable memory. But a was definitely in writable memory.
char* c=a;
Now c is a pointer, pointing at the earlier-declared array a. The picture looks like this:
+---+---+---+---+---+----+
a: | h | e | l | l | o | \0 |
+---+---+---+---+---+----+
^
|
\
|
+---|---+
c: | * |
+-------+
And the array a was modifiable, so there's no problem doing c[0] = 'r', and we end up sounding like Scooby-Doo and saying:
+---+---+---+---+---+----+
a: | r | e | l | l | o | \0 |
+---+---+---+---+---+----+
^
|
\
|
+---|---+
c: | * |
+-------+
The key difference (which can be quite subtle) is that a string literal in source code like "hello" can be used in two very different ways. When you say
char a[] = "hello";
the string literal is used as the initial value of the array a. But the array a is an ordinary, modifiable array, and there's no problem writing to it later.
Most other uses of string literals, however, work differently. When you say
char *b = "salam";
or
printf("goodbye\n");
those string literals are used to create and initialize "anonymous" string arrays somewhere, which are referred to thereafter via pointers. The arrays are "anonymous" in that they don't have names (identifiers) to refer to them, and they're also usually placed in read-only memory, so you're not supposed to try to write to them.
Let's start off with your first question:
We have 2 strings, a and b
char a[] = "hello";
char *b = "salam";
The first string can be modified because a is an array with its own storage, initialized with a copy of the characters. Depending on where it is declared, it lives on the stack or in the writable data segment; either way we have write access to it, so we can modify it.
The second, b, is a pointer to a string literal. We cannot modify string literals, since C specifies that doing so is undefined behavior.
b will just point to wherever in the program that string is stored. The pointer should preferably be declared as pointing to const char, since the string can't be modified anyway.
const char *b = "salam";
Now let's look at the second question:
The code you provided for the second question is perfectly valid,
char a[] = "hello";
char *c = a;
c[0] = 'r';
We have a, which stores the actual string and if using ASCII it consists of 6 bytes 'h', 'e', 'l', 'l', 'o', '\0'
c points to a; we can verify this with the following code:
#include <stdio.h>
int main(void) {
char a[] = "hello";
char *c = a;
c[0] = 'r';
printf("a: %p\nc: %p\n", (void *)a, (void *)c); /* %p expects a void * argument */
}
And we'll get output as such
a: 0x7ffe3c94ecf2
c: 0x7ffe3c94ecf2
They both point to the same address, the start of the array. When we write
c[0] // it essentially means *(c + 0): take the address c points to, add 0, and dereference the result. This is how subscripting works in general: a[1] is *(a + 1), and so on.
So pretty much c in this case points to
0x7ffe3c94ecf2
c + 0 =
0x7ffe3c94ecf2
Access that address and modify the character.
Please take a look at this code.
#include <stdio.h>
int main()
{
char *p;
p = "%d";
p++;
p++;
printf(p-2,23);
return 0;
}
I have the following questions
1) How can a pointer to a character data type hold a string data type?
2) What happens when p is incremented twice?
3) How can printf() print a string when no apparent quotation marks are used?
"How can a pointer to a character data type can hold a string data type?" Well, it's partly true that in C, type 'pointer to char' is the string type. Any function that operates on strings (including printf) will be found to accept these strings via parameters of type char *.
"How can printf() print a string when no apparent quotation marks are used?" There's no rule that says you need quotation marks to have a string! That thing with quotation marks is a string constant or string literal, and it's one way to get a string into your program, but it's not at all the only way. There are lots of ways to construct (and manipulate, and modify) strings that don't involve any quotation marks at all.
Let's draw some pictures representing your code:
char *p;
p is a pointer to char, but as you correctly note, it doesn't point anywhere yet. We can represent it graphically like this:
+-----------+
p: | ??? |
+-----------+
Next you set p to point somewhere:
p = "%d";
This allocates the string "%d" somewhere (it doesn't matter where), and sets p to point to it:
+---+---+---+
| % | d |\0 |
+---+---+---+
^
|
\
\
\
|
+-----|-----+
p: | * |
+-----------+
Next, you start incrementing p:
p++;
As you said, this makes p point one past where it used to, to the second character of the string:
+---+---+---+
| % | d |\0 |
+---+---+---+
^
|
|
|
|
|
+-----|-----+
p: | * |
+-----------+
Next,
p++;
Now we have:
+---+---+---+
| % | d |\0 |
+---+---+---+
^
|
/
/
/
|
+-----|-----+
p: | * |
+-----------+
Next you called printf, but somewhat strangely:
printf(p-2,23);
The key to that is the expression p-2. If p points to the third character in the string, then p-2 points to the first character in the string:
+---+---+---+
| % | d |\0 |
+---+---+---+
^ ^
+----|----+ |
p-2: | * | /
+---------+/
/
|
+-----|-----+
p: | * |
+-----------+
And that pointer, p-2, is more or less the same pointer that printf would have received if you'd more conventionally called printf("%d", 23).
Now, if you thought printf received a string, it may surprise you to hear that printf is happy to receive a char * instead, and that in fact it always receives a char *. If this is surprising, ask yourself: what did you think printf received, if not a pointer to char?
Strictly speaking, a string in C is an array of characters (terminated with the '\0' character). But there's this super-important secret fact about C, which if you haven't encountered yet you will real soon (because it's really not a secret at all):
You can't do much with arrays in C. Whenever you mention an array in an expression in C, whenever it looks like you're trying to do something with the value of the array, what you get is a pointer to the array's first element.
That pointer is pretty much the "value" of the array. Due to the way pointer arithmetic works, you can use pointers to access arrays pretty much transparently (almost as if the pointer was the array, but of course it's not). And this all applies perfectly well to arrays of (and pointers to) characters, as well.
So since a string in C is an array of characters, when you write
"%d"
that's an array of three characters. But when you use it in an expression, what you get is a pointer to the array's first element. For example, if you write
printf("%d", 23);
you've got an array of characters, and you're mentioning it in an expression, so what you get is a pointer to the array's first element, and that's what gets passed to printf.
If we said
char *p = "%d";
printf(p, 23);
we've done the same thing, just a bit more explicitly: again, we've mentioned the array "%d" in an expression, so what we get as its value is a pointer to its first element, so that's the pointer that's used to initialize the pointer variable p, and that's the pointer that gets passed as the first argument to printf, so printf is happy.
Up above, I said "it's partly true that in C, type 'pointer to char' is the string type". Later I said that "a string in C is an array of characters". So which is it? An array or a pointer? Strictly speaking, a string is an array of characters. But like all arrays, we can't do much with an array of characters, and when we try, what we get is a pointer to the first element. So most of the time, strings in C are accessed and manipulated and modified via pointers to characters. All functions that operate on strings (including printf) actually receive pointers to char, pointing at the strings they'll manipulate.
the following explains each statement in the posted code:
#include <stdio.h>// include the header file that has the prototype for 'printf()'
int main( void ) // correct signature of 'main' function
{
char *p; // declare a pointer to char, do not initialize
p = "%d"; // assign address of string to pointer
p++; // increment pointer (so it points to the second char in the string)
p++; // increment pointer (so it points to the third char in the string, the '\0')
printf(p-2,23);// use string as 'format string' in print statement,
// and pass a parameter of 23
return 0; // exit the program, returning 0 to the OS
}
1) How can a pointer to a character data type can hold a string data type?
Ans: String is not a basic data type in C. A string is simply a contiguous sequence of chars in memory, terminated by '\0'.
2) What happens when p is incremented twice?
Ans: It now points to the '\0' character.
3) How can the printf()can print a string when no apparent quotation marks are used
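Ans: The quotation marks were already used when the string literal "%d" was assigned to p. printf() only needs a pointer to the string, and p-2 is exactly that, so no extra quotes are needed.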
1. How can a pointer to a character data type hold a string data type?
-> A char pointer holds the address of a char. Since a string is a sequence of chars, a char pointer can point to its first char and so refer to the whole string.
2. What happens when p is incremented twice?
-> When you assign a string to a char pointer, the pointer points to the first char. So when you increment the pointer twice, it holds the address of the 3rd char, which in your case is '\0'.
3. How can printf() print a string when no apparent quotation marks are used?
-> printf(p-2,23); uses the string as the format string, which in your case is "%d".
I'm having trouble understanding the following code:
const char *suit[4] = {"Hearts", "Diamonds", "Clubs", "Spades"};
I don't understand what is stored in the array suit, are they pointers? And if so, where are the strings stored?
Also, is the pointer constant, or the array constant?
I would appreciate a full detailed explanation of this code, and what is going on in memory!
Thanks in advance.
We learn a lot by using cdecl.org. This is what it tells us about suit:
declare suit as array 4 of pointer to const char
So:
the array contains 4 pointers.
each pointer points at a char (in this case, the first character of each string).
the pointers are not const, and neither is the array.
The strings are literals; where they are stored is implementation-specific.
In ASCII art:
"Clubs"
^
| "Spades"
| ^
| |
+---+---+---+---+
suit | | | | |
+---+---+---+---+
| |
| v
| "Diamonds"
v
"Hearts"
Note that suit itself is not a pointer; it's the name of the array.
const char * is commonly used as a string type, since strings are just arrays of characters. This means you have an array of const char * (strings). The strings themselves are constant and are typically stored in a read-only data section of your binary (often .rodata) when compiled, though the exact location is implementation-specific. Hence the data pointed to by each pointer is constant.
I'm trying to write a C99 program and I have an array of strings implicitly defined as such:
char *stuff[] = {"hello","pie","deadbeef"};
Since the array dimensions are not defined, how much memory is allocated for each string? Are all strings allocated the same amount of elements as the largest string in the definition? For example, would this following code be equivalent to the implicit definition above:
char stuff[3][9];
strcpy(stuff[0], "hello");
strcpy(stuff[1], "pie");
strcpy(stuff[2], "deadbeef");
Or is each string allocated just the amount of memory it needs at the time of definition (i.e. stuff[0] holds an array of 6 elements, stuff[1] holds an array of 4 elements, and stuff[2] holds an array of 9 elements)?
Pictures can help — ASCII Art is fun (but laborious).
char *stuff[] = {"hello","pie","deadbeef"};
+----------+ +---------+
| stuff[0] |--------->| hello\0 |
+----------+ +---------+ +-------+
| stuff[1] |-------------------------->| pie\0 |
+----------+ +------------+ +-------+
| stuff[2] |--------->| deadbeef\0 |
+----------+ +------------+
The memory allocated for the 1D array of pointers is contiguous, but there is no guarantee that the pointers held in the array point to contiguous sections of memory (which is why the pointer lines are different lengths).
char stuff[3][9];
strcpy(stuff[0], "hello");
strcpy(stuff[1], "pie");
strcpy(stuff[2], "deadbeef");
+---+---+---+---+---+---+---+---+---+
| h | e | l | l | o | \0| x | x | x |
+---+---+---+---+---+---+---+---+---+
| p | i | e | \0| x | x | x | x | x |
+---+---+---+---+---+---+---+---+---+
| d | e | a | d | b | e | e | f | \0|
+---+---+---+---+---+---+---+---+---+
The memory allocated for the 2D array is contiguous. The x's denote uninitialized bytes. Note that stuff[0] is a pointer to the 'h' of 'hello', stuff[1] is a pointer to the 'p' of 'pie', and stuff[2] is a pointer to the first 'd' of 'deadbeef' (and stuff[3] is a non-dereferenceable pointer to the byte beyond the null byte after 'deadbeef').
The pictures are quite, quite different.
Note that you could have written either of these:
char stuff[3][9] = { "hello", "pie", "deadbeef" };
char stuff[][9] = { "hello", "pie", "deadbeef" };
and you would have the same memory layout as shown in the 2D array diagram (except that the x's would be zeroed).
char *stuff[] = {"hello","pie","deadbeef"};
Is not a multidimensional array! It is simply an array of pointers.
how much memory is allocated for each string?
The number of characters plus a null terminator. Same as any string literal.
I think you want this:
char foo[][10] = {"hello","pie","deadbeef"};
Here, 10 is the amount of space per string and all the strings are in contiguous memory. Thus, there will be padding for strings less than size 10.
In the first example, it is a jagged array I suppose.
It declares an array of pointers to char. So each string literal can be as long as you like; the length of a string is independent of the array's dimensions.
In the second one, the length of each row (string), including the null terminator, must be 9 or less, as specified by your column size.
Are all strings allocated the same amount of elements as the largest
string in the definition?
No, only 3 pointers are allocated, and they point to 3 string literals.
char *stuff[] = {"hello","pie","deadbeef"};
and
char stuff[3][9];
are not at all equivalent. The first is an array of 3 pointers, whereas the second is a 2D array.
For the first, only the pointers are allocated, and the string literals they point to may be stored in a read-only section. The second, when declared inside a function, is allocated in automatic storage (usually the stack).
What is the difference between char a[]="string"; and char *p="string";?
The first one is an array; the other is a pointer.
The array declaration "char a[6];" requests that space for six characters be set aside, to be known by the name "a." That is, there is a location named "a" at which six characters can sit. The pointer declaration "char *p;" on the other hand, requests a place which holds a pointer. The pointer is to be known by the name "p," and can point to any char (or contiguous array of chars) anywhere.
The statements
char a[] = "hello";
char *p = "world";
would result in data structures which could be represented like this:
+---+---+---+---+---+---+
a: | h | e | l | l | o |\0 |
+---+---+---+---+---+---+
+-----+ +---+---+---+---+---+---+
p: | *======> | w | o | r | l | d |\0 |
+-----+ +---+---+---+---+---+---+
It is important to realize that a reference like x[3] generates different code depending on whether x is an array or a pointer. Given the declarations above, when the compiler sees the expression a[3], it emits code to start at the location "a," move three past it, and fetch the character there. When it sees the expression p[3], it emits code to start at the location "p," fetch the pointer value there, add three to the pointer, and finally fetch the character pointed to. In the example above, both a[3] and p[3] happen to be the character 'l', but the compiler gets there differently.
You can use search; there are tons of explanations of the subject on the internet.
char a[]="string"; //a is an array of characters.
char *p="string";// p is a pointer to a string literal, which has static storage duration. Any attempt to modify the characters p points to leads to undefined behavior; string literals are often stored in a read-only section of memory.
No difference. Unless you want to actually write to the array, in which case the whole world will explode if you try to use the second form.
The first declaration declares an array, while the second declares a pointer.
If you're interested in difference in some particular aspect, please clarify your question.
One difference is that sizeof(a)-1 will be replaced with the length of the string at compile time. With p you need to use strlen(p) to get the length at runtime. Also, some compilers don't like char *p="string"; they want const char *p="string", in which case the memory for "string" is read-only but the memory for a is not. Even if the compiler does not require the const declaration, it's bad practice to modify the string pointed to by p (i.e. *p='a'). The pointer p can be changed to point to something else. With the array a, a new value has to be copied into the array (if it fits).