char * variable address vs. char [] variable address - c

I am printing out addresses and strings from the following two declarations and initializations:
char * strPtr = (char *) "This is a string, made on the fly.";
char charArray [] = "Chars in a char array variable.";
When printed, the following output occurs with wildly different addresses for the variables charArray and strPtr. The question is, "Why?"
Printing:
printf( "%10s%40s%20p\n", "strPtr", strPtr, &(*strPtr));
printf( "%10s%40s%20p\n", "charArray", charArray, charArray);
Output:
strPtr This is a string, made on the fly. 0x400880
charArray Chars in a char array variable. 0x7fff12d5ed30
The different addresses, as you see, are: 0x400880 vs. 0x7fff12d5ed30
The rest of the variable declared before this have addresses like that of charArray.
Again, the question is, "Why are the addresses so different?"
Thanks for any assistance.

Because string literals, e.g. "foo bar" get allocated in a "different place" than your char array.
This is implementation dependent, but a typical implementation will put string literals in your .rdata ("read-only data") section of your executable, and your char array is declared locally, and hence goes on the stack.
And different sections of your image will get mapped to vastly different addresses when they get loaded in RAM.

I am guessing the compiler/linker puts the char array on the stack, whereas the the other string is put into a static string table.

The text "Chars in a char array variable." and "This is a string, made on the fly." are probably quite near each other. However, char charArray[] = ... requests space on the stack into which the corresponding bit of text is copied. The stack is practically in a different universe from the original hard-coded text, once the OS is done with its virtualization etc.

This is how it goes - I remember reading about this in [Unix: Systems Programming]
1
As you can see Initialized static data gets stored in a different location on the heap as opposed to uninitialized static data.

The crucial thing to realise here is that in the case of strPtr, you are dealing with two different objects, whereas in the case of charArray, you are dealing with only one.
charArray is a single array object, filled with the characters of the "Chars in a char array variable." string.
strPtr itself is a single pointer object. Its value is the address of a second anonymous, unmodifiable array object that in turn contains the characters of the "This is a string, made on the fly." string.
When you print out charArray using %p, you are printing the address of charArray[0] (due to a special rule for arrays). When you print out &(*strPtr) (which is exactly the same as just strPtr), you are printing the address of the anonymous, unmodifiable array object mentioned earlier - and this is why it appears so different to the addresses of the other variables involved.
If you print out &strPtr using %p, you will see that the address of the variable strPtr itself is in a similar range to the other local variables.

Related

Memory address of strings declared using a char pointer

I read that when you declare strings using a pointer, the pointer contains the memory address of the string literal. Therefore I expected to get the memory address from this code but rather I got some random numbers. Please help me understand why it didn't work.
int main()
{
char *hi = "Greeting!";
printf("%p",hi);
return(0);
}
If the pointer hi contains the memory address of the string literal, then why did it not display the memory address?
It did work. It's just that you can consider the address as being arbitrarily chosen by the C runtime. hi is a pointer set to the address of the capital G in your string. You own all the memory from hi up to and including the nul-terminator at the end of that string.
Also, use const char *hi = "Greeting!"; rather than char *: the memory starting at hi is read-only. Don't try to modify the string: the behaviour on attempting to do that is undefined.
The "random numbers" you got are the Memory addresses. They are not constant, since on each execution of your program, other Memory addresses are used.
A pointer could be represented in several ways. The format string "%p" "writes an implementation defined character sequence defining a pointer" link. In most cases, it's the pointed object's address interpreted as an appropriately sized unsigned integer, which looks like "a bunch of random number".
A user readable pointer representation is generally only useful for debugging. It allows you to compare two different representations (are the pointers the same?) and, in some cases, the relative order and distance between pointers (which pointer comes "first", and how far apart are they?). Interpreting pointers as integers works well in this optic.
It would be helpful to us if you could clarify what output you expected. Perhaps you expected pointers to be zero-based?
Note that, while some compilers might accept your example, it would be wiser to use const char *. Using a char * would allow you to try to modify your string literal, which is undefined behavior.

Differences between arrays, pointers and strings in C

Just have a question in mind that troubles me.
I know pointers and arrays are different in C because pointers store an address while arrays store 'real' values.
But I'm getting confused when it comes to string.
char *string = "String";
I read that this line does several things :
An array of chars is created by the compiler and it has the value String.
Then, this array is considered as a pointer and the program assigns to the pointer string a pointer which points to the first element of the array created by the compiler.
This means, arrays are considered as pointers.
So, is this conclusion true or false and why ?
If false, what are then the differences between pointers and arrays ?
Thanks.
A pointer contains the address of an object (or is a null pointer that doesn't point to any object). A pointer has a specific type that indicates the type of object it can point to.
An array is a contiguous ordered sequence of elements; each element is an object, and all the elements of an array are of the same type.
A string is defined as "a contiguous sequence of characters terminated by and including the first null character". C has no string type. A string is a data layout, not a data type.
The relationship between arrays and pointers can be confusing. The best explanation I know of is given by section 6 of the comp.lang.c FAQ. The most important thing to remember is that arrays are not pointers.
Arrays are in a sense "second-class citizens" in C and C++. They cannot be assigned, passed as function arguments, or compared for equality. Code that manipulates arrays usually does so using pointers to the individual elements of the arrays, with some explicit mechanism to specify how long the array is.
A major source of confusion is the fact that an expression of array type (such as the name of an array object) is implicitly converted to a pointer value in most contexts. The converted pointer points to the initial (zeroth) element of the array. This conversion does not happen if the array is either:
The operand of sizeof (sizeof array_object yields the size of the array, not the size of a pointer);
The operand of unary & (&array_object yields the address of the array object as a whole); or
A string literal in an initializer used to initialize an array object.
char *string = "String";
To avoid confusion, I'm going to make a few changes in your example:
const char *ptr = "hello";
The string literal "hello" creates an anonymous object of type char[6] (in C) or const char[6] (in C++), containing the characters { 'h', 'e', 'l', 'l', 'o', '\0' }.
Evaluation of that expression, in this context, yields a pointer to the initial character of that array. This is a pointer value; there is no implicitly created pointer object. That pointer value is used to initialize the pointer object ptr.
At no time is an array "treated as" a pointer. An array expression is converted to a pointer type.
Another source of confusion is that function parameters that appear to be of array type are actually of pointer type; the type is adjusted at compile time. For example, this:
void func(char param[10]);
really means:
void func(char *param);
The 10 is silently ignored. So you can write something like this:
void print_string(char s[]) {
printf("The string is \"%s\"\n", s);
}
// ...
print_string("hello");
This looks like just manipulating arrays, but in fact the array "hello" is converted to a pointer, and that pointer is what's passed to the print_string function.
So, is this conclusion true or false and why ?
Your conclusion is false.
Arrays and pointers are different. comp.lang.c FAQ list ยท Question 6.8 explains the difference between arrays and pointers:
An array is a single, preallocated chunk of contiguous elements (all of the same type), fixed in size and location. A pointer is a reference to any data element (of a particular type) anywhere. A pointer must be assigned to point to space allocated elsewhere, but it can be reassigned (and the space, if derived from malloc, can be resized) at any time. A pointer can point to an array, and can simulate (along with malloc) a dynamically allocated array, but a pointer is a much more general data structure.
When you do
char *string = "String";
and when a C compiler encounters this, it sets aside 7 bytes of memory for the string literal String. Then set the pointer string to point to the starting location of the allocated memory.
When you declare
char string[] = "String";
and when a C compiler encounters this, it sets aside 7 bytes of memory for the string literal String. Then gives the name of that memory location, i.e. the first byte, string.
So,
In first case string is a pointer variable and in second case it is an array name.
The characters stored in first case can't be modified while in array version it can be modified.
This means arrays is not considered as pointers in C but they are closely related in the sense that pointer arithmetic and array indexing are equivalent in C, pointers and arrays are different.
You have to understand what is happening in memory here.
A string is a contiguous block of memory cells that terminates with a special value (a null terminator). If you know the start of this block of memory, and you know where it ends (either by being told the number of memory cells or by reading them until you get to the null) then you're good to go.
A pointer is nothing more than the start of the memory block, its the address of the first memory cell, or its a pointer to the first element. All those terms mean the same thing. Its like a cell reference in a spreadsheet, if you have a huge grid you can tell a particular cell by its X-Y co-ordinates, so cell B5 tells you of a particular cell. In computer terms (rather than spreadsheets) memory is really a very, very long list of cells, a 1-dimensional spreadsheet if you like, and the cell reference will look like 0x12345678 rather than B5.
The last bit is understanding that a computer program is a block of data that is loader by the OS into memory, the compiler will have figured out the location of the string relative to the start of the program, so you automatically know which block of memory it is located in.
This is exactly the same as allocating a block of memory on the heap (its just another part of the huge memory space) or the stack (again, a chunk of memory reserved for local allocations). You have the address of the first memory location where your string lives.
So
char* mystring = "string";
char mystring[7];
copy_some_memory(mystring, "string", 7);
and
char* mystring = new char(7);
copy_some_memory(mystring, "string", 7);
are all the same thing. mystring is the memory location of the first byte, that contains the value 's'. The language may make them look different, but that's just syntax. So an array is a pointer, its just that the language makes it look different, and you can operate on it with slightly different syntax designed to make operations on it safer.
(note: the big difference between the 1st and other examples is that the compiler-set set of data is read-only. If you could change that string data, you could change your program code, as it too it just a block of CPU instructions stored in a section of memory reserved for program data. For security reasons, these special blocks of memory are restricted to you).
Here's another way to look at them:
First, memory is some place you can store data.
Second, an address is the location of some memory. The memory referred to by the address may or may not exist. You can't put anything in an address, only at an address - you can only store data in the memory the address refers to.
An array is contiguous location in memory - it's a series of memory locations of a specific type. It exists, and can have real data put into it. Like any actual location in memory, it has an address.
A pointer contains an address. That address can come from anywhere.
A string is a NUL-terminated array of characters.
Look at it this way:
memory - A house. You can put things in it. The house has an address.
array - A row of houses, one next to the other, all the same.
pointer - a piece of paper you can write an address on. You can't store anything in the piece of paper itself (other than an address), but you can put things into the house at the address you write on the paper.
We can create an array with the name 'string'
char string[] = "Hello";
We can allocate a pointer to that string
char* stringPtr = string;
The array name is converted to a pointer
So, an array name is similar to the pointer. However, they're not the same, as the array is a contiguous block of memory, whereas the pointer references just a single location (address) in memory.
char *string = "String";
This declaration creates the array and sets the address of the pointer to the block of memory used to store the array.
This means, arrays are considered as pointers. So, is this conclusion true or false
False, arrays are not pointers. However, just to confuse(!), pointers can appear to be arrays, due to the dereference operator []
char *string = "String";
char letter = string[2];
In this case string[2], string is first converted to a pointer to the first character of the array and using pointer arithmetic, the relevant item is returned.
Then, this array is considered as a pointer and the program assigns to the pointer string a pointer which points to the first element of the array created by the compiler.
Not really great wording here. Array is still an array and is considered as such. The program assigns a pointer-to-first-element value (rvalue) to pointer-to-char variable (lvalue in general). That is the only intermediate/non-stored pointer value here, as compiler and linker know array's address at compile-link time. You can't directly access the array though, because it is anonymous literal. If you were instead initializing an array with literal, then literal would disappear (think like optimized-out as separate entity) and array would be directly accessible by its precomputed address.
char s[] = "String"; // char[7]
char *p = s;
char *p = &s[0];

Assigning a string to a char* variable

A code I am reviewing uses following string assignment
char *str;
str ="";
The coder then uses this 'str' to temporarily hold a string like.
str = "This is a message";
fwrite(str, 1 ,strlen(str), fp);
Then this str is used again at some other place to assign a new string with a similar use.
I know that this works, I want to find out how exactly does this work.
How can you declare a char pointer and make it point to a string like that?
What could be the maximum string length such a pointer can hold?
Where is this string stored? Is it automatically malloc'd?
A pointer doesn't "hold" a string, it just points to where the original string is located. In this case the string literal is kept as part of the program and the pointer is set to it; when you reassign the pointer, you're not making any copies, just setting the pointer to a different address.
The maximum size of the string is thus the maximum size of a string literal, which will depend on the compiler and the amount of available program space.
If you want to actually make a copy of a string, first you must allocate some storage for it which must be one greater than the number of characters. Then use strcpy to make the copy.
This string is statically contained in the object module. You don't need to malloc memory for such strings, because they already have a memory assigned by the compiler. Because of this, you also can not free such a pointer. If you look with an hex editor in your exe file, you can see that such a string is contained inside it, as opposed to a dynamically allocated string, which only exists in memory as long as the executable runs.
The maximum size of such a string depends on your compiler.
char* is just a pointer to a char (or series of them).
You can have it pointing to any "string" you like. In the examples given, they are just changing the pointer's value (i.e. what str points to).
char *name;
name="some string";//the name points to the address of location of
the string.
is not similar to :
char str[];
str="some string";// remember this type of statement
won't work,because str is about to store characters but you are
assigning the pointer.
A constant character string always represents a pointer to that string.

Pointers to strings

I am a newbie to C programming. And I am confused with the chaotic behavior of pointers. Specially when it comes to strings and arrays.
I know that I can't write like,
#include <stdio.h>
int main()
{
int *number=5;
printf("%d",*number);
}
Because clearly it will try to write to the 5th location of the memory.And that will crash the program.I have to initialize the "number".
But when it comes to strings I can write like,
#include <stdio.h>
int main()
{
char *name="xxxxx";
printf(name);
}
And it works too. So that means it implicitly initialize the "name" pointer.I also know that name=&name[0] But I have found that name=&name too. How can it be?
Because, to me it looks two variables with the same name. Can anybody tell me how strings are created in memory?(All this time time I assumed it creates name[0].....name[n-1] and another variable(a pointer) called "name", inside that we put the location of name[0].Seem to be I was wrong.)
PS:-My English may not be good and if somebody can give me a link regarding above matter that would be grateful.
C, like many programming languages, supports the concept of "literals" - special syntax that, when encountered, causes the compiler to create a value in a special way.
-2, for example, is an integer literal. When encountered, the compiler will treat it as a value of type int with the content of -2. "..." is a string literal - when encountered, the compiler allocates new space in a special memory area, fills it with data that corresponds to the chars you used inside the literal, adds a 0 at the end of that area, and finally uses a pointer to that area, of type char*, as the result of the literal expression. So char* s = "hello" is an assignment from something of type char* into a variable of type char* - completely legal.
You sneaked in another question here - why a == &a[0]. It's best to ask one question at a time, but the gist of it is that a[n] is identical to *((a)+(n)), and so:
&a[0] == &*(a+0) == a+0 == a
char *name="xxxxx";
This creates a char array(const) in memory, and it will assign the address of 1st element of it to name. char* array names are like pointers. &name[0] means address of 1st element and name (In c,cpp just the char * array name will provide you with the address of 1st element too (bcos that is what was assigned to name in the 1st place itself)) also gives the same result.
The notation name[i] is translated as *(name+i) so u actually have a base address name to which you add subscripts. (calculations are as per pointer arithmetic) . printf("%s", name) is designed to print from start address to \0 (which is appended to the end of strings created using char* a.k.a String literals)
Check this too.
That is because you those strings are actually arrays of characters. What you do with the line char *name = "xxxxx";: You construct an array of 6 character values (5 of them being 'x' and the last one being '\0'). name is a pointer to that array.
That's how strings are normally handles in C, you have some kind of sequence of characters, terminated with a '\0' character, to tell functions like printf where to stop processing the string.
Let's consider this piece of code:
char *name="xxxxx";
What happens here is that the string xxxxx is allocated memory and a pointer to that memory location,or the address of that string is passed to the pointer variable name.Or in other words you initialize name with that address.
And printf() is a variadic function (one that takes one fixed argument followed by a random number of arguments).The first argument to printf() is of type const char*.And the string identifier name when passed as an argument denotes the base address of that string.Hence you can use
printf(name);
It will simply output xxxxx
name=&name too. How can it be?--Well that's a justified question.
Let me explain first with a real-world analogy. Suppose you have the first house in a row of houses in a housing society.Suppose it's plot number 0.The other houses are on plots 1,2,3......Now suppose there is a pointer to your house, and there is another pointer to the whole housing society's row. Won't the two pointers have the same address, which will be plot 0?This is because a pointer signifies a single memory location.It's the type of the pointer that matters here.
Bringing this analogy to the string( array of characters), the name identifier only signifies the base address of the string, the address of its first character (Like the address of the first house).It is numerically same to the address of the whole string,which is (&name),which in my analogy is the ROW of houses.But they are of different types, one is of type char* and the other is of type char**.
Basicly what happens when the C-compiler see the expression
char *name = "xxxxx";
is, it will say. Hey "xxxxx" that's a constant string (which is an array of bytes terminated with a 0 byte), and put that in the resulting programs binary. Then it will substitute the string for the memory location, sort of like:
char *name = _some_secret_name_the_compiler_only_know;
where _some_secret_name_the_compiler_only_know is a pointer to the memory location where the string will live once the program gets executed. And get in with parsing the file.

How an unassigned string is stored in memory and how to reference it?

In this example seems that both strings "jesus" are equals(same memory location).
printf("%p\n","jesus");
printf("%p\n","jesus");
Also note that:
printf("%p\n",&"jesus");
printf("%p\n","jesus");
prints the same, but:
char* ptrToString = "jesus";
char* ptrToString = &"jesus"; //ERROR
So i wanna know how an unassigned string is stored in memory and how to point it...
First off, why are "jesus" and &"jesus" the same: "jesus" is an array of type const char[6], and it decays to a pointer to the first element. Taking the address of the array gives you a pointer to an array, whose type is const char (*)[6]. However, the pointer to the array is numerically the same as the pointer to its first element (only the types differ).
This also explains why you have an error in the last line - type type is wrong. You need:
const char (*pj)[6] = &"jesus";
Finally, the question is whether repeated string literals have the same address or not. This is entirely up to the compiler. If it were very naive, it could store a separate copy for each occurrence of a string literal in the source code. If it is slightly cleverer, it'll only store one unique copy for each string literal. String literals are of course stored in memory somewhere, typically in a read-only data segment of the program image. Think of them as statically initialized global variables.
One more thing: Your original code is actually undefined behaviour, since %p expects a void * argument, and not a const char * or a const char (*)[6]. So the correct code is:
printf("%p\n%p\n", (void const *)"jesus", (void const *)&"jesus");
C is a carefully specified language and we can make many observations about your examples that may answer some questions.
Character literals are stored in memory as initialized data. They have type array of char.
They are not necessarily strings because nul bytes can be embedded with \0.
It is not required that identical character string literals be unique, but it's undefined what happens if a program tries to modify one. This effectively allows them to be distinct or "interned" as the implementation sees fit.
In order to make that last line work, you need:
char (*ptrToString)[] = &"jesus"; // now not an ERROR

Resources