Initializing strings using pointers - c

What is the difference between:
char arr[20] = "I am a string";
and
char *arr = "I am a string";
How is it possible to initialize an array just by using a pointer?

The first one is clear: it is an array initialization. The second one means that the character pointer arr points to an unnamed static array that stores the string "I am a string".

One difference is in the allocated storage size. The first declaration allocates 20 chars. The second allocates only a pointer, and the string literal it points to occupies 14 chars (13 characters plus the terminating null).
The second difference, discussed in this post, is in how these variables are allocated.

In the first case you are partially initializing a stack-allocated array with the 14 chars taken from the buffer represented by the "I am a string" string literal; the remaining 6 elements are set to zero.
In the second case you are initializing a stack-allocated pointer with a pointer to a buffer with static storage duration, represented by the "I am a string" string literal. Also note that in the second case you should use const char *arr.
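The size difference can be checked with sizeof. This is only a minimal sketch; the function names are made up for illustration:

```c
#include <stddef.h>

/* Array form: the object is the whole 20-element array. */
size_t array_form_size(void)
{
    char arr[20] = "I am a string";
    return sizeof arr;              /* 20, regardless of the text length */
}

/* The literal itself: 13 characters plus the terminating '\0'. */
size_t literal_size(void)
{
    return sizeof "I am a string";  /* the literal is an array of 14 char */
}

/* Pointer form: the object is just a pointer. */
size_t pointer_form_size(void)
{
    const char *arr = "I am a string";
    return sizeof arr;              /* same as sizeof(const char *) */
}
```

The pointer size depends on the platform (commonly 4 or 8 bytes), which is why the last function is compared against sizeof(const char *) rather than a fixed number.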

Related

The array expression is not converted into a pointer when a string literal is used to initialize an array of characters

From C in a Nutshell:
In most cases, the compiler implicitly converts an expression
with an array type, such as the name of an array, into a pointer to
the array’s first element.
The array expression is not converted into a pointer only in the
following cases:
• When the array is the operand of the sizeof operator
• When the array is the operand of the address operator &
• When a string literal is used to initialize an array of char,
wchar_t, char16_t, or char32_t
Could you explain what the last bullet means with some positive and
negative examples? I don't find an example in the book for the last
bullet.
Also why is an array of characters, not other element types?
char *ptr = "Hello OP!!";
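ptr is a pointer to the first char of the string literal, which is typically stored in a read-only data segment (such as .rodata). Through ptr you should only read, not write: in C a string literal has type char[N] rather than const char[N], but attempting to modify it is undefined behavior.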
char arr[] = "Hello OP!! How are you my friend?";
In this case:
Space is allocated for the arr array; its length is the size of the literal, including the trailing zero.
The string literal is copied into the space allocated for the arr array.
In this case arr is the place in memory into which the string literal is copied.
You can both read and write, because the elements of arr are writable.
And now answering the question
sizeof of an array is the size in bytes of all the array elements. If the array were converted to a pointer, the result would be the size of the pointer, which is obviously wrong in this case.
An array is just the contiguous space in memory accommodating all its elements, so the address of the array is always the address of this memory location.
The third case I have explained above.
you can see the code
https://godbolt.org/g/xVL5cR
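In case the link goes stale, here is a small sketch of the three cases (not the code behind the link; the function names are invented for this example):

```c
#include <stddef.h>

/* Case 1: sizeof sees the whole array, not a pointer to its first element. */
size_t whole_array_size(void)
{
    char arr[] = "Hello";
    return sizeof arr;                  /* five letters plus '\0' */
}

/* Case 2: &arr is a pointer to the whole array (char (*)[6]), so adding 1
   to it steps over all six bytes rather than one. */
size_t step_of_address_of_array(void)
{
    char arr[] = "Hello";
    char (*whole)[6] = &arr;
    return (size_t)((char *)(whole + 1) - (char *)whole);
}

/* Case 3: used as an initializer, the literal is copied into the array,
   so each array owns independent, writable bytes. */
int initializer_makes_a_copy(void)
{
    char a[] = "Hello";
    char b[] = "Hello";
    a[0] = 'J';                         /* changes a only */
    return a[0] == 'J' && b[0] == 'H';
}
```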
** Note to TIM ** String literals are not converted to anything. A string literal is only stored as a char (wchar_t ...) array with a NUL (not NULL) terminator at the end, in read-only memory.
Why is an array of characters, not other element types?
It's because string literals have static storage duration, and thus exist in memory for the life of the program.
Attempting to modify a string literal (through a pointer to the literal) results in undefined behavior: literals may be stored in read-only storage (such as .rodata) or combined with other string literals.
No other constants are stored like this, which is why the rule applies only to arrays of characters initialized from literals.
Could you explain what the last bullet means with some positive and
negative examples? I don't find an example in the book for the last
bullet.
String literal initialization looks like this:
char s1[] = "Hello world!"; // The literal is a char[]
wchar_t s2[] = L"Hello world!"; // The literal is a wchar_t[]
char s3[] = u8"Hello world!"; // The literal is a char[]
char16_t s4[] = u"Hello world!"; // The literal is a char16_t[]
char32_t s5[] = U"Hello world!"; // The literal is a char32_t[]
The string literal is copied from static storage into the array (which has automatic storage duration when declared inside a function), so the copy can be modified.
While
char ptr[] = {'H','e','l','l','o',' ','w','o','r','l','d','\0'};
is not initialized from a string literal, so no object with static storage duration is involved.
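One visible difference between the two initializer forms is the terminator. A rough sketch (function names are hypothetical):

```c
#include <stddef.h>

/* Literal initializer: the terminating '\0' is copied along with the text. */
size_t size_from_literal(void)
{
    char s[] = "Hi";
    return sizeof s;
}

/* Brace list: only the listed elements are stored, so without an explicit
   '\0' the result is a plain char array, not a string. */
size_t size_from_brace_list(void)
{
    char s[] = {'H', 'i'};
    return sizeof s;
}
```

Passing the second array to strlen or printf("%s") would read past its end, since nothing terminates it.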

Differences between arrays, pointers and strings in C

Just have a question in mind that troubles me.
I know pointers and arrays are different in C because pointers store an address while arrays store 'real' values.
But I'm getting confused when it comes to string.
char *string = "String";
I read that this line does several things :
An array of chars is created by the compiler and it has the value String.
Then, this array is considered as a pointer and the program assigns to the pointer string a pointer which points to the first element of the array created by the compiler.
This means, arrays are considered as pointers.
So, is this conclusion true or false and why ?
If false, what are then the differences between pointers and arrays ?
Thanks.
A pointer contains the address of an object (or is a null pointer that doesn't point to any object). A pointer has a specific type that indicates the type of object it can point to.
An array is a contiguous ordered sequence of elements; each element is an object, and all the elements of an array are of the same type.
A string is defined as "a contiguous sequence of characters terminated by and including the first null character". C has no string type. A string is a data layout, not a data type.
The relationship between arrays and pointers can be confusing. The best explanation I know of is given by section 6 of the comp.lang.c FAQ. The most important thing to remember is that arrays are not pointers.
Arrays are in a sense "second-class citizens" in C and C++. They cannot be assigned, passed as function arguments, or compared for equality. Code that manipulates arrays usually does so using pointers to the individual elements of the arrays, with some explicit mechanism to specify how long the array is.
A major source of confusion is the fact that an expression of array type (such as the name of an array object) is implicitly converted to a pointer value in most contexts. The converted pointer points to the initial (zeroth) element of the array. This conversion does not happen if the array is either:
The operand of sizeof (sizeof array_object yields the size of the array, not the size of a pointer);
The operand of unary & (&array_object yields the address of the array object as a whole); or
A string literal in an initializer used to initialize an array object.
char *string = "String";
To avoid confusion, I'm going to make a few changes in your example:
const char *ptr = "hello";
The string literal "hello" creates an anonymous object of type char[6] (in C) or const char[6] (in C++), containing the characters { 'h', 'e', 'l', 'l', 'o', '\0' }.
Evaluation of that expression, in this context, yields a pointer to the initial character of that array. This is a pointer value; there is no implicitly created pointer object. That pointer value is used to initialize the pointer object ptr.
At no time is an array "treated as" a pointer. An array expression is converted to a pointer type.
Another source of confusion is that function parameters that appear to be of array type are actually of pointer type; the type is adjusted at compile time. For example, this:
void func(char param[10]);
really means:
void func(char *param);
The 10 is silently ignored. So you can write something like this:
void print_string(char s[]) {
printf("The string is \"%s\"\n", s);
}
// ...
print_string("hello");
This looks like just manipulating arrays, but in fact the array "hello" is converted to a pointer, and that pointer is what's passed to the print_string function.
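A small sketch of the parameter adjustment; count_chars is a made-up name, and the [10] in its parameter is there only to show that it is ignored:

```c
#include <stddef.h>

/* Despite the [10], s is a pointer: it can be incremented, and the
   caller may pass a string longer than ten characters. */
size_t count_chars(const char s[10])
{
    size_t n = 0;
    while (*s != '\0') {
        s++;        /* legal only because s is really const char * */
        n++;
    }
    return n;
}
```

Calling count_chars("hello, world!") walks all 13 characters; if s were a real ten-element array, s++ would not even compile.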
So, is this conclusion true or false and why ?
Your conclusion is false.
Arrays and pointers are different. comp.lang.c FAQ list · Question 6.8 explains the difference between arrays and pointers:
An array is a single, preallocated chunk of contiguous elements (all of the same type), fixed in size and location. A pointer is a reference to any data element (of a particular type) anywhere. A pointer must be assigned to point to space allocated elsewhere, but it can be reassigned (and the space, if derived from malloc, can be resized) at any time. A pointer can point to an array, and can simulate (along with malloc) a dynamically allocated array, but a pointer is a much more general data structure.
When you do
char *string = "String";
a C compiler sets aside 7 bytes of memory for the string literal "String" and then sets the pointer string to point to the start of that memory.
When you declare
char string[] = "String";
a C compiler sets aside 7 bytes of memory for the array, fills them with the characters of "String" plus the terminating null, and gives that memory location the name string.
So,
In the first case string is a pointer variable, and in the second case it is an array name.
The characters stored in the first case can't be modified, while those in the array version can.
This means that arrays are not considered pointers in C. They are closely related, in the sense that pointer arithmetic and array indexing are equivalent, but pointers and arrays are different.
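The two points can be sketched like this (function names are invented for the example):

```c
#include <string.h>

/* A pointer can be pointed at a different string later on. */
int pointer_can_be_reseated(void)
{
    const char *string = "String";
    string = "Other";                 /* fine: only the pointer changes */
    return strcmp(string, "Other") == 0;
}

/* An array cannot be assigned to, but its elements can be changed. */
int array_elements_can_change(void)
{
    char string[] = "String";
    string[0] = 's';                  /* fine: the array owns its bytes */
    /* string = "Other";  would not compile: arrays are not assignable */
    return strcmp(string, "string") == 0;
}
```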
You have to understand what is happening in memory here.
A string is a contiguous block of memory cells that terminates with a special value (a null terminator). If you know the start of this block of memory, and you know where it ends (either by being told the number of memory cells or by reading them until you get to the null) then you're good to go.
A pointer is nothing more than the start of the memory block: it's the address of the first memory cell, or it's a pointer to the first element. All those terms mean the same thing. It's like a cell reference in a spreadsheet: if you have a huge grid you can identify a particular cell by its X-Y coordinates, so B5 names one particular cell. In computer terms (rather than spreadsheets) memory is really a very, very long list of cells, a 1-dimensional spreadsheet if you like, and the cell reference will look like 0x12345678 rather than B5.
The last bit is understanding that a computer program is a block of data that is loaded by the OS into memory. The compiler will have figured out the location of the string relative to the start of the program, so you automatically know which block of memory it is located in.
This is exactly the same as allocating a block of memory on the heap (it's just another part of the huge memory space) or the stack (again, a chunk of memory reserved for local allocations). You have the address of the first memory location where your string lives.
So
char* mystring = "string";
and
char mystring[7];
copy_some_memory(mystring, "string", 7);
and
char* mystring = malloc(7);
copy_some_memory(mystring, "string", 7);
all give you the same thing to work with: mystring yields the memory location of the first byte, which contains the value 's'. The language makes them look different, and strictly speaking an array is not a pointer (sizeof and & can tell them apart), but an array name is converted to a pointer to its first element in most expressions, so the same pointer-style operations work on all three.
(note: the big difference between the 1st and the other examples is that the compiler-generated data is read-only. If you could change that string data, you could in the same way change your program code, as it too is just a block of CPU instructions stored in a section of memory reserved for the program. For security reasons, writing to these special blocks of memory is restricted.)
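For a version that compiles as C, here is a sketch with memcpy standing in for the copy function and malloc for the heap allocation (both are substitutions made for this example, as is the function name):

```c
#include <stdlib.h>
#include <string.h>

/* Returns 1 if all three buffers hold the same text, 0 otherwise
   (including when malloc fails). */
int three_ways_same_text(void)
{
    const char *from_literal = "string";   /* points at the read-only literal */

    char from_array[7];                     /* automatic storage, writable */
    memcpy(from_array, "string", 7);

    char *from_heap = malloc(7);            /* dynamic storage, writable */
    if (from_heap == NULL)
        return 0;
    memcpy(from_heap, "string", 7);

    int same = strcmp(from_literal, from_array) == 0
            && strcmp(from_array, from_heap) == 0;
    free(from_heap);                        /* only the heap buffer is freed */
    return same;
}
```

The three still differ in lifetime and in what sizeof reports: sizeof from_array is 7, while sizeof from_heap is the size of a pointer.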
Here's another way to look at them:
First, memory is some place you can store data.
Second, an address is the location of some memory. The memory referred to by the address may or may not exist. You can't put anything in an address, only at an address - you can only store data in the memory the address refers to.
An array is a contiguous region of memory: a series of memory locations of a specific type. It exists, and can have real data put into it. Like any actual location in memory, it has an address.
A pointer contains an address. That address can come from anywhere.
A string is a NUL-terminated array of characters.
Look at it this way:
memory - A house. You can put things in it. The house has an address.
array - A row of houses, one next to the other, all the same.
pointer - a piece of paper you can write an address on. You can't store anything in the piece of paper itself (other than an address), but you can put things into the house at the address you write on the paper.
We can create an array with the name 'string'
char string[] = "Hello";
We can allocate a pointer to that string
char* stringPtr = string;
The array name is converted to a pointer
So, an array name is similar to the pointer. However, they're not the same, as the array is a contiguous block of memory, whereas the pointer references just a single location (address) in memory.
char *string = "String";
This declaration creates the array and sets the pointer to the address of the block of memory used to store the array.
This means, arrays are considered as pointers. So, is this conclusion true or false
False, arrays are not pointers. However, just to confuse(!), pointers can appear to be arrays, due to the subscript operator []
char *string = "String";
char letter = string[2];
In the expression string[2], the pointer string already holds the address of the first character of the array; pointer arithmetic moves two elements along and the item found there is returned. In other words, string[2] is equivalent to *(string + 2).
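The equivalence is easy to check. A minimal sketch (the function name is hypothetical):

```c
/* Returns 1 when indexing and explicit pointer arithmetic agree. */
int index_matches_pointer_arithmetic(void)
{
    const char *string = "String";
    char by_index = string[2];          /* third character, 'r' */
    char by_pointer = *(string + 2);    /* the same element */
    char reversed = 2[string];          /* also *(2 + string); legal but unidiomatic */
    return by_index == 'r' && by_pointer == 'r' && reversed == 'r';
}
```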
Then, this array is considered as a pointer and the program assigns to the pointer string a pointer which points to the first element of the array created by the compiler.
The wording here is not great. The array is still an array and is treated as such. The program assigns a pointer-to-first-element value (an rvalue) to a pointer-to-char variable (an lvalue). That is the only intermediate, non-stored pointer value here, as the compiler and linker know the array's address at compile/link time. You can't access the array directly, though, because it is an anonymous literal. If you were instead initializing an array with the literal, the literal would disappear (think of it as optimized out as a separate entity) and the array would be directly accessible by its precomputed address.
char s[] = "String"; // char[7]
char *p = s; // s is converted to a pointer to its first element
char *q = &s[0]; // the same address, written explicitly

How does C work with memory of local string literals?

Consider this function:
void useless() {
char data[] = "aaa";
}
From what I learned here, the "aaa" literal lives to the end of the program. However, the data[] (initialized by the literal) is local, so it lives only to the end of the function.
The memory is copied, so the program needs 4B for the literal, 4B for the data and sizeof(size_t) bytes for the pointer to data and sizeof(size_t) for the pointer of the literal - is this true?
If the literal has static storage duration, no new memory is allocated for the local literal by the second call - is this true?
char data[] = "aaa";
This is not a string literal but just an array. So there's no pointer there and memory is allocated only for the data.
If the literal has static storage duration, no new memory is allocated
for the local literal by the second call
This is true for string literals used through a pointer, like char *s = "aaa";. From the C++ standard:
2.13. String literals
[...]An ordinary string literal has type “array of n const char” and static storage duration (3.7)
(In C the element type is char rather than const char, but the storage duration is static as well.)
There is no pointer variable here. All there is is an array, which is 4 bytes.
The compiler may or may not store the literal itself in memory; if it does, that is another 4 bytes.
Note that any memory taken up by anything other than the array itself is implementation-dependent.
I'm not sure what you mean by the "second call", but in general when you create an array, it takes up some amount of size... so if you create two arrays with the same literal, the compiler allocates space for two arrays (and perhaps -- or perhaps not -- for the literal also).
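To illustrate the lifetimes involved, here is a rough sketch with invented function names. The literal is one static object, while the local array is re-created and re-initialized on every call:

```c
/* The literal itself: the same static array on every call. */
const char *literal_address(void)
{
    return "aaa";
}

/* The local array: initialized from the literal on every call, so a
   change made during one call is not visible in the next. */
char first_char_before_change(void)
{
    char data[] = "aaa";
    char before = data[0];
    data[0] = 'b';          /* modifies this call's copy only */
    return before;
}
```

Calling first_char_before_change() repeatedly keeps returning 'a', and literal_address() returns the same pointer each time.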

Why would someone initialize unallocated memory in C?

Say I do initialize an array like this:
char a[]="test";
What's the purpose of this? We know that the content might immediately get changed, as it is not allocated, and thus why would someone initialize the array like this?
To clarify, this code is wrong for the reasons stated by the OP:
char* a;
strcpy(a, "test");
As noted by other responses, the syntax char a[] = "test"; does not actually do this. The actual effect is more like this:
char a[5];
strcpy(a, "test");
The first statement allocates a fixed-size character array on the local stack, and the second initializes the data in it. The size is determined from the length of the string literal. Like all stack variables, the array is automatically deallocated on exiting the function scope.
The purpose of this is to allocate five bytes on the stack or the static data segment (depending on where this snippet occurs), then set those bytes to the array {'t','e','s','t','\0'}.
This syntax allocates an array of five characters on the stack, equivalent to this:
char a[5] = "test";
The elements of the array are initialized to the characters in the string given as an initializer. The size of the array is determined to fit the size of the initializer.
It is allocated. That code is equivalent to
char a[5]="test";
When you leave the number out, the compiler simply calculates the length of the character-array for you by counting the characters in the literal string. It then adds 1 to the length in order to include the necessary terminating nul '\0'. Hence, the length of the array is 5 while the length of the string is 4.
The array is allocated; its size is inferred from the string literal being used to initialize it (5 chars total).
Had you written
char *a = "test";
then all that would get allocated would be a pointer variable, not an array (the string literal "test" lives in memory such that it's allocated at program startup and held until the program exits).
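The array size versus string length distinction can be checked directly. A minimal sketch (function names are made up):

```c
#include <stddef.h>
#include <string.h>

/* Size of the array: the four letters plus the terminating '\0'. */
size_t array_bytes(void)
{
    char a[] = "test";
    return sizeof a;
}

/* Length of the string: the terminator is not counted. */
size_t string_length(void)
{
    char a[] = "test";
    return strlen(a);
}
```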

C strings pointer vs. arrays [duplicate]

Why is:
char *ptr = "Hello!";
different than:
char ptr[] = "Hello!";
Specifically, I don't see why you can use (*ptr)++ to change the value of 'H' in the array, but not the pointer.
Thanks!
You can (in general) use the expression (*ptr)++ to change the value that ptr points to when ptr is a pointer and not an array (ie., if ptr is declared as char* ptr).
However, in your first example:
char *ptr = "Hello!"
ptr is pointing to a literal string, and literal strings are not permitted to be modified (they may actually be stored in memory area which are not writable, such as ROM or memory pages marked as read-only).
In your second example,
char ptr[] = "Hello!";
The array is declared and the initialization actually copies the data in the string literal into the allocated array memory. That array memory is modifiable, so (*ptr)++ works.
Note: for your second declaration, the ptr identifier itself is an array identifier, not a pointer, and it is not a modifiable lvalue, so it can't be assigned to or incremented (even though it converts readily to a pointer in most situations). For example, the expression ++ptr would be invalid. I think this is the point that some other answers are trying to make.
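A short sketch of the array case, assuming an ASCII-compatible character set (the function name is invented):

```c
/* (*ptr)++ on an array initialized from a literal changes the array's
   own writable copy of the first character. */
char bump_first_char(void)
{
    char ptr[] = "Hello!";   /* writable copy of the literal */
    (*ptr)++;                /* 'H' becomes 'I' in ASCII */
    /* ++ptr;  would not compile: an array cannot be incremented */
    return ptr[0];
}
```

Doing the same through char *ptr = "Hello!"; compiles but has undefined behavior, so it is deliberately left out of the sketch.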
When pointing to a string literal, you should not declare the chars to be modifiable, and some compilers will warn you for this:
char *ptr = "Hello!"; /* WRONG, missing const! */
The reason is as noted by others that string literals may be stored in an immutable part of the program's memory.
The correct "annotation" for you is to make sure you have a pointer to constant char:
const char *ptr = "Hello!";
And now you see directly that you can't modify the text stored at the pointer.
Arrays automatically allocate space and they can't be relocated or resized while pointers are explicitly assigned to point to allocated space and can be relocated.
Array names are read only!
If you use a string literal "Hello!", the literal itself becomes an array of 7 characters and gets stored somewhere in data memory. That memory may be read-only.
The statement
char *ptr = "Hello!";
defines a pointer to char and initializes it by storing in it the address of the beginning of the literal (that array of 7 characters mentioned earlier). Changing the contents of the memory pointed to by ptr is undefined behavior.
The statement
char ptr[] = "Hello!";
defines a char array (char ptr[7]) and initializes it, by copying characters from the literal to the array. The array can be modified.
In C, strings are arrays of characters.
A pointer is a variable that contains the memory location of another variable.
An array is a set of ordered data items.
When you apply (*ptr)++ through the pointer, you are trying to add 1 to the first character of the string literal itself, which must not be modified; in practice this typically shows up as a segmentation fault.
With the array, you are adding 1 to the first character of your own writable copy.
