Pointer initialization in C - c

In C why is it legal to do
char * str = "Hello";
but illegal to do
int * arr = {0,1,2,3};

I guess that's just how initializers work in C. However, you can do:
int *v = (int[]){1, 2, 3}; /* C99. */

As for C89:
"A string", when used outside char array initialization, is a string literal; the standard says that when you use a string literal it's as if you created a global char array initialized to that value and wrote its name instead of the literal (there's also the additional restriction that any attempt to modify a string literal results in undefined behavior). In your code you are initializing a char * with a string literal, which decays to a char pointer and everything works fine.
However, if you use a string literal to initialize a char array, several magic rules get in action, so it is no longer "as if an array ... etc" (which would not work in array initialization), but it's just a nice way to tell the compiler how the array should be initialized.
The {1, 2, 3} way to initialize arrays keeps just this semantic: it's only for initialization of array, it's not an "array literal".

In the case of:
char * str = "Hello";
"Hello" is a string literal. It's loaded into memory (but often read-only) when the program is run, and has a memory address that can be assigned to a pointer like char *str. String literals like this are an exception, though.
With:
int * arr = {0,1,2,3};
..you're effectively trying to point at an array that hasn't been put anywhere in particular in memory. arr is a pointer, not an array; it holds a memory address, but does not itself have storage for the array data. If you use int arr[] instead of int *arr, then it works, because an array like that is associated with storage for its contents. Though the array decays to a pointer to its data in many contexts, it's not the same thing.
Even with string literals, char *str = "Hello"; and char str[] = "Hello"; do different things. The first sets the pointer str to point at the string literal, and the second initializes the array str with the values from "Hello". The array has storage for its data associated with it, but the pointer just points at data that happens to be loaded into memory somewhere already.

Because there's no point in declaring and initializing a pointer to an int array, when the array name can be used as a pointer to the first element. After
int arr[] = { 0, 1, 2, 3 };
you can use arr like an int * in almost all contexts (the exception being as operand to sizeof).

... or you can abuse the string literals and store the numbers as a string literal, which for a little endian machine will look as follows:
int * arr = (int *)"\0\0\0\0\1\0\0\0\2\0\0\0\3\0\0\0";

Related

Why pointers can't be used to index arrays? [duplicate]

This question already has answers here:
Why do I get a segmentation fault when writing to a "char *s" initialized with a string literal, but not "char s[]"?
(19 answers)
Closed 3 years ago.
I am trying to change value of character array components using a pointer. But I am not able to do so. Is there a fundamental difference between declaring arrays using the two different methods i.e. char A[] and char *A?
I tried accessing arrays using A[0] and it worked. But I am not able to change values of the array components.
{
char *A = "ab";
printf("%c\n", A[0]); //Works. I am able to access A[0]
A[0] = 'c'; //Segmentation fault. I am not able to edit A[0]
printf("%c\n", A[0]);
}
Expected output:
a
c
Actual output:
a
Segmentation fault
The difference is that char A[] defines an array and char * does not.
The most important thing to remember is that arrays are not pointers.
In this declaration:
char *A = "ab";
the string literal "ab" creates an anonymous array object of type char[3] (2 plus 1 for the terminating '\0'). The declaration creates a pointer called A and initializes it to point to the initial character of that array.
The array object created by a string literal has static storage duration (meaning that it exists through the entire execution of your program) and does not allow you to modify it. (Strictly speaking an attempt to modify it has undefined behavior.) It really should be const char[3] rather than char[3], but for historical reasons it's not defined as const. You should use a pointer to const to refer to it:
const char *A = "ab";
so that the compiler will catch any attempts to modify the array.
In this declaration:
char A[] = "ab";
the string literal does the same thing, but the array object A is initialized with a copy of the contents of that array. The array A is modifiable because you didn't define it with const -- and because it's an array object you created, rather than one implicitly created by a string literal, you can modify it.
An array indexing expression, like A[0] actually requires a pointer as one if its operands (and an integer as the other). Very often that pointer will be the result of an array expression "decaying" to a pointer, but it can also be just a pointer -- as long as that pointer points to an element of an array object.
The relationship between arrays and pointers in C is complicated, and there's a lot of misinformation out there. I recommend reading section 6 of the comp.lang.c FAQ.
You can use either an array name or a pointer to refer to elements of an array object. You ran into a problem with an array object that's read-only. For example:
#include <stdio.h>
int main(void) {
char array_object[] = "ab"; /* array_object is writable */
char *ptr = array_object; /* or &array_object[0] */
printf("array_object[0] = '%c'\n", array_object[0]);
printf("ptr[0] = '%c'\n", ptr[0]);
}
Output:
array_object[0] = 'a'
ptr[0] = 'a'
String literals like "ab" are supposed to be immutable, like any other literal (you can't alter the value of a numeric literal like 1 or 3.1419, for example). Unlike numeric literals, however, string literals require some kind of storage to be materialized. Some implementations (such as the one you're using, apparently) store string literals in read-only memory, so attempting to change the contents of the literal will lead to a segfault.
The language definition leaves the behavior undefined - it may work as expected, it may crash outright, or it may do something else.
String literals are not meant to be overwritten, think of them as read-only. It is undefined behavior to overwrite the string and your computer chose to crash the program as a result. You can use an array instead to modify the string.
char A[3] = "ab";
A[0] = 'c';
Is there a fundamental difference between declaring arrays using the two different methods i.e. char A[] and char *A?
Yes, because the second one is not an array but a pointer.
The type of "ab" is char /*readonly*/ [3]. It is an array with immutable content. So when you want a pointer to that string literal, you should use a pointer to char const:
char const *foo = "ab";
That keeps you from altering the literal by accident. If you however want to use the string literal to initialize an array:
char foo[] = "ab"; // the size of the array is determined by the initializer
// here: 3 - the characters 'a', 'b' and '\0'
The elements of that array can then be modified.
Array-indexing btw is nothing more but syntactic sugar:
foo[bar]; /* is the same as */ *(foo + bar);
That's why one can do funny things like
"Hello!"[2]; /* 'l' but also */ 2["Hello!"]; // 'l'

Why can't I change char array later?

char myArray[6]="Hello"; //declaring and initializing char array
printf("\n%s", myArray); //prints Hello
myArray="World"; //Compiler says"Error expression must a modifiable lvalue
Why can't I change myArray later? I did not declare it as const modifier.
When you write char myArray[6]="Hello"; you are allocating 6 chars on the stack (including a null-terminator).
Yes you can change individual elements; e.g. myArray[4] = '\0' will transform your string to "Hell" (as far as the C library string functions are concerned), but you can't redefine the array itself as that would ruin the stack.
Note that [const] char* myArray = "Hello"; is an entirely different beast: that is read-only memory and any changes to that string is undefined behaviour.
Array is a non modifiable lvalue. So you cannot modify it.
If you wish to modify the contents of the array, use strcpy.
Because the name of an array cannot be modified, just use strcpy:
strcpy(myArray, "World");
You can't assign to an array (except when initializing it in its declaration. Instead you have to copy to it. This you do using strcpy.
But be careful so you don't copy more than five characters to the array, as that's the longest string it can contain. And using strncpy in this case may be dangerous, as it may not add the terminating '\0' character if the source string is to long.
You can't assign strings to variables in C except in initializations. Use the strcpy() function to change values of string variables in C.
Well myArray is the name of the array which you cannot modify. It is illegal to assign a value to it.
Arrays in C are non-modifiable lvalues. There are no operations in C that can modify the array itself (only individual elements can be modifiable).
Well myArray is of size 6 and hence care must be taken during strcpy.
strcpy(myArray,"World") as it would result in overflow if the source's string length is more than the destination's (6 in this case).
A arrays in C are non-modifiable lvalues. There are no operations in C that can modify the array itself (only individual elements can be modifiable).
A possible and safe method would be
char *ptr = "Hello";
If you want to change
ptr = strdup("World");
NOTE:
Make sure that you free(ptr) at the end otherwise it would result in memory leak.
You cannot assign naked arrays in C. However you can assign pointers:
char const *myPtr = "Hello";
myPtr = "World";
Or you can assign to the elements of an array:
char myArray[6] = "Hello";
myArray[0] = 'W';
strcpy(myArray, "World");

Initialization different from assignment?

1.char str[] = "hello"; //legal
2.char str1[];
str1 = "hello"; // illegal
I understand that "hello" returns the address of the string literal from the string literal pool which cannot be directly assigned to an array variable. And in the first case the characters from the "hello" literal are copied one by one into the array with a '\0' added at the end.
Is this because the assignment operator "=" is overloaded here to support this?
I would also like to know other interesting cases wherein initialization is different from assignment.
You cannot think of it as overloading (which doesn't exist in C anyway), because the initialization of char arrays with string literals is a special case. The type of a string literal is const char[N], so if it were similar to overloading, you'd be able to initialize a char array with any expression whose type is const char[N]. But you cannot!
const char arr[3];
const char arr1[] = arr; //compiler error. Cannot initialize array with another array.
The language standard simply says that character arrays can be initialized with string literals. Since they say nothing about assignment, the general rules apply, in particular, that an array cannot be assigned to.
As for other cases when initialization is different from assignment: in C++, where there are references and classes, there would be zillions of examples. In C, with no full-fledged classes or references, the only other thing I can think of off the top of my head is const variables:
const int a = 4; //OK;
const int b; //Error;
b = 4; //Error;
Another example: array initialization with braces
int a[3] = {1,2,3}; //OK
int b[3];
b = {1,2,3}; //error
Same with structs
If you want to think of it as the operator being overloaded (even though C doesn't use the term), you can of course do that.
Do you also consider this to be overloading:
unsigned char x;
double y;
x = 2;
y = 1.243;
Those are assigning totally different types of data, after all, but using the "same operator", right?
It's just different, to be initializing or to be assigning.
Another big difference is that you used to be able to initialize structures, but there was no corresponding "struct literal" syntax for later assignments. This is no longer true as of C99, where we now have compound literals.
char str[] = "hello";
Is array initialization, using syntactic sugar defined in C because string initialization is so common. The compiler allocates some fixed memory in your program an initializes it. The name of the array (str) evaluates to the address of this memory, and it cannot be changed because there is no variable which holds that address.
Grijesh Chauhan explains more details of this.
Other cases depend on what you mean. Extending the current case, you can easily see that other initialized arrays have the same properties, for example
int a[] = { 1, 2, 3, 4 };
Array has non modifiable address. You need a pointer as a modifiable lvalue.
By assigning(trying) to a contant string literal, you are taking the address of it. Different address causes that illegality.
"hello" allocates some space in memory and gives and address. Then you take its address to initialize the array.

C: Behaviour of arrays when assigned to pointers

#include <stdio.h>
main()
{
char * ptr;
ptr = "hello";
printf("%p %s" ,"hello",ptr );
getchar();
}
Hi, I am trying to understand clearly how can arrays get assign in to pointers. I notice when you assign an array of chars to a pointer of chars ptr="hello"; the array decays to the pointer, but in this case I am assigning a char of arrays that are not inside a variable and not a variable containing them ", does this way of assignment take a memory address specially for "Hello" (what obviously is happening) , and is it possible to modify the value of each element in "Hello" wich are contained in the memory address where this array is stored. As a comparison, is it fine for me to assign a pointer with an array for example of ints something as vague as thisint_ptr = 5,3,4,3; and the values 5,3,4,3 get located in a memory address as "Hello" did. And if not why is it possible only with strings? Thanks in advanced.
"hello" is a string literal. It is a nameless non-modifiable object of type char [6]. It is an array, and it behaves the same way any other array does. The fact that it is nameless does not really change anything. You can use it with [] operator for example, as in "hello"[3] and so on. Just like any other array, it can and will decay to pointer in most contexts.
You cannot modify the contents of a string literal because it is non-modifiable by definition. It can be physically stored in read-only memory. It can overlap other string literals, if they contain common sub-sequences of characters.
Similar functionality exists for other array types through compound literal syntax
int *p = (int []) { 1, 2, 3, 4, 5 };
In this case the right-hand side is a nameless object of type int [5], which decays to int * pointer. Compound literals are modifiable though, meaning that you can do p[3] = 8 and thus replace 4 with 8.
You can also use compound literal syntax with char arrays and do
char *p = (char []) { "hello" };
In this case the right-hand side is a modifiable nameless object of type char [6].
The first thing you should do is read section 6 of the comp.lang.c FAQ.
The string literal "hello" is an expression of type char[6] (5 characters for "hello" plus one for the terminating '\0'). It refers to an anonymous array object with static storage duration, initialized at program startup to contain those 6 character values.
In most contexts, an expression of array type is implicitly converted a pointer to the first element of the array; the exceptions are:
When it's the argument of sizeof (sizeof "hello" yields 6, not the size of a pointer);
When it's the argument of _Alignof (a new feature in C11);
When it's the argument of unary & (&arr yields the address of the entire array, not of its first element; same memory location, different type); and
When it's a string literal in an initializer used to initialize an array object (char s[6] = "hello"; copies the whole array, not just a pointer).
None of these exceptions apply to your code:
char *ptr;
ptr = "hello";
So the expression "hello" is converted to ("decays" to) a pointer to the first element ('h') of that anonymous array object I mentioned above.
So *ptr == 'h', and you can advance ptr through memory to access the other characters: 'e', 'l', 'l', 'o', and '\0'. This is what printf() does when you give it a "%s" format.
That anonymous array object, associated with the string literal, is read-only, but not const. What that means is that any attempt to modify that array, or any of its elements, has undefined behavior (because the standard explicitly says so) -- but the compiler won't necessarily warn you about it. (C++ makes string literals const; doing the same thing in C would have broken existing code that was written before const was added to the language.) So no, you can't modify the elements of "hello" -- or at least you shouldn't try. And to make the compiler warn you if you try, you should declare the pointer as const:
const char *ptr; /* pointer to const char, not const pointer to char */
ptr = "hello";
(gcc has an option, -Wwrite-strings, that causes it to treat string literals as const. This will cause it to warn about some C code that's legal as far as the standard is concerned, but such code should probably be modified to use const.)
#include <stdio.h>
main()
{
char * ptr;
ptr = "hello";
//instead of above tow lines you can write char *ptr = "hello"
printf("%p %s" ,"hello",ptr );
getchar();
}
Here you have assigned string literal "hello" to ptr it means string literal is stored in read only memory so you can't modify it. If you declare char ptr[] = "hello";, then you can modify the array.
Say what?
Your code allocates 6 bytes of memory and initializes it with the values 'h', 'e', 'l', 'l', 'o', and '\0'.
It then allocates a pointer (number of bytes for the pointer depends on implementation) and sets the pointer's value to the start of the 5 bytes mentioned previously.
You can modify the values of an array using syntax such as ptr[1] = 'a'.
Syntactically, strings are a special case. Since C doesn't have a specific string type to speak of, it does offer some shortcuts to declaring them and such. But you can easily create the same type of structure as you did for a string using int, even if the syntax must be a bit different.

C compound literals, pointer to arrays

I'm trying to assign a compound literal to a variable, but it seems not to work, see:
int *p[] = (int *[]) {{1,2,3},{4,5,6}};
I got a error in gcc.
but if I write only this:
int p[] = (int []) {1,2,3,4,5,6};
Then it's okay.
But is not what I want.
I don't understand why the error occurrs, because if I initialize it like a array, or use it with a pointer of arrays of chars, its okay, see:
int *p[] = (int *[]) {{1,2,3},{4,5,6}}; //I got a error
int p[][3] = {{1,2,3},{4,5,6}}; //it's okay
char *p[] = (char *[]) {"one", "two"...}; // it's okay!
Note I don't understand why I got an error in the first one, and please I can't, or I don't want to write like the second form because it's needs to be a compound literals, and I don't want to say how big is the array to the compiler. I want something like the second one, but for int values.
Thanks in advance.
First, the casts are redundant in all of your examples and can be removed. Secondly, you are using the syntax for initializing a multidimensional array, and that requires the second dimension the be defined in order to allocate a sequential block of memory. Instead, try one of the two approaches below:
Multidimensional array:
int p[][3] = {{1,2,3},{4,5,6}};
Array of pointers to one dimensional arrays:
int p1[] = {1,2,3};
int p2[] = {4,5,6};
int *p[] = {p1,p2};
The latter method has the advantage of allowing for sub-arrays of varying length. Whereas, the former method ensures that the memory is laid out contiguously.
Another approach that I highly recommend that you do NOT use is to encode the integers in string literals. This is a non-portable hack. Also, the data in string literals is supposed to be constant. Do your arrays need to be mutable?
int *p[] = (int *[]) {
"\x01\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00",
"\x04\x00\x00\x00\x05\x00\x00\x00\x06\x00\x00\x00"
};
That example might work on a 32-bit little-endian machine, but I'm typing this from an iPad and cannot verify it at the moment. Again, please don't use that; I feel dirty for even bringing it up.
The casting method you discovered also appears to work with a pointer to a pointer. That can be indexed like a multidimensional array as well.
int **p = (int *[]) { (int[]) {1,2,3}, (int[]) {4,5,6} };
First understand that "Arrays are not pointers".
int p[] = (int []) {1,2,3,4,5,6};
In the above case p is an array of integers. Copying the elements {1,2,3,4,5,6} to p. Typecasting is not necessary here and both the rvalue and lvalue types match which is an integer array and so no error.
int *p[] = (int *[]) {{1,2,3},{4,5,6}};
"Note I don't understand why I got a error in the first one,.."
In the above case, p an array of integer pointers. But the {{1,2,3},{4,5,6}} is a two dimensional array ( i.e., [][] ) and cannot be type casted to array of pointers. You need to initialize as -
int p[][3] = { {1,2,3},{4,5,6} };
// ^^ First index of array is optional because with each column having 3 elements
// it is obvious that array has two rows which compiler can figure out.
But why did this statement compile ?
char *p[] = {"one", "two"...};
String literals are different from integer literals. In this case also, p is an array of character pointers. When actually said "one", it can either be copied to an array or point to its location considering it as read only.
char cpy[] = "one" ;
cpy[0] = 't' ; // Not a problem
char *readOnly = "one" ;
readOnly[0] = 't' ; // Error because of copy of it is not made but pointing
// to a read only location.
With string literals, either of the above case is possible. So, that is the reason the statement compiled. But -
char *p[] = {"one", "two"...}; // All the string literals are stored in
// read only locations and at each of the array index
// stores the starting index of each string literal.
I don't want to say how big is the array to the compiler.
Dynamically allocating the memory using malloc is the solution.
Hope it helps !
Since nobody's said it: If you want to have a pointer-to-2D-array, you can (probably) do something like
int (*p)[][3] = &(int[][3]) {{1,2,3},{4,5,6}};
EDIT: Or you can have a pointer to its first element via
int (*p)[3] = (int[][3]) {{1,2,3},{4,5,6}};
The reason why your example doesn't work is because {{1,2,3},{4,5,6}} is not a valid initializer for type int*[] (because {1,2,3} is not a valid initializer for int*). Note that it is not an int[2][3] — it's simply an invalid expression.
The reason why it works for strings is because "one" is a valid initializer for char[] and char[N] (for some N>3). As an expression, it's approximately equivalent to (const char[]){'o','n','e','\0'} except the compiler doesn't complain too much when it loses constness.
And yes, there's a big difference between an initializer and an expression. I'm pretty sure char s[] = (char[]){3,2,1,0}; is a compile error in C99 (and possibly C++ pre-0x). There are loads of other things too, but T foo = ...; is variable initialization, not assignment, even though they look similar. (They are especially different in C++, since the assignment operator is not called.)
And the reason for the confusion with pointers:
Type T[] is implicitly converted to type T* (a pointer to its first element) when necessary.
T arg1[] in a function argument list actually means T * arg1. You cannot pass an array to a function for Various Reasons. It is not possible. If you try, you are actually passing a pointer-to-array. (You can, however, pass a struct containing a fixed-size array to a function.)
They both can be dereferenced and subscripted with identical (I think) semantics.
EDIT: The observant might notice that my first example is roughly syntactically equivalent to int * p = &1;, which is invalid. This works in C99 because a compound literal inside a function "has automatic storage duration associated with the enclosing block" (ISO/IEC 9899:TC3).
The one that you are using is array of int pointers. You should use pointer to array :
int (*p)[] = (int *) {{1,2,3}, {4,5,6}}
Look at this answer for more details.
It seems you are confusing pointers and array. They're not the same thing! An array is the list itself, while a pointer is just an address. Then, with pointer arithmetic you can pretend pointers are array, and with the fact that the name of an array is a pointer to the first element everything sums up in a mess. ;)
int *p[] = (int *[]) {{1,2,3},{4,5,6}}; //I got a error
Here, p is an array of pointers, so you are trying to assign the elements whose addresses are 1, 2, 3 to the first array and 4, 5, 6 to the second array. The seg fault happens because you can't access those memory locations.
int p[][3] = {{1,2,3},{4,5,6}}; //it's okay
This is ok because this is an array of arrays, so this time 1, 2, 3, 4, 5 and 6 aren't addresses but the elements themselves.
char *p[] = (char *[]) {"one", "two"...}; // it's okay!
This is ok because the string literals ("one", "two", ...) aren't really strings but pointers to those strings, so you're assigning to p[1] the address of the string literal "one".
BTW, this is the same as doing char abc[]; abc = "abc";. This won't compile, because you can't assign a pointer to an array, while char *def; def = "def"; solves the problem.

Resources