Why can't I define a string array using char **?

We can assign a string constant to char * or char [ ] just like:
char *p = "hello";
char a[] = "hello";
Now for string array, naturally it'll be like this:
char **p = {"hello", "world"}; // Error
char *a[] = {"hello", "world"};
The first form generates a warning when compiling, and I get a segmentation fault when I try to print the string with printf("%s\n", p[0]);
Why?

char **p = {"hello", "world"};
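Here, p is a pointer to pointer to char. It is a scalar, so it cannot be initialized with a brace-enclosed list of several pointers (each string literal is converted to a pointer during initialization; the actual type of a string literal in C is char[n]).
The types are incompatible: p has type char **, while the initializer is what you would write for a char *[]. The compiler uses only the first element, "hello" (a char * after conversion), as the value of p and treats "world" as an excess element, hence the diagnostics. At run time p[0] then reinterprets the bytes of "hello" as a pointer, which explains the segmentation fault.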
Whereas,
char *a[] = {"hello", "world"};
is valid as a is an array of pointers to char and the types match. Hence this is a valid initialization.
Since C99, the C language supports Compound literals (6.5.2.5, C99) using which you can initialize:
char **p = (char *[]) {"hello", "world"};
So use a compound literal if your compiler supports C99 or later; otherwise, stick to the array-of-pointers initialization (char *a[]).
You can read about various examples on compound literals here (gcc manual).
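Here is a minimal sketch comparing the two valid forms from this answer. The helper names and the NULL sentinel at the end of each list are my own assumptions for illustration, not part of the original code.

```c
#include <stddef.h>

/* Counts the entries in a NULL-terminated list of strings. */
size_t count_words(char **words)
{
    size_t n = 0;
    while (words[n] != NULL)
        n++;
    return n;
}

/* Array-of-pointers form: valid in every C standard. */
size_t count_from_array(void)
{
    char *a[] = {"hello", "world", NULL};
    return count_words(a);   /* a is converted to char ** here */
}

/* Compound-literal form: requires C99 or later. */
size_t count_from_compound_literal(void)
{
    char **p = (char *[]){"hello", "world", NULL};
    return count_words(p);
}
```

Both functions see the same kind of object: an actual array of pointers that a char ** can point into. Note that a compound literal written inside a function only lives until the end of the enclosing block.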

To summarize: it fails because char **p points to a single pointer, not to an array of pointers.
You need to build the array yourself. For each level (or dimension) you must reserve memory for the pointers.
What you need to do is:
// this holds your pointers
size_t n = 2; // number of elements
char **p = malloc(sizeof(char *) * n);
// to set the elements, you need to:
p[0] = "hello";
p[1] = "world";
And this needs to be done for as many levels (or dimensions) as you have.
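A slightly fuller sketch of the malloc approach, with the allocation check that real code would want. The function name make_word_list and the fixed count of 2 are assumptions made for the example.

```c
#include <stdlib.h>

/* Builds a heap-allocated array of two pointers to string literals.
 * Returns NULL if the allocation fails; the caller must free() the result. */
char **make_word_list(void)
{
    size_t count = 2;                       /* assumed number of elements */
    char **p = malloc(count * sizeof *p);   /* room for the pointers only */
    if (p == NULL)
        return NULL;
    p[0] = "hello";   /* stores the address of the literal, not a copy */
    p[1] = "world";
    return p;
}
```

Only the array of pointers lives on the heap here. The strings themselves are still literals, so they must not be modified and must not be passed to free().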

Because char **p is a pointer to a pointer, not an array of pointers, whereas char *a[] is an array of pointers.
char *ptr ="hello";
defines ptr to be a pointer to the (read-only) string "hello", so it contains the address of that string, say 100. ptr itself must be stored somewhere, say at location 200.
char **p = &ptr;
Now p points to ptr, that is, it contains the address of ptr (which is 200).
printf("%s", *p); // *p is ptr; **p would be the single char 'h'
So you are creating a pointer to another pointer, but no extra memory to store more than one string, whereas char *a[] creates an array of pointers whose size depends on the initializer you give it.
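To make the two levels of indirection concrete, here is a small sketch. The function names are hypothetical, and I have used const char * so the compiler rejects writes to the literal.

```c
/* Returns the string reached by dereferencing the double pointer once. */
const char *string_via_double_pointer(void)
{
    const char *ptr = "hello";   /* ptr holds the address of the literal */
    const char **p = &ptr;       /* p holds the address of ptr */
    return *p;                   /* *p is ptr, a pointer to the literal */
}

/* Returns the first character, reached by dereferencing twice. */
char char_via_double_pointer(void)
{
    const char *ptr = "hello";
    const char **p = &ptr;
    return **p;                  /* **p is ptr[0] */
}
```

The first function is what you would pass to a %s conversion, the second to %c.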
A piece of advice:
Do not use char *p = "hello";
use const char *p = "hello";
because string literals are typically stored in read-only memory and modifying them is undefined behavior.

First, char *p = "hello" is different from char a[] = "hello".
char *p = "hello"
"hello" is a string literal, an array of char that is converted to a pointer to its first character. You assign that pointer to p, and the literal must not be modified, so it's better to use const char *p = "hello".
char a[] = "hello"
The characters in "hello" are copied into the array a, including the terminating '\0'.
Second,
char *a[]
defines an array of pointers, so it's OK to use char *a[] = {"hello", "world"};
char **p
defines a pointer to a pointer to char, so it makes no sense to use char **p = {"hello", "world"};

Related

What is the difference between char stringA[LEN] and char* stringB[LEN] in C

I read a few similar questions, C: differences between char pointer and array, What is the difference between char s[] and char *s?, What is the difference between char array[] and char *array? but none of them seem to clear my doubt.
I'm aware that
char *s = "Hello world";
makes the string immutable whereas
char s[] = "Hello world";
can be modified.
My doubt is if I do char stringA[LEN]; and char* stringB[LEN]; Are they any different? Or does stringB again becomes immutable as in the case before?
Let me give you a visual explanation:
As you can see, a is an array of 4 characters, whereas b is an array of 4 character pointers, each pointing to the beginning of a C string. Each one of those strings can have a different length.
Are they any different?
Yes.
Both variables stringA and stringB are arrays. stringA is an array of char of size LEN and stringB is an array of char * of size LEN.
char and char * are two different types. stringA can hold only one character string of at most LEN - 1 characters plus the terminating '\0', while the elements of stringB can point to LEN different strings.
Or does stringB again becomes immutable as in the case before?
Whether the strings pointed to by the elements of stringB are mutable depends on how their memory is allocated. If they are initialized with string literals
char* stringB[LEN] = { "Apple", "Bapple", "Capple"};
then they are immutable. In case of
for (int i = 0; i < LEN; i++)
    stringB[i] = malloc(30); // Allocating 30 bytes for each element
strcpy(stringB[0], "Apple");
strcpy(stringB[1], "Bapple");
strcpy(stringB[2], "Capple");
they are mutable.
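As a rough sketch of the mutable case, the helper below makes a writable heap copy of any string. The name copy_string is hypothetical, and unlike the fixed malloc(30) above it sizes the buffer from the source string.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a heap-allocated, writable copy of src, or NULL on failure.
 * The caller owns the result and must free() it. */
char *copy_string(const char *src)
{
    size_t len = strlen(src) + 1;   /* +1 for the terminating '\0' */
    char *dst = malloc(len);
    if (dst != NULL)
        memcpy(dst, src, len);
    return dst;
}
```

An element set up as stringB[0] = copy_string("apple"); can then be changed safely, for example with stringB[0][0] = 'A';, because it points into heap memory rather than at a literal.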
They are not the same.
Here stringA is an array of char, that means printing stringA[0] will show the letter S:
char stringA[] = "Something";
Whereas printing stringB[0] here will show Something (array of pointer to char):
char* stringB[] = { "Something", "Else" };
They do not have the same type, let alone being comparable.
char stringA[LEN]; is a char array of LEN length. (Array of chars)
char* stringB[LEN]; is a char * array of LEN length. (Array of char pointers)
FWIW, in case of char *s = "Hello world"; s is a pointer which points to a string literal which is non-modifiable. The pointer itself can certainly be changed. Only the content it points to (values) cannot be changed.
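A tiny sketch of that distinction, under the assumption that s is declared const char * so the compiler enforces the read-only part (the function name is made up for the example):

```c
/* Re-points s at a different literal; neither literal is written to. */
const char *repoint(void)
{
    const char *s = "Hello world";
    s = "Goodbye";      /* changing the pointer itself is fine */
    /* s[0] = 'h';         would not compile: s points to const char */
    return s;
}
```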
The reason
char* s = "Hello World!";
is immutable is that "Hello World!" is typically stored in read-only (RO) memory, so trying to change it is undefined behavior and usually crashes the program. Declaring it as a pointer to the first element of an "array" is NOT the reason it's immutable. You seem to be slightly confused as well. Assuming your question is exactly what you mean,
char stringA[LEN] = "ABC";
is a string in traditional C style, but stringB as you've defined isn't a string, it's an array of strings -
char* stringB[LEN] = {"ABC", "DEF"};
Assuming you mean what I think you mean,
char stringA[LEN] = "Hello World!";
char *stringB = malloc(LEN);
strcpy(stringB, stringA);
In this case, stringB IS mutable, since it refers to writable memory.

When to use char a[] over char *p and vice versa?

Lately I've been learning all about the C language, and am confused as to when to use
char a[];
over
char *p;
when it comes to string manipulation. For instance, I can assign a string to them both like so:
char a[] = "Hello World!";
char *p = "Hello World!";
and view/access them both like:
printf("%s\n", a);
printf("%s\n", p);
and manipulate them both like:
printf("%c\n", a[6]);
printf("%c\n", p[6]);
So, what am I missing?
char a[] = "Hello World!";
This allocates a modifiable array just big enough to hold the string literal (including the terminating NUL char) and initializes it with the literal's contents. If it is a local variable, this effectively means a memcpy at runtime every time the variable is created.
Use this when you need to modify the string, but don't need to make it bigger.
Also, if you have char *ap = a;, when a goes out of scope ap becomes a dangling pointer. Or, same thing, you can't do return a; when a is local to that function, because return value will be dangling pointer to now destroyed local variables of that function.
Note that using exactly this is rare. Usually you don't want an array with contents from string literal. It's much more common to have something like:
char buf[100]; // contents are undefined
snprintf(buf, sizeof buf, "%s/%s.%d", pathString, nameString, counter);
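A hedged sketch of that buffer pattern with the truncation check added. The function name build_path and its return convention are my assumptions; the format string is the one from the snippet above.

```c
#include <stdio.h>

/* Formats "dir/name.counter" into buf.
 * Returns 0 if the whole string fit, -1 if it was truncated or on error. */
int build_path(char *buf, size_t size,
               const char *dir, const char *name, int counter)
{
    int n = snprintf(buf, size, "%s/%s.%d", dir, name, counter);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

snprintf returns the length the full string would have had, so comparing it against the buffer size is how you detect truncation.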
char *p = "Hello World!";
This defines a pointer and initializes it to point to a string literal. Note that string literals are (normally) non-writable, so you really should have this instead:
const char *p = "Hello World!";
Use this when you need a pointer to a non-modifiable string.
In contrast to a above, if you have const char *p2 = p; or do return p;, these are fine, because pointer points to the string literal in program's constant data, and is valid for the whole execution of the program.
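The lifetime difference can be sketched like this. Both function names are hypothetical; the second one shows one common way around the dangling-pointer problem, namely copying into storage that the caller owns.

```c
#include <string.h>

/* Safe: the literal has static storage duration, so the returned
 * pointer stays valid for the whole run of the program. */
const char *greeting(void)
{
    return "Hello World!";
}

/* A local array dies when the function returns, so instead of
 * returning a pointer to it we copy its contents out to the caller. */
void greeting_copy(char *out, size_t size)
{
    char a[] = "Hello World!";
    a[0] = 'J';                     /* fine: a is a writable array */
    if (size == 0)
        return;
    strncpy(out, a, size - 1);
    out[size - 1] = '\0';           /* strncpy may not terminate */
}
```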
The string literals themselves (the text within double quotes, the actual bytes making up the strings) are created at compile time and normally placed with other constant data within the application. A string literal in code then concretely means the address of this constant data blob.
A char * that points to a string literal must be treated as read-only. The literal cannot be modified, while a char[] array initialized from it can be.
char *str = "hello";
str[0] = 't'; // Undefined behavior: modifies a string literal
Whereas
char str[] = "hello"; str[0] = 't'; // Legal, string becomes tello

confusion about character array initialisation

What is the difference between
char ch [ ] = "hello";
and
char ch [ ] = { 'h','e','l','l','o','\0'};
and why we can only do
char *p = "hello";
But cant do
char *p = {'h','e','l','l','o','\0'};
char ch [ ] = "hello";
char ch [ ] = { 'h','e','l','l','o','\0'};
There is no difference. ch object will be exactly the same in both declarations.
On why we cannot do:
char *p = {'h','e','l','l','o','\0'};
An initializer list of more than one value can only be used for objects of aggregate type (structures or array). You can only initialize a char * with a pointer value.
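Both points in this answer can be checked with a short sketch. The function names are made up, and the second function uses a C99 compound literal, which the answer itself does not mention: it is one way to get a pointer to an array built from a character list.

```c
#include <string.h>

/* Returns 1 if both initializer forms produce identical arrays. */
int same_array_contents(void)
{
    char from_literal[] = "hello";
    char from_list[]    = {'h', 'e', 'l', 'l', 'o', '\0'};
    return sizeof from_literal == sizeof from_list
        && memcmp(from_literal, from_list, sizeof from_literal) == 0;
}

/* C99 and later: the compound literal creates an unnamed array,
 * and p points to its first element. Returns 1 if it reads "hello". */
int compound_literal_pointer_ok(void)
{
    char *p = (char[]){'h', 'e', 'l', 'l', 'o', '\0'};
    return strcmp(p, "hello") == 0;
}
```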
Actually:
char ch [ ] = "hello";
and
char *p = "hello";
are not the same. The first initializes an array with the elements of a string literal and the second is a pointer to a string literal.
The two array declarations are the same. As for the pointer declaration, the form char *p = {'h','e','l','l','o','\0'}; is not valid simply because the language does not define it: a brace-enclosed list does not create an array object that a pointer could point to.
There is no theoretical reason, to my knowledge, why that declaration could not have been made valid, given that char *p = "hello"; is.
Once upon a time they made a mistake about const.
The bit "hello" should be a const char * const, but they were lazy and just used char *. But to keep the faith alive they let that one slip.
Then they said: OK, they can be equal.
Then they had char [] = { 'a', 'b', ...}; and all was good in the world.
Then the evil monster came and thrust upon them char *p = "hello". But the evil monster was in a good mood and said it should be const char *p = "hello", but that he would be happy with that.
He went home, but the evil monster was not amused. He dictated over his realm that char *p = {'h','e','l','l','o','\0'}; is the sign of a heretic.
Basically there was a cock-up. Just do it right from now on; there is old code kicking around that needs to be satisfied.
Let's take it one by one. This following is initializing a char array ch using the string literal "hello".
char ch[] = "hello";
The following is initializing the array ch using an array initialization list. This is equivalent to the above statement.
char ch[] = {'h', 'e', 'l', 'l', 'o', '\0'};
The following is initializing a char pointer p to point to the memory where the string literal "hello" is stored. This is typically read-only memory. Attempting to modify its contents will not give a compile error, because string literals in C are not const-qualified (unlike in C++), but it causes undefined behaviour and may crash the program.
char *p = "hello";
const char *p = "hello"; // better
The following last statement is plain wrong.
char *p = {'h','e','l','l','o','\0'};
p is a char pointer here, not an array and can't be initialized using array initialization list. I have highlighted the words array and pointer above to emphasize that array and pointer are different types. In some cases, an array is implicitly converted to a pointer to its first element like when an array is passed to a function or assigned to a pointer of the same type. This does not mean they are the same. They have different pointer arithmetic and different sizeof values.
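The sizeof difference is easy to see in a small sketch (function names are hypothetical):

```c
#include <stddef.h>

/* sizeof applied to the array counts every element,
 * including the terminating '\0'. */
size_t array_size(void)
{
    char ch[] = "hello";
    return sizeof ch;
}

/* sizeof applied to the pointer is just the size of a pointer,
 * no matter how long the string it points to is. */
size_t pointer_size(void)
{
    const char *p = "hello";
    return sizeof p;
}
```

array_size() gives 6 for "hello", while pointer_size() gives sizeof(char *), which is commonly 4 or 8 depending on the platform.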
There is no difference in
char ch [ ] = "hello";
char ch [ ] = { 'h','e','l','l','o','\0'};
But check below
char *p = "hello"; //This is correct
char *p = {'h','e','l','l','o','\0'}; //This is wrong
If you want to make it correct, you need to declare an array instead of a pointer:
char p[] = {'h','e','l','l','o','\0'}; //This works

char pointer initialisation

why is the initialization as follows:
char *p = "hello";
allowed, but:
char *p = {'h','e','l','l','o','\0'};
is not, although they mean the same?
Why is the initialisation char *p = "hello"; allowed, but char *p = {'h','e','l','l','o','\0'}; is not allowed, although they mean the same?
Because string literals are character arrays. Arrays are not pointers (read c-faq). Both are different.
char *p = "hello"; // Non-modifiable, typically stored in read-only memory
char p[] = {'h','e','l','l','o','\0'}; // Modifiable, stored in writable memory
The string literal "hello" has type char[6] in C (const char[6] in C++). Initializing p, which is of type char *, with "hello" is valid because the array is implicitly converted to a pointer to its first element.
But in the second statement, {'h','e','l','l','o','\0'} is a brace-enclosed initializer list for an array of char, and you cannot initialize a pointer with it. To use that initializer, p must be an array too:
char p[6] = {'h','e','l','l','o','\0'};
or
char p[] = {'h','e','l','l','o','\0'};

Do these statements about pointers have the same effect?

Does this...
char* myString = "hello";
... have the same effect as this?
char actualString[] = "hello";
char* myString = actualString;
No.
char str1[] = "Hello world!"; //char-array on the stack; string can be changed
char* str2 = "Hello world!"; //char-array in the data-segment; it's READ-ONLY
The first example creates an array of size 13*sizeof(char) on the stack and copies the string "Hello world!" into it.
The second example creates a char* on the stack and points it to a location in the data-segment of the executable, which contains the string "Hello world!". This second string is READ-ONLY.
str1[1] = 'u'; //Valid
str2[1] = 'u'; //Invalid - MAY crash program!
No. The first one gives you a pointer to const data, and if you change any character via that pointer, it's undefined behavior. The second one copies the characters into an array, which isn't const, so you can change any characters (either directly in array, or via pointer) at will with no ill effects.
No. In the first one, you can't modify the string pointed by myString, in the second one you can. Read more here.
It isn't the same, because the unnamed array pointed to by myString in the first example is read-only and has static storage duration, whereas the named array in the second example is writeable and has automatic storage duration.
On the other hand, this is closer to being equivalent:
static const char actualString[] = "hello";
char* myString = (char *)actualString;
It's still not quite the same though, because the unnamed arrays created by string literals are not guaranteed to be unique, whereas explicit arrays are. So in the following example:
static const char string_a[] = "hello";
static const char string_b[] = "hello";
const char *ptr_a = string_a;
const char *ptr_b = string_b;
const char *ptr_c = "hello";
const char *ptr_d = "hello";
ptr_a and ptr_b are guaranteed to compare unequal, whereas ptr_c and ptr_d may be either equal or unequal - both are valid.
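The uniqueness point can be probed with a sketch like the one below. The function names are my own. The second function only reports what a particular compiler happened to do, since either result is conforming.

```c
static const char string_a[] = "hello";
static const char string_b[] = "hello";

/* Distinct named arrays are distinct objects, so this returns 1. */
int named_arrays_differ(void)
{
    const char *ptr_a = string_a;
    const char *ptr_b = string_b;
    return ptr_a != ptr_b;
}

/* Returns 1 if the implementation pooled the two identical literals
 * into one array, 0 if it kept them separate. Both are allowed. */
int literals_share_storage(void)
{
    const char *ptr_c = "hello";
    const char *ptr_d = "hello";
    return ptr_c == ptr_d;
}
```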
