confusion about character array initialisation - c

What is the difference between
char ch [ ] = "hello";
and
char ch [ ] = { 'h','e','l','l','o','\0'};
and why can we only do
char *p = "hello";
but can't do
char *p = {'h','e','l','l','o','\0'};

char ch [ ] = "hello";
char ch [ ] = { 'h','e','l','l','o','\0'};
There is no difference. The ch object will be exactly the same in both declarations.
On why we cannot do:
char *p = {'h','e','l','l','o','\0'};
An initializer list of more than one value can only be used for objects of aggregate type (structures or arrays). You can only initialize a char * with a pointer value.
Actually:
char ch [ ] = "hello";
and
char *p = "hello";
are not the same. The first initializes an array with the elements of a string literal and the second is a pointer to a string literal.
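The claim that the two array forms give the same object can be checked directly. The following is a minimal sketch; the function names array_forms_match and array_form_size are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 when both array forms hold identical bytes. */
int array_forms_match(void)
{
    char a[] = "hello";                          /* string-literal initializer */
    char b[] = {'h', 'e', 'l', 'l', 'o', '\0'};  /* brace-enclosed list */

    return sizeof a == sizeof b && memcmp(a, b, sizeof a) == 0;
}

/* Hypothetical helper: size of the array, five letters plus the '\0'. */
size_t array_form_size(void)
{
    char a[] = "hello";
    return sizeof a;
}
```

Calling array_forms_match() from main should give 1, and array_form_size() should give 6.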

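The two array declarations are the same. As for the pointer declaration, char *p = {'h','e','l','l','o','\0'}; is not valid because the language does not define it: a brace-enclosed list is only initializer syntax, and for a scalar such as a pointer it may contain a single expression.
The difference from char *p = "hello"; is that a string literal is itself an array object with static storage that p can point to, whereas the brace list creates no object. C99 later added compound literals, such as (char[]){'h','e','l','l','o','\0'}, which fill this gap.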

Once upon a time they made a mistake about const.
The bit "hello" should really have been const (an array of const char), but they were lazy and left it as plain char. To keep the faith alive they let that one slip.
Then they said: OK, they can be equal.
Then they had char [] = { 'a', 'b', ...}; and all was good in the world.
Then the evil monster came and thrust upon them char *p = "hello". But the evil monster was in a good mood and said: it should be const char *p = "hello", but I will be happy with that.
He went home, but the evil monster was not amused. He dictated over his realm that char *p = {'h','e','l','l','o','\0'}; is the sign of a heretic.
Basically there was a cock-up. Just do it right from now on; the old form survives because there is old code kicking around that needs to be satisfied.

Let's take it one by one. The following initializes a char array ch using the string literal "hello".
char ch[] = "hello";
The following is initializing the array ch using an array initialization list. This is equivalent to the above statement.
char ch[] = {'h', 'e', 'l', 'l', 'o', '\0'};
The following initializes a char pointer p to point to the memory where the string literal "hello" is stored. That memory is often read-only. Attempting to modify its contents does not produce a compile error, because string literals in C, unlike in C++, are not const-qualified, but it is undefined behaviour and may crash the program.
char *p = "hello";
const char *p = "hello"; // better
The following last statement is plain wrong.
char *p = {'h','e','l','l','o','\0'};
p is a char pointer here, not an array, and can't be initialized using an array initialization list. Arrays and pointers are different types. In some cases an array is implicitly converted to a pointer to its first element, for example when an array is passed to a function or assigned to a pointer of the same element type. This does not mean they are the same: they behave differently under pointer arithmetic and have different sizeof values.
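The sizeof and pointer-arithmetic differences can be made concrete with a small sketch. All function names here are hypothetical, and the pointer size depends on the platform.

```c
#include <stddef.h>

/* Once passed to a function, the array has been converted to a pointer. */
size_t size_seen_by_callee(const char *s)
{
    return sizeof s;                 /* size of a pointer, platform dependent */
}

/* In the scope where the array is declared, sizeof sees the whole array. */
size_t size_in_caller(void)
{
    char ch[] = "hello";
    return sizeof ch;                /* 6: five letters plus '\0' */
}

/* &ch has type char (*)[6], so adding 1 steps over the entire array. */
ptrdiff_t array_step(void)
{
    char ch[] = "hello";
    return (char *)(&ch + 1) - (char *)&ch;
}

/* A char * steps over a single char. */
ptrdiff_t pointer_step(void)
{
    char ch[] = "hello";
    char *p = ch;                    /* implicit array-to-pointer conversion */
    return (p + 1) - p;
}
```

Here array_step() yields 6 while pointer_step() yields 1, even though both start from the same address.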

There is no difference in
char ch [ ] = "hello";
char ch [ ] = { 'h','e','l','l','o','\0'};
But check below
char *p = "hello"; //This is correct
char *p = {'h','e','l','l','o','\0'}; //This is wrong
If you want p to stay a pointer, use a C99 compound literal, which creates an unnamed array for p to point to:
char *p = (char[]){'h','e','l','l','o','\0'}; //This works (C99 and later)
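One way to let a pointer refer to a brace-initialized array is a C99 compound literal. The sketch below assumes a C99 or later compiler; compound_literal_demo is a made-up name. Inside a function the unnamed array has automatic storage, so the pointer must not be used after the enclosing block ends.

```c
#include <string.h>

/* Hypothetical demo: returns 1 if the edited text reads "Hello". */
int compound_literal_demo(void)
{
    /* (char[]){...} creates an unnamed, modifiable array of 6 chars. */
    char *p = (char[]){'h', 'e', 'l', 'l', 'o', '\0'};

    p[0] = 'H';                      /* allowed, unlike with a string literal */
    return strcmp(p, "Hello") == 0;
}
```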

Related

initialize a pointer to a character pointer and initialize an array of pointer each pointing to a character pointer

I have a question regarding my code here:
char a = 'A';
char b = 'B';
char c = 'C';
char *ad = &a;
char *ab = &b;
char *ac = &c;
char* cp[] = {ad, ab, ac};
char **d = &(cp[2]); // okay
// char **d[0] = &(cp[2]); //variant 1: error: invalid initializer
// char **d[] = &(cp[2]); //variant 2: error: invalid initializer
I am not sure why variant 1 and variant 2 lead to errors. The original initialization (the line commented okay) gives no error. My understanding is that I have initialized a pointer, d, which points to a pointer that points to a character, so this seems fine.
For variant 1, I thought I was initializing an array of pointers, each of which points to a pointer that points to a character, and that I was initializing only the first element of that array.
Similarly, for variant 2, I thought I was initializing an unsized array of pointers, each of which points to a pointer that points to a character.
Could someone tell me why the compiler reports an error here?
Is my understanding of variant 1 correct, i.e. is d an array of pointers where each entry points to a pointer that points to a character? If so, why can't I initialize variant 1?
This compiles cleanly...
char a = 'A';
char b = 'B';
char c = 'C';
char *ad = &a;
char *ab = &b;
char *ac = &c;
char *cp[] = {ad, ab, ac}; // be consistent
char **d = &cp[2];
char **e[1] = { &cp[2] }; // cannot dimension an array to have 0 elements
char **f[] = { &cp[2] }; // compilers count more accurately than people
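Note the absence of unnecessary "()".
Array initialisers are listed within enclosing braces "{}". Both variants declare d as an array but supply a bare expression instead of a braced list, which is why the compiler rejects them; variant 1 additionally asks for an array of zero elements.
Exception to the last statement: char foo[] = "bar";... "string" array elements do not require braces unless one gets silly: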
char foo[] = { 'b', 'a', 'r', '\0', };
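To see that the braced form really produces an array whose element points into cp, a sketch along these lines may help. The function name is hypothetical, and the code assumes C99 or later, since the initializers use addresses of local variables.

```c
/* Hypothetical demo: returns the character reached through f[0]. */
char read_through_array_element(void)
{
    char a = 'A', b = 'B', c = 'C';
    char *cp[] = {&a, &b, &c};
    char **d = &cp[2];               /* scalar: a plain expression is enough */
    char **f[] = {&cp[2]};           /* array: needs a braced list; size is 1 */

    /* f[0] and d hold the same address, so both lead to c. */
    return (f[0] == d) ? **f[0] : '\0';
}
```

With the values above the function should return 'C'.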

How to convert constant char pointer to lower case in C?

I've a function which receives a const char* and I want to convert it to lowercase. But I get the error:
error: array initializer must be an initializer list or string literal
I tried to copy the string variable to another array so that I could lower case it. But I think I've got something confused.
This is my function:
int convert(const char* string)
{
char temp[] = string;
temp = tolower(temp); //error is here
//do stuff
}
I'm struggling to understand what this error means, could someone help in explaining it?
tolower takes a single character and returns it in lowercase.
Even if it didn't, arrays aren't assignable. Arrays and pointers are not the same thing.
You presumably want to do something like:
char *temp = strdup(string); // make a copy
// adjust copy to lowercase
unsigned char *tptr = (unsigned char *)temp;
while(*tptr) {
*tptr = tolower(*tptr);
tptr++;
}
// do things
// release copy
free(temp);
Make sure you understand the difference between the heap and the stack, and the rules affecting string literals.
First of all, tolower takes a single character (passed as an int), not a string.
But even if you passed a char, your code wouldn't work. The error "array initializer must be an initializer list or string literal" means that an array has to be initialized in one of the following ways, or declared first and then filled element by element:
char arr[4] = {'h', 'e', 'y','\0'}; // initializer list
char arr[4] = "hey"; // string literal
char arr[] = "hey"; // also a string literal
char arr[4];
arr[0] = 'h';
arr[1] = 'e';
arr[2] = 'y';
arr[3] = '\0'; // index 3 is the last element of arr[4]
Please note that, tolower() takes a character and not a string:
int tolower ( int c );
Also, you are trying to copy the string from string into the temp[] variable. There is no overloading of the = operator in C that would copy a string into a char array.
Now about your error:
error: array initializer must be an initializer list or string literal
It says that an array must be initialized either with a list of individual items or with a string literal.
E.g:
char temp[] = "Hi";
char temp[] = {'H','i','\0'};
Also, check your statement:
temp = tolower(temp);
Here the return type of tolower() is int, and you are assigning that to the array temp, which is not allowed.
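You can use the following snippet using strlwr(). Note that strlwr() is non-standard and not available everywhere (glibc, for example, does not provide it):
int convert(const char* string)
{
char *temp = malloc(strlen(string)+1); // needs <stdlib.h> and <string.h>
if (temp == NULL) return -1; // allocation failed
strcpy(temp,string);
strlwr(temp); // non-standard function
//do your stuff
free(temp); // release the copy
return 0; // placeholder result
}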
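Where strlwr() is unavailable, a standard-library-only version is short to write. This is a sketch under that assumption; lower_in_place and lower_copy are made-up names, not library functions.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: lower-cases s in place and returns s. */
char *lower_in_place(char *s)
{
    /* tolower expects a value representable as unsigned char (or EOF). */
    for (unsigned char *p = (unsigned char *)s; *p != '\0'; p++) {
        *p = (unsigned char)tolower(*p);
    }
    return s;
}

/* Hypothetical helper: returns a malloc'd lower-case copy of string,
   or NULL if allocation fails. The caller must free() the result. */
char *lower_copy(const char *string)
{
    size_t len = strlen(string) + 1;   /* include the terminating '\0' */
    char *temp = malloc(len);

    if (temp == NULL) {
        return NULL;
    }
    memcpy(temp, string, len);
    return lower_in_place(temp);
}
```

Inside convert, something like char *temp = lower_copy(string); followed by a NULL check and a final free(temp); would then replace the strlwr() call.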

char pointer initialisation

why is the initialization as follows:
char *p = "hello";
allowed, but:
char *p = {'h','e','l','l','o','\0'};
is not, although they mean the same?
Because string literals are character arrays. Arrays are not pointers (read c-faq). Both are different.
char *p = "hello"; // Must not be modified; often stored in read-only memory
char p[] = {'h','e','l','l','o','\0'}; // Modifiable array, stored in writable memory
In C, the string literal "hello" has type char[6]. Used as an initializer for p, which is of type char *, it is implicitly converted to a pointer to its first element, so the initialization is valid.
In the second statement, {'h','e','l','l','o','\0'} is a brace-enclosed initializer list for an array of char, not an object with an address, and you cannot initialize a pointer with it. To use that list, p must be an array too:
char p[6] = {'h','e','l','l','o','\0'};
or
char p[] = {'h','e','l','l','o','\0'};
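That a string literal is an array rather than a pointer shows up in sizeof, and the writable nature of the array form shows up when it is edited. A minimal sketch, with made-up function names:

```c
#include <stddef.h>

/* sizeof on a string literal reports the array size, including '\0'. */
size_t literal_size(void)
{
    return sizeof "hello";           /* 6, not the size of a pointer */
}

/* The array form owns its characters, so writing to it is fine. */
char first_after_edit(void)
{
    char p[] = {'h', 'e', 'l', 'l', 'o', '\0'};

    p[0] = 'j';
    return p[0];
}
```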

Why can't define string array using char **?

We can assign a string constant to char * or char [ ] just like:
char *p = "hello";
char a[] = "hello";
Now for string array, naturally it'll be like this:
char **p = {"hello", "world"}; // Error
char *a[] = {"hello", "world"};
The first form generates a warning when compiling, and the program crashes with a segmentation fault when I try to print the string constant with printf("%s\n", p[0]);
Why?
char **p = {"hello", "world"};
Here, p is a pointer to pointer to char, which can't be initialized with an initializer list meant for an array of pointers (each of the string literals gets converted into a pointer during initialization; the actual type of a string literal in C is char[n]).
The types are incompatible: p is of type char **, while the initializer list on the right-hand side only suits an array type such as char *[]. Hence, the diagnostic is issued by the compiler.
Whereas,
char *a[] = {"hello", "world"};
is valid as a is an array of pointers to char and the types match. Hence this is a valid initialization.
Since C99, the C language supports Compound literals (6.5.2.5, C99) using which you can initialize:
char **p = (char *[]) {"hello", "world"};
So use a compound literal if your compiler supports C99 or later; otherwise, stick to the array of pointers initialization (char *a[]).
The GCC manual has various examples of compound literals.
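The compound-literal form can be exercised as in the sketch below, which assumes C99 or later; word_at is a hypothetical name. The unnamed pointer array lives only until the function returns, but the string literals it points to have static storage, so returning one of them is safe.

```c
#include <stddef.h>

/* Hypothetical helper: returns word i, or NULL if i is out of range. */
const char *word_at(size_t i)
{
    /* Unnamed array of two char pointers; p points at its first element. */
    char **p = (char *[]){"hello", "world"};

    return i < 2 ? p[i] : NULL;
}
```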
To summarize, because char **p points to a pointer, not to an array of pointers.
That happens because you need to build the array yourself. For each level (or dimension) you need to reserve memory for the pointer holders.
What you need to do is:
// this holds your pointers; nr_of_elements is the number of strings to store
char **p = malloc(sizeof(char *) * nr_of_elements);
// to set the elements, you need to:
p[0] = "hello";
p[1] = "world";
And this needs to be done for as many levels (or dimensions) you have.
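Put together, the allocate-then-assign approach might look like the following sketch. make_words is a made-up name, and the element type is const char * here because the entries point at string literals.

```c
#include <stdlib.h>

/* Hypothetical helper: returns a malloc'd table of two string pointers,
   or NULL if allocation fails. The caller must free() the table. */
const char **make_words(void)
{
    const char **p = malloc(2 * sizeof *p);   /* room for two pointers */

    if (p == NULL) {
        return NULL;
    }
    p[0] = "hello";                  /* store addresses of the literals */
    p[1] = "world";
    return p;
}
```

Only the table itself is freed; the string literals are not heap allocations and must not be passed to free().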
Because char **p is a pointer to a pointer, not an array of pointers, whereas char *a[] is an array of pointers.
char *ptr ="hello";
defines ptr to be a pointer to the (read-only) string "hello", and thus contains the address of string say 100. ptr must itself be stored somewhere: say location 200.
char **p = &ptr;
Now p points to ptr, that is, it contains the address of ptr (which is 200).
printf("%s",*p); // *p is ptr, a char *; **p would be the single char 'h'
So you are creating a pointer to another pointer, but no extra memory to store more than one string, whereas char *a[] creates an array of pointers whose size depends on the initializers you have given.
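The two levels of indirection can be separated in a small sketch; the function names are hypothetical. One dereference gives the string pointer, two give a single character.

```c
/* *p is the char pointer itself, which is what "%s" expects. */
const char *string_through_p(void)
{
    const char *ptr = "hello";
    const char **p = &ptr;           /* p holds the address of ptr */

    return *p;                       /* same value as ptr */
}

/* **p is the first character of the string. */
char char_through_p(void)
{
    const char *ptr = "hello";
    const char **p = &ptr;

    return **p;                      /* 'h' */
}
```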
A piece of advice:
Do not use char *p = "hello";
use const char *p = "hello";
because string literals may be stored in read-only memory and must not be modified.
First, char *p = "hello" is different from char a[] = "hello".
char *p = "hello"
"hello" is an array of char that is converted to a pointer to its first character, and that pointer is assigned to p. The literal must not be modified, so it's better to use const char *p = "hello".
char a[] = "hello"
The characters in "hello" are copied into array a, including the terminating '\0'.
Second,
char *a[]
defines an array of pointers, so it's OK to use char *a[] = {"hello", "world"};
char **p
defines a pointer to a pointer to a char, so it makes no sense to use char **p = {"hello", "world"};

changing one char in a c string

I am trying to understand why the following code is illegal:
int main ()
{
char *c = "hello";
c[3] = 'g'; // segmentation fault here
return 0;
}
What is the compiler doing when it encounters char *c = "hello";?
The way I understand it, it's an automatic array of char, and c is a pointer to the first char. If so, c[3] is like *(c + 3) and I should be able to make the assignment.
Just trying to understand the way the compiler works.
String constants are immutable. You cannot change them, even if you assign them to a char * (so assign them to a const char * so you don't forget).
To go into some more detail, your code is roughly equivalent to:
int main() {
static const char ___internal_string[] = "hello";
char *c = (char *)___internal_string;
c[3] = 'g';
return 0;
}
This ___internal_string is often allocated to a read-only data segment - any attempt to change the data there results in a crash (strictly speaking, other results can happen as well - this is an example of 'undefined behavior'). Due to historical reasons, however, the compiler lets you assign to a char *, giving you the false impression that you can modify it.
Note that if you did this, it would work:
char c[] = "hello";
c[3] = 'g'; // ok
This is because we're initializing a non-const character array. Although the syntax looks similar, it is treated differently by the compiler.
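The array form can be tested without touching any string literal. In this sketch edited_word is a made-up name, and the caller is assumed to supply a buffer of at least 6 bytes.

```c
#include <string.h>

/* Hypothetical helper: writes the edited word into out and returns out.
   out must have room for at least 6 bytes. */
char *edited_word(char *out)
{
    char c[] = "hello";              /* private, writable copy of the literal */

    c[3] = 'g';                      /* fine: modifies the copy only */
    memcpy(out, c, sizeof c);        /* copy all 6 bytes, '\0' included */
    return out;
}
```

Because c is re-initialized from the literal on every call, repeated calls give the same result.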
There's a difference between these:
char c[] = "hello";
and
char *c = "hello";
In the first case the compiler allocates space on the stack for 6 bytes (i.e. 5 bytes for "hello" and one for the null terminator).
In the second case the compiler generates a static constant string "hello" in a global area (a string literal), and allocates a pointer on the stack that is initialized to point to that string.
You cannot modify a string literal, and that's why you're getting a segfault.
You can't change the contents of a string literal. You need to make a copy.
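#include <stdlib.h> // free
#include <string.h> // strdup (POSIX; standard C since C23)
int main ()
{
char *c = strdup("hello"); // Make a copy of "hello"
if (c == NULL) return 1; // allocation failed
c[3] = 'g';
free(c);
return 0;
}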

Resources