I'm currently learning C through "Learn C the Hard Way".
I am a bit confused by some sample code: why must some arrays be initialized with a pointer?
int ages[] = {23, 43, 12, 89, 2};
char *names[] = {
"Alan", "Frank",
"Mary", "John", "Lisa"
};
In the above example, why does the names[] array require a pointer when declared? How do you know when to use a pointer when creating an array?
A string literal such as "Alan" is of type char[5], and to point to the start of a string you use a char *. "Alan" itself is made up of:
{ 'A', 'l', 'a', 'n', '\0' }
As you can see it's made up of multiple chars. This char * points to the start of the string, the letter 'A'.
Since you want an array of these strings, you then add [] to your declaration, so it becomes: char *names[].
Prefer const pointers when you use string literals.
const char *names[] = {
"Alan", "Frank",
"Mary", "John", "Lisa"
};
In this declaration, names is an array of five pointers to const char; each element points to the first character of a string literal. When you need a pointer, you use a pointer; it's as simple as that.
Example:
const char *c = "Hello world";
So, when you use them in an array, you're creating 5 const char* pointers which point to string literals.
Because each element of the array is a char*. The other example has int elements. "Alan" is a string, and in C a string is usually referred to through a pointer to its first char.
In the case of char *names[] you are declaring an array of string pointers.
A string in C, e.g. "Alan", is a series of characters in memory ending with a \0 value that marks the end of the string,
so with that declaration you are doing this:
names[0] -> "Alan\0"
names[1] -> "Frank\0"
...
then you can use names[n] as the pointer to the string
printf( "%s:%zu", names[0], strlen(names[0]) );
which gives output "Alan:4"
The use of "array of pointer" is not required.
The following will work as well. It's an array of 20 byte character arrays. The compiler only needs to know the size of the thing in the array, not the length of the array. What you end up with is an array with 5 elements of 20 bytes each with a name in each one.
#include <stdio.h>
char names[][20] = {
"Alan", "Frank",
"Mary", "John", "Lisa"
};
int main(int argc, char *argv[])
{
int idx;
for (idx = 0; idx < 5; idx++) {
printf("'%s'\n", names[idx]);
}
}
In your example the size of the thing in the array is "pointer to char". A string constant can be used to initialize either a "pointer to char" or an "array of char".
Can I consider *str[10] as two dimensional array ?
If I declare char *str[10]={"ONE","TWO","THREE"} how we can access single character ?
This record
char str[10];
is a declaration of an array with 10 elements of the type char. For example, you can initialize the array like
char str[10] = "ONE";
This initialization is equivalent to
char str[10] = { 'O', 'N', 'E', '\0' };
All elements of the array that are not explicitly initialized are zero-initialized.
And you may change elements of the array like
str[0] = 'o';
or
strcpy( str, "TWO" );
This record
char *str;
declares a pointer to an object of the type char. You can initialize it for example like
char *str = "ONE";
In this case the pointer will be initialized with the address of the first character of the string literal.
This record
char * str[10];
is a declaration of an array of 10 elements that has the pointer type char *.
You can initialize it as for example
char * str[10] = { "ONE", "TWO", "THREE" };
In this case the first three elements of the array will be initialized by addresses of first characters of the string literals specified explicitly. All other elements will be initialized as null pointers.
You may not change the string literals pointed to by elements of the array. Any attempt to change a string literal results in undefined behavior.
To access elements of the string literals through the array you can use, for example, two subscript operators:
for ( size_t i = 0; str[0][i] != '\0'; ++i )
{
putchar( str[0][i] );
}
putchar( '\n' );
If you want to change strings then you need to declare for example a two dimensional array like
char str[][10] = { "ONE", "TWO", "THREE" };
In this case you can change elements of the array that are in turn one-dimensional arrays as for example
str[0][0] = 'o';
or
strcpy( str[0], "FOUR" );
Yes: char* str[10]; would create an array of 10 pointers to chars.
To access a single character, we can access it like a 2 dimensional array; i.e.:
char* str[10]={"ONE","TWO","THREE"};
char first = str[0][0];
Can I consider *str[10] as two dimensional array ?
It's unclear what you mean. *str[10] is not a valid type name, and the context is a bit lacking to determine how else to interpret it.
If you mean it as an expression referencing the subsequent definition then no, its type is char, but evaluating it produces undefined behavior.
If you are asking about the type of the object identified by str, referencing the subsequent definition, then again no. In this case it is a one-dimensional array of pointers to char.
If I declare char *str[10]={"ONE","TWO","THREE"} how we can access single character ?
You can access one of the pointers by indexing str, among other ways. For example, str[1]. You can access one of the characters in the string into which that pointer points by using the indexing operator again, among other ways. For example, str[1][0]. That you are then using a double index does not make str a 2D array. The memory layout is quite different than if you declared, say, char str[3][10];.
I don't quite understand how to correctly arrange const in order to pass a constant array of arrays to a function, for example, an array of strings:
void f (char **strings);
int main (void)
{
char strings[][2] = { "a", "b", "c" };
f (strings);
}
Could you tell me where to put const? As I understand it, there should be two const and one of them in the example above should stand before char.
There is a high probability of a duplicate, but I could not find a similar question :(
First of all, the declaration isn't ideal. char strings[][2] declares an indeterminate amount of char[2] arrays, where the amount is determined by the initializers. In most cases it makes more sense to declare an array of pointers instead, since that means that the strings pointed at do not need to have a certain fixed length.
So you could change the declaration to const char* strings[3] = { "a", "b", "c" };. Unless, of course, the intention is to allow only strings of 1 character plus 1 null terminator, in which case the original declaration was correct. And if you need readable and writable strings, then an array of pointers to string literals won't do either.
You can pass a const char* strings[3] to a function by declaring that function as
void f (const char* strings[3]);
Just like you declared the array you pass to the function.
Now as it happens, arrays when part of a function declaration "decay" into a pointer to the first element. In this case a pointer to a const char* item, written as const char**.
So you could have written void f (const char** strings); and it would have been equivalent.
I think this resource explains it nicely. In general, it depends on what "depth" of const guarding you need. If you want function "f" to have read-only access to the "outer" pointer, the pointer it points to (the "inner" pointer), and the char the "inner" pointer points to, then use
const char *const *const strings
If you want to relax the guards to make this legal:
strings = NULL;
Then use:
const char *const *strings
Relaxing further, to allow for:
strings = NULL;
*strings = NULL;
Use only one "const":
const char **strings
It is important to understand the difference between an array of arrays (for example, an array of arrays of 2 chars) and an array of pointers. As the wording suggests, their element type is totally different:
Each single element of an array of arrays, like my_arr_of_arrays below, is, well, an array! Not an address, not a pointer: Each element is a proper array, each of the same size (here: 2), a succession of a fixed number of sub-elements. The data is right there in the array object. After char my_arr_of_arrays[][2] = { "a", "b", "c", "" };, my_arr_of_arrays is a succession of characters, grouped in pairs: 'a', '\0', 'b', '\0', 'c', '\0', '\0', '\0'. The picture below illustrates that. Each element in my_arr_of_arrays has a size of 2 bytes. Each string literal is used to copy the characters in it into the corresponding array elements of my_arr_of_arrays. The data can be overwritten later; the copy is not const.
Contrast this with an array of pointers, like my_arr_of_ptrs below! Each element in such an array is, well, a pointer. The data proper is somewhere else! The actual data may have been allocated with malloc, or it is static data like the string literals in my example below. Each element (each address) has a size of 4 bytes on a typical 32-bit architecture, or 8 bytes on a typical 64-bit architecture. Nothing is copied from the string literals: The pointers simply point to the location in the program where the literals themselves are stored. That data must be treated as read-only: attempting to overwrite a string literal is undefined behavior.
It is confusing that these completely different data structures can be initialized with the same curly-braced initializer list; it is confusing that string literals can serve as a data source for copying characters over into an array, or that their address can be taken and assigned to a pointer: Both char arr[3] = "12"; and char *ptr = "12"; are valid, but for arr a copy is made and for ptr the address of the string literal itself is taken.
Your program shows that C compilers have traditionally let you pass, with at most a warning, the address of an array of 2 chars to a function that expects the address of a pointer; but that is wrong and leads to disaster if the function tries to dereference an "address" which is, in fact, a sequence of characters. C++ forbids this nonsensical conversion.
The following program and image may shed light on the data layout of the two different arrays.
#include <stdio.h>
void f_array_of_arrays(char(* const arr_of_arrays)[2])
{
for (int i = 0; arr_of_arrays[i][0] != 0; i++)
{
printf("string no. %d is ->%s<-\n", i, arr_of_arrays[i]);
}
const char myOtherArr[][2] = { "1", "2", "3", "4", "" };
// arr_of_arrays = myOtherArr; // <-- illegal: arr_of_arrays itself is const!
arr_of_arrays[0][0] = 'x'; // <-- OK: The chars themselves are not const.
}
void f_array_of_pointers(const char** arr_of_ptrs)
{
for (int i = 0; arr_of_ptrs[i][0] != 0; i++)
{
printf("string no. %d is ->%s<-\n", i, arr_of_ptrs[i]);
}
arr_of_ptrs[1] = "87687686";
for (int i = 0; arr_of_ptrs[i][0] != 0; i++)
{
printf("after altering: string no. %d is ->%s<-\n", i, arr_of_ptrs[i]);
}
}
int main()
{
char my_arr_of_arrays[][2] = { "a", "b", "c", "\0" }; // last element has two zero bytes.
const char* my_arr_of_ptrs[] = { "111", "22", "33333", "" }; // "jagged array"
f_array_of_arrays(my_arr_of_arrays);
f_array_of_pointers(my_arr_of_ptrs);
// disaster: function thinks elements are arrays of char but
// they are addresses; does not compile as C++
// f_array_of_arrays(my_arr_of_ptrs);
// disaster: function thinks elements contain addresses and are 4 bytes,
// but they are arbitrary characters and 2 bytes long; does not compile as C++
// f_array_of_pointers(my_arr_of_arrays);
}
Why is the following an acceptable way to initialize an array of strings:
char * strings[] = { "John", "Paul", NULL};
But this way will fail:
char ** strings = { "John", "Paul", NULL};
My thought was that it would work relatively the same as doing:
char string[] = "John";
char * string = "Paul";
Where both work. What's the difference between the two?
char * strings[] is an array of pointers. When you initialize it as
char * strings[] = { "John", "Paul", NULL};
the strings "John" and "Paul" are string literals. They are constants that exist somewhere in the code or in read-only memory. What happens is that a pointer to the string literal "John" is copied into strings[0], and so on, i.e.
strings[0] --> holds a pointer to "John".
strings[1] --> holds a pointer to "Paul"
Note that the string literals should not be modified by your program. If you do, it is undefined behaviour.
In the case of char ** strings, this is a pointer to a pointer. It is a single memory location and cannot hold many pointers on its own. So you cannot initialize it as below.
char ** strings = { "John", "Paul", NULL}; // error
However, a pointer to pointer can be used along with dynamic memory allocation (malloc,calloc etc) to point to an array of strings.
char string[] = "John";
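In this case, you have a char array into which the string literal is copied. The compiler arranges this copy: for an array with static storage duration it typically happens in the start-up code before main starts, and for a local array it happens each time the declaration is reached.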
char * string = "Paul";
Here you have a char pointer which points to a string literal.
The difference between the above two statements is that in the case of the char array you can modify the elements of string, but in the second case you cannot modify the characters the pointer points to.
why does
char *names [] = {"hello", "Jordan"};
work fine
but this does not
char names [] = {"hello", "Jordan"};
would appreciate if someone could explain this to me, thank you :).
Here
char *names [] = {"hello", "Jordan"};
names is an array of char pointers, i.e. each element holds a pointer to the first character of a string. But here
char names [] = {"hello", "Jordan"};
names is just a char array, i.e. it can hold only a single string like "hello", not several.
In the second case, like
int main(void) {
char names[] = {"hello", "Jordan"};
return 0;
}
when you compile (I suggest compiling with the -Wall -pedantic -Wstrict-prototypes -Werror flags), the compiler clearly says
error: excess elements in char array initializer
which means you can't have more than one string initializer in this case. A correct version is
char names[] = {'h','e','l','l','o','\0'}; /* here names is array of characters */
Edit: there is one more possibility, if the initializer of names looks like this
char names[] = { "hello" "Jordan" }; /* its a valid one */
Here the two adjacent literals "hello" and "Jordan" are joined and become a single char array, helloJordan, exactly as if you had written
char names[] = { "helloJordan" };
The first is an array of pointers to char. The second is an array of char and would have to look like char names[] = {'a', 'b', 'c'}
A string literal, such as "hello", is stored in static memory as an array of chars. In fact, a string literal has type char [N], where N is the number of characters in the array (including the \0 terminator). In most cases, an array identifier decays to a pointer to the first element of the array, so in most expressions a string literal such as "hello" will decay to a pointer to the char element 'h'.
char *names[] = { "hello", "Jordan" };
Here the two string literals decay to pointers to char which point to 'h' and 'J', respectively. That is, here the string literals have type char * after the conversion. These types agree with the declaration on the left, and the array names[] (which is not an array of character type, but an array of char *) is initialized using these two pointer values.
char names[] = "hello";
or similarly:
char names[] = { "hello" };
Here we encounter a special case. Array identifiers are not converted to pointers to their first elements when they are operands of the sizeof operator or the unary & operator, or when they are string literals used to initialize an array of character type. So in this case, the string literal "hello" does not decay to a pointer; instead the characters contained in the string literal are used to initialize the array names[].
char names[] = {"hello", "Jordan"};
Again, the string literals would be used to initialize the array names[], but there are excess initializers in the initializer list. This is a constraint violation according to the Standard. From §6.7.9 ¶2 of the C11 Draft Standard:
No initializer shall attempt to provide a value for an object not contained within the entity being initialized.
A conforming implementation must issue a diagnostic in the event of a constraint violation, which may take the form of a warning or an error. On the version of gcc that I am using at the moment (gcc 6.3.0) this diagnostic is an error:
error: excess elements in char array initializer
Yet, for arrays of char that are initialized by an initializer list of char values rather than by string literals, the same diagnostic is a warning instead of an error.
In order to initialize an array of char that is not an array of pointers, you would need a 2d array of chars here. Note that the second dimension is required, and must be large enough to contain the largest string in the initializer list:
char names[][100] = { "hello", "Jordan" };
Here, each string literal is used to initialize an array of 100 chars contained within the larger 2d array of chars. Or, put another way, names is an array of arrays of 100 chars, each of which is initialized by a string literal from the initializer list.
char name[] is an array of characters so you can store a word in it:
char name[] = "Muzol";
This is the same as:
char name[] = {'M', 'u', 'z', 'o', 'l', '\0'}; /* '\0' is the null character; it marks the end of the string */
And char* names[] is an array of pointers, where each element points to the first character of a separate char array.
char* names[] = {"name1", "name2"};
It's roughly the same as the following (except that string literals must not be modified, while name1 and name2 here are writable):
char name1[] = {'n', 'a', 'm', 'e', '1', '\0'}; /* or char name1[] = "name1"; */
char name2[] = {'n', 'a', 'm', 'e', '2', '\0'}; /* or char name2[] = "name2"; */
char* names[] = { name1, name2 };
So basically names[0] points to &name1[0], and code reading the string can walk through memory until name1[5], where it finds the '\0' (null) character and stops. The same happens for name2.
I've recently started to try learn the C programming language. In my first program (simple hello world thing) I came across the different ways to declare a string after I realised I couldn't just do variable_name = "string data":
char *variable_name = "data"
char variable_name[] = "data"
char variable_name[5] = "data"
What I don't understand is the difference between them. I know they are different and one of them specifically allocates an amount of memory to store the data in but that's about it, and I feel like I need to understand this inside out before moving onto more complex concepts in C.
Also, why does using *variable_name let me reassign the variable name to a new string but variable_name[number] or variable_name[] does not? Surely if I assign, say, 10 bytes to it (char variable_name[10] = "data") and try reassigning it to something that is 10 bytes or smaller it should work, so why doesn't it?
What are the empty brackets and the asterisk doing?
In this declaration
char *variable_name = "data";
a pointer is declared. This pointer points to the first character of the string literal "data". The compiler places the string literal in some region of memory and initializes the pointer with the address of the first character of the literal.
You may reassign the pointer. For example
char *variable_name = "data";
char c = 'A';
variable_name = &c;
However you may not change the string literal itself. An attempt to change a string literal results in undefined behaviour of the program.
In these declarations
char variable_name[] = "data";
char variable_name[5] = "data";
two arrays are declared, and their elements are initialized with the characters of the string literals used as initializers. For example this declaration
char variable_name[] = "data";
is equivalent to the following
char variable_name[] = { 'd', 'a', 't', 'a', '\0' };
The array will have 5 elements. So this declaration is fully equivalent to the declaration
char variable_name[5] = "data";
There is a difference if you would specify some other size of the array. For example
char variable_name[7] = "data";
In this case the array would be initialized the following way
char variable_name[7] = { 'd', 'a', 't', 'a', '\0', '\0', '\0' };
That is all elements of the array that do not have explicit initializers are zero-initialized.
Note that in C you may also initialize a character array from a string literal in the following way
char variable_name[4] = "data";
that is, the terminating zero of the string literal is not placed in the array, so the array does not contain a string.
In C++ such a declaration is invalid.
Of course you may change elements of the array (if it is not defined as a constant array) if you want.
Take into account that you may enclose a string literal used as an initializer in braces. For example
char variable_name[5] = { "data" };
In C99 you may also use so-called designated initializers. For example
char variable_name[] = { [4] = 'A', [5] = '\0' };
Here is a demonstrative program
#include <stdio.h>
#include <string.h>
int main(void)
{
char variable_name[] = { [4] = 'A', [5] = '\0' };
printf( "%zu\n", sizeof( variable_name ) );
printf( "%zu\n", strlen( variable_name ) );
return 0;
}
The program output is
6
0
When you apply the standard C function strlen, declared in the header <string.h>, it returns 0 because the elements of the array that precede the element with index 4 are zero-initialized.