const char **name VS char *name[] - c

I know this topic has already been discussed several times, and I think I basically know the difference between arrays and pointers, but I am interested in exactly how arrays are stored in memory.
For example:
const char **name = {{'a',0},{'b',0},{'c',0},0};
printf("Char: %c\n", name[0][0]); // This does not work
But if it's declared like this:
const char *name[] = {"a","b","c"};
printf("Char: %c\n", name[0][0]); // Works well
everything works out fine.

When you define a variable like
char const* str = "abc";
char const** name = &str;
it looks something like this:
+---+     +---+    +---+---+---+---+
| *-+---->| *-+--->| a | b | c | 0 |
+---+     +---+    +---+---+---+---+
When you define a variable using the form
char const* name[] = { "a", "b", "c" };
you have an array of pointers. This looks something like this:
+---+     +---+---+
| *-+---->| a | 0 |
+---+     +---+---+
| *-+---->| b | 0 |
+---+     +---+---+
| *-+---->| c | 0 |
+---+     +---+---+
What may be confusing is that when you pass this array somewhere, it decays into a pointer and you get this:
+---+     +---+     +---+---+
| *-+---->| *-+---->| a | 0 |
+---+     +---+     +---+---+
          | *-+---->| b | 0 |
          +---+     +---+---+
          | *-+---->| c | 0 |
          +---+     +---+---+
That is, you get a pointer to the first element of the array. Incrementing this pointer moves on to the next element of the array.

A string literal converts implicitly to char const*.
The curly braces initializer doesn't.
Not relevant to your example, but worth knowing: up to and including C++03, a string literal could also implicitly convert to char* (no const), for compatibility with old C, but happily in C++11 this unsafe conversion was finally removed.

The reason the first snippet does not work is that the compiler re-interprets the sequence of characters as the value of a pointer, and then ignores the rest of the initializers. In order for the snippet to work, you need to tell the compiler that you are declaring an array, and that the elements of that array are arrays themselves, like this:
const char *name[] = {(char[]){'a',0},(char[]){'b',0},(char[]){'c',0},0};
With this modification in place, your program works and produces the desired output.

Your first example declares a pointer to a pointer to char. The second declares an array of pointers to char. The difference is that there's one more layer of indirection in the first one. It's a bit hard to describe without a drawing.
In a fake assembly style,
char **name = {{'a',0},{'b',0},{'c',0},0};
would translate to something like:
t1: .byte 'a', 0
.align somewhere; possibly somewhere convenient
t2: .byte 'b', 0
.align
t3: .byte 'c', 0
.align
t4: .dword t1, t2, t3, 0
name: .dword t4
while the second one,
char *name[] = {"a","b","c"};
might generate the same code for t1, t2, and t3, but then would do
name: .dword t1, t2, t3
Does that make sense?

Arrays are stored in memory as a contiguous sequence of objects, where the type of that object is the base type of the array. So, in the case of your array:
const char *name[] = {"a","b","c"};
The base type of the array is const char * and the size of the array is 3 (because your initialiser has three elements). It would look like this in memory:
| const char * | const char * | const char * |
Note that the elements of the array are pointers - the actual strings aren't stored in the array. Each one of those strings is a string literal, which is an array of char. In this case, they're all arrays of two chars, so somewhere else in memory you have three unnamed arrays:
| 'a' | 0 |
| 'b' | 0 |
| 'c' | 0 |
The initialiser sets the three elements of your name array to point to the initial elements of these three unnamed arrays. name[0] points to the 'a', name[1] points to the 'b' and name[2] points to the 'c'.

You have to look at what happens when you declare a variable, and where the memory to store the data for the variable goes.
First, what does it mean to simply write:
char x = 42;
you get enough bytes to hold a char on the stack, and those bytes are set to the value 42.
Secondly, what happens when you declare an array:
char x[] = "hello";
you get 6 bytes on the stack, and they are set to the characters h, e, l, l, o, and the value zero.
Now what happens if you declare a character pointer:
const char* x = "hello";
The bytes for "hello" are stored somewhere in static memory, and you get enough bytes to hold a pointer on the stack, and its value is set to the address of the first byte of that static memory that holds the value of the string.
So now what happens when you declare it as in your second example? You get three separate strings stored in static memory, "a", "b", and "c". Then on the stack you get an array of three pointers, each set to the memory location of those three strings.
So what is your first example trying to do? It looks like you want a pointer to an array of pointers, but the question is where will this array of pointers go? This is like my pointer example above, where something should be allocated in static memory. However, it just happens that you cannot declare a two dimensional array in static memory using brace initialisation like that. So you could do what you want by declaring the array as a variable outside of the function:
const char* name_pointers[] = {"a", "b", "c"};
then inside the function:
const char** name = name_pointers;

Related

Comparison of Pointer to String , Array of Characters and Pointer to Array of Characters [duplicate]

Hello to all programmers, there is something I can't understand:
1. char a[]="hello";
2. char* b="salam";
The first question is: why can't we modify 2, for example with b[0]='m'? I know that 2 gets stored as a compile-time constant, but I can't understand what that means. What is the nature of 2?
And the second question:
3.
char a[]="hello";
char* c=a;
c[0]='r';
Now we can modify and then print c, but we couldn't modify 2! Why?
I can't understand the concept behind these pointers; please explain it to me.
char a[] = "hello"; is a null-terminated array of characters. The array is initialized with the characters you specify and its size is deduced by the compiler; in this case it has space for 6 characters. These are mutable: the characters are copied into the array, so you can change them at will, e.g. a[0] = 'x' changes hello to xello.
char* c = a; just makes the pointer c point to the first element of a; the same operations can be performed through c, as you are really operating on a.
char* b = "salam"; is a different animal. b is a pointer to a string literal. String literals are not meant to be modified; they are not copied into an array like a, and they are usually stored in some read-only section of memory. Either way, the behavior of modifying the literal through b is undefined, i.e. b[0] = 'x' is not allowed by the language rules.
char a[]="hello";
This creates an array like this:
   +---+---+---+---+---+----+
a: | h | e | l | l | o | \0 |
   +---+---+---+---+---+----+
The array is modifiable and you can write other characters to it later if you like (although you cannot write more than 5 or 6 of them).
char* b="salam";
This uses a string literal to create a constant string somewhere, that variable b is then a pointer to. I like to draw it like this:
   +-------+
b: |   *   |
   +---|---+
       |
       V
     +---+---+---+---+---+----+
     | s | a | l | a | m | \0 |
     +---+---+---+---+---+----+
There are two differences here: (1) b is a pointer, not an array as a was. (2) the string here (that b points to) is probably in nonwritable memory. But a was definitely in writable memory.
char* c=a;
Now c is a pointer, pointing at the earlier-declared array a. The picture looks like this:
   +---+---+---+---+---+----+
a: | h | e | l | l | o | \0 |
   +---+---+---+---+---+----+
     ^
     |
      \
       |
   +---|---+
c: |   *   |
   +-------+
And the array a was modifiable, so there's no problem doing c[0] = 'r', and we end up sounding like Scooby-Doo and saying:
   +---+---+---+---+---+----+
a: | r | e | l | l | o | \0 |
   +---+---+---+---+---+----+
     ^
     |
      \
       |
   +---|---+
c: |   *   |
   +-------+
The key difference (which can be quite subtle) is that a string literal in source code like "hello" can be used in two very different ways. When you say
char a[] = "hello";
the string literal is used as the initial value of the array a. But the array a is an ordinary, modifiable array, and there's no problem writing to it later.
Most other uses of string literals, however, work differently. When you say
char *b = "salam";
or
printf("goodbye\n");
those string literals are used to create and initialize "anonymous" string arrays somewhere, which are referred to thereafter via pointers. The arrays are "anonymous" in that they don't have names (identifiers) to refer to them, and they're also usually placed in read-only memory, so you're not supposed to try to write to them.
Let's start off with your first question:
We have 2 strings, a and b
char a[] = "hello";
char *b = "salam";
The first string can be modified because a is an array that holds its own copy of the characters. It lives in writable memory (on the stack if a is a local variable, or in the data segment if it has static storage duration), so we can modify it.
The second, b, is a pointer to a string literal. We cannot modify string literals, since C specifies that doing so is undefined behavior.
b will just point to wherever in the program that string is stored. The pointer should preferably be declared as pointing to const, since the string can't be modified anyway.
const char *b = "salam";
Now let's look at the second question:
The code you provided for the second question is perfectly valid:
char a[] = "hello";
char *c = a;
c[0] = 'r';
We have a, which stores the actual string; it consists of 6 bytes: 'h', 'e', 'l', 'l', 'o', '\0'.
c points to a. We can verify this with this code:
#include <stdio.h>

int main(void) {
    char a[] = "hello";
    char *c = a;
    c[0] = 'r';
    /* %p expects a void *, so cast both pointers */
    printf("a: %p\nc: %p\n", (void *)a, (void *)c);
    return 0;
}
And we'll get output such as
a: 0x7ffe3c94ecf2
c: 0x7ffe3c94ecf2
They both point to the same address, the start of the array. When we write
c[0] // this is equivalent to *(c + 0): take the address c points to, add 0, then dereference. Subscripting always works this way: a[1] is *(a + 1), etc.
So c in this case points to
0x7ffe3c94ecf2
c + 0 =
0x7ffe3c94ecf2
We access that address and modify the character.

What is a free pointer in C?

I learnt that there are two ways of declaring an array in C:
int array[] = {1,2,3};
and:
int* arr = malloc(3*sizeof(int));
Why is arr called a free pointer? And why can't I change the address contained in array, while I can do it with arr?
As said in comments, you learned something incorrect, from a bad source.
In the second case, arr is not an array, it's a pointer. A pointer that (if the allocation succeeds) happens to contain the address of a block of memory that can hold three ints, but that's not an array.
This confusion probably comes from the fact that arrays "decay" to pointers in some contexts, but that does not make them equivalent.
Let's look at how the two objects are laid out in memory:
       +---+
array: | 1 | array[0]
       +---+
       | 2 | array[1]
       +---+
       | 3 | array[2]
       +---+

     +---+            +---+
arr: |   | ---------> | ? | arr[0]
     +---+            +---+
                      | ? | arr[1]
                      +---+
                      | ? | arr[2]
                      +---+
So, one immediate difference - there is no array object that is separate from the array elements themselves, whereas arr is a separate object from the array elements. Only array is an actual array as far as C is concerned - arr is just a pointer to a single object, which may be the first element of a sequence of objects or not.
This is why you can assign a new address value to arr, but not to array - in the case of array, there's nothing to assign the new address value to. It's like trying to change the address of a scalar variable - you can't do it, because the operation doesn't make any sense.
It also means that the address of array[0] is the same as the address of array. The expressions &array[0], array, and &array will all yield the same address value, although the types of the expressions will be different (int *, int *, and int (*)[3], respectively). By contrast, the address of arr is not the same as the address of arr[0]; the expressions arr and &arr[0] will yield the same value, but &arr will not, and its type will be int ** instead of int (*)[3].

Difference between char **p,char *p[],char p[][]

char *p = "some string"
creates a pointer p pointing to a block containing the string.
char p[] = "some string"
creates a character array with the literal's characters in it.
And the first one is a constant declaration. Is it the same for two-dimensional arrays?
What is the difference between
char **p, char *p[], char p[][]?
I read a bit about this: that char **p creates an array of pointers, so it has an overhead compared to char p[][] for storing the pointer values.
The first two declarations create constant arrays. I did not get any run-time error when I tried to modify the contents of argv in main(int argc, char **argv). Is it because they are declared in a function prototype?
Normal Declarations (Not Function Parameters)
char **p; declares a pointer to a pointer to char. It reserves space for the pointer. It does not reserve any space for the pointed-to pointers or any char.
char *p[N]; declares an array of N pointers to char. It reserves space for N pointers. It does not reserve any space for any char. N must be provided explicitly or, in a definition with initializers, implicitly by letting the compiler count the initializers.
char p[M][N]; declares an array of M arrays of N char. It reserves space for M•N char. There are no pointers involved. N must be provided explicitly. M must be provided explicitly or, in a definition with initializers, implicitly by letting the compiler count the initializers.
Declarations in Function Parameters
char **p declares a pointer to a pointer to char. When the function is called, space is provided for that pointer (typically on a stack or in a processor register). No space is reserved for the pointed-to-pointers or any char.
char *p[N] is adjusted to be char **p, so it is the same as above. The value of N is ignored, and N may be absent. (Some compilers may evaluate N, so, if it is an expression with side effects, such as printf("Hello, world.\n"), these effects may occur when the function is called. The C standard is unclear on this.)
char p[M][N] is adjusted to be char (*p)[N], so it is a pointer to an array of N char. The value of M is ignored, and M may be absent. N must be provided. When the function is called, space is provided for the pointer (typically on a stack or in a processor register). No space is reserved for the array of N char.
argv
argv is created by the special software that calls main. It is filled with data that the software obtains from the “environment”. You are allowed to modify the char data in it.
In your definition char *p = "some string";, you are not permitted to modify the data that p points to because the C standard says that characters in a string literal may not be modified. (Technically, what it says is that it does not define the behavior if you try.) In this definition, p is not an array; it is a pointer to the first char in an array, and those char are inside a string literal, and you are not permitted to modify the contents of a string literal.
In your definition char p[] = "some string";, you may modify the contents of p. They are not a string literal. In this case, the string literal effectively does not exist at run-time; it is only something used to specify how the array p is initialized. Once p is initialized, you may modify it.
The data set up for argv is set up in a way that allows you to modify it (because the C standard specifies this).
Here is some further description of the differences, looking at them from a memory-addressing point of view.
I. char **p; p is a double pointer (a pointer to a pointer to char)
Declaration:
char a = 'g';
char *b = &a;
char **p = &b;
   p                    b                    a
+------+             +------+             +------+
|      |             |      |             |      |
|0x2000|------------>|0x1000|------------>|  g   |
|      |             |      |             |      |
+------+             +------+             +------+
 0x3000               0x2000               0x1000
Figure 1: Typical memory layout assumption
In the above declaration, a is of type char and contains the character g. Pointer b contains the address of the existing character variable a, so b holds the address 0x1000 and *b is the character g. Finally, the address of b is assigned to p. Therefore a is a character variable, b is a pointer, and p is a pointer to a pointer: a contains a value, b contains an address, and p contains the address of an address, as shown in the diagram above.
Here, sizeof(p) = sizeof(char **), which on typical systems equals sizeof(char *);
II. char *p[M]; p is array of strings
Declaration:
char *p[] = {"Monday", "Tuesday", "Wednesday"};
     p
  +------+
  | p[0] |       +----------+
0 | 0x100|------>| Monday\0 |
  |      |       +----------+
  |------|        0x100
  | p[1] |       +-----------+
1 | 0x200|------>| Tuesday\0 |
  |      |       +-----------+
  |------|        0x200
  | p[2] |       +-------------+
2 | 0x300|------>| Wednesday\0 |
  |      |       +-------------+
  +------+        0x300
Figure 2: Typical memory layout assumption
In this declaration, p is an array of 3 pointers to char, which means p can refer to 3 strings. Each string (Monday, Tuesday and Wednesday) is located somewhere in memory (0x100, 0x200 and 0x300), and their addresses are stored in the array p as p[0], p[1] and p[2] respectively. Hence it is an array of pointers.
Notes: char *p[3];
1. p[0], p[1] & p[2] are addresses of strings of type `char *`.
2. p, p+1 & p+2 are addresses of addresses, with type `char **`.
3. Elements are accessed as follows: p[i][j] is char; p[i] is char *; & p (once it decays) is char **
Here sizeof(p) = number of elements in the array * sizeof(char *)
III. char p[M][N]; p is array of fixed length strings with dimensions as M x N
Declaration:
char p[][10] = {"Monday", "Tuesday", "Wednesday"};
p   0x1  2  3  4  5  6  7  8  9 10
   +-------------------------------+
0  |  M  o  n  d  a  y \0 \0 \0 \0 |
1  |  T  u  e  s  d  a  y \0 \0 \0 |
2  |  W  e  d  n  e  s  d  a  y \0 |
   +-------------------------------+
Figure 3: Typical memory layout assumption
In this case array p contains 3 strings, each stored in 10 characters. From the memory layout we can say p is a two-dimensional array of characters with size MxN, which is 3x10 in our example. This is useful for representing strings of equal length; memory is wasted when strings contain fewer than 10 characters. By comparison, the declaration char *p[] wastes no memory on padding, because the string length is not fixed, so it is useful for representing strings of unequal length.
Accessing elements is similar as above case, p[M] is M'th string & p[M][N] is N'th character of M'th string.
Here sizeof(p) = (M rows * N columns) * sizeof(char) of two dimensional array;
a in char* a is a pointer to char (typically to the first element of an array of chars); a itself can be reassigned.
b in char b[] is an array of chars; b itself cannot be reassigned.
They are sort of compatible: b automatically decays to a pointer like a in assignments and expressions, but not the other way around.
When you use char** p, char* p[] and char p[][], the situation is very similar, just with more levels of indirection.

Storage Concerning Arrays and Pointers

I'm having trouble understanding the following code:
const char *suit[4] = {"Hearts", "Diamonds", "Clubs", "Spades"};
I don't understand what is stored in the array suit, are they pointers? And if so, where are the strings stored?
Also, is the pointer constant, or the array constant?
I would appreciate a full detailed explanation of this code, and what is going on in memory!
Thanks in advance.
We learn a lot by using cdecl.org. This is what it tells us about suit:
declare suit as array 4 of pointer to const char
So:
the array contains 4 pointers.
each pointer points at a const char (in this case, the first character of each string).
the pointers are not const, and neither is the array.
The strings are literals; where they are stored is implementation-specific.
In ASCII art:
               "Clubs"
                ^
                |   "Spades"
                |   ^
                |   |
      +---+---+---+---+
suit  |   |   |   |   |
      +---+---+---+---+
        |   |
        |   v
        |   "Diamonds"
        v
        "Hearts"
Note that suit itself is not a pointer; it's the name of the array.
const char * is commonly used as a string type, since strings are just arrays of characters. This means you have an array of const char * (strings). The string literals themselves must not be modified and are typically stored in a read-only section of the binary (such as .rodata) when compiled. Hence the data pointed to by each pointer is constant.

How is memory allocated for an implicitly defined multidimensional array in C99?

I'm trying to write a C99 program and I have an array of strings implicitly defined as such:
char *stuff[] = {"hello","pie","deadbeef"};
Since the array dimensions are not defined, how much memory is allocated for each string? Are all strings allocated the same amount of elements as the largest string in the definition? For example, would this following code be equivalent to the implicit definition above:
char stuff[3][9];
strcpy(stuff[0], "hello");
strcpy(stuff[1], "pie");
strcpy(stuff[2], "deadbeef");
Or is each string allocated just the amount of memory it needs at the time of definition (i.e. stuff[0] holds an array of 6 elements, stuff[1] holds an array of 4 elements, and stuff[2] holds an array of 9 elements)?
Pictures can help — ASCII Art is fun (but laborious).
char *stuff[] = {"hello","pie","deadbeef"};
+----------+          +---------+
| stuff[0] |--------->| hello\0 |
+----------+          +---------+      +-------+
| stuff[1] |-------------------------->| pie\0 |
+----------+          +------------+   +-------+
| stuff[2] |--------->| deadbeef\0 |
+----------+          +------------+
The memory allocated for the 1D array of pointers is contiguous, but there is no guarantee that the pointers held in the array point to contiguous sections of memory (which is why the pointer lines are different lengths).
char stuff[3][9];
strcpy(stuff[0], "hello");
strcpy(stuff[1], "pie");
strcpy(stuff[2], "deadbeef");
+---+---+---+---+---+---+---+---+---+
| h | e | l | l | o | \0| x | x | x |
+---+---+---+---+---+---+---+---+---+
| p | i | e | \0| x | x | x | x | x |
+---+---+---+---+---+---+---+---+---+
| d | e | a | d | b | e | e | f | \0|
+---+---+---+---+---+---+---+---+---+
The memory allocated for the 2D array is contiguous. The x's denote uninitialized bytes. Note that stuff[0], when it decays, is a pointer to the 'h' of 'hello', stuff[1] is a pointer to the 'p' of 'pie', and stuff[2] is a pointer to the first 'd' of 'deadbeef' (and stuff[3] is a non-dereferenceable pointer to the byte beyond the null byte after 'deadbeef').
The pictures are quite, quite different.
Note that you could have written either of these:
char stuff[3][9] = { "hello", "pie", "deadbeef" };
char stuff[][9] = { "hello", "pie", "deadbeef" };
and you would have the same memory layout as shown in the 2D array diagram (except that the x's would be zeroed).
char *stuff[] = {"hello","pie","deadbeef"};
Is not a multidimensional array! It is simply an array of pointers.
how much memory is allocated for each string?
The number of characters plus a null terminator. Same as any string literal.
I think you want this:
char foo[][10] = {"hello","pie","deadbeef"};
Here, 10 is the amount of space per string and all the strings are in contiguous memory. Thus, there will be padding for strings less than size 10.
In the first example, it is a jagged array, I suppose.
It declares an array of pointers to char, so each string literal can be as long as you like. The length of each string is independent of the array.
In the second one, the number of characters per row (string), including the null terminator, must be at most 9, as specified by your column size.
Are all strings allocated the same amount of elements as the largest
string in the definition?
No, only 3 pointers are allocated and they point to 3 string literals.
char *stuff[] = {"hello","pie","deadbeef"};
and
char stuff[3][9];
are not at all equivalent. First is an array of 3 pointers whereas the second is a 2D array.
For the first, only the pointers are allocated, and the string literals they point to may be stored in a read-only section. The second, when declared inside a function, is allocated in automatic storage (usually the stack).

Resources