Literal string initializer for a character array


The following rule describes when an array decays to a pointer:
An lvalue [see question 2.5] of type array-of-T which appears in an expression decays (with three exceptions) into a pointer to its first element; the type of the resultant pointer is pointer-to-T.
(The exceptions are when the array is the operand of a sizeof or & operator, or is a literal string initializer for a character array.)
How should I understand the case where the array is a "literal string initializer for a character array"? Could you give some examples?
Thanks!

The three exceptions where an array does not decay into a pointer are the following:
Exception 1. When the array is the operand of sizeof.
#include <stdio.h>
int main(void)
{
    int a[10];
    printf("%zu\n", sizeof(a)); /* prints 10 * sizeof(int) */
    int *p = a;
    printf("%zu\n", sizeof(p)); /* prints sizeof(int *) */
    return 0;
}
Exception 2. When the array is the operand of the & operator.
#include <stdio.h>
int main(void)
{
    int a[10];
    printf("%p\n", (void *)(&a)); /* prints the array's address */
    int *p = a;
    printf("%p\n", (void *)(&p)); /* prints the pointer's address */
    return 0;
}
Exception 3. When the array is a string literal used to initialize a character array.
int main(void)
{
    char a[] = "Hello world"; /* no decay: the literal's characters are copied into the local array a, which exists until it goes out of scope */
    char *p = "Hello world"; /* decay: p points to the first character of the literal itself, which has static storage; any attempt to modify it is undefined behavior */
    return 0;
}

Assume the declarations
char foo[] = "This is a test";
char *bar = "This is a test";
In both cases, the type of the string literal "This is a test" is "15-element array of char". Under most circumstances, array expressions are implicitly converted from type "N-element array of T" to "pointer to T", and the expression evaluates to the address of the first element of the array. In the declaration for bar, that's exactly what happens.
In the declaration for foo, however, the expression is being used to initialize the contents of another array, and is therefore not converted to a pointer type; instead, the contents of the string literal are copied to foo.

This is a literal string initializer for a character array:
char arr[] = "literal string initializer";
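By contrast, here the literal initializes a pointer rather than an array, so it does decay to a pointer to its first character: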
char* str = "literal string initializer";
Definition from K&R2:
A string literal, also called a string
constant, is a sequence of characters
surrounded by double quotes as in
"...". A string has type ``array of
characters'' and storage class static
(see Par.A.3 below) and is initialized
with the given characters. Whether
identical string literals are distinct
is implementation-defined, and the
behavior of a program that attempts to
alter a string literal is undefined.

It seems like you pulled that quote from the comp.lang.c FAQ (maybe an old version or maybe the printed version; it doesn't quite match with the current state of the online one):
http://c-faq.com/aryptr/aryptrequiv.html
The corresponding section links to other sections of the FAQ to elaborate on those exceptions. In your case, you should look at:
http://c-faq.com/decl/strlitinit.html

Related

Array of pointers to char in C

I am confused with how an array of pointer to char works in C.
Here is a sample of the code which I am using to understand the array of pointers to char.
#include <stdio.h>
int main(void) {
    char *d[] = {"hi", "bye"};
    int a;
    a = (d[0] == "hi") ? 1 : 0;
    printf("%d\n", a);
    return 0;
}
I am getting a = 1, so d[0] == "hi". What confuses me is this: since d is an array of char pointers, shouldn't d[0] be equal to the address of the h in the "hi" string?

C 2018 6.4.5 specifies the behavior of string literals. Paragraph 6 specifies that a string literal in source code causes the creation of an array of characters. When an array is used in an expression other than as the operand of sizeof, the operand of unary &, or as a string literal used to initialize an array, it is automatically converted to a pointer to its first character. So char *d[]={"hi","bye"}; initializes d[0] and d[1] to point to the first characters of "hi" and "bye", and d[0]=="hi" compares d[0] to a pointer to the first character of another "hi".
Paragraph 7 says that the same array may be used for identical string literals:
It is unspecified whether these arrays are distinct provided their elements have the appropriate values…
Thus, when your compiler is taking the address of the first element of "hi" in d[0]=="hi", it may, but is not required to, use the same memory for "hi" as it did when initializing d[0] in char *d[]={"hi","bye"};.
(Note that the paragraph also allows the same memory to be used for string literals identical to the substrings at the ends of other string literals. For example, "phi" and "hi" could share memory.)

Why does the function have to return a char * but not a char array?

char * printstring(void)
{
return "my string";
}
Since what the function does is return a character array why do I have to state that my function returns char* and not char[] at the declaration.

Because of the way C was designed, arrays are not first-class citizens in it. You can neither return them nor pass them to a function by value.
If you want to achieve either of those things, you'll have to wrap the array in a struct.
struct ten_chars{ char chars[10]; };
struct ten_chars printstring(void)
{
return (struct ten_chars){"my string"};
}

The string literal "my string" does have array type. Note that sizeof "my string" will evaluate to 10, as expected for an array that holds 10 chars (including the '\0'). You can think of "my string" as an identifier that identifies an array, and decays to a pointer to the first element of the array in most expressions (but not in, e.g., sizeof expressions).
So, in the return statement, "my string" decays to a pointer to the first element of the array that holds the characters of the string literal (and the null terminator). It is this pointer that is returned from the function, and this is why the return type must be char *.
For the record, it is not even possible to return an array from a function in C, though you can return a pointer to an array. You can also return a struct that contains an array field from a function.
Take a look at this example code:
#include <stdio.h>
char * getstring(void);
int main(void)
{
printf("%s\n", getstring());
return 0;
}
char * getstring(void)
{
printf("sizeof \"my string\": %zu\n", sizeof "my string");
printf("*(\"my string\" + 1): %c\n", *("my string" + 1));
return "my string";
}
Program output:
sizeof "my string": 10
*("my string" + 1): y
my string

Firstly, C doesn't allow you to define a function that returns an array type; something like
char printstring(void)[10] { return "my string"; }
simply isn't allowed, and the compiler will yell at you over it.
Secondly, because what you are returning isn't an array.
Except when it is the operand of the sizeof or unary & operators, or is a string literal used to initialize another array in a declaration, an expression of type "N-element array of T" will be converted ("decay") to an expression of type "pointer to T", and the value of the expression will be the address of the first element of the array.
The expression "my string" has type "10-element array of char". Since it isn't the operand of either the sizeof or unary & operators, and since it isn't being used to initialize an array of char in a declaration, it "decays" to an expression of type char *. Its value is the address of the first character in the string, and that address value is what your function is actually returning.
This is by design - it was Ritchie's way of kind-of-sort-of preserving B's array semantics in C. However, it means that array expressions in C do not retain their array-ness in most circumstances.

In C it is not allowed to assign to an array variable.
char a[] = "test";
char b[5] = a; /* ILLEGAL */
So why would one want to define a function returning an array, if its result could not be assigned to anything?

Disclaimer: this is not intended as an exact answer to the question, since it is about C++, not C. But I thought it could be interesting in the context of this discussion.
In C++ there is a way to return a reference to an array, which could look like the following:
static const char (&func())[12] {
return "hello world";
}
This is similar to returning a pointer and does not copy the values. But it is not possible in plain C.

Alternative way to pass a char pointer to a function

Given the source code of strcpy():
char * strcpy(char *s1, const char *s2)
{
char *s = s1;
while ((*s++ = *s2++) != 0);
return (s1);
}
Why does handing over the second argument work, and how does it look in memory, since I do not pass a pointer to the function?
char dest[100];
strcpy(dest, "HelloWorld");

This works because of array-to-pointer decay.
For dest: an array passed as a function argument decays to the address of its first element. That's a pointer.
So, a call like
strcpy(dest, "HelloWorld");
is the same as
strcpy(&dest[0], "HelloWorld");
As for "HelloWorld": a string literal has array type, here char[11], so it likewise decays to the address of its first element.

In C string literals have types of character arrays. From C Standard (6.4.5 String literals)
6 In translation phase 7, a byte or code of value zero is appended to
each multibyte character sequence that results from a string literal
or literals. The multibyte character sequence is then used to
initialize an array of static storage duration and length just
sufficient to contain the sequence. For character string literals,
the array elements have type char, and are initialized with the
individual bytes of the multibyte character sequence.
Also, arrays are converted to pointers in expressions, with rare exceptions. From the C Standard (6.3.2.1 Lvalues, arrays, and function designators):
3 Except when it is the operand of the sizeof operator or the unary &
operator, or is a string literal used to initialize an array, an
expression that has type ‘‘array of type’’ is converted to an
expression with type ‘‘pointer to type’’ that points to the initial
element of the array object and is not an lvalue. If the array object
has register storage class, the behavior is undefined.
Thus in this call
strcpy(dest, "HelloWorld");
the string literal has type char[11] and is converted to a value of type char * that is equal to the address of the first character of the string literal.
You could also write for example
strcpy(dest, &"HelloWorld"[0]);
Or even :)
strcpy(dest, &0["HelloWorld"]);
or:)
strcpy(dest, &*"HelloWorld");
All three expressions yield the address of the initial element of the string literal and have type char *.
Take into account that it is unspecified (and in practice often controlled by compiler options) whether
"HelloWorld" == "HelloWorld"
evaluates to true, that is, whether the compiler allocates separate storage for identical string literals or stores only one copy of them.
This expression compares the addresses of the first characters of the string literals.
If you write
strcmp( "HelloWorld", "HelloWorld" )
then the result will be 0, meaning the string literals are equal to each other (they contain the same sequence of characters).

strcpy(dest, "HelloWorld");
dest, being an array of char, decays to a pointer to its first element when passed to the function.
"HelloWorld", being a string literal, has type char[11] (and must not be modified). It decays in the same way, so it is a valid argument to pass.

C: Behaviour of arrays when assigned to pointers

#include <stdio.h>
int main(void)
{
    char *ptr;
    ptr = "hello";
    printf("%p %s\n", (void *)"hello", ptr);
    getchar();
    return 0;
}
Hi, I am trying to understand clearly how arrays get assigned to pointers. I notice that when you assign an array of chars to a pointer to char, as in ptr = "hello";, the array decays to a pointer. But in this case I am assigning an array of chars that is not stored in any variable. Does this kind of assignment reserve a memory address specially for "hello" (which is obviously what happens), and is it possible to modify the individual elements of "hello" at the address where the array is stored? As a comparison, can I assign an array of ints to a pointer in the same way, with something as loose as int_ptr = 5,3,4,3;, so that the values 5,3,4,3 are placed at some memory address as "hello" was? And if not, why is this possible only with strings? Thanks in advance.

"hello" is a string literal. It is a nameless non-modifiable object of type char [6]. It is an array, and it behaves the same way any other array does. The fact that it is nameless does not really change anything. You can use it with [] operator for example, as in "hello"[3] and so on. Just like any other array, it can and will decay to pointer in most contexts.
You cannot modify the contents of a string literal because it is non-modifiable by definition. It can be physically stored in read-only memory. It can overlap other string literals, if they contain common sub-sequences of characters.
Similar functionality exists for other array types through compound literal syntax
int *p = (int []) { 1, 2, 3, 4, 5 };
In this case the right-hand side is a nameless object of type int [5], which decays to int * pointer. Compound literals are modifiable though, meaning that you can do p[3] = 8 and thus replace 4 with 8.
You can also use compound literal syntax with char arrays and do
char *p = (char []) { "hello" };
In this case the right-hand side is a modifiable nameless object of type char [6].

The first thing you should do is read section 6 of the comp.lang.c FAQ.
The string literal "hello" is an expression of type char[6] (5 characters for "hello" plus one for the terminating '\0'). It refers to an anonymous array object with static storage duration, initialized at program startup to contain those 6 character values.
In most contexts, an expression of array type is implicitly converted to a pointer to the first element of the array; the exceptions are:
When it's the argument of sizeof (sizeof "hello" yields 6, not the size of a pointer);
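When it's the argument of _Alignof (this case appears in the N1570 draft of C11, but _Alignof accepts only a parenthesized type name, never an array expression, so it cannot actually arise and the published standard dropped it);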
When it's the argument of unary & (&arr yields the address of the entire array, not of its first element; same memory location, different type); and
When it's a string literal in an initializer used to initialize an array object (char s[6] = "hello"; copies the whole array, not just a pointer).
None of these exceptions apply to your code:
char *ptr;
ptr = "hello";
So the expression "hello" is converted to ("decays" to) a pointer to the first element ('h') of that anonymous array object I mentioned above.
So *ptr == 'h', and you can advance ptr through memory to access the other characters: 'e', 'l', 'l', 'o', and '\0'. This is what printf() does when you give it a "%s" format.
That anonymous array object, associated with the string literal, is read-only, but not const. What that means is that any attempt to modify that array, or any of its elements, has undefined behavior (because the standard explicitly says so) -- but the compiler won't necessarily warn you about it. (C++ makes string literals const; doing the same thing in C would have broken existing code that was written before const was added to the language.) So no, you can't modify the elements of "hello" -- or at least you shouldn't try. And to make the compiler warn you if you try, you should declare the pointer as const:
const char *ptr; /* pointer to const char, not const pointer to char */
ptr = "hello";
(gcc has an option, -Wwrite-strings, that causes it to treat string literals as const. This will cause it to warn about some C code that's legal as far as the standard is concerned, but such code should probably be modified to use const.)

#include <stdio.h>
int main(void)
{
    char *ptr;
    ptr = "hello";
    /* instead of the two lines above you can write char *ptr = "hello"; */
    printf("%p %s\n", (void *)"hello", ptr);
    getchar();
    return 0;
}
Here you have assigned the string literal "hello" to ptr. The literal may be stored in read-only memory, so you can't modify it. If you declare char ptr[] = "hello"; instead, then you can modify the array.

Say what?
Your code allocates 6 bytes of memory and initializes it with the values 'h', 'e', 'l', 'l', 'o', and '\0'.
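It then allocates a pointer (number of bytes for the pointer depends on implementation) and sets the pointer's value to the start of the 6 bytes mentioned previously.
You can modify the elements of an array using syntax such as ptr[1] = 'a', but only when ptr points to a modifiable array. Here it points to a string literal, so that assignment would have undefined behavior.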
Syntactically, strings are a special case. Since C doesn't have a specific string type to speak of, it does offer some shortcuts to declaring them and such. But you can easily create the same type of structure as you did for a string using int, even if the syntax must be a bit different.

referencing an index value of a character in a pointer string in c

Suppose I have something like this
int strLen;
printf("Please enter a number: ");
scanf("%d", &strLen);
char *myString;
myString = (char*) malloc(strLen*sizeof(char));
Then you fill the string with something like "Hello World!", but now I want to print out just "World!". Since my string is just a pointer, I can't index into it, i.e.
for(int i=6;i<strLen;i++)
{
printf("%s", myString[i]);
}
// THIS IS AN INCORRECT WAY TO DO THIS
How could I refer to a specific character or even pass the array onto another function of the program if all I have is the array base pointer? Can I ever get the full functionality as if I declared it as a static array before compile time?

How could I refer to a specific character or even pass the array onto another function of the program if all I have is the array base pointer?
Several things to remember:
Except when it is the operand of the sizeof or unary & operators, or is a string literal being used to initialize another array in a declaration, an expression of type "N-element array of T" will be replaced with ("decay to") an expression of type "pointer to T" whose value is the address of the first element of the array;
In the context of a function parameter declaration, T a[N] and T a[] are identical to T *a (IOW, a will be declared as a pointer to T, not an array of T - note that this is only true for function parameter declarations);
The subscript operation a[i] is defined as *(a + i) - start with a base address specified by the pointer expression a, offset by i elements (not bytes), and dereference the result;
In C, you do not need to cast the result of malloc (or calloc or realloc), since it returns a value of type void *, which may be assigned to any other object pointer type. Adding the cast may suppress a useful diagnostic if you forget to include stdlib.h or otherwise don't have a prototype in scope. Note that this is not true for C++ - a cast is required there, but if you're writing C++ you should be using new instead of malloc anyway.
This is a long-winded way of saying that, in many contexts, array expressions and pointer expressions can be treated the same way. Taking printf as an example:
#include <stdio.h>
int main(void)
{
    char foo[] = "This is a test";
    char *bar = foo;
    printf("%s\n", foo);
    printf("%s\n", &foo[0]);
    printf("%s\n", bar);
    return 0;
}
printf expects the argument corresponding to %s to have type char *, or "pointer to char", not "array of char". The three printf calls above are all equivalent. In the first call, foo is an array expression with type "15-element array of char". By the first rule mentioned above, it will be replaced with an expression of type "pointer to char" whose value is the address of the first element. The second and third calls pass the pointer value directly, just using different expressions to accomplish the same effect.
As far as printf is concerned, all three expressions yield the same result -- the address of the first element of a sequence of char values, terminated by 0.
What does this mean for your code? Well, for one thing, you can use the subscript operator on myString as though it were an array type:
printf("%s\n", &myString[6]); // prints "World!"
Note that the subscript operator [] has higher precedence than the unary & operator, so the above is interpreted as &(myString[6]) - we subscript into myString and then take the address of the result.
You can pass myString to any function that you would pass an array of char to:
void foo(char str[]) // identical to char *str
{
// do something with str
}
...
int main(void)
{
char str[] = "Hello, World!";
char *mystr = malloc(strlen(str) + 1); // note no cast
strcpy(mystr, str);
foo(str);
foo(mystr);
...
}
Again, as far as the function foo is concerned, its argument is type char *, not array of char. The expression str decays to a pointer value, and mystr is a pointer value to begin with.

A couple of things:
1) Allow for the null terminator in your "malloc()":
int strLen;
...
char *myString = (char*) malloc(strLen+1);
2) The "sizeof(char)" is kind of duplicate redundant. No harm - but no purpose, either. So I omitted it.
3) This is wrong:
for(int i=6;i<strLen;i++)
{
printf("%s", myString[i]);
}
4) This is better:
for(int i=6;i<strLen;i++)
{
printf("%c", myString[i]);
}

You can take the address of the character at a certain array index.
So, try this if you just want to print out 'world!':
#include <stdio.h>
int main(void)
{
    const char *myString = "hello world!";
    printf("%s\n", &myString[6]); /* prints "world!" */
    return 0;
}