Why might a string literal not be a string? - c

I'm struggling with this part in the C standard about string literals, especially the second part of it:
"In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. 80)"
"80) A string literal might not be a string (see 7.1.1), because a null character can be embedded in it by a \0 escape sequence."
Source: ISO/IEC 9899:2018 (C18), §6.4.5/6, Page 51
I don't understand the explanation - "because a null character can be embedded in it by a \0 escape sequence.".
To look at the referenced section §7.1.1., regarding the definition of a "string", it is stated:
"A string is a contiguous sequence of characters terminated by and including the first null character."
Source: ISO/IEC 9899:2018 (C18), §7.1.1/1, Page 132
I've thought about that the focus maybe lays on the "can", in a way that a string literal does not have to include/embed the null character, while a string is needed to.
But then again I´m asking myself: How is one able to use a string literal as string if it has not a string-terminating null character in it, to determine the end of the string (required for string-operating functions)?
I´m totally drawing blanks at the moment.
Note: I´m aware of that a string literal is stored in read-only memory and can´t be modified and a string is a generic term for a sequence of characters terminated by NUL, which can or can not be mutable.
Thus, my question is not: "What is the difference between a string and a string literal?"
My Question is:
Why/How can a string-literal not be a string?
and, according to my concerns, so far:
Is it true, that a string literal can have the NUL byte omitted?
I wanted to ask this question myself but short before posting it, I got the clue. My confusion was made because of the little misplaced wording inside of the quote.
But I decided to not delete the question´s draft as it could be useful for future readers and provide a Q&A instead.
Feel free to comment and hint.
Related stuff:
What is the difference between char s[] and char *s?
What is the type of string literals in C and C++?
Are string literals const?
"Life-time" of a string literal in C

You're overthinking it.
"A string is a contiguous sequence of characters terminated by and including the first null character."
Source: ISO/IEC 9899:2018 (C18), §7.1.1/1, Page 132
says that a "string" only extends up to the first null character. Characters that may exist after the null are not part of the string. However
"80) A string literal might not be a string (see 7.1.1), because a null character can be embedded in it by a \0 escape sequence."
makes it clear a string literal may contain an embedded null. If it does, the string literal AS A WHOLE is not a string -- the string is just the prefix of the string literal up to the first null

Let´s take a look at the definition of the term "string literal" at the same section in C18, §6.5.1/3:
"A character string literal is a sequence of zero or more multibyte characters enclosed in double-quotes, as in "xyz"."
According to that, a string literal is only consisted of the characters enclosed in quotation marks, the bare string content. It does not have an appended \0. The NUL byte is appended later at translation, as said at §6.5.1/6:
"In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. 80)"
Let´s make an example:
"foo" is a string literal, but not a string because "foo" does not contain an embedded null character.
"foo\0" is a string literal and a string because the literal itself contains a null character at the end of the character sequence.
Note that you don´t need to explicitly insert the null character at the end of a string literal to change it to a string. As already said, it is implicitly appended during the program translation.
Means,
const char *s = "foo";
is equal to
const char *s = "foo\0";
I admit, that the sentence of:
"A string literal might not be a string (see 7.1.1), because a null character can be embedded in it by a \0 escape sequence."
is a little confusing and illogical in the context. It would be better phrased:
"A string literal might not be a string (see 7.1.1), because a null character might not (OR is not required to) be embedded in it by a \0 escape sequence."
or alternatively:
"A string literal might not be a string (see 7.1.1), because a null character can be embedded in it by a \0 escape sequence."
As #EricPostpischil pointed in his comment, the meaning of the footnote is probably quite different.
It means that if the string literal contains a null character inside of it, but not at the end, as it is required for a string, the string literal is not equivalent to a string.
F.e.:
The string literal
"foo\0bar"
is not a string, as it contains the first null character embedded inside of the string literal, but not at the end of it.

Related

Why printf() function print other characters? [duplicate]

I understand that strings in C are just character arrays. So I tried the following code, but it gives strange results, such as garbage output or program crashes:
#include <stdio.h>
int main (void)
{
char str [5] = "hello";
puts(str);
}
Why doesn't this work?
It compiles cleanly with gcc -std=c17 -pedantic-errors -Wall -Wextra.
Note: This post is meant to be used as a canonical FAQ for problems stemming from a failure to allocate room for a NUL terminator when declaring a string.
A C string is a character array that ends with a null terminator.
All characters have a symbol table value. The null terminator is the symbol value 0 (zero). It is used to mark the end of a string. This is necessary since the size of the string isn't stored anywhere.
Therefore, every time you allocate room for a string, you must include sufficient space for the null terminator character. Your example does not do this, it only allocates room for the 5 characters of "hello". Correct code should be:
char str[6] = "hello";
Or equivalently, you can write self-documenting code for 5 characters plus 1 null terminator:
char str[5+1] = "hello";
But you can also use this and let the compiler do the counting and pick the size:
char str[] = "hello"; // Will allocate 6 bytes automatically
When allocating memory for a string dynamically in run-time, you also need to allocate room for the null terminator:
char input[n] = ... ;
...
char* str = malloc(strlen(input) + 1);
If you don't append a null terminator at the end of a string, then library functions expecting a string won't work properly and you will get "undefined behavior" bugs such as garbage output or program crashes.
The most common way to write a null terminator character in C is by using a so-called "octal escape sequence", looking like this: '\0'. This is 100% equivalent to writing 0, but the \ serves as self-documenting code to state that the zero is explicitly meant to be a null terminator. Code such as if(str[i] == '\0') will check if the specific character is the null terminator.
Please note that the term null terminator has nothing to do with null pointers or the NULL macro! This can be confusing - very similar names but very different meanings. This is why the null terminator is sometimes referred to as NUL with one L, not to be confused with NULL or null pointers. See answers to this SO question for further details.
The "hello" in your code is called a string literal. This is to be regarded as a read-only string. The "" syntax means that the compiler will append a null terminator in the end of the string literal automatically. So if you print out sizeof("hello") you will get 6, not 5, because you get the size of the array including a null terminator.
It compiles cleanly with gcc
Indeed, not even a warning. This is because of a subtle detail/flaw in the C language that allows character arrays to be initialized with a string literal that contains exactly as many characters as there is room in the array and then silently discard the null terminator (C17 6.7.9/15). The language is purposely behaving like this for historical reasons, see Inconsistent gcc diagnostic for string initialization for details. Also note that C++ is different here and does not allow this trick/flaw to be used.
From the C Standard (7.1.1 Definitions of terms)
1 A string is a contiguous sequence of characters terminated by and
including the first null character. The term multibyte string is
sometimes used instead to emphasize special processing given to
multibyte characters contained in the string or to avoid confusion
with a wide string. A pointer to a string is a pointer to its initial
(lowest addressed) character. The length of a string is the number of
bytes preceding the null character and the value of a string is the
sequence of the values of the contained characters, in order.
In this declaration
char str [5] = "hello";
the string literal "hello" has the internal representation like
{ 'h', 'e', 'l', 'l', 'o', '\0' }
so it has 6 characters including the terminating zero. Its elements are used to initialize the character array str which reserve space only for 5 characters.
The C Standard (opposite to the C++ Standard) allows such an initialization of a character array when the terminating zero of a string literal is not used as an initializer.
However as a result the character array str does not contain a string.
If you want that the array would contain a string you could write
char str [6] = "hello";
or just
char str [] = "hello";
In the last case the size of the character array is determined from the number of initializers of the string literal that is equal to 6.
Can all strings be considered an array of characters (Yes), can all character arrays be considered strings (No).
Why Not? and Why does it matter?
In addition to the other answers explaining that the length of a string is not stored anywhere as part of the string and the references to the standard where a string is defined, the flip-side is "How do the C library functions handle strings?"
While a character array can hold the same characters, it is simply an array of characters unless the last character is followed by the nul-terminating character. That nul-terminating character is what allows the array of characters to be considered (handled as) a string.
All functions in C that expect a string as an argument expect the sequence of characters to be nul-terminated. Why?
It has to do with the way all string functions work. Since the length isn't included as part of an array, string-functions, scan forward in the array until the nul-character (e.g. '\0' -- equivalent to decimal 0) is found. See ASCII Table and Description. Regardless whether you are using strcpy, strchr, strcspn, etc.. All string functions rely on the nul-terminating character being present to define where the end of that string is.
A comparison of two similar functions from string.h will emphasize the importance of the nul-terminating character. Take for example:
char *strcpy(char *dest, const char *src);
The strcpy function simply copies bytes from src to dest until the nul-terminating character is found telling strcpy where to stop copying characters. Now take the similar function memcpy:
void *memcpy(void *dest, const void *src, size_t n);
The function performs a similar operation, but does not consider or require the src parameter to be a string. Since memcpy cannot simply scan forward in src copying bytes to dest until a nul-terminating character is reached, it requires an explicit number of bytes to copy as a third parameter. This third parameter provides memcpy with the same size information strcpy is able to derive simply by scanning forward until a nul-terminating character is found.
(which also emphasizes what goes wrong in strcpy (or any function expecting a string) if you fail to provide the function with a nul-terminated string -- it has no idea where to stop and will happily race off across the rest of your memory segment invoking Undefined Behavior until a nul-character just happens to be found somewhere in memory -- or a Segmentation Fault occurs)
That is why functions expecting a nul-terminated string must be passed a nul-terminated string and why it matters.
Intuitively...
Think of an array as a variable (holds things) and a string as a value (can be placed in a variable).
They are certainly not the same thing. In your case the variable is too small to hold the string, so the string gets cut off. ("quoted strings" in C have an implicit null character at the end.)
However it's possible to store a string in an array that is much larger than the string.
Note that the usual assignment and comparison operators (= == < etc.) don't work as you might expect. But the strxyz family of functions comes pretty close, once you know what you're doing. See the C FAQ on strings and arrays.

When using getch(), there's already a character inputted into stdin. How to remove it? [duplicate]

I understand that strings in C are just character arrays. So I tried the following code, but it gives strange results, such as garbage output or program crashes:
#include <stdio.h>
int main (void)
{
char str [5] = "hello";
puts(str);
}
Why doesn't this work?
It compiles cleanly with gcc -std=c17 -pedantic-errors -Wall -Wextra.
Note: This post is meant to be used as a canonical FAQ for problems stemming from a failure to allocate room for a NUL terminator when declaring a string.
A C string is a character array that ends with a null terminator.
All characters have a symbol table value. The null terminator is the symbol value 0 (zero). It is used to mark the end of a string. This is necessary since the size of the string isn't stored anywhere.
Therefore, every time you allocate room for a string, you must include sufficient space for the null terminator character. Your example does not do this, it only allocates room for the 5 characters of "hello". Correct code should be:
char str[6] = "hello";
Or equivalently, you can write self-documenting code for 5 characters plus 1 null terminator:
char str[5+1] = "hello";
But you can also use this and let the compiler do the counting and pick the size:
char str[] = "hello"; // Will allocate 6 bytes automatically
When allocating memory for a string dynamically in run-time, you also need to allocate room for the null terminator:
char input[n] = ... ;
...
char* str = malloc(strlen(input) + 1);
If you don't append a null terminator at the end of a string, then library functions expecting a string won't work properly and you will get "undefined behavior" bugs such as garbage output or program crashes.
The most common way to write a null terminator character in C is by using a so-called "octal escape sequence", looking like this: '\0'. This is 100% equivalent to writing 0, but the \ serves as self-documenting code to state that the zero is explicitly meant to be a null terminator. Code such as if(str[i] == '\0') will check if the specific character is the null terminator.
Please note that the term null terminator has nothing to do with null pointers or the NULL macro! This can be confusing - very similar names but very different meanings. This is why the null terminator is sometimes referred to as NUL with one L, not to be confused with NULL or null pointers. See answers to this SO question for further details.
The "hello" in your code is called a string literal. This is to be regarded as a read-only string. The "" syntax means that the compiler will append a null terminator in the end of the string literal automatically. So if you print out sizeof("hello") you will get 6, not 5, because you get the size of the array including a null terminator.
It compiles cleanly with gcc
Indeed, not even a warning. This is because of a subtle detail/flaw in the C language that allows character arrays to be initialized with a string literal that contains exactly as many characters as there is room in the array and then silently discard the null terminator (C17 6.7.9/15). The language is purposely behaving like this for historical reasons, see Inconsistent gcc diagnostic for string initialization for details. Also note that C++ is different here and does not allow this trick/flaw to be used.
From the C Standard (7.1.1 Definitions of terms)
1 A string is a contiguous sequence of characters terminated by and
including the first null character. The term multibyte string is
sometimes used instead to emphasize special processing given to
multibyte characters contained in the string or to avoid confusion
with a wide string. A pointer to a string is a pointer to its initial
(lowest addressed) character. The length of a string is the number of
bytes preceding the null character and the value of a string is the
sequence of the values of the contained characters, in order.
In this declaration
char str [5] = "hello";
the string literal "hello" has the internal representation like
{ 'h', 'e', 'l', 'l', 'o', '\0' }
so it has 6 characters including the terminating zero. Its elements are used to initialize the character array str which reserve space only for 5 characters.
The C Standard (opposite to the C++ Standard) allows such an initialization of a character array when the terminating zero of a string literal is not used as an initializer.
However as a result the character array str does not contain a string.
If you want that the array would contain a string you could write
char str [6] = "hello";
or just
char str [] = "hello";
In the last case the size of the character array is determined from the number of initializers of the string literal that is equal to 6.
Can all strings be considered an array of characters (Yes), can all character arrays be considered strings (No).
Why Not? and Why does it matter?
In addition to the other answers explaining that the length of a string is not stored anywhere as part of the string and the references to the standard where a string is defined, the flip-side is "How do the C library functions handle strings?"
While a character array can hold the same characters, it is simply an array of characters unless the last character is followed by the nul-terminating character. That nul-terminating character is what allows the array of characters to be considered (handled as) a string.
All functions in C that expect a string as an argument expect the sequence of characters to be nul-terminated. Why?
It has to do with the way all string functions work. Since the length isn't included as part of an array, string-functions, scan forward in the array until the nul-character (e.g. '\0' -- equivalent to decimal 0) is found. See ASCII Table and Description. Regardless whether you are using strcpy, strchr, strcspn, etc.. All string functions rely on the nul-terminating character being present to define where the end of that string is.
A comparison of two similar functions from string.h will emphasize the importance of the nul-terminating character. Take for example:
char *strcpy(char *dest, const char *src);
The strcpy function simply copies bytes from src to dest until the nul-terminating character is found telling strcpy where to stop copying characters. Now take the similar function memcpy:
void *memcpy(void *dest, const void *src, size_t n);
The function performs a similar operation, but does not consider or require the src parameter to be a string. Since memcpy cannot simply scan forward in src copying bytes to dest until a nul-terminating character is reached, it requires an explicit number of bytes to copy as a third parameter. This third parameter provides memcpy with the same size information strcpy is able to derive simply by scanning forward until a nul-terminating character is found.
(which also emphasizes what goes wrong in strcpy (or any function expecting a string) if you fail to provide the function with a nul-terminated string -- it has no idea where to stop and will happily race off across the rest of your memory segment invoking Undefined Behavior until a nul-character just happens to be found somewhere in memory -- or a Segmentation Fault occurs)
That is why functions expecting a nul-terminated string must be passed a nul-terminated string and why it matters.
Intuitively...
Think of an array as a variable (holds things) and a string as a value (can be placed in a variable).
They are certainly not the same thing. In your case the variable is too small to hold the string, so the string gets cut off. ("quoted strings" in C have an implicit null character at the end.)
However it's possible to store a string in an array that is much larger than the string.
Note that the usual assignment and comparison operators (= == < etc.) don't work as you might expect. But the strxyz family of functions comes pretty close, once you know what you're doing. See the C FAQ on strings and arrays.

How should character arrays be used as strings?

I understand that strings in C are just character arrays. So I tried the following code, but it gives strange results, such as garbage output or program crashes:
#include <stdio.h>
int main (void)
{
char str [5] = "hello";
puts(str);
}
Why doesn't this work?
It compiles cleanly with gcc -std=c17 -pedantic-errors -Wall -Wextra.
Note: This post is meant to be used as a canonical FAQ for problems stemming from a failure to allocate room for a NUL terminator when declaring a string.
A C string is a character array that ends with a null terminator.
All characters have a symbol table value. The null terminator is the symbol value 0 (zero). It is used to mark the end of a string. This is necessary since the size of the string isn't stored anywhere.
Therefore, every time you allocate room for a string, you must include sufficient space for the null terminator character. Your example does not do this, it only allocates room for the 5 characters of "hello". Correct code should be:
char str[6] = "hello";
Or equivalently, you can write self-documenting code for 5 characters plus 1 null terminator:
char str[5+1] = "hello";
But you can also use this and let the compiler do the counting and pick the size:
char str[] = "hello"; // Will allocate 6 bytes automatically
When allocating memory for a string dynamically in run-time, you also need to allocate room for the null terminator:
char input[n] = ... ;
...
char* str = malloc(strlen(input) + 1);
If you don't append a null terminator at the end of a string, then library functions expecting a string won't work properly and you will get "undefined behavior" bugs such as garbage output or program crashes.
The most common way to write a null terminator character in C is by using a so-called "octal escape sequence", looking like this: '\0'. This is 100% equivalent to writing 0, but the \ serves as self-documenting code to state that the zero is explicitly meant to be a null terminator. Code such as if(str[i] == '\0') will check if the specific character is the null terminator.
Please note that the term null terminator has nothing to do with null pointers or the NULL macro! This can be confusing - very similar names but very different meanings. This is why the null terminator is sometimes referred to as NUL with one L, not to be confused with NULL or null pointers. See answers to this SO question for further details.
The "hello" in your code is called a string literal. This is to be regarded as a read-only string. The "" syntax means that the compiler will append a null terminator in the end of the string literal automatically. So if you print out sizeof("hello") you will get 6, not 5, because you get the size of the array including a null terminator.
It compiles cleanly with gcc
Indeed, not even a warning. This is because of a subtle detail/flaw in the C language that allows character arrays to be initialized with a string literal that contains exactly as many characters as there is room in the array and then silently discard the null terminator (C17 6.7.9/15). The language is purposely behaving like this for historical reasons, see Inconsistent gcc diagnostic for string initialization for details. Also note that C++ is different here and does not allow this trick/flaw to be used.
From the C Standard (7.1.1 Definitions of terms)
1 A string is a contiguous sequence of characters terminated by and
including the first null character. The term multibyte string is
sometimes used instead to emphasize special processing given to
multibyte characters contained in the string or to avoid confusion
with a wide string. A pointer to a string is a pointer to its initial
(lowest addressed) character. The length of a string is the number of
bytes preceding the null character and the value of a string is the
sequence of the values of the contained characters, in order.
In this declaration
char str [5] = "hello";
the string literal "hello" has the internal representation like
{ 'h', 'e', 'l', 'l', 'o', '\0' }
so it has 6 characters including the terminating zero. Its elements are used to initialize the character array str which reserve space only for 5 characters.
The C Standard (opposite to the C++ Standard) allows such an initialization of a character array when the terminating zero of a string literal is not used as an initializer.
However as a result the character array str does not contain a string.
If you want that the array would contain a string you could write
char str [6] = "hello";
or just
char str [] = "hello";
In the last case the size of the character array is determined from the number of initializers of the string literal that is equal to 6.
Can all strings be considered an array of characters (Yes), can all character arrays be considered strings (No).
Why Not? and Why does it matter?
In addition to the other answers explaining that the length of a string is not stored anywhere as part of the string and the references to the standard where a string is defined, the flip-side is "How do the C library functions handle strings?"
While a character array can hold the same characters, it is simply an array of characters unless the last character is followed by the nul-terminating character. That nul-terminating character is what allows the array of characters to be considered (handled as) a string.
All functions in C that expect a string as an argument expect the sequence of characters to be nul-terminated. Why?
It has to do with the way all string functions work. Since the length isn't included as part of an array, string-functions, scan forward in the array until the nul-character (e.g. '\0' -- equivalent to decimal 0) is found. See ASCII Table and Description. Regardless whether you are using strcpy, strchr, strcspn, etc.. All string functions rely on the nul-terminating character being present to define where the end of that string is.
A comparison of two similar functions from string.h will emphasize the importance of the nul-terminating character. Take for example:
char *strcpy(char *dest, const char *src);
The strcpy function simply copies bytes from src to dest until the nul-terminating character is found telling strcpy where to stop copying characters. Now take the similar function memcpy:
void *memcpy(void *dest, const void *src, size_t n);
The function performs a similar operation, but does not consider or require the src parameter to be a string. Since memcpy cannot simply scan forward in src copying bytes to dest until a nul-terminating character is reached, it requires an explicit number of bytes to copy as a third parameter. This third parameter provides memcpy with the same size information strcpy is able to derive simply by scanning forward until a nul-terminating character is found.
(which also emphasizes what goes wrong in strcpy (or any function expecting a string) if you fail to provide the function with a nul-terminated string -- it has no idea where to stop and will happily race off across the rest of your memory segment invoking Undefined Behavior until a nul-character just happens to be found somewhere in memory -- or a Segmentation Fault occurs)
That is why functions expecting a nul-terminated string must be passed a nul-terminated string and why it matters.
Intuitively...
Think of an array as a variable (holds things) and a string as a value (can be placed in a variable).
They are certainly not the same thing. In your case the variable is too small to hold the string, so the string gets cut off. ("quoted strings" in C have an implicit null character at the end.)
However it's possible to store a string in an array that is much larger than the string.
Note that the usual assignment and comparison operators (= == < etc.) don't work as you might expect. But the strxyz family of functions comes pretty close, once you know what you're doing. See the C FAQ on strings and arrays.

character string literal and string literal in standard?

I am confused by these four terms:
character string literal
character constants
string literal.
multibyte character sequence
And reading this quote in C Standard:
A character string literal need not be a string (see 7.1.1), because
a null character may be embedded in it by a \0 escape sequence.
What is meant by the first part ?
A string-literal is
either a character string literal, e.g. "abc";
or UTF-8 string literal, e.g. u8"abc";
or wide string literal, e.g. L"abc".
From the standard (emphasis mine):
A character string literal is a sequence of zero or more multibyte characters enclosed in
double-quotes, as in "xyz". A UTF−8 string literal is the same, except prefixed by u8.
A wide string literal is the same, except prefixed by the letter L, u, or U.....
In translation phase 7, a byte or code of value zero is appended to each multibyte
character sequence that results from a string literal or literals. 78)
78) A string literal need not be a string (see 7.1.1), because a null character may be embedded in it by a
\0 escape sequence.
A string is a contiguous sequence of characters terminated by and including the first null
character.
So a string literal may have \0 also in the middle or even at the beginning, for instance "a\0b" or "\0ab". I think this is what the footnote is saying.
A character constant is a c-char-sequence (usually a single character) in single quotes, with a possible prefix L/u/U.
An integer character constant is a sequence of one or more multibyte characters enclosed
in single-quotes, as in 'x'. A wide character constant is the same, except prefixed by the
letter L, u, or U.
So the terminology is not very symmetric, IMO. E.g. wide character constant is a particular case of character constant. However both character string literal and wide string literal belong to string literals.

C - is char* template a special type of string?

I came across a line like
char* template = "<html><head><title>%i %s</title></head><body><h1>%i %s</h1> </body></html>";
while reading through code to implement a web server.
I'm curious as I've never seen a string like this before - is template specifying a special type of string (I'm just guessing here because it was highlighted on my IDE)? Also, how would strlen() work with something like this?
Thanks
char* template = "<html>...</html>";
is fundamentally no different than
char *s = "hello";
The name template is not special, it's just an ordinary identifier, the name of the variable. (template happens to be a keyword in C++, but this is C.)
It would be better to define it as const, to enforce the fact that string literals cannot be modified, but it's not mandatory.
Note that template itself is not a string. It's a pointer to a string. The string itself (defined by the language as "a contiguous sequence of characters terminated by and including the first null
character") is the sequence starting with "<html>" and ending with "</html>" and the implicit terminating null character.
And in answer to your second question, strlen(template) would work just fine, giving you the length of the string (81 in this case).
I imagine that there is another part of the code that uses this string to format an output string used as a page by the web server. The strlen function will return the length of the string.
Unless there's a null character somewhere in the initializer or an escape sequence using a \ character, which there isn't, there's nothing special about this string. A % is a normal character in a string and doesn't receive special treatment. The strlen function in particular will read %i as two characters, i.e. % and i. Similarly for %s.
In contrast, a \ is a special character for string and denotes an escape sequence. The \ and the character that follows it in the string constant constitute a single character in the string itself. For example, \n means a newline character (ASCII 10) and \t is a tab character (ASCII 8).
This string is most likely used as a format string for printf. This function will read the string and interpret the %i and %s as format string accepting a int and a char * respectively.
char* template = "<html>...</html>";
just create a char array to store data "<html>...</html>",and this array name is template,you can change this name to other name you want.When create char array,compiler will add \0 to the end of array.strlen will calculate the length from array start to \0(\0 is no include).
I think your IDE will highlight this string is because this string is used in other place.

Resources