Partial C string initialisation [duplicate]

In some code I read today, I came across a form of C string initialisation that was new to me.
It chains multiple string literals, like "A""B""C"...
It also allows splitting the string initialisation across multiple lines.
I set up a small Hello World demo so you can see what I am talking about:
#include <stdio.h>
#define SPACE " "
#define EXCLAMATION_MARK "!"
#define HW "Hello"SPACE"World"EXCLAMATION_MARK
int main()
{
char hw_str[] =
"Hello"
SPACE
"World"
"!";
printf("%s\n",hw_str);
printf("%s\n",HW);
return 0;
}
So here are some questions:
Is this valid according to the standard?
Why does this work? "abc" is like an array {'a','b','c'}, right? So why does an array initialisation spread over multiple pairs of "" get concatenated?
Does this feature have an official name, so that searching for it turns up documentation describing it?
Is this portable?

From the C Standard (5.1.1.2 Translation phases)
1 The precedence among the syntax rules of translation is specified by
the following phases.
6. Adjacent string literal tokens are concatenated.
So for example this part of the program
char hw_str[] =
"Hello"
SPACE
"World"
"!";
that after macro substitutions looks like
char hw_str[] =
"Hello"
" "
"World"
"!";
is processed in the sixth translation phase, where adjacent string literals are concatenated, and you get
char hw_str[] =
"Hello World!";

Related

String splicing in the C language, a strange phenomenon

Recently, I saw the following C code:
printf("%s\n", "1234" "qwer");
// output: 1234qwer
snprintf(buffer, sizeof(buffer), "bvcx" "mju");
// buffer data: bvcxmju
To be honest, this amazed me. Before that, I didn't know that strings could be pasted together in the "1234" "qwer" form. Why does this work?
Then I tried 'char a[] = "1234" "qwer"', and gcc returned an error!
So, can someone explain this phenomenon and the theory behind it?
What you saw has been part of the C language syntax for a long time. A string literal can be split in multiple parts separated only by white space, after preprocessing and comment removal. This syntax enables for example:
writing a long string literal on multiple lines:
char message[] = "This is a long message that can be split on "
"multiple lines for readability";
combining string fragments defined as macros:
printf("The value of i32 is %" PRId32 "\n", i32);
separating string contents that have a different meaning if juxtaposed:
char s1[] = "This is ESC 4: \x1B" "4";
char s2[] = "so is this: \0334 and this: \33""4";
char s3[] = "but not this: \334";
char s4[] = "nor this: \x1B4";
combining stringified macro arguments
Adjacent string literals are always concatenated into a single one as part of the translation phases. See C17 6.4.5/5:
In translation phase 6, the multibyte character sequences specified by any sequence of adjacent character and identically-prefixed string literal tokens are concatenated into a single multibyte character sequence.
Formally, translation phase 6 happens after macro expansion but before preprocessing tokens are converted to tokens. This means, for example, that
sizeof "hello " "world" yields the result 12, equivalent to:
sizeof "hello world"
In practice, this is convenient when writing various "stringification" macros, for example:
#include <stdio.h>
#define STRINGIFY(x) #x
#define STRINGIFY_CONCAT(a,b) STRINGIFY(a) " " STRINGIFY(b)
int main (void)
{
puts(STRINGIFY_CONCAT(hello,world));
}
It's also a useful feature whenever you have to use hex escape sequences and need to terminate them, since C allows them to be of variable length: puts("\xABBA") vs puts("\xAB" "BA") will give different outputs.

No. of strings in format string specifier [duplicate]

I have seen this way of combining several strings in printf and scanf statements.
int a;
printf("Printing" "using" "multiple" "strings" "%d", a);
// The above is just an example; some usage that I saw was for printing specific integer types like uint32_t:
// uint32_t var; printf("Value is %" PRIu32, var);
I always thought that we could only use a single string as the format. Judging by the declaration of the printf function, format can point to only one string.
int printf ( const char * format, ... );
So out of curiosity I tried the following code and it ran successfully!
char arr[] = "Hello " "World";
printf("%s",arr); // Output - Hello World
Could anyone explain how this concatenation works and what the correct way of doing it is? Any help is appreciated.
Whether or not you put white space between two string literals, the compiler concatenates them.
That's one of C's features: as defined by the ISO C standard, adjacent string literals are automatically combined (concatenated) into a single one.

C chars add themselves up for no reason [duplicate]

I think I'm going insane because I cannot find an explanation for why C is combining my chars.
I've made a test program for you...
#include <stdio.h>
#include <stdlib.h>
int main()
{
char alphabet_big[26] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
char alphabet_small[26] = "abcdefghijklmnopqrstuvwxyz";
printf("%s\n", alphabet_small);
return 0;
}
Results: abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZV
Why is C combining alphabet_small and alphabet_big? That makes no sense. And why is there a "V" at the end of the output?
I hope someone can provide me an answer to this "problem".
Best regards.
Keep in mind that a C string is defined as a null-terminated char array.
Change the declaration and initialization statement here (for both statements):
char alphabet_big[26] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"; // forces the compiler to use only 26 chars,
// regardless of the count of initializers
// (leaving no room for the NUL terminator)
To
char alphabet_big[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"; // the empty [] allows the compiler to set aside
// the proper space, no matter how many initializers
The first produces undefined behavior when used with any of the string functions, such as strcpy, strcmp, and in this case printf with the "%s" format specifier.
The first produces the following, which is not a C string:
|A|B|C|D|E|F|G|H|I|J|K|L|M|N|O|P|Q|R|S|T|U|V|W|X|Y|Z|?|?|?|
While the 2nd produces the following, which is a C string:
|A|B|C|D|E|F|G|H|I|J|K|L|M|N|O|P|Q|R|S|T|U|V|W|X|Y|Z|\0|?|?|
Note - The ? symbols used in above illustration depict memory locations that are not owned by the program, and for which the contents are unknown, or may not even exist. A program attempting to access these locations would be invoking undefined behavior.
Normally the library functions expect to find a NUL byte at the end of a string, and the compiler is happy to add it for you automatically. But you've told it that alphabet_big has only 26 bytes, which leaves no room for that extra NUL byte, so printf keeps reading into whatever happens to follow in memory (here, the other array).
Remove the 26 and let the compiler count for you.

Why does adding a space between two strings concat the strings in c? [duplicate]

Sorry if I am going to ask a very basic question. I tried to search for it but I have been unable to find any answer.
When I run the following code:
#include <stdio.h>
int main() {
char *temp = "sai" "krishna";
printf("%s\n", temp);
return 0;
}
it prints saikrishna
Can you explain why this happens? Shouldn't we use strcat or other concatenation techniques?
Can you point me to any documentation about it, and to where this technique can be used?
It's a language feature. C allows string literals to get concatenated at compile time. It can be handy when you have very long string literals stretching over several lines, or when you want to break up string literals containing hex escape sequences. (For example, puts("\x42AD") is parsed as a single hex escape with the value 0x42AD, which is likely nonsense and unintended, as opposed to puts("\x42" "AD"), which will print BAD.)
strcat and strcpy are for string handling at run time. If you have two string literals, they are compile-time constants, and may as well get concatenated by the compiler in advance, to save execution time.

Error in macro expansion

I have been trying to understand macro expansion and found out that the second printf gives out an error. I am expecting the second print statement to generate the same output as the first one. I know there are functions to do string concatenation. I am finding it difficult to understand why first print statement works and the second doesn't.
#define CAT(str1, str2) str1 str2
void main()
{
char *string_1 = "s1", *string_2 = "s2";
printf(CAT("s1", "s2"));
printf(CAT(string_1, string_2));
}
Concatenating string literals, like "s1" "s2", is part of the language specification. Just placing two variables next to each other, like string_1 string_2 is not part of the language.
If you want to concatenate two string variables, consider using strcat instead, but remember to allocate enough space for the destination string.
Try to do the preprocessing "by hand":
CAT takes two arguments and simply places them one after the other, separated by a space, in the source text; it does not print or concatenate anything by itself. So if we preprocess your code, it becomes:
void main()
{
char *string_1 = "s1", *string_2 = "s2";
printf("s1" "s2");
printf(string_1 string_2);
}
While "s1" "s2" is automatically concatenated to "s1s2" by the compiler, string_1 string_2 is invalid syntax.
