Concatenate string literal with char literal - c

I want to concat a string literal and char literal. Being syntactically incorrect, "abc" 'd' "efg" renders a compiler error:
x.c:4:24: error: expected ',' or ';' before 'd'
For now I have to use snprintf (needlessly), even though the values of the string literal and the char literal are known at compile time.
I tried
#define CONCAT(S,C) ({ \
static const char *_r = { (S), (C) }; \
_r; \
})
but it does not work because the null terminator of S is not stripped (besides producing compiler warnings).
Is there a way to write a macro to use
"abc" MACRO('d') "efg" or
MACRO1(MACRO2("abc", 'd'), "efg") or
MACRO("abc", 'd', "efg") ?
In case someone asks why I want that: The char literal comes from a library and I need to print the string out as a status message.

If you can live with the single quotes being included with it, you could use stringification:
#include <stdio.h>
#define SOME_DEF 'x'
#define STR1(z) #z
#define STR(z) STR1(z)
#define JOIN(a,b,c) a STR(b) c
int main(void)
{
const char *msg = JOIN("Something to do with ", SOME_DEF, "...");
puts(msg);
return 0;
}
Depending on the context that may or may not be appropriate, but as far as getting an actual string literal built this way, it's the only approach that comes to mind without formatting at runtime.

Try this. It uses the C macro trick of double macros so the macro argument has the chance to expand before it is stringified.
#include <stdio.h>
#define C d
#define S "This is a string that contains the character "
#define STR(s) #s
#define XSTR(s) STR(s)
const char* str = S XSTR(C);
int main()
{
puts(str);
return 0;
}

I came up with a GCC-specific solution that I don't like too much, as CONCAT calls cannot be nested.
#include <stdio.h>
#define CONCAT(S1,C,S2) ({ \
static const struct __attribute__((packed)) { \
char s1[sizeof(S1) - 1]; \
char c; \
char s2[sizeof(S2)]; \
} _r = { (S1), (C), (S2) }; \
(const char *) &_r; \
})
int main(void) {
puts(CONCAT ("abc", 'd', "efg"));
return 0;
}
http://ideone.com/lzEAn

C will only let you concatenate string literals. Actually, there's nothing wrong with snprintf(). You could also use strcpy():
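size_t len = strlen(str1); /* measure once, before the terminator is overwritten */
strcpy(dest, str1);
dest[len] = c; /* replaces the terminating '\0' of str1 */
strcpy(dest + len + 1, str2); /* copies str2 together with its terminator */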
You could also use a giant switch statement to overcome this limitation:
switch(c) {
case 'a':
puts("part1" "a" "part2");
break;
case 'b':
puts("part1" "b" "part2");
break;
/* ... */
case 'z':
puts("part1" "z" "part2");
break;
}
...but I refuse to claim any authorship.
To put it short, just stick with snprintf().

Related

Create string in C where first character is string length

This question is related to Concatenate string literal with char literal, but is slightly more complex.
I would like to create a string literal, where the first character of the string is the length of the string, and the second character is a constant. This is how it is being done currently:
const char myString[] =
{
0x08,
SOME_8_BIT_CONSTANT,
'H',
'e',
'l',
'l',
'o',
0x00
};
Ideally, I would like to replace it with something like:
const char myString[] = BUILD_STRING(0xAA, "Hello");
I tried implementing it like this:
#define STR2(S) #S
#define STR(S) STR2(S)
#define BUILD_STRING(C, S) {(sizeof(S)+2), C, S}
const char myString[] = BUILD_STRING(0xAA, "Hello");
but it expands to:
const char myString[] = {(sizeof("Hello")+2), 0xAA, "Hello"};
and the compiler doesn't seem to like mixing numbers and strings.
Is there any way to do this?
You could in-place define a structure to hold the prefix and the rest, conveniently initialize it, and then treat the whole struct as a char array (not a strict-aliasing violation because standard C lets you treat any object as a char array).
Technically, you're not guaranteed that the compiler won't insert padding between the prefix and the rest, but in practice you can count on it.
#define BUILD_STRING(C, S) \
((char const*)&(struct{ char const p[2]; char const s[sizeof(S)]; })\
{ {(sizeof(S)+2), C}, S})
const char *myString = BUILD_STRING(0xAA, "Hello");
#include <stdio.h>
int main()
{
printf("%d, %#hhX, '%s'\n", myString[0], myString[1], myString+2);
//PRINTS: 8, 0XAA, 'Hello'
}
Edit:
If you're paranoid about the possibility of padding, here's one way to statically assert none is inserted:
#define BUILD_STRING__(S) \
char const p[2]; char const s[sizeof(S)]
#define BUILD_STRING(C, S) \
((char const*)&(struct{BUILD_STRING__(S); \
_Static_assert(sizeof(S)+2== \
sizeof(struct{BUILD_STRING__(S);_Static_assert(1,"");}),""); \
}){ {sizeof(S)+2, C}, S})
Alternatively, using the first version with the (nonstandard)
__attribute((__packed__)) should do the same.

Widen version of __FUNCTION__ in Linux

I tried to write a wide version of __FUNCTION__ to support portable code (Windows and Linux)
#include <stdio.h>
#include <wchar.h>
#include <errno.h>
typedef wchar_t WCHAR;
typedef const wchar_t * PCWCH;
#define WIDEN2(x) L ## x
#define WIDEN(x) WIDEN2(x)
#ifdef _WIN32
#define __WFUNCTION__ WIDEN(__FUNCTION__) L"(): "
#elif __linux__
#define MAX_FUNC_NAME_SIZE 1024
WCHAR func_name[MAX_FUNC_NAME_SIZE];
#define __WFUNCTION__ \
(AsciiStrToUnicodeStr(__FUNCTION__, func_name, MAX_FUNC_NAME_SIZE) == 0) ? func_name : L"(): "
#endif
int AsciiStrToUnicodeStr(const char *src, WCHAR *destination, unsigned int dest_max)
{
size_t retval;
if (!src || !destination || (dest_max == 0)) {
return -EINVAL;
}
retval = mbstowcs(destination, src, dest_max);
return (retval == -1) ? retval : 0;
}
void DbgTrace(PCWCH pwcFormat,...)
{
wprintf(L"%ls\n", pwcFormat);
}
void test()
{
DbgTrace(__WFUNCTION__ L"ERROR: Null string passed\r\n");
}
int main()
{
DbgTrace(__WFUNCTION__ L"ERROR: Null string passed\r\n");
test();
}
The output contains only the name of the function, not the concatenated string.
What is the mistake in the above code?
Added output of Preprocessor:
void test()
{
DbgTrace((AsciiStrToUnicodeStr(__FUNCTION__, func_name, 1024) == 0) ? func_name : L"(): " L"ERROR: Null string passed\r\n");
}
__FUNCTION__ (which should be spelled __func__ in C99) is not a string literal; it is effectively an implicitly-defined character array. So you can't create a string literal out of it with literal concatenation. (At least, not in standard C. MSVC might treat __FUNCTION__ as a string literal, but it's not portable.)
String literal concatenation is done right after preprocessing, and can only be applied to string literals, not variables. func_name " extra text" would be a syntax error.
But that's not what the macro expansion produces, as you can see. The literals being concatenated are L"(): " and L"ERROR: Null string passed\r\n", so the whole message ends up in the else branch of the conditional expression.
Note that if __func__ were a string literal, you could turn it into a wide string literal with string concatenation. Eg:
L"" __FILE__ ": the file"
is a valid wide string literal. (But it won't work on Windows. See https://stackoverflow.com/a/21789691/1566221).
Since __func__ is not a string literal, there is no way to extend it in the preprocessor (nor to convert it to a wide string). Your best bet is to pass it as a separate argument in a printf (or wprintf) call:
printf("%s %s", funcname, message);

Can multiple _Generic be used to create a string literal?

Is there a way to use the _Generic keyword multiple times in the same expression to create a single string literal?
What I am looking for is a way to for example generate a single format string to pass to printf, with all the conversion specifiers adapted to the proper types.
When writing this answer I ended up with a rather ugly work-around:
#include <stdio.h>
typedef struct {
int a;
char b;
long c;
} ABC;
// printf conversion specifiers:
#define CS(x) \
_Generic((x), \
int: "%d", \
char: "%c", \
long: "%ld")
int main (void)
{
ABC abc = {1, 'a', 2};
printf(CS(abc.a), abc.a); printf(" ");
printf(CS(abc.b), abc.b); printf(" ");
printf(CS(abc.c), abc.c); printf(" ");
return 0;
}
6 printf calls instead of 1, hardly ideal.
The problem is that I can't find a way to combine _Generic and string literal concatenation by the pre-processor, like this:
printf(CS(abc.a) " ", abc.a); // doesn't work
printf(CS(abc.a) CS(abc.b), abc.a, abc.b); // doesn't work either
Because apparently generic macros don't count as string literals in the pre-processor, so string literal concatenation isn't possible. I toyed around with "stringification" macros but no luck there.
I'm going to say that the answer is NO.
First, the _Generic keyword is not (and cannot possibly be) a pre-processor directive. A generic-selection is a primary expression, as defined in section 6.5.1. Given the input
printf(CS(abc.a) "hello", abc.a);
the output from the preprocessor (generated by the -E compiler option) is:
printf(_Generic((abc.a), int: "%d", char: "%c", long: "%ld") "hello", abc.a);
Notice that string concatenation is not possible because the generic-selection has not been evaluated. Also note that it's impossible for the pre-processor to evaluate since it requires knowledge that abc is a structure of type ABC, that has member a. The pre-processor does simple text substitution, it has no knowledge of such things.
Second, the compiler phases defined in section 5.1.1.2 don't allow evaluation of _Generic keywords before string concatenation. The relevant phases, quoted from the spec, are
6. Adjacent string literal tokens are concatenated.
7. White-space characters separating tokens are no longer significant. Each preprocessing token is converted into a token. The resulting tokens are syntactically and semantically analyzed and translated as a translation unit.
The _Generic keyword must be evaluated in phase 7, since it requires knowledge that is only available after tokens have been syntactically and semantically analyzed, e.g. that abc is a structure with member a. Hence, multiple _Generic keywords cannot take advantage of string concatenation to produce a single string literal.
Nice question, you can paste a string passing another parameter:
#include <stdio.h>
typedef struct {
int a;
char b;
long c;
} ABC;
// printf conversion specifiers:
#define CS2(x, y) \
_Generic((x), \
int: "%d" y, \
char: "%c" y, \
long: "%ld" y)
int main (void)
{
ABC abc = {1, 'a', 2};
printf(CS2(abc.a, "Hello"), abc.a);
return 0;
}
Just for the record, it turns out it is possible to generate a string constant based on _Generic at compile-time, by using other dirty tricks than those available from the pre-processor.
The solution I came up with is so ugly that I barely dare to post it, but I'll do so just to prove it possible.
Don't write code like this!
#include <stdio.h>
typedef struct {
int a;
char b;
long c;
} ABC;
// printf conversion specifiers:
#define CS(x) \
_Generic((x), \
int: "%d", \
char: "%c", \
long: "%ld")
#pragma pack(push, 1)
#define print2(arg1,arg2) \
{ \
typedef struct \
{ \
char arr1 [sizeof(CS(arg1))-1]; \
char space; \
char arr2 [sizeof(CS(arg2))-1]; \
char nl_nul[2]; \
} struct_t; \
\
typedef union \
{ \
struct_t struc; \
char arr [sizeof(struct_t)]; \
} cs2_t; \
\
const cs2_t cs2 = \
{ \
.struc.arr1 = CS(arg1), \
.struc.space = ' ', \
.struc.arr2 = CS(arg2), \
.struc.nl_nul = "\n" \
}; \
\
printf(cs2.arr, arg1, arg2); \
}
#pragma pack(pop)
int main (void)
{
ABC abc = {1, 'a', 2};
print2(abc.a, abc.b);
print2(abc.a, abc.c);
print2(abc.b, abc.c);
return 0;
}
Output:
1 a
1 2
a 2
Explanation:
The macro print2 is a wrapper around printf and prints exactly 2 arguments, no matter type, with their correct conversion specifiers.
It builds up a string based on a struct, to which the conversion specifier string literals are passed. Each array place-holder for such a conversion specifier was purposely declared too small to fit the null termination.
Finally, this struct is dumped into a union which can interpret the whole struct as a single string. Of course this is quite questionable practice (even though it doesn't violate strict aliasing): if there is any padding then the program will fail.

Returning a Character String from #define Function

I know you can return a character string from a normal function in C as in this code
#include <stdio.h>
char* returnstring(char *pointer) {
pointer="dog";
return pointer;
}
int main(void)
{
char *dog = NULL;
printf("%s\n", returnstring(dog));
}
However, I can't find a way to be able to return character strings in #define functions, as in this code
#include <stdio.h>
#define returnstring(pointer) { \
pointer="dog"; \
return pointer; \
}
int main(void)
{
char *dog = NULL;
printf("%s\n", returnstring(dog));
}
I know that there are workarounds (like using the first program). I just want to know if it is possible.
Thinking about a "#define function" is, IMO, the wrong way to approach this.
#define is a blunt instrument which amounts to a text find/replace. It knows little to nothing about C as a language, and the replacement is done before any of your real code is even looked at.
What you have written isn't a function in its own right; it is a piece of text that looks like one, and it is pasted in wherever you have written the alias.
If you want to #define what you just did, that's fine (I didn't check your example specifically, but in general, using #define for a function call and substituting the arguments is possible), but think twice before doing so unless you have an amazing reason. And then think again until you decide not to do it.
You can't "return" from a macro. Your best (ugh... arguably the "best", but anyway) bet is to formulate your macro in such a way that it evaluates to the expression you want to be the result. For example:
#define returnstring(ptr) ((ptr) = "hello world")
const char *p;
printf("%s\n", returnstring(p));
If you have multiple expression statements, you can separate them using the horrible comma operator:
#define even_more_dangerous(ptr) (foo(), bar(), (ptr) = "hello world")
If you are using GCC or a compatible compiler, you can also take advantage of a GNU extension called "statement expressions" so as to embed whole (non-expression) statements into your macro:
#define this_should_be_a_function(ptr) ({ \
if (foo) { \
bar(); \
} else { \
for (int i = 0; i < baz(); i++) { \
quirk(); \
} \
} \
ptr[0]; /* last statement must be an expression statement */ \
})
But if you get to this point, you could really just write a proper function as well.
You don't return anything from a #defined macro. Roughly speaking, the C preprocessor replaces the macro call with the text of the macro body, with arguments textually substituted into their positions. If you want a macro to assign a pointer to "dog" and evaluate to the pointer, you can do this:
#define dogpointer(p) ((p)="dog")
The thing is returnstring as a macro does not do what it says; it also assigns the value to the parameter. The function does as it says, even if it (somewhat oddly) uses its parameter as a temporary variable.
The function is equivalent to:
char* returnstring(char *ignored) {
return "dog";
}
The function macro is much the same as:
#define returnstring(pointer) pointer = "dog"
Which begs the question, why not call it assign_string?
Or why not just have:
#define dogString "dog"
And write:
int main(void)
{
char *dog = NULL;
printf("%s\n", dog = dogString);
}
The function for assignstring is:
char* assignstring(char **target) {
*target= "dog";
return *target;
}
You can then have a macro:
#define assign_string_macro(pointer) assignstring(&(pointer))
Ultimately if you want to "return character strings in #define functions", then all you need is:
#define returnstring(ignored) "dog"

Can I print #defines given their values in C?

I have
#define ADD 5
#define SUB 6
Can I print ADD and SUB given their values 5 and 6?
No.
The names of the defined symbols are removed by the preprocessor, so the compiler never sees them.
If these names are important at runtime, they need to be encoded in something more persistent than just preprocessor symbol names. Perhaps a table with strings and integers:
#define DEFINE_OP(n) { #n, n }
static const struct {
const char *name;
int value;
} operators[] = {
DEFINE_OP(ADD),
DEFINE_OP(SUB),
};
This uses the stringifying preprocessor operator # to avoid repetitions.
With the above, you can trivially write look-up code:
const char * op_to_name(int op)
{
size_t i;
for(i = 0; i < sizeof operators / sizeof *operators; ++i)
if(operators[i].value == op)
return operators[i].name;
return NULL;
}
you can do something like
printf("%d", ADD);
and it will print 5
The thing you have to remember about defines is:
Defines are substituted into the source code by the preprocessor before it is compiled so all instances of ADD in your code are substituted by 5. After the preprocessor the printf looks like this:
printf("%d", 5);
So to answer your question:
No you can't do it like that.
Yes, but not via some reverse lookup mechanism wherein the value 5 is somehow symbolically tied to the string "ADD". The symbols defined via a #define are textually replaced by the pre-processor. You can however keep it simple:
const char *get_name(int value) {
switch(value) {
case ADD:
return "ADD";
case SUB:
return "SUB";
default:
return "WHATEVER";
}
}
#include <stdio.h>
int main() {
printf("%s = %d\n", get_name(ADD), ADD);
printf("%s = %d", get_name(SUB), SUB);
}
With modern C (C99 and later), this is even simpler than unwind's answer, using designated initializers and compound literals:
#define DEFINE_OP(n) [n] = #n
#define OPNAMES ((char const*const[]){ \
DEFINE_OP(ADD), \
DEFINE_OP(SUB), \
})
static inline
char const* getOp(unsigned op) {
size_t const maxOp = sizeof OPNAMES/ sizeof *OPNAMES;
if (op >= maxOp || !OPNAMES[op]) return "<unknown operator>";
else return OPNAMES[op];
}
Any modern compiler should then be able to expand calls such as getOp(ADD) at compile time.

Resources