Create string in C where first character is string length - c

This question is related to Concatenate string literal with char literal, but is slightly more complex.
I would like to create a string literal, where the first character of the string is the length of the string, and the second character is a constant. This is how it is being done currently:
const char myString[] =
{
0x08,
SOME_8_BIT_CONSTANT,
'H',
'e',
'l',
'l',
'o',
0x00
};
Ideally, I would like to replace it with something like:
const char myString[] = BUILD_STRING(0xAA, "Hello");
I tried implementing it like this:
#define STR2(S) #S
#define STR(S) STR2(S)
#define BUILD_STRING(C, S) {(sizeof(S)+2), C, S}
const char myString[] = BUILD_STRING(0xAA, "Hello");
but it expands to:
const char myString[] = {(sizeof("Hello")+2), 0xAA, "Hello"};
and the compiler doesn't seem to like mixing numbers and strings.
Is there any way to do this?

You could in-place define a structure to hold the prefix and the rest, conveniently initialize it, and then treat the whole struct as a char array (not a strict-aliasing violation because standard C lets you treat any object as a char array).
Technically, the standard does not guarantee that the compiler won't insert padding between the prefix and the rest, but in practice you can count on there being none, since both members are char arrays.
#define BUILD_STRING(C, S) \
((char const*)&(struct{ char const p[2]; char const s[sizeof(S)]; })\
{ {(sizeof(S)+2), C}, S})
const char *myString = BUILD_STRING(0xAA, "Hello");
#include <stdio.h>
int main()
{
printf("%d, %#hhX, '%s'\n", myString[0], myString[1], myString+2);
//PRINTS: 8, 0XAA, 'Hello'
}
Edit:
If you're paranoid about the possibility of padding, here's one way to statically assert none is inserted:
#define BUILD_STRING__(S) \
char const p[2]; char const s[sizeof(S)]
#define BUILD_STRING(C, S) \
((char const*)&(struct{BUILD_STRING__(S); \
_Static_assert(sizeof(S)+2== \
sizeof(struct{BUILD_STRING__(S);_Static_assert(1,"");}),""); \
}){ {sizeof(S)+2, C}, S})
Alternatively, using the first version with the (nonstandard)
__attribute((__packed__)) should do the same.
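As a rough sketch of that packed alternative (GCC and Clang only; the name BUILD_STRING_PACKED and the constant 0x2A are made up for illustration), the first version could be written like this:

```c
#include <assert.h>

/* GCC/Clang extension: a packed struct has no padding between its members. */
#define BUILD_STRING_PACKED(C, S) \
    ((char const *)&(struct __attribute__((__packed__)) { \
        char const p[2]; \
        char const s[sizeof(S)]; \
    }){ { (char)(sizeof(S) + 2), (char)(C) }, S })

/* Compile-time check of the layout used for "Hello" (6 bytes with terminator). */
static_assert(sizeof(struct __attribute__((__packed__)) {
                  char const p[2];
                  char const s[6];
              }) == 8, "packed layout has no padding");

/* At file scope the compound literal has static storage duration. */
static const char *const packed_hello = BUILD_STRING_PACKED(0x2A, "Hello");
```

Here packed_hello[0] holds the total length (8), packed_hello[1] holds the constant, and packed_hello + 2 points at the text.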

Related

Concatenate content of string var with string using Macro in C

As described in the title, I want to concatenate the content of string var with string using Macro.
This is an example:
const char * Name = "OverFlow";
#define DEFINE_VAR(str) unsigned char u8_##str##_Var;
I want to use the macro as following:
DEFINE_VAR(Name)
The result is:
unsigned char u8_Name_Var;
and not
unsigned char u8_OverFlow_Var;
Do you have any idea?
The preprocessor cannot concatenate the value of a variable with a string; it can only concatenate preprocessor tokens, which may be the result of a macro expansion.
It would be possible with #define Name OverFlow or similar.
Example file macro.c:
Edit: As suggested by Lundin I added macros to get a string literal in case the variable char *Name = "OverFlow"; is needed for other purposes.
#define NAME OverFlow
#define DEFINE_VAR_2(str) unsigned char u8_##str##_Var
#define DEFINE_VAR(str) DEFINE_VAR_2(str)
/* macros to get a string literal */
#define STR_2(x) #x
#define STR(x) STR_2(x)
#define STRNAME STR(NAME)
#define STRVAR const char *Name = STR(NAME)
/* this works */
DEFINE_VAR(NAME);
/* this doesn't work */
DEFINE_VAR_2(NAME);
/* if you need a string with the variable name */
const char *Name = STRNAME;
/* or with a single macro */
STRVAR;
Result:
# 1 "macro.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "macro.c"
/* macros to get a string literal as proposed by Lundin */
/* this works */
unsigned char u8_OverFlow_Var;
/* this doesn't work */
unsigned char u8_NAME_Var;
/* if you need a string with the variable name */
const char *Name = "OverFlow";
/* or with a single macro */
const char *Name = "OverFlow";
OK, here's something that works, though it may not be exactly what you are looking for, along with GNU documentation links to help you understand the C preprocessor.
main.c
#include <stdio.h>
#define xstr(s) str(s)
#define str(s) "u8_"#s"_Var"
int main(int argc, char const *argv[])
{
const char *Overflow = "OverFlow"; /* initialized, since printing an uninitialized pointer is undefined behavior */
printf("name: %s\n", Overflow);
printf("defined: %s\n", xstr(Overflow));
return 0;
}
output
name: OverFlow
defined: u8_Overflow_Var
https://gcc.gnu.org/onlinedocs/cpp/Stringizing.html
&
https://gcc.gnu.org/onlinedocs/cpp/Concatenation.html#Concatenation
The define works, what are you trying to achieve ?
const char * Name = "OverFlow";
#define DEFINE_VAR(str) unsigned char u8_##str##_Var;
DEFINE_VAR(Name)
int main(int argc, char const *argv[])
{
printf("%s\n", Name);
return 0;
}
output result is
OverFlow

How can I initialize a flexible array in rodata and create a pointer to it?

In C, the code
char *c = "Hello world!";
stores Hello world!\0 in rodata and initializes c with a pointer to it.
How can I do this with something other than a string?
Specifically, I am trying to define my own string type
typedef struct {
size_t Length;
char Data[];
} PascalString;
And then want some sort of macro so that I can say
const PascalString *c2 = PASCAL_STRING_CONSTANT("Hello world!");
And have it behave the same, in that \x0c\0\0\0Hello world! is stored in rodata and c2 is initialized with a pointer to it.
I tried using
#define PASCAL_STRING_CONSTANT(c_string_constant) \
&((const PascalString) { \
.Length=sizeof(c_string_constant)-1, \
.Data=(c_string_constant), \
})
as suggested in these questions, but it doesn't work because Data is a flexible array: I get the error error: non-static initialization of a flexible array member (with gcc, clang gives a similar error).
Is this possible in C? And if so, what would the PASCAL_STRING_CONSTANT macro look like?
To clarify
With a C string, the following code-block never stores the string on the stack:
#include <inttypes.h>
#include <stdio.h>
int main(void) {
const char *c = "Hello world!";
printf("test %s", c);
return 0;
}
As we can see by looking at the assembly, line 5 compiles to just loading a pointer into a register.
I want to be able to get that same behavior with pascal strings, and using GNU extensions it is possible to. The following code also never stores the pascal-string on the stack:
#include <inttypes.h>
#include <stdio.h>
typedef struct {
size_t Length;
char Data[];
} PascalString;
#define PASCAL_STRING_CONSTANT(c_string_constant) ({\
static const PascalString _tmpstr = { \
.Length=sizeof(c_string_constant)-1, \
.Data=c_string_constant, \
}; \
&_tmpstr; \
})
int main(void) {
const PascalString *c2 = PASCAL_STRING_CONSTANT("Hello world!");
printf("test %.*s", (int)c2->Length, c2->Data); /* the precision argument of %.*s must be an int */
return 0;
}
Looking at its generated assembly, line 18 is also just loading a pointer.
However, the best code I've found to do this in ANSI C produces code to copy the entire string onto the stack:
#include <inttypes.h>
#include <stdio.h>
typedef struct {
size_t Length;
char Data[];
} PascalString;
#define PASCAL_STRING_CONSTANT(initial_value) \
(const PascalString *)&(const struct { \
size_t Length; /* must match the type of PascalString.Length */ \
char Data[sizeof(initial_value)]; \
}){ \
.Length = sizeof(initial_value)-1, \
.Data = initial_value, \
}
int main(void) {
const PascalString *c2 = PASCAL_STRING_CONSTANT("Hello world!");
printf("test %.*s", (int)c2->Length, c2->Data); /* the precision argument of %.*s must be an int */
return 0;
}
In the generated assembly for this code, line 19 copies the entire struct onto the stack then produces a pointer to it.
I'm looking for either ANSI C code that produces the same assembly as my second example, or an explanation of why that's not possible with ANSI C.
You can use this macro, which takes the name of the variable followed by its contents:
#define PASCAL_STRING(name, str) \
struct { \
unsigned char len; \
char content[sizeof(str) - 1]; \
} name = { sizeof(str) - 1, str }
Use it like this to create such a string:
const PASCAL_STRING(c2, "Hello world!");
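A variant of the same idea, as a sketch only: a named object with static storage duration is never copied onto the stack, and no extensions are needed. The macro name PASCAL_STRING_DEFINE and the variable greeting are hypothetical, and the struct here keeps the terminator and uses a size_t length, unlike the macro above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: defines a named, static, read-only Pascal-style string.
   The array is sized from the literal, so no flexible array member is needed. */
#define PASCAL_STRING_DEFINE(name, str) \
    static const struct { \
        size_t Length; \
        char Data[sizeof(str)]; \
    } name = { sizeof(str) - 1, str }

PASCAL_STRING_DEFINE(greeting, "Hello world!");

/* The length excludes the terminator, the storage includes it. */
static_assert(sizeof(greeting.Data) == 13, "12 characters plus the terminator");
```

The cost is that each string needs its own name and its own anonymous struct type, so it cannot be passed around as a PascalString * without a cast.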
This can be done with the statement-expressions GNU extension, although it is nonstandard.
#define PASCAL_STRING_CONSTANT(c_string_constant) ({\
static const PascalString _tmpstr = { \
.Length=sizeof(c_string_constant)-1, \
.Data=c_string_constant, \
}; \
&_tmpstr; \
})
The extension allows you to have multiple statements in a block as an expression which evaluates to the value of the last statement by enclosing the block in ({ ... }). Thus, we can declare our PascalString as a static const value, and then return a pointer to it.
For completeness, we can also make a stack buffer if we want to modify it:
#define PASCAL_STRING_STACKBUF(initial_value, capacity) \
(PascalString *)&(struct { \
size_t Length; /* must match the type of PascalString.Length */ \
char Data[capacity]; \
}){ \
.Length = sizeof(initial_value)-1, \
.Data = initial_value, \
}
I am not sure why you would want to do it, but you could do it this way.
This method will store your string in the data segment and gives you a way to access it as a structure. Note that I create a packed structure to ensure that the mapping into the structure always works since I have essentially hard coded the data fields in the const expression below.
#include <stdio.h>
#pragma pack(1)
typedef struct {
unsigned char Length;
char Data[];
} PascalString;
#pragma pack()
const unsigned char HELLO[7] = {
0x05, /* length of "HELLO", not counting the terminator */
'H','E','L','L','O','\0'
};
int main(void) {
const PascalString * myString = (const PascalString *)HELLO;
printf("I say: %s \n", myString->Data);
}

How to declare the data type for variable arguments?

I'm trying to assign data type to world but unable to figure it out.
#include <stdarg.h>
#include <stdio.h>
#define TRACE(arg) TraceDebug arg ;\
void TraceDebug(const char* format, ...);
void TraceDebug(const char* format, ...)
{
char buffer[256];
va_list args;
va_start(args, format);
vprintf(format, args);
va_end(args);
}
int main(void)
{
int a =55;
TRACE((Hello,a));
return 0;
}
Below is the error statement in detail.
main.c: In function 'main':
main.c:28:12: error: 'Hello' undeclared (first use in this function)
TRACE((Hello,a));
^
main.c:13:32: note: in definition of macro 'TRACE'
#define TRACE(arg) TraceDebug arg ;\
^
main.c:28:12: note: each undeclared identifier is reported only once for each function it appears in
TRACE((Hello,a));
^
main.c:13:32: note: in definition of macro 'TRACE'
#define TRACE(arg) TraceDebug arg ;\
^
Is there any way to declare Hello as a variable? After declaring it, I need to get the address of the variable.
Put simply, I want to change the code below to take a variable number of arguments, for example turning #define QU(arg1,arg2) into #define QU(arg1,...). Since variadic macros are not supported, I am using variadic functions.
#define TRACE(arg1) QU arg1
#define QU(arg1,arg2) {static const char arg1; \
printf("%p\n",(void*)&arg1);\
printf("%d\n",arg2);}
int main(void)
{
int aaa =333;
int bbb =444;
TRACE((Hello,aaa));
TRACE((Hello2,bbb));
return 0;
}
1) (title) How to declare the data type for variable arguments?
2) (1st question) I'm trying to assign data type to world but unable to figure it out.
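1) The variadic arguments (represented by the ellipsis: ... ) have no declared type. Each one undergoes the default argument promotions, and the function itself must know, by convention, which type to request from va_arg for each argument. For this prototype:
int variadicFunc(int a, const char *b, ...);
only a and b have declared types; the code further down simply treats every variadic argument as a const char *.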
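As a minimal sketch of how a variadic function reads its arguments (the function sum_ints is a made-up example, not part of the question's code), note that the type is named in each va_arg call rather than in the prototype:

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical example: adds up `count` int arguments that follow the count. */
static int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    assert(count >= 0);             /* the count is the caller's promise */
    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);   /* the callee names the type to read */
    va_end(ap);
    return total;
}
```

A call such as sum_ints(3, 1, 2, 3) works because the caller and the function agree that every extra argument is an int; passing anything else would be undefined behavior.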
2) From the content of your question alone, the answer could be to use a typedef statement:
typedef char World; // a new type 'World' is created
But there are clarifications in the comments:
if i change the string to variable i can reduce the memory size,... (you)
You want to have a variable argument list to pass variables existing in your program that you want to place on a Trace list for debugging
purposes. (is that close?)... (me)
(is that close?) yes, that's the thing am trying to do... Are you always going to pass the same type to this function? Ahh, type will
be like TRACE(("Hello", a,"world")); (you)
It appears you want to enter a variable number of either string literals, or string variables as function arguments, then for those items to be placed into variables, then the addresses of those variables to be stored in a file, for the purpose of saving space.
The following code illustrates how you can pass a variable number of strings (in different forms) into a function, and have the address and content retained in a struct. You should be able to adapt what I have done here into something more useful for your needs. Note, I have reserved the first string argument to be used as a file location to store addresses.
#include <inttypes.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_LEN 200
typedef struct {
uintptr_t addr; // wide enough to hold a pointer value
char str[MAX_LEN];
} DATA;
int variadicFunc(int argCount, const char *str, ...);
int main(void)
{
char a[] = {"this is a string"};
char b[] = {"another string"};
char c[] = {"yet another string"};
// count non-variable v1 v2 v3 v4
variadicFunc(4, ".\\storage.txt", a, b, "var string", c);
// ^count of variable argument list
return 0;
}
int variadicFunc(int argCount, const char *str, ...)
{
va_list arg;
int i;
DATA *d = calloc(argCount, sizeof(*d));
if(!d) return -1;
va_start(arg, str);
FILE *fp = fopen(str, "w");//using first string as filename to populate
if(fp)
{
for(i=0;i<argCount;i++)
{
// retain addresses and content for each string
strncpy(d[i].str, va_arg(arg, const char *), MAX_LEN - 1); // calloc already zeroed the last byte
d[i].addr = (uintptr_t)d[i].str; // address of the stored copy
fprintf(fp, "%" PRIXPTR "\n", d[i].addr);
}
fclose(fp);
}
va_end(arg);
free(d);
return 0;
}

How can I write these char arrays in a nicer way?

Right now I have an array of char arrays, which I'm using to store font data:
const char * const FONT[] = {
"\x48" "a44448", //0
"\x27" "m\x48" "m\x40", //1
"\x06" "a46425" "m\x00" "m\x80", //2
"\x06" "a46425" "a42425", //3
"\x83" "m\x03" "m\x68" "m\x60", //4
"\x88" "m\x08" "m\x04" "m\x44" "a42424" "m\x00", //5
"\x02" "a42428" "a84842", //6
"\x08" "m\x88" "m\x20", //7
"\x44" "A46428" "a42428", //8
"\x86" "a46428" "m\x60", //9
...
Is there a way to write this in a more readable way, but still have it computed at compile time?
For example, something like:
#define start(x,y) //somehow create '\x<x><y>'. start(3,4) -> '\x34'
#define arc(x,y,rx,ry,a) //evaluate to {'a','<x>','<y>','<rx>','<ry>','<a>'}. arc(1,2,3,4,5) -> {'a','1','2','3','4','5'}
const char * const FONT[] = {
start(4,8) arc(4,4, 4,4, 8) "", //somehow concatenate them
...
Also, why can I use string literals but not char array literals:
(This doesn't work)
const char * const FONT[] = {
{'\x48','a','4','4','4','4','8','\0'}, //0
But this works:
const char X[] = {'\x48','a','4','4','4','4','8','\0'};
const char * const FONT[] = {
X,
...
This set of macros should do what you want:
#define str(s) #s
#define start(px,py) str(\x##px##py)
#define arc(x,y,rx,ry,pa) str(a##x##y##rx##ry##pa)
const char * const FONT[] = {
start(4,8) arc(4,4, 4,4, 8),
};
This makes use of the # and ## operators (the stringizing and token-pasting operators, respectively).
And results in the following preprocessor output:
const char * const FONT[] = {
"\x48" "a44448",
};
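To see the two operators in isolation, here is a small sketch; all macro and variable names in it (STR_RAW, STR_EXPANDED, GLUE, WIDTH) are made up for illustration:

```c
#include <assert.h>

#define STR_RAW(x) #x               /* stringizes the argument exactly as written */
#define STR_EXPANDED(x) STR_RAW(x)  /* extra layer: the argument is expanded first */
#define GLUE(a, b) a##b             /* pastes two tokens into one */
#define WIDTH 48

static const char raw[] = STR_RAW(WIDTH);           /* the text "WIDTH" */
static const char expanded[] = STR_EXPANDED(WIDTH); /* the text "48" */
static const int GLUE(val, ue) = 7;                 /* declares a variable named value */

static_assert(sizeof(raw) == 6, "five letters plus the terminator");
static_assert(sizeof(expanded) == 3, "two digits plus the terminator");
```

The difference between raw and expanded is the reason two-level macros such as STR/STR2 appear throughout these answers.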
Is there a way to write this in a more readable way, but still have it
computed at compile time? For example, something like:
#define start(x,y) //somehow create '\x<x><y>'. start(3,4) -> '\x34'
#define arc(x,y,rx,ry,a) //evaluate to {'a','<x>','<y>','<rx>','<ry>','<a>'}. arc(1,2,3,4,5) -> {'a','1','2','3','4','5'}
const char * const FONT[] = {
start(4,8) arc(4,4, 4,4, 8) "", //somehow concatenate them
...
You can implement your start() macro with use of the preprocessor's stringification (#) and token-pasting (##) operators. You need to be a little careful with these, however, to account for the fact that their arguments are not first macro-expanded. Where you do want macro expansion, you can achieve it by interposing an extra layer of macro. For example:
// Stringify the argument (without expansion)
#define STRINGIFY(x) #x
// Expand the argument and stringify the result
#define STRINGIFY_EXPANSION(x) STRINGIFY(x)
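// Assumes that the arguments should not themselves be expanded.
// The literal x is pasted with the two digits (giving x48); the backslash stays a separate token in front of it.
#define MAKE_HEX(hi, lo) \x ## hi ## lo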
#define start(x,y) STRINGIFY_EXPANSION(MAKE_HEX(x,y))
Similarly, you can implement your arc() macro as
// No macro expansion wanted here, neither at this level nor before stringifying
#define arc(x,y,rx,ry,pa) STRINGIFY(a ## x ## y ## rx ## ry ## pa)
(Technically, that creates a string literal token that implies a null terminator, not the unterminated char array you described, but that's what you really want anyway.)
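The difference between a terminated string and an unterminated char array can be sketched as follows (the variable names are hypothetical, and the second initialization is valid in C but not in C++):

```c
#include <assert.h>

/* Sized by the compiler: six characters plus the implied terminator. */
static const char terminated[] = "a44448";

/* Sized explicitly to the character count: C drops the terminator here. */
static const char unterminated[6] = "a44448";

static_assert(sizeof(terminated) == 7, "terminator is stored");
static_assert(sizeof(unterminated) == 6, "terminator is not stored");
```

Only the first form is safe to hand to functions that expect a C string.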
Also, why can I use string literals but not char array literals: (This
doesn't work)
const char * const FONT[] = {
{'\x48','a','4','4','4','4','8','\0'}, //0
But this works:
const char X[] = {'\x48','a','4','4','4','4','8','\0'};
const char * const FONT[] = {
X,
...
Largely because those are not array literals. They are plain initializers. Initializers provide a sequence of values used to initialize an object being declared; when that object is a compound one, such as an array or struct, multiple values presented in its initializer provide initial values for some or all of its members. The members of your array FONT are of type char *, and those pointers are what an initializer provides values for. They furthermore have no deeper structure, so no nested braces are expected.
An array literal might look like this:
(const char[]) {'\x48','a','4','4','4','4','8','\0'}
And, because arrays decay to pointers in initializers, too, just as they do most everywhere else, you indeed can use array literals to initialize your array of pointers:
const char * const FONT[] = {
(const char[]) {'\x48','a','4','4','4','4','8','\0'},
// ...
};
A nicer way would be to write it all in hexadecimal notation even for printable characters. But given the different length that would still be a mess.
How about loading the font from a file at runtime or linking it in as binary blob from a file? There isn't really a good way of making binary data look good in source.
You have almost got the reason. You declare an array of pointers because you have rows of different sizes. So in const char * const FONT[] = {..., FONT is an array of const pointers to arrays of const chars. A string literal is a char array, so it decays to a pointer that is used to initialize FONT. If you first declare a char array and use its name, things go well too, because here again the array decays to a pointer.
But in C {'\x48','a','4','4','4','4','8','\0'} is not by itself an array but only an initialization list that can only be used to initialize a character array. For example:
char arr_ok[] = { '1', '2', '3', '\0' }; // correct initialization of a char[4]
char *ptr_ko = { '1', '2', '3', '\0' }; // wrong initialization of a char* (should not compile)
That means that the initialization list is not an array and cannot decay to a pointer.
Things would be different for a 2D array:
char arr2D[][9] = { { '1', '2', '3' }, { '4', '5', '6' }, { '7', '8', '9'} };
This line initializes the 3 sub-arrays with, respectively, '1','2','3', '4','5','6' and '7','8','9'. But it cannot be used for an array of pointers.
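The contrast between the two layouts can be sketched like this (grid and rows are made-up names): the 2D array stores every row inline at a fixed width, while the pointer array stores only addresses.

```c
#include <assert.h>

/* A true 2D array: every row occupies 4 bytes, short rows are zero-filled. */
static const char grid[][4] = { "abc", "de", "f" };

/* An array of pointers: each element points at a separately stored string. */
static const char *const rows[] = { "abc", "de", "f" };

static_assert(sizeof(grid) == 3 * 4, "three rows of four chars");
static_assert(sizeof(rows) == 3 * sizeof(const char *), "three pointers");
```

For font data with rows of very different lengths, the pointer form avoids padding every row to the longest one.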

Concatenate string literal with char literal

I want to concat a string literal and char literal. Being syntactically incorrect, "abc" 'd' "efg" renders a compiler error:
x.c:4:24: error: expected ',' or ';' before 'd'
For now I have to use snprintf (needlessly), despite the values of the string literal and the char literal being known at compile time.
I tried
#define CONCAT(S,C) ({ \
static const char *_r = { (S), (C) }; \
_r; \
})
but it does not work because the null terminator of S is not stripped. (Besides giving compiler warnings.)
Is there a way to write a macro to use
"abc" MACRO('d') "efg" or
MACRO1(MACRO2("abc", 'd'), "efg") or
MACRO("abc", 'd', "efg") ?
In case someone asks why I want that: The char literal comes from a library and I need to print the string out as a status message.
If you can live with the single quotes being included with it, you could use stringification:
#define SOME_DEF 'x'
#define STR1(z) #z
#define STR(z) STR1(z)
#define JOIN(a,b,c) a STR(b) c
int main(void)
{
const char *msg = JOIN("Something to do with ", SOME_DEF, "...");
puts(msg);
return 0;
}
Depending on the context that may or may not be appropriate, but as far as convincing it to actually be a string literal built this way, it's the only way that comes to mind without formatting at runtime.
Try this. It uses the C macro trick of double macros so the macro argument has the chance to expand before it is stringified.
#include <stdio.h>
#define C d
#define S "This is a string that contains the character "
#define STR(s) #s
#define XSTR(s) STR(s)
const char* str = S XSTR(C);
int main()
{
puts(str);
return 0;
}
I came up with a GCC-specific solution that I don't like too much, as one cannot use CONCAT nestedly.
#include <stdio.h>
#define CONCAT(S1,C,S2) ({ \
static const struct __attribute__((packed)) { \
char s1[sizeof(S1) - 1]; \
char c; \
char s2[sizeof(S2)]; \
} _r = { (S1), (C), (S2) }; \
(const char *) &_r; \
})
int main(void) {
puts(CONCAT ("abc", 'd', "efg"));
return 0;
}
http://ideone.com/lzEAn
C will only let you concatenate string literals. Actually, there's nothing wrong with snprintf(). You could also use strcpy():
size_t n = strlen(str1); /* measure before the terminator is overwritten */
strcpy(dest, str1);
dest[n] = c;
strcpy(dest + n + 1, str2);
You could also use a giant switch statement to overcome this limitation:
switch(c) {
case 'a':
puts("part1" "a" "part2");
break;
case 'b':
puts("part1" "b" "part2");
break;
/* ... */
case 'z':
puts("part1" "z" "part2");
break;
}
...but I refuse to claim any authorship.
To put it short, just stick with snprintf().
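For reference, a minimal sketch of the snprintf() route; the helper name join_with_char and its return convention are assumptions made for this example:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes prefix + c + suffix into dest.
   Returns 0 on success, -1 if the result did not fit or formatting failed. */
static int join_with_char(char *dest, size_t size,
                          const char *prefix, char c, const char *suffix)
{
    int n;

    assert(dest != NULL && size > 0);
    n = snprintf(dest, size, "%s%c%s", prefix, c, suffix);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

With a 16-byte buffer, join_with_char(buf, sizeof buf, "abc", 'd', "efg") leaves "abcdefg" in buf; with a buffer that is too small, the output is truncated but still terminated and the function reports -1.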

Resources