How do you declare string constants in C?

I know it's quite idiomatic, or good style at least, in C to declare numeric constants as enums instead of #defineing them.
/* bad style */
#define MAXLINE 1024
/* good/better style */
enum {
MAX_LINE = 1024
};
Is there an equivalent rule for the definition of string constants?
/* is this good style? */
#define HELLO "Hello World"
/* or is this better? */
const char *HELLO2 = "Howdy";
What do you prefer? If possible show some drawbacks of either method.

There's one more (at least) road to Rome:
static const char HELLO3[] = "Howdy";
(static — optional — is to prevent it from conflicting with other files). I'd prefer this one over const char*, because then you'll be able to use sizeof(HELLO3) and therefore you don't have to postpone till runtime what you can do at compile time.
The define has an advantage of compile-time concatenation, though (think HELLO ", World!") and you can sizeof(HELLO) as well.
But then you can also prefer const char* and use it across multiple files, which would save you a morsel of memory.
In short — it depends.
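To make the sizeof point concrete, here is a minimal sketch comparing the three forms. The helper names hello2_length and hello3_length are made up for illustration, and static is added to HELLO2 only to keep the example in one file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HELLO "Hello World"              /* macro: replaced by the literal itself */
static const char *HELLO2 = "Howdy";     /* pointer to a string literal */
static const char HELLO3[] = "Howdy";    /* array initialized from a literal */

/* sizeof on a literal or an array counts the terminating '\0' and is a
   compile-time constant, so it can be checked statically. */
static_assert(sizeof(HELLO) == 12, "11 characters plus the terminator");
static_assert(sizeof(HELLO3) == 6, "5 characters plus the terminator");

/* Length of the array-style constant, computed without calling strlen. */
size_t hello3_length(void) { return sizeof(HELLO3) - 1; }

/* For the pointer, sizeof only gives the pointer size, so strlen is needed. */
size_t hello2_length(void) { return strlen(HELLO2); }
```

Both helpers return 5, but only the array version is resolved entirely at compile time.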

One advantage (albeit very slight) of defining string constants is that you can concatenate them at compile time:
#define HELLO "hello"
#define WORLD "world"
puts( HELLO WORLD );
Not sure that's really an advantage, but it is a technique that cannot be used with const char *'s.
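A small sketch of that concatenation, assuming you want the result stored in a named constant; GREETING, greeting() and the ", " separator are additions for illustration only.

```c
#include <assert.h>
#include <string.h>

#define HELLO "hello"
#define WORLD "world"

/* Adjacent string literals are merged during translation, so this is one
   12-character literal; the ", " piece is added for illustration. */
static const char GREETING[] = HELLO ", " WORLD;

static_assert(sizeof(GREETING) == 13, "12 characters plus the terminator");

/* Returns the concatenated constant; no runtime strcat is involved. */
const char *greeting(void) { return GREETING; }
```

The same line would not compile if HELLO and WORLD were const char * variables, because only literals are merged this way.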

If you want a "const string" like your question says, I would really go for the version you stated in your question:
/* first version */
const char *HELLO2 = "Howdy";
Particularly, I would avoid:
/* second version */
const char HELLO2[] = "Howdy";
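Reason: the problem with the second version is that, when it is declared inside a function, the compiler copies the entire string "Howdy" into the array on each call. (The array elements are still const, so the string cannot be modified through HELLO2 without a cast.)
On the other hand, the first version is a const string accessed through the pointer HELLO2, and it cannot be modified through that pointer.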

The main disadvantage of the #define method is that the string may be duplicated each time it is used, so you can end up with many copies of it in the executable, making it bigger. (Many compilers merge identical string literals, but this is not guaranteed.)

There are a few differences.
#define HELLO "Hello World"
The definition above is handled by the preprocessor and can only be changed there (with #undef and a new #define).
const char *HELLO2 = "Howdy";
The variable above can be changed from C code. You can't change the individual characters, as in the statement below, because they are const:
HELLO2[0] = 'a'; /* error: assignment to a read-only location */
But what you can do is make it point to a different string, as in the statement below:
HELLO2 = "HELLO WORLD";
It really depends on how you want to be able to change the value: with the preprocessor or with C code.
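As a rough sketch of what the compiler accepts and rejects here: HELLO4, retarget_hello2 and fixed_hello are hypothetical names, and static is used only to keep everything in one file.

```c
#include <assert.h>
#include <string.h>

static const char *HELLO2 = "Howdy";        /* characters are const, the pointer is not */
static const char *const HELLO4 = "Howdy";  /* neither the characters nor the pointer can change */

/* Re-points HELLO2 at another literal and returns the new target.
   Writing HELLO2[0] = 'a' or HELLO4 = "..." would be rejected at compile time. */
const char *retarget_hello2(void)
{
    HELLO2 = "HELLO WORLD";
    assert(strcmp(HELLO2, "HELLO WORLD") == 0);  /* sanity check after the reassignment */
    return HELLO2;
}

/* HELLO4 always refers to the original literal. */
const char *fixed_hello(void) { return HELLO4; }
```

If the constant should never be re-pointed, the second const (after the *) is what enforces that.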

Related

How to pass a string literal to a function with explicit section storage

Please forgive if this has been asked before, but I couldn't find a question like it.
The problem is this: for a certain microcontroller I need selected string literals to go into a section other than the default .rodata section. The "other" section will be placed in flash, which can only be read 4 bytes at a time, so it can't be used freely and the function needs to be aware of that. The .rodata section gets copied into RAM, which is useful because RAM can be read without alignment restrictions, but RAM is very limited in size.
The construction I now use is like this:
#define roflash __attribute__((section(".flash.rodata"))) __attribute__((aligned(sizeof(char*))))
static roflash const char literal[] = "text";
(+ modifications in the loader script of course).
This works as intended. But it means for every string handling function I'm calling something like this:
static roflash const char literal[] = "text";
do_something(literal);
The holy grail would be something that can combine both into one "black box" construction, so I can write do_something_roflash("text");
I guess it would be something with a #define and a code block, so the same variable name could be used over and over again. But then I get stuck, because some of the functions have a variable number of arguments, so something like this won't work:
#define function_roflash(s) { \
static roflash const char str[] = s; \
function_roflash_implementation(str); \
}
In fact I guess I'd need a variable-argument #define, does that exist?
Other solutions also very welcome.
Thx.
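One idea for a somewhat generic GCC variadic macro is the following. It is named call_roflash here so that it does not clash with the roflash attribute macro from the question, and it uses ##__VA_ARGS__, the GNU extension that drops the preceding comma when no extra arguments are given:
#define call_roflash(func, str, ...) do { \
static roflash const char s[] = str; \
func(s, ##__VA_ARGS__); \
} while (0)
With just one macro you could support many functions that accept one string constant as the first parameter:
call_roflash(printf, "%d", 42);
call_roflash(do_something, "text");
call_roflash(obj.write, "text");
call_roflash(obj->write, "text");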
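Here is a host-side sketch of the same idea that can be compiled with GCC or Clang. It makes several assumptions: the section attribute is left out so it links without a custom linker script, the wrapper is called call_roflash to keep it distinct from the roflash attribute macro, and do_something, do_format and last_message are hypothetical stand-ins for the real flash-aware functions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>   /* strcmp, for callers that inspect last_message */

/* Assumption: on the real target this would also carry
   __attribute__((section(".flash.rodata"))). */
#define roflash __attribute__((aligned(sizeof(char *))))

/* Places the literal in a static array, then forwards it plus any extra
   arguments. ##__VA_ARGS__ is a GNU extension that removes the preceding
   comma when no extra arguments are passed. */
#define call_roflash(func, str, ...) do {          \
        static roflash const char s_[] = str;      \
        func(s_, ##__VA_ARGS__);                   \
    } while (0)

static char last_message[32];   /* hypothetical sink so the result can be inspected */

/* Hypothetical stand-in for a flash-aware string function. */
void do_something(const char *text)
{
    assert(text != NULL);
    snprintf(last_message, sizeof last_message, "%s", text);
}

/* Hypothetical stand-in that takes a format string plus one argument. */
void do_format(const char *fmt, int value)
{
    snprintf(last_message, sizeof last_message, fmt, value);
}
```

With these definitions, call_roflash(do_something, "text"); and call_roflash(do_format, "%d", 42); both expand to a block holding its own static array. The do { ... } while (0) wrapper keeps the macro safe inside an unbraced if/else.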

C Aligning string literals for a specific use case

I'm trying to align string literals in a specific way, as how I'm using it in my code is fairly specific. I don't want to have to assign it to a variable, for instance many of my functions are using it as a direct argument. And I want it to work both in local scope or global scope.
Usage example:
char *str = ALIGNED_STRING("blah"); //what I want
foo(ALIGNED_STRING("blah")); //what I want
_Alignas(16) char str[] = "blah"; //not what I want (but would correctly align the string)
The ideal solution would be (_Alignas(16) char[]){ "blah" } or, as a worse case, the GCC/Clang extension for alignment, (__attribute__((aligned(16))) char[]){ "blah" }, but neither works (they're ignored and the default alignment for the type is used).
So my next thought was to align it myself, and then my functions that use the string could then fix it up correctly. e.g. #define ALIGNED_STRING(str) (char*)(((uintptr_t)(char[]){ "xxxxxxxxxxxxxxx" str } + 16 - 1) & ~(16 - 1)) (where the string containing 'x' would represent data needed to understand where the real string can be found, that's easy but just for the example assume the 'x' is fine). Now that works fine in local scope, but fails in the global scope. Since the compiler complains about it not being a compile-time constant (error: initializer element is not a compile-time constant); I would've thought it would work but it seems only addition and subtraction are valid operations on the pointer at compile-time.
So I'm wondering if there's any way to achieve what I want? At the moment I'm using the latter approach (padding and manually aligning) and avoiding it in the global scope (though I would really like to use it there). The best solution would avoid runtime adjustments (as the alignment specifier would), but that doesn't seem possible unless I apply it to a variable (which, as mentioned, is not what I want to do).
I was able to get close to the OP's need with a compound literal (a C99 feature; _Alignas additionally requires C11).
#include <stdio.h>
#include <stddef.h>
void bar(const char *s) {
printf("%p %s\n", (void*)s, s);
}
// v-- compound literal --------------------------v
#define ALIGNED_STRING(S) (struct { _Alignas(16) char s[sizeof S]; }){ S }.s
int main(void) {
char s[] = "12";
bar(s);
char t[] = "34";
bar(t);
bar(ALIGNED_STRING("asdfas"));
char *u = ALIGNED_STRING("agsdas");
bar(u);
}
Output
0x28cc2d 12
0x28cc2a 34
0x28cc30 asdfas // 16 Aligned
0x28cc20 agsdas // 16 Aligned
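Rather than reading the addresses by eye, the alignment can be checked in code. This is only a sketch: is_aligned16 and aligned_string_demo are hypothetical helpers, and the macro is the same compound-literal macro as in the listing above.

```c
#include <assert.h>
#include <stdint.h>

/* The anonymous struct carries the alignment, and .s yields the char array
   inside it (requires C11 for _Alignas). */
#define ALIGNED_STRING(S) (struct { _Alignas(16) char s[sizeof S]; }){ S }.s

/* Returns 1 when p is a multiple of 16. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p % 16) == 0;
}

/* At block scope the compound literal has automatic storage, so the pointer
   is only valid until the enclosing block ends. */
int aligned_string_demo(void)
{
    const char *p = ALIGNED_STRING("asdfas");
    assert(p[0] == 'a');
    return is_aligned16(p);
}
```

Keep the lifetime in mind: a pointer obtained inside a function must not be returned or stored beyond the block in which the macro was used.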

How to reuse a literal in a char and a one-character string constant?

I need to specify a short option argument (e.g. -F) both as a char and as a char[] constant in C code. To maximize code reuse I want to declare it once so that I can change the value in one place (a "literal", not strictly speaking a string or char literal, but in the sense of the abstract concept). I would prefer a solution that works exclusively with preprocessor constants and macros, or exclusively in C code, over a good explanation of why this has to be solved with a mixture of both.
I tried/checked out:
to #define FOREGROUND_OPTION_VALUE 'F', which makes it hard to turn into a char[] (as a preprocessor constant): a macro that stringifies with # stringifies the ' quotes as well
to omit the ' quotes, which leaves me with the problem of adding the ' quotes back or creating a char some other way
@PedroWitzel's answer to declare a char[] and use its 0th char for the other constant. That's fine, but I'd prefer a way to create the char[] from the char because that forces both to be equal (otherwise I'd have to add a compile-time assertion that the char[] isn't longer than 1).
The only thing that matters to me is code maintenance, nothing else (such as the cost of processing the code during compilation or at runtime; I haven't thought hard about whether there is any and don't care).
Over and above the discussion in comments to Pedro Witzel's answer, there's another option:
#define FOREGROUND_OPTION_VALUE 'F'
static const char fg_opt_str[] = { FOREGROUND_OPTION_VALUE, '\0' };
It's not a commonly used way of initializing a string, but it is a valid one and seems appropriate for your scenario. Now you can use FOREGROUND_OPTION_VALUE where you need a constant char (or int) value, and fg_opt_str where you need a one-character string. If you change the value defined (to f, say), then you only have to change one place for the code to continue to work, assuming you weren't using f before, which meets your maintainability requirement.
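A minimal sketch of how the two constants might be used together; is_foreground_option is a hypothetical helper, not part of the question.

```c
#include <assert.h>
#include <string.h>

#define FOREGROUND_OPTION_VALUE 'F'

/* One-character string built from the same constant. */
static const char fg_opt_str[] = { FOREGROUND_OPTION_VALUE, '\0' };

/* The array can never hold more than one character plus the terminator. */
static_assert(sizeof(fg_opt_str) == 2, "one character plus the terminator");

/* Returns 1 if arg is exactly "-" followed by the option character. */
int is_foreground_option(const char *arg)
{
    return arg[0] == '-' && strcmp(arg + 1, fg_opt_str) == 0;
}
```

Changing the #define to another character updates both the char and the string form, and the static assertion documents the length guarantee the question asked about.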
Would a static constant variable work for you?
static const char FOREGROUND_OPTION_VALUE[] = "F";

Define string array

I would like to define an array of string like that:
#define sup (const char**) ("string1", "string2")
but it fails when I try to print the first string:
printf("The string: %s\n",sup[0]);
how to do it in the proper way?
I would advise against doing this with macros altogether, but if you are really interested in what is going on with the code, more than in how this should actually be tackled, here is an explanation.
There is a simple issue in the code, and a more obscure one. The simple one is that to initialize an array you don't use parentheses but curly braces:
#define sup (const char**){"str1", "str2"} // still wrong!!
The less simple issue is that arrays are not pointers. The curly brace initializer can be used to initialize an array of two const char*, but that is not the same as a const char**. If you change the code to:
#define sup (const char*[2]){"str1", "str2" }
It should work.
What is going on under the hood with the previous version? Well, the compiler is seeing the declaration of a pointer (well, casting to a pointer) and the initializer. It is assuming that you want to initialize the pointer with the first element (incompatible pointer, but the cast is explicit... you must know what you want if you forced the cast), and then ignore the remainder. Basically the compiler translates your code to [*]:
#define sup (const char**)"str1"
And that will cause havoc at runtime. It is interesting to note that if you had used a proper variable and then initialized the pointer with it, it would have worked, because while arrays are not pointers (I insist, keep that in mind) arrays do decay into pointers:
const char* tmp[] = { "hi", "there" };
const char** sup = tmp; // fine, tmp decays into &tmp[0]
[*] There's a bit of handwaving there... the compiler translates the code, once inserted at the place of use of the macro by the preprocessor, but the translation is equivalent to what I wrote if you were to edit the macro manually.
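For completeness, a small sketch of the array-typed compound literal in use. The extra outer parentheses in the macro and the sup_at helper are additions for illustration.

```c
#include <assert.h>
#include <string.h>

/* Compound literal of type "array of 2 pointers to const char". */
#define sup ((const char *[2]){ "str1", "str2" })

/* Indexing works because the array decays to a pointer to its first element. */
const char *sup_at(int i)
{
    assert(i >= 0 && i < 2);
    const char **p = sup;   /* decay: p == &sup[0] */
    return p[i];            /* the pointed-to string literal has static storage */
}
```

Each use of sup inside a function creates a fresh array with automatic storage, which is one more reason to prefer a real variable.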
I think that doing this kind of preprocessor tricks, especially with arrays, isn't such a good idea. You should instead have a real global string table, like this:
const char * const sup[]={"String 1", "String 2", "String 3"};
in one of the .c files, and put its extern declaration in a header to be included wherever such strings are needed:
extern const char * const sup[];
(the first const is to avoid modifications to each string literal, which is UB; the second is to avoid replacing the pointers stored in sup; if you want to allow this last action, remove the second const)
An alternative approach would be to define sup directly in the header as a static global variable (i.e. with internal linkage); I've seen this done before with integer constants to make sure they are immediately known to the compiler in every translation unit (so it can put them as immediate values in the generated assembly), but I don't think that with string pointers it can give any significant performance boost.
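A single-file sketch of such a table, written as const char *const so that both the characters and the stored pointers are read-only. SUP_COUNT and sup_get are hypothetical names added for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* In a header this would be declared as:  extern const char *const sup[];  */
const char *const sup[] = { "String 1", "String 2", "String 3" };

/* Element count, computed where the definition (and so the size) is visible. */
enum { SUP_COUNT = sizeof sup / sizeof sup[0] };

static_assert(SUP_COUNT == 3, "three entries in the table");

/* Bounds-checked lookup; returns NULL for an out-of-range index. */
const char *sup_get(size_t i)
{
    return i < (size_t)SUP_COUNT ? sup[i] : NULL;
}
```

Other translation units only see the extern declaration, which has no size, so exporting a count or an accessor alongside the table is usually necessary.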
I have a header that is common to all my projects.
#define MAX_STUDENTS 3
const char *STUDENT[] = { "Manny", "Joe", "Jack" };
The code looks like:
for( int i=0; i<MAX_STUDENTS; i++ )
{ puts(STUDENT[i]); /* or do something else with STUDENT[i] */ }
Claude

Defining const pointer to a const string

I read Ulrich Drepper's blog and came across two entries that look conflicting.
In the first one (string in global space) Ulrich states that the string should be defined as:
const char _pcre_ucp_names[] = "blabla";
while in the second one (string in a function) he argues it should be declared as:
static const char _pcre_ucp_names[] = "blabla";
Can you explain which is the better way to declare a string?
UPD:
First of all, I removed the C++ tag: this question is valid for C as well as for C++. So I don't think answers that explain what static means in class/function/file scope are relevant.
Please read the articles before answering. The articles deal with memory usage: where the actual data is stored (in the .rodata or the .data section), whether the string has to be relocated (if we're talking about Unix/Linux shared objects), and whether it is possible to change the string or not.
UPD2:
In the first one it's said that for a global variable the following form:
(1) const char *a = "...";
is less good than
(2) const char a[] = "..."
Why? I always thought that (1) is better, since (2) actually replicates the string we assign to it, while (1) only points to the string we assign.
It depends—if you need the string to be visible to other source files in a project, you can't declare it static. If you only need to access it from the file where it's defined, then you probably want to use static.
The blog post you mention was talking about something different, though:
#include <stdio.h>
#include <string.h>
int main(void)
{
const char s[] = "hello"; /* Notice this variable is inside a function */
strcpy (s, "bye");
puts (s);
return 0;
}
In that case, static means something different: this creates a variable that persists across multiple calls to the same function. His other example showed a global variable, outside of a function.
EDIT:
To clarify, since you edited your question, the reason you don't want to use const char *a = "string" is you create an extra writable pointer. This means that, while you can't change the characters of the string, you can still make the pointer point to an entirely different string. See below:
const char *hello = "hello";
int main( int argc , char const *argv[] )
{
hello = "goodbye";
puts(hello);
return 0;
}
That example compiles and runs. If hello is supposed to be constant, this is surely not what you want. Of course, you can also get around this by writing this:
const char * const hello = "hello";
You still have two variables where you only needed one though -- hello is a pointer to a string constant, where if it's an array there isn't that extra pointer in the way.
Declaring it static means (if at global, file level) that it won't be visible outside this translation unit, or (if inside a scope) that it will retain its value between executions of the scope. It has nothing to do with the "constness" of the data.
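The two meanings of static can be sketched side by side. The names greeting, get_greeting and next_call_number are invented for this example.

```c
#include <assert.h>

/* File scope: internal linkage, so other translation units cannot see it. */
static const char greeting[] = "hello";

/* Block scope: the counter is initialized once and keeps its value
   between calls. */
int next_call_number(void)
{
    static int calls = 0;
    return ++calls;
}

/* const is independent of static: the array is read-only either way. */
const char *get_greeting(void)
{
    assert(greeting[0] == 'h');
    return greeting;
}
```

The first two calls to next_call_number return 1 and 2, while greeting stays the same read-only array regardless of how it is linked.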
While this is indeed a const string, it's neither a pointer nor a const pointer nor is the second one a declaration.
Both define (and initialize) a constant array of characters.
The only difference is that the first one will be visible and accessible from other translation units (proper declarations assumed), while the second one won't.
Note that, in C++, instead of making variables and constants static, you could put them into an unnamed namespace. Then, too, they are inaccessible from other translation units.
On the
const char *abc = "...";
and
const char def[] = "...";
part of the question...
The only difference to my knowledge is that the array-style definition is not demoted to a pointer when using the sizeof operator.
sizeof(abc) == size of pointer type
sizeof(def) == size of string (including \0)
Is it for use at global (file) level, within a class, or within a function? The meaning of static differs.
For file level: it depends on the scope you want (either global or limited to the file). No other difference.
For a class (C++): it's best to make it static if you're not going to change it. A const member can still be set in the constructor, so space for a pointer has to be allocated inside each object if it's not static. If it is static, there is no need for a pointer in each object.
For a function: it doesn't really change anything important, I think. In the non-static case, a pointer is allocated on the stack and initialized to point into .rodata at each function call. In the static case, it's more like a global variable with limited scope.

Resources