Mode settings in C: enum vs constant vs defines

I have read a few questions on the topic:
Should I use #define, enum or const?
How does Enum allocate Memory on C?
What makes a better constant in C, a macro or an enum?
What is the size of an enum in C?
"static const" vs "#define" vs "enum"
static const vs. #define in c++ - differences in executable size
and I understand that enums are usually preferred over #define macros for better encapsulation and/or readability. They also allow the compiler to check types, preventing some errors.
const declarations are somewhere in between, allowing type checking and encapsulation, but are messier.
Now I work on embedded applications with very limited memory, where we often have to fight to save bytes. My first idea was that constants take more memory than enums, but I realised that I am not sure how constants appear in the final firmware.
Example:
enum { standby, starting, active, stopping } state;
Question
In a resource-limited environment, how do enum vs #define vs static const compare in terms of execution speed and memory footprint?

To try to get some substantial elements to the answer, I made a simple test.
Code
I wrote a simple C program main.c:
#include <stdio.h>
#include "constants.h"
// Define states
#define STATE_STANDBY 0
#define STATE_START 1
#define STATE_RUN 2
#define STATE_STOP 3
// Common code
void wait(unsigned int n)
{
    unsigned long int vLoop;
    for ( vLoop = 0; vLoop < n * LOOP_SIZE; ++vLoop )
    {
        if ( (vLoop % LOOP_SIZE) == 0 ) printf(".");
    }
    printf("\n");
}
int main ( int argc, char *argv[] )
{
    int state = 0;
    int loop_state;
    for ( loop_state = 0; loop_state < MACHINE_LOOP; ++loop_state )
    {
        if ( state == STATE_STANDBY )
        {
            printf("STANDBY ");
            wait(10);
            state = STATE_START;
        }
        else if ( state == STATE_START )
        {
            printf("START ");
            wait(20);
            state = STATE_RUN;
        }
        else if ( state == STATE_RUN )
        {
            printf("RUN ");
            wait(30);
            state = STATE_STOP;
        }
        else // ( state == STATE_STOP )
        {
            printf("STOP ");
            wait(20);
            state = STATE_STANDBY;
        }
    }
    return 0;
}
while constants.h contains
#define LOOP_SIZE 10000000
#define MACHINE_LOOP 100
And I considered three variants to define the state constants. The macro as above, the enum:
enum {
    STATE_STANDBY = 0,
    STATE_START,
    STATE_RUN,
    STATE_STOP
} possible_states;
and the const:
static const int STATE_STANDBY = 0;
static const int STATE_START = 1;
static const int STATE_RUN = 2;
static const int STATE_STOP = 3;
while the rest of the code was kept identical.
Tests and Results
Tests were made on a 64-bit Linux machine and compiled with gcc.
Global Size
gcc main.c -o main gives
macro: 7310 bytes
enum: 7349 bytes
const: 7501 bytes
gcc -O2 main.c -o main gives
macro: 7262 bytes
enum: 7301 bytes
const: 7262 bytes
gcc -Os main.c -o main gives
macro: 7198 bytes
enum: 7237 bytes
const: 7198 bytes
When optimization is turned on, the const and the macro variants come to the same size, while the enum is always slightly larger. Using gcc -S I can see that the difference is a "possible_states,4,4" entry in .comm. So the enum is always larger than the macro, and the const can be larger but can also be optimized away.
Section size
I checked a few sections of the programs using objdump -h main: .text, .data, .rodata, .bss, .dynamic. In all cases, .bss has 8 bytes, .data 16 bytes, and .dynamic 480 bytes.
.rodata has 31 bytes, except for the non-optimized const version (47 bytes).
.text goes from 620 bytes up to 780 bytes, depending on the optimisation level; at any given flag, the unoptimised const version is the only one that differs.
Execution speed
I ran the program a few times but did not notice a substantial difference between the versions. Without optimisation it ran for about 50 seconds, down to 20 seconds with -O2, and up to more than 3 minutes with -Os. I measured the time with /usr/bin/time.
RAM usage
Using time -f %M, I get about 450k in each case, and when using valgrind --tool=massif --pages-as-heap=yes I get 6242304 in all cases.
Conclusion
Whenever some optimisation is activated, the only notable difference is about 40 bytes more for the enum case, with no RAM or speed difference.
What remains are the other arguments about scope, readability... and personal preference.

and I understand that enums are usually preferred over #define macros for better encapsulation and/or readability
Enums are preferred mainly for better readability, but also because they can be declared at local scope and they add a tiny bit more of type safety (particularly when static analysis tools are used).
const declarations are somewhere in between, allowing type checking and encapsulation, but are messier.
Not really, it depends on scope. "Global" const can be messy, but they aren't as bad practice as global read/write variables and can be justified in some cases. One major advantage of const over the other forms is that such variables tend to be allocated in .rodata and you can view them with a debugger, something that isn't always possible with macros and enums (depends on how good the debugger is).
Note that #define constants are always global, and enums may or may not be.
My first ideas would be that constants take more memory than enums
This is incorrect. enum variables are usually of type int, though they can be of smaller types (since their size can vary, they are bad for portability). Enumeration constants, however (that is, the things inside the enum declaration), are always int, which is a type of at least 16 bits.
A const, on the other hand, is exactly as large as its declared type. Therefore const is preferred over enum if you need to save memory.
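For example (a minimal sketch with hypothetical names): if a value always fits in one byte, a const can be given an exact-width one-byte type, while an enumeration constant stays int:
#include <stdint.h>

enum { THRESHOLD_E = 100 };             /* constant has type int (at least 16 bits) */
static const uint8_t THRESHOLD_C = 100; /* occupies exactly 1 byte if stored */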
In a resource-limited environment, how do enum vs #define vs static const compare in terms of execution speed and memory footprint?
Execution speed will probably not differ - it is impossible to say, since it is so system-specific. However, since enums tend to give 16-bit or larger values, they are a bad idea when you need to save memory. They are also a bad idea if you need an exact memory layout, as is often the case in embedded systems. The compiler may of course optimize them to a smaller size.
Misc advice:
Always use the stdint.h types in embedded systems, particularly when you need exact memory layout.
Enums are fine unless you need them as part of some memory layout, like a data protocol. Don't use enums for such cases.
const is ideal when you need something to be stored in flash. Such variables get their own address and are easier to debug.
In embedded systems, it usually doesn't make much sense to optimize code in order to reduce flash size, but rather to optimize to reduce RAM size.
#define will always end up in .text flash memory, while enum and const may end up either in RAM, .rodata flash or .text flash.
When optimizing for size (RAM or flash) on an embedded system, keep track of your variables in the map file (linker output file) and look for things that stand out there, rather than running around and manually optimizing random things at a whim. This way you can also detect if some variables that should be const have ended up in RAM by mistake (a bug).
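To illustrate the first and last points above, a minimal sketch using an exact-width type (the names are illustrative):
#include <stdint.h>

/* Exact-width type; as a static const it can live in .rodata (flash): */
static const uint8_t max_retries = 3;
Linking with, e.g., gcc -Os -Wl,-Map=main.map main.c -o main (the -Wl,-Map= spelling is for GNU ld; other toolchains differ) produces main.map, which can then be searched for variables that unexpectedly ended up in RAM sections such as .data or .bss.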

enums, #defines and static const will generally give exactly the same code (and therefore the same speed and size), with certain assumptions.
When you declare an enum type and enumeration constants, these are just names for integer constants. They don't take any space or time. The same applies to #define'd values (though these are not limited to ints).
A "static const" declaration may take space, usually in a read-only section in flash. Typically this will only be the case when optimisation is not enabled, but it will also happen if you forget to use "static" and write a plain "const", or if you take the address of the static const object.
For code like the sample given, the results will be identical with all versions, as long as at least basic optimisation is enabled (so that the static const objects are optimised). However, there is a bug in the sample. This code:
enum {
STATE_STANDBY = 0,
STATE_START,
STATE_RUN,
STATE_STOP
} possible_states;
not only creates the enumeration constants (taking no space) and an anonymous enum type, it also creates an object of that type called "possible_states". This has global linkage, and has to be created by the compiler because other modules can refer to it - it is put in the ".comm" common section. What should have been written is one of these:
// Define just the enumeration constants
enum { STATE_STANDBY, ... };
// Define the enumeration constants and the
// type "enum possible_states"
enum possible_states { STATE_STANDBY, ... };
// Define the enumeration constants and the
// type "possible_states"
typedef enum { STATE_STANDBY, ... } possible_states;
All of these will give optimal code generation.
When comparing the sizes generated by the compiler here, be careful not to include the debug information! The examples given show a bigger object file for the enumeration version partly because of the error above, but mainly because it is a new type and leads to more debug information.
All three methods work for constants, but they all have their peculiarities. A "static const" can be any type, but you can't use it for things like case labels or array sizes, or for initialising other objects. An enum constant is limited to type "int". A #define macro contains no type information.
For this particular case, however, an enumerated type has a few big advantages. It collects the states together in one definition, which is clearer, and it lets you make them a type (once you get the syntax correct :-) ). When you use a debugger, you should see the actual enum constants, not just a number, for variables of the enum type. ("state" should be declared of this type.) And you can get better static error checking from your tools. Rather than using a series of "if" / "else if" statements, use a switch and use the "-Wswitch" warning in gcc (or equivalent for other compilers) to warn you if you have forgotten a case.
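As an illustration (a sketch reusing the names from the test program above, not code from the original post), the if/else chain could become:
typedef enum { STATE_STANDBY, STATE_START, STATE_RUN, STATE_STOP } state_t;

state_t next_state(state_t s)
{
    switch (s) {            /* gcc -Wswitch warns if a case is missing */
    case STATE_STANDBY: printf("STANDBY "); wait(10); return STATE_START;
    case STATE_START:   printf("START ");   wait(20); return STATE_RUN;
    case STATE_RUN:     printf("RUN ");     wait(30); return STATE_STOP;
    case STATE_STOP:    printf("STOP ");    wait(20); return STATE_STANDBY;
    }
    return STATE_STANDBY;   /* not reached; keeps -Wreturn-type quiet */
}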

I used an OpenGL library which defined most constants in an enum, while the default OpenGL header defines them as #defines. So as a user of the header/library there is no big difference; it is just an aspect of design. When using plain C, there is no need for static const unless it is something that changes or a big string, like an (extern) static const char version[];. For myself I avoid using macros, so in special cases the identifier can be reused. When porting C code to C++, the enums are even scoped and type-checked.
Aspect: character usage & readability:
#define MY_CONST1 10
#define MY_CONST2 20
#define MY_CONST3 30
//...
#define MY_CONSTN N0
vs.
enum MyConsts {
    MY_CONST1 = 10,
    MY_CONST2 = 20,
    MY_CONST3 = 30,
    //...
    MY_CONSTN = N0 // this already gives an error, which does not happen with (unused) #defines, because N0 is an identifier which is probably not defined
};

Redefining a define

I'm reviewing some code and I stumbled across this:
In a header file we have this MAGIC_ADDRESS defined
#define ANOTHER_ADDRESS ((uint8_t*)0x40024000)
#define MAGIC_ADDRESS (ANOTHER_ADDRESS + 4u)
And then peppered throughout the code in various files we have things like this:
*(uint32_t*)MAGIC_ADDRESS = 0;
and
*(uint32_t*)MAGIC_ADDRESS = SOME_OTHER_DEFINE;
This compiles, apparently works, and throws no linter errors. MAGIC_ADDRESS = 0; without the cast does not compile, as I would expect.
So my questions are:
Why in the world would we ever want to do this rather than just making a uint32_t in the first place?
How does this actually work? I thought preprocessor defines were untouchable, how are we managing to cast one?
Why in the world would we ever want to do this rather than just making a uint32_t in the first place?
That's a fair question. One possibility is that ANOTHER_ADDRESS is used as a base address for more than one kind of data, but the code fragments presented do not show any reason why ANOTHER_ADDRESS should not be defined to expand to an expression of type uint32_t *. Note, however, that if that change were made then the definition of MAGIC_ADDRESS would need to be changed to (ANOTHER_ADDRESS + 1u).
How does this actually work? I thought preprocessor defines were untouchable, how are we managing to cast one?
Where an in-scope macro identifier appears in C source code, the macro's replacement text is substituted. Simplifying a bit: if the replacement text contains macro identifiers too, those are then replaced with their replacement text, and so on. Nowhere in your code fragments is a macro being cast, per se, but the fully-expanded result expresses some casts.
For example, this ...
*(uint32_t*)MAGIC_ADDRESS = 0;
... expands to ...
*(uint32_t*)(ANOTHER_ADDRESS + 4u) = 0;
... and then on to ...
*(uint32_t*)(((uint8_t*)0x40024000) + 4u) = 0;
There are no casts of macros there, but there are (valid) casts of the macros' replacement text.
It's not the cast that allows the assignment to work, it's the * dereferencing operator. The macro expands to a pointer constant, and you can't reassign a constant. But since it's a pointer you can assign to the memory it points to. So if you wrote
*MAGIC_ADDRESS = 0;
you wouldn't get an error.
The cast is necessary to assign to a 4-byte field at that address, rather than just a single byte, since the macro expands to a uint8_t*. Casting it to uint32_t* makes it a 4-byte assignment.
#define ANOTHER_ADDRESS ((uint8_t*)0x40024000)
#define MAGIC_ADDRESS (ANOTHER_ADDRESS + 4u)
And then peppered throughout the code in various files we have things like this:
*(uint32_t*)MAGIC_ADDRESS = 0;
That's the problem - you don't want anything repetitive peppered throughout. Instead, this is what more-or-less idiomatic embedded C code would look like:
// Portable to compilers without the void* arithmetic extension
#define BASE_ADDRESS ((uint8_t*)0x40024000)
#define REGISTER1 (*(uint32_t*)(BASE_ADDRESS + 4u))
You can then write REGISTER1 = 42 or if (REGISTER1 != 42) etc. As you may imagine, this is normally used for memory-mapped peripheral control registers.
If you're using gcc or clang, there's another layer of type safety available as an extension: you don't really want the compiler to allow *BASE_ADDRESS to compile, since presumably you only want to access registers - the *BASE_ADDRESS expression shouldn't pass a code review. And thus:
// gcc, clang, icc, and many others, but not MSVC
#define BASE_ADDRESS ((void*)0x40024000)
#define REGISTER1 (*(uint32_t*)(BASE_ADDRESS + 4u))
Arithmetic on void* is a gcc extension adopted by most compilers that don't come from Microsoft, and it's handy: the *BASE_ADDRESS expression won't compile, and that's a good thing.
I imagine that BASE_ADDRESS is the address of the battery-backed RAM on an STM32 MCU, in which case the "register" interpretation is incorrect: all you want is to persist some application data. You're using C, not assembly language, and there's this handy thing we call structures - absolutely use a structure instead of this ugly hack. The things being stored in that non-volatile area aren't registers; they are just fields in a structure, and the structure itself is stored in a non-volatile fashion:
#define BKPSRAM_BASE_ ((void*)0x40024000)
#define nvstate (*(NVState*)BKPSRAM_BASE_)
enum NVLayout { NVVER_1 = 1, NVVER_2 = 2 };
typedef struct {
    // Note: This structure is persisted in NVRAM.
    // Do not reorder the fields.
    enum NVLayout layout;
    // NVVER_1 fields
    uint32_t value1;
    uint32_t value2;
    ...
    /* sometime later after a release */
    // NVVER_2 fields
    uint32_t valueA;
    uint32_t valueB;
} NVState;
Use:
if (nvstate.layout >= NVVER_1) {
    nvstate.value1 = ...;
    if (nvstate.value2 != 42) ...
}
And here we come to the crux of the problem: your code review was focused on the minutiae, but you should have also divulged the big picture. If my big picture guess is correct - that it's all about sticking some data in a battery-backed RAM, then an actual data structure should be used, not macro hackery and manual offset management. Yuck.
And yes, you'll need that layout field for forward compatibility unless the entire NVRAM area is pre-initialized to zeroes, and you're OK with zeroes as default values.
This approach easily allows you to copy the NVRAM state, e.g. if you wanted to send it over the wire for diagnostic purposes - you don't have to worry about how much data is there, just use sizeof(NVState) for passing it to functions such as fwrite, and you can even use a working copy of that NV data - all without a single memcpy:
NVState wkstate = nvstate;
/* user manipulates the state here */
if (OK_pressed)
    nvstate = wkstate;
else if (Cancel_pressed)
    wkstate = nvstate;
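For instance, a hypothetical diagnostic dump along those lines (fwrite as suggested above):
#include <stdio.h>

/* Write the whole persisted structure to a stream, e.g. a log file. */
void dump_nvstate(FILE *out)
{
    fwrite(&nvstate, sizeof(NVState), 1, out);
}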
If you need to assign values to a specific place in memory, macros allow you to do so in a way that is relatively easy to read (and if you need to use another address later, you just change the macro definition).
The macro is translated by the preprocessor to a value. When you then dereference it, you get access to the memory, which you can read or write. This has nothing to do with the name that is used as a label by the preprocessor.
I'm afraid both definitions are wrong (or at least not completely correct). If the pointers reference hardware registers, they should be defined as pointers to volatile values:
#define ANOTHER_POINTER ((volatile uint8_t*)0x40024000)
#define MAGIC_APOINTER (ANOTHER_POINTER + 4u)
It was defined as a uint8_t * pointer probably because the author wanted pointer arithmetic to be done at the byte level.

How much speed gain if using __INLINE__?

In my understanding, inline can speed up code execution. Is that right?
How much speed can we gain from it?
Ripped from here:
Yes and no. Sometimes. Maybe.
There are no simple answers. inline functions might make the code faster, they might make it slower. They might make the executable larger, they might make it smaller. They might cause thrashing, they might prevent thrashing. And they might be, and often are, totally irrelevant to speed.
inline functions might make it faster: As shown above, procedural integration might remove a bunch of unnecessary instructions, which might make things run faster.
inline functions might make it slower: Too much inlining might cause code bloat, which might cause "thrashing" on demand-paged virtual-memory systems. In other words, if the executable size is too big, the system might spend most of its time going out to disk to fetch the next chunk of code.
inline functions might make it larger: This is the notion of code bloat, as described above. For example, if a system has 100 inline functions each of which expands to 100 bytes of executable code and is called in 100 places, that's an increase of 1MB. Is that 1MB going to cause problems? Who knows, but it is possible that that last 1MB could cause the system to "thrash," and that could slow things down.
inline functions might make it smaller: The compiler often generates more code to push/pop registers/parameters than it would by inline-expanding the function's body. This happens with very small functions, and it also happens with large functions when the optimizer is able to remove a lot of redundant code through procedural integration — that is, when the optimizer is able to make the large function small.
inline functions might cause thrashing: Inlining might increase the size of the binary executable, and that might cause thrashing.
inline functions might prevent thrashing: The working set size (number of pages that need to be in memory at once) might go down even if the executable size goes up. When f() calls g(), the code is often on two distinct pages; when the compiler procedurally integrates the code of g() into f(), the code is often on the same page.
inline functions might increase the number of cache misses: Inlining might cause an inner loop to span across multiple lines of the memory cache, and that might cause thrashing of the memory-cache.
inline functions might decrease the number of cache misses: Inlining usually improves locality of reference within the binary code, which might decrease the number of cache lines needed to store the code of an inner loop. This ultimately could cause a CPU-bound application to run faster.
inline functions might be irrelevant to speed: Most systems are not CPU-bound. Most systems are I/O-bound, database-bound or network-bound, meaning the bottleneck in the system's overall performance is the file system, the database or the network. Unless your "CPU meter" is pegged at 100%, inline functions probably won't make your system faster. (Even in CPU-bound systems, inline will help only when used within the bottleneck itself, and the bottleneck is typically in only a small percentage of the code.)
There are no simple answers: You have to play with it to see what is best. Do not settle for simplistic answers like, "Never use inline functions" or "Always use inline functions" or "Use inline functions if and only if the function is less than N lines of code." These one-size-fits-all rules may be easy to write down, but they will produce sub-optimal results.
Copyright (C) Marshall Cline
Using inline asks the compiler to use the substitution model of evaluation, but this is not guaranteed to happen. When it does, the generated code will be longer and may be faster - but with some optimizations active, the substituted version is not faster all the time.
The reason I use inline function specifier (specifically, static inline), is not because of "speed", but because
static part tells the compiler the function is only visible in the current translation unit (the current file being compiled and included header files)
inline part tells the compiler it can include the implementation of the function at the call site, if it wants to
static inline tells the compiler that it can skip the function completely if it is not used at all in the current translation unit
(Specifically, the compiler that I use most with the options I use most, gcc -Wall, does issue a warning if a function marked static is unused; but will not issue a warning if a function marked static inline is unused.)
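A minimal illustration of that warning difference (hypothetical names):
static int helper(void) { return 1; }         /* gcc -Wall: "defined but not used" */
static inline int helper2(void) { return 2; } /* no warning when unused */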
static inline tells us humans that the function is a macro-like helper function, with the same behavior as a macro plus type checking.
Thus, in my opinion, the assumption that inline has anything to do with speed per se, is incorrect. Answering the stated question with a straight answer would be misleading.
In my code, you see them associated with some data structures, or occasionally global variables.
A typical example is when I want to implement a Xorshift pseudorandom number generator in my own C code:
#include <inttypes.h>
static uint64_t prng_state = 1; /* Any nonzero uint64_t seed is okay */
static inline uint64_t prng_u64(void)
{
    uint64_t state;
    state = prng_state;
    state ^= state >> 12;
    state ^= state << 25;
    state ^= state >> 27;
    prng_state = state;
    return state * UINT64_C(2685821657736338717);
}
The static uint64_t prng_state = 1; means that prng_state is a variable of type uint64_t, visible only in the current compilation unit, and initialized to 1. The prng_u64() function returns an unsigned 64-bit pseudorandom integer. However, if you do not use prng_u64(), the compiler will not generate code for it either.
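A hypothetical usage sketch (PRIu64 comes from inttypes.h, which is already included above):
#include <stdio.h>

int main(void)
{
    int i;
    for (i = 0; i < 5; i++)
        printf("%" PRIu64 "\n", prng_u64());
    return 0;
}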
Another typical use case is when I have data structures, and they need accessor functions. For example,
#ifndef GRID_H
#define GRID_H
#include <stdlib.h>
#include <stdio.h> /* for FILE */
typedef struct {
    int rows;
    int cols;
    unsigned char *cell;
} grid;
#define GRID_INIT { 0, 0, NULL }
#define GRID_OUTSIDE -1
static inline int grid_get(grid *const g, const int row, const int col)
{
    if (!g || row < 0 || col < 0 || row >= g->rows || col >= g->cols)
        return GRID_OUTSIDE;
    return g->cell[row * (size_t)(g->cols) + col];
}
static inline int grid_set(grid *const g, const int row, const int col,
                           const unsigned char value)
{
    if (!g || row < 0 || col < 0 || row >= g->rows || col >= g->cols)
        return GRID_OUTSIDE;
    return g->cell[row * (size_t)(g->cols) + col] = value;
}
static inline void grid_init(grid *g)
{
    g->rows = 0;
    g->cols = 0;
    g->cell = NULL;
}
static inline void grid_free(grid *g)
{
    free(g->cell);
    g->rows = 0;
    g->cols = 0;
    g->cell = NULL;
}
int grid_create(grid *g, const int rows, const int cols,
                const unsigned char initial_value);
int grid_load(grid *g, FILE *handle);
int grid_save(grid *g, FILE *handle);
#endif /* GRID_H */
That header file defines some useful helper functions, and declares the functions grid_create(), grid_load(), and grid_save(), that would be implemented in a separate .c file.
(Yes, those three functions could be implemented in the header file just as well, but it would make the header file quite large. If you had a large project, spread over many translation units (.c source files), each one including the header file would get their own local copies of the functions. The accessor functions defined as static inline above are short and trivial, so it is perfectly okay for them to be copied here and there. The three functions I omitted are much larger.)
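For completeness, a minimal sketch of how grid_create() might look in the separate grid.c; this is not from the original post, and the error handling is illustrative only:
#include <string.h>
#include "grid.h"

int grid_create(grid *g, const int rows, const int cols,
                const unsigned char initial_value)
{
    if (!g || rows < 1 || cols < 1)
        return -1;                       /* illustrative error code */
    g->cell = malloc((size_t)rows * (size_t)cols);
    if (!g->cell)
        return -1;
    memset(g->cell, initial_value, (size_t)rows * (size_t)cols);
    g->rows = rows;
    g->cols = cols;
    return 0;
}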

Static functions declared in "C" header files

For me it's a rule to define and declare static functions inside source files, I mean .c files.
However, in very rare situations I have seen people declare them in a header file.
Since static functions have internal linkage, we need to define the function in every file that includes the header where it is declared. This looks pretty odd and far from what we usually want when declaring something static.
On the other hand, if someone naive tries to use that function without defining it, the compiler will complain. So in some sense it is not really unsafe to do this, even though it sounds strange.
My questions are:
What is the problem with declaring static functions in header files?
What are the risks?
What is the impact on compilation time?
Is there any risk at runtime?
First I'd like to clarify my understanding of the situation you describe: The header contains (only) a static function declaration while the C file contains the definition, i.e. the function's source code. For example
some.h:
static void f();
// potentially more declarations
some.c:
#include "some.h"
static void f() { printf("Hello world\n"); }
// more code, some of it potentially using f()
If this is the situation you describe, I take issue with your remark
Since static functions have internal linkage, we need to define the function in every file that includes the header where it is declared.
If you declare the function but do not use it in a given translation unit, I don't think you have to define it. gcc accepts that with a warning; the standard does not seem to forbid it, unless I missed something. This may be important in your scenario, because translation units which do not use the function but include the header with its declaration don't have to provide an unused definition.
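To make that concrete, a sketch of a translation unit that includes the header but neither defines nor uses f(); gcc accepts it (possibly with a warning):
/* another.c */
#include "some.h"

int g(void) { return 42; }  /* f() is declared via some.h but never referenced */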
Now let's examine the questions:
What is the problem with declaring static functions in header files?
It is somewhat unusual. Typically, static functions are functions needed in only one file. They are declared static to make that explicit by limiting their visibility. Declaring them in a header therefore is somewhat antithetical. If the function is indeed used in multiple files with identical definitions it should be made external, with a single definition. If only one translation unit actually uses it, the declaration does not belong in a header.
One possible scenario therefore is to ensure a uniform function signature for different implementations in the respective translation units. The common header leads to a compile time error for different return types in C (and C++); different parameter types would cause a compile time error only in C (but not in C++, because of function overloading).
What are the risks?
I do not see risks in your scenario. (As opposed to also including the function definition in a header which may violate the encapsulation principle.)
What is the impact on compilation time?
A function declaration is small and its complexity is low, so the overhead of having additional function declarations in a header is likely negligible. But if you create and include an additional header for the declaration in many translation units, the file handling overhead can be significant (i.e. the compiler idles a lot while it waits for the header I/O).
Is there any risk at runtime? I cannot see any.
This is not an answer to the stated questions, but hopefully shows why one might implement a static (or static inline) function in a header file.
I can personally only think of two good reasons to declare some functions static in a header file:
If the header file completely implements an interface that should only be visible in the current compilation unit
This is extremely rare, but might be useful in e.g. an educational context, at some point during the development of some example library; or perhaps when interfacing to another programming language with minimal code.
A developer might choose to do so if the library or interface implementation is trivial or nearly so, and ease of use (for the developer using the header file) is more important than code size. In these cases, the declarations in the header file often use preprocessor macros, allowing the same header file to be included more than once, providing some sort of crude polymorphism in C.
Here is a practical example: Shoot-yourself-in-the-foot playground for linear congruential pseudorandom number generators. Because the implementation is local to the compilation unit, each compilation unit will get their own copies of the PRNG. This example also shows how crude polymorphism can be implemented in C.
prng32.h:
#if defined(PRNG_NAME) && defined(PRNG_MULTIPLIER) && defined(PRNG_CONSTANT) && defined(PRNG_MODULUS)
#define MERGE3_(a,b,c) a ## b ## c
#define MERGE3(a,b,c) MERGE3_(a,b,c)
#define NAME(name) MERGE3(PRNG_NAME, _, name)

static uint32_t NAME(state) = 0U;

static uint32_t NAME(next)(void)
{
    NAME(state) = ((uint64_t)PRNG_MULTIPLIER * (uint64_t)NAME(state)
                 + (uint64_t)PRNG_CONSTANT) % (uint64_t)PRNG_MODULUS;
    return NAME(state);
}

#undef NAME
#undef MERGE3
#endif
#undef PRNG_NAME
#undef PRNG_MULTIPLIER
#undef PRNG_CONSTANT
#undef PRNG_MODULUS
An example using the above, example-prng32.c:
#include <stdlib.h>
#include <stdint.h>
#include <stdio.h>
#define PRNG_NAME glibc
#define PRNG_MULTIPLIER 1103515245UL
#define PRNG_CONSTANT 12345UL
#define PRNG_MODULUS 2147483647UL
#include "prng32.h"
/* provides glibc_state and glibc_next() */
#define PRNG_NAME borland
#define PRNG_MULTIPLIER 22695477UL
#define PRNG_CONSTANT 1UL
#define PRNG_MODULUS 2147483647UL
#include "prng32.h"
/* provides borland_state and borland_next() */
int main(void)
{
    int i;

    glibc_state = 1U;
    printf("glibc lcg: Seed %u\n", (unsigned int)glibc_state);
    for (i = 0; i < 10; i++)
        printf("%u, ", (unsigned int)glibc_next());
    printf("%u\n", (unsigned int)glibc_next());

    borland_state = 1U;
    printf("Borland lcg: Seed %u\n", (unsigned int)borland_state);
    for (i = 0; i < 10; i++)
        printf("%u, ", (unsigned int)borland_next());
    printf("%u\n", (unsigned int)borland_next());

    return EXIT_SUCCESS;
}
The reason for marking both the _state variable and the _next() function static is that this way each compilation unit that includes the header file has their own copy of the variables and the functions -- here, their own copy of the PRNG. Each must be separately seeded, of course; and if seeded to the same value, will yield the same sequence.
One should generally shy away from such polymorphism attempts in C, because it leads to complicated preprocessor macro shenanigans, making the implementation much harder to understand, maintain, and modify than necessary.
However, when exploring the parameter space of some algorithm -- like here, the types of 32-bit linear congruential generators -- this lets us use a single implementation for each of the generators we examine, ensuring there are no implementation differences between them. Note that even this case is more like a development tool, and not something you ought to see in an implementation provided for others to use.
If the header implements simple static inline accessor functions
Preprocessor macros are commonly used to simplify code accessing complicated structure types. static inline functions are similar, except that they also provide type checking at compile time, and can refer to their parameters several times (with macros, that is problematic).
One practical use case is a simple interface for reading files using low-level POSIX.1 I/O (using <unistd.h> and <fcntl.h> instead of <stdio.h>). I've done this myself when reading very large (dozens of megabytes to gigabytes range) text files containing real numbers (with a custom float/double parser), as the GNU C standard I/O is not particularly fast.
For example, inbuffer.h:
#ifndef INBUFFER_H
#define INBUFFER_H
#include <stddef.h> /* for size_t and NULL */
typedef struct {
    unsigned char *head; /* Next buffered byte */
    unsigned char *tail; /* Next byte to be buffered */
    unsigned char *ends; /* data + size */
    unsigned char *data;
    size_t size;
    int descriptor;
    unsigned int status; /* Bit mask */
} inbuffer;
#define INBUFFER_INIT { NULL, NULL, NULL, NULL, 0, -1, 0 }
int inbuffer_open(inbuffer *, const char *);
int inbuffer_close(inbuffer *);
int inbuffer_skip_slow(inbuffer *, const size_t);
int inbuffer_getc_slow(inbuffer *);
static inline int inbuffer_skip(inbuffer *ib, const size_t n)
{
    if (ib->head + n <= ib->tail) {
        ib->head += n;
        return 0;
    } else
        return inbuffer_skip_slow(ib, n);
}
static inline int inbuffer_getc(inbuffer *ib)
{
    if (ib->head < ib->tail)
        return *(ib->head++);
    else
        return inbuffer_getc_slow(ib);
}
#endif /* INBUFFER_H */
Note that the above inbuffer_skip() and inbuffer_getc() do not check if ib is non-NULL; this is typical for such functions. These accessor functions are assumed to be "in the fast path", i.e. called very often. In such cases, even the function call overhead matters (and is avoided with static inline functions, since they are duplicated in the code at the call site).
Trivial accessor functions, like the above inbuffer_skip() and inbuffer_getc(), may also let the compiler avoid the register moves involved in function calls, because functions expect their parameters to be located in specific registers or on the stack, whereas inlined functions can be adapted (wrt. register use) to the code surrounding the inlined function.
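As a usage sketch (assuming, as is conventional for such interfaces, that inbuffer_getc() returns a negative value at end of input):
/* Count newlines in an opened inbuffer via the fast-path accessor. */
size_t count_lines(inbuffer *ib)
{
    size_t lines = 0;
    int c;
    while ((c = inbuffer_getc(ib)) >= 0)
        if (c == '\n')
            lines++;
    return lines;
}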
Personally, I do recommend writing a couple of test programs using the non-inlined functions first, and compare the performance and results to the inlined versions. Comparing the results ensure the inlined versions do not have bugs (off by one type is common here!), and comparing the performance and generated binaries (size, at least) tells you whether inlining is worth it in general.
Why would you want a function that is both global and static? In C, functions are global by default. You only use static functions if you want to limit access to a function to the file where it is declared. So you actively restrict access by declaring it static...
The only case that requires implementations in a header file is C++ template functions and template class member functions.

Why #defines instead of enums [duplicate]

Which one is better to use among the below statements in C?
static const int var = 5;
or
#define var 5
or
enum { var = 5 };
It depends on what you need the value for. You (and everyone else so far) omitted the third alternative:
1. static const int var = 5;
2. #define var 5
3. enum { var = 5 };
Ignoring issues about the choice of name, then:
If you need to pass a pointer around, you must use (1).
Since (2) is apparently an option, you don't need to pass pointers around.
Both (1) and (3) have a symbol in the debugger's symbol table - that makes debugging easier. It is more likely that (2) will not have a symbol, leaving you wondering what it is.
(1) cannot be used as a dimension for arrays at global scope; both (2) and (3) can.
(1) cannot be used as a dimension for static arrays at function scope; both (2) and (3) can.
Under C99, all of these can be used for local arrays. Technically, using (1) would imply the use of a VLA (variable-length array), though the dimension referenced by 'var' would of course be fixed at size 5.
(1) cannot be used in places like switch statements; both (2) and (3) can.
(1) cannot be used to initialize static variables; both (2) and (3) can.
(2) can change code that you didn't want changed because it is used by the preprocessor; both (1) and (3) will not have unexpected side-effects like that.
You can detect whether (2) has been set in the preprocessor; neither (1) nor (3) allows that.
So, in most contexts, prefer the 'enum' over the alternatives. Otherwise, the first and last bullet points are likely to be the controlling factors — and you have to think harder if you need to satisfy both at once.
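A small sketch illustrating several of those points in C (names are illustrative):
enum { N = 8 };
static const int n = 8;

static int ok[N];          /* enum constant works as a global array dimension */
/* static int bad[n]; */   /* does not compile in C: n is not a constant expression */

int classify(int x)
{
    switch (x) {
    case N:                /* enum constant as a case label; n would not work */
        return 1;
    default:
        return 0;
    }
}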
If you were asking about C++, then you'd use option (1) — the static const — every time.
Generally speaking:
static const
Because it respects scope and is type-safe.
The only caveat I could see: if you want the variable to be possibly defined on the command line. There is still an alternative:
#ifdef VAR // Very bad name, not long enough, too general, etc..
static int const var = VAR;
#else
static int const var = 5; // default value
#endif
Whenever possible, instead of macros / ellipsis, use a type-safe alternative.
If you really NEED to use a macro (for example, you want __FILE__ or __LINE__), then you'd better name your macro VERY carefully. In its naming convention, Boost recommends all upper-case, beginning with the name of the project (here BOOST_); while perusing the library you will notice this is (generally) followed by the name of the particular area (library) and then a meaningful name.
It generally makes for lengthy names :)
In C, specifically? In C the correct answer is: use #define (or, if appropriate, enum)
While it is beneficial to have the scoping and typing properties of a const object, in reality const objects in C (as opposed to C++) are not true constants and therefore are usually useless in most practical cases.
So, in C the choice should be determined by how you plan to use your constant. For example, you can't use a const int object as a case label (while a macro will work). You can't use a const int object as a bit-field width (while a macro will work). In C89/90 you can't use a const object to specify an array size (while a macro will work). Even in C99 you can't use a const object to specify an array size when you need a non-VLA array.
If this is important for you then it will determine your choice. Most of the time, you'll have no choice but to use #define in C. And don't forget another alternative, that produces true constants in C - enum.
In C++ const objects are true constants, so in C++ it is almost always better to prefer the const variant (no need for explicit static in C++ though).
The difference between static const and #define is that the former uses memory for storage and the latter does not. Secondly, you cannot pass the address of a #define, whereas you can pass the address of a static const. Which one to use really depends on the circumstances; both are at their best in different situations. Please don't assume that one is better than the other... :-)
If that had been the case, Dennis Ritchie would have kept only the best one... hahaha... :-)
In C #define is much more popular. You can use those values for declaring array sizes for example:
#define MAXLEN 5

void foo(void) {
    int bar[MAXLEN];
}
ANSI C doesn't allow you to use static consts in this context as far as I know. In C++ you should avoid macros in these cases. You can write
const int maxlen = 5;

void foo() {
    int bar[maxlen];
}
and even leave out static because internal linkage is implied by const already [in C++ only].
Another drawback of const in C is that you can't use the value in initializing another const.
static int const NUMBER_OF_FINGERS_PER_HAND = 5;
static int const NUMBER_OF_HANDS = 2;

// initializer element is not constant, this does not work.
static int const NUMBER_OF_FINGERS = NUMBER_OF_FINGERS_PER_HAND
                                   * NUMBER_OF_HANDS;
Even this does not work with a const since the compiler does not see it as a constant:
static uint8_t const ARRAY_SIZE = 16;
static int8_t const lookup_table[ARRAY_SIZE] = {
    1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16}; // ARRAY_SIZE is not a constant!
I'd be happy to use typed const in these cases, otherwise...
If you can get away with it, static const has a lot of advantages. It obeys the normal scope principles, is visible in a debugger, and generally obeys the rules that variables obey.
However, at least in the original C standard, it isn't actually a constant. If you use #define var 5, you can write int foo[var]; as a declaration, but you can't do that (except as a compiler extension) with static const int var = 5;. This is not the case in C++, where the static const version can be used anywhere the #define version can, and I believe this is also the case with C99.
However, never name a #define constant with a lowercase name. It will override any possible use of that name until the end of the translation unit. Macro constants should be in what is effectively their own namespace, which is traditionally all capital letters, perhaps with a prefix.
#define var 5 will cause you trouble if you have things like mystruct.var.
For example,
struct mystruct {
    int var;
};

#define var 5

int main() {
    struct mystruct foo;
    foo.var = 1;
    return 0;
}
The preprocessor will replace it and the code won't compile. For this reason, traditional coding style suggests that all constant #defines use capital letters to avoid conflicts.
It is ALWAYS preferable to use const instead of #define, because const is handled by the compiler and #define by the preprocessor. It is as if #define itself were not part of the code (roughly speaking).
Example:
#define PI 3.1416
The symbolic name PI may never be seen by compilers; it may be removed by the preprocessor before the source code even gets to a compiler. As a result, the name PI may not get entered into the symbol table. This can be confusing if you get an error during compilation involving the use of the constant, because the error message may refer to 3.1416, not PI. If PI were defined in a header file you didn’t write, you’d have no idea where that 3.1416 came from.
This problem can also crop up in a symbolic debugger, because, again, the name you’re programming with may not be in the symbol table.
Solution:
const double PI = 3.1416; //or static const...
I wrote quick test program to demonstrate one difference:
#include <stdio.h>
enum {ENUM_DEFINED=16};
enum {ENUM_DEFINED=32};
#define DEFINED_DEFINED 16
#define DEFINED_DEFINED 32
int main(int argc, char *argv[]) {
printf("%d, %d\n", DEFINED_DEFINED, ENUM_DEFINED);
return(0);
}
This compiles with these errors and warnings:
main.c:6:7: error: redefinition of enumerator 'ENUM_DEFINED'
enum {ENUM_DEFINED=32};
^
main.c:5:7: note: previous definition is here
enum {ENUM_DEFINED=16};
^
main.c:9:9: warning: 'DEFINED_DEFINED' macro redefined [-Wmacro-redefined]
#define DEFINED_DEFINED 32
^
main.c:8:9: note: previous definition is here
#define DEFINED_DEFINED 16
^
Note that enum gives an error where #define gives only a warning.
The definition
const int const_value = 5;
does not always define a constant value. Some compilers (for example tcc 0.9.26) just allocate memory identified by the name "const_value". Using the identifier "const_value" you cannot modify this memory, but you could still modify it using another identifier:
const int const_value = 5;
int *mutable_value = (int*) &const_value;
*mutable_value = 3;
printf("%i", const_value); // The output may be 5 or 3, depending on the compiler.
This means the definition
#define CONST_VALUE 5
is the only way to define a constant value which can not be modified by any means.
Although the question was about integers, it's worth noting that #define and enums are useless if you need a constant structure or string. These are both usually passed to functions as pointers. (With strings it's required; with structures it's much more efficient.)
As for integers, if you're in an embedded environment with very limited memory, you might need to worry about where the constant is stored and how accesses to it are compiled. The compiler might add two consts at run time, but add two #defines at compile time. A #define constant may be converted into one or more MOV [immediate] instructions, which means the constant is effectively stored in program memory. A const constant will be stored in the .const section in data memory. In systems with a Harvard architecture, there could be differences in performance and memory usage, although they'd likely be small. They might matter for hard-core optimization of inner loops.
Don't think there's an answer for "which is always best" but, as Matthieu said
static const
is type safe. My biggest pet peeve with #define, though, is when debugging in Visual Studio you cannot watch the variable. It gives an error that the symbol cannot be found.
Incidentally, an alternative to #define which provides proper scoping but behaves like a "real" constant is enum. For example:
enum {number_ten = 10};
In many cases, it's useful to define enumerated types and create variables of those types; if that is done, debuggers may be able to display variables according to their enumeration name.
One important caveat with doing that, however: in C++, enumerated types have limited compatibility with integers. For example, by default, one cannot perform arithmetic upon them. I find that to be a curious default behavior for enums; while it would have been nice to have a "strict enum" type, given the desire to have C++ generally compatible with C, I would think the default behavior of an "enum" type should be interchangeable with integers.
A simple difference:
At pre-processing time, a #define constant is replaced with its value.
So you cannot apply the address-of operator to a define, but you can apply the address-of operator to a variable.
As you would suppose, a define is faster than a static const.
For example, having
#define mymax 100
you cannot do printf("address of constant is %p", &mymax);.
But having
const int mymax_var = 100;
you can do printf("address of constant is %p", &mymax_var);.
To be clearer: the define is replaced by its value at the pre-processing stage, so we do not have any variable stored in the program; there is just the code from the text segment of the program where the define was used.
However, for static const we have a variable that is allocated somewhere. For gcc, static const variables are allocated in the text segment of the program.
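Putting those fragments together as a runnable sketch:
#include <stdio.h>

#define mymax 100                    /* no object, no address */
static const int mymax_var = 100;    /* an object with an address */

int main(void)
{
    printf("address of constant is %p\n", (void *)&mymax_var);
    /* printf("%p\n", &mymax); */    /* would not compile: &100 is invalid */
    return 0;
}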
We looked at the produced assembler code on the MBF16X... Both variants result in the same code for arithmetic operations (ADD Immediate, for example).
So const int is preferred for the type checking, while #define is old style. Maybe it is compiler-specific, so check your produced assembler code.
I am not sure if I am right, but in my opinion reading a #defined value is much faster than reading any other normally declared variable (or const value).
That is because when the program is running and needs to use a normally declared variable, it must jump to the exact place in memory to get that variable.
In contrast, when it uses a #defined value, the program doesn't need to jump to any allocated memory; it just uses the value directly. If #define myValue 7 and the program uses myValue, it behaves exactly the same as if it just used 7.

What makes a better constant in C, a macro or an enum?

I am confused about when to use macros or enums. Both can be used as constants, but what is the difference between them and what is the advantage of either one? Is it somehow related to compiler level or not?
In terms of readability, enumerations make better constants than macros, because related values are grouped together. In addition, enum defines a new type, so the readers of your program would have easier time figuring out what can be passed to the corresponding parameter.
Compare
#define UNKNOWN 0
#define SUNDAY 1
#define MONDAY 2
#define TUESDAY 3
...
#define SATURDAY 7
to
typedef enum {
    UNKNOWN,
    SUNDAY,
    MONDAY,
    TUESDAY,
    ...
    SATURDAY,
} Weekday;
It is much easier to read code like this
void calendar_set_weekday(Weekday wd);
than this
void calendar_set_weekday(int wd);
because you know which constants it is OK to pass.
A macro is a preprocessor thing, and the compiled code has no idea about the identifiers you create. They have been already replaced by the preprocessor before the code hits the compiler. An enum is a compile time entity, and the compiled code retains full information about the symbol, which is available in the debugger (and other tools).
Prefer enums (when you can).
In C, it is best to use enums for actual enumerations: when some variable can hold one of multiple values which can be given names. One advantage of enums is that the compiler can perform some checks beyond what the language requires, like that a switch statement on the enum type is not missing one of the cases. The enum identifiers also propagate into the debugging information. In a debugger, you can see the identifier name as the value of an enum variable, rather than just the numeric value.
Enumerations can be used just for the side effect of creating symbolic constants of integral type. For instance:
enum { buffer_size = 4096 }; /* we don't care about the type */
However, this practice is not that widespread. For one thing, buffer_size will be used as an integer and not as an enumerated type. A debugger will not render 4096 as buffer_size, because that value won't be represented as the enumerated type. If you declare some char array[buffer_size]; then sizeof array will not show up as buffer_size. In this situation, the enumeration constant disappears at compile time, so it might as well be a macro. And there are disadvantages, like not being able to control its exact type. (There might be some small advantage in a situation where the output of the preprocessing stages of translation is being captured as text: a macro will have turned into 4096, whereas buffer_size will stay as buffer_size.)
A preprocessor symbol lets us do this:
#define buffer_size 0L /* buffer_size is a long int */
Note that various values from C's <limits.h> like UINT_MAX are preprocessor symbols and not enum symbols, with good reasons for that, because those identifiers need to have a precisely determined type. Another advantage of a preprocessor symbol is that we can test for its presence, or even make decisions based on its value:
#if ULONG_MAX > UINT_MAX
/* unsigned long is wider than unsigned int */
#endif
Of course we can test enumerated constants also, but not in such a way that we can change global declarations based on the result.
Enumerations are also ill suited for bitmasks:
enum modem_control { mc_dsr = 0x1, mc_dtr = 0x2, mc_rts = 0x4, ... }
it just doesn't make sense because when the values are combined with a bitwise OR, they produce a value which is outside of the type. Such code causes a headache, too, if it is ever ported to C++, which has (somewhat more) type-safe enumerations.
Note there are some differences between macros and enums, and either of these properties may make them (un)suitable as a particular constant.
enums are signed (compatible with int). In any context where an unsigned type is required (think especially bitwise operations!), enums are out.
if long long is wider than int, big constants won't fit in an enum.
The size of an enum is (usually) sizeof(int). For arrays of small values (up to say, CHAR_MAX) you might want a char foo[] rather than an enum foo[] array.
enums are integral numbers. You can't have enum funny_number { PI=3.14, E=2.71 }.
enums are a C89 feature; K&R compilers (admittedly ancient) don't understand them.
If a macro is implemented properly (i.e. it does not suffer from associativity issues when substituted), then there's not much difference in applicability between macro and enum constants in situations where both are applicable, i.e. in situations where you need signed integer constants specifically.
However, in general case macros provide much more flexible functionality. Enums impose a specific type onto your constants: they will have type int (or, possibly, larger signed integer type), and they will always be signed. With macros you can use constant syntax, suffixes and/or explicit type conversions to produce a constant of any type.
Enums work best when you have a group of tightly associated sequential integer constants. They work especially well when you don't care about the actual values of the constants at all, i.e. when you only care about them having some well-behaved unique values. In all other cases macros are a better choice (or basically the only choice).
As a practical matter, there is little difference. They are equally usable as constants in your programs. Some may prefer one or the other for stylistic reasons, but I can't think of any technical reason to prefer one over the other.
One difference is that macros allow you to control the integral type of related constants, whereas an enum will use an int.
#define X 100L
enum { Y = 100L };
printf("%ld\n", X);
printf("%d\n", Y); /* Y has int type */
enum has an advantage: block scope:
{ enum { E = 12 }; }
{ enum { E = 13 }; }
With macros there is a need to #undef.
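For example:
#define E 12
/* ... code using E ... */
#undef E               /* required before the name can be reused */
#define E 13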
