Correct definition of constants - c

I'm having a little trouble defining the constants I use in my code in a correct way. Although I read Jonathan Leffler's excellent answer over at How do I use extern to share variables between source files?, I seem to have misunderstood something. This is the setup:
/* constants.h */
extern int NUM_PARTICLES;
extern int LIGHTSPEED;
This header is used in random.h and main.c, which look like
#include "constants.h"
int NUM_PARTICLES=104;
in random.h or
#include "constants.h"
int LIGHTSPEED=104;
in main.c, respectively. NUM_PARTICLES is used in main.c in
30: double ghosts[NUM_PARTICLES][4];
31: double output[NUM_PARTICLES][3];
Although this thing works, I get the following warnings,
main.c: In function ‘int main()’:
main.c:30:32: warning: ISO C++ forbids variable length array ‘ghosts’ [-Wvla]
main.c:31:32: warning: ISO C++ forbids variable length array ‘output’ [-Wvla]
which is weird, because in my opinion I do give the arrays a constant size that is known at compile time. (And usually these array-length errors cause segfaults, which in this case they do not.) Any ideas?

Short story: this is a quirk of C.
Normally, you would define an integer constant as
const int LIGHTSPEED = 104;
The problem is that according to the language rules this constant is not a constant expression, and thus cannot be used to specify the size of a statically allocated array.
The relevant part of the C standard (6.6/6, I am not making this up) defines what an integer constant expression is:
An integer constant expression shall have integer type and shall
only have operands that are integer constants, enumeration constants,
character constants, sizeof expressions whose results are integer
constants, and floating constants that are the immediate operands of
casts.
There are two solutions for this. The classic one is to use a macro, which simply pastes 104 between the square brackets before the compiler sees the code, therefore making the array size an integer constant:
#define NUM_PARTICLES 104
The better one (IMO) is to avoid the macro, since you can, and use an enum instead; this works because you are then using an enumeration constant:
enum { NUM_PARTICLES = 104 };

Related

C - What is happening when specifying an array parameter with another parameter as its size

I came across this code, and I can't really figure out how/why it works:
void cistore(int bucketsize, int data[][bucketsize])
{
}
int main()
{
    return 0;
}
What exactly is going on here? I'd expect the C compiler (in this case gcc) to only allow this if bucketsize is determinable at compile-time. But even when there is no way of knowing bucketsize at compile time, gcc doesn't complain. How does gcc handle this?
What exactly is going on here?
Since C99, C has had support for variable-length arrays, whose lengths are determined at runtime. In C99 it was a mandatory feature, but since C11 it has been an optional feature. Many modern C compilers support it, with the notable exception of Microsoft's.
In
void cistore(int bucketsize, int data[][bucketsize])
data is declared as a pointer to an array of bucketsize elements of type int. The pointed-to object is a variable-length array, at least from the perspective of the function.
I'd expect the C compiler (in this case
gcc) to only allow this if bucketsize is determinable at compile-time.
Surprise!
But even when there is no way of knowing bucketsize at compile time,
gcc doesn't complain. How does gcc handle this?
How would GCC handle it if the pointed-to array had explicit length? As far as the "how" goes, I don't expect that GCC needs to make too many adjustments.
Or if you're asking about the semantics, they are pretty much what one would expect once they get over any shock about VLAs being a thing. The pointed-to object is an array whose length on any given call to the function is specified by the value of the bucketsize argument. That can differ from call to call.
Here's an extended version of your example code that demonstrates:
void cistore(int bucketsize, int data[][bucketsize])
{
}

int main()
{
    int d1[5][5];
    int d2[4][6];
    int (*d3)[42] = NULL;

    cistore(5, d1);
    cistore(6, d2);
    cistore(42, d3);
    return 0;
}
The parameter data is declared using a variable-length array, which does not necessarily need to have an integer constant size when declared as a function parameter:
If expression is not an integer constant expression, the declarator is for an array of variable size.
In fact, the size is implicitly ignored at function prototype scope:
If the size is *, the declaration is for a VLA of unspecified size. Such declaration may only appear in a function prototype scope, and declares an array of a complete type. In fact, all VLA declarators in function prototype scope are treated as if expression were replaced by *.
("Expression" is what is usually between the square brackets and specifies the size.)
In other words, at a function prototype, the size does not need to be known at all, and still the type is considered complete. Only in the function definition does the size need to be specified, as some (not necessarily constant) integer expression. In the call to the function, the integer expression may be const or non-const. I would imagine this (effectively) works similarly to when VLAs are declared and defined in a loop, i.e.:
Each time the flow of control passes over the declaration, expression is evaluated (and it must always evaluate to a value greater than zero), and the array is allocated (correspondingly, lifetime of a VLA ends when the declaration goes out of scope).
(Emphasis mine.)
...except that the size is now evaluated at the time of the function call.
In your particular case, you basically have a variably-modified type:
Variable-length arrays and the types derived from them (pointers to them, etc) are commonly known as "variably-modified types" (VM). Objects of any variably-modified type may only be declared at block scope or function prototype scope.
And yes, the code does compile. An example:
#include <inttypes.h>
#include <stdio.h>

struct data_el
{
    int n;
};

// Function prototype. Not specifying a size is allowed
void cistore(uint32_t bucketsize, struct data_el data[][*]);

// Function definition.
// Must specify the VLA's size as a non-constant integer expression
void cistore(uint32_t bucketsize, struct data_el data[][bucketsize])
{
    (*data)[1].n = 5;
    printf("%d\n", (*data)[1].n);
}

int main(void)
{
    // Function call. Use any (const or non-const) integer expression.
    uint32_t bucketsize = 2U;
    struct data_el data[bucketsize];
    cistore(bucketsize, &data);
}
Interestingly, Clang compiles this without warnings, whereas GCC warns about type conflicts:
vlaquestion.c:14:50: warning: argument 2 of type ‘struct data_el[][bucketsize]’ declared as a variable length array [-Wvla-parameter]
14 | void cistore(uint32_t bucketsize, struct data_el data[][bucketsize])
| ~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~
vlaquestion.c:10:50: note: previously declared as an ordinary array ‘struct data_el[][0]’
10 | void cistore(uint32_t bucketsize, struct data_el data[][*]);
| ~~~~~~~~~~~~~~~^~~~~~~~~
And finally, some useful reading on the somewhat controversial VLAs: Why aren't variable-length arrays part of the C++ standard?
At some point, an effort was even made to make the Linux kernel VLA-free.

C MACRO evaluation

I wish to declare a statically allocated array.
Let's take a look at the following code:
#define MAX(a,b) ((a)>(b)?(a):(b))
#define FAST 16
#define SLOW 6
#define MAX_NUM MAX(FAST,SLOW)
U8* pBuffers[MAX_NUM];
When is MAX_NUM evaluated by the GCC compiler (FAST and SLOW are constants)?
I would like to make sure that MAX_NUM is a constant and is evaluated as part of compilation or by the preprocessor.
When you launch the compiler, the following phases are performed sequentially:
preprocessing: it handles #define, #ifdef / #endif...
code generation: it produces the machine code runnable on the target CPU
optimization: it optimizes depending on user options
During the preprocessing phase, the preprocessor will for example "replace" your line with:
U8* pBuffers[MAX(FAST,SLOW)]
then:
U8* pBuffers[((FAST)>(SLOW)?(FAST):(SLOW))]
then finally:
U8* pBuffers[((16)>(6)?(16):(6))]
Indeed, the preprocessor is not very clever and does not go further.
During the code generation phase, your line will be interpreted as:
U8* pBuffers[16]
Because the code generator is very clever.
The C standard requires the sizes of most arrays to be declared using an integer constant expression, which can be, and in this case is required to be, fully evaluated at compile time. (The only exception is "variable length arrays", and those have to be function-local variables with "automatic storage duration" — not statically allocated.)
Therefore, one answer to your question is you don't have to worry about it. If you write
WHATEVER_TYPE variable[SOME EXPRESSION];
at file scope, either SOME EXPRESSION will be evaluated to a constant at compile time, or compilation will fail and you will get an error.
But a more useful answer is to explain how to see for yourself whether SOME EXPRESSION is an integer constant expression, when you are reading code. First, you have to mentally expand all of the macros. Then, you will presumably have an arithmetic expression of some sort (if not, it's a syntax error).
This arithmetic expression is a constant expression if it has no side-effects, doesn't make any function calls, and doesn't refer to the value of any variable (not even if it is const) (enum constants are fine, though, as are string literals, and sizeof variable as long as variable is completely declared and isn't a variable-length array). It is an integer constant expression if, in addition, it doesn't try to do any floating-point or pointer arithmetic (you are allowed to write a floating-point literal as the immediate operand of a cast, though; for instance ((int)3.1415926) is an integer constant expression).
So, to take your example,
#define MAX(a,b) ((a)>(b)?(a):(b))
#define FAST 16
#define SLOW 6
#define MAX_NUM MAX(FAST,SLOW)
U8* pBuffers[MAX_NUM];
after macro expansion we have
U8* pBuffers[((16)>(6)?(16):(6))];
The expression inside the square brackets has no side effects, doesn't make any function calls, doesn't refer to the value of any variable, and doesn't do any floating-point or pointer arithmetic, so it's an integer constant expression, and the compiler is required to evaluate it at compile time.
By contrast, if you were using this definition of MAX instead:
static inline size_t MAX(size_t a, size_t b)
{ return a > b ? a : b; }
then macro expansion would produce
U8* pBuffers[MAX(16, 6)];
and the expression inside the square brackets would be making a function call, so it wouldn't be an integer constant expression, or even a constant expression, and you would get a compile-time error.
(FYI, the rules in C++ are much more complicated; if you need to know about that, ask a new question.)
Macros are always expanded by the preprocessor before compilation proper begins, so this code has nothing to worry about and will work fine.
This is not really compiler-dependent: with gcc, and with any conforming compiler, an array size at file scope must be an integer constant expression, so it is evaluated at compile time.

Checking if macro argument is literal

Is there a way to check at compile time if the argument of a macro is an integer literal, and evaluate the macro differently in that case?
#include <stdio.h>
#define VALUE_0 0
#define VALUE_1 2
#define VALUE_2 4
#define VALUE_3 6
#define VALUE_4 8
#define VALUE(_idx_) VALUE_ ## _idx_
#define VALUE_(_idx_) 2*(_idx_)
int main() {
    printf("%i\n", VALUE(3));
    printf("%i\n", VALUE_(1+2));
}
VALUE(3) is always resolved at compile time, but only works if 3 is an integer literal.
VALUE_(3) works for any argument type, but may result in an expression that is computed at runtime (in a more complex case), making compiler optimizations impossible.
Is there a way to write the macro such that it automatically resolves to VALUE_ or to VALUE, depending on whether the argument is an integer literal?
Edit:
It is for a C program, or more specifically OpenCL C. It seems that for some OpenCL C compilers (for example NVidia nvcc and Intel), an expression like VALUE(idx) does not always get resolved at compile time, even when the argument is a constant. (Or at least the kernel does not get auto-vectorized if it contains such an expression.) For example if VALUE() resolves to a call of an inline function containing a switch statement, or to a lookup of a constant array, it does not work, but if it is an nested ?: expression, it works. VALUE_ would be guaranteed to resolve to a constant.
Because I'm generating C source code at runtime from the host and passing it to the OpenCL C compiler, it would be useful to not have to generate two different macros for each array.
I'd recommend always using the latter:
#define VALUE_(_idx_) 2*(_idx_)
If the argument is a constant or a constant expression, the resulting expression after the preprocessor will be evaluated by the compiler. If it is not, it will be evaluated at runtime.
The only difference between the two macros in the case of an integer literal is whether the preprocessor gives you the final result or whether the compiler does. In both cases, there is no runtime overhead, so better to go with the one that gives you the most flexibility.
I'll turn the problem on its head and suggest that you try forcing the compiler to compute the value (with the additional benefit of asserting that the value is, in fact, a compile-time constant):
#define MAKE_ME_CONSTANT(x) sizeof((struct{char c[x];}){{0}}.c)
This (tweaked from an earlier answer of mine) declares an anonymous structure with a char-array member whose size is your constant. It then instantiates it with a compound literal and retrieves the member's size. All of this is part of the unevaluated operand of a sizeof, so I expect any compiler to make it a constant.
Note: I didn't use the simpler sizeof(char[x]) because that could be a VLA. But VLAs can't be structure members, so we're good here.
Hence, you get:
#define VALUE(_idx_) MAKE_ME_CONSTANT(2*(_idx_))

Is there a safe way to refer to linker-only symbols without taking the address of void expressions?

A file has a series of void declarations used as void* as follows:
extern void __flash_rwdata_start;
...
initialize(&__flash_rwdata_start,
...
which are provided solely by the linker script as symbols referring to binary partitioning as follows (ie a pointer):
PROVIDE (__flash_rwdata_start = LOADADDR(.rwdata));
And this generates the following warning:
file.c:84:19: warning: taking address of expression of type 'void' [enabled by default]
As per the answer to Why has GCC started warning on taking the address of a void expression?, I've changed this as follows (the function which takes the pointers uses an unsigned long* anyway):
extern unsigned long __flash_rwdata_start;
Now it occurs to me that the original definition had an implication of zero (or undefined) size, whereas the current definition does not, and unlike the answer to Why has GCC started warning on taking the address of a void expression? there is no "real underlying data type" that makes any logical sense.
Essentially I've defined a pointer that is valid, but does not point to a valid value, and thus it would be invalid to dereference.
Is there a safer or preferable way to avoid the warning, but without the idea of space being allocated, that can't be dereferenced?
One idea that comes to mind is to make the object's type be an incomplete struct, like:
extern struct never_defined __flash_rwdata_start;
This way the address will be valid, but, as long as the type is never defined, not dereferenceable.
For completeness, the usual way besides "void" (which doesn't conform to C11 at least), is
extern char __flash_rwdata_start[];
I do like Tom's answer better though.

When to use #define or constant char/int?

In general, is it better to define some specific parameters (e.g. (char *) UserIPaddr = "192.168.0.5", (int) MAX_BUF = 1024) with #define or with a constant char * / int?
I read some threads saying that it is better not to use #define when possible. However, I see quite common usage of #define in open source code. One example from a source file:
#define IEEE80211_WLAN_HDR_LEN 24
a_uint8_t *iv = NULL;
a_uint16_t tmp;
a_uint16_t offset = IEEE80211_WLAN_HDR_LEN;
Using #define could be avoided there, but I wonder why it was preferred in that case, for example. How should I decide when to use #define or not?
In C, const declarations do not produce constant expressions, so if you need a constant expression it is not possible using const; the traditional and more commonly used way to get one is #define.
For example const int cannot be used in:
a case label or
as a bit-field width or
as array size in a non-VLA array declaration (pre C99 days)
There are few reasons to use #define. There is little it accomplishes that a static const or enum cannot.
As Alok Save mentions, static const int cannot produce an integral constant expression in C (it can in C++, though). But enum can do that. However, enum in pure C does not grow to accommodate values larger than INT_MAX. So if you need a long value to use as an array bound or case label, #define is your friend. Or consider switching to using the C subset of C++, which doesn't have such restrictions.
My rule of thumb is to not use #define unless the symbol must be a compile-time constant. With this in mind, I personally would not have used #define in your example.
To take a different example from the same source file:
#define CRYPTO_KEY_TYPE_AES 2
...
switch (keytype) {
case CRYPTO_KEY_TYPE_AES:
Here, CRYPTO_KEY_TYPE_AES must be a constant expression, and thus using a constant variable would not do.
