How to make GCC evaluate functions at compile time? - c

I am thinking about the following problem: I want to program a microcontroller (let's say an AVR mega type) with a program that uses some sort of look-up tables.
The first attempt would be to locate the table in a separate file and create it using any other scripting language/program/.... In this case there is quite some effort in creating the necessary source files for C.
My thought was now to use the preprocessor and compiler to handle things. I tried to implement this with a table of sine values (just as an example):
#include <avr/io.h>
#include <math.h>
#define S1(i,n) ((uint8_t)(255 * sin(M_PI * (i) / (n))))
#define S4(i,n) S1(i,n), S1(i+1,n), S1(i+2,n), S1(i+3,n)
uint8_t lut[] = {S4(0,4)};
int main(void)
{
    uint8_t val, i;
    for (i = 0; i < 4; i++)
    {
        val = lut[i];
    }
    return 0;
}
If I compile this code I get warnings about the sin function. Furthermore, the assembly contains nothing in the .data section. If I just remove the sin in the third line, I get the data in the assembly. Clearly all the information is available at compile time.
Can you tell me if there is a way to achieve what I intend, namely that the compiler calculates as many values as possible offline? Or is the best way to use an external script/program/... to calculate the table entries and add these to a separate file that will just be #included?

The general problem here is that the sin call makes this initialization invalid according to the rules of the C language: a function call is not a constant expression, and an array with static storage duration must be initialized with constant expressions. This also explains why your array is not in the .data section.
C11 (N1570) §6.6/2,3 Constant expressions (emphasis mine)
A constant expression can be evaluated during translation rather than
runtime, and accordingly may be used in any place that a constant may
be.
Constant expressions shall not contain assignment, increment,
decrement, function-call, or comma operators, except when they are
contained within a subexpression that is not evaluated.
However, as per @ShafikYaghmour's comment, GCC will replace the sin function call with its built-in counterpart (unless the -fno-builtin option is present), which is likely to be treated as a constant expression. According to 6.57 Other Built-in Functions Provided by GCC:
GCC includes built-in versions of many of the functions in the
standard C library. The versions prefixed with __builtin_ are always
treated as having the same meaning as the C library function even if
you specify the -fno-builtin option.

What you are trying is not part of the C language. In situations like this, I have written code following this pattern:
#if GENERATE_SOURCECODE
int main (void)
{
... Code that uses printf to write C code to stdout
}
#else
// Source code generated by the code above
... Here I paste in what the code above generated
// The rest of the program
#endif
Every time you need to change it, you run the code with GENERATE_SOURCECODE defined, and paste in the output. Works well if your code is self contained and the generated output only ever changes if the code generating it changes.

First of all, it should go without saying that you should evaluate (probably by experiment) whether this is worth doing. Your lookup table is going to increase your data size and programmer effort, but may or may not provide a runtime speed increase that you need.
If you still want to do it, I don't think the C preprocessor can do it straightforwardly, because it has no facilities for iteration or recursion.
The most robust way to go about this would be to write a program in C or some other language to print out C source for the table, and then include that file in your program using the preprocessor. If you are using a tool like make, you can create a rule to generate the table file and have your .c file depend on that file.
On the other hand, if you are sure you are never going to change this table, you could write a program to generate it once and just paste it in.
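A minimal sketch of such a generator, for illustration only. The names sine_series, sine_entry and print_sine_table are hypothetical, as is the choice to scale a half wave to 0..255 with rounding. The sine is evaluated with a short Taylor series only so that the sketch builds without linking the math library; calling sin from <math.h> would work just as well.

```c
#include <stdint.h>
#include <stdio.h>

#define TABLE_PI 3.14159265358979323846

/* Taylor series for sin(x); accurate enough for an 8-bit table on [0, pi]. */
static double sine_series(double x)
{
    double term = x;
    double sum = x;
    for (int k = 1; k <= 15; ++k) {
        term *= -(x * x) / ((2.0 * k) * (2.0 * k + 1.0));
        sum += term;
    }
    return sum;
}

/* Entry i of an n-entry half-wave table, scaled to 0..255 and rounded. */
uint8_t sine_entry(unsigned i, unsigned n)
{
    return (uint8_t)(255.0 * sine_series(TABLE_PI * i / n) + 0.5);
}

/* Prints the table as C source, ready to be #included. */
void print_sine_table(FILE *out, unsigned n)
{
    fprintf(out, "const uint8_t lut[%u] = {", n);
    for (unsigned i = 0; i < n; ++i)
        fprintf(out, "%s%u", i ? ", " : "", (unsigned)sine_entry(i, n));
    fprintf(out, "};\n");
}
```

Calling print_sine_table(stdout, 4) from a small main and redirecting the output to a file gives a single line that the firmware source can #include. A make rule can then regenerate that file whenever the generator changes.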

Related

lldb in xcode detects integer called I to be a complex number

I have a C code, within which an int I gets declared and initialized. When I'm debugging within xcode, if I try to print the value of I, xcode tries to find a complex number:
(lldb) p I
error: <lldb wrapper prefix>:43:31: expected unqualified-id
using $__lldb_local_vars::I;
^
<user expression 3>:1760:11: expanded from here
#define I _Complex_I
^
<user expression 3>:7162:20: expanded from here
#define _Complex_I ( __extension__ 1.0iF )
When I try the same thing (stopping at the same exact line in the code) in the command line, without using xcode, it works fine:
(lldb) p I
(int) $0 = 56
I'm loading the following libraries:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <math.h>
which shouldn't even include complex numbers, no? I definitely don't have a macro that defines I to be the complex variable. The one I run in xcode, I compile with the default xcode tools. The one I run in the command line, I use gcc. Is this the difference, somehow? Is xcode including more libraries than I ask it to? Why is this happening and how can I prevent it?
Edit: I should also add that the variable explorer in xcode shows the value of I correctly, as an integer.
$__lldb_local_vars is an artificial namespace that lldb injects into the wrapper it sets up for your expression before compilation so that clang can find the frame's local variables and their types. The problem comes as others have noted because we also run the preprocessor when compiling your expression, and your variable name collides with a preprocessor symbol in the expression context.
Normally, debug information does not record macros at all, so you aren't seeing the complex.h version of I from your own use of it in your code. Rather, you are seeing the I macro because something has caused the Darwin module to be imported into lldb's expression context.
That can happen in two ways, either because you explicitly asked for it by running:
(lldb) expr #import Darwin
or because you built this program with -fmodules and your code imported the Darwin module by inserting a statement like the above.
Doing this by hand is a common trick explicitly to make #defines from the module visible to the expression parser. Since it is the visibility of the macro that is causing problems, you will have to stop doing that if you want this expression to succeed.
OTOH, if lldb is doing this because the debug information recorded that some part of your code imported this module, you can turn off the behavior by putting:
settings set target.auto-import-clang-modules 0
in your ~/.lldbinit and restarting your debug session.
BTW, the p command (or the expression command that p is an alias for) evaluates the text you provide as an ordinary source-language expression, using the language and context of the current frame, with as much access to symbols, defines and the like as lldb can provide. Most users also want to be able to access class information that might not be directly visible in the current frame, so it tends to cast as wide a net as possible looking for symbols and types in order to enable this.
It is a very powerful feature, but as you are seeing sometimes the desire to provide this wide access for expressions can cause conflicting definitions. And anyway, it is way more powerful than needed just to view a local variable.
lldb has another command, frame var (convenient alias v), that prints local variable values by directly accessing the memory pointed to by the debug information and presenting it using the type from the debug info. It supports a limited subset of C-like syntax for subelement reference: you can use * to dereference, . or -> to access members and, if the variable is an array, [0] and so on.
So unless you really do need to run an expression (for instance to access a computed property or call another function), v will be faster and because its implementation is simpler and more direct, it will have less chance of subtle failures than p.
If you also want to access the object definition of some ObjC or Swift local variable, the command vo or frame var -O will fetch the description of the local variable it finds using the v method.
I definitely don't have a macro that defines I to be the complex variable.
It looks like lldb is getting confused somehow rather than there being an issue with your code, but without an MRE (minimal reproducible example) it is hard to say.
The one I run in xcode, I compile with the default xcode tools. The one I run in the command line, I use gcc. Is this the difference, somehow?
xcode uses "Apple clang" (an old, custom version) with libc++ by default, as far as I know. gcc is quite different and it may not even use libc++.
Having said that, since xcode shows the variable as an integer but lldb does not, it looks like something else is going on.
Is xcode including more libraries than I ask it to?
I don't think so given the program works and Xcode shows the value as an integer.
Why is this happening and how can I prevent it?
Hard to say since it is a closed source tool. Try to make an MRE. It usually helps debugging the issue and finding workarounds.
By definition, a complex number is not simply an int.
Additionally, as mentioned, complex I is defined in <complex.h>:
To construct complex numbers you need a way to indicate the imaginary
part of a number. There is no standard notation for an imaginary
floating point constant. Instead, complex.h defines two macros that
can be used to create complex numbers.
Macro: const float complex _Complex_I
This macro is a representation of the complex number “0+1i”. Multiplying a real floating-point value by _Complex_I gives a complex number whose value is purely imaginary. You can use this to construct complex constants:
3.0 + 4.0i = 3.0 + 4.0 * _Complex_I
Note that _Complex_I * _Complex_I has the value -1, but the type of that value is complex.
_Complex_I is a bit of a mouthful. complex.h also defines a shorter name for the same constant.
Macro: const float complex I
This macro has exactly the same value as _Complex_I. Most of the time it is preferable. However, it causes problems if you want to use the identifier I for something else. You can safely write
#include <complex.h>
#undef I
Reference here for GNU implementation
Include this header file (or the equivalent from your environment); there is no need to define it yourself.
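As a small illustration of the #undef advice quoted above, the following sketch frees the identifier I for ordinary use while still building complex values through _Complex_I. The function names are hypothetical and exist only for the example.

```c
#include <complex.h>
#undef I   /* frees the identifier I; _Complex_I stays available */

/* Uses I as an ordinary int, which would not compile while the macro I is defined. */
int plain_int_I(void)
{
    int I = 56;
    return I;
}

/* Real part of (3 + 4i) squared, built with _Complex_I instead of I. */
double square_real_part(void)
{
    double complex z = 3.0 + 4.0 * _Complex_I;
    double complex w = z * z;      /* -7 + 24i */
    return (double)w;              /* converting to a real type keeps the real part */
}
```

This only affects the program's own source; whether the debugger's expression parser sees the macro is a separate matter, as the first answer explains.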

'Reverse' a collection of C preprocessor macros easily

I have a lot of preprocessor macro definitions, like this:
#define FOO 1
#define BAR 2
#define BAZ 3
In the real application, each definition corresponds to an instruction in an interpreter virtual machine. The macros are also not sequential in numbering to leave space for future instructions; there may be a #define FOO 41, then the next one is #define BAR 64.
I'm now working on a debugger for this virtual machine, and need to effectively 'reverse' these preprocessor macros. In other words, I need a function which takes the number and returns the macro name, e.g. an input of 2 returns "BAR".
Of course, I could create a function using a switch myself:
const char* instruction_by_id(int id) {
switch (id) {
case FOO:
return "FOO";
case BAR:
return "BAR";
case BAZ:
return "BAZ";
default:
return "???";
}
}
However, this will be a nightmare to maintain, since renaming, removing or adding instructions will require this function to be modified too.
Is there another macro which I can use to create a function like this for me, or is there some other approach? If not, is it possible to create a macro to perform this task?
I'm using gcc 6.3 on Windows 10.
You have the wrong approach. Read SICP if you have not read it.
I have a lot of preprocessor macro definitions, like this:
#define FOO 1
#define BAR 2
#define BAZ 3
Remember that C or C++ code can be generated, and it is quite easy to instruct your build automation tool to generate some particular C file (with GNU make or ninja you just add some rule or recipe).
For example, you could use a different preprocessor (like GPP or m4), or a script (e.g. in awk, Python or Guile), or write your own program (in C, C++, OCaml, etc.) to generate the header file containing these #define-s. Another script or program (or the same one, invoked differently) could generate the C code of instruction_by_id.
Such basic metaprogramming techniques (of generating some or several C files from something higher level but specific) have been used since at least the 1980s (e.g. with yacc or RPCGEN). The C preprocessor facilitates that with its #include directive (since you can even include lines inside some function body, etc...). Actually, the idea that code is data (and proof) and data is code is even older (Church-Turing thesis, Curry-Howard correspondence, Halting problem). The Gödel, Escher, Bach book is very entertaining....
For example, you could decide to have a textual file opcodes.txt (or even some sqlite database containing stuff....) like
# ignore lines starting with a hash sign
FOO 1
BAR 2
and have two small awk or Python scripts (or two tiny C specialized programs), one generating the #define-s (into opcode-defines.h) and another generating the body of instruction_by_id (into opcode-instr.inc). Then you need to adapt your Makefile to generate these, and put #include "opcode-defines.h" inside some global header, and have
const char* instruction_by_id(int id) {
switch (id) {
#include "opcode-instr.inc"
default: return "???";
}
}
this will be a nightmare to maintain,
Not so with such a metaprogramming approach. You'll just maintain opcodes.txt and the scripts using it, but you express a given "knowledge element" (the relation of FOO to 1) only once (in a single line of opcodes.txt). Of course you need to document that (at the very least, with comments in your Makefile).
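A rough sketch of such a tiny specialized C program, assuming the opcodes.txt format shown above (one "NAME VALUE" pair per line, lines starting with # ignored). The function names parse_opcode_line and generate are hypothetical, and the two output streams stand in for opcode-defines.h and opcode-instr.inc.

```c
#include <stdio.h>

/* Parses one "NAME VALUE" line. Returns 1 on success and 0 for
   comment lines, blank lines or anything else that does not match. */
int parse_opcode_line(const char *line, char name[64], int *value)
{
    if (line[0] == '#')
        return 0;
    return sscanf(line, "%63s %d", name, value) == 2;
}

/* Reads opcode lines from `in` and writes one #define per opcode to
   `defines` and one switch case per opcode to `cases`. */
void generate(FILE *in, FILE *defines, FILE *cases)
{
    char line[256];
    char name[64];
    int value;

    while (fgets(line, sizeof line, in)) {
        if (!parse_opcode_line(line, name, &value))
            continue;
        fprintf(defines, "#define %s %d\n", name, value);
        fprintf(cases, "case %s: return \"%s\";\n", name, name);
    }
}
```

A main that opens the three files and calls generate is all that remains; the Makefile rule then lists opcodes.txt as the prerequisite of both generated files.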
Metaprogramming from some higher-level, declarative formalization is a very powerful paradigm. In France, J. Pitrat has pioneered it since the 1960s (and he is writing an interesting blog today, while being retired). In the US, J. McCarthy and the Lisp community did so as well.
For an entertaining talk, see Liam Proven FOSDEM 2018 talk on The circuit less traveled
Large software projects use that metaprogramming approach quite often. For example, the GCC compiler has about a dozen C++ code generators (in total, they emit more than a million lines of C++).
Another way of looking at such an approach is the idea of domain-specific languages that could be compiled to C. If you use an operating system providing dynamic loading, you can even write a program emitting C code, forking a process to compile it into some plugin, then loading that plugin (on POSIX or Linux, with dlopen). Interestingly, computers are now fast enough to enable such an approach in an interactive application (in some sort of REPL): you can emit a C file of a few thousand lines, compile it into some .so shared object file, and dlopen that, in a fraction of second. You could also use JIT-compiling libraries like GCCJIT or LLVM to generate code at runtime. You could embed an interpreter (like Lua or Guile) into your program.
BTW, metaprogramming approaches are one of the reasons why basic compilation techniques should be known by most developers (and not only by people in the compiler business); another reason is that parsing problems are very common. So read the Dragon Book.
Be aware of Greenspun's tenth rule. It is much more than a joke, actually a profound truth about large software.
In a similar case I've resorted to defining a text file format that defines the instructions, and writing a program to read this file and write out the C source of the actual instruction definitions and the C source of functions like your instruction_by_id(). This way you only need to maintain the text file.
As awesome as general code generation is, I’m surprised that nobody mentioned that (if you relax your problem definition just a bit) the C preprocessor is perfectly capable of generating the necessary code, using a technique called X macros. In fact every simple bytecode VM in C that I’ve seen uses this approach.
The technique works as follows. First, there is a file (call it insns.h) containing the authoritative list of instructions,
INSN(FOO, 1)
INSN(BAR, 2)
INSN(BAZ, 3)
or alternatively a macro in some other header containing the same,
#define INSNS \
INSN(FOO, 1) \
INSN(BAR, 2) \
INSN(BAZ, 3)
whichever is more convenient for you. (I’ll use the first option in the following.) Note that INSN is not defined anywhere. (Traditionally it would be called X, thus the name of the technique.) Wherever you want to loop over your instructions, define INSN to generate the code you want, include insns.h, then undefine INSN again.
In your disassembler, write
const char *instruction_by_id(int id) {
switch (id) {
#define INSN(NAME, VALUE) \
case NAME: return #NAME;
#include "insns.h" /* or just INSNS if you use a macro */
#undef INSN
default: return "???";
}
}
using the prefix stringification operator # to turn names-as-identifiers into names-as-string-literals.
You obviously can’t define the constants this way, because macros cannot define other macros in the C preprocessor. However, if you don’t insist that the instruction constants be preprocessor constants, there’s a different perfectly serviceable constant facility in the C language: enumerations. Whether or not you use an enumerated type, the enumerators defined inside it are regular integer constants from the point of view of the compiler (though not the preprocessor—you cannot use #ifdef with them, for example). So, using an anonymous enumeration type, define your constants like this:
enum {
#define INSN(NAME, VALUE) \
NAME = VALUE,
#include "insns.h" /* or just INSNS if you use a macro */
#undef INSN
NINSNS /* C89 doesn’t allow trailing commas in enumerations (but C99+ does), and you may find this constant useful in any case */
};
If you want to statically initialize an array indexed by your bytecodes, you’ll have to use C99 designated initializers {[FOO] = foovalue, [BAR] = barvalue, /* ... */} whether or not you use X macros. However, if you don’t insist on assigning custom codes to your instructions, you can eliminate VALUE from the above and have the enumeration assign consecutive codes automatically, and then the array can be simply initialized in order, {foovalue, barvalue, /* ... */}. As a bonus, NINSNS above then becomes equal to the number of the instructions and the size of any such array, which is why I called it that.
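Putting the two pieces together, here is a compact sketch that uses the macro-list variant so that everything fits in one file. The instruction values (41, 64, 65) and the sentinel name INSN_LIMIT are made up for the illustration.

```c
/* Authoritative instruction list; INSN is deliberately left undefined here. */
#define INSNS \
    INSN(FOO, 41) \
    INSN(BAR, 64) \
    INSN(BAZ, 65)

/* First expansion: define the constants as enumerators. */
enum {
#define INSN(NAME, VALUE) NAME = VALUE,
    INSNS
#undef INSN
    INSN_LIMIT /* one past the last listed value; also avoids a trailing comma */
};

/* Second expansion: map each constant back to its name. */
const char *instruction_by_id(int id)
{
    switch (id) {
#define INSN(NAME, VALUE) case NAME: return #NAME;
    INSNS
#undef INSN
    default: return "???";
    }
}
```

Adding, renaming or removing an instruction now means touching only the INSNS list; both the enumeration and the name lookup follow automatically.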
There are more tricks you can use here. For example, if some instructions have variants for several data types, the instruction list X macro can call the type list X macro to generate the variants automatically. (The somewhat ugly second option of storing the X macro list in a large macro and not an include file may be more handy here.) The INSN macro may take additional arguments such as the mode name, which would be ignored in the code list but used to call the appropriate decoding routine in the disassembler. You can use the token pasting operator ## to add prefixes to the names of the constants, as in INSN_ ## NAME to generate INSN_FOO, INSN_BAR, etc. And so on.

For loop macro which unrolled on the pre-processor phase?

I want to use the gcc preprocessor to write almost the same declaration 500 times. Let's say, for demonstration purposes, that I would like to use a macro FOR_MACRO (pseudocode):
#define FOR_MACRO(x) \
#for i in {1 ... x}: \
const int arr_len_##i[i] = {i};
and calling FOR_MACRO(100) will be converted into:
const int arr_len_1[1] = {1};
const int arr_len_2[2] = {2};
...
const int arr_len_100[100] = {100};
This is not a good idea:
While possible in principle, using the preprocessor means you have to manually unroll the loop at least once, you end up with some arbitrary implementation-defined limit on loop depth and all statements will be generated in a single line.
Better use the scripting language of your choice to generate the code (possibly in a separate includeable file) and integrate that with your build process.
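The question is about C, so here is a minimal sketch of such a generator written in C itself rather than in a scripting language. The function names format_decl and emit_decls are hypothetical; the declaration text follows the expected expansion shown in the question.

```c
#include <stdio.h>

/* Formats the declaration for index i into buf and returns the number
   of characters written (the result of snprintf). */
int format_decl(char *buf, size_t size, int i)
{
    return snprintf(buf, size, "const int arr_len_%d[%d] = {%d};", i, i, i);
}

/* Writes the declarations for 1..count to out, one per line. */
void emit_decls(FILE *out, int count)
{
    char line[64];

    for (int i = 1; i <= count; ++i) {
        format_decl(line, sizeof line, i);
        fprintf(out, "%s\n", line);
    }
}
```

Running emit_decls(stdout, 500) from a small main as a build step and redirecting the output to an include file keeps the generated declarations readable, one per line, and avoids any preprocessor iteration limit.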
You can use Order-PP for this, if you desperately need to.
It's a scripting language implemented in the preprocessor. This means it's conceptually similar to using a scripting language to generate C code (in fact, the same) except there are no external tools and the script runs at the same time as the C compiler: everything is done with C macros. Despite being built on the preprocessor, there are no real limits to loop iterations, recursion depth, or anything like that (the limit is somewhere in the billions, you don't need to worry about it).
To emit the code requested in the question example, you could write:
#include <order/interpreter.h>
ORDER_PP( // runs Order code
8for_each_in_range(8fn(8I,
8print( 8cat(8(const int arr_len_), 8I)
([) 8I (] = {) 8I (};) )),
1, 101)
)
I can't fathom why you would do this instead of simply integrating an external language like Python into your build process (Order might be implemented using macros, but it's still a separate language to understand), but the option is there.
Order only works with GCC as far as I know; other preprocessors run out of stack too quickly (even Clang), or are not perfectly standard-compliant.
Instead of providing you with a solution for exactly your problem, are you sure it cannot be handled in a better way?
Maybe it would be better to
use one array with one more dimension
fill the data with the help of an array at runtime, as you obviously want to fill out the first entry of each array. If you leave the array uninitialized, it will (provided it is defined on module level) be put into .bss segment instead of .data and will probably need less space in the binary file.
You could use e.g. P99 to do such preprocessor code unrolling. But because of the limited capabilities of the preprocessor this comes with a limit, and that limit is normally way below 500.

How could I make a constant in C except using a number

I am working on a C math library that uses macros to do most of its work, and I am now facing a problem.
This is what the macro looks like:
the_macro(a, b, c)
and the macro itself does something like:
(a - b > 0) ? error_function : 1
The error_function is used to stop the user at compile time: if (a - b > 0) is true, the macro expands to a function which does not have a definition, so this causes a linkage error.
Everything seemed fine, but today my boss told me we need to do some unit tests, so I wrote a function which wraps the macro:
int my_func(int a, int b, int c)
{
return the_macro(a, b, c);
}
Here comes the problem: the code fails to link. If I call the_macro with variables instead of constants, the references to error_function end up in the .o file, because int a, int b and int c are only known at runtime. So I can only call the macro with constants, as in the_macro(2, 3, 4). Is there any way to avoid this? Or is there a better way to unit-test this macro?
EDIT:
The code I'm working on is confidential... but I made an example which demonstrates the problem:
#include <stdio.h>
#define the_macro(a, b)\
(a > b)?error_function():1
// Comment out my_func() and the program will run normally.
// If you don't comment it out, the linkage error appears.
void my_func(int a, int b)
{
the_macro(a, b);
}
int main()
{
printf("%d\n", the_macro(1, 10));
return 0;
}
I'm using gcc-4
Regardless of where you use the macro, if error_function is not declared, you should get a compiler error. If it is declared but not defined, you have undefined behavior. Whether the arguments to the macro are constants or not changes nothing in this respect. (It may affect what the actual behavior is in the case of undefined behavior.)
When you call the macro with constants, the compiler knows the values and thus, perhaps as an optimization, an expression such as the_macro(4, 5, 0) gets replaced by 1, so no reference to error_function remains. When a - b evaluates to a value greater than 0, the compiler reduces the expansion to error_function instead, and your build stops with a link error.
On the other hand, when you use variables, the compiler doesn't know the result of the expression and has to use the full expansion of the macro, which contains a call to undefined function, and hence you get the linkage error.
For the purposes of your unit tests (only) why not define error_function() as part of your unit test and have it return an error unconditionally that your test framework can detect. That way you should be able to mimic the behaviour you're seeing at compile time using either constants or variables.
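A sketch of that idea, based on the two-argument example from the question. It assumes error_function takes no arguments and returns int (the question never declares it), and the counter error_calls is a hypothetical addition that lets a test detect that the error path was taken.

```c
/* Declaration as the library would see it; production code has no definition. */
int error_function(void);

#define the_macro(a, b) ((a) > (b) ? error_function() : 1)

/* Test-only definition: records the call and returns a sentinel value. */
int error_calls = 0;

int error_function(void)
{
    ++error_calls;
    return -1;
}

/* Wrapper under test; with the definition above it links even though
   a and b are only known at run time. */
int my_func(int a, int b)
{
    return the_macro(a, b);
}
```

A test can then call my_func with both passing and failing arguments and check the return value or error_calls. The test-only definition must stay out of the production build, otherwise the link-time check disappears there as well.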
It's not exactly what you want, but unit test frameworks are always, by their nature run-time testing mechanisms, so an automated compile time test is probably not going to be possible.
Alternatively, you could use system() to run a command line build including your library, redirect the output, including errors into a file. You could then open the file and scan for known text of the linkage error.
Let's see if I understand this correctly:
You want a way to break compilation if a-b>0? This is actually impossible unless you use C11. There simply is no way to have the compiler abort depending on a condition. In your case you are trying to use a combination of the optimizer and the linker to get the desired behavior. But this cannot work reliably.
The expression (a - b > 0) ? error_function : 1 may be reduced by the optimizer to 1 if it can prove that a - b > 0 is false, but this is not guaranteed. The C standard defines the behavior a compiler has to provide, and it does not mention an optimizer. The same optimizer may sometimes reduce the expression and sometimes not, depending on other things in your code. Or it may or may not reduce it depending on the command line flags you are passing.
So with using this macro you are writing code, which may suddenly break unexpectedly when you switch compiler, compiler version, operating system, add or remove linked libraries or target architecture. Code that suddenly breaks depending on such changes is very bad. Don't do this to your fellow developers.
Better to write portable code for which you can be sure that future compilers will understand it because it follows the standard. Before C11 there is no standard way to do this. If you really need this, tell your boss the only way is to use C11, which has the _Static_assert keyword (also available as static_assert through <assert.h>) that gives you a conditional abort of the compilation.
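A minimal sketch of the C11 route. The macro name CHECK_NOT_GREATER and the message text are hypothetical; the condition mirrors the two-argument example from the question. Note that the arguments must be integer constant expressions, so this form cannot be wrapped in a function that takes run-time values.

```c
#include <assert.h>  /* in C11 this provides static_assert as a macro for _Static_assert */

/* Stops compilation with the given message when a > b. */
#define CHECK_NOT_GREATER(a, b) \
    static_assert((a) <= (b), "first argument must not exceed the second")

CHECK_NOT_GREATER(1, 10);            /* fine at file scope */

int checked_value(void)
{
    CHECK_NOT_GREATER(2, 3);         /* fine at block scope as well */
    /* CHECK_NOT_GREATER(3, 2);         would be rejected by the compiler */
    return 1;
}
```

Unlike the undefined-function trick, the failure here is a diagnostic required by the standard, so it does not depend on optimization level or linker behavior.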

Resources