Preprocessor constant folding - c

I have a fundamental question regarding the C preprocessor constant evaluation and I would like some help in understanding if the preprocessor helps in optimizing the code in a situation like this. I understand that the preprocessor merely "replaces text" in the code. By that rule, even constant expressions get replaced in the code. For instance, the below code:
#include <stdio.h>
#define MY_NUM 2+2*2
int main()
{
int a = 6/MY_NUM;
printf("a: %d\n", a);
return 0;
}
The value of a comes to 7. This is because the preprocessed code looks something like this:
int main()
{
int a = 6/2+2*2;
printf("a: %d\n", a);
return 0;
}
I can see that MY_NUM did not evaluate to 6 before the compilation kicks in. Of course the compiler then optimizes the code by evaluating the value of a at compile time.
I am not sure whether preprocessor constant folding happens, or whether it is even possible. Is there any way (such as a gcc flag) to enable it? The regular -O optimizations do not enable this. Is there any way we could change the behavior of the preprocessor here?
I am using gcc 4.8.4 for my code.

No, the only time the preprocessor evaluates any expression is in #if / #elif.
You can fake arithmetic by implementing it in terms of token concatenation and a ton of macros, but that's much harder than simply doing
#define MY_NUM (2+2*2)
But there's no simple compiler switch because macro expansion is just token replacement.
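As a minimal sketch of the difference the parentheses make, the two macros below carry the same tokens with and without grouping. The names MY_NUM_BARE, MY_NUM_PAREN, divide_bare, divide_paren and check_macros are made up for this illustration and are not from the question. The #if line shows the one place where the preprocessor itself does the arithmetic.

```c
#include <assert.h>

#define MY_NUM_BARE   2+2*2     /* pasted into the code as raw tokens */
#define MY_NUM_PAREN  (2+2*2)   /* parentheses keep the expression grouped */

/* #if / #elif is the only place the preprocessor evaluates an expression. */
#if MY_NUM_PAREN != 6
#error "MY_NUM_PAREN should evaluate to 6 inside #if"
#endif

/* Expands to 6/2+2*2, which the compiler parses as (6/2)+(2*2). */
int divide_bare(void)  { return 6 / MY_NUM_BARE; }

/* Expands to 6/(2+2*2), which is 6/6. */
int divide_paren(void) { return 6 / MY_NUM_PAREN; }

/* Small self-check of the two expansions. */
void check_macros(void)
{
    assert(divide_bare() == 7);
    assert(divide_paren() == 1);
}
```

In both cases the compiler, not the preprocessor, folds the resulting constant expression.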

Related

Why are statements with no effect considered legal in C?

Pardon if this question is naive. Consider the following program:
#include <stdio.h>
int main() {
int i = 1;
i = i + 2;
5;
i;
printf("i: %d\n", i);
}
In the above example, the statements 5; and i; seem totally superfluous, yet the code compiles without warnings or errors by default (however, gcc does emit a "statement with no effect [-Wunused-value]" warning when run with -Wall). They have no effect on the rest of the program, so why are they considered valid statements in the first place? Does the compiler simply ignore them? Are there any benefits to allowing such statements?
One benefit to allowing such statements is from code that's created by macros or other programs, rather than being written by humans.
As an example, imagine a function int do_stuff(void) that is supposed to return 0 on success or -1 on failure. It could be that support for "stuff" is optional, and so you could have a header file that does
#if STUFF_SUPPORTED
#define do_stuff() really_do_stuff()
#else
#define do_stuff() (-1)
#endif
Now imagine some code that wants to do stuff if possible, but may or may not really care whether it succeeds or fails:
void func1(void) {
if (do_stuff() == -1) {
printf("stuff did not work\n");
}
}
void func2(void) {
do_stuff(); // don't care if it works or not
more_stuff();
}
When STUFF_SUPPORTED is 0, the preprocessor will expand the call in func2 to a statement that just reads
(-1);
and so the compiler pass will see just the sort of "superfluous" statement that seems to bother you. Yet what else can one do? If you #define do_stuff() // nothing, then the code in func1 will break. (And you'll still have an empty statement in func2 that just reads ;, which is perhaps even more superfluous.) On the other hand, if you have to actually define a do_stuff() function that returns -1, you may incur the cost of a function call for no good reason.
An expression statement in C is an expression terminated by a semicolon.
An expression is a combination of variables, constants and operators. Evaluating it usually yields a value of some type, and an expression statement simply discards that value.
Having said that, a compiler is free to discard statements such as 5; and i; because they have no effect.
Statements with no effect are permitted because it would be more difficult to ban them than to permit them. This was more relevant when C was first designed and compilers were smaller and simpler.
An expression statement consists of an expression followed by a semicolon. Its behavior is to evaluate the expression and discard the result (if any). Normally the purpose is that the evaluation of the expression has side effects, but it's not always easy or even possible to determine whether a given expression has side effects.
For example, a function call is an expression, so a function call followed by a semicolon is a statement. Does this statement have any side effects?
some_function();
It's impossible to tell without seeing the implementation of some_function.
How about this?
obj;
Probably not -- but if obj is defined as volatile, then it does.
Permitting any expression to be made into an expression-statement by adding a semicolon makes the language definition simpler. Requiring the expression to have side effects would add complexity to the language definition and to the compiler. C is built on a consistent set of rules (function calls are expressions, assignments are expressions, an expression followed by a semicolon is a statement) and lets programmers do what they want without preventing them from doing things that may or may not make sense.
The statements you listed with no effect are examples of an expression statement, whose syntax is given in section 6.8.3p1 of the C standard as follows:
expression-statement:
expression_opt ;
All of section 6.5 is dedicated to the definition of an expression, but loosely speaking an expression consists of constants and identifiers linked with operators. Notably, an expression may or may not contain an assignment operator and it may or may not contain a function call.
So any expression followed by a semicolon qualifies as an expression statement. In fact, each of these lines from your code is an example of an expression statement:
i = i + 2;
5;
i;
printf("i: %d\n", i);
Some operators contain side effects such as the set of assignment operators and the pre/post increment/decrement operators, and the function call operator () may have a side effect depending on what the function in question does. There is no requirement however that one of the operators must have a side effect.
Here's another example:
atoi("1");
This calls a function and discards the result, just like the printf call in your example, but unlike printf, the function call itself does not have a side effect.
Sometimes such statements are very handy:
int foo(int x, int y, int z)
{
(void)y; //prevents warning
(void)z;
return x*x;
}
Or when the reference manual tells us to just read a register to achieve something, for example to clear or set some flag (a very common situation in the microcontroller world):
#include <stdint.h>
#define SREG ((volatile uint32_t *)0x4000000)
#define DREG ((volatile uint32_t *)0x4004000)
void readSREG(void)
{
*SREG; //we read it here
*DREG; // and here
}
https://godbolt.org/z/6wjh_5

C macros using enum

I am trying to use #if directives to select the right code based on the type of operation, so I made a very simple example similar to what I am trying to do:
#include <stdio.h>
enum{ADD,SUB,MUL};
#define operation ADD
int main()
{
int a = 4;
int b = 2;
int c;
#if (operation == ADD)
c = a+b;
#endif
#if (operation == SUB)
c = a-b;
#endif
#if (operation == MUL)
c = a*b;
#endif
printf("result = %i",c);
return 0;
}
But unfortunately that does not work: I get result = 8. If I replace the operation names with numbers it works fine. But I want it to work as it is described above.
Any help?
The preprocessor is a step that is (in a way) done before the actual compiler sees the code. Therefore it has no idea about enumerations or their values, as they are set during compilation which happens after preprocessing.
You simply can't use preprocessor conditional compilation using enumerations.
The preprocessor replaces every identifier that is not a macro with 0, so it will always consider this true:
#if IDENT == IDENT
It can only test for numeric values.
Simplify your code and feed it to the preprocessor:
enum {ADD,SUB,MUL};
#define operation ADD
int main()
{
(operation == ADD);
}
The result of the preprocessor output is:
enum {ADD,SUB,MUL};
int main()
{
(ADD == ADD);
}
As you see, the enumeration constant hasn't been evaluated. In the #if directive, each such unknown identifier is simply treated as 0, so all three comparisons come out true.
So a workaround would be to replace your enumeration with a series of #define:
#define ADD 1
#define SUB 2
#define MUL 3
Like this it works. The preprocessor output is now:
int main()
{
int a = 4;
int b = 2;
int c;
c = a+b;
# 28 "test.c"
printf("result = %i",c);
return 0;
}
The solution is to either:
rely 100% on the preprocessor (as the solution above suggests), or
rely 100% on the compiler (use enums and real if statements).
As others have said, the preprocessor performs its transformations at a very early phase in compilation, before enum values are known. So you can't do this test in #if.
However, you can just use an ordinary if statement. Any decent compiler with optimization enabled will detect that you're comparing constants, perform the tests at compile time, and throw out the code that will never be executed. So you'll get the same result that you were trying to achieve with #if.
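A minimal sketch of that approach, reusing the enum and the operation macro from the question; the function names compute and check_compute are made up for illustration. Every condition compares two constants, so an optimizing compiler is free to drop the branches that can never run, although that is an optimization and not something the language guarantees.

```c
#include <assert.h>

enum { ADD, SUB, MUL };
#define operation ADD

/* Ordinary if statements: here the compiler knows the enum values. */
int compute(int a, int b)
{
    if (operation == ADD)
        return a + b;
    if (operation == SUB)
        return a - b;
    return a * b;   /* remaining case: operation == MUL */
}

/* With operation defined as ADD, 4 and 2 give 6. */
void check_compute(void)
{
    assert(compute(4, 2) == 6);
}
```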
But i want it to work as it is described above.
You seem to mean that you want the preprocessor to recognize the enum constants as such, and to evaluate the == expressions in that light. I'm afraid you're out of luck.
The preprocessor knows nothing about enums. It operates on a mostly-raw stream of tokens and whitespace. When it evaluates a directive such as
#if (operation == SUB)
it first performs macro expansion to produce
#if (ADD == SUB)
Then it must somehow convert the tokens ADD and SUB to numbers, but, again, it knows nothing about enums or the C significance of the preceding code. Its rule for interpreting such symbols as numbers is simple: it replaces each with 0. The result is that all three preprocessor conditionals in your code will always evaluate to true.
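The replace-with-0 rule can be observed directly. In this sketch the macro ENUM_COMPARE_IS_TRUE and the functions preprocessor_view, compiler_view and check_views are hypothetical names introduced only to record what each stage sees.

```c
#include <assert.h>

enum { ADD, SUB, MUL };   /* invisible to the preprocessor */

/* Neither ADD nor SUB is a macro, so #if evaluates 0 == 0. */
#if ADD == SUB
#define ENUM_COMPARE_IS_TRUE 1
#else
#define ENUM_COMPARE_IS_TRUE 0
#endif

/* What the preprocessor decided at #if time. */
int preprocessor_view(void) { return ENUM_COMPARE_IS_TRUE; }

/* What the compiler sees: the enum constants 0 and 1. */
int compiler_view(void) { return ADD == SUB; }

void check_views(void)
{
    assert(preprocessor_view() == 1);
    assert(compiler_view() == 0);
}
```

With gcc, the -Wundef option warns about identifiers that are evaluated as 0 inside #if, which helps catch this mistake.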
If you want the preprocessor to do this then you need to define the symbols to the preprocessor. Since you're not otherwise using the enum, you might as well just replace it altogether with
#define ADD 1
#define SUB 2
#define MUL 3
If you want the enum, too, then just use different symbols with the preprocessor than you use for the enum constants. You can use the same or different values, as you like, because never the twain shall meet.
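One possible layout, as a sketch: the OP_ prefixed macros, the OPERATION macro, the enum tag op and the function apply are all hypothetical names chosen for this example. The macros drive #if, while the enum constants remain available to ordinary C code.

```c
#include <assert.h>

/* Symbols for the preprocessor. */
#define OP_ADD 1
#define OP_SUB 2
#define OP_MUL 3
#define OPERATION OP_SUB

/* Separate symbols for the compiler; here they reuse the same values. */
enum op { ADD = OP_ADD, SUB = OP_SUB, MUL = OP_MUL };

int apply(int a, int b)
{
#if OPERATION == OP_ADD
    return a + b;
#elif OPERATION == OP_SUB
    return a - b;
#elif OPERATION == OP_MUL
    return a * b;
#else
#error "unknown OPERATION"
#endif
}

/* With OPERATION defined as OP_SUB, 4 and 2 give 2. */
void check_apply(void)
{
    assert(apply(4, 2) == 2);
    assert(SUB == OP_SUB);
}
```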
Another solution would be to have the enum in an included header file.

C Stringize result of equation

I have read lots on stringizing macros, but I obviously don't quite understand. I wish to make a string where the argument to the macro needs to be evaluated first. Can someone please explain where I am going wrong, or perhaps how to do this better?
#define SDDISK 2 // Note: defined in a library file elsewhere (i.e. not a constant I know)
#define DRIVE_STR(d) #d ":/"
#define xDRIVE_STR(x) DRIVE_STR(x)
#define FILEPATH(f) xDRIVE_STR(SDDISK + '0') #f
const char file[] = FILEPATH(test.log);
void main(void)
{
DebugPrint(file);
}
The output is: "2 + '0':/test.log",
But I want "2:/test.log"
The C PREprocessor runs before the compiler ever sees the code.
This means that the equation will not be evaluated before it is stringified; instead, the preprocessor will just stringize the whole equation.
In your case just removing the +'0' will solve the problem as the value of SDDISK does not need casting to a char before it is stringified.
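A minimal sketch of that fix, assuming SDDISK really is defined as a plain integer literal. The helper macros STR and XSTR and the function check_path are hypothetical names, not from the question. The extra level of macro (XSTR) lets SDDISK expand to 2 before # turns it into a string.

```c
#include <assert.h>
#include <string.h>

#define SDDISK 2              /* assumed to be a plain integer literal */

#define STR(x)  #x            /* stringizes its argument as written */
#define XSTR(x) STR(x)        /* expands the argument first, then stringizes */

/* Adjacent string literals are concatenated by the compiler. */
#define FILEPATH(f) XSTR(SDDISK) ":/" #f

static const char file[] = FILEPATH(test.log);

void check_path(void)
{
    assert(strcmp(file, "2:/test.log") == 0);
}
```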
However, should you actually need to perform a calculation before stringizing you should either:
Use C++'s constexpr.
Complain to your compiler vendor that a constant expression was not optimized.
Use a preprocessor library to gain the wanted behaviour.

Preprocessor #if directive

I am writing a large program and I don't want it all to be in my main.c, so I wrote a .inc file that has an if-else statement with a function, and I was wondering whether it can be written like this:
#if var==1
process(int a)
{
printf("Result is: %d",2*a);
}
#else
process(int a)
{
printf("Result is: %d",10*a);
}
#endif
I tried to compile it, but it gives me errors, or in the best case it just uses the first process function without checking the variable var (it is set to 0).
The preprocessor doesn't "know" the value of any variable, because it does its work even before compilation, not at runtime.
In the condition of a preprocessor #if you can only evaluate #define'd symbols and constant expressions.
The particular example you are showing can be simply converted to:
printf("Result is: %d", (var == 1 ? 2: 10) * a);
Just for completeness: for a standard-conforming compiler, your #if would always be valid. In #if expression evaluation, all identifiers that are not known to the preprocessor are simply replaced with 0 (or false if you want). So in your particular case, if var is just a variable and not a macro, the result would always be the second version of your function.
For the error that you report for MS: I did know that the MS compilers aren't standard conforming, but I wasn't aware that they don't even fulfill such basic language requirements.
This is what you want:
void process(int a)
{
if (var == 1)
printf("Result is: %d",2*a);
else
printf("Result is: %d",10*a);
}
It is important to remember that the preprocessor is its own program and not a part of the program you are writing. The variable "var" (or whatever var represents here) is not in the namespace of the preprocessor's identifiers.
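If the choice really has to be made before compilation, the value must be something the preprocessor can see, that is, a macro. The sketch below assumes a hypothetical macro VAR in place of the variable, and makes process return the computed value (an addition for illustration) so the selected branch can be checked.

```c
#include <assert.h>
#include <stdio.h>

#define VAR 1   /* hypothetical macro standing in for the variable var */

#if VAR == 1
int process(int a)
{
    int result = 2 * a;
    printf("Result is: %d\n", result);
    return result;
}
#else
int process(int a)
{
    int result = 10 * a;
    printf("Result is: %d\n", result);
    return result;
}
#endif

/* With VAR defined as 1, only the first definition is compiled. */
void check_process(void)
{
    assert(process(3) == 6);
}
```

Such a macro could also be supplied on the command line, for example with gcc -DVAR=1, instead of being defined in the source.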

gcc dump resolved defines

I have code like
#define ONE 1
#define TWO 2
#define SUM (ONE+TWO)
How do I dump SUM as "3", the resolved value, in gcc 4.3+?
gcc -dM -E foo.h only seems to dump it as is. How do I get the actual value like it's inserted on compilation stage?
You can't. As far as the compiler is concerned, the line printf("%d\n", SUM) is, after preprocessing, indistinguishable from the line printf("%d\n", (1+2)). The compiler just happens to perform an optimization called constant folding, even at the default optimization level (-O0), which turns the expression into the constant 3 at compile time rather than at runtime.
There's not really a good way to see the output of these optimizations. You could use the -S option to view the generated assembly code and see what that looks like, but if your program is anything larger than a toy, that will be a lot of manual effort. You could also look at the parse tree by using one of the -fdump-tree options (see the GCC man page).
You can't "dump" SUM as 3 because SUM isn't 3 in any meaningful sense, it's just a sequence of the three tokens ONE, + and TWO. What it turns into depends on the context where it is expanded.
Macros are expanded where they appear in the source; until then, macro replacements are just sequences of tokens.
You can test this like this.
#include <stdio.h>
#define ONE 1
#define TWO 2
#define SUM ONE+TWO
int a = SUM;
#undef ONE
#define ONE 2
int b = SUM;
int main()
{
printf("a = %d\nb = %d\n", a, b);
return 0;
}
Here's another example:
#include <stdio.h>
#define ONE 1
#define TWO 2
#define SUM ONE+TWO
int main()
{
/* prints 6, not 2 */
printf("5 - SUM = %d\n", 5 - SUM);
return 0;
}
With this example there's no way you can justify SUM being 3.
Contrary to the other answers, there's definitely a solution to this problem, especially with gcc extensions. Parse the output of gcc -dM and generate a C file containing lines of the form __typeof__(MACRO) foo = MACRO; and go from there. Without __typeof__ you could still handle all arithmetic types fairly well by just using long double.
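A hand-written sketch of what one entry of such a generated file might look like for the macros from the question. In practice the list of macros would come from parsing the gcc -dM -E output; the names sum_value, dump_sum and check_sum are hypothetical. __typeof__ is a GCC extension, so this is not portable ISO C.

```c
#include <assert.h>
#include <stdio.h>

#define ONE 1
#define TWO 2
#define SUM (ONE+TWO)

/* One generated line per macro: the compiler evaluates the expansion. */
static const __typeof__(SUM) sum_value = SUM;

/* Prints the resolved value; the cast to long double covers arithmetic types. */
void dump_sum(void)
{
    printf("SUM = %Lg\n", (long double)sum_value);
}

void check_sum(void)
{
    assert(sum_value == 3);
}
```

The value printed is what SUM expands to at that point in the file, so it reflects the macro definitions in effect there and nothing more general.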
One way is to stop at the preprocessor stage (-E) and examine that output.

Resources