Use of ({ ... }) brackets in macros to swallow the semicolon


Often, in macros, you will see people use a do { ... } while(0) to swallow the semicolon. I just came across an example where they use ({ ... }) instead, and it seems to not only swallow the semicolon, but seems to allow you to return a value as well:
#define NEW_MACRO() ({ int x = 1; int y = 2; x+y; })
if(1)
val = NEW_MACRO();
else
printf("this never prints");
val would come out being 3. I can't find any documentation on it, so I'm a bit wary of it. Are there any gotchas with this method?

This is not valid in standard C.
Some compilers may have extensions (e.g. GCC's statement expressions) that allow this sort of thing.

As Oli correctly said, this was invented by GCC. The goal is (often together with their typeof extension) to be able to evaluate macro arguments only once and to use the computed value later on by name.
Many times such a use can be completely avoided by using inline functions. These also have the (dis)advantage of being more strict on types.
In some other cases where you just need a temporary variable whose address you pass to a function, C99 also has compound literals that can be used for this.
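As a rough sketch of the alternatives mentioned above (the names max_int, MAX_ONCE, next_value, sum, calls_for_inline_max and sum_of_literal are made up for illustration): an inline function evaluates each argument exactly once in standard C, a GNU statement expression does the same but is only accepted by GCC-compatible compilers, and a C99 compound literal supplies a temporary whose address can be passed to a function.

```c
#include <stddef.h>

/* Standard C: an inline function evaluates each argument exactly once. */
static inline int max_int(int a, int b)
{
    return a > b ? a : b;
}

#if defined(__GNUC__)
/* GNU extension (statement expression plus __typeof__), not standard C. */
#define MAX_ONCE(a, b) \
    ({ __typeof__(a) a_ = (a); __typeof__(b) b_ = (b); a_ > b_ ? a_ : b_; })
#endif

static int calls = 0;

/* Hypothetical function with a side effect, to make evaluation counts visible. */
static int next_value(void)
{
    return ++calls;
}

/* Sums n ints; used to show a compound literal as a temporary array. */
static int sum(const int *v, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Returns how many times next_value() ran for one max_int() call. */
int calls_for_inline_max(void)
{
    calls = 0;
    (void)max_int(next_value(), 0);
    return calls;
}

/* C99 compound literal: an unnamed array whose address is passed to sum(). */
int sum_of_literal(void)
{
    return sum((int[]){1, 2, 3}, 3);
}
```

Where portability matters, the inline function and the compound literal are the forms that do not depend on a compiler extension.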

do { ... } while(0) is not really for "swallowing the semicolon". It is for turning a block of several statements into a single statement that still needs a terminating semicolon.

Related

Calling function with macro in C

I have function in C with 3 params. I want to call that function with pre-defined values. Is there any solution?
My idea which does not work.
#define PRE_DEFINED_MACRO (10, 15 , true)
void myFunc(uint8_t temp, uint32_t value, bool valid)
{
....
}
....
//call the function like that
myFunc(PRE_DEFINED_MACRO);

Your macro already includes parentheses, so when this ...
myFunc(PRE_DEFINED_MACRO);
... is expanded, the result is:
myFunc((10, 15 , true));
You could do this instead:
myFunc PRE_DEFINED_MACRO;
I find that pretty confusing, however. Personally, I would prefer to remove the parentheses from the macro's replacement text:
#define MYFUNC_ARGS 10, 15 , true
// ...
myFunc(MYFUNC_ARGS);
That makes it much clearer that the call to myFunc() is, in fact, a function call. The choice of macro name also makes the intended purpose of the macro much clearer.

To just replace 3 arguments with a macro:
#define PRE_DEFINED_MACRO 10, 15, true
You can't keep the parentheses in the macro, because the call would expand to myFunc((10, 15, true)); and the compiler reads that as "the first parameter is obtained from a chained series of comma operators and the rest of the parameters are missing". See also: What does the comma operator , do?
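To make the difference concrete, here is a small sketch. The myFunc below is a hypothetical stand-in that returns a value (the original is void) so that the effect of the call can be observed; call_with_macro and parenthesized_value are likewise made-up names.

```c
#include <stdbool.h>
#include <stdint.h>

#define MYFUNC_ARGS 10, 15, true

/* Hypothetical stand-in for myFunc(): combines its arguments so the call is observable. */
static uint32_t myFunc(uint8_t temp, uint32_t value, bool valid)
{
    return valid ? temp + value : 0u;
}

/* Expands to myFunc(10, 15, true): three separate arguments. */
uint32_t call_with_macro(void)
{
    return myFunc(MYFUNC_ARGS);
}

/* With the extra parentheses the commas become comma operators, and the
   whole expression has the value of its last operand. */
int parenthesized_value(void)
{
    int v = (10, 15, true);
    return v;
}
```

A compiler with warnings enabled will typically point out that the left-hand operands of the comma operators in parenthesized_value have no effect, which is a useful hint that the parentheses changed the meaning.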
As for always giving a certain function the same default arguments and optionally omit some arguments (like in C++), it's a bit more intricate. See C default arguments, including this late answer that I posted there recently myself.

Another, not uncommon solution would be to use a function-like macro right away, e.g.
#define MY_FUNC_DEFAULTS() myFunc(10, 15, true)
This also looks fairly clean, retains the function syntax etc. Note that the macro expansion does not end with a semicolon so that it can (and must) be provided at the call site in the natural fashion: if(cond) MY_FUNC_DEFAULTS(); else g(); works as expected.
Because you originally asked about C++: I find some of the OCD types here a bit annoying. All preprocessor solutions provided in the answers here work equally well in both languages. Nonetheless, the preprocessor is text replacement and as such inherently unsafe. Therefore, C++ provides means to avoid using it. In this case one could provide default arguments in the function declaration:
void myFunc(uint8_t temp = 10, uint32_t value = 15, bool valid = true);
This works much like overloading: the one function can now be called with 3, 2, 1 or 0 arguments, and the compiler fills in the defaults for any that are omitted. All of the calls
myFunc();
myFunc(1);
myFunc(1,2);
myFunc(1,2,false);
are now possible.
Such declarations could be present in different translation units with different default values. The default values are defined at the caller side and are not part of the function signature. (Whether that would be recommended is debatable though.)

Why are statements with no effect considered legal in C?

Pardon if this question is naive. Consider the following program:
#include <stdio.h>
int main() {
int i = 1;
i = i + 2;
5;
i;
printf("i: %d\n", i);
}
In the above example, the statements 5; and i; seem totally superfluous, yet the code compiles without warnings or errors by default (however, gcc does emit a warning: statement with no effect [-Wunused-value] when run with -Wall). They have no effect on the rest of the program, so why are they considered valid statements in the first place? Does the compiler simply ignore them? Are there any benefits to allowing such statements?

One benefit to allowing such statements is from code that's created by macros or other programs, rather than being written by humans.
As an example, imagine a function int do_stuff(void) that is supposed to return 0 on success or -1 on failure. It could be that support for "stuff" is optional, and so you could have a header file that does
#if STUFF_SUPPORTED
#define do_stuff() really_do_stuff()
#else
#define do_stuff() (-1)
#endif
Now imagine some code that wants to do stuff if possible, but may or may not really care whether it succeeds or fails:
void func1(void) {
if (do_stuff() == -1) {
printf("stuff did not work\n");
}
}
void func2(void) {
do_stuff(); // don't care if it works or not
more_stuff();
}
When STUFF_SUPPORTED is 0, the preprocessor will expand the call in func2 to a statement that just reads
(-1);
and so the compiler pass will see just the sort of "superfluous" statement that seems to bother you. Yet what else can one do? If you #define do_stuff() // nothing, then the code in func1 will break. (And you'll still have an empty statement in func2 that just reads ;, which is perhaps even more superfluous.) On the other hand, if you have to actually define a do_stuff() function that returns -1, you may incur the cost of a function call for no good reason.
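A compilable sketch of the unsupported case, assuming STUFF_SUPPORTED is defined as 0. The return values of func1 and func2 are added here only so the behavior can be checked; they are not part of the example above.

```c
#define STUFF_SUPPORTED 0

#if STUFF_SUPPORTED
int really_do_stuff(void);
#define do_stuff() really_do_stuff()
#else
#define do_stuff() (-1)
#endif

/* Cares about the result: the macro must still expand to an expression. */
int func1(void)
{
    if (do_stuff() == -1) {
        return 1;   /* "stuff did not work" */
    }
    return 0;
}

/* Ignores the result: expands to the no-effect statement (-1); */
int func2(void)
{
    do_stuff();
    return 42;      /* arbitrary value showing execution continues normally */
}
```

With -Wall, gcc may warn about the statement in func2, but both functions compile from the same macro definition.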

Simple Statements in C are terminated by semicolon.
Simple Statements in C are expressions. An expression is a combination of variables, constants and operators. Every expression results in some value of a certain type that can be assigned to a variable.
Having said that, some "smart compilers" might discard the 5; and i; statements.

Statements with no effect are permitted because it would be more difficult to ban them than to permit them. This was more relevant when C was first designed and compilers were smaller and simpler.
An expression statement consists of an expression followed by a semicolon. Its behavior is to evaluate the expression and discard the result (if any). Normally the purpose is that the evaluation of the expression has side effects, but it's not always easy or even possible to determine whether a given expression has side effects.
For example, a function call is an expression, so a function call followed by a semicolon is a statement. Does this statement have any side effects?
some_function();
It's impossible to tell without seeing the implementation of some_function.
How about this?
obj;
Probably not -- but if obj is defined as volatile, then it does.
Permitting any expression to be made into an expression-statement by adding a semicolon makes the language definition simpler. Requiring the expression to have side effects would add complexity to the language definition and to the compiler. C is built on a consistent set of rules (function calls are expressions, assignments are expressions, an expression followed by a semicolon is a statement) and lets programmers do what they want without preventing them from doing things that may or may not make sense.

The statements you listed with no effect are examples of an expression statement, whose syntax is given in section 6.8.3p1 of the C standard as follows:
expression-statement:
expression_opt ;
All of section 6.5 is dedicated to the definition of an expression, but loosely speaking an expression consists of constants and identifiers linked with operators. Notably, an expression may or may not contain an assignment operator and it may or may not contain a function call.
So any expression followed by a semicolon qualifies as an expression statement. In fact, each of these lines from your code is an example of an expression statement:
i = i + 2;
5;
i;
printf("i: %d\n", i);
Some operators contain side effects such as the set of assignment operators and the pre/post increment/decrement operators, and the function call operator () may have a side effect depending on what the function in question does. There is no requirement however that one of the operators must have a side effect.
Here's another example:
atoi("1");
This calls a function and discards the result, just like the call to printf in your example, but unlike printf, the function call itself does not have a side effect.

Sometimes such statements are very handy:
int foo(int x, int y, int z)
{
(void)y; //prevents warning
(void)z;
return x*x;
}
Or when the reference manual tells us to just read the registers to achieve something, for example to clear or set some flag (a very common situation in the uC world):
#define SREG ((volatile uint32_t *)0x4000000)
#define DREG ((volatile uint32_t *)0x4004000)
void readSREG(void)
{
*SREG; //we read it here
*DREG; // and here
}
https://godbolt.org/z/6wjh_5

Is it possible to declare a variable in a function within a #define clause

In order to speed up the performance of my program, I'd like to introduce a function for calculating the leftover of a floating point division (where the quotient is natural, obviously).
Therefore I have following simple function:
double mfmod(double x,double y) {
double a;
return ((a=x/y)-(int)a)*y;
}
As I've heard I could speed up even more by putting this function within a #define clause, but the variable a makes this quite difficult. At this moment I'm here:
#define mfmod(x,y) { \
double a; \
return ((a=x/y)-(int)a)*y; \
}
But trying to launch this gives problems, due to the variable.
The problems are the following: a bit further I'm trying to launch this function:
double test = mfmod(f, div);
And this can't be compiled due to the error message type name is not allowed.
(for your information, f and div are doubles)
Does anybody know how to do this? (if it's even possible) (I'm working with Windows, more exactly Visual Studio, not with GCC)

As I've heard I could speed up even more by putting this function within a #define clause
I think you must have misunderstood. Surely the advice was to implement the behavior as a (#defined) macro, instead of as a function. Your macro is syntactically valid, but the code resulting from expanding it is not a suitable replacement for calling your function.
Defining this as a macro is basically a way to manually inline the function body, but it has some drawbacks. In particular, in standard C, a macro that must expand to an expression (i.e. one that can be used as a value) cannot contain a code block, and therefore cannot contain variable declarations. That, in turn, may make it impossible to avoid multiple evaluation of the macro arguments, which is a big problem.
Here's one way you could write your macro:
#define mfmod(x,y) ( (((x)/(y)) - (int) ((x)/(y))) * (y) )
This is not a clear win over the function, however, as its behavior varies with the argument types, it must evaluate both arguments more than once (which can produce unexpected and even undefined results in some cases), and it must also perform the division twice.
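The multiple-evaluation problem can be made visible with an argument that has a side effect. This is only an illustration: FRAC_PART, next_value and evaluations_per_use are hypothetical names, and FRAC_PART is a reduced macro written in the same expression style.

```c
static int calls = 0;

/* Hypothetical argument with a side effect: counts how often it is evaluated. */
static double next_value(void)
{
    calls++;
    return 7.5;
}

/* Expression-style macro: each of x and y appears twice in the expansion. */
#define FRAC_PART(x, y) ( ((x) / (y)) - (int)((x) / (y)) )

/* Returns the number of times next_value() ran for a single macro use. */
int evaluations_per_use(void)
{
    double r;
    calls = 0;
    r = FRAC_PART(next_value(), 2.0);
    (void)r;
    return calls;
}
```

A function call written the same way would run next_value() once; the macro runs it once per appearance of x in the replacement text.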
If you were willing to change the usage mode so that the macro sets a result variable instead of expanding to an expression, then you could get around many of the problems. @BnBDim provided a first cut at this, but it suffers from some of the same type and multiple-evaluation problems as the above. Here's how you could do it to obtain the same result as your function:
#define mfmod(x, y, res) do { \
double _div = (y); \
double _quot = (double) (x) / _div; \
res = (_quot - (int) _quot) * _div; \
} while (0)
Note that it takes care to reference the arguments once each, and also inside parentheses but for res, which must be an lvalue. You would use it much like a void function instead of like a value-returning function:
double test;
mfmod(f, div, test);
That still affords a minor, but unavoidable risk of breakage in the event that one of the actual arguments to the macro collides with one of the variables declared inside the code block it provides. Using variable names prefixed with underscores is intended to minimize that risk.
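As a quick check that the statement-style form evaluates each argument once, here is a sketch using a renamed copy of the macro (MFMOD_INTO, with trailing-underscore locals) and a hypothetical side-effecting argument next_value. None of these names come from the answer itself.

```c
static int calls = 0;

/* Hypothetical argument with a side effect: counts how often it is evaluated. */
static double next_value(void)
{
    calls++;
    return 7.5;
}

/* Statement-style macro: x and y are each referenced exactly once. */
#define MFMOD_INTO(x, y, res) do {              \
        double div_ = (y);                      \
        double quot_ = (double)(x) / div_;      \
        (res) = (quot_ - (int)quot_) * div_;    \
    } while (0)

/* Computes the remainder of 7.5 / 2.0 through the macro. */
double remainder_once(void)
{
    double r;
    calls = 0;
    MFMOD_INTO(next_value(), 2.0, r);
    return r;
}

/* Returns how many times next_value() ran for that single macro use. */
int evaluations_once(void)
{
    (void)remainder_once();
    return calls;
}
```

The local names still have to differ from anything the caller passes in, which is the residual risk described above.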
Overall, I'd be inclined to go with the function instead, and to let the compiler handle the inlining. If you want to encourage the compiler to do so then you could consider declaring the function inline, but very likely it will not need such a hint, and it is not obligated to honor one.
Or better, just use fmod() until and unless you determine that doing so constitutes a bottleneck.

To answer the stated question: yes, you can declare variables in defined macro 'functions' like the one you are working with, in certain situations. Some of the other answers have shown examples of how to do this.
The problem with your current #define is that you are telling it to return something in the middle of your code, and you are not making a macro that expands in the way you probably want. If I use your macro like this:
...
double y = 1.0;
double x = 1.0;
double z = mfmod(x, y);
int other = (int)z - 1;
...
This is going to expand to:
...
double y = 1.0;
double x = 1.0;
double z = {
double a;
return ((a=x/y)-(int)a)*y;
};
int other = (int)z - 1;
...
The function (if it compiled) would never proceed beyond the initialization of z, because it would return in the middle of the macro. You also are trying to 'assign' a block of code to z.
That being said, this is another example of making assumptions about performance without any (stated) benchmarking. Have you actually measured any performance problem with just using an inline function?
__attribute__((const))
extern inline double mfmod(const double x, const double y) {
const double a = x/y;
return (a - (int)a) * y;
}
Not only is this cleaner, clearer, and easier to debug than the macro, it has the added benefit of being declared with the const attribute, which will suggest to the compiler that subsequent calls to the function with the same arguments should return the same value, which can cause repeated calls to the function to be optimized away entirely, whereas the macro would be evaluated every time (conceptually). To be honest, even using the local double to cache the division result is probably a premature optimization, since the compiler will probably optimize this away. If this were me, and I absolutely HAD to have a macro to do this, I would write it as follows:
#define mfmod(x, y) ((((x)/(y)) - (int)((x)/(y))) * (y))
There will almost certainly not be any noticeable performance hit under optimization for doing the division twice. If there were, I would use the inline function above. I will leave it to you to do the benchmarking.

You could use this work-around
#define mfmod(x,y,res) \
do { \
double a=(x)/(y); \
res = (a-(int)a)*(y); \
} while(0)

Legal uses of setjmp and GCC

Using GCC (4.0 for me), is this legal:
if(__builtin_expect(setjmp(buf) != 0, 1))
{
// handle error
}
else
{
// do action
}
I found a discussion saying it caused a problem for GCC back in 2003, but I would imagine that they would have fixed it by now. The C standard says that it's illegal to use setjmp unless it's one of four conditions, the relevant one being this:
one operand of a relational or equality operator with the other operand an integer constant expression, with the resulting expression being the entire controlling expression of a selection or iteration statement;
But if this is a GCC extension, can I guarantee that it will work under GCC, since it's already nonstandard functionality? I tested it and it seemed to work, though I don't know how much testing I'd have to do to actually break it. (I'm hiding the call to __builtin_expect behind a macro, which is defined as a no-op for non-GCC, so it would be perfectly legal for other compilers.)

I think that what the standard was talking about was to account for doing something like this:
int x = printf("howdy");
if (setjmp(buf) != x ) {
function_that_might_call_longjmp_with_x(buf, x);
} else {
do_something_about_them_errors();
}
In this case you could not rely on x having the value that it was assigned in the previous line anymore. The compiler may have moved the place where x had been (reusing the register it had been in, or something), so the code that did the comparison would be looking in the wrong spot. (you could save x to another variable, and then reassign x to something else before calling the function, which might make the problem more obvious)
In your code you could have written it as:
int conditional;
conditional = setjmp(buf) != 0 ;
if(__builtin_expect( conditional, 1)) {
// handle error
} else {
// do action
}
And I think that we can satisfy ourselves that the line of code that assigns the variable conditional meets that requirement.

But if this is a GCC extension, can I guarantee that it will work under GCC, since it's already nonstandard functionality? I tested it and it seemed to work, though I don't know how much testing I'd have to do to actually break it. (I'm hiding the call to __builtin_expect behind a macro, which is defined as a no-op for non-GCC, so it would be perfectly legal for other compilers.)
You are correct, __builtin_expect should be a macro no-op for other compilers so the result is still defined.
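For comparison, a conservative sketch that sidesteps the question: setjmp stays in one of the forms the standard lists (compared against an integer constant, as the whole controlling expression of the if), and the branch hint is applied to a different condition. UNLIKELY, do_action and run_guarded are hypothetical names. This does not show whether GCC accepts setjmp inside __builtin_expect; it only avoids depending on that.

```c
#include <setjmp.h>

/* Hypothetical wrapper: a branch hint on GCC-compatible compilers, a no-op elsewhere. */
#if defined(__GNUC__)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define UNLIKELY(x) (x)
#endif

static jmp_buf env;

/* Hypothetical operation that reports failure by jumping back to the setjmp point. */
static void do_action(int should_fail)
{
    if (UNLIKELY(should_fail))
        longjmp(env, 1);
}

/* Returns 0 on success, -1 if do_action() bailed out via longjmp. */
int run_guarded(int should_fail)
{
    /* setjmp compared against an integer constant, as the entire
       controlling expression of the if statement. */
    if (setjmp(env) != 0) {
        return -1;      /* handle error */
    }
    do_action(should_fail);
    return 0;
}
```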

What does "do { ... } while (0)" do exactly in kernel code? [duplicate]

I have seen a lot of usages like this; previously I thought that the programmer wanted to break out of a block of code easily. Why do we need a do { ... } while (0) loop here? Are we trying to tell the compiler something?
For instance in Linux kernel 2.6.25, include/asm-ia64/system.h
/*
* - clearing psr.i is implicitly serialized (visible by next insn)
* - setting psr.i requires data serialization
* - we need a stop-bit before reading PSR because we sometimes
* write a floating-point register right before reading the PSR
* and that writes to PSR.mfl
*/
#define __local_irq_save(x) \
do { \
ia64_stop(); \
(x) = ia64_getreg(_IA64_REG_PSR); \
ia64_stop(); \
ia64_rsm(IA64_PSR_I); \
} while (0)

It's always used in macros so that a semicolon is required after a call, just like when calling a regular function.
In your example, you have to write
__local_irq_save(1);
while
__local_irq_save(1)
would result in an error about a missing semicolon. This would not happen if the do while was not there. If it was just about scoping, a simple curly brace pair would suffice.

It allows for the code to appear here:
if(a) __local_irq_save(x); else ...;
// -> if(a) do { .. } while(0); else ...;
If they simply used a { .. } you would get
if(a) { ... }; else ...;
The else would not belong to any if anymore, because the semicolon would be the next statement and separate the else from the preceding if. A compile error would occur.

The purpose of do{ ... } while(0) construct is to turn a group of statements into a single compound statement that can be terminated with a ;. You see, in C language the do/while construct has one weird and unusual property: even though it "works" as a compound statement, it expects a ; at the end. No other compound constructs in C have this property.
Because of this property, you can use do/while to write multi-statement macros, which can be safely used as "ordinary" functions without worrying what's inside the macro, as in the following example
if (/* some condition */)
__local_irq_save(x); /* <- we can safely put `;` here */
else
/* whatever */;
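A self-contained sketch of the same pattern, using a made-up two-statement macro SET_BOTH and a made-up function demo in place of the kernel macro:

```c
/* Two statements wrapped so that the macro behaves as one statement. */
#define SET_BOTH(a, b, v) do { (a) = (v); (b) = (v); } while (0)

/* Uses the macro in an unbraced if/else, followed by an ordinary ';'. */
int demo(int flag)
{
    int lo = 1, hi = 2;

    if (flag)
        SET_BOTH(lo, hi, 0);    /* the ';' completes the do/while statement */
    else
        hi = 10;

    return lo + hi;
}
```

If SET_BOTH were defined with bare braces instead, the semicolon after the macro call would become an empty statement between the if body and the else, and this function would not compile.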

The answer has already been given (so the macro forces a ; when called), but another use of this kind of statement that I have seen: it allows break to be called anywhere in the "loop", early terminating if needed. Essentially a "goto" that your fellow programmers wouldn't murder you for.
do {
int i = do_something();
if(i == 0) { break; } // Skips the remainder of the logic
do_something_else();
} while(0);
Note that this is still fairly confusing, so I don't encourage its use.

Looks like it's there just for scoping. It's similar to:
if (true)
{
// Do stuff.
}
edit
I don't see it in your example, but it's possible that one of those function calls is actually a macro, in which case there's one key difference between do/while(0) and if(true), which is that the former allows continue and break.

It makes a use of the macro act like a real statement or function call.
A statement is either { expression-list } or expression; so that poses a problem when defining macros that need more than one expression, because if you use { } then a syntax error will occur if the caller of the macro quite reasonably adds a ; before an else.
if(whatever)
f(x);
else
f(y);
If f() is a single statement macro, fine, but what if it's a macro and something complicated? You end up with if(...) { s1; s2; }; else ... and that doesn't work.
So the writer of the macro has to then either make it into a real function, wrap the construct in a single statement, or use a gnu extension.
The do .. while(0) pattern is the "wrap the construct" approach.