I am trying to understand defining functions as macros and I have the following code, which I am not sure I understand:
#define MAX(i, limit) do \
{ \
if (i < limit) \
{ \
i++; \
} \
} while(1)
void main(void)
{
MAX(0, 3);
}
As I understand it, it tries to define MAX as an interval between two numbers? But what's the point of the infinite loop?
I have tried to store the value of MAX in a variable inside the main function, but it gives me an error saying "expected an expression".
I am currently in a software development internship and trying to learn embedded C, since it's a new field for me. This was an exercise asking me what the following code would do. I was confused because I had never seen a function written like this.
You are confused because this is a trick question: the posted code makes no sense whatsoever. The MAX macro does indeed expand to an infinite loop, and since its first argument is a literal value, i++ expands to 0++, which is invalid C (the operand of ++ must be a modifiable lvalue, not a constant).
The lesson to be learned is: macros are confusing and error-prone, and should not be used to replace functions.
You have to understand that before your code reaches the compiler, it first goes through the preprocessor, which essentially rewrites the text of your source file. How it rewrites the code is controlled by preprocessor directives (lines that begin with #, e.g. #include, #define, ...).
In your case you use a #define directive, so everywhere the preprocessor finds MAX(i, limit) it substitutes the macro's definition.
The output of the preprocessor is again text, just a modified version of what you wrote. In your case, the preprocessor will replace MAX(0, 3) with
do
{
if (0 < 3)
{
0++;
}
} while(1)
That preprocessed output is what actually goes to the compiler.
So writing a "function" with #define is not the same as writing a normal function void max(int i, int limit) { ... }.
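For comparison, here is a minimal sketch of that ordinary-function form (the body is an assumption about what the exercise's macro seems to intend):
/* An ordinary function receives copies of its arguments, so max(0, 3)
   compiles fine - the compiler never sees anything like 0++. The caller
   just cannot observe the increment, because only the local copy changes. */
void max(int i, int limit)
{
    if (i < limit)
    {
        i++; /* modifies the local copy of i only */
    }
}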
Suppose you had a large number of statements of the form
if(a < 10) a++;
if(b < 100) b++;
if(c < 1000) c++;
In a comment, #the busybee refers to this pattern as a "saturating incrementer".
When you see a repeated pattern in code, there's a natural inclination to want to encapsulate the pattern somehow. Sometimes this is a good idea; sometimes it's better to just leave the repetition, if the attempt to encapsulate it would end up making things worse.
One way to encapsulate this particular pattern — I'm not going to say whether I think it's a good way or not — would be to define a function-like macro:
#define INCR_MAX(var, max) if(var < max) var++
Then you could say
INCR_MAX(a, 10);
INCR_MAX(b, 100);
INCR_MAX(c, 1000);
One reason to want to make this a function-like macro (as opposed to a true function) is that a macro can "modify its argument" — in this case, whatever variable name you hand to it as var — in a way that a true function couldn't. (That is, if your saturating incrementer were a true function, you would have to call it either as incr_max(&a, 10) or a = incr_max(a, 10), depending on how you chose to set it up.)
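For illustration, rough sketches of those two true-function variants (hypothetical names, int arguments assumed):
/* Variant 1: pass a pointer so the function can modify the caller's variable. */
void incr_max_ptr(int *var, int max)
{
    if (*var < max)
        (*var)++;
}
/* Variant 2: return the new value and let the caller assign it back. */
int incr_max_val(int var, int max)
{
    if (var < max)
        var++;
    return var;
}
/* Called as incr_max_ptr(&a, 10); or a = incr_max_val(a, 10); respectively. */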
However, there's an issue with function-like macros and the semicolon at the end. I'm not going to explain that whole issue here; there's a big long previous SO question about it.
Applying the lesson of that other question, an "improved" INCR_MAX macro would be
#define INCR_MAX(var, max) do { if(var < max) var++; } while(0)
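As a quick illustration (sketch only, assuming some condition variable enabled), the do { ... } while(0) wrapper lets the macro sit anywhere a single statement is expected, such as an unbraced if/else:
if (enabled)
    INCR_MAX(a, 10);   /* the trailing ; completes the do/while statement */
else
    INCR_MAX(b, 100);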
Finally, it appears that somewhere between your exercise and this SO question, the while(0) at the end somehow got changed to while(1). This just about has to have been an unintentional error, since while(1) makes no sense in this context whatsoever.
Yeah, there's a reason you don't understand it - it's garbage.
After preprocessing, the code is
void main(void)
{
do
{
if ( 0 < 3 )
{
0++;
}
} while(1);
}
Yeah, no clue what this thing is supposed to do. The name MAX implies that it should evaluate to the larger of its two arguments, a la
#define MAX(a,b) ((a) < (b) ? (b) : (a))
but that's obviously not what it's doing. It's not defining an interval between two numbers, it's attempting to set the value of the first argument to the second, but in a way that doesn't make a lick of sense.
There are three problems (technically, four):
the compiler will yak on 0++ - a constant cannot be the operand of the ++ or -- operators;
If either i or limit is an expression, such as MAX(i+1, i+5), you're going to have the same problem with the ++ operator, and you're going to have precedence issues as well;
assuming you fix those problems, you still have an infinite loop;
The (technical) fourth problem is ... using a macro as a function. I know, this is embedded world, and embedded world wants to minimize function call overhead. That's what the inline function specifier is supposed to buy you so you don't have to go through this heartburn.
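For instance, the conventional MAX above could just as well be an inline function (a sketch, assuming int operands):
static inline int imax(int a, int b)
{
    /* same result as the MAX macro, but with type checking
       and each argument evaluated exactly once */
    return (a < b) ? b : a;
}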
But, okay, maybe the compiler available for the system you're working on doesn't support inline so you have to go through this exercise.
But you're going to have to go to the person who gave you this code and politely and respectfully ask, "what is this crap?"
Pardon if this question is naive. Consider the following program:
#include <stdio.h>
int main() {
int i = 1;
i = i + 2;
5;
i;
printf("i: %d\n", i);
}
In the above example, the statements 5; and i; seem totally superfluous, yet the code compiles without warnings or errors by default (gcc does, however, emit warning: statement with no effect [-Wunused-value] when run with -Wall). They have no effect on the rest of the program, so why are they considered valid statements in the first place? Does the compiler simply ignore them? Are there any benefits to allowing such statements?
One benefit to allowing such statements is from code that's created by macros or other programs, rather than being written by humans.
As an example, imagine a function int do_stuff(void) that is supposed to return 0 on success or -1 on failure. It could be that support for "stuff" is optional, and so you could have a header file that does
#if STUFF_SUPPORTED
#define do_stuff() really_do_stuff()
#else
#define do_stuff() (-1)
#endif
Now imagine some code that wants to do stuff if possible, but may or may not really care whether it succeeds or fails:
void func1(void) {
if (do_stuff() == -1) {
printf("stuff did not work\n");
}
}
void func2(void) {
do_stuff(); // don't care if it works or not
more_stuff();
}
When STUFF_SUPPORTED is 0, the preprocessor will expand the call in func2 to a statement that just reads
(-1);
and so the compiler pass will see just the sort of "superfluous" statement that seems to bother you. Yet what else can one do? If you #define do_stuff() // nothing, then the code in func1 will break. (And you'll still have an empty statement in func2 that just reads ;, which is perhaps even more superfluous.) On the other hand, if you have to actually define a do_stuff() function that returns -1, you may incur the cost of a function call for no good reason.
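(If the compiler can be trusted to inline, a middle ground would be a stub function instead of the (-1) macro - a sketch:)
#if STUFF_SUPPORTED
#define do_stuff() really_do_stuff()
#else
/* Still an expression with a value, so func1 keeps working, but the
   compiler can fold the constant and drop the unused result in func2. */
static inline int do_stuff(void) { return -1; }
#endif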
Simple statements in C are terminated by a semicolon.
Simple statements in C are expressions. An expression is a combination of variables, constants, and operators, and it evaluates to a value of some type that could be assigned to a variable.
Having said that, some "smart" compilers might simply discard the 5; and i; statements.
Statements with no effect are permitted because it would be more difficult to ban them than to permit them. This was more relevant when C was first designed and compilers were smaller and simpler.
An expression statement consists of an expression followed by a semicolon. Its behavior is to evaluate the expression and discard the result (if any). Normally the purpose is that the evaluation of the expression has side effects, but it's not always easy or even possible to determine whether a given expression has side effects.
For example, a function call is an expression, so a function call followed by a semicolon is a statement. Does this statement have any side effects?
some_function();
It's impossible to tell without seeing the implementation of some_function.
How about this?
obj;
Probably not -- but if obj is defined as volatile, then it does.
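For example (a two-line sketch):
volatile int obj;   /* e.g. tied to a hardware status register */
void touch(void) { obj; }   /* the volatile read must actually happen */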
Permitting any expression to be made into an expression-statement by adding a semicolon makes the language definition simpler. Requiring the expression to have side effects would add complexity to the language definition and to the compiler. C is built on a consistent set of rules (function calls are expressions, assignments are expressions, an expression followed by a semicolon is a statement) and lets programmers do what they want without preventing them from doing things that may or may not make sense.
The statements you listed with no effect are examples of an expression statement, whose syntax is given in section 6.8.3p1 of the C standard as follows:
expression-statement:
    expression_opt ;
All of section 6.5 is dedicated to the definition of an expression, but loosely speaking an expression consists of constants and identifiers linked with operators. Notably, an expression may or may not contain an assignment operator and it may or may not contain a function call.
So any expression followed by a semicolon qualifies as an expression statement. In fact, each of these lines from your code is an example of an expression statement:
i = i + 2;
5;
i;
printf("i: %d\n", i);
Some operators have side effects, such as the assignment operators and the pre/post increment/decrement operators, and the function-call operator () may have a side effect depending on what the function in question does. There is no requirement, however, that any operator in the expression have a side effect.
Here's another example:
atoi("1");
This calls a function and discards the result, just like the printf call in your example, but unlike printf the call itself does not have a side effect.
Sometimes such statements are very handy:
int foo(int x, int y, int z)
{
(void)y; //prevents warning
(void)z;
return x*x;
}
Or when the reference manual tells us to just read a register to achieve something - for example to clear or set some flag (a very common situation in the microcontroller world):
#define SREG ((volatile uint32_t *)0x4000000)
#define DREG ((volatile uint32_t *)0x4004000)
void readSREG(void)
{
*SREG; //we read it here
*DREG; // and here
}
https://godbolt.org/z/6wjh_5
I saw the following in a .h-file for the cc2640 mcu:
#define ADC_STATUS_SUCCESS (0)
From my C knowledge, the compiler is told to put the value of ADC_STATUS_SUCCESS everywhere it occurs, that is (0). But what is the difference in putting just 0?
what is the difference in putting just 0?
None, if you don't write crazy code. It is common to use parentheses for macros that contain expressions to avoid unexpected errors related to operator precedence and similar stuff when using them. In this case though, defining something as 0 or as (0) is the same if it's used in expressions.
What do I mean by "crazy code"? Well, the only difference between the two can be seen in something like the following:
void func(int x) { /* ... */ }
#define ADC_STATUS_SUCCESS 0
func ADC_STATUS_SUCCESS; // expands to func 0; -- INVALID
#define ADC_STATUS_SUCCESS (0)
func ADC_STATUS_SUCCESS; // expands to func (0); -- VALID (for the love of God NEVER do this)
I highly doubt this is the case though, nobody in their right mind would write such an abomination. That define is most likely out of habit.
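To see where the parentheses do matter, consider a macro whose replacement is a real expression rather than a single token (hypothetical names, classic example):
#define BAD_TWICE(x)  x + x
#define GOOD_TWICE(x) ((x) + (x))
int a = 3 * BAD_TWICE(2);    /* expands to 3 * 2 + 2  ->  8 */
int b = 3 * GOOD_TWICE(2);   /* expands to 3 * ((2) + (2)) -> 12 */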
In order to speed up my program, I'd like to introduce a function for calculating the remainder of a floating-point division (where the quotient is a whole number, obviously).
Therefore I have following simple function:
double mfmod(double x,double y) {
double a;
return ((a=x/y)-(int)a)*y;
}
As I've heard I could speed up even more by putting this function within a #define clause, but the variable a makes this quite difficult. At this moment I'm here:
#define mfmod(x,y) { \
double a; \
return ((a=x/y)-(int)a)*y; \
}
But trying to use this gives problems, due to the variable.
The problem is the following: a bit further on I'm calling this "function":
double test = mfmod(f, div);
And this can't be compiled, due to the error message "type name is not allowed".
(for your information, f and div are doubles)
Does anybody know how to do this? (if it's even possible) (I'm working with Windows, more exactly Visual Studio, not with GCC)
As I've heard I could speed up even more by putting this function within a #define clause
I think you must have misunderstood. Surely the advice was to implement the behavior as a (#defined) macro, instead of as a function. Your macro is syntactically valid, but the code resulting from expanding it is not a suitable replacement for calling your function.
Defining this as a macro is basically a way to manually inline the function body, but it has some drawbacks. In particular, in standard C, a macro that must expand to an expression (i.e. one that can be used as a value) cannot contain a code block, and therefore cannot contain variable declarations. That, in turn, may make it impossible to avoid multiple evaluation of the macro arguments, which is a big problem.
Here's one way you could write your macro:
#define mfmod(x,y) ( ((x)/(y)) - (int) ((x)/(y)) )
This is not a clear win over the function, however: its behavior varies with the argument types, it evaluates both arguments twice (which can produce unexpected and even undefined results in some cases), and it also performs the division twice.
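For instance (a sketch of the multiple-evaluation hazard, assuming x and y are doubles):
double r = mfmod(x++, y);
/* expands to ( ((x++)/(y)) - (int) ((x++)/(y)) ): x is incremented twice,
   and since the two x++ are unsequenced this is actually undefined behavior. */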
If you were willing to change the usage mode so that the macro sets a result variable instead of expanding to an expression, then you could get around many of the problems. #BnBDim provided a first cut at this, but it suffers from some of the same type and multiple-evaluation problems as the above. Here's how you could do it to obtain the same result as your function:
#define mfmod(x, y, res) do { \
double _div = (y); \
double _quot = (double) (x) / _div; \
res = (_quot - (int) _quot) * _div; \
} while (0)
Note that it takes care to reference each argument only once, and always inside parentheses, except for res, which must be an lvalue. You would use it much like a void function instead of like a value-returning function:
double test;
mfmod(f, div, test);
That still affords a minor, but unavoidable risk of breakage in the event that one of the actual arguments to the macro collides with one of the variables declared inside the code block it provides. Using variable names prefixed with underscores is intended to minimize that risk.
Overall, I'd be inclined to go with the function instead, and to let the compiler handle the inlining. If you want to encourage the compiler to do so then you could consider declaring the function inline, but very likely it will not need such a hint, and it is not obligated to honor one.
Or better, just use fmod() until and unless you determine that doing so constitutes a bottleneck.
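For reference, that is just (a fragment, assuming x and y are the doubles in question):
#include <math.h>
double r = fmod(x, y);   /* floating-point remainder of x / y */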
To answer the stated question: yes, you can declare variables in defined macro 'functions' like the one you are working with, in certain situations. Some of the other answers have shown examples of how to do this.
The problem with your current #define is that you are telling it to return something in the middle of your code, and you are not making a macro that expands in the way you probably want. If I use your macro like this:
...
double y = 1.0;
double x = 1.0;
double z = mfmod(x, y);
int other = (int)z - 1;
...
This is going to expand to:
...
double y = 1.0;
double x = 1.0;
double z = {
double a;
return ((a=x/y)-(int)a)*y;
};
int other = (int)z - 1;
...
The function (if it compiled) would never proceed beyond the initialization of z, because it would return in the middle of the macro. You also are trying to 'assign' a block of code to z.
That being said, this is another example of making assumptions about performance without any (stated) benchmarking. Have you actually measured any performance problem with just using an inline function?
__attribute__((const))
extern inline double mfmod(const double x, const double y) {
const double a = x/y;
return (a - (int)a) * y;
}
Not only is this cleaner, clearer, and easier to debug than the macro, it has the added benefit of being declared with the const attribute, which tells the compiler that calls with the same arguments return the same value; repeated calls can therefore be optimized away entirely, whereas the macro would (conceptually) be evaluated every time. To be honest, even using the local double to cache the division result is probably a premature optimization, since the compiler will likely do that on its own. If this were me, and I absolutely HAD to have a macro for this, I would write it as follows:
#define mfmod(x, y) ((((x)/(y)) - ((int)((x)/(y)))) * (y))
There will almost certainly not be any noticeable performance hit under optimization for doing the division twice. If there were, I would use the inline function above. I will leave it to you to do the benchmarking.
You could use this work-around
#define mfmod(x,y,res) \
do { \
double a=(x)/(y); \
res = (a-(int)a)*(y); \
} while(0)
I have seen a lot of usages like this; previously I thought that the programmer wanted to break out of a block of code easily. Why do we need a do { ... } while (0) loop here? Are we trying to tell the compiler something?
For instance in Linux kernel 2.6.25, include/asm-ia64/system.h
/*
* - clearing psr.i is implicitly serialized (visible by next insn)
* - setting psr.i requires data serialization
* - we need a stop-bit before reading PSR because we sometimes
* write a floating-point register right before reading the PSR
* and that writes to PSR.mfl
*/
#define __local_irq_save(x) \
do { \
ia64_stop(); \
(x) = ia64_getreg(_IA64_REG_PSR); \
ia64_stop(); \
ia64_rsm(IA64_PSR_I); \
} while (0)
It's always used in macros so that a semicolon is required after a call, just like when calling a regular function.
In your example, you have to write
__local_irq_save(1);
while
__local_irq_save(1)
would result in an error about a missing semicolon. This would not happen if the do while was not there. If it was just about scoping, a simple curly brace pair would suffice.
It allows for the code to appear here:
if(a) __local_irq_save(x); else ...;
// -> if(a) do { .. } while(0); else ...;
If they simply used a { .. } you would get
if(a) { ... }; else ...;
The else would no longer belong to any if, because the semicolon after the closing brace is a statement of its own and separates the else from the preceding if. A compile error would occur.
The purpose of the do { ... } while(0) construct is to turn a group of statements into a single compound statement that can be terminated with a ;. You see, in the C language the do/while construct has one weird and unusual property: even though it "works" as a compound statement, it expects a ; at the end. No other compound construct in C has this property.
Because of this property, you can use do/while to write multi-statement macros, which can be safely used as "ordinary" functions without worrying what's inside the macro, as in the following example
if (/* some condition */)
__local_irq_save(x); /* <- we can safely put `;` here */
else
/* whatever */;
The answer has already been given (the construct forces a ; when the macro is called), but here is another use of this kind of statement that I have seen: it allows break to be used anywhere in the "loop", terminating early if needed. Essentially a goto that your fellow programmers wouldn't murder you for.
do {
int i = do_something();
if(i == 0) { break; } // Skips the remainder of the logic
do_something_else();
} while(0);
Note that this is still fairly confusing, so I don't encourage its use.
Looks like it's there just for scoping. It's similar to:
if (true)
{
// Do stuff.
}
edit
I don't see it in your example, but it's possible that one of those function calls is actually a macro, in which case there's one key difference between do/while(0) and if(true), which is that the former allows continue and break.
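A sketch of that difference, with hypothetical macro bodies:
#define DO_FORM(x)  do { if ((x) < 0) break; /* ends the macro body */ } while (0)
#define IF_FORM(x)  if (1) { if ((x) < 0) break; /* binds to an enclosing loop or switch - or fails to compile */ }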
It makes the use of the macro act like a real statement or function call.
A statement is either a block { statement-list } or an expression;, so that poses a problem when defining macros that need more than one statement, because if you use { } then a syntax error will occur if the caller of the macro quite reasonably adds a ; before an else.
if(whatever)
f(x);
else
f(y);
If f() is a single-statement macro, fine, but what if it's a macro that expands to something more complicated? You end up with if(...) { s1; s2; }; else ... and that doesn't work.
So the writer of the macro has to then either make it into a real function, wrap the construct in a single statement, or use a gnu extension.
The do .. while(0) pattern is the "wrap the construct" approach.
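Concretely, a sketch of a two-statement macro wrapped that way (hypothetical names):
#define SET_BOTH(a, b, val) do { (a) = (val); (b) = (val); } while (0)
if (whatever)
    SET_BOTH(x, y, 1);   /* the ; now terminates a single do/while statement */
else
    SET_BOTH(x, y, 2);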
I'm trying to decide whether to implement certain operations as macros or as functions.
Say, just as an example, I have the following code in a header file:
extern int x_GLOB;
extern int y_GLOB;
#define min(x,y) ((x_GLOB = (x)) < (y_GLOB = (y))? x_GLOB : y_GLOB)
the intent is to have each single argument computed only once (min(x++,y++) would not cause any problem here).
The question is: do the two global variables pose any issue in terms of code re-entrancy or thread safeness?
I would say no but I'm not sure.
And what about:
#define min(x,y) ((x_GLOB = (x)), \
(y_GLOB = (y)), \
((x_GLOB < y_GLOB) ? x_GLOB : y_GLOB))
would it be a different case?
Please note that the question is not "in general" but is related to the fact that those global variables are used just within that macro and for the evaluation of a single expression.
Anyway, looking at the answers below, I think I can summarize as follows:
It's not thread-safe, because nothing guarantees that a thread won't be suspended "in the middle" of evaluating an expression (as I had hoped it would be).
The "state" those globals represent is, at the very least, the internal state of the "min" operation, and preserving that state would, again at the very least, require imposing restrictions on how the macro can be called (e.g. avoid min(min(1,2), min(3,1))).
I don't think using non-portable constructs is a good idea either, so I guess the only option is to stay on the safe side and implement those cases as regular functions.
Silently modifying global variables inside a macro like min is a very bad idea. You're much better off
accepting that min can evaluate its arguments multiple times (in which case I'd call it MIN to make it look less like a regular C function), or
write a regular or inline function for min, or
use gcc extensions to make the macro safe.
If you do any of these things, you eliminate the global variables entirely and hence any questions about reentrancy and multi-threading.
Option 1:
#define MIN(x, y) ((x) < (y) ? (x) : (y))
Option 2:
int min(int x, int y) { return x < y ? x : y; }
Of course, this has the problem that you can't use this version of min for different argument types.
Option 3:
#define min(x, y) ({ \
typeof(x) x__ = (x); \
typeof(y) y__ = (y); \
x__ < y__ ? x__ : y__; \
})
The globals are not thread-safe. You can make them so by using thread-local storage.
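A sketch of that suggestion with the C11 keyword (compiler support assumed):
/* Each thread now gets its own copy of the scratch globals, so concurrent
   calls no longer clobber each other. (The nested-call problem described
   in another answer remains, though.) */
extern _Thread_local int x_GLOB;
extern _Thread_local int y_GLOB;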
Personally, I'd use an inline function instead of a macro to avoid the problem altogether!
Personally I don't think this code is safe at all, even if we don't consider multithreaded scenarios.
Take the following evaluation:
int a = min(min(1, 2), min(3, 4));
How will this expand properly and evaluate?
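Expanding the macro from the question by hand, that line becomes (line breaks added; note that both operands of < assign x_GLOB and y_GLOB without any sequencing between them, so strictly this is undefined behavior - the trace below just assumes left-to-right evaluation):
int a = ((x_GLOB = (((x_GLOB = (1)) < (y_GLOB = (2))? x_GLOB : y_GLOB)))
       < (y_GLOB = (((x_GLOB = (3)) < (y_GLOB = (4))? x_GLOB : y_GLOB)))
       ? x_GLOB : y_GLOB);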
Evaluate min(1, 2) and assign value to x_GLOB (outer macro):
x_GLOB = 1
y_GLOB = 2
evaluate min, result is 1, assign to x_GLOB (for outer macro)
Evaluate min(3, 4) and assign value to y_GLOB (outer macro):
x_GLOB = 3 (uh-oh, 1 from the first expansion is now clobbered)
y_GLOB = 4
evaluate min, result is 3, assign to y_GLOB (for outer macro)
Evaluate min(a, b) where a is min(1, 2) and b is min(3, 4)
evaluate min of x_GLOB (which is 3) and y_GLOB (which is 3)
result of total evaluation is 3
Or am I missing something here?
Generally speaking, anytime you use global variables, your subroutine (or macro) is not re-entrant.
Anytime two threads access the same data or shared resource, you have to use synchronization primitives (mutexes, semaphores, etc) to coordinate access to the "critical section".
See the Wikipedia page on reentrancy.
It's not thread-safe. To demonstrate this, suppose that thread A calls min(1,2) and threadB calls min(3,4). Suppose that thread A is running first, and is interrupted by the scheduler right at the question mark of the macro, because its time slice has expired.
Then x_GLOB is 1, and y_GLOB is 2.
Now suppose threadB runs for a while, and completes the contents of the macro:
x_GLOB is 3, y_GLOB is 4.
Now suppose threadA resumes. It will return x_GLOB (3) instead of the right answer (1).
Obviously I've simplified things a bit - being interrupted "at the question mark" is pretty handwavy, and threadA doesn't necessarily return 3; on some compilers, with some optimisation levels, it might keep the values in registers and not read back from x_GLOB at all. So the code emitted might happen to work with that compiler and those options. But hopefully my example makes the point - you can't be sure.
I recommend you write a static inline function. If you're trying to be very portable, do it like this:
#define STATIC_INLINE static
STATIC_INLINE int min_func(int x, int y) { return x < y ? x : y; }
#define min(a,b) min_func((a),(b))
Then if you have a performance problem on some particular platform, and it turns out to be because that compiler fails to inline it, you can worry about whether on that compiler you need to define STATIC_INLINE differently (to static inline in C99, or static __inline for Microsoft, or whatever), or perhaps even implement the min macro differently. That level of optimisation ("does it get inlined?") isn't something you can do much about in portable code.
That code is not safe for multithreading.
Just avoid macros unless they are indispensable (the "Efficient C++" books have examples of those cases).