I'm trying to decide whether to implement certain operations as macros or as functions.
Say, just as an example, I have the following code in a header file:
extern int x_GLOB;
extern int y_GLOB;
#define min(x,y) ((x_GLOB = (x)) < (y_GLOB = (y))? x_GLOB : y_GLOB)
The intent is to have each argument evaluated only once (so min(x++, y++) would not cause any problem here).
The question is: do the two global variables pose any issue in terms of code re-entrancy or thread safeness?
I would say no but I'm not sure.
And what about:
#define min(x,y) ((x_GLOB = (x)), \
                  (y_GLOB = (y)), \
                  ((x_GLOB < y_GLOB) ? x_GLOB : y_GLOB))
Would it be a different case?
Please note that the question is not "in general" but is related to the fact that those global variables are used just within that macro and for the evaluation of a single expression.
Anyway, looking at the answers below, I think I can summarize as follows:
It's not thread safe, since nothing prevents a thread from being suspended "in the middle" of the evaluation of an expression (as I had hoped instead).
The "state" those globals represent is, at least, the internal state of the "min" operation, and preserving that state would, again at least, require imposing restrictions on how the macro can be called (e.g. avoiding min(min(1,2), min(3,1))).
I don't think using non-portable constructs is a good idea either, so I guess the only option is to stay on the safe side and implement those cases as regular functions.
Silently modifying global variables inside a macro like min is a very bad idea. You're much better off
accepting that min can evaluate its arguments multiple times (in which case I'd call it MIN to make it look less like a regular C function), or
writing a regular or inline function for min, or
using gcc extensions to make the macro safe.
If you do any of these things, you eliminate the global variables entirely, and hence any questions about reentrancy and multi-threading.
Option 1:
#define MIN(x, y) ((x) < (y) ? (x) : (y))
Option 2:
int min(int x, int y) { return x < y ? x : y; }
Of course, this has the problem that you can't use this version of min for different argument types.
Option 3:
#define min(x, y) ({       \
    typeof(x) x__ = (x);   \
    typeof(y) y__ = (y);   \
    x__ < y__ ? x__ : y__; \
})
The globals are not thread-safe. You can make them so by using thread-local storage.
Personally, I'd use an inline function instead of a macro to avoid the problem altogether!
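For what it's worth, a minimal sketch of the thread-local variant, assuming a C11 compiler with _Thread_local (older GCC spells it __thread):

/* Sketch only: each thread gets its own copy of the globals, so
   concurrent calls in different threads no longer clobber each other.
   Nested calls within a single expression are still broken. */
extern _Thread_local int x_GLOB;
extern _Thread_local int y_GLOB;
#define min(x, y) ((x_GLOB = (x)) < (y_GLOB = (y)) ? x_GLOB : y_GLOB)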
Personally I don't think this code is safe at all, even if we don't consider multithreaded scenarios.
Take the following evaluation:
int a = min(min(1, 2), min(3, 4));
How will this expand properly and evaluate?
Evaluate min(1, 2) and assign value to x_GLOB (outer macro):
x_GLOB = 1
y_GLOB = 2
evaluate min, result is 1, assign to x_GLOB (for outer macro)
Evaluate min(3, 4) and assign value to y_GLOB (outer macro):
x_GLOB = 3 (uh-oh, 1 from the first expansion is now clobbered)
y_GLOB = 4
evaluate min, result is 3, assign to y_GLOB (for outer macro)
Evaluate min(a, b) where a is min(1, 2) and b is min(3, 4)
evaluate min of x_GLOB (which is 3) and y_GLOB (which is 3)
result of total evaluation is 3
Or am I missing something here?
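A small test program (my own sketch, reusing the macro from the question) makes it easy to check; note that the operands of < may legally be evaluated in either order, so the exact wrong result can vary:

#include <stdio.h>

int x_GLOB, y_GLOB;
#define min(x, y) ((x_GLOB = (x)) < (y_GLOB = (y)) ? x_GLOB : y_GLOB)

int main(void)
{
    /* A correct min would print 1; with left-to-right evaluation of the
       < operands, the clobbered globals make this print 3 instead. */
    printf("%d\n", min(min(1, 2), min(3, 4)));
    return 0;
}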
Generally speaking, anytime you use global variables, your subroutine (or macro) is not re-entrant.
Anytime two threads access the same data or shared resource, you have to use synchronization primitives (mutexes, semaphores, etc) to coordinate access to the "critical section".
See the Wikipedia page on reentrancy.
It's not thread-safe. To demonstrate this, suppose that thread A calls min(1,2) and threadB calls min(3,4). Suppose that thread A is running first, and is interrupted by the scheduler right at the question mark of the macro, because its time slice has expired.
Then x_GLOB is 1, and y_GLOB is 2.
Now suppose threadB runs for a while, and completes the contents of the macro:
x_GLOB is 3, y_GLOB is 4.
Now suppose threadA resumes. It will return x_GLOB (3) instead of the right answer (1).
Obviously I've simplified things a bit - being interrupted "at the question mark" is pretty handwavy, and threadA doesn't necessarily return 3, on some compilers with some optimisation levels it might keep the values in registers and not read back from x_GLOB at all. So the code emitted might happen to work with that compiler and options. But hopefully my example makes the point - you can't be sure.
I recommend you write a static inline function. If you're trying to be very portable, do it like this:
#define STATIC_INLINE static
STATIC_INLINE int min_func(int x, int y) { return x < y ? x : y; }
#define min(a,b) min_func((a),(b))
Then if you have a performance problem on some particular platform, and it turns out to be because that compiler fails to inline it, you can worry about whether on that compiler you need to define STATIC_INLINE differently (to static inline in C99, or static __inline for Microsoft, or whatever), or perhaps even implement the min macro differently. That level of optimisation ("does it get inlined?") isn't something you can do much about in portable code.
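For example, one way the STATIC_INLINE definition might be adjusted per compiler (the exact tests below are only an illustration):

#if defined(_MSC_VER)
#define STATIC_INLINE static __inline   /* Microsoft spelling */
#elif defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#define STATIC_INLINE static inline     /* C99 and later */
#else
#define STATIC_INLINE static            /* portable fallback */
#endif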
That code is not safe for multithreading.
Just avoid macros unless they are indispensable (the "Efficient C++" books have examples of those cases).
Related
I am trying to understand defining functions as macros and I have the following code, which I am not sure I understand:
#define MAX(i, limit) do \
{ \
    if (i < limit) \
    { \
        i++; \
    } \
} while(1)

void main(void)
{
    MAX(0, 3);
}
As I understand it, it tries to define MAX as an interval between 2 numbers? But what's the point of the infinite loop?
I have tried to store the value of MAX in a variable inside the main function, but it gives me an error saying expected an expression
I am currently in a software developing internship, and trying to learn embedded C since it's a new field for me. This was an exercise asking me what the following code will do. I was confused since I had never seen a function written like this
You are confused because this is a trick question. The posted code makes no sense whatsoever. The MAX macro does indeed expand to an infinite loop, and since its first argument is a literal value, i++ expands to 0++, which is a compile-time error (a constant cannot be incremented).
The lesson to be learned is: macros are confusing, error prone and should not be used to replace functions.
You have to understand that before your code gets to the compiler, it first goes through the preprocessor, which changes the program text. The way it changes the code is controlled by preprocessor directives (lines that begin with #, e.g. #include, #define, ...).
In your case, you use a #define directive, so everywhere the preprocessor finds MAX(i, limit) it replaces it with the macro's definition.
The output of the preprocessor is again a textual file, just a bit modified. In your case, the preprocessor will replace MAX(0, 3) with
do
{
    if (0 < 3)
    {
        0++;
    }
} while(1)
And now the preprocessor output goes to a compiler like that.
So writing a function in a #define is not the same as writing a normal function void max(int i, int limit) { ... }.
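For comparison, here is a sketch of what the normal-function version could look like; note that since C passes arguments by value, it has to return the updated value rather than modify i in place (the name and body are my guess at the intended behavior):

/* Hypothetical function version of the exercise: a "saturating
   increment" that bumps i but never past limit. */
int incr_max(int i, int limit)
{
    if (i < limit)
    {
        i++;
    }
    return i;
}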
Suppose you had a large number of statements of the form
if(a < 10) a++;
if(b < 100) b++;
if(c < 1000) c++;
In a comment, @the busybee refers to this pattern as a "saturating incrementer".
When you see a repeated pattern in code, there's a natural inclination to want to encapsulate the pattern somehow. Sometimes this is a good idea, or sometimes it's fine to just leave the repetition, if the attempt to encapsulate it ends up making things worse.
One way to encapsulate this particular pattern — I'm not going to say whether I think it's a good way or not — would be to define a function-like macro:
#define INCR_MAX(var, max) if(var < max) var++
Then you could say
INCR_MAX(a, 10);
INCR_MAX(b, 100);
INCR_MAX(c, 1000);
One reason to want to make this a function-like macro (as opposed to a true function) is that a macro can "modify its argument" — in this case, whatever variable name you hand to it as var — in a way that a true function couldn't. (That is, if your saturating incrementer were a true function, you would have to call it either as incr_max(&a, 10) or a = incr_max(a, 10), depending on how you chose to set it up.)
However, there's an issue with function-like macros and the semicolon at the end. I'm not going to explain that whole issue here; there's a big long previous SO question about it.
Applying the lesson of that other question, an "improved" INCR_MAX macro would be
#define INCR_MAX(var, max) do { if(var < max) var++; } while(0)
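In case it's not obvious what the do { ... } while(0) buys, the standard illustration (my example, not from the linked question; ready is just a hypothetical flag) is a bare-if macro inside an if/else:

#include <stdio.h>

#define INCR_MAX_BARE(var, max) if(var < max) var++   /* the bare-if version */

int main(void)
{
    int a = 5, ready = 0;
    /* Expands to: if (ready) if (a < 10) a++; else puts("not ready");
       The else binds to the if that came out of the macro, not to
       if (ready), so nothing is printed even though ready is 0.
       The do { ... } while(0) form behaves as a single statement,
       keeping the else paired with if (ready). */
    if (ready)
        INCR_MAX_BARE(a, 10);
    else
        puts("not ready");
    return 0;
}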
Finally, it appears that somewhere between your exercise and this SO question, the while(0) at the end somehow got changed to while(1). This just about has to have been an unintentional error, since while(1) makes no sense in this context whatsoever.
Yeah, there's a reason you don't understand it - it's garbage.
After preprocessing, the code is
void main(void)
{
    do
    {
        if ( 0 < 3 )
        {
            0++;
        }
    } while(1);
}
Yeah, no clue what this thing is supposed to do. The name MAX implies that it should evaluate to the larger of its two arguments, a la
#define MAX(a,b) ((a) < (b) ? (b) : (a))
but that's obviously not what it's doing. It's not defining an interval between two numbers, it's attempting to set the value of the first argument to the second, but in a way that doesn't make a lick of sense.
There are three problems (technically, four):
the compiler will yak on 0++ - a constant cannot be the operand of the ++ or -- operators;
If either i or limit is an expression, such as MAX(i+1, i+5), you're going to have the same problem with the ++ operator, and you're going to have precedence issues;
assuming you fix those problems, you still have an infinite loop;
The (technical) fourth problem is ... using a macro as a function. I know, this is embedded world, and embedded world wants to minimize function call overhead. That's what the inline function specifier is supposed to buy you so you don't have to go through this heartburn.
But, okay, maybe the compiler available for the system you're working on doesn't support inline so you have to go through this exercise.
But you're going to have to go to the person who gave you this code and politely and respectfully ask, "what is this crap?"
I wonder whether it is possible to send a parameter to a #define macro to select different output.
For example:
#define Row(1) LPC_GPIO0
#define Row(2) LPC_GPIO3
#define Row(3) LPC_GPIO2
Then in my code I create a loop for sending the parameter
Row(x)
This macro syntax doesn't exist.
Moreover, it can't possibly exist, because macros are expanded before the compiler compiles the code. If your x isn't a compile time constant, there would never be a way to determine what to replace in the source code for the macro invocation.
If you need to index some values, just use an array, e.g. (assuming these constants are integers):
static int rows[] = { 0, LPC_GPIO0, LPC_GPIO3, LPC_GPIO2 };
Writing
rows[x]
would have the effect you seem to have expected from your invalid macro syntax.
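For example, a self-contained sketch (the LPC_GPIOx values are replaced with made-up integers just so it compiles standalone):

#include <stdio.h>

/* Hypothetical stand-ins for the LPC_GPIOx values. */
#define LPC_GPIO0 0
#define LPC_GPIO3 3
#define LPC_GPIO2 2

static int rows[] = { 0, LPC_GPIO0, LPC_GPIO3, LPC_GPIO2 };

int main(void)
{
    for (int x = 1; x <= 3; x++)
        printf("Row(%d) -> %d\n", x, rows[x]);   /* rows[x] replaces Row(x) */
    return 0;
}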
If you want to use macros:
#define GPIOx(x) GPIO##x
and GPIOx(1) will expand to GPIO1
If you want these calculated at runtime, there is a way to do what you want:
#define Row(x) (x == 1 ? LPC_GPIO0 : (x == 2 ? LPC_GPIO3 : (x == 3 ? LPC_GPIO2 : ERROR_VALUE)))
Though this gets messy as the number of options increases
Also, even if you do want this evaluated at compile time, most optimizing compilers would do that for you as long as x is a constant
In order to speed up my program, I'd like to introduce a function for calculating the remainder of a floating point division (where the quotient is natural, obviously).
Therefore I have following simple function:
double mfmod(double x, double y) {
    double a;
    return ((a=x/y)-(int)a)*y;
}
As I've heard, I could speed things up even more by putting this function within a #define clause, but the variable a makes this quite difficult. At the moment I'm here:
#define mfmod(x,y) { \
double a; \
return ((a=x/y)-(int)a)*y; \
}
But trying to use this gives problems, due to the variable.
The problems are the following: a bit further on I'm trying to call this function:
double test = mfmod(f, div);
And this can't be compiled, due to the error message "type name is not allowed".
(for your information, f and div are doubles)
Does anybody know how to do this? (if it's even possible) (I'm working with Windows, more exactly Visual Studio, not with GCC)
As I've heard I could speed up even more by putting this function within a #define clause
I think you must have misunderstood. Surely the advice was to implement the behavior as a (#defined) macro, instead of as a function. Your macro is syntactically valid, but the code resulting from expanding it is not a suitable replacement for calling your function.
Defining this as a macro is basically a way to manually inline the function body, but it has some drawbacks. In particular, in standard C, a macro that must expand to an expression (i.e. one that can be used as a value) cannot contain a code block, and therefore cannot contain variable declarations. That, in turn, may make it impossible to avoid multiple evaluation of the macro arguments, which is a big problem.
Here's one way you could write your macro:
#define mfmod(x,y) ( ((x)/(y)) - (int) ((x)/(y)) )
This is not a clear win over the function, however, as its behavior varies with the argument types, it must evaluate both arguments twice (which can produce unexpected and even undefined results in some cases), and it must also perform the division twice.
If you were willing to change the usage mode so that the macro sets a result variable instead of expanding to an expression, then you could get around many of the problems. @BnBDim provided a first cut at this, but it suffers from some of the same type and multiple-evaluation problems as the above. Here's how you could do it to obtain the same result as your function:
#define mfmod(x, y, res) do { \
    double _div = (y); \
    double _quot = (double) (x) / _div; \
    res = (_quot - (int) _quot) * _div; \
} while (0)
Note that it takes care to reference the arguments only once each, and inside parentheses, except for res, which must be an lvalue.
double test;
mfmod(f, div, test);
That still affords a minor, but unavoidable risk of breakage in the event that one of the actual arguments to the macro collides with one of the variables declared inside the code block it provides. Using variable names prefixed with underscores is intended to minimize that risk.
Overall, I'd be inclined to go with the function instead, and to let the compiler handle the inlining. If you want to encourage the compiler to do so then you could consider declaring the function inline, but very likely it will not need such a hint, and it is not obligated to honor one.
Or better, just use fmod() until and unless you determine that doing so constitutes a bottleneck.
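For reference, a minimal sketch of the library-based version (mfmod_std is just an illustrative name; fmod is declared in <math.h>):

#include <math.h>

/* fmod returns the floating-point remainder of x / y. */
double mfmod_std(double x, double y)
{
    return fmod(x, y);
}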
To answer the stated question: yes, you can declare variables in defined macro 'functions' like the one you are working with, in certain situations. Some of the other answers have shown examples of how to do this.
The problem with your current #define is that you are telling it to return something in the middle of your code, and you are not making a macro that expands in the way you probably want. If I use your macro like this:
...
double y = 1.0;
double x = 1.0;
double z = mfmod(x, y);
int other = (int)z - 1;
...
This is going to expand to:
...
double y = 1.0;
double x = 1.0;
double z = {
    double a;
    return ((a=x/y)-(int)a)*y;
};
int other = (int)z - 1;
...
The surrounding function (if this compiled) would never proceed beyond the initialization of z, because it would return in the middle of the macro. You are also trying to 'assign' a block of code to z.
That being said, this is another example of making assumptions about performance without any (stated) benchmarking. Have you actually measured any performance problem with just using an inline function?
__attribute__((const))
extern inline double mfmod(const double x, const double y) {
    const double a = x/y;
    return (a - (int)a) * y;
}
Not only is this cleaner, clearer, and easier to debug than the macro, it has the added benefit of being declared with the const attribute, which will suggest to the compiler that subsequent calls to the function with the same arguments should return the same value, which can cause repeated calls to the function to be optimized away entirely, whereas the macro would be evaluated every time (conceptually). To be honest, even using the local double to cache the division result is probably a premature optimization, since the compiler will probably optimize this away. If this were me, and I absolutely HAD to have a macro to do this, I would write it as follows:
#define mfmod(x, y) (((x/y)-((int)(x/y)))*y)
There will almost certainly not be any noticeable performance hit under optimization for doing the division twice. If there were, I would use the inline function above. I will leave it to you to do the benchmarking.
You could use this work-around
#define mfmod(x,y,res) \
do { \
    double a=(x)/(y); \
    res = (a-(int)a)*(y); \
} while(0)
Let's say I have already defined 9 macros from
ABC_1 to ABC_9
If there is another macro XYZ(num) whose objective is to call one of the ABC_{i} based on the value of num, what is a good way to do this? i.e. XYZ(num) should call/return ABC_num.
This is what the concatenation operator ## is for:
#define XYZ(num) ABC_ ## num
Arguments to macros that use concatenation (and are used with the operator) are evaluated differently, however (they aren't evaluated before being used with ##, to allow name-pasting, only in the rescan pass), so if the number is stored in a second macro (or the result of any kind of expansion, rather than a plain literal) you'll need another layer of evaluation:
#define XYZ(num) XYZ_(num)
#define XYZ_(num) ABC_ ## num
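A quick example of the difference (ABC_5 and NUM are hypothetical definitions, purely for illustration):

/* Hypothetical definitions, just to show the difference: */
#define ABC_5 500
#define NUM   5

/* With the single-layer macro: XYZ(NUM) -> ABC_NUM (NUM is not expanded
   before the paste, so this is most likely an undefined identifier).
   With the two-layer macro:    XYZ(NUM) -> XYZ_(5) -> ABC_5 -> 500.    */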
In the comments you say that num should be a variable, not a constant. The preprocessor builds compile-time expressions, not dynamic ones, so a macro isn't really going to be very useful here.
If you really wanted XYZ to have a macro definition, you could use something like this:
#define XYZ(num) ((int[]){ \
0, ABC_1, ABC_2, ABC_3, ABC_4, ABC_5, ABC_6, ABC_7, ABC_8, ABC_9 \
}[num])
Assuming ABC_{i} are defined as int values (at any rate they must all be the same type - this applies to any method of dynamically selecting one of them), this selects one with a dynamic num by building a temporary array and selecting from it.
This has no obvious advantages over a completely non-macro solution, though. (Even if you wanted to use macro metaprogramming to generate the list of names, you could still do that in a function or array definition.)
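For completeness, a hypothetical use with a runtime index (pick is just an illustrative wrapper; num is assumed to be in range):

int pick(int num)
{
    /* Expands to ((int[]){ 0, ABC_1, ..., ABC_9 }[num]); the temporary
       array is built at each use. */
    return XYZ(num);
}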
Yes, that's possible, using concatenation. For example:
#define FOO(x, y) BAR ##x(y)
#define BAR1(y) "hello " #y
#define BAR2(y) int y()
#define BAR3(y) return y
FOO(2, main)
{
puts(FOO(1, world));
FOO(3, 0);
}
This becomes:
int main()
{
puts("hello " "world");
return 0;
}
Let's say I define macro with arguments, then invoke it as follows:
#define MIN(x,y) ((x)<(y)?(x):(y))
int x=1,y=2,z;
z=MIN(y,x);
Given that (a) the macro works by text substitution, and (b) the actual args here have the same names as the formal args, only swapped -- will this specific z=MIN(y,x) work as expected? If it will, why?
I mean, how does the preprocessor manage not to confuse actual and formal args?
This question is about the technicalities of the C compiler. This is not a C++ question.
This question does not recommend anybody to use macros.
This question is not about programming style.
The internal representation of the macro will be something like this, where spaces indicate token boundaries, and #1 and #2 are magic internal-use-only tokens indicating where parameters are to be substituted:
MIN( #1 , #2 ) --> ( ( #1 ) < ( #2 ) ? ( #1 ) : ( #2 ) )
-- that is to say, the preprocessor doesn't make use of the names of macro parameters internally (except to implement the rules about redefinitions). So it doesn't matter that the formal parameter names are the same as the actual arguments.
What can cause problems is when the macro body makes use of an identifier that isn't a formal parameter name, but that identifier also appears in the expansion of a formal parameter. For instance, if you rewrote your MIN macro using the GNU extensions that let you avoid evaluating arguments twice...
#define MIN(x, y) ({          \
    __typeof__(x) a = (x);    \
    __typeof__(y) b = (y);    \
    a < b ? a : b;            \
})
and then you tried to use it like this:
int minint(int b, int a) { return MIN(b, a); }
the macro expansion would look like this:
int minint(int b, int a)
{
    return ({
        __typeof__(b) a = (b);
        __typeof__(a) b = (a);
        a < b ? a : b;
    });
}
and the function would always return its first argument, whether or not it was smaller. C has no way to avoid this problem in the general case, but a convention that many people use is to always put an underscore at the end of the name of each local variable defined inside a macro, and never put underscores at the ends of any other identifiers. (Contrast the behavior of Scheme's hygienic macros, which are guaranteed to not have this problem. Common Lisp makes you worry about it yourself, but at least there you have gensym to help out.)
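Applying that trailing-underscore convention to the MIN macro above would look something like this (my sketch, not from the original answer):

#define MIN(x, y) ({           \
    __typeof__(x) a_ = (x);    \
    __typeof__(y) b_ = (y);    \
    a_ < b_ ? a_ : b_;         \
})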
It will work as expected.
#define MIN(x, y) ((x) < (y) ? (x) : (y))
int x=1,y=2,z;
z = MIN(y, x);
becomes
int x=1,y=2,z;
z = ((y) < (x) ? (y) : (x));
Does the above have any syntactic or semantic errors? No. Therefore, the result will be as expected.
Since you're missing a close ')', I don't think it will work.
Edit:
Now that's fixed, it should work just fine. It won't be confused by x and y any more than it would be if you had a string x with "x" in it.
First off, this isn't about the C compiler, this is about the C preprocessor. A macro works much like a function, just through text substitution. The variable names you use have no impact on the outcome of the macro substitution. You could have done:
#define MIN(x,y) ((x)<(y)?(x):(y))
int blarg=1,bloort=2,z;
z=MIN(bloort,blarg);
and get the same result.
As a side note, the min() macro is a perfect example of what can go wrong when using macros; as an exercise, you should see what happens when you run the following code:
int x,y,z;
x=1;y=3;
z = min(++x,y);
printf("%d %d %d\n", x,y,z); /* we would expect to get 2 3 2, but we get 3 3 3 . */