Decipherment of preprocessor statement - c-preprocessor

Could somebody please help me to decipher the following
preprocessor statement.
#define ALLOC(x,y) x _##y; x* y = &_##y;
I am aware that here some memory allocation for some variable y of type x is done.
But I am not sure what the purpose of the ## in the statement above is. It would be great if somebody could help me out.

That's known as the token-pasting operator.

The token pasting operator (##) is used when you need to make one token out of two (or more) separate tokens.
Without it, i.e.
#define ALLOC(x,y) x _y; x* y = &_y;
ALLOC(a,b)
would expand into:
a _y; a* b = &_y;
because the preprocessor knows to replace y with b, but _y is a different token altogether.
With the ## operator (i.e. your example),
ALLOC(a,b)
would expand into:
a _b; a* b = &_b;
## joins the token _ and the token b to form the token _b.

A usage such as
ALLOC(int, integer)
would create the following code at the place where the macro is called:
int _integer;
int* integer = &_integer;
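A minimal sketch of the macro in use, assuming it is invoked inside a function body. The function name alloc_demo and the value 42 are made up for illustration.

```c
#define ALLOC(x, y) x _##y; x* y = &_##y;

/* Hypothetical helper: declares the variable/pointer pair and uses it. */
int alloc_demo(void)
{
    ALLOC(int, integer)   /* expands to: int _integer; int* integer = &_integer; */
    *integer = 42;        /* write through the pointer ...                       */
    return _integer;      /* ... and read the hidden variable back               */
}
```

Note that the macro body already ends in a semicolon, so the invocation does not need one.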

If I remember correctly, ## concatenates the tokens themselves.
For example, in your case ALLOC(int,var2) expands to: int _var2; int* var2 = &_var2;

Related

How to understand rescanning the replacement token sequence for more defined identifiers in case of # or ## in macros?

From The C Programming Language by K&R, about the operators # and ## in macro definitions:
Two special operators influence the replacement process. First,
if an occurrence of a parameter in the replacement token sequence
is immediately preceded by #, string quotes (") are placed around
the corresponding parameter, and then both the # and the
parameter identifier are replaced by the quoted argument. A \
character is inserted before each " or \ character that appears
surrounding, or inside, a string literal or character constant
in the argument.
Second, if the definition token sequence for either kind of macro
contains a ## operator, then just after replacement of the
parameters, each ## is deleted, together with any white space on
either side, so as to concatenate the adjacent tokens and
form a new token. The effect is undefined if invalid tokens are
produced, or if the result depends on the order of processing of the ## operators. Also, ## may not appear at the beginning or end of a replacement token sequence.
In both kinds of macro, the replacement token sequence is
repeatedly rescanned for more defined identifiers. However, once a
given identifier has been replaced in a given expansion, it is not
replaced if it turns up again during rescanning; instead it is left
unchanged.
I am having trouble understanding the last paragraph, especially the sentences in bold.
Could you rephrase it, and/or give some examples? Thanks.
Consider the snippet:
#define A B + C
#define B 1
#define C 2
int k = A;
In this case the first pass will replace A:
int k = B + C;
The second pass will replace B and C
int k = 1 + 2;
Now consider another snippet:
#define A B + C
#define B A
#define C A
int k = A;
Now the first pass will expand A once, as before:
int k = B + C;
The second will replace B and C as before:
int k = A + A;
But here it will stop, as A was already expanded before in the first pass.
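The stopping rule can be observed in a compilable sketch. It assumes an ordinary variable that happens to be named A, declared before the macros are defined. That variable, its value 5, and the function name rescan_demo are illustrative additions.

```c
/* An ordinary variable, declared before any macro named A exists. */
static int A = 5;

#define A B + C
#define B A
#define C A

int rescan_demo(void)
{
    /* A -> B + C -> A + A, and expansion stops there:
       the remaining A tokens are left alone and name the variable. */
    return A;
}
```

Because the leftover A tokens are handed to the compiler unchanged, the function returns 5 + 5. Without the variable, the compiler would report A as undeclared.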
To rephrase the first emphasized sentence: after the preprocessor replaces a macro name with its replacement text, it scans the result again and expands any other macro names it finds there, repeating until nothing is left to expand.
But if the expansion of some identifier produces that same identifier again, it is not replaced a second time; it is left as-is for the compiler to process. This means you can't define recursive macros like this:
#define recursion(a) (((a) > 0) ? recursion((a) - 1) : (a))
If you then write:
printf("%d\n", recursion(3));
Then the expansion would contain a call to recursion((3) - 1), and the compiler will treat it as a call to a nonexistent function.

Renaming a macro in C

Let's say I have already defined 9 macros from
ABC_1 to ABC_9
If there is another macro XYZ(num) whose objective is to call one of the ABC_{i} based on the value of num, what is a good way to do this? i.e. XYZ(num) should call/return ABC_num.
This is what the concatenation operator ## is for:
#define XYZ(num) ABC_ ## num
Macro arguments that are operands of ## are handled differently, however: they are not macro-expanded before being pasted (this is what makes name-pasting possible), and expansion happens only in the rescan pass. So if the number is stored in a second macro (or is the result of any kind of expansion, rather than a plain literal), you'll need another layer of expansion:
#define XYZ(num) XYZ_(num)
#define XYZ_(num) ABC_ ## num
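A small sketch of the difference between the one-layer and two-layer versions. The values of ABC_1 and ABC_2, the single-layer name XYZ_DIRECT, the macro WHICH, and the two function names are all hypothetical.

```c
#define ABC_1 10   /* hypothetical values */
#define ABC_2 20

#define XYZ_DIRECT(num) ABC_ ## num   /* single layer: num is pasted as written */
#define XYZ(num)  XYZ_(num)           /* extra layer: num is expanded first     */
#define XYZ_(num) ABC_ ## num

#define WHICH 2

int pick_literal(void)   { return XYZ_DIRECT(1); }  /* ABC_1 */
int pick_via_macro(void) { return XYZ(WHICH); }     /* XYZ_(2) -> ABC_2 */

/* XYZ_DIRECT(WHICH) would paste to ABC_WHICH, which is not defined here. */
```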
In the comments you say that num should be a variable, not a constant. The preprocessor builds compile-time expressions, not dynamic ones, so a macro isn't really going to be very useful here.
If you really wanted XYZ to have a macro definition, you could use something like this:
#define XYZ(num) ((int[]){ \
0, ABC_1, ABC_2, ABC_3, ABC_4, ABC_5, ABC_6, ABC_7, ABC_8, ABC_9 \
}[num])
Assuming ABC_{i} are defined as int values (at any rate they must all be the same type - this applies to any method of dynamically selecting one of them), this selects one with a dynamic num by building a temporary array and selecting from it.
This has no obvious advantages over a completely non-macro solution, though. (Even if you wanted to use macro metaprogramming to generate the list of names, you could still do that in a function or array definition.)
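For comparison, a non-macro version might look like the sketch below. The ABC_ values are invented, since the question does not give them, and returning 0 for an out-of-range num is an arbitrary choice.

```c
/* Hypothetical values; the question does not say what ABC_1..ABC_9 expand to. */
#define ABC_1 11
#define ABC_2 22
#define ABC_3 33
#define ABC_4 44
#define ABC_5 55
#define ABC_6 66
#define ABC_7 77
#define ABC_8 88
#define ABC_9 99

/* Index 0 is a placeholder so that abc_table[i] corresponds to ABC_i. */
static const int abc_table[] = {
    0, ABC_1, ABC_2, ABC_3, ABC_4, ABC_5, ABC_6, ABC_7, ABC_8, ABC_9
};

/* Runtime selection; out-of-range values yield 0 (an assumption). */
int xyz(int num)
{
    return (num >= 1 && num <= 9) ? abc_table[num] : 0;
}
```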
Yes, that's possible, using concatenation. For example:
#define FOO(x, y) BAR ##x(y)
#define BAR1(y) "hello " #y
#define BAR2(y) int y()
#define BAR3(y) return y
FOO(2, main)
{
puts(FOO(1, world));
FOO(3, 0);
}
This becomes:
int main()
{
puts("hello " "world");
return 0;
}

Macro expansion of __typeof__ to function name

I wrote the following code in plain C:
#define _cat(A, B) A ## _ ## B
#define cat(A, B) _cat(A, B)
#define plus(A, B) cat(cat(plus,__typeof__(A)),__typeof__(B))(A, B)
int main(int argc, const char * argv[])
{
double x = 1, y = 0.5;
double r = plus(x, y);
printf("%lf",r);
return 0;
}
Here, I would like the macro plus to be expanded becoming a function name which contains the types of the parameters. In this example I would like it to expand the following way
double r = plus(x, y)
...
/* First becomes*/
double r = cat(cat(plus,double),double)(x, y)
...
/* Then */
double r = cat(plus_double,double)(x, y)
...
/* And finally */
double r = plus_double_double(x, y)
However all I got from the preprocessor is
double r = plus___typeof__(x)___typeof(y)(x,y)
and gcc will obviously refuse to compile.
Now, I know that typeof evaluates at compile-time, and it is my understanding that a macro is only prevented from being evaluated when it is contained in a second macro which directly involves the stringify # and the concatenation ## tokens (that's the reason why I split cat in the way you see). If this is right, why doesn't __typeof__(x) get evaluated to double by the preprocessor? It seems to me that the behaviour should be perfectly clear at build time. Shouldn't __typeof__(x) evaluate to double before even going into _cat?
I searched and searched but I couldn't find anything... Am I doing something really really stupid?
I'm running Mac OS X Mountain Lion but I'm mostly interested in getting it work on any POSIX platform.
The reason this does not work is that typeof is not a macro but a reserved word in gcc's dialect of C, and is thus handled after the preprocessor has finished its work. A good analogy is the sizeof operator, which is not a macro either and is not expanded by the preprocessor. To do (approximately) what you want (pick a different function based on the type of the arguments), try the _Generic construct (new in C11).
Macro expansion occurs before C token analysis (see https://stackoverflow.com/a/1479972/1583175 for a diagram of the phases of translation)
The macro preprocessor is unaware of the type information -- it merely does text processing
The preprocessor knows nothing about types, only about textual tokens. __typeof__() gets evaluated by the compiler pass, after the preprocessor has finished performing macro replacements.
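A rough sketch of type-based dispatch with _Generic, assuming a C11 compiler. The functions plus_double_double and plus_int_int are hypothetical, and for brevity the selection looks only at the type of the first argument.

```c
/* Hypothetical per-type implementations. */
double plus_double_double(double a, double b) { return a + b; }
int    plus_int_int(int a, int b)             { return a + b; }

/* The compiler, not the preprocessor, picks the function from the type of A.
   Only the first argument is inspected here (a simplification). */
#define plus(A, B) _Generic((A),            \
        double: plus_double_double,         \
        int:    plus_int_int)((A), (B))
```

With this, plus(x, y) for double x and y resolves to plus_double_double(x, y). The preprocessor still only substitutes text; the type lookup happens later, during compilation.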

strange macro construction

I found this macro in c source which I'm porting now:
#define Round256(p0, p1) \
X##p0 += X##p1;
There is no variable X in that code. Can anyone tell me what the symbol # does in this context?
## textually pastes two tokens together.
So in your example, if called as follows:
Round256(one, two)
will be translated to:
Xone += Xtwo;
The macro simply pastes each argument onto the prefix X.
Suppose it was called like this:
Round256(1,2)
It would be expanded by the preprocessor as
X1 += X2;
which suggests that variables named X1 through Xn exist somewhere in the code.
The ## operator concatenates X with each argument token.
The reason I used numerical values is the name of the macro itself.
## is the pasting operator. It concatenates X (literally) and each argument. So Round256(one, two) will be converted to Xone += Xtwo; for example.
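A compilable sketch of the same macro, assuming variables named X0 and X1 exist at the point of use. Those variables and the function name round256_demo are invented for illustration.

```c
#define Round256(p0, p1) \
    X##p0 += X##p1;

int round256_demo(void)
{
    int X0 = 1, X1 = 2;   /* hypothetical variables */
    Round256(0, 1)        /* expands to: X0 += X1; */
    return X0;
}
```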

macro with arguments

Let's say I define a macro with arguments, then invoke it as follows:
#define MIN(x,y) ((x)<(y)?(x):(y))
int x=1,y=2,z;
z=MIN(y,x);
Given that (a) a macro works as text substitution, and (b) the actual args here have the same names as the formal args, only swapped, will this specific z=MIN(y,x) work as expected? If it will, why?
I mean, how does the preprocessor manage not to confuse actual and formal args?
This question is about technicalities of C compiler. This is not c++ question.
This question does not recommend anybody to use macros.
This question is not about programming style.
The internal representation of the macro will be something like this, where spaces indicate token boundaries, and #1 and #2 are magic internal-use-only tokens indicating where parameters are to be substituted:
MIN( #1 , #2 ) --> ( ( #1 ) < ( #2 ) ? ( #1 ) : ( #2 ) )
-- that is to say, the preprocessor doesn't make use of the names of macro parameters internally (except to implement the rules about redefinitions). So it doesn't matter that the formal parameter names are the same as the actual arguments.
What can cause problems is when the macro body makes use of an identifier that isn't a formal parameter name, but that identifier also appears in the expansion of a formal parameter. For instance, if you rewrote your MIN macro using the GNU extensions that let you avoid evaluating arguments twice...
#define MIN(x, y) ({ \
__typeof__(x) a = (x); \
__typeof__(y) b = (y); \
a < b ? a : b; \
})
and then you tried to use it like this:
int minint(int b, int a) { return MIN(b, a); }
the macro expansion would look like this:
int minint(int b, int a)
{
return ({
__typeof__(b) a = (b);
__typeof__(a) b = (a);
a < b ? a : b;
});
}
and the function would always return its first argument, whether or not it was smaller. C has no way to avoid this problem in the general case, but a convention that many people use is to always put an underscore at the end of the name of each local variable defined inside a macro, and never put underscores at the ends of any other identifiers. (Contrast the behavior of Scheme's hygienic macros, which are guaranteed to not have this problem. Common Lisp makes you worry about it yourself, but at least there you have gensym to help out.)
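A sketch of the trailing-underscore convention applied to the GNU-extension version of MIN. It needs a compiler that supports statement expressions and __typeof__ (GCC or Clang, for instance), and the local names x_ and y_ are simply one way to follow the convention.

```c
/* Requires GNU C extensions: statement expressions and __typeof__. */
#define MIN(x, y) ({            \
    __typeof__(x) x_ = (x);     \
    __typeof__(y) y_ = (y);     \
    x_ < y_ ? x_ : y_;          \
})

/* The parameter names b and a no longer clash with the macro's locals. */
int minint(int b, int a) { return MIN(b, a); }
```

This only works as long as callers never pass identifiers ending in an underscore, which is why it is a convention rather than a guarantee.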
It will work as expected.
#define MIN(x, y) ((x) < (y) ? (x) : (y))
int x=1,y=2,z;
z = MIN(y, x);
becomes
int x=1,y=2,z;
z = ((y) < (x) ? (y) : (x));
Does the above have any syntactic or semantic errors? No. Therefore, the result will be as expected.
Since you're missing a close ')', I don't think it will work.
Edit:
Now that's fixed, it should work just fine. It won't be confused by x and y any more than it would be if you had a string x with "x" in it.
First off, this isn't about the C compiler, this is about the C preprocessor. A macro works much like a function, just through text substitution. The variable names you use have no effect on the outcome of the macro substitution. You could have done:
#define MIN(x,y) ((x)<(y)?(x):(y))
int blarg=1,bloort=2,z;
z=MIN(bloort,blarg);
and get the same result.
As a side note, the MIN() macro is a perfect example of what can go wrong when using macros. As an exercise, see what happens when you run the following code:
int x,y,z;
x=1;y=3;
z = MIN(++x,y);
printf("%d %d %d\n", x,y,z); /* we would expect to get 2 3 2, but we get 3 3 3 */
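One way to sidestep the double evaluation is an ordinary (inline) function, sketched here for int only. The names min_int, via_macro and via_function are made up for this comparison.

```c
#define MIN(x,y) ((x)<(y)?(x):(y))

/* Function alternative: each argument is evaluated exactly once. */
static inline int min_int(int a, int b) { return a < b ? a : b; }

/* The macro pastes ++*x into two places, so it can run twice. */
int via_macro(int *x, int y)    { return MIN(++*x, y); }

/* The function call evaluates ++*x once, before min_int runs. */
int via_function(int *x, int y) { return min_int(++*x, y); }
```

Starting from *x == 1 and y == 3, via_macro leaves *x at 3 and returns 3, while via_function leaves *x at 2 and returns 2.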

Resources