How do function renaming macros work, and should one use them? - c

Everyone knows about the classic #define DEFAULT_VALUE 100 macro, where the preprocessor just finds the "token" and replaces it with whatever the value is.
The problem I am having is understanding the function version of this #define my_puts(x) puts(x). I have K&R in front of me but I simply cannot find a suitable explanation. For instance:
why do I need to supply the number of arguments?
why can their name be whatever?
why don't I have to supply the type?
But mainly I would like to know how this replacement functions under the hood.
In the back of my mind I think I have a memory of someone saying somewhere that this is bad because there are no types.
In short, I would like to know if it is safe and secure to use macros to rename functions (as opposed to the alternative of manually wrapping the function in another function).
Thank you!

The problem I am having is understanding the function version of this #define my_puts(x) puts(x).
Part of your confusion might arise from thinking of this variety as a "function renaming" macro. A more conventional term is "function-like", referring to the form of the macro definition and usage. Providing aliases for function names or converting from one function name to another is a relatively minor use for this kind of macro.
Such macros are better regarded more generally, simply as macros that accept parameters. From that standpoint, your specific questions have relatively clear answers:
why do I need to supply the number of arguments?
You are primarily associating parameter names with the various positions in the macro's parameter list. This is necessary so that the preprocessor can properly expand the macro. That the number of parameters is thereby conveyed (except for variadic macros) is of secondary importance.
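As an aside, a variadic macro (the case mentioned parenthetically above) looks roughly like this; LOG is an illustrative name, not something from the question:
#include <stdio.h>

/* A C99 variadic function-like macro: the parameter list does not pin
 * down the argument count.  Requires at least one argument after fmt. */
#define LOG(fmt, ...) fprintf(stderr, fmt, __VA_ARGS__)

/* Example use: LOG("%s: %d\n", "count", 42); */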
why can their name be whatever?
"Whatever" is a little too strong, but the answer is that the names of macro parameters are significant only within the scope of the macro definition. The preprocessor substitutes the actual arguments into each expansion in place of the parameter names whenever it expands the macro. This is analogous to bona fide functions, actually, so I'm not really sure why this particular uncertainty arises for you.
why don't I have to supply the type?
Of the macro? Because to the extent that macros have a type, they all have the same one. They all expand to sequences of zero or more tokens. You can view this as a source-to-source translation. The resulting token sequence will be interpreted by the compiler at a subsequent stage in the process.
But mainly I would like to know how this replacement functions under the hood.
Roughly speaking, wherever the name of an in-scope function-like macro appears in the source code followed by a parenthesized list of arguments, the macro name and argument list are replaced by the expansion of the macro, with the macro arguments substituted appropriately.
For example, consider this function-like macro, which you might see in real source code:
#define MIN(x, y) (((x) <= (y)) ? (x) : (y))
Within the scope of that definition, this code ...
n = MIN(10, z);
... expands to
n = (((10) <= (z)) ? (10) : (z));
Note well that
the function-like macro is not providing a function alias in this case.
the macro arguments are substituted into the macro expansion wherever they appear as complete tokens in the macro's defined replacement text.
In the back of my mind I think I have a memory of someone saying somewhere that this is bad because there are no types.
Well, there are no types declared in the macro definition. That doesn't prevent all the normal rules about data types from applying to the source code resulting from the preprocessing stage. Both of these factors need to be taken into account. In some ways, the MIN() macro in the above example is more flexible than any one function can be. Is that bad? I don't mean to deny that there are arguments against, but it's a multifaceted question that is not well captured by a single consideration or a plain "good" vs. "bad" evaluation.
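For instance, here is a small illustration of that flexibility (not from the original answer): the same macro works for any arithmetic operand types.
#include <stdio.h>

#define MIN(x, y) (((x) <= (y)) ? (x) : (y))

int main(void)
{
    int    i = MIN(3, 7);       /* int operands                */
    double d = MIN(2.5, 1.25);  /* same macro, double operands */
    printf("%d %f\n", i, d);    /* prints: 3 1.250000          */
    return 0;
}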
In short, I would like to know if it is safe and secure to use macros to rename functions (as opposed to the alternative of manually wrapping the function in another function).
That's largely a different question from any of the above. The semantics of function-like macros are well-defined. There is no inherent safety or security issue. But function-like macros do obscure what is going on, and thereby make it more difficult to analyze code. This is therefore mostly a stylistic issue.
Function-like macros do have detractors these days, especially in the C++ community. In most cases, they have little to offer that would distinguish them as superior to functions.
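For comparison, here is the question's my_puts macro next to a minimal wrapper-function alternative (a sketch; my_puts_fn is an invented name, and static inline assumes C99 or later):
#include <stdio.h>

/* Alias via a function-like macro: purely textual replacement,
 * no type checking of its own. */
#define my_puts(x) puts(x)

/* Alias via a wrapper function: type-checked, addressable, and
 * visible to a debugger. */
static inline int my_puts_fn(const char *s)
{
    return puts(s);
}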

Related

How to process macros in LEX?

How do I implement #define in yacc/bison?
For example:
#define f(x) x*x
If f(x) appears anywhere in any function, it is replaced by the right-hand side of the macro, substituting the argument ‘x’.
For example, f(3) would be replaced with 3*3. The macro can call another macro too.
It's not usually possible to do macro expansion inside a parser, at least not C-style macros, because C-style macro expansion doesn't respect syntax. For example
#define IF if(
#define THEN )
is legal (although very bad style IMHO). But for that to be handled inside the grammar, it would be necessary to allow a macro identifier to appear anywhere in the input, not just where an identifier might be expected. The necessary modifications to the grammar are going to make it much less readable and are very likely to introduce parser action conflicts. [Note 1]
Alternatively, you could do the macro expansion in the lexical analyzer. The lexical analyzer is not a parser, but parsing a C-style macro invocation doesn't require much sophistication, and if macro parameters were not allowed, it would be even simpler. This is how Flex handles macro replacement in its regular expressions ({identifier}, for example). [Note 2] Since Flex macros are just raw character sequences, not token lists as with C-style macros, they can be handled by pushing the replacement text back into the input stream. (F)lex provides the unput special action for this purpose. unput pushes one character back into the input stream, so if you want to push an entire macro replacement, you have to unput it one character at a time, back to front, so that the last character unput is the first one to be read afterwards.
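Roughly, the back-to-front push-back described above could look like this (a sketch only; unput_string is an illustrative name, and the function is assumed to live inside the generated scanner, where flex's unput() is visible):
#include <string.h>   /* strlen */

/* Push a macro's replacement text back into the input stream so the
 * lexer rescans it; characters are pushed last-to-first. */
static void unput_string(const char *s)
{
    size_t n = strlen(s);
    while (n > 0)
        unput(s[--n]);
}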
That's workable but ugly. And it's not really scalable to even the small feature list provided by the C preprocessor. And it violates the fundamental principle of software design, which is that each component does just one thing (so that it can do it well).
So that leaves the most common approach, which is to add a separate macro processor component, so that instead of dividing the parse into lexical scan/syntax analysis, the parse becomes lexical scan/macro expansion/syntax analysis. [Note 3]
A C-style macro processor which works between the lexical analyser and the syntactic analyser could itself be written in Bison. As I mentioned above, the parsing requirements are generally minimal, but there is still parsing to be done, and Bison is presumably already part of the project. Although I don't know of any macro processor (other than proof-of-concept programs I've written myself) which does this, I think it's a very flexible solution. In particular, the Bison syntactic analysis phase could be implemented with a push-parser, which avoids the need to produce the entire macro-expanded token stream in order to make it available to a traditional pull-parser.
That's not the only way to design macros, though. Indeed, it has a lot of shortcomings, because the macro expansions are not hygienic, respecting neither syntax nor scope. Probably anyone who has used C macros has at one time or other been bitten by these problems; the simplest manifestation is defining a macro like:
#define NEXT(a) a + 1
and then writing
int x = NEXT(a) * 3;
which is not going to produce the expected result: it expands to int x = a + 1 * 3;, which assigns a + 3 rather than (a + 1) * 3. Also, any macro expansion which needs to use a local variable will sooner or later produce an incorrect expansion because of an unexpected name collision. Hygienic macro expansion seeks to solve these issues by viewing macro expansion as an operation on syntax trees, not token streams, making the parsing paradigm lexical scan/syntax analysis/macro expansion (of the parse tree). For that operation, the appropriate tool might well be some kind of tree parser.
Notes
Also, you'd want to remove the token from the parse. Yacc/Bison does have a poorly-documented feature, YYBACKUP, which might possibly be able to help accomplish this. I don't know if that's one of its intended use cases; indeed, it is not clear to me what its intended use cases are.
The (f)lex documentation calls these definitions, but they really are macros, and they suffer from all the usual problems macros bring with them, such as mysterious interactions with surrounding syntax.
Another possibility is macro expansion/lexical scan/syntax analysis, which could be implemented using a macro processor like M4. But that completely divorces the macros from the rest of the language.
Yacc and Lex generate C source in the end, so you can use macros inside the parser and lexer actions.
The actual #define preprocessor directives can go in the first section of the lexer and parser files:
%{
// Somewhere here
#define f(x) x*x
%}
These sections will be copied verbatim to the generated C source.

How can I get the function name as text not string in a macro?

I am trying to use a function-like macro to generate an object-like macro name (generically, a symbol). The following will not work because __func__ (C99 6.4.2.2-1) puts quotes around the function name.
#define MAKE_AN_IDENTIFIER(x) __func__##__##x
The desired result of calling MAKE_AN_IDENTIFIER(NULL_POINTER_PASSED) would be MyFunctionName__NULL_POINTER_PASSED. There may be other reasons this would not work (such as __func__ being taken literally and not interpreted, but I could fix that) but my question is what will provide a predefined macro like __func__ except without the quotes? I believe this is not possible within the C99 standard so valid answers could be references to other preprocessors.
Presently I have simply created my own object-like macro and redefined it manually before each function to be the function name. Obviously this is a poor and probably unacceptable practice. I am aware that I could take an existing cpp program or library and modify it to provide this functionality. I am hoping there is either a commonly used cpp replacement which provides this or a preprocessor library (prefer Python) which is designed for extensibility so as to allow me to 'configure' it to create the macro I need.
I wrote the above to try to provide a concise and well-defined question, but it is certainly the Y referred to by @Ruud. The X is...
I am trying to manage unique values for reporting errors in an embedded system. The values will be passed as a parameter to a(some) particular function(s). I have already written a Python program using pycparser to parse my code and identify all symbols being passed to the function(s) of interest. It generates a .h file of #defines maintaining the values of previously existing entries, commenting out removed entries (to avoid reusing the value and also allow for reintroduction with the same value), assigning new unique numbers for new identifiers, reporting malformed identifiers, and also reporting multiple use of any given identifier. This means that I can simply write:
void MyFunc(int * p)
{
    if (p == NULL)
    {
        myErrorFunc(MYFUNC_NULL_POINTER_PASSED);
        return;
    }
    // do something actually interesting here
}
and the Python program will create the #define MYFUNC_NULL_POINTER_PASSED 7 (or whatever next available number) for me with all the listed considerations. I have also written a set of macros that further simplify the above to:
#define FUNC MYFUNC
void MyFunc(int * p)
{
    RETURN_ASSERT_NOT_NULL(p);
    // do something actually interesting here
}
assuming I provide the #define FUNC. I want to use the function name since that will be constant throughout many changes (as opposed to __LINE__) and will be much easier for someone to transfer the value from the old generated #define to the new generated #define when the function itself is renamed. Honestly, I think the only reason I am trying to 'solve' this 'issue' is because I have to work in C rather than C++. At work we are writing fairly object-oriented C, and so there is a lot of NULL pointer checking and IsInitialized checking. I have two-line functions that turn into 30 because of all these basic checks (these macros reduce those lines by a factor of five). While I do enjoy the challenge of crazy macro development, I much prefer to avoid them. That said, I dislike repeating myself and hiding the functional code in a pile of error checking even more than I dislike crazy macros.
If you prefer to take a stab at this issue, have at.
__FUNCTION__ used to compile to a string literal (I think in gcc 2.96), but it hasn't for many years. Now instead we have __func__, which compiles to a string array, and __FUNCTION__ is a deprecated alias for it. (The change was a bit painful.)
But in neither case was it possible to use this predefined macro to generate a valid C identifier (i.e. "remove the quotes").
But could you instead use the line number rather than function name as part of your identifier?
If so, the following would work. As an example, compiling the following 5-line source file:
#define CONCAT_TOKENS4(a,b,c,d) a##b##c##d
#define EXPAND_THEN_CONCAT4(a,b,c,d) CONCAT_TOKENS4(a,b,c,d)
#define MAKE_AN_IDENTIFIER(x) EXPAND_THEN_CONCAT4(line_,__LINE__,__,x)
static int MAKE_AN_IDENTIFIER(NULL_POINTER_PASSED);
will generate the warning:
foo.c:5: warning: 'line_5__NULL_POINTER_PASSED' defined but not used
As pointed out by others, there is no macro that returns the (unquoted) function name (mainly because the C preprocessor has insufficient syntactic knowledge to recognize functions). You would have to explicitly define such a macro yourself, as you already did:
#define FUNC MYFUNC
To avoid having to do this manually, you could write your own preprocessor to add the macro definition automatically. A similar question is this: How to automatically insert pragmas in your program
If your source code has a consistent coding style (particularly indentation), then a simple line-based filter (sed, awk, perl) might do. In its most naive form: every function starts with a line that does not start with a hash or whitespace, and ends with a closing parenthesis or a comma. With awk:
{
    print $0;
}
/^[^# \t].*[,\)][ \t]*$/ {
    sub(/\(.*$/, "");
    sub(/^.*[ \t]/, "");
    print "#define FUNC " toupper($0);
}
For a more robust solution, you need a compiler framework like ROSE.
GNU C has a __FUNCTION__ macro, but sadly even that cannot be used in the way you are asking.

Where macros variable created? and size of the variable?

I have some doubts about macros. When we create one like the following:
#define DATA 40
where is DATA created? Do I need to know its size, and what is the type of DATA?
In Java we would declare this along with a data type.
Also, what about macro functions: are they all inline functions?
Macros are essentially text substitutions.
DATA does not exist beyond the pre-processing stage. The compiler never sees it. Since no variable is created, we can't talk about its data type, size or address.
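To see this concretely, you can inspect the preprocessor output (for example with gcc -E); the variable name below is illustrative:
#define DATA 40

int x = DATA;   /* after preprocessing, the compiler sees: int x = 40; */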
Macros are literally pasted into the code. They are not "parsed", but expanded. The compiler does not see DATA, but 40. This is why you must be careful because macros are not like normal functions or variables. See gcc's documentation.
A macro is a fragment of code which has been given a name. Whenever
the name is used, it is replaced by the contents of the macro. There
are two kinds of macros. They differ mostly in what they look like
when they are used. Object-like macros resemble data objects when
used, function-like macros resemble function calls.
You may define any valid identifier as a macro, even if it is a C
keyword. The preprocessor does not know anything about keywords. This
can be useful if you wish to hide a keyword such as const from an
older compiler that does not understand it. However, the preprocessor
operator defined (see Defined) can never be defined as a macro, and
C++'s named operators (see C++ Named Operators) cannot be macros when
you are compiling C++.
Macros are not present in your final executable; they are present in your source code only. Macros are processed during the preprocessing stage of compilation. You can find more info about macros here.
Preprocessor directives like #define are replaced with the corresponding text during the preprocessing phase of compilation, and are (almost) never represented in the final executable.

C macros: advantage/intent of apparently useless macro

I have some experience in programming in C but I would not dare to call myself proficient.
Recently, I encountered the following macro:
#define CONST(x) (x)
I find it typically used in expressions like for instance:
double x, y;
x = CONST(2.0)*y;
Completely baffled by the point of this macro, I extensively researched the advantages/disadvantages and properties of macros but still I can not figure out what the use of this particular macro would be. Am I missing something?
As presented in the question, you are right that the macro does nothing.
This looks like some artificial structure imposed by whoever wrote that code, maybe to make it abundantly clear where the constants are, and be able to search for them? I could see the advantage in having searchable constants, but this is not the best way to achieve that goal.
It's also possible that this was part of some other macro scheme that either never got implemented or was only partially removed.
Some (old) C compilers do not support the const keyword, and this macro is most probably a remnant of a more elaborate set of macros that handled different compilers. Used as in x = CONST(2.0)*y;, though, it makes no sense.
You can check this section from the Autoconf documentation for more details.
EDIT: Another purpose of this macro might be custom preprocessing (for extracting and/or replacing certain constants for example), like Qt Framework's Meta Object Compiler does.
There is absolutely no benefit of that macro and whoever wrote it must be confused. The code is completely equivalent to x = 2.0*y;.
Well, this kind of macro could actually be useful when there is a need to work around macro expansion.
A typical example of such a need is the stringification macro. Refer to the following question for an example: C Preprocessor, Stringify the result of a macro
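The usual shape of that workaround is a two-step macro pair along these lines (the names here are illustrative, not taken from the linked question):
#define STRINGIFY(x)             #x
#define EXPAND_THEN_STRINGIFY(x) STRINGIFY(x)

#define VERSION 42
/* STRINGIFY(VERSION)             expands to "VERSION" */
/* EXPAND_THEN_STRINGIFY(VERSION) expands to "42"      */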
Now, in your specific case, I don't see the benefit apart from extreme documentation or code-parsing purposes.
Another use could be to reserve those values for future function invocations, something like this:
/* #define CONST(x) (x) */
#define CONST(x) some_function(x)
// ...
double x, y;
x = CONST(2.0)*y; // x = some_function(2.0)*y;
Another good thing about this macro would be something like this
result=CONST(number+number)*2;
or something related to comparisons
result=CONST(number>0)*2;
If there is some problem with this macro, it is probably the name. This "CONST" thing isn't related to constants but to something else. It would be nice to look at the rest of the code to know why the author called it CONST.
This macro does have the effect of wrapping parentheses around x during the macro expansion.
I'm guessing someone is trying to allow for something along the lines of
CONST(3+2)*y
which, without the parens, would become
3+2*y
but with the parens becomes
(3+2)*y
I seem to recall that we had the need for something like this in a previous development lifetime.

When should you use macros instead of inline functions?

In a previous question what I thought was a good answer was voted down for the suggested use of macros
#define radian2degree(a) (a * 57.295779513082)
#define degree2radian(a) (a * 0.017453292519)
instead of inline functions. Please excuse the newbie question, but what is so evil about macros in this case?
Most of the other answers discuss why macros are evil including how your example has a common macro use flaw. Here's Stroustrup's take: http://www.research.att.com/~bs/bs_faq2.html#macro
But your question was asking what macros are still good for. There are some things where macros are better than inline functions, and that's where you're doing things that simply can't be done with inline functions, such as:
token pasting
dealing with line numbers or such (as for creating error messages in assert())
dealing with things that aren't expressions (for example, how many implementations of offsetof() use a type name to create a cast operation)
the macro to get a count of array elements (can't do it with a function, as the array name decays to a pointer too easily); a sketch follows this list
creating 'type polymorphic' function-like things in C where templates aren't available
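For instance, the array-count macro mentioned in the list above is typically written along these lines (ARRAY_COUNT is an illustrative name):
#include <stdio.h>

/* Works only on true arrays; if the array has decayed to a pointer,
 * the result is silently wrong. */
#define ARRAY_COUNT(a) (sizeof(a) / sizeof((a)[0]))

int main(void)
{
    int values[] = { 1, 2, 3, 4 };
    printf("%zu\n", ARRAY_COUNT(values));   /* prints 4 */
    return 0;
}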
But with a language that has inline functions, the more common uses of macros shouldn't be necessary. I'm even reluctant to use macros when I'm dealing with a C compiler that doesn't support inline functions. And I try not to use them to create type-agnostic functions if at all possible (creating several functions with a type indicator as a part of the name instead).
I've also moved to using enums for named numeric constants instead of #define.
There's a couple of strictly evil things about macros.
They're text processing, and aren't scoped. If you #define foo 1, then any subsequent use of foo as an identifier will fail. This can lead to odd compilation errors and hard-to-find runtime bugs.
They don't take arguments in the normal sense. You can write a function that will take two int values and return the maximum, because the arguments will be evaluated once and the values used thereafter. You can't write a macro to do that, because it will evaluate at least one argument twice, and fail with something like max(x++, --y).
There are also common pitfalls. It's hard to get multiple statements right in them, and they require a lot of possibly superfluous parentheses.
In your case, you need parentheses:
#define radian2degree(a) (a * 57.295779513082)
needs to be
#define radian2degree(a) ((a) * 57.295779513082)
and you're still stepping on anybody who writes a function radian2degree in some inner scope, confident that that definition will work in its own scope.
For this specific macro, if I use it as follows:
int x=1;
x = radian2degree(x);
float y=1;
y = radian2degree(y);
there would be no type checking, and x and y will contain different values.
Furthermore, the following code
float x=1, y=2;
float z = radian2degree(x+y);
will not do what you think, since it will translate to
float z = x+y*0.017453292519;
instead of
float z = (x+y)*0.017453292519;
which is the expected result.
These are just a few examples of the misbehavior and misuse that macros might invite.
Edit
You can see additional discussion about this here.
If possible, always use inline functions. These are type-safe and cannot be easily redefined.
#defines can be redefined or undefined, and there is no type checking.
Macros are relatively often abused and one can easily make mistakes using them as shown by your example. Take the expression radian2degree(1 + 1):
with the macro it will expand to 1 + 1 * 57.29... = 58.29...
with a function it will be what you want it to be, namely (1 + 1) * 57.29... = ...
More generally, macros are evil because they look like functions, so they trick you into using them just like functions, but they have subtle rules of their own. In this case, the correct way to write it would be (notice the parentheses around a):
#define radian2degree(a) ((a) * 57.295779513082)
But you should stick to inline functions. See these links from the C++ FAQ Lite for more examples of evil macros and their subtleties:
inline vs. macros
macros containing if
macros with multiple lines
macros used to paste two tokens together
The compiler's preprocessor is a finicky thing, and therefore a terrible candidate for clever tricks. As others have pointed out, it's easy for the compiler to misunderstand your intention with the macro, and it's easy for you to misunderstand what the macro will actually do, but most importantly, you can't step into macros in the debugger!
Macros are evil because you may end up passing more than a variable or a scalar to them, and this could result in unwanted behavior (define a max macro to determine the max of a and b, but pass a++ and b++ to the macro and see what happens).
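A quick sketch of that pitfall (MAX here is an illustrative definition, not code from this answer):
#define MAX(a, b) ((a) > (b) ? (a) : (b))

void demo(void)
{
    int a = 3, b = 5;
    int m = MAX(a++, b++);  /* expands to ((a++) > (b++) ? (a++) : (b++)),
                               so b is incremented twice and m ends up as 6 */
    (void)m;
}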
If your function is going to be inlined anyway, there is no performance difference between a function and a macro. However, there are several usability differences between a function and a macro, all of which favor using a function.
If you build the macro correctly, there is no problem. But if you use a function, the compiler will do it correctly for you every time. So using a function makes it harder to write bad code.
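For completeness, a minimal sketch of the inline-function alternative favored throughout this thread (constants copied from the question's macros; assumes C99 or later for inline):
static inline double radian2degree(double a)
{
    return a * 57.295779513082;
}

static inline double degree2radian(double a)
{
    return a * 0.017453292519;
}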
