I have doubts about macros, When we create like the following
#define DATA 40
where DATA can be create? and i need to know size also?and type of DATA?
In java we create macro along with data type,
and what about macro function they are all inline function?
Macros are essentially text substitutions.
DATA does not exist beyond the pre-processing stage. The compiler never sees it. Since no variable is created, we can't talk about its data type, size or address.
Macros are literally pasted into the code. They are not "parsed", but expanded. The compiler does not see DATA, but 40. This is why you must be careful because macros are not like normal functions or variables. See gcc's documentation.
A macro is a fragment of code which has been given a name. Whenever
the name is used, it is replaced by the contents of the macro. There
are two kinds of macros. They differ mostly in what they look like
when they are used. Object-like macros resemble data objects when
used, function-like macros resemble function calls.
You may define any valid identifier as a macro, even if it is a C
keyword. The preprocessor does not know anything about keywords. This
can be useful if you wish to hide a keyword such as const from an
older compiler that does not understand it. However, the preprocessor
operator defined (see Defined) can never be defined as a macro, and
C++'s named operators (see C++ Named Operators) cannot be macros when
you are compiling C++.
macro's are not present in your final executable. They present in your source code only.macro's are processed during pre-processing stage of compilation.You can find more info about macro's here
Preprocessor directives like #define are replaced with the corresponding text during the preprocessing phase of compilation, and are (almost) never represented in the final executable.
Related
Everyone knows about classic #define DEFAULT_VALUE 100 macro where the preprocessor will just find the "token" and replace it with whatever the value is.
The problem I am having is understanding the function version of this #define my_puts(x) puts(x). I have K&R in front of me but I simply cannot find a suitable explanation. For instance:
why do I need to supply the number of arguments?
why can their name be whatever?
why don't I have to supply the type?
But mainly I would like to know how this replacement functions under the hood.
In the back of my mind I think I have a memory of someone saying somewhere that this is bad because there are no types.
In short, I would like to know if it is safe and secure to use macros to rename functions (as opposed to the alternative of manually wrapping the function in another function).
Thank you!
The problem I am having is understanding the function version of this #define my_puts(x) puts(x).
Part of your confusion might arise from thinking of this variety as a "function renaming" macro. A more conventional term is "function-like", referring to the form of the macro definition and usage. Providing aliases for function names or converting from one function name to another is a relatively minor use for this kind of macro.
Such macros are better regarded more generally, simply as macros that accept parameters. From that standpoint, your specific questions have relatively clear answers:
why do I need to supply the number of arguments?
You are primarily associating parameter names with the various positions in the macro's parameter list. This is necessary so that the preprocessor can properly expand the macro. That the number of parameters is thereby conveyed (except for variadic macros) is of secondary importance.
why can their name be whatever?
"Whatever" is a little too strong, but the answer is that the names of macro parameters are significant only within the scope of the macro definition. The preprocessor substitutes the actual arguments into each expansion in place of the parameter names whenever it expands the macro. This is analogous to bona fide functions, actually, so I'm not really sure why this particular uncertainty arises for you.
why don't I have to supply the type?
Of the macro? Because to the extent that macros have a type, they all have the same one. They all expand to sequences of zero or more tokens. You can view this as a source-to-source translation. The resulting token sequence will be interpreted by the compiler at a subsequent stage in the process.
But mainly I would like to know how this replacement functions under the hood.
Roughly speaking, wherever the name of an in-scope function like macro appears in the source code followed by a parenthesized list of arguments, the macro name and argument list are replaced by the expansion of the macro, with the macro arguments substituted appropriately.
For example, consider this function-like macro, which you might see in real source code:
#define MIN(x, y) (((x) <= (y)) ? (x) : (y))
Within the scope of that definition, this code ...
n = MIN(10, z);
... expands to
n = (((10) <= (z)) ? (10) : (z));
Note well that
the function-like macro is not providing function alias in this case.
the macro arguments are substituted into the macro expansion wherever they appear as complete tokens in the macro's defined replacement text.
In the back of my mind I think I have a memory of someone saying somewhere that this is bad because there are no types.
Well, there are no types declared in the macro definition. That doesn't prevent all the normal rules around data type from applying to the source code resulting from the preprocessing stage. Both of these factors need to be taken into account. In some ways, the MIN() macro in the above example is more flexible than any one function can be be. Is that bad? I don't mean to deny that there are arguments against, but it's a multifaceted question that is not well captured by a single consideration or a plain "good" vs. "bad" evaluation.
In short, I would like to know if it is safe and secure to use macros to rename functions (as opposed to the alternative of manually wrapping the function in another function).
That's largely a different question from any of the above. The semantics of function-like macros are well-defined. There is no inherent safety or security issue. But function-like macros do obscure what is going on, and thereby make it more difficult to analyze code. This is therefore mostly a stylistic issue.
Function-like macros do have detractors these days, especially in the C++ community. In most cases, they have little to offer to distinguish themselves as superior to functions.
The LLVM libc++ headers have a macro, used in function declarations, named _LIBCPP_INLINE_VISIBILITY.
I don't understand what it means; I looked at its definition, and it says:
// Just so we can migrate to the new macros gradually.
#define _LIBCPP_INLINE_VISIBILITY _LIBCPP_HIDE_FROM_ABI
... and this second macro has no definition I can find. So, what does _LIBCPP_INLINE_VISIBILITY mean and what is it typically expanded into?
(Thanks, #Ruslan)
The intent is to hide functions marked with it from appearing in dynamic libraries ("hide from the ABI"). This used to be done by making such functions inline only, but now, the clang attribute attribute((internal_linkage)) is used; that's the definition of _LIBCPP_HIDE_FROM_ABI.
As for the inline-for-invisibility macro _LIBCPP_INLINE_VISIBILITY - what you're seeing is it being redefined to what its name should have been to being with.
I was reading the C Preprocessor guide page on gnu.org on computed includes which has the following explanation:
2.6 Computed Includes
Sometimes it is necessary to select one of several different header
files to be included into your program. They might specify
configuration parameters to be used on different sorts of operating
systems, for instance. You could do this with a series of
conditionals,
#if SYSTEM_1
# include "system_1.h"
#elif SYSTEM_2
# include "system_2.h"
#elif SYSTEM_3 …
#endif
That rapidly becomes tedious. Instead, the preprocessor offers the
ability to use a macro for the header name. This is called a computed
include. Instead of writing a header name as the direct argument of
‘#include’, you simply put a macro name there instead:
#define SYSTEM_H "system_1.h"
…
#include SYSTEM_H
This doesn't make sense to me. The first code snippet allows for optionality based on which system type you encounter by using branching if elifs. The second seems to have no optionality as a macro is used to define a particular system type and then the macro is placed into the include statement without any code that would imply its definition can be changed. Yet, the text implies these are equivalent and that the second is a shorthand for the first. Can anyone explain how the optionality of the first code snippet exists in the second? I also don't know what code is implied to be contained in the "..." in the second code snippet.
There's some other places in the code or build system that define or don't define the macros that are being tested in the conditionals. What's suggested is that instead of those places defining lots of different SYSTEM_1, SYSTEM_2, etc. macros, they'll just define SYSTEM_H to the value that's desired.
Most likely this won't actually be in an explicit #define, instead of will be in a compiler option, e.g.
gcc -DSYSTEM_H='"system_1.h"' ...
And this will most likely actually come from a setting in a makefile or other configuration file.
I was reading the book "Compilers: Principles, Techniques, and Tools (2nd Edition)" by Alfred V. Aho. There is an example in this book (example 1.7) which asks to analyze the scope of x in the following macro definition in C:
#define a (x+1)
From this example,
We cannot resolve x statically, that is, in terms of the program text.
In fact, in order to interpret x, we must use the usual dynamic-scope
rule. We examine all the function calls that are currently active, and
we take the most recently called function that has a declaration of x.
It is to this declaration that the use of x refers.
I've become confused reading this - as far as I know, macro substitution happens in the preprocessing stage, before compilation starts. But if I get it right, the book says it happens when the program is getting executed. Can anyone please clarify this?
The macro itself has no notion of scope, at least not in the same sense as the C language has. Wherever the symbol a appears in the source after the #define (and before a possible #undef) it is replaced by (x + 1).
But the text talks about the scope of x, the symbol in the macro substitution. That is interpreted by the usual C rules. If there is no symbol x in the scope where a was substituted, this is a compilation error.
The macro is not self-contained. It uses a symbol external to the macro, some kind of global variable if you will, but one whose meaning will change according to the place in the source text where the macro is invoked. I think what the quoted text wants to say is that we cannot know what macro a does unless we know where it is evoked.
I've become confused reading this - as far as I know, macro substitution happens in preprocessing stage, before compilation starts.
Yes, this is how a compiler works.
But if I get it right, the book says it happens when the program is getting executed. Can anyone please clarify this?
Speaking without referring to the book, there are other forms of program analysis besides translating source code to object code (a.k.a. compilation). A C compiler replaces macros before compiling, thus losing information about what was originally a macro, because that information is not significant to the rest of the translation process. The question of the scope of x within the macro never comes up, so the compiler may ignore the issue.
Debuggers often implement tighter integration with source code, though. One could conceive of a debugger that points at subexpressions while stepping through the program (I have seen this feature in an embedded toolchain), and furthermore points inside macros which generate expressions (this I have never seen, but it's conceivable). Or, some debuggers allow you to point at any identifier and see its value. Pointing at the macro definition would then require resolving the identifiers used in the macro, as Aho et al discuss there.
It's difficult to be sure without seeing more context from the book, but I think that passage is at least unclear, and probably incorrect. It's basically correct about how macro definitions work, but not about how the name x is resolved.
#define a (x+1)
C macros are expanded early in the compilation process, in translation phase 4 of 8, as specified in N1570 5.1.1.2. Variable names aren't resolved until phase 7).
So the name x will be meaningfully visible to the compiler, not at the point where the macro is defined, but at the point in the source code where the macro a is used. Two different uses of the a macro could refer to two different declarations of variables named x.
We cannot resolve x statically, that is, in terms of the program text.
We cannot resolve it at the point of the macro definition.
In fact, in order to interpret x, we must use the usual dynamic-scope
rule. We examine all the function calls that are currently active, and
we take the most recently called function that has a declaration of x.
It is to this declaration that the use of x refers.
This is not correct for C. When the compiler sees a reference to x, it must determine what declaration it refers to (or issue a diagnostic if there is no such declaration). That determination does not depend on currently active function calls, something that can only be determined at run time. C is statically scoped, meaning that the appropriate declaration of x can be determined entirely by examining the program text.
At compile time, the compiler will examine symbol table entries for the current block, then for the enclosing block, then for the current function (x might be the name of a parameter), then for file scope.
There are languages that uses dynamic scoping, where the declaration a name refers to depends on the current run-time call stack. C is not one of them.
Here's an example of dynamic scoping in Perl (note that this is considered poor style):
#!/usr/bin/perl
use strict;
use warnings;
no strict "vars";
sub inner {
print " name=\"$name\"\n";
}
sub outer1 {
local($name) = "outer1";
print "outer1 calling inner\n";
inner();
}
sub outer2 {
local($name) = "outer2";
print "outer2 calling inner\n";
inner();
}
outer1();
outer2();
The output is:
outer1 calling inner
name="outer1"
outer2 calling inner
name="outer2"
A similar program in C would be invalid, since the declaration of name would not be statically visible in the function inner.
I have a general doubt ..
Is there a way we limit the scope of a MACRO within a .C file just like a static function ?
Macros are done by the pre-processor.
The pre-processor reads all files being processed and applies macros and macro logic, the results of which are then passed to the compiler.
Once a macro is defined, its value will be used everywhere the macro is referenced, even in other files.
Please see the GCC Documentation for details regarding macro usage.
The general practice is to #undef the macro when you're done with it. Error prone, but it works.
Macros don't have any sort of block scope.
You can place the macro in the .c file where you want it to be used instead of a header file and it won't be accessible from other files (although some compilers allow inclusion of .c files but no one does that, well no one that's sensible).
Also mentioned below is the use of #undef but that can quickly start to get messy if you use that macro a lot.
All macros are already like static functions, in that they can only be used in the translation unit in which they're defined. If you want to restrict the areas where you can use a particular macro, just define it in a sensible place.
a macro is evaluated by the preprocessor, not by the compiler.
it doesn't know anything about compilation units, so you cannot restrict it's use to one.
instead it is evaluated within the translation unit.
the macros life cycle starts in the line it is defined (all lines above it do know nothing about the macro), and it ends either at the end of the translation unit or whenever it get's undefined using "#undef"
All C macros are limited to the translation unit (a single C file) unless they are defined in a header and being included to every translation units.
Unfortunately, a translation unit is often big, easily hundreds to thousands of lines of code, and macros are context dependent and it would be much more useful if it can be limited to much smaller context (such as a block scope). Lacking scope limits macro usage in C, mostly global constants, a few universal simple routines, and often need all capital names or some trick to manage pollutions).
However, higher order functions can be easily achieved with macros. Think about how we use natural language, where we may use "it" to refer any thing too long to repeat within the context. A scoped macro system will enable the same ability.
I have developed MyDef, which is essentially a scoped macro system.