I have written a little bit of C, and I can read it well enough to get a general idea of what it is doing, but every time I have encountered a macro it has thrown me completely. I end up having to remember what the macro is and substitute it in my head as I read. The ones that I have encountered that were intuitive and easy to understand were always like little mini functions, so I always wondered why they weren't just functions.
I can understand the need to define different build types for debug or cross platform builds in the preprocessor but the ability to define arbitrary substitutions seems to be useful only to make an already difficult language even more difficult to understand.
Why was such a complex preprocessor introduced for C? And does anyone have an example of using it that will make me understand why it still seems to be used for purposes other than simple if #debug style conditional compilations?
Edit:
Having read a number of answers I still just don't get it. The most common answer is to inline code. If the inline keyword doesn't do it then either it has a good reason to not do it, or the implementation needs fixing. I don't understand why a whole different mechanism is needed that means "really inline this code" (aside from the code being written before inline was around). I also don't understand the idea that was mentioned that "if it's too silly to be put in a function". Surely any piece of code that takes an input and produces an output is best put in a function. I think I may not be getting it because I am not used to the micro-optimisations of writing C, but the preprocessor just feels like a complex solution to a few simple problems.
I end up having to remember what the macro is and substitute it in my head as I read.
That seems to reflect poorly on the naming of the macros. I would assume you wouldn't have to emulate the preprocessor if it were a log_function_entry() macro.
The ones that I have encountered that were intuitive and easy to understand were always like little mini functions, so I always wondered why they weren't just functions.
Usually they should be, unless they need to operate on generic parameters.
#define max(a,b) ((a)<(b)?(b):(a))
will work on any type with an < operator.
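For instance, the one definition serves every comparable type (a minimal sketch using the macro above):
int    larger_int    = max(3, 7);       /* expands to ((3)<(7)?(7):(3)) */
double larger_double = max(2.5, 1.0);   /* the same macro text also works for doubles */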
More than just functions, macros let you perform operations using the symbols in the source file. That means you can create a new variable name, or reference the source file and line number the macro is on.
In C99, macros can also be variadic, which lets you wrap variadic functions such as printf
#define log_message(guard,format,...) \
if (guard) printf("%s:%d: " format "\n", __FILE__, __LINE__, __VA_ARGS__);
log_message( foo == 7, "x %d", x)
In which the format works like printf. If the guard is true, it outputs the message along with the file and line number that printed the message. If it were a function call, it would not know the file and line you called it from, and using vprintf would be a bit more work.
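As an aside, a macro containing a bare if is a classic trap: an else following the macro invocation would bind to the macro's if. A common hardening (a sketch, not part of the original answer) wraps the body in do/while(0) so it behaves as a single statement:
#define log_message(guard,format,...) \
    do { \
        if (guard) \
            printf("%s:%d: " format "\n", __FILE__, __LINE__, __VA_ARGS__); \
    } while (0)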
This excerpt pretty much sums up my view on the matter, by comparing several ways that C macros are used, and how to implement them in D.
copied from DigitalMars.com
Back when C was invented, compiler technology was primitive. Installing a text macro preprocessor onto the front end was a straightforward and easy way to add many powerful features. The increasing size & complexity of programs have illustrated that these features come with many inherent problems. D doesn't have a preprocessor; but D provides a more scalable means to solve the same problems.
Macros
Preprocessor macros add powerful features and flexibility to C. But they have a downside:
Macros have no concept of scope; they are valid from the point of definition to the end of the source. They cut a swath across .h files, nested code, etc. When #include'ing tens of thousands of lines of macro definitions, it becomes problematical to avoid inadvertent macro expansions.
Macros are unknown to the debugger. Trying to debug a program with symbolic data is undermined by the debugger only knowing about macro expansions, not the macros themselves.
Macros make it impossible to tokenize source code, as an earlier macro change can arbitrarily redo tokens.
The purely textual basis of macros leads to arbitrary and inconsistent usage, making code using macros error prone. (Some attempt to resolve this was introduced with templates in C++.)
Macros are still used to make up for deficits in the language's expressive capability, such as for "wrappers" around header files.
Here's an enumeration of the common uses for macros, and the corresponding feature in D:
Defining literal constants:
The C Preprocessor Way
#define VALUE 5
The D Way
const int VALUE = 5;
Creating a list of values or flags:
The C Preprocessor Way
int flags;
#define FLAG_X 0x1
#define FLAG_Y 0x2
#define FLAG_Z 0x4
...
flags |= FLAG_X;
The D Way
enum FLAGS { X = 0x1, Y = 0x2, Z = 0x4 };
FLAGS flags;
...
flags |= FLAGS.X;
Setting function calling conventions:
The C Preprocessor Way
#ifndef _CRTAPI1
#define _CRTAPI1 __cdecl
#endif
#ifndef _CRTAPI2
#define _CRTAPI2 __cdecl
#endif
int _CRTAPI2 func();
The D Way
Calling conventions can be specified in blocks, so there's no need to change it for every function:
extern (Windows)
{
int onefunc();
int anotherfunc();
}
Simple generic programming:
The C Preprocessor Way
Selecting which function to use based on text substitution:
#ifdef UNICODE
int getValueW(wchar_t *p);
#define getValue getValueW
#else
int getValueA(char *p);
#define getValue getValueA
#endif
The D Way
D enables declarations of symbols that are aliases of other symbols:
version (UNICODE)
{
int getValueW(wchar[] p);
alias getValueW getValue;
}
else
{
int getValueA(char[] p);
alias getValueA getValue;
}
There are more examples on the DigitalMars website.
They are a programming language (a simpler one) on top of C, so they are useful for doing metaprogramming at compile time... in other words, you can write macro code that generates C code in fewer lines and less time than it would take to write it directly in C.
They are also very useful to write "function like" expressions that are "polymorphic" or "overloaded"; e.g. a max macro defined as:
#define max(a,b) ((a)>(b)?(a):(b))
is useful for any numeric type; and in C you could not write:
int max(int a, int b) {return a>b?a:b;}
float max(float a, float b) {return a>b?a:b;}
double max(double a, double b) {return a>b?a:b;}
...
even if you wanted to, because C does not support function overloading.
Not to mention conditional compilation and file inclusion (which are also part of the macro language)...
Macros allow someone to modify the program behavior during compilation time. Consider this:
C constants allow fixing program behavior at development time
C variables allow modifying program behavior at execution time
C macros allow modifying program behavior at compilation time
"At compilation time" means that unused code won't even go into the binary and that the build process can modify the values, as long as it is integrated with the macro preprocessor. Example: make ARCH=arm (which assumes the build forwards the macro definition as cc -DARCH=arm)
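A minimal sketch of what such a build-time switch looks like in the code (USE_ARM_ASM and arm_fast_copy are illustrative names, not from the answer):
#include <string.h>

void copy_block(void *dst, const void *src, size_t n)
{
#ifdef USE_ARM_ASM
    arm_fast_copy(dst, src, n);   /* hypothetical platform-specific routine, selected by cc -DUSE_ARM_ASM */
#else
    memcpy(dst, src, n);          /* portable fallback; the unused branch never reaches the binary */
#endif
}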
Simple examples:
(from glibc's limits.h, defining the largest value of long)
#if __WORDSIZE == 64
#define LONG_MAX 9223372036854775807L
#else
#define LONG_MAX 2147483647L
#endif
This verifies at compile time (using the macro __WORDSIZE) whether we're compiling for 32 or 64 bits. With a multilib toolchain, the -m32 and -m64 parameters automatically change the word size.
(POSIX version request)
#define _POSIX_C_SOURCE 200809L
Requests POSIX 2008 support at compilation time. The standard library may support many (incompatible) standards, but with this definition it will provide the correct function prototypes (for example: getline(), no gets(), etc.). If the library doesn't support the standard, it may give an #error at compile time instead of crashing during execution, for example.
(hardcoded path)
#ifndef LIBRARY_PATH
#define LIBRARY_PATH "/usr/lib"
#endif
Defines a hardcoded directory at compilation time. It could be changed with -DLIBRARY_PATH=/home/user/lib, for example. If that were a const char *, how would you configure it during compilation?
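A convenient consequence (worth noting, though not in the original answer): because the macro expands to a string literal, the compiler can splice it into larger literals at compile time:
/* adjacent string literals are concatenated at compile time */
static const char *plugin_dir = LIBRARY_PATH "/plugins";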
(pthread.h, complex definitions at compile time)
# define PTHREAD_MUTEX_INITIALIZER \
{ { 0, 0, 0, 0, 0, 0, { 0, 0 } } }
Large pieces of text that otherwise couldn't be simplified may be declared this way (always at compile time). It's not possible to do this with functions or constants (at compile time).
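Typical usage: the mutex is fully initialized at compile time, with no runtime call to pthread_mutex_init():
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;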
To avoid really complicating things, and to avoid suggesting poor coding styles, I won't give an example of code that compiles on different, incompatible operating systems. Use your cross-build system for that, but it should be clear that the preprocessor allows that without help from the build system, without breaking compilation because of absent interfaces.
Finally, think about the importance of conditional compilation on embedded systems, where processor speed and memory are limited and systems are very heterogeneous.
Now, if you ask: is it possible to replace all macro constant definitions and function calls with proper definitions? The answer is yes, but it won't simply make the need for changing program behavior during compilation go away. The preprocessor would still be required.
Remember that macros (and the pre-processor) come from the earliest days of C. They used to be the ONLY way to do inline 'functions' (because, of course, inline is a very recent keyword), and they are still the only way to FORCE something to be inlined.
Also, macros are the only way you can do such tricks as inserting the file and line into string constants at compile time.
These days, many of the things that macros used to be the only way to do are better handled through newer mechanisms. But they still have their place, from time to time.
Apart from inlining for efficiency and conditional compilation, macros can be used to raise the abstraction level of low-level C code. C doesn't really insulate you from the nitty-gritty details of memory and resource management and exact layout of data, and supports very limited forms of information hiding and other mechanisms for managing large systems. With macros, you are no longer limited to using only the base constructs in the C language: you can define your own data structures and coding constructs (including classes and templates!) while still nominally writing C!
Preprocessor macros actually offer a Turing-complete language executed at compile time. One of the impressive (and slightly scary) examples of this is over on the C++ side: the Boost Preprocessor library uses the C99/C++98 preprocessor to build (relatively) safe programming constructs which are then expanded to whatever underlying declarations and code you input, whether C or C++.
In practice, I'd recommend regarding preprocessor programming as a last resort, when you don't have the latitude to use high level constructs in safer languages. But sometimes it's good to know what you can do if your back is against the wall and the weasels are closing in...!
From Computer Stupidities:
I've seen this code excerpt in a lot of freeware gaming programs for UNIX:
/*
* Bit values.
*/
#define BIT_0 1
#define BIT_1 2
#define BIT_2 4
#define BIT_3 8
#define BIT_4 16
#define BIT_5 32
#define BIT_6 64
#define BIT_7 128
#define BIT_8 256
#define BIT_9 512
#define BIT_10 1024
#define BIT_11 2048
#define BIT_12 4096
#define BIT_13 8192
#define BIT_14 16384
#define BIT_15 32768
#define BIT_16 65536
#define BIT_17 131072
#define BIT_18 262144
#define BIT_19 524288
#define BIT_20 1048576
#define BIT_21 2097152
#define BIT_22 4194304
#define BIT_23 8388608
#define BIT_24 16777216
#define BIT_25 33554432
#define BIT_26 67108864
#define BIT_27 134217728
#define BIT_28 268435456
#define BIT_29 536870912
#define BIT_30 1073741824
#define BIT_31 2147483648
A much easier way of achieving this is:
#define BIT_0 0x00000001
#define BIT_1 0x00000002
#define BIT_2 0x00000004
#define BIT_3 0x00000008
#define BIT_4 0x00000010
...
#define BIT_28 0x10000000
#define BIT_29 0x20000000
#define BIT_30 0x40000000
#define BIT_31 0x80000000
An easier way still is to let the compiler do the calculations:
#define BIT_0 (1)
#define BIT_1 (1 << 1)
#define BIT_2 (1 << 2)
#define BIT_3 (1 << 3)
#define BIT_4 (1 << 4)
...
#define BIT_28 (1 << 28)
#define BIT_29 (1 << 29)
#define BIT_30 (1 << 30)
#define BIT_31 (1 << 31)
But why go to all the trouble of defining 32 constants? The C language also has parameterized macros. All you really need is:
#define BIT(x) (1 << (x))
Anyway, I wonder whether the guy who wrote the original code used a calculator or just computed it all out on paper.
That's just one possible use of Macros.
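One caveat worth appending to that anecdote: on a platform with 32-bit int, 1 << 31 shifts into the sign bit, which is undefined behavior for signed integers, so a safer spelling uses an unsigned constant:
#define BIT(x) (1u << (x))   /* unsigned: well-defined even when x == 31 */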
I will add to what's already been said.
Because macros work on text substitution they allow you to do very useful things which wouldn't be possible using functions.
Here a few cases where macros can be really useful:
/* Get the number of elements in array 'A'. */
#define ARRAY_LENGTH(A) (sizeof(A) / sizeof(A[0]))
This is a very popular and frequently used macro. It is very handy when you, for example, need to iterate through an array.
#include <stdio.h>

int main(void)
{
int a[] = {1, 2, 3, 4, 5};
int i;
for (i = 0; i < ARRAY_LENGTH(a); ++i) {
printf("a[%d] = %d\n", i, a[i]);
}
return 0;
}
Here it doesn't matter if another programmer adds five more elements to a in the declaration. The for-loop will always iterate through all elements.
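One caveat (not in the original answer, but worth knowing): the macro only works on true arrays. If the argument has decayed to a pointer, it still compiles but silently computes the wrong count:
#include <stddef.h>

/* WRONG: inside this function p is a pointer, so sizeof(p) is the pointer
   size, not the size of the array it points into */
void broken(int *p)
{
    size_t n = ARRAY_LENGTH(p);  /* yields sizeof(int *) / sizeof(int), not the element count */
    (void)n;
}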
The C library's functions to compare memory and strings are quite ugly to use.
You write:
char *str = "Hello, world!";
if (strcmp(str, "Hello, world!") == 0) {
/* ... */
}
or
char *str = "Hello, world!";
if (!strcmp(str, "Hello, world!")) {
/* ... */
}
To check if str points to "Hello, world!". I personally think that both these solutions look quite ugly and confusing (especially !strcmp(...)).
Here are two neat macros some people (including me) use when they need to compare strings or memory using strcmp/memcmp:
/* Compare strings */
#define STRCMP(A, o, B) (strcmp((A), (B)) o 0)
/* Compare memory */
#define MEMCMP(A, o, B) (memcmp((A), (B)) o 0)
Now you can write the code like this:
char *str = "Hello, world!";
if (STRCMP(str, ==, "Hello, world!")) {
/* ... */
}
Here the intention is a lot clearer!
These are cases where macros are used for things functions cannot accomplish. Macros should not be used to replace functions, but they have other good uses.
One of the cases where macros really shine is when doing code generation with them.
I used to work on an old C++ system that used a plugin system with its own way to pass parameters to the plugins (using a custom map-like structure). Some simple macros were used to deal with this quirk and allowed us to use real C++ classes and functions with normal parameters in the plugins without too many problems. All the glue code was generated by macros.
Given the comments in your question, you may not fully appreciate that calling a function can entail a fair amount of overhead. The parameters and key registers may have to be copied to the stack on the way in, and the stack unwound on the way out. This was particularly true of the older Intel chips. Macros let the programmer keep the abstraction of a function (almost), but avoid the costly overhead of a function call. The inline keyword is advisory, and the compiler may not always get it right. The glory and peril of 'C' is that you can usually bend the compiler to your will.
In your bread-and-butter, day-to-day application programming this kind of micro-optimization (avoiding function calls) is generally worse than useless, but if you are writing a time-critical function called by the kernel of an operating system, then it can make a huge difference.
Unlike regular functions, you can do control flow (if, while, for,...) in macros. Here's an example:
#include <stdio.h>
#define Loop(i,x) for(i=0; i<x; i++)
int main(int argc, char *argv[])
{
int i;
int x = 5;
Loop(i, x)
{
printf("%d", i); // Output: 01234
}
return 0;
}
It's good for inlining code and avoiding function-call overhead, and also for when you want to change the behaviour later without editing lots of places. It's not useful for complex things, but for simple lines of code that you want to inline, it's not bad.
By leveraging C preprocessor's text manipulation one can construct the C equivalent of a polymorphic data structure. Using this technique we can construct a reliable toolbox of primitive data structures that can be used in any C program, since they take advantage of C syntax and not the specifics of any particular implementation.
Detailed explanation on how to use macros for managing data structure is given here - http://multi-core-dump.blogspot.com/2010/11/interesting-use-of-c-macros-polymorphic.html
Macros let you get rid of copy-pasted fragments, which you can't eliminate in any other way.
For instance (the real code, syntax of VS 2010 compiler):
for each (auto entry in entries)
{
sciter::value item;
item.set_item("DisplayName", entry.DisplayName);
item.set_item("IsFolder", entry.IsFolder);
item.set_item("IconPath", entry.IconPath);
item.set_item("FilePath", entry.FilePath);
item.set_item("LocalName", entry.LocalName);
items.append(item);
}
This is the place where you pass a field value under the same name into a script engine. Is this copy-pasted? Yes. DisplayName is used as a string for the script and as a field name for the compiler. Is that bad? Yes. If you refactor your code and rename LocalName to RelativeFolderName (as I did) and forget to do the same with the string (as I did), the script will work in a way you don't expect (in fact, in my example it depends on whether you forgot to rename the field in a separate script file, but if the script is used for serialization, it would be a 100% bug).
If you use a macro for this, there will be no room for the bug:
for each (auto entry in entries)
{
#define STR_VALUE(arg) #arg
#define SET_ITEM(field) item.set_item(STR_VALUE(field), entry.field)
sciter::value item;
SET_ITEM(DisplayName);
SET_ITEM(IsFolder);
SET_ITEM(IconPath);
SET_ITEM(FilePath);
SET_ITEM(LocalName);
#undef SET_ITEM
#undef STR_VALUE
items.append(item);
}
Unfortunately, this opens the door to other types of bugs. You can make a typo writing the macro and never see the spoiled code, because the compiler doesn't show how it looks after all the preprocessing. Someone else could use the same name (that's why I "release" macros ASAP with #undef). So, use it wisely. If you see another way of getting rid of copy-pasted code (such as functions), use that way. If you see that getting rid of copy-pasted code with macros isn't worth the result, keep the copy-pasted code.
One of the obvious reasons is that by using a macro, the code will be expanded at compile time, and you get a pseudo function-call without the call overhead.
Otherwise, you can also use it for symbolic constants, so that you don't have to edit the same value in several places to change one small thing.
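A minimal sketch of that second use (the names are illustrative):
#define BUFFER_SIZE 1024      /* change it here once; takes effect everywhere */

char buffer[BUFFER_SIZE];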
Macros .. for when your &#(*$& compiler just refuses to inline something.
That should be a motivational poster, no?
In all seriousness, google preprocessor abuse (you may see a similar SO question as the #1 result). If I'm writing a macro that goes beyond the functionality of assert(), I usually try to see if my compiler would actually inline a similar function.
Others will argue against using #if for conditional compilation .. they would rather you:
if (RUNNING_ON_VALGRIND)
rather than
#if RUNNING_ON_VALGRIND
.. for debugging purposes, since you can see the if() but not #if in a debugger. Then we dive into #ifdef vs #if.
If it's under 10 lines of code, try to inline it. If it can't be inlined, try to optimize it. If it's too silly to be a function, make a macro.
While I'm not a big fan of macros and don't tend to write much C anymore, based on my current tasking, something like this (which could obviously have some side-effects) is convenient:
#define MIN(X, Y) ((X) < (Y) ? (X) : (Y))
Now I haven't written anything like that in years, but 'functions' like that were all over code that I maintained earlier in my career. I guess the expansion could be considered convenient.
I didn't see anyone mentioning this, so regarding function-like macros, e.g.:
#define MIN(X, Y) ((X) < (Y) ? (X) : (Y))
Generally it's recommended to avoid using macros when not necessary, for many reasons, readability being the main concern. So:
When should you use these over a function?
Almost never, since there's a more readable alternative which is inline, see https://www.greenend.org.uk/rjk/tech/inline.html
or http://www.cplusplus.com/articles/2LywvCM9/ (the second link is a C++ page, but the point is applicable to c compilers as far as I know).
Now, the slight difference is that macros are handled by the pre-processor and inline is handled by the compiler, but there's no practical difference nowadays.
When is it appropriate to use these?
For small functions (two or three lines max). The goal is to gain some advantage at run time, as function-like macros (and inline functions) are code replacements done during preprocessing (or compilation, in the case of inline) and are not real functions living in memory, so there's no function-call overhead (more details in the linked pages).
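A minimal sketch of the inline alternative to the MIN macro above (the function name is illustrative):
/* type-safe, evaluates each argument exactly once, and modern
   compilers will typically expand it in place */
static inline int min_int(int x, int y)
{
    return x < y ? x : y;
}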
This is a subjective question, so I will accept 'there is no answer' but read fully as this is specifically on a system where the code is safety critical.
I've adopted some embedded C code for a safety critical system, where the original author has (in random places) used syntax like this:
#include <stdio.h>
typedef struct tag_s2 {
int a;
}s2;
typedef struct tag_s1 {
s2 a;
}s1;
s1 inst;
#define X_myvar inst.a.a
int main(int argc, char **argv)
{
X_myvar = 10;
printf("myvar = %d\n", X_myvar + 1);
return 0;
}
Effectively using a #define to alias and obscure a deep structure member. Mostly two or three, but occasionally four deep.
BTW: This is a simple example, the real code is far more complicated but I can't publish any part of that here.
The use of this is not consistent: in some places the aliased variable is used directly, in others through its alias, and some parts of the code are not aliased at all.
IMO this is bad practice, as it obscures the code with no gain, reducing maintainability and readability and leading to future errors and misunderstanding.
If the style was 100% consistent then perhaps I would be more happy with it.
However, being safety critical a change is costly. So not wanting to fix 'wot aint broke' I am open to other arguments.
Should I fix it or leave well alone?
Is there any guidance (e.g. Generic C, MISRA or DO178B style guides) that would have an opinion on this?
However, being safety critical a change is costly. So not wanting to fix 'wot aint broke' I am open to other arguments.
It's a paradoxical death spiral that the most critical code gets the least attention because people are afraid to change it.
That you are hesitant to make a simple, rote refactoring to this code tells me the code either has no tests or you don't trust the tests. When you're afraid to improve code because you might break it, that delays improvements to the code. You're likely to do the smallest possible thing which will make the code even more brittle and unsafe.
I'd advise the first thing is to get some tests in place along with a staging environment for trials. Then all changes become safer. There might be some gaffes initially while you find all the weird and dangerous things this code is doing, but that's what the staging area is for. In the medium and long term everyone will improve this code faster and with more confidence. Making code easier and safer to change allows it to be made easier and safer to change; the spiral then goes up, not down.
Making a macro seem like a single variable is a technique I've seen before, in the Perl 5 code base, which is written more in C macros than in C. For example, here's a bit of the code that manipulates the Perl call stack.
#define SP sp
#define MARK mark
#define TARG targ
#define PUSHMARK(p) \
STMT_START { \
I32 * mark_stack_entry; \
if (UNLIKELY((mark_stack_entry = ++PL_markstack_ptr) \
== PL_markstack_max)) \
mark_stack_entry = markstack_grow(); \
*mark_stack_entry = (I32)((p) - PL_stack_base); \
DEBUG_s(DEBUG_v(PerlIO_printf(Perl_debug_log, \
"MARK push %p %" IVdf "\n", \
PL_markstack_ptr, (IV)*mark_stack_entry))); \
} STMT_END
#define TOPMARK S_TOPMARK(aTHX)
#define POPMARK S_POPMARK(aTHX)
#define INCMARK \
STMT_START { \
DEBUG_s(DEBUG_v(PerlIO_printf(Perl_debug_log, \
"MARK inc %p %" IVdf "\n", \
(PL_markstack_ptr+1), (IV)*(PL_markstack_ptr+1)))); \
PL_markstack_ptr++; \
} STMT_END
#define dSP SV **sp = PL_stack_sp
#define djSP dSP
#define dMARK SV **mark = PL_stack_base + POPMARK
#define dORIGMARK const I32 origmark = (I32)(mark - PL_stack_base)
#define ORIGMARK (PL_stack_base + origmark)
#define SPAGAIN sp = PL_stack_sp
#define MSPAGAIN STMT_START { sp = PL_stack_sp; mark = ORIGMARK; } STMT_END
#define GETTARGETSTACKED targ = (PL_op->op_flags & OPf_STACKED ? POPs : PAD_SV(PL_op->op_targ))
#define dTARGETSTACKED SV * GETTARGETSTACKED
These are macros upon macros upon macros. The Perl 5 source is riddled with them. There is a lot of opaque magic happening there. Some of them need to be macros to allow assignment, but many could be inline functions. Despite being part of a public API they are indifferently documented in part because they are macros and not functions.
This style is very clever and useful if you're already very familiar with the Perl 5 source code. For everyone else it has made the Perl 5 internals extremely difficult to work with. While some compilers will provide stack traces for macro expansions, others will only report on the expanded macro leaving one scratching their head what the hell const I32 origmark = (I32)(mark - PL_stack_base) is because it never appears in your source.
Like many macro hacks, while the technique is very clever it is also mind-bending and unfamiliar to many programmers. Mind-bending is not what you want in safety critical code. You want simple, boring code. That alone is the simplest argument to replace it with well named getter and setter functions. Trust the compiler to optimize them.
A good example of this is GLib which carefully uses well-documented function-like macros to make generic data structures. For example, adding a value to an array.
#define g_array_append_val(a,v)
While this is a macro, it acts and is documented like a function. It's macro solely as a mechanism to create a safe, type generic array. It hides no variables. You can safely use it without ever being aware it's a macro.
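Usage reads exactly like a function call (this follows GLib's documented API; the surrounding scaffolding is illustrative):
#include <glib.h>

int main(void)
{
    GArray *arr = g_array_new(FALSE, FALSE, sizeof(int));
    int v = 27;
    /* v must be an actual variable: the macro takes its address internally */
    g_array_append_val(arr, v);
    g_array_free(arr, TRUE);
    return 0;
}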
In conclusion, yes, change it. But instead of simply replacing X_myvar with inst.a.a consider creating functions that continue to provide encapsulation.
void s1_set_a( s1 *s, int val ) {
s->a.a = val;
}
int s1_get_a( s1 *s ) {
return s->a.a;
}
s1_set_a(&inst, 10);
printf("myvar = %d\n", s1_get_a(&inst) + 1);
The internals of s1 are hidden making it easier to change the internals later (for example, changing s1.a to a pointer to save memory). What variable you're working with is clear making the overall code easier to understand. The function names provide a clear explanation of what's happening. Because they're functions they have an obvious place for documentation. Trust the compiler to know how best to optimize it.
Yeah you should get rid of it. Obscure macros are dangerous.
It was common in older C to avoid spelling out deep nesting by doing things like
#define a inst.a
in which case you only had to type a instead of inst.a. Although this is questionable practice, macros like these were used to repair a shortcoming in the language, namely the lack of anonymous structs. Modern C supports those from C11 on, though. We can use anonymous structures to get rid of unnecessarily nested structs:
typedef struct {
struct
{
int a;
};
}s1;
But MISRA-C:2012 doesn't support C11 so that might not be an option.
Another trick you can use to get rid of long names is something like this:
int* x = &longstructname.somemember.anotherstruct.x;
// use *x from here on
x is a local variable with limited scope. That's much more readable than the obscure macros and it gets optimized away in the machine code.
From a maintenance perspective, yes, this is definitely code that needs to be fixed.
However, it is only from that perspective that the code needs to be fixed. It does not harm program correctness, and if the code is correct as-is, that is the paramount consideration.
That's why code like this should never be fixed unless a thorough unit test and regression test regimen is already in place. You should only fix code like this if you can be certain that you don't break correctly-functioning code in the process.
I have a lot of preprocessor macro definitions, like this:
#define FOO 1
#define BAR 2
#define BAZ 3
In the real application, each definition corresponds to an instruction in an interpreter virtual machine. The macros are also not numbered sequentially, to leave space for future instructions; there may be a #define FOO 41, and then the next one is #define BAR 64.
I'm now working on a debugger for this virtual machine, and need to effectively 'reverse' these preprocessor macros. In other words, I need a function which takes the number and returns the macro name, e.g. an input of 2 returns "BAR".
Of course, I could create a function using a switch myself:
const char* instruction_by_id(int id) {
switch (id) {
case FOO:
return "FOO";
case BAR:
return "BAR";
case BAZ:
return "BAZ";
default:
return "???";
}
}
However, this will be a nightmare to maintain, since renaming, removing or adding instructions will require this function to be modified too.
Is there another macro which I can use to create a function like this for me, or is there some other approach? If not, is it possible to create a macro to perform this task?
I'm using gcc 6.3 on Windows 10.
You have the wrong approach. Read SICP if you have not read it.
I have a lot of preprocessor macro definitions, like this:
#define FOO 1
#define BAR 2
#define BAZ 3
Remember that C or C++ code can be generated, and it is quite easy to instruct your build automation tool to generate some particular C file (with GNU make or ninja you just add some rule or recipe).
For example, you could use some different preprocessor (like GPP or m4), or some script -e.g. in awk or Python or Guile, etc..., or write your own program (in C, C++, Ocaml, etc...), to generate the header file containing these #define-s. And another script or program (or the same one, invoked differently) could generate the C code of instruction_by_id
Such basic metaprogramming techniques (of generating some or several C files from something higher level but specific) have been used since at least the 1980s (e.g. with yacc or RPCGEN). The C preprocessor facilitates that with its #include directive (since you can even include lines inside some function body, etc...). Actually, the idea that code is data (and proof) and data is code is even older (Church-Turing thesis, Curry-Howard correspondence, Halting problem). The Gödel, Escher, Bach book is very entertaining....
For example, you could decide to have a textual file opcodes.txt (or even some sqlite database containing stuff....) like
# ignore lines starting with an hashsign
FOO 1
BAR 2
and have two small awk or Python scripts (or two tiny C specialized programs), one generating the #define-s (into opcode-defines.h) and another generating the body of instruction_by_id (into opcode-instr.inc). Then you need to adapt your Makefile to generate these, and put #include "opcode-defines.h" inside some global header, and have
const char* instruction_by_id(int id) {
switch (id) {
#include "opcode-instr.inc"
default: return "???";
}
}
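To make that concrete, here is what the two generated files might contain for the opcodes.txt above (a sketch; the exact formatting is up to your generator script):
/* opcode-defines.h -- generated from opcodes.txt, do not edit */
#define FOO 1
#define BAR 2

/* opcode-instr.inc -- generated from opcodes.txt, do not edit */
case FOO: return "FOO";
case BAR: return "BAR";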
this will be a nightmare to maintain,
Not so with such a metaprogramming approach. You'll just maintain opcodes.txt and the scripts using it, but you express a given "knowledge element" (the relation of FOO to 1) only once (in a single line of opcodes.txt). Of course you need to document that (at the very least, with comments in your Makefile).
Metaprogramming from some higher-level, declarative formalization is a very powerful paradigm. In France, J. Pitrat pioneered it from the 1960s onward (and, now retired, he writes an interesting blog today). In the US, J. McCarthy and the Lisp community did likewise.
For an entertaining talk, see Liam Proven FOSDEM 2018 talk on The circuit less traveled
Large software projects use that metaprogramming approach quite often. For example, the GCC compiler has about a dozen C++ code generators (in total, emitting more than a million lines of C++).
Another way of looking at such an approach is the idea of domain-specific languages that could be compiled to C. If you use an operating system providing dynamic loading, you can even write a program emitting C code, forking a process to compile it into some plugin, then loading that plugin (on POSIX or Linux, with dlopen). Interestingly, computers are now fast enough to enable such an approach in an interactive application (in some sort of REPL): you can emit a C file of a few thousand lines, compile it into some .so shared object file, and dlopen that, in a fraction of second. You could also use JIT-compiling libraries like GCCJIT or LLVM to generate code at runtime. You could embed an interpreter (like Lua or Guile) into your program.
BTW, metaprogramming approaches are one of the reasons why basic compilation techniques should be known by most developers (and not only people in the compiler business); another reason is that parsing problems are very common. So read the Dragon Book.
Be aware of Greenspun's tenth rule. It is much more than a joke, actually a profound truth about large software.
In a similar case I've resorted to defining a text file format that defines the instructions, and writing a program to read this file and write out the C source of the actual instruction definitions and the C source of functions like your instruction_by_id(). This way you only need to maintain the text file.
As awesome as general code generation is, I’m surprised that nobody mentioned that (if you relax your problem definition just a bit) the C preprocessor is perfectly capable of generating the necessary code, using a technique called X macros. In fact every simple bytecode VM in C that I’ve seen uses this approach.
The technique works as follows. First, there is a file (call it insns.h) containing the authoritative list of instructions,
INSN(FOO, 1)
INSN(BAR, 2)
INSN(BAZ, 3)
or alternatively a macro in some other header containing the same,
#define INSNS \
INSN(FOO, 1) \
INSN(BAR, 2) \
INSN(BAZ, 3)
whichever is more convenient for you. (I'll use the first option in the following.) Note that INSN is not defined anywhere. (Traditionally it would be called X, thus the name of the technique.) Wherever you want to loop over your instructions, define INSN to generate the code you want, include insns.h, then undefine INSN again.
In your disassembler, write
const char *instruction_by_id(int id) {
switch (id) {
#define INSN(NAME, VALUE) \
case NAME: return #NAME;
#include "insns.h" /* or just INSNS if you use a macro */
#undef INSN
default: return "???";
}
}
using the prefix stringification operator # to turn names-as-identifiers into names-as-string-literals.
You obviously can’t define the constants this way, because macros cannot define other macros in the C preprocessor. However, if you don’t insist that the instruction constants be preprocessor constants, there’s a different perfectly serviceable constant facility in the C language: enumerations. Whether or not you use an enumerated type, the enumerators defined inside it are regular integer constants from the point of view of the compiler (though not the preprocessor—you cannot use #ifdef with them, for example). So, using an anonymous enumeration type, define your constants like this:
enum {
#define INSN(NAME, VALUE) \
NAME = VALUE,
#include "insns.h" /* or just INSNS if you use a macro */
#undef INSN
NINSNS /* C89 doesn’t allow trailing commas in enumerations (but C99+ does), and you may find this constant useful in any case */
};
If you want to statically initialize an array indexed by your bytecodes, you’ll have to use C99 designated initializers {[FOO] = foovalue, [BAR] = barvalue, /* ... */} whether or not you use X macros. However, if you don’t insist on assigning custom codes to your instructions, you can eliminate VALUE from the above and have the enumeration assign consecutive codes automatically, and then the array can be simply initialized in order, {foovalue, barvalue, /* ... */}. As a bonus, NINSNS above then becomes equal to the number of the instructions and the size of any such array, which is why I called it that.
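For instance, the name-lookup table itself can be generated this way (a sketch assuming the insns.h file and the enum above; entries for unassigned codes are null pointers, so check before use):
static const char *insn_names[] = {
#define INSN(NAME, VALUE) [NAME] = #NAME,
#include "insns.h"
#undef INSN
};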
There are more tricks you can use here. For example, if some instructions have variants for several data types, the instruction list X macro can call the type list X macro to generate the variants automatically. (The somewhat ugly second option of storing the X macro list in a large macro and not an include file may be more handy here.) The INSN macro may take additional arguments such as the mode name, which would be ignored in the code list but used to call the appropriate decoding routine in the disassembler. You can use the token-pasting operator ## to add prefixes to the names of the constants, as in INSN_ ## NAME to generate INSN_FOO, INSN_BAR, etc. And so on.
Background
I've utilized a set of preprocessor macros from another question that allows me to prefix symbol names (enums, function names, struct names, etc) in my source, i.e.:
#include <stdio.h>
#define VARIABLE 3
#define PASTER(x,y) x ## _ ## y
#define EVALUATOR(x,y) PASTER(x,y)
#define NAME(fun) EVALUATOR(fun, VARIABLE)
void NAME(func)(int i);
int main(void)
{
NAME(func)(123);
return 0;
}
void NAME(func)(int i)
{
printf("i is %d in %s.\n", i, __func__);
}
Problem
This works as expected, with the following output: i is 123 in func_3.
Edit
I would like this code:
#define NAME(SOME_MACRO_CONST) (123)
#define NAME(SOME_MACRO_CONST2) (123)
To expand to:
#define 3SOME_MACRO_CONST (123)
#define 3SOME_MACRO_CONST2 (123)
I realize the macro shouldn't start with a digit. In the final code I'll be using names like LIB_A_ and LIB_B_ as prefixes.
/Edit
However, if I attempt to do the same with macros as the arguments to my NAME variadic macro, it fails like so:
Re-using NAME macro:
Code
#define NAME(MY_CONST) (3)
Output
test.c:7:0: warning: "NAME" redefined
#define NAME(MY_CONST) 3
Manually pasting prefix:
Code:
#define VARIABLE ## MY_CONST (3)
Output:
test.c:8:18: error: '##' cannot appear at either end of a macro expansion
#define VARIABLE ## MY_CONST (3)
Question
How can I create simple macro definitions (name + value) that has a common prefix for all the macros? The goal is to be able to make multiple copies of the source file and compile them with different flags so all versions can be linked together into the same final binary without symbol/macro name collisions (the macros will later be moved into header files). The final file will be too big to write in something like M4 or a template language. Ideally, the solution would involve being able to use a single macro-function/variadic-macro for all use cases, but I'm OK with one macro for symbol prefixing, and another for macro-name prefixing.
I would like this code:
#define NAME(SOME_MACRO_CONST) (123)
#define NAME(SOME_MACRO_CONST2) (124)
To expand to:
#define 3SOME_MACRO_CONST (123)
#define 3SOME_MACRO_CONST2 (124)
(I corrected the second number to 124 to make it different from the first one, for readability purposes)
This is impossible with the C preprocessor, for several reasons:
3SOME_MACRO_CONST is not a valid identifier (both for the preprocessor, and for the C compiler itself) since it does not start with a letter or an underscore. So let's assume you want your code to be expanded to:
/// new desired expansion
#define THREE_SOME_MACRO_CONST (123)
#define THREE_SOME_MACRO_CONST2 (124)
this is still impossible, because the preprocessor works before anything else and cannot generate any preprocessor directive (e.g. #define).
A workaround, if you only want to #define some numbers (computable at compile-time !!!) might be to expand to some anonymous enum like
enum {
THREE_SOME_MACRO_CONST= 123,
THREE_SOME_MACRO_CONST2= 124,
};
and you know how to do that in the details. Read also about X-macros.
However, even if you can change your requirement to something that is possible, it might not be advisable, because your code becomes very unreadable (IMHO). You could sometimes consider writing some simple script (e.g. in sed or awk ...), or use some other preprocessor like GPP, to generate a C file from something else.
Notice that most serious build automation tools (like GNU make or ninja) -or even IDEs (they can be configured to)- make it quite easy (by adding extra targets, recipes, commands, etc...) to generate some C (or C++) code from some other file, and that metaprogramming practice has been routinely used for decades (e.g. bison, flex, autoconf, rpcgen, Qt moc, SWIG ...), so I am surprised you cannot do so. Generating a header file containing many #define-s is so common a practice that I am surprised you are forbidden to do so. Perhaps you just need to discuss it with your manager or colleagues. Maybe you need to look for some more interesting job.
Personally, I am very fond of such metaprogramming approaches (I did my PhD on them in 1990, and I would discuss them at every job interview; a job where metaprogramming is forbidden is not for me. Look for example at my past GCC MELT project; my future project will also have metaprogramming). Another way of promoting that approach is to defend domain-specific languages (and the ability to make your DSL inside some large software project; for example, the GCC compiler has about a dozen such DSLs inside it...). Then your DSL can (naturally) be compiled to C, which is a common practice. On modern operating systems that generated C code could be compiled at runtime and dynamically loaded as a (generated) plugin (using dlopen on POSIX...)
Sometimes, you can trick the compiler. For a project compiled by GCC, you could consider writing your own GCC plugin..... (that is a lot more work than adding a command generating C code; your plugin could provide extra magic pragmas or builtins or attributes used by some other macros).
You could also configure the spec file of your gcc to handle specifically some C files. Beware, that could affect every future compilation!
I wanted to see how the arduino function digitalWrite actually worked. But when I looked for the source for the function it was full of macros that were themselves defined in terms of other macros. Why would it be built this way instead of just using a function? Is this just poor coding style or the proper way of doing things in C?
For example, digitalWrite contains the macro digitalPinToPort.
#define digitalPinToPort(P) ( pgm_read_byte( digital_pin_to_port_PGM + (P) ) )
And pgm_read_byte is a macro:
#define pgm_read_byte(address_short) pgm_read_byte_near(address_short)
And pgm_read_byte_near is a macro:
#define pgm_read_byte_near(address_short) __LPM((uint16_t)(address_short))
And __LPM is a macro:
#define __LPM(addr) __LPM_classic__(addr)
And __LPM_classic__ is a macro:
#define __LPM_classic__(addr) \
(__extension__({ \
uint16_t __addr16 = (uint16_t)(addr); \
uint8_t __result; \
__asm__ __volatile__ \
( \
"lpm" "\n\t" \
"mov %0, r0" "\n\t" \
: "=r" (__result) \
: "z" (__addr16) \
: "r0" \
); \
__result; \
}))
Not directly related to this, but I was also under the impression that double underscore should only be used by the compiler. Is having LPM prefixed with __ proper?
If your question is "why one would use multiple layers of macros?", then:
why not? Notably in the pre-C99 era, without inline. A canonical example was 1980s-era getc, which was (IIRC) at that time (SunOS 3.2, 1987) documented as being a macro, with a NOTE in the man page warning (I forget the details) that with some FILE* filearr[]; a getc(filearr[i++]) was wrong (IIRC, the undefined behavior terminology did not exist at that time). When you looked into some system headers (e.g. <stdio.h> or some header included by it) you could find the definition of such a macro. And at that time (with computers running at only a few MHz, so a thousand times slower than today) getc had to be a macro for efficiency reasons (since inline did not exist, and compilers did not do the interprocedural optimizations they are able to do now). Of course, you could use getc in your own macros.
even today, some standards define macros. In particular, today's waitpid(2) system call documents WIFEXITED & WEXITSTATUS as macros, and it is sensible to #define some of your own macros mixing both of them.
the main point is to understand the working of the C preprocessor, and its profoundly textual (so very brittle) nature. This is explained in all textbooks about C. Hence you need to understand what is happening under the hoods.
the rule of thumb with modern era C (that is C99 & C11) is to systematically prefer having some static inline function (defined in some header file) to an equivalent macro. In other words, #define some macro only when you cannot avoid it. And explicitly document that fact.
several layers of macro might (sometimes) increase code readability.
macros can be tested with #ifdef macroname which is sometimes useful.
Of course, when you dare to define several layers of macros (I won't call them "recursive"; read about self-referential macros) you need to be very careful and understand what is happening with all of them (combined & separately). Looking at the preprocessed form is helpful.
BTW, to debug complex macros, I sometimes do
gcc -C -E -I/some/directory mysource.c | sed 's:^#://#:' > mysource.i
then I look into mysource.i and sometimes even have to compile it, perhaps as gcc -c -Wall mysource.i, to get warnings located in the preprocessed form mysource.i (which I can examine in my emacs editor). The sed command comments out the lines starting with # which set the source location (à la #line).... Sometimes I even run indent mysource.i
(actually, I have a special rule for that in my Makefile)
Is having LPM prefixed with __ proper?
BTW, names starting with _ are (by the standard, and conventionally) reserved to the implementation. In principle you are not allowed to have your names starting with _ so there cannot be any collision.
Notice that the __LPM_classic__ macro uses the statement-expr extension
(of GCC & Clang)
Look also at other programming languages. Common Lisp has a very different model of macros (and a much more interesting one). Read about hygienic macros. My personal regret is that the C preprocessor has never evolved to be much more powerful (and Scheme-like). Why that did not happen (imagine a C preprocessor able to invoke Guile code to do the macro expansion!) is probably for sociological & economical reasons.
You still should sometimes consider using some other preprocessor (like m4 or GPP) to generate C code. See autoconf.
Why would it be built this way instead of just using a function?
The purpose is likely to optimize function calls using pre-C99 compilers, which don't support inline functions (via the inline keyword). This way, the whole stack of function-like macros is essentially merged by the preprocessor into a single block of code.
Every time you call a function in C, there is a tiny overhead, because of jumping around program's code, managing the stack frame and passing arguments. This cost is negligible in most applications, but if the function is called very often, then it may become a performance bottleneck.
Is this just poor coding style or the proper way of doing things in C?
It's hard to give a definitive answer, because coding style is a subjective topic. I would say: consider using inline functions, or even (better) let the compiler inline them by itself. They are type-safe, more readable, more predictable, and with proper assistance from the compiler the net result is essentially the same.
Related reference (it's for C++, but the idea is generally the same for C):
Why should I use inline functions instead of plain old #define macros?
I am a first year computer science student and my professor said #define is banned in the industry standards along with #if, #ifdef, #else, and a few other preprocessor directives. He used the word "banned" because of unexpected behaviour.
Is this accurate? If so why?
Are there, in fact, any standards which prohibit the use of these directives?
First I've heard of it.
No; #define and so on are widely used. Sometimes too widely used, but definitely used. There are places where the C standard mandates the use of macros — you can't avoid those easily. For example, §7.5 Errors <errno.h> says:
The macros are
EDOM
EILSEQ
ERANGE
which expand to integer constant expressions with type int, distinct positive values, and which are suitable for use in #if preprocessing directives; …
Given this, it is clear that not all industry standards prohibit the use of the C preprocessor macro directives. However, there are 'best practices' or 'coding guidelines' standards from various organizations that prescribe limits on the use of the C preprocessor, though none ban its use completely — it is an innate part of C and cannot be wholly avoided. Often, these standards are for people working in safety-critical areas.
One standard you could check is the MISRA C (2012) standard; that tends to proscribe things, but even it recognizes that #define et al are sometimes needed (section 8.20, rules 20.1 through 20.14, covers the C preprocessor).
The NASA GSFC (Goddard Space Flight Center) C Coding Standards simply say:
Macros should be used only when necessary. Overuse of macros can make code harder to read and maintain because the code no longer reads or behaves like standard C.
The discussion after that introductory statement illustrates the acceptable use of function macros.
The CERT C Coding Standard has a number of guidelines about the use of the preprocessor, and implies that you should minimize the use of the preprocessor, but does not ban its use.
Stroustrup would like to make the preprocessor irrelevant in C++, but that hasn't happened yet. As Peter notes, some C++ standards, such as the JSF AV C++ Coding Standards (Joint Strike Fighter, Air Vehicle) from circa 2005, dictate minimal use of the C preprocessor. Essentially, the JSF AV C++ rules restrict it to #include and the #ifndef XYZ_H / #define XYZ_H / … / #endif dance that prevents multiple inclusions of a single header. C++ has some options that are not available in C — notably, better support for typed constants that can then be used in places where C does not allow them to be used. See also static const vs #define vs enum for a discussion of the issues there.
It is a good idea to minimize the use of the preprocessor — it is often abused at least as much as it is used (see the Boost preprocessor 'library' for illustrations of how far you can go with the C preprocessor).
Summary
The preprocessor is an integral part of C and #define and #if etc cannot be wholly avoided. The statement by the professor in the question is not generally valid: #define is banned in the industry standards along with #if, #ifdef, #else, and a few other macros is an over-statement at best, but might be supportable with explicit reference to specific industry standards (but the standards in question do not include ISO/IEC 9899:2011 — the C standard).
Note that David Hammen has provided information about one specific C coding standard — the JPL C Coding Standard — that prohibits a lot of things that many people use in C, including limiting the use of the C preprocessor (and limiting the use of dynamic memory allocation, and prohibiting recursion — read it to see why, and decide whether those reasons are relevant to you).
No, use of macros is not banned.
In fact, use of #include guards in header files is one common technique that is often mandatory and encouraged by accepted coding guidelines. Some folks claim that #pragma once is an alternative to that, but the problem is that #pragma once - by definition, since pragmas are a hook provided by the standard for compiler-specific extensions - is non-standard, even if it is supported by a number of compilers.
That said, there are a number of industry guidelines and encouraged practices that actively discourage all usage of macros other than #include guards because of the problems macros introduce (not respecting scope, etc). In C++ development, use of macros is frowned upon even more strongly than in C development.
Discouraging use of something is not the same as banning it, since it is still possible to legitimately use it - for example, by documenting a justification.
Some coding standards may discourage or even forbid the use of #define to create function-like macros that take arguments, like
#define SQR(x) ((x)*(x))
because a) such macros are not type-safe, and b) somebody will inevitably write SQR(x++), which is bad juju.
Some standards may discourage or ban the use of #ifdefs for conditional compilation. For example, the following code uses conditional compilation to properly print out a size_t value. For C99 and later, you use the %zu conversion specifier; for C89 and earlier, you use %lu and cast the value to unsigned long:
#if __STDC_VERSION__ >= 199901L
# define SIZE_T_CAST
# define SIZE_T_FMT "%zu"
#else
# define SIZE_T_CAST (unsigned long)
# define SIZE_T_FMT "%lu"
#endif
...
printf( "sizeof foo = " SIZE_T_FMT "\n", SIZE_T_CAST sizeof foo );
Some standards may mandate that instead of doing this, you implement the module twice, once for C89 and earlier, once for C99 and later:
/* C89 version */
printf( "sizeof foo = %lu\n", (unsigned long) sizeof foo );
/* C99 version */
printf( "sizeof foo = %zu\n", sizeof foo );
and then let Make (or Ant, or whatever build tool you're using) deal with compiling and linking the correct version. For this example that would be ridiculous overkill, but I've seen code that was an untraceable rat's nest of #ifdefs that should have had that conditional code factored out into separate files.
However, I am not aware of any company or industry group that has banned the use of preprocessor statements outright.
Macros can not be "banned". The statement is nonsense. Literally.
For example, section 7.5 Errors <errno.h> of the C Standard requires the use of macros:
1 The header <errno.h> defines several macros, all relating to the reporting of error conditions.
2 The macros are
EDOM
EILSEQ
ERANGE
which expand to integer constant expressions with type int, distinct positive values, and which are suitable for use in #if preprocessing directives; and
errno
which expands to a modifiable lvalue that has type int and thread local storage duration, the value of which is set to a positive error number by several library functions. If a macro definition is suppressed in order to access an actual object, or a program defines an identifier with the name errno, the behavior is undefined.
So, not only are macros a required part of C, in some cases not using them results in undefined behavior.
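A minimal sketch of those mandated macros in use (assuming an implementation that reports math errors via errno):
#include <errno.h>
#include <math.h>
#include <stdio.h>

int main(void)
{
    errno = 0;
    double r = sqrt(-1.0);          /* domain error for the real sqrt */
    if (errno == EDOM)              /* EDOM is required by the standard to be a macro */
        puts("domain error reported via errno");
    (void)r;
    return 0;
}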
No, #define is not banned. Misuse of #define, however, may be frowned upon.
For instance, you may use
#define DEBUG
in your code so that later on, you can designate parts of your code for conditional compilation using #ifdef DEBUG, for debug purposes only. I don't think anyone in his right mind would want to ban something like this. Macros defined using #define are also used extensively in portable programs, to enable/disable compilation of platform-specific code.
However, if you are using something like
#define PI 3.141592653589793
your teacher may rightfully point out that it is much better to declare PI as a constant with the appropriate type, e.g.,
const double PI = 3.141592653589793;
as it allows the compiler to do type checking when PI is used.
Similarly (as mentioned by John Bode above), the use of function-like macros may be disapproved of, especially in C++ where templates can be used. So instead of
#define SQ(X) ((X)*(X))
consider using
double SQ(double X) { return X * X; }
or, in C++, better yet,
template <typename T>T SQ(T X) { return X * X; }
Once again, the idea is that by using the facilities of the language instead of the preprocessor, you allow the compiler to type check and also (possibly) generate better code.
Once you have enough coding experience, you'll know exactly when it is appropriate to use #define. Until then, I think it is a good idea for your teacher to impose certain rules and coding standards, but preferably they themselves should know, and be able to explain, the reasons. A blanket ban on #define is nonsensical.
That's completely false; macros are heavily used in C. Beginners often use them badly, but that's not a reason to ban them from industry. A classic bad usage is #define successor(n) n + 1. If you expect 2 * successor(9) to give 20, then you're wrong, because that expression will be translated as 2 * 9 + 1, i.e. 19, not 20. Use parentheses to get the expected result.
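That is, the corrected definition parenthesizes both the parameter and the whole expansion:
#define successor(n) ((n) + 1)   /* now 2 * successor(9) expands to 2 * ((9) + 1), i.e. 20 */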
No. It is not banned. And truth be told, it is impossible to do non-trivial multi-platform code without it.
No, your professor is wrong, or you misheard something.
#define is a preprocessor directive, and preprocessor macros are needed for conditional compilation and for some conventions which aren't simply built into the C language. For example, a recent C standard, namely C99, added support for booleans, but the familiar bool, true and false spellings are provided by preprocessor #defines in stdbool.h rather than being keywords. See this reference to stdbool.h
Macros are used pretty heavily in GNU-land C, and without conditional preprocessor commands there'd be no way to properly handle multiple inclusions of the same source files, so that makes them seem like essential language features to me.
Maybe your class is actually on C++, which despite many people's failure to do so, should be distinguished from C as it is a different language, and I can't speak for macros there. Or maybe the professor meant he's banning them in his class. Anyhow I'm sure the SO community would be interested in hearing which standard he's talking about, since I'm pretty sure all C standards support the use of macros.
Contrary to all of the answers to date, the use of preprocessor directives is oftentimes banned in high-reliability computing. There are two exceptions to this, the use of which are mandated in such organizations. These are the #include directive, and the use of an include guard in a header file. These kinds of bans are more likely in C++ rather than in C.
Here's but one example: 16.1.1 Use the preprocessor only for implementing include guards, and including header files with include guards.
Another example, this time for C rather than C++: the JPL Institutional Coding Standard for the C Programming Language. This C coding standard doesn't go quite so far as banning the use of the preprocessor completely, but it comes close. Specifically, it says
Rule 20 (preprocessor use)
Use of the C preprocessor shall be limited to file inclusion and simple macros. [Power of Ten Rule 8].
I'm neither condoning nor decrying those standards. But to say they don't exist is ludicrous.
If you want your C code to interoperate with C++ code, you will want to declare your externally visible symbols, such as function declarations, in the extern "C" namespace. This is often done using conditional compilation:
#ifdef __cplusplus
extern "C" {
#endif
/* C header file body */
#ifdef __cplusplus
}
#endif
Look at any header file and you will see something like this:
#ifndef _FILE_NAME_H
#define _FILE_NAME_H
//Exported functions, structs, defines, etc. go here
#endif /*_FILE_NAME_H */
These defines are not only allowed, but critical in nature, as each time the header file is referenced in other files its contents would otherwise be included again. This means without the guard you are redefining everything between the guard multiple times, which in the best case fails to compile and in the worst case leaves you scratching your head later wondering why your code doesn't work the way you want it to.
The compiler will also use defines, as seen here with gcc, which lets you test for things like the version of the compiler; this is very useful. I'm currently working on a project that needs to compile with avr-gcc, but we have a testing environment that we also run our code through. To keep the avr-specific files and registers from breaking our test code, we do something like this:
#ifdef __AVR__
//avr specific code here
#endif
With this in the production code, the complementary test code can compile without avr-gcc, and the code above is only compiled when using avr-gcc.
If you had just mentioned #define, I would have thought maybe he was alluding to its use for enumerations, which are better off using enum to avoid stupid errors such as assigning the same numerical value twice.
Note that even for this situation, it is sometimes better to use #defines than enums, for instance if you rely on numerical values exchanged with other systems and the actual values must stay the same even if you add/delete constants (for compatibility).
However, adding that #if, #ifdef, etc. should not be used either is just weird. Of course, they should probably not be abused, but in real life there are dozens of reasons to use them.
What he may have meant could be that (where appropriate), you should not hardcode behaviour in the source (which would require re-compilation to get a different behaviour), but rather use some form of run-time configuration instead.
That's the only interpretation I could think of that would make sense.