I'm currently using the __COUNTER__ macro in my C library code to generate unique integer identifiers. It works nicely, but I see two issues:
It's not part of any C or C++ standard.
Independent code that also uses __COUNTER__ might get confused.
I thus wish to implement an equivalent to __COUNTER__ myself.
Alternatives that I'm aware of, but do not want to use:
__LINE__ (because multiple macros per line wouldn't get unique ids)
BOOST_PP_COUNTER (because I don't want a boost dependency)
BOOST_PP_COUNTER proves that this can be done, even though other answers claim it is impossible.
In essence, I'm looking for a header file "mycounter.h", such that
#include "mycounter.h"
__MYCOUNTER__
__MYCOUNTER__ __MYCOUNTER__
__MYCOUNTER__
will be preprocessed by gcc -E to
(...)
0
1 2
3
without using the built-in __COUNTER__.
Note: Earlier, this question was marked as a duplicate of this, which deals with using __COUNTER__ rather than avoiding it.
You can't implement __COUNTER__ directly. The preprocessor is purely functional - no state changes. A hidden counter is inherently impossible in such a system. (BOOST_PP_COUNTER does not prove what you want can be done - it relies on #include and is therefore one-per-line only - you may as well use __LINE__. That said, the implementation is brilliant; you should read it anyway.)
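To see why: here is roughly what using BOOST_PP_COUNTER looks like (a minimal sketch from memory; consult the Boost.Preprocessor documentation for the authoritative form). Every increment needs its own #include directive, and a directive must sit alone on its line:

#include <boost/preprocessor/slot/counter.hpp>

int a = BOOST_PP_COUNTER;   /* 0 */
#include BOOST_PP_UPDATE_COUNTER()
int b = BOOST_PP_COUNTER;   /* 1 */
#include BOOST_PP_UPDATE_COUNTER()
int c = BOOST_PP_COUNTER;   /* 2 */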
What you can do is refactor your metaprogram so that the counter could be applied to the input data by a pure function. e.g. using good ol' Order:
#include <order/interpreter.h>
#define ORDER_PP_DEF_8map_count \
ORDER_PP_FN(8fn(8L, 8rec_mc(8L, 8nil, 0)))
#define ORDER_PP_DEF_8rec_mc \
ORDER_PP_FN(8fn(8L, 8R, 8C, \
8if(8is_nil(8L), \
8R, \
8let((8H, 8seq_head(8L)) \
(8T, 8seq_tail(8L)) \
(8D, 8plus(8C, 1)), \
8if(8is_seq(8H), \
8rec_mc(8T, 8seq_append(8R, 8seq_take(1, 8L)), 8C), \
8rec_mc(8T, 8seq_append(8R, 8seq(8C)), 8D) )))))
ORDER_PP (
8map_count(8seq( 8seq(8(A)), 8true, 8seq(8(C)), 8true, 8true )) //((A))(0)((C))(1)(2)
)
(recurses down the list, leaving sublist elements where they are and replacing non-list elements - 8true in the example - with an incrementing counter variable)
I assume you don't actually want to simply drop __COUNTER__ values at the program toplevel, so if you can place the code into which you need to weave __COUNTER__ values inside a wrapper macro that splits it into some kind of sequence or list, you can then feed the list to a pure function similar to the example.
Of course a metaprogramming library capable of expressing such code is going to be significantly less portable and maintainable than __COUNTER__ anyway. __COUNTER__ is supported by Intel, GCC, Clang and MSVC. (not everyone, e.g. pcc doesn't have it, but does anyone even use that?) Arguably if you demonstrate the feature in use in real code, it makes a stronger case to the standardisation committee that __COUNTER__ should become part of the next C standard.
You are confusing two different things:
1 - the preprocessor, which handles #define, #include, and the like. It works only at the text level (on character sequences) and has very few computing capabilities. It is so limited that it cannot implement __COUNTER__. The preprocessor's work consists only of macro expansion and file inclusion, and the crucial point is that it occurs before compilation even starts.
2 - the C++ language, and in particular the template (meta)programming language, which can be used to compute things during the compilation phase. It is indeed Turing-complete, but as I already said, compilation starts after preprocessing.
So what you are asking is not doable in standard C or C++. To solve this problem, Boost implements its own preprocessor, which is not standard-compliant and has much greater computing capabilities. In particular, it is possible to build an analogue of __COUNTER__ with it.
This small header of mine contains my own implementation of a C preprocessor counter (it uses a slightly different syntax).
Related
I have a lot of preprocessor macro definitions, like this:
#define FOO 1
#define BAR 2
#define BAZ 3
In the real application, each definition corresponds to an instruction in an interpreter virtual machine. The macros are also not sequential in numbering to leave space for future instructions; there may be a #define FOO 41, then the next one is #define BAR 64.
I'm now working on a debugger for this virtual machine, and need to effectively 'reverse' these preprocessor macros. In other words, I need a function which takes the number and returns the macro name, e.g. an input of 2 returns "BAR".
Of course, I could create a function using a switch myself:
const char* instruction_by_id(int id) {
switch (id) {
case FOO:
return "FOO";
case BAR:
return "BAR";
case BAZ:
return "BAZ";
default:
return "???";
}
}
However, this will be a nightmare to maintain, since renaming, removing or adding instructions will require this function to be modified too.
Is there another macro which I can use to create a function like this for me, or is there some other approach? If not, is it possible to create a macro to perform this task?
I'm using gcc 6.3 on Windows 10.
You have the wrong approach. Read SICP if you have not read it.
I have a lot of preprocessor macro definitions, like this:
#define FOO 1
#define BAR 2
#define BAZ 3
Remember that C or C++ code can be generated, and it is quite easy to instruct your build automation tool to generate some particular C file (with GNU make or ninja you just add some rule or recipe).
For example, you could use some different preprocessor (like GPP or m4), or some script (e.g. in awk or Python or Guile, etc.), or write your own program (in C, C++, OCaml, etc.) to generate the header file containing these #define-s. And another script or program (or the same one, invoked differently) could generate the C code of instruction_by_id.
Such basic metaprogramming techniques (of generating some or several C files from something higher level but specific) have been used since at least the 1980s (e.g. with yacc or RPCGEN). The C preprocessor facilitates that with its #include directive (since you can even include lines inside some function body, etc...). Actually, the idea that code is data (and proof) and data is code is even older (Church-Turing thesis, Curry-Howard correspondence, Halting problem). The Gödel, Escher, Bach book is very entertaining....
For example, you could decide to have a textual file opcodes.txt (or even some sqlite database containing stuff....) like
# ignore lines starting with a hash sign
FOO 1
BAR 2
and have two small awk or Python scripts (or two tiny specialized C programs; a sketch of such a generator appears after the code below), one generating the #define-s (into opcode-defines.h) and another generating the body of instruction_by_id (into opcode-instr.inc). Then you need to adapt your Makefile to generate these, put #include "opcode-defines.h" inside some global header, and have
const char* instruction_by_id(int id) {
switch (id) {
#include "opcode-instr.inc"
default: return "???";
}
}
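As a concrete (hypothetical) illustration of the "tiny specialized C program" option, a single generator along these lines could emit both files from opcodes.txt. The file names follow the convention above; everything else is an assumption for the sketch:

/* opcode-gen.c - reads opcodes.txt, writes opcode-defines.h
   and opcode-instr.inc; run it from your Makefile. */
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    FILE *in  = fopen("opcodes.txt", "r");
    FILE *def = fopen("opcode-defines.h", "w");
    FILE *ins = fopen("opcode-instr.inc", "w");
    char line[256], name[64];
    int value;

    if (!in || !def || !ins) { perror("fopen"); return EXIT_FAILURE; }
    while (fgets(line, sizeof line, in)) {
        if (line[0] == '#' || line[0] == '\n')      /* comment or blank line */
            continue;
        if (sscanf(line, "%63s %d", name, &value) == 2) {
            fprintf(def, "#define %s %d\n", name, value);
            fprintf(ins, "case %s: return \"%s\";\n", name, name);
        }
    }
    fclose(in); fclose(def); fclose(ins);
    return EXIT_SUCCESS;
}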
this will be a nightmare to maintain,
Not so with such a metaprogramming approach. You'll just maintain opcodes.txt and the scripts using it, and you express a given "knowledge element" (the relation of FOO to 1) only once (in a single line of opcodes.txt). Of course you need to document that (at the very least, with comments in your Makefile).
Metaprogramming from some higher-level, declarative formalization is a very powerful paradigm. In France, J. Pitrat pioneered it from the 1960s onward (and, now retired, he writes an interesting blog today). In the US, J. McCarthy and the Lisp community did likewise.
For an entertaining talk, see Liam Proven's FOSDEM 2018 talk on The circuit less traveled.
Large software systems use that metaprogramming approach quite often. For example, the GCC compiler has about a dozen C++ code generators (in total, they emit more than a million lines of C++).
Another way of looking at such an approach is the idea of domain-specific languages that could be compiled to C. If you use an operating system providing dynamic loading, you can even write a program emitting C code, forking a process to compile it into some plugin, then loading that plugin (on POSIX or Linux, with dlopen). Interestingly, computers are now fast enough to enable such an approach in an interactive application (in some sort of REPL): you can emit a C file of a few thousand lines, compile it into some .so shared object file, and dlopen that, in a fraction of a second. You could also use JIT-compiling libraries like GCCJIT or LLVM to generate code at runtime. You could embed an interpreter (like Lua or Guile) into your program.
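A minimal sketch of that emit-compile-dlopen cycle on POSIX (the file names and the generated function are made up for the example; link with -ldl on Linux, and error handling is mostly omitted):

#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    /* emit a trivial C file */
    FILE *f = fopen("/tmp/gen.c", "w");
    if (!f) return EXIT_FAILURE;
    fprintf(f, "int generated_fn(int x) { return x * 2; }\n");
    fclose(f);

    /* compile it into a plugin */
    if (system("gcc -shared -fPIC -O2 -o /tmp/gen.so /tmp/gen.c") != 0)
        return EXIT_FAILURE;

    /* load the plugin and call the generated function */
    void *plugin = dlopen("/tmp/gen.so", RTLD_NOW);
    if (!plugin) { fprintf(stderr, "%s\n", dlerror()); return EXIT_FAILURE; }
    int (*fn)(int) = (int (*)(int))dlsym(plugin, "generated_fn");
    printf("generated_fn(21) = %d\n", fn(21));   /* prints 42 */
    dlclose(plugin);
    return EXIT_SUCCESS;
}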
BTW, such metaprogramming approaches are one of the reasons why basic compilation techniques should be known by most developers (and not just people in the compiler business); another reason is that parsing problems are very common. So read the Dragon Book.
Be aware of Greenspun's tenth rule. It is much more than a joke, actually a profound truth about large software.
In a similar case I've resorted to defining a text file format that defines the instructions, and writing a program to read this file and write out the C source of the actual instruction definitions and the C source of functions like your instruction_by_id(). This way you only need to maintain the text file.
As awesome as general code generation is, I’m surprised that nobody mentioned that (if you relax your problem definition just a bit) the C preprocessor is perfectly capable of generating the necessary code, using a technique called X macros. In fact every simple bytecode VM in C that I’ve seen uses this approach.
The technique works as follows. First, there is a file (call it insns.h) containing the authoritative list of instructions,
INSN(FOO, 1)
INSN(BAR, 2)
INSN(BAZ, 3)
or alternatively a macro in some other header containing the same,
#define INSNS \
INSN(FOO, 1) \
INSN(BAR, 2) \
INSN(BAZ, 3)
whichever is more convenient for you. (I'll use the first option in the following.) Note that INSN is not defined anywhere. (Traditionally it would be called X, thus the name of the technique.) Wherever you want to loop over your instructions, define INSN to generate the code you want, include insns.h, then undefine INSN again.
In your disassembler, write
const char *instruction_by_id(int id) {
switch (id) {
#define INSN(NAME, VALUE) \
case NAME: return #NAME;
#include "insns.h" /* or just INSNS if you use a macro */
#undef INSN
default: return "???";
}
}
using the prefix stringification operator # to turn names-as-identifiers into names-as-string-literals.
You obviously can’t define the constants this way, because macros cannot define other macros in the C preprocessor. However, if you don’t insist that the instruction constants be preprocessor constants, there’s a different perfectly serviceable constant facility in the C language: enumerations. Whether or not you use an enumerated type, the enumerators defined inside it are regular integer constants from the point of view of the compiler (though not the preprocessor—you cannot use #ifdef with them, for example). So, using an anonymous enumeration type, define your constants like this:
enum {
#define INSN(NAME, VALUE) \
NAME = VALUE,
#include "insns.h" /* or just INSNS if you use a macro */
#undef INSN
NINSNS /* C89 doesn’t allow trailing commas in enumerations (but C99+ does), and you may find this constant useful in any case */
};
If you want to statically initialize an array indexed by your bytecodes, you’ll have to use C99 designated initializers {[FOO] = foovalue, [BAR] = barvalue, /* ... */} whether or not you use X macros. However, if you don’t insist on assigning custom codes to your instructions, you can eliminate VALUE from the above and have the enumeration assign consecutive codes automatically, and then the array can be simply initialized in order, {foovalue, barvalue, /* ... */}. As a bonus, NINSNS above then becomes equal to the number of the instructions and the size of any such array, which is why I called it that.
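For example, here is a hypothetical name table built from the same list (assuming the enum above is in scope); entries for unused codes are implicitly NULL:

static const char *insn_names[] = {
#define INSN(NAME, VALUE) [NAME] = #NAME,
#include "insns.h" /* or just INSNS if you use a macro */
#undef INSN
};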
There are more tricks you can use here. For example, if some instructions have variants for several data types, the instruction list X macro can call the type list X macro to generate the variants automatically. (The somewhat ugly second option of storing the X macro list in a large macro and not an include file may be more handy here.) The INSN macro may take additional arguments such as the mode name, which would be ignored in the code list but used to call the appropriate decoding routine in the disassembler. You can use the token pasting operator ## to add prefixes to the names of the constants, as in INSN_ ## NAME to generate INSN_FOO, INSN_BAR, etc. And so on.
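The prefixing trick, for instance, might look like this sketch:

/* expands to INSN_FOO = 1, INSN_BAR = 2, INSN_BAZ = 3, */
enum {
#define INSN(NAME, VALUE) INSN_ ## NAME = VALUE,
#include "insns.h" /* or just INSNS if you use a macro */
#undef INSN
};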
I wanted to see how the arduino function digitalWrite actually worked. But when I looked for the source for the function it was full of macros that were themselves defined in terms of other macros. Why would it be built this way instead of just using a function? Is this just poor coding style or the proper way of doing things in C?
For example, digitalWrite contains the macro digitalPinToPort.
#define digitalPinToPort(P) ( pgm_read_byte( digital_pin_to_port_PGM + (P) ) )
And pgm_read_byte is a macro:
#define pgm_read_byte(address_short) pgm_read_byte_near(address_short)
And pgm_read_byte_near is a macro:
#define pgm_read_byte_near(address_short) __LPM((uint16_t)(address_short))
And __LPM is a macro:
#define __LPM(addr) __LPM_classic__(addr)
And __LPM_classic__ is a macro:
#define __LPM_classic__(addr) \
(__extension__({ \
uint16_t __addr16 = (uint16_t)(addr); \
uint8_t __result; \
__asm__ __volatile__ \
( \
"lpm" "\n\t" \
"mov %0, r0" "\n\t" \
: "=r" (__result) \
: "z" (__addr16) \
: "r0" \
); \
__result; \
}))
Not directly related to this, but I was also under the impression that double underscore should only be used by the compiler. Is having LPM prefixed with __ proper?
If your question is "why one would use multiple layers of macros?", then:
why not? Notably in the pre-C99 era, without inline. A canonical example was 1980s-era getc, which was (IIRC) at that time (SunOS 3.2, 1987) documented as being a macro, with a NOTE in the man page telling (I forget the details) that with some FILE* filearr[]; a getc(filearr[i++]) was wrong (IIRC, the undefined behavior terminology did not exist at that time). When you looked into some system headers (e.g. <stdio.h> or some header included by it) you could find the definition of such a macro. And at that time (with computers running at only a few MHz, so a thousand times slower than today) getc had to be a macro for efficiency reasons (since inline did not exist, and compilers did no interprocedural optimizations like they are able to do now). Of course, you could use getc in your own macros.
even today, some standards define macros. In particular, today's waitpid(2) system call documents WIFEXITED & WEXITSTATUS as macros, and it is sensible to #define some of your own macros that combine them.
the main point is to understand the workings of the C preprocessor, and its profoundly textual (so very brittle) nature. This is explained in all textbooks about C. Hence you need to understand what is happening under the hood.
the rule of thumb in modern-era C (that is, C99 & C11) is to systematically prefer having some static inline function (defined in some header file) over an equivalent macro; see the sketch after this list. In other words, #define some macro only when you cannot avoid it, and explicitly document that fact.
several layers of macros might (sometimes) increase code readability.
macros can be tested with #ifdef macroname which is sometimes useful.
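To illustrate the static inline rule of thumb above (a generic sketch, not tied to any particular codebase):

/* function-like macro: not type-safe, and the argument
   is evaluated twice */
#define SQUARE_MACRO(x) ((x) * (x))

/* preferred since C99: type-safe, argument evaluated exactly
   once, and the compiler can still inline it */
static inline int square(int x) { return x * x; }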
Of course, when you dare to define several layers of macros (I won't call them "recursive"; read about self-referential macros), you need to be very careful and understand what is happening with all of them (combined & separately). Looking at the preprocessed form is helpful.
BTW, to debug complex macros, I sometimes do
gcc -C -E -I/some/directory mysource.c | sed 's:^#://#:' > mysource.i
then I look into mysource.i and sometimes even have to compile it, perhaps as gcc -c -Wall mysource.i, to get warnings located in the preprocessed form mysource.i (which I can examine in my emacs editor). The sed command "comments out" the lines starting with # which set source locations (à la #line). Sometimes I even run indent mysource.i.
(actually, I have a special rule for that in my Makefile)
Is having LPM prefixed with __ proper?
BTW, names starting with __ (or with _ followed by an uppercase letter) are, by the standard and by convention, reserved for the implementation. In principle you are not allowed to use such names yourself, so there cannot be any collision.
Notice that the __LPM_classic__ macro uses the statement-expression extension (of GCC & Clang).
Look also at other programming languages. Common Lisp has a very different model of macros (and a much more interesting one). Read about hygienic macros. My personal regret is that the C preprocessor has never evolved to be much more powerful (and Scheme-like). Why that did not happen (imagine a C preprocessor able to invoke Guile code to do the macro expansion!) is probably for sociological & economical reasons.
You still should sometimes consider using some other preprocessor (like m4 or GPP) to generate C code. See autoconf.
Why would it be built this way instead of just using a function?
The purpose is likely to optimize function calls with pre-C99 compilers, which don't support inline functions (via the inline keyword). This way, the whole stack of function-like macros is essentially merged by the preprocessor into a single block of code.
Every time you call a function in C, there is a tiny overhead, because of jumping around program's code, managing the stack frame and passing arguments. This cost is negligible in most applications, but if the function is called very often, then it may become a performance bottleneck.
Is this just poor coding style or the proper way of doing things in C?
It's hard to give a definitive answer, because coding style is a subjective topic. I would say: consider using inline functions or, even better, let the compiler inline them by itself. They are type-safe, more readable, and more predictable, and with proper assistance from the compiler, the net result is essentially the same.
Related reference (it's for C++, but the idea is generally the same for C):
Why should I use inline functions instead of plain old #define macros?
I am a first year computer science student and my professor said #define is banned in the industry standards along with #if, #ifdef, #else, and a few other preprocessor directives. He used the word "banned" because of unexpected behaviour.
Is this accurate? If so why?
Are there, in fact, any standards which prohibit the use of these directives?
First I've heard of it.
No; #define and so on are widely used. Sometimes too widely used, but definitely used. There are places where the C standard mandates the use of macros — you can't avoid those easily. For example, §7.5 Errors <errno.h> says:
The macros are
EDOM
EILSEQ
ERANGE
which expand to integer constant expressions with type int, distinct positive values, and which are suitable for use in #if preprocessing directives; …
Given this, it is clear that not all industry standards prohibit the use of the C preprocessor macro directives. However, there are 'best practices' or 'coding guidelines' standards from various organizations that prescribe limits on the use of the C preprocessor, though none ban its use completely — it is an innate part of C and cannot be wholly avoided. Often, these standards are for people working in safety-critical areas.
One standard you could check is the MISRA C (2012) standard; that tends to proscribe things, but even it recognizes that #define et al are sometimes needed (section 8.20, rules 20.1 through 20.14, covers the C preprocessor).
The NASA GSFC (Goddard Space Flight Center) C Coding Standards simply say:
Macros should be used only when necessary. Overuse of macros can make code harder to read and maintain because the code no longer reads or behaves like standard C.
The discussion after that introductory statement illustrates the acceptable use of function macros.
The CERT C Coding Standard has a number of guidelines about the use of the preprocessor, and implies that you should minimize the use of the preprocessor, but does not ban its use.
Stroustrup would like to make the preprocessor irrelevant in C++, but that hasn't happened yet. As Peter notes, some C++ standards, such as the JSF AV C++ Coding Standards (Joint Strike Fighter, Air Vehicle) from circa 2005, dictate minimal use of the C preprocessor. Essentially, the JSF AV C++ rules restrict it to #include and the #ifndef XYZ_H / #define XYZ_H / … / #endif dance that prevents multiple inclusions of a single header. C++ has some options that are not available in C — notably, better support for typed constants that can then be used in places where C does not allow them to be used. See also static const vs #define vs enum for a discussion of the issues there.
It is a good idea to minimize the use of the preprocessor — it is often abused at least as much as it is used (see the Boost preprocessor 'library' for illustrations of how far you can go with the C preprocessor).
Summary
The preprocessor is an integral part of C, and #define, #if, etc. cannot be wholly avoided. The statement by the professor in the question is not generally valid: "#define is banned in the industry standards along with #if, #ifdef, #else, and a few other macros" is an over-statement at best, but might be supportable with explicit reference to specific industry standards (but the standards in question do not include ISO/IEC 9899:2011 — the C standard).
Note that David Hammen has provided information about one specific C coding standard — the JPL C Coding Standard — that prohibits a lot of things that many people use in C, including limiting the use of the C preprocessor (and limiting the use of dynamic memory allocation, and prohibiting recursion — read it to see why, and decide whether those reasons are relevant to you).
No, use of macros is not banned.
In fact, use of #include guards in header files is one common technique that is often mandatory and encouraged by accepted coding guidelines. Some folks claim that #pragma once is an alternative to that, but the problem is that #pragma once - by definition, since pragmas are a hook provided by the standard for compiler-specific extensions - is non-standard, even if it is supported by a number of compilers.
That said, there are a number of industry guidelines and encouraged practices that actively discourage all usage of macros other than #include guards because of the problems macros introduce (not respecting scope, etc). In C++ development, use of macros is frowned upon even more strongly than in C development.
Discouraging use of something is not the same as banning it, since it is still possible to legitimately use it - for example, by documenting a justification.
Some coding standards may discourage or even forbid the use of #define to create function-like macros that take arguments, like
#define SQR(x) ((x)*(x))
because a) such macros are not type-safe, and b) somebody will inevitably write SQR(x++), which is bad juju.
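Spelled out (a hypothetical snippet):

#define SQR(x) ((x)*(x))

int i = 3;
int s = SQR(i++);  /* expands to ((i++)*(i++)): i is modified twice
                      between sequence points - undefined behavior */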
Some standards may discourage or ban the use of #ifdefs for conditional compilation. For example, the following code uses conditional compilation to properly print out a size_t value. For C99 and later, you use the %zu conversion specifier; for C89 and earlier, you use %lu and cast the value to unsigned long:
#if __STDC_VERSION__ >= 199901L
# define SIZE_T_CAST
# define SIZE_T_FMT "%zu"
#else
# define SIZE_T_CAST (unsigned long)
# define SIZE_T_FMT "%lu"
#endif
...
printf( "sizeof foo = " SIZE_T_FMT "\n", SIZE_T_CAST sizeof foo );
Some standards may mandate that instead of doing this, you implement the module twice, once for C89 and earlier, once for C99 and later:
/* C89 version */
printf( "sizeof foo = %lu\n", (unsigned long) sizeof foo );
/* C99 version */
printf( "sizeof foo = %zu\n", sizeof foo );
and then let Make (or Ant, or whatever build tool you're using) deal with compiling and linking the correct version. For this example that would be ridiculous overkill, but I've seen code that was an untraceable rat's nest of #ifdefs that should have had that conditional code factored out into separate files.
However, I am not aware of any company or industry group that has banned the use of preprocessor statements outright.
Macros can not be "banned". The statement is nonsense. Literally.
For example, section 7.5 Errors <errno.h> of the C Standard requires the use of macros:
1 The header <errno.h> defines several macros, all relating to the reporting of error conditions.
2 The macros are
EDOM
EILSEQ
ERANGE
which expand to integer constant expressions with type int, distinct
positive values, and which are suitable for use in #if preprocessing
directives; and
errno
which expands to a modifiable lvalue that has type int and thread
local storage duration, the value of which is set to a positive error
number by several library functions. If a macro definition is
suppressed in order to access an actual object, or a program defines
an identifier with the name errno, the behavior is undefined.
So, not only are macros a required part of C, in some cases not using them results in undefined behavior.
No, #define is not banned. Misuse of #define, however, may be frowned upon.
For instance, you may use
#define DEBUG
in your code so that later on, you can designate parts of your code for conditional compilation using #ifdef DEBUG, for debug purposes only. I don't think anyone in his right mind would want to ban something like this. Macros defined using #define are also used extensively in portable programs, to enable/disable compilation of platform-specific code.
However, if you are using something like
#define PI 3.141592653589793
your teacher may rightfully point out that it is much better to declare PI as a constant with the appropriate type, e.g.,
const double PI = 3.141592653589793;
as it allows the compiler to do type checking when PI is used.
Similarly (as mentioned by John Bode above), the use of function-like macros may be disapproved of, especially in C++ where templates can be used. So instead of
#define SQ(X) ((X)*(X))
consider using
double SQ(double X) { return X * X; }
or, in C++, better yet,
template <typename T>T SQ(T X) { return X * X; }
Once again, the idea is that by using the facilities of the language instead of the preprocessor, you allow the compiler to type check and also (possibly) generate better code.
Once you have enough coding experience, you'll know exactly when it is appropriate to use #define. Until then, I think it is a good idea for your teacher to impose certain rules and coding standards, but preferably they themselves should know, and be able to explain, the reasons. A blanket ban on #define is nonsensical.
That's completely false; macros are heavily used in C. Beginners often use them badly, but that's not a reason to ban them from industry. A classic bad usage is #define successor(n) n + 1. If you expect 2 * successor(9) to give 20, then you're wrong, because that expression will be translated as 2 * 9 + 1, i.e. 19, not 20. Use parentheses to get the expected result, as the sketch below shows.
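A minimal sketch of the fix:

#define successor(n) n + 1       /* bad:  2 * successor(9) -> 2 * 9 + 1 == 19 */
#define SUCCESSOR(n) ((n) + 1)   /* good: 2 * SUCCESSOR(9) -> 2 * ((9) + 1) == 20 */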
No. It is not banned. And truth be told, it is impossible to write non-trivial multi-platform code without it.
No, your professor is wrong, or you misheard something.
#define is a preprocessor macro directive, and preprocessor macros are needed for conditional compilation and for some conventions which aren't simply built into the C language. For example, in a recent C standard, namely C99, support for booleans was added. But it's not supported "natively" by the language; it's provided via preprocessor #defines. See this reference to stdbool.h
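For the curious, here is a simplified sketch of what a C99 <stdbool.h> essentially contains (per C99 §7.16):

#define bool  _Bool
#define true  1
#define false 0
#define __bool_true_false_are_defined 1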
Macros are used pretty heavily in GNU-land C, and without conditional preprocessor commands there'd be no way to properly handle multiple inclusions of the same source files, so that makes them seem like essential language features to me.
Maybe your class is actually on C++, which, despite many people's failure to do so, should be distinguished from C, as it is a different language; I can't speak for macros there. Or maybe the professor meant he's banning them in his class. Anyhow, I'm sure the SO community would be interested in hearing which standard he's talking about, since I'm pretty sure all C standards support the use of macros.
Contrary to all of the answers to date, the use of preprocessor directives is oftentimes banned in high-reliability computing. There are two exceptions to this, the use of which are mandated in such organizations. These are the #include directive, and the use of an include guard in a header file. These kinds of bans are more likely in C++ rather than in C.
Here's but one example: 16.1.1 Use the preprocessor only for implementing include guards, and including header files with include guards.
Another example, this time for C rather than C++: the JPL Institutional Coding Standard for the C Programming Language. This C coding standard doesn't go quite so far as banning the use of the preprocessor completely, but it comes close. Specifically, it says
Rule 20 (preprocessor use)
Use of the C preprocessor shall be limited to file inclusion and simple macros. [Power of Ten Rule 8].
I'm neither condoning nor decrying those standards. But to say they don't exist is ludicrous.
If you want your C code to interoperate with C++ code, you will want to declare your externally visible symbols, such as function declarations, in the extern "C" namespace. This is often done using conditional compilation:
#ifdef __cplusplus
extern "C" {
#endif
/* C header file body */
#ifdef __cplusplus
}
#endif
Look at any header file and you will see something like this:
#ifndef _FILE_NAME_H
#define _FILE_NAME_H
//Exported functions, structs, defines, etc. go here
#endif /*_FILE_NAME_H */
These defines are not only allowed but critical in nature, because the header file will be included each time it is referenced by another file. Without the guard, you would be redefining everything in between multiple times, which at best fails to compile, and at worst leaves you scratching your head later wondering why your code doesn't work the way you want it to.
The compiler also provides defines, as seen here with gcc, that let you test for things like the version of the compiler, which is very useful. I'm currently working on a project that needs to compile with avr-gcc, but we also have a testing environment that we run our code through. To keep the avr-specific files and registers from breaking our test build, we do something like this:
#ifdef __AVR__
//avr specific code here
#endif
Using this in the production code, the complementary test code can compile without using avr-gcc, and the code above is only compiled when using avr-gcc.
If you had just mentioned #define, I would have thought maybe he was alluding to its use for enumerations, which are better off using enum to avoid stupid errors such as assigning the same numerical value twice.
Note that even for this situation, it is sometimes better to use #defines than enums, for instance if you rely on numerical values exchanged with other systems and the actual values must stay the same even if you add/delete constants (for compatibility).
However, adding that #if, #ifdef, etc. should not be used either is just weird. Of course, they should probably not be abused, but in real life there are dozens of reasons to use them.
What he may have meant could be that (where appropriate), you should not hardcode behaviour in the source (which would require re-compilation to get a different behaviour), but rather use some form of run-time configuration instead.
That's the only interpretation I could think of that would make sense.
Let's say I have a macro in an inclusion file:
// a.h
#define VALUE SUBSTITUTE
And another file that includes it:
// b.h
#define SUBSTITUTE 3
#include "a.h"
Is it the case that VALUE is now defined to SUBSTITUTE and will be macro expanded in two passes to 3, or is it the case that VALUE has been set to the macro expanded value of SUBSTITUTE (i.e. 3)?
I ask this question in the interest of trying to understand the Boost preprocessor library and how its BOOST_PP_SLOT defines work (edit: and I mean the underlying workings). Therefore, while I am asking the above question, I'd also be interested if anyone could explain that.
(and I guess I'd also like to know where the heck the 'painted blue' rules are written...)
VALUE is defined as SUBSTITUTE. The definition of VALUE is not aware at any point that SUBSTITUTE has also been defined. After VALUE is replaced, whatever it was replaced by will be scanned again, and potentially more replacements applied then. All defines exist in their own conceptual space, completely unaware of each other; they only interact with one another at the site of expansion in the main program text (defines are directives, and thus not part of the program proper).
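A minimal illustration of that rescanning behaviour (hypothetical snippet):

#define VALUE SUBSTITUTE
#define SUBSTITUTE 3

int a = VALUE;   /* VALUE -> SUBSTITUTE, rescanned -> 3 */

#undef SUBSTITUTE
int b = VALUE;   /* VALUE -> SUBSTITUTE, no further replacement:
                    compile error, 'SUBSTITUTE' is undeclared */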
The rules for the preprocessor are specified alongside the rules for C proper in the language standard. The standard documents themselves cost money, but you can usually download the "final draft" for free; the latest (C11) can be found here: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf
For at-home use the draft is pretty much equivalent to the real thing. Most people who quote the standard are actually looking at copies of the draft. (Certainly it's closer to the actual standard than any real-world C compiler is...)
There's a more accessible description of the macro rules in the GCC manual: http://gcc.gnu.org/onlinedocs/cpp/Self_002dReferential-Macros.html
Additionally... I couldn't tell you much about the Boost preprocessor library, not having used it, but there's a beautiful pair of libraries by the same authors called Order and Chaos that are very "clean" (as macro code goes) and easy to understand. They're more academic in tone and intended to be pure rather than portable; which might make them easier reading.
(Since I don't know Boost PP, I don't know how relevant this is to your question, but) there's also a good introductory example of the kinds of techniques these libraries use for advanced metaprogramming constructs in this answer: Is the C99 preprocessor Turing complete?
I would like to be able to write preprocessor macros using a more fully fledged language. Such a language would ideally include the following features:
boolean and natural arithmetic and comparisons
branching based on comparisons
list representation
recursion
variable binding
functions as first class values and higher order functions
partial function application and currying
functional primitives, such as map and fold
useful functions/structures for common code generation tasks
Is this possible to implement within the C preprocessor?
Incredibly, the answer is yes! The Order header-only library provides a set of macros that implement a functional language inside the C preprocessor. It includes all of the specified features and more. You can use it as long as your C preprocessor is nearly completely C99 compliant. The GNU CPP (as used in GCC and G++) is compatible, as is the Boost Wave preprocessor. Order has been around since 2004. Although it is no longer maintained, it is very full featured, if not fully documented.
Here is a simple example use of Order:
#define AVERAGE(...) ((ORDER_PP( \
8seq_for_each_with_delimiter( \
8put, \
8emit(8quote(+)), \
8tuple_to_seq(8quote((__VA_ARGS__)))))) / \
ORDER_PP(8to_lit(8tuple_size(8quote((__VA_ARGS__))))))
The macro AVERAGE expands to an expression expressing the mean of the provided arguments. AVERAGE(a, b, c) (for example) expands to ((a + b + c) / 3). This is a very simple example that does not use all of Order's features.
Another simple example, showing use of pre-compilation arithmetic (using an arbitrary precision natural number representation), functions as first class values (see the use of 8plus) and variable binding, is a macro for computing (integer arithmetic) averages in the preprocessor:
#define AVERAGE_LITERAL(...) ORDER_PP(\
8let((8A, 8quote((__VA_ARGS__))), \
8to_lit(8quotient( \
8seq_fold(8plus, 0, 8tuple_to_seq(8A)), \
8tuple_size(8A)))))
AVERAGE_LITERAL(5, 6, 8, 9) expands to 7.
I have only touched upon a few of the features. More practical examples are provided in the accompanying documentation and tutorial, including those showing how Order can help remove practically all tedious code repetition.
Order is very powerful, and it is still very relevant to C++ programmers - templates and inlining can only solve some problems. Order solves most of the others. The only inherent limitations I can find are the inability to manipulate string literals or manipulate tokens in any way other than replacing or concatenating, as this is imposed by the C preprocessor.