I am testing two versions of the same code (with GCC version 4.9.2 on Linux, no parameters).
Both have a #define directive, followed by an #ifdef/#endif pair further down.
Now, it turns out that the combination works properly only if the label after the initial #define starts with an underscore. Without the underscore, it works... in a very weird way: only about every third time.
In other words, this works
#define _whatever
while this doesn't:
#define whatever
Even though I know how to make the directive work, I'm just curious: does that behavior follow any standard?
Edit:
Following requests below, here are two absolutely real examples.
This one prints the line "Preprocessor works":
#define _whatever
#include <stdio.h>
void main()
{
#ifdef _whatever
printf("Preprocessor works \n");
#endif
}
... and this one doesn't output anything:
#define whatever
#include <stdio.h>
void main()
{
#ifdef whatever
printf("Preprocessor works \n");
#endif
}
Yes, I am even using the word "whatever" literally - I don't think it is defined anywhere else. But again, it's the underscore that makes the label work.
There is absolutely no requirement, in any known version of gcc, that preprocessor macros begin with an underscore.
As a general rule, identifiers that begin with various combinations of underscores are reserved to the implementation, and user code is advised to avoid them. So #define whatever and #ifdef whatever absolutely must work.
I agree that this is a baffling and frustrating problem. There's something strange going on, but whatever the explanation is, it's not that gcc is requiring leading underscores.
Ok, so the answer is - my sloppy command of the tools.
Specifically:
(1) I was using a header file to add/remove the #define directive
(2) I have (mindlessly) compiled the header by using "gcc *" in place of "gcc *.c"
(3) The occasional presence of the compiled *.h.gch file explains the results.
So, what seemed like erratic behavior was actually me (mindlessly) removing the *.h.gch from time to time.
Thanks everyone - I have learned a lot from all the replies.
Original Question
What I'd like is not a standard C pre-processor, but a variation on it which would accept from somewhere - probably the command line via -DNAME1 and -UNAME2 options - a specification of which macros are defined, and would then eliminate dead code.
It may be easier to understand what I'm after with some examples:
#ifdef NAME1
#define ALBUQUERQUE "ambidextrous"
#else
#define PHANTASMAGORIA "ghostly"
#endif
If the command were run with '-DNAME1', the output would be:
#define ALBUQUERQUE "ambidextrous"
If the command were run with '-UNAME1', the output would be:
#define PHANTASMAGORIA "ghostly"
If the command were run with neither option, the output would be the same as the input.
This is a simple case - I'd be hoping that the code could handle more complex cases too.
To illustrate with a real-world but still simple example:
#ifdef USE_VOID
#ifdef PLATFORM1
#define VOID void
#else
#undef VOID
typedef void VOID;
#endif /* PLATFORM1 */
typedef void * VOIDPTR;
#else
typedef mint VOID;
typedef char * VOIDPTR;
#endif /* USE_VOID */
I'd like to run the command with -DUSE_VOID -UPLATFORM1 and get the output:
#undef VOID
typedef void VOID;
typedef void * VOIDPTR;
Another example:
#ifndef DOUBLEPAD
#if (defined NT) || (defined OLDUNIX)
#define DOUBLEPAD 8
#else
#define DOUBLEPAD 0
#endif /* NT */
#endif /* !DOUBLEPAD */
Ideally, I'd like to run with -UOLDUNIX and get the output:
#ifndef DOUBLEPAD
#if (defined NT)
#define DOUBLEPAD 8
#else
#define DOUBLEPAD 0
#endif /* NT */
#endif /* !DOUBLEPAD */
This may be pushing my luck!
Motivation: large, ancient code base with lots of conditional code. Many of the conditions no longer apply - the OLDUNIX platform, for example, is no longer made and no longer supported, so there is no need to have references to it in the code. Other conditions are always true. For example, features are added with conditional compilation so that a single version of the code can be used for both older versions of the software where the feature is not available and newer versions where it is available (more or less). Eventually, the old versions without the feature are no longer supported - everything uses the feature - so the condition on whether the feature is present or not should be removed, and the 'when feature is absent' code should be removed too. I'd like to have a tool to do the job automatically because it will be faster and more reliable than doing it manually (which is rather critical when the code base includes 21,500 source files).
(A really clever version of the tool might read #include'd files to determine whether the control macros - those specified by -D or -U on the command line - are defined in those files. I'm not sure whether that's truly helpful except as a backup diagnostic. Whatever else it does, though, the pseudo-pre-processor must not expand macros or include files verbatim. The output must be source similar to, but usually simpler than, the input code.)
Status Report (one year later)
After a year of use, I am very happy with 'sunifdef' recommended by the selected answer. It hasn't made a mistake yet, and I don't expect it to. The only quibble I have with it is stylistic. Given an input such as:
#if (defined(A) && defined(B)) || defined(C) || (defined(D) && defined(E))
and run with '-UC' (C is never defined), the output is:
#if defined(A) && defined(B) || defined(D) && defined(E)
This is technically correct because '&&' binds tighter than '||', but it is an open invitation to confusion. I would much prefer it to include parentheses around the sets of '&&' conditions, as in the original:
#if (defined(A) && defined(B)) || (defined(D) && defined(E))
However, given the obscurity of some of the code I have to work with, for that to be the biggest nit-pick is a strong compliment; it is a valuable tool to me.
The New Kid on the Block
Having checked the URL for inclusion in the information above, I see that (as predicted) there is a new program called Coan that is the successor to 'sunifdef'. It is available on SourceForge and has been since January 2010. I'll be checking it out...further reports later this year, or maybe next year, or sometime, or never.
I know absolutely nothing about C, but it sounds like you are looking for something like unifdef. Note that it hasn't been updated since 2000, but there is a successor called "Son of unifdef" (sunifdef).
Also you can try this tool http://coan2.sourceforge.net/
Something like this will remove the ifdef blocks:
coan source -UYOUR_FLAG --filter c,h --recurse YourSourceTree
I used unifdef years ago for just the sort of problem you describe, and it worked fine. Even if it hasn't been updated since 2000, the syntax of preprocessor ifdefs hasn't changed materially since then, so I expect it will still do what you want. I suppose there might be some compile problems, although the packages appear recent.
I've never used sunifdef, so I can't comment on it directly.
Around 2004 I wrote a tool that did exactly what you are looking for. I never got around to distributing the tool, but the code can be found here:
http://casey.dnsalias.org/exifdef-0.2.zip (that's a dsl link)
It's about 1.7k lines and implements enough of the C grammar to parse preprocessor statements, comments, and strings using bison and flex.
If you need something similar to a preprocessor, the flexible solution is Wave (from boost). It's a library designed to build C-preprocessor-like tools (including such things as C++03 and C++0x preprocessors). As it's a library, you can hook into its input and output code.
To avoid impossible situations, one could reduce the problem to two cases.
Case 1
The first (simplest) case is the situation where the preprocessor has a chance to detect it, that is, there's a preprocessor directive that depends on whether a macro is predefined (defined before the first line of input) or not. For example:
#ifdef FOO
#define BAR 42
#else
#define BAR 43
#endif
depends on FOO being predefined or not. However the file
#undef FOO
#ifdef FOO
#define BAR 42
#endif
does not. A harder case would be to detect whether the dependency actually matters, which it doesn't in the above cases (as neither FOO nor BAR affects the output).
Case 2
The second (harder) case is where successful compilation depends on predefined macros:
INLINE int fubar(void) {
return 42;
}
which is perfectly fine as far as the preprocessor is concerned whether or not INLINE is predefined, but unless INLINE is carefully defined that code won't compile. Similarly, it might be possible in this case to exclude cases where the output isn't affected, but I can't find an example of that. The complication here is that in the example:
int fubar(void) {
return 42;
}
fubar being predefined as a macro can alter the successful compilation of this, so one would probably need to restrict it to cases where a symbol needs to be predefined in order to compile successfully.
I guess such a tool would be something similar to a preprocessor (and C parser in the second case). The question is if there is such a tool? Or is there a tool that only handles the first case? Or none at all?
In C everything can be (re)defined, so there is no way to know in advance what is intended to be (re)defined. Usually some naming conventions help us figure out what is meant to be a macro (like upper-case names). Therefore it is not possible to have such a tool. Of course, if you assume that the compilation errors are caused by missing macro definitions, then you can use them to analyze what is missing.
I am using both the JUCE Library and a number of Boost headers in my code. JUCE defines "T" as a macro (groan), and Boost often uses "T" in its template definitions. The result is that if you somehow include the JUCE headers before the Boost headers, the preprocessor expands the JUCE macro in the Boost code, and then the compiler gets hopelessly lost.
Keeping my includes in the right order isn't hard most of the time, but it can get tricky when you have a JUCE class that includes some other classes and somewhere up the chain one file includes Boost, and if any of the files before it needed a JUCE include you're in trouble.
My initial hope at fixing this was to
#undef T
before any includes for Boost. But the problem is, if I don't re-define it, then other code gets confused that "T" is not declared.
I then thought that maybe I could do some circular #define trickery like so:
// some includes up here
#define ___T___ T
#undef T
// include boost headers here
#define T ___T___
#undef ___T___
Ugly, but I thought it may work.
Sadly no. I get errors in places using "T" as a macro that
'___T___' was not declared in this scope.
Is there a way to make these two libraries work reliably together?
As greyfade pointed out, your ___T___ trick doesn't work because the preprocessor is a pretty simple creature. An alternative approach is to use pragma directives:
// JUCE includes here
#pragma push_macro("T")
#undef T
// include boost headers here
#pragma pop_macro("T")
That should work in MSVC++, and GCC has added support for push_macro and pop_macro for compatibility with it. Technically this is implementation-dependent, but I don't think there's a standard way of temporarily suppressing the definition.
Can you wrap the offending library in another include and trap the #define T inside?
eg:
JUICE_wrapper.h:
#include "juice.h"
#undef T
main.cpp:
#include "JUICE_wrapper.h"
#include "boost.h"
rest of code....
"I then thought that maybe I could do some circular #define trickery like so:"
The C Preprocessor doesn't work this way. Preprocessor symbols aren't defined in the same sense that a symbol is given meaning when, e.g., you define a function.
It might help to think of the preprocessor as a text-replace engine. When a symbol is defined, it's treated as a straight-up text replacement until the end of the file or until it's undefined. A macro's replacement text is not a value that can be copied into another macro: #define ___T___ T records only the token T, which is expanded later, at the point of use. Therefore, the only standard way to restore the definition of T after you've #undefed it is to completely reproduce it in a new #define later in your code.
The best you can do is to simply not use Boost or petition the developers of JUCE to not use T as a macro. (Or, worst case, fix it yourself by changing the name of the macro.)
I added this in my code:
#ifdef DEBUG_MODE
printf("i=%d\n",i);
fflush(stdout);
#endif
and my question is: if I'm not in DEBUG_MODE, what does the compiler do when compiling this?
The compiler will do nothing, because there will be nothing there when DEBUG_MODE is not defined.
#ifdef and #endif control conditional compilation. This happens during an initial pass over the program, making dumb textual substitutions before the compiler even begins to consider the file to contain C code specifically. In this case, without the symbol defined only whitespace is left. The text is never even lexed into C tokens if the preprocessor define tested for isn't defined at that point.
You can see this for yourself: just invoke your compiler with whatever flag it uses to stop after preprocessing - e.g. gcc -E x.cc - and at that point in the output there will just be an empty line or two. This is also a very important technique for understanding macros, and a good thing to do when you just can't guess why some program's not working the way you expect - the compiler says some class or function doesn't exist and you've included its header - look at the preprocessed output to know what your compiler is really dealing with.
If DEBUG_MODE is not defined, the code under it will not be compiled.
I'm curious as to why I see nearly all C macros formatted like this:
#ifndef FOO
# define FOO
#endif
Or this:
#ifndef FOO
#define FOO
#endif
But never this:
#ifndef FOO
    #define FOO
#endif
(Moreover, vim's = operator only seems to count the first two as correct.)
Is this due to portability issues among compilers, or is it just a standard practice?
I've seen it done all three ways; it seems to be a matter of style, not of syntax.
While the second example is usually the most common, I've seen cases where the first (or third) is used to help distinguish multiple levels of #ifdefs. Sometimes the logic can become deeply nested, and the only way to understand it at a glance is to use indentation, much like it is common practice to indent blocks of code between { and }.
IIRC, older C preprocessors required the # to be the first character on the line (though I've never actually encountered one that had this requirement).
I have never seen code like your first example. I usually wrote preprocessor directives as in your second example; I found that it visually interfered with the indentation of the actual code less (not that I write in C anymore).
The GNU C Preprocessor manual says:
"Preprocessing directives are lines in your program that start with '#'. Whitespace is allowed before and after the '#'."
For preference I use the third style, with the exception of include guards, for which I use the second style.
I don't like the first style at all - I think of #define as being a preprocessor instruction, even though really of course it isn't, it's a # followed by the preprocessor instruction define. But since I do think of it that way, it seems wrong to separate them. I expect text editors written by people who advocate that style will have a block indent/un-indent that works on code written in that style. But I would hate to encounter it using a text editor that didn't.
There's no point pandering to ancient preprocessors where the # has to be the first character of the line, unless you can also list off the top of your head all the other differences between those implementations and standard C, in order to avoid the other things you could possibly do that they would not support. Of course if you genuinely are working with a pre-standard compiler, fair enough.
Preprocessor directives are lines included in our programs that are not actually program statements but directives for the preprocessor. These lines are always preceded by a hash sign (#). Whitespace is allowed before and after the '#'. As soon as a newline character is found, the preprocessor directive is considered to end.
There is no other rule as far as the C/C++ standards are concerned, so it remains a matter of style and readability. I have seen and written programs only in the second way that you posted, although the third one seems more readable.