Is there a way to make the GNU C Preprocessor, cpp (or some other tool) list all available macros and their values at a given point in a C file?
I'm looking for system-specific macros while porting a program that's already unix savvy and loading a sparse bunch of unix system files.
Just wondering if there's an easier way than going hunting for definitions.
I don't know about a certain spot in a file, but using:
$ touch emptyfile
$ cpp -dM emptyfile
Dumps all the default ones. Doing the same for a C file with some #include and #define lines in it includes all those as well. I guess you could truncate your file to the spot you care about and then do the same?
From the man page:
-dCHARS
CHARS is a sequence of one or more of the following characters, and must not be preceded by a space. Other characters are interpreted by the compiler proper, or reserved for future versions of GCC, and so are silently ignored. If you specify characters whose behavior conflicts, the result is undefined.
M
Instead of the normal output, generate a list of #define directives for all the macros defined during the execution of the preprocessor, including predefined macros. This gives you a way of finding out what is predefined in your version of the preprocessor. Assuming you have no file foo.h, the command
touch foo.h; cpp -dM foo.h
will show all the predefined macros.
If you use -dM without the -E option, -dM is interpreted as a synonym for -fdump-rtl-mach.
D
Like M except in two respects: it does not include the predefined macros, and it outputs both the #define directives and the result of preprocessing. Both kinds of output go to the standard output file.
N
Like D, but emit only the macro names, not their expansions.
I
Output #include directives in addition to the result of preprocessing.
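Once the -dM dump has shown which system-specific macros your toolchain predefines, the usual next step when porting is to test them with #if / #elif. The following is only a minimal sketch: detected_platform is a hypothetical helper, and the macro names (__linux__, __APPLE__, __FreeBSD__, _WIN32) are the ones commonly predefined by GCC, Clang and MSVC, so check them against your own -dM output before relying on them.

```c
#include <stddef.h>

/* Hypothetical helper: map commonly predefined system macros to a name.
 * The macro names are assumptions; confirm them with `cpp -dM` on your system. */
const char *detected_platform(void)
{
#if defined(__linux__)
    return "linux";
#elif defined(__APPLE__)
    return "apple";
#elif defined(__FreeBSD__)
    return "freebsd";
#elif defined(_WIN32)
    return "windows";
#else
    return "unknown";   /* none of the macros above is predefined here */
#endif
}
```

Calling detected_platform() from main and printing the result is a quick way to see which branch the preprocessor kept on a given machine.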
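With gcc, you can use the -dD option together with -E (for example, gcc -E -dD file.c) to write the macro definitions, other than the predefined ones, to stdout along with the preprocessed output.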
Why not consult the section on Predefined-macros? Do you need this for building a project or some such thing?
To list "their values at a given point in a C file": two predefined macros, __FILE__ and __LINE__, identify a given point in a C file, and they are useful for tracing a point of failure. Consider this sample code in a file called foo.c:
if (!(ptr = malloc(20))) {
    fprintf(stderr, "Whoops! Malloc Failed in %s at line %d\n", __FILE__, __LINE__);
}
If that code logic was used several times in this file, and the call to malloc was failing, you would get this output:
Whoops! Malloc Failed in foo.c at line 25
The line number would differ depending on where in the source that logic is used. This sample simply shows where those macros could be useful.
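If the same check is repeated at many call sites, the location macros can be captured once in a wrapper macro. This is a sketch under stated assumptions rather than anything from the original answer: CHECKED_MALLOC and checked_malloc are hypothetical names. Because __FILE__ and __LINE__ are expanded where the macro is invoked, each call site reports its own position.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: allocate memory and report the caller's location on failure. */
void *checked_malloc(size_t size, const char *file, int line)
{
    void *ptr = malloc(size);
    if (ptr == NULL) {
        fprintf(stderr, "Whoops! Malloc failed in %s at line %d\n", file, line);
    }
    return ptr;
}

/* __FILE__ and __LINE__ expand at the point of use, not at this definition. */
#define CHECKED_MALLOC(size) checked_malloc((size), __FILE__, __LINE__)
```

A call such as ptr = CHECKED_MALLOC(20); then behaves like the snippet above, without repeating the fprintf at every allocation.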
Here is a link to a page with an overview of command-line options to list predefined macros for most compilers (gcc, clang...).
I have a C file (for simplicity, assume it includes nothing). This C file requires several definitions of literal numbers to compile properly, and I want to figure out which definitions these are.
Naturally, one can try to compile the file, and at some point we would start to get failures; with some failure recovery, we might get failure notifications about additional defines. But - that's not what I want:
I'm not interested in completing the compilation of the program. Building a syntax tree (or even a simplified syntax tree of some kind) should be enough.
I can assume that, other than missing macros, the program is syntactically correct. Which, for C, should mean it's syntactically correct, period.
I can assume that the relevant macros are all in uppercase, i.e. they have the form [A-Z][A-Z_0-9]*.
What are my alternatives for getting the list of undefined macros?
Motivation: In reality, I'm feeding something into a dynamic compilation library, and I want to check beforehand if all necessary macros have been defined, without knowing a priori which macros the file needs (i.e. it could be different ones for different input files).
The ugly fallback solution:
Obviously, your fallback is to just compile the program, but to do so while minimizing irrelevant messages and unnecessary work. This will be compiler-dependent, but with GCC, for example, you can:
Avoid any output generation
Suppress warnings
Suppress notes
Be strictly standard-compliant, no GNU extensions
Disable the use of those dumb fancy quotation marks GCC insists on using
... using various command-line switches, and by making it take input from the standard input stream rather than a file (the only way I've found so far to suppress some of the notes). That looks like:
cat your_program.c \
| LC_CTYPE=C gcc -std=c99 -fsyntax-only -x c -fcompare-debug-second -
and the output could look like:
<stdin>: In function 'mult':
<stdin>:3:18: error: 'MY_CONSTANT' undeclared (first use in this function)
Now, if your program is correct other than the undefined macros (= undeclared identifiers), then you can easily parse the above with a bit of shell scripting:
cat your_program.c \
| LC_CTYPE=C gcc -std=c99 -fsyntax-only -x c -fcompare-debug-second - 2>&1 \
| sed -r '/error: /!d; s/^.*error: '"'//; s/'.*//;" \
| sort -u
This has the further disadvantage of not being fully embeddable into your program, i.e. you can't invoke the partial compilation using some library in some program of yours, then programmatically parse the output. You would need a system()-type call.
Note: If your program can have other errors, the pattern for dropping the line in the sed command will need to be a little more specific.
You could use something around the idea that every identifier-like non-keyword outside a comment in a C file must be declared somewhere. (I think! Is that correct?)
The basic idea is to generate a list of such identifiers and search the program and then the included headers for a declaration of each. While this can be done by hand and ad-hoc it probably makes sense to index all potential header files and to use something like ctags for indexing as well as finding (there is a libctags, as I just learned).
I assume that the solution doesn't have to be perfect — missed cases will simply fail compilation — but that you want to reduce such cases. In that case the parsing of the source code for identifiers does not have to be perfect (it can ignore nested comments etc.) and can probably be done "manually" with acceptable effort.
I am trying to learn preprocessor tricks that I found not so easy (Can we have recursive macros?, Is there a way to use C++ preprocessor stringification on variadic macro arguments?, C++ preprocessor __VA_ARGS__ number of arguments, Variadic macro trick, ...). I know the -E option shows the result of the whole preprocessor pass, but I would like to know whether options or other means exist to see the result step by step. Indeed, it is sometimes difficult to follow what happens when a macro calls a macro that calls a macro, with the mechanism of disabling contexts, painting blue and so on. In brief, I wonder whether a sort of preprocessor debugger with breakpoints and other tools exists.
(Do not answer that this use of preprocessor directives is dangerous, ugly, horrible, not good practices in C, produces unreadable code ... I am aware of that and it is not the question).
Yes, this tool exists as a feature of Eclipse IDE. I think the default way to access the feature is to hover over a macro you want to see expanded (this will show the full expansion) and then press F2 on your keyboard (a popup appears that allows you to step through each expansion).
When I used this tool to learn more about macros it was very helpful. With just a little practice, you won't need it anymore.
In case anyone is confused about how to use this feature, I found a tutorial on the Eclipse documentation here.
This answer to another question is relevant.
When you do weird preprocessor tricks (which are legitimate) it is useful to ask the compiler to generate the preprocessed form (e.g. with gcc -C -E if using GCC) and look into that preprocessed form.
In practice, for a source file foo.c it makes (sometimes) sense to get its preprocessed form foo.i with gcc -C -E foo.c > foo.i and look into that foo.i.
Sometimes, it even makes sense to get that foo.i without line information. The trick here (removing line information contained in lines starting with #) would be to do:
gcc -C -E foo.c | grep -v '^#' > foo.i
Then you could indent foo.i and compile it, e.g. with gcc -Wall -c foo.i; you'll get error locations in the preprocessed file and you could understand how you got that and go back to your preprocessor macros (or their invocations).
Remember that the C preprocessor is mostly a textual transformation working at the file level. It is not possible to macro-expand a few lines in isolation (because prior lines might have played with #if combined with #define -perhaps in prior #include-d files- or preprocessor options such as -DNDEBUG passed to gcc or g++). On Linux see also feature_test_macros(7)
A known example of expansion which works differently when compiled with or without -DNDEBUG passed to the compiler is assert. The meaning of assert(i++ > 0) (a very wrong thing to code) depends on it and illustrates that macro-expansion cannot be done locally (and you might imagine some prior header having #define NDEBUG 1 even if of course it is poor taste).
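The NDEBUG dependence can be seen within a single file, because the C standard redefines assert according to the current state of NDEBUG each time <assert.h> is included. The sketch below uses two hypothetical functions with identical bodies; only the preceding NDEBUG state differs.

```c
/* First inclusion with NDEBUG defined: assert(expr) expands to ((void)0). */
#define NDEBUG 1
#include <assert.h>

int count_with_ndebug(void)
{
    int i = 1;
    assert(i++ > 0);    /* argument is discarded, so i++ never runs */
    return i;           /* 1 */
}

/* Second inclusion with NDEBUG undefined: assert evaluates its argument. */
#undef NDEBUG
#include <assert.h>

int count_without_ndebug(void)
{
    int i = 1;
    assert(i++ > 0);    /* argument is evaluated, so i becomes 2 */
    return i;           /* 2 */
}
```

The same source text therefore yields different behavior depending on preprocessor state set earlier, which is why a few lines cannot be macro-expanded in isolation.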
Another example (very common actually) where the macro expansion is context dependent is any macro using __LINE__ or __COUNTER__
...
NB. You don't need Eclipse for all that, just a good enough source code editor (my preference is emacs but that is a matter of taste): for the preprocessing task you can use your compiler.
The only way to see what is wrong with your macro is to add the option that keeps the temporary files when compilation completes. For gcc it is the -save-temps option. You can then open the .i file and see the expanded macros.
IDE indexers (like Eclipse) will not help much. They will not expand the macros (as the other answer states) until the error occurs.
I'm using a proprietary development environment that compiles code written in C, as well as the IEC 61131 languages. For the C compilation, it uses GCC 4.1.2 with these build options:
-fPIC -O0 -g -nostartfiles -Wall -trigraphs -fno-asm
The compilation is done by a program running on Windows that uses Cygwin.
My issue is, IEC language preprocessor is not that useful (doesn't support #define at all) and I want to use macros! I don't see why the GCC preprocessor would really care what language it is processing (my target language is Structured Text), so I'm looking to see if anyone might know a way to get it to process files of different file types that then are not compiled further (I'm just looking for macro expansion before the file is run through the IEC compiler). I'm very ignorant of compiler options and environments since I've never had to deal with them, I just write C code and it magically compiles and transfers to my target system to run.
The only things I can really do are add build options and execute a batch file before anything is executed. I think my best hope lies in using a batch file to process all files of a certain extension, but I don't even know what executable in the gnuinst folder to use, let alone what flags to use to run through the files.
Just about any C preprocessor, including gcc's cpp, is going to assume that its input is valid C code. It has to tokenize the input following C (or C++, or Objective-C) rules, because it has to resolve its input into tokens (more precisely, preprocessing tokens). Constructs above the token level shouldn't be an issue.
You certainly can use cpp or gcc -E to preprocess text that isn't C source code, but some input constructs will cause problems.
Taking an example from the comments:
$ cat foo.txt
#define ADDTHEM(x, y) ((x) + (y))
ADDTHEM(2, 3)
$ gcc -E - < foo.txt
# 1 "<stdin>"
# 1 "<command-line>"
# 1 "<stdin>"
((2) + (3))
Note that I had to use gcc -E - < foo.txt rather than gcc -E foo.txt, because gcc treats a .txt file as a linker input file by default.
But if you add some content to foo.txt that doesn't consist of valid C preprocessor tokens, you can have problems:
$ cat foo.txt
#define ADDTHEM(x, y) ((x) + (y))
ADDTHEM(2, 3)
ADDTHEM('c, "s)
$ gcc -E - < foo.txt
# 1 "<stdin>"
# 1 "<command-line>"
# 1 "<stdin>"
((2) + (3))
<stdin>:3:9: warning: missing terminating ' character [enabled by default]
<stdin>:3:0: error: unterminated argument list invoking macro "ADDTHEM"
ADDTHEM
(Attempts to feed Ada source code to a C preprocessor have run into this kind of problem, since Ada uses isolated apostrophe ' characters for its attribute syntax.)
So you can do it if the input language doesn't use things that aren't valid C preprocessor tokens.
See the N1570 draft of the C standard, section 6.4, for more information about preprocessing tokens.
I actually wrote the above before I checked the GNU cpp manual, which says:
The C preprocessor is intended to be used only with C, C++, and
Objective-C source code. In the past, it has been abused as a general
text processor. It will choke on input which does not obey C's lexical
rules. For example, apostrophes will be interpreted as the beginning of
character constants, and cause errors. Also, you cannot rely on it
preserving characteristics of the input which are not significant to
C-family languages. If a Makefile is preprocessed, all the hard tabs
will be removed, and the Makefile will not work.
Having said that, you can often get away with using cpp on things
which are not C. Other Algol-ish programming languages are often safe
(Pascal, Ada, etc.) So is assembly, with caution. `-traditional-cpp'
mode preserves more white space, and is otherwise more permissive. Many
of the problems can be avoided by writing C or C++ style comments
instead of native language comments, and keeping macros simple.
Wherever possible, you should use a preprocessor geared to the
language you are writing in. Modern versions of the GNU assembler have
macro facilities. Most high level programming languages have their own
conditional compilation and inclusion mechanism. If all else fails,
try a true general text processor, such as GNU M4.
(The authors of that manual apparently missed the problem with Ada's attribute syntax.)
Is it possible to get the list of #defines (both those given at compile time and those defined in the source code) used in a C program?
I ask because I have a project with a lot of C source files.
Is there any compile time option to get that?
GNU cpp takes various -d options to output macro and define data. See their man pages for more details.
For gcc, you can use one of the following:
-dCHARS
CHARS is a sequence of one or more of the following characters, and must not be preceded by a space. Other characters are interpreted by the compiler proper, or reserved for future versions of GCC, and so are silently ignored. If you specify characters whose behavior conflicts, the result is undefined.
M
Instead of the normal output, generate a list of #define directives for all the macros defined during the execution of the preprocessor, including predefined macros. This gives you a way of finding out what is predefined in your version of the preprocessor. Assuming you have no file foo.h, the command
touch foo.h; cpp -dM foo.h
will show all the predefined macros.
If you use -dM without the -E option, -dM is interpreted as a synonym for -fdump-rtl-mach. See Debugging Options.
D
Like M except in two respects: it does not include the predefined macros, and it outputs both the #define directives and the result of preprocessing. Both kinds of output go to the standard output file.
N
Like D, but emit only the macro names, not their expansions.
I
Output #include directives in addition to the result of preprocessing.
U
Like D except that only macros that are expanded, or whose definedness is tested in preprocessor directives, are output; the output is delayed until the use or test of the macro; and #undef directives are also output for macros tested but undefined at the time.
In gcc the command you probably want is
gcc -dM -E [your_source_files]
I know this is implicitly in the above answers, but perhaps someone needs (like myself) the quick recipe.
The Wikipedia entry for the C Preprocessor states:
The language of preprocessor
directives is agnostic to the grammar
of C, so the C preprocessor can also
be used independently to process other
types of files.
How can this be done? Any examples or techniques?
EDIT: Yes, I'm mostly interested in macro processing. Even though it's probably not advisable or maintainable it would still be useful to know what's possible.
You can call CPP directly:
cpp <file>
Rather than calling it through gcc:
gcc -E filename
Do note however that, as mentioned in the same Wikipedia article, C preprocessor's language is not really equipped for general-purpose use:
However, since the C preprocessor does not have features of some other
preprocessors, such as recursive macros, selective expansion according
to quoting, string evaluation in conditionals, and Turing
completeness, it is very limited in comparison to a more general macro
processor such as m4.
Have you considered dabbling with a more flexible macro processing language, like the aforementioned m4 for instance?
For example, Assembler. While many assemblers have their own way to #include headers and #define macros, it can be useful to use the C preprocessor for this. GNU make, for example, has implicit rules for turning *.S files into *.s files by running the preprocessor ('cpp'), before feeding the *.s file to the GNU assembler ('as').
Yes, it can be done by parsing your own language through the gcc preprocessor (e.g. 'gcc -E').
We have done this at my job with our own, specific language. It has quite a few advantages:
You can use C's include statements (#include), which is very powerful
You can use your #ifdef constructions
You can define constants (#define MAGIC_NUMBER 42) or macro functions (#define min(x,y) ((x) < (y) ? (x) : (y)))
... and the other things in the C preprocessor.
HOWEVER, you also inherit the unsafe C constructions, and having a preprocessor not integrated with your main language is the cause of it. Think about the minimum macro and doing something like :
a = 2;
b = 3;
c = min(a--, b--);
Just think about what values a and b will have after the min macro has been expanded and evaluated.
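To make the double evaluation concrete, here is a small sketch. The struct min_result and the function min_with_side_effects are hypothetical names, introduced only so that the three values can be inspected after the macro has run.

```c
/* The function-like macro from the list above: each argument may be evaluated twice. */
#define min(x, y) ((x) < (y) ? (x) : (y))

/* Hypothetical container for the values left behind by the macro call. */
struct min_result {
    int a;
    int b;
    int c;
};

struct min_result min_with_side_effects(void)
{
    int a = 2;
    int b = 3;
    /* Expands to ((a--) < (b--) ? (a--) : (b--)):
     * the comparison decrements both, then the chosen branch decrements a again. */
    int c = min(a--, b--);
    struct min_result r = { a, b, c };
    return r;           /* a == 0, b == 2, c == 1 */
}
```

A real function min(a--, b--) would have left a == 1, b == 2 and returned 2, so the macro version silently changes both the result and the side effects.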
The same is true of the untyped constants that you introduce.
See the Safer C book for details.
Many C compilers have a flag that tells them to only preprocess. With gcc it's the -E flag. eg:
$ gcc -E -
#define FOO foo
bar FOO baz
will output:
# 1 "<stdin>"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "<stdin>"
bar foo baz
With other C compilers you'll have to check the manuals to see how to switch to preprocess-only mode.
Usually you can invoke the C compiler with an option to preprocess only (and ignore any #line statements). Take this as a simple example:
<?php
function foo()
{
#ifdef DEBUG
echo "Some debug info.";
#endif
echo "Foo!";
}
foo();
We define a PHP source file with preprocess statements. We can then preprocess it (gcc can do this, too):
cl -nologo -EP foo.php > foo2.php
Since DEBUG is not defined, the first echo is stripped. A bonus here is that lines beginning with # are comments in PHP, so you don't have to preprocess the file for a "debug" build.
Edit: Since you asked about macros. This works fine too and could be used to generate boilerplate code etc.
Using Microsoft's compiler, I think (I just looked it up, haven't tested it) that it's the /P compiler option.
Other compilers presumably have similar options (or, for some compilers the preprocessor might actually be a different executable, which is usually run implicitly by the compiler but which you can also run explicitly separately).
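Assuming you're using GCC, you can take any plain old text file, regardless of its contents, and run the following (-x c tells GCC to treat the file as C source, since it would otherwise pass a file with an unknown extension to the linker):
gcc -E -x c filename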
Any preprocessor directives in the file will be processed by the preprocessor and GCC will then exit.
The point is that it doesn't matter what the actual content of the text file is, since all the preprocessor cares about is its own directives.
I have heard of people using the C pre-processor on Ada code. Ada has no preprocessor, so you have to do something like that if you want to preprocess your code.
However, it was a conscious design decision not to give it one, so doing this is very un-Ada. I wouldn't suggest anyone do this.
A while ago I did some work on a project that used imake for makefile generation. As I recall, it basically used the C preprocessor syntax to generate the makefiles.
The C preprocessor can also be invoked by the Glasgow Haskell Compiler (GHC) prior to compiling Haskell code, by passing the -cpp flag.
You could implement the C preprocessor in the compiler for another language.
You could use it to preprocess any sort of text file, but there are much better tools for that purpose.
Basically what it's saying is that preprocessors have nothing to do with C syntax. They are basically simple parsers that follow a set of rules. So you could use preprocessors kind of like you'd use sed or awk for some silly tasks. Don't ask me why you'd ever want to do it though.
For example, on a text file:
#define pi 3.141
pi is not an irrational number.
Then you run the preprocessor and you'd get:
3.141 is not an irrational number.