I'm attempting to refactor a piece of legacy code and I'd like a snapshot of all of the macros defined at a certain point in the source. The code imports a ridiculous number of headers etc. and it's a bit tedious to track them down by hand.
Something like
#define FOO 1
int myFunc(...) {
PRINT_ALL_DEFINED_THINGS(stderr)
/* ... */
}
Expected somewhere in the output
MACRO: "FOO" value 1
I'm using gcc but have access to other compilers if they make this task easier.
EDIT:
The linked question does not give me the correct output for this:
#include <stdio.h>
#define FOO 1
int main(void) {
printf("%d\n", FOO);
}
#define FOO 0
This very clearly prints 1 when run, but gcc test.c -E -dM | grep FOO gives me 0
To dump all defines you can run:
gcc -dM -E file.c
Check GCC dump preprocessor defines
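Each macro is dumped with the value it has at the end of preprocessing (the last definition wins), so this cannot show you the value a macro had at an earlier point in the code.
You can also add the option "-Wunused-macros" to be warned about macros that are defined in the main file but never used. GCC already warns by default when a macro is redefined with a different value.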
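Since -dM only reports the final definition, one way to see the value at a particular point is to stringify the macro right there. This is a minimal sketch, not a general dump of every macro: you have to name the macros you care about, and the helpers STR, XSTR, SHOW_MACRO, foo_before and foo_after are made up for illustration.

```c
#include <stdio.h>

/* Two-level stringification: XSTR expands its argument first, STR quotes it. */
#define STR(x) #x
#define XSTR(x) STR(x)

/* Hypothetical helper: prints the expansion a macro has at the point of use. */
#define SHOW_MACRO(stream, m) \
    fprintf((stream), "MACRO: \"%s\" value %s\n", #m, XSTR(m))

#define FOO 1

/* Captures what FOO expands to here, while it is still 1. */
const char *foo_before(void)
{
    SHOW_MACRO(stderr, FOO);
    return XSTR(FOO);
}

#undef FOO
#define FOO 0

/* Same code, but FOO has been redefined by now. */
const char *foo_after(void)
{
    SHOW_MACRO(stderr, FOO);
    return XSTR(FOO);
}
```

Calling foo_before() reports the value 1 and foo_after() reports 0, because each expansion uses the definition in effect at that line. If you would rather stay on the command line, gcc -dD -E should also help: it leaves the #define directives in place within the preprocessed output, so you can read them in source order.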
Related
Suppose there is a C/C++ header file with over ten million lines and more #ifdef and #endif directives than anyone could count. What is the most efficient way to find all the preprocessor dependencies of an arbitrary line? In other words, how do I find every preprocessor definition that decides whether the compiler includes or ignores the block of code containing that line?
For example, we have the following code:
#ifdef A
#if defined(B)
#ifdef C
#else
#define X 1
#endif
#endif
#endif
In order to let the compiler include #define X 1, how do I know that I should define A and B but not C in preprocessor without manually reading the code? Or is there an efficient method to manually find all dependencies?
There is AFAIK no tool that can do this for you.
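Lacking a tool, one manual check is to let the preprocessor confirm a guess: define the macros you believe are required and add an #error that fires if the block was skipped anyway. The sketch below reuses the example from the question; x_value is a hypothetical helper added only so the result is visible at run time.

```c
/* Guess under test: A and B must be defined, C must not be. */
#define A
#define B
#undef C

#ifdef A
#if defined(B)
#ifdef C
#else
#define X 1
#endif
#endif
#endif

/* Compilation stops here if the guess was wrong and X was never defined. */
#ifndef X
#error "X is not defined: the guessed set of macros is wrong"
#endif

/* Hypothetical helper so the result is also visible at run time. */
int x_value(void)
{
    return X;
}
```

In a very large header the same idea works in reverse: temporarily put an #error on the line you care about and see which combinations of -D options make the compiler stop there.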
As mentioned in the comments, the correct solution is to consult the documentation. If this is some odd case where that is not an option, you may be able to work backwards by printing the value of each macro you are unsure about. Here is a bash script I just cooked up that could automate that process for you:
deref.sh:
#!/bin/bash
if [ -z "$2" ]; then
>&2 echo "usage: $0 <file> <macro name> [<macro name> ...]"
exit 2
fi
source_file="$1"
shift
for macro in "$@"; do
play_file="$(mktemp "$(dirname "$source_file")/XXXXXX.c")"
cat "$source_file" > "$play_file"
printf '\n#ifndef %s\nUNDEFINED\n#else\n%s\n#endif' "$macro" "$macro" >> "$play_file"
printf '%s: %s\n' "$macro" "$(gcc -E "$play_file" | tail -1)"
rm "$play_file"
done
usage example...
a.c:
#define X 1
#include <stdio.h>
int main(void)
{
printf("Hello World");
}
in shell:
./deref.sh a.c X Y
X: 1
Y: UNDEFINED
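When the macro names are known in advance, a similar probe can live inside a C file. This sketch relies on the fact that the preprocessor leaves an undefined identifier alone, so its stringified "expansion" equals its own name. The names macro_text, MACRO_TEXT, text_of_x and text_of_y are hypothetical, and a macro that is defined as its own name would be misreported as undefined.

```c
#include <string.h>

/* Two-level stringification: XSTR expands its argument first, STR quotes it. */
#define STR(x) #x
#define XSTR(x) STR(x)

/* Hypothetical helper: compares a macro's name with its stringified expansion. */
static const char *macro_text(const char *name, const char *expansion)
{
    /* An undefined identifier is not replaced by the preprocessor,
       so its "expansion" is just its own name. */
    return strcmp(name, expansion) == 0 ? "UNDEFINED" : expansion;
}

#define MACRO_TEXT(m) macro_text(#m, XSTR(m))

#define X 1
/* Y is intentionally not defined. */

const char *text_of_x(void) { return MACRO_TEXT(X); }
const char *text_of_y(void) { return MACRO_TEXT(Y); }
```

Unlike the script, this needs no temporary files, but it only works for macros whose expansion can be turned into a string literal.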
I am using go-hdf5 to read an HDF5 file into Go. I am on Windows 7 using a pretty recent copy of MinGW and hdf5 1.8.14_x86, and it seems that using any of the predefined types doesn't work. Let's focus on T_NATIVE_UINT64 as an example. I have reduced the issue to the following, which basically leaves go-hdf5 out of the problem and points at something quite fundamental going wrong:
package main
/*
#cgo CFLAGS: -IC:/HDF_Group/HDF5/1.8.14_x86/include
#cgo LDFLAGS: -LC:/HDF_Group/HDF5/1.8.14_x86/bin -lhdf5 -lhdf5_hl
#include "hdf5.h"
#include <stdio.h>
void print_the_value2() { printf("the value of the constant is %d\n", H5T_NATIVE_UINT64); }
*/
import "C"
func main() {
C.print_the_value2()
}
You obviously need to have hdf5 installed and point the compiler at the headers/DLLs. After running go get and executing the program, this is printed on my PC:
the value of the constant is -1962924545
Running variations of the above, changing how or where the constant is read, gives different answers for the value of H5T_NATIVE_UINT64. However, I am pretty sure that none of them is the right value, and in fact trying to use a type with the returned id doesn't work, unsurprisingly.
If I write and run a "real" C program, I get different results
#include <stdio.h>
#include "hdf5.h"
hid_t _go_hdf5_H5T_NATIVE_UINT64() { return H5T_NATIVE_UINT64; }
int main()
{
printf("the value of the constant is %d", _go_hdf5_H5T_NATIVE_UINT64());
}
Compiling using
C:\Temp>gcc -IC:/HDF_Group/HDF5/1.8.14_x86/include -LC:/HDF_Group/HDF5/1.8.14_x86/bin -lhdf5 -lhdf5_hl -o stuff.exe stuff.c
and running gives me
the value of the constant is 50331683
And that appears to be the right value as I can use it directly from my go program. Obviously I want to be able to use the constants instead. Any idea why this could be happening?
Extra info following comments below:
I looked for the definition of H5T_NATIVE_UINT64 in the hdf5 headers and see the following
c:\HDF_Group\HDF5\1.8.14_x86\include>grep H5T_NATIVE_UINT64 *
H5Tpkg.h:H5_DLLVAR size_t H5T_NATIVE_UINT64_ALIGN_g;
H5Tpublic.h:#define H5T_NATIVE_UINT64 (H5OPEN H5T_NATIVE_UINT64_g)
H5Tpublic.h:H5_DLLVAR hid_t H5T_NATIVE_UINT64_g;
The whole header is here
http://www.hdfgroup.org/ftp/HDF5/prev-releases/hdf5-1.8.14/src/unpacked/src/H5Tpublic.h
Thanks!
H5T_NATIVE_UINT64 is NOT a constant but a #define that ultimately expands to (H5open(), H5T_NATIVE_UINT64_g), which cgo does not understand.
It's easy to check by turning on debug output on gcc's preprocessor:
gcc -E -dM your_test_c_file.c | grep H5T_NATIVE_UINT64
Result:
#define H5T_NATIVE_UINT64 (H5OPEN H5T_NATIVE_UINT64_g)
Now the same for H5OPEN:
gcc -E -dM test_go.c | grep '#define H5OPEN'
gives:
#define H5OPEN H5open(),
Right now, cgo does understand simple integer constant defines like #define VALUE 1234, or anything that the gcc preprocessor will turn into an integer constant. See the function func (p *Package) guessKinds(f *File) in $GOROOT/src/cmd/cgo/gcc.go.
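To make the difference concrete, here is a small self-contained sketch that mimics the shape of those two macros without needing HDF5. Every name in it (lib_open, LIB_OPEN, native_uint64_g, NATIVE_UINT64, get_native_uint64) is a hypothetical stand-in, and the assumption that the global is only valid after the open call is mine, inferred from the macro's shape.

```c
/* Hypothetical stand-ins that mimic the shape of the HDF5 macros above;
   none of these names belong to HDF5. */
static int lib_open_calls = 0;
static int native_uint64_g = -1;   /* assumed invalid until the library is opened */

static int lib_open(void)
{
    lib_open_calls++;
    native_uint64_g = 50331683;    /* value taken from the question's output */
    return 0;
}

/* Same structure as H5OPEN and H5T_NATIVE_UINT64: a call, a comma, a global. */
#define LIB_OPEN lib_open(),
#define NATIVE_UINT64 (LIB_OPEN native_uint64_g)

/* A plain C function is something cgo can call, and calling it
   evaluates the whole comma expression, including the open call. */
int get_native_uint64(void)
{
    return NATIVE_UINT64;
}
```

With the real library, the equivalent is the wrapper already shown in the question, hid_t _go_hdf5_H5T_NATIVE_UINT64() { return H5T_NATIVE_UINT64; }, placed in the cgo preamble and called from Go as a function instead of being read as a constant.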
For example, does MIN_N_THINGIES below compile to 2? Or will I recompute the division every time I use the macro in code (e.g. recomputing the end condition of a for loop each iteration).
#define MAX_N_THINGIES (10)
#define MIN_N_THINGIES ((MAX_N_THINGIES) / 5)
uint8_t i;
for (i = 0; i < MIN_N_THINGIES; i++) {
printf("hi");
}
This question stems from the fact that I'm still learning about the build process. Thanks!
If you pass -E to gcc it will show what the preprocessor stage outputted.
gcc -E test.c | tail -n11
Outputs:
# 3 "test.c" 2
int main() {
uint8_t i;
for (i = 0; i < ((10) / 5); i++) {
printf("hi");
}
return 0;
}
Then if you pass the -S flag to gcc, it stops after generating assembly, and you will see that the division was optimized out. If you also pass the -o flag you can name the output files and diff them to see that the same code was generated.
gcc -S test.c -o test-with-div.s
edit test.c to make MIN_N_THINGIES equal a const 2
gcc -S test.c -o test-constant.s
diff test-with-div.s test-constant.s
// for educational purposes you should look at the .s files generated.
Then as mentioned in another comment you can change the optimization flag by using -O...
gcc -S test.c -O2 -o test-unroll-loop.s
This unrolls the for loop so completely that no loop is left.
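A variation that avoids editing test.c between runs is to put both variants into one file and compare the two function bodies in a single gcc -S listing. This is only a sketch; the function names count_with_division and count_with_constant are made up, and the loop body counts iterations instead of printing.

```c
#include <stdint.h>

#define MAX_N_THINGIES (10)
#define MIN_N_THINGIES ((MAX_N_THINGIES) / 5)

/* Loop bound written through the macros, i.e. ((10) / 5) after preprocessing. */
int count_with_division(void)
{
    int count = 0;
    uint8_t i;
    for (i = 0; i < MIN_N_THINGIES; i++) {
        count++;
    }
    return count;
}

/* Same loop with the bound written as a literal 2. */
int count_with_constant(void)
{
    int count = 0;
    uint8_t i;
    for (i = 0; i < 2; i++) {
        count++;
    }
    return count;
}
```

If the compiler folds the division, the assembly for the two functions should differ only in their labels.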
Preprocessor will replace MIN_N_THINGIES with ((10)/5), then it is up to the compiler to optimize ( or not ) the expression.
Maybe. The standard does not mandate it either way. Most compilers will do it, at the latest once optimization flags are passed (gcc typically folds a simple constant expression like this one even at -O0, and with -O2 it even unrolls the loop).
Modern compilers apply far more complicated techniques (vectorization, loop skewing, blocking ...). However, unless you really care about performance (for example in HPC or real-time systems), you probably should not worry about the compiler's output, unless you're just interested (and yes, compilers can be a fascinating subject).
No. The preprocessor does not calculate macros, they're handled by the compiler. The preprocessor can calculate arithmetic expressions (no floating point values) in #if conditionals though.
Macros are simply text substitutions.
Note that the expanded macros can still be calculated and optimized by the compiler, it's just that it's not done by the preprocessor.
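Both halves of that can be checked in a few lines. In this sketch the #if is evaluated by the preprocessor, while the array size is an ordinary expression the compiler has to evaluate during translation, because a file-scope array size must be an integer constant expression. The names thingies and thingies_length are hypothetical.

```c
#define MAX_N_THINGIES (10)
#define MIN_N_THINGIES ((MAX_N_THINGIES) / 5)

/* The preprocessor itself evaluates integer arithmetic in #if. */
#if MIN_N_THINGIES != 2
#error "MIN_N_THINGIES is expected to be 2"
#endif

/* A file-scope array size must be an integer constant expression,
   so the compiler has to evaluate ((10) / 5) during translation. */
int thingies[MIN_N_THINGIES];

int thingies_length(void)
{
    return (int)(sizeof thingies / sizeof thingies[0]);
}
```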
The standard mandates that some expressions are evaluated at compile time. But note that the preprocessor does just text splicing (well, almost) when the macro is called, so if you do:
#define A(x) ((x) / (S))
#define S 5
A(10) /* Gives ((10) / (5)) == 2 */
#undef S
#define S 2
A(20) /* Gives ((20) / (2)) == 10 */
The parentheses are to avoid idiocies like:
#define square(x) x * x
square(a + b) /* Gets you a + b * a + b, not the expected square */
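A runnable version of the same pitfall, next to a parenthesized variant, might look like this. The macro and function names are made up for the comparison.

```c
/* Unparenthesized: the argument's operators mix with the macro's. */
#define SQUARE_BAD(x) x * x
/* Parenthesized argument and result. */
#define SQUARE_OK(x) ((x) * (x))

int square_bad(int a, int b)
{
    return SQUARE_BAD(a + b);   /* expands to a + b * a + b */
}

int square_ok(int a, int b)
{
    return SQUARE_OK(a + b);    /* expands to ((a + b) * (a + b)) */
}
```

With a = 2 and b = 3, square_bad returns 2 + 3 * 2 + 3 = 11, while square_ok returns 25.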
After preprocessing, the result is passed to the compiler proper, which does (most of) the computation in the source that the standard requests. Most compilers will do a lot of constant folding, i.e., computing (sub)expressions made of known constants, as this is simple to do.
To see the expansions, it is useful to write a *.c file of a few lines, containing just the macros to check, run it through the preprocessor only (typically something like cc -E file.c), and check the output.
In code reviews I ask for option (1) below to be used as it results in a symbol being created (for debugging) whereas (2) and (3) do not appear to do so at least for gcc and icc. However (1) is not a true const and cannot be used on all compilers as an array size. Is there a better option that includes debug symbols and is truly const for C?
Symbols:
gcc f.c -ggdb3 -g ; nm -a a.out | grep _sym
0000000100000f3c s _symA
0000000100000f3c - 04 0000 STSYM _symA
Code:
static const int symA = 1; // 1
#define symB 2 // 2
enum { symC = 3 }; // 3
GDB output:
(gdb) p symA
$1 = 1
(gdb) p symB
No symbol "symB" in current context.
(gdb) p symC
No symbol "symC" in current context.
And for completeness, the source:
#include <stdio.h>
static const int symA = 1;
#define symB 2
enum { symC = 3 };
int main (int argc, char *argv[])
{
printf("symA %d symB %d symC %d\n", symA, symB, symC);
return (0);
}
The -ggdb3 option should be giving you macro debugging information. But this is a different kind of debugging information (it has to be different - it tells the debugger how to expand the macro, possibly including arguments and the # and ## operators) so you can't see it with nm.
If your goal is to have something that shows up in nm, then I guess you can't use a macro. But that's a silly goal; you should want to have something that actually works in a debugger, right? Try print symC in gdb and see if it works.
Since macros can be redefined, gdb requires the program to be stopped at a location where the macro existed so it can find the correct definition. In this program:
#include <stdio.h>
int main(void)
{
#define X 1
printf("%d\n", X);
#undef X
printf("---\n");
#define X 2
printf("%d\n", X);
}
If you break on the first printf and print X you'll get 1; step with next to the second printf and gdb will tell you that there is no X; step with next again and it will show 2.
Also the gdb command info macro foo can be useful, if foo is a macro that takes arguments and you want to see its definition rather than expand it with a specific set of arguments. And if a macro expands to something that's not an expression, gdb can't print it so info macro is the only thing you can do with it.
For better inspection of the raw debugging information, try objdump -W instead of nm.
However (1) is not a true const and cannot be used on all compilers as an array size.
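This can be used as an array size on all compilers that support C99 and later (gcc, clang), where it creates a variable-length array inside a function. For others (like MSVC) you have only the last two options.
Option 3 is preferred over option 2. enums are different from #define constants: the compiler knows about them, so they can show up in debugging information. Note that an enum constant is not an l-value, though; like a #define constant, it cannot be assigned to or have its address taken.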
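The practical difference between the three options shows up wherever the language demands an integer constant expression. A short sketch, reusing the names from the question (from_define, from_enum and kind_of are hypothetical additions):

```c
static const int symA = 1;   /* option 1: an object, not a constant expression in standard C */
#define symB 2               /* option 2: replaced before the compiler proper sees it */
enum { symC = 3 };           /* option 3: a true integer constant */

/* File-scope array sizes must be integer constant expressions.
   symB and symC qualify; symA does not in standard C. */
int from_define[symB];
int from_enum[symC];

int kind_of(int value)
{
    switch (value) {
    case symB: return 2;     /* case labels also need constant expressions */
    case symC: return 3;
    default:   return symA;  /* symA is fine wherever a run-time value is enough */
    }
}
```

Some compilers accept symA in these positions as an extension, so a successful build on one compiler does not show that the code is portable.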
I have code that has a lot of complicated #define error codes that are not easy to decode since they are nested through several levels.
Is there any elegant way I can get a list of #defines with their final numerical values (or whatever else they may be)?
As an example:
<header1.h>
#define CREATE_ERROR_CODE(class, sc, code) (((class) << 16) | ((sc) << 8) | (code))
#define EMI_MAX 16
<header2.h>
#define MI_1 EMI_MAX
<header3.h>
#define MODULE_ERROR_CLASS MI_1
#define MODULE_ERROR_SUBCLASS 1
#define ERROR_FOO CREATE_ERROR_CODE(MODULE_ERROR_CLASS, MODULE_ERROR_SUBCLASS, 1)
I would have a large number of similar #defines matching ERROR_[\w_]+ that I'd like to enumerate so that I always have a current list of error codes that the program can output. I need the numerical value because that's all the program will print out (and no, it's not an option to print out a string instead).
Suggestions for gcc or any other compiler would be helpful.
GCC's -dM preprocessor option might get you what you want.
I think the solution is a combo of @nmichaels and @aschepler's answers.
Use gcc's -dM option to get a list of the macros.
Use perl or awk or whatever to create 2 files from this list:
1) Macros.h, containing just the #defines.
2) Codes.c, which contains
#include "Macros.h"
ERROR_FOO = "ERROR_FOO"
ERROR_BAR = "ERROR_BAR"
(i.e., extract each #define ERROR_x into a line with the macro and a string).
Now run gcc -E Codes.c. That should create a file with all the macros expanded. The output should look something like
1 = "ERROR_FOO"
2 = "ERROR_BAR"
I don't have gcc handy, so haven't tested this...
The program 'coan' looks like the tool you are after. It has the 'defs' sub-command, which is described as:
defs [OPTION...] [file...] [directory...]
Select #define and #undef directives from the input files in accordance with the options and report them on the standard output in accordance with the options.
See the cited URL for more information about the options. Obtain the code here.
If you have a complete list of the macros you want to see, and all are numeric, you can compile and run a short program just for this purpose:
#include <header3.h>
#include <stdio.h>
#define SHOW(x) printf(#x " = %lld\n", (long long int) x)
int main(void) {
SHOW(ERROR_FOO);
/*...*/
return 0;
}
As @nmichaels mentioned, gcc's -d flags may help get that list of macros to show.
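If the list should stay current with little effort, one option is to name each macro exactly once in an X-macro list and build a name/value table from it. This is a sketch under assumptions: the error macros are simplified stand-ins for the question's headers, the fields are assumed to be combined with |, and ERROR_BAR, ERROR_LIST, AS_ENTRY, error_table and error_value are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the nested definitions in the question.
   Assumption: the fields are meant to be combined with |. */
#define CREATE_ERROR_CODE(cls, sc, code) (((cls) << 16) | ((sc) << 8) | (code))
#define MODULE_ERROR_CLASS 16
#define MODULE_ERROR_SUBCLASS 1
#define ERROR_FOO CREATE_ERROR_CODE(MODULE_ERROR_CLASS, MODULE_ERROR_SUBCLASS, 1)
#define ERROR_BAR CREATE_ERROR_CODE(MODULE_ERROR_CLASS, MODULE_ERROR_SUBCLASS, 2)

/* The only place where the macro names have to be listed. */
#define ERROR_LIST(X) \
    X(ERROR_FOO)      \
    X(ERROR_BAR)

struct error_entry {
    const char *name;
    long long value;
};

/* Each X(name) becomes one { "name", value } row of the table. */
#define AS_ENTRY(name) { #name, (long long)(name) },
static const struct error_entry error_table[] = { ERROR_LIST(AS_ENTRY) };
#undef AS_ENTRY

/* Returns the numeric value for a macro name, or -1 if it is not listed. */
long long error_value(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof error_table / sizeof error_table[0]; i++) {
        if (strcmp(error_table[i].name, name) == 0) {
            return error_table[i].value;
        }
    }
    return -1;
}
```

Printing the whole table is then a loop over error_table, and the ERROR_LIST body itself could be generated from the output of gcc -dM -E filtered for names starting with ERROR_.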
Here's a little creative solution:
Write a program to match all of your identifiers with a regular expression (like \#define :b+(?<NAME>[0-9_A-Za-z]+):b+(?<VALUE>[^(].+)$ in .NET), then have it create another C file with just the names matched:
void main() {
/*my_define_1*/ my_define_1;
/*my_define_2*/ my_define_2;
//...
}
Then pre-process your file using the /C /P option (for VC++), and you should get all of those replaced with the values. Then use another regex to swap things around, and put the comments before the values in #define format -- now you have the list of #define's!
(You can do something similar with GCC.)
Is there any elegant way I can get a list of #defines with their final numerical values
For various levels of elegance, sort of.
#!/bin/bash
file="mount.c";
for macro in $(grep -Po '(?<=#define)\s+(\S+)' "$file"); do
echo -en "$macro: ";
echo -en '#include "'"$file"'"\n'"$macro\n" | \
cpp -E -P -x c ${CPPFLAGS} - | tail -n1;
done;
Not foolproof (#define \ \n macro(x) ... would not be caught - but no style I've seen does that).