How to determine when -fsanitize=memory is in use? - c

I want to clear a false positive on FD_ZERO and FD_SET when the memory sanitizer is in use. Clearing it is somewhat easy:
#include <sanitizer/msan_interface.h>
...
__msan_unpoison(&readfds, sizeof(readfds));
__msan_unpoison(&writefds, sizeof(writefds));
However, I don't know how to detect when the memory sanitizer is in use. That is, detect when -fsanitize=memory was specified on the command line. The preprocessor does not seem to be helping:
$ clang -dM -E -fsanitize=memory - </dev/null | egrep -i 'memory|sanitize|msan'
$
How can I determine when -fsanitize=memory is in use?

According to Konstantin Serebryany on the Memory Sanitizer mailing list, there is no preprocessor macro. Use Clang's __has_feature(memory_sanitizer) check instead:
#if defined(__has_feature)
# if __has_feature(memory_sanitizer)
# define MEMORY_SANITIZER 1
# endif
#endif
...
#ifdef MEMORY_SANITIZER
# include <sanitizer/msan_interface.h>
#endif
...
#ifdef MEMORY_SANITIZER
__msan_unpoison(&readfds, sizeof(readfds));
__msan_unpoison(&writefds, sizeof(writefds));
#endif
...
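The same check can be folded into a wrapper macro so that call sites need no #ifdef. This is a minimal sketch, not part of the original answer: MSAN_ENABLED, MSAN_UNPOISON and prepare_fd_sets are hypothetical names chosen for illustration, and the fallback definition of __has_feature assumes a compiler that does not provide it.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>

/* __has_feature is a Clang extension; give other compilers a fallback. */
#ifndef __has_feature
# define __has_feature(x) 0
#endif

#if __has_feature(memory_sanitizer)
# include <sanitizer/msan_interface.h>
# define MSAN_ENABLED 1
# define MSAN_UNPOISON(p, n) __msan_unpoison((p), (n))
#else
# define MSAN_ENABLED 0
/* Expands to a no-op that still evaluates its arguments once. */
# define MSAN_UNPOISON(p, n) ((void)(p), (void)(n))
#endif

/* Hypothetical helper: fills both sets, then clears the false positive.
 * Returns 1 if fd ended up in both sets, 0 otherwise. */
static int prepare_fd_sets(int fd, fd_set *readfds, fd_set *writefds)
{
    assert(fd >= 0 && fd < FD_SETSIZE);

    FD_ZERO(readfds);
    FD_ZERO(writefds);
    FD_SET(fd, readfds);
    FD_SET(fd, writefds);

    MSAN_UNPOISON(readfds, sizeof(*readfds));
    MSAN_UNPOISON(writefds, sizeof(*writefds));

    return FD_ISSET(fd, readfds) && FD_ISSET(fd, writefds);
}
```

Built without -fsanitize=memory, the macro expands to nothing useful and the sanitizer header is never included, so the same source compiles with any compiler.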

Related

How to find all preprocessor dependencies of a specific code

Suppose there is a C/C++ header file with over ten million lines and more #ifdef and #endif statements than anyone could count. What is the most efficient way to find all the preprocessor dependencies of an arbitrary line? In other words, how do I find every preprocessor definition that determines whether the compiler includes or ignores the block of code containing that line?
For example, we have the following code:
#ifdef A
#if defined(B)
#ifdef C
#else
#define X 1
#endif
#endif
#endif
In order to let the compiler include #define X 1, how do I know that I should define A and B but not C in preprocessor without manually reading the code? Or is there an efficient method to manually find all dependencies?
There is AFAIK no tool that can do this for you.
As mentioned in the comments, the correct solution is to reference the documentation. If this is some odd case where that is not an option, then you may be able to work backwards by printing out the values of each macro you are confused on. Here is a bash script I just cooked up that could automate that process for you:
deref.sh:
#!/bin/bash
if [ -z "$2" ]; then
>&2 echo "usage: $0 <file> <macro name> [<macro name> ...]"
exit 2
fi
source_file="$1"
shift
for macro in "$@"; do
play_file="$(mktemp "$(dirname "$source_file")/XXXXXX.c")"
cat "$source_file" > "$play_file"
printf '\n#ifndef %s\nUNDEFINED\n#else\n%s\n#endif' "$macro" "$macro" >> "$play_file"
printf '%s: %s\n' "$macro" "$(gcc -E "$play_file" | tail -1)"
rm "$play_file"
done
usage example...
a.c:
#define X 1
#include <stdio.h>
int main(void)
{
printf("Hello World");
}
in shell:
./deref.sh a.c X Y
X: 1
Y: UNDEFINED
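For the small nested example in the question, the guards can also be collapsed by hand into one equivalent condition and checked by the compiler. The sketch below assumes A and B are defined and C is not; X_EXPECTED, x_is_defined and check_guard_equivalence are hypothetical names added for illustration.

```c
#include <assert.h>

#define A
#define B
/* C is deliberately left undefined. */

/* The nested guards from the question. */
#ifdef A
#if defined(B)
#ifdef C
#else
#define X 1
#endif
#endif
#endif

/* The same guards collapsed into a single condition. */
#if defined(A) && defined(B) && !defined(C)
# define X_EXPECTED 1
#else
# define X_EXPECTED 0
#endif

/* Reports whether the nested guards actually defined X. */
static int x_is_defined(void)
{
#ifdef X
    return 1;
#else
    return 0;
#endif
}

/* Aborts if the collapsed condition disagrees with the nested guards. */
static void check_guard_equivalence(void)
{
    assert(x_is_defined() == X_EXPECTED);
}
```

This only confirms a hand-derived condition; it does not discover the dependencies automatically.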

How to partially preprocess a C file with specific working directory

I would like to expand include directives of a C file of my working directory only; not the system directory.
I tried the following:
gcc -E -nostdinc -I./ input.c
But it stops preprocessing when it fails to find the included system headers in input.c. I would like it to copy the include directive when it can't find it and keep preprocessing the file.
If your input.c file includes system headers, the preprocessor is expected to stop with an error when it cannot find them.
You could first use grep -v to remove all #include of system headers in your code, achieving something like this (list is non-exhaustive):
grep -vE "(stdio|stdlib)\.h" code.c > code_.c
you get for instance:
#define EXITCODE 0
int main(){
int i = EOF;
printf("hello\n");
return EXITCODE;
}
then pre-process:
S:\c>gcc -E code_.c
# 1 "code_.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "code_.c"
int main(){
int i = EOF;
printf("hello\n");
return 0;
}
Note that the preprocessor doesn't care about functions or macros that are not defined. You get your own code preprocessed (and your own macros expanded), but not the system ones.
You have to process all included files as well of course. That means an extra layer of tools to create temp source files and work from there.
I found a utility that does exactly what I was looking for:
$ cpphs --nowarn --nomacro -I./ input.c | sed -E 's|#line 1 "missing file: (.*)"|#include <\1>|'

Avoiding a double macro substitution in the C pre-processor

Here's a simple little C program that had me confused for a while:
#include <stdio.h>
#define STR1(x) #x
#define STR(x) STR1(x)
int main(void) {
printf("%s\n", STR(MYDEF));
}
This just prints the value of the MYDEF #define as a string, using the standard stringizing double-define technique.
Compile (on Linux) with gcc -DMYDEF=abc prog.c, run the result, and, not surprisingly, it prints out 'abc'.
But change the value gcc -DMYDEF=linux prog.c and the result printed is not 'linux' but '1'.
So that confused me for a bit, but of course it happens because gcc (on Linux) has, I discovered, a built-in #define for the name 'linux' with a value '1', and the STR(x) macro ends up expanding MYDEF to 'linux' then linux to '1'.
In my real program (which was rather more complex than the little test above) I got round this by doing things in a different (probably better) way, but it left me curious ... is there a simple little macro technique that would avoid this double-substitution and make the program print out 'linux'? I know I could add a -U or #undef of linux, but that feels a bit clumsy.
I had thought all the built-in #defines start with underscores (usually double underscores), but I guess not.
There is no way to expand a macro only once: the result is always rescanned and any further macros in it are replaced (never recursively, of course). There are circumstances where macros aren't expanded at all (as with the # operator), which is why you need the extra replacement level with two #defines, as in your example.
In ISO C, identifiers without a leading underscore are free for you to use (not all of them, to be precise). The GNU C dialects define some other macros by default (like linux) for backwards compatibility, though they plan to remove such macros in the future.
To get a list of such macros on your machine, you can do:
$ echo | gcc -std=gnu99 -E -dM - | grep -v '# *define *_'
#define unix 1
#define linux 1
#define i386 1
With the options for ISO C (-ansi/-std=c89, -std=c99, -std=c11/-std=c1x for older Gcc), these macros are not defined:
$ cat test.c
#define STR1(x) #x
#define STR(x) STR1(x)
STR(MYDEF);
STR1(MYDEF);
$ gcc -std=gnu99 -DMYDEF=linux -E test.c
# 1 "test.c"
# 1 "<command-line>"
# 1 "test.c"
"1";
"MYDEF";
$ gcc -std=c99 -DMYDEF=linux -E test.c
# 1 "test.c"
# 1 "<command-line>"
# 1 "test.c"
"linux";
"MYDEF";
In ISO C mode, these macros properly are in the reserved namespace:
$ echo | gcc -std=c99 -E -dM - | grep linux
#define __linux 1
#define __linux__ 1
#define __gnu_linux__ 1
I see in the gcc manual that you can use the -ansi option to turn off predefined macros like "linux"
gcc -ansi -DMYDEF=linux prog.c
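The rescan behaviour can be reproduced without relying on any predefined macro. In this sketch, `inner` is a hypothetical stand-in for the predefined `linux`, and the function names are made up for illustration; MYDEF is defined in the source rather than with -D so the example is self-contained.

```c
#include <assert.h>
#include <string.h>

#define STR1(x) #x
#define STR(x) STR1(x)

#define inner 1      /* stand-in for the predefined `linux` macro */
#define MYDEF inner

/* # suppresses expansion of its operand: yields "MYDEF". */
static const char *one_level(void) { return STR1(MYDEF); }

/* MYDEF -> inner -> 1 before stringizing: yields "1". */
static const char *two_levels(void) { return STR(MYDEF); }

#undef inner

/* inner is no longer a macro, so expansion stops there: yields "inner". */
static const char *after_undef(void) { return STR(MYDEF); }

/* Aborts if any of the three expansions differs from the comments above. */
static void check_expansions(void)
{
    assert(strcmp(one_level(), "MYDEF") == 0);
    assert(strcmp(two_levels(), "1") == 0);
    assert(strcmp(after_undef(), "inner") == 0);
}
```

The last case mirrors what -U or -ansi achieve for `linux`: once the inner name is not a macro, the rescan has nothing left to replace.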

Print all defined macros

I'm attempting to refactor a piece of legacy code and I'd like a snapshot of all of the macros defined at a certain point in the source. The code imports a ridiculous number of headers etc. and it's a bit tedious to track them down by hand.
Something like
#define FOO 1
int myFunc(...) {
PRINT_ALL_DEFINED_THINGS(stderr)
/* ... */
}
Expected somewhere in the output
MACRO: "FOO" value 1
I'm using gcc but have access to other compilers if they are easier to accomplish this task.
EDIT:
The linked question does not give me the correct output for this:
#include <stdio.h>
#define FOO 1
int main(void) {
printf("%d\n", FOO);
}
#define FOO 0
This very clearly prints 1 when run, but gcc test.c -E -dM | grep FOO gives me 0
To dump all defines you can run:
gcc -dM -E file.c
See also: GCC dump preprocessor defines
The dump shows each macro's final definition (the last redefinition wins); it cannot show the value a macro had at a particular point in the code.
GCC already warns by default when a macro is redefined with a different value. You can also add the option "-Wunused-macros" to warn about macros that are defined in the main file but never used.
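One way to record what a macro expands to at a given point is to stringize it there, since expansion happens where the macro is used. This is only a sketch for a single known macro, not a way to list every macro; STR, STR1, foo_early, foo_late and the helper functions are hypothetical names.

```c
#include <assert.h>
#include <string.h>

#define STR1(x) #x
#define STR(x) STR1(x)

#define FOO 1

/* Snapshot taken while FOO is still 1. */
static const char foo_early[] = STR(FOO);
static int foo_value_early(void) { return FOO; }

#undef FOO
#define FOO 0

/* This later definition is the one `gcc -dM -E` reports. */
static const char foo_late[] = STR(FOO);

/* Aborts if the snapshots do not reflect the definition active at each point. */
static void check_snapshots(void)
{
    assert(foo_value_early() == 1);
    assert(strcmp(foo_early, "1") == 0);
    assert(strcmp(foo_late, "0") == 0);
}
```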

One-liner for printing out the value of a macro from a header

I have a header that defines a large number of macros, some of which depend on other macros -- however, the dependencies are all resolved within this header.
I need a one-liner for printing out the value of a macro defined in that header.
As an example:
#define MACRO_A 0x60000000
#define MACRO_B MACRO_A + 0x00010000
//...
As a first attempt:
echo MACRO_B | ${CPREPROCESSOR} --include /path/to/header
... which nearly gives me what I want:
# A number of lines that are not important
# ...
0x60000000 + 0x00010000
... however, I'm trying to keep this from ballooning into a huge sequence of "pipe it to this, then pipe it to that ...".
I've also tried this:
echo 'main(){ printf( "0x%X", MACRO_B ); }' \
| ${CPREPROCESSOR} --include /path/to/header --include /usr/include/stdio.h
... but it (the gcc compiler) complains that -E is required when processing code on standard input, so I end up having to write out to a temporary file to compile/run this.
Is there a better way?
-Brian
echo 'void main(){ printf( "0x%X", MACRO_B ); }' \
| gcc -x c --include /path/to/header --include /usr/include/stdio.h - && ./a.out
will do it in one line.
(You misread the error GCC gives when reading from stdin. You need -E or -x (needed to specify what language is expected))
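Also, it's int main(). When you don't care, as here, you can drop the return type entirely, but that relies on implicit int, which was removed in C99 and which newer compilers warn about or reject. And you don't need to specify the path for stdio.h.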
So slightly shorter:
echo 'main(){printf("0x%X",MACRO_B);}' \
| gcc -xc --include /path/to/header --include stdio.h - && ./a.out
What about tail -n1? Like this:
$ echo C_IRUSR | cpp --include /usr/include/cpio.h | tail -n 1
000400
How about artificially generating an error that contains your MACRO_B value in it, and then compiling the code?
I think the easiest way would be to write a small C program, include the header to that, and print the desired output. Then you can use it in your script, makefile or whatever.
echo '"EOF" EOF' | cpp --include /usr/include/stdio.h | grep EOF
prints:
"EOF" (-1)
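The "small C program" suggestion might look like the sketch below. The two macros are copied from the question in place of the real header, whose path is not known here, and format_macro_b is a hypothetical helper name. Because MACRO_B is defined without parentheses, the sketch wraps it in parentheses before casting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Copied from the question; a real program would include the header instead. */
#define MACRO_A 0x60000000
#define MACRO_B MACRO_A + 0x00010000

/* Writes the evaluated value of MACRO_B into buf as hex.
 * Returns the number of characters snprintf would have written. */
static int format_macro_b(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    /* Parenthesize: MACRO_B expands to an unparenthesized sum. */
    return snprintf(buf, size, "0x%X", (unsigned int)(MACRO_B));
}
```

Unlike the preprocessor-only one-liners, this prints the evaluated result (0x60010000) rather than the unevaluated expression 0x60000000 + 0x00010000.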
