How can I get all modulo(%) operation in my c source code?
I don't think a regular expression can do it, due to C MACRO inclusion.
And, must take care of string printf formatting.
Which static analyse source code can I use to do this?
For exemple, Codesonar or codecheck "only" try find problems, not that.
I want to check manually all modulo operation with zero. (like division by 0).
It's probably not the best way to do this, but it seems to work. Here's my test file:
#include <stdio.h>
#define MACRO %
int main(void) {
int a = 1%1;
int b = 1 MACRO 1;
printf("%1s", "hello");
}
Then I ran this and got some useful output:
$ clang -fsyntax-only -Xclang -dump-tokens main.c 2>&1 > /dev/null | grep percent
percent '%' Loc=<main.c:6:14>
percent '%' [LeadingSpace] Loc=<main.c:7:15 <Spelling=main.c:3:15>>
Short explanation:
This is the output from the lexical analysis. The command seem to write to stderr instead of stdout. That's the reason for 2>&1 > /dev/null. After that I'm just using grep to find all occurrences of the modulo operator.
Related
//C code starts
#define mod(a) (a>=0?a:-a)
#include<stdio.h>
int main(){
int x,y,z;
scanf("%d%d%d",&x,&y,&z);
printf("%d %d %d %d %d\n",x,y,z,y-z,x-z);
if(mod(y-x)<mod(x-z)) printf("%d %d Cat A",mod(y-z),mod(x-z));
else if(mod(y-z)>mod(x-z)) printf("%d %d Cat B",mod(y-z),mod(x-z));
else printf("Mouse C");
printf("\n");
}
/*code ends here*/
For the input of "1 3 2" I would expect the output to be "Mouse C" but it is not the case.
Also if we add all the variables in mod in one more bracket (e.g. if the mod(y-z) is then written as mod((y-z)) ) then the output comes as expected.
So why it is going on?
Macros perform direct text (or more accurately, token) substitution. So this:
mod(y-x)
Is exactly the same as this:
(y-x>=0?y-x:-y-x)
Notice that the last part is -y-x, i.e. the negation of y minus x, while what you wanted was -(y-x). This is a prime example of why macro arguments should always be placed in parenthesis as follows:
#define mod(a) ((a)>=0?(a):-(a))
When you want to see what your macro actually does, look at your code after the preprocessor has actually done its job:
gcc -E file.c
clang -E file.c or clang --preprocess file.c
cl.exe /E file.c, cl.exe /P file.c, or cl.exe /EP file.c for those on windows
It's in general a bad idea to use macros for these things. Use a function instead.
First thing is that you the preprocessor just replaces text. mod(y-x) will be expanded to (y-x>=0?y-x:-y-x) which is obviously wrong. You can remedy this with
#define mod(a) ((a)>=0?(a):-(a))
But here's another catch. Suppose you do this:
mod(f())
where f is a function with side effects. For instance the rand() function. The macro would then make three function calls.
You can solve this problem in gcc (thus making it non-portable) with this construct:
#define mod(a) ({ int _a=a; _a>=0?_a:-_a; })
But isn't it easier to just do this?
long long mod(a) {
return a>=0?a:-a;
}
I've been working on a piece of code that had an overlooked derp in it:
#include<stdio.h>
#include<stdlib.h>
#include<limits.h>
#define MAX_N_LENGTH
/*function prototypes*/
int main(){
...
}
It should be easy to spot with the context removed: #define MAX_N_LENGTH should have read #define MAX_N_LENGTH 9. I have no idea where that trailing constant went.
Since that macro was only used in one place in the form of char buf[ MAX_N_LENGTH + 1], it was extremely difficult to track down and debug the program.
Is there a way to catch errors like this one using the gcc compiler?
You can use char buf[1 + MAX_N_LENGTH], because char buf[1 +] should not compile with the error message error: expected expression before ']' token:
http://ideone.com/5m2LYw
What you have there isn't an undefined macro. It's an empty macro. And defined empty macros are perfectly legit, because you can test for their definedness.
They're used quite a lot in the implementation header files, although all those empty macros will be in the implementation namespace, which means they will either contain two underscores or an underscore followed by an uppercase letter.
What you could do is test whether you have an empty macro that's not in the implementation namespace, and you can do that with:
cpp -dM YOUR_FILE.c |
cut -d\ -f2- | grep '^[a-zA-Z0-9_]* $' |grep -v -e __ -e ^_[A-Z]
For your example, it should output just MAX_N_LENGTH.
It's not possible to catch this error in the general sense, because it isn't an error. There's plenty of cases where this sort of behavior is desired, so the compiler cannot treat it as an error or a warning.
If you can track the error down to a line, using gcc's -E command line argument will cause it to output the result of the preprocessor. In that case, your char line would have turned to char buf[+1], which is legal C code, but might catch your attention because you expected it to be char buf[9+1]. -E causes gcc to print those results, so you would actually see char buf[+1] in the output of gcc.
Issues like this are why C++ discourages use of define macros in this way (C++, of course, has more alternatives than C which makes it easier to discourage them)
You can use the preprocessor to catch when a macro is either 0 or defined without a value:
#define VAR
#if VAR+0 == 0
#error "VAR is either 0 or defined without a value."
#endif
alright, i understand that the title of this topic sounds a bit gibberish... so i'll try to explain it as clearly as i can...
this is related to this previous post (an approach that's been verified to work):
multipass a source code to cpp
-- which basically asks the cpp to preprocess the code once before starting the gcc compile build process
take the previous post's sample code:
#include <stdio.h>
#define DEF_X #define X 22
int main(void)
{
DEF_X
printf("%u", X);
return 1;
}
now, to be able to freely insert the DEF_X anywhere, we need to add a newline
this doesn't work:
#define DEF_X \
#define X 22
this still doesn't work, but is more likely to:
#define DEF_X \n \
#define X 22
if we get the latter above to work, thanks to C's free form syntax and constant string multiline concatenation, it works anywhere as far as C/C++ is concerned:
"literal_str0" DEF_X "literal_str1"
now when cpp preprocesses this:
# 1 "d:/Projects/Research/tests/test.c"
# 1 "<command-line>"
# 1 "d:/Projects/Research/test/test.c"
# 1 "c:\\mingw\\bin\\../lib/gcc/mingw32/4.7.2/../../../../include/stdio.h" 1 3
# 19 "c:\\mingw\\bin\\../lib/gcc/mingw32/4.7.2/../../../../include/stdio.h" 3
# 1 "c:\\mingw\\bin\\../lib/gcc/mingw32/4.7.2/../../../../include/_mingw.h" 1 3
# 32 "c:\\mingw\\bin\\../lib/gcc/mingw32/4.7.2/../../../../include/_mingw.h" 3=
# 33 "c:\\mingw\\bin\\../lib/gcc/mingw32/4.7.2/../../../../include/_mingw.h" 3
# 20 "c:\\mingw\\bin\\../lib/gcc/mingw32/4.7.2/../../../../include/stdio.h" 2 3
ETC_ETC_ETC_IGNORED_FOR_BREVITY_BUT_LOTS_OF_DECLARATIONS
int main(void)
{
\n #define X 22
printf("%u", X);
return 1;
}
we have a stray \n in our preprocessed file. so now the problem is to get rid of it....
now, the unix system commands aren't really my strongest suit. i've compiled dozens of packages in linux and written simple bash scripts that simply enter multiline commands (so i don't have to type them every time or keep pressing the up arrow and choose the correct command successions). so i don`t know the finer points of stream piping and their arguments.
having said that, i tried these commands:
cpp $MY_DIR/test.c | perl -p -e 's/\\n/\n/g' > $MY_DIR/test0.c
gcc $MY_DIR/test0.c -o test.exe
it works, it removes that stray \n.
ohh, as to using perl rather than sed, i'm just more familiar with perl's variant to regex... it's more consistent in my eyes.
anyways, this has the nasty side effect of eating up any \n in the file (even in string literals)... so i need a script or a series of commands to:
remove a \n if:
if it is not inside a quote -- so this won't be modified: "hell0_there\n"
not passed to a function call (inside the argument list)
this is safe as one can never pass a single \n, which is neither a keyword nor an identifier.
if i need to "stringify" an expression with \n, i can simply call a function macro QUOTE_VAR(token). so that encapsulates all instances that \n would have to be treated as a string.
this should cover all cases that \n should be substituted... at least for my own coding conventions.
really, i would do this if i could manage it on my own... but my skills in regex is extremely lacking, only using it in for simple substitutions.
The better way is to replace \n if it occurs in the beginning of line.
The following command should do the work:
sed -e 's/\s*\\n/\n/g'
or occurs before #
sed -e 's/\\n\s*#/\n#/g'
or you can reverse the order of preprocessing and substitute DEF_X with your own tool before C preprocessor.
For example, does MIN_N_THINGIES below compile to 2? Or will I recompute the division every time I use the macro in code (e.g. recomputing the end condition of a for loop each iteration).
#define MAX_N_THINGIES (10)
#define MIN_N_THINGIES ((MAX_N_THINGIES) / 5)
uint8_t i;
for (i = 0; i < MIN_N_THINGIES; i++) {
printf("hi");
}
This question stems from the fact that I'm still learning about the build process. Thanks!
If you pass -E to gcc it will show what the preprocessor stage outputted.
gcc -E test.c | tail -n11
Outputs:
# 3 "test.c" 2
int main() {
uint8_t i;
for (i = 0; i < ((10) / 5); i++) {
printf("hi");
}
return 0;
}
Then if you pass -s flag to gcc you will see that the division was optimized out. If you also pass the -o flag you can set the output files and diff them to see that they generated the same code.
gcc -S test.c -o test-with-div.s
edit test.c to make MIN_N_THINGIES equal a const 2
gcc -S test.c -o test-constant.s
diff test-with-div.s test-constant.s
// for educational purposes you should look at the .s files generated.
Then as mentioned in another comment you can change the optimization flag by using -O...
gcc -S test.c -O2 -o test-unroll-loop.s
Will unroll the for loop even such that there isn't even a loop.
Preprocessor will replace MIN_N_THINGIES with ((10)/5), then it is up to the compiler to optimize ( or not ) the expression.
Maybe. The standard does not mandate that it is or it is not. On most compilers it will do after passing optimization flags (for example gcc with -O0 does not do it while with -O2 it even unrolls the loop).
Modern compilers perform even much more complicated techniques (vectorization, loop skewing, blocking ...). However unless you really care about performance, for ex. you program HPC, program real time system etc., you probably should not care about the output of the compiler - unless you're just interested (and yes - compilers can be a fascinating subject).
No. The preprocessor does not calculate macros, they're handled by the compiler. The preprocessor can calculate arithmetic expressions (no floating point values) in #if conditionals though.
Macros are simply text substitutions.
Note that the expanded macros can still be calculated and optimized by the compiler, it's just that it's not done by the preprocessor.
The standard mandates that some expressions are evaluated at compile time. But note that the preprocessor does just text splicing (well, almost) when the macro is called, so if you do:
#define A(x) ((x) / (S))
#define S 5
A(10) /* Gives ((10) / (5)) == 2 */
#undef S
#define S 2
A(20) /* Gives ((20) / (2)) == 10 */
The parenteses are to avoid idiocies like:
#define square(x) x * x
square(a + b) /* Gets you a + b * a + b, not the expected square */
After preprocessing, the result is passed to the compiler proper, which does (most of) the computation in the source that the standard requests. Most compilers will do a lot of constant folding, i.e., computing (sub)expressions made of known constants, as this is simple to do.
To see the expansions, it is useful to write a *.c file of a few lines, just with the macros to check, and run it just through the preprocessor (typically someting like cc -E file.c) and check the output.
I have code that has a lot of complicated #define error codes that are not easy to decode since they are nested through several levels.
Is there any elegant way I can get a list of #defines with their final numerical values (or whatever else they may be)?
As an example:
<header1.h>
#define CREATE_ERROR_CODE(class, sc, code) ((class << 16) & (sc << 8) & code)
#define EMI_MAX 16
<header2.h>
#define MI_1 EMI_MAX
<header3.h>
#define MODULE_ERROR_CLASS MI_1
#define MODULE_ERROR_SUBCLASS 1
#define ERROR_FOO CREATE_ERROR_CODE(MODULE_ERROR_CLASS, MODULE_ERROR_SUBCLASS, 1)
I would have a large number of similar #defines matching ERROR_[\w_]+ that I'd like to enumerate so that I always have a current list of error codes that the program can output. I need the numerical value because that's all the program will print out (and no, it's not an option to print out a string instead).
Suggestions for gcc or any other compiler would be helpful.
GCC's -dM preprocessor option might get you what you want.
I think the solution is a combo of #nmichaels and #aschepler's answers.
Use gcc's -dM option to get a list of the macros.
Use perl or awk or whatever to create 2 files from this list:
1) Macros.h, containing just the #defines.
2) Codes.c, which contains
#include "Macros.h"
ERROR_FOO = "ERROR_FOO"
ERROR_BAR = "ERROR_BAR"
(i.e: extract each #define ERROR_x into a line with the macro and a string.
now run gcc -E Codes.c. That should create a file with all the macros expanded. The output should look something like
1 = "ERROR_FOO"
2 = "ERROR_BAR"
I don't have gcc handy, so haven't tested this...
The program 'coan' looks like the tool you are after. It has the 'defs' sub-command, which is described as:
defs [OPTION...] [file...] [directory...]
Select #define and #undef directives from the input files in accordance with the options and report them on the standard output in accordance with the options.
See the cited URL for more information about the options. Obtain the code here.
If you have a complete list of the macros you want to see, and all are numeric, you can compile and run a short program just for this purpose:
#include <header3.h>
#include <stdio.h>
#define SHOW(x) printf(#x " = %lld\n", (long long int) x)
int main(void) {
SHOW(ERROR_FOO);
/*...*/
return 0;
}
As #nmichaels mentioned, gcc's -d flags may help get that list of macros to show.
Here's a little creative solution:
Write a program to match all of your identifiers with a regular expression (like \#define :b+(?<NAME>[0-9_A-Za-z]+):b+(?<VALUE>[^(].+)$ in .NET), then have it create another C file with just the names matched:
void main() {
/*my_define_1*/ my_define_1;
/*my_define_2*/ my_define_2;
//...
}
Then pre-process your file using the /C /P option (for VC++), and you should get all of those replaced with the values. Then use another regex to swap things around, and put the comments before the values in #define format -- now you have the list of #define's!
(You can do something similar with GCC.)
Is there any elegant way I can get a list of #defines with their final numerical values
For various levels of elegance, sort of.
#!/bin/bash
file="mount.c";
for macro in $(grep -Po '(?<=#define)\s+(\S+)' "$file"); do
echo -en "$macro: ";
echo -en '#include "'"$file"'"\n'"$macro\n" | \
cpp -E -P -x c ${CPPFLAGS} - | tail -n1;
done;
Not foolproof (#define \ \n macro(x) ... would not be caught - but no style I've seen does that).