Sorry for the bad title, hopefully my explanation is clearer.
I have the following C program:
clang_test.c
#include "clang_test2.c"
int main()
{
somefunc();
return 0;
}
clang_test2.c
int somefunc()
{
return 5;
}
I then compile it using clang with the -E parameter in order to see the result of the preprocessor.
clang.exe -std=c99 -pedantic-errors -E .\clang_test.c
The preprocessor output is this:
# 1 ".\\clang_test.c"
# 1 "<built-in>" 1
# 1 "<built-in>" 3
# 324 "<built-in>" 3
# 1 "<command line>" 1
# 1 "<built-in>" 2
# 1 ".\\clang_test.c" 2
# 1 "./clang_test2.c" 1
int somefunc()
{
return 5;
}
# 3 ".\\clang_test.c" 2
int main()
{
somefunc();
return 0;
}
This works as expected and I get no compilation error if I try to compile it regularly without -E.
For the sake of experimentation I modified clang_test.c to not #include clang_test2.c:
int main()
{
somefunc();
return 0;
}
I then tried to compile using:
clang.exe -std=c99 -pedantic-errors .\clang_test2.c .\clang_test.c
And I get a compiler error saying:
.\clang_test.c:13:2: error: implicit declaration of function 'somefunc' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
somefunc();
But if I look at the output of the preprocessor, it seems like it should work because somefunc() is still declared above the main function where it's used:
# 1 ".\\clang_test2.c"
# 1 "<built-in>" 1
# 1 "<built-in>" 3
# 324 "<built-in>" 3
# 1 "<command line>" 1
# 1 "<built-in>" 2
# 1 ".\\clang_test2.c" 2
int somefunc()
{
return 5;
}
# 1 ".\\clang_test.c"
# 1 "<built-in>" 1
# 1 "<built-in>" 3
# 324 "<built-in>" 3
# 1 "<command line>" 1
# 1 "<built-in>" 2
# 1 ".\\clang_test.c" 2
int main()
{
somefunc();
return 0;
}
So based on this observation, is it not reliable to look at the output of the preprocessor in order to diagnose problems related to function definitions/declarations?
And also since the only difference between the 2 preprocessor outputs is this block of text:
# 1 ".\\clang_test.c"
# 1 "<built-in>" 1
# 1 "<built-in>" 3
# 324 "<built-in>" 3
# 1 "<command line>" 1
# 1 "<built-in>" 2
# 1 ".\\clang_test.c" 2
What could it be doing to prevent somefunc() from being properly forward declared?
The line
clang.exe -std=c99 -pedantic-errors .\clang_test2.c .\clang_test.c
does not just concatenate the two files and compile them. It compiles each of the C files separately to object (.o) files and then links them. You get the error because compiled alone,
int main()
{
somefunc();
return 0;
}
doesn't declare somefunc. You'll need a prototype to tell the compiler its type:
int somefunc(void);
int main(void)
{
somefunc();
return 0;
}
Note that you should be using proper prototypes throughout. Functions that take no arguments should be declared with a void parameter list. Old-style declarations (e.g. with an empty list ()) permit many kinds of errors that prototypes let the compiler detect.
On the other hand, running the compiler with -E
clang.exe -std=c99 -pedantic-errors -E .\clang_test2.c .\clang_test.c
preprocesses each file separately and writes the results one after another to standard output, which is why the output looks like a concatenation. The difference (compiling each file to an object and linking, versus printing the preprocessed files back to back) explains the behavior you see.
I am trying to create a simple unit test library in C, similar to googletest (yes I know I could use that, not the point of the exercise).
/* In unit_test.h */
#define UNIT_TEST_HELPER(SuiteName, TestName) void SuiteName##TestName
#define UNIT_TEST(SuiteName, TestName) UNIT_TEST_HELPER(SuiteName, TestName)()
/* in some other file */
#include "unit_test.h"
/* This successfully creates a function 'void HelloTest()' */
UNIT_TEST(Hello, Test) {
/* This is where testing code goes */
printf("Calling from HelloTest()\n");
}
/* In main I currently do the following */
int main() {
HelloTest();
}
What I would like to do is either:
a) Somehow call HelloTest() after it is fully defined
b) Add HelloTest() to a list of functions to call (a list of void function pointers)
int main() {
UnitTestRun(); /* Loop through function pointers */
}
I have no idea whether this is possible without a lot of work (for example, having to search for functions with a certain signature). The goal is to avoid calling each UNIT_TEST() function explicitly, which is what I currently do, and to make life simpler.
"Accumulate a list" is not within the capabilities of the C preprocessor, I'm afraid.
But it's pretty easy to do with an external preprocessor, and any decent build system will have a way to automatically run an external preprocessor as part of the build process.
If you're careful not to use the sequence UNIT_TEST( anywhere in your source code other than in the definition of a unit test, and also to ensure that you never spread UNIT_TEST(a, b) over two lines (which doesn't seem like a big restriction), then you could use a simple shell script to build a stand-alone unit-test caller:
#!/bin/sh
echo '#define CONCAT_(a,b) a##b'
echo '#define UNIT_TEST(a,b) void CONCAT_(a,b)(void); CONCAT_(a,b)();'
echo 'void UnitTestRun(void) {';
grep -hEo 'UNIT_TEST\([^)]*\)' "$@"
echo '}'
This allows you to define unit tests in various files, and accumulate them into a single file containing only the function UnitTestRun. The file is completely stand-alone; it contains both declarations and invocations of each unit test, so it doesn't require any #includes. (This does require C99; if you needed to separate the declarations from the invocations, you could do two scans over the files.)
Here's a simple sample run of the script. I placed definitions of the test functions in suite1.c and suite2.c, and the above script in unit_test_maker.sh:
$ # The input files
$ cat suite1.c
UNIT_TEST(Suite1, FirstTest) {
/* blah, blah, blah */
}
UNIT_TEST(Suite1, SecondTest) {
/* More blah */
}
$ cat suite2.c
UNIT_TEST(Suite2, OnlyTest) {
/* blah, blah, blah */
}
$ # The output of the script
$ ./unit_test_maker.sh suite1.c suite2.c
#define CONCAT_(a,b) a##b
#define UNIT_TEST(a,b) void CONCAT_(a,b)(void); CONCAT_(a,b)();
void UnitTestRun(void) {
UNIT_TEST(Suite1, FirstTest)
UNIT_TEST(Suite1, SecondTest)
UNIT_TEST(Suite2, OnlyTest)
}
$ # The result of preprocessing the script's output
$ ./unit_test_maker.sh suite1.c suite2.c | gcc -E -x c -
# 1 "<stdin>"
# 1 "<built-in>"
# 1 "<command-line>"
# 31 "<command-line>"
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 32 "<command-line>" 2
# 1 "<stdin>"
void UnitTestRun(void) {
void Suite1FirstTest(void); Suite1FirstTest();
void Suite1SecondTest(void); Suite1SecondTest();
void Suite2OnlyTest(void); Suite2OnlyTest();
}
$ # Compiles without warnings with `-Wall`.
$ ./unit_test_maker.sh suite1.c suite2.c | gcc -Wall -x c -c -o unit_test_run.o -
$
Doxygen version used: 1.8.11
I have the following code:
void func();
void main ()
{
func();
}
When I run Doxygen graph generation, output is correct:
However, if I use a function macro:
void func();
#define MACRO func
void main ()
{
MACRO();
}
Output is incorrect since the called function is missing:
How should I set preprocessor flags for this to work? Any combination I tried has failed so far.
Thanks
Edit: added code after preprocessing
# 1 "test.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 1 "<command-line>" 2
# 1 "test.c"
void func();
void main ()
{
func();
}
If you have clang installed on your system, you can enable the CLANG_ASSISTED_PARSING option in the Doxyfile, which is more accurate but a bit slower than the doxygen builtin preprocessor. This generates the correct call graph on my system.
main.c:
int main() { return 0; }
After preprocessing stage: gcc -E main.c
# 1 "main.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 31 "<command-line>"
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 32 "<command-line>" 2
# 1 "main.c"
int main() { return 0; }
I know that:
the first numbers are line numbers of a processed file;
the "strings" are file names;
the numbers at the end of lines are described here https://gcc.gnu.org/onlinedocs/cpp/Preprocessor-Output.html
What do the other lines mean? Specifically, what are <built-in> and <command-line>, and where is /usr/include/stdc-predef.h taken from?
I found the question "GCC preprocessing, what are the built-in and command-line lines for?", but it has almost no answers.
gcc version 8.3.0 (Debian 8.3.0-6)
UPDATED: Explanation of /usr/include/stdc-predef.h
The header file stdc-predef.h was hardcoded in gcc/config/glibc-c.c (from git repo):
/* Implement TARGET_C_PREINCLUDE for glibc targets. */

static const char *
glibc_c_preinclude (void)
{
  return "stdc-predef.h";
}
It is processed in push_command_line_include of gcc/c-family/c-opts.c:
/* Give CPP the next file given by -include, if any. */
static void
push_command_line_include (void)
{
  /* This can happen if disabled by -imacros for example.
     Punt so that we don't set "<command-line>" as the filename for
     the header. */
  if (include_cursor > deferred_count)
    return;

  if (!done_preinclude)
    {
      done_preinclude = true;
      if (flag_hosted && std_inc && !cpp_opts->preprocessed)
        {
          const char *preinc = targetcm.c_preinclude ();
          if (preinc && cpp_push_default_include (parse_in, preinc))
            return;
        }
    }
The pseudo-filenames "<built-in>" and "<command-line>" are also added there, in c_finish_options.
Start with an empty header.
$ touch foo.h
You already know what the numbers in the preprocessor output mean, so I won't repeat that. As for <built-in>, it is where the predefined macros come from. The preprocessor documentation says:
-dM Instead of the normal output, generate a list of #define
directives for all the macros defined during the
execution of the preprocessor, including predefined
macros. This gives you a way of finding out what is
predefined in your version of the preprocessor. Assuming
you have no file foo.h, the command
touch foo.h; cpp -dM foo.h
shows all the predefined macros.
So doing that should list all the predefined macros and their expansions:
#define __SSP_STRONG__ 3
#define __DBL_MIN_EXP__ (-1021)
#define __FLT32X_MAX_EXP__ 1024
#define __UINT_LEAST16_MAX__ 0xffff
#define __ATOMIC_ACQUIRE 2
:
To see how <command-line> is expanded, pass in a command-line define using the -DX=Y syntax
$ gcc -E -DDBG=1 -dN foo.h|grep 'command-line' -A 1 -B 1
#define __DECIMAL_BID_FORMAT__
# 1 "<command-line>"
#define DBG
--
#define __STDC_ISO_10646__
# 1 "<command-line>" 2
# 1 "foo.h"
DBG shows up under the <command-line> set.
As for "/usr/include/stdc-predef.h", that is the file that contains some of those predefined macros, e.g. on my system:
#ifdef __GCC_IEC_559
# if __GCC_IEC_559 > 0
# define __STDC_IEC_559__ 1
# endif
which matches with the pre-processor output:
$ gcc -E foo.h -dM|grep __STDC_IEC_559__
#define __STDC_IEC_559__ 1
You can always use the cpp binary for just doing the pre-processing part instead of using gcc -E.
A lot more is actually explained in this answer.
I have a code base which uses #define in a different way than I am accustomed to.
I know that, for example, #define a 5 will replace every occurrence of the identifier a with 5 in the code.
But what would this mean:
'#define MSG_FLAG 5, REG, MSGCLR'
I tried doing it in a simple code and compiling it. It takes the last value (like the third argument as MSGCLR).
Preprocessing is largely just text replacement that happens before the "real" compilation starts, so the preprocessor has no notion of what a variable is.
The commas here are not any special syntax. The definition causes every appearance of MSG_FLAG in the code to be replaced by 5, REG, MSGCLR.
Most compilers have a flag that will just run the preprocessor, so you can see for yourself. On gcc, this is -E.
So to verify this, we can have some nonsense source:
#define MSG_FLAG 5, REG, MSGCLR
MSG_FLAG
Compile with gcc -E test.c
And the output is:
# 1 "test.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "test.c"
5, REG, MSGCLR
These are the first few lines of the pre-processor output of a simple C program. What do they mean?
# 1 "test.c"
# 1 "<built-in>" 1
# 1 "<built-in>" 3
# 325 "<built-in>" 3
# 1 "<command line>" 1
# 1 "<built-in>" 2
# 1 "test.c" 2
# 1 "some_path/stdio.h" 1 3 4
# 64 "some_path/stdio.h" 3 4
Here's my program:
#include <stdio.h>
int main()
{
printf("Hello, World!\n");
return 0;
}
# linenum filename flags
These are called linemarkers. They are inserted as needed into the output (but never within a string or character constant). They mean that the following line originated in file filename at line linenum. filename will never contain any non-printing characters; they are replaced with octal escape sequences.
After the file name comes zero or more flags, which are ‘1’, ‘2’, ‘3’, or ‘4’. If there are multiple flags, spaces separate them. Here is what the flags mean:
‘1’ This indicates the start of a new file.
‘2’ This indicates returning to a file (after having included another file).
‘3’ This indicates that the following text comes from a system header file, so certain warnings should be suppressed.
‘4’ This indicates that the following text should be treated as being wrapped in an implicit extern "C" block.
Source: GCC Manual