Overriding line marker filename in CPP output - c-preprocessor

At work, we're using GNU CPP as a preprocessor for a custom language. The flow downstream from the preprocessor understands line markers (of the form # 123 foo-bar.extension) and embeds their information into the eventual compiled source.
This is all very well, but the overall process works by writing everything to a temporary directory and it does some (pre-)preprocessing on the input before handing it to CPP. This transformation doesn't change line numbers. As a result, CPP gets called on a file of the form my-tmp-dir/foo.input-pp and that name ends up embedded in the line markers. I'd love to be able to spoof things so that CPP instead emitted line markers of the form original/path/foo.original.input.
Does CPP have any command flags that would let me do this?

I don't know of any option for this, but can't you use a simple sed command to change the line markers?
sed -e '/^#/s,my-tmp-dir\(.*\)-pp,original/path\1,'
Alternatively, put a line directive as the first line of the transformed source so cpp knows what the file name is:
#line 1 "original/path/foo.original.input"
Line directives work as expected. This file:
#line 1 "bar.c"
int test1;
#include <sys/syscall.h>
int test2;
is preprocessed into:
# 1 "foo.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "foo.c"
# 1 "bar.c"
int test1;
# 1 "/usr/include/sys/syscall.h" 1 3 4
# 3 "bar.c" 2
int test2;
A mention of foo.c still occurs but it is immediately overridden by bar.c.
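The same override is visible from inside the translation unit, not only in the marker output. Here is a minimal sketch, assuming the path from the question; the function names are made up for illustration. After a #line directive, __FILE__ and __LINE__ report the spoofed location:

```c
#include <assert.h>
#include <string.h>

/* The path below is the hypothetical original name from the question.
 * The preprocessor never checks that it exists. The line right after
 * the directive is treated as line 1 of that file. */
#line 1 "original/path/foo.original.input"
const char *reported_file(void) { return __FILE__; }
int reported_line(void) { return __LINE__; }

/* Illustrative self-check: the first function sits on spoofed line 1,
 * so the __LINE__ in the second function expands to 2. */
void check_reported_location(void)
{
    assert(strcmp(reported_file(), "original/path/foo.original.input") == 0);
    assert(reported_line() == 2);
}
```

In the flow described in the question, the (pre-)preprocessing step would write that directive as the first line of my-tmp-dir/foo.input-pp before handing the file to CPP.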

Related

In C what is the difference between a normal array and one declared with #define?

I just started learning C and I was wondering if there is a difference between an array declared like this one:
int m[10][5];
and an array declared like this:
#define NL 10
#define NC 5
int m[NL][NC];
I thought that the memory allocated might be different, but I'm not really sure.
The #define's are "preprocessor directives". The C preprocessor (CPP) is basically a program that is run by your compiler before actually processing your C code.
For your #define's, the preprocessor basically does a "search and replace" for NL and NC, replacing with 10 and 5, respectively. It does this without much understanding of C. (In fact, you could run the C preprocessor on any kind of text file and have it do the same search and replace.)
To get gcc or clang to stop after running the preprocessor, use the -E option; it writes the preprocessed result to stdout.
Here's an example of GCC doing that on your source code:
$ gcc -E andre.c
# 1 "andre.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "andre.c"
int m[10][5];
As you can see, after running the preprocessor, NL and NC have been replaced with the text you supplied in their #define's.
If you're using a compiler other than gcc or clang, check its documentation for how to stop processing after running the preprocessor. Often, preprocessed source code has the extension .i.
(On my machine, cpp andre.c does the exact same thing as the call to gcc above.)
C/C++ macros are replaced by the preprocessor with their replacement text before the compiler proper sees the code, so there is no difference between a declaration that uses #define and one that writes the values inline.
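A quick way to convince yourself is to declare both arrays side by side and compare them. This is only a sketch: the array and function names are invented for illustration, and NL and NC are the macros from the question.

```c
#include <assert.h>
#include <stddef.h>

#define NL 10
#define NC 5

int literal_dims[10][5];   /* dimensions written out by hand */
int macro_dims[NL][NC];    /* dimensions supplied by the macros */

/* Number of rows and columns, derived from the array type itself. */
size_t rows(void) { return sizeof macro_dims / sizeof macro_dims[0]; }
size_t cols(void) { return sizeof macro_dims[0] / sizeof macro_dims[0][0]; }

/* Illustrative self-check: both declarations occupy the same storage. */
void check_same_layout(void)
{
    assert(sizeof literal_dims == sizeof macro_dims);
    assert(rows() == 10 && cols() == 5);
}
```

After preprocessing, the second declaration is token-for-token the same as the first, which is why the sizes agree.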

GCC -E output meaning [duplicate]

I print out the output of C preprocessor by using
gcc -E a.c
The output contains many lines like
# 1 "a.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "a.c"
# 1 "c:\\mingw\\bin\\../lib/gcc/mingw32/4.5.0/../../../../include/stdio.h" 1 3
# 19 "c:\\mingw\\bin\\../lib/gcc/mingw32/4.5.0/../../../../include/stdio.h" 3
# 1 "c:\\mingw\\bin\\../lib/gcc/mingw32/4.5.0/../../../../include/_mingw.h" 1 3
# 31 "c:\\mingw\\bin\\../lib/gcc/mingw32/4.5.0/../../../../include/_mingw.h" 3
# 32 "c:\\mingw\\bin\\../lib/gcc/mingw32/4.5.0/../../../../include/_mingw.h" 3
# 20 "c:\\mingw\\bin\\../lib/gcc/mingw32/4.5.0/../../../../include/stdio.h" 2 3
I've never seen this kind of syntax in C. Can someone explain what this is doing?
These lines are hints for debugging: they record where the code following each marker actually came from. The general form is:
# line-number "source-file" [flags]
Meaning of flags (space separated):
1 - Start of a new file
2 - Returning to previous file
3 - Following text comes from a system header file (#include <> vs #include "")
4 - Following text should be treated as being wrapped in an implicit extern "C" block.
These linemarkers are mentioned in man gcc under the -P option.
The -P option is specifically meant to get rid of these lines for clarity:
gcc -E -P source.c
See detailed documentation (answered before).
Those are line synchronization directives, which allow gcc to give correct error messages for errors in #included files. Other preprocessors (such as yacc/bison) use the same mechanism to relate C errors to the correct lines in the input .y file.
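If a downstream tool has to consume these markers itself, the format above is simple enough to parse by hand. Below is a rough sketch, not how gcc does it: the struct and function names are hypothetical, and it assumes the file name contains no escaped double quotes (backslash escapes such as the doubled backslashes in the MinGW paths are kept as written).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record for one parsed linemarker. */
struct linemarker {
    long line;        /* line number that the next output line maps to */
    char file[512];   /* file name exactly as written between the quotes */
    unsigned flags;   /* bit n is set when flag n (1 to 4) was present */
};

/* Returns 1 and fills *out if text has the form
 *     # line-number "source-file" [flags]
 * and returns 0 for any other line. */
int parse_linemarker(const char *text, struct linemarker *out)
{
    int consumed = 0;
    const char *p;
    char *end;

    /* %n is stored only if the closing quote was matched. */
    if (sscanf(text, "# %ld \"%511[^\"]\"%n",
               &out->line, out->file, &consumed) != 2 || consumed == 0)
        return 0;

    out->flags = 0;
    for (p = text + consumed; ; p = end) {
        long flag = strtol(p, &end, 10);
        if (end == p)
            break;                      /* no more numbers on the line */
        if (flag >= 1 && flag <= 4)
            out->flags |= 1u << flag;
    }
    return 1;
}

/* Nonzero when the given flag (1 to 4) appeared on the marker. */
int has_flag(const struct linemarker *m, int flag)
{
    return (int)((m->flags >> flag) & 1u);
}

/* Illustrative self-check using a marker in the format shown above. */
void linemarker_selfcheck(void)
{
    struct linemarker m;
    assert(parse_linemarker("# 3 \"bar.c\" 2", &m));
    assert(m.line == 3 && strcmp(m.file, "bar.c") == 0);
    assert(has_flag(&m, 2) && !has_flag(&m, 1));
    assert(!parse_linemarker("int test2;", &m));
}
```

Ordinary source lines and #line directives are rejected by the leading "# number" match, so the function can be run over every line of the gcc -E output.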

What does the preprocessor do with "# <number> <filename>"?

I've just encountered a C file which contains both preprocessor directives and lines that look like this:
# 9 "filename"
I have never seen such lines before. What do they mean? I'm guessing these are preprocessor directives, but what does the preprocessor do with them?
Also, for some of the lines the string doesn't even represent an existing filename...
I believe it's another way of using the #line preprocessor directive.
For example you could write:
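#include <stdio.h>

// you could write #line 7 "filename" or
// # 7 "filename" or
// # 7 or
#line 7
int main(void)
{
    printf("%d\n", __LINE__);
    return 0;
}
And all of them would give you (in this case) 9 on stdout: the line after the directive is numbered 7, so the printf call sits on line 9.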
A note about the "filename" part: it is optional and unverified (that's why it can be anything, even a file that doesn't exist). The documentation describes its use:
If you specify a file name, the compiler views the next line as part of the specified file. If you do not specify a file name, the compiler views the next line as part of the current source file.

How to remove lines added by default by the C preprocessor to the top of the output?

I'm trying to use the C preprocessor on non-C code, and it works fine except for creating lines like this at the top:
# 1 "test.java"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "test.java"
The problem is that these lines aren't valid in Java. Is there any way to get the preprocessor to not write this stuff? I'd prefer not to have to run this through something else to just remove the first 4 lines every time.
If you're using the gcc preprocessor:
-P Inhibit generation of linemarkers in the output from the
preprocessor. This might be useful when running the preprocessor
on something that is not C code, and will be sent to a program
which might be confused by the linemarkers.
from gcc cpp man page
