Is there any C preprocessor as an independent program?

I know that the C preprocessor exists as part of the compiler, but I'm looking for an independent program. Is there any such tool?

It's often called cpp. For example, on my Linux box:
CPP(1) GNU CPP(1)
NAME
cpp - The C Preprocessor
SYNOPSIS
cpp [-Dmacro[=defn]...] [-Umacro]
[-Idir...] [-iquotedir...]
[-Wwarn...]
[-M|-MM] [-MG] [-MF filename]
[-MP] [-MQ target...]
[-MT target...]
[-P] [-fno-working-directory]
[-x language] [-std=standard]
infile outfile
This particular one is part of gcc and is available for a wide variety of platforms.

mcpp.
From the homepage:
mcpp is a C/C++ preprocessor with the following features.
Implements all of C90, C99 and C++98 specifications.
Provides a validation suite to test C/C++ preprocessor's conformance and quality comprehensively. When this validation suite is applied, mcpp distinguishes itself among many existing preprocessors.
Has plentiful and on-target diagnostics to check all the preprocessing problems such as latent bug or lack of portability in source code.
Has #pragma directives to output debugging information.
Is portable and has been ported to many compiler-systems, including GCC and Visual C++, on UNIX-like systems and Windows.
Has various behavior modes.
Can be built either as a compiler-specific preprocessor to replace the resident preprocessor of a particular compiler system, or as a compiler-independent command, or even as a subroutine called from some other main program.
Provides comprehensive documents both in Japanese and in English.
Is an open source software released under BSD-style-license.

You can also have a look at m4
What is m4?
M4 can be called a “template language”, a “macro language” or a “preprocessor language”. The name “m4” also refers to the program which processes texts in this language: this “preprocessor” or “macro processor” takes as input an m4 template and sends this to the output, after acting on any embedded directives, called macros.

I've used filepp for preprocessing files other than straight C. It's a Perl module, so it's pretty portable. It's handy in that you can use all the familiar idioms you are used to, and it adds some useful features.
From the web site:
Why filepp and not plain old cpp?
cpp is designed specifically to
generate output for the C compiler.
Yes, you can use any file type with
it, but the output it creates includes
loads of blank lines and lines of the
style:
# 1 "file.c"
Obviously these lines are very useful
to the C-compiler, but no use in say
an HTML file. Also, as filepp is
written in Perl, it is 8-bit clean and
so works on any character set, not
just ASCII characters. filepp is also
customisable and hopefully more user
friendly than cpp.

cpp is one such program. It is a separate executable that ships with gcc; older gcc versions called it when compiling, while recent ones preprocess internally.

It is part of the gcc package and is usually called cpp (C PreProcessor).
which cpp
# /usr/bin/cpp
man cpp

Related

paste operator in macros

I found the following snippet of code.
#include <stdio.h>
#define f(g,g2) g##g2
int main(void) {
    int var12 = 100;
    printf("%d", f(var,12));
    return 0;
}
I understand that this will translate f(var,12) into var12.
My question is about the macro definition: why didn't they just write the following?
#define f(g,g2) gg2
Why do we need ## to concatenate text, rather than concatenating it ourselves?
If one writes gg2, the preprocessor will perceive that as a single token. The preprocessor cannot tell that it is meant to be the concatenation of g and g2.
#define f(g,g2) g##g2
My opinion is that this is poor, unreadable code. It needs at least a comment (giving some motivation, explanation, etc.), and a short name like f is meaningless.
My question is about the macro definition: why didn't they just write the following?
#define f(g,g2) gg2
With such a macro definition, f(x,y) would still be expanded to the token gg2, even if the author wanted the expansion to be xy.
Please take time to read e.g. the documentation of GNU cpp (and of your compiler, perhaps GCC) and later some C standard like n1570 or better.
Consider also designing your software by (in some cases) generating C code (inspired by GNU bison, or GNU m4, or GPP). Your build machinery (e.g. your Makefile for GNU make) would process that as you want. In some cases (e.g. programs running for hours of CPU time), you might consider doing some partial evaluation and generating specialized code at runtime (for example, with libgccjit or GNU lightning). Pitrat's book Artificial Beings: The Conscience of a Conscious Machine explains and argues for that idea at length.
Don't forget to enable all warnings and debug info in your compiler (e.g. with GCC use gcc -Wall -Wextra -g) and learn to use a debugger (like GNU gdb).
On Linux systems I sometimes like to generate some (more or less temporary) C code at runtime (from some kind of abstract syntax tree), then compile that code as a plugin, and dlopen(3) that plugin then dlsym(3) inside it. For a stupid example, see my manydl.c program (to demonstrate that you can generate hundreds of thousands of C files and plugins in the same program). For serious examples, read books.
You might also read books about Common Lisp or about Rust; both have a much richer macro system than C provides.

Extending the C preprocessor to inject code

I am working on a project where I need to inject code into C (or C++) files based on some smart comments in the source. The injected code is provided by an external file. Does anyone know of any such attempts, and can you point me to examples? Of course, I need to preserve the original line numbers with #line. My thinking is to replace cpp with a script that first does this and then calls the system cpp.
Any suggestions will be appreciated
Thanks
Danny
Providing your own modified cpp as an external program usually won't work, at least with recent GCC, where preprocessing is internal to the compiler (it is part of cc1 or cc1plus). Hence, there is no longer any separate cpp program involved in most GCC compilations (although libcpp is an internal library of GCC).
If you are mostly using GCC, I would suggest injecting code with your own #pragmas (not comments!). You could add your own GCC plugin, or code your own MELT extension, for that purpose (since GCC plugins can add pragmas and builtins but cannot currently affect preprocessing).
As Ira Baxter commented, you could simply put some weird macro invocations and define these macros in separate files.
I can't tell exactly what kind of code injection you want.
Alternatively, you could generate your C or C++ code with your own generator (which could emit #line directives) and feed that to gcc.

A Windows C compiler that doesn't split arguments in its runtime libraries?

I have heard that in Windows, parameters are passed as a single string, and the program then splits it into arguments, either in its runtime libraries or sometimes in the actual code.
I've heard that most C/C++ compilers do it in their runtime libraries (for example, TCC, the Tiny C Compiler, which I downloaded).
Are there any C compilers I can download that don't? Any links to them?
And in such a compiler, would argv[0] have the whole string?
Added
It's based on what this person (jdedb) said in the Super User question "Can't pipe or redirect Cygwin grep output", after seeming to suggest that I ask on Stack Overflow.
"It's up to the called program to split the command tail into words, if it wants to operate in Unix (and C language) fashion. (The runtime support libraries of most C and C++ language implementations for Win32 do this splitting behind the scenes.)"
He said it's the compilers. But according to Necrolis, it's not the compiler.
(Added: Necrolis commented, correcting my misreading: compiler != runtime library.)
If you are on Windows, just use GetCommandLine. This is how most CRT wrappers get the command line to split to start with.
As for your actual question, it's not the compiler, but the CRT startup wrapper that they use. If you implement mainCRTstartup, and override the entrypoint with it, you can do whatever you want. A good example of how it works can be seen here.
That "parameter splitting" is the way mandated by the C99 Standard (PDF file) in 5.1.2.2.1.
If an implementation (compiler + library + options) recognizes but does not separate the program name from the other parameters (and parameters from each other) it is not conforming.
Of course, if you use a free-standing implementation none of this applies.

C preprocessor library

I have the task of developing a source analysis tool for C programs, and I need to preprocess the code before the analysis itself. I was wondering what the best library for this is. I need something lightweight and portable.
Instead of rolling your own, why not use the cpp that's part of the gcc suite: http://gcc.gnu.org/onlinedocs/gcc-4.6.1/cpp/
CPP(1) GNU CPP(1)
NAME
cpp - The C Preprocessor
SYNOPSIS
cpp [-Dmacro[=defn]...] [-Umacro]
[-Idir...] [-iquotedir...]
[-Wwarn...]
[-M|-MM] [-MG] [-MF filename]
[-MP] [-MQ target...]
[-MT target...]
[-P] [-fno-working-directory]
[-x language] [-std=standard]
infile outfile
Only the most useful options are listed here; see below for the
remainder.
DESCRIPTION
The C preprocessor, often known as cpp, is a macro processor that is
used automatically by the C compiler to transform your program before
compilation. It is called a macro processor because it allows you to
define macros, which are brief abbreviations for longer constructs.

What C preprocessor macros have already been defined in gcc?

In gcc, how can I check what C preprocessor definitions are in place during the compilation of a C program, in particular what standard or platform-specific macro definitions are defined?
Predefined macros depend on the standard and the way the compiler implements it.
For GCC: http://gcc.gnu.org/onlinedocs/cpp/Predefined-Macros.html
For Microsoft Visual Studio 8: http://msdn.microsoft.com/en-us/library/b0084kay(VS.80).aspx
This Wikipedia page http://en.wikipedia.org/wiki/C_preprocessor#Compiler-specific_predefined_macros shows how to dump some of the predefined macros.
A likely source of the predefined macros for a specific combination of compiler and platform is the Predef project at SourceForge. They are attempting to maintain a catalog of all predefined macros in all C and C++ compilers on all platforms. In practice, they have coverage of a fair number of platforms for GCC, and a smattering of other compilers.
They achieved this through a combination of careful reading of documentation, as well as a shell script that figures out what macros are predefined the hard way: it tries them. My understanding is that it actually tries every string it can find in the executable image of the compiler and/or preprocessor to see if it has a predefined meaning.
They will happily add any info they don't have yet to their database.
A program may define a macro at one
point, remove that definition later,
and then provide a different
definition after that. Thus, at
different points in the program, a
macro may have different definitions,
or have no definition at all.

Resources