Extending the C preprocessor to inject code

I am working on a project where I need to inject code into C (or C++) files based on special comments in the source. The injected code is provided by an external file. Does anyone know of any such attempts and can point me to examples? Of course, I need to preserve the original line numbers with #line. My plan is to replace cpp with a script that first does the injection and then calls the system cpp.
Any suggestions will be appreciated.
Thanks
Danny

Providing your own modified external cpp program usually won't work, at least with recent GCC, where preprocessing is internal to the compiler (it is part of cc1 or cc1plus). Hence, most GCC compilations no longer involve any separate cpp program (libcpp is an internal library of GCC).
If you are mostly using GCC, I would suggest injecting code with your own #pragmas (not comments!). You could write your own GCC plugin, or code your own MELT extension, for that purpose (since GCC plugins can add pragmas and builtins but cannot currently affect preprocessing).
As Ira Baxter commented, you could simply put some unusual macro invocations in the source and define these macros in separate files.
I can't quite tell what precise kind of code injection you want.
Alternatively, you could generate your C or C++ code with your own generator (which could emit #line directives) and feed that to gcc.
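To make the macro and #line suggestions concrete, here is a minimal sketch. The names INJECT_AT_ENTRY, entry_count, annotated_function and "original.c" are made up for illustration; in a real setup the macro definition would come from a separate (possibly generated) file, and the #line directive would be emitted by whatever tool inserts the extra lines.

```c
#include <assert.h>
#include <string.h>

static int entry_count = 0;   /* state touched by the "injected" code */

/* Hypothetical hook macro. In a real build its definition would live in a
   separate header supplied by the injection tool, not in the source itself. */
#ifndef INJECT_AT_ENTRY
#define INJECT_AT_ENTRY(name) (entry_count++)
#endif

/* A generator that added lines above would emit a directive like this one so
   that diagnostics keep pointing at the original file and line numbers. */
#line 200 "original.c"
int reported_line(void) { return __LINE__; }
const char *reported_file(void) { return __FILE__; }

int annotated_function(int x)
{
    INJECT_AT_ENTRY(annotated_function);   /* expands to the injected code */
    return x * 2;
}

int entries_seen(void) { return entry_count; }

/* Sanity check that the line mapping is what the generator intended. */
void check_line_mapping(void)
{
    assert(reported_line() == 200);
    assert(strcmp(reported_file(), "original.c") == 0);
}
```

Calling annotated_function runs the injected expression first, and any compiler message for the lines after the directive would be reported against original.c starting at line 200.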

Related

Is it possible to see the macros of a compiled C program?

I am trying to learn C and I have a C file whose macros I want to view. Is there a tool to view the macros of a compiled C file?

No. That's literally impossible.
The preprocessor is a textual replacement that happens before the main compile pass. There is no difference between using a macro and putting the code the macro expands to in its place.*
*Ignoring the debug information. But even that can be matched if you use the #line directive to set the file and line number.
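A small sketch of what "no difference" means here. LIMIT and CLAMP are made-up macros; the second function is simply what the preprocessor would hand to the compiler for the first one, written out by hand.

```c
#include <assert.h>

#define LIMIT 64
#define CLAMP(x, lo, hi) ((x) < (lo) ? (lo) : ((x) > (hi) ? (hi) : (x)))

/* Written with macros. */
int clamp_with_macro(int x) { return CLAMP(x, 0, LIMIT); }

/* The same function after textual expansion, typed in manually. */
int clamp_by_hand(int x) { return ((x) < (0) ? (0) : ((x) > (64) ? (64) : (x))); }

/* Both functions behave identically for every input tried here. */
void check_same_behaviour(void)
{
    for (int x = -100; x <= 100; ++x)
        assert(clamp_with_macro(x) == clamp_by_hand(x));
}
```

The compiler proper only ever sees the second form, which is why the names LIMIT and CLAMP cannot be recovered from the compiled code.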

They're always defined in the header file(s) that you've imported with #include, or that those files in turn #include.
This may involve a lot of digging. It may involve going into files that make no sense to you because they're not written for casual inspection.
Any macros of any importance are usually documented. They may use other more complex implementation-specific macros that you shouldn't concern yourself with ordinarily, but if you're curious how they work the source is all there.
That being said, this is only relevant if you have the source and more specifically a complete build environment. Once compiled all these definitions, like the source itself, do not appear in the executable and cannot be inferred directly from the executable, especially not a release build.
Unlike Java or C#, C compiles directly to machine code, so there's no easy way to reverse that back to the source. There are "decompilers" that try, but they can only really guess at the original source. VM-based languages like Java and C# only lightly compile the code, so there are a lot of hints as to how that code was generated, and reversing it is an easier process.

Do any C-targeting compilers allow inline C?

Some C compilers emit assembly language and allow snippets of assembly to be placed inline in the source code to be copied verbatim to the output, e.g. https://gcc.gnu.org/onlinedocs/gcc/Using-Assembly-Language-with-C.html
Some compilers for higher-level languages emit C, ranging from Nim which was to some extent designed for that, to Scheme which very definitely was not, and takes heroic effort to compile to efficient code that way.
Do any such compilers, similarly allow snippets of C to be placed inline in the source code, to be copied verbatim to the output?

I'm not sure I understand what you mean by "be copied verbatim to the output," but all C compilers (msvc, gcc, clang, etc.) have preprocessor directives that essentially allow snippets of code to be added to the source files for compilation. For example, the #include directive will pull in the contents of the specified file to be included in compilation. An "effect" of this is that you can do weird things such as the following (which only compiles if /tmp/somefile.c contains a C string literal):
printf("My code: \n%s\n",
#include "/tmp/somefile.c"
);
Alternatively, creating macros with the #define directive allows you to substitute snippets of code wherever the macro name is used. This all happens at the preprocessor stage, before anything is turned into the compile "output."
Other languages, like C# with Roslyn, allow runtime compilation of code. Of course, you can also implement the same thing in C by invoking your compiler via something like system() and then loading the resulting library with dlopen.
Edit:
Now that I come back and think about this question, I should also note that Python is one of those C-targeting "compilers" (I guess technically an interpreter on top of the Python runtime). Python lets you use native compiled C code, either with some Python C API code to export functions or directly with some dlopen-like helpers. Take a look at the inlinec module, which does what I described above (calls the compiler, then loads the compiled code). I suppose you should be able to do something similar with any language that can call compiled C code (C#, Java, etc.).

Changing preprocessed values during compile time

I have written some code using preprocessor directives to skip some statements. But my C code inside main needs to change previously #defined values, assign new values depending on a condition, and also change the result of the preprocessed statements at run time. In short I have to change the pre processed statements during run time. How can I do this?

In short I have to change the pre processed statements during run time
This is impossible. Read about C preprocessing and cpp. Compile time and run time are different (the compiled code could even run on a different machine; read more about cross-compiling). If using GCC, use gcc -C -E foo.c > foo.i to preprocess your foo.c source file into the preprocessed form foo.i (and then use an editor or a pager to look inside the generated foo.i).
Perhaps you want to load additional code at runtime. This is not possible with pure C99 standard code. Perhaps your operating system offers dynamic loading. POSIX specifies dlopen. You might also want to use JIT compiling techniques to construct machine code at runtime, e.g. with libraries like GCCJIT, asmjit, GNU lightning, libjit, LLVM, ...
Read also about homoiconic languages. Consider coding in Common Lisp (e.g. with SBCL).
Perhaps you want to customize your GCC compiler with MELT.

Not possible. Preprocessing happens before compile-time.
The compiler only sees the result of the preprocessor, nothing more.
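If the real goal is a setting that can change while the program runs, one common workaround (my suggestion, not something from the answers above) is to use the macro only as a compile-time default and keep the live value in an ordinary variable. A minimal sketch, with DEFAULT_VERBOSITY, set_verbosity and should_log as made-up names:

```c
#include <assert.h>

/* Compile-time default; could be overridden with -DDEFAULT_VERBOSITY=... */
#ifndef DEFAULT_VERBOSITY
#define DEFAULT_VERBOSITY 1
#endif

/* The macro is gone after preprocessing; this variable is what exists at run time. */
static int verbosity = DEFAULT_VERBOSITY;

/* Change the setting while the program is running. */
void set_verbosity(int level)
{
    assert(level >= 0);
    verbosity = level;
}

/* Replaces an #if-style check with a run-time test. */
int should_log(int level) { return level <= verbosity; }
```

Code that used to be wrapped in #if blocks would then be wrapped in if (should_log(...)) instead, so it is always compiled in and merely skipped at run time.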

Can you add preprocessor directives in assembly?

I would like to execute some assembly instructions based on a define from a header file.
Let's say in test.h I have #define DEBUG.
In test.asm I want to check somehow like #ifdef DEBUG do something...
Is such a thing possible? I was not able to find anything helpful in the similar questions or online.

Yes, you can run the C preprocessor on your asm file. How to do this depends on your build environment. gcc, for example, automatically runs it for files with the extension .S (capital). Note that whatever you include should be asm compatible. It is common practice to conditionally include part of the header, using #ifndef ASSEMBLY or similar constructs, so you can have C and ASM parts in the same header.
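A rough sketch of how such a shared header might be laid out, shown here from the C side. ASSEMBLY is a guard name you would pass yourself (e.g. with -DASSEMBLY when building the asm file); debug_state, debug_level and debug_compiled_in are made-up names. GCC also predefines __ASSEMBLER__ when it preprocesses assembly, which can serve the same purpose.

```c
#include <assert.h>
#include <stddef.h>

/* --- what a shared header such as test.h could contain --- */

#define DEBUG                 /* plain macros are fine for both C and asm */

#ifndef ASSEMBLY              /* assumed guard; the asm build defines it */
/* Everything in here is C syntax, so the assembler must never see it. */
struct debug_state { int level; };

static inline int debug_level(const struct debug_state *s)
{
    assert(s != NULL);
    return s->level;
}
#endif

/* --- C code using the header; the asm file would use #ifdef DEBUG the same way --- */

int debug_compiled_in(void)
{
#ifdef DEBUG
    return 1;
#else
    return 0;
#endif
}
```

The assembly file would include the same header after defining ASSEMBLY, see only the #define lines, and wrap its debug-only instructions in #ifdef DEBUG ... #endif.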

The C preprocessor is just a program that inputs data (C source files), transforms it, and outputs data again (translation units).
You can run it manually like so:
gcc -E - < input > output
which means you can run the C preprocessor over .txt files, or LaTeX files, if you want to.
The difficult bit, of course, is how you integrate that in your build system. This very much depends on the build system you're using. If that involves makefiles, you create a target for your assembler file:
assembler_file: input_1 input_2
	gcc -E - < $< > $@
and then you compile "assembler_file" in whatever way you normally compile it.

Sure, but that is no longer assembly language. You would need to feed it through a C preprocessor that also knows this is a hybrid C/asm file, does the C preprocessing part but doesn't try to compile, and then feeds the result to the assembler or has its own assembler built in.
It is possible, and depends heavily on your toolchain (it is either supported or not), but IMO it leaves a very bad taste. YMMV.

C/C++ Compiler listing what's defined

This question: "Is there a way to tell whether code is now being compiled as part of a PCH?" led me to think about this.
Is there a way, in perhaps only certain compilers, of getting a C/C++ compiler to dump out the defines that it's currently using?
Edit: I know this is technically a pre-processor issue but let's add that within the term compiler.

Yes. In GCC
g++ -E -dM <file>
I would bet it is possible in nearly all compilers.
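The dump mostly consists of predefined macros such as __GNUC__ or __STDC_VERSION__, and once you know which ones your compiler sets, code can test them. A small sketch (the function names are mine; the macros tested are ones these compilers are known to predefine):

```c
#include <assert.h>

/* Report which compiler family built this file, based on predefined macros.
   clang is tested first because it also defines __GNUC__ in most setups. */
const char *compiler_name(void)
{
#if defined(__clang__)
    return "clang";
#elif defined(__GNUC__)
    return "gcc";
#elif defined(_MSC_VER)
    return "msvc";
#else
    return "unknown";
#endif
}

/* Value of __STDC_VERSION__, or 0 if the compiler does not define it
   (for example in C90 mode or when compiling as C++). */
long c_standard_version(void)
{
#ifdef __STDC_VERSION__
    return __STDC_VERSION__;
#else
    return 0L;
#endif
}

/* Every branch above returns a non-empty string literal. */
void check_detection(void)
{
    assert(compiler_name()[0] != '\0');
}
```

Comparing what these functions return with the -dM output for the same compiler flags is a quick way to confirm which definitions are actually in effect.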

Boost Wave (a preprocessor library that happens to include a command line driver) includes a tracing capability to trace macro expansions. It's probably a bit more than you're asking for though -- it doesn't just display the final result, but essentially every step of expanding a macro (even a very complex one).
The clang preprocessor is somewhat similar. It's also basically a library that happens to include a command line driver. The preprocessor defines a macro_iterator type and macro_begin/macro_end of that type, that will let you walk the preprocessor symbol table and do pretty much whatever you want with it (including printing out the symbols, of course).
