C code preprocessing in Perl

I am working on a C code parser in Perl.
At the moment I need to pre-process the code.
Implementing the pre-processing myself seems to be a lot of work, so I am looking for a script or library that can pre-process the file for me.
I found the following possibilities:
Text::CPP
Filter::CPP
Both of these require cpp which I don't have on my Windows machine. Are there any other options?

I'm not sure I understand your needs, but you are right that implementing this yourself is probably a poor choice. I was recently looking for alternative C preprocessors as well.
The Text::CPP module should only require a compiler to compile itself. If you can find a precompiled version, it should work for you.
The JCPP Java C Preprocessor by the same author could probably be made to work. You'd likely have to process externally and then load the result.
Filepp is an older Perl program that claims CPP compatibility. There is a precompiled Windows binary to download.
There is a brand new Lua C-Preprocessor LCPP that might be something you could work with. Probably best as a standalone, but you might be able to use Inline::Lua.
SWIG comes with its own preprocessor implementation. I presume this would be available for Windows.
What else? The Boost Wave Preprocessor might work well and is available for Windows.
The MSVC compiler can preprocess to a file (the /P option of cl.exe).
Still, the easiest and best long term solution may be to just install CPP. It comes as part of GCC, which you can get from Cygwin or MinGW.

Related

Compilation map

Let's assume a complex project (in C/C++). Is there a way to know which source files are responsible for, or used in, the creation of a specific binary without compiling the project itself?
I know I could just read the Makefile and try to follow the dependency chain like this but it's not very scalable and it could be hard if multiple Makefiles and / or implicit rules are used.
Thanks a lot for your help
PS: To clarify the first comments, I'm looking for a method which does not need to have a valid build environment (e.g. so compiling, even as a dry-run, is not an option).

is there a solution to know which sources files are responsible/used for the creation of a specific binary without compiling the project itself
If you compile with GCC (or perhaps Clang) you could use appropriate preprocessor options like -M to generate the dependencies and keep them in a text file, in a format accepted by build automation tools such as GNU make or ninja. This works well on Linux distributions like Debian.
You could also be interested in other build tools, including omake, and package managers like opam, urpmi, etc.
You could also get in touch with the Software Heritage team.
If you use GCC, you could write your own GCC plugin to maintain these dependencies in your database.
Finally, be aware of Rice's theorem, and think about crazy examples (in C++) like
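Once such a dependency file exists, reading it back is straightforward. The following is only a minimal sketch in C, assuming a single make-style rule of the form "target: dep dep \" with backslash-newline continuations; the helper name split_deps is invented for illustration, and it deliberately ignores escaped spaces, multiple rules, and Windows drive letters containing a colon.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: split the prerequisites of ONE make-style rule
   ("target: a b \<newline> c") into tokens. The buffer is modified in
   place; up to max pointers into it are stored in deps. Returns the
   number of prerequisites found. */
size_t split_deps(char *buf, char *deps[], size_t max)
{
    size_t n = 0;
    char *p;

    assert(buf != NULL && deps != NULL);

    p = strchr(buf, ':');          /* everything after the colon is a prerequisite */
    if (p == NULL)
        return 0;
    p++;

    while (*p != '\0' && n < max) {
        /* skip whitespace and the backslash of a line continuation */
        while (*p != '\0' && (isspace((unsigned char)*p) || *p == '\\'))
            p++;
        if (*p == '\0')
            break;
        deps[n++] = p;             /* start of one file name */
        while (*p != '\0' && !isspace((unsigned char)*p) && *p != '\\')
            p++;
        if (*p != '\0')
            *p++ = '\0';           /* terminate the token in place */
    }
    return n;
}
```

Given the text "foo.o: foo.c foo.h \" followed by " bar.h" on the next line, this yields the three names foo.c, foo.h and bar.h. Note that producing the dependency file in the first place still requires running the compiler's preprocessor, which the question rules out.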
#if __TIME__[0]=='1'
int something=0;
#else
constexpr int something=1;
#endif
So my current intuition is that your wish is impossible. I could have misunderstood it.
Refer to some C standard like n1570, or to some C++ standard like n3337.
Study the behavior of tools like GNU autoconf.
Think of programs generating C or C++ code like GNU bison, my manydl.c, bismon, SWIG, RefPerSys, ANTLR .... Notice that GCC has many C++ code generators (notably gengtype) and is definitely "a complex project coded in C++".
See also linuxfromscratch.

How can I compile ANSI C99-based MEX code delivered with Linux makefiles under Win64 MATLAB?

It seems I've got a real problem here due to my lack of any knowledge about Linux systems:
I have downloaded some open source code, which
is written in C
uses complex.h, so I assume it is ANSI C99
comes with makefiles designed for compilation under Linux systems
provides interfaces to IDL, MATLAB, Python etc.
I am familiar with compiling C/MEX files under Windows-based MATLAB environments, but in this case I don't even know where to start. The project is distributed in several folders and consists of dozens of source and header files. And, to begin with, the Visual Studio 2010 compiler I've used to compile MEX files until now does not comply with the C99 standard, i.e. it does not recognize the complex.h header.
Any help towards getting this project compiled would be highly appreciated. In particular, I have the following questions:
1) Is there any possibility to automatically extract compilation information from the MEX files and transfer it to Windows reality?
2) Is there a free compiler that can compile C99 code and is also easy to integrate with MATLAB?

I have done this (moved in-house legacy code inc. mex files to Win64). I can't recommend the experience.
You will have to recompile, no way around it.
The compilers supported by mex depend on your MATLAB version.
This File Exchange entry for using Pelles C may be a starting point (if it works with your version of MATLAB).
I am guessing that there is a main makefile which then works through the makefiles in the subdirectories - have a read through the instructions for compiling under Linux, it will give you some idea of what's going on and may also discuss what to do if you want to change compiler. Once you've found a compatible compiler, the next stage is to understand what the makefiles are doing and edit them accordingly (change paths, compiler, compiler flags, etc.)
Then, from memory (it was a while ago), you get to enjoy a magical mystery tour through increasingly obscure compiler errors. Document everything because if you do get it working, you won't be in a mood to do this twice.

MATLAB R2016b on Windows now supports the MinGW compiler. I'm successfully using this to compile code written primarily for Linux/gcc. I installed this from the Add-On menu in MATLAB (search MinGW).
For my case, I'm building with the legacy code tool. The only thing I needed to do differently than normal was to tell the compiler to support c99 via a compiler flag. This does the trick:
legacy_code('compile', def, {'CFLAGS=-std=c99'})
I had trouble getting the flag command just right (I had some extra quotes that apparently broke things), and asked The MathWorks, so credit is due to their support team for this.
If you are using mex, I would expect to do something very similar.
I would guess that the makefiles are irrelevant for your application; you will need to tell the mex or legacy_code function about all of the files necessary to build the whole application or link against pre-built libraries (which it sounds like you don't have).
I hope this helps!

Code refactoring tools for C, usable on GNU/Linux? FOSS preferable

Variations of this question have been asked, but not specific to GNU/Linux and C. I use Komodo Edit as my usual Editor, but I'd actually prefer something that can be used from CLI.
I don't need C++ support; it's fine if the tool can only handle plain C.
I really appreciate any direction, as I was unable to find anything.
I hope I'm not forced to 'roll' something myself.
NOTE: Please refrain from mentioning vim; I know it exists and what its capabilities are. I purposefully choose to avoid vim, which is why I use Komodo (or nano on the servers).

I don't think that a pure console refactoring tool would be nice to use.
I use Eclipse CDT on Linux to write and refactor C code.
There is also Xrefactory for Emacs (http://www.xref.sk/xrefactory/main.html), if a non-console refactoring tool is OK for you as well.

C-xrefactory was an open source version of xrefactory, covering C and Java, made available on SourceForge by Marián Vittek under GPLv2.
For those interested, there's an actively maintained c-xrefactory fork on GitHub:
https://github.com/thoni56/c-xrefactory
The goal of the GitHub fork is to refactor c-xrefactory itself, add a test suite, and try to document the original source code (which is rather obscure). Maybe, in the future, also convert it into an LSP C language server and refactoring tool.
C-xrefactory works on Emacs; setup scripts and instructions can be found at the repository. Windows users can run it via WSL/WSL2.

You could consider coding a GCC plugin or a MELT extension (MELT is a domain-specific language to extend GCC) for your needs.
However, such an approach would take you some time, because you'll need to understand some of GCC's internals.

For Windows only, and not FOSS but you said "any direction..."
Our DMS Software Reengineering Toolkit with its C Front End can apply transformations to C source code. DMS can be configured to carry out custom, complex, reliable transformations, although the configuration isn't as easy as typing just a command like "refactor frazzle by doobaz".
One of the principal stumbling blocks is still the preprocessor. DMS can transform code that has preprocessor directives in typical places (around statements, expressions, if/for/while loop heads, declarations, etc.) but other "unstructured conditionals" give it trouble. You can run DMS by expanding the preprocessor directives out of existence, or more importantly, expanding out only the ones that give it trouble, but mostly people don't like this because they prefer to keep their preprocessor directives. So it isn't perfect.
[Another answer suggested Coccinelle, which looks pretty good from my point of view. As far as I know, it doesn't handle preprocessor directives at all; I could be wrong and it might handle some cases as DMS does, but I'm sure it can't handle all the cases].
You don't want to consider rolling your own. Building a transformation/refactoring tool is much harder than you might guess having never tried it. You need full, accurate parsers for the (C) dialect of interest and just that is pretty hard to get right. You need a preprocessor, symbol tables, flow analysis, transformation, code regeneration machinery, ... this stuff takes years of effort to build and get right. Trust me, been there, done that.

How can I best check for C library dependencies?

I'm building something that installs a high-level stack, and to do that, I need to install the lower-level stuff.
The simplest way to check whether, say, Java is installed is to just shell out to which java in a shell script and check if it can find it. I'm now at the point where I need to check for some libraries without an obvious binary, basically stuff that is an include from within C. libxml, for example.
I'm woefully green to C in general, so this makes things a little tricky for me. :) Ideally I could just make a shell script that calls a little C application that calls #include <xxxx>, where xxxx is the library that I'm checking the existence of. If it can't find it, it errors out. Unfortunately, of course, all that happens prior to compilation, so it's not as dynamic as I'd like.
I'm doing this on a system that probably doesn't have anything installed on it (be it high-level language or package managers or what have you), so I'm looking more for a basic shell script way of doing things (or maybe some clever C or command-line gcc options). Or maybe just manually search the include paths that gcc would look in anyway (/usr/local/include, /usr/include, etc.). Any thoughts?
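One way to make the "little C application" idea work at compile time is the __has_include preprocessor operator, which is standard in C23 and C++17 and is offered as an extension by recent GCC and Clang. The sketch below is an assumption-laden illustration: the macro and function names are invented, and the header name libxml/parser.h is only an example (libxml2 headers often live in a subdirectory that needs an extra -I option, so a result of 0 does not prove the library is absent).

```c
/* Guard the use of __has_include so that compilers without it
   still accept this file; they report "unknown" instead. */
#if defined(__has_include)
#  if __has_include(<libxml/parser.h>)
#    define HAVE_LIBXML_PARSER_H 1      /* header is on the search path */
#  else
#    define HAVE_LIBXML_PARSER_H 0      /* header was not found */
#  endif
#else
#  define HAVE_LIBXML_PARSER_H (-1)     /* compiler cannot tell us */
#endif

/* Hypothetical accessor: 1 if found, 0 if not found, -1 if unknown. */
int have_libxml_parser_h(void)
{
    return HAVE_LIBXML_PARSER_H;
}
```

A small main that exits with a non-zero status when this function returns 0 would let a shell script test the result, but it still needs a working compiler on the target machine.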
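The manual search mentioned above could be sketched in C roughly as follows. This is a minimal illustration, not a replacement for a real configure check: the function name find_header is invented, the directory list is supplied by the caller rather than queried from gcc, and a path that can be opened for reading is simply treated as present.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: look for header `name` in each of the `ndirs`
   directories in order. On success the full path is written to `out`
   and 1 is returned; otherwise `out` is emptied and 0 is returned. */
int find_header(const char *name, const char *const dirs[], size_t ndirs,
                char *out, size_t outsz)
{
    size_t i;

    assert(name != NULL && out != NULL && outsz > 0);

    for (i = 0; i < ndirs; i++) {
        FILE *fp;
        int len = snprintf(out, outsz, "%s/%s", dirs[i], name);

        if (len < 0 || (size_t)len >= outsz)
            continue;               /* path would not fit; skip this directory */
        fp = fopen(out, "r");
        if (fp != NULL) {           /* readable, so treat it as present */
            fclose(fp);
            return 1;
        }
    }
    out[0] = '\0';
    return 0;
}
```

Calling it with the directories /usr/local/include and /usr/include and a name such as libxml/parser.h mirrors the search order described in the question; the compiler's real search path may contain more directories than these.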

Autotools is really what you need. It's a huge (and bizarre) framework for dealing with this very problem:
http://www.gnu.org/software/autoconf/
You can also use pkg-config, which will work with newer software making use of that mechanism:
http://pkg-config.freedesktop.org/wiki/

This is the purpose of configure (part of automake and autoconf).

Tool to determine symbol origin in C

I'm looking for a tool that, given a bit of C, will tell you what symbols (types, preprocessor definitions, functions, etc.) are used from a given header file. I'm doing a port of a large driver from Solaris to Windows and figuring out where things are coming from is getting to be difficult, so this would be a huge help. Any ideas?
Edit: Not an absolute requirement, but tools that work on Windows would be a plus.
Edit #2: To clarify what I'm trying to do, I have a codebase I'm trying to port, which brings in a large number of headers. What I'd like is a tool that, given foo.c, will tell me which symbols it uses from bar.h.

I like KScope, which copes with very large projects.

I use the following on both Linux and Windows:
gvim + ctags + cscope.
The same environment will work on Solaris as well, though this of course forces you to use vim as your editor. I'm pretty sure that Emacs can work with both ctags and cscope as well.
You might want to give vim a try. It's a bit hard at first, but soon you won't be able to work any other way. It's the most efficient editor (IMHO).
Comment reply:
Look at the cscope man page:
...
Find functions called by this function:
Find functions calling this function:
...
I think it's exactly what you are looking for ... Please clarify if not.
Comment reply 2:
OK, now I understand you. The tools I suggested can help you understand code flow and find where a certain symbol is defined, but they are not what you are looking for.
This is not what you asked for, but since we are talking, I have some experience with porting and drivers (feel free to ignore).
It seems like the compiler is good enough for your task. Start with the original file and let the compiler tell you which parts are missing. You will end up writing a lot of empty stubs, but you will get your code compiled.
At least in the beginning, I suggest you create a lot of stubs and modify the original code as little as possible. Later on, once you get it working, you can optimize.
It might be more complex depending on the type of driver you are porting (I'm assuming a kernel driver), as the Windows and Solaris subsystems are not very alike. We do have a driver working on both Solaris and Windows, but it was designed to be multi-platform from the beginning.
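As a rough illustration of the stub approach, a placeholder for one missing platform routine might look like the sketch below. The function name and signature are invented for this example and do not correspond to any real Solaris or Windows API; the call counter is likewise just one possible way to see which stubs are still being hit.

```c
#include <stdio.h>

/* Number of times any stub was called, so a test run can show
   which missing pieces are actually exercised. */
static unsigned long stub_calls;

/* Hypothetical stand-in for a platform routine that the original
   driver calls; name and signature are made up for illustration. */
int hypothetical_platform_lock(void *handle)
{
    (void)handle;                   /* unused until a real implementation exists */
    stub_calls++;
    fprintf(stderr, "STUB: %s is not implemented yet\n", __func__);
    return 0;                       /* pretend success so callers keep working */
}

/* Report how many stub calls have happened so far. */
unsigned long stub_call_count(void)
{
    return stub_calls;
}
```

Each linker or compiler error about a missing symbol then becomes one more stub of this shape, which is replaced by a real implementation once the code builds.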

emacs and etags.
And I leverage make to run the tag indexing for me; that way I can index a large project with one command. I've been thinking about building a master index and separate module indices, but haven't gotten around to implementing this yet...
@Ilya: Would pistols at dawn be acceptable?

Try doxygen. It can produce graphs and/or HTML and is highly customizable.
