Libraries that parse code written in C and provide an API - c

I am implementing a proof of concept application for source-to-source transformation and need a C-parser with an API for manipulating/traversing the C-syntax tree (AST).
I have tried to use clang but I ran into various problems, like not being able to compile the tutorials using libclang, wrong architecture etc. Since this is a proof of concept application, I will defer clang to a different date.
Question
What are some software/libraries (implemented in any language) which can parse C code and which provide an API so I can build applications on top of them. I looked around, but I could not locate any free parsers.
The platforms I can use are anything on Windows or Mac or Linux, and any parsers written in C/C++/Java/Perl/Python/PHP will work.

You could try one of the available grammars for ANTLR. ANTLR has support for creating tree walkers and you can walk/manipulate the AST manually if necessary. ANTLR V3 has several grammars available including a C preprocessor, ANSI C and GNU C.

Related

Using TinyCC (tcc) to generate a C wrapper for V

I am trying to find some basis I can use to generate wrappers/bindings for C libraries to be used from Vlang and whilst doing so, I remembered that initially, V uses TCC for it's bootstrap compilation.
Since TCC is a very, very capable C compiler, I wondered if it was possible to utilize this and make this a way to generate wrappers and bindings by using TCC's built in parser/lexer to generate a symbol table of structs, functions, enums and the like and then iterate over said table to generate V code.
Judging from reading tcc.h, the API described here is usable, but I wouldn't be surprised if it was declared internal and thus not fully documented. Where can I find more information about how I could use TCC as a plain parser?
I'm sure you've already found some information regarding this, but for posterity, here are some places with information about TCC and using it as a dynamic code generator:
The TCC Git Repo
The Gnu project page
The TCC Development Archive

Include libraries in other languages in a C application

My general question is this: what's the most common way to include libraries in other languages in a C application?
For example, if I have a Ruby library intended for doing function X, and a Python library for doing function Y, how can I write a program in C (the language, that is) that uses the functions in each?
I've seen wrappers that give access to C libraries in these higher languages, but are there wrappers that go the other way? Is there a common way of handling this in general?
Are these native-code libraries (i.e. have they been compiled?) Or are these source libraries (i.e. a bunch of text files containing Ruby source code)?
If the former, libraries in a language like Ruby or Lua or so usually have a published binary interface ("ABI"). This is low-level documentation that describes how their libraries and its functions work under the hood. Often, those are defined in C or C++, or whatever language was used to implement the interpreter/compiler for Ruby itself.
So you'd have to find that documentation, and find out how to call the parts you are interested in. Some languages even use the same ABI as C does, and you just need to create a header file that matches the contents of the library and you can call it directly (This is how you integrate e.g. assembler and C, or even C++, which you can get to generate straight C functions).
If the latter, you usually need to find an embeddable version of the language, and find out how to run a script from inside your application (This is how Lua is usually used, for example).
But are you sure you need the given Ruby libraries? Often, common libraries are implemented using a C or C++ library under the hood, and then just wrapped for scripting languages, so you can just skip the scripting translation layer and use the (maybe slightly more low-level) library yourself.
PS - there are also automatic wrapper generators, like SWIG, that will read a file in one language and write the translation code for you.

What are resources for building a static analyzer for C in C?

I have a school project to develop a static analyzer in C for C.
Where should I start? What are some resources which could assist me?
I am assuming I will need to parse C, so what are some good parsers for C or tools for building C parsers?
I would first take yourself over to antlr, look at its getting started guide, it has a wealth of information about parsing etc.., I personally use antlr as it gives a choice of code generation targets.
To use antlr you need a c or c++ grammar file, pick of these up and start playing.
Anyway have fun with it..
Probably your best starting point would by Clang (with the proviso that it already has a static analyzer, so unless you want to write one for its own sake, you might be better off using/enhancing the existing one).
Are you sure that you want to write the analyzer in C?
If you were using a modern langauge (e.g. C#, Java, Python), then I would second spgennard's suggestion of ANTLR for the parser.
If writing the analyzer in C is a requirement then you are stuck with lex and yacc (flex and bison) or maybe a hand-crafted parser.
Looks like Uno comes close to what you want to do. It uses lex/yacc and includes the grammar files. The analysis part however is written in C++.
Maybe you can get some more ideas about the how and what from tools listed at SpinRoot. Wikipedia also has some good info.
Parsing is the easiest and least important part of a static analyser. Antlr was already suggested, it should be sufficient for parsing plain C (but not C++). Just a little tip - do not implement your own preprocessor, better reuse the output of gcc -E.
As for the rest, you can take a look at some of the existing analysers sources, namely Clang and CIL, read about an SSA representation and abstract interpretation. Choosing the right intermediate representation for your code is a key.
I doubt it can be an easy task in plain C, so you'd probably end up implementing some sort of DSL on top of it to handle ASTs and transforms. Sounds like something much bigger than a typical school project.

How to use multiple development languages

I program in Delphi (D7 and D2006) on Windows XP (migrating in the near future to Windows 7). I need to use a mathematical library for some of the work I am doing and most of the math libraries (I am inclining towards Mathematica at present) I have looked at will produce compiled C code. Such code will provide specific functionality to my main programs.
I have a very basic question - given this development setup - how do I start utilising the compiled c code from Delphi? I really need baby steps to get me started on the process.
I've done quite a bit of this with my FE product OrcaFlex. You have two options to link to your C code from Delphi: static or dynamic. I link statically because it makes distribution and versioning much easier. But it's really quite a trick to get it to work statically and you have to rely on a number of undocumented aspects of Delphi.
I suspect that for your needs dynamic linking is best. Basically you need to compile and link your C code into a DLL. I recommend using the Borland C compiler to do this. You can use the free command line version BCC55 to do this. The advantage of using Borland C is that it makes the same assumptions about the 8087 floating point unit as Delphi does. If you build with MSVC then you will find that MS have elected not to raise floating point exceptions. Borland C does raise floating point exceptions. This is a bit of a corner case but it becomes relevant if you are trying to ship a product that you need to be robust.
You should know that the C code will, by default, use the C calling convention and I'd just stick with that. You bring it into Delphi by declaring the external routine as cdecl calling convention.
The other thing you need to take care on is defining a clear interface between the two modules. You need to make sure that exceptions don't cross the module boundary and that you don't pass any special types (e.g. Delphi strings) across the boundary. So for a string use a PChar (or even better PAnsiChar or PWideChar to be sure that it won't change meaning when you upgrade to Delphi 2009 and later).
I have been very happy with the SDL Library from Lohninger (http://www.lohninger.com/mathpack.html). It is written in Delphi and compiles right into your application, so there are no bundling or calling convention problems or floating point usage differences, as discussed by other responses in this thread.
Take a look at what he includes. If you're lucky, your needs will be met by his library and you'll be able to use it!
If you currently have Mathematica installed, go to the documentation centre and lookup guide/CLanguageInterface otherwise that guide is available on the web and have a good read there.
My understanding is that Mathematica can generate C-programs that link up with the Mathematica engine via MathLink if you need full function, or if you only need lower-level features then it is capable of generating code that can be statically linked with compiled Mathematica libraries. So that standalone code is possible.
See the Code Generator documentation.
If you can convert the C programs in to DLLs, then accessing such external functions from Delphi is relatively simple with external declarations.
function MathematicaRoutine(const x : double) : double; external 'MyInterface.dll';
There are bound to be a great number of complexities in getting this to work if you need to achieve a static bind, for use where Mathematica is not installed, if indeed it is possible. I have never attempted it.
You can mix your project with Delphi and C++ (Builder) code using RAD Studio. Put the automatically created C code into a C++ Builder file (.cpp) and for the rest add Delphi files.

Swig alternatives for Ruby?

I am looking to generate ruby modules from existing C libraries.
In the past, I have used Swig, and found that it was a painful task. I just want to check if there's something better for Ruby out there, and any gotchas.
Just need to evaluate choices, so even a simple url pointing me to the site will do!
In the past, the go-to method for binding Ruby to C (or C to Ruby, it doesn't really matter) was writing an MRI C extension by hand. SWIG basically automates that, but in a really crappy way, so that writing it by hand is usually easier, faster, more performant.
However, there is a significant problem with MRI C extensions: they are MRI C extensions. This was fine, when MRI was the only Ruby Implementation, but now we have three production-ready Ruby Implementations with another two on the way in the next couple of weeks and yet another two or three to be released later this year.
Of course, there is another problem with MRI C extensions: you have to write them in C.
A better solution is the DL library in the Ruby standard library, which allows you to bind to a dynamic library (.dll, .so, .dylib) at runtime, in pure Ruby. Unfortunately, it's pretty badly documented and because of that, it is not very well supported (or completely unsupported) by several Ruby Implementations: how are you going to provide a compatible implementation if there is no documentation of what "compatible implementation" means?
Rubinius introduced the Rubinius Foreign Function Interface (FFI), which is much easier to use than DL, much easier to implement for Ruby VM writers and fully documented, specified and tested. JRuby immediately copied the API, Wayne Meissner wrote two C extensions for MRI and YARV, tinyrb supports it, IronRuby, MacRuby and MagLev will pretty soon.
So, if you use FFI, you won't have to write a single line of C, and your library will automatically work on MRI, YARV, JRuby and Rubinius and in the future also on IronRuby, MacRuby and MagLev.

Resources