Lightweight and extensible C compiler front-end - c

Does anyone know of a lightweight C compiler front-end? I really just need lexing, parsing, and semantic checks; I want to do code generation and static analysis on top of that. Thanks in advance,
Alberto

Try TCC: http://bellard.org/tcc/

Check out clang, an open source front end to LLVM.

Related

How to make use of Clang's AST?

I am looking at using Clang's AST for my C code to do some analysis over it. Pointers on where to start, how to obtain Clang's AST, tutorials, or anything in this regard would be a great help!
I have been trying to find some and I got this link, which was created 2 years back. But for some reason it is not working for me. The sample code in the tutorial gives me too many errors, so I am not sure whether I built the code incorrectly or something is wrong with the tutorial. I would be happy to start from some other page as well.
Start with the tutorial linked by sharth. Then go through Clang's Doxygen. Start with SemaConsumer.
Read a lot of source code. Clang is a moving target. If you are writing tools based on clang, then you need to recognize that clang is adding and fixing features daily, so you should be prepared to read a lot of code!
You probably want the stable C API provided in the libclang library, as opposed to the unstable C++ internal APIs that others have mentioned.
The best documentation to start with currently is the video/slides of the talk, "libclang: Thinking Beyond the Compiler" available on the LLVM Developers Meeting website.
However, do note that the stability of the API comes at a cost of comprehensiveness. You won't be able to do everything with this API, but it is much easier to use.
To obtain the AST and get to know the stages of the frontend, there is a frontend chapter in the book "LLVM core libraries". Basically the flow is as follows (in the case of llvm-4.0.1; it should be similar for later versions):
cc1_main.cpp:cc1_main (ExecuteCompilerInvocation)
CompilerInstance.cpp:CompilerInstance::ExecuteAction
ParseAST.cpp:clang::ParseAST (Consumer->HandleTranslationUnit(S.getASTContext()))
CodeGenAction.cpp:HandleTranslationUnit
The last function handles the whole translation unit (top-level decls are already handled at this point) and calls EmitBackendOutput to do the backend work. So this function is a good spot to do something with the complete AST before backend output is emitted.
In terms of how to manipulate the AST, clang has some basic tutorial on this: http://clang.llvm.org/docs/RAVFrontendAction.html.
Also look at ASTDumper.cpp. It's the best example of visiting the AST.
Another good tutorial: https://jonasdevlieghere.com/understanding-the-clang-ast/ teaches you how to find a specific call expr in the AST via three different approaches.
I find the ASTUnit::LoadFromCompilerInvocation() function the easiest way to construct the AST.
This link may give you some ideas http://comments.gmane.org/gmane.comp.compilers.clang.devel/12471

What are resources for building a static analyzer for C in C?

I have a school project to develop a static analyzer in C for C.
Where should I start? What are some resources which could assist me?
I am assuming I will need to parse C, so what are some good parsers for C or tools for building C parsers?
I would first head over to ANTLR and look at its getting started guide; it has a wealth of information about parsing. I personally use ANTLR because it gives a choice of code generation targets.
To use ANTLR you need a C or C++ grammar file; pick one of these up and start playing.
Anyway have fun with it..
Probably your best starting point would be Clang (with the proviso that it already has a static analyzer, so unless you want to write one for its own sake, you might be better off using/enhancing the existing one).
Are you sure that you want to write the analyzer in C?
If you were using a modern language (e.g. C#, Java, Python), then I would second spgennard's suggestion of ANTLR for the parser.
If writing the analyzer in C is a requirement then you are stuck with lex and yacc (flex and bison) or maybe a hand-crafted parser.
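To give an idea of what the hand-crafted route looks like, here is a minimal sketch of a hand-written tokenizer in C. It only covers identifiers, decimal integers, and single-character punctuation, so it is nowhere near a full C lexer (no comments, strings, floats, or multi-character operators). All the names (Token, next_token, count_tokens) are made up for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Token kinds for a tiny C-like subset (illustrative only). */
typedef enum { TOK_EOF, TOK_IDENT, TOK_NUMBER, TOK_PUNCT } TokKind;

typedef struct {
    TokKind kind;
    const char *start; /* points into the source buffer, not copied */
    size_t len;        /* length of the token text in bytes */
} Token;

/* Scan one token starting at *cursor and advance *cursor past it. */
Token next_token(const char **cursor)
{
    assert(cursor != NULL && *cursor != NULL);
    const char *p = *cursor;

    /* Skip leading whitespace. */
    while (isspace((unsigned char)*p))
        p++;

    Token t = { TOK_EOF, p, 0 };
    if (*p == '\0') {
        /* End of input: kind stays TOK_EOF. */
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        t.kind = TOK_IDENT;
        while (isalnum((unsigned char)*p) || *p == '_')
            p++;
    } else if (isdigit((unsigned char)*p)) {
        t.kind = TOK_NUMBER;
        while (isdigit((unsigned char)*p))
            p++;
    } else {
        t.kind = TOK_PUNCT; /* any other single character */
        p++;
    }

    t.len = (size_t)(p - t.start);
    *cursor = p;
    return t;
}

/* Count the tokens in src, excluding the final TOK_EOF. */
size_t count_tokens(const char *src)
{
    size_t n = 0;
    while (next_token(&src).kind != TOK_EOF)
        n++;
    return n;
}
```

You would call next_token in a loop from a recursive-descent parser, keeping one token of lookahead. Keywords can be handled by looking up TOK_IDENT text in a table afterwards.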
Looks like Uno comes close to what you want to do. It uses lex/yacc and includes the grammar files. The analysis part however is written in C++.
Maybe you can get some more ideas about the how and what from tools listed at SpinRoot. Wikipedia also has some good info.
Parsing is the easiest and least important part of a static analyser. Antlr was already suggested, it should be sufficient for parsing plain C (but not C++). Just a little tip - do not implement your own preprocessor, better reuse the output of gcc -E.
As for the rest, you can take a look at the sources of some existing analysers, namely Clang and CIL, and read about SSA representation and abstract interpretation. Choosing the right intermediate representation for your code is key.
I doubt it can be an easy task in plain C, so you'd probably end up implementing some sort of DSL on top of it to handle ASTs and transforms. Sounds like something much bigger than a typical school project.
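To make the "ASTs and transforms in plain C" point concrete, here is a rough sketch of a tagged-struct expression tree with one transform (constant folding). This is an illustration under my own assumptions, not how Clang or CIL represent code; the node kinds and the function names (make_num, make_var, make_bin, fold, free_tree) are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { NODE_NUM, NODE_VAR, NODE_ADD, NODE_MUL } NodeKind;

typedef struct Node {
    NodeKind kind;
    long value;             /* used when kind == NODE_NUM */
    char name;              /* used when kind == NODE_VAR */
    struct Node *lhs, *rhs; /* used when kind is NODE_ADD or NODE_MUL */
} Node;

static Node *new_node(NodeKind kind)
{
    Node *n = calloc(1, sizeof *n);
    assert(n != NULL); /* sketch only: real code should report the error */
    n->kind = kind;
    return n;
}

Node *make_num(long v)  { Node *n = new_node(NODE_NUM); n->value = v; return n; }
Node *make_var(char c)  { Node *n = new_node(NODE_VAR); n->name = c;  return n; }

Node *make_bin(NodeKind kind, Node *lhs, Node *rhs)
{
    assert((kind == NODE_ADD || kind == NODE_MUL) && lhs && rhs);
    Node *n = new_node(kind);
    n->lhs = lhs;
    n->rhs = rhs;
    return n;
}

/* Constant folding: rewrite any binary node whose operands are both
 * numbers into a single number node, working bottom-up. Overflow is
 * not checked in this sketch. */
Node *fold(Node *n)
{
    if (n->kind == NODE_NUM || n->kind == NODE_VAR)
        return n;

    n->lhs = fold(n->lhs);
    n->rhs = fold(n->rhs);

    if (n->lhs->kind == NODE_NUM && n->rhs->kind == NODE_NUM) {
        long a = n->lhs->value, b = n->rhs->value;
        long v = (n->kind == NODE_ADD) ? a + b : a * b;
        free(n->lhs);
        free(n->rhs);
        n->kind = NODE_NUM;
        n->value = v;
        n->lhs = n->rhs = NULL;
    }
    return n;
}

void free_tree(Node *n)
{
    if (n == NULL)
        return;
    free_tree(n->lhs);
    free_tree(n->rhs);
    free(n);
}
```

Even this toy shows why the boilerplate grows quickly in C: every new node kind needs a constructor, a case in each transform, and cleanup code, which is what pushes people toward a small DSL or code generator.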

Where can I find the string.c file itself (to read it)?

I'm interested in reviewing some of the functions included in the string library for c, but I can't find the source code (literally, i.e. functions and all) online and I have no idea where to find it.
Thanks.
EDIT:
the link provided by pmg shows those functions divided into c files, so it's about what I needed. Thanks.
Take a look at redhat glibc. It appears to be somewhat current.
You'll find it in the source code of the gcc compiler.
http://www.gnu.org/software/gcc/
It is usually included with the compiler you install, so this may vary. It also depends on the operating system you're running. If you're using Windows, I recommend running a Windows search for string.c; if you're running Linux, you can use the find command.
Disregard the file I linked to prior to this edit. I should have verified the code before sending it. It didn't apply to your question. Sorry
Maybe you're looking for GNU C string.h?
You can check the source in any standard libc implementation http://www.gnu.org/software/libc/

Is there a bundled library for regular expressions in MSVC?

If I'm compiling a C program with gcc, I can safely assume that the functions in regex.h are available. Is there a regex library I can assume is there if someone is compiling with Microsoft's C compiler?
C++ only, but may be something you can use (or wrap):
Visual C++ 2010 includes the TR1 regex library support.
http://msdn.microsoft.com/en-us/library/bb982382.aspx
It's also available for VC++ 2008 in a feature pack:
http://www.microsoft.com/downloads/details.aspx?FamilyId=D466226B-8DAB-445F-A7B4-448B326C48E7&displaylang=en
No, I don't think MSVC comes bundled with any regex library.
Regex isn't part of the C/C++ standard library, so you shouldn't rely on any compiler providing such a library by default. It's best to get hold of a separate regex library for C (I'm sure there are tons available) and include it with your code.
Try Boost or wait for release of C++1x...
There's no C/C++ regexp library bundled with MSVC. C++/CLI has access to the .NET regexp classes though.
Perhaps you can use PCRE
If you want POSIX-compatible regular expression semantics (and the same API too!) then the best regex library is TRE: http://laurikari.net/tre/
Unlike most regex implementations, it follows POSIX exactly in regards to the matches it returns for parenthesized subexpressions, and it's O(n) whereas most implementations are O(2^n) in time.
Google also has a new regex implementation that uses Perl-compatible syntax if you prefer that. You can find a link on the TRE website.
Edit: By the way, TRE seems to come with project files to build it under MSVC.
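For reference, this is roughly what code against the POSIX regex.h API looks like. It is a minimal sketch using only regcomp, regexec, and regfree; the wrapper name regex_matches is made up. Since TRE advertises the same API, the assumption is that the same code would build against TRE's header on MSVC, but I have not verified that here.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Returns 1 if text matches the POSIX extended regex in pattern,
 * 0 if it does not match, and -1 if the pattern fails to compile. */
int regex_matches(const char *pattern, const char *text)
{
    assert(pattern != NULL && text != NULL);

    regex_t re;
    /* REG_NOSUB: we only need a yes/no answer, not submatch offsets. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    int rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

If you match many strings against the same pattern, compile it once with regcomp and reuse the regex_t instead of recompiling on every call as this wrapper does.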

Good C library to parse apache log file

I would like to parse a lot of apache log files with a C library.
Who knows a good C library which is optimized for good performance?
Thank you for advice!
Have you tried MergeLog? You can use the parser that comes with it.
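If you end up rolling your own instead, a single line can be picked apart with sscanf. The sketch below assumes the log uses Apache's Common Log Format (host, ident, user, [timestamp], "request", status, bytes); that format is my assumption, since the question does not say which LogFormat is configured. The LogEntry struct and parse_clf_line are hypothetical names, and this is not MergeLog's parser.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char host[64];
    char timestamp[32];
    char method[8];
    char path[256];
    int status;
    long bytes; /* 0 when the log shows "-" */
} LogEntry;

/* Parse one Common Log Format line into *out.
 * Returns 1 on success, 0 if the line does not fit the assumed layout. */
int parse_clf_line(const char *line, LogEntry *out)
{
    assert(line != NULL && out != NULL);

    char bytes_field[24];
    /* Field widths keep sscanf from overflowing the fixed buffers.
     * %*s skips ident and user; %*[^"] skips the protocol version. */
    int n = sscanf(line,
                   "%63s %*s %*s [%31[^]]] \"%7s %255s %*[^\"]\" %d %23s",
                   out->host, out->timestamp, out->method, out->path,
                   &out->status, bytes_field);
    if (n != 6)
        return 0;

    out->bytes = (strcmp(bytes_field, "-") == 0)
                     ? 0
                     : strtol(bytes_field, NULL, 10);
    return 1;
}
```

sscanf is convenient but not the fastest option; if performance on very large logs matters, a hand-written scanner over a memory-mapped file is the usual next step. Requests without a protocol version or with very long paths will be rejected by this sketch.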
