Which Eiffel compilers use Earley parsing - eiffel

I stumbled upon this post http://compilers.iecc.com/comparch/article/02-04-096
that says there are two Eiffel compilers using Earley parsing. The post is quite old.
I wonder if anyone here knows which Eiffel compilers use Earley parsers and if they are
still in use? Links are highly appreciated.

The modern Eiffel compilers used in production, EiffelStudio from Eiffel Software and gec from the Gobo Eiffel Project (both open source), parse Eiffel code with parsers generated by geyacc from parser description files (there is one each for EiffelStudio and Gobo). geyacc is a parser generator similar to GNU bison: it converts a description of an LALR(1) context-free grammar into a parser, but it is adapted to produce Eiffel code that is type-safe and void-safe. Neither compiler uses an Earley parser.
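For readers who have not met the technique the question asks about, here is a minimal sketch of an Earley recognizer. It is not taken from any Eiffel compiler; the function name, the grammar encoding (a dict from nonterminal to a list of tuples) and the toy grammar are all made up for illustration, and the sketch assumes the grammar has no empty productions. Unlike an LALR(1) parser, it accepts ambiguous and left-recursive grammars as they are.

```python
def earley_recognize(grammar, start, tokens):
    """Return True if `tokens` can be derived from `start`.

    `grammar` maps each nonterminal to a list of right-hand sides (tuples).
    Assumption: no right-hand side is empty.
    """
    # An item is (lhs, rhs, dot, origin): how much of rhs has been matched,
    # and the input position where the match started.
    chart = [set() for _ in range(len(tokens) + 1)]
    for rhs in grammar[start]:
        chart[0].add((start, rhs, 0, 0))

    for i in range(len(tokens) + 1):
        worklist = list(chart[i])
        while worklist:
            lhs, rhs, dot, origin = worklist.pop()
            if dot < len(rhs):
                symbol = rhs[dot]
                if symbol in grammar:
                    # Predictor: expand the nonterminal after the dot.
                    for production in grammar[symbol]:
                        item = (symbol, production, 0, i)
                        if item not in chart[i]:
                            chart[i].add(item)
                            worklist.append(item)
                elif i < len(tokens) and tokens[i] == symbol:
                    # Scanner: the terminal after the dot matches the input.
                    chart[i + 1].add((lhs, rhs, dot + 1, origin))
            else:
                # Completer: advance every item that was waiting for `lhs`.
                for l2, r2, d2, o2 in list(chart[origin]):
                    if d2 < len(r2) and r2[d2] == lhs:
                        item = (l2, r2, d2 + 1, o2)
                        if item not in chart[i]:
                            chart[i].add(item)
                            worklist.append(item)

    return any(
        lhs == start and dot == len(rhs) and origin == 0
        for lhs, rhs, dot, origin in chart[len(tokens)]
    )


# Hypothetical toy grammar: ambiguous and left-recursive on purpose.
TOY_GRAMMAR = {"E": [("E", "+", "E"), ("n",)]}
```

Calling `earley_recognize(TOY_GRAMMAR, "E", ["n", "+", "n"])` accepts the input, while a trailing `+` is rejected. bison-style generators such as geyacc would report conflicts for a grammar like this unless precedence is declared.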

Related

Is there an external parser generator tool used for building Alloy language parser

Have the Alloy developers used a parser generator tool (like ANTLR) for parsing Alloy specifications, or is the parser hand-written specifically for the Alloy language?
If they used an external tool for the Alloy parser implementation, how can I find further information about it (for example, the grammar that is fed into the parser generator)?
Alloy uses a modified version of CUP (which is shipped with the Alloy distribution). You can find the grammar specification files (Alloy.lex and Alloy.cup) inside the edu.mit.csail.sdg.alloy4compiler.parser package. In the same package there are some bash scripts used to generate corresponding lexer/parser classes.
http://alloy.mit.edu/alloy/documentation/book-chapters/alloy-language-reference.pdf
Section B.3 has the grammar.
Can't say anything about the language implementation.

Libraries that parse code written in C and provide an API

I am implementing a proof of concept application for source-to-source transformation and need a C-parser with an API for manipulating/traversing the C-syntax tree (AST).
I have tried to use clang but I ran into various problems, like not being able to compile the tutorials using libclang, wrong architecture etc. Since this is a proof of concept application, I will defer clang to a different date.
Question
What are some software packages or libraries (implemented in any language) that can parse C code and provide an API so I can build applications on top of them? I looked around, but I could not locate any free parsers.
The platforms I can use are anything on Windows or Mac or Linux, and any parsers written in C/C++/Java/Perl/Python/PHP will work.
You could try one of the available grammars for ANTLR. ANTLR has support for creating tree walkers and you can walk/manipulate the AST manually if necessary. ANTLR V3 has several grammars available including a C preprocessor, ANSI C and GNU C.

Parsing source code

I need to parse the source code of different files, each written in a different language, and I would like to do this using C.
To do that, I was thinking of using yacc / lex, but I find them very hard to understand, maybe due to the complete lack of decent documentation (either that, or they really are cryptic).
So my questions are: where can I find some good documentation for yacc / lex, preferably a tutorial style introduction? Or, is there any better way to do this in C? Maybe there's something else I could use instead of yacc / lex, perhaps even written in a different language?
yacc and lex are very powerful tools, built around the theories for compiler construction. To be able to fully understand them you probably need some basics in formal languages, automata theory and compiler construction.
The dragon book is a classic on the subject.
The second half of Kernighan and Pike's The Unix Programming Environment is an extended introduction to programming an interpreter with lex and yacc. The lex coverage is a little light, as they mostly use a custom scanner.
If you like math (the most important clause in this answer), then write your own compiler-compiler, and then write your compiler with that. I did this once because I was getting bored of writing all the functions for all the productions of a compiler which I had started as a recursive-descent compiler, because the available choices in 2004 didn't please me, and because I had free time while job-hunting. I only used the compiler compiler on the one project, and it is not necessarily thoroughly tested, so it is not on github. I was very happy with the grammar file syntax that I devised.
If I had such a need today I might make a different decision. The newer cutting-edge compiler-compilers seem to have changed a lot in the last 8 years.
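For anyone wondering what "writing all the functions for all the productions" looks like in a recursive-descent parser, here is a small sketch. It is in Python rather than C to keep it short, and the grammar (integer arithmetic with `+`, `-`, `*` and parentheses) and all names are invented for illustration. Each grammar production becomes one method.

```python
import re

# One token is a run of digits or a single operator/parenthesis.
TOKEN = re.compile(r"\s*(\d+|[-+*()])")


def tokenize(text):
    """Split the input into number and operator tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    # expr := term (('+' | '-') term)*
    def expr(self):
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    # term := factor ('*' factor)*
    def term(self):
        value = self.factor()
        while self.peek() == "*":
            self.advance()
            value *= self.factor()
        return value

    # factor := NUMBER | '(' expr ')'
    def factor(self):
        token = self.advance()
        if token == "(":
            value = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token is not None and token.isdigit():
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")


def evaluate(text):
    """Parse and evaluate a whole expression, rejecting trailing tokens."""
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value
```

With three productions this is pleasant; with a few hundred, as in a real language, the repetition is what pushes people towards yacc-style generators or a home-grown compiler-compiler.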

What are resources for building a static analyzer for C in C?

I have a school project to develop a static analyzer in C for C.
Where should I start? What are some resources which could assist me?
I am assuming I will need to parse C, so what are some good parsers for C or tools for building C parsers?
I would first head over to ANTLR and look at its getting-started guide; it has a wealth of information about parsing. I personally use ANTLR because it gives a choice of code generation targets.
To use ANTLR you need a C or C++ grammar file; pick one of these up and start playing.
Anyway, have fun with it.
Probably your best starting point would be Clang (with the proviso that it already has a static analyzer, so unless you want to write one for its own sake, you might be better off using/enhancing the existing one).
Are you sure that you want to write the analyzer in C?
If you were using a modern language (e.g. C#, Java, Python), then I would second spgennard's suggestion of ANTLR for the parser.
If writing the analyzer in C is a requirement then you are stuck with lex and yacc (flex and bison) or maybe a hand-crafted parser.
Looks like Uno comes close to what you want to do. It uses lex/yacc and includes the grammar files. The analysis part however is written in C++.
Maybe you can get some more ideas about the how and what from tools listed at SpinRoot. Wikipedia also has some good info.
Parsing is the easiest and least important part of a static analyser. Antlr was already suggested, it should be sufficient for parsing plain C (but not C++). Just a little tip - do not implement your own preprocessor, better reuse the output of gcc -E.
As for the rest, you can take a look at the sources of some existing analysers, namely Clang and CIL, and read about SSA representation and abstract interpretation. Choosing the right intermediate representation for your code is key.
I doubt it can be an easy task in plain C, so you'd probably end up implementing some sort of DSL on top of it to handle ASTs and transforms. Sounds like something much bigger than a typical school project.
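On the tip about reusing the output of gcc -E: the preprocessed text contains line markers of the form `# 12 "file.c" 1 3`, which let an analyser report findings against the original file and line. Below is a rough sketch of reading them. It is in Python for brevity even though the question asks for C, and the function names and the sample file name are assumptions made for this example.

```python
import re
import subprocess

# GCC line marker: '# <line> "<file>" <optional numeric flags>'
LINEMARKER = re.compile(r'^#\s+(\d+)\s+"([^"]*)"((?:\s+\d+)*)\s*$')


def run_preprocessor(path):
    """Return the output of `gcc -E` for `path` (requires gcc on PATH)."""
    result = subprocess.run(
        ["gcc", "-E", path], capture_output=True, text=True, check=True
    )
    return result.stdout


def map_preprocessed_lines(preprocessed_text):
    """Yield (filename, line_number, text) for every non-marker line."""
    filename, lineno = "<unknown>", 1
    for line in preprocessed_text.splitlines():
        marker = LINEMARKER.match(line)
        if marker:
            # A marker says: the next line came from this file and line.
            lineno = int(marker.group(1))
            filename = marker.group(2)
            continue
        yield filename, lineno, line
        lineno += 1


def lines_from_file(preprocessed_text, wanted_file):
    """Keep only non-blank lines that originate from `wanted_file`."""
    return [
        (lineno, text)
        for filename, lineno, text in map_preprocessed_lines(preprocessed_text)
        if filename == wanted_file and text.strip()
    ]
```

Filtering on the file name is a cheap way to skip the thousands of lines that system headers contribute, so the analyser only looks at the code the user actually wrote.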

Parse C files [closed]

I am looking for a Windows based library which can be used for parsing a bunch of C files to list global and local variables. The global and local variables may be declared using typedef. The output (i.e. list of global and local variables) can then be used for post processing (e.g. replacing the variable names with a new name).
Is such a library available?
Some of the methods available:
Elsa: The Elkhound-based C/C++ Parser
CIL - Infrastructure for C Program Analysis and Transformation
Sparse - a Semantic Parser for C
clang: a C language family frontend for LLVM
pycparser: C parser and AST generator written in Python
Alternatively, you could write your own using lex and yacc (or their kin, flex and bison), starting from a public lex specification and yacc grammar.
Possibly overkill, but there's a complete ANSI C parser written with Boost.Spirit:
http://spirit.sourceforge.net/repository/applications/c.zip
Maybe you'll be able to model it to suit your needs.
Parsing C is a lot harder than it looks when you take into account different dialects, preprocessor directives, the need for type information while parsing, etc. People who tell you "just use lex and yacc" have clearly not done a production C parser.
A tool that can do this is our C front end. It addresses all of the above issues. On completion, it has a complete, navigable symbol table with all identifiers and corresponding type information. Listing global and local variables would be trivial with this.
I'm the architect behind Semantic Designs.
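To illustrate the point above about needing type information while parsing: the C statement `a * b;` declares `b` as a pointer if `a` names a type (say via typedef), and is a multiplication otherwise. A parser cannot tell which from the tokens alone. The toy sketch below shows the decision; it is not how any of the tools mentioned here work internally, and the function name and token format are invented for the example.

```python
def classify_statement(tokens, typedef_names):
    """Classify a statement of the shape 'a * b ;'.

    `typedef_names` is the set of identifiers known to be type names at this
    point in the translation unit. Only this one statement shape is handled.
    """
    if len(tokens) == 4 and tokens[1] == "*" and tokens[3] == ";":
        if tokens[0] in typedef_names:
            return "declaration"  # e.g. 'T * p;' declares p as pointer to T
        return "expression"       # e.g. 'x * y;' multiplies x by y
    raise ValueError("only the 'a * b ;' shape is handled in this sketch")
```

So `classify_statement(["T", "*", "p", ";"], {"T"})` gives a declaration, while the same tokens with an empty typedef set give an expression. This is why a real C parser has to maintain a symbol table as it goes, and why typedefs hidden inside headers matter.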
I don't know if it offers a library, but have a look at CTAGS.
If it is plain C, lex and yacc are your friends, but you need to take the C preprocessor into account: source files with unexpanded macros typically do not comply with C syntax, so a parser written with the K&R grammar in mind will most likely fail.
If you decide to parse the output of the preprocessor, be prepared for your parser to fail on the "extensions" of your particular compiler, because the standard library headers very likely use them. At least this is the case with GCC.
I had this problem with GCC and finally decided to achieve my goal using a different approach. If you just need to change the names of variables, regular expressions will do fine, and there is no need to build a full parser, IMHO. If your goal is just to collect data, the ultimate source of data is debug information. There are ways to get debug information out of a binary: for ELF executables with DWARF there is libdwarf, and for Windows-land (COFF?) there should be something as well. You can probably use some existing tools to get debug information about a binary. Again, I know nothing about Windows, so you need to investigate.
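A minimal sketch of the regular-expression approach to renaming, using Python's `re` module (the function name and sample identifiers are invented for the example). Word boundaries keep `count` from matching inside `counter`. Note the limits: this also rewrites matches inside string literals and comments, and it knows nothing about scope, so a local and a global with the same name are both renamed.

```python
import re


def rename_identifier(source, old, new):
    """Replace whole-word occurrences of identifier `old` with `new`."""
    pattern = re.compile(r"\b" + re.escape(old) + r"\b")
    # A function is used as the replacement so `new` is inserted literally.
    return pattern.sub(lambda match: new, source)
```

For example, renaming `count` to `total` in `int count = 0; int counter = count + 1;` leaves `counter` untouched and changes the other two occurrences.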
I recently read about a win32-based system that looked at the debugging information in COFF dlls:
http://www.drizzle.com/~scottb/gdc/fubi-paper.htm
Maybe gnu project cflow http://www.gnu.org/software/cflow/ ?

Resources