GCC Xml Alternatives - c

I am looking into GCC-XML, which parses a given header file and generates an XML representation of the C code's metadata. However, GCC-XML is open source. Is there a commercial C code parser that works similarly to GCC-XML?
Thanks,
Karthick

The obvious replacement for gccxml is clang, which is released under a permissive license (originally the BSD-style University of Illinois/NCSA license, now Apache 2.0 with LLVM exceptions), so you can freely use it in commercial projects and do whatever you want with the code. clang used to have a built-in XML AST dumper, but it was removed at some stage. If you only need to extract specific information (such as function prototypes for IDL generation), it is not difficult to write a basic custom clang plugin for that. Otherwise, you can search for existing clang plugins that do the job, such as this one:
https://github.com/sk-havok/clang-extract
Clang plugin tutorial: http://clang.llvm.org/docs/ClangPlugins.html

See our DMS Software Reengineering Toolkit with its C Front End for an equivalent/superset of GCCXML.
The C front end can handle a variety of C dialects (ANSI, GCC, MS). It contains a full preprocessor. It can export ASTs for the complete language (esp. including function bodies, which GCCXML does not do, IIRC) and its symbol table, both in XML format.
Here at SO there is an example dump of the AST from DMS's C++ front end. This uses the same machinery as the C front end uses.

Related

Compilation map

Assume a complex project (in C/C++). Is there a solution to know which source files are responsible for (used in) the creation of a specific binary without compiling the project itself?
I know I could just read the Makefile and try to follow the dependency chain like this but it's not very scalable and it could be hard if multiple Makefiles and / or implicit rules are used.
Thanks a lot for your help
PS: To clarify the first comments, I'm looking for a method which does not need to have a valid build environment (e.g. so compiling, even as a dry-run, is not an option).
"is there a solution to know which source files are responsible/used for the creation of a specific binary without compiling the project itself"
If you compile with GCC (or perhaps Clang) you could use appropriate preprocessor options like -M to generate and keep in some textual file the dependencies, in a format acceptable by GNU make or ninja build automation tools. This works well on Linux distributions like Debian.
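A minimal sketch of how such dependency output could be consumed afterwards, assuming the rules were produced with something like gcc -MM and saved as text. The sample rules, the file names, and the helper names parse_depfile and sources_of are hypothetical, and paths containing spaces or colons are not handled.

```python
def parse_depfile(text):
    """Return {target: [prerequisites]} from make-style dependency rules."""
    deps = {}
    # A trailing backslash continues the rule on the next line.
    joined = text.replace("\\\n", " ")
    for line in joined.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue
        target, _, prereqs = line.partition(":")
        deps.setdefault(target.strip(), []).extend(prereqs.split())
    return deps


def sources_of(deps, objects):
    """Collect every prerequisite of the given object files, deduplicated and sorted."""
    found = set()
    for obj in objects:
        found.update(deps.get(obj, []))
    return sorted(found)


# Hypothetical output of the compiler's dependency generation for two files.
SAMPLE = "main.o: main.c util.h \\\n config.h\nutil.o: util.c util.h\n"

deps = parse_depfile(SAMPLE)
files = sources_of(deps, ["main.o", "util.o"])
print(files)
```

Which object files end up in a given binary still has to come from the link rule of the build system; this sketch only covers the compile-time dependencies of each object.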
You might also be interested in other builders, such as omake, and in package managers like opam, urpmi, etc.
You could also get in touch with the Software Heritage team.
If you use GCC, you could write your own GCC plugin to maintain these dependencies in your database.
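Finally, be aware of Rice's theorem, and think about crazy examples (in C++) whose result depends on the time of compilation, like
// __TIME__ is a string literal, so #if cannot test it; index it in a constant expression instead
constexpr int something = (__TIME__[0] == '1') ? 0 : 1;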
So my current intuition is that your wish is impossible. I could have misunderstood it.
Refer to some C standard like n1570, or to some C++ standard like n3337.
Study the behavior of tools like GNU autoconf.
Think of programs generating C or C++ code like GNU bison, my manydl.c, bismon, SWIG, RefPerSys, ANTLR .... Notice that GCC has many C++ code generators (notably gengtype) and is definitely "a complex project coded in C++".
See also linuxfromscratch.

How do I read the source code for a C library in CLion

OS: Deepin 20 (based on Debian 10)
CLion: 2020.1.2
GCC: gcc (Uos 8.3.0.3-3+rebuild) 8.3.0
Make: 4.2.1 x86_64-pc-linux-gnu
Cmake: 3.18.1
I am a newcomer who has just started learning C. When writing C code in CLion, I can jump to a declaration with Ctrl + mouse click.
For example, if I use printf, I land in the stdio.h file, where line 332 reads extern int printf (const char *__restrict __format, ...); .
But if I want to see the implementation of this function, I can't. According to "Navigate in the code structure", Ctrl+Alt+Home should switch to the related file, but the IDE reports "No related file".
How can I get to the source code of a function I call? I want to learn from the good experiences of others by looking at their implementation logic in their libraries.
Thank you for your review. I would really appreciate it if you could help me.
Even if most of GNU/Linux software is open source, it is not installed (in source code form) by default on your computer.
Regarding C programming, see Modern C (and the C11 standard n1570) and read the documentation of your C compiler (perhaps GCC or Clang, or simpler ones like nwcc or tinycc), your linker (probably binutils), your build automation tool (e.g. GNU make or ninja or cmake). Enable all warnings and DWARF debug info, so if using GCC compile with at least gcc -Wall -Wextra -g; then improve your C code to get no warnings. Once you have debugged your C source code (using GDB and perhaps valgrind), add optimization flags such as -O2. Order of arguments to gcc matters!
Consider, for some tasks, generating some of your C code (perhaps some #include-d header file) with tools like GNU bison, ANTLR, SWIG, RPCGEN, AWK, GUILE, GPP, GNU m4, GNU autoconf - or your own program or script.
"I want to learn from the good experiences of others by looking at their implementation logic in their libraries"
You need to fetch the source code from elsewhere.
For example, see GNU libc or musl-libc, and the Linux kernel (and others: GTK, PostgreSQL, SQLite, GUILE, etc., including many open source programs mentioned in this answer), and also look on websites like GitHub, GitLab, and SourceForge.
Read also Advanced Linux Programming and syscalls(2). See also http://linuxfromscratch.org/
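When asked to optimize, a recent GCC (as of 2020) treats printf as a builtin and handles calls to it specially, for example by replacing a printf of a constant string that ends in a newline with a call to puts. See the softwareheritage and Frama-C projects.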
In some cases, consider accepting plugins in your program with dlopen(3) and dlsym(3) (see also elf(5) and How to Write Shared Libraries). You might even generate some code at runtime with libraries like libgccjit (or generate C code at runtime, then compile it as a plugin, and load it; such an approach is called metaprogramming and is related to partial evaluation; see also the blog of the late J.Pitrat for more insights).
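The load-then-look-up pattern behind dlopen(3) and dlsym(3) can be tried quickly from Python, whose ctypes module wraps those calls on POSIX systems. This is only a sketch under the assumption of a Linux-like system: it loads the C library instead of a real plugin, and strlen stands in for a plugin entry point.

```python
import ctypes
import ctypes.util

# find_library may return None; on Linux, CDLL(None) then opens the running
# program itself, which already has the C library loaded.
libc_name = ctypes.util.find_library("c")

# Roughly dlopen(libc_name, ...): get a handle on the shared object.
libc = ctypes.CDLL(libc_name)

# Roughly dlsym(handle, "strlen"): look the symbol up by name at run time.
strlen = getattr(libc, "strlen")

# The loader knows nothing about C types, so the signature is declared by hand.
strlen.argtypes = [ctypes.c_char_p]
strlen.restype = ctypes.c_size_t

length = strlen(b"plugin")
print(length)
```

A C program does the same with a handle from dlopen and a function pointer obtained from dlsym; the point of the sketch is that the symbol name is resolved at run time, so the caller must already know the function's signature.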
Of course, you need tools to navigate in source code. Consider using GNU emacs combined with GNU grep for that, or some other source navigator. For large programs of millions of source code lines, consider writing your own GCC plugin to understand them.
Use also tools like strace(1) and GDB to understand the dynamic behavior of programs.
Expect several months of full time work to explore all this.
You might also be interested in ACM conference papers.
For your own source code, consider using a version control tool such as git, and of course read its documentation. Use LibreOffice, Lout, LaTeX, or Markdown (perhaps combined with inkscape or diagrams for figures) to write the documentation of your software.
In some cases, you might consider generating parts of the documentation from parts of your source code (e.g. using literate programming techniques like nuweb or documentation generators like doxygen).

Libraries that parse code written in C and provide an API

I am implementing a proof of concept application for source-to-source transformation and need a C-parser with an API for manipulating/traversing the C-syntax tree (AST).
I have tried to use clang but I ran into various problems, like not being able to compile the tutorials using libclang, wrong architecture etc. Since this is a proof of concept application, I will defer clang to a different date.
Question
What software or libraries (implemented in any language) can parse C code and provide an API so that I can build applications on top of them? I looked around, but I could not locate any free parsers.
The platforms I can use are anything on Windows or Mac or Linux, and any parsers written in C/C++/Java/Perl/Python/PHP will work.
You could try one of the available grammars for ANTLR. ANTLR has support for creating tree walkers and you can walk/manipulate the AST manually if necessary. ANTLR V3 has several grammars available including a C preprocessor, ANSI C and GNU C.

typedef solving for dll wrapper

I want to write a wrapper for a DLL file, in this case for Python. The problem is that the argument types are not the standard C ones. They have been typedef'ed to something else.
I have the header files for the DLL, so I can manually track down the standard C type that each argument type was typedef'ed from. But I would like a more systematic way to do this. I was wondering whether there is a utility that would evaluate the header files, or whether the type definitions can be obtained from somewhere in the DLL.
I think the tool you are looking for is SWIG:
SWIG is a software development tool that connects programs written in C and C++ with a variety of high-level programming languages. SWIG is used with different types of languages including common scripting languages such as Perl, PHP, Python, Tcl and Ruby. The list of supported languages also includes non-scripting languages such as C#, Common Lisp (CLISP, Allegro CL, CFFI, UFFI), Java, Lua, Modula-3, OCAML, Octave and R. Also several interpreted and compiled Scheme implementations (Guile, MzScheme, Chicken) are supported. SWIG is most commonly used to create high-level interpreted or compiled programming environments, user interfaces, and as a tool for testing and prototyping C/C++ software. SWIG can also export its parse tree in the form of XML and Lisp s-expressions. SWIG may be freely used, distributed, and modified for commercial and non-commercial use.
This does assume that you are willing to use the headers for the DLL. If you want to work solely with the DLL, then you have more work to do. It might provide a reflection interface that you can use to analyze the types. Failing that, you are into a world of pain - or reverse engineering any debugging information in the DLL.
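When the headers are available and only contain simple aliases, the manual tracking described in the question can be automated with a few lines of Python. The following is a rough sketch, not a C parser: it only understands single-line typedefs of the form "typedef base alias;" (no pointers, structs, or function pointers), and the header excerpt and type names are made up for illustration. For real headers, a proper parser such as SWIG's is the safer route.

```python
import ctypes
import re

# Hypothetical header excerpt; these type names are invented for the example.
HEADER = """
typedef unsigned int  U32;
typedef U32           HANDLE_ID;
typedef double        REAL;
typedef REAL          VOLTAGE;
"""

# Standard C types this sketch knows how to map to ctypes.
BASE_TYPES = {
    "int": ctypes.c_int,
    "unsigned int": ctypes.c_uint,
    "long": ctypes.c_long,
    "float": ctypes.c_float,
    "double": ctypes.c_double,
    "char": ctypes.c_char,
}

TYPEDEF_RE = re.compile(r"^\s*typedef\s+(.+?)\s+(\w+)\s*;", re.MULTILINE)


def collect_typedefs(header_text):
    """Map each alias to the type it was declared from (simple typedefs only)."""
    return {
        alias: " ".join(base.split())
        for base, alias in TYPEDEF_RE.findall(header_text)
    }


def resolve(name, typedefs):
    """Follow the typedef chain until reaching a name that is not an alias."""
    seen = set()
    while name in typedefs:
        if name in seen:
            raise ValueError(f"typedef cycle at {name}")
        seen.add(name)
        name = typedefs[name]
    return name


typedefs = collect_typedefs(HEADER)
ctype_for = {alias: BASE_TYPES[resolve(alias, typedefs)] for alias in typedefs}
print(ctype_for["HANDLE_ID"], ctype_for["VOLTAGE"])
```

The resulting ctype_for table can then be used when declaring argtypes and restype for the DLL's functions in a ctypes-based wrapper.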

Is there a C header parser tool for wrapper generation like gccxml?

I need to write a few C header wrappers for a new programming language and would like something like gccxml, but without the full dependency on gcc and the problems it causes on a Windows system.
It just needs to read C, not C++. Output in any format is okay as long as it is fully documented.
I need it for Curl, SQLite, GTK2, SDL, OpenGL, the Win32 API, and the POSIX C APIs on Linux/Solaris/FreeBSD/MacOSX.
VivaCore is very cool. Have you tried SWIG? The Wikipedia page on FFI has some good links too. I think there is an MSVC CodeDOM example that handles C as well.
See our SD C Front End for DMS. Full C parsing, symbol table construction, post parsing dump of any information you like. Can dump code and symbol tables in XML format.
You may like pycparser in Python. Used in CFFI and other awesome projects.
