XML -> C parser generator - c

I have a C program that gets its settings from an XML file. Currently I'm using Xerces to traverse the data, but it's getting quite tedious to map each XML value to a variable.
The same XML is also read by a Java program, which is much more convenient because JAXB creates all the necessary classes in Java. I'm looking for something similar that can create a "structure of structs" or the like. It's important that I get C structs, and not C++ classes, because this code will run on GPUs.
I found "XML Booster", and am currently reading its docs. Do you know of other options? It needs to be usable on Linux.

I use the libxml library. You still have to traverse the XML, but you get a linked list of elements, attributes, nodes and child nodes, which you can follow.
link: http://xmlsoft.org/index.html
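
Whichever library walks the tree, the per-value mapping that the question calls tedious can be reduced to a lookup table. The following is a minimal sketch in plain C, assuming the traversal code hands over (element name, text) pairs. The settings_t struct, its fields, and apply_setting are hypothetical names made up for illustration; they are not part of libxml or Xerces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical settings struct; plain C, so it can be shared with GPU code. */
typedef struct {
    int    width;
    int    height;
    double gamma;
} settings_t;

typedef enum { FIELD_INT, FIELD_DOUBLE } field_type;

/* One row per XML element name: where the value goes and how to convert it. */
typedef struct {
    const char *name;
    field_type  type;
    size_t      offset;
} field_map;

static const field_map settings_map[] = {
    { "width",  FIELD_INT,    offsetof(settings_t, width)  },
    { "height", FIELD_INT,    offsetof(settings_t, height) },
    { "gamma",  FIELD_DOUBLE, offsetof(settings_t, gamma)  },
};

/* Store one (element name, text) pair.
 * Returns 0 on success, -1 if the name is not in the table. */
int apply_setting(settings_t *s, const char *name, const char *text)
{
    size_t i;
    assert(s != NULL && name != NULL && text != NULL);
    for (i = 0; i < sizeof settings_map / sizeof settings_map[0]; i++) {
        const field_map *f = &settings_map[i];
        char *dst;
        if (strcmp(f->name, name) != 0)
            continue;
        dst = (char *)s + f->offset;   /* address of the target member */
        if (f->type == FIELD_INT)
            *(int *)dst = (int)strtol(text, NULL, 10);
        else
            *(double *)dst = strtod(text, NULL);
        return 0;
    }
    return -1;
}
```

Calling apply_setting(&cfg, name, content) once per element node from the traversal loop then replaces a hand-written chain of string comparisons; adding a setting means adding one table row.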

Given that your XML files have a common pattern, you can use Bison+Flex or simply ANTLR (C runtime) to construct a grammar and extract the values from the XML files into variables. These produce parsers in pure C, so you have nothing to worry about.

If you have an XML schema, check out CodeSynthesis XSD. It generates nice C++ objects for your XSD and you don't need to deal with Xerces directly:
http://www.codesynthesis.com/products/xsd/

Related

We have C structures in header files and we want to have an XML schema generated from the header files

I have a twenty-year-old legacy application and want to connect it to a web front end. I need to pass a rather large, deeply nested data structure that is defined in C structs. We are currently planning to do that in XML. The total number of struct definitions is around 150. These all nest into one huge data structure. I would like to find a program that would scan the header files and generate an XML Schema that I could then tailor to my needs. Does anyone know of such a tool?
SWIG (swig.org) has an XML target (-xml) that may do what you want.
There exists a tool called GCC-XML, which transforms the internal representation of a program compiled by GCC into XML, but it is no longer maintained.
A possibility could be to use the plugin abilities of GCC 4.6, that is, to code a plugin (in C) for GCC which would process the Tree (the internal AST) of the structure declarations. You can also use GCC MELT, a higher-level domain-specific language for extending GCC. But in either case, you'll need to understand the Tree (and Gimple) internal representations of GCC (and it might not be worth it if you have just 150 structures). However, if your legacy application is large enough, learning these (and using MELT) might be worthwhile, because such new skills (of extending GCC) can be used for other tasks on that legacy application.
Lastly, you might also look into the (rather small, by today's standards) tools related to RPC-XDR; they contain a parser for C-like struct declarations.

File format to store certain configurations

I would like to know which file format I can use to store (and easily parse and read) certain configuration items and their values. One option is an INI file. Is there any other option, like a .opt file?
EDIT:
I am using the C language.
Look into XML. It's got implementations in many languages and is pretty easy to parse and create.
But a lot of it has to do with what language you're using.
http://www.w3schools.com/xml/default.asp
Personally, I like to use XML files for configuration options. Most non-power users can understand them relatively easily, and there are many libraries out there that make them very easy to parse.
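
For comparison, the INI option mentioned in the question needs very little code in C. The following is a minimal sketch, assuming "key = value" lines, comments starting with ';' or '#', and "[section]" headers that are simply skipped. parse_ini_line is a hypothetical helper written for this illustration, not a function from any library.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Trim leading and trailing whitespace in place; returns the trimmed start. */
static char *trim(char *s)
{
    char *end;
    while (isspace((unsigned char)*s))
        s++;
    end = s + strlen(s);
    while (end > s && isspace((unsigned char)end[-1]))
        end--;
    *end = '\0';
    return s;
}

/*
 * Split one line of the form "key = value" in place.
 * Returns 1 and sets *key and *value on success. Returns 0 for blank
 * lines, comments, section headers, lines without '=', and empty keys.
 * The line buffer is modified and must outlive *key and *value.
 */
int parse_ini_line(char *line, char **key, char **value)
{
    char *eq;
    assert(line != NULL && key != NULL && value != NULL);
    line = trim(line);
    if (*line == '\0' || *line == ';' || *line == '#' || *line == '[')
        return 0;
    eq = strchr(line, '=');
    if (eq == NULL)
        return 0;
    *eq = '\0';               /* cut the line into two strings at '=' */
    *key = trim(line);
    *value = trim(eq + 1);
    return **key != '\0';
}
```

Each line read with fgets can be passed to parse_ini_line directly; the trailing newline is removed by the trimming step.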

Parsing C header files to extract information about data types, functions and function arguments

I have a C header file. I want to parse it and extract information about data types, functions and functions arguments. Who can help me? I need some example in C.
Thank you very much.
You could try Clang, in particular the Lexer and Preprocessor library.
Use ANTLR. There's a decent grammar for C already written for you, and ANTLR will generate C code (or some other languages if you prefer), which you can then traverse to get what you want.
There is also srcML.
It is similar to c2xml, but it works on the source code directly, whereas c2xml starts from preprocessor output.
Assuming good C coding rules (as opposed to arbitrary use of preprocessing), this has been an advantage for my re-engineering tasks, as it preserves the names of #defines and makes it possible to process selected macros in a specific way.
The DMS Software Reengineering Toolkit with its C Front End can do this.
DMS provides general-purpose parsing, symbol table construction, flow analysis, and program transformations, parameterized by a language definition. Using its C front end, DMS parses any of a variety of C dialects, builds ASTs for the code elements, and builds full symbol tables with complete name and type resolution of all symbols (including parameter lists in function headers); you can stop there and dump those out. DMS can also do control and data flow analysis on the C code; you can use other DMS facilities to further analyze or transform the code. (The C front end has a full C preprocessor built in.)
The EDG front end can also be used for parsing and symbol tables, but does not have the other capabilities of DMS.
Yet another option is to use the c2xml tool from "sparse". Its C parser isn't 100% standard-compliant (e.g. it won't parse K&R-style declarations), but for reasonably modern C code it works quite well.
If you need human-readable output (e.g. in HTML or PDF), then you can use Doxygen/Doxywizard. In Doxywizard, "All entities" has to be selected.

Parsing XML in Pure C

What is the preferred library for parsing XML data in Pure C?
The canonical XML parsing library for C is libxml2.
Two popular choices are expat and libxml2.
Here is a list of libraries for multiple languages, including C:
http://www.xml.com/pub/rg/XML_Parsers
Not 'the preferred library', but there's also http://www.minixml.org/.
Mini-XML is a small XML library that you can use to read and write XML and XML-like data files in your application without requiring large non-standard libraries. Mini-XML only requires an ANSI C compatible compiler (GCC works, as do most vendors' ANSI C compilers) and a 'make' program.
Mini-XML supports reading of UTF-8 and UTF-16 and writing of UTF-8 encoded XML files and strings. Data is stored in a linked-list tree structure, preserving the XML data hierarchy, and arbitrary element names, attributes, and attribute values are supported with no preset limits, just available memory.
VTD-XML is the one you should look into if you want a combination of ease of use, performance and efficiency.
You can consider miniML-Parser, a simple and tiny XML parser library in C. It was developed specifically with embedded applications in mind.
It is extremely easy to use: you need to call only one API to parse your XML data.
It has a very small footprint: the parser uses only 1.8 kB of code memory. Hence, you can use it in very small embedded applications.
It is a validating XML parser.
It also extracts the content of XML data and converts it to its specified data type.
It comes with a tool that generates the source code from an XML schema file, so you do not have to write the XML tree structure in C by hand.
Disclosure: I'm the author of miniML-Parser.

Program for documenting a C struct?

If you have a binary file format (or packet format) which is described as a C structure, are there any programs which will parse the structure and turn it into neat documentation on your protocol?
The struct would of course contain arrays, other structures, etc., as necessary to describe the format. The documentation would probably need to include things like packing, endianness, etc.
Maybe you should think about this a different way.
"Can I create a documentation format for my packet for which I can generate a C struct?"
Consider, for example, using XML to define the packet structure and adding elements for comments and so forth. It will be fairly easy to write a simple program that transforms it into an actual C structure.
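
A minimal sketch of the generator side of that idea, assuming the packet description has already been read into an array of field records (from XML or any other source). The field_desc type and emit_struct are hypothetical names chosen for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* One field of the packet description, e.g. loaded from an XML file. */
typedef struct {
    const char *ctype;    /* C type, e.g. "uint16_t" */
    const char *name;     /* member name */
    size_t      count;    /* array length, 0 for a scalar */
    const char *comment;  /* documentation text, carried into the output */
} field_desc;

/*
 * Write a C struct definition for the given fields into out.
 * Returns the number of characters written (excluding the terminator),
 * or -1 if the buffer of cap bytes is too small.
 */
int emit_struct(char *out, size_t cap, const char *name,
                const field_desc *fields, size_t nfields)
{
    size_t used, i;
    int n;
    assert(out != NULL && name != NULL && (fields != NULL || nfields == 0));
    n = snprintf(out, cap, "struct %s {\n", name);
    if (n < 0 || (size_t)n >= cap)
        return -1;
    used = (size_t)n;
    for (i = 0; i < nfields; i++) {
        const field_desc *f = &fields[i];
        if (f->count > 0)
            n = snprintf(out + used, cap - used, "    %s %s[%zu]; /* %s */\n",
                         f->ctype, f->name, f->count, f->comment);
        else
            n = snprintf(out + used, cap - used, "    %s %s; /* %s */\n",
                         f->ctype, f->name, f->comment);
        if (n < 0 || (size_t)n >= cap - used)
            return -1;
        used += (size_t)n;
    }
    n = snprintf(out + used, cap - used, "};\n");
    if (n < 0 || (size_t)n >= cap - used)
        return -1;
    return (int)(used + (size_t)n);
}
```

Because the comment text travels with each field, the same description could also feed a documentation generator, which is the point of starting from the description rather than from the struct.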
Doxygen is a commonly-used documentation generator. However, if you want to get useful documentation, you'll probably have to mark up your structure definitions with doc comments.
If you know Perl, you can try playing with Jeeves:
https://www.rodhughes.com/perl/advprog/examples/Jeeves/
(This source is there; I assume it's all right to use. ;) )
I'm trying to work out something similar to what you need: a parser for structured binary data. I'm looking to Jeeves to output parsing classes in C++ from a meta format. The default parser for Jeeves allows for adding additional tags to each member of a class definition. This would let you automatically include information about endianness, alignment, etc. in comments within your classes (and, of course, implement them in your code).
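
Endianness and packing are exactly the details a plain struct declaration cannot express, which is why they need to be tagged or documented separately. As a minimal sketch of what a generated parser might do with such tags, the helpers below read fields byte by byte so that the result does not depend on host byte order or struct padding. The function names are hypothetical and are not taken from Jeeves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 16-bit big-endian value from a byte buffer. */
uint16_t read_be16(const unsigned char *p)
{
    assert(p != NULL);
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Read a 32-bit big-endian value from a byte buffer. */
uint32_t read_be32(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Read a 32-bit little-endian value from a byte buffer. */
uint32_t read_le32(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
           ((uint32_t)p[1] << 8)  |  (uint32_t)p[0];
}
```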
