Convert string to XML format in C language

Is there an algorithm that converts a logical expression, given as a string, to XML format?
For example:
I load a logical expression as a string from an input file:
(X&Y)|Z
I then want to convert it to XML and write something like this to a new file:
<expression>
<or>
<operand>Z</operand>
<and>
<operand>X</operand>
<operand>Y</operand>
</and>
</or>
</expression>
Thanks in advance!

(Responding to the xml tag.)
Perhaps you can make use of Invisible XML (ixml); the SO Info tab lists several resources.
With a grammar such as
ixml version "1.0".
expression: expr.
-expr: operand; and; or; not.
and: expr, -"&", expr.
or: expr, -"|", expr.
not: -"!", operand.
-operand: id; -"(", expr, -")".
id: letter.
-letter: ["A"-"Z"].
the ixml processor coffeepot
java -jar '/usr/local/share/java/coffeepot-1.99.11.jar' \
--grammar:expr-bool.ixml --pretty-print '(X&Y)|Z'
produces the following output:
<expression>
<or>
<and>
<id>X</id>
<id>Y</id>
</and>
<id>Z</id>
</or>
</expression>
With the input (X&Y)|Z&!(A|B) coffeepot (by default) outputs the
first of (in this case 2) possible parses and emits the root element as
<expression xmlns:ixml="http://invisiblexml.org/NS" ixml:state="ambiguous">
to point out the ambiguity.
If parsing fails the output could be something like:
<fail xmlns:ixml="http://invisiblexml.org/NS" ixml:state="failed">
<line>1</line>
<column>14</column>
<pos>14</pos>
<end-of-input>true</end-of-input>
<permitted>['A'-'Z']</permitted>
</fail>
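If running an ixml processor is not an option, the same conversion can be sketched in plain C with a small recursive-descent parser. The version below is only a sketch under a few assumptions that the question does not state: operands are single upper-case letters, ! binds tighter than &, which binds tighter than |, and operands are emitted in source order. The names Node, expr_to_xml and all helper functions are made up for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical AST node: kind is '&', '|', '!' or 'v' (one-letter operand). */
typedef struct Node {
    char kind;
    char name;               /* operand letter when kind == 'v' */
    struct Node *lhs, *rhs;  /* rhs is NULL for '!' and 'v' */
} Node;

static const char *cur;  /* current parse position */
static int failed;       /* set on a syntax error or a full output buffer */
static char *dst;        /* output write position */
static size_t left;      /* bytes left in the output buffer */

static Node *parse_or(void);

static Node *mk(char kind, char name, Node *lhs, Node *rhs) {
    Node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->kind = kind; n->name = name; n->lhs = lhs; n->rhs = rhs;
    return n;
}

static void skip_ws(void) { while (*cur == ' ') cur++; }

/* primary := 'A'..'Z' | '(' or ')' */
static Node *parse_primary(void) {
    skip_ws();
    if (*cur == '(') {
        cur++;
        Node *n = parse_or();
        skip_ws();
        if (*cur == ')') cur++; else failed = 1;
        return n;
    }
    if (*cur >= 'A' && *cur <= 'Z') return mk('v', *cur++, NULL, NULL);
    failed = 1;
    return NULL;
}

/* unary := '!' unary | primary */
static Node *parse_unary(void) {
    skip_ws();
    if (*cur == '!') { cur++; return mk('!', 0, parse_unary(), NULL); }
    return parse_primary();
}

/* and := unary ('&' unary)*   ('&' binding tighter than '|' is an assumption) */
static Node *parse_and(void) {
    Node *n = parse_unary();
    for (skip_ws(); *cur == '&'; skip_ws()) { cur++; n = mk('&', 0, n, parse_unary()); }
    return n;
}

/* or := and ('|' and)* */
static Node *parse_or(void) {
    Node *n = parse_and();
    for (skip_ws(); *cur == '|'; skip_ws()) { cur++; n = mk('|', 0, n, parse_and()); }
    return n;
}

/* Append s to the output buffer, or flag failure if it does not fit. */
static void put(const char *s) {
    size_t n = strlen(s);
    if (n >= left) { failed = 1; return; }  /* keep room for the '\0' */
    memcpy(dst, s, n + 1);
    dst += n; left -= n;
}

/* Write the subtree as XML, one tag per line, then free it. */
static void emit_and_free(Node *n) {
    if (n == NULL) return;
    if (n->kind == 'v') {
        char leaf[] = "<operand>?</operand>\n";
        leaf[9] = n->name;                  /* overwrite the '?' placeholder */
        put(leaf);
    } else {
        const char *tag = n->kind == '&' ? "and" : n->kind == '|' ? "or" : "not";
        put("<"); put(tag); put(">\n");
        emit_and_free(n->lhs);
        emit_and_free(n->rhs);
        put("</"); put(tag); put(">\n");
    }
    free(n);
}

/* Returns 0 on success, -1 on a syntax error or if out is too small. */
int expr_to_xml(const char *src, char *out, size_t cap) {
    cur = src; failed = 0; dst = out; left = cap;
    Node *root = parse_or();
    skip_ws();
    if (*cur != '\0') failed = 1;           /* trailing input after the expression */
    put("<expression>\n");
    emit_and_free(root);
    put("</expression>\n");
    return failed ? -1 : 0;
}
```

Calling expr_to_xml("(X&Y)|Z", buf, sizeof buf) fills buf with an <expression> document whose outer element is <or>, containing the <and> of X and Y followed by the operand Z. The parser state lives in file-scope variables, so this sketch is not reentrant.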

Related

How to convert a string to an array in FreeMarker

I am stuck on one problem:
The user gives the value as a = abhay, himanshu, aman, piyush
rather than as a list like a = ["abhay","himanshu","aman","piyush"].
So how can I use a as an array in this:
<#list a as x>
${x}
</#list>
You can use the string builtin ?split to split a string by a regular expression:
<#list "abhay, himanshu, aman, piyush"?split(", ?", "r") as x>
${x}
</#list>
See also:
https://freemarker.apache.org/docs/ref_builtins_string.html#ref_builtin_split

Force ANTLR (version 3) to match lexer rule

I have the following ANTLR (version 3) grammar:
grammar GRM;
options
{
language = C;
output = AST;
}
create_statement : CREATE_KEYWORD SPACE_KEYWORD FILE_KEYWORD SPACE_KEYWORD value -> ^(value);
value : NUMBER | STRING;
CREATE_KEYWORD : 'CREATE';
FILE_KEYWORD : 'FILE';
SPACE_KEYWORD : ' ';
NUMBER : DIGIT+;
STRING : (LETTER | DIGIT)+;
fragment DIGIT : '0'..'9';
fragment LETTER : 'a'..'z' | 'A'..'Z';
With this grammar, I am able to successfully parse strings like CREATE FILE dump or CREATE FILE output. However, when I try to parse a string like CREATE FILE file it doesn't work. ANTLR matches the text file (in the string) with lexer rule FILE_KEYWORD which is not the match that I was expecting. I was expecting it to match with lexer rule STRING.
How can I force ANTLR to do this?
Your problem seems to be a variant of the classic contextual keyword vs. identifier issue.
Either "value" should be a lexer rule rather than a parser rule (otherwise it is too late), or you should reorder the rules (or both).
Hence using VALUE : NUMBER | STRING (a lexer rule) instead of the lower-case value (a parser rule) will help. The order of the lexer rules is also important; usually the definition of ID ("VALUE" in your code) comes after the keyword definitions.
See also: 'IDENTIFIER' rule also consumes keyword in ANTLR Lexer grammar
grammar GMR;
options
{
language = C;
output = AST;
}
create_statement : CREATE_KEYWORD SPACE_KEYWORD FILE_KEYWORD SPACE_KEYWORD value -> ^(value);
CREATE_KEYWORD : 'CREATE';
FILE_KEYWORD : 'FILE';
value : (LETTER | DIGIT) + | FILE_KEYWORD | CREATE_KEYWORD ;
SPACE_KEYWORD : ' ';
This works for me in ANTLRWorks for the input CREATE FILE file and, if needed, for the input CREATE FILE FILE.
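The second grammar boils down to letting the value position accept keyword tokens as well as ordinary words. Outside ANTLR, the same idea can be sketched in plain C as below. This is not ANTLR-generated code: the names Tok, next_token and parse_create are hypothetical, and words are assumed to be separated by spaces.

```c
#include <assert.h>
#include <string.h>

typedef enum { TOK_CREATE, TOK_FILE, TOK_WORD, TOK_END } Tok;

/* Read the next space-separated word from *p into buf and classify it
   the way a context-free lexer would: keywords always win.
   Words longer than cap - 1 characters are silently truncated. */
static Tok next_token(const char **p, char *buf, size_t cap) {
    size_t n = 0;
    while (**p == ' ') (*p)++;
    if (**p == '\0') return TOK_END;
    while (**p != '\0' && **p != ' ') {
        if (n + 1 < cap) buf[n++] = **p;
        (*p)++;
    }
    buf[n] = '\0';
    if (strcmp(buf, "CREATE") == 0) return TOK_CREATE;
    if (strcmp(buf, "FILE") == 0) return TOK_FILE;
    return TOK_WORD;
}

/* create_statement := CREATE FILE value
   The value position accepts any token, including one that the lexer
   classified as a keyword. Returns 0 on success, -1 on a syntax error. */
int parse_create(const char *input, char *value, size_t cap) {
    char scratch[64];
    assert(input != NULL && value != NULL && cap > 0);
    if (next_token(&input, scratch, sizeof scratch) != TOK_CREATE) return -1;
    if (next_token(&input, scratch, sizeof scratch) != TOK_FILE) return -1;
    if (next_token(&input, value, cap) == TOK_END) return -1;  /* value missing */
    /* Nothing may follow the value. */
    return next_token(&input, scratch, sizeof scratch) == TOK_END ? 0 : -1;
}
```

The point of the design is that the keyword decision made by the lexer is overridden by the parser only where the grammar expects a value, so CREATE FILE FILE is accepted while CREATE dump is still rejected.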

How to search for :) in Solr

How does one search for specific punctuation in Solr, such as :)? I have tried URL encoding the text but I still get this message:
org.apache.solr.search.SyntaxError: Cannot parse ':': Encountered " ":" ": "" at line 1, column 0.
Was expecting one of:
<NOT> ...
"+" ...
"-" ...
<BAREOPER> ...
"(" ...
"*" ...
<QUOTED> ...
<TERM> ...
<PREFIXTERM> ...
<WILDTERM> ...
<REGEXPTERM> ...
"[" ...
"{" ...
<LPARAMS> ...
<NUMBER> ...
<TERM> ...
"*" ...
Additionally, I need to perform this search on a text field, not on a string field. How should I configure the analyser to save punctuation?
Note that searching google for the subject is impossible due to two prolific Solr contributors with the name "Smiley"!
What configuration do you have for the text field?
You should make sure that splitting does not happen on the punctuation, e.g. if you are using StandardTokenizerFactory or a word delimiter filter.
You can define a custom field type with WhitespaceTokenizerFactory or KeywordTokenizerFactory and apply further filters, such as lower-casing, to it.
Also, there are some characters which Solr/Lucene uses for its own operations, e.g. + - ! ( ) { } [ ] ^ " ~ * ? :
You need to escape these special characters with a backslash. Check Escape Special Characters.
Instead of :) search for "\:\)"; both characters, : and ), have a special meaning in Solr.
All special operators need to be escaped by prefixing them with a '\' character.
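If the query string is built in client code, the escaping can be done before the request is sent. Here is a minimal C sketch of that step; solr_escape is a made-up helper, not part of any Solr client library, and the character set is the one listed above plus the backslash itself. The two-character operators && and || are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Characters listed above as special, plus the backslash itself. */
static const char SPECIAL[] = "+-!(){}[]^\"~*?:\\";

/* Copy src into out, prefixing each special character with a backslash.
   Returns the escaped length, or (size_t)-1 if out (cap bytes) is too small. */
size_t solr_escape(const char *src, char *out, size_t cap) {
    size_t n = 0;
    assert(src != NULL && out != NULL);
    for (; *src != '\0'; src++) {
        int special = strchr(SPECIAL, *src) != NULL;
        /* Need room for this character, its backslash, and the final '\0'. */
        if (n + (special ? 2 : 1) >= cap) return (size_t)-1;
        if (special) out[n++] = '\\';
        out[n++] = *src;
    }
    if (n >= cap) return (size_t)-1;
    out[n] = '\0';
    return n;
}
```

For the input :) this writes \:\) into out. The result still has to be URL-encoded when it goes into a request, and the field's analyzer still has to keep the punctuation for the query to match anything.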

Why can't my yacc rule reduce here?

I am using YACC to do my compiler homework project. I found that my program could not get the syntax tree. So I printed it all out to see what is happening. According to my result, it seems that ClassDecl does not reduce to ClassDeclList here. But I can't understand why... can anyone help me out?
The sample input is:
program ex11;
class ab {
}
It printed out as:
programXXXX ex11ID
semicon abID
RBRACEnum
ClassBody ClassDecl ClassDecl1 Error!
The first three lines are messages I printed from my LEX file, to ensure that the characters are recognized correctly.
According to this output, the parser successfully reduces {} to ClassBody and class ab {} to ClassDecl. But then it does not reduce to ClassDeclList. Is it because I am writing a left recursive grammar here?
This is the part of my YACC rule base for the inference:
Program: PROGRAMnum IDnum SEMInum ClassDeclList
{printf("program"); $$ = MakeTree(ProgramOp,$4, MakeLeaf(IDNode,$2)); printtree($$,0);};
ClassDeclList: ClassDecl
{printf("ClassDeclList1");$$ = MakeTree(ClassOp,NullExp(),$1); printf("ClassDeclListend");};
|ClassDecl ClassDeclList
{printf("ClassDeclList2");$$ = MakeTree(ClassOp,$2,$1); printf("ClassDeclList");};
ClassDecl: CLASSnum IDnum ClassBody
{printf("ClassDecl");$$=MakeTree(ClassDefOp,$3,MakeLeaf(IDNode,$2)); printf("ClassDecl1");};
Have you tried
| ClassDeclList ClassDecl
instead of
| ClassDecl ClassDeclList
?
I remember this fixing many problems when I used to use CUP.

ANTLR3 C Target - parser return 'misses' out root element

I'm trying to use the ANTLR3 C Target to make sense of an AST, but am running into some difficulties.
I have a simple SQL-like grammar file:
grammar sql;
options
{
language = C;
output=AST;
ASTLabelType=pANTLR3_BASE_TREE;
}
sql : VERB fields;
fields : FIELD (',' FIELD)*;
VERB : 'SELECT' | 'UPDATE' | 'INSERT';
FIELD : CHAR+;
fragment
CHAR : 'a'..'z';
and this works as expected within ANTLRWorks.
In my C code I have:
const char pInput[] = "SELECT one,two,three";
pANTLR3_INPUT_STREAM pNewStrm = antlr3NewAsciiStringInPlaceStream((pANTLR3_UINT8) pInput,sizeof(pInput),NULL);
psqlLexer lex = sqlLexerNew (pNewStrm);
pANTLR3_COMMON_TOKEN_STREAM tstream = antlr3CommonTokenStreamSourceNew(ANTLR3_SIZE_HINT,
TOKENSOURCE(lex));
psqlParser ps = sqlParserNew( tstream );
sqlParser_sql_return ret = ps->sql(ps);
pANTLR3_BASE_TREE pTree = ret.tree;
cout << "Tree: " << pTree->toStringTree(pTree)->chars << endl;
ParseSubTree(0,pTree);
This outputs a flat tree structure when you use ->getChildCount and ->children->get to recurse through the tree.
void ParseSubTree(int level,pANTLR3_BASE_TREE pTree)
{
ANTLR3_UINT32 childcount = pTree->getChildCount(pTree);
for (int i=0;i<childcount;i++)
{
pANTLR3_BASE_TREE pChild = (pANTLR3_BASE_TREE) pTree->children->get(pTree->children,i);
for (int j=0;j<level;j++)
{
std::cout << " - ";
}
std::cout <<
pChild->getText(pChild)->chars <<
std::endl;
int f=pChild->getChildCount(pChild);
if (f>0)
{
ParseSubTree(level+1,pChild);
}
}
}
Program output:
Tree: SELECT one , two , three
SELECT
one
,
two
,
three
Now, if I alter the grammar file:
sql : VERB ^fields;
.. the call to ParseSubTree only displays the child nodes of fields.
Program output:
Tree: (SELECT one , two , three)
one
,
two
,
three
My question is: why, in the second case, does ANTLR give just the child nodes (in effect missing out the SELECT token)?
I'd be very grateful if anybody can give me any pointers for making sense of the tree returned by Antlr.
Useful Information:
AntlrWorks 1.4.2,
Antlr C Target 3.3,
MSVC 10
Placing output=AST; in the options section will not produce an actual AST; it only causes ANTLR to create CommonTree tokens instead of CommonTokens (or, in your case, the equivalent C structs).
If you use output=AST;, the next step is to put tree operators or rewrite rules inside your parser rules to give shape to your AST.
See this previous Q&A to find out how to create a proper AST.
For example, the following grammar (with rewrite rules):
options {
output=AST;
// ...
}
sql // make VERB the root
: VERB fields -> ^(VERB fields)
;
fields // omit the commas from the AST
: FIELD (',' FIELD)* -> FIELD+
;
VERB : 'SELECT' | 'UPDATE' | 'INSERT';
FIELD : CHAR+;
SPACE : ' ' {$channel=HIDDEN;};
fragment CHAR : 'a'..'z';
will parse the following input:
UPDATE field, foo , bar
into the following AST, written here in toStringTree notation:
(UPDATE field foo bar)
I think it is important to realize that the tree you see in ANTLRWorks is not the AST. The ".tree" in your code is the AST, but it may look different from what you expect. To shape the AST, you need to specify the root nodes, either with the ^ operator in strategic places or with rewrite rules.
You can read more here
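One more detail about the second output in the question: once VERB is made the root, SELECT is the node passed to ParseSubTree, and that function only ever prints the children of the node it is given. The sketch below shows a walker that prints the node itself first. It uses a hypothetical Tree struct as a stand-in for pANTLR3_BASE_TREE so that it compiles without the ANTLR runtime; a NULL text models the nil root that ANTLR uses for a flat list.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for pANTLR3_BASE_TREE: a label plus child pointers. */
typedef struct Tree {
    const char *text;             /* NULL models ANTLR's "nil" list root */
    const struct Tree *kids[8];
    int nkids;
} Tree;

/* Append the node itself (unless it is nil) to out, then its children,
   indented one level deeper. out must hold a '\0'-terminated string
   shorter than cap bytes on entry. */
static void walk(const Tree *t, int level, char *out, size_t cap) {
    size_t used;
    assert(t != NULL && out != NULL);
    used = strlen(out);
    if (t->text != NULL) {
        snprintf(out + used, cap - used, "%*s%s\n", level * 2, "", t->text);
        level++;
    }
    for (int i = 0; i < t->nkids; i++)
        walk(t->kids[i], level, out, cap);
}
```

With SELECT as the root and one and two as its children, walk produces SELECT on the first line and the two fields indented beneath it. With a nil root, the three tokens come out at the same level, which matches the flat output of the first grammar.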
