Xtext - solving ambiguity without semantic predicates? - c

Problem
Entry
: temp += (Expression | Declaration | UserType)*
;
Declaration
: Type '*' name=ID ';'
;
Expression
: temp1 = Primary ('*' temp2 += Primary)* ';'
;
Primary
: temp1 = INT
| temp2 = [Declaration]
;
Type
: temp1 = SimpleType
| temp2 = [UserType]
;
SimpleType
: 'int' | 'long'
;
UserType
: 'typedef' name=ID ';'
;
Rules Declaration and Expression are ambiguous due to the fact that both rules share the exact same syntax and problems occur because both cross references [Declaration] as well as [UserType] are based on the terminal rule ID.
Therefore generating code for the grammar above will throw the ANTLR warning:
Decision can match input such as "RULE_ID '*' RULE_ID ';'"
using multiple alternatives: 1, 2
Goal
I would like the rule to be chosen which was able to resolve the cross reference first.
Assume the following:
typedef x;
int* x;
int* b;
The AST for
x*b
should look something like:
x = Entry -> Expression -> Primary (temp1) -> [Declaration] -> Stop!
* = Entry -> Expression -> Primary '*' -> Stop!
b = Entry -> Expression -> Primary (temp2) -> [Declaration] -> Stop!
Therefore
Entry -> Declaration
should never be considered, since
Entry -> Expression -> [Declaration]
could already validate the cross reference [Declaration].
Question
Because we do not have semantic predicates in Xtext (or am I wrong?), is there a way to validate a cross reference and explicitly choose that rule based on that validation?
PS: As a few might already know, this problem stems from the C language which I am trying to implement with Xtext.

Regarding the current version of Xtext, semantic predicates are not supported.
Cross-references are resolved to their terminals (in my case UserRole and Declaration to terminal ID). And only during the linking process references are validated, which in my case is too late since the AST was already created.
The only possible way to use context sensitive rule decisions is to actually define an abstract rule within the grammar that states the syntax. In the example above, rules Expression and Declaration would be rewritten to one. Semantic validation then follows in the necessary areas such as content assist with the use of scoping.

Related

Wrong rule used in bison

I am trying to perform a syntax analysis using bison, but it uses the wrong rule at one point and I didn't manage to find how to fix it.
I have a few rules but these ones seem to be the source of the problem :
method : vars statements;
vars : //empty
| vars var;
var : type IDENTIFIER ';';
type : IDENTIFIER;
statements : //empty
| statements statement;
statement : IDENTIFIER '=' e ';';
e : (...)
With IDENTIFIER being a simple regex matching [a-zA-Z]*
So basically, if I write that :
int myint;
myint = 12;
Since myint is an identifier, bison seems to still try to match it on the second line as a type and then matches the whole thing as a var and not as a statement. So I get this error (knowing that ASSIGN is '=') :
syntax error, unexpected ASSIGN, expecting IDENTIFIER
Edit : Note that bison is indicating that there are shift/reduce errors, so it may be linked (as said in the answers).
The problem you're having is coming from the default resolution of the shift-reduce conflict you have due to the empty statements rule -- it needs to know whether to reduce the empty statement and start matching statements, or shift the IDENTIFIER that might begin another var. So it decides to shift, which puts it down the var path.
You can avoid this problem by refactoring the grammar to avoid empty productions:
method: vars | vars statements | statements ;
vars: var | vars var ;
statements : statement | statements statement ;
... rest the same
which avoids needing to know whether something is var or a statement until after shifting far enough into it to tell.

Backpatching with OR_ELSE and ALSO in yacc

I am trying to implement Syntax Directed Translation(SDT) using yacc. In the case of boolean expressions I've chosen to adopt Backpatching method. Below is an example of my implementation of logical OR using C in yacc :
simpleexp : simpleexp OR M simpleexp
{
backpatch($1.falselist,$3.quad);
$$.truelist = merge($1.truelist,$4.truelist);
$$.falselist = $4.falselist;
$$.type = TYPE_BOOL;
fprintf(fout, "Rule 64 \t\t simpleexp -> simpleexp OR simpleexp \n");
};
M : /* empty */
{
$$.quad = nextquad();//nextquad returns next available place in quadruples array
};
It does work but my current problem is something else. We are to have an alternate approach for handling logical operation OR_ELSE besides the "Side Effect" approach adopted just for usual OR. Lets go on a little bit more about the problem
Imagine we are dealing with something like simpleexp1 OR simpleexp2(numbers are provided in order to distinct simpleexp s from each other). Using the implemetation of logical OR , as implemented above , we won't evaluate simpleexp2 if simpleexp1 is true. Because we should jump to simpleexp1.truelist which is filled via calling backpatch function called somewhere else later. In case of simpleexp1 OR_ELSE simpleexp2 we want to be sure that simpleexp2 is evaluated , whether the final result will be true or false. For example if simpleexp2 is (x++ == 4) where x is an integer , we want to be sure that the incrementation takes place and the value of x increments by 1 ; however , in the case of simpleexp1 OR simpleexp2 , the value of x won't change if the simpleexp1 is true.
P.S. : I have figured out that in order the simpleexp2 to be evaluated we should fill both the truelist and falselist of simpleexp1 in an order that causes the control flow to be given to the beginning of simpleexp2 , but I don't know what to do next!

Resolving yacc conflicts - rules useless in parser due to conflicts

I am working on a yacc file to parse a given file and convert it to an equivalent c++ file. I have created the following grammar based on the provided syntax diagrams:
program: PROGRAMnumber id 'is' comp_stmt
;
comp_stmt: BEGINnumber statement symbol ENDnumber
;
statement: statement SEMInumber statement
| id EQnumber expression
| PRINTnumber expression
| declaration
;
declaration: VARnumber id
;
expression: term
;
term: term as_op term
| MINUSnumber term
| factor
;
factor: factor md_op factor
| ICONSTnumber
| id
| RPARENnumber expression LPARENnumber
;
as_op: PLUSnumber
| MINUSnumber
;
md_op: TIMESnumber
| DIVnumber
;
symbol: SEMInumber
| COMMAnumber
;
id: IDnumber
| id symbol id
;
The only issue I have remaining is that I am receiving this error when trying to compile with yacc.
conflicts: 14 shift/reduce
calc.y:103.17-111.41: warning: rule useless in parser due to conflicts: declaration: VARnumber id
I have resolved the only other conflict I have encountered, but I am not sure what the resolution for this conflict is. The line it should match is of the format
var a, b, c, d;
or
var a;
All of your productions intended to derive lists are ambiguous and therefore generate reduce/reduce conflicts. For example:
id: id symbol id
Will be clearly ambiguous when there are three identifiers: are the first two to be reduced first, or the last two? The usual list idiom is left-recursion:
id_list: id | id_list `,` id
For most languages, that would not be correct for statements, which are terminated with semi-colons, not separated by them, but that model would work for a comma-separated list of identifiers, or for a left-associative sequence of addition operators.
For statements, you probably want something more like:
statement_list: | statement_list statement ';'
Speaking of symbol, do you really believe that , and ; have the same syntactic function? That seems unlikely, since you write var a, b, c, d; and not, for example, var a; b, c; d,.
The "useless rule" warning produced by bison is exactly because your grammar allows ids to be separated with semicolons. When the parser sees "var" ID with ; as lookahead, it first reduces ID to id and then needs to decide whether to reduce var id to declaration or to shift the ; in order to later reduce it to symbol and then proceed with the reduction of id symbol id. In the absence of precedence rules, bison always resolves shift/reduce conflicts in favour of shifting, so that is what it does in this case. But the result is that it is never possible to reduce "var" id to declaration, making the production useless as the result of a shift-reduce conflict resolution, which is more or less what the warning says.

Yacc- How to write action code for assign operation with C structure node

In yacc program,how do we write the action for assign operation using c structure node?
Example:-
stmt: stmt stmt ';'
| exp ';' {printtree();}
| bool ';' {...}
| VAR ASSIGN exp ';' {//How to store this value to VAR using node?}
...
;
exp: exp PLUS exp {make_operator($1,'+',$3);// which stores a char '+' with
left node to $1 and right node to $3 to the synatx tree
}
| exp MINUS exp {...}
...
;
It would be of great help if someone can suggest a solution for this.
The answer is that since your Yacc parser is not actually executing the code, but producing an abstract syntax tree (as evidenced by the use of a make_operator function in the PLUS operation, the same thing is done for the assignment. It could be as simple as:
stmt: stmt stmt ';'
| exp ';' {printtree();}
| bool ';' {...}
| VAR ASSIGN exp ';' {$$ = make_operator($1, '=', $3);}
...
;
The actual job of generating the code to perform the assignment will be done by other passes over the syntax tree which is constructed by the parser. Those passes will have to do things like ensuring that VAR is actually defined in the given scope and so on, depending on the rules of the language: does it have the right type, is it modifiable, ...
A translation scheme for assignments (at least of a simple scalar variable which fits into a register) is:
Generate the code to calculate the address of the assignment target, such that this code leaves the value in a new temporary register, call it t1.
Generate the code to calculate the value of the expression, leaving it in another register t2.
Generate the code mem[t1] := t2 which represents store the value of t2 into the memory location pointed at by t1. (Of course, this intermediate code isn't literally represented by text such as mem[t1] := t2, but rather some instruction data structure. The text is just a printed notation so we can discuss it.)

How to define an operator alias in PostgreSQL?

Is there an easy way to define an operator alias for the = operator in PostgreSQL?
How is that solved for the != and <> operator? Only the <> operator seems to be in pg_operators. Is the != operator hard-coded?
This is needed for an application which uses a self-defined operator. In most environments this operator should act like a =, but there are some cases where we define a special behavior by creating an own operator and operator class. But for the normal case our operator should just be an alias for the = operator, so that it is transparent to the application which implementation is used.
Just check pgAdmin, the schema pg_catalog. It has all the operators and show you how the create them for all datatypes. Yes, you have to create them for all datatypes. So it's not just a single "alias", you need a lot of aliasses.
Example for a char = char, using !!!! as the alias:
CREATE OPERATOR !!!! -- name
(
PROCEDURE = pg_catalog.chareq,
LEFTARG = "char",
RIGHTARG = "char",
COMMUTATOR = !!!!, -- the same as the name
RESTRICT = eqsel,
JOIN = eqjoinsel,
HASHES,
MERGES
);
SELECT 'a' !!!! 'a' -- true
SELECT 'a' !!!! 'b' -- false
Check the manual as well and pay attention to the naming rules, it has some restrictions.

Resources