Resolving yacc conflicts - rules useless in parser due to conflicts - c

I am working on a yacc file to parse a given file and convert it to an equivalent c++ file. I have created the following grammar based on the provided syntax diagrams:
program: PROGRAMnumber id 'is' comp_stmt
;
comp_stmt: BEGINnumber statement symbol ENDnumber
;
statement: statement SEMInumber statement
| id EQnumber expression
| PRINTnumber expression
| declaration
;
declaration: VARnumber id
;
expression: term
;
term: term as_op term
| MINUSnumber term
| factor
;
factor: factor md_op factor
| ICONSTnumber
| id
| RPARENnumber expression LPARENnumber
;
as_op: PLUSnumber
| MINUSnumber
;
md_op: TIMESnumber
| DIVnumber
;
symbol: SEMInumber
| COMMAnumber
;
id: IDnumber
| id symbol id
;
The only issue I have remaining is that I am receiving this error when trying to compile with yacc.
conflicts: 14 shift/reduce
calc.y:103.17-111.41: warning: rule useless in parser due to conflicts: declaration: VARnumber id
I have resolved the only other conflict I have encountered, but I am not sure what the resolution for this conflict is. The line it should match is of the format
var a, b, c, d;
or
var a;

All of your productions intended to derive lists are ambiguous and therefore generate shift/reduce conflicts. For example:
id: id symbol id
Will be clearly ambiguous when there are three identifiers: are the first two to be reduced first, or the last two? The usual list idiom is left-recursion:
id_list: id | id_list ',' id
For most languages, that would not be correct for statements, which are terminated with semi-colons, not separated by them, but that model would work for a comma-separated list of identifiers, or for a left-associative sequence of addition operators.
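To make the ambiguity concrete, here is a small Python sketch (not part of the original grammar; both function names are made up for illustration) that counts how many distinct parse trees each style of list rule admits for a run of n identifiers. Under the doubly recursive rule the count grows like the Catalan numbers, while the left-recursive idiom always gives exactly one tree.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def ambiguous_trees(n: int) -> int:
    """Parse trees for n identifiers under  id: ID | id symbol id."""
    if n == 1:
        return 1
    # Any separator may be the top-level split: k identifiers on the
    # left, n - k on the right, each side parsed independently.
    return sum(ambiguous_trees(k) * ambiguous_trees(n - k) for k in range(1, n))


def left_recursive_trees(n: int) -> int:
    """Parse trees for n identifiers under  id_list: id | id_list ',' id."""
    # The right-hand operand is always a single id, so the only choice
    # is to peel the last identifier off: one tree, whatever n is.
    return 1 if n == 1 else left_recursive_trees(n - 1)
```

For three identifiers, ambiguous_trees(3) is 2 (the two groupings described above), whereas left_recursive_trees(3) is 1.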
For statements, you probably want something more like:
statement_list: | statement_list statement ';'
Speaking of symbol, do you really believe that , and ; have the same syntactic function? That seems unlikely, since you write var a, b, c, d; and not, for example, var a; b, c; d,.
The "useless rule" warning produced by bison is exactly because your grammar allows ids to be separated with semicolons. When the parser sees "var" ID with ; as lookahead, it first reduces ID to id and then needs to decide whether to reduce var id to declaration or to shift the ; in order to later reduce it to symbol and then proceed with the reduction of id symbol id. In the absence of precedence rules, bison always resolves shift/reduce conflicts in favour of shifting, so that is what it does in this case. But the result is that it is never possible to reduce "var" id to declaration, making the production useless as the result of a shift-reduce conflict resolution, which is more or less what the warning says.
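As a rough illustration of giving ',' and ';' different roles, the following hand-written Python sketch parses declarations of the form var a, b, c, d; with the comma as a separator inside the identifier list and the semicolon as a terminator. It is not a translation of the yacc file; tokenize and parse_declarations are hypothetical helper names, and only declarations are handled.

```python
import re


def tokenize(text):
    """Split input into identifiers/keywords and the two punctuation tokens."""
    return re.findall(r"[A-Za-z_]\w*|[,;]", text)


def parse_declarations(tokens):
    """Parse zero or more  'var' ID (',' ID)* ';'  and return the name lists."""
    pos = 0
    declarations = []
    while pos < len(tokens):
        if tokens[pos] != "var":
            raise SyntaxError(f"expected 'var', got {tokens[pos]!r}")
        pos += 1
        names = []
        while True:
            if pos >= len(tokens) or tokens[pos] == "var" or not tokens[pos].isidentifier():
                raise SyntaxError("expected an identifier")
            names.append(tokens[pos])
            pos += 1
            # A comma only ever separates identifiers inside one declaration.
            if pos < len(tokens) and tokens[pos] == ",":
                pos += 1
                continue
            break
        # A semicolon only ever terminates the declaration.
        if pos >= len(tokens) or tokens[pos] != ";":
            raise SyntaxError("expected ';'")
        pos += 1
        declarations.append(names)
    return declarations
```

With this split, var a, b, c, d; and var a; are accepted, while var a; b, c; d, is rejected as soon as b appears where a new declaration should start.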

Related

Wrong rule used in bison

I am trying to perform a syntax analysis using bison, but it uses the wrong rule at one point and I didn't manage to find how to fix it.
I have a few rules but these ones seem to be the source of the problem :
method : vars statements;
vars : //empty
| vars var;
var : type IDENTIFIER ';';
type : IDENTIFIER;
statements : //empty
| statements statement;
statement : IDENTIFIER '=' e ';';
e : (...)
With IDENTIFIER being a simple regex matching [a-zA-Z]*
So basically, if I write that :
int myint;
myint = 12;
Since myint is an identifier, bison seems to still try to match it on the second line as a type and then matches the whole thing as a var and not as a statement. So I get this error (knowing that ASSIGN is '=') :
syntax error, unexpected ASSIGN, expecting IDENTIFIER
Edit: Note that bison reports shift/reduce conflicts, so this may be related (as said in the answers).
The problem you're having is coming from the default resolution of the shift-reduce conflict you have due to the empty statements rule -- it needs to know whether to reduce the empty statement and start matching statements, or shift the IDENTIFIER that might begin another var. So it decides to shift, which puts it down the var path.
You can avoid this problem by refactoring the grammar to avoid empty productions:
method: vars | vars statements | statements ;
vars: var | vars var ;
statements : statement | statements statement ;
... rest the same
which avoids needing to know whether something is a var or a statement until after shifting far enough into it to tell.
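The idea of deferring the decision can be shown with a tiny Python sketch. It only illustrates the principle and is not how bison's tables work internally; classify_construct is a made-up name. Both constructs start with an IDENTIFIER, so the first token decides nothing, but the second token does.

```python
def classify_construct(tokens):
    """Decide between 'var' and 'statement' for one construct of the grammar."""
    if len(tokens) < 2 or not tokens[0].isidentifier():
        raise SyntaxError("expected IDENTIFIER")
    # tokens[0] is an IDENTIFIER in both rules, so look one token further.
    second = tokens[1]
    if second == "=":
        return "statement"  # statement : IDENTIFIER '=' e ';'
    if second.isidentifier():
        return "var"        # var : type IDENTIFIER ';'
    raise SyntaxError(f"unexpected token {second!r}")
```

Here classify_construct(["int", "myint", ";"]) gives "var" and classify_construct(["myint", "=", "12", ";"]) gives "statement", which is the distinction the refactored grammar lets the parser make after shifting the first IDENTIFIER.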

Conflict Bison parser

I'm new to Bison and I'm having trouble with shift/reduce conflicts...
I'm writing the rules for grammar for the C language: ID is a token that identifies a variable, and I wrote this rule to ensure that the identifier can be considered even if it is written in parentheses.
id : '(' ID ')' {printf("(ID) %s\n", $2);}
| ID {printf("ID %s\n", $1);}
;
Output of Bison conflicts is:
State 82
12 id: '(' ID . ')'
13 | ID .
')' shift, and go to state 22
')' [reduce using rule 13 (id)]
$default reduce using rule 13 (id)
How can I resolve this conflict?
I hope I was clear and thanks for your help.
Your id rule in itself cannot cause a shift/reduce error. There must be some other rule in your grammar that uses ID. For example, you have an expression rule such as:
expr: '(' expr ')'
| ID
;
In the above example, ID can reduce to id or to expr and the parser doesn't know which reduction to take. Check what is in state 22.
Edit: you ask "what can I do to solve the conflict?"
I'm writing the rules for grammar for the C language: ID is a token that identifies a variable, and I wrote this rule to ensure that the identifier can be considered even if it is written in parentheses
In C, a parenthesized identifier is just a primary expression (it even remains an lvalue, so (x) = 1; is valid), which means it needs no rule of its own. Treat it as an expression: remove your id rule and, wherever you use id, use expr instead.
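A minimal Python sketch of the single-rule approach, assuming the two-alternative expr rule from the example above (parse_expr is a hypothetical helper, and real C expressions obviously need many more alternatives): one recursive rule accepts an identifier wrapped in any number of parentheses, so no separate id rule is needed.

```python
def parse_expr(tokens, pos=0):
    """expr: '(' expr ')' | ID ; returns (identifier, position after expr)."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    token = tokens[pos]
    if token == "(":
        # Recurse for the inner expr, then require the matching ')'.
        name, pos = parse_expr(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        return name, pos + 1
    if token.isidentifier():
        return token, pos + 1
    raise SyntaxError(f"unexpected token {token!r}")
```

For example, parse_expr(["(", "x", ")"]) and parse_expr(["x"]) both yield the identifier x, consuming three tokens and one token respectively.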

Ada: Why exactly is ""A" .. "F"" not discrete?

Program Text
type T is array ("A" .. "F") of Integer;
Compiler Console Output
hello.adb:4:22: discrete type required for range
Question
If my understanding is correct, clause 9 from chapter 3.6 of the Ada reference manual is the reason for the compiler raising a compilation error:
Each index_subtype_definition or discrete_subtype_definition in an array_type_definition defines an index subtype; its type (the index type) shall be discrete.
Hence, why exactly is "A" .. "F" not discrete? What does discrete exactly mean?
Background info
The syntax requirements for array type definitions are quoted below. Source: Ada Reference Manual
array_type_definition ::= unconstrained_array_definition | constrained_array_definition
constrained_array_definition ::= array (discrete_subtype_definition {, discrete_subtype_definition}) of component_definition
discrete_subtype_definition ::= discrete_subtype_indication | range
range ::= range_attribute_reference | simple_expression .. simple_expression
simple_expression ::= [unary_adding_operator] term {binary_adding_operator term}
term ::= factor {multiplying_operator factor}
factor ::= primary [** primary] | abs primary | not primary
primary ::= numeric_literal | null | string_literal | aggregate | name | qualified_expression | allocator | (expression)
This:
"A" .. "F"
does satisfy the syntax of a range; it consists of a simple_expression, followed by .., followed by another simple_expression. So it's not a syntax error.
It's still invalid; specifically it's a semantic error. The syntax isn't the only thing that determines whether a chunk of code is valid or not. For example, "foo" * 42 is a syntactically valid expression, but it's semantically invalid because there's no * operator for a string and an integer (unless you write your own).
A discrete type is either an integer type or an enumeration type. Integer, Character, and Boolean are examples of discrete types. Floating-point types, array types, pointer types, record types, and so forth are not discrete types, so expressions of those types can't be used in a range for a discrete_subtype_definition.
This:
type T is array ("A" .. "F") of Integer;
is probably supposed to be:
type T is array ('A' .. 'F') of Integer;
String literals are of type String, which is an array type. Character literals are of type Character, which is an enumeration type and therefore a discrete type.
You wrote in a comment on another answer:
Unfortunately I'm unable to replace the string literals by character literals and recompile the code ...
If that's the case, it's quite unfortunate. The code you've posted is simply invalid; it will not compile. Your only options are to modify it or not to use it.
Ermm ... I think it is trying to tell you that you can't use String literals to specify ranges. You probably meant to use a character literal.
Reference:
http://archive.adaic.com/standards/83lrm/html/lrm-02-05.html
After all, the clauses quoted above are explicitly requiring string_literal to be used
You have misunderstood the Ada syntax specs. Specifically, you have missed this production:
name ::= simple_name
| character_literal | operator_symbol
| indexed_component | slice
| selected_component | attribute

Yacc- How to write action code for assign operation with C structure node

In yacc program,how do we write the action for assign operation using c structure node?
Example:-
stmt: stmt stmt ';'
| exp ';' {printtree();}
| bool ';' {...}
| VAR ASSIGN exp ';' { /* How to store this value to VAR using node? */ }
...
;
exp: exp PLUS exp {make_operator($1,'+',$3); /* which stores a char '+' with
left node to $1 and right node to $3 to the syntax tree */
}
| exp MINUS exp {...}
...
;
It would be of great help if someone can suggest a solution for this.
Since your Yacc parser is not actually executing the code but producing an abstract syntax tree (as evidenced by the use of a make_operator function in the PLUS operation), the same thing is done for the assignment. It could be as simple as:
stmt: stmt stmt ';'
| exp ';' {printtree();}
| bool ';' {...}
| VAR ASSIGN exp ';' {$$ = make_operator($1, '=', $3);}
...
;
The actual job of generating the code to perform the assignment will be done by other passes over the syntax tree which is constructed by the parser. Those passes will have to do things like ensuring that VAR is actually defined in the given scope and so on, depending on the rules of the language: does it have the right type, is it modifiable, ...
A translation scheme for assignments (at least of a simple scalar variable which fits into a register) is:
1. Generate the code to calculate the address of the assignment target, such that this code leaves the address in a new temporary register, call it t1.
2. Generate the code to calculate the value of the expression, leaving it in another register t2.
3. Generate the code mem[t1] := t2, which stores the value of t2 into the memory location pointed at by t1. (Of course, this intermediate code isn't literally represented by text such as mem[t1] := t2, but rather some instruction data structure. The text is just a printed notation so we can discuss it.)
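The three steps above can be sketched in Python as a later pass over the tree. This is only an illustration under several assumptions: Node and this make_operator are stand-ins for whatever the C structure and function in the question look like, leaves are plain names or numbers, and the intermediate code is kept as printable strings rather than as an instruction data structure.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Node:
    op: str
    left: object
    right: object


def make_operator(left, op, right):
    """Hypothetical Python analogue of the question's make_operator."""
    return Node(op, left, right)


def gen_assign(assign):
    """Emit intermediate code for an '=' node, following the three steps."""
    if assign.op != "=":
        raise ValueError("expected an assignment node")
    code = []
    temps = count(1)

    def new_temp():
        return f"t{next(temps)}"

    def gen_expr(node):
        # Leaves (names or constants) are used directly as operands.
        if not isinstance(node, Node):
            return str(node)
        left = gen_expr(node.left)
        right = gen_expr(node.right)
        temp = new_temp()
        code.append(f"{temp} := {left} {node.op} {right}")
        return temp

    # Step 1: address of the assignment target in a fresh register.
    target = new_temp()
    code.append(f"{target} := addr {assign.left}")
    # Step 2: value of the expression in another register.
    value = gen_expr(assign.right)
    if not isinstance(assign.right, Node):
        temp = new_temp()
        code.append(f"{temp} := {value}")
        value = temp
    # Step 3: store the value through the address.
    code.append(f"mem[{target}] := {value}")
    return code
```

For the tree built by make_operator("x", "=", make_operator("y", "+", 1)), this yields the lines t1 := addr x, t2 := y + 1 and mem[t1] := t2.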

Xtext - solving ambiguity without semantic predicates?

Problem
Entry
: temp += (Expression | Declaration | UserType)*
;
Declaration
: Type '*' name=ID ';'
;
Expression
: temp1 = Primary ('*' temp2 += Primary)* ';'
;
Primary
: temp1 = INT
| temp2 = [Declaration]
;
Type
: temp1 = SimpleType
| temp2 = [UserType]
;
SimpleType
: 'int' | 'long'
;
UserType
: 'typedef' name=ID ';'
;
Rules Declaration and Expression are ambiguous due to the fact that both rules share the exact same syntax and problems occur because both cross references [Declaration] as well as [UserType] are based on the terminal rule ID.
Therefore generating code for the grammar above will throw the ANTLR warning:
Decision can match input such as "RULE_ID '*' RULE_ID ';'"
using multiple alternatives: 1, 2
Goal
I would like the rule to be chosen which was able to resolve the cross reference first.
Assume the following:
typedef x;
int* x;
int* b;
The AST for
x*b
should look something like:
x = Entry -> Expression -> Primary (temp1) -> [Declaration] -> Stop!
* = Entry -> Expression -> Primary '*' -> Stop!
b = Entry -> Expression -> Primary (temp2) -> [Declaration] -> Stop!
Therefore
Entry -> Declaration
should never be considered, since
Entry -> Expression -> [Declaration]
could already validate the cross reference [Declaration].
Question
Because we do not have semantic predicates in Xtext (or am I wrong?), is there a way to validate a cross reference and explicitly choose that rule based on that validation?
PS: As a few might already know, this problem stems from the C language which I am trying to implement with Xtext.
Regarding the current version of Xtext, semantic predicates are not supported.
Cross-references are resolved to their terminals (in my case UserType and Declaration to the terminal ID), and references are validated only during the linking process, which in my case is too late since the AST has already been created.
The only possible way to use context sensitive rule decisions is to actually define an abstract rule within the grammar that states the syntax. In the example above, rules Expression and Declaration would be rewritten to one. Semantic validation then follows in the necessary areas such as content assist with the use of scoping.
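The "parse with one merged rule, decide afterwards" approach might look roughly like the Python sketch below. It is not Xtext API; choose_rule and its two symbol sets are assumptions made for illustration, and the preference order (a resolvable Declaration reference wins) simply mirrors the goal stated in the question.

```python
def choose_rule(tokens, declared_names, type_names):
    """Classify a merged  ID '*' ID ';'  construct after parsing it."""
    if len(tokens) != 4 or tokens[1] != "*" or tokens[3] != ";":
        raise SyntaxError("expected ID '*' ID ';'")
    left, right = tokens[0], tokens[2]
    # Prefer Expression when both operands resolve to declarations.
    if left in declared_names and right in declared_names:
        return "Expression"
    # Otherwise the left identifier must name a type: Type '*' name=ID ';'
    if left in type_names:
        return "Declaration"
    raise LookupError(f"cannot resolve {left!r}")
```

With typedef x; int* x; int* b; in scope, choose_rule(["x", "*", "b", ";"], {"x", "b"}, {"x", "int", "long"}) selects "Expression", matching the behaviour asked for.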

Resources