Wrong rule used in bison


I am trying to perform syntax analysis using bison, but at one point it uses the wrong rule and I haven't been able to figure out how to fix it.
I have a few rules, but these seem to be the source of the problem:
method : vars statements;
vars : //empty
| vars var;
var : type IDENTIFIER ';';
type : IDENTIFIER;
statements : //empty
| statements statement;
statement : IDENTIFIER '=' e ';';
e : (...)
With IDENTIFIER being a simple regex matching [a-zA-Z]*
So basically, if I write this:
int myint;
myint = 12;
Since myint is an identifier, bison still seems to try to match it as a type on the second line, and then tries to match the whole line as a var rather than as a statement. So I get this error (ASSIGN is '='):
syntax error, unexpected ASSIGN, expecting IDENTIFIER
Edit: Note that bison reports shift/reduce conflicts, so the problem may be linked to them (as said in the answers).

The problem comes from the default resolution of the shift/reduce conflict caused by the empty statements rule. Once the parser has seen the vars and the next token is an IDENTIFIER, it needs to know whether to reduce the empty statements production and start matching statements, or to shift the IDENTIFIER because it might begin another var. Bison resolves the conflict in favour of shifting, which puts it down the var path.
You can avoid this problem by refactoring the grammar to avoid empty productions:
method: vars | vars statements | statements ;
vars: var | vars var ;
statements : statement | statements statement ;
... rest the same
This avoids needing to know whether something is a var or a statement until the parser has shifted far enough into it to tell.
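To make that decision concrete, here is a small hand-written C sketch of what the parser effectively does once the conflict is gone: it shifts the first IDENTIFIER unconditionally and lets the token after it settle the question. This is only an illustration, not Bison output; the token names, the kind enum and the classify function are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token kinds; a real scanner would supply its own. */
enum token { TOK_IDENTIFIER, TOK_ASSIGN, TOK_SEMI, TOK_NUMBER, TOK_END };

enum kind { KIND_VAR, KIND_STATEMENT, KIND_ERROR };

/*
 * Decide what a construct is from its first two tokens.
 * Both alternatives start with IDENTIFIER, so the first token is
 * accepted without deciding anything; the second token tells them apart:
 *   IDENTIFIER IDENTIFIER ...  -> var        (type IDENTIFIER ';')
 *   IDENTIFIER '=' ...         -> statement  (IDENTIFIER '=' e ';')
 */
enum kind classify(const enum token *toks, size_t n)
{
    assert(toks != NULL);
    if (n < 2 || toks[0] != TOK_IDENTIFIER)
        return KIND_ERROR;
    if (toks[1] == TOK_IDENTIFIER)
        return KIND_VAR;
    if (toks[1] == TOK_ASSIGN)
        return KIND_STATEMENT;
    return KIND_ERROR;
}
```

Calling classify on the tokens of "int myint;" gives KIND_VAR, and on the tokens of "myint = 12;" gives KIND_STATEMENT, which is the distinction the original grammar could not make in time.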

Related

Conflict Bison parser

I'm new to Bison and I'm having trouble with shift/reduce conflicts...
I'm writing the rules for grammar for the C language: ID is a token that identifies a variable, and I wrote this rule to ensure that the identifier can be considered even if it is written in parentheses.
id : '(' ID ')' {printf("(ID) %s\n", $2);}
| ID {printf("ID %s\n", $1);}
;
Output of Bison conflicts is:
State 82
12 id: '(' ID . ')'
13 | ID .
')' shift, and go to state 22
')' [reduce using rule 13 (id)]
$default reduce using rule 13 (id)
How can I resolve this conflict?
I hope I was clear and thanks for your help.

Your id rule cannot cause a shift/reduce conflict by itself. There must be some other rule in your grammar that uses ID. For example, you may have an expression rule such as:
expr: '(' expr ')'
| ID
;
In the above example, ID can reduce to id or to expr and the parser doesn't know which reduction to take. Check what is in state 22.
Edit: you ask "what can I do to solve the conflict?" You wrote:
"I'm writing the rules for grammar for the C language: ID is a token that identifies a variable, and I wrote this rule to ensure that the identifier can be considered even if it is written in parentheses."
In C, a parenthesized identifier is simply a primary expression (it even remains an lvalue, so (a) = 1; is valid), so it needs no rule of its own. Just remove your id rule and use expr wherever you currently use id.

Resolving yacc conflicts - rules useless in parser due to conflicts

I am working on a yacc file to parse a given file and convert it to an equivalent c++ file. I have created the following grammar based on the provided syntax diagrams:
program: PROGRAMnumber id 'is' comp_stmt
;
comp_stmt: BEGINnumber statement symbol ENDnumber
;
statement: statement SEMInumber statement
| id EQnumber expression
| PRINTnumber expression
| declaration
;
declaration: VARnumber id
;
expression: term
;
term: term as_op term
| MINUSnumber term
| factor
;
factor: factor md_op factor
| ICONSTnumber
| id
| RPARENnumber expression LPARENnumber
;
as_op: PLUSnumber
| MINUSnumber
;
md_op: TIMESnumber
| DIVnumber
;
symbol: SEMInumber
| COMMAnumber
;
id: IDnumber
| id symbol id
;
The only issue I have remaining is that I am receiving this error when trying to compile with yacc.
conflicts: 14 shift/reduce
calc.y:103.17-111.41: warning: rule useless in parser due to conflicts: declaration: VARnumber id
I have resolved the only other conflict I have encountered, but I am not sure what the resolution for this conflict is. The line it should match is of the format
var a, b, c, d;
or
var a;

All of your productions intended to derive lists are ambiguous and therefore generate shift/reduce conflicts. For example:
id: id symbol id
is clearly ambiguous when there are three identifiers: are the first two to be reduced first, or the last two? The usual list idiom is left-recursion:
id_list: id | id_list ',' id
For most languages, that would not be correct for statements, which are terminated with semi-colons, not separated by them, but that model would work for a comma-separated list of identifiers, or for a left-associative sequence of addition operators.
For statements, you probably want something more like:
statement_list: | statement_list statement ';'
Speaking of symbol, do you really believe that , and ; have the same syntactic function? That seems unlikely, since you write var a, b, c, d; and not, for example, var a; b, c; d,.
The "useless rule" warning produced by bison is exactly because your grammar allows ids to be separated with semicolons. When the parser sees "var" ID with ; as lookahead, it first reduces ID to id and then needs to decide whether to reduce var id to declaration or to shift the ; in order to later reduce it to symbol and then proceed with the reduction of id symbol id. In the absence of precedence rules, bison always resolves shift/reduce conflicts in favour of shifting, so that is what it does in this case. But the result is that it is never possible to reduce "var" id to declaration, making the production useless as the result of a shift-reduce conflict resolution, which is more or less what the warning says.
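As a rough illustration of the left-recursive list idiom, the C sketch below parses a declaration such as var a, b, c, d; by hand. The loop plays the role of id_list: id_list ',' id, extending the list one identifier at a time, and the semicolon is consumed by the declaration rather than by the list. The function names and the fixed name length are assumptions made for this example; it is not what yacc generates.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define MAX_NAME 32

static const char *skip_ws(const char *p)
{
    while (isspace((unsigned char)*p))
        p++;
    return p;
}

/* Read one identifier (letters only) into out; return the position
 * just past it, or NULL if there is none or it is too long. */
static const char *read_id(const char *p, char out[MAX_NAME])
{
    size_t n = 0;
    p = skip_ws(p);
    while (isalpha((unsigned char)*p) && n < MAX_NAME - 1)
        out[n++] = *p++;
    out[n] = '\0';
    return (n > 0 && !isalpha((unsigned char)*p)) ? p : NULL;
}

/* Parse "var a, b, c;" and store the names.
 * Returns the number of identifiers, or -1 on a syntax error. */
int parse_declaration(const char *src, char names[][MAX_NAME], int max_names)
{
    int count = 0;
    const char *p = skip_ws(src);
    assert(max_names > 0);

    if (strncmp(p, "var", 3) != 0 || !isspace((unsigned char)p[3]))
        return -1;
    p += 3;

    /* id_list: id */
    p = read_id(p, names[count]);
    if (p == NULL)
        return -1;
    count++;

    /* id_list: id_list ',' id  (each pass extends the list on the right) */
    for (p = skip_ws(p); *p == ','; p = skip_ws(p)) {
        if (count == max_names)
            return -1;
        p = read_id(p + 1, names[count]);
        if (p == NULL)
            return -1;
        count++;
    }

    /* The terminating ';' belongs to the declaration, not to the list. */
    return *p == ';' ? count : -1;
}
```

Because only a comma can continue the list, a semicolon after an identifier unambiguously ends the declaration, which is exactly the decision the original grammar could not make.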

SAS array define error

%let ng = 4;
data a1;
set a2;
array cur{&ng} cur1-cur&ng.;
do i = 1 to &ng.;
if (_n_ = (i-1)*5 + 1) then cur[i] = Val;
end;
run;
Error msg
ERROR: Missing numeric suffix on a numbered variable list (cur1-cur).
ERROR: Too few variables defined for the dimension(s) specified for the array cur.
ERROR 22-322: Syntax error, expecting one of the following: a name, (, ;, _ALL_, _CHARACTER_, _CHAR_, _NUMERIC_.
ERROR 200-322: The symbol is not recognized and will be ignored.
Why do i = 1 to &ng. and cur{&ng} work but cur1-cur&ng. generates errors?

That code works fine for me. However, I have run into this problem when the macro variable (in this case ng) was created with the proc sql into: or call symput methods, as these set a default length of 8 and pad the value with spaces. I suspect that in your actual code the macro variable ng is being created in one of these ways.
To get around this, try adding %trim as below.
array cur[&ng.] cur1-cur%trim(&ng.);
You also need to add an end statement to close the do loop.

Yacc- How to write action code for assign operation with C structure node

In a yacc program, how do we write the action for an assignment operation using a C structure node?
Example:-
stmt: stmt stmt ';'
| exp ';' {printtree();}
| bool ';' {...}
| VAR ASSIGN exp ';' { /* How to store this value to VAR using node? */ }
...
;
exp: exp PLUS exp {make_operator($1,'+',$3); /* stores the char '+' in the
syntax tree, with $1 as the left node and $3 as the right node */
}
| exp MINUS exp {...}
...
;
It would be of great help if someone can suggest a solution for this.

Your Yacc parser is not actually executing the code but producing an abstract syntax tree (as evidenced by the use of a make_operator function in the PLUS operation), so the same thing is done for the assignment. It could be as simple as:
stmt: stmt stmt ';'
| exp ';' {printtree();}
| bool ';' {...}
| VAR ASSIGN exp ';' {$$ = make_operator($1, '=', $3);}
...
;
The actual job of generating the code to perform the assignment will be done by other passes over the syntax tree which is constructed by the parser. Those passes will have to do things like ensuring that VAR is actually defined in the given scope and so on, depending on the rules of the language: does it have the right type, is it modifiable, ...
A translation scheme for assignments (at least of a simple scalar variable which fits into a register) is:
1. Generate the code to calculate the address of the assignment target, such that this code leaves the value in a new temporary register, call it t1.
2. Generate the code to calculate the value of the expression, leaving it in another register t2.
3. Generate the code mem[t1] := t2, which stores the value of t2 into the memory location pointed at by t1. (Of course, this intermediate code isn't literally represented by text such as mem[t1] := t2, but rather by some instruction data structure. The text is just a printed notation so we can discuss it.)
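The steps above can be sketched in C roughly as follows. Everything here is an assumption made for illustration: the node layout, the make_var and make_num helpers, the codegen buffer and the textual notation are hypothetical, and make_operator only mirrors the call shape used in the question, not its real implementation. Memory is never freed, to keep the sketch short.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical AST node; the real layout behind make_operator is not shown. */
struct node {
    char op;                    /* '=', '+', 'v' (variable) or 'n' (number) */
    const char *name;           /* variable name when op == 'v' */
    int value;                  /* constant when op == 'n' */
    struct node *left, *right;
};

static struct node *new_node(char op, const char *name, int value,
                             struct node *left, struct node *right)
{
    struct node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->op = op; n->name = name; n->value = value;
    n->left = left; n->right = right;
    return n;
}

struct node *make_var(const char *name) { return new_node('v', name, 0, NULL, NULL); }
struct node *make_num(int value) { return new_node('n', NULL, value, NULL, NULL); }
struct node *make_operator(struct node *l, char op, struct node *r)
{
    return new_node(op, NULL, 0, l, r);
}

/* Output buffer plus a counter for fresh temporaries t1, t2, ... */
struct codegen { char text[256]; int next_temp; };

static void emit(struct codegen *cg, const char *fmt, ...)
{
    size_t used = strlen(cg->text);
    va_list ap;
    va_start(ap, fmt);
    vsnprintf(cg->text + used, sizeof cg->text - used, fmt, ap);
    va_end(ap);
}

/* Emit code that computes e; return the number of the temporary holding it. */
static int gen_value(struct codegen *cg, const struct node *e)
{
    int l, r, t;
    switch (e->op) {
    case 'n':
        t = cg->next_temp++;
        emit(cg, "t%d := %d\n", t, e->value);
        return t;
    case 'v':
        t = cg->next_temp++;
        emit(cg, "t%d := mem[&%s]\n", t, e->name);
        return t;
    default:                    /* binary operator such as '+' */
        l = gen_value(cg, e->left);
        r = gen_value(cg, e->right);
        t = cg->next_temp++;
        emit(cg, "t%d := t%d %c t%d\n", t, l, e->op, r);
        return t;
    }
}

/* The three steps for an '=' node whose left child is a plain variable. */
void gen_assign(struct codegen *cg, const struct node *assign)
{
    int t_addr, t_val;
    assert(assign->op == '=' && assign->left->op == 'v');
    t_addr = cg->next_temp++;
    emit(cg, "t%d := &%s\n", t_addr, assign->left->name);  /* step 1 */
    t_val = gen_value(cg, assign->right);                  /* step 2 */
    emit(cg, "mem[t%d] := t%d\n", t_addr, t_val);          /* step 3 */
}
```

Starting from struct codegen cg = { "", 1 }; and the tree make_operator(make_var("myint"), '=', make_num(12)), a call to gen_assign(&cg, tree) leaves three lines in cg.text: the address computation into t1, the constant into t2, and the store mem[t1] := t2.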

Xtext - solving ambiguity without semantic predicates?

Problem
Entry
: temp += (Expression | Declaration | UserType)*
;
Declaration
: Type '*' name=ID ';'
;
Expression
: temp1 = Primary ('*' temp2 += Primary)* ';'
;
Primary
: temp1 = INT
| temp2 = [Declaration]
;
Type
: temp1 = SimpleType
| temp2 = [UserType]
;
SimpleType
: 'int' | 'long'
;
UserType
: 'typedef' name=ID ';'
;
Rules Declaration and Expression are ambiguous due to the fact that both rules share the exact same syntax and problems occur because both cross references [Declaration] as well as [UserType] are based on the terminal rule ID.
Therefore generating code for the grammar above will throw the ANTLR warning:
Decision can match input such as "RULE_ID '*' RULE_ID ';'"
using multiple alternatives: 1, 2
Goal
I would like the rule to be chosen which was able to resolve the cross reference first.
Assume the following:
typedef x;
int* x;
int* b;
The AST for
x*b
should look something like:
x = Entry -> Expression -> Primary (temp2) -> [Declaration] -> Stop!
* = Entry -> Expression -> Primary '*' -> Stop!
b = Entry -> Expression -> Primary (temp2) -> [Declaration] -> Stop!
Therefore
Entry -> Declaration
should never be considered, since
Entry -> Expression -> [Declaration]
could already validate the cross reference [Declaration].
Question
Because we do not have semantic predicates in Xtext (or am I wrong?), is there a way to validate a cross reference and explicitly choose that rule based on that validation?
PS: As a few might already know, this problem stems from the C language which I am trying to implement with Xtext.

Semantic predicates are not supported in the current version of Xtext.
Cross-references are resolved to their terminals (in my case UserType and Declaration to the terminal ID). References are only validated during the linking process, which in my case is too late because the AST has already been created.
The only possible way to use context sensitive rule decisions is to actually define an abstract rule within the grammar that states the syntax. In the example above, rules Expression and Declaration would be rewritten to one. Semantic validation then follows in the necessary areas such as content assist with the use of scoping.