Is it possible to bucket on the count() of aggregates? The select parameter language grammar seems to suggest that it is, but I could be interpreting it wrong.
My rough interpretation:
predefined([expr = (aggr = (count())], bucket(...))
( "predefined" "(" exp "," "(" bucket ( "," bucket )* ")" ")" ) |
exp ::= ( "+" | "-") ( "$" identifier [ "=" math ] ) | ( math ) | ( aggr )
aggr ::= ( ( "count" "(" ")" ) |
( "sum" "(" exp ")" ) |
( "avg" "(" exp ")" ) |
( "max" "(" exp ")" ) |
My attempt (fails with "Expression 'count()' not applicable for single hit."):
all(group(predefined(status, bucket["field1"] ) ) each(
all(group(predefined(count(), bucket[0,10>, bucket[11,20>)) each(
output(count() as(count)
))
))
Creating predefined buckets of count() (or other aggregators) is not supported. Count in general (i.e., when counting subgroups rather than hits) would be a bit tricky because it is computed across the nodes as a data sketch, whose output would then need to be sent back down for bucketing.
Is this something you need to do? If so, create a ticket for it on https://github.com/vespa-engine/vespa/issues
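Since bucketing on count() is not available in the query itself, one possible workaround is to bucket in the application after the per-group counts have come back. The following is only a sketch under that assumption: the struct and the function name bucket_for_count are hypothetical and not part of any Vespa API, and the ranges are treated as half-open to mirror bucket[0,10> in the question.

```c
#include <stddef.h>

/* Half-open range [from, to), mirroring bucket[from, to> in the query. */
struct bucket {
    long from;
    long to;
};

/* Hypothetical helper: returns the index of the first bucket that
   contains count, or -1 if no bucket contains it. */
int bucket_for_count(long count, const struct bucket *buckets, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (count >= buckets[i].from && count < buckets[i].to)
            return (int)i;
    }
    return -1;
}
```

With the two buckets from the question, a count of 10 lands in neither bucket, because [0,10> excludes 10 and the next bucket starts at 11.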
Related
My scanner works fine, but I couldn't find what's wrong with my parser.
semi: "{" vallist "}"
| "{" "}""
;
val: tSTR
| tInt
;
vallist: vallist , val
| val
;
You have a number of problems, some of which are probably just typos in your copy-paste (what you have above will be rejected by bison).
Your main problem is probably using " (double quotes) for your tokens, which for the most part doesn't do anything useful -- it creates a 'new' token that is not the same as the single character token your lexer probably returns.
Instead, you want to use ' (single quotes):
semi: '{' vallist '}'
| '{' '}'
;
val: tSTR
| tInt
| semi
;
vallist: vallist ',' val
| val
;
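To see why single quotes line up with the scanner, recall that in bison a character token such as '{' has the numeric code of the character itself, so the scanner can simply return that character. Below is a minimal hand-written scanner sketch in C. The function name next_token and the numeric values given to tSTR and tInt are assumptions for illustration; in a real project bison generates those codes in its header.

```c
#include <ctype.h>

/* Hypothetical token codes; bison would normally generate these. */
enum { tSTR = 258, tInt = 259 };

/* Returns the next token from *p and advances *p.
   Returns 0 at end of input. */
int next_token(const char **p)
{
    while (isspace((unsigned char)**p))
        (*p)++;
    if (**p == '\0')
        return 0;
    if (isdigit((unsigned char)**p)) {
        while (isdigit((unsigned char)**p))
            (*p)++;
        return tInt;
    }
    if (**p == '"') {
        (*p)++;                          /* skip opening quote */
        while (**p != '\0' && **p != '"')
            (*p)++;
        if (**p == '"')
            (*p)++;                      /* skip closing quote */
        return tSTR;
    }
    /* Single-character tokens: the token code is the character itself,
       which is what '{', '}' and ',' mean in the grammar. */
    return (unsigned char)*(*p)++;
}
```

Feeding it the text { 12, "ab" } yields '{', tInt, ',', tSTR, '}' and then 0, which is the sequence the single-quoted grammar above expects.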
So in all cases of AST examples, there are productions of the following kind:
expr -> expr "+" expr;
expr -> expr "-" expr;
And in this case it's easy to create a new node like this:
expr: expr "+" expr { $$ = newNode("+", $1, $3); }
;
Now my grammar has the following implementation:
assignment:IDENTIFIER '=' expression ';'
;
expression:term expression_1
;
expression_1: '+' term expression_1 |
'-' term expression_1 |
;
term: factor term_1
;
term_1: '*' factor term_1 |
'/' factor term_1 |
;
factor: IDENTIFIER |
'(' expression ')' |
NUM | FNUM | STRING
;
Here, while making a new node, how do I take the first operand (which is in a previous production) and feed it into a newNode function together with the operator and the second operand (both of which are in a different production)?
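One common way to think about this problem is to hand the left operand down to the "tail" rule as an inherited attribute. The sketch below shows the idea in plain C as a hand-written recursive descent that mirrors expression and expression_1; it is not bison code. All names (struct node, new_node, the fixed pool) are hypothetical, operands are single characters, and the input is assumed to be well formed.

```c
#include <stddef.h>

struct node {
    char op;                  /* '+', '-', or 0 for a leaf */
    char leaf;                /* operand character when op == 0 */
    struct node *left, *right;
};

/* Fixed pool keeps the sketch free of malloc; there is no bounds check. */
static struct node pool[64];
static size_t used;

static struct node *new_node(char op, char leaf,
                             struct node *l, struct node *r)
{
    struct node *n = &pool[used++];
    n->op = op; n->leaf = leaf; n->left = l; n->right = r;
    return n;
}

/* term: a single character (stand-in for the real term rule) */
static struct node *term(const char **p)
{
    return new_node(0, *(*p)++, NULL, NULL);
}

/* expression_1: '+' term expression_1 | '-' term expression_1 | empty
   The left operand arrives as a parameter (an inherited attribute). */
static struct node *expression_1(const char **p, struct node *left)
{
    if (**p == '+' || **p == '-') {
        char op = *(*p)++;
        struct node *right = term(p);
        return expression_1(p, new_node(op, 0, left, right));
    }
    return left;              /* empty alternative: nothing to combine */
}

/* expression: term expression_1 */
struct node *expression(const char **p)
{
    return expression_1(p, term(p));
}
```

For the input a-b+c this builds (a-b)+c, so the operators stay left-associative. In a bison grammar the simpler route is usually to write the rules left-recursively, as in the expr "+" expr style shown earlier, since an LALR parser has no need for the factored expression_1 form.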
How to enable a start condition at the beginning of a rule and disable it at the end? I have to ignore whitespace with some bison rules only.
How to ignore whitespace inside nested brackets.
define_directive:
DEFINE '(' class_name ')'{ ... }
;
I'm trying to write a parser for this sample code with some more rules.
#/*
* #Template Family
* #Description sample script template for Mate Programming language
* (multi-line comment)
*/
#namespace(sample)
#require(String fatherName)
#require(String motherName)
#require(Array childrenNames)
#define(Family : Template) #// end of header anything can go in body section below (comment)
Family Description
==================
Father's Name: #(fatherName)
Mother's Name: #(motherName)
Number of child: #(childrenNamesCount,0) #// valuation operator is null safe (comment)
List of children's names
------------------------
#foreach(childName:childrenNames)
> #(childName)
#empty
> there is no child name to display.
#end
##(varName) #// this should not be interpreted because escaped with # (comment)
The lexer and parser are partially implemented. My problem is how to deal with whitespace inside statement keywords like #foreach and #require.
Whitespace should be ignored inside these.
desired sample output
Family Description
==================
Father's Name: Mira
Mother's Name: James
Number of child: 0
List of children's names
------------------------
> there is no child name to display.
##(varName)
bison file content
command:
fileword
| valuation
| alternative
| loop
| command_directive
;
fileword:
tokenword { scriptlangy_echo(yytext,"fileword.tokenword"); }
| MAGICESC { scriptlangy_echo("#","fileword.MAGICESC"); }
;
tokenword:
IDENTIFIER | NUMBER | STRING_LITERAL | WHITESPACE
| INC_OP | DEC_OP | AND_OP | OR_OP | LE_OP | GE_OP | EQ_OP | NE_OP | L_OP | G_OP
| ';' | ',' | ':' | '=' | ']' | '.' | '&' | '[' | '!' | '~' | '-' | '+' | '*' | '/' | '%' | '^' | '|' | ')' | '}' | '?' | '{' | '('
;
valuation:
'#' '(' expression ')' {
fprintf(yyout, "<val>");
}
| '#' '(' expression ',' default_value ')' {
fprintf(yyout, "<val>");
}
;
loop:
for_loop
| foreach_loop
| while_loop
;
while_loop:
WHILE '(' expression ')' end_block
| WHILE '(' expression ')' commands end_block
;
for_loop:
FOR '(' expression_statement expression_statement expression')' end_block
| FOR '(' expression_statement expression_statement expression')' commands end_block
;
foreach_loop:
foreach_block end_block
| foreach_block empty_block end_block
;
foreach_block:
FOREACH '(' IDENTIFIER ')'
| FOREACH '(' IDENTIFIER ':' expression')' commands
;
The key part of your question seems to be this:
I have to ignore whitespace with some bison rules only. How to ignore
whitespace inside nested brackets.
As I remarked in comments, your implementation idea of somehow doing this by having your parser rules manipulate scanner start conditions is pretty much a non-starter. Forget about that.
Since evidently your scanner does not, in general, ignore whitespace, it must emit tokens that represent whitespace, or perhaps tokens that represent something else plus whitespace (ugly). If it emits whitespace tokens then the thing to do is simply to account for them in your grammar rules. This is completely possible. In fact, you can build a parser for any context-free language on top of a scanner that just returns every character as its own token. The scanner / parser dichotomy is a functional and conceptual convenience, not a necessity.
For example, then, suppose we want to be able to parse numeric array literals, formed as a nonempty, comma-delimited list of decimal numbers enclosed in curly braces, with optional whitespace around commas and inside the braces. Suppose further that we have these terminal symbols to work with:
OPEN // open brace
CLOSE // close brace
NUM // maximal sequence of one or more decimal digits
COMMA // a comma
WS // a maximal run of whitespace
We might then write these rules:
array: array_start array_elements CLOSE;
array_start: OPEN
| OPEN WS
;
array_elements: array_element
| array_elements array_separator array_element
;
array_element: NUM
| NUM WS
;
array_separator: COMMA
| COMMA WS
;
There are, of course, many other ways to set up the details, but, generally speaking, this is how you handle whitespace with parser rules: not by ignoring it, but by accepting it.
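The same rules can be checked by hand with a small recognizer, which may make the "accept it, don't ignore it" idea more concrete. This is a sketch in C, not generated parser code: the enum, the END sentinel and the function names are assumptions introduced here, and the token sequence is assumed to come from a scanner that emits WS as a maximal run.

```c
#include <stddef.h>

enum tok { OPEN, CLOSE, NUM, COMMA, WS, END };

/* Skips one optional WS token starting at index i. */
static size_t opt_ws(const enum tok *t, size_t i)
{
    return t[i] == WS ? i + 1 : i;
}

/* Returns 1 if the END-terminated token sequence matches the
   array rule above, 0 otherwise. */
int is_array(const enum tok *t)
{
    size_t i = 0;

    if (t[i] != OPEN)                /* array_start: OPEN | OPEN WS */
        return 0;
    i = opt_ws(t, i + 1);

    for (;;) {
        if (t[i] != NUM)             /* array_element: NUM | NUM WS */
            return 0;
        i = opt_ws(t, i + 1);
        if (t[i] != COMMA)           /* array_separator: COMMA | COMMA WS */
            break;
        i = opt_ws(t, i + 1);
    }
    return t[i] == CLOSE && t[i + 1] == END;
}
```

Each place where the grammar has an "X | X WS" alternative becomes one call to opt_ws, so whitespace is consumed exactly where the rules allow it and nowhere else.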
I'm having trouble fixing a shift/reduce conflict in my grammar. I ran bison with -v to inspect the conflict; the output points me to State 0 and says that INT and FLOAT are reduced to variable_definitions by rule 9. I cannot see the conflict and I'm having trouble finding a solution.
%{
#include <stdio.h>
#include <stdlib.h>
%}
%token INT FLOAT
%token ADDOP MULOP INCOP
%token WHILE IF ELSE RETURN
%token NUM ID
%token INCLUDE
%token STREAMIN ENDL STREAMOUT
%token CIN COUT
%token NOT
%token FLT_LITERAL INT_LITERAL STR_LITERAL
%right ASSIGNOP
%left AND OR
%left RELOP
%%
program: variable_definitions
| function_definitions
;
function_definitions: function_head block
| function_definitions function_head block
;
identifier_list: ID
| ID '[' INT_LITERAL ']'
| identifier_list ',' ID
| identifier_list ',' ID '[' INT_LITERAL ']'
;
variable_definitions:
| variable_definitions type identifier_list ';'
;
type: INT
| FLOAT
;
function_head: type ID arguments
;
arguments: '('parameter_list')'
;
parameter_list:
|parameters
;
parameters: type ID
| type ID '['']'
| parameters ',' type ID
| parameters ',' type ID '['']'
;
block: '{'variable_definitions statements'}'
;
statements:
| statements statement
;
statement: expression ';'
| compound_statement
| RETURN expression ';'
| IF '('bool_expression')' statement ELSE statement
| WHILE '('bool_expression')' statement
| input_statement ';'
| output_statement ';'
;
input_statement: CIN
| input_statement STREAMIN variable
;
output_statement: COUT
| output_statement STREAMOUT expression
| output_statement STREAMOUT STR_LITERAL
| output_statement STREAMOUT ENDL
;
compound_statement: '{'statements'}'
;
variable: ID
| ID '['expression']'
;
expression_list:
| expressions
;
expressions: expression
| expressions ',' expression
;
expression: variable ASSIGNOP expression
| variable INCOP expression
| simple_expression
;
simple_expression: term
| ADDOP term
| simple_expression ADDOP term
;
term: factor
| term MULOP factor
;
factor: ID
| ID '('expression_list')'
| literal
| '('expression')'
| ID '['expression']'
;
literal: INT_LITERAL
| FLT_LITERAL
;
bool_expression: bool_term
| bool_expression OR bool_term
;
bool_term: bool_factor
| bool_term AND bool_factor
;
bool_factor: NOT bool_factor
| '('bool_expression')'
| simple_expression RELOP simple_expression
;
%%
Your definition of a program is that it is either a list of variable definitions or a list of function definitions (program: variable_definitions | function_definitions;). That seems a bit odd to me. What if I want to define both a function and a variable? Do I have to write two programs and somehow link them together?
This is not the cause of your problem, but fixing it would probably fix the problem as well. The immediate cause is that function_definitions is one or more function definitions while variable_definitions is zero or more variable definitions. In other words, the base case of the function_definitions recursion is a function definition, while the base case of variable_definitions is the empty sequence. So a list of variable definitions starts with an empty sequence.
But both function definitions and variable definitions start with a type. So if the first token of a program is int, it could be the start of a function definition with return type int or a variable definition of type int. In the former case, the parser should shift the int in order to produce the function_definitions base case; in the latter case, it should immediately reduce an empty variable_definitions base case. With only that one token of lookahead the parser cannot tell which case applies, and that is the shift/reduce conflict.
If you really wanted a program to be either function definitions or variable definitions, but not both, you would need to make variable_definitions have the same form as function_definitions, by changing the base case from empty to type identifier_list ';'. Then you could add an empty production to program so that the parser could recognize empty inputs.
But as I said at the beginning, you probably want a program to be a sequence of definitions, each of which could either be a variable or a function:
program: %empty
| program type identifier_list ';'
| program function_head block
By the way, you are misreading the output file produced by -v. It shows the following actions for State 0:
INT shift, and go to state 1
FLOAT shift, and go to state 2
INT [reduce using rule 9 (variable_definitions)]
FLOAT [reduce using rule 9 (variable_definitions)]
Here, INT and FLOAT are possible lookaheads. So the interpretation of the line INT [reduce using rule 9 (variable_definitions)] is "if the lookahead is INT, immediately reduce using production 9". Production 9 produces the empty sequence, so the reduction reduces zero tokens at the top of the parser stack into a variable_definitions. Reductions do not use the lookahead token, so after the reduction, the lookahead token is still INT.
However, the parser doesn't actually do that because it has a different action for INT, which is to shift it and go to state 1, as indicated by the first line starting with INT. The brackets [...] indicate that this action is not taken because it is a conflict and the resolution of the conflict was some other action. So the more accurate interpretation of that line is "if it weren't for the preceding action on INT, the lookahead INT would cause a reduction using rule 9."
I am writing a program to analyse Pascal grammar. I want to check the correctness of an input Pascal file and show where the errors are.
I have a problem with finding more than one error: after finding the first error, the parser stops.
Also, the parser doesn't show the line the error is actually on. It only displays "Syntax error at or before [declaration], line", but I want it to report the line where the error is.
I used : http://ccia.ei.uvigo.es/docencia/PL/doc/bison/pascal/
pascal.l
%{
/*
* pascal.l
*
* lex input file for pascal scanner
*
* extensions: to ways to spell "external" and "->" ok for "^".
*/
#include <stdio.h>
#include "pascal.tab.h"
int line_no = 1;
%}
A [aA]
B [bB]
C [cC]
D [dD]
E [eE]
F [fF]
G [gG]
H [hH]
I [iI]
J [jJ]
K [kK]
L [lL]
M [mM]
N [nN]
O [oO]
P [pP]
Q [qQ]
R [rR]
S [sS]
T [tT]
U [uU]
V [vV]
W [wW]
X [xX]
Y [yY]
Z [zZ]
NQUOTE [^']
%%
{A}{N}{D} return(AND);
{A}{R}{R}{A}{Y} return(ARRAY);
{C}{A}{S}{E} return(CASE);
{C}{O}{N}{S}{T} return(CONST);
{D}{I}{V} return(DIV);
{D}{O} return(DO);
{D}{O}{W}{N}{T}{O} return(DOWNTO);
{E}{L}{S}{E} return(ELSE);
{E}{N}{D} return(END);
{E}{X}{T}{E}{R}{N} |
{E}{X}{T}{E}{R}{N}{A}{L} return(EXTERNAL);
{F}{O}{R} return(FOR);
{F}{O}{R}{W}{A}{R}{D} return(FORWARD);
{F}{U}{N}{C}{T}{I}{O}{N} return(FUNCTION);
{G}{O}{T}{O} return(GOTO);
{I}{F} return(IF);
{I}{N} return(IN);
{L}{A}{B}{E}{L} return(LABEL);
{M}{O}{D} return(MOD);
{N}{I}{L} return(NIL);
{N}{O}{T} return(NOT);
{O}{F} return(OF);
{O}{R} return(OR);
{O}{T}{H}{E}{R}{W}{I}{S}{E} return(OTHERWISE);
{P}{A}{C}{K}{E}{D} return(PACKED);
{B}{E}{G}{I}{N} return(PBEGIN);
{F}{I}{L}{E} return(PFILE);
{P}{R}{O}{C}{E}{D}{U}{R}{E} return(PROCEDURE);
{P}{R}{O}{G}{R}{A}{M} return(PROGRAM);
{R}{E}{C}{O}{R}{D} return(RECORD);
{R}{E}{P}{E}{A}{T} return(REPEAT);
{S}{E}{T} return(SET);
{T}{H}{E}{N} return(THEN);
{T}{O} return(TO);
{T}{Y}{P}{E} return(TYPE);
{U}{N}{T}{I}{L} return(UNTIL);
{V}{A}{R} return(VAR);
{W}{H}{I}{L}{E} return(WHILE);
{W}{I}{T}{H} return(WITH);
[a-zA-Z]([a-zA-Z0-9\-])* return(IDENTIFIER);
":=" return(ASSIGNMENT);
'({NQUOTE}|'')+' return(CHARACTER_STRING);
":" return(COLON);
"," return(COMMA);
[0-9]+ return(DIGSEQ);
"." return(DOT);
".." return(DOTDOT);
"=" return(EQUAL);
">=" return(GE);
">" return(GT);
"[" return(LBRAC);
"<=" return(LE);
"(" return(LPAREN);
"<" return(LT);
"-" return(MINUS);
"<>" return(NOTEQUAL);
"+" return(PLUS);
"]" return(RBRAC);
[0-9]+"."[0-9]+ return(REALNUMBER);
")" return(RPAREN);
";" return(SEMICOLON);
"/" return(SLASH);
"*" return(STAR);
"**" return(STARSTAR);
"->" |
"^" return(UPARROW);
"(*" |
"{" { register int c;
while ((c = input()))
{
if (c == '}')
break;
else if (c == '*')
{
if ((c = input()) == ')')
break;
else
unput (c);
}
else if (c == '\n')
line_no++;
else if (c == 0)
commenteof();
}
}
[\t\f " "] ;
\n line_no++;
. { fprintf (stderr,
"'%c' (0%o): illegal character at line %d\n",
yytext[0], yytext[0], line_no);
}
%%
commenteof()
{
fprintf (stderr, "Unexpected EOF inside comment at line %d\n", line_no);
exit (1);
}
yywrap ()
{
return (1);
}
pascal.y
%{
/*
* pascal.y
*
* Pascal grammar in Yacc format, based originally on BNF given
* in "Standard Pascal -- User Reference Manual", by Doug Cooper.
* This in turn is the BNF given by the ANSI and ISO Pascal standards,
* and so, is PUBLIC DOMAIN. The grammar is for ISO Level 0 Pascal.
* The grammar has been massaged somewhat to make it LALR, and added
* the following extensions.
*
* constant expressions
* otherwise statement in a case
* productions to correctly match else's with if's
* beginnings of a separate compilation facility
*/
#include<stdio.h>
%}
%token AND ARRAY ASSIGNMENT CASE CHARACTER_STRING
%token COLON COMMA CONST DIGSEQ DIV DO DOT DOTDOT
%token DOWNTO ELSE END EQUAL EXTERNAL FOR FORWARD
%token FUNCTION GE GOTO GT IDENTIFIER IF IN LABEL LBRAC
%token LE LPAREN LT MINUS MOD NIL NOT NOTEQUAL OF OR
%token OTHERWISE PACKED PBEGIN PFILE PLUS PROCEDURE
%token PROGRAM RBRAC REALNUMBER RECORD REPEAT RPAREN
%token SEMICOLON SET SLASH STAR STARSTAR THEN
%token TO TYPE UNTIL UPARROW VAR WHILE WITH
%%
file : program
| module
;
program : program_heading semicolon block DOT
;
program_heading : PROGRAM identifier
| PROGRAM identifier LPAREN identifier_list RPAREN
;
identifier_list : identifier_list COMMA identifier
| identifier
;
block : label_declaration_part
constant_definition_part
type_definition_part
variable_declaration_part
procedure_and_function_declaration_part
statement_part
;
module : constant_definition_part
type_definition_part
variable_declaration_part
procedure_and_function_declaration_part
;
label_declaration_part : LABEL label_list semicolon
|
;
label_list : label_list comma label
| label
;
label : DIGSEQ
;
constant_definition_part : CONST constant_list
|
;
constant_list : constant_list constant_definition
| constant_definition
;
constant_definition : identifier EQUAL cexpression semicolon
;
/*constant : cexpression ; /* good stuff! */
cexpression : csimple_expression
| csimple_expression relop csimple_expression
;
csimple_expression : cterm
| csimple_expression addop cterm
;
cterm : cfactor
| cterm mulop cfactor
;
cfactor : sign cfactor
| cexponentiation
;
cexponentiation : cprimary
| cprimary STARSTAR cexponentiation
;
cprimary : identifier
| LPAREN cexpression RPAREN
| unsigned_constant
| NOT cprimary
;
constant : non_string
| sign non_string
| CHARACTER_STRING
;
sign : PLUS
| MINUS
;
non_string : DIGSEQ
| identifier
| REALNUMBER
;
type_definition_part : TYPE type_definition_list
|
;
type_definition_list : type_definition_list type_definition
| type_definition
;
type_definition : identifier EQUAL type_denoter semicolon
;
type_denoter : identifier
| new_type
;
new_type : new_ordinal_type
| new_structured_type
| new_pointer_type
;
new_ordinal_type : enumerated_type
| subrange_type
;
enumerated_type : LPAREN identifier_list RPAREN
;
subrange_type : constant DOTDOT constant
;
new_structured_type : structured_type
| PACKED structured_type
;
structured_type : array_type
| record_type
| set_type
| file_type
;
array_type : ARRAY LBRAC index_list RBRAC OF component_type
;
index_list : index_list comma index_type
| index_type
;
index_type : ordinal_type ;
ordinal_type : new_ordinal_type
| identifier
;
component_type : type_denoter ;
record_type : RECORD record_section_list END
| RECORD record_section_list semicolon variant_part END
| RECORD variant_part END
;
record_section_list : record_section_list semicolon record_section
| record_section
;
record_section : identifier_list COLON type_denoter
;
variant_part : CASE variant_selector OF variant_list semicolon
| CASE variant_selector OF variant_list
|
;
variant_selector : tag_field COLON tag_type
| tag_type
;
variant_list : variant_list semicolon variant
| variant
;
variant : case_constant_list COLON LPAREN record_section_list RPAREN
| case_constant_list COLON LPAREN record_section_list semicolon
variant_part RPAREN
| case_constant_list COLON LPAREN variant_part RPAREN
;
case_constant_list : case_constant_list comma case_constant
| case_constant
;
case_constant : constant
| constant DOTDOT constant
;
tag_field : identifier ;
tag_type : identifier ;
set_type : SET OF base_type
;
base_type : ordinal_type ;
file_type : PFILE OF component_type
;
new_pointer_type : UPARROW domain_type
;
domain_type : identifier ;
variable_declaration_part : VAR variable_declaration_list semicolon
|
;
variable_declaration_list :
variable_declaration_list semicolon variable_declaration
| variable_declaration
;
variable_declaration : identifier_list COLON type_denoter
;
procedure_and_function_declaration_part :
proc_or_func_declaration_list semicolon
|
;
proc_or_func_declaration_list :
proc_or_func_declaration_list semicolon proc_or_func_declaration
| proc_or_func_declaration
;
proc_or_func_declaration : procedure_declaration
| function_declaration
;
procedure_declaration : procedure_heading semicolon directive
| procedure_heading semicolon procedure_block
;
procedure_heading : procedure_identification
| procedure_identification formal_parameter_list
;
directive : FORWARD
| EXTERNAL
;
formal_parameter_list : LPAREN formal_parameter_section_list RPAREN ;
formal_parameter_section_list :
formal_parameter_section_list semicolon formal_parameter_section
| formal_parameter_section
;
formal_parameter_section : value_parameter_specification
| variable_parameter_specification
| procedural_parameter_specification
| functional_parameter_specification
;
value_parameter_specification : identifier_list COLON identifier
;
variable_parameter_specification : VAR identifier_list COLON identifier
;
procedural_parameter_specification : procedure_heading ;
functional_parameter_specification : function_heading ;
procedure_identification : PROCEDURE identifier ;
procedure_block : block ;
function_declaration : function_heading semicolon directive
| function_identification semicolon function_block
| function_heading semicolon function_block
;
function_heading : FUNCTION identifier COLON result_type
| FUNCTION identifier formal_parameter_list COLON result_type
;
result_type : identifier ;
function_identification : FUNCTION identifier ;
function_block : block ;
statement_part : compound_statement ;
compound_statement : PBEGIN statement_sequence END ;
statement_sequence : statement_sequence semicolon statement
| statement
;
statement : open_statement
| closed_statement
;
open_statement : label COLON non_labeled_open_statement
| non_labeled_open_statement
;
closed_statement : label COLON non_labeled_closed_statement
| non_labeled_closed_statement
;
non_labeled_closed_statement : assignment_statement
| procedure_statement
| goto_statement
| compound_statement
| case_statement
| repeat_statement
| closed_with_statement
| closed_if_statement
| closed_while_statement
| closed_for_statement
|
;
non_labeled_open_statement : open_with_statement
| open_if_statement
| open_while_statement
| open_for_statement
;
repeat_statement : REPEAT statement_sequence UNTIL boolean_expression
;
open_while_statement : WHILE boolean_expression DO open_statement
;
closed_while_statement : WHILE boolean_expression DO closed_statement
;
open_for_statement : FOR control_variable ASSIGNMENT initial_value direction
final_value DO open_statement
;
closed_for_statement : FOR control_variable ASSIGNMENT initial_value direction
final_value DO closed_statement
;
open_with_statement : WITH record_variable_list DO open_statement
;
closed_with_statement : WITH record_variable_list DO closed_statement
;
open_if_statement : IF boolean_expression THEN statement
| IF boolean_expression THEN closed_statement ELSE open_statement
;
closed_if_statement : IF boolean_expression THEN closed_statement
ELSE closed_statement
;
assignment_statement : variable_access ASSIGNMENT expression
;
variable_access : identifier
| indexed_variable
| field_designator
| variable_access UPARROW
;
indexed_variable : variable_access LBRAC index_expression_list RBRAC
;
index_expression_list : index_expression_list comma index_expression
| index_expression
;
index_expression : expression ;
field_designator : variable_access DOT identifier
;
procedure_statement : identifier params
| identifier
;
params : LPAREN actual_parameter_list RPAREN ;
actual_parameter_list : actual_parameter_list comma actual_parameter
| actual_parameter
;
/*
* this forces you to check all this to be sure that only write and
* writeln use the 2nd and 3rd forms, you really can't do it easily in
* the grammar, especially since write and writeln aren't reserved
*/
actual_parameter : expression
| expression COLON expression
| expression COLON expression COLON expression
;
goto_statement : GOTO label
;
case_statement : CASE case_index OF case_list_element_list END
| CASE case_index OF case_list_element_list semicolon END
| CASE case_index OF case_list_element_list semicolon
otherwisepart statement END
| CASE case_index OF case_list_element_list semicolon
otherwisepart statement semicolon END
;
case_index : expression ;
case_list_element_list : case_list_element_list semicolon case_list_element
| case_list_element
;
case_list_element : case_constant_list COLON statement
;
otherwisepart : OTHERWISE
| OTHERWISE COLON
;
control_variable : identifier ;
initial_value : expression ;
direction : TO
| DOWNTO
;
final_value : expression ;
record_variable_list : record_variable_list comma variable_access
| variable_access
;
boolean_expression : expression ;
expression : simple_expression
| simple_expression relop simple_expression
;
simple_expression : term
| simple_expression addop term
;
term : factor
| term mulop factor
;
factor : sign factor
| exponentiation
;
exponentiation : primary
| primary STARSTAR exponentiation
;
primary : variable_access
| unsigned_constant
| function_designator
| set_constructor
| LPAREN expression RPAREN
| NOT primary
;
unsigned_constant : unsigned_number
| CHARACTER_STRING
| NIL
;
unsigned_number : unsigned_integer | unsigned_real ;
unsigned_integer : DIGSEQ
;
unsigned_real : REALNUMBER
;
/* functions with no params will be handled by plain identifier */
function_designator : identifier params
;
set_constructor : LBRAC member_designator_list RBRAC
| LBRAC RBRAC
;
member_designator_list : member_designator_list comma member_designator
| member_designator
;
member_designator : member_designator DOTDOT expression
| expression
;
addop: PLUS
| MINUS
| OR
;
mulop : STAR
| SLASH
| DIV
| MOD
| AND
;
relop : EQUAL
| NOTEQUAL
| LT
| GT
| LE
| GE
| IN
;
identifier : IDENTIFIER
;
semicolon : SEMICOLON
;
comma : COMMA
;
%%
extern int line_no;
extern char *yytext;
int yyerror(const char *s)
{
fprintf(stderr, "%s: at or before '%s', line %d\n",
s, yytext, line_no);
return 0;
}
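int main (void) {
extern int init();
extern FILE *yyin;
extern int yylex();
extern int yylineno;
extern char *yytext;
yyin = fopen("D:\\helloword.pascal", "r");
if (yyin == NULL) {
/* without this check, a missing file crashes inside the scanner */
perror("fopen");
return 1;
}
if(yyparse()){
printf("error");
}
else {
printf("good");
}
fclose(yyin);
getchar();
return 0;
}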
Pascal input file
program Hello;
begin someErrorText1
writeln ('Hello, world.') someErrorText2
writeln ('Hello, world2.')
end.
output
Console
As you can see, the error is on line 2 but the parser reports line 3, and it doesn't show the second error.