Why doesn't this grammar parse the return statement? - c

I am trying to write a grammar that can parse the following three inputs:
-- testfile --
class hi implements ho:
var x:int;
end;
-- testfile2 --
interface xs:
myFunc(int,int):int
end;
-- testfile3 --
class hi implements ho:
method myMethod(x:int)
return y;
end
end;
This is lexer.l:
%{
#include <stdio.h>
#include <stdlib.h>
#include "parser.tab.h"
#include <string.h>
int line_number = 0;
void lexerror(char *message);
%}
newline (\n|\r\n)
whitespace [\t \n\r]*
digit [0-9]
alphaChar [a-zA-Z]
alphaNumChar ({digit}|{alphaChar})
hexDigit ({digit}|[A-Fa-f])
decNum {digit}+
hexNum {digit}{hexDigit}*H
identifier {alphaChar}{alphaNumChar}*
number ({hexNum}|{decNum})
comment "/*"[.\r\n]*"*/"
anything .
%s InComment
%option noyywrap
%%
<INITIAL>{
interface return INTERFACE;
end return END;
class return CLASS;
implements return IMPLEMENTS;
var return VAR;
method return METHOD;
int return INT;
return return RETURN;
if return IF;
then return THEN;
else return ELSE;
while return WHILE;
do return DO;
not return NOT;
and return AND;
new return NEW;
this return THIS;
null return _NULL;
":" return COL;
";" return SCOL;
"(" return BRACL;
")" return BRACR;
"." return DOT;
"," return COMMA;
"=" return ASSIGNMENT;
"+" return PLUS;
"-" return MINUS;
"*" return ASTERISK;
"<" return LT;
{decNum} {
yylval = atoi(yytext);
return DEC;
}
{hexNum} {
const int len = strlen(yytext)-1;
char* substr = (char*) malloc(sizeof(char) * len);
strncpy(substr,yytext,len);
yylval = (int)strtol
( substr
, NULL
, 16);
free (substr);
return HEX;
}
{identifier} {
yylval= (char *) malloc(sizeof(char)*strlen(yytext));
strcpy(yylval, yytext);
return ID;
}
{whitespace} {}
"/*" BEGIN InComment;
}
{newline} line_number++;
<InComment>{
"*/" BEGIN INITIAL;
{anything} {}
}
. lexerror("Illegal input");
%%
void lexerror(char *message)
{
fprintf(stderr,"Error: \"%s\" in line %d. = %s\n",
message,line_number,yytext);
exit(1);
}
This is parser.y:
%{
# include <stdio.h>
int yylex(void);
void yyerror(char *);
extern int line_number;
%}
%start Program
%token INTERFACE END CLASS IMPLEMENTS VAR METHOD INT RETURN IF THEN ELSE
%token WHILE DO NOT AND NEW THIS _NULL EOC SCOL COL BRACL BRACR DOT COMMA
%token ASSIGNMENT PLUS ASTERISK MINUS LT EQ DEC HEX ID NEWLINE
%%
Program: INTERFACE Interface SCOL { printf("interface\n"); }
| CLASS Class SCOL { printf("class\n");}
| error { printf("error on: %s\n", $$); }
;
Interface: ID COL
AbstractMethod
END
;
AbstractMethod: ID BRACL Types BRACR COL Type
;
Types : Type COMMA Types
| Type
;
Class: ID
IMPLEMENTS ID COL
Member SCOL
END
;
Member: VAR ID COL Type
| METHOD ID BRACL Pars BRACR Stats END
;
Type: INT
| ID
;
Pars: Par COMMA Pars
| Par
;
Par: ID COL Type
;
Stats: Stat SCOL Stat
| Stat
;
Stat: RETURN Expr
| IF Expr THEN Stats MaybeElse END
| WHILE Expr DO Stats END
| VAR ID COL Type COL ASSIGNMENT Expr
| ID COL ASSIGNMENT Expr
| Expr
;
MaybeElse :
| ELSE Stats
;
Expr: NOT Term
| NEW ID
| Term PLUS Term
| Term ASTERISK Term
| Term AND Term
| Term ArithOp Term
| Term
;
ArithOp: MINUS
| LT
| ASSIGNMENT
;
Term: BRACL Expr BRACR
| Num
| THIS
| ID
| Term DOT ID BRACL Exprs BRACR
| error { printf("error in term: %s\n", $$); }
;
Num : HEX
| INT
;
Exprs : Expr COMMA Exprs
| Expr
;
%%
void yyerror(char *s) {
fprintf(stderr, "Parse Error on line %i: %s\n", line_number, s);
}
int main(void){
yyparse();
}
The first two inputs are recognized as expected.
However, the third one fails with the error "error on: y", and I have no idea why.
As I see it, this should be a Class with a Member METHOD that contains a Stat(ement) RETURN with an Expr Term being an ID.
I tried commenting out and removing all the unnecessary bits, but the result is still the same.
I also took a look at the parser to verify that my identifiers parse correctly, but as I see it they should.
Why is the y in return y not recognized here?
Is there some conflict in the grammar I am unaware of?
(Please note that I am not expecting you to fix the complete grammar; I am merely asking for the reason this is not working. I am sure there are other errors in there, but I am really stuck fixing this one.)
Here is my makefile as well:
CC = gcc
LEX = flex
YAC = bison
scanner: parser.y lexer.l
$(YAC) -d -Wcounterexamples parser.y
$(LEX) lexer.l
$(CC) parser.tab.c parser.tab.h lex.yy.c -o parser
clean:
rm -f *.tab.h *.tab.c *.gch *.yy.c
rm ./parser
testing:
cat testfile3 | ./parser

First, there is an error in your grammar:
Stats: Stat SCOL Stat
| Stat
;
must be
Stats: Stat SCOL Stats
| Stat
;
(an 's' added at the end of the first line)
Second, the input in testfile3 does not follow your grammar. It should be:
class hi implements ho:
method myMethod(x:int)
return y
end;
end;
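So the ';' after return y must be moved to after the first end: in your grammar SCOL separates statements rather than terminating them, and Class requires a SCOL between the Member and END.
(return x would seem more logical, but that is another subject; you do not check the validity of the ID.)
Apart from that, a class can have only one member, which is very restrictive.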
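To make the separator point concrete, here is a minimal C sketch of the same list shape as the corrected rule Stats: Stat SCOL Stats | Stat. The one-character token encoding ('S' for a complete Stat, ';' for SCOL) and the function name matches_stats are assumptions made for illustration; they are not part of the original lexer or parser.

```c
#include <stddef.h>

/* Hypothetical token encoding, for illustration only:
 * 'S' stands for one complete Stat, ';' stands for SCOL. */

/* Returns 1 if toks matches  Stats: Stat SCOL Stats | Stat ;  else 0. */
int matches_stats(const char *toks)
{
    size_t i = 0;

    if (toks[i] != 'S')          /* a Stats must start with a Stat */
        return 0;
    i++;

    while (toks[i] == ';') {     /* every SCOL must be followed by a Stat */
        i++;
        if (toks[i] != 'S')
            return 0;
        i++;
    }
    return toks[i] == '\0';      /* nothing may be left over */
}
```

With this encoding, "S" and "S;S" are accepted while "S;" is rejected, which mirrors why return y; directly followed by end does not fit the grammar.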

Related

Why isn't my bison printing the variable names?

I'm using a flex/bison parser, but the variable names aren't printing correctly, although the number values are handled fine. I've tried changing everything I can think of, but I'm lost. Here is a link to the output; the place where it prints "Data: 0" is where I'm trying to get the variable name: [https://imgur.com/vJDpgpR][1]
Invocation: ./frontEnd data.txt
//main.c
#define BUF_SIZE 1024
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
extern FILE* yyin;
extern yyparse();
int main(int argc, char* argv[]){
if(argc < 2){
FILE* fp = fopen("temp.txt", "a");
printf("Entering data: \n");
void *content = malloc(BUF_SIZE);
if (fp == 0)
printf("error opening file");
int read;
while ((read = fread(content, BUF_SIZE, 1, stdin))){
fwrite(content, read, 1, fp);
}
if (ferror(stdin))
printf("There was an error reading from stdin");
fclose(fp);
yyparse(fp);
}
if(argc == 2){
yyin = fopen(argv[2], "r");
if(!yyin)
{
perror(argv[2]);
printf("ERROR: file does not exist.\n");
return 0;
}
yyparse (yyin);
}
return 0;
}
void yyerror(char *s){
fprintf(stderr, "error: exiting %s \n", s);
}
//lex.l
%{
#include <stdio.h>
#include <stdlib.h>
#include "parser.tab.h"
extern SYMTABNODEPTR symtable[SYMBOLTABLESIZE];
extern int curSymSize;
%}
%option noyywrap
%option nounput yylineno
%%
"stop" return STOP;
"iter" return ITER;
"scanf" return SCANF;
"printf" return PRINTF;
"main" return MAIN;
"if" return IF;
"then" return THEN;
"let" return LET;
"func" return FUNC;
"//" return COMMENT; printf("\n");
"start" return START;
"=" return ASSIGN;
"=<" return LE;
"=>" return GE;
":" return COLON;
"+" return PLUS;
"-" return MINUS;
"*" return MULT;
"/" return DIV;
"%" return MOD;
"." return DOT;
"(" return RPAREN;
")" return LPAREN;
"," return COMMA;
"{" return RBRACE;
"}" return LBRACE;
";" return SEMICOLON;
"[" return LBRACK;
"]" return RBRACK;
"==" return EQUAL;
[A-Z][a-z]* { printf("SYNTAX ERROR: Identifiers must start with lower case. "); }
[a-zA-Z][_a-zA-Z0-9]* {
printf("string: %s \n", yytext);
yylval.iVal = strdup(yytext);
yylval.iVal = addSymbol(yytext);
return ID;
}
[0-9]+ {
yylval.iVal = atoi(yytext);
printf("num: %s \n", yytext);
return NUMBER; }
[ _\t\r\s\n] ;
^"#".+$ return COMMENT;
. {printf("ERROR: Invalid Character "); yyterminate();}
<<EOF>> { printf("EOF: line %d\n", yylineno); yyterminate(); }
%%
// stores all variable id is in an array
SYMTABNODEPTR newSymTabNode()
{
return ((SYMTABNODEPTR)malloc(sizeof(SYMTABNODE)));
}
int addSymbol(char *s)
{
extern SYMTABNODEPTR symtable[SYMBOLTABLESIZE];
extern int curSymSize;
int i;
i = lookup(s);
if(i >= 0){
return(i);
}
else if(curSymSize >= SYMBOLTABLESIZE)
{
return (NOTHING);
}
else{
symtable[curSymSize] = newSymTabNode();
strncpy(symtable[curSymSize]->id,s,IDLENGTH);
symtable[curSymSize]->id[IDLENGTH-1] = '\0';
return(curSymSize++);
}
}
int lookup(char *s)
{
extern SYMTABNODEPTR symtable[SYMBOLTABLESIZE];
extern int curSymSize;
int i;
for(i=0;i<curSymSize;i++)
{
if(strncmp(s,symtable[i]->id,IDLENGTH) == 0){
return (i);
}
}
return(-1);
}
// parser.y
%{
#define YYERROR_VERBOSE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
extern char *yytext;
extern int yylex();
extern void yyerror(char *);
extern int yyparse();
extern FILE *yyin;
/* ------------- some constants --------------------------------------------- */
#define SYMBOLTABLESIZE 50
#define IDLENGTH 15
#define NOTHING -1
#define INDENTOFFSET 2
#ifdef DEBUG
char *NodeName[] =
{
"PROGRAM", "BLOCK", "VARS", "EXPR", "N", "A", "R", "STATS", "MSTAT", "STAT",
"IN", "OUT", "IF_STAT", "LOOP", "ASSIGN", "RO", "IDVAL", "NUMVAL"
};
#endif
enum ParseTreeNodeType
{
PROGRAM, BLOCK, VARS, EXPR, N, A, R, STATS, MSTAT, STAT,
IN, OUT,IF_STAT, LOOP, ASSIGN, RO, IDVAL, NUMVAL
};
#define TYPE_CHARACTER "char"
#define TYPE_INTEGER "int"
#define TYPE_REAL "double"
#ifndef TRUE
#define TRUE 1
#endif
#ifndef FALSE
#define FALSE 0
#endif
#ifndef NULL
#define NULL 0
#endif
// definitions for parse tree
struct treeNode {
int item;
int nodeID;
struct treeNode *first;
struct treeNode *second;
};
typedef struct treeNode TREE_NODE;
typedef TREE_NODE *TREE;
TREE makeNode(int, int, TREE, TREE);
#ifdef DEBUG
void printTree(TREE, int);
#endif
// symbol table definitions.
struct symbolTableNode{
char id[IDLENGTH];
};
typedef struct symbolTableNode SYMTABNODE;
typedef SYMTABNODE *SYMTABNODEPTR;
SYMTABNODEPTR symtable[SYMBOLTABLESIZE];
int curSymSize = 0;
%}
%start program
%union {
char *sVal;
int iVal;
TREE tVal;
}
// list of all tokens
%token SEMICOLON GE LE EQUAL COLON RBRACK LBRACK ASSIGNS LPAREN RPAREN COMMENT
%token DOT MOD PLUS MINUS DIV MULT RBRACE LBRACE START MAIN STOP LET COMMA
%token SCANF PRINTF IF ITER THEN FUNC
%left MULT DIV MOD ADD SUB
// tokens defined with values and rule names
%token<iVal> NUMBER ID
//%token<sVal> ID
%type<tVal> program type block vars expr N A R stats mStat stat in out if_stat loop assign RO
%%
program : START vars MAIN block STOP
{
TREE tree;
tree = makeNode(NOTHING, PROGRAM, $2,$4);
#ifdef DEBUG
printTree(tree, 0);
#endif
}
;
block : RBRACE vars stats LBRACE
{
$$ = makeNode(NOTHING, BLOCK, $2, $3);
}
;
vars : /*empty*/
{
$$ = makeNode(NOTHING, VARS,NULL,NULL);
}
| LET ID COLON NUMBER vars
{
$$ = makeNode($2, VARS, $5,NULL);
printf("id: %d", $2);
}
;
//variable:
// type ID{$$ = newNode($2,VARIABLE,$1,NULL,NULL);};
//type:
// INT {$$ = newNode(INT,TYPE,NULL,NULL,NULL);}
// | BOOL {$$ = newNode(BOOL,TYPE,NULL,NULL,NULL);}
// | CHAR {$$ = newNode(CHAR,TYPE,NULL,NULL,NULL);}
// | STRING{$$ = newNode(STRING,TYPE,NULL,NULL,NULL);};
expr : N DIV expr
{
$$ = makeNode(DIV, EXPR, $1, $3);
}
| N MULT expr
{
$$ = makeNode(MULT, EXPR, $1, $3);
}
| N
{
$$ = makeNode(NOTHING, EXPR, $1,NULL);
}
;
N : A PLUS N
{
$$ = makeNode(PLUS, N, $1, $3);
}
| A MINUS N
{
$$ = makeNode(MINUS, N, $1, $3);
}
| A
{
$$ = makeNode(NOTHING, N, $1,NULL);
}
;
A : MOD A
{
$$ = makeNode(NOTHING, A, $2,NULL);
}
| R
{
$$ = makeNode(NOTHING, A, $1,NULL);
}
;
R : LBRACK expr RBRACK
{
$$ = makeNode(NOTHING, R, $2,NULL);
}
| ID
{
$$ = makeNode($1, IDVAL, NULL,NULL);
}
| NUMBER
{
$$ = makeNode($1, NUMVAL, NULL,NULL);
}
;
stats : stat mStat
{
$$ = makeNode(NOTHING, STATS, $1, $2);
}
;
mStat : /* empty */
{
$$ = makeNode(NOTHING, MSTAT, NULL,NULL);
}
| stat mStat
{
$$ = makeNode(NOTHING, MSTAT, $1, $2);
}
;
stat: in DOT
{
$$ = makeNode(NOTHING, STAT, $1,NULL);
}
| out DOT
{
$$ = makeNode(NOTHING, STAT, $1,NULL);
}
| block
{
$$ = makeNode(NOTHING, STAT, $1,NULL);
}
| if_stat DOT
{
$$ = makeNode(NOTHING, STAT, $1,NULL);
}
| loop DOT
{
$$ = makeNode(NOTHING, STAT, $1,NULL);
}
| assign DOT
{
$$ = makeNode(NOTHING, STAT, $1,NULL);
}
;
in : SCANF LBRACK ID RBRACK
{
$$ = makeNode($3, IN,NULL,NULL);
}
;
out : PRINTF LBRACK expr RBRACK
{
$$ = makeNode(NOTHING, OUT,$3,NULL);
}
;
if_stat : IF LBRACK expr RO expr RBRACK THEN block
{
$$ = makeNode(NOTHING, IF_STAT, $4, $8);
}
;
loop : ITER LBRACK expr RO expr RBRACK block
{
$$ = makeNode(NOTHING, LOOP, $4, $7);
}
;
assign : ID ASSIGNS expr
{
$$ = makeNode($1, ASSIGN, $3,NULL);
}
;
RO : LE
{
$$ = makeNode(LE, RO, NULL,NULL);
}
| GE
{
$$ = makeNode(GE, RO, NULL,NULL);
}
| EQUAL
{
$$ = makeNode(EQUAL, RO, NULL,NULL);
}
| COLON COLON
{
$$ = makeNode(EQUAL, RO, NULL,NULL);
}
;
%%
// node generator
TREE makeNode(int iVal, int nodeID, TREE p1, TREE p2)
{
TREE t;
t = (TREE)malloc(sizeof(TREE_NODE));
t->item = iVal;
t->nodeID = nodeID;
t->first = p1;
t->second = p2;
//printf("NODE CREATED");
return(t);
}
// prints the tree with indentation for depth
void printTree(TREE tree, int depth){
int i;
if(tree == NULL) return;
for(i=depth;i;i--)
printf(" ");
if(tree->nodeID == NUMBER)
printf("INT: %d ",tree->item);
else if(tree->nodeID == IDVAL){
if(tree->item > 0 && tree->item < SYMBOLTABLESIZE )
printf("id: %s ",symtable[tree->item]->id);
else
printf("unknown id: %d ", tree->item);
}
if(tree->item != NOTHING){
printf("Data: %d ",tree->item);
}
// If out of range of the table
if (tree->nodeID < 0 || tree->nodeID > sizeof(NodeName))
printf("Unknown ID: %d\n",tree->nodeID);
else
printf("%s\n",NodeName[tree->nodeID]);
printTree(tree->first,depth+2);
printTree(tree->second,depth+2);
}
#include "lex.yy.c"
// here's the makefile I use for compilation
frontEnd: lex.yy.c parser.tab.c
gcc parser.tab.c main.c -o frontEnd -lfl -DDEBUG
parser.tab.c parser.tab.h: parser.y
bison -d parser.y
lex.yy.c: lex.l
flex lex.l
clean:
rm lex.yy.c y.tab.c frontEnd
// data.txt
start
let x : 13
main {
scanf [ x ] .
printf [ 34 ] .
} stop
[1]: https://i.stack.imgur.com/xlNnh.png
[2]: https://i.stack.imgur.com/HKRtX.png
I think this has a lot more to do with your AST and symbol table functions than with your parser, and practically nothing to do with bison itself.
For example, your function to print trees won't attempt to print an identifier's name if its symbol table index is 0.
if(tree->item > 0 && tree->item < SYMBOLTABLESIZE)
But the first symbol entered in the table will have index 0. (Perhaps you fixed this between pasting your code and generating the results. You should always check that the code you paste in a question corresponds precisely to the output which you show. But this isn't the only bug in your code; it's just an example.)
As another example, the immediate problem which causes Data: 0 to be printed instead of the symbol name is that your tree printer only prints symbol names for AST nodes of type IDVAL, but you create an AST IN node whose data field contains the variable's symbol table index. So either you need to fix your tree printer so it knows about IN nodes, or you need to change the IN node so that it has a child which is the IDVAL node. (That's probably the best solution in the long run.)
It's always a temptation to blame bison (or whatever unfamiliar tool you're using at the moment) for bugs, instead of considering the possibility that you've introduced bugs in your own support code. To avoid falling into this trap, it's always a good idea to test your library functions separately before using them in a more complicated project. For example, you could write a small test driver that builds a fixed AST tree, prints it, and deletes it. Once that works, and only when that works, you can check to see if your parser can build and print the same tree by parsing an input.
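A standalone driver of the kind described above might look like the following sketch. It builds one fixed tree by hand (for scanf [ x ] .), dumps it into a buffer, and frees it. The reduced node-kind enum, the one-entry symnames table, and the helpers dumpTree, freeTree and buildScanfX are assumptions made for this example, not the question's code. The sketch also gives the IN node an IDVAL child and accepts symbol index 0.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NOTHING (-1)

/* Node kinds used by this sketch only (a subset of the question's enum). */
enum NodeKind { STAT, IN, IDVAL };

typedef struct treeNode {
    int item;                 /* IDVAL: symbol table index; otherwise NOTHING */
    int nodeID;               /* one of enum NodeKind */
    struct treeNode *first;
    struct treeNode *second;
} TREE_NODE, *TREE;

/* Tiny fixed symbol table standing in for the question's symtable[]. */
static const char *symnames[] = { "x" };
#define NSYMBOLS ((int)(sizeof symnames / sizeof symnames[0]))

TREE makeNode(int item, int nodeID, TREE first, TREE second)
{
    TREE t = malloc(sizeof *t);
    if (t == NULL) { perror("malloc"); exit(EXIT_FAILURE); }
    t->item = item;
    t->nodeID = nodeID;
    t->first = first;
    t->second = second;
    return t;
}

void freeTree(TREE t)
{
    if (t == NULL) return;
    freeTree(t->first);
    freeTree(t->second);
    free(t);
}

/* Appends an indented dump of the tree to out, which holds cap bytes. */
void dumpTree(TREE t, int depth, char *out, size_t cap)
{
    static const char *names[] = { "STAT", "IN", "IDVAL" };
    size_t len;

    if (t == NULL) return;
    len = strlen(out);
    /* Index 0 is a valid symbol, so the lower bound is >= 0, not > 0. */
    if (t->nodeID == IDVAL && t->item >= 0 && t->item < NSYMBOLS)
        snprintf(out + len, cap - len, "%*s%s %s\n",
                 depth, "", names[t->nodeID], symnames[t->item]);
    else
        snprintf(out + len, cap - len, "%*s%s\n",
                 depth, "", names[t->nodeID]);
    dumpTree(t->first, depth + 2, out, cap);
    dumpTree(t->second, depth + 2, out, cap);
}

/* Builds the fixed tree for "scanf [ x ] ." with IN owning an IDVAL child. */
TREE buildScanfX(void)
{
    TREE id = makeNode(0, IDVAL, NULL, NULL);    /* "x" is symbol 0 */
    TREE in = makeNode(NOTHING, IN, id, NULL);
    return makeNode(NOTHING, STAT, in, NULL);
}
```

Calling buildScanfX, then dumpTree with an empty buffer, then freeTree exercises the three operations without involving flex or bison at all. Once the dump looks right, the parser can be checked against the same expected text.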
You will find that some simple good software design practices will make this whole process much smoother:
Organise your code into separate component files, each with its own header file. Document the library interfaces (and, if necessary, data structures) using comments in the header file. Briefly describe what each function's purpose is. If you can't find a brief description, it may be that the function is trying to do too many different things.
In your parser, the functions and declarations needed to build and use ASTs are scattered between different parts of your lexer and parser files. This makes them much harder to read, debug, maintain and even use.
No matter what your teacher might tell you, if you find it necessary to #include the generated lexical scanner directly into the parser, then you probably have not found a good way to organise your support functions. You should always aim to make it possible to separately compile the parser and the scanner.
For data structures like your AST node, which use different member variables in different ways depending on an enumerated node type -- which is a model you'll find in other C projects as well, but is particularly common in parsers -- document the precise use of each field for every enumeration value. And make sure that every time you change the way you use the data or add new enumeration values, you fix the documentation accordingly.
This documentation will make it much easier to verify that your AST is being built correctly. As an additional benefit, you (or others using your code) will have an accurate description of how to interpret the contents of AST nodes, which makes it much easier to write code which analyses the tree.
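As one possible shape for that per-field documentation, the sketch below records the meaning of each field next to each enumeration value and adds a small checker that enforces it. The four node kinds, the field named kind, and the function nodeIsWellFormed are hypothetical choices for this example; the question's own enum is larger and uses nodeID.

```c
#include <stddef.h>

#define NOTHING (-1)

/* Node kinds (a hypothetical subset chosen for this sketch). */
enum NodeKind {
    NUMVAL,  /* item: the integer literal;       first, second: NULL     */
    IDVAL,   /* item: symbol table index (>= 0); first, second: NULL     */
    IN,      /* item: NOTHING; first: IDVAL node being read; second: NULL */
    OUT      /* item: NOTHING; first: expression to print;  second: NULL */
};

struct treeNode {
    int item;
    enum NodeKind kind;
    struct treeNode *first;
    struct treeNode *second;
};

/* Returns 1 if the node uses its fields as documented above, else 0. */
int nodeIsWellFormed(const struct treeNode *n)
{
    if (n == NULL)
        return 0;
    switch (n->kind) {
    case NUMVAL:
        return n->first == NULL && n->second == NULL;
    case IDVAL:
        return n->item >= 0 && n->first == NULL && n->second == NULL;
    case IN:
        return n->item == NOTHING && n->first != NULL
            && n->first->kind == IDVAL && n->second == NULL;
    case OUT:
        return n->item == NOTHING && n->first != NULL && n->second == NULL;
    }
    return 0;
}
```

Under these assumed conventions, an IN node that stores the symbol index directly in item and has no child (as in the question) is reported as malformed, so the mismatch shows up before the tree printer runs.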
In short, the way to write, debug and maintain any non-trivial project is not by "messing around" but by being systematic and modular. While it might seem like all of this takes precious time, particularly the documentation, it will almost always save you a lot of time in the long run.

I got an error in function `yylex': lex.yy.c:(.text+0x2ac): undefined reference

I am new to lex and yacc, and I am following the book "lex & yacc" (1992).
I am working through an example in chapter 3 and I get an error when compiling, but I couldn't find a solution.
Here is the code.
The lex file, file.l:
%{
#include "y.tab.h"
#include "symboletable.h"
#include <math.h>
extern int yylavl;
%}
%%
([0-9]+|([0-9]*\.[0-9]+)([eE][-+]?[0-9]+)?) {
yylval.dval = atof(yytext);
return NUMBER;
}
[ \t] ; /* ignore whitespace */
[A-Za-z][A-Za-z0-9]* { /* return symbol pointer */
yylval.symp = symlook(yytext);
return NAME;
}
"$" { return 0; }
\n |
. return yytext[0];
%%
And here is the yacc file, file.y:
%{
#include "symboletable.h"
#include <string.h>
#include <stdio.h> /* C declarations used in actions */
int yylex();
void yyerror(const char *s);
%}
%union {
double dval;
struct symtab *symp;
}
%token <symp> NAME
%token <dval> NUMBER
%left '+' '-'
%left '*' '/'
%nonassoc UMINUS
%type <dval> expression
%%
statement_list : statement '\n'
| statement_list statement '\n'
;
statement : expression { printf("= %g\n", $1); }
| NAME '=' expression {$1->value = $3; }
;
expression : NAME {$$ = $1->value; }
| expression '+' expression {$$ = $1 + $3; }
| expression '-' expression {$$ = $1 - $3; }
| expression '*' expression {$$ = $1 * $3; }
| expression '/' expression
{ if ($3 ==0.0)
yyerror("divide by zero");
else
$$ = $1 / $3;
}
| '-' expression %prec UMINUS {$$ = -$2; }
| '(' expression ')' {$$ = $2; }
| NUMBER
;
%%
According to the example in the book, I need to write symbol table routines that take the string and allocate dynamic space for it. Here is the header file,
symboletable.h:
#define NSYMS 20 /* maximum number of symbols */
struct symtab {
char *name;
double value;
} symtab[NSYMS];
struct symtab *symlook();
And symboletable.pgm:
/* look up a symbol table entry, add if not present */
struct symtab *
symlook(s)
char *s;
{
char *p;
struct symtab *sp;
for (sp = symtab; sp < &symtab[NSYMS]; sp++){
/* is it already here ? */
if (sp->name && !strcmp(sp->name, s))
return sp;
/* is it free */
if (!sp->name){
sp->name = strdup(s);
return sp;
}
/* otherwise continue to next */
}
yyerror("Too many symbols");
exit(1); /* cannot continue */
} /* symlook */
Now I run the following commands:
yacc -d file.y
lex file.l
cc -c lex.yy.c -o newfile -ll
cc -o new y.tab.c lex.yy.c -ly -ll
But this is the error I get:
/tmp/ccGnPAO2.o: In function `yylex':
lex.yy.c:(.text+0x2ac): undefined reference to `symlook'
collect2: error: ld returned 1 exit status
So why do I get that error, when I am following the example exactly?
You need to include your symbol table implementation in your compilation command. Otherwise, how is the linker going to find that code?
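In practice that means the definition of symlook has to live in a source file that is passed to cc together with y.tab.c and lex.yy.c. The sketch below shows what such a file could contain. The idea of saving it as symboletable.c is an assumption (the question stores the code in symboletable.pgm, which cc will not treat as C source). The sketch is written standalone, so it repeats the struct from the header, uses an ANSI prototype, and returns NULL when the table is full instead of calling yyerror and exiting.

```c
#include <stdlib.h>
#include <string.h>

#define NSYMS 20   /* maximum number of symbols, as in the question */

struct symtab {
    char *name;
    double value;
};

static struct symtab symtab[NSYMS];

/* Look up a symbol table entry, adding it if not present.
 * Returns NULL when the table is full or memory runs out
 * (the book's version calls yyerror and exits instead). */
struct symtab *symlook(const char *s)
{
    struct symtab *sp;

    for (sp = symtab; sp < &symtab[NSYMS]; sp++) {
        if (sp->name != NULL && strcmp(sp->name, s) == 0)
            return sp;                      /* already present */
        if (sp->name == NULL) {             /* first free slot */
            size_t n = strlen(s) + 1;
            sp->name = malloc(n);
            if (sp->name == NULL)
                return NULL;
            memcpy(sp->name, s, n);         /* private copy of the name */
            return sp;
        }
    }
    return NULL;                            /* too many symbols */
}
```

With a file like this in place, the final link step would list it as well, for example cc -o new y.tab.c lex.yy.c symboletable.c -ly -ll, so that the linker can resolve the reference made from yylex.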

Bus error: 10 in bison and flex on mac os

Here is the code in question:
calc.y
%{
#include <stdio.h>
void yyerror(char *);
int yylex(void);
int sym[26];
%}
%token INTEGER VARIABLE
%left '+' '-'
%left '*' '/'
%%
program:
program statement '\n'
| /* NULL */
;
statement:
expression { printf("%d\n", $1); }
| VARIABLE '=' expression { sym[$1] = $3; }
;
expression:
INTEGER
| VARIABLE { $$ = sym[$1]; }
| expression '+' expression { $$ = $1 + $3; }
| expression '-' expression { $$ = $1 - $3; }
| expression '*' expression { $$ = $1 * $3; }
| expression '/' expression { $$ = $1 / $3; }
| '(' expression ')' { $$ = $2; }
;
%%
void yyerror(char *s){
fprintf(stderr, "%s\n", s);
}
int main(void){
yyparse();
}
calc.l
%{
#include "calc.tab.h"
#include <stdlib.h>
void yyerror(char *);
%}
%%
[a-z] {
yylval = *yytext - 'a';
return VARIABLE;
}
[0-9]+ {
yylval = atoi(yytext);
return INTEGER;
}
[-+()=/*\n] { return *yytext; }
[\t] ;
. yyerror("Unkown Character");
%%
int yywrap(void) {
return 1;
}
Generating the parser and the scanner from the code above with the following commands works fine:
$ bison -d calc.y
$ flex calc.l
However, when I build it like this:
$ gcc lex.yy.c calc.tab.c -o app
it does not work, and I get the following error:
Bus error: 10
Can anyone explain why this is happening, and how I can fix it?
You need to make up your mind as to whether VARIABLE is sym[$1] or just the index into sym[]. You've used it both ways in your grammar. Judging by your lexer, it is the index. In fact I don't see any necessity for sym[] at all.
And you didn't get the bus error when you generated the .c files or when you compiled them. You got it when you executed your application.

How to echo input text in YACC Grammar?

I am trying to display each whole arithmetic expression from a text file together with its result. I tried doing it with file handling, but it is not working.
YACC :
%{
#include <stdio.h>
#include <string.h>
#define YYSTYPE int /* the attribute type for Yacc's stack */
extern int yylval; /* defined by lex, holds attrib of cur token */
extern char yytext[]; /* defined by lex and holds most recent token */
extern FILE * yyin; /* defined by lex; lex reads from this file */
%}
%token NUM
%%
Calc : Expr {printf(" = %d\n",$1);}
| Calc Expr {printf(" = %d\n",$2);}
| Calc error {yyerror("\n");}
;
Expr : Term { $$ = $1; }
| Expr '+' Term { $$ = $1 + $3; }
| Expr '-' Term { $$ = $1 - $3; }
;
Term : Fact { $$ = $1; }
| Term '*' Fact { $$ = $1 * $3; }
| Term '/' Fact { if($3==0){
yyerror("Divide by Zero Encountered.");
break;}
else
$$ = $1 / $3;
}
;
Fact : Prim { $$ = $1; }
| '-' Prim { $$ = -$2; }
;
Prim : '(' Expr ')' { $$ = $2; }
| Id { $$ = $1; }
;
Id :NUM { $$ = yylval; }
;
%%
void yyerror(char *mesg); /* this one is required by YACC */
main(int argc, char* *argv){
char ch,c;
FILE *f;
if(argc != 2) {printf("useage: calc filename \n"); exit(1);}
if( !(yyin = fopen(argv[1],"r")) ){
printf("cannot open file\n");exit(1);
}
/*
f=fopen(argv[1],"r");
if(f!=NULL){
char line[1000];
while(fgets(line,sizeof(line),f)!=NULL)
{
fprintf(stdout,"%s",line);
yyparse();
}
}
*/
yyparse();
}
void yyerror(char *mesg){
printf("\n%s", mesg);
}
LEX
%{
#include <stdio.h>
#include "y.tab.h"
int yylval; /*declared extern by yacc code. used to pass info to yacc*/
%}
letter [A-Za-z]
digit ([0-9])*
op "+"|"*"|"("|")"|"/"|"-"
ws [ \t\n\r]+$
other .
%%
{ws} { /*Nothing*/ }
{digit} { yylval = atoi(yytext); return NUM;}
{op} { return yytext[0];}
{other} { printf("bad%cbad%d\n",*yytext,*yytext); return '?'; }
%%
My Text file contains these two expressions :
4+3-2*(-7)
9/3-2*(-5)
I want output as :
4+3-2*(-7)=21
9/3-2*(-5)=13
But the Output Is :
=21
=13
A parser does all the calculations in one go, so the commented-out code above is not a legitimate approach. What I need is to pass the input expression through to the grammar and print it in the Calc block. I have not been able to find anything relevant on Google about displaying the input from within a grammar. Thanks in advance for comments and suggestions.
You don't want to do this in the grammar: it is too complicated, and too subject to whatever rearrangement the grammar may do. You could consider doing it in the lexer, i.e. printing yytext in every action other than the whitespace action, just before you return the token, but I would echo all the input as it is read, by overriding lex(1)'s input function.
NB You should be using flex(1), not lex(1), and note that if you change, yytext ceases being a char[] and becomes a char *.
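With flex, one way to override the input function is to redefine the YY_INPUT(buf, result, max_size) macro. The sketch below shows a helper that such a macro could call. The helper name echo_read and the hookup shown in the trailing comment are assumptions for illustration, not code from the question. One caveat: the scanner reads ahead of the parser, so text echoed at read time may appear before the results printed by the grammar actions rather than next to them.

```c
#include <stdio.h>

/* Reads up to max_size bytes from in into buf and copies every byte
 * that was read to echo. Returns the number of bytes read, which is
 * 0 at end of input. */
size_t echo_read(FILE *in, FILE *echo, char *buf, size_t max_size)
{
    size_t n = fread(buf, 1, max_size, in);

    if (n > 0)
        fwrite(buf, 1, n, echo);
    return n;
}

/* Assumed hookup, placed in the %{ ... %} part of the flex file:
 *
 *   #define YY_INPUT(buf, result, max_size) \
 *       { result = echo_read(yyin, stdout, buf, max_size); }
 */
```

Keeping the echo in an ordinary function rather than inside the macro makes it possible to test it on its own, without generating a scanner.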
I didn't mention it in your prior question, but this rule:
{other} { printf("bad%cbad%d\n",*yytext,*yytext); return '?'; }
would better be written as:
{other} { return yytext[0]; }
That way the parser will see it and produce a syntax error, so you don't have to print anything yourself. This technique also lets you get rid of the rules for the individual special characters +, -, *, /, (, ), as the parser will recognize them via yytext[0].
Finally, I got it:
YACC
%{
#include <stdio.h>
#include <string.h>
#define YYSTYPE int /* the attribute type for Yacc's stack */
extern int yylval; /* defined by lex, holds attrib of cur token */
extern char yytext[]; /* defined by lex and holds most recent token */
extern FILE * yyin; /* defined by lex; lex reads from this file */
%}
%token NUM
%%
Calc : Expr {printf(" = %d\n",$1);}
| Calc Expr {printf(" = %d\n",$2);}
| error {yyerror("Bad Expression\n");}
;
Expr : Term { $$ = $1; }
| Expr Add Term { $$ = $1 + $3; }
| Expr Sub Term { $$ = $1 - $3; }
;
Term : Fact { $$ = $1; }
| Term Mul Fact { $$ = $1 * $3; }
| Term Div Fact { if($3==0){
yyerror("Divide by Zero Encountered.");
break;}
else
$$ = $1 / $3;
}
;
Fact : Prim { $$ = $1; }
| '-' Prim { $$ = -$2; }
;
Prim : LP Expr RP { $$ = $2; }
| Id { $$ = $1; }
;
Id :NUM { $$ = yylval; printf("%d",yylval); }
;
Add : '+' {printf("+");}
Sub : '-' {printf("-");}
Mul : '*' {printf("*");}
Div : '/' {printf("/");}
LP : '(' {printf("(");}
RP : ')' {printf(")");}
%%
void yyerror(char *mesg); /* this one is required by YACC */
main(int argc, char* *argv){
char ch,c;
FILE *f;
if(argc != 2) {printf("useage: calc filename \n"); exit(1);}
if( !(yyin = fopen(argv[1],"r")) ){
printf("cannot open file\n");exit(1);
}
yyparse();
}
void yyerror(char *mesg){
printf("%s ", mesg);
}
Thanks EJP & EMACS User for responding.

Problems Compiling

I have to create a compiler for the Tiny C language, but I can't compile it. I have the .y and .l files and both work all right, but when I try to compile the .tab.c file, it shows 3 errors:
undefined reference to 'install_id'
undefined reference to 'printSymtab'
undefined reference to 'lookup_id'
Here is the code:
Symtab.h
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct symtab_node * SYMTAB;
typedef struct symtab_node {
char * nombre;
int type;
float fval;
SYMTAB next;
} SYMTAB_NODE;
SYMTAB lookup_id(SYMTAB st, char * name);
SYMTAB install_id(SYMTAB st, char * name, int typ);
void printSymtab(SYMTAB t);
Symtab.c
#include "symtab.h"
#include <stdio.h>
int next_num() {
static int i = 1;
return i++;
}
/* looks up an id in ST. Returns pointer to cell if found, else NULL */
SYMTAB lookup_id(SYMTAB st, char * name) {
SYMTAB tmp = st;
if (tmp == NULL) {/* empty list */
return NULL;
} else {
while (tmp != NULL) {
if (strcmp(tmp->idname,name) == 0) {
return tmp; /* found */
} else {
tmp = tmp->next; /* go to next cell */
}
}
return NULL; /* not found */
}
}
/* adds an id to ST if not present */
SYMTAB install_id(SYMTAB st, char * name, int typ) {
if (lookup_id(st, name) == NULL) {
SYMTAB nst = (SYMTAB)malloc(sizeof(SYMTAB_NODE));
nst->idname = (char *) strdup(name);
nst->idnum = next_num();
nst->next = st;
return nst;
} else {
return st;
}
}
/* print out ST */
void printSymtab(SYMTAB t) {
SYMTAB tmp = t;
while (tmp != NULL) {
printf("%s\t%d\n", tmp->idname, tmp->idnum);
tmp = tmp->next;
}
}
grammar.y
%{
#include "symtab.h"
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
char * concat (char * str1, char * str2);
extern int yylex();
extern char * yytext;
extern int yylineno;
SYMTAB st;
int typev;
/* Function definitions */
void yyerror (char *string);
%}
%union{
char *strval;
int value;
float fvalue;
SYMTAB st;
}
/* Declare all the tokens the program will receive, which come from cparser.l */
%token SEMI INTEGER FLOAT
%token IF THEN ELSE WHILE DO
%token READ WRITE
%token LPAREN RPAREN LBRACE RBRACE
%token LT EQ
%token PLUS MINUS TIMES DIV ASSIGN
%token<value> INT_NUM
%token<fvalue> FLOAT_NUM
%token<strval> ID
%%
/* Define the production rules for the mini C language */
program: var_dec stmt_seq { printf ("No hay errores sintacticos\n");}
;
var_dec: var_dec single_dec
|
;
single_dec: type ID SEMI { st = install_id(st,$2,typev); printSymtab(st); }
;
type: INTEGER { typev = 1; }
| FLOAT { typev = 2; }
;
stmt_seq: stmt_seq stmt
|
;
stmt: IF exp THEN else
| WHILE exp DO stmt
| variable ASSIGN exp SEMI { /*st = install_id(st,$1); */}
| READ LPAREN variable RPAREN SEMI { /*st = install_id(st,$3); */}
| WRITE LPAREN exp RPAREN SEMI
| block
| error SEMI { yyerrok;}
;
else: stmt
| ELSE stmt
;
block: LBRACE stmt_seq RBRACE
;
exp: simple_exp LT simple_exp
| simple_exp EQ simple_exp
| simple_exp
;
simple_exp: simple_exp PLUS term
| simple_exp MINUS term
| term
;
term: term TIMES factor
| term DIV factor
| factor
;
factor: LPAREN exp RPAREN
| INT_NUM
| FLOAT_NUM
| variable
;
variable: ID
{ if(lookup_id(st,$1) == NULL){
yyerror(concat("Error: Undeclared Identifier ", $1));
}
}
;
%%
/* A function that concatenates two strings and returns the result */
char * concat(char * str1, char * str2){
char *str3;
str3 = (char *) calloc(strlen(str1)+strlen(str2)+1, sizeof(char));
strcpy(str3,str1);
strcat(str3,str2);
return str3;
}
#include "lex.yy.c"
/* Bison does NOT implement yyerror, so define it here */
void yyerror (char *string){
printf ("ERROR NEAR LINE %d: %s\n",yylineno,string);
}
/* Bison does NOT define the main entry point so define it here */
main (){
yyparse();
yylex();
}
lexem.y
%{
#include <string.h>
#include <stdlib.h>
char * strval;
int value;
float fvalue;
int error;
extern YYSTYPE yylval;
%}
/* This is the rule definition */
%option noyywrap
%option yylineno
ids [A-Za-z_][A-Za-z0-9_]*
digits 0|[1-9][0-9]*|0(c|C)[0-7]+|0(x|X)[0-9A-Fa-f]+
floats [0-9]*"."[0-9]+([eE][-+]?[0-9]+)?
%%
/* Consume comments */
(\/\*([^\*]|\*[^/])*\*\/)|(\/\/.*)
/* Consume spaces, tabs and newlines */
[[:space:]]|[[:blank:]]|\n
/* Reserved words */
"int" { return INTEGER; }
"float" { return FLOAT; }
"if" { return IF; }
"then" { return THEN; }
"else" { return ELSE; }
"do" { return DO; }
"while" { return WHILE; }
"read" { return READ; }
"write" { return WRITE; }
/* Punctuation symbols, operators and relational operators */
/* Punctuation */
";" { return SEMI; }
"(" { return LPAREN; }
")" { return RPAREN; }
"{" { return LBRACE; }
"}" { return RBRACE; }
/* Relational operators */
">" { return LT; }
"==" { return EQ; }
/* Operators */
"+" { return PLUS; }
"-" { return MINUS; }
"*" { return TIMES; }
"/" { return DIV; }
"=" { return ASSIGN; }
{ids} { yylval.strval = (char *) strdup(yytext);
return (ID); }
{digits} { yylval.value = atoi(yytext);
return (INT_NUM); }
{floats} { yylval.fvalue = atof(yytext);
return (FLOAT_NUM); }
/* Consume any leftover symbols and flag an error */
. { printf("LEXICAL ERROR NEAR LINE %d: %s \n", yyget_lineno(), yyget_text()); error++; }
%%
You're not supposed to compile the whatever.tab.h file, that's a header file containing the YACC elements for the grammar, for inclusion into the lex and yacc code sections, as well as your own code if you need access to it.
You're supposed to compile whatever.tab.c, ensuring that you're also including your symtab.c (or its equivalent object file), and any other C source files as well.
And, based on your comment, it's this non-inclusion of the symtab.c file which is indeed causing your immediate error.
When I execute your steps (slightly modified for different names):
flex lexem.l
yacc -d -v grammar.y
gcc -o par y.tab.c
then I get a similar problem to what you're seeing:
/tmp/ccI5DpZQ.o:y.tab.c:(.text+0x35c): undefined reference to `install_id'
/tmp/ccI5DpZQ.o:y.tab.c:(.text+0x36e): undefined reference to `printSymtab'
/tmp/ccI5DpZQ.o:y.tab.c:(.text+0x3a7): undefined reference to `lookup_id'
However, when I incorporate the symtab.c file into the compile line (and add the idname and idnum missing bits to the structure in symtab.h to solve compilation problems), it works just fine:
gcc -o par y.tab.c symtab.c
So that's what you need to do, include symtab.c on the gcc command line.
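For reference, the "missing bits" mentioned above could be added along the following lines. This is a sketch under assumptions, not the exact change that was made: it replaces the nombre field with idname, adds idnum, and additionally stores the type argument, which the question's install_id ignores. It is shown as a single block here; in the project the typedefs and prototypes would stay in symtab.h and the function bodies in symtab.c.

```c
#include <stdlib.h>
#include <string.h>

typedef struct symtab_node *SYMTAB;
typedef struct symtab_node {
    char  *idname;   /* identifier text (assumed replacement for nombre) */
    int    idnum;    /* sequence number assigned on insertion */
    int    type;
    float  fval;
    SYMTAB next;
} SYMTAB_NODE;

static int next_num(void)
{
    static int i = 1;
    return i++;
}

/* Returns the cell holding name, or NULL if it is not in the list. */
SYMTAB lookup_id(SYMTAB st, const char *name)
{
    for (; st != NULL; st = st->next)
        if (strcmp(st->idname, name) == 0)
            return st;
    return NULL;
}

/* Prepends name if it is absent and returns the (possibly new) list head. */
SYMTAB install_id(SYMTAB st, const char *name, int typ)
{
    SYMTAB nst;
    size_t n;

    if (lookup_id(st, name) != NULL)
        return st;                       /* already installed */
    nst = malloc(sizeof *nst);
    if (nst == NULL)
        return st;                       /* out of memory: list unchanged */
    n = strlen(name) + 1;
    nst->idname = malloc(n);
    if (nst->idname == NULL) {
        free(nst);
        return st;
    }
    memcpy(nst->idname, name, n);
    nst->idnum = next_num();
    nst->type = typ;                     /* assumption: keep the declared type */
    nst->fval = 0.0f;
    nst->next = st;
    return nst;
}
```

Compiled as part of symtab.c and listed on the gcc command line as shown above, definitions like these are what the linker needs to resolve install_id and lookup_id; printSymtab from the question can stay as it is once the struct has idname and idnum.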

Resources