I recently download a opensource project named re1(Goggle) because my current research topic is about regular expression matching with nfa and dfa. re1 is a very very simple and small project, but there is a parse.y file but I never met. After google, I know it is generated by yacc(yet another compiler compiler). There is also a makefile so I can run it in Linux, but now I want to run it in Visual Studio(Windows) because I need to debug step by step(F5, F10, F11,etc. are very userful).But now it cannot build in VS because the .y file that VS cannot recognize it, there are many "error LNK2019: unresolved external symbol".
I do not know how to settle it, can I convert or restore it to .c file? How to do it?
following is part of parse.y:
%{
#include "regexp.h"
static int yylex(void);
static void yyerror(char*);
static Regexp *parsed_regexp;
static int nparen;
%}
%union {
Regexp *re;
int c;
int nparen;
}
%token <c> CHAR EOL
%type <re> alt concat repeat single line
%type <nparen> count
%%
line: alt EOL
{
parsed_regexp = $1;
return 1;
}
alt:
concat
| alt '|' concat
{
$$ = reg(Alt, $1, $3);
}
;
concat:
repeat
| concat repeat
{
$$ = reg(Cat, $1, $2);
}
;
repeat:
single
| single '*'
{
$$ = reg(Star, $1, nil);
}
| single '*' '?'
{
$$ = reg(Star, $1, nil);
$$->n = 1;
}
| single '+'
{
$$ = reg(Plus, $1, nil);
}
| single '+' '?'
{
$$ = reg(Plus, $1, nil);
$$->n = 1;
}
| single '?'
{
$$ = reg(Quest, $1, nil);
}
| single '?' '?'
{
$$ = reg(Quest, $1, nil);
$$->n = 1;
}
;
count:
{
$$ = ++nparen;
}
;
single:
'(' count alt ')'
{
$$ = reg(Paren, $3, nil);
$$->n = $2;
}
| '(' '?' ':' alt ')'
{
$$ = $4;
}
| CHAR
{
$$ = reg(Lit, nil, nil);
$$->ch = $1;
}
| '.'
{
$$ = reg(Dot, nil, nil);
}
;
%%
static char *input;
static Regexp *parsed_regexp;
static int nparen;
int gen;
static int
yylex(void)
{
int c;
if(input == NULL || *input == 0)
return EOL;
c = *input++;
if(strchr("|*+?():.", c))
return c;
yylval.c = c;
return CHAR;
}
void
fatal(char *fmt, ...)
{
va_list arg;
va_start(arg, fmt);
fprintf(stderr, "fatal error: ");
vfprintf(stderr, fmt, arg);
fprintf(stderr, "\n");
va_end(arg);
exit(2);
}
static void
yyerror(char *s)
{
fatal("%s", s);
}
Regexp*
parse(char *s)
{
Regexp *r, *dotstar;
input = s;
parsed_regexp = nil;
nparen = 0;
if(yyparse() != 1)
yyerror("did not parse");
if(parsed_regexp == nil)
yyerror("parser nil");
r = reg(Paren, parsed_regexp, nil); // $0 parens
dotstar = reg(Star, reg(Dot, nil, nil), nil);
dotstar->n = 1; // non-greedy
return reg(Cat, dotstar, r);
}
I try to remove these symbols like %, token and type, but I do not know how to resolve the rule(%%), of course it doesn't work, how can I do in VS, does VS support yacc?
There are two tools that go hand in hand together.
Flex and Bison, the GNU team to "Lex and Yacc".
There's a great book called "Flex and Bison" and another called "Lex and Yacc" they are really worth reading, you can pick them up second hand for a few pennies.
Flex and Bison are two of the most under-appreciated things I know of. Constantly re-implemented.
Knowing about work that has already happens is very important. Do not try and read the output of Flex though, that is an optimised result created by a program and is not for human consumption! Instead read Flex's source-code when you know what it does.
the GPL license gives you a right to read it's source code. Enjoy it :)
BTW Unix stuff, the unix philosophy, tells us to write programs that write other programs for us, talk in human-readable text and do one job and one job only, but do it very well.
Flex and Bison are prime examples of this.
Another famous Raymond quote: "Prepare for the future, for it'll be here sooner than you think", both are still here and still very good.
Instead of Yacc, look for Bison, which is the GNU free implementation of the old Yacc program. The parse.y file is not generated by Yacc/Bison, but it is its input file.
Some Unix and Unix-like systems, such as some Linux distributions, come with Bison already installed. On Windows, you have to download and install them separately. You can download and then install Bison for Windows.
Then run Bison with the .y file as input. It generates a C file, which you can then compile in Visual Studio.
Related
So basically, in my bison file if yyparse fails (i.e there is syntax error) I want to print 'ERROR' statement and not print anything else that I do in the above part of bison file in fac the stmt part. if yyparse returns 1 is there any way to skip the content of middle part of bison file? such as idk maybe writing if statement above the stmt part etc? I would appreciate any help! Thanks in advance.
Such as :
%{
#include "header.h"
#include <stdio.h>
void yyerror (const char *s)
{}
extern int line;
%}
%token ...///
%// some tokens types etc...
%union
{
St class;
int value;
char* str;
int line_num;
float float_value;
}
%start prog
%%
prog: '[' stmtlst ']'
;
stmtlst: stmtlst stmt |
;
stmt: setStmt | if | print | unaryOperation | expr
{
if ($1.type==flt && $1.line_num!=0) {
printf("Result of expression on %d is (",$1.line_num);
printf( "%0.1f)\n", $1.float_value);
$$.type=flt;
}
else if ($1.type==integer && $1.line_num!=0){
$$.type=integer;
printf("Result of expression on %d is (%d)\n",$1.line_num,$1.value);
}
else if ($1.type==string && $1.line_num!=0) {
$$.type=string;
printf("Result of expression on %d is (%s)\n",$1.line_num,$1.str);
}
else if ($1.type==mismatch && $1.line_num!=0)
{
$$.type=mismatch;
printf("Type mismatch on %d \n",$1.line_num);
}
else{ }
}
%%
int main ()
{
if (yyparse()) {
// if parse error happens only print this printf and not above stmt part
printf("ERROR\n");
return 1;
}
else {
// successful parsing
return 0;
}
}
Unfortunately, time travel is not an option. By the time the error is detected, the printf calls have already happened. You cannot make them unhappen.
What you can so (and must do if you are writing a compiler rather than a calculator) is create some data structure which represents the parsed program, instead of trying to execute it immediately. That data structure -- which could be an abstract syntax tree (AST) or a list of three-address instructions, or whatever, will be used to produce the compiled program. Only when the compiled program is run will the statements be evaluated.
First of all I need to say that I am very new to Flex and Bison and I am a bit confused. There is a school project that want us to create a compiler using Flex and Bison for some kind of CLIPS language.
My code has a lot of problems but the main one is that whatever i type i see a syntax error while the result should be something else. The ideal scenario would be to fully work for the language CLIPS. EG when i write "4" it get syntax error. Reading my code maybe will get you understand this better. If i write "test 3 4" it doesnt show syntax error but it counts it as an unknown token and thats wrong again..i'm completely lost. the code is a prototype by the school and we need to do some changes. if you have any questions dont hesitate to ask. THank you!
P.S.: dont mind the comments, they are in greek.
FLEX CODE:
%option noyywrap
/* Kwdikas C gia orismo twn apaitoumenwn header files kai twn metablhtwn.
Otidhpote anamesa sta %{ kai %} metaferetai autousio sto arxeio C pou
tha dhmiourghsei to Flex. */
%{
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
/* Header file pou periexei lista me ola ta tokens */
#include "token.h"
/* Orismos metrhth trexousas grammhs */
int line = 1;
%}
/* Onomata kai antistoixoi orismoi (ypo morfh kanonikhs ekfrashs).
Meta apo auto, mporei na ginei xrhsh twn onomatwn (aristera) anti twn,
synhthws idiaiterws makroskelwn kai dysnohtwn, kanonikwn ekfrasewn */
/* dimiourgia KE simfona me ta orismata tis glossas */
DELIMITER [ \t]+
INTCONST [+-]*[1-9][0-9]*
VARIABLE [?][A-Za-z0-9]*
DEFINITIONS [a-zA-Z][-|_|A-Z|a-z|0-9]*
COMMENTS ^;.*$
/* Gia kathe pattern (aristera) pou tairiazei ekteleitai o antistoixos
kwdikas mesa sta agkistra. H entolh return epitrepei thn epistrofh
mias arithmhtikhs timhs mesw ths synarthshs yylex() */
/* an sinantisei diaxoristi i sxolio to agnoei, an sinantisei akeraio,metavliti i orismo ton emfanizei. se kathe alli periptosi ektiponei oti den anagnorizei to token, ti grammi pou vrisketai kai to string pou dothike */
%%
{DELIMITER} {;}
"bind" { return BIND;}
"test" { return TEST;}
"read" { return READ;}
"printout" { return PRINTOUT;}
"deffacts" { return DEFFACTS;}
"defrule" { return DEFRULE;}
"->" { return '->';}
"=" { return '=';}
"+" { return '+';}
"-" { return '-';}
"*" { return '*';}
"/" { return '/';}
"(" { return '(';}
")" { return ')';}
{INTCONST} { return INTCONST; }
{VARIABLE} { return VARIABLE; }
{DEFINITIONS} { return DEFINITIONS; }
{COMMENTS} {;}
\n { line++; printf("\n"); }
.+ { printf("\tLine=%d, UNKNOWN TOKEN, value=\"%s\"\n",line, yytext);}
<<EOF>> { printf("#END-OF-FILE#\n"); exit(0); }
%%
/* Pinakas me ola ta tokens se antistoixia me tous orismous sto token.h */
char *tname[11] = {"DELIMITER","INTCONST" , "VARIABLE", "DEFINITIONS", "COMMENTS", "BIND", "TEST", "READ", "PRINTOUT", "DEFFACTS", "DEFRULE"};
BISON CODE:
%{
/* Orismoi kai dhlwseis glwssas C. Otidhpote exei na kanei me orismo h arxikopoihsh
metablhtwn & synarthsewn, arxeia header kai dhlwseis #define mpainei se auto to shmeio */
#include <stdio.h>
#include <stdlib.h>
int yylex(void);
void yyerror(char *);
%}
/* Orismos twn anagnwrisimwn lektikwn monadwn. */
%token INTCONST VARIABLE DEFINITIONS PLUS NEWLINE MINUS MULT DIV COM BIND TEST READ PRINTOUT DEFFACTS DEFRULE
%%
/* Orismos twn grammatikwn kanonwn. Kathe fora pou antistoixizetai enas grammatikos
kanonas me ta dedomena eisodou, ekteleitai o kwdikas C pou brisketai anamesa sta
agkistra. H anamenomenh syntaksh einai:
onoma : kanonas { kwdikas C } */
program:
program expr NEWLINE { printf("%d\n", $2); }
|
;
expr:
INTCONST { $$ = $1; }
| VARIABLE { $$ = $1; }//prosthiki tis metavlitis
| PLUS expr expr { $$ = $2 + $3; }//prosthiki tis prosthesis os praksi
| MINUS expr expr { $$ = $2 - $3; } //prosthiki tis afairesis os praksi
| MULT expr expr { $$ = $2 * $3; }//prosthiki tou pollaplasiasmou os praksi
| DIV expr expr { $$ = $2 / $3; }//prosthiki tis diairesis os praksi
| COM { $$ = $1; }//prosthiki ton sxolion
| DEFFACTS expr { $$ = $2; }//prosthiki ton gegonoton
| DEFRULE expr { $$ = $2; }//prosthiki ton kanonon
| BIND expr expr { $$ = $2;}//prosthiki tis bind
| TEST expr expr { $$ = $2 ;}//prosthiki tis test
| READ expr expr { $$ = $2 ;}//prosthiki tis read
| PRINTOUT expr expr { $$ = $2 ;}//prosthiki tis printout
;
%%
/* H synarthsh yyerror xrhsimopoieitai gia thn anafora sfalmatwn. Sygkekrimena kaleitai
apo thn yyparse otan yparksei kapoio syntaktiko lathos. Sthn parakatw periptwsh h
synarthsh epi ths ousias typwnei mhnyma lathous sthn othonh. */
void yyerror(char *s) {
fprintf(stderr, "Error: %s\n", s);
}
/* H synarthsh main pou apotelei kai to shmeio ekkinhshs tou programmatos.
Sthn sygkekrimenh periptwsh apla kalei thn synarthsh yyparse tou Bison
gia na ksekinhsei h syntaktikh analysh. */
int main(void) {
yyparse();
return 0;
}
TOKEN FILE:
#define DELIMITER 1
#define INTCONST 2
#define VARIABLE 3
#define DEFINITIONS 4
#define COMMENTS 5
#define BIND 6
#define TEST 7
#define READ 8
#define PRINTOUT 9
#define DEFFACTS 10
#define DEFRULE 11
MAKEFILE:
all:
bison -d simple-bison-code.y
flex mini-clips-la.l
gcc simple-bison-code.tab.c lex.yy.c -o B2
./B2
clean:
rm simple-bison-code.tab.c simple-bison-code.tab.h lex.yy.c B2
Your top-level rule is:
program:
program expr NEWLINE
which cannot succeed unless the parser sees a NEWLINE token. But it will never see one, because your lexical scanner never sends one; when it sees a newline, it increments the line count but doesn't return anything.
All your tokens are considered invalid because your lexical scanner uses its own definitions of the token values. You shouldn't do that. The parser generator (bison/yacc) will generate a header file containing the correct definitions; that is, the values it is expecting to see.
There are various other problems, probably more than I noticed. The most important is that you should not call exit(0) in the <<EOF>> rule, since that will mean that the parser can never succeed; it does not succeed until it is passed an EOF token. In fact, you should not normally have an <<EOF>> rule; the default action is to return 0 and that is pretty well the only action which makes sense.
Also, '->' is not a correct C literal. The compiler would have complained about it if you had enabled compiler warnings (-Wall), which you should always do, even if you are compiling generated code.
And your scanner's last pattern, intended to trigger on bad tokens, is .+, which will match the entire line, not just the erroneous character. Since (f)lex scanners accept the pattern with the longest match, most of your other patterns will never match. (Flex usually warns you about unmatchable patterns. Didn't you get such a warning?)
The fallback pattern should be .|\n, although you can use . if you are absolutely sure that every newline will be matched by some rule. I like to use %option nodefault, which will cause flex to warn me if there is some possible input not matched by any rule.
I have the following code in Bison, which extends the mfcalc proposed in the guide, implementing some functions like yylex() externally with FLEX.
To understand my problem, the key rules are in non-terminal token called line at the beginning of the grammar. Concretely, the rules EVAL CLOSED_STRING '\n' and END (this token is sent by FLEX when EOF is detected. The first opens a file and points the input to that file. The second closes the file and points the input to stdin input.
I'm trying to make a rule eval "file_path" to load tokens from a file and evaluate them. Initially I have yyin = stdin (I use the function setStandardInput() to do this).
When a user introduces eval "file_path" the parser swaps yyinfrom stdin to the file pointer (with the function setFileInput()) and the tokens are readen correctly.
When the END rule is reached by the parser, it tries to restore the stdin input but it gets bugged. This bug means the calculator doesn't ends but what I write in the input isn't evaluated.
Note: I supposed there are no errors in the grammar, because error recovery it's not complete. In the file_path you can use simple arithmetic operations.
As a summary, I want to swap among stdin and file pointers as inputs, but when I swap to stdin it gets bugged, except I start the calculator with stdin as default.
%{
/* Library includes */
#include <stdio.h>
#include <math.h>
#include "utils/fileutils.h"
#include "lex.yy.h"
#include "utils/errors.h"
#include "utils/stringutils.h"
#include "table.h"
void setStandardInput();
void setFileInput(char * filePath);
/* External functions and variables from flex */
extern size_t yyleng;
extern FILE * yyin;
extern int parsing_line;
extern char * yytext;
//extern int yyerror(char *s);
extern int yyparse();
extern int yylex();
int yyerror(char * s);
%}
/***** TOKEN DEFINITION *****/
%union{
char * text;
double value;
}
%type <value> exp asig
%token LS
%token EVAL
%token <text> ID
%token <text> VAR
%token <value> FUNCTION
%token <value> LEXEME
%token <value> RESERVED_WORD
%token <value> NUMBER
%token <value> INTEGER
%token <value> FLOAT
%token <value> BINARY
%token <value> SCIENTIFIC_NOTATION
%token <text> CLOSED_STRING
%token DOCUMENTATION
%token COMMENT
%token POW
%token UNRECOGNIZED_CHAR
%token MALFORMED_STRING_ERROR
%token STRING_NOT_CLOSED_ERROR
%token COMMENT_ERROR
%token DOCUMENTATION_ERROR
%token END
%right '='
%left '+' '-'
%left '/' '*'
%left NEG_MINUS
%right '^'
%right '('
%%
input: /* empty_expression */ |
input line
;
line: '\n'
| asig '\n' { printf("\t%f\n", $1); }
| asig END { printf("\t%f\n", $1); }
| LS { print_table(); }
| EVAL CLOSED_STRING '\n' {
// Getting the file path
char * filePath = deleteStringSorroundingQuotes($2);
setFileInput(filePath);
| END { closeFile(yyin); setStandardInput();}
;
exp: NUMBER { $$ = $1; }
| VAR {
lex * result = table_search($1, LEXEME);
if(result != NULL) $$ = result->value;
}
| VAR '(' exp ')' {
lex * result = table_search($1, FUNCTION);
// If the result is a function, then invokes it
if(result != NULL) $$ = (*(result->function))($3);
else yyerror("That identifier is not a function.");
}
| exp '+' exp { $$ = $1 + $3; }
| exp '-' exp { $$ = $1 - $3; }
| exp '*' exp { $$ = $1 * $3; }
| exp '/' exp {
if($3 != 0){ $$ = $1 / $3;};
yyerror("You can't divide a number by zero");
}
| '-' exp %prec NEG_MINUS { $$ = -$2; }
| exp '^' exp { $$ = pow($1, $3); }
| '(' exp ')' { $$ = $2; }
| '(' error ')' {
yyerror("An error has ocurred between the parenthesis."); yyerrok; yyclearin;
}
;
asig: exp { $$ = $1; }
| VAR '=' asig {
int type = insertLexeme($1, $3);
if(type == RESERVED_WORD){
yyerror("You tried to assign a value to a reserved word.");
YYERROR;
}else if(type == FUNCTION){
yyerror("You tried to assign a value to a function.");
YYERROR;
}
$$ = $3;
}
;
%%
void setStandardInput(){
printf("Starting standard input:\n");
yyin = NULL;
yyin = stdin;
yyparse();
}
void setFileInput(char * filePath){
FILE * inputFile = openFile(filePath);
if(inputFile == NULL){
printf("The file couldn't be loaded. Redirecting to standard input: \n");
setStandardInput();
}else{
yyin = inputFile;
}
}
int main(int argc, char ** argv) {
create_table(); // Table instantiation and initzialization
initTable(); // Symbol table initzialization
setStandardInput(); // yyin = stdin
while(yyparse()!=1);
print_table();
// Table memory liberation
destroyTable();
return 0;
}
int yyerror(char * s){
printf("---------- Error in line %d --> %s ----------------\n", parsing_line, s);
return 0;
}
It's not too difficult to create a parser and a scanner which can be called recursively. (See below for an example.) But neither the default bison-generated parser nor the flex-generated scanner are designed to be reentrant. So with the default parser/scanner, you shouldn't call yyparse() inside SetStandardInput, because that function is itself called by yyparse.
If you had a recursive parser and scanner, on the other hand, you could significantly simplify your logic. You could get rid of the END token (which is, in any case, practically never a good idea) and just recursively call yyparse in your action for EVAL CLOSED_STRING '\n'.
If you want to use the default parser and scanner, then your best solution is to use Flex's buffer stack to push and later pop a "buffer" corresponding to the file to be evaluated. (The word "buffer" here is a bit confusing, I think. A Flex "buffer" is actually an input source, such as a file; it's called a buffer because only a part of it is in memory, but Flex will read the entire input source as part of processing a "buffer".)
You can read about the buffer stack usage in the flex manual, which includes sample code. Note that in the sample code, the end of file condition is entirely handled inside the scanner, which is usual for this architecture.
It is possible in this case to manufacture an end-of-file indicator (although you cannot use END because that is used to indicate the end of all input). That has the advantage of ensuring that the contents of the evaluated file are parsed as a whole, without leaking a partial parse back to the including file, but you will still want to pop the buffer stack inside the scanner because it annoyingly tricky to get end-of-file handling correct without violating any of the API constraints (one of which is that you cannot reliably read EOF twice on the same "buffer").
In this case, I would recommend generating a reentrant parser and scanner and simply doing a recursive call. It's a clean and simple solution, and avoiding global variables is always good.
A simple example. The simple language below only has echo and eval statements, both of which require a quoted string argument.
There are a variety of ways to hook together a reentrant scanner and reentrant parser. All of them have some quirks and the documentation (although definitely worth reading) has some holes. This is a solution which I've found useful. Note that most of the externally visible functions are defined in the scanner file, because they rely on interfaces defined in that file for manipulating the reentrant scanner context object. You can get flex to export a header with the approriate definitions, but I've generally found it simpler to write my own wrapper functions and export those. (I don't usually export yyscan_t either; normally I create a context object of my own which has a yyscan_t member.)
There is an annoying circularity which is largely the result of bison not allowing for the possibility to introduce user code at the top of yyparse. Consequently, it is necessary to pass the yyscan_t used to call the lexer as an argument to yyparse, which means that it is necessary to declare yyscan_t in the bison file. yyscan_t is actually declared in the scanner generated file (or the flex-generated header, if you've asked for one), but you can't include the flex-generated header in the bison-generated header because the flex-generated header requires YYSTYPE which is declared in the bison-generated header.
I normally avoid this circularity by using a push parser, but that's pushing the boundaries for this question, so I just resorted to the usual work-around, which is to insert
typedef void* yyscan_t;
in the bison file. (That's the actual definition of yyscan_t, whose actual contents are supposed to be opaque.)
I hope the rest of the example is self-evident, but please feel free to ask for clarification if there is anything which you don't understand.
file recur.l
%{
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "recur.tab.h"
%}
%option reentrant bison-bridge
%option noinput nounput nodefault noyywrap
%option yylineno
%%
"echo" { return T_ECHO; }
"eval" { return T_EVAL; }
[[:alpha:]][[:alnum:]]* {
yylval->text = strdup(yytext);
return ID;
}
["] { yyerror(yyscanner, "Unterminated string constant"); }
["][^"\n]*["] {
yylval->text = malloc(yyleng - 1);
memcpy(yylval->text, yytext + 1, yyleng - 2);
yylval->text[yyleng - 2] = '\0';
return STRING;
}
"." { return yytext[0]; }
[[:digit:]]*("."[[:digit:]]*)? {
yylval->number = strtod(yytext, NULL);
return NUMBER;
}
[ \t]+ ;
.|\n { return yytext[0]; }
%%
/* Use "-" or NULL to parse stdin */
int parseFile(const char* path) {
FILE* in = stdin;
if (path && strcmp(path, "-") != 0) {
in = fopen(path, "r");
if (!in) {
fprintf(stderr, "Could not open file '%s'\n", path);
return 1;
}
}
yyscan_t scanner;
yylex_init (&scanner);
yyset_in(in, scanner);
int rv = yyparse(scanner);
yylex_destroy(scanner);
if (in != stdin) fclose(in);
return rv;
}
void yyerror(yyscan_t yyscanner, const char* msg) {
fprintf(stderr, "At line %d: %s\n", yyget_lineno(yyscanner), msg);
}
file recur.y
%code {
#include <stdio.h>
}
%define api.pure full
%param { scanner_t context }
%union {
char* text;
double number;
}
%code requires {
int parseFILE(FILE* in);
}
%token ECHO "echo" EVAL "eval"
%token STRING ID NUMBER
%%
program: %empty | program command '\n'
command: echo | eval | %empty
echo: "echo" STRING { printf("%s\n", $2); }
eval: "eval" STRING { FILE* f = fopen($2, "r");
if (f) {
parseFILE(f);
close(f);
}
else {
fprintf(stderr, "Could not open file '%s'\n",
$2);
YYABORT;
}
}
%%
I am fairly new to Yacc and Lex programming but I am training myself with a analyser for C programs.
However, I am facing a small issue that I didn't manage to solve.
When there is a declaration for example like int a,b; I want to save a and b in an simple array. I did manage to do that but it saving a bit more that wanted.
It is actually saving "a," or "b;" instead of "a" and "b".
It should have worked as $1should only return tID which is a regular expression recognising only a string chain. I don't understand why it take the coma even though it defined as a token. Does anyone know how to solve this problem ?
Here is the corresponding yacc declarations :
Declaration :
tINT Decl1 DeclN
{printf("Declaration %s\n", $2);}
| tCONST Decl1 DeclN
{printf("Declaration %s\n", $2);}
;
Decl1 :
tID
{$$ = $1;
tabvar[compteur].id=$1; tabvar[compteur].adresse=compteur;
printf("Added %s at adress %d\n", $1, compteur);
compteur++;}
| tID tEQ E
{$$ = $1;
tabvar[compteur].id=$1; tabvar[compteur].adresse=compteur;
printf("Added %s at adress %d\n", $1, compteur);
pile[compteur]=$3;
compteur++;}
;
DeclN :
/*epsilon*/
| tVIR Decl1 DeclN
And the extract of the Lex file :
separateur [ \t\n\r]
id [A-Za-z][A-Za-z0-9_]*
nb [0-9]+
nbdec [0-9]+\.[0-9]+
nbexp [0-9]+e[0-9]+
"," { return tVIR; }
";" { return tPV; }
"=" { return tEQ; }
{separateur} ;
{id} { yylval.str = yytext; return tID; }
{nb}|{nbdec}|{nbexp} { yylval.nb = atoi(yytext); return tNB; }
%%
int yywrap() {return 1;}
The problem is that yytext is a reference into lex's token scanning buffer, so it is only valid until the next time the parser calls yylex. You need to make a copy of the string in yytext if you want to return it. Something like:
{id} { yylval.str = strdup(yytext); return tID; }
will do the trick, though it also exposes you to the possibility of memory leaks.
Also, in general when writing lex/yacc parsers involving single character tokens, it is much clearer to use them directly as charcter constants (eg ',', ';', and '=') rather than defining named tokens (tVIR, tPV, and tEQ in your code).
I'm trying to implement a calculator for nor expressions, such as true nor true nor (false nor false) using Flex and Bison, but I keep getting my error message back. Here is my .l file:
%{
#include <stdlib.h>
#include "y.tab.h"
%}
%%
("true"|"false") {return BOOLEAN;}
.|\n {yyerror();}
%%
int main(void)
{
yyparse();
return 0;
}
int yywrap(void)
{
return 0;
}
int yyerror(void)
{
printf("Error\n");
}
Here is my .y file:
/* Bison declarations. */
%token BOOLEAN
%left 'nor'
%% /* The grammar follows. */
input:
/* empty */
| input line
;
line:
'\n'
| exp '\n' { printf ("%s",$1); }
;
exp:
BOOLEAN { $$ = $1; }
| exp 'nor' exp { $$ = !($1 || $3); }
| '(' exp ')' { $$ = $2; }
;
%%
Does anyone see the problem?
The simple way to handle all the single-character tokens, which as #vitaut correctly says you aren't handling at all yet, is to return yytext[0] for the dot rule, and let the parser sort out which ones are legal.
You have also lost the values of the BOOLEANs 'true' and 'false', which should be stored into yylval as 1 and 0 respectively, which will then turn up in $1, $3 etc. If you're going to have more datatypes in the longer term, you need to look into the %union directive.
The reason why you get errors is that your lexer only recognizes one type of token, namely BOOLEAN, but not the newline, parentheses or nor (and you produce an error for everything else). For single letter tokens like parentheses and the newline you can return the character itself as a token type:
\n { return '\n'; }
For nor thought you should introduce a token type like you did for BOOLEAN and add an appropriate rule to the lexer.