I'm trying to make a reentrant flex&bison parser but I got this strange error:
too few arguments to function call, expected 5, have 4
I can see that the code generated by Bison looks like this:
static void
yydestruct (const char *yymsg,
yysymbol_kind_t yykind, YYSTYPE *yyvaluep, void *scanner, struct BisonOutput *out)
{ ...some code... }
and
int
yyparse (void *scanner, struct BisonOutput *out)
{
...some code...
yydestruct ("Cleanup: discarding lookahead",
yytoken, &yylval, out); // <--- here void*scanner parameter is clearly missing
...some code...
}
My code is this:
%define api.pure full
%lex-param {void *scanner}
%parse-param {void *scanner, struct BisonOutput *out}
%{
struct BisonOutput{
int out;
};
#include "syntax_parser.h"
#include "lex.yy.h"
#include <stdio.h>
%}
%define api.value.type union
%token <int> NUM
...bunch of other tokens...
%%
...bunch of grammar rules...
%%
... main function and such ...
And Flex code is as follows:
%{
#include "syntax_parser.h"
%}
%option reentrant bison-bridge noyywrap
blanks [ \t\n]+
number [0-9]+
%option noyywrap
%%
... bunch of rules ...
I'm really lost. Why doesn't bison plug scanner into yydestruct despite clearly using it in yyparse?
You are not allowed to put two parameters in a %*-param declaration. The correct way to produce the set of parameters you want is:
%param { void* scanner }
%parse-param { struct BisonOutput* out }
Bison doesn't really parse the code between { and }. All it does is identify the last identifier which it assumes is the name of the parameter. It also assumes that the code is a syntactically-correct declaration of a single parameter, and it is inserted as such in the prototypes. Since it's actually two parameters, it can be inserted without problem into a prototype, but since only one argument is inserted into the calls to the function, these don't match the prototype.
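To make the "last identifier" behaviour concrete, here is a rough imitation in plain C. It is only a sketch of the behaviour described above, not Bison's actual code, and the helper name last_identifier is made up for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the last C identifier found in `decl` into `out`.
   This only imitates the behaviour described above; it is not Bison's code. */
static void last_identifier(const char *decl, char *out, size_t outsize)
{
    assert(decl != NULL && out != NULL && outsize > 0);
    size_t end = strlen(decl);

    /* Skip trailing characters that cannot be part of an identifier. */
    while (end > 0 &&
           !(isalnum((unsigned char)decl[end - 1]) || decl[end - 1] == '_'))
        end--;

    /* Walk back to the start of that identifier. */
    size_t start = end;
    while (start > 0 &&
           (isalnum((unsigned char)decl[start - 1]) || decl[start - 1] == '_'))
        start--;

    /* Copy it out, truncating if the buffer is too small. */
    size_t len = end - start;
    if (len >= outsize)
        len = outsize - 1;
    memcpy(out, decl + start, len);
    out[len] = '\0';
}
```

Given the text "void *scanner, struct BisonOutput *out", this helper yields only out. That matches the symptom in the question: the whole text lands in the prototype, but only out is passed in the call.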
(Really, void* scanner should be yyscan_t scanner, with a prior typedef void* yyscan_t;. But perhaps it is not really better.)
You might also consider putting the declaration of struct BisonOutput into a %code requires (or %code provides) block, so that it is automatically included in the bison-generated header file.
So here is the start of my program. I keep getting the error
boolExpr.y:13:2: error: unknown type name 'bool'
bool boolean;
However, when I check the bison-generated file, I can see that stdbool.h is included near the top. I can't figure out how a header can be included and yet bool not be recognized. I'm thinking I missed something simple, or I need to reinstall bison or lex. I can include the rest of the program if needed.
I tried to switch it to int boolean; instead of bool boolean; and that fixed the compilation problem, however it still mystifies me.
Is there some way to put a pointer to a struct into %union without getting compile errors? I tried structName * boolean; in place of bool boolean; but that kept coming back with an "undefined" (-Wimplicit) error as well.
%{
#include "semantics.h"
#include <stdbool.h>
#include "IOMngr.h"
#include <string.h>
extern int yylex(); /* The next token function. */
extern char *yytext; /* The matched token text. */
extern int yyerror(char *s);
extern SymTab *table;
extern SymEntry *entry;
%}
%union{
bool boolean; /* this is the line the error points to */
char * string;
}
%type <string> Id
%type <boolean> Expr
%type <boolean> Term
%type <boolean> Factor
%token Ident
%token TRUE
%token FALSE
%token OR
%token AND
%%
Prog : StmtSeq {printSymTab();};
StmtSeq : Stmt StmtSeq { };
StmtSeq : { };
Stmt : Id '=' Expr ';' {storeVar($1, $3);};
Expr : Expr OR Term {$$ = doOR($1, $3);};
Expr : Term {$$ = $1;};
Term : Term AND Factor {$$ = doAND($1, $3);};
Term : Factor {$$ = $1;};
Factor : '!' Factor {$$ = doNOT($2);};
Factor : '(' Expr ')' {$$ = $2;};
Factor : Id {$$ = getVal($1);};
Factor : TRUE {$$ = true;};
Factor : FALSE {$$ = false;};
Id : Ident {$$ = strdup(yytext);};
%%
int yyerror(char *s) {
WriteIndicator(getCurrentColumnNum());
WriteMessage("Illegal Character in YACC");
return 1;
}
Ok, silly mistake after all- in my lex file, I had
#include "h4.tab.h"
#include "SymTab.h"
#include <stdbool.h>
but it should have been
#include "SymTab.h"
#include <stdbool.h>
#include "h4.tab.h"
didn't realize include order mattered!
When you use a %union declaration, Bison creates a union type called YYSTYPE which it declares in the generated header file; that type is used in the declaration of yylval, which is also in the generated header file. Putting the declarations in the generated header file means that you don't need to do anything to make yylval and its type YYSTYPE available in your lexical analyser, other than #includeing the bison-generated header file.
That's fine if all the types referred to in the %union declaration are standard C types. But there is a problem if you want to use a type which requires an #included header file, or which you yourself define. Assuming you put the necessary lines in a bison code prologue (%{...%}) before the %union declaration, you won't have a problem compiling your parser, but you will run into a problem in the lexical analyser. When you #include the generated header file, you will effectively insert the declaration of union YYSTYPE, and that will fail unless all of the referenced types have already been defined.
Of course, you can solve this issue by just copying all the necessary #includes and/or definitions from the .y file to the .l file, making sure that you put the ones needed for the union declaration before the #include of the header file and the ones which require YYSTYPE to be defined after the #include. But that violates the principles of good software design; it means that every time you change an #include or declaration in your .y file, you need to think about whether and where you need to make a similar change in your .l file. Fortunately, bison provides a more convenient mechanism.
The ideal would be to arrange for everything needed to be inserted into the generated header file. Then you can just #include "h4.tab.h" in the lexical analyzer, confident that you don't need to do anything else to ensure that needed #includes are present and in the correct order.
To this end, Bison provides a more flexible alternative to %{...%}, the %code directive:
%code [where] {
// code to insert
}
There are a few possible values for where, which are documented in the Bison documentation.
Two of these are used to maintain the generated header file:
%code requires { ... } inserts the code block in both the header file and the source file, in both cases before the union declaration. This is the code block type you should use for dependencies of the semantic and location types.
%code provides { ... } also inserts the code block into both the header and the source file, but this time after the union declaration. You can use this block type if you have some interfaces which themselves refer to YYSTYPE.
You can still use %{...%} to insert code directly into the output source code. But you might want to instead use
* %code { ... }
which, like %{...%}, only inserts the code in the source file. Unlike %{...%}, it inserts the code in a defined place in the source file, after the YYSTYPE and other declarations. This avoids obscure problems with %{...%} blocks, which are sometimes inserted early and sometimes inserted late and therefore can suddenly fail to compile if you change the order of apparently unrelated bison directives.
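Seen from the C compiler's side, the ordering that %code requires and %code provides are meant to guarantee looks roughly like the sketch below. It is written by hand to mirror the union from the question; the real header is produced by Bison and will differ in detail, and set_boolean is a made-up name.

```c
#include <assert.h>
#include <stdbool.h>   /* "requires" part: dependencies of the union */
#include <stddef.h>

/* Stand-in for the union Bison would generate from %union. */
union YYSTYPE {
    bool  boolean;
    char *string;
};
typedef union YYSTYPE YYSTYPE;

/* "provides" part: an interface that itself mentions YYSTYPE, so it can
   only appear after the union. set_boolean is a hypothetical helper. */
static void set_boolean(YYSTYPE *slot, bool value)
{
    assert(slot != NULL);
    slot->boolean = value;
}
```

If the #include of stdbool.h were moved below the union, the compiler would reject bool boolean; with the "unknown type name" error quoted in the question.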
I'm trying to develop a basic compiler and I'm using a union for yylval as follows:
%{
#include <string.h>
#include <stdio.h>
struct info {
int line;
/* details unimportant */
};
%}
%union{
char *str;
struct info *ptr;
}
In my lexer definition, I have
%{
#include "parse.tab.h"
%}
But when I compile the generated lexer, I get the following errors:
y.tab.h: unknown type name 'YYSTYPE'.
error: request for a member str in something not a structure or a union.
Do I need to #define YYSTYPE as well?
(I edited the original question to insert enough information from the source files to make the question answerable. Any mistakes in the transcription are my fault and I apologize -- Rici.)
No. If you use a %union declaration, you must not #define YYSTYPE; the bison manual makes this clear.
However, any necessary declarations -- in this case, the declaration of struct info -- must be included in your lexer description file (parse.l) as well. The two generated files are independent of each other, so the fact that struct info is declared in the parser does not make the definition automatically available to the lexer.
In order to avoid repeating the declarations, it is usually a good idea to put them in a separate header file:
file: info.h (added)
#ifndef INFO_H_HEADER_
#define INFO_H_HEADER_
struct info {
int line;
/* details unimportant */
};
// ...
#endif
file: parse.y (now #include's info.h instead of the in-line struct declaration)
%{
#include <stdio.h>
#include <string.h>
#include "info.h"
%}
%union{
char *str;
struct info *ptr;
}
file: parse.l (also #includes info.h)
%{
#include <stdio.h>
#include <string.h>
/* This must come *before* including parse.tab.h */
#include "info.h"
#include "parse.tab.h"
%}
The following is an example of how I use YYSTYPE:
typedef union { // base type filled by lexical analyzer
struct {
int numtype; // classval (type; selects into union below)
union {
int ival; // integer value
long lval; // long value
double dval; // double
} val;
} numval;
unsigned char *sval; // string value
} lex_baseval;
typedef struct { // type returned by lexical analyzer
int lineno;
lex_baseval lexval;
} YYSTYPE;
#define YYSTYPE YYSTYPE
The problem with your linked code is that the %union is inside the %{...%} at the top of your .y file -- which means that yacc just copies it verbatim to the y.tab.c file and does not actually process it.
This manifests most obviously as a syntax error on %union when you try to compile y.tab.c, but also means there's no YYSTYPE definition in y.tab.h, as yacc didn't see the %union so didn't create one.
I'm currently working on a simple infix-to-postfix compiler for a given grammar. I'm currently at the stage of syntax analysis. I have already written a lexical analyzer, using Flex library, however I'm stuck on a seemingly simple problem. The information below might seem like a lot to process, but I presume the problem is rather basic to anyone with some experience in compiler construction.
Here is my lexer:
%{
#include <stdlib.h>
#include "global.h"
int lineno = 1, tokenval = NONE;
%}
letter [A-Za-z]
digit [0-9]
id {letter}({letter}|{digit})*
%option noinput
%option nounput
%%
[ \t]+ {}
\n {lineno++;}
{digit}+ {tokenval = atoi(yytext);
printf("digit\n");
return NUM;}
{id} {int p;
p = lookup(yytext);
if(p==0){
p = insert(yytext, ID);
}
tokenval = p;
return symtable[p].token;
}
<<EOF>> {return DONE;}
. {tokenval = NONE;
return yytext[0];}
Nothing special here, just defining some tokens and handling them.
And my parser.y file:
%{
#include "global.h"
%}
%token digit
%%
start: line {printf("success!\n");};
line: expr ';' line | expr ;
expr: digit;
%%
void yyerror(char const *s)
{
printf("error\n");
};
int main()
{
yyparse();
return 0;
}
The problem is on the line:
expr: digit;
The parser evidently has some problem with the digit token: if I put any other constant there instead of digit, it all works fine, and expressions like -; or +; are accepted. I have no idea why this is happening, especially since I'm pretty sure my lexical analyzer works fine.
The global.h file is just a linkage for other files, contains necessary function prototypes and links to any necessary variables:
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <string.h>
#define BSIZE 128
#define NONE -1
#define EOS '\0'
#define NUM 256
#define DIV 257
#define MOD 258
#define ID 259
#define DONE 260
extern int tokenval;
extern int lineno;
struct entry
{
char *lexptr;
int token;
};
extern struct entry symtable[];
int insert (char s[], int tok);
void error (char *m) ;
int lookup (char s[]) ;
void init () ;
void parse () ;
int yylex (void) ;
void expr () ;
void term () ;
void factor () ;
void match (int t) ;
void emit (int t, int tval) ;
void yyerror(char const *s);
Your scanner returns NUM when it has found a sequence of digits, not digit. The identifier digit is just used internally in your Flex specification.
Then you have another digit defined as a token in your Bison grammar, but it is not connected in any way to the Flex one.
To fix this, use NUM, both in your Bison grammar and as a return value from the lexer. Don't declare it yourself with #define, but let Bison create those declarations, from your %token definitions. You can use the -d flag to get Bison to output a header file. Run Bison before Flex, and #include Bison's output header file, with NUM in it, in your Flex code.
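The point that the scanner and the grammar must agree on one token name and number can be sketched without Flex at all. In the block below, the enum is a hand-written stand-in for what Bison's generated header would contain (the value 258 is only a placeholder, Bison picks the real numbering), and next_token is a hypothetical hand-rolled scanner.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Stand-in for the token enum Bison writes into its header; the real
   numbering is chosen by Bison, 258 is only a placeholder here. */
enum yytokentype { NUM = 258 };

/* Hypothetical scanner: returns NUM and stores the value for a run of
   digits, returns the character itself otherwise, and 0 at end of input. */
static int next_token(const char **input, int *value)
{
    assert(input != NULL && *input != NULL && value != NULL);
    const char *p = *input;

    while (*p == ' ' || *p == '\t')      /* skip blanks */
        p++;
    if (*p == '\0') {                    /* end of input */
        *input = p;
        return 0;
    }
    if (isdigit((unsigned char)*p)) {    /* a number: token code NUM */
        int n = 0;
        while (isdigit((unsigned char)*p))
            n = n * 10 + (*p++ - '0');
        *value = n;
        *input = p;
        return NUM;
    }
    *input = p + 1;                      /* single-character token */
    return (unsigned char)*p;
}
```

The grammar would then refer to NUM as well (expr: NUM;), so both sides use the same definition instead of a #define in global.h that can drift out of sync.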
Some lines of my flex file:
%{
#include <stdlib.h>
#include <string.h>
#include "types.h"
#define NO_YY_UNPUT
/* #define YY_NEVER_INTERACTIVE */
extern char *strdup(const char *);
short unsigned int yylineno = 1;
%}
{ID} {
yylval.txt = strdup(yytext);
return ID;
};
\n { ++yylineno; }
My code looks good, but I have a problem when I try to compile it on Ubuntu. On Windows everything is okay, but on Linux I get errors like:
lex.l:10:14: error: expected identifier or ‘(’ before ‘__extension__’
lex.l:12:20: error: conflicting types for ‘yylineno’
lex.c:355:5: note: previous definition of ‘yylineno’ was here
Line 10: extern char *strdup(const char *);
Line 12: short unsigned int yylineno = 1;
strdup is declared in string.h, but it is a Posix interface and you should define an appropriate feature test macro before including any system header:
%top {
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include "types.h"
}
(Note: Using %top forces the enclosed code to be inserted at the top of the generated C file, in order to provide the guarantee that the feature test macro is defined before any system header.)
I don't know if that works on Windows (and it certainly depends on your compiler and toolchain), so you might need to declare strdup on that platform. If so, make sure you surround the declaration with a preprocessor test for the build environment.
The error at line 10 is probably the result of strdup being defined as a macro. I'm not sure under what conditions that will happen -- it will be some GNU extension mode -- but defining the Posix feature test macro should prevent it.
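Outside of Flex, the same rule can be shown in a few lines of C: define the feature test macro first, include the headers second, and do not declare strdup yourself. The helper name copy_lexeme is made up for this sketch, and it assumes a POSIX system.

```c
/* The feature test macro must precede every system header. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicate a lexeme the way an action might do with
   yytext. The caller owns the result and must free() it. */
static char *copy_lexeme(const char *text)
{
    assert(text != NULL);
    return strdup(text);   /* declared by <string.h> once POSIX is requested */
}
```

A scanner action would call it as yylval.txt = copy_lexeme(yytext); and the parser would be responsible for freeing the string later.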
As for the error with the type of yylineno, there is a simple solution: don't declare yylineno. It is declared in the code flex generates (and it may be declared as a macro if you ask for a re-entrant -- "pure" -- lexer).
yacc doesn't seem to like when my tokens are of a type that I defined.
At the top of my grammar (.y) file in a %{ ... %} block, I include a header file that defines the following structure:
typedef struct _spim_register {
spim_register_type type; /* This is a simple enumeration, already defined */
int number;
} spim_register;
Before my list of rules, I have:
%token AREG
...
%union {
struct _spim_register reg;
}
...
%type <reg> register AREG
I get
error: field ‘reg’ has incomplete type
at the line in the %union clause while trying to compile the code produced by bison. In my %union statement, trying to declare reg by writing spim_register reg; gives the error:
unknown type name ‘spim_register’
It seems like there's something special about %union { ... }, because I'm able to use the data structures from my header file in the actions for the rules.
It would help if my #includes were in the right order...
The answer was, as user786653 hinted, here. I needed to include the header file that defines my custom structure before including the .tab.h file in the .l file.
I ran into the same problem, because my *.l file looked like this:
#include "y.tab.h"
#include "FP.h"
Then I rewrote it like this:
#include "FP.h"
#include "y.tab.h"
It works. Thank you very much. #ArIck