Why is bison include not applying/extending to %union?

So here is the start of my program. I keep getting the error
boolExpr.y:13:2: error: unknown type name 'bool'
bool boolean;
However, when I check the bison-generated file, I can see that stdbool.h is included at the start of the program. I can't figure out how a library can be imported and yet bool not be recognized. I'm thinking I missed something simple, or I need to reinstall bison or lex. I can include the rest of the program if needed.
I tried to switch it to int boolean; instead of bool boolean; and that fixed the compilation problem, however it still mystifies me.
Is there some way to put a pointer to a struct into %union without getting compile errors? I tried using structName * boolean; in place of bool boolean; but that kept coming back with an undefined / -Wimplicit error as well.
%{
#include "semantics.h"
#include <stdbool.h>
#include "IOMngr.h"
#include <string.h>
extern int yylex(); /* The next token function. */
extern char *yytext; /* The matched token text. */
extern int yyerror(char *s);
extern SymTab *table;
extern SymEntry *entry;
%}
%union{
bool boolean; /* this is the line the error points to */
char * string;
}
%type <string> Id
%type <boolean> Expr
%type <boolean> Term
%type <boolean> Factor
%token Ident
%token TRUE
%token FALSE
%token OR
%token AND
%%
Prog : StmtSeq {printSymTab();};
StmtSeq : Stmt StmtSeq { };
StmtSeq : { };
Stmt : Id '=' Expr ';' {storeVar($1, $3);};
Expr : Expr OR Term {$$ = doOR($1, $3);};
Expr : Term {$$ = $1;};
Term : Term AND Factor {$$ = doAND($1, $3);};
Term : Factor {$$ = $1;};
Factor : '!' Factor {$$ = doNOT($2);};
Factor : '(' Expr ')' {$$ = $2;};
Factor : Id {$$ = getVal($1);};
Factor : TRUE {$$ = true;};
Factor : FALSE {$$ = false;};
Id : Ident {$$ = strdup(yytext);};
%%
int yyerror(char *s) {
WriteIndicator(getCurrentColumnNum());
WriteMessage("Illegal Character in YACC");
return 1;
}

OK, it was a silly mistake after all. In my lex file, I had
#include "h4.tab.h"
#include "SymTab.h"
#include <stdbool.h>
but it should have been
#include "SymTab.h"
#include <stdbool.h>
#include "h4.tab.h"
didn't realize include order mattered!

When you use a %union declaration, Bison creates a union type called YYSTYPE which it declares in the generated header file; that type is used in the declaration of yylval, which is also in the generated header file. Putting the declarations in the generated header file means that you don't need to do anything to make yylval and its type YYSTYPE available in your lexical analyser, other than #includeing the bison-generated header file.
That's fine if all the types referred to in the %union declaration are standard C types. But there is a problem if you want to use a type which requires an #included header file, or which you yourself define. Assuming you put the necessary lines in a bison code prologue (%{...%}) before the %union declaration, you won't have a problem compiling your parser, but you will run into a problem in the lexical analyser. When you #include the bison-generated header file, you will effectively insert the declaration of union YYSTYPE, and that will fail unless all of the referenced types have already been defined.
Of course, you can solve this issue by just copying all the necessary #includes and/or definitions from the .y file to the .l file, making sure that you put the ones needed for the %union declaration before the #include of the header file and the ones which require YYSTYPE to be defined after the #include. But that violates the principles of good software design; it means that every time you change an #include or declaration in your .y file, you need to think about whether and where you need to make a similar change in your .l file. Fortunately, bison provides a more convenient mechanism.
The ideal would be to arrange for everything needed to be inserted into the generated header file. Then you can just #include "h4.tab.h" in the lexical analyzer, confident that you don't need to do anything else to ensure that needed #includes are present and in the correct order.
To this end, Bison provides a more flexible alternative to %{...%}, the %code directive:
%code [where] {
// code to insert
}
There are a few possible values for where, which are documented in the Bison documentation.
Two of these are used to maintain the generated header file:
%code requires { ... } inserts the code block in both the header file and the source file, in both cases before the union declaration. This is the code block type you should use for dependencies of the semantic and location types.
%code provides { ... } also inserts the code block into both the header and the source file, but this time after the union declaration. You can use this block type if you have some interfaces which themselves refer to YYSTYPE.
You can still use %{...%} to insert code directly into the output source code. But you might want to instead use
* %code { ... }
which, like %{...%}, only inserts the code in the source file. Unlike %{...%}, it inserts the code in a defined place in the source file, after the YYSTYPE and other declarations. This avoids obscure problems with %{...%} blocks, which are sometimes inserted early and sometimes inserted late and therefore can suddenly fail to compile if you change the order of apparently unrelated bison directives.

Related

yydestruct too few arguments to function call (flex&bison)

I'm trying to make a reentrant flex&bison parser but I got this strange error:
too few arguments to function call, expected 5, have 4
I can see that the code generated by Bison looks like this:
static void
yydestruct (const char *yymsg,
yysymbol_kind_t yykind, YYSTYPE *yyvaluep, void *scanner, struct BisonOutput *out)
{ ...some code... }
and
int
yyparse (void *scanner, struct BisonOutput *out)
{
...some code...
yydestruct ("Cleanup: discarding lookahead",
yytoken, &yylval, out); // <--- here void*scanner parameter is clearly missing
...some code...
}
My code is this:
%define api.pure full
%lex-param {void *scanner}
%parse-param {void *scanner, struct BisonOutput *out}
%{
struct BisonOutput{
int out;
};
#include "syntax_parser.h"
#include "lex.yy.h"
#include <stdio.h>
%}
%define api.value.type union
%token <int> NUM
...bunch of other tokens...
%%
...bunch of grammar rules...
%%
... main function and such ...
And Flex code is as follows:
%{
#include "syntax_parser.h"
%}
%option reentrant bison-bridge noyywrap
blanks [ \t\n]+
number [0-9]+
%option noyywrap
%%
... bunch of rules ...
I'm really lost. Why doesn't bison plug scanner into yydestruct despite clearly using it in yyparse?

You are not allowed to put two parameters in a %*-param declaration. The correct way to produce the set of parameters you want is:
%param { void* scanner }
%parse-param { struct BisonOutput* out }
Bison doesn't really parse the code between { and }. All it does is identify the last identifier which it assumes is the name of the parameter. It also assumes that the code is a syntactically-correct declaration of a single parameter, and it is inserted as such in the prototypes. Since it's actually two parameters, it can be inserted without problem into a prototype, but since only one argument is inserted into the calls to the function, these don't match the prototype.
(Really, void* scanner should be yyscan_t scanner, with a prior typedef void* yyscan_t;. But perhaps it is not really better.)
You might also consider putting the declaration of struct BisonOutput into a %code requires (or %code provides) block, so that it is automatically included in the bison-generated header file.

Flex And Bison, detecting macro statements (newbie)

I want to teach flex & bison to detect macro definitions in pure C. Actually, I'm adding this functionality to the existing parser from here. The parser itself is good, but it lacks macro functionality. I successfully added detection of the #include and pragma macros, but I have problems with the selection macros. This is the code in the parser:
macro_selection_variants
: macro_compound_statement
| include_statement
| pragma_statement
| macro_selection_statement
| statement
;
macro_selection_statement
: MACRO_IFDEF IDENTIFIER macro_selection_variants MACRO_ENDIF
| MACRO_IFDEF IDENTIFIER macro_selection_variants MACRO_ELSE macro_selection_variants MACRO_ENDIF
| MACRO_IFNDEF IDENTIFIER macro_selection_variants MACRO_ENDIF
| MACRO_IFNDEF IDENTIFIER macro_selection_variants MACRO_ELSE macro_selection_variants MACRO_ENDIF
;
statement is declared like so:
statement
: labeled_statement
| compound_statement
| expression_statement
| selection_statement
| iteration_statement
| jump_statement
;
And the lexer part for those macros is:
"#ifdef" { count(); return(MACRO_IFDEF); }
"#ifndef" { count(); return(MACRO_IFNDEF); }
"#else" { count(); return(MACRO_ELSE); }
"#endif" { count(); return(MACRO_ENDIF); }
So the problem is that I get 2 reduce/reduce errors because I'm trying to use statement in my macro_selection_statement. I need to use statement in the macro selection block, because those blocks can contain variable definitions, like so:
#ifdef USER
#include "user.h"
int some_var;
char some_text[]="hello";
#ifdef ONE
int two=0;
#endif
#endif
What would be the right move here? I ask because I have read that using %expect-rr N to silence the reduce/reduce warnings is a really bad thing to do.

You cannot really expect to implement a preprocessor (properly) inside of a C grammar. It needs to be a *pre*processor; that is, it reads the program text, and its output is sent to the C grammar.
It is possible to (mostly) avoid doing a second lex pass, since (in theory) the preprocessor can output tokens rather than a stream of characters. That would probably work well with a bison 2.7-or-better "push parser", so you might want to give it a try. But the traditional approach is just a stream of characters, which may well be easier.
It's important to remember that the replacement text of a macro, as well as the arguments to the macro, have no syntactic constraints. (Or almost no constraints.) The following is entirely legal:
#define OPEN {
#define CLOSE }
#define SAY(whatever) puts(#whatever);
#include <stdio.h>
int main(int argc, char** argv) OPEN SAY(===>) return 0; CLOSE
And that's just a start :)

What are the keywords '%type' and '%token' used for in C?

While I was digging into the OpenNTPD source code, I noticed new keywords and syntax that I've never seen in any C code before, such as %}, %%, %type and %token, in a file named parse.y:
%{
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
...
%}
%token LISTEN ON
%token SERVER SERVERS SENSOR CORRECTION RTABLE REFID WEIGHT
%token ERROR
%token <v.string> STRING
%token <v.number>
....
grammar : /* empty */
| grammar '\n'
| grammar main '\n'
| grammar error '\n' { file->errors++; }
;
main : LISTEN ON address listen_opts {
struct listen_addr *la;
struct ntp_addr *h, *next;
if ($3->a)
...
Most of the file's contents have the usual C syntax except these keywords. Does someone know what these keywords are and what they are used for?

My guess is that this is Yacc code (i.e. the definition of a grammar), not plain C. This is a notation similar to BNF.

And if you look at *.l files, you might also see a lot of C code, mixed with %%, %x, %s, %option etc. Then it's a lexer input file, which often is accompanied by a yacc *.y file.

yacc - field has incomplete type

yacc doesn't seem to like when my tokens are of a type that I defined.
At the top of my grammar (.y) file in a %{ ... %} block, I include a header file that defines the following structure:
typedef struct _spim_register {
spim_register_type type; /* This is a simple enumeration, already defined */
int number;
} spim_register;
Before my list of rules, I have:
%token AREG
...
%union {
struct _spim_register reg;
}
...
%type <reg> register AREG
I get
error: field ‘reg’ has incomplete type
at the line in the %union clause while trying to compile the code produced by bison. In my %union statement, trying to declare reg by writing spim_register reg; gives the error:
unknown type name ‘spim_register’
It seems like there's something special about %union { ... }, because I'm able to use the data structures from my header file in the actions for the rules.

It would help if my #includes were in the right order...
The answer was, as user786653 hinted, here. I needed to include the header file that defines my custom structure before including the .tab.h file in the .l file.

I ran into the same problem, because my *.l file looked like this:
#include "y.tab.h"
#include "FP.h"
Then I rewrote it like this:
#include "FP.h"
#include "y.tab.h"
It works. Thank you very much. #ArIck

what is the use of tokens.h when I am programming a lexer?

I am programming a lexer in C and I read somewhere about the header file tokens.h. Does it exist? If so, what is its use?

tokens.h is most likely a header generated by yacc or bison (the default names are y.tab.h and <grammar>.tab.h, but the name can be chosen) that contains a list of the tokens in your grammar.
Your yacc/bison input file may contain token declarations like:
%token INTEGER
%token ID
%token STRING
%token SPACE
Running this file through yacc/bison will result in a tokens.h file that contains preprocessor definitions for these tokens:
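/* Something like this; the exact numbers vary, but they are above 255
   so that they cannot clash with character codes. */
#define INTEGER (258)
#define ID (259)
#define STRING (260)
#define SPACE (261)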

Probably, tokens.h is a file generated by the parser generator (Yacc/Bison) containing token definitions so you can return tokens from the lexer to the parser.
With Lex/Flex and Yacc/Bison, it works like this:
parser.y:
%token FOO
%token BAR
%%
start: FOO BAR;
%%
lexer.l:
%{
#include "tokens.h"
%}
%%
foo {return FOO;}
bar {return BAR;}
%%
