Common tokens for flex and bison - c

I have one file, declarations.h, with the declarations of my tokens:
#define ID 257
#define NUM 258
...
In my flex code I return one of these values or a symbol (for example '+', '-', '*'), and everything works.
The problem is in the bison file.
If I write something like this:
exp: ID '+' ID
I get an error, because bison doesn't know anything about ID.
Adding the line %token ID does not help, because then I get a compilation error (the preprocessor replaces ID with 257, and I end up with 257=257).

You get Bison to create the list of tokens; your lexer uses the list generated by Bison.
bison -d grammar.y
# Generates grammar.tab.c and grammar.tab.h
Your lexer then uses grammar.tab.h:
$ cat grammar.y
%token ID
%%
program: /* Nothing */
| program ID
;
%%
$ cat lexer.l
%{
#include "grammar.tab.h"
%}
%%
[a-zA-Z][A-Za-z_0-9]* { return ID; }
[ \t\n] { /* Nothing */ }
. { return *yytext; }
%%
$ bison -d grammar.y
$ flex lexer.l
$ gcc -o testgrammar grammar.tab.c lex.yy.c -ly -lfl
$ ./testgrammar
id est
quod erat demonstrandum
$
Bison 2.4.3 on Mac OS X 10.7.2 generates the token numbers as an enum rather than as a series of #define values, which gets the token names into the symbol table for debuggers (a very good idea!).

Related

Flex & Bison: "true" being interpreted as an ID

I am writing a parser and a scanner on Ubuntu. In my flex code "scanner.l" I have an IDENTIFIER token and a BOOL_LITERAL token. IDENTIFIER is any word and BOOL_LITERAL is either true or false.
In my bison code "parser.y" I have the grammar, which should be able to accept a BOOL_LITERAL through the primary production.
However, the code is not working as intended.
Here are all of my files:
scanner.l
%{
#include <string>
#include <vector>
using namespace std;
#include "listing.h"
#include "tokens.h"
%}
%option noyywrap
ws [ \t\r]+
comment (\-\-.*\n)|\/\/.*\n
line [\n]
digit [0-9]
int {digit}+
real {int}"."{int}([eE][+-]?{digit})?
boolean ["true""false"]
punc [\(\),:;]
addop ["+""-"]
mulop ["*""\/"]
relop [="/=">">=""<="<]
id [A-Za-z][A-Za-z0-9]*
%%
{ws} { ECHO; }
{comment} { ECHO; nextLine();}
{line} { ECHO; nextLine();}
{relop} { ECHO; return(RELOP); }
{addop} { ECHO; return(ADDOP); }
{mulop} { ECHO; return(MULOP); }
begin { ECHO; return(BEGIN_); }
boolean { ECHO; return(BOOLEAN); }
end { ECHO; return(END); }
endreduce { ECHO; return(ENDREDUCE); }
function { ECHO; return(FUNCTION); }
integer { ECHO; return(INTEGER); }
real { ECHO; return(REAL); }
is { ECHO; return(IS); }
reduce { ECHO; return (REDUCE); }
returns { ECHO; return(RETURNS); }
and { ECHO; return(ANDOP); }
{boolean} { ECHO; return(BOOL_LITERAL); }
{id} { ECHO; return(IDENTIFIER);}
{int} { ECHO; return(INT_LITERAL); }
{real} { ECHO; return(REAL_LITERAL); }
{punc} { ECHO; return(yytext[0]); }
. { ECHO; appendError(LEXICAL, yytext); }
%%
parser.y
%{
#include <string>
using namespace std;
#include "listing.h"
int yylex();
void yyerror(const char* message);
%}
%error-verbose
%token INT_LITERAL REAL_LITERAL BOOL_LITERAL
%token IDENTIFIER
%token ADDOP MULOP RELOP ANDOP
%token BEGIN_ BOOLEAN END ENDREDUCE FUNCTION INTEGER IS REDUCE RETURNS REAL
%%
function:
function_header optional_variable body ;
function_header:
FUNCTION IDENTIFIER RETURNS type ';' ;
parameters:
parameters ',' |
parameter ;
parameter:
IDENTIFIER ':' type |
;
optional_variable:
variable |
;
variable:
IDENTIFIER ':' type IS statement_ ;
type:
INTEGER |
BOOLEAN |
REAL ;
body:
BEGIN_ statement_ END ';' ;
statement_:
statement ';' |
error ';' ;
statement:
expression |
REDUCE operator reductions ENDREDUCE ;
operator:
ADDOP |
MULOP ;
reductions:
reductions statement_ |
;
expression:
expression ANDOP relation |
relation ;
relation:
relation RELOP term |
term;
term:
term ADDOP factor |
factor ;
factor:
factor MULOP primary |
primary ;
primary:
'(' expression ')' |
INT_LITERAL |
REAL_LITERAL |
BOOL_LITERAL |
IDENTIFIER ;
%%
void yyerror(const char* message)
{
appendError(SYNTAX, message);
}
int main(int argc, char *argv[])
{
firstLine();
yyparse();
lastLine();
return 0;
}
Other associated files:
listing.h
enum ErrorCategories {LEXICAL, SYNTAX, GENERAL_SEMANTIC, DUPLICATE_IDENTIFIER,
UNDECLARED};
void firstLine();
void nextLine();
int lastLine();
void appendError(ErrorCategories errorCategory, string message);
listing.cc
#include <cstdio>
#include <string>
using namespace std;
#include "listing.h"
static int lineNumber;
static string error = "";
static int totalErrors = 0;
static void displayErrors();
void firstLine()
{
lineNumber = 1;
printf("\n%4d ",lineNumber);
}
void nextLine()
{
displayErrors();
lineNumber++;
printf("%4d ",lineNumber);
}
int lastLine()
{
printf("\r");
displayErrors();
printf(" \n");
return totalErrors;
}
void appendError(ErrorCategories errorCategory, string message)
{
string messages[] = { "Lexical Error, Invalid Character ", "",
"Semantic Error, ", "Semantic Error, Duplicate Identifier: ",
"Semantic Error, Undeclared " };
error = messages[errorCategory] + message;
totalErrors++;
}
void displayErrors()
{
if (error != "")
printf("%s\n", error.c_str());
error = "";
}
makefile
compile: scanner.o parser.o listing.o
g++ -o compile scanner.o parser.o listing.o
scanner.o: scanner.c listing.h tokens.h
g++ -c scanner.c
scanner.c: scanner.l
flex scanner.l
mv lex.yy.c scanner.c
parser.o: parser.c listing.h
g++ -c parser.c
parser.c tokens.h: parser.y
bison -d -v parser.y
mv parser.tab.c parser.c
mv parser.tab.h tokens.h
listing.o: listing.cc listing.h
g++ -c listing.cc
Note:
I have to run "makefile", then "bison -d parser.y", and finally "makefile" again. Then I run the command "./compile < incremental1.txt" and I get a syntax error, which I posted as a screenshot.
Please help me understand why I am getting a syntax error.
@SoronelHaetir has certainly identified one of the problems with your parser. But that problem cannot create the syntax error message which appears in your image. [Note 1] Your grammar allows identifiers in exactly the same place as boolean literals, so the fact that true is actually scanned as an identifier will not produce a syntax error in an expression which starts true and. (In other words, x and... would be parsed just the same.)
The problem is actually your use of 8.E+1 as a numeric literal. Your rule for REAL_LITERAL uses the pattern
{int}"."{int}([eE][+-]?{digit})?
which doesn't match 8.E+1 because no {int} follows the "." character. So when the scanner reaches the input 8.E+1, it produces the INT_LITERAL 8, which is the longest match. When it is asked for the next token, it first sees a ".", but that doesn't match any pattern so it uses the default fallback action (ECHO), and then continues to the next character (E) which matches the IDENTIFIER pattern. And the input
true and 8 E ...
is indeed a syntax error: there is an unexpected identifier following the 8, and that's what bison reports.
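The longest-match behaviour described above can be reproduced outside flex. The following is only a rough sketch: it uses Python's re module as a stand-in for flex patterns, with {int} and {digit} expanded by hand, and the ADDOP rule is simplified to the intended + and -. The names next_token, tokenize and FIXED_REAL are made up for illustration, and FIXED_REAL is just one possible correction, assuming the language really allows literals such as 8.E+1.

```python
import re

# Hand-expanded Python versions of some scanner.l definitions (an approximation).
RULES = [
    ("REAL_LITERAL", r"[0-9]+\.[0-9]+([eE][+-]?[0-9])?"),  # as in the question
    ("INT_LITERAL", r"[0-9]+"),
    ("IDENTIFIER", r"[A-Za-z][A-Za-z0-9]*"),
    ("ADDOP", r"[+-]"),  # simplified: the intended meaning of addop
]

# Hypothetical fix: fraction digits optional, exponent may have several digits.
FIXED_REAL = r"[0-9]+\.[0-9]*([eE][+-]?[0-9]+)?"


def next_token(text, pos):
    """Pick the rule with the longest match at pos; the earlier rule wins ties."""
    best_name, best_lexeme = "UNMATCHED", ""
    for name, pattern in RULES:
        m = re.compile(pattern).match(text, pos)
        if m and len(m.group()) > len(best_lexeme):
            best_name, best_lexeme = name, m.group()
    if not best_lexeme:
        best_lexeme = text[pos]  # a single character that no rule matches
    return best_name, best_lexeme


def tokenize(text):
    """Repeatedly take the longest match until the input is consumed."""
    pos, tokens = 0, []
    while pos < len(text):
        name, lexeme = next_token(text, pos)
        tokens.append((name, lexeme))
        pos += len(lexeme)
    return tokens


for token in tokenize("8.E+1"):
    print(token)
# ('INT_LITERAL', '8')
# ('UNMATCHED', '.')
# ('IDENTIFIER', 'E')
# ('ADDOP', '+')
# ('INT_LITERAL', '1')

print(bool(re.fullmatch(FIXED_REAL, "8.E+1")))  # True
```

The identifier E directly after the integer 8 is what the parser then rejects.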
Aside from fixing the pattern for real literals, you should make sure that you do something sensible with unrecognised characters; flex's default action -- which basically just ignores characters that can't match any pattern -- is not of much use, particularly in debugging (as I think the above explanation demonstrates).
There are a number of other issues with your patterns involving the same misconception about the syntax of character classes as shown in the boolean literal pattern. This indicates to me that you did not attempt to test your lexical scanner before hooking it into your parser. That's an essential step in writing parsers; if your lexical scanner is not returning the tokens you expect it to return, you're going to have a lot of trouble trying to figure out what errors there might be in your grammar.
You might find the debugging techniques outlined in this answer useful. (That post also has links to the flex and bison manuals. Section 6 of the flex manual is a brief but complete guide to the syntax of flex patterns, and you might want to take a few minutes to read it.)
Notes
Please copy and paste the text of error messages into your questions rather than using an image showing a screenshot. Images are very hard to read on smartphones, for example, or for people who rely on screen-readers. And it's not possible to copy a part of a screenshot into an answer, which I would have preferred to have done here.
Your boolean pattern should be "true"|"false" not ["true""false"].
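To see why the bracketed form goes wrong, here is a small sketch using Python's re module, which treats this particular character class the same way flex does: the quotes inside [...] are ordinary characters, so the class matches exactly one character. Python has no "..." quoting in patterns, so the flex alternation "true"|"false" is written as true|false; the helper name matched_length is invented for this example.

```python
import re

CHAR_CLASS = r'["true""false"]'   # as written in scanner.l: a one-character set
ALTERNATION = r"true|false"       # Python spelling of flex's "true"|"false"
IDENT = r"[A-Za-z][A-Za-z0-9]*"   # the id definition from scanner.l


def matched_length(pattern, text):
    """Length of the match at the start of text, or 0 if there is none."""
    m = re.match(pattern, text)
    return len(m.group()) if m else 0


print(matched_length(CHAR_CLASS, "true"))   # 1  (only the 't')
print(matched_length(IDENT, "true"))        # 4  (longest match, so IDENTIFIER wins)
print(matched_length(ALTERNATION, "true"))  # 4  (ties with id; the earlier rule wins)
```

With the alternation, the boolean rule and the id rule both match four characters, and since {boolean} appears before {id} in scanner.l, flex would return BOOL_LITERAL.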
Honestly, the way your patterns are set up is just weird. Is there some reason not to use:
...
%%
"true" { /* */ return BOOL_LITERAL; }
"false" { /* */ return BOOL_LITERAL; }
Patterns make sense when you aren't trying to match literals but here you are.

Define multiple-word macro using -D flag with gcc

The purpose of this is to build a program with command line-injected macros, using a Makefile.
I would like to define macros using multiple terms, however I am given an error as subsequent parts of the string are treated as files by gcc.
An example of what I need is as follows:
#define ULL unsigned long long
#define T_ULL typedef unsigned long long ull_t
As a result, I am only able to create macros that contain 1 term per definition.
The latter attempt allows me to create parameterized macros, however those are also limited to 1 term per definition.
Attempted solution
#include <stdio.h>
#define _STRINGIZE(x) #x
#define STRINGIZE(x) _STRINGIZE(x)
int main(void)
{
# ifdef DEBUG
# ifdef STRING
printf("%s", "A STRING macro was defined.\n");
printf("string: %s\n", STRINGIZE(STRING));
# else
printf("%s\n", "A DEBUG macro was defined.");
# endif
# endif
}
Results
As described by the man page, under the -D option description.
$ gcc define.c -D='DEBUG' ; ./a.out
A DEBUG macro was defined.
As described by this answer, as an alternative approach.
$ gcc define.c -D'DEBUG' ; ./a.out
A DEBUG macro was defined.
$ gcc define.c -D'DEBUG' -D'STRING="abc"' ; ./a.out
A STRING macro was defined.
string: "abc"
$ gcc define.c -D'DEBUG' -D'STRING="abc efg"' ; ./a.out
clang: error: no such file or directory: 'efg"'
A STRING macro was defined.
string: "abc"
$ gcc define.c -D'DEBUG' -D'STRING="abc efg hij"' ; ./a.out
clang: error: no such file or directory: 'efg'
clang: error: no such file or directory: 'hij"'
A DEBUG macro was defined.
string: "abc"
You don't need the STRINGIZE macro. The correct command-line syntax is:
gcc -DDEBUG -DSTRING='"abc def"' program.c
In other words, you need to quote the whole value of the defined macro, including C string delimiters (").
Then you can just do:
printf("string: %s\n", STRING);
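If it helps to see what gcc actually receives, the quoting can be inspected with Python's shlex module, which approximates POSIX shell word splitting (it is not the shell itself, so treat this as a sketch). The function name argv_for is made up for illustration.

```python
import shlex


def argv_for(command_line):
    """Split a command line roughly the way a POSIX shell would."""
    return shlex.split(command_line)


# Single quotes around the whole value keep the C string delimiters.
print(argv_for("""gcc -DDEBUG -DSTRING='"abc def"' program.c"""))
# ['gcc', '-DDEBUG', '-DSTRING="abc def"', 'program.c']

# Double quotes alone are consumed by the shell: STRING expands to abc def,
# which is not a string literal.
print(argv_for('gcc -DDEBUG -DSTRING="abc def" program.c'))
# ['gcc', '-DDEBUG', '-DSTRING=abc def', 'program.c']

# No quoting at all: the second word becomes a separate argument,
# which the compiler then treats as a file name.
print(argv_for("gcc -DDEBUG -DSTRING=abc def program.c"))
# ['gcc', '-DDEBUG', '-DSTRING=abc', 'def', 'program.c']
```

The last case is the shape of the "no such file or directory" errors in the question: some layer (for example a Makefile variable expanded without quotes) split the value before gcc saw it.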

How to show 'preprocessed' code ignoring includes with GCC

I'd like to know if it's possible to output 'preprocessed' code with gcc but 'ignoring' (not expanding) includes:
For example, I have this main:
#include <stdio.h>
#define prn(s) printf("this is a macro for printing a string: %s\n", s);
int main(){
char str[5] = "test";
prn(str);
return 0;
}
I run gcc -E main.c -o out.c
I got:
/*
all stdio stuff
*/
int main(){
char str[5] = "test";
printf("this is a macro for printing a string: %s\n", str);
return 0;
}
I'd like to output only:
#include <stdio.h>
int main(){
char str[5] = "test";
printf("this is a macro for printing a string: %s\n", str);
return 0;
}
or, at least, just
int main(){
char str[5] = "test";
printf("this is a macro for printing a string: %s\n", str);
return 0;
}
PS: It would be great if it were possible to expand "local" "" includes while leaving "global" <> includes unexpanded.
I agree with Matteo Italia's comment that if you just prevent the #include directives from being expanded, then the resulting code won't represent what the compiler actually sees, and therefore it will be of limited use in troubleshooting.
Here's an idea to get around that. Add a variable declaration before and after your includes. Any variable that is reasonably unique will do.
int begin_includes_tag;
#include <stdio.h>
... other includes
int end_includes_tag;
Then you can do:
gcc -E main.c | sed '/begin_includes_tag/,/end_includes_tag/d' > out.c
The sed command will delete everything between those variable declarations.
When cpp expands includes it adds # directives (linemarkers) to trace back errors to the original files.
You can add a post processing step (it can be trivially written in any scripting language, or even in C if you feel like it) to parse just the linemarkers and filter out the lines coming from files outside of your project directory; even better, one of the flags (3) marks system header files (stuff coming from paths provided through -isystem, either implicitly by the compiler driver or explicitly), so that's something you could exploit as well.
For example in Python 3:
#!/usr/bin/env python3
import sys

skip = False
for l in sys.stdin:
    if not skip:
        sys.stdout.write(l)
    if l.startswith("# "):
        toks = l.strip().split(" ")
        linenum, filename = toks[1:3]
        flags = toks[3:]
        skip = "3" in flags
Using gcc -E foo.c | ./filter.py I get
# 1 "foo.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 31 "<command-line>"
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 1 "foo.c"
# 1 "/usr/include/stdio.h" 1 3 4
# 4 "foo.c"
int main(){
char str[5] = "test";
printf("this is a macro for printing a string: %s\n", str);;
return 0;
}
Protect the #includes from getting expanded, run the preprocessor textually, remove the # 1 "<stdin>" etc. junk the textual preprocessor generates, and re-expose the protected #includes.
This shell function does it:
expand_cpp(){
sed 's|^\([ \t]*#[ \t]*include\)|magic_fjdsa9f8j932j9\1|' "$@" \
| cpp | sed 's|^magic_fjdsa9f8j932j9||; /^# [0-9]/d'
}
as long as you keep the include word together instead of doing crazy stuff like
#i\
ncl\
u??/
de <iostream>
(above you can see 2 backslash continuation lines + 1 trigraph (??/ == \ ) backslash continuation line).
If you wish, you can protect #ifs #ifdefs #ifndefs #endifs and #elses the same way.
Applied to your example
example.c:
#include <stdio.h>
#define prn(s) printf("this is a macro for printing a string: %s\n", s);
int main(){
char str[5] = "test";
prn(str);
return 0;
}
Run as expand_cpp < example.c or expand_cpp example.c, it generates:
#include <stdio.h>
int main(){
char str[5] = "test";
printf("this is a macro for printing a string: %s\n", str);;
return 0;
}
You can use -dI to show the #include directives and post-process the preprocessor output.
Assuming the name of your file is foo.c
SOURCEFILE=foo.c
gcc -E -dI "$SOURCEFILE" | awk '
/^# [0-9]* "/ { if ($3 == "\"'"$SOURCEFILE"'\"") show=1; else show=0; }
{ if(show) print; }'
or to suppress all # line_number "file" lines for $SOURCEFILE:
SOURCEFILE=foo.c
gcc -E -dI "$SOURCEFILE" | awk '
/^# [0-9]* "/ { ignore = 1; if ($3 == "\"'"$SOURCEFILE"'\"") show=1; else show=0; }
{ if(ignore) ignore=0; else if(show) print; }'
Note: The AWK scripts do not work for file names that include whitespace. To handle file names with spaces you could modify the AWK script to compare $0 instead of $3.
Supposing the file is named c.c:
gcc -E c.c | tail -n +`gcc -E c.c | grep -n -e "#*\"c.c\"" | tail -1 | awk -F: '{print $1}'`
It seems that a # <number> "c.c" linemarker follows the expansion of each #include.
Of course you can also save the output of gcc -E c.c in a file to avoid running it twice.
The advantage is that you neither modify the source nor remove the #include lines before running gcc -E; it just removes all the lines from the top down to the last one produced by an #include, if I am right.
Many previous answers went in the direction of using the tracing # directives.
It's actually a one-liner in classical Unix (with awk):
gcc -E file.c | awk '/# [1-9][0-9]* "file.c"/ {skip=0; next} /# [1-9][0-9]* ".*"/ {skip=1} (skip<1) {print}'
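For anyone who prefers to read the logic rather than the awk, here is roughly the same filter as a Python function operating on text that has already been preprocessed. It is a sketch only: the name keep_lines_from and the sample text are invented, and the regular expression assumes the usual # <number> "<file>" linemarker layout.

```python
import re

# A linemarker looks like:  # 12 "some/file.h" 1 3 4
LINEMARKER = re.compile(r'^# (\d+) "([^"]*)"')


def keep_lines_from(preprocessed, source_name):
    """Keep only the lines that follow a linemarker naming source_name."""
    keep = True
    kept = []
    for line in preprocessed.splitlines():
        m = LINEMARKER.match(line)
        if m:
            # Switch output on or off depending on which file we are in,
            # and drop the linemarker itself.
            keep = (m.group(2) == source_name)
            continue
        if keep:
            kept.append(line)
    return "\n".join(kept)


sample = "\n".join([
    '# 1 "file.c"',
    '# 1 "/usr/include/stdio.h" 1 3 4',
    "extern int printf(const char *, ...);",
    '# 2 "file.c" 2',
    "int main(void) { return 0; }",
])
print(keep_lines_from(sample, "file.c"))
# int main(void) { return 0; }
```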
TL;DR
Assign the file name to fname and run the following commands in a shell. Throughout this answer, fname is assumed to be an sh variable containing the source file to be processed.
fname=file_to_process.c ;
grep -G '^#include' <./"$fname" ;
grep -Gv '^#include[ ]*<' <./"$fname" | gcc -x c - -E -o - $(grep -G '^#include[ ]*<' <./"$fname" | xargs -I {} -- expr "{}" : '#include[ ]*<[ ]*\(.*\)[ ]*>' | xargs -I {} printf '-imacros %s ' "{}" ) | grep -Ev '^([ ]*|#.*)$'
Everything here except gcc is pure POSIX sh: no bashisms or nonportable options. The first grep is there to output the #include directives.
GCC's -imacros
From gcc documentation:
-imacros file: Exactly like ‘-include’, except that any output produced by scanning file is
thrown away. Macros it defines remain defined. This allows you to acquire all
the macros from a header without also processing its declarations
So, what is -include anyway?
-include file: Process file as if #include "file" appeared as the first line of the primary
source file. However, the first directory searched for file is the preprocessor’s
working directory instead of the directory containing the main source file. If
not found there, it is searched for in the remainder of the #include "..."
search chain as normal.
Simply speaking, because you cannot use <> or "" in -include directive, it will always behave as if #include <file> were in source code.
First approach
ANSI C guarantees that assert is a macro, so it is perfect for a simple test:
printf 'int main(){\nassert(1);\nreturn 0;}\n' | gcc -x c -E - -imacros assert.h
The options -x c and - tell gcc that the language is C and that the source file is read from stdin. The output doesn't contain any declarations from assert.h, but there is still some mess, which can be cleaned up with grep:
printf 'int main(){\nassert(1);\nreturn 0;}\n' | gcc -x c -E - -imacros assert.h | grep -Ev '^([ ]*|#.*)$'
Note: in general, gcc won't expand tokens that are intended to be macros but whose definition is missing. Nevertheless, assert happens to expand entirely: __extension__ is a GCC keyword, __assert_fail is a function, and __PRETTY_FUNCTION__ is a predefined identifier holding the function name.
Automation
The previous approach works, but it can be tedious:
1. each #include needs to be deleted from the file manually, and
2. it has to be added to the gcc call as an -imacros argument.
The first part is easy to script: pipe grep -Gv '^#include[ ]*<' <./"$fname" to gcc.
The second part takes some exercise (at least without awk):
2.1 Drop the -v negative matching from the previous grep command: grep -G '^#include[ ]*<' <./"$fname"
2.2 Pipe that to expr inside xargs to extract the header name from each include directive: xargs -I {} -- expr "{}" : '#include[ ]*<[ ]*\(.*\)[ ]*>'
2.3 Pipe again to xargs, and printf with the -imacros prefix: xargs -I {} printf '-imacros %s ' "{}"
2.4 Enclose it all in a command substitution "$()" and place it inside the gcc call.
Done. This is how you end up with the lengthy command from the beginning of my answer.
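The extraction steps above (grep, expr, xargs, printf) can also be sketched in Python, which may be easier to follow than the pipeline. This is only an illustration of the same idea, not a replacement for the shell version; the function name imacros_args is made up, and the pattern mirrors the one used with grep and expr.

```python
import re

# Same shape as the grep/expr pattern: '#include', optional spaces, <header>.
GLOBAL_INCLUDE = re.compile(r'^#include[ ]*<[ ]*(.*?)[ ]*>')


def imacros_args(source_text):
    """Build the '-imacros HEADER' arguments for every global #include."""
    args = []
    for line in source_text.splitlines():
        m = GLOBAL_INCLUDE.match(line)
        if m:
            args += ["-imacros", m.group(1)]
    return args


source = "\n".join([
    "#include <assert.h>",
    '#include "h1.h"',
    "#include <limits.h>",
    "int main(){ assert(1); }",
])
print(imacros_args(source))
# ['-imacros', 'assert.h', '-imacros', 'limits.h']
```

Local includes such as "h1.h" are deliberately left alone, so the preprocessor still expands them.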
Solving subtle problems
This solution still has flaws: if local header files themselves include global ones, those global headers will be expanded. One way to solve this problem is to use grep and sed to move all global includes out of the local headers and collect them at the top of each *.c file.
printf '' > std ;
for header in *.h ; do
grep -G '^#include[ ]*<' <./$header >> std ;
sed -i '/#include[ ]*</d' $header ;
done;
for source in *.c ; do
cat std > tmp;
cat $source >> tmp;
mv -f tmp $source ;
done
Now the processing script can be called on any *.c file in the current directory without worrying that anything from global includes would leak in. The final problem is duplication. Local headers that themselves include local headers might be duplicated, but this can only occur when headers aren't guarded, and in general every header should always be guarded.
Final version and example
To show these scripts in action, here is a small demo:
File h1.h:
#ifndef H1H
#define H1H
#include <stdio.h>
#include <limits.h>
#define H1 printf("H1:%i\n", h1_int)
int h1_int=INT_MAX;
#endif
File h2.h:
#ifndef H2H
#define H2H
#include <stdio.h>
#include "h1.h"
#define H2 printf("H2:%i\n", h2_int)
int h2_int;
#endif
File main.c:
#include <assert.h>
#include "h1.h"
#include "h2.h"
int main(){
assert(1);
H1;
H2;
}
Final version of the script preproc.sh:
fname="$1"
printf '' > std ;
for source in *.[ch] ; do
grep -G '^#include[ ]*<' <./$source >> std ;
sed -i '/#include[ ]*</d' $source ;
sort -u std > std2;
mv -f std2 std;
done;
for source in *.c ; do
cat std > tmp;
cat $source >> tmp;
mv -f tmp $source ;
done
grep -G '^#include[ ]*<' <./"$fname" ;
grep -Gv '^#include[ ]*<' <./"$fname" | gcc -x c - -E -o - $(grep -G '^#include[ ]*<' <./"$fname" | xargs -I {} -- expr "{}" : '#include[ ]*<[ ]*\(.*\)[ ]*>' | xargs -I {} printf '-imacros %s ' "{}" ) | grep -Ev '^([ ]*|#.*)$'
Output of the call ./preproc.sh main.c:
#include <assert.h>
#include <limits.h>
#include <stdio.h>
int h1_int=0x7fffffff;
int h2_int;
int main(){
((void) sizeof ((
1
) ? 1 : 0), __extension__ ({ if (
1
) ; else __assert_fail (
"1"
, "<stdin>", 4, __extension__ __PRETTY_FUNCTION__); }))
;
printf("H1:%i\n", h1_int);
printf("H2:%i\n", h2_int);
}
This should always compile. If you really want to print every #include "file", then delete < from the grep pattern '^#include[ ]*<' on the 16th line of preproc.sh, but be warned that the contents of the headers will then be duplicated, and the code might fail if headers contain initialisation of variables. This is purposefully the case in my example, to address the problem.
Summary
There are plenty of good answers here, so why yet another? Because this seems to be the only solution with the following properties:
Local includes are expanded
Global includes are discarded
Macros defined in either local or global includes are expanded
The approach is general enough to be usable not only with toy examples, but also in small and medium projects that reside in a single directory.

Using character literals as terminals in bison

I'm trying to understand flex/bison, but the documentation is a bit difficult for me, and I've probably grossly misunderstood something. Here's a test case: http://namakajiri.net/misc/bison_charlit_test/
File "a" contains the single character 'a'. "foo.y" has a trivial grammar like this:
%%
file: 'a' ;
The generated parser can't parse file "a"; it gives a syntax error.
The grammar "bar.y" is almost the same, only I changed the character literal for a named token:
%token TOK_A;
%%
file: TOK_A;
and then in bar.lex:
a { return TOK_A; }
This one works just fine.
What am I doing wrong in trying to use character literals directly as bison terminals, like in the docs?
I'd like my grammar to look like "statement: selector '{' property ':' value ';' '}'" and not "statement: selector LBRACE property COLON value SEMIC RBRACE"...
I'm running bison 2.5 and flex 2.5.35 in debian wheezy.
Rewrite
The problem is a runtime problem, not a compile time problem.
The trouble is that you have two radically different lexical analyzers.
The bar.lex analyzer recognizes an a in the input and returns it as a TOK_A and ignores everything else.
The foo.lex analyzer echoes every single character, but that's all.
foo.lex — as written
%{
#include "foo.tab.h"
%}
%%
foo.lex — equivalent
%{
#include "foo.tab.h"
%}
%%
. { ECHO; }
foo.lex — required
%{
#include "foo.tab.h"
%}
%%
. { return *yytext; }
Working code
Here's some working code with diagnostic printing in place.
foo-lex.l
%%
. { printf("Flex: %d\n", *yytext); return *yytext; }
foo.y
%{
#include <stdio.h>
int yylex(void);
void yyerror(char *s);
%}
%%
file: 'a' { printf("Bison: got file!\n"); }
;
%%
int main(void)
{
yyparse();
}
void yyerror(char *s)
{
fprintf(stderr, "%s\n", s);
}
Compilation and execution
$ flex foo-lex.l
$ bison foo.y
$ gcc -o foo foo.tab.c lex.yy.c -lfl
$ echo a | ./foo
Flex: 97
Bison: got file!

$
Point of detail: how did that blank line get into the output? Answer: the lexical analyzer put it there. The pattern . does not match a newline, so the newline was treated as if there was a rule:
\n { ECHO; }
This is why the input was accepted. If you change the foo-lex.l file to:
%%
. { printf("Flex-1: %d\n", *yytext); return *yytext; }
\n { printf("Flex-2: %d\n", *yytext); return *yytext; }
and then recompile and run again, the output is:
$ echo a | ./foo
Flex-1: 97
Bison: got file!
Flex-2: 10
syntax error
$
with no blank lines. This is because the grammar doesn't allow a newline to appear in a valid 'file'.

What's wrong with this lex file?

I have a Makefile so that when I type make the following commands run:
yacc -d parser.y
gcc -c y.tab.c
flex calclexer.l
gcc -c lex.yy.c
But then after this I get the following error messages:
calclexer.l:10: error: parse error before '[' token
calclexer.l:10: error: stray '\' in program
calclexer.l:15: error: stray '\' in program
calclexer.l:24: error: stray '\' in program
make: *** [lex.yy.o] Error 1
This is what is inside calclexer. How can it be fixed?
%{
#include "y.tab.h"
#include "parser.h"
#include <math.h>
%}
%%
%%
([0-9]+|([0-9]*\.[0-9]+)([eE][-+]?[0-9]+)?) {
yylval.dval = atof(yytext);
return NUMBER;
}
[ \t] ; /* ignore white space */
[A-Za-z][A-Za-z0-9]* { /* return symbol pointer */
yylval.symp = symlook(yytext);
return NAME;
}
"$" { return 0; /* end of input */ }
\n |
. return yytext[0];
%%
You appear to have an extra "%%" in "calclexer.l", where you have:
%%
%%
Remove one of those (and the blank line).
The format of a lexer file is (taken from the flex manpage):
definitions
%%
rules
%%
user code
The user code gets copied verbatim to the output file. With the extra "%%", your rules are being interpreted as user code.
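To make the effect of the stray "%%" concrete, here is a toy model of how the three sections are separated. It is a deliberate simplification (real flex also handles %{ ... %} blocks, indentation and more), written in Python purely for illustration; split_sections and the sample inputs are invented names.

```python
def split_sections(lexer_source):
    """Split flex input into (definitions, rules, user code) at the first two %% lines."""
    parts = [[], [], []]
    index = 0
    for line in lexer_source.splitlines():
        if line.rstrip() == "%%" and index < 2:
            index += 1  # move on to the next section
        else:
            parts[index].append(line)  # after the second %%, everything is user code
    return tuple("\n".join(p) for p in parts)


BROKEN = """%{
#include "y.tab.h"
%}
%%
%%
[0-9]+ { return NUMBER; }
%%
"""

# Removing one of the two adjacent %% lines restores the intended layout.
FIXED = BROKEN.replace("%%\n%%\n", "%%\n", 1)

definitions, rules, user_code = split_sections(BROKEN)
print(repr(rules))      # ''
print(repr(user_code))  # '[0-9]+ { return NUMBER; }\n%%'

definitions, rules, user_code = split_sections(FIXED)
print(repr(rules))      # '[0-9]+ { return NUMBER; }'
print(repr(user_code))  # ''
```

In the broken layout the rule text lands in the user code section, which is copied straight into lex.yy.c, and that is why the errors come from gcc rather than from flex.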

Resources