JavaCC Parsing Limitation - parser-generator

I am trying to parse a text file with JavaCC. The file consists of multiple sentences separated by newlines. Each line may contain any sequence of "a" and "b" but must end with "a" followed by "b" before the newline. JavaCC does not parse this as intended: it consumes the terminating tokens a and b as part of the optional series.
This should be parsed successfully by JavaCC:
aa ab aab
aab
The jjt file is as follows:
options {
STATIC = false ;
FORCE_LA_CHECK = true;
LOOKAHEAD = 20000;
DEBUG_PARSER = true;
DEBUG_LOOKAHEAD = true;
OTHER_AMBIGUITY_CHECK = 3;
}
PARSER_BEGIN(Test)
class Test {
public static void main( String[] args )
throws ParseException {
Test act = new Test (System.in);
SimpleNode root = act.Start() ;
root.dump (" ");
//ystem.out.println("Total = "+val);
}
}PARSER_END(Test)
TOKEN_MGR_DECLS :
{
int stringSize;
}
SKIP : { < WS : " " > }
SKIP : {"\t" | "\r" | "\uFFFF" | "\u201a" | "\u00c4" | "\u00ee" | "\u00fa" | "\u00f9" | "\u00ec" | "\u2013" }
TOKEN [IGNORE_CASE] :
{
< A : "a" >
| < B : "b" >
| < NEWLINE : (("\n")+ ) >
}
SimpleNode Start() throws NumberFormatException :
{
int i ;
int value=0 ;
} {
chapter()
{
return jjtThis; }
}
void chapter() :
{ } {
(LOOKAHEAD (part_sentence()) part_sentence())+ (newline())? <EOF>
}
void part_sentence() :
{ } {
<NEWLINE> ( a() | b())+ a() b()
}
void a() :
{ } {
<A>
}
void b() :
{ } {
<B>
}
void newline() throws NumberFormatException :
{ }{
<NEWLINE>
{ System.out.print ("N# "); }
}
To clarify, the non-terminals a() and b() cannot be replaced with tokens; they are reduced to "a" and "b" here only for simplicity. Also, "NEWLINE" cannot be moved to the end of the non-terminal "part_sentence" because of other constraints.
I have been stuck on this problem for the past 4 days. My last hope was semantic lookahead - LOOKAHEAD ({!( getToken(1).kind==a() && getToken(2).kind==b() && getToken(3).kind==newline()}) but I cannot get a handle on non-terminals! Any help would be deeply appreciated.

[Note: you say any sequence of a's and b's that ends with "ab", but your code uses a + rather than a *. I'm going to assume you really did mean any sequence that ends with "ab", including the sequence "ab" itself. End Note.]
You need to exit the loop on the basis of lookahead. What you want to do is this:
( LOOKAHEAD( x )
(a() | b() )
)*
a() b() <NEWLINE>
where x says that the next items of input do not match a() b() <NEWLINE>. Unfortunately, there is no way to say "do not match" using syntactic lookahead. The trick is to replace the loop with a recursion.
void oneLine() : {} {
LOOKAHEAD( a() b() <NEWLINE> )
a() b() <NEWLINE>
|
a() oneLine()
|
b() oneLine()
}
You say that you want the <NEWLINE> at the start of the production. For reasons explained in the FAQ, I don't like using syntactic lookahead that extends beyond the choice at hand, but the following could be done.
void oneLine() : {} { <NEWLINE> oneLinePrime() }
void oneLinePrime() : {} {
LOOKAHEAD( a() b() <NEWLINE> )
a() b()
|
a() oneLinePrime()
|
b() oneLinePrime()
}
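To make the control flow of the recursive version easier to follow, here is a rough hand-written sketch in C rather than JavaCC. It is only an illustration under simplifying assumptions: each token is a single character, whitespace skipping is left out, and the function name one_line is made up for this example. It mirrors the first oneLine() production above (the one that ends with <NEWLINE>): first look ahead for a b NEWLINE, and only if that fails consume one a or b and recurse.

```c
#include <stddef.h>

/* Hypothetical recognizer mirroring the recursive oneLine() production.
   Returns the number of characters consumed by one complete line,
   or 0 if the input does not form a valid line. */
static size_t one_line(const char *s)
{
    /* Lookahead: do the next three tokens match a b NEWLINE? */
    if (s[0] == 'a' && s[1] == 'b' && s[2] == '\n')
        return 3;

    /* Otherwise consume a single a or b and recurse on the rest. */
    if (s[0] == 'a' || s[0] == 'b') {
        size_t rest = one_line(s + 1);
        return rest ? rest + 1 : 0;
    }

    /* Anything else (including end of input) is a parse failure. */
    return 0;
}
```

Because the lookahead is tried before the single-token alternatives, the final "ab" is never swallowed by the repetition, which is the behaviour the loop version could not express.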

Related

Why doesn't this grammar parse the return statement?

I am trying to write a grammar that can parse the following 3 inputs
-- testfile --
class hi implements ho:
var x:int;
end;
-- testfile2 --
interface xs:
myFunc(int,int):int
end;
-- testfile3 --
class hi implements ho:
method myMethod(x:int)
return y;
end
end;
This is lexer.l:
%{
#include <stdio.h>
#include <stdlib.h>
#include "parser.tab.h"
#include <string.h>
int line_number = 0;
void lexerror(char *message);
%}
newline (\n|\r\n)
whitespace [\t \n\r]*
digit [0-9]
alphaChar [a-zA-Z]
alphaNumChar ({digit}|{alphaChar})
hexDigit ({digit}|[A-Fa-f])
decNum {digit}+
hexNum {digit}{hexDigit}*H
identifier {alphaChar}{alphaNumChar}*
number ({hexNum}|{decNum})
comment "/*"[.\r\n]*"*/"
anything .
%s InComment
%option noyywrap
%%
<INITIAL>{
interface return INTERFACE;
end return END;
class return CLASS;
implements return IMPLEMENTS;
var return VAR;
method return METHOD;
int return INT;
return return RETURN;
if return IF;
then return THEN;
else return ELSE;
while return WHILE;
do return DO;
not return NOT;
and return AND;
new return NEW;
this return THIS;
null return _NULL;
":" return COL;
";" return SCOL;
"(" return BRACL;
")" return BRACR;
"." return DOT;
"," return COMMA;
"=" return ASSIGNMENT;
"+" return PLUS;
"-" return MINUS;
"*" return ASTERISK;
"<" return LT;
{decNum} {
yylval = atoi(yytext);
return DEC;
}
{hexNum} {
const int len = strlen(yytext)-1;
char* substr = (char*) malloc(sizeof(char) * len);
strncpy(substr,yytext,len);
yylval = (int)strtol
( substr
, NULL
, 16);
free (substr);
return HEX;
}
{identifier} {
yylval= (char *) malloc(sizeof(char)*strlen(yytext));
strcpy(yylval, yytext);
return ID;
}
{whitespace} {}
"/*" BEGIN InComment;
}
{newline} line_number++;
<InComment>{
"*/" BEGIN INITIAL;
{anything} {}
}
. lexerror("Illegal input");
%%
void lexerror(char *message)
{
fprintf(stderr,"Error: \"%s\" in line %d. = %s\n",
message,line_number,yytext);
exit(1);
}
This is parser.y:
%{
# include <stdio.h>
int yylex(void);
void yyerror(char *);
extern int line_number;
%}
%start Program
%token INTERFACE END CLASS IMPLEMENTS VAR METHOD INT RETURN IF THEN ELSE
%token WHILE DO NOT AND NEW THIS _NULL EOC SCOL COL BRACL BRACR DOT COMMA
%token ASSIGNMENT PLUS ASTERISK MINUS LT EQ DEC HEX ID NEWLINE
%%
Program: INTERFACE Interface SCOL { printf("interface\n"); }
| CLASS Class SCOL { printf("class\n");}
| error { printf("error on: %s\n", $$); }
;
Interface: ID COL
AbstractMethod
END
;
AbstractMethod: ID BRACL Types BRACR COL Type
;
Types : Type COMMA Types
| Type
;
Class: ID
IMPLEMENTS ID COL
Member SCOL
END
;
Member: VAR ID COL Type
| METHOD ID BRACL Pars BRACR Stats END
;
Type: INT
| ID
;
Pars: Par COMMA Pars
| Par
;
Par: ID COL Type
;
Stats: Stat SCOL Stat
| Stat
;
Stat: RETURN Expr
| IF Expr THEN Stats MaybeElse END
| WHILE Expr DO Stats END
| VAR ID COL Type COL ASSIGNMENT Expr
| ID COL ASSIGNMENT Expr
| Expr
;
MaybeElse :
| ELSE Stats
;
Expr: NOT Term
| NEW ID
| Term PLUS Term
| Term ASTERISK Term
| Term AND Term
| Term ArithOp Term
| Term
;
ArithOp: MINUS
| LT
| ASSIGNMENT
;
Term: BRACL Expr BRACR
| Num
| THIS
| ID
| Term DOT ID BRACL Exprs BRACR
| error { printf("error in term: %s\n", $$); }
;
Num : HEX
| INT
;
Exprs : Expr COMMA Exprs
| Expr
;
%%
void yyerror(char *s) {
fprintf(stderr, "Parse Error on line %i: %s\n", line_number, s);
}
int main(void){
yyparse();
}
The first two inputs are recognized as expected.
However, the third one fails with the error "error on: y" and I have no idea why.
As I see it, this should be a Class with a Member METHOD that contains a Stat(ement) RETURN with an Expr Term being an ID.
I tried commenting out and removing all the unnecessary bits, but the result is still the same.
I also took a look at the parser to verify that my identifiers parse correctly, but as I see it they should.
Why is the y in return y not recognized here?
Is there some conflict in the grammar I am unaware of?
(Please note that I am not expecting you to fix the complete grammar; I am merely asking for the reason this is not working. I am sure there are other errors in there, but I am really stuck fixing this one.)
Here is also my makefile:
CC = gcc
LEX = flex
YAC = bison
scanner: parser.y lexer.l
$(YAC) -d -Wcounterexamples parser.y
$(LEX) lexer.l
$(CC) parser.tab.c parser.tab.h lex.yy.c -o parser
clean:
rm -f *.tab.h *.tab.c *.gch *.yy.c
rm ./parser
testing:
cat testfile3 | ./parser
First, there is an error in your grammar:
Stats: Stat SCOL Stat
| Stat
;
must be
Stats: Stat SCOL Stats
| Stat
;
(an 's' added at the end of the first line)
Second, the definition in testfile3 does not follow your grammar; it must be
class hi implements ho:
method myMethod(x:int)
return y
end;
end;
so the ';' after return y must be moved to after the first end
(and return x would seem more logical, but that is another subject: you do not check the validity of the ID).
Apart from that, a class can have only one member, which is very restrictive.

Why isn't my bison printing the variable names?

So I'm using a flex/bison parser, but the variable names aren't printing correctly. It understands the number values. I've tried messing with everything, but I'm lost. Here's a link to the output; where it prints "Data: 0" is where I'm trying to get the variable name: [https://imgur.com/vJDpgpR][1]
The invocation is: ./frontEnd data.txt
//main.c
#define BUF_SIZE 1024
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
extern FILE* yyin;
extern yyparse();
int main(int argc, char* argv[]){
if(argc < 2){
FILE* fp = fopen("temp.txt", "a");
printf("Entering data: \n");
void *content = malloc(BUF_SIZE);
if (fp == 0)
printf("error opening file");
int read;
while ((read = fread(content, BUF_SIZE, 1, stdin))){
fwrite(content, read, 1, fp);
}
if (ferror(stdin))
printf("There was an error reading from stdin");
fclose(fp);
yyparse(fp);
}
if(argc == 2){
yyin = fopen(argv[2], "r");
if(!yyin)
{
perror(argv[2]);
printf("ERROR: file does not exist.\n");
return 0;
}
yyparse (yyin);
}
return 0;
}
void yyerror(char *s){
fprintf(stderr, "error: exiting %s \n", s);
}
//lex.l
%{
#include <stdio.h>
#include <stdlib.h>
#include "parser.tab.h"
extern SYMTABNODEPTR symtable[SYMBOLTABLESIZE];
extern int curSymSize;
%}
%option noyywrap
%option nounput yylineno
%%
"stop" return STOP;
"iter" return ITER;
"scanf" return SCANF;
"printf" return PRINTF;
"main" return MAIN;
"if" return IF;
"then" return THEN;
"let" return LET;
"func" return FUNC;
"//" return COMMENT; printf("\n");
"start" return START;
"=" return ASSIGN;
"=<" return LE;
"=>" return GE;
":" return COLON;
"+" return PLUS;
"-" return MINUS;
"*" return MULT;
"/" return DIV;
"%" return MOD;
"." return DOT;
"(" return RPAREN;
")" return LPAREN;
"," return COMMA;
"{" return RBRACE;
"}" return LBRACE;
";" return SEMICOLON;
"[" return LBRACK;
"]" return RBRACK;
"==" return EQUAL;
[A-Z][a-z]* { printf("SYNTAX ERROR: Identifiers must start with lower case. "); }
[a-zA-Z][_a-zA-Z0-9]* {
printf("string: %s \n", yytext);
yylval.iVal = strdup(yytext);
yylval.iVal = addSymbol(yytext);
return ID;
}
[0-9]+ {
yylval.iVal = atoi(yytext);
printf("num: %s \n", yytext);
return NUMBER; }
[ _\t\r\s\n] ;
^"#".+$ return COMMENT;
. {printf("ERROR: Invalid Character "); yyterminate();}
<<EOF>> { printf("EOF: line %d\n", yylineno); yyterminate(); }
%%
// stores all variable id is in an array
SYMTABNODEPTR newSymTabNode()
{
return ((SYMTABNODEPTR)malloc(sizeof(SYMTABNODE)));
}
int addSymbol(char *s)
{
extern SYMTABNODEPTR symtable[SYMBOLTABLESIZE];
extern int curSymSize;
int i;
i = lookup(s);
if(i >= 0){
return(i);
}
else if(curSymSize >= SYMBOLTABLESIZE)
{
return (NOTHING);
}
else{
symtable[curSymSize] = newSymTabNode();
strncpy(symtable[curSymSize]->id,s,IDLENGTH);
symtable[curSymSize]->id[IDLENGTH-1] = '\0';
return(curSymSize++);
}
}
int lookup(char *s)
{
extern SYMTABNODEPTR symtable[SYMBOLTABLESIZE];
extern int curSymSize;
int i;
for(i=0;i<curSymSize;i++)
{
if(strncmp(s,symtable[i]->id,IDLENGTH) == 0){
return (i);
}
}
return(-1);
}
// parser.y
%{
#define YYERROR_VERBOSE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
extern char *yytext;
extern int yylex();
extern void yyerror(char *);
extern int yyparse();
extern FILE *yyin;
/* ------------- some constants --------------------------------------------- */
#define SYMBOLTABLESIZE 50
#define IDLENGTH 15
#define NOTHING -1
#define INDENTOFFSET 2
#ifdef DEBUG
char *NodeName[] =
{
"PROGRAM", "BLOCK", "VARS", "EXPR", "N", "A", "R", "STATS", "MSTAT", "STAT",
"IN", "OUT", "IF_STAT", "LOOP", "ASSIGN", "RO", "IDVAL", "NUMVAL"
};
#endif
enum ParseTreeNodeType
{
PROGRAM, BLOCK, VARS, EXPR, N, A, R, STATS, MSTAT, STAT,
IN, OUT,IF_STAT, LOOP, ASSIGN, RO, IDVAL, NUMVAL
};
#define TYPE_CHARACTER "char"
#define TYPE_INTEGER "int"
#define TYPE_REAL "double"
#ifndef TRUE
#define TRUE 1
#endif
#ifndef FALSE
#define FALSE 0
#endif
#ifndef NULL
#define NULL 0
#endif
// definitions for parse tree
struct treeNode {
int item;
int nodeID;
struct treeNode *first;
struct treeNode *second;
};
typedef struct treeNode TREE_NODE;
typedef TREE_NODE *TREE;
TREE makeNode(int, int, TREE, TREE);
#ifdef DEBUG
void printTree(TREE, int);
#endif
// symbol table definitions.
struct symbolTableNode{
char id[IDLENGTH];
};
typedef struct symbolTableNode SYMTABNODE;
typedef SYMTABNODE *SYMTABNODEPTR;
SYMTABNODEPTR symtable[SYMBOLTABLESIZE];
int curSymSize = 0;
%}
%start program
%union {
char *sVal;
int iVal;
TREE tVal;
}
// list of all tokens
%token SEMICOLON GE LE EQUAL COLON RBRACK LBRACK ASSIGNS LPAREN RPAREN COMMENT
%token DOT MOD PLUS MINUS DIV MULT RBRACE LBRACE START MAIN STOP LET COMMA
%token SCANF PRINTF IF ITER THEN FUNC
%left MULT DIV MOD ADD SUB
// tokens defined with values and rule names
%token<iVal> NUMBER ID
//%token<sVal> ID
%type<tVal> program type block vars expr N A R stats mStat stat in out if_stat loop assign RO
%%
program : START vars MAIN block STOP
{
TREE tree;
tree = makeNode(NOTHING, PROGRAM, $2,$4);
#ifdef DEBUG
printTree(tree, 0);
#endif
}
;
block : RBRACE vars stats LBRACE
{
$$ = makeNode(NOTHING, BLOCK, $2, $3);
}
;
vars : /*empty*/
{
$$ = makeNode(NOTHING, VARS,NULL,NULL);
}
| LET ID COLON NUMBER vars
{
$$ = makeNode($2, VARS, $5,NULL);
printf("id: %d", $2);
}
;
//variable:
// type ID{$$ = newNode($2,VARIABLE,$1,NULL,NULL);};
//type:
// INT {$$ = newNode(INT,TYPE,NULL,NULL,NULL);}
// | BOOL {$$ = newNode(BOOL,TYPE,NULL,NULL,NULL);}
// | CHAR {$$ = newNode(CHAR,TYPE,NULL,NULL,NULL);}
// | STRING{$$ = newNode(STRING,TYPE,NULL,NULL,NULL);};
expr : N DIV expr
{
$$ = makeNode(DIV, EXPR, $1, $3);
}
| N MULT expr
{
$$ = makeNode(MULT, EXPR, $1, $3);
}
| N
{
$$ = makeNode(NOTHING, EXPR, $1,NULL);
}
;
N : A PLUS N
{
$$ = makeNode(PLUS, N, $1, $3);
}
| A MINUS N
{
$$ = makeNode(MINUS, N, $1, $3);
}
| A
{
$$ = makeNode(NOTHING, N, $1,NULL);
}
;
A : MOD A
{
$$ = makeNode(NOTHING, A, $2,NULL);
}
| R
{
$$ = makeNode(NOTHING, A, $1,NULL);
}
;
R : LBRACK expr RBRACK
{
$$ = makeNode(NOTHING, R, $2,NULL);
}
| ID
{
$$ = makeNode($1, IDVAL, NULL,NULL);
}
| NUMBER
{
$$ = makeNode($1, NUMVAL, NULL,NULL);
}
;
stats : stat mStat
{
$$ = makeNode(NOTHING, STATS, $1, $2);
}
;
mStat : /* empty */
{
$$ = makeNode(NOTHING, MSTAT, NULL,NULL);
}
| stat mStat
{
$$ = makeNode(NOTHING, MSTAT, $1, $2);
}
;
stat: in DOT
{
$$ = makeNode(NOTHING, STAT, $1,NULL);
}
| out DOT
{
$$ = makeNode(NOTHING, STAT, $1,NULL);
}
| block
{
$$ = makeNode(NOTHING, STAT, $1,NULL);
}
| if_stat DOT
{
$$ = makeNode(NOTHING, STAT, $1,NULL);
}
| loop DOT
{
$$ = makeNode(NOTHING, STAT, $1,NULL);
}
| assign DOT
{
$$ = makeNode(NOTHING, STAT, $1,NULL);
}
;
in : SCANF LBRACK ID RBRACK
{
$$ = makeNode($3, IN,NULL,NULL);
}
;
out : PRINTF LBRACK expr RBRACK
{
$$ = makeNode(NOTHING, OUT,$3,NULL);
}
;
if_stat : IF LBRACK expr RO expr RBRACK THEN block
{
$$ = makeNode(NOTHING, IF_STAT, $4, $8);
}
;
loop : ITER LBRACK expr RO expr RBRACK block
{
$$ = makeNode(NOTHING, LOOP, $4, $7);
}
;
assign : ID ASSIGNS expr
{
$$ = makeNode($1, ASSIGN, $3,NULL);
}
;
RO : LE
{
$$ = makeNode(LE, RO, NULL,NULL);
}
| GE
{
$$ = makeNode(GE, RO, NULL,NULL);
}
| EQUAL
{
$$ = makeNode(EQUAL, RO, NULL,NULL);
}
| COLON COLON
{
$$ = makeNode(EQUAL, RO, NULL,NULL);
}
;
%%
// node generator
TREE makeNode(int iVal, int nodeID, TREE p1, TREE p2)
{
TREE t;
t = (TREE)malloc(sizeof(TREE_NODE));
t->item = iVal;
t->nodeID = nodeID;
t->first = p1;
t->second = p2;
//printf("NODE CREATED");
return(t);
}
// prints the tree with indentation for depth
void printTree(TREE tree, int depth){
int i;
if(tree == NULL) return;
for(i=depth;i;i--)
printf(" ");
if(tree->nodeID == NUMBER)
printf("INT: %d ",tree->item);
else if(tree->nodeID == IDVAL){
if(tree->item > 0 && tree->item < SYMBOLTABLESIZE )
printf("id: %s ",symtable[tree->item]->id);
else
printf("unknown id: %d ", tree->item);
}
if(tree->item != NOTHING){
printf("Data: %d ",tree->item);
}
// If out of range of the table
if (tree->nodeID < 0 || tree->nodeID > sizeof(NodeName))
printf("Unknown ID: %d\n",tree->nodeID);
else
printf("%s\n",NodeName[tree->nodeID]);
printTree(tree->first,depth+2);
printTree(tree->second,depth+2);
}
#include "lex.yy.c"
// heres the makefile I use for compilation
frontEnd: lex.yy.c parser.tab.c
gcc parser.tab.c main.c -o frontEnd -lfl -DDEBUG
parser.tab.c parser.tab.h: parser.y
bison -d parser.y
lex.yy.c: lex.l
flex lex.l
clean:
rm lex.yy.c y.tab.c frontEnd
// data.txt
start
let x : 13
main {
scanf [ x ] .
printf [ 34 ] .
} stop
[1]: https://i.stack.imgur.com/xlNnh.png
I think this has a lot more to do with your AST and symbol table functions than with your parser, and practically nothing to do with bison itself.
For example, your function to print trees won't attempt to print an identifier's name if its symbol table index is 0.
if(tree->item > 0 && tree->item < SYMBOLTABLESIZE)
But the first symbol entered in the table will have index 0. (Perhaps you fixed this between pasting your code and generating the results. You should always check that the code you paste in a question corresponds precisely to the output which you show. But this isn't the only bug in your code; it's just an example.)
As another example, the immediate problem which causes Data: 0 to be printed instead of the symbol name is that your tree printer only prints symbol names for AST nodes of type IDVAL, but you create an AST IN node whose data field contains the variable's symbol table index. So either you need to fix your tree printer so it knows about IN nodes, or you need to change the IN node so that it has a child which is the IDVAL node. (That's probably the best solution in the long run.)
It's always a temptation to blame bison (or whatever unfamiliar tool you're using at the moment) for bugs, instead of considering the possibility that you've introduced bugs in your own support code. To avoid falling into this trap, it's always a good idea to test your library functions separately before using them in a more complicated project. For example, you could write a small test driver that builds a fixed AST, prints it, and deletes it. Once that works, and only when that works, you can check to see if your parser can build and print the same tree by parsing an input.
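As a rough idea of what such a test driver could look like, here is a minimal sketch in C. It is not your code: the node-type list is cut down to four values, the fixed symbols array stands in for your symtable, and describeNode and freeTree are hypothetical helpers invented for this example. It describes a node the way a fixed tree printer might, treating both IDVAL and IN nodes as carrying a symbol table index and accepting index 0.

```c
#include <stdio.h>
#include <stdlib.h>

#define NOTHING -1

/* Reduced node-type list; only the values needed for this sketch. */
enum NodeType { STAT, IN, IDVAL, NUMVAL };

typedef struct treeNode {
    int item;                 /* symbol index, number, or NOTHING */
    int nodeID;               /* one of enum NodeType */
    struct treeNode *first;
    struct treeNode *second;
} TREE_NODE, *TREE;

/* Hypothetical fixed symbol table standing in for symtable[]. */
static const char *symbols[] = { "x", "y" };
#define SYMBOL_COUNT ((int)(sizeof symbols / sizeof symbols[0]))

static TREE makeNode(int item, int nodeID, TREE first, TREE second)
{
    TREE t = malloc(sizeof *t);
    if (t == NULL) {
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    t->item = item;
    t->nodeID = nodeID;
    t->first = first;
    t->second = second;
    return t;
}

/* Write a one-line description of a single node into buf.
   IDVAL and IN nodes both hold a symbol index, and index 0 is valid. */
static void describeNode(const TREE_NODE *t, char *buf, size_t size)
{
    int holdsSymbol = (t->nodeID == IDVAL || t->nodeID == IN);
    if (holdsSymbol && t->item >= 0 && t->item < SYMBOL_COUNT)
        snprintf(buf, size, "id: %s", symbols[t->item]);
    else if (t->nodeID == NUMVAL)
        snprintf(buf, size, "INT: %d", t->item);
    else
        snprintf(buf, size, "node %d", t->nodeID);
}

/* Release a whole tree, children first. */
static void freeTree(TREE t)
{
    if (t == NULL)
        return;
    freeTree(t->first);
    freeTree(t->second);
    free(t);
}
```

A driver would build something like makeNode(NOTHING, STAT, makeNode(0, IN, NULL, NULL), NULL), call describeNode on each node, compare the text with what you expect, and finish with freeTree. None of that needs flex or bison to run.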
You will find that some simple good software design practices will make this whole process much smoother:
Organise your code into separate component files, each with its own header file. Document the library interfaces (and, if necessary, data structures) using comments in the header file. Briefly describe what each function's purpose is. If you can't find a brief description, it may be that the function is trying to do too many different things.
In your parser, the functions and declarations needed to build and use ASTs are scattered between different parts of your lexer and parser files. This makes them much harder to read, debug, maintain and even use.
No matter what your teacher might tell you, if you find it necessary to #include the generated lexical scanner directly into the parser, then you probably have not found a good way to organise your support functions. You should always aim to make it possible to separately compile the parser and the scanner.
For data structures like your AST node, which use different member variables in different ways depending on an enumerated node type -- which is a model you'll find in other C projects as well, but is particularly common in parsers -- document the precise use of each field for every enumeration value. And make sure that every time you change the way you use the data or add new enumeration values, you fix the documentation accordingly.
This documentation will make it much easier to verify that your AST is being built correctly. As an additional benefit, you (or others using your code) will have an accurate description of how to interpret the contents of AST nodes, which makes it much easier to write code which analyses the tree.
In short, the way to write, debug and maintain any non-trivial project is not by "messing around" but by being systematic and modular. While it might seem like all of this takes precious time, particularly the documentation, it will almost always save you a lot of time in the long run.

How to call an array from a static method?

In this code, how do I make an array available globally for other methods to use?
Background on my code: we are asked to scan a file that contains DNA strands and then translate it to an RNA strand.
I receive the error "cannot find symbol - variable dna" when I use the dna array in the translation method (it can't find dna.length): for(int i=0; i < dna.length; i++){
public class FileScannerExample
{
public static void main(String[] args) throws IOException
{
//This is how to create a scanner to read a file
Scanner inFile = new Scanner(new File("dnaFile.txt"));
String dnaSequence = inFile.next();
int dnalength = dnaSequence.length();
String[] dna = new String[dnalength];
for(int i=0; i<=dna.length-2 ; i++)
{
dna[i]=dnaSequence.substring(i,i+1); //looking ahead and taking each character and placing it in the array
}
dna[dna.length-1]=dnaSequence.substring(dna.length-1); //reading the last spot in order to put it in the array
//Testing that the array is identical to the string
System.out.println(dnaSequence);
for(int i = 0 ; i<=dna.length-1; i++)
{
System.out.print(dna[i]);
}
}
public void translation()
{
for(int i=0; i < dna.length; i++){
//store temporary
if (dna[i] = "A"){
dna[i] = "U";
}
if(dna[i] = "T"){
dna[i] = "A";
}
if(dna[i] = "G"){
dna[i]= "C";
}
if(dna[i] = "C"){
dna[i] = "G";
}
}
}
}
You need to bring the symbol into scope before you can reference it. You can do this either by pulling it up into a higher scope (as a field in the class) or by sending it into the local scope by passing it as a method parameter.
As a class member:
public class Test
{
private String myField;
void A() {
myField = "I can see you";
}
void B() {
myField = "I can see you too";
}
}
As a method parameter:
public class Test
{
void A() {
String myVar = "I can see you";
System.out.println(myVar);
B(myVar);
}
void B(String param) {
param += " too";
System.out.println(param);
}
}
Note that in order to see an instance member, you must be referencing it from a non-static context. You can get around this by declaring the field as static too, but be careful with static state in a class: it generally makes the code messier and harder to work with.

What happens inside of this condition statement? while (a = foo(bar))

It may sound silly, but I want to know what happens when I execute while(a = function(b)){}.
Suppose we got NULL as the return value of read_command_stream.
Would I get out of the loop?
while ((command = read_command_stream (command_stream)))
{
if (print_tree)
{
printf("# %d\n", command_number++);
print_command (command);
}
else
{
last_command = command;
execute_command(command, time_travel);
}
}
struct command
struct command
{
enum command_type type;
// Exit status, or -1 if not known (e.g., because it has not exited yet).
int status;
// I/O redirections, or null if none.
char *input;
char *output;
union
{
// for AND_COMMAND, SEQUENCE_COMMAND, OR_COMMAND, PIPE_COMMAND:
struct command *command[2];
// for SIMPLE_COMMAND:
char **word;
// for SUBSHELL_COMMAND:
struct command *subshell_command;
} u;
};
The syntax says:
while (expression) { ... }
and expression can be many things.
It can be:
a constant: while (1) { ... }
the result of a comparison: while (a < b) { ... }
some boolean construct: while (a < b && c < d) { ... }
the resulting expression from an assignment: while (*dst++ = *src++) {;}
and the assignment can also involve function calls: while ((ch = getchar()) != EOF) { ... }
a plain variable: while (i) { ... }
an expression based on the evaluation of a plain variable : while (i--) { ... } (even with side effects!)
a pointer expression: while (*++argv) { ... }
Now, in the case of an integer expression, the expression is checked for being not equal to zero. For pointer expressions, it is checked for being not equal to NULL. That's all.
The crux of this is that in C, even an assignment is an expression, so you can write:
a = b = c;
or even:
a = b = c = d;
But, since an assignment is also an expression, you could even write:
while ( a = b = c = d) { ... }
The = operator evaluates to whatever value it assigns to the variable, so if you do something like
var = 0
the expression evaluates to 0, and as a while condition it would end the loop.
Also remember that a null pointer always compares equal to 0 in a condition (even on platforms where its bit pattern is not all zeros), so a function returning NULL ends the loop in the same way. Generally it's a bad idea to use a bare = as a condition, and good compilers will warn you about it.
A null pointer tests as false in a condition regardless of how your system/compiler represents it. Therefore, the loop terminates.
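To see the idiom from the question in isolation, here is a small sketch in C. The names next_item and total_length are made up for this example, and a NULL-terminated array of strings stands in for the command stream; it is not the real read_command_stream. The loop assigns the returned pointer, tests it, and stops as soon as NULL comes back.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for read_command_stream(): returns each element
   of a NULL-terminated array in turn, then NULL once it is exhausted. */
static const char *next_item(const char **stream, size_t *pos)
{
    const char *item = stream[*pos];
    if (item != NULL)
        (*pos)++;
    return item;
}

/* Sums the lengths of all items using the assignment-as-condition idiom. */
static size_t total_length(const char **stream)
{
    size_t pos = 0;
    size_t total = 0;
    const char *item;

    /* The extra parentheses tell the compiler the assignment is intended.
       The loop body runs only while the assigned pointer is non-NULL. */
    while ((item = next_item(stream, &pos)))
        total += strlen(item);

    return total;
}
```

The doubled parentheses around the assignment are the usual way to silence the "assignment used as condition" warning, which is why the loop in the question is written that way too.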

Unintentional concatenation in Bison/Yacc grammar

I am experimenting with lex and yacc and have run into a strange issue, but I think it would be best to show you my code before detailing the issue. This is my lexer:
%{
#include <stdlib.h>
#include <string.h>
#include "y.tab.h"
void yyerror(char *);
%}
%%
[a-zA-Z]+ {
yylval.strV = yytext;
return ID;
}
[0-9]+ {
yylval.intV = atoi(yytext);
return INTEGER;
}
[\n] { return *yytext; }
[ \t] ;
. yyerror("invalid character");
%%
int yywrap(void) {
return 1;
}
This is my parser:
%{
#include <stdio.h>
int yydebug=1;
void prompt();
void yyerror(char *);
int yylex(void);
%}
%union {
int intV;
char *strV;
}
%token INTEGER ID
%%
program: program statement EOF { prompt(); }
| program EOF { prompt(); }
| { prompt(); }
;
args: /* empty */
| args ID { printf(":%s ", $<strV>2); }
;
statement: ID args { printf("%s", $<strV>1); }
| INTEGER { printf("%d", $<intV>1); }
;
EOF: '\n'
%%
void yyerror(char *s) {
fprintf(stderr, "%s\n", s);
}
void prompt() {
printf("> ");
}
int main(void) {
yyparse();
return 0;
}
A very simple language, consisting of no more than strings and integers and a basic REPL. Now, you'll note in the parser that args are output with a leading colon, the intention being that, when combined with the first pattern of the statement rule, the interaction with the REPL would look something like this:
> aaa aa a
:aa :a aaa>
However, the interaction is this:
> aaa aa a
:aa :a aaa aa aa
>
Why does the token ID in the following rule
statement: ID args { printf("%s", $<strV>1); }
| INTEGER { printf("%d", $<intV>1); }
;
have the semantic value of the total input string, newline included? How can my grammar be reworked so that the interaction is what I intended?
You have to preserve token strings as they are read if you want them to remain valid. I modified the statement rule to read:
statement: ID { printf("<%s> ", $<strV>1); } args { printf("%s", $<strV>1); }
| INTEGER { printf("%d", $<intV>1); }
;
Then, with your input, I get the output:
> aaa aa a
<aaa> :aa :a aaa aa a
>
Note that at the time the initial ID is read, the token is exactly what you expected. But, because you did not preserve the token (yylval.strV points straight into the scanner's yytext buffer, which changes as later tokens are read), the string has been modified by the time you get back to printing it after the args have been parsed.
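One way to preserve the token is to copy it in the lexer action before returning. The sketch below is only an illustration: copy_token is a hypothetical helper written for this example (POSIX strdup does the same job where it is available), and the caller is assumed to free the copy once the parser has finished with it.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a heap copy of the current token text.
   The caller owns the result and must free() it. Returns NULL if
   memory cannot be allocated. */
static char *copy_token(const char *text)
{
    size_t len = strlen(text) + 1;   /* include the terminating NUL */
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, text, len);
    return copy;
}
```

In the lexer this would replace the plain pointer assignment, for example yylval.strV = copy_token(yytext); in the [a-zA-Z]+ action. The parser actions that print $<strV>1 and $<strV>2 would then be the natural place to free the strings.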
I think there is an associativity conflict between the args and statement productions. This is borne out by the (partial) output from the bison -v parser.output file:
Nonterminals, with rules where they appear
$accept (6)
on left: 0
program (7)
on left: 1 2 3, on right: 0 1 2
statement (8)
on left: 4 5, on right: 1
args (9)
on left: 6 7, on right: 4 7
EOF (10)
on left: 8, on right: 1 2
Indeed, I'm having a hard time trying to figure out what your grammar is trying to accept. As a side note, I'd probably move your EOF production into the lexer as an EOL token; this will make resynchronizing on parse errors easier.
Better explanation of your intent would be helpful.
