Concatenation operator not allow in module instantiation - concatenation

MUX32_16x1 inst9(muxR, dontNeed, addSub, AddSub, mult, shift, shift, wireAnd, wireOr, wireNor, {31{0}, addSub[31]}, dontNeed, dontNeed, dontNeed, dontNeed, dontNeed, dontNeed, OPRN[3:0]);
Above is my instantiation of a 16x1 mux, I'm trying to set one parameter to 0 for the first 31 bits leaving only the last bit as an input by doing this
{31{0}, addSub[31]}
I'm not sure why the program is giving out this error
near ",": syntax error, unexpected ',', expecting '}'.
is curly braces operator not allow in module instantiation?
sorry, I'm very new to Verilog programming

When you're doing repeated concatenation, you need to enclose it in another set of braces, so {31{1'b0}} is the same as 31'd0.
Try:
{{31{1'b0}}, addSub[31]}
Or:
{31'd0, addSub[31]}

Related

Wrong rule used in bison

I am trying to perform a syntax analysis using bison, but it uses the wrong rule at one point and I didn't manage to find how to fix it.
I have a few rules but these ones seem to be the source of the problem :
method : vars statements;
vars : //empty
| vars var;
var : type IDENTIFIER ';';
type : IDENTIFIER;
statements : //empty
| statements statement;
statement : IDENTIFIER '=' e ';';
e : (...)
With IDENTIFIER being a simple regex matching [a-zA-Z]*
So basically, if I write that :
int myint;
myint = 12;
Since myint is an identifier, bison seems to still try to match it on the second line as a type and then matches the whole thing as a var and not as a statement. So I get this error (knowing that ASSIGN is '=') :
syntax error, unexpected ASSIGN, expecting IDENTIFIER
Edit : Note that bison is indicating that there are shift/reduce errors, so it may be linked (as said in the answers).
The problem you're having is coming from the default resolution of the shift-reduce conflict you have due to the empty statements rule -- it needs to know whether to reduce the empty statement and start matching statements, or shift the IDENTIFIER that might begin another var. So it decides to shift, which puts it down the var path.
You can avoid this problem by refactoring the grammar to avoid empty productions:
method: vars | vars statements | statements ;
vars: var | vars var ;
statements : statement | statements statement ;
... rest the same
which avoids needing to know whether something is var or a statement until after shifting far enough into it to tell.

Regular expression unexpected pattern matching

I am trying to create a syntax parser using C-Bison and Flex. In Flex I have a regular expression which matches integers based on the following:
Must start with any digit in range 1-9 and followed by any number of digits in range 0-9. (ex. Correct: 1,12,11024 | Incorrect: 012)
Can be signed (ex. +2,-5)
The number 0 must not be followed by any digit (0-9) and must not signed. (ex. Correct: 0 | Incorrect: 012,+0,-0)
Here is the regex I have created to perform the matching:
[^+-]0[^0-9]|[+-]?[1-9][0-9]*
Here is the expression I am testing:
(1 + 1 + 10)
The matches:
1
1
10)
And here is my question, why does it match '10)'?
The reason I used the above expression, instead of the much simpler one,
(0|[+-]?[1-9][0-9]*) is due to inability of the parser to recognise incorrect expressions such as 012.
The problem seems to occur only when before the ')' precedes the digit '0'. However if the '0' is preceded by two or more digits (ex. 100), then the ')' is not matched.
I know for a fact if I remove [^0-9] from the regex it doesn't match the ')'.
It matches 10( because 1 matches [^+-], 0 matches 0 and ( matches [^0-9].
The reason I used the above expression, instead of the much simpler one, (0|[+-]?[1-9][0-9]*) is due to inability of the parser to recognise incorrect expressions such as 012.
How so? Using the above regex, 012 would be recognized as two tokens: 0 and 12. Would this not cause an error in your parser?
Admittedly, this would not produce a very good error message, so a better approach might be to just use [0-9]+ as the regex and then use the action to check for a leading zero. That way 012 would be a single token and the lexer could produce an error or warning about the leading zero (I'm assuming here that you actually want to disallow leading zeros - not use them for octal literals).
Instead of a check in the action, you could also keep your regex and then add another one for integers with a leading zero (like 0[0-9]+ { warn("Leading zero"); return INT; }), but I'd go with the check in the action since it's an easy check and it keeps the regex short and simple.
PS: If you make - and + part of the integer token, something like 2+3 will be seen as the integer 2, followed by the integer +3, rather than the integers 2 and 3 with a + token in between. Therefore it is generally a better idea to not make the sign a part of the integer token and instead allow prefix + and - operators in the parser.

How to differentiate '-' operator from a negative number for a tokenizer

I am creating an infix expression parser, an so I have to create a tokenizer. It works well, except for one thing: I do not now how to differentiate negative number from the "-" operator.
For example, if I have:
23 / -23
The tokens should be 23, / and -23, but if I have an expression like
23-22
Then the tokens should be 23, - and 22.
I found a dirty workaround which is if I encounter a "-" followed by a number, I look at the previous character and if this character is a digit or a ')', I treat the "-" as an operator and not a number.
Apart from being kind of ugly, it doesn't work for expressions like
--56
where it gets the following tokens: - and -56 where it should get --56
Any suggestion?
In the first example the tokens should be 23, /, - and 23.
The solution then is to evaluate the tokens according to the rules of associativity and precedence. - cannot bind to / but it can to 23, for example.
If you encounter --56, is split into -,-,56 and the rules take care of the problem. There is no need for special cases.

How to handle same symbol used for two things lemon parser

I'm developing a domain specific language. Part of the language is exactly like C expression parsing semantics such as precidence and symbols.
I'm using the Lemon parser. I ran into an issue of the same token being used for two different things, and I can't tell the difference in the lexer. The ampersand (&) symbol is used for both 'bitwise and' and "address of".
At first I thought it was a trivial issue, until I realized that they don't have the same associativity.
How do I give the same token two different associativities? Should I just use AMP (as in ampersand) and make the addressof and bitwise and rules use AMP, or should I use different tokens (such as ADDRESSOF and BITWISE_AND). If I do use separate symbols, how am I supposed to know which one from the lexer (it can't know without being a parser itself!).
If you're going to write the rules out explicitly, using a different non-terminal for every "precedence" level, then you do not need to declare precedence at all, and you should not do so.
Lemon, like all yacc-derivatives, uses precedence declarations to remove ambiguities from ambiguous grammars. The particular ambiguous grammar referred to is this one:
expression: expression '+' expression
| expression '*' expression
| '&' expression
| ... etc, etc.
In that case, every alternative leads to a shift-reduce conflict. If your parser generator didn't have precedence rules, or you wanted to be precise, you'd have to write that as an unambiguous grammar (which is what you've done):
term: ID | NUMBER | '(' expression ')' ;
postfix_expr: term | term '[' expression '] | ... ;
unary_expr: postfix_expr | '&' unary_expr | '*' unary_expr | ... ;
multiplicative_expr: unary_expr | multiplicative_expr '*' postfix_expr | ... ;
additive_expr: multiplicative_expr | additive_expr '+' multiplicative_expr | ... ;
...
assignment_expr: conditional_expr | unary_expr '=' assignment_expr | ...;
expression: assignment_expr ;
[1]
Note that the unambiguous grammar even shows left-associative (multiplicative and additive, above), and right-associative (assignment, although it's a bit weird, see below). So there are really no ambiguities.
Now, the precedence declarations (%left, %right, etc.) are only used to disambiguate. If there are no ambiguities, the declarations are ignored. The parser generator does not even check that they reflect the grammar. (In fact, many grammars cannot be expressed as this kind of precedence relationship.)
Consequently, it's a really bad idea to include precedence declarations if the grammar is unambiguous. They might be completely wrong, and mislead anyone who reads the grammar. Changing them will not affect the way the language is parsed, which might mislead anyone who wants to edit the grammar.
There is at least some question about whether it's better to use an ambiguous grammar with precedence rules or to use an unambiguous grammar. In the case of C-like languages, whose grammar cannot be expressed with a simple precedence list, it's probably better to just use the unambiguous grammar. However, unambiguous grammars have a lot more states and may make parsing slightly slower, unless the parser generator is able to optimize away the unit-reductions (all of the first alternatives in the above grammar, where each expression-type might just be the previous expression-type without affecting the AST; each of these productions needs to be reduced, although it's mostly a no-op, and many parser generators will insert some code.)
The reason C cannot simply be expressed as a precedence relationship is precisely the assignment operator. Consider:
a = 4 + b = c + 4;
This doesn't parse because in assignment-expression, the assignment operator can only apply on the left to a unary-expression. This doesn't reflect either possible numeric precedence between + and =. [2]
If + were of higher precedence than =, the expression would parse as:
a = ((4 + b) = (c + 4));
and if + were lower precedence, it would parse as
(a = 4) + (b = (c + 4));
[1] I just realized that I left out cast_expression but I can't be cast to put it back in; you get the idea)
[2] Description fixed.
Later I realized I had the same ambiguity between dereference (*) and multiplication, also (*).
Lemon provides a way to assign a precidence to a rule, using the name used in the associativity declarations (%left/right/nonassoc) in square brackets after the period.
I haven't verified that this works correctly yet, but I think you can do this (note the things in square brackets near the end):
.
.
.
%left COMMA.
%right QUESTION ASSIGN
ADD_ASSIGN SUB_ASSIGN MUL_ASSIGN DIV_ASSIGN MOD_ASSIGN
LSH_ASSIGN RSH_ASSIGN AND_ASSIGN XOR_ASSIGN OR_ASSIGN.
%left LOGICAL_OR.
%left LOGICAL_AND.
%left BITWISE_OR.
%left BITWISE_XOR.
%left BITWISE_AND.
%left EQ NE.
%left LT LE GT GE.
%left LSHIFT RSHIFT.
%left PLUS MINUS.
%left TIMES DIVIDE MOD.
//%left MEMBER_INDIRECT ->* .*
%right INCREMENT DECREMENT CALL INDEX DOT INDIRECT ADDRESSOF DEREFERENCE.
.
.
.
multiplicative_expr ::= cast_expr.
multiplicative_expr(A) ::= multiplicative_expr(B) STAR cast_expr(C). [TIMES]
{ A = Node_2_Op(Op_Mul, B, C); }
.
.
.
unary_expr(A) ::= STAR unary_expr(B). [DEREFERENCE]
{ A = Node_1_Op(Op_Dereference, B); }

Quote equality test in batch

How can I test my string as if it is a quote (") or not.
The default escape character is '^', isn't it?
So I tried the following but with no success yet:
if "!first_char!" == "^"" (...)
I tried double quote, too:
if "!first_char!" == """" (...)
Thanks!
The problem is not with the right part of the comparison, but with the left part. As first_char contains a quote, then the left operand of the comparison does not make sense. So you have to escape with ^ the variable that holds the ".
try something like this...
if .^!first_char!.==.^". (#echo ^!first_char! YES) else (#echo ^!first_char! NO)
The problem is only on the right part of the comparision.
The delayed expansion on the left side is always safe.
So you could simply use also a delayed expansion on the right side, like
set singleQuote="
if "!first_char!" == "!singleQuote!" (...)
Or alternativly you could escape all quotes.
if "!first_char!" == ^"^"^" (...)

Resources