I'm not sure how to write a recursive descent parsing function for a grammar like the one below. Actually, I'm not sure whether I was doing it right...
BNF:
A : B | A '!'
B : '[' ']'
pseudo-code:
f()
{
    if(tok is B)
        parse_b();
        return somethingB
    else if(????) how will I know if it's start of A or I don't need to?
        x = f();
        parse_c();
        return somethingA
}
I was doing this (no check to determine if it's an A but I feel there's something wrong with it):
f()
{
    if(tok is B)
        parse_b();
        return somethingB
    else
        x = f();
        parse_c();
        return somethingA
}
See my SO answer to another similar question for details on how to build a recursive descent parser.
In particular it addresses the structure of the parser, and how you can pretty much derive it by inspecting your grammar rules, including handling lists.
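As a rough illustration of deriving the parser from the grammar: the rule A : B | A '!' is left-recursive, so a function for A that starts by calling itself would never terminate. A common workaround, assumed here, is to read the rule as "one B followed by any number of '!'" and use a loop. The character cursor cur and the names parse_b, parse_a and parse_input are hypothetical stand-ins for a real tokenizer and tree builder.

```c
/* Hypothetical input cursor; a real parser would read tokens from a lexer. */
static const char *cur;

/* B : '[' ']'   Returns 1 on success, 0 on failure. */
static int parse_b(void)
{
    if (cur[0] == '[' && cur[1] == ']') {
        cur += 2;
        return 1;
    }
    return 0;
}

/* A : B | A '!'   rewritten as   A : B ('!')*
 * The left recursion becomes a loop; *bangs counts the '!' suffixes. */
static int parse_a(int *bangs)
{
    *bangs = 0;
    if (!parse_b())
        return 0;
    while (*cur == '!') {
        cur++;
        (*bangs)++;
    }
    return 1;
}

/* Returns the number of '!' suffixes, or -1 if the text is not a valid A. */
int parse_input(const char *text)
{
    int bangs;
    cur = text;
    if (!parse_a(&bangs) || *cur != '\0')
        return -1;
    return bangs;
}
```

With this shape there is no need to guess whether the input starts an A: every A starts with a B, and each '!' seen afterwards wraps the result so far in one more A.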
I have to produce an LL1 grammar that covers the IF, IF-ELSE, IF - ELSE IF - ELSE condition for a C program.
I was doing the following and wasn't able to resolve the recursions, so I thought that maybe my grammar is wrong or does not satisfy the LL1 conditions.
Can you tell me if the grammar is correct?
<MAIN> ::= int main () { <PROG> <AUX_PROG> }
<AUX_PROG> ::= <PROG> <AUX_PROG> | ε
<PROG> ::= <IF_STAT> | other | ε
<IF_STAT> ::= if ( other ) { <PROG> } <ELSE_STAT>
<ELSE_STAT> ::= else { <PROG> } | ε
follow(PROG) = { "}", if, other }
follow(AUX_PROG) = { "}" }
follow(IF_STAT) = follow(PROG) = { "}", if, other }
follow(ELSE_STAT) = follow(IF_STAT) = { "}", if, other }
follow(MAIN) = { $ }
first(MAIN) = { int }
first(AUX_PROG) = { if, other, ε }
first(PROG) = { if, other, ε }
first(IF_STAT) = { if }
first(ELSE_STAT) = { else, ε }
UPDATE: I have modified the grammar and have also included the first and follow sets.
The braces are required so that there is no dangling-else problem.
That grammar is ambiguous because <PROG> ::= ε makes <AUX_PROG> ::= <PROG> <AUX_PROG> left-recursive. If you eliminate the null production for <PROG> then the grammar is certainly LL(1).
But just being LL(1) does not demonstrate that the grammar correctly recognises the desired syntax, much less that it correctly parses each input into the desired parse tree. So it definitely depends on how you define "correct". Since your question doesn't really specify either the syntax you hope to match nor the form in which you would like it to be analysed, it's hard to comment on these forms of correctness.
You're absolutely correct to note that the heart of C's dangling-else issue is that C does not require the bodies of if and else clauses to be delimited. So the following is legal C:
if (condition1) if (condition2) statement1; else statement2;
and the language's rules cause else statement2 to be bound to if (condition2), rather than the first if.
That's often called an ambiguity, but it's actually easy to disambiguate. You'll find the disambiguation technique all over the place, including Wikipedia's somewhat ravaged entry on dangling else, or most popular programming language textbooks. However, the disambiguation technique does not result in an LL(1) grammar; you need to use a bottom-up parser. (Even an operator precedence parser can deal with it, but LALR(1) parsers are probably more common.)
As Wikipedia points out, a simple solution is to change the grammar to remove the possibility of if (c1) if (c2) .... A simple way to do that is to insist that the target of the if be delimited in some way, such as adding braces (which would in any case be required if the body were more than one statement). It's not necessary to put the same requirement on the body of the else clause, but that would probably be confusing for language users. However, it is convenient to permit chained if...else constructs like this:
if (c1) {
    body1
}
else if (c2) {
    body2
}
else if (c3) {
    body3
}
...
That's not ambiguous, even though the body of each else is not delimited. In some languages, that construct is abbreviated by using a special elseif token (which might be spelled elif or elsif) in order to preserve the rule that else clauses must be delimited blocks. But it's not too eccentric to simply allow else if as an exception to the general rule about bodies.
So if you're designing a language, you have options. If you're implementing someone else's language (such as the one given by the instructor of a course) you need to make sure you understand what their requirements are.
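To make the one-token-lookahead decisions concrete, here is a minimal recognizer sketch in C for the braced form with chained else if. It assumes a hypothetical encoding in which every token is one character (i for if, e for else, o for other, plus the punctuation), and it drops the null production for a statement as suggested above. The names eat, stmts, if_stat and recognize are illustrative only.

```c
/* Hypothetical one-character tokens: i = if, e = else, o = other, ( ) { } */
static const char *tok;

/* Consume the expected token if it is next; report whether it was. */
static int eat(char c)
{
    if (*tok == c) {
        tok++;
        return 1;
    }
    return 0;
}

static int stmts(void);

/* if_stat : i ( o ) { stmts } [ e ( if_stat | { stmts } ) ] */
static int if_stat(void)
{
    if (!(eat('i') && eat('(') && eat('o') && eat(')') &&
          eat('{') && stmts() && eat('}')))
        return 0;
    if (eat('e')) {
        if (*tok == 'i')              /* "else if": chain without braces */
            return if_stat();
        return eat('{') && stmts() && eat('}');
    }
    return 1;                         /* no else part */
}

/* stmts : ( if_stat | o )*   chosen purely from the next token */
static int stmts(void)
{
    for (;;) {
        if (*tok == 'i') {
            if (!if_stat())
                return 0;
        } else if (*tok == 'o') {
            tok++;
        } else {
            return 1;                 /* '}' or end of input: list is done */
        }
    }
}

/* Returns 1 if the whole token string is a valid statement list. */
int recognize(const char *tokens)
{
    tok = tokens;
    return stmts() && *tok == '\0';
}
```

Every branch is selected by looking at a single token, which is the property an LL(1) grammar is meant to guarantee. An unbraced body such as "i(o)o" is rejected, so the dangling-else case cannot arise.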
So I have a macro function like so:
#define PROPOGATE_METHOD(status, function, propogationMethod) \
status = function; \
if(status != eSuccess)\
{\
propogationMethod; \
}
So like any good developer would do, I want to wrap each of the parameters as such :
#define PROPOGATE_METHOD(status, function, propogationMethod) \
(status) = (function); \
if((status) != eSuccess)\
{\
(propogationMethod); \
}
But if I call this macro function with a goto or return, I get an error (expecting expression before goto).
i.e. PROPOGATE_METHOD(status, functionCall(), goto Error;);
Thoughts on working around this? I was thinking of moving the goto into the macro function, and wrapping around the label, but that throws another error :
expected identifier or ‘*’ before ‘(’ token
"So like any good developer would do, I want to wrap each of the parameters as such"
@StoryTeller had a good response to this in the comment section: "A good developer understands why the suggestion exists, what problem it solves, and most importantly, when it's not applicable. Blindly doing something is not good development."
Another good applicable quote is "Blindly following best practices is not best practice".
Here, it really seems like you're adding parentheses because someone said "it's a good thing to put parentheses around the arguments", not for any valid purpose in this particular case.
Skip the macro
I don't really see the purpose of this macro. TBH, it looks like you're showing off, but macros like this are very likely to cause hard-to-trace bugs. If it's just to save some lines, you can actually make a somewhat decent one-liner of this without any macro.
if((status = foo()) != eSuccess) goto Error;
or
if((status = foo()) != eSuccess) return x;
or
if((status = foo()) != eSuccess) bar();
In many cases, I'd prefer writing those on two or three lines. But the above is not so bad, and I would definitely say that it's better than the macro. Just remember the extra parentheses around status = foo(). If forgetting them is a big concern, you could do something like this:
int foo_wrapper(int *status) { return *status = foo(); }
...
if(foo_wrapper(&status) != eSuccess) goto Error;
or even:
int status_assign(int *dest, int src) { return *dest = src; }
...
if(status_assign(&status, foo()) != eSuccess) goto Error;
On the other hand, it shouldn't be a problem, because if you compile with -Wall, which you should, you will get this:
warning: suggest parentheses around assignment used as truth value
Personally, I don't think braces are extremely important around single statements, but if you want a one-liner with braces, just do:
if((status = foo()) != eSuccess) { goto Error; }
Some will like it. Some will not, but it's not the most important question in the world. But I would prefer any of the above before the macro you're suggesting.
Compare these:
PROPOGATE_METHOD(status, foo(), goto Error;);
if((status = foo()) != eSuccess) goto Error;
When compared side by side, I cannot really see that the macro accomplishes anything at all. It doesn't make anything clearer or safer. Without the macro, I can see EXACTLY what's happening and I don't need to wonder about that. Macros have their uses, but as far as I can see here, this is not one of them.
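For completeness, here is a small sketch in C of the macro-free form inside a whole function. The status enum, the helper step and the function run are made up for the illustration; only eSuccess and the Error label come from the question.

```c
/* Hypothetical status type; only eSuccess appears in the question. */
enum status { eSuccess = 0, eFailure = 1 };

/* Hypothetical operation that fails on request. */
static enum status step(int fail)
{
    return fail ? eFailure : eSuccess;
}

/* Returns 0 when both steps succeed, -1 when either one fails. */
int run(int fail_first, int fail_second)
{
    enum status status;

    if ((status = step(fail_first)) != eSuccess) goto Error;
    if ((status = step(fail_second)) != eSuccess) goto Error;
    return 0;

Error:
    /* shared cleanup would go here */
    return -1;
}
```

The control flow is visible at each call site, and goto needs no special treatment because it is written as an ordinary statement rather than passed as a macro argument.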
From comments below
I understand. I'm going through and refactoring a code base. I prefer not to litter the code base with these if statements and prefer the macro statement because
I can understand that, but I would really encourage you to reconsider, at least if you're allowed to. If I were to refactor that code base, getting rid of that macro would have pretty high priority. After all, refactoring means rewriting code so that it becomes better, with a focus on design and readability. If you don't want to do that, put parentheses around status and function and then leave it. That will not make it good, but it will not cause any harm either.
I would not use this expression if it were you who had written that macro, but since it's not you, I can use it. "Fixing" that macro is really polishing a turd. No matter what you do, it will never shine, and it will always be a turd.
Disregarding if that macro is useful or confusing, and the merits of goto, consider what the parens inside a macro/define are for. If you have say, this:
#define FOO a + b
...
int y = x * FOO;
what you end up with, is y = x * a + b (because it's just text replacement, not a real variable), which is the same as y = (x * a) + b. Hence, putting parens around (a + b) in FOO fixes that.
This, of course has a similar problem (both inside x and outside the macro), with a similar solution:
#define FOO2(x) x * 123
...
int y = FOO2(a + b);
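A short C sketch of both precedence problems, with the unparenthesized and parenthesized variants side by side. The _BARE and _SAFE macro names and the four wrapper functions are invented for the comparison.

```c
/* Unparenthesized and parenthesized variants of the two macros above. */
#define FOO_BARE      a + b
#define FOO_SAFE      (a + b)
#define FOO2_BARE(x)  x * 123
#define FOO2_SAFE(x)  ((x) * 123)

/* Expands to x * a + b, which groups as (x * a) + b. */
int foo_bare(int x, int a, int b) { return x * FOO_BARE; }

/* Expands to x * (a + b). */
int foo_safe(int x, int a, int b) { return x * FOO_SAFE; }

/* Expands to a + b * 123, which groups as a + (b * 123). */
int foo2_bare(int a, int b) { return FOO2_BARE(a + b); }

/* Expands to ((a + b) * 123). */
int foo2_safe(int a, int b) { return FOO2_SAFE(a + b); }
```

With x = 2, a = 3, b = 4 the first pair gives 10 and 14; with a = 1, b = 2 the second pair gives 247 and 369. The parentheses matter only because the expansion lands next to an operator that binds tighter than the one inside the macro.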
Now, you have
#define BAR(x) { x };
Is there a similar problem there? What could x contain such that parentheses would prevent a similar problem stemming from operator precedence? I don't really see such an issue; in a way, the braces already protect the x part from the code surrounding the macro. Adding the parentheses only has the effect of forcing x to be an expression instead of a full statement.
The goto statement (goto label;) isn't an expression, so you cannot parenthesize it (and neither is goto label without the ;, which isn't even a separately recognizable construct in C's syntax).
And even if you passed something that you can parenthesize (e.g., longjmp(jbuf,1)), there isn't much of a point in parenthesizing it in this context ({ HOOK; }).
Now if you expanded it in a context like HOOK_EXPR * 2, then parentheses would be useful to force HOOK_EXPR to group tighter than * (imagine you passed 3+4 as HOOK_EXPR), but in this context you don't need them.
What is the "hanging else" problem? (Is that the right name?)
Following a C++ coding standard (forgot which one) I always use brackets (block) with control structures. So I don't normally have this problem (to which "if" does the last(?) else belong), but for understanding possible problems in foreign code it would be nice with a firm understanding of this problem. I remember reading about it in a book about Pascal many years ago, but I can't find that book.
Ambiguous else.
Some info here: http://theory.stanford.edu/~amitp/yapps/yapps-doc/node3.html
But the classic example is:
if a then
    if b then
        x = 1;
    else
        y = 1;
vs.
if a then
    if b then
        x = 1;
else
    y = 1;
Which if does the else belong to?
if (a < b)
if (c < d)
a = b + d;
else
b = a + c;
(Obviously you should ignore the indentation.)
That's the "hanging else problem".
C/C++ gets rid of the ambiguity by having a rule that says you can't have an-if-without-an-else as the if-body of an-if-with-an-else.
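The binding can be checked with a small C sketch. The function which_branch and its return codes are made up for the demonstration, and a compiler such as GCC may warn about the ambiguous else when warnings are enabled.

```c
/* Returns 1 if the inner "then" branch ran, 2 if the else branch ran,
 * and 0 if neither did. */
int which_branch(int a, int b)
{
    int ran = 0;

    /* Deliberately written without braces: the else pairs with if (b). */
    if (a) if (b) ran = 1; else ran = 2;

    return ran;
}
```

If the else belonged to the outer if, which_branch(0, 0) would return 2. It returns 0 instead, and which_branch(1, 0) returns 2, which shows the else is attached to the nearest if.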
Looking at this from a language design point of view.
The standard BNF-like grammar for if-else:
Statement :- .. STUFF..
| IfStatement
IfStatement :- IF_TOKEN '(' BoolExpression ')' Statement IfElseOpt
IfElseOpt :- /* Empty */
| ELSE_TOKEN Statement
Now from a parser's point of view:
if (cond1) Statement1
if (cond2) Statement2
else Statement3
When the parser gets to the ELSE_TOKEN it has two options, SHIFT or REDUCE. Choosing between them requires an additional rule for the parser to follow. Most parser generators default to SHIFT when given this option.
I don't see the problem for Pascal?
This one is incorrectly indented.
if a then
    if b then
        x = 1;
    else
        y = 1;
Removing the semi-colon from after x = 1 would make it correctly indented.
This one is correctly indented.
if a then
    if b then
        x = 1;
else
    y = 1;
I know this is possible, but is it good practice to use a ternary operator to call functions rather than using an if statement?
if(x){
a();
} else if(y){
b();
} else if(z){
c();
}
Instead do this:
(x) ?
a() :
(y) ?
b() :
(z) ?
c() : 0;
Are there any issues that can occur that I do not know of?
For the nitty-gritty details of how the conditional operator works, see the C11 Standard, section 6.5.15.
The biggest difference is that the conditional (ternary) operator is an expression that yields a value, so it is meant to be used where that value is needed, typically in an assignment. As in,
x = (a < b) ? c : d
If (a < b) is not zero (true), x = c; otherwise, x = d.
There are several constraints on c and d that need to be considered (see 6.5.15, paragraph 3). Chief among them is that c and d must be one of the following:
Both arithmetic types
Both the same struct or union type
Both void types
Both pointers to compatible types, or a pointer paired with a null pointer constant or a pointer to void
Now all of that said, you're specifically not assigning to a variable, but the return values of those functions need to hold to these constraints as well.
However, as was pointed out in the comments on the question - this is still sacrificing readability for terseness. That's almost never best practice.
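A small C sketch of the chained form from the question, assuming for illustration that a, b and c all return int (the question does not give their return types). The calls counter and the dispatch wrapper are hypothetical additions used to show that only the selected function is evaluated.

```c
/* Hypothetical counter showing how many of the functions actually ran. */
static int calls;

/* Assumed to return int; the question does not show their declarations. */
static int a(void) { calls++; return 1; }
static int b(void) { calls++; return 2; }
static int c(void) { calls++; return 3; }

/* The chain groups right to left: x ? a() : (y ? b() : (z ? c() : 0)). */
int dispatch(int x, int y, int z)
{
    return x ? a() : y ? b() : z ? c() : 0;
}
```

Because the final operand is the int constant 0, each function has to return an arithmetic type here. If they returned void, the last branch could not be 0, since void can only be paired with void.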
Many thanks to @JensGustedt for pointing out the errors in my answer and helping me to fix them!
In my search for an example of a software phase lock loop I came across the following question
Software Phase Locked Loop example code needed
In the answer by Adam Davis a site is given whose link is broken, and I have tried the new link given in a comment but I can't get that to work either.
The answer from Kragen Javier Sitaker gave the following code as a simple example of a software phase locked loop.
main(a,b){for(;;)a+=((b+=16+a/1024)&256?1:-1)*getchar()-a/512,putchar(b);}
Also included in his answer was a link to what should be a much more readable example but this link is also broken. Thus I have been trying to translate the above code into simpler and more readable code.
I have come this far:
main(a,b){
    for(;;){
        // here I need to break up the code somehow into if() statement(s).
        if(here I get lost){
            a = a+1;
            if(here i get lost some more){
                b = b+1;
            }
        }
    }
}
Thanks to the SO question What does y -= m < 3 mean?
I know it is possible to break up the a+= and b+= into if statements.
But the (&256? 1 : -1)*getchar()-a/512,putchar(b); part in the code is killing me. I have been looking on Google and on SO for the meaning of the symbols and functions that are used.
I know the & sign indicates an address in memory.
I know that the : sign declares a bit field OR can be used in combination with the ? sign as the conditional operator. The combination of the two I can use as in sirgeorge's answer to what does the colon do in c?
Theory Behind getchar() and putchar() Functions
I know that getchar() reads one character
I know that putchar() displays the character
But the combination of all these in the example code is not readable for me. I cannot make it readable for myself even though I know what they each do separately.
So my question: How do I read this software phase lock loop code?
main(a,b){for(;;)a+=((b+=16+a/1024)&256?1:-1)*getchar()-a/512,putchar(b);}
What I get is:
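main (a, b)
{
    int c;  /* getchar() returns an int, so keep the sample as an int */
    for (;;)
    {
        c = getchar();
        b = (b + 16 + (a / 1024));
        if(!(b & 256))  /* bit 8 of b clear: the multiplier is -1 */
        {
            c = c * -1;
        }
        a = a + c - (a/512);
        putchar(b);
    }
}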
I had to add a c variable to not get lost.
What the program does:
Take a and b.
Infinite loop:
get a char input in c
calculate b
If (b bitwise AND 256) is zero
c = -c
Calculate a
Print b
It seems to translate the input into something else; I would have to see the code in action to understand it better myself.
Hope it helped!
Hint:
https://en.wikipedia.org/wiki/Operators_in_C_and_C%2B%2B
a+= => a = a +
a?b:c => if(a){return b;} else {return c;} (written as if it were a function; the operator doesn't truly return)
Add parentheses; it helps.
a & b is bitwise AND:
a\b | 0 | 1 |
 0  | 0 | 0 |
 1  | 0 | 1 |
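Putting the hints together, the loop body of the one-liner can be pulled out into a function that is easy to step through by hand. This is only a sketch in C: the name pll_step and the pointer parameters are invented here, and the role descriptions in the comments are an interpretation, not something stated in the original code.

```c
/* One pass through the loop body of the one-liner.
 * sample stands in for the value returned by getchar().
 * Returns the value that the original passes to putchar(). */
int pll_step(int *a, int *b, int sample)
{
    /* b += 16 + a/1024: b advances by a fixed step plus a correction
     * taken from a (b appears to act as the oscillator phase). */
    *b += 16 + *a / 1024;

    /* (b & 256) ? 1 : -1: bit 8 of b selects the sign, a square wave. */
    int sign = (*b & 256) ? 1 : -1;

    /* a += sign * sample - a/512: accumulate the signed input while
     * letting the old value of a decay slowly. */
    *a += sign * sample - *a / 512;

    return *b;
}
```

Starting from a = 0 and b = 0 with sample = 100, b becomes 16, bit 8 is clear so the sign is -1, and a becomes -100. The original program repeats this step forever, feeding getchar() in and putchar(b) out.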