Can anyone explain this context-free grammar to me? - theory

I need to understand this for a homework. You would not be giving me the answer by telling me this, you'd simply help me understand the question being asked.
I've read my class notes which haven't been very helpful, as well as searching all over the internet for context-free grammar info. I can't find anything that looks like what I've been given, and I'm very confused.
If anyone could tell me what this CFG describes, or give me a good resource to explain this subject, I would really appreciate it.
The CFG is this:
S is the starting symbol
<S> → <A> | ε
<A> → 0<B> | 1<A>
<B> → 0<C> | 1<B>
<C> → 0<D> | 1<C>
<D> → 1<D> | 0<B> | ε

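A CFG defines a pattern of strings.
Here the strings are built from the alphabet symbols 1 and 0; e (ε) is not part of the alphabet, it stands for the empty string.
The rules of the CFG tell you how to expand the symbols in angle brackets into strings over the alphabet.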
<S> → <A> | ε
<A> → 0<B> | 1<A>
<B> → 0<C> | 1<B>
<C> → 0<D> | 1<C>
<D> → 1<D> | 0<B> | ε
Here A can be expanded to 0B or 1A. Only the symbol on the left-hand side of a rule can be expanded.
Given a string, you can verify whether it is described by the CFG or not.
Let's take 1000 and see if it is described by this CFG.
We start with S, which can expand to A or e.
expr = S
e (ε) is a special symbol for the empty string; expanding to it ends the derivation.
We will use A, since terminating with e right away would only give the empty string.
expr = A (S -> A)
A can expand to 0B or 1A. Our string starts with 1, so we use 1A.
expr = 1A (A -> 1A)
Now 1A contains A. Looking it up in the rule table, A can expand to 0B or 1A. We take 0B as it follows the given string, so the result is 10B.
expr = 10B (A -> 0B)
Look up B, which expands to 0C or 1B. We take 0C as it matches our string, which now becomes 100C.
expr = 100C (B -> 0C)
Similarly, you keep expanding the expression and finish with e.
expr = 1000D (C -> 0D)
expr = 1000e = 1000 (D -> e)
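The derivation above can be replayed mechanically. Below is a minimal Python sketch, assuming the rules are stored in a dictionary (the names RULES and derive are made up for illustration, and "" stands for e). It relies on the fact that in this particular grammar every rule is a terminal followed by a single non-terminal, so each input character picks at most one rule.

```python
# Productions of the grammar from the question; "" stands for e (epsilon).
RULES = {
    "S": ["A", ""],
    "A": ["0B", "1A"],
    "B": ["0C", "1B"],
    "C": ["0D", "1C"],
    "D": ["1D", "0B", ""],
}

def derive(target):
    """Return the list of derivation steps for target, or None if it is not generated."""
    steps = ["S"]
    if target == "":
        # S -> e is the only way to produce the empty string.
        return steps + [""]
    prefix, symbol = "", "A"
    steps.append("A")  # S -> A
    for ch in target:
        # Each non-terminal has at most one rule starting with a given terminal.
        match = [rhs for rhs in RULES[symbol] if rhs and rhs[0] == ch]
        if not match:
            return None
        prefix += ch
        symbol = match[0][1]
        steps.append(prefix + symbol)
    # The derivation can only end if the last non-terminal may become e.
    if "" not in RULES[symbol]:
        return None
    steps.append(prefix)
    return steps

print(derive("1000"))  # the steps S, A, 1A, 10B, 100C, 1000D, 1000
print(derive("100"))   # None: C has no rule that ends the string
```

This shortcut only works because the grammar is this regular in shape; a general CFG needs a real parsing algorithm.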

CFG: A context-free grammar (CFG) is a term used in formal language theory to describe a certain type of formal grammar. In context-free grammars, all rules are one to one, one to many, or one to none. Languages generated by context-free grammars are known as context-free languages (CFL). Different context-free grammars can generate the same context-free language. It is important to distinguish properties of the language from properties of a particular grammar. The language equality question (do two given context-free grammars generate the same language?) is undecidable.
The lines above are an excerpt from a description of CFGs.
So in short, from any given CFG we try to find the set of strings that can be described by that CFG. This set of strings is the language of that particular CFG.
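One rough way to explore that set of strings is to expand the grammar breadth-first and collect every string that contains no non-terminal any more. The sketch below does this in Python for the grammar in the question; the names RULES and generate are invented for the example, "" stands for e, and the length bound is only there to make the search stop.

```python
from collections import deque

# Productions of the grammar from the question; "" stands for e (epsilon).
RULES = {
    "S": ["A", ""],
    "A": ["0B", "1A"],
    "B": ["0C", "1B"],
    "C": ["0D", "1C"],
    "D": ["1D", "0B", ""],
}

def generate(max_len):
    """Collect every terminal string of length <= max_len derivable from S."""
    found = set()
    seen = {"S"}
    queue = deque(["S"])
    while queue:
        form = queue.popleft()
        # In this grammar a form holds at most one non-terminal, always last.
        if form == "" or form[-1] in "01":
            found.add(form)
            continue
        prefix, symbol = form[:-1], form[-1]
        for rhs in RULES[symbol]:
            new = prefix + rhs
            # Count terminals only and drop forms that are already too long.
            terminals = sum(c in "01" for c in new)
            if terminals <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return sorted(found, key=lambda s: (len(s), s))

print(generate(4))  # ['', '000', '0001', '0010', '0100', '1000']
```

Looking at the output for a few bounds is usually enough to guess the pattern, which you can then check against the rules.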
Below is the solution of your question in graphical form:
In solution, the strings in rectangular boxes are the terminating steps.
So the given CFG can have strings like:
{ e, 000, 1000, 0100, 0010, 0001, 11000, 10100, 000000, ...... }
So this CFG can have any string of 0s and 1s in which:
the number of 0s is a multiple of 3,
there are at least three 0s (the only exception is the empty string, from S --> e), and
any number of 1s appear anywhere, as by observing we get that the minimum length non-empty string is 000:
S --> A --> 0B --> 00C --> 000D --> 000e --> 000
A string made of 1s only cannot be produced, because A never terminates without reading a 0 first.
And by observing the strings at each step you can see that A, B, C and D keep track of the 0s read so far, and only D (three 0s, or six, nine, ...) can end with e.
So this CFG describes a language that has the above properties.

Related

Regex with no 2 consecutive a's and b's

I have been trying out some regular expressions lately. Now, I have 3 symbols a, b and c.
I first looked at a case where I don't want 2 consecutive a's. The regex would be something like:
((b|c) + a(b|c))*(a + epsilon)
Now I'm wondering if there's a way to generalize this problem to say something like:
A regular expression with no two consecutive a's and no two consecutive b's. I tried stuff like:
(a(b|c) + b(a|c) + c)* (a + b + epsilon)
But this accepts inputs such as "abba" or "baab", which have 2 consecutive b's (or a's), and that is not what I want. Can anyone suggest a way out?
If you can't do a negative match then perhaps you can use negative lookahead to exclude strings matching aa and bb? Something like the following (see Regex 101 for more information):
(?!.*(aa|bb).*)^.*$
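For what it's worth, here is a small Python sketch of the same lookahead idea, assuming the alphabet is just a, b and c. The pattern is a slightly rearranged variant of the one above, and the helper name accepts is made up for the example.

```python
import re

# Lookahead rejects any string containing aa or bb; [abc]* restricts the alphabet.
NO_DOUBLES = re.compile(r"(?!.*(?:aa|bb))[abc]*")

def accepts(s):
    """True if s uses only a, b, c and has no two consecutive a's or b's."""
    return NO_DOUBLES.fullmatch(s) is not None

for s in ["", "abcab", "acca", "abba", "baab"]:
    print(repr(s), accepts(s))
```

Note that lookahead is an extension of practical regex engines, not part of the regular expressions used in formal language theory, so this helps in code but not in a theory exercise.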
I (think I) solved this by hand-drawing a finite state machine, then, generating a regex using FSM2Regex. The state machine is written below (with the syntax from the site):
#states
s0
s1
s2
s3
#initial
s0
#accepting
s1
s2
s3
#alphabet
a
b
c
#transitions
s0:a>s1
s0:b>s2
s0:c>s3
s1:b>s2
s1:c>s3
s2:a>s1
s2:c>s3
s3:c>s3
s3:a>s1
s3:b>s2
If you look at the transitions, you'll notice it's fairly straightforward- I have states that correspond to a "sink" for each letter of the alphabet, and I only allow transitions out of that state for other letters (not the "sink" letter). For example, s1 is the "sink" for a. From all other states, you can get to s1 with an a. Once you're in s1, though, you can only get out of it with a b or a c, which have their own "sinks" s2 and s3 respectively. Because we can repeat c, s3 has a transition to itself on the character c. Paste the block text into the site, and it'll draw all this out for you, and generate the regex.
The regex it generated for me is:
c+cc*(c+$+b+a)+(b+cc*b)(cc*b)*(c+cc*(c+$+b+a)+$+a)+(a+cc*a+(b+cc*b)(cc*b)*(a+cc*a))(cc*a+(b+cc*b)(cc*b)*(a+cc*a))*(c+cc*(c+$+b+a)+(b+cc*b)(cc*b)*(c+cc*(c+$+b+a)+$+a)+b+$)+b+a
Which, I'm pretty sure, is not optimal :)
EDIT: The generated regex uses + as the choice operator (usually known to us coders as |), which means it's probably not suitable to pasting into code. However, I'm too scared to change it and risk ruining my regex :)
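If the goal is only to test strings in code, the state machine can also be run directly instead of going through the generated regex. A minimal Python sketch, with the transition table copied from the listing above (the names TRANSITIONS, ACCEPTING and accepts are just for this example):

```python
# Transition table copied from the state machine above.
TRANSITIONS = {
    ("s0", "a"): "s1", ("s0", "b"): "s2", ("s0", "c"): "s3",
    ("s1", "b"): "s2", ("s1", "c"): "s3",
    ("s2", "a"): "s1", ("s2", "c"): "s3",
    ("s3", "a"): "s1", ("s3", "b"): "s2", ("s3", "c"): "s3",
}
ACCEPTING = {"s1", "s2", "s3"}

def accepts(s):
    """Run the machine on s; a missing transition means the string is rejected."""
    state = "s0"
    for ch in s:
        state = TRANSITIONS.get((state, ch))
        if state is None:
            return False
    return state in ACCEPTING

print(accepts("abcab"))  # True
print(accepts("abba"))   # False
```

As written, s0 is not an accepting state, so the empty string is rejected; add s0 to the accepting states if you want to allow it.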
You can use back references to match the prev char
string input = "acbbaacbba";
string pattern = @"([ab])\1";
var matchList = Regex.Matches(input, pattern);
This pattern will match: bb, aa and bb. If there is no match in your input string, it means that it does not contain a repeated a or b.
Explanation:
([ab]): define a group, you can extend your symbols here
\1: back referencing the group, so for example, when 'a' is matched, \1 would be 'a'
check this page: http://www.regular-expressions.info/backref.html

Regular Languages and Concatenation

Regular languages are closed under concatenation - this is demonstrable by having the accepting state(s) of one language with an epsilon transition to the start state of the next language.
If we consider the language L1 = {a^n | n >= 0}, this language is regular (it is simply a*). If we concatenate it with another language L2 = {b^n | n >= 0}, which is also regular, we end up with a^nb^n, but we obviously know this isn't regular.
Where am I going wrong with my logic here?
The definition of the concatenation of two languages L1 and L2 is the set of all strings wx where w ∈ L1 and x ∈ L2. This means that L1L2 consists of all possible strings formed by pairing one string from L1 and one string from L2, which isn't necessarily the same as pairing up matching strings from each language.
As a result, as @Oli Charlesworth pointed out, the language you get back here isn't actually { a^n b^n | n in N }. Instead, it's the language { a^n b^m | n in N and m in N }, which is the language a*b*. This language is regular, since it's given by the regular expression a*b*.
Hope this helps!
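To make the difference concrete, here is a small Python sketch that builds finite slices of both languages (n up to 3) and concatenates them according to the definition. The names concat, L1, L2 and matched are only for this illustration.

```python
def concat(l1, l2):
    """Concatenation of two languages: every w from l1 followed by every x from l2."""
    return {w + x for w in l1 for x in l2}

N = 3
L1 = {"a" * n for n in range(N + 1)}   # finite slice of a*
L2 = {"b" * n for n in range(N + 1)}   # finite slice of b*

product = concat(L1, L2)                            # strings a^n b^m
matched = {"a" * n + "b" * n for n in range(N + 1)}  # strings a^n b^n

print(len(product), len(matched))  # 16 4
print("aab" in product)            # True, although the exponents differ
```

The matched strings are only a small subset of the concatenation, which is why closure under concatenation says nothing about a^n b^n.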

Finding all possible combinations of truth values to well-formed formulas occurring in the given formula in C

I want to write a truth table evaluator for a given formula like in this site.
http://jamie-wong.com/experiments/truthtabler/SLR1/
Operators are:
- (negation)
& (and)
| (or)
> (implication)
= (equivalence)
So far I made this
-(-(a& b) > ( -((a|-s)| c )| d))
given this formula my output is
abdsR
TTTT
TTTF
TTFT
TTFF
TFTT
TFTF
TFFT
TFFF
FTTT
FTTF
FTFT
FTFF
FFTT
FFTF
FFFT
FFFF
I am having difficulties with the evaluating part.
I created an array in which I stored the indices of the matching parentheses, if it helps, namely
7-3, 17-12, 20-11, 23-9, 24-1
I also checked the code at http://www.stenmorten.com/English/llc/source/turth_tables_ass4.c, however I didn't understand it.
Writing an operator precedence parser to evaluate infix notation expressions is not an easy task. However, the shunting yard algorithm is a good place to start.
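To give an idea of the shape of the solution, here is a minimal sketch of a shunting-yard based evaluator. It is written in Python rather than C to keep it short, and several things are assumptions on my part: variables are single letters, the precedence order is - then & then | then > then =, and > is right-associative. There is no error handling for malformed formulas.

```python
from itertools import product

# Assumed precedences (higher binds tighter); "-" is a prefix operator.
PREC = {"-": 4, "&": 3, "|": 2, ">": 1, "=": 0}
RIGHT_ASSOC = {">"}

def to_rpn(formula):
    """Shunting-yard: convert the infix formula to postfix (RPN) tokens."""
    output, stack = [], []
    for tok in formula.replace(" ", ""):
        if tok.isalpha():
            output.append(tok)
        elif tok == "(" or tok == "-":
            stack.append(tok)  # nothing is popped before "(" or a prefix operator
        elif tok == ")":
            while stack[-1] != "(":
                output.append(stack.pop())
            stack.pop()  # discard the "("
        else:
            while stack and stack[-1] != "(" and (
                PREC[stack[-1]] > PREC[tok]
                or (PREC[stack[-1]] == PREC[tok] and tok not in RIGHT_ASSOC)
            ):
                output.append(stack.pop())
            stack.append(tok)
    while stack:
        output.append(stack.pop())
    return output

def eval_rpn(rpn, env):
    """Evaluate postfix tokens with a value stack; env maps variables to bools."""
    stack = []
    for tok in rpn:
        if tok.isalpha():
            stack.append(env[tok])
        elif tok == "-":
            stack.append(not stack.pop())
        else:
            right, left = stack.pop(), stack.pop()
            if tok == "&":
                stack.append(left and right)
            elif tok == "|":
                stack.append(left or right)
            elif tok == ">":
                stack.append((not left) or right)
            else:  # "="
                stack.append(left == right)
    return stack[0]

def truth_table(formula):
    """Return the sorted variable names and one (values, result) pair per row."""
    rpn = to_rpn(formula)
    names = sorted({t for t in rpn if t.isalpha()})
    rows = []
    for values in product([True, False], repeat=len(names)):
        rows.append((values, eval_rpn(rpn, dict(zip(names, values)))))
    return names, rows

names, rows = truth_table("-(-(a& b) > ( -((a|-s)| c )| d))")
print("".join(names) + "R")
for values, result in rows:
    print("".join("T" if v else "F" for v in values) + ("T" if result else "F"))
```

The same two stacks (operators and values) translate fairly directly to arrays in C. Note that the formula has five variables, so the table has 32 rows.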

Context Free pumping lemma

Is the following language context free?
L = {a^i b^k c^r d^s | i+s = k+r, i,k,r,s >= 0}
I've tried to come up with a context-free grammar to generate this but I cannot, so I'm assuming it's not context free. As for my proof by contradiction:
Assume that L is context free,
Let p be the constant given by the pumping lemma,
Choose string S = a^p b^p c^p d^p where S = uvwxy
As |vwx| <= p, vwx can contain at most two distinct symbols:
case a) vwx contains only a single type of symbol, therefore uv^2wx^2y will result in i+s != k+r
case b) vwx contains two types of symbols:
i) vwx is composed of b's and c's, therefore uv^2wx^2y will result in i+s != k+r
Now my problem is that if vwx is composed of either a's and b's, or c's and d's, then pumping them won't necessarily break the language, as i and k or s and r could increase in unison, resulting in i+s == k+r.
Am I doing something wrong or is this a context free language?
I can't come up with a CFG to generate that particular language off the top of my head either, but we know that a language is context free iff some pushdown automaton recognizes it.
Designing such a PDA won't be too difficult. Some ideas to get you started:
we know i+s=k+r. Equivalently, i-k-r+s = 0 (I wrote it in that order since that is the order in which they appear). The crux of the problem is deciding what to do with the stack if (k+r)>i.
If you aren't familiar with PDAs or cannot use them to answer the problem, at least you know now that it is context free.
Good luck!
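One possible way to handle the stack, sketched in Python rather than as a formal PDA: keep only one kind of marker on the stack at a time, '+' for a surplus of a's and d's and '-' for a surplus of b's and c's, and cancel against the opposite marker when there is one. The function name accepts and the marker symbols are my own choice for the illustration.

```python
def accepts(word):
    """Simulate a PDA-style check for a^i b^k c^r d^s with i + s = k + r."""
    order = "abcd"
    phase = 0    # finite control: index of the letter block we are in
    stack = []   # holds only '+' (surplus a/d) or only '-' (surplus b/c)
    for ch in word:
        if ch not in order or order.index(ch) < phase:
            return False  # unknown symbol, or letters out of order
        phase = order.index(ch)
        mark = "+" if ch in "ad" else "-"
        if stack and stack[-1] != mark:
            stack.pop()         # cancel against an opposite marker
        else:
            stack.append(mark)  # otherwise record a new surplus
    return not stack            # accept iff everything cancelled out

print(accepts("abcd"))  # True: 1 + 1 = 1 + 1
print(accepts("aacc"))  # True: 2 + 0 = 0 + 2
print(accepts("ad"))    # False: 1 + 1 != 0 + 0
```

Each step only looks at the top of the stack and the current letter block, which is what a real PDA is allowed to do, so this answers the case (k+r)>i by switching the stack over to '-' markers.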
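Here is a grammar that generates this language (S is the start symbol):
S -> XY
S -> ZW
X -> aXc
X -> Z
Z -> aZb
Z -> ε
Y -> cYd
Y -> ε
W -> bWd
W -> Y
The first rule for S covers i >= k (the a's that are not matched by b's are matched by c's, and the remaining c's by d's); the second covers i <= k (the b's that are not matched by a's are matched by d's, and the c's by the remaining d's).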

C: Convert A ? B : C into if (A) B else C

I was looking for a tool that can convert C code expressions of the form:
a = (A) ? B : C;
into the 'default' syntax with if/else statements:
if (A)
a = B;
else
a = C;
Does someone know a tool that's capable of doing such a transformation?
I work with GCC 4.4.2 and create a preprocessed file with -E but do not want such structures in it.
Edit:
Following code should be transformed, too:
a = ((A) ? B : C)->b;
Coccinelle can do this quite easily.
Coccinelle is a program matching and transformation engine which provides the language SmPL (Semantic Patch Language) for specifying desired matches and transformations in C code. Coccinelle was initially targeted towards performing collateral evolutions in Linux. Such evolutions comprise the changes that are needed in client code in response to evolutions in library APIs, and may include modifications such as renaming a function, adding a function argument whose value is somehow context-dependent, and reorganizing a data structure. Beyond collateral evolutions, Coccinelle is successfully used (by us and others) for finding and fixing bugs in systems code.
EDIT:
An example of semantic patch:
@@ expression E; constant C; @@
(
  !E & !C
|
- !E & C
+ !(E & C)
)
From the documentation:
The pattern !x&y. An expression of this form is almost always meaningless, because it combines a boolean operator with a bit operator. In particular, if the rightmost bit of y is 0, the result will always be 0. This semantic patch focuses on the case where y is a constant.
You have a good set of examples here.
The mailing list is really active and helpful.
The following semantic patch for Coccinelle will do the transformation.
@@
expression E1, E2, E3, E4;
@@
- E1 = E2 ? E3 : E4;
+ if (E2)
+   E1 = E3;
+ else
+   E1 = E4;
@@
type T;
identifier E5;
T *E3;
T *E4;
expression E1, E2;
@@
- E1 = ((E2) ? (E3) : (E4))->E5;
+ if (E2)
+   E1 = E3->E5;
+ else
+   E1 = E4->E5;
@@
type T;
identifier E5;
T E3;
T E4;
expression E1, E2;
@@
- E1 = ((E2) ? (E3) : (E4)).E5;
+ if (E2)
+   E1 = (E3).E5;
+ else
+   E1 = (E4).E5;
The DMS Software Reengineering Toolkit can do this, by applying program transformations.
A specific DMS transformation to match your specific example:
domain C.
rule ifthenelseize_conditional_expression(a:lvalue,A:condition,B:term,C:term):
stmt -> stmt
= " \a = \A ? \B : \C; "
-> " if (\A) \a = \B; else \a=\C ; ".
You'd need another rule to handle your other case, but it is equally easy to express.
The transformations operate on source code structures rather than text, so layout and comments won't affect recognition or application. The quotation marks in the rule are not traditional string quotes, but rather metalinguistic quotes that separate the rule syntax language from the pattern language used to specify the concrete syntax to be changed.
There are some issues with preprocessing directives if you intend to retain them. Since you apparently are willing to work with preprocessor-expanded code, you can ask DMS to do the preprocessing as part of the transformation step; it has full GCC4 and GCC4-compatible preprocessors built right in.
As others have observed, this is a rather easy case because you specified it work at the level of a full statement. If you want to rid the code of any assignment that looks similar to this statement, with such assignments embedded in various contexts (initializers, etc.) you may need a larger set of transforms to handle the various set of special cases, and you may need to manufacture other code structures (e.g., temp variables of appropriate type). The good thing about a tool like DMS is that it can explicitly compute a symbolic type for an arbitrary expression (thus the type declaration of any needed temps) and that you can write such a larger set rather straightforwardly and apply all of them.
All that said, I'm not sure of the real value of doing your ternary-conditional-expression elimination operation. Once the compiler gets hold of the result, you may get similar object code as if you had not done the transformations at all. After all, the compiler can apply equivalence-preserving transformations, too.
There is obviously value in making regular changes in general, though.
(DMS can apply source-to-source program transformations to many languages, including C, C++, Java, C# and PHP).
I am not aware of such a tool, as the ternary operator is built into the language specification as a shortcut for the if logic. The only way I can think of doing this is to look for those lines manually and rewrite them into the form where if is used. As a reminder, the ternary operator works like this:
expr_is_true ? exec_if_expr_is_TRUE : exec_if_expr_is_FALSE;
If the expression evaluates to true, the part between ? and : is evaluated; otherwise the last part between : and ; is evaluated. If you negate the condition, the two branches swap:
expr_is_false ? exec_if_expr_is_FALSE : exec_if_expr_is_TRUE;
If the statements are very regular like this why not run your files through a little Perl script? The core logic to do the find-and-transform is simple for your example line. Here's a bare bones approach:
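use strict;
use warnings;

# Rewrite lines of the form "a = (A) ? B : C;" as an if/else statement.
while (my $line = <>) {
    chomp($line);
    if ( $line =~ m/(\S+)\s*=\s*\((\s*\S+\s*)\)\s*\?\s*(\S+)\s*:\s*(\S+)\s*;/ ) {
        # $1 = target, $2 = condition, $3 = true branch, $4 = false branch
        print "if (" . $2 . ")\n\t" . $1 . " = " . $3 . ";\nelse\n\t" . $1 . " = " . $4 . ";\n";
    } else {
        print $line . "\n";
    }
}
exit(0);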
You'd run it like so:
perl transformer.pl < foo.c > foo.c.new
Of course it gets harder and harder if the text pattern isn't as regular as the one you posted. But free, quick and easy to try.

Resources