Union and intersection of two languages - union

Given the language
L1 = {(i|l)^p (f|g)^n (f|h)^m (f|i)^r (l|m)^p : n + m > r > 0, p >= 0}
and
L2 = (f|g)*(h|i)+
make an automaton for L1 ∪ L2 and (another one for) L1 ∩ L2.
I know that L1 is a CFL, so you need a PDA to recognize it, and I know that L2 is a regular language, so a DFA suffices.
My question is: how do you make the intersection (and the union)? That is, what is the actual language L3 = L1 ∩ L2 on which you make the automaton and how do you compute it?

Union is easy - if you have PDAs for two languages, a nondeterministic PDA that accepts the union can be had by introducing a new initial state with epsilon transitions to each of the initial states of the automata for your languages. The accepted language will be exactly whatever is accepted by either (or both) automata.
Intersection is a little more tricky. You say you want an automaton, which could refer to any theoretical machine, including Turing machines or, equivalently, two-stack PDAs. If you are OK with creating a two-stack PDA, then your task is straightforward. Use the Cartesian product machine construction that builds a DFA for A ∩ B from DFAs for A and B, and have each transition of the form ((a, b), w) -> (a', b') push x onto the first stack and y onto the second stack iff (a, w) -> a' pushes x onto A's stack and (b, w) -> b' pushes y onto B's stack.
Of course, if you start with a DFA for the RL and perform the above construction you only need one stack and the resulting automaton is a PDA, not a two-stack PDA.
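A minimal sketch of the one-stack case, assuming a PDA without epsilon moves that accepts by final state. The representation (dictionaries keyed by state, symbol and stack top), the function names, and the toy languages (a^n b^n with n >= 1, and "even number of a's") are all assumptions made for illustration; they are not the L1 and L2 of the question.

```python
def product_pda(pda_delta, pda_start, pda_accept, dfa_delta, dfa_start, dfa_accept):
    """Pair every PDA move with the DFA move on the same input symbol."""
    dfa_states = {q for (q, _) in dfa_delta} | set(dfa_delta.values())
    delta = {}
    for (p, sym, top), moves in pda_delta.items():
        for q in dfa_states:
            if (q, sym) in dfa_delta:
                q2 = dfa_delta[(q, sym)]
                # The stack action is copied unchanged from the PDA.
                delta[((p, q), sym, top)] = {((p2, q2), push) for (p2, push) in moves}
    accept = {(p, q) for p in pda_accept for q in dfa_accept}
    return delta, (pda_start, dfa_start), accept


def pda_accepts(delta, start, accept, word, bottom="Z"):
    """Simulate a nondeterministic PDA by tracking all (state, stack) pairs."""
    configs = {(start, bottom)}  # the stack is a string; its top is the last character
    for sym in word:
        nxt = set()
        for state, stack in configs:
            if stack:
                for state2, push in delta.get((state, sym, stack[-1]), ()):
                    nxt.add((state2, stack[:-1] + push))
        configs = nxt
    return any(state in accept for state, _ in configs)


# Toy PDA for { a^n b^n : n >= 1 }. "F" marks the first a, "A" every later a.
pda_delta = {
    ("p0", "a", "Z"): {("p0", "ZF")},
    ("p0", "a", "F"): {("p0", "FA")},
    ("p0", "a", "A"): {("p0", "AA")},
    ("p0", "b", "A"): {("p1", "")},
    ("p1", "b", "A"): {("p1", "")},
    ("p0", "b", "F"): {("p2", "")},
    ("p1", "b", "F"): {("p2", "")},
}
# Toy DFA for "even number of a's".
dfa_delta = {("e", "a"): "o", ("o", "a"): "e", ("e", "b"): "e", ("o", "b"): "o"}

prod_delta, prod_start, prod_accept = product_pda(
    pda_delta, "p0", {"p2"}, dfa_delta, "e", {"e"}
)
```

The product accepts exactly the strings accepted by both machines, here a^n b^n with n even and positive.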

Related

Show that the language is decidable

How do I show this language
{<C,A,B> | C,A,B are DFAs, L(C) contains the shuffle of L(A) and L(B)}
is decidable ?
I believe that if I can construct automata for A and B, then I can get an automaton that accepts the shuffle of them.
I am also thinking about using emptiness testing but I have not made any progress yet.
1. Given DFAs A and B, construct a DFA D such that L(D) is equal to the shuffle of L(A) and L(B).
2. Then, construct two DFAs using the Cartesian Product Machine construction for the languages L(M1) = L(C) \ L(D) and L(M2) = L(D) \ L(C).
3. Determine which, if either, of L(M1) and L(M2) is empty.
if L(M1) is empty and L(M2) is empty, L(C) is the shuffle of L(A) and L(B)
if L(M1) is empty, L(C) is a subset of the shuffle of L(A) and L(B)
if L(M2) is empty, L(C) is a superset of the shuffle of L(A) and L(B)
To do #1: create a new DFA whose states are triples (x, y, z) where:
x is a state from A
y is a state from B
z is either 1 or 2
The initial state of the DFA will be (qi_A, qi_B, 1). The input alphabet will be the union of the input alphabets of A and B. The transitions will be such that:
f((x,y,1), a) = (x',y,2) where f(x,a) = x' in machine A
f((x,y,2), b) = (x,y',1) where f(y,b) = y' in machine B
The accepting states shall be the triples (x, y, 1) in which x is accepting in A and y is accepting in B. Note that the z component forces symbols to be taken strictly alternately from A and B.
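The triple-state construction can be sketched as follows. This is a rough illustration, assuming each DFA is given as a tuple (delta, start, accepting) with delta a dictionary from (state, symbol) to state; a missing entry means the machine rejects. The sketch takes the accepting states to be the triples (x, y, 1) with x accepting in A and y accepting in B, which is an assumption of this example, and it models the strictly alternating shuffle that the z component enforces. The function names and the toy machines are hypothetical.

```python
def alternating_shuffle(dfa_a, dfa_b):
    """Build the machine whose states are triples (x, y, z), z in {1, 2}."""
    delta_a, start_a, acc_a = dfa_a
    delta_b, start_b, acc_b = dfa_b
    states_a = {start_a} | {s for s, _ in delta_a} | set(delta_a.values())
    states_b = {start_b} | {s for s, _ in delta_b} | set(delta_b.values())
    delta = {}
    # z = 1: machine A consumes the symbol, then the turn passes to B.
    for (x, sym), x2 in delta_a.items():
        for y in states_b:
            delta[((x, y, 1), sym)] = (x2, y, 2)
    # z = 2: machine B consumes the symbol, then the turn passes back to A.
    for (y, sym), y2 in delta_b.items():
        for x in states_a:
            delta[((x, y, 2), sym)] = (x, y2, 1)
    accepting = {(x, y, 1) for x in acc_a for y in acc_b}
    return delta, (start_a, start_b, 1), accepting


def run(dfa, word):
    """Run a (possibly partial) DFA; a missing transition rejects."""
    delta, state, accepting = dfa
    for sym in word:
        if (state, sym) not in delta:
            return False
        state = delta[(state, sym)]
    return state in accepting


# Toy example: A accepts a*, B accepts b*, so D accepts (ab)*.
dfa_a = ({("p", "a"): "p"}, "p", {"p"})
dfa_b = ({("q", "b"): "q"}, "q", {"q"})
dfa_d = alternating_shuffle(dfa_a, dfa_b)
```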
To do #2: create a new DFA whose states are pairs (x, y) where:
x is a state from D
y is a state from C
The initial state of the DFA will be (qi_D, qi_C). The input alphabet will be the union of the input alphabets of D and C. The transitions will be such that:
f((x,y),c) = (x',y') where f(x,c) = x' in D and f(y,c) = y' in C.
The accepting states will be:
states that are accepting in D but not C, for L(D) \ L(C)
states that are accepting in C but not D, for L(C) \ L(D)
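A small sketch of this pair construction for L(D) \ L(C), assuming both DFAs are complete over a shared alphabet and are given as (delta, start, accepting) tuples. The representation, the function names and the two toy machines are assumptions for illustration only.

```python
def difference_dfa(dfa_d, dfa_c, alphabet):
    """Product DFA accepting L(D) minus L(C); both inputs must be complete."""
    delta_d, start_d, acc_d = dfa_d
    delta_c, start_c, acc_c = dfa_c
    states_d = {start_d} | {s for s, _ in delta_d} | set(delta_d.values())
    states_c = {start_c} | {s for s, _ in delta_c} | set(delta_c.values())
    # Both components read the same symbol in lockstep.
    delta = {
        ((x, y), sym): (delta_d[(x, sym)], delta_c[(y, sym)])
        for x in states_d
        for y in states_c
        for sym in alphabet
    }
    # Accept when D accepts and C does not.
    accepting = {(x, y) for x in acc_d for y in states_c - set(acc_c)}
    return delta, (start_d, start_c), accepting


def run(dfa, word):
    """Run a complete DFA on a word."""
    delta, state, accepting = dfa
    for sym in word:
        state = delta[(state, sym)]
    return state in accepting


# Toy machines: D accepts strings containing an a, C accepts strings containing a b.
dfa_d = ({("d0", "a"): "d1", ("d0", "b"): "d0",
          ("d1", "a"): "d1", ("d1", "b"): "d1"}, "d0", {"d1"})
dfa_c = ({("c0", "a"): "c0", ("c0", "b"): "c1",
          ("c1", "a"): "c1", ("c1", "b"): "c1"}, "c0", {"c1"})
m2 = difference_dfa(dfa_d, dfa_c, "ab")
```

Swapping the two arguments gives the machine for L(C) \ L(D).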
To do #3:
You can minimize the DFAs using the well-known DFA minimization algorithm. Iff you end up with a DFA that has a single non-accepting state, the language is empty.
You can try all input strings up to the pumping length for the resulting DFA (strings that don't cause the DFA to enter any state more than once). If none of these are accepted by the DFA, then the DFA accepts no strings and the language is empty.
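The second option can be sketched directly: a DFA with n states that accepts anything accepts some string of length less than n, so it is enough to try those. This is an illustrative, exponential-time sketch; the (delta, start, accepting) representation and the function name are assumptions, and state names are assumed never to be None.

```python
from itertools import product


def is_empty(delta, start, accepting, alphabet):
    """Return True iff no string shorter than the number of states is accepted."""
    states = {start} | {s for s, _ in delta} | set(delta.values())
    for length in range(len(states)):  # lengths 0 .. n-1
        for word in product(alphabet, repeat=length):
            state = start
            for sym in word:
                state = delta.get((state, sym))
                if state is None:  # missing transition: this word is rejected
                    break
            else:
                if state in accepting:
                    return False
    return True


# Toy DFA whose shortest accepted string is "aa".
chain = {("s0", "a"): "s1", ("s1", "a"): "s2", ("s2", "a"): "s2"}
# Toy DFA whose only accepting state is unreachable.
stuck = {("t0", "a"): "t0", ("t1", "a"): "t1"}
```

A reachability search from the initial state would answer the same question in time linear in the size of the DFA.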

Regular Languages and Concatenation

Regular languages are closed under concatenation - this is demonstrable by having the accepting state(s) of one language with an epsilon transition to the start state of the next language.
If we consider the language L = {a^n | n >=0}, this language is regular (it is simply a*). If we concatenate it with another language L = {b^n | n >=0}, which is also regular, we end up with a^nb^n, but we obviously know this isn't regular.
Where am I going wrong with my logic here?
The definition of the concatenation of two languages L1 and L2 is the set of all strings wx where w ∈ L1 and x ∈ L2. This means that L1L2 consists of all possible strings formed by pairing one string from L1 and one string from L2, which isn't necessarily the same as pairing up matching strings from each language.
As a result, as @Oli Charlesworth pointed out, the language you get back here isn't actually { a^n b^n | n in N }. Instead, it's the language { a^n b^m | n in N and m in N }, which is the language a*b*. This language is regular, since it's given by the regular expression a*b*.
Hope this helps!
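To make the definition concrete, here is a small sketch that concatenates finite fragments of the two languages (exponents 0 through 3, an arbitrary cutoff chosen for illustration). The helper name concat and the variable names are hypothetical.

```python
import re


def concat(lang1, lang2):
    """All strings wx with w taken from lang1 and x taken from lang2."""
    return {w + x for w in lang1 for x in lang2}


a_fragment = {"a" * n for n in range(4)}  # finite slice of a*
b_fragment = {"b" * m for m in range(4)}  # finite slice of b*
both = concat(a_fragment, b_fragment)

# Strings with unequal counts, such as "aab", are included,
# and every result matches the regular expression a*b*.
unequal = sorted(w for w in both if w.count("a") != w.count("b"))
all_match = all(re.fullmatch(r"a*b*", w) for w in both)
```

The exponents n and m vary independently, which is why the result is a*b* rather than a^n b^n.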

Is Wikipedia's Astar reference implementation incomplete? It seems to omit properly updating cheaper paths

I want to implement A* and I looked to Wikipedia for a reference.
It looks like it can fail in the following case. Consider three nodes, A, B, and C.
START -> A -> C -> GOAL
  |      ^
  \-> B
The path costs are:
START -> A : 10
START -> B : 1
B -> A : 1
A -> C : 100
C -> GOAL : 100
Clearly the solution is START -> B -> A -> C -> GOAL but what happens if the heuristic lets us expand A before expanding B?
Our heuristic costs are as follows (note these are all underestimates)
A -> GOAL : 10
B -> GOAL : 50
When A is expanded, the true cost to C will turn out to be higher than B's heuristic cost, and so B will be expanded before C.
Fine, right?
The problem I see is that when we expand B and replace the datum "A comes from START with cost 10" with "A comes from B with cost 2", we aren't also updating "C comes from A with cost 110" to "C comes from A with cost 102". There is nothing in Wikipedia's pseudocode that looks like it will forward-propagate the cheaper path. Now imagine another node D that can reach C with cost 105: it will erroneously override "C comes from A with cost 110".
Am I reading this wrong or does Wikipedia need to be fixed?
If you are using graph search, i.e. you remember which nodes you have visited and don't allow revisiting them, then the problem is that your heuristic is not consistent. The article says that for a heuristic to be consistent, the following needs to hold:
h(x) <= d(x, y) + h(y) for all adjacent nodes x, y
In your case h(B) = 50 violates this, since d(B, A) + h(A) = 1 + 10 = 11 < 50. Hence your heuristic is inconsistent and A* with a closed set wouldn't work in this case, as you rightly noticed and as is also mentioned in the Wikipedia article: http://en.wikipedia.org/wiki/A%2a_search_algorithm#Properties.
If you are using tree search, i.e. you allow the algorithm to revisit the nodes, the following will happen:
Add A and B to the queue, score(A) = 10 + 10 = 20, score(B) = 1 + 50 = 51.
Pick A from queue as it has smallest score. Add C to the queue with score(C) = 10 + 100 + h(C).
Pick B from the queue as it is now the smallest. Add A to the queue with score(A) = 2 + 10 = 12.
Pick A from the queue as it is now again smallest. Notice that we are using tree search algorithm, so we can revisit nodes. Add C to the queue with score(C) = 1 + 1 + 100 + h(C).
Now we have 2 elements in the queue, C via A with score 110 + h(C) and C via B and A with score 102 + h(C), so we pick the correct path to C via B and A.
The Wikipedia pseudocode is the first case, i.e. graph search. And they indeed state right under the pseudocode that:
Remark: the above pseudocode assumes that the heuristic function is monotonic (or consistent, see below), which is a frequent case in many practical problems, such as the Shortest Distance Path in road networks. However, if the assumption is not true, nodes in the closed set may be rediscovered and their cost improved. In other words, the closed set can be omitted (yielding a tree search algorithm) if a solution is guaranteed to exist, or if the algorithm is adapted so that new nodes are added to the open set only if they have a lower f value than at any previous iteration.
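The two points above can be checked on the example graph with a short sketch. This is not Wikipedia's pseudocode: it is a variant without a closed set, in which a node is pushed again whenever a cheaper path to it is found. The values h(START) = 0, h(C) = 0 and h(GOAL) = 0 are assumptions, since the question does not give them, and the function names are hypothetical.

```python
import heapq


def inconsistent_edges(graph, h):
    """Edges (x, y) that violate h(x) <= d(x, y) + h(y)."""
    return [(x, y) for x, nbrs in graph.items()
            for y, d in nbrs.items() if h[x] > d + h[y]]


def a_star_reopening(graph, h, start, goal):
    """A* without a closed set: nodes are re-queued when their cost improves."""
    g = {start: 0}
    came_from = {}
    open_heap = [(h[start], start)]
    while open_heap:
        f, node = heapq.heappop(open_heap)
        if f > g[node] + h[node]:
            continue  # stale queue entry left over from a more expensive path
        if node == goal:
            path = [node]
            while node in came_from:
                node = came_from[node]
                path.append(node)
            return g[goal], path[::-1]
        for nbr, cost in graph.get(node, {}).items():
            tentative = g[node] + cost
            if tentative < g.get(nbr, float("inf")):
                g[nbr] = tentative
                came_from[nbr] = node
                heapq.heappush(open_heap, (tentative + h[nbr], nbr))
    return None


graph = {
    "START": {"A": 10, "B": 1},
    "B": {"A": 1},
    "A": {"C": 100},
    "C": {"GOAL": 100},
}
h = {"START": 0, "A": 10, "B": 50, "C": 0, "GOAL": 0}  # START, C, GOAL assumed

bad_edges = inconsistent_edges(graph, h)
result = a_star_reopening(graph, h, "START", "GOAL")
```

On this graph the only inconsistent edge is B -> A, and the reopening variant still recovers the path through B because A is re-queued with its cheaper cost before C is expanded.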

How many distinct Boolean functions are representable by a threshold perceptron?

It states that there are 2^2^n distinct Boolean functions of n inputs. The question is, how many of these are representable by a threshold perceptron?
Would not the answer be all? I say this because a perceptron is the same as a hard threshold where z = mx1 + c - x2 and threshold(z) = 1 if z>=0 and threshold(z) = 0 if z<0.
All, if the perceptron contains at least one hidden layer. If there is only a single layer, then it can only represent linearly separable functions (which exclude XOR, for example).
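For small n the single-layer case can be checked by brute force. The sketch below enumerates integer weights and biases in a small range (an assumption that is sufficient for n = 2 but not for larger n) and collects the distinct truth tables produced by threshold(z) = 1 if z >= 0. The function name and the weight range are hypothetical choices.

```python
from itertools import product


def representable_functions(n, weight_range=range(-2, 3)):
    """Truth tables a single threshold unit can produce with small integer weights."""
    inputs = list(product((0, 1), repeat=n))
    tables = set()
    for params in product(weight_range, repeat=n + 1):
        *weights, bias = params
        table = tuple(
            int(sum(w * x for w, x in zip(weights, xs)) + bias >= 0)
            for xs in inputs
        )
        tables.add(table)
    return tables


tables_2 = representable_functions(2)
xor = (0, 1, 1, 0)   # outputs for inputs (0,0), (0,1), (1,0), (1,1)
xnor = (1, 0, 0, 1)
```

For n = 2 this finds 14 of the 16 Boolean functions; the two missing ones are XOR and XNOR, which are not linearly separable.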

Context Free pumping lemma

Is the following language context free?
L = {a^i b^k c^r d^s | i+s = k+r, i,k,r,s >= 0}
I've tried to come up with a context-free grammar to generate this but I cannot, so I'm assuming it's not context free. Here is my attempted proof by contradiction:
Assume that L is context free,
Let p be the constant given by the pumping lemma,
Choose string S = a^p b^p c^p d^p where S = uvwxy
As |vwx| <= p, vwx can contain at most two distinct symbols:
case a) vwx contains only a single type of symbol, therefore uv^2wx^2y will result in i+s != k+r
case b) vwx contains two types of symbols:
i) vwx is composed of b's and c's, therefore uv^2wx^2y will result in i+s != k+r
Now my problem is that if vwx is composed of a's and b's, or of c's and d's, then pumping won't necessarily take the string out of the language, as i and k, or s and r, could increase in unison, keeping i+s == k+r.
Am I doing something wrong or is this a context free language?
I can't come up with a CFG to generate that particular language off the top of my head either, but we know that a language is context free iff some pushdown automaton recognizes it.
Designing such a PDA won't be too difficult. Some ideas to get you started:
We know i+s = k+r. Equivalently, i-k-r+s = 0 (I wrote it in that order since that is the order in which they appear). The crux of the problem is deciding what to do with the stack if (k+r) > i.
If you aren't familiar with PDA's or cannot use them to answer the problem, at least you know now that it is Context Free.
Good luck!
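One way to handle the case (k+r) > i is to let the stack hold a second kind of marker that records a deficit. The sketch below simulates such a PDA in Python: "+" markers stand for unmatched a's or d's and "-" markers for unmatched b's or c's. The marker names, the function name and the acceptance-by-empty-stack convention are assumptions of this illustration.

```python
from itertools import product


def pda_accepts(word):
    """Simulate a one-stack PDA for { a^i b^k c^r d^s : i + s = k + r }."""
    order = "abcd"
    phase = 0   # finite control: index of the latest block reached
    stack = []  # holds only "+" markers or only "-" markers at any time
    for ch in word:
        if ch not in order or order.index(ch) < phase:
            return False  # symbols must appear in the order a, b, c, d
        phase = order.index(ch)
        mark = "+" if ch in "ad" else "-"
        if stack and stack[-1] != mark:
            stack.pop()         # cancel against an opposite marker
        else:
            stack.append(mark)  # record a surplus (or a deficit)
    return not stack            # empty stack means i - k - r + s = 0


def in_language(word):
    """Direct check of the definition, used only to compare against the PDA."""
    i = len(word) - len(word.lstrip("a"))
    rest = word[i:]
    k = len(rest) - len(rest.lstrip("b"))
    rest = rest[k:]
    r = len(rest) - len(rest.lstrip("c"))
    rest = rest[r:]
    s = len(rest)
    return rest == "d" * s and i + s == k + r


mismatches = [
    "".join(w)
    for n in range(7)
    for w in product("abcd", repeat=n)
    if pda_accepts("".join(w)) != in_language("".join(w))
]
```

Because the stack never mixes the two marker types, its height is always the absolute value of the running total i - k - r + s.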
Here is a grammar that generates this language:
S -> XY
S -> EZ
X -> aXc
X -> E
E -> aEb
E -> ε
Y -> cYd
Y -> ε
Z -> bZd
Z -> Y

Resources