Why is this DFA not accepted? - dfa

So I am supposed to construct a DFA that accepts the set of all strings over {0,1} of length 2:
So sigma = {0,1}
L = {00, 01, 10, 11}
What I initially tried was :
Why is it wrong?

Assuming B is the final state, your DFA also accepts strings of length 1. That's why it is wrong. Add one more state to mark length 2 and it should be good.
A --0,1--> B --0,1--> (C) --0,1--> D
( ) - accepting state
Also make sure any string of length 3 or more ends in a non-accepting state by adding a dead state D.

This is the DFA you want:
This way, if another input symbol is present after the second one - you go from C to D and can never go back to an accepting state.
By the way, and perhaps more importantly: I didn't draw this image myself. I used Graphviz, which is a neat tool you might use to visualize your DFAs. The code is:
digraph G {
    node [shape="circle"];
    Start [shape="none" label=""];
    C [shape="doublecircle"];
    Start -> A;
    A -> B [label="0,1"];
    B -> C [label="0,1"];
    C -> D [label="0,1"];
    D -> D [label="0,1"];
}
and there are online renderers even, like this one.
The "Start" node is a hack to get an arrow to point at the initial state of the DFA (this is why the image is centered on B despite there being 4 states).
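If you want to check the machine without drawing it, here is a minimal Python sketch of the same four-state DFA. The names TRANSITIONS, START, ACCEPTING and dfa_accepts are made up for this illustration; the states and edges are the ones from the Graphviz code above.

```python
# Transition table for the DFA A -> B -> (C) -> D, with D as the dead state.
# All names here are illustrative, not part of any library.
TRANSITIONS = {
    ("A", "0"): "B", ("A", "1"): "B",
    ("B", "0"): "C", ("B", "1"): "C",
    ("C", "0"): "D", ("C", "1"): "D",
    ("D", "0"): "D", ("D", "1"): "D",
}
START = "A"
ACCEPTING = {"C"}


def dfa_accepts(word: str) -> bool:
    """Run the DFA on `word` (a string over {0,1}) and report acceptance."""
    state = START
    for symbol in word:
        state = TRANSITIONS[(state, symbol)]
    return state in ACCEPTING
```

Running it over the strings 00, 01, 10 and 11 should report acceptance, while anything shorter or longer ends in A, B or D and is rejected.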

k messed sorted array using Insertion sort

In the familiar problem where every element in an array is at most k positions away from its correct location, either to the left or to the right, I don't completely understand how the insertion sort algorithm works.
I drew it on paper and stepped through it. It does seem to work, and the time complexity is O(nk) as well.
But my question is this: in this problem, an element could be up to k positions away on either the left or the right, yet insertion sort only checks to the left. How does it still manage to get it right? Could you please explain, in a way I can convince myself of, why the algorithm still works even though we only look to the left?
PS : Irrelevant here : If I wasn't aware of this algorithm, I would've thought of something like selection sort, where for a given element i, you look k positions to the left and right to choose the smallest element.
PS : Even more irrelevant : I'm aware of the min-heap based method for this. Again, the same question: why do we only look left?
Items to the left of the correct location get pushed right while the algorithm is processing smaller items. (Assuming that we're sorting in ascending order.) For example, if k is 3 and the initial array is
D A B E H C F G
Let's examine how D gets to location 3 in the array. (Using zero-based indexing, D starts at index 0, and needs to move to index 3 in the array.)
The pass that inserts A swaps A and D, resulting in
A D B E H C F G
The pass that inserts B swaps B and D
A B D E H C F G
E and H are each larger than everything to their left, so their passes change nothing. The pass that inserts C swaps C with H, E, and D
A B C D E H F G
And now you see that D is already where it's supposed to be.
That's always going to be the case. Any element that starts to the left of its final position will be pushed right (to its final position) as smaller elements are processed. Otherwise, the smaller elements aren't in their correct locations. And an element (like D) won't get pushed past its correct location, because the algorithm won't swap that element with elements that are larger.
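To watch this happen, here is a small Python sketch of plain swap-based insertion sort that records the array after each pass. The function name insertion_sort_trace is just something I made up for the illustration; the input is the example array from above.

```python
def insertion_sort_trace(items):
    """Sort a copy of `items` with insertion sort.

    Returns the sorted list and a snapshot string of the array after
    each pass (one pass per inserted element, starting at index 1).
    """
    a = list(items)
    snapshots = []
    for i in range(1, len(a)):
        j = i
        # Only look left: swap the new element down past larger neighbours.
        while j > 0 and a[j - 1] > a[j]:
            a[j - 1], a[j] = a[j], a[j - 1]
            j -= 1
        snapshots.append("".join(a))
    return a, snapshots


result, snapshots = insertion_sort_trace("DABEHCFG")
for line in snapshots:
    print(line)
```

D never moves because of its own pass. It is shifted right once each by the passes for A, B and C, which is exactly the three positions it needed to travel.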
In insertion sort, we compare each element with the elements to its left. As a result, the part of the list to the left of that element is sorted, while the part to its right is not yet. Then we move on to the next element and repeat the same process until we reach the last element in the list, at which point the whole list is sorted.
It's a wrong assumption that you only need to look left. You can also start from the beginning index and look right.
But there's something called a loop invariant (read more about it). The invariant of insertion sort is that it keeps a sorted subarray (growing as the algorithm runs) to the left or to the right of the current element.
Here's the link to a read which will clear it up. https://www.hackerrank.com/challenges/correctness-invariant

Stop Matlab from treating a 1xn matrix as a column vector

I'm very frustrated with MATLAB right now. Let me illustrate the problem. I'm going to use informal notation here.
I have a column cell vector of strings called B. For now, let's say B = {'A';'B';'C';'D'}.
I want to have a matrix G, which is m-by-n, and I want to replace the numbers in G with the respective elements of B... For example, let's say G is [4 3; 2 1]
Let's say I have a variable n which says how many rows of G I want to take out.
When I do B(G(1:2,:)), I get what I want ['D' 'C'; 'B' 'A']
However, if I do B(G(1:1,:)) I get ['D';'C'] when what I really want to get is ['D' 'C']
I am using 1:n, and I want it to have the same behavior for n = 1 as it does for n = 2 and n = 3. Basically, G actually is an n-by-1500 matrix, and I want to take the top n rows and use them as indexes into B.
I could use an if statement that transposes the result if n = 1 but that seems so unnecessary. Is there really no way to make it so that it stops treating my 1-by-n matrix as if it was a column vector?
According to this post by Loren Shure:
Indexing with one array C = A(B) produces output the size of B unless both A and B are vectors.
When both A and B are vectors, the number of elements in C is the number of elements in B and with orientation of A.
You are in the second case, hence the behaviour you see.
To make it work, the output needs to keep as many columns as G has. To achieve that, you can do something like this:
out = reshape(B(G(1:n,:)),[],size(G,2))
Thus, with n = 1:
out =
'D' 'C'
With n = 2:
out =
'D' 'C'
'B' 'A'
I think this will only happen in the 1-D case. By default, MATLAB returns a column vector here, since it stores matrices in column-major order. If you want a row vector, you can just transpose the result. In my opinion, it should be fine when n > 1.

Is Wikipedia's Astar reference implementation incomplete? It seems to omit properly updating cheaper paths

I want to implement A* and I looked to Wikipedia for a reference.
It looks like it can fail in the following case. Consider three nodes, A, B, and C.
START -> A -> C -> GOAL
  |      ^
  \----> B
The path costs are:
START -> A : 10
START -> B : 1
B -> A : 1
A -> C : 100
C -> GOAL : 100
Clearly the solution is START -> B -> A -> C -> GOAL but what happens if the heuristic lets us expand A before expanding B?
Our heuristic costs are as follows (note these are all underestimates)
A -> GOAL : 10
B -> GOAL : 50
When A is expanded, the true cost to C will turn out to be higher than B's heuristic cost, and so B will be expanded before C.
Fine, right?
The problem I see is that when we expand B and replace the datum "A comes from START with cost 10" with "A comes from B with cost 2", we aren't also updating "C comes from A with cost 110" to "C comes from A with cost 102". There is nothing in Wikipedia's pseudocode that looks like it will forward-propagate the cheaper path. Now imagine another node D which can reach C with cost 105: it will erroneously override "C comes from A with cost 110".
Am I reading this wrong or does Wikipedia need to be fixed?
If you are using graph search, i.e. you remember which nodes you have visited and don't allow revisiting them, then the heuristic needs to be consistent, and yours is not. The article says that for a heuristic to be consistent, the following needs to hold:
h(x) <= d(x, y) + h(y) for all adjacent nodes x, y
In your case h(B) = 50 violates this, since d(B, A) + h(A) = 1 + 10 = 11 < 50. Hence your heuristic is inconsistent and graph-search A* isn't guaranteed to work in this case, as you rightly noticed and as is also mentioned in the Wikipedia article: http://en.wikipedia.org/wiki/A%2a_search_algorithm#Properties.
If you are using tree search, i.e. you allow the algorithm to revisit nodes, the following will happen:
Add A and B to the queue, score(A) = 10 + 10 = 20, score(B) = 1 + 50 = 51.
Pick A from the queue as it has the smallest score. Add C to the queue with score(C) = 10 + 100 + h(C).
Pick B from the queue as it is now the smallest. Add A to the queue with score(A) = 2 + 10 = 12.
Pick A from the queue as it is again the smallest. Notice that we are using a tree search algorithm, so we can revisit nodes. Add C to the queue with score(C) = 1 + 1 + 100 + h(C).
Now we have 2 elements in the queue, C via A with score 110 + h(C) and C via B and A with score 102 + h(C), so we pick the correct path to C via B and A.
The Wikipedia pseudocode is the first case, i.e. graph search. And they indeed state right under the pseudocode that:
Remark: the above pseudocode assumes that the heuristic function is monotonic (or consistent, see below), which is a frequent case in many practical problems, such as the Shortest Distance Path in road networks. However, if the assumption is not true, nodes in the closed set may be rediscovered and their cost improved. In other words, the closed set can be omitted (yielding a tree search algorithm) if a solution is guaranteed to exist, or if the algorithm is adapted so that new nodes are added to the open set only if they have a lower f value than at any previous iteration.
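For what it's worth, here is a rough Python sketch of the "rediscover and improve" variant on the graph from the question. It is not Wikipedia's pseudocode: it keeps no closed set and simply re-queues a node whenever a cheaper path to it turns up. The question gives no heuristic values for START, C or GOAL, so I assumed 0 for all three; a_star, GRAPH and H are names invented for this example.

```python
import heapq

# Edge costs from the question.
GRAPH = {
    "START": {"A": 10, "B": 1},
    "B": {"A": 1},
    "A": {"C": 100},
    "C": {"GOAL": 100},
}
# h(A) and h(B) come from the question; the other values are ASSUMED to be 0.
H = {"START": 0, "A": 10, "B": 50, "C": 0, "GOAL": 0}


def a_star(graph, h, start, goal):
    """A* without a closed set: a node is re-queued on every improvement."""
    best_g = {start: 0}
    came_from = {}
    open_heap = [(h[start], start)]
    while open_heap:
        f, node = heapq.heappop(open_heap)
        if f > best_g[node] + h[node]:
            continue  # stale queue entry, a cheaper path was found later
        if node == goal:
            path = [node]
            while node in came_from:
                node = came_from[node]
                path.append(node)
            return best_g[goal], path[::-1]
        for neighbour, cost in graph.get(node, {}).items():
            g = best_g[node] + cost
            if g < best_g.get(neighbour, float("inf")):
                best_g[neighbour] = g
                came_from[neighbour] = node
                heapq.heappush(open_heap, (g + h[neighbour], neighbour))
    return None


print(a_star(GRAPH, H, "START", "GOAL"))
```

On this graph A is expanded twice, first with cost 10 and then with cost 2, and the second expansion is what pushes the cheaper cost 102 on to C.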

Algorithm for maintaining an "ordering string" for ordering database elements

I have a database in which I'd like to store an arbitrary ordering for a particular element. The database in question doesn't support ordered sets, so I have to do this myself.
One way to do this would be to store a float value for the element's position, and then take the average of the position of the surrounding elements when inserting a new one:
Item A - Position 1
Item B - Position 1.5 (just inserted).
Item C - Position 2
Now, for various reasons I don't wish to use floats, I'd like to use strings instead. For example:
Item A - Position a
Item B - Position aa (just inserted).
Item C - Position b
I'd like to keep these strings as short as possible since they will never be "tidied up".
Can anyone suggest an algorithm for generating such strings as efficiently and compactly as possible?
Thanks,
Tim
It would be reasonable to assign position 'am' or 'an' to Item B and use binary division steps for further insertions.
This resembles a base-26 number system, where the symbols 'a'..'z' correspond to 0..25.
a b //0 1
a an b //insert after a - middle letter of alphabet
a an au b //insert after an
a an ar au b //insert after an again (middle of an, au)
a an ap ar au b //insert after an again
a an ao ap ar au b //insert after an again
a an ann ao... //insert after an: there is no more room after an, so we have to use a 3-symbol label
....
a an anb... //to insert after an, we treat it as ana
a an anan anb //it looks like 0 0.5 0.505 0.51
Pseudocode for binary tree structure:
function InsertAndGetStringKey(Root, Element): string
    if Root = nil then
        return Middle('a', 'z') //'n'
    if Element > Root then
        if Root.Right = nil then
            return Middle(Root.StringKey, 'z')
        else
            return InsertAndGetStringKey(Root.Right, Element)
    if Element < Root then
        if Root.Left = nil then
            return Middle(Root.StringKey, 'a')
        else
            return InsertAndGetStringKey(Root.Left, Element)
Middle(x, y):
    //equalize length of strings like (an, anf)-> (ana, anf)
    L = Length(x) - Length(y)
    if L < 0 then
        y = y + StringOf('a', -L) //y is shorter: y + 'aaaaa...' -L times
    else if L > 0 then
        x = x + StringOf('a', L) //x is shorter
    if Abs(LastSymbol(x) - LastSymbol(y)) = 1 then
        return(Min(x, y) + 'n') // (anf, ang) - > anfn
    else
        return(AllButLast(Min(x, y)) + (LastSymbol(x) + LastSymbol(y))/2) // (nf, ni)-> ng; AllButLast drops the final symbol
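As a runnable counterpart, here is a minimal Python sketch of a "key between two neighbours" function over 'a'..'z'. It is a different technique from the tree pseudocode above: it treats keys as base-26 fractions and returns a short key strictly between two existing ones. It assumes keys never end in 'a' (the zero digit), which is what guarantees there is always room between two neighbours. ALPHABET and key_between are names made up for this sketch, and an empty string / None stand for "no lower neighbour" / "no upper neighbour".

```python
from typing import Optional

ALPHABET = "abcdefghijklmnopqrstuvwxyz"  # 'a' plays the role of the digit 0


def key_between(lo: str, hi: Optional[str]) -> str:
    """Return a key k with lo < k < hi in plain string comparison.

    lo == "" means "no lower bound", hi is None means "no upper bound".
    Assumption: neither key ends in 'a'; the result never ends in 'a' either.
    """
    zero = ALPHABET[0]
    if hi is not None and lo >= hi:
        raise ValueError("lo must sort before hi")
    if lo.endswith(zero) or (hi is not None and hi.endswith(zero)):
        raise ValueError("keys must not end in 'a'")
    if hi is not None:
        # Strip the common prefix, padding lo with 'a' (zero) as needed.
        n = 0
        while n < len(hi) and (lo[n] if n < len(lo) else zero) == hi[n]:
            n += 1
        if n > 0:
            return hi[:n] + key_between(lo[n:], hi[n:])
    digit_lo = ALPHABET.index(lo[0]) if lo else 0
    digit_hi = ALPHABET.index(hi[0]) if hi is not None else len(ALPHABET)
    if digit_hi - digit_lo > 1:
        # There is a free symbol between the two leading symbols.
        return ALPHABET[(digit_lo + digit_hi) // 2]
    if hi is not None and len(hi) > 1:
        # Leading symbols are adjacent, but hi's first symbol alone fits.
        return hi[:1]
    # Leading symbols are adjacent: keep lo's symbol and extend the key.
    return ALPHABET[digit_lo] + key_between(lo[1:], None)
```

For example, key_between("", None) gives the first key, key_between(last_key, None) appends at the end, and key_between(left, right) inserts between two neighbours. Keys only grow longer when the gap between the neighbours is exhausted.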
As stated the problem has no solution. Once an algorithm has generated strings 'a' and 'aa' for adjacent elements there is no string which can be inserted between them. This is a fatal problem for the approach. This problem is independent of the alphabet used for the strings: replace 'a' by 'the first letter in the alphabet used' if you wish.
Of course, it can be worked around by changing the ordering string for other elements when this impasse is reached, but that seems to be beyond what OP wants.
I think that the problem is equivalent to finding an integer to represent the order of an element and finding that, say, 35 and 36 are already used to order existing elements. There is simply no integer between 35 and 36, no matter how hard you look.
Use real numbers, or a computer approximation such as floating-point numbers, or rationals.
EDIT in response to OP's comment
Just adapt the algorithm for adding 2 rationals: (a/b)+(c/d) = (ad+cb)/bd. Take (ad+cb)/(2bd), i.e. half of that sum (rounding if you want or need), and you have a rational midway between the first two.
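In Python, for instance, the standard library's fractions.Fraction already does exact rational arithmetic, so a sketch of the midpoint is very short. The function name midpoint is just for illustration.

```python
from fractions import Fraction


def midpoint(p: Fraction, q: Fraction) -> Fraction:
    """Exact rational halfway between p and q: (a/b + c/d) / 2."""
    return (p + q) / 2


# There is no integer between 35 and 36, but there is always a rational.
print(midpoint(Fraction(35), Fraction(36)))
```

The numerator and denominator can be stored as two integer columns (or as one string), and they grow slowly as long as insertions don't keep hitting the same gap.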
Are capitals an option?
If so, I would use them to insert between otherwise adjacent values.
For instance to insert between
a
aa
You could do:
a
aAaa <--- this capital indicates there is one more place between adjacent lowercase values, i.e. a[Aa]a
aAba
aAca
aBaa
aBba
aa
Now if you need to insert between a and aAaa
You could do
a
aAAaaa <--- two capitals indicate there are two more places between adjacent lowercase values, i.e. a[AAaa]a
aAAaba
aAAaca
...
aAAbaa
aAaa
In terms of being compact or efficient I make no claims...

Context Free pumping lemma

Is the following language context free?
L = {a^i b^k c^r d^s | i+s = k+r, i,k,r,s >= 0}
I've tried to come up with a context-free grammar to generate this but I cannot, so I'm assuming it's not context free. As for my proof by contradiction:
Assume that L is context free,
Let p be the constant given by the pumping lemma,
Choose string S = a^p b^p c^p d^p where S = uvwxy
As |vwx| <= p, vwx can contain at most two distinct symbols:
case a) vwx contains only a single type of symbol, therefore uv^2wx^2y will result in i+s != k+r
case b) vwx contains two types of symbols:
i) vwx is composed of b's and c's, therefore uv^2wx^2y will result in i+s != k+r
Now my problem is that if vwx is composed of either a's and b's, or c's and d's, then pumping them won't necessarily break the language, as i and k or s and r could increase in unison, resulting in i+s == k+r.
Am I doing something wrong or is this a context free language?
I can't come up with a CFG to generate that particular language off the top of my head either, but we know that a language is context free iff some pushdown automaton recognizes it.
Designing such a PDA won't be too difficult. Some ideas to get you started:
We know i+s=k+r. Equivalently, i-k-r+s = 0 (I wrote it in that order since that is the order in which they appear). The crux of the problem is deciding what to do with the stack if (k+r)>i.
If you aren't familiar with PDA's or cannot use them to answer the problem, at least you know now that it is Context Free.
Good luck!
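To make the stack idea concrete, here is a hedged Python sketch that simulates one possible PDA for L. It is only one way to handle the (k+r)>i case: once the a's on the stack run out, each further b or c pushes a "debt" marker that a later d has to pay off. The finite-state part (checking the a*b*c*d* shape) is done with a regular expression to keep it short. pda_accepts and the markers "A" and "X" are names invented for the illustration.

```python
import re


def pda_accepts(word: str) -> bool:
    """Simulate a single-stack recognizer for a^i b^k c^r d^s with i+s = k+r."""
    # Finite control: symbols must appear in the order a*, b*, c*, d*.
    if not re.fullmatch(r"a*b*c*d*", word):
        return False
    stack = []
    for ch in word:
        if ch == "a":
            stack.append("A")      # credit: one a waiting for a b or c
        elif ch in "bc":
            if stack and stack[-1] == "A":
                stack.pop()        # a b or c cancels one a
            else:
                stack.append("X")  # debt: must be cancelled by a later d
        else:  # ch == "d"
            if stack and stack[-1] == "X":
                stack.pop()        # a d pays off one b or c
            else:
                return False       # a d with nothing to pay off
    return not stack               # accept on empty stack


print(pda_accepts("abcd"), pda_accepts("ad"))
```

The stack only ever holds A's or X's, never both at once, so a single stack is enough, which is the point of the PDA argument.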
Here is a grammar that generates this language (S is the start symbol; S -> XY covers i <= k and S -> PZ covers i >= k):
S -> XY
S -> PZ
X -> aXb
X -> ε
Y -> bYd
Y -> Z
Z -> cZd
Z -> ε
P -> aPc
P -> X
