Concatenation of two languages in NP - concatenation

I am having a hard time understanding why the fact that the concatenation of two languages over an alphabet is in NP doesn't imply that each of the languages is itself in NP. I talked with my professor about the problem today, but I can't wrap my head around it. Can you help me out?

Here's a counterexample to the claim that if A and B are languages and AB ∈ NP, then A ∈ NP and B ∈ NP. First, consider all subsets of the set 1*. There are countably infinitely many strings in 1*, so there are uncountably many subsets of 1*. Since there are only countably many decidable languages, at least one of these subsets is undecidable; let that language be A. For B, choose 1*.
My claim is that AB is some language of the form { 1^n | n ≥ k } for some natural number k. To see this, let k be the length of the shortest string in A (A is nonempty, since the empty language is decidable). Then any string in AB has the form 1^(k+m) 1^r, where 1^(k+m) ∈ A and 1^r ∈ B. Such a string necessarily belongs to { 1^n | n ≥ k }. Similarly, if we take any string from { 1^n | n ≥ k }, we can see that it has the form 1^(k+m) = 1^k 1^m, where 1^k ∈ A and 1^m ∈ B. Therefore, AB = { 1^n | n ≥ k }.
The language AB is regular because it's given by the regular expression 1^k 1*, and every regular language is in P ⊆ NP. However, A is not in NP, because it's undecidable and all languages in NP are decidable.
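As a finite sanity check of the claim AB = { 1^n | n ≥ k }, here is a minimal Python sketch. It represents unary strings by their lengths; the sample set a_lengths, the bound, and the helper name concat_lengths are illustrative assumptions, and a finite sample can only mimic the undecidable A used in the argument.

```python
def concat_lengths(a_lengths, bound):
    """Lengths n <= bound such that 1^n is in A·1*, where A = {1^m : m in a_lengths}."""
    result = set()
    for m in a_lengths:
        # 1^m followed by any 1^r from B = 1* gives 1^(m+r).
        for r in range(bound - m + 1):
            result.add(m + r)
    return result


# Hypothetical sample of A: the strings 1^3, 1^7 and 1^12.
a_lengths = {3, 7, 12}
k = min(a_lengths)

# Every length from k up to the bound appears, and nothing shorter does.
print(concat_lengths(a_lengths, 20) == set(range(k, 21)))
```

Only the shortest string of A matters for the result, which is why the structure of A (decidable or not) has no influence on AB.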
Hope this helps!


Is the union of two non-regular languages regular?

Given two non-regular languages, is their union regular?
Also, why is L = L1 ∪ L2 = {a^i b^j | i,j >= 0} the union of L1 = {a^i b^j | i >= j} and L2 = {a^i b^j | i < j}?
Then, what is the union of L1 = {a^i b^j | i > j} and L2 = {a^i b^j | i < j}?
Given two non-regular languages, is their union regular?
Sometimes it is, sometimes it is not. Take any non-regular language L. Consider the complement of this language, L'. We know that L' is non-regular since the regular languages are closed under complementation (if L' were regular, then (L')' = L would also be regular, a contradiction). Because the union of a language and its complement is the universal language of all strings over the alphabet, which is a regular language, certainly some pairs of non-regular languages have a regular union. To see that not all pairs of non-regular languages have a regular union, consider the languages 0^n 1^n and a^n b^n on the shared alphabet {0, 1, a, b}. It is not hard to see that the union of these two disjoint non-regular languages cannot possibly be regular.
Also, why is L = L1 ∪ L2 = {a^i b^j | i,j >= 0} the union of L1 = {a^i b^j | i >= j} and L2 = {a^i b^j | i < j}?
For all integers i and j, either i <= j or j <= i (or both). The <= relation defines a total ordering of the integers. If it is not the case that i <= j, we may write i > j instead to mean not(i <= j). Thus, > can be thought of as a shorthand for a negation. Because L1 requires i >= j (just a different way of writing j <= i) and L2 requires i < j (just another way of writing j > i), and because j <= i and j > i are logical negations of each other, the union, which applies the disjunctive operator to these conditions, results in a tautology. That is, the resulting condition is always true (either i >= j or i < j, always), and so all that remains is the shared requirement implicit in the beginning of the set-builder notation: every string consists of some number of a's followed by some number of b's. Even adding i,j >= 0 to the first one is unnecessary, assuming a valid domain for i and j, and in fact the second and third languages don't bother mentioning this restriction.
Then, what is the union of L1 = {a^i b^j | i > j} and L2 = {a^i b^j | i < j}?
The only strings of the form a^i b^j that don't satisfy either i > j or i < j are precisely those for which i = j, that is, strings of the form a^n b^n. The union of L1 and L2, then, is the complement of the language a^n b^n within a*b*. Neither L1 nor L2 is regular, and we know that a^n b^n is context-free. In general, the complement of a context-free language may or may not be context-free; the context-free languages are not closed under complementation. Here, however, it's not hard to see that the union must be context-free, because the context-free languages are closed under union:
Take a CFG G1 for language L1.
Take a CFG G2 for language L2.
Make a CFG G for language L by adding a new start symbol and productions which produce each of the start symbols in G1 and G2, respectively.
Question 1: Is the union of two non-regular languages regular?
Sometimes. The regular, context-free, context-sensitive, recursive and recursively enumerable languages are closed under union. However, the deterministic context-free (DCFL) languages accepted by a deterministic pushdown automaton (DPDA) are not. The standard proof goes something like this. Consider the following languages:
L1 = {a^i b^j c^k : i,j,k ≥ 0}
L2 = {a^i b^i c^j : i,j ≥ 0}
L3 = {a^i b^j c^j : i,j ≥ 0}
L4 = {a^i b^i c^i : i ≥ 0}
The first language is regular, the second and third are DCFL, and the fourth is not context-free. Note that L4 = L1 ∩ L2 ∩ L3. If DCFL were closed under union, then since it is closed under complementation, the language
L4^c = L1^c ∪ L2^c ∪ L3^c
must be DCFL. By the same token, L4 must be DCFL. This is a contradiction, because L4 is not even context-free. Therefore, DCFL is not closed under union. Finally, we can apply De Morgan's laws and the fact that DCFL is closed under complementation to conclude that DCFL is not closed under intersection.
Conversely, there are non-regular languages whose union is regular. The answer to your second question shows that there are DCFL languages whose union is given by a*b*.
Question 2: The union of L1 = {a^i b^j : i ≥ j} and L2 = {a^i b^j : i < j}.
The union of L1 and L2 is L3 = {a^i b^j : i,j ≥ 0}. Since this is an equality involving sets, we must show that L1 ∪ L2 ⊆ L3 and L3 ⊆ L1 ∪ L2.
If u ∈ L1 ∪ L2 then u ∈ L1 or u ∈ L2. If u ∈ L1 then u = a^i b^j where i ≥ j. If u ∈ L2 then u = a^i b^j where i < j. In either case, u = a^i b^j where i,j ≥ 0. Thus, u ∈ L3.
If u ∈ L3 then u = a^i b^j where i,j ≥ 0. By trichotomy, either i ≥ j or i < j. If the former, then u ∈ L1. If the latter, then u ∈ L2. Thus, u ∈ L1 ∪ L2.
Question 3: The union of L1 = {a^i b^j : i > j} and L2 = {a^i b^j : i < j}
The union of L1 and L2 is the set of strings a^i b^j where i < j or i > j. This is equivalent to saying that i ≠ j by trichotomy. Therefore, L1 ∪ L2 = {a^i b^j : i ≠ j}.
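The two set identities above can be spot-checked by brute force on small exponents. The following is a minimal Python sketch, not a proof: the bound N and the helper name language are assumptions made for illustration, and only exponents below N are examined.

```python
N = 8  # only exponents 0..N-1 are checked


def language(pred):
    """All strings a^i b^j with 0 <= i, j < N whose exponents satisfy pred."""
    return {"a" * i + "b" * j for i in range(N) for j in range(N) if pred(i, j)}


ge = language(lambda i, j: i >= j)     # L1 of Question 2
lt = language(lambda i, j: i < j)      # L2 of Questions 2 and 3
gt = language(lambda i, j: i > j)      # L1 of Question 3
every = language(lambda i, j: True)    # a*b*, truncated at N
equal = language(lambda i, j: i == j)  # a^n b^n, truncated at N

# Question 2: the union is all of a*b* (within the bound).
print(ge | lt == every)
# Question 3: the union is a*b* minus the strings a^n b^n.
print(gt | lt == every - equal)
```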

Determine the adjacency of two Fibonacci numbers

I have many Fibonacci numbers. If I want to determine whether two Fibonacci numbers are adjacent or not, one basic approach is as follows:
Get the index of the first fibonacci number, say i1
Get the index of the second fibonacci number, say i2
Get the absolute value of i1-i2, that is |i1-i2|
If the value is 1, then return true.
else return false.
The first and second steps may need many comparisons to find the correct index by searching an array.
The third step needs one subtraction and one absolute-value operation.
I want to know whether there is another approach that determines the adjacency of two Fibonacci numbers more quickly.
I don't care whether this question is solved mathematically or by some hacking technique.
If anyone has an idea, please let me know. Thanks a lot!
There is no need to find the index of either number.
Given that the two numbers are distinct and both belong to the Fibonacci series, if their difference is greater than the smaller of the two, then they are not adjacent. Otherwise they are.
This is because the Fibonacci series follows this rule:
F(n) = F(n-1) + F(n-2), where F(n) > F(n-1) > F(n-2) for n >= 4 (taking F(1) = F(2) = 1).
So F(n) - F(n-1) = F(n-2),
=> Diff(n, n-1) = F(n-2) < F(n-1), whereas for k >= 2, Diff(n, n-k) >= F(n) - F(n-2) = F(n-1) > F(n-k).
The difference between two adjacent Fibonacci numbers is never greater than the smaller of the two (it equals the smaller one only for the pair 1, 2).
NOTE: This only holds if both numbers belong to the Fibonacci series.
Simply calculate the difference between them. If it is smaller than the smaller of the two numbers, they are adjacent; if it is bigger, they are not.
Each triplet a, b, c of consecutive numbers in the Fibonacci sequence conforms to the rule
c = a + b
So for every pair of adjacent Fibonaccis (x, y), the difference between them (y - x) is equal to the value of the previous Fibonacci, which is less than x (for the pair 1, 2 it is equal to x).
If two Fibonaccis, say (x, z), are not adjacent, then their difference must be greater than the smaller of the two. At minimum (if they are one Fibonacci apart), the difference is equal to the Fibonacci between them, which is of course greater than the smaller of the two numbers.
For four consecutive Fibonaccis (a, b, c, d):
since c = a + b
and d = b + c
then d - b = (b + c) - b = c
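The difference test described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: both arguments are positive Fibonacci numbers (nothing is verified), the function name adjacent_by_difference is made up for this example, and two equal inputs are treated as adjacent only for 1, 1 (the first two terms of the series).

```python
def adjacent_by_difference(x, y):
    """Adjacency test for two positive Fibonacci numbers, without looking up indices."""
    lo, hi = min(x, y), max(x, y)
    if lo == hi:
        # The same value twice is only adjacent for F(1) = F(2) = 1.
        return lo == 1
    # Adjacent: hi - lo is the Fibonacci just before lo, so it cannot exceed lo.
    # Not adjacent: hi - lo is at least the Fibonacci just after lo.
    return hi - lo <= lo


print(adjacent_by_difference(5, 8))   # True
print(adjacent_by_difference(8, 21))  # False
print(adjacent_by_difference(2, 1))   # True
```

The comparison uses <= rather than < so that the pair 1, 2 is handled, where the difference equals the smaller number.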
By Binet's formula, the nth Fibonacci number is approximately phi**n/sqrt(5), where phi is the golden ratio. You can use base-phi logarithms to recover the index easily:
from math import log, sqrt

def fibs(n):
    nums = [1, 1]
    for i in range(n - 2):
        nums.append(sum(nums[-2:]))
    return nums

phi = (1 + sqrt(5)) / 2

def fibIndex(f):
    return round(log(sqrt(5) * f, phi))
To test this:
for f in fibs(20): print(fibIndex(f),f)
Output:
2 1
2 1
3 2
4 3
5 5
6 8
7 13
8 21
9 34
10 55
11 89
12 144
13 233
14 377
15 610
16 987
17 1597
18 2584
19 4181
20 6765
Of course,
def adjacentFibs(f, g):
    return abs(fibIndex(f) - fibIndex(g)) == 1
This fails with 1,1 (both get index 2), but there is little point in adding special logic for such an edge case. Add it in if you want.
At some stage, floating-point round-off error will become an issue. For that, you would need to replace math.log by an integer log algorithm (e.g. one which involves binary search).
On Edit:
I concentrated on the question of how to recover the index (and I will keep the answer, since that is an interesting problem in its own right). But as @LeandroCaniglia points out in their excellent comment, this is overkill if all you want to do is check whether two Fibonacci numbers are adjacent. Another consequence of Binet's formula is that sufficiently large adjacent Fibonacci numbers have a ratio which differs from phi by a negligible amount. You could do something like:
def adjFibs(f, g):
    f, g = min(f, g), max(f, g)
    if g <= 34:
        return adjacentFibs(f, g)
    else:
        return abs(g / f - phi) < 0.01
This assumes that they are indeed Fibonacci numbers. The index-based approach can be used to verify that they are (calculate the index and then use the full-fledged Binet's formula with that index).
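Where floating-point round-off is a concern, the index can also be recovered with integer arithmetic only. The sketch below is one possible alternative, not the integer-log or binary-search approach mentioned above: it simply walks the sequence until it reaches or passes the input. The function name fib_index_exact is hypothetical, and the indexing convention F(1) = F(2) = 1 (with 1 reported as index 2, like fibIndex) is an assumption.

```python
def fib_index_exact(f):
    """Return n >= 2 with F(n) == f, or None if f is not a Fibonacci number.

    Uses only integer arithmetic, so it stays exact for arbitrarily large f.
    """
    a, b, n = 1, 1, 2  # invariant: a == F(n-1), b == F(n)
    while b < f:
        a, b = b, a + b
        n += 1
    return n if b == f else None


def adjacent_fibs_exact(f, g):
    """True if f and g are Fibonacci numbers whose indices differ by exactly 1."""
    i, j = fib_index_exact(f), fib_index_exact(g)
    return i is not None and j is not None and abs(i - j) == 1


print(fib_index_exact(6765))            # 20
print(fib_index_exact(4))               # None
print(adjacent_fibs_exact(4181, 6765))  # True
```

The loop runs about n times for F(n), which grows only logarithmically in the value of f. Like adjacentFibs, this reports the pair 1, 1 as not adjacent, because both map to index 2.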

FIND-S Algorithm - simple question

The FIND-S algorithm is probably one of the simplest machine learning algorithms. However, I can't find many examples out there, just the standard 'sunny, rainy, play-ball' examples that are always used in machine learning. Could someone please help me with this application (it's a past exam question in machine learning)?
Hypotheses are of the form a <= x <= b, c <= y <= d, where x and y are the coordinates of a point in the x,y plane and a, b, c and d are integers. Basically, these hypotheses define rectangles in the x,y space.
These are the training examples where - is a negative example and + is a positive example and the pairs are the x,y co-ordinates:
+ 4, 4
+ 5, 3
+ 6, 5
- 1, 3
- 2, 6
- 5, 1
- 5, 8
- 9, 4
All I want to do is apply FIND-S to this example! It must be simple! Either some tips or a solution would be awesome.
Thank you.
Find-S seeks the most restrictive (i.e. most 'specific') hypothesis that fits all the positive examples (negatives are ignored).
In your case, there's an obvious graphical interpretation: "find the smallest rectangle that contains all the '+' coordinates"...
... which would be a=4, b=6, c=3, d=5.
The algorithm for doing it would be something like this:
Define a hypothesis rectangle h[a,b,c,d], and initialise it to [-,-,-,-]
for each + example e {
    if e is not within h {
        enlarge h to be just big enough to hold e (and all previous e's)
    } else { do nothing: h already contains e }
}
If we step through this with your training set, we get:
0. h = [-,-,-,-] // initial value
1. h = [4,4,4,4] // (4,4) is not in h: change h so it just contains (4,4)
2. h = [4,5,3,4] // (5,3) is not in h, so enlarge h to fit (4,4) and (5,3)
3. h = [4,6,3,5] // (6,5) is not in h, so enlarge again
4. // no more positive examples left, so we're done.
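The pseudocode and trace above translate fairly directly into Python. This is a minimal sketch rather than a reference implementation: the function names find_s_rectangle and covers, the tuple layout (label, x, y), and the use of None for the empty hypothesis [-,-,-,-] are choices made for this example.

```python
def find_s_rectangle(examples):
    """Return [a, b, c, d] for the smallest rectangle a<=x<=b, c<=y<=d holding every '+' point."""
    h = None  # None plays the role of the empty hypothesis [-,-,-,-]
    for label, x, y in examples:
        if label != "+":
            continue  # FIND-S ignores negative examples
        if h is None:
            h = [x, x, y, y]  # first positive example: a degenerate rectangle
        else:
            a, b, c, d = h
            # Enlarge h just enough to include (x, y); a no-op if already inside.
            h = [min(a, x), max(b, x), min(c, y), max(d, y)]
    return h


def covers(h, x, y):
    """True if the point (x, y) lies inside rectangle h = [a, b, c, d]."""
    a, b, c, d = h
    return a <= x <= b and c <= y <= d


examples = [
    ("+", 4, 4), ("+", 5, 3), ("+", 6, 5),
    ("-", 1, 3), ("-", 2, 6), ("-", 5, 1), ("-", 5, 8), ("-", 9, 4),
]
h = find_s_rectangle(examples)
print(h)  # [4, 6, 3, 5]
# None of the negative examples falls inside the learned rectangle.
print(any(covers(h, x, y) for label, x, y in examples if label == "-"))  # False
```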

Genetics algorithms theoretical question

I'm currently reading "Artificial Intelligence: A Modern Approach" (Russell+Norvig) and "Machine Learning" (Mitchell), and trying to learn the basics of AINN.
In order to understand a few basic things I have two 'greenhorn' questions:
Q1: In a genetic algorithm given the two parents A and B with the chromosomes 001110 and 101101, respectively, which of the following offspring could have resulted from a one-point crossover?
a: 001101
b: 001110
Q2: Which of the above offspring could have resulted from a two-point crossover? and why?
Please advise.
It is not possible to find parents if you do not know the inverse-crossover function (so that AxB => (a,b) & (any a) => (A,B)).
Usually the 1-point crossover function is:
a = A1 + B2
b = B1 + A2
Even if you know a and b, you cannot solve the system (a system of 2 equations with 4 variables).
If you know any 2 parts of A and/or B, then it can be solved (a system of 2 equations with 2 variables). This is the case for your question, as you provide both A and B.
Generally, a crossover function does not have an inverse function, and you just need to find the solution logically or, if you know the parents, perform the crossover and compare.
So to make a generic formula for you we should know 2 things:
Crossover function.
Inverse-crossover function.
The 2nd one is not usually used in GAs as it is not required.
Now, I'll just answer your questions.
Q1: In a genetic algorithm given the two parents A and B with the chromosomes 001110 and 101101, respectively, which of the following offspring could have resulted from a one-point crossover?
Looking at a and b, I can see the crossover point is here:
    1    2
A: 00 | 1110
B: 10 | 1101
Usually the crossover is done using this formula:
a = A1 + B2
b = B1 + A2
so that possible children are:
a: 00 | 1101
b: 10 | 1110
which excludes option b from the question.
So the answer to Q1 is that the resulting child is a: 001101, assuming the given crossover function.
Q2: Which of the above offspring could have resulted from a two-point crossover? and why?
Looking at a and b, I can see the crossover points can be here:
    1    2    3
A: 00 | 11 | 10
B: 10 | 11 | 01
Usual formula for 2-point crossover is:
a = A1 + B2 + A3
b = B1 + A2 + B3
So the children would be:
a = 00 | 11 | 10
b = 10 | 11 | 01
Comparing them to the options in your question (small a and b): the first child, 001110, matches option b. It is identical to parent A, because A and B agree on the middle segment. Option a (001101) is not produced with these crossover points.
Q2. A: With the given crossover function and these crossover points, option b could be the result of a 2-point crossover with AxB, while option a is not produced.
Again, it is not possible to answer your questions without knowing the crossover function.
The functions I provided are common in GAs, but you can invent many others that would answer the question differently (see the comment below):
One-point crossover is when you make one join from each parent; two-point crossover is when you make two joins, i.e. two parts from one parent and one from the other.
See crossover (wikipedia) for further info.
Regarding Q1, (a) could have been produced by a one-point crossover, for example by taking bits 0-3 from parent A and bits 4-5 from parent B. (b) could not, unless your crossover algorithm allows for null contributions, i.e. parent contributions of null weight. In that case, parent A could contribute its full chromosome (bits 0-5) and parent B would contribute nil, yielding (b).
Regarding Q2, both (a) and (b) are possible. There are a few combinations to test; too tedious to write, but you can do the work with pen and paper. :-)
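The pen-and-paper enumeration can also be automated. The sketch below lists every child reachable by the usual one-point and two-point crossover of two equal-length bit strings. The function names and the allow_ends flag are assumptions for this example; allow_ends=True permits a cut at either end of the chromosome, which corresponds to the "null contribution" case discussed above.

```python
def one_point_children(a, b, allow_ends=False):
    """All offspring of equal-length strings a and b under one-point crossover."""
    n = len(a)
    cuts = range(0, n + 1) if allow_ends else range(1, n)
    kids = set()
    for p in cuts:
        kids.add(a[:p] + b[p:])  # head of a, tail of b
        kids.add(b[:p] + a[p:])  # head of b, tail of a
    return kids


def two_point_children(a, b, allow_ends=False):
    """All offspring under two-point crossover (the middle segment is swapped)."""
    n = len(a)
    lo, hi = (0, n) if allow_ends else (1, n - 1)
    kids = set()
    for p in range(lo, hi + 1):
        for q in range(p + 1, hi + 1):
            kids.add(a[:p] + b[p:q] + a[q:])
            kids.add(b[:p] + a[p:q] + b[q:])
    return kids


A, B = "001110", "101101"
for name, option in (("a", "001101"), ("b", "001110")):
    print(name,
          option in one_point_children(A, B),
          option in two_point_children(A, B),
          option in two_point_children(A, B, allow_ends=True))
```

With cuts strictly inside the chromosome, option a appears only among the one-point children and option b only among the two-point children. Both options become reachable by two-point crossover once a cut at the end of the chromosome is allowed, which is consistent with the different answers given above under their different assumptions.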

dynamic programming: finding largest non-overlapping squares

I really have no idea how to do this using dynamic programming:
Problem:
I need to find the 2 largest non overlapping squares of a table
For example:
5 6
R F F R R F
F F F F F F
R R F F F F
F F F F F F
F F F F F F
The numbers 5 and 6 are the number of rows and columns respectively, “R” means reserved and “F” means free. In this case the largest square is
F F F F
F F F F
F F F F
F F F F
and the second largest (non-overlapping with the previous one) is
F F
F F
So far I have put the values into a 2D array, but do not know what to do after.
Tried to reference 0-1 knapsack and LCS, but really I have no clue on what values I should put into my table.
Well, your first task when designing a dynamic programming algorithm should be to find a recursive solution to the problem. After that is done, converting it to a dynamic programming algorithm is almost trivial ;).
Some tips that may or may not help:
A possible base case is obvious: any two distinct 1-cell squares are always non-overlapping.
Bear in mind that the larger of the two squares cannot cover the entire table (because then you won't have a second largest), so it cannot be of rows:columns size.
You should assign a "score" to each solution you evaluate, to see which one is best. This score is obviously size1 + size2, with the condition that size1 should be the maximum possible.
Good luck!!
This is a variation of the Longest Common Substring problem, not LCS (Longest Common Subsequence). Picture the two strings you are comparing as being the sides of a rectangle, with their characters as the squares. An "F" in a square would mean a match between two characters in the strings, and thus the largest square is the longest common substring.
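One way to make these hints concrete is the standard "largest square ending at each cell" table, followed by a search for a second square that does not overlap the first. The Python sketch below is only one possible reading of the problem: it assumes the first square must have the maximum possible side and the second is then as large as possible, and every name in it (largest_square_table, overlaps, two_largest_squares) is made up for this example. The second phase is a plain brute-force scan, not a tuned dynamic program.

```python
def largest_square_table(grid):
    """dp[i][j] = side of the largest all-'F' square whose bottom-right cell is (i, j)."""
    rows, cols = len(grid), len(grid[0])
    dp = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            if grid[i][j] == "F":
                if i == 0 or j == 0:
                    dp[i][j] = 1
                else:
                    dp[i][j] = 1 + min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1])
    return dp


def overlaps(sq1, sq2):
    """Squares are (bottom_row, right_col, side); True if they share a cell."""
    r1, c1, s1 = sq1
    r2, c2, s2 = sq2
    apart = r1 - s1 >= r2 or r2 - s2 >= r1 or c1 - s1 >= c2 or c2 - s2 >= c1
    return not apart


def two_largest_squares(grid):
    """Return (side1, side2): the largest square, then the largest one not overlapping it."""
    dp = largest_square_table(grid)
    cells = [(i, j) for i, row in enumerate(dp) for j, v in enumerate(row) if v > 0]
    if not cells:
        return 0, 0
    best1 = max(dp[i][j] for i, j in cells)
    best2 = 0
    for i, j in cells:
        if dp[i][j] < best1:
            continue
        first = (i, j, best1)  # one placement of a largest square
        for r, c in cells:
            # Try sides larger than the current best2, biggest first.
            for s in range(dp[r][c], best2, -1):
                if not overlaps(first, (r, c, s)):
                    best2 = s
                    break
    return best1, best2


GRID = [row.split() for row in (
    "R F F R R F",
    "F F F F F F",
    "R R F F F F",
    "F F F F F F",
    "F F F F F F",
)]
print(two_largest_squares(GRID))  # (4, 2)
```

The table follows the recurrence dp[i][j] = 1 + min(up, left, up-left) for free cells, which is the recursive formulation the first tip asks for.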
