Is the union of two non-regular languages regular?

Given two non-regular languages, is their union regular?
Also, why is L = L1 ∪ L2 = {a^i b^j | i,j >= 0} the union of L1 = {a^i b^j | i >= j} and L2 = {a^i b^j | i < j}?
Then, what is the union of L1 = {a^i b^j | i > j} and L2 = {a^i b^j | i < j}?

Given two non-regular languages, is their union regular?
Sometimes it is, sometimes it is not. Take any non-regular language L and consider its complement L'. We know that L' is non-regular, since the regular languages are closed under complementation (if L' were regular, then (L')' = L would also be regular, a contradiction). The union of a language and its complement is the universal language of all strings over the alphabet, which is regular, so certainly some pairs of non-regular languages have a regular union. To see that not all pairs of non-regular languages have a regular union, consider the languages 0^n 1^n and a^n b^n over the shared alphabet {0, 1, a, b}. It is not hard to see that the union of these two non-regular languages cannot possibly be regular: intersecting it with the regular language 0*1* gives back 0^n 1^n, which is not regular.
Also, why is L = L1 ∪ L2 = {a^i b^j | i,j >= 0} the union of L1 = {a^i b^j | i >= j} and L2 = {a^i b^j | i < j}?
For all integers i and j, either i <= j or j <= i (or both); the <= relation is a total order on the integers. If it is not the case that i <= j, we may write i > j instead to mean not(i <= j), so > can be thought of as a shorthand for a negation. Because L1 requires i >= j (just a different way of writing j <= i) and L2 requires i < j (just another way of writing j > i), and because j <= i and j > i are logical negations of each other, the union - which applies the disjunction to these conditions - results in a tautology. That is, the resulting condition is always true - either i >= j or i < j, always - and so all that remains is the shared requirement implicit in the set-builder notation that every string consists of some number of a's followed by some number of b's. Even adding i,j >= 0 to the first language is unnecessary, assuming a valid domain for i and j, and indeed the second and third languages don't bother mentioning this restriction.
Then, what is the union of L1 = {a^i b^j | i > j} and L2 = {a^i b^j | i < j}?
The only strings of the form a^i b^j that satisfy neither i > j nor i < j are precisely those for which i = j, that is, strings of the form a^n b^n. The union of L1 and L2, then, is the complement of the language a^n b^n relative to a*b*, i.e. {a^i b^j | i ≠ j}. We know neither L1 nor L2 is regular, and we know that a^n b^n is context-free. In general, the complement of a context-free language may or may not be context-free, since the context-free languages are not closed under complementation. In this case, however, it's not hard to see that this language must really be context-free (a concrete example of grammars is given after the construction):
take a CFG G1 for language L1
take a CFG G2 for language L2
make a CFG G for language L by adding a new start symbol and productions which produce each of the start symbols in G1 and G2, respectively
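For concreteness, here is one possible choice of grammars (the nonterminal names and these particular productions are just an illustration; any grammars for L1 and L2 would do):
G1 for L1 = {a^i b^j | i > j}:   S1 → a S1 | a T1,   T1 → a T1 b | ε
G2 for L2 = {a^i b^j | i < j}:   S2 → S2 b | T2 b,   T2 → a T2 b | ε
G for L1 ∪ L2:                   S → S1 | S2, together with all productions of G1 and G2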

Question 1: Is the union of two non-regular languages regular?
Sometimes. The regular, context-free, context-sensitive, recursive and recursively enumerable languages are closed under union. However, the deterministic context-free (DCFL) languages accepted by a deterministic pushdown automaton (DPDA) are not. The standard proof goes something like this. Consider the following languages:
L1 = {a^i b^j c^k : i,j,k ≥ 0}
L2 = {a^i b^i c^j : i,j ≥ 0}
L3 = {a^i b^j c^j : i,j ≥ 0}
L4 = {a^i b^i c^i : i ≥ 0}
The first language is regular, the second and third DCFL, and the fourth not context-free. If DCFL were closed under union, then since it is closed under complementation, the language
L4^c = L1^c ∪ L2^c ∪ L3^c
must be DCFL. By the same token, L4 must be DCFL. This is a contradiction, because L4 is not even context-free. Therefore, DCFL is not closed under union. Finally, we can apply De Morgan's laws (L1 ∪ L2 = (L1^c ∩ L2^c)^c) and the fact that DCFL is closed under complementation to conclude that DCFL is not closed under intersection either: if it were, closure under complementation would make it closed under union as well.
Conversely, there are non-regular languages whose union is regular. The answer to your second question shows that there are DCFL languages whose union is given by a*b*.
Question 2: The union of L1 = {a^i b^j : i ≥ j} and L2 = {a^i b^j : i < j}.
The union of L1 and L2 is L3 = {a^i b^j : i,j ≥ 0}. Since this is an equality of sets, we must show that L1 ∪ L2 ⊆ L3 and L3 ⊆ L1 ∪ L2.
If u ∈ L1 ∪ L2 then u ∈ L1 or u ∈ L2. If u ∈ L1 then u = a^i b^j where i ≥ j. If u ∈ L2 then u = a^i b^j where i < j. In either case, u = a^i b^j where i,j ≥ 0. Thus, u ∈ L3.
If u ∈ L3 then u = a^i b^j where i,j ≥ 0. By trichotomy, either i ≥ j or i < j. If the former, then u ∈ L1. If the latter, then u ∈ L2. Thus, u ∈ L1 ∪ L2.
Question 3: The union of L1 = {a^i b^j : i > j} and L2 = {a^i b^j : i < j}
The union of L1 and L2 is the set of strings a^i b^j where i < j or i > j. By trichotomy, this is equivalent to saying that i ≠ j. Therefore, L1 ∪ L2 = {a^i b^j : i ≠ j}.
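To make the resulting language concrete, here is a small membership test for {a^i b^j : i ≠ j} (plain Python, just an illustration):
def in_L3(w):
    # w is in {a^i b^j : i != j} iff it is a block of a's followed by a block of b's,
    # with different numbers of a's and b's
    i = 0
    while i < len(w) and w[i] == "a":
        i += 1
    j = len(w) - i
    return all(c == "b" for c in w[i:]) and i != j

print(in_L3("aab"))    # True  (2 a's, 1 b)
print(in_L3("aabb"))   # False (equal counts)
print(in_L3("aba"))    # False (not of the form a^i b^j)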

Related

Check subset sum for special array equation

I was trying to solve the following problem.
We are given N and A[0]
N <= 5000
A[0] <= 10^6 and even
if i is odd, then A[i] >= 3 * A[i-1]
if i is even, then A[i] = 2 * A[i-1] + 3 * A[i-2]
an element at an odd index must be odd, and an element at an even index must be even.
We need to minimize the sum of the array.
We are also given Q query values:
Q <= 1000
X <= 10^18
For each X, we need to determine whether it is possible to get a subset sum equal to X from our array.
What I have tried:
Creating the minimum-sum array is easy: just follow the equations and constraints.
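For reference, a minimal sketch of that greedy construction (the helper name minimal_array is mine; it assumes the constraints exactly as stated above, in particular that A[0] is even):
def minimal_array(A0, N):
    # Greedily build the minimum-sum array: even indices are forced by the
    # recurrence, odd indices take the smallest odd value >= 3 * A[i-1].
    A = [A0]
    for i in range(1, N):
        if i % 2 == 1:
            v = 3 * A[i - 1]
            if v % 2 == 0:           # an odd index must hold an odd value
                v += 1
            A.append(v)
        else:
            A.append(2 * A[i - 1] + 3 * A[i - 2])
    return A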
The approach I know for subset-sum is dynamic programming, which has time complexity O(sum × array size), but since the sum can be as large as 10^18 that approach won't work.
Is there some relation in the equations that I am missing?
We can do it with a bit of math.
(Apologies for the notation - I am not sure LaTeX is supported here.)
Let X_n be the sequence (the same one defined by your A).
I assume X_0 is positive.
Thus the sequence is strictly increasing, and (ignoring the odd/even parity requirement) the sum is minimized when X_{2n+1} = 3 X_{2n}.
We can compute the general term of X_{2n} and X_{2n+1}
Let v_0 = (X_0, X_1)^T and v_1 = (X_1, X_2)^T.
The relation between v_0 and v_1 is v_1 = M_a v_0, where
M_a = [ 0  1 ]
      [ 3  2 ]
The relation between v_1 and v_2 is v_2 = M_b v_1, where
M_b = [ 0  1 ]
      [ 0  3 ]
Hence the relation between v_0 and v_2 is v_2 = M v_0, where
M = M_b M_a = [ 3  2 ]
              [ 9  6 ]
We deduce that, writing v_{2n} = (X_{2n}, X_{2n+1})^T,
v_{2n} = M^n v_0
Following the classical diagonalization (M has eigenvalues 9 and 0, so M^n = 9^{n-1} M for n ≥ 1), we get, unless mistaken:
X_{2n} = (9^n / 3) X_0 + 2·9^{n-1} X_1
X_{2n+1} = 9^n X_0 + (2·9^n / 3) X_1
Recall that X_1 = 3 X_0, thus
X_{2n} = 9^n X_0
X_{2n+1} = 3·9^n X_0
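A quick numerical sanity check of the closed form against the recurrence, under the same simplifying assumption X_1 = 3 X_0 (plain Python, for illustration only):
X0 = 2
X = [X0, 3 * X0]                      # idealized: X_1 = 3 * X_0
for i in range(2, 12):
    if i % 2 == 0:
        X.append(2 * X[i - 1] + 3 * X[i - 2])
    else:
        X.append(3 * X[i - 1])
for n in range(6):
    assert X[2 * n] == 9 ** n * X0
    assert X[2 * n + 1] == 3 * 9 ** n * X0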
Now divide the target sum X by X_0 (every element of the array is a multiple of X_0, so if X is not divisible by X_0 the answer is immediately no) and write the quotient in base 9. The digit in the 9^n place corresponds to the pair of elements X_{2n} and X_{2n+1}.
In the 9^n place we can put a 0 (take neither element of the pair), a 1 (take the 2n-th element of A, which contributes 9^n X_0), a 3 (take the (2n+1)-th element, which contributes 3·9^n X_0), or a 4 (take both). Since each digit is less than 9, no carries can occur, so we just have to decompose the number in base 9 and check whether all of its digits are 0, 1, 3 or 4 (and also that no nonzero digit refers to an index beyond the end of our array).
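Putting it together, here is a sketch of the whole check (the function name can_reach is mine; it keeps this answer's simplifying assumption that A[2n] = 9^n·A[0] and A[2n+1] = 3·9^n·A[0], i.e. it ignores the parity adjustment from the original problem):
def can_reach(X, A0, N):
    # X is a subset sum iff X is a multiple of A0 and X // A0, written in
    # base 9, uses only the digits 0, 1, 3, 4 at positions that exist in A.
    if X % A0 != 0:
        return False
    q = X // A0
    pos = 0                                     # digit at 9**pos covers A[2*pos], A[2*pos+1]
    while q > 0:
        d = q % 9
        if d not in (0, 1, 3, 4):
            return False
        if d in (1, 4) and 2 * pos >= N:        # needs A[2*pos]
            return False
        if d in (3, 4) and 2 * pos + 1 >= N:    # needs A[2*pos+1]
            return False
        q //= 9
        pos += 1
    return True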

Arrays: Find minimum number of swaps to make bitonicity of array minimum?

Suppose we are given an array of integers. All adjacent elements are guaranteed to be distinct. Let us define the bitonicity of this array a as bt using the following relation:
bt_array[i] = 0, if i == 0;
= bt_array[i-1] + 1, if a[i] > a[i-1]
= bt_array[i-1] - 1, if a[i] < a[i-1]
= bt_array[i-1], if a[i] == a[i-1]
bt = last item in bt_array
We say the bitonicity of an array is minimum when its bitonicity is 0 if it has an odd number of elements, or its bitonicity is +1 or -1 if it has an even number of elements.
The problem is to design an algorithm that finds the fewest number of swaps required in order to make the bitonicity of any array minimum. The time complexity of this algorithm should be at worst O(n), n being the number of elements in the array.
For example, suppose a = {34,8,10,3,2,80,30,33,1}
Its initial bt is -2. Minimum would be 0. This can be achieved by just 1 swap, namely swapping 2 and 3. So the output should be 1.
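For reference, a tiny helper that just implements the definition above (plain Python; the name is arbitrary):
def bitonicity(a):
    bt = 0
    for i in range(1, len(a)):
        if a[i] > a[i - 1]:
            bt += 1
        elif a[i] < a[i - 1]:
            bt -= 1
    return bt

print(bitonicity([34, 8, 10, 3, 2, 80, 30, 33, 1]))   # prints -2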
Here are some test cases:
Test case 1: a = {34,8,10,3,2,80,30,33,1}, min swaps = 1 ( swap 2 and 3)
Test case 2: {1,2,3,4,5,6,7}: min swaps = 2 (swap 7 with 4 and 6 with 5)
Test case 3: {10,3,15,7,9,11}: min swaps = 0. bt = 1 already.
And a few more:
{2,5,7,9,5,7,1}: current bt = 2. Swap 5 and 7: minSwaps = 1
{1,7,8,9,10,13,11}: current bt = 4: Swap 1,8 : minSwaps = 1
{13,12,11,10,9,8,7,6,5,4,3,2,1}: current bt = -12: Swap (1,6),(2,5) and (3,4) : minSwaps = 3
I was asked this question in an interview, and here's what I came up with:
1. Sort the given array.
2. Reverse the array from n/2 to n-1.
3. Compare with the original array to see how many elements changed their position.
Return half of it.
And my bit of code that does this:
int returnMinSwaps(int[] a){
    // requires: import java.util.Arrays;
    int n = a.length;
    int[] b = a.clone();          // work on a copy instead of shadowing the parameter
    Arrays.sort(b);
    // step 2: reverse the sorted copy from index n/2 to n-1
    for (int lo = n / 2, hi = n - 1; lo < hi; lo++, hi--) {
        int tmp = b[lo];
        b[lo] = b[hi];
        b[hi] = tmp;
    }
    // step 3: count positions where the original and the target differ,
    // and return half of that count
    int minSwaps = 0;
    for (int i = 0; i < n; i++) {
        if (a[i] != b[i])
            minSwaps++;
    }
    return minSwaps / 2;
}
Unfortunately, I am not getting the correct minimum number of swaps for some test cases using this logic. Also, I am sorting the array, which makes it O(n log n), while it needs to be done in O(n).
URGENT UPDATE: T3 does not hold!
Consider α = [0, 7, 8, 3, 4, 10, 1, 6, 9, 2, 5]. There is no S_ij(α) that can lower |B(α)| by more than 2.
Thinking on amendments to the method…
Warning
This solution only works when there are no array elements that are equal.
Feel free to propose generalizations by editing the answer.
Go straight to Conclusion if you want to skip the boring part.
Introduction
Let's define the swap operator S_ij over the array a:
S_ij(a) : [… a_i, … a_j, …] → [… a_j, … a_i, …]   ∀i, j ∈ [0; |a|) ∩ ℤ : i ≠ j
Let's also refer to the bitonicity as B(a) and define it more formally, as in the question: B(a) = Σ_{i=1}^{|a|–1} sgn(a_i – a_{i–1}).
The obvious facts:
Swaps are symmetric:
S_ij(a) = S_ji(a)
Two swaps are independent if their target positions don't intersect:
S_ij(S_kl(a)) = S_kl(S_ij(a))   ∀i, j, k, l : {i, j} ∩ {k, l} = ∅
Two 2-dependent swaps undo one another:
S_ij(S_ij(a)) = a
Two 1-dependent swaps abide by the following:
S_jk(S_ij(a)) = S_ij(S_ik(a)) = S_ik(S_jk(a))
The bitonicity difference is always even for equally sized arrays:
(B(a) – B(a')) mod 2 = 0   ∀a, a' : |a| = |a'|
Naturally, ∀i : 0 < i < |a|,
B([a_{i–1}, a_i]) – B([a'_{i–1}, a'_i]) = sgn(a_i – a_{i–1}) – sgn(a'_i – a'_{i–1}),
which can be either 1 – 1 = 0, or 1 – (–1) = 2, or –1 – 1 = –2, or –1 – (–1) = 0, and any number of ±2's and 0's summed yields an even result.
N.B.: this is only true if all elements in a differ from one another, and likewise for a'!
Theorems
[T1]   |B(S_ij(a)) – B(a)| ≤ 4   ∀a, S_ij(a)
Without loss of generality, let's assume that:
0 < i, j < |a| – 1
j – i ≥ 2
a_{i–1} < a_{i+1}
a_{j–1} < a_{j+1}
Depending on a_i, 3 cases are possible:
a_{i–1} < a_i < a_{i+1}: sgn(a_i – a_{i–1}) + sgn(a_{i+1} – a_i) = 1 + 1 = 2
a_i < a_{i–1} < a_{i+1}: sgn(a_i – a_{i–1}) + sgn(a_{i+1} – a_i) = –1 + 1 = 0
a_{i–1} < a_{i+1} < a_i: sgn(a_i – a_{i–1}) + sgn(a_{i+1} – a_i) = 1 + (–1) = 0
When altering a_i and leaving all other elements of a intact, |B(a') – B(a)| ≤ 2 (where a' is the resulting array, for which the above 3 cases also apply), since no terms of B(a) other than the two from the 1-neighborhood of a_i changed their value.
S_ij(a) implies that what's described above happens twice, once for a_i and once for a_j.
Thus, |B(S_ij(a)) – B(a)| ≤ 2 + 2 = 4.
Analogously, for each of the corner cases and for j – i = 1 the maximum possible delta is 2, which is ≤ 4.
Finally, this straightforwardly extrapolates to a_{i–1} > a_{i+1} and a_{j–1} > a_{j+1}.
QED
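An empirical spot-check of T1 on random distinct-element arrays (plain Python, illustration only; it does not replace the proof):
import random
from itertools import combinations

def B(a):
    # bitonicity: sum of sgn(a[i] - a[i-1])
    return sum((x > y) - (x < y) for x, y in zip(a[1:], a))

for _ in range(200):
    a = random.sample(range(1000), random.randint(2, 12))
    for i, j in combinations(range(len(a)), 2):
        s = a[:]
        s[i], s[j] = s[j], s[i]
        assert abs(B(s) - B(a)) <= 4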
[T2]   ∀a : |B(a)| ≥ 2   ∃S_ij(a) : |B(S_ij(a))| = |B(a)| – 2
{proof in progress, need to sleep}
[T3]   ∀a : |B(a)| ≥ 4   ∃S_ij(a) : |B(S_ij(a))| = |B(a)| – 4
{proof in progress, need to sleep}
Conclusion
From T1, T2 and T3, the minimal number of swaps needed to minimize |B(a)| equals
⌊|B(a)| / 4⌋ + β,
where β equals 1 if |B(a)| mod 4 ≥ 2, and 0 otherwise.

Is there any possibility to recover A in "A & B = C" with given B and C?

I would like to ask: with A, B, and C being binary numbers, after computing C = A & B (& is the AND operator), is there any possibility of recovering A from B and C?
I know that information about A is lost through the operation. Can we form a function like B <...> C = A, and how complex would it be?
For example:
A = 0011
B = 1010
C = A & B = 0010
The 2nd bit of C is 1, i.e. the 2nd bits of A and B must both be 1. However, the other bits lack the information needed to be recovered.
Thank you in advance.
No, it's not possible. You can see this from the truth table for AND:
A  B | C (A & B)
0  0 | 0
0  1 | 0
1  0 | 0
1  1 | 1
Suppose you know that B is 0 and C is 0. A could be either 1 or 0, so it cannot be deduced from B and C.
You can recover only bits of A that have 1s in the corresponding bits of B. For bits of B that have zeros it does not matter what A has in the corresponding position, because the bit in C would be zero anyway:
A = 1xx0x011x0
B = 1001011101
----------
C = 1000001100
Positions of A marked with x can be zeros or ones; the information in them is going to be lost either way.
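In code, that partial recovery is just a pair of masks (plain Python, using the example values above):
B = 0b1001011101
C = 0b1000001100
known_bits_of_A   = C & B    # where B is 1, A's bit equals C's bit (this is just C again)
unknown_positions = ~B       # where B is 0, A's bit could be anything
print(bin(known_bits_of_A))  # 0b1000001100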
Assuming you are just talking about binary logic, not C variables, then no.
Consider:
a=0111, b=1010, therefore c=0010
So you have b=1010 and c=0010; now how can you find a?
The leftmost bit of c is 0 and in b it is 1, so we know the corresponding bit of a must be 0.
The second bit of c is 0 and in b it is 0, so you can't tell what it was in a (either 1 or 0 leads to a 0 in c).
At this point we've proven you can't do it.
No, because there isn't a unique solution. Any value of A that agrees with C on the bit positions where B is 1 would satisfy the equation, regardless of its other bits.
This is a question about equations. It is not possible, because the number of degrees of freedom is not zero. It is the same as asking: a + b = 10; what is a and what is b?
You can't recover A, but you can write A = (X & ~B) ^ C. Here, X can be anything (and it gives all the A's).
Of course this will work only for B and C such that C & ~B == 0.
This is a parametrized solution. Example in Python:
>>> A = 32776466
>>> B = 89773888
>>> C = A & B
>>> C
22020352
>>> X = 1234567890 # arbitrary value
>>> U = (X & ~B) ^ C
>>> U
1238761874
>>> U & B # same result as A & B
22020352

Difference of elements of 2 sorted arrays within given interval

Let us assume that we have 2 sorted arrays A and B of integers and a given interval [L,M]. If x is an element of A and y an element of B, our task is to find all pairs (x,y) that satisfy the following property: L <= y-x <= M.
Which is the most suitable algorithm for that purpose?
So far, I have considered the following solutions:
Brute force. Check the difference of all possible pairs of elements with a double loop. Complexity O(n^2).
A slightly different version of the previous solution is to make use of the fact that the arrays are sorted, by not checking further elements of A once the difference gets out of the interval. The complexity would still be O(n^2), but hopefully the program would run faster in the average case.
However, I believe that O(n^2) is not optimal. Is there an algorithm with better complexity?
Here is a solution.
Have a pointer at the beginning of each array, say i for array A and j for array B.
Calculate the difference between B[j] and A[i].
If it is less than L, increment the pointer into array B, i.e. increment j by 1.
If it is more than M, increment i, the pointer into A.
If the difference is in between, then do the following:
search in array A for the position of the largest element whose value is at most B[j] - L; this takes O(log N) time with binary search. Say that position is p. Increment the count of (x,y) pairs by p - i + 1.
Increment only the pointer j.
My solution only counts the number of possible (x,y) pairs, in O(N log N) time.
For A=[1,2,3], B=[10,12,15], L=12 and M=14, the answer is 3.
Hope this helps. I leave it up to you to implement the solution.
Edit: Enumerating all the possible (x,y) pairs would take O(N^2) time in the worst case. We will be able to return the count of such pairs (x,y) in O(N log N) time. Sorry for not clarifying that earlier.
Edit2: I am attaching a sample implementation of my proposed method below:
def binsearch(a, x):
    # returns the index of the rightmost element of a that is <= x (-1 if there is none)
    low = 0
    high = len(a) - 1
    k = -1
    while low <= high:
        mid = (low + high) // 2
        if a[mid] <= x:
            k = mid
            low = mid + 1
        else:
            high = mid - 1
    return k

a = [1, 2, 3]
b = [10, 12, 15, 16]
i = 0
j = 0
lenA = len(a)
lenB = len(b)
L = 12
M = 14
count = 0
while i < lenA and j < lenB:
    if b[j] - a[i] < L:
        j = j + 1
    elif b[j] - a[i] > M:
        i = i + 1
    else:
        # every element a[i..p] pairs with b[j], where a[p] is the largest element <= b[j] - L
        p = binsearch(a, b[j] - L)
        count = count + p - i + 1
        j = j + 1
print("number of (x,y) pairs:", count)
Because it's possible for every combination to be in the specified range, the worst case is O(|A|·|B|), which is basically O(n^2).
However, if you want the best simple algorithm, this is what I've come up with. It starts similarly to user-targaryen's algorithm, but handles overlaps in a simplistic fashion.
Create four variables x, y, Z and s (set them all to 0)
Create a boolean 'success' and set it to false
Calculate Z = B[y] - A[x]
if Z < L
    increment y
if Z >= L and Z <= M
    if success is false
        set s = y
        set success = true
    store the pair (x, y)
    increment y
if Z > M
    set y = s //this may seem inefficient with my current example,
              //but you'll see the necessity if you have a sorted list with duplicate values;
              //just change the A from my example to {1,1,2,2,3} to see what I mean
    increment x
    set success = false
an example:
A = {1,2,3,4,5}
B = {3,4,5,6,7}
L = 2, M = 3
In this example, the first pair is x,y. The second number is s. The third pair is the values A[x] and B[y]. The fourth number is Z, the difference B[y] - A[x]. The final value is X for not a match and O for a match.
0,0 - 0 - 1,3 = 2 O
increment y
0,1 - 0 - 1,4 = 3 O
increment y
0,2 - 0 - 1,5 = 4 X
//this is the tricky part. Look closely at the changes this makes
set y to s
increment x
1,0 - 0 - 2,3 = 1 X
increment y
1,1 - 0 - 2,4 = 2 O
set s = y, set success = true
increment y
1,2 - 1 - 2,5 = 3 O
increment y
1,3 - 1 - 2,6 = 4 X
set y to s
increment x
2,1 - 1 - 3,4 = 1 X
increment y
2,2 - 1 - 3,5 = 2 O
set s = y, set success = true
increment y
2,3 - 2 - 3,6 = 3 O
... and so on

Concatenation of two languages in NP

I have a hard time understanding why the fact that the concatenation of two languages over an alphabet is in NP doesn't imply that each of the languages is itself in NP. I talked with my professor about the problem today, but I can't wrap my head around it. Can you help me out?
Here's a counterexample to the claim that if A and B are languages and AB ∈ NP, then A ∈ NP and B ∈ NP. First, consider all subsets of the set 1*. There are countably infinitely many strings in 1*, so there are uncountably many subsets of 1*. Since there are only countably many decidable languages, at least one of these languages is undecidable; let's have that language be A. For B, choose 1*.
My claim is that AB is some language of the form { 1^n | n ≥ k } for some natural number k. To see this, let k be the length of the shortest string in A. Then any string in AB has the form 1^{k+m} 1^r, where 1^{k+m} ∈ A and 1^r ∈ B. Such a string necessarily belongs to { 1^n | n ≥ k }. Similarly, if we take any string from { 1^n | n ≥ k }, we can see that it has the form 1^{k+m} = 1^k 1^m, where 1^k ∈ A and 1^m ∈ B. Therefore, AB = { 1^n | n ≥ k }.
The language AB is regular because it's given by the regular expression 1^k 1*, but A is not in NP because it's undecidable and all languages in NP are decidable.
Hope this helps!

Resources