Minimum operations to make array left part equal to right part

Given an even-length array [a1, a2, ..., an], a beautiful array is one where a[i] == a[i + n/2] for 0 <= i < n/2. Define an operation as changing all array elements equal to value x to value y. What is the minimum number of operations required to make a given array beautiful? All elements are in the range [1, 100000]. Simply counting the unmatched (unordered) value pairs between the left and right halves gives wrong results in some cases, such as [1, 1, 2, 5, 2, 5, 5, 2]: the unmatched pairs are (1, 2), (1, 5), (2, 5), but after changing 2 -> 5, (1, 2) and (1, 5) become the same pair. So what is the correct method to solve this problem?

It is a graph question.
For every pair (a[i], a[i + n/2]) where a[i] != a[i + n/2], add an undirected edge between the two values.
Note that you shouldn't add multiple edges between the same two values.
Now you essentially need to remove all the edges in the graph by performing operations; the final answer is the number of operations.
In each operation, you remove an edge and contract it: combine its two endpoints into a single vertex and rearrange their edges onto it. A connected component with k vertices therefore disappears after exactly k - 1 contractions (a spanning tree's worth), so the answer is the number of values that appear in some edge minus the number of connected components.
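A minimal union-find sketch of that counting (my own illustration, not code from the answer): every union that joins two previously separate components corresponds to one contraction, so counting the successful unions yields vertices minus components directly.

def min_ops_to_beautiful(a):
    n = len(a)
    parent = {}  # union-find over values seen in mismatched pairs

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    ops = 0
    for i in range(n // 2):
        x, y = find(a[i]), find(a[i + n // 2])
        if x != y:  # this edge joins two components: one contraction
            parent[x] = y
            ops += 1
    return ops

print(min_ops_to_beautiful([1, 1, 2, 5, 2, 5, 5, 2]))  # 2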

Related

Flip/Reorient pairs with common element so that adjacent pairs have common elements next to each other

We have n pairs, and each pair has an element in common with the adjacent pair. Assume a cyclic group of pairs, where the nth pair is also adjacent to the first pair.
Now, given the n pairs, we want to output an array of size n containing 1s and 0s, indicating whether each pair must be flipped (reoriented) or not. The goal is to flip the minimum number of pairs so that adjacent pairs have their common elements next to each other.
For example,
Input: [(32,4),(4,1),(9,1),(9,16),(32,16)]
Output: [0,0,1,0,1]
such that upon flipping, we have [(32,4),(4,1),(1,9),(9,16),(16,32)]
I am looking for an efficient solution, preferably using numpy.
You can check whether the first element of each pair equals the element it shares with the preceding pair (taken cyclically):
inp = [(32, 4), (4, 1), (9, 1), (9, 16), (32, 16)]
pairs = [set(pair) for pair in inp]
# element each pair shares with its cyclic predecessor
common = [next(iter(one & two)) for one, two in zip(pairs, pairs[-1:] + pairs[:-1])]
# flip a pair exactly when the shared element is not already in front
out = [int(comm != pair[0]) for comm, pair in zip(common, inp)]
[0, 0, 1, 0, 1]  # output
You can do the final line using vectorization in numpy, but it's not really necessary.
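For completeness, here is one possible fully vectorized numpy sketch of the same idea (my own rendering, not part of the original answer): roll the pair array by one row so each pair lines up with its cyclic predecessor, then test whether each pair's first element occurs anywhere in that predecessor.

import numpy as np

a = np.array(inp)              # shape (n, 2), inp as above
prev = np.roll(a, 1, axis=0)   # cyclic predecessor of each pair
# does each pair's first element appear in its predecessor?
common_is_first = (a[:, :1] == prev).any(axis=1)
out = (~common_is_first).astype(int)
print(out)                     # [0 0 1 0 1]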

How to do a cartesian product of a variable number of lists in Julia?

For each value j in the set {1, 2, ..., n}, where n can vary (it is a variable in my program that depends on user input), I have an array A_j. I would like to obtain the cartesian product of all the arrays A_j, so that I can then iterate through that product, taking one element from each of A_1, A_2, ..., A_n to get a tuple (a_1, a_2, ..., a_n) in A_1 x A_2 x ... x A_n. How would I accomplish this in Julia?
Use Iterators.product:
help?> Iterators.product
product(iters...)
Return an iterator over the product of several iterators. Each generated
element is a tuple whose ith element comes from the ith argument iterator.
The first iterator changes the fastest.
Examples
≡≡≡≡≡≡≡≡≡≡
julia> collect(Iterators.product(1:2, 3:5))
2×3 Matrix{Tuple{Int64, Int64}}:
 (1, 3)  (1, 4)  (1, 5)
 (2, 3)  (2, 4)  (2, 5)
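Since the number of arrays varies at run time, collect them in a vector and splat it into the call: Iterators.product(arrays...), where arrays is, e.g., [A_1, A_2, ..., A_n]. Iterating the result yields exactly the tuples (a_1, a_2, ..., a_n) described in the question.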

For each element, count how many given subarrays contain it

While I was solving a problem in a coding contest, I found that I needed to do this: given pairs (i, j) indicating the index ranges of subarrays, count, for each element, how many of the subarrays contain it. For example:
We have the array [7, -2, -7, 0, 6] and the pairs (0, 2), (1, 4), (2, 3), (0, 3); the result array will then be [2, 3, 4, 3, 1], since the first element is in the subarrays (0, 2) and (0, 3), the second one is in (0, 2), (1, 4), (0, 3), and so on.
One way of doing it would of course be counting manually, but that will likely give me TLE because the array size and the number of pairs are too big. I've also read somewhere that you can use an extra array storing "stuff" (adding/subtracting something for every subarray) and then traverse through that array to find the counts, but I don't remember where I read it.
Here's the original problem, in case you also have improvements on my algorithm:
Given an array of size n, for each i-th element in the array (0 <= i <= n-1), count how many balanced subarrays contain that element. A balanced subarray is a subarray whose sum is equal to 0.
I know a way to find the subarrays whose sum equals 0: https://www.geeksforgeeks.org/print-all-subarrays-with-0-sum/ . But I haven't figured out the second task stated above. If you have ideas about this, please let me know; thank you very much.
You could take each pair and mark a +1 where it starts and a -1 just after where it stops (in a separate array). These entries represent changes in the number of ranges that overlap a given index. Then iterate over that array, accumulating the changes into absolute numbers (the number of overlapping ranges at each index).
Here is an implementation in JavaScript for the ranges you gave as an example; the content of the given list really is not relevant here, only its size (n = 5 in the example):
let pairs = [[0, 2], [1, 4], [2, 3], [0, 3]];
let n = 5;

// Translate pairs to +1/-1 changes in the result array
let result = Array(n + 1).fill(0); // one extra dummy entry
for (let pair of pairs) {
    result[pair[0]]++;
    result[pair[1] + 1]--; // after(!) the range
}
result.pop(); // drop the dummy entry

// Accumulate changes from left to right
for (let i = 1; i < n; i++) {
    result[i] += result[i - 1];
}
console.log(result); // [2, 3, 4, 3, 1]
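For the original balanced-subarray problem, the two ideas compose: prefix sums that are equal at positions i and j mean arr[i:j] sums to zero, and every such range can be fed through the same +1/-1 marking. A rough Python sketch of that combination (my own, with a made-up function name; note the number of balanced ranges can itself be Θ(n²) in the worst case, so this only speeds up the per-element counting):

from collections import defaultdict
from itertools import combinations

def count_balanced_containment(arr):
    n = len(arr)
    # positions[s] = indices i with sum(arr[:i]) == s; equal prefix
    # sums at i < j mean arr[i:j] is balanced
    positions = defaultdict(list)
    positions[0].append(0)
    s = 0
    for i, v in enumerate(arr, 1):
        s += v
        positions[s].append(i)

    diff = [0] * (n + 1)
    for idxs in positions.values():
        for i, j in combinations(idxs, 2):  # balanced range arr[i:j]
            diff[i] += 1                    # covers elements i .. j-1
            diff[j] -= 1

    out, running = [], 0
    for d in diff[:n]:                      # accumulate the changes
        running += d
        out.append(running)
    return out

print(count_balanced_containment([1, -1, 0]))  # [2, 2, 2]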

Binning then sorting arrays in each bin but keeping their indices together

I have two arrays, and the indices of these arrays are related: x[0] is related to y[0], so they need to stay aligned. I have binned the x array into two bins as shown in the code below.
import numpy as np

x = [1, 4, 7, 0, 5]
y = [.1, .7, .6, .8, .3]
binx = [0, 4, 9]
index = np.digitize(x, binx)
Giving me the following:
In [1]: index
Out[1]: array([1, 2, 2, 1, 2])
So far so good. (I think)
The y array is a parameter telling me how well measured each x data point is (0.9 is better than 0.2), so I'm using the following code to keep the best of the y array:
y.sort()
ysorted = y[int(len(y) * .5):]
which gives me:
In [2]: ysorted
Out[2]: [0.6, 0.7, 0.8]
giving me the last 50% of the array. Again, this is what I want.
My question is how do I combine these two operations? From each bin, I need to get the best 50% and put these new values into a new x and new y array. Again, keeping the indices of each array organized. Or is there an easier way to do this? I hope this makes sense.
Many numpy functions have arg... variants that don't operate "by value" but rather "by index". In your case argsort does what you want:
order = np.argsort(y)
# order is an array of indices such that y[order] is sorted
top50 = order[len(order) // 2:]
top50x = np.asarray(x)[top50]  # x as an array, so it supports index arrays
# now top50x are the x values corresponding 1-to-1 to the best 50% of y
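To also respect the bins from the question, one possible combination (a sketch of my own, not from either answer) applies argsort within each bin and concatenates the winners, keeping x and y aligned throughout:

import numpy as np

x = np.array([1, 4, 7, 0, 5])
y = np.array([.1, .7, .6, .8, .3])
index = np.digitize(x, [0, 4, 9])

best_x, best_y = [], []
for b in np.unique(index):
    mask = index == b               # members of bin b
    order = np.argsort(y[mask])     # sort this bin by quality
    keep = order[len(order) // 2:]  # best 50% (rounds up for odd sizes)
    best_x.append(x[mask][keep])
    best_y.append(y[mask][keep])

new_x = np.concatenate(best_x)      # array([0, 7, 4])
new_y = np.concatenate(best_y)      # array([0.8, 0.6, 0.7])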
You could make a list of pairs from your x and y lists.
This can be achieved with the zip function (in Python 3, zip returns an iterator, so wrap it in list):
x = [1, 4, 7, 0, 5]
y = [.1, .7, .6, .8, .3]
values = list(zip(x, y))
values
[(1, 0.1), (4, 0.7), (7, 0.6), (0, 0.8), (5, 0.3)]
To sort such a list of pairs by a specific element of each pair, you may use sort's key parameter:
values.sort(key=lambda pair: pair[1])
values
[(1, 0.1), (5, 0.3), (7, 0.6), (4, 0.7), (0, 0.8)]
Then you may do whatever you want with this sorted list of pairs.

Number of Distinct Subarrays

I want to find an algorithm to count the number of distinct subarrays of an array.
For example, in the case of A = [1,2,1,2],
the number of distinct subarrays is 7:
{ [1] , [2] , [1,2] , [2,1] , [1,2,1] , [2,1,2], [1,2,1,2]}
and in the case of B = [1,1,1], the number of distinct subarrays is 3:
{ [1] , [1,1] , [1,1,1] }
A sub-array is a contiguous subsequence, or slice, of an array. Distinct means different contents; for example:
[1] from A[0:1] and [1] from A[2:3] are not distinct.
and similarly:
B[0:1], B[1:2], B[2:3] are not distinct.
Construct a suffix tree for this array, then add together the lengths of all edges in the tree: every distinct subarray is a prefix of exactly one suffix and corresponds to exactly one position along one edge, so the summed edge lengths equal the number of distinct subarrays.
The time needed to construct the suffix tree is O(n) with a proper algorithm (Ukkonen's or McCreight's). The time needed to traverse the tree and add the lengths together is also O(n).
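If a roughly O(n² log n) bound is acceptable, the same count can be obtained without a suffix tree by sorting the suffixes outright (a simplified stand-in for the suffix-tree idea; a sketch of mine, not from the answer): every subarray is a prefix of some suffix, so count all n(n+1)/2 prefixes and subtract the longest common prefixes of lexicographically adjacent suffixes.

def count_distinct_subarrays(a):
    n = len(a)
    # all suffixes, in lexicographic order
    suffixes = sorted(tuple(a[i:]) for i in range(n))

    def lcp(s, t):  # length of the longest common prefix
        k = 0
        while k < min(len(s), len(t)) and s[k] == t[k]:
            k += 1
        return k

    # prefixes shared by adjacent suffixes would be counted twice
    dup = sum(lcp(suffixes[i], suffixes[i + 1]) for i in range(n - 1))
    return n * (n + 1) // 2 - dup

print(count_distinct_subarrays([1, 2, 1, 2]))  # 7
print(count_distinct_subarrays([1, 1, 1]))     # 3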
Edit: I thought about how to reduce the number of iterations and comparisons, and I found a way: if a sub-array starting at a given position is already known, then every shorter sub-array starting at that same position has already been added, so we can stop scanning that position early.
Here is the updated code.
import java.util.ArrayList;
import java.util.List;

List<Integer> A = new ArrayList<>();
A.add(1);
A.add(2);
A.add(1);
A.add(2);
System.out.println("global list to study: " + A);

// global list of distinct sub-arrays found so far
List<List<Integer>> listOfUniqueList = new ArrayList<>();

// iterate on the start position in the list
for (int initialPos = 0; initialPos < A.size(); initialPos++) {
    // iterate on the sub-array size, from the longest possible down to 1
    for (int currentListSize = A.size() - initialPos; currentListSize > 0; currentListSize--) {
        // build the sub-array A[initialPos .. initialPos + currentListSize)
        List<Integer> currentList = new ArrayList<>();
        for (int i = 0; i < currentListSize; i++) {
            currentList.add(A.get(initialPos + i));
        }
        // ensure uniqueness; once a duplicate is found, every shorter
        // sub-array from this start position is already present
        if (!listOfUniqueList.contains(currentList)) {
            listOfUniqueList.add(currentList);
        } else {
            break;
        }
    }
}
System.out.println("list retrieved: " + listOfUniqueList);
System.out.println("size of list retrieved: " + listOfUniqueList.size());
global list to study: [1, 2, 1, 2]
list retrieved: [[1, 2, 1, 2], [1, 2, 1], [1, 2], [1], [2, 1, 2], [2, 1], [2]]
size of list retrieved: 7
With a list containing the same pattern many times, the number of iterations and comparisons stays quite low.
For your example [1, 2, 1, 2], the line if (!listOfUniqueList.contains(currentList)){ is executed 9 times. It only rises to 21 for the input [1, 2, 1, 2, 1, 2, 1, 2], which contains 15 different sub-arrays.
You could trivially make a set of the subarrays and count them, but I'm not certain it is the most efficient way; there are O(n²) of them, and each costs up to O(n) to build and hash.
In Python that would be something like:
subs = [tuple(A[i:j]) for i in range(0, len(A)) for j in range(i + 1, len(A) + 1)]
uniqSubs = set(subs)
which gives you:
set([(1, 2), (1, 2, 1), (1,), (1, 2, 1, 2), (2,), (2, 1), (2, 1, 2)])
The double loop in the comprehension clearly shows the O(n²) number of subarrays.
Edit
Apparently there is some discussion about the complexity. Creating subs yields O(n²) items, and creating a set from a list of m items takes O(m) insertions (adding to a set is amortized O(1) hashes), with m being n² here. Strictly speaking, though, building and hashing a slice of length up to n costs O(n) apiece, so the overall time and memory come to O(n³).
Right, my first answer was a bit of a blonde moment.
I guess the answer would be to generate them all and then remove duplicates. If you are using a language like Java with a set type, make all the sub-arrays and add them to a set; a set only keeps one instance of each element and removes duplicates automatically, so you can just take the size of the set at the end. (Use List<Integer> rather than int[] as the element type, though: Java arrays compare by reference, so a set of int[] would not deduplicate.)
I can think of two ways.
The first is to compute some sort of hash and add it to a set. If, on adding, your hash collides with an existing array's, do a verbose comparison, and log it so that you know your hash algorithm isn't good enough.
The second is to use some sort of probable match and then drill down from there: if the number of elements is the same and the totals of the elements added together are the same, then check verbosely.
Create an array of pairs, where each pair stores the value of an element and its index:
pair[i] = (A[i], i);
Sort the pairs in increasing order of A[i], breaking ties by decreasing index i.
Consider the example A = [1,3,6,3,6,3,1,3];
the pair array after sorting will be pair = [(1,6),(1,0),(3,7),(3,5),(3,3),(3,1),(6,4),(6,2)].
pair[0] has index 6. From index 6 we can form two sub-arrays, [1] and [1,3]. So ANS = 2.
Now take the consecutive pairs one by one.
Taking pair[0] and pair[1]: pair[1] has index 0, and 8 sub-arrays begin at index 0. But two of them, [1] and [1,3], are already counted. To remove them, compute the longest common prefix of the suffixes for pair[0] and pair[1]: the longest common prefix of the suffixes beginning at indices 0 and 6 has length 2, i.e. [1,3].
So the new distinct sub-arrays are [1,3,6] up to [1,3,6,3,6,3,1,3], i.e. 6 sub-arrays.
So the new value of ANS is 2 + 6 = 8.
In general, for pair[i] and pair[i+1]:
ANS = ANS + (number of sub-arrays beginning at pair[i+1]'s index) - (length of their longest common prefix).
The sorting part takes O(n log n).
Iterating over the consecutive pairs is O(n), and finding the longest common prefix in each iteration takes O(n), making the whole iteration part O(n²). It's the best I could get. One caveat: subtracting only the adjacent pair's common prefix is safe when suffixes sharing a prefix end up adjacent, which a full lexicographic suffix sort (a suffix array) guarantees; sorting by first element with an index tie-break alone may not.
You can see that we don't actually need the pair structure for this: the element values are only used as sort keys, so you can sort the indices directly and skip the pairs. I used them for better understanding.
