Answering Queries on List of Subarrays - arrays

Given an array A of size N, we construct a list containing all possible subarrays of A in descending order.
Two subarrays B and C are compared by padding them with zeroes until both are of size N. Then we compare the two subarrays element by element and stop as soon as a point of difference is observed.
We are given multiple queries: given x, we have to find the maximum element in the xth subarray of the list sorted in the order described above.
For example, if the array A is [3, 1, 2, 4]; then the sorted subarrays will be:
[4]
[3, 1, 2, 4]
[3, 1, 2]
[3, 1]
[3]
[2, 4]
[2]
[1, 2, 4]
[1, 2]
[1]
A query where x = 3 corresponds to finding the maximum element in the subarray [3, 1, 2]; so here the answer would be 3.
Since the number of queries is large (of the order of 10^5) and the number of elements in the array can also be large (of the order of 10^5), we would need to do some preprocessing to answer each query in O(1), O(log N) or O(sqrt N) time. I can't seem to figure out how to do this. I have solved it for the case where the array contains unique elements, but how could we do this when the array contains repetitions? Is there any data structure which could help in storing the required information?

Build a suffix array for this array in reverse (descending) order, treating the array like a string.
For every entry, store its length and a cumulative count (the sum of the lengths of all entries before it in the suffix array).
For a query, find the needed entry by binary search over the cumulative counts, then take the needed prefix of the suffix found.
For your example, the suffixes with cumulative counts are:
4 (0)
3124 (1)
24 (5)
124 (7)
Query 3 finds entry 3124 (1 < 3 <= 5) and takes the 3-1 = 2nd prefix (counting from the longest) = 312.
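A minimal Python sketch of these steps, not a definitive implementation. The names build_index and query_max are made up for illustration, the suffixes are sorted naively instead of with a real suffix array construction, and the maximum is taken with a plain max() rather than a fast range-maximum structure. It assumes distinct elements, as in the example; when repeated values make different suffixes share a prefix, the ordering needs extra handling that is not covered here.

```python
from bisect import bisect_left

def build_index(a):
    """Return suffix start positions in descending suffix order,
    plus the cumulative count of subarrays listed before each suffix."""
    n = len(a)
    # Naive sort of suffixes (illustration only; a real suffix array is faster).
    starts = sorted(range(n), key=lambda i: a[i:], reverse=True)
    cum = []
    total = 0
    for s in starts:
        cum.append(total)   # subarrays listed before this suffix
        total += n - s      # this suffix contributes one prefix per length
    return starts, cum

def query_max(a, starts, cum, x):
    """Maximum of the x-th subarray (x is 1-indexed)."""
    k = bisect_left(cum, x) - 1       # last entry with cum[k] < x
    start = starts[k]
    rank = x - cum[k]                 # 1 means the longest prefix
    length = (len(a) - start) - rank + 1
    return max(a[start:start + length])

a = [3, 1, 2, 4]
starts, cum = build_index(a)
print(query_max(a, starts, cum, 3))   # subarray [3, 1, 2]
```

To reach the requested query time, the max() call would be replaced by a precomputed range-maximum lookup such as a sparse table.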

Related

How to understand the index of np.array

I am learning Python's numpy.array and am confused about how indexing works. Let's say I have the following 3x4 2D array:
A = np.array([[ 1,  2,  3,  4],
              [ 5,  6,  7,  8],
              [ 9, 10, 11, 12]])
If I want to extract the 1 from this array, I need to input the index of that number, which is A[0,0]
Out of curiosity I also tried the following
B = A[[0,0]]
C = A[[0],[0]]
B turns out to be a 2x4 2D array:
array([[1, 2, 3, 4],
       [1, 2, 3, 4]])
C turns out to be a 1D array of 1 element:
array([1])
I am wondering how indexing of B and C works and why I obtain those arrays?
In B, you give only one index, the list [0,0], for a 2-dimensional array. The list is applied to the first dimension, so it returns the row at each listed position (0 and 0 here).
So, for the first index (which is 0) it returns the first row, [1,2,3,4], and for the next index, which is again 0, it returns the same row again, so you get two copies of [1,2,3,4].
Next, in C you give two indices for a 2-dimensional array, [0] and [0]. They are paired up as (row, column), so the result holds the single element A[0,0], which is [1] as you have got.
For better understanding, let's see another case, A[[0,1],2].
Here we give two indices for a 2-dimensional array, [0,1] and 2. The scalar 2 is broadcast against [0,1], so we get the elements at [0,2] and then [1,2]. The output will be [3,7].
The thing is, the index lists are broadcast against each other and paired element by element; NumPy does not take all possible combinations of them. For example, A[[0,1],[2,3]] returns [3,8], the elements at [0,2] and [1,3].
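A short sketch that runs the cases discussed above so the shapes can be checked directly. The names D, E and F are added here for illustration; np.ix_ is included only to show how to get every combination of the listed rows and columns when that is what you want.

```python
import numpy as np

A = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8],
              [9, 10, 11, 12]])

B = A[[0, 0]]          # one index list: rows 0 and 0, shape (2, 4)
C = A[[0], [0]]        # row list paired with column list: A[0, 0], shape (1,)
D = A[[0, 1], 2]       # 2 is broadcast to [2, 2]: A[0, 2] and A[1, 2]
E = A[[0, 1], [2, 3]]  # paired element by element: A[0, 2] and A[1, 3]
F = A[np.ix_([0, 1], [2, 3])]  # all combinations: rows 0-1, columns 2-3

for name, arr in [("B", B), ("C", C), ("D", D), ("E", E), ("F", F)]:
    print(name, arr.shape, arr.tolist())
```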

Generate 2D array with adjacent elements not being x+1

I need to develop an algorithm which would accept two numbers m and n - dimensions of 2D array - as input and generate 2D array filled with numbers [1..m*n] with the following condition:
All (4) elements adjacent to a given element cannot be equal to currentElement + 1
Adjacent elements are located to the two/three/four sides (depending on position) of a given element
0 1 0
1 2 1
0 1 0
(E.g four 1s are adjacent to 2)
Example:
Input: m = 3, n = 3 (it does not necessarily have to be a square matrix)
(Sample) output:
[
[7, 2, 5],
[1, 6, 9],
[3, 8, 4]
]
Note that there apparently may exist more than one possible output. In that case, numbers in the array have to be generated randomly (though still meeting the conditions), not following any preset sequence (e.g not [ [1, 3, 5], [4, 6, 2], [7, 9, 8] ] because it clearly uses a non-randomly generated sequence of numbers, odds first, then evens, etc)
Basically, for the same input, on two different occasions, two different arrays should be generated.
P.S: that was a coding interview question and I wonder how I could solve it, so, any help is highly appreciated.
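One possible starting point, offered as a rough sketch rather than the intended interview answer: check the condition with a small helper and generate candidates by shuffling until one passes (plain rejection sampling). The names is_valid and random_grid and the max_tries parameter are made up here. Some sizes, such as 1x2, have no valid arrangement at all, and the number of retries needed for large grids is not analysed, so the search is capped and returns None on failure.

```python
import random

def is_valid(grid):
    """True if no cell has a 4-neighbour equal to its own value + 1."""
    m, n = len(grid), len(grid[0])
    for r in range(m):
        for c in range(n):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < m and 0 <= cc < n and grid[rr][cc] == grid[r][c] + 1:
                    return False
    return True

def random_grid(m, n, max_tries=100000):
    """Shuffle 1..m*n into an m x n grid until the condition holds."""
    values = list(range(1, m * n + 1))
    for _ in range(max_tries):
        random.shuffle(values)
        grid = [values[r * n:(r + 1) * n] for r in range(m)]
        if is_valid(grid):
            return grid
    return None  # gave up; possibly no valid arrangement exists

print(is_valid([[7, 2, 5], [1, 6, 9], [3, 8, 4]]))  # the sample output
print(random_grid(3, 3))
```

Because every shuffle is independent, two runs on the same input will usually give different grids, which matches the randomness requirement.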

array size n into k same size groups

I have an unsorted array of size n and I need to find k-1 dividers so that every subset is of the same size (as it would be after the array is sorted).
I have seen this question with k-1=3. I guess I need the median of medians, and this will take O(n). But I think we should do it k times, so O(nk).
I would like to understand why it would take O(n log k).
For example: I have an unsorted array of integers and I want to find the k-1 dividers, that is, the k-1 integers that split the array into k (same-sized) subarrays according to their values.
If I have [1, 13, 6, 7, 81, 9, 10, 11], then for k=3 the dividers are [7, 11], splitting it into [1, 6], [9, 10], [13, 81], where every subset has the same size, 2.
You can use a divide-and-conquer approach. First, find the (k-1)/2th divider using the median-of-medians algorithm. Next, use the selected element to partition the list into two sub-lists. Repeat the algorithm on each sub-list to find the remaining dividers.
The maximum recursion depth is O(log k) and the total cost across all sub-lists at each level is O(n), so this is an O(n log k) algorithm.
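A sketch of this divide-and-conquer idea in Python, with some assumptions: the names select_partition and dividers are made up, the dividers are treated as separate elements as in the question's example (so n = k*g + (k-1) for group size g), and a randomized quickselect-style partition is swapped in for median-of-medians to keep the code short. That gives expected rather than worst-case O(n log k).

```python
import random

def select_partition(a, lo, hi, idx):
    """Rearrange a[lo:hi] in place so a[idx] holds its sorted value,
    with smaller-or-equal values to its left and larger-or-equal to its right."""
    while hi - lo > 1:
        pivot = a[random.randrange(lo, hi)]
        lt, i, gt = lo, lo, hi
        while i < gt:                      # three-way partition around pivot
            if a[i] < pivot:
                a[lt], a[i] = a[i], a[lt]
                lt += 1
                i += 1
            elif a[i] > pivot:
                gt -= 1
                a[i], a[gt] = a[gt], a[i]
            else:
                i += 1
        if idx < lt:
            hi = lt
        elif idx >= gt:
            lo = gt
        else:
            return                         # idx falls inside the pivot block

def dividers(values, k):
    a = list(values)
    n = len(a)
    g = (n - (k - 1)) // k                 # group size, assuming n = k*g + (k-1)
    positions = [j * (g + 1) - 1 for j in range(1, k)]  # sorted index of each divider

    def rec(lo, hi, p_lo, p_hi):
        # Place dividers positions[p_lo:p_hi], all of which lie in a[lo:hi].
        if p_lo >= p_hi:
            return
        mid = (p_lo + p_hi) // 2           # middle divider first
        idx = positions[mid]
        select_partition(a, lo, hi, idx)
        rec(lo, idx, p_lo, mid)            # remaining dividers on the left
        rec(idx + 1, hi, mid + 1, p_hi)    # remaining dividers on the right

    rec(0, n, 0, k - 1)
    return [a[p] for p in positions]

print(dividers([1, 13, 6, 7, 81, 9, 10, 11], 3))
```

Each level of the recursion halves the number of dividers still to be found while the sub-lists at that level together cover at most n elements, which is where the log k factor comes from.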

get the average equilibrium from an array

There are n elements in the array. I need to divide the array into two parts such that the average of both parts is the same.
Say you have an array [1, 2, 3]. Here the elements [1, 3] have an average of 2, and the element [2] also has an average of 2.
Another example is [1, 2, 5, 4]. Here the elements [1, 5] have an average of 3, and the elements [2, 4] also have an average of 3.
So, if such a split exists, I should flag "Yes", otherwise "No". What data structure/algorithm would you recommend for such a problem?
I tried something along the lines of this:
http://www.geeksforgeeks.org/equilibrium-index-of-an-array/
but it did not work.
I'm not an expert in algorithms, and the only solution I can think of now is a bit brutal:
compute avg(array)
if there is an element with the same value as the avg => done
sort the array
starting from the biggest element, I would calculate the avg with the others, starting from the smallest ones, with tail recursion (until they no longer give a result higher than the calculated avg, or they give exactly the calculated avg)
if I find a combination which gives the calculated avg, the remaining numbers will give the same avg for sure
Unluckily, I don't remember any kind of useful theorem about the average...
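The last observation can be turned into a direct check: a split exists exactly when some non-empty proper subset has the same average as the whole array. Below is a hedged sketch of that check, not the greedy search described above; it swaps in a table of reachable (count, sum) pairs. The name can_split_equal_average is made up, integer inputs are assumed, and the running time grows with the number of distinct sums, so it is only meant for modest inputs.

```python
def can_split_equal_average(nums):
    n, total = len(nums), sum(nums)
    # reachable[k] = set of sums obtainable using exactly k elements
    reachable = [set() for _ in range(n + 1)]
    reachable[0].add(0)
    for v in nums:
        # Go through counts downwards so each element is used at most once.
        for k in range(n - 1, -1, -1):
            for s in reachable[k]:
                reachable[k + 1].add(s + v)
    # A subset of size k matches the overall average iff its sum is total*k/n.
    return any(
        (total * k) % n == 0 and (total * k) // n in reachable[k]
        for k in range(1, n)
    )

print("Yes" if can_split_equal_average([1, 2, 3]) else "No")
print("Yes" if can_split_equal_average([1, 2, 5, 4]) else "No")
print("Yes" if can_split_equal_average([1, 2, 4]) else "No")
```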

Generate unique integer key for set of lists

I have a number of integer vectors all the same length. They could contain any signed int16.
I need to create a single unique number for each vector, but vectors which have the same content must be given the same number.
E.g the following vectors:
[1, 2, 3, 4]
[1, 2, 3, 4]
[6, 2, 4, 1]
might be assigned the numbers 2, 2 and 4.
Also order counts. So the vectors
[1, 2, 3, 4]
[2, 1, 4, 3]
should get different values.
Is there any reliable way to calculate a single number for such a set of vectors?
To sum up the value must:
Be the same for vectors which are exactly the same (order counts!)
Be guaranteed to be unique for different values
The value must be calculated for one vector at a time, i.e. you are given a vector, you get the value, then you get the next vector and so on.
The whole purpose of this is that I'm interested in an alternative way of indexing distinct vectors, as opposed to e.g. adding them all to an ordered set or similar.
To guarantee perfect uniqueness, you will have to compose one large number out of every single number in the vector. Given that you allow signed int16 values, a vector of four elements gives the following 64-bit hash key:
[n1, n2, n3, n4] => n1 + n2*2^16 + n3*2^32 + n4*2^48
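A small Python sketch of this packing, under a few assumptions: the names vector_key and key_to_vector are made up, and each value is stored as its 16-bit two's-complement pattern (v & 0xFFFF), so negative values yield a different number than the formula above but the key stays non-negative and just as unique. Python integers are unbounded, so longer vectors simply give keys wider than 64 bits.

```python
def vector_key(vec):
    """Pack a sequence of signed int16 values into one integer, 16 bits each."""
    key = 0
    for i, v in enumerate(vec):
        if not -32768 <= v <= 32767:
            raise ValueError("value out of int16 range: %r" % (v,))
        key |= (v & 0xFFFF) << (16 * i)   # low 16 bits, two's complement
    return key

def key_to_vector(key, length):
    """Inverse of vector_key; shows that no information is lost."""
    out = []
    for i in range(length):
        u = (key >> (16 * i)) & 0xFFFF
        out.append(u - 0x10000 if u >= 0x8000 else u)
    return out

print(vector_key([1, 2, 3, 4]) == vector_key([1, 2, 3, 4]))  # same content, same key
print(vector_key([1, 2, 3, 4]) == vector_key([2, 1, 4, 3]))  # order counts
```

Since the key can be decoded back to the original vector, two different vectors of the same length can never share a key.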
