Interview question: reverse pairs in an array - arrays

I got this in an interview: two arrays [a,b] and [b,a] count as a reverse pair. For example, if the input array is [[a,b],[b,a],[a,c],[c,a],[a,b]], the output is 2 because there are two reverse pairs.
I can get the time complexity down to O(n) using a hashmap. Is there any way to do better than O(n)?

The algorithm is simple:
You iterate over the array and count the occurrences of each pair in a hashmap, getting a result like [ [a,b] : 2, [a,c] : 1, [b,a] : 1, ... ].
You iterate over the hashmap and, for each pair together with its reverse (taking each such couple only once, so nothing is counted twice), compute the minimum of the two occurrence counts. For example, [a,b] : 2 and [b,a] : 1 give 1.
You add up the results from step 2, which gives you the final result.
And you cannot do it faster than O(N), because you have to check every element at least once.
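The steps above could be sketched in Python roughly as follows. The function name `count_reverse_pairs` is made up for illustration, and the handling of self-reversed pairs such as [a,a] (they are skipped here) is an assumption, since the question does not say how they should count.

```python
from collections import Counter

def count_reverse_pairs(pairs):
    # Step 1: count how often each pair occurs.
    counts = Counter(tuple(p) for p in pairs)
    total = 0
    # Step 2: match each pair against its reverse.
    for pair, count in counts.items():
        reverse = pair[::-1]
        if reverse == pair:
            continue  # assumption: [a,a] is not a reverse pair of itself
        total += min(count, counts.get(reverse, 0))
    # Step 3: every couple was visited from both sides, so halve the sum.
    return total // 2

print(count_reverse_pairs([["a", "b"], ["b", "a"], ["a", "c"], ["c", "a"], ["a", "b"]]))
```

Both passes are linear in the number of pairs, assuming hashing a pair takes constant time.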

Related

Longest subarray with sum greater than K

I was trying to understand the code at https://www.geeksforgeeks.org/largest-subarray-having-sum-greater-than-k/amp/
However, I am not following it.
Specifically, I did not understand the following:
What does the minInd array hold?
What is the use of minInd in keeping track of the largest subarray?
What does the find method return?
An illustration with an example would be highly appreciated.
The trivial approach is to do it in O(N^2) time. However, you can do it in O(N log N) time at the expense of space. This solution doesn't follow geeksforgeeks precisely but will give you an O(N log N) solution. I have attached an image for better understanding.
Let's take the input array (-2, 3, 1, -2, 1, 1) and create its prefix sum array.
A prefix sum array holds, at each position, the sum of all elements up to and including the current one: [-2, 1, 2, 0, 1, 2]
e.g. at index 2 of the input array, the sum of all the elements before and including it is -2+3+1=2
Please note: in the approach below, "prefix sub array" and "array" mean the same thing. I will call the original input array the "original array".
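As a small illustration of the prefix sum array described above, here is a sketch using `itertools.accumulate` from the Python standard library. The helper name `prefix_sums` is only chosen for this example.

```python
from itertools import accumulate

def prefix_sums(original):
    # Element i of the result is the sum of original[0..i], inclusive.
    return list(accumulate(original))

print(prefix_sums([-2, 3, 1, -2, 1, 1]))
```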
You create a prefix sum array, store the index range used to create it (0-5), and put it into a queue.
Take the first item (a prefix sum array) out of the queue. If the last value in that prefix sum array is greater than or equal to K, then the index range you used to create this array is the answer.
If not, break this prefix sum array into two sub prefix sum arrays and add both of them to the queue. Store their index ranges too.
The first part consists of the prefix sum array from its start to its end-1. The second part (a slight change) consists of the prefix sum array from start+1 to end. In the second part you have to subtract the first element of the parent array from each element of the sub array (if you check the attached image it will become quite apparent).
Go back to step 2 and repeat until either the queue is empty or you find a solution.
Further optimization: you can use memoization to reduce branching. If you have already processed a prefix sub array sum, there is no need to evaluate it again.
Below is the diagram where the input is transformed into a prefix sum array and broken down repeatedly into prefix sub array sums until the last element of a prefix sub array sum is >= k.
Note: each node consists of a key (the index range we are considering for the prefix sub array) and a value (the actual prefix array sum).
Hopefully it will be clearer from the diagram.
Let's say k=3 and the input array is (-2, 3, 1, -2, 1, 1).
You create the prefix sum array: [{0-5=[-2, 1, 2, 0, 1, 2]}]
Since the last element of the prefix sum array is < k, you split it into two parts:
sub array from 0-4 => [-2, 1, 2, 0, 1]
sub array from 1-5 => [1, 2, 0, 1, 2] (then subtract -2 from each element) to get [3, 4, 2, 3, 4], a proper prefix sub array
Now the last element of this prefix sub array sum is 4 (greater than k), so the answer, the index range of such a sub array, is 1-5.
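The queue-based search walked through above might be sketched like this. It is a simplified illustration, not the geeksforgeeks code: instead of rebuilding each sub prefix array, it keeps one prefix array with a leading 0 and takes differences, which yields the same sums. The name `longest_subarray_sum_greater` is hypothetical, and the comparison is strict (sum > k) to match the question title. Note that this sketch makes no attempt to meet the O(N log N) bound mentioned earlier; even with memoization it can visit O(N^2) index ranges in the worst case.

```python
from collections import deque
from itertools import accumulate

def longest_subarray_sum_greater(arr, k):
    n = len(arr)
    if n == 0:
        return None
    # prefix[i] is the sum of arr[0..i-1]; the sum of arr[s..e] is prefix[e+1] - prefix[s].
    prefix = [0] + list(accumulate(arr))
    queue = deque([(0, n - 1)])
    seen = {(0, n - 1)}  # memoization: never enqueue the same range twice
    while queue:
        start, end = queue.popleft()
        if prefix[end + 1] - prefix[start] > k:
            return (start, end)  # ranges are dequeued longest first
        for nxt in ((start, end - 1), (start + 1, end)):
            if nxt[0] <= nxt[1] and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return None

print(longest_subarray_sum_greater([-2, 3, 1, -2, 1, 1], 3))
```

Because the queue is processed in first-in, first-out order, all ranges of a given length are examined before any shorter one, so the first hit is a longest qualifying subarray.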

algorithm which finds the numbers in a sequence which appear 3 times or more, and prints their indexes

Suppose I input a sequence of numbers that ends with -1.
I want to print all the values that occur in the sequence 3 times or more, and also print their indexes in the sequence.
For example, if the input is: 2 3 4 2 2 5 2 4 3 4 2 -1
then the expected output is:
2: 0 3 4 6 10
4: 2 7 9
First I thought of using quick-sort, but then I realized that I would lose the original indexes of the sequence. I have also been thinking of using counting, but the sequence has no given range of numbers, so maybe counting is no good in this case.
Now I wonder if I might use an array of pointers (but how?)
Do you have any suggestions or tips for an algorithm with time complexity O(n log n) for this? It would be much appreciated.
Keep it simple!
The easiest way would be to scan the sequence, count the number of occurrences of each element, and put the elements that match the condition in an auxiliary array.
Then, for each element in the auxiliary array, scan the sequence again and print out the indices.
First of all, sorry for my bad English (it's not my native language); I'll try my best.
So, similar to what @vvigilante said, here is an algorithm implemented in Python (Python because it is close to pseudocode, so you can translate it to any language you want; I also added a lot of comments, hope you get it!)
from typing import Dict, List

def three_or_more(input_arr: List[int]) -> None:
    indexes: Dict[int, List[int]] = {}
    # scan the array, stopping before the terminating -1 (the last element)
    for i in range(len(input_arr) - 1):
        # create the list for the number at position i
        # (if it doesn't exist yet)
        # and append the index to it
        indexes.setdefault(input_arr[i], []).append(i)
    # for each key in the dictionary
    for n, positions in indexes.items():
        # if the number of indexes for that key is >= 3
        if len(positions) >= 3:
            # print the key
            print("%d:" % n, end='')
            # print each index stored for the current key
            for el in positions:
                print(" %d" % el, end='')
            # new line
            print()

# call the function
three_or_more([2, 3, 4, 2, 2, 5, 2, 4, 3, 4, 2, -1])
Complexity:
The first loop scans the input array: O(N).
The second loop visits every distinct number in the array. There are at most N of them (you cannot have more distinct numbers than elements), so it is O(N).
The loop inside that loop goes through all indexes corresponding to the current number.
Its complexity seems to be O(N) in the worst case, which would make the total O(N) + O(N)*O(N) = O(N^2).
But remember that the two nested loops together print at most all N indexes, and since no index is repeated, their combined complexity is O(N).
So O(N) + O(N) = O(N).
As for memory, it is O(N) for the input array + O(N) for the dictionary (because it contains all N indexes) = O(N).
Well, if you do it in C++, remember that maps are way slower than arrays, so if N is small you should use an array of arrays (or std::vector<std::vector<int>>); otherwise you can also try an unordered map, which uses hashes.
P.S. Remember that getting the size of a vector takes O(1) time because it is a difference of pointers!
Starting with a sorted list is a good idea.
You could create a second array of original indices and duplicate all of the memory moves for the sort on the indices array. Then checking for triplicates is trivial and only requires sort + 1 traversal.
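One way to sketch the sort-with-indices idea in Python is to sort the index array by value rather than moving values and indices in parallel; this is a substitute for the manual bookkeeping described above, not the same mechanics. The name `three_or_more_sorted` is hypothetical, and the input is assumed to have the trailing -1 already removed.

```python
def three_or_more_sorted(seq):
    # Sort the indices by the value they point at. Python's sort is stable,
    # so indices of equal values stay in ascending order.
    order = sorted(range(len(seq)), key=lambda i: seq[i])
    result = {}
    start = 0
    # One traversal: find runs of equal values in the sorted order.
    for end in range(1, len(order) + 1):
        if end == len(order) or seq[order[end]] != seq[order[start]]:
            if end - start >= 3:
                result[seq[order[start]]] = order[start:end]
            start = end
    return result

for value, positions in three_or_more_sorted([2, 3, 4, 2, 2, 5, 2, 4, 3, 4, 2]).items():
    print("%d: %s" % (value, " ".join(map(str, positions))))
```

The sort dominates the cost at O(n log n), which matches the bound the question asks for.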

Find the most frequent triplet in an array

We have an array of N numbers. All the numbers are between 1 and k.
The problem is to find the best way of finding the most frequent triplet.
My approach to the problem is:
Say the input is { 1, 2, 3, 4, 1, 2, 3, 4 }.
First search for further occurrences of the triplet ( 1, 2, 3 ), starting from the second element of the array and going to the end of the array. Now we will have the count as 1.
Now start with ( 2, 3, 4 ) and search the array.
For each triplet we scan the array and find the count. This way we run over the array n-1 times.
So my algorithm runs in the order of n*n time complexity. Is there any better way to solve this problem?
You can do it in O(n * log n) worst-case space and time complexity: Just insert all triples into a balanced binary search tree and find the maximum afterwards.
Alternatively, you can use a hash table to get O(n) expected time (which is typically faster than the search tree approach in reality, if you choose a good hash function).
Are there any memory boundaries, i.e. does it run on a device with memory limitations?
If not, maybe this could be a good solution: iterate over the array and, for each triple, build a representation object (or a struct if implemented in C#) which goes into a map as the key, with the triple's counter as the value.
If you implement the hash and equals functions appropriately, you will be able to find the "most popular" triple whether or not the order of the numbers matters, e.g. 1,2,3 != 2,1,3 or 1,2,3 == 2,1,3.
After iterating over the entire array you would have to find the largest value, and its key would be your "most popular" triple. With that approach you could find the X most popular triples too. Also, you would scan the array only once and aggregate all the triples (no extra scanning for each triple).
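A minimal Python sketch of the map-based approach, assuming a "triple" means three consecutive elements as in the question's example. Tuples serve as the representation object, since they already hash and compare by content. The function name `most_frequent_triple` and the `ordered` flag are invented for this illustration.

```python
from collections import Counter

def most_frequent_triple(nums, ordered=True):
    counts = Counter()
    for i in range(len(nums) - 2):
        triple = tuple(nums[i:i + 3])
        if not ordered:
            # Treat 1,2,3 and 2,1,3 as the same key when order does not matter.
            triple = tuple(sorted(triple))
        counts[triple] += 1
    if not counts:
        return None
    # Returns a (triple, count) pair; most_common(x) would give the top x.
    return counts.most_common(1)[0]

print(most_frequent_triple([1, 2, 3, 1, 2, 3, 4]))
```

The array is scanned once, so the expected running time is O(n) with a hash-based map.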

Elements in array O(nlogn) complexity method for finding pairs

Okay, I keep getting stuck with the complexity here. There is an array of elements, say A[n]. Need to find all pairs so that A[i]>A[j] and also i < j.
So if it is {10, 8, 6, 7, 11}, the pairs would be (10,8) (10, 6), (10,7) and so on...
I did a merge sort in nlogn time and then a binary search for the entire array again in nlogn to get the indices of the elements in the sorted array.
So sortedArray={6 7 8 10 11} and index={3 2 0 1 4}
Irrespective of what I try, I keep getting another n^2 time in the complexity when I begin loops to compare. I mean, if I start for the first element i.e. 10, it is at index[2] which means there are 2 elements less than it. So if index[2]<index[i] then they can be accepted but that increases the complexity. Any thoughts? I don't want the code, just a hint in the right direction would be helpful.
Thanks. Everything I have been doing is in C, and time complexity is important here.
You cannot do this in under O(N^2), because the number of pairs that the algorithm will produce when the original array is sorted in descending order is N(N-1)/2. You simply cannot produce O(N^2) results in O(N*LogN) time.
The result consists of O(n^2) elements, so any attempt to iterate through all pairs will be O(n^2).
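To make the output-size argument concrete, here is a brute-force Python sketch (the question is about C, but Python keeps the example short). The name `inverted_pairs` is only illustrative. It lists every pair with i < j and A[i] > A[j], and shows that a descending array of length N yields N(N-1)/2 of them.

```python
def inverted_pairs(a):
    # Every (a[i], a[j]) with i < j and a[i] > a[j].
    return [(a[i], a[j])
            for i in range(len(a))
            for j in range(i + 1, len(a))
            if a[i] > a[j]]

print(inverted_pairs([10, 8, 6, 7, 11]))

n = 5
descending = list(range(n, 0, -1))
print(len(inverted_pairs(descending)), n * (n - 1) // 2)
```

Merely counting such pairs is a different problem from listing them, and the argument above applies only to listing.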

efficient sorted Cartesian product of 2 sorted array of integers

Need Hints to design an efficient algorithm that takes the following input and spits out the following output.
Input: two sorted arrays of integers A and B, each of length n
Output: One sorted array that consists of Cartesian product of arrays A and B.
For Example:
Input:
A is 1, 3, 5
B is 4, 8, 10
here n is 3.
Output:
4, 8, 10, 12, 20, 24, 30, 40, 50
Here are my attempts at solving this problem.
1) Given that the output has n^2 elements, an efficient algorithm can't do any better than O(n^2) time complexity.
2) First I tried a simple but inefficient approach. Generate the Cartesian product of A and B. This can be done in O(n^2) time. We need to store it so that we can sort it, therefore the space complexity is O(n^2) too. Now we sort n^2 elements, which can't be done better than O(n^2 log n) without making any assumptions about the input.
In the end I have an algorithm with O(n^2 log n) time and O(n^2) space complexity.
There must be a better algorithm, because I've not made use of the sorted nature of the input arrays.
If there's a solution that's better than O(n² log n) it needs to do more than just exploit the fact that A and B are already sorted. See my answer to this question.
Srikanth wonders how this can be done in O(n) space (not counting the space for the output). This can be done by generating the lists lazily.
Suppose we have A = 6,7,8 and B = 3,4,5. First, multiply every element in A by the first element in B, and store these in a list:
6×3 = 18, 7×3 = 21, 8×3 = 24
Find the smallest element of this list (6×3), output it, replace it with that element in A times the next element in B:
7×3 = 21, 6×4 = 24, 8×3 = 24
Find the new smallest element of this list (7×3), output it, and replace:
6×4 = 24, 8×3 = 24, 7×4 = 28
And so on. We only need O(n) space for this intermediate list, and finding the smallest element at each stage takes O(log n) time if we keep the list in a heap.
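The lazy generation described above could look roughly like this in Python, using the standard `heapq` module. The generator name `sorted_products` is made up for this sketch, and it assumes all values are non-negative, so that each element of A times successive elements of B is non-decreasing.

```python
import heapq

def sorted_products(a, b):
    # Assumes b is sorted ascending and all values are non-negative.
    if not b:
        return
    # One heap entry per element of a: (product, index into a, index into b).
    heap = [(x * b[0], i, 0) for i, x in enumerate(a)]
    heapq.heapify(heap)
    while heap:
        value, i, j = heapq.heappop(heap)
        yield value
        if j + 1 < len(b):
            # Replace the emitted product with a[i] times the next element of b.
            heapq.heappush(heap, (a[i] * b[j + 1], i, j + 1))

print(list(sorted_products([6, 7, 8], [3, 4, 5])))
```

The heap never holds more than one entry per element of A, which is where the O(n) intermediate space comes from; the total time is still O(n^2 log n).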
If you multiply a (non-negative) value of A with all values of B, the result list is still sorted. In your example:
A is 1, 3, 5
B is 4, 8, 10
1*(4,8,10) = 4,8,10
3*(4,8,10) = 12,24,30
Now you can merge the two lists (exactly like in merge sort). You just look at both list heads and put the smaller one in the result list. so here you would select 4, then 8 then 10 etc.
result = 4,8,10,12,24,30
Now you do the same for result list and the next remaining list merging 4,8,10,12,24,30 with 5*(4,8,10) = 20,40,50.
As merging is most efficient if both lists have the same length, you can modify that schema by dividing A in two parts, do the merging recursively for both parts, and merge both results.
Note that you can save some time using the merge approach, as it isn't required that A is sorted; only B needs to be sorted.
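The recursive variant, which splits A in two halves and merges the results, might be sketched as follows. `heapq.merge` from the standard library stands in for a hand-written two-way merge, and the function name `merged_products` is hypothetical. As with the text above, the sketch assumes non-negative values so that each multiplied copy of B stays sorted.

```python
import heapq

def merged_products(a, b):
    # Base cases: nothing to multiply, or a single multiplier times sorted b.
    if not a:
        return []
    if len(a) == 1:
        return [a[0] * y for y in b]
    # Split A in half so that both merged lists have similar lengths.
    mid = len(a) // 2
    left = merged_products(a[:mid], b)
    right = merged_products(a[mid:], b)
    return list(heapq.merge(left, right))

print(merged_products([1, 3, 5], [4, 8, 10]))
```

Since A is only ever split, never compared, its order does not affect the result.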

Resources