Let's say we have two arrays: A contains the time at which object i comes into a room, and B contains the time at which object i leaves. Neither array is sorted in any way, and their contents are real numbers.
For example, object 3 has: A[3]=0.785 and B[3]=4.829.
How would you find, in O(n log n), the maximum number of objects that are in the room at the same time?
You can try this:
initialize number of objects as zero
sort both arrays
while there are elements left in either array
    determine which array's first value is smaller
    if the first value in "enter" is smaller, increment number of objects and pop that value
    if the first value in "leave" is smaller, decrement number of objects and pop that value
    check whether you found a new maximum number of objects
If you cannot "pop" elements from the arrays, you can use two index variables instead; you will also have to add cases for when one of the arrays is already empty.
Sorting takes O(n log n) and the loop that follows takes O(2n) = O(n), so the total is O(n log n).
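The steps above can be sketched in Python with two index variables instead of popping. This is a minimal sketch, not a definitive implementation: the function name `max_objects_two_arrays` is made up for illustration, it assumes every object leaves no earlier than it enters, and it assumes that on a tie the entering object is counted before the leaving one.

```python
def max_objects_two_arrays(enter, leave):
    """Return the largest number of objects in the room at the same time."""
    enter = sorted(enter)
    leave = sorted(leave)
    i = j = 0            # next unprocessed position in each sorted array
    current = best = 0
    # Assumption: each object leaves no earlier than it enters, so the
    # "enter" array is exhausted first and leave[j] is always valid here.
    # Leaves that remain afterwards can only lower the count.
    while i < len(enter):
        if enter[i] <= leave[j]:   # assumption: ties count the arrival first
            current += 1
            i += 1
            best = max(best, current)
        else:
            current -= 1
            j += 1
    return best


print(max_objects_two_arrays([5, 2], [9, 7]))  # prints 2
```

The sort dominates the cost; the loop advances one of the two indices on every iteration, so it runs at most 2n times.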
Take all times from both arrays and make pairs {time; f}, where f = +1 for a time from A and f = -1 for a time from B.
Sort the array of all pairs by the time key (in case of a tie, +1 goes before -1).
Set count = 0.
Traverse the array of pairs, adding each f value to count.
The maximum value that count reaches is "the max objects in the room".
Example:
A = [2, 5], B = [7, 9]
pairs = (2,1),(5,1),(7,-1),(9,-1)
count = 1, 2, 1, 0
maxcount=2 at interval 5..7
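The pair-based procedure can be sketched as follows. The function name `max_objects_events` is an assumption made for illustration; the tie rule (+1 before -1) is the one stated in the steps above.

```python
def max_objects_events(a, b):
    """Return (max count, running counts) for enter times a and leave times b."""
    events = [(t, +1) for t in a] + [(t, -1) for t in b]
    # Sort by time; on equal times the +1 event is placed before the -1 event.
    events.sort(key=lambda e: (e[0], -e[1]))
    count = 0
    best = 0
    history = []
    for _, f in events:
        count += f
        history.append(count)
        best = max(best, count)
    return best, history


print(max_objects_events([2, 5], [7, 9]))  # prints (2, [1, 2, 1, 0])
```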
For example, given an array
a = [1, 2, 3, 7, 8, 9]
and an integer
i = 2, find the maximal subarrays in which the distance between the largest and the smallest element is at most i. The output for the example above would be:
[1,2,3] [7,8,9]
The subarrays are maximal in the following sense: given two subarrays A and B, there exists no element b in B such that A + b satisfies the given condition. Does there exist a non-polynomial time algorithm for said problem?
This problem can be solved in linear time using the two-pointer method and two deques that store indices: the first deque tracks the minimum of the sliding window, the other tracks the maximum.
Deque for minimum (similar for maximum):
current_minimum = a[minq.front]

Adding i-th element of array: //at the right index
    while (!minq.empty and a[minq.back] > a[i]):
        //last element has no chance to become a minimum because newer one is better
        minq.pop_back
    minq.push_back(i)

Extracting j-th element: //at the left index
    if (!minq.empty and minq.front == j)
        minq.pop_front
So the values at the indices stored in the min-deque always form a non-decreasing sequence.
Now set the left and right indices to 0, insert index 0 into the deques, and start moving right. At every step add the new index to the deques and check that the range of the left..right interval is good. When the range becomes too wide (the max-min distance is exceeded), stop moving the right index, check the length of the last good interval, and compare it with the best length.
Now move the left index, removing elements from the deques. When max-min becomes good again, stop moving left and start with right again. Repeat until the end of the array.
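The two-deque window can be sketched in Python as below. This is only a sketch under stated assumptions: it returns the first longest window rather than listing every maximal subarray, the name `longest_window` is made up for illustration, and `limit` is assumed to be non-negative.

```python
from collections import deque


def longest_window(a, limit):
    """Return inclusive (left, right) of the longest window with max - min <= limit.

    Assumes limit >= 0. Returns (0, -1) for an empty list.
    """
    minq = deque()   # indices; their values are non-decreasing front to back
    maxq = deque()   # indices; their values are non-increasing front to back
    best = (0, -1)
    left = 0
    for right, value in enumerate(a):
        # Older, larger values can never be the window minimum again.
        while minq and a[minq[-1]] > value:
            minq.pop()
        minq.append(right)
        # Older, smaller values can never be the window maximum again.
        while maxq and a[maxq[-1]] < value:
            maxq.pop()
        maxq.append(right)
        # Window too wide: advance the left index, dropping it from the deques.
        while a[maxq[0]] - a[minq[0]] > limit:
            if minq[0] == left:
                minq.popleft()
            if maxq[0] == left:
                maxq.popleft()
            left += 1
        if right - left > best[1] - best[0]:
            best = (left, right)
    return best


print(longest_window([1, 2, 3, 7, 8, 9], 2))  # prints (0, 2)
```

Each index is pushed and popped at most once per deque, which is where the linear bound comes from.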
Imagine there's an array of integers, but you aren't allowed to access any of the values (so no Arr[i] > Arr[i+1] or whatever). The only way to tell the integers apart is by using a query() function: this function takes a subset of elements as input and returns the number of unique integers in this subset. The goal is to partition the integers into groups based on their values: integers in the same group should have the same value, while integers in different groups have different values.
The catch: the code has to be O(nlog(n)), or in other words the query() function can only be called O(nlog(n)) times.
I've spent hours optimizing different algorithms in Python, but all of them have been O(n^2). For reference, here's the code I started out with:
import random

n = 100
querycalls = 0
secretarray = [random.randint(0, n-1) for i in range(n)]

def query(items):
    global querycalls
    querycalls += 1
    return len(set(items))

groups = []
secretarray is a random list of numbers of length n. querycalls keeps track of how many times the function is called. groups is where the results go.
The first thing I tried was an algorithm based on merge sort (split the arrays down and then merge based on the query() value), but I could never get it below O(n^2).
Let's say you have an element x and an array of distinct elements, A = [x0, x1, ..., x_{k-1}], and want to know whether x is equivalent to some element of the array and, if so, to which one.
What you can do is a simple recursion (let's call it check-eq):
Check if query([x, A]) == k + 1. If yes, then you know that x is distinct from every element in A.
Otherwise, you know that x is equivalent to some element of A. Let A1 = A[:k/2], A2 = A[k/2:]. If query([x, A1]) == len(A1), then you know that x is equivalent to some element in A1, so recurse in A1. Otherwise recurse in A2.
This recursion takes at most O(log k) steps. Now, let our initial array be T = [x0, x1, ..., x_{n-1}]. A will be an array of "representatives" of the groups of elements. First take A = [x0] and x = x1. Use check-eq to see whether x1 is in the same group as x0. If not, let A = [x0, x1]. Otherwise leave A unchanged. Proceed with x = x2, and so on.
The complexity is O(n log n), because check-eq is called exactly n-1 times and each call makes O(log n) queries.
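A minimal Python sketch of the check-eq idea follows. It is an illustration under assumptions, not the asker's exact setup: `group_by_value` and `make_query` are hypothetical names, and this `query` takes a list of indices into the hidden array rather than the values themselves. The recursion is written as a loop over a shrinking slice of representatives.

```python
def make_query(secret):
    """Hypothetical stand-in for the black-box query(); it counts its own calls."""
    stats = {"calls": 0}

    def query(indices):
        stats["calls"] += 1
        return len({secret[i] for i in indices})

    return query, stats


def group_by_value(n, query):
    """Partition indices 0..n-1 into groups of equal value using only query()."""
    reps = []     # one representative index per group found so far
    groups = []   # groups[g] holds every index equal to reps[g]
    for x in range(n):
        # One query tells whether x differs from every representative.
        if query(reps + [x]) == len(reps) + 1:
            reps.append(x)
            groups.append([x])
            continue
        # x matches exactly one representative in reps[lo:hi]; halve the range.
        lo, hi = 0, len(reps)
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if query(reps[lo:mid] + [x]) == mid - lo:
                hi = mid      # adding x created no new value: match is in the left half
            else:
                lo = mid      # otherwise the match is in the right half
        groups[lo].append(x)
    return groups


query, stats = make_query([3, 1, 3, 2, 1, 3])
print(group_by_value(6, query))  # prints [[0, 2, 5], [1, 4], [3]]
```

Each element costs one query plus roughly log2 of the number of groups found so far, which matches the O(n log n) bound on calls.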
I was trying to understand the code at https://www.geeksforgeeks.org/largest-subarray-having-sum-greater-than-k/amp/
However, I am not following it.
Specifically, I did not understand the following:
What does the minInd array hold?
What is the use of minInd in keeping track of the largest subarray?
What does the find method return?
An illustration with an example would be highly appreciated.
The trivial approach is to do it in O(N^2) time. However, you can do it in O(N log N) time at the expense of space. The solution below doesn't follow geeksforgeeks precisely and is meant to be easier to picture; be aware that it can visit up to N(N+1)/2 index ranges in the worst case, so by itself it does not reach the O(N log N) bound. I have attached an image for better understanding.
Let's take the input array (-2, 3, 1, -2, 1, 1) and create its prefix sum array.
A prefix sum array holds, at each position, the sum of all elements up to and including the current element: [-2, 1, 2, 0, 1, 2]
e.g. at index 2 of the input array, the sum of all the elements before and including it is -2+3+1=2
Please note: in the approach below, "prefix sub array" and "array" mean the same thing. I will call the original input array the original array.
1. Create the prefix sum array, store the index range used to create it (0-5), and put it into a queue.
2. Take the first entry (a prefix sum array) out of the queue. If the last value in the prefix sum array is greater than or equal to K, then the index range used to create this array is the answer.
3. If not, break this prefix array into two sub prefix sum arrays and add both of them to the queue, storing their index ranges too. The first consists of the prefix sum array from its start to its end - 1. The second (a slight change) consists of the prefix sum array from start + 1 to end; here you have to subtract the first element from each element of the sub array (the attached image makes this apparent).
4. Go back to step 2 and repeat until either the queue is empty or you find a solution.
Further optimization: you can use memoization to reduce branching. If you have already processed a prefix sub array sum, there is no need to evaluate it again.
Below is the diagram in which the input is transformed into a prefix sum array and broken down repeatedly into prefix sub array sums until the last element of a prefix sub array sum is >= k.
Note: each node consists of a key, the index range we are considering for the prefix sub array, and a value, the actual prefix sum array.
Hopefully it will be clearer from the diagram.
Let's say k=3 and the input array = (-2, 3, 1, -2, 1, 1).
You create the prefix sum array = [{0-5=[-2, 1, 2, 0, 1, 2]}].
Since the last element of the prefix sum array is < k, you split it into two parts:
sub array from 0-4 => [-2, 1, 2, 0, 1]
sub array from 1-5 => [1, 2, 0, 1, 2] (and subtract -2 from each element) to get [3, 4, 2, 3, 4], a proper prefix sub array
The sub array 0-4 ends in 1, which is less than k, so it is split further. The sub array 1-5 ends in 4 (greater than k), so the index range of the subarray you are looking for is 1-5.
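The queue-based search can be sketched in Python as below. This is a sketch under assumptions rather than the answer's exact data layout: instead of copying and re-basing a prefix array at every node, it keeps one prefix sum array and computes each range sum as a difference, and it uses a `seen` set as the memoization. The name `longest_subarray_sum_at_least` is made up for illustration, and the `>=` test follows step 2 of the description. Because ranges come off the queue in order of decreasing length, the first hit is a longest one; in the worst case every one of the N(N+1)/2 ranges is visited.

```python
from collections import deque
from itertools import accumulate


def longest_subarray_sum_at_least(arr, k):
    """Return inclusive (start, end) of a longest subarray with sum >= k, or None."""
    if not arr:
        return None
    prefix = [0] + list(accumulate(arr))   # prefix[i] = sum of arr[:i]
    first = (0, len(arr) - 1)
    queue = deque([first])
    seen = {first}                          # memoization: never enqueue a range twice
    while queue:
        lo, hi = queue.popleft()
        if prefix[hi + 1] - prefix[lo] >= k:
            return lo, hi
        if lo < hi:
            # Split: drop the last element, or drop the first element.
            for nxt in ((lo, hi - 1), (lo + 1, hi)):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return None


print(longest_subarray_sum_at_least([-2, 3, 1, -2, 1, 1], 3))  # prints (1, 5)
```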
Suppose I input a sequence of numbers which ends with -1.
I want to print all the values of the sequence that occur in it 3 times or more, and also print their indexes in the sequence.
For example, if the input is: 2 3 4 2 2 5 2 4 3 4 2 -1
then the expected output is:
2: 0 3 4 6 10
4: 2 7 9
First I thought of using quick-sort, but then I realized that I would lose the original indexes of the sequence. I have also been thinking of using counting, but the sequence has no given range of numbers, so counting may be no good in this case.
Now I wonder whether I might use an array of pointers (but how?).
Do you have any suggestions or tips for an algorithm with time complexity O(nlogn) for this? It would be much appreciated.
Keep it simple!
The easiest way would be to scan the sequence, count the number of occurrences of each element, and put the elements that match the condition in an auxiliary array.
Then, for each element in the auxiliary array, scan the sequence again and print out the indices.
First of all, sorry for my bad English (it's not my first language); I'll try my best.
Similar to what #vvigilante said, here is an algorithm implemented in Python (Python because it is closest to pseudocode, so you can translate it to any language you want; I have also added a lot of comments, hope you get it!)
from typing import Dict, List

def three_or_more(input_arr: List[int]) -> None:
    # maps each number to the list of positions where it occurs
    indexes: Dict[int, List[int]] = {}
    # scan the array; the last element is the -1 terminator, so it is skipped
    for i in range(len(input_arr) - 1):
        # create the list for the number in position i
        # (if it doesn't exist)
        # and append the position
        indexes.setdefault(input_arr[i], []).append(i)
    # for each key in the dictionary
    for n in indexes.keys():
        # if the number of elements for that key is >= 3
        if len(indexes[n]) >= 3:
            # print the key
            print("%d:" % n, end='')
            # print each position stored for the current key
            for el in indexes[n]:
                print(" %d" % el, end='')
            # new line
            print()

# call the function
three_or_more([2, 3, 4, 2, 2, 5, 2, 4, 3, 4, 2, -1])
Complexity:
The first loop scans the input array: O(N).
The second loop visits every distinct number in the array.
There can be no more distinct numbers than elements, so it runs at most N times: O(N).
The loop inside that loop goes through all the indexes stored for the current number,
which looks like O(N) per number in the worst case (but it is not).
That would give O(N) + O(N)*O(N) = O(N^2),
but the two nested loops together print at most N indexes, and since no index is stored twice, their combined cost is O(N).
So O(N) + O(N) ~= O(N).
As for memory, it is O(N) for the input array + O(N) for the dictionary (because it contains all N indexes) ~= O(N).
If you do it in C++, remember that maps are way slower than arrays, so if N is small you should use an array of arrays (or std::vector<std::vector<int>>); otherwise you can also try an unordered map, which uses hashes.
P.S. Remember that getting the size of a vector takes O(1) time, because it is a difference of pointers!
Starting with a sorted list is a good idea.
You could create a second array holding the original indices and mirror every move the sort makes on that indices array. Then checking for triplicates is trivial and only requires the sort + 1 traversal.
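One way to sketch the sort-with-indices idea in Python is to sort the index array by value instead of moving values and indices in parallel, which is a small substitution on the approach described above. The name `three_or_more_sorted` is made up for illustration, and the input is assumed to have the -1 terminator already removed.

```python
def three_or_more_sorted(seq):
    """Map each value occurring at least 3 times to its original positions."""
    # Sort the indices by the value they point at. sorted() is stable,
    # so equal values keep their original index order.
    order = sorted(range(len(seq)), key=lambda i: seq[i])
    result = {}
    start = 0
    # One traversal over the sorted order, closing a run when the value changes.
    for end in range(1, len(order) + 1):
        if end == len(order) or seq[order[end]] != seq[order[start]]:
            if end - start >= 3:
                result[seq[order[start]]] = order[start:end]
            start = end
    return result


for value, positions in three_or_more_sorted([2, 3, 4, 2, 2, 5, 2, 4, 3, 4, 2]).items():
    print("%d: %s" % (value, " ".join(map(str, positions))))
# prints:
# 2: 0 3 4 6 10
# 4: 2 7 9
```

The sort costs O(n log n) and the traversal O(n), which fits the bound asked for in the question.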
I need to design an algorithm that finds the k'th smallest element in an unsorted array using a function called "MED3":
This function finds the n/3 (floor) and 2n/3 (ceil) elements the array would have if it were sorted (very similar to a median, but instead of the n/2 element it returns those two values).
I thought about some kind of partition around those 2 values and then continuing like QuickSelect, but the problem is that "MED3" doesn't return the indices of the 2 values, only the values.
For example, if the array is 1, 2, 10, 1, 7, 6, 3, 4, 4, it returns 2 (the n/3 value) and 4 (the 2n/3 value).
I also thought of running over the array, copying all the values between 2 and 4 (for the array above) to a new array, and then using "MED3" again, but there can be duplicates (if the array is 2, 2, 2, 2, ..., 2, I would take all the elements each time).
Any ideas? I must use "MED3".
* MED3 is like a black box; it runs in linear time.
Thank you.
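I think you're on the right track, but instead of taking the values from 2 to 4, I'd suggest removing n/3 values that are <= MED3.floor() and n/3 values that are >= MED3.ceil(). That avoids issues with too many duplicates. Be careful about which duplicates you remove: simply taking the first n/3 values <= MED3.floor() can discard copies of MED3.floor() while smaller values stay behind, which shifts the ranks. The safe version takes two passes per cycle: remove all values < MED3.floor() plus enough values = MED3.floor() to reach a total of n/3 (do the same for ceil()),
adjust k by the number of smaller values removed, then repeat until you reach the k'th smallest target.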
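A closely related variant, sketched below, sidesteps the duplicate problem by partitioning around both MED3 values by value and treating the copies of each pivot as their own bucket. It is not the exact removal scheme described above. `med3` here is only a stand-in for the black box (it sorts, purely for illustration), `kth_smallest` is a made-up name, k is assumed to be 1-based, and for n < 3 the lower rank is clamped to 1.

```python
def med3(values):
    """Stand-in for the black-box MED3; sorting is used only for illustration."""
    s = sorted(values)
    n = len(s)
    low_rank = max(n // 3, 1)        # floor(n/3), clamped to 1 (assumption for n < 3)
    high_rank = (2 * n + 2) // 3     # ceil(2n/3)
    return s[low_rank - 1], s[high_rank - 1]


def kth_smallest(values, k):
    """Return the k'th smallest element (k is 1-based) using only med3 and scans."""
    values = list(values)
    if not 1 <= k <= len(values):
        raise ValueError("k out of range")
    while True:
        lo, hi = med3(values)
        less = [v for v in values if v < lo]
        if k <= len(less):
            values = less                      # target is below the lower pivot
            continue
        k -= len(less)
        n_lo = values.count(lo)
        if k <= n_lo:
            return lo                          # target is a copy of the lower pivot
        k -= n_lo
        between = [v for v in values if lo < v < hi]
        if k <= len(between):
            values = between                   # target lies strictly between the pivots
            continue
        k -= len(between)
        n_hi = values.count(hi) if hi != lo else 0
        if k <= n_hi:
            return hi                          # target is a copy of the upper pivot
        k -= n_hi
        values = [v for v in values if v > hi]  # target is above the upper pivot


data = [1, 2, 10, 1, 7, 6, 3, 4, 4]
print(med3(data))             # prints (2, 4)
print(kth_smallest(data, 5))  # prints 4
```

Every bucket that the loop continues with excludes both pivot values and holds roughly a third of the elements or fewer, so with a linear-time MED3 the scans add up to a geometric series, and an all-equal array returns on the first pass.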