Is it possible to check (in Java) whether or not an array is sorted with O(1) worst-case time complexity?
It's impossible to have a correct algorithm with better than O(n) complexity.
Let's prove it by contradiction. Suppose we are given an algorithm with better than O(n) complexity. We can then provide a test array
{1, 2, 3, ..., n}
where n is large enough that the algorithm has to skip some items (note that if the algorithm inspects all items, it has at least O(n) time complexity). If the algorithm returns false, it's incorrect; if it returns true, we have to create one more test. Let m be an item that is not inspected:
{1, 2, 3, ... m - 1, m, m + 1,... n}
^
Not inspected by the algorithm.
Let's create the test array as it was before but change m into n + 1 (or 1 if m == n)
{1, 2, 3, ..., m - 1, n + 1, m + 1, ... n}
^
we changed m into n + 1
Since m is not inspected, the algorithm still returns true, which is now incorrect. So any algorithm with time complexity better than O(n) is incorrect; to put it differently, there is no correct algorithm with better than O(n) time complexity.
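So the best you can do is a single linear scan, which is O(n). A minimal sketch of that check (in Python for brevity, even though the question asks about Java; the function name is just for illustration):

def is_sorted(a):
    # Compare each adjacent pair once: O(n) time, O(1) extra space.
    return all(a[i] <= a[i + 1] for i in range(len(a) - 1))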
As others suggest, since you need to access all N array elements, you end up with O(N) complexity. But this is true only when the accesses happen sequentially. If you had N processors at your disposal, you could access all elements in one go and get O(1) complexity; that is just a theoretical dream. Still, in practice, most computers today have many cores and most languages offer parallel constructs (Java, for example, has parallel streams). So you will never reach O(1), but you may do better than a plain sequential scan.
For linear search it makes sense that the run time is big O of N, since it always takes one step per element. As for bubble sort, my understanding is that its runtime is O of n^2; this makes sense to me because you'd iterate over the number of elements in the array and each time compare two values until the end of said array.
But merge sort is always splitting the data in half, so I'm confused about the explanation of why its run time is n log n. Additionally, I want to clarify my understanding of insertion sort's runtime of big O of n^2: since insertion sort looks for the smallest number and then compares it to every single number in the array, it would be n^2 because it loops through the array contents for every iteration.
If I could get some advice about merge sort, and a general understanding of run times, that'd be appreciated. I'm an absolute newbie and wanted to throw that disclaimer out there.
Let's assume that sorting an array of N elements takes T(N) time. In merge sort we know that we need to sort two subarrays of N/2 elements each (that is 2*T(N/2)) and then merge them (in O(N) time, that is c*N for some constant c).
So, T(N) = 2*T(N/2) + c*N.
We could stop here, as this is basically the "equation" you are asking about. But let's go a bit further.
To simplify things, we can show that T(N) = kN log N as follows (for some constant k):
Let's substitute T on both sides of the equation we have derived:
kN log N = 2 * k*(N/2) * log(N/2) + c*N
and expand the right-hand side (assuming log with base 2):
= kN*(log N - log 2) + c*N = kN*(log N - 1) + c*N = kN log N + (c - k)*N
That is, for c = k the equality holds, which shows that T(N) is of the form kN log N, that is, O(N log N).
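To make the recurrence concrete, here is a small sketch (the function and variable names are just for illustration) that runs a plain merge sort while counting element comparisons; the printed counts grow roughly like N log N rather than N^2:

import math

def merge_sort(a, counter):
    # Split in half, sort each half, then merge: T(N) = 2*T(N/2) + c*N.
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left = merge_sort(a[:mid], counter)
    right = merge_sort(a[mid:], counter)
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        counter[0] += 1                      # one comparison per merge step
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

for n in (1000, 10000, 100000):
    counter = [0]
    merge_sort(list(range(n, 0, -1)), counter)       # reverse-sorted input
    print(n, counter[0], round(n * math.log2(n)))    # measured vs. N*log2(N)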
Given an array A consisting of N elements, our task is to find the maximal subarray sum after applying the following operation exactly once:
Select any subarray and set all the elements in it to zero.
E.g., if the array is -1 4 -1 2, then the answer is 6, because we can choose the -1 at index 2 as a subarray and make it 0. The resultant array after applying the operation is -1 4 0 2, and its maximum-sum subarray is 4 + 0 + 2 = 6.
My approach was to find the start and end indexes of the minimum-sum subarray, set all elements of that subarray to 0, and then find the maximum-sum subarray. But this approach is wrong.
Starting simple:
First, let us start with the part of the question: Finding the maximal subarray sum.
This can be done via dynamic programming:
# Two sample inputs; the second assignment is the one actually used below.
a = [1, 2, 3, -2, 1, -6, 3, 2, -4, 1, 2, 3]
a = [-1, -1, 1, 2, 3, 4, -6, 1, 2, 3, 4]

def compute_max_sums(a):
    # res[j] holds the maximal sum of a subarray ending at index j.
    res = []
    currentSum = 0
    for x in a:
        if currentSum > 0:
            # A positive running sum is worth keeping, so extend it with x.
            res.append(x + currentSum)
            currentSum += x
        else:
            # A non-positive running sum only hurts; start a new subarray at x.
            res.append(x)
            currentSum = x
    return res

res = compute_max_sums(a)
print(res)
print(max(res))
Quick explanation: we iterate through the array. As long as the running sum is positive, it is worth appending the whole block to the next number. If the sum drops to zero or below at any point, we discard the whole "tail" sequence, since it will not be profitable to keep it anymore, and we start anew. At the end, we have an array where the j-th element is the maximal sum of a subarray ending at j, i.e. over subarrays i..j with 0 <= i <= j.
The rest is just a matter of finding the maximal value in that array.
Back to the original question
Now that we have solved the simplified version, it is time to look further. We can now select a subarray to be deleted to increase the maximal sum. The naive solution would be to try every possible subarray and to repeat the steps above. This would unfortunately take too long¹. Fortunately, there is a way around this: we can think of the zeroes as a bridge between two maxima.
There is one more thing to address, though. Currently, for the j-th element we only know that its tail lies somewhere behind it, so if we were to take the maximum and the second-biggest element from the array, their subarrays could overlap, which would be a problem because we would be counting some of the elements more than once.
Overlapping tails
How to mitigate this "overlapping tails" issue?
The solution is to compute everything once more, this time from the end to the start. This gives us two arrays: one where the j-th element has its tail i pointing towards the left end of the array (i.e. i <= j), and one where the reverse is true. Now, if we take x from the first array and y from the second array, we know that if index(x) < index(y), then their respective subarrays are non-overlapping.
We can now proceed to try every suitable x, y pair; there are O(n²) of them. Since we don't need any further computation (the values are already precomputed), this is the final complexity of the algorithm: the preparation cost us only O(n), so it doesn't impose any additional penalty.
Here be dragons
So far, the stuff we did was rather straightforward. The following section is not that complex, but there are going to be some moving parts. Time to brush up on max heaps:
Accessing the max takes constant time.
Deleting any element is O(log(n)) if we have a reference to that element. (We can't find an arbitrary element in O(log(n)); however, if we know where it is, we can swap it with the last element of the heap, delete it, and bubble down the swapped element in O(log(n)).)
Adding any element into the heap is O(log(n)) as well.
Building a heap can be done in O(n).
That being said, since we need to go from the start to the end, we can build two heaps, one for each of our pre-computed arrays.
We will also need a helper array that gives us quick index -> element-in-heap access, so that the delete is O(log(n)).
The first heap will start empty (we are at the start of the array); the second one will start full (we have the whole array ready).
Now we can iterate over the whole array. In each step i we:
Compare the max(heap1) + max(heap2) with our current best result to get the current maximum. O(1)
Add the i-th element from the first array into the first heap - O(log(n))
Remove the i-th indexed element from the second heap (this is why we have to keep the references in a helper array) - O(log(n))
The resulting complexity is O(n * log(n)).
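A minimal Python sketch of this pass, reusing compute_max_sums from above (the function name is just for illustration). Here, lazy deletion stands in for the helper-array deletion described above: stale entries are discarded only when they surface at the top of the second heap, which keeps the same O(n log n) bound:

import heapq

def best_sum_after_one_zeroing(a):
    n = len(a)
    max_sums = compute_max_sums(a)                                      # tails pointing left
    rev_max_sums = list(reversed(compute_max_sums(list(reversed(a)))))  # tails pointing right

    # heapq is a min-heap, so values are negated to simulate max-heaps.
    heap1 = []                                          # max_sums entries for indices already passed
    heap2 = [(-rev_max_sums[j], j) for j in range(n)]   # rev_max_sums entries for all indices
    heapq.heapify(heap2)

    best = 0
    for i in range(n):
        # Lazy deletion: drop entries whose index is already behind us.
        while heap2 and heap2[0][1] < i:
            heapq.heappop(heap2)
        if heap1 and heap2:
            best = max(best, -heap1[0] - heap2[0][0])   # max(heap1) + max(heap2)
        heapq.heappush(heap1, -max_sums[i])
    return best

print(best_sum_after_one_zeroing([-1, 4, -1, 2]))       # 6, matching the question's example

It considers exactly the same index pairs (x before y) as the quadratic illustration in the update below, so the two should agree on any input.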
Update:
Just a quick illustration of the O(n²) solution, since OP nicely and politely asked. Man oh man, I'm not your bro.
Note 1: Getting the solution won't help you as much as figuring out the solution on your own.
Note 2: The fact that the following code gives the correct answer is not a proof of its correctness. While I'm fairly certain that my solution should work, it is definitely more worthwhile to look into why it works (if it works) than to look at one example of it working.
input = [100, -50, -500, 2, 8, 13, -160, 5, -7, 100]
reverse_input = [x for x in reversed(input)]

max_sums = compute_max_sums(input)
rev_max_sums = [x for x in reversed(compute_max_sums(reverse_input))]

print(max_sums)
print(rev_max_sums)

current_max = 0
# Try every non-overlapping pair: a left part ending at i and a right part starting at j > i.
for i in range(len(max_sums)):
    if i < len(max_sums) - 1:
        for j in range(i + 1, len(rev_max_sums)):
            if max_sums[i] + rev_max_sums[j] > current_max:
                current_max = max_sums[i] + rev_max_sums[j]

print(current_max)
¹ There are n possible beginnings, n possible ends, and the complexity of the code we have is O(n), resulting in a complexity of O(n³). Not the end of the world, however it's not nice either.
I have a 2D array, a matrix of sorts (m x n). I need to generate a '1' in k cells, and the probability should be equal for each cell.
For example, if k = 3, we pick randomly where to place the three '1's:
[0, 0, 0, 0]
[0, 1, 1, 0]
[1, 0, 0, 0]
At first, I tackled this by generating a random number modulo m * n (rows * columns).
But that means we could theoretically get to the end of the matrix without generating a single '1'.
Then I read about the Fisher–Yates shuffle, but wasn't sure whether it would be wise, or even feasible, to implement it with that.
What is an efficient way to implement this?
This is essentially a problem of "sampling without replacement": Out of mn cells, choose k of them without replacement. There are many approaches to this problem, depending on how big mn is in relation to k. If mn is relatively small, then a Fisher–Yates shuffle will work well; make a list of cells, shuffle them, then take the first k cells to set to 1.
For more details, see my sections on sampling without replacement and shuffling. (Part of my comment moved here:) Different sampling algorithms have different tradeoffs in terms of time and space. For example, a Fisher–Yates shuffle has time and space complexity of O(mn), while a partial shuffle can have time complexity O(k).
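For example, a minimal Python sketch of the shuffle-then-take-k approach (the helper name is just for illustration):

import random

def place_k_ones(m, n, k):
    # Shuffle all m*n cell indices (Fisher-Yates under the hood) and set the
    # first k of them to 1, so every set of k cells is equally likely.
    grid = [[0] * n for _ in range(m)]
    cells = list(range(m * n))
    random.shuffle(cells)
    for cell in cells[:k]:
        grid[cell // n][cell % n] = 1
    return grid

for row in place_k_ones(3, 4, 3):
    print(row)

If m*n is large and k is small, random.sample(range(m * n), k) picks the k cells directly without shuffling the whole list.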
I'm taking an algorithms course, and would like some help with the following question:
What is the time complexity of the quicksort algorithm on the array [k+1, ..., n, 1, ..., k] where k > n/2, and the pivot is always chosen to be the right-most cell of the sub-array?
Will it be O(n²) or O(n log n)?
From a scanned algorithms test from a past semester, the student answered O(n²) (which I agreed with after a few simulations), but that answer was marked wrong, with no explanation.
A couple of other students and I are confused as to why the answer was marked wrong when all three of us reached the same conclusion on our own.
O(n²) is correct.
The first run through will select the right-most element k as the pivot, partitioning the rest of the array into [1, ..., k-1] on the left and [k+1, ..., n] on the right. Since both of these subarrays are in sorted order, they are in a form where quicksort selecting the right-most element as the pivot takes quadratic time.
Sorting the left side of the partition will therefore take O(k²) time, and sorting the right side will take O((n-k)²) time. Since we have n/2 < k <= n, we also have that O(k²) = O(n²).
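A quick way to convince yourself empirically is to count comparisons of a right-most-pivot quicksort on inputs of that shape (the counting helper and the choice k = 3n/4 below are just for illustration); the counts grow roughly fourfold each time n doubles, i.e. quadratically:

import sys
sys.setrecursionlimit(10000)

def quicksort_comparisons(a, lo, hi):
    # Lomuto partition with the right-most element as the pivot; returns the number of comparisons.
    if lo >= hi:
        return 0
    pivot, i, comps = a[hi], lo, 0
    for j in range(lo, hi):
        comps += 1
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]
    return comps + quicksort_comparisons(a, lo, i - 1) + quicksort_comparisons(a, i + 1, hi)

for n in (100, 200, 400, 800):
    k = 3 * n // 4                                     # any k > n/2 shows the same behaviour
    arr = list(range(k + 1, n + 1)) + list(range(1, k + 1))
    print(n, quicksort_comparisons(arr, 0, n - 1))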
Okay, I keep getting stuck on the complexity here. There is an array of elements, say A[n]. I need to find all pairs such that A[i] > A[j] and also i < j.
So if it is {10, 8, 6, 7, 11}, the pairs would be (10, 8), (10, 6), (10, 7) and so on...
I did a merge sort in n log n time and then a binary search over the entire array, again in n log n, to get the indices of the elements in the sorted array.
So sortedArray={6 7 8 10 11} and index={3 2 0 1 4}
Irrespective of what I try, I keep getting another n^2 term in the complexity when I begin the loops to compare. I mean, if I start with the first element, i.e. 10, it is at index[2], which means there are 2 elements less than it. So if index[2] < index[i] then they can be accepted, but that increases the complexity. Any thoughts? I don't want the code, just a hint in the right direction would be helpful.
Thanks. Everything I have been doing is in C, and time complexity is important here.
You cannot do this in under O(N^2), because when the original array is sorted in descending order, the number of pairs the algorithm must produce is N(N-1)/2. You simply cannot produce that many results in O(N log N) time.
In the worst case the result consists of Θ(n^2) pairs, so any algorithm that outputs all of them must take Ω(n^2) time.
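For illustration, a direct enumeration (the helper name is just for illustration) whose running time is dominated by the size of its output:

def pairs_with_larger_first(a):
    # Every (a[i], a[j]) with i < j and a[i] > a[j]; for a descending array this is n*(n-1)/2 pairs.
    return [(a[i], a[j]) for i in range(len(a)) for j in range(i + 1, len(a)) if a[i] > a[j]]

print(pairs_with_larger_first([10, 8, 6, 7, 11]))       # the question's example
print(len(pairs_with_larger_first([5, 4, 3, 2, 1])))    # descending order: 5*4/2 = 10 pairs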