We know that quicksort's worst case is O(n^2).
I am sorting the array:
1 2 3 4 5 6 7 8 9 10
When I put the value of n into the worst-case equation, the answer is 100,
but in a dry run it is solved in 51 steps.
That is a big difference. What is the reason for this?
O(n^2) means that the complexity grows with the square of n, not that it is exactly n^2.
You need to check how the cost (ans) grows when n grows. Try putting 5, 10 and 20 items in the worst-case array and you will see that ans does not grow proportionally to n (2x each time) but much faster.
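To make that concrete, here is a quick experiment (a rough sketch, not your code: it counts only element comparisons for a naive last-element pivot, for which an already-sorted array is the worst case):

    def quicksort_comparisons(a):
        # Count comparisons made by a naive quicksort that always picks the
        # last element as pivot; a sorted input is the worst case here.
        count = 0

        def sort(lo, hi):
            nonlocal count
            if lo >= hi:
                return
            pivot, store = a[hi], lo
            for i in range(lo, hi):
                count += 1
                if a[i] < pivot:
                    a[i], a[store] = a[store], a[i]
                    store += 1
            a[store], a[hi] = a[hi], a[store]
            sort(lo, store - 1)
            sort(store + 1, hi)

        sort(0, len(a) - 1)
        return count

    for n in (5, 10, 20):
        print(n, quicksort_comparisons(list(range(1, n + 1))))
    # prints 5 10, 10 45, 20 190 -- roughly n*(n-1)/2, so doubling n
    # roughly quadruples the work, even though it never reaches exactly n^2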
It would be helpful to consider the definition of Big O when thinking about how it applies, in this case, to the worst-case scenario of the quicksort algorithm. Big O describes the asymptotic behavior of functions. When we say an algorithm's running time is O(f(n)), we mean its running time is bounded above by (a constant multiple of) f(n); the algorithm cannot grow any faster than f(n). In your example, quicksort's worst case is bounded above by n^2, so it cannot grow any faster than n^2 as n gets arbitrarily large.
Being bounded above by n^2 does not necessarily mean the worst case takes exactly n^2 steps. It is also bounded above by n^4, n^100, n^n. All this means is that quick sort can never grow faster than n^2, n^4, n^100, n^n.
Another point to keep in mind with Big O is that n should be thought of as getting arbitrarily large, or going towards infinity. In this example n is 10; for larger values of n the number of worst-case steps will increase, but it will never exceed n^2. I hope this helps!
*A search method has time complexity O(n^2), where n is the number of states in the space to be searched. If it takes 1 second to search a space of a thousand states, roughly how long will it take to search a space of a million states?*
I have found that it is approximately 12 days, but I think the way I found it is quite wrong.
I did 1 million^2 / 86400 (seconds in a day) and got 11.56, so approximately 12 days. Is there a better and more efficient solution?
There is not nearly enough information to answer this question. See Big-O description.
O(N^2) means only that the algorithm's execution time will be dominated by an N^2 term. As N grows large, the ratio between two execution times will asymptotically approach the square of the ratio of the two problem sizes. It says nothing about the execution time for particular values of N.
Let's keep this simple, assuming some set-up overhead: an O(N) array initialization and a constant amount of system start-up. This makes the execution time
t = a * N^2 + b * N + c
for some values of a, b, and c. Even if we know that the equation has this form, we do not have enough information to solve it given only one (t, N) data point. We don't know enough to derive t for N = 10^6.
I suspect that whoever posed this problem is looking for the invalid solution, making the unwarranted assumption that at N = 1000 all smaller terms have already been blown to insignificance. In that case, simply scale up by the square of the size ratio:
N1 / N2 = 10^6 / 10^3 = 10^3
Scale up by the square of that ratio: (10^3)^2 = 10^6
That gives you 10^6 seconds, or roughly 11.6 days.
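In code, that back-of-the-envelope scaling (under the same unwarranted assumption that the N^2 term dominates) looks like this, using the numbers from the question:

    base_time = 1.0                 # seconds to search 1,000 states
    n_old, n_new = 1_000, 1_000_000

    # scale the known time by the square of the size ratio
    estimate = base_time * (n_new / n_old) ** 2
    print(estimate)                 # 1e6 seconds
    print(estimate / 86_400)        # about 11.57 days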
I'm using the DP algorithm, i.e. storing sub-problem values in a 2D array where one axis is the n items and the other is the w values from 0 to W, where W is the maximum capacity of the knapsack. Therefore the value T[n-1][W] is the optimum I need to calculate. I've read in other sources that the time complexity of this algorithm is O(nW). My question would be: is it possible to reduce this time complexity even more?
I found another answer which talks about pretty much the same thing, but I can't understand it without an example: how to understand about reducing time complexity on 0~1 knapsack
It says that we do not need to calculate T[i][w] for small w values, as they are not used in the optimum, but I can't grasp this properly. Could anyone give a detailed and visual example? This would benefit me a lot.
The 2D array you're trying to fill is of size n by W (actually, W+1 since the values go from 0..W, but off-by-one doesn't affect the asymptotic complexity here). Therefore, to fill that array, you would need to do at least n*W work (even if you just initialize the array to all zeroes!).
Therefore, Θ(nW) (a tight bound: both O(nW) and Ω(nW)) is the best you can do in terms of asymptotic algorithmic time complexity.
This is what makes the dynamic programming solution so cool: you spend constant time on each element of the solution array (in this case, 2D), working from the bottom up (contrast this with the complexity of the top-down recursive solution!).
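For reference, here is a minimal bottom-up sketch of that table-filling (the item values, weights, and capacity below are made up for illustration; it uses an (n+1) x (W+1) table, so the answer sits in T[n][W] rather than T[n-1][W]):

    def knapsack(values, weights, W):
        # T[i][w] = best value using the first i items with capacity w.
        # Filling the (n+1) x (W+1) table is Theta(n*W) work.
        n = len(values)
        T = [[0] * (W + 1) for _ in range(n + 1)]
        for i in range(1, n + 1):
            for w in range(W + 1):
                T[i][w] = T[i - 1][w]                      # skip item i-1
                if weights[i - 1] <= w:                    # or take it, if it fits
                    T[i][w] = max(T[i][w],
                                  T[i - 1][w - weights[i - 1]] + values[i - 1])
        return T[n][W]

    print(knapsack([60, 100, 120], [10, 20, 30], 50))      # 220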
I'm just getting started in algorithms and sorting, so bear with me...
Let's say I have an array of 50000 integers.
I need to select the smallest 30000 of them.
I thought of two methods :
1. I iterate over the entire array repeatedly, finding the smallest remaining integer each time.
2. I first sort the entire array, and then simply select the first 30000.
Can anyone tell me what's the difference, which method would be faster, and why?
What if the array was smaller or bigger? Would the answer change?
Option 1 sounds like the naive solution. It would involve passing through the array to find the smallest item 30000 times. Each time it finds the smallest, presumably it would swap that item to the beginning or end of the array. In basic terms, this is O(n^2) complexity.
The actual number of operations involved would be less than n^2 because n reduces every time. So you would have roughly 50000 + 49999 + 49998 + ... + 20001, which amounts to just over 1 billion (1000 million) iterations.
Option 2 would employ an algorithm like quicksort or similar, which is commonly O(n.logn).
Here it's harder to provide actual figures, because some efficient sorting algorithms can have a worst-case of O(n^2). But let's say you use a well-behaved one that is guaranteed to be O(n.logn). This would amount to 50000 * 15.61 which is about 780 thousand.
So it's clear that Option 2 wins in this case.
What if the array was smaller or bigger? Would the answer change?
Unless the array became trivially small, the answer would still be Option 2. And the larger your array becomes, the more beneficial Option 2 becomes. This is the nature of time complexity. O(n^2) grows much faster than O(n.logn).
A better question to ask is "what if I want fewer smallest values, and when does Option 1 become preferable?". The full answer is slightly more complex because of numerous factors (such as what constitutes "one operation" in Option 1 vs Option 2, plus other issues like memory access patterns etc), but you can get a simple answer directly from time complexity: selecting k values with Option 1 costs roughly k*n operations, so Option 1 becomes preferable when k drops below log n. In the case of a 50000-element array, that means if you want to select 15 or fewer smallest elements, Option 1 wins.
Now, consider an Option 3, where you transform the array into a min-heap. Building a heap is O(n), and removing one item from it is O(logn). You are going to remove 30000 items, so you have the cost of building plus the cost of removal: 50000 + 30000 * 15.6 = approximately 520 thousand (and this ignores the fact that the heap gets smaller every time you remove an element). It's still O(n.logn), like Option 2, but it is probably faster: you've saved time by not bothering to sort the elements you don't care about.
I should mention that in all three cases, the result would be the smallest 30000 values in sorted order. There may be other solutions that would give you these values in no particular order.
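As a rough sketch of Option 3 (assuming Python, since it has a built-in binary heap):

    import heapq

    def smallest_k_via_heap(a, k):
        # Option 3: build a min-heap in O(n), then pop k items at O(log n) each.
        # The k results come out in sorted order.
        heap = list(a)        # copy so the input is left untouched
        heapq.heapify(heap)   # O(n)
        return [heapq.heappop(heap) for _ in range(k)]

    # e.g. smallest_k_via_heap(data, 30000)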
30k is close to 50k, so just sort the array and take the smallest 30k, e.g., in Python: sorted(a)[:30000]. It is an O(n * log n) operation.
If you needed to find only the 100 smallest items instead (100 << 50k), then a heap might be more suitable, e.g., in Python: heapq.nsmallest(100, a). It is O(n * log k).
If the range of integers is limited, you could consider O(n) sorting methods such as counting sort and radix sort.
The simple iterative method is O(n**2) (quadratic) here. Even for a moderate n around a million, it leads to ~10**12 operations, which is much worse than ~10**6 for a linear algorithm.
For nearly all practical purposes, sorting and taking the first 30,000 is likely to be best. In most languages, this is one or two lines of code and hard to get wrong.
If you have a truly demanding application or are just out to fiddle, you can use a selection algorithm to find the 30,000th largest number. Then one more pass through the array will find 29,999 that are no bigger.
There are several well known selection algorithms that require only O(n) comparisons and some that are sub-linear for data with specific properties.
The fastest in practice is QuickSelect, which - as its name implies - works roughly like a partial QuickSort. Unfortunately, if the data happens to be very badly ordered, QuickSelect can require O(n^2) time (just as QuickSort can). There are various tricks for selecting pivots that make it virtually impossible to hit the worst-case run time.
QuickSelect will finish with the array reordered so the smallest 30,000 elements are in the first part (unsorted) followed by the rest.
Because standard selection algorithms are comparison-based, they'll work on any kind of comparable data, not just integers.
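Here is a rough sketch of that select-then-scan idea (not a production implementation: it uses a random pivot, which makes the quadratic worst case very unlikely but not impossible, and the function names are made up):

    import random

    def quickselect(data, k):
        # Return the k-th smallest element (0-indexed) of data.
        # Average O(n); worst case O(n^2) without fancier pivot selection.
        a = list(data)                    # work on a copy
        lo, hi = 0, len(a) - 1
        while True:
            if lo == hi:
                return a[lo]
            p = random.randint(lo, hi)    # random pivot
            a[p], a[hi] = a[hi], a[p]
            pivot, store = a[hi], lo
            for i in range(lo, hi):
                if a[i] < pivot:
                    a[i], a[store] = a[store], a[i]
                    store += 1
            a[store], a[hi] = a[hi], a[store]
            if k == store:
                return a[store]
            elif k < store:
                hi = store - 1
            else:
                lo = store + 1

    def smallest_k(data, k):
        # One more pass collects everything strictly below the threshold,
        # then pads with copies of the threshold to handle duplicates.
        threshold = quickselect(data, k - 1)
        less = [x for x in data if x < threshold]
        return less + [threshold] * (k - len(less))

    # e.g. smallest_k(data, 30000) -- result is not in sorted order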
You can do this in potentially O(N) time with radix sort or counting sort, given that your input is integers.
Another method is to get the 30000th smallest integer with quickselect and then simply iterate through the original array. This has Θ(N) average time complexity, but quickselect is O(N^2) in the worst case.
Assume that a CPU can process 10^8 operations per second. Suppose you have to sort an array with 10^6 elements. Which of the following is true?
1. Insertion sort will always take more than 2.5 hours while merge sort will always take less than 1 second.
2. Insertion sort will always take more than 2.5 hours while quicksort will always take less than 1 second.
3. Insertion sort could take more than 2.5 hours while merge sort will always take less than 1 second.
4. Insertion sort could take more than 2.5 hours while quicksort will always take less than 1 second.
What's the worst-case complexity of insertion sort?
What's the worst-case complexity of mergesort and quicksort?
Is it possible that insertion sort lasts less than 2.5 hours? If it is, what's the case?
What's the best-case complexity of quicksort (or mergesort)?
What's the best-case complexity of insertion sort?
If you answer these questions, you'll easily answer yours.
If you don't know what worst-case or best-case means, then try to find answers to these questions.
If sorting takes 10^12 steps, how long does it last in seconds (or hours) if your CPU does 10^8 steps per second?
How many steps does it take to sort an array of length n for quicksort, mergesort or insertion sort?
Answering all of the questions above leads you one step closer to answering your question.
Do it, and I guarantee you, you'll be able to answer it immediately.
The worst-case complexity of insertion sort and quicksort is O(n^2).
The worst-case complexity of merge sort is O(n*log(n)).
Therefore, for insertion sort and quicksort the number of steps is (10^6)^2 = 10^12,
so it takes 10^12 / 10^8 = 10^4 seconds, i.e. 10,000 seconds ≈ 2.78 hours (as the CPU performs 10^8 steps/sec),
whereas the number of steps for merge sort is 10^6 * log(10^6) = 6*10^6, which takes 6*10^6 / 10^8 = 0.06 seconds (about 0.2 seconds if you use log base 2), still well under 1 second.
Deciding between options 1 and 3: insertion sort can take less than 2.5 hours depending on the input, because its average-case complexity is (N-1)*N/4 comparisons; computing the running time for that gives about 0.69 hours, which is less than 2.5 hours. So insertion sort will not always take more than 2.5 hours, and the 3rd statement is the correct one.
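A quick back-of-the-envelope check of these figures (the only assumption added here is using log base 2 for merge sort, which still comes in well under a second):

    import math

    ops_per_sec = 10 ** 8
    n = 10 ** 6

    insertion_worst = n ** 2                        # ~10**12 steps
    merge_steps     = n * math.log2(n)              # ~2 * 10**7 steps
    insertion_avg   = (n - 1) * n / 4               # ~2.5 * 10**11 steps

    print(insertion_worst / ops_per_sec / 3600)     # ~2.78 hours
    print(merge_steps / ops_per_sec)                # ~0.2 seconds, < 1 second
    print(insertion_avg / ops_per_sec / 3600)       # ~0.69 hours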
This is a direct quote from the textbook Invitation to Computer Science by G. Michael Schneider and Judith L. Gersting.
At the end of Section 3.4.2, we talked about the tradeoff between using sequential search on an unsorted list as opposed to sorting the list and then using binary search. If the list size is n=100,000 about how many worst-case searches must be done before the second alternative is better in terms of number of comparisons?
I don't really get what the question is asking for.
Sequential search is of order O(n) and binary search is of order O(lg n), and lg n will always be less than n. In this case n is already given, so what am I supposed to find?
This is one of my homework assignments, but I don't really know what to do. Could anyone explain the question in plain English for me?
and binary is of order (lgn) which in any case lgn will always be less than n
This is where you're wrong. In the assignment, you're asked to consider the cost of sorting the array too.
Obviously, if you need only one search, the first approach is better than sorting the array and doing a binary search: n < n*logn + logn. You're asked how many searches you need for the second approach to become more effective.
End of hint.
The question is how to decide which approach to choose - to just use linear search or to sort and then use binary search.
If you only search a couple of times, linear search is better: it is O(n), while sorting alone is already O(n*logn). If you search the same collection very often, sorting first is better: searching multiple times linearly can become O(n*n), but sorting and then searching with binary search is O(n*logn) + NumberOfSearches*O(logn), which can be less or more than using linear search depending on how NumberOfSearches and n relate.
The task is to determine the exact value of NumberOfSearches (not the exact number, but a function of n) which will make one of the options preferable:
NumberOfSearches * O(n) <> O(n*logn) + NumberOfSearches * O(logn)
don't forget that each O() can have a different constant value.
The order of the methods alone is not enough information here. It tells you something about how well the algorithms scale as the problem becomes bigger and bigger, but you can't do any exact calculations if you only know that O(n) means the complexity grows linearly with the size of the problem; it won't give you any numbers.
This can well mean that an algorithm with O(n) complexity is faster than an O(logn) algorithm for some n. Because O(logn) scales better as n gets larger, we know for sure that there is an n (a problem size) beyond which the O(logn) algorithm is faster. We just don't know when (for what n).
In plain English:
If you want to know 'how many searches', you need exact equations to solve and exact numbers. How many comparisons does it take to search sequentially? (Remember, n is given, so you can give a number.) How many comparisons (in the worst case!) does it take to search with a binary search? Before you can do a binary search, you have to sort, so add the number of comparisons needed to sort to the cost of the binary searches. Now compare the two numbers: which one is less?
The binary search is fast, but the sorting is slow. The sequential search is slower than binary search, but faster than sorting. However, the sorting needs to be done only once, no matter how many times you search. So, when does one heavy sort outweigh having to do a slow (sequential) search every time?
Good luck!
For sequential search, the worst case is n = 100000 comparisons, so p searches require p × 100000 comparisons.
Using a Θ(n^2) sorting algorithm would require 100000 × 100000 comparisons.
Binary search would require 1 + log n = 1 + log 100000 ≈ 17 comparisons for each search,
so together the second alternative would cost 100000 × 100000 + 17p comparisons.
Sorting pays off once the first expression exceeds the second:
100000p > 100000^2 + 17p
which holds for p > 100017.
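You can check that break-even point with a couple of lines (using the same Θ(n^2) sort-cost model as above):

    n = 100_000
    binary = 17                 # ~1 + log2(n) comparisons per binary search
    sort_cost = n * n           # the Theta(n^2) sorting model used above

    # sequential: p * n   vs.   sort once + binary each time: sort_cost + p * binary
    p = sort_cost / (n - binary)
    print(p)                    # ~100017, so sorting pays off after about 100,017 searches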
The question is about estimating the number NUM_SEARCHES needed to compensate for the cost of sorting. So we'll have:
time( NUM_SEARCHES * O(n) ) > time( NUM_SEARCHES * O(log(n)) + O(n* log(n)) )
Thank you guys. I think I get the point now. Could you take a look at my answer and see whether I'm on the right track?
For worst-case searches:
Number of comparisons for sequential search: n = 100,000.
Number of comparisons for binary search: lg(n) ≈ 17.
Number of comparisons for sorting: (n-1)/2 * n = (99,999)(50,000).
(I'm following my textbook and used the selection sort algorithm covered in my class)
So let p be the number of worst-case searches; then 100,000p > (99,999)(50,000) + 17p,
or p > 50,008.
In conclusion, I need more than 50,008 worst-case searches (i.e., at least 50,009) to make sorting and then using binary search better than sequential search for a list of n = 100,000.
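The same break-even calculation with the selection-sort comparison count used here, as a quick sanity check of the arithmetic:

    n = 100_000
    binary = 17                      # comparisons per binary search
    sort_cost = (n - 1) * n // 2     # selection sort: (n-1)*n/2 = 4,999,950,000

    p = sort_cost / (n - binary)     # break-even number of worst-case searches
    print(p)                         # ~50008.0, so at least 50,009 searches are needed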