can someone explain the while loop part of this pseudocode - arrays

I understand this is code to check whether the elements in the list are all different, but the while-loop part is confusing. Can someone explain this part?
// Input: list (or array) of n integers a[0], a[1], a[2], ..., a[n − 1]
// Output: Does there exist a repeated integer in the list?
repeat ← false
i ← 0 // set i to zero
while i <= n − 2 do
    j ← i + 1
    while j <= n − 1 do
        if (a[i] == a[j]) then
            repeat ← true
        else
            repeat ← false
        j ← j + 1
    i ← i + 1
if (repeat == true) then
    print "Some numbers repeated"
else
    print "All numbers are different"

As other users mentioned in the comments, the code contains a bug: you have to remove the "else" branch from the if-statement in the inner while-loop. Otherwise a later pair of unequal elements resets repeat to false, and the final answer depends only on the last pair compared. With the else branch removed, the code works according to the specification and checks all pairs of elements in the array for equality.
The outer while-loop, with running index i, iterates over all elements of the array up to the second-to-last. In each iteration of the outer loop, the inner (nested) while-loop iterates from element j = i + 1 to the last element (i.e. j runs over all elements to the right of the i-th element) and checks each pair (i-th and j-th element) for equality, setting the repeat flag if the two elements are equal. To better understand the pattern this algorithm follows, and to see why it really compares all pairs of elements, it can help to execute the algorithm by hand on a small example.
This algorithm is quite inefficient: its time complexity is O(n^2). You can use an efficient set data structure (such as a balanced binary tree or a hash set) to reduce the time complexity to O(n log n) or amortized O(n), respectively.
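For comparison, here is a minimal Python sketch of the pairwise check with the else branch removed, next to a hash-set version. The function names has_repeat_pairwise and has_repeat_set are made up for illustration and do not come from the question.

```python
def has_repeat_pairwise(a):
    """Compare every pair (i, j) with i < j, mirroring the pseudocode."""
    n = len(a)
    repeat = False
    i = 0
    while i <= n - 2:
        j = i + 1
        while j <= n - 1:
            if a[i] == a[j]:
                repeat = True  # never reset to False once a match is found
            j += 1
        i += 1
    return repeat


def has_repeat_set(a):
    """Same answer with a hash set: one pass, O(n) extra space."""
    seen = set()
    for x in a:
        if x in seen:
            return True
        seen.add(x)
    return False
```

Both functions should agree on any input; the set version trades O(n) extra memory for the faster running time.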

Related

Need help proving loop invariant (simple bubble sort, partial correctness)

The bubble-sort algorithm (pseudo-code):
Input: Array A[1...n]
for i <- n,...,2 do
    for j <- 2,...,i do
        if A[j - 1] >= A[j] then
            swap the values of A[j-1] and A[j];
I am not sure, but my proof seems to work; it is just overly convoluted. Could you help me clean it up?
Loop-invariant: After each iteration i, the n - i + 1 greatest
elements of A are in the position they would be were A sorted
non-descendingly. In the case that array A contains more than one
maximal value, let the greatest element be the one with the smallest index
of all the possible maximal values.
Induction-basis (i = n): The inner loop iterates over every element of
A. Eventually, j points to the greatest element. This value will be
swapped until it reaches position i = n, which is the highest position
in array A and hence the final position for the greatest element of A.
Induction-step: (i = m -> i = m - 1 for all m >= 3): The inner loop
iterates over every element of A. Eventually, j points to the greatest
element of the ones not yet sorted. This value will be swapped until
it reaches position i = m - 1, which is the highest position of the
positions not-yet-sorted in array A and hence the final position for
the greatest not-yet-sorted element of A.
After the algorithm was fully executed, the remaining element at
position 1 is also in its final position because were it not, the
element to its right side would not be in its final position, which is
a contradiction. Q.E.D.
I'd be inclined to recast your proof in the following terms:
Bubble sort A[1..n]:
    for i in n..2
        for j in 2..i
            swap A[j - 1], A[j] if they are not already in order
Loop invariant:
    let P(i) <=> for all k s.t. i < k <= n. A[k] = max(A[1..k])
Base case:
    initially i = n and the invariant P(n) is trivially satisfied.
Induction step:
    assuming the invariant holds for some P(m + 1),
    show that after the inner loop executes, the invariant holds for P(m).
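One way to gain confidence in the invariant before proving it is to check it at run time. The following Python sketch is a 0-based translation of the pseudocode above; the function name bubble_sort_checked and the assert are illustrative additions, not part of the original answer.

```python
def bubble_sort_checked(values):
    """Bubble sort that asserts the invariant after every outer iteration."""
    a = list(values)
    n = len(a)
    # 0-based translation: i runs n-1 .. 1, j runs 1 .. i
    for i in range(n - 1, 0, -1):
        for j in range(1, i + 1):
            if a[j - 1] > a[j]:
                a[j - 1], a[j] = a[j], a[j - 1]
        # After the pass for i, every position k >= i holds max(a[0..k]).
        assert all(a[k] == max(a[: k + 1]) for k in range(i, n))
    return a
```

The assert recomputes a maximum for every position, so it is only meant for experimenting on small inputs.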

insertion sort theoretical analysis, total number of shifts.

Given the following array:
[14 17 21 34 47 19 71 22 29 41 8]
and the following excerpt from the book Algorithms Unlocked by Thomas Cormen
(slightly edited, [START] and [STOP] flags are not part of the text):
Insertion sort is an excellent choice when the array starts out as
''almost sorted''. [START] Suppose that each array element starts out within
k positions of where it ends up in the sorted array. Then the total
number of times that a given element is shifted over all iterations
of the inner loop is at most k. Therefore, the total number of times
that all elements are shifted over all inner-loop iterations, is at
most kn, which in turn tells us that the total number of inner-loop
iterations is at most kn (since each inner-loop iteration shifts
exactly one element by one position).[STOP] If k is a constant, then the
total running time of insertion sort would be only Θ(n), because the
Θ-notation subsumes the constant factor k. In fact we can even
tolerate some elements moving a long distance in the array, as long as
there are not too many such elements. In particular, if L elements can
move anywhere in the array (so that each of these elements can move by
up to n-1 positions), and the remaining n - L elements can move at
most k positions, then the total number of shifts is at most L * (n –
1) + (n – L) * k = (k + L) * n – (k + 1) * L, which is Θ(n) if both k
and L are constants.
The book is trying to explain how it arrives at a formula, which it presents at the bottom of the text. I would like some help understanding what it says; a specific example using the sample array above would very likely help, so that I can see what is going on with the k and n variables. Can you help me better understand the excerpt's analysis?
To be more specific about what is confusing me: it is the lines between the [START] and [STOP] flags, namely:
Suppose that each array element..... which in turn tells us that the
total number of inner-loop iterations is at most kn(since each
inner-loop iteration shifts exactly one element by one position).
(anything below these lines is totally understood all the way to the end.)
Let us consider the insertion sort algorithm:
Algorithm: InsertionSort(A)
i ← 1
while i < length(A)
    j ← i
    while j > 0 and A[j-1] > A[j]
        swap A[j] and A[j-1]
        j ← j - 1
    end while
    i ← i + 1
end while
The inner loop moves A[i] one position at a time to the left through A[0..i-1] until it is in its correct position.
Therefore, if a given element is at most k positions away from its correct place, we will have a maximum of k compares and swaps for it. For n elements that gives at most k*n.
Hope it helps!
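To make k and n concrete for the sample array from the question, here is a small Python sketch that counts the swaps of the swap-based insertion sort above and measures how far each element is from its sorted position. The helper names insertion_sort_shifts and max_displacement are hypothetical, and max_displacement assumes all values are distinct, which holds for this array.

```python
def insertion_sort_shifts(values):
    """Swap-based insertion sort; returns the sorted list and the swap count."""
    a = list(values)
    shifts = 0
    for i in range(1, len(a)):
        j = i
        while j > 0 and a[j - 1] > a[j]:
            a[j - 1], a[j] = a[j], a[j - 1]
            shifts += 1
            j -= 1
    return a, shifts


def max_displacement(values):
    """Largest distance between a start position and the sorted position (k)."""
    final_pos = {v: p for p, v in enumerate(sorted(values))}
    return max(abs(p - final_pos[v]) for p, v in enumerate(values))


sample = [14, 17, 21, 34, 47, 19, 71, 22, 29, 41, 8]
_, shifts = insertion_sort_shifts(sample)
k = max_displacement(sample)
print(shifts, k, k * len(sample))  # prints "21 10 110"
```

Here n = 11 and k = 10, so the k*n bound of 110 comfortably covers the 21 swaps actually performed. The large k comes from a single element, 8, which has to travel 10 positions. Every other element starts within 5 positions of its final place, so this array is also an instance of the book's second case with L = 1 and k = 5, where the bound is 1 * 10 + 10 * 5 = 60.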

How do you reorganize an array within O(n) runtime & O(1) space complexity?

I'm a 'space-complexity' neophyte and was given a problem.
Suppose I have an array of arbitrary integers:
[1,0,4,2,1,0,5]
How would I reorder this array to have all the zeros at one end:
[1,4,2,1,5,0,0]
...and compute the count of non-zero integers (in this case: 5)?
... in O(n) runtime with O(1) space complexity?
I'm not good at this.
My background is more environmental engineering than computer science so I normally think in the abstract.
I thought I could do a sort, then count the non-zero integers.
Then I thought I could merely do an element-by-element copy as I rearrange the array.
Then I thought of something like a bubble sort, switching neighboring elements until the zeroes reached the end.
I thought I could save on the 'space complexity' by shifting the array members' addresses, since the array pointer points to the array, with offsets to its members.
I either enhance the runtime at the expense of the space complexity or vice versa.
What's the solution?
A two-pointer approach will solve this task and stay within the time and memory constraints.
Start by placing one pointer at the end, another at the start of the array. Then decrement the end pointer until you see the first non-zero element.
Now the main loop:
If the start pointer points to a zero, swap it with the value pointed to
by the end pointer; then decrement the end pointer until it points to a non-zero element again.
Always increment the start pointer.
Finish when the start pointer becomes greater than or equal to the end
pointer.
Finally, return the position of the end pointer plus one; that's the number of nonzero elements.
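A Python sketch of the two-pointer idea follows. It spells out two details in a way that is my own reading rather than something the answer confirms: the end pointer is moved back past any zeros after every swap (a single decrement could leave it on a zero, for example with [0, 0, 0, 1]), and the count is derived from the end pointer. The name move_zeros_to_end is made up.

```python
def move_zeros_to_end(nums):
    """Move zeros to the end in place; return the number of nonzero elements."""
    start, end = 0, len(nums) - 1
    # Park the end pointer on the last nonzero element (or -1 if none).
    while end >= 0 and nums[end] == 0:
        end -= 1
    while start < end:
        if nums[start] == 0:
            nums[start], nums[end] = nums[end], nums[start]
            # Skip the zero just placed at the end, plus any zeros before it.
            while end >= 0 and nums[end] == 0:
                end -= 1
        start += 1
    # Everything up to and including `end` is nonzero at this point.
    return end + 1
```

For [1, 0, 4, 2, 1, 0, 5] this returns 5 and leaves the list as [1, 5, 4, 2, 1, 0, 0]; note that the relative order of the nonzero elements is not preserved.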
This is the Swift code for the smart answer provided by #kfx
func putZeroesToLeft(_ nums: inout [Int]) {
    // Index of the leftmost non-zero element; nothing to do if there is none.
    guard var firstNonZeroIndex = nums.firstIndex(where: { $0 != 0 }) else { return }
    for index in firstNonZeroIndex..<nums.count {
        if nums[index] == 0 {
            // Move the zero in front of the block of non-zero elements.
            nums.swapAt(firstNonZeroIndex, index)
            firstNonZeroIndex += 1
        }
    }
}
Time complexity
There are 2 simple (not nested) loops repeated max n times (where n is the length of input array). So time is O(n).
Space complexity
Besides the input array we only use the firstNonZeroIndex integer variable, so the extra space is constant: O(1).
As indicated by the other answers, the idea is to have two pointers, p and q: p points at the beginning of the array and q at the end (specifically, at the first nonzero entry from behind). Scan the array with p; each time you hit a 0, swap the elements pointed to by p and q, then decrement q (specifically, make it point to the next nonzero entry from behind). Increment p in every step, and iterate as long as p < q.
In C++, you could do something like this:
#include <utility>
#include <vector>

void rearrange(std::vector<int>& v) {
    int p = 0, q = static_cast<int>(v.size()) - 1;
    // make q point to the last nonzero entry
    while (q >= 0 && !v[q]) --q;
    while (p < q) {
        if (!v[p]) { // found a zero element
            std::swap(v[p], v[q]);
            while (q >= 0 && !v[q]) --q; // make q point to the right position
        }
        ++p;
    }
}
Start at the far end of the array and work backwards. First scan until you hit a nonzero (if any). Keep track of the location of this nonzero. Keep scanning. Whenever you encounter a zero -- swap. Otherwise increase the count of nonzeros.
A Python implementation:
def consolidateAndCount(nums):
    count = 0
    # first locate the last nonzero
    i = len(nums) - 1
    while i >= 0 and nums[i] == 0:
        i -= 1
    if i < 0:
        # no nonzeros encountered (also covers an empty list)
        return 0
    count = 1  # since a nonzero was encountered
    for j in range(i - 1, -1, -1):
        if nums[j] == 0:
            # move to end
            nums[j], nums[i] = nums[i], nums[j]  # swap is constant space
            i -= 1
        else:
            count += 1
    return count
For example:
>>> nums = [1,0,4,2,1,0,5]
>>> consolidateAndCount(nums)
5
>>> nums
[1, 5, 4, 2, 1, 0, 0]
The suggested answers with two pointers and swapping change the order of the non-zero array elements, which conflicts with the example provided. (The asker doesn't state that restriction explicitly, though, so maybe it is irrelevant.)
Instead, go through the list from left to right and keep track of the number of 0s encountered so far.
Set counter = 0 (zeros encountered so far).
In each step, do the following:
Check if the current element is 0 or not.
If the current element is 0, increment the counter.
Otherwise, move the current element counter positions to the left.
Go to the next element.
When you reach the end of the list, overwrite the values from array[end-counter] to the end of the array with 0s.
The number of non-zero integers is the size of the array minus the counted zeros.
This algorithm has O(n) time complexity, as we go through the whole array at most twice (the worst case being an array of all 0s; the update scheme could be modified a little to go through it exactly once). It only uses one additional variable for counting, which satisfies the O(1) space constraint.
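The steps above can be sketched in Python as follows. The function name compact_nonzeros is made up, and returning the nonzero count from the same function is a convenience I added.

```python
def compact_nonzeros(nums):
    """Shift nonzeros left, keeping their order; return how many there are."""
    zeros = 0  # zeros encountered so far
    for idx, value in enumerate(nums):
        if value == 0:
            zeros += 1
        elif zeros:
            # Move the current element `zeros` positions to the left.
            nums[idx - zeros] = value
    # Overwrite the tail with the zeros that were skipped.
    for idx in range(len(nums) - zeros, len(nums)):
        nums[idx] = 0
    return len(nums) - zeros
```

On [1, 0, 4, 2, 1, 0, 5] this returns 5 and leaves [1, 4, 2, 1, 5, 0, 0], which matches the ordering in the question's example.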
Start by iterating over the array (with index i, say) while maintaining the count of zeros encountered so far (say zero_count).
Copy the value at index i + zero_count to the current index i.
If the copied value is 0, do not increment i; increment zero_count instead. Otherwise increment i.
Terminate the loop when i + zero_count reaches the array length.
Set the remaining array elements to 0.
Pseudo code:
zero_count = 0;
i = 0;
while (i + zero_count < arr.length) {
    arr[i] = arr[i + zero_count];
    if (arr[i] == 0) {
        zero_count++;
    } else {
        i++;
    }
}
while (i < arr.length) {
    arr[i] = 0;
    i++;
}
Additionally, this preserves the order of the non-zero elements in the array.
You can actually solve a more generic problem called the Dutch national flag problem, whose solution is used in Quicksort. It partitions an array into 3 parts according to a given mid value: first all numbers less than mid, then all numbers equal to mid, and then all numbers greater than mid.
Then you can pick the mid value as infinity and treat 0 as infinity.
The pseudocode given by the above link:
procedure three-way-partition(A : array of values, mid : value):
    i ← 0
    j ← 0
    n ← size of A - 1
    while j ≤ n:
        if A[j] < mid:
            swap A[i] and A[j]
            i ← i + 1
            j ← j + 1
        else if A[j] > mid:
            swap A[j] and A[n]
            n ← n - 1
        else:
            j ← j + 1
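As a rough illustration of the "treat 0 as infinity" trick, here is a Python version of the three-way partition. The key parameter and the returned index are assumptions of this sketch, not part of the pseudocode: key lets zeros be compared as infinity, and the return value is the number of elements that ended up below mid.

```python
import math


def three_way_partition(a, mid, key=lambda x: x):
    """Partition `a` in place around `mid`; return the count of keys < mid."""
    i, j, n = 0, 0, len(a) - 1
    while j <= n:
        if key(a[j]) < mid:
            a[i], a[j] = a[j], a[i]
            i += 1
            j += 1
        elif key(a[j]) > mid:
            a[j], a[n] = a[n], a[j]
            n -= 1
        else:
            j += 1
    return i


nums = [1, 0, 4, 2, 1, 0, 5]
# Zeros compare as infinity, so every nonzero lands in the "less than" part.
count = three_way_partition(nums, math.inf,
                            key=lambda x: math.inf if x == 0 else x)
print(count, nums)  # prints "5 [1, 4, 2, 1, 5, 0, 0]"
```

With mid set to infinity the "greater than" branch is never taken, so the procedure degenerates into a single forward pass that swaps each nonzero into the next free slot.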

Counting the number of moves for sorting an array if only unit shifts are allowed

So, I have an array containing integers. I need to sort it. However, the only operation I can perform is a unit shift. That is, I can move the last element of the sequence to its beginning.
`a1, a2, ..., an → an, a1, a2, ..., a(n-1).`
What is the minimum number of operations needed to sort the sequence?
The number of integers in the array can be up to 10^5, and each individual integer value can be up to 10^5 too. If the array is already sorted, print 0; if the array cannot be sorted by unit shifts, print -1.
The solution that I thought of:
1. Check if array is sorted or not.
2. If array is sorted, print 0, else
3. Set count = 0.
4. Rotate array by one unit and increment count.
5. Check if array is sorted: if yes, print count and break, else
6. repeat steps 4-5 while count < (total integers in array).
Now, the above solution has a time complexity of O(n^2): each sortedness check takes O(n) time, and I perform it after each of up to n rotations.
Can anyone suggest a better approach?
Thanks!
PS: I tried really hard to think of some other approach. I got as far as counting inversions, but that doesn't really help.
Just iterate over the array and find the first index i such that arr[i] > arr[i+1] (if there is no such index, we are done, since the array is already sorted). Then check whether arr[i+1],...,arr[n] is sorted and whether arr[n] <= arr[1]. If both hold, the array can be sorted with n-i rotations.
Otherwise, there is no solution.
Time complexity is O(n), space complexity O(1).
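A minimal Python sketch of this check, using 0-based indexes (so the answer's n-i becomes n - (descent + 1)); the function name min_unit_shifts is made up for illustration.

```python
def min_unit_shifts(arr):
    """Return the minimum number of unit shifts that sort arr, or -1."""
    n = len(arr)
    # First position where the order breaks, if any.
    descent = next((i for i in range(n - 1) if arr[i] > arr[i + 1]), None)
    if descent is None:
        return 0  # already sorted
    # The part after the break must itself be sorted ...
    tail_sorted = all(arr[k] <= arr[k + 1] for k in range(descent + 1, n - 1))
    # ... and its last element must fit in front of the first element.
    if tail_sorted and arr[-1] <= arr[0]:
        return n - (descent + 1)
    return -1
```

For example, [3, 4, 5, 1, 2] needs 2 shifts, while [2, 1, 3] cannot be sorted this way. The function makes at most two passes over the array and uses a constant number of extra variables.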
Appendix: Correctness of the algorithm:
Claim 1:
If the array cannot be split into two arrays
arr[1],arr[2],..,arr[i] and arr[i+1],...,arr[n] - both sorted,
then there is no solution.
Proof:
Let's assume i is the first index where arr[i] > arr[i+1], and let j be some other index such that arr[j] > arr[j+1], j != i. There must be such an index because arr[i+1],...,arr[n] is unsorted.
As long as arr[j+1] has not been "unit shifted" to the front, arr[j] still precedes arr[j+1], so the array is unsorted.
Immediately after arr[j+1] has been shifted, the array is still not sorted, since arr[i] > arr[i+1]; and after another shift, arr[j] is again before arr[j+1], violating the sorted order.
Thus, the array cannot be sorted.
Claim 2:
Assume an array that can be split into two sorted arrays
arr[1],...,arr[i], and arr[i+1],...,arr[n]. Also assume
arr[i] > arr[i+1]. Then, the array can be "unit shifted" into
sorted one, if and only if arr[n] <= arr[1].
Proof:
Only if (sortable implies arr[n] <= arr[1]):
The array is not sorted, so at least one unit shift must be done. This unit shift places arr[n] before arr[1], and the array will never be sorted unless arr[n] <= arr[1].
If (arr[n] <= arr[1] implies sortable):
Assume arr[n] <= arr[1]. Then by shifting arr[i+1],...,arr[n] to the front, we get the following array:
arr[i+1],arr[i+2],...,arr[n],arr[1],arr[2],...,arr[i]
Note that arr[i+1] <= arr[i+2] <= ... <= arr[n], since we assumed this part is sorted.
Similarly, arr[1] <= arr[2] <= ... <= arr[i].
Also note arr[n] <= arr[1], by assumption.
By joining the three inequalities above we get:
arr[i+1] <= arr[i+2] <= ... <= arr[n] <= arr[1] <= arr[2] <= ... <= arr[i]
The above is by definition a sorted array, which concludes the proof of the claim.
Claim 3:
The algorithm is correct
By applying Claim 1 and Claim 2, and handling separately the case where the array is already sorted, we get that:
The array can be sorted using "unit shifts" if and only if it is already sorted or the conditions of Claim 2 apply. This concludes the proof.
QED

Is there an O(n) algorithm to generate a prefix-less array for a positive integer array?

For array [4,3,5,1,2],
we call prefix of 4 is NULL, prefix-less of 4 is 0;
prefix of 3 is [4], prefix-less of 3 is 0, because none in prefix is less than 3;
prefix of 5 is [4,3], prefix-less of 5 is 2, because 4 and 3 are both less than 5;
prefix of 1 is [4,3,5], prefix-less of 1 is 0, because none in prefix is less than 1;
prefix of 2 is [4,3,5,1], prefix-less of 2 is 1, because only 1 is less than 2
So for the array [4, 3, 5, 1, 2], we get the prefix-less array [0, 0, 2, 0, 1].
Is there an O(n) algorithm to compute the prefix-less array?
It can't be done in O(n) with comparisons, for the same reason a comparison sort requires Ω(n log n) comparisons. The number of possible prefix-less arrays is n!, so you need at least log2(n!) bits of information to identify the correct prefix-less array, and log2(n!) is Θ(n log n) by Stirling's approximation.
Assuming that the input elements are always fixed-width integers you can use a technique based on radix sort to achieve linear time:
L is the input array
X is the list of indexes of L in focus for current pass
n is the bit we are currently working on
Count is the number of 0 bits at bit n left of current location
Y is the list of indexes of a subsequence of L for recursion
P is a zero initialized array that is the output (the prefixless array)
In pseudo-code...
Def PrefixLess(L, X, n)
    if (n == 0 or |X| <= 1)
        return
    // setup prefix less for bit n
    Count = 0
    For I in 1 to |X|
        If (L(X(I))[n] == 0)
            Count++
        Else
            // every earlier element of X with bit n = 0 is smaller
            P(X(I)) += Count
    // go through subsequence with bit n-1 with bit(n) = 1
    Y = []
    For I in 1 to |X|
        If (L(X(I))[n] == 1)
            Y.append(X(I))
    PrefixLess(L, Y, n-1)
    // go through subsequence on bit n-1 where bit(n) = 0
    Y = []
    For I in 1 to |X|
        If (L(X(I))[n] == 0)
            Y.append(X(I))
    PrefixLess(L, Y, n-1)
    return P
and then execute:
PrefixLess(L, 1..|L|, 32)
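Here is a Python sketch of the bit-by-bit idea, under the assumption that the inputs are non-negative integers that fit in `bits` bits. It is my reading of the approach rather than a transcription: the count of earlier 0-bits is added only for elements whose current bit is 1, results are indexed by original position, and the recursion stops on groups of fewer than two elements. The names prefix_less and solve are illustrative.

```python
def prefix_less(values, bits=32):
    """For each element, count the earlier elements that are strictly smaller."""
    result = [0] * len(values)

    def solve(indexes, bit):
        # `indexes` holds positions whose values agree on all higher bits.
        if bit < 0 or len(indexes) <= 1:
            return
        zeros_seen = 0
        ones, zeros = [], []
        for idx in indexes:
            if (values[idx] >> bit) & 1:
                # Earlier group members with a 0 in this bit are smaller.
                result[idx] += zeros_seen
                ones.append(idx)
            else:
                zeros_seen += 1
                zeros.append(idx)
        solve(ones, bit - 1)
        solve(zeros, bit - 1)

    solve(list(range(len(values))), bits - 1)
    return result


print(prefix_less([4, 3, 5, 1, 2]))  # prints "[0, 0, 2, 0, 1]"
```

Each bit level touches every element at most once, so the total work is proportional to n times the bit width, which is linear in n only when the width is treated as a constant.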
I think this should work, but double-check the details. Let's call an element of the original array a[i] and the corresponding element of the prefix-less array p[i], where i is the index into the respective arrays.
So, say we are at a[i] and we have already computed the value of p[i]. There are three possible cases. If a[i] == a[i+1], then p[i+1] == p[i]. If a[i] < a[i+1], then p[i+1] >= p[i] + 1. This leaves us with the case where a[i] > a[i+1]. In this situation we know that p[i+1] <= p[i].
In the naïve case, we go back through the prefix and start counting items less than a[i]. However, we can do better than that. First, recognize that the minimum value for p[i] is 0 and the maximum is i. Next look at the case of an index j, where i > j. If a[i] >= a[j], then p[i] >= p[j]. If a[i] < a[j], then p[i] <= p[j] + (i - j - 1), since besides the elements counted by p[j], only the i - j - 1 elements strictly between positions j and i can be less than a[i]. So, we can go backwards through p, updating the bounds p[i]_min and p[i]_max. If p[i]_min equals p[i]_max, then we have our solution.
Doing a back of the envelope analysis of the algorithm, it has O(n) best case performance. This is the case where the list is already sorted. The worst case is where it is reversed sorted. Then the performance is O(n^2). The average performance is going to be O(k*n) where k is how much one needs to backtrack. My guess is for randomly distributed integers, k will be small.
I am also pretty sure there would be ways to optimize this algorithm for cases of partially sorted data. I would look at Timsort for some inspiration on how to do this. It uses run detection to detect partially sorted data. So the basic idea for the algorithm would be to go through the list once and look for runs of data. For ascending runs of data you are going to have the case where p[i+1] >= p[i]+1. For descending runs, p[i] <= p_run[0], where p_run[0] is the value for the first element in the run.

Resources