Start, end and stopping condition of Binary Search code - arrays

I hope everyone is doing well.
I know there are a lot of questions with titles very similar to mine, but I have a doubt regarding the start and end values:
If I use start<=end as the stopping condition, then whether I use start=0, end=n or n-1 | start=-1, end=n-1 or n | start=-2, end=n-2 | start=-3, end=n-3, the output is the same in all these cases for the code below:
while(l<=r)
{
    int mid=(l+r)/2;
    if(arr[mid]==k)
        return mid;
    else if(k>arr[mid])
        l=mid+1;
    else
        r=mid-1;
}
From start=-4 to end=-4 the result is wrong.
I've read the other questions and learned the fact that changing the stopping condition changes the range of start and end (inclusive/ exclusive ) but,
Why is binary search working for start=0/-1/-2/-3 to end=n/n+1/n+2/n+3? I mean apart from starting mid=(n-2+2)/2<=> n/2, there may be a condition when the target element appears at arr[0].
Thanks for spending your precious time in solving my query.
I hope I've written clearly.

If I use stopping condition as start<=end then, whether start=0 ,end=n/n-1|start=-1,end=n-1/n|start=-2 to end=n-2| start=-3 to end=n-3 the output is same in all these cases for the below code.
This is not true. start and end must start at precisely the first and last index of the array, or there will be cases where the algorithm you provided fails. So assuming zero-based array indexing, start = 0, end = n - 1 is the only correct initialisation for this algorithm.
Here are counter examples for some alternatives:
start = 0, end = n: if k is a value that is greater than the greatest value in arr, then eventually mid will become equal to n and arr[mid] will be an invalid reference. Depending on the language this may trigger an exception.
start = -1, end = n - 1: if k is a value that is less than the least value in arr, then eventually mid will become equal to -1 and arr[mid] will be an invalid reference.
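start = -2, end = n - 2: if k == arr[0] and n == 3, the element will not be found when integer division rounds down, as mid = (-2 + 1) / 2 will get the value -1. (Where division truncates toward zero, as in C or Java, mid is 0 here, but k == arr[n - 1] is still never found, because mid can never exceed end.)
start = -2, end = n + 2: if k == arr[0] and n == 3, the element will not be found, as mid will get the value 1 and in the next iteration it will be -1, so index 0 is skipped forever.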
...etc
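The counterexamples above can be reproduced with a small Python sketch. It assumes zero-based indexing; the optional lo/hi parameters and the explicit range check are illustrative additions (Python would otherwise silently wrap negative indices), not part of the asker's code.

```python
def binary_search(arr, k, lo=None, hi=None):
    """Inclusive-bounds binary search; returns the index of k or -1.

    lo and hi default to the only safe initialisation: 0 and len(arr) - 1.
    """
    if lo is None:
        lo = 0
    if hi is None:
        hi = len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2  # floor division
        if not 0 <= mid < len(arr):
            # Make an out-of-range probe visible instead of wrapping around.
            raise IndexError(f"mid={mid} is outside the array")
        if arr[mid] == k:
            return mid
        elif k > arr[mid]:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1
```

With arr = [10, 20, 30], the default bounds find every element and return -1 for a missing one, while binary_search(arr, 99, 0, len(arr)) probes index 3 and raises.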

Related

Counting segments without the function Count

I have the following problem: given an array, count the number of segments of length k in which the number of positives in the left half of the segment is greater than or equal to the number of positives in the right half.
As an example (imagine segments can only be even, so that there is no discussion about what a half is):
k=2 ---> count(array[-4,-2,2,1],k) ---> 2, as [-4,-2] fulfills and also [2,1]
k=4 ---> count(array[-4,-2,2,1],k) ---> 0, as [-4,-2,2,1] does not fulfil.
k=6 ---> count(array[-4,-2,2,1],k) ---> 0, as there are not length 6 segments.
I have solved it recursively, using the function Count, in a trivial way: I move the array from left to right, enumerating all the segments of length k, and applying the count on each of those. It is done in Dafny:
function method Count_segments(sequ: seq<int>, seg_length:int): int
{
    if |sequ| == 0 then 0
    else (if (Count(x => x >= 0, sequ[0..seg_length/2])) >= (Count(x => x >= 0, sequ[seg_length/2..seg_length])) then
            1 + if (|sequ|-1 < seg_length) then 0 //I add the condition that says that if in the next iteration, sequ will be smaller than the sequence_length, then we end.
                else Count_segments(sequ[1..], seg_length)
          else if (|sequ|-1 < seg_length) then 0
          else Count_segments(sequ[1..], seg_length)
         )
}
But, obviously, using Count, I am not doing a linear search iteratively (in the first example, instead of searching 4 times, it does 6 times). I would like to implement this in O(n) but cannot find any info, does anyone have an idea? I do not care about the programming language (can answer me in any language), but about the algorithm itself.
Thanks!!
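One possible way to get linear time, sketched in Python, is to precompute prefix counts of positives so that each half of each segment is evaluated in O(1). This is only a sketch under assumptions: the name count_segments is made up for illustration, seg_length is assumed even, and "positive" is taken to mean x > 0 (the Dafny code above tests x >= 0, so adjust the comparison if zero should count).

```python
def count_segments(values, seg_length):
    """Count windows of length seg_length whose left half contains at
    least as many positive numbers as the right half."""
    n = len(values)
    if seg_length <= 0 or seg_length > n:
        return 0  # no segment of that length exists
    half = seg_length // 2
    # prefix[i] = number of positive entries among values[:i]
    prefix = [0] * (n + 1)
    for i, x in enumerate(values):
        prefix[i + 1] = prefix[i] + (1 if x > 0 else 0)
    total = 0
    for start in range(n - seg_length + 1):
        left = prefix[start + half] - prefix[start]
        right = prefix[start + seg_length] - prefix[start + half]
        if left >= right:
            total += 1
    return total
```

Building the prefix array is one pass and the window loop is another, so the whole computation is O(n) time with O(n) extra space.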

How to determine length of array in O(log n) time, by only calling A[i]?

I have an array of unknown length n, where each element of the array is the number 1. So its like A=[1,1,1,1.....], n times. Now I need to write an algorithm to find out the value of n.
Here is the full problem statement:
You are given an array A of length n. Each value is 1. However, you do not know what the value of n is. You are allowed to access an element i of the array by calling A[i]. If i < n, this will return 1, and if i >= n, this will return a message saying that the location does not exist (you can think of this as an error message, but it will not crash your program). Describe an O(log n) algorithm to determine the value of n.
This is what I have come up with :
Set i = 1;
while A(i-1) is true // i.e. for i <= n, it will return the element for the index (i-1)
    print i_old = i
    print i_new = 2i // doubling the value of i
    i = i_new // to run the loop again to check the array A with new value of i
else
    print no element found // when i>n
Now, this means the last index is between the last pair of i_old and i_new.
So, for an array [1,1,1,1,1,1] of six elements the value of n can be between 4 and 6, i.e. it can be either 4 or 5.
I am not sure about how to proceed further. What am I missing?
We can solve this using the exponential search algorithm in O(log n) time:
Assume the length is 2. Try successive powers of 2 (2, 4, 8, 16...) until an out of bounds exception is raised. This gives a guarantee of an upper bound on the possible length. Perform a right binary search between 0 and this upper bound. On each bisection iteration, if the midpoint index doesn't raise an exception, try the upper half of the search space, else try the lower half. When our bounds meet, return that index.
Python implementation:
def length(L):
    lo = 0
    hi = 1
    while 1:
        hi *= 2
        try:
            L[hi]
        except IndexError:
            break
    while lo < hi:
        mid = (lo + hi) // 2
        try:
            L[mid]
            lo = mid + 1
        except IndexError:
            hi = mid
    return hi

if __name__ == "__main__":
    assert all(length([1] * i) == i for i in range(1000))

How do you reorganize an array within O(n) runtime & O(1) space complexity?

I'm a 'space-complexity' neophyte and was given a problem.
Suppose I have an array of arbitrary integers:
[1,0,4,2,1,0,5]
How would I reorder this array to have all the zeros at one end:
[1,4,2,1,5,0,0]
...and compute the count of non-zero integers (in this case: 5)?
... in O(n) runtime with O(1) space complexity?
I'm not good at this.
My background is more environmental engineering than computer science so I normally think in the abstract.
I thought I could do a sort, then count the non-zero integers.
Then I thought I could merely do a element-per-element copy as I re-arrange the array.
Then I thought something like a bubble sort, switching neighboring elements till I reached the end with the zeroes.
I thought I could save on the 'space-complexity' via shift array-members' addresses, being that the array point points to the array, with offsets to its members.
I either enhance the runtime at the expense of the space complexity or vice versa.
What's the solution?
A two-pointer approach will solve this task and keep within the time and memory constraints.
Start by placing one pointer at the end, another at the start of the array. Then decrement the end pointer until it points to a non-zero element.
Now the main loop:
If the start pointer points to zero, swap it with the value pointed to by the end pointer; then decrement the end pointer until it again points to a non-zero element.
Always increment the start pointer.
Finish when the start pointer becomes greater than or equal to the end pointer.
Finally, return the position of the end pointer plus one - that's the number of nonzero elements.
This is the Swift code for the smart answer provided by #kfx
func putZeroesToLeft(_ nums: inout [Int]) {
    // Index of the first non-zero element; nothing to do if there is none.
    guard var firstNonZeroIndex = nums.firstIndex(where: { $0 != 0 }) else { return }
    for index in firstNonZeroIndex..<nums.count {
        if nums[index] == 0 {
            // Move the zero in front of the block of non-zero elements.
            nums.swapAt(firstNonZeroIndex, index)
            firstNonZeroIndex += 1
        }
    }
}
Time complexity
There are 2 simple (not nested) loops repeated max n times (where n is the length of input array). So time is O(n).
Space complexity
Besides the input array we only use the firstNonZeroIndex int var. So the space is definitely a constant: O(1).
As indicated by the other answers, the idea is to have two pointers, p and q: p points at the beginning of the array and q points at the end of the array (specifically at the first nonzero entry from behind). Scan the array with p; each time you hit a 0, swap the elements pointed to by p and q and decrement q (specifically, make it point to the next nonzero entry from behind). Increment p at every step and iterate as long as p < q.
In C++, you could do something like this:
#include <utility>
#include <vector>

void rearrange(std::vector<int>& v) {
    int p = 0, q = static_cast<int>(v.size()) - 1;
    // make q point to the right position
    while (q >= 0 && !v[q]) --q;
    while (p < q) {
        if (!v[p]) { // found a zero element
            std::swap(v[p], v[q]);
            while (q >= 0 && !v[q]) --q; // make q point to the right position
        }
        ++p;
    }
}
Start at the far end of the array and work backwards. First scan until you hit a nonzero (if any). Keep track of the location of this nonzero. Keep scanning. Whenever you encounter a zero -- swap. Otherwise increase the count of nonzeros.
A Python implementation:
def consolidateAndCount(nums):
    count = 0
    # first locate the last nonzero (the bounds check also covers an empty list)
    i = len(nums) - 1
    while i >= 0 and nums[i] == 0:
        i -= 1
    if i < 0:
        # no nonzeros encountered
        return 0
    count = 1  # since a nonzero was encountered
    for j in range(i - 1, -1, -1):
        if nums[j] == 0:
            # move to end
            nums[j], nums[i] = nums[i], nums[j]  # swap is constant space
            i -= 1
        else:
            count += 1
    return count
For example:
>>> nums = [1,0,4,2,1,0,5]
>>> consolidateAndCount(nums)
5
>>> nums
[1, 5, 4, 2, 1, 0, 0]
The suggested answers with 2 pointers and swapping are changing the order of non-zero array elements which is in conflict with the example provided. (Although he doesn't name that restriction explicitly, so maybe it is irrelevant)
Instead, go through the list from left to right and keep track of the number of 0s encountered so far.
Set counter = 0 (zeros encountered so far).
In each step, do the following:
Check if the current element is 0 or not.
If the current element is 0, increment the counter.
Otherwise, move the current element by counter to the left.
Go to the next element.
When you reach the end of the list, overwrite the values from array[end-counter] to the end of the array with 0s.
The number of non-zero integers is the size of the array minus the counted zeros.
This algorithm has O(n) time complexity, as we go through the whole array at most twice (the worst case is an array of all 0s; the update scheme could be modified slightly to make exactly one pass). It only uses one additional variable for counting, which satisfies the O(1) space constraint.
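A minimal Python sketch of the steps described above; the function name move_zeros_to_end is chosen here for illustration only.

```python
def move_zeros_to_end(nums):
    """Shift non-zero values left in place, keeping their order, then
    fill the tail with zeros. Returns the number of non-zero values."""
    zeros = 0  # zeros encountered so far
    for i, x in enumerate(nums):
        if x == 0:
            zeros += 1
        elif zeros:
            # Move the current element left by the number of zeros seen.
            nums[i - zeros] = x
    # Overwrite the last `zeros` positions with 0.
    for i in range(len(nums) - zeros, len(nums)):
        nums[i] = 0
    return len(nums) - zeros
```

For the list [1, 0, 4, 2, 1, 0, 5] this returns 5 and leaves the list as [1, 4, 2, 1, 5, 0, 0], matching the ordering in the question.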
Start by iterating over the array (with index i, say) while maintaining a count of the zeros encountered so far (say zero_count).
At each step, look at the element at index i + zero_count. If it is 0, do not increment i; increment zero_count instead.
Otherwise, copy the value at index i + zero_count to the current index i and increment i.
Terminate the loop when i + zero_count reaches the array length.
Set the remaining array elements to 0.
Pseudo code:
zero_count = 0;
i = 0;
while (i + zero_count < arr.length) {
    if (arr[i + zero_count] == 0) {
        zero_count++;
    } else {
        arr[i] = arr[i + zero_count];
        i++;
    }
}
while (i < arr.length) {
    arr[i] = 0;
    i++;
}
Additionally, this preserves the order of the non-zero elements in the array.
You can actually solve a more generic problem called the Dutch national flag problem, which is used in Quicksort. It partitions an array into 3 parts according to a given mid value: first all numbers less than mid, then all numbers equal to mid, and then all numbers greater than mid.
Then you can pick the mid value as infinity and treat 0 as infinity.
The pseudocode given by the above link:
procedure three-way-partition(A : array of values, mid : value):
    i ← 0
    j ← 0
    n ← size of A - 1
    while j ≤ n:
        if A[j] < mid:
            swap A[i] and A[j]
            i ← i + 1
            j ← j + 1
        else if A[j] > mid:
            swap A[j] and A[n]
            n ← n - 1
        else:
            j ← j + 1
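The pseudocode translates fairly directly to Python. In this sketch the key parameter is an assumption added for illustration: it lets 0 be treated as infinity, as suggested above, without changing the stored values.

```python
import math


def three_way_partition(a, mid, key=lambda x: x):
    """Partition a in place into keys < mid, keys == mid, keys > mid.

    Returns (i, j): a[:i] holds the smaller keys and a[i:j] the equal ones.
    """
    i, j, n = 0, 0, len(a) - 1
    while j <= n:
        if key(a[j]) < mid:
            a[i], a[j] = a[j], a[i]
            i += 1
            j += 1
        elif key(a[j]) > mid:
            a[j], a[n] = a[n], a[j]
            n -= 1
        else:
            j += 1
    return i, j


def zeros_last(nums):
    """Treat 0 as infinity so every zero lands in the 'equal to mid' part."""
    nonzero_count, _ = three_way_partition(
        nums, math.inf, key=lambda x: math.inf if x == 0 else x
    )
    return nonzero_count
```

Because no key is ever greater than infinity, only the first and last branches run in zeros_last, so the non-zero elements keep their relative order.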

ending condition in finding magic index in an array

I have a question about the solution I'm referring to for the problem below:
A magic index in an array A[1...n-1] is defined to be an index such
that A[i] = i. Given a sorted array of distinct integers, write a
method to find a magic index, if one exists, in array A.
The solution I'm referring to looks like this. Assume 's' stands for start and 'e' stands for end.
int fun(int a[], int s, int e)
{
    if(s > e || s < 0 || e >= a.length)
        return -1;
    int mid = (s + e)/2;
    if(mid == a[mid])
        return mid;
    else if(mid < a[mid])
        return fun(a, s, mid-1);
    else
        return fun(a, mid+1, e);
}
I'm not sure about the ending condition here.
I feel the ending condition should just be
if(s > e)
return -1;
Let's consider the two extreme cases when the magic index is not present
CASE 1 - going left till index 0
Say the array looks as follows a[] = {2,10,20,30,40,50}
mid = (0+5)/2 = 2 , call fun(0,1)
mid = (0+1)/2 = 0 , call fun(0,-1)
since start > end, -1 is returned
CASE 2 - going right till the last element
Say the array looks as follows a[] = {-20,-10,-5,-4,-3,30,80}
mid = (0+6)/2 = 3 , call fun(4,6)
mid = (4+6)/2 = 5 , call fun(6,6)
mid = (6+6)/2 = 6 , call fun(7,6)
since start > end, -1 is returned
Moreover, I feel the extra conditions given in the solution can never be reached.
I feel s<0 cannot be reached because we are never subtracting anything from 's'. I feel the smallest value that 's' can take is 0. Maybe 'e' can be < 0, but not 's'
Also I feel e >= array.length is not possible since we are never adding anything to 'e'. Maybe 's' can be greater than or equal to array.length but not 'e'
You're right, s > e is enough. s can never be below zero, since in each recursive call it is either preserved or set to (s+e)/2+1 >= s+1 (since e >= s), so it is always greater than or equal to the initial value passed, which is zero. Similarly, it can be shown that e <= n-1 always holds, so the extra conditions are redundant.
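A short Python sketch of the same recursion with only the s > e check, assuming zero-based indexing; the name magic_index and the default arguments are illustrative and not taken from the referenced solution.

```python
def magic_index(a, s=0, e=None):
    """Return an index i with a[i] == i in a sorted list of distinct
    integers, or -1 if there is none."""
    if e is None:
        e = len(a) - 1
    if s > e:  # the only termination check that is needed
        return -1
    mid = (s + e) // 2
    if a[mid] == mid:
        return mid
    if a[mid] > mid:
        # Values grow at least as fast as indices, so look left.
        return magic_index(a, s, mid - 1)
    return magic_index(a, mid + 1, e)
```

Both extreme cases from the question end through s > e alone: the search runs off the left edge with e = -1 or off the right edge with s = len(a), and in neither case is an out-of-range index read.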

Find shortest subarray containing all elements

Suppose you have an array of numbers, and another set of numbers. You have to find the shortest subarray containing all numbers with minimal complexity.
The array can have duplicates, and let's assume the set of numbers does not. It's not ordered - the subarray may contain the set of number in any order.
For example:
Array: 1 2 5 8 7 6 2 6 5 3 8 5
Numbers: 5 7
Then the shortest subarray is obviously Array[2:5] (python notation).
Also, what would you do if you want to avoid sorting the array for some reason (a la online algorithms)?
Proof of a linear-time solution
I will write right-extension to mean increasing the right endpoint of a range by 1, and left-contraction to mean increasing the left endpoint of a range by 1. This answer is a slight variation of Aasmund Eldhuset's answer. The difference here is that once we find the smallest j such that [0, j] contains all interesting numbers, we thereafter consider only ranges that contain all interesting numbers. (It's possible to interpret Aasmund's answer this way, but it's also possible to interpret it as allowing a single interesting number to be lost due to a left-contraction -- an algorithm whose correctness has yet to be established.)
The basic idea is that for each position j, we will find the shortest satisfying range ending at position j, given that we know the shortest satisfying range ending at position j-1.
EDIT: Fixed a glitch in the base case.
Base case: Find the smallest j' such that [0, j'] contains all interesting numbers. By construction, there can be no ranges [0, k < j'] that contain all interesting numbers so we don't need to worry about them further. Now find the largest i such that [i, j'] contains all interesting numbers (i.e. hold j' fixed). This is the smallest satisfying range ending at position j'.
To find the smallest satisfying range ending at any arbitrary position j, we can right-extend the smallest satisfying range ending at position j-1 by 1 position. This range will necessarily also contain all interesting numbers, though it may not be minimal-length. The fact that we already know this is a satisfying range means that we don't have to worry about extending the range "backwards" to the left, since that can only increase the range over its minimal length (i.e. make the solution worse). The only operations we need to consider are left-contractions that preserve the property of containing all interesting numbers. So the left endpoint of the range should be advanced as far as possible while this property holds. When no more left-contractions can be performed, we have the minimal-length satisfying range ending at j (since further left-contractions clearly cannot make the range satisfying again) and we are done.
Since we perform this for each rightmost position j, we can take the minimum-length range over all rightmost positions to find the overall minimum. This can be done using a nested loop in which j advances on each outer loop cycle. Clearly j advances by 1 n times. Since at any point in time we only ever need the leftmost position of the best range for the previous value of j, we can store this in i and just update it as we go. i starts at 0, is at all times <= j <= n, and only ever advances upwards by 1, meaning it can advance at most n times. Both i and j advance at most n times, meaning that the algorithm is linear-time.
In the following pseudo-code, I've combined both phases into a single loop. We only try to contract the left side if we have reached the stage of having all interesting numbers:
# x[0..m-1] is the array of interesting numbers.
# Load them into a hash/dictionary:
For i from 0 to m-1:
    isInteresting[x[i]] = 1
i = 0
nDistinctInteresting = 0
minRange = infinity
For j from 0 to n-1:
    If count[a[j]] == 0 and isInteresting[a[j]]:
        nDistinctInteresting++
    count[a[j]]++
    If nDistinctInteresting == m:
        # We are in phase 2: contract the left side as far as possible
        While count[a[i]] > 1 or not isInteresting[a[i]]:
            count[a[i]]--
            i++
        If j - i < minRange:
            minRange = j - i
            (minI, minJ) = (i, j)
count[] and isInteresting[] are hashes/dictionaries (or plain arrays if the numbers involved are small).
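For readers who want something runnable, here is one way the pseudo-code might look in Python. The function name shortest_window and the choice to return inclusive (i, j) indices, or None when no range qualifies, are assumptions made for this sketch.

```python
from collections import defaultdict


def shortest_window(a, interesting):
    """Return inclusive indices (i, j) of a shortest range of a that
    contains every value in interesting, or None if there is none."""
    wanted = set(interesting)
    if not wanted:
        return None
    count = defaultdict(int)  # occurrences of each value in the window
    n_distinct = 0            # distinct interesting values in the window
    best = None
    i = 0
    for j, value in enumerate(a):
        if value in wanted and count[value] == 0:
            n_distinct += 1
        count[value] += 1
        if n_distinct == len(wanted):
            # Phase 2: contract the left side as far as possible.
            while count[a[i]] > 1 or a[i] not in wanted:
                count[a[i]] -= 1
                i += 1
            if best is None or j - i < best[1] - best[0]:
                best = (i, j)
    return best
```

On the example from the question (array 1 2 5 8 7 6 2 6 5 3 8 5, numbers 5 and 7) this returns (2, 4), i.e. Array[2:5] in Python slice notation.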
This sounds like a problem that is well-suited for a sliding window approach: maintain a window (a subarray) that is gradually expanding and contracting, and use a hashmap to keep track of the number of times each "interesting" number occurs in the window. E.g. start with an empty window, then expand it to include only element 0, then elements 0-1, then 0-2, 0-3, and so on, by adding subsequent elements (and using the hashmap to keep track of which numbers exist in the window). When the hashmap tells you that all interesting numbers exist in the window, you can begin contracting it: e.g. 0-5, 1-5, 2-5, etc., until you find out that the window no longer contains all interesting numbers. Then, you can begin expanding it on the right hand side again, and so on. I'm quite (but not entirely) sure that this would work for your problem, and it can be implemented to run in linear time.
Say the array has n elements and the set has m elements.
Sort the array, noting the reverse index (position in the original array)
    // O(n log n) time
for each element in the given set
    find it in the array
    // O(m log n) time - log n for binary search, m times
    keep track of the minimum and maximum index for each found element
min - max defines your range
Total time complexity: O((m+n) log n)
This solution definitely does not run in O(n) time as suggested by some of the pseudocode above, however it is real (Python) code that solves the problem and by my estimates runs in O(n^2):
def small_sub(A, B):
    len_A = len(A)
    len_B = len(B)
    sub_A = []
    sub_size = -1
    dict_b = {}
    for elem in B:
        if elem in dict_b:
            dict_b[elem] += 1
        else:
            dict_b.update({elem: 1})
    for i in range(0, len_A - len_B + 1):
        if A[i] in dict_b:
            temp_size, temp_sub = find_sub(A[i:], dict_b.copy())
            if (sub_size == -1 or (temp_size != -1 and temp_size < sub_size)):
                sub_A = temp_sub
                sub_size = temp_size
    return sub_size, sub_A

def find_sub(A, dict_b):
    index = 0
    for i in A:
        if len(dict_b) == 0:
            break
        if i in dict_b:
            dict_b[i] -= 1
            if dict_b[i] <= 0:
                del(dict_b[i])
        index += 1
    if len(dict_b) > 0:
        return -1, {}
    else:
        return index, A[0:index]
Here's how I solved this problem in linear time using collections.Counter objects
from collections import Counter

def smallest_subsequence(stream, search):
    if not search:
        return []  # the shortest subsequence containing nothing is nothing
    search_counts = Counter(search)
    minimal_subsequence = None
    start = 0
    end = 0
    subsequence_counts = Counter()
    while True:
        # while subsequence_counts doesn't have enough elements to cancel out every
        # element in search_counts, take the next element from stream
        while search_counts - subsequence_counts:
            if end == len(stream):  # if we've reached the end of the list, we're done
                return minimal_subsequence
            subsequence_counts[stream[end]] += 1
            end += 1
        # while subsequence_counts has enough elements to cover search_counts, keep
        # removing from the start of the sequence
        while not search_counts - subsequence_counts:
            if minimal_subsequence is None or (end - start) < len(minimal_subsequence):
                minimal_subsequence = stream[start:end]
            subsequence_counts[stream[start]] -= 1
            start += 1

print(smallest_subsequence([1, 2, 5, 8, 7, 6, 2, 6, 5, 3, 8, 5], [5, 7]))
# [5, 8, 7]
Java solution
// Assumes java.util.* is imported and that Subarray is a simple holder class
// with int fields start and end (its definition is not shown in the snippet).
// The method name shortestSubarray is a placeholder; the original snippet
// did not include its signature.
// Example input: paragraph = Arrays.asList("a", "c", "d", "m", "b", "a"),
//                keywords = new HashSet<>(Arrays.asList("a", "b"))
static Subarray shortestSubarray(List<String> paragraph, Set<String> keywords) {
    Subarray result = new Subarray(-1, -1);
    Map<String, Integer> keyWordFreq = new HashMap<>();
    int numKeywords = keywords.size();
    // slide the window until it contains all the keywords,
    // starting with [0,0]
    for (int left = 0, right = 0; right < paragraph.size(); right++) {
        // expand right to contain all the keywords
        String currRight = paragraph.get(right);
        if (keywords.contains(currRight)) {
            keyWordFreq.put(currRight, keyWordFreq.get(currRight) == null ? 1 : keyWordFreq.get(currRight) + 1);
        }
        // loop is entered when all the keywords are present in the current window;
        // contract left while all the keywords are still present
        while (keyWordFreq.size() == numKeywords) {
            String currLeft = paragraph.get(left);
            if (keywords.contains(currLeft)) {
                // remove from the map if it is the last occurrence, so that the loop exits
                if (keyWordFreq.get(currLeft).equals(1)) {
                    // now check if the current subarray is the smallest
                    if ((result.start == -1 && result.end == -1) || (right - left) < (result.end - result.start)) {
                        result = new Subarray(left, right);
                    }
                    keyWordFreq.remove(currLeft);
                } else {
                    // else reduce the frequency
                    keyWordFreq.put(currLeft, keyWordFreq.get(currLeft) - 1);
                }
            }
            left++;
        }
    }
    return result;
}

Resources