Obtaining strictly increasing sequence in array? - arrays

I am working on this algorithm problem:
Given a sequence of integers as an array, determine whether it is possible to obtain a strictly increasing sequence by removing no more than one element from the array.
I am not sure how to proceed with this.

Hints:
The "strictly increasing" property breaks when an element is immediately followed by a non-larger element. You can detect such a configuration by a simple linear search (and there is no faster way).
Now there are two ways to fix it: remove the first element of the offending pair, or the second. Ask yourself whether the two choices are equivalent.
Next you need to check that no more inversions exist after the removal (by continuing the search from the right point).
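The hints above can be sketched in Python as follows. This is only a sketch under my reading of the hints, not the only way to do it; the name can_be_strictly_increasing is made up for illustration. At the first inversion it decides locally which of the two elements to drop, and a second inversion means failure.

```python
def can_be_strictly_increasing(seq):
    """Return True if removing at most one element makes seq strictly increasing."""
    removed = False
    for i in range(1, len(seq)):
        if seq[i - 1] < seq[i]:
            continue  # no inversion at this position
        if removed:
            return False  # second inversion: one removal is not enough
        removed = True
        # Option A: drop seq[i - 1]; works if seq[i] still exceeds seq[i - 2].
        if i == 1 or seq[i - 2] < seq[i]:
            continue
        # Option B: drop seq[i]; works if seq[i + 1] exceeds seq[i - 1].
        if i + 1 == len(seq) or seq[i - 1] < seq[i + 1]:
            continue
        return False
    return True
```

The two choices are not equivalent: dropping the left element keeps the smaller value, which is the safer choice for the rest of the scan, so it is tried first.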

This problem is LeetCode #1909.
We can easily check whether the array is strictly increasing with one item removed, e.g. (C# code):
// if nums is sorted except the item at indexToRemove
public bool IsSorted(int[] nums, int indexToRemove) {
if (nums.Length <= 1)
return true;
int prior = 0;
bool hasPrior = false;
for (int i = 0; i < nums.Length; ++i) {
if (i == indexToRemove) // don't count for item at indexToRemove
continue;
if (!hasPrior)
hasPrior = true;
else if (prior >= nums[i])
return false;
prior = nums[i];
}
return true;
}
Then we can scan the array, and if we have a conflict, i.e. array[i - 1] >= array[i], we can try to remove either the (i-1)th or the ith item and check whether we get a strictly increasing sequence:
public bool CanBeIncreasing(int[] nums) {
if (nums.Length <= 2)
return true;
for (int i = 1; i < nums.Length; ++i) {
if (nums[i - 1] < nums[i]) // no conflict, keep on doing
continue;
// conflict; can we resolve it removing item #i - 1 or item #i?
return IsSorted(nums, i - 1) || IsSorted(nums, i);
}
// no conflicts at all, we don't need to remove any item
return true;
}
Time complexity: O(n) (in the worst case we scan the array 3 times)
Space complexity: O(1)

Related

Algorithm to step over array elements as per the given condition

I am practicing with this problem and have 5 test cases passing, but some test cases fail and I cannot figure out what is wrong with my algorithm. I tried some of the data from the failed test cases; most of it comes out correctly, but some results must be wrong, since my algorithm still fails. I would appreciate any insight into the correct way to implement this algorithm, or into where my implementation goes wrong.
My Algo:
1. Index for the move is at index '0' of string (say moving index)
2. Loop over the string starting with index '1' of string:
2.1. check if (moving index + leap) can outrun the array:
2.2. If not then, check whether the character is 1 or 0 :
2.2.1 Check for the number of '1's that are continuous, if they exceed the leap value then return false (as anyway we will not be able to jump).
2.2.2 If its 0, then check whether its a zero after continuous '1's.
If not so, continue moving forward one step at a time.
If so, first try to skip over those continuous '1's by checking whether (moving index + leap) is allowed or not as per the rule.
If not allowed, check in a while loop till what point we can move backwards one step at a time to get (moving index + leap) to satisfy.
If not possible, return false.
I don't know whether this is an efficient way to solve this sort of problem; any other possible methods are much appreciated.
Code:
import java.util.*;
public class Solution {
public static int leapStep(int index,int leap,int len,int[] game){
if(game[index+leap]==0){
index += leap;
}
return index;
}
public static boolean canWin(int leap, int[] game) {
int index = 0;
int len = game.length;
int consecutiveLength=0;
for(int i=1;i<len;){
if(index+leap>len-1){
return true;
}
if(game[i]==1){
consecutiveLength++;
if(consecutiveLength>=leap){
return false;
}
i++;
}else{
if(consecutiveLength==0){
index =i;
i++;
}else{
if(index+leap<=len-1){
int tryLeap = leapStep(index,leap,len,game);
if(index < tryLeap){
index = tryLeap;
tryLeap =0;
i = index+1;
}else if(index>0 && game[index-1]==0 ){
boolean notViable = false;
while(index>0){
if(game[index-1]!=0)
return false;
index -= 1;
i = index+1;
tryLeap = leapStep(index,leap,len,game);
if(index<tryLeap){
index = tryLeap;
i = index+1;
tryLeap=0;
notViable = false;
break;
}
else{
notViable = true;
}
}
if(notViable){
return false;
}
}else{
return false;
}
}
consecutiveLength=0;
}
}
}//closing for
return true;
}
public static void main(String[] args) {
Scanner scan = new Scanner(System.in);
int q = scan.nextInt();
while (q-- > 0) {
int n = scan.nextInt();
int leap = scan.nextInt();
int[] game = new int[n];
for (int i = 0; i < n; i++) {
game[i] = scan.nextInt();
}
System.out.println( (canWin(leap, game)) ? "YES" : "NO" );
}
scan.close();
}
}
To me, a better approach is to solve this recursively as below (it passed all the tests):
public static boolean canWin(int[] array, int index, int leap) {
// the only case when we lose
if (index < 0 || array[index] > 0) {
return false;
}
// if you're standing in the last entry or (index + leap) >= array.length then win
if ((index >= array.length - 1) || ((index + leap) >= array.length)) {
return true;
}
// mark it as visited so that we do not iterate over it again
array[index] = 1;
// check all 3 conditions then recursively again
return canWin(array, index + 1, leap) || canWin(array, index - 1, leap) || canWin(array, index + leap, leap);
}
In the input below several pairs of lines are shown. The first element of each pair stands for leap and the second one for an array.
Input:
3
0 0 0 0 0
5
0 0 0 1 1 1
3
0 0 1 1 1 0
1
0 1 0
Output:
true
true
false
false
Explanation:
Let's say your current position is index.
If it's negative or the array value is larger than 0 then the game is lost. If it's the last position or index + leap reaches at least the length of the array then the game is won by definition.
Otherwise, the only possible moves from here are index - 1, index + 1 or index + leap. So you repeat the same checks for each of those indices and take the OR of the results, because finding a single path is enough. Don't forget to set the value of the cell to 1: it makes no sense to visit it a second time, and without the mark the recursion would repeat the same moves over and over until it crashes.
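For comparison, here is a minimal Python sketch of the same search with an explicit stack instead of recursion, so that long arrays cannot exhaust the call stack. The name can_win, the seen list (used instead of overwriting the input) and the assumption that the array is non-empty are mine, not part of the answer above.

```python
def can_win(game, leap):
    """Depth-first search over positions; game is a non-empty list of 0s and 1s."""
    n = len(game)
    seen = [False] * n
    stack = [0]
    while stack:
        i = stack.pop()
        # Losing or useless positions: off the left edge, blocked, or already explored.
        if i < 0 or game[i] != 0 or seen[i]:
            continue
        # Winning positions: the last cell, or a leap that clears the array.
        if i >= n - 1 or i + leap >= n:
            return True
        seen[i] = True
        # Here i + 1 and i + leap are both still inside the array.
        stack.extend((i - 1, i + 1, i + leap))
    return False
```

Run on the four sample inputs above, it gives the same answers as the listed output.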
Your pseudo-code seems fine, but there are a few mistakes in your code that may be the cause of your trouble.
The least problematic first: the if(index+leap<=len-1) inside your loop is useless; you can remove it without modifying the behaviour of your algorithm. This is because you already checked it in the first line of the loop and have since entered an else branch.
The second one is about your variables index and i. Their meaning isn't clear to me after a few complete reads, and they look like the same thing. It might cause you trouble because you use the variable index inside your call to leapStep, but index is often one step behind i. It's confusing.
I did not find an example where your code fails.
Here is my solution, which HackerRank accepted. It is an iterative one, close to yours. Its principle is simple: starting from position 0, as we increase our position step by step, we keep track of the positions we have access to (in the variable memoTab; I removed the dp name as it can be frightening): if we are on a position we already reached before, then we can go to +1 or +leap.
That would be enough if it weren't allowed to backtrack and go in the reverse direction. To deal with that, whenever we reach some 1s, I keep in memory the next 0. And if I encounter a position I can reach just after, I go back to that 0 and say I can go there.
Here is the code. First, a little helper function: given a game and an index, it records in the memo whether we can go to that index, and it returns true if the game is finished (the index is past the end of the array).
public static boolean check(int[] game, boolean[] memo, int index){
if(index >= 0 && index < game.length){
if(game[index] != 1){
memo[index] = true;
}
}
return index >= game.length;
}
This is the solver function, it first reads the values, then starts looping.
public static void solveOne(){
int n = sc.nextInt();
int leap = sc.nextInt();
int[] game = new int[n];
for (int i = 0; i < n; i++) {
game[i] = sc.nextInt();
}
int index = 0;
boolean[] memoTab = new boolean[n];
for (int i = 0; i < n; i++) {
memoTab[i] = false;
}
memoTab[0] = true;
boolean rememberIndex0 = false;
boolean gotoIndex0 = false;
int index0 = 0;
boolean finished = false;
We are done with the initialization, let's loop:
while(index < game.length){
// we encounter the first 0 after some 1, keep it in memory !
if(rememberIndex0 && game[index] == 0){
index0 = index;
gotoIndex0 = true;
rememberIndex0 = false;
}
// this index is an index we reached before, we can continue from here
if(memoTab[index]){
// we previously said we need to go back to a lower position
if(gotoIndex0){
gotoIndex0 = false;
index = index0;
memoTab[index] = true;
continue;
}
// it's finished if either is true
finished = check(game, memoTab, index + 1)
|| check(game, memoTab, index + leap);
if(finished) break;
}
// if this position is a 1, then we will keep in memory the next 0
if(game[index] == 1){
rememberIndex0= true;
}
// don't forget incrementing
index += 1;
}
System.out.println(finished?"YES":"NO");
}

Determine whether or not there exist two elements in an array whose sum is exactly X?

Given an array A[] of N elements and a number x, check whether there is a pair in A[] whose sum is x.
Method 1 = Sorting, which gives O(n lg n).
Method 2 = Using a hash table, which gives O(n).
I have a doubt about method 2: what if chaining is used? Then for every element we have to search a list for its complement, which can yield O(n^2) in the worst case because of chaining.
I think it will work only when the range of the integers is given, so that we can have a hash table without chaining, which gives O(n). Am I right?
You can try the following approach ->
hash all elements in A[], like (key, value) = (A[i],true)
for all elements in A[]:
if hash(x-A[i])=true: it exists
You are right about hashtable that O(n) is not the WORST CASE guaranteed complexity.
However, with a reasonable hash function, the worst case should rarely happen.
And of course, if a small enough upper bound is given on the range of numbers, you can just use normal array to do the trick.
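A minimal Python sketch of the hashing idea, assuming the elements are integers; has_pair_with_sum is an illustrative name. It fills the set while scanning, so an element is only ever paired with one at an earlier position and never with itself. Expected time is O(n); the worst case still depends on the hash table, as discussed above.

```python
def has_pair_with_sum(values, x):
    """Return True if two elements at different positions sum to x."""
    seen = set()
    for v in values:
        if x - v in seen:  # the complement appeared at an earlier position
            return True
        seen.add(v)
    return False
```

For example, has_pair_with_sum([3], 6) is False, while has_pair_with_sum([3, 3], 6) is True, because the second 3 finds the first one in the set.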
An O(N) solution that uses a HashMap to map each element to its frequency. The frequency is maintained so that it also works when the array contains duplicate elements.
public static boolean countDiffPairsUsingHashing(int[] nums, int target) {
if (nums != null && nums.length > 0) {
HashMap<Integer, Integer> numVsFreq = new HashMap<Integer, Integer>();
for (int i = 0; i < nums.length; i++) {
numVsFreq.put(nums[i], numVsFreq.getOrDefault(nums[i], 0) + 1);
}
for (int i = 0; i < nums.length; i++) {
int diff = target - nums[i];
numVsFreq.put(nums[i], numVsFreq.get(nums[i]) - 1);
if (numVsFreq.get(diff) != null && numVsFreq.get(diff) > 0) {
return true;
}
numVsFreq.put(nums[i], numVsFreq.get(nums[i]) + 1);
}
}
return false;
}

Moving along a 1D Array

I came across this question in an online coding challenge recently, but I can't seem to make any headway.
There's a 1D array consisting of 0 and 1.
A player starts at index 0 and needs to go beyond the length of the array.
Once the length of the array is crossed the player wins.
The player can only go into indices that have a 0.
A player can move 1 step back, 1 step forward or m steps forward.
The question is how to find out if a game is winnable.
It all boils down to the following function signature:
boolean winnable(int[] arr, int m){
}
Can someone help me with an algorithm to get started.
Added Later
I came up with this algorithm, which of course doesn't pass most of the test cases.
public static boolean winnable(int[] arr, int m){
int currPos = 0;
for(int i=1; i< arr.length; i++){
if(currPos == arr.length -1 - m) return true;
else if(arr[i] == 0)currPos++;
else if(arr[i] == 1){
if(arr[currPos + m] == 0) currPos = currPos + m;
}
}
return false;
}
Iterate over the entire array. For each cell ->
If it is 1, mark it as unreachable. Else, check if it is reachable. A cell is reachable if either
A) the cell before it is reachable
B) the cell m cells before it is reachable.
Once a cell is marked as reachable, you must also mark all consecutive cells behind it which are all '0' as reachable. Once you have marked a cell less than m cells from the end as reachable, that means the end is reachable. If you've marked the last m cells as unreachable, the end is unreachable.
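The sweep described above could be sketched in Python like this. It is my reading of the description rather than tested contest code: the name winnable matches the question's signature, the reachable list is an illustrative helper, and the array is assumed to be non-empty with m >= 0.

```python
def winnable(arr, m):
    """Left-to-right reachability sweep over a non-empty list of 0s and 1s."""
    n = len(arr)
    reachable = [False] * n
    for i in range(n):
        if arr[i] == 1:
            continue  # blocked cells stay unreachable
        from_step = i > 0 and reachable[i - 1]
        from_leap = i >= m and reachable[i - m]
        if i == 0 or from_step or from_leap:
            # Mark i, then walk back over the run of zeros directly behind it.
            j = i
            while j >= 0 and arr[j] == 0 and not reachable[j]:
                reachable[j] = True
                j -= 1
    # From a reachable cell within max(m, 1) of the end, one move leaves the array.
    return any(reachable[max(0, n - max(m, 1)):])
```

The backward walk stops at the first cell that is already marked, so every cell is marked at most once and the whole sweep stays linear.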
You're going to need a queue, or some other way to remember what indexes need to be checked. Each time you reach a zero that hasn't been seen before, you need to check 3 indexes: the one before, the one after, and the one at distance m.
The size of the queue is limited to the number of zeros in the input array. For example, if the input array has 10 zeros, then the queue can't possibly have more than 10 items in it. So you could implement the queue as a simple array that's the same size as the input array.
Here's some pseudo-code that shows how to solve the problem:
writeToQueue(0)
while ( queue is not empty )
{
index = readFromQueue
if ( index >= length-m )
the game is winnable
array[index] = 1 // mark this index as visited
if ( index > 0 && array[index-1] == 0 ) // 1 step back
writeToQueue(index-1)
if ( array[index+1] == 0 ) // 1 step forward
writeToQueue(index+1)
if ( array[index+m] == 0 ) // m steps forward
writeToQueue(index+m)
}
if the queue empties without reaching the end, the game is not winnable
Note that the input array is used to keep track of which indexes have been visited, i.e. each 0 that is found is changed to a 1, until either the game is won, or no more 0's are reachable.
I just added an accepted solution to this problem on HackerRank.
This is a recursive approach. I created a helper function that would take the currentIndx, array, jumpValue and a Set of visited indices as arguments.
Since currentIndx can't be < 0, I return false;
If currentIndx > arr.length - 1, we are done.
If the value at currentIndx is not 0, we again have to return false since it cannot be in the path.
Now, after these checks we add the visited index to the set. If the add operation returns false, that index must have been visited previously; so we return false.
Then, we recurse. We call the same function with currentIndx - 1, currentIndx + 1 and currentIndx + jumpValue to see what it returns. If any of these are true, we have found a path.
[Java source code]
It can be solved cleanly using BFS. Here goes my solution:
private static boolean isReachable(int[] array, int m) {
boolean[] visited = new boolean[array.length];
Queue<Integer> queue = new LinkedList<>();
queue.add(0);
visited[0] = true;
while (!queue.isEmpty()) {
Integer current = queue.poll();
if (current+m >= array.length) {
return true;
} else if (current+m < array.length && array[current+m] == 0 && !visited[current+m]) {
queue.add(current+m);
visited[current+m] = true;
}
if (current+1 >= array.length) {
return true;
} else if (current+1 < array.length && array[current+1] == 0 && !visited[current+1]) {
queue.add(current+1);
visited[current+1] = true;
}
if (current-1 >= 0 && array[current-1] == 0 && !visited[current-1]) {
queue.add(current-1);
visited[current-1] = true;
}
}
return false;
}

2D peak finding algorithm in O(n) worst case time?

I was doing this course on algorithms from MIT. In the very first lecture the professor presents the following problem:-
A peak in a 2D array is a value such that all its 4 neighbours are less than or equal to it, i.e. for
a[i][j] to be a local maximum,
a[i+1][j] <= a[i][j]
&& a[i-1][j] <= a[i][j]
&& a[i][j+1] <= a[i][j]
&& a[i][j-1] <= a[i][j]
Now given an NxN 2D array, find a peak in the array.
This question can be easily solved in O(N^2) time by iterating over all the elements and returning a peak.
However it can be optimized to be solved in O(NlogN) time by using a divide and conquer solution as explained here.
But they have said that there exists an O(N) time algorithm that solves this problem. Please suggest how can we solve this problem in O(N) time.
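For reference, the O(N^2) baseline mentioned above might look like this in Python. It is only a sketch: find_peak_brute_force is an illustrative name, the input is assumed to be a non-empty rectangular list of lists, and neighbours that fall outside the array are simply ignored.

```python
def find_peak_brute_force(a):
    """Return (row, col) of the first cell that is >= all of its existing neighbours."""
    rows, cols = len(a), len(a[0])
    for i in range(rows):
        for j in range(cols):
            v = a[i][j]
            if i > 0 and a[i - 1][j] > v:
                continue  # neighbour above is larger
            if i + 1 < rows and a[i + 1][j] > v:
                continue  # neighbour below is larger
            if j > 0 and a[i][j - 1] > v:
                continue  # neighbour to the left is larger
            if j + 1 < cols and a[i][j + 1] > v:
                continue  # neighbour to the right is larger
            return (i, j)
    return None  # not reached for a non-empty array: the global maximum is a peak
```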
PS(For those who know python) The course staff has explained an approach here (Problem 1-5. Peak-Finding Proof) and also provided some python code in their problem sets. But the approach explained is totally non-obvious and very hard to decipher. The python code is equally confusing. So I have copied the main part of the code below for those who know python and can tell what algorithm is being used from the code.
def algorithm4(problem, bestSeen = None, rowSplit = True, trace = None):
    # if it's empty, we're done
    if problem.numRow <= 0 or problem.numCol <= 0:
        return None
    subproblems = []
    divider = []
    if rowSplit:
        # the recursive subproblem will involve half the number of rows
        mid = problem.numRow // 2
        # information about the two subproblems
        (subStartR1, subNumR1) = (0, mid)
        (subStartR2, subNumR2) = (mid + 1, problem.numRow - (mid + 1))
        (subStartC, subNumC) = (0, problem.numCol)
        subproblems.append((subStartR1, subStartC, subNumR1, subNumC))
        subproblems.append((subStartR2, subStartC, subNumR2, subNumC))
        # get a list of all locations in the dividing row
        divider = crossProduct([mid], range(problem.numCol))
    else:
        # the recursive subproblem will involve half the number of columns
        mid = problem.numCol // 2
        # information about the two subproblems
        (subStartR, subNumR) = (0, problem.numRow)
        (subStartC1, subNumC1) = (0, mid)
        (subStartC2, subNumC2) = (mid + 1, problem.numCol - (mid + 1))
        subproblems.append((subStartR, subStartC1, subNumR, subNumC1))
        subproblems.append((subStartR, subStartC2, subNumR, subNumC2))
        # get a list of all locations in the dividing column
        divider = crossProduct(range(problem.numRow), [mid])
    # find the maximum in the dividing row or column
    bestLoc = problem.getMaximum(divider, trace)
    neighbor = problem.getBetterNeighbor(bestLoc, trace)
    # update the best we've seen so far based on this new maximum
    if bestSeen is None or problem.get(neighbor) > problem.get(bestSeen):
        bestSeen = neighbor
        if not trace is None: trace.setBestSeen(bestSeen)
    # return when we know we've found a peak
    if neighbor == bestLoc and problem.get(bestLoc) >= problem.get(bestSeen):
        if not trace is None: trace.foundPeak(bestLoc)
        return bestLoc
    # figure out which subproblem contains the largest number we've seen so
    # far, and recurse, alternating between splitting on rows and splitting
    # on columns
    sub = problem.getSubproblemContaining(subproblems, bestSeen)
    newBest = sub.getLocationInSelf(problem, bestSeen)
    if not trace is None: trace.setProblemDimensions(sub)
    result = algorithm4(sub, newBest, not rowSplit, trace)
    return problem.getLocationInSelf(sub, result)
#Helper Method
def crossProduct(list1, list2):
    """
    Returns all pairs with one item from the first list and one item from
    the second list. (Cartesian product of the two lists.)
    The code is equivalent to the following list comprehension:
        return [(a, b) for a in list1 for b in list2]
    but for easier reading and analysis, we have included more explicit code.
    """
    answer = []
    for a in list1:
        for b in list2:
            answer.append ((a, b))
    return answer
1. Let's assume that the width of the array is bigger than the height, otherwise we will split in the other direction.
2. Split the array into three parts: central column, left side and right side.
3. Go through the central column and the two neighbour columns and look for the maximum.
- If it's in the central column, this is our peak.
- If it's in the left side, run this algorithm on subarray left_side + central_column.
- If it's in the right side, run this algorithm on subarray right_side + central_column.
Why this works:
For cases where the maximum element is in the central column, it is obvious. If it's not, we can step from that maximum to increasing elements and will definitely not cross the central column, so a peak will definitely exist in the corresponding half.
Why this is O(n):
step #3 takes less than or equal to max_dimension iterations and max_dimension at least halves on every two algorithm steps. This gives n+n/2+n/4+... which is O(n). Important detail: we split by the maximum direction. For square arrays this means that split directions will be alternating. This is a difference from the last attempt in the PDF you linked to.
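Written out, the sum in the argument above looks like this. It is only a sketch under the answer's own assumptions: n is the larger dimension at the start, each step costs at most c times the current larger dimension (c is a constant introduced here for illustration), that dimension at least halves every two steps, and rounding is ignored.

```latex
T(n) \le \sum_{k \ge 0} 2c \cdot \frac{n}{2^{k}} = 2cn \sum_{k \ge 0} 2^{-k} = 4cn = O(n)
```

The factor 2 accounts for the two steps spent at each size before the larger dimension halves.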
A note: I'm not sure if it exactly matches the algorithm in the code you gave, it may or may not be a different approach.
To see why it is theta(n):
[Figure: calculation steps]
To see algorithm implementation:
1) start with either 1a) or 1b)
1a) set left half, divider, right half.
1b) set top half, divider, bottom half.
2) Find global maximum on the divider. [theta n]
3) Find the values of its neighbour. And record the largest node ever visited as the bestSeen node. [theta 1]
    # update the best we've seen so far based on this new maximum
    if bestSeen is None or problem.get(neighbor) > problem.get(bestSeen):
        bestSeen = neighbor
        if not trace is None: trace.setBestSeen(bestSeen)
4) check if the global maximum is larger than the bestSeen and its neighbour.
[theta 1]
//Step 4 is the main key of why this algorithm works
    # return when we know we've found a peak
    if neighbor == bestLoc and problem.get(bestLoc) >= problem.get(bestSeen):
        if not trace is None: trace.foundPeak(bestLoc)
        return bestLoc
5) If 4) is True, return the global maximum as 2-D peak.
Else if this time did 1a), choose the half of BestSeen, go back to step 1b)
Else, choose the half of BestSeen, go back to step 1a)
To see visually why this algorithm works, it is like grabbing the greatest value side, keep reducing the boundaries and eventually get the BestSeen value.
[Figures: visualised simulation, round 1 through round 6, then the final state]
For this 10*10 matrix, we used only 6 steps to find the 2-D peak, which is quite convincing evidence that it is indeed theta n.
By Falcon
Here is working Java code that implements @maxim1000's algorithm. The following code finds a peak in the 2D array in linear time.
import java.util.*;
class Ideone{
public static void main (String[] args) throws java.lang.Exception{
new Ideone().run();
}
int N , M ;
void run(){
N = 1000;
M = 100;
// arr is a random NxM array
int[][] arr = randomArray();
long start = System.currentTimeMillis();
// for(int i=0; i<N; i++){ // TO print the array.
// System. out.println(Arrays.toString(arr[i]));
// }
System.out.println(findPeakLinearTime(arr));
long end = System.currentTimeMillis();
System.out.println("time taken : " + (end-start));
}
int findPeakLinearTime(int[][] arr){
int rows = arr.length;
int cols = arr[0].length;
return kthLinearColumn(arr, 0, cols-1, 0, rows-1);
}
// helper function that splits on the middle Column
int kthLinearColumn(int[][] arr, int loCol, int hiCol, int loRow, int hiRow){
if(loCol==hiCol){
int max = arr[loRow][loCol];
int foundRow = loRow;
for(int row = loRow; row<=hiRow; row++){
if(max < arr[row][loCol]){
max = arr[row][loCol];
foundRow = row;
}
}
if(!correctPeak(arr, foundRow, loCol)){
System.out.println("THIS PEAK IS WRONG");
}
return max;
}
int midCol = (loCol+hiCol)/2;
int max = arr[loRow][midCol]; // start from the central column, which is the one scanned below
for(int row=loRow; row<=hiRow; row++){
max = Math.max(max, arr[row][midCol]);
}
boolean centralMax = true;
boolean rightMax = false;
boolean leftMax = false;
if(midCol-1 >= 0){
for(int row = loRow; row<=hiRow; row++){
if(arr[row][midCol-1] > max){
max = arr[row][midCol-1];
centralMax = false;
leftMax = true;
}
}
}
if(midCol+1 < M){
for(int row=loRow; row<=hiRow; row++){
if(arr[row][midCol+1] > max){
max = arr[row][midCol+1];
centralMax = false;
leftMax = false;
rightMax = true;
}
}
}
if(centralMax) return max;
if(rightMax) return kthLinearRow(arr, midCol+1, hiCol, loRow, hiRow);
if(leftMax) return kthLinearRow(arr, loCol, midCol-1, loRow, hiRow);
throw new RuntimeException("INCORRECT CODE");
}
// helper function that splits on the middle
int kthLinearRow(int[][] arr, int loCol, int hiCol, int loRow, int hiRow){
if(loRow==hiRow){
int ans = arr[loRow][loCol]; // row index first, then column
int foundCol = loCol;
for(int col=loCol; col<=hiCol; col++){
if(arr[loRow][col] > ans){
ans = arr[loRow][col];
foundCol = col;
}
}
if(!correctPeak(arr, loRow, foundCol)){
System.out.println("THIS PEAK IS WRONG");
}
return ans;
}
boolean centralMax = true;
boolean upperMax = false;
boolean lowerMax = false;
int midRow = (loRow+hiRow)/2;
int max = arr[midRow][loCol];
for(int col=loCol; col<=hiCol; col++){
max = Math.max(max, arr[midRow][col]);
}
if(midRow-1>=0){
for(int col=loCol; col<=hiCol; col++){
if(arr[midRow-1][col] > max){
max = arr[midRow-1][col];
upperMax = true;
centralMax = false;
}
}
}
if(midRow+1<N){
for(int col=loCol; col<=hiCol; col++){
if(arr[midRow+1][col] > max){
max = arr[midRow+1][col];
lowerMax = true;
centralMax = false;
upperMax = false;
}
}
}
if(centralMax) return max;
if(lowerMax) return kthLinearColumn(arr, loCol, hiCol, midRow+1, hiRow);
if(upperMax) return kthLinearColumn(arr, loCol, hiCol, loRow, midRow-1);
throw new RuntimeException("Incorrect code");
}
int[][] randomArray(){
int[][] arr = new int[N][M];
for(int i=0; i<N; i++)
for(int j=0; j<M; j++)
arr[i][j] = (int)(Math.random()*1000000000);
return arr;
}
boolean correctPeak(int[][] arr, int row, int col){//Function that checks if arr[row][col] is a peak or not
if(row-1>=0 && arr[row-1][col]>arr[row][col]) return false;
if(row+1<N && arr[row+1][col]>arr[row][col]) return false;
if(col-1>=0 && arr[row][col-1]>arr[row][col]) return false;
if(col+1<M && arr[row][col+1]>arr[row][col]) return false;
return true;
}
}

Binary Search with an unknown number of items

Assuming you don't know the number of elements you are searching and given an API that accepts an index and will return null if you are outside the bounds (as implemented here with the getWordFromDictionary method), how can you perform a binary search and implement the isWordInDictionary() method for client programs?
This solution works, but I ended up doing a serial search above the level where I found an initial high-index value. The search through the lower range of values was inspired by this answer. I also peeked at BinarySearch in Reflector (C# decompiler), but that has a known list length, so still looking to fill in the gaps.
private static string[] dictionary;
static void Main(string[] args)
{
dictionary = System.IO.File.ReadAllLines(@"C:\tmp\dictionary.txt");
Console.WriteLine(isWordInDictionary("aardvark", 0));
Console.WriteLine(isWordInDictionary("bee", 0));
Console.WriteLine(isWordInDictionary("zebra", 0));
Console.WriteLine(isWordInDictionaryBinary("aardvark"));
Console.WriteLine(isWordInDictionaryBinary("bee"));
Console.WriteLine(isWordInDictionaryBinary("zebra"));
Console.ReadLine();
}
static bool isWordInDictionaryBinary(string word)
{
// assume the size of the dictionary is unknown
// quick check for empty dictionary
string w = getWordFromDictionary(0);
if (w == null)
return false;
// assume that the length is very big.
int low = 0;
int hi = int.MaxValue;
while (low <= hi)
{
int mid = (low + ((hi - low) >> 1));
w = getWordFromDictionary(mid);
// If the middle element m you select at each step is outside
// the array bounds (you need a way to tell this), then limit
// the search to those elements with indexes small than m.
if (w == null)
{
hi = mid;
continue;
}
int compare = String.Compare(w, word);
if (compare == 0)
return true;
if (compare < 0)
low = mid + 1;
else
hi = mid - 1;
}
// punting on the search above the current value of hi
// to the (still unknown) upper limit
return isWordInDictionary(word, hi);
}
// serial search, works good for small number of items
static bool isWordInDictionary(string word, int startIndex)
{
// assume the size of the dictionary is unknown
int i = startIndex;
while (getWordFromDictionary(i) != null)
{
if (getWordFromDictionary(i).Equals(word, StringComparison.OrdinalIgnoreCase))
return true;
i++;
}
return false;
}
private static string getWordFromDictionary(int index)
{
try
{
return dictionary[index];
}
catch (IndexOutOfRangeException)
{
return null;
}
}
Final Code after answers
static bool isWordInDictionaryBinary(string word)
{
// assume the size of the dictionary is unknown
// quick check for empty dictionary
string w = getWordFromDictionary(0);
if (w == null)
return false;
// assume that the number of elements is very big
int low = 0;
int hi = int.MaxValue;
while (low <= hi)
{
int mid = (low + ((hi - low) >> 1));
w = getWordFromDictionary(mid);
// treat null the same as finding a string that comes
// after the string you are looking for
if (w == null)
{
hi = mid - 1;
continue;
}
int compare = String.Compare(w, word);
if (compare == 0)
return true;
if (compare < 0)
low = mid + 1;
else
hi = mid - 1;
}
return false;
}
You can implement a binary search in two phases. In the first phase, you grow the size of the interval you're searching in. Once you detect you're outside the bounds, you can do a normal binary search in the latest interval you found. Something like this:
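bool isPresentPhase1(string word)
{
    int l = 0, d = 1;
    while( true ) // you should eventually reach an index out of bounds
    {
        string w = getWord(l + d);
        if( w == null )
            return isPresentPhase2(word, l, l + d - 1);
        int c = String.Compare(w, word);
        if( c == 0 )
            return true;
        else if( c > 0 ) // w sorts after word: word can only be in [l, l + d - 1]
            return isPresentPhase2(word, l, l + d - 1);
        else
        {
            // w sorts before word: move the window past l + d and double its size
            l = l + d + 1;
            d *= 2;
        }
    }
}
bool isPresentPhase2(string word, int lo, int hi)
{
    // normal binary search in the interval [lo, hi];
    // an out-of-bounds (null) entry counts as larger than any word
    while( lo <= hi )
    {
        int mid = lo + ((hi - lo) >> 1);
        string w = getWord(mid);
        int c = (w == null) ? 1 : String.Compare(w, word);
        if( c == 0 )
            return true;
        if( c < 0 )
            lo = mid + 1;
        else
            hi = mid - 1;
    }
    return false;
}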
Sure you can. Start at index one, and double your query index until you hit something that's lexicographically larger than your query word (Edit: or null). Then you can narrow down your search space again until you find the index, or return false.
Edit: Note that this does NOT add to your asymptotic runtime, and it is still O(logN), where N is the number of items in the series.
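The doubling idea can be sketched in Python as below. This is a sketch under stated assumptions, not the question's API: make_lookup and get_word are hypothetical stand-ins for getWordFromDictionary (returning None out of bounds), and the words are assumed to be sorted by Python's default string ordering.

```python
def make_lookup(words):
    """Hypothetical stand-in for the bounded API: returns None when out of range."""
    def get_word(index):
        return words[index] if 0 <= index < len(words) else None
    return get_word


def is_word_in_dictionary(word, get_word):
    if get_word(0) is None:
        return False  # empty dictionary
    # Phase 1: double hi until it is out of bounds or no longer sorts before word.
    hi = 1
    while True:
        w = get_word(hi)
        if w is None or w >= word:
            break
        hi *= 2
    lo = hi // 2
    # Phase 2: ordinary binary search in [lo, hi], treating None as "too large".
    while lo <= hi:
        mid = (lo + hi) // 2
        w = get_word(mid)
        if w is None or w > word:
            hi = mid - 1
        elif w < word:
            lo = mid + 1
        else:
            return True
    return False
```

Both phases make O(log N) calls to get_word, which matches the note above about the asymptotic runtime.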
So, I'm not sure I entirely understand the problem from your description, but I'm assuming you're trying to search through a sorted array of unknown length to find a particular string. I'm also assuming that there are no nulls in the actual array; the array only returns null if you ask for an index that's out of bounds.
If those things are true, the solution should be just a standard binary search, albeit one where you search over the entire integer space, and you just treat null the same as finding a string that comes after the string you are looking for. Essentially just imagine that your sorted array of N strings is really a sorted array of INT_MAX strings sorted with nulls at the end.
What I don't quite understand is that you seem to basically have done that already (at least from a cursory look at the code), so I think I might not understand your problem completely.

Resources