why is this else block executed twice? - c

I have the following code which correctly calculates the jaccard similarity between an input char array and an existing array of char arrays. jacc_sim_rec[] is used to record the similarities which satisfy a minimum threshold value. The for loop is used to iterate through the multidimensional array and the loop is supposed to continue checking similarity if minimum threshold is not satisfied at if (jacc_sim < SIM_THRESHOLD); else record the result at
else
{
jacc_sim_rec[j] = jacc_sim;//keep record of similarity
++j;//record number of highly similar elements
}
my problem is, the whole statements in the else block is executed twice every time the threshold value is satisfied.
int j=0;
void calc_jac_sim( char*INCOMING, int grp)
{
unsigned long i, j11 = 0, j01 = 0, j10 = 0,m=0;
char *m11, *m01, *m10;
float jacc_sim = 0.0;
char r1[SBF_LEN] = { NULL };
char r2[SBF_LEN] = { NULL };
char r3[SBF_LEN] = { NULL };
int cnt = SBF_LEN - 1;
clear_jacc_sim_info();
for (int i = 0; i <= SBF_REC[grp]; ++i)
{
while (cnt >= 0)
{
r1[cnt] = SBF[grp][i][cnt] & INCOMING[cnt];
r2[cnt] = ~SBF[grp][i][cnt] & INCOMING[cnt];
r3[cnt] = SBF[grp][i][cnt] & ~INCOMING[cnt];
cnt--;
}
m11 = ( char*)r1;
m01 = ( char*)r2;
m10 = ( char*)r3;
for (m = SBF_LEN * sizeof( char); m--;
j11 += NumberOfSetBits(*m11++),
j01 += NumberOfSetBits(*m01++),
j10 += NumberOfSetBits(*m10++));
jacc_sim = j11 / (float)(j11 + j01 + j10);
if (jacc_sim < SIM_THRESHOLD);
//continue;//do nothing
else
{
jacc_sim_rec[j] = jacc_sim;//keep record of similarity
++j;//record number of highly similar elements
}
}
}

I don't understand the code, but I'll bet the problem is that you're not reinitializing cnt each time through the for loop, so you only fill in r1, r2, and r3 when i = 0.
Change that loop to:
for (int cnt = SBF_LEN - 1; cnt >= 0; cnt--)
{
r1[cnt] = SBF[grp][i][cnt] & INCOMING[cnt];
r2[cnt] = ~SBF[grp][i][cnt] & INCOMING[cnt];
r3[cnt] = SBF[grp][i][cnt] & ~INCOMING[cnt];
}
I'm also not sure why this needs to count down instead of up, like a typical loop, but it shouldn't make a difference.

Related

How to find the minimum number of coins needed for a given target amount(different from existing ones)

This is a classic question, where a list of coin amounts are given in coins[], len = length of coins[] array, and we try to find minimum amount of coins needed to get the target.
The coins array is sorted in ascending order
NOTE: I am trying to optimize the efficiency. Obviously I can run a for loop through the coins array and add the target%coins[i] together, but this will be erroneous when I have for example coins[] = {1,3,4} and target = 6, the for loop method would give 3, which is 1,1,4, but the optimal solution is 2, which is 3,3.
I haven't learned matrices and multi-dimensional array yet, are there ways to do this problem without them? I wrote a function, but it seems to be running in an infinity loop.
int find_min(const int coins[], int len, int target) {
int i;
int min = target;
int curr;
for (i = 0; i < len; i++) {
if (target == 0) {
return 0;
}
if (coins[i] <= target) {
curr = 1 + find_min(coins, len, target - coins[i]);
if (curr < min) {
min = curr;
}
}
}
return min;
}
I can suggest you this reading,
https://www.geeksforgeeks.org/generate-a-combination-of-minimum-coins-that-results-to-a-given-value/
the only thing is that there is no C version of the code, but if really need it you can do the porting by yourself.
Since no one gives a good answer, and that I figured it out myself. I might as well post an answer.
I add an array called lp, which is initialized in main,
int lp[4096];
int i;
for (i = 0; i <= COINS_MAX_TARGET; i++) {
lp[i] = -1;
}
every index of lp is equal to -1.
int find_min(int tar, const int coins[], int len, int lp[])
{
// Base case
if (tar == 0) {
lp[0] = 0;
return 0;
}
if (lp[tar] != -1) {
return lp[tar];
}
// Initialize result
int result = COINS_MAX_TARGET;
// Try every coin that is smaller than tar
for (int i = 0; i < len; i++) {
if (coins[i] <= tar) {
int x = find_min(tar - coins[i], coins, len, lp);
if (x != COINS_MAX_TARGET)
result = ((result > (1 + x)) ? (1+x) : result);
}
}
lp[tar] = result;
return result;
}

The longest sub-array with switching elements

An array is called "switching" if the odd and even elements are equal.
Example:
[2,4,2,4] is a switching array because the members in even positions (indexes 0 and 2) and odd positions (indexes 1 and 3) are equal.
If A = [3,7,3,7, 2, 1, 2], the switching sub-arrays are:
== > [3,7,3,7] and [2,1,2]
Therefore, the longest switching sub-array is [3,7,3,7] with length = 4.
As another example if A = [1,5,6,0,1,0], the the only switching sub-array is [0,1,0].
Another example: A= [7,-5,-5,-5,7,-1,7], the switching sub-arrays are [7,-1,7] and [-5,-5,-5].
Question:
Write a function that receives an array and find its longest switching sub-array.
I would like to know how you solve this problem and which strategies you use to solve this with a good time complexity?
I am assuming that the array is zero indexed .
if arr.size <= 2
return arr.size
else
ans = 2
temp_ans = 2 // We will update ans when temp_ans > ans;
for i = 2; i < arr.size ; ++i
if arr[i] = arr[i-2]
temp_ans = temp_ans + 1;
else
temp_ans = 2;
ans = max(temp_ans , ans);
return ans;
I think this should work and I don't think it needs any kind of explanation .
Example Code
private static int solve(int[] arr){
if(arr.length == 1) return 1;
int even = arr[0],odd = arr[1];
int start = 0,max_len = 0;
for(int i=2;i<arr.length;++i){
if(i%2 == 0 && arr[i] != even || i%2 == 1 && arr[i] != odd){
max_len = Math.max(max_len,i - start);
start = i-1;
if(i%2 == 0){
even = arr[i];
odd = arr[i-1];
}else{
even = arr[i-1];
odd = arr[i];
}
}
}
return Math.max(max_len,arr.length - start);
}
It's like a sliding window problem.
We keep track of even and odd equality with 2 variables, even and odd.
Whenever we come across a unmet condition, like index even but not equal with even variable and same goes for odd, we first
Record the length till now in max_len.
Reset start to i-1 as this is need incase of all elements equal.
Reset even and odd according to current index i to arr[i] and arr[i-1] respectively.
Demo: https://ideone.com/iUQti7
I didn't analyse the time complexity, just wrote a solution that uses recursion and it works (I think):
public class Main
{
public static int switching(int[] arr, int index, int end)
{
try
{
if (arr[index] == arr[index+2])
{
end = index+2;
return switching(arr, index+1, end);
}
} catch (Exception e) {}
return end;
}
public static void main(String[] args)
{
//int[] arr = {3,2,3,2,3};
//int[] arr = {3,2,3};
//int[] arr = {4,4,4};
int[] arr = {1,2,3,4,5,4,4,7,9,8,10};
int best = -1;
for (int i = 0; i < arr.length; i++)
best = Math.max(best, (switching(arr, i, 0) - i));
System.out.println(best+1); // It returns, in this example, 3
}
}
int switchingSubarray(vector<int> &arr, int n) {
if(n==1||n==2) return n;
int i=0;
int ans=2;
int j=2;
while(j<n)
{
if(arr[j]==arr[j-2]) j++;
else
{
ans=max(ans,j-i);
i=j-1;
j++;
}
}
ans=max(ans,j-i);
return ans;
}
Just using sliding window technique to solve this problems as element at j and j-2 need to be same.
Try to dry run on paper u will surely get it .
# Switching if numbers in even positions equal to odd positions find length of longest switch in continuos sub array
def check(t):
even = []
odd = []
i = 0
while i < len(t):
if i % 2 == 0:
even.append(t[i])
else:
odd.append(t[i])
i += 1
if len(set(even)) == 1 and len(set(odd)) == 1:
return True
else:
return False
def solution(A):
maxval = 0
if len(A) == 1:
return 1
for i in range(0, len(A)):
for j in range(0, len(A)):
if check(A[i:j+1]) == True:
val = len(A[i:j+1])
print(A[i:j+1])
if val > maxval:
maxval = val
return maxval
A = [3,2,3,2,3]
A = [7,4,-2,4,-2,-9]
A=[4]
A = [7,-5,-5,-5,7,-1,7]
print(solution(A))

Concatenating arrays in place

I ran into an issue while implementing a circular buffer that must occasionally be aligned.
Say I have two arrays, leftArr and rightArr. I want to move the right array to byteArr and the left array to byteArr + the length of the right array. Both leftArr and rightArr are greater than byteArr, and rightArr is greater than leftArr. (this is not quite the same as rotating a circular buffer because the left array does not need to start at byteArr) Although the left and right arrays do not overlap, the combined array stored at byteArr may overlap with the current arrays, stored at leftArr and rightArr. All memory from byteArr to rightArr + rightArrLen can be safely written to. One possible implementation is:
void align(char* byteArr, char* leftArr, int leftArrLen, char* rightArr, int rightArrLen) {
char *t = malloc(rightArrLen + leftArrLen);
// form concatenated data
memcpy(t, right, rightArrLen);
memcpy(t + rightArrLen, left, leftArrLen);
// now replace
memcpy(byteArr, t, rightArrLen + leftArrLen);
free(t);
}
However, I must accomplish this with constant memory complexity.
What I have so far looks like this:
void align(char* byteArr, char* leftArr, int leftArrLen, char* rightArr, int rightArrLen)
{
// first I check to see if some combination of memmove and memcpy will suffice, if not:
unsigned int lStart = leftArr - byteArr;
unsigned int lEnd = lStart + leftArrLen;
unsigned int rStart = rightArr - byteArr;
unsigned int rEnd = rStart + rightArrLen;
unsigned int lShift = rEnd - rStart - lStart;
unsigned int rShift = -rStart;
char temp1;
char temp2;
unsigned int nextIndex;
bool alreadyMoved;
// move the right array
for( unsigned int i = 0; i < rEnd - rStart; i++ )
{
alreadyMoved = false;
for( unsigned int j = i; j < rEnd - rStart; j-= rShift )
{
if( lStart <= j + rStart - lShift
&& j + rStart - lShift < lEnd
&& lStart <= (j + rStart) % lShift
&& (j + rStart) % lShift < lEnd
&& (j + rStart) % lShift < i )
{
alreadyMoved = true;
}
}
if(alreadyMoved)
{
// byte has already been moved
continue;
}
nextIndex = i - rShift;
temp1 = byteArr[nextIndex];
while( rStart <= nextIndex && nextIndex < rEnd )
{
nextIndex += rShift;
temp2 = byteArr[nextIndex];
byteArr[nextIndex] = temp1;
temp1 = temp2;
while( lStart <= nextIndex && nextIndex < lEnd )
{
nextIndex += lShift;
temp2 = byteArr[nextIndex];
byteArr[nextIndex] = temp1;
temp1 = temp2;
}
if( nextIndex <= i - rShift )
{
// byte has already been moved
break;
}
}
}
// move the left array
for( unsigned int i = lStart; i < lShift && i < lEnd; i++ )
{
if( i >= rEnd - rStart )
{
nextIndex = i + lShift;
temp1 = byteArr[nextIndex];
byteArr[nextIndex] = byteArr[i];
while( nextIndex < lEnd )
{
nextIndex += lShift;
temp2 = byteArr[nextIndex];
byteArr[nextIndex] = temp1;
temp1 = temp2;
}
}
}
}
This code works in the case lStart = 0, lLength = 11, rStart = 26, rLength = 70 but fails in the case lStart = 0, lLength = 46, rStart = 47, rLength = 53. The solution that I can see is to add logic to determine when a byte from the right array has already been moved. While this would be possible for me to do, I was wondering if there's a simpler solution to this problem that runs with constant memory complexity and without extra reads and writes?
Here's a program to test an implementation:
bool testAlign(int lStart, int lLength, int rStart, int rLength)
{
char* byteArr = (char*) malloc(100 * sizeof(char));
char* leftArr = byteArr + lStart;
char* rightArr = byteArr + rStart;
for(int i = 0; i < rLength; i++)
{
rightArr[i] = i;
}
for(int i = 0; i < lLength; i++)
{
leftArr[i] = i + rLength;
}
align(byteArr, leftArr, lLength, rightArr, rLength);
for(int i = 0; i < lLength + rLength; i++)
{
if(byteArr[i] != i) return false;
}
return true;
}
Imagine dividing byteArr into regions (not necessarily to scale):
X1 Left X2 Right
|---|--------|---|------|
The X1 and X2 are gaps in byteArr before the start of the left array and between the two arrays. In the general case, any or all of those four regions may have zero length.
You can then proceed like this:
Start by partially or wholly filling in the leading space in byteArr
If Left has zero length then move Right to the front (if necessary) via memmove(). Done.
Else if X1 is the same length as the Right array or larger then move the right array into that space via memcpy() and, possibly, move up the left array to abut it via memmove(). Done.
Else, move the lead portion of the Right array into that space, producing the below layout. If X1 had zero length then R1 also has zero length, X2' == X2, and R2 == Right.
R1 Left X2' R2
|---|--------|------|---|
There are now two alternatives
If R2 is the same length as Left or longer, then swap Left with the initial portion of R2 to produce (still not to scale):
R1' X2'' Left R2'
|------|-----|-------|--|
Otherwise, swap the initial portion of Left with all of R2 to produce (still not to scale):
R1' L2 X2'' L1
|------|---|-------|----|
Now recognize that in either case, you have a strictly smaller problem of the same form as the original, where the new byteArr is the tail of the original starting immediately after region R1'. In the first case the new leftArr is the (final) Left region and the new rightArr is region R2'. In the other case, the new leftArr is region L2, and the new rightArr is region L1. Reset parameters to reflect this new problem, and loop back to step (1).
Note that I say to loop back to step 1. You could, of course, implement this algorithm (tail-)recursively, but then to achieve constant space usage you would need to rely on your compiler to optimize out the tail recursion, which otherwise consumes auxiliary space proportional to the length ratio of the larger of the two sub-arrays to the smaller.

Connect-N Board Game, crashing when Width is >> Height

I'm in the process of coding a Connect-N board game, and I'm almost finished and have gone through troubleshooting. My problem is now after changing some stuff my game crashes when the computer plays its move if the Width is too much greater than the height. There are two functions involved here, so I will paste them both.
Board
*AllocateBoard(int columns, int rows)
{
int **array= malloc(sizeof(int *) *columns);
int r = 0;
for ( r = 0; r < columns; ++r)
{
array[r] = malloc(sizeof(int) * rows);
}
int j = columns - 1;
int k = rows - 1;
int m = 0;
int n = 0;
for ( m = 0; m < j; ++m)
{
for ( n = 0; n < k; ++n)
{
array[m][n] = 0;
}
}
Board *board = malloc(sizeof(Board));
board->columns = columns;
board->rows = rows;
board->spaces = array;
return board;
}
This first function allocates the board to be a matrix Width * Height that the user passes in via the command line. It then initializes every space on the board to be zero, and then stores the columns, rows, and spaces into a Board structure that I've created. It then returns the board.
int
computerMakeMove(Board *board)
{ int RandIndex = 0;
int **spaces = board->spaces;
int columns = board->columns;
int *arrayoflegalmoves = malloc(sizeof(int) * (columns));
int columncheck = 0;
int legalmoveindex = 0;
while (columncheck <= columns - 1)
{
if (spaces[columncheck][0] == 0)
{
arrayoflegalmoves[legalmoveindex] = columncheck;
++legalmoveindex;
++columncheck;
}
else
{
++columncheck;
}
arrayoflegalmoves = realloc(arrayoflegalmoves, (legalmoveindex) * sizeof(int));
}
if (legalmoveindex == 1)
{
return arrayoflegalmoves[0];
}
else
{
RandIndex = rand() % (legalmoveindex);
return arrayoflegalmoves[RandIndex];
}
}
This second function is designed to make the computer randomly pick a column on the board. It does this by checking the value of the top row in each column. If there is a zero there, it will store this value in an array of legal moves, and then it increments the legalmoveindex. If there isn't, it skips the column and checks the next. It ends when it gets finished checking the final column. If there is only one legal move, it will play it. If there are more, it will select a random index from the array of legal moves (I run srand in the main) and then return that value. It will only ever attempt to play on a legal board, so that's not the problem. I am pretty confident the problem occurs in this function, however, as I call the functions as follows
printf("Taking the computers move.\n");
{printf("Taking computer's move.");
computermove = computerMakeMove(playerboard);
printf("Computer's move successfully taken.\n");
playerboard = MakeMove(playerboard, computermove, player);
printf("Computer's board piece successfully played.\n");
system("clear");
displayBoard(playerboard);
...;
}
and it prints
Aborted (core dumped)
immediately after it prints
"Taking computer's move."
Once again, my question is: why is my program crashing if the width is larger than the height when the computer plays?
Thanks.
Edit: I found the solution and I am stupid.
I realloc'd during the while loop.
The realloc should be the first thing outside of the while loop.
The answer for any future programmers who may have this problem:
Notice the
while (columncheck <= columns - 1)
{
if (spaces[columncheck][0] == 0)
{
arrayoflegalmoves[legalmoveindex] = columncheck;
++legalmoveindex;
++columncheck;
}
else
{
++columncheck;
}
arrayoflegalmoves = realloc(arrayoflegalmoves, (legalmoveindex) * sizeof(int));
}
has a realloc inside of it. The realloc should be moved to immediately outside of it, like so
while (columncheck <= columns - 1)
{
if (spaces[columncheck][0] == 0)
{
arrayoflegalmoves[legalmoveindex] = columncheck;
++legalmoveindex;
++columncheck;
}
else
{
++columncheck;
}
}
arrayoflegalmoves = realloc(arrayoflegalmoves, (legalmoveindex) * sizeof(int));
it is unusual to have the columns be the first index in an array.
having the first index of an array be columns leads to confusion
// suggest using camel case for all variable names, for readability
Board *AllocateBoard(int columns, int rows)
{
int **array= malloc(sizeof(int *) *columns); // add check that malloc successful
int r = 0;
for ( r = 0; r < columns; ++r)
{
array[r] = malloc(sizeof(int) * rows); // <-- add: check that malloc successful
}
int j = columns - 1; // this results in last column not initialized
int k = rows - 1; // this results in last row of each column not initialized
int m = 0; // column loop counter
int n = 0; // row loop counter
for ( m = 0; m < j; ++m)
{
for ( n = 0; n < k; ++n)
{
array[m][n] = 0;
}
}
Board *board = malloc(sizeof(Board)); // <-- add: check if malloc successful
board->columns = columns;
board->rows = rows;
board->spaces = array;
return board;
} // end function: AllocateBoard
// why is this only looking at the first row of each column?
int computerMakeMove(Board *board)
{
int RandIndex = 0;
int **spaces = board->spaces;
int columns = board->columns;
int *arrayoflegalmoves = malloc(sizeof(int) * (columns)); // <-- add check that malloc successful
int columncheck = 0;
int legalmoveindex = 0;
while (columncheck <= columns - 1)// should be: for(; columncheck < columns; columncheck++ )
{
if (spaces[columncheck][0] == 0)
{ // then first row of column is zero
arrayoflegalmoves[legalmoveindex] = columncheck;
++legalmoveindex;
++columncheck; // <-- remove this line
}
else // remove this 'else' code block
{
++columncheck;
} // end if
arrayoflegalmoves = realloc(arrayoflegalmoves, (legalmoveindex) * sizeof(int));
// <-- 1) use temp int*, in case realloc fails
// <-- 2) if realloc successful, update arrayoflegalmoves
// <-- 3) the code is not checking each row of each column,
// so the original malloc is more than plenty
// so why bother to realloc
// <-- 4) if legalmoveindex is 0 then realloc returns NULL
} // end while
// in following, what about when zero moves found? probably should return NULL
if (legalmoveindex == 1)
{ // only one column[row0] found to contain 0
return arrayoflegalmoves[0];
}
else
{
RandIndex = rand() % (legalmoveindex);
return arrayoflegalmoves[RandIndex]; // if zero moves found, this returns a
// de-reference to address 0
// which would result in a seg fault event
} // end if
} // end function: computerMakeMove

Find length of smallest window that contains all the characters of a string in another string

Recently i have been interviewed. I didn't do well cause i got stuck at the following question
suppose a sequence is given : A D C B D A B C D A C D
and search sequence is like: A C D
task was to find the start and end index in given string that contains all the characters of search string preserving the order.
Output: assuming index start from 1:
start index 10
end index 12
explanation :
1.start/end index are not 1/3 respectively because though they contain the string but order was not maintained
2.start/end index are not 1/5 respectively because though they contain the string in the order but the length is not optimum
3.start/end index are not 6/9 respectively because though they contain the string in the order but the length is not optimum
Please go through How to find smallest substring which contains all characters from a given string?.
But the above question is different since the order is not maintained. I'm still struggling to maintain the indexes. Any help would be appreciated . thanks
I tried to write some simple c code to solve the problem:
Update:
I wrote a search function that looks for the required characters in correct order, returning the length of the window and storing the window start point to ìnt * startAt. The function processes a sub-sequence of given hay from specified startpoint int start to it's end
The rest of the algorithm is located in main where all possible subsequences are tested with a small optimisation: we start looking for the next window right after the startpoint of the previous one, so we skip some unnecessary turns. During the process we keep track f the 'till-now best solution
Complexity is O(n*n/2)
Update2:
unnecessary dependencies have been removed, unnecessary subsequent calls to strlen(...) have been replaced by size parameters passed to search(...)
#include <stdio.h>
// search for single occurrence
int search(const char hay[], int haySize, const char needle[], int needleSize, int start, int * startAt)
{
int i, charFound = 0;
// search from start to end
for (i = start; i < haySize; i++)
{
// found a character ?
if (hay[i] == needle[charFound])
{
// is it the first one?
if (charFound == 0)
*startAt = i; // store starting position
charFound++; // and go to next one
}
// are we done?
if (charFound == needleSize)
return i - *startAt + 1; // success
}
return -1; // failure
}
int main(int argc, char **argv)
{
char hay[] = "ADCBDABCDACD";
char needle[] = "ACD";
int resultStartAt, resultLength = -1, i, haySize = sizeof(hay) - 1, needleSize = sizeof(needle) - 1;
// search all possible occurrences
for (i = 0; i < haySize - needleSize; i++)
{
int startAt, length;
length = search(hay, haySize, needle, needleSize, i, &startAt);
// found something?
if (length != -1)
{
// check if it's the first result, or a one better than before
if ((resultLength == -1) || (resultLength > length))
{
resultLength = length;
resultStartAt = startAt;
}
// skip unnecessary steps in the next turn
i = startAt;
}
}
printf("start at: %d, length: %d\n", resultStartAt, resultLength);
return 0;
}
Start from the beginning of the string.
If you encounter an A, then mark the position and push it on a stack. After that, keep checking the characters sequentially until
1. If you encounter an A, update the A's position to current value.
2. If you encounter a C, push it onto the stack.
After you encounter a C, again keep checking the characters sequentially until,
1. If you encounter a D, erase the stack containing A and C and mark the score from A to D for this sub-sequence.
2. If you encounter an A, then start another Stack and mark this position as well.
2a. If now you encounter a C, then erase the earlier stacks and keep the most recent stack.
2b. If you encounter a D, then erase the older stack and mark the score and check if it is less than the current best score.
Keep doing this till you reach the end of the string.
The pseudo code can be something like:
Initialize stack = empty;
Initialize bestLength = mainString.size() + 1; // a large value for the subsequence.
Initialize currentLength = 0;
for ( int i = 0; i < mainString.size(); i++ ) {
if ( stack is empty ) {
if ( mainString[i] == 'A' ) {
start a new stack and push A on it.
mark the startPosition for this stack as i.
}
continue;
}
For each of the stacks ( there can be at most two stacks prevailing,
one of size 1 and other of size 0 ) {
if ( stack size == 1 ) // only A in it {
if ( mainString[i] == 'A' ) {
update the startPosition for this stack as i.
}
if ( mainString[i] == 'C' ) {
push C on to this stack.
}
} else if ( stack size == 2 ) // A & C in it {
if ( mainString[i] == 'C' ) {
if there is a stack with size 1, then delete this stack;// the other one dominates this stack.
}
if ( mainString[i] == 'D' ) {
mark the score from startPosition till i and update bestLength accordingly.
delete this stack.
}
}
}
}
I modified my previous suggestion using a single queue, now I believe this algorithm runs with O(N*m) time:
FindSequence(char[] sequenceList)
{
queue startSeqQueue;
int i = 0, k;
int minSequenceLength = sequenceList.length + 1;
int startIdx = -1, endIdx = -1;
for (i = 0; i < sequenceList.length - 2; i++)
{
if (sequenceList[i] == 'A')
{
startSeqQueue.queue(i);
}
}
while (startSeqQueue!=null)
{
i = startSeqQueue.enqueue();
k = i + 1;
while (sequenceList.length < k && sequenceList[k] != 'C')
if (sequenceList[i] == 'A') i = startSeqQueue.enqueue();
k++;
while (sequenceList.length < k && sequenceList[k] != 'D')
k++;
if (k < sequenceList.length && k > minSequenceLength > k - i + 1)
{
startIdx = i;
endIdx = j;
minSequenceLength = k - i + 1;
}
}
return startIdx & endIdx
}
My previous (O(1) memory) suggestion:
FindSequence(char[] sequenceList)
{
int i = 0, k;
int minSequenceLength = sequenceList.length + 1;
int startIdx = -1, endIdx = -1;
for (i = 0; i < sequenceList.length - 2; i++)
if (sequenceList[i] == 'A')
k = i+1;
while (sequenceList.length < k && sequenceList[k] != 'C')
k++;
while (sequenceList.length < k && sequenceList[k] != 'D')
k++;
if (k < sequenceList.length && k > minSequenceLength > k - i + 1)
{
startIdx = i;
endIdx = j;
minSequenceLength = k - i + 1;
}
return startIdx & endIdx;
}
Here's my version. It keeps track of possible candidates for an optimum solution. For each character in the hay, it checks whether this character is in sequence of each candidate. It then selectes the shortest candidate. Quite straightforward.
class ShortestSequenceFinder
{
public class Solution
{
public int StartIndex;
public int Length;
}
private class Candidate
{
public int StartIndex;
public int SearchIndex;
}
public Solution Execute(string hay, string needle)
{
var candidates = new List<Candidate>();
var result = new Solution() { Length = hay.Length + 1 };
for (int i = 0; i < hay.Length; i++)
{
char c = hay[i];
for (int j = candidates.Count - 1; j >= 0; j--)
{
if (c == needle[candidates[j].SearchIndex])
{
if (candidates[j].SearchIndex == needle.Length - 1)
{
int candidateLength = i - candidates[j].StartIndex;
if (candidateLength < result.Length)
{
result.Length = candidateLength;
result.StartIndex = candidates[j].StartIndex;
}
candidates.RemoveAt(j);
}
else
{
candidates[j].SearchIndex += 1;
}
}
}
if (c == needle[0])
candidates.Add(new Candidate { SearchIndex = 1, StartIndex = i });
}
return result;
}
}
It runs in O(n*m).
Here is my solution in Python. It returns the indexes assuming 0-indexed sequences. Therefore, for the given example it returns (9, 11) instead of (10, 12). Obviously it's easy to mutate this to return (10, 12) if you wish.
def solution(s, ss):
S, E = [], []
for i in xrange(len(s)):
if s[i] == ss[0]:
S.append(i)
if s[i] == ss[-1]:
E.append(i)
candidates = sorted([(start, end) for start in S for end in E
if start <= end and end - start >= len(ss) - 1],
lambda x,y: (x[1] - x[0]) - (y[1] - y[0]))
for cand in candidates:
i, j = cand[0], 0
while i <= cand[-1]:
if s[i] == ss[j]:
j += 1
i += 1
if j == len(ss):
return cand
Usage:
>>> from so import solution
>>> s = 'ADCBDABCDACD'
>>> solution(s, 'ACD')
(9, 11)
>>> solution(s, 'ADC')
(0, 2)
>>> solution(s, 'DCCD')
(1, 8)
>>> solution(s, s)
(0, 11)
>>> s = 'ABC'
>>> solution(s, 'B')
(1, 1)
>>> print solution(s, 'gibberish')
None
I think the time complexity is O(p log(p)) where p is the number of pairs of indexes in the sequence that refer to search_sequence[0] and search_sequence[-1] where the index for search_sequence[0] is less than the index forsearch_sequence[-1] because it sorts these p pairings using an O(n log n) algorithm. But then again, my substring iteration at the end could totally overshadow that sorting step. I'm not really sure.
It probably has a worst-case time complexity which is bounded by O(n*m) where n is the length of the sequence and m is the length of the search sequence, but at the moment I cannot think of an example worst-case.
Here is my O(m*n) algorithm in Java:
class ShortestWindowAlgorithm {
Multimap<Character, Integer> charToNeedleIdx; // Character -> indexes in needle, from rightmost to leftmost | Multimap is a class from Guava
int[] prefixesIdx; // prefixesIdx[i] -- rightmost index in the hay window that contains the shortest found prefix of needle[0..i]
int[] prefixesLengths; // prefixesLengths[i] -- shortest window containing needle[0..i]
public int shortestWindow(String hay, String needle) {
init(needle);
for (int i = 0; i < hay.length(); i++) {
for (int needleIdx : charToNeedleIdx.get(hay.charAt(i))) {
if (firstTimeAchievedPrefix(needleIdx) || foundShorterPrefix(needleIdx, i)) {
prefixesIdx[needleIdx] = i;
prefixesLengths[needleIdx] = getPrefixNewLength(needleIdx, i);
forgetOldPrefixes(needleIdx);
}
}
}
return prefixesLengths[prefixesLengths.length - 1];
}
private void init(String needle) {
charToNeedleIdx = ArrayListMultimap.create();
prefixesIdx = new int[needle.length()];
prefixesLengths = new int[needle.length()];
for (int i = needle.length() - 1; i >= 0; i--) {
charToNeedleIdx.put(needle.charAt(i), i);
prefixesIdx[i] = -1;
prefixesLengths[i] = -1;
}
}
private boolean firstTimeAchievedPrefix(int needleIdx) {
int shortestPrefixSoFar = prefixesLengths[needleIdx];
return shortestPrefixSoFar == -1 && (needleIdx == 0 || prefixesLengths[needleIdx - 1] != -1);
}
private boolean foundShorterPrefix(int needleIdx, int hayIdx) {
int shortestPrefixSoFar = prefixesLengths[needleIdx];
int newLength = getPrefixNewLength(needleIdx, hayIdx);
return newLength <= shortestPrefixSoFar;
}
private int getPrefixNewLength(int needleIdx, int hayIdx) {
return needleIdx == 0 ? 1 : (prefixesLengths[needleIdx - 1] + (hayIdx - prefixesIdx[needleIdx - 1]));
}
private void forgetOldPrefixes(int needleIdx) {
if (needleIdx > 0) {
prefixesLengths[needleIdx - 1] = -1;
prefixesIdx[needleIdx - 1] = -1;
}
}
}
It works on every input and also can handle repeated characters etc.
Here are some examples:
public class StackOverflow {
public static void main(String[] args) {
ShortestWindowAlgorithm algorithm = new ShortestWindowAlgorithm();
System.out.println(algorithm.shortestWindow("AXCXXCAXCXAXCXCXAXAXCXCXDXDXDXAXCXDXAXAXCD", "AACD")); // 6
System.out.println(algorithm.shortestWindow("ADCBDABCDACD", "ACD")); // 3
System.out.println(algorithm.shortestWindow("ADCBDABCD", "ACD")); // 4
}
I haven't read every answer here, but I don't think anyone has noticed that this is just a restricted version of local pairwise sequence alignment, in which we are only allowed to insert characters (and not delete or substitute them). As such it will be solved by a simplification of the Smith-Waterman algorithm that considers only 2 cases per vertex (arriving at the vertex either by matching a character exactly, or by inserting a character) rather than 3 cases. This algorithm is O(n^2).
Here's my solution. It follows one of the pattern matching solutions. Please comment/correct me if I'm wrong.
Given the input string as in the question
A D C B D A B C D A C D. Let's first compute the indices where A occurs. Assuming a zero based index this should be [0,5,9].
Now the pseudo code is as follows.
Store the indices of A in a list say *orders*.// orders=[0,5,9]
globalminStart, globalminEnd=0,localMinStart=0,localMinEnd=0;
for (index: orders)
{
int i =index;
Stack chars=new Stack();// to store the characters
i=localminStart;
while(i< length of input string)
{
if(str.charAt(i)=='C') // we've already seen A, so we look for C
st.push(str.charAt(i));
i++;
continue;
else if(str.charAt(i)=='D' and st.peek()=='C')
localminEnd=i; // we have a match! so assign value of i to len
i+=1;
break;
else if(str.charAt(i)=='A' )// seen the next A
break;
}
if (globalMinEnd-globalMinStart<localMinEnd-localMinStart)
{
globalMinEnd=localMinEnd;
globalMinStart=localMinStart;
}
}
return [globalMinstart,globalMinEnd]
}
P.S: this is pseudocode and a rough idea. Id be happy to correct it and understand if there's something wrong.
AFAIC Time complexity -O(n). Space complexity O(n)

Resources