How do I calculate the row offset in a sparse matrix representation? - c

I'm writing an SpMxV (sparse matrix-vector multiplication) program in which I store the sparse matrix in CRS (compressed row storage) format and then carry out the operations. Here's a short introduction to the format.
I know how to build the val and col_idx_array arrays:
for (row_idx = 0; row_idx < row_count; row_idx++) {
for (col_idx = 0; col_idx < column_count; col_idx++) {
if (sparse_matrix[row_idx][col_idx] != 0) {
val[i] = sparse_matrix[row_idx][col_idx];
col_idx_array[i] = col_idx;
i++;
}
}
}
But I'm stuck on building the row_ptr array. How do I actually calculate its entries?

We have
row_ptr[i + 1] - row_ptr[i] = number of values in row i
so we simply store the current number of entries, i, in row_ptr each time we start a new row:
for (row_idx = 0; row_idx < row_count; row_idx++) {
row_ptr[row_idx] = i;
/* other code omitted */
}
row_ptr[row_idx] = i; /* equivalent to row_ptr[row_count] = i;
* total number of entries
*/
Note that this assumes that your arrays are 0-indexed, whereas the introduction you've posted assumes 1-indexed arrays.

Using your code, it would look something like this:
row_ptr[0] = 0; /* the first row starts at entry 0 */
for (row_idx = 0; row_idx < row_count; row_idx++) {
for (col_idx = 0; col_idx < column_count; col_idx++) {
if (sparse_matrix[row_idx][col_idx] != 0) {
val[i] = sparse_matrix[row_idx][col_idx];
col_idx_array[i] = col_idx;
i++;
}
}
row_ptr[row_idx+1]=i;
}
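Putting both answers together, here is a minimal self-contained sketch that builds all three CRS arrays and then uses row_ptr in the multiplication. It assumes 0-indexed arrays and a dense matrix stored row by row in a flat array; the function names dense_to_crs and crs_spmv are hypothetical.

```c
#include <stddef.h>

/* Build CRS arrays from a dense row-major matrix (0-indexed).
 * val and col_idx must have room for every non-zero entry,
 * row_ptr must have room for rows + 1 entries.
 * Returns the number of non-zero entries stored. */
size_t dense_to_crs(const double *dense, size_t rows, size_t cols,
                    double *val, size_t *col_idx, size_t *row_ptr)
{
    size_t nnz = 0;
    for (size_t r = 0; r < rows; r++) {
        row_ptr[r] = nnz;               /* row r starts at entry nnz */
        for (size_t c = 0; c < cols; c++) {
            double x = dense[r * cols + c];
            if (x != 0.0) {
                val[nnz] = x;
                col_idx[nnz] = c;
                nnz++;
            }
        }
    }
    row_ptr[rows] = nnz;                /* one past the last entry */
    return nnz;
}

/* y = A * x, with A given in CRS form. */
void crs_spmv(const double *val, const size_t *col_idx, const size_t *row_ptr,
              size_t rows, const double *x, double *y)
{
    for (size_t r = 0; r < rows; r++) {
        double sum = 0.0;
        /* entries of row r live in val[row_ptr[r] .. row_ptr[r + 1] - 1] */
        for (size_t k = row_ptr[r]; k < row_ptr[r + 1]; k++)
            sum += val[k] * x[col_idx[k]];
        y[r] = sum;
    }
}
```

An empty row simply gets row_ptr[r] == row_ptr[r + 1], so the inner loop of crs_spmv does nothing for it.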

Related

Conway's game of life with C. Command terminated when calling function to generate new generation and update "board"

I'm writing Conway's game of life in C. I understand that the rules of the game are:
Any live cell with 2 or 3 live neighbours survives.
Any dead cell with 3 live neighbours becomes a live cell.
All other live cells die in the next generation.
'#' symbolises a live cell and '.' symbolises a dead one
The code of my function to generate and update the universe "board" for every generation is below. It takes the universe (a 2D array for which I dynamically allocated memory in the main function) and the number of rows and columns of that array. All other functions (such as alive_neighbour) are correct, and the program runs smoothly up to the point where this function is called, after which the terminal just prints "command terminated". I can't figure out which part of the function causes the program to crash. Any help will be greatly appreciated. Thank you!
void next_generation(char **universe, long row, long column)
{
for (long i = 1; i <= (row - 1); i += 1) {
for (long j = 1; j <= (column - 1); j += 1) {
//a = no of alive neighbours surrounding universe[i][j]
long a = alive_neighbour(universe, i, j);
if (universe[i][j] == '#') {
if (a == 2 || a == 3) {
universe[i][j] = '#';
}
}
else if (universe[i][j] == '.') {
if (a == 3) {
universe[i][j] = '#';
}
}
else {
universe[i][j] = '.';
}
}
}
}
long alive_neighbour(char **universe, long row, long column)
{
long count = 0;
for (long i = -1; i <= 1; i += 1) {
for (long j = -1; j <= 1; j += 1) {
if (universe[row + i][column + j] == '#') {
count += 1;
}
}
}
if (universe[row][column] == '#') {
count -= 1;
}
return count;
}
How I declared universe in the main function:
universe = (char **)malloc((size_t) row * sizeof(char *));
for (long i = 0; i < row; i += 1) {
universe[i] = (char *)malloc((size_t) column * sizeof(char));
}
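A minimal sketch, not a definitive fix, of one way to compute a generation safely. In the function above the loops run up to row - 1 and column - 1 inclusive, so alive_neighbour reads universe[row] and universe[i][column], which lie outside the allocated memory; that is a likely cause of the crash. The board is also updated in place, so later neighbour counts see cells that have already changed. The names alive_neighbours and next_generation_buffered are hypothetical, and treating off-board cells as dead is an assumption.

```c
#include <stdlib.h>

/* Count live neighbours of (r, c), skipping positions outside the board. */
static int alive_neighbours(char **universe, long rows, long cols, long r, long c)
{
    int count = 0;
    for (long dr = -1; dr <= 1; dr++) {
        for (long dc = -1; dc <= 1; dc++) {
            long nr = r + dr, nc = c + dc;
            if (dr == 0 && dc == 0)
                continue;                       /* the cell itself */
            if (nr < 0 || nr >= rows || nc < 0 || nc >= cols)
                continue;                       /* off the board */
            if (universe[nr][nc] == '#')
                count++;
        }
    }
    return count;
}

/* Advance the board by one generation. Returns 0 on success, -1 if the
 * temporary buffer cannot be allocated (the board is left unchanged). */
int next_generation_buffered(char **universe, long rows, long cols)
{
    char *next = malloc((size_t)rows * (size_t)cols);
    if (next == NULL)
        return -1;
    for (long r = 0; r < rows; r++) {
        for (long c = 0; c < cols; c++) {
            int a = alive_neighbours(universe, rows, cols, r, c);
            int alive = (universe[r][c] == '#');
            /* birth with 3 neighbours, survival with 2 or 3 */
            next[r * cols + c] = (a == 3 || (alive && a == 2)) ? '#' : '.';
        }
    }
    /* copy back only after every cell has been evaluated */
    for (long r = 0; r < rows; r++)
        for (long c = 0; c < cols; c++)
            universe[r][c] = next[r * cols + c];
    free(next);
    return 0;
}
```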

Transform an array to another array by shifting value to adjacent element

I am given two arrays, an input array and an output array. The goal is to transform the input array into the output array by shifting a value of 1 from one element to an adjacent element in each step. E.g., the input array is [0,0,8,0,0] and the output array is [2,0,4,0,2]. Here the first step would be [0,1,7,0,0], the second step [0,1,6,1,0], and so on.
What would be an efficient algorithm for this? I was thinking of BFS, but then we would have to run BFS from each element, and this could be exponential. Can anyone suggest a solution to this problem?
I think you can do this simply by scanning in each direction tracking the cumulative value (in that direction) in the current array and the desired output array and pushing values along ahead of you as necessary:
scan from the left looking for first cell where
cumulative value > cumulative value in desired output
while that holds move 1 from that cell to the next cell to the right
scan from the right looking for first cell where
cumulative value > cumulative value in desired output
while that holds move 1 from that cell to the next cell to the left
For your example the steps would be:
FWD:
[0,0,8,0,0]
[0,0,7,1,0]
[0,0,6,2,0]
[0,0,6,1,1]
[0,0,6,0,2]
REV:
[0,1,5,0,2]
[0,2,4,0,2]
[1,1,4,0,2]
[2,0,4,0,2]
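If only the number of steps is needed, it can be read off the cumulative values directly: at the boundary between cell i and cell i+1, the absolute difference between the two cumulative sums is the number of units that must cross that boundary. The sketch below sums these differences. The function name min_adjacent_moves is hypothetical, and it assumes non-negative values as in the example.

```c
#include <stdlib.h>

/* Minimum number of single-unit moves between adjacent cells needed to turn
 * in[] into out[], or -1 if the totals differ (no sequence of moves exists). */
long long min_adjacent_moves(const int *in, const int *out, size_t n)
{
    long long carry = 0;   /* cumulative(in) - cumulative(out) up to cell i */
    long long moves = 0;
    for (size_t i = 0; i < n; i++) {
        carry += (long long)in[i] - out[i];
        moves += llabs(carry);   /* units crossing the boundary right of cell i */
    }
    return carry == 0 ? moves : -1;
}
```

For [0,0,8,0,0] and [2,0,4,0,2] the cumulative differences are -2, -2, 2, 2, 0, giving 8 moves, which matches the eight steps listed above.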
I think BFS could actually work.
Notice that n*O(n+m) = O(n^2+nm), which is polynomial, not exponential.
You could also use the Floyd-Warshall algorithm or Johnson’s algorithm, with a weight of 1 for a "flat" graph, or even connect the vertices in a new way by their actual distance and potentially save some iterations.
Hope it helps :)
void transform(int[] in, int[] out, int size)
{
int[] state = in.clone();
report(state);
while (true)
{
int minPressure = 0;
int indexOfMinPressure = 0;
int maxPressure = 0;
int indexOfMaxPressure = 0;
int pressureSum = 0;
for (int index = 0; index < size - 1; ++index)
{
int lhsDiff = state[index] - out[index];
int rhsDiff = state[index + 1] - out[index + 1];
int pressure = lhsDiff - rhsDiff;
if (pressure < minPressure)
{
minPressure = pressure;
indexOfMinPressure = index;
}
if (pressure > maxPressure)
{
maxPressure = pressure;
indexOfMaxPressure = index;
}
pressureSum += pressure;
}
if (minPressure == 0 && maxPressure == 0)
{
break;
}
boolean shiftLeft;
if (Math.abs(minPressure) > Math.abs(maxPressure))
{
shiftLeft = true;
}
else if (Math.abs(minPressure) < Math.abs(maxPressure))
{
shiftLeft = false;
}
else
{
shiftLeft = (pressureSum < 0);
}
if (shiftLeft)
{
++state[indexOfMinPressure];
--state[indexOfMinPressure + 1];
}
else
{
--state[indexOfMaxPressure];
++state[indexOfMaxPressure + 1];
}
report(state);
}
}
A simple greedy algorithm will work and does the job in the minimum number of steps. The function returns the total number of steps required for the task.
int shift(std::vector<int>& a,std::vector<int>& b){
int n = a.size();
int sum1=0,sum2=0;
for (int i = 0; i < n; ++i){
sum1+=a[i];
sum2+=b[i];
}
if (sum1!=sum2)
{
return -1;
}
int operations=0;
int j=0;
for (int i = 0; i < n;)
{
if (a[i]<b[i])
{
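if (j <= i) j = i + 1; // only take from cells to the right of i; cells to the left already match b
while(j<n and a[j]==0){
j++;
}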
if(a[j]<b[i]-a[i]){
operations+=(j-i)*a[j];
a[i]+=a[j];
a[j]=0;
}else{
operations+=(j-i)*(b[i]-a[i]);
a[j]-=(b[i]-a[i]);
a[i]=b[i];
}
}else if (a[i]>b[i])
{
a[i+1]+=(a[i]-b[i]);
operations+=(a[i]-b[i]);
a[i]=b[i];
}else{
i++;
}
}
return operations;
}
Here -1 is a special value meaning that the given array cannot be converted to the desired one.
Time Complexity: O(n).

Connect-N Board Game, crashing when Width is >> Height

I'm in the process of coding a Connect-N board game; I'm almost finished and have been troubleshooting. My problem is that, after I changed some things, the game now crashes when the computer plays its move if the width is much greater than the height. There are two functions involved here, so I will paste them both.
Board
*AllocateBoard(int columns, int rows)
{
int **array= malloc(sizeof(int *) *columns);
int r = 0;
for ( r = 0; r < columns; ++r)
{
array[r] = malloc(sizeof(int) * rows);
}
int j = columns - 1;
int k = rows - 1;
int m = 0;
int n = 0;
for ( m = 0; m < j; ++m)
{
for ( n = 0; n < k; ++n)
{
array[m][n] = 0;
}
}
Board *board = malloc(sizeof(Board));
board->columns = columns;
board->rows = rows;
board->spaces = array;
return board;
}
This first function allocates the board to be a matrix Width * Height that the user passes in via the command line. It then initializes every space on the board to be zero, and then stores the columns, rows, and spaces into a Board structure that I've created. It then returns the board.
int
computerMakeMove(Board *board)
{ int RandIndex = 0;
int **spaces = board->spaces;
int columns = board->columns;
int *arrayoflegalmoves = malloc(sizeof(int) * (columns));
int columncheck = 0;
int legalmoveindex = 0;
while (columncheck <= columns - 1)
{
if (spaces[columncheck][0] == 0)
{
arrayoflegalmoves[legalmoveindex] = columncheck;
++legalmoveindex;
++columncheck;
}
else
{
++columncheck;
}
arrayoflegalmoves = realloc(arrayoflegalmoves, (legalmoveindex) * sizeof(int));
}
if (legalmoveindex == 1)
{
return arrayoflegalmoves[0];
}
else
{
RandIndex = rand() % (legalmoveindex);
return arrayoflegalmoves[RandIndex];
}
}
This second function is designed to make the computer randomly pick a column on the board. It does this by checking the value of the top row in each column. If there is a zero there, it will store this value in an array of legal moves, and then it increments the legalmoveindex. If there isn't, it skips the column and checks the next. It ends when it gets finished checking the final column. If there is only one legal move, it will play it. If there are more, it will select a random index from the array of legal moves (I run srand in the main) and then return that value. It will only ever attempt to play on a legal board, so that's not the problem. I am pretty confident the problem occurs in this function, however, as I call the functions as follows
printf("Taking the computers move.\n");
{printf("Taking computer's move.");
computermove = computerMakeMove(playerboard);
printf("Computer's move successfully taken.\n");
playerboard = MakeMove(playerboard, computermove, player);
printf("Computer's board piece successfully played.\n");
system("clear");
displayBoard(playerboard);
...;
}
and it prints
Aborted (core dumped)
immediately after it prints
"Taking computer's move."
Once again, my question is: why is my program crashing if the width is larger than the height when the computer plays?
Thanks.
Edit: I found the solution and I am stupid.
I realloc'd during the while loop.
The realloc should be the first thing outside of the while loop.
The answer for any future programmers who may have this problem:
Notice the
while (columncheck <= columns - 1)
{
if (spaces[columncheck][0] == 0)
{
arrayoflegalmoves[legalmoveindex] = columncheck;
++legalmoveindex;
++columncheck;
}
else
{
++columncheck;
}
arrayoflegalmoves = realloc(arrayoflegalmoves, (legalmoveindex) * sizeof(int));
}
has a realloc inside of it. The realloc should be moved to immediately outside of it, like so
while (columncheck <= columns - 1)
{
if (spaces[columncheck][0] == 0)
{
arrayoflegalmoves[legalmoveindex] = columncheck;
++legalmoveindex;
++columncheck;
}
else
{
++columncheck;
}
}
arrayoflegalmoves = realloc(arrayoflegalmoves, (legalmoveindex) * sizeof(int));
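A minimal sketch of the same idea with no realloc at all: the initial allocation already has room for one entry per column, so the list of legal moves can never outgrow it. The helper name pick_random_legal_column and the -1 return value for "no playable column" are assumptions, not part of the original program. It also frees the temporary array, which the original never does.

```c
#include <stdlib.h>

/* Hypothetical helper: returns a randomly chosen column whose top cell
 * (spaces[col][0]) is empty, or -1 if no column is playable or the
 * temporary array cannot be allocated. */
int pick_random_legal_column(int **spaces, int columns)
{
    if (columns <= 0)
        return -1;
    int *legal = malloc(sizeof *legal * (size_t)columns);
    if (legal == NULL)
        return -1;

    int count = 0;
    for (int col = 0; col < columns; col++) {
        if (spaces[col][0] == 0)
            legal[count++] = col;     /* at most `columns` entries, so no realloc is needed */
    }

    int move = -1;
    if (count > 0)
        move = legal[rand() % count]; /* guarded, so never a modulo by zero */
    free(legal);
    return move;
}
```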
It is unusual to have the columns be the first index of an array, and it leads to confusion.
// suggest using camel case for all variable names, for readability
Board *AllocateBoard(int columns, int rows)
{
int **array= malloc(sizeof(int *) *columns); // add check that malloc successful
int r = 0;
for ( r = 0; r < columns; ++r)
{
array[r] = malloc(sizeof(int) * rows); // <-- add: check that malloc successful
}
int j = columns - 1; // this results in last column not initialized
int k = rows - 1; // this results in last row of each column not initialized
int m = 0; // column loop counter
int n = 0; // row loop counter
for ( m = 0; m < j; ++m)
{
for ( n = 0; n < k; ++n)
{
array[m][n] = 0;
}
}
Board *board = malloc(sizeof(Board)); // <-- add: check if malloc successful
board->columns = columns;
board->rows = rows;
board->spaces = array;
return board;
} // end function: AllocateBoard
// why is this only looking at the first row of each column?
int computerMakeMove(Board *board)
{
int RandIndex = 0;
int **spaces = board->spaces;
int columns = board->columns;
int *arrayoflegalmoves = malloc(sizeof(int) * (columns)); // <-- add check that malloc successful
int columncheck = 0;
int legalmoveindex = 0;
while (columncheck <= columns - 1)// should be: for(; columncheck < columns; columncheck++ )
{
if (spaces[columncheck][0] == 0)
{ // then first row of column is zero
arrayoflegalmoves[legalmoveindex] = columncheck;
++legalmoveindex;
++columncheck; // <-- remove this line
}
else // remove this 'else' code block
{
++columncheck;
} // end if
arrayoflegalmoves = realloc(arrayoflegalmoves, (legalmoveindex) * sizeof(int));
// <-- 1) use temp int*, in case realloc fails
// <-- 2) if realloc successful, update arrayoflegalmoves
// <-- 3) the code is not checking each row of each column,
// so the original malloc is more than plenty
// so why bother to realloc
// <-- 4) if legalmoveindex is 0, realloc(ptr, 0) may free the block and return NULL
} // end while
// in following, what about when zero moves found? probably should return an error value such as -1
if (legalmoveindex == 1)
{ // only one column[row0] found to contain 0
return arrayoflegalmoves[0];
}
else
{
RandIndex = rand() % (legalmoveindex);
return arrayoflegalmoves[RandIndex]; // if zero moves found, rand() % 0 above is a
// division by zero (undefined behaviour),
// which typically crashes the program
} // end if
} // end function: computerMakeMove

Minimum number of swaps needed to sort the array

I have an array of size n, which contain elements from 1 to n, in random order. So, we'd have as input an unordered array of integers.
Considering I can swap any two elements any number of times, how can I find minimum numbers of such swap to make array sorted?
This can be done in O(n), assuming the elements are in the range 1 to n and there are no duplicates.
noofswaps = 0
for i in range(len(A)):
    while A[i] != i + 1:
        temp = A[i]
        A[i] = A[A[i] - 1]
        A[temp - 1] = temp
        noofswaps += 1
print(noofswaps)
static int minimumSwaps(int[] arr) {
int swap=0;
boolean newarr[]=new boolean[arr.length];
for(int i=0;i<arr.length;i++){
int j=i,count=0;
while(!newarr[j]){
newarr[j]=true;
j=arr[j]-1;
count++;
}
if(count!=0)
swap+=count-1;
}
return swap;
}
I'll try to answer this question using JavaScript.
This is the most efficient code I have tried so far:
function minimumSwaps(arr) {
var arrLength = arr.length;
// create two new Arrays
// one record value and key separately
// second to keep visited node count (default set false to all)
var newArr = [];
var newArrVisited = [];
for (let i = 0; i < arrLength; i++) {
newArr[i]= [];
newArr[i].value = arr[i];
newArr[i].key = i;
newArrVisited[i] = false;
}
// sort new array by value
newArr.sort(function (a, b) {
return a.value - b.value;
})
var swp = 0;
for (let i = 0; i < arrLength; i++) {
// check if already visited or swapped
if (newArr[i].key == i || newArrVisited[i]) {
continue;
}
var cycle = 0;
var j = i;
while (!newArrVisited[j]) {
// mark as visited
newArrVisited[j] = true;
j = newArr[j].key; //assign next key
cycle++;
}
if (cycle > 0) {
swp += (cycle > 1) ? cycle - 1 : cycle;
}
}
return swp;
}
Hackerrank Python code for minimum swaps 2 using hashmaps
length = int(input())
arr = list(map(int, input().split()))
hashmap = {}
for i in range(0, len(arr)):
    hashmap[i+1] = [arr[i], False]
swap_count = 0
for e_pos, e_val in hashmap.items():
    if e_val[1] == False:
        e_val[1] = True
        if e_pos == e_val[0]:
            continue
        else:
            c = e_val[0]
            while hashmap[c][1] == False:
                hashmap[c][1] = True
                b = hashmap[c][0]
                c = b
                swap_count += 1
print(swap_count)
There's an interesting take in GeeksForGeeks with
Time Complexity: O(N) where N is the size of the array.
Auxiliary Space: O(1)
The used approach was
For each index in arr[], check whether the current element is in its right position. Since the array contains distinct elements from 1 to N, we can simply compare the element with its index in the array to check whether it is in its right position.
If the current element is not in its right position, swap it with the element that occupies its place (using a temp variable)
Else check for next index (i += 1)
This is the code
def minimumSwaps(arr):
    min_num_swaps = 0;
    i = 0;
    while (i < len(arr)):
        if (arr[i] != i + 1):
            while (arr[i] != i + 1):
                temp = 0;
                temp = arr[arr[i] - 1];
                arr[arr[i] - 1] = arr[i];
                arr[i] = temp;
                min_num_swaps += 1;
        i += 1;
    return min_num_swaps;
that could easily be updated to
Remove semicolons
Remove the need for temp
Substitute len(arr) with a given integer input n with the size of the array
def minimumSwaps(arr):
    # n is the size of the array, read from the input beforehand
    min_num_swaps = 0
    i = 0
    while (i < n-1):
        if (arr[i] != i + 1):
            while (arr[i] != i + 1):
                arr[arr[i] - 1], arr[i] = arr[i], arr[arr[i] - 1]
                min_num_swaps += 1
        i += 1
    return min_num_swaps
They both are gonna pass all the current 15 Test cases in HackerRank
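For comparison, a minimal C sketch of the same in-place approach. The function name minimum_swaps and the size_t length parameter are assumptions. Like the Python versions, it relies on the array holding each value from 1 to n exactly once.

```c
#include <stddef.h>

/* Counts the swaps needed to sort a permutation of 1..n in place.
 * Assumes arr holds each value 1..n exactly once; arr is sorted on return. */
size_t minimum_swaps(int *arr, size_t n)
{
    size_t swaps = 0;
    for (size_t i = 0; i < n; i++) {
        /* keep sending the value at i to its own slot until i holds i + 1 */
        while (arr[i] != (int)(i + 1)) {
            size_t target = (size_t)arr[i] - 1;  /* where arr[i] belongs */
            int tmp = arr[target];
            arr[target] = arr[i];
            arr[i] = tmp;
            swaps++;
        }
    }
    return swaps;
}
```

Every swap puts one value into its final slot, so the total work is linear in n even though the loops are nested.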
Here is my code for the minimumSwaps function, using Java 7:
static int minimumSwaps(int[] arr) {
int c=0;
for(int i=0;i<arr.length;i++){
if(arr[i]!=(i+1)){
int t= arr[i];
arr[i]=arr[t-1];
arr[t-1]=t;
c++;
i=0;
}
}
return c;
}

Find length of smallest window that contains all the characters of a string in another string

I was interviewed recently. I didn't do well because I got stuck on the following question.
suppose a sequence is given : A D C B D A B C D A C D
and search sequence is like: A C D
The task was to find the start and end index of the shortest window in the given string that contains all the characters of the search string, preserving their order.
Output: assuming index start from 1:
start index 10
end index 12
Explanation:
1. The start/end indexes are not 1/3, because although that window contains the characters, their order is not maintained.
2. The start/end indexes are not 1/5, because although that window contains the characters in order, its length is not optimal.
3. The start/end indexes are not 6/9, because although that window contains the characters in order, its length is not optimal.
Please see "How to find smallest substring which contains all characters from a given string?".
That question is different, though, since the order is not maintained there. I'm still struggling to keep track of the indexes. Any help would be appreciated. Thanks.
I tried to write some simple c code to solve the problem:
Update:
I wrote a search function that looks for the required characters in the correct order, returning the length of the window and storing the window's start point in int *startAt. The function processes a subsequence of the given hay, from the specified start point int start to its end.
The rest of the algorithm is in main, where all possible subsequences are tested with a small optimisation: we start looking for the next window right after the start point of the previous one, so we skip some unnecessary turns. During the process we keep track of the best solution found so far.
Complexity is O(n*n/2), i.e. O(n^2).
Update2:
unnecessary dependencies have been removed, unnecessary subsequent calls to strlen(...) have been replaced by size parameters passed to search(...)
#include <stdio.h>
// search for single occurrence
int search(const char hay[], int haySize, const char needle[], int needleSize, int start, int * startAt)
{
int i, charFound = 0;
// search from start to end
for (i = start; i < haySize; i++)
{
// found a character ?
if (hay[i] == needle[charFound])
{
// is it the first one?
if (charFound == 0)
*startAt = i; // store starting position
charFound++; // and go to next one
}
// are we done?
if (charFound == needleSize)
return i - *startAt + 1; // success
}
return -1; // failure
}
int main(int argc, char **argv)
{
char hay[] = "ADCBDABCDACD";
char needle[] = "ACD";
int resultStartAt = -1, resultLength = -1, i, haySize = sizeof(hay) - 1, needleSize = sizeof(needle) - 1;
// search all possible occurrences; <= so that a window starting at the last possible position is also tried
for (i = 0; i <= haySize - needleSize; i++)
{
int startAt, length;
length = search(hay, haySize, needle, needleSize, i, &startAt);
// found something?
if (length != -1)
{
// check if it's the first result, or a one better than before
if ((resultLength == -1) || (resultLength > length))
{
resultLength = length;
resultStartAt = startAt;
}
// skip unnecessary steps in the next turn
i = startAt;
}
}
printf("start at: %d, length: %d\n", resultStartAt, resultLength);
return 0;
}
Start from the beginning of the string.
If you encounter an A, then mark the position and push it on a stack. After that, keep checking the characters sequentially until
1. If you encounter an A, update the A's position to current value.
2. If you encounter a C, push it onto the stack.
After you encounter a C, again keep checking the characters sequentially until,
1. If you encounter a D, erase the stack containing A and C and mark the score from A to D for this sub-sequence.
2. If you encounter an A, then start another Stack and mark this position as well.
2a. If now you encounter a C, then erase the earlier stacks and keep the most recent stack.
2b. If you encounter a D, then erase the older stack and mark the score and check if it is less than the current best score.
Keep doing this till you reach the end of the string.
The pseudo code can be something like:
Initialize stack = empty;
Initialize bestLength = mainString.size() + 1; // a large value for the subsequence.
Initialize currentLength = 0;
for ( int i = 0; i < mainString.size(); i++ ) {
if ( stack is empty ) {
if ( mainString[i] == 'A' ) {
start a new stack and push A on it.
mark the startPosition for this stack as i.
}
continue;
}
For each of the stacks ( there can be at most two stacks prevailing,
one of size 1 and other of size 0 ) {
if ( stack size == 1 ) // only A in it {
if ( mainString[i] == 'A' ) {
update the startPosition for this stack as i.
}
if ( mainString[i] == 'C' ) {
push C on to this stack.
}
} else if ( stack size == 2 ) // A & C in it {
if ( mainString[i] == 'C' ) {
if there is a stack with size 1, then delete this stack;// the other one dominates this stack.
}
if ( mainString[i] == 'D' ) {
mark the score from startPosition till i and update bestLength accordingly.
delete this stack.
}
}
}
}
I modified my previous suggestion to use a single queue; I believe this algorithm now runs in O(N*m) time:
int[] FindSequence(char[] sequenceList)
{
    Queue<Integer> startSeqQueue = new ArrayDeque<>(); // from java.util
    int i, k;
    int minSequenceLength = sequenceList.length + 1;
    int startIdx = -1, endIdx = -1;
    // queue the position of every 'A'
    for (i = 0; i < sequenceList.length; i++)
    {
        if (sequenceList[i] == 'A')
        {
            startSeqQueue.add(i);
        }
    }
    while (!startSeqQueue.isEmpty())
    {
        i = startSeqQueue.remove();
        k = i + 1;
        while (k < sequenceList.length && sequenceList[k] != 'C')
        {
            // a later 'A' before the 'C' gives a shorter window
            if (sequenceList[k] == 'A') i = startSeqQueue.remove();
            k++;
        }
        while (k < sequenceList.length && sequenceList[k] != 'D')
            k++;
        if (k < sequenceList.length && minSequenceLength > k - i + 1)
        {
            startIdx = i;
            endIdx = k;
            minSequenceLength = k - i + 1;
        }
    }
    return new int[] { startIdx, endIdx };
}
My previous (O(1) memory) suggestion:
int[] FindSequence(char[] sequenceList)
{
    int i, k;
    int minSequenceLength = sequenceList.length + 1;
    int startIdx = -1, endIdx = -1;
    for (i = 0; i < sequenceList.length - 2; i++)
    {
        if (sequenceList[i] != 'A')
            continue;
        k = i + 1;
        // first 'C' after the 'A', then first 'D' after that 'C'
        while (k < sequenceList.length && sequenceList[k] != 'C')
            k++;
        while (k < sequenceList.length && sequenceList[k] != 'D')
            k++;
        if (k < sequenceList.length && minSequenceLength > k - i + 1)
        {
            startIdx = i;
            endIdx = k;
            minSequenceLength = k - i + 1;
        }
    }
    return new int[] { startIdx, endIdx };
}
Here's my version. It keeps track of possible candidates for an optimum solution. For each character in the hay, it checks whether this character is the next one needed by each candidate. It then selects the shortest candidate. Quite straightforward.
class ShortestSequenceFinder
{
public class Solution
{
public int StartIndex;
public int Length;
}
private class Candidate
{
public int StartIndex;
public int SearchIndex;
}
public Solution Execute(string hay, string needle)
{
var candidates = new List<Candidate>();
var result = new Solution() { Length = hay.Length + 1 };
for (int i = 0; i < hay.Length; i++)
{
char c = hay[i];
for (int j = candidates.Count - 1; j >= 0; j--)
{
if (c == needle[candidates[j].SearchIndex])
{
if (candidates[j].SearchIndex == needle.Length - 1)
{
int candidateLength = i - candidates[j].StartIndex + 1; // inclusive window length
if (candidateLength < result.Length)
{
result.Length = candidateLength;
result.StartIndex = candidates[j].StartIndex;
}
candidates.RemoveAt(j);
}
else
{
candidates[j].SearchIndex += 1;
}
}
}
if (c == needle[0])
candidates.Add(new Candidate { SearchIndex = 1, StartIndex = i });
}
return result;
}
}
It runs in O(n*m).
Here is my solution in Python. It returns the indexes assuming 0-indexed sequences. Therefore, for the given example it returns (9, 11) instead of (10, 12). Obviously it's easy to mutate this to return (10, 12) if you wish.
def solution(s, ss):
    S, E = [], []
    for i in xrange(len(s)):
        if s[i] == ss[0]:
            S.append(i)
        if s[i] == ss[-1]:
            E.append(i)
    candidates = sorted([(start, end) for start in S for end in E
                         if start <= end and end - start >= len(ss) - 1],
                        lambda x,y: (x[1] - x[0]) - (y[1] - y[0]))
    for cand in candidates:
        i, j = cand[0], 0
        while i <= cand[-1]:
            if s[i] == ss[j]:
                j += 1
            i += 1
        if j == len(ss):
            return cand
Usage:
>>> from so import solution
>>> s = 'ADCBDABCDACD'
>>> solution(s, 'ACD')
(9, 11)
>>> solution(s, 'ADC')
(0, 2)
>>> solution(s, 'DCCD')
(1, 8)
>>> solution(s, s)
(0, 11)
>>> s = 'ABC'
>>> solution(s, 'B')
(1, 1)
>>> print solution(s, 'gibberish')
None
I think the time complexity is O(p log(p)), where p is the number of pairs of indexes in the sequence that refer to search_sequence[0] and search_sequence[-1] such that the index for search_sequence[0] is less than the index for search_sequence[-1], because it sorts these p pairs using an O(n log n) algorithm. But then again, my substring iteration at the end could totally overshadow that sorting step. I'm not really sure.
It probably has a worst-case time complexity which is bounded by O(n*m) where n is the length of the sequence and m is the length of the search sequence, but at the moment I cannot think of an example worst-case.
Here is my O(m*n) algorithm in Java:
class ShortestWindowAlgorithm {
Multimap<Character, Integer> charToNeedleIdx; // Character -> indexes in needle, from rightmost to leftmost | Multimap is a class from Guava
int[] prefixesIdx; // prefixesIdx[i] -- rightmost index in the hay window that contains the shortest found prefix of needle[0..i]
int[] prefixesLengths; // prefixesLengths[i] -- shortest window containing needle[0..i]
public int shortestWindow(String hay, String needle) {
init(needle);
for (int i = 0; i < hay.length(); i++) {
for (int needleIdx : charToNeedleIdx.get(hay.charAt(i))) {
if (firstTimeAchievedPrefix(needleIdx) || foundShorterPrefix(needleIdx, i)) {
prefixesIdx[needleIdx] = i;
prefixesLengths[needleIdx] = getPrefixNewLength(needleIdx, i);
forgetOldPrefixes(needleIdx);
}
}
}
return prefixesLengths[prefixesLengths.length - 1];
}
private void init(String needle) {
charToNeedleIdx = ArrayListMultimap.create();
prefixesIdx = new int[needle.length()];
prefixesLengths = new int[needle.length()];
for (int i = needle.length() - 1; i >= 0; i--) {
charToNeedleIdx.put(needle.charAt(i), i);
prefixesIdx[i] = -1;
prefixesLengths[i] = -1;
}
}
private boolean firstTimeAchievedPrefix(int needleIdx) {
int shortestPrefixSoFar = prefixesLengths[needleIdx];
return shortestPrefixSoFar == -1 && (needleIdx == 0 || prefixesLengths[needleIdx - 1] != -1);
}
private boolean foundShorterPrefix(int needleIdx, int hayIdx) {
int shortestPrefixSoFar = prefixesLengths[needleIdx];
int newLength = getPrefixNewLength(needleIdx, hayIdx);
return newLength <= shortestPrefixSoFar;
}
private int getPrefixNewLength(int needleIdx, int hayIdx) {
return needleIdx == 0 ? 1 : (prefixesLengths[needleIdx - 1] + (hayIdx - prefixesIdx[needleIdx - 1]));
}
private void forgetOldPrefixes(int needleIdx) {
if (needleIdx > 0) {
prefixesLengths[needleIdx - 1] = -1;
prefixesIdx[needleIdx - 1] = -1;
}
}
}
It works on every input and also can handle repeated characters etc.
Here are some examples:
public class StackOverflow {
public static void main(String[] args) {
ShortestWindowAlgorithm algorithm = new ShortestWindowAlgorithm();
System.out.println(algorithm.shortestWindow("AXCXXCAXCXAXCXCXAXAXCXCXDXDXDXAXCXDXAXAXCD", "AACD")); // 6
System.out.println(algorithm.shortestWindow("ADCBDABCDACD", "ACD")); // 3
System.out.println(algorithm.shortestWindow("ADCBDABCD", "ACD")); // 4
}
}
I haven't read every answer here, but I don't think anyone has noticed that this is just a restricted version of local pairwise sequence alignment, in which we are only allowed to insert characters (and not delete or substitute them). As such it will be solved by a simplification of the Smith-Waterman algorithm that considers only 2 cases per vertex (arriving at the vertex either by matching a character exactly, or by inserting a character) rather than 3 cases. This algorithm is O(n^2).
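A minimal C sketch of an O(n*m) dynamic-programming variant of this idea, under the assumption that only matches and insertions are allowed. For each needle prefix it remembers the latest possible window start while scanning the hay once. The names shortest_ordered_window and start_out are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Finds the shortest window of hay that contains needle as an ordered
 * (not necessarily contiguous) subsequence. On success stores the 0-based
 * start index in *start_out and returns the window length; returns -1 if
 * there is no such window or on allocation failure. */
long shortest_ordered_window(const char *hay, const char *needle, long *start_out)
{
    size_t n = strlen(hay), m = strlen(needle);
    if (m == 0 || m > n)
        return -1;

    /* start[j]: latest start of a window ending at or before the current
     * position that contains needle[0..j] in order, or -1 if none yet. */
    long *start = malloc(m * sizeof *start);
    if (start == NULL)
        return -1;
    for (size_t j = 0; j < m; j++)
        start[j] = -1;

    long best = -1;
    for (size_t i = 0; i < n; i++) {
        /* go right to left so hay[i] only extends prefixes found before i */
        for (size_t j = m; j-- > 0; ) {
            if (hay[i] != needle[j])
                continue;
            start[j] = (j == 0) ? (long)i : start[j - 1];
            if (j == m - 1 && start[j] >= 0) {
                long len = (long)i - start[j] + 1;
                if (best < 0 || len < best) {
                    best = len;
                    *start_out = start[j];
                }
            }
        }
    }
    free(start);
    return best;
}
```

For hay "ADCBDABCDACD" and needle "ACD" this yields start index 9 (0-based) and length 3, i.e. positions 10 to 12 in the question's 1-based numbering.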
Here's my solution. It follows one of the pattern matching solutions. Please comment/correct me if I'm wrong.
Given the input string as in the question
A D C B D A B C D A C D. Let's first compute the indices where A occurs. Assuming a zero based index this should be [0,5,9].
Now the pseudo code is as follows.
Store the indices of A in a list say *orders*.// orders=[0,5,9]
globalminStart, globalminEnd=0,localMinStart=0,localMinEnd=0;
for (index: orders)
{
int i =index;
Stack chars=new Stack();// to store the characters
i=localminStart;
while(i< length of input string)
{
if(str.charAt(i)=='C') // we've already seen A, so we look for C
st.push(str.charAt(i));
i++;
continue;
else if(str.charAt(i)=='D' and st.peek()=='C')
localminEnd=i; // we have a match! so assign value of i to len
i+=1;
break;
else if(str.charAt(i)=='A' )// seen the next A
break;
}
if (globalMinEnd-globalMinStart<localMinEnd-localMinStart)
{
globalMinEnd=localMinEnd;
globalMinStart=localMinStart;
}
}
return [globalMinstart,globalMinEnd]
}
P.S.: this is pseudocode and a rough idea. I'd be happy to correct it if there's something wrong.
As far as I can tell, the time complexity is O(n) and the space complexity is O(n).
