Algorithm to iterate N-dimensional array in pseudo random order

I have an array that I would like to iterate in random order. That is, I would like my iteration to visit each element only once in a seemingly random order.
Would it be possible to implement an iterator that would iterate elements like this without storing the order or other data in a lookup table first?
Would it be possible to do it for N-dimensional arrays where N>1?
UPDATE: Some of the answers mention how to do this by storing indices. A major point of this question is how to do it without storing indices or other data.

I decided to solve this because it annoyed me to death that I could not remember the name of a solution I had heard of before. I did remember it in the end; more on that at the bottom of this post.
My solution depends on the mathematical properties of a few carefully chosen numbers:
range = array size
prime = closestPrimeAfter(range)
root = closestPrimitiveRootTo(range/2)
state = root
With this setup we can calculate the following repeatedly and it will iterate all elements of the array exactly once in a seemingly random order, after which it will loop to traverse the array in the same exact order again.
state = (state * root) % prime
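As a rough illustration of the recurrence above, here is a minimal Python sketch. The names (pseudo_random_order, find_primitive_root and the helpers) are made up for this example, and the root is chosen by testing candidates upward from prime // 2, which is an assumption of the sketch rather than the exact selection used in the Java code below. Values that overshoot the array size are skipped.

```python
def is_prime(n):
    """Trial-division primality test; fine for array-sized numbers."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def prime_factors(n):
    """Return the set of distinct prime factors of n."""
    factors = set()
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors


def find_primitive_root(p):
    """Return a primitive root modulo the prime p, searching upward from p // 2."""
    phi = p - 1
    factors = prime_factors(phi)
    start = max(1, p // 2)
    for offset in range(phi):
        # Walks start, start + 1, ..., p - 1, then wraps around to 1, 2, ...
        g = 1 + (start - 1 + offset) % phi
        if all(pow(g, phi // q, p) != 1 for q in factors):
            return g
    raise ValueError("p must be prime")


def pseudo_random_order(size):
    """Yield every index in range(size) exactly once, in a scrambled order."""
    if size <= 0:
        return
    prime = size + 1
    while not is_prime(prime):
        prime += 1
    root = find_primitive_root(prime)
    state = root
    # The powers of a primitive root visit every value in 1..prime-1 once.
    for _ in range(prime - 1):
        if state <= size:  # skip values that overshoot the array
            yield state - 1
        state = state * root % prime
```

For example, list(pseudo_random_order(10)) is a permutation of 0..9 computed from a single integer of state; the order is fixed for a given size, so it only looks random.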
I implemented and tested this in Java, so I decided to paste my code here for future reference.
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Random;
public class PseudoRandomSequence {
private long state;
private final long range;
private final long root;
private final long prime;
//Debugging counter
private int dropped = 0;
public PseudoRandomSequence(int r) {
range = r;
prime = closestPrimeAfter(range);
root = modPow(generator(prime), closestPrimeTo(prime / 2), prime);
reset();
System.out.println("-- r:" + range);
System.out.println(" p:" + prime);
System.out.println(" k:" + root);
System.out.println(" s:" + state);
}
// https://en.wikipedia.org/wiki/Primitive_root_modulo_n
private static long modPow(long base, long exp, long mod) {
return BigInteger.valueOf(base).modPow(BigInteger.valueOf(exp), BigInteger.valueOf(mod)).longValue();
}
//http://e-maxx-eng.github.io/algebra/primitive-root.html
private static long generator(long p) {
ArrayList<Long> fact = new ArrayList<Long>();
long phi = p - 1, n = phi;
for (long i = 2; i * i <= n; ++i) {
if (n % i == 0) {
fact.add(i);
while (n % i == 0) {
n /= i;
}
}
}
if (n > 1) fact.add(n);
for (long res = 2; res <= p; ++res) {
boolean ok = true;
for (long i = 0; i < fact.size() && ok; ++i) {
ok &= modPow(res, phi / fact.get((int) i), p) != 1;
}
if (ok) {
return res;
}
}
return -1;
}
public long get() {
return state - 1;
}
public void advance() {
//This loop skips all results that overshoot the range, which never happens when range + 1 is prime (then prime == range + 1).
dropped--;
do {
state = (state * root) % prime;
dropped++;
} while (state > range);
}
public void reset() {
state = root;
dropped = 0;
}
private static boolean isPrime(long num) {
if (num == 2) return true;
if (num % 2 == 0) return false;
for (int i = 3; i * i <= num; i += 2) {
if (num % i == 0) return false;
}
return true;
}
private static long closestPrimeAfter(long n) {
long up;
for (up = n + 1; !isPrime(up); ++up)
;
return up;
}
private static long closestPrimeBefore(long n) {
long dn;
for (dn = n - 1; !isPrime(dn); --dn)
;
return dn;
}
private static long closestPrimeTo(long n) {
final long dn = closestPrimeBefore(n);
final long up = closestPrimeAfter(n);
return (n - dn) > (up - n) ? up : dn;
}
private static boolean test(int r, int loops) {
final int array[] = new int[r];
Arrays.fill(array, 0);
System.out.println("TESTING: array size: " + r + ", loops: " + loops + "\n");
PseudoRandomSequence prs = new PseudoRandomSequence(r);
final long ct = loops * r;
//Iterate the array 'loops' times, incrementing the value for each cell for every visit.
for (int i = 0; i < ct; ++i) {
prs.advance();
final long index = prs.get();
array[(int) index]++;
}
//Verify that each cell was visited exactly 'loops' times, confirming the validity of the sequence
for (int i = 0; i < r; ++i) {
final int c = array[i];
if (loops != c) {
System.err.println("ERROR: array element #" + i + " was " + c + " instead of " + loops + " as expected\n");
return false;
}
}
//TODO: Verify the "randomness" of the sequence
System.out.println("OK: Sequence checked out with " + prs.dropped + " drops (" + prs.dropped / loops + " per loop vs. diff " + (prs.prime - r) + ") \n");
return true;
}
//Run lots of random tests
public static void main(String[] args) {
Random r = new Random();
r.setSeed(1337);
for (int i = 0; i < 100; ++i) {
PseudoRandomSequence.test(r.nextInt(1000000) + 1, r.nextInt(9) + 1);
}
}
}
As mentioned at the top, about 10 minutes after spending a good part of my night getting a result, I DID remember where I had read about the original way of doing this. It was in a small C implementation of a 2D graphics "dissolve" effect described in Graphics Gems vol. 1, which in turn is an adaptation to 2D, with some optimizations, of a mechanism called a linear-feedback shift register (LFSR) (Wikipedia article here, original dissolve.c source code here).
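To give an idea of the LFSR approach, here is a minimal Python sketch. It is not the dissolve.c code: it assumes a fixed 16-bit Galois LFSR with the commonly cited maximal-length tap mask 0xB400 (period 65535), so it only handles arrays of up to 65535 elements, and the function name lfsr16_order is made up for this example. A real implementation would pick the register width to fit the array size.

```python
def lfsr16_order(size):
    """Yield every index in range(size) once, using a 16-bit Galois LFSR."""
    if not 0 < size <= 0xFFFF:
        raise ValueError("this sketch supports 1..65535 elements")
    state = 1  # any non-zero start state works
    for _ in range(0xFFFF):  # a maximal 16-bit LFSR visits 1..65535 once each
        if state <= size:  # skip register values that overshoot the array
            yield state - 1
        lsb = state & 1
        state >>= 1
        if lsb:
            state ^= 0xB400  # taps 16, 14, 13, 11
```

The only state is one 16-bit integer, but small arrays still pay for all 65535 register steps, which is why matching the register width to the size matters in practice.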

You could collect all possible indices in a list and then remove a random index to visit. I know this is sort of like a lookup table, but I don't see any other option than this.
Here is an example for a one-dimensional array (adaptation to multiple dimensions should be trivial):
class RandomIterator<T> {
T[] array;
List<Integer> remainingIndeces;
public RandomIterator(T[] array) {
this.array = array;
this.remainingIndeces = new ArrayList<>();
for(int i = 0;i<array.length;++i)
remainingIndeces.add(i);
}
public T next() {
return array[remainingIndeces.remove((int)(Math.random()*remainingIndeces.size()))];
}
public boolean hasNext() {
return !remainingIndeces.isEmpty();
}
}
On a side note: if this code is performance-relevant, this method performs far worse, because removing at a random position triggers copying when the list is backed by an array (a linked list won't help either, as indexed access is O(n)). I would suggest a lookup structure (e.g. HashSet in Java) that stores all visited indices to circumvent this problem (though that's exactly what you did not want to use).
EDIT: Another approach is to copy said array and use a library function to shuffle it and then traverse it in linear order. If your array isn't that big, this seems like the most readable and performant option.
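If the index list is acceptable, the copying cost of random removal can be avoided by swapping the picked entry with the last remaining one, which is an incremental Fisher-Yates shuffle. A minimal Python sketch, with the hypothetical name random_order and an optional rng parameter added for reproducibility (it still stores all indices, so it does not meet the "no lookup table" requirement):

```python
import random


def random_order(items, rng=random):
    """Yield each element of items exactly once, in random order."""
    indices = list(range(len(items)))
    remaining = len(indices)
    while remaining:
        pick = rng.randrange(remaining)  # choose among the unvisited slots
        remaining -= 1
        # Move the picked index to the end of the unvisited region: O(1).
        indices[pick], indices[remaining] = indices[remaining], indices[pick]
        yield items[indices[remaining]]
```

Each draw costs O(1), so a full traversal is O(n) instead of the O(n^2) worst case of removing from the middle of an array-backed list.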

You would need to create a pseudo random number generator that generates values from 0 to X-1 and takes X iterations before repeating the cycle, where X is the product of all the dimension sizes. I don't know if there is a generic solution to doing this. Wiki article for one type of random number generator:
http://en.wikipedia.org/wiki/Linear_congruential_generator
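One possible way to get such a generator, sketched in Python under some assumptions: round X up to a power of two, use a linear congruential generator whose constants satisfy the Hull-Dobell full-period conditions for a power-of-two modulus (odd increment, multiplier congruent to 1 modulo 4), and skip values that are not below X. The constants 5 and 3 and the names lcg_order, unravel and nd_order are arbitrary choices for this example; the resulting order is not high-quality randomness.

```python
def lcg_order(size, seed=0):
    """Yield every integer in range(size) exactly once via a full-period LCG."""
    if size <= 0:
        return
    modulus = 1
    while modulus < size:
        modulus *= 2
    multiplier = 5  # multiplier - 1 is divisible by 4
    increment = 3   # odd, hence coprime to the power-of-two modulus
    state = seed % modulus
    for _ in range(modulus):
        if state < size:  # skip values outside the array
            yield state
        state = (multiplier * state + increment) % modulus


def unravel(index, shape):
    """Convert a flat row-major index into N-dimensional coordinates."""
    coords = []
    for dim in reversed(shape):
        index, c = divmod(index, dim)
        coords.append(c)
    return tuple(reversed(coords))


def nd_order(shape, seed=0):
    """Yield every coordinate tuple of an array with the given shape once."""
    total = 1
    for dim in shape:
        total *= dim
    for flat in lcg_order(total, seed):
        yield unravel(flat, shape)
```

Because the modulus is less than twice X, fewer than half of the generated values are skipped, and the N-dimensional case reduces to the flat case through unravel.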

Yes, it is possible. Imagine a 3D array (you are not likely to use anything more than that). It is like a cube, and every point where three grid lines meet is a cell. You can enumerate your cells 1 to N using a dictionary; you can do this initialization in loops and create a list of cell numbers to use for the random draw.
Initialization
totalCells = ... (xMax * yMax * zMax)
i = 0
For (x = 0; x < xMax ; x++)
{
For (y = 0; y < yMax ; y++)
{
For (z = 0; z < zMax ; z++)
{
dict.Add(i, new Cell(x, y, z))
lst.Add(i)
i++
}
}
}
Now, all you have to do is iterate randomly
Do While (lst.Count > 0)
{
indexToVisit = rand.Next(0, lst.Count) // upper bound is exclusive
currentCell = dict[lst[indexToVisit]]
lst.RemoveAt(indexToVisit) // remove by position, not by value
// Do something with current cell here
. . . . . .
}
This is pseudocode, since you didn't mention the language you work in.
Another way is to randomize 3 (or whatever number of dimensions you have) lists and then just nested loop through them - this will be random in the end.

Related

Transform an array to another array by shifting value to adjacent element

I am given 2 arrays, Input and Output Array. The goal is to transform the input array to output array by performing shifting of 1 value in a given step to its adjacent element. Eg: Input array is [0,0,8,0,0] and Output array is [2,0,4,0,2]. Here 1st step would be [0,1,7,0,0] and 2nd step would be [0,1,6,1,0] and so on.
What can be the algorithm to do this efficiently? I was thinking of performing BFS but then we have to do BFS from each element and this can be exponential. Can anyone suggest solution for this problem?
I think you can do this simply by scanning in each direction tracking the cumulative value (in that direction) in the current array and the desired output array and pushing values along ahead of you as necessary:
scan from the left looking for first cell where
cumulative value > cumulative value in desired output
while that holds move 1 from that cell to the next cell to the right
scan from the right looking for first cell where
cumulative value > cumulative value in desired output
while that holds move 1 from that cell to the next cell to the left
For your example the steps would be:
FWD:
[0,0,8,0,0]
[0,0,7,1,0]
[0,0,6,2,0]
[0,0,6,1,1]
[0,0,6,0,2]
REV:
[0,1,5,0,2]
[0,2,4,0,2]
[1,1,4,0,2]
[2,0,4,0,2]
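The two scans can be sketched in Python as follows. This is only one reading of the description above: shift_steps is a made-up name, it assumes non-negative values and equal totals, and it records every intermediate array so the output can be compared with the FWD/REV listing.

```python
def shift_steps(src, dst):
    """Return the list of states from src to dst, moving 1 unit per step."""
    if len(src) != len(dst) or sum(src) != sum(dst):
        raise ValueError("arrays must have equal length and equal totals")
    state = list(src)
    steps = [list(state)]
    n = len(state)

    # Forward scan: push any excess in the running prefix to the right.
    have = want = 0
    for i in range(n - 1):
        have += state[i]
        want += dst[i]
        while have > want:
            state[i] -= 1
            state[i + 1] += 1
            have -= 1
            steps.append(list(state))

    # Reverse scan: push any excess in the running suffix to the left.
    have = want = 0
    for i in range(n - 1, 0, -1):
        have += state[i]
        want += dst[i]
        while have > want:
            state[i] -= 1
            state[i - 1] += 1
            have -= 1
            steps.append(list(state))
    return steps
```

Calling shift_steps([0,0,8,0,0], [2,0,4,0,2]) returns the starting array followed by the eight arrays of the FWD and REV listing, in the same order.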
I think BFS could actually work.
Notice that n*O(n+m) = O(n^2+nm), which is polynomial, not exponential.
You could also use the Floyd-Warshall algorithm or Johnson's algorithm, with a weight of 1 for a "flat" graph, or even connect the vertices in a new way by their actual distance and potentially save some iterations.
Hope it helped :)
void transform(int[] in, int[] out, int size)
{
int[] state = in.clone();
report(state);
while (true)
{
int minPressure = 0;
int indexOfMinPressure = 0;
int maxPressure = 0;
int indexOfMaxPressure = 0;
int pressureSum = 0;
for (int index = 0; index < size - 1; ++index)
{
int lhsDiff = state[index] - out[index];
int rhsDiff = state[index + 1] - out[index + 1];
int pressure = lhsDiff - rhsDiff;
if (pressure < minPressure)
{
minPressure = pressure;
indexOfMinPressure = index;
}
if (pressure > maxPressure)
{
maxPressure = pressure;
indexOfMaxPressure = index;
}
pressureSum += pressure;
}
if (minPressure == 0 && maxPressure == 0)
{
break;
}
boolean shiftLeft;
if (Math.abs(minPressure) > Math.abs(maxPressure))
{
shiftLeft = true;
}
else if (Math.abs(minPressure) < Math.abs(maxPressure))
{
shiftLeft = false;
}
else
{
shiftLeft = (pressureSum < 0);
}
if (shiftLeft)
{
++state[indexOfMinPressure];
--state[indexOfMinPressure + 1];
}
else
{
--state[indexOfMaxPressure];
++state[indexOfMaxPressure + 1];
}
report(state);
}
}
A simple greedy algorithm will work and do the job in minimum number of steps. The function returns the total numbers of steps required for the task.
int shift(std::vector<int>& a,std::vector<int>& b){
int n = a.size();
int sum1=0,sum2=0;
for (int i = 0; i < n; ++i){
sum1+=a[i];
sum2+=b[i];
}
if (sum1!=sum2)
{
return -1;
}
int operations=0;
int j=0;
for (int i = 0; i < n;)
{
if (a[i]<b[i])
{
if(j<=i){
j=i+1; // only take from cells to the right of i; cells left of i are already final
}
while(j<n and a[j]==0){
j++;
}
if(a[j]<b[i]-a[i]){
operations+=(j-i)*a[j];
a[i]+=a[j];
a[j]=0;
}else{
operations+=(j-i)*(b[i]-a[i]);
a[j]-=(b[i]-a[i]);
a[i]=b[i];
}
}else if (a[i]>b[i])
{
a[i+1]+=(a[i]-b[i]);
operations+=(a[i]-b[i]);
a[i]=b[i];
}else{
i++;
}
}
return operations;
}
Here -1 is a special value meaning that given array cannot be converted to desired one.
Time Complexity: O(n).
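If only the number of steps is needed, it can also be computed from running differences: every step moves one unit across exactly one boundary between neighbours, so the minimum is the sum over all boundaries of the absolute imbalance to their left. A minimal Python sketch (min_shifts is a made-up name; the -1 convention is copied from the function above):

```python
def min_shifts(a, b):
    """Return the minimum number of single-unit adjacent moves, or -1."""
    if len(a) != len(b) or sum(a) != sum(b):
        return -1  # the arrays cannot be transformed into each other
    operations = 0
    carry = 0  # surplus (positive) or deficit (negative) left of the boundary
    for x, y in zip(a, b):
        carry += x - y
        operations += abs(carry)
    return operations
```

For the example from the question, min_shifts([0,0,8,0,0], [2,0,4,0,2]) gives 8.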

Merging two arraylists without creating third one

Here is a task I was trying to solve. You must write the function
void merge(ArrayList a, ArrayList b) {
// code
}
The function receives two ArrayLists of equal size as input parameters: [a1, a2, ..., an], [b1, b2, ..., bn]. As the result of execution, the 1st ArrayList must contain the elements of both lists, alternating consistently ([a1, b1, a2, b2, ..., an, bn]). Please read the bold text twice =)
Code must work as efficiently as possible.
Here is my solution
public static void merge(ArrayList a, ArrayList b) {
ArrayList result = new ArrayList();
int i = 0;
Iterator iter1 = a.iterator();
Iterator iter2 = b.iterator();
while ((iter1.hasNext() || iter2.hasNext()) && i < (a.size() + b.size())) {
if (i % 2 ==0) {
result.add(iter1.next());
} else {
result.add(iter2.next());
}
i++;
}
a = result;
}
I know it's not perfect at all. But I can't understand how to merge in the 1st list without creating tmp list.
Thanks in advance for taking part.
Double ArrayList a's size. Set last two elements of a to the last element of the old a and the last element of b. Keep going, backing up each time, until you reach the beginnings of a and b. You have to do it from the rear because otherwise you will write over the original a's values.
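The back-to-front idea can be sketched in a few lines of Python, using a plain list in place of an ArrayList (merge_alternating is a made-up name, and padding with None is just one way to double the size):

```python
def merge_alternating(a, b):
    """Interleave b into a in place: [a1, b1, a2, b2, ..., an, bn]."""
    n = len(a)
    if len(b) != n:
        raise ValueError("both lists must have the same size")
    a.extend([None] * n)  # double the size of a
    # Work from the rear so that no unread element of a is overwritten.
    for i in range(n - 1, -1, -1):
        a[2 * i + 1] = b[i]
        a[2 * i] = a[i]
```

Writing to positions 2i and 2i+1 never touches an index below i, which is why going backwards is safe.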
In the end i got this:
public static void merge(ArrayList<Integer> arr1, ArrayList<Integer> arr2) {
int indexForArr1 = arr1.size() - 1;
int oldSize = arr1.size();
int newSize = arr1.size() + arr2.size();
/*
decided not to create new arraylist with new size but just to fill up old one with nulls
*/
fillWithNulls(arr1, newSize);
for(int i = (newSize-1); i >= 0; i--) {
if (i%2 != 0) {
int indexForArr2 = i%oldSize;
arr1.set(i,arr2.get(indexForArr2));
oldSize--; // we reduce the size because we don't need the last element any more
} else {
arr1.set(i, arr1.get(indexForArr1));
indexForArr1--;
}
}
}
private static void fillWithNulls(ArrayList<Integer> array, int newSize) {
int delta = newSize - array.size();
for(int i = 0; i < delta; i++) {
array.add(null);
}
}
Thanks John again for bright idea!

Find length of smallest window that contains all the characters of a string in another string

Recently I was interviewed. I didn't do well because I got stuck on the following question.
Suppose a sequence is given: A D C B D A B C D A C D
and the search sequence is: A C D
The task was to find the start and end index of the shortest window in the given string that contains all the characters of the search string, preserving their order.
Output: assuming index start from 1:
start index 10
end index 12
Explanation:
1. The start/end indexes are not 1/3, because although that window contains the characters, their order is not maintained.
2. The start/end indexes are not 1/5, because although that window contains the string in order, its length is not optimal.
3. The start/end indexes are not 6/9, because although that window contains the string in order, its length is not optimal.
Please see "How to find smallest substring which contains all characters from a given string?".
That question is different, though, since the order is not maintained there. I'm still struggling to keep track of the indexes. Any help would be appreciated. Thanks.
I tried to write some simple c code to solve the problem:
Update:
I wrote a search function that looks for the required characters in the correct order, returning the length of the window and storing the window's start point in int * startAt. The function processes the sub-sequence of the given hay from the specified start point int start to its end.
The rest of the algorithm is in main, where all possible subsequences are tested with a small optimisation: we start looking for the next window right after the start point of the previous one, so we skip some unnecessary turns. During the process we keep track of the best solution found so far.
Complexity is O(n*n/2), i.e. O(n^2).
Update2:
unnecessary dependencies have been removed, unnecessary subsequent calls to strlen(...) have been replaced by size parameters passed to search(...)
#include <stdio.h>
// search for single occurrence
int search(const char hay[], int haySize, const char needle[], int needleSize, int start, int * startAt)
{
int i, charFound = 0;
// search from start to end
for (i = start; i < haySize; i++)
{
// found a character ?
if (hay[i] == needle[charFound])
{
// is it the first one?
if (charFound == 0)
*startAt = i; // store starting position
charFound++; // and go to next one
}
// are we done?
if (charFound == needleSize)
return i - *startAt + 1; // success
}
return -1; // failure
}
int main(int argc, char **argv)
{
char hay[] = "ADCBDABCDACD";
char needle[] = "ACD";
int resultStartAt = -1, resultLength = -1, i, haySize = sizeof(hay) - 1, needleSize = sizeof(needle) - 1;
// search all possible start positions, including the last one where the needle still fits
for (i = 0; i <= haySize - needleSize; i++)
{
int startAt, length;
length = search(hay, haySize, needle, needleSize, i, &startAt);
// found something?
if (length != -1)
{
// check if it's the first result, or a one better than before
if ((resultLength == -1) || (resultLength > length))
{
resultLength = length;
resultStartAt = startAt;
}
// skip unnecessary steps in the next turn
i = startAt;
}
}
printf("start at: %d, length: %d\n", resultStartAt, resultLength);
return 0;
}
Start from the beginning of the string.
If you encounter an A, then mark the position and push it on a stack. After that, keep checking the characters sequentially until
1. If you encounter an A, update the A's position to current value.
2. If you encounter a C, push it onto the stack.
After you encounter a C, again keep checking the characters sequentially until,
1. If you encounter a D, erase the stack containing A and C and mark the score from A to D for this sub-sequence.
2. If you encounter an A, then start another Stack and mark this position as well.
2a. If now you encounter a C, then erase the earlier stacks and keep the most recent stack.
2b. If you encounter a D, then erase the older stack and mark the score and check if it is less than the current best score.
Keep doing this till you reach the end of the string.
The pseudo code can be something like:
Initialize stack = empty;
Initialize bestLength = mainString.size() + 1; // a large value for the subsequence.
Initialize currentLength = 0;
for ( int i = 0; i < mainString.size(); i++ ) {
if ( stack is empty ) {
if ( mainString[i] == 'A' ) {
start a new stack and push A on it.
mark the startPosition for this stack as i.
}
continue;
}
For each of the stacks ( there can be at most two stacks prevailing,
one of size 1 and other of size 0 ) {
if ( stack size == 1 ) // only A in it {
if ( mainString[i] == 'A' ) {
update the startPosition for this stack as i.
}
if ( mainString[i] == 'C' ) {
push C on to this stack.
}
} else if ( stack size == 2 ) // A & C in it {
if ( mainString[i] == 'C' ) {
if there is a stack with size 1, then delete this stack;// the other one dominates this stack.
}
if ( mainString[i] == 'D' ) {
mark the score from startPosition till i and update bestLength accordingly.
delete this stack.
}
}
}
}
I modified my previous suggestion using a single queue, now I believe this algorithm runs with O(N*m) time:
FindSequence(char[] sequenceList)
{
queue startSeqQueue;
int i = 0, k;
int minSequenceLength = sequenceList.length + 1;
int startIdx = -1, endIdx = -1;
// remember the position of every 'A'
for (i = 0; i < sequenceList.length; i++)
{
if (sequenceList[i] == 'A')
{
startSeqQueue.enqueue(i);
}
}
while (!startSeqQueue.isEmpty())
{
i = startSeqQueue.dequeue();
k = i + 1;
while (k < sequenceList.length && sequenceList[k] != 'C')
{
// a later 'A' before the first 'C' gives a shorter window, so start from it instead
if (sequenceList[k] == 'A') i = startSeqQueue.dequeue();
k++;
}
while (k < sequenceList.length && sequenceList[k] != 'D')
k++;
if (k < sequenceList.length && minSequenceLength > k - i + 1)
{
startIdx = i;
endIdx = k;
minSequenceLength = k - i + 1;
}
}
return (startIdx, endIdx);
}
My previous (O(1) memory) suggestion:
FindSequence(char[] sequenceList)
{
int i, k;
int minSequenceLength = sequenceList.length + 1;
int startIdx = -1, endIdx = -1;
for (i = 0; i < sequenceList.length - 2; i++)
{
if (sequenceList[i] != 'A')
continue;
k = i + 1;
// advance to the first 'C' after this 'A'
while (k < sequenceList.length && sequenceList[k] != 'C')
k++;
// then advance to the first 'D' after that 'C'
while (k < sequenceList.length && sequenceList[k] != 'D')
k++;
// if a 'D' was found, [i, k] is the shortest window starting at i
if (k < sequenceList.length && minSequenceLength > k - i + 1)
{
startIdx = i;
endIdx = k;
minSequenceLength = k - i + 1;
}
}
return (startIdx, endIdx);
}
Here's my version. It keeps track of possible candidates for an optimum solution. For each character in the hay, it checks whether this character is the next one expected by each candidate. It then selects the shortest candidate. Quite straightforward.
class ShortestSequenceFinder
{
public class Solution
{
public int StartIndex;
public int Length;
}
private class Candidate
{
public int StartIndex;
public int SearchIndex;
}
public Solution Execute(string hay, string needle)
{
var candidates = new List<Candidate>();
var result = new Solution() { Length = hay.Length + 1 };
for (int i = 0; i < hay.Length; i++)
{
char c = hay[i];
for (int j = candidates.Count - 1; j >= 0; j--)
{
if (c == needle[candidates[j].SearchIndex])
{
if (candidates[j].SearchIndex == needle.Length - 1)
{
int candidateLength = i - candidates[j].StartIndex + 1; // inclusive window length
if (candidateLength < result.Length)
{
result.Length = candidateLength;
result.StartIndex = candidates[j].StartIndex;
}
candidates.RemoveAt(j);
}
else
{
candidates[j].SearchIndex += 1;
}
}
}
if (c == needle[0])
candidates.Add(new Candidate { SearchIndex = 1, StartIndex = i });
}
return result;
}
}
It runs in O(n*m).
Here is my solution in Python. It returns the indexes assuming 0-indexed sequences. Therefore, for the given example it returns (9, 11) instead of (10, 12). Obviously it's easy to mutate this to return (10, 12) if you wish.
def solution(s, ss):
    S, E = [], []
    for i in range(len(s)):
        if s[i] == ss[0]:
            S.append(i)
        if s[i] == ss[-1]:
            E.append(i)
    # all (start, end) pairs that are long enough, shortest window first
    candidates = sorted([(start, end) for start in S for end in E
                         if start <= end and end - start >= len(ss) - 1],
                        key=lambda pair: pair[1] - pair[0])
    for cand in candidates:
        i, j = cand[0], 0
        while i <= cand[-1]:
            if s[i] == ss[j]:
                j += 1
            i += 1
            if j == len(ss):
                return cand
Usage:
>>> from so import solution
>>> s = 'ADCBDABCDACD'
>>> solution(s, 'ACD')
(9, 11)
>>> solution(s, 'ADC')
(0, 2)
>>> solution(s, 'DCCD')
(1, 8)
>>> solution(s, s)
(0, 11)
>>> s = 'ABC'
>>> solution(s, 'B')
(1, 1)
>>> print(solution(s, 'gibberish'))
None
I think the time complexity is O(p log(p)), where p is the number of pairs of indexes in the sequence that refer to search_sequence[0] and search_sequence[-1] such that the index for search_sequence[0] is less than the index for search_sequence[-1], because it sorts these p pairings using an O(n log n) algorithm. But then again, my substring iteration at the end could totally overshadow that sorting step. I'm not really sure.
It probably has a worst-case time complexity which is bounded by O(n*m) where n is the length of the sequence and m is the length of the search sequence, but at the moment I cannot think of an example worst-case.
Here is my O(m*n) algorithm in Java:
class ShortestWindowAlgorithm {
Multimap<Character, Integer> charToNeedleIdx; // Character -> indexes in needle, from rightmost to leftmost | Multimap is a class from Guava
int[] prefixesIdx; // prefixesIdx[i] -- rightmost index in the hay window that contains the shortest found prefix of needle[0..i]
int[] prefixesLengths; // prefixesLengths[i] -- shortest window containing needle[0..i]
public int shortestWindow(String hay, String needle) {
init(needle);
for (int i = 0; i < hay.length(); i++) {
for (int needleIdx : charToNeedleIdx.get(hay.charAt(i))) {
if (firstTimeAchievedPrefix(needleIdx) || foundShorterPrefix(needleIdx, i)) {
prefixesIdx[needleIdx] = i;
prefixesLengths[needleIdx] = getPrefixNewLength(needleIdx, i);
forgetOldPrefixes(needleIdx);
}
}
}
return prefixesLengths[prefixesLengths.length - 1];
}
private void init(String needle) {
charToNeedleIdx = ArrayListMultimap.create();
prefixesIdx = new int[needle.length()];
prefixesLengths = new int[needle.length()];
for (int i = needle.length() - 1; i >= 0; i--) {
charToNeedleIdx.put(needle.charAt(i), i);
prefixesIdx[i] = -1;
prefixesLengths[i] = -1;
}
}
private boolean firstTimeAchievedPrefix(int needleIdx) {
int shortestPrefixSoFar = prefixesLengths[needleIdx];
return shortestPrefixSoFar == -1 && (needleIdx == 0 || prefixesLengths[needleIdx - 1] != -1);
}
private boolean foundShorterPrefix(int needleIdx, int hayIdx) {
int shortestPrefixSoFar = prefixesLengths[needleIdx];
int newLength = getPrefixNewLength(needleIdx, hayIdx);
return newLength <= shortestPrefixSoFar;
}
private int getPrefixNewLength(int needleIdx, int hayIdx) {
return needleIdx == 0 ? 1 : (prefixesLengths[needleIdx - 1] + (hayIdx - prefixesIdx[needleIdx - 1]));
}
private void forgetOldPrefixes(int needleIdx) {
if (needleIdx > 0) {
prefixesLengths[needleIdx - 1] = -1;
prefixesIdx[needleIdx - 1] = -1;
}
}
}
It works on every input and also can handle repeated characters etc.
Here are some examples:
public class StackOverflow {
public static void main(String[] args) {
ShortestWindowAlgorithm algorithm = new ShortestWindowAlgorithm();
System.out.println(algorithm.shortestWindow("AXCXXCAXCXAXCXCXAXAXCXCXDXDXDXAXCXDXAXAXCD", "AACD")); // 6
System.out.println(algorithm.shortestWindow("ADCBDABCDACD", "ACD")); // 3
System.out.println(algorithm.shortestWindow("ADCBDABCD", "ACD")); // 4
}
}
I haven't read every answer here, but I don't think anyone has noticed that this is just a restricted version of local pairwise sequence alignment, in which we are only allowed to insert characters (and not delete or substitute them). As such it will be solved by a simplification of the Smith-Waterman algorithm that considers only 2 cases per vertex (arriving at the vertex either by matching a character exactly, or by inserting a character) rather than 3 cases. This algorithm is O(n^2).
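A two-case recurrence of this kind (match a character, or skip one character of the hay) can be sketched in Python with a single rolling row. This is an assumed formulation rather than a literal Smith-Waterman implementation: start[j] holds the latest possible start of a window that contains needle[0..j] in order and ends at or before the current position. The name shortest_window is made up, and indexes are 0-based.

```python
def shortest_window(hay, needle):
    """Return (start, end) of the shortest window of hay that contains
    needle as a subsequence, or None if there is no such window."""
    m = len(needle)
    if m == 0:
        return None
    start = [-1] * m  # start[j]: latest start of a window matching needle[:j+1]
    best = None
    for i, ch in enumerate(hay):
        # Go right to left so start[j - 1] still refers to positions before i.
        for j in range(m - 1, -1, -1):
            if ch != needle[j]:
                continue  # "insert" case: the row entry is carried over unchanged
            start[j] = i if j == 0 else start[j - 1]  # "match" case
            if j == m - 1 and start[j] != -1:
                if best is None or i - start[j] < best[1] - best[0]:
                    best = (start[j], i)
    return best
```

For the sequence from the question, shortest_window("ADCBDABCDACD", "ACD") returns (9, 11), i.e. 10/12 with 1-based indexes. The running time is O(n*m) with O(m) extra memory.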
Here's my solution. It follows one of the pattern matching solutions. Please comment/correct me if I'm wrong.
Given the input string as in the question
A D C B D A B C D A C D. Let's first compute the indices where A occurs. Assuming a zero based index this should be [0,5,9].
Now the pseudo code is as follows.
Store the indices of A in a list say *orders*.// orders=[0,5,9]
globalminStart, globalminEnd=0,localMinStart=0,localMinEnd=0;
for (index: orders)
{
int i =index;
Stack chars=new Stack();// to store the characters
i=localminStart;
while(i< length of input string)
{
if(str.charAt(i)=='C') // we've already seen A, so we look for C
st.push(str.charAt(i));
i++;
continue;
else if(str.charAt(i)=='D' and st.peek()=='C')
localminEnd=i; // we have a match! so assign value of i to len
i+=1;
break;
else if(str.charAt(i)=='A' )// seen the next A
break;
}
if (globalMinEnd-globalMinStart<localMinEnd-localMinStart)
{
globalMinEnd=localMinEnd;
globalMinStart=localMinStart;
}
}
return [globalMinstart,globalMinEnd]
}
P.S.: this is pseudocode and a rough idea. I'd be happy to correct it and understand if there's something wrong.
AFAIC: time complexity O(n), space complexity O(n).

Binary Search with an unknown number of items

Assuming you don't know the number of elements you are searching and given an API that accepts an index and will return null if you are outside the bounds (as implemented here with the getWordFromDictionary method), how can you perform a binary search and implement the isWordInDictionary() method for client programs?
This solution works, but I ended up doing a serial search above the level where I found an initial high-index value. The search through the lower range of values was inspired by this answer. I also peeked at BinarySearch in Reflector (C# decompiler), but that has a known list length, so still looking to fill in the gaps.
private static string[] dictionary;
static void Main(string[] args)
{
dictionary = System.IO.File.ReadAllLines(@"C:\tmp\dictionary.txt");
Console.WriteLine(isWordInDictionary("aardvark", 0));
Console.WriteLine(isWordInDictionary("bee", 0));
Console.WriteLine(isWordInDictionary("zebra", 0));
Console.WriteLine(isWordInDictionaryBinary("aardvark"));
Console.WriteLine(isWordInDictionaryBinary("bee"));
Console.WriteLine(isWordInDictionaryBinary("zebra"));
Console.ReadLine();
}
static bool isWordInDictionaryBinary(string word)
{
// assume the size of the dictionary is unknown
// quick check for empty dictionary
string w = getWordFromDictionary(0);
if (w == null)
return false;
// assume that the length is very big.
int low = 0;
int hi = int.MaxValue;
while (low <= hi)
{
int mid = (low + ((hi - low) >> 1));
w = getWordFromDictionary(mid);
// If the middle element m you select at each step is outside
// the array bounds (you need a way to tell this), then limit
// the search to those elements with indexes small than m.
if (w == null)
{
hi = mid;
continue;
}
int compare = String.Compare(w, word);
if (compare == 0)
return true;
if (compare < 0)
low = mid + 1;
else
hi = mid - 1;
}
// punting on the search above the current value of hi
// to the (still unknown) upper limit
return isWordInDictionary(word, hi);
}
// serial search, works good for small number of items
static bool isWordInDictionary(string word, int startIndex)
{
// assume the size of the dictionary is unknown
int i = startIndex;
while (getWordFromDictionary(i) != null)
{
if (getWordFromDictionary(i).Equals(word, StringComparison.OrdinalIgnoreCase))
return true;
i++;
}
return false;
}
private static string getWordFromDictionary(int index)
{
try
{
return dictionary[index];
}
catch (IndexOutOfRangeException)
{
return null;
}
}
Final Code after answers
static bool isWordInDictionaryBinary(string word)
{
// assume the size of the dictionary is unknown
// quick check for empty dictionary
string w = getWordFromDictionary(0);
if (w == null)
return false;
// assume that the number of elements is very big
int low = 0;
int hi = int.MaxValue;
while (low <= hi)
{
int mid = (low + ((hi - low) >> 1));
w = getWordFromDictionary(mid);
// treat null the same as finding a string that comes
// after the string you are looking for
if (w == null)
{
hi = mid - 1;
continue;
}
int compare = String.Compare(w, word);
if (compare == 0)
return true;
if (compare < 0)
low = mid + 1;
else
hi = mid - 1;
}
return false;
}
You can implement a binary search in two phases. In the first phase, you grow the size of the interval you're searching in. Once you detect you're outside the bounds, you can do a normal binary search in the latest interval you found. Something like this:
bool isPresentPhase1(string word)
{
int l = 0, d = 1;
while( true ) // you should eventually reach an index out of bounds
{
string w = getWord(l + d);
if( w == null )
return isPresentPhase2(word, l, l + d - 1);
int c = String.Compare(w, word);
if( c == 0 )
return true;
else if( c > 0 ) // w comes after word, so word can only be in [l, l + d - 1]
return isPresentPhase2(word, l, l + d - 1);
else
{
l = l + d + 1; // everything up to index l + d comes before word
d *= 2;
}
}
}
bool isPresentPhase2(string word, int lo, int hi)
{
// normal binary search in the interval [lo, hi]
}
Sure you can. Start at index one, and double your query index until you hit something that's lexicographically larger than your query word (Edit: or null). Then you can narrow down your search space again until you find the index, or return false.
Edit: Note that this does NOT add to your asymptotic runtime, and it is still O(logN), where N is the number of items in the series.
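The doubling idea might look like this in Python. It is a sketch under stated assumptions: get_word stands in for the question's getWordFromDictionary and returns None out of bounds, make_lookup is a hypothetical helper that builds such a function for testing, and the dictionary is assumed to be sorted in Python's default string order.

```python
def make_lookup(words):
    """Build a lookup function that returns None for out-of-range indexes."""
    def get_word(index):
        return words[index] if 0 <= index < len(words) else None
    return get_word


def is_word_in_dictionary(get_word, word):
    """Search a sorted dictionary of unknown size for word."""
    # Phase 1: double hi until it is out of bounds or at/after the word.
    hi = 1
    while True:
        w = get_word(hi)
        if w is None or w >= word:
            break
        hi *= 2
    lo = hi // 2  # every index below lo holds a word that sorts before `word`

    # Phase 2: ordinary binary search on [lo, hi]; None sorts after everything.
    while lo <= hi:
        mid = (lo + hi) // 2
        w = get_word(mid)
        if w is None or w > word:
            hi = mid - 1
        elif w < word:
            lo = mid + 1
        else:
            return True
    return False
```

Phase 1 makes O(log N) probes and leaves an interval of length O(N), so the whole search stays O(log N) without ever probing indexes near the integer maximum.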
So, I'm not sure I entirely understand the problem from your description, but I'm assuming you're trying to search through a sorted array of unknown length to find a particular string. I'm also assuming that there are no nulls in the actual array; the array only returns null if you ask for an index that's out of bounds.
If those things are true, the solution should be just a standard binary search, albeit one where you search over the entire integer space, and you just treat null the same as finding a string that comes after the string you are looking for. Essentially just imagine that your sorted array of N strings is really a sorted array of INT_MAX strings sorted with nulls at the end.
What I don't quite understand is that you seem to basically have done that already (at least from a cursory look at the code), so I think I might not understand your problem completely.

Generating All Permutations of Character Combinations when # of arrays and length of each array are unknown

I'm not sure how to ask my question in a succinct way, so I'll start with examples and expand from there. I am working with VBA, but I think this problem is non language specific and would only require a bright mind that can provide a pseudo code framework. Thanks in advance for the help!
Example:
I have 3 Character Arrays Like So:
Arr_1 = [X,Y,Z]
Arr_2 = [A,B]
Arr_3 = [1,2,3,4]
I would like to generate ALL possible permutations of the character arrays like so:
XA1
XA2
XA3
XA4
XB1
XB2
XB3
XB4
YA1
YA2
.
.
.
ZB3
ZB4
This can be easily solved using 3 while loops or for loops. My question is how do I solve for this if the # of arrays is unknown and the length of each array is unknown?
So as an example with 4 character arrays:
Arr_1 = [X,Y,Z]
Arr_2 = [A,B]
Arr_3 = [1,2,3,4]
Arr_4 = [a,b]
I would need to generate:
XA1a
XA1b
XA2a
XA2b
XA3a
XA3b
XA4a
XA4b
.
.
.
ZB4a
ZB4b
So the Generalized Example would be:
Arr_1 = [...]
Arr_2 = [...]
Arr_3 = [...]
.
.
.
Arr_x = [...]
Is there a way to structure a function that will generate an unknown number of loops and loop through the length of each array to generate the permutations? Or maybe there's a better way to think about the problem?
Thanks Everyone!
Recursive solution
This is actually the easiest, most straightforward solution. The following is in Java, but it should be instructive:
public class Main {
public static void main(String[] args) {
Object[][] arrs = {
{ "X", "Y", "Z" },
{ "A", "B" },
{ "1", "2" },
};
recurse("", arrs, 0);
}
static void recurse (String s, Object[][] arrs, int k) {
if (k == arrs.length) {
System.out.println(s);
} else {
for (Object o : arrs[k]) {
recurse(s + o, arrs, k + 1);
}
}
}
}
(see full output)
Note: Java arrays are 0-based, so k goes from 0..arrs.length-1 during the recursion, until k == arrs.length when it's the end of recursion.
Non-recursive solution
It's also possible to write a non-recursive solution, but frankly this is less intuitive. This is actually very similar to base conversion, e.g. from decimal to hexadecimal; it's a generalized form where each position has its own set of values.
public class Main {
public static void main(String[] args) {
Object[][] arrs = {
{ "X", "Y", "Z" },
{ "A", "B" },
{ "1", "2" },
};
int N = 1;
for (Object[] arr : arrs) {
N = N * arr.length;
}
for (int v = 0; v < N; v++) {
System.out.println(decode(arrs, v));
}
}
static String decode(Object[][] arrs, int v) {
String s = "";
for (Object[] arr : arrs) {
int M = arr.length;
s = s + arr[v % M];
v = v / M;
}
return s;
}
}
(see full output)
This produces the tuples in a different order. If you want to generate them in the same order as the recursive solution, iterate through arrs "backward" during decode as follows:
static String decode(Object[][] arrs, int v) {
String s = "";
for (int i = arrs.length - 1; i >= 0; i--) {
int Ni = arrs[i].length;
s = arrs[i][v % Ni] + s;
v = v / Ni;
}
return s;
}
(see full output)
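To see why the backward decode reproduces the recursive order, it may help to write the index v out as a mixed-radix number. The symbols below (n_j for the length of arrs[j], i_j for the index picked from it, k for the number of arrays) are notation introduced here for illustration, not part of the original answer:

```latex
v \;=\; \sum_{j=0}^{k-1} i_j \prod_{m=j+1}^{k-1} n_m , \qquad 0 \le i_j < n_j .

% Example with (n_0, n_1, n_2) = (3, 2, 2), as in the arrays above:
7 \;=\; 1 \cdot (2 \cdot 2) \;+\; 1 \cdot 2 \;+\; 1
\;\Rightarrow\; (i_0, i_1, i_2) = (1, 1, 1)
\;\Rightarrow\; \texttt{YB2}
```

The last array plays the role of the least significant digit, so it changes fastest as v counts up, just as the innermost level of the recursion does.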
Thanks to #polygenelubricants for the excellent solution.
Here is the Javascript equivalent:
var a=['0'];
var b=['Auto', 'Home'];
var c=['Good'];
var d=['Tommy', 'Hilfiger', '*'];
var attrs = [a, b, c, d];
function recurse (s, attrs, k) {
if(k==attrs.length) {
console.log(s);
} else {
for(var i=0; i<attrs[k].length;i++) {
recurse(s+attrs[k][i], attrs, k+1);
}
}
}
recurse('', attrs, 0);
EDIT: Here's a Ruby solution. It's pretty much the same as my other solution below, but it assumes your input character arrays are words, so you can type:
% perm.rb ruby is cool
~/bin/perm.rb
#!/usr/bin/env ruby
def perm(args)
peg = Hash[args.collect {|v| [v,0]}]
nperms= 1
args.each { |a| nperms *= a.length }
perms = Array.new(nperms, "")
nperms.times do |p|
args.each { |a| perms[p] += a[peg[a]] }
args.each do |a|
peg[a] += 1
break if peg[a] < a.length
peg[a] = 0
end
end
perms
end
puts perm ARGV
OLD - I have a script to do this in MEL (Maya's Embedded Language). I'll try to translate it to something C-like, but don't expect it to run without a bit of fixing ;) It works in Maya though.
First, throw all the arrays together in one long array with delimiters. (I'll leave that to you, because in my system it rips the values out of a UI.) This means the delimiters will take up extra slots. To use your sample data above:
string delimitedArray[] = {"X","Y","Z","|","A","B","|","1","2","3","4","|"};
Of course you can concatenate as many arrays as you like.
string[] getPerms( string delimitedArray[]) {
string result[];
string delimiter("|");
string compactArray[]; // will be the same as delimitedArray, but without the "|" delimiters
int arraySizes[]; // will hold the number of values in each array
int offsets[]; // will hold the index in compactArray where each array starts
int counters[]; // the values that will increment in the following loops, like pegs in each array
int nPermutations = 1;
int arrSize = 0, offset = 0, nArrays = 0;
// do a prepass to find some information about the structure, and to build the compact array
for (s in delimitedArray) {
if (s == delimiter) {
nPermutations *= arrSize; // arrSize will have been counting elements
arraySizes[nArrays] = arrSize;
counters[nArrays] = 0; // reset the counter
nArrays ++; // nArrays goes up every time we find a new array
offsets.append(offset - arrSize) ; // it's here, at the end of an array, that we store the offset of this array
arrSize=0;
} else { // it's one of the elements, not a delimiter
compactArray.append(s);
arrSize++;
offset++;
}
}
// put a bail out here if you like
if( nPermutations > 256) error("too many permutations " + nPermutations+". max is 256");
// now figure out the permutations
for (p=0;p<nPermutations;p++) {
string perm ="";
// In each array at the position of that array's counter
for (i=0;i<nArrays ;i++) {
int delimitedArrayIndex = counters[i] + offsets[i] ;
// build the string
perm += (compactArray[delimitedArrayIndex]);
}
result.append(perm);
// the interesting bit
// increment the array counters, but in fact the program
// will only get to increment a counter if the previous counter
// reached the end of its array, otherwise we break
for (i = 0; i < nArrays; ++i) {
counters[i] += 1;
if (counters[i] < arraySizes[i])
break;
counters[i] = 0;
}
}
return result;
}
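Since the C-like translation above is not expected to run as-is, here is a small Java sketch of the same counter ("peg") idea. It takes a String[][] directly instead of a delimited array, so the prepass is not needed; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class Odometer {
    // One counter per array; the first counter moves fastest, as in the loops above.
    static List<String> combine(String[][] arrs) {
        List<String> result = new ArrayList<>();
        for (String[] arr : arrs) {
            if (arr.length == 0) {
                return result; // an empty array means there is nothing to combine
            }
        }
        int[] counters = new int[arrs.length];
        while (true) {
            // Build one string from the element each counter currently points at.
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < arrs.length; i++) {
                sb.append(arrs[i][counters[i]]);
            }
            result.add(sb.toString());
            // Increment like an odometer: carry to the next counter only on wrap.
            int i = 0;
            while (i < arrs.length) {
                counters[i]++;
                if (counters[i] < arrs[i].length) {
                    break;
                }
                counters[i] = 0;
                i++;
            }
            if (i == arrs.length) {
                return result; // every counter wrapped, so we are done
            }
        }
    }

    public static void main(String[] args) {
        String[][] arrs = {{"X", "Y", "Z"}, {"A", "B"}, {"1", "2", "3", "4"}};
        for (String s : combine(arrs)) {
            System.out.println(s);
        }
    }
}
```

Note that the order differs from the one in the question: because the first counter moves fastest, the output starts XA1, YA1, ZA1, XB1 and so on. Walking the counters from the last array to the first would give the question's order instead.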
If I understand the question correctly, I think you could put all your arrays into another array, thereby creating a jagged array.
Then, loop through all the arrays in your jagged array creating all the permutations you need.
Does that make sense?
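One way to read this suggestion, sketched in Java: hold the arrays in a list (the jagged array), then loop over that list once, extending every partial result with every element of the current array. The names are hypothetical, and unlike the decode-based answer this keeps all partial results in memory.

```java
import java.util.ArrayList;
import java.util.List;

final class JaggedCombiner {
    static List<String> combine(List<String[]> jagged) {
        // Start with a single empty prefix and grow it one array at a time.
        List<String> partial = new ArrayList<>();
        partial.add("");
        for (String[] arr : jagged) {
            List<String> next = new ArrayList<>();
            for (String prefix : partial) {
                for (String item : arr) {
                    next.add(prefix + item);
                }
            }
            partial = next;
        }
        return partial;
    }

    public static void main(String[] args) {
        List<String[]> jagged = new ArrayList<>();
        jagged.add(new String[] {"X", "Y", "Z"});
        jagged.add(new String[] {"A", "B"});
        jagged.add(new String[] {"1", "2", "3", "4"});
        jagged.add(new String[] {"a", "b"});
        for (String s : combine(jagged)) {
            System.out.println(s);
        }
    }
}
```

The loop nesting depth is fixed at three no matter how many arrays there are, which is what makes the unknown array count manageable.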
It sounds like you've almost got it figured out already.
What if you add one more array, call it ArrayHolder, that holds all of your unknown number of arrays of unknown length? Then you just need another loop, no?

Resources