Calculate efficiently the minimum over each group and sub-group - C

Imagine that we have drawn a random sample y1, y2, ..., yn from some population, so double y[] and int n are known. There are groups in our population, but we do not know exactly which group a particular observation belongs to. So for each yi we introduce an allocation variable zi that tells us from which group yi has been drawn. We assume that there are int k groups, so zi ∈ {0, ..., k-1} for all i. To make inferences for the groups I need to iterate my algorithm a large number of times, say 50,000 or 100,000, and at each iteration each observation is allocated probabilistically to some group, so my array of allocations int z[] keeps changing. In this case counting the number of observations in each group and their minimum is very easy:
int nj[k];
double yj_min[k];
/* initializing the variables at each iteration */
for(j=0; j<k; j++){
    nj[j] = 0;
    yj_min[j] = y[n-1]; /* y[] is ordered, so y[n-1] is the maximum */
}
for(i=0; i<n; i++){
    nj[z[i]] = nj[z[i]] + 1;
    if(y[i] < yj_min[z[i]]){
        yj_min[z[i]] = y[i];
    }
}
But suppose we introduce a further allocation variable di for each observation yi that indicates the sub-group from which yi has been sampled (also sampled probabilistically). There are int m sub-groups, so di ∈ {0, ..., m-1}. Then (zi=j, di=s) indicates that the observation yi has been drawn from group j and sub-group s.
How could I calculate EFFICIENTLY, as I have to do this at each iteration, the minimum yjs_min over {i: zi=j, di=s}? That is, the minimum over yi such that zi=j and di=s, for j=0, ..., k-1 and s=0, ..., m-1.
It would be great to do something like
for(i=0; i<n; i++){
    njs[z[i]][d[i]] = njs[z[i]][d[i]] + 1;
    if(y[z[i]][d[i]] < yjs_min[z[i]][d[i]]){
        yjs_min[z[i]][d[i]] = y[z[i]][d[i]];
    }
}
but obviously this is impossible! So, any ideas please?
Cheers,
Carlos

It looks like you're trying to do something like a Fisher exact test or a permutation test. If so, you might try using a statistics package like R, which is designed to do this kind of stuff, and is likely to have the most efficient algorithms built in already.
That aside, as I understand it, you are stratifying the sample into k groups, and then each of those groups into m sub-groups. You want to find the minimum element of each (group, sub-group) cell.
One reasonably efficient solution is: create k*m unique identifiers, and a map that indicates which (group, sub-group) cell each of them corresponds to. Then randomly allocate these identifiers (using the same distribution) to your sample observations, as you were doing before. Use an efficient in-place sort (like quicksort with a properly selected pivot) to sort the sample by identifier, so that all elements with the same identifier are stored in a contiguous block of memory. This takes log-linear time, so it should be very quick.
Then you just need to walk through the array in order, and find the minimum element for each unique identifier. This should take linear time and k*m extra space.
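For concreteness, here is a minimal C++ sketch of that idea, assuming the quantities y, z, d, k and m from the question; the function and variable names (blockMinima, order, njs, yjs_min) are only illustrative:
#include <algorithm>
#include <limits>
#include <vector>

// Sketch: one identifier per (group, sub-group) pair, id = j*m + s.
// Sorting the observation indices by id makes each (j,s) block contiguous,
// so one walk over the sorted order yields every count and minimum.
void blockMinima(const std::vector<double>& y,
                 const std::vector<int>& z, const std::vector<int>& d,
                 int k, int m,
                 std::vector<int>& njs, std::vector<double>& yjs_min) {
    const int n = static_cast<int>(y.size());
    std::vector<int> order(n);
    for (int i = 0; i < n; ++i) order[i] = i;
    std::sort(order.begin(), order.end(), [&](int a, int b) {
        return z[a] * m + d[a] < z[b] * m + d[b];
    });

    njs.assign(k * m, 0);
    yjs_min.assign(k * m, std::numeric_limits<double>::infinity()); // empty blocks
    for (int p = 0; p < n; ) {
        const int id = z[order[p]] * m + d[order[p]];
        // All observations with this identifier are now adjacent.
        while (p < n && z[order[p]] * m + d[order[p]] == id) {
            njs[id] += 1;
            if (y[order[p]] < yjs_min[id]) yjs_min[id] = y[order[p]];
            ++p;
        }
    }
    // yjs_min[j*m + s] is the minimum over {i : z[i] == j, d[i] == s}.
}
Once the counts and minima live in a flat k*m array indexed by j*m + s, it plays the same role as the two-dimensional njs and yjs_min arrays from the question.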
Hope that helps.

Related

Given an array of integers of size n+1 consisting of the elements [1,n]. All elements are unique except one which is duplicated k times

I have been attempting to solve the following problem:
You are given an array of n+1 integers where all the elements lie in [1,n]. You are also given that one of the elements is duplicated a certain number of times, whilst the others are distinct. Develop an algorithm to find both the duplicated number and the number of times it is duplicated.
Here is my solution where I let k = number of duplications:
#include <vector>

struct LatticePoint{ // to hold duplicate and k
    int a;
    int b;
    LatticePoint(int a_, int b_) : a(a_), b(b_) {}
};

LatticePoint findDuplicateAndK(const std::vector<int>& A){
    int n = A.size() - 1;
    std::vector<int> Numbers(n);
    for(int i = 0; i < n + 1; ++i){
        ++Numbers[A[i] - 1]; // A[i] is in [1,n], so no out-of-bounds access
    }
    int i = 0;
    while(i < n){
        if(Numbers[i] > 1){
            int duplicate = i + 1;
            int k = Numbers[i] - 1;
            return LatticePoint(duplicate, k);
        }
        ++i;
    }
    return LatticePoint(-1, 0); // not reached for valid input
}
So, the basic idea is this: we go along the array, and each time we see the number A[i] we increment Numbers[A[i] - 1]. Since only the duplicate appears more than once, the entry of Numbers with a value greater than 1 identifies the duplicate (its index plus one), and that value minus one is the number of duplications. This algorithm is O(n) in time and O(n) in space.
I was wondering if someone had a solution that is better in time and/or space? (or indeed if there are any errors in my solution...)
You can reduce the scratch space to n bits instead of n ints, provided you either have or are willing to write a bitset with run-time specified size (see boost::dynamic_bitset).
You don't need to collect duplicate counts until you know which element is duplicated, and then you only need to keep that count. So all you need to track is whether you have previously seen the value (hence, n bits). Once you find the duplicated value, set count to 2 and run through the rest of the vector, incrementing count each time you hit an instance of the value. (You initialise count to 2, since by the time you get there, you will have seen exactly two of them.)
That's still O(n) space, but the constant factor is a lot smaller.
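A minimal sketch of that variant, using std::vector<bool> as a run-time-sized bitset in place of boost::dynamic_bitset (the function name just mirrors the one from the question):
#include <utility>
#include <vector>

// Sketch of the n-bit variant: A has n+1 elements with values in [1, n].
// Returns (duplicated value, total number of copies of it).
std::pair<int, int> findDuplicateAndK(const std::vector<int>& A) {
    const int n = static_cast<int>(A.size()) - 1;
    std::vector<bool> seen(n + 1, false);        // n bits of scratch (index 0 unused)
    for (std::size_t i = 0; i < A.size(); ++i) {
        if (seen[A[i]]) {
            // This is the duplicated value; we have now seen exactly two copies.
            const int duplicate = A[i];
            int count = 2;
            for (std::size_t j = i + 1; j < A.size(); ++j)
                if (A[j] == duplicate) ++count;
            return {duplicate, count};
        }
        seen[A[i]] = true;
    }
    return {-1, 0};   // unreachable for valid input
}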
The idea of your code works.
But, thanks to the n+1 elements, we can achieve other tradeoffs of time and space.
If we have some number of buckets we're dividing numbers between, putting n+1 numbers in means that some bucket has to wind up with more than expected. This is a variant on the well-known pigeonhole principle.
So we use 2 buckets, one for the range 1..floor(n/2) and one for floor(n/2)+1..n. After one pass through the array, we know which half the answer is in. We then divide that half into halves, make another pass, and so on. This leads to a binary search which will get the answer with O(1) data, and with ceil(log_2(n)) passes, each taking time O(n). Therefore we get the answer in time O(n log(n)).
Now we don't need to use 2 buckets. If we used 3, we'd take ceil(log_3(n)) passes. So as we increased the fixed number of buckets, we take more space and save time. Are there other tradeoffs?
Well you showed how to do it in 1 pass with n buckets. How many buckets do you need to do it in 2 passes? The answer turns out to be at least sqrt(n) buckets. And 3 passes is possible with the cube root. And so on.
So you get a whole family of tradeoffs where the more buckets you have, the more space you need, but the fewer passes. And your solution is merely at the extreme end, taking the most space and the least time.
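Here is a sketch of the two-bucket end of that spectrum: a binary search on the value range, with O(1) extra data and ceil(log2(n)) counting passes. (The pigeonhole test still works when the element is duplicated more than twice, because at most k-2 values of [1,n] are then missing, so the half containing the duplicate always holds more elements than its range size.)
#include <utility>
#include <vector>

// Sketch: values in [1, n], A.size() == n + 1. Binary search on the value
// range; each pass counts how many elements fall into the lower half.
std::pair<int, int> findDuplicateAndK(const std::vector<int>& A) {
    const int n = static_cast<int>(A.size()) - 1;
    int lo = 1, hi = n;
    while (lo < hi) {
        const int mid = lo + (hi - lo) / 2;
        int count = 0;
        for (int v : A)
            if (lo <= v && v <= mid) ++count;
        if (count > mid - lo + 1) hi = mid;   // overfull bucket holds the duplicate
        else                      lo = mid + 1;
    }
    int copies = 0;                           // one final pass to count the copies
    for (int v : A)
        if (v == lo) ++copies;
    return {lo, copies};
}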
Here's a cheekier algorithm, which requires only constant space but rearranges the input vector. (It only reorders; all the original elements are still present at the end.)
It's still O(n) time, although that might not be completely obvious.
The idea is to try to rearrange the array so that A[i] is i, until we find the duplicate. The duplicate will show up when we try to put an element at the right index and it turns out that that index already holds that element. With that, we've found the duplicate; we have a value we want to move to A[j] but the same value is already at A[j]. We then scan through the rest of the array, incrementing the count every time we find another instance.
#include <utility>
#include <vector>
std::pair<int, int> count_dup(std::vector<int> A) {
/* Try to put each element in its "home" position (that is,
* where the value is the same as the index). Since the
* values start at 1, A[0] isn't home to anyone, so we start
* the loop at 1.
*/
int n = A.size();
for (int i = 1; i < n; ++i) {
while (A[i] != i) {
int j = A[i];
if (A[j] == j) {
/* j is the duplicate. Now we need to count them.
* We have one at i. There's one at j, too, but we only
* need to add it if we're not going to run into it in
* the scan. And there might be one at position 0. After that,
* we just scan through the rest of the array.
*/
int count = 1;
if (A[0] == j) ++count;
if (j < i) ++count;
for (++i; i < n; ++i) {
if (A[i] == j) ++count;
}
return std::make_pair(j, count);
}
/* This swap can only happen once per element. */
std::swap(A[i], A[j]);
}
}
/* If we get here, every element from 1 to n is at home.
* So the duplicate must be A[0], and the duplicate count
* must be 2.
*/
return std::make_pair(A[0], 2);
}
A parallel solution with O(1) complexity is possible.
Introduce an array of atomic booleans and two atomic integers called duplicate and count. First set count to 1. Then access the array in parallel at the index positions of the numbers and perform a test-and-set operation on the boolean. If a boolean is set already, assign the number to duplicate and increment count.
This solution may not always perform better than the suggested sequential alternatives. Certainly not if all numbers are duplicates. Still, it has constant complexity in theory. Or maybe linear complexity in the number of duplicates. I am not quite sure. However, it should perform well when using many cores and especially if the test-and-set and increment operations are lock-free.
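For what it's worth, here is a hedged C++ sketch of that idea; the thread count and the strided chunking are illustrative, and as noted above it may well lose to the sequential versions in practice:
#include <algorithm>
#include <atomic>
#include <thread>
#include <utility>
#include <vector>

// Sketch of the parallel test-and-set idea, assuming values in [1, n] and
// A.size() == n + 1.
std::pair<int, int> findDuplicateAndKParallel(const std::vector<int>& A) {
    const int n = static_cast<int>(A.size()) - 1;
    std::vector<std::atomic<bool>> seen(n + 1);
    for (auto& b : seen) b.store(false);
    std::atomic<int> duplicate{-1};
    std::atomic<int> count{1};          // the first copy we hit is counted here

    const unsigned threads = std::max(1u, std::thread::hardware_concurrency());
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([&, t] {
            for (std::size_t i = t; i < A.size(); i += threads) {
                // exchange() is the test-and-set: it returns the previous value.
                if (seen[A[i]].exchange(true)) {
                    duplicate.store(A[i]);
                    count.fetch_add(1);  // every extra copy adds one
                }
            }
        });
    }
    for (auto& th : pool) th.join();
    return {duplicate.load(), count.load()};
}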

How to understand linear partitioning in dynamic programming

A couple of days ago I learned about the linear partitioning problem. Here is my code for it. Is this code right? I don't understand the formula behind it, or why it is the way it is; if you can, please explain why the formula works.
for(int i=1;i<=n;i++) {
rsq[i]=rsq[i-1]+arr[i];
}
int dp[n+1][k+1];
for(int i=0;i<=n;i++) {
for(int j=0;j<=k;j++) {
dp[i][j]=987654321;
}
}
dp[0][0]=0;
for(int i=1;i<=n;i++) {
dp[i][1]=rsq[i];
}
for(int i=1;i<=k;i++) {
dp[1][i]=arr[1];
}
for(int i=2;i<=n;i++) {
for(int j=2;j<=k;j++) {
for(int x=1;x<i;x++) {
int s=max(dp[x][j-1], rsq[i]-rsq[x]);
if(dp[i][j]>s) dp[i][j]=s;
}
}
}
cout<<dp[n][k];
Thanks in advance.
Following this explanation, the semantics of the state space dp is as follows (arr contains the sizes of the items to process, and rsq contains the partial sums needed below to avoid recalculating them):
dp[i][j] = minimum possible cost over all partitions of
arr[1],...arr[i] into j ranges
where i in {1,...,n} and j in {1,...,k}, or positive
infinity if such a partition does not exist
Apparently in the implementation 987654321 is used to model the value of positive infinity. Note that in the explanation, the axes of the state space are exchanged compared to the implementation in the original question. Based on this definition, we obtain the following recurrence relation for the values of the states.
dp[i][j] = min{ max{ dp[i'][j-1], sum_{x=i'+1}^{i} arr[x] } : i' in {1,...,i-1} }
In the implementation, the sum above is precalculated in rsq (it equals rsq[i] - rsq[i']). The recurrence relation can be interpreted as follows. Given all values dp[i'][j-1] for i' < i (that is, the optimal cost of partitioning the first i' items into j-1 ranges is known), dp[i][j] is obtained by trying every split point i': the items i'+1 to i are grouped into one final range (whose cost is the sum of their sizes), while the first i' items use the precalculated optimal value. The maximum of these two values is the cost of that choice, and the minimum over all choices of i' is dp[i][j].
Intuitively, this can be seen as partitioning the items arr[1],...,arr[n] at an arbitrary split point. The items to the right are considered as one range (the cost of which is the sum of their members, as they are placed together into one range), and the items to the left are recursively partitioned optimally into one range fewer. Besides precalculating the partial sums, the dynamic programming algorithm initializes the base cases (all of the first i items in a single range, and a single item in any number of ranges) and organizes the order of evaluation of the states in such a way that all values needed for the next larger value of j on the second axis have already been calculated when they are needed.
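To make the recurrence concrete, here is a small self-contained C++ version of the same DP (1-based, as in the question); the example in main is made up purely for illustration:
#include <algorithm>
#include <iostream>
#include <vector>

// dp[i][j] = minimum possible cost over all partitions of arr[1..i] into j ranges.
int linearPartition(const std::vector<int>& arr, int n, int k) {
    const int INF = 987654321;                       // "infinity", as in the question
    std::vector<int> rsq(n + 1, 0);                  // rsq[i] = arr[1] + ... + arr[i]
    for (int i = 1; i <= n; ++i) rsq[i] = rsq[i - 1] + arr[i];

    std::vector<std::vector<int>> dp(n + 1, std::vector<int>(k + 1, INF));
    for (int i = 1; i <= n; ++i) dp[i][1] = rsq[i];  // one range: everything together
    for (int j = 1; j <= k; ++j) dp[1][j] = arr[1];  // one item: it forms its own range

    for (int i = 2; i <= n; ++i)
        for (int j = 2; j <= k; ++j)
            for (int x = 1; x < i; ++x)              // x = last split point
                dp[i][j] = std::min(dp[i][j],
                                    std::max(dp[x][j - 1], rsq[i] - rsq[x]));
    return dp[n][k];
}

int main() {
    // Example: partition {1,2,3,4,5} into 2 ranges; the optimum is
    // {1,2,3} | {4,5} with cost max(6, 9) = 9.
    std::vector<int> arr = {0, 1, 2, 3, 4, 5};       // index 0 unused (1-based)
    std::cout << linearPartition(arr, 5, 2) << "\n"; // prints 9
}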

Find three elements in a sorted array which sum to a fourth element

A friend of mine recently got this interview question, which seems to us to be solvable but not within the asymptotic time bounds that the interviewer thought should be possible. Here is the problem:
You have an array of N integers, xs, sorted but possibly non-distinct. Your goal is to find four array indices(1) (a,b,c,d) such that the following two properties hold:
xs[a] + xs[b] + xs[c] = xs[d]
a < b < c < d
The goal is to do this in O(N²) time.
First, an O(N³ log N) solution is obvious: for each (a,b,c) ordered triple, use binary search to see if an appropriate d can be found. Now, how to do better?
One interesting suggestion from the interviewer is to rewrite the first condition as:
xs[a] + xs[b] = xs[d] - xs[c]
It's not clear what to do after this, but perhaps we could choose some pivot value P, and search for an (a,b) pair adding up to P, and a (d,c) pair subtracting to it. That search is easy enough to do in O(n) time for a given P, by searching inwards from both ends of the array. However, it seems to me that the problem with this is that there are N² such values P, not just N of them, so we haven't actually reduced the problem size at all: we're doing O(N) work, O(N²) times.
We found some related problems being discussed online elsewhere: Find 3 numbers in an array adding to a given sum is solvable in N² time, but requires that the sum be fixed ahead of time; adapting the same algorithm but iterating through each possible sum leaves us at N³ as always.
Another related problem seems to be Find all triplets in array with sum less than or equal to given sum, but I'm not sure how much of the stuff there is relevant here: an inequality rather than an equality mixes things up quite a bit, and of course the target is fixed rather than varying.
So, what are we missing? Is the problem impossible after all, given the performance requirements? Or is there a clever algorithm we're unable to spot?
(1) Actually the problem as posed is to find all such (a,b,c,d) tuples, and return a count of how many there are. But I think even finding a single one of them in the required time constraints is hard enough.
If the algorithm had to list the solutions (i.e. the sets of a, b, c and d that satisfy the condition), the worst-case time complexity would be O(n⁴):
1. There can be O(n⁴) solutions
The trivial example is an array with only 0 values in it. Then a, b, c and d have complete freedom as long as they stay in order. This represents O(n⁴) solutions.
But more generally, arrays that follow this pattern have O(n⁴) solutions:
w, w, w, ... x, x, x, ..., y, y, y, ... z, z, z, ....
With just as many occurrences of each, and:
w + x + y = z
However, to only produce the number of solutions, an algorithm can have a better time complexity.
2. Algorithm
This is a slight variation of the already posted algorithm, which does not involve the H factor. It also describes how to handle cases where different configurations lead to the same sums.
Retrieve all pairs and store them in an array X, where each element gets the following information:
a: the smallest index of the two
b: the other index
sum: the value of xs[a] + xs[b]
At the same time also store for each such pair in another array Y, the following:
c: the smallest index of the two
d: the other index
sum: the value of xs[d] - xs[c]
The above operation has a time complexity of O(n²)
Sort both arrays by their elements' sum attribute. In case of equal sum values, the sort order will be determined as follows: for the X array by increasing b; for the Y array by decreasing c. Sorting can be done in O(n² log n) time.
[Edit: I could not prove the earlier claim of O(n²) (unless some assumptions are made that allow for a radix/bucket sorting algorithm, which I will not assume). As noted in comments, in general an array with n² elements can be sorted in O(n² log(n²)) time, which is O(n² log n), but not O(n²).]
Go through both arrays in "tandem" to find pairs of sums that are equal. If that is the case, it needs to be checked that X[i].b < Y[j].c. If so it represents a solution. But there could be many of them, and counting those in an acceptable time needs special care.
Let m = n(n-1)/2, i.e. the number of elements in array X (which is also the size of array Y):
count = 0
i = 0
j = 0
while i < m and j < m:
    if X[i].sum < Y[j].sum:
        i = i + 1
    elif X[i].sum > Y[j].sum:
        j = j + 1
    else:
        # We have a solution. Need to count all others that have the same sums in X and Y.
        # Find the last match in Y and set k as an index to it:
        countY = 0
        while j < m and X[i].sum == Y[j].sum and X[i].b < Y[j].c:
            countY = countY + 1
            j = j + 1
        k = j - 1
        # add chunks to `count`:
        while i < m and countY >= 0 and X[i].sum == Y[k].sum:
            while countY >= 0 and X[i].b >= Y[k].c:
                countY = countY - 1
                k = k - 1
            count = count + countY
            i = i + 1
Note that although there are nested loops, the variable i only ever increments, and so does j. The variable k always decrements in the innermost loop. Although it also gets higher values to start from, it can never address the same Y element more than a constant number of times via the k index, because while decrementing this index, it stays within the "same sum" range of Y.
So this means that this last part of the algorithm runs in O(m), which is O(n²). As my latest edit confirmed that the sorting step is not O(n²), that step determines the overall time complexity: O(n² log n).
So one solution can be:
List all possible x[a] + x[b] values such that a < b, and hash them in this fashion:
key = (x[a]+x[b]) and value = (a,b).
Complexity of this step - O(n^2)
Now list all possible x[d] - x[c] values such that d > c. For each x[d] - x[c], query your hash map for that value. We have a solution if any hit contains an entry with b < c.
Complexity of this step - O(n^2) * H.
Where H is the search time in your hashmap.
Total complexity - O(n^2) * H. Now H may be O(1); this can be the case if the range of values in the array is small. Also, the choice of hash function would depend on the properties of the elements in the array.
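A sketch of this idea in C++, using std::unordered_multimap as the hash container; the helper name hasQuadruple is just illustrative, and as noted the overall bound depends on the hash (the H factor) behaving well:
#include <unordered_map>
#include <utility>
#include <vector>

// Sketch of the hashing idea: returns true if indices a < b < c < d exist with
// xs[a] + xs[b] == xs[d] - xs[c]. Buckets with many equal sums can degrade the
// worst case, so the O(n^2) bound relies on the hash behaving well.
bool hasQuadruple(const std::vector<int>& xs) {
    const int n = static_cast<int>(xs.size());
    std::unordered_multimap<long long, std::pair<int, int>> sums;   // sum -> (a, b)
    for (int a = 0; a < n; ++a)
        for (int b = a + 1; b < n; ++b)
            sums.emplace(static_cast<long long>(xs[a]) + xs[b], std::make_pair(a, b));

    for (int c = 0; c < n; ++c)
        for (int d = c + 1; d < n; ++d) {
            const long long diff = static_cast<long long>(xs[d]) - xs[c];
            auto range = sums.equal_range(diff);
            for (auto it = range.first; it != range.second; ++it)
                if (it->second.second < c)          // need b < c, so a < b < c < d
                    return true;
        }
    return false;
}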

Randomly permuting an array [duplicate]

The famous Fisher-Yates shuffle algorithm can be used to randomly permute an array A of length N:
For k = 1 to N
Pick a random integer j from k to N
Swap A[k] and A[j]
A common mistake that I've been told over and over again not to make is this:
For k = 1 to N
Pick a random integer j from 1 to N
Swap A[k] and A[j]
That is, instead of picking a random integer from k to N, you pick a random integer from 1 to N.
What happens if you make this mistake? I know that the resulting permutation isn't uniformly distributed, but I don't know what guarantees there are on what the resulting distribution will be. In particular, does anyone have an expression for the probability distributions over the final positions of the elements?
An Empirical Approach.
Let's implement the erroneous algorithm in Mathematica:
p = 10; (* Range *)
s = {}
For[l = 1, l <= 30000, l++, (*Iterations*)
a = Range[p];
For[k = 1, k <= p, k++,
i = RandomInteger[{1, p}];
temp = a[[k]];
a[[k]] = a[[i]];
a[[i]] = temp
];
AppendTo[s, a];
]
Now get the number of times each integer is in each position:
r = SortBy[#, #[[1]] &] & /@ Tally /@ Transpose[s]
Let's take three positions in the resulting arrays and plot the frequency distribution for each integer in that position: position 1, position 5 (the middle) and position 10 (the last). (The plots, including one with the distributions for all positions together and one with better statistics over 8 positions, are omitted here.)
Some observations:
1. For all positions the probability of "1" is the same (1/n).
2. The probability matrix is symmetrical with respect to the big anti-diagonal.
3. So, the probability for any number in the last position is also uniform (1/n).
You may visualize those properties by looking at all lines starting from the same point (first property) and at the last horizontal line (third property).
The second property can be seen from a matrix representation of the experiment, where the rows are the positions, the columns are the occupant numbers, and the color represents the experimental probability (the plot for a 100x100 matrix is omitted here).
Edit
Just for fun, I calculated the exact formula for the second diagonal element (the first is 1/n). The rest can be done, but it's a lot of work.
h[n_] := (n-1)/n^2 + (n-1)^(n-2) n^(-n)
Values verified from n=3 to 6 ( {8/27, 57/256, 564/3125, 7105/46656} )
Edit
Working out the general explicit calculation in @wnoise's answer a little, we can get a little more info.
Replacing 1/n by p[n], so that the calculations are held unevaluated, we get for example the following structure for the first part of the matrix with n=7 (image omitted here).
Comparing with results for other values of n lets us identify some known integer sequences in the matrix:
{{ 1/n, 1/n , ...},
{... .., A007318, ....},
{... .., ... ..., ..},
... ....,
{A129687, ... ... ... ... ... ... ..},
{A131084, A028326 ... ... ... ... ..},
{A028326, A131084 , A129687 ... ....}}
You may find those sequences (in some cases with different signs) in the wonderful http://oeis.org/
Solving the general problem is more difficult, but I hope this is a start
The "common mistake" you mention is shuffling by random transpositions. This problem was studied in full detail by Diaconis and Shahshahani in Generating a random permutation with random transpositions (1981). They do a complete analysis of stopping times and convergence to uniformity. If you cannot get a link to the paper, then please send me an e-mail and I can forward you a copy. It's actually a fun read (as are most of Persi Diaconis's papers).
If the array has repeated entries, then the problem is slightly different. As a shameless plug, this more general problem is addressed by myself, Diaconis and Soundararajan in Appendix B of A Rule of Thumb for Riffle Shuffling (2011).
Let's say
a = 1/N
b = 1-a
B_i(k) is the probability (row) vector after i swaps for the k-th element, i.e. the answer to the question "where is k after i swaps?". For example B_0(3) = (0 0 1 0 ... 0) and B_1(3) = (a 0 b 0 ... 0). What you want is B_N(k) for every k.
K_i is an NxN matrix with 1s in the i-th column and i-th row, zeroes everywhere else, e.g. for N=4 and i=2:
K_2 = | 0 1 0 0 |
      | 1 1 1 1 |
      | 0 1 0 0 |
      | 0 1 0 0 |
I_i is the identity matrix but with the element x=y=i zeroed, e.g. for i=2:
I_2 = | 1 0 0 0 |
      | 0 0 0 0 |
      | 0 0 1 0 |
      | 0 0 0 1 |
A_i is then
A_i = a*K_i + b*I_i
Then,
B_i(k) = B_{i-1}(k) * A_i
But because the vectors B_0(k), k = 1..N, together form the identity matrix, the probability that any given element i will at the end be at position j is given by the matrix element (i,j) of the matrix
A_1 * A_2 * ... * A_N
For example, this can be evaluated exactly for N=4, and shown as a diagram for N = 500 (color levels are 100*probability); the images are omitted here.
The pattern is the same for all N>2:
The most probable ending position for k-th element is k-1.
The least probable ending position is k for k < N*ln(2), position 1 otherwise
I knew I had seen this question before...
" why does this simple shuffle algorithm produce biased results? what is a simple reason? " has a lot of good stuff in the answers, especially a link to a blog by Jeff Atwood on Coding Horror.
As you may have already guessed, based on the answer by #belisarius, the exact distribution is highly dependent on the number of elements to be shuffled. Here's Atwood's plot for a 6-element deck:
What a lovely question! I wish I had a full answer.
Fisher-Yates is nice to analyze because once it decides on the first element, it leaves it alone. The biased one can repeatedly swap an element in and out of any place.
We can analyze this the same way we would a Markov chain, by describing the actions as stochastic transition matrices acting linearly on probability distributions. Most elements get left alone, so the diagonal is usually (n-1)/n. On pass k, when they don't get left alone, they get swapped with element k (or a random element if they are element k). This is 1/n in either row or column k. The element in both row and column k is also 1/n. It's easy enough to multiply these matrices together for k going from 1 to n.
We do know that the element in last place will be equally likely to have originally been anywhere, because the last pass swaps the last place equally likely with any other. Similarly, the first element will be equally likely to be placed anywhere. This symmetry is because the transpose reverses the order of matrix multiplication. In fact, the matrix is symmetric in the sense that row i is the same as column (n+1 - i). Beyond that, the numbers don't show much apparent pattern. These exact solutions do show agreement with the simulations run by belisarius: in slot i, the probability of getting j decreases as j rises toward i, reaching its lowest value at i-1, then jumps up to its highest value at i, and decreases again until j reaches n.
In Mathematica I generated each step with
step[k_, n_] := Normal[SparseArray[{{k, i_} -> 1/n,
{j_, k} -> 1/n, {i_, i_} -> (n - 1)/n} , {n, n}]]
(I haven't found it documented anywhere, but the first matching rule is used.)
The final transition matrix can be calculated with:
Fold[Dot, IdentityMatrix[n], Table[step[m, n], {m, s}]]
ListDensityPlot is a useful visualization tool.
Edit (by belisarius)
Just a confirmation. The following code gives the same matrix as in @Eelvex's answer:
step[k_, n_] := Normal[SparseArray[{{k, i_} -> (1/n),
{j_, k} -> (1/n), {i_, i_} -> ((n - 1)/n)}, {n, n}]];
r[n_, s_] := Fold[Dot, IdentityMatrix[n], Table[step[m, n], {m, s}]];
Last@Table[r[4, i], {i, 1, 4}] // MatrixForm
Wikipedia's page on the Fisher-Yates shuffle has a description and example of exactly what will happen in that case.
You can compute the distribution using stochastic matrices. Let the matrix A(i,j) describe the probability of the card originally at position i ending up in position j. Then the k-th swap has a matrix A_k given by A_k(i,j) = 1/N if i == k or j == k (the card in position k can end up anywhere, and any card can end up at position k with equal probability), A_k(i,i) = (N - 1)/N for all i != k (every other card will stay in the same place with probability (N-1)/N), and all other elements zero.
The result of the complete shuffle is then given by the product of the matrices A_N ... A_1.
I expect you're looking for an algebraic description of the probabilities; you can get one by expanding out the above matrix product, but I imagine it will be fairly complex!
UPDATE: I just spotted wnoise's equivalent answer above! oops...
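For reference, here is a small C++ sketch that expands that matrix product numerically, mirroring the Mathematica computation above (the function name and the N=4 example in main are illustrative):
#include <iostream>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Sketch: exact distribution of the biased shuffle by multiplying the per-swap
// stochastic matrices described above. result[i][j] is the probability that the
// card starting at position i ends at position j (0-based positions).
Matrix biasedShuffleDistribution(int N) {
    Matrix result(N, std::vector<double>(N, 0.0));
    for (int i = 0; i < N; ++i) result[i][i] = 1.0;      // identity to start

    for (int k = 0; k < N; ++k) {                        // k-th swap
        Matrix A(N, std::vector<double>(N, 0.0));
        for (int i = 0; i < N; ++i)
            for (int j = 0; j < N; ++j) {
                if (i == k || j == k) A[i][j] = 1.0 / N;          // row/column k
                else if (i == j)      A[i][j] = (N - 1.0) / N;    // untouched card
            }
        Matrix next(N, std::vector<double>(N, 0.0));     // result = result * A
        for (int i = 0; i < N; ++i)
            for (int j = 0; j < N; ++j)
                for (int l = 0; l < N; ++l)
                    next[i][j] += result[i][l] * A[l][j];
        result = std::move(next);
    }
    return result;
}

int main() {
    for (const auto& row : biasedShuffleDistribution(4)) {
        for (double p : row) std::cout << p << ' ';
        std::cout << '\n';
    }
}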
I've looked into this further, and it turns out that this distribution has been studied at length. The reason it's of interest is because this "broken" algorithm is (or was) used in the RSA chip system.
In Shuffling by semi-random transpositions, Elchanan Mossel, Yuval Peres, and Alistair Sinclair study this and a more general class of shuffles. The upshot of that paper appears to be that it takes log(n) broken shuffles to achieve near random distribution.
In The bias of three pseudorandom shuffles (Aequationes Mathematicae, 22, 1981, 268-292), Ethan Bolker and David Robbins analyze this shuffle and determine that the total variation distance to uniformity after a single pass is 1, indicating that it is not very random at all. They give asymptotic analyses as well.
Finally, Laurent Saloff-Coste and Jessica Zuniga found a nice upper bound in their study of inhomogeneous Markov chains.
This question is begging for an interactive visual matrix diagram analysis of the broken shuffle mentioned. Such a tool is on the page Will It Shuffle? - Why random comparators are bad by Mike Bostock.
Bostock has put together an excellent tool that analyzes random comparators. In the dropdown on that page, choose naïve swap (random ↦ random) to see the broken algorithm and the pattern it produces.
His page is informative as it allows one to see the immediate effects a change in logic has on the shuffled data. For example:
This matrix diagram using a non-uniform and very-biased shuffle is produced using a naïve swap (we pick from "1 to N") with code like this:
function shuffle(array) {
var n = array.length, i = -1, j, t;
while (++i < n) {
j = Math.floor(Math.random() * n);
t = array[j];
array[j] = array[i];
array[i] = t;
}
}
But if we implement a non-biased shuffle, where we pick from "k to N" we should see a diagram like this:
where the distribution is uniform, and is produced from code such as:
function FisherYatesDurstenfeldKnuthshuffle( array ) {
var pickIndex, arrayPosition = array.length;
while( --arrayPosition ) {
pickIndex = Math.floor( Math.random() * ( arrayPosition + 1 ) );
array[ pickIndex ] = [ array[ arrayPosition ], array[ arrayPosition ] = array[ pickIndex ] ][ 0 ];
}
}
The excellent answers given so far concentrate on the distribution, but you also asked "What happens if you make this mistake?", which I haven't seen answered yet, so I'll give an explanation of this:
The Knuth-Fisher-Yates shuffle algorithm picks 1 out of n elements, then 1 out of n-1 remaining elements and so forth.
You can implement it with two arrays a1 and a2, where you remove one element from a1 and insert it into a2; the algorithm does it in place (which means it needs only one array), as is explained very well here (Google: "Shuffling Algorithms Fisher-Yates DataGenetics").
If you don't remove the elements, they can be randomly chosen again, which produces the biased randomness. This is exactly what the 2nd example you are describing does. The first example, the Knuth-Fisher-Yates algorithm, uses a cursor variable running from k to N, which remembers which elements have already been taken, hence avoiding picking elements more than once.
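Here is a small C++ sketch of that two-array view (the name shuffleByRemoval is illustrative); removing the chosen element is exactly what prevents it from being picked again:
#include <random>
#include <vector>

// Sketch of the "two array" view described above: repeatedly remove a random
// element from a1 and append it to a2. Equivalent to Fisher-Yates, just not in place.
std::vector<int> shuffleByRemoval(std::vector<int> a1, std::mt19937& rng) {
    std::vector<int> a2;
    a2.reserve(a1.size());
    while (!a1.empty()) {
        std::uniform_int_distribution<std::size_t> pick(0, a1.size() - 1);
        const std::size_t idx = pick(rng);
        a2.push_back(a1[idx]);          // the chosen element can never be picked again
        a1[idx] = a1.back();            // remove it in O(1); order of a1 doesn't matter
        a1.pop_back();
    }
    return a2;
}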

What sort of indexing method can I use to store the distances between X^2 vectors in an array without redundancy?

I'm working on a demo that requires a lot of vector math, and in profiling, I've found that it spends the most time finding the distances between given vectors.
Right now, it loops through an array of X^2 vectors, and finds the distance between each one, meaning it runs the distance function X^4 times, even though (I think) there are only (X^2)/2 unique distances.
It works something like this: (pseudo c)
#define MATRIX_WIDTH 8
typedef float vec2_t[2];
vec2_t matrix[MATRIX_WIDTH * MATRIX_WIDTH];
...
for(int i = 0; i < MATRIX_WIDTH; i++)
{
for(int j = 0; j < MATRIX_WIDTH; j++)
{
float xd, yd;
float distance;
for(int k = 0; k < MATRIX_WIDTH; k++)
{
for(int l = 0; l < MATRIX_WIDTH; l++)
{
int index_a = (i * MATRIX_WIDTH) + j;
int index_b = (k * MATRIX_WIDTH) + l;
xd = matrix[index_a][0] - matrix[index_b][0];
yd = matrix[index_a][1] - matrix[index_b][1];
distance = sqrtf(powf(xd, 2) + powf(yd, 2));
}
}
// More code that uses the distances between each vector
}
}
What I'd like to do is create and populate an array of (X^2) / 2 distances without redundancy, then reference that array when I finally need it. However, I'm drawing a blank on how to index this array in a way that would work. A hash table would do it, but I think it's much too complicated and slow for a problem that seems like it could be solved by a clever indexing method.
EDIT: This is for a flocking simulation.
performance ideas:
a) if possible work with the squared distance, to avoid root calculation
b) never use pow for constant, integer powers - instead use xd*xd
I would consider changing your algorithm - O(n^4) is really bad. When dealing with interactions in physics (also O(n^4) for distances in 2d field) one would implement b-trees etc and neglect particle interactions with a low impact. But it will depend on what "more code that uses the distance..." really does.
Just some considerations: the number of unique distances is 0.5*n*(n+1), with n = w*h.
If you write down when unique distances occur, you will see that both inner loops can be reduced, by starting at i and j.
Additionally if you only need to access those distances via the matrix index, you can set up a 4D-distance matrix.
If memory is limited, we can save nearly 50%, as mentioned above, with a lookup function that accesses a triangular matrix, as Code-Guru said. We would probably precalculate the line indices to avoid summing them up on access:
float distanceArray[(H*W+1)*H*W/2];
int lineIndices[H*W]; /* lineIndices[j] = j*(j+1)/2 */

float searchDistance(int i, int j)
{
    return i<j ? distanceArray[i+lineIndices[j]] : distanceArray[j+lineIndices[i]];
}
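A sketch of the precomputation that goes with that lookup, written in C++ for brevity; the names and the formula lineIndices[j] = j*(j+1)/2 are illustrative of one way to lay out the triangular table:
#include <array>
#include <cmath>
#include <cstddef>
#include <vector>

// Sketch: precompute the lower-triangular distance table once per frame,
// then answer lookups in O(1). n = MATRIX_WIDTH * MATRIX_WIDTH vectors.
struct DistanceTable {
    int n;
    std::vector<int> lineIndices;       // lineIndices[j] = j*(j+1)/2
    std::vector<float> distance;        // n*(n+1)/2 entries, one per pair i <= j

    explicit DistanceTable(const std::vector<std::array<float, 2>>& pts)
        : n(static_cast<int>(pts.size())),
          lineIndices(n),
          distance(static_cast<std::size_t>(n) * (n + 1) / 2) {
        for (int j = 0; j < n; ++j) {
            lineIndices[j] = j * (j + 1) / 2;
            for (int i = 0; i <= j; ++i) {
                const float xd = pts[i][0] - pts[j][0];
                const float yd = pts[i][1] - pts[j][1];
                // xd*xd instead of powf, as suggested above; keep sqrtf only if
                // the true distance (not the squared one) is really needed.
                distance[lineIndices[j] + i] = std::sqrt(xd * xd + yd * yd);
            }
        }
    }

    float operator()(int i, int j) const {
        return i < j ? distance[i + lineIndices[j]]
                     : distance[j + lineIndices[i]];
    }
};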
