Random permutation with a[i] != i

Usually a random permutation for an array with n elements means a uniform distribution from n! possibilities, and the Knuth shuffle is used to do so:
for i from n − 1 downto 1 do
    j ← random integer with 0 ≤ j ≤ i
    exchange a[j] and a[i]
But with the constraint that a[i] != i, I have no idea how to form such a permutation uniformly.
For example, with n = 3, how to form a permutation randomly from the possibilities below?
{1, 2, 0}, {2, 0, 1}

A permutation without fixed points is called a derangement.
Since the number of derangements is Θ(n!) (roughly n!/e of all permutations), generating random permutations and discarding those that are not derangements won't hurt your performance: on average only about e ≈ 2.72 attempts are needed.
A quick search turned up these slides, which describe another algorithm.

You didn't state how large your arrays are or how concerned you are with efficiency. One possible solution would simply be to do the Knuth shuffle, then test to see if your constraint is satisfied and redo the shuffle if not.
If you want a bit better efficiency, you could try this instead. Because i is decreasing, after the step exchange a[j] and a[i], a[i] is fixed. So simply modify the algorithm to:
for i from n − 1 downto 1 do
    j ← random integer with 0 ≤ j ≤ i; repeat until a[j] != i
    exchange a[j] and a[i]
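For reference, here is a minimal C sketch of the shuffle-and-retest idea (helper names and the use of rand() are just for illustration). I sketch the retest approach rather than the modified loop, because that loop never checks position 0 (a[0] = 0 can still slip through) and does not produce every derangement equally often:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static void knuth_shuffle(int *a, int n) {
    for (int i = n - 1; i >= 1; i--) {
        int j = rand() % (i + 1);          /* random integer with 0 <= j <= i */
        int t = a[j]; a[j] = a[i]; a[i] = t;
    }
}

/* Fill a[0..n-1] with a uniformly random derangement by rejection sampling.
   Roughly e (~2.72) shuffles are needed on average for large n. */
void random_derangement(int *a, int n) {
    int ok;
    do {
        for (int i = 0; i < n; i++) a[i] = i;
        knuth_shuffle(a, n);
        ok = 1;
        for (int i = 0; i < n; i++)
            if (a[i] == i) { ok = 0; break; }
    } while (!ok);
}

int main(void) {
    srand((unsigned)time(NULL));
    int a[3];
    random_derangement(a, 3);              /* prints either {1, 2, 0} or {2, 0, 1} */
    printf("{%d, %d, %d}\n", a[0], a[1], a[2]);
    return 0;
}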

Related

Finding the probability that two items are compared. (hints please)

I'm attempting to solve the following problem (from Prof. Jeff Erickson's notes). The algorithm below takes an unsorted array A and returns the k-th smallest element in the array; Partition does what its name implies via the standard quicksort method, given the pivot returned by Random (which is assumed to return a uniformly random integer between 1 and n in linear time), and returns the new index of the pivot. We are to find the exact probability that this algorithm compares the i-th smallest and j-th smallest elements of the input array.
QuickSelect(A[1..n], k):
    r <-- Partition(A[1..n], Random(n))
    if k < r:
        return QuickSelect(A[1..r-1], k)
    else if k > r:
        return QuickSelect(A[r+1..n], k-r)
    else:
        return A[k]
Now, I can see that the probability of the first if statement being true is (n-k)/n, the probability of the second block being true is (k-1)/n, and the probability of executing the else statement is 1/n. I also know that (assuming i < j) the probability of i < r < j is (j-i-1)/n which guarantees that the two elements are never compared. On the other hand, if i==r or j==r, then i and j are guaranteed to be compared. The part that really trips me up is what happens if r < i or j < r, because whether or not i and j are compared depends on the value of k (whether or not we are able to recursively call QuickSelect).
Any hints and/or suggestions would be greatly appreciated. This is for homework, so I would rather not have full solutions given to me so that I may actually learn a bit. Thanks in advance!
As has already been mentioned, the Monte Carlo method is a simple (in terms of implementation effort) way to get a fast approximation.
There is also a way to compute the exact probability using dynamic programming.
Here we will assume that all elements in the array are distinct and that A[i] < A[j].
Let P(i, j, k, n) denote the probability that the i-th and j-th smallest elements are compared while selecting the k-th smallest in an n-element array.
The pivot index r is equally likely to be any of 1..n, each with probability 1/n. Also note that these events are disjoint and together cover the whole sample space.
Let us look carefully at each possible value of r.
If r = 1..i-1 then i and j fall into the same part and the probability of their comparison is P(i-r, j-r, k-r, n-r) if k > r and 0 otherwise.
If r = i the probability is 1.
If r = i+1..j-1 the probability is 0.
If r = j the probability is 1 and if r = j+1..n the probability is P(i, j, k, r-1) if k < r and 0 otherwise.
So the full recurrence is P(i, j, k, n) = 1/n * (2 + sum over r = 1..min(i, k)-1 of P(i-r, j-r, k-r, n-r) + sum over r = max(j, k)+1..n of P(i, j, k, r-1)).
Finally, the base case is n = 2 (the smallest n for which i and j can differ): the only possible values are P(1, 2, 1, 2) and P(1, 2, 2, 2), and both equal 1 (no matter what r is, there will be a comparison).
The time complexity is O(n^5) and the space complexity is O(n^4). It is also possible to optimize the calculation and bring the time complexity down to O(n^4). Moreover, since we only consider A[i] < A[j] and i, j, k <= n, the multiplicative constant is about 1/8. So it is possible to compute any value for n up to 100 in a couple of minutes using the straightforward algorithm described, or up to 300 with the optimized one.
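A minimal C sketch of this recurrence (plain recursion, without the memoization on (i, j, k, n) you would need to reach the complexities above; the function name is just for illustration):

#include <stdio.h>

/* P(i, j, k, n): probability that the i-th and j-th smallest elements (i < j)
   are compared while selecting the k-th smallest in an n-element array. */
double P(int i, int j, int k, int n) {
    if (n == 2)
        return 1.0;                          /* base case: the two elements are always compared */
    double s = 2.0;                          /* r = i and r = j each contribute probability 1 */
    for (int r = 1; r < i && r < k; r++)     /* r = 1..min(i, k)-1: both land in the right part */
        s += P(i - r, j - r, k - r, n - r);
    int start = (j > k ? j : k) + 1;
    for (int r = start; r <= n; r++)         /* r = max(j, k)+1..n: both land in the left part */
        s += P(i, j, k, r - 1);
    return s / n;
}

int main(void) {
    printf("P(1,2,1,2) = %f\n", P(1, 2, 1, 2));   /* 1.000000 */
    printf("P(2,3,1,4) = %f\n", P(2, 3, 1, 4));   /* 0.666667 */
    return 0;
}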
Note that two positions are only compared if one of them is the pivot. So the best way to look at this is to look at the sequence of chosen pivots.
Suppose the k-th smallest element is between i and j. Then i and j are not compared if and only if an element between them is selected as a pivot before i or j are. What is the probability that this happens?
Now suppose the k-th smallest element is after j. i and j are not compared if and only if an element between i+1 and k (excluding j) is selected as a pivot before i or j are. What is the probability that this happens?

Rearrange array with sum and length n while minimizing distance change

An array A[n] (length n) of nonnegative integers is given. The array holds n objects in total: A[i] is the number of objects in slot i, and the sum of A[i] from i=0 to n-1 is n. We must rearrange the objects so that every element equals 1. We are given that n <= 10^5. (So O(n^2) is too slow, but O(n log n) is fine.)
Rearranging works as follows: if A[i] = k and A[j] = h, then we can decrease A[i] by some positive integer m <= k and increase A[j] by m correspondingly, so that A[i] = k-m and A[j] = h+m. However, each rearrange has a cost Cost(m, i, j) = m * d(i, j)^2, proportional to the square of the distance of the rearrange. The distance function d(i, j) is ordinary subtraction, except that it can wrap around the array: with n = 7, d(1, 4) = 3, but d(0, 6) = 1 and d(1, 5) = 3, etc. That is, we can think of the array as circular.
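For concreteness, the wrap-around distance I mean could be written like this (the function name is just for illustration):

int d(int i, int j, int n) {
    int diff = i > j ? i - j : j - i;          /* |i - j| */
    return diff < n - diff ? diff : n - diff;  /* take the shorter way around the circle */
}
/* With n = 7: d(1,4,7) == 3, d(0,6,7) == 1, d(1,5,7) == 3, matching the examples above. */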
The total cost is given by the sum of the cost function over all rearranges, and our goal is to find the minimum value of the cost function such that A[i] = 1 for all i, i.e. all the elements are equal to 1. Each object can only be rearranged once, so for example if A = [5,0,0,0,0] we can't just move an object from A[0] to A[1] and then to A[2] to circumvent the squaring of the distance.
Help on an algorithm or pseudocode/code to solve this problem would be appreciated.

Finding continuous subsequence that minimizes the average of the rest of the array?

Suppose there's an integer array arr[0..n-1]. Find a contiguous subsequence sub[i..j] (with i > 0 and j < n - 1) whose removal leaves the rest of the array with the smallest possible average.
Example:
arr[5] = {5,1,7,8,2};
Remove {7,8} and the array becomes {5, 1, 2}, which has average 2.67 (the smallest possible).
I thought this was a modification of the Longest Increasing Subsequence problem but couldn't figure it out.
Thanks,
Let's find the average value using binary search.
Suppose, that sum of all elements is S.
For a given x, let's check whether there exist i and j such that the average of all elements except those from i to j is less than or equal to x.
To do that, subtract x from every element of arr. Now we need to check whether there exist i and j such that the sum of all elements except those from i to j is less than or equal to zero. Let S' = S - x * n be the sum of all elements in the shifted array. Then we want to find i and j such that the sum from i to j is greater than or equal to S'. For that, it is enough to find the subarray with the largest sum, which can be done with Jay Kadane's elegant algorithm: https://en.wikipedia.org/wiki/Maximum_subarray_problem
When do we terminate the binary search? When S' minus the maximum subarray sum is zero (or close enough), i.e. when the desired precision is reached.
Time complexity: O(n log w), where w is the required precision of the binary search.
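A rough C sketch of this approach (helper names are mine; the removed block is restricted to indices 1..n-2 to match the i > 0, j < n-1 constraint in the question):

#include <stdio.h>
#include <stdlib.h>

/* Maximum sum of a non-empty contiguous subarray of b[0..m-1] (Kadane's algorithm). */
static double max_subarray(const double *b, int m) {
    double best = b[0], cur = b[0];
    for (int i = 1; i < m; i++) {
        cur = (cur > 0 ? cur : 0) + b[i];
        if (cur > best) best = cur;
    }
    return best;
}

/* Smallest achievable average of the elements left after removing one inner block. */
double min_rest_average(const int *arr, int n) {
    double S = 0, lo = arr[0], hi = arr[0];
    double *b = malloc(n * sizeof *b);
    for (int i = 0; i < n; i++) {
        S += arr[i];
        if (arr[i] < lo) lo = arr[i];
        if (arr[i] > hi) hi = arr[i];
    }
    for (int it = 0; it < 60; it++) {                 /* fixed iteration count = precision */
        double x = (lo + hi) / 2;
        for (int i = 0; i < n; i++) b[i] = arr[i] - x;
        double Sprime = S - x * n;
        /* feasible iff some inner block i..j has sum >= S' in the shifted array */
        if (max_subarray(b + 1, n - 2) >= Sprime) hi = x; else lo = x;
    }
    free(b);
    return hi;
}

int main(void) {
    int arr[] = {5, 1, 7, 8, 2};
    printf("%.4f\n", min_rest_average(arr, 5));       /* about 2.6667, as in the example */
    return 0;
}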

Array of random numbers with sum in given range?

In C, how do I get an array of n numbers (each 0x00-0xFF in my case), of which the sum is within a given range 0..k?
The almost-duplicate question "C++ multiple random numbers adding up to equal a certain number" targets a specific sum, but in my case the sum can be anything in 0..k.
You need to specify what is the desired distribution of the random numbers.
If there are no further requirements, I would suggest one of the following:
(1)
pick random number a[1] in interval 0 .. k
pick random number a[2] in interval 0 .. k-a[1]
pick random number a[3] in interval 0 .. k-a[1]-a[2]
...
pick random number a[n] in interval 0 .. k-a[1]-a[2]-...-a[n-1]
If there is an upper limit m on the range of each number, use min(k - a[1] - ... - a[i-1], m) as the upper bound of the interval.
Disadvantages: you will get a lot of small numbers and just a few big ones.
(2)
pick n random numbers a[1], .., a[n] in interval 0 .. m, m being the upper limit
s = a[1]+a[2]+...+a[n]
multiply each a[i] by k/s (if integers are required, round down)
Disadvantages: It is unlikely to get large numbers this way. If integers are required, there will likely be a gap between the sum of numbers and k due to rounding error.
I think you get "nicer" numbers with option (2) but as stated above, it depends on the requirements.
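A rough C sketch of option (1), with the 0x00-0xFF cap from the question folded into the upper bound (randbetween is just an illustrative helper, not a standard function):

#include <stdlib.h>

/* Uniform integer in [lo, hi]; a tiny helper (modulo bias ignored for brevity). */
static int randbetween(int lo, int hi) {
    return lo + rand() % (hi - lo + 1);
}

/* Option (1): each element is drawn from whatever is left of the budget k,
   additionally capped at 255 as the question requires. */
void fill_with_bounded_sum(unsigned char *a, int n, int k) {
    int remaining = k;
    for (int i = 0; i < n; i++) {
        int upper = remaining < 255 ? remaining : 255;
        a[i] = (unsigned char)randbetween(0, upper);
        remaining -= a[i];
    }
}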
Assuming k is less than 255 * n, one solution is to assign k / n to every element of the array and then randomly subtract a value from each element:
// for (int i = 0; i < n; i++) array[i] = k / n;
// for (int i = 0; i < n; i++) array[i] -= randbetween(0, array[i]);
which collapses into the single loop:
for (int i = 0; i < n; i++) array[i] = randbetween(0, k / n);
This has an expected sum of k / 2. By tweaking the randbetween() function you can change the distribution of the resulting array sum.
It is easy to create one number within range [0, 255].
It is easy to see that if k > 255*n or k < 0 there is no solution.
If 0 <= k <= 255*n, a solution exists. Here we only discuss the case n > 1.
Suppose you have created n-1 random numbers with sum s1, and the n-th number is x. We need s1 + x <= k while allowing x to be anywhere in [0, 255]. If the n-1 numbers are all within the range [0, a], then s1 can be as large as (n-1)*a, so we need (n-1)*a + 255 <= k, i.e. a <= (k-255)/(n-1).
If k > 255, just let a = (k-255)/(n-1). Then s1 lies in [0, k-255], and the n-th number x can be any random number within [0, 255].
So the solution is: randomly select n-1 numbers, each within [0, (k-255)/(n-1)] (note that (k-255)/(n-1) <= 255, so this respects your range condition), and select one more random number within [0, 255].
If k <= 255, randomly select n numbers, each within [0, k/n] (and k/n is within [0, 255]).
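A small C sketch of this recipe (randbetween is the same illustrative helper as above):

#include <stdlib.h>

static int randbetween(int lo, int hi) {         /* uniform integer in [lo, hi] */
    return lo + rand() % (hi - lo + 1);
}

/* Assumes 0 <= k <= 255*n and n > 1; the resulting sum always lies in 0..k. */
void fill_array(unsigned char *a, int n, int k) {
    if (k > 255) {
        int bound = (k - 255) / (n - 1);         /* <= 255 because k <= 255*n */
        for (int i = 0; i < n - 1; i++)
            a[i] = (unsigned char)randbetween(0, bound);
        a[n - 1] = (unsigned char)randbetween(0, 255);
    } else {
        for (int i = 0; i < n; i++)
            a[i] = (unsigned char)randbetween(0, k / n);
    }
}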

Given two arrays find the index k that minimizes the sum A[i]*|B[i]-B[k]|

I am given two arrays A and B that contain natural numbers (both of the same length n), and I need to find the index k that minimizes the sum of A[i] * |B[i] - B[k]| over i = 0..n-1.
It's obviously easy to do in O(n^2): just calculate the sum for every k between 0 and n-1. But I need a better time complexity.
Any ideas? Thanks!
You can do this in time O(n log n) by first sorting both arrays based on the values in B, and then performing a single scan.
Once the arrays are sorted, B[i] >= B[k] if i > k and B[i] <= B[k] if i <= k, so the sum can be rewritten as:
sum A[i] * |B[i] - B[k]|
    = sum_{i=k..n-1} A[i] * (B[i] - B[k]) + sum_{i=0..k-1} A[i] * (B[k] - B[i])
    = sum_{i=k..n-1} A[i]*B[i] - B[k] * sum_{i=k..n-1} A[i]
      + B[k] * sum_{i=0..k-1} A[i] - sum_{i=0..k-1} A[i]*B[i]
You can precalculate all of the sums in time O(n) which then lets you evaluate the target sum at every position in O(n) and select the value for k which gives the best score.
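A compact C sketch of this evaluation (it assumes the pairs have already been sorted by B; the sorting step and the mapping back to the original indices are omitted):

#include <limits.h>

/* Returns the k (index in the sorted order) minimizing sum A[i] * |B[i] - B[k]|. */
int best_k_prefix(const long long *A, const long long *B, int n) {
    long long sumA = 0, sumAB = 0;                 /* totals over the whole array */
    for (int i = 0; i < n; i++) { sumA += A[i]; sumAB += A[i] * B[i]; }

    long long leftA = 0, leftAB = 0;               /* running sums over i = 0..k-1 */
    long long bestVal = LLONG_MAX;
    int bestK = 0;
    for (int k = 0; k < n; k++) {
        long long rightA = sumA - leftA;           /* sums over i = k..n-1 (the i = k term cancels) */
        long long rightAB = sumAB - leftAB;
        long long val = (rightAB - B[k] * rightA) + (B[k] * leftA - leftAB);
        if (val < bestVal) { bestVal = val; bestK = k; }
        leftA += A[k];
        leftAB += A[k] * B[k];
    }
    return bestK;
}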
I believe I can do this in O(n log n).
First, sort the B array, applying the same permutation to the A array (and remembering the permutation). This is the O(n log n) part. Since we sum over all i, applying the same permutation to the A and B arrays does not change the minimum.
With a sorted B array, the rest of the algorithm is actually O(n).
For each k, define an array Ck[i] = |B[i] - B[k]|
(Note: We will not actually construct Ck... We will just use it as a concept for easier reasoning.)
Observe that the quantity we are trying to minimize (over k) is the sum of A[i] * Ck[i]. Let's go ahead and give that a name:
Define: Sk = Σ A[i] * Ck[i]
Now, for any particular k, what does Ck look like?
Well, Ck[k] = 0, obviously.
More interestingly, since the B array is sorted, we can get rid of the absolute value signs:
Ck[i] = B[k] - B[i], for 0 <= i < k
Ck[i] = 0, for i = k
Ck[i] = B[i] - B[k], for k < i < n
Let's define two more things.
Definition: Tk = Σ A[i] for 0 <= i < k
Definition: Uk = Σ A[i] for k < i < n
(That is, Tk is the sum of the first k elements of A, and Uk is the sum of the last n-k-1 elements, i.e. all but the first k+1.)
The key observation: Given Sk, Tk, and Uk, we can compute Sk+1, Tk+1, and Uk+1 in constant time. How?
T and U are easy.
The question is, how do we get from Sk to Sk+1?
Consider what happens to Ck when we go to Ck+1. We simply add B[k+1]-B[k] to every element of C from 0 to k, and we subtract the same amount from every element of C from k+1 to n-1 (prove this). That means we just need to add (Tk + A[k]) * (B[k+1] - B[k]) and subtract Uk * (B[k+1] - B[k]) to get from Sk to Sk+1; the extra A[k] accounts for position k itself, whose C value goes from 0 to B[k+1] - B[k].
Algebraically... The first k terms of Sk are just the sum from 0 to k-1 of A[i] * (B[k] - B[i]).
The first k terms of Sk+1 are the sum from 0 to k-1 of A[i] * (B[k+1] - B[i])
The difference between these is the sum, from 0 to k-1, of (A[i] * (B[k+1] - B[i]) - (A[i] * (B[k] - B[i])). Factor out the A[i] terms and cancel the B[i] terms to get the sum from 0 to k-1 of A[i] * (B[k+1] - B[k]), which is just Tk * (B[k+1] - B[k]).
Similarly for the last n-k-1 terms of Sk, and the remaining term, i = k, contributes the extra A[k] * (B[k+1] - B[k]) mentioned above.
Since we can compute S0, T0, and U0 in linear time, and we can go from Sk to Sk+1 in constant time, we can calculate all of the Sk in linear time. So do that, remember the smallest, and you are done.
Use the inverse of the sort permutation to get the k for the original arrays.
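A short C sketch of the rolling update (again assuming, as in the sketch after the previous answer, that A and B are already sorted by B; note the update adds (Tk + A[k]) * (B[k+1] - B[k]), as discussed above):

/* S, T, U hold the running quantities Sk, Tk, Uk; each step updates them in O(1). */
int best_k_rolling(const long long *A, const long long *B, int n) {
    long long S = 0, T = 0, U = 0;                 /* S0, T0, U0 */
    for (int i = 1; i < n; i++) { S += A[i] * (B[i] - B[0]); U += A[i]; }

    long long bestS = S;
    int bestK = 0;
    for (int k = 0; k + 1 < n; k++) {
        long long d = B[k + 1] - B[k];
        S += (T + A[k]) * d - U * d;               /* S_{k+1} from S_k, T_k, U_k */
        T += A[k];                                 /* T_{k+1} */
        U -= A[k + 1];                             /* U_{k+1} */
        if (S < bestS) { bestS = S; bestK = k + 1; }
    }
    return bestK;                                  /* index in the sorted order */
}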
Here is an O(N log N) solution.
Example
A 6 2 5 10 3 8 7
B 1 5 4 3 6 9 7
1) First sort the two arrays in increasing order of B; each element of A stays bound to its element of B.
After sorting, we get
A 6 10 5 2 3 7 8
B 1 3 4 5 6 7 9
Since B is now in order, we have
sum_{i=0..n-1} A[i] * |B[i] - B[k]|
    = sum_{i=0..k-1} A[i] * (B[k] - B[i]) + sum_{i=k+1..n-1} A[i] * (B[i] - B[k])
    = B[k] * (sum_{i=0..k-1} A[i] - sum_{i=k+1..n-1} A[i]) - (sum_{i=0..k-1} A[i]*B[i] - sum_{i=k+1..n-1} A[i]*B[i])
2) We calculate the prefix sums of array A: sumA = 0 6 16 21 23 26 33 41.
With sumA, the sum of A[i] for i = s..e can be calculated in O(1) time for any s and e.
For the same reason, we can calculate the prefix sums of A[i]*B[i].
So for each k, evaluating its value takes only O(1) time.
So the total time complexity is O(N log N) + O(N).
