How do I generate random numbers from an array without repetition?

I know similar questions have been asked before, but bear with me.
I have an array:
int [] arr = {1,2,3,4,5,6,7,8,9};
I want numbers to be generated randomly 10 times. Something like this:
4,6,8,2,4,9,3,8,7
Although some numbers are repeated, no number is generated more than once in a row. So not like this:
7,3,1,8,8,2,4,9,5,6
As you can see, the number 8 was repeated immediately after it was generated. This is not the desired effect.
So basically, I'm ok with a number being repeated as long as it doesn't appear more than once in a row.

Generate a random number.
Compare it to the last number you generated
If it is the same; discard it
If it is different, add it to the array
Return to step 1 until you have enough numbers

generate a random index into the array.
repeat until it's different from the last index used.
pull the value corresponding to that index out of the array.
repeat from beginning until you have as many numbers as you need.
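A minimal Java sketch of those steps (variable names are mine; it assumes java.util.Random is imported and the code sits inside a method):

int[] arr = {1, 2, 3, 4, 5, 6, 7, 8, 9};
Random rnd = new Random();
int[] result = new int[10];
int lastIndex = -1;
for (int i = 0; i < result.length; i++) {
    int index;
    do {
        index = rnd.nextInt(arr.length);   // generate a random index into the array
    } while (index == lastIndex);          // repeat until it differs from the last index used
    result[i] = arr[index];                // pull the value at that index out of the array
    lastIndex = index;
}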

While the answers posted are not bad and would work well, someone might not be pleased with the solution, as it is possible (though incredibly unlikely) for it to hang if the generator keeps producing the same number for long enough.
An algorithm that deals with this "problem" while preserving the distribution of numbers would be:
Pick a random number from the original array, let's call it n, and output it.
Make an array of all elements but n.
Generate a random index into the shorter array. Swap the element at that index with n. Output n.
Repeat the last step until enough numbers have been output.
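A small Java sketch of this swap-based idea (variable names are mine; it assumes java.util.Random is imported and the code sits inside a method):

int[] arr = {1, 2, 3, 4, 5, 6, 7, 8, 9};
Random rnd = new Random();

// Pick the first output n and build the pool of all other elements.
int first = rnd.nextInt(arr.length);
int n = arr[first];
int[] pool = new int[arr.length - 1];
for (int i = 0, j = 0; i < arr.length; i++) {
    if (i != first) pool[j++] = arr[i];
}

int[] result = new int[10];
result[0] = n;
for (int k = 1; k < result.length; k++) {
    // Swap a random pool element with n, so n can never be drawn twice in a row.
    int idx = rnd.nextInt(pool.length);
    int next = pool[idx];
    pool[idx] = n;
    n = next;
    result[k] = n;
}

Each draw costs constant time, so the loop cannot hang no matter how long the sequence gets.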

int[] arr = {1, 2, 3, 4, 5, 6, 7, 8, 9};
int[] result = new int[10];
int previousChoice = -1;
int i = 0;
while (i < 10) {
    int randomIndex = (int) (Math.random() * arr.length);
    if (arr[randomIndex] != previousChoice) {
        result[i] = arr[randomIndex];
        previousChoice = arr[randomIndex]; // remember the last value so it is not repeated
        i++;
    }
}

The solutions given so far all involve non-constant work per generation; if you repeatedly generate indices and test for repetition, you could conceivably generate the same index many times before finally getting a new index. (An exception is Kiraa's answer, but that one involves high constant overhead to make copies of partial arrays)
The best solution here (assuming you want unique indices, not unique values, and/or that the source array has unique values) is to cycle the indices so you always generate a new index in (low) constant time.
Basically, you'd have a loop like this (using Python mostly for brevity):
# randrange(x, y) generates an int in range x to y-1 inclusive
from random import randrange
arr = [1, 2, 3, 4, 5, 6, 7, 8, 9]
result = []
selectidx = 0
randstart = 0
for _ in range(10): # Runs loop body 10 times
    # Generate offset from last selected index (randstart is initially 0,
    # allowing any index to be selected; on subsequent loops, it's 1, preventing
    # repeated selection of the last index)
    offset = randrange(randstart, len(arr))
    randstart = 1
    # Add offset to last selected index and wrap so we cycle around the array
    selectidx = (selectidx + offset) % len(arr)
    # Append element at newly selected index
    result.append(arr[selectidx])
This way, each generation step is guaranteed to require no more than one new random number, with the only constant additional work being a single addition and remainder operation.
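Since the original question is in Java, here is a rough Java port of the same cycling idea (a sketch only; it assumes java.util.Random is imported and lives inside a method):

int[] arr = {1, 2, 3, 4, 5, 6, 7, 8, 9};
int[] result = new int[10];
Random rnd = new Random();
int selectIdx = 0;
int randStart = 0;                  // 0 only for the first draw, 1 afterwards
for (int i = 0; i < result.length; i++) {
    // An offset of at least randStart prevents re-selecting the previous index.
    int offset = randStart + rnd.nextInt(arr.length - randStart);
    randStart = 1;
    selectIdx = (selectIdx + offset) % arr.length;
    result[i] = arr[selectIdx];
}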

Related

Find the longest subarray that contains a majority element

I am trying to solve this algorithmic problem:
https://dunjudge.me/analysis/problems/469/
For convenience, I have summarized the problem statement below.
Given an array of length (<= 2,000,000) containing integers in the range [0, 1,000,000], find the
longest subarray that contains a majority element.
A majority element is defined as an element that occurs > floor(n/2) times in a list of length n.
Time limit: 1.5s
For example:
If the given array is [1, 2, 1, 2, 3, 2],
The answer is 5 because the subarray [2, 1, 2, 3, 2] of length 5 from position 1 to 5 (0-indexed) has the number 2 which appears 3 > floor(5/2) times. Note that we cannot take the entire array because 3 = floor(6/2).
My attempt:
The first thing that comes to mind is an obvious brute force (but correct) solution which fixes the start and end indexes of a subarray and loop through it to check if it contains a majority element. Then we take the length of the longest subarray that contains a majority element. This works in O(n^2) with a small optimization. Clearly, this will not pass the time limit.
I was also thinking of dividing the elements into buckets that contain their indexes in sorted order.
Using the example above, these buckets would be:
1: 0, 2
2: 1, 3, 5
3: 4
Then for each bucket, I would make an attempt to merge the indexes together to find the longest subarray that contains k as the majority element where k is the integer label of that bucket.
We could then take the maximum length over all values of k. I didn't try out this solution as I didn't know how to perform the merging step.
Could someone please advise me on a better approach to solve this problem?
Edit:
I solved this problem thanks to the answers of PhamTrung and hk6279. Although I accepted the answer from PhamTrung because he first suggested the idea, I highly recommend looking at the answer by hk6279 because his answer elaborates the idea of PhamTrung and is much more detailed (and also comes with a nice formal proof!).
Note: attempt 1 is wrong as #hk6279 has given a counter example. Thanks for pointing it out.
Attempt 1:
The answer is quite complex, so I will discuss a brief idea
Let us process each unique number one by one.
Processing each occurrence of number x from left to right: at index i, add a segment (i, i) indicating the start and end of the current subarray. After that, look at the left neighbour of this segment and try to merge it into (i, i) (so, if the left neighbour is (st, ed), we try to make the segment (st, i)) if the majority condition still holds, and continue merging until we are no longer able to merge or there is no left neighbour.
We keep all those segments in a stack for faster look up/add/remove.
Finally, for each segment, we try to enlarge them as large as possible, and keep the biggest result.
Time complexity should be O(n) as each element could only be merged once.
Attempt 2:
Let us process each unique number one by one.
For each unique number x, we maintain an array of counters. Going from 0 to the end of the array, if we encounter the value x we increase the count, and if we don't we decrease it. So for this array
[0,1,2,0,0,3,4,5,0,0] and number 0, we have this array counter
[1,0,-1,0,1,0,-1,-2,-1,0]
So, in order to make a valid subarray which ends at a specific index i, the value of counter[i] - counter[start - 1] must be greater than 0 (this is easy to see if you view the array as made of 1 and -1 entries, with 1 where there is an occurrence of x and -1 otherwise; the problem then becomes finding a subarray with positive sum)
So, with the help of a binary search, the above algorithm still has a complexity of O(n^2 log n) (in case we have n/2 unique numbers, we need to do the above process n/2 times, each taking O(n log n)).
To improve it, we observe that we don't actually need to store all the counter values, but just the values of the counters at the positions of x; so for the above counter array we can store:
[1,#,#,0,1,#,#,#,-1,0]
This leads to an O(n log n) solution, which only goes through each element once.
This elaborates and explains how attempt 2 in #PhamTrung's solution works.
To get the length of the longest subarray, we should:
Find the maximum number of occurrences of the majority element in a valid subarray; denote it m
This is done by attempt 2 in #PhamTrung's solution
Return min(2*m - 1, length of the given array)
Concept
The attempt stems from a method for solving the longest positive subarray problem.
We maintain an array of counters for each unique number x. We do a +1 when we encounter x; otherwise, we do a -1.
Take the array [0,1,2,0,0,3,4,5,0,0,1,0] and the unique number 0: we get the counter array [1,0,-1,0,1,0,-1,-2,-1,0,-1,0]. If we blank out the positions that are not the target unique number, we get [1,#,#,0,1,#,#,#,-1,0,#,0].
We can get a valid subarray from the blanked counter array whenever there exist two counters such that the value of the right counter is greater than or equal to the left one. See the Proof part.
To further improve this, we can ignore all # entries, as they are useless, and we get [1(0),0(3),1(4),-1(8),0(9),0(11)] in count(index) format.
We can improve this further by not recording a counter that is greater than its previous effective counter. Take the counters at indices 8 and 9 as an example: if you can form a subarray with index 9, then you must be able to form one with index 8. So we only need [1(0),0(3),-1(8)] for the computation.
You can check whether a valid subarray ends at the current index by binary searching the counter array for the closest value that is less than or equal to the current counter value (if found).
Proof
Suppose the right counter is greater than the left counter by r for a particular x, where r >= 0, and there are k >= 0 occurrences of non-x after the left counter. Then there must be k+r occurrences of x after the left counter. Thus:
The two counters are at index positions i and r+2k+i
The subarray formed over [i, r+2k+i] has exactly k+r+1 occurrences of x
The subarray length is 2k+r+1
The subarray is valid as (2k+r+1) <= 2 * (k+r+1) -1
Procedure
Let m = 1
Loop over the array from left to right. For each index pi:
If the number is encountered for the first time:
Create a new counter array [1(pi)]
Create a new index record storing the current index value (pi) and counter value (1)
Otherwise, reuse the counter array and index record of the number and perform:
1. Calculate the current counter value ci as cprev + 2 - (pi - pprev), where cprev, pprev are the counter value and index value in the index record
2. Perform a binary search to find the longest subarray that can be formed with the current index position and all previous index positions, i.e. find the closest counter cclosest in the counter array where cclosest <= ci. If not found, jump to step 5
3. Calculate the number of x in the subarray found in step 2:
r = ci - cclosest
k = (pi - pclosest - r) / 2
number of x = k + r + 1
4. Update m with this number of x if it is greater than m
5. Append the current counter to the counter array if its value is less than the last recorded counter value
6. Update the index record with the current index (pi) and counter value (ci)
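Here is a rough Java sketch of the whole O(n log n) procedure above (the method and variable names are mine; it assumes the java.util collection classes are imported and only returns the length of the longest valid subarray):

static int longestMajoritySubarray(int[] a) {
    int n = a.length;
    // Per value x: the "effective" counters (strictly decreasing) with the index
    // at which each was recorded, plus the last counter/index seen for x.
    Map<Integer, List<int[]>> counters = new HashMap<>();   // entries are {counterValue, index}
    Map<Integer, Integer> lastIndex = new HashMap<>();
    Map<Integer, Integer> lastCounter = new HashMap<>();

    int m = 1;   // max number of occurrences of some x inside a valid subarray
    for (int pi = 0; pi < n; pi++) {
        int x = a[pi];
        if (!lastIndex.containsKey(x)) {
            List<int[]> list = new ArrayList<>();
            list.add(new int[]{1, pi});                      // counter array starts as [1(pi)]
            counters.put(x, list);
            lastIndex.put(x, pi);
            lastCounter.put(x, 1);
            continue;
        }
        int cPrev = lastCounter.get(x), pPrev = lastIndex.get(x);
        int ci = cPrev + 2 - (pi - pPrev);                   // +1 for this x, -1 per skipped non-x

        // Binary search for the leftmost recorded counter with value <= ci
        // (the recorded values are strictly decreasing, so they form a suffix).
        List<int[]> list = counters.get(x);
        int lo = 0, hi = list.size() - 1, found = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (list.get(mid)[0] <= ci) { found = mid; hi = mid - 1; }
            else lo = mid + 1;
        }
        if (found >= 0) {
            int cClosest = list.get(found)[0], pClosest = list.get(found)[1];
            int r = ci - cClosest;
            int k = (pi - pClosest - r) / 2;
            m = Math.max(m, k + r + 1);                      // number of x in that subarray
        }
        // Record ci only if it extends the strictly decreasing counter array.
        if (ci < list.get(list.size() - 1)[0]) list.add(new int[]{ci, pi});
        lastIndex.put(x, pi);
        lastCounter.put(x, ci);
    }
    return Math.min(2 * m - 1, n);
}

For the example [1, 2, 1, 2, 3, 2] this returns min(2*3 - 1, 6) = 5, matching the expected answer.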
For completeness, here's an outline of an O(n) theory. Consider the following, where * are characters different from c:
   * c * * c * * c c c
i: 0 1 2 3 4 5 6 7 8 9
A plot for adding 1 for c and subtracting 1 for a character other than c could look like:
sum_sequence
 0   c               c
-1 *   *   c       c
-2       *   *   c
-3             *
A plot for the minimum of the above sum sequence, seen for c, could look like:
min_sum
 0   c * *
-1 *       c * *
-2               c c c
Clearly, for each occurrence of c, we are looking for the leftmost occurrence of c with sum_sequence lower than or equal to the current sum_sequence. A non-negative difference would mean c is a majority, and leftmost guarantees the interval is the longest up to our position. (We can extrapolate a maximal length that is bounded by characters other than c from the inner bounds of c as the former can be flexible without affecting the majority.)
Observe that from one occurrence of c to the next, its sum_sequence can decrease by an arbitrary amount. However, it can only ever increase by 1 between two consecutive occurrences of c. Rather than each value of min_sum for c, we can record linear segments, marked by occurrences of c. A visual example:
[start_min
 \
  \
   \
    \
     end_min, start_min
      \
       \
        end_min]
We iterate over occurrences of c and maintain a pointer to the optimal segment of min_sum. Clearly we can derive the next sum_sequence value for c from the previous one since it is exactly diminished by the number of characters in between.
An increase in sum_sequence for c corresponds with a shift of 1 back or no change in the pointer to the optimal min_sum segment. If there is no change in the pointer, we hash the current sum_sequence value as a key to the current pointer value. There can be O(num_occurrences_of_c) such hash keys.
With an arbitrary decrease in c's sum_sequence value, either (1) sum_sequence is lower than the lowest min_sum segment recorded so we add a new, lower segment and update the pointer, or (2) we've seen this exact sum_sequence value before (since all increases are by 1 only) and can use our hash to retrieve the optimal min_sum segment in O(1).
As Matt Timmermans pointed out in the question comments, if we were just to continually update the pointer to the optimal min_sum by iterating over the list, we would still only perform O(1) amortized-time iterations per character occurrence. We see that for each increasing segment of sum_sequence, we can update the pointer in O(1). If we used binary search only for the descents, we would add at most (log k) iterations for every k occurrences (this assumes we jump down all the way), which keeps our overall time at O(n).
Algorithm:
Essentially, what Boyer-Moore does is look for a suffix suf of nums where suf[0] is the majority element in that suffix. To do this, we maintain a count, which is incremented whenever we see an instance of our current candidate for majority element and decremented whenever we see anything else. Whenever count equals 0, we effectively forget about everything in nums up to the current index and consider the current number as the candidate for majority element. It is not immediately obvious why we can get away with forgetting prefixes of nums - consider the following examples (pipes are inserted to separate runs of nonzero count).
[7, 7, 5, 7, 5, 1 | 5, 7 | 5, 5, 7, 7 | 7, 7, 7, 7]
Here, the 7 at index 0 is selected to be the first candidate for majority element. count will eventually reach 0 after index 5 is processed, so the 5 at index 6 will be the next candidate. In this case, 7 is the true majority element, so by disregarding this prefix, we are ignoring an equal number of majority and minority elements - therefore, 7 will still be the majority element in the suffix formed by throwing away the first prefix.
[7, 7, 5, 7, 5, 1 | 5, 7 | 5, 5, 7, 7 | 5, 5, 5, 5]
Now, the majority element is 5 (we changed the last run of the array from 7s to 5s), but our first candidate is still 7. In this case, our candidate is not the true majority element, but we still cannot discard more majority elements than minority elements (this would imply that count could reach -1 before we reassign candidate, which is obviously false).
Therefore, given that it is impossible (in both cases) to discard more majority elements than minority elements, we are safe in discarding the prefix and attempting to recursively solve the majority element problem for the suffix. Eventually, a suffix will be found for which count does not hit 0, and the majority element of that suffix will necessarily be the same as the majority element of the overall array.
Here's Java Solution :
Time complexity : O(n)
Space complexity : O(1)
public int majorityElement(int[] nums) {
    int count = 0;
    Integer candidate = null;

    for (int num : nums) {
        if (count == 0) {
            candidate = num;
        }
        count += (num == candidate) ? 1 : -1;
    }

    return candidate;
}

Number of subarrays with same 'degree' as the array

So this problem was asked in a quiz and the problem goes like:
You are given an array 'a' with elements ranging from 1 to 10^6, and the size of the array can be at most 10^5. Now we are asked to find the number of subarrays with the same 'degree' as the original array. The degree of an array is defined as the frequency of the most frequently occurring element in the array. Multiple elements could have the same frequency.
I was stuck in this problem for like an hour but couldn't think of any solution. How do I solve it?
Sample Input:
first-input
1,2,2,3,1
first-output 2
second-input
1,1,2,1,2,2
second-output 4
The element that occurs most frequently is called the mode; this problem defines degree as the frequency count. Your tasks are:
Identify all of the mode values.
For each mode value, find the index range of that value. For instance, in the array
[1, 1, 2, 1, 3, 3, 2, 4, 2, 4, 5, 5, 5]
You have three modes (1 2 5) with a degree of 3. The index ranges are
1 - 0:3
2 - 2:8
5 - 10:12
You need to count all index ranges (subarrays) that include at least one of those three ranges.
I've tailored this example to have both basic cases: modes that overlap, and those that do not. Note that containment is a moot point: if you have an array where one mode's range contains another:
[0, 1, 1, 1, 0, 0]
You can ignore the outer one altogether: any subarray that contains 0 will also contain 1.
ANALYSIS
A subarray is defined by two numbers, the starting and ending indices. Since we must have 0 <= start <= end <= len(array), this is the "handshake" problem between array bounds. We have N(N+1)/2 possible subarrays.
For 10**5 elements, you could just brute-force the problem from here: for each pair of indices, check to see whether that range contains any of the mode ranges. However, you can easily cut that down with interval recognition.
ALGORITHM
Step through the mode ranges, left to right. First, count all subranges that include the first mode range [0:3]. There is only 1 possible start [0] and 10 possible ends [3:12]; that's 10 subarrays.
Now move to the second mode range, [2:8]. You need to count subarrays that include this, but exclude those you've already counted. Since there's an overlap, you need a starting point later than 0, or an ending point before 3. This second clause is not possible with the given range.
Thus, you consider start [1:2], end [8:12]. That's 2 * 5 more subarrays.
For the third range [10:12] (no overlap), you need a starting point that does not include any other subrange. This means that any starting point [3:10] will do. Since there's only one possible endpoint, you have 8*1, or 8 more subarrays.
Can you turn this into something formal?
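One possible way to make it formal, as a rough Java sketch (the method and helper names are mine; it counts the subarrays that contain at least one minimal mode range, exactly as in the sweep above, and assumes the java.util collection classes are imported):

static long countSubarraysWithModeRange(int[] a) {
    int n = a.length;
    // value -> {first index, last index, count}
    Map<Integer, int[]> info = new HashMap<>();
    int degree = 0;
    for (int i = 0; i < n; i++) {
        int[] e = info.get(a[i]);
        if (e == null) { e = new int[]{i, i, 0}; info.put(a[i], e); }
        e[1] = i;
        e[2]++;
        degree = Math.max(degree, e[2]);
    }
    // Collect the index ranges of all modes, sorted by starting index.
    List<int[]> modes = new ArrayList<>();
    for (int[] e : info.values()) {
        if (e[2] == degree) modes.add(new int[]{e[0], e[1]});
    }
    modes.sort((p, q) -> Integer.compare(p[0], q[0]));
    // Drop any range that contains another mode's range (the inner one suffices).
    List<int[]> minimal = new ArrayList<>();
    for (int[] r : modes) {
        while (!minimal.isEmpty() && minimal.get(minimal.size() - 1)[1] >= r[1]) {
            minimal.remove(minimal.size() - 1);
        }
        minimal.add(r);
    }
    // Sweep left to right: subarrays counted for a range must start after the
    // previous range's start and must end at or after this range's end.
    long total = 0;
    int prevStart = -1;
    for (int[] r : minimal) {
        total += (long) (r[0] - prevStart) * (n - r[1]);
        prevStart = r[0];
    }
    return total;
}

On the example array [1, 1, 2, 1, 3, 3, 2, 4, 2, 4, 5, 5, 5] this yields 10 + 10 + 8 = 28, matching the counts worked out above.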
Taking reference from the LeetCode "Degree of an Array" solution:
https://leetcode.com/problems/degree-of-an-array/solution/
class Solution {
    public int findShortestSubArray(int[] nums) {
        Map<Integer, Integer> left = new HashMap(),
            right = new HashMap(), count = new HashMap();

        for (int i = 0; i < nums.length; i++) {
            int x = nums[i];
            if (left.get(x) == null) left.put(x, i);
            right.put(x, i);
            count.put(x, count.getOrDefault(x, 0) + 1);
        }

        int ans = nums.length;
        int degree = Collections.max(count.values());
        for (int x : count.keySet()) {
            if (count.get(x) == degree) {
                ans = Math.min(ans, right.get(x) - left.get(x) + 1);
            }
        }
        return ans;
    }
}

Finding count of distinct elements in every k subarray

How to solve this question efficiently?
Given an array of size n and an integer k we need to return the sum of count of all distinct numbers in a window of size k. The window slides forward.
e.g. arr[] = {1,2,1,3,4,2,3};
Let k = 4.
The first window is {1,2,1,3}, count of distinct numbers is 2 (1 is repeated)
The second window is {2,1,3,4} count of distinct numbers is 4
The third window is {1,3,4,2} count of distinct numbers is 4
The fourth window is {3,4,2,3} count of distinct numbers is 2
You should keep track of
a map that counts frequencies of elements in your window
a current sum.
The map with frequencies can also be an array if the possible elements are from a limited set.
Then when your window slides to the right...
increase the frequency of the new number by 1.
if that frequency is now 1, add it to the current sum.
decrease the frequency of the old number by 1.
if that frequency is now 0, subtract it from the current sum.
Actually, I am the asker of the question. I am not answering it, but I just wanted to comment on the answers, and I can't since I have too little reputation.
I think that for {1, 2, 1, 3} and k = 4, the given algorithms produce count = 3, but according to the question, the count should be 2 (since 1 is repeated).
You can use a hash table H to keep track of the window as you iterate over the array. You also keep an additional field for each entry in the hash table that tracks how many times that element occurs in your window.
You start by adding the first k elements of arr to H. Then you iterate through the rest of arr and you decrease the counter field of the element that just leaves the windows and increase the counter field of the element that enters the window.
At any point (including the initial insertion into H), if a counter field becomes 1, you increase the number of distinct elements you have in your window. This can happen when the last-but-one occurrence of an element leaves the window or when a first occurrence enters it. If a counter field moves away from 1 (to 0 or to 2), you decrease the number of distinct elements you have in the window.
This is a linear solution in the number of elements in arr. Hashing integers can be done like this, but depending on the language you use to implement your solution you might not really need to hash them yourself. In case the range in which the elements of arr reside is small enough, you can use a simple array instead of the hash table, as the other contributors suggested.
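To make the counter-field idea concrete, here is a small Java sketch that counts the elements occurring exactly once in each window (the reading the asker describes above); the method and variable names are mine, and it assumes k <= arr.length and that the java.util collection classes are imported:

static int[] exactlyOncePerWindow(int[] arr, int k) {
    Map<Integer, Integer> freq = new HashMap<>();
    int unique = 0;                                // elements whose counter is exactly 1
    int[] res = new int[arr.length - k + 1];
    for (int i = 0; i < arr.length; i++) {
        int in = freq.merge(arr[i], 1, Integer::sum);
        if (in == 1) unique++;                     // a first occurrence enters the window
        else if (in == 2) unique--;                // a second occurrence enters
        if (i >= k) {
            int out = freq.merge(arr[i - k], -1, Integer::sum);
            if (out == 1) unique++;                // the last-but-one occurrence leaves
            else if (out == 0) unique--;           // the last occurrence leaves
        }
        if (i >= k - 1) res[i - k + 1] = unique;   // window [i-k+1, i] is complete
    }
    return res;
}

For {1, 2, 1, 3, 4, 2, 3} and k = 4 this produces [2, 4, 4, 2], the counts from the original question; summing the entries gives the requested total.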
This is how I solved the problem
private static int[] getSolve(int[] A, int B) {
    Map<Integer, Integer> map = new HashMap<>();
    for (int i = 0; i < B; i++) {
        map.put(A[i], map.getOrDefault(A[i], 0) + 1);
    }

    List<Integer> res = new ArrayList<>();
    res.add(map.size());
    // 4, 1, 3, 1, 5, 2, 5, 6, 7
    // 3, 1, 5, 2, 5, 6 count = 5
    for (int i = B; i < A.length; i++) {
        if (map.containsKey(A[i - B]) && map.get(A[i - B]) == 1) {
            map.remove(A[i - B]);
        }
        if (map.containsKey(A[i - B])) {
            map.put(A[i - B], map.get(A[i - B]) - 1);
        }
        map.put(A[i], map.getOrDefault(A[i], 0) + 1);
        System.out.println(map.toString());
        res.add(map.size());
    }
    return res.stream().mapToInt(i -> i).toArray();
}

Finding the amount of different elements at array

We have an array of size n. How can we find how many different types of elements the array contains, and what is the count of each one?
For example: in {1,-5,2,-5,2,7,-5,-5} we have 4 different types, and the array of counts would be: {1,2,1,4}.
So my questions are:
How can we find how many different elements there are in the array?
How can we count the occurrences of each one?
I am trying to solve it in O(n). I tried a lot, but I didn't find a way. I tried to solve it with hash tables.
You are trying to get the frequency of each element in an array.
Initialize a hash where every new key is initialized with the value 0.
Loop through the array, add each element as a key to the hash, and increment its value.
In JavaScript:
hash = {};
a = [1, -5, 2, -5, 2, 7, -5, -5];
for (var i = 0; i < a.length; ++i) {
    if (hash[a[i]] === undefined)
        hash[a[i]] = 0;
    hash[a[i]] = hash[a[i]] + 1;
}
console.log(hash);
The syntax and specific data structures you use will vary between languages, but the basic idea would be to store a running count of the number of instances of each value in an associative data structure (HashMap, Dictionary, whatever your language calls it).
Here is an example that will work in Java (I took a guess at the language you were using).
It's probably bad Java, but it illustrates the idea.
int[] myArray = {1,-5,2,-5,2,7,-5,-5};
HashMap<Object,Integer> occurrences = new HashMap<Object,Integer>();

for (int i = 0; i < myArray.length; i++)
{
    if (occurrences.get(myArray[i]) == null)
    {
        occurrences.put(myArray[i], 1);
    }
    else
    {
        occurrences.put(myArray[i], occurrences.get(myArray[i]) + 1);
    }
}
You can then use your HashMap to look up the distinct elements of the array like this
occurrences.keySet()
Other languages have their own HashMap implementations (Dictionaries in .NET and Python, Hashes in Ruby).
There are different approaches to solving this problem, and the question asked here might be phrased in different ways. Here is a simple way to do it with std::map, which is available in the STL. But remember that it will always be sorted by key.
int arr[] = {1, -5, 2, -5, 2, 7, -5, -5};
int n = sizeof(arr) / sizeof(arr[0]);
map<int, int> v;
for (int i = 0; i < n; i++)
{
    if (v[arr[i]])
        v[arr[i]]++;
    else
        v[arr[i]] = 1;
}
map<int, int>::iterator it;
for (it = v.begin(); it != v.end(); ++it)
    cout << it->first << " " << it->second << endl;
return 0;
it will show output like
-5 4
1 1
2 2
7 1
I suggest you read about 'Count Sort'
Although I am not sure I understood correctly what you actually want to ask, I think you want to:
1.) Scan an array and come up with the frequency of each unique element in that array.
2.) Get the total number of unique elements.
3.) Do all of that in linear computational time.
I think what you need is Counting Sort. See the algorithm on Wikipedia.
You can obviously skip the sorting part, but you must see how it does the sorting (the useful part for your problem). It first calculates a histogram (an array whose size nominally covers the range of key values in your original array) of the frequency of each key. This works for integers only (although you can always sort other types by using integer pointers).
So, every index of this histogram array will correspond to an element in your original array, and the value at this index will correspond to the frequency of this element in the original array.
For example:
your array x = {3, 4, 3, 3, 1, 0, 1, 3}
//after calculation, you will get
your histogram array h[0 to 4] = {1, 2, 0, 4, 1}
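A tiny Java sketch of the same histogram step (assuming non-negative integer keys with a known small maximum; variable names are mine):

int[] x = {3, 4, 3, 3, 1, 0, 1, 3};
int max = 4;                    // largest value expected in x
int[] h = new int[max + 1];     // h[v] = frequency of v
for (int v : x) {
    h[v]++;
}
// h is now {1, 2, 0, 4, 1}: index = element, value = its count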
I hope that is what you asked.

How to determine to which extent/level an array of integers is already sorted

Consider an array of unique integers, e.g. [1,3,2,4,6,5]. How would one determine
its level of "sortedness", ranging from 0.0 to 1.0?
One way would be to evaluate the number of items that would have to be moved to make it sorted and then divide that by the total number of items.
As a first approach, I would detect the former as just the number of times a transition occurs from higher to lower value. In your list, that would be:
3 -> 2
6 -> 5
for a total of two movements. Dividing that by six elements gives you 33%.
In a way, this makes sense since you can simply move the 2 to between 1 and 3, and the 5 to between 4 and 6.
Now there may be edge cases where it's more efficient to move things differently but then you're likely going to have to write really complicated search algorithms to find the best solution.
Personally, I'd start with the simplest option that gave you what you wanted and only bother expanding if it turns out to be inadequate.
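As a rough Java sketch of that first approach (counting higher-to-lower transitions and dividing by the number of items; the method name is mine):

static double unsortedness(int[] a) {
    int descents = 0;
    for (int i = 1; i < a.length; i++) {
        if (a[i] < a[i - 1]) descents++;   // a transition from a higher to a lower value
    }
    return (double) descents / a.length;   // 0.0 for an already sorted array
}

For [1, 3, 2, 4, 6, 5] this gives 2/6, the 33% worked out above.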
I would say the number of swaps is not a very good way to determine this. Most importantly because you can sort the array using a different number of swaps. In your case, you could switch 2<-->3 and 6<-->5, but you could also do a lot more switches.
How would you sort, say:
1 4 3 2 5
Would you directly switch 2 and 4, or would you switch 3 and 4, then 4 and 2, and then 3 and 2?
I would say a more correct method would be the number of elements in the right place divided by the total number of elements.
In your case, that would be 2/6.
Ok this is just an idea, but what if you can actually sort the array, i.e.
1,2,3,4,5,6
then get it as a string
123456
now get your original array as a string
132465
and compare the Levenshtein distance between the two
I'll propose a different approach: let's count the number of non-descending sequences k in the array, then take its reciprocal: 1/k. For a perfectly sorted array there's only one such sequence, so 1/k = 1/1 = 1. This measure is lowest when the array is sorted in descending order.
The 0 level is approached only asymptotically, as the size of the array approaches infinity.
This simple approach can be computed in O(n) time.
In practice, one would measure unsortedness by the amount of work it needs to get sorted. That depends on what you consider "work". If only swaps are allowed, you could count the number of swaps needed. That has a nice upper bound of (n-1). For a mergesort kind of view you are mostly interested in the number of runs, since you'll need about log(nruns) merge steps. Statistically, you would probably take sum(abs(rank - intended_rank)) as a measure, similar to a K-S test. But judged by eye, sequences like "HABCDEFG" (7 swaps, 2 runs, sub-mean distance) and "HGFEDCBA" (4 swaps, 8 runs, maximal distance) are always showstoppers.
You could sum up the distances to their sorted position, for each item, and divide by the maximum such number.
public static <T extends Comparable<T>> double sortedMeasure(final T[] items) {
    int n = items.length;

    // Find the sorted positions
    Integer[] sorted = new Integer[n];
    for (int i = 0; i < n; i++) {
        sorted[i] = i;
    }
    Arrays.sort(sorted, new Comparator<Integer>() {
        public int compare(Integer i1, Integer i2) {
            T o1 = items[i1];
            T o2 = items[i2];
            return o1.compareTo(o2);
        }
        public boolean equals(Object other) {
            return this == other;
        }
    });

    // Sum up the distances
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum += Math.abs(sorted[i] - i);
    }

    // Calculate the maximum
    int maximum = n * n / 2;

    // Return the ratio
    return (double) sum / maximum;
}
Example:
sortedMeasure(new Integer[] {1, 2, 3, 4, 5}) // -> 0.000
sortedMeasure(new Integer[] {1, 5, 2, 4, 3}) // -> 0.500
sortedMeasure(new Integer[] {5, 1, 4, 2, 3}) // -> 0.833
sortedMeasure(new Integer[] {5, 4, 3, 2, 1}) // -> 1.000
One relevant measurement of sortedness would be "number of permutations needed to be sorted". In your case that would be 2, switching the 3,2 and 6,5. Then remains how to map this to [0,1]. You could calculate the maximum number of permutations needed for the length of the array, some sort of a "maximum unsortedness", which should yield a sortedness value of 0. Then take the number of permutations for the actual array, subtract it from the max and divide by max.
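A rough Java sketch of that idea (names are mine; for unique values the minimum number of swaps is n minus the number of cycles in the permutation, and the maximum needed for any array of length n is n - 1):

static double sortednessBySwaps(int[] a) {
    int n = a.length;
    // target[i] = position a[i] should occupy in the sorted array
    Integer[] idx = new Integer[n];
    for (int i = 0; i < n; i++) idx[i] = i;
    java.util.Arrays.sort(idx, (p, q) -> Integer.compare(a[p], a[q]));
    int[] target = new int[n];
    for (int rank = 0; rank < n; rank++) target[idx[rank]] = rank;
    // Count the cycles of the permutation i -> target[i].
    boolean[] seen = new boolean[n];
    int cycles = 0;
    for (int i = 0; i < n; i++) {
        if (seen[i]) continue;
        cycles++;
        for (int j = i; !seen[j]; j = target[j]) seen[j] = true;
    }
    int swaps = n - cycles;        // minimum swaps needed to sort
    int maxSwaps = n - 1;          // "maximum unsortedness" for this length
    return maxSwaps == 0 ? 1.0 : (double) (maxSwaps - swaps) / maxSwaps;
}

For [1, 3, 2, 4, 6, 5] the minimum is 2 swaps, so the value is (5 - 2) / 5 = 0.6.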
