Find if one integer array is a permutation of other - arrays

Given two integer arrays of size N, design an algorithm to determine whether one is a permutation of the other. That is, do they contain exactly the same entries but, possibly, in a different order?
I can think of two ways:
Sort them and compare: O(N log N + N)
Check if the arrays have the same number of integers and the sum of these integers is the same, then XOR both arrays and see if the result is 0. This is O(N). I am not sure if this method will eliminate false positives completely. Thoughts? Better algorithms?

Check if the array have same number of integers and the sum of these integers is same, then XOR both the arrays and see if the result is 0.
This doesn't work. Example:
a = [1,6] length(a) = 2, sum(a) = 7, xor(a) = 7
b = [3,4] length(b) = 2, sum(b) = 7, xor(b) = 7
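The counter-example above can be checked mechanically. This is only a small sketch; `fingerprint` is a hypothetical helper name, not something from the question.

```python
from functools import reduce
from operator import xor


def fingerprint(values):
    """Return the (length, sum, xor) triple that the proposed check compares."""
    return len(values), sum(values), reduce(xor, values, 0)


a = [1, 6]
b = [3, 4]

# The triples are identical, so the proposed check would accept the pair ...
print(fingerprint(a) == fingerprint(b))
# ... even though the arrays do not contain the same values.
print(sorted(a) == sorted(b))
```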
Others have already suggested HashMap for an O(n) solution.
Here's an O(n) solution in C# using a Dictionary<T, int>:
bool IsPermutation<T>(IList<T> values1, IList<T> values2)
{
    if (values1.Count != values2.Count)
    {
        return false;
    }
    Dictionary<T, int> counts = new Dictionary<T, int>();
    foreach (T t in values1)
    {
        int count;
        counts.TryGetValue(t, out count);
        counts[t] = count + 1;
    }
    foreach (T t in values2)
    {
        int count;
        if (!counts.TryGetValue(t, out count) || count == 0)
        {
            return false;
        }
        counts[t] = count - 1;
    }
    return true;
}
In Python you could use the Counter class:
>>> a = [1, 4, 9, 4, 6]
>>> b = [4, 6, 1, 4, 9]
>>> c = [4, 1, 9, 1, 6]
>>> d = [1, 4, 6, 9, 4]
>>> from collections import Counter
>>> Counter(a) == Counter(b)
True
>>> Counter(c) == Counter(d)
False

The best solution is probably a counting one using a map whose keys are the values in your two arrays.
Go through one array creating/incrementing the appropriate map location and go through the other one creating/decrementing the appropriate map location.
If the resulting map consists entirely of zeros, your arrays are equal.
This is O(N), and I don't think you can do better.
I suspect this is approximately what Mark Byers was going for in his answer.
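A minimal Python sketch of the increment/decrement idea described in this answer, assuming the values are hashable; the function name `is_permutation` is made up for illustration.

```python
from collections import defaultdict


def is_permutation(first, second):
    """Return True if both sequences hold the same values with the same multiplicities."""
    if len(first) != len(second):
        return False
    balance = defaultdict(int)
    for value in first:
        balance[value] += 1   # create/increment for the first array
    for value in second:
        balance[value] -= 1   # create/decrement for the second array
    # The arrays match exactly when every entry has returned to zero.
    return all(count == 0 for count in balance.values())
```

Both passes are linear, so the whole check runs in O(N) expected time with O(N) extra space.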

If a space complexity of O(n) is not a problem, you can do it in O(n) time by first storing in a hash map the number of occurrences of each value in the first array, and then running a second pass over the second array, checking that every element exists in the map with a nonzero count and decrementing the number of occurrences for each element.

Sort the contents of both arrays numerically, and then compare each nth item.
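You could also take each item in array1 and check whether it is present in array2, keeping a count of how many matches you find. At the end, the number of matches should equal the length of the arrays. Note that this is O(N^2), and with duplicate values each element of array2 may be matched only once; otherwise [1, 1, 2] and [1, 2, 2] would wrongly be reported as permutations of each other.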
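The sort-and-compare approach is short enough to sketch directly in Python; `is_permutation_sorted` is a hypothetical name, and the built-in `sorted` is assumed to be an acceptable sort.

```python
def is_permutation_sorted(first, second):
    """Compare the two sequences position by position after sorting copies of them."""
    return len(first) == len(second) and sorted(first) == sorted(second)
```

This costs O(N log N) for the sorts plus O(N) for the comparison, and it leaves the input arrays unmodified because `sorted` returns new lists.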

Related

Algorithm for intersection of n arrays in C

I need to write a function which returns the intersection (AND condition) of all the arrays generated in every iteration for an array of queries.
If my query is given by: query[] = {"num>20", "avg==5", "deviation != 0.5"} then, n runs from 0 to length of query. The query is passed on to a function (get_sample_ids) which compares the condition against a list of samples possessing certain information. The returned array numbers from get_sample_ids are the index of the respective samples.
query[] = {"num>20", "avg==5", "deviation != 0.5"}
int intersected_array*;
for n=0:query.length-1
int arr* = get_sample_ids(query[n]);
// n=0: [1, 7, 4, 2, 6]
// n=1: [3, 6, 2]
// n=2: [6, 2]
end;
Expected output: intersected_array* = [6, 2]
I've coded an implementation which has 2 arrays (arr*, temp*). For every array returned in the iteration, it is first stored in the temp* array and the intersection of arr* and temp* is stored in arr*. Is this an optimal solution or what is the best approach?
This is quite efficient but could be tiresome to implement (haven't tried it).
1. Determine the shortest array. A benefit of using C is that if you don't know the lengths, you can use the pointers to the arrays to determine them if the arrays are placed sequentially in memory.
2. Make an <entry,boolean> hash map for the entries in the shortest array. We know its size, and if anything it only goes down in the next steps.
3. Iterate through an array. Start by initializing the whole map to false. For each entry, check it off in the map (set it to true) if it is present.
4. Iterate through the map, deleting all the entries that weren't checked. Set all the values to false.
5. If there are any new arrays left, go back to step 3 with a new array.
6. The result is the set of keys in the final map.
It looks like a lot, but we didn't have to resort to any high-complexity measures. The key to good performance is using a hash map because of its constant access time.
Alternatively:
Make the map <entry,int>. This way you can count up all the recurrences and don't have to reset the map at every iteration, which adds to the complexity.
At the end, just compare the number of arrays to the values in the map. The entries whose count matches are your solution.
Complexity:
Seems like O(n).
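The <entry,int> variant can be sketched in Python as below. This is an illustration only: `intersect_all` is a made-up name, and each array is reduced to a set first so that a value repeated inside one array is counted once.

```python
from collections import Counter


def intersect_all(arrays):
    """Return the values that occur in every array, in the order of the first array."""
    if not arrays:
        return []
    seen_in = Counter()
    for array in arrays:
        # Count each value at most once per array.
        seen_in.update(set(array))
    # Keep the values whose count equals the number of arrays.
    return [v for v in dict.fromkeys(arrays[0]) if seen_in[v] == len(arrays)]
```

With the three arrays from the question this yields the values 2 and 6, in the order they appear in the first array rather than the order shown in the expected output.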
First, I would sort the arrays in ascending order, which makes the task easier.
You could also zero-pad the arrays so that they are all the same size:
[1, 2, 0, 4, 0, 0, 6, 7]
[0, 2, 3, 4, 0, 0, 6, 7]
[0, 2, 0, 0, 0, 0, 6, 0]
like a matrix, so you could easily find the intersection.
All this will take a lot of run time, though.
enjoy
Here is a jQuery implementation of @ZbyszekKr's solution:
I have $indexes as an array of arrays for all characters in the English alphabet, which stores which char is present in which rows. $chars is the array of characters I am trying to filter in my HTML table rows. The method below is part of a larger scheme for filtering rows as a user types when there are more than, say, 5000 rows in the table.
PS - There are some obvious redundancies, but those are necessary for the plugin I am making.
function intersection($indexes, $chars){
    map = {};
    $minLength = Number.MAX_SAFE_INTEGER; $minIdx = 0;
    //get shortest array
    $.each($chars, function(key, c){
        $index = getDiffInNum(c, $initialChar);
        $len = $indexes[$index].rows.length;
        if($len < $minLength){
            $minLength = $len;
            $minIdx = $index;
        }
    });
    //put that array values in map
    $minCount = 1;
    $.each($indexes[$minIdx].rows, function(key, val){
        map[val] = $minCount;
    });
    //iterate through other arrays to figure out count of element
    $.each($chars, function(key, c){
        $index = getDiffInNum(c, $initialChar);
        if($index != $minIdx){
            $array = $indexes[$index].rows;
            $.each($array, function(key, val){
                if(val in map){
                    map[val] = map[val] + 1;
                }
            });
            $.each(map, function(key, val){
                if(val == $minCount){
                    delete map[key];
                }
            });
            $minCount++;
        }
    });
    //get the elements which belong in intersection
    $intersect = new Array();
    $.each(map, function(key, val){
        if(val == $chars.length){
            $intersect.push(parseInt(key));
        }
    });
    return $intersect;
}

Find missing numbers in array, time complexity O(N), space complexity O(1)

You are given an array of n unique integer numbers 0 <= x_i < 2 * n.
Print all integers 0 <= x < 2 * n that are not present in this array.
Example:
find_missing([0]) = [1]
find_missing([0, 2, 4]) = [1, 3, 5] # because all numbers are [0, 1, 2, 3, 4, 5]
find_missing([]) = []
find_missing([0, 1, 4, 5]) = [2, 3, 6, 7] # because all numbers are [0, 1, 2, 3, 4, 5, 6, 7]
Quirks are about requirements:
Time complexity O(n) - BUT there should be some fixed constant C independent of size of input such that every element of array is written/read < C times, so radix sorting the array is a no go.
Space complexity O(1) - you may modify the initial array, BUT sorted(initial_array) must equal sorted(array_after_executing_program) AND you can't store integers outside range [0, 2n) in this array (imagine that it's an array of uint32_t).
I saw a lot of complex solutions, but then I found this:
public void printNotInArr(int[] arr) {
    if (arr == null)
        return; // a void method cannot return a value
    int len = arr.length;
    int max = 2 * len;
    for (int i = 0; i < len; i++) {
        System.out.println(max - arr[i] - 1);
    }
}
I believe that is the best solution, but I am not sure. I would like to know why that would NOT work.
As @LasseV.Karlsen pointed out, [0,3] is a simple counter-example that shows that solution doesn't work. This, however, is a pretty simple solution (in Python):
def show_missing(l):
    n = len(l)
    # put numbers less than n into the proper slot
    for i in range(0,n):
        while l[i]<n and l[i]!=i:
            j = l[i]
            l[i] = l[j]
            l[j] = j
    for i in range(0,n):
        if l[i]!=i:
            print('Missing %s'%i)
    # put numbers greater than n into the proper slot
    for i in range(0,n):
        while l[i]>=n and l[i]!=i+n:
            j = l[i]
            l[i] = l[j-n]
            l[j-n] = j
    for i in range(0,n):
        if l[i]!=i+n:
            print('Missing %s'%(i+n))
The idea is simple. We first rearrange the elements so that every value j that is less than n is stored at index j. We can then go through the array and easily pick out the ones below n that are missing.
We then rearrange the elements so that every value j that is greater than or equal to n is stored at index j-n. Again, we can go through the array and easily pick out the ones greater than or equal to n that are missing.
Since only a couple of local variables are used, the O(1) space complexity is satisfied.
Because of the nested loops, the O(n) time complexity is a little harder to see, but it isn't too hard to show that we never swap more than n elements, since one new element is put into its proper place with each swap.
Since we've only swapped elements of the array, the requirement that all the original elements are still in the array is also satisfied.

Finding count of distinct elements in every k subarray

How to solve this question efficiently?
Given an array of size n and an integer k we need to return the sum of count of all distinct numbers in a window of size k. The window slides forward.
e.g. arr[] = {1,2,1,3,4,2,3};
Let k = 4.
The first window is {1,2,1,3}, count of distinct numbers is 2 (1 is repeated)
The second window is {2,1,3,4} count of distinct numbers is 4
The third window is {1,3,4,2} count of distinct numbers is 4
The fourth window is {3,4,2,3} count of distinct numbers is 2
You should keep track of:
- a map that counts the frequencies of the elements in your window
- a current sum (the number of distinct elements in the window).
The map with frequencies can also be an array if the possible elements are from a limited set.
Then, when your window slides to the right:
- increase the frequency of the new number by 1.
- if that frequency is now 1, add 1 to the current sum.
- decrease the frequency of the old number by 1.
- if that frequency is now 0, subtract 1 from the current sum.
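The bookkeeping above can be sketched in Python as follows. The function name `distinct_per_window` is hypothetical, and "distinct" is taken here to mean the number of different values in the window, so a repeated value still counts once.

```python
from collections import defaultdict


def distinct_per_window(arr, k):
    """Return the number of different values in each window of size k."""
    if k <= 0 or k > len(arr):
        return []
    freq = defaultdict(int)
    distinct = 0
    result = []
    for i, value in enumerate(arr):
        # The new element enters the window.
        freq[value] += 1
        if freq[value] == 1:
            distinct += 1
        # The element k positions back leaves the window.
        if i >= k:
            old = arr[i - k]
            freq[old] -= 1
            if freq[old] == 0:
                distinct -= 1
        if i >= k - 1:
            result.append(distinct)
    return result
```

For the array in the question with k = 4, the first window {1,2,1,3} is counted as 3 under this reading, not 2, because the repeated 1 is still one distinct value.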
Actually, I am the asker of the question. I am not answering it; I just wanted to comment on the answers, but I can't since I have too little reputation.
I think that for {1, 2, 1, 3} and k = 4, the given algorithms produce count = 3, but according to the question, the count should be 2 (since 1 is repeated).
You can use a hash table H to keep track of the window as you iterate over the array. You also keep an additional field for each entry in the hash table that tracks how many times that element occurs in your window.
You start by adding the first k elements of arr to H. Then you iterate through the rest of arr, decreasing the counter field of the element that just left the window and increasing the counter field of the element that enters the window.
At any point (including the initial insertion into H), if a counter field becomes 1, you increase the number of distinct elements you have in your window. This can happen when the last but one occurrence of an element leaves the window or when a first occurrence enters it. If a counter field changes from 1 to any other value, you decrease the number of distinct elements you have in the window.
This is a linear solution in the number of elements in arr. Hashing integers can be done like this, but depending on the language you use to implement your solution you might not really need to hash them yourself. In case the range in which the elements of arr reside in is small enough, you can use a simple array instead of the hash table, as the other contributors suggested.
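A Python sketch of this counter-field scheme, under the asker's reading that only values occurring exactly once in the window are counted. The names `singles_per_window` and `change` are invented for the example, and a `defaultdict` stands in for the hash table H.

```python
from collections import defaultdict


def singles_per_window(arr, k):
    """Return, for each window of size k, how many values occur exactly once in it."""
    if k <= 0 or k > len(arr):
        return []
    count = defaultdict(int)
    singles = 0
    result = []

    def change(value, delta):
        nonlocal singles
        if count[value] == 1:
            singles -= 1      # the counter is about to move away from 1
        count[value] += delta
        if count[value] == 1:
            singles += 1      # the counter has just become 1

    for i, value in enumerate(arr):
        change(value, +1)              # element entering the window
        if i >= k:
            change(arr[i - k], -1)     # element leaving the window
        if i >= k - 1:
            result.append(singles)
    return result
```

For the array in the question with k = 4, this gives 2 for the first window and 2 for the last one, matching the counts listed there.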
This is how I solved the problem
private static int[] getSolve(int[] A, int B) {
    Map<Integer, Integer> map = new HashMap<>();
    for (int i = 0; i < B; i++) {
        map.put(A[i], map.getOrDefault(A[i], 0) + 1);
    }
    List<Integer> res = new ArrayList<>();
    res.add(map.size());
    //4, 1, 3, 1, 5, 2, 5, 6, 7
    //3, 1, 5, 2, 5, 6 count = 5
    for (int i = B; i < A.length; i++) {
        if (map.containsKey(A[i - B]) && map.get(A[i - B]) == 1) {
            map.remove(A[i - B]);
        }
        if (map.containsKey(A[i - B])) {
            map.put(A[i - B], map.get(A[i - B]) - 1);
        }
        map.put(A[i], map.getOrDefault(A[i], 0) + 1);
        System.out.println(map.toString());
        res.add(map.size());
    }
    return res.stream().mapToInt(i -> i).toArray();
}

How to determine to which extent/level an array of integers is already sorted

Consider an array of any given unique integers e.g. [1,3,2,4,6,5] how would one determine
the level of "sortedness", ranging from 0.0 to 1.0 ?
One way would be to evaluate the number of items that would have to be moved to make it sorted and then divide that by the total number of items.
As a first approach, I would detect the former as just the number of times a transition occurs from higher to lower value. In your list, that would be:
3 -> 2
6 -> 5
for a total of two movements. Dividing that by six elements gives you 33%.
In a way, this makes sense since you can simply move the 2 to between 1 and 3, and the 5 to between 4 and 6.
Now there may be edge cases where it's more efficient to move things differently but then you're likely going to have to write really complicated search algorithms to find the best solution.
Personally, I'd start with the simplest option that gave you what you wanted and only bother expanding if it turns out to be inadequate.
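The transition count from this answer takes only a few lines of Python. This is a sketch with a hypothetical name, `descent_ratio`; note that it measures unsortedness, so 0.0 means fully sorted.

```python
def descent_ratio(values):
    """Return the number of high-to-low transitions divided by the number of items."""
    if not values:
        return 0.0
    descents = sum(1 for a, b in zip(values, values[1:]) if a > b)
    return descents / len(values)
```

For [1, 3, 2, 4, 6, 5] there are two descents (3 -> 2 and 6 -> 5), giving 2/6, or about 33%.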
I would say the number of swaps is not a very good way to determine this. Most importantly because you can sort the array using a different number of swaps. In your case, you could switch 2<-->3 and 6<-->5, but you could also do a lot more switches.
How would you sort, say:
1 4 3 2 5
Would you directly switch 2 and 4, or would you switch 3 and 4, then 4 and 2, and then 3 and 2.
I would say a more correct method would be the number of elements in the right place divided by the total number of elements.
In your case, that would be 2/6.
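The "elements in the right place" measure could be sketched in Python like this; `fraction_in_place` is an illustrative name only.

```python
def fraction_in_place(values):
    """Return the share of items that already sit at their sorted position."""
    if not values:
        return 1.0
    target = sorted(values)
    hits = sum(1 for actual, wanted in zip(values, target) if actual == wanted)
    return hits / len(values)
```

For [1, 3, 2, 4, 6, 5] only 1 and 4 are in their final positions, which gives the 2/6 mentioned above.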
Ok this is just an idea, but what if you can actually sort the array, i.e.
1,2,3,4,5,6
then get it as a string
123456
now get your original array in string
132465
and compare the Levenshtein distance between the two
I'll propose a different approach: let's count the number of non-descending sequences k in the array, then take its reciprocal: 1/k. For a perfectly sorted array there's only one such sequence, so 1/k = 1/1 = 1. This sortedness level is lowest when the array is sorted in descending order.
A level of 0 is approached only asymptotically, as the size of the array approaches infinity.
This simple approach can be computed in O(n) time.
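A short Python sketch of the 1/k measure, assuming a new non-descending run starts at every position where a value is smaller than its predecessor; `run_sortedness` is a made-up name.

```python
def run_sortedness(values):
    """Return 1/k, where k is the number of maximal non-descending runs."""
    if not values:
        return 1.0
    # Every descent closes one run and opens the next.
    runs = 1 + sum(1 for a, b in zip(values, values[1:]) if a > b)
    return 1.0 / runs
```

For [1, 3, 2, 4, 6, 5] the runs are [1, 3], [2, 4, 6] and [5], so the result is 1/3; a descending array of length n gives 1/n.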
In practice, one would measure unsortedness by the amount of work needed to get the array sorted. That depends on what you consider "work". If only swaps are allowed, you could count the number of swaps needed. That has a nice upper bound of (n-1). For a mergesort kind of view you are mostly interested in the number of runs, since you'll need about log(nrun) merge steps. Statistically, you would probably take "sum(abs(rank - intended_rank))" as a measure, similar to a K-S test. But at first sight, sequences like "HABCDEFG" (7 swaps, 2 runs, submean distance) and "HGFEDCBA" (4 swaps, 8 runs, maximal distance) are always showstoppers.
You could sum up the distances to their sorted position, for each item, and divide with the maximum such number.
public static <T extends Comparable<T>> double sortedMeasure(final T[] items) {
    int n = items.length;
    // Find the sorted positions
    Integer[] sorted = new Integer[n];
    for (int i = 0; i < n; i++) {
        sorted[i] = i;
    }
    Arrays.sort(sorted, new Comparator<Integer>() {
        public int compare(Integer i1, Integer i2) {
            T o1 = items[i1];
            T o2 = items[i2];
            return o1.compareTo(o2);
        }
        public boolean equals(Object other) {
            return this == other;
        }
    });
    // Sum up the distances
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum += Math.abs(sorted[i] - i);
    }
    // Calculate the maximum
    int maximum = n*n/2;
    // Return the ratio
    return (double) sum / maximum;
}
Example:
sortedMeasure(new Integer[] {1, 2, 3, 4, 5}) // -> 0.000
sortedMeasure(new Integer[] {1, 5, 2, 4, 3}) // -> 0.500
sortedMeasure(new Integer[] {5, 1, 4, 2, 3}) // -> 0.833
sortedMeasure(new Integer[] {5, 4, 3, 2, 1}) // -> 1.000
One relevant measurement of sortedness would be "number of permutations needed to be sorted". In your case that would be 2, switching the 3,2 and 6,5. Then remains how to map this to [0,1]. You could calculate the maximum number of permutations needed for the length of the array, some sort of a "maximum unsortedness", which should yield a sortedness value of 0. Then take the number of permutations for the actual array, subtract it from the max and divide by max.
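One way to make this concrete, reading "permutations" as pairwise swaps (an assumption), is to compute the minimum number of swaps from the cycle structure of the array and normalize by the maximum of n - 1. The sketch below uses a hypothetical name, `swap_sortedness`, and relies on the values being unique, as the question states.

```python
def swap_sortedness(values):
    """Return (max_swaps - needed_swaps) / max_swaps for unique values."""
    n = len(values)
    if n < 2:
        return 1.0
    # order[i] is the index in `values` of the i-th smallest item.
    order = sorted(range(n), key=values.__getitem__)
    visited = [False] * n
    cycles = 0
    for start in range(n):
        if not visited[start]:
            cycles += 1
            j = start
            while not visited[j]:
                visited[j] = True
                j = order[j]
    needed = n - cycles        # minimum number of swaps to sort
    maximum = n - 1            # worst case: a single cycle of length n
    return (maximum - needed) / maximum
```

For [1, 3, 2, 4, 6, 5] two swaps are needed out of a maximum of five, giving 0.6.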

Remove duplicates from Array without using Hash Table

I have an array which might contain duplicate elements (possibly more than two copies of an element). I wonder if it's possible to find and remove the duplicates in the array:
without using Hash Table (strict requirement)
without using a temporary secondary array. No restrictions on complexity.
P.S.: This is not a homework question.
It was asked of my friend in a Yahoo technical interview.
Sort the source array. Find consecutive elements that are equal. (I.e. what std::unique does in C++ land). Total complexity is N lg N, or merely N if the input is already sorted.
To remove duplicates, you can copy elements from later in the array over elements earlier in the array also in linear time. Simply keep a pointer to the new logical end of the container, and copy the next distinct element to that new logical end at each step. (Again, exactly like std::unique does (In fact, why not just download an implementation of std::unique and do exactly what it does? :P))
O(N log N): Sort and replace consecutive equal elements with one copy.
O(N^2): Run a nested loop to compare each element with the remaining elements in the array; if a duplicate is found, swap the duplicate with the element at the end of the array and decrease the array size by 1.
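The quadratic swap-to-the-end variant might look like this in Python. It is a sketch under the assumption that element order does not need to be preserved; `dedupe_swap_to_end` is a hypothetical name, and no auxiliary container is used.

```python
def dedupe_swap_to_end(a):
    """Remove duplicates in place in O(N^2) time without extra storage."""
    n = len(a)                 # logical size of the array
    i = 0
    while i < n:
        j = i + 1
        while j < n:
            if a[j] == a[i]:
                # Move the duplicate to the end and shrink the logical size.
                a[j], a[n - 1] = a[n - 1], a[j]
                n -= 1
                # Do not advance j: the swapped-in element is still unchecked.
            else:
                j += 1
        i += 1
    del a[n:]                  # drop the duplicates collected at the tail
    return a
```

Because elements are swapped in from the end, the surviving elements may appear in a different relative order than in the input.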
No restrictions on complexity.
So this is a piece of cake.
// A[1], A[2], A[3], ... A[i], ... A[n]
// O(n^2)
for(i=2; i<=n; i++)
{
    duplicate = false;
    for(j=1; j<i; j++)
        if(A[i] == A[j])
            {duplicate = true; break;}
    if(duplicate)
    {
        // "remove" A[i] by moving all elements from its right over it
        for(j=i; j<n; j++)
            A[j] = A[j+1];
        n--;
        i--; // re-check position i, which now holds the next element
    }
}
In-place duplicate removal that preserves the existing order of the list, in quadratic time:
for (var i = 0; i < list.length; i++) {
    for (var j = i + 1; j < list.length;) {
        if (list[i] == list[j]) {
            list.splice(j, 1);
        } else {
            j++;
        }
    }
}
The trick is to start the inner loop on i + 1 and not increment the inner counter when you remove an element.
The code is JavaScript, splice(x, 1) removes the element at x.
If order preservation isn't an issue, then you can do it quicker:
list.sort();
for (var i = 1; i < list.length;) {
    if (list[i] == list[i - 1]) {
        list.splice(i, 1);
    } else {
        i++;
    }
}
Which is linear, unless you count the sort, which you should, so it's of the order of the sort -- in most cases n × log(n).
In functional languages you can combine sorting and unicification (is that a real word?) in one pass.
Let's take the standard quick sort algorithm:
- Take the first element of the input (x) and the remaining elements (xs)
- Make two new lists
- left: all elements in xs smaller than or equal to x
- right: all elements in xs larger than x
- apply quick sort on the left and right lists
- return the concatenation of the left list, x, and the right list
- P.S. quick sort on an empty list is an empty list (don't forget base case!)
If you want only unique entries, replace
left: all elements in xs smaller than or equal to x
with
left: all elements in xs smaller than x
This is a one-pass algorithm that is O(n log n) on average, like quick sort itself.
Example implementation in F#:
let rec qsort = function
    | [] -> []
    | x::xs ->
        let left, right = List.partition (fun el -> el <= x) xs
        qsort left @ [x] @ qsort right
let rec qsortu = function
    | [] -> []
    | x::xs ->
        let left = List.filter (fun el -> el < x) xs
        let right = List.filter (fun el -> el > x) xs
        qsortu left @ [x] @ qsortu right
And a test in interactive mode:
> qsortu [42;42;42;42;42];;
val it : int list = [42]
> qsortu [5;4;4;3;3;3;2;2;2;2;1];;
val it : int list = [1; 2; 3; 4; 5]
> qsortu [3;1;4;1;5;9;2;6;5;3;5;8;9];;
val it : int list = [1; 2; 3; 4; 5; 6; 8; 9]
Since it's an interview question, the interviewer usually expects to be asked for clarifications about the problem.
With no alternative storage allowed (that is, O(1) storage, in that you'll probably use some counters / pointers), it seems obvious that a destructive operation is expected; it might be worth pointing this out to the interviewer.
Now the real question is: do you want to preserve the relative order of the elements? That is, is this operation supposed to be stable?
Stability hugely impacts the available algorithms (and thus the complexity).
The most obvious choice is to look at sorting algorithms; after all, once the data is sorted, it's pretty easy to get the unique elements.
But if you want stability, you cannot actually sort the data (since you could not get the "right" order back), and thus I wonder if it is solvable in less than O(N**2) if stability is involved.
This doesn't use a hash table per se, but I know that behind the scenes it's an implementation of one. Nevertheless, I thought I might post it in case it can help. This is in JavaScript and uses an associative array to record duplicates to pass over:
function removeDuplicates(arr) {
    var results = [], dups = [];
    for (var i = 0; i < arr.length; i++) {
        // check if not a duplicate
        if (dups[arr[i]] === undefined) {
            // save for next check to indicate duplicate
            dups[arr[i]] = 1;
            // is unique. append to output array
            results.push(arr[i]);
        }
    }
    return results;
}
Let me do this in Python.
array1 = [1,2,2,3,3,3,4,5,6,4,4,5,5,5,5,10,10,8,7,7,9,10]
array1.sort()
print(array1)
current = None
count = 0
# overwriting the numbers at the frontal part of the array
for item in array1:
    if item != current:
        array1[count] = item
        count +=1
        current=item
print(array1)#[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 5, 5, 5, 5, 6, 7, 7, 8, 9, 10, 10, 10]
print(array1[:count])#[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
The most efficient method is the following (although dict is a hash table internally, which the question rules out):
array1 = [1,2,2,3,3,3,4,5,6,4,4,5,5,5,5,10,10,8,7,7,9,10]
array1.sort()
print(array1)
print([*dict.fromkeys(array1)])#[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
#OR#
aa = list(dict.fromkeys(array1))
print( aa)#[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
