AS3: Total count of merged similar sub arrays - arrays

I've got what should be a simple problem that I can't quite get my head around:
Say I have these array values (could change, but same basic structure):
TempArray[0]: 0,0
TempArray[1]: 0,0,0,0
TempArray[2]: 0,0,0,0
TempArray[3]: 3,3,3,4
TempArray[4]: 4,4
TempArray[5]: 4,3,4,4
TempArray[6]: 6,6
The sub array could go deeper, or it could be less, but it's always a matter of comparing within the subarray to get the goal.
The answer I'm after is a count of total matched groups. Since a 4 and a 3 appear together in one subarray, all 4s and 3s simply count as one group.
My expected result from the above would be 3 (a count of total unique groups). All the 0s are group 1, all the 3s and 4s merge together to be group 2, and the 6s are group 3. I only care that the value is 3.
Any idea on how to achieve this?
Thanks
John

For instance, if you're only dealing with integers:
-1/ Take your first Array
-2/ Sort it
-3/ Remove any element that already exists until you have an Array of unique elements.
You've created your first group.
-4/ With the remaining Arrays, compare each to your existing groups and remove any element that is already contained in a previous group. With your example, TempArray[3] would end up as group [3, 4].
-5/ Count your groups
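As an illustration only, here is a minimal Python sketch (not AS3) of the same goal. It does not follow the sort-and-remove steps literally; it swaps in a union-find structure, so that values sharing a subarray end up in one group even when the link between them only shows up in a later subarray. The name `count_groups` is hypothetical.

```python
def count_groups(arrays):
    """Count groups of values that are linked by sharing a subarray."""
    parent = {}  # union-find forest: value -> parent value

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for sub in arrays:
        for value in sub:
            # Link every value in the subarray to its first value.
            union(sub[0], value)

    # Each distinct root is one merged group.
    return len({find(x) for x in list(parent)})


temp_array = [[0, 0], [0, 0, 0, 0], [0, 0, 0, 0],
              [3, 3, 3, 4], [4, 4], [4, 3, 4, 4], [6, 6]]
print(count_groups(temp_array))
```

For the example data this groups the values as {0}, {3, 4} and {6}.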

Related

Search specific permutation of permutation subset with constraints

I am searching for one permutation P, consisting of p1...pn, from the following subset S.
S is defined by the labels L:
L1...Lk, where each Li contains pi...pj.
The inverse of P must have at most k-1 decreasing adjacent elements, with k <= n.
Example:
n := 4
k := 2
L1 := 1,2
L2 := 3,4
L := L1,L2,L1,L2
S := 1324,1423,2314,2413
One solution would be P := 1342.
P := 3142 would not be a solution, because there are 2 decreasing adjacent elements but at most 1 is allowed since k = 2.
Is there an algorithm to find P in S as defined by L?
Currently I use brute force to find one permutation P, but it very quickly becomes unusably slow.
So each of L1, ..., Lk is a consecutive set of elements. At each place we see Li, Lj in the definition of L, one of three things is true:
1. i < j, in which case it is ascending.
2. i = j, in which case it could be ascending or descending.
3. i > j, in which case it must be descending.
By counting the number of places where case 3 is true, we get a minimum number of descending elements already in the definition of L.
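A minimal sketch of that count, assuming L is given as a list of label indices (so L1,L2,L1,L2 becomes [1, 2, 1, 2]) and that every element of a lower-numbered label is smaller than every element of a higher-numbered one, as described above. The name `forced_descents` is hypothetical.

```python
def forced_descents(label_sequence):
    """Count adjacent positions Li, Lj in L with i > j (always a descent)."""
    return sum(1 for i, j in zip(label_sequence, label_sequence[1:]) if i > j)


# L := L1,L2,L1,L2 from the question
print(forced_descents([1, 2, 1, 2]))
```

For the question's L there is one such position (L2 followed by L1).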
Next, for each Li we have a pattern we can write down with len(Li)-1 symbols, each either ; or , where a ; means that there are elements of other Ljs between two members of Li, and a , means that the Li elements are adjacent, so the order of the elements may result in a descent. We want to know, "For each possible number of descents within Li, how many permutations of Li have that number of descents?"
We will think of building the permutations as follows:
The first element goes at position 0.
The second element goes to position 0 or 1. (If at 0, the first element is moved.)
The third element goes to position 0, 1, or 2.
etc
A descent is when the next element is smaller than the previous, at a transition matching a ,.
We actually will want the following data structure for later use:
cache[Li] gives:
by how many elements are chosen:
by the last element chosen:
by the number of descents we will add:
how many ways of finishing this permutation
So we can write a recursive function that takes:
The pattern for Li.
How many elements have been chosen.
What index was last chosen.
It then returns a dictionary mapping descents to count of ways to finish the permutation for Li.
Memoize that and we get our desired data structure.
Now we'll repeat the idea. We want:
cache2[i] gives:
by number of descents to use:
how many permutations of L[i], L[i+1], ..., L[k] meet it.
Again we can write a recursive function using cache to calculate this, and we can memoize it to get cache2.
And NOW we can reverse the process.
1. We know how many descents came from the definition of L.
2. We know the distribution of remaining descents from cache2[1], so we can randomly pick how many descents there will be meeting our condition among L1...Lk.
3. For each Li in L1...Lk we can look at cache[Li][1][0] and cache2[i+1] to figure out how many descents there will be within Li with the correct probability.
4. For each Li we can look at how many descents we want to wind up with, its pattern, and cache[Li] to figure out a random sequence of inserts winding up with the right pattern. The first insert is always at 0. After that you always know the size, where the last insert was, and how many descents are left. So for each possible next insert you figure out whether it counts as a descent (look at both the pattern and whether it is before the last insert), and the number of ways to finish from there. Then you can choose the next insert randomly with the right probability.
5. For each Li we can turn the pattern of inserts into the list of values in order. (I will explain this step more.)
6. We can now follow the pattern of L and fill in all of the values.
Now for step 5, let's illustrate with your example from the chat. Suppose that L2 = [4, 5, 6] and the pattern of inserts we came up with was [0, 1, 0]. How do we figure out the arrangement of values?
Well first we do our inserts:
[1]
[1, 2]
[3, 1, 2]
This says that the first element (4) goes to the third place, the second (5) to the first, and the third (6) to the second. So our permutation for L2 is [5, 6, 4].
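The worked example above can be sketched in Python as follows. This is only an illustration of that one step under the reading given in the text (the k-th entry of the insert result is the place of the k-th value); the name `inserts_to_values` is hypothetical.

```python
def inserts_to_values(values, inserts):
    """Turn a pattern of insert positions into an arrangement of values."""
    order = []
    for step, position in enumerate(inserts, start=1):
        # Insert the step number at the chosen position, shifting the rest.
        order.insert(position, step)
    # order[j] is the 1-based place where the j-th value ends up.
    arrangement = [None] * len(values)
    for value, place in zip(values, order):
        arrangement[place - 1] = value
    return order, arrangement


order, arrangement = inserts_to_values([4, 5, 6], [0, 1, 0])
print(order)        # [3, 1, 2]
print(arrangement)  # [5, 6, 4]
```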
This will be a lot of code to write. But it will be polynomial. Specifically if m is the count of the most common label, cache will have total size at most O(k m^2). Thanks to memoization, each entry takes O(m) to calculate. Everything else is small relative to that. So total space is O(k m^2) and time is O(k m^3).

algorithm which finds the numbers in a sequence which appear 3 times or more, and prints their indexes

Suppose I input a sequence of numbers which ends with -1.
I want to print all the values of the sequence that occur in it 3 times or more, and also print their indexes in the sequence.
For example, if the input is: 2 3 4 2 2 5 2 4 3 4 2 -1
then the expected output is:
2: 0 3 4 6 10
4: 2 7 9
First I thought of using quicksort, but then I realized that I would lose the original indexes of the sequence. I have also been thinking of using counting, but the sequence has no given range of numbers, so counting may be no good in this case.
Now I wonder if I might use an array of pointers (but how?)
Do you have any suggestions or tips for an algorithm with time complexity O(n log n) for this? It would be much appreciated.
Keep it simple!
The easiest way would be to scan the sequence, count the number of occurrences of each element, and put the elements that match the condition in an auxiliary array.
Then, for each element in the auxiliary array, scan the sequence again and print out the indices.
First of all, sorry for my bad English (it's not my native language); I'll try my best.
Similar to what @vvigilante suggested, here is an algorithm implemented in Python (Python because it reads much like pseudocode, so you can translate it to any language you want; I have also added a lot of comments. Hope you get it!)
from typing import Dict, List

def three_or_more(input_arr: List[int]) -> None:
    # Map each value to the list of indexes where it occurs.
    indexes: Dict[int, List[int]] = {}
    # Scan the array; the last element is the -1 terminator, so skip it.
    for i in range(len(input_arr) - 1):
        # Create the list for the value at position i (if it doesn't exist)
        # and append the index i.
        indexes.setdefault(input_arr[i], []).append(i)
    # For each key in the dictionary
    for n in indexes:
        # If the number of indexes stored for that key is >= 3
        if len(indexes[n]) >= 3:
            # Print the key
            print("%d:" % n, end='')
            # Print each index stored for the current key
            for el in indexes[n]:
                print(" %d" % el, end='')
            # New line
            print()

# Call the function
three_or_more([2, 3, 4, 2, 2, 5, 2, 4, 3, 4, 2, -1])
Complexity:
The first loop scans the input array: O(N).
The second loop visits every distinct number in the array;
there are at most N of them (you cannot have more distinct numbers than elements), so it is O(N).
The loop inside it goes through all the indexes stored for the current number,
which looks like O(N) in the worst case (but it is not).
So a naive bound would be O(N) + O(N)*O(N) = O(N^2),
but remember that the two nested loops together print at most N indexes, and since no index is repeated their combined cost is O(N).
So the total is O(N) + O(N) = O(N).
Memory is O(N) for the input array + O(N) for the dictionary (because it holds at most N indexes) = O(N).
If you do it in C++, remember that maps are much slower than arrays, so if N is small you should use an array of arrays (or std::vector<std::vector<int>>); otherwise you can also try an unordered map, which uses hashing.
P.S. Remember that getting the size of a vector takes O(1) time because it is a difference of pointers!
Starting with a sorted list is a good idea.
You could create a second array of original indices and duplicate all of the memory moves for the sort on the indices array. Then checking for triplicates is trivial and only requires sort + 1 traversal.
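A minimal Python sketch of this sort-based idea. Instead of mirroring the sort's memory moves on a second array by hand, it sorts the index array itself by value, which keeps the original indexes attached; that is an equivalent formulation, not necessarily what the answer had in mind. It assumes the trailing -1 terminator has already been removed, and the name `three_or_more_sorted` is hypothetical.

```python
from itertools import groupby


def three_or_more_sorted(values):
    """Return {value: [indexes]} for values occurring 3 times or more."""
    # Sort the original indexes by (value, index): O(n log n).
    order = sorted(range(len(values)), key=lambda i: (values[i], i))
    result = {}
    # One traversal: equal values are now adjacent in `order`.
    for value, run in groupby(order, key=lambda i: values[i]):
        positions = list(run)
        if len(positions) >= 3:
            result[value] = positions
    return result


found = three_or_more_sorted([2, 3, 4, 2, 2, 5, 2, 4, 3, 4, 2])
for value, positions in found.items():
    print("%d: %s" % (value, " ".join(map(str, positions))))
```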

Count of subarray

The problem is a variant of subarray counting. Given an array of numbers, say 1,2,2,3,2,1,2,2,2,2, I look for subarrays and count the frequency of each. I start by looking at subarrays of some length K (for example K = 3).
Count of subarray 1,2,2 is C1:2.
Count of subarray 2,2,3 is 1.
Count of subarray 2,3,2 is 1.
and so on.
Now, I look for subarrays of length 2.
Count of subarray 1,2 is C2: 2. But (1,2) is contained in the subarray 1,2,2. So I calculate its count by subtracting C1 from C2, which gives the count of 1,2 as 0. Similarly, count of 2,2 is 1.
My problem is in handling cases where more than one parent subarray exists. I don't keep sub-arrays in my result set whose frequency comes out to be 1. Example:
1,2,3,1,2,3,1,2,2,3
Here, Count of 1,2,3 is 2.
Count of 2,3,1 is 2.
Now, when I look for count of 2,3, it should be 1 as all the greater length parents have covered the occurrences. How shall I handle these cases?
The approach I thought was to mark all the pattern occurrences of the parent. In the above case, mark all the occurrences of 1,2,3 and 2,3,1. Array looks like this:
1,2,3,1,2,3,1,2,2,3
X,X,X,X,X,X,X,2,2,3
where X denotes a marked position. Now the frequency of 2,3 is 1, counting only the positions that are unmarked. So, basically, I mark all the pattern occurrences I find at the current step. For the next step, I look for patterns starting from the unmarked locations only, to get the correct count.
I am dealing with large data, for which this does not seem like a good approach. Also, I'm not sure whether it's correct. Any other approaches or ideas would be a big help.
Build a suffix array for the given array.
To count all repeating subarrays of a given length, walk through this suffix array, comparing neighboring suffixes over the needed prefix length.
For your first example
source array
1,2,2,3,2,1,2,2,2,2
suffix array is
5,0,9,4,8,7,6,1,2,3:
1,2,2,2,2 (5)
1,2,2,3,2,1,2,2,2,2 (0)
2 (9)
2,1,2,2,2,2 (4)
2,2 (8)
2,2,2 (7)
2,2,2,2 (6)
2,2,3,2,1,2,2,2,2 (1)
2,3,2,1,2,2,2,2 (2)
3,2,1,2,2,2,2 (3)
With length 2 we can count two subarrays 1,2 and four subarrays 2,2
If you want to count any given subarray, for example all suffixes beginning with (1,2), just use binary search to get the first and the last indexes (like the std::upper_bound and std::lower_bound operations in the C++ STL).
For the same example, the indexes of the first and last occurrences of (1,2) in the suffix array are 0 and 1, so the count is last-first+1 = 2.
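A minimal Python sketch of both operations, using a naive suffix array built by sorting suffix slices (fine for illustration; a large input would need a proper suffix-array construction). The function names are hypothetical, and Python's `bisect` module stands in for the C++ binary searches.

```python
from bisect import bisect_left, bisect_right


def build_suffix_array(arr):
    """Naive construction: sort the start indexes by the suffix they begin."""
    return sorted(range(len(arr)), key=lambda i: arr[i:])


def repeated_subarrays(arr, suffixes, k):
    """Walk the suffix array and count length-k subarrays seen more than once."""
    counts = {}
    previous = None
    for i in suffixes:
        if i + k > len(arr):
            continue  # this suffix is shorter than k
        current = tuple(arr[i:i + k])
        if current == previous:
            counts[current] = counts.get(current, 1) + 1
        previous = current
    return counts


def count_subarray(arr, suffixes, pattern):
    """Binary search for the range of suffixes that start with `pattern`."""
    k = len(pattern)
    # Truncating sorted suffixes to length k keeps them in sorted order.
    prefixes = [tuple(arr[i:i + k]) for i in suffixes]
    target = tuple(pattern)
    return bisect_right(prefixes, target) - bisect_left(prefixes, target)


source = [1, 2, 2, 3, 2, 1, 2, 2, 2, 2]
sa = build_suffix_array(source)
print(sa)
print(repeated_subarrays(source, sa, 2))
print(count_subarray(source, sa, [1, 2]))
```

This only reproduces the raw occurrence counts from the answer; the subtraction of longer "parent" occurrences described in the question is not handled here.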

Finding all possible groups of two numbers in an array

I have an input array, and I have to find all possible groups of 2 numbers (a, b) that satisfy the condition a%b = k, where a is to the left of b in the array. Both k and the array are inputs. I am currently doing it in O(n^2), simply by using two nested loops to find such numbers. Can I do better?
For example:
7 3 1 and `k = 1`
Then (7,3) forms one such group, and I have to find all such groups.
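For reference, the O(n^2) baseline described above might look like the following Python sketch. It assumes positive integers (the sign of `%` for negative operands differs between languages) and skips b = 0 to avoid division by zero; the name `pairs_with_remainder` is hypothetical. It does not answer the question of whether a faster algorithm exists.

```python
def pairs_with_remainder(values, k):
    """Return all (a, b) with a to the left of b in `values` and a % b == k."""
    pairs = []
    for i, a in enumerate(values):
        for b in values[i + 1:]:
            if b != 0 and a % b == k:
                pairs.append((a, b))
    return pairs


print(pairs_with_remainder([7, 3, 1], 1))
```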

Minimum number of adjacent swaps

This is one of the questions from an online written test.
Books numbered from (1...N) have arrived at a warehouse.
The books are said to be best arranged if a book "i" is present only to the left of book "i+1" (for all i, 1 <= i <= N-1) and book N is present to the left of book 1. [Yes! Any cyclic sorted sequence is a best arrangement]
Books received are in a random order. Now your task is to find out the minimal number of moves required to achieve the best arrangement described above.
Note that the only valid move is to choose a pair of adjacent books and have them switch places.
For Example if the books were initially in the order 3 5 4 2 1
Solution can be
a. First swap the second pair of books: { result : 3 4 5 2 1 }
b. Swap the rightmost pair: { result : 3 4 5 1 2 }
So, in 2 moves we achieve the best arrangement.
I tried but was not able to find a solution for this. First I thought that I would divide the array into two arrays and then apply insertion sort to both, but that does not work either.
Please help me find an algorithm for this question.
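N,1 can be anywhere in the sequence; e.g. for 1..5 the best arrangement could be 3,4,5,1,2. So the first book could be any of 1..5, i.e. 5 times as complicated as the previous question. So you'll have to do it 5 times, once per possible first book. Use a sort algorithm that has a replaceable compare function.
So for the 3rd test (target order 3,4,5,1,2) the compare would be:
// N is the number of books.
// Returns <0, 0 or >0 when a belongs before, with, or after b.
int compare(int a, int b) {
    return ((a + N - 3) % N) - ((b + N - 3) % N);
}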
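A minimal Python sketch of the "try every starting book" idea. Rather than running an actual sort with a custom compare, it relies on the standard fact that the minimum number of adjacent swaps needed to sort a sequence equals its number of inversions, and counts inversions with a merge sort. That substitution, and the names `count_inversions` and `min_adjacent_swaps`, are assumptions of this sketch rather than part of the answer above.

```python
def count_inversions(seq):
    """Return (number of inversions, sorted copy) using merge sort."""
    if len(seq) < 2:
        return 0, list(seq)
    mid = len(seq) // 2
    left_inv, left = count_inversions(seq[:mid])
    right_inv, right = count_inversions(seq[mid:])
    merged, inversions = [], left_inv + right_inv
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
            # right[j] jumps over every element still waiting in `left`.
            inversions += len(left) - i
    merged.extend(left[i:])
    merged.extend(right[j:])
    return inversions, merged


def min_adjacent_swaps(books):
    """Try each book as the first one of the cyclic order; keep the best."""
    n = len(books)
    return min(
        count_inversions([(b - start) % n for b in books])[0]
        for start in range(1, n + 1)
    )


print(min_adjacent_swaps([3, 5, 4, 2, 1]))
```

For the example from the question the best rotation is 3,4,5,1,2, which needs 2 swaps. Overall cost is O(N^2 log N): N rotations, each with an O(N log N) inversion count.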

Resources