Find the number of natural numbers generated from an array of digit characters - arrays

Good day.
I have an array of digit characters, e.g. ['9','0'] or ['9','5','2','0','0','0']. I need to find the number of all natural numbers whose length equals the array size and that can be generated from the source array. For example, for ['9','0'] the only such number is 90, so the answer is 1.
If the array has no 0 and no duplicated digits, the count is simply the factorial:
['5','7','2'] => 3! => 6
['1','2','3','4','5','6','7'] => 7! => 5040.
When zeros and duplicates appear, this changes.
More examples: https://www.codewars.com/kumite/5a26eb9ab6486ae2680000fe?sel=5a26eb9ab6486ae2680000fe
Thank you.
P.S. A formula is preferred; I already know how to solve this problem with loops:
def g(a)
  answer = a.permutation(a.size)
            .select { |x| x.join.to_i.to_s.split("").size == a.size } # drops permutations with a leading zero
            .to_a.uniq.size
  answer
end

The only difference '0' makes is that you can't have a leading '0', i.e. the first digit cannot be 0.
For an array of N digits the formula becomes
(N - number of zeros) * (N-1)!
When there is no zero, it is just N!.
Now consider duplication: let's say there are K '1's in the array. For every permutation counted above, the K '1's can be swapped among themselves in K! ways without changing the number, so you need to divide the result by K!. This has to be done for every digit that is duplicated. When a digit occurs 0 or 1 times, you are dividing by 0! or 1!, so the division does not change the value.
Sample case: [0, 0, 1, 1]
4 digits, 2 zeros, 2 ones
(4-2) * 3! / (2! * 2!) = 3
Possible permutations: 1001, 1010, 1100
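A minimal Python sketch of this formula (the function name is mine, not from the original post): count the zeros and each digit's multiplicity, then compute (N - zeros) * (N-1)! divided by the product of the multiplicities' factorials.

from math import factorial
from collections import Counter

def count_numbers(digits):
    # digits: list of digit characters, e.g. ['9','5','2','0','0','0']
    n = len(digits)
    counts = Counter(digits)
    zeros = counts.get('0', 0)
    denom = 1
    for c in counts.values():          # one factor per duplicated digit
        denom *= factorial(c)
    return (n - zeros) * factorial(n - 1) // denom

# count_numbers(['9','0'])         -> 1
# count_numbers(['5','7','2'])     -> 6
# count_numbers(['0','0','1','1']) -> 3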

Related

Maximum of minimum difference in a subsequence of size k

Given a sorted sequence of n elements, find the maximum over all length-k subsequences of the minimum difference taken over all pairs within the subsequence.
Here 1 <= n <= 10^5
and 2 <= k <= n.
For eg: [2, 3, 5, 9] and k = 3
there are 4 subsequences:
[2, 3, 5] - min diff of all pairs = 1
[2, 3, 9] - min diff of all pairs = 1
[3, 5, 9] - min diff of all pairs = 2
[2, 5, 9] - min diff of all pairs = 3
So answer is max of all min diffs = 3
The naive way is to generate all k-length subsequences, find the minimum pairwise difference in each, and then take the maximum of those, but that will time out because of the constraints.
Apart from that, my idea was to find the subsequence whose elements are spread out as evenly as possible, so that the minimum difference becomes maximal.
Can someone give an optimal and better solution?
Suppose that your sequence of integers is a[i]. Then my solution will find the answer in time O(n log((a[n-1]-a[0])/n)). If your sequence is floating point numbers it will likely run in similar time, but could theoretically be as bad as O(n^3).
The key observation is this. It is easy to construct the most compact sequence starting at the first element whose minimum gap is at least m. Just take the first element, and take each other one when it is at least m bigger than the last one that you took. So we can write a function that constructs this sequence and tracks 3 numbers:
How many elements we got
The size of the smallest gap that we found.
The next smallest m that would result in a more compact sequence. That is, the largest gap to an element that we didn't include.
In the case of your sequence if we did this with a gap of 2 we'd find that we took 3 elements, the smallest gap is 3, and we'd get a different sequence if we had looked for a gap of 1.
This is enough information to construct a binary search for the desired gap length. With the key logic looking like this:
lower_bound = 0
upper_bound = (a[n-1] - a[0]) / (k-1)
while lower_bound < upper_bound:
    # Whether int or float, we want float to land "between" guesses
    guess = (lower_bound + upper_bound + 0.0) / 2
    (size, gap_found, gap_different) = min_gap_stats(a, guess)
    if size < k:
        # We must pick a more compact sequence
        upper_bound = gap_different
    else:
        # We can get this big a gap, but maybe we can get bigger?
        lower_bound = gap_found
If we ran this for your sequence we'd first set a lower_bound of 0 and an upper_bound of 7/2 = 3 (thanks to integer division). And we'd immediately find the answer.
If you had a sequence of floats with the same values it would take longer. We'd first try 3.5, and get a sequence of 2 with a different decision at 3. We'd then try 1.5, and find our sequence of 3 with the gap that we want.
The binary search will usually make this take a logarithmic number of passes.
However each time we set either the upper or lower bound to the size of an actual pairwise gap. Since there are only O(n^2) gaps, we are guaranteed to need no more than that many passes.
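A runnable Python sketch of this idea (min_gap_stats and max_min_gap are my names; the original answer only gives the loop above): the helper greedily builds the densest subsequence whose consecutive gaps are all at least m and reports the three statistics described above.

def min_gap_stats(a, m):
    # Greedy: take a[0], then every element at least m bigger than the last taken.
    last, size = a[0], 1
    gap_found = float("inf")   # smallest gap between elements we took
    gap_different = 0          # largest gap to an element we skipped
    for x in a[1:]:
        gap = x - last
        if gap >= m:
            size += 1
            gap_found = min(gap_found, gap)
            last = x
        else:
            gap_different = max(gap_different, gap)
    return size, gap_found, gap_different

def max_min_gap(a, k):
    lower, upper = 0, (a[-1] - a[0]) / (k - 1)
    while lower < upper:
        guess = (lower + upper) / 2
        size, gap_found, gap_different = min_gap_stats(a, guess)
        if size < k:
            upper = gap_different   # infeasible: we need a smaller gap
        else:
            lower = gap_found       # feasible: this gap is actually achievable
    return lower

# max_min_gap([2, 3, 5, 9], 3) -> 3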

Determine the adjacency of two Fibonacci numbers

I have many Fibonacci numbers. If I want to determine whether two Fibonacci numbers are adjacent or not, one basic approach is as follows:
Get the index of the first fibonacci number, say i1
Get the index of the second fibonacci number, say i2
Get the absolute value of i1-i2, that is |i1-i2|
If the value is 1, then return true.
else return false.
In the first and second steps, it may take many comparisons to get the correct index by searching an array.
In the third step, it needs one subtraction and one absolute-value operation.
I want to know whether there exists another approach to quickly determine the adjacency of two Fibonacci numbers.
I don't mind whether this question is solved mathematically or by any hacking technique.
If anyone has an idea, please let me know. Thanks a lot!
No need to find the index of either number.
Given that the two numbers belong to the Fibonacci series: if their difference is greater than the smaller of the two, they are not adjacent. Otherwise they are.
This is because the Fibonacci series follows the rule
F(n) = F(n-1) + F(n-2) where F(n) > F(n-1) > F(n-2).
So F(n) - F(n-1) = F(n-2),
and F(n-2) < F(n-1), i.e. the difference of adjacent terms is smaller than the smaller of the two.
The difference between two adjacent Fibonacci numbers will always be less than the smaller of the two.
NOTE: This only holds if both numbers belong to the Fibonacci series.
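A minimal Python sketch of this check (it assumes both inputs really are Fibonacci numbers, per the NOTE above; the very smallest values around 1 and 2 may still deserve special-casing, as noted further down for the index-based approach):

def adjacent_fibs(x, y):
    # Both arguments are assumed to be Fibonacci numbers.
    # Not adjacent iff their difference exceeds the smaller of the two.
    return abs(x - y) <= min(x, y)

# adjacent_fibs(13, 21) -> True
# adjacent_fibs(13, 34) -> False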
Simply calculate the difference between them. If it is smaller than the smaller of the 2 numbers, they are adjacent; if it is bigger, they are not.
Each triplet in the Fibonacci sequence a, b, c conforms to the rule
c = a + b
So for every pair of adjacent Fibonaccis (x, y), the difference between them (y-x) is equal to the value of the previous Fibonacci, which of course must be less than x.
If 2 Fibonaccis, say (x, z) are not adjacent, then their difference must be greater than the smaller of the two. At minimum, (if they are one Fibonacci apart) the difference would be equal to the Fibonacci between them, (which is of course greater than the smaller of the two numbers).
Since for (a, b, c, d) we have c = a + b and d = b + c,
it follows that d - b = (b + c) - b = c.
By Binet's formula, the nth Fibonacci number is approximately phi**n/sqrt(5), where phi is the golden ratio. You can use base-phi logarithms to recover the index easily:
from math import log, sqrt

def fibs(n):
    nums = [1, 1]
    for i in range(n - 2):
        nums.append(sum(nums[-2:]))
    return nums

phi = (1 + sqrt(5)) / 2

def fibIndex(f):
    return round(log(sqrt(5) * f, phi))
To test this:
for f in fibs(20): print(fibIndex(f),f)
Output:
2 1
2 1
3 2
4 3
5 5
6 8
7 13
8 21
9 34
10 55
11 89
12 144
13 233
14 377
15 610
16 987
17 1597
18 2584
19 4181
20 6765
Of course,
def adjacentFibs(f, g):
    return abs(fibIndex(f) - fibIndex(g)) == 1
This fails with 1,1 -- but there is little point in adding special-case logic for such an edge case. Add it in if you want.
At some stage, floating-point round-off error will become an issue. For that, you would need to replace math.log by an integer log algorithm (e.g. one which involves binary search).
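For illustration, here is a hedged sketch of an exact, integer-only replacement for fibIndex (my own naming, not from the answer): it simply walks the sequence, which is O(index); a precomputed table plus binary search would give the logarithmic variant hinted at above.

def fibIndexExact(f):
    # Exact integer version: generate Fibonacci numbers until we reach f.
    # Uses the same indexing convention as fibIndex above (fibIndexExact(1) == 2).
    a, b, i = 1, 1, 2
    while b < f:
        a, b = b, a + b
        i += 1
    return i  # assumes f really is a Fibonacci number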
On Edit:
I concentrated on the question of how to recover the index (and I will keep the answer, since that is an interesting problem in its own right). But as @LeandroCaniglia points out in their excellent comment, this is overkill if all you want to do is check whether two Fibonacci numbers are adjacent: another consequence of Binet's formula is that sufficiently large adjacent Fibonacci numbers have a ratio which differs from phi by a negligible amount. You could do something like:
def adjFibs(f, g):
    f, g = min(f, g), max(f, g)
    if g <= 34:
        return adjacentFibs(f, g)
    else:
        return abs(g/f - phi) < 0.01
This assumes that they are indeed Fibonacci numbers. The index-based approach can be used to verify that they are (calculate the index and then use the full-fledged Binet's formula with that index).

How to find the nth binary permutation?

This is the last missing piece I need to complete my compression algorithm (a new one). Let's say I have 4 bits with 2 bits set to 1: 0011. The total number of permutations of this pattern is 0011, 0101, 0110, 1001, 1010, 1100, i.e. 6 cases. This can be computed as:
4! / ((2!)(4-2)!) = 6
Now I want to be able to find the nth sequence: the 1st number is 0011, the second number is 0101. So if I say n=5, I want to get the 5th permutation, 1010, from the initial 0011. How do I do this?
If there are only two 1s in the binary pattern, it's not too difficult.
When the highest 1 bit is located at position x, the number of permutations with that highest bit is x.
So the highest bit position is the smallest a (counting from 0) such that a*(a+1)/2 >= n. You can easily find a with an O(n) loop.
Then the lowest bit position is a*(a+1)/2 - n (counting from 0).
For example, when n is 5, the smallest such a is 3, and the lowest bit position is 1, so the answer is 1010.
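A small Python sketch of this formula (the function name is mine):

def nth_two_bit_pattern(n):
    # Smallest a (0-based) with a*(a+1)/2 >= n gives the high bit position.
    a = 1
    while a * (a + 1) // 2 < n:
        a += 1
    low = a * (a + 1) // 2 - n          # position of the low bit
    return (1 << a) | (1 << low)

# nth_two_bit_pattern(1) -> 0b0011
# nth_two_bit_pattern(5) -> 0b1010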

Smallest number that cannot be formed from sum of numbers from array

This problem was asked to me in Amazon interview -
Given an array of positive integers, you have to find the smallest positive integer that cannot be formed from a sum of numbers from the array.
Example:
Array:[4 13 2 3 1]
result= 11 { Since 11 was smallest positive number which can not be formed from the given array elements }
What I did was:
sorted the array
calculated the prefix sums
traversed the prefix sums and checked whether the next element is at most 1 greater than the sum so far, i.e. A[j] <= sum + 1. If not, the answer is sum + 1.
But this is an O(n log n) solution.
Interviewer was not satisfied with this and asked a solution in less than O(n log n) time.
There's a beautiful algorithm for solving this problem in time O(n + Sort), where Sort is the amount of time required to sort the input array.
The idea behind the algorithm is to sort the array and then ask the following question: what is the smallest positive integer you cannot make using the first k elements of the array? You then scan forward through the array from left to right, updating your answer to this question, until you find the smallest number you can't make.
Here's how it works. Initially, the smallest number you can't make is 1. Then, going from left to right, do the following:
If the current number is bigger than the smallest number you can't make so far, then you know the smallest number you can't make - it's the one you've got recorded, and you're done.
Otherwise, the current number is less than or equal to the smallest number you can't make. The claim is that you can indeed make this number. Right now, you know the smallest number you can't make with the first k elements of the array (call it candidate) and are now looking at value A[k]. The number candidate - A[k] therefore must be some number that you can indeed make with the first k elements of the array, since otherwise candidate - A[k] would be a smaller number than the smallest number you allegedly can't make with the first k numbers in the array. Moreover, you can make any number in the range candidate to candidate + A[k], inclusive, because you can start with any number in the range from 1 to A[k], inclusive, and then add candidate - 1 to it. Therefore, set candidate to candidate + A[k] and increment k.
In pseudocode:
Sort(A)
candidate = 1
for i from 1 to length(A):
    if A[i] > candidate: return candidate
    else: candidate = candidate + A[i]
return candidate
Here's a test run on [4, 13, 2, 1, 3]. Sort the array to get [1, 2, 3, 4, 13]. Then, set candidate to 1. We then do the following:
A[1] = 1, candidate = 1:
A[1] ≤ candidate, so set candidate = candidate + A[1] = 2
A[2] = 2, candidate = 2:
A[2] ≤ candidate, so set candidate = candidate + A[2] = 4
A[3] = 3, candidate = 4:
A[3] ≤ candidate, so set candidate = candidate + A[3] = 7
A[4] = 4, candidate = 7:
A[4] ≤ candidate, so set candidate = candidate + A[4] = 11
A[5] = 13, candidate = 11:
A[5] > candidate, so return candidate (11).
So the answer is 11.
The runtime here is O(n + Sort) because outside of sorting, the runtime is O(n). You can clearly sort in O(n log n) time using heapsort, and if you know some upper bound on the numbers you can sort in time O(n log U) (where U is the maximum possible number) by using radix sort. If U is a fixed constant (say, 10^9), then radix sort runs in time O(n) and this entire algorithm then runs in time O(n) as well.
Hope this helps!
Use bitvectors to accomplish this in linear time.
Start with an empty bitvector b. Then for each element k in your array, do this:
b = b | b << k | 2^(k-1)
To be clear, bit i of b is set to 1 to represent the number i (counting from 1), and OR-ing with 2^(k-1) sets bit k to 1.
After you finish processing the array, the index of the first zero in b is your answer (counting from the right, starting at 1).
b=0
process 4: b = b | b<<4 | 1000 = 1000
process 13: b = b | b<<13 | 1000000000000 = 10001000000001000
process 2: b = b | b<<2 | 10 = 1010101000000101010
process 3: b = b | b<<3 | 100 = 1011111101000101111110
process 1: b = b | b<<1 | 1 = 11111111111001111111111
First zero: position 11.
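A Python sketch of this bitvector approach (using an arbitrary-precision int as the bitvector; names are mine):

def smallest_unformable(nums):
    b = 0
    for k in nums:
        # shift in all sums that now include k, plus k itself (bit k-1 represents k)
        b = b | (b << k) | (1 << (k - 1))
    i = 1
    while b & (1 << (i - 1)):   # find the first zero bit, counting from 1
        i += 1
    return i

# smallest_unformable([4, 13, 2, 3, 1]) -> 11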
Consider all integers in the interval [2^i .. 2^(i+1) - 1]. And suppose all integers below 2^i can be formed from a sum of numbers from the given array. Also suppose that we already know C, which is the sum of all numbers below 2^i. If C >= 2^(i+1) - 1, every number in this interval may be represented as a sum of the given numbers. Otherwise we could check if the interval [2^i .. C + 1] contains any number from the given array. And if there is no such number, C + 1 is what we searched for.
Here is a sketch of an algorithm:
1. For each input number, determine which interval it belongs to, and update the corresponding sum: S[int_log(x)] += x.
2. Compute the prefix sums of array S: foreach i: C[i] = C[i-1] + S[i].
3. Filter array C to keep only entries with values lower than the next power of 2.
4. Scan the input array once more and note which of the intervals [2^i .. C + 1] contain at least one input number: i = int_log(x) - 1; B[i] |= (x <= C[i] + 1).
5. Find the first interval that is not filtered out in step #3 and whose corresponding element of B[] is not set in step #4.
If it is not obvious why we can apply step 3, here is the proof. Choose any number between 2^i and C, then sequentially subtract from it all the numbers below 2^i in decreasing order. Eventually we get either some number less than the last subtracted number, or zero. If the result is zero, just add together all the subtracted numbers and we have the representation of the chosen number. If the result is non-zero and less than the last subtracted number, this result is also less than 2^i, so it is "representable" and none of the subtracted numbers are used for its representation. When we add these subtracted numbers back, we have the representation of the chosen number. This also suggests that instead of filtering intervals one by one we could skip several intervals at once by jumping directly to int_log of C.
Time complexity is determined by function int_log(), which is integer logarithm or index of the highest set bit in the number. If our instruction set contains integer logarithm or any its equivalent (count leading zeros, or tricks with floating point numbers), then complexity is O(n). Otherwise we could use some bit hacking to implement int_log() in O(log log U) and obtain O(n * log log U) time complexity. (Here U is largest number in the array).
If step 1 (in addition to updating the sum) also updates the minimum value in the given range, step 4 is not needed anymore. We could just compare C[i] to Min[i+1]. This means we need only a single pass over the input array. Or we could apply this algorithm not to an array but to a stream of numbers.
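A Python sketch of this single-pass variant (my own code, not the answer's; int_log here just uses Python's bit_length, and Min[] becomes a per-interval lo[] array):

def smallest_unreachable(nums):
    def int_log(x):
        return x.bit_length() - 1          # index of the highest set bit

    k = max(int_log(x) for x in nums)
    S = [0] * (k + 1)                      # S[i]: sum of inputs in interval i
    lo = [None] * (k + 1)                  # lo[i]: minimum input in interval i
    for x in nums:
        i = int_log(x)
        S[i] += x
        if lo[i] is None or x < lo[i]:
            lo[i] = x

    C = 0                                  # sum of all inputs below interval i
    for i in range(k + 1):                 # interval i is [2^i .. 2^(i+1) - 1]
        if lo[i] is None or lo[i] > C + 1:
            return C + 1                   # nothing in [2^i .. C+1]: C+1 is unreachable
        C += S[i]                          # otherwise every value up to C+S[i] is reachable
    return C + 1

# smallest_unreachable([4, 13, 2, 3, 1]) -> 11
# smallest_unreachable([1, 2, 3, 9])     -> 7
# smallest_unreachable([1, 1, 2, 9])     -> 5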
Several examples:
Input:         [ 4 13  2  3  1]   [ 1  2  3  9]   [ 1  1  2  9]
int_log(x):      2  3  1  1  0      0  1  1  3      0  0  1  3
interval i:      0  1  2  3         0  1  2  3      0  1  2  3
S[i]:            1  5  4 13         1  5  0  9      2  2  0  9
C[i]:            1  6 10 23         1  6  6 15      2  4  4 13
filtered(C):     n  n  n  n         n  n  n  n      n  n  n  n
number in
[2^i..C+1]:         2  4  -            2  -  -         2  -  -
C+1:                      11                  7               5
For multi-precision input numbers this approach needs O(n * log M) time and O(log M) space. Where M is largest number in the array. The same time is needed just to read all the numbers (and in the worst case we need every bit of them).
Still this result may be improved to O(n * log R) where R is the value found by this algorithm (actually, the output-sensitive variant of it). The only modification needed for this optimization is instead of processing whole numbers at once, process them digit-by-digit: first pass processes the low order bits of each number (like bits 0..63), second pass - next bits (like 64..127), etc. We could ignore all higher-order bits after result is found. Also this decreases space requirements to O(K) numbers, where K is number of bits in machine word.
If you sort the array, it will work. Counting sort could do it in O(n), but in a practically large scenario the value range can be pretty high.
Quicksort's O(n log n) will do the work for you:
def smallestPositiveInteger(array):
    candidate = 1
    n = len(array)
    array = sorted(array)
    for i in range(0, n):
        if array[i] <= candidate:
            candidate += array[i]
        else:
            break
    return candidate

Find the missing number in the range {0 ... 2^k - 1}

Given an array that has the numbers {0 ... 2^k - 1} except for one number,
find a good algorithm that finds the missing number.
Please notice, you can only use:
for A[i], return the value of bit j.
swap A[i] with A[j].
My answer: use divide & conquer. Check bit K of all the numbers; if bit K (now we're on the LSB) is 0 then move the number to the left side, if bit K is 1 then move the number to the right side.
After the 1st iteration we'd have two groups, where one of them is bigger than the other, so we continue to do the same thing with the smaller group, and I think that I need to check bit K-1 this time.
But for some reason I tried it with 8 numbers, from 0...7, and removed 4 (say I want to find out that 4 is the missing number), however the algorithm didn't work out so well. So where is my mistake?
I assume you can build an XOR of two numbers using the "get bit j" operation.
The answer will be the XOR of all the numbers.
PROOF: a xor (2^k-1-a) = 2^k-1 (a and (2^k-1-a) will have different bits in first k positions).
Then 0 xor 1 xor ... xor 2^k-1 = (0 xor 2^k-1) xor (1 xor 2^k-2).... (2^(k-1) pairs) = 0.
if number n is missing the result will be n, because 0 xor 1 xor 2....xor n-1 xor n+1 xor ... = 0 xor 1 xor 2....xor n-1 xor n+1 xor ... xor n xor n = 0 xor n = n
EDIT: This will not work if k = 1.
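A Python sketch of the XOR idea under the problem's constraints (get_bit is a hypothetical primitive standing in for "for A[i] return the value of bit j"):

def find_missing_xor(A, k, get_bit):
    # A holds 2**k - 1 of the values 0 .. 2**k - 1; get_bit(A, i, j) returns bit j of A[i].
    missing = 0
    for i in range(len(A)):
        for j in range(k):
            if get_bit(A, i, j):
                missing ^= 1 << j      # XOR the numbers together bit by bit
    return missing                     # for k > 1 this is the missing number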
Ron,
your solution is correct. This problem smells Quicksort, doesn't it ?
What you do with the Kth bit (all 0's to the left, 1's to the right) is called a partition - you need to find the misplaced elements in pairs and swap them. It's the process used in Hoare's Selection and in Quicksort, with special element classification - no need to use a pivot element.
You forgot to tell in the problem statement how many elements there are in the array (2^k-2 or more), i.e. if repetitions are allowed.
If repetitions are not allowed, every partition will indeed be imbalanced by one element. The algorithm to use is an instance of Hoare's Selection (only partition the smaller half). At every partition stage, the number of elements to be considered is halved, hence O(N) running time. This is optimal since every element needs to be known before the solution can be found.
[If repetitions are allowed, use modified Quicksort (recursively partition both halves) until you arrive at an empty half. The running time is probably O(N Lg(N)) then, but this needs to be checked.]
You say that the algorithm failed on your test case: you probably mis-implemented some detail.
An example:
Start with
5132670 (this is range {0..7})
After partitioning on bit weight=4 you get
0132|675
where the shortest half is
675 (this is range {4..7})
After partitioning on bit weight=2, you get
5|67
where the shortest half is
5 (this is range {4..5})
After partitioning on bit weight=1, you get
|5
where the shortest half is empty (this is range {4}).
Done.
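A Python sketch of this partition-based selection (my own code; it uses ordinary indexing and swaps rather than the restricted get-bit/swap interface, but follows the same steps as the trace above):

def find_missing_by_partition(A, k):
    lo, hi = 0, len(A)                     # current sub-array A[lo:hi]
    missing = 0
    for bit in range(k - 1, -1, -1):       # highest bit weight first
        i = lo
        for j in range(lo, hi):
            if not (A[j] >> bit) & 1:      # 0-bits go to the left block
                A[i], A[j] = A[j], A[i]
                i += 1
        if i - lo <= hi - i:
            hi = i                         # zero half is short: missing bit is 0
        else:
            missing |= 1 << bit            # one half is short: missing bit is 1
            lo = i
    return missing

# find_missing_by_partition([5, 1, 3, 2, 6, 7, 0], 3) -> 4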
Just add them all and subtract the result from n*(n+1)/2, where n is 2^k - 1.
n*(n+1)/2 is the sum of all numbers 1...n. If one of them is missing, then the sum of the remaining n-1 numbers will be n*(n+1)/2 - missingNumber.
So your answer is: missingNumber = n*(n+1)/2 - (sum of the array), where n is 2^k - 1.
Given the fact that for a given bit position j, there are exactly 2^(k-1) numbers which have it set to 0, and 2^(k-1) which have it set to 1, use the following algorithm:
start with an array B of k booleans
initialise the array to false everywhere
for each number A[i]
    for each position j
        get the value v of bit j
        if v is 1, invert the boolean at position j
    end for
end for
If a position is false at the end then the missing number has a zero at this position, otherwise it has a one (for k > 1; if k = 1 then it is the inverse).
Now to implement your array of booleans within the allowed operations, create an array of size 2k, where the lower k elements are set to 0 and the upper k are set to 1. Then
invert the boolean at position j
is simply
swap B[j] with B[j+k].
With this representation the missing number is given by the lower k elements of the array B. Well, this algorithm is still O(k*2^k), but you can say it is O(n*log(n)) in the size of the input.
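A compact Python sketch of this parity trick (using an ordinary boolean list rather than the doubled swap-array; names are mine):

def find_missing_by_parity(A, k):
    # For k > 1, every bit position has an even count (2**(k-1)) of ones over the
    # full range, so the parity of the ones actually seen equals that bit of the
    # missing number.
    B = [False] * k
    for x in A:
        for j in range(k):
            if (x >> j) & 1:        # "get the value of bit j"
                B[j] = not B[j]     # invert the boolean at position j
    return sum(1 << j for j in range(k) if B[j])

# find_missing_by_parity([5, 1, 3, 2, 6, 7, 0], 3) -> 4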
You can consider the elements as strings of k bits. At each step i, if the number of ones (or zeros) in position i is 2^(k-i), you remove all those strings and continue. For example:
100 111 010
101 110 000 011
so
100 111 101 110 will all be removed (their leading bit is 1 and there are 2^(k-1) of them),
and among 010 000 011, 010 and 011 will be removed because their second bit is 1.
000 remains and its rightmost bit is zero, so 001 is the missing number.

Resources