Rank and unrank fibonacci bitsequence with k ones - arrays

For positive integers n and k, let a "k-fibonacci-bitsequence of n" be a bitsequence with k 1 where the 1 on index i describe not Math.pow(2,i) but Fibonacci(i). These positive integers that add up to n, and let the "rank" of a given k- fibonnaci-bitsequence of n be its position in the sorted list of all of these fibonacci-bitsequences in lexicographic order, starting at 0.
For example, for the number 39 we have following valid k-fibonacci-bitsequences, k <=4. The fibonacci numbers behind the fibonacci-bitsequence in this example are following:
34 21 13 8 5 3 2 1
10001000 k = 2 rank = 0
01101000 k = 3 rank = 0
10000110 k = 3 rank = 1
01101100 k = 4 rank = 0
So, I want to be able to do two things:
Given n, k, and a k-fibonacci-bitsequence of n, I want to find the rank of that k-fibonacci-bitsequence of n.
Given n, k, and a rank, I want to find the k-fibonacci-bitsequence of n with that rank.
Can I do this without having to compute all the k-fibonacci-bitsequences of n that come before the one of interest?

Preliminaries
For brevity lets say »k-fbs of n« instead of »k-fibonacci-bitsequences of n«.
Question
Can I do this without having to compute all the k-fbs of n that come before the one of interest?
I'm not sure. So far I still have to compute some of fbs. However, you might have thought we had to start from 00…0 and count up – this is not the case. We can do it the other way around and start from the highest fbs and work our way down very efficiently.
This is not a complete answer. However, there are some observations that could help you:
Zeckendorf
In the following pseudo-code we use the data-type fbs which is basically an array of bools. We can read and write individual bits using mySeq[i] where bit i represents the Fibonacci number fib(i). Just as in your question, the bits myFbs[0] and myFbs[1] do not exist. All bits are initialized to 0 by default. An fbs can be used without [] to read the represented number (n). The helper function #(fbs) returns the number of set bits (k) inside an fbs. Example for n = 7:
fbs meaning representation helper functions
1 0 1 0
| | | `— 0·fib(2) = 0·1 ——— myFbs[2] = 0 #(myFbs) == 2
| | `——— 1·fib(3) = 1·2 ——— myFbs[3] = 1 myFbs == 7
| `————— 0·fib(4) = 0·3 ——— myFbs[4] = 0
`——————— 1·fib(5) = 1·5 ——— myFbs[5] = 1
For any given n we can easily compute the lexicographical maximum (across all k) fbs of n as this fbs happends to be the Zeckendorf representation of n.
function zeckendorf(int n) returns (fbs z):
1 int i := any (ideally the smallest) number such that fib(start) > n
2 while n-z > 0
3 | if fib(i) < n
4 | | z[i] := 1
5 | i := i - 1
zeckendorf(n) is unique and the only fbs of n with k=#(zeckendorf(n)). Therefore zeckendorf(n) has rank=0. Also, there exists no k'-fbs of n with k'<#(zeckendorf(n)).
Transformation
Any k-fbs of n can be transformed into a (k+1)-fbs of n by replacing the bit sequence 100 by 011 anywhere inside the fbs. This works because fib(i)=fib(i-1)+fib(i-2).
If our input k-fbs of n has rank=0 and we replace the right-most 100 then our resulting (k+1)-fbs of n also has rank=0. If we replace the second-right-most 100 our resulting (k+1)-fbs has rank=1 and so on.
You should be able answer both of your questions using repeated transformations starting at zeckendorf(n). For the first question it might even be sufficient to only look at the k-stable transformations 011…100→100…011 and 100…011→011…100 of the given fbs (think about what these transformations do to the rank).

Related

binomial coefficient for very high numbers in c

So the task I have to solve is to calculate the binomial coefficient for 100>=n>k>=1 and then say how many solutions for n and k are over an under barrier of 123456789.
I have no problem in my formula of calculating the binomial coefficient but for high numbers n & k -> 100 the datatypes of c get to small to calculated this.
Do you have any suggestions how I can bypass this problem with overflowing the datatypes.
I thought about dividing by the under barrier straight away so the numbers don't get too big in the first place and I have to just check if the result is >=1 but i couldn't make it work.
Say your task is to determine how many binomial coefficients C(n, k) for 1 ≤ k < n ≤ 8 exceed a limit of m = 18. You can do this by using the recurrence C(n, k) = C(n − 1, k) + C(n − 1, k − 1) that can visualized in Pascal's triangle.
1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 (20) 15 6 1
1 7 (21 35 35 21) 7 1
1 8 (28 56 70 56 28) 8 1
Start at the top and work your way down. Up to n = 5, everything is below the limit of 18. On the next line, the 20 exceeds the limit. From now on, more and more coefficients are beyond 18.
The triangle is symmetric and strictly increasing in the first half of each row. You only need to find the first element that exceeds the limit on each line in order to know how many items to count.
You don't have to store the whole triangle. It is enough to keey the last and current line. Alternatively, you can use the algorithm detailed [in this article][ot] to work your way from left to right on each row. Since you just want to count the coefficients that exceed a limit and don't care about their values, the regular integer types should be sufficient.
First, you'll need a type that can handle the result. The larget number you need to handle is C(100,50) = 100,891,344,545,564,193,334,812,497,256. This number requires 97 bits of precision, so your normal data types won't do the trick. A quad precision IEEE float would do the trick if your environment provides it. Otherwise, you'll need some form of high/arbitrary precision library.
Then, to keep the numbers within this size, you'll want cancel common terms in the numerator and the denominator. And you'll want to calculate the result using ( a / c ) * ( b / d ) * ... instead of ( a * b * ... ) / ( c * d * ... ).

Change the minimum number of entries in an array so that the sum of any k consecutive items is even

We are given an array of integers. We have to change the minimum number of those integers however we'd like so that, for some fixed parameter k, the sum of any k consecutive items in the array is even.
Example:
N = 8; K = 3;
A = {1,2,3,4,5,6,7,8}
We can change 3 elements (4th,5th,6th)
so the array can be {1,2,3,5,6,7,7,8}
then
1+2+3=6 is even
2+3+5=10 is even
3+5+6=14 is even
5+6+7=18 is even
6+7+7=20 is even
7+7+8=22 is even
There's a very nice O(n)-time solution to this problem that, at a high level, works like this:
Recognize that determining which items to flip boils down to determining a pattern that repeats across the array of which items to flip.
Use dynamic programming to determine what that pattern is.
Here's how to arrive at this solution.
First, some observations. Since all we care about here is whether the sums are even or odd, we actually don't care about the numbers' exact values. We just care about whether they're even or odd. So let's begin by replacing each number with either 0 (if the number is even) or 1 (if it's odd). Now, our task is to make each window of k elements have an even number of 1s.
Second, the pattern of 0s and 1s that results after you've transformed the array has a surprising shape: it's simply a repeated copy of the first k elements of the array. For example, suppose k = 5 and we decide that the array should start off as 1 0 1 1 1. What must the sixth array element be? Well, in moving from the first window to the second, we dropped a 1 off the front of the window, changing the parity to odd. We therefore have to have the next array element be a 1, which means that the sixth array element must be a 1, equal to the first array element. The seventh array element then has to be a 0, since in moving from the second window to the third we drop off a zero. This process means that whatever we decide on for the first k elements turns out to determine the entire final sequence of values.
This means that we can reframe the problem in the following way: break the original input array of n items into n/k blocks of size k. We're now asked to pick a sequence of 0s and 1s such that
this sequence differs in as few places as possible from the n/k blocks of k items each, and
the sequence has an even number of 1s.
For example, given the input sequence
0 1 1 0 1 1 1 0 0 1 0 1 1 1 0 1 1 1
and k = 3, we would form the blocks
0 1 1, 0 1 1, 1 0 0, 1 0 1, 1 1 0, 1 1 1
and then try to find a pattern of length three with an even number of 1s in it such that replacing each block with that pattern requires the fewest number of edits.
Let's see how to take that problem on. Let's work one bit at a time. For example, we can ask: what's the cost of making the first bit a 0? What's the cost of making the first bit a 1? The cost of making the first bit a 0 is equal to the number of blocks that have a 1 at the front, and the cost of making the first bit a 1 is equal to the number of blocks that have a 0 at the front. We can work out the cost of setting each bit, individually, to either to zero or to one. That gives us a matrix like this one:
Bit #0 Bit #1 Bit #2 Bit #3 ... Bit #k-1
---------------------+--------+--------+--------+--------+--------+----------
Cost of setting to 0 | | | | | | |
Cost of setting to 1 | | | | | | |
We now need to choose a value for each column with the goal of minimizing the total cost picked, subject to the constraint that we pick an even number of bits to be equal to 1. And this is a nice dynamic programming exercise. We consider subproblems of the form
What is the lowest cost you can make out of the first m columns from the table, provided your choice has parity p of items chosen from the bottom row?
We can store this in an (k + 1) × 2 table T[m][p], where, for example, T[3][even] is the lowest cost you can achieve using the first three columns with an even number of items set to 1, and T[6][odd] is the lowest cost you can achieve using the first six columns with an odd number of items set to 1. This gives the following recurrence:
T[0][even] = 0 (using zero columns costs nothing)
T[0][odd] = ∞ (you cannot have an odd number of bits set to 1 if you use no colums)
T[m+1][p] = min(T[m][p] + cost of setting this bit to 0, T[m][!p] + cost of setting this bit to 1) (either use a zero and keep the same parity, or use a 1 and flip the parity).
This can be evaluated in time O(k), and the resulting minimum cost is given by T[n][even]. You can use a standard DP table walk to reconstruct the optimal solution from this point.
Overall, here's the final algorithm:
create a table costs[k+1][2], all initially zero.
/* Populate the costs table. costs[m][0] is the cost of setting bit m
* to 0; costs[m][1] is the cost of setting bit m to 1. We work this
* out by breaking the input into blocks of size k, then seeing, for
* each item within each block, what its parity is. The cost of setting
* that bit to the other parity then increases by one.
*/
for i = 0 to n - 1:
parity = array[i] % 2
costs[i % k][!parity]++ // Cost of changing this entry
/* Do the DP algorithm to find the minimum cost. */
create array T[k + 1][2]
T[0][0] = 0
T[0][1] = infinity
for m from 1 to k:
for p from 0 to 1:
T[m][p] = min(T[m - 1][p] + costs[m - 1][0],
T[m - 1][!p] + costs[m - 1][1])
return T[m][0]
Overall, we do O(n) work with our initial pass to work out the costs of setting each bit, independently, to 0. We then do O(k) work with the DP step at the end. The overall work is therefore O(n + k), and assuming k ≤ n (otherwise the problem is trivial) the cost is O(n).

Number of ways to fill an array of size n , such that the mexium is greater than every element of the array?

We have given an empty array of size n , we need to fill it with natural numbers (we are allowed to repeat).
The condition that must follow is the mex of the array must be greater than all the elements we fill in the array .
Can someone pls help me with the number of ways to do so ?
(Different arrangements of same set of numbers are also considered distinct)
PS:- by mex of a sequence I mean the smallest non negative number that doesn't occur in the sequence
Number of such arrays is equivalent to the number of ordered distributions of values 1..N into buckets (so [A],[B,C] and [B,C][A] are distinct ones). And number of such distributions is described by ordered Bell numbers 1,3,13,75....
Example for N=3
1 1 1 //1 permutation
1 1 2 //3 permutations
1 2 2 //3 permutations
1 2 3 //6 permutations
//13 variants
Generation of distributions themselves for reference. Note that for N values every value might fall into part 1..K, where K is in range 1..N, so numbers of parts corresponding to all values form continuous sequence without holes (cf. your mex)
To calculate number of such distributions, we can use recurrence from Wiki, Python code:
def cnk(n, k):
k = min(k, n - k)
if k <= 0:
return 1 if k == 0 else 0
res = 1
for i in range(k):
res = res * (n - i) // (i + 1)
return res
def orderedbell(n):
a = [0]*(n+1)
a[0] = 1
for m in range(1, n+1):
for i in range(1, m+1):
a[m] += cnk(m, i) * a[m - i]
return a[n]
for i in range(1,10):
print(orderedbell(i))
1
3
13
75
541
4683
47293
545835
7087261

Smallest number that cannot be formed from sum of numbers from array

This problem was asked to me in Amazon interview -
Given a array of positive integers, you have to find the smallest positive integer that can not be formed from the sum of numbers from array.
Example:
Array:[4 13 2 3 1]
result= 11 { Since 11 was smallest positive number which can not be formed from the given array elements }
What i did was :
sorted the array
calculated the prefix sum
Treverse the sum array and check if next element is less than 1
greater than sum i.e. A[j]<=(sum+1). If not so then answer would
be sum+1
But this was nlog(n) solution.
Interviewer was not satisfied with this and asked a solution in less than O(n log n) time.
There's a beautiful algorithm for solving this problem in time O(n + Sort), where Sort is the amount of time required to sort the input array.
The idea behind the algorithm is to sort the array and then ask the following question: what is the smallest positive integer you cannot make using the first k elements of the array? You then scan forward through the array from left to right, updating your answer to this question, until you find the smallest number you can't make.
Here's how it works. Initially, the smallest number you can't make is 1. Then, going from left to right, do the following:
If the current number is bigger than the smallest number you can't make so far, then you know the smallest number you can't make - it's the one you've got recorded, and you're done.
Otherwise, the current number is less than or equal to the smallest number you can't make. The claim is that you can indeed make this number. Right now, you know the smallest number you can't make with the first k elements of the array (call it candidate) and are now looking at value A[k]. The number candidate - A[k] therefore must be some number that you can indeed make with the first k elements of the array, since otherwise candidate - A[k] would be a smaller number than the smallest number you allegedly can't make with the first k numbers in the array. Moreover, you can make any number in the range candidate to candidate + A[k], inclusive, because you can start with any number in the range from 1 to A[k], inclusive, and then add candidate - 1 to it. Therefore, set candidate to candidate + A[k] and increment k.
In pseudocode:
Sort(A)
candidate = 1
for i from 1 to length(A):
if A[i] > candidate: return candidate
else: candidate = candidate + A[i]
return candidate
Here's a test run on [4, 13, 2, 1, 3]. Sort the array to get [1, 2, 3, 4, 13]. Then, set candidate to 1. We then do the following:
A[1] = 1, candidate = 1:
A[1] ≤ candidate, so set candidate = candidate + A[1] = 2
A[2] = 2, candidate = 2:
A[2] ≤ candidate, so set candidate = candidate + A[2] = 4
A[3] = 3, candidate = 4:
A[3] ≤ candidate, so set candidate = candidate + A[3] = 7
A[4] = 4, candidate = 7:
A[4] ≤ candidate, so set candidate = candidate + A[4] = 11
A[5] = 13, candidate = 11:
A[5] > candidate, so return candidate (11).
So the answer is 11.
The runtime here is O(n + Sort) because outside of sorting, the runtime is O(n). You can clearly sort in O(n log n) time using heapsort, and if you know some upper bound on the numbers you can sort in time O(n log U) (where U is the maximum possible number) by using radix sort. If U is a fixed constant, (say, 109), then radix sort runs in time O(n) and this entire algorithm then runs in time O(n) as well.
Hope this helps!
Use bitvectors to accomplish this in linear time.
Start with an empty bitvector b. Then for each element k in your array, do this:
b = b | b << k | 2^(k-1)
To be clear, the i'th element is set to 1 to represent the number i, and | k is setting the k-th element to 1.
After you finish processing the array, the index of the first zero in b is your answer (counting from the right, starting at 1).
b=0
process 4: b = b | b<<4 | 1000 = 1000
process 13: b = b | b<<13 | 1000000000000 = 10001000000001000
process 2: b = b | b<<2 | 10 = 1010101000000101010
process 3: b = b | b<<3 | 100 = 1011111101000101111110
process 1: b = b | b<<1 | 1 = 11111111111001111111111
First zero: position 11.
Consider all integers in interval [2i .. 2i+1 - 1]. And suppose all integers below 2i can be formed from sum of numbers from given array. Also suppose that we already know C, which is sum of all numbers below 2i. If C >= 2i+1 - 1, every number in this interval may be represented as sum of given numbers. Otherwise we could check if interval [2i .. C + 1] contains any number from given array. And if there is no such number, C + 1 is what we searched for.
Here is a sketch of an algorithm:
For each input number, determine to which interval it belongs, and update corresponding sum: S[int_log(x)] += x.
Compute prefix sum for array S: foreach i: C[i] = C[i-1] + S[i].
Filter array C to keep only entries with values lower than next power of 2.
Scan input array once more and notice which of the intervals [2i .. C + 1] contain at least one input number: i = int_log(x) - 1; B[i] |= (x <= C[i] + 1).
Find first interval that is not filtered out on step #3 and corresponding element of B[] not set on step #4.
If it is not obvious why we can apply step 3, here is the proof. Choose any number between 2i and C, then sequentially subtract from it all the numbers below 2i in decreasing order. Eventually we get either some number less than the last subtracted number or zero. If the result is zero, just add together all the subtracted numbers and we have the representation of chosen number. If the result is non-zero and less than the last subtracted number, this result is also less than 2i, so it is "representable" and none of the subtracted numbers are used for its representation. When we add these subtracted numbers back, we have the representation of chosen number. This also suggests that instead of filtering intervals one by one we could skip several intervals at once by jumping directly to int_log of C.
Time complexity is determined by function int_log(), which is integer logarithm or index of the highest set bit in the number. If our instruction set contains integer logarithm or any its equivalent (count leading zeros, or tricks with floating point numbers), then complexity is O(n). Otherwise we could use some bit hacking to implement int_log() in O(log log U) and obtain O(n * log log U) time complexity. (Here U is largest number in the array).
If step 1 (in addition to updating the sum) will also update minimum value in given range, step 4 is not needed anymore. We could just compare C[i] to Min[i+1]. This means we need only single pass over input array. Or we could apply this algorithm not to array but to a stream of numbers.
Several examples:
Input: [ 4 13 2 3 1] [ 1 2 3 9] [ 1 1 2 9]
int_log: 2 3 1 1 0 0 1 1 3 0 0 1 3
int_log: 0 1 2 3 0 1 2 3 0 1 2 3
S: 1 5 4 13 1 5 0 9 2 2 0 9
C: 1 6 10 23 1 6 6 15 2 4 4 13
filtered(C): n n n n n n n n n n n n
number in
[2^i..C+1]: 2 4 - 2 - - 2 - -
C+1: 11 7 5
For multi-precision input numbers this approach needs O(n * log M) time and O(log M) space. Where M is largest number in the array. The same time is needed just to read all the numbers (and in the worst case we need every bit of them).
Still this result may be improved to O(n * log R) where R is the value found by this algorithm (actually, the output-sensitive variant of it). The only modification needed for this optimization is instead of processing whole numbers at once, process them digit-by-digit: first pass processes the low order bits of each number (like bits 0..63), second pass - next bits (like 64..127), etc. We could ignore all higher-order bits after result is found. Also this decreases space requirements to O(K) numbers, where K is number of bits in machine word.
If you sort the array, it will work for you. Counting sort could've done it in O(n), but if you think in a practically large scenario, range can be pretty high.
Quicksort O(n*logn) will do the work for you:
def smallestPositiveInteger(self, array):
candidate = 1
n = len(array)
array = sorted(array)
for i in range(0, n):
if array[i] <= candidate:
candidate += array[i]
else:
break
return candidate

Finding the maximum area in given binary data

I have a problem with describing algorithm for finding maximum rectangular area of binary data, where 1 occurs k-times more often than 0. Data is always n^2 bits like this:
For example data for n = 4 looks like:
1 0 1 0
0 0 1 1
0 1 1 1
1 1 0 1
Value of k can be 1 .. j (k = 1 means, that number of 0 and 1 is equal).
For above example of data and for k = 1 solution is:
1 0 1 0 <- 4 x '0' and 4 x '1'
0 0 1 1
0 1 1 1
1 1 0 1
But in this example:
1 1 1 0
0 1 0 0
0 0 0 0
0 1 1 1
Solution would be:
1 1 1 0
0 1 0 0
0 0 0 0
0 1 1 1
I tried with few brute force algorithms but for n > 20 it is getting too slow. Can you advise me how I should solve this problem?
As RBerteig proposed - the problem can be also described like that: "In a given square bitmap with cells set to 1 or 0 by some arbitrary process, find the largest rectangular area where the 1's and 0's occur in a specified ratio, k."
Bruteforce should do just fine here for n < 100, if properly implemented: solution below has O(n^4) time and O(n^2) memory complexity. 10^8 operations should be well under 1 second on modern PC (especially considering that each operation is very cheap: few additions and subtractions).
Some observations
There're O(n^4) sub-rectangles to consider and each of them can be a solution.
If we can find number of 1's and 0's in each sub-rectangle in O(1) (constant time), we'll solve problem in O(n^4) time.
If we know number of 1's in some sub-rectangle, we can find number of zeroes (through area).
So, the problem is reduced to following: create data structure allowing to find number of 1's in each sub-rectangle in constant time.
Now, imagine we have sub-rectangle [i0..i1]x[j0..j1]. I.e., it occupies rows between i0 and i1 and columns between j0 and j1. And let count_ones be the function to count number of 1's in subrectangle.
This is the main observation:
count_ones([i0..i1]x[j0..j1]) = count_ones([0..i1]x[0..j1]) - count_ones([0..i0 - 1]x[0..j1]) - count_ones([0..i1]x[0..j0 - 1]) + count_ones([0..i0 - 1]x[0..j0 - 1])
Same observation with practical example:
AAAABBB
AAAABBB
CCCCDDD
CCCCDDD
CCCCDDD
CCCCDDD
If we need to find number of 1's in D sub-rectangle (3x4), we can do it by taking number of 1's in the whole rectangle (A + B + C + D), subtracting number of 1's in (A + B) rectangle, subtracting number of 1's in (A + C) rectangle, and adding number of 1's in (A) rectangle. (A + B + C + D) - (A + B) - (A + C) + (A) = D
Thus, we need table sums, for each i and j containing number of 1's in sub-rectangle [0..i][0..j].
You can create this table in O(n^2), but even the direct way to fill it (for each i and j iterate all elements of [0..i][0..j] area) will be O(n^4).
Having this table,
count_ones([i0..i1]x[j0..j1]) = sums[i1][j1] - sums[i0 - 1][j1] - sums[i1][j0 - 1] + sums[i0 - 1][j0 - 1]
Therefore, time complexity O(n^4) reached.
This is still brute force, but something you should note is that you don't have to recompute everything from scratch for a new i*j rectangle. Instead, for each possible rectangle size, you can move the rectangle across the n*n grid one step at a time, decrementing the counts for the bits no longer within the rectangle and incrementing the counts for the bits that newly entered the rectangle. You could potentially combine this with varying the rectangle size, and try to find an optimal pattern for moving and resizing the rectangle.
Just some hints..
You could impose better restrictions on the values. The requirement leads to condition
N1*(k+1) == S*k, where N1 is number of ones in an area, and S=dx*dy is its surface.
It can be rewritten in better form:
N1/k == S/(k+1).
Because the greatest common divisor of numbers n and n+1 is always 1, then N1 have to be multiple of k and dx*dy to be multiple of k+1. It reduces greatly the possible space of solutions, the larger is k, the better (for dx*dy case you'll need to play with prime divisors of k+1).
Now, because you need just the surface of the largest area with such property, it would be wise to start from largest areas and move to smaller ones. By trying dx*dy from n^2 downto k+1 that would satisfy the divisor and the bounding conditions, you'll find quite fast the solution, muuuch faster than O(n^4), because of a special reason: except cases when the array was specially constructed, if we assume a random input, the probability that there are N1 ones out of S values in the (n-dx+1)*(n-dy+1) areas that have the surface S will constantly grow with decrease of S. (large values of k will make the probability smaller, but in the same time they will make the filter for dx and dy pairs stronger).
Also, this problem: http://ioinformatics.org/locations/ioi99/contest/land/land.shtml , looks somehow similar, maybe you'll find some ideas in their solution.

Resources