Enumerate integers by Hamming weight, modulo bit shifting - permutation

I need to sample integers from an ordered array described as follows.
Let k be a positive integer.
All entries are nonnegative integers in [0,2^k).
The list starts at 0.
All the integers with Hamming weight 1, modulo bit shifting (i.e. multiplication by 2), follow in increasing order.
Then all the integers with Hamming weight 2, modulo bit shifting, follow in increasing order, and so on.
The array for k=5 looks like this:
0 ( weight 0 )
1 ( weight 1 )
11 ( weight 2 )
101
1001
10001
111 ( weight 3 )
1011
1101
10011
10101
11001
In particular, given an entry on the list, I would like to deduce the next entry algorithmically.
I know this can be done in several ways (see e.g. this question). For the sake of completeness, this is what those other arrays look like, in contrast to the one above:
0 ( weight 0 )
1 ( weight 1 )
10
100
1000
10000
11 ( weight 2 )
101
110
... etc

I figured out the answer to this, in case anyone needs it.
Given an entry, add its second-least-significant set bit and let the carry propagate. If that reduces the Hamming weight, restore it by setting the necessary number of 1's starting from the second least-significant position.
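A minimal Python sketch of this rule, reading "the second least bit" as the second-least-significant set bit (that reading reproduces the k=5 list above). The name next_entry, the explicit jump to the smallest entry of the next weight when the result no longer fits in k bits, and the None return at the end of the list are assumptions made for illustration; they are not spelled out in the answer.

```python
def next_entry(x, k):
    """Return the entry that follows x in the list for k bits, or None at the end."""
    limit = 1 << k
    if x == 0:
        return 1
    weight = bin(x).count("1")
    rest = x & (x - 1)              # x without its lowest set bit (bit 0, since x is odd)
    if rest:
        low = rest & -rest          # second-least-significant set bit of x
        cand = x + low              # add it and let the carry propagate
        lost = weight - bin(cand).count("1")
        cand |= ((1 << lost) - 1) << 1   # put the lost 1's back, starting at bit 1
        if cand < limit:
            return cand
    # This weight is exhausted: continue with the smallest entry of the next weight.
    nxt = (1 << (weight + 1)) - 1
    return nxt if nxt < limit else None
```

Starting from 0 and calling next_entry repeatedly with k=5 walks through the whole list until None is returned.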

Related

binomial coefficient for very high numbers in c

So the task I have to solve is to calculate the binomial coefficient for 100>=n>k>=1 and then say how many of the (n, k) pairs give a value above a lower bound of 123456789.
I have no problem with my formula for calculating the binomial coefficient, but for high numbers n & k -> 100 the data types of C become too small to calculate this.
Do you have any suggestions for how I can get around this problem of overflowing the data types?
I thought about dividing by the lower bound straight away so the numbers don't get too big in the first place and I just have to check whether the result is >= 1, but I couldn't make it work.
Say your task is to determine how many binomial coefficients C(n, k) for 1 ≤ k < n ≤ 8 exceed a limit of m = 18. You can do this by using the recurrence C(n, k) = C(n − 1, k) + C(n − 1, k − 1), which can be visualized in Pascal's triangle.
1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 (20) 15 6 1
1 7 (21 35 35 21) 7 1
1 8 (28 56 70 56 28) 8 1
Start at the top and work your way down. Up to n = 5, everything is below the limit of 18. On the next line, the 20 exceeds the limit. From now on, more and more coefficients are beyond 18.
The triangle is symmetric and strictly increasing in the first half of each row. You only need to find the first element that exceeds the limit on each line in order to know how many items to count.
You don't have to store the whole triangle. It is enough to keep the previous and the current line. Alternatively, you can use the algorithm detailed [in this article][ot] to work your way from left to right on each row. Since you just want to count the coefficients that exceed a limit and don't care about their values, the regular integer types should be sufficient, provided you cap each value once it passes the limit: the sum of two values no larger than 123456789 still fits in 32 bits.
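The row-by-row counting can be sketched in Python as follows. The name count_above and the choice to cap every entry at limit + 1 are assumptions for illustration; Python integers never overflow, so the cap is only there to show how the same loop would stay inside a 32-bit type in C.

```python
def count_above(n_max, limit):
    """Count C(n, k) > limit for 1 <= k < n <= n_max using Pascal's recurrence."""
    cap = limit + 1                 # any capped entry means "already above the limit"
    row = [1]                       # row for n = 0
    count = 0
    for n in range(1, n_max + 1):
        new_row = [1] * (n + 1)     # both ends of every row are 1
        for k in range(1, n):
            # Sum of the two entries above, capped so it never grows past limit + 1.
            new_row[k] = min(row[k - 1] + row[k], cap)
        count += sum(1 for k in range(1, n) if new_row[k] > limit)
        row = new_row
    return count
```

For the small example above, count_above(8, 18) counts the ten parenthesized entries in the triangle.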
First, you'll need a type that can handle the result. The largest number you need to handle is C(100,50) = 100,891,344,545,564,193,334,812,497,256. This number requires 97 bits of precision, so your normal data types won't work. A quad-precision IEEE float would do the trick if your environment provides it. Otherwise, you'll need some form of high/arbitrary-precision library.
Then, to keep the numbers within this size, you'll want to cancel common terms in the numerator and the denominator. And you'll want to calculate the result using ( a / c ) * ( b / d ) * ... instead of ( a * b * ... ) / ( c * d * ... ).
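One common variant of this idea interleaves the multiplications and divisions so that every intermediate value is itself a binomial coefficient. The sketch below is a swapped-in illustration of that variant, not the exact ( a / c ) * ( b / d ) scheme from the answer, and the name binomial is hypothetical. It relies on Python's built-in arbitrary-precision integers; in C the same loop would need a 128-bit or big-integer type for the largest values.

```python
def binomial(n, k):
    """C(n, k) computed with interleaved multiply and divide; every step is exact."""
    k = min(k, n - k)               # use the symmetry C(n, k) = C(n, n - k)
    result = 1
    for i in range(1, k + 1):
        # After this step, result equals C(n - k + i, i), so the division is exact.
        result = result * (n - k + i) // i
    return result
```

With this, binomial(100, 50).bit_length() confirms the 97-bit figure quoted above.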

DFA for binary numbers that have a remainder of 1 when divided by 3

I need a DFA for a set of all strings beginning with a 1 that, interpreted as the binary representation of an integer, have a remainder of 1 when divided by 3.
For example, the binary number 1010 b is decimal 10. When you divide 10 by 3 you get a remainder of 1, so 1010 is in the language. However, the binary number 1111 b is decimal 15. When you divide 15 by 3 you get a remainder of 0, so 1111 is not in the language.
I've attached my DFA below. Could you please check it?
It looks correct to me.
You could make two simplifications:
q4 represents (mod 0), so you could make it the starting state and get rid of q0 and q5. (Unless you are required to reject strings beginning with a 0? Your question doesn't specify.)
q1 and q3 can be merged. They both represent (mod 1) and have the same transitions.
These two changes would leave you with exactly 3 states, one for each remainder.
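The three-state machine can be sketched in Python like this. The states are the remainders 0, 1 and 2, and reading a bit b moves remainder r to (2*r + b) mod 3. The names TRANSITIONS and accepts, and the require_leading_one flag, are assumptions for illustration; the leading-1 check is done outside the table here, so a pure DFA that enforces it would need extra states.

```python
# State = remainder so far; reading bit b takes r to (2 * r + b) % 3.
TRANSITIONS = {
    0: {"0": 0, "1": 1},
    1: {"0": 2, "1": 0},
    2: {"0": 1, "1": 2},
}

def accepts(bits, require_leading_one=True):
    """Return True if the bit string has remainder 1 when divided by 3."""
    if require_leading_one and not bits.startswith("1"):
        return False
    state = 0                       # remainder 0 is the start state
    for ch in bits:
        state = TRANSITIONS[state][ch]
    return state == 1               # remainder 1 is the only accepting state
```

For the examples from the question, accepts("1010") is true and accepts("1111") is false.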

Integer compression method

How can I compress a row of integers into something shorter?
Like:
Input: '1 2 4 5 3 5 2 3 1 2 3 4' -> Algorithm -> Output: 'X Y Z'
and get it back the other way around ('X Y Z' -> '1 2 4 5 3 5 2 3 1 2 3 4')?
Note: the input will only contain numbers between 1 and 5, and the total string will be 10 to 16 numbers long.
Is there any way I can compress it to 3-5 numbers?
Here is one way. First, subtract one from each of your little numbers. For your example input that results in
0 1 3 4 2 4 1 2 0 1 2 3
Now treat that as the base-5 representation of an integer. (You can choose either most significant digit first or last.) Calculate the number in binary that means the same thing. Now you have a single integer that "compressed" your string of little numbers. Since you have shown no code of your own, I'll just stop here. You should be able to implement this easily.
Since you will have at most 16 little numbers, the maximum resulting value from that algorithm will be 5^16 − 1, which is 152,587,890,624. This fits into 38 bits. If you need to store smaller numbers than that, convert your resulting value into another, larger number base, such as 2^16 or 2^32. The former would result in 3 numbers, the latter in 2.
@SergGr points out in a comment that this method does not record how many integers were encoded. If that count is not stored separately, this can be a problem, since the method does not distinguish between leading zeros and coded zeros. There are several ways to handle that, if you need the number of integers included in the compression. You could require the most significant digit to be 1 (first or last, depending on where the most significant digit is). This increases the number of bits by one, so you now may need 39 bits.
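A minimal Python sketch of the base-5 packing, assuming the count of little numbers is stored separately and the first number is treated as the least significant digit. The names pack, unpack, to_words and from_words are hypothetical.

```python
def pack(numbers):
    """Pack numbers in 1..5 into one integer, first number least significant."""
    value = 0
    for n in reversed(numbers):
        value = value * 5 + (n - 1)     # subtract one to get a base-5 digit
    return value

def unpack(value, count):
    """Recover `count` numbers in 1..5 from a packed integer."""
    numbers = []
    for _ in range(count):
        value, digit = divmod(value, 5)
        numbers.append(digit + 1)
    return numbers

def to_words(value, bits=16):
    """Split an integer into base-2**bits digits, least significant first."""
    mask = (1 << bits) - 1
    words = []
    while True:
        words.append(value & mask)
        value >>= bits
        if value == 0:
            return words

def from_words(words, bits=16):
    """Inverse of to_words."""
    value = 0
    for w in reversed(words):
        value = (value << bits) | w
    return value
```

Round-tripping the example input goes through pack, to_words, from_words and then unpack with a count of 12.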
Here is a toy example of variable-length encoding. Assume we want to encode two strings: 1 2 3 and 1 2 3 0 0. How will the results differ? Let's consider the two base-5 numbers 321 and 00321. They represent the same value, but let's still convert them into base 2, preserving the padding.
1 + 2*5 + 3*5^2 = 86 dec = 1010110 bin
1 + 2*5 + 3*5^2 + 0*5^3 + 0*5^4 = 000001010110 bin
The additional 0s in the second line are there because the biggest 5-digit base-5 number, 44444, has the 12-bit base-2 representation 110000110100, so the binary representation of the number is padded to the same size.
Note that there is no need to pad the first line because the biggest 3-digit base-5 number, 444, has the base-2 representation 1111100, i.e. of the same length. For an initial string 3 2 1 some padding would be required in this case as well, so padding might be required even if the top digits are not 0.
Now let's add a most significant 1 to the binary representations, and those will be our encoded values:
1 2 3 => 11010110 binary = 214 dec
1 2 3 0 0 => 1000001010110 binary = 4182 dec
There are many ways to decode those values back. One of the simplest (but not the most efficient) is to first calculate the number of base-5 digits by calculating floor(log5(encoded)) and then remove the top bit and fill the digits one by one using mod 5 and divide by 5 operations.
Such a variable-length encoding always adds exactly 1 bit of overhead on top of the padded width.
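The toy scheme above can be sketched in Python as follows. The names encode and decode are hypothetical, digits are taken as 0..4 with the first one least significant, and the digit count is found with an integer loop instead of a floating-point log5 to avoid rounding issues.

```python
def encode(digits):
    """Encode base-5 digits (0..4, first digit least significant) with a marker bit."""
    value = 0
    for d in reversed(digits):
        value = value * 5 + d
    width = (5 ** len(digits) - 1).bit_length()   # bits needed for the biggest n-digit number
    return (1 << width) | value                   # marker 1 placed above the padded value

def decode(encoded):
    """Inverse of encode."""
    n = 0
    while 5 ** (n + 1) <= encoded:                # integer form of floor(log5(encoded))
        n += 1
    width = (5 ** n - 1).bit_length()
    value = encoded - (1 << width)                # remove the marker bit
    digits = []
    for _ in range(n):
        value, d = divmod(value, 5)
        digits.append(d)
    return digits
```

Here encode([1, 2, 3]) and encode([1, 2, 3, 0, 0]) give the two values 214 and 4182 worked out above.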
It's called polidatacompressor.js, but the license will cost you; you have to ask the author about prices.
https://github.com/polidatacompressor/polidatacompressor
Ncomp(65535) will output 255, 255, and when you store this in a database as bytes you get 2 characters.
Another way is to use hexadecimal (base 16): in JavaScript, (1231).toString(16) gives you '4cf'. In 60% of situations it shortens the string by one character.
Or convert base 10 to base 62: https://github.com/base62/base62.js/
4131 --> 14D
413131 --> 1Jtp

How to find out number of bits enabled : bits handling

This was asked in one of the interviews I gave. I couldn't answer it properly.
I want to find out how many bits are enabled based on a number.
Suppose the number is 2; I should return 3.
If the number is 3, I should return 7.
8 4 2 1
    1 1
8 4 2 1
  1 1 1
Is there any easy way of doing it?
Yes, there is: subtract 1 from the corresponding power of 2, like this:
int allBitsSet = (1U << n) - 1;
The subexpression 1U << n computes the value of 2 to the power of n, which always has this form in binary:
1000...00
i.e. one followed by n zeros. When you subtract 1 from a number of that form, you "borrow" from the bit that is set to 1 making it zero, and flip the remaining bits to 1.
You can visualize this by solving an analogous problem in the decimal system: "make a number that has n nines". The solution is the same, except now you need to use 10 instead of 2.
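The same idea, together with the decimal analogy, as a small Python sketch. The function names are hypothetical. Python integers are unbounded, whereas in C the shift count n has to be smaller than the bit width of the type.

```python
def all_bits_set(n):
    """Number whose n lowest bits are 1: 2**n - 1."""
    return (1 << n) - 1

def all_nines(n):
    """Decimal analogue: the number written as n nines, 10**n - 1."""
    return 10 ** n - 1
```

For the inputs from the question, all_bits_set(2) and all_bits_set(3) give 3 and 7.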

C - Fast conversion between binary and hex representations

When reading or writing C code, I often have difficulty translating numbers from binary to hex representation and back. Masks like 0xAAAA5555 are used very often in low-level programming, but it's difficult to recognize the bit pattern they represent. Is there an easy-to-remember rule for doing this quickly in one's head?
Each hex digit maps onto exactly 4 bits. I usually keep in mind the 8421 weights of each of these bits, so it is very easy to do the conversion even in your head, e.g.
A = 10 = 8+2 = 1010 ...
5 = 4+1 = 0101
just keep the 8-4-2-1 weights in mind.
   A         5
8+4+2+1   8+4+2+1
1 0 1 0   0 1 0 1
I always find it easy to map hex to binary numbers. Since each hex digit can be directly mapped to a four-digit binary number, you can think of:
> 0xA4
As
> b 1010 0100
>   ---- ---- (4 binary digits for each part)
>    A    4
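The digit-by-digit mapping with the 8-4-2-1 weights can be sketched in Python like this; WEIGHTS, nibble_bits and hex_to_bits are hypothetical names chosen for illustration.

```python
WEIGHTS = (8, 4, 2, 1)

def nibble_bits(hex_digit):
    """Binary form of one hex digit, found by subtracting the 8-4-2-1 weights."""
    value = int(hex_digit, 16)
    bits = ""
    for w in WEIGHTS:
        if value >= w:
            bits += "1"
            value -= w
        else:
            bits += "0"
    return bits

def hex_to_bits(hex_string):
    """Map each hex digit to its 4-bit block, separated by spaces."""
    return " ".join(nibble_bits(d) for d in hex_string)
```

Calling hex_to_bits("AAAA5555") makes the alternating pattern of that mask visible: four blocks of 1010 followed by four blocks of 0101.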
The conversion is calculated by dividing the base 10 representation by 2 and stringing the remainders in reverse order. I do this in my head, and it seems to work.
So, you ask, what does 0xAAAA5555 look like?
I just work out what A looks like and 5 looks like by doing
A = 10
10 / 2 = 5 r 0
5 / 2 = 2 r 1
2 / 2 = 1 r 0
1 / 2 = 0 r 1
so I know the A's look like 1010 (Note that 4 fingers are a good way to remember the remainders!)
You can string blocks of 4 bits together, so A A is 1010 1010. To convert binary back to hex, I always go through base 10 again by summing up the powers of 2. You can do this by forming blocks of 4 bits (padding with 0s) and stringing the results together.
So 111011101 is 0001 1101 1101, which is (1) (1 + 4 + 8) (1 + 4 + 8) = 1 13 13, which is 1DD.
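Both directions of this method can be sketched in Python: repeated division by 2 for one hex digit, and 4-bit blocks summed by powers of 2 for the way back. The names digit_to_bits and bits_to_hex are hypothetical.

```python
def digit_to_bits(value):
    """Bits of a hex digit value (0..15): divide by 2 four times, read remainders in reverse."""
    remainders = []
    for _ in range(4):
        value, r = divmod(value, 2)
        remainders.append(str(r))
    return "".join(reversed(remainders))

def bits_to_hex(bits):
    """Pad to a multiple of 4 bits, then turn each block into one hex digit."""
    width = -(-len(bits) // 4) * 4          # round the length up to a multiple of 4
    padded = bits.zfill(width)              # pad with 0s on the left
    digits = []
    for i in range(0, len(padded), 4):
        block = padded[i:i + 4]
        total = sum(w for w, b in zip((8, 4, 2, 1), block) if b == "1")
        digits.append("0123456789ABCDEF"[total])
    return "".join(digits)
```

Here digit_to_bits(10) reproduces the division steps shown for A, and bits_to_hex("111011101") reproduces the 1DD example.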

Resources