Using binary strings to store ordered items in a database

In this post, #boisvert mentioned that if a string is used as the value of the order field, it is best represented as a binary string, and then gave an algorithm for calculating the average of two binary strings as follows:
Avalue = 1+0*(1/2)+1*(1/4)+1*(1/8)
Bvalue = 1+1*(1/2)+0*(1/4)+0*(1/8)
average, new value = 1+0*(1/2)+1*(1/4)+1*(1/8)+1*(1/16)
new string = "10111"
content   order
--------------------
A         '1011'
new!      '10111'
B         '1100'
C         '1101'
I couldn't understand this very well. What is the value for the first item put into the DB, and for items inserted before or after it? How do I calculate the average between '1011' and the new value '10111', or between '111' and '1000'?
Any help is much appreciated.

The binary strings are fractions, not integers; the decimal point is always at the beginning (or after the first digit, in #boisvert's answer; it doesn't make any difference as long as the position of the point is fixed). Of course, it's actually a binary point, since these are binary numbers.
To find the average:
1. If the strings differ in length, put enough 0s at the end of the shorter string so that it is the same length as the longer string.
2. Add the two strings together, using binary addition, always putting the last carry at the beginning, even if it is '0'. [See algorithm below]. Because the point stays at the start of the string, the extra leading digit halves the sum, which is what turns it into the average.
3. Remove any 0s at the end.
Example 1: 1011 and 10111
Extend the first string with a 0: 10110 and 10111
Find the sum:
A:      10110
B:      10111
Carry: 101100
Sum:   101101
No trailing zeros, so the result is 101101
Example 2: 111 and 1000
1. Extend the shorter string: 1110 and 1000
2. Sum, including the last carry: 10110
3. Remove the trailing 0: 1011
Starting off and insertion at the end:
The first item put into the database has the label 1. If at any point you need to add an item at the very beginning, use the first label with a 0 before it. Similarly, if you need to add an item at the end, use the last label with a 1 before it.
Binary addition:
Since the strings are the same length, this is easy; set Carry to 0, and scan both strings from back to front. (The output is also produced back-to-front.)
At each position:
* If the sum of Carry and the two digits is 1 or 3, output a 1, otherwise output a 0.
* If the sum of Carry and the two digits is 2 or 3, set Carry to 1, otherwise set it to 0.
When you've finished all the digits, output the value of Carry.
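The averaging steps and the addition rule above can be sketched in Python as follows. This is only an illustration under the assumptions of this answer (labels are strings of '0' and '1' with the binary point at the start); the function names add_binary and average are made up for the example.

```python
def add_binary(a, b):
    """Add two equal-length binary strings; the result has one extra leading digit."""
    carry = 0
    out = []  # digits are produced back to front
    for x, y in zip(reversed(a), reversed(b)):
        total = carry + int(x) + int(y)
        out.append("1" if total in (1, 3) else "0")
        carry = 1 if total >= 2 else 0
    out.append(str(carry))  # always output the last carry, even if it is 0
    return "".join(reversed(out))


def average(a, b):
    """Return a label that sorts between the binary labels a and b."""
    width = max(len(a), len(b))
    # Step 1: pad the shorter string with trailing 0s.
    a, b = a.ljust(width, "0"), b.ljust(width, "0")
    # Step 2: add, keeping the last carry as the new first digit.
    total = add_binary(a, b)
    # Step 3: remove trailing 0s.
    return total.rstrip("0")


print(average("1011", "10111"))  # 101101
print(average("111", "1000"))    # 1011
```

The two printed values match Example 1 and Example 2.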
Practical implementation:
In practice, you wouldn't use binary strings; you'd use some fairly large base, the only requirement being that it is even. But the algorithms are the same. When constructing the representation of your numbers, you need to assign digits to characters in alphabetical order, so that the resulting strings can be sorted alphabetically without converting them to numbers; the database doesn't know how to convert to numbers, but it knows how to sort strings alphabetically.
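As a rough sketch of what a larger base could look like, the Python below averages two labels written with the 26 lowercase letters, where 'a' is digit 0. The alphabet, the function name average_label and the decision to halve the sum digit by digit are all assumptions made for this illustration, not something prescribed above. In binary the halving is just the prepended carry; in a larger base it needs an explicit division by 2, and the even base guarantees that a leftover remainder becomes exactly one extra digit (base / 2).

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"  # assumed digit set: base 26, 'a' is 0


def average_label(a, b, alphabet=ALPHABET):
    """Return a label that sorts between a and b (fractions in an even base)."""
    base = len(alphabet)
    if base % 2:
        raise ValueError("the base must be even")
    zero = alphabet[0]
    width = max(len(a), len(b))
    xs = [alphabet.index(ch) for ch in a.ljust(width, zero)]
    ys = [alphabet.index(ch) for ch in b.ljust(width, zero)]

    # Add back to front, then keep the last carry as the leading digit.
    carry = 0
    total = []
    for x, y in zip(reversed(xs), reversed(ys)):
        carry, digit = divmod(x + y + carry, base)
        total.append(digit)
    total.append(carry)
    total.reverse()

    # Halve front to back; a final remainder becomes the digit base // 2.
    rem = 0
    halved = []
    for d in total:
        q, rem = divmod(rem * base + d, 2)
        halved.append(q)
    if rem:
        halved.append(base // 2)

    # halved[0] is the integer part, which is always 0 for an average of fractions.
    return "".join(alphabet[d] for d in halved[1:]).rstrip(zero)


print(average_label("b", "d"))  # c
print(average_label("b", "c"))  # bn
```

Passing alphabet="01" reproduces the binary examples, so the same routine covers both cases.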

Related

Convert big number to single digit array

So, I need to convert any number to its reverse and put it in an array.
For example:
123456789 --> [9 8 7 6 5 4 3 2 1]
I managed to do it for small numbers like the example with this:
n=32134654654213
rev=(fliplr(num2str(n)));
mikos=length(rev);
array=[ ]
for i=1:mikos
array=[array,str2num(rev(i))]
end
But when I put a big number like 564465426464334345413435541 the array is always 1x18 double and does not show all the digits.
Any ideas?
edit: As you pointed out in the comments, this is a limit on how many digits a double can hold. You are right: if I use a string input, it works like a charm. I'm still wondering how to make it work as a function of this form:
function digits = GetDigits(n)
As mentioned in the comments, there is a limit on the number of digits a double can store. The limit for a uint64 is higher (19 digits), but still not sufficient to store your value. I would add a check so that an oversized numeric input is rejected instead of silently losing digits:
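function digits = GetDigits(n)
% Return the decimal digits of n in reverse order.
% n may be a numeric integer or a string of digits.
if isnumeric(n)
    % check that the value still behaves as a normal integer:
    % a double that is too large cannot be told apart from n+1
    if n==(n+1), error('n too big, supply as string instead'); end
end
rev = fliplr(num2str(n));   % num2str leaves a string input unchanged
mikos = length(rev);
digits = zeros(1, mikos);   % preallocate the output that the function returns
for i = 1:mikos
    digits(i) = str2double(rev(i));
end
end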

Count number of permutations of a string with two distinct digits

How can I calculate how many numbers there are between 000000 and 999999 that contain only two distinct digits?
For example 000001 can be counted as one. The same goes for 002200, 112211, 100000. However 112233 contains three distinct digits so it can't be counted.
Thanks
Let's simplify the problem.
Suppose we need to find all the six-digit strings that use just the digits 0 and 1, such as 000011, 000001, 001110. Since both digits must appear, the possible splits are:
[Zeroes, Ones]: {1,5},{2,4},{3,3},{4,2},{5,1}
For example, 5 zeroes and 1 one gives: 000001, 000010, 000100, 001000, 010000, 100000
So if there are Z zeroes, there are $\binom{6}{Z}$ strings with Z zeroes and 6 - Z ones.
Since Z can take any value from 1 to 5, there are $\sum_{Z=1}^{5} \binom{6}{Z} = 2^6 - 2 = 62$ possible strings that use 0 and 1 with at least one zero and one one.
Now, coming back to the original problem: there are 10 digits and we need two distinct ones, so there are $\binom{10}{2} = 45$ pairs, e.g. {0,1}, {0,2} ..... {1,2} ....
So the answer is $\binom{10}{2} \cdot \sum_{Z=1}^{5} \binom{6}{Z} = 45 \cdot 62 = 2790$.
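As a quick sanity check of this counting argument, the short Python sketch below evaluates the formula and compares it with a direct count over all six-digit strings. The variable names are made up for the example, and math.comb needs Python 3.8 or later.

```python
from math import comb

# Formula: choose the two digits, then choose how many positions get the first one.
by_formula = comb(10, 2) * sum(comb(6, z) for z in range(1, 6))

# Direct count: pad every number to six digits and count its distinct characters.
by_counting = sum(1 for i in range(10**6) if len(set(f"{i:06d}")) == 2)

print(by_formula, by_counting)  # 2790 2790
```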
As you haven't specified a programming language, I used JavaScript, with comments. Hope it helps.
var counter = 0; // counts the numbers that contain exactly two distinct digits
for (var i = 0; i <= 999999; i++) { // every value from 000000 to 999999
    var x = i.toString().padStart(6, '0'); // convert to a string, padded so that 1 becomes "000001"
    var chars = x.split(''); // split the characters and keep them in an array
    var uniqueChars = Array.from(new Set(chars)); // get the distinct characters from the array
    if (uniqueChars.length === 2) { // check that it contains exactly two elements
        counter++;
    }
}
console.log(counter); // 2790

Excel: How to check for repeated numbers inside a cell

I have an excel spreadsheet with numbers from 000 to 999 and am trying to find repeated numbers inside a cell.
(So for example, printing 1 if the number is 022 , 555 or 115 and 0 if it isn't)
So far, I have not been able to find a solution.
Feel free to ask for more information and thanks in advance.
This will do: =IF(COUNT(SEARCH(REPT({0,1,2,3,4,5,6,7,8,9},2),A1))>0,1,0)
Note: if the value in cell A1 contains the same digit twice in a row, the formula returns 1, otherwise 0. You can change the required run length by changing the 2 in the part 8,9},2).
You could try this one if you want to find repeated digits that are not necessarily next to each other:
=IF(MAX(LEN(A1)-LEN(SUBSTITUTE(A1,{0,1,2,3,4,5,6,7,8,9},"")))>1,1,0)
If the values are stored as numbers rather than text and you want it to work for (e.g.) 001, you would need:
=IF(MAX(LEN(TEXT($A1,"000"))-LEN(SUBSTITUTE(TEXT($A1,"000"),{0,1,2,3,4,5,6,7,8,9},"")))>1,1,0)
If your data is in Range "A1:A100" and you want to locate repeated numbers in the range for instance, enter =IF(COUNTIF(A:A,A1)>1,1,0) in cell B1 and fill down. But if you want to check repetitions of specific numbers like 022, 555 or 115, enter =IF(OR(AND(A1=022,COUNTIF(A:A,A1)>1),AND(A1=555,COUNTIF(A:A,A1)>1),AND(A1=115,COUNTIF(A:A,A1)>1)),1,0) in cell B1 and fill down.
Since the value is a number, use arithmetic to break it into digits and then check whether they are all different.
The formula is
=INT(NOT(AND(INT(A1/100)<>INT(MOD(A1,100)/10),INT(A1/100)<>MOD(A1,10),INT(MOD(A1,100)/10)<>MOD(A1,10))))
Let's analyze it step by step.
First, INT(A1/100) extracts the first digit (the integer division by 100); then INT(MOD(A1,100)/10) extracts the second digit (the integer division by 10 of the value modulo 100); and MOD(A1,10) extracts the last digit (the value modulo 10).
Next, the three <> comparisons check first against second, first against third and second against third, and are combined with AND(). Finally, NOT() negates the result and INT() turns it into an integer, 0 or 1.
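For readers who find the nested formula hard to follow, the same arithmetic can be written out in Python. This is just a sketch that mirrors the formula for values from 0 to 999; the function name has_repeated_digit is invented for the illustration.

```python
def has_repeated_digit(n):
    """Return 1 if any two of the three digits of n (0..999) are equal, else 0."""
    first = n // 100         # INT(A1/100)
    second = (n % 100) // 10  # INT(MOD(A1,100)/10)
    third = n % 10            # MOD(A1,10)
    all_different = first != second and first != third and second != third
    return int(not all_different)  # INT(NOT(AND(...)))


for value in (22, 555, 115, 123):
    print(value, has_repeated_digit(value))
```

A value such as 22 is treated as 022, so its leading zero takes part in the comparison just as it does in the spreadsheet formula.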

Generating also non-unique (duplicated) permutations

I've written a basic permutation program in C.
The user types a number, and it prints all the permutations of that number.
Basically, this is how it works (the main algorithm is the one used to find the next higher permutation):
int currentPerm = toAscending(num);
int lastPerm = toDescending(num);
int counter = 1;
printf("%d", currentPerm);
while (currentPerm != lastPerm)
{
counter++;
currentPerm = nextHigherPerm(currentPerm);
printf("%d", currentPerm);
}
However, when the input number includes repeated digits, some permutations are not generated, since they are duplicates. The counter shows a smaller number than expected: instead of the factorial of the number of digits, it shows only the number of unique permutations.
For example:
num = 1234567
counter = 5040 (7! - all unique)
num = 1123456
counter = 2520
num = 1112345
counter = 840
I want it to treat repeated/duplicated digits as if they were different - I don't want to generate only the unique permutations, but rather all the permutations, regardless of whether they are duplicates of others.
Uhm... why not just calculate the factorial of the length of the input string then? ;)
I want it to treat repeated/duplicated digits as if they were
different - I don't want to calculate only the number of unique
permutations.
If the only information that nextHigherPerm() uses is the number that's passed in, you're out of luck. Consider nextHigherPerm(122). How can the function know how many versions of 122 it has already seen? Should nextHigherPerm(122) return 122 or 212? There's no way to know unless you keep track of the current state of the generator separately.
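One way to keep that state separately is to permute the positions of the digits rather than their values, so that two equal digits still count as different items. The Python sketch below illustrates the idea (it is not a drop-in replacement for the C code in the question); it relies on itertools.permutations, which treats elements as unique by position, and the name all_permutations is made up for the example.

```python
from itertools import permutations


def all_permutations(digits):
    """Yield every ordering of the digit string, duplicates included."""
    # Permute the indices 0..n-1, which are always distinct, and map
    # each index back to its digit only when building the output.
    for order in permutations(range(len(digits))):
        yield "".join(digits[i] for i in order)


perms = list(all_permutations("1123456"))
print(len(perms))       # 5040, i.e. 7!
print(len(set(perms)))  # 2520 unique values
```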
When you have 3 distinct letters, for example ABC, you can make ABC, ACB, BAC, BCA, CAB, CBA: 6 permutations (3!). If 2 of those letters repeat, as in AAB, you can make only AAB, ABA, BAA: that is 3, not 3!. So where does it come from? With two kinds of letters, the count is given by combinations -> ( n k ) = n! / ( k! * ( n - k )! ), here 3C2 = 3. In general, divide n! by the factorial of each letter's repeat count; for 1123456 that gives 7! / 2! = 2520.
Let's take another illustrative example: AAAB. The possible arrangements are AAAB, AABA, ABAA, BAAA, only four, and if you calculate them with the formula, 4C3 = 4.
The procedure to generate all these lists:
1. Store the digits in an array. Example: ABCD.
2. Set element 0 of the array as the pivot element, and exclude it from the temp array: A{BCD}
3. Since you want all the combinations (even the repeated ones), rotate the elements of the temp array to the right or left (whichever you like) until you reach the n-th element.
A{BCD}------------A{CDB}------------A{DBC}
4. Do the second step again, but with the temp array.
A{B{CD}}------------A{C{DB}}------------A{D{BC}}
5. Do the third step again, but inside the second temp array.
A{B{CD}}------------A{C{DB}}------------A{D{BC}}
A{B{DC}}------------A{C{BD}}------------A{D{CB}}
6. Go back to the first array and rotate it to BCDA, set B as the pivot, and repeat until you have found all combinations.
Why not convert it to a string then treat your program like an anagram generator?

Find the permutations where no element stays in place

I'm working with permutations where each element is different from its original location. I would like an algorithm that given {an input length, row and digit}, will give me the output number. Here's an example:
If the input length is four, then all the permutations of 0123 are:
0123,0132,0213,0231,0312,0321,
1023,1032,1203,1230,1302,1320,
2013,2031,2103,2130,2301,2310,
3012,3021,3102,3120,3201,3210
The permutations in which no digit is in the same place (every digit has moved):
1032,1230,1302,
2031,2301,2310,
3012,3201,3210
Numbering starts at 0 so if the input to the function is {4,0,0}, the output should be the 0th (leftmost) digit of the 0th (first) permutation. First digit of 1032 is 1.
If the input is {4,1,1}, then the output is the second digit of 1230, which is 2.
The row number might be greater than the number of permutations. In that case, take the remainder modulo the number of permutations (in the above case, row modulo 9).
An answer in C would be great.
(It's not homework, it's for work. Cuckoo hashing if you must know. I'd like to randomly select the swaps that I'll be making at each stage to see if it's better than BFS when the number of tables is greater than two.)
Why not just build a tree and iterate through it?
For example, if you have the digits 0123, then you know that the left most digit can be only from the set {1,2,3}. This would act as your first level in your tree.
Then, if you go down the path beginning with 1, you have three options, {0, 2, 3}. If you go down the path beginning with 2 in the first level, you only have two options, {0, 3}, since you can't use 1 as the second digit from the left and the 2 has already been used (you could pop the 2 from your list of choices), etc.
The thing to watch out for in this approach is reaching the end of a branch with 3 as your only remaining option for the last position; that branch is a dead end, so you would just delete it.
For n > 10 generating all permutations and then filtering can become problematic. I think building out the tree would trim this significantly.
You can build the tree on the fly if need be. Your order can be defined by how you traverse the tree.
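The tree idea can be sketched as a depth-first search that builds the tree on the fly. The Python below is only an illustration of that approach; the names derangements and digit_at are invented here, and the row lookup simply collects all rows first, which is fine for small lengths but not an optimised solution.

```python
def derangements(n):
    """Yield, in lexicographic order, the permutations of 0..n-1 with no fixed point."""
    perm = []
    used = [False] * n

    def extend(pos):
        if pos == n:
            yield tuple(perm)
            return
        for d in range(n):
            # A digit may not sit in its own position and may not be reused.
            if d != pos and not used[d]:
                used[d] = True
                perm.append(d)
                yield from extend(pos + 1)
                perm.pop()
                used[d] = False
        # If no digit fits, this branch is a dead end and is simply abandoned.

    yield from extend(0)


def digit_at(length, row, digit):
    """Return the given digit of the given row, wrapping the row number around."""
    rows = list(derangements(length))
    return rows[row % len(rows)][digit]


print(digit_at(4, 0, 0))  # 1 (first digit of 1032)
print(digit_at(4, 1, 1))  # 2 (second digit of 1230)
```

Visiting the children of each node in increasing order is what produces the same row order as the list in the question.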
Brute-force approach in Python (you may use it to test your C implementation):
#!/usr/bin/env python3
from itertools import islice, permutations
def f(length, row, digit):
    """
    >>> f(4, 0, 0)
    1
    >>> f(4, 1, 1)
    2
    """
    # 1. enumerate all permutations of range (range(3) -> [0,1,2], ..)
    # 2. filter out permutations that have digits in place
    #    (the built-in filter is lazy in Python 3, replacing Python 2's ifilter)
    # 3. get n-th permutation (n -> row)
    # 4. get n-th digit of the permutation (n -> digit)
    return nth(filter(not_inplace, permutations(range(length))), row)[digit]
def not_inplace(indexes):
    """Return True if all indexes are not on their places.
    """
    return all(i != d for i, d in enumerate(indexes))
def nth(iterable, n, default=None):
    """Return the nth item or a default value.
    http://docs.python.org/library/itertools.html#recipes
    """
    return next(islice(iterable, n, None), default)
if __name__ == "__main__":
    import doctest
    doctest.testmod()
