Moving from Linear Probing to Quadratic Probing (hash collisions) - C

My current implementation of a hash table uses linear probing and now I want to move to quadratic probing (and later to chaining and maybe double hashing too). I've read a few articles, tutorials, Wikipedia, etc., but I still don't know exactly what I should do.
Linear probing, basically, has a step of 1 and that's easy to do. When searching, inserting or removing an element from the hash table, I need to calculate a hash, and for that I do this:
index = hash_function(key) % table_size;
Then, while searching, inserting or removing I loop through the table until I find a free bucket, like this:
do {
    if(/* CHECK IF IT'S THE ELEMENT WE WANT */) {
        // FOUND ELEMENT
        return;
    } else {
        index = (index + 1) % table_size;
    }
} while(/* LOOP UNTIL IT'S NECESSARY */);
As for quadratic probing, I think what I need to do is change how the "index" step size is calculated, but that's the part I don't understand: how should I do it? I've seen various pieces of code, and all of them are somewhat different.
Also, I've seen some implementations of quadratic probing where the hash function is changed to accommodate it (but not all of them). Is that change really needed, or can I avoid modifying the hash function and still use quadratic probing?
EDIT:
After reading everything pointed out by Eli Bendersky below I think I got the general idea. Here's part of the code at http://eternallyconfuzzled.com/tuts/datastructures/jsw_tut_hashtable.aspx:
for ( step = 1; table->table[h] != EMPTY; step++ ) {
    if ( compare ( key, table->table[h] ) == 0 )
        return 1;

    /* Move forward quadratically, wrap if necessary */
    h = ( h + ( step * step - step ) / 2 ) % table->size;
}
There are two things I don't get... They say that quadratic probing is usually done using c(i) = i^2. However, in the code above, it's doing something more like c(i) = (i^2 - i)/2.
I was ready to implement this in my code, but I would simply do:
index = (index + (index * index)) % table_size;
...and not:
index = (index + (index * index - index) / 2) % table_size;
If anything, I would do:
index = (index + (index * index) / 2) % table_size;
...because I've seen other code examples dividing by two, although I don't understand why...
1) Why is it subtracting the step?
2) Why is it dividing by 2?

There is a particularly simple and elegant way to implement quadratic probing if your table size is a power of 2:
step = 1;
do {
    if(/* CHECK IF IT'S THE ELEMENT WE WANT */) {
        // FOUND ELEMENT
        return;
    } else {
        index = (index + step) % table_size;
        step++;
    }
} while(/* LOOP UNTIL IT'S NECESSARY */);
Instead of looking at offsets 0, 1, 2, 3, 4... from the original index, this will look at offsets 0, 1, 3, 6, 10... (the ith probe is at offset (i*(i+1))/2, i.e. it's quadratic).
This is guaranteed to hit every position in the hash table (so you are guaranteed to find an empty bucket if there is one) provided the table size is a power of 2.
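For concreteness, here is a minimal C lookup sketch along these lines. It assumes the table stores C-string keys, NULL marks an empty bucket, and table_size is a power of two (which also lets the modulo be a bit-mask); the names find_slot and slots are illustrative, not from any particular hash table API.

#include <stddef.h>
#include <string.h>

/* Probe positions are home+0, home+1, home+3, home+6, home+10, ...
 * i.e. the triangular-number offsets discussed above. */
int find_slot(const char **slots, size_t table_size,
              const char *key, size_t hash)
{
    size_t mask = table_size - 1;      /* valid because table_size is 2^k */
    size_t index = hash & mask;

    for (size_t step = 1; step <= table_size; step++) {
        if (slots[index] == NULL)      /* empty bucket: key is not present */
            return -1;
        if (strcmp(slots[index], key) == 0)
            return (int)index;         /* found the element */
        index = (index + step) & mask; /* add 1, then 2, then 3, ... */
    }
    return -1;                         /* probed every bucket; key absent */
}

Insertion is the same loop, except that the first NULL bucket is where the new key gets stored.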
Here is a sketch of a proof:
1. Given a table size of n, we want to show that we will get n distinct values of (i*(i+1))/2 (mod n) with i = 0 ... n-1.
2. We can prove this by contradiction. Assume that there are fewer than n distinct values: if so, there must be at least two distinct integer values for i in the range [0, n-1] such that (i*(i+1))/2 (mod n) is the same. Call these p and q, where p < q.
3. i.e. (p * (p+1)) / 2 = (q * (q+1)) / 2 (mod n)
4. => (p^2 + p) / 2 = (q^2 + q) / 2 (mod n)
5. => p^2 + p = q^2 + q (mod 2n)
6. => q^2 - p^2 + q - p = 0 (mod 2n)
7. Factorise => (q - p)(p + q + 1) = 0 (mod 2n)
8. (q - p) = 0 is the trivial case p = q.
9. (p + q + 1) = 0 (mod 2n) is impossible: our values of p and q are in the range [0, n-1], and q > p, so (p + q + 1) must be in the range [2, 2n-2].
10. As we are working modulo 2n, we must also deal with the tricky case where both factors are non-zero, but multiply to give 0 (mod 2n):
    - Observe that the difference between the two factors (q - p) and (p + q + 1) is (2p + 1), which is an odd number, so one of the factors must be even and the other must be odd.
    - (q - p)(p + q + 1) = 0 (mod 2n) means (q - p)(p + q + 1) is divisible by 2n. If n (and hence 2n) is a power of 2, this requires the even factor to be a multiple of 2n (because all of the prime factors of 2n are 2, whereas none of the prime factors of our odd factor are).
    - But (q - p) has a maximum value of n-1, and (p + q + 1) has a maximum value of 2n-2 (as seen in step 9), so neither can be a multiple of 2n.
    - So this case is impossible as well.
11. Therefore the assumption that there are fewer than n distinct values (in step 2) must be false.
(If the table size is not a power of 2, this falls apart at step 10.)

You don't have to modify the hash function for quadratic probing. The simplest form of quadratic probing really just adds consecutive squares to the calculated position, instead of the linear 1, 2, 3.
There's a good resource here; the following is taken from there. This is the simplest form of quadratic probing, when the simple polynomial c(i) = i^2 is used: h(k, i) = (h(k) + i^2) mod m. In the more general case the formula is h(k, i) = (h(k) + c1*i + c2*i^2) mod m, and you can pick your constants c1 and c2.
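A one-function sketch of that general form (the name is illustrative, not taken from the linked resource):

#include <stddef.h>

/* i-th probe position for a key whose home slot is 'home':
 * h(k, i) = (h(k) + c1*i + c2*i^2) mod m.
 * Choosing c1 = 0, c2 = 1 gives the plain c(i) = i^2 variant. */
size_t quadratic_probe(size_t home, size_t i, size_t c1, size_t c2, size_t m)
{
    return (home + c1 * i + c2 * i * i) % m;
}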
Keep in mind, however, that quadratic probing is useful only in certain cases. As the Wikipedia entry states:
Quadratic probing provides good memory caching because it preserves some locality of reference; however, linear probing has greater locality and, thus, better cache performance. Quadratic probing better avoids the clustering problem that can occur with linear probing, although it is not immune.
EDIT: Like many things in computer science, the exact constants and polynomials of quadratic probing are heuristic. Yes, the simplest form is i^2, but you may choose any other polynomial. Wikipedia gives the example with h(k,i) = (h(k) + i + i^2)(mod m).
Therefore, it is difficult to answer your "why" question. The only "why" here is why do you need quadratic probing at all? Having problems with other forms of probing and getting a clustered table? Or is it just a homework assignment, or self-learning?
Keep in mind that by far the most common collision resolution techniques for hash tables are chaining and linear probing. Quadratic probing is a heuristic option available for special cases, and unless you know what you're doing very well, I wouldn't recommend using it.

Related

Finding the maximum element of the array is a divisor of another element

Given an array of non-zero integers of length N, write a function that returns the maximum element of the array that is a divisor of some other element of the same array. If no such number is present, return 0. I know how to solve it in O(n^2). Is it possible to do it faster?
First, note that you are assuming that testing if integer A divides integer B can be completed in O(1). I guess you're also assuming that no pre-computation (e.g. building a divisibility graph) is allowed.
Since integer factorization (for which no polynomial algorithm is known) is not an option, you can't do faster than O(n^2) (worst case).
For example, given the input {11,127, 16139} (all integers are primes, each integer squared is less than the next one), you can't avoid checking all pairs.
I have been playing with your problem for a while and found a solution that is sometimes better than brute force.
It is based on two ideas:
We can perform the search in an order such that bigger divisor candidates are tested first. That way we can terminate the search as soon as we find a divisor.
One way to test whether some candidate divw is a divisor of a number w is to calculate r = floor(w / divw) and then check that r * divw == w. The interesting thing is that when this fails, we can calculate an upper limit for the next divisor candidate of w as topw = floor(w / (r + 1)). So we can discard anything between topw and divw.
An example for that second point: imagine we are testing whether divw = 10 is a divisor of w = 12. We calculate r = floor(12 / 10) = 1, and topw = floor(12 / 2) = 6. So we don't need to check whether numbers in the set between 7 and 9, inclusive, are divisors of 12.
In order to implement this algorithm I have used a heap to keep the numbers in the set, keyed by the next divisor candidate that has to be tested.
So...
1. Initialize the heap by pushing every element with its predecessor as its biggest potential divisor.
2. Pop the first element from the heap (w) and check whether the potential divisor candidate (divw) is actually a divisor.
3. If it is, return it as the biggest divisor.
4. Calculate topw for w and divw; search for the next element in the set, divw', that is less than or equal to topw (using binary search); if found, push (w, divw') into the queue again.
5. Unless the queue is empty, go to step 2.
An implementation in Common Lisp is available here!
I guess calculating the theoretical computational cost of this algorithm would be challenging, especially for the average case, so I am not going to do it!
After running it a dozen times, it seems to behave better than the brute-force approach when N is high and the numbers are dispersed (which means that the probability of one number being a divisor of another is low). On the other hand, brute force seems to be faster when N is low or when the numbers are densely distributed in a small range (which means that the probability of a number being a divisor of another is high).
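For reference, here is a C sketch of the heap-guided search just described, assuming positive integers. The helper names (entry, heap_push, heap_pop, search_le) are illustrative; the heap is a plain array-based max-heap keyed by the value of each element's current divisor candidate.

#include <stdlib.h>
#include <string.h>

typedef struct { int elem; int cand; } entry;   /* indexes into vals[] */

static int *vals;   /* sorted (ascending) copy of the input */

static int cmp_int(const void *x, const void *y) {
    int a = *(const int *)x, b = *(const int *)y;
    return (a > b) - (a < b);
}

static void heap_push(entry *h, int *n, entry e) {
    int i = (*n)++;
    while (i > 0 && vals[h[(i - 1) / 2].cand] < vals[e.cand]) {
        h[i] = h[(i - 1) / 2];          /* sift the hole up */
        i = (i - 1) / 2;
    }
    h[i] = e;
}

static entry heap_pop(entry *h, int *n) {
    entry top = h[0];
    h[0] = h[--(*n)];
    int i = 0;
    for (;;) {                          /* sift the moved entry down */
        int l = 2 * i + 1, r = 2 * i + 2, m = i;
        if (l < *n && vals[h[l].cand] > vals[h[m].cand]) m = l;
        if (r < *n && vals[h[r].cand] > vals[h[m].cand]) m = r;
        if (m == i) break;
        entry t = h[i]; h[i] = h[m]; h[m] = t;
        i = m;
    }
    return top;
}

/* largest index in [0, hi] with vals[index] <= limit, or -1 if none */
static int search_le(int hi, int limit) {
    int lo = 0, best = -1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        if (vals[mid] <= limit) { best = mid; lo = mid + 1; }
        else hi = mid - 1;
    }
    return best;
}

int max_divisor(const int *a, int n) {
    if (n < 2) return 0;
    vals = malloc(n * sizeof *vals);
    entry *heap = malloc(n * sizeof *heap);
    memcpy(vals, a, n * sizeof *vals);
    qsort(vals, n, sizeof *vals, cmp_int);

    int hn = 0, best = 0;
    for (int i = 1; i < n; i++)                   /* predecessor = biggest candidate */
        heap_push(heap, &hn, (entry){ i, i - 1 });

    while (hn > 0) {
        entry e = heap_pop(heap, &hn);
        int w = vals[e.elem], d = vals[e.cand];
        if (w % d == 0) { best = d; break; }      /* largest remaining candidate */
        int top = w / (w / d + 1);                /* nothing in (top, d) divides w */
        int j = search_le(e.cand - 1, top);
        if (j >= 0) heap_push(heap, &hn, (entry){ e.elem, j });
    }
    free(heap);
    free(vals);
    return best;
}

The first successful division can be returned immediately, because the popped candidate is always the largest untested candidate anywhere in the heap.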
I did it like this:
int f(int* a, int size)
{
    int max = 0;
    for (int i = 0; i < size; i++)
        for (int j = 0; j < size; j++)
            if (a[i] > a[j] && a[i] % a[j] == 0 && a[j] > max)
                max = a[j];
    return max;
}

Big integer addition code

I am trying to implement big-integer addition in CUDA using the following code:
__global__ void add(unsigned *A, unsigned *B, unsigned *C /* output */, int radix){
    int id = blockIdx.x * blockDim.x + threadIdx.x;
    A[id] = A[id] + B[id];
    C[id] = A[id] / radix;
    __syncthreads();
    A[id] = A[id] % radix + ((id > 0) ? C[id - 1] : 0);
    __syncthreads();
    C[id] = A[id];
}
but it does not work properly, and I also don't know how to handle the extra carry bit. Thanks.
TL;DR: build a carry-lookahead adder where each individual adder adds modulo radix, instead of modulo 2.
Additions need incoming carries
The problem in your model is that you have a rippling carry. See ripple-carry adders.
If you were on an FPGA that wouldn't be a problem, because there is dedicated logic to do that fast (carry chains, they're cool). But alas, you're on a GPU!
That is, for a given id, you only know the input carry (thus whether you are going to sum A[id]+B[id] or A[id]+B[id]+1) when all the sums with smaller id values have been computed. As a matter of fact, initially, you only know the first carry.
A[3]+B[3]+?   A[2]+B[2]+?   A[1]+B[1]+?   A[0]+B[0]+0
     |             |             |             |
     v             v             v             v
   C[3]          C[2]          C[1]          C[0]
Characterize the carry output
And each sum also has a carry output, which isn't on the drawing. So you have to think of the addition in this larger scheme as a function with 3 inputs and 2 outputs: (C, c_out) = add(A, B, c_in)
In order not to wait O(n) for the sum to complete (where n is the number of items your sum is cut into), you can precompute all the possible results at each id. That isn't such a huge load of work, since A and B don't change, only the carries. So you have 2 possible outputs: (c_out0, C) = add(A, B, 0) and (c_out1, C') = add(A, B, 1).
Now with all these results, we need to basically implement a carry lookahead unit.
For that, we need to figure out two functions of each sum's carry output, P and G:
P, a.k.a. "propagate", by any of the following equivalent definitions:
- "if a carry comes in, then a carry will go out of this sum"
- c_out1 && !c_out0
- A + B == radix - 1

G, a.k.a. "generate", by any of the following equivalent definitions:
- "whatever carry comes in, a carry will go out of this sum"
- c_out1 && c_out0
- c_out0
- A + B >= radix
So in other terms, c_out = G or (P and c_in). So now we have the start of an algorithm that can tell us easily, for each id, the carry output as a function of its carry input directly:
1. At each id, compute C[id] = A[id] + B[id] + 0
2. Get G[id] = C[id] > radix - 1
3. Get P[id] = C[id] == radix - 1
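As a plain-C reference for that recurrence (a sequential sketch; the point of the tree below is to evaluate the same recurrence in O(log n) parallel rounds instead):

/* Sequentially propagate carries from the recurrence
 * c_in[id+1] = G[id] | (P[id] & c_in[id]). */
void carries_sequential(const int *P, const int *G, int *c_in, int n)
{
    c_in[0] = 0;                            /* no carry into digit 0 */
    for (int id = 0; id + 1 < n; id++)
        c_in[id + 1] = G[id] | (P[id] & c_in[id]);
}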
Logarithmic tree
Now we can finish in O(log(n)), even though tree-ish things are nasty on GPUs; it is still shorter than waiting. Indeed, from 2 additions next to each other, we can get a group G and a group P:
For id and id+1:
4. step = 2
5. if id % step == 0, do steps 6 through 10; otherwise, do nothing
6. group_P = P[id] and P[id+step/2]
7. group_G = (P[id+step/2] and G[id]) or G[id+step/2]
8. c_in[id+step/2] = G[id] or (P[id] and c_in[id])
9. step = step * 2
10. if step < n, go to step 5
At the end (after repeating steps 5-10 for every level of your tree, with fewer ids participating every time), everything will be expressed in terms of the Ps and Gs which you computed, and c_in[0], which is 0. On the Wikipedia page there are formulas for grouping by 4 instead of 2, which will get you an answer in O(log_4(n)) instead of O(log_2(n)).
Hence the end of the algorithm:
11. At each id, get c_in[id]
12. return (C[id] + c_in[id]) % radix
Take advantage of hardware
What we really did in this last part was mimic the circuitry of a carry-lookahead adder with logic. However, we already have adders in the hardware that do similar things (by definition).
Let us replace our definitions of P and G based on radix by those based on 2, like the logic inside our hardware, mimicking a sum of 2 bits a and b at each stage: P = a ^ b (xor), and G = a & b (logical and). In other words, a = P or G and b = G. So if we create an intP integer and an intG integer, where each bit is respectively the P and G we computed from each id's sum (limiting us to 64 sums), then the addition (intP | intG) + intG has the exact same carry propagation as our elaborate logical scheme.
The reduction to form these integers will still be a logarithmic operation I guess, but that was to be expected.
The interesting part is that each bit of the sum is a function of its carry input. Indeed, every bit of the sum is eventually a function of 3 bits: (a + b + c_in) % 2.
If at that bit P == 1, then a + b == 1, thus (a + b + c_in) % 2 == !c_in
Otherwise, a + b is either 0 or 2, and (a + b + c_in) % 2 == c_in
Thus we can trivially form the integer (or rather bit array) int_cin = ((intP | intG) + intG) ^ intP, with ^ being xor.
Thus we have an alternate ending to our algorithm, replacing steps 4 and later:
4. at each id, shift P and G by id: P = P << id and G = G << id
5. do an OR-reduction to get intG and intP, which are the OR of all the P and G for id 0..63
6. compute (once) int_cin = ((intP | intG) + intG) ^ intP
7. at each id, get c_in = int_cin & (1 << id) ? 1 : 0;
8. return (C[id] + c_in) % radix
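To make the trick concrete, here is a host-side sketch in plain C rather than CUDA. It assumes at most 64 digits, that digit 0 is the least significant, and that radix is small enough for a digit sum to fit in an unsigned; the function name is illustrative.

#include <stdint.h>

void add_carry_lookahead(const unsigned *A, const unsigned *B, unsigned *C,
                         unsigned radix, int n)
{
    uint64_t intP = 0, intG = 0;
    unsigned sum[64];

    /* Steps 1-3 (digit sums, P and G) plus the packing of steps 4-5,
     * done sequentially here instead of with a parallel OR-reduction. */
    for (int id = 0; id < n; id++) {
        sum[id] = A[id] + B[id];
        if (sum[id] == radix - 1) intP |= 1ull << id;   /* propagate */
        if (sum[id] >= radix)     intG |= 1ull << id;   /* generate  */
    }

    /* Step 6: one 64-bit addition propagates every carry at once;
     * bit id of int_cin is the carry coming into digit id. */
    uint64_t int_cin = ((intP | intG) + intG) ^ intP;

    /* Steps 7-8: fold in each digit's incoming carry, reduce mod radix. */
    for (int id = 0; id < n; id++) {
        unsigned c_in = (unsigned)((int_cin >> id) & 1u);
        C[id] = (sum[id] + c_in) % radix;
    }
    /* For n < 64, the carry out of the whole number is bit n of int_cin. */
}

Checking by hand with radix 10, A = {5, 9, 1} and B = {7, 0, 2} (that is, 195 + 207): intP = 0b010, intG = 0b001, int_cin = 0b110, and C comes out as {2, 0, 4}, i.e. 402.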
PS: Also, watch out for integer overflow in your arrays if radix is big. If it isn't, then the whole thing doesn't really make sense, I guess...
PPS: In the alternate ending, if you have more than 64 items, characterize them by their P and G as if radix were 2^64, re-run the same steps at the higher level (reduction, get c_in), and then get back to the lower level and apply step 7 with P + G + the carry in from the higher level.

Best way to compute (2^n - 1) mod p

I'm working on a cryptographic exercise, and I'm trying to calculate (2^n - 1) mod p where p is a prime number.
What would be the best approach to do this? I'm working with C, so 2^n - 1 becomes too large to hold when n is large.
I came across the equation (a*b) mod p = (a*(b mod p)) mod p, but I'm not sure this applies in this case, as 2^n - 1 may be prime (or I'm not sure how to factorise it).
Help much appreciated.
A couple tips to help you come up with a better way:
Don't use (a*b) mod p = (a*(b mod p)) mod p to compute 2^n - 1 mod p; use it to compute 2^n mod p and then subtract 1 afterward.
Fermat's little theorem can be useful here. That way, the exponent you actually have to deal with won't exceed p.
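A minimal C sketch combining the two tips (the name mersenne_mod is illustrative; it assumes p is an odd prime below 2^32 so that the products fit in 64 bits):

#include <stdint.h>

uint64_t mersenne_mod(uint64_t n, uint64_t p)   /* (2^n - 1) mod p */
{
    uint64_t result = 1, base = 2 % p;
    n %= p - 1;                          /* Fermat: 2^(p-1) = 1 (mod p) */
    while (n > 0) {
        if (n & 1)
            result = (result * base) % p;
        base = (base * base) % p;
        n >>= 1;
    }
    return (result + p - 1) % p;         /* subtract 1, staying in [0, p-1] */
}

For example, mersenne_mod(10, 1000000007) gives 1023.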
You mention in the comments that n and p are 9 or 10 digits, or something. If you restrict them to 32 bit (unsigned long) values, you can find 2^n mod p with a simple (binary) modular exponentiation:
unsigned long long u = 1, w = 2;
while (n != 0)
{
    if ((n & 0x1) != 0)
        u = (u * w) % p; /* (mul-rdx) */
    if ((n >>= 1) != 0)
        w = (w * w) % p; /* (sqr-rdx) */
}
r = (unsigned long) u;
And, since (2^n - 1) mod p = r - 1 mod p:
r = (r == 0) ? (p - 1) : (r - 1);
If 2^n mod p = 0 (which doesn't actually occur if p > 2 is prime, but we might as well consider the general case), then (2^n - 1) mod p = -1 mod p.
Since the 'common residue' or 'remainder' (mod p) is in [0, p - 1], we add some multiple of p so that it is in this range.
Otherwise, the result of 2^n mod p was in [1, p - 1], and subtracting 1 will leave it in this range already. It's probably better expressed as:
if (r == 0)
    r = p - 1; /* -1 mod p */
else
    r = r - 1;
To take the modulus you somehow must have 2^n - 1, or you will move in a different direction of algorithms (interesting, but a separate direction); so I recommend you use a big-integer concept, as it will be easy: make a structure and implement a big value out of small values, e.g.
struct bigint {
    int lowerbits;
    int upperbits;
};
Decomposing the expression is also a solution, like (2^n - 1) % p with 2^n = 2^(n-4) * 2^4; decompose and handle the parts separately. That will be quite algorithmic then.
To compute 2^n - 1 mod p, you can use exponentiation by squaring after first removing any multiple of (p - 1) from n (since a^{p-1} = 1 mod p). In pseudo-code:
n = n % (p - 1)
result = 1
pow = 2
while n {
    if n % 2 {
        result = (result * pow) % p
    }
    pow = (pow * pow) % p
    n /= 2
}
result = (result + p - 1) % p
I came across the answer that I am posting here while solving one of the mathematical problems on HackerRank, and it has worked for all the test cases given there.
If you restrict n and p to 64-bit (unsigned long) values, then here is the mathematical approach:
2^n - 1 can be written as 1*[ (2^n - 1)/(2 - 1) ]
If you look at this carefully, this is the sum of the GP 1 + 2 + 4 + .. + 2^(n-1)
And voila, we know that (a+b)%m = ( (a%m) + (b%m) )%m
If you are unsure whether the above relation is true for addition, you can google it or check this link: http://www.inf.ed.ac.uk/teaching/courses/dmmr/slides/13-14/Ch4.pdf
So now we can apply the above-mentioned relation to our GP, and you will have your answer!
That is,
(2^n - 1)%p is equivalent to ( 1 + 2 + 4 + .. + 2^(n-1) )%p and now apply the given relation.
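A small C sketch of that geometric-series view (the name is illustrative; it assumes p < 2^63 so doubling a residue cannot overflow). Note that this is O(n) additions, so it only makes sense for moderate n:

#include <stdint.h>

uint64_t mersenne_mod_gp(uint64_t n, uint64_t p)   /* (2^n - 1) mod p */
{
    uint64_t sum = 0, term = 1 % p;
    for (uint64_t i = 0; i < n; i++) {
        sum = (sum + term) % p;    /* running total of 1 + 2 + 4 + ... mod p */
        term = (term * 2) % p;     /* next power of two, mod p               */
    }
    return sum;
}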
First, focus on 2^n mod p, because you can always subtract one at the end.
Consider the powers of two. This is a sequence of numbers produced by repeatedly multiplying by two.
Consider the modulo operation. If the number is written in base p, you're just grabbing the last digit. Higher digits can be thrown away.
So at some point(s) in the sequence, you get a two-digit number (a 1 in the p's place), and your task is really just to get rid of the first digit (subtract p) when that happens.
Stopping here conceptually, the brute-force approach would be something like this:
uint64_t exp2modp( uint64_t n, uint64_t p ) {
    uint64_t ret = 1;
    uint64_t limit = (p + 1) / 2;
    n %= p - 1; // Apply Fermat's Little Theorem.
    while ( n -- ) {
        if ( ret >= limit ) {
            ret *= 2;
            ret -= p;
        } else {
            ret *= 2;
        }
    }
    return ret;
}
Unfortunately, this still takes forever for large n and p, and I can't think of any better number theory offhand.
If you have a multiplication facility which can compute (p-1)^2 without overflow, then you can use an analogous algorithm using repeated squaring with a modulo after each square operation, and then take the product of the series of square residuals, again with a modulo after each multiplication.
Step 1: x = shift 1 left by n, then subtract 1.
Step 2: result = logical AND of x and p.

Efficiently calculating nCk mod p

I have come across this problem many times but I am unable to solve it. Either some case or other gives a wrong answer, or the program I write is too slow. Formally, I am talking about calculating
nCk mod p, where p is a prime, n is a large number, and 1 <= k <= n.
What have I tried:
I know the recursive formulation of the binomial coefficient and how to model it as a dynamic programming problem, but I feel that it is slow. The recursive formulation is nCk + nC(k-1) = (n+1)Ck. I took care of the modulus while storing values in the array to avoid overflows, but I am not sure that just doing a mod p on the result will avoid all overflows.
To compute nCr, there's a simple algorithm based on the rule nCr = (n - 1)C(r - 1) * n / r:
def nCr(n, r):
    if r == 0:
        return 1
    return n * nCr(n - 1, r - 1) // r
Now, in modular arithmetic we don't quite have division, but we have modular inverses, which (when modding by a prime) are just as good.
def nCrModP(n, r, p):
    if r == 0:
        return 1
    return n * nCrModP(n - 1, r - 1) * modinv(r, p) % p
Here's one implementation of modinv on rosettacode
Not sure what you mean by "storing values in array", but I assume the array serves as a lookup table while running, to avoid redundant calculations and speed things up. This should take care of the speed problem. Regarding the overflows: you can perform the modulo operation at any stage of the computation and repeat it as much as you want; the result will still be correct.
First, let's work with the case where p is relatively small.
Take the base-p expansions of n and k: write n = n_0 + n_1 p + n_2 p^2 + ... + n_m p^m and k = k_0 + k_1 p + ... + k_m p^m where each n_i and each k_i is at least 0 but less than p. A theorem (which I think is due to Edouard Lucas) states that C(n,k) = C(n_0, k_0) * C(n_1, k_1) * ... * C(n_m, k_m). This reduces to taking a mod-p product of numbers in the "n is relatively small" case below.
Second, if n is relatively small, you can just compute binomial coefficients using dynamic programming on the formula C(n,k) = C(n-1,k-1) + C(n-1,k), reducing mod p at each step. Or do something more clever.
Third, if k is relatively small (and less than p), you should be able to compute n!/(k!(n-k)!) mod p by computing n!/(n-k)! as n * (n-1) * ... * (n-k+1), reducing modulo p after each product, then multiplying by the modular inverses of each number between 1 and k.
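Here is a hedged C sketch of that third approach (illustrative names; it assumes k < p and that p is a prime below 2^32 so the intermediate products fit in 64 bits). The inverses come from Fermat's little theorem, a^(p-2) mod p:

#include <stdint.h>

static uint64_t modpow(uint64_t a, uint64_t e, uint64_t p)
{
    uint64_t r = 1;
    a %= p;
    while (e > 0) {
        if (e & 1) r = r * a % p;   /* multiply in the current power */
        a = a * a % p;              /* square for the next bit       */
        e >>= 1;
    }
    return r;
}

/* C(n, k) mod p, valid for k < p with p prime: multiply n, n-1, ...,
 * n-k+1 and the inverses of 1..k, reducing mod p after every step. */
uint64_t binom_mod(uint64_t n, uint64_t k, uint64_t p)
{
    uint64_t result = 1;
    for (uint64_t i = 0; i < k; i++) {
        result = result * ((n - i) % p) % p;            /* numerator term */
        result = result * modpow(i + 1, p - 2, p) % p;  /* 1/(i+1) mod p  */
    }
    return result;
}

The same routine can also supply the per-digit factors C(n_i, k_i) mod p in the Lucas-theorem reduction mentioned above, since each n_i and k_i is already less than p.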

Time complexity of a recursive algorithm

How can I calculate the time complexity of a recursive algorithm?
int pow1(int x, int n) {
    if (n == 0) {
        return 1;
    }
    else {
        return x * pow1(x, n - 1);
    }
}
int pow2(int x, int n) {
    if (n == 0) {
        return 1;
    }
    else if (n & 1) {
        int p = pow2(x, (n - 1) / 2);
        return x * p * p;
    }
    else {
        int p = pow2(x, n / 2);
        return p * p;
    }
}
Analyzing recursive functions (or even evaluating them) is a nontrivial task. A (in my opinion) good introduction can be found in Don Knuth's Concrete Mathematics.
However, let's analyse these examples now:
We define a function that gives us the time needed by a function. Let's say that t(n) denotes the time needed by pow(x,n), i.e. a function of n.
Then we can conclude, that t(0)=c, because if we call pow(x,0), we have to check whether (n==0), and then return 1, which can be done in constant time (hence the constant c).
Now we consider the other case: n > 0. Here we obtain t(n) = d + t(n-1). That's because we again have to check n==0, compute pow(x, n-1) (hence the t(n-1)), and multiply the result by x. Checking and multiplying can be done in constant time (constant d); the recursive calculation of pow needs t(n-1).
Now we can "expand" the term t(n):
t(n) =
d + t(n-1) =
d + (d + t(n-2)) =
d + d + t(n-2) =
d + d + d + t(n-3) =
... =
d + d + d + ... + t(0) =
d + d + d + ... + c
So, how long does it take until we reach t(0)? Since we start at t(n) and subtract 1 in each step, it takes n steps to reach t(n - n) = t(0). That, on the other hand, means that we pick up the constant d n times, and t(0) evaluates to c.
So we obtain:
t(n) =
...
d + d + d + ... + c =
n * d + c
So we get that t(n) = n * d + c, which is an element of O(n).
pow2 can be analysed using the Master theorem, since we can assume that the time functions of algorithms are monotonically increasing. So now we have the time t(n) needed for the computation of pow2(x,n):
t(0) = c (since constant time is needed for the computation of pow2(x,0))

for n > 0 we get

          / t((n-1)/2) + d   if n is odd  (d is a constant cost)
t(n) =   <
          \ t(n/2) + d       if n is even (d is a constant cost)
The above can be "simplified" to:
t(n) = t(floor(n/2)) + d <= t(n/2) + d (since t is monotonically increasing)
So we obtain t(n) <= t(n/2) + d, which can be solved using the Master theorem to give t(n) = O(log n) (see the section "Application to Popular Algorithms" in the Wikipedia link, example "Binary Search").
Let's just start with pow1, because that's the simplest one.
You have a function where a single run is done in O(1). (Condition checking, returning, and multiplication are constant time.)
What you have left is then your recursion. What you need to do is analyze how often the function would end up calling itself. In pow1, it'll happen N times. N*O(1)=O(N).
For pow2, it's the same principle - a single run of the function runs in O(1). However, this time you're halving N every time. That means it will run log2(N) times - effectively once per bit. log2(N)*O(1)=O(log(N)).
Something which might help you is to exploit the fact that recursion can always be expressed as iteration (not always very simply, but it's possible). We can express pow1 as:
result = 1;
while (n != 0)
{
    result = result * x;
    n = n - 1;
}
Now you have an iterative algorithm instead, and you might find it easier to analyze it that way.
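In the same spirit, here is pow2 rewritten iteratively (a square-and-multiply sketch, not a drop-in replacement for the code above); it makes the O(log n) bound visible, because the loop body runs once per bit of n:

int pow2_iter(int x, int n)
{
    int result = 1, base = x;
    while (n != 0) {
        if (n & 1)           /* current bit of the exponent is set */
            result *= base;
        base *= base;        /* square for the next bit */
        n >>= 1;
    }
    return result;
}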
It can be a bit complex, but I think the usual way is to use Master's theorem.
The complexity of both functions, ignoring recursion, is O(1).
For the first algorithm, pow1(x, n), the complexity is O(n), because the depth of recursion correlates with n linearly.
For the second, the complexity is O(log n). Here we recurse approximately log2(n) times. Dropping the base 2, we get log n.
So I'm guessing you're raising x to the power n. pow1 takes O(n).
You never change the value of x, but you subtract 1 from n each time until it gets to 0 (and then you just return). This means that you will make n recursive calls.
