Two elements in array whose xor is maximum - arrays

Given an array of integers, you have to find two elements whose XOR is maximum.
The naive approach is to XOR each element with every other element and compare the results to find the best pair.
Is there a more efficient algorithm than this?

I think I have an O(n lg U) algorithm for this, where U is the largest number. The idea is similar to user949300's, but with a bit more detail.
The intuition is as follows. When you're XORing two numbers together, to get the maximum value, you want to have a 1 at the highest possible position, and then of the pairings that have a 1 at this position, you want a pairing with a 1 at the next possible highest position, etc.
So the algorithm is as follows. Begin by finding the highest 1 bit anywhere in the numbers (you can do this in time O(n lg U) by doing O(lg U) work per each of the n numbers). Now, split the array into two pieces - one of the numbers that have a 1 in that bit and the group with 0 in that bit. Any optimal solution must combine a number with a 1 in the first spot with a number with a 0 in that spot, since that would put a 1 bit as high as possible. Any other pairing has a 0 there.
Now, recursively, we want to find the pairing of numbers from the 1 and 0 group that has the highest 1 in them. To do this, of these two groups, split them into four groups:
Numbers starting with 11
Numbers starting with 10
Numbers starting with 01
Numbers starting with 00
If there are numbers in both the 11 and 00 groups, or in both the 10 and 01 groups, their XOR would be ideal (starting with 11). Consequently, if both groups in either of those pairings are nonempty, recursively compute the ideal solution from those groups, then return the maximum of those subproblem solutions. Otherwise, if neither pairing is possible, all the numbers must have the same digit in their second position. Consequently, the optimal XOR of a number starting with 1 and a number starting with 0 will end up having the second bit cancel out, so we should just look at the third bit.
This gives the following recursive algorithm that, starting with the two groups of numbers partitioned by their MSB, gives the answer:
Given group 1 and group 0 and a bit index i:
If the bit index is equal to the number of bits, return the XOR of the (unique) number in the 1 group and the (unique) number in the 0 group.
Construct groups 11, 10, 01, and 00 from those groups.
If group 11 and group 00 are nonempty, recursively find the maximum XOR of those two groups starting at bit i + 1.
If group 10 and group 01 are nonempty, recursively find the maximum XOR of those two groups, starting at bit i + 1.
If either of the above pairings was possible, then return the maximum pair found by the recursion.
Otherwise, all of the numbers must have the same bit in position i, so return the maximum pair found by looking at bit i + 1 on groups 1 and 0.
To start off the algorithm, you can actually just partition the numbers from the initial group into two groups - numbers with MSB 1 and numbers with MSB 0. You then fire off a recursive call to the above algorithm with the two groups of numbers.
As an example, consider the numbers 5 1 4 3 0 2. These have representations
101 001 100 011 000 010
We begin by splitting them into the 1 group and the 0 group:
101 100
001 011 000 010
Now, we apply the above algorithm. We split this into groups 11, 10, 01, and 00:
11:
10: 101 100
01: 011 010
00: 000 001
Now, we can't pair any 11 elements with 00 elements, so we just recurse on the 10 and 01 groups. This means we construct the 100, 101, 010, and 011 groups:
101: 101
100: 100
011: 011
010: 010
Now that we're down to buckets with just one element in them, we can just check the pairs 101 and 010 (which gives 111) and 100 and 011 (which gives 111). Either option works here, so we get that the optimal answer is 7.
Let's think about the running time of this algorithm. Notice that the maximum recursion depth is O(lg U), since there are only O(log U) bits in the numbers. At each level in the tree, each number appears in exactly one recursive call, and each of the recursive calls does work proportional to the total number of numbers in the 0 and 1 groups, because we need to distribute them by their bits. Consequently, there are O(log U) levels in the recursion tree, and each level does O(n) work, giving a total work of O(n log U).
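The recursion described above can be sketched in Python as follows. This is a minimal sketch, not the answerer's own code: the names max_xor_pair and solve are made up for illustration, the input is assumed to hold non-negative integers, and the top level keeps moving down one bit whenever every number shares the current bit, so that the first split always produces two nonempty groups.

```python
def max_xor_pair(nums):
    """Return (best_xor, a, b) for non-negative ints, or None if len(nums) < 2."""
    if len(nums) < 2:
        return None

    def solve(group_a, group_b, bit):
        # Maximize a ^ b with a from group_a and b from group_b,
        # considering only bit positions bit, bit-1, ..., 0.
        if bit < 0:
            # All bits consumed: every member of a group is the same value.
            a, b = group_a[0], group_b[0]
            return (a ^ b, a, b)
        a1 = [x for x in group_a if ((x >> bit) & 1) == 1]
        a0 = [x for x in group_a if ((x >> bit) & 1) == 0]
        b1 = [x for x in group_b if ((x >> bit) & 1) == 1]
        b0 = [x for x in group_b if ((x >> bit) & 1) == 0]
        best = None
        # Pairings that put a 1 at this bit of the XOR.
        for left, right in ((a1, b0), (a0, b1)):
            if left and right:
                cand = solve(left, right, bit - 1)
                if best is None or cand[0] > best[0]:
                    best = cand
        if best is None:
            # Both groups agree on this bit, so it cancels; look at the next one.
            best = solve(group_a, group_b, bit - 1)
        return best

    top = max(nums).bit_length() - 1  # index of the highest set bit overall
    for bit in range(top, -1, -1):
        ones = [x for x in nums if ((x >> bit) & 1) == 1]
        zeros = [x for x in nums if ((x >> bit) & 1) == 0]
        if ones and zeros:
            return solve(ones, zeros, bit - 1)
    # No bit separates the numbers, so all of them are equal.
    return (0, nums[0], nums[1])
```

For the example 5 1 4 3 0 2 this returns (7, 5, 2), matching the walkthrough above.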
Hope this helps! This was an awesome problem!

This can be solved in O(N log U) time using a trie, where U is the largest value in the array (the cost of a query or an insertion depends on the number of bits per key, not on N).
Construct a binary trie. For each integer key, the path from the root holds one bit (0 or 1) per level, starting from the most significant bit.
Now, for each element arr[i] of arr[0, 1, ..., N-1]:
Perform a query to retrieve the maximum XOR value possible for arr[i] against the keys already in the trie. XORing two different bits (0 ^ 1 or 1 ^ 0) always gives 1, so at each bit during the query, try to traverse to the node holding the opposite bit. This sets that bit of the result to 1, which maximizes the XOR value. Only if there is no node with the opposite bit, traverse the node with the same bit.
After the query, insert arr[i] into the trie.
For each element, keep track of the maximum XOR value found so far.
While walking through the nodes, build the other key for which the XOR is being maximized.
For N elements, we need one query (O(log U)) and one insertion (O(log U)) per element, so the overall time complexity is O(N log U). With fixed-width keys, that is a constant number of steps per element.
You can find nice pictorial explanation on how it works in this thread.
Here is C++ implementation of the above algorithm:
#include <utility>
#include <vector>
using namespace std;

const static int SIZE = 2;  // binary trie: one child per bit value
const static int MSB = 30;  // highest bit examined (keys are assumed non-negative)

class trie {
private:
    struct trieNode {
        trieNode* children[SIZE];
        trieNode() {
            for(int i = 0; i < SIZE; ++i) {
                children[i] = nullptr;
            }
        }
        ~trieNode() {
            for(int i = 0; i < SIZE; ++i) {
                delete children[i];
                children[i] = nullptr;
            }
        }
    };
    trieNode* root;
public:
    trie(): root(new trieNode()) {
    }
    ~trie() {
        delete root;
        root = nullptr;
    }
    // Insert key bit by bit, most significant bit first.
    void insert(int key) {
        trieNode* pCrawl = root;
        for(int i = MSB; i >= 0; --i) {
            bool bit = (bool)(key & (1 << i));
            if(!pCrawl->children[bit]) {
                pCrawl->children[bit] = new trieNode();
            }
            pCrawl = pCrawl->children[bit];
        }
    }
    // Return the maximum XOR of key with any stored key. The stored key
    // that achieves it is built up in otherKey. The trie must not be empty.
    int query(int key, int& otherKey) {
        int Xor = 0;
        trieNode *pCrawl = root;
        for(int i = MSB; i >= 0; --i) {
            bool bit = (bool)(key & (1 << i));
            if(pCrawl->children[!bit]) {
                // The opposite bit exists: this bit of the XOR becomes 1.
                pCrawl = pCrawl->children[!bit];
                Xor |= (1 << i);
                if(!bit) {
                    otherKey |= (1 << i);
                } else {
                    otherKey &= ~(1 << i);
                }
            } else {
                // Only the same bit exists: this bit of the XOR stays 0.
                if(bit) {
                    otherKey |= (1 << i);
                } else {
                    otherKey &= ~(1 << i);
                }
                pCrawl = pCrawl->children[bit];
            }
        }
        return Xor;
    }
};
std::pair<int, int> findMaximumXorElements(const std::vector<int>& arr) {
    int n = arr.size();
    std::pair<int, int> result;
    if(n < 2) return result;
    trie t;
    // Seed the trie with the first element rather than with 0, so that every
    // query is answered by a real array element (a seeded 0 could be reported
    // as part of the pair even when 0 is not in the array).
    t.insert(arr[0]);
    int maxXor = -1; // -1 so that a pair is reported even if all elements are equal
    for(int i = 1; i < n; i++) {
        int elem = 0;
        int curr = t.query(arr[i], elem);
        if(curr > maxXor) {
            maxXor = curr;
            result = {arr[i], elem};
        }
        t.insert(arr[i]);
    }
    return result;
}

Ignoring the sign bit, one of the values must be one of the values with the highest significant bit set. Unless all the values have that bit set, in which case you go to the next highest significant bit that isn't set in all the values. So you could pare down the possibilities for the 1st value by looking at the HSB. For example, if the possibilities are
0x100000
0x100ABC
0x001ABC
0x000ABC
The 1st value of the max pair must be either 0x100000 or 0x100ABC.
@Internal Server Error: I don't think the smallest is necessarily correct. I don't have a great idea for paring down the 2nd value, other than that it is any value that isn't in the list of possible 1st values. In my example, that is 0x001ABC or 0x000ABC.

A very interesting problem!
Here is my idea:
First build a binary tree from all the numbers by using the binary
representation and sort them into the tree most significant bit first
(add leading zeros to match the longest number). When done each path
from the root to any leaf represents one number from the original
set.
Let a and b be pointers to a tree node and initialize them at the root.
Now move a and b down the tree, trying to use opposite edges at each step, i.e. if a moves down a 0-edge, b moves down a 1-edge unless that is not possible.
If a and b reach a leaf, they should point to two numbers with "very few" identical bits.
I just made this algorithm up and do not know if it's correct or how to prove it. However, it should run in O(n) time.

Make a recursive function that takes two lists of integers, A and B, as its arguments. As its return value, it returns two integers, one from A and one from B, which maximize the XOR of the two. If all the integers are 0, return (0,0). Otherwise, the function does some processing and calls itself recursively twice, but with smaller integers. In one of the recursive calls, it considers taking an integer from list A to supply a 1 to bit k, and in the other call it considers taking an integer from list B to supply a 1 to bit k.
I don't have time now to fill in the details, but maybe this will be enough for you to see the answer? Also, I'm not sure if the run time will be better than N^2, but it probably will be.

We can find the maximum number in O(n) time then loop through the array doing xor with each element. Assuming xor operation cost is O(1) we can find max xor of two numbers in O(n) time.

Related

Given an array A[] of N numbers. Now, you need to find and print the Summation of the bitwise OR of all possible subsets of this array

For [1, 2, 3], all possible subsets are {1}, {2}, {3}, {1,2}, {1,3}, {2,3}, {1,2,3}
The sum of OR of these subsets are, 1 + 2 + 3 + 3 + 3 + 3 + 3 = 18.
My approach is to generate all possible subsets, find the OR of each, and sum them, but the time complexity is O(2^n), and I need a solution that is O(n log n) or better.
As you have 3 elements, 2^3 = 8 subsets will be created (including the empty one); you need to OR the elements of each subset and print the sum over all subsets. With the following logic you can get the solution you require:
public class AndOfSubSetsOfSet {
public static void main(String[] args) {
findSubsets(new int[]{1, 2,3});
}
private static void findSubsets(int array[]) {
int numOfSubsets = 1 << array.length;
int a = 0;
for (int i = 0; i < numOfSubsets; i++) {
int pos = array.length - 1;
int bitmask = i;
int temp = 0;
int count = 0;
while (bitmask > 0) {
if ((bitmask & 1) == 1) {
if (count == 0) {
temp = array[pos];
} else
temp = array[pos] | temp;
count++;
}
//shift the mask right by one so that the lowest bit is dropped
bitmask >>= 1;
pos--;
}
count = 0;
a += temp;
temp = 0;
}
System.out.println(a);
}
}
Another approach is to use three loops: the outer loop selects the subset size (2, 3, 4, ... up to n), and the two inner loops select the elements for that size, combining them with bitwise OR to get the answer.
The time complexity here is better than exponential.
If there is any problem, I can give you the code.
Please vote if you like it.
Let's find the solution by working bit by bit. Consider the following points first; we will formulate the algorithm based on them.
For N numbers, there are 2^N-1 non-empty subsets.
For N numbers with at most k bits each, what is the maximum possible output? It occurs when the OR of every subset is all 1's (i.e., every subset has a 1 in each of the k bit positions). Call this MAX. In your example k = 2 and N = 3, so MAX is reached when every subset OR is 11 (i.e., 3): MAX = (2^N-1)*(2^k-1) = 7*3 = 21.
Note that a bit of a subset's OR is 0 only when that bit is 0 in every element of the subset. So for every bit, first calculate how many subsets have 0 in that bit: if z elements have a 0 there, that is 2^z-1 subsets. Then multiply that count by the bit's value (2^bit_position) and deduct it from MAX. In your case, for the rightmost position (position 0), only one element has a 0 (namely 2). So in 2^1-1 = 1 subset, bit 0 of the subset OR is 0, and we deduct 1*1 from MAX. Similarly, for position 1 there is only one subset whose OR has a 0 at position 1 ({1}), so deduct 1*2 from MAX, leaving 21 - 1 - 2 = 18. For every bit, calculate this value and keep deducting; the final MAX is the result. If you are working with 16-bit integers and don't know k, calculate using k = 16.
Let's consider another example with N = {1,4}. The subsets are {1},{4},{1,4}, and the result is = 1+4+5 = 10
here k = 3, N = 2. SO MAX = (2^K-1)*(2^N-1) = 21.
For 0 bit, there is only single 0 (in 4). so deduct 1*1 from MAX. So new MAX = 21 -1 = 20.
For 1 bit, both 1 and 4 has 0. so deduct (2^2-1)*2 from MAX. So new MAX = 20 -6 = 14.
For 2 bit, there is only single 0 (in 1). so deduct 1*4 from MAX. So new MAX = 14 -4 = 10.
As we have calculated for every bit position, thus the final result is 10.
Time Complexity
First and second steps can be calculated in constant time
In the third step, the main work is counting the 0 bits at each position, which takes O(k*N) in total for N numbers. As k is a constant, the overall complexity is O(N).
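The bit-counting procedure above can be sketched in Python. This is a minimal sketch assuming non-negative integers; the function name sum_of_subset_ors is made up for illustration, and k is taken from the largest element rather than fixed at 16.

```python
def sum_of_subset_ors(nums):
    """Sum of the bitwise OR over all non-empty subsets of nums."""
    n = len(nums)
    if n == 0:
        return 0
    k = max(nums).bit_length()             # number of bit positions to consider
    subsets = (1 << n) - 1                 # number of non-empty subsets
    result = subsets * ((1 << k) - 1)      # MAX: every subset OR is all 1's
    for bit in range(k):
        # Elements with a 0 at this position.
        zeros = sum(1 for x in nums if ((x >> bit) & 1) == 0)
        # Subsets made only of such elements have a 0 here in their OR.
        result -= ((1 << zeros) - 1) * (1 << bit)
    return result
```

For [1, 2, 3] this gives 18 and for [1, 4] it gives 10, matching the two worked examples.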

Find 2 repeating elements in given array

Given an array with n+2 elements, all elements in the array are in the range 1 to n and all elements occur only once except two elements which occur twice.
Find those 2 repeating numbers. For example, if the array is [4, 2, 4, 5, 2, 3, 1], then n is 5, there are n+2 = 7 elements with all elements occurring only once except 2 and 4.
So my question is how to solve the above problem using XOR operation. I have seen the solution on other websites but I'm not able to understand it. Please consider the following example:
arr[] = {2, 4, 7, 9, 2, 4}
XOR every element. xor = 2^4^7^9^2^4 = 14 (1110)
Get a number which has only one set bit of the xor. Since we can easily get the rightmost set bit, let us use it.
set_bit_no = xor & ~(xor-1) = (1110) & ~(1101) = 0010. Now set_bit_no has only one bit set: the rightmost set bit of xor.
Now divide the elements in two sets and do xor of elements in each set, and we get the non-repeating elements 7 and 9.
Yes, you can solve it with XORs. This answer expands on Paulo Almeida's great comment.
The algorithm works as follows:
Since we know that the array contains every element in the range [1 .. n], we start by XORing every element in the array together and then XOR the result with every element in the range [1 .. n]. Because of the XOR properties, the unique elements cancel out and the result is the XOR of the duplicated elements (because the duplicate elements have been XORed 3 times in total, whereas all the others were XORed twice and canceled out). This is stored in xor_dups.
Next, find a bit in xor_dups that is a 1. Again, due to XOR's properties, a bit set to 1 in xor_dups means that this bit differs between the binary representations of the two duplicate numbers. Any bit that is a 1 can be picked for the next step; my implementation chooses the least significant one. This is stored in diff_bit.
Now, split the array elements into two groups: one group contains the numbers that have a 0 bit on the position of the 1-bit that we picked from xor_dups. The other group contains the numbers that have a 1-bit instead. Since this bit is different in the numbers we're looking for, they can't both be in the same group. Furthermore, both occurrences of each number go to the same group.
So now we're almost done. Consider the group for the elements with the 0-bit. XOR them all together, then XOR the result with all the elements in the range [1..n] that have a 0-bit on that position, and the result is the duplicate number of that group (because there's only one number repeated inside each group, all the non-repeated numbers canceled out because each one was XORed twice except for the repeated number which was XORed three times).
Rinse, repeat: for the group with the 1-bit, XOR them all together, then XOR the result with all the elements in the range [1..n] that have a 1-bit on that position, and the result is the other duplicate number.
Here's an implementation in C:
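#include <assert.h>
#include <stddef.h>

void find_two_repeating(int arr[], size_t arr_len, int *a, int *b) {
    assert(arr_len > 3);
    size_t n = arr_len - 2;
    size_t i;

    /* XOR of the array with 1..n leaves the XOR of the two duplicates. */
    int xor_dups = 0;
    for (i = 0; i < arr_len; i++)
        xor_dups ^= arr[i];
    for (i = 1; i <= n; i++)
        xor_dups ^= (int)i;

    /* Lowest set bit: a position where the two duplicates differ. */
    int diff_bit = xor_dups & -xor_dups;

    *a = 0;
    *b = 0;
    /* XOR each group of the array, then the matching group of 1..n. */
    for (i = 0; i < arr_len; i++)
        if (arr[i] & diff_bit)
            *a ^= arr[i];
        else
            *b ^= arr[i];
    for (i = 1; i <= n; i++)
        if ((int)i & diff_bit)
            *a ^= (int)i;
        else
            *b ^= (int)i;
}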
arr_len is the total length of the array arr (the value of n+2), and the repeated entries are stored in *a and *b (these are so-called output parameters).
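For readers who want to step through the algorithm interactively, here is the same procedure as a small Python sketch. It mirrors the C function above and assumes the input really has n+2 elements drawn from 1..n with exactly two values repeated; returning the result as a tuple is a choice made for this sketch.

```python
def find_two_repeating(arr):
    """Return the two repeated values in arr, which holds 1..n plus two repeats."""
    n = len(arr) - 2
    # XOR of the array with 1..n: everything cancels except the two duplicates.
    xor_dups = 0
    for x in arr:
        xor_dups ^= x
    for i in range(1, n + 1):
        xor_dups ^= i
    # Lowest set bit of xor_dups: a bit where the two duplicates differ.
    diff_bit = xor_dups & -xor_dups
    a = b = 0
    # Split the array and the range 1..n by that bit and XOR each side.
    for x in arr:
        if x & diff_bit:
            a ^= x
        else:
            b ^= x
    for i in range(1, n + 1):
        if i & diff_bit:
            a ^= i
        else:
            b ^= i
    return a, b
```

With the array from the question, find_two_repeating([4, 2, 4, 5, 2, 3, 1]) returns (2, 4).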

Algorithms for selecting a number uniformly randomly from a subset of integers

Suppose we have an int array with a[i]=i for i=0, 1, 2, ..., N-1, and each integer has an associated bit. I need an algorithm to select an integer uniformly at random from the subset of integers whose associated bit is 0. Assume that the total number of integers whose associated bit is 0 is already given in a variable T.
One straightforward solution I have is to generate r=rand() % T and then find the r-th integer whose associated bit is 0 (by testing i=0,1,...). However, I wonder whether there is a better algorithm for doing this. Also, if the associated bits are stored in some long int variables (which is true in my case), finding the r-th integer whose associated bit is 0 would not be an easy task.
Thanks for your inputs.
If the associated bits are irregular, i.e. cannot be deduced from the value of i by a simple formula, then it is just impossible to locate the r-th '0' bit without enumerating those that precede, unless preprocessing is allowed.
A good solution is to precompute a table that will store the indexes of the '0' bit entries contiguously, and lookup this table for the r-th entry. (Instead of an index table, you can as well fill another array with the elements from the subset only.)
Indexing a packed bit array is not such a big deal. Assuming 64 bits long ints, the bit at index i is found by the expression
(PackedBits[i >> 6] >> (i & 63)) & 1
(The 6 because 64 == (1 << 6).)
In case you really want to find the r-th '0' sequentially, you can speed-up the search a little (x 64) by precomputing the number of '0's in every long int so that you can skip 64 entries in a single go.
And if you really really don't want to precompute anything, you can still speed-up the search by processing the bits 8 by 8, using a static table that relates every byte value (among 256) to the number of '0' bits in it. (Or even 16 by 16 if you can afford using a table of 65536 numbers.)
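The idea of skipping 64 entries at a time can be sketched in Python. This is only an illustration: the names packed, zero_counts and rth_zero_index are made up here, the bit layout follows the indexing expression given above, and any unused padding bits in the last word are assumed to be set to 1 so that they are never selected.

```python
import random

WORD_BITS = 64
WORD_MASK = (1 << WORD_BITS) - 1

def zero_counts(packed):
    """Precompute the number of 0 bits in every 64-bit word."""
    return [WORD_BITS - bin(w & WORD_MASK).count("1") for w in packed]

def rth_zero_index(packed, counts, r):
    """Return the index i of the r-th (0-based) bit that is 0."""
    for word_idx, zeros in enumerate(counts):
        if r >= zeros:
            r -= zeros          # skip the whole word in one step
            continue
        word = packed[word_idx]
        for bit in range(WORD_BITS):
            if ((word >> bit) & 1) == 0:
                if r == 0:
                    return word_idx * WORD_BITS + bit
                r -= 1
    raise IndexError("fewer than r + 1 zero bits")

def pick_random_zero(packed, counts):
    """Pick uniformly among the indices whose bit is 0 (T = sum(counts))."""
    return rth_zero_index(packed, counts, random.randrange(sum(counts)))
```

The counts table only needs rebuilding for words whose bits change, so it stays cheap to maintain if the bit array is updated between draws.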
You can speed this up by trading memory for speed.
T must be an array, that stores in T[n] the number of integers in a[] that have bit n cleared, and this needs to be precomputed at some point. So, while you are calculating that, store the indices of all the integers that have a given bit cleared in another 2 dimensional array, indexed by the bit number and r.
In C for example:
#define BITS (64)
#define N (100)
long int a[N];
int T[BITS];
int index[BITS][N];
void init()
{
int i, j;
// clear T:
for(j = 0; j < BITS; j++)
T[j] = 0;
// compute T and the indices for each:
for(i = 0; i < N; i++)
{
for(j = 0; j < BITS; j++)
{
if((a[i] & (1UL << j)) == 0) // 1UL: a plain int 1 cannot be shifted by 32 or more
{
// increment T and store the index
index[j][T[j]++] = i;
}
}
}
}
Then you can find your random number like this:
long number = a[index[bit][rand() % T[bit]]];
You could make this more memory-efficient by using a less wasteful data structure that only stores as many indices for each bit as there are actual values in a[] that have the bit cleared.
If T is sufficiently large, the most efficient solution is going to be to randomly select an integer up to N and loop until the condition is met.
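The retry loop suggested above is usually called rejection sampling. A minimal Python sketch follows, assuming the bits are packed 64 per word in a list named packed (a hypothetical name) and that at least one bit among the first n is 0, since otherwise the loop never terminates. The expected number of attempts is N/T.

```python
import random

def pick_zero_by_rejection(packed, n, rng=random):
    """Draw indices in [0, n) until one whose associated bit is 0 turns up."""
    while True:
        i = rng.randrange(n)
        # Same indexing expression as for the packed bit array above.
        if ((packed[i >> 6] >> (i & 63)) & 1) == 0:
            return i
```

Each accepted index is equally likely, because every index is proposed with the same probability and rejected indices are simply redrawn.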

How to improve on this implementation of the radix-sort?

I'm implementing a 2-byte Radix Sort. The concept is to use Counting Sort, to sort the lower 16 bits of the integers, then the upper 16 bits. This allows me to run the sort in 2 iterations. The first concept I had was trying to figure out how to handle negatives. Since the sign bit would be flipped for negative numbers, then in hex form, that would make negatives greater than the positives. To combat this I flipped the sign bit when it was positive, in order to make [0, 2 bil) = [128 000 000 000, 255 255...). And when it was negative I flipped all the bits, to make it range from (000 000 .., 127 255 ..). This site helped me with that information. To finish it off, I would split the integer into either the top or bottom 16-bits based on the pass. The following is the code allowing me to do that.
static uint32_t position(int number, int pass) {
int mask;
if (number <= 0) mask = 0x80000000;
else mask = (number >> 31) | 0x80000000;
uint32_t out = number ^ mask;
return pass == 0 ? out & 0xffff : (out >> 16) & 0xffff;
}
To start the actual Radix Sort, I needed to form a histogram of size 65536 elements. The problem I ran across was when the number of elements inputted was very large. It would take a while to create the histogram, so I implemented it in parallel, using processes and shared memory. I partitioned the array into subsections of size / 8. Then over an array of shared memory sized 65536 * 8, I had each process create its own histogram. Afterwards, I summed it all together to form a single histogram. The following is the code for that:
for (i=0;i<8;i++) {
pid_t pid = fork();
if (pid < 0) _exit(0);
if (pid == 0) {
const int start = (i * size) >> 3;
const int stop = i == 7 ? size : ((i + 1) * size) >> 3;
const int curr = i << 16;
for (j=start;j<stop;++j)
hist[curr + position(array[j], pass)]++;
_exit(0);
}
}
for (i=0;i<8;i++) wait(NULL);
for (i=1;i<8;i++) {
const int pos = i << 16;
for (j=0;j<65536;j++)
hist[j] += hist[pos + j];
}
The next part was where I spent most of my time analyzing how cache affected the performance of the prefix-sum. With an 8-bit and 11-bit pass Radix Sort, all of the histogram would fit within L1 cache. With 16-bits, it would only fit within L2 cache. In the end the 16-bit histogram ran the sum the fastest, since I only had to run 2 iterations with it. I also ran the prefix sum in parallel as per the CUDA website recommendations. At 250 million elements, this ran about 1.5 seconds slower than the 16-bit integer. So my prefix sum ended up looking like this:
for (i=1;i<65536;i++)
hist[i] += hist[i-1];
The only thing left was to traverse backwards through the array and put all the elements into their respective spots in the temp array. Since I only had to go through twice, instead of copying from the temp back to array, and running the code again. I ran the sort first using array as the input, and temp as the output. Then ran it the second time using temp as the input and array as the output. This kept me from mem-copying back to array both times. The code looks like this for the actual sort:
histogram(array, size, 0, hist);
for (i=size-1;i>=0;i--)
temp[--hist[position(array[i], 0)]] = array[i];
memset(hist, 0, arrSize);
histogram(temp, size, 1, hist);
for (i=size-1;i>=0;i--)
array[--hist[position(temp[i], 1)]] = temp[i];
This link contains the full code that I have so far. I ran a test against quicksort, and it ran between 5 and 10 times faster with integers and floats, and about 5 times faster with 8-byte data types. Is there a way to improve on this?
My guess would be that handling the sign of the integers during the sort is not worth it. It complicates and slows down your code. I'd go for a first sort as unsigned and then do a second pass that just reorders the two halves and inverts the half with the negatives.
Also, from your code I don't get how you have the different processes operate together. How do you collect the histogram in the parent? Do you have a process-shared variable? In any case, using pthreads would be much more appropriate here.
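For reference, the two-pass scheme from the question can be sketched sequentially in Python. This is a single-process illustration, not the poster's full code: it assumes 32-bit signed integers, and it relies on the fact that flipping only the sign bit of a two's-complement value already makes unsigned order match signed order. The name radix_sort_i32 is made up for this sketch.

```python
def position(number, pass_no):
    """16-bit digit of a 32-bit signed int after flipping its sign bit."""
    out = (number & 0xFFFFFFFF) ^ 0x80000000
    return (out >> (16 * pass_no)) & 0xFFFF

def radix_sort_i32(values):
    """LSD radix sort in two counting-sort passes of 16 bits each."""
    src = list(values)
    for pass_no in (0, 1):
        # Histogram of the current digit.
        hist = [0] * 65536
        for v in src:
            hist[position(v, pass_no)] += 1
        # Inclusive prefix sum: hist[d] is one past the last slot for digit d.
        for i in range(1, 65536):
            hist[i] += hist[i - 1]
        # Walk backwards so that equal digits keep their order (stability).
        dst = [0] * len(src)
        for v in reversed(src):
            key = position(v, pass_no)
            hist[key] -= 1
            dst[hist[key]] = v
        src = dst  # the output of this pass is the input of the next
    return src
```

Swapping the roles of the two buffers between passes corresponds to the array/temp ping-pong described in the question.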

How to find a 2 unpaired elements in array?

You have an array with n=2k+2 elements where 2 elements have no pair. Example for an 8-element array: 1 2 3 47 3 1 2 0. "47" and "0" have no pair in the array. If I have an array where only 1 element has no pair, I can solve this problem with XOR. But I have 2 unpaired elements! What can I do? The solution should run in O(n) time and use O(1) additional memory.
Some hints...
It will take 2 passes. First, go through the list and XOR all elements together. See what you get. Proceed from there.
Edit: The key observation about the result of the first pass should be that it shows you the set of bits in which the 2 unpaired elements differ.
Use INT_MAX/8 bytes of memory. Walk the array. XOR the bit corresponding to each value with 1. If there are 0 or 2 instances the bit will end up 0. If there is only one instance, it will be set. O(1) mem, O(N) time.
Scan the Array and put each number and count in hash.
Rescan and find out the items with count=1.
This is O(n).
You can try this. It will take O(n) time.
int xor = arr[0];
int set_bit_no;
int i;
int x = 0; //First unpair number
int y = 0; //second unpair number
for (i = 1; i < n; i++)
xor ^= arr[i];
set_bit_no = xor & ~(xor-1);//Get the rightmost set bit in set_bit_no
for (i = 0; i < n; i++)
{
if (arr[i] & set_bit_no) {
//XOR of first set
x = x ^ arr[i];
}
else
{
//XOR of second set
y = y ^ arr[i];
}
}
Explanation...
arr[] = {2, 4, 7, 9, 2, 4}
1) Get the XOR of all the elements.
xor = 2^4^7^9^2^4 = 14 (1110)
2) Get a number which has only one set bit of the xor.
Since we can easily get the rightmost set bit, let us use it.
set_bit_no = xor & ~(xor-1) = (1110) & ~(1101) = 0010
Now set_bit_no has only one bit set: the rightmost set bit of xor.
3) Now divide the elements in two sets and do xor of
elements in each set, and we get the non-repeating
elements 7 and 9.
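The three steps above can also be written as a short Python sketch for experimenting. The function name find_two_unpaired is made up for illustration, and the input is assumed to contain exactly two distinct values that occur once while every other value occurs twice.

```python
def find_two_unpaired(arr):
    """Return the two values that occur once when all others occur twice."""
    # Step 1: XOR of everything equals the XOR of the two unpaired values.
    xor_all = 0
    for v in arr:
        xor_all ^= v
    # Step 2: isolate the rightmost set bit, where the two values differ.
    set_bit = xor_all & -xor_all
    # Step 3: split by that bit; paired values cancel inside each group.
    x = y = 0
    for v in arr:
        if v & set_bit:
            x ^= v
        else:
            y ^= v
    return x, y
```

For {2, 4, 7, 9, 2, 4} this returns (7, 9), and for the array from the question, 1 2 3 47 3 1 2 0, it returns (47, 0).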
