Effective algorithms for selecting the top k (in percent) items from a data stream - C

I have to repeatedly sort an array containing 300 random elements. But I have to do a special kind of sort: I need the 5% smallest values from a subset of the array; then some value is calculated, the subset is increased, the value is calculated again, the subset is increased again, and so on until the subset contains the whole array.
The subset starts with the first 10 elements and is increased by 10 elements after each step.
i.e., with k = ceil(5% * subset size):
subset size 10: k = 1 (so just the smallest element)
subset size 20: k = 1 (so also just the smallest)
subset size 30: k = 2 (smallest and second smallest)
...
The calculated value is basically the sum of all elements smaller than the k-th smallest element, plus the k-th smallest element itself multiplied by a special weight.
In code:
k = ceil(0.05 * subset) - 1;  // -1 because array indices start at 0
temp = 0.0;
for (int i = 0; i < k; i++)
    temp += smallestElements[i];
temp += b * smallestElements[k];
I have implemented a selection-sort-based algorithm myself (code at the end of this post). I use max(k) pointers to keep track of the k smallest elements, and as a side effect I unnecessarily keep all of those elements fully sorted :/
Furthermore, I know selection sort is bad for performance, which is unfortunately crucial in my case.
I tried to figure out a way to use some quicksort- or heapsort-based algorithm. I know that quickselect and heapselect are perfect for finding the k smallest elements when k and the subset are fixed.
But because my subset behaves more like an input stream of data, I think quicksort-based algorithms drop out.
I know that heapselect would be perfect for a data stream if k were fixed, but I haven't managed to adapt heapselect to a dynamic k without big performance drops, so that it ends up less effective than my selection-sort-based version :( Can anyone help me modify heapselect for dynamic k's?
If there is no better algorithm, maybe you can find a different/faster approach for my selection sort implementation. Here is a minimal example of my implementation; the calculated value isn't used in this example, so don't worry about it. (In my real program I have just unrolled some loops manually for better performance.)
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#define ARRAY_SIZE 300
#define STEP_SIZE 10

float sortStream(float *array, float **pointerToSmallest, int k_max) {
    int i, j, k, last = k_max - 1;
    float temp = 0.0;

    // init first two pointers
    if (array[0] < array[1]) {
        pointerToSmallest[0] = &array[0];
        pointerToSmallest[1] = &array[1];
    } else {
        pointerToSmallest[0] = &array[1];
        pointerToSmallest[1] = &array[0];
    }

    // init remaining pointers until i == k_max
    for (i = 2; i < k_max; ++i) {
        if (*pointerToSmallest[i - 1] < array[i]) {
            pointerToSmallest[i] = &array[i];
        } else {
            pointerToSmallest[i] = pointerToSmallest[i - 1];
            for (j = 0; j < i - 1 && *pointerToSmallest[i - 2 - j] > array[i]; ++j)
                pointerToSmallest[i - 1 - j] = pointerToSmallest[i - 2 - j];
            pointerToSmallest[i - 1 - j] = &array[i];
        }
        if ((i + 1) % STEP_SIZE == 0) {
            k = ceil(0.05 * i) - 1;  // k for the current subset (size i+1)
            for (j = 0; j < k; j++)
                temp += *pointerToSmallest[j];
            temp += 2 * (*pointerToSmallest[k]);
        }
    }

    // selection-sort the remaining elements into the k_max pointers
    for (; i < ARRAY_SIZE; ++i) {
        if (*pointerToSmallest[last] > array[i]) {
            for (j = 0; j != last && *pointerToSmallest[last - 1 - j] > array[i]; ++j)
                pointerToSmallest[last - j] = pointerToSmallest[last - 1 - j];
            pointerToSmallest[last - j] = &array[i];
        }
        if ((i + 1) % STEP_SIZE == 0) {
            k = ceil(0.05 * i) - 1;
            for (j = 0; j < k; j++)
                temp += *pointerToSmallest[j];
            temp += 2 * (*pointerToSmallest[k]);
        }
    }
    return temp;
}

int main(void) {
    int i, k_max = ceil(0.05 * ARRAY_SIZE);
    float *array = malloc(ARRAY_SIZE * sizeof(float));
    float **pointerToSmallest = malloc(k_max * sizeof(float *));
    for (i = 0; i < ARRAY_SIZE; i++)
        array[i] = rand() / (float)RAND_MAX * 100 - 50;

    // return a, so that the compiler doesn't drop the function call
    float a = sortStream(array, pointerToSmallest, k_max);
    free(array);
    free(pointerToSmallest);
    return (int)a;
}
Thank you very much

By using two heaps to store all items from the stream, you can:
find the top p% elements in O(1)
update the data structure (two heaps) in O(log N)
Assume we currently have N elements and k = p% * N:
a min-heap (LargerPartHeap) stores the top k items;
a max-heap (SmallerPartHeap) stores the other (N - k) items.
All items in SmallerPartHeap are less than or equal to the minimum item of LargerPartHeap (the top of LargerPartHeap).
1. For the query "what are the top p% elements?", simply return LargerPartHeap.
2. For the update "new element x arrives from the stream":
2.a check the new k' = (N + 1) * p%; if k' = k + 1, move the top of SmallerPartHeap to LargerPartHeap - O(log N)
2.b if x is larger than the top (minimum) element of LargerPartHeap, insert x into LargerPartHeap and move the top of LargerPartHeap to SmallerPartHeap; otherwise, insert x into SmallerPartHeap - O(log N)
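A minimal C sketch of this scheme (untested; Heap, stream_insert and the other names are illustrative, the fixed capacity of 512 is enough for this question's 300 elements, and the promotion step is done after routing so the very first elements are handled too):

typedef struct {
    float data[512];  /* fixed capacity: enough for this question's 300 items */
    int size;
    int min;          /* 1 = min-heap (LargerPartHeap), 0 = max-heap (SmallerPartHeap) */
} Heap;

static int before(const Heap *h, float a, float b) {
    return h->min ? a < b : a > b;
}

static void heap_push(Heap *h, float v) {
    int i = h->size++;
    h->data[i] = v;
    while (i > 0 && before(h, h->data[i], h->data[(i - 1) / 2])) {
        float t = h->data[i];
        h->data[i] = h->data[(i - 1) / 2];
        h->data[(i - 1) / 2] = t;
        i = (i - 1) / 2;
    }
}

static float heap_pop(Heap *h) {            /* caller must ensure size > 0 */
    float top = h->data[0];
    h->data[0] = h->data[--h->size];
    for (int i = 0;;) {
        int l = 2 * i + 1, r = l + 1, best = i;
        if (l < h->size && before(h, h->data[l], h->data[best])) best = l;
        if (r < h->size && before(h, h->data[r], h->data[best])) best = r;
        if (best == i) break;
        float t = h->data[i];
        h->data[i] = h->data[best];
        h->data[best] = t;
        i = best;
    }
    return top;
}

/* One stream update; k_new = ceil(p% * (N + 1)) is computed by the caller. */
void stream_insert(Heap *larger, Heap *smaller, float x, int k_new) {
    /* step 2.b: route x to the correct side */
    if (larger->size > 0 && x > larger->data[0]) {
        heap_push(larger, x);
        heap_push(smaller, heap_pop(larger));
    } else {
        heap_push(smaller, x);
    }
    /* step 2.a: promote while the top side is smaller than the target k */
    while (larger->size < k_new && smaller->size > 0)
        heap_push(larger, heap_pop(smaller));
}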

I believe heapsort is far too complicated for this particular problem, even though it (and other priority-queue algorithms) is well suited to getting the N minimum or maximum items from a stream.
The first thing to notice is the constraint: 0.05 * 300 = 15. That is the maximum amount of data that has to be sorted at any moment. Also, during each iteration, 10 elements are added. The overall operation, in place, could be:
for (i = 0; i < 30; i++)
{
    if (i != 1)
        qsort(input + i*10, 10, sizeof(input[0]), cmpfunc);
    else
        qsort(input, 20, sizeof(input[0]), cmpfunc);
    if (i > 1)
        merge_sort15(input, 15, input + i*10, 10, cmpfunc);
}
When i == 1, one could also merge sort input and input+10 to produce a completely sorted array of 20 elements in place, since that has lower complexity than a generic sort. The "optimizing" here also lies in minimizing the primitives of the algorithm.
merge_sort15 would only consider the first 15 elements of the first array and the first 10 elements of the second one.
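For concreteness, a possible shape for merge_sort15, as an untested sketch (it assumes float data, an ascending cmpfunc as used with qsort above, and a small temporary buffer):

/* Merge the first n1 sorted elements of dst with the n2 sorted elements
 * of src, keeping only the n1 smallest results in dst (n1 <= 15 here). */
void merge_sort15(float *dst, int n1, const float *src, int n2,
                  int (*cmp)(const void *, const void *)) {
    float tmp[15];
    int i = 0, j = 0, k;
    for (k = 0; k < n1; k++)
        tmp[k] = (j >= n2 || (i < n1 && cmp(&dst[i], &src[j]) <= 0))
               ? dst[i++] : src[j++];
    for (k = 0; k < n1; k++)
        dst[k] = tmp[k];
}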
EDIT: The parameters of the problem have a considerable effect on choosing the right algorithm. Here, selecting "sort 10 items" as the basic unit allows one half of the problem to be parallelized, namely sorting 30 individual blocks of 10 items each -- a problem that can be solved efficiently with a fixed-pipeline algorithm using sorting networks. With a different parametrization such an approach may not be feasible.

Related

What is the most efficient (fastest) way to find the N largest integers in an array in C?

Let's have an array of size 8, and let N be 3. With the array:
1 3 2 17 19 23 0 2
Our output should be:
23, 19, 17
Explanation: The three largest numbers from the array, listed in descending order.
I have tried this:
int array[8];
int largest[N] = {0, 0, 0};
for (int i = 0; i < N; i++) {
    for (int j = 0; j < SIZE_OF_ARRAY; j++) {
        if (array[j] > largest[i]) {
            // put the previous running maximum back into the array,
            // so later passes can still find it
            int temp = largest[i];
            largest[i] = array[j];
            array[j] = temp;
        }
    }
}
Additionally, let the constraints be as follows:
integers in the array are in the range 0 <= arr[i] <= 1000
N is in the range 1 <= N <= SIZE_OF_ARRAY - 1
SIZE_OF_ARRAY is in the range 2 <= SIZE_OF_ARRAY <= 1 000 000
My way of implementing it is very inefficient, as it scrubs the entire array N times. With huge arrays, this can take several minutes.
What would be the fastest and most efficient way to implement this in C?
You should look at the histogram algorithm. Since the values have to be between 0 and 1000, you just allocate an array for each of those values:
#define MAX_VALUE 1000

int occurrences[MAX_VALUE + 1];
int largest[N];
int i, j;

for (i = 0; i < N; i++)
    largest[i] = -1;
for (i = 0; i <= MAX_VALUE; i++)
    occurrences[i] = 0;
for (i = 0; i < SIZE_OF_ARRAY; i++)
    occurrences[array[i]]++;
// Step through the occurrences array backward to find the N largest values.
for (i = MAX_VALUE, j = 0; i >= 0 && j < N; i--)
    if (occurrences[i] > 0)
        largest[j++] = i;
Note that this will yield only one element in largest for each unique value. Modify the insertion accordingly if you want all occurrences to appear in largest. Because of that, you may get values of -1 for some elements if there weren't enough unique large numbers to fill the largest array. Finally, the results in largest will be sorted from largest to smallest. That will be easy to fix if you want to: just fill the largest array from right to left.
The fastest way is to recognize that data doesn't just appear: it either exists at compile time, or arrives by IO (from files, from the network, etc.). Therefore you can find the 3 highest values when the data is created, either at compile time, or while you're parsing, sanity checking and storing the data received by IO. This is likely to be the fastest possible way, because you're either doing nothing at run-time or avoiding the need to look at all the data a second time.
However, in this case, if the data is modified after it was created, then you'd need to update the "3 highest values" at the same time as the data is modified. That is easy if a lower value is replaced by a higher value (you just check whether the new value becomes one of the 3 highest values), but it involves a search if a "previously highest" value is replaced with a lower one.
If you need to search; then it can be done with a single loop, like:
int firstHighest = INT_MIN;
int secondHighest = INT_MIN;
int thirdHighest = INT_MIN;

for (int i = 0; i < SIZE_OF_ARRAY; i++) {
    if (array[i] > thirdHighest) {
        if (array[i] > secondHighest) {
            if (array[i] > firstHighest) {
                thirdHighest = secondHighest;
                secondHighest = firstHighest;
                firstHighest = array[i];
            } else {
                thirdHighest = secondHighest;
                secondHighest = array[i];
            }
        } else {
            thirdHighest = array[i];
        }
    }
}
Note: the exact code will depend on what you want to do with duplicates. You may need to replace if (array[i] > secondHighest) { with if (array[i] >= secondHighest) { and if (array[i] > firstHighest) { with if (array[i] >= firstHighest) { if you want the numbers 1, 2, 3, 4, 4, 4, 4 to give the answer 4, 4, 4 instead of 2, 3, 4.
For large amounts of data it can be accelerated with SIMD and/or multiple threads. For example, if SIMD can do "bundles of 8 integers" and you have 4 CPUs (and 4 threads), then you can split the array into quarters, treat each quarter as columns of 8 elements, find the highest 3 values in each column of each quarter, and then determine the highest 3 values from those per-column results. In this case you will probably want to pad the end of the array (with dummy values set to INT_MIN) to ensure that the array's total size is a multiple of the SIMD width and the number of CPUs.
For small amounts of data the extra overhead of setting up SIMD and/or coordinating multiple threads is going to cost more than it saves; and the "simple loop" version is likely to be as fast as it gets.
For unknown/variable amounts of data you could provide multiple alternatives (simple loop, SIMD with single thread, and SIMD with a variable number of threads) and decide which method to use (and how many threads to use) at run-time based on the amount of data.
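For the multi-threaded half of that idea (leaving SIMD aside), a hedged sketch with POSIX threads; CHUNKS, Job, top3_parallel and the rest are illustrative names, not a tested implementation:

#include <limits.h>
#include <pthread.h>
#include <stddef.h>

#define CHUNKS 4   /* assumed thread count, purely illustrative */

typedef struct { const int *a; size_t len; int top[3]; } Job;

/* Fold one value into a descending top-3 (top[0] >= top[1] >= top[2]). */
static void top3_update(int top[3], int v) {
    if (v > top[2]) {
        if (v > top[0])      { top[2] = top[1]; top[1] = top[0]; top[0] = v; }
        else if (v > top[1]) { top[2] = top[1]; top[1] = v; }
        else                 { top[2] = v; }
    }
}

static void *worker(void *p) {
    Job *j = p;
    j->top[0] = j->top[1] = j->top[2] = INT_MIN;
    for (size_t i = 0; i < j->len; i++)
        top3_update(j->top, j->a[i]);
    return NULL;
}

void top3_parallel(const int *a, size_t n, int out[3]) {
    pthread_t th[CHUNKS];
    Job jobs[CHUNKS];
    size_t chunk = n / CHUNKS;
    for (int c = 0; c < CHUNKS; c++) {
        jobs[c].a = a + (size_t)c * chunk;
        jobs[c].len = (c == CHUNKS - 1) ? n - (size_t)c * chunk : chunk;
        pthread_create(&th[c], NULL, worker, &jobs[c]);
    }
    out[0] = out[1] = out[2] = INT_MIN;
    for (int c = 0; c < CHUNKS; c++) {
        pthread_join(th[c], NULL);
        for (int t = 0; t < 3; t++)   /* combine the per-chunk results */
            top3_update(out, jobs[c].top[t]);
    }
}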
One method I can think of is to just sort the array and return the first N numbers. Since the array is sorted, the N numbers we return will be the N largest numbers of the array. This method takes O(n log n) time, where n is the number of elements in the given array. I think this is probably a very good time complexity for approaching this problem.
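As a minimal sketch with the standard library's qsort (the comparator and wrapper names are mine):

#include <stdlib.h>

static int desc(const void *p, const void *q) {
    int a = *(const int *)p, b = *(const int *)q;
    return (a < b) - (a > b);   /* descending order, overflow-safe */
}

/* After the descending sort, the N largest values are array[0] .. array[N-1]. */
void n_largest_by_sort(int *array, size_t n, size_t N, int *out) {
    qsort(array, n, sizeof array[0], desc);
    for (size_t i = 0; i < N; i++)
        out[i] = array[i];
}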
Another approach with similar time complexity would be to use a max-heap. Build a max-heap from the given array, then pop() (extract, or whatever you call it) N times; each pop returns the topmost element, which is the maximum element remaining in the heap.
The time complexity of this approach could be considered even better than the first one: O(n + N log n), where n is the number of elements in the array and N is the number of largest elements to be found. Building the heap takes O(n), and each of the N pops of the topmost element takes O(log n), which sums to O(n + N log n), slightly better than O(n log n).
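A minimal C sketch of that approach (illustrative names; Floyd's O(n) heapify followed by N pops, modifying the array in place):

/* Restore the max-heap property below node i in a[0..n-1]. */
static void sift_down(int *a, int n, int i) {
    for (;;) {
        int l = 2 * i + 1, r = l + 1, big = i;
        if (l < n && a[l] > a[big]) big = l;
        if (r < n && a[r] > a[big]) big = r;
        if (big == i) return;
        int t = a[i]; a[i] = a[big]; a[big] = t;
        i = big;
    }
}

/* Writes the N largest values of a[0..n-1] into out[], in descending order. */
void n_largest(int *a, int n, int N, int *out) {
    for (int i = n / 2 - 1; i >= 0; i--)   /* Floyd's O(n) heapify */
        sift_down(a, n, i);
    for (int j = 0; j < N; j++) {          /* N pops, O(N log n) */
        out[j] = a[0];
        a[0] = a[--n];
        sift_down(a, n, 0);
    }
}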

Shuffle an array while making each index have the same probability to be in any index

I want to shuffle an array so that each element has the same probability of ending up at any other index (excluding its own).
I have this solution, but I find that the last 2 indexes are always swapped with each other:
void Shuffle(int arr[], size_t n)
{
    int newIndx = 0;
    int i = 0;
    for (; i < n - 1; ++i)
    {
        newIndx = rand() % (n - 1);
        if (newIndx >= i)
        {
            ++newIndx;
        }
        swap(i, newIndx, arr);
    }
}
But in the end it might still happen that some indexes end up back in their original place. Any thoughts?
C lang.
A permutation (shuffle) where no element is in its original place is called a derangement.
Generating random derangements is harder than generating random permutations, but it can still be done in linear time and space. (Generating a random permutation can be done in linear time and constant space.) Here are two possible algorithms.
The simplest solution to understand is a rejection strategy: do a Fisher-Yates shuffle, but if the shuffle attempts to put an element at its original spot, restart the shuffle. [Note 1]
Since the probability that a random shuffle is a derangement is approximately 1/e, the expected number of shuffles performed is about e (that is, 2.71828…). But since unsuccessful shuffles are restarted as soon as the first fixed point is encountered, the total number of shuffle steps is less than e times the array size. For a detailed analysis, see this paper, which proves that the expected number of random numbers needed by the algorithm is around (e−1) times the number of elements.
In order to be able to do the check and restart, you need to keep an array of indices. The following little function produces a derangement of the indices from 0 to n-1; it is necessary to then apply the permutation to the original array.
/* n must be at least 2 for this to produce meaningful results */
void derange(size_t n, int ind[]) {
    for (size_t i = 0; i < n; ++i) ind[i] = (int)i;
    swap(ind, 0, randint(1, n));
    for (size_t i = 1; i < n; ++i) {
        int r = randint(i, n);
        swap(ind, i, r);
        if (ind[i] == (int)i) i = 0;  /* fixed point found: restart */
    }
}
Here are the two functions used by that code:
void swap(int arr[], size_t i, size_t j) {
    int t = arr[i];
    arr[i] = arr[j];
    arr[j] = t;
}

/* This is not the best possible implementation */
int randint(int low, int lim) {
    return low + rand() % (lim - low);
}
The following function is based on the 2008 paper "Generating Random Derangements" by Conrado Martínez, Alois Panholzer and Helmut Prodinger, although I use a different mechanism to track cycles. Their algorithm uses a bit vector of size N but uses a rejection strategy in order to find an element which has not been marked. My algorithm uses an explicit vector of indices not yet operated on. The vector is also of size N, which is still O(N) space [Note 2]; since in practical applications, N will not be large, the difference is not IMHO significant. The benefit is that selecting the next element to use can be done with a single call to the random number generator. Again, this is not particularly significant since the expected number of rejections in the MP&P algorithm is very small. But it seems tidier to me.
The basis of the algorithms (both MP&P and mine) is the recursive procedure to produce a derangement. It is important to note that a derangement is necessarily the composition of some number of cycles where each cycle is of size greater than 1. (A cycle of size 1 is a fixed point.) Thus, a derangement of size N can be constructed from a smaller derangement using one of two mechanisms:
Produce a derangement of the N-1 elements other than element N, and add N to some cycle at any point in that cycle. To do so, randomly select any element j of the (N-1)-element derangement and place N immediately after j in j's cycle. This alternative covers all possibilities where N is in a cycle of size greater than 2.
Produce a derangement of N-2 of the N-1 elements other than N, and add a cycle of size 2 consisting of N and the element not selected from the smaller derangement. This alternative covers all possibilities where N is in a cycle of size 2.
If D(n) is the number of derangements of size n, it is easy to see from the above recursion that:
D(n) = (n - 1) * (D(n - 1) + D(n - 2))
The multiplier is n - 1 in both cases: in the first alternative, it refers to the number of possible places N can be added, and in the second alternative to the number of possible ways to select the n - 2 elements of the recursive derangement.
Therefore, if we were to recursively produce a random derangement of size N, we would randomly select one of the N-1 previous elements, and then make a random boolean decision on whether to produce alternative 1 or alternative 2, weighted by the number of possible derangements in each case.
One advantage to this algorithm is that it can derange an arbitrary vector; there is no need to apply the permuted indices to the original vector as with the rejection algorithm.
As MP&P note, the recursive algorithm can just as easily be performed iteratively. This is quite clear in the case of alternative 2, since the new 2-cycle can be generated either before or after the recursion, so it might as well be done first and then the recursion is just a loop. But that is also true for alternative 1: we can make element N the successor in a cycle to a randomly-selected element j even before we know which cycle j will eventually be in. Looked at this way, the difference between the two alternatives reduces to whether or not element j is removed from future consideration or not.
As shown by the recursion, alternative 2 should be chosen with probability (n - 1) * D(n - 2) / D(n), which is how MP&P write their algorithm. I used the equivalent formula D(n - 2) / (D(n - 1) + D(n - 2)), mostly because my prototype used Python (for its built-in bignum support).
Without bignums, the number of derangements and hence the probabilities need to be approximated as double, which will create a slight bias and limit the size of the array to be deranged to about 170 elements. (long double would allow slightly more.) If that is too much of a limitation, you could implement the algorithm using some bignum library. For ease of implementation, I used the Posix drand48 function to produce random doubles in the range [0.0, 1.0). That's not a great random number function, but it's probably adequate to the purpose and is available in most standard C libraries.
Since no attempt is made to verify the uniqueness of the elements in the vector to be deranged, a vector with repeated elements may produce a derangement where one or more of these elements appear to be in the original place. (It's actually a different element with the same value.)
The code:
#include <stdbool.h>
#include <stdlib.h>   /* drand48() is POSIX */

/* Deranges the vector `arr` (of length `n`) in place, to produce
 * a permutation of the original vector where every element has
 * been moved to a new position. Returns `true` unless the derangement
 * failed because `n` was 1.
 */
bool derange(int arr[], size_t n) {
    if (n < 2) return n != 1;
    /* Compute derangement counts ("subfactorials") */
    double subfact[n];
    subfact[0] = 1;
    subfact[1] = 0;
    for (size_t i = 2; i < n; ++i)
        subfact[i] = (i - 1) * (subfact[i - 2] + subfact[i - 1]);
    /* The vector `todo` is the stack of elements which have not yet
     * been (fully) deranged; `u` is the count of elements on the stack.
     */
    size_t todo[n];
    for (size_t i = 0; i < n; ++i) todo[i] = i;
    size_t u = n;
    /* While the stack is not empty, derange the element at the
     * top of the stack with some element lower down in the stack.
     */
    while (u) {
        size_t i = todo[--u];      /* Pop the stack */
        size_t j = u * drand48();  /* Get a random stack index */
        swap(arr, i, todo[j]);     /* i will follow j in its cycle */
        /* If we're generating a 2-cycle, remove the element at j */
        if (drand48() * (subfact[u - 1] + subfact[u]) < subfact[u - 1])
            todo[j] = todo[--u];
    }
    return true;
}
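A minimal driver for the function above might look like this (assuming the swap shown earlier and a POSIX libc for srand48/drand48):

#include <stdio.h>
#include <stdlib.h>   /* srand48, drand48 (POSIX) */
#include <stdbool.h>

int main(void) {
    int a[10];
    for (int i = 0; i < 10; i++) a[i] = i;
    srand48(42);                        /* any seed works */
    if (derange(a, 10)) {
        /* no a[i] should print as its own index i */
        for (int i = 0; i < 10; i++) printf("%d ", a[i]);
        putchar('\n');
    }
    return 0;
}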
Notes
1. Many people get this wrong, particularly in social occasions such as "secret friend" selection (I believe this is sometimes called "the Santa game" in other parts of the world). The incorrect algorithm is to just choose a different swap if the random shuffle produces a fixed point, unless the fixed point is at the very end, in which case the shuffle is restarted. That will produce a random derangement, but the selection is biased, particularly for small vectors. See this answer for an analysis of the bias.
2. Even if you don't use the RAM model where all integers are considered fixed size, the space used is still linear in the size of the input in bits, since N distinct input values must have at least N log N bits. Neither this algorithm nor MP&P makes any attempt to derange lists with repeated elements, which is a much harder problem.
Your algorithm is only almost correct (which in algorithmics means: expect unexpected results). Because of a few small errors scattered through it, it will not produce the expected results.
First, rand() % N is not guaranteed to produce a uniform distribution, unless N is a divisor of the number of possible values. In any other case, you will get a slight bias. Anyway, my man page for rand describes it as a bad random number generator, so you should try to use random() or, if available, arc4random_uniform().
Second, preventing an index from coming back to its original place is both uncommon and rather hard to achieve. The only way I can imagine is to keep an array of the numbers [0; n[ and swap it in step with the real array, so as to know the original index of each number.
The code could become:
void Shuffle(int arr[], size_t n)
{
    int i, newIndx;
    int *indexes = malloc(n * sizeof(int));
    for (i = 0; i < n; i++) indexes[i] = i;

    for (i = 0; i < n - 1; ++i)   // beware of the inequality!
    {
        int i1;
        // search whether index i is still in the [i; n[ part of the array:
        for (i1 = i; i1 < n; ++i1) {
            if (indexes[i1] == i) {   // move it to position i
                if (i1 != i) {        // nothing to do if already at i
                    swap(i, i1, arr);
                    swap(i, i1, indexes);
                }
                break;
            }
        }
        i1 = (i1 == n) ? i : i + 1;   // we will start the search at i1
                                      // to guarantee that no element keeps its place
        newIndx = i1 + arc4random_uniform(n - i1);
        /* if arc4random is not available:
        newIndx = i1 + (random() % (n - i1));
        */
        swap(i, newIndx, arr);
        swap(i, newIndx, indexes);
    }
    /* special case: a permutation of [0; n-1[ may have left the last element
     * in place; we will exchange the last element with a random one
     */
    if (indexes[n - 1] == n - 1) {
        newIndx = arc4random_uniform(n - 1);
        swap(n - 1, newIndx, arr);
        swap(n - 1, newIndx, indexes);
    }
    free(indexes);   // don't forget to free what we have malloc'ed...
}
Beware: the algorithm should be correct, but the code has not been tested and may contain typos...

What is the best way to find N consecutive elements of a sorted version of an unordered array?

For instance: I have an unsorted list A of 10 elements. I need the sublist of k consecutive elements, from position i through i+k-1, of the sorted version of A.
Example:
Input: A { 1, 6, 13, 2, 8, 0, 100, 3, -4, 10 }
k = 3
i = 4
Output: sublist B { 2, 3, 6 }
If i and k are specified, you can use a specialized version of quicksort where you stop the recursion on parts of the array that fall outside the i .. i+k range. If the array can be modified, perform this partial sort in place; if the array cannot be modified, you will need to make a copy.
Here is an example:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

// Partial quicksort using Hoare's original partition scheme
void partial_quick_sort(int *a, int lo, int hi, int c, int d) {
    if (lo < d && hi > c && hi - lo > 1) {
        int x, pivot = a[lo];
        int i = lo - 1;
        int j = hi;
        for (;;) {
            while (a[++i] < pivot)
                continue;
            while (a[--j] > pivot)
                continue;
            if (i >= j)
                break;
            x = a[i];
            a[i] = a[j];
            a[j] = x;
        }
        partial_quick_sort(a, lo, j + 1, c, d);
        partial_quick_sort(a, j + 1, hi, c, d);
    }
}
void print_array(const char *msg, int a[], int count) {
    printf("%s: ", msg);
    for (int i = 0; i < count; i++) {
        printf("%d%c", a[i], " \n"[i == count - 1]);
    }
}

int int_cmp(const void *p1, const void *p2) {
    int i1 = *(const int *)p1;
    int i2 = *(const int *)p2;
    return (i1 > i2) - (i1 < i2);
}

#define MAX 1000000

int main(void) {
    int *a = malloc(MAX * sizeof(*a));
    clock_t t;
    int i, k;

    srand((unsigned int)time(NULL));
    for (i = 0; i < MAX; i++) {
        a[i] = rand();
    }
    i = 20;
    k = 10;
    printf("extracting %d elements at %d from %d total elements\n",
           k, i, MAX);
    t = clock();
    partial_quick_sort(a, 0, MAX, i, i + k);
    t = clock() - t;
    print_array("partial qsort", a + i, k);
    printf("elapsed time: %.3fms\n", t * 1000.0 / CLOCKS_PER_SEC);
    t = clock();
    qsort(a, MAX, sizeof *a, int_cmp);
    t = clock() - t;
    print_array("complete qsort", a + i, k);
    printf("elapsed time: %.3fms\n", t * 1000.0 / CLOCKS_PER_SEC);
    return 0;
}
Running this program with an array of 1 million random integers, extracting the 10 entries of the sorted array starting at offset 20 gives this output:
extracting 10 elements at 20 from 1000000 total elements
partial qsort: 33269 38347 39390 45413 49479 50180 54389 55880 55927 62158
elapsed time: 3.408ms
complete qsort: 33269 38347 39390 45413 49479 50180 54389 55880 55927 62158
elapsed time: 149.101ms
It is indeed much faster (20x to 50x) than sorting the whole array, even with a simplistic choice of pivot. Try multiple runs and see how the timings change.
An idea could be to scan your array and add to another list/container the numbers that are greater than or equal to the i-th smallest value and less than or equal to the (i+k)-th smallest.
This takes O(n) and gives an unordered list of the numbers you need. Then you sort that list, O(n log n), and you are done.
For really big arrays the advantage of this method is that you will sort a smaller list of numbers (given that k is relatively small).
You can use quickselect or a heap-selection algorithm to get the smallest i+k items. Quickselect works in place, but it modifies the original array, and it won't work if the list of items is larger than will fit in memory. Quickselect is O(n), but with a fairly high constant. When the number of items you are selecting is a very small fraction of the total number of items, the heap-selection algorithm is faster.
The idea behind the heap-selection algorithm is that you initialize a max-heap with the first i+k items. Then you iterate through the rest of the items: if an item is smaller than the largest item on the max-heap, remove the largest item from the max-heap and replace it with the new, smaller item. When you're done, you have the smallest i+k items on the heap, with the largest of them at the top.
The code is pretty simple:
heap = new max_heap();
add first (i+k) items from a[] to heap
for all remaining items in a[]
    if item < heap.peek()
        heap.pop()
        heap.push(item)
    end-if
end-for
// at this point the smallest i+k items are on the heap
This requires O(i+k) extra memory, and worst case running time is O(n log(i+k)). When (i+k) is less than about 2% of n, it will usually outperform Quickselect.
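In C, that selection loop might look like this hedged sketch (illustrative names; it replaces the heap's root and sifts down, which is equivalent to the pop-then-push in the pseudocode):

/* Keep the m = i+k smallest values of a[0..n-1] in heap[0..m-1];
 * heap[0] ends up holding the largest of those m values. */
void heap_select(const int *a, int n, int m, int *heap) {
    for (int x = 0; x < n; x++) {
        if (x < m) {                      /* fill phase: sift up */
            int c = x;
            heap[c] = a[x];
            while (c > 0 && heap[c] > heap[(c - 1) / 2]) {
                int t = heap[c]; heap[c] = heap[(c - 1) / 2]; heap[(c - 1) / 2] = t;
                c = (c - 1) / 2;
            }
        } else if (a[x] < heap[0]) {      /* smaller than the current max */
            heap[0] = a[x];               /* replace the root, sift down */
            int c = 0;
            for (;;) {
                int l = 2 * c + 1, r = l + 1, big = c;
                if (l < m && heap[l] > heap[big]) big = l;
                if (r < m && heap[r] > heap[big]) big = r;
                if (big == c) break;
                int t = heap[c]; heap[c] = heap[big]; heap[big] = t;
                c = big;
            }
        }
    }
}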
For much more information about this, see my blog post When theory meets practice.
By the way, you can optimize your memory usage somewhat based on i. That is, if there are a billion items in the array and you want items 999,999,000 through 999,999,910, the standard method above would require a huge heap. But you can re-cast that problem to one in which you need to select the smallest of the last 1,000 items. Your heap then becomes a min-heap of 1,000 items. It just takes a little math to determine which way will require the smallest heap.
That doesn't help much, of course, if you want items 600,000,000 through 600,000,010, because your heap still has 400 million items in it.
It occurs to me, though, that if time isn't a huge issue, you can just build the heap in the array in-place using Floyd's algorithm, pop the first i items like you would with heap sort, and the next k items are what you're looking for. This would require constant extra space and O(n + (i+k)*log(n)) time.
Come to think of it, you could implement the heap selection logic with a heap of (i+k) items (as described above) in-place, as well. It would be a little tricky to implement, but it wouldn't require any extra space and would have the same running time O(n*log(i+k)).
Note that both would modify the original array.
One thing you could do is modify heapsort so that you first create the heap, but then pop the first i elements. The next k elements you pop from the heap will be your result. Discarding the remaining n - i - k elements lets the algorithm terminate early.
The result is in O(n + (i + k) log n) (the O(n) is for building the heap), which is within O(n log n) but significantly faster for relatively low values of i and k.

Applying a function on sorted array

Taken from the google interview question here
Suppose that you have a sorted array of integers (positive or negative). You want to apply a function of the form f(x) = a * x^2 + b * x + c to each element x of the array such that the resulting array is still sorted. Implement this in Java or C++. The input are the initial sorted array and the function parameters (a, b and c).
Do you think we can do it in place in less than O(n log n) time, where n is the array size? (The baseline is to apply the function to each element of the array and then sort the array.)
I think this can be done in linear time. Because the function is quadratic, its graph is a parabola, i.e. the values decrease (assuming a positive value for 'a') down to some minimum point and then increase after it. So the algorithm should iterate over the sorted values until it reaches/passes the minimum point of the function (which can be determined by a simple differentiation), and then, for each value after the minimum, walk backward through the earlier values looking for the correct place to insert that value. Using a linked list would allow items to be moved around in place.
The quadratic transform can cause part of the values to "fold" over the others. You will have to reverse their order, which can easily be done in-place, but then you will need to merge the two sequences.
In-place merge in linear time is possible, but this is a difficult process, normally out of the scope of an interview question (unless for a Teacher's position in Algorithmics).
Have a look at this solution: http://www.akira.ruc.dk/~keld/teaching/algoritmedesign_f04/Artikler/04/Huang88.pdf
I guess that the main idea is to reserve a part of the array where you allow swaps that scramble the data it contains. You use it to perform partial merges on the rest of the array and in the end you sort back the data. (The merging buffer must be small enough that it doesn't take more than O(N) to sort it.)
If a > 0, a minimum occurs at x = -b/(2a), and values are copied to the output array in forward order from [0] to [n-1]. If a < 0, a maximum occurs at x = -b/(2a), and values are copied to the output array in reverse order from [n-1] to [0]. (If a == 0: if b > 0, do a forward copy; if b < 0, do a reverse copy; if a == b == 0, nothing needs to be done.) I think the sorted array can be binary searched for the value closest to -b/(2a) in O(log2(n)) (otherwise it's O(n)). Then that value is copied to the output array, and the values before it (decrementing index or pointer) and after it (incrementing index or pointer) are merged into the output array, taking O(n) time.
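As a hedged C sketch of the a > 0 case just described (f and apply_and_keep_sorted are my names; it writes to a separate output buffer and uses long long to sidestep int overflow):

/* Assumes a > 0 ("valley" shape after the transform); a <= 0 is analogous
 * with the runs reversed. */
static long long f(long long a, long long b, long long c, long long x) {
    return a * x * x + b * x + c;
}

void apply_and_keep_sorted(const int *in, int n, long long a, long long b,
                           long long c, long long *out) {
    /* binary search for the index of the smallest transformed value */
    int lo = 0, hi = n - 1;
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        if (f(a, b, c, in[mid]) <= f(a, b, c, in[mid + 1]))
            hi = mid;
        else
            lo = mid + 1;
    }
    /* two ascending runs: lo..0 (walking left) and lo+1..n-1 (walking right) */
    int i = lo, j = lo + 1, k = 0;
    while (i >= 0 && j < n)
        out[k++] = (f(a, b, c, in[i]) <= f(a, b, c, in[j]))
                 ? f(a, b, c, in[i--]) : f(a, b, c, in[j++]);
    while (i >= 0) out[k++] = f(a, b, c, in[i--]);
    while (j < n)  out[k++] = f(a, b, c, in[j++]);
}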
static void sortArray(int arr[], int n, int A, int B, int C)
{
    // Apply the equation to all elements
    for (int i = 0; i < n; i++)
        arr[i] = A*arr[i]*arr[i] + B*arr[i] + C;

    // Find the maximum element in the resulting array
    int index = -1;
    int maximum = Integer.MIN_VALUE;
    for (int i = 0; i < n; i++)
    {
        if (maximum < arr[i])
        {
            index = i;
            maximum = arr[i];
        }
    }

    // Use the maximum element as a break point and merge
    // both subarrays using the simple merge step of merge sort
    int i = 0, j = n-1;
    int[] new_arr = new int[n];
    int k = 0;
    while (i < index && j > index)
    {
        if (arr[i] < arr[j])
            new_arr[k++] = arr[i++];
        else
            new_arr[k++] = arr[j--];
    }

    // Merge the remaining elements
    while (i < index)
        new_arr[k++] = arr[i++];
    while (j > index)
        new_arr[k++] = arr[j--];
    new_arr[n-1] = maximum;

    // Copy back into the original array
    for (int p = 0; p < n; p++)
        arr[p] = new_arr[p];
}

Suggest an Efficient Algorithm

Given an array arr of size 100000, with each element 0 <= arr[i] < 100 (not sorted, may contain duplicates).
Find out how many triplets (i, j, k) are present such that arr[i] ^ arr[j] ^ arr[k] == 0.
Note: ^ is the XOR operator; also 0 <= i <= j <= k < 100000.
I have a feeling I have to calculate the frequencies and do some calculation using them, but I just can't seem to get started.
Any algorithm better than the obvious O(n^3) is welcome. :)
It's not homework. :)
I think the key is that you don't need to identify the i, j, k, just count how many there are.
Initialise an array of size 100.
Loop through arr, counting how many of each value there are - O(n).
Loop through the non-zero elements of the small array, working out which triples of values meet the condition - assuming the counts of the three numbers involved are A, B and C, the number of combinations in the original arr is (A+B+C)!/(A!*B!*C!) - that is at most 100^3 iterations, but still O(1) assuming 100 is a fixed value.
So, O(n).
A possible O(n^2) solution, if it works: maintain a variable count and two arrays, single[100] and pair[100]. Iterate over arr, and for each element of value v:
update count: count += pair[v]
update pair: iterate over the array single and, for each index x with single[x] != 0, do pair[x ^ v] += single[x]
update single: single[v]++
In the end, count holds the result.
A possible O(100 * n) = O(n) solution, which handles the ordering requirement i <= j <= k.
As you know, A ^ B == 0 <=> A == B, so:
long long calcTripletsCount(const vector<int>& sourceArray)
{
    long long res = 0;
    vector<int> count(128);
    vector<int> countPairs(128);
    for (int i = 0; i < (int)sourceArray.size(); i++)
    {
        // count[t] contains the count of value t in sourceArray[0..i]
        count[sourceArray[i]]++;
        // countPairs[t] contains the count of pairs (p1, p2), with p1 <= p2
        // for keeping order, where sourceArray[p1] ^ sourceArray[p2] == t
        for (int j = 0; j < (int)count.size(); j++)
            countPairs[j ^ sourceArray[i]] += count[j];
        // a ^ b ^ c == 0 iff a ^ b == c, so we add the count of pairs (p1, p2)
        // with sourceArray[p1] ^ sourceArray[p2] == sourceArray[i];
        // it is easy to see that the order p1 <= p2 <= i is kept
        res += countPairs[sourceArray[i]];
    }
    return res;
}
Sorry for my bad English...
I have a (simple) O(n^2 log n) solution which takes into account the fact that i, j and k refer to indices, not values.
A simple first pass allows us to build an array A of 100 entries: value -> list of indices; we keep each list sorted for later use. O(n log n)
For each pair i, j such that i <= j, we compute X = arr[i] ^ arr[j]. We then perform a binary search in A[X] to locate the number of indices k such that k >= j. O(n^2 log n)
I could not find any way to leverage sorting/counting algorithms, because they annihilate the index requirement.
Sort the array, keeping a map of new indices to originals. O(n lg n)
Loop over i, j with i < j. O(n^2)
Calculate x = arr[i] ^ arr[j]
Since x ^ arr[k] == 0 requires arr[k] == x, binary search for x among the indices k > j. O(lg n)
For all found k, print the mapped i, j, k
O(n^2 lg n) overall.
Start with a frequency count of the number of occurrences of each number between 0 and 99, as Paul suggests. This produces an array freq[] of length 100.
Next, instead of looping over triples A, B, C from that array and testing the condition A^B^C == 0,
loop over pairs A, B with A < B. For each pair, calculate C = A^B (so that A^B^C == 0 by construction) and verify that A < B < C < 100. (Any triple of distinct values will occur in exactly one such order, so this doesn't miss triples. But see below.) The running total will look like:
Sum += freq[A] * freq[B] * freq[C]
The work is O(n) for the frequency count, plus about 5000 iterations for the loop over A < B.
Since every triple of three different numbers A, B, C must occur in some order, this finds each such triple exactly once. Next you'll have to look for triples in which two of the numbers are equal. But if two of the numbers are equal and the xor of all three is 0, the third number must be zero. So this amounts to a secondary linear search over the frequency count array, counting occurrences of (A = 0, B = C < 100). (Be very careful with this case, and especially careful with the case B = 0. The count is not just freq[B]**2 or freq[0]**3; there is a little combinatorics problem hiding there.)
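A hedged C sketch of the whole scheme, including the equal-value cases (the function name is mine; it assumes distinct indices i < j < k and values in [0, 100)):

long long count_xor_triples(const int *arr, int n) {
    long long freq[128] = {0};   /* 128 covers A^B for A, B < 100 */
    for (int i = 0; i < n; i++)
        freq[arr[i]]++;
    long long sum = 0;
    /* distinct values A < B < C with A ^ B ^ C == 0 */
    for (int A = 0; A < 128; A++)
        for (int B = A + 1; B < 128; B++) {
            int C = A ^ B;
            if (C > B && C < 128)
                sum += freq[A] * freq[B] * freq[C];
        }
    /* two equal values force the third to be 0: triples (x, x, 0), x != 0 */
    for (int B = 1; B < 128; B++)
        sum += freq[B] * (freq[B] - 1) / 2 * freq[0];
    /* all three values zero: choose any 3 of the zeros */
    sum += freq[0] * (freq[0] - 1) * (freq[0] - 2) / 6;
    return sum;
}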
Hope this helps!
