Implementation of C lower_bound

Based on the following definition found here
Returns an iterator pointing to the
first element in the sorted range
[first,last) which does not compare
less than value. The comparison is
done using either operator< for the
first version, or comp for the second.
What would be the equivalent C implementation of lower_bound()? I understand that it would be a modification of binary search, but I can't quite pin down the exact implementation.
int lower_bound(int a[], int lowIndex, int upperIndex, int e);
Sample Case:
int a[]= {2,2, 2, 7 };
lower_bound(a, 0, 1,2) would return 0 --> upperIndex is one beyond the last inclusive index as is the case with C++ signature.
lower_bound(a, 0, 2,1) would return 0.
lower_bound(a, 0, 3,6) would return 3;
lower_bound(a, 0, 4,6) would return 3;
My attempted code is given below:
int low_bound(int low, int high, int e)
{
if ( low < 0) return 0;
if (low>=high )
{
if ( e <= a[low] ) return low;
return low+1;
}
int mid=(low+high)/2;
if ( e> a[mid])
return low_bound(mid+1,high,e);
return low_bound(low,mid,e);
}

Here are the equivalent implementations of upper_bound and lower_bound. This algorithm is O(log(n)) in the worst case, unlike the accepted answer which gets to O(n) in the worst case.
Note that here high index is set to n instead of n - 1. These functions can return an index which is one beyond the bounds of the array. I.e., it will return the size of the array if the search key is not found and it is greater than all the array elements.
int bs_upper_bound(int a[], int n, int x) {
int l = 0;
int h = n; // Not n - 1
while (l < h) {
int mid = l + (h - l) / 2;
if (x >= a[mid]) {
l = mid + 1;
} else {
h = mid;
}
}
return l;
}
int bs_lower_bound(int a[], int n, int x) {
int l = 0;
int h = n; // Not n - 1
while (l < h) {
int mid = l + (h - l) / 2;
if (x <= a[mid]) {
h = mid;
} else {
l = mid + 1;
}
}
return l;
}
The actual C++ implementation works for all containers. You can find it here.
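As a usage sketch, the two functions above can be combined to count how many times a key occurs in a sorted array, since the difference between the upper and lower bound is the length of the run of equal elements. The functions are repeated here so the block compiles on its own; count_equal is a hypothetical helper name, not part of the answer.

```c
#include <stddef.h>

/* First index whose element is >= x, or n if there is none. */
int bs_lower_bound(const int a[], int n, int x) {
    int l = 0, h = n;                 /* half-open range [l, h) */
    while (l < h) {
        int mid = l + (h - l) / 2;
        if (x <= a[mid]) h = mid; else l = mid + 1;
    }
    return l;
}

/* First index whose element is > x, or n if there is none. */
int bs_upper_bound(const int a[], int n, int x) {
    int l = 0, h = n;                 /* half-open range [l, h) */
    while (l < h) {
        int mid = l + (h - l) / 2;
        if (x >= a[mid]) l = mid + 1; else h = mid;
    }
    return l;
}

/* Hypothetical helper: number of elements equal to x in sorted a[0..n). */
int count_equal(const int a[], int n, int x) {
    return bs_upper_bound(a, n, x) - bs_lower_bound(a, n, x);
}
```

For the array {2, 2, 2, 7} from the question, count_equal(a, 4, 2) is 3, and bs_lower_bound(a, 4, 8) is 4, one past the last valid index.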

lower_bound is almost like doing a usual binary search, except:
If the element isn't found, you return your current place in the search, rather than returning some null value.
If the element is found, you search leftward until you find a non-matching element. Then you return a pointer/iterator to the first matching element.
Yes, it's really that simple. :-)
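A minimal sketch of the two rules above, assuming a sorted array of int and a half-open range [0, n). The name lower_bound_scan is hypothetical.

```c
#include <stddef.h>

/* Ordinary binary search, then walk left over equal elements.
   Returns the first index whose element is >= key, or n if every
   element is smaller. */
size_t lower_bound_scan(const int *a, size_t n, int key) {
    size_t lo = 0, hi = n;                  /* half-open range [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (a[mid] < key) {
            lo = mid + 1;
        } else if (a[mid] > key) {
            hi = mid;
        } else {
            /* Match: step left while the previous element also matches. */
            while (mid > 0 && a[mid - 1] == key) {
                mid--;
            }
            return mid;
        }
    }
    return lo;                              /* not found: current position */
}
```

Note that the leftward walk is linear in the length of the run of equal elements, so an array made entirely of the key costs O(n) rather than O(log n).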

I know that this is a very old post, but I came across it while working on a problem. I would like to add my iterative version, which extends the last answer. I checked it against the test cases I could think of. The code is in C#.
This code worked for all the ranges I tried. The range must be given as first index to last index + 1: for an array of size N, passing the range as [0, N] means the search space is [0, N). That is fairly obvious, but it helped me check some edge cases.
static int lower_bound(int[] a, int lo,int hi, int x)
{
while (lo < hi)
{
int mid = lo + (hi-lo) / 2;
if(a[mid]==x)
{
/*when there is a match, we should keep on searching
for the next same element. If the same element is not
found, mid is considered as the answer and added to 'hi'
Finally 'hi' is returned*/
if(a[mid-1]!=x)
{
hi=mid;
break;
}
else
hi=mid-1;
}
else if(a[mid]>x)
hi=mid-1;
else
lo=mid+1;
}
//if element is not found, -1 will be returned
if(a[hi]!=x)
return -1;
return hi;
}
static int upper_bound(int[] a, int lo,int hi, int x)
{
int temp=hi;
while (lo < hi)
{
int mid = lo + (hi-lo) / 2;
if(a[mid]==x)
{
/*this section make sure that program runs within
range [start,end)*/
if(mid+1==hi)
{
lo=mid;
break;
}
/*when there is a match, we should keep on searching
for the next same element. If the same element is not
found, mid is considered as the answer and added to
'lo'. Finally 'lo' is returned*/
if(a[mid+1]!=x)
{
lo=mid;
break;
}
else
lo=mid+1;
}
else if(a[mid]>x)
hi=mid-1;
else
lo=mid+1;
}
//if element is not found, -1 will be returned
if(a[lo]!=x)
return -1;
return lo;
}
Here is a test case that I used:
Array(a) : 1 2 2 2 2 5 5 5 5
size of the array(a) : 9
Considering search element as 2:
upper_bound(a,0,9,2)=4, lower_bound(a,0,9,2)=1
Considering search element as 5:
upper_bound(a,0,9,5)=8, lower_bound(a,0,9,5)=5
Considering search element as 1:
upper_bound(a,0,9,1)=0, lower_bound(a,0,9,1)=0
Considering search element as 5:
upper_bound(a,5,9,5)=8, lower_bound(a,5,9,5)=5

The lower_bound and upper_bound functions in python would be implemented as follows:
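def binLowerBound(a, lo, hi, x):
    # Searches the closed range [lo, hi]; returns the index just before the
    # first element >= x (lo - 1 if there is no smaller element).
    if lo > hi:
        return hi
    mid = (lo + hi) // 2  # integer division; "/" yields a float in Python 3
    if a[mid] == x:
        return binLowerBound(a, lo, mid - 1, x)
    elif a[mid] > x:
        return binLowerBound(a, lo, mid - 1, x)
    else:
        return binLowerBound(a, mid + 1, hi, x)

def binHigherBound(a, lo, hi, x):
    # Searches the closed range [lo, hi]; returns the index of the first
    # element > x (hi + 1 if there is none).
    if lo > hi:
        return lo
    mid = (lo + hi) // 2
    if a[mid] == x:
        return binHigherBound(a, mid + 1, hi, x)
    elif a[mid] > x:
        return binHigherBound(a, lo, mid - 1, x)
    else:
        return binHigherBound(a, mid + 1, hi, x)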

C++ Implementation
int binary_search_lower_bound(vector<int>& array, int target) {
int lo = 0, hi = (int)array.size();
int mid;
while(lo < hi) {
mid = lo + ((hi - lo) >> 1);
int val = array[mid];
if (target <= val)//array[mid])
hi = mid;
else
lo = mid + 1;
}
return lo;
}
Edit: Fixed bug for non-existing value.

int lowerBound (int *a, int size, int val) {
int lo = 0, hi = size; // half-open range [lo, hi); hi = size lets the result be size when val exceeds every element
while (lo < hi) {
int mid = lo + (hi - lo)/2;
if (a[mid] < val)
lo = mid + 1;
else
hi = mid;
}
return lo;
}

For example, if this is the given array:
1 2 3 3 4
then for different values of x:
3: firstOccurance will be 2 and lastOccurance will be 3
2: firstOccurance will be 1 and lastOccurance will be 1
10: firstOccurance will be -1 and lastOccurance will be -1
int firstOccurance(vector<int>& arr, int x){
int low = 0;
int high = (int)arr.size() - 1; // last valid index; the loop uses a closed range [low, high]
int ans=-1;
while(low<=high){
int mid = (low+high)/2;
if(arr[mid]==x) ans=mid;
if(arr[mid]>=x) high=mid-1;
else low = mid+1;
}
return ans;
}
int lastOccurance(vector<int>& arr, int x){
int low = 0;
int high = (int)arr.size() - 1; // last valid index; the loop uses a closed range [low, high]
int ans=-1;
while(low<=high){
int mid = (low+high)/2;
if(arr[mid]==x) ans=mid;
if(arr[mid]<=x) low=mid+1;
else high = mid-1;
}
return ans;
}

I know this is a very old post with a lot of answers already but I came across this problem as well and needed a generic solution so I used manish_s answer to adapt the gnu stdlib bsearch function. In case anyone needs it:
size_t myBsearch (const void *__key, const void *__base, size_t __nmemb, size_t __size,
__compar_fn_t __compar)
{
size_t __l, __u, __idx;
const void *__p;
int __comparison;
__l = 0;
__u = __nmemb;
while (__l < __u)
{
__idx = (__l + __u) / 2;
__p = (void *)(((const char *)__base) + (__idx * __size));
__comparison = (*__compar)(__key, __p);
if (__comparison <= 0)
__u = __idx;
else if (__comparison > 0)
__l = __idx + 1;
}
return __l;
}
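For readers who want the same idea without glibc internals, here is a hedged sketch that uses ordinary identifiers and a standard comparator type (identifiers beginning with a double underscore are reserved for the implementation, and __compar_fn_t is glibc-specific). The names lower_bound_generic and cmp_int are hypothetical.

```c
#include <stddef.h>

/* Returns the index of the first element in base[0..nmemb) that does not
   compare less than *key, or nmemb if every element compares less.
   compar(key, elem) follows the bsearch convention. */
size_t lower_bound_generic(const void *key, const void *base, size_t nmemb,
                           size_t size,
                           int (*compar)(const void *, const void *)) {
    size_t lo = 0, hi = nmemb;              /* half-open range [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        const void *p = (const char *)base + mid * size;
        if (compar(key, p) <= 0) {
            hi = mid;                       /* key <= element: answer is at or left of mid */
        } else {
            lo = mid + 1;                   /* key > element: answer is right of mid */
        }
    }
    return lo;
}

/* Example comparator for int elements; avoids the overflow of a - b. */
int cmp_int(const void *x, const void *y) {
    int a = *(const int *)x;
    int b = *(const int *)y;
    return (a > b) - (a < b);
}
```

A call looks like lower_bound_generic(&key, a, n, sizeof a[0], cmp_int), mirroring the argument order of bsearch.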

Related

Min-Max Heap, how to get level of node from given index in O(1)?

The function int IsOnMinLevel(Heap H, int i) should return whether the node at index i is on a min level (even level), in constant time.
Functions provided:
typedef struct heap
{
int *array;
int count;
int capacity;
} *Heap;
Heap CreateHeap(int capacity)
{
Heap h=(Heap) malloc(sizeof(struct heap));
h->count=0;
h->capacity=capacity;
h->array=(int *)malloc(sizeof(int)*h->capacity);
if(! h->array) return NULL;
return h;
}
int Parent(Heap h, int i)
{
if(i<=0 || i>=h->count)
return -1;
return ((i-1)/2);
}
int LeftChild(Heap h, int i)
{
int left = 2*i+1;
if(left>=h->count) return -1;
return left;
}
int RightChild(Heap h, int i)
{
int right = 2*i+2;
if(right>=h->count)
return -1;
return right;
}
void ResizeHeap(Heap *h)
{
int i;
int *array_old = (*h)->array;
(*h)->array=(int *)malloc(sizeof(int)*(*h)->capacity*2);
for(i=0; i<(*h)->capacity; i++)
(*h)->array[i]=array_old[i];
(*h)->capacity *=2;
free(array_old);
}
How do I get level from index? And is there a relation between level and index in a complete binary tree?

Here is the level of the first few nodes:
index: 0 1 2 3 4 5 6 7 8 ...
level: 0 1 1 2 2 2 2 3 3 ...
The pattern is pretty simple: a new level starts at index 2^k-1. So if a node has an index between 2^k-1 included and 2^(k+1)-1 excluded, then it is at level k.
You can derive a formula from this: for a node with index i, find k such that 2^k-1 <= i < 2^(k+1)-1.
Add 1: 2^k <= i+1 < 2^(k+1).
Take the log2: k <= log2(i+1) < k+1.
Since k and k+1 are consecutive integers, this is equivalent to floor(log2(i+1)) = k.
It turns out the function floor(log2(i)) when i is an integer can be very efficiently written, for example like this (taken from here):
int uint64_log2(uint64_t n)
{
#define S(k) if (n >= (UINT64_C(1) << k)) { i += k; n >>= k; }
int i = -(n == 0); S(32); S(16); S(8); S(4); S(2); S(1); return i;
#undef S
}
With this, your function becomes:
int IsOnMinLevel(Heap H, int i)
{
return uint64_log2(i+1) % 2 == 0;
}
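As a cross-check of the formula, here is a plain integer sketch that computes floor(log2(i + 1)) by repeated halving. The names heap_level and is_min_level_index are hypothetical; the loop runs at most once per bit of unsigned int, so it is bounded by a constant but is not as tight as the branch-based version above.

```c
/* Level of the node at 0-based index i, with the root at level 0.
   Counts how many times i + 1 can be halved before reaching 1. */
int heap_level(unsigned int i) {
    unsigned int n = i + 1;
    int level = 0;
    while (n >>= 1) {
        level++;
    }
    return level;
}

/* Non-zero when index i lies on an even (min) level. */
int is_min_level_index(unsigned int i) {
    return heap_level(i) % 2 == 0;
}
```

For indices 0 through 8 this reproduces the table above: levels 0 1 1 2 2 2 2 3 3.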

The answer given by @user3386109:
Assuming the heap uses 0-based indexing, and that the root is on level 1, then given index i, the level is floor(log2(i+1)) + 1
The function is simply:
int IsOnMinLevel(Heap H, int i)
{
return ((int)(floor(log2(i+1))) % 2) == 0; //cast to int since log2 gives double
}

Binary Search is giving me a segfault

I'm trying to run this implementation of binary search. I don't know why but it keeps giving me segfault error. I'm thinking the problem might be either the way I'm passing the array or there's something wrong with the recursive calls.
#include <stdio.h>
int hasBinarySearch(int *array, int low, int high, int element)
{
int mid = (low + (high-low)) / 2;
if (high>=low)
{
if (array[mid] == element)
{
return mid;
}
else if(array[mid]<element)
{
return hasBinarySearch(array, low, mid-1, element);
}
else
{
return hasBinarySearch(array, mid+1, high, element);
}
}
return 0;
}
int main(void)
{
int array[10] = {1,2,3,4,5,6,6,6,7,8};
hasBinarySearch(array, 0, 9, 2);
return 0;
}

I think that you have some misunderstanding about binary search. Read some article or book about it.
As @Claies commented, the calculation of the middle index is wrong.
It should be low + (high - low) / 2. Just think about the internal division of two points in mathematics.
Also, you have to fix the parameters on recursive calls like the code below.
#include <stdio.h>
int hasBinarySearch(int *array, int low, int high, int element)
{
int mid = low + (high - low) / 2; // changed
printf("%d %d\n", high, low);
if (high >= low)
{
if (array[mid] == element)
{
return mid;
}
else if (array[mid] < element)
{
return hasBinarySearch(array, mid + 1, high, element); // changed
}
else
{
return hasBinarySearch(array, low, mid - 1, element); // changed
}
}
return 0;
}
int main(void)
{
int array[10] = { 1,2,3,4,5,6,6,6,7,8 };
hasBinarySearch(array, 0, 9, 2);
return 0;
}

int mid = (low + (high-low)) / 2; // wrong formula
@paganinist's good answer points out the flaws in OP's search method and provides a fix.
Yet to dig deeper:
Even though some compilers might be able to "un-recurse" code (Example), recursion is not needed here. A simple loop will suffice.
Array sizes can approach near maximum or exceed the range of int in extreme cases.
For sizes in the high int range, the following is better. @Jonathan Leffler
// int mid = (low + high)/2; // could overflow
int mid = low + (high-low)/2; // better, will not overflow when low >= 0
To accommodate all array sizes, use size_t instead of int. This also handles sizes near and above INT_MAX.
Candidate solution that returns the address of the matching element or NULL if not found.
#include <stdlib.h>
#include <stdbool.h> // bool, used by the test code
#include <stdio.h>
int *BinarySearch_int(const int *array, size_t count, int key) {
while (count > 0) {
size_t mid = count / 2;
if (key > array[mid]) {
array += mid + 1;
count -= mid + 1;
} else if (key < array[mid]) {
count = mid;
} else {
return (int *) &array[mid];
}
}
return NULL;
}
Test code
bool BinarySearch_int_test(const int *array, size_t count, int key, bool target){
int *p = BinarySearch_int(array, count, key);
bool success = (p != NULL) == target && (p == NULL || *p == key);
printf("f(Array:%p count:%zu, key:%2d) --> ptr:%p value:%2d success:%d\n",
(void*) array, count, key, (void*) p, p ? *p : 0, success);
return success;
}
int main(void) {
int array[] = {10, 20, 30, 40, 50, 60};
size_t n = sizeof array / sizeof array[0];
for (size_t i = 0; i < n; i++) {
BinarySearch_int_test(array, n, array[i], 1);
}
BinarySearch_int_test(array, n, 0, 0);
for (size_t i = 0; i < n; i++) {
BinarySearch_int_test(array, n, array[i] + 1, 0);
}
}
Output
f(Array:0xffffcb90 count:6, key:10) --> ptr:0xffffcb90 value:10 success:1
...
f(Array:0xffffcb90 count:6, key:60) --> ptr:0xffffcba4 value:60 success:1
f(Array:0xffffcb90 count:6, key: 0) --> ptr:0x0 value: 0 success:1
...
f(Array:0xffffcb90 count:6, key:61) --> ptr:0x0 value: 0 success:1

mid's calculation simplifies to high / 2 because you've added and then subtracted the lower bound out again. It looks like you meant to add half the difference to the lower bound, but the division occurs too late. It should be low + (high-low) / 2. (This is a bit more complicated than (low + high) / 2 but avoids the integer-math problem mentioned elsewhere.)
I think that segfault is happening when high goes below low and gets too small and you fall off the beginning of the array.
And @paganinist is right about the upper and lower cases being backwards.
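To make the difference concrete, here is a small sketch that isolates the two midpoint expressions. The function names mid_broken and mid_fixed are hypothetical.

```c
/* The expression from the question: the addition and subtraction cancel,
   so the result is high / 2 regardless of low. */
int mid_broken(int low, int high) {
    return (low + (high - low)) / 2;
}

/* The intended midpoint: half the distance, added to the lower bound. */
int mid_fixed(int low, int high) {
    return low + (high - low) / 2;
}
```

With low = 6 and high = 9, mid_broken gives 4, which lies outside the range being searched, while mid_fixed gives 7. The two agree only when low is 0.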

Why doesn't my binary search implementation find the last element?

I have implemented a beginner recursive version of binary search in C. However, it doesn't seem to work when the element to be found is in the last position of the array. Is there any way to fix this without changing the prototype of the function?
#include <stdio.h>
int search(int value, int values[], int n);
int main() {
int a[] = { 26, 27, 28 };
if (search(28, a, 3) == 0)
printf("Found.\n");
else
printf("Not found.\n");
}
int search(int value, int values[], int n)
{
if (n <= 0)
return 1;
if (value < values[n/2])
// Search the left half
return search(value, values, n/2);
else if (value > values[n/2])
// Search the right half, excluding the middle term
return search(value, values + n/2 + 1, n/2 - 1);
else
return 0;
return 1;
}

Your search function is incorrect:
The slice size you pass when you recurse on the right part is computed incorrectly: it should be n - n/2 - 1 instead of n/2 - 1.
Here is a corrected version:
#include <stdio.h>
int search(int value, int values[], int n);
int main(void) {
int a[] = { 26, 27, 28 };
if (search(28, a, 3) == 0)
printf("Found.\n");
else
printf("Not found.\n");
return 0;
}
int search(int value, int values[], int n) {
if (n > 0) {
int mid = n / 2;
if (value < values[mid]) {
// Search the left half
return search(value, values, mid);
} else
if (value > values[mid]) {
// Search the right half, excluding the middle term
return search(value, values + mid + 1, n - mid - 1);
} else {
// Found the value
return 0;
}
}
return 1;
}
Here is a simpler iterative version:
int search(int value, int values[], int n) {
while (n > 0) {
int mid = n / 2;
if (value < values[mid]) {
// Search the left half
n = mid;
} else
if (value > values[mid]) {
// Search the right half, excluding the middle term
values += mid + 1;
n -= mid + 1;
} else {
// Found the value
return 0;
}
}
return 1;
}

The problem is the return statement in your else if clause. The length passed for the right half should be n - n/2 - 1, not n/2 - 1; otherwise the last element is clipped off. The effect becomes more pronounced as the array grows and when the element you are searching for lies toward the right end.
return search(value, values + n/2 + 1, n - n/2 - 1);
Note:
As chqrlie pointed out

Finding # of integers in range from low to high in BST (C)

Given a high and a low, the point of this function is to find the amount of numbers equal or in between. I use 5 and 10 as my low and high and should receive a return result of 4. Since I'm using recursion I use a static variable to keep track. However I keep getting 1 as a result. Why?
In my test function:
int main()
{
int i;
int a[] = {8, 2, 7, 9, 11, 3, 2, 6};
BST_PTR t = bst_create();
for(i=0; i<8; i++)
bst_insert(t, a[i]);
printf("%d\n", bst_num_in_range(t,5,10)); // <-- calls the function here
//rest of tests here
int bst_num_in_range(BST_PTR t, int low, int hi)
{
static int x = 0;
if ( NULL == t->root)
return 0;
return num_in_range_helper(t->root, low, hi, x);
}
int num_in_range_helper(NODE *r , int low, int hi, static int x)
{
if (r == NULL)
return 0;
if (low < r->val)
num_in_range_helper(r->left, low, hi, x);
if ( low <= r->val && hi >= r->val )
x++;
if (hi > r->val)
num_in_range_helper(r->right, low, hi, x);
return x;
}

That static keyword does not make x the same variable in those two functions. You need to assign return values to x after every call, i.e. try something like this
int bst_num_in_range(BST_PTR t, int low, int hi)
{
if ( NULL == t->root)
return 0;
return num_in_range_helper(t->root, low, hi);
}
int num_in_range_helper(NODE *r , int low, int hi)
{
if (r == NULL)
return 0;
int x = 0;
if (low < r->val)
x += num_in_range_helper(r->left, low, hi);
if ( low <= r->val && hi >= r->val )
x++;
if (hi > r->val)
x += num_in_range_helper(r->right, low, hi);
return x;
}
EDIT: you don't need to send the current x to the helper either. Just ask the helper for the number of vertices in range in each of the two subtrees, add them, and return the total for the current subtree.
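Since the question does not show the NODE or BST_PTR definitions, here is a self-contained sketch of the same counting idea with a hypothetical minimal node type. The names Node, bst_insert_node and count_in_range are assumptions for illustration, and duplicates are simply ignored on insert.

```c
#include <stdlib.h>

/* Hypothetical minimal node type; the question's own types are not shown. */
typedef struct node {
    int val;
    struct node *left;
    struct node *right;
} Node;

/* Insert val into the tree rooted at r, ignoring duplicates; returns the root. */
Node *bst_insert_node(Node *r, int val) {
    if (r == NULL) {
        Node *n = malloc(sizeof *n);
        if (n == NULL) {
            abort();
        }
        n->val = val;
        n->left = NULL;
        n->right = NULL;
        return n;
    }
    if (val < r->val) {
        r->left = bst_insert_node(r->left, val);
    } else if (val > r->val) {
        r->right = bst_insert_node(r->right, val);
    }
    return r;
}

/* Count values v with low <= v <= hi. Each call returns the count for its
   own subtree, so no static or shared counter is needed. */
int count_in_range(const Node *r, int low, int hi) {
    if (r == NULL) {
        return 0;
    }
    int x = 0;
    if (low < r->val) {
        x += count_in_range(r->left, low, hi);   /* left subtree may hold values >= low */
    }
    if (low <= r->val && r->val <= hi) {
        x++;
    }
    if (hi > r->val) {
        x += count_in_range(r->right, low, hi);  /* right subtree may hold values <= hi */
    }
    return x;
}
```

Inserting 8, 2, 7, 9, 11, 3, 2, 6 and calling count_in_range(root, 5, 10) counts 8, 7, 9 and 6, giving the 4 the question expects.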

Parallel merge in single thread mode very slow

I have two sets of sorted elements and want to merge them in a way that I can parallelize later. I have a simple merge implementation that has data dependencies because it uses the maximum function, and a first version of a parallelizable merge that uses binary search to find the rank and compute the index for a given value.
The getRank function returns the number of elements less than or equal to the given needle.
#define ATYPE int
int getRank(ATYPE needle, ATYPE *haystack, int size) {
int low = 0, mid;
int high = size - 1;
int cmp;
ATYPE midVal;
while (low <= high) {
mid = ((unsigned int) (low + high)) >> 1;
midVal = haystack[mid];
cmp = midVal - needle;
if (cmp < 0) {
low = mid + 1;
} else if (cmp > 0) {
high = mid - 1;
} else {
return mid; // key found
}
}
return low; // key not found
}
The merge algorithms operate on the two sorted sets a and b and store the result in c.
void simpleMerge(ATYPE *a, int n, ATYPE *b, int m, ATYPE *c) {
int i, l = 0, r = 0;
for (i = 0; i < n + m; i++) {
if (l < n && (r == m || max(a[l], b[r]) == b[r])) {
c[i] = a[l];
l++;
} else {
c[i] = b[r];
r++;
}
}
}
void merge(ATYPE *a, int n, ATYPE *b, int m, ATYPE *c) {
int i;
for (i = 0; i < n; i++) {
c[i + getRank(a[i], b, m)] = a[i];
}
for (i = 0; i < m; i++) {
c[i + getRank(b[i], a, n)] = b[i];
}
}
The merge operation is very slow when there are a lot of elements. It can still be parallelized, but simpleMerge is always faster even though it cannot be parallelized.
So my question is: do you know a better approach for parallel merging, and if so, can you point me in a direction? Or is my code just bad?

Complexity of simpleMerge function:
O(n + m)
Complexity of merge function:
O(n*logm + m*logn)
Without having thought about this too much, my suggestion for parallelizing it is to find a single value that's around the middle of each array, using something similar to the getRank function, and to use the simple merge from there. That can be O(n + m + log m + log n) = O(n + m) (even if you do a few, but a constant number of, lookups to find a value around the middle).
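One way to read this suggestion, sketched under assumptions: split a at its midpoint, find the matching split point in b with a binary search, and run two ordinary sequential merges on the resulting halves. The two merges write to disjoint parts of c, so they could be handed to separate threads; the threading itself is not shown. All function names here (rank_less, merge_seq, merge_split) are hypothetical.

```c
#include <stddef.h>

/* Number of elements in sorted b[0..m) that are strictly less than x. */
size_t rank_less(const int *b, size_t m, int x) {
    size_t lo = 0, hi = m;                  /* half-open range [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (b[mid] < x) lo = mid + 1; else hi = mid;
    }
    return lo;
}

/* Ordinary two-pointer merge of a[0..n) and b[0..m) into c, O(n + m). */
void merge_seq(const int *a, size_t n, const int *b, size_t m, int *c) {
    size_t i = 0, j = 0, k = 0;
    while (i < n && j < m) {
        c[k++] = (b[j] < a[i]) ? b[j++] : a[i++];
    }
    while (i < n) c[k++] = a[i++];
    while (j < m) c[k++] = b[j++];
}

/* Split the work into two independent merges around the pivot a[n / 2].
   Everything in the first part is <= the pivot and everything in the
   second part is >= the pivot, so the two outputs simply concatenate. */
void merge_split(const int *a, size_t n, const int *b, size_t m, int *c) {
    size_t i = n / 2;
    size_t j = (n > 0) ? rank_less(b, m, a[i]) : 0;
    merge_seq(a, i, b, j, c);                          /* independent task 1 */
    merge_seq(a + i, n - i, b + j, m - j, c + i + j);  /* independent task 2 */
}
```

The split is balanced with respect to a only; if b is heavily skewed relative to a, the two tasks can differ a lot in size, and a more careful choice of split point would be needed.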

The algorithm used by the simpleMerge function is asymptotically optimal. Its complexity is O(n+m), and you cannot find a better algorithm since reading the input and writing the output already takes O(n+m).
