Recursive unsorted array search algorithm in C? - c

Let's say we want to write a function in C that finds a specified target value in an unsorted array of ints. In general, this is simple and runs in O(n) time:
int search(int *data, int len, int target)
{
int i;
for(i = 0; i < len; i++)
if(data[i]==target) return i;
return -1;
}
Let's say we're being masochistic and want to approach this with a divide and conquer algorithm instead. We'll run into trouble on the recursive part because we can't exclude half the array each time, like we can with binary search:
int search(int *data, int start, int stop, int target)
{
// Base case: we've divided the array down to a two-element subarray
if(stop==start+1)
{
if(data[start]==target) return start;
if(data[stop]==target) return stop;
return -1;
}
/* The recursion part is tricky.
We *need* to parse both halves of the array, because we can't safely
exclude any part of the array; it's not sorted, so we can't predict
which half it's going to be in.*/
else
{
/** This obviously doesn't work. */
int mid = (stop-start)/2;
return search(data, start, mid, target);
return search(data, mid+1, stop, target);
}
}
Is there any way to make this work?
NOTE: This is not asking people to do my homework for me, as some of you may think when reading this question. It is, however, inspired by curiosity after I encountered this problem when trying to solve a question in an assignment that I've submitted earlier this week.

How about changing the recursive call to:
else
{
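// mid must be offset by start so it lies inside [start, stop]; the base case
// must also handle start == stop, or odd-sized ranges never terminate
int mid = start + (stop-start)/2;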
int x = search(data, start, mid, target);
if (x == -1)
return search(data, mid+1, stop, target);
else
return x;
}
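For completeness, here is a minimal sketch of a whole function built on the check-left-then-right idea. The name `search_range` is made up for illustration. It assumes inclusive bounds, as in the question, computes the midpoint relative to `start`, and adds base cases for empty and single-element ranges. It still has to look at every element in the worst case, so it is O(n) just like the plain loop.

```c
/* Search data[start..stop] (inclusive bounds) for target.
   Returns the index of the leftmost match, or -1 if there is none. */
int search_range(const int *data, int start, int stop, int target)
{
    if (start > stop)                 /* empty range */
        return -1;
    if (start == stop)                /* single element */
        return data[start] == target ? start : -1;

    /* Midpoint of this range, not of the whole array. */
    int mid = start + (stop - start) / 2;

    /* Try the left half first; only search the right half on a miss. */
    int found = search_range(data, start, mid, target);
    if (found != -1)
        return found;
    return search_range(data, mid + 1, stop, target);
}
```

For an array of `len` elements the initial call would be `search_range(data, 0, len - 1, target)`.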

I think the answer to your question is no, you can't achieve any benefit using the binary split approach if the data is unsorted.

If the data are not sorted, you cannot use binary search.
But divide and conquer can be used with the following recursive logic (linear search):
int search(int *data, int len, int target)
{
if (len == 0)
return -1;
else if (data[0] == target)
return 0;
else
{
// search the rest; only shift the index if something was found
int rest = search(data+1, len-1, target);
return (rest == -1) ? -1 : 1 + rest;
}
}

Related

How to use divide and conquer and the fact that if one subarray has a majority, the combined array has a majority to find majority element?

In the question we were told that the crux of the algorithm is the following fact:
"When we get down to single elements, that single element is returned as the majority of its (1-element) array. At every other level, it will get return values from its two recursive calls. The key to this algorithm is the fact that if there is a majority element in the combined array, then that element must be the majority element in either the left half of the array, or in the right half of the array."
My implementation is below. It is probably very buggy, but this was the general idea:
#include <stdio.h>
int merge(int *input, int left, int middle, int right, int maj1, int maj2)
{
// determine length
int length1 = middle - left + 1;
int length2 = right - middle;
// create helper arrays
int left_subarray[length1];
int right_subarray[length2];
// fill helper arrays
int i;
for (i=0; i<length1; ++i)
{
left_subarray[i] = input[left + i];
}
for (i=0; i<length2; ++i)
{
right_subarray[i] = input[middle + 1 + i];
}
left_subarray[length1] = 100;
right_subarray[length2] = 100;
//both return majority element
int count1 = 0;
int count2 = 0;
for (int i = 0; i < length1; ++i) {
if (left_subarray[i] == maj1) {
count1++;
}
if (right_subarray[i] == maj1) {
count1++;
}
}
for (int i = 0; i < length2; ++i) {
if (right_subarray[i] == maj2) {
count2++;
}
if (left_subarray[i] == maj2) {
count2++;
}
}
if (count1 > ((length1+length2) - 2)/2){
return maj1;
}
else if (count2 > ((length1+length2) - 2)/2){
return maj2;
}
else
return 0;
}
int merge_sort(int *input, int start, int end, int maj1, int maj2)
{
//base case: when array split to one
if (start == end){
maj1 = start;
return maj1;
}
else
{
int middle = (start + end ) / 2;
maj1 = merge_sort(input, start, middle, maj1, maj2);
maj2 = merge_sort(input, middle+1, end, maj1, maj2);
merge(input, start, middle, end, maj1, maj2);
}
return 0;
}
int main(int argc, const char* argv[])
{
int num;
scanf("%i", &num);
int input[num];
for (int i = 0; i < num; i++){
scanf("%i", &input[i]);
}
int maj;
int maj1 = -1;
int maj2 = -1;
maj = merge_sort(&input[0], 0, num - 1, maj1, maj2);
printf("%d", maj);
return 0;
}
This obviously isn't divide and conquer. I was wondering what the correct way to implement this is, so I can get a better understanding of divide and conquer implementations. My main difficulty was how to merge the two sub-arrays to carry the result up to the next level, but I am probably missing something fundamental in the other parts too.
Disclaimer: This WAS for an assignment, but I am analyzing it now to further my understanding.
The trick with this particular algorithm, and the reason it ends up taking O(n log n) time, is that you still need to iterate over the array you are dividing in order to confirm the majority element. What the division provides is the correct candidates for that iteration.
For example:
[2,1,1,2,2,2,3,3,3,2,2]
|maj 3| maj 2
maj 2 | maj None
<-------------------> still need to iterate
This is implicit in the algorithm statement: "if there is a majority element in the combined array, then that element must be the majority element in either the left half of the array." That "if" indicates confirmation is still called for.
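Here is a minimal sketch of that idea, not a drop-in fix for the code in the question. The names `majority` and `count_in_range` are made up for illustration, and the function assumes a non-empty range with inclusive bounds. Each level takes the candidates from its two halves and confirms them by counting over the combined range.

```c
/* Count how many times value occurs in a[lo..hi] (inclusive). */
static int count_in_range(const int *a, int lo, int hi, int value)
{
    int count = 0;
    for (int i = lo; i <= hi; i++)
        if (a[i] == value)
            count++;
    return count;
}

/* Look for a majority element (strictly more than half) in a[lo..hi].
   Requires lo <= hi. Returns 1 and stores the element in *out if one
   exists, otherwise returns 0 and leaves *out untouched. */
int majority(const int *a, int lo, int hi, int *out)
{
    if (lo == hi) {                   /* one element is its own majority */
        *out = a[lo];
        return 1;
    }

    int mid = lo + (hi - lo) / 2;
    int left, right;
    int has_left = majority(a, lo, mid, &left);
    int has_right = majority(a, mid + 1, hi, &right);

    /* The candidates still have to be confirmed on the combined range. */
    int half = (hi - lo + 1) / 2;
    if (has_left && count_in_range(a, lo, hi, left) > half) {
        *out = left;
        return 1;
    }
    if (has_right && count_in_range(a, lo, hi, right) > half) {
        *out = right;
        return 1;
    }
    return 0;
}
```

The counting pass is linear in the size of the range, and there are about log n levels of recursion, which is where the O(n log n) comes from.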

Iterative binary search with only two comparisons?

Adjust the iterative binary search code so that it uses only two comparisons instead of three (in the main while loop). Note: the three comparisons are the one in the while condition and the two if statements within the loop.
#include <stdio.h>
int ItBinarySearch(int arr[], int len, int target) {
int first = 0;
int last = len-1;
while (first <= last){
// Assert: array is sorted
int mid = (first+last) / 2;
if (target == arr[mid])
return 1;
if (target < arr[mid])
last = mid-1;
else first = mid+1;
}
return 0;
}
int main(void){
int arr[6]={0,1,2,3,4,5};
int len=sizeof(arr)/sizeof(arr[0]);
int target = 4;
printf("%d\n",ItBinarySearch(arr,len,target));
}
Hint: try moving one of the if/else statements outside the loop. Which if/else would be the most likely candidate where the algorithm would still work if it were outside the statement?
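One way to read the hint is to defer the equality test until the loop has narrowed the range to a single element. The sketch below is one possible variant under that reading, not necessarily the intended solution, and the name `binary_search_deferred` is made up. The loop performs only the `first < last` test and one element comparison per iteration.

```c
/* Returns 1 if target occurs in the sorted array arr[0..len-1], else 0. */
int binary_search_deferred(const int arr[], int len, int target)
{
    if (len <= 0)
        return 0;

    int first = 0;
    int last = len - 1;
    while (first < last) {                      /* comparison 1 */
        int mid = first + (last - first) / 2;   /* mid < last here */
        if (arr[mid] < target)                  /* comparison 2 */
            first = mid + 1;                    /* target is right of mid */
        else
            last = mid;                         /* target is at mid or left of it */
    }

    /* Exactly one candidate remains; test equality once, outside the loop. */
    return arr[first] == target;
}
```

The trade-off is that the loop no longer exits early on a hit, so it always runs about log2(len) iterations.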

Optimize a segment tree for range maximum queries?

So I need some help again. I recently started doing medium-level problems on CodeChef, and I am getting TLE (time limit exceeded) quite a lot.
The task is to find the sum of the answers to multiple range-maximum queries. The initial range is given, and the following ranges are calculated by a formula given in the problem.
I used a segment tree to solve the problem, but I keep getting TLE for some subtasks. Please help me optimize this code.
Problem link- https://www.codechef.com/problems/FRMQ
//solved using segment tree
#include <stdio.h>
#define gc getchar_unlocked
inline int read_int() //fast input function
{
char c = gc();
while(c<'0' || c>'9')
c = gc();
int ret = 0;
while(c>='0' && c<='9')
{
ret = 10 * ret + c - '0';
c = gc();
}
return ret;
}
int min(int a,int b)
{
return (a<b?a:b);
}
int max(int a,int b)
{
return (a>b?a:b);
}
void construct(int a[],int tree[],int low,int high,int pos) //constructs
{ //the segment tree by recursion
if(low==high)
{
tree[pos]=a[low];
return;
}
int mid=(low+high)>>1;
construct(a,tree,low,mid,(pos<<1)+1);
construct(a,tree,mid+1,high,(pos<<1)+2);
tree[pos]=max(tree[(pos<<1)+1],tree[(pos<<1)+2]);
}
int query(int tree[],int qlow,int qhigh,int low,int high,int pos)
{ //function finds the maximum value using the 3 cases
if(qlow<=low && qhigh>=high)
return tree[pos]; //total overlap
if(qlow>high || qhigh<low)
return -1; //no overlap
int mid=(low+high)>>1; //else partial overlap
return max(query(tree,qlow,qhigh,low,mid,(pos<<1)+1),query(tree,qlow,qhigh,mid+1,high,(pos<<1)+2));
}
int main()
{
int n,m,i,temp,x,y,ql,qh;
long long int sum;
n=read_int();
int a[n];
for(i=0;i<n;i++)
a[i]=read_int();
temp=1; //initialize before the loop below reads it
i=1;
while(temp<n) //find size of tree
{
temp=1<<i;
i++;
}
int size=(temp<<1)-1;
int tree[size];
construct(a,tree,0,n-1,0);
m=read_int();
x=read_int();
y=read_int();
sum=0;
for(i=0;i<m;i++)
{
ql=min(x,y);
qh=max(x,y);
sum+=query(tree,ql,qh,0,n-1,0);
x=(x+7)%(n-1); //formula to generate the range of query
y=(y+11)%n;
}
printf("%lld",sum);
return 0;
}
Several notes:
It's great you are using fast IO routines.
Make sure you do NOT use the modulo operation, because it is VERY slow. To calculate the remainder, simply subtract N from the number until it becomes less than N. This would work much faster.
Your algorithm works in O((M+N) * log N) time, which is not optimal. For static RMQ problem, it is better and much simpler to use sparse table. It needs O(N log N) space and O(M + N log N) time.
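To make the sparse table suggestion concrete, here is a minimal sketch for range maximum over a static array. The type and function names (`SparseTable`, `sparse_build`, `sparse_query`, `sparse_free`) are made up for illustration, and it assumes n >= 1 and 0 <= lo <= hi < n. It has not been tuned or tested against the judge.

```c
#include <stdlib.h>

typedef struct {
    int n;            /* number of elements */
    int levels;       /* number of rows, floor(log2(n)) + 1 */
    int *log2_floor;  /* log2_floor[i] = floor(log2(i)) for 1 <= i <= n */
    int *table;       /* table[k*n + i] = max of a[i .. i + 2^k - 1] */
} SparseTable;

/* Build in O(n log n) time and space. Returns 0 on success, -1 if out of memory. */
int sparse_build(SparseTable *st, const int *a, int n)
{
    st->n = n;
    st->log2_floor = malloc((size_t)(n + 1) * sizeof(int));
    if (!st->log2_floor)
        return -1;
    st->log2_floor[0] = 0;            /* unused, set for tidiness */
    st->log2_floor[1] = 0;
    for (int i = 2; i <= n; i++)
        st->log2_floor[i] = st->log2_floor[i / 2] + 1;

    st->levels = st->log2_floor[n] + 1;
    st->table = malloc((size_t)st->levels * n * sizeof(int));
    if (!st->table) {
        free(st->log2_floor);
        return -1;
    }

    for (int i = 0; i < n; i++)       /* row 0: blocks of length 1 */
        st->table[i] = a[i];
    for (int k = 1; k < st->levels; k++) {
        int half = 1 << (k - 1);
        for (int i = 0; i + (1 << k) <= n; i++) {
            int x = st->table[(k - 1) * n + i];
            int y = st->table[(k - 1) * n + i + half];
            st->table[k * n + i] = x > y ? x : y;
        }
    }
    return 0;
}

/* Maximum of a[lo..hi] in O(1): two overlapping blocks of length 2^k. */
int sparse_query(const SparseTable *st, int lo, int hi)
{
    int k = st->log2_floor[hi - lo + 1];
    int x = st->table[k * st->n + lo];
    int y = st->table[k * st->n + hi - (1 << k) + 1];
    return x > y ? x : y;
}

void sparse_free(SparseTable *st)
{
    free(st->table);
    free(st->log2_floor);
}
```

Overlapping blocks are fine here because max is idempotent; the same trick would not work for sums.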
Well, I think that to get 100 points you need to use a sparse table.
I tried to optimize your code: https://www.codechef.com/viewsolution/7535957 (run time decreased from 0.11 sec to 0.06 sec), but that is still not enough to pass subtask 3.

why doesn't my quicksort implementation in C work

Here I made a quicksort implementation that uses (at least tries to) a little trick that is possible because I know the input is a list of numbers in the set {1,2,...,1023}. In particular I am not using a pivot that is necessarily in the list itself. Here is the main part of the function
int partition2(int length, int arr[], int mask) {
int left = 0;
int right = length;
while (left < right) {
while ((left < right) && (!((arr[left])&mask)) ) {
left++;
}
while ((left < right) && ((arr[right-1])&mask)) {
right--;
}
if (left < right) {
right--;
swap(left, right, arr);
left++;
}
}
left--;
return left;
}
void qSort2(int length, int arr[], int mask) {
int boundary;
if (length <= 1) {
return; /* empty or singleton array: nothing to sort */
}
boundary = partition2(length, arr, mask);
qSort2(boundary, arr,mask/2);
qSort2(length - boundary - 1, &arr[boundary + 1], mask/2);
}
main(){
int length = 200;
int *arr;
arr = generateNumbers(length);
qSort2(length, arr, 512);
int i;
for(i=0;i<length;i++){
printf("%d ", arr[i]);
}
}
Sorry for the lengthy code. The function generateNumbers just makes an array of size length with numbers from the given range, and swap simply swaps two elements of the array. I am trying to exploit the fact that all numbers are smaller than 1024, so roughly half of them will have a 1 in the binary position corresponding to 2^9 = 512. We can use that bit to split the list into two lists. Then, for each of the two lists, we check the bit corresponding to 2^8 and split again, and so on. I use the variable mask for this: n & mask is zero if n is smaller than mask and nonzero if n is larger than mask. For some reason, though, it does not seem to work. Does anyone have any idea why? The output list is almost sorted, but there are mistakes in certain places. If anyone could help me out, that would be great. Thanks!
Here is the generateNumbers function:
void *makeDynamicIntArray(int length) {
void *ptr = malloc(length*sizeof(int));
if (ptr == NULL) {
printf("\nMalloc failed: out of memory?\n");
exit(-1);
}
return ptr;
}
int *generateNumbers(int length) {
int i, *arr = makeDynamicIntArray(length);
for (i=0; i<length; i++) {
arr[i] = rand() % 1024;
}
return arr;
}
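As far as I can tell from the posted code, partition2 returns left - 1 and qSort2 then treats that index like a pivot that is already in its final place, so the element at that index is left out of both recursive calls. With a bit mask there is no pivot element, so every element has to land in one of the two sublists. Below is a minimal sketch of the bit-splitting idea with that changed. The name `radix_exchange_sort` is made up, and it assumes non-negative values smaller than twice the initial mask.

```c
/* Sort arr[0..length-1] by the bits at and below mask (most significant first). */
void radix_exchange_sort(int arr[], int length, int mask)
{
    if (length <= 1 || mask == 0)     /* nothing left to split on */
        return;

    int left = 0;         /* arr[0..left-1] have the mask bit clear */
    int right = length;   /* arr[right..length-1] have the mask bit set */
    while (left < right) {
        if (!(arr[left] & mask)) {
            left++;
        } else {
            /* Move the element with the bit set to the back of the range. */
            right--;
            int tmp = arr[left];
            arr[left] = arr[right];
            arr[right] = tmp;
        }
    }

    /* left is now the count of elements with the bit clear. Both halves
       together cover every element; no pivot slot is skipped. */
    radix_exchange_sort(arr, left, mask >> 1);
    radix_exchange_sort(arr + left, length - left, mask >> 1);
}
```

For the setup in the question the call would be `radix_exchange_sort(arr, length, 512)`.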

What is wrong with my binary search implementation?

#include <stdio.h>
int bsearch(int a[], int n, int lo, int hi) {
int mid;
mid = (hi + lo) / 2;
if(a[mid] == n)
return 1;
else if(a[mid] > n)
bsearch(a, n, lo, mid);
else
bsearch(a, n, mid, hi);
return 0;
}
int main(void) {
int n, a[7] = {2, 4, 5, 67, 70, 80, 81};
int hi = 6, lo = 0, j;
scanf("%d", &n);
j = bsearch(a, n, lo, hi);
if(j)
printf("Found");
else
printf("Not Found");
return 0;
}
input : 5 output: Not Found
Can anyone tell me why I'm getting this result?
You need to fix several big issues to make it work (see details in following code comments).
Change your binary search function to the following:
int bsearch(int a[], int n, int lo, int hi)
{
// add base case
if (hi < lo)
return 0; // not found
int mid;
mid=(hi+lo)/2;
if(a[mid]==n)
return 1;
else if(a[mid]>n)
return bsearch(a,n,lo,mid-1); // add return
else
return bsearch(a,n,mid+1,hi); // add return
}
P.S.: Based on how you use it in main(), the function only needs to return 0 or 1 to indicate whether the array contains the value. I suggest using a bool return type to make that clearer.
Add "return" to the recursive calls, e.g.:
return bsearch(a,n,lo,mid);
Otherwise, when you return 1 in bsearch, it does not get returned all the way to main.
That will make it work for 5. You have other bugs, so try with many values and use an IDE and/or printf to see what's happening. Good luck and have fun!
That's because the return 0; statement in your bsearch function is always executed because you are simply discarding the values returned by the recursive calls. In a recursive function, you must first decide the base case. Here in your bsearch, the base case should be
lo <= hi
This is the first condition which must be true to start the search for the sought value. If this condition is not fulfilled, then you must return false, i.e., 0.
Next, a value returning function call is an expression, i.e., it evaluates to a value. When you simply call the function and do nothing with the result, you will always fall down to the last return statement in your function. Here I list some points in comments alongside the statements in your bsearch function.
int bsearch(int a[], int n, int lo, int hi) {
// first check for the base condition here
// if(hi < lo) return 0;
int mid;
// can cause integer overflow. should be
// mid = lo + (hi - lo) / 2;
mid = (hi + lo) / 2;
if(a[mid] == n)
return 1;
else if(a[mid] > n)
// you are doing nothing with the value returned
// think of the function call as an expression
// return the value of the expression
// should be
// return bsearch(a, n, lo, mid);
bsearch(a, n, lo, mid);
else
// same follows here
// should be
// return bsearch(a, n, mid, hi);
bsearch(a, n, mid, hi);
// finally you will always return 0 because this statement is always executed
// all cases have been taken care of.
// no return statement needed here
return 0;
}

Resources