What's the point of using linear search with sentinel? - c

My goal is to understand why linear search with a sentinel is preferred over a standard linear search.
#include <stdio.h>
int linearSearch(int array[], int length) {
int elementToSearch;
printf("Insert the element to be searched: ");
scanf("%d", &elementToSearch);
for (int i = 0; i < length; i++) {
if (array[i] == elementToSearch) {
return i; // I found the position of the element requested
}
}
return -1; // The element to be searched is not in the array
}
int main() {
int myArray[] = {2, 4, 9, 2, 9, 10};
int myArrayLength = 6;
linearSearch(myArray, myArrayLength);
return 0;
}
Wikipedia mentions:
Another way to reduce the overhead is to eliminate all checking of the loop index. This can be done by inserting the desired item itself as a sentinel value at the far end of the list.
If I implement linear search with sentinel, I have to
array[length + 1] = elementToSearch;
Though, the loop stops checking the elements of the array once the element to be searched is found. What's the point of using linear search with sentinel?

A standard linear search goes through the elements one by one and checks the array index on every iteration to detect when it has passed the last element, the way your code does:
for (int i = 0; i < length; i++) {
if (array[i] == elementToSearch) {
return i; // I found the position of the element requested
}
}
But the idea of sentinel search is to place the element being searched for at the end of the array and skip the index check; this removes one comparison from each iteration.
while(a[i] != element)
i++;

First, let's turn your example into a solution that uses sentinels.
#include <stdio.h>
int linearSearch(int array[], int length, int elementToSearch) {
int i = 0;
array[length] = elementToSearch;
while (array[i] != elementToSearch) {
i++;
}
return i;
}
int main() {
int myArray[] = {2, 4, 9, 2, 9, 10, -1};
int myArrayLength = 6;
int mySearch = 9;
printf("result is %d\n", linearSearch(myArray, myArrayLength, mySearch));
return 0;
}
Notice that the array now has an extra slot at the end to hold the sentinel value. (If we don't do that, the behavior of writing to array[length] is undefined.)
The purpose of the sentinel approach is to reduce the number of tests performed for each loop iteration. Compare:
// Original
for (int i = 0; i < length; i++) {
if (array[i] == elementToSearch) {
return i;
}
}
return -1;
// New
while (array[i] != elementToSearch) {
i++;
}
return i;
In the first version, the code is testing both i and array[i] for each loop iteration. In the second version, i is not tested.
For a large array, the performance difference could be significant.
But what are the downsides?
The result when the value is not found is different; -1 versus length.
We have to make the array bigger to hold the sentinel value. (And if we don't get it right we risk clobbering something on the stack or heap. Ouch!)
The array cannot be read-only. We have to be able to update it.
This won't work if multiple threads are searching the same array for different elements.
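The first downside in the list above (returning length instead of -1) can be hidden inside the function. A minimal sketch, assuming the caller still guarantees a spare writable slot at array[length]; the name sentinelSearch is made up for illustration:

```c
/* Sentinel search that still reports "not found" as -1.
 * Assumption: array[length] is a valid, writable spare slot,
 * i.e. the array really has at least length + 1 elements. */
int sentinelSearch(int array[], int length, int elementToSearch) {
    int i = 0;
    array[length] = elementToSearch;        /* plant the sentinel */
    while (array[i] != elementToSearch) {   /* no bounds test in the loop */
        i++;
    }
    /* Stopping on the sentinel slot means the value was not in the data. */
    return (i == length) ? -1 : i;
}
```

The single extra comparison happens once after the loop, not on every iteration, so the per-iteration saving is kept. The other downsides (writable array, extra slot, no thread safety) still apply.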

Using a sentinel value lets you remove the variable i, along with its comparison and increment.
In your linear search the loop looks the following way
for (int i = 0; i < length; i++) {
if (array[i] == elementToSearch) {
return i; // I found the position of the element requested
}
}
So variable i is introduced, initialized, compared in each iteration of the loop, increased and used to calculate the next element in the array.
Also, the function in fact needs three parameters if the searched value is passed to it:
int linearSearch(int array[], int length, int value) {
//...
Using the sentinel value the function can be rewritten the following way
int * linearSearch( int array[], int value )
{
while ( *array != value ) ++array;
return array;
}
And inside the caller you can check whether the array has the value the following way
int *target = linearSearch( array, value );
int index = target == array + size - 1 ? -1 : target - array;

If you append the value you are searching for to the end of the array, you can drop one comparison per iteration, so the running time is reduced.
It may look like for(i = 0;;i++) if(array[i] == elementToSearch) return i;.

If you append the value to search for at the end of the array, then instead of using a for loop with initialization, condition and increment you can use a simpler loop like
while (array[i++] != elementToSearch)
;
Then the loop condition is the check for the value you search for, which means less code to execute inside the loop.

The point is that you can convert the for loop into a while/repeat loop. Notice how you are checking i < length each time. If you convert it,
do {
} while (array[i++] != elementToSearch);
Then you don't have to do that extra check. (In this case, the array must be one element longer to hold the sentinel.)

Although the sentinel approach seems to shave off a few cycles per iteration in the loop, this approach is not a good idea:
the array must be defined with an extra slot and passing its length as 1 less than the defined length is confusing and error prone;
the array must be modifiable;
if the search function modifies the array to set the sentinel value, this constitutes a side effect that can be confusing and unexpected;
the search function with a sentinel cannot be used for a portion of the array;
the sentinel approach is inherently not thread safe: searching the same array for 2 different values in 2 different threads would not work, whereas searching a constant read-only array from multiple threads would be fine;
the benefits are small and only for large arrays. If this search becomes a performance bottleneck, you should probably not use linear scanning. You could sort the array and use a binary search or you could use a hash table.
optimizing compilers for modern CPUs can generate code where both comparisons will be performed in parallel, hence incur no overhead;
As a rule of thumb, a search function should not have side effects. This is a good example of the principle of least surprise.

Related

Issue about Binary search algorithm in c

I am confused about the behavior of the code when searching for an element which does not exist in the array.
The resulting index of the element I am looking for is always zero when I declare it as int index;.
The resulting index of the element I am looking for is a random number when I declare it as size_t index;.
What is the difference between declaring the variable index as int index; and size_t index; in the code below?
The code
#include <stdio.h>
#define SIZE 5
int main(void)
{
int numbers[SIZE]={1,2,3,4,5};
int search =0; // This variable define the required number i am searching for
int start = 0 ;
int end = SIZE-1 ;
size_t index;
while (start <= end)
{
int middle = (start+end)/2;
if (search == numbers[middle])
{
index = middle;
}
if (search > numbers[middle])
{
start = middle+1 ;
}
else
{
end= middle-1 ;
}
}
printf("The index of the element is %d",index);
return 0;
}
The basic problem is that index is not initialized and that it never gets assigned when you don't find what you are searching for. Since the printf statement accesses an uninitialized variable in that case, your code has undefined behavior, i.e. anything may happen, including printing all sorts of numbers.
The result of the element index i am looking for is always zero while declaring it as int index;
That is "just by luck"
The result of the element index i am looking for is random number while declaring it as size_t index;
That is also "just by luck"
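One way to avoid reading an unassigned index is to return the position from a function and reserve a defined value for "not found". This is only a sketch under that assumption; binarySearch is a hypothetical name that does not appear in the question:

```c
/* Binary search over an int array sorted in ascending order.
 * Returns the index of search, or -1 if it is not present. */
int binarySearch(const int numbers[], int size, int search) {
    int start = 0;
    int end = size - 1;
    while (start <= end) {
        int middle = start + (end - start) / 2;  /* same midpoint, no overflow of start + end */
        if (numbers[middle] == search) {
            return middle;                       /* found: stop immediately */
        } else if (search > numbers[middle]) {
            start = middle + 1;
        } else {
            end = middle - 1;
        }
    }
    return -1;  /* defined "not found" result instead of an uninitialized value */
}
```

Because the result is an int, it can be printed with %d, and the caller can test for -1 before printing an index.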
Here are a couple of action items you can take to improve your code:
Since this array is statically defined there is no need to include the SIZE define inside the []. Declare it like this int numbers[]={1,2,3,4,5}; instead of this int numbers[SIZE]={1,2,3,4,5};. Let the compiler do the math for you.
Initialize index to some value (e.g. index = 0;). This is the main cause of the problem, and it introduces undefined behavior into the program.
Change the type of index from size_t to int. Every other variable declared in the program is an int, and the program treats index as an int, so it might as well be an int to avoid confusion.
Make this an else if clause instead of just an if:
else if (search > numbers[middle])
{
start = middle+1 ;
}
Add another case to have the program fail gracefully when the value to be searched is missing from the data set. Such as, printf("Data not found: %d", search);
The algorithm still isn't 100% and has some flaws but I will leave this up to you to figure out. I hope this info helps!
Best Regards!
The problem is that the value of index is not initialized.
Initializing the variable to 0 does not solve your problem, because you are using index to report the position of the array element.
Initializing index = 0 gives the same result for elements that are not present in the array as for the first element of the array.
A better way is to initialize it to -1, for example int index = -1; (size_t is unsigned, so size_t index = -1; would actually store SIZE_MAX).
Then the result for elements not present in the array would be -1.
Also check the format specifier used in the printf statement. For a size_t value it should be %zu:
printf("The index of the element is %zu",index);
You are not using the correct format specifier for size_t; it's not %d.
Use %zu (the C99 specifier for size_t) and it will print correctly.
Furthermore, add this after the while loop so that it doesn't show weird value of index when the element is not present in the array.
if(start>end) {
printf("That number is not present in the array");
return 0;
}
And move the line printf("The index of the element is %d",index); under the condition if (search == numbers[middle]). So that you don't get "this number is not present" even if it is present in the array.
For corrected version of your code see https://code.hackerearth.com/80043dg?key=7b325b26aec0f5425b76cc3efbdc93cf

Is it cheating to use 'static' when writing a recursive algorithm?

As part of a programming assignment, I'm required to write a recursive function which determines the largest integer in an array. To quote the exact task:
Write a recursive function that finds the largest number in a given list of
integers.
I have come up with two solutions, the first of which makes two recursive calls:
int largest(int arr[], int length){
if(length == 0)
return 0;
else if(arr[length - 1] > largest(arr,length -1))
return arr[length];
else return largest(arr,length -1);
}
The second one makes only one, however it uses a static variable n:
int largest(int arr[], int length){
static int n = -1;
if(length == 0)
return n;
else if (arr[length - 1] > n)
n = arr[length - 1];
return largest(arr, length - 1);
}
I was wondering whether it would be considered cheating use static variables for such a task. Either way, which one is considered better form? Is there a recursive method which tops both?
I wouldn't say that it's cheating to use static variables this way - I'd say that it's incorrect. :-)
Imagine that you call this function multiple times on a number of different arrays. With the static variable introduced, the value of n never resets between calls, so you may end up returning the wrong value. Generally speaking, it's usually poor coding style to set things up like this, since it makes it really easy to get the wrong answer. Additionally, if your array contains only negative values, you may return -1 as the answer even though -1 is actually bigger than everything in the array.
I do think that the second version has one nice advantage over the first - it's much, much faster because it makes only one recursive call rather than two. Consider using the first version, but updating it so that you cache the value returned by the recursive call so that you don't make two calls. This will exponentially speed up the code; the initial version takes time Θ(2^n), while the updated version would take time Θ(n).
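The caching idea can be sketched as follows. This is an illustration, not the asker's code: the name largestCached is made up, and it assumes length >= 1 because an empty array has no largest element.

```c
/* Recursive maximum with the recursive result stored in a local variable,
 * so each level makes exactly one recursive call.
 * Assumption: length >= 1. */
int largestCached(const int arr[], int length) {
    if (length == 1)
        return arr[0];
    int rest = largestCached(arr, length - 1);   /* computed once, reused below */
    return arr[length - 1] > rest ? arr[length - 1] : rest;
}
```

Since nothing is kept between calls, the function gives the right answer for every array it is called on, including arrays of only negative numbers.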
There is nothing cheating using a static inside function, recursive or otherwise.
There can be many good reasons to do so, but in your case I suspect that you have come up with a wrong solution, in that largest will only work once in the lifetime of the program running it.
consider the following (pseudo) code;
main() {
largest([ 9, 8, 7]) // would return 9 -- OK
largest([ 1, 2, 3]) // would return 9 ?? bad
}
The reason being that your largest cannot tell the difference between the two calls, but if that is what you want then that is fine.
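The same effect can be reproduced in real C. The function below is the static-variable version from the question, renamed largestStatic here purely for illustration:

```c
/* The static-variable version from the question, under a different name. */
int largestStatic(int arr[], int length) {
    static int n = -1;          /* initialized once, never reset between calls */
    if (length == 0)
        return n;
    else if (arr[length - 1] > n)
        n = arr[length - 1];
    return largestStatic(arr, length - 1);
}
```

Calling it first on {9, 8, 7} and then on {1, 2, 3} returns 9 both times, because n still holds 9 when the second call starts.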
Edit:
In answer to your comment, something like this will have a better big-O notation than your initial code;
int largest(int arr[], int length){
int split, lower, upper;
if (length <= 0) return INT_MIN; // error: empty array (INT_MIN is from <limits.h>)
switch (length) {
case 1: return arr[0];
case 2: if (arr[1]>arr[0]) return arr[1]; else return arr[0];
default:
split = length/2;
lower = largest(arr,split);
upper = largest(arr+split,length-split);
if (lower > upper) return lower; else return upper;
}
}
Alternatively, the obvious solution is;
int largest(int arr[], int length){
if (length <= 0) return INT_MIN; // error: empty array (INT_MIN is from <limits.h>)
int max = arr[0];
for (int i=1; i<length; i++)
if (arr[i] > max) max = arr[i];
return max;
}
which has no recursion at all
It is actually a terrible design, because the second execution of the function does not return a correct result.
I don't think you need to debate whether it is cheating, if it is wrong.
The first version is also incorrect, because you return arr[length] instead of arr[length-1]. You can eliminate the second recursive call. What can you do instead of calling the same function (with no side-effects) twice with the same arguments?
In addition to the excellent points in the three prior answers, you should practice having more of a recursion-based mind. (1) Handle the trivial case. (2) For a non-trivial case, make a trivial reduction in the task and recur on the (smaller) remaining problem.
I propose that your proper base case is a list of one item: return that item. An empty list has no largest element.
For the recursion case, check the last element against the max of the rest of the list; return the larger. In code form, this looks like the below. It makes only one recursive call, and has only one explicit local variable -- and that is to serve as an alias for the recursion result.
int largest(int arr[], int length){
if(length == 1)
// if only one element, return it
return arr[0];
int n = largest(arr, length-1);
// return the larger of the last element or the largest of the rest.
return arr[length-1] > n ? arr[length-1] : n;
}
Is there a recursive method which tops both?
Recursion gets a bad name when N elements cause a recursion depth of N, as with return largest(arr,length -1);
To avoid this, ensure the length is halved on each recursion.
The maximum recursive depth is O(log2(N))
int largest(int arr[], int length) {
if (length <= 0) return INT_MIN;
int big = arr[0];
while (length > 1) {
int length_r = length / 2;
int length_l = length - length_r;
int big_r = largest(&arr[length_l], length_r);
if (big_r > big) big = big_r;
length = length_l;
}
return big;
}
A sneaky and fast method that barely uses recursion, since finding the max is trivial with a loop:
int largest(int arr[], int length) {
if (length <= 0) return INT_MIN;
int max = largest(NULL, -1);
while (length) {
length--;
if (arr[length] > max) max = arr[length];
}
return max;
}

how to write function which returns the position of the number in the array?

I have just started to learn C++, and I have to do one exercise but I don't know how. Please help me.
I have to write a function that returns the position of a number in the array (the array and its size are passed to the function) for which the value of the expression |tab[i] - M| is the maximum, where M is the average of all the elements.
Thank you for your help
You will want to look at the values in your array one by one. You can access the individual values like this:
yourarray[index]
The best way to do it is a loop. There are several loops available in C++. You can use the for loop for example. Inside of the loop you check if the value is the one you are looking for
for (int i = 0; ...
{
if your array[i] == your value
If you found the value, break the loop and return the index i.
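// Returns the index of the first occurrence of a value in the array, or -1 if not found.
// The element count must be passed in: inside the function, array is a pointer,
// so sizeof(array)/sizeof(int) would not give the number of elements.
int GetPositionInArray(int array[], int size, int value)
{
for (int i = 0; i < size; i++) {
if (array[i] == value)
return i;
}
return -1;
}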
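For the task as the question states it (the position where |tab[i] - M| is largest, with M the average), a sketch could look like this. It is written in plain C, which also compiles as C++; the name positionOfMaxDeviation and the choice to return the first index on ties are assumptions, not part of the exercise:

```c
/* Returns the index i for which |tab[i] - M| is largest, where M is the
 * average of all elements, or -1 if size <= 0.
 * On ties the first such index is returned (an assumption of this sketch). */
int positionOfMaxDeviation(const int tab[], int size) {
    if (size <= 0)
        return -1;

    double sum = 0.0;
    for (int i = 0; i < size; i++)
        sum += tab[i];
    double mean = sum / size;           /* M in the exercise */

    int best = 0;
    double bestDistance = -1.0;         /* smaller than any real distance */
    for (int i = 0; i < size; i++) {
        double distance = tab[i] - mean;
        if (distance < 0)
            distance = -distance;       /* absolute value without <math.h> */
        if (distance > bestDistance) {
            bestDistance = distance;
            best = i;
        }
    }
    return best;
}
```

The array is scanned twice: once to compute the average, and once to find the element farthest from it.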

C how do I check if an array is full?

I have an array: array[3][3]
I will let the user input data into the array as long as it is not full. As soon as the array gets full I want to stop the user from inserting more data into it.
C has no array bounds check. You have to do it yourself. Use a variable to keep track of how many items you have inserted, and stop when your counter is equal to the array size.
You cannot check whether an "array is full". To do what you want, keep track of the index while adding elements to the array.
You need to keep track of the data inserted by the user. When your counter reaches the size of the array, the array is full. :D There is no other way in C to achieve this, as the language does not provide any means of finding out how many elements have been stored in an array.
Introduce a variable for counting the number of cells filled. You adjust this variable whenever you add/remove data to your array. Then, in order to check if your array is full, just check if this variable is equal to the total number of cells in your array.
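The counter approach described in these answers might be sketched like this. The struct and function names (Grid, gridIsFull, gridAdd) are invented for the example, and it assumes cells are filled row by row:

```c
#define ROWS 3
#define COLS 3

/* A 3x3 grid plus a counter of how many cells have been filled. */
struct Grid {
    int cells[ROWS][COLS];
    int count;              /* number of values stored so far */
};

/* Returns 1 when every cell has been filled, 0 otherwise. */
int gridIsFull(const struct Grid *g) {
    return g->count == ROWS * COLS;
}

/* Stores value in the next free cell, row by row.
 * Returns 1 on success, 0 if the grid is already full. */
int gridAdd(struct Grid *g, int value) {
    if (gridIsFull(g))
        return 0;
    g->cells[g->count / COLS][g->count % COLS] = value;
    g->count++;
    return 1;
}
```

An input loop would read a number with scanf and stop as soon as gridAdd returns 0. Unlike the marker-value approaches, this places no restriction on which values the user may enter.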
The simplest way, in my opinion, is to dynamically allocate memory using calloc(), which initialises the array elements to zero. Then you can check whether the array is full by checking whether the last element in the array is still zero (this assumes the array is filled in order and zero is never a valid input). Of course, if the last element is still zero, then the array is not full.
First of all, this is not a one-dimensional array but a matrix (an array of arrays). You already know that the matrix has dimensions 3x3, so you could do something like this:
int x, y;
int array[3][3];
for (x = 0; x<3; x++)
{
for (y = 0; y<3; y++)
{
//I assume that it is an array of int
printf("Insert number at %d - %d" ,x,y);
scanf("%d" ,&array[x][y]);
}
}
Now the user can only insert 3*3=9 values.
There is no built-in array bounds checking in C. However, with careful coding we can check whether the array is full.
For example, pick a particular value that you never want to store in your array a[3][3]. That value could be anything in the integer range, such as 0xFF or 0. You have to make sure that value is never given as input to the array; then you can use it to verify whether your array a[3][3] is full.
/*part of code in main*/
/*here initialize the array with 0, assuming that 0 will never be a part of array a[3][3]*/
for(i=0; i<3; i++)
{
for(j=0;j<3;j++)
{
a[i][j] = 0; // assuming that 0 will never be a part of array a[3][3]
}
}
while(CheckArray(a)!=0)
{
printf("give array input:\n");
scanf("%d", &a[row][column]); //writing to empty cell of an array
}
//CheckArray code
int CheckArray(int a[][3])
{
int i, j;
for(i=0; i<3; i++)
{
for(j=0;j<3;j++)
{
if(a[i][j] == 0) // assuming that 0 will never be a part of array a[3][3]
{
row = i; // row and column must be global variables in this example!
column = j; // row and column must be global variables in this example!
return 1;
}
else
{
// no need to do anything here
}
}
}
//if code reaches here then array is full!
printf("Array is full.\n");
return 0;
}
You can still optimize the above code! this is just one way of checking whether array is full with better coding practice!
If possible, you could initialize all elements in the array to a certain value that your program would otherwise consider "illegal" or "invalid". E.g. an array of positive numbers can be initialized to all -1's. Or an array of chars can be initialized to all null characters ('\0').
Then, just look at the last element if it is set to the default value.
measurement = sizeof(myarray) / sizeof(element); //or just a constant
while(myarray[measurement-1] == defaultvalue){
//insert code here...
}
Encapsulate the behavior into a struct with getter/setter functions that check for the max length of the desired vector:
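#include <stdlib.h>
typedef struct varvector varvector;
struct varvector {
int length;
char* vector;
};
varvector* varvector_create(int length) {
varvector* container = malloc(sizeof(varvector));
char* vector = malloc(length);
if(container && vector) {
container->length = length;
container->vector = vector;
return(container);
}
// allocation failed: release whatever was allocated
free(vector);
free(container);
return(NULL);
}
void varvector_destroy(varvector* container) {
free(container->vector);
free(container);
}
// Returns the value at position, or 0 if position is out of range
char varvector_get(varvector* container, int position) {
if(position >= 0 && position < container->length) {
return(container->vector[position]);
}
return(0);
}
// Stores value at position; out-of-range positions are ignored
void varvector_set(varvector* container, int position, char value) {
if(position >= 0 && position < container->length) {
container->vector[position] = value;
}
}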
Object Oriented Programming is a design pattern which happens to have syntactic support in some languages and happens to not have syntactic support in C.
This does not mean that you cannot use this design pattern in your work, it just means you have to either use a library that already provides this for you (glib, ooc) or if you only need a small subset of these features, write your own basic functions.
You can assume that the last element of an array has id=0.
Then in function add check if there is an element with id=0.
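// struct element is an assumed type here; all that matters is that it has an int id field
struct element { int id; };
int add(const struct element *source, struct element *target, int size) {
int index = -1; // -1 means no element with id=0 was found
for (int i = 0; i < size; i++) {
if (target[i].id == 0) {
index = i;
break;
}
}
if (index >= 0 && index < size) {
if (index < size - 1) target[index + 1].id = 0;
// check that element with id=0 is the last in the array
//write code to add your element here
return index;
}
return -1; // the array is full
}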

Trying to find numbers repeated in two arrays

I am trying to find all of the numbers that are repeated across two arrays..
For example:
array1[2]: 1,2
array2[2]: 1,5
The number that repeats itself is 1 so we create a new array that will contain 1.
array3[2]: 1
My code is:
int func1(int *str, int *str2)
{
int i,j,temp,c[10];
for(i=0;i<*(str+i);i++){
for(j=0;j<*(str2+j);j++){
if(*(str+i) == *(str+j))
{
temp = *(str+i);
*(str+i) = temp;
temp = *(c+i);
return c[i];
}
}
}
return 0;
}
What is the problem?(logically)
Thanks.
There are multiple problems:
The conditions in the two for loops are odd and probably wrong. They are equivalent to:
for (i = 0; i < str[i]; i++)
for (j = 0; j < str2[j]; j++)
You should probably specify the sizes of the input arrays in the function interface.
In C, you must make sure you always know the sizes of the arrays.
You should probably specify the output array in the function interface.
Since you will need to know how many values were found in common, you'll need to return that number from the function.
Your choice of the names str and str2 is unusual. Not technically wrong, but probably not a good idea. Such names should be reserved for character strings, not arrays of integers.
Your local array c is barely used, and is not used safely.
Your code returns when it finds the first pair of numbers that match, not all possible matches.
The first two lines of the body of the if statement elaborately copy the value in str[i] back to itself via temp.
The third line of the body of the if statement copies an uninitialized value from array c into the variable temp.
The last line of the body of the if then returns that uninitialized value.
This adds up to changes such as:
int func1(int *arr1, int num1, int *arr2, int num2, int *arr3)
{
int k = 0;
for (int i = 0; i < num1; i++)
{
for (int j = 0; j < num2; j++)
{
if (arr1[i] == arr2[j])
arr3[k++] = arr1[i];
}
}
return k;
}
Note that this code assumes that the size of arr3 (the array, not the pointer itself) is as big as the product of num1 and num2. If both arrays contain a list of the same value, then there will be one row in the output array, arr3, for each pair so it could use num1 * num2 rows. This points out that the code does not deal with suppressing duplicates; if you need that (you likely do), then the body of the if statement needs to search through the current values in arr3 to check that the new value is not present. It would be wise to add another parameter, int siz3, to indicate the size of the third array; if you run out of space for values, you could then return -1 as an error indication.
The coded algorithm is quadratic (or, more accurately, proportional to the product num1 * num2). If you know the arrays are sorted on entry, you can reduce it to a linear algorithm (proportional to num1 + num2). With duplicate elimination, it is a little more expensive - it isn't quite as simple as 'cubic'. If you know the input arrays contain unique values (no duplicates), then duplicate elimination is obviously not necessary.
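For the sorted case, the linear approach can be sketched as a merge-style walk over both arrays. The sketch assumes both inputs are sorted in ascending order; the function name intersectSorted and the outSize parameter (playing the role of siz3 above) are choices made for this example:

```c
/* Intersection of two int arrays sorted in ascending order.
 * Writes each common value once into out and returns how many were written,
 * or -1 if out (capacity outSize) is too small. */
int intersectSorted(const int *arr1, int num1,
                    const int *arr2, int num2,
                    int *out, int outSize) {
    int i = 0, j = 0, k = 0;
    while (i < num1 && j < num2) {
        if (arr1[i] < arr2[j]) {
            i++;                         /* arr1[i] cannot appear later in arr2 */
        } else if (arr1[i] > arr2[j]) {
            j++;                         /* arr2[j] cannot appear later in arr1 */
        } else {
            /* common value: emit it unless it was the last value emitted */
            if (k == 0 || out[k - 1] != arr1[i]) {
                if (k == outSize)
                    return -1;           /* out of space */
                out[k++] = arr1[i];
            }
            i++;
            j++;
        }
    }
    return k;
}
```

Each step advances at least one of the two indexes, so the loop runs at most num1 + num2 times. Because the inputs are sorted, duplicates are adjacent, and comparing against the last emitted value is enough to suppress them.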
for(i=0;i<*(str+i);i++){
for(j=0;j<*(str2+j);j++){
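These are wrong. Each condition compares the loop index with the array element at that index, not with the length of the array, so the loop stops as soon as an element is not greater than its index (or runs past the end of the array if that never happens).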
And why are you using these redundant statements?
temp = *(str+i);
*(str+i) = temp;
Also, this is wrong
temp = *(c+i);
return c[i];
Try again to correct those statements. If you still can't, I will provide you with a solution.
