Reduce execution time of a code that uses binary search

The problem is to create an array of player ranks based on two other arrays: the leaderboard and the player scores. A fuller explanation of the problem is here: https://www.hackerrank.com/challenges/climbing-the-leaderboard/problem.
The code below is spaghetti, but it works. For a large ranked array (200000 elements, for example), however, it times out. I'm not asking for code to copy and paste; I just want to know if there is a way to optimize this code.
int* climbingLeaderboard(int ranked_count, int* ranked, int player_count, int* player, int* result_count) {
*result_count=player_count;
// remove duplicates
int removed=0;
for(int i=0, j=1; i<ranked_count-removed; i++, j++){
if(ranked[i]==ranked[j]){
for(int k=j; k<ranked_count-removed; k++)
ranked[k]=ranked[k+1];
removed++;
}
}
int newsize=ranked_count-removed;
// create an array to store ranks then fill it
int* positions=malloc(newsize*sizeof(int));
positions[0]=1;
for(int i=0, j=1; j<newsize; i++, j++){
positions[j]=(ranked[j]<ranked[i])? (positions[i]+1) : positions[i];
}
// create and fill the results array using binary search
int* res = malloc(player_count*sizeof(int));
int start=0, end=newsize-1, middle=(start+end)/2;
int j, k=newsize-1;
for(int i=0; i<player_count; i++){
if(i>0&&player[i]==player[i-1]){
*(res+i)=(*(res+(i-1)));
continue;
}
if(player[i]>=ranked[middle]){
*(res+i)=positions[middle];
j=middle-1;
while(j>=0){
if(player[i]>=ranked[j])
*(res+i)=positions[j];
else if(j==k)
*(res+i)=positions[j]+1;
else break;
--j;
}
start=0; end=middle-1;
}
else{
*(res+i)=positions[newsize-1]+1;
j=newsize-1;
while(j>=middle){
if(player[i]>=ranked[j])
*(res+i)=positions[j];
else if(j==k)
*(res+i)=positions[j]+1;
else break;
--j;
}
start=middle+1; end=newsize-1;
}
middle=(start+end)/2;
}
free(positions);
return res;
}

The initial loop that removes duplicates has potentially quadratic time complexity. You can achieve linear complexity using the two-finger approach:
int removed = 0;
for (int i = 1, j = 1; j < ranked_count; j++) {
if (ranked[i - 1] != ranked[j])
ranked[i++] = ranked[j];
else
removed++;
}
More generally, the argument arrays should not be changed in spite of the sloppy prototype given:
int *climbingLeaderboard(int ranked_count, int *ranked,
int player_count, int *player,
int *result_count);
Here are simple steps I would recommend to solve this problem:
1. Allocate and initialize a ranking array with the ranking for each of the scores in the ranked array. Be careful to allocate ranked_count + 1 elements.
2. Allocate a result array res of length player_count and set result_count to player_count.
3. Starting with pos = ranked_count, for each entry i in player:
   - locate the position pos where the entry would be inserted in the ranking array using binary search between position 0 and the current pos inclusive. Make sure you find the smallest entry in case of duplicate scores.
   - set res[i] to ranking[pos].
4. Free the ranking array.
5. Return the res array.
Here is a simple implementation:
int *climbingLeaderboard(int ranked_count, int *ranked,
int player_count, int *player,
int *result_count)
{
if (player_count <= 0) {
*result_count = 0;
return NULL;
}
int *ranking = malloc(sizeof(*ranking) * (ranked_count + 1));
int rank = 1;
ranking[0] = rank;
for (int i = 1; i < ranked_count; i++) {
if (ranked[i] != ranked[i - 1])
rank++;
ranking[i] = rank;
}
ranking[ranked_count] = rank + 1;
int *res = malloc(sizeof(*res) * player_count);
*result_count = player_count;
int pos = ranked_count;
for (int i = 0; i < player_count; i++) {
int start = 0;
while (start < pos) {
int middle = start + (pos - start) / 2;
if (ranked[middle] > player[i])
start = middle + 1;
else
pos = middle;
}
res[i] = ranking[pos];
}
free(ranking);
return res;
}

Look for ways to use branchless code to improve execution speed:
positions[0]=1;
for(int i=0, j=1; j<newsize; i++, j++){
positions[j]=(ranked[j]<ranked[i])? (positions[i]+1) : positions[i];
}
becomes
positions[0] = 1;
for( int i = 0, j = 1; j < newsize; i++, j++ )
positions[j] = positions[i] + (ranked[j] < ranked[i]);
Other than this, I don't even want to try to sort out what this code is attempting.

Related

Sorting integers by sum of their digits

I'm trying to write a program that will sort an array of 20 random numbers by the sums of their digits.
For example:
"5 > 11" because 5 > 1+1 (5 > 2).
I managed to sort the sums, but is it possible to get back to the original numbers, or to do it another way?
#include <stdio.h>
void sortujTab(int tab[], int size){
int sum,i;
for(int i=0;i<size;i++)
{
while(tab[i]>0){//sum as added digits of an integer
int p=tab[i]%10;
sum=sum+p;
tab[i]/=10;
}
tab[i]=sum;
sum=0;
}
for(int i=0;i<size;i++)//print of unsorted sums
{
printf("%d,",tab[i]);
}
printf("\n");
for(int i=0;i<size;i++)//sorting sums
for(int j=i+1;j<=size;j++)
{
if(tab[i]>tab[j]){
int temp=tab[j];
tab[j]=tab[i];
tab[i]=temp;
}
}
for(int i=0;i<20;i++)//print of sorted sums
{
printf("%d,",tab[i]);
}
}
int main()
{
int tab[20];
int size=sizeof(tab)/sizeof(*tab);
for(int i=0;i<=20;i++)
{
tab[i]=rand()%1000;// assamble the value
}
for(int i=0;i<20;i++)
{
printf("%d,",tab[i]);//print unsorted
}
printf("\n");
sortujTab(tab,size);
return 0;
}

There are two basic approaches:
1. Create a function that returns the digit sum of an integer, say sum(int a), and call it in the comparison, so that tab[i] > tab[j] becomes sum(tab[i]) > sum(tab[j]).
2. Store the sums in a separate array, compare using the new array, and when swapping, swap the elements of both the original and the new array.
The first solution works well enough if the array is small and takes no extra memory, while the second solution doesn't need to recalculate the sums repeatedly. A caching approach is also possible with a map, but it's only worth it if there are enough identical numbers in the array.
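As a rough sketch of the second approach, assuming non-negative inputs: the names digit_sum and sort_by_digit_sum are made up for this illustration, and the sort is the same simple exchange sort used in the question.

```c
#include <stdlib.h>

/* Sum of the decimal digits of a non-negative integer (hypothetical helper). */
static int digit_sum(int n)
{
    int sum = 0;
    while (n > 0) {
        sum += n % 10;
        n /= 10;
    }
    return sum;
}

/* Sort tab[] by digit sum, keeping a parallel array of precomputed sums.
 * Returns 0 on success, -1 if the allocation fails. */
int sort_by_digit_sum(int tab[], int size)
{
    if (size <= 0)
        return 0;
    int *sums = malloc(sizeof(*sums) * size);
    if (sums == NULL)
        return -1;
    for (int i = 0; i < size; i++)
        sums[i] = digit_sum(tab[i]);

    for (int i = 0; i < size; i++) {
        for (int j = i + 1; j < size; j++) {
            if (sums[i] > sums[j]) {
                /* swap the sums and the original values together */
                int t = sums[i]; sums[i] = sums[j]; sums[j] = t;
                t = tab[i]; tab[i] = tab[j]; tab[j] = t;
            }
        }
    }
    free(sums);
    return 0;
}
```

For the first approach, the sums array can be dropped entirely and the comparison written as digit_sum(tab[i]) > digit_sum(tab[j]), at the cost of recomputing the sums on every comparison.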

Since your numbers are non-negative and less than 1000, you can encode the sum of the digits in the number itself, using the formula encoded_number = original_number + 1000 * sum_of_the_digits. Then encoded_number/1000 decodes the sum of the digits, and encoded_number%1000 decodes the original number. See the modified code below. The numbers enclosed in parentheses in the output are the original numbers. I've tried to modify your code as little as possible.
#include <stdio.h>
#include <stdlib.h>
void sortujTab(int tab[], int size)
{
for (int i = 0; i < size; i++) {
int sum = 0, n = tab[i];
while (n > 0) { //sum as added digits of an integer
int p = n % 10;
sum = sum + p;
n /= 10;
}
tab[i] += sum * 1000;
}
for (int i = 0; i < size; i++) { //print of unsorted sums
printf("%d%c", tab[i] / 1000, i < size - 1 ? ',' : '\n');
}
for (int i = 0; i < size; i++) { //sorting sums
for (int j = i + 1; j < size; j++) {
if (tab[i] / 1000 > tab[j] / 1000) {
int temp = tab[j];
tab[j] = tab[i];
tab[i] = temp;
}
}
}
for (int i = 0; i < size; i++) { //print of sorted sums
printf("%d(%d)%c", tab[i] / 1000, tab[i] % 1000, i < size - 1 ? ',' : '\n');
}
}
int main(void)
{
int tab[20];
int size = sizeof(tab) / sizeof(*tab);
for (int i = 0; i < size; i++) {
tab[i] = rand() % 1000; // assamble the value
}
for (int i = 0; i < size; i++) {
printf("%d%c", tab[i], i < size - 1 ? ',' : '\n'); //print unsorted
}
sortujTab(tab, size);
return 0;
}
If the range of numbers doesn't allow such an encoding, then you can declare a structure with two integer elements (one for the original number and one for the sum of its digits), allocate an array for size elements of this structure, and initialize and sort the array using the digit sums as the keys.
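A minimal sketch of that structure-based variant might look like the following. The type name struct keyed and the functions compare_by_sum and sort_keyed are invented for this example, the caller is assumed to provide the output array, and the standard qsort() is used for the sorting step.

```c
#include <stdlib.h>

/* Hypothetical pair type: the original number and its digit sum. */
struct keyed {
    int value;
    int sum;
};

/* qsort() comparator: order by digit sum, ascending. */
static int compare_by_sum(const void *a, const void *b)
{
    const struct keyed *x = a, *y = b;
    return (x->sum > y->sum) - (x->sum < y->sum);
}

/* Fill out[] from tab[] (non-negative values assumed) and sort it by digit sum. */
void sort_keyed(const int tab[], struct keyed out[], int size)
{
    for (int i = 0; i < size; i++) {
        int sum = 0;
        for (int n = tab[i]; n > 0; n /= 10)
            sum += n % 10;
        out[i].value = tab[i];
        out[i].sum = sum;
    }
    qsort(out, size, sizeof(*out), compare_by_sum);
}
```

Note that qsort() is not guaranteed to be stable, so numbers with equal digit sums may come out in any relative order.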

You can sort an array of indexes rather than the array with data.
#include <stdio.h>
//poor man's interpretation of sumofdigits() :-)
int sod(int n) {
switch (n) {
default: return 0;
case 5: return 5;
case 11: return 2;
case 1000: return 1;
case 9: return 9;
}
}
void sortbyindex(int *data, int *ndx, int size) {
//setup default indexes
for (int k = 0; k < size; k++) ndx[k] = k;
//sort the indexes
for (int lo = 0; lo < size; lo++) {
for (int hi = lo + 1; hi < size; hi++) {
if (sod(data[ndx[lo]]) > sod(data[ndx[hi]])) {
//swap indexes
int tmp = ndx[lo];
ndx[lo] = ndx[hi];
ndx[hi] = tmp;
}
}
}
}
int main(void) {
int data[4] = {5, 11, 1000, 9};
int ndx[sizeof data / sizeof *data];
sortbyindex(data, ndx, 4);
for (int k = 0; k < sizeof data / sizeof *data; k++) {
printf("%d\n", data[ndx[k]]);
}
return 0;
}

Printing unique values of the array in C

I wrote a function creating a dynamic array of random values and another function creating a new array consisting of unique values of the previous array. The algorithm used counts unique values correctly. However, I faced a problem in printing all values. In the example below the program printed 7 2 12714320 4 5 instead of 7 2 4 5 6 .
This is the program which can be tested:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int *delduplicate(int *v, int size_old, int *size_new);
main()
{
int n;
int *norepeat;
float *results;
int dim, size_norepeat, i;
int a[7] = {7,2,2,4,5,6,7};
norepeat = delduplicate(a, 7, &size_norepeat);
for (int i = 0; i < size_norepeat; i++)
printf("%d ", norepeat[i]);
}
// delduplicate function
int *delduplicate(int *v, int size_old, int *size_new)
{
int i, j, k = 1, uniques = 1, repeats, *new_v, temp;
// count the number of unique elements
for (i = 1; i < size_old; i++)
{
int is_unique = 1;
for (j = 0; is_unique && j < i; j++)
{
if (v[i] == v[j])
is_unique = 0;
}
if (is_unique)
uniques++;
}
*size_new = uniques;
// create new array of unique elements
new_v = (int*) malloc(*size_new * sizeof(int));
// fill new array with unique elements
new_v[0] = v[0];
for (i = 1; i < size_old; i++)
{
int is_unique = 1;
for (j = 0; j < i; j++)
{
if (v[i] == v[j])
is_unique = 0;
}
if (is_unique)
new_v[k] = v[i];
k++;
}
return new_v;
}
The problem should be happening here:
// fill new array with unique elements
new_v[0] = v[0];
for (i = 1; i < size_old; i++)
{
int is_unique = 1;
for (j = 0; j < i; j++)
{
if (v[i] == v[j])
is_unique = 0;
}
if (is_unique)
new_v[k] = v[i];
k++;
}

Your problem is most likely in the following section:
if (is_unique)
new_v[k] = v[i];
k++;
Here you are incrementing k on every iteration, but you only want to increment it when you have found a unique element. An if() without braces only controls the first statement that follows it. So change it to this:
if (is_unique){
new_v[k] = v[i];
k++;
}
This change should make your program run fine.
Side note: if you do not want to use braces for an if(), for(), etc., you can join the statements with the comma operator so that they form a single statement, like this:
if (is_unique)
new_v[k] = v[i],
k++;

Why do these two bubble sort implementations differ significantly in runtime if they have similar numbers of steps and swaps?

I was trying to implement bubble sort in C. I've done so in this gist.
I implemented the algorithm laid out on Wikipedia's article for bubble sort, sort_bubble, and compared it to a reference implementation I found on github, bubble_sort:
typedef struct Bubble_Sort_Stats {
int num_swaps;
int num_steps;
} bubble_sort_stats_t;
bubble_sort_stats_t bubble_sort(int arr[], int n) {
bubble_sort_stats_t stats;
stats.num_swaps = 0;
stats.num_steps = 0;
int temp;
int i;
int j;
while (i < n) {
j = 0;
while (j < i) {
stats.num_steps++;
if (arr[j] > arr[i]) {
temp = arr[j];
arr[j] = arr[i];
arr[i] = temp;
stats.num_swaps++;
}
j++;
}
i++;
}
return stats;
}
bubble_sort_stats_t sort_bubble(int array[], int length_of_array) {
bubble_sort_stats_t stats;
stats.num_swaps = 0;
stats.num_steps = 0;
int n = length_of_array;
int new_n;
while (n >= 1) {
new_n = 0;
for (int i = 0; i < n - 1; i++) {
stats.num_steps++;
if (array[i] > array[i+1]) {
int l = array[i];
stats.num_swaps++;
new_n = i + 1;
array[i] = array[i + 1];
array[i + 1] = l;
}
}
n = new_n;
}
return stats;
}
#define BIG 10000
int main() {
int nums1[BIG], nums2[BIG];
for (int i = 0; i < BIG; i++) {
int newInt = rand() * BIG;;
nums1[i] = newInt;
nums2[i] = newInt;
}
long start, end;
bubble_sort_stats_t stats;
start = clock();
stats = bubble_sort(nums2, BIG);
end = clock();
printf("It took %ld ticks and %d steps to do %d swaps\n\n", end - start, stats.num_steps, stats.num_swaps);
start = clock();
stats = sort_bubble(nums1, BIG);
end = clock();
printf("It took %ld ticks and %d steps to do %d swaps\n\n", end - start, stats.num_steps, stats.num_swaps);
for (int i = 0; i < BIG; i++) {
if (nums1[i] != nums2[i]) {
printf("ERROR at position %d - nums1 value: %d, nums2 value: %d", i, nums1[i], nums2[i]);
}
if (i > 0) {
if (nums1[i - 1] > nums1[i]) {
printf("BAD SORT at position %d - nums1 value: %d", i, nums1[i]);
}
}
}
return 0;
}
Now when I run this program I get these results:
It took 125846 ticks and 49995000 steps to do 25035650 swaps
It took 212430 ticks and 49966144 steps to do 25035650 swaps
That is, the number of swaps is identical, and sort_bubble actually takes fewer steps, but it takes almost twice as long for this size of array!
My suspicion is that the difference has something to do with the control structure itself, the indices, something like that. But I don't really know enough about how the c compiler works to guess further, and I don't know how I would go about even determining this by debugging.
So I would like to know why but also how I could figure this out empirically.

Your bubble_sort isn't actually a bubble sort: it doesn't only compare adjacent pairs.
It is an insertion sort with the inner loop oddly reversed, which still works as intended. It can be rewritten as follows, without changing the number of steps or swaps.
bubble_sort_stats_t bubble_sort(int arr[], int n) {
bubble_sort_stats_t stats;
stats.num_swaps = 0;
stats.num_steps = 0;
for (int i = 1; i < n; i++) {
for (int j = i; j > 0; j--) {
stats.num_steps++;
if (arr[j-1] > arr[j]) {
int temp = arr[j];
arr[j] = arr[j-1];
arr[j-1] = temp;
stats.num_swaps++;
}
}
}
return stats;
}
To get a proper insertion sort from this, simply move the if condition into the inner loop, as follows.
bubble_sort_stats_t bubble_sort(int arr[], int n) {
bubble_sort_stats_t stats;
stats.num_swaps = 0;
stats.num_steps = 0;
for (int i = 1; i < n; i++) {
for (int j = i; j > 0 && arr[j-1] > arr[j]; j--) {
stats.num_steps++;
int temp = arr[j];
arr[j] = arr[j-1];
arr[j-1] = temp;
stats.num_swaps++;
}
}
return stats;
}
This way, you can see that the number of steps is actually equal to the number of swaps, and less than the number of steps of the actual bubble sort (sort_bubble).

Sorting one array into another- C

So I am trying to write this function where the input parameter array will be taken and copied into another array but in a sorted way. For example: an input parameter of 3, 1, 9, 8 will copy into the target array 1, 3, 8, 9.
This is what I have so far but it only copies the smallest element in every time. I'm looking for a way to "blacklist" smallest values that are discovered in each pass.
void sort_another_array(int *param, int *target, int size){
int i, j, lowest = param[0];
for(i = 0; i < size; i++){
for(j = 0; j < size; j++){
if(param[j] < lowest){
lowest = param[j]
}
}
target[i] = lowest;
}
}
Of course I could have another array of already found lowest values but that's more unnecessary looping and checking and adds to the already terrible n^2 complexity. Is there an easier way to do this?
I'm completely new to C, so please do restrict it to simple programming concepts of logic statements, using some flag variables etc..

Probably the most straightforward way to do this is to first copy the whole array and then sort the new array in place using a standard sorting algorithm.
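A short sketch of the copy-then-sort idea, assuming the standard library is allowed (the question asks for simple concepts, so this may go beyond what is wanted). The helper name compare_ints is made up here, and param is declared const only to stress that the input is not modified.

```c
#include <stdlib.h>
#include <string.h>

/* qsort() comparator for ints, ascending (hypothetical helper name). */
static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Copy param[] into target[], then sort target[] in place. */
void sort_another_array(const int *param, int *target, int size)
{
    if (size <= 0)
        return;
    memcpy(target, param, sizeof(*param) * size);
    qsort(target, size, sizeof(*target), compare_ints);
}
```

This handles duplicates without any special casing, since the sort works on the copy as a whole.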
However, if you want to keep the current structure, the following would be an alternative when all elements are unique:
#include <limits.h> // for INT_MAX
void sort_another_array(int *param, int *target, int size) {
int i, j, past_min = INT_MAX, current_min = INT_MAX;
for (i = 0; i < size; ++i) {
for (j = 0; j < size; ++j) {
if (i == 0 || param[j] > past_min) {
if (past_min == current_min || param[j] < current_min) {
current_min = param[j];
}
}
}
target[i] = current_min;
past_min = current_min;
}
}
This keeps track of the lowest element found in the previous pass (past_min). The next element to find is the lowest among all elements greater than past_min, i.e., we want both param[j] > past_min and param[j] < current_min to be true. However, the first element added to target (i.e., when i == 0) has no lower element before it, so we add an exception for that. Similarly, the first element satisfying param[j] > past_min in a pass has nothing to compare against yet, so we add another exception using past_min == current_min (this is true only for the first element found in a pass).
If you have duplicates in the array, this might work:
void sort_another_array(int *param, int *target, int size) {
int j, past_min, current_min, write = 0, round_write = 0;
while (round_write != size) {
for (j = 0; j < size; ++j) {
if (round_write == 0 || param[j] > past_min) {
if (write == round_write || param[j] < current_min) {
current_min = param[j];
write = round_write;
target[write] = current_min;
++write;
} else if (param[j] == current_min) {
target[write] = current_min;
++write;
}
}
}
round_write = write;
past_min = current_min;
}
}
Basically it's the same idea, but it writes all elements of the minimum value in the same pass.

You can use a modified insertion sort algorithm to solve this problem:
#include <stdio.h>
void sort_another_array(int *param, int *target, int size)
{
for ( int i = 0; i < size; i ++ ) // do for all elements in param
{
int j = i - 1;
while ( j >= 0 && target[j] > param[i] ) // stop at the first element of target that is smaller than or equal to param[i]
{
target[j+1] = target[j]; // shift forward element of target which is greater than param[i]
j --;
}
target[j+1] = param[i]; // insert param[i] into target
}
}
#define SIZE 10
int main( void )
{
int a[SIZE] = { 9, 8, 0, 2, 1, 3, 4, 5, 7, 6 };
int b[SIZE];
sort_another_array( a, b, SIZE );
for ( int i = 0; i < SIZE; i ++ )
printf( "%2d", b[i] );
return 0;
}

The solution below has a limitation: it only works if there are no duplicates in the array.
void sort_another_array(int *param, int *target, int size)
{
int i, j, lowest;
for(i = 0; i < size; i++)
{
int k = 0;
if( i > 0) // for all except first iteration
{
while(param[k] <= target[i-1]) // find the one greater than the last one added
k++;
}
lowest = param[k];
for(j = 1; j < size; j++)
{
if( ( i==0 && param[j] < lowest ) || ( i > 0 && param[j] < lowest && param[j] > target[i-1])) // for all except first iteration the min found should be greater than the last one found
{
lowest = param[j];
}
}
target[i] = lowest;
}
}

array bucket sort in C

I am trying to read a list of numbers from a txt file and then sort them with bucket sort.
Here is my code:
void bucketSort(int array[],int *n)
{
int i, j;
int count[*n];
for (i = 0; i < *n; i++)
count[i] = 0;
for (i = 0; i < *n; i++)
(count[array[i]])++;
for (i = 0, j = 0; i < *n; i++)
for(; count[i] > 0; (count[i])--)
array[j++] = i;
}
int main(int brArg,char *arg[])
{
FILE *ulaz;
ulaz = fopen(arg[1], "r");
int array[100];
int i=0,j,k,n;
while(fscanf(ulaz, "%d", &array[i])!=EOF)i++;
fclose(ulaz);
n=i;
for (j = 0; j<i; j++)
{
printf("Broj: %d\n", array[j]);
}
BucketSort(array,&n);
for (k = 0; k<i; k++)
printf("%d \n", array[i]);
return 0;
}
There are no errors in the code, but when I call my function, instead of a sorted array I get the same random number repeated once per element (for example, for the input 2 3 5 4 I get 124520 124520 124520 124520 or some other random number after sorting). Since I am a beginner, could someone help me with my code and tell me what I did wrong? (Sorry for my bad English.)

As Cool Guy correctly pointed out, you have issues with memory access, but on top of that the code does not sort anything. First you should read up on how bucket sort actually works.
In general:
1. Divide the input data among buckets by some criterion that guarantees the buckets themselves are ordered, so that concatenating them cannot put values out of order.
2. Sort each bucket, either using some other sorting method or recursively with bucket sort.
3. Concatenate the sorted data (this is why the first point has the ordering restriction).
Here is an example based on your original code; I tried to adjust it as little as possible so it is easier for you to understand. This code divides a predefined input array among 3 buckets by range:
[-infinity;-1] -> first bucket
[0;10] -> second bucket
[11;infinity] -> third bucket
then sorts each bucket with qsort() and concatenates the results. I hope this helps you understand how the algorithm works.
#include <stdio.h>
#include <stdlib.h>
struct bucket
{
int count;
int* values;
};
int compareIntegers(const void* first, const void* second)
{
int a = *((int*)first), b = *((int*)second);
if (a == b)
{
return 0;
}
else if (a < b)
{
return -1;
}
else
{
return 1;
}
}
void bucketSort(int array[],int n)
{
struct bucket buckets[3];
int i, j, k;
for (i = 0; i < 3; i++)
{
buckets[i].count = 0;
buckets[i].values = (int*)malloc(sizeof(int) * n);
}
// Divide the unsorted elements among 3 buckets
// < 0 : first
// 0 - 10 : second
// > 10 : third
for (i = 0; i < n; i++)
{
if (array[i] < 0)
{
buckets[0].values[buckets[0].count++] = array[i];
}
else if (array[i] > 10)
{
buckets[2].values[buckets[2].count++] = array[i];
}
else
{
buckets[1].values[buckets[1].count++] = array[i];
}
}
for (k = 0, i = 0; i < 3; i++)
{
// Use Quicksort to sort each bucket individually
qsort(buckets[i].values, buckets[i].count, sizeof(int), &compareIntegers);
for (j = 0; j < buckets[i].count; j++)
{
array[k + j] = buckets[i].values[j];
}
k += buckets[i].count;
free(buckets[i].values);
}
}
int main(int brArg,char *arg[]) {
int array[100] = { -5, -9, 1000, 1, -10, 0, 2, 3, 5, 4, 1234, 7 };
int i = 12,j,k,n;
n=i;
for (j = 0; j<i; j++)
{
printf("Broj: %d\n", array[j]);
}
bucketSort(array, n);
for (k = 0; k<i; k++)
printf("%d \n", array[k]);
return 0;
}

Your code exhibits undefined behavior because it writes to memory locations that your program does not own.
for (i = 0; i < *n; i++)
(count[array[i]])++;
The above loop is causing the problem. You say that i is 4, which means that *n is also 4, and array contains 2 3 5 4. In the above code, count is an array of *n elements (in this case 4 elements), so the only valid elements are count[0], count[1], count[2] and count[3]. Doing
count[array[i]]
when i is zero is okay, as it is the same as count[2]. The same holds when i is 1, as it is count[3]. After that, when i is 2 and 3, the accesses are count[5] and count[4], which are wrong because you write to invalid memory locations.
Also, your code doesn't sort the values.
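What the question's function attempts is closer to a counting sort than a bucket sort. As a hedged sketch of how the out-of-bounds access could be avoided, the version below sizes the count array by the largest allowed value rather than by the number of elements. The name counting_sort and the max_value parameter are assumptions made for this example, and it only handles values in the range 0..max_value.

```c
#include <stdlib.h>

/* Counting sort for values in the range 0..max_value (inclusive).
 * Returns 0 on success, -1 on out-of-range input or allocation failure. */
int counting_sort(int array[], int n, int max_value)
{
    if (n <= 0)
        return 0;
    if (max_value < 0)
        return -1;
    /* one counter per possible value, not one per element */
    int *count = calloc((size_t)max_value + 1, sizeof(*count));
    if (count == NULL)
        return -1;
    for (int i = 0; i < n; i++) {
        if (array[i] < 0 || array[i] > max_value) {
            free(count);
            return -1;
        }
        count[array[i]]++;
    }
    /* walk the values in order and write each one back count[v] times */
    for (int v = 0, j = 0; v <= max_value; v++)
        while (count[v]-- > 0)
            array[j++] = v;
    free(count);
    return 0;
}
```

Note also that the final loop in the question prints array[i] instead of array[k]; since i equals the element count at that point, it reads one element past the data that was read in, which is consistent with the same number being printed on every line.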
