Optimization of program processing structured input for large data


I have this task. To make it clearer, I will use the picture below as an example; input and output are separated by a dotted line. The first line of input is the number N, the number of sets. For every set, the first line contains 2 numbers: the first says how many numbers I am going to process and the second is the number of intervals. The second line lists the numbers to process, and the third line contains 2 numbers X and Y, which define an interval. For every interval I have to output 3 numbers: the lowest number on the interval, the index of the highest number on the interval, and the XOR of all numbers on the interval. Everything runs fine, except that it is really slow for big data and I have no idea how to make it faster. I have attached my code and a large input file as well.
input.txt
#include <stdio.h>
#include <stdlib.h>
typedef struct {
int id;
int index;
} Censor;
int Xor(const int x, const int y, const Censor array[]) {
int xor = array[x].id;
if (x == y) {
return xor;
}
for (int i = x + 1; i <= y; i++) {
xor ^= array[i].id;
}
return xor;
}
int int_cmp(const void *a, const void *b) {
const Censor *ia = (const Censor *)a;
const Censor *ib = (const Censor *)b;
return (ia->id - ib->id);
}
int LowestId(const int x, const int y, Censor array[]) {
int id = array[x].id;
if (x == y) {
return id;
}
qsort(array, y - x + 1, sizeof(Censor), int_cmp);
return array[0].id;
}
int HighestIdIndex(const int x, const int y, Censor array[]) {
int index = array[x].index;
if (x == y) {
return index;
}
qsort(array, y - x + 1, sizeof(Censor), int_cmp);
return array[y].index;
}
int main() {
int t, n, q, b, e;
int max = 100;
int count = 0;
int *output = (int *)malloc(max * sizeof(output));
scanf("%d", &t); //number of sets
for (int i = 0; i < t; i++) {
scanf("%d %d", &n, &q);
//I am making 3 separate arrays for numbers, because some of them are being sorted and some of them not
Censor lowest_id[n];
Censor highest_id_index[n];
Censor xor[n];
//This loop fills arrays with the numbers to be processed
for (int j = 0; j < n; j++) {
scanf("%d", &(lowest_id[j].id));
lowest_id[j].index = j;
highest_id_index[j].id = lowest_id[j].id;
highest_id_index[j].index = j;
xor[j].id = lowest_id[j].id;
xor[j].index = j;
}
// Now I am scanning intervals and creating output. Output is being stored in one dynamically allocated array.
for (int k = 0; k < q; k++) {
scanf("%d %d", &b, &e);
if (count + 3 >= max) {
max *=2;
int *tmp = (int *)realloc(output, max * sizeof(tmp));
if (tmp == NULL) {
return 1;
} else {
output = tmp;
}
}
output[count++] = LowestId(b, e, lowest_id);
output[count++] = HighestIdIndex(b, e, highest_id_index);
output[count++] = Xor(b, e, xor);
}
}
printf("---------------------\n");
for (int i = 0; i < count; i++) {
printf("%d\n", output[i]);
}
free(output);
return 0;
}

Thanks to @Dan Mašek and @Alex Lop. Sorting the subarray was unnecessary in this case. It is much easier to iterate through the subarray once, in linear time.
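The linear pass mentioned above could look roughly like the sketch below. The names IntervalResult and scan_interval are hypothetical (they are not in the original code), the interval is assumed to be 0-based and inclusive with x <= y, and on ties the first occurrence of the highest number is reported, which the task description does not specify.

```c
#include <stddef.h>

/* Hypothetical result type: the three values the task asks for per interval. */
typedef struct {
    int lowest;          /* smallest value in values[x..y] */
    size_t highest_idx;  /* index of the largest value in values[x..y] */
    int xor_all;         /* XOR of all values in values[x..y] */
} IntervalResult;

/* One pass over values[x..y] (inclusive, 0-based, x <= y assumed).
   Nothing is sorted or copied, so the input array stays intact. */
IntervalResult scan_interval(const int values[], size_t x, size_t y)
{
    IntervalResult r = { values[x], x, values[x] };
    for (size_t i = x + 1; i <= y; i++) {
        if (values[i] < r.lowest) {
            r.lowest = values[i];
        }
        if (values[i] > values[r.highest_idx]) {
            r.highest_idx = i;  /* strict '>' keeps the first maximum on ties */
        }
        r.xor_all ^= values[i];
    }
    return r;
}
```

Because the array is never reordered, a single copy of the input is enough, and the three separate Censor arrays from the question are no longer needed. Each query costs O(y - x + 1) instead of the cost of two sorts.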

Related

Finding Prime and Composite Elements in an array. Print primes in ascending order and composite in descending order

I am successful in identifying prime and composite from an array. But my qsort calls seem to have no effect when I print the output. I need the primes in ascending order and the composites in descending order. When I run the code, it identifies primes and composites, but it does not sort the output.
#include <stdio.h>
#include <stdlib.h>
int compare_Asc(const void *a_void, const void *b_void) {
int a = *(int *)a_void;
int b = *(int *)b_void;
return a - b;
}
int compare_Desc(const void *a_void, const void *b_void) {
int a = *(int *)a_void;
int b = *(int *)b_void;
return b - a;
}
int main() {
int i = 0, n, x, p, c, z, w, j = 0, k = 0, cmpst, null;
int prm;
int prime[50], composite[50], input[50];
printf("How many inputs are you be working with?\nNote: 50 Maximum Inputs\n");
scanf("%d", &n);
printf("Enter the numbers.\n", n);
for (i = 0; i < n; i++) {
scanf("%d", &input[i]);;
}
for (i = 0; i < n; i++) {
if (input[i] % 2 != 0) {
prime[p++] = input[i];
prm = p;
} else
if (input[i] >= 2 && input[i] % 2 == 0) {
composite[c++] = input[i];
cmpst = c;
}
}
printf("Prime Numbers:");
qsort(prime, prm, sizeof(int), compare_Asc);
for (i = 0; i < p; i++) {
printf("%d", prime[p]);
}
printf("Composite Numbers:");
qsort(composite, cmpst, sizeof(int), compare_Desc);
for (i = 0; i < c; i++) {
printf("%d", composite[c]);
}
return 0;
}

There are some major issues in the posted code worth mentioning.
Variables
Declaring all the variables at the beginning of the scope, instead of just before they are used, can hide bugs.
Uninitialized variables are an even worse source of errors, because their values are indeterminate.
int i=0, n, x, p, c, z, w, j=0, k=0, cmpst, null;
// ^ ^ ^^^^ ?
// ... Later, in the code:
prime[p++] = input[i];
// ^^^ What value is incremented?
// Where is [p++]? Is it inside the prime array?
A correct initialization would prevent undefined behavior.
int p = 0, c = 0;
int prime[50], composite[50], input[50];
for(int i = 0; i < n ; ++i) {
if ( is_prime(input[i]) ) { // <-- More on this, later.
prime[p++] = input[i];
}
else {
composite[c++] = input[i];
}
}
Loops
This happens a couple of times, just because the code itself is duplicated (another code smell):
for(i=0;i<p;i++){
// ^^^^^^^^^^^ We want to iterate over [0, p).
printf("%d",prime[p]);
// ^ But this always prints the element one past the end
}
Even if it's just a simple loop, it could be a good idea to write a (testable and reusable) function:
void print_arr(size_t n, int arr[n])
{
for (size_t i = 0; i < n; ++i) {
printf("%d ", arr[i]);
} // ^
putchar('\n');
}
// ... Later, in main:
print_arr(p, prime);
print_arr(c, composite);
Primes or composite
I am successful in identifying prime and composite from an array
Well, no. Not with this code, I'm sorry.
if (input[i]%2 != 0) { // Those are ALL the ODD numbers!
prime[p++]=input[i];
}
else if(input[i]>=2 && input[i]%2==0){ // Those are the EVEN numbers greater than 0
composite[c++]=input[i];
}
// What about 0 and the even numbers less than 0?
Not all odd numbers are prime (it's a little more complicated than that), and 2 itself is a prime, not a composite.
It's unclear to me if this is a terminology issue or if the snippet is only a placeholder for a proper algorithm. In any case, there are multiple examples of primality test functions on SE sites (I'm quite confident some are posted almost every day).
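For illustration, here is a minimal trial-division sketch of the is_prime function used in the loop earlier in this answer. It is only one possible implementation, not necessarily the one the answer had in mind, and it is meant for small inputs like the ones in the question.

```c
#include <stdbool.h>

/* Trial division: n is prime if it is at least 2 and has no divisor d
   with d * d <= n. The bound is written as d <= n / d to avoid overflow. */
bool is_prime(int n)
{
    if (n < 2) {
        return false;   /* 0, 1 and negative numbers are not prime */
    }
    if (n < 4) {
        return true;    /* 2 and 3 */
    }
    if (n % 2 == 0) {
        return false;   /* even numbers greater than 2 */
    }
    for (int d = 3; d <= n / d; d += 2) {
        if (n % d == 0) {
            return false;
        }
    }
    return true;
}
```

Note that numbers below 2 are neither prime nor composite, so a caller that wants a strict split may need a third case for them instead of a plain else branch.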
Overflow risk
See chux - Reinstate Monica's comment:
return a-b; risks overflow when a, b are large int values.
Consider return (a > b) - (a < b); for a full range solution.

Single-letter variable names are best avoided, except for i, j and k used as for() loop counters.
The print loops index the arrays with the counts p and c instead of the loop variable i, so they never print the stored elements. The arrays are being sorted fine.
In the code below I also remove the redundant variables and rename n to input_count, c to compo_count and p to prime_count.
#include <stdio.h>
#include <stdlib.h>
int compare_Asc(const void *a_void, const void *b_void)
{
int a = *(int *) a_void;
int b = *(int *) b_void;
return a - b;
}
int compare_Desc(const void *a_void, const void *b_void)
{
int a = *(int *) a_void;
int b = *(int *) b_void;
return b - a;
}
int main ()
{
int i = 0;
int input_count = 0;
int prime_count = 0;
int compo_count = 0;
int prime[50];
int composite[50];
int input[50];
printf("How many inputs are you be working with?\nNote: 50 Maximum Inputs\n");
scanf("%d", &input_count);
printf("Enter the %d numbers.\n", input_count);
for (i = 0; i < input_count; i++)
{
scanf("%d", &input[i]);
}
for (i = 0; i < input_count; i++)
{
if (input[i] % 2 != 0)
{
prime[prime_count] = input[i];
prime_count += 1;
}
else if (input[i] >= 2 && input[i] % 2 == 0)
{
composite[compo_count] = input[i];
compo_count += 1;
}
}
printf("Prime Numbers:");
qsort(prime, prime_count, sizeof(int), compare_Asc);
for (i = 0; i < prime_count; i++)
{
printf("%d ", prime[i]); // <<-- HERE, not [p]
}
printf( "\n" );
printf ("Composite Numbers:");
qsort(composite, compo_count, sizeof(int), compare_Desc);
for (i = 0; i < compo_count; i++)
{
printf("%d ", composite[i]); // <<-- HERE, not [c]
}
printf( "\n" );
return 0;
}

Can a nested for loop in C recursively increase in depth times a user given integer?

I would like to avoid hard-coding nested for loops, since the nesting depth should be given by the user as an integer.
So if the user enters 3, the loops should be nested as in the example below. If the user enters 6, there should be three more loops inside.
#include <stdio.h>
int main(void)
{
// int depth_lvl = 3
char n[] = {'a','b','c'};
int i,j,y;
int x = sizeof(n);
for(i = 0; i < x; i++)// <---- LEVEL 1
{
printf("%c\n",n[i]);
for(j = 0; j < x; j++)// <---- LEVEL 2
{
printf("%c%c\n",n[i],n[j]);
for(y = 0; y < x; y++) // <---- LEVEL 3
{
printf("%c%c%c\n",n[i],n[j],n[y]);
}
}
}
}

Is something like this what you are looking for?
The solution uses recursion together with an intermediate result string at each level, which carries the state of the current level over to the next deeper level.
#include <stdio.h>
#include <string.h>
#define MAX_DEPTH 6
void printRecursive(char n[], int x, int curDepth, char* result)
{
// note: x is supposed to be sizeof(n).
if (x > MAX_DEPTH) // prohibit overflow of intermediateResult
x = MAX_DEPTH;
if (curDepth < x) {
char intermediateResult[MAX_DEPTH+1];
if (result)
strcpy(intermediateResult,result);
else
strcpy(intermediateResult, "");
for (int i=0;i<x;i++) {
intermediateResult[curDepth] = n[i];
intermediateResult[curDepth+1] = '\0';
printRecursive(n,x,curDepth+1,intermediateResult);
}
}
if (curDepth > 0)
printf("%s\n", result);
}
int main(void)
{
char n[] = {'a','b','c', 'd'};
int x = sizeof(n);
printRecursive(n, x, 0, NULL);
return 0;
}
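As a variation on the same recursive idea, the sketch below takes the depth as a separate argument instead of tying it to the array size, and prints each prefix before descending, which matches the output order of the nested loops in the question. The names print_nested and print_levels and the MAX_DEPTH value of 16 are assumptions of this sketch. The line count is returned only to make the function easy to check.

```c
#include <stdio.h>

#define MAX_DEPTH 16  /* assumed upper bound on the requested depth */

/* Fills prefix[level] with every symbol in turn, prints the prefix, then
   recurses one level deeper. Returns the number of lines printed. */
static unsigned long print_levels(const char *symbols, size_t count,
                                  int depth, int level, char *prefix)
{
    unsigned long lines = 0;
    if (level >= depth) {
        return 0;  /* requested nesting depth reached */
    }
    for (size_t i = 0; i < count; i++) {
        prefix[level] = symbols[i];
        prefix[level + 1] = '\0';
        puts(prefix);
        lines += 1 + print_levels(symbols, count, depth, level + 1, prefix);
    }
    return lines;
}

/* Behaves like `depth` nested loops over symbols[0..count-1]. */
unsigned long print_nested(const char *symbols, size_t count, int depth)
{
    char prefix[MAX_DEPTH + 1];
    if (depth > MAX_DEPTH) {
        depth = MAX_DEPTH;  /* keep prefix within its buffer */
    }
    return print_levels(symbols, count, depth, 0, prefix);
}
```

With the question's array, a call such as print_nested(n, sizeof n, 3) would print a, aa, aaa, aab, aac, ab, and so on, and the depth could come from user input instead of the constant 3.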

Bubble sort in C: Function not changing array data

Code:
#include <stdio.h>
void testSort(int values[], int n);
int main(void)
{
int hs[] = {5,3,2,1,4};
printf("Unsorted: %i %i %i %i %i\n", hs[0], hs[1], hs[2], hs[3], hs[4]);
testSort(hs, 5);
printf("Sorted: %i %i %i %i %i\n", hs[0], hs[1], hs[2], hs[3], hs[4]);
}
void testSort(int values[], int n)
{
for (int i = 0; i < n-1; i++)
{
int hold;
int current = values[i];
int next = values[i + 1];
if (current > next)
{
hold = current;
current = next;
next = hold;
}
}
return;
}
I'm trying to do a bubble sort, and right now it goes through the array once. My question is: why isn't hs[] updated after calling the function? The second printf shows that it remained the same.
EDIT:
As mentioned, it turns out I was changing data, but only in copies. For some reason, when I created the variables current/next, I felt as if they represented values[i]/values[i+1], but in reality I was just creating new variables: for example, the value of values[0], which is 5, was copied into current. That obviously left values[] unchanged. Thanks, everyone.

The problem is that you're only modifying the function's local variables, not the array's elements.
It's the same principle as why this program will print 1 and not 2:
#include <stdio.h>
int main(void)
{
int array[] = {1};
int x = array[0];
x = 2;
printf("array[0] = %d\n", array[0]);
return 0;
}
You need to assign values to the array's elements:
void testSort(int values[], int n)
{
for (int i = 0; i < n-1; i++)
{
if (values[i] > values[i+1])
{
int hold = values[i];
values[i] = values[i+1];
values[i+1] = hold;
}
}
}
Once you've fixed this, you will notice that this function only works for some inputs.
Solving that bug is left as an exercise.

Try the code below:
void bubble_sort(int list[], int n){
int c, d, t;
for (c = 0 ; c < ( n - 1 ); c++)
{
for (d = 0 ; d < n - c - 1; d++)
{
if (list[d] > list[d+1])
{
t = list[d];
list[d] = list[d+1];
list[d+1] = t;
}
}
}
}

How do I find distance between couple of points (x, y) from origin, and then sort the points, who is closest to (0, 0)?

I need to enter a number of points (x, y) and then sort the points from the one closest to (0, 0) to the one farthest away. For example:
Enter number of points: 3
Enter point: 1 6
Enter point: 2 5
Enter point: 4 4
Sorted points:(2,5) (4,4) (1,6)
I wrote a function that finds the distance, and I store the distance of each point (computed from its x and y coordinates) in an array. I want to use merge sort to sort that array. My problem is how to get back to the actual x and y coordinates and print them after sorting. What can I do? I thought of putting the coordinates in an array and sorting them, but that won't work.
(I haven't learned struct yet, so I can't use it unless there is no other way.)
Can anyone please help? I really have no idea how to continue.
#include <stdio.h>
#include <stdlib.h>
void Enter_numbers(int x,int *z,int *first_coordinate,int *second_coordinate);
int distance(int a,int b);
void merge(int a[], int na, int b[], int nb, int c[]);
int merge_sort(int ar[], int n);
int main()
{
int x;
int *z;
int *first_coordinate;
int *second_coordinate;
printf("Enter number of points: ");
scanf("%d",&x);
z=(int*)malloc(x*sizeof(int));
first_coordinate=(int*)malloc(x*sizeof(int));
second_coordinate=(int*)malloc(x*sizeof(int));
Enter_numbers(x,z,first_coordinate,second_coordinate);
free(z);
free(first_coordinate);
free(second_coordinate);
return 0;
}
int distance(int a,int b)
{
int dis;
dis=((a*a)+(b*b));
return dis;
}
void Enter_numbers(int x,int *z,int *first_coordinate,int *second_coordinate)
{
int a=0,b=0;
int i=0;
int diss=0;
while(x>0)
{
printf("Enter points: ");
scanf("%d %d",&a,&b);
diss=distance(a,b);
z[i]=diss;
first_coordinate[i]=a;
second_coordinate[i]=b;
++i;
x--;
}
}
And here is the merge sort function I will use once I figure out what to do:
int merge_sort(int ar[], int n)
{
int len;
int *temp_array, *base;
temp_array = (int*)malloc(sizeof(int)*n);
if(temp_array == NULL) {
printf("Dynamic Allocation Error in merge_sort");
return FAILURE;
}
for (len = 1; len < n; len *= 2) {
for (base = ar; base < ar + n; base += 2 * len) {
merge(base, len, base + len, len, temp_array);
memcpy(base, temp_array, 2*len*sizeof(int));
}
}
free(temp_array);
return SUCCESS;
}
And here is merge:
void merge(int a[], int na, int b[], int nb, int c[])
{
int ia, ib, ic;
for(ia = ib = ic = 0; (ia < na) && (ib < nb); ic++)
{
if(a[ia] < b[ib]) {
c[ic] = a[ia];
ia++;
}
else {
c[ic] = b[ib];
ib++;
}
}
for(;ia < na; ia++, ic++) c[ic] = a[ia];
for(;ib < nb; ib++, ic++) c[ic] = b[ib];
}

I would use a struct to solve this task.
If you haven't learned struct yet, this seems to be a good time to learn it.
Note: If you really can't use struct, see the last part of the answer.
With struct it could be something like:
#include <stdio.h>
#include <stdlib.h>
typedef struct
{
int x;
int y;
int squared_distance;
} dpoint;
int squared_dst(int x, int y)
{
return (x*x + y*y);
}
// Compare function used for sorting
int compare_dpoint_dst(const void * e1, const void * e2)
{
const dpoint *p1 = (const dpoint *)e1;
const dpoint *p2 = (const dpoint *)e2;
if (p1->squared_distance > p2->squared_distance) return 1;
if (p1->squared_distance < p2->squared_distance) return -1;
return 0;
}
void print_dpoint(dpoint dp)
{
printf("(%d, %d) : sd = %d\n", dp.x, dp.y, dp.squared_distance);
}
#define N 5
int main(void) {
// Array of points (fixed size for simplicity)
dpoint ps[N];
// Dummy input (for simplicity)
int x[N] = {1,5,2,3,4};
int y[N] = {9,3,7,1,3};
for (int i = 0; i < N; ++i)
{
ps[i].x = x[i];
ps[i].y = y[i];
}
// Calculate squared distance for all points
for (int i = 0; i < N; ++i)
{
ps[i].squared_distance = squared_dst(ps[i].x, ps[i].y);
}
printf("unsorted:\n");
for (int i = 0; i < N; ++i)
{
print_dpoint(ps[i]);
}
// Sort the points
qsort (ps, sizeof(ps)/sizeof(*ps), sizeof(*ps), compare_dpoint_dst);
printf("sorted:\n");
for (int i = 0; i < N; ++i)
{
print_dpoint(ps[i]);
}
return 0;
}
Notice that you can do the sorting on the squared distance so that you don't need square root in the program.
The program above will generate:
unsorted:
(1, 9) : sd = 82
(5, 3) : sd = 34
(2, 7) : sd = 53
(3, 1) : sd = 10
(4, 3) : sd = 25
sorted:
(3, 1) : sd = 10
(4, 3) : sd = 25
(5, 3) : sd = 34
(2, 7) : sd = 53
(1, 9) : sd = 82
No use of struct
If for some reason you can't use struct, you can use a shadow array to track the sorting, but you'll have to write your own sort. I don't recommend this approach; learn about struct instead. Anyway, it could be something like:
int x[N];
int y[N];
int sd[N]; // squared distance
int sw[N]; // swap order
// read input and calculate distance
// ...
// Fill sw with 0, 1, 2, ....
for (int i=0; i < N; ++i) sw[i] = i;
mySort(sd, sw, N);
// Now you can use sw for printing
for (int i=0; i < N; ++i)
{
// print element sw[i]
printf("(%d,%d)\n", x[sw[i]], y[sw[i]]);
}
}
void mySort(int sd[], int sw[], int N)
{
// .... code for sorting
// ....
// Assume that you need to swap element i and j here
temp = sd[i];
sd[i] = sd[j];
sd[j] = temp;
// Then do exactly the same for sw
temp = sw[i];
sw[i] = sw[j];
sw[j] = temp;
// ....
// ....
}
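To make the shadow-array idea concrete, here is a small runnable sketch. It is a slight variation on the outline above: instead of swapping sd and sw together, it leaves sd untouched and reorders only the index array, looking the keys up through it. The function name sort_indices_by_key is hypothetical, and insertion sort is used only to keep the example short; the same key lookup could be used inside a merge sort.

```c
#include <stddef.h>

/* Reorders order[0..n-1] (initially 0, 1, 2, ...) so that
   sd[order[0]] <= sd[order[1]] <= ... ; sd itself is not modified.
   Insertion sort: O(n^2), chosen for brevity. */
void sort_indices_by_key(const int sd[], int order[], size_t n)
{
    for (size_t i = 1; i < n; i++) {
        int idx = order[i];
        size_t j = i;
        /* Shift indices with larger keys one slot to the right. */
        while (j > 0 && sd[order[j - 1]] > sd[idx]) {
            order[j] = order[j - 1];
            j--;
        }
        order[j] = idx;
    }
}
```

For the question's example, the points (1,6), (2,5) and (4,4) have squared distances 37, 29 and 32, so the index array ends up as 1, 2, 0, and printing x[order[i]] and y[order[i]] gives (2,5) (4,4) (1,6).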

Make a program run linear in C

Based on the following cumulative-sum query problem, I wrote the solution below. Is there another way to solve the problem in C with linear complexity, O(N)?
Problem description:
William Macfarlane wants to look at an array.
You are given a list of N numbers and Q queries. Each query is
specified by two numbers i and j; the answer to each query is the sum
of every number between the range [i, j] (inclusive).
Note: the query ranges are specified using 0-based indexing.
Input
The first line contains N, the number of integers in our list (N <=
100,000). The next line holds N numbers that are guaranteed to fit
inside an integer. Following the list is a number Q (Q <= 10,000). The
next Q lines each contain two numbers i and j which specify a query
you must answer (0 <= i, j <= N-1).
Output
For each query, output the answer to that query on its own line in the
order the queries were made.
Here is the solution:
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
struct node {
int first;
int last;
};
int sum_array(int *array, int first, int last) {
int sum = 0;
for (int i = first; i <= last; i++) {
sum += array[i];
}
return sum;
}
int main() {
FILE* input = fopen("share.in","r");
int N = 0;
fscanf(input,"%d",&N);
int *array = (int*)malloc(N * sizeof(int));
for (int i = 0; i < N; i++) {
fscanf(input,"%d",&array[i]);
}
int Q = 0;
fscanf(input,"%d",&Q);
struct node query[Q];
for (int i=0; i < Q; i++) {
fscanf(input,"%d",&query[i].first);
fscanf(input,"%d",&query[i].last);
}
fclose(input);
int sum = 0;
for ( int i = 0; i < Q ; i++) {
int first = query[i].first;
int last = query[i].last;
sum = sum_array(array,first,last);
printf("Number of queries : %d , sum is %d\n",i ,sum);
}
free(array);
return 0;
}
Update:
The answer given is good, but for some reason I couldn't make it work.
So here is the rewritten code. If someone can explain what I am doing wrong, I will be happy! Keep in mind that we want the range to be [first, last].
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
struct node {
int first;
int last;
};
int sum_array(int *array, int first, int last) {
int sum = 0;
for (int i = first; i <= last; i++) {
sum += array[i];
}
return sum;
}
int main() {
FILE* input = fopen("share.in","r");
int N = 0;
fscanf(input,"%d",&N);
int *array = (int*)malloc(N * sizeof(int));
int *integralArray = (int*)malloc(N * sizeof(int));
for (int i = 0; i < N; i++) {
fscanf(input,"%d",&array[i]);
integralArray[i] = array[i] + ((i > 0) ? array[i-1] : 0);
}
int Q = 0;
fscanf(input,"%d",&Q);
struct node query[Q];
for (int i=0; i < Q; i++) {
fscanf(input,"%d",&query[i].first);
fscanf(input,"%d",&query[i].last);
}
fclose(input);
int sum = 0;
for (int i = 0; i < Q ; i++) {
int first = query[i].first;
int last = query[i].last;
sum = integralArray[last] - integralArray[first - 1];
printf("Number of queries : %d , sum is %d\n",i ,sum);
}
free(array);
return 0;
}

You'd form the integral array. Modify to something like:
int *array = (int*)malloc(N * sizeof(int));
int *integralArray = (int*)malloc(N * sizeof(int));
for (int i = 0; i < N; i++) {
fscanf(input,"%d",&array[i]);
integralArray[i] = array[i] + ((i > 0) ? integralArray[i-1] : 0);
}
So the element at integralArray[i] is the sum of all elements in array from 0 to i.
Then, to get the sum from a to b, where a < b: integralArray[b] is the sum from 0 to b and integralArray[a] is the sum from 0 to a, so you can just compute integralArray[b] - integralArray[a] to get the total from a to b. Intuitively, integralArray[b] includes the numbers you want, but it also includes the numbers up to and including a. You don't want those, so you take them off again.
Vary appropriately for inclusion or exclusion of the number at a and the number at b. As given, the result includes the number at b but not the one at a. You could shift your integralArray to be one earlier (so integralArray[b] is the sum from 0 to b-1) or adjust your indices.
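One way to do the "one earlier" adjustment for the inclusive range [first, last] is sketched below. The helper names build_prefix_sums and range_sum are hypothetical, first <= last is assumed, and long long is used for the sums as a precaution, since the problem only guarantees that the individual numbers fit in an integer.

```c
#include <stdlib.h>

/* Returns a malloc'd array of n + 1 running sums, where
   prefix[k] = array[0] + ... + array[k-1] and prefix[0] = 0.
   Returns NULL if allocation fails; the caller frees the result. */
long long *build_prefix_sums(const int *array, size_t n)
{
    long long *prefix = malloc((n + 1) * sizeof *prefix);
    if (prefix == NULL) {
        return NULL;
    }
    prefix[0] = 0;
    for (size_t k = 0; k < n; k++) {
        prefix[k + 1] = prefix[k] + array[k];
    }
    return prefix;
}

/* Sum of array[first..last], both ends included (first <= last assumed). */
long long range_sum(const long long *prefix, size_t first, size_t last)
{
    return prefix[last + 1] - prefix[first];
}
```

The extra leading zero means a query starting at index 0 never reads prefix[-1], which is what integralArray[first - 1] does in the updated question code when first is 0. Building the table is O(N) and each query is O(1), so the whole program runs in O(N + Q).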
