Writing a function that calculates the sum of squares within a range in one line in C

My try
double sum_squares_from(double x, double n){
    return n<=0 ? 0 : x*x + sum_squares_from((x+n-1)*(x+n-1), n-1);
}
Instead of using loops my professor wants us to write functions like this...
What the exercise asks for is a function sum_squares_from(), with double x being the starting number and n the count of numbers. For example, with x = 2 and n = 4 you get 2*2 + 3*3 + 4*4 + 5*5. It returns zero if n == 0.
My thinking was that, in my example, what I have is basically x*x + (x+1)(x+1) + (x+1+1)(x+1+1) + (x+1+1+1)(x+1+1+1) = (x+0)(x+0) + (x+1)(x+1) + (x+2)(x+2) + (x+3)(x+3), i.e. (x+n-1)^2 added up n times, where n gets decremented by one each time until it becomes zero, and then you sum everything.
Did I do it right?
(if my professor seems a bit demanding... he somehow does this sort of thing all in his head without auxiliary calculations. Scary guy)

It's not recursive, but it's one line:
int
sum_squares(int x, int n) {
    return ((x + n - 1) * (x + n) * (2 * (x + n - 1) + 1) / 6) - ((x - 1) * x * (2 * (x - 1) + 1) / 6);
}
Sum of squares (of integers) has a closed-form solution for 1 .. n. This code calculates the sum of squares from 1 .. (x+n-1) and then subtracts the sum of squares from 1 .. (x-1).
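As a quick sanity check (a hypothetical test harness, not part of the original answer), the closed form can be compared against a plain loop on the question's example, x = 2 and n = 4:

#include <stdio.h>

int sum_squares(int x, int n) {
    return ((x + n - 1) * (x + n) * (2 * (x + n - 1) + 1) / 6) - ((x - 1) * x * (2 * (x - 1) + 1) / 6);
}

int main(void) {
    /* the example from the question: 2*2 + 3*3 + 4*4 + 5*5 */
    int expected = 0;
    for (int i = 2; i <= 5; i++)
        expected += i * i;
    printf("closed form: %d, loop: %d\n", sum_squares(2, 4), expected);  /* both are 54 */
    return 0;
}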

The original version of this answer used ASCII art.
So,
∑ i : 0..n   i  = n(n+1)/2
∑ i : 0..n   i² = n(n+1)(2n+1)/6
We note that,
∑ i : 0..n   (x+i)²
  = ∑ i : 0..n   x² + 2xi + i²
  = (n+1)x² + 2x ∑ i : 0..n i + ∑ i : 0..n i²
  = (n+1)x² + n(n+1)x + n(n+1)(2n+1)/6
Thus, your sum has the closed form:
double sum_squares_from(double x, int n) {
    return ((n-- > 0)
            ? (n + 1) * x * x
              + x * n * (n + 1)
              + n * (n + 1) * (2 * n + 1) / 6.
            : 0);
}
If I apply some obfuscation, the one-line version becomes:
double sum_squares_from(double x, int n) {
    return (n-->0)?(n+1)*(x*x+x*n+n*(2*n+1)/6.):0;
}
If the task is to implement the summation that would normally be written as a loop, use tail recursion. Tail recursion can be mechanically replaced with a loop, and many compilers implement this optimization.
static double sum_squares_from_loop(double x, int n, double s) {
    return (n <= 0) ? s : sum_squares_from_loop(x+1, n-1, s+x*x);
}

double sum_squares_from(double x, int n) {
    return sum_squares_from_loop(x, n, 0);
}
As an illustration, if you observe the generated assembly in GCC at a sufficient optimization level (-Os, -O2, or -O3), you will notice that the recursive call is eliminated (and sum_squares_from_loop is inlined to boot).
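For comparison, a hand-written loop with the same accumulator (a sketch of what the optimized code effectively does, not actual compiler output):

double sum_squares_from(double x, int n) {
    double s = 0;
    while (n > 0) {   /* same shape as the tail-recursive helper */
        s += x * x;
        x += 1;
        n -= 1;
    }
    return s;
}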

As mentioned in my original comment, n should not be of type double, but instead of type int, to avoid floating-point comparison problems with n <= 0. Making the change and simplifying the multiplication and recursive call, you get:
double sum_squares_from(double x, int n)
{
    return n <= 0 ? 0 : x * x + sum_squares_from (x + 1, n - 1);
}
If you think about starting with x * x and increasing x by 1, n times, then the simple x * x + sum_squares_from (x + 1, n - 1) is quite easy to understand.
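For instance, with the x = 2, n = 4 example from the question, the recursion expands as:

sum_squares_from(2, 4)
  = 2*2 + sum_squares_from(3, 3)
  = 2*2 + 3*3 + sum_squares_from(4, 2)
  = 2*2 + 3*3 + 4*4 + sum_squares_from(5, 1)
  = 2*2 + 3*3 + 4*4 + 5*5 + sum_squares_from(6, 0)
  = 4 + 9 + 16 + 25 + 0 = 54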

Maybe this?
double sum_squares_from(double x, double n) {
    return n <= 0 ? 0 : (x + n - 1) * (x + n - 1) + sum_squares_from(x, n - 1);
}

Related

Any concise way to calculate n * (n+1) / 2 and handle overflow at the same time

I am trying to implement n * (n + 1) / 2 knowing that n is an int <= 2^16 - 1 (this guarantees that n * (n + 1) / 2 <= 2^31 - 1 so there is no overflow).
Then we know that n * (n + 1) / 2 is guaranteed to be a non-negative integer. When calculating this value in a program, though, if we do the multiplication n * (n + 1) first, we might run into an integer overflow problem. My idea is to use a clumsy condition:
int m;
if (n % 2 == 0) {
    m = (n / 2) * (n + 1);
} else {
    m = n * ((n + 1) / 2);
}
Is there any more concise way of doing this?
There is a more concise way to write your test using the ternary operator:
int m = (n % 2 == 0) ? (n / 2) * (n + 1) : n * ((n + 1) / 2);
But it is likely to generate the exact same code.
You could take advantage of the extra precision long long is guaranteed to provide (at least 63 value bits):
int m = (long long)n * (n + 1) / 2;
Whether this is more or less efficient than the test version will depend on the target CPU and the compiler version and options. This version is simpler to read and understand, which is valuable. Adding a comment to explain why the result will be in range would be useful.
Derived from a suggestion by Amadeus, here is a more concise, but much less readable alternative that does not use 64-bit arithmetic:
int m = (n + (n & 1)) / 2 * (n + 1 - (n & 1));
Demonstration:
if n is odd, we get m = (n + 1) / 2 * n;
if n is even, we get m = n / 2 * (n + 1);
The simplest solution is perhaps to use a larger intermediate type:
int m = (int)((long long)n * (n + 1) / 2) ;
It is not necessary to cast all operands since automatic type promotion will apply.
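A small illustration of that point (my own example, with n at the stated maximum of 2^16 - 1): casting a single operand is enough, because the other operand is converted to long long before the multiplication.

#include <stdio.h>

int main(void)
{
    int n = 65535;  /* 2^16 - 1, the stated maximum */

    /* n is converted to long long, so the product is computed in 64 bits;
       a plain int multiplication n * (n + 1) would overflow here. */
    int m = (int)((long long)n * (n + 1) / 2);

    printf("%d\n", m);  /* prints 2147450880 */
    return 0;
}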
What do you think about:
m = ((n + (n & 1)) >> 1) * ( n + !(n & 1));
Explanation:
This solution tries to achieve two objectives:
Do not overflow
Avoid an if/else branch, and stay pipeline friendly
To avoid overflow we divide first and multiply afterwards. Halving a number (dividing it by 2) has an interesting property: if the number is even, the division is exact and can be done by a simple right shift by 1.
So, to guarantee that the number we halve is even, without an if/else branch, we use the following trick:
If the lowest bit of the number is one (captured by ANDing it with 1), the number is odd; otherwise it is even. Therefore, if the number is even, we halve it directly; otherwise we first add 1 to make it even and then halve it.
In other words, this solution is equivalent to:
if ( n is odd )
    m = ((n + 1) >> 1) * n;
else
    m = (n >> 1) * (n + 1);
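A minimal check of the branchless expression against the plain formula (a hypothetical test, not from the original answer), restricted to small n so that the reference n * (n + 1) cannot overflow:

#include <stdio.h>

int main(void)
{
    for (int n = 0; n <= 1000; n++) {
        int branchless = ((n + (n & 1)) >> 1) * (n + !(n & 1));
        int reference  = n * (n + 1) / 2;  /* safe: n is small here */
        if (branchless != reference) {
            printf("mismatch at n = %d\n", n);
            return 1;
        }
    }
    printf("all values match\n");
    return 0;
}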
and one more:
int m = (n/2 * n) + ((n%2) * (n/2)) + (n/2) + (n%2);
maybe
result = (n) * (n / 2) + (n & 1) * ((n + 1) / 2) + n / 2;

Time complexity of a function in Big-O

I'm trying to find the time complexity of this function:
int bin_search(int a[], int n, int x); // Binary search on an array with size n.

int f(int a[], int n) {
    int i = 1, x = 1;
    while (i < n) {
        if (bin_search(a, i, x) >= 0) {
            return x;
        }
        i *= 2;
        x *= 2;
    }
    return 0;
}
The answer is (log n)^2. How come?
The best I could get is log n. First, i is 1, so the while loop runs log n times. At the first iteration, when i = 1, the binary search does only one step because the array's size is 1 (i). Then, when i = 2, two steps, and so on until it does log n steps at the last iteration.
So the formula I thought would fit is the sum, over the while-loop iterations, of log(i), i.e. log(1) + log(2) + log(4) + ... + log(n).
The summation is for the while loop, and the inner term is log(i) because for i = 1 it's log(1), for i = 2 it's log(2), and so on until it's log(n) at the last iteration.
Where am I wrong?
Each iteration performs a binary search on the first 2^k elements of the array, for k = 0, 1, 2, ...
You can compute the number of operations (comparisons):
log2(1) + log2(2) + log2(4) + ... + log2(2^m)
log2(2^k) equals k, so this series simplifies to:
0 + 1 + 2 + ... + m
Where m is floor(log2(n)).
The series evaluates to m * (m + 1) / 2, replacing m we get
floor(log2(n)) * (floor(log2(n)) + 1) / 2
-> 0.5 * floor(log2(n))^2 + 0.5 * floor(log2(n))
The first term dominates the second, hence the complexity is O(log(n)^2).
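To make the count concrete, here is a small sketch of my own (not from the answer) that tallies the comparisons, assuming bin_search on i elements costs about log2(i) + 1 comparisons, and prints 0.5 * log2(n)^2 next to the total:

#include <stdio.h>
#include <math.h>

int main(void)
{
    for (long n = 2; n <= 1L << 20; n *= 4) {
        long comparisons = 0;
        for (long i = 1; i < n; i *= 2)               /* the while (i < n) loop */
            comparisons += (long)log2((double)i) + 1; /* assumed cost of bin_search(a, i, x) */
        double logn = log2((double)n);
        printf("n = %8ld  comparisons = %4ld  0.5*log2(n)^2 = %.1f\n",
               n, comparisons, 0.5 * logn * logn);
    }
    return 0;
}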

Sum of the sums of divisors of numbers less than or equal to N

I really need some help at this problem:
Given a positive integer N, we define xsum(N) as the sum, over every positive integer i less than or equal to N, of the sum of the divisors of i.
For example: xsum(6) = 1 + (1 + 2) + (1 + 3) + (1 + 2 + 4) + (1 + 5) + (1 + 2 + 3 + 6) = 33.
(xsum(6) = sum of divisors of 1 + sum of divisors of 2 + ... + sum of divisors of 6)
Given a positive integer K, you are asked to find the lowest N that satisfies the condition: xsum(N) >= K
K is a nonzero natural number that has at most 14 digits
time limit : 0.2 sec
Obviously, brute force will fail on most cases with Time Limit Exceeded. I haven't found anything better yet, so here is that code:
fscanf(fi,"%lld",&k);
i=2;
sum=1;
while(sum<k) {
sum=sum+i+1;
d=2;
while(d*d<=i) {
if(i%d==0 && d*d!=i)
sum=sum+d+i/d;
else
if(d*d==i)
sum+=d;
d++;
}
i++;
}
Any better ideas?
For each number n in the range [1, N] the following applies: n is a divisor of exactly roundDown(N / n) numbers in the range [1, N]. Thus for each n we add a total of n * roundDown(N / n) to the result.
long long xsum(long long N) {
    long long result = 0;
    for (long long i = 1; i <= N; i++)
        result += (N / i) * i;  /* integer division: the two i's don't cancel out */
    return result;
}
The idea behind this algorithm can also be used to solve the main problem (the smallest N such that xsum(N) >= K) faster than a brute-force search.
The search can be bounded using a fact we can derive from the above code: xsum(N) = sum of i * roundDown(N / i) <= sum of i * (N / i) = N * N, so the answer must satisfy N >= sqrt(K). This gives us a lower bound to start the search from.
The next step is to find an upper bound. Since the growth of xsum(N) is (approximately) quadratic, we can use that to approximate N. This optimized guessing finds the searched value pretty fast.
long long N(long long K) {
    /* start with the lower bound sqrt(K) <= N */
    long long upperN = (long long) sqrt((double) K);
    long long lowerN = upperN;
    long long tmpSum;

    /* grow the guess until xsum(upperN) reaches K */
    while ((tmpSum = xsum(upperN)) < K) {
        long long r = K - tmpSum;
        lowerN = upperN;
        upperN += (long long) sqrt((double) (r / 3)) + 1;
    }

    /* now we have an upper and a lower bound for N;
       the rest of the search can be done using binary search
       (I won't implement it here) */
    long long N = lowerN;  /* placeholder: see the binary-search sketch below */
    return N;
}
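For completeness, here is a minimal sketch of that final binary search over [lowerN, upperN] (my own illustration, assuming the xsum() above; it is not part of the original answer):

/* smallest value v in [lo, hi] with xsum(v) >= K; assumes xsum(hi) >= K */
long long smallest_with_xsum_at_least(long long lo, long long hi, long long K)
{
    while (lo < hi) {
        long long mid = lo + (hi - lo) / 2;
        if (xsum(mid) >= K)
            hi = mid;      /* mid works, try to find a smaller one */
        else
            lo = mid + 1;  /* mid is too small */
    }
    return lo;
}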

Optimizing neighbor count function for Conway's Game of Life in C

Having some trouble optimizing a function that returns the number of neighbors of a cell in a Conway's Game of Life implementation. I'm trying to learn C and just get better at coding. I'm not very good at recognizing potential optimizations, and I've spent a lot of time online reading various methods but it's not really clicking for me yet.
Specifically I'm trying to figure out how to unroll this nested for loop in the most efficient way, but each time I try I just make the runtime longer.
I'm including the function, I don't think any other context is needed. Thanks for any advice you can give!
Here is the code for the countNeighbors() function:
static int countNeighbors(board b, int x, int y)
{
    int n = 0;
    int x_left = max(0, x-1);
    int x_right = min(HEIGHT, x+2);
    int y_left = max(0, y-1);
    int y_right = min(WIDTH, y+2);
    int xx, yy;

    for (xx = x_left; xx < x_right; ++xx) {
        for (yy = y_left; yy < y_right; ++yy) {
            n += b[xx][yy];
        }
    }
    return n - b[x][y];
}
Instead of declaring the board as b[WIDTH][HEIGHT], declare it as b[WIDTH + 2][HEIGHT + 2]. This gives an extra margin of zeros around the board, which prevents out-of-bounds indexing. So, instead of:
x x
x x
We will have:
0 0 0 0
0 x x 0
0 x x 0
0 0 0 0
x denotes used cells, 0 will be unused.
Typical trade off: a bit of memory for speed.
Thanks to that we don't have to call the min and max functions (whose if statements are bad for performance).
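A minimal sketch of what the padded declaration and its zeroed border could look like (WIDTH and HEIGHT are assumed to be the same macros the question's code uses; real cells then live at indices 1..WIDTH and 1..HEIGHT):

#include <string.h>

#define WIDTH  100   /* assumed values, just for illustration */
#define HEIGHT 100

typedef char board[WIDTH + 2][HEIGHT + 2];

static void clear_board(board b)
{
    memset(b, 0, sizeof(board));  /* the outer ring stays zero forever */
}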
Finally, I would write your function like this:
int countNeighborsFast(board b, int x, int y)
{
    int n = 0;
    n += b[x-1][y-1];
    n += b[x][y-1];
    n += b[x+1][y-1];
    n += b[x-1][y];
    n += b[x+1][y];
    n += b[x-1][y+1];
    n += b[x][y+1];
    n += b[x+1][y+1];
    return n;
}
Benchmark (updated)
Thanks to Jongware's comment I added linearization (reducing the array's dimensions from 2 to 1) and changed int to char.
I also made the main loop linear and calculate the returned sum directly, without an intermediate n variable.
2D array was 10002 x 10002, 1D had 100040004 elements.
The CPU I used is a Pentium Dual-Core T4500 at 2.30 GHz (details from the output of cat /proc/cpuinfo).
Results on default optimization level O0:
Original: 15.50s
Mine: 10.13s
Linear: 2.51s
LinearAndChars: 2.48s
LinearAndCharsAndLinearLoop: 2.32s
LinearAndCharsAndLinearLoopAndSum: 1.53s
That's about 10x faster compared to the original version.
Results on O2:
Original: 6.42s
Mine: 4.17s
Linear: 0.55s
LinearAndChars: 0.53s
LinearAndCharsAndLinearLoop: 0.42s
LinearAndCharsAndLinearLoopAndSum: 0.44s
About 15x faster.
On O3:
Original: 10.44s
Mine: 1.47s
Linear: 0.26s
LinearAndChars: 0.26s
LinearAndCharsAndLinearLoop: 0.25s
LinearAndCharsAndLinearLoopAndSum: 0.24s
About 44x faster.
The last version, LinearAndCharsAndLinearLoopAndSum is:
typedef char board3[(HEIGHT + 2) * (WIDTH + 2)];

int i;
for (i = WIDTH + 3; i <= (WIDTH + 2) * (HEIGHT + 1) - 2; i++)
    countNeighborsLinearAndCharsAndLinearLoopAndSum(b3, i);

int countNeighborsLinearAndCharsAndLinearLoopAndSum(board3 b, int pos)
{
    return
        b[pos - 1 - (WIDTH + 2)] +
        b[pos - (WIDTH + 2)] +
        b[pos + 1 - (WIDTH + 2)] +
        b[pos - 1] +
        b[pos + 1] +
        b[pos - 1 + (WIDTH + 2)] +
        b[pos + (WIDTH + 2)] +
        b[pos + 1 + (WIDTH + 2)];
}
Changing 1 + (WIDTH + 2) to WIDTH + 3 won't help, because the compiler takes care of it anyway (even at the O0 optimization level).

multiply two numbers using only bit operations

While learning bit operations in C, I was searching for code to multiply two numbers using only bit operations and I found the following code. I am unable to understand how the ternary operator works in this scenario and produces the correct output.
#include <stdio.h>

static int multiply (int x, int y)
{
    return y == 0 ? 0 : ((y & 1) == 1 ? x : 0) + multiply(x << 1, y >> 1);
}

int main()
{
    printf("%d", multiply(2, 3));
    return 0;
}
Can someone please explain how the above code works?
That is not using "only bit operations", since it's using + to add numbers.
Maybe indenting can help break up the complicated expression:
return (y == 0 ? 0
        : (y & 1) == 1 ? x
        : 0)
       + multiply(x << 1, y >> 1);
Basically it's a recursive addition, that stops when y reaches 0. If the least significant bit of y is set, x is added to the result, else it is not. On each recursion, one bit of y is dropped so that it eventually will reach 0. The value of x is shifted to the left, very much like when doing multiplication by hand.
For instance if x = 3 (binary 11) and y = 6 (binary 110), it will compute
0 * 3 + 1 * 6 + 1 * 12 = 18
And of course 18 is 3 * 6.
Each recursion step contributes a term a * b, where a is the least significant bit of y at that step (reading the terms from left to right you get 0, 1, 1, which are the bits of y starting with the least significant one) and b is the value of x at that step.
If y is odd, x * y = x + (x * 2) * (y / 2)
If y is even, x * y = (x * 2) * (y / 2)
Apply the logic above recursively until y = 0.
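The same recurrence written as an iterative shift-and-add loop (my own sketch, not from the answers; like the recursive version it still relies on + and assumes y >= 0):

#include <stdio.h>

static int multiply_iter(int x, int y)
{
    int result = 0;
    while (y != 0) {
        if (y & 1)        /* lowest bit of y set: add the current x */
            result += x;
        x <<= 1;          /* x * 2 */
        y >>= 1;          /* y / 2 */
    }
    return result;
}

int main(void)
{
    printf("%d\n", multiply_iter(3, 6));  /* prints 18 */
    return 0;
}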
If you are struggling understanding a complex nested use of the conditional operator, then simply expand it to an if statement:
static int multiply (int x, int y)
{
    if (y == 0)
        return 0;
    else
        return ((y & 1) == 1 ? x : 0) + multiply(x << 1, y >> 1);
}
And then expand the inner conditional operator:
static int multiply (int x, int y)
{
    if (y == 0)
        return 0;
    else if ((y & 1) == 1)
        return x + multiply(x << 1, y >> 1);
    else
        return multiply(x << 1, y >> 1);
}
Once you've expanded it like this, it should be clear what the expression is doing.

Resources