Novice programmer here trying to get better at C, so I began doing code problems on a website called Codeforces. However, I seem to be stuck: I have written code that appears to work in practice, but the website does not accept it as correct.
the problem :
Theatre Square in the capital city of Berland has a rectangular shape with the size n × m meters. On the occasion of the city's anniversary, a decision was taken to pave the Square with square granite flagstones. Each flagstone is of the size a × a. What is the least number of flagstones needed to pave the Square? It's allowed to cover the surface larger than the Theatre Square, but the Square has to be covered. It's not allowed to break the flagstones. The sides of flagstones should be parallel to the sides of the Square.
Source :
https://codeforces.com/problemset/problem/1/A
I had a hard time completely understanding the math behind the problem, and used this source's answer from a user named "Joshua Pan" to better understand it:
Source :
https://www.quora.com/How-do-I-solve-the-problem-Theatre-Square-on-Codeforces
This is my code :
#include <stdio.h>
#include <math.h>

int main(void)
{
    double n, m, a;
    scanf("%lf %lf %lf", &n, &m, &a);
    printf("%1.lf\n", ceil(n / a) * ceil(m / a));
    return 0;
}
I compiled it using "gcc TheatreSquare.c -lm"
When given the sample input 6 6 4, my code produces the correct output 4; however, the website does not accept this code as correct. I could be wrong, but maybe I'm using the format specifiers incorrectly?
Thanks in advance.
A typical double (IEEE 754 64-bit floating point) doesn't have enough precision for this problem.
For example, for input
999999999 999999999 1
Your program may give output
999999998000000000
While the actual answer is
999999998000000001
To avoid this, you shouldn't use floating point data type.
You can add #include <inttypes.h> and use 64-bit integer type int64_t for this calculation.
"%" SCNd64 is for reading and "%" PRId64 is for writing int64_t.
ceil(n/a) on integers can be computed as (n + a - 1) / a.
You can solve this using integers.
#include <stdio.h>

int main()
{
    unsigned long n, m, a = 1;
    unsigned long na, ma, res = 0;
    scanf("%lu %lu %lu", &n, &m, &a);
    na = n / a;
    if (n % a != 0)
        na++;
    ma = m / a;
    if (m % a != 0)
        ma++;
    res = na * ma;
    printf("%lu", res);
    return 0;
}
This code will fail on the Codeforces platform, on test 9 (see below). But if you compile it and run it locally with the same input, the result is correct.
> Test: #9, time: 15 ms., memory: 3608 KB, exit code: 0, checker exit code: 1, verdict: WRONG_ANSWER
> Input 1000000000 1000000000 1
> Output 2808348672 Answer 1000000000000000000
> Checker Log wrong answer 1st numbers differ - expected: '1000000000000000000', found: '2808348672'
EDIT:
The problem described above is due to the fact that on my machine unsigned long is 64 bits, while on the judge's target it is apparently only 32 bits (the C standard only guarantees 32). The unsigned long variables overflow.
The following code will pass all the tests.
#include <stdio.h>

int main()
{
    unsigned long long n, m, a = 1;
    unsigned long long na, ma, res = 0;
    scanf("%llu %llu %llu", &n, &m, &a);
    na = n / a;
    if (n % a != 0)
        na++;
    ma = m / a;
    if (m % a != 0)
        ma++;
    res = na * ma;
    printf("%llu", res);
    return 0;
}
Use the code below; it will pass all the test cases. We need to use long long for every variable declaration to get the correct output.
#include <stdio.h>
#include <math.h>

int main(){
    long long n, m, a, l, b;
    scanf("%lld%lld%lld", &n, &m, &a);
    l = n / a;
    if (n % a != 0)
        l++;
    b = m / a;
    if (m % a != 0)
        b++;
    printf("%lld", l * b);
    return 0;
}
import java.util.Scanner;

public class theatre_square {
    public static void main(String[] args) {
        long a, b, c;
        Scanner s = new Scanner(System.in);
        a = s.nextLong();
        b = s.nextLong();
        c = s.nextLong();
        long result = 0;
        if (a >= c) {
            if (a % c == 0)
                result = a / c;
            else
                result = a / c + 1; // some part is left over
        } else { // side shorter than one flagstone still needs 1 flagstone
            result = 1;
        }
        if (b >= c) {
            if (b % c == 0)
                result *= b / c;
            else
                result *= b / c + 1;
        }
        System.out.println(result);
    }
}
case 1: 2 2 3 => 1
length = 2, and 2 < 3, so only 1 square is required
breadth = 2, and 2 < 3, so it is covered by the same square; output 1
initial view
0 0
0 0
after adding 1 square (r = remaining / left over)
1 1 r
1 1 r
r r r
case 2: 6 6 4 => 4
length = 6, and 6 > 4, so 2 squares are required
breadth = 6, and 6 > 4, so 2 squares are required
initial view
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
after adding 4 squares (r = remaining / left over)
1 1 1 1 2 2 r r
1 1 1 1 2 2 r r
1 1 1 1 2 2 r r
1 1 1 1 2 2 r r
3 3 3 3 4 4 r r
3 3 3 3 4 4 r r
r r r r r r r r
r r r r r r r r
You can try the following:
import math
x,y,z=list(map(float, input().split()))
print(math.ceil(x/z)*math.ceil(y/z))
Here is the code for the above problem in C++. We need long long variables to store the values, since they may be very large.
GUIDANCE ABOUT THE QUESTION:
Since the flagstones must cover the edges as well, we have to handle partial rows and columns. The rectangle has length and height, given as n × m, and each square is a × a (k in the code below), so we first decide how many squares the length needs: we divide n by k, and if there is any remainder we add one more. The same goes for the height.
I hope it helps you.
HERE IS THE CODE
#include <iostream>
using namespace std;

int main()
{
    long long n, m, k, l = 0, o = 0;
    cin >> n >> m >> k;
    l = n / k;
    if (n % k != 0)
    {
        l++;
    }
    o = m / k;
    if (m % k != 0)
    {
        o++;
    }
    cout << l * o;
}
I have one variable, Npart, which is an int initialized to 64. Below is my code (test.c):
#include <math.h>
#include <stdio.h>

int Npart, N;

int main(){
    Npart = 64;
    N = (int) (pow(Npart/1., (1.0/3.0)));
    printf("%d %d\n", Npart, N);
    return 0;
};
which prints out 64 3, probably due to numerical precision issues. I compile it as follows:
gcc -g3 test.c -o test.x
If I debug with lldb and try to calculate the value and print it at the command prompt, the following happens:
$ lldb ./test.x
(lldb) target create "./test.x"
Current executable set to './test.x' (x86_64).
(lldb) breakpoint set --file test.c --line 1
Breakpoint 1: where = test.x`main + 44 at test.c:8, address = 0x0000000100000f0c
(lldb) r
Process 20532 launched: './test.x' (x86_64)
Process 20532 stopped
* thread #1: tid = 0x5279e0, 0x0000000100000f0c test.x`main + 44 at test.c:8, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
frame #0: 0x0000000100000f0c test.x`main + 44 at test.c:8
5
6 int main(){
7
-> 8 Npart = 64;
9
10 N = (int) (pow(Npart/1., (1.0/3.0)));
11 printf("%d %d\n",Npart, N);
(lldb) n
Process 20532 stopped
* thread #1: tid = 0x5279e0, 0x0000000100000f12 test.x`main + 50 at test.c:10, queue = 'com.apple.main-thread', stop reason = step over
frame #0: 0x0000000100000f12 test.x`main + 50 at test.c:10
7
8 Npart = 64;
9
-> 10 N = (int) (pow(Npart/1., (1.0/3.0)));
11 printf("%d %d\n",Npart, N);
12
13 return 0;
(lldb) n
Process 20532 stopped
* thread #1: tid = 0x5279e0, 0x0000000100000f4a test.x`main + 106 at test.c:11, queue = 'com.apple.main-thread', stop reason = step over
frame #0: 0x0000000100000f4a test.x`main + 106 at test.c:11
8 Npart = 64;
9
10 N = (int) (pow(Npart/1., (1.0/3.0)));
-> 11 printf("%d %d\n",Npart, N);
12
13 return 0;
14 };
(lldb) print Npart
(int) $0 = 64
(lldb) print (int)(pow(Npart/1.,(1.0/3.0)))
warning: could not load any Objective-C class information. This will significantly reduce the quality of type information available.
(int) $1 = 0
(lldb) print (int)(pow(64,1.0/3.0))
(int) $2 = 0
Why is lldb giving different results?
Edit: Clarified the question and provided a minimal verifiable example.
Your code calculates the cube root of 64, which should be 4.
The cast to int converts the return value by truncating it toward zero. pow is usually implemented with some sort of polynomial approximation or similar, which tends to be inexact in the last bit. The result on your computer is apparently slightly less than 4.0, which the cast truncates to 3. The solution is to round to the nearest integer first, for example with lround:
N = lround(pow(Npart/1., (1.0/3.0)));
As for the lldb, the key is the text:
error: 'pow' has unknown return type; cast the call to its declared return type
i.e. it doesn't know the return type - thus the prototype - of the function. pow is declared as
double pow(double x, double y);
but since the only hint that lldb has about the return type is the cast you provided, lldb thinks the prototype is
int pow(int x, double y);
and that will lead into undefined behaviour - in practice, lldb thinks that the return value should be the int from the EAX register, hence 0 was printed, but the actual return value was in some floating point/SIMD register. Likewise, since the types of the arguments are not known either, you must not pass in an int.
Thus I guess you would get the proper value in the debugger with
print (double)(pow(64.0, 1.0/3.0))
The GCC implementation of the C mathematical library on Debian systems has apparently an (IEEE 754-2008)-compliant implementation of the function exp, implying that rounding shall always be correct:
(from Wikipedia) The IEEE floating point standard guarantees that add, subtract, multiply, divide, fused multiply–add, square root, and floating point remainder will give the correctly rounded result of the infinite precision operation. No such guarantee was given in the 1985 standard for more complex functions and they are typically only accurate to within the last bit at best. However, the 2008 standard guarantees that conforming implementations will give correctly rounded results which respect the active rounding mode; implementation of the functions, however, is optional.
It turns out that I am encountering a case where this feature is actually a hindrance, because the exact result of the exp function is often almost exactly at the middle between two consecutive double values (1), and the library then carries out plenty of further computations to decide the rounding, losing up to a factor of 400 (!) in speed: this was actually the explanation of my (ill-posed :-S) Question #43530011.
(1) More precisely, this happens when the argument of exp turns out to be of the form (2k + 1) × 2^(-53) with k a rather small integer (like 242, for instance). In particular, the computations involved in pow(1. + x, 0.5) tend to call exp with such an argument when x is of the order of magnitude of 2^(-44).
Since implementations of correct rounding can be so much time-consuming in certain circumstances, I guess that the developers will also have devised a way to get a slightly less precise result (say, only up to 0.6 ULP or something like this) in a time which is (roughly) bounded for every value of the argument in a given range… (2)
… But how to do this??
(2) What I mean is that I just do not want some exceptional values of the argument, like (2k + 1) × 2^(-53), to be much more time-consuming than most values of the same order of magnitude; but of course I do not mind if some exceptional values of the argument go much faster, or if large arguments (in absolute value) need a larger computation time.
Here is a minimal program showing the phenomenon:
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include <time.h>

int main (void)
{
    int i;
    double a, c;
    c = 0;
    clock_t start = clock ();
    for (i = 0; i < 1e6; ++i) // Do the same type of computation many times with different values, to smooth random fluctuations.
    {
        a = (double) (1 + 2 * (rand () % 0x400)) / 0x20000000000000; // "a" has only a few significant digits, and its last non-zero digit is at (fixed-point) position 53.
        c += exp (a); // Just to be sure that the compiler will actually perform the computation of exp (a).
    }
    clock_t stop = clock ();
    printf ("%e\n", c); // Just to be sure that the compiler will actually perform the computation.
    printf ("Clock time spent: %d\n", stop - start);
    return 0;
}
Now after gcc -std=c99 program53.c -lm -o program53:
$ ./program53
1.000000e+06
Clock time spent: 13470008
$ ./program53
1.000000e+06
Clock time spent: 13292721
$ ./program53
1.000000e+06
Clock time spent: 13201616
On the other hand, with program52 and program54 (got by replacing 0x20000000000000 by resp. 0x10000000000000 and 0x40000000000000):
$ ./program52
1.000000e+06
Clock time spent: 83594
$ ./program52
1.000000e+06
Clock time spent: 69095
$ ./program52
1.000000e+06
Clock time spent: 54694
$ ./program54
1.000000e+06
Clock time spent: 86151
$ ./program54
1.000000e+06
Clock time spent: 74209
$ ./program54
1.000000e+06
Clock time spent: 78612
Beware, the phenomenon is implementation-dependent! Apparently, among the common implementations, only those of the Debian systems (including Ubuntu) show this phenomenon.
P.-S.: I hope that my question is not a duplicate: I searched for a similar question thoroughly without success, but maybe I did not use the relevant keywords… :-/
To answer the general question on why the library functions are required to give correctly rounded results:
Floating-point is hard, and often counterintuitive. Not every programmer has read what they should have. When libraries were allowed to round slightly inaccurately, people complained about the precision of the library functions when their inaccurate computations inevitably went wrong and produced nonsense. In response, the library writers made their libraries exactly rounded, so now people cannot shift the blame to them.
In many cases, specific knowledge about floating point algorithms can produce considerable improvements to accuracy and/or performance, like in the testcase:
Taking exp() of numbers very close to 0 is problematic in floating point, since the result is a number close to 1 while all the precision lies in the difference from one, so most significant digits are lost. It is more precise (and significantly faster in this test case) to compute exp(x) - 1 through the C math library function expm1(x). If exp() itself is really needed, it is still much faster to use expm1(x) + 1.
A similar concern exists for computing log(1 + x), for which there is the function log1p(x).
A quick fix that speeds up the provided testcase:
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include <time.h>

int main (void)
{
    int i;
    double a, c;
    c = 0;
    clock_t start = clock ();
    for (i = 0; i < 1e6; ++i) // Do the same type of computation many times with different values, to smooth random fluctuations.
    {
        a = (double) (1 + 2 * (rand () % 0x400)) / 0x20000000000000; // "a" has only a few significant digits, and its last non-zero digit is at (fixed-point) position 53.
        c += expm1 (a) + 1; // replace exp() with expm1() + 1
    }
    clock_t stop = clock ();
    printf ("%e\n", c); // Just to be sure that the compiler will actually perform the computation.
    printf ("Clock time spent: %d\n", stop - start);
    return 0;
}
For this case, the timings on my machine are thus:
Original code
1.000000e+06
Clock time spent: 21543338
Modified code
1.000000e+06
Clock time spent: 55076
Programmers with advanced knowledge of the accompanying trade-offs may sometimes consider using approximate results where precision is not critical.
For an experienced programmer it may be possible to write an approximate implementation of a slow function using methods like Newton-Raphson iteration or Taylor/Maclaurin polynomials; to use deliberately less exactly rounded specialty functions from libraries like Intel's MKL or AMD's ACML; to relax the compiler's floating-point standard compliance; to reduce precision to IEEE 754 binary32 (float); or a combination of these.
Note that a better description of the problem would enable a better answer.
Regarding your comment to #EOF 's answer, the "write your own" remark from #NominalAnimal seems simple enough here, even trivial, as follows.
Your original code above has a max possible argument for exp() of a = (1+2*0x400)/0x2000... ≈ 2.3e-13 (that should really be 1+2*0x3FF, and I'm counting 13 zeroes after your 0x2000..., which makes the denominator 2x16^13 = 2^53). So that ~2.3e-13 max argument is very, very small.
And then the trivial Taylor expansion is exp(a) = 1 + a + (a^2)/2 + (a^3)/6 + ..., which already gives you all of double's precision for such small arguments. Now, you'll have to discard the 1 part, as explained above, and then that just reduces to expm1(a) = a*(1. + a*(1. + a/3.)/2.). And that should go pretty darn quick! Just make sure a stays small. If it gets a little bigger, just add the next term, a^4/24 (you see how to do that?).
>>EDIT<<
I modified the OP's test program as follows to test a little more stuff (discussion follows code)
/* https://stackoverflow.com/questions/44346371/
i-do-not-want-correct-rounding-for-function-exp/44397261 */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <time.h>
#define BASE 16 /*denominator will be (multiplier)xBASE^EXPON*/
#define EXPON 13
#define taylorm1(a) (a*(1.+a*(1.+a/3.)/2.)) /*expm1() approx for small args*/
int main (int argc, char *argv[]) {
int N = (argc>1?atoi(argv[1]):1e6),
multiplier = (argc>2?atoi(argv[2]):2),
isexp = (argc>3?atoi(argv[3]):1); /* flags to turn on/off exp() */
int isexpm1 = 1; /* and expm1() for timing tests*/
int i, n=0;
double denom = ((double)multiplier)*pow((double)BASE,(double)EXPON);
double a, c=0.0, cm1=0.0, tm1=0.0;
clock_t start = clock();
n=0; c=cm1=tm1=0.0;
/* --- to smooth random fluctuations, do the same type of computation
a large number of (N) times with different values --- */
for (i=0; i<N; i++) {
n++;
a = (double)(1 + 2*(rand()%0x400)) / denom; /* "a" has only a few
significant digits, and its last non-zero
digit is at (fixed-point) position 53. */
if ( isexp ) c += exp(a); /* turn this off to time expm1() alone */
if ( isexpm1 ) { /* you can turn this off to time exp() alone, */
cm1 += expm1(a); /* but difference is negligible */
tm1 += taylorm1(a); }
} /* --- end-of-for(i) --- */
int nticks = (int)(clock()-start);
printf ("N=%d, denom=%dx%d^%d, Clock time: %d (%.2f secs)\n",
n, multiplier,BASE,EXPON,
nticks, ((double)nticks)/((double)CLOCKS_PER_SEC));
printf ("\t c=%.20e,\n\t c-n=%e, cm1=%e, tm1=%e\n",
c,c-(double)n,cm1,tm1);
return 0;
} /* --- end-of-function main() --- */
Compile and run it as test to reproduce the OP's 0x2000... scenario, or run it with (up to three) optional args: test #trials multiplier timeexp, where #trials defaults to the OP's 1000000 and multiplier defaults to 2 for the OP's 2x16^13 (change it to 4, etc., for her other tests). For the last arg, timeexp, enter 0 to do only the expm1() (and my unnecessary Taylor-like) calculation. The point of that is to show that the bad-timing cases displayed by the OP disappear with expm1(), which takes "no time at all" regardless of multiplier.
So default runs, test and test 1000000 4, produce (okay, I called the program rounding)...
bash-4.3$ ./rounding
N=1000000, denom=2x16^13, Clock time: 11155070 (11.16 secs)
c=1.00000000000000023283e+06,
c-n=2.328306e-10, cm1=1.136017e-07, tm1=1.136017e-07
bash-4.3$ ./rounding 1000000 4
N=1000000, denom=4x16^13, Clock time: 200211 (0.20 secs)
c=1.00000000000000011642e+06,
c-n=1.164153e-10, cm1=5.680083e-08, tm1=5.680083e-08
So the first thing you'll note is that the OP's c-n using exp() differs substantially from both cm1==tm1 using expm1() and my taylor approx. If you reduce N they come into agreement, as follows...
N=10, denom=2x16^13, Clock time: 941 (0.00 secs)
c=1.00000000000007140954e+01,
c-n=7.140954e-13, cm1=7.127632e-13, tm1=7.127632e-13
bash-4.3$ ./rounding 100
N=100, denom=2x16^13, Clock time: 5506 (0.01 secs)
c=1.00000000000010103918e+02,
c-n=1.010392e-11, cm1=1.008393e-11, tm1=1.008393e-11
bash-4.3$ ./rounding 1000
N=1000, denom=2x16^13, Clock time: 44196 (0.04 secs)
c=1.00000000000011345946e+03,
c-n=1.134595e-10, cm1=1.140730e-10, tm1=1.140730e-10
bash-4.3$ ./rounding 10000
N=10000, denom=2x16^13, Clock time: 227215 (0.23 secs)
c=1.00000000000002328306e+04,
c-n=2.328306e-10, cm1=1.131288e-09, tm1=1.131288e-09
bash-4.3$ ./rounding 100000
N=100000, denom=2x16^13, Clock time: 1206348 (1.21 secs)
c=1.00000000000000232831e+05,
c-n=2.328306e-10, cm1=1.133611e-08, tm1=1.133611e-08
And as far as timing of exp() versus expm1() is concerned, see for yourself...
bash-4.3$ ./rounding 1000000 2
N=1000000, denom=2x16^13, Clock time: 11168388 (11.17 secs)
c=1.00000000000000023283e+06,
c-n=2.328306e-10, cm1=1.136017e-07, tm1=1.136017e-07
bash-4.3$ ./rounding 1000000 2 0
N=1000000, denom=2x16^13, Clock time: 24064 (0.02 secs)
c=0.00000000000000000000e+00,
c-n=-1.000000e+06, cm1=1.136017e-07, tm1=1.136017e-07
Question: you'll note that once the exp() calculation reaches N=10000 trials, its sum remains constant regardless of larger N. Not sure why that would be happening.
>>__SECOND EDIT__<<
Okay, #EOF , "you made me look" with your "hierarchical accumulation" comment. And that indeed works to bring the exp() sum closer (much closer) to the (presumably correct) expm1() sum. The modified code is immediately below, followed by a discussion. But one note here: recall multiplier from above. That's gone, and in its place is expon, so that the denominator is now 2^expon, where the default is 53, matching the OP's default (and, I believe, better matching how she was thinking about it). Okay, and here's the code...
/* https://stackoverflow.com/questions/44346371/
i-do-not-want-correct-rounding-for-function-exp/44397261 */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <time.h>
#define BASE 2 /*denominator=2^EXPON, 2^53=2x16^13 default */
#define EXPON 53
#define taylorm1(a) (a*(1.+a*(1.+a/3.)/2.)) /*expm1() approx for small args*/
int main (int argc, char *argv[]) {
int N = (argc>1?atoi(argv[1]):1e6),
expon = (argc>2?atoi(argv[2]):EXPON),
isexp = (argc>3?atoi(argv[3]):1), /* flags to turn on/off exp() */
ncparts = (argc>4?atoi(argv[4]):1), /* #partial sums for c */
binsize = (argc>5?atoi(argv[5]):10);/* #doubles to sum in each bin */
int isexpm1 = 1; /* and expm1() for timing tests*/
int i, n=0;
double denom = pow((double)BASE,(double)expon);
double a, c=0.0, cm1=0.0, tm1=0.0;
double csums[10], cbins[10][65537]; /* c partial sums and heirarchy */
int nbins[10], ibin=0; /* start at lowest level */
clock_t start = clock();
n=0; c=cm1=tm1=0.0;
if ( ncparts > 65536 ) ncparts=65536; /* array size check */
if ( ncparts > 1 ) for(i=0;i<ncparts;i++) cbins[0][i]=0.0; /*init bin#0*/
/* --- to smooth random fluctuations, do the same type of computation
a large number of (N) times with different values --- */
for (i=0; i<N; i++) {
n++;
a = (double)(1 + 2*(rand()%0x400)) / denom; /* "a" has only a few
significant digits, and its last non-zero
digit is at (fixed-point) position 53. */
if ( isexp ) { /* turn this off to time expm1() alone */
double expa = exp(a); /* exp(a) */
c += expa; /* just accumulate in a single "bin" */
if ( ncparts > 1 ) cbins[0][n%ncparts] += expa; } /* accum in ncparts */
if ( isexpm1 ) { /* you can turn this off to time exp() alone, */
cm1 += expm1(a); /* but difference is negligible */
tm1 += taylorm1(a); }
} /* --- end-of-for(i) --- */
int nticks = (int)(clock()-start);
if ( ncparts > 1 ) { /* need to sum the partial-sum bins */
nbins[ibin=0] = ncparts; /* lowest-level has everything */
while ( nbins[ibin] > binsize ) { /* need another heirarchy level */
if ( ibin >= 9 ) break; /* no more bins */
ibin++; /* next available heirarchy bin level */
nbins[ibin] = (nbins[ibin-1]+(binsize-1))/binsize; /*#bins this level*/
for(i=0;i<nbins[ibin];i++) cbins[ibin][i]=0.0; /* init bins */
for(i=0;i<nbins[ibin-1];i++) {
cbins[ibin][(i+1)%nbins[ibin]] += cbins[ibin-1][i]; /*accum in nbins*/
csums[ibin-1] += cbins[ibin-1][i]; } /* accumulate in "one bin" */
} /* --- end-of-while(nprevbins>binsize) --- */
for(i=0;i<nbins[ibin];i++) csums[ibin] += cbins[ibin][i]; /*highest level*/
} /* --- end-of-if(ncparts>1) --- */
printf ("N=%d, denom=%d^%d, Clock time: %d (%.2f secs)\n", n, BASE,expon,
nticks, ((double)nticks)/((double)CLOCKS_PER_SEC));
printf ("\t c=%.20e,\n\t c-n=%e, cm1=%e, tm1=%e\n",
c,c-(double)n,cm1,tm1);
if ( ncparts > 1 ) { printf("\t binsize=%d...\n",binsize);
for (i=0;i<=ibin;i++) /* display heirarchy */
printf("\t level#%d: #bins=%5d, c-n=%e\n",
i,nbins[i],csums[i]-(double)n); }
return 0;
} /* --- end-of-function main() --- */
Okay, and now you can notice two additional command-line args following the old timeexp. They are ncparts, the initial number of bins into which the entire #trials will be distributed (so at the lowest level of the hierarchy, each bin should (modulo bugs:) hold the sum of #trials/ncparts doubles), and binsize, the number of doubles summed in each bin at every successive level, until the last level has no more bins than binsize. So here's an example dividing 1000000 trials into 50000 bins, meaning 20 doubles/bin at the lowest level, and 5 doubles/bin thereafter...
bash-4.3$ ./rounding 1000000 53 1 50000 5
N=1000000, denom=2^53, Clock time: 11129803 (11.13 secs)
c=1.00000000000000465661e+06,
c-n=4.656613e-09, cm1=1.136017e-07, tm1=1.136017e-07
binsize=5...
level#0: #bins=50000, c-n=4.656613e-09
level#1: #bins=10002, c-n=1.734588e-08
level#2: #bins= 2002, c-n=7.974450e-08
level#3: #bins= 402, c-n=1.059379e-07
level#4: #bins= 82, c-n=1.133885e-07
level#5: #bins= 18, c-n=1.136214e-07
level#6: #bins= 5, c-n=1.138542e-07
Note how the c-n for exp() converges pretty nicely towards the expm1() value. But note how it's best at level#5, and isn't converging uniformly at all. And note if you break the #trials into only 5000 initial bins, you get just as good a result,
bash-4.3$ ./rounding 1000000 53 1 5000 5
N=1000000, denom=2^53, Clock time: 11165924 (11.17 secs)
c=1.00000000000003527384e+06,
c-n=3.527384e-08, cm1=1.136017e-07, tm1=1.136017e-07
binsize=5...
level#0: #bins= 5000, c-n=3.527384e-08
level#1: #bins= 1002, c-n=1.164153e-07
level#2: #bins= 202, c-n=1.158332e-07
level#3: #bins= 42, c-n=1.136214e-07
level#4: #bins= 10, c-n=1.137378e-07
level#5: #bins= 4, c-n=1.136214e-07
In fact, playing with ncparts and binsize doesn't seem to show much sensitivity, and it's not always "more is better" (i.e., less for binsize) either. So I'm not sure exactly what's going on. Could be a bug (or two), or could be yet another question for #EOF ...???
>>EDIT -- example showing pair addition "binary tree" heirarchy<<
Example below added as per #EOF 's comment
(Note: re-copy preceding code. I had to edit nbins[ibin] calculation for each next level to nbins[ibin]=(nbins[ibin-1]+(binsize-1))/binsize; from nbins[ibin]=(nbins[ibin-1]+2*binsize)/binsize; which was "too conservative" to create ...16,8,4,2 sequence)
bash-4.3$ ./rounding 1024 53 1 512 2
N=1024, denom=2^53, Clock time: 36750 (0.04 secs)
c=1.02400000000011573320e+03,
c-n=1.157332e-10, cm1=1.164226e-10, tm1=1.164226e-10
binsize=2...
level#0: #bins= 512, c-n=1.159606e-10
level#1: #bins= 256, c-n=1.166427e-10
level#2: #bins= 128, c-n=1.166427e-10
level#3: #bins= 64, c-n=1.161879e-10
level#4: #bins= 32, c-n=1.166427e-10
level#5: #bins= 16, c-n=1.166427e-10
level#6: #bins= 8, c-n=1.166427e-10
level#7: #bins= 4, c-n=1.166427e-10
level#8: #bins= 2, c-n=1.164153e-10
>>EDIT -- to show #EOF's elegant solution in comment below<<
"Pair addition" can be elegantly accomplished recursively, as per #EOF's comment below, which I'm reproducing here. (Note case 0/1 at end-of-recursion to handle n even/odd.)
/* Quoting from EOF's comment...
What I (EOF) proposed is effectively a binary tree of additions:
a+b+c+d+e+f+g+h as ((a+b)+(c+d))+((e+f)+(g+h)).
Like this: Add adjacent pairs of elements, this produces
a new sequence of n/2 elements.
Recurse until only one element is left.
(Note that this will require n/2 elements of storage,
rather than a fixed number of bins like your implementation) */
double trecu(double *vals, double sum, int n) {
int midn = n/2;
switch (n) {
case 0: break;
case 1: sum += *vals; break;
default: sum = trecu(vals+midn, trecu(vals,sum,midn), n-midn); break; }
return(sum);
}
This is an "answer"/followup to EOF's preceding comments re his trecu() algorithm and code for his "binary tree summation" suggestion. "Prerequisites" before reading this are reading that discussion. It would be nice to collect all that in one organized place, but I haven't done that yet...
...What I did do was build EOF's trecu() into the test program from the preceding answer that I'd written by modifying the OP's original test program. But then I found that trecu() generated exactly (and I mean exactly) the same answer as the "plain sum" c using exp(), not the sum cm1 using expm1() that we'd expected from a more accurate binary tree summation.
But that test program's a bit (maybe two bits:) "convoluted" (or, as EOF said, "unreadable"), so I wrote a separate smaller test program, given below (with example runs and discussion below that), to separately test/exercise trecu(). Moreover, I also wrote function bintreesum() into the code below, which abstracts/encapsulates the iterative code for binary tree summation that I'd embedded into the preceding test program. In that preceding case, my iterative code indeed came close to the cm1 answer, which is why I'd expected EOF's recursive trecu() to do the same. Long-and-short of it is that, below, same thing happens -- bintreesum() remains close to correct answer, while trecu() gets further away, exactly reproducing the "plain sum".
What we're summing below is just sum(i),i=1...n, which is just the well-known n(n+1)/2. But that's not quite right -- to reproduce OP's problem, summand is not sum(i) alone but rather sum(1+i*10^(-e)), where e can be given on the command-line. So for, say, n=5, you don't get 15 but rather 5.000...00015, or for n=6 you get 6.000...00021, etc. And to avoid a long, long format, I printf() sum-n to remove that integer part. Okay??? So here's the code...
/* Quoting from EOF's comment...
What I (EOF) proposed is effectively a binary tree of additions:
a+b+c+d+e+f+g+h as ((a+b)+(c+d))+((e+f)+(g+h)).
Like this: Add adjacent pairs of elements, this produces
a new sequence of n/2 elements.
Recurse until only one element is left. */
#include <stdio.h>
#include <stdlib.h>
double trecu(double *vals, double sum, int n) {
int midn = n/2;
switch (n) {
case 0: break;
case 1: sum += *vals; break;
default: sum = trecu(vals+midn, trecu(vals,sum,midn), n-midn); break; }
return(sum);
} /* --- end-of-function trecu() --- */
double bintreesum(double *vals, int n, int binsize) {
double binsum = 0.0;
int nbin0 = (n+(binsize-1))/binsize,
nbin1 = (nbin0+(binsize-1))/binsize,
nbins[2] = { nbin0, nbin1 };
double *vbins[2] = {
(double *)malloc(nbin0*sizeof(double)),
(double *)malloc(nbin1*sizeof(double)) },
*vbin0=vbins[0], *vbin1=vbins[1];
int ibin=0, i;
for ( i=0; i<nbin0; i++ ) vbin0[i] = 0.0;
for ( i=0; i<n; i++ ) vbin0[i%nbin0] += vals[i];
while ( nbins[ibin] > 1 ) {
int jbin = 1-ibin; /* other bin, 0<-->1 */
nbins[jbin] = (nbins[ibin]+(binsize-1))/binsize;
for ( i=0; i<nbins[jbin]; i++ ) vbins[jbin][i] = 0.0;
for ( i=0; i<nbins[ibin]; i++ )
vbins[jbin][i%nbins[jbin]] += vbins[ibin][i];
ibin = jbin; /* swap bins for next pass */
} /* --- end-of-while(nbins[ibin]>0) --- */
binsum = vbins[ibin][0];
free((void *)vbins[0]); free((void *)vbins[1]);
return ( binsum );
} /* --- end-of-function bintreesum() --- */
#if defined(TESTTRECU)
#include <math.h>
#define MAXN (2000000)

int main(int argc, char *argv[]) {
    int N       = (argc>1? atoi(argv[1]) : 1000000 ),
        e       = (argc>2? atoi(argv[2]) : -10 ),
        binsize = (argc>3? atoi(argv[3]) : 2 );
    double tens = pow(10.0,(double)e);
    double *vals = (double *)malloc(sizeof(double)*MAXN),
           sum = 0.0;
    int i;

    if ( N > MAXN ) N = MAXN;
    for ( i=0; i<N; i++ ) vals[i] = 1.0 + tens*(double)(i+1);
    for ( i=0; i<N; i++ ) sum += vals[i];

    printf(" N=%d, Sum_i=1^N {1.0 + i*%.1e} - N = %.8e,\n"
           "\t plain_sum-N  = %.8e,\n"
           "\t trecu-N      = %.8e,\n"
           "\t bintreesum-N = %.8e \n",
           N, tens, tens*((double)N)*((double)(N+1))/2.0,
           sum-(double)N,
           trecu(vals,0.0,N)-(double)N,
           bintreesum(vals,N,binsize)-(double)N );

    free(vals);
    return 0;
} /* --- end-of-function main() --- */
#endif
So if you save that as trecu.c, then compile it as

cc -DTESTTRECU trecu.c -lm -o trecu

and then run it with zero to three optional command-line args as

./trecu #trials e binsize

Defaults are #trials=1000000 (like OP's program), e=-10, and binsize=2 (for my bintreesum() function to do a binary-tree sum rather than larger-size bins).
And here are some test results illustrating the problem described above,
bash-4.3$ ./trecu
N=1000000, Sum_i=1^N {1.0 + i*1.0e-10} - N = 5.00000500e+01,
plain_sum-N = 5.00000500e+01,
trecu-N = 5.00000500e+01,
bintreesum-N = 5.00000500e+01
bash-4.3$ ./trecu 1000000 -15
N=1000000, Sum_i=1^N {1.0 + i*1.0e-15} - N = 5.00000500e-04,
plain_sum-N = 5.01087168e-04,
trecu-N = 5.01087168e-04,
bintreesum-N = 5.00000548e-04
bash-4.3$
bash-4.3$ ./trecu 1000000 -16
N=1000000, Sum_i=1^N {1.0 + i*1.0e-16} - N = 5.00000500e-05,
plain_sum-N = 6.67552231e-05,
trecu-N = 6.67552231e-05,
bintreesum-N = 5.00001479e-05
bash-4.3$
bash-4.3$ ./trecu 1000000 -17
N=1000000, Sum_i=1^N {1.0 + i*1.0e-17} - N = 5.00000500e-06,
plain_sum-N = 0.00000000e+00,
trecu-N = 0.00000000e+00,
bintreesum-N = 4.99992166e-06
So you can see that for the default run, e=-10, everybody's doing everything right. That is, the top line that says "Sum" just does the n(n+1)/2 thing, so presumably displays the right answer. And everybody below that agrees for the default e=-10 test case. But for the e=-15 and e=-16 cases below that, trecu() exactly agrees with the plain_sum, while bintreesum() stays pretty close to the right answer. And finally, for e=-17, plain_sum and trecu() have "disappeared", while bintreesum()'s still hanging in there pretty well.
So trecu()'s correctly doing the sum all right, but its recursion's apparently not doing that "binary tree" type of thing that my more straightforward iterative bintreesum()'s apparently doing correctly. And that indeed demonstrates that EOF's suggestion for "binary tree summation" realizes quite an improvement over the plain_sum for these 1+epsilon kind of cases. So we'd really like to see his trecu() recursion work!!! When I originally looked at it, I thought it did work. But that double-recursion (is there a special name for that?) in his default: case is apparently more confusing (at least to me:) than I thought. Like I said, it is doing the sum, but not the "binary tree" thing.
Okay, so who'd like to take on the challenge and explain what's going on in that trecu() recursion? And, maybe more importantly, fix it so it does what's intended. Thanks.
I've started studying C and I'm trying to practice by developing a small application. Please, could you give me any tips about what to do here?
I want to buy shoes from three different brands (brandA=50; brandB=100; brandC=150). I need to spend 2000 dollars on it and buy exactly 20 shoes.
How could I write a program to display all possible combinations?
E.g. brandA (10 shoes), brandB (0 shoe), brandC(10 shoes);
brandA(1 shoe), brandB (3 shoes), brandC (11 shoes), etc.
Please, I don't want the full code now but tips about how to do it.
I really appreciate any help. Tks!
I've updated my post to include a code. Does this code make any sense?
#include <stdio.h>

int main(void) {
    int brandA = 50, brandB = 100, brandC = 150, ba, bb, bc;
    for (ba = 0; ba <= 20; ba++) {
        for (bb = 0; bb <= 20; bb++) {
            for (bc = 0; bc <= 20; bc++) {
                if (ba + bb + bc == 20 && (ba * brandA) + (bb * brandB) + (bc * brandC) == 2000) {
                    printf("You can buy %d brandA, %d brandB, %d brandC\n", ba, bb, bc);
                }
            }
        }
    }
    return 0;
}
First you have to have an algorithm:

1. Take all zero shoes: brandA=0; brandB=0; brandC=0.
2. Check the total quantity: 0+0+0 = 0.
3. If it is not 20 pcs - skip it.
4. If it equals 20 pcs (for example: brandA=5; brandB=5; brandC=10) - check the total price.
5. If the total price equals 2000 - show it; if not - skip it.
6. Increment the brandA value up to 20.
7. Repeat steps 2-6.
8. Increment the brandB value up to 20.
9. Repeat steps 2-8.
10. Increment the brandC value up to 20.
11. Repeat steps 2-10.

Note: you can use 3 nested 'for' loops :)
If you are a beginner, I suggest starting with backtracking and recursion.
Even if backtracking is a costly technique, it's great for a beginner to see how recursion can provide a simple yet powerful solution to a problem.
Here are some resources for you to start: http://web.cse.ohio-state.edu/~gurari/course/cis680/cis680Ch19.html
And if you are serious about programming you should also read some books about algorithms and data structures since you will rely heavily on these basic fundamentals:
1. The Algorithm Design Manual
2. The Pragmatic Programmer: From Journeyman to Master
Now that you have it working, I feel no guilt in suggesting a potential refinement. Rather than a pure brute force triple loop, as others have mentioned, you can use the relationship between A, B & C to eliminate the third loop. Remember in C, there are generally many ways to approach any given problem and many ways to handle the output. As long as the logic and syntax are correct, then the only difference will be in the efficiency of the algorithms. Here is an example of eliminating the third loop:
#include <stdio.h>

int main (void) {

    int ca = 50;
    int cb = 100;
    int cc = 150;
    int budget = 2000;
    int pairs = 20;
    int a, b, c, cost;

    /* a and b can never exceed the total pair count,
       and c then follows from the pairs constraint. */
    for (a = 0; a <= pairs; a++)
        for (b = 0; b <= pairs - a; b++)
        {
            c = pairs - (a + b);
            if ((cost = a * ca + b * cb + c * cc) != budget)
                continue;
            printf ("\n a (%2d) * %3d = %4d"
                    "\n b (%2d) * %3d = %4d"
                    "\n c (%2d) * %3d = %4d\n",
                    a, ca, a * ca, b, cb, b * cb, c, cc, c * cc);
            printf (" ===================\n (%d) %d\n", pairs, budget);
        }

    return 0;
}

Compiling

Since you are new to C, when you compile your code, make sure you always compile with warnings enabled. The compiler warnings are there for a reason, and there are very, very few circumstances where you can rely on code that doesn't compile without warnings. At minimum, you will want to compile with -Wall -Wextra enabled (gcc). You can also include -pedantic if you want to check your code against virtually all possible warnings. For example, to compile the code above:

$ gcc -Wall -Wextra -pedantic -o bin/shoes shoes.c

If you want to add optimizations to the fullest extent, you can add:

-Ofast (-O3 with gcc < 4.6)

Output

$ ./bin/shoes

a ( 0) * 50 = 0
b (20) * 100 = 2000
c ( 0) * 150 = 0
===================
(20) 2000

a ( 1) * 50 = 50
b (18) * 100 = 1800
c ( 1) * 150 = 150
===================
(20) 2000

a ( 2) * 50 = 100
b (16) * 100 = 1600
c ( 2) * 150 = 300
===================
(20) 2000

a ( 3) * 50 = 150
b (14) * 100 = 1400
c ( 3) * 150 = 450
===================
(20) 2000

<..snip..>

a ( 9) * 50 = 450
b ( 2) * 100 = 200
c ( 9) * 150 = 1350
===================
(20) 2000

a (10) * 50 = 500
b ( 0) * 100 = 0
c (10) * 150 = 1500
===================
(20) 2000
Good luck learning C. There is no other comparable language that gives you the precise low-level control that you have in C (assembler excluded). But that precise control doesn't come for free. There is a learning curve involved, and there is a bit more to cover in C before you will lose that "fish out of water" feeling and feel comfortable with the language. The benefit of learning C, with the low-level access it provides, is that it will greatly improve your understanding of how programming works. That knowledge is applicable to all other programming languages (no matter how hard the other languages work to hide the details from you). C is time well spent.