Cannot understand this pi calculation algorithm - c

I saw a pi calculation algorithm on a website. It looks like this:
#include <stdio.h>
int a[52514],b,c=52514,d,e,f=1e4,g,h;
main(){
for(;b=c-=14;h=printf("%04d",e+d/f)){
for(e=d%=f;g=--b*2;d/=g){
d=d*b+f*(h?a[b]:f/5);
a[b]=d%--g;}
}
}
It was said that this code is based on the following expansion, but I do not understand the relationship between the code and the expansion:
pi = sum( (i!)^2 * 2^(i+1) / (2i+1)! ) (i = 0 to infinity)
Could you help explain it? Thanks.

pi + 3 = sum( (m!)^2 * 2^m * m / (2m)! ) (m = 1 to infinity).
Simon Plouffe's algorithm uses it.
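For readers trying to connect the code to the series in the question, here is a de-obfuscated sketch of the same kind of spigot loop. It is not the exact program from the question: the name pi_digits, the output buffer and the limit of 64 groups are assumptions made for illustration, and long long is used so the intermediate products do not overflow. The idea is that the first expansion can be written in nested form, pi = 2 + 1/3*(2 + 2/5*(2 + 3/7*(2 + ...))). Each outer pass multiplies the stored remainders by 10000, evaluates that nested form from the innermost term outward, and emits four digits.

```c
#include <stdio.h>

/* Writes 4*groups decimal digits of pi (no decimal point) into out,
 * which must hold at least 4*groups + 1 bytes. Hypothetical helper,
 * limited to groups <= 64 to keep the array small. */
void pi_digits(char *out, int groups)
{
    enum { BASE = 10000, MAX_TERMS = 14 * 64 };
    long long rem[MAX_TERMS + 1];
    long long carry = 0;          /* low 4 digits held back from the previous pass */
    int terms = 14 * groups;      /* roughly 14 series terms per 4 digits */

    out[0] = '\0';
    if (groups < 1 || groups > 64)
        return;

    for (int i = 0; i <= terms; i++)
        rem[i] = BASE / 5;        /* every term starts as "2", scaled by 1000 */

    for (int k = terms; k > 0; k -= 14) {
        long long d = 0;
        for (int i = k; i > 0; i--) {
            d += rem[i] * BASE;        /* shift this term up by four digits */
            rem[i] = d % (2 * i - 1);  /* keep the remainder for the next pass */
            d /= (2 * i - 1);          /* divide by the odd denominator */
            if (i > 1)
                d *= (i - 1);          /* numerator of the next outer factor */
        }
        /* Print the digits held back earlier plus the carry out of this pass. */
        snprintf(out, 5, "%04lld", carry + d / BASE);
        out += 4;
        carry = d % BASE;
    }
}
```

Calling pi_digits(buf, 4) with a 17-byte buffer fills it with 16 digits. In the original one-liner, f plays the role of BASE, a[] is rem[], b is the term index, g is the odd denominator and e is the carry.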

Related

My loop isn't working well in this little game I just created

I've recently started learning C programming. I've tried web development in the past, like HTML and CSS, but never went very deep. I'm now learning from a book about C programming. I saw a dice game in it and tried to emulate it, just for fun and to learn how it works, but it didn't work well. (My intention was that if lifeGohan or lifeJoel gets to 0, it stops: one of them dies and 'the battle' stops.) I'm not following any guide for this specific game, which I created myself, so I'm just trying many things without good results.
Can you give me some advice, please?
This is my code. Attempt no. 1:
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <ctype.h>
main()
{
int attackJoel, attackGohan, lifeJoel, lifeGohan, KO;
time_t t;
srand(time(&t));
lifeGohan=20;
lifeJoel=20;
printf("Actual Life Joel: %d\tActual Life Gohan: %d\n", lifeJoel, lifeGohan);
do
{
attackJoel=(rand() % 6)+1;
attackGohan=(rand() % 6)+1;
lifeJoel=lifeJoel-attackGohan;
lifeGohan=lifeGohan-attackJoel;
printf("\nActual Life Joel: %d\tActual Life Gohan: %d\n", lifeJoel, lifeGohan);
}
while((lifeJoel>0) || (lifeGohan>0));
}
Attempt no. 2:
main()
{
int attackJoel, attackGohan, lifeJoel, lifeGohan, KO;
time_t t;
srand(time(&t));
lifeGohan=20;
lifeJoel=20;
printf("Actual Life Joel: %d\tActual Life Gohan: %d\n", lifeJoel, lifeGohan);
do
{
attackJoel=(rand() % 6)+1;
attackGohan=(rand() % 6)+1;
lifeJoel=lifeJoel-attackGohan;
lifeGohan=lifeGohan-attackJoel;
printf("\nActual Life Joel: %d\tActual Life Gohan: %d\n", lifeJoel, lifeGohan);
if((lifeJoel<=0) || (lifeGohan<=0))
{
KO=0;
}
}
while(KO==0);
}
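How does the loop stop? Did you initialise the variable? Does it ever change? In the second attempt KO is never initialised, and the only assignment sets it to 0, which is exactly the value that keeps while(KO==0) running.
Make KO non-zero to stop the loop. In the first attempt, || keeps the loop going while either fighter still has life left; use && so it stops as soon as one of them reaches 0.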
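A minimal sketch of the loop with a working stop condition, assuming the same rules as in the question (both start at 20, each round both roll one die). The struct battle and the function fight are hypothetical names introduced here so the loop can be shown on its own; the printing is left out.

```c
#include <stdlib.h>

/* Final state of one battle. */
struct battle {
    int lifeJoel;
    int lifeGohan;
    int rounds;
};

/* Plays rounds until at least one fighter is at 0 or below. */
struct battle fight(unsigned seed)
{
    struct battle b = { 20, 20, 0 };

    srand(seed);
    do {
        int attackJoel  = (rand() % 6) + 1;   /* die roll 1..6 */
        int attackGohan = (rand() % 6) + 1;

        b.lifeJoel  -= attackGohan;
        b.lifeGohan -= attackJoel;
        b.rounds++;
        /* Continue only while BOTH are alive, so the loop
         * ends as soon as either one drops to 0 or below. */
    } while (b.lifeJoel > 0 && b.lifeGohan > 0);

    return b;
}
```

From main you could call fight((unsigned)time(NULL)) and print the fields of the result. No KO flag is needed, because the loop condition itself carries the stop rule.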

Increase of execution time while using multithreaded FFTW

I am new to the FFTW library. I have successfully implemented 1D and 2D FFTs using FFTW, and I converted my 2D FFT code into a multithreaded 2D FFT. But the results were the opposite of what I expected: the multithreaded 2D FFT code takes longer to run than the serial 2D FFT code. I am missing something somewhere. I followed all the instructions given in the FFTW documentation to parallelize the code.
This is my parallelized 2D FFT C program:
#include <mpi.h>
#include <fftw3.h>
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <time.h>
#define N 2000
#define M 2000
#define index(i, j) ((j) + (i)*M)
int i, j;
void get_input(fftw_complex *in) {
for(i=0;i<N;i++){
for(j=0;j<M;j++){
in[index(i, j)][0] = sin(i + j);
in[index(i, j)][1] = sin(i * j);
}
}
}
void show_out(fftw_complex *out){
for(i=0;i<N;i++){
for(j=0;j<M;j++){
printf("%lf %lf \n", out[index(i, j)][0], out[index(i, j)][1]);
}
}
}
int main(){
clock_t start, end;
double time_taken;
start = clock();
int a = fftw_init_threads();
printf("%d\n", a);
fftw_complex *in, *out;
fftw_plan p;
in = (fftw_complex *)fftw_malloc(N * M * sizeof(fftw_complex));
out = (fftw_complex *)fftw_malloc(N * M * sizeof(fftw_complex));
get_input(in);
fftw_plan_with_nthreads(4);
p = fftw_plan_dft_2d(N, M, in, out, FFTW_FORWARD, FFTW_ESTIMATE);
fftw_execute(p);
/*p = fftw_plan_dft_1d(N, out, out, FFTW_BACKWARD, FFTW_ESTIMATE);
fftw_execute(p);
puts("In Real Domain");
show_out(out);*/
fftw_destroy_plan(p);
fftw_free(in);
fftw_free(out);
fftw_cleanup_threads();
end = clock();
time_taken = ((double) (end - start)) / CLOCKS_PER_SEC;
printf("%g \n", time_taken);
return 0;
}
Can someone please help me find the mistake I am making?
That kind of behavior is typical of incorrect binding.
Generally speaking, OpenMP threads should all be bound to cores of the same socket in order to avoid NUMA effects (which can make performance suboptimal or even worse).
Also, make sure the MPI tasks are correctly bound (each task should be bound to several cores of the same socket, and you should use one OpenMP thread per core).
Because of MPI, there is a risk that your OpenMP threads end up time sharing.
First, I recommend you print both the MPI and the OpenMP binding.
How to achieve that depends on both the MPI library and the OpenMP runtime. If you use Open MPI and Intel compilers, you can run KMP_AFFINITY=verbose mpirun --report-bindings --tag-output ...
Then, as suggested earlier, I recommend you start simple and increase the complexity:
1. 1 MPI task and 1 OpenMP thread
2. 1 MPI task and x OpenMP threads (x is the number of cores on one socket)
3. x MPI tasks and 1 OpenMP thread per task
4. x MPI tasks and y OpenMP threads per task
Hopefully, 2 will be faster than 1, and 4 will be faster than 3.
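Separately from binding, one thing worth ruling out is the timer itself. clock() reports processor time used by the program, and on Linux that is summed over all threads, so a run with 4 busy threads can report a larger number even when it finishes sooner. The program in the question also times thread initialisation, allocation, input generation and planning together with the transform. The sketch below shows one way to time a single call against a wall clock instead; wall_seconds and time_call are hypothetical helpers, not part of FFTW, and POSIX clock_gettime is assumed to be available.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Wall-clock seconds read from a monotonic clock. */
double wall_seconds(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Returns how many wall-clock seconds one call of fn(arg) takes. */
double time_call(void (*fn)(void *), void *arg)
{
    double t0 = wall_seconds();

    fn(arg);
    return wall_seconds() - t0;
}
```

With FFTW, fn could be a small wrapper that only calls fftw_execute on an already created plan, so that the serial and threaded runs are compared on the transform alone.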

Sampling with a Raspberry Pi in C

I need to call analogRead every 4 ms, but when I measured the execution time in my code, it printed this:
It's not 4 ms.
My code:
#include <time.h>
clock_t start,end;
double tempo;
for(i=1; i <= 20; i++) {
start=clock();
x = analogRead (BASE + chan);
printf("%d\n", x);
delay(4);
end=clock();
tempo=((double)(end-start))/CLOCKS_PER_SEC;
printf("%f \n", tempo);
}
It does not matter which function you use: Linux is not an RTOS, so you can forget about real-time behaviour unless you patch the kernel with PREEMPT_RT. There is a lot of information about this topic online.
This is too complex a topic for an SO answer, but I hope this points you in the right direction.
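Two smaller points can be sketched in code, with the caveat above that none of this makes Linux hard real time. First, clock() counts processor time, so time spent sleeping inside delay(4) is mostly not counted; a monotonic wall clock is the better stopwatch. Second, sleeping a fixed 4 ms after each read makes the period "read + print + 4 ms", whereas sleeping until an absolute deadline keeps the average period near 4 ms. In the sketch below, sample_every and timespec_add_ns are hypothetical helpers, read_sample is a placeholder callback standing in for analogRead (BASE + chan), and POSIX clock_nanosleep is assumed to be available.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Advance t by ns nanoseconds, keeping tv_nsec below one second. */
void timespec_add_ns(struct timespec *t, long ns)
{
    t->tv_nsec += ns;
    while (t->tv_nsec >= 1000000000L) {
        t->tv_nsec -= 1000000000L;
        t->tv_sec += 1;
    }
}

/* Stores n samples in out[], one every period_ns nanoseconds, sleeping
 * until absolute deadlines so per-iteration delays do not accumulate.
 * Returns the measured wall-clock duration in seconds. */
double sample_every(int n, long period_ns, int (*read_sample)(void), int *out)
{
    struct timespec start, next, end;

    clock_gettime(CLOCK_MONOTONIC, &start);
    next = start;
    for (int i = 0; i < n; i++) {
        out[i] = read_sample();
        timespec_add_ns(&next, period_ns);           /* next deadline */
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

For 20 samples at 4 ms, sample_every(20, 4000000L, ...) should return a value close to 0.08 on an idle system, but individual wake-ups can still be late because the scheduler gives no guarantee.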

How to measure elapsed time below a nanosecond on x86?

I have searched for and tried many approaches to measuring elapsed time, and there are many questions on this topic. For example, this question is very good, but I couldn't find a good method for when you need an accurate time recorder. So I want to share my method here so that it can be used, and corrected if something is wrong.
UPDATE & NOTE: this question is about benchmarking at a resolution below one nanosecond. That is completely different from using clock_gettime(CLOCK_MONOTONIC,&start);, which records times longer than one nanosecond.
UPDATE: A common way to measure speedup is to repeat the section of the program being benchmarked. But, as mentioned in a comment, the repeated code might be optimized differently when the researcher relies on autovectorization.
NOTE: Measuring the elapsed time of a single repetition is not accurate enough. In some cases my results show that the section must be repeated more than 1K or 1M times to get the smallest time.
SUGGESTION: I'm not familiar with shell programming (I just know some basic commands), but it might be possible to measure the smallest time without repeating inside the program.
MY CURRENT SOLUTION: To avoid branches, I repeat the code section using a macro #define REP_CODE(X) X X X... X X, where X is the code section I want to benchmark, as follows:
//numbers
#define FMAX1 MAX1*MAX1
#define COEFF 8
int __attribute__(( aligned(32))) input[FMAX1+COEFF]; //= {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17};
int __attribute__(( aligned(32))) output[FMAX1];
int __attribute__(( aligned(32))) coeff[COEFF] = {1,2,3,4,5,6,7,8};//= {1,1,1,1,1,1,1,1};//; //= {1,2,1,2,1,2,1,2,2,1};
int main()
{
REP_CODE(
t1_rdtsc=_rdtsc();
//Code
for(i = 0; i < FMAX1; i++){
for(j = 0; j < COEFF; j++){//IACA_START
output[i] += coeff[j] * input[i+j];
}//IACA_END
}
t2_rdtsc=_rdtsc();
ttotal_rdtsc[ii++]=t2_rdtsc-t1_rdtsc;
)
// The smallest element in `ttotal_rdtsc` is the answer
}
This does not affect the optimization, but it is restricted by code size, and in some cases the compile time is too long.
Any suggestions or corrections?
Thanks in advance.
If you have a problem with the autovectorizer and want to limit it, just add an asm("#something"); after your begin_rdtsc; it will keep the do-while loop separate. I just checked, and it vectorized your posted code, which the autovectorizer had been unable to vectorize.
I changed your macro; you can use it:
long long t1_rdtsc, t2_rdtsc, ttotal_rdtsc[do_while], ttbest_rdtsc = 99999999999999999, elapsed, elapsed_rdtsc=do_while, overal_time = OVERAL_TIME, ttime=0;
int ii=0;
#define begin_rdtsc\
do{\
asm("#mmmmmmmmmmm");\
t1_rdtsc=_rdtsc();
#define end_rdtsc\
t2_rdtsc=_rdtsc();\
asm("#mmmmmmmmmmm");\
ttotal_rdtsc[ii]=t2_rdtsc-t1_rdtsc;\
}while (++ii<do_while);\
for(ii=0; ii<do_while; ii++){\
if (ttotal_rdtsc[ii]<ttbest_rdtsc){\
ttbest_rdtsc = ttotal_rdtsc[ii];}}\
printf("\nthe best is %lld in %lld iteration\n", ttbest_rdtsc, elapsed_rdtsc);
I have developed my first answer further and arrived at this solution. But I still want a better solution, because it is very important to measure the time accurately and with the least impact. I put this part in a header file and include it in the main program files.
//Header file header.h
#define count 1000 // number of repetition
long long t1_rdtsc, t2_rdtsc, ttotal_rdtsc[count], ttbest_rdtsc = 99999999999999999, elapsed, elapsed_rdtsc=count, overal_time = OVERAL_TIME, ttime=0;
int ii=0;
#define begin_rdtsc\
do{\
t1_rdtsc=_rdtsc();
#define end_rdtsc\
t2_rdtsc=_rdtsc();\
ttotal_rdtsc[ii]=t2_rdtsc-t1_rdtsc;\
}while (++ii<count);\
for(ii=0; ii<count; ii++){\
if (ttotal_rdtsc[ii]<ttbest_rdtsc){\
ttbest_rdtsc = ttotal_rdtsc[ii];}}\
printf("\nthe best is %lld in %lldth iteration \n", ttbest_rdtsc, elapsed_rdtsc);
//Main program
#include "header.h"
.
.
.
int main()
{
//before the section
begin_rdtsc
//put your code here to measure the clocks.
end_rdtsc
return 0;
}
I recommend using this method on the x86 micro-architecture.
NOTE:
NUM_LOOP should be a number large enough to increase accuracy by repeating your code and recording the best time.
ttbest_rdtsc must start out bigger than the worst time; I recommend setting it as high as possible.
I used OVERAL_TIME (you might not want it) as another stopping rule, because I used this for many kernels and in some cases NUM_LOOP was very big and I didn't want to change it. I planned OVERAL_TIME to limit the iterations and stop after a specific time.
UPDATE: The whole program is this:
#include <stdio.h>
#include <x86intrin.h>
#define NUM_LOOP 100 //executes your code NUM_LOOP times and keeps the smallest time, to filter out overheads such as cache misses, etc.
int main()
{
long long t1_rdtsc, t2_rdtsc, ttotal_rdtsc, ttbest_rdtsc = 99999999999999999;
int do_while = 0;
do{
t1_rdtsc = _rdtsc();
//put your code here
t2_rdtsc = _rdtsc();
ttotal_rdtsc = t2_rdtsc - t1_rdtsc;
//store the smallest time:
if (ttotal_rdtsc<ttbest_rdtsc)
ttbest_rdtsc = ttotal_rdtsc;
}while (do_while++ < NUM_LOOP);
printf("\nthe best is %lld in %d repetitions\n", ttbest_rdtsc, NUM_LOOP );
return 0;
}
I have changed that to the following and added it to a header for myself, so that I can use it simply in my program.
#include <x86intrin.h>
#define do_while NUM_LOOP
#define OVERAL_TIME 999999999
long long t1_rdtsc, t2_rdtsc, ttotal_rdtsc, ttbest_rdtsc = 99999999999999999, elapsed, elapsed_rdtsc=do_while, overal_time = OVERAL_TIME, ttime=0;
#define begin_rdtsc\
do{\
t1_rdtsc=_rdtsc();
#define end_rdtsc\
t2_rdtsc=_rdtsc();\
ttotal_rdtsc=t2_rdtsc-t1_rdtsc;\
if (ttotal_rdtsc<ttbest_rdtsc){\
ttbest_rdtsc = ttotal_rdtsc;\
elapsed=(do_while-elapsed_rdtsc);}\
ttime+=ttotal_rdtsc;\
}while (elapsed_rdtsc-- && (ttime<overal_time));\
printf("\nthe best is %lld in %lldth iteration and %lld repetitions\n", ttbest_rdtsc, elapsed, (do_while-elapsed_rdtsc));
How to use this method? Well, it is very simple!
int main()
{
//before the section
begin_rdtsc
//put your code here to measure the clocks.
end_rdtsc
return 0;
}
Be creative; you can change it to measure the speedup in your program, etc.
An example of the output is:
the best is 9600 in 384751th iteration and 569179 repetitions
My tested code took 9600 clock cycles; the best was recorded in the 384751st iteration, and my code was run 569179 times.
I have tested them on GCC and Clang.

Stuck as a Beginner: C Programming

I am taking a C programming class this semester, and was somehow allowed to register despite not fulfilling the prerequisite. I thought I would still be able to handle it, but now that I have passed the point of no return for dropping it, I find myself completely lost.
For my current assignment, I am supposed to create a program that does a few simple trig operations and display the results. The main idea is that there is a building, and I am standing a certain distance from it.
For part A, I have to calculate the height of the building assuming I am standing 120 meters from the building and am looking at the top while tilting my head at a 30 degree angle (plus/minus 3 degrees).
Part B, assumes the building is 200ft tall, and I am standing 20ft away. What would be the angle I would have to tilt my head to see the top?
Part C, given the info in part B, how far is the distance (hypotenuse) from my head to the top of the building?
So far, I have written this:
#include <stdio.h>
#include <math.h>
#define MAX_ANGLE 33
#define MIN_ANGLE 27
#define DIST_A 120
#define DIST_B 20
#define HEIGHT_B 200
#define PI 3.14159
int main()
(
double MIN_ANGLE_R, MAX_ANGLE_R;
MIN_ANGLE_R = MIN_ANGLE * (PI / 180);
MAX_ANGLE_R = MAX_ANGLE * (PI / 180);
min_height = DIST_A * tan(MIN_ANGLE);
max_height = DIST_A * tan(MAX_ANGLE);
angle = atan(HEIGHT_B/DIST_B)/(PI/180);
hypotenuse = HEIGHT_B/tan(angle);
printf ("The minimum height is %6.2f meters.\nThe maximum height is%6.2f meters.\n\n",min_height,max_height);
printf ("The angle that youw ill tilt your head to see\nthe top of the building is %3.2f feet.\n",angle);
printf ("The distance from your head to the top of the building is %6.2f feet.\n",hypotenuse);
return 0;
)
When I try compiling the program, I keep getting errors that I don't know how to read. If anyone could read through my program and tell me what's missing, it would be a huge help.
Don't confuse () and {}. They mean different things.
Declare your variables.
You have to open and close main() with "{ ... }" instead of "( ... )". Also, you have to declare all the variables you are using (not just MIN_ANGLE_R and MAX_ANGLE_R).
I'm not a C programmer, but I suspect your trig functions work in radians and you seem to be passing degrees.
