I am fairly new to multiprocessing and parallelization. I am currently implementing Simpson's rule and Simpson's 3/8 rule, and I already have both algorithms working in C in serial form.
I need some help parallelizing both of these algorithms using OpenMP and Pthreads. Any recommendation is welcome and appreciated.
//Simpson.c
#include <stdio.h>
#include <stdlib.h>
double function(double x) //function to which the method is applied
{
return 0.5*x;
}
int main(int argc, char*argv[])
{
double resultado = 0.0; //result of the integral after applying Simpson's method
double a = 0.0; //lower limit of the integral
double b = 0.0; //upper limit of the integral
int n = 0; //number of partitions of the interval; must be an even number
int i = 0; //loop variable for iterating over the partitions
double h = 0.0; //step between consecutive values of "x"
double calc = 0.0; //intermediate variable accumulating the function evaluations at the different "x"
//variables holding the different values of "x": the lower limit, midpoint, and upper limit of each partition
double x0 = 0.0;
double x1 = 0.0;
double x2 = 0.0;
printf("Introduce el límite inferior: ");
scanf("%lf", &a);
printf("Introduce el límite superior: ");
scanf("%lf", &b);
printf("Introduce la cantidad de particiones (número par): ");
scanf("%d", &n);
h = (b-a)/n; //compute the step used to iterate over all the partitions
//The integral uses xj, x(j+1) and x(j+2) on the subinterval [xj, x(j+2)] for each j = 0, 2, 4, ..., n-2
for (i=0; i<n; i+=2)
{
x0 = a + h*i; //lower limit
calc = function(x0); //evaluate the function at x; plain assignment resets calc for each subinterval
x1 = a + h*(i+1); //midpoint
calc += 4*function(x1);
x2 = a + h*(i+2); //upper limit
calc += function(x2);
calc *= ((x2-x0)/6) ; //recompute the width between the limits and divide by 6
resultado += calc; //accumulate the contribution of each subinterval
}
printf("La aproximación a la integral es: %f \n", resultado);
}
#include<stdio.h>
#include<math.h>
// function to integrate
#define f(x) (0.5*(x))
int main()
{
float b; //upper limit
float a; //lower limit
float resultado=0.0;
float dx;
float k;
int i;
int N; //number of intervals or iterations
// Inputs
printf("Dame el limite inferior: ");
scanf("%f", &a);
printf("Dame el limite superior: ");
scanf("%f", &b);
printf("Número de iteraciones/intervalos (número par): ");
scanf("%d", &N);
//Compute delta x
dx = (b - a)/N;
//Sums of the integration
resultado = f(a) + f(b); //first sum
for(i=1; i<= N-1; i++)
{
k = a + i*dx; //sample point in the interval
if(i%3 == 0)
{
resultado = resultado + 2 * f(k);
}
else
{
resultado = resultado + 3 * f(k);
}
}
resultado = resultado * dx*3/8;
printf("\nEl valor de la integral es: %f \n", resultado);
printf("\n");
return 0;
}
My first recommendation is to start by reading about the shared-memory paradigm and its pitfalls. Then read about Pthreads and OpenMP, and only then look at concrete OpenMP and Pthreads parallelizations and how they deal with some of the issues of the shared-memory paradigm. If you have to learn both, I would start with Pthreads and only then move on to OpenMP: the latter takes care of some of the cumbersome details and pitfalls of coding with the former.
I will try to give you a very general (not super detailed) recipe for approaching the parallelization of a code, using your second code and OpenMP as the example.
If you opt to parallelize an algorithm, first find the part of the code that is worth parallelizing (i.e., where the speedup outweighs the overhead of the parallelization). Profiling is fundamental for this; however, for small codes like yours it is pretty straightforward to figure out: it will typically be the loops, for instance:
for(i=1; i<= N-1; i++)
{
k = a + i*dx; //sample point in the interval
if(i%3 == 0)
{
resultado = resultado + 2 * f(k);
}
else
{
resultado = resultado + 3 * f(k);
}
}
To parallelize this code, which construct should you use: tasks? parallel for? You can look at the great answers to this topic on this SO thread. In your code, it is pretty clear that we should use parallel for (i.e., #pragma omp parallel for), which tells OpenMP to divide the iterations of that loop among the threads.
Next, think about whether each variable used inside the parallel region should be private to each thread or shared among threads. For that, OpenMP offers clauses such as private, firstprivate, lastprivate and shared. Another great SO thread about the subject can be found here. To evaluate this you really need to understand the shared-memory paradigm and how OpenMP handles it.
Then look for potential race conditions, interdependencies between variables and so on. OpenMP offers constructs such as critical, atomic operations and reductions, among others, to solve those issues (see the other great SO threads on atomic vs critical and reduction vs atomic). TL;DR: you should opt for the solution that gives you the best performance without compromising correctness.
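In your case a reduction on the variable "resultado" is clearly the best option. Because "k" is declared outside the loop, it must also be made private so that threads do not overwrite each other's value:
#pragma omp parallel for private(k) reduction(+: resultado)
for(i=1; i<= N-1; i++)
{
k = a + i*dx; //sample point in the interval
...
}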
Once you have guaranteed the correctness of your code, you can look at load-imbalance problems (i.e., differences in the amount of work performed by each thread). To deal with this issue OpenMP offers parallel for scheduling strategies such as dynamic and guided (another SO thread), along with the possibility of tuning the chunk size of those distributions.
In your code, every iteration costs the same, so the threads will perform the same amount of parallel work. There is no need for a dynamic or guided schedule; you can opt for the static one instead. The benefit of static over the others is that it does not introduce the overhead of coordinating the distribution of iterations among threads at runtime.
There is much more to look into to extract the maximum performance out of the architecture your code is running on, but for now I think this is a good enough start.
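Putting the steps above together, here is a minimal sketch of the Simpson 3/8 loop with a reduction and a static schedule. The function name simpson38_omp and the use of double instead of float are assumptions made for this illustration, not part of the original code. It needs an OpenMP-enabled build (for example the -fopenmp flag with GCC or Clang); without it the pragma is ignored and the loop simply runs serially.

```c
#include <assert.h>

/* Integrand from the question: f(x) = 0.5 * x. */
static double f(double x)
{
    return 0.5 * x;
}

/* Composite Simpson 3/8 rule on [a, b] with n subintervals.
   simpson38_omp is a hypothetical name; n must be a positive multiple of 3. */
double simpson38_omp(double a, double b, int n)
{
    assert(n > 0 && n % 3 == 0);

    const double dx = (b - a) / n;
    double sum = f(a) + f(b);

    /* reduction(+ : sum): every thread accumulates into its own private copy
       of sum, and OpenMP adds the copies together when the loop ends.
       schedule(static): iterations are split evenly up front, which fits
       a loop where each iteration costs the same. */
    #pragma omp parallel for reduction(+ : sum) schedule(static)
    for (int i = 1; i <= n - 1; i++) {
        /* k is declared inside the loop body, so it is private per thread. */
        const double k = a + i * dx;
        sum += (i % 3 == 0 ? 2.0 : 3.0) * f(k);
    }

    return sum * dx * 3.0 / 8.0;
}
```

Declaring the loop counter and k inside the loop avoids having to list them in a private clause, which is one less thing to get wrong.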
I am writing this post because I have a problem with my C socket code. When I send messages very quickly in a loop from the server to the clients, the client displays all the messages as one string and I don't know why. However, when I send each message followed by a sleep, I do not have this problem. Could someone explain why?
Server code without sleep:
int nombreDeJoueurs = joueurs + robots;
int taille;
for(i = 1; i<= nombreDeJoueurs; i++){
sprintf(saisie,"Le joueur %d a %d têtes de boeufs", i, vie[i]);
for (j = 1; j <= joueurs; j++){
taille = write(pollfds[j].fd,saisie,strlen(saisie));
}
}
Client output:
Le joueur 1 a 1 têtes de boeufsLe joueur 2 a 0 têtes de boeufsLe joueur 3 a 0 têtes de boeufsLe joueur 4 a 0 têtes de boeufs
Server code with sleep:
int nombreDeJoueurs = joueurs + robots;
int taille;
for(i = 1; i<= nombreDeJoueurs; i++){
sprintf(saisie,"Le joueur %d a %d têtes de boeufs", i, vie[i]);
for (j = 1; j <= joueurs; j++){
taille = write(pollfds[j].fd,saisie,strlen(saisie));
sleep(1);
}
}
Client output:
Le joueur 1 a 1 têtes de boeufs
Le joueur 2 a 0 têtes de boeufs
Le joueur 3 a 0 têtes de boeufs
Le joueur 4 a 0 têtes de boeufs
Your assumption that you are sending multiple messages is wrong.
From man write:
write() writes up to count bytes from the buffer starting at buf to the file referred to by the file descriptor fd.
To be more specific: all you are doing is writing a number of bytes into the underlying stream. The number of calls to write does not determine the number of 'messages'.
Assuming this is a stream (TCP) socket, the bytes from several write calls made in quick succession can be coalesced and delivered together, so a single read on the client may return several of your strings at once. When you pause between writes, each string has usually been sent and read before the next one is written.
That's why your sleep version appears to work. But it is the wrong approach. You have to implement a protocol between the server and the client. One example is to send the length of each message first (either as text or binary), so that the client first reads the length and then reads exactly that many bytes from the stream to form one message (string).
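As an illustration of the length-prefix idea, here is a minimal sketch that sends a 4-byte binary length in network byte order followed by the message bytes. The names write_all, read_all, send_msg and recv_msg are hypothetical helpers invented for this example, and error handling is deliberately simplistic (for instance, EINTR is not retried).

```c
#include <arpa/inet.h>  /* htonl, ntohl */
#include <stdint.h>
#include <string.h>
#include <unistd.h>     /* read, write */

/* Write exactly len bytes, looping over short writes. Returns 0 or -1. */
static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Read exactly len bytes, looping over short reads. Returns 0 or -1. */
static int read_all(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n <= 0)             /* error, or the peer closed the stream */
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Send one message: 4-byte big-endian length, then the bytes themselves. */
int send_msg(int fd, const char *msg)
{
    uint32_t len = (uint32_t)strlen(msg);
    uint32_t be = htonl(len);
    if (write_all(fd, &be, sizeof be) != 0)
        return -1;
    return write_all(fd, msg, len);
}

/* Receive one message into buf (capacity cap, NUL-terminated on success).
   Returns the message length, or -1 on error or if it does not fit. */
long recv_msg(int fd, char *buf, size_t cap)
{
    uint32_t be;
    if (read_all(fd, &be, sizeof be) != 0)
        return -1;
    uint32_t len = ntohl(be);
    if ((size_t)len + 1 > cap)
        return -1;
    if (read_all(fd, buf, len) != 0)
        return -1;
    buf[len] = '\0';
    return (long)len;
}
```

In the server loop, write(pollfds[j].fd, saisie, strlen(saisie)) would become send_msg(pollfds[j].fd, saisie), and the client would call recv_msg once per message, with no sleep needed.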
I have this code:
#include <stdio.h>
#include<conio.h>
main(){
float promAnual=0.0;
int numMeses, numToneladas,i, suma = 0, mesTon = 0;
float toneladas[12];
for(i = 1; i < 13; i++){
printf("Ingrese la cantidad de toneladas del mes #%d->", i);
scanf("%f", &toneladas[i] );
}
for(i = 1; i < 13; i++){
suma = suma + toneladas [i];
}
promAnual = suma / 12.0;
for(i = 1; i < 13; i++){
if(toneladas[i]>promAnual){
numMeses = numMeses + 1;
}
}
numToneladas = 0;
mesTon = 0;
for(i = 1; i < 13; i++){
if(toneladas[i]>toneladas[i+1]){
mesTon = i;
numToneladas = toneladas[i];
}
}
printf("El promedio anual es: %0.2f, %d mes(es) tuvieron mayor cosecha que el promedio anual, y el mayor numero de toneladas se produjo en el mes #%d con %0.2f", promAnual,numMeses,mesTon, numToneladas);
}
The issue is that the last two values in the final printf are wrong. I know why: the last "for" loop ends up assigning the last value of "i". But I don't know how to fix it.
You had better check the end condition of your "for" loops. The array toneladas has 12 elements, indexed from 0 to 11, but you index it from 1 to 12, which may be causing your issue.
Please try "for(i=0;i<12;i++)" and run the code again.
By the looks of the errors in your code, I'd say you're probably not reading a book to learn. That's unfortunate, because people who read books to learn C usually don't have this kind of trouble.
float toneladas[12];
for(i = 1; i < 13; i++){
printf("Ingrese la cantidad de toneladas del mes #%d->", i);
scanf("%f", &toneladas[i] );
}
Here you've declared a carton of 12 eggs, egg[0] through to egg[11] (write them out and count them one by one, and you'll see there are 12)... and then tried to insert into egg[12], which is out of bounds. Expect one smashed egg!
Throughout that code you're referring to that smashed egg again and again. I wouldn't be surprised if the recipe you're concocting is disastrous!
Speaking of smashed eggs, you have an uninitialised variable: int numMeses which you then use without initialisation: numMeses = numMeses + 1;...
int numMeses, numToneladas,i, suma = 0, mesTon = 0;
/* SNIP */
printf("El promedio anual es: %0.2f, %d mes(es) tuvieron mayor cosecha que el promedio anual, y el mayor numero de toneladas se produjo en el mes #%d con %0.2f", promAnual,numMeses,mesTon, numToneladas);
As you can see, numToneladas is declared as int. However, in your call to printf you're telling printf it's a double. You're lying to printf; no wonder it's lying back at you!
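To make the three points so far concrete (indices 0 through 11, counters initialised before use, and printf arguments that match their format specifiers), here is a minimal sketch. The helper names annual_average, months_above, best_month and print_report, and the MONTHS constant, are invented for this illustration; finding the largest month by comparing against the best value seen so far is also a substitution for the original neighbour comparison.

```c
#include <stdio.h>

#define MONTHS 12

/* Average of the 12 monthly values, using indices 0..11 only. */
double annual_average(const float tons[MONTHS])
{
    double sum = 0.0;               /* initialised before it is used */
    for (int i = 0; i < MONTHS; i++)
        sum += tons[i];
    return sum / MONTHS;
}

/* Number of months whose value is above the given average. */
int months_above(const float tons[MONTHS], double avg)
{
    int count = 0;                  /* initialised before it is used */
    for (int i = 0; i < MONTHS; i++)
        if (tons[i] > avg)
            count++;
    return count;
}

/* 0-based index of the month with the largest value. */
int best_month(const float tons[MONTHS])
{
    int best = 0;
    for (int i = 1; i < MONTHS; i++)
        if (tons[i] > tons[best])
            best = i;
    return best;
}

/* Every argument matches its specifier: %f gets a floating-point value
   (a float is promoted to double), %d gets an int. */
void print_report(const float tons[MONTHS])
{
    double avg = annual_average(tons);
    int best = best_month(tons);
    printf("Average: %0.2f, %d month(s) above average, best month #%d with %0.2f\n",
           avg, months_above(tons, avg), best + 1, tons[best]);
}
```

The month is stored 0-based and converted to 1-based only when it is printed, which keeps all array accesses inside the declared bounds.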
You appear to be including the non-portable header <conio.h>, though I see no point. Unlike most, you've not used a single function from that header! Why include a non-portable header you're not using?
My only guess is that you're copy/pasting from somewhere, and attempting to learn by unguided trial and error. As you've seen, that's dangerous; it'll cause you headaches due to strange, difficult to debug bugs like the one you've encountered today. You got lucky this time, because you noticed. If you write code like this in the real world you might cause someone injury!
Read a book! Do the exercises as you stumble across them. I can recommend K&R2E.
When I try to calculate the determinant of a 3rd-order matrix, I get bad output. For the 2nd-order one it works fine. To be more specific, I am not prompted for 9 values (v[1,1], v[1,2], etc.) but for more than that. I thought it was a problem with the arrays, but I don't know.
Code:
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
void main(void) {
int i,j,n,i_max,j_max,ordin,i_m,j_m;
long int det;
int v[3][3];
int e[3];
int nr=0;
printf("\nIntroduceti ordinul matricei:\t");
scanf("%d",&n);
if (n==2) {
i_max=n;
j_max=n;
printf("\nIntroduceti valorile matricei:\n");
for (i=1;i<=i_max;i++) {
for (j=1;j<=j_max;j++) {
printf("v[%d,%d]= ",i,j);
scanf("%d",&(v[i][j]));
nr++;
e[nr] = v[i][j];
}
}
det = (e[1]*e[4])-(e[2]*e[3]);
printf("\nDeterminantul matricei este: %ld\n",det);
if (det != 0)
printf("Matricea de ordinul %d este inversabila !",n);
else printf("Matricea de ordinul %d nu este inversabila!",n);
} else if (n==3) {
i_m=n;
j_m=n;
printf("\nIntroduceti valorile matricei:\n");
for (i=1; i<= i_m; i++) {
for (j=1; j<= j_m; j++) {
printf("v[%d,%d]= ",i,j);
scanf("%d",&(v[i][j]));
nr++;
e[nr] = v[i][j];
}
}
det = (e[1]*e[5]*e[9])+(e[2]*e[6]*e[7])+(e[3]*e[4]*e[8])-(e[3]*e[5]*e[7])-(e[2]*e[4]*e[9])-(e[1]*e[6]*e[8]);
printf("Determinantul matricei este: %ld\n",det);
if (det != 0)
printf("Matricea de ordinul %d este inversabila!",n);
else
printf("Matricea de ordinul %d nu este inversabila!",n);
} else
printf("Ordinul matricei este incorect!");
return 0;
}
First, you declare
int v[3][3];
int e[3];
e does not have enough elements: you use it to hold a copy of v, which has 3 x 3 = 9 elements.
So it seems that it would be solved by changing the second statement to
int e[9];
but that is not the end of the story.
In the for loops you loop not from 0 (which is the usual convention in C), but from 1, so every array needs one more element in each dimension!
So declare
int v[4][4]; /* for using indices from 1 to 3 */
int e[10]; /* for using indices from 1 to 9 */
First you say int v[3][3]; and int e[3]; and then you reach for elements like v[3][3] and e[4]. You seem to forget that arrays/matrices use 0-based indices. In other words, if you declare int v[3][3]; the only elements you should refer to are v[0][0]...v[2][2]. When reading data into v the for loops should go from 0 to 2, not from 1 to 3. Also, you clearly go out of the bounds of e, since it has 3 elements, but you go as far as e[9].
Btw, you also do not need to transfer things from v to e. After reading into v you can use its elements directly in the determinant formula: v[0][0] in place of e[1], v[0][1] in place of e[2], v[1][0] in place of e[4] and v[2][2] in place of e[9].
Also, you make pretty much the same errors in the part handling the second-order determinant.
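As a sketch of what the answers above describe, the determinants can be computed straight from the matrix with 0-based indices and no intermediate e array. The function names det2 and det3 are made up for this example; the 3x3 formula is the same rule of Sarrus expansion used in the question, only re-indexed.

```c
/* Determinant of a 2x2 matrix, valid indices are 0..1. */
long det2(int v[2][2])
{
    return (long)v[0][0] * v[1][1] - (long)v[0][1] * v[1][0];
}

/* Determinant of a 3x3 matrix by the rule of Sarrus, valid indices are 0..2. */
long det3(int v[3][3])
{
    return (long)v[0][0] * v[1][1] * v[2][2]
         + (long)v[0][1] * v[1][2] * v[2][0]
         + (long)v[0][2] * v[1][0] * v[2][1]
         - (long)v[0][2] * v[1][1] * v[2][0]
         - (long)v[0][1] * v[1][0] * v[2][2]
         - (long)v[0][0] * v[1][2] * v[2][1];
}
```

The input loops would then run as for (i = 0; i < n; i++) and for (j = 0; j < n; j++), printing i + 1 and j + 1 in the prompt if 1-based labels are wanted on screen.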
Like the title says, how can I calculate the sum of n numbers of the form: 1+(1/2!)+...+(1/n!)?
I already got the code for the harmonic series:
#include <stdio.h>
int main( void )
{
int v=0,i,ch;
double x=0.;
printf("Introduce un número paracalcular la suma: ");
while(scanf("%d",&v)==0 || v<=0)
{
printf("Favor de introducir numeros reales positivos: ");
while((ch=getchar())!='\n')
if(ch==EOF)
return 1;
}
for (i=v; i>=1; i--)
x+=1./i;
printf("EL valor de la serie es %f\n", x);
getch();
return 0;
}
The question here is: I already have the sum of the fractions, but how can I calculate the factorial of the variable "i"?
Note: I'm programming in C, with Dev-C++ 4.9.9.2
For $n$ bigger than around $20,$ just use the mathematical constant $e.$ Below $20,$ it really doesn't matter what you do.
Usually by means of a recursive method, one can create a factorial function. Note that:
$$
n! = \left\{
\begin{array}{lr}
1 & : n = 1\\
n(n-1)! & : n > 1
\end{array}
\right.
$$
In C, I guess this would mean something like
int factorial(int n)
{
return (n <= 1 ? 1 : n * factorial(n - 1));
}
ac = 1;
for (i=n; i>0; i--) ac = ac/i+1;
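saves the computation of factorials and keeps rounding error small. Note that this loop also includes the 1/0! term, so ac ends up as 1 + 1/1! + 1/2! + ... + 1/n!.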
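The nested evaluation above can be adapted to the exact sum asked about, 1 + 1/2! + ... + 1/n!, by stopping the loop before i reaches 1. This works because the sum can be written as 1 + (1/2)(1 + (1/3)(1 + ... (1 + 1/n))). The sketch below assumes n >= 1, and the function name inverse_factorial_sum is made up for the example.

```c
#include <assert.h>

/* Returns 1/1! + 1/2! + ... + 1/n! without computing any factorial.
   Evaluated from the innermost term outwards (Horner-style). */
double inverse_factorial_sum(int n)
{
    assert(n >= 1);

    double ac = 1.0;
    for (int i = n; i > 1; i--)
        ac = ac / i + 1.0;
    return ac;
}
```

For n = 2 the loop runs once and gives 1/2 + 1 = 1.5; as n grows the value approaches e - 1.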
No recursion and no factorials.
double fraction=1, sum=0;
long i,n;
for(i=1;i<=n;i++)
sum+=(fraction/=i);
Output:
n=1
sum=1
n=2
sum=1.5
n=3
sum=1.666666666
n=1073741824
sum=1.71828182845904553488480814849026501178741455078125
First, I apologize for my bad English...
So here is my problem. I'm testing out the FFTW3 library with a simple input signal, a constant one. I compute the FFT and get the expected result: a single peak at frequency 0, with everything else at 0.
Then I would like to recover my input with the backward FFT, but it doesn't work. Here is my code:
fftw_complex* imgIn;
fftw_complex* imgIn2;
fftw_complex* imgOut;
fftw_plan plan;
int taille = 100;
int i;
//Allocate the inputs and outputs
imgIn = fftw_malloc(sizeof(fftw_complex)*taille);
imgIn2 = fftw_malloc(sizeof(fftw_complex)*taille);
imgOut = fftw_malloc(sizeof(fftw_complex)*taille);
//Fill the input data for the FFT computation
for(i = 0 ; i < taille ; i++){
imgIn[i][0] = 1.0;
imgIn[i][1] = 0.0;
}
//Execution plan
plan = fftw_plan_dft_2d(taille/10, taille/10, imgIn, imgOut, FFTW_FORWARD, FFTW_ESTIMATE);
//Execute the FFT
fftw_execute(plan);
//Inverse
plan = fftw_plan_dft_2d(taille/10, taille/10, imgOut, imgIn2, FFTW_BACKWARD, FFTW_ESTIMATE);
for(i = 0 ; i < taille ; i++){
printf("%d : %g\n%d : %g\n", i, imgIn2[i][0], i, imgIn2[i][1]);
}
As you can see, I just try to perform a normal FFT and then reverse it. The problem is that my output imgIn2 is full of 0s instead of 1s and 0s...
So what's wrong with my code?
Thank you :)
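Your code does not execute the second plan: after creating the backward plan you need to call fftw_execute(plan) again before printing imgIn2. Also keep in mind that FFTW computes unnormalized transforms, so a forward transform followed by a backward one returns the input scaled by the number of points (100 here), not the input itself.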