I was trying to solve a variance problem, but after the for loop I can't the sum values to finally dived by the numbers of items in the list.
lista = [1.86, 1.97, 2.05, 1.91, 1.80, 1.78]
n = len(lista) #NUMBERS OF DATA IN THE LIST
MA = sum(lista)/n #ARITHMETIC MEAN
for x in lista:
y = pow(x - MA, 2) #SUBTRACTION OF ALL DATA BY THE MA, RAISED TO THE POWER OF 2
print(y)
print(sum(y)/n) # AND THAT IS IT, I CAN'T FINISH
I‘m trying to do this work for days and I didn't discovery yet. Is it possible to finish or should I just quit it because there are better ways to solve it?
The result of the variance must be: 0.008891666666666652
PS: I don't want to have to install other programs or libs like pandas or numpy
You are just updating the value of y each time you run the loop so y will be a single element after it reached end of loop.
What you are printing is :
sum(pow(1.78 - MA, 2)/2)
So either you store each value of y inside function in a array of simply do a thing
y=0
y = pow(x-ma, 2)
y += y
I did it again and found the result, thanks Deepak Singh for your help, the asnwer is:
lista = [1.86, 1.97, 2.05, 1.91, 1.80, 1.78]
n = len(lista)
MA = sum(lista)/n
y = 0
for x in lista:
subts = pow(x - MA, 2)
y = (y + subts)
print("Varience:", y/n)
Related
I've created this code and it gives me this error message:
Error using *
Incorrect dimensions for matrix multiplication.
Error in poli3 = sin(pi*a) ...
Below I show one function used in the code. I don't know if the problem comes from the value given by derivadan or what.
x = -1:0.01:1; % Intervalo en el que se evaluará el polinomio de Taylor
y = sin(pi*x); % Función
a = 0;
derivada3 = derivadan(0.01, 3, a);
derivada7 = derivadan(0.01, 7, a);
derivada3_vec = repmat(derivada3, size(x - a));
derivada7_vec = repmat(derivada7, size(x - a));
poli3 = sin(pi*a) + derivada3_vec*(x - a) + (derivada3_vec*(x - a).^2)/factorial(2) + (derivada3_vec*(x - a).^3)/factorial(3);
poli7 = sin(pi*a) + derivada7_vec*(x - a) + (derivada7_vec*(x - a).^2)/factorial(2) + (derivada7_vec*(x - a).^3)/factorial(3) + (derivada7_vec*(x - a).^4)/factorial(4) + (derivada7_vec*(x - a).^5)/factorial(5) + (derivada7_vec*(x - a).^6)/factorial(6) + (derivada7_vec*(x - a).^7)/factorial(7);
figure
plot(x, poli3, 'r', x, poli7, 'b')
legend('Taylor grau 3', 'Taylor grau 7')
title('Grafica Taylor 3 grau vs Grafica Taylor 7 grau')
function Yd = derivadan(h, grado, vecX)
Yd = zeros(size(vecX));
for i = 1:grado
Yd = (vecX(2:end) - vecX(1:end-1)) / h;
vecX = Yd;
end
end
In MATLAB one can go 2 ways when developing taylor series; the hard way and the easy way.
As asked, first the HARD way :
close all;clear all;clc
dx=.01
x = -1+dx:dx:1-dx;
y = sin(pi*x);
a =0;
[va,na]=find(x==a)
n1=3
D3y=zeros(n1,numel(x));
for k=1:1:n1
D3y(k,1:end-k)=dn(dx,k,y);
end
T3=y(na)+sum(1./factorial([1:n1])'.*D3y(:,na).*((x(1:end-n1)-a)'.^[1:n1])',1);
n2=7
D7y=zeros(n2,numel(x));
for k=1:1:n2
D7y(k,1:end-k)=dn(dx,k,y);
end
T7=y(na)+sum([1./factorial([1:n2])]'.*D7y(:,na).*((x(1:end-n2)-a)'.^[1:n2])',1);
figure(1);ax=gca
plot(ax,x(1:numel(T7)),T3(1:numel(T7)),'r')
grid on;hold on
xlabel('x')
plot(ax,x(1:numel(T7)),T7(1:numel(T7)),'b--')
plot(ax,x(1:numel(T7)),y(1:numel(T7)),'g')
axis(ax,[-1 1 -1.2 1.2])
legend('T3', 'T7','sin(pi*x)','Location','northeastoutside')
the support function being
function Yd = dn(h, n, vecX)
Yd = zeros(size(vecX));
for i = 1:n
Yd = (vecX(2:end) - vecX(1:end-1))/h;
vecX = Yd;
end
end
Explanation
1.- The custom function derivadan that I call dn shortens one sample for every unit up in grado where grado is the derivative order.
For instance, the 3rd order derivative is going to be 3 samples shorter than the input function.
This causes product mismatch and when later on attempting plot it's going to cause plot error.
2.- To avoid such mismatchs ALL vectors shortened to the size of the shortest one.
x(1:end-a)
is a samples shorter than x and y and can be used as reference vector in plot.
3.- Call function derivadan (that I call dn) correctly
dn expects as 3rd input (3rd from left) a vector, the function values to differentiate, yet you are calling derivadan with a in 3rd input field. a is scalar and you have set it null. Fixed it.
derivada3 = derivadan(0.01, 3, a);
should be called
derivada3 = derivadan(0.01, 3, y);
same for derivada7
4.- So
error using * ...
error in poly3=sin(pi*a) ...
MATLAB doesn't particularly mean that there's an error right on sin(pi*a) , that could be, but it's not the case here >
MATLAB is saying : THERE'S AN ERROR IN LINE STARTING WITH
poly3=sin(pi*a) ..
MATLAB aborts there.
Same error is found in next line starting with
poly7 ..
Since sin(pi*a)=0 because a=0 yet all other terms in sum for poly3 are repmat outcomes with different sizes all sizes being different and >1 hence attempting product of different sizes.
Operator * requires all terms have same size.
5.- Syntax Error
derivada3_vec = repmat(derivada3, size(x - a))
is built is not correct
this line repeats size(x) times the nth order derivative !
it's a really long sequence.
Now the EASY way
6.- MATLAB already has command taylor
syms x;T3=taylor(sin(pi*x),x,0)
T3 = (pi^5*x^5)/120 - (pi^3*x^3)/6 + pi*x
syms x;T3=taylor(sin(pi*x),x,0,'Order',3)
T3 = pi*x
syms x;T3=taylor(sin(pi*x),x,0,'Order',7)
T7 = (pi^5*x^5)/120 - (pi^3*x^3)/6 + pi*x
T9=taylor(sin(pi*x),x,0,'Order',9)
T9 =- (pi^7*x^7)/5040 + (pi^5*x^5)/120 - (pi^3*x^3)/6 + pi*x
It really simplfies taylor series development because it readily generates all that is needed to such endeavour :
syms f(x)
f(x) = sin(pi*x);
a=0
T_default = taylor(f, x,'ExpansionPoint',a);
T8 = taylor(f, x, 'Order', 8,'ExpansionPoint',a);
T10 = taylor(f, x, 'Order', 10,'ExpansionPoint',a);
figure(2)
fplot([T_default T8 T10 f])
axis([-2 3 -1.2 1.2])
hold on
plot(a,f(a),'r*')
grid on;xlabel('x')
title(['Taylor Series Expansion x =' num2str(a)])
a=.5
T_default = taylor(f, x,'ExpansionPoint',a);
T8 = taylor(f, x, 'Order', 8,'ExpansionPoint',a);
T10 = taylor(f, x, 'Order', 10,'ExpansionPoint',a);
figure(3)
fplot([T_default T8 T10 f])
axis([-2 3 -1.2 1.2])
hold on
plot(a,f(a),'r*')
grid on;xlabel('x')
title(['Taylor Series Expansion x =' num2str(a)])
a=1
T_default = taylor(f, x,'ExpansionPoint',a);
T8 = taylor(f, x, 'Order', 8,'ExpansionPoint',a);
T10 = taylor(f, x, 'Order', 10,'ExpansionPoint',a);
figure(4)
fplot([T_default T8 T10 f])
axis([-2 3 -1.2 1.2])
hold on
plot(a,f(a),'r*')
grid on;xlabel('x')
title(['Taylor Series Expansion x =' num2str(a)])
thanks for reading
I checked the line that you were having issues with; it seems that the error is in the derivada3_vec*(x - a) (as well as in the other terms that use derivada3_vec).
Looking at the variable itself: derivada3_vec is an empty vector. Going back further, the derivada3 variable is also an empty vector.
Your issue is in your function derivadan. You're inputting a 1x1 vector (a = [0]), but the function assumes that a is at least 1x2 (or 2x1).
I suspect there are other issues, but this is the cause of your error message.
I have Three numpy arrays
X = [1,2,3,4,4,5,56,..,n]
Y = [1,2,344,4,4,4,..,n]
Z = [1,2,244,24,445,64,..,n]
I want to make output like this
final_list = [(X1,Y1,Z1),(X2,Y2,Z2),(X3,Y3,Z3), ... (Xn,Yn,Zn)]
And then to check if Z in any of them is > some threshold
Pop it up all with its correspondence X and Y
Is there some suggestions please?
I tried
np.conctatenate
but no any good results.
Thanks a lot:)
A simple way with if else could be:
X = [1,2,3,4,4,5,56]
Y = [1,2,344,4,4,4,89]
Z = [1,2,244,24,445,64,89]
d=[]
for i in range(len(X)):
if Z[i]>thresh:
print("print something")
else:
d.append([X[i],Y[i],Z[i]])
print(d)
if you check if z>thresh at the time of creating the list there is no need of poping those items later.
One way with just list comprehension:
out = [(x,y,z) for x,y,z in zip(X,Y,Z) if z<threshold]
With numpy you can do something like this:
xyz = np.array([X,Y,Z])
under_thresh = xyz[xyz[-1]<threshold]
I have two Fortran arrays in 2 and 3 dimensions, say a(nx,ny) and b(nx,ny,nz). In array a, I need to find out the satisfied points, say values > 0. Then I need to locate the vectors in array b having the same indexes of x and y of those satisfied points in a. What is the easiest and fast way to do it? The two arrays are big, and I don't want to search one element by one element. Hope I explain my problem clearly! thanks!
I'm not sure that this is the best method, but here's what I would do:
Put a where clause inside a do loop over the z-values. You can first get a 2D map of valid indices into a logical array if you don't want to recalculate the points every time:
program indices
implicit none
integer, parameter :: nx = 3000, ny = 400, nz = 500
integer, dimension(nx, ny) :: a
integer, dimension(nx, ny, nz) :: b
logical, dimension(nx, ny) :: valid_points
integer :: x, y, z
do y = 1, ny
do x = 1, nx
a(x, y) = x - y
end do
end do
valid_points = (a > 0)
do z = 1, nz
where(valid_points)
b(:, :, z) = z
else where
b(:, :, z) = 0
end where
end do
end program indices
I am trying to create two data sets, one which summarizes data by 2 groups which I have done using the following code:
x = rnorm(1:100)
g1 = sample(LETTERS[1:3], 100, replace = TRUE)
g2 = sample(LETTERS[24:26], 100, replace = TRUE)
aggregate(x, list(g1, g2), mean)
The second needs to summarize the data by the first group and NOT the second group.
If we consider the possible pairs from the previous example:
A - X B - X C - X
A - Y B - Y C - Y
A - Z B - Z C - Z
The second dataset should to summarize the data as the average of the outgroup.
A - not X
A - not Y
A - not Z etc.
Is there a way to manipulate aggregate functions in R to achieve this?
Or I also thought there could be dummy variable that could represent the data in this way, although I am unsure how it would look.
I have found this answer here:
R using aggregate to find a function (mean) for "all other"
I think this indicates that a dummy variable for each pairing is necessary. However if there is anyone who can offer a better or more efficient way that would be appreciated, as there are many pairings in the true data set.
Thanks in advance
First let us generate the data reproducibly (using set.seed):
# same as question but added set.seed for reproducibility
set.seed(123)
x = rnorm(1:100)
g1 = sample(LETTERS[1:3], 100, replace = TRUE)
g2 = sample(LETTERS[24:26], 100, replace = TRUE)
Now we have two solutions both of which use aggregate:
1) ave
# x equals the sums over the groups and n equals the counts
ag = cbind(aggregate(x, list(g1, g2), sum),
n = aggregate(x, list(g1, g2), length)[, 3])
ave.not <- function(x, g) ave(x, g, FUN = sum) - x
transform(ag,
x = NULL, # don't need x any more
n = NULL, # don't need n any more
mean = x/n,
mean.not = ave.not(x, Group.1) / ave.not(n, Group.1)
)
This gives:
Group.1 Group.2 mean mean.not
1 A X 0.3155084 -0.091898832
2 B X -0.1789730 0.332544353
3 C X 0.1976471 0.014282465
4 A Y -0.3644116 0.236706489
5 B Y 0.2452157 0.099240545
6 C Y -0.1630036 0.179833987
7 A Z 0.1579046 -0.009670734
8 B Z 0.4392794 0.033121335
9 C Z 0.1620209 0.033714943
To double check the first value under mean and under mean.not:
> mean(x[g1 == "A" & g2 == "X"])
[1] 0.3155084
> mean(x[g1 == "A" & g2 != "X"])
[1] -0.09189883
2) sapply Here is a second approach which gives the same answer:
ag <- aggregate(list(mean = x), list(g1, g2), mean)
f <- function(i) mean(x[g1 == ag$Group.1[i] & g2 != ag$Group.2[i]]))
ag$mean.not = sapply(1:nrow(ag), f)
ag
REVISED Revised based on comments by poster, added a second approach and also made some minor improvements.
I have the following problem. Say I have a vector:
v = [1,2,3,4,5,1,2,3,4,...]
I want to sequentially sample points from the vector, that have an absolute maginute difference higher than a threshold from a previously sampled point. So say my threshold is 2.
I start at the index 1, and sample the first point 1. Then my condition is met at v[3], and I sample 3 (since 3-1 >= 2). Then 3, the new sampled point becomes the reference, that I check against. The next sampled point is 5 which is v[5] (5-3 >= 2). Then the next point is 1 which is v[6] (abs(1-5) >= 2).
Unfortunately my code in R, is taking too long. Basically I am scanning the array repeatedly and looking for matches. I think that this approach is naive though. I have a feeling that I can accomplish this task in a single pass through the array. I dont know how though. Any help appreciated. I guess the problem I am running into is that the location of the next sample point can be anywhere in the array, and I need to scan the array from the current point to the end to find it.
Thanks.
I don't see a way this can be done without a loop, so here is one:
my.sample <- function(x, thresh) {
out <- x
i <- 1
for (j in seq_along(x)[-1]) {
if (abs(x[i]-x[j]) >= thresh) {
i <- j
} else {
out[j] <- NA
}
}
out[!is.na(out)]
}
my.sample(x = c(1:5,1:4), thresh = 2)
# [1] 1 3 5 1 3
You can do this without a loop using a bit of recursion:
vsearch = function(v, x, fun=NULL) {
# v: input vector
# x: threshold level
if (!length(v) > 0) return(NULL)
y = v-rep(v[1], times=length(v))
if (!is.null(fun)) y = fun(y)
i = which(y >= x)
if (!length(i) > 0) return(NULL)
i = i[1]
return(c(v[i], vsearch(v[-(1:(i-1))], x, fun=fun)))
}
With your vector above:
> vsearch(c(1,2,3,4,5,1,2,3,4), 2, abs)
[1] 3 5 1 3