Fortran count intrinsic leading to segfault - arrays

I am getting an unexpected segfault after replacing a loop with cleaner code that uses the count intrinsic. The original code was
n = 0
do ico = 1,cpatch%ncohorts
if (is_tropical(cpatch%pft(ico))) then
n = n + 1
end if
end do
where cpatch%pft is an array which by definition has length cpatch%ncohorts.
I substituted it with
n = count(is_tropical(cpatch%pft))
Here is_tropical is a logical array of size 17 defined as
is_tropical(1:13) = .false.
is_tropical(14:17) = .true.
Is there a situation where these two portions of code would not perform the same set of operations?

Related

Optimizing matrix calculations in for loops in Octave

I imported code from Matlab to Octave and the speed of certain functions seems to have dropped.
I looked into vectorization and could not come up with a solution with my limited knowledge.
Is there a way to speed this up?
n = 181;
N = 250;
for i=1:n
for j=1:n
par=0;
for k=1:N;
par=par+log2(1+(10.^(matrix1(j,i,matrix2(j,i))./10)./(matrix3(j,i).*double1+double2)));
end
resultingMatrix(j,i)=2.^((1/N).*par)-1;
end
end
Where dimensions are:
matrix1 = 181x181x2,
matrix2 = 181x181 --> containing values either 1 or 2 only,
matrix3 = 181x181,
double1, double2 = just doubles
Here's my testing code. I've completed your code by generating some random matrices:
n = 181;
N = 250;
matrix1 = rand(n,n,2);
matrix2 = randi(2,n,n);
matrix3 = rand(n,n);
double1 = 1;
double2 = 1;
tic
for i=1:n
for j=1:n
par=0;
for k=1:N
par=par+log2(1+(10.^(matrix1(j,i,matrix2(j,i))./10)./(matrix3(j,i).*double1+double2)));
end
resultingMatrix(j,i)=2.^((1/N).*par)-1;
end
end
toc
Note that the code inside the loop over k doesn't use k. This makes the loop superfluous. We can easily remove it. The loop does the same computation 250 times, adds up the results, and divides by 250, yielding the value of one of the repeated computations.
Another important thing to do is preallocate resultingMatrix, to avoid it growing with every loop iteration.
This is the resulting code:
tic
resultingMatrix2 = zeros(n,n);
for i=1:n
for j=1:n
par=log2(1+(10.^(matrix1(j,i,matrix2(j,i))./10)./(matrix3(j,i).*double1+double2)));
resultingMatrix2(j,i)=2.^par-1;
end
end
toc
max(abs((resultingMatrix(:)-resultingMatrix2(:))./resultingMatrix(:)))
The last line computes the maximum relative difference. It is 9.9424e-15 in my version of Octave. It will differ depending on the version, the system, and more. This error is the floating-point rounding error. Note that the original code, adding the same value 250 times, and then dividing it by 250, will produce a larger rounding error than the modified code. For example,
x = pi;
t = 0;
for i = 1:N
t = t + x;
end;
t = t / N;
t-x
gives -8.4377e-15, a similar rounding error to what we saw above.
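The same effect can be reproduced outside Octave. The following is a minimal sketch in Python (used here only for illustration; the function name mean_of_repeats is hypothetical and not part of the original code). It adds one double to itself N times and divides by N, then compares the result with the original value. The exact error will depend on the platform.

```python
import math

def mean_of_repeats(x, count):
    """Add x to an accumulator `count` times, then divide by `count`."""
    total = 0.0
    for _ in range(count):
        total += x  # each addition may round, and the rounding accumulates
    return total / count

N = 250
err = mean_of_repeats(math.pi, N) - math.pi
print(err)  # a tiny rounding error, many orders of magnitude below pi
```

Computing the value once avoids this accumulated rounding entirely.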
The original code took 81.5 s, the modified code takes only 0.4 s. This is not a gain of vectorization, it is just a gain of preallocation and not needlessly repeating the same computation over and over again.
Next, we can remove the other two loops by vectorizing the operations. The difficult bit here is matrix1(j,i,matrix2(j,i)). We can produce each of the n*n linear indices with (1:n*n).' + (matrix2(:)-1)*(n*n). This is not trivial, so I suggest you think about how this computation works. You need to know that linear indices count, starting at 1 for the top-left array element, first down, then right, then along the 3rd dimension. So 1:n*n is simply the linear indices for each of the elements of a 2D array, in order. To each of these we add n*n if we need to access the 2nd element along the 3rd dimension.
We now have the code
tic
index = reshape((1:n*n).' + (matrix2(:)-1)*(n*n), n, n);
par = log2(1+(10.^(matrix1(index)./10)./(matrix3.*double1+double2)));
resultingMatrix3 = 2.^par-1;
toc
max(abs((resultingMatrix(:)-resultingMatrix3(:))./resultingMatrix(:)))
This code produces the exact same result as my previous version, and runs in only 0.013 s, 30 times faster than the non-vectorized code, and 6000 times faster than the original code.
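To convince yourself that the linear-index formula picks the right elements, here is a small sketch in plain Python (for illustration only; the helper column_major_flatten and the tiny test arrays are assumptions, not part of the original code). It stores a 3x3x2 array in column-major order, applies the same formula with 1-based indices, and compares the result against direct element-by-element lookup.

```python
def column_major_flatten(a3, n, depth):
    """Flatten a3[j][i][k] (row j, column i, page k) with j varying fastest."""
    return [a3[j][i][k] for k in range(depth) for i in range(n) for j in range(n)]

n = 3
# Small made-up data: every element of matrix1 encodes its own position.
matrix1 = [[[100 * k + 10 * j + i for k in range(2)] for i in range(n)] for j in range(n)]
# matrix2 holds 1 or 2, selecting the page of matrix1 to read.
matrix2 = [[1 + (i + j) % 2 for i in range(n)] for j in range(n)]

flat1 = column_major_flatten(matrix1, n, 2)                   # matrix1(:)
flat2 = [matrix2[j][i] for i in range(n) for j in range(n)]   # matrix2(:)

# 1-based linear indices: (1:n*n).' + (matrix2(:)-1)*(n*n)
index = [p + (flat2[p - 1] - 1) * n * n for p in range(1, n * n + 1)]
picked = [flat1[idx - 1] for idx in index]

# Reference: direct lookup matrix1(j,i,matrix2(j,i)), in column-major order.
direct = [matrix1[j][i][matrix2[j][i] - 1] for i in range(n) for j in range(n)]
print(picked == direct)
```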

Using vectorization to reduce for loops, how to use conditional if?

I'm working on a Matlab project and I have a function that works, but I want to optimize it by reducing the number of for loops in my code.
I read about vectorization and could use it, but how would I include the if conditional statement if I have to test every single value one at a time?
function [y, zf] = MyFunction(x, b, zi)
y = zeros([length(x) 1]);
for n = 1:length(x)
for k=1:length(zi)
if n<k
y(n) = y(n) + b(k)*zi(length(zi)+(n-k)+1);
else
y(n) = y(n) + b(k)*x(n-k+1);
end
end
end
zf = x(length(x)-length(zi)+1:length(x));
I managed to vectorize the outer loop, but I can't figure out how to handle the conditional. I get the warning:
Variable 'n' might be set by a nonscalar operator
function [y, zf] = MyFunction(x, b, zi)
y = zeros([length(x) 1]);
n=1:1:length(x); % use of vectorization
for k=1:length(zi)
if n<k % problem with if
y = y + b(k)*zi(length(zi)+(n-k)+1);
else
y = y + b(k)*x(n-k+1);
end
end
zf = x(length(x)-length(zi)+1:length(x));
Currently n is a vector and k is a scalar, so n<k returns a logical vector. If you pass that directly to if, it behaves like if all(n<k), which is true only when every element of the vector is true. That's probably not the behavior you expect.
I don't know if there's a general way to vectorize code with if. But in your case, I can do it this way.
% use indices to deal with if
for k=1:length(zi)
y(1:k-1)=y(1:k-1)+b(k)*zi(length(zi)+2-k:end);
y(k:end)=y(k:end)+b(k)*x(1:length(x)-k+1);
end
I also noticed that if you concatenate zi and x, there is no need for two separate statements.
% assume both zi & x to be column vector
ziandx=[zi;x];
for k=1:length(zi)
y=y+b(k)*ziandx(length(zi)+2-k:length(zi)+length(x)-k+1);
end
Finally, even this for loop is unnecessary if you use conv (check the documentation for more detail).
ziandx=[zi;x];
s=conv(b(1:length(zi)),ziandx);
y=s(length(zi)+1:length(zi)+length(x))
I recommend reading all three methods and understanding the idea, so that you can do it yourself next time.
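As a sanity check on the index arithmetic, the following is a minimal sketch in plain Python (for illustration only; filter_loop, conv and filter_conv are hypothetical names, and conv here is a hand-written full convolution, not MATLAB's function). It compares the original double loop with the concatenate-and-convolve version on a small made-up input.

```python
def filter_loop(x, b, zi):
    """Direct translation of the original double loop (1-based n and k)."""
    L = len(zi)
    y = [0.0] * len(x)
    for n in range(1, len(x) + 1):
        for k in range(1, L + 1):
            if n < k:
                y[n - 1] += b[k - 1] * zi[L + (n - k) + 1 - 1]
            else:
                y[n - 1] += b[k - 1] * x[n - k + 1 - 1]
    return y

def conv(a, v):
    """Full discrete convolution of two sequences."""
    out = [0.0] * (len(a) + len(v) - 1)
    for i, ai in enumerate(a):
        for j, vj in enumerate(v):
            out[i + j] += ai * vj
    return out

def filter_conv(x, b, zi):
    """Concatenate zi and x, convolve with b, and keep the matching slice."""
    L = len(zi)
    s = conv(b[:L], zi + x)
    return s[L:L + len(x)]  # MATLAB: s(length(zi)+1 : length(zi)+length(x))

# Made-up test data
x = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [0.5, 0.25, 0.125]
zi = [10.0, 20.0]
print(filter_loop(x, b, zi))
print(filter_conv(x, b, zi))
```

Both calls should print the same values, which is the property the vectorized MATLAB versions rely on.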

Matlab variable size arrays

I am very new to Matlab, and I feel completely overwhelmed by the use of arrays. What is the most efficient implementation of the following C++ code in Matlab?
A = std::vector<double>();
for (int i = 0; i < 100; i++) {
if (complicatedBoolFunction(i)) {
A.push_back(i);
}
}
Edit: By efficiency I mean to use as little resources as possible to grow the array A - that is, to avoid copy-pasting it into temporary memory
You can do this in two ways:
Pre-allocating for the maximum size, and removing unused elements. This has the advantage of pre-allocating memory in case the condition is often met...
A = NaN(100,1); % preallocate; the semicolon suppresses printing
for ii = 0:99
if rand > 0.5 % some condition
A(ii+1) = ii; % some value
end
end
A(isnan(A)) = []; % remove unused elements
Appending to the array. This avoids making A way too large if appending is unlikely...
A = []; % empty array
for ii = 0:99
if rand > 0.5 % some condition
A(end+1, 1) = ii; % some value. Equivalent to 'A = [A; ii];'
end
end
A better, and more Matlab-esque way of doing this would be to vectorise your conditional function. This way you avoid looping and allocation issues...
ii = 0:99;
A = ii(rand(100, 1) > 0.5);
You can use any Boolean function you like as an indexing array, as long as it returns a logical array with the same number of elements as the array you're indexing (ii here) or integer indices of the elements to choose.
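One caveat with the NaN-sentinel approach above: it cannot tell an unused slot from a stored value that happens to be NaN. A common alternative is to preallocate, keep an explicit counter, and trim once at the end. Here is a minimal sketch of that idea in Python (used only for illustration; collect_with_counter and is_wanted are hypothetical names, and is_wanted stands in for the real condition).

```python
def is_wanted(i):
    """Placeholder condition; stands in for the real Boolean test."""
    return i % 3 == 0

def collect_with_counter(pred, n):
    """Preallocate n slots, fill from the front, then trim to the used part."""
    buffer = [0] * n      # maximum possible size
    count = 0             # number of slots actually used
    for i in range(n):
        if pred(i):
            buffer[count] = i
            count += 1
    return buffer[:count]  # single trim at the end

A = collect_with_counter(is_wanted, 100)
print(len(A))
```

The same pattern carries over to Matlab by writing to A(count) and finishing with A = A(1:count).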
The most efficient implementation of such C++ code, provided complicatedBoolFunction accepts a vector and returns a logical vector of the same size, would be
i = 0:99;
A = i(complicatedBoolFunction(i));
Anyway you can grow an array with concatenation, which is (or was) usually not recommended, like the following
A = [];
for i = 0:99
if (complicatedBoolFunction(i))
A = [A i];
end
end
or much more efficiently like this:
A = [];
for i = 0:99
if (complicatedBoolFunction(i))
A(end + 1) = i;
end
end

indexing matrix multiplication for matlab coder into C

I would like to speed up the for loop in my Euler/Ornstein–Uhlenbeck code by using the Matlab coder to translate it to C. However, there seems to be a problem with the indexing, and I cannot see what it is. Thanks a lot.
function [x,t] = Quick_Euler(N_t,N,A,G,dt)
x = zeros(N,N_t);
t = zeros(1,N_t);
for i = 2 : N_t
n = randn([N,1]);
x(1:N,i) = x(1:N,i-1) + dt^(.5)*G*n + (A*x(1:N,i))*dt;
t(1,i) = t(1,i-1) + dt;
end
end
Compiler:
>> coder -build Quick_Euler.prj
??? Size mismatch (size [:? x 1] ~= size [:? x :?]).
Mismatched varying and fixed sizes indicate a probable run-time error.
If this diagnostic is incorrect, use indexing to explicitly make the varying size fixed.

Variable dimensions in file don't match with dimension of indexing subscript if one dimension is singleton

I want to test a function func(par1,par2,par3) with all combinations of the parameters par1, par2 and par3 and store the output in a .mat file. My code looks like this at the moment:
n1 = 3;
n2 = 1;
n3 = 2;
parList1 = rand(1,n1); % n1,n2,n3 is just some integer
parList2 = rand(1,n2); % the lists are edited by hand in the actual script
parList3 = rand(1,n3);
saveFile = matfile('file.mat','Writable',true);
% allocate memory
saveFile.output = NaN(numel(parList1),numel(parList2),numel(parList3));
counter1 = 0;
for par1 = parList1
counter1 = counter1 + 1;
counter2 = 0; % reset inner counter
for par2 = parList2
counter2 = counter2 + 1;
counter3 = 0; % reset inner counter
for par3 = parList3
counter3 = counter3 + 1;
saveFile.output(counter1,counter2,counter3) = sum([par1,par2,par3]);
end
end
end
This works except if parList3 has only one item, i.e. if n3 = 1. Then the saveFile.output has singleton dimensions and I get the error
Variable 'output' has 2 dimensions in the file, this does not match the 3 dimensions in the indexing subscripts.
Is there an elegant way to fix this?
The expression in the for statement needs to be a row array, not a column array as in your example. The loops will exit after the first value with your code. Set a breakpoint on the saveFile.output command to see what I mean. With a column array, par1 will not be a scalar as desired, but the whole parList1 column. With a row array, par1 will iterate through each value of parList1 as intended.
Another thing is that you need to reset your inner counters (counter2 and counter3) or your second and third dimensions will blow up larger than you expected.
The n3=1 problem is expected behavior because matfile defines the variables with fixed number of dimensions and it will treat saveFile.output as 2D. Once you have fixed those issues, you can solve the n3=1 problem by changing the line,
saveFile.output(counter1,counter2,counter3) = sum([par1,par2,par3]);
to
if n3==1, saveFile.output(counter1,counter2) = sum([par1,par2,par3]);
else saveFile.output(counter1,counter2,counter3) = sum([par1,par2,par3]);
end
I have since realized that in matfiles all singleton dimensions except the first two are removed.
In my actual program I decided to save the data in the file linearly and work around matfile's lack of linear indexing by using the functions sub2ind and ind2sub.
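For readers unfamiliar with that conversion, here is a rough sketch in Python of what column-major subscript/index conversion does (for illustration only; these are hand-written stand-ins that borrow the names sub2ind and ind2sub, take the dimensions and subscripts as tuples, and are not the Matlab functions themselves). It stores a 3x1x2 result in a flat list, so the singleton dimension never has to exist in the stored variable.

```python
def sub2ind(dims, subs):
    """1-based subscripts -> 1-based column-major linear index."""
    index, stride = 1, 1
    for size, sub in zip(dims, subs):
        if not 1 <= sub <= size:
            raise IndexError("subscript out of range")
        index += (sub - 1) * stride
        stride *= size
    return index

def ind2sub(dims, index):
    """1-based column-major linear index -> tuple of 1-based subscripts."""
    remaining = index - 1
    subs = []
    for size in dims:
        subs.append(remaining % size + 1)
        remaining //= size
    return tuple(subs)

dims = (3, 1, 2)              # n1, n2, n3 with a singleton middle dimension
flat = [None] * (3 * 1 * 2)   # linear storage, like a 1-D variable in the file
for c1 in range(1, 4):
    for c2 in range(1, 2):
        for c3 in range(1, 3):
            flat[sub2ind(dims, (c1, c2, c3)) - 1] = (c1, c2, c3)

print(ind2sub(dims, 6))  # subscripts of the last stored element
```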
