Currently I have the following portion of code:
for i = 2:N-1
res(i) = k(i)/m(i)*x(i-1) -(c(i)+c(i+1))/m(i)*x(N+i) +e(i+1)/m(i)*x(i+1);
end
where the variables k, m, c, and e are vectors of size N, and x is a vector of size 2*N. Is there any way to do this a lot faster using something like arrayfun? I couldn't figure this out :( I especially want to make it faster by running it on the GPU later, so arrayfun would also be helpful, since MATLAB doesn't support parallelizing for loops and I don't want to buy the Jacket package...
Thanks a lot!
You don't have to use arrayfun. It works if you use some smart indexing:
clear all
N=20;
k=rand(N,1);
m=rand(N,1);
c=rand(N,1);
e=rand(N,1);
x=rand(2*N,1);
% for-based implementation
%Watch out, you are not filling the first element of forres!
forres=zeros(N-1,1); %Initialize array first to gain some speed.
for i = 2:N-1
forres(i) = k(i)/m(i)*x(i-1) -(c(i)+c(i+1))/m(i)*x(N+i) +e(i+1)/m(i)*x(i+1);
end
%vectorized implementation
parres=k(2:N-1)./m(2:N-1).*x(1:N-2) -(c(2:N-1)+c(3:N))./m(2:N-1).*x(N+2:2*N-1) +e(3:N)./m(2:N-1).*x(3:N);
%compare results; strip the first element from forres
difference=forres(2:end)-parres %#ok<NOPTS>
Firstly, MATLAB does support parallel for loops via PARFOR. However, that doesn't have much chance of speeding up this sort of computation since the amount of computation is small compared to the amount of data you're reading and writing.
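For reference, the PARFOR version is a one-word change (a sketch, assuming the Parallel Computing Toolbox; as said, don't expect much from it for this memory-bound loop):
res = zeros(N-1,1); % preallocate the sliced output
parfor i = 2:N-1
res(i) = k(i)/m(i)*x(i-1) -(c(i)+c(i+1))/m(i)*x(N+i) +e(i+1)/m(i)*x(i+1);
end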
To restructure things for gpuArray "arrayfun", you need to make all the array references in the loop body refer to the loop iterate, and have the loop run across the full range. You should be able to do this by offsetting some of the arrays and padding with dummy values. For example, you could pad your arrays with NaN and replace x(i-1) with a new variable x_1 = [NaN; x(1:N-1)], so that x_1(i) equals x(i-1).
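A minimal sketch of that approach (untested; assumes the Parallel Computing Toolbox, and the shifted-copy names xm1, xp1, xN, c1, e1 are mine):
% Build shifted copies so the element-wise function only needs index i
xm1 = [NaN; x(1:N-1)]; % stands in for x(i-1)
xp1 = [x(2:N); NaN];   % stands in for x(i+1)
xN  = x(N+1:2*N);      % stands in for x(N+i)
c1  = [c(2:N); NaN];   % stands in for c(i+1)
e1  = [e(2:N); NaN];   % stands in for e(i+1)
f = @(k,m,c,c1,e1,xm1,xp1,xN) k./m.*xm1 - (c+c1)./m.*xN + e1./m.*xp1;
res = arrayfun(f, gpuArray(k), gpuArray(m), gpuArray(c), gpuArray(c1), ...
    gpuArray(e1), gpuArray(xm1), gpuArray(xp1), gpuArray(xN));
res = gather(res); % elements 2..N-1 are valid; 1 and N come out as NaN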
Related
I have a row vector q with 200 elements, and another row vector, dij, which is the output of the pdist function, currently with 48216200 elements, but I'd like to be able to go higher. The operation I want to do is essentially:
t=sum(q'*dij,2);
However, since this tries to allocate a 200x48211290 array, it complains that this would require 70GB of memory. Therefore I do it this way:
t = zeros(numel(q),1);
for i=1:numel(q)
qi = q(i);
factor = qi*dij;
t(i)=sum(factor);
end
However, this takes too much time. By too much time, I mean it takes about 36 s, which is orders of magnitude longer than the time required by the pdist function. Is there a way I can speed up this operation without explicitly allocating so much memory? I'm assuming that if the first way could allocate the memory, it would be faster, being a vector operation.
Just use the distributive property of multiplication with respect to addition:
t = q'*sum(dij);
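The identity sum(q(i)*dij) == q(i)*sum(dij) is easy to verify on a small case (a quick sketch):
q   = rand(1,5); % row vectors, as in the question
dij = rand(1,7);
t_loop = sum(q'*dij, 2);  % the memory-hungry version
t_fast = q'*sum(dij);     % the distributed version
max(abs(t_loop - t_fast)) % should be on the order of machine precision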
To test what Cris said in the comment on the first post, I created 3 ".m" files as follows:
vec.m:
res=sum(sin(d.*q')./(d.*q'));
forloop.m:
for i=1:200
res(i)=sum(sin(d.*q(i))./(d.*q(i)));
end
and test.m:
clc
clear all
d=rand(4e6,1);
q=rand(200,1);
res=zeros(1,200);
forloop;
vec;
forloop;
vec;
forloop;
vec;
Then I used MATLAB's Run and Time profiler,
and the results were very surprising:
3 calls to forloop: ~10.5 s
3 calls to vec: 15.5 s (!)
Additionally, when I converted the data to single, the results were:
... forloop: 7.5 s
... vec: 8.5 s
I don't know precisely why the for loop is faster in these scenarios, but as for your problem, you could speed things up by creating fewer temporary variables in the loop and by using column vectors (I think), and finally by converting your data to single:
q=single(rand(200,1));
...
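For instance, the loop body in forloop.m computes d.*q(i) twice; reusing that temporary is an easy win (a sketch of the same loop):
for i = 1:200
dq = d.*q(i); % compute the product once
res(i) = sum(sin(dq)./dq);
end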
I am trying to perform a specific downsampling process. It is described by the following pseudocode.
//Let V be an input image with dimension of M by N (row by column)
//Let U be the destination image of size floor((M+1)/2) by floor((N+1)/2)
//The floor function is to emphasize the rounding for the even dimensions
//U and V are part of a wrapper class of Pixel_FFFF vImageBuffer
for i in 0 ..< U.size.rows {
for j in 0 ..< U.size.columns {
U[i,j] = V[(i * 2), (j * 2)]
}
}
The process basically takes pixel values at every other location along both dimensions. The resulting image is approximately half the size of the original.
On a one-time call, the process is relatively fast by itself. However, it becomes a bottleneck when the code is called numerous times inside a bigger algorithm, so I am trying to optimize it. Since I use Accelerate in my app, I would like to adapt this process in a similar spirit.
Attempts
First, this process can easily be done by a 2D convolution using the 1x1 kernel [1] with a stride of [2,2]. Hence, I considered the function vImageConvolve_ARGBFFFF. However, I couldn't find a way to specify the stride. This function would otherwise be the best solution, since it takes care of the Pixel_FFFF image structure.
Second, I noticed that this is merely transferring data from one array to another, so I thought the vDSP_vgathr function would be a good solution. However, I hit a wall: vectorizing a vImageBuffer yields the interleaved structure A,R,G,B,A,R,G,B,..., in which each term is 4 bytes, and vDSP_vgathr transfers every 4-byte element to the destination array using a specified indexing vector. I could use a linear indexing formula to build such a vector, but considering both even and odd dimensions, generating the indexing vector would be as inefficient as the original solution: it would require loops.
Also, neither of the vDSP 2D convolution functions fits the solution.
Are there any other functions in Accelerate that I might have overlooked? I saw that there is a stride option in the vDSP 1D convolution functions. Does someone know an efficient way to translate a strided 2D convolution into a 1D convolution?
Edited...
Thanks to everyone trying to help me!!!
I am trying to do a Finite Element Analysis in Mathematica... We can obtain all the local stiffness matrices, which have 8x8 dimensions. I mean there are 2000 matrices; they are similar but not the same. Every local stiffness matrix is given by a function whose name is KK. For example, KK[1] is the first element's local stiffness matrix.
I am trying to assemble all the local matrices into the global stiffness matrix. To make it easy:
Do[K[e][i][j]=KK[[e]][[i]][[j]],{e,2000},{i,8},{j,8}] (edited)
Here is my question: can this assignment affect the analysis time? If yes, what can I do to improve it?
In MATLAB this would be called a 3D array, but I don't know what it is called in Mathematica.
What are the advantages and disadvantages of this kind of representation in Mathematica? Is it faster, or just the easy way?
Thanks for your help...
It is difficult to understand what your question is, so you might want to reformulate it.
As others have mentioned, there is no advantage to be expected from a switch from a 3D array to DownValues or SubValues. In fact, you would then move from accessing data structures to pattern matching, which is powerful and the real strength of Mathematica, but not very efficient for what you plan to do, so I would strongly suggest staying in the realm of ordinary arrays.
There is another thing that might not be clear to someone more familiar with matlab than with Mathematica: in Mathematica, the "default" arrays behave a lot like cell arrays in matlab: each entry can contain arbitrary content, and they don't need to be rectangular (as High Performance Mark has mentioned, they are just expressions with a head List and can roughly be compared to matlab cell arrays).
But if such a nested list is a rectangular array and every element of it is of the same type, it can be converted to a so-called PackedArray. PackedArrays are much more memory efficient and will also speed up many calculations; they behave in many respects like regular ("non-cell") arrays in matlab. This conversion is often done implicitly by functions like Table, which will often return a packed array automatically. But if you are interested in efficiency, it is a good idea to check with Developer`PackedArrayQ and convert explicitly with Developer`ToPackedArray if necessary.
If you are working with PackedArrays, the speed and memory efficiency of many operations are much better, and usually comparable to vectorized operations on normal matlab arrays. Unfortunately, packed arrays can get "unpacked" by some operations, so if calculations become slow, it is usually a good idea to check whether that has happened.
Neither "normal" arrays nor PackedArrays are restricted in the rank (called Depth in Mathematica) they can have, so you can of course create and use "3D arrays" just as you can in matlab. I have never experienced or would know of any efficiency penalties when doing so.
It is probably of interest that newer versions of Mathematica (>= 10) ship the finite element method as one of the solver methods for NDSolve, so if you are not doing this as an exercise, you might want to have a look at what is available already; there is quite extensive documentation about it.
A final remark: instead of kk[[e]][[i]][[j]] you can use the much more readable form kk[[e,i,j]], which is also easier and less error-prone to type...
Extended comment, I guess, but
KK[e][[i]][[j]]
is not the (e,i,j) element of a "3D array". Note the single brackets on the e. When you use single brackets, you are not denoting an array or list element but a DownValue, which is quite different from a list element.
If you do, for example,
f[1]=0
f[2]=2
...
the resulting f appears similar to an array, but is actually more akin to an overloaded function in some other language. It is convenient because the indices need not be contiguous or even integers, but there is a significant performance drawback if you ever want to operate on the structure as a list.
Your Do loop example would almost certainly be better written as:
kk = Table[ k[e][i][j] ,{e,2000},{i,8},{j,8} ]
( Your loop won't even work as-is unless you previously "initialized" each of the kk[e] as an 8x8 array. )
Note that now the list elements are all double-bracketed, i.e. kk[[e]][[i]][[j]] or kk[[e,i,j]].
I'm quite a new MATLAB programmer, so this might be an easy one.. :)
I'm trying to write a script that will be able to read any number of XYZ files, in any order, into an array, and arrange them in the array according to the X and Y coordinates given in each file.
My attempt is to use load to get the files into an array, and after that read through the array and, as explained, use the X and Y coordinates as the locations in a new array.
I've tried presetting the array size, and I'm also subtracting a value from both X and Y to minimize the size of the array (fullArray).
%# Script for extraction of XYZ-data from DSM/DTM xyz files
%# Define folders and filter
DSMfolder='/share/CFDwork/site/OFSites/MABH/DSM/*.xyz';
DTMfolder='/share/CFDwork/site/OFSites/MABH/DTM/*.xyz';
%# Define minimum values to reduce the arrays. Please leave some slack for
%# the reduction algorithm..
borderX=100000;
borderY=210000;
%% Expected array-size
expSizeX=20000;
expSizeY=20000;
%# Program starts.. Please do not edit below this line!
files=ls(DSMfolder);
clear fullArray
fullArray=zeros(expSizeX,expSizeY);
minX=999999999;
minY=999999999;
maxX=0;
maxY=0;
disp('Reading DSM files');
[thisFile,remaining]=strtok(files);
while (~isempty(thisFile))
disp(['Reading: ' thisFile]);
clear fromFile;
fromFile=load(thisFile);
for k=1:size(fromFile,1)
tic
fullArray(fromFile(k,1)-borderX,fromFile(k,2)-borderY)=fromFile(k,3);
disp([k size(fromFile,1)]);
if (fromFile(k,1)<minX)
minX=fromFile(k,1);
end
if (fromFile(k,2)<minY)
minY=fromFile(k,2);
end
if (fromFile(k,1)>maxX)
maxX=fromFile(k,1);
end
if (fromFile(k,2)>maxY)
maxY=fromFile(k,2);
end
toc
end
[thisFile,remaining]=strtok(remaining);
end
As can be seen, I've added a tic/toc, and the time was 3.36 s for one operation!
Any suggestions on why this is so slow and how to improve the speed? I need to order 2x6,000,000 lines, and I can't be bothered to wait 466 days.. :D
Best regards
Mark
Have you considered using a sparse matrix?
A sparse matrix in matlab is defined by a list of values and their locations in the matrix - incidentally, this matches your input file perfectly.
While this representation is generally meant for matrices which are truly sparse (i.e. most of their values are zeros), it appears that in your case it would be much faster to load the matrix using the sparse function, even if it is not truly sparse.
Since your data is organised in such a way (a location for every data point), my guess is it is sparse anyway.
The function that creates a sparse matrix takes the locations as column vectors, so instead of a for loop your code will look something like this (this segment replaces the whole for loop):
minX = min(fromFile(:,1));
maxX = max(fromFile(:,1));
minY = min(fromFile(:,2));
maxY = max(fromFile(:,2));
S = sparse(fromFile(:,1) - borderX, fromFile(:,2) - borderY, fromFile(:,3));
Note that the other change I've made is calculating the minimum/maximum values directly from the matrix - this is much faster than going over a for loop, as operating on vectors and matrices unleashes the true power of matlab :)
You can perform all sorts of operations on the sparse matrix, but if you want to convert it to a regular matrix you can use the matlab full function.
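If you are loading several files, a natural extension (a sketch, reusing the variable names from the question) is to collect the triplets from all files and call sparse once at the end:
rows = []; cols = []; vals = [];
[thisFile, remaining] = strtok(files);
while ~isempty(thisFile)
fromFile = load(thisFile);
rows = [rows; fromFile(:,1) - borderX]; % row indices
cols = [cols; fromFile(:,2) - borderY]; % column indices
vals = [vals; fromFile(:,3)];           % values
[thisFile, remaining] = strtok(remaining);
end
S = sparse(rows, cols, vals); % one sparse call for all files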
I am looking for a way to store a large variable number of matrixes in an array in MATLAB.
Are there any ways to achieve this?
Example:
for i = 1:unknown
myArray(i) = zeros(500,800);
end
where unknown is the varying length of the array. I can revise with additional info if needed.
Update:
Performance is the main reason I am trying to accomplish this. Previously, the code would grab the data as a single matrix, show it in real time, and then proceed to process the next set of data.
I attempted it using multidimensional arrays as suggested below by Rocco; however, my data is so large that I ran out of memory. I might have to look into another alternative for my case. Will update as I attempt other suggestions.
Update 2:
Thank you all for the suggestions. However, I should have specified beforehand that precision AND speed are both integral factors here; I may have to look into going back to my original method from before trying 3D arrays, and re-evaluate the method for importing the data.
Use cell arrays. This has an advantage over 3D arrays in that it does not require a contiguous memory space to store all the matrices. In fact, each matrix can be stored in a different space in memory, which will save you from Out-of-Memory errors if your free memory is fragmented. Here is a sample function to create your matrices in a cell array:
function result = createArrays(nArrays, arraySize)
result = cell(1, nArrays);
for i = 1 : nArrays
result{i} = zeros(arraySize);
end
end
To use it:
myArray = createArrays(requiredNumberOfArrays, [500 800]);
And to access your elements:
myArray{1}(2,3) = 10;
If you can't know the number of matrices in advance, you could simply use MATLAB's dynamic indexing to make the array as large as you need. The performance overhead will be proportional to the size of the cell array, and is not affected by the size of the matrices themselves. For example:
myArray{1} = zeros(500, 800);
if twoRequired, myArray{2} = zeros(500, 800); end
If all of the matrices are going to be the same size (i.e. 500x800), then you can just make a 3D array:
nUnknown = 10; % The number of matrices (example value)
myArray = zeros(500,800,nUnknown);
To access one array, you would use the following syntax:
subMatrix = myArray(:,:,3); % Gets the third matrix
You can add more matrices to myArray in a couple of ways:
myArray = cat(3,myArray,zeros(500,800));
% OR
myArray(:,:,nUnknown+1) = zeros(500,800);
If each matrix is not going to be the same size, you would need to use cell arrays like Hosam suggested.
EDIT: I missed the part about running out of memory. I'm guessing your nUnknown is fairly large. You may have to switch the data type of the matrices (single or even a uintXX type if you are using integers). You can do this in the call to zeros:
myArray = zeros(500,800,nUnknown,'single');
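As a rough illustration of the saving (a sketch; the count of 100 is an arbitrary example):
A = zeros(500,800,100);          % doubles: 500*800*100*8 bytes, about 320 MB
B = zeros(500,800,100,'single'); % singles: half of that, about 160 MB
whos A B                         % compare the Bytes column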
myArrayOfMatrices = zeros(unknown,500,800);
If you're running out of memory, throw more RAM in your system, and make sure you're running a 64-bit OS. Also try reducing your precision (do you really need doubles, or can you get by with singles?):
myArrayOfMatrices = zeros(unknown,500,800,'single');
To append to that array try:
myArrayOfMatrices(unknown+1,:,:) = zeros(500,800);
I was doing some volume rendering in Octave (a MATLAB clone) and building my 3D arrays (i.e. an array of 2D slices) using
buffer=zeros(1,512*512*512,"uint16");
vol=reshape(buffer,512,512,512);
Memory consumption seemed to be efficient. (can't say the same for the subsequent speed of computations :^)
If you know what unknown is, you can do something like:
myArray = zeros(x*y, unknown); % one flattened x-by-y matrix per column
for i = 1:unknown
myArray(:,i) = reshape(zeros(x,y), [], 1);
end
However, it has been a while since I last used MATLAB,
so this page might shed some light on the matter:
http://www.mathworks.com/access/helpdesk/help/techdoc/index.html?/access/helpdesk/help/techdoc/matlab_prog/f1-86528.html
Just do it like this:
x=zeros(100,200);
for i=1:100
for j=1:200
x(i,j)=input('enter the number');
end
end