I have a very large column vector that I want to repeat multiple times. The simple method that works for small arrays is repmat, but I am running out of memory. I also tried bsxfun with no success: MATLAB gives me an out-of-memory error for ones. Any idea how to do this?
Here is the simple code (just for demonstration):
t=linspace(0,1000,89759)';
tt=repmat(t,1,length(t));
or using bsxfun:
tt=bsxfun(@times, t, ones(length(t),length(t)));
The problem here is simply too much data; it has nothing to do with the repmat function itself. To verify that it is too much data, you can simply try creating a matrix of ones of that size with a clear workspace and reproduce the error. On my system, I get this error:
>> clear
>> a = ones(89759,89759)
Error using ones
Requested 89759x89759 (60.0GB) array exceeds maximum array size preference. Creation of arrays greater than
this limit may take a long time and cause MATLAB to become unresponsive. See array size limit or preference
panel for more information.
So you fundamentally need to reduce the amount of data you are handling.
Also, I should note that plots hold onto references to the data, so even if you try plotting this "in chunks", you will still run into the same problem. So again, you fundamentally need to reduce the amount of data you are handling.
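If the full 89759x89759 matrix is only an intermediate step toward some reduced result, one way to do that is to generate and reduce it in column blocks, so only a small slice exists in memory at any time. A minimal sketch of that idea, where the block size and the reduction (here a column sum) are just placeholders for whatever you actually need:
t = linspace(0, 1000, 89759)';   % the original column vector
n = numel(t);
blockSize = 1000;                % tune so one block fits comfortably in memory
result = zeros(1, n);            % small per-column output instead of the full matrix
for k = 1:blockSize:n
    idx = k:min(k + blockSize - 1, n);
    block = repmat(t, 1, numel(idx));   % only blockSize columns live in memory at once
    result(idx) = sum(block, 1);        % placeholder: reduce each block to something small
end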
The problem I'm having can be reproduced by running the code below.
gcp;
C={};
for i=1:1000
    C = [C,{tall(ones(1000,1,1000,2))}];
    pause(0.05)
end
My expectation is that, because tall arrays are only brought into memory for the purpose of evaluating expressions, and then only a few rows at a time, the above would not cause immediate memory problems. However, it seems to fill up my RAM in exactly the same way as calling
gcp;
C={};
for i=1:1000
    C = [C,{ones(1000,1,1000,2)}];
    pause(0.05)
end
That is, using tall arrays does not seem to have any impact on memory usage at all.
If I wish to store large arrays of data produced by MATLAB outside of memory, how should I do it? Using tall arrays doesn't seem to work.
Note: I am using MATLAB R2017a, which doesn't support vertical concatenation of tall arrays. As such, I am using the structure
{rows1,rows2,...,rowsn}
to represent blocks of rows of the same array. This may not be optimal.
As mentioned in the comments, the local tall array constructor (where you give it in-memory data rather than a datastore) is generally used only for in-memory prototyping, before you point your tall arrays at your real large data in a datastore.
You can use tall/write to write out your tall arrays to disk, then make a datastore reading in from the locations you used in write. This will process the data without loading it all back into memory at once.
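A minimal sketch of that workflow; the output folder name here is just a placeholder:
% Write the (prototype) tall array out to disk as files.
tA = tall(ones(1000,1,1000,2));
write('myTallData', tA);
% Later: point a datastore at those files and rebuild a tall array from them.
ds = datastore('myTallData');
tt = tall(ds);   % evaluated lazily in chunks, not loaded into memory all at once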
I am trying to perform a specific downsampling process. It is described by the following pseudocode.
//Let V be an input image with dimension of M by N (row by column)
//Let U be the destination image of size floor((M+1)/2) by floor((N+1)/2)
//The floor function is to emphasize the rounding for the even dimensions
//U and V are instances of a wrapper class around a Pixel_FFFF vImageBuffer
for i in 0 ..< U.size.rows {
    for j in 0 ..< U.size.columns {
        U[i, j] = V[(i * 2), (j * 2)]
    }
}
The process basically takes the pixel value at every other location along both dimensions, so the resulting image is roughly half the size of the original in each dimension.
Called once, the process is reasonably fast on its own. However, it becomes a bottleneck when it is invoked many times inside a bigger algorithm, so I am trying to optimize it. Since I already use Accelerate in my app, I would like to adapt this process in a similar spirit.
Attempts
First, this process can easily be expressed as a 2D convolution with the 1x1 kernel [1] and a stride of [2,2]. Hence, I considered the function vImageConvolve_ARGBFFFF; however, I couldn't find a way to specify the stride. This function would otherwise be the best fit, since it takes care of the Pixel_FFFF image layout.
Second, I noticed that this is merely transferring data from one array to another, so I thought the vDSP_vgathr function would be a good solution. However, I hit a wall: flattening a vImageBuffer yields the interleaved layout A,R,G,B,A,R,G,B,..., where each component is 4 bytes, and vDSP_vgathr copies one 4-byte element at a time to the destination array according to a supplied index vector. I could use a linear indexing formula to build such a vector, but, considering both even and odd dimensions, generating the index vector would be as inefficient as the original solution: it would require loops.
Also, neither of the vDSP 2D convolution functions fits this task.
Are there any other functions in Accelerate that I might have overlooked? I saw that the vDSP 1D convolution functions have a stride option. Does someone know an efficient way to translate a strided 2D convolution into 1D convolution calls?
Is there a way to create a 3D array for which only certain elements are defined, while the rest does not take up memory?
Context: I am running Monte Carlo simulations in which I want to solve 10^5 matrices. The majority of the elements in these matrices are zero, so I wouldn't need to spend 8 bytes of memory on each of them, and these zero elements are the same for all matrices. For simplicity, I have combined all of these matrices into a 3D array, but once my matrices become too large I run into memory issues (at dimensions of 100*100*100000, the array already takes up 8 GB of memory).
One workaround would be to store every matrix element, with its 10^6 iterations, in its own vector; that way, no superfluous information needs to be stored. The inconvenience is that I would then need to work with more than 50 different vectors, and I prefer working with arrays.
Is there any way to tell R that some matrix elements don't need to store any information?
I have been thinking that defining a new class could help for this, but since I have just discovered classes, I am not sure what all the options are. Do you think this could be a good approach? Are there specific things I should keep in mind?
I also know that there are packages made to deal with memory problems, but that did not seem like the quickest solution in terms of human and computation effort for this specific problem.
I am trying to build five single-precision arrays of size 744×744×744×3×3 in the latest MATLAB version (R2016b).
However, when I build the first array, I get the error:
Requested 744x744x744x2x3 (9.2GB) array exceeds maximum array size preference. Creation of arrays greater than this limit may take a long
time and cause MATLAB to become unresponsive. See array size limit or preference panel for more information.
I set the maximum array size in MATLAB's workspace preferences to 1e4, which is the largest value it allows, and I set the maximum virtual memory in Windows 10 to 400 GB.
I also read the relevant posts in this forum, but they don't answer my question. Is it impossible to build arrays of that size, or am I missing something?
You are exceeding your RAM; I suggest using matfile.
To save a large matrix (for example My_var, of size Nvar1 x Nvar2) without slowing down other processes:
myObject = matfile('myFilename.mat','Writable',true);
myObject.myVariablenameinObject(1:Nvar1,1:Nvar2) = My_var(1:Nvar1,1:Nvar2);
By setting 'Writable' to true, you can read, modify, or write data. If you don't need to write, just use:
myObject = matfile('myFilename.mat')
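The complementary benefit is partial loading: you can read back just a block of the stored variable without pulling the whole thing into RAM. A short sketch, reusing the placeholder names from above (and assuming the stored variable has at least 100 rows):
myObject = matfile('myFilename.mat');
% Read only the first 100 rows; the rest of the variable stays on disk.
subBlock = myObject.myVariablenameinObject(1:100, 1:Nvar2);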
For more details, refer to the matfile documentation.
What are the pros and cons of using a Vector.<> instead of an Array?
From the Adobe documentation page:
As a result of its restrictions, a Vector has two primary benefits over an Array instance whose elements are all instances of a single class:
Performance: array element access and iteration are much faster when using a Vector instance than when using an Array.
Type safety: in strict mode the compiler can identify data type errors such as assigning a value of the incorrect data type to a Vector or expecting the wrong data type when reading a value from a Vector. Note, however,
that when using the push() method or unshift() method to add values to a Vector, the arguments' data types are not checked at compile time but are checked at run time.
Pro: Vector is faster than Array - e.g. see this: Faster JPEG Encoding with Flash Player 10
Con: Vector requires FP10, and according to http://riastats.com/ some 20% of users are still using FP9.
Vectors are faster, although for sequential iteration the fastest option seems to be linked lists.
Vectors can also be useful for bitmap operations (check out BitmapData.setVector, also BitmapData.lock and unlock).
The linked-list example mentioned earlier in the comments is incorrectly written, though: it skips the odd nodes and therefore only iterates over half of the same data. No wonder it gets such great results; it might be faster with correct code as well, but not by the same percentage. The loop advances current = current.next one time too many per iteration (both inside the loop body and in the loop condition), which causes that behavior.
According to the Flash Player penetration website, it is a little higher: around 85%.
This is the source