Large arrays in MATLAB

I am trying to build five single-precision arrays of size 744×744×744×3×3 in the latest MATLAB version (R2016b).
However, when I build the first array, I get this error:
Requested 744x744x744x2x3 (9.2GB) array exceeds maximum array size preference. Creation of arrays greater than this limit may take a long
time and cause MATLAB to become unresponsive. See array size limit or preference panel for more information.
I set the maximum array size in MATLAB's Workspace preferences to 1e4, which is the largest value it allows, and I set the maximum virtual memory in Windows 10 to 400 GB.
I have also read the relevant posts in this forum, but they don't answer my question. Is it impossible to build arrays of that size, or am I missing something?

You are exceeding your RAM; I suggest using matfile.
To save a large matrix (for example My_var, of size Nvar1 x Nvar2) without slowing down other processes:
myObject = matfile('myFilename.mat', 'Writable', true);
myObject.myVariablenameinObject(1:Nvar1, 1:Nvar2) = My_var(1:Nvar1, 1:Nvar2);
By setting 'Writable' to true, you can read, modify, and write data. If you only need to read, use:
myObject = matfile('myFilename.mat');
For more details, see the matfile documentation.
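For the 744×744×744×3×3 single arrays from the question, a rough sketch of this approach could look like the following (file and variable names are illustrative):
m = matfile('myArrays.mat', 'Writable', true);
% Grow the variable on disk by assigning its last element first:
m.A(744, 744, 744, 3, 3) = single(0);
for k = 1:744
    % Compute or load one 744x744x1x3x3 slice at a time (~20 MB in RAM):
    s = zeros(744, 744, 1, 3, 3, 'single');
    m.A(1:744, 1:744, k, 1:3, 1:3) = s;
end
Each slice stays small in memory while the full multi-gigabyte array lives on disk.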

Related

Large matrices with a certain structure: how can one define where memory allocation is not needed?

Is there a way to create a 3D array for which only certain elements are defined, while the rest does not take up memory?
Context: I am running Monte-Carlo simulations in which I want to solve 10^5 matrices. All of these matrices have a majority of elements that are zero, for which I wouldn't need to use 8 bytes of memory per element. These elements are the same for all matrices. For simplicity, I have combined all of these matrices into a 3D array, but if my matrices start to become too large, I encounter memory issues (since at matrix dimensions of 100*100*100000, the array already takes up 8 GB of memory).
One workaround would be to store each matrix element, across all of its 10^6 iterations, in its own vector; that way, no additional information needs to be stored. The inconvenience is that I would then need to work with more than 50 different vectors, and I prefer working with arrays.
Is there any way to tell R that some matrix elements don't need to store any information?
I have been thinking that defining a new class could help for this, but since I have just discovered classes, I am not sure what all the options are. Do you think this could be a good approach? Are there specific things I should keep in mind?
I also know that there are packages made to deal with memory problems, but that did not seem like the quickest solution in terms of human and computation effort for this specific problem.
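For the mostly-zero case described above, a minimal sketch using the Matrix package (with made-up indices and values) might look like:
library(Matrix)
# Only the nonzero entries and their positions are stored;
# the structural zeros cost no memory per element
M <- sparseMatrix(i = c(1, 50, 100),
                  j = c(2, 50, 99),
                  x = c(1.5, -2.0, 3.0),
                  dims = c(100, 100))
# A list of such matrices can stand in for the dense 100 x 100 x 100000 array
sims <- vector("list", 100000)
sims[[1]] <- M  # fill each slot with one iteration's matrix
Whether this beats plain vectors depends on how sparse the matrices really are, since each stored entry also carries its indices.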

Repeating a vector many times without repmat in MATLAB

I have a very large column vector that I want to repeat multiple times. The simple method that works for small arrays is repmat, but I am running out of memory. I also tried bsxfun, with no success: MATLAB gives me a memory error from ones. Any idea how to do this?
Here is the simple code (just for demonstration):
t=linspace(0,1000,89759)';
tt=repmat(t,1,length(t));
or using bsxfun:
tt=bsxfun(@times,t, ones(length(t),length(t)));
The problem here is simply too much data, it does not have to do with the repmat function itself. To verify that it is too much data, you can simply try creating a matrix of ones of that size with a clear workspace to reproduce the error. On my system, I get this error:
>> clear
>> a = ones(89759,89759)
Error using ones
Requested 89759x89759 (60.0GB) array exceeds maximum array size preference. Creation of arrays greater than
this limit may take a long time and cause MATLAB to become unresponsive. See array size limit or preference
panel for more information.
So you fundamentally need to reduce the amount of data you are handling.
Also, I should note that plots hold onto references to the data, so even if you try plotting this "in chunks", you will still run into the same problem. So again: you fundamentally need to reduce the amount of data you are handling.
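If the repeated copies are only an intermediate (for example, something you reduce over afterwards), one way to shrink the working set is to process the columns in blocks, so that only a small slab exists in memory at any time. A rough sketch, with an assumed block-wise reduction standing in for the real work:
t = linspace(0, 1000, 89759)';
N = numel(t);
blockSize = 500;                        % tune to your available RAM
acc = zeros(N, 1);                      % example: accumulate a row-wise sum
for j = 1:blockSize:N
    cols = j:min(j + blockSize - 1, N);
    block = repmat(t, 1, numel(cols));  % only an N-by-blockSize slab in RAM
    acc = acc + sum(block, 2);          % replace with your actual per-block work
end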

In Swift, how efficient is it to append an array?

If you have an array that can vary in size over the course of your program, would it be more efficient to declare the array at the maximum size it will ever reach and then control how much of the array your program can access, or to change the size of the array quite frequently as the program runs?
From the Swift headers, there's this about array growth and capacity:
When an array's contiguous storage fills up, new storage must be allocated and elements must be moved to the new storage. Array, ContiguousArray, and Slice share an exponential growth strategy that makes append a constant time operation when amortized over many invocations. In addition to a count property, these array types have a capacity that reflects their potential to store elements without reallocation, and when you know how many elements you'll store, you can call reserveCapacity to pre-emptively reallocate and prevent intermediate reallocations.
Reading that, I'd say it's best to reserve the capacity you need, and only come back to optimize that if you find it's really a problem. You'll make more work for yourself if you're faking the length all the time.
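As a quick illustration of that advice (the count here is arbitrary):
var values = [Int]()
values.reserveCapacity(1_000_000)  // one allocation up front
for i in 0..<1_000_000 {
    values.append(i)               // amortized O(1), no intermediate reallocations
}
print(values.count, values.capacity)
Without the reserveCapacity call the loop still works, but the storage gets reallocated and copied several times as the array grows.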

maximum size of array in vb.net

So I am working on a problem where I am dealing with very large amounts of data, and I have come across a limitation I do not fully understand. I need to store sets of 6 integer values and associate each with an index. The approach I chose was to create my own type and then create a List(Of that type). That failed with an "Array dimensions exceeded supported range" error. Fine; I presumed this was due to the type I defined and perhaps the way the List/Collection was storing the data. I was expecting to be able to use the full Integer.MaxValue number of indices in an array, as given in http://msdn.microsoft.com/en-us/library/wak0wfyt.aspx#BKMK_ArraySize, but that seems not to apply (why?). I then rewrote the functions and ended up with an array of type Tuple(Of Integer, Integer, Integer, Integer, Integer, Integer), but again I ran into the same situation. The same happens for arrays of a type that has an array as a member. I tried several ways to find the maximum size the array could be and ended up with a maximum of around 48E6 indices. The problem is that I need more than 10x that to store the data I have...
The only way I found to make this (sort of) work is to use a List(of List(of Integer())) and then add a new item to the top level list after every 40M indices or so. Nasty solution and not efficient, but it showed that it could be made to work...
Background: VS2010, .NET 4.0, Win7 x64, 32GB Ram.
Any ideas of how I would best store 6 integer values in either a collection or array (I need to be able to access them by index) for more than about 500 million combinations (ideally up to the 2.1B combinations)?
Thanks
The solution is actually quite simple (thanks, coffee). Reading through the documentation linked above, this should not be the problem, but... the maximum size of the array is no longer Integer.MaxValue elements once the element type isn't an Integer (or so it would seem, though none of the documentation says this directly). The way around it is simply to go from something like this:
Dim _Array(Array_Size) As Tuple(Of Integer, Integer, Integer, Integer, Integer, Integer)
to
Dim _Array1(Array_Size) As Integer
Dim _Array2(Array_Size) As Integer
Dim _Array3(Array_Size) As Integer
Dim _Array4(Array_Size) As Integer
Dim _Array5(Array_Size) As Integer
Dim _Array6(Array_Size) As Integer
This allows each array to reach the maximum size (or at least the size I need, which is close enough). The only downside is that I then need to adjust the rest of the code accordingly.
I was a bit surprised by this, considering that MSDN states that 'The length of every dimension of an array is limited to the maximum value of the Integer data type'. What the page does not spell out is that the runtime also caps the total size of any single object at 2 GB (prior to .NET 4.5's gcAllowVeryLargeObjects setting), so the element size, not just the element count, limits how long an array can be. That would explain why the original declaration failed at a length that accounts for the six integer values plus some bookkeeping overhead.
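For reference, on .NET 4.5 and later (the question targets 4.0, so this would mean upgrading), the 2 GB per-object cap can be lifted for 64-bit processes with an app.config entry; note that the element count per dimension remains capped near Integer.MaxValue:
<configuration>
  <runtime>
    <!-- .NET 4.5+, 64-bit only: allow single objects larger than 2 GB -->
    <gcAllowVeryLargeObjects enabled="true" />
  </runtime>
</configuration>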

How can I efficiently copy 2-dimensional arrays of bytes into a larger 2D array?

I have a structure called Patch that represents a 2D array of data.
type Size = (Int, Int)
data Patch = Patch Size Strict.ByteString
I want to construct a larger Patch from a set of smaller Patches and their assigned positions. (The Patches do not overlap.) The function looks like this:
type Position = (Int, Int)
combinePatches :: [(Position, Patch)] -> Patch
combinePatches plan = undefined
I see two sub-problems. First, I must define a function to translate 2D array copies into a set of 1D array copies. Second, I must construct the final Patch from all those copies.
Note that the final Patch will be around 4 MB of data. This is why I want to avoid a naive approach.
I'm fairly confident that I could do this horribly inefficiently, but I would like some advice on how to efficiently manipulate large 2D arrays in Haskell. I have been looking at the "vector" library, but I have never used it before.
Thanks for your time.
If the spec is really just a one-time creation of a new Patch from a set of previous ones and their positions, then this is a straightforward single-pass algorithm. Conceptually, I'd think of it as two steps -- first, combine the existing patches into a data structure with reasonable lookup for any given position. Next, write your new structure lazily by querying the compound structure. This should be roughly O(n log(m)) -- n being the size of the new array you're writing, and m being the number of patches.
This is conceptually much simpler if you use the Vector library instead of a raw ByteString. But it is simpler still if you simply use Data.Array.Unboxed. If you need arrays that can interop with C, then use Data.Array.Storable instead.
If you ditch purity, at least locally, and work with an ST array, you should be able to trivially do this in O(n) time. Of course, the constant factors will still be worse than using fast copying of chunks of memory at a time, but there's no way to keep that code from looking low-level.
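A rough sketch of the ST-array route (assuming row-major bytes, non-overlapping patches that fit inside the output, and a caller-supplied output size, which the original signature leaves unspecified):
import Control.Monad (forM_)
import qualified Data.ByteString as Strict
import Data.Array.ST (newArray, writeArray, runSTUArray)
import Data.Array.Unboxed (UArray, elems)
import Data.Word (Word8)

type Size = (Int, Int)
type Position = (Int, Int)
data Patch = Patch Size Strict.ByteString

combinePatches :: Size -> [(Position, Patch)] -> Patch
combinePatches (ow, oh) plan = Patch (ow, oh) (Strict.pack (elems out))
  where
    out :: UArray Int Word8
    out = runSTUArray $ do
      arr <- newArray (0, ow * oh - 1) 0            -- blank output canvas
      forM_ plan $ \((px, py), Patch (pw, _) bytes) ->
        forM_ [0 .. Strict.length bytes - 1] $ \i -> do
          let (row, col) = i `divMod` pw            -- coordinates within the patch
          writeArray arr ((py + row) * ow + (px + col)) (Strict.index bytes i)
      return arr
This copies byte-by-byte for clarity; a production version would copy whole rows at a time, but the single-pass O(n) shape is the same.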
