Declaring an integer array in Matlab

Is there an efficient way to declare a very big matrix (let's say 40.000.000x10) of integers in Matlab?
If I do it like this:
var=uint8(zeros(40000000,10));
It works very well in the Command Window.
But the same code works much worse inside a function! If I do this somewhere in a function, it first creates a 40.000.000x10 matrix of doubles and then converts it to a matrix of 8-bit integers. I would prefer it to be created as an integer matrix from the very beginning, as in the Command Window. I have to work with even bigger matrices, and I run out of RAM when it initializes such a matrix of doubles (although there would be enough memory if it initialized the matrix as integers). And I don't really need doubles here; all numbers are in the range 0:100.
Hope you understood the problem :D

From: MATLAB: uint8
var = zeros(40000000, 10, 'uint8')

Maybe you should use this, which is more efficient.
var = zeros(40000000, 10, 'uint8');

If you want to be a bit tricky and save a small amount of time, you can allocate your array of uint8 zeros this way:
var(40000000,10) = uint8(0);
See here for some details on this type of preallocation. Be careful with this scheme. If var already exists and you use this method again with a smaller size, without clearing var first, the size won't actually change and the data won't be zeroed out. Essentially, this scheme is only good if the array (var here) doesn't exist yet.

Related

Two-dimensional array of booleans in PROGMEM (AVR) on Arduino

I'm trying to construct a two-dimensional array on an Arduino Uno, which uses an ATmega328. I want an array of booleans of size 256 * 18. This is too big for the 2 KB of RAM, so I want to store it in PROGMEM (AVR). How can I do this, and how can I access the values? I found some tutorials about doing the same with chars or strings, but there is no data type for booleans. What is the best way to store booleans in chars and extract them again?
As you have likely read, using program space (aka flash) requires special macros that work on pointers, as detailed in the avr-libc user manual.
That said, see my example of a 2D matrix in program space, which shows how to store the 2D array and how to read the data back from it.
It should work at a larger scale.
You also sneak in a second question at the end, about booleans. Note that booleans, while treated as 0 or 1, really consume a full byte each.
You may want to consider #include <vector> and using the vector<bool> type, as each element then occupies only a single bit.
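If vector<bool> is not an option in your toolchain, packing the bits by hand is another way to get one bit per boolean. This is only a minimal sketch in plain C, not the code from the linked example: the names bit_table, bit_set and bit_get are made up for illustration, and the table lives in ordinary RAM here so the snippet can run anywhere. On AVR the table would instead be a const array marked PROGMEM, and the read inside bit_get would go through pgm_read_byte from <avr/pgmspace.h>.

```c
#include <stdbool.h>
#include <stdint.h>

#define ROWS 256
#define COLS 18
#define BYTES_PER_ROW ((COLS + 7) / 8)   /* 18 bits round up to 3 bytes */

/* Hypothetical table: 256 * 3 = 768 bytes instead of 256 * 18 = 4608. */
uint8_t bit_table[ROWS * BYTES_PER_ROW];

/* Set or clear the bit for (row, col). Only meaningful for a RAM table;
   a PROGMEM table would be filled in by its initializer instead. */
void bit_set(uint8_t *table, unsigned row, unsigned col, bool value)
{
    uint8_t *byte = &table[row * BYTES_PER_ROW + col / 8];
    uint8_t mask = (uint8_t)(1u << (col % 8));

    if (value)
        *byte |= mask;
    else
        *byte &= (uint8_t)~mask;
}

/* Read the bit for (row, col). On AVR, the direct array read below would
   become pgm_read_byte(&table[row * BYTES_PER_ROW + col / 8]). */
bool bit_get(const uint8_t *table, unsigned row, unsigned col)
{
    return (table[row * BYTES_PER_ROW + col / 8] >> (col % 8)) & 1u;
}
```

Each row is padded to a whole number of bytes, which wastes 6 bits per row but keeps the index arithmetic simple.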

Force dlmread to return uint8 matrix - possible?

I have a file containing a huge matrix, of size millions x hundreds, and I want to process this matrix further while conserving memory. Unfortunately, dlmread returns a matrix of type double.
The numbers in this file are 0-255 only, so uint8 is the most suitable type. But I hit my memory limit, and Matlab gives an "Out of memory" error, when I try to convert the loaded matrix to uint8 with myMat = uint8(myMat); That makes sense, because the new matrix has to be created before the old one is removed.
Can I do anything about this?
You could convert your data file to a suitable (i.e. lossless) 8 bit image format (using an external program) and then read it into MATLAB with imread. Reading this file should be a lot quicker too, as there is no data conversion involved.

Allocating arrays of the same size

I'd like to allocate an array B to be of the same shape and have the same lower and upper bounds as another array A. For example, I could use
allocate(B(lbound(A,1):ubound(A,1), lbound(A,2):ubound(A,2), lbound(A,3):ubound(A,3)))
But not only is this inelegant, it also gets very annoying for arrays of (even) higher dimensions.
I was hoping for something more like
allocate(B(shape(A)))
which doesn't work, and even if this did work, each dimension would start at 1, which is not what I want.
Does anyone know how I can easily allocate an array to have the same size and bounds as another array, for arbitrary array dimensions?
As of Fortran 2008, there is now the MOLD optional argument:
ALLOCATE(B, MOLD=A)
The MOLD= specifier works almost in the same way as SOURCE=. If you specify MOLD= and source_expr is a variable, its value need not be defined. In addition, MOLD= does not copy the value of source_expr to the variable to be allocated.
Source: IBM Fortran Ref
You can define it in a preprocessor directive, but that only works for a fixed dimensionality:
#define DIMS3D(my_array) lbound(my_array,1):ubound(my_array,1),lbound(my_array,2):ubound(my_array,2),lbound(my_array,3):ubound(my_array,3)
allocate(B(DIMS3D(A)))
Don't forget to compile with e.g. the -cpp option (gfortran).
If using Fortran 2003 or above, you can use the source argument:
allocate(B, source=A)
but this will also copy the elements of A to B.
If you are doing this a lot and think it too ugly, you could write your own subroutine to take care of it, copy_dims (template, new_array), encapsulating the source code line you show. You could even set up a generic interface so that it could handle arrays of several ranks -- see how to write wrapper for 'allocate' for an example of that concept.

OK to use a terminator to manage fixed length arrays?

I'm working in ANSI C with lots of fixed length arrays. Rather than setting an array length variable for every array, it seems easier just to add a "NULL" terminator at the end of the array, similar to character strings. For my current app I'm using "999999" which would never occur in the actual arrays. I can execute loops and determine array lengths just by looking for the terminator. Is this a common approach? What are the issues with it? Thanks.
This approach is technically used by the arguments to main, where the last value in argv is a terminating NULL, but it's also accompanied by an argc that tells you the size.
Using just terminals sounds like it's more prone to mistakes in the future. What's wrong with storing the size along with an array?
Something like:
struct fixed_array {
unsigned long len;
int arr[];
};
This will also be more efficient and less error-prone.
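To make the struct above concrete, here is a minimal sketch of how it could be allocated and used. The helper names fixed_array_new and fixed_array_sum are made up for illustration. Note that the flexible array member (int arr[]) needs C99 or later, so this assumes you are not restricted to strict C89.

```c
#include <stdlib.h>

struct fixed_array {
    unsigned long len;
    int arr[];              /* flexible array member (C99) */
};

/* Hypothetical helper: allocate the header and the elements in one
   block, zero-filled. Returns NULL if the allocation fails. */
struct fixed_array *fixed_array_new(unsigned long len)
{
    struct fixed_array *fa = calloc(1, sizeof *fa + len * sizeof fa->arr[0]);

    if (fa != NULL)
        fa->len = len;
    return fa;
}

/* Sum the elements using the stored length; no terminator value needed. */
long fixed_array_sum(const struct fixed_array *fa)
{
    long total = 0;

    for (unsigned long i = 0; i < fa->len; i++)
        total += fa->arr[i];
    return total;
}
```

A single free(fa) releases both the length and the elements, since they live in the same allocation.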
The main problem I can think of is that keeping track of the length is useful: there are built-in functions in C that take a length as a parameter, and you also need the length to know where to add the next element.
In reality it depends on the size of your array. If it is a huge array, then you should keep track of the length. Otherwise, looping through it to determine the length every time you want to add an element to the end would be very expensive: O(n) instead of the O(1) time you normally get with arrays.
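For reference, this is roughly what finding the length of a terminated array looks like. It is only a sketch: TERMINATOR uses the 999999 value from the question, and terminated_length is a made-up name. The loop has to visit every element before the terminator, which is where the O(n) cost comes from.

```c
#include <stddef.h>

#define TERMINATOR 999999   /* sentinel value taken from the question */

/* Count the elements before the sentinel. The cost grows linearly with
   the array length, and a missing sentinel means reading out of bounds. */
size_t terminated_length(const int *a)
{
    size_t n = 0;

    while (a[n] != TERMINATOR)
        n++;
    return n;
}
```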
The main problem with this approach is that you can't know the length in advance without looping to the end of the array - and that can affect the performance quite negatively if you only want to determine the length.
Why don't you just
Initialize it with a const int that you can use later in the code to check the size, or
Use int len = sizeof(my_array) / sizeof(the_type).
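A small sketch of the sizeof suggestion, with one caveat: it only works where the array itself is in scope. Once the array has been passed to a function it is just a pointer, and sizeof no longer gives the array size, so the length has to be passed along. ARRAY_LEN and last_element are made-up names for illustration.

```c
#include <stddef.h>

/* Element count of a true array, computed at compile time. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

const int my_array[] = {3, 1, 4, 1, 5, 9};

/* Hypothetical helper: inside a function the parameter is only a
   pointer, so the length travels as a separate argument. */
int last_element(const int *a, size_t len)
{
    return a[len - 1];
}
```

A call would then look like last_element(my_array, ARRAY_LEN(my_array)).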
Since you're using 2-dimensional arrays to hold a ragged array, you could just use a ragged array: type *my_array[];. Or you could put the length in element 0 of each row and treat the rows as 1-indexed arrays. With some evil trickery you could even put the lengths at element -1 of each row![1]
[1] Left as exercise ;)
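For the curious, one possible way to do the element -1 trick is sketched below. The names row_new, row_length and row_free are made up, and the sketch assumes len is not negative. The idea is to allocate one extra int, store the length there, and hand out a pointer to the slot after it.

```c
#include <stdlib.h>

/* Hypothetical helper: allocate len + 1 ints, store len in the first
   slot, and return a pointer to the next slot, so row[-1] is the length.
   Assumes len >= 0. Returns NULL if the allocation fails. */
int *row_new(int len)
{
    int *block = calloc((size_t)len + 1, sizeof *block);

    if (block == NULL)
        return NULL;
    block[0] = len;
    return block + 1;
}

/* Read the length stored just before the first element. */
int row_length(const int *row)
{
    return row[-1];
}

/* The allocation starts one int before the row pointer. */
void row_free(int *row)
{
    if (row != NULL)
        free(row - 1);
}
```

The catch is that such a row must never be passed to free directly; it has to go through row_free.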

How does the Length() function in Delphi work?

In other languages like C++, you have to keep track of the array length yourself - how does Delphi know the length of my array? Is there an internal, hidden integer?
Is it better, for performance-critical parts, to not use Length() but a direct integer managed by me?
There are three kinds of arrays, and Length works differently for each:
Dynamic arrays: These are implemented as pointers. The pointer points to the first array element, but "behind" that element (at a negative offset from the start of the array) are two extra integer values that represent the array's length and reference count. Length reads that value. This is the same as for the string type.
Static arrays: The compiler knows the length of the array, so Length is a compile-time constant.
Open arrays: The length of an open array parameter is passed as a separate parameter. The compiler knows where to find that parameter, so it replaces Length with a read of that parameter's value.
Don't forget that the layout of dynamic arrays and the like would change in a 64-bit version of Delphi, so any code that relies on finding the length at a particular offset would break.
I advise just using Length(). If you're working with it in a loop, you might want to cache it, but don't forget that a for loop already caches the terminating bounds of the loop.
Yes, there are in fact two additional fields with dynamic arrays. First is the number of elements in the array at -4 bytes offset to the first element, and at -8 bytes offset there's the reference count. See Rudy's article for a detailed explanation.
For the second question, you'd have to use SetLength for sizing dynamic arrays, so the internal 'length' field would be available anyway. I don't see much use for additional size tracking.
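To picture the layout described above, here is a rough model in C. It is an illustration only, not Delphi's actual runtime code: the struct dyn_header and the dyn_* helpers are made-up names, and the field sizes assume the 32-bit layout mentioned in this answer (reference count at -8 bytes, length at -4 bytes).

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical header modelled on the 32-bit layout described above:
   reference count at -8 bytes and length at -4 bytes from element 0. */
struct dyn_header {
    int32_t ref_count;
    int32_t length;
};

/* Allocate header plus elements; return a pointer to the first element.
   Assumes length >= 0. Returns NULL if the allocation fails. */
int32_t *dyn_new(int32_t length)
{
    struct dyn_header *h =
        calloc(1, sizeof *h + (size_t)length * sizeof(int32_t));

    if (h == NULL)
        return NULL;
    h->ref_count = 1;
    h->length = length;
    return (int32_t *)(h + 1);
}

/* Model of Length(): a single read at a fixed negative offset. */
int32_t dyn_length(const int32_t *data)
{
    return ((const struct dyn_header *)data - 1)->length;
}

/* Step back to the header before freeing the whole block. */
void dyn_free(int32_t *data)
{
    if (data != NULL)
        free((struct dyn_header *)data - 1);
}
```

In this model, reading the length is one memory access no matter how long the array is, which is why a hand-managed counter has nothing to gain.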
Since Rob Kennedy gave such a good answer to the first part of your question, I'll just address the second one:
Is it better, for performance-critical parts, to not use Length() but a direct integer managed by me?
Absolutely not. First, as Rob mentioned, the compiler does its thing to access the information extremely quickly: by reading a fixed offset before the start of the array for dynamic arrays, by using a compile-time constant for static arrays, and by passing a hidden parameter for open arrays. You're not going to gain any improvement in performance.
Secondly, the direct integer managed by you wouldn't be any faster, but would actually use more memory (an additional integer allocated along with the one Delphi already provides for dynamic and open arrays, and an extra integer entirely in the case of static arrays).
Even if you directly read the value Delphi stores already for dynamic arrays, you wouldn't gain any performance over Length(), and would risk your code breaking if the internal representation of that hidden header for arrays changes in the future.
Is there an internal, hidden integer
Yes.
to not use Length() but a direct integer managed by me?
Doesn't matter.
See Dynamic arrays item in Addressing pointers article by Rudy Velthuis.
P.S. You can also hit F1 button.
