I have a Function that calculates an average payout. The loop calculates down from x days to y days. As part of this function I have to interpolate a number using a specific range. However this is very slow.
I read that a method to speed up the code would be to read the range as values rather than a Range as VBA is slowed down by going to Excel each time the code is run.
Is this true?
My current code.
Function AveragePayout(Time As Double, period)
Dim i As Integer
Dim sum As Double
Dim interpolate_surface As Range
Set interpolate_surface = Range("A1", "D4")
If Time < period Then
AveragePayout = 0
Else
For i = 1 To period
interpolated_val = Interpolation(interpolate_surface, 5, Time)
sum = sum + CustomPricer(interpolated_value)
Time = Time - 1
Next i
AveragePayout = sum / period
End If
End Function
I was thinking to change line 5 to the below to then run the Interpolation on an VBA matrix/array rather than returning to the Excel document each loop (which apparently slows the function tremendously:
Set interpolate_surface = Range("A1", "D4").Value2
Alternatively are there any other methods to speed up the running of this loop?
Many thanks.
While R.Leruth is very close, there are a few things that need to be elaborated on.
First, the reason why a Range object is slower is because you are working on the Object representation of that value, and there are events bound to that Range. As a result, calculations will run, the sheet will need to be evaluated, and accessing the value has to go through the Object, and not through an in-memory representation of that object.
This performance decrease generally stands true for any Range operations, and the performance decrease is directly tied to the size of the range. Thus, operating on 100 cells is much quicker than operating on 1,000,000 cells.
While performance time of an array is also directly linked, accessing each value is much quicker. This is because the the values are in-memory and easy to access. There are no Objects to depend on with an array. This doesn't mean arrays will always be fast. I have encountered instances of array operations taking many minutes or hours because I took their initial speed for granted. You will notice a performance decrease with arrays, but the rate of performance decrease is much much much lower.
To create an array, we use the Variant type. Keep in mind a Variant can be anything, so you have to be somewhat careful. General convention is to use Dim Foo as Variant() but then any argument that accepts or returns a Variant() must be given a Variant() and not a Variant (minor difference, huge impact on code). Because of this, I tend to use Dim Foo as Variant.
We can then assign the Values from a range back to the array. While Foo = Range("A1:B2") is functionally equivalent to Foo = Range("A1:B2").Value, I strongly recommend full qualification. Thus, I don't rely on implicit properties as much as I can (.Value is the implicit property of Range).
So our code should be:
Dim Foo as Variant
Foo = SomeRange.Value
Where Foo is your variable, and SomeRange is replaced with your range.
As long as your Interpolate function accepts an array, this should cause no issues whatsoever. If the Interpolate function doesn't accept an array, you may need to find another workaround (or write your own).
To output the array, we just need to create a range of the same size as our array. There are different ways of doing this. I tend to prefer this method:
SomeRange.Resize(UBound(SomeArray, 1) - LBound(SomeArray, 1) + 1, Ubound(SomeArray, 2) - LBound(SomeArray, 2) + 1)
All this does is takes some range (should be a single cell) and resizes that range by the number of columns in the array, and the number of rows in the array. I use (Ubound - Lbound) + 1 since, for a 0-based array, this will return Ubound + 1 and for a 1-based array it will return Ubound. It makes things much simpler than creating If blocks for the same purpose.
The last thing to make sure in all of this is that your Range variable is fully-qualified. Notice that Range("A1:B2").Value is functionally equivalent to ActiveSheet.Range("A1:B2").Value but again, relying on implicit calls quickly introduces bugs. Squash those out as much as possible. If you need the ActiveSheet then use it. Otherwise, create a Worksheet variable and point that variable to the correct sheet.
And if you must use the ActiveSheet then Dim Foo as Worksheet : Set Foo = ActiveSheet is much better than just using the ActiveSheet (since the ActiveSheet will generally change when you really need it not to, and then everything will break.
Best of luck in using arrays. They are performance changing but they are never an excuse for bad coding practices. Make sure you use them properly, and that you aren't introducing new inefficiencies just because you now can.
What we usually do in VBA to speed up macros is to decrease the amount of interaction between the code and the sheet.
For example :
Get all necessary values in an array
Dim arr() as Variant
arr = Range("A1:D4")
Treat the value
...
Put them back
Range("A1:D4") = arr
In your case just try to change interpolated_surface from Range to an array type.
Related
I have a large range of data in excel that I would like to parse into an array for a user defined function. The range is 2250 x 2250. It takes far too long to parse each cell in via a for loop, and it is too large to be assigned to an array via this method:
dim myArr as Variant
myArr = range("myrange")
Just brainstorming here, would it be more efficient to parse in each column and join the arrays? Any ideas?
Thanks
You're nearly there.
The code you need is:
Dim myArr as Variant
myArr = range("myrange").Value2
Note that I'm using the .Value2 property of the range, not just 'Value', which reads formats and locale settings, and will probably mangle any dates
Note, also, that I haven't bothered to Redim and specify the dimensions of the array: the Value and Value2 properties are a 2-dimensional array, (1 to Rowcount, 1 to Col Count)... Unless it's a single cell, which will be a scalar variant which breaks any downstream code that expected an array. But that's not your problem with a known 2250 x 2250 range.
If you reverse the operation, and write an array back to a range, you will need to set the size of the receiving range exactly to the dimensions of the array. Again, not your problem with the question you asked: but the two operations generally go together.
The general principle is that each 'hit' to the worksheet takes about a twentieth of a second - some machines are much faster, but they all have bad days - and the 'hit' or reading a single cell to a variable is almost exactly the same as reading a seven-million-cell range into a variant array. Both are several million times faster than reading that range in one cell at a time.
Either way, you may as well count any operation in VBA as happening in zero time once you've done the 'read-in' and stopped interacting with the worksheet.
The numbers are all very rough-and-ready, but the general principles will hold, right up until the moment you start allocating arrays that won't fit in the working memory and, again, that's not your problem today.
Remember to Erase the array variant when you've finished, rather than relying on it going out of scope: that'll make a difference, with a range this size.
This works fine.
Sub T()
Dim A() As Variant
A = Range("A2").Resize(2250, 2250).Value2
Dim i As Long, j As Long
For i = 1 To 2250
For j = 1 To 2250
If i = j Then A(i, j) = 1
Next j
Next i
Range("A2").Resize(2250, 2250).Value2 = A
End Sub
I think the best options are:
Try to limit the data to a reasonable number, say 1,000,000 values at a time.
Add some error handling to catch the Out of Memory error and then try again, but cut the size in half, then by a third, a quarter, etc...until it works.
Either way, if we're using data sets in the order of 5,000,000 values and you want to make sure that the program will run, you will need to adjust the code to chop up the data.
I have a case in an Excel macro (VBA) where I'd like to dimension an array where the number of dimensions and the bounds of each dimension are determined at runtime. I'm letting the user specify a series of combinatorial options by creating a column for each option type and filling in the possibilities below. The number of columns and the number of options is determined at run time by inspecting the sheet.
Some code needs to run through each combination (one selection from each column) and I'd like to store the results in a multidimensional array.
The number of dimensions will probably be between about 2 to 6 so I can always fall back to a bunch of if else blocks if I have to but it feels like there should be a better way.
I was thinking it would be possible to do if I could construct the Redim statement at runtime as a string and execute the string, but this doesn't seem possible.
Is there any way to dynamically Redim with a varying number of dimensions?
I'm pretty sure there is no way of doing this in a single ReDim statement. Select Case may be marginally neater than "a bunch of If...Else blocks", but you're still writing out a lot of separate ReDims.
Working with arrays in VBA where you don't know in advance how many dimensions they will have is a bit of a PITA - as well as ReDim not being very flexible, there is also no neat way of testing an array to see how many dimensions it has (you have to loop through attempts to access higher dimensions and trap errors, or hack around in the underlying memory structure - see this question). So you will need to keep track of the number of dimensions, and write long Case statements every time you need to access the array as well, since the syntax will be different.
I would suggest creating the array with the largest number of dimensions you think you'll need, then setting the number of elements in any unused dimensions to 1 - that way you always have the same syntax every time you access the array, and if you need to you can check for this using UBound(). This is the approach taken by the Excel developers themselves for the Range.Value property - which always returns a 2-dimensional array even for a 1-dimensional Range.
As I understood your users can specify dimensions and their seize by filling in the excel-sheet. This means you have to get the last row containing a value and the last column.
Therefore, have a look at: Excel VBA- Finding the last column with data
Use Redim to change the array's size. If you want to keep some kind of entries use Redim Preserve
"Some code needs to run through each combination (one selection from
each column) and I'd like to store the results in a multidimensional
array."
To begin with, I would transpose desired Range object into a Variant.
Dim vArray as Variant
'--as per your defined Sheet, range
'this creates a two dimensional array
vArray = ActiveWorkbook.Sheets("Sheet1").Range("A1:Z300").Value2
Then you could iterate through this array to possible find the size and data you need, which you may save it to an array (with the dimensions) you need.
Little Background:
Redim: Reallocates storage space for an array variable.
You are not allowed to Redim an array, if you are defining an array with a Dim statement initially. (e.g. Dim vArray(1 to 6) As Variant).
UPDATE: to show explicitly what's allowed and what's not under Redim.
Each time you use Redim it resets your original Array object to the dimensions you are defining next.
There's a way to preserve your data using Redim Preserve but that only allows you to change the last dimension of a multidimensional array, where first dimension remains as the original.
I'm trying to write a simulator in VB (Excel macro) where the input to a simulation is taken from cells in one sheet. The input will be placed in a number of arrays, for example timePerUser(10) and bytesPerUser(10). Then there will be some simple if/for/while stuff to make calculations based on the arrays and finally I will write the results back to Excel. SO, Excel will only be used to provide input data and to display the results, everything else is happening inside of the macro, including changing values in the arrays.
I am used working with Matlab but can't use it for this simulator, so here are my questions:
Are there any existing matrix/array operations I could use within an Excel macro? For example, is there some command to check the smallest or the next smallest value in an array? The Excel function "SMALL" would be perfect, but it doesn't seem to work in macros. Or do I simply need to solve this with for-loops?
Are there any other suggestions on how to create the arrays? Is it better to have one big matrix where each row corresponds to time, data, user etc (an NxM matrix) or is it better to have separate arrays for each parameter?
How to speed up matrix/array operations? Any general suggestions?
Thanks!
Oscar
Excel does have some simple array operations - SUMPRODUCT (Matlab equivalent: A*B'), MIN, MAX, ... Most of these, including SMALL, need to be called using Application.WorksheetFunction.xxx ; for example, the function SMALL can be called in VBA with the following:
nthSmallest = Application.WorksheetFunction.Small(r, n)
where r is an array or range, and n is an integer between 1 and r.Cells.Count
I would strongly urge you to use one variable per conceptual variable, rather than trying to be clever with multiple entities in one 2D array. Speedwise I suspect the difference is small; but the opportunity to write unreadable code is significant.
As for speeding things up: it really helps to define your variables (and especially your arrays) as some fixed type (rather than Variant), e.g.:
Dim a(1 To 10) as Double
Dim ii as Integer
and it may help a little bit to have all of the arrays use the same (default) base. Since I'm a Matlab junkie myself, I often work with array indices starting at 1 - you enforce this in VBA by adding
Option Base 1
at the start of your module. At this point, if you declare
Dim a(10) as Double
it is equivalent to
Dim a(1 To 10) as Double
Is also good practice for someone who is not proficient in VBA, to add
Option Explicit
at the start of the module, as it will ensure that any variables you did not declare (or more often, that you misspelled) will generate an interpreter error.
One other thing about arrays: as a Matlab person you are used to Matlab increasing array sizes as needed. In VBA it is possible to change array size dynamically at runtime, using, for example
ReDim Preserve a(1 To 20)
which will make the array a 20 elements in size. If it was shorter, it will be padded with zeros; if it was longer, it will be trimmed. This CAN be useful, but is QUITE expensive, since often the entire array needs to be copied to a new location. When you don't know in advance how large your array will be, it's better to overestimate (and check bounds) than to ReDim every time it needs to get bigger - and do a final ReDim at the end to get it to the right size.
Hope this is enough to get your started!
Hi everybody: Following liitle issue:
Option Base 1
Sub Test()
Dim aa() As Integer
Dim bb() As Integer
ReDim aa(3)
ReDim bb(3)
For j = 1 To 3
aa(j) = j * 2
bb(j) = j * 3
Next j
End Sub
Now the only little thing that I want to do is to multiply the two one dimensional arrays elementwise without looping, and then to unload this new array (6,24,54) in a range. I'm sure this must be easily possible. A solution that I would see is to create a diagonal matrix (array) and then to use mmult, but I'm sure this is doable in a very simple manner. Thanks for the help.
There is no way to do multiplication on each element in the array without a loop. Some languages have methods that appear to do just that, but under the hood they are looping.
As you've mentioned in your comments, you have 2 choices:
Loop through a range and do the multiplication
Dump the range into an array, do the multiplication, then dump back onto a range
It all depends on your data, but almost always you'll find that dumping a range into a variant array, doing your work, and dumping it back will be much faster than looping through a range of cells. How you dump it back into a range will also affect the speed, mind you.
It is possible to multiply ranges without explicit looping, e.g. try:
sub try()
[c1:c3].value = [a1:a3 * b1:b3]
end sub
Same logic as in:
=sumproduct(a1:A3-b1:b3;a1:A3-b1:b3)
I'm not entirely sure that this is possible in any language (with the possible exception of some functional languages). Perhaps if you told us why you wanted to do this we could help you more?
A question on variants. Im aware that variants in Excel vba are both the default data type and also inefficient (from the viewpoint of overuse in large apps). However, I regularly use them for storing data in arrays that have multiple data types. A current project I am working on is essentially a task that requires massive optimistaion of very poor code (c.7000 lines)- and it got me thinking; is there a way around this?
To explain; the code frequently stores data in array variables. So consider a dataset of 10 columns by 10000. The columns are multiple different data types (string, double, integers, dates,etc). Assuming I want to store these in an array, I would usually;
dim myDataSet(10,10000) as variant
But, my knowledge says that this will be really inefficient with the code evaluating each item to determine what data type it is (when in practise I know what Im expecting). Plus, I lose the control that dimensioning individual data types gives me. So, (assuming the first 6 are strings, the next 4 doubles for ease of explaining the point), I could;
dim myDSstrings(6,10000) as string
dim myDSdoubles(4,10000) as double
This gives me back the control and efficiency- but is also a bit clunky (in practise the types are mixed and different- and I end up having an odd number of elements in each one, and end up having to assign them individually in the code- rather than on mass). So, its a case of;
myDSstrings(1,r) = cells(r,1)
myDSdoubles(2,r) = cells(r,2)
myDSstrings(2,r) = cells(r,3)
myDSstrings(3,r) = cells(r,4)
myDSdoubles(3,r) = cells(r,5)
..etc...
Which is a lot more ugly than;
myDataSet(c,r) = cells(r,c)
So- it got me thinking- I must be missing something here. What is the optimal way for storing an array of different data types? Or, assuming there is no way of doing it- what would be best coding-practise for storing an array of mixed data-types?
Never optimize your code without measuring first. You'll might be surprised where the code is the slowest. I use the PerfMon utility from Professional Excel Development, but you can roll your own also.
Reading and writing to and from Excel Ranges is a big time sink. Even though Variants can waste a lot of memory, this
Dim vaRange as Variant
vaRange = Sheet1.Range("A1:E10000").Value
'do something to the array
Sheet1.Range("A1:E10000").Value = vaRange
is generally faster than looping through rows and cells.
My preferred method for using arrays with multiple data types is to not use arrays at all. Rather, I'll use a custom class module and create properties for the elements. That's not necessarily a performance boost, but it makes the code much easier to write and read.
I'm not sure your bottleneck comes from the Variant typing of your array.
By the way, to set values from an array to an Excel range, you should use (in Excel 8 or higher):
Range("A1:B2") = myArray
On previous versions, you should use the following code:
Sub SuperBlastArrayToSheet(TheArray As Variant, TheRange As Range)
With TheRange.Parent.Parent 'the workbook the range is in
.Names.Add Name:="wstempdata", RefersToR1C1:=TheArray
With TheRange
.FormulaArray = "=wstempdata"
.Copy
.PasteSpecial Paste:=xlValues
End With
.Names("wstempdata").Delete
End With
End Sub
from this source that you should read for VBA optimization.
Yet, you should profile your app to see where your bottlenecks are. See this question from Issun to help you benchmark your code.