My worksheet requires the following array formula in BG2.
=INDEX('Client'!O$2:O$347473,
MATCH(1, (('Client_Cost'!D$2:D$347473='Client'!BC2)*
('Client_Cost'!E$2:E$347473='Client'!BE2)), 0))
This provides a two-column match (Client_Cost!D:D to Client!BC2 AND Client_Cost!E:E to Client!BE2) and returns the corresponding value from Client!O:O.
The large number of rows makes the array formula very calculation-intensive. I can deal with a few hundred rows (~90 seconds for 500 rows) but I need results all the way down to Client!BG347473 and I would like them sometime this year.
I've tried using Application.Evaluate to return a result from the array formula into a variant array and then returning the array of results to the worksheet en masse, but it isn't the improvement I was hoping for. Looking for alternatives.
First off, I would recommend developing alternative methods with a smaller data set. 5K or 10K rows will either demonstrate a noticeable improvement or not; you can always expand to the original data set once you are confident you are not going to get into a long 'Not responding' state that you have to crash out of.
A common method of removing the array from that style of array formula¹ is a 'helper' column that concatenates the two values from column D and E in the Client_Cost worksheet into a single, delimited value. For example, in Client_Cost!Z2 as,
=CONCATENATE(Client_Cost!D2, "|", Client_Cost!E2)
Filling that down to Client_Cost!Z347473 should only take a second or two.
Once that is set up, a single INDEX/MATCH function pair can provide a vastly more efficient lookup on a similarly concatenated Client!BC2 and Client!BE2. In Client!BG2 as,
=INDEX(Client!O$2:O$347473,
MATCH(CONCATENATE(Client!BC2, "|", Client!BE2),
Client_Cost!Z$2:Z$347473, 0))
That will take 1 hr, 51 minutes for 350K rows. While not yet optimal, it is a big improvement over the estimated ~17.5 hours that the original took.
The next logical step in optimizing that method would be working with a VBA Scripting.Dictionary object. A dictionary holds its own unique index on its keys and the concatenated values could be stuffed into a dictionary object to facilitate virtually instantaneous lookups on a large number of items (i.e. rows).
Sub JR_CSE_in_Array()
    Dim olr As Long, rws As Long, JR_Count As Long, JR_Values As Variant
    Dim v As Long, vTMP As Variant, vTMPs As Variant, dVALs As Object
    Debug.Print Timer
    Set dVALs = CreateObject("Scripting.Dictionary")
    'get some dimensions to the various data ranges
    With Worksheets("Client_Cost")
        'only use as many rows as absolutely necessary
        olr = Application.Min(.Cells(Rows.Count, "D").End(xlUp).Row, _
                              .Cells(Rows.Count, "E").End(xlUp).Row)
        'store D & E
        vTMPs = .Range(.Cells(2, 4), .Cells(olr, 5)).Value2
    End With
    With Worksheets("Client")
        rws = Application.Min(.Cells(Rows.Count, "BC").End(xlUp).Row, _
                              .Cells(Rows.Count, "BE").End(xlUp).Row, _
                              UBound(vTMPs, 1))
        'override the above statement for sampling
        'rws = 5000
        'building the Dictionary object takes a fair bit of time but it is worth it
        vTMP = .Range(.Cells(2, 15), .Cells(olr, 15)).Value2
        For v = LBound(vTMPs, 1) To UBound(vTMPs, 1)
            If Not dVALs.Exists(Join(Array(vTMPs(v, 1), vTMPs(v, 2)), ChrW(8203))) Then _
                dVALs.Add Key:=Join(Array(vTMPs(v, 1), vTMPs(v, 2)), ChrW(8203)), Item:=vTMP(v, 1)
        Next v
        'store BC and BE
        vTMPs = .Range(.Cells(2, 55), .Cells(olr, 57)).Value2
    End With
    ReDim JR_Values(1 To rws, 1 To 1) 'force a two-dimension, one-based index on the array
    'Debug.Print LBound(JR_Values) & ":" & UBound(JR_Values)
    For JR_Count = LBound(JR_Values, 1) To UBound(JR_Values, 1) Step 1
        If dVALs.Exists(Join(Array(vTMPs(JR_Count, 1), vTMPs(JR_Count, 3)), ChrW(8203))) Then
            JR_Values(JR_Count, 1) = dVALs.Item(Join(Array(vTMPs(JR_Count, 1), vTMPs(JR_Count, 3)), ChrW(8203)))
        End If
    Next JR_Count
    With Worksheets("Client")
        .Range("BG2").Resize(UBound(JR_Values), 1) = JR_Values
    End With
    'Debug.Print dVALs.Count
    dVALs.RemoveAll: Set dVALs = Nothing
    Debug.Print Timer
End Sub
The elapsed time for that routine to run (without helper column(s)) was 45.72 seconds. Breaking it down, it took a full 13.4 seconds just to build the dictionary and the remainder was largely taken up by the actual lookup with a half-second here and there attributed to the bulk seeding of the variant arrays from the worksheets' values.
So the Scripting.Dictionary is the clear winner here. Unfortunately, it is not automatically calculating updates in the various columns when values change but at this stage of development, the worksheet should be set to manual calculation. Setting one of the formula-based solutions into a recalculation event from a single retyped value seems an inefficient expenditure of time.
All-in-all, this makes perfect sense. The original array formula is analogous to an SQL SELECT statement with an INNER JOIN on two fields and if my SELECT statement was running inefficiently the first thing I would do to improve it would be to look at the tables' indexes.
On a related note, any workbook with this much data should be saved as an Excel Binary Workbook regardless of whether it is macro-enabled or not. The file size of a binary workbook (.XLSB) is typically ¹⁄₃ the size of an equivalent .XLSX or .XLSM. Beyond a faster initial load time, many bulk operations should prove faster.
Anyone wishing to test their own optimizations can find my sample .XLSB workbook here for the time being. Don't blindly run the procedures without seeing what you're getting into first.
¹ Array formulas need to be finalized with Ctrl+Shift+Enter↵. Once entered into the first cell correctly, they can be filled or copied down or right just like any other formula. Try and reduce your full-column references to ranges more closely representing the extents of your actual data. The calculation cost of an array formula climbs steeply with the size of the referenced ranges, so it is good practise to narrow them to a minimum. See Guidelines and examples of array formulas for more information.
Related
I am aware that this;
Arr() =Range("E2:X2500")
...then doing stuff with the Arr, and then dumping back using:
Range("E2:X2500")=Arr()
is many times more efficient (faster) than looping through and referencing cells directly.
It's light speed!
But, this range-to-array assignment only grabs cells' value.
Is there a way to assign an actual range (continuous or not) into an array (with the same light speed) in such a way that you could then treat array items the way you would refer to cells, like:
arr(23).row 'getting a row number
Or;
If Arr(23).Value ="Pending" then arr(23).font.bold=1 else arr(23).font.bold=0
I know I can dim a range-type array where each item can store an actual single-cell range. But this array cannot be handled the same way, with a one-liner assignment:
Dim Arr () as Range
Set Arr = Range("E2:X2500") 'error
Instead, I would need to iterate each cell and assign it to the next item in the range-type array, which would allow me to treat items the way I'd refer to cells, but take substantially longer to load as I'm dealing with a Loop.
Also how would I dump a range-type array back into the sheet with the same ease and effectiveness of the one liner assignment? I think the only way would be to use a loop yet again, correct?
Side question :
Speedwise, is it any better to refer to cells via a range-type array over referring to cells directly via the sheet, or are both basically the same?
Thanks!
Well, using arrays will save a lot of running time. But there are some points which must be understood:
The first thing, as your VBA project grows, is to declare your variables properly. Make a habit of putting Option Explicit at the top of all your modules. For arrays, the declaration can look like either of these:
Dim Arr() As Variant, arr1 As Variant
Both declarations work in Excel. But the second one is recommended (to my taste) when you need an array from a range. When you build, let us say, a result array yourself, it will be zero-based by default and you must take care of the size of the range where the values will be returned.
The array content cannot be addressed exactly as you tried in your question when the number of elements is not fixed or known. Look at the next test code:
Sub testArrays()
    Dim sh As Worksheet, rng As Range, arrTest As Variant
    Set sh = ActiveSheet
    Set rng = sh.Range("A1:F4")
    arrTest = rng.Value
    sh.Range("J1").Resize(UBound(arrTest, 1), UBound(arrTest, 2)).Value = arrTest
End Sub
It is recommended to use arrTest = sh.Range("A1:F4").Value, stating the Value property explicitly. Excel is able to understand what you need from your declaration, but writing .Value makes it clear that you are reading values into an array rather than defining a range.
Sometimes you need to build an array while analyzing a dynamic range. If you cannot know the final array dimensions in advance and need ReDim Preserve, only the last (here, the second) dimension of the array can be resized, so the Transpose function must be used in such a case. And finally, the resulting array can be loaded properly into a range only if you know its number of rows and columns.
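A minimal sketch of that ReDim Preserve plus Transpose pattern, assuming the same A1:F4 source and J1 target as the test code above. The procedure name and the "first column is not empty" filter are made up for illustration:

```vba
Sub SketchRedimPreserve()
    ' Hypothetical example: keep only the rows of A1:F4 whose first cell is not empty.
    Dim src As Variant, res() As Variant
    Dim r As Long, c As Long, n As Long

    src = ActiveSheet.Range("A1:F4").Value

    For r = 1 To UBound(src, 1)
        If Not IsEmpty(src(r, 1)) Then
            n = n + 1
            ' Only the last dimension can grow with Preserve,
            ' so each kept row is stored as a column for now.
            ReDim Preserve res(1 To UBound(src, 2), 1 To n)
            For c = 1 To UBound(src, 2)
                res(c, n) = src(r, c)
            Next c
        End If
    Next r

    If n > 0 Then
        ' Transpose flips the result back to (rows, columns) for the write.
        ActiveSheet.Range("J1").Resize(n, UBound(src, 2)).Value = _
            Application.Transpose(res)
    End If
End Sub
```

Application.Transpose can misbehave on very large arrays, so for big results a plain loop into a correctly sized array is probably the safer choice.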
You can deduce the sheet row from the array row in the following way:
If we are referring to the above arrTest, we know that its first row is the first row of the sheet and that it has 6 columns (A to F).
So, arrTest(3, 1) will be sh.Range("A3").Value and its row would be 3.
Then, arrTest(3, 4) will be sh.Range("D3").Value and its row would be also 3.
If your array comes from a range starting with the fifth row, you must add four in order to obtain the sheet row extracted from the array row...
So, your example can be transformed into:
If arrTest(3, 4) = "Pending" Then sh.Cells(3, 4).Font.Bold = True Else sh.Cells(3, 4).Font.Bold = False
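A sketch of the same idea applied to a whole range, assuming the E2:X2500 range from the question sits on the active sheet (the procedure name is made up). The values are tested in the array, the matching cells are collected with Union, and the formatting is applied in one go:

```vba
Sub SketchBoldPending()
    Dim sh As Worksheet, rng As Range, hits As Range
    Dim arr As Variant, r As Long, c As Long

    Set sh = ActiveSheet
    Set rng = sh.Range("E2:X2500")
    arr = rng.Value

    For r = 1 To UBound(arr, 1)
        For c = 1 To UBound(arr, 2)
            ' Guard first: comparing an error value with a string would raise an error.
            If VarType(arr(r, c)) = vbString Then
                If arr(r, c) = "Pending" Then
                    ' Array item (r, c) is sheet row rng.Row + r - 1,
                    ' sheet column rng.Column + c - 1, i.e. rng.Cells(r, c).
                    If hits Is Nothing Then
                        Set hits = rng.Cells(r, c)
                    Else
                        Set hits = Union(hits, rng.Cells(r, c))
                    End If
                End If
            End If
        Next c
    Next r

    rng.Font.Bold = False                            ' one write for the whole range
    If Not hits Is Nothing Then hits.Font.Bold = True ' one write for all matches
End Sub
```

Union itself slows down when it has to merge many thousands of scattered cells, so this suits cases where the matches are a small share of the range.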
Now, if you need an array of ranges, you cannot build it the way you tried. You can store the range addresses in an array and rebuild each range from its address when you need it:
Sub testArraysBis()
    Dim sh As Worksheet, rng As Range, rng1 As Range, lastCol As Long
    Dim rng2 As Range, arrTest As Variant, arrT As Variant, arrF As Variant
    Set sh = ActiveSheet
    'last used column in row 1, counted on sh itself
    lastCol = sh.Cells(1, sh.Columns.Count).End(xlToLeft).Column
    Set rng = sh.Range(sh.Cells(1, 1), sh.Cells(4, lastCol))
    Set rng1 = sh.Range("A5:F6")
    'keep the addresses, not the ranges themselves
    arrT = Array(rng.Address, rng1.Address)
    arrTest = rng.Value
    Debug.Print UBound(arrTest), LBound(arrTest)
    sh.Range("J1").Resize(UBound(arrTest, 1), UBound(arrTest, 2)).Value = arrTest
    'rebuild a range from its stored address
    Set rng2 = sh.Range(arrT(0))
    Debug.Print rng2.Address
    arrF = sh.Range(arrT(0)).Value
    Debug.Print UBound(arrF, 2)
End Sub
The rng2 range is built from the address string stored in the arrT array. An array (arrF) can also be read from the range that the first element of arrT points to...
Epilog:
The best way, in terms of speed, is to load the range into arrays and do all the processing on them (in memory, and very fast because of that). The most important point, though, is to build another array (or even a range, using Union) and write the results back AT ONCE. Sending each partial result to a cell/range consumes a lot of time and other resources when the range is big...
I want to populate an array with the values of a range in a sheet other than the active one. I need to load the values of data ranges from many different sheets into arrays and then perform thousands of operations.
I don't want to be activating sheets, using loops, or even worse, accessing the data in the sheets through Cells(). And I want to write neat and clean code, avoiding loops for intelligibility.
I started by:
dim claimsarray as variant
claimsArray = Range(Cells(1, 1), Cells(a, b)).Value
a and b are integers
It seems to work. No error. But the values of the array are empty because the array is populated with the values of the active sheet. I want to get the values from a sheet called "claims".
claimsArray = sheets("claims").Range(Cells(1, 1), Cells(a, b)).Value
that gives me an error 1004
looking for solutions in stackoverflow I tried the following modifications:
FIRST
worksheets() instead of sheets():
claimsArray = worksheets("claims").Range(Cells(1, 1), Cells(a, b)).Value
gives me an error 1004
SECOND
Changing the dimensioning of the array
dim claimsarray as variant
vs
dim claimsarray() as variant
all combinations give me error 1004
It seems to me that you can populate an array in this way only in the active sheet. So I modify:
sheets("claims").activate
claimsArray = worksheets("claims").Range(Cells(1, 1), Cells(a, b)).Value
it works.
THE QUESTION:
How do I populate an array from a different sheet without using a loop and without having to activate it?
Why can't I refer to another sheet to populate the array? Where is the flaw? Is it just that VBA is weak code?
NOTE1:
I read many websites about populating arrays with ranges:
http://www.cpearson.com/excel/ArraysAndRanges.aspx
https://excelmacromastery.com/excel-vba-array/
to no avail. They don't really deal with this particular problem.
NOTE2:
I ended up using the typical loop:
For i = 1 To a
    For j = 1 To b
        claimsArray(i, j) = Sheets("claims").Cells(i, j).Value
    Next j
Next i
5 lines instead of one. Makes the code so much longer and more cumbersome...
Thanks to #banana, I understood where "the flaw" is in passing a range to an array.
claimsArray = sheets("claims").Range(Cells(1, 1), Cells(a, b)).Value
does not work properly when "claims" is not the active sheet, because the unqualified Cells(1, 1) and Cells(a, b) refer to the active sheet rather than to "claims".
Therefore the elegant and effective solution is simply to tell Excel which sheet the cells belong to:
dim ST as worksheet
dim claimsArray as variant
set ST = thisworkbook.sheets("claims")
claimsArray = ST.Range(ST.Cells(1, 1), ST.Cells(a, b)).Value
It is also very important, as #banana pointed out in the comments, that ST is initialized with a reference to ThisWorkbook, to avoid problems when several open workbooks have a sheet called "claims".
This whole problem might be the reason why populating arrays with loops is, in the end, the less problematic way.
In my case it worked when I added the Sheets reference also inside the Range inputs (in each cell reference):
claimsArray = Sheets("claims").Range(Sheets("claims").Cells(1, 1), Sheets("claims").Cells(a, b)).Value
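The same qualification can be written more compactly with a With block, so that every Range and Cells call is tied to the same sheet. This is only a sketch: the procedure name and the values given to a and b are placeholders.

```vba
Sub SketchQualifiedRange()
    Dim claimsArray As Variant
    Dim a As Long, b As Long

    a = 10: b = 5   ' placeholder dimensions, not from the original post

    With ThisWorkbook.Worksheets("claims")
        ' The leading dots qualify Range AND both Cells calls with the same sheet,
        ' so it no longer matters which sheet is active.
        claimsArray = .Range(.Cells(1, 1), .Cells(a, b)).Value

        ' Equivalent alternative: anchor on one cell and resize.
        'claimsArray = .Cells(1, 1).Resize(a, b).Value
    End With

    Debug.Print UBound(claimsArray, 1), UBound(claimsArray, 2)
End Sub
```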
Possible to Run Goal Seek on array elements within VBA, instead of an Excel sheet range?
I have an array that takes initial values from an excel input sheet, does many calculations, and prints values back into a report region on an excel sheet; the output is roughly 200 rows x 28 columns of calculated values. I used to do this with formulas in every cell, but it was very, very slow. I converted to an all-vba Sub that does the calculations quickly and pastes the finished values into the report range in excel.
The problem is that I also need to run Goal Seek on various cells and Excel can't do it on a cell that just has a value, it needs a cell with a formula. With my fancy, efficient array, I can't goal seek anymore!!!!!
Is there a way to run some version of Goal Seek NOT on excel sheet ranges but on array members, like on MyArray(107,23) by testing an input value that is actually on the excel sheet, like Range("B2")? What would that code look like?
The first subroutine uses Range while the second uses Array instead. The goal here is 0.
First subroutine :
Sub GoalSeekWithRange()
    Dim i As Long
    For i = 1 To 10
        Range("C" & i).GoalSeek Goal:=0, ChangingCell:=Range("A" & i)
    Next
End Sub
Second subroutine :
Sub GoalSeekWithArray()
    Dim arGoal As Variant
    Dim arChanging As Variant
    With ThisWorkbook.Sheets("MySheet")
        Set arGoal = .Range("C1:C10")
        Set arChanging = .Range("A1:A10")
    End With
    Dim i As Long
    For i = 1 To 10
        arGoal(i, 1).GoalSeek Goal:=0, ChangingCell:=arChanging(i, 1)
    Next
End Sub
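To avoid headaches, remember that a range indexed this way has two dimensions (row, column). Note that Set stores Range objects in arGoal and arChanging, not arrays of values, which is why .GoalSeek is still available on each element; the goal cells must still contain formulas.
You will probably have to adapt this code to fit your specific needs.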
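If the calculated values only exist in a VBA array, Goal Seek itself cannot be used, but a small search loop can play the same role. The sketch below swaps in plain bisection, which is not Goal Seek's own algorithm. It assumes a hypothetical function ComputeTarget(x) that reruns your array calculation for input x and returns the element of interest (MyArray(107, 23) in the question); the cubic inside it is only a stand-in so the code runs on its own.

```vba
' Hypothetical stand-in: replace the body with the code that rebuilds
' the array for input x and returns the element being targeted.
Function ComputeTarget(ByVal x As Double) As Double
    ComputeTarget = x ^ 3 - 2 * x - 5
End Function

' Bisection: finds x between xLo and xHi where ComputeTarget(x) = goal.
' Requires the goal to be bracketed, i.e. the sign of (value - goal)
' differs at the two ends.
Function SeekByBisection(ByVal goal As Double, ByVal xLo As Double, _
                         ByVal xHi As Double, _
                         Optional ByVal tol As Double = 0.000001) As Double
    Dim fLo As Double, fMid As Double, xMid As Double, i As Long

    fLo = ComputeTarget(xLo) - goal
    If fLo * (ComputeTarget(xHi) - goal) > 0 Then
        Err.Raise 5, , "Goal is not bracketed by xLo and xHi"
    End If

    For i = 1 To 200
        xMid = (xLo + xHi) / 2
        fMid = ComputeTarget(xMid) - goal
        If Abs(fMid) < tol Or (xHi - xLo) / 2 < tol Then Exit For
        If fLo * fMid < 0 Then
            xHi = xMid              ' root lies in the lower half
        Else
            xLo = xMid: fLo = fMid  ' root lies in the upper half
        End If
    Next i

    SeekByBisection = xMid
End Function
```

A call such as Range("B2").Value = SeekByBisection(0, 1, 3) would then write the found input back to the sheet, with the bracket 1 to 3 chosen by you for your own model.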
I have a large range of data in excel that I would like to parse into an array for a user defined function. The range is 2250 x 2250. It takes far too long to parse each cell in via a for loop, and it is too large to be assigned to an array via this method:
dim myArr as Variant
myArr = range("myrange")
Just brainstorming here, would it be more efficient to parse in each column and join the arrays? Any ideas?
Thanks
You're nearly there.
The code you need is:
Dim myArr as Variant
myArr = range("myrange").Value2
Note that I'm using the .Value2 property of the range, not just 'Value', which reads formats and locale settings, and will probably mangle any dates.
Note, also, that I haven't bothered to Redim and specify the dimensions of the array: the Value and Value2 properties are a 2-dimensional array, (1 to Rowcount, 1 to Col Count)... Unless it's a single cell, which will be a scalar variant which breaks any downstream code that expected an array. But that's not your problem with a known 2250 x 2250 range.
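The single-cell case can be hidden behind a small helper, so downstream code always receives a two-dimensional array. A sketch only; RangeToArray is a made-up name, not a built-in function:

```vba
' Hypothetical helper: always returns a 1-based, 2-D variant array,
' even when rng is a single cell.
Function RangeToArray(ByVal rng As Range) As Variant
    Dim v As Variant

    If rng.Cells.CountLarge = 1 Then
        ' .Value2 of one cell is a scalar, so wrap it by hand.
        ReDim v(1 To 1, 1 To 1)
        v(1, 1) = rng.Value2
    Else
        ' For a multi-area range this reads the first area only.
        v = rng.Value2
    End If

    RangeToArray = v
End Function
```

Usage would look like myArr = RangeToArray(Range("myrange")), after which UBound(myArr, 1) and UBound(myArr, 2) are always valid.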
If you reverse the operation, and write an array back to a range, you will need to set the size of the receiving range exactly to the dimensions of the array. Again, not your problem with the question you asked: but the two operations generally go together.
The general principle is that each 'hit' to the worksheet takes about a twentieth of a second - some machines are much faster, but they all have bad days - and the 'hit' of reading a single cell into a variable takes almost exactly as long as reading a seven-million-cell range into a variant array. Either one is several million times faster than reading that range in one cell at a time.
Either way, you may as well count any operation in VBA as happening in zero time once you've done the 'read-in' and stopped interacting with the worksheet.
The numbers are all very rough-and-ready, but the general principles will hold, right up until the moment you start allocating arrays that won't fit in the working memory and, again, that's not your problem today.
Remember to Erase the array variant when you've finished, rather than relying on it going out of scope: that'll make a difference, with a range this size.
This works fine.
Sub T()
    Dim A() As Variant
    A = Range("A2").Resize(2250, 2250).Value2
    Dim i As Long, j As Long
    For i = 1 To 2250
        For j = 1 To 2250
            If i = j Then A(i, j) = 1
        Next j
    Next i
    Range("A2").Resize(2250, 2250).Value2 = A
End Sub
I think the best options are:
Try to limit the data to a reasonable number, say 1,000,000 values at a time.
Add some error handling to catch the Out of Memory error and then try again, but cut the size in half, then by a third, a quarter, etc...until it works.
Either way, if we're using data sets in the order of 5,000,000 values and you want to make sure that the program will run, you will need to adjust the code to chop up the data.
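A sketch of what chopping up the data could look like, reusing the 2250 x 2250 block and the diagonal example from the routine above. The chunk size of 500 rows is an arbitrary assumption to tune for your own machine, and the procedure name is made up:

```vba
Sub SketchChunkedRead()
    Const CHUNK_ROWS As Long = 500   ' assumed chunk size, adjust as needed
    Dim rng As Range, block As Variant
    Dim firstRow As Long, n As Long, r As Long, c As Long

    Set rng = ActiveSheet.Range("A2").Resize(2250, 2250)

    For firstRow = 1 To rng.Rows.Count Step CHUNK_ROWS
        ' The last chunk may be shorter than CHUNK_ROWS.
        n = rng.Rows.Count - firstRow + 1
        If n > CHUNK_ROWS Then n = CHUNK_ROWS

        ' One read per chunk: n full-width rows starting at firstRow.
        block = rng.Rows(firstRow).Resize(n).Value2

        For r = 1 To n
            For c = 1 To UBound(block, 2)
                ' firstRow + r - 1 is the row index within the full range.
                If firstRow + r - 1 = c Then block(r, c) = 1
            Next c
        Next r

        ' One write per chunk, then release the memory before the next read.
        rng.Rows(firstRow).Resize(n).Value2 = block
        block = Empty
    Next firstRow
End Sub
```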
I have a range with 170000 rows in it. I'm filtering column A for a single value and returning the corresponding values in column B.
I want these values to dump into an array so I can quickly toss them into a dictionary (with the key being the value I filtered column A with).
The problem is that SpecialCells(xlCellTypeVisible) is behaving inconsistently.
If I do the same test on a smaller range, the values dump into the array just fine. But with a range as large as mine, it only returns the first value in the range. Also, I can use the same line to copy to another sheet. I just can't get it to populate the array.
foo = ws1.Range(tbl1Name & "[ID]").SpecialCells(xlCellTypeVisible)
Works with small ranges (foo becomes an array containing all the values), but returns only the first result in a range as large as mine (fewer than 50 results).
ws1.Range(tbl1Name & "[ID]").SpecialCells(xlCellTypeVisible).Copy ws2.Range("A1")
Works with the large range and copies all the relevant data successfully.
So my question: How do I populate the array without the extra step of copying to a blank worksheet when autofiltering a large table range? (Excel 2013)
EDIT: requires a reference to "Microsoft Forms 2.0 Object Library" (should be near the top of the list of available references). Or add a userform to your project and that will auto-add the reference (you can then remove the form...)
This should work for a single column:
Sub Tester()
    Dim rng, txt As String, cb As New DataObject, arr
    Set rng = ActiveSheet.Range("A2:A28").SpecialCells(xlCellTypeVisible)
    rng.Copy
    DoEvents
    cb.GetFromClipboard
    txt = cb.GetText
    arr = Split(txt, vbCrLf)
    Debug.Print LBound(arr), UBound(arr)
End Sub
If you had multiple columns you'd need to loop over each element of arr (splitting its value on tab) and transfer the values to a 2-d array.
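A sketch of that multi-column step, written as a function that takes the clipboard text (txt in the routine above) and returns a 2-D array. The function name is made up, and it assumes no cell contains a tab or a line break of its own:

```vba
' Hypothetical helper: turns tab/CRLF-delimited clipboard text into a
' 1-based, 2-D variant array of strings.
Function ClipTextTo2D(ByVal txt As String) As Variant
    Dim rowsTxt() As String, parts() As String, res() As Variant
    Dim r As Long, c As Long, nRows As Long, nCols As Long

    ' Copied text normally ends with a line break; drop it so it does not
    ' produce an empty last row.
    If Right$(txt, 2) = vbCrLf Then txt = Left$(txt, Len(txt) - 2)
    If Len(txt) = 0 Then Exit Function   ' returns Empty

    rowsTxt = Split(txt, vbCrLf)
    nRows = UBound(rowsTxt) + 1
    nCols = UBound(Split(rowsTxt(0), vbTab)) + 1   ' width taken from the first row

    ReDim res(1 To nRows, 1 To nCols)
    For r = 1 To nRows
        parts = Split(rowsTxt(r - 1), vbTab)
        For c = 1 To nCols
            ' Short rows leave the remaining items Empty.
            If c - 1 <= UBound(parts) Then res(r, c) = parts(c - 1)
        Next c
    Next r

    ClipTextTo2D = res
End Function
```

Every item comes back as a string, so numbers and dates would need converting (for example with CDbl or CDate) before being used as such.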