Use Excel Loop (VBA) to update array ranges - arrays

End result desired: Use a macro to select a range of cells (D530:O530, then D531:O531, down to 1029) and "convert" the standard function into an array.
Rather than hitting F2 and CSE 500 times, I figured there should be an easy way to have Excel do the lifting.
I recorded a macro for the first couple lines (as an example) and attempted to then repeat that operation 500 more times.
Sub ArrayUpdateLoop()
'
' ArrayUpdateLoop Macro
'
Range("D530:O530").Select
For i = 1 To 500
Range("D530:O530").Select
ActiveCell.Offset(1, 0).Select
Selection.FormulaArray = "=Spreader(Rollouts,R[-503]C:R[-503]C[14])"
Next
End Sub
The above is my proposed/modified code using a loop, which seems to run, but doesn't actually perform the operation.
This is my first time using a loop to redo an operation X number of times, and I've not found much of the MS documentation/etc. terribly helpful.
Appreciate any feedback or help :-)

You need to offset by one row down in each loop:
Sub OffsetDown()
Dim i&, rng As Range
'// Start from the row above to line up with loop
Set rng = Range("D529:O529")
For i = 1 to To 500
Set rng = rng.Offset(1) '//Move one cell down
rng.FormulaArray = ...
Next
End Sub

Related

Populate an array without a loop

I am trying to avoid the use of loops for populating arrays since they take a lot of time when managing a lot of data.
Apparently as well, that is possible and easy in VBA but often results in problems.
Here is the code:
sub populate()
'put the whole column in an array
Dim AppArray() As Variant
Dim AppRange As Range
'calculate the last row of the column 1 of sheets
Dim LstRow As Integer
LstRow = Sheets("whole").Cells(Sheets("whole").Rows.Count, "A").End(xlUp).row
'here I calculate the range that I want to pass to the array
Set AppRange = Sheets("whole").Range(Cells(1, 1), Cells(LstRow, 1))
MsgBox ("apprange " & AppRange.Address)
'i dont know if I need to redim or not
ReDim AppArray(1 To LstRow)
'here comes the point. populate the array with the values of the range
AppArray = AppRange.Value
End Sub
This does not work. I also tried application.tranpose(AppRange.Value).
I used:
For i = 1 To LstRow
Debug.Print AppArray(i)
Next
and an error appears, so somehow there is no AppArray(1).
I would be very happy if you can comment on that. More than just arranging the code suggest even other pages (links) to populate arrays with values of ranges when these ranges are not known in advance.
If the case is that looping is very time consuming and that arrays can be populated straight away, I don't understand why 99% of the pages referring to arrays use a loop (or nested loop) to populate an array.
I found the answer.
dim myRange as range
dim myArray() as variant
myRange = range(cells(2,3),cells(10,15))
redeem myArray(1 to 20,1 to 20)
myArray=myRange
It's always much faster to work with variables and arrays than with cells values.

How to improve efficiency using arrays instead of Find in VBA

I have a function that is used to find the information in a Excel worksheet knowing that:
- The Key can be in a variable column
- Variable fields can be searched
Sheets usually have less than a hundred column, but can have anything from a few hundred to 100 000 rows to search. In our biggest files, the function I'm trying to optimize can be used about a million times.
After reading
https://fastexcel.wordpress.com/2011/10/26/match-vs-find-vs-variant-array-vba-performance-shootout/
... and finding our function used Find (3 times), I tried using arrays.
This is the code I wrote
Function getInfo(Key As String, NameField As String, NameKey As String, WksName As String) As Variant
On Error GoTo Error
Dim iColumnKEY As Integer
Dim iColumnFIELD As Integer
Dim i As Integer
Dim ListFields, ListKeys As Variant
ListFields = Worksheets(WksName).Range("A1:ZZ1")
i = LBound(ListFields, 2)
'To identify which column contains the Key and which one contains the
'information we are searching for
Do While iColumnKEY=0 Or iColumnFIELD=0
If i > UBound(ListFields, 2) Then
getInfo = "//error\\"
ElseIf ListFields(1, i) = NameKey Then
iColumnKEY = i
ElseIf ListFields(1, i) = NameField Then
iColumnFIELD = i
End If
i = i + 1
Loop
Dim iROW As Integer
ListKeys = Worksheets(WksName).Columns(iColumnFIELD)
i = LBound(ListKeys, 1)
Do While iROW=0
If i > UBound(ListKeys,1) Then
getInfo = "//error\\"
ElseIf ListKeys(i,1) = Key Then
iROW = i
End If
i = i + 1
Loop
getInfo = Worksheets(WksName).Cells(iROW, iColumnFIELD)
Exit Function
Error:
getInfo = "//error\\"
End Function
The code works, but is very slow. What am I doing that is slowing things down?
It is not in the code right now, but I did try turning the screen update down, as well as automatic calculation down. I didn't see any difference in speed, which indicates me that the basic algorithm is the main issue.
Also, the article was in 2011. Are arrays still a lot faster than Match/Find?
As a side note: eventually, I'll suggest having a macro that search for a range of Keys in a batch, instead of calling the function for every single key. This would means the first Do... While loop would be done only once for a macro, and only the Do_While for Rows would be run for every key. However, this is not an option in the very short term.
Thanks. Any help or advice would be greatly appreciated.
To make sure I understood you correctly, you have a sheet that has a random column that contains unique keys.
you want to search for one of these keys and return related info (like row no, etc) many times
Approach:
Find the column in which the keys are listed.
Load that column in a dictionary(Indexed).
Use GetInfo function to return info about a specific key if it exists.
Dependencies:
Microsoft scripting runtime (Tools > refrences > Microsoft scripting runtime)
code:
Option Explicit
Private KeyDictionary As Scripting.Dictionary
Sub PopulateDictionary(ByRef WS As Worksheet, ByVal FieldName As Variant)
Dim i As Long, LastRow As Long, iColumnFIELD As Long
Dim ListKeys As Variant
iColumnFIELD = WS.Range("A1:ZZ1").Find(FieldName).Column
With WS 'Finds the last row in the sheet
If Application.WorksheetFunction.CountA(.Cells) <> 0 Then
LastRow = .Cells.Find(What:="*", _
After:=.Range("A1"), _
LookAt:=xlPart, _
LookIn:=xlFormulas, _
SearchOrder:=xlByRows, _
SearchDirection:=xlPrevious, _
MatchCase:=False).Row
End If
Set KeyDictionary = New Scripting.Dictionary
For i = 1 To LastRow 'populates Dictionary with Key, Row number pair
If Not (KeyDictionary.Exists(.Cells(i, iColumnFIELD))) Then 'Make sure the key doesn't already exist(Key values should be unique)
KeyDictionary.Add .Cells(i, iColumnFIELD).Value, .Cells(i, iColumnFIELD).Row
End If
Next i
End With
End Sub
Function getInfo(ByVal key) As Variant
If KeyDictionary.Exists(key) Then
getInfo = KeyDictionary(key) 'if the key exist return row number (or whatever you want to)
Else
getInfo = "Null" 'Else return whatever you want like a msgbox "not Found" , etc
End If
End Function
usage
'populates and returns the row number of key 9500
Sub TestSearch()
PopulateDictionary ThisWorkbook.Worksheets("Sheet1"), "Key"
Debug.Print getInfo(9500)
End Sub
Notes:
-almost always Use long instead of integer , not much performance difference, but can save you from overflow pitfalls.
-you can add a reference to the range containing the key instead of the row number that would more flexible
-Passing a sheet by reference (Full Ref) is better than passing just its name and avoids a lot of possible problems like the case of multiple workbooks with the same sheet name and makes your code more reusable.
References:
Dictionary object
Edit:
I misunderstood your request , thought you wanted to know the best method available.
here's a performance comparison of the four methods:
it takes 1325 ms (Or 1.3 seconds) to populate the Dictionary with the unique key list the first time (100,000 Row)
it takes 1.79646327708265E-02 ms aka 0.02 ms To search for an item at the end of list (row 99863) using the dictionary object
it takes around 10.5 ms To search for the same item with WorksheetFunction.Match
it takes around 50 ms To search for the same item with the array method
it takes around 80 ms To search for the same item with the Range.find
Result:
Dictionary Method is faster than match -the second-best method of the Four- by over 500 Times!
The reason is that the keys are indexed inside the dictionary, unlike the other methods.
notes:
Office 2016 was used on a 2 cores (3.20 GHz) machine with 8 gigs or ram (Dictionary took about extra 8 Megabytes ram)
All of these searches were done on the same data set
(the search was done on only 1 column with 100,000 unique keys ,with the searched for value at the bottom of the list)
The break-even point on whether you should use Match or Dictionary is around 120 searches.
if the code will search for more than 120 values then it's better to use the dictionary approach.
Windows API QueryPerformanceCounter was used for High resolution timer.
Code line used to search for Values(not gonna put the full sub)
'Match
WorksheetFunction.Match(Key, ThisWorkbook.Worksheets(1).Range("CW:CW"), 0)
'Find
ThisWorkbook.Worksheets(1).Range("CW:CW").Find(Key).Row
'Array
'Loops through the column till it finds a match
In your code you never use iColumnKEY
I think this is what you are actually after:
Function getInfo(key As String, NameField As String, NameKey As String, WksName As String) As Variant
Dim keyCol As Variant, fieldCol As Variant, keyRow As Variant
Dim errMsg As String
getInfo = "//error\\"
With Worksheets(WksName)
With Intersect(.UsedRange, .Columns("A:ZZ")) ' <--| reference a range in passed worksheet cells belonging to columns "A" to "ZZ" from worksheet first used row to last used one and from worksheet first used column to last used one
MsgBox .Address
fieldCol = Application.Match(NameField, .Rows(1), 0) '<--| look for passed 'NameField' in referenced range
If IsError(fieldCol) Then
errMsg = " :field column '" & NameField & "' not found"
Else
keyCol = Application.Match(NameKey, .Rows(1), 0) '<--| look for passed 'NameKey' in referenced range
If IsError(keyCol) Then
errMsg = " :key column '" & NameKey & "' not found"
Else
MsgBox .Columns(keyCol).Address
keyRow = Application.Match(key, .Columns(keyCol)) '<--| look for passed 'key' in referenced range 'NameKey' column
If IsError(keyRow) Then
errMsg = " :key '" & key & "' not found in column '" & NameKey & "'"
Else
getInfo = .Cells(keyRow, fieldCol) '<--| get referenced range "item"
End If
End If
End If
If errMsg <> "" Then getInfo = getInfo & errMsg
End With
End With
End Function
I see that in your loop you are doing a UBound() evaluation every time. This is not needed.
The following should be faster than a Do While loop. Notice that the array returned by Range().Value has always a lower bound of one. No need to call LBound()
Also, find where the last data exists in the column and restrict the loop to that range. I do this with .End(xlUp)
Dim ListKeys() as Variant
Dim iROW As Long, nRows as Long
nRows = Worksheets(WksName).Cells(Worksheets(WksName).Rows.Count, iColumnFIELD).End(xlUp).Row
ListKeys = Worksheets(WksName).Cell(1, iColumnFIELD).Resize(nRows,1).Value
For i=1 To nRows
If ListKeys(i,1) = Key Then
iROW = i
Exit For
End If
Next i
not an answer but a radically different approach, since im from data-science background i use these methods for fast searching any data in a database which are few GB in size, in your case excel for example. this approach can be parallelized based on number of CPUs in your system. and uses python framework Pandas, which has very big community incase you need support, VB has limited community.
also read this before judging this answer https://corporatefinanceinstitute.com/resources/knowledge/other/transitioning-from-excel-to-python/
i expect criticism for this answer , OP asked this but you are giving this blah. but if you want faster development times for ever changing business needs you need something fast, and easy to maintain. python makes it easy, pandas makes it fast.
to get started read this.https://towardsdatascience.com/read-excel-files-with-python-1000x-faster-407d07ad0ed8
i will mention the pipeline here however. see very few lines of code!!! finish work faster, go home early.
import the excel file as csv
import pandas as pd
dataframe=pd.read_excel("file.xlsx")
item=dataframe[dataframe["Order ID"]==886714971] #condition based searching in excel
note "Order ID" is just any arbitary column and you can use SQL like logic here which resembles match/find in VBA.
for speed reference iterating 1,000,000 rows took 0.03 seconds, which means a transaction speed of 30 TPS. use https://modin.readthedocs.io/en/latest/ to scale up that speed linearly with number of cores in cpu.
To find out what parts of the code are the slowest, you can use Timer:
Dim t as Single
t = Timer
' part of the code
Debug.Print CDbl(Timer - t) ' CDbl to avoid scientific notation
Using .Value2 instead of .Value should help a bit:
ListFields = Worksheets(WksName).Range("A1:ZZ1").Value2
Searching for the key and field in two separate loops should be a bit faster because there will be less comparisons. Also, I am not sure if it will be a bit slower or faster, but you can iterate even multi-dimensional arrays:
Dim i As Long, v ' As Variant
i = 1
For Each v in ListFields
If v = NameKey Then
iColumnKEY = i
Exit For
End If
i = i + 1
Next

Excel 2007 VBA Assigning part of a named range to an array

I have a table of monthly sales figures - FreqData1. Each column represents a month and is numbered 1 to 12. The user chooses one of these numbers from a dropdown list.
I have the code to find the column number and I have tried to assign the data in that column to an array so I can use it copy it to a different spreadsheet but with my basic VBA knowledge and despite lots of searching I have been unable to find the code as to how to do this or a different method to carry this out.
Can anyone help please
Sub AnnualFreqMacro()
Dim TPNoInt As Long, BranchNoInt As Long, ColNo As Long
Dim FreqArray()
Worksheets("Freq data").Activate
TPNoInt = Range("B42").Value
BranchNoInt = Range("B41").Value
ColNo = Application.Match(TPNoInt, Range("TPBr1"), 0)
CharaArray = Range("FreqData1").Cells (1, ColNo), Cells(16, ColNo))
End Sub
Many thanks in advance
I think this is your answer: It's how you're using the range.
Delete your CharArray = ... line and replace with:
With Range("FreqData1")
CharaArray = .Range(.Cells(1, ColNo), .Cells(16, ColNo))
End With
The issue is how you're setting the range, Range().Cells(), Cells() isn't the context, you'd want something more like Range(Cells(),Cells()).
Let's say "FreqData1" is the range A10:A20. If you use
With Range("FreqData1")
.Cells(1,1).Select
End With
this will select the top left most cell (row 1, column 1) in the range "FreqData", so cell A10 would be selected.
A final misc. point: Avoid using .Select/.Activate. You can activate a sheet of course so you can follow your macro, but when setting variables to ranges/cell values, etc. it's best to qualify which sheet you are referring to.
Sub AnnualFreqMacro()
Dim TPNoInt As Long, BranchNoInt As Long, ColNo As Long
Dim FreqArray()
Dim freqWS As Worksheet
Set freqWS = Worksheets("Freq data")
' Worksheets("Freq data").Activate ' removed this, since we have a variable for it now.
TPNoInt = freqWS.Range("B42").Value ' see how I added the worksheet that we want to get B42's value from?
BranchNoInt = freqWS.Range("B41").Value
ColNo = Application.Match(TPNoInt, Range("TPBr1"), 0)
With freqWS.Range("FreqData1") ' I'm assuming "FreqData1" is on this worksheet
CharaArray = .Range(.Cells(1, ColNo), .Cells(16, ColNo))
End With
End Sub
I'm not positive if you have to qualify a named range's sheet, since it's a named range, but I added that just to be safe.
Edit2: Hm, oddly enough, if your named range "myRange" is A1:A10, you can still do myRange.Range(myRange.cells(1,1),myRange.cells(1,2)), even though there's no second column in the range, it just expands it. I thought it'd throw an error, but nope. Just to note.

VBA Nested For Loop Efficiency

I am trying to find the fastest way to perform a task in VBA. Currently I have it written as a nested for loop which can be extremely slow. I am looping over a list of unique numbers and matching them to numbers in a different list. If I get a match I store the information in a multidimensional array since there can be multiple matches and I want to keep track of all of them. Unfortunetly, this means when using a for loop if there are just 1000 unique numbers and 5000 numbers to look for matches my loop can end up iterating 1000*5000 = 5000000 times. As you see this can create a problem quickly. I am asking if there is any better way to approach this problem while staying in VBA. I already did all the tricks like set screenUpdating to false and calculation to manaul.
Here is my code:
For x = 0 To UBound(arrUniqueNumbers)
Dim arrInfo() As Variant
ReDim Preserve arrInfo(0)
If UBound(arrInfo) = 0 Then
arrInfo(0) = CStr(arrUniqueNumbers(x))
End If
For y = 2 To Length
UniqueString = CStr(arrUniquePhoneNumbers(x))
CLEARString = CStr(Sheets(2).Range("E" & y).Value)
If UniqueString = CLEARString Then 'match!
NormalizedDate = Format(CStr(Sheets(2).Range("G" & y).Value), "yyyymmdd")
z = z + 1
ReDim Preserve arrInfo(z)
arrInfo(z) = NormalizedDate & " " & LTrim(CStr(Sheets(2).Range("D" & y).Value))
arrInfo(z) = LTrim(arrInfo(z))
End If
Next
arrUniqueNumbers(x) = arrInfo()
ReDim arrInfo(0) 'erase everything in arrOwners
z = 0
Next
The loop is quite inefficient, so there are quite a few avoidable bottlenecks (mostly in the order of simplest to change to most complex to change)
Take the UniqueString step out of the innermost loop: This step doesn't change with changing y, so no point in repeating it.
Take the Redim Preserve out of the innermost loop: You are reallocating memory in the innermost loop which is extremely inefficient. Allocate 'sufficient' amount of memory outside the loop.
Do not keep using Sheets().Range() to access cell contents: Every time you access something on the spreadsheet, it is a HUGE drag and has a lot of overhead associated with the access. Consider one-step fetch operations from the spreadsheet, and one-step push operations back to the spreadsheet for your results. See sample code below.
Sample code for Efficient Fetch and Push-back operations for the spreadsheet:
Dim VarInput() As Variant
Dim Rng As Range
' Set Rng = whatever range you are looking at, say A1:A1000
VarInput = Rng
' This makes VarInput a 1 x 1000 array where VarInput(1,1) refers to the value in cell A1, etc.
' This is a ONE STEP fetch operation
' Your code goes here, loops and all
Dim OutputVar() as Variant
Redim OutputVar(1 to 1000, 1 to 1)
' Fill values in OutputVar(1,1), (1,2) etc. the way you would like in your output range
Dim OutputRng as Range
Set OutputRng = ActiveSheet.Range("B1:B1000")
' where you want your results
OutputRng = OutputVar
' ONE STEP push operation - pushes all the contents of the variant array onto the spreadsheet
There are quite a few other steps which can further dramatically speed up your code, but these should produce visible impact with not too much effort.
dim dict as Object
set dict = CreateObject("Scripting.Dictionary")
dim x as Long
'Fill with ids
'Are arrUniqueNumbers and arrUniquePhoneNumbers the same?
For x = 0 To UBound(arrUniqueNumbers)
dict.add CStr(arrUniquePhoneNumbers(x)), New Collection
next
'Load Range contents in 2-Dimensional Array
dim idArray as Variant
idArray = Sheets(2).Cells(2,"E").resize(Length-2+1).Value
dim timeArray as Variant
timeArray = Sheets(2).Cells(2,"G").resize(Length-2+1).Value
dim somethingArray as Variant
somethingArray = Sheets(2).Cells(2,"D").resize(Length-2+1).Value
dim y as Long
'Add Values to Dictionary
For y = 2 To Length
Dim CLEARString As String
CLEARString = CStr(timeArray(y,1))
If dict.exists(CLEARString) then
dict(CLEARString).Add LTrim( Format(timeArray(y,1)), "yyyymmdd")) _
& " " & LTrim(CStr(somethingArray(y,1)))
end if
next
Access like this
dim currentId as Variant
for each currentId in dict.Keys
dim currentValue as variant
for each currentValue in dict(currentId)
debug.Print currentId, currentValue
next
next

Comparing two Arrays with excel VBA

As for the problem, I need to be able to compare all data in Variant array A to all data in Variant array B. I know I need some kind of double loop (so that every A value is checked against all B values), but I can't figure out how to do it. Here's what I have so far:
Sub Button_Click()
Dim trgtRange As Variant
Dim tempRange As Variant
Set myRange = ThisWorkbook.Sheets(1).Range("L:L")
For Each cell In myRange
If IsEmpty(cell) Then
ActiveCell.Offset(-1, 0).Select
currentRow = ActiveCell.Row
Set trgtRange = Range("L2:L" & currentRow)
Exit For
End If
Next cell
Set tempRange = Range("A1:A" & currentRow - 1)
' Insert a double loop here
End Sub
So, trgtRange is the Variant A and tempRange is Variant B. I know I could have set the Variant B up a little easier, but I already did it that way. After all, code should be polished as last operation anyway.
You might be wondering why Variants A and B are completely the same. Well, that's because I need to compare them so that I can find values that are close to each other, (i.e 10000 and 12000) and I need to incorporate some kind of tolerance for it.
Here is my answer. Why do you need two loops to do this. Some relative addressing handles this issue quite nicely. Set up a spreadsheet like this for an example:
and your code is simply this
Sub Button_Click()
Dim dblTolerance As Double
Dim tmp As Range
'Get source range
Set tmp = ActiveSheet.Range("A2")
'Get tolerance from sheet or change this to an assignment to hard code it
dblTolerance = ActiveSheet.Range("D13")
'use the temporary variable to cycle through the first array
Do Until tmp.Value = ""
'Use absolute function to determine if you are within tolerance and if so put match in the column
'NOTE: Adjust the column offset (set to 4 here) to match whichever column you want result in
If Abs(tmp.Value - tmp.Offset(0, 2).Value) < dblTolerance Then
tmp.Offset(0, 4).Value = "Match"
Else
tmp.Offset(0, 4).Value = "No Match"
End If
'Go to the next row
Set tmp = tmp.Offset(1, 0)
Loop
'Clean up
Set tmp = Nothing
End Sub
The comments in the code explain how it works. This is superior to a double loop because relative referencing is faster, the memory use is more efficient and you only have to make one pass at each row.
If you are required for some reason to use a double loop let me know, but that is inferior performance wise to this methodology. Hope this helps.

Resources