VBA Nested Loop with Array running very slowly - Looking for Solutions - arrays

I'm running a nested loop, which I've added an array to in an attempt to speed up, but it is still running fairly slowly. When I have 100 rows and 41 columns of data in the "Active" sheet and 1000 rows and 41 columns of data in the "Closed" sheet, it takes about 7 minutes to run through the code and output the data into the "CompSheet"
Sub CompareColumns()
'Turn off screen updating and automatic calculation
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual
Dim i As Integer 'variable for the outer loop
Dim j As Integer 'variable for the inner loop
Dim ws As Worksheet 'variable for the sheet CompSheet
Dim compareLat As Byte 'variable for the column that is being compared
Dim compareLon As Byte 'variable for the column that is being compared
Dim compareLatArray As Byte
Dim compareLonArray As Byte
Dim uniqueID As String 'variable for the unique identifier
Dim ActiveSheetRows As Integer
Dim ClosedSheetRows As Integer
Dim closedArray As Variant ' variable for closed sheet data
Dim closedArrayRow As Variant
Dim activeArray As Variant ' variable for active sheet data
Dim activeArrayRow As Variant
Dim dLon As Double
Dim x As Double
Dim y As Double
Dim lat_a As Double
Dim lat_c As Double
Dim lon_a As Double
Dim lon_c As Double
Dim result As Double
Dim distance_toggle As Single
Dim distance As Single
ActiveSheetRows = Worksheets("Active").UsedRange.Rows.Count
ClosedSheetRows = Worksheets("Closed").UsedRange.Rows.Count
compareLat = 38 'change this variable to switch the column that is being compared
compareLon = 39 'change this variable to switch the column that is being compared
compareLatArray = 38 'change this variable to switch the column that is being compared
compareLonArray = 39 'change this variable to switch the column that is being compared
distance_toggle = 1.5
'Store the data from the "Closed" worksheet into the array
closedArray = Worksheets("Closed").UsedRange.Value
'Store the data from the "Active" worksheet into the array
activeArray = Worksheets("Active").UsedRange.Value
'Check if the sheet CompSheet exists, if not create it
On Error Resume Next
Set ws = ThisWorkbook.Sheets("CompSheet")
If ws Is Nothing Then
ThisWorkbook.Sheets.Add(After:=ThisWorkbook.Sheets(ThisWorkbook.Sheets.Count)).Name = "CompSheet"
'copy the header row from the "Closed" worksheet when it first creates the "CompSheet" worksheet
Worksheets("Closed").Rows(1).Copy _
Destination:=Worksheets("CompSheet").Range("A1")
'Add the column header "uniqueID" to the last cell in row 1 of the "CompSheet" worksheet
Worksheets("CompSheet").Cells(1, Worksheets("CompSheet").UsedRange.Columns.Count + 1).Value = "uniqueID"
'Add the column header "CompDistance" to the last cell in row 1 of the "CompSheet" worksheet
Worksheets("CompSheet").Cells(1, Worksheets("CompSheet").UsedRange.Columns.Count + 1).Value = "CompDistance"
End If
On Error GoTo 0
'Loop through all the rows in the "Active" worksheet starting on row 2
For i = 2 To UBound(activeArray, 1)
'Loop through the array to look up the data in the "Closed" worksheet
For j = 2 To UBound(closedArray, 1)
lat_a = activeArray(i, compareLat)
lat_c = closedArray(j, compareLatArray)
lon_a = activeArray(i, compareLon)
lon_c = closedArray(j, compareLonArray)
'Calculationg for D2R = 0.0174532925199433
'pi = 4 * Atn(1)
'D2R = pi / 180#
lat_a = 0.0174532925199433 * lat_a
lat_c = 0.0174532925199433 * lat_c
dLon = 0.0174532925199433 * (lon_c - lon_a)
x = Sin(lat_a) * Sin(lat_c) + Cos(lat_a) * Cos(lat_c) * Cos(dLon)
y = Sqr((Cos(lat_c) * Sin(dLon)) ^ 2 + (Cos(lat_a) * Sin(lat_c) - Sin(lat_a) * Cos(lat_c) * Cos(dLon)) ^ 2)
distance = WorksheetFunction.Atan2(x, y) * 3963.19
If distance <= distance_toggle Then
'Copy the row from the Closed worksheet to the CompSheet worksheet in the next available row
Worksheets("CompSheet").Rows(Worksheets("CompSheet").UsedRange.Rows.Count + 1).Insert
closedArrayRow = Worksheets("Closed").Cells(j, 1).Resize(1, UBound(closedArray, 2))
'Worksheets("CompSheet").Range("B1").Resize(UBound(closedArrayRow, 1), UBound(closedArrayRow, 2)).Value = closedArrayRow
Worksheets("CompSheet").Rows(Worksheets("CompSheet").UsedRange.Rows.Count).Resize(1, 41).Value = closedArrayRow
'Create a uniqueID by combining column 6 from both the Active and Closed worksheets with a space and "&" in between
uniqueID = activeArray(i, 5) & " " & "&" & " " & closedArray(j, 5)
'Paste the uniqueID in the next available column of the new row in the CompSheet worksheet
Worksheets("CompSheet").Cells(Worksheets("CompSheet").UsedRange.Rows.Count, compareLon + 1).Value = uniqueID
'Paste the distance value in the corresponding column of the new row in the CompSheet worksheet
Worksheets("CompSheet").Cells(Worksheets("CompSheet").UsedRange.Rows.Count, compareLon + 2).Value = distance
End If
Next j
Next i
'Formatting "CompSheet" Data
Worksheets("CompSheet").Columns.AutoFit
Worksheets("CompSheet").Range("AO:AO").NumberFormat = "#,##0.0"
Worksheets("CompSheet").UsedRange.Font.Bold = False
Worksheets("CompSheet").Cells(1, 1).EntireRow.Font.Bold = True
'Turn on screen updating and automatic calculation
Application.ScreenUpdating = True
Application.Calculation = xlCalculationAutomatic
End Sub
I added the array's to try and speed things up but I'm not sure I've implemented them properly. I've also added some other code to help improve the speed (as suggested by OpenAI ChatGPT), such as:
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual
Any help would be great appreciated.
UPDATE: See below google drive link for the actual excel file. I've run my code and it took 8 minutes. Eventually, I'd like to scale this up to a dataset about 500 times this size. Which would take 60 hours to run based on a linear time calculation.
https://drive.google.com/file/d/1GfR5RbWcHFQC-5oY9izDOQWbzZkvBwXi/view?usp=share_link
What I'm trying to do is compare real estate listings (properties), properties that are currently listed for sale in the "Active" sheet to ones that are already sold, in the "Closed" sheet. For every property (row) in the "Active" sheet, I need the code to check over every sold property in the "Closed" sheet based on the distance toggle and if the sold property is within the specified distance (2 miles) then I want to copy the sold listing row from the "Closed" sheet into the "CompSheet" and also paste the Unique ID (Both addresses concatenated) and the 'distance' variable, for that comp.

Should take less than 10 seconds
Option Explicit
Sub CompareColumns()
'change these variable to switch the column that is being compared
Const compareLat = 38 'AL
Const compareLon = 39 'AM
Const compareLatArray = 38 'AL
Const compareLonArray = 39 'AM
Const distance_toggle = 1.5
Dim wb As Workbook
Dim wsClosed As Worksheet, wsActive As Worksheet, wsComp As Worksheet
Dim n As Long, rComp As Long, colsClosed As Long, t0 As Single: t0 = Timer
Set wb = ThisWorkbook
With wb
Set wsActive = .Sheets("Active")
Set wsClosed = .Sheets("Closed")
n = .Sheets.Count
On Error Resume Next
Set wsComp = .Sheets("CompSheet")
On Error GoTo 0
If wsComp Is Nothing Then
Set wsComp = .Sheets.Add(After:=.Sheets(n))
With wsComp
.Name = "CompSheet"
'copy the header row from the "Closed" worksheet
'when it first creates the "CompSheet" worksheet
wsClosed.Rows(1).Copy .Range("A1")
'Add the column header "uniqueID" and "CompDistance"
'to the end of row 1 of the "CompSheet" worksheet
colsClosed = .UsedRange.Columns.Count
.Cells(1, colsClosed + 1).Value = "uniqueID"
.Cells(1, colsClosed + 2).Value = "CompDistance"
'Formatting "CompSheet" Data
.Columns.AutoFit
.Range("AO:AO").NumberFormat = "#,##0.0"
.UsedRange.Font.Bold = False
.Cells(1, 1).EntireRow.Font.Bold = True
End With
Else
colsClosed = wsClosed.UsedRange.Columns.Count
End If
rComp = wsComp.UsedRange.Rows.Count + 1
End With
'Store the data from the "Active" and "Closed"
'worksheet into the array
Dim arActive, arClosed
arActive = wsActive.UsedRange.Value
arClosed = wsClosed.UsedRange.Value
Dim i As Long, j As Long, k As Long
Dim lat_a As Double, lon_a As Double, lat_c As Double, lon_c As Double
Dim x As Double, y As Double, dLon As Double, distance As Double
Dim uniqueID As String
'Calculationg for D2R = 0.0174532925199433
'pi = 4 * Atn(1)
'D2R = pi / 180#
Const FACTOR As Double = 1.74532925199433E-02
' dimension max possible rows
Dim arComp, z As Long
z = UBound(arActive) * UBound(arClosed)
ReDim arComp(1 To z, 1 To colsClosed + 2)
rComp = 0
'Loop through all the rows in the "Active" worksheet starting on row 2
For i = 2 To UBound(arActive, 1)
lat_a = arActive(i, compareLat) * FACTOR
lon_a = arActive(i, compareLon)
'Loop through the array to look up the data in the "Closed" worksheet
For j = 2 To UBound(arClosed, 1)
lat_c = arClosed(j, compareLatArray) * FACTOR
lon_c = arClosed(j, compareLonArray)
dLon = FACTOR * (lon_c - lon_a)
x = Sin(lat_a) * Sin(lat_c) + Cos(lat_a) * Cos(lat_c) * Cos(dLon)
y = Sqr((Cos(lat_c) * Sin(dLon)) ^ 2 + (Cos(lat_a) * Sin(lat_c) - Sin(lat_a) * Cos(lat_c) * Cos(dLon)) ^ 2)
distance = WorksheetFunction.Atan2(x, y) * 3963.19
If distance <= distance_toggle Then
'Create a uniqueID by combining column 6 from
'both the Active and Closed worksheets
'with a space and "&" in between
uniqueID = arActive(i, 5) & " " & "&" & " " & arClosed(j, 5)
'Copy the row from the Closed worksheet to the
'CompSheet worksheet in the next available row
'Paste the uniqueID and distance in the next available column
'of the new row in the CompSheet worksheet
rComp = rComp + 1
For k = 1 To colsClosed
arComp(rComp, k) = arClosed(j, k)
Next
arComp(rComp, k) = uniqueID
arComp(rComp, k + 1) = distance
End If
Next j
Next i
'Turn off screen updating and automatic calculation
Application.Calculation = xlCalculationManual
Application.ScreenUpdating = False
' result
Dim rngComp As Range
With wsComp
Set rngComp = .Cells(.UsedRange.Rows.Count + 1, "A")
Set rngComp = rngComp.Resize(rComp, colsClosed + 2)
rngComp = arComp
End With
'Turn on screen updating and automatic calculation
Application.ScreenUpdating = True
Application.Calculation = xlCalculationAutomatic
MsgBox "Data written " & rngComp.Address, vbInformation, "Took " & Format(Timer - t0, "0.00 secs")
End Sub

One thing I found so far that is not needed is this:
Worksheets("CompSheet").Rows(Worksheets("CompSheet").UsedRange.Rows.Count + 1).Insert
This looks like you're adding a row to the bottom. You don't have to add rows to the bottom, they're already there - just comment that out and add 1 to your "copy" statement, Rows.Count + 1.
Worksheets("CompSheet").Rows(Worksheets("CompSheet").UsedRange.Rows.Count + 1).Resize(1, 41).Value = closedArrayRow

This should be faster. Compiles, but untested.
Use of worksheet references makes your code cleaner.
Swapped out the WorksheetFunction call for a faster VBA version.
Skipped the Insert when adding data to the comp sheet (as already suggested by Nick).
Use Const for fixed values
Avoid UsedRange since it can be unreliable/unpredictable
Sub CompareColumns()
Const NUM_COLS As Long = 39
Const ID_COL As Long = 40
Const DIST_COL As Long = 41
Const COL_ACT_LAT As Long = 38
Const COL_ACT_LON As Long = 39
Const COL_CLS_LAT As Long = 38
Const COL_CLS_LON As Long = 39
Const DIST_TOGGLE As Double = 1.5
Dim wb As Workbook, wsActive As Worksheet, wsClosed As Worksheet, wsComp As Worksheet
Dim rngClosed As Range, rngActive As Range
Dim i As Long, j As Long
Dim closedArray As Variant, activeArray As Variant
Dim lat_a As Double, lat_c As Double, lon_a As Double, lon_c As Double
Dim distance As Double, lastRw As Long, destRw As Range
Set wb = ThisWorkbook
Set wsActive = wb.Worksheets("Active")
'if your data has no empty rows or columns
Set rngActive = wsActive.Range("A1").CurrentRegion.Resize(, NUM_COLS)
activeArray = rngActive.Value
Set wsClosed = wb.Worksheets("Closed")
Set rngClosed = wsClosed.Range("A1").CurrentRegion.Resize(, NUM_COLS)
closedArray = rngClosed.Value
'add the comparison sheet if not already present
On Error Resume Next 'ignore error if sheet is missing
Set wsComp = wb.Worksheets("CompSheet")
On Error GoTo 0 'stop ignoring errors as soon as it's no longer needed....
If wsComp Is Nothing Then
Set wsComp = wb.Worksheets.Add(After:=wb.Worksheets(wb.Worksheets.Count))
wsComp.Name = "CompSheet"
wsClosed.Range("A1").Resize(1, NUM_COLS).Copy wsComp.Range("A1")
wsComp.Cells(1, ID_COL).Value = "uniqueID"
wsComp.Cells(1, DIST_COL).Value = "CompDistance"
lastRw = 1
Else
'find last row with any data
lastRw = wsComp.Cells.Find(What:="*", SearchOrder:=xlByRows, _
SearchDirection:=xlPrevious).Row
End If
Set destRw = wsComp.Rows(lastRw + 1) 'first empty row on comp sheet
For i = 2 To UBound(activeArray, 1) 'loop "active" array
lat_a = activeArray(i, COL_ACT_LAT) 'you can read these in the outer loop
lon_a = activeArray(i, COL_ACT_LON)
For j = 2 To UBound(closedArray, 1) 'loop "closed" array
lat_c = closedArray(j, COL_CLS_LAT)
lon_c = closedArray(j, COL_CLS_LON)
distance = DistanceCalc(lat_a, lon_a, lat_c, lon_c)
If distance <= DIST_TOGGLE Then
destRw.Cells(1).Resize(1, NUM_COLS).Value = rngClosed.Rows(j).Value
destRw.Cells(ID_COL).Value = activeArray(i, 5) & " " & "&" & " " & closedArray(j, 5)
destRw.Cells(DIST_COL).Value = distance
Set destRw = destRw.Offset(1, 0)
End If
Next j
Next i
With wsComp 'Formatting "CompSheet" Data
.Columns.AutoFit
.Range("AO:AO").NumberFormat = "#,##0.0"
.UsedRange.Font.Bold = False
.Cells(1, 1).EntireRow.Font.Bold = True
End With
End Sub
'Miles between (latA,lonA) and (latB,lonB)
Function DistanceCalc(latA As Double, lonA As Double, latB As Double, lonB As Double) As Double
Const RAD_MULT As Double = 1.74532925199433E-02
Dim dlon As Double, x As Double, y As Double
latA = latA * RAD_MULT
latB = latB * RAD_MULT
dlon = RAD_MULT * (lonB - lonA)
x = Sin(latA) * Sin(latB) + Cos(latA) * Cos(latB) * Cos(dlon)
y = Sqr((Cos(latB) * Sin(dlon)) ^ 2 + (Cos(latA) * Sin(latB) - Sin(latA) * Cos(latB) * Cos(dlon)) ^ 2)
'DistanceCalc = WorksheetFunction.Atan2(x, y) * 3963.19
DistanceCalc = ArcTan2(x, y) * 3963.19 'VBA version is faster
End Function
'VBA version of WorksheetFunction.Atan2
Function ArcTan2(x As Double, y As Double) As Double
Const PI As Double = 3.14159265358979
Const PI_2 As Double = 1.5707963267949
Select Case x
Case Is > 0
ArcTan2 = Atn(y / x)
Case Is < 0
ArcTan2 = Atn(y / x) + PI * Sgn(y)
If y = 0 Then ArcTan2 = ArcTan2 + PI
Case Is = 0
ArcTan2 = PI_2 * Sgn(y)
End Select
End Function

There are a few basic things you can do to speed code up. The easiest is to disable screen updating and calculations. You can use error handling to ensure they get re-enabled.
Sub MyFasterProcess()
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual
On Error GoTo Finally
Call MyLongRunningProcess()
Finally:
Application.ScreenUpdating = True
Application.Calculation = xlCalculationAutomatic
If Err > 0 Then Err.Raise Err
End Sub
Some people like to put that into some helper functions, or even a class to manage the state over several processes.
The most common culprit for long running processes is reading from and writing to cells. It is significantly faster to read an array than it is to read individual cells in the range.
Consider the following:
Sub SlowReadWrite()
Dim src As Range
Set src = Range("A1:AA100000")
Dim c As Range
For Each c In src
c.Value = c.Value + 1
Next c
End Sub
This will take a very, very long time. Now let's do it with an array. Read once. Write once. No need to disable screen updating or set calculation to manual either. This will be just as fast with them on.
Sub FastReadWrite()
Dim src As Range
Set src = Range("A1:AA100000")
'Read once.
Dim vals() As Variant
vals = r.Value
Dim r As Long, c As Long
For r = 1 To UBound(vals, 1)
For c = 1 To UBound(vals, 2)
vals(r, c) = vals(r, c) + 1
Next c
Next r
'Write once.
src.Value = vals
End Sub
Your code looks like it's still performing read / write actions in the loop which is what is slowing you down.

Related

Converting a multi-column table and have the output go to two columns?

I am looking for if it is possible to get the data and headers from a table as in the example image and have the output go to two columns with the first column being a repeating header? I did try the transpose however the email row kept populating up to column E.
Please, try the next way. It uses arrays being fast even for large ranges, mostly working in memory. It returns starting from "F2". It is able to process any other columns you (may) need, after "Status":
Sub TransposeMails()
Dim sh As Worksheet, lastR As Long, lastCol As Long
Dim arrH, arr, arrFin, i As Long, j As Long, k As Long
Set sh = ActiveSheet 'use here the necessary sheet
lastR = sh.Range("A" & sh.rows.count).End(xlUp).row 'last row
lastCol = sh.cells(1, sh.Columns.count).End(xlToLeft).column 'last column
arrH = Application.Transpose(sh.Range(sh.cells(1, 1), sh.cells(1, lastCol)).Value2) 'place headers in an array
arr = sh.Range("A2", sh.cells(lastR, lastCol)).Value2 'place the range to be processed (except headers) in an array for faster iteration/processing
ReDim arrFin(1 To (UBound(arrH) + 1) * UBound(arr), 1 To 2) 'Redim the final array (keeping the processed result)
'+ 1 for the empty rows in between...
For i = 1 To UBound(arr)
For j = 1 To UBound(arrH)
k = k + 1
arrFin(k, 1) = arrH(j, 1): arrFin(k, 2) = arr(i, j)
Next j
k = k + 1 'for the empty row between groups...
Next i
'drop the processed array content:
sh.Range("G2").Resize(UBound(arrFin), 2).Value2 = arrFin
End Sub
The code can be easily adapted to return anywhere (another sheet, workbook, range etc).
The range to be processed must start from "A1" ("Email" header) and not having any other record after the last header (on the first row)...
Transpose Data
Sub TransposeData()
Const SRC_NAME As String = "Sheet1"
Const DST_NAME As String = "Sheet1"
Const DST_FIRST_CELL As String = "A8"
Const EMPTY_COLS As Long = 0
Const EMPTY_ROWS As Long = 1
Dim wb As Workbook: Set wb = ThisWorkbook ' workbook containing this code
Dim sws As Worksheet: Set sws = wb.Sheets(SRC_NAME)
Dim srg As Range: Set srg = sws.Range("A1").CurrentRegion
Dim drOffset As Long: drOffset = srg.Columns.Count + EMPTY_ROWS
Dim dcOffset As Long: dcOffset = 1 + EMPTY_COLS
Dim dws As Worksheet: Set dws = wb.Sheets(DST_NAME)
Dim dfCell As Range: Set dfCell = dws.Range(DST_FIRST_CELL)
Application.ScreenUpdating = False
Dim srrg As Range, shrg As Range
Dim IsHeaderReady As Boolean, IsFirstRowDone As Boolean
For Each srrg In srg.Rows
If Not IsHeaderReady Then
srrg.Copy
dfCell.PasteSpecial Transpose:=True
Set shrg = dfCell.Resize(srg.Columns.Count) ' transpose no more
IsHeaderReady = True
Else ' header is ready; it's already copied for the first data row
If IsFirstRowDone Then shrg.Copy dfCell Else IsFirstRowDone = True
srrg.Copy
dfCell.Offset(, dcOffset).PasteSpecial Transpose:=True
Set dfCell = dfCell.Offset(drOffset)
End If
Next srrg
Application.ScreenUpdating = True
MsgBox "Data transposed.", vbInformation
End Sub
If I understand you correctly
Sub test()
'set the range of the header as rg variable
'count how many data under EMAIL header as cnt variable
Dim rg As Range: Set rg = Range("A1", Range("A1").End(xlToRight))
Dim cnt As Integer: cnt = Range(rg, rg.End(xlDown)).Rows.Count - 1
Dim i As Integer: Dim rslt As Range
'loop to each range with data as many as the cnt value
'prepare the result range as rslt variable
'put the value of header name to rslt range
'put the looped range value to rslt.offset(0,1)
For i = 1 To cnt
Set rslt = Range("A" & Rows.Count).End(xlUp).Offset(3, 0) _
.Resize(rg.Columns.Count, 1)
rslt.Value = Application.Transpose(rg)
rslt.Offset(0, 1).Value = Application.Transpose(rg.Offset(i, 0))
Next
End Sub
Please note, the code must be run where the sheet contains the data is active.

VBA 2D Array for each item find average value

I am trying to find a way to average an array column value with a condition on items from another column in that array - I am aware that a class or dictionary might be the best solution but I would like to stick to an array as in my real scenario I have to use an array.
In this case the data is as follows
Risk ID Data set 1 Data set 2
23359720 1154 587
23359720 1254 658
23359720 854 756
23293773 965 1456
20053692 1458 458
I would like to find the average of Data sets 1 and 2 per Risk ID, here is what I've tried but does not work - I have seen that this it's not possible to use for each and point it to a specific column, but not sure what else to do in the case of an array?
Edit: expected result data:
ArrayResultAverage()
Risk ID Avg Data set 1 Avg Data set 2
23359720 1087.33 667
23293773 965 1456
20053692 1458 458
Sub Test_Arr_Avg()
'
Dim TESTWB As Workbook
Dim TESTWS As Worksheet
Set TESTWB = ThisWorkbook
Set TESTWS = TESTWB.Worksheets("TEST")
'Array set up
Dim RngTest As Range
Dim ArrTestAvg As Variant
NbRowsTest = TESTWS.Range("A1").End(xlDown).Row
Set RngTest = TESTWS.Range(TESTWS.Cells(1, 1), TESTWS.Cells(NbRowsTest, 3))
ArrTestAvg = RangeToArray2D(RngTest)
'Find the average of Data Range 1 for each item in Risk ID
For k = 1 To UBound(ArrTestAvg, 1)
Dim Sum As Variant
Sum = 0
For Each Item In ArrTestAvg(k, 1)
Sum = Sum + ArrTestAvg(k, 2)
Dim AverageDataSet1 As Variant
AverageDataSet1 = Sum / UBound(ArrTestAvg(Item)) + 1
Debug.Print AverageDataSet1
Next Item
Next k
End Sub
Public Function RangeToArray2D(inputRange As Range) As Variant()
Dim size As Integer
Dim inputValue As Variant, outputArray() As Variant
inputValue = inputRange
On Error Resume Next
size = UBound(inputValue)
If Err.Number = 0 Then
RangeToArray2D = inputValue
Else
On Error GoTo 0
ReDim outputArray(1 To 1, 1 To 1)
outputArray(1, 1) = inputValue
RangeToArray2D = outputArray
End If
On Error GoTo 0
End Function
Get Averages of Unique Data
Adjust the values in the constants section, especially the destination worksheet name (it's the same as the source worksheet name) and its first cell address.
The dictionary's keys hold the unique risk ids, while its items (values) hold the associated destination rows.
The result is written to the same array (which is too big) but with dr the destination row size is tracked and only three columns will be copied.
Before the calculation of the averages, column 1 holds the unique risk ids (the same order as in the dictionary), columns 2 and 3 hold the sums while columns 4 and 5 hold the counts of the first and second data set respectively.
Option Explicit
Sub Test_Arr_Avg()
' Source
Const sName As String = "Sheet1"
' Destination
Const dName As String = "Sheet1"
Const dFirstCellAddress As String = "E1"
' Workbook
Dim wb As Workbook: Set wb = ThisWorkbook ' workbook containing this code
' Read from source.
Dim sws As Worksheet: Set sws = wb.Worksheets(sName)
Dim slRow As Long: slRow = sws.Cells(sws.Rows.Count, "A").End(xlUp).Row
Dim srg As Range: Set srg = sws.Range("A1", sws.Cells(slRow, "C"))
Dim srCount As Long: srCount = srg.Rows.Count
' Write source range values to array.
Dim Data As Variant: Data = GetRange(srg)
' Add two helper columns for the count.
ReDim Preserve Data(1 To srCount, 1 To 5)
' Sum up and count uniques.
Dim dict As Object: Set dict = CreateObject("Scripting.Dictionary")
dict.CompareMode = vbTextCompare
Dim dr As Long: dr = 1 ' first row are headers
Dim sr As Long
Dim cr As Long
Dim c As Long
For sr = 2 To srCount
' Sum up.
If dict.Exists(Data(sr, 1)) Then
cr = dict(Data(sr, 1))
For c = 2 To 3
Data(cr, c) = Data(cr, c) + Data(sr, c)
Next c
Else
dr = dr + 1
cr = dr
dict(Data(sr, 1)) = cr
For c = 1 To 3
Data(cr, c) = Data(sr, c)
Next c
End If
' Count.
For c = 4 To 5
Data(cr, c) = Data(cr, c) + 1
Next c
Next sr
' Calculate averages.
For cr = 2 To dr
For c = 2 To 3
Data(cr, c) = Data(cr, c) / Data(cr, c + 2)
' You might want to round the results instead:
'Data(cr, c) = Round(Data(cr, c) / Data(cr, c + 2), 2)
Next c
Next cr
Application.ScreenUpdating = False
' Write to destination.
Dim dws As Worksheet: Set dws = wb.Worksheets(dName)
With dws.Range(dFirstCellAddress).Resize(, 3)
.Resize(dr).Value = Data
' Clear below.
.Resize(dws.Rows.Count - .Row - dr + 1).Offset(dr).Clear
' Apply various formatting.
.Font.Bold = True ' headers
.Resize(dr - 1, 2).Offset(1, 1).NumberFormat = "#0.00" ' averages
.EntireColumn.AutoFit ' columns
End With
'wb.Save
' Inform.
Application.ScreenUpdating = True
MsgBox "Risk ids averaged.", vbInformation
End Sub
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
' Purpose: Returns the values of a range ('rg') in a 2D one-based array.
' Remarks: If ˙rg` refers to a multi-range, only its first area
' is considered.
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Function GetRange( _
ByVal rg As Range) _
As Variant
Const ProcName As String = "GetRange"
On Error GoTo ClearError
If rg.Rows.Count + rg.Columns.Count = 2 Then ' one cell
Dim Data As Variant: ReDim Data(1 To 1, 1 To 1): Data(1, 1) = rg.Value
GetRange = Data
Else ' multiple cells
GetRange = rg.Value
End If
ProcExit:
Exit Function
ClearError:
Debug.Print "'" & ProcName & "' Run-time error '" _
& Err.Number & "':" & vbLf & " " & Err.Description
Resume ProcExit
End Function
It would be complicated to use a single Dictionary. Here I add a Sub-Dictionary for each Risk ID to the main Dictionary. The Sub-Dictionary is used to hold all the values for each ID. The final step is to create an array of averages for all the main Dictionary items.
Sub Test_Arr_Avg()
Dim Data As Variant
With TestWS.Range("A1").CurrentRegion
Data = .Offset(1).Resize(.Rows.Count - 1, 3)
End With
Dim Results As Variant
Results = KeyedAverages(Data, 1, 2)
Stop
End Sub
Function KeyedAverages(Data As Variant, IDColumn As Long, ValueColumn As Long)
Dim Map As Object
Set Map = CreateObject("Scripting.Dictionary")
Dim Key As Variant
Dim r As Long
For r = 1 To UBound(Data)
Key = CStr(Data(r, IDColumn))
If Len(Key) > 0 Then
If Not Map.Exists(Key) Then Map.Add Key, CreateObject("Scripting.Dictionary")
With Map(Key)
.Add CStr(.Count), Data(r, ValueColumn)
End With
End If
Next
Dim Results As Variant
Dim Values As Variant
ReDim Results(1 To Map.Count, 1 To 2)
Dim n As Long
For Each Key In Map.Keys
n = n + 1
Values = Map(Key).Items
Results(n, 1) = Key
Results(n, 2) = WorksheetFunction.Average(Values)
Next
KeyedAverages = Results
End Function
Public Function TestWB() As Workbook
Set TestWB = ThisWorkbook
End Function
Public Function TestWS() As Worksheet
Set TestWS = TestWB.Worksheets("Test")
End Function
Note: Code updated to exclude empty items.

VBA - problem copying data from storage array to sheet array

I am trying to replicate what a Data Table does in excel in VBA. I have got the code working as I want thus far however when I copy data out of the temporary storage array it is offset by 1 Column and 1 Row.
I cannot figure out what the issue is? Thanks in advance.
Sub DataTableLoop()
Application.Calculation = xlCalculationManual
Application.ScreenUpdating = False
Dim CodeRng As Range
Dim PasteRng As Range
Dim WatchRng As Range
Dim ResultRng As Range
Dim ResultRes As Range
Dim x As Integer
Dim y As Integer
Dim i As Integer
Dim Count As Integer
Dim col As Integer
Dim MyArray As Variant
Dim TempArr As Variant
Dim CodeVar As Range
Set CodeRng = Worksheets("OptionCodes").[CodeTop]
Set PasteRng = Worksheets("OptionCodes").[OptionsCode]
Set WatchRng = Worksheets("OptionCodes").[WatchRange]
Set ResultRng = Worksheets("OptionCodes").[ResultsRange]
col = WatchRng.Columns.Count
x = Worksheets("OptionCodes").[Iterations].Value
y = x - 1
i = 0
Set ResultRes = ResultRng.Resize(x)
ReDim MyArray(x, col)
Do While i <= y
Set CodeVar = CodeRng.Offset(i, 0)
Count = i + 1
Application.StatusBar = "Iteration: " & Count & " of " & x
CodeVar.Copy
PasteRng.PasteSpecial Paste:=xlPasteValues
Application.Calculate
TempArr = WatchRng
For j = 1 To col
MyArray(Count, j) = TempArr(1, j)
Next j
i = i + 1
Loop
ResultRes = MyArray
Application.ScreenUpdating = True
Application.Calculation = xlCalculationAutomatic
End Sub
Redim is by default 0 based, so your array is actually 1 row and column larger than you expect. To get 1 based you need to specify the lower bounds
ReDim MyArray(1 To x, 1 To col)

1004 application-defined or object-defined error while naming worksheets vba

I would like to rename worksheets in an exiting workbook. This is the code i am using:
Dim LobArray As Variant
Dim TypeArray As Variant
Dim g As String
'Added during Edit of question.
Dim NoLobs As Long, NoTypes As Long
Dim l As Long, t As Long, s As Long
Dim SheetNames(100) As String
Dim SheetCountSpL As Long
Dim TmplSpl As Workbook
Set TmplSpl = ThisWorkbook
'-----------------------------
g = "_"
LobArray = Array("Lob1", "Lob2", "Lob3", "Lob4")
TypeArray = Array("ea", "pa", "inc")
NoLobs = UBound(LobArray) - LBound(LobArray) + 1
NoTypes = UBound(TypeArray) - LBound(TypeArray) + 1
For l = LBound(LobArray) To UBound(LobArray)
For t = LBound(TypeArray) To UBound(TypeArray)
SheetNames(l * NoLobs + t) = LobArray(l) & g & TypeArray(t)
Next t
Next l
SheetCountSpL = NoTypes * NoLobs
For s = 1 To SheetCountSpL
TmplSpL.Worksheets(s).Activate
TmplSpL.Worksheets(s).Name = SheetNames(s - 1)
Next s
When i reduce the elements in the LobArray to 3 it works. Basically, when the macro has to rename more then 9 sheets, i get the error i mentioned in the title.
This is the code I use to create and rename sheets. It creates sheets based on selected cells and renames the new sheets accordingly. If sheets exist it deletes them
Sub CreateSheetsFromAList()
Dim MyCell As Range
Dim MyRange As Range
Set MyRange = Selection
For Each MyCell In MyRange
Sheets.Add After:=Sheets(Sheets.Count) 'creates a new worksheet
On Error Resume Next
Sheets(Sheets.Count).Name = MyCell.Value 'renames the new worksheet
If Err.Number = 1004 Then
Application.DisplayAlerts = False
ActiveWindow.SelectedSheets.Delete
Application.DisplayAlerts = True
End If
On Error GoTo 0
Next MyCell
End Sub
This is the error:
LobArray = four elements.
TypeArray = three elements.
l = 0, NoLobs = 4, t = 0 on first loop.
First inner loop:
0 * 4 + 0 = 0 = SheetNames(0) = LobArray(0) & TypeArray(0) = "Lob1_ea"
Second inner loop:
0 * 4 + 1 = 1 = SheetNames(1) = .....
Third inner loop:
0 * 4 + 2 = 2 = SheetNames(2) = .....
Fourth inner loop:
Doesn't occur as TypeArray only has 3 elements.
SheetNames(3) is left blank as a result
This code will rename your sheets:
Public Sub Test()
Dim LobArray As Variant
Dim TypeArray As Variant
Dim lobItm As Variant, typeItm As Variant
Dim g As String, x As Long
Dim RequiredSheetCount As Long
g = "_"
LobArray = Array("Lob1", "Lob2", "Lob3", "Lob4")
TypeArray = Array("ea", "pa", "inc")
RequiredSheetCount = (UBound(LobArray) + 1) * (UBound(TypeArray) + 1)
If Worksheets.Count >= RequiredSheetCount Then
For Each lobItm In LobArray
For Each typeItm In TypeArray
x = x + 1
ThisWorkbook.Worksheets(x).Name = lobItm & g & typeItm
Next typeItm
Next lobItm
Else
MsgBox "The workbook needs at least " & RequiredSheetCount & " sheets to work properly."
End If
End Sub

Slice array to use index on larger than 65000

I use the code hereunder to calculate max values as described in this post (vba max value of group of values). The code works great but once I have more than 65k lines I get a data type mismatch when trying to pase the array:
sht.Range(Cells(1, lColumn), Cells(last.Row, lColumn)).Value = Application.Index(groupsArray, , lColumn)
Could somebody help me to slice the array in chunks. I have tried to get it working myself but without any luck.
Sub FillGroupsMax()
Dim lColumn As Long
Dim sht As Worksheet
Dim groupsArray As Variant 'array with all group infomation
Dim groupsSeen As Variant 'array with group infomation already seen
Application.ScreenUpdating = False 'stop screen updating makes vba perform better
Set sht = ThisWorkbook.Worksheets("import")
Set last = sht.Range("A:A").Find("*", Cells(1, 1), searchdirection:=xlPrevious) 'last cell with value in column A
lColumn = sht.Cells(1, Columns.Count).End(xlToLeft).Column
groupsArray = sht.Range(Cells(1, 1), Cells(last.Row, lColumn))
'collect all the information on the Sheet into an array
'Improves performance by not visiting the sheet
For dRow = 2 To last.Row 'for each of the rows skipping header
'check if group as already been seen
If inArrayValue(Cells(dRow, 1).Value, groupsSeen) > 0 Then
'if it has been seen/calculated attribute value
'Cells(dRow, 4).Value = inArrayValue(Cells(dRow, 1).Value, groupsSeen)
groupsArray(dRow, lColumn) = inArrayValue(Cells(dRow, 1).Value, groupsSeen)
Else
'if it hasn't been seen then find max
'Cells(dRow, 4).Value = getMax(Cells(dRow, 1).Value, groupsArray)
groupsArray(dRow, lColumn) = getMax(Cells(dRow, 1).Value, groupsArray, lColumn)
'array construction from empty
If IsEmpty(groupsSeen) Then
ReDim groupsSeen(0)
'groupsSeen(0) = Array(Cells(dRow, 1).Value, Cells(dRow, 4).Value)
groupsSeen(0) = Array(groupsArray(dRow, 1), groupsArray(dRow, lColumn))
'attribute value to array
Else
ReDim Preserve groupsSeen(0 To UBound(groupsSeen) + 1)
groupsSeen(UBound(groupsSeen)) = Array(groupsArray(dRow, 1), groupsArray(dRow, lColumn))
End If
End If
Next
sht.Range(Cells(1, lColumn), Cells(last.Row, lColumn)).Value = Application.Index(groupsArray, , lColumn)
'reactivate Screen updating
Application.ScreenUpdating = True
End Sub
Function getMax(group As String, groupsArray As Variant, lColumn As Long) As Double
'for each in array
For n = 1 To UBound(groupsArray)
'if its the same group the Max we seen so far the record
If groupsArray(n, 1) = group And groupsArray(n, lColumn - 1) > maxSoFar Then
maxSoFar = groupsArray(n, lColumn - 1)
End If
Next
'set function value
getMax = maxSoFar
End Function
Function inArrayValue(group As String, groupsSeen As Variant) As Double
'set function value
inArrayValue = 0
'if array is empty then exit
If IsEmpty(groupsSeen) Then Exit Function
'for each in array
For n = 0 To UBound(groupsSeen)
'if we find the group
If groupsSeen(n)(0) = group Then
'set function value to the Max value already seen
inArrayValue = groupsSeen(n)(1)
'exit function earlier
Exit Function
End If
Next
End Function
You can write a helper function to use instead of Application.Index
Bonus - it will be much faster than using Index (>5x)
Sub Tester()
Dim arr, arrCol
arr = Range("A2:J80000").Value
arrCol = GetColumn(arr, 5) '<< get the fifth column
Range("L2").Resize(UBound(arrCol, 1), 1).Value = arrCol
End Sub
'extract a single column from a 1-based 2-D array
Function GetColumn(arr, colNumber)
Dim arrRet, i As Long
ReDim arrRet(1 To UBound(arr, 1), 1 To 1)
For i = 1 To UBound(arr, 1)
arrRet(i, 1) = arr(i, colNumber)
Next i
GetColumn = arrRet
End Function
EDIT - since QHarr asked about timing here's a basic example
Sub Tester()
Dim arr, arrCol, t, i as long
arr = Range("A2:J80000").Value
t = Timer
For i = 1 to 100
arrCol = GetColumn(arr, 5) '<< get the fifth column
Next i
Debug.print Timer - t '<<# of seconds for execution
End Sub
Below, whilst not as tidy as could be, is a way to process an array in chunks and Index to access a column and write out to the sheet.
I populated two columns (A:B) with data. Both had 132,000 rows, populated incrementally, with values from 1 to 132,000 in each column for my test run.
You can fiddle with cutOff to get the chunk size just below the point where the fail happens.
The code below is simply to demonstrate the principle of looping in batches, upto the set cutoff in each batch, until all rows have been processed.
Option Explicit
Public Sub WriteArrayToSheet()
Dim wb As Workbook
Dim ws As Worksheet
Set wb = ThisWorkbook
Set sht = wb.Worksheets("Sheet1") 'change as appropriate
Dim myArr() 'dynamic array
myArr = sht.Range("A1").CurrentRegion.Value 'you may want a more robust method
Dim cutOff As Long 'the max value - what ever it is before error occurs
cutOff = 1000
Dim totalRows As Long 'total rows in array read in from sheet
totalRows = UBound(myArr, 1)
Dim totalArraysNeeded As Long
'Determine how many lots of cutOff chunks there are in the total number of array rows
totalArraysNeeded = Application.WorksheetFunction.Ceiling(totalRows / cutOff, 1)
Dim rotations As Long 'number of times to loop original array to handle all rows
Dim rowCountTotal As Long
Dim rowCount As Long
Dim tempArr() 'this will hold the chunk of the original array
Dim rowCounter As Long
Dim lastRow As Long
Dim nextRow As Long
Dim i As Long
Dim j As Long
Dim numRows As Long
rotations = 1
Do While rotations < totalArraysNeeded
If rotations < totalArraysNeeded - 1 Then
ReDim tempArr(1 To cutOff, 1 To UBound(myArr, 2)) 'size chunk array
numRows = cutOff
Else
numRows = totalRows - rowCountTotal
ReDim tempArr(1 To numRows, 1 To UBound(myArr, 2)) 'size chunk array
End If
For i = 1 To numRows
rowCount = 1 'rows in this chunk looped
rowCountTotal = rowCountTotal + 1 'rows in original array looped
For j = LBound(myArr, 2) To UBound(myArr, 2)
tempArr(i, j) = myArr(rowCountTotal, j)
Next j
rowCount = rowCount + 1
Next i
With sht
lastRow = .Cells(.Rows.Count, "E").End(xlUp).Row 'Column where I am writing the sliced column out to
End With
If lastRow = 1 Then
nextRow = 1
Else
nextRow = lastRow + 1
End If
sht.Range("E" & nextRow).Resize(UBound(tempArr, 1), 1) = Application.Index(tempArr, , 1) 'write out to sheet
rotations = rotations + 1
Loop
End Sub
As #Tim suggested, the best way to slice a large array is use a loop to copy the column.
Though in your case, most of the processing time is spent on computing the maximum since your code is using a nested loop.
If you want to reduce significantly the processing time, then use a dictionary:
Sub Usage
GetMaxByGroupTo _
sourceGroups := ThisWorkbook.Range("Sheet1!A2:A100"), _
sourceValues := ThisWorkbook.Range("Sheet1!B2:B100"), _
target := ThisWorkbook.Range("Sheet1!C2")
End Sub
Sub GetMaxByGroupTo(sourceGroups As Range, sourceValues As Range, target As Range)
Dim dict As Object, groups(), values(), r As Long, max
Set dict = CreateObject("Scripting.Dictionary")
groups = sourceGroups.Value2
values = sourceValues.Value2
' store the maximum value of each group in a dictionary for an efficient lookup '
For r = Lbound(groups) to Ubound(groups)
max = dict(groups(r, 1))
If VarType(max) And values(r, 1) <= max Then Else dict(groups(r, 1)) = values(r, 1)
Next
' build and copy the result array to the sheet '
For r = Lbound(groups) to Ubound(groups)
values(r, 1) = dict(groups(r, 1))
Next
target.Resize(Ubound(groups), 1).Value2 = values
End Sub

Resources