Speeding up Loop / Match - Code runs very slow - arrays

I have a code that matches a cell value in Column C on Sheet1 to a pivot table on Sheet3 and then copies certain columns over.
Code will check how many entries there are on Sheet1 that need to be checked
Loop 2: For every value in Column C/Sheet1 with a match in Column A on Sheet 2 it will then copy over the corresponding data from Column B,C,D,E.
Since there are multiple matches possible by value/Sheet I am limiting the data pull to three matches (three loops in the code). To achieve that I am increasing i +1 or i+2 to get the next row in the pivot table.
The table on Sheet 2 sometimes has 10,000+ rows, and Excel crashes.
Does anyone have an idea how to speed up the loops (Loop 2, 3 and 4 are the same) and make them less work-intensive, possibly with an array? I think they cause the lock-up because the code keeps running up and down Column A.
Set sheet3 = Sheets("OrbitPivotTable")
CellChanged = Sheet1.Range("A1").Value + 1
LastRow = sheet3.Cells(Rows.Count, "A").End(xlUp).Row
LastData = Sheet1.Cells(Rows.Count, "C").End(xlUp).Row
'Loop1
For i = 1 To LastRow
If Sheet1.Range("C" & CellChanged).Value = "" Then GoTo Nextstep2
If Sheet1.Range("C" & CellChanged).Value = sheet3.Range("A" & i) Then
Sheet1.Range("H" & CellChanged).Value = sheet3.Range("B" & i).Value 'Customer
Sheet1.Range("I" & CellChanged).Value = sheet3.Range("C" & i).Value 'Rate Val start
Sheet1.Range("J" & CellChanged).Value = sheet3.Range("D" & i).Value 'ATA All in
Sheet1.Range("K" & CellChanged).Value = sheet3.Range("E" & i).Value 'Special Remarks
Found = True
End If
If Found = True Or i = LastRow Then
If CellChanged = LastData Then
Exit For
End If
If Found = True Then
Found = False
Nextstep2:
CellChanged = CellChanged + 1
End If
i = 0
End If
Next i
'Loop2
etc....
Excel File

I might have misunderstood the process in the file you shared, but this should be faster (and much less code overall).
I put the pivot table lookup in a loop, switched to Match(), and reduced the number of read/writes using arrays where possible.
EDITED to fix an embarrassing bug where I forgot to adjust the Match() result m to account for the starting row of the range I run match() against...
Sub HB_IPT_Rate_Check()
Dim wsReport As Worksheet, wsCPK As Worksheet, wsOrbitPivot As Worksheet
Dim c As Range, rwReport As Range, lastPivotRow As Long
Dim ata, m, numMatches As Long, matchFrom As Long, matchRow As Long
Set wsReport = ThisWorkbook.Worksheets("Comparison Report")
Set wsCPK = ThisWorkbook.Worksheets("CPK")
Set wsOrbitPivot = ThisWorkbook.Worksheets("OrbitPivotTable")
'loop over the rows in the report sheet
For Each c In wsReport.Range("C3", wsReport.Cells(Rows.Count, "C").End(xlUp)).Cells
ata = c.Value 'read this once....
Set rwReport = c.EntireRow
'1st Database Match "CPK"
m = Application.Match(ata, wsCPK.Columns("A"), 0)
If Not IsError(m) Then
With wsCPK.Rows(m)
rwReport.Columns("D").Resize(1, 4).Value = _
Array(.Columns("B").Value, .Columns("C").Value, _
.Columns("F").Value, .Columns("H").Value)
'Sum of HB CWGT (KG),Sum of MB CWGT (KG),Achiev CPK,Density
End With
Else
'no match...
End If
'2nd Database Match "Orbit"
lastPivotRow = wsOrbitPivot.Cells(Rows.Count, "A").End(xlUp).Row
numMatches = 0 'reset match count
matchFrom = 2
m = Application.Match(ata, wsOrbitPivot.Range("A" & matchFrom & ":A" & lastPivotRow), 0)
'keep going while we still have a match and we've not reached the max result count
Do While Not IsError(m) And numMatches < 3
numMatches = numMatches + 1
matchRow = matchFrom + (m - 1) 'adjust the matched row index according to where we started looking...
'sanity check
Debug.Print "Matched " & ata & " on row " & matchRow
rwReport.Columns("H").Offset(0, (numMatches - 1) * 4).Resize(1, 4).Value = _
wsOrbitPivot.Cells(matchRow, "B").Resize(1, 4).Value
'find the next match if any, starting below the last match
matchFrom = matchRow + 1
m = Application.Match(ata, wsOrbitPivot.Range("A" & matchFrom & ":A" & lastPivotRow), 0)
Loop
Next c 'next report row
End Sub
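For comparison, here is a minimal array-only sketch of the same idea (it is not the code from the answer above): read the lookup column into an array once and collect the positions of the first few matches in a single pass. `FirstMatches` and `DemoFirstMatches` are hypothetical names, and the sketch assumes the keys contain no error values, since comparing an error value with `=` raises a type mismatch.

```vba
' Hypothetical helper: returns the positions of the first maxHits entries
' in keys that equal lookFor. keys is a 2-D, one-column array such as the
' one returned by Range("A2:A10000").Value.
Function FirstMatches(keys As Variant, lookFor As Variant, _
                      ByVal maxHits As Long) As Collection
    Dim hits As New Collection
    Dim i As Long
    For i = LBound(keys, 1) To UBound(keys, 1)
        If keys(i, 1) = lookFor Then
            hits.Add i
            If hits.Count >= maxHits Then Exit For   'stop after the last wanted match
        End If
    Next i
    Set FirstMatches = hits
End Function

Sub DemoFirstMatches()
    Dim keys(1 To 6, 1 To 1) As Variant
    Dim hits As Collection, pos As Variant
    keys(1, 1) = "AAA": keys(2, 1) = "BBB": keys(3, 1) = "AAA"
    keys(4, 1) = "AAA": keys(5, 1) = "CCC": keys(6, 1) = "AAA"
    Set hits = FirstMatches(keys, "AAA", 3)
    For Each pos In hits
        Debug.Print pos   'position within the array, not the sheet row
    Next pos
End Sub
```

With the sample data the demo prints 1, 3 and 4. These are positions within the array, so if the array was read from a range starting at row 2, the sheet row is the position plus 1; this is the same offset adjustment as `matchFrom + (m - 1)` above.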

Use a Dictionary to assign a row index and a column index to each key.
The data is then placed into the matching rows and columns of an in-memory array, which is written to the sheet in one go.
Sub test()
Dim Ws(1 To 4) As Worksheet
Dim DicR As Object ' Dictionary
Dim DicC As Object ' Dictionary
Dim vDB, arr()
Dim s As String
Dim i As Long, n As Long, j As Integer
Dim r As Long, c As Integer
Set Ws(1) = Sheets("Comparison Report")
Set Ws(2) = Sheets("CPK")
Set Ws(3) = Sheets("OrbitPivotTable")
Set Ws(4) = Sheets("Orbit")
'Row index dictionary
Set DicR = CreateObject("Scripting.Dictionary") 'New Scripting.Dictionary
'Column index dictionary
Set DicC = CreateObject("Scripting.Dictionary") ' New Scripting.Dictionary
vDB = Ws(1).UsedRange
For i = 3 To UBound(vDB, 1)
s = vDB(i, 3)
If s <> "" Then
If DicR.Exists(s) Then
'DicC(s) = DicC(s) + 1
Else
n = n + 1
DicR.Add s, n 'row index
DicC.Add s, 0 'column index
End If
End If
Next i
'Size the in-memory result array: one row per dictionary key.
'The number of columns cannot be predicted, so a fixed width of 1000 is used
'(in my test, 100 was too small).
ReDim arr(1 To DicR.Count, 1 To 1000)
For j = 2 To 4
vDB = Ws(j).Range("a1").CurrentRegion
For i = 2 To UBound(vDB, 1)
s = vDB(i, 1)
If DicR.Exists(s) Then
r = DicR(s)
c = DicC(s) * 4 + 1
DicC(s) = DicC(s) + 1
arr(r, c) = vDB(i, 2)
arr(r, c + 1) = vDB(i, 3)
arr(r, c + 2) = vDB(i, 4)
arr(r, c + 3) = vDB(i, 5)
End If
Next i
Next j
With Ws(1)
.Range("d3").Resize(UBound(arr, 1), UBound(arr, 2)) = arr
End With
End Sub
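The fixed width of 1000 columns above is a guess. One possible way to size the array from the data instead is to count, before the `ReDim`, how often the most frequent key occurs on each lookup sheet. The sketch below is an assumption-laden illustration, not part of the answer: `MaxOccurrences` is a hypothetical helper, it uses the same late-bound `Scripting.Dictionary` (Windows only), and it assumes the key column holds no error values.

```vba
' Hypothetical helper: returns the largest number of times any single
' non-blank key occurs in column keyCol of a 2-D array, starting at firstRow.
Function MaxOccurrences(data As Variant, ByVal keyCol As Long, _
                        ByVal firstRow As Long) As Long
    Dim counts As Object
    Dim i As Long, s As String, best As Long
    Set counts = CreateObject("Scripting.Dictionary")
    For i = firstRow To UBound(data, 1)
        s = CStr(data(i, keyCol))
        If Len(s) > 0 Then
            'reading a missing key yields Empty, which adds as 0
            counts(s) = counts(s) + 1
            If counts(s) > best Then best = counts(s)
        End If
    Next i
    MaxOccurrences = best
End Function

Sub DemoMaxOccurrences()
    Dim data(1 To 5, 1 To 1) As Variant
    data(1, 1) = "ATA"   'header row, skipped by firstRow = 2
    data(2, 1) = "K1": data(3, 1) = "K2": data(4, 1) = "K1": data(5, 1) = "K1"
    Debug.Print MaxOccurrences(data, 1, 2)   '"K1" occurs three times
End Sub
```

Because the answer appends four columns per match and accumulates matches across three sheets, a safe width would be four times the sum of the three per-sheet results. That is an upper bound, since keys that never appear on the report sheet are counted too, and it should be guarded against a result of zero before the `ReDim`.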

Related

Using CountA or Equivalent on a range of columns contained within a larger array of columns

I am currently reading a range into an array to perform a few calculations before outputting into another worksheet. My reason for using the array is speed as I am often dealing with thousands of rows.
I have one particular calculation that I am struggling with for some reason.
This is the part I am struggling with (rest of sample of this code is further down):
For i = non_rev_rows To 2 Step -1
' Remove Blank Rows from array
If data_range(i, 2) = "No WBS/CC" Then
If Application.WorksheetFunction.CountA(Range("C" & i & ":M" & i)) = 0 Then
Rows(i).Delete
End If
So basically, when column 2 of a row equals "No WBS/CC", I need to run CountA (or any other method you can recommend) to calculate the total value of columns C to M on that row. I am essentially looking for any row where column 2 = "No WBS/CC" and columns C:M have no value. If so, the entire row should be deleted. If there is a value in columns C:M, I do not want to delete the row.
'Row Count
With Sheets("array")
non_rev_rows = .Range("E" & .Rows.Count).End(xlUp).Row
End With
' Remove Blank Rows from array
' Replace "NO WBS/CC" with Co Code Over-Ride if supplied
' Set Debit / Credit
' Round to 2 decimal places
Set data = array_sheet.Range("A1:M" & non_rev_rows)
data_range = data.Value
For i = non_rev_rows To 2 Step -1
' Remove Blank Rows from array
If data_range(i, 2) = "No WBS/CC" Then
If Application.WorksheetFunction.CountA(Range("C" & i & ":M" & i)) = 0 Then
Rows(i).Delete
End If
' Replace "NO WBS/CC" with Co Code Over-Ride if supplied
If data_range(i, 13) <> 0 Then
data_range(i, 2) = data_range(i, 13)
End If
End If
' Set Debit / Credit
data_range(i, 3) = Replace(data_range(i, 3), "Debit", 41)
data_range(i, 3) = Replace(data_range(i, 3), "Credit", 51)
' Round to 2 decimal places
data_range(i, 5) = WorksheetFunction.Round(data_range(i, 5), 2)
' If data_range(i, 3) = "Debit" Then
' data_range(i, 3).Value = 41
' ElseIf data_range(i, 3) = "Credit" Then
' data_range(i, 3).Value = 51
' End If
'data_range(i, 5).Value = Application.WorksheetFunction.Round(Range(data_range(i, 5)).Value, 2)
'Range("E" & i).Value = Application.WorksheetFunction.Round(Range("E" & i).Value, 2)
Next i
This code does not result in an error but it also does not have the desired impact. I have several rows in my test data that contain "No WBS/CC" in column 2 and zero values in columns C:M but the code is not deleting those rows.
If you want to learn how an array row can be deleted (adapted to your case), please test the following approach. It returns the array without the deleted rows starting from "O1" of the same sheet, so the range to the right of column M must be empty. You can easily adapt the last code line to write the result wherever you need (another sheet, another workbook...):
Sub DeleteArrayRows()
Dim array_sheet As Worksheet, non_rev_rows As Long, Data As Range, count2 As Long, data_range, arrRow, i As Long
Set array_sheet = ActiveSheet 'worksheets("array")
non_rev_rows = array_sheet.Range("E" & array_sheet.rows.count).End(xlUp).row
Set Data = array_sheet.Range("A1:M" & non_rev_rows)
data_range = Data.Value
For i = 1 To UBound(data_range)
count2 = 0
If data_range(i, 2) = "No WBS/CC" Then
With Application
arrRow = .Transpose(.Transpose(.Index(data_range, i, 0))) 'extract a slice of the row array
End With
Debug.Print Join(arrRow, ""): Stop 'just to see the joined slice in the Immediate Window
'comment out the line above once you have seen what it shows, then press F5
If data_range(i, 1) <> "" Then count2 = Len(data_range(i, 1))
If Len(Join(arrRow, "")) - count2 = Len(data_range(i, 2)) Then
data_range = DeleteArrayRow_(data_range, i): i = i - 1
End If
End If
If i = UBound(data_range) Then Exit For
Next i
'drop the array (without deleted rows) in a range:
array_sheet.Range("O1").Resize(UBound(data_range), UBound(data_range, 2)).Value = data_range
End Sub
Private Function DeleteArrayRow_(arr As Variant, RowToDelete As Long) As Variant 'interesting...
'It does not work to eliminate the first array row...
Dim Rws As Long, cols As String
Rws = UBound(arr) - LBound(arr)
cols = "A:" & Split(Columns(UBound(arr, 2) - LBound(arr, 2) + 1).address(, 0), ":")(0)
DeleteArrayRow_ = Application.Index(arr, Application.Transpose(Split(Join(Application.Transpose(Evaluate("Row(1:" & _
(RowToDelete - 1) & ")"))) & " " & Join(Application.Transpose(Evaluate("Row(" & _
(RowToDelete + 1) & ":" & UBound(arr) & ")"))))), Evaluate("COLUMN(" & cols & ")"))
End Function
It is not extremely fast; I show it only for didactic purposes, to demonstrate that it is possible and how.
Note: I did not pay attention to the code lines that follow the deletion. The approach can easily be adapted to include that part...
You can do both tests on the array rather than partially in array and partially in the worksheet.
Only delete the row in the worksheet when you find a full match.
Public Sub Test2()
Dim data_range As Variant
Dim lRows As Long
Dim lColumns As Long
Dim lCounter As Long
data_range = Sheet1.Range("A1:M6")
' Add the data to an array
For lRows = UBound(data_range) To LBound(data_range) Step -1
'Step through the array in reverse
If data_range(lRows, 2) = "No WBS/CC" Then
'Check for the "No WBS/CC" value in the second column of the array
lCounter = 0
'Reset the counter
For lColumns = 3 To 13
If Not IsEmpty(data_range(lRows, lColumns)) Then
lCounter = lCounter + 1
End If
Next lColumns
'Check columns in the array row to see if they have data
'Add to the counter for each cell having value
If lCounter = 0 Then
Sheet1.Rows(lRows).EntireRow.Delete
End If
'If the counter is zero delete the current row in the Workbook
End If
Next lRows
End Sub
Sample data before the macro is run. The row we expected to be removed highlighted in green.
Sample data after the macro is run. The expected row has been removed.
An alternate option is to write the valid rows to a new array.
Clear the data on the worksheet, then write the new array to the worksheet.
Remove Rows
Sub DoStuff()
Dim wb As Workbook: Set wb = ThisWorkbook
Dim ws As Worksheet: Set ws = wb.Worksheets("Array")
Dim LastRow As Long: LastRow = ws.Cells(ws.Rows.Count, "E").End(xlUp).Row
Dim rg As Range: Set rg = ws.Range("A2", ws.Cells(LastRow, "M"))
Dim rCount As Long: rCount = rg.Rows.Count
Dim cCount As Long: cCount = rg.Columns.Count
Dim Data() As Variant: Data = rg.Value
Dim sr As Long
Dim dr As Long
Dim c As Long
For sr = 1 To rCount
If Not IsRowBlank(Data, sr, 3, 13) Then ' is not blank
' Replace "NO WBS/CC" with Co Code Over-Ride if supplied
If CStr(Data(sr, 2)) = "No WBS/CC" Then ' column 2 holds the WBS/CC value, as in the question
If Data(sr, 13) <> 0 Then
Data(sr, 2) = Data(sr, 13)
End If
End If
' Set Debit / Credit
Data(sr, 3) = Replace(Data(sr, 3), "Debit", 41)
Data(sr, 3) = Replace(Data(sr, 3), "Credit", 51)
' Round to 2 decimal places
Data(sr, 5) = Application.Round(Data(sr, 5), 2)
' Copy source row to destination row.
dr = dr + 1
For c = 1 To cCount
Data(dr, c) = Data(sr, c)
Next c
'Else ' is blank; do nothing
End If
Next sr
' Clear bottom source data.
If dr < rCount Then
For sr = dr + 1 To rCount
For c = 1 To cCount
Data(sr, c) = Empty
Next c
Next sr
End If
rg.Value = Data
End Sub
Function IsRowBlank( _
Data() As Variant, _
ByVal DataRow As Long, _
ByVal StartColumn As Long, _
ByVal EndColumn As Long) _
As Boolean
Dim c As Long
For c = StartColumn To EndColumn
If Len(CStr(Data(DataRow, c))) > 0 Then Exit For
Next c
IsRowBlank = c > EndColumn
End Function
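The question mentions rows with zero values in columns C:M. Both `CountA` and `IsEmpty` treat a cell holding 0 as non-empty, so if such rows should count as blank, the test has to say so explicitly. A minimal sketch under that assumption follows; `IsRowBlankOrZero` is a hypothetical variant of the function above, and whether zeros really should be treated as blank depends on the data.

```vba
' Hypothetical variant of a blank-row test: a cell counts as "blank" when it
' is Empty, a zero-length string, or a numeric zero (including the text "0").
Function IsRowBlankOrZero(data As Variant, ByVal dataRow As Long, _
                          ByVal startCol As Long, ByVal endCol As Long) As Boolean
    Dim c As Long, v As Variant
    For c = startCol To endCol
        v = data(dataRow, c)
        If IsError(v) Then Exit Function        'an error value counts as content
        If IsNumeric(v) And Not IsEmpty(v) Then 'IsNumeric(Empty) is True, so exclude it
            If CDbl(v) <> 0 Then Exit Function
        ElseIf Len(CStr(v)) > 0 Then
            Exit Function
        End If
    Next c
    IsRowBlankOrZero = True   'reached only when no cell held content
End Function

Sub DemoIsRowBlankOrZero()
    Dim data(1 To 2, 1 To 4) As Variant
    data(1, 1) = "x": data(1, 2) = "No WBS/CC": data(1, 3) = 0: data(1, 4) = ""
    data(2, 1) = "y": data(2, 2) = "No WBS/CC": data(2, 3) = 0: data(2, 4) = 12.5
    Debug.Print IsRowBlankOrZero(data, 1, 3, 4)   'only 0 and "" in the tested columns
    Debug.Print IsRowBlankOrZero(data, 2, 3, 4)   '12.5 is content
End Sub
```

The demo prints True for the first row and False for the second. The function can be swapped in wherever the array-based checks above test a row, without touching the worksheet inside the loop.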

Faster method to delete a range of rows other than using Union

I am using the code below to delete similar rows, keeping only one, and to combine the cell values in column N, separated by vbLf.
It works, but with a big range (e.g. 30 thousand rows) the macro takes a very long time to finish.
After debugging the code, I found that using Union is what makes the macro so slow:
Set rngDel = Union(rngDel, ws.Range("A" & i + m))
So, with the code below, how can I adapt a faster method to delete that range of rows other than using Union?
In advance, grateful for any helpful comments and answers.
Sub DeleteSimilarRows_combine_Last_Column_N()
Dim LastRow As Long, ws As Worksheet, arrWork, rngDel As Range, i As Long, j As Long, k As Long
Dim strVal As String, m As Long
Set ws = ActiveSheet: LastRow = ws.Cells(ws.Rows.Count, 1).End(xlUp).Row
arrWork = ws.Range("A1:A" & LastRow).Value2 'Place the range in an array to make iteration faster
Application.DisplayAlerts = False: Application.ScreenUpdating = False
For i = 2 To UBound(arrWork) - 1 'Iterate between the array elements:
If arrWork(i, 1) = arrWork(i + 1, 1) Then
'Determine how many consecutive similar rows exist:______
For k = 1 To LastRow
If i + k + 1 >= UBound(arrWork) Then Exit For
If arrWork(i, 1) <> arrWork(i + k + 1, 1) Then Exit For
Next k '__
For j = 14 To 14 'Build the concatenated string of cells in range "N":
strVal = ws.Cells(i, j).Value
For m = 1 To k
strVal = strVal & vbLf & ws.Cells(i + m, j).Value
Next m
ws.Cells(i, j).Value = strVal: strVal = ""
Next j
For m = 1 To k 'Place the cells for rows to be deleted in a Union range, to delete at the end, at once
If rngDel Is Nothing Then
Set rngDel = ws.Range("A" & i + m)
Else
Set rngDel = Union(rngDel, ws.Range("A" & i + m)) 'This line causes macro takes very long time to finish.
End If
Next m
i = i + k: If i >= UBound(arrWork) - 1 Then Exit For 'Increment the i variable and exiting if the resulted value exits the array size
End If
Next i
If Not rngDel Is Nothing Then rngDel.EntireRow.Delete 'Delete the not necessary rows
Application.DisplayAlerts = True: Application.ScreenUpdating = True
End Sub
Union gets progressively slower as you add more cells/areas to the range (see numbers here: https://stackoverflow.com/a/56573408/478884). If you were working "bottom up" you could delete rngDel every (eg) 500 rows, but you can't take that approach since you're working top-down.
Here's a different approach - adding cells to a Collection and then processing the collection "bottom-up" at the end, using a batch-delete process.
Sub TestRowDeletion()
Dim rngRows As Range, data, rngDel As Range, i As Long
Dim t, nRows As Long, colCells As New Collection
Dim flags() As Variant, bDelete As Boolean 'used by approach #3
Set rngRows = Range("A1:A10000") '10k rows for testing
'Approach #1 - your existing method
DummyData rngRows 'populate some dummy data
data = rngRows.Value
t = Timer
For i = 1 To UBound(data, 1)
'removing ~25% of cells...
If data(i, 1) > 0.75 Then BuildRange rngDel, rngRows.Cells(i)
Next i
If Not rngDel Is Nothing Then rngDel.EntireRow.Delete
Debug.Print "Regular single delete", Timer - t
'Approach #2 - batch-deleting rows
DummyData rngRows 'reset data
data = rngRows.Value
t = Timer
For i = 1 To UBound(data, 1)
If data(i, 1) > 0.75 Then colCells.Add rngRows.Cells(i)
Next i
RemoveRows colCells
Debug.Print "Batch-deleted", Timer - t
'Approach #3 - array of "delete" flags plus SpecialCells()
DummyData rngRows 'reset data
data = rngRows.Value
t = Timer
ReDim flags(1 To UBound(data, 1), 1 To UBound(data, 2))
For i = 1 To UBound(data, 1)
If data(i, 1) > 0.75 Then
flags(i, 1) = "x"
bDelete = True 'flag we have rows to delete
End If
Next i
If bDelete Then
With rngRows.Offset(0, 10) 'use an empty column....
.Value = flags 'populate with flags for deletion
.SpecialCells(xlCellTypeConstants).EntireRow.Delete
End With
End If
Debug.Print "Specialcells", Timer - t
End Sub
'Delete the row for any cell in `col`
' cells were added to `col` in a "top down" order
Sub RemoveRows(col As Collection)
Dim rngDel As Range, n As Long
For n = col.Count To 1 Step -1 'working from the bottom up...
BuildRange rngDel, col(n)
If n Mod 250 = 0 Then
rngDel.EntireRow.Delete
Set rngDel = Nothing
End If
Next n
If Not rngDel Is Nothing Then rngDel.EntireRow.Delete
End Sub
Sub DummyData(rng As Range)
With rng
.Formula = "=RAND()"
.Value = .Value
End With
End Sub
Sub BuildRange(ByRef rngTot As Range, rngAdd As Range)
If rngTot Is Nothing Then
Set rngTot = rngAdd
Else
Set rngTot = Application.Union(rngTot, rngAdd)
End If
End Sub
Times (sec) - note how differently the single-delete and batch-delete approaches scale as more rows are added.
# of rows deleted           ~2.5k/10k   ~5k/20k   ~7.5k/30k
------------------------------------------------------------
1. Regular single delete    10.01       65.9      226
2. Batch-deleted            2.2         4.7       7.8
3. SpecialCells             1.6         3.1       4.7
You could also consider populating a "delete" flag in your dataset, then using the autofilter/delete visible rows approach (EDIT: added as method #3)
Posting this as a working (but faster) version of your actual use case, since my other answer is really just about timing the different approaches.
Sub DeleteSimilarRowsCombineColumnN()
Const SEP As String = ","
Dim arrKeys, arrVals, arrFlags, rngRows As Range, rngVals As Range, i As Long, key, currKey, s As String
Dim ws As Worksheet, ub As Long, t, n As Long
t = Timer
Set ws = ActiveSheet
Set rngRows = ws.Range("A1:A" & ws.Cells(Rows.Count, 1).End(xlUp).Row)
Set rngVals = rngRows.EntireRow.Columns("N")
arrKeys = rngRows.Value
ub = UBound(arrKeys, 1)
arrVals = rngVals.Value
ReDim arrFlags(1 To UBound(arrKeys, 1), 1 To 1)
currKey = Chr(0) 'non-existing key...
For i = ub To 1 Step -1 'looping from bottom up
key = arrKeys(i, 1) 'this row's key
If key <> currKey Then 'different key from row below?
If i < ub Then arrVals(i + 1, 1) = s 'populate the collected info for any previous key
s = arrVals(i, 1) 'collect this row's "N" value
currKey = key 'set as current key
Else
If i < ub Then
arrFlags(i + 1, 1) = "x" 'flag for deletion
n = n + 1
End If
s = arrVals(i, 1) & SEP & s 'concatenate the "N" value
End If
Next i
arrVals(1, 1) = s 'populate the last (first) row...
rngVals.Value = arrVals 'drop the concatenated values
If n > 0 Then 'any rows to delete?
Debug.Print "About to delete " & n & " of " & ub & " rows", Timer - t
With rngRows.Offset(0, 100) 'use any empty column
.Value = arrFlags
.SpecialCells(xlCellTypeConstants).EntireRow.Delete
End With
Debug.Print "Done deleting in " & Round(Timer - t, 2) & " sec"
End If
End Sub
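Another option, shown here only as a sketch and not as part of either answer, is to avoid row deletion altogether: read the whole block into an array, collapse each run of identical consecutive keys in memory, and write the array back once. `CollapseRuns` is a hypothetical helper; it assumes a 1-based array as returned by `Range.Value`, keys without error values, and data that is already grouped by key.

```vba
' Hypothetical helper: collapses runs of consecutive rows that share the same
' key. Works in place on a 1-based 2-D array, joins the joinCol values of each
' run with sep, empties the leftover bottom rows and returns the rows kept.
Function CollapseRuns(data As Variant, ByVal keyCol As Long, _
                      ByVal joinCol As Long, ByVal sep As String) As Long
    Dim r As Long, c As Long, kept As Long, isRepeat As Boolean
    For r = 1 To UBound(data, 1)
        isRepeat = False
        'VBA's And does not short-circuit, so test kept before indexing with it
        If kept > 0 Then isRepeat = (data(r, keyCol) = data(kept, keyCol))
        If isRepeat Then
            data(kept, joinCol) = data(kept, joinCol) & sep & data(r, joinCol)
        Else
            kept = kept + 1
            For c = 1 To UBound(data, 2)
                data(kept, c) = data(r, c)   'move the row up to its final position
            Next c
        End If
    Next r
    For r = kept + 1 To UBound(data, 1)      'clear what is left below
        For c = 1 To UBound(data, 2)
            data(r, c) = Empty
        Next c
    Next r
    CollapseRuns = kept
End Function

Sub DemoCollapseRuns()
    Dim data(1 To 5, 1 To 2) As Variant, kept As Long
    data(1, 1) = "A": data(1, 2) = "x"
    data(2, 1) = "A": data(2, 2) = "y"
    data(3, 1) = "B": data(3, 2) = "z"
    data(4, 1) = "C": data(4, 2) = "p"
    data(5, 1) = "C": data(5, 2) = "q"
    kept = CollapseRuns(data, 1, 2, vbLf)
    Debug.Print kept   'three distinct runs: A, B, C
End Sub
```

On a real sheet the array would come from something like `rng.Value` and go back with `rng.Value = data`. Note that writing values back replaces any formulas in that range with their values and leaves cell formatting where it was, so this fits plain data better than formatted reports.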

Loop through array and return multiple rows based on one or more search criteria

I am looping through ArrayDestination through two columns (customer name and process number).
I am looping through ArraySourceData to find matches (invoice number and amount) for the above search criteria.
If there is a match it gets copied to the array and once both loops finish the results get copied to the worksheet.
So far it works except that the loop is only returning the first match.
If a customer has multiple identical process numbers the loop only returns the first match for all of them.
My b variable looks a bit static and I tried to cheer it up with b = b + 1.
For simplicity I didn't post creating the array part. It works. If needed I can provide it.
Sub search_loop_arrray()
For a = 2 To UBound(ArraySourceData)
varCustomerName = ArraySourceData(a, 3)
varProcessNumber = ArraySourceData(a, 5)
For b = 2 To UBound(ArrayDestination)
If ArrayDestination(b, 3) = varCustomerName And _
ArrayDestination(b, 8) = varProcessNumber Then
ArrayDestination(b, 9) = ArraySourceData(a, 11)
ArrayDestination(b, 10) = ArraySourceData(a, 12)
Exit For
End If
Next b
Next a
'transfer data (invoice number and amount) from ArrayDestination to wsDestination (Column 9 and 10)
For a = 2 To UBound(ArraySourceData)
For b = 9 To 10
wsDestination.Cells(a, b).Value = ArrayDestination(a, b)
Next b
Next a
End Sub
02/02/2020
I rewrote the code in a nested for loop without the array. This code works. The problem is there are duplicated process numbers in my source data.
In my example I "cut and paste" the already found process numbers in a sheet called coincidences. It is working BUT I was looking to parse everything into an array due to dealing with 100.000+ rows and 20+ columns.
I don't know if my "copy to temporary coincidences sheet" would make sense in the array?
Sub find_invoice()
Dim wsSourceData As Worksheet
Dim wsResults As Worksheet
Dim wsCoincidences As Worksheet
Dim varCustomer As String
Dim varProcessNumber As Long
Dim varInvoiceNumber As Long
Dim varSDlastrow As Long ' Long, because Integer overflows above 32,767 rows
Dim varRElastrow As Long
Dim varCIlastrow As Long
Dim varCounterResults As Long
Set wsResults = ThisWorkbook.Sheets("RESULTS")
Set wsSourceData = ThisWorkbook.Sheets("SOURCEDATA")
Set wsCoincidences = ThisWorkbook.Sheets("COINCIDENCES")
varSDlastrow = wsSourceData.Cells(Rows.Count, 1).End(xlUp).Row
varRElastrow = wsResults.Cells(Rows.Count, 1).End(xlUp).Row
varCIlastrow = wsCoincidences.Cells(Rows.Count, 1).End(xlUp).Row
For i = 2 To varRElastrow
varCustomer = wsResults.Cells(i, 1)
varProcessNumber = wsResults.Cells(i, 2)
For j = 2 To varSDlastrow
If wsSourceData.Cells(j, 1) = varCustomer And wsSourceData.Cells(j, 2) = varProcessNumber Then
wsResults.Cells(i, 3) = wsSourceData.Cells(j, 3)
wsResults.Cells(i, 4) = wsSourceData.Cells(j, 4)
wsCoincidences.Rows(varCIlastrow).EntireRow.Value = wsSourceData.Rows(j).EntireRow.Value
wsSourceData.Rows(j).EntireRow.Delete
varCIlastrow = varCIlastrow + 1
Exit For
End If
Next j
Next i
End Sub
I'm not sure your logic is right. If you are saying you need to match 2 parameters and several entities can contain those two parameters, then I don't see how you can do anything other than find either the first or last occurrence. Wouldn't you need a third parameter to distinguish the matches?
You'll see in the sample code below that I've assumed the source data has the list of invoices, which are sequential, and the destination data has the duplicate customer and process parameters. In this case I've assumed that the invoice matching on the destination sheet should also be sequential, i.e. the 2nd occurrence of a duplicate matches the 2nd occurrence of an invoice. So here, 'sequence' becomes the third parameter, but yours may be different.
It might also be easier to format your data into a hierarchical structure:
customer -> process -> invoice
so you can see what's going on a little more easily. Classes are ideal for this. Your code is hard to follow, as that Exit For will guarantee a first match only, and the transfer loop iterates to the upper bound of the ArraySourceData array and yet processes ArrayDestination (I can't see what you're trying to do there, unless it's an error).
To show you what I mean, create three classes (Insert > Class Module) called cCustomer, cProcess and cInvoice. Add the following code to each:
cCustomer
Option Explicit
Public Name As String
Public Processes As Collection
Public Sub AddInvoice(processNum As String, invoiceNum As String, invAmount As Double)
Dim process As cProcess
Dim invoice As cInvoice
On Error Resume Next
Set process = Processes(processNum)
On Error GoTo 0
If process Is Nothing Then
Set process = New cProcess
With process
.ProcessNumber = processNum
Processes.Add process, .ProcessNumber
End With
End If
Set invoice = New cInvoice
With invoice
.InvoiceNumber = invoiceNum
.Amount = invAmount
process.Invoices.Add invoice
End With
End Sub
Public Function GetProcess(num As String) As cProcess
On Error Resume Next
Set GetProcess = Processes(num)
End Function
Private Sub Class_Initialize()
Set Processes = New Collection
End Sub
cProcess
Option Explicit
Public ProcessNumber As String
Public Invoices As Collection
Public CurrentInvoiceCount As Long
Private Sub Class_Initialize()
Set Invoices = New Collection
End Sub
cInvoice
Option Explicit
Public InvoiceNumber As String
Public Amount As Double
Public ArrayIndex As Long
The following routine in your Module will output the data as I described above:
Dim customers As Collection
Dim customer As cCustomer
Dim process As cProcess
Dim invoice As cInvoice
Dim srcData As Variant, dstData As Variant
Dim output() As Variant
Dim i As Long
'Populate the source data array.
'Note: just an example here, use whatever array populating code you have.
With Sheet1 'I've put some dummy data in my Sheet1.
srcData = _
.Range( _
.Cells(2, "A"), _
.Cells(.Rows.Count, "A").End(xlUp)) _
.Resize(, 12) _
.Value2
End With
'Populate the destination data array.
'Note: just an example here, use whatever array populating code you have.
With Sheet2 'I've put some dummy data in my Sheet2.
dstData = _
.Range( _
.Cells(2, "A"), _
.Cells(.Rows.Count, "A").End(xlUp)) _
.Resize(, 10) _
.Value2
End With
'Convert source array to hierarchical collections.
Set customers = New Collection
For i = 1 To UBound(srcData, 1)
Set customer = Nothing: On Error Resume Next
Set customer = customers(CStr(srcData(i, 3))): On Error GoTo 0
If customer Is Nothing Then
Set customer = New cCustomer
With customer
.Name = CStr(srcData(i, 3))
customers.Add customer, .Name
End With
End If
customer.AddInvoice CStr(srcData(i, 5)), CStr(srcData(i, 11)), CDbl(srcData(i, 12))
Next
'Match destination array.
For i = 1 To UBound(dstData, 1)
Set customer = Nothing: On Error Resume Next
Set customer = customers(CStr(dstData(i, 3))): On Error GoTo 0
If Not customer Is Nothing Then
Set process = customer.GetProcess(CStr(dstData(i, 8)))
If Not process Is Nothing Then
With process
.CurrentInvoiceCount = .CurrentInvoiceCount + 1
If .CurrentInvoiceCount > .Invoices.Count Then
MsgBox "No further invoices for [cust=" & customer.Name & ";" & process.ProcessNumber & "]"
Else
Set invoice = .Invoices(.CurrentInvoiceCount)
invoice.ArrayIndex = i
End If
End With
End If
End If
Next
'Populate the output array.
ReDim output(1 To UBound(dstData, 1), 1 To 2)
For Each customer In customers
For Each process In customer.Processes
For Each invoice In process.Invoices
With invoice
If .ArrayIndex > 0 Then
output(.ArrayIndex, 1) = .InvoiceNumber
output(.ArrayIndex, 2) = .Amount
End If
End With
Next
Next
Next
'Write array to worksheet
Sheet2.Cells(2, 9).Resize(UBound(output, 1), UBound(output, 2)).Value = output
Without seeing some sample data, it's difficult to be certain, but my point is this: if only a combination of three parameters makes something unique, then you'll need to match against those three parameters.
If you have 100,000 rows on the SOURCEDATA sheet and 10,000 rows on the RESULTS sheet, then two nested loops mean up to 1,000,000,000 iterations. A more efficient way is to use a Dictionary object with a key built from your 2 match criteria (col1 and col2), joined by a character of your choice such as "~" (tilde) or "_" (underscore). Scan the SOURCEDATA sheet once to build a lookup from key to row number. Then scan the RESULTS sheet once, concatenate the 2 fields as before, and use the dictionary's .Exists(key) method to find a match; that gives you the relevant row number on SOURCEDATA. Here is some code to illustrate. I tested it with 100,000 source rows and 10,000 results rows of random data; matching the keys and filling in columns C and D on the RESULTS sheet takes around 3 seconds. Add a sheet called RUNLOG for the performance figures. It looks like a lot of code, but much of it is logging.
Option Explicit
Sub find_invoice2()
Const MSG As Boolean = False ' TRUE to show message boxes
Const RUNLOG As Boolean = False ' TRUE to log matches, no match etc
Dim wb As Workbook, start As Single, finish As Single
start = Timer
Set wb = ThisWorkbook
' set up sheets
Dim wsSourceData As Worksheet, wsResults As Worksheet, wsLog As Worksheet, wsMatch
With wb
Set wsResults = .Sheets("RESULTS")
Set wsSourceData = .Sheets("SOURCEDATA")
Set wsMatch = .Sheets("COINCIDENCES")
Set wsLog = .Sheets("RUNLOG")
End With
' find last row of source and results
Dim lastRowSource As Long, lastRowResults As Long, lastRowLog As Long, lastRowMatch
lastRowSource = wsSourceData.Cells(Rows.Count, 1).End(xlUp).Row
lastRowResults = wsResults.Cells(Rows.Count, 1).End(xlUp).Row
lastRowMatch = wsMatch.Cells(Rows.Count, 1).End(xlUp).Row
' set up log sheets
wsLog.Cells.Clear
wsLog.Range("A1:E1") = Array("Source Row", "Result Row", "Customer~Process", "Message", "Date Time")
wsLog.Cells(2, 4) = "Started"
wsLog.Cells(2, 5) = Time
lastRowLog = 3
' create lookup from Source
' key = Name~ProcessID, value = array row
Dim dict As Object, sKey As String, iRow As Long
Set dict = CreateObject("scripting.dictionary")
With wsSourceData
For iRow = 2 To lastRowSource
sKey = CStr(.Cells(iRow, 1)) & "~" & CStr(.Cells(iRow, 2)) ' customer~process
If Len(sKey) > 1 Then ' skip blanks lines if any
If dict.exists(sKey) Then
dict.Item(sKey) = dict.Item(sKey) & "_" & CStr(iRow)
If MSG Then MsgBox "Ignoring duplicate key in Source Data " & sKey, vbCritical
If RUNLOG Then
With wsLog.Cells(lastRowLog, 1)
.Offset(0, 0) = iRow
.Offset(0, 2) = sKey
.Offset(0, 3) = "Source : Ignoring duplicate key "
.Offset(0, 4) = Time
End With
lastRowLog = lastRowLog + 1
End If
Else
dict.Add sKey, iRow
'Debug.Print "Dict add", sKey, iRow
End If
End If
Next
End With
If MSG Then MsgBox dict.Count & " records added to dictionary"
wsLog.Cells(lastRowLog, 4) = "Dictionary Built Keys Count = " & dict.Count
wsLog.Cells(lastRowLog, 5) = Time
lastRowLog = lastRowLog + 1 ' blank line to separate results
' scan results sheet
Dim sDict As String, countMatch As Long, countNoMatch As Long, sMsg As String
Dim ar As Variant, i As Long
countMatch = 0: countNoMatch = 0
Application.ScreenUpdating = False
With wsResults
For iRow = 2 To lastRowResults
sKey = CStr(.Cells(iRow, 1)) & "~" & CStr(.Cells(iRow, 2)) ' customer~process
If Len(sKey) > 1 Then 'skip blanks lines if any
If dict.exists(sKey) Then
' split string to get multiple lines
sDict = dict(sKey)
ar = Split(sDict, "_")
.Cells(iRow, 3).Value = UBound(ar) + 1
For i = 0 To UBound(ar)
.Cells(iRow, 4).Offset(0, i) = ar(i)
Next
lastRowMatch = lastRowMatch + 1
countMatch = countMatch + 1
If RUNLOG Then
With wsLog.Cells(lastRowLog, 1)
.Offset(0, 0) = sDict
.Offset(0, 1) = iRow
.Offset(0, 2) = sKey
.Offset(0, 3) = "Match - Source record deleted"
.Offset(0, 4) = Time
End With
lastRowLog = lastRowLog + 1
End If
'Debug.Print iRow,sDict, sKey,
Else
' no match
If MSG Then MsgBox "Results Row " & iRow & ": NO match for " & sKey, vbExclamation, "NO match"
countNoMatch = countNoMatch + 1
If RUNLOG Then
With wsLog.Cells(lastRowLog, 1)
.Offset(0, 1) = iRow
.Offset(0, 2) = sKey
.Offset(0, 3) = "Results : NO match"
.Offset(0, 4) = Time
.EntireRow.Interior.Color = vbYellow
End With
.Cells(iRow, 3).Resize(1, 2).Interior.Color = vbYellow
lastRowLog = lastRowLog + 1
'Debug.Print iRow, sDict, sKey,
End If
End If
End If
Next
End With
Application.ScreenUpdating = True
wsLog.Cells(lastRowLog, 4) = "Program Ended Rows Scanned = " & lastRowResults - 1
wsLog.Cells(lastRowLog, 5) = Time
wsLog.Columns.AutoFit
wsLog.Activate
wsLog.Columns("A:B").HorizontalAlignment = xlCenter
wsLog.Range("A1").Select
' result
finish = Timer
sMsg = "Matched = " & countMatch & vbCrLf _
& "NO match = " & countNoMatch & vbCrLf _
& "Run time (secs) = " & Int(finish - start)
MsgBox sMsg, vbInformation, "Results"
End Sub

Looping through an array while grabbing certain elements

I have a giant dataset that looks like this
I am trying to go down the list of different companies and grab 3 per company and combine them. Based on the photo above, I would have 2 different lists with 3 companies each (except TH Repair which will have 2 in the final list).
My real dataset contains hundreds of different companies, each with dozens/hundreds of entries so I would finish with dozens of lists (each potentially hundreds long).
I tried to record a macro and ended up with this code
Sub Loop1()
'
' Loop1 Macro
'
'
Range("A4:E6").Select
Selection.Copy
Sheets("Sheet3").Select
Range("A18").Select
ActiveSheet.Paste
Sheets("Sheet2").Select
Range("A11:E13").Select
Application.CutCopyMode = False
Selection.Copy
Sheets("Sheet3").Select
Range("A21").Select
ActiveSheet.Paste
Sheets("Sheet2").Select
Range("A17:E19").Select
Application.CutCopyMode = False
Selection.Copy
Sheets("Sheet3").Select
Range("A24").Select
ActiveSheet.Paste
End Sub
However, this turned out to be WAY more complicated than I expected.
I am looking for the end result to look like this
See if something like this works for you. I only ran one scenario through it, so you will want to test it more.
This assumes that the data on the original sheet is sorted by column B.
It also assumes that row 1 holds either headers or no data.
You will need to change the "Sheet1" in the line Set ws1 = ActiveWorkbook.Worksheets("Sheet1") to the name of the sheet you are starting with.
Option Explicit
Public Sub MoveData()
Dim ws1 As Worksheet
Set ws1 = ActiveWorkbook.Worksheets("Sheet1")
Dim ws2 As Worksheet
Set ws2 = ActiveWorkbook.Worksheets.Add()
Dim rw As Long
Dim match_count As Integer
Dim list_multiplier As Integer
list_multiplier = 7
Dim list_row() As Long
ReDim list_row(0)
list_row(0) = 2
For rw = 2 To ws1.Range("A" & ws1.Rows.Count).End(xlUp).Row
If ws1.Range("B" & rw).Value <> ws1.Range("B" & rw).Offset(-1, 0).Value Then
match_count = 0
Else
match_count = match_count + 1
End If
Dim list_num As Integer
list_num = match_count \ 3
If list_num > UBound(list_row, 1) Then
ReDim Preserve list_row(list_num)
list_row(list_num) = 2
End If
'Copy columns A:E of the current row in one assignment instead of five.
ws2.Cells(list_row(list_num), 1 + list_multiplier * list_num).Resize(1, 5).Value = ws1.Range("A" & rw).Resize(1, 5).Value
list_row(list_num) = list_row(list_num) + 1
Next rw
End Sub
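Since MoveData above relies on the data being sorted by column B, one option is to sort the source range first. The following is only a sketch: the routine name SortByCompany is hypothetical, and it assumes the data sits in columns A:E of "Sheet1" with headers in row 1.

```vba
Option Explicit

' Hypothetical helper: sort the source data by column B so that rows for
' the same company are adjacent before MoveData is run.
' Assumes columns A:E on "Sheet1" with a header in row 1.
Public Sub SortByCompany()
    Dim ws As Worksheet
    Set ws = ActiveWorkbook.Worksheets("Sheet1")

    Dim lastRow As Long
    lastRow = ws.Range("A" & ws.Rows.Count).End(xlUp).Row
    If lastRow < 3 Then Exit Sub    ' header plus at most one row: nothing to sort

    With ws.Range("A1:E" & lastRow)
        ' Key1 is the second column of the range, i.e. column B.
        .Sort Key1:=.Columns(2), Order1:=xlAscending, Header:=xlYes
    End With
End Sub
```

Calling SortByCompany before MoveData should make the "sorted by column B" assumption hold. Change the sheet name and column range to suit your workbook.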
When you record your macro, ensure that "Use Relative References" on the Developer Ribbon tab is enabled :)
Assuming row 3 has your data headers, you could try this:
Option Explicit
Sub main()
Dim nLists As Long, iList As Long
Dim data As Variant
Dim dataToDelete As Range
With Range("F3", Cells(Rows.Count, 1).End(xlUp))
data = .Value
nLists = WorksheetFunction.Max(.Resize(,1))
nLists = nLists \ 3 + IIf(nLists - 3 * (nLists \ 3) = 0, -1, 0)
End With
With Range("A3").Resize(, 6)
For iList = 0 To nLists
Set dataToDelete = Nothing
With .Offset(, iList * 6).Resize(UBound(data))
.Value = data
.AutoFilter Field:=1, Criteria1:="<=" & iList * 3, Criteria2:=">" & (iList + 1) * 3, Operator:=xlOr
If Application.WorksheetFunction.Subtotal(103, .Cells) > 1 Then Set dataToDelete = .Offset(1).Resize(.Rows.Count - 1).SpecialCells(xlCellTypeVisible)
.Parent.AutoFilterMode = False
If Not dataToDelete Is Nothing Then dataToDelete.Delete xlShiftUp
End With
Next
End With
End Sub
Your task is actually slightly trickier than your online advice suggests. Basically, you have to do the following:
Find out how many unique 'keys' (i.e. unique items in column B) you have. This tells you the total number of rows you need (i.e. number of unique keys * 3).
Count the number of items for each 'key'. This tells you how many columns you need (i.e. max item count / 3, rounded up, * number of columns in array [A:E = 5], plus one blank column between each block).
Loop through each line of data and put it on the appropriate row for that 'key'. Once three have been placed, jump the column for that key 6 columns to the right, and continue.
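The sizing arithmetic in the steps above can be sketched as two small functions. The names OutputRowCount and OutputColCount are hypothetical, and the single blank spacer column between blocks is an assumption taken from the "6 columns to the right" jump for 5 data columns.

```vba
Option Explicit

' Hypothetical helpers that illustrate the sizing arithmetic only.

' Rows needed: three rows for every unique key.
Public Function OutputRowCount(ByVal uniqueKeys As Long) As Long
    OutputRowCount = uniqueKeys * 3
End Function

' Columns needed: one block of dataCols for every group of three items,
' plus one blank spacer column between adjacent blocks (an assumption).
Public Function OutputColCount(ByVal maxItemCount As Long, ByVal dataCols As Long) As Long
    Dim blocks As Long
    blocks = (maxItemCount + 2) \ 3         ' integer ceiling of maxItemCount / 3
    If blocks = 0 Then
        OutputColCount = 0                  ' no items, no columns
    Else
        OutputColCount = blocks * dataCols + (blocks - 1)
    End If
End Function
```

For example, 2 companies where the busiest one has 7 entries and the data is 5 columns wide would need 2 * 3 = 6 rows and 3 blocks, so 3 * 5 + 2 = 17 columns.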
If you were to use a Class object and Collection type of object, this could be really quite concise code, but judging by your post you are at the beginning of your programming journey in VBA. Therefore, I've broken down each task into separate chunks of code so you will hopefully see how arrays can work for you. Once you practise with arrays a little, perhaps you could have a go at making this code more efficient by combining some of the loops:
Public Sub RunMe()
Dim data As Variant
Dim r As Long, c As Long, i As Long, dataRows As Long, dataCols As Long, keyLen As Long, maxCount As Long
Dim keys As String
Dim k As Variant
Dim keyArray() As String
Dim keyCount() As Long, threeCount() As Long, rowNum() As Long, colNum() As Long
Dim output() As Variant
'Read the data - change "Sheet1" to your sheet name.
'Shows how to write range values into a variant to
'create an array of variants.
data = ThisWorkbook.Worksheets("Sheet1").UsedRange.Value2
dataRows = UBound(data, 1)
dataCols = UBound(data, 2)
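'Create a list of unique keys.
'Note: not the most efficient way, but shows how to
'create an array from a value-separated string.
'The "|" added on both sides stops one key matching inside
'another (e.g. "Repair" inside "TH Repair").
For r = 1 To dataRows
If InStr("|" & keys & "|", "|" & CStr(data(r, 2)) & "|") = 0 Then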
If Len(keys) > 0 Then keys = keys & "|"
keys = keys & CStr(data(r, 2))
End If
Next
keyArray = Split(keys, "|")
keyLen = UBound(keyArray)
'Initialise the row and column numbers for each key.
'Shows how to iterate an array using For Each loop.
ReDim rowNum(keyLen)
ReDim colNum(keyLen)
r = 1
i = 0
For Each k In keyArray
rowNum(i) = r
colNum(i) = 1
r = r + 3
i = i + 1
Next
'Count the number of items for each key.
'Shows how to iterate an array using For [index] loop.
ReDim keyCount(keyLen)
For r = 1 To dataRows
i = IndexOfKey(keyArray, CStr(data(r, 2)))
keyCount(i) = keyCount(i) + 1
If keyCount(i) > maxCount Then maxCount = keyCount(i)
Next
'Size the output array.
c = WorksheetFunction.Ceiling(maxCount / 3, 1)
ReDim output(1 To (keyLen + 1) * 3, 1 To c * dataCols + c - 1)
'Populate the output array.
ReDim threeCount(keyLen)
For r = 1 To dataRows
i = IndexOfKey(keyArray, CStr(data(r, 2)))
'Copy the columns for this row.
For c = 1 To dataCols
output(rowNum(i), colNum(i) + c - 1) = data(r, c)
Next
'Increment the count and, if it equals 3,
'reset the row num and increase the column number.
threeCount(i) = threeCount(i) + 1
rowNum(i) = rowNum(i) + 1
If threeCount(i) = 3 Then
rowNum(i) = rowNum(i) - 3
colNum(i) = colNum(i) + dataCols + 1
threeCount(i) = 0
End If
Next
'Write the data - change "Sheet2" to your sheet name.
'Shows how to write an array to a Range.
ThisWorkbook.Worksheets("Sheet2").Range("A3") _
.Resize(UBound(output, 1), UBound(output, 2)).Value = output
End Sub
Private Function IndexOfKey(list() As String, key As String) As Long
Dim i As Long
Dim k As Variant
'Helper function to find index position of key in array.
For Each k In list
If key = k Then
IndexOfKey = i
Exit Function
End If
i = i + 1
Next
IndexOfKey = -1
End Function
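As a possible next step once the array version above makes sense, the key lookup can be done with a dictionary instead of searching an array on every row. This is a sketch under stated assumptions, not the Class/Collection design mentioned earlier: it swaps in a late-bound Scripting.Dictionary (Windows only), the function name GroupInThrees is hypothetical, and the input is assumed to be a non-empty 1-based 2-D array such as Range.Value returns.

```vba
Option Explicit

' Hypothetical sketch: regroup rows into blocks of three per key.
' data   - 1-based 2-D array, e.g. from Range.Value (assumed non-empty)
' keyCol - column of data that holds the key (2 for column B)
Public Function GroupInThrees(data As Variant, ByVal keyCol As Long) As Variant
    Dim keyIndex As Object, counts As Object, placed As Object
    Set keyIndex = CreateObject("Scripting.Dictionary") ' key -> 0-based order of first appearance
    Set counts = CreateObject("Scripting.Dictionary")   ' key -> total items for that key
    Set placed = CreateObject("Scripting.Dictionary")   ' key -> items already written

    Dim r As Long, c As Long, k As String
    Dim dataCols As Long, maxCount As Long
    dataCols = UBound(data, 2)

    ' Pass 1: register each key once and count its items.
    For r = 1 To UBound(data, 1)
        k = CStr(data(r, keyCol))
        If Not keyIndex.Exists(k) Then
            keyIndex.Add k, keyIndex.Count
            counts.Add k, 0
            placed.Add k, 0
        End If
        counts(k) = counts(k) + 1
        If counts(k) > maxCount Then maxCount = counts(k)
    Next r

    ' Size the output: 3 rows per key, one block per group of three items,
    ' one blank spacer column between blocks.
    Dim blocks As Long
    blocks = (maxCount + 2) \ 3
    Dim output() As Variant
    ReDim output(1 To keyIndex.Count * 3, 1 To blocks * dataCols + blocks - 1)

    ' Pass 2: place each row in its key's band of three rows.
    Dim n As Long, outRow As Long, outCol As Long
    For r = 1 To UBound(data, 1)
        k = CStr(data(r, keyCol))
        n = placed(k)                                   ' items already written for this key
        outRow = keyIndex(k) * 3 + (n Mod 3) + 1
        outCol = (n \ 3) * (dataCols + 1)               ' column offset of the current block
        For c = 1 To dataCols
            output(outRow, outCol + c) = data(r, c)
        Next c
        placed(k) = n + 1
    Next r

    GroupInThrees = output
End Function
```

The result could then be written to a sheet the same way as in RunMe, for example by assigning it to Range("A3").Resize(UBound(result, 1), UBound(result, 2)).Value, where result holds the returned array.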

Find multiple values, concatenate corresponding values in other column, write to cell

Problem:
Nothing is being written into cells in column P. The line Cells(x, "P").Value = failingClasses should do this.
Description: (VBA script below)
I've got a column with ID numbers. There can be multiple rows with each ID number. What I need to do is concatenate all the corresponding values in another column and write this into a cell in the original row. This needs to be done for each row in the sheet.
Field 1 is where the IDs are, field 6 is where the information I want to concatenate is, I'm trying to write the concatenation into column P.
Right now, I think that the computation is being done correctly, but for whatever reason it isn't writing to the cell in P.
The macro also takes forever to run; the sheet has between 1k and 2k rows.
Thanks!
Worksheets("RAW GRADE DATA").Select
' Turn off auto calc update and screen update -- saves speed
Application.Calculation = xlCalculationManual
Application.ScreenUpdating = False
Dim x As Long, y As Long, totalGradeEntries As Long, failingClasses As String, failingClassesCell As Excel.Range
totalGradeEntries = Cells(Rows.Count, 1).End(xlUp).Row
For x = totalGradeEntries To 1 Step -1
failingClasses = ""
For y = totalGradeEntries To 1 Step -1
If Cells(y, 1).Value = Cells(x, 1).Value And Cells(x, 6) <> "02HR" Then
failingClasses = failingClasses & " " & Cells(y, 1).Value
End If
Cells(x, "P").Value = failingClasses
Next y
Next x
' Turn calc and screen update back on
Application.Calculation = xlCalculationAutomatic
Application.ScreenUpdating = True
I got the bones of a solution to this working, thanks to Ron Rosenfeld. Here is the code, working on a test sheet with 3 columns of data, the unique IDs being in column 1.
Sub CalcArrary()
'Declare variables
Dim numRows As Long, calcArray() As Variant
Dim i As Long, b As Long, c As Long, d As Long
'Set the number of rows in the sheet
numRows = ActiveSheet.Range("A1").Offset(Sheet1.Rows.Count - 1, 0).End(xlUp).Row
ReDim calcArray(numRows - 1, 4)
For i = 0 To numRows - 2
calcArray(i, 1) = Range("A" & i + 2)
calcArray(i, 2) = Range("B" & i + 2)
calcArray(i, 3) = Range("C" & i + 2)
Next i
For b = 0 To numRows - 2
For c = 0 To numRows - 2
If calcArray(c, 1) = calcArray(b, 1) And calcArray(c, 3) < 60 Then
calcArray(b, 4) = calcArray(b, 4) & calcArray(c, 2) & ", " & calcArray(c, 3) & "% "
End If
Next c
Next b
For d = 0 To numRows - 2
ActiveSheet.Range("D" & d + 2) = calcArray(d, 4)
Next d
End Sub
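The nested loops above compare every row with every other row, which may get slow as the sheet grows. One possible alternative is to build the text for each ID once and then look it up per row. The sketch below is not the original author's method: it swaps in a late-bound Scripting.Dictionary (Windows only), the name FailingSummaries is hypothetical, and it assumes a 1-based 2-D array with ID, class name and a numeric score in columns 1 to 3.

```vba
Option Explicit

' Hypothetical sketch: one summary string per ID, built in a single pass.
' data - 1-based 2-D array: column 1 = ID, 2 = class name, 3 = numeric score.
' Returns an (n x 1) array holding the summary for each input row.
Public Function FailingSummaries(data As Variant) As Variant
    Dim byId As Object
    Set byId = CreateObject("Scripting.Dictionary")     ' ID -> concatenated text

    Dim r As Long, idKey As String

    ' Pass 1: append "class, score% " to the entry for the row's ID
    ' whenever the score is below 60.
    For r = 1 To UBound(data, 1)
        If data(r, 3) < 60 Then
            idKey = CStr(data(r, 1))
            If Not byId.Exists(idKey) Then byId.Add idKey, ""
            byId(idKey) = byId(idKey) & data(r, 2) & ", " & data(r, 3) & "% "
        End If
    Next r

    ' Pass 2: every row receives the full summary for its ID.
    Dim result() As Variant
    ReDim result(1 To UBound(data, 1), 1 To 1)
    For r = 1 To UBound(data, 1)
        idKey = CStr(data(r, 1))
        If byId.Exists(idKey) Then
            result(r, 1) = byId(idKey)
        Else
            result(r, 1) = ""
        End If
    Next r

    FailingSummaries = result
End Function
```

With the data in A2:C(last row), the returned array could be written back in one assignment to a range of the same height starting at D2. Note that Range.Value returns a single value rather than an array when the range is one cell, so this assumes at least two data rows.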

Resources