I have a function that is used to find the information in a Excel worksheet knowing that:
- The Key can be in a variable column
- Variable fields can be searched
Sheets usually have less than a hundred column, but can have anything from a few hundred to 100 000 rows to search. In our biggest files, the function I'm trying to optimize can be used about a million times.
After reading
https://fastexcel.wordpress.com/2011/10/26/match-vs-find-vs-variant-array-vba-performance-shootout/
... and finding our function used Find (3 times), I tried using arrays.
This is the code I wrote
Function getInfo(Key As String, NameField As String, NameKey As String, WksName As String) As Variant
On Error GoTo Error
Dim iColumnKEY As Integer
Dim iColumnFIELD As Integer
Dim i As Integer
Dim ListFields, ListKeys As Variant
ListFields = Worksheets(WksName).Range("A1:ZZ1")
i = LBound(ListFields, 2)
'To identify which column contains the Key and which one contains the
'information we are searching for
Do While iColumnKEY=0 Or iColumnFIELD=0
If i > UBound(ListFields, 2) Then
getInfo = "//error\\"
ElseIf ListFields(1, i) = NameKey Then
iColumnKEY = i
ElseIf ListFields(1, i) = NameField Then
iColumnFIELD = i
End If
i = i + 1
Loop
Dim iROW As Integer
ListKeys = Worksheets(WksName).Columns(iColumnFIELD)
i = LBound(ListKeys, 1)
Do While iROW=0
If i > UBound(ListKeys,1) Then
getInfo = "//error\\"
ElseIf ListKeys(i,1) = Key Then
iROW = i
End If
i = i + 1
Loop
getInfo = Worksheets(WksName).Cells(iROW, iColumnFIELD)
Exit Function
Error:
getInfo = "//error\\"
End Function
The code works, but is very slow. What am I doing that is slowing things down?
It is not in the code right now, but I did try turning the screen update down, as well as automatic calculation down. I didn't see any difference in speed, which indicates me that the basic algorithm is the main issue.
Also, the article was in 2011. Are arrays still a lot faster than Match/Find?
As a side note: eventually, I'll suggest having a macro that search for a range of Keys in a batch, instead of calling the function for every single key. This would means the first Do... While loop would be done only once for a macro, and only the Do_While for Rows would be run for every key. However, this is not an option in the very short term.
Thanks. Any help or advice would be greatly appreciated.
To make sure I understood you correctly, you have a sheet that has a random column that contains unique keys.
you want to search for one of these keys and return related info (like row no, etc) many times
Approach:
Find the column in which the keys are listed.
Load that column in a dictionary(Indexed).
Use GetInfo function to return info about a specific key if it exists.
Dependencies:
Microsoft scripting runtime (Tools > refrences > Microsoft scripting runtime)
code:
Option Explicit
Private KeyDictionary As Scripting.Dictionary
Sub PopulateDictionary(ByRef WS As Worksheet, ByVal FieldName As Variant)
Dim i As Long, LastRow As Long, iColumnFIELD As Long
Dim ListKeys As Variant
iColumnFIELD = WS.Range("A1:ZZ1").Find(FieldName).Column
With WS 'Finds the last row in the sheet
If Application.WorksheetFunction.CountA(.Cells) <> 0 Then
LastRow = .Cells.Find(What:="*", _
After:=.Range("A1"), _
LookAt:=xlPart, _
LookIn:=xlFormulas, _
SearchOrder:=xlByRows, _
SearchDirection:=xlPrevious, _
MatchCase:=False).Row
End If
Set KeyDictionary = New Scripting.Dictionary
For i = 1 To LastRow 'populates Dictionary with Key, Row number pair
If Not (KeyDictionary.Exists(.Cells(i, iColumnFIELD))) Then 'Make sure the key doesn't already exist(Key values should be unique)
KeyDictionary.Add .Cells(i, iColumnFIELD).Value, .Cells(i, iColumnFIELD).Row
End If
Next i
End With
End Sub
Function getInfo(ByVal key) As Variant
If KeyDictionary.Exists(key) Then
getInfo = KeyDictionary(key) 'if the key exist return row number (or whatever you want to)
Else
getInfo = "Null" 'Else return whatever you want like a msgbox "not Found" , etc
End If
End Function
usage
'populates and returns the row number of key 9500
Sub TestSearch()
PopulateDictionary ThisWorkbook.Worksheets("Sheet1"), "Key"
Debug.Print getInfo(9500)
End Sub
Notes:
-almost always Use long instead of integer , not much performance difference, but can save you from overflow pitfalls.
-you can add a reference to the range containing the key instead of the row number that would more flexible
-Passing a sheet by reference (Full Ref) is better than passing just its name and avoids a lot of possible problems like the case of multiple workbooks with the same sheet name and makes your code more reusable.
References:
Dictionary object
Edit:
I misunderstood your request , thought you wanted to know the best method available.
here's a performance comparison of the four methods:
it takes 1325 ms (Or 1.3 seconds) to populate the Dictionary with the unique key list the first time (100,000 Row)
it takes 1.79646327708265E-02 ms aka 0.02 ms To search for an item at the end of list (row 99863) using the dictionary object
it takes around 10.5 ms To search for the same item with WorksheetFunction.Match
it takes around 50 ms To search for the same item with the array method
it takes around 80 ms To search for the same item with the Range.find
Result:
Dictionary Method is faster than match -the second-best method of the Four- by over 500 Times!
The reason is that the keys are indexed inside the dictionary, unlike the other methods.
notes:
Office 2016 was used on a 2 cores (3.20 GHz) machine with 8 gigs or ram (Dictionary took about extra 8 Megabytes ram)
All of these searches were done on the same data set
(the search was done on only 1 column with 100,000 unique keys ,with the searched for value at the bottom of the list)
The break-even point on whether you should use Match or Dictionary is around 120 searches.
if the code will search for more than 120 values then it's better to use the dictionary approach.
Windows API QueryPerformanceCounter was used for High resolution timer.
Code line used to search for Values(not gonna put the full sub)
'Match
WorksheetFunction.Match(Key, ThisWorkbook.Worksheets(1).Range("CW:CW"), 0)
'Find
ThisWorkbook.Worksheets(1).Range("CW:CW").Find(Key).Row
'Array
'Loops through the column till it finds a match
In your code you never use iColumnKEY
I think this is what you are actually after:
Function getInfo(key As String, NameField As String, NameKey As String, WksName As String) As Variant
Dim keyCol As Variant, fieldCol As Variant, keyRow As Variant
Dim errMsg As String
getInfo = "//error\\"
With Worksheets(WksName)
With Intersect(.UsedRange, .Columns("A:ZZ")) ' <--| reference a range in passed worksheet cells belonging to columns "A" to "ZZ" from worksheet first used row to last used one and from worksheet first used column to last used one
MsgBox .Address
fieldCol = Application.Match(NameField, .Rows(1), 0) '<--| look for passed 'NameField' in referenced range
If IsError(fieldCol) Then
errMsg = " :field column '" & NameField & "' not found"
Else
keyCol = Application.Match(NameKey, .Rows(1), 0) '<--| look for passed 'NameKey' in referenced range
If IsError(keyCol) Then
errMsg = " :key column '" & NameKey & "' not found"
Else
MsgBox .Columns(keyCol).Address
keyRow = Application.Match(key, .Columns(keyCol)) '<--| look for passed 'key' in referenced range 'NameKey' column
If IsError(keyRow) Then
errMsg = " :key '" & key & "' not found in column '" & NameKey & "'"
Else
getInfo = .Cells(keyRow, fieldCol) '<--| get referenced range "item"
End If
End If
End If
If errMsg <> "" Then getInfo = getInfo & errMsg
End With
End With
End Function
I see that in your loop you are doing a UBound() evaluation every time. This is not needed.
The following should be faster than a Do While loop. Notice that the array returned by Range().Value has always a lower bound of one. No need to call LBound()
Also, find where the last data exists in the column and restrict the loop to that range. I do this with .End(xlUp)
Dim ListKeys() as Variant
Dim iROW As Long, nRows as Long
nRows = Worksheets(WksName).Cells(Worksheets(WksName).Rows.Count, iColumnFIELD).End(xlUp).Row
ListKeys = Worksheets(WksName).Cell(1, iColumnFIELD).Resize(nRows,1).Value
For i=1 To nRows
If ListKeys(i,1) = Key Then
iROW = i
Exit For
End If
Next i
not an answer but a radically different approach, since im from data-science background i use these methods for fast searching any data in a database which are few GB in size, in your case excel for example. this approach can be parallelized based on number of CPUs in your system. and uses python framework Pandas, which has very big community incase you need support, VB has limited community.
also read this before judging this answer https://corporatefinanceinstitute.com/resources/knowledge/other/transitioning-from-excel-to-python/
i expect criticism for this answer , OP asked this but you are giving this blah. but if you want faster development times for ever changing business needs you need something fast, and easy to maintain. python makes it easy, pandas makes it fast.
to get started read this.https://towardsdatascience.com/read-excel-files-with-python-1000x-faster-407d07ad0ed8
i will mention the pipeline here however. see very few lines of code!!! finish work faster, go home early.
import the excel file as csv
import pandas as pd
dataframe=pd.read_excel("file.xlsx")
item=dataframe[dataframe["Order ID"]==886714971] #condition based searching in excel
note "Order ID" is just any arbitary column and you can use SQL like logic here which resembles match/find in VBA.
for speed reference iterating 1,000,000 rows took 0.03 seconds, which means a transaction speed of 30 TPS. use https://modin.readthedocs.io/en/latest/ to scale up that speed linearly with number of cores in cpu.
To find out what parts of the code are the slowest, you can use Timer:
Dim t as Single
t = Timer
' part of the code
Debug.Print CDbl(Timer - t) ' CDbl to avoid scientific notation
Using .Value2 instead of .Value should help a bit:
ListFields = Worksheets(WksName).Range("A1:ZZ1").Value2
Searching for the key and field in two separate loops should be a bit faster because there will be less comparisons. Also, I am not sure if it will be a bit slower or faster, but you can iterate even multi-dimensional arrays:
Dim i As Long, v ' As Variant
i = 1
For Each v in ListFields
If v = NameKey Then
iColumnKEY = i
Exit For
End If
i = i + 1
Next
Related
I'm trying to make a worksheet of dynamically generated checkboxes, so I made the below sub to re-name them so I would be able to refer to them later (since I get what the checkboxes are representing from another sub and can't hardcode there names in). It works fine, however it seems to be the cause of a strange error I'm getting. After I run the main sub in my macro I get a compile error whenever I try to run any macro's. The exact error is:
Compile error:
Object library invalid or contains references to object definitions that could not be found
This is an error that a bunch of other people have got but none seem to be in the same circumstances and none of the suggested fixes have worked for me.
In the actual macro reNameCheckBox1 (the class that seems to be causing the issue) is ran numerous times but when ran only once no problem occurs.
The problem class:
Sub reNameCheckBox1(newName As String, mainWB As Workbook)
Dim obj As OLEObject
Dim i As Integer
i = 1
With mainWB.Worksheets("Sheet1")
For Each obj In .OLEObjects
Debug.Print obj.name
If TypeName(obj.Object) = "CheckBox" And Left(obj.name, 8) = "CheckBox" Then
Debug.Print "Found it"
obj.name = newName
mainWB.Worksheets("Sheet1").OLEObjects(newName).Object.Caption = newName
Exit For
End If
Next obj
i = i + 1
End With
End Sub
How its called used when its causing problems:
For n = 1 To numberOfNames
mainWB.Worksheets("Sheet1").OLEObjects.Add(ClassType:="Forms.CheckBox.1", Link:=False, _
DisplayAsIcon:=False, Left:=48, Top:=checkBoxTop, Width:=96, Height:=30). _
Select
Call reNameCheckBox1("Name=" & arrayOfNames(n), mainWB)
checkBoxTop = checkBoxTop + 45
Next n
If anyone knows why this is causing problems or a better way to do what I'm trying to do it would be MUCH appreciated.
EDIT
As per comintern's suggestion, how the arrayOfNames is inilialized is shown below. Just a few things to note. 1) searchForParameters returns two lists, which i'll ill call arrayOfNames and otherNames, and in an attempt to return them both I stored them in one comfusingly organized array. 2) I didn't know LBound or UBound existed when I made this (i'm new to vba) and 3) results is organized as such,
-At zero the length of the entire results array is stored
-At one the length of arrayOfNames
-at the beginning of OtherNames its length is stored
So, taking that confusing mess and turning into arrayOfNames and otherNames looks like this.
Dim results As Variant
Dim arrayOfNames As String
Dim otherNames As String
Dim length as Integer
Dim arrayOfNamesLength as Integer
Dim otherNamesLength As Integer
Dim n, i As Integer
results = searchForParameters(currentWB) 'Current wb is the name of the wb its searching, which isnt the one the code's being ran out of
length = results(0)
arrayOfNamesLength = results(1)
otherNamesLength = results (arrayOfNamesLength + 2)
ReDim arrayOfNames(arrayOfNamesLength + 1)
ReDim otherNames(otherNamesLength)
For i = 0 To arrayOfNamesLength
arrayOfNames(i) = results(i + 1)
Next i
For n = 0 To otherNamesLength
OtherNames(n) = results(i + n + 1)
Next n
EDIT
As it turns out, the compile error this question was created to solve was not caused by any of the code here, but by a separate class. So if your having the same error I'm having, the solution won't be here. Since it has some really good answers on fixing the code I did have, I guess I'll just leave this here and hope someone finds it useful.
Cheers
-Ben J. Man
It's unclear whether your arrayOfNames has a one-based or zero-based index. I suspect the former. Use the LBound
UBound functions to define the scope of your array elements.
I am also suspect of the quoted 'named parameter'. You can use named parameters but not in that manner.
For n = LBound(arrayOfNames) To UBound(arrayOfNames)
mainWB.Worksheets("Sheet1").OLEObjects.Add(ClassType:="Forms.CheckBox.1", Link:=False, _
DisplayAsIcon:=False, Left:=48, Top:=checkBoxTop, Width:=96, Height:=30). _
Select
reNameCheckBox1 newName:=CStr(arrayOfNames(n)), mainWB:=mainWB
checkBoxTop = checkBoxTop + 45
Next n
I'm not sure that the Workbook Object Application.Selection property needs to be used like that. I recommend simply referring to the Workbook object as a parameter.
You might get a problem passing a variant element of a variant array as a string type parameter so I've cast it as a string.
Call isn't necessary and some people think it is antiquated. Use it if you like; it does no harm.
It isn't exactly clear why you are selecting the CheckBox returned from OLEObjects.Add and then immediately searching for it in reNameCheckBox1. Just grab a reference and rename it in your loop:
Dim cb As OLEObject
For n = LBound(arrayOfNames) To UBound(arrayOfNames)
Set cb = mainWB.Worksheets("Sheet1").OLEObjects.Add( _
ClassType:="Forms.CheckBox.1", _
Link:=False, _
DisplayAsIcon:=False, _
Left:=48, _
Top:=checkBoxTop, _
Width:=96, _
Height:=30)
cb.Name = arrayOfNames(n)
cb.Object.Caption = arrayOfNames(n)
checkBoxTop = checkBoxTop + 45
Next n
I have a table of monthly sales figures - FreqData1. Each column represents a month and is numbered 1 to 12. The user chooses one of these numbers from a dropdown list.
I have the code to find the column number and I have tried to assign the data in that column to an array so I can use it copy it to a different spreadsheet but with my basic VBA knowledge and despite lots of searching I have been unable to find the code as to how to do this or a different method to carry this out.
Can anyone help please
Sub AnnualFreqMacro()
Dim TPNoInt As Long, BranchNoInt As Long, ColNo As Long
Dim FreqArray()
Worksheets("Freq data").Activate
TPNoInt = Range("B42").Value
BranchNoInt = Range("B41").Value
ColNo = Application.Match(TPNoInt, Range("TPBr1"), 0)
CharaArray = Range("FreqData1").Cells (1, ColNo), Cells(16, ColNo))
End Sub
Many thanks in advance
I think this is your answer: It's how you're using the range.
Delete your CharArray = ... line and replace with:
With Range("FreqData1")
CharaArray = .Range(.Cells(1, ColNo), .Cells(16, ColNo))
End With
The issue is how you're setting the range, Range().Cells(), Cells() isn't the context, you'd want something more like Range(Cells(),Cells()).
Let's say "FreqData1" is the range A10:A20. If you use
With Range("FreqData1")
.Cells(1,1).Select
End With
this will select the top left most cell (row 1, column 1) in the range "FreqData", so cell A10 would be selected.
A final misc. point: Avoid using .Select/.Activate. You can activate a sheet of course so you can follow your macro, but when setting variables to ranges/cell values, etc. it's best to qualify which sheet you are referring to.
Sub AnnualFreqMacro()
Dim TPNoInt As Long, BranchNoInt As Long, ColNo As Long
Dim FreqArray()
Dim freqWS As Worksheet
Set freqWS = Worksheets("Freq data")
' Worksheets("Freq data").Activate ' removed this, since we have a variable for it now.
TPNoInt = freqWS.Range("B42").Value ' see how I added the worksheet that we want to get B42's value from?
BranchNoInt = freqWS.Range("B41").Value
ColNo = Application.Match(TPNoInt, Range("TPBr1"), 0)
With freqWS.Range("FreqData1") ' I'm assuming "FreqData1" is on this worksheet
CharaArray = .Range(.Cells(1, ColNo), .Cells(16, ColNo))
End With
End Sub
I'm not positive if you have to qualify a named range's sheet, since it's a named range, but I added that just to be safe.
Edit2: Hm, oddly enough, if your named range "myRange" is A1:A10, you can still do myRange.Range(myRange.cells(1,1),myRange.cells(1,2)), even though there's no second column in the range, it just expands it. I thought it'd throw an error, but nope. Just to note.
I have a spreadsheet of data that I want to put into a VBA array which then outputs unique values to a new sheet. I have got that to work so far. However, some of the cells in the original data have text separated by commas, and I want to add those to the array as well. I can't quite get that bit to work.
After the various 'dims', my code is
'Grabs the data to work with
Set rTable = Worksheets("Data Entry").Range("N1:N100", "P1:P100")
'Puts it into an array
MyArray = rTable.Value
'Sets where the data will end up
Set rCell = Worksheets("TestSheet").Range("A1:A100")
'Each unique entry gets added to the new array
On Error Resume Next
For Each a In MyArray
UnqArray.Add a, a
Next
'Add unique data to new location
For i = 1 To UnqArray.Count
rCell(i, 1) = UnqArray(i)
Next
I have tried doing a new variant to store the split data
SpArray = split(MyArray,", ")
and then have that here
MyArray = rTable.Value
SpArray = split(MyArray,", ")
and then refer to SpArray for the rest of the code
I've also tried to have as part of
For Each a in SpArray
but it doesn't work for me.
Do I need to do a separate loop on each cell of the array before I filter out the unique ones?
Yes, you need another loop. But if you set a reference to Microsoft Scripting Runtime and use a Dictionary object, you can eliminate the loop that writes to the range because Dictionary.Keys returns an array.
In this example, it attempts to split every entry on a comma and treats each of those as a unique. If there is no comma, Split returns the one value so it works in both cases. There's probably a small cost to splitting things that don't need to be split, but you won't notice until your range is much larger. And it makes the code cleaner, I think.
Sub WriteUniques()
Dim dcUnique As Scripting.Dictionary
Dim vaData As Variant
Dim vaSplit As Variant
Dim i As Long, j As Long
vaData = Sheet1.Range("$I$12:$I$62").Value
Set dcUnique = New Scripting.Dictionary
For i = LBound(vaData, 1) To UBound(vaData, 1)
vaSplit = Split(vaData(i, 1), ",")
For j = LBound(vaSplit) To UBound(vaSplit)
If Not dcUnique.Exists(vaSplit(j)) Then
dcUnique.Add vaSplit(j), vaSplit(j)
End If
Next j
Next i
Sheet1.Range("J12").Resize(dcUnique.Count, 1).Value = Application.Transpose(dcUnique.Keys)
End Sub
The code tweak that worked for me was to put the Split at the end.
'Add unique data to new location
For i = 1 To UnqArray.Count
rCell(i, 1) = Split(UnqArray(i), ",")
Next
This then built up an array using data from different ranges and splitting up comma separated ones before outputting only the unique ones.
As for the problem, I need to be able to compare all data in Variant array A to all data in Variant array B. I know I need some kind of double loop (so that every A value is checked against all B values), but I can't figure out how to do it. Here's what I have so far:
Sub Button_Click()
Dim trgtRange As Variant
Dim tempRange As Variant
Set myRange = ThisWorkbook.Sheets(1).Range("L:L")
For Each cell In myRange
If IsEmpty(cell) Then
ActiveCell.Offset(-1, 0).Select
currentRow = ActiveCell.Row
Set trgtRange = Range("L2:L" & currentRow)
Exit For
End If
Next cell
Set tempRange = Range("A1:A" & currentRow - 1)
' Insert a double loop here
End Sub
So, trgtRange is the Variant A and tempRange is Variant B. I know I could have set the Variant B up a little easier, but I already did it that way. After all, code should be polished as last operation anyway.
You might be wondering why Variants A and B are completely the same. Well, that's because I need to compare them so that I can find values that are close to each other, (i.e 10000 and 12000) and I need to incorporate some kind of tolerance for it.
Here is my answer. Why do you need two loops to do this. Some relative addressing handles this issue quite nicely. Set up a spreadsheet like this for an example:
and your code is simply this
Sub Button_Click()
Dim dblTolerance As Double
Dim tmp As Range
'Get source range
Set tmp = ActiveSheet.Range("A2")
'Get tolerance from sheet or change this to an assignment to hard code it
dblTolerance = ActiveSheet.Range("D13")
'use the temporary variable to cycle through the first array
Do Until tmp.Value = ""
'Use absolute function to determine if you are within tolerance and if so put match in the column
'NOTE: Adjust the column offset (set to 4 here) to match whichever column you want result in
If Abs(tmp.Value - tmp.Offset(0, 2).Value) < dblTolerance Then
tmp.Offset(0, 4).Value = "Match"
Else
tmp.Offset(0, 4).Value = "No Match"
End If
'Go to the next row
Set tmp = tmp.Offset(1, 0)
Loop
'Clean up
Set tmp = Nothing
End Sub
The comments in the code explain how it works. This is superior to a double loop because relative referencing is faster, the memory use is more efficient and you only have to make one pass at each row.
If you are required for some reason to use a double loop let me know, but that is inferior performance wise to this methodology. Hope this helps.
I have a list of customers from last year (in column A) and I have a list of customers from this year (in Column B). I've put the data from these two columns in arrays (using the code below - which is set up as Option Base 1):
'Define our variables and array types'
Sub CustomerArray()
Dim LastArray() As String
Dim CurrentArray() As String
Dim BothArray() As String
Dim LR As Long
Dim i As Integer
'Define LastArray which is customers last year'
LR = Cells(Rows.Count, 1).End(xlUp).Row
ReDim LastArray(LR - 3)
With Range("A1")
For i = 1 To LR - 3
LastArray(i) = .Offset(i, 0)
Next i
End With
'Define CurrentArray which is customers this year'
ReDim CurrentArray(LR - 3)
With Range("B1")
For i = 1 To LR - 3
CurrentArray(i) = .Offset(i, 0)
Next i
End With
End Sub
Now I want to compare/combine the Arrays to show a list of customers who appear in both of the two arrays I just defined (last year and this year). I want to create a third array with the customers who appear for both years (and I want to put that in column D of my excel sheet). I'm getting confused on how to write the code which will compare these two arrays (current year and last year). Will I use a conditional If > statement? Each of the arrays have the customers listed alphabetically.
I appreicate any help you might be able to give me.
Thanks!
You don't need to mess with arrays or loop at all, keep it simple, try something like this:
Sub HTH()
With Range("A1", Cells(Rows.Count, "A").End(xlUp)).Offset(, 3)
.Formula = "=IF(COUNTIF(B:B,A1)>0,A1,"""")"
.Value = .Value
.SpecialCells(xlCellTypeBlanks).Delete
End With
End Sub
OK. I got a little carried away here, but this does what your are asking (you may have to tune it up to suit your specific needs. To use this code, simply call the Sub "Match Customers".
Your original code proposed the use of three arrays. Excel VBA provides some mechanisms to do what you seek which are both easier to use, and possibly more efficient.
I went ahead and broke the process out into more discrete chunks of code. While it seems like more code, you will find that each peice might make more sense, and it is much more maintainable. You can also now re-use the individual functions for other operations if needed.
I also pulled your range and column indexes out into locally defined constants. This way, if the various row or column references ever need to change, you only have to change the value in one place.
It is not necessarily the most efficient way to do this, but is most likely less complicated than using the arrays you originally propose.
I have not tested this exhaustively, but it works in the most basic sense. Let me know if you have questions.
Hope that helps . . .
Option Explicit
'Set your Column indexes as constants, and use the constants in your code.
'This will be much more maintainable in the long run:
Private Const LY_CUSTOMER_COLUMN As Integer = 1
Private Const CY_CUSTOMER_COLUMN As Integer = 2
Private Const MATCHED_CUSTOMER_COLUMN As Integer = 4
Private Const OUTPUT_TARGET As String = "D1"
Private Const LAST_ROW_OFFSET As Integer = -3
'A Function which returns the list of customers from last year
'as a Range object:
Function CustomersLastYear() As Range
Dim LastCell As Range
'Find the last cell in the column:
Set LastCell = Cells(Rows.Count, LY_CUSTOMER_COLUMN).End(xlUp)
'Return the range of cells containing last year's customers:
Set CustomersLastYear = Range(Cells(1, LY_CUSTOMER_COLUMN), LastCell)
End Function
'A Function which returns the list of customers from this year
'as a Range object:
Function CustomersThisYear() As Range
Dim LastCell As Range
'Find the last cell in the column:
Set LastCell = Cells(Rows.Count, CY_CUSTOMER_COLUMN).End(xlUp)
'Return the range of cells containing this year's customers:
Set CustomersThisYear = Range(Cells(1, CY_CUSTOMER_COLUMN), LastCell)
End Function
'A function which returns a range object representing the
'current list of matched customers (Mostly so you can clear it
'before re-populating it with a new set of matches):
Function CurrentMatchedCustomersRange() As Range
Dim LastCell As Range
'Find the last cell in the column:
Set LastCell = Cells(Rows.Count, MATCHED_CUSTOMER_COLUMN).End(xlUp)
'Return the range of cells containing currently matched customers:
Set CurrentMatchedCustomersRange = Range(Cells(1, MATCHED_CUSTOMER_COLUMN), LastCell)
End Function
'A Function which performs a comparison between two ranges
'and returns a Collection containing the matching cells:
Function MatchedCustomers(ByVal LastYearCustomers As Range, ByVal ThisYearCustomers As Range) As Collection
Dim output As Collection
'A variable to iterate over a collection of cell ranges:
Dim CustomerCell As Range
'Initialize the collection object:
Set output = New Collection
'Iterate over the collection of cells containing last year's customers:
For Each CustomerCell In LastYearCustomers.Cells
Dim MatchedCustomer As Range
'Set the variable to reference the current cell object:
Set MatchedCustomer = ThisYearCustomers.Find(CustomerCell.Text)
'Test for a Match:
If Not MatchedCustomer Is Nothing Then
'If found, add to the output collection:
output.Add MatchedCustomer
End If
'Kill the iterator variable for the next iteration:
Set MatchedCustomer = Nothing
Next
'Return a collection of the matches found:
Set MatchedCustomers = output
End Function
Sub MatchCustomers()
Dim LastYearCustomers As Range
Dim ThisYearCustomers As Range
Dim MatchedCustomers As Collection
Dim MatchedCustomer As Range
'Clear out the destination column using the local function:
Set MatchedCustomer = Me.CurrentMatchedCustomersRange
MatchedCustomer.Clear
Set MatchedCustomer = Nothing
'Use local functions to retrieve ranges:
Set LastYearCustomers = Me.CustomersLastYear
Set ThisYearCustomers = Me.CustomersThisYear
'Use local function to preform the matching operation and return a collection
'of cell ranges representing matched customers. Pass the ranges of last year and this year
'customers in as Arguments:
Set MatchedCustomers = Me.MatchedCustomers(LastYearCustomers, ThisYearCustomers)
Dim Destination As Range
'Use the local constant to set the initial output target cell:
Set Destination = Range(OUTPUT_TARGET)
'Itereate over the collection and paste the matches into the output cell:
For Each MatchedCustomer In MatchedCustomers
MatchedCustomer.Copy Destination
'Increment the output row index after each paste operation:
Set Destination = Destination.Offset(1)
Next
End Sub
If you want to compare the two arrays using loops, maybe because you have, for example, picked up all the data into arrays for faster computation rather than interacting with the spreadsheet range object, or you need to compare multiple things from the two arrays to check that the entries match so can't use a .find statement, then this is what you need:
-Two loops, one nested inside the other
-Three counters, one for each array
-One "Exit Loop", "Exit For", "GoTo foundmatch" or similar way of exiting the inner loop
-A "Redim Preserve" of the results array
-An "If" statement
-Finally, one line where you assign the name that appears in both arrays to the results array
This is everything that is needed to write it simply as loops - but doesn't give the fastest or best way to do it (Redim Preserve is not the best..). Constructing it should be easy from this list though: the if statement should either be an x=y type for a general usage, or if x>y if you are really really sure that the list being looped in the inner loop really is sorted alphabetically