At risk of being of topic, I decided to share some code, Q&A-style. If the general opinion is such that this would be off-topic I'll be happy to delete if need be.
Background
Having a Range object holding a certain amount of values I would want to pull those values into an array. The conventional way (at least to me) would be to use ""scripting.dictionary", loop through a Range, or rather an Array in memory, to add these values into a uniques list.
While that works, I wanted to see if there is a method to pull an array of unique items without any loop.
Sample
Imagine the following data in A1:A8:
Vals
A
B
A
B
C
C
B
Question
To retrieve a 1D-array of unique items {A,B,C}, how would we go about doing this without a loop?
Uniques - Dictionary
A very solid (and fast) way of returning a 1D-array of unique values would be to use a conventional Dictionary object as below:
Sub UniquesDictionary()
Dim lr As Long, x As Long
Dim arr As Variant
Dim dict As Object: Set dict = CreateObject("Scripting.Dictionary")
With Sheet1
'Find the last used row
lr = .Cells(.Rows.Count, 1).End(xlUp).Row
arr = .Range("A2:A" & lr).Value
End With
'Loop through memory and fill dictionary
For x = LBound(arr) To UBound(arr)
dict(arr(x, 1)) = 1
Next x
'Pull unique items into a 1D-array
arr = dict.Keys
End Sub
Uniques - Evaluate
Whilst the above works. The wish was to avoid any loop. The way to do this is to use .Evaluate, see below:
Sub UniquesEvaluate()
Dim lr As Long
Dim arr As Variant
With Sheet1
'Find the last used row
lr = .Cells(.Rows.Count, 1).End(xlUp).Row
'Get array of unique values
arr = Filter(.Evaluate("TRANSPOSE(If(Row(A2:A" & lr & ")<>MATCH(A2:A" & lr & ",A2:A" & lr & ",0)+1,""|"",A2:A" & lr & "))"), "|", False)
End With
End Sub
It might seem like a long formula but it really isn't that impressive and boils down to:
=IF(ROW(A2:A8)<>MATCH(A2:A8,A2:A8,0)+1,"|",A2:A8)
The TRANSPOSE is only there to return a 1D-array for FILTER to work with.
The inital formula will only return those values that are sitting on the rows where they are first encountered through MATCH, otherwise it would return a pipe-symbol.
Thus TRANSPOSE(<formula>) returns a 1D-array, e.g.: {A,B,|,|,C,|,|}
Filter then uses this 1D-array as an input, returning an array filtering out those pipe-symbols using FALSE in the "include" parameter, e.g: Filter(<arr>,"|",FALSE) > {A,B,C}.
Comparison
This would only have real purpose if this method would be equally as fast as the more conventional Dictionary so I did a small comparison. As far as my testing goes, there was no really noticable timing difference (around 0 seconds), but since in essence the Evaluate is a CSE formula, larger datasets will get noticable timing differences above let's say 2000 rows of data.
Hopefully this is usefull to those working with smaller datasets.
Related
I am trying to avoid the use of loops for populating arrays since they take a lot of time when managing a lot of data.
Apparently as well, that is possible and easy in VBA but often results in problems.
Here is the code:
sub populate()
'put the whole column in an array
Dim AppArray() As Variant
Dim AppRange As Range
'calculate the last row of the column 1 of sheets
Dim LstRow As Integer
LstRow = Sheets("whole").Cells(Sheets("whole").Rows.Count, "A").End(xlUp).row
'here I calculate the range that I want to pass to the array
Set AppRange = Sheets("whole").Range(Cells(1, 1), Cells(LstRow, 1))
MsgBox ("apprange " & AppRange.Address)
'i dont know if I need to redim or not
ReDim AppArray(1 To LstRow)
'here comes the point. populate the array with the values of the range
AppArray = AppRange.Value
End Sub
This does not work. I also tried application.tranpose(AppRange.Value).
I used:
For i = 1 To LstRow
Debug.Print AppArray(i)
Next
and an error appears, so somehow there is no AppArray(1).
I would be very happy if you can comment on that. More than just arranging the code suggest even other pages (links) to populate arrays with values of ranges when these ranges are not known in advance.
If the case is that looping is very time consuming and that arrays can be populated straight away, I don't understand why 99% of the pages referring to arrays use a loop (or nested loop) to populate an array.
I found the answer.
dim myRange as range
dim myArray() as variant
myRange = range(cells(2,3),cells(10,15))
redeem myArray(1 to 20,1 to 20)
myArray=myRange
It's always much faster to work with variables and arrays than with cells values.
I'm trying to create a program that optimizes histogram bin width...before that though, I need help with a seemingly simple task - setting a Range mentioned in a cell into a VBA array.
I'd like the data for the histogram to be on any sheet, in this case 'Data'!B4:M12. This is mentioned in D4 of the sheet I want the histogram to appear on. I keep getting an error when I run my code though, even after changing it multiple times. There's clearly a syntax error that I don't know how to handle.
Any assistance would be much appreciated!
Sub Histogram_Shimazaki_Shinomoto()
Dim data_range As String, min_bins As Integer, max_bins As Integer
Dim Data()
Dim x_min As Double, x_max As Double
data_range = Cells(4, 4) ' data range
min_bins = Cells(5, 4) ' min # of bins
max_bins = Cells(6, 4) ' max # of bins
Set Data = Range(data_range)
x_min = WorksheetFunction.Min(Data)
MsgBox x_min
End Sub
You are trying to set an array of variant equal to a Range -- but that isn't possible since you can't assign to an array. You could assign a range to a simple Variant (or to a Range variable). You could change
Dim Data()
to
Dim Data as Variant
Note the absence of parenthesis. Also, as a stylistic point I think that it is good to be explicit about the type, even though Variant is the default.
This might be enough for your code to work, although if all you want is the minimum value in the range, you could change
Set Data = Range(data_range)
to
Data = Range(data_range).Value
If the sheet "Data" isn't the active sheet and data_range contains "B4:M12" then you would need to use
Data = Sheets("Data").Range(data_range).Value
since Range returns a range on the active sheet unless explicitly qualified by a reference to the sheet.
I am trying to find the fastest way to perform a task in VBA. Currently I have it written as a nested for loop which can be extremely slow. I am looping over a list of unique numbers and matching them to numbers in a different list. If I get a match I store the information in a multidimensional array since there can be multiple matches and I want to keep track of all of them. Unfortunetly, this means when using a for loop if there are just 1000 unique numbers and 5000 numbers to look for matches my loop can end up iterating 1000*5000 = 5000000 times. As you see this can create a problem quickly. I am asking if there is any better way to approach this problem while staying in VBA. I already did all the tricks like set screenUpdating to false and calculation to manaul.
Here is my code:
For x = 0 To UBound(arrUniqueNumbers)
Dim arrInfo() As Variant
ReDim Preserve arrInfo(0)
If UBound(arrInfo) = 0 Then
arrInfo(0) = CStr(arrUniqueNumbers(x))
End If
For y = 2 To Length
UniqueString = CStr(arrUniquePhoneNumbers(x))
CLEARString = CStr(Sheets(2).Range("E" & y).Value)
If UniqueString = CLEARString Then 'match!
NormalizedDate = Format(CStr(Sheets(2).Range("G" & y).Value), "yyyymmdd")
z = z + 1
ReDim Preserve arrInfo(z)
arrInfo(z) = NormalizedDate & " " & LTrim(CStr(Sheets(2).Range("D" & y).Value))
arrInfo(z) = LTrim(arrInfo(z))
End If
Next
arrUniqueNumbers(x) = arrInfo()
ReDim arrInfo(0) 'erase everything in arrOwners
z = 0
Next
The loop is quite inefficient, so there are quite a few avoidable bottlenecks (mostly in the order of simplest to change to most complex to change)
Take the UniqueString step out of the innermost loop: This step doesn't change with changing y, so no point in repeating it.
Take the Redim Preserve out of the innermost loop: You are reallocating memory in the innermost loop which is extremely inefficient. Allocate 'sufficient' amount of memory outside the loop.
Do not keep using Sheets().Range() to access cell contents: Every time you access something on the spreadsheet, it is a HUGE drag and has a lot of overhead associated with the access. Consider one-step fetch operations from the spreadsheet, and one-step push operations back to the spreadsheet for your results. See sample code below.
Sample code for Efficient Fetch and Push-back operations for the spreadsheet:
Dim VarInput() As Variant
Dim Rng As Range
' Set Rng = whatever range you are looking at, say A1:A1000
VarInput = Rng
' This makes VarInput a 1 x 1000 array where VarInput(1,1) refers to the value in cell A1, etc.
' This is a ONE STEP fetch operation
' Your code goes here, loops and all
Dim OutputVar() as Variant
Redim OutputVar(1 to 1000, 1 to 1)
' Fill values in OutputVar(1,1), (1,2) etc. the way you would like in your output range
Dim OutputRng as Range
Set OutputRng = ActiveSheet.Range("B1:B1000")
' where you want your results
OutputRng = OutputVar
' ONE STEP push operation - pushes all the contents of the variant array onto the spreadsheet
There are quite a few other steps which can further dramatically speed up your code, but these should produce visible impact with not too much effort.
dim dict as Object
set dict = CreateObject("Scripting.Dictionary")
dim x as Long
'Fill with ids
'Are arrUniqueNumbers and arrUniquePhoneNumbers the same?
For x = 0 To UBound(arrUniqueNumbers)
dict.add CStr(arrUniquePhoneNumbers(x)), New Collection
next
'Load Range contents in 2-Dimensional Array
dim idArray as Variant
idArray = Sheets(2).Cells(2,"E").resize(Length-2+1).Value
dim timeArray as Variant
timeArray = Sheets(2).Cells(2,"G").resize(Length-2+1).Value
dim somethingArray as Variant
somethingArray = Sheets(2).Cells(2,"D").resize(Length-2+1).Value
dim y as Long
'Add Values to Dictionary
For y = 2 To Length
Dim CLEARString As String
CLEARString = CStr(timeArray(y,1))
If dict.exists(CLEARString) then
dict(CLEARString).Add LTrim( Format(timeArray(y,1)), "yyyymmdd")) _
& " " & LTrim(CStr(somethingArray(y,1)))
end if
next
Access like this
dim currentId as Variant
for each currentId in dict.Keys
dim currentValue as variant
for each currentValue in dict(currentId)
debug.Print currentId, currentValue
next
next
As for the problem, I need to be able to compare all data in Variant array A to all data in Variant array B. I know I need some kind of double loop (so that every A value is checked against all B values), but I can't figure out how to do it. Here's what I have so far:
Sub Button_Click()
Dim trgtRange As Variant
Dim tempRange As Variant
Set myRange = ThisWorkbook.Sheets(1).Range("L:L")
For Each cell In myRange
If IsEmpty(cell) Then
ActiveCell.Offset(-1, 0).Select
currentRow = ActiveCell.Row
Set trgtRange = Range("L2:L" & currentRow)
Exit For
End If
Next cell
Set tempRange = Range("A1:A" & currentRow - 1)
' Insert a double loop here
End Sub
So, trgtRange is the Variant A and tempRange is Variant B. I know I could have set the Variant B up a little easier, but I already did it that way. After all, code should be polished as last operation anyway.
You might be wondering why Variants A and B are completely the same. Well, that's because I need to compare them so that I can find values that are close to each other, (i.e 10000 and 12000) and I need to incorporate some kind of tolerance for it.
Here is my answer. Why do you need two loops to do this. Some relative addressing handles this issue quite nicely. Set up a spreadsheet like this for an example:
and your code is simply this
Sub Button_Click()
Dim dblTolerance As Double
Dim tmp As Range
'Get source range
Set tmp = ActiveSheet.Range("A2")
'Get tolerance from sheet or change this to an assignment to hard code it
dblTolerance = ActiveSheet.Range("D13")
'use the temporary variable to cycle through the first array
Do Until tmp.Value = ""
'Use absolute function to determine if you are within tolerance and if so put match in the column
'NOTE: Adjust the column offset (set to 4 here) to match whichever column you want result in
If Abs(tmp.Value - tmp.Offset(0, 2).Value) < dblTolerance Then
tmp.Offset(0, 4).Value = "Match"
Else
tmp.Offset(0, 4).Value = "No Match"
End If
'Go to the next row
Set tmp = tmp.Offset(1, 0)
Loop
'Clean up
Set tmp = Nothing
End Sub
The comments in the code explain how it works. This is superior to a double loop because relative referencing is faster, the memory use is more efficient and you only have to make one pass at each row.
If you are required for some reason to use a double loop let me know, but that is inferior performance wise to this methodology. Hope this helps.
I have a list of customers from last year (in column A) and I have a list of customers from this year (in Column B). I've put the data from these two columns in arrays (using the code below - which is set up as Option Base 1):
'Define our variables and array types'
Sub CustomerArray()
Dim LastArray() As String
Dim CurrentArray() As String
Dim BothArray() As String
Dim LR As Long
Dim i As Integer
'Define LastArray which is customers last year'
LR = Cells(Rows.Count, 1).End(xlUp).Row
ReDim LastArray(LR - 3)
With Range("A1")
For i = 1 To LR - 3
LastArray(i) = .Offset(i, 0)
Next i
End With
'Define CurrentArray which is customers this year'
ReDim CurrentArray(LR - 3)
With Range("B1")
For i = 1 To LR - 3
CurrentArray(i) = .Offset(i, 0)
Next i
End With
End Sub
Now I want to compare/combine the Arrays to show a list of customers who appear in both of the two arrays I just defined (last year and this year). I want to create a third array with the customers who appear for both years (and I want to put that in column D of my excel sheet). I'm getting confused on how to write the code which will compare these two arrays (current year and last year). Will I use a conditional If > statement? Each of the arrays have the customers listed alphabetically.
I appreicate any help you might be able to give me.
Thanks!
You don't need to mess with arrays or loop at all, keep it simple, try something like this:
Sub HTH()
With Range("A1", Cells(Rows.Count, "A").End(xlUp)).Offset(, 3)
.Formula = "=IF(COUNTIF(B:B,A1)>0,A1,"""")"
.Value = .Value
.SpecialCells(xlCellTypeBlanks).Delete
End With
End Sub
OK. I got a little carried away here, but this does what your are asking (you may have to tune it up to suit your specific needs. To use this code, simply call the Sub "Match Customers".
Your original code proposed the use of three arrays. Excel VBA provides some mechanisms to do what you seek which are both easier to use, and possibly more efficient.
I went ahead and broke the process out into more discrete chunks of code. While it seems like more code, you will find that each peice might make more sense, and it is much more maintainable. You can also now re-use the individual functions for other operations if needed.
I also pulled your range and column indexes out into locally defined constants. This way, if the various row or column references ever need to change, you only have to change the value in one place.
It is not necessarily the most efficient way to do this, but is most likely less complicated than using the arrays you originally propose.
I have not tested this exhaustively, but it works in the most basic sense. Let me know if you have questions.
Hope that helps . . .
Option Explicit
'Set your Column indexes as constants, and use the constants in your code.
'This will be much more maintainable in the long run:
Private Const LY_CUSTOMER_COLUMN As Integer = 1
Private Const CY_CUSTOMER_COLUMN As Integer = 2
Private Const MATCHED_CUSTOMER_COLUMN As Integer = 4
Private Const OUTPUT_TARGET As String = "D1"
Private Const LAST_ROW_OFFSET As Integer = -3
'A Function which returns the list of customers from last year
'as a Range object:
Function CustomersLastYear() As Range
Dim LastCell As Range
'Find the last cell in the column:
Set LastCell = Cells(Rows.Count, LY_CUSTOMER_COLUMN).End(xlUp)
'Return the range of cells containing last year's customers:
Set CustomersLastYear = Range(Cells(1, LY_CUSTOMER_COLUMN), LastCell)
End Function
'A Function which returns the list of customers from this year
'as a Range object:
Function CustomersThisYear() As Range
Dim LastCell As Range
'Find the last cell in the column:
Set LastCell = Cells(Rows.Count, CY_CUSTOMER_COLUMN).End(xlUp)
'Return the range of cells containing this year's customers:
Set CustomersThisYear = Range(Cells(1, CY_CUSTOMER_COLUMN), LastCell)
End Function
'A function which returns a range object representing the
'current list of matched customers (Mostly so you can clear it
'before re-populating it with a new set of matches):
Function CurrentMatchedCustomersRange() As Range
Dim LastCell As Range
'Find the last cell in the column:
Set LastCell = Cells(Rows.Count, MATCHED_CUSTOMER_COLUMN).End(xlUp)
'Return the range of cells containing currently matched customers:
Set CurrentMatchedCustomersRange = Range(Cells(1, MATCHED_CUSTOMER_COLUMN), LastCell)
End Function
'A Function which performs a comparison between two ranges
'and returns a Collection containing the matching cells:
Function MatchedCustomers(ByVal LastYearCustomers As Range, ByVal ThisYearCustomers As Range) As Collection
Dim output As Collection
'A variable to iterate over a collection of cell ranges:
Dim CustomerCell As Range
'Initialize the collection object:
Set output = New Collection
'Iterate over the collection of cells containing last year's customers:
For Each CustomerCell In LastYearCustomers.Cells
Dim MatchedCustomer As Range
'Set the variable to reference the current cell object:
Set MatchedCustomer = ThisYearCustomers.Find(CustomerCell.Text)
'Test for a Match:
If Not MatchedCustomer Is Nothing Then
'If found, add to the output collection:
output.Add MatchedCustomer
End If
'Kill the iterator variable for the next iteration:
Set MatchedCustomer = Nothing
Next
'Return a collection of the matches found:
Set MatchedCustomers = output
End Function
Sub MatchCustomers()
Dim LastYearCustomers As Range
Dim ThisYearCustomers As Range
Dim MatchedCustomers As Collection
Dim MatchedCustomer As Range
'Clear out the destination column using the local function:
Set MatchedCustomer = Me.CurrentMatchedCustomersRange
MatchedCustomer.Clear
Set MatchedCustomer = Nothing
'Use local functions to retrieve ranges:
Set LastYearCustomers = Me.CustomersLastYear
Set ThisYearCustomers = Me.CustomersThisYear
'Use local function to preform the matching operation and return a collection
'of cell ranges representing matched customers. Pass the ranges of last year and this year
'customers in as Arguments:
Set MatchedCustomers = Me.MatchedCustomers(LastYearCustomers, ThisYearCustomers)
Dim Destination As Range
'Use the local constant to set the initial output target cell:
Set Destination = Range(OUTPUT_TARGET)
'Itereate over the collection and paste the matches into the output cell:
For Each MatchedCustomer In MatchedCustomers
MatchedCustomer.Copy Destination
'Increment the output row index after each paste operation:
Set Destination = Destination.Offset(1)
Next
End Sub
If you want to compare the two arrays using loops, maybe because you have, for example, picked up all the data into arrays for faster computation rather than interacting with the spreadsheet range object, or you need to compare multiple things from the two arrays to check that the entries match so can't use a .find statement, then this is what you need:
-Two loops, one nested inside the other
-Three counters, one for each array
-One "Exit Loop", "Exit For", "GoTo foundmatch" or similar way of exiting the inner loop
-A "Redim Preserve" of the results array
-An "If" statement
-Finally, one line where you assign the name that appears in both arrays to the results array
This is everything that is needed to write it simply as loops - but doesn't give the fastest or best way to do it (Redim Preserve is not the best..). Constructing it should be easy from this list though: the if statement should either be an x=y type for a general usage, or if x>y if you are really really sure that the list being looped in the inner loop really is sorted alphabetically