Building and comparing arrays - arrays

I have the below code I'm trying to get to work. This is my first time dealing with arrays in VBA. Here's the plain english version of what I'm going for:
Load SSBarray with column A from worksheet SSB.
Load EDMarray with Column I from worksheet EDM.
Compare the above arrays and sort into two new arrays IDarray and noIDarray based on a possible match.
Output the new arrays into their respective worksheets.
Step 4 is temporary to just make sure the code is even working. The entire project is compiling all the data from 3 sheets into these two lists. Worksheet 1 has Data point A only, Worksheet 2 may or may not have Data point A, B, and/or C, and Worksheet 3 may or may not have Data point A, B, and/or C. The code I have is my start to check for all of the data point A's in worksheet 1 are in worksheet 2. Run time is also a factor. I'll take any and all the help I can get at this point. Thanks.
'Build Arrays
Dim i As Long, j As Long
Dim SSBarray
Dim EDMarray
Dim IDarray
Dim noIDarray
Dim YCounter As Long
Dim NCounter As Long
Dim inArray As Boolean
endSSB = SSB.Range("A" & Rows.Count).End(xlUp).Row
endEDM = EDM.Range("A" & Rows.Count).End(xlUp).Row
BBlast = BB.Range("A" & BB.Range("A" & Rows.Count).End(xlUp).Row)
ReDim SSBarray(1 To endSSB)
ReDim EDMarray(1 To endEDM)
For i = 2 To endSSB
SSBarray(i) = SSB.Cells(i, 1).Value2
Next i
For i = 2 To endEDM
EDMarray = EDM.Cells(i, 9).Value2
Next i
For i = 2 To endSSB
inArray = False
For j = 2 To endEDM
If SSBarray(i) = EDMarray(j) Then
inArray = True
YCounter = YCounter + 1
ReDim Preserve IDarray(1 To YCounter)
IDarray(YCounter) = SSBarray(i)
Exit For
End If
Next j
If inArray = False Then
NCounter = NCounter + 1
ReDim Preserve noIDarray(1 To NCounter)
noIDarray(NCounter) = SSBarray(i)
End If
Next i
For i = 1 To UBound(IDarray)
Identifiers.Cells(i, 4) = IDarray(i)
Next i
For i = 1 To UBound(noIDarray)
NoIdentifiers.Cells(i, 4) = noIDarray(i)
Next i
End Sub
Revised Code:
'Sort and Compile Data
Dim i As Long
endSSB = SSB.Range("A" & Rows.Count).End(xlUp).Row
endEDM = EDM.Range("A" & Rows.Count).End(xlUp).Row
BBlast = BB.Range("A" & BB.Range("A" & Rows.Count).End(xlUp).Row)
Public Type otherIDs
SEDOL As Variant
ISIN As Variant
End Type
Dim SSBIds As New Scripting.Dictionary
Dim IDs As otherIDs
For i = 2 To endSSB
'Add an ID\row number pair
SSBIds.Add SSB.Cells(i, 1).Value2
Next i
Dim EDMIds As New Scripting.Dictionary
For i = 2 To endEDM
IDs.SEDOL = EDM.Cells(i, 8).Value2
IDs.ISIN = EDM.Cells(i, 7).Value2
EDMIds.Add EDM.Cells(i, 9).Value2, IDs.SEDOL, IDs.ISIN
Next i
Dim IdMatches As New Scripting.Dictionary
Dim IdMisMatches As New Scripting.Dictionary
Dim key As Variant
For Each key In SSBIds
'If it's in the other dictionary...
If EDMIds.Exists(key) Then
'...add the row to the matches...
IdMatches.Add key, EDMIds(key)
Else
'...otherwise add the row to the mismatches.
IdMisMatches.Add key, EDMIds(key)
End If
Next
i = 1
For Each key In IdMatches.Keys
Identifiers.Cells(i, 4) = key
Identifiers.Cells(i, 5) = IdMatches.IDs.SEDOL
Identifier.Cells(i, 6) = IdMatches.IDs.ISIN
i = i + 1
Next
i = 1
For Each key In IdMisMatches.Keys
NoIdentifiers.Cells(i, 4) = key
i = i + 1
Next

Arrays aren't the best containers to be using here. Dictionaries have an .Exists method that uses a much faster hash lookup than a simple iteration that compares every value.
Not only that, repeatedly calling Redim Preserve is incredibly inefficient compared to adding items to a Dictionary. Every time you increase the array dimension, the entire data set gets copied to a newly allocated area of memory and the data pointer for the array gets updated to point to it.
Example using Dictionaries (you'll need to add a reference to Microsoft Scripting Runtime):
Dim SSBIds As New Scripting.Dictionary
For i = 2 To endSSB
'Add an ID\row number pair
SSBIds.Add SSB.Cells(i, 1).Value2, i
Next i
Dim EDMIds As New Scripting.Dictionary
For i = 2 To endEDM
EDMIds.Add EDM.Cells(i, 9).Value2, i
Next i
Dim IdMatches As New Scripting.Dictionary
Dim IdMisMatches As New Scripting.Dictionary
Dim key As Variant
For Each key In SSBIds
'If it's in the other dictionary...
If EDMIds.Exists(key) Then
'...add the row to the matches...
IdMatches.Add key, EDMIds(key)
Else
'...otherwise add the row to the mismatches.
IdMisMatches.Add key, EDMIds(key)
End If
Next
i = 1
For Each key In IdMatches.Keys
Identifiers.Cells(i, 4) = key
i = i + 1
Next
i = 1
For Each key In IdMisMatches.Keys
NoIdentifiers.Cells(i, 4) = key
i = i + 1
Next
Note that this assumes that your key columns have unique values. If they don't, you can either test for the presence of the key before adding a value (this matches your code's behavior of only taking the first match), or you can create a Collection of values to store in the Dictionary for each key, or something else entirely depending on your requirement.

Related

How can I use the numbers stored in an array to copy several individual columns to another workbook

I am trying to make a script that copy several specific columns, that are selected through a search loop, into a new woorkbook.
At the moment I have a loop to find all column positions of the the headers in a list and store these into an array. My question is how can I use all the values inside the array to copy them into a new workbook. At the moment I am looping through all the values and copying them individually.
How can make this copy and paste with one selection?
Dim c As Range
Dim dataheaderList As Range
Dim eachheader As Range
Dim sourceColumn As Range, targetColumn As Range
Dim myArray() As Variant
Dim x As Long
Dim i As Long
Set dataheaderList = Range("DataHeaderstoCopy") 'The range I want to look
i = -1
'Loop to find the positions of the headers from a existing list
For Each eachheader In dataheaderList
i = i + 1
ReDim Preserve myArray(i) As Variant
With ThisWorkbook.Names("Title2012Q2").RefersToRange
Set c = .Find(eachheader, LookIn:=xlValues)
myArray(i) = c.Column
End With
Next eachheader
'Loop to copy individually
i = 0
For x = LBound(myArray) To UBound(myArray)
i = i + 1
Set sourceColumn = Workbooks("SOR_Historical_Global_Entries_Example_2.xlsm").Worksheets(1).Columns(myArray(x))
Set targetColumn = Workbooks("Book1").Worksheets(1).Columns(i)
sourceColumn.Copy Destination:=targetColumn
Next x
End Sub

VBA stop using temporary ranges

I'm new to vba so I need some help making my macro more efficient. It does return the desired outcome however I know there must be a much quicker way to do so I just do not have the vba experience to know how.
I have a column which contains names of people assigned to a project. Some are only one name, and others may be multiple, for example:
At the moment, my code goes through this column, separates the names by comma, and enters them individually into a new range like so:
I then use a collection for the unique names and enter them in the final desired list. The names must show up three times, blank row, next three rows are the next name, so on.It should look like this in the end:
Currently my code is the following
Sub FindUniques()
Dim Ws As Worksheet, Ns As Worksheet
Dim SubString() As String, m As Integer, k As Long, NameCount As Integer
Dim allNames As New Collection, tempRng As Range
Set Ns = Worksheets("Sheet2")
Set Ws = Worksheets("Sheet1")
'Loops through the Assigned To column, separates and finds unique names
On Error Resume Next
For i = 1 To Ws.Range("A:A").End(xlDown).Row - Range("Assigned_to").Row
SubString = Split(Range("Assigned_to").Offset(i), ", ")
For j = 0 To UBound(SubString)
allNames.Add (allNames.count), SubString(j)
Next j
Next i
On Error GoTo 0
NameCount = allNames.count
For k = 1 To NameCount
For m = 1 To 4
Ns.Cells((k - 1) * 4 + m + 7, 2) = allNames.Key(k)
Next
Range("Names").Offset((k - 1) * 4).ClearContents
Next
End Sub
It works, however there must be some way that is more efficient than entering the names into a new range and then deleting the range. How can I use a collection or an array or something of the sort to make it quicker? Any ideas would be really appreciated
edit: I have now updated the code and it is using an collection, taking values from the substring. This enters the item (0, 1, 2, ...) in the cells instead of the keys (keys here are the names). How do I get it to return the key instead of the item number?
The slowest part of VBA are worksheet interactions so we should attempt to minimize that as much as possible.
Sub FindUniques()
Dim ws As Worksheet, ns As Worksheet
Dim splitStr() As String, nameStr As Variant
Dim dict As New Dictionary
Dim lastRow As Long, i As Long
Set ns = Worksheets("Sheet2")
Set ws = Worksheets("Sheet1")
lastRow = ws.Cells(ws.Rows.Count, 1).End(xlUp).Row
'Loops through the Assigned To column, separates and finds unique names
For i = 2 To lastRow
splitStr = Split(CStr(ws.Cells(i, 1).Value), ", ")
For Each nameStr In splitStr
If Not dict.Exists(nameStr) Then dict.Add nameStr , 0
Next
Next i
i = 2
For Each nameStr In dict.Keys
ns.Cells(i, 1).Resize(3).Value = nameStr
i = i + 4
Next
End Sub
Edited With #Toddleson & #BigBen 's suggestions
Good Luck!

Go through values in range, search for them in range, find value in respective rows, add them to array

I would like to go through a range of values in Column D and take each value:
for each value
check in the same range for its occurrence
check in the row of its occurrence for a value in column A
Add this value in column a to an array (or another way to save data)
go to the next occurrence of the value in column D and save the next Value of Column A to the array
When I checked each value for all its occurrences and added it to the array I want the array to be given out in the cell H1 (and for the next values onwards, I1 and so on)
Here's a picture of what I mean with some dummy values:
My attempts in VBA so far are this (with the remark that I deal with arrays for the first time):
Dim finden As String, FirstFound As String
Dim FoundCell As Range, rng As Range
Dim i As Integer
Dim zahl As Integer
Dim zeile As Range
Dim temparray As Double
Dim b As Integer
Dim count As Integer
Set rng = Worksheets("Tabelle1").Range("H1:H100")
i = Worksheets("Tabelle1").Cells(Rows.count, "D").End(xlUp).Row
For zahl = 1 To i
finden = Sheets("Tabelle1").Cells(zahl, "D").Value
count = Application.WorksheetFunction.CountIf(Range("A1:A100"), finden)
Set zeile = Sheets("Tabelle1").Columns("D").Find(finden, Cells(Rows.count, "D"), xlValues, xlWhole)
If Not zeile Is Nothing Then
FoundCell = zeile.Address
Do
For b = 1 To count
Set temparray(b, 1) = Sheets("Tabelle1").Cells(zeile.Row, "A").Value
Set zeile = Sheets("Tabelle1").Columns("A").Find(finden, zeile, xlValues, xlWhole)
Next b
Loop While zeile.Address <> FoundCell
End If
Set zeile = Nothing
rng.Value = temparray
Sheets("Tabelle1").Cells(1, 8 + zahl) = rng.Value
Next
End Sub
Unfortunately I already get a error message for:
set temparray(b,1)
telling me a data field was expected.
Any idea how I could solve my problem?
Have a look at the Collection object as it is a good way to store unique values. You don't need to run the multiple Find functions or incrementally build your array, you could simply read the columns once and write them into the relevant collection.
It's had to tell from your question and code how you want to write the output, but the code below will set you in the right direction:
Dim uniques As Collection
Dim valueSet As Collection
Dim valueD As String
Dim valueA As String
Dim v As Variant
Dim r As Long
Dim c As Long
Dim output() As String
'Read the data
With ThisWorkbook.Worksheets("Tabelle1")
v = .Range("A1", _
.Cells(Rows.Count, "D").End(xlUp)) _
.Value2
End With
'Populate the collections
Set uniques = New Collection
For r = 1 To UBound(v, 1)
valueA = CStr(v(r, 1))
valueD = CStr(v(r, 4))
'Check if we have a collection for the D value
Set valueSet = Nothing
On Error Resume Next
Set valueSet = uniques(valueD)
On Error GoTo 0
'If not then create a new one.
If valueSet Is Nothing Then
Set valueSet = New Collection
uniques.Add valueSet, Key:=valueD
End If
'Add the A value to it
valueSet.Add valueA
Next
'Compile the write array
ReDim Preserve output(1 To 1, 1 To uniques.Count)
c = 1
For Each valueSet In uniques
For Each v In valueSet
'--> uncomment this 'If block', if you want
'--> comma separated values.
' If Len(output(1, c)) > 0 Then
' output(1, c) = output(1, c) & ", "
' End If
output(1, c) = output(1, c) & v
Next
c = c + 1
Next
'Write the output array
ThisWorkbook.Worksheets("Tabelle1") _
.Range("H1").Resize(, UBound(output, 2)) _
.Value = output

Populate a one-dimensional array with values from an Excel Table column using VBA

The code below is meant to read columns from an Excel table into arrays, which can then be used to determine whether each "Project" belongs to the Environment "Group", and if so, to add the project number and dollar value to another array. I am having some issues with my code, and have been searching the internet and StackOverflow but have been able to find very little information on dealing with Excel Tables using VBA. I am using Excel 2010.
Sub UpdateProjectsAndCharges()
'Define arrays to be used
Dim projectArray() As Variant
Dim uniqueProjectArray(100) As Variant
Dim dollarValue() As Variant
Dim envProjectArray(100) As Variant
Dim envDollarValue(100) As Double
Dim cumulativeCosts(100) As Double
'Define all tables in this sheet as list objects
Dim UnitsValues As ListObject
Dim ChargingTracking As ListObject
'Define counters to be used
Dim counter As Integer
Dim counter2 As Integer
'Set variables for each table in sheet
Set UnitsValues = Sheets("Cluster Data").ListObjects("UnitsValues")
Set ChargingTracking = Sheets("Cluster Data").ListObjects("ChargingTracking")
'Find last row in table
With Sheets("Cluster Data")
lastRow = Cells(Rows.Count, 1).End(xlUp).Row
End With
'Define variables to be used in loops
Dim userGroup As Variant
Dim project As Variant
Dim Value As Variant
'Set arrays to respective columns from UnitsValues table
userGroups = Range("UnitsValues[Group]")
projectArray = Range("UnitsValues[Project]")
dollarValue = Range("UnitsValues[Dollar Value]")
'Redefine length of arrays to number of rows in table
ReDim Preserve projectArray(lastRow)
ReDim Preserve dollarValue(lastRow)
'Set counter values
counter = 1
counter2 = 1
For Each userGroup In userGroups
project = projectArray(counter)
Value = dollarValue(counter)
If userGroup = "Environment" Then
envProjectArray(counter2) = project
envDollarValue(counter2) = Value
counter2 = counter2 + 1
MsgBox ((envProjectArray(counter2) & " " & envDollarValue(counter2)))
End If
counter = counter + 1
Next userGroup
I was receiving the "Subscript out of range" error with these lines:
project = projectArray(counter)
Value = dollarValue(counter)
I looked up the error and thought that these lines would perhaps fix the problem:
ReDim Preserve projectArray(lastRow)
ReDim Preserve dollarValue(lastRow)
Now, I am receiving the same error on the lines above instead, and have run out of ideas on how to fix the error. I suspect it is happening because I assigned a range into an array, but I'm not certain.
Change:
project = projectArray(counter)
Value = dollarValue(counter)
to
project = projectArray(counter, 1)
Value = dollarValue(counter, 1)
Arrays read from worksheets always are multidimensional even if you just have 1 column.
In this case you are specifying that column to be 1, every time.

Comparing two large lists with multiple columns (same number in each list) in excel VBA and do...more stuff

I've searched far and wide and I can't quite find anything to fit my needs.
The situation:
I have two lists of data with the same type data in each column (10 columns but the last 2 are useless), but the lists are of varying length (currently 55k in one, 18k in the other). The longer list is going to be a running list of items with the most up to date data in each column for the unique ID # in column A. The other list is linked to a SharePoint list that I update a couple times each day.
The need:
I need the list that updates from SharePoint to be compared to the running list. If there are matching Unique ID #'s in the lists, then the running list needs to be updated to the pulled data. If the running list doesn't contain a Unique ID that is in the pulled list, the new line needs to be added to the running list (which will be sorted later).
I first tried doing this with cell references in two for loops and for only 10 rows this worked fine. When I tried running it for every line, I had problems. So I tried using arrays instead, but this is new territory for me. The code seems to be working, but it's taking a really long time to run (I've let it go for 10 minutes before force stopping). I've tried adding some efficiency increases like turning off screen updating and calculations, but they shouldn't have any effect since I'm using arrays and not actually updating the cells until the array comparison is finished. If arrays are more efficient, great, but I don't know how to combine the data from the pulled list's array to the running list's array.
Here is the code that I have so far:
Sub Data_Compile_Cells()
Dim sdata As Worksheet, spull As Worksheet
Dim p As Long, d As Long, c As Long
Dim lrdata As Long, lrpull As Long
Dim rdata As Range, rpull As Range
Dim Newvalue As Boolean
Dim apull As Variant, adata As Variant
Dim nrows As Long, ncols As Integer
Set sdata = Sheets("Data")
Set spull = Sheets("Data Pull")
Newvalue = "FALSE"
i = 1
apull = spull.Range("A1").CurrentRegion
adata = sdata.Range("A1").CurrentRegion
'lrdata = sdata.Range("A" & Rows.Count).End(xlUp).Row
'lrpull = spull.Range("A" & Rows.Count).End(xlUp).Row
Application.Calculation = xlCalculationManual
Application.ScreenUpdating = False
sdata.Activate
'*****UniqueID Check******
'Run through list of Unique ID's pulled from SharePoint
For p = 2 To UBound(apull, 1)
'I tried to add a status bar to see if the code was actually running
'Application.StatusBar = "Progress: " & p & " of " & UBound(apull, 1) & " : " & Format(p / UBound(apull, 1), "0%")
'Compare each one to the Unique ID's already listed
For d = 2 To UBound(adata, 1)
'Check for matching Unique ID's
If adata(d, 1) = apull(p, 1) Then
'Check each cell in the row with the matching Unique ID
For c = 2 To 10
'If a cell does not have the same data, replace the Data array value with the value from the Pull array
If adata(p, c) <> apull(d, c) Then
adata(d, c) = apull(p, c)
End If
Next c
'If a match is found, skip to the next p value
Exit For
Else
Newvalue = "TRUE"
'Need code to append new line to Data array
End If
Next d
Next p
'Sort the data
'Range("A2").CurrentRegion.Sort key1:=Range("A2"), order1:=xlAscending, Header:=xlYes
Application.ScreenUpdating = True
Application.Calculation = xlCalculationAutomatic
End Sub
Any direction would be much appreciated.
This ran in <1 sec for me, using 20k rows "data", ~3k rows "pull" (mix of updates and new).
EDIT: tidied up and added some comments...
Sub tester()
Const NUM_NEW As Long = 20000 'large enough ?
Dim arrPull, arrData, arrDataId, arrNew()
Dim ubP As Long, ubD As Long
Dim numNew As Long, r As Long
Dim v, c As Long
Dim t, tmp, coll As Collection
t = Timer
'grab the current and new data
arrPull = Sheets("Pull").Range("A1").CurrentRegion.Value
arrData = Sheets("Data").Range("A1").CurrentRegion.Value
ubP = UBound(arrPull, 1)
ubD = UBound(arrData, 1)
numNew = 0
ReDim arrNew(1 To NUM_NEW, 1 To 10) 'array for new data
'create a collection to map ID to "row number"
Set coll = New Collection
For r = 1 To ubD
coll.Add Item:=r, Key:=arrData(r, 1)
Next r
For r = 1 To ubP
tmp = arrPull(r, 1)
v = 0
'collection has no "exists" function, so trap any error
On Error Resume Next
v = coll.Item(tmp)
On Error GoTo 0
If v > 0 Then
'Id already exists: update data
For c = 2 To 10
arrData(v, c) = arrPull(r, c)
Next c
Else
'new Id: add to the "new" array
numNew = numNew + 1
If numNew > NUM_NEW Then
MsgBox "Need larger `new` array!"
'a more sophisticated approach would be to dump the full
' array to the sheet and then redimension it for more
' data...
Exit Sub
End If
For c = 1 To 10
arrNew(numNew, c) = arrPull(r, c)
Next c
End If
Next r
'drop updated and new (if any) to the worksheet
With Sheets("Data")
.Range("A1").CurrentRegion.Value = arrData
If numNew > 0 Then
.Cells(ubD + 1, 1).Resize(numNew, 10).Value = arrNew
End If
End With
Debug.Print "Done in " & Timer - t & " sec"
End Sub
You would be better off using MSAccess to do this. Link to both tables and then do an inner join on the id field or which ever field links the items in the two lists.

Resources