Intersect two arrays in VBA (no excel) - arrays

I have two string arrays with quite a few items. The items are unique in each array: I sort the array alphabetically and remove duplicates as I do so.
I want to know how many of the items in array a are also present in array b. As soon as there is even one of such items that occurs in both arrays, I "know enough".
And even though both arrays are sorted alphabetically first, doing a double loop as below is very heavy on the processor and terribly slow.
So I'm looking for an alternative, or even an other approach from the start...
Dim aryA(1 To 10000) As String
Dim aryB(1 To 10000) As String
Dim x As Long
Dim y As Long
Dim Counter as long
Counter = 0
'fill arrays with many values
For x = 1 to 10000
For y = 1 to 10000
If aryA(x) = aryB(y) Then
counter = counter + 1
Exit For
End If
Next y
Next x
If Flag = True Then
'run amazing code here
End If
End Sub```
Thanks!

You should try using Dictionaries instead of arrays.
I did a test with 1'000, 10'000 and 100'000 words and compared all elements (so not breaking when a duplicate was found) on a very regular computer. Using 1'000 elements was ready instantly, 10'000 took not much more that the blink of the eye, 100'000 2-3 seconds.
I don't know how your original data is from or how you will your arrays. The following code just copies the arrays into dictionaries and then loop over the keys of the first dictionary and looks if it is present in the second. No need to have the data sorted, and duplicates are eliminated automatically when the dictionary is build.
If your not familiar with Dictionaries: You need to add a reference to the "Microsoft Scripting Runtime" (or change code to late binding).
Sub arrayTest()
' (omitting my test code to fill up the arrays)
' Create Dictionaries
Dim d1 As New Dictionary
Dim d2 As New Dictionary
For x = 1 To UBound(aryA)
If Not d1.Exists(aryA(x)) Then d1.Add aryA(x), vbNullString
Next x
For y = 1 To UBound(aryB)
If Not d2.Exists(aryB(y)) Then d2.Add aryB(y), vbNullString
Next
' Compare the dictionaries
Counter = 0
Dim k
For Each k In d1.Keys
If d2.Exists(k) Then Counter = Counter + 1
Next
Debug.Print "Done: " & Counter
End Sub
If you just need this check once and don't do anything with the data afterwards, of course it is not necessary to create the second dictionary:
For x = 1 To UBound(aryA)
If Not d1.Exists(aryA(x)) Then d1.Add aryA(x), vbNullString
Next x
For y = 1 To UBound(aryB)
If d1.Exists(aryB(y)) Then Counter = Counter + 1
Next
Debug.Print "Done: " & Counter

If both array are sorted you could just go through them considering their ordering in linear time:
Pseudo code:
i=0;
j=0;
while i < len(aryA) and j < len(aryB) do
if aryA[i] == aryB[j] then
counter++;
i++;
j++;
elseif aryA[i] > aryB[j] then
j++
else
i++;
fi
done

Related

vba fastest way compare two lists of numbers

I am working in an MS-Access 2010 environment. I have two strings of numbers. I need to compare them and find the missing numbers in list 2. These lists can be very large (more than 100.000 numbers), so I need a fast method to compare them.
Let us say the lists are as follows:
L1: "1,2,4,5,6,8,9"
L2: "1,2,6,9"
So I need to find the numbers 4, 5 and 8. How can I do this most efficiently? I can put them in an array and them loop through both arrays, but I am afraid it will be very slow if both lists contain over 100.000 values.
Would a dictionary approach be more efficient? If so, how?
Please, try the next code. It uses a dictionary and the two strings split in arrays. It needs only two iterations per each array, which should not take too much. The arrays content not necessary to be sorted. It returns in a third array, its content being Joined and returned in Immediate Window. Of course, it can be used as an array, depending on your need:
Sub testCompare1DArrays()
Dim arr1, arr2, noMatch, i As Long, k As Long, dict2 As Object
arr1 = Split("1,2,4,5,6,8,9", ",")
arr2 = Split("1,2,6,9", ",")
Set dict2 = CreateObject("scripting.dictionary")
'place the array to be checked in the dictionary (as keys)
For i = 0 To UBound(arr2)
dict2(arr2(i)) = vbNullString
Next i
ReDim noMatch(UBound(arr1)) 'set the array to keep the processed result to surelly have space for all occurrences
For i = 1 To UBound(arr1)
If Not dict2.exists(arr1(i)) Then noMatch(k) = arr1(i): k = k + 1 'buld the returned arry for elements not being found as keys
Next i
If k > 0 Then
ReDim Preserve noMatch(k - 1) 'keep only loaded elements
Debug.Print Join(noMatch, "|") 'return in Immediate Window (Ctrl + G)
Else
Debug.Print "All elements of arr2 exists in arr1..."
End If
End Sub
An alternative method to achieve this desired result is to utilize the collections object.
First, loop through the second list adding each number to a collection and using each number as its own item key.
Second, loop through the first list testing each number against the second list's keys.
If the key does not exist, add the number to a collection of missing keys.
Private Sub compareLists()
Dim list1 As Variant, list2 As Variant
Dim testNumbers As New Collection, _
missingNumbers As New Collection
Dim i As Long
Dim number As String
list1 = Split("1,2,4,5,6,8,9", ",")
list2 = Split("1,2,6,9", ",")
'Prepare numbers from list2
For i = 0 to UBound(list2)
number = list2(i)
testNumbers.Add number, number
Next i
'Identify numbers in list1 missing from list2
For i = 0 to UBound(list1)
number = list1(i)
On Error Resume Next
testNumbers.Item number
If Err.Number = 5 And Err.Number <> 457 Then _
missingNumbers.Add number, number
Next i
End Sub
Perhaps a faster solution exists, but I tend to prefer utilizing collections.

Create dynamically sized array to store row counter

The function below finds the first result.
There could be duplicate rows with matching values that meet my if statements. How would I create an array to store the row numbers the search function found so that I can process the data later.
How would I create the array size based on the number of results found in the for loop?
I am assuming that the for loop counter will have some sort of role to play in this. Let's say the for loop found 2 matches in row numbers 56 and 98 matching my if statements:
array_example(counter, 0) = 56
array_example(counter, 0) = 98
The stored values would be:
array_example(1, 0) = 56
array_example(2, 0) = 98
Private Sub Complex_Search(col1, cval1, col2, cval2)
'MsgBox (col1 & " " & col2)
'MsgBox (cval1 & " " & cval2)
Dim i
Dim lRow
Dim Counter
lRow = Cells(Rows.Count, 1).End(xlUp).row
Counter = 0
With Sheets("Office Spaces")
For i = 2 To lRow
If LCase(.Cells(i, col1).Value) = LCase(cval1) And LCase(.Cells(i, col2).Value) = LCase(cval2) Then
row = i
Counter = Counter + 1
End If
Next i
End With
If row = "" Then
MsgBox ("Search complete. 0 results found")
Else
GetRowData (row)
UserForm1.resmax.Value = 1
End If
End Sub
It's worth noting that you haven't even initialized row, and have just let vba implicitly declare it as a variant type for you. To avoid common problems that arise from typos, include Option Explicit at the top of your code and Dim every variable with the type beside it. For example: Dim i as long. Dim i will work, but it will declare it as a variant type
To initialize an array in VBA you use Dim row() as variant. From there you can re-dimension its size using Redim row(LboundX to UboundX) but this will reset all the stored values to zero. To get around this use Redim Preserve row(LBoundX to UBound X).
If you wish to use a 2D array, add a comma and then put the bounds for the next dimension Redim Preserve row(LBoundX to UBound X, LboundY to UBoundY)
At the top of your code I would include
Dim row() as Variant
Redim Preserve row(1 to 1)
Then within the loop I would change row = i to
row(Ubound(row)) = i
Redim Preserve row(1 to Ubound(row) +1)
Now that you have an array though, the check you do below will no longer work and will likely throw an error because you there's no specified index. Instead I suggest changing it from If row = "" Then to If Counter = 0 Then.
I'm not sure what the intention is with GetRowData(row) but you can just access each row number via row(i). It is worth noting however, that the row array will have Counter +1 number of items, but the last one will be blank. You could get around this by adding in an if statement within the already existing one that would look something like this:
If Counter = 0 Then
row(1) = i
Else
ReDim Preserve row(1 To UBound(row) + 1)
row(UBound(row)) = i
End If
Counter = Counter + 1
By implementing this change, row should have exactly Counter number of items that all have a non-empty value
I don't really suggest making a column array because it becomes more cumbersome to change the array size. Redim preserve or not only allows you to change the last dimension in the array, so to change the number of rows you would have to set the array equal to the transpose of itself, do the redim and then set it to the transpose of itself again. It's just needlessly messy. If you're pasting it to a sheet and need it to be in a column you could just transpose it at the end instead.

Swaping positions within an array vb.net

I'm trying to code a program that has two sections depending on which of two buttons has been pressed.
The first section is the bit that is working, the user presses the first button labeled "unsort", this triggers a loop which displays an input box asking for a random number 8 times. These 8 numbers are stored in an array.
However it is the second section I'm struggling with; the second button is labeled sort and should output the numbers the user just entered using the first button is order, smallest to largest. I understand that a bubble sort must be used here and that a loop within a loop must also be used however it is the content of these loop that I don't understand. Since my original post I've edited the post to include some code in the loop I was previously stuck with, however it still isn't producing the desired output (all numbers in order) but is instead just outputting the numbers in a seemingly random order
The code is posted below with annotations:
Public Class BubbleSort1
Dim Bubble(8) As Integer
Dim UnsortedList As String
Dim n As Integer
Dim SortedList As String
Dim temp As String
Private Sub btnUnsort_Click(sender As Object, e As EventArgs) Handles btnUnsort.Click
n = 8 ' number off values on array
For i = 1 To n ' when i is between 1 and size of array
Bubble(i) = InputBox("Enter Number") ' User inputs a number
UnsortedList = UnsortedList & " " & Bubble(i) & vbNewLine ' number is added to the unsorted list variable
Next i
lblUnsort.Text = UnsortedList ' outputs the array
End Sub
Private Sub btnSort_Click(sender As Object, e As EventArgs) Handles btnSort.Click
For i = 1 To n - 1 ' When i is between 1 and the array size - 1 (8-1):
For j = 1 To n - 1 ' Second loop - when j is between 1 and the array size - 1 (8-1):
If Bubble(j) > Bubble(j + 1) Then ' if bubble value j is greater than value j - 1:
temp = Bubble(j)
Bubble(j) = Bubble(j + 1) ' These lines are supost to order the numbers but aren'r currently doing so
Bubble(j + 1) = temp
SortedList = SortedList & Bubble(j) & vbNewLine ' Adding the number in order to a variable
End If
Next j
Next i
lblSort.Text = SortedList ' outputting the ordered numbers
End Sub
End Class
As is pointed out in the code, the section of this code that orders the numbers is just putting them in a random order rather than actually ordering them.
With your updated code which now includes the swapping of array elements, you are building the string which shows the sorted array too soon: it will show the workings rather than the final result.
All you need to do is build the string once the array is in order:
Private Sub btnSort_Click(sender As Object, e As EventArgs) Handles btnSort.Click
' Bubble sort the array...
For i = 1 To n - 1 ' When i is between 1 and the array size - 1 (8-1):
For j = 1 To n - 1 ' Second loop - when j is between 1 and the array size - 1 (8-1):
If Bubble(j) > Bubble(j + 1) Then ' if bubble value j is greater than value j - 1:
temp = Bubble(j)
Bubble(j) = Bubble(j + 1)
Bubble(j + 1) = temp
End If
Next j
Next i
'lblSort.Text = String.Join(vbNewLine, Bubble.Skip(1)) ' an easy one-liner
' Create a string to show the sorted array...
SortedList = "" ' clear it out in case it was used previously
For i = 1 To n
SortedList = SortedList & Bubble(i).ToString()
If i < n Then ' only add a newline if it isn't the last element
SortedList = SortedList & vbNewLine
End If
Next
lblSort.Text = SortedList
End Sub
I put the .ToString() in there in anticipation of you explicitly converting the input strings into numbers; strictly speaking, the & operator will convert its arguments into strings but I prefer to make it obvious in the code.
As your code is, there is an implicit conversion from the input (a string of digits) into an integer (the type of the array elements). While this seems convenient, it can be a problem if VB guesses the wrong conversion for you. There is a way to tell it to let you know if the types of variables don't match: put Option Strict On as the very first line and it will even give you suggestions on what needs to be done to put it right.
If you want to prompt the user for the input, then you will first need to either get a numeric value using a control like a NumericUpDown or you will need to convert the String value to an Integer value using Integer.TryParse. Also, keep in mind that arrays in VB.Net have a 0 based index, so they start at 0, not at 1.
In terms of the Bubble Sort algorithm, you'll need a nested loop like you have with i and j, only your inner-nested loop (j) needs to iterate from the beginning of the array to the second to last item (0 to n-2). Inside of the nested loops, you would compare if the currently iterated value is greater than (or less than depending on which value you want to swap) than the next value. If so, then you'd just reassign the values at the currently iterated index.
Here is a console application example that I whipped up, it does not prompt the user for random values, rather it simply gets a collection of random values and then performs the Bubble Sort:
Private Function BubbleSort(ByVal values() As Integer) As Integer()
'Declare placeholder variables to use in the iterations
Dim temp As Integer
For outterIndex As Integer = 0 To values.Length - 1
For innerIndex As Integer = 0 To values.Length - 2
If values(innerIndex) > values(innerIndex + 1) Then
temp = values(innerIndex + 1)
values(innerIndex + 1) = values(innerIndex)
values(innerIndex) = temp
End If
Next
Next
Return values
End Function
Private r As New Random()
Private Function RandomNumbers(ByVal range As Integer) As Integer()
'Throw an exception if the value is less than 1
If range < 1 Then Throw New ArgumentOutOfRangeException("The range cannot be less than 1")
'Return a collection of random numbers
Return Enumerable.Range(1, range).Select(Function(i) r.Next()).ToArray()
End Function
Fiddle: Live Demo

Removing duplicates of a 2D array in VBA

I have found lots of methods to remove duplicates of a 1D array but could not find a 2D example.
In addition to that, I wonder if the fuction can "leave" an instance of the duplicate item instead of removing them all.
Is it possible to do it?
Example:
Sub Tests()
Dim Example()
Example(0,0) = "Apple"
Example(1,0) = "Apple"
Example(0,1) = "Pear"
Example(1,1) = "Orange"
End Sub
Remaining items would be: Apple, Pear and Orange
This is how I like to do it, using a separate array to hold the unique items. This prevents your loops from having to cycle through non unique items when trying to test them.
Sub Test()
Dim Example(1, 1) As String
Dim NoDups() As String
Dim I, II, III As Long
ReDim NoDups(UBound(Example, 1) * UBound(Example, 2))
For I = 0 To UBound(Example, 1)
For II = 0 To UBound(Example, 2)
For III = 0 To UBound(NoDups)
If NoDups(III) = Example(I, II) Then
Exit For
ElseIf NoDups(III) = "" Then
NoDups(III) = Example(I, II)
Exit For
End If
Next
Next
Next
End Sub
To work through a 2D array, do as you would with a 1D array, but with a 'width' loop inside of a 'height' loop.
ie:
for a = 1 to number_of_elements_in_first_dimension
for y = 1 to number_of_elements_in_second_dimension
initial_comparison_string = array(a,b)
for x = a to number_of_elements_in_first_dimension
for y = b + 1 to number_of_elements_in_second_dimension
if Initial_comparison_string = array(x,y) then array(x,y) = ""
next y
next x
next b
next a
This will run fairly slowly with a very large 2D array, but I think you'd have to do 2 nested loops like this to take each value and compare it against each value which appears later.

VBA Nested For Loop Efficiency

I am trying to find the fastest way to perform a task in VBA. Currently I have it written as a nested for loop which can be extremely slow. I am looping over a list of unique numbers and matching them to numbers in a different list. If I get a match I store the information in a multidimensional array since there can be multiple matches and I want to keep track of all of them. Unfortunetly, this means when using a for loop if there are just 1000 unique numbers and 5000 numbers to look for matches my loop can end up iterating 1000*5000 = 5000000 times. As you see this can create a problem quickly. I am asking if there is any better way to approach this problem while staying in VBA. I already did all the tricks like set screenUpdating to false and calculation to manaul.
Here is my code:
For x = 0 To UBound(arrUniqueNumbers)
Dim arrInfo() As Variant
ReDim Preserve arrInfo(0)
If UBound(arrInfo) = 0 Then
arrInfo(0) = CStr(arrUniqueNumbers(x))
End If
For y = 2 To Length
UniqueString = CStr(arrUniquePhoneNumbers(x))
CLEARString = CStr(Sheets(2).Range("E" & y).Value)
If UniqueString = CLEARString Then 'match!
NormalizedDate = Format(CStr(Sheets(2).Range("G" & y).Value), "yyyymmdd")
z = z + 1
ReDim Preserve arrInfo(z)
arrInfo(z) = NormalizedDate & " " & LTrim(CStr(Sheets(2).Range("D" & y).Value))
arrInfo(z) = LTrim(arrInfo(z))
End If
Next
arrUniqueNumbers(x) = arrInfo()
ReDim arrInfo(0) 'erase everything in arrOwners
z = 0
Next
The loop is quite inefficient, so there are quite a few avoidable bottlenecks (mostly in the order of simplest to change to most complex to change)
Take the UniqueString step out of the innermost loop: This step doesn't change with changing y, so no point in repeating it.
Take the Redim Preserve out of the innermost loop: You are reallocating memory in the innermost loop which is extremely inefficient. Allocate 'sufficient' amount of memory outside the loop.
Do not keep using Sheets().Range() to access cell contents: Every time you access something on the spreadsheet, it is a HUGE drag and has a lot of overhead associated with the access. Consider one-step fetch operations from the spreadsheet, and one-step push operations back to the spreadsheet for your results. See sample code below.
Sample code for Efficient Fetch and Push-back operations for the spreadsheet:
Dim VarInput() As Variant
Dim Rng As Range
' Set Rng = whatever range you are looking at, say A1:A1000
VarInput = Rng
' This makes VarInput a 1 x 1000 array where VarInput(1,1) refers to the value in cell A1, etc.
' This is a ONE STEP fetch operation
' Your code goes here, loops and all
Dim OutputVar() as Variant
Redim OutputVar(1 to 1000, 1 to 1)
' Fill values in OutputVar(1,1), (1,2) etc. the way you would like in your output range
Dim OutputRng as Range
Set OutputRng = ActiveSheet.Range("B1:B1000")
' where you want your results
OutputRng = OutputVar
' ONE STEP push operation - pushes all the contents of the variant array onto the spreadsheet
There are quite a few other steps which can further dramatically speed up your code, but these should produce visible impact with not too much effort.
dim dict as Object
set dict = CreateObject("Scripting.Dictionary")
dim x as Long
'Fill with ids
'Are arrUniqueNumbers and arrUniquePhoneNumbers the same?
For x = 0 To UBound(arrUniqueNumbers)
dict.add CStr(arrUniquePhoneNumbers(x)), New Collection
next
'Load Range contents in 2-Dimensional Array
dim idArray as Variant
idArray = Sheets(2).Cells(2,"E").resize(Length-2+1).Value
dim timeArray as Variant
timeArray = Sheets(2).Cells(2,"G").resize(Length-2+1).Value
dim somethingArray as Variant
somethingArray = Sheets(2).Cells(2,"D").resize(Length-2+1).Value
dim y as Long
'Add Values to Dictionary
For y = 2 To Length
Dim CLEARString As String
CLEARString = CStr(timeArray(y,1))
If dict.exists(CLEARString) then
dict(CLEARString).Add LTrim( Format(timeArray(y,1)), "yyyymmdd")) _
& " " & LTrim(CStr(somethingArray(y,1)))
end if
next
Access like this
dim currentId as Variant
for each currentId in dict.Keys
dim currentValue as variant
for each currentValue in dict(currentId)
debug.Print currentId, currentValue
next
next

Resources