Deleting instances of chars using arrays and loops - arrays

I have a column where almost every cell is made of a combination of numbers and letters and symbols ("TS-403" or "TSM-7600"). I want every char that's not an integer to be deleted/replaced with an empty string, so that I'm left only with numbers ("403").
I've thought up of two approaches:
I think the best one is to create an array of integers with the numbers 0-9, and then iterate through the cells with a for loop where if the string in a cell contains a char that's not in the array, then that symbol (not the entire cell) should be erased.
Sub fixRequestNmrs()
Dim intArr() as Integer
ReDim intArr(1 to 10)
For i = 0 to 9
intArr(i) = i
Next i
Dim bRange as Range
Set bRange = Sheets(1).Columns(2)
For Each cell in bRange.Cells
if cell.Value
// if cell includes char that is not in the intArr,
// then that char should be deleted/replaced.
...
End Sub()
Perhaps the second approach is easier, which would be to use the Split() function as the '-' is always followed by the numbers, and then have that first substring replaced with "". I'm very confused on how to use the Split() function in combination with a range and a replace funtion though...
For Each cell in bRange.Cells
Cells.Split(?, "-")
...

Digits to Integer Using the Like Operator
The Function
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
' Purpose: Returns an integer composed from the digits of a string.
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Function DigitsToInteger(ByVal SearchString As String) As Long
Dim ResultString As String
Dim Char As String
Dim n As Long
For n = 1 To Len(SearchString)
Char = Mid(SearchString, n, 1)
If Char Like "[0-9]" Then ResultString = ResultString & Char
Next n
If Len(ResultString) = 0 Then Exit Function
DigitsToInteger = CLng(ResultString)
End Function
A Worksheet Example
Sub DigitsToIntegerTEST()
Const FIRST_ROW As Long = 2
' Read: Reference the (single-column) range.
Dim wb As Workbook: Set wb = ThisWorkbook ' workbook containing this code
Dim ws As Worksheet: Set ws = wb.Worksheets("Sheet1")
Dim LastRow As Long: LastRow = ws.Cells(ws.Rows.Count, "B").End(xlUp).Row
If LastRow < FIRST_ROW Then Exit Sub ' no data
Dim rg As Range: Set rg = ws.Range("B2", ws.Cells(LastRow, "B"))
Dim rCount As Long: rCount = rg.Rows.Count
' Read: Return the values from the range in an array.
Dim Data() As Variant
If rCount = 1 Then
ReDim Data(1 To 1, 1 To 1): Data(1, 1) = rg.Value
Else
Data = rg.Value
End If
' Modify: Use the function to replace the values with integers.
Dim r As Long
For r = 1 To rCount
Data(r, 1) = DigitsToInteger(CStr(Data(r, 1)))
Next r
' Write: Return the modifed values in the range.
rg.Value = Data
' To test the results in the column adjacent to the right, instead use:
'rg.Offset(, 1).Value = Data
End Sub
In VBA (Simple)
Sub DigitsToIntegerSimpleTest()
Const S As String = "TSM-7600sdf"
Debug.Print DigitsToInteger(S) ' Result 7600
End Sub
In Excel
=DigitsToInteger(A1)

If you have the CONCAT function, you can do this with a relatively simple formula -- no VBA needed:
=CONCAT(IFERROR(--MID(A1,SEQUENCE(LEN(A1)),1),""))
If you prefer a non-VBA solution in an earlier version of Excel, there is a more complex formula available, but I'd have to go back through my files to locate it.

A tricky function GetVal()
The following function
translates a string into a single characters array arr via help function String2Arr()
isolates them into numeric (category code 6) or non-numeric categories (other) via a tricky execution of Application.Match (here without its 3rd argument which is mostly used for precise search, and by comparing two arrays)
finds the starting position in the original string via Instr()
returns the value of the right substring via Val() (~> see note).
Function GetVal(ByVal s As String) As Double
Dim arr: arr = String2Arr(s): Debug.Print Join(arr, "|")
Dim chars: chars = Split(" ,',+,-,.,0,A", ",")
Dim catCodes: catCodes = Application.Match(arr, chars) 'No 3rd zero-argument!!
Dim tmp$: tmp = Join(catCodes, ""): Debug.Print Join(catCodes, "|")
Dim pos&: pos = InStr(tmp, 6) ' Pos 6: Digits; pos 1-5,7: other symbols/chars
GetVal = Val(Mid(s, pos)) ' calculate value of right substring
End Function
Notes
The Val function can translate a (sub)string with starting digits to a number, even if there are following non-numeric characters.
Help function String2Arr()
Atomizes a string into a single characters array:
Function String2Arr(ByVal s As String)
s = StrConv(s, vbUnicode)
String2Arr = Split(s, vbNullChar, Len(s) \ 2)
End Function
Example call
Dim s As String
s = "aA+*&$%(y#,'/\)!-12034.56blabla"
Debug.Print GetVal(s) ' ~~> 12034.56

Related

Change a single-dimension array into a multi-dimensional array in VBA for Access

I have code to ask a user for a series of codes that then creates a single-dimensional array like this:
Dim strDaysTimes As String
Dim arrDaysTimes() As String
strDaysTimes = InputBox("What days and times do you want to schedule meetings for? (write as 6c,7b)", "Enter Days and Times")
arrDaysTimes() = Split(strDaysTimes, ",")
The number of inputs is not defined but the format is. It could be "6c,7b" or "5a,6b,7b".
I want to convert this into a multi-dimensional array that would carry the values like this (one dimension has the number portion and the other has the letter portion):
5 a
6 b
7 b
I know that I need to use a nested For...Next statements to process multidimensional arrays, but I would appreciate any suggestions.
Use ReDim:
Public Function DivideArray()
Dim strDaysTimes As String
Dim arrDaysTimes() As String
Dim DaysTimes() As String
Dim Index As Integer
strDaysTimes = InputBox("What days and times do you want to schedule meetings for? (write as 6c,7b)", "Enter Days and Times")
arrDaysTimes() = Split(strDaysTimes, ",")
ReDim DaysTimes(UBound(arrDaysTimes) - LBound(arrDaysTimes) + 1, 0 To 1)
For Index = LBound(arrDaysTimes) To UBound(arrDaysTimes)
DaysTimes(Index, 0) = Left(LTrim(arrDaysTimes(Index)), 1)
DaysTimes(Index, 1) = Right(RTrim(arrDaysTimes(Index)), 1)
Next
For Index = LBound(arrDaysTimes) To UBound(arrDaysTimes)
Debug.Print DaysTimes(Index, 0), DaysTimes(Index, 1)
Next
End Function
Input example:
a7, b8, c9
Output:
a 7
b 8
c 9
Just for the sake of the art an alternative to #Gustav 's approach with the bonus that it returns token lengths greater than 1, too.
Furthermore it profits from the fact that the Val() function is able to return
a) the starting numeric value from an input string and
b) the closing string by a split via the above numeric value as delimiter.
Public Function tokenize(ByVal s As String)
Dim arr() As String
arr() = Split(Trim(s), ",")
Dim tmp() As String
ReDim tmp(0 To UBound(arr) - LBound(arr), 0 To 1)
Dim i As Long
For i = LBound(arr) To UBound(arr)
Dim num: num = Val(arr(i))
tmp(i, 0) = num
tmp(i, 1) = Split(arr(i), num)(1)
Next
tokenize = tmp
End Function
Example call
Sub testTokenize()
'0. Get input string (e.g. "6c,7b")
Dim strDaysTimes As String
strDaysTimes = InputBox( _
"What days and times do you want to schedule meetings for? (write as 6c,7b)", _
"Enter Days and Times", _
"6c,7b")
'1. Call help function
Dim results As Variant
results = tokenize(strDaysTimes) ' << function tokenize()
'2. Show results in VB Editor's immediate window
Dim i As Long
For i = LBound(results) To UBound(results)
Debug.Print results(i, 0), results(i, 1)
Next
End Sub
The following code will help you get there.
The GetDaysAndTimes function will return a Jagged array (i.e. an array of arrays). This means that to get the Day and Time of Item 3 you would use ArrayName(2)(0) and ArrayName(2)(1) where arrayname is the name of the array you are using (arrayDaysTimes?)
The function SplitAlphaNumString allows users to enter codes such as AB23.
Option Explicit
' This function takes the string returned by your input box
Public Function GetDaysAndTimes(ByRef ipString As String) As Variant
Dim myItems As Variant
myItems = VBA.Split(ipString, ",")
Dim myDayTimes As Variant
Dim myindex As Long
For myindex = LBound(myItems) To UBound(myItems)
myDayTimes(myindex) = SplitAlphaNumString(myItems(myindex))
Next
GetDaysAndTimes = myDayTimes
End Function
Public Function SplitAlphaNumString(ByVal ipString As String) As Variant
Dim myindex As Long
For myindex = 1 To VBA.Len(ipString)
If VBA.Asc(VBA.Mid(ipString, myindex, 1)) < 58 Then
Dim myAlphas As String
myAlphas = VBA.Mid(ipString, 1, myindex - 1)
Dim myNums As String
myNums = VBA.Mid(ipString, myindex)
SplitAlphaNumString = Array(myAlphas, myNums)
Exit Function
End If
Next
End Function
Sub Test()
Dim myArray As Variant
myArray = SplitAlphaNumString("D5")
Debug.Print myArray(0), myArray(1)
End Sub

Filtering out Numbers from Array

So I have an Array called TagOptions - it contains numeric values according to a pervious if statement. In order to take out values I didn't want I gave the undesired values a place holder value of 0. I am now trying to filter out this value but can't find anything online that is helpful.
Will paste the entire function for context but more interested in just filtering out the placeholder zeros from my array.
Sorry if this is novice but I am very new to this:
Private Sub CommandButton4_Click()
Dim sh As Worksheet
Set sh = ThisWorkbook.Sheets("TEST")
lrow = sh.Cells(Rows.count, 1).End(xlUp).Row
Dim splitstring As String
Dim holder As String
Dim myarray() As String
Dim strArrayNumber() As Integer
Dim strArrayTag() As String
Dim TagOptions() As Integer
Dim TagOptions2() As Integer
ReDim strArrayNumber(1 To lrow) As Integer
ReDim strArrayTag(1 To lrow) As String
'Initial for loop splitting tags and removing any tags with text (MV-4005A)
'Transfering those remaining tag numbers into array if they match equip selected
For a = 1 To lrow
If sh.Cells(a, 1).Value <> vbNullString Then
splitstring = sh.Cells(a, 1).Value
myarray = Split(splitstring, "-")
strArrayTag(a) = myarray(0)
End If
If IsNumeric(myarray(1)) = False Then
myarray(1) = 0
End If
If strArrayTag(a) = TagNumber1.Value Then 'Only stored if has selected Equipment tag
strArrayNumber(a) = myarray(1)
End If
Next a
'Sort Created Array
Quicksort strArrayNumber, LBound(strArrayNumber), UBound(strArrayNumber)
ReDim TagOptions(1000 To 2000) As Integer
Dim j As Integer
For j = 1000 To 2000
For b = 1 To UBound(strArrayNumber)
If strArrayNumber(b) = j Then
TagOptions(j) = 0
Exit For
Else
TagOptions(j) = j
End If
Next b
sh.Cells(j, 8) = TagOptions(j)
Next j
Quicksort TagOptions, LBound(TagOptions), UBound(TagOptions)
For f = LBound(TagOptions) To UBound(TagOptions)
sh.Cells(f, 9) = TagOptions(f)
Next f
**TagOptions2 = Filter(TagOptions, "0", False, vbDatabaseCompare)**
Me.ComboBox1.List = TagOptions
End Sub
Thnak you in advance for any help.
tl;dr entire code, just note that VBA's Filter() function applied on a "flat" 1-dim array only executes a partial character search finding "0" also in strings like e.g. "10" or "205", what definitely isn't what you want to do :-;
Btw, if your initial array is a 2-dim array, there are number of answers at SO how to slice data from a 2-dim array and transpose or double transpose them to a 1-dim array needed as starting point.
Solving the actual core question how to filter out zero-digits
To succeed in filtering out zeros in a 1-dim array, simply use the following function via the Worksheetfunction FilterXML (available since vers. 2013+):
tagOptions = WorksheetFunction.FilterXML("<t><s>" & _
Join(tagOptions, "</s><s>") & "</s></t>", _
"//s[not(.='0')]")
resulting in a 1-based 2-dim array.
If you prefer, however to get a resulting 1-dim array instead, simply transpose it via tagOptions = Application.Transpose(tagOptions) or tagOptions = WorkSheetFunction.Transpose(tagOptions).
You can find an excellent overview at Extract substrings ... from FilterXML

VBA moving rows through arrays

I am pretty new with arrays in VBA, and need some help finishing a code...
The objective is to copy from one array to another if a value in the first part of the array is found.
Here's what I have so far, and I have put comments in the lines that I am struggling with.
Option Explicit
Sub ReadingRange()
Dim ARRAY_Multiwage As Variant
Dim ARRAY_TEMP_Multiwage() As Variant
ARRAY_Multiwage = Sheets("Multiwage").Range("A1").CurrentRegion
Dim a As Long
Dim b As Long
For a = LBound(ARRAY_Multiwage, 1) To UBound(ARRAY_Multiwage, 1)
If ARRAY_Multiwage(a, 1) = "60021184_2018/36/HE" Then
'add ARRAY_Multiwage(a, 1) to ARRAY_TEMP_Multiwage
'Debug print to see that it has been added
Else:
End If
Next a
End Sub
Any help would be greatly appreciated
Try this out. What you are looking for is ReDim option to dynamically expand an array before entering data into the newest slot.
Sub ReadingRange()
Dim ARRAY_Multiwage As Variant
Dim ARRAY_TEMP_Multiwage() As String
ARRAY_Multiwage = Sheets("Sheet2").Range("A1").CurrentRegion
Dim a As Long
Dim b As Long
' c is the counter that helps array become larger dynamically
Dim c As Long
c = 0
For a = LBound(ARRAY_Multiwage, 1) To UBound(ARRAY_Multiwage, 1)
If ARRAY_Multiwage(a, 1) = "60021184_2018/36/HE" Then
' change the dimension of the array
ReDim Preserve ARRAY_TEMP_Multiwage(c)
' add data to it
ARRAY_TEMP_Multiwage(c) = ARRAY_Multiwage(a, 1)
' print what was added
Debug.Print ("Ubound is " & UBound(ARRAY_TEMP_Multiwage) & ". Latest item in array is " & ARRAY_TEMP_Multiwage(UBound(ARRAY_TEMP_Multiwage)))
' get ready to expand the array
c = c + 1
Else:
End If
Next a
End Sub
I tend to use a Long data type variable as a counter within the loop for the destination array, that way each time the array is accessed, a new element can be written to. In past I've been steered towards declaring the new array with the maximum upper bound it could hold and resize it once at the end so the below example will follow that.
Option Explicit
Sub ReadingRange()
Dim ARRAY_Multiwage As Variant
Dim ARRAY_TEMP_Multiwage() As Variant
ARRAY_Multiwage = Sheets("Multiwage").Range("A1").CurrentRegion
Dim a As Long
Dim b As Long
Dim ArrayCounter as Long
ArrayCounter = 1 'Or 0, depends on if you are using a zero based array or not
For a = LBound(ARRAY_Multiwage, 1) To UBound(ARRAY_Multiwage, 1)
If ARRAY_Multiwage(a, 1) = "60021184_2018/36/HE" Then
ARRAY_TEMP_Multiwage(ArrayCounter) = ARRAY_Multiwage(a, 1)
Debug.Print ARRAY_TEMP_Multiwage(ArrayCounter)
ArrayCounter = ArrayCounter + 1
Else
'Do nothing
End If
Next a
ReDim Preserve ARRAY_TEMP_Multiwage (1 To (ArrayCounter - 1))
End Sub
Copy Range With Criteria
The following will copy from worksheet Sourceultiwage to worksheet
Targetultiwage both in ThisWorkbook, the workbook containing this
code.
Adjust the values in the constants section including wb.
Additionally you can choose to copy headers (copyHeaders)
The Code
Option Explicit
Sub copyWithCriteria()
' Source
Const srcName As String = "Sourceultiwage"
Const srcFirst As String = "A1"
' Target
Const tgtName As String = "Targetultiwage"
Const tgtFirst As String = "A1"
' Criteria
Const CriteriaColumn As Long = 1
Const Criteria As String = "60021184_2018/36/HE"
' Headers
Const copyHeaders As Boolean = False
' Workboook
Dim wb As Workbook: Set wb = ThisWorkbook
' Write values from Source Range to Source Array.
Dim rng As Range
Set rng = wb.Worksheets(srcName).Range(srcFirst).CurrentRegion
Dim NoR As Long
NoR = WorksheetFunction.CountIf(rng.Columns(CriteriaColumn), Criteria)
Dim Source As Variant: Source = rng.Value
' Write values from Headers Range to Headers Array.
If copyHeaders Then
Dim Headers As Variant: Headers = rng.Rows(1).Value
End If
' Write from Source to Target Array.
Set rng = Nothing
Dim UB1 As Long: UB1 = UBound(Source)
Dim UB2 As Long: UB2 = UBound(Source, 2)
Dim Target As Variant: ReDim Target(1 To NoR, 1 To UB2)
Dim i As Long, j As Long, k As Long
For i = 1 To UB1
If Source(i, CriteriaColumn) = Criteria Then
k = k + 1
For j = 1 To UB2
Target(k, j) = Source(i, j)
Next j
End If
Next i
' Write from Target Array to Target Range.
With wb.Worksheets(tgtName).Range(tgtFirst)
If copyHeaders Then .Resize(, UB2).Value = Headers ' Headers
.Offset(Abs(copyHeaders)).Resize(NoR, UB2).Value = Target ' Data
End With
' Inform user.
MsgBox "Data transferred.", vbInformation, "Success"
End Sub

How to join returned values from named range separated by comma

I've spent hours trying to find out how to join returned values from a named range, but the result is a
run-time error 32 - Type mismatch.
As a newbie I'm still struggling with arrays, so maybe I've overlooked some detail. Thank you for helping me out.
Example: (B1)Benzine, (B2)Diesel, (B3)Hybride -> (E1)Gasoline, (E2)Diesel, (E3)Hybrid
This is the named range:
Another example (to be more clear):
Example 2: (B1)Benzine, (B3)Hybride -> (E1)Gasoline, (E3)Hybrid
Option Explicit
Sub splitter()
Dim i As Long
Dim w As Long
'Dim oWB As Workbook
Dim oWS As Worksheet
Dim oWS9 As Worksheet
Dim rngMOTOR As Range
Dim rngMOTOR2 As Range
Dim arrMOTOR() As Variant
Dim LastRow As Long
'Set oWB = Workbooks("BRONBESTAND.xlsm")
Set oWS = Sheets("ONDERDELEN")
Set oWS9 = Sheets("MOTOR") '5 columns: 1 Short & LONG + 1 NL + 3 Languages !!!!! WARNING
LastRow = oWS.Range("A" & Rows.Count).End(xlUp).Row
For i = 2 To LastRow 'Starting below headers
Set rngMOTOR = oWS.Cells(i, "M") 'MOTOR ...
Set rngMOTOR2 = oWS9.Range("MOTOR") 'MOTOR2: MOTOR - Bronbestand arrPOS = rngPOS2.Value
arrMOTOR = rngMOTOR2.Value
'*********
Dim txt As String
Dim j As Integer
Dim Splitted As Variant
Dim arrMOTORall As Variant
Dim arrMOTORsplit As Variant
Dim Motor3 As String
txt = oWS.Cells(i, "M") 'MOTOR ...
Debug.Print ("txt : ") & i & ": "; txt
If Not IsEmpty(txt) Then
Splitted = Split(txt, ", ")
For j = 0 To UBound(Splitted)
Cells(1, j + 1).Value = Splitted(j)
Debug.Print (" ---> Splitted: ") & Splitted(j)
'**** INSERT *****
For w = LBound(arrMOTOR) To UBound(arrMOTOR)
If arrMOTOR(w, 1) = Splitted(j) Then 'EX: B - Benzine
arrMOTORsplit = (arrMOTOR(w, 4)) '(arrMOTOR(y, 2)) -> 1=SHORT+LONG , 2=NL, 3=FR, 4=EN
Debug.Print (" ---> arrMOTORsplit: ") & i & ": " & arrMOTORsplit
'**** JOIN ****
arrMOTORall = Join(arrMOTORsplit, ", ")
Debug.Print ("arrMOTORall: ") & arrMOTORall
End If
Next w
Next j
End If
Next i
End Sub
Get comma separated strings for each column in named range
I didn't analyze your code, but this should work to receive the first three values joined
"Benzine, Diesel, Hybride" ' e.g. from first column
or
"Gasoline, Diesel, Hybrid" ' e.g. from the fourth column
from a named range "Motor" via the Application.Index function.
Notes
The parameter 0 in this Index function indicates to not choose a specific row, the Parameter ColNo chooses each of your columns in a Loop. A subsequent transposition allows to change the 2 dimensioned array values to a 1-dim array. The Join function needs a 1-dim array and concatenates the chosen column items therein.
Hint: The following sample code uses a fully qualified range reference assuming that you don't call the TestMe procedure from your Personal Macro Library. In the latter case you'd have to change references and workbook identification (not using ThisWorkbook!).
Example code
Option Explicit ' declaration head of your code module
Sub TestMe()
Dim v As Variant, ColNo As Long
' assign first three rows to variant 1-based 2-dim datafield array
v = ThisWorkbook.Worksheets("Motor").[Motor].Resize(3, 4) ' Named range value
' write comma separated list for each column
For ColNo = 1 To 4
Debug.Print Join(Application.Transpose(Application.Index(v, 0, ColNo)), ", ")
Next ColNo
End Sub
EDIT - Flexible Search in ANY ORDER to translate joined lists
This solution allows to return joined search words in any combination using the Application.Index function in an advanced way using row and column arrays as parameters. The main function getSplitters() creates a variant 2-dim array in only three steps without loops and redims and uses two language constants (Const DUTCH and Const ENGLISH).:
assigns data to variant 1-based 2-dim datafield array
gets only the selected rows based on comma separated string values
reduces the same array to Dutch and English columns
Calling Code
Due to your OP the calling code anylyzes all comma separated strings in Column M in your sheet "ONDERDELEN" as far as there are values in column A. This is made by passing these found string values to the main function getSplitters with an innovative approach to get results in only three steps without Loops (see function code below).
Translation is based on values in the named range Motor "B1:E4" in sheet "Motor" where rows comprise different sort of fuel with neighbouring columns for different languages (starting with Dutch in the first column and English in the fourth col).
Note that using VBA it is faster to loop through an array to get values than through a range.
Option Explicit ' declaration head of your code module
Const DUTCH As Integer = 1
Const ENGLISH As Integer = 4
Sub TranslateAnyFuelCombination()
' Purpose: returns comma separated lists in column "M" and translates from Dutch to English
' Example: "Benzine, Hybride, Diesel" (Dutch) gets to "Gasoline, Hybrid, Diesel" in English
Dim s As String
Dim oWS As Worksheet, i&, LastRow&, vMOTOR As Variant
Set oWS = Thisworkbook.Worksheets("ONDERDELEN") ' fully qualified reference
' Get last row of wanted data
LastRow = oWS.Range("A" & Rows.Count).End(xlUp).Row
vMOTOR = oWS.Range("M1:M" & LastRow)
For i = 2 To LastRow 'Starting below headers
Debug.Print getSplitters(vMOTOR(i, 1))
Next i
End Sub
Main function
Function getSplitters(ByVal sRows As String) As String
Dim i As Long, j As Long
Dim v As Variant, a As Variant
' [0] analyze selected rows string, e.g. "Benzine, Hybride, Diesel"
a = getRowAr(sRows) ' -> assign 1-dim Rows Array(1, 3, 2)
' [1] assign data to variant 1-based 2-dim datafield array
v = Application.Transpose(ThisWorkbook.Worksheets("Motor").[Motor]) ' Named range value
' [2] get only selected rows, e.g. 1st, 3rd and 2nd -> in free order (!) Benzine, Hybride, Diesel
v = Application.Transpose(Application.Index(v, _
Application.Evaluate("row(1:" & UBound(v, 2) & ")"), _
a)) ' transposed columns array = selected rows
' [3] reduce to Dutch and English columns
v = Application.Transpose(Application.Index(v, _
Application.Evaluate("row(1:" & (UBound(a) + 1) & ")"), _
Array(DUTCH, ENGLISH))) ' selected columns array (above array retransposed)
' [4] return concatenated strings
getSplitters = Join(Application.Transpose(Application.Transpose(Application.Index(v, 1, 0))), ", ") & " -> " & _
Join(Application.Transpose(Application.Transpose(Application.Index(v, 2, 0))), ", ")
End Function
Two helper functions
Function getRowAr(ByVal sList As String) As Variant
' Purpose: split comma separated list into 1-dim number array in FREE ORDER
' Example: "Benzine, Hybride, Diesel" -> Array(1, 3, 2)
Dim ar, i&
' change words in comma separated list to numbers
ar = Split(Replace(sList, " ", ""), ",")
For i = LBound(ar) To UBound(ar)
ar(i) = val(getNumber(ar(i))) ' change to numbers
Next i
getRowAr = ar ' return
End Function
Function getNumber(ByVal s As String) As Long
' Purpose: replaces dutch search words with corresponding row number
Dim arFuel
' get search words to 1-dim array
arFuel = Application.Index(ThisWorkbook.Worksheets("Motor").[Motor], 0, DUTCH)
' return corresponding number
getNumber = Application.Match(s, arFuel)
End Function
Addendum (Edit due to comment)
The above code works as intended if you are sure that the concatenated search words (or starting parts) actually match else an Error 13 is raised. You can solve this issue in two steps:
Insert an empty first row into your named range Motor (or fill it e.g. with ?, #N/A etc.)
Change the 2nd helper function as follows:
Edited function getNumber()
Function getNumber(ByVal s As String) As Long
' Purpose: replaces dutch search words with corresponding row number
Dim arFuel
' get search words to 1-dim array
arFuel = Application.Index(ThisWorkbook.Worksheets("Motor").[Motor], 0, DUTCH)
' return corresponding number
On Error Resume Next ' provide for not found case
getNumber = Application.Match(s, arFuel, 0) ' find only exact matches
If Err.Number <> 0 Then getNumber = 0 ' could be omitted in case of a zero return
End Function
With 2 arrays this is a possible solution:
Sub TestMe()
Dim inputString As String
Dim arrString As Variant
Dim arrResult As Variant
inputString = "Benzine, Diesel, Hybride"
arrString = Split(inputString, ",")
Dim total As Long: total = UBound(arrString)
ReDim arrResult(total)
Dim i As Long
For i = LBound(arrString) To UBound(arrString)
arrResult(total - i) = Trim(arrString(i))
Next i
Debug.Print Join(arrResult, " ,")
End Sub
However, there is a classic solution of this problem, reversing everything twice:
Sub TestMe()
Dim inputString As String
inputString = "Benzine, Diesel, Hybride"
inputString = StrReverse(inputString)
Dim arr As Variant: arr = Split(inputString, ",")
Dim i As Long
For i = LBound(arr) To UBound(arr)
arr(i) = Trim(StrReverse(arr(i)))
Next i
Debug.Print Join(arr, ", ")
End Sub

Efficient way to create an inverted index in VBA

I am creating an inverted index to get a dictionary of words with an associated list of the line numbers that the word appears on (starting the line numbers and a list of words that appear in a given cell within that line).
I have managed to get some code working for this, but I found dealing with adding to the arrays (the values in the dictionary) to be a little cumbersome and I wonder is there is a more efficient or more elegant way to handle this.
I am open to using arrays, collections or any other data type that can be easily searched to store the list of line numbers in the values of the dictionary. I have pasted a cut down version of my code to demonstrate the core problem below, the question is really just about the BuildInvertedIndex procedure, but the rest is included to try to make it easier to recreate the scenario:
Sub Test()
' minimum included here to demonstrate use of buildInvertedIndex procedure
Dim vRange As Range
Dim vDict As Dictionary
Set vRange = ActiveSheet.Range("F2:F20585")
Set vDict = New Dictionary
BuildInvertedIndex vDict, vRange
' test values returned in dictionary (word: [line 1, ..., line n])
Dim k As Variant, vCounter As Long
vCounter = 0
For Each k In vDict.Keys
Debug.Print k & ": " & ArrayToString(vDict.Item(k))
vCounter = vCounter + 1
If vCounter >= 10 Then
Exit For
End If
Next
End Sub
Sub BuildInvertedIndex(pDict As Dictionary, pRange As Range)
Dim cell As Range
Dim words As Variant, word As Variant, val As Variant
Dim tmpArr() As Long
Dim newLen As Long, i As Long
' loop through cells (one col wide so same as looping through lines)
For Each cell In pRange.Cells
' loop through words in line
words = Split(cell.Value)
For Each word In words
If Not pDict.exists(word) Then
' start line array with first row number
pDict.Add word, Array(cell.Row())
Else
i = 0
If Not InArray(cell.Row(), pDict.Item(word)) Then
newLen = UBound(pDict.Item(word)) + 1
ReDim tmpArr(newLen)
For Each val In tmpArr
If i < newLen Then
tmpArr(i) = pDict.Item(word)(i)
Else
tmpArr(i) = cell.Row()
End If
i = i + 1
Next val
pDict.Item(word) = tmpArr
End If
End If
Next word
Next cell
End Sub
Function ArrayToString(vArray As Variant, _
Optional vDelim As String = ",") As String
' only included to support test (be able to see what is in the arrays)
Dim vDelimString As String
Dim i As Long
For i = LBound(vArray) To UBound(vArray)
vDelimString = vDelimString & CStr(vArray(i)) & _
IIf(vCounter < UBound(vArray), vDelim, "")
Next
ArrayToString = vDelimString
End Function
To run this you will need values in column F of the active sheet (sentences), if you do not already have it you will also need to add a reference to the Microsoft Scripting Runtime in your VBA environment for the dictionary data type to be available (tools -> references -> Microsoft Scripting Runtime).
As you will see from the code this gets a bit messy where I have to insert a new line number into an existing array (that is stored as a value within the dictionary). As I do not know of a way to just extend this array (without clearing the existing values), I have used the variable tmpArr to create an array of the appropriate size and then copy the values one by one from the existing array in the dictionary and then add the current row number to the end. The temporary array is then used to replace the existing value for that key (the current word).
Any advice on this would be greatly appreciated.
I am open to using arrays, collections or any other data type
As I see, using collection instead array would be much simplier:
Sub BuildInvertedIndex(pDict As Dictionary, pRange As Range)
Dim cell As Range
Dim words, word
Dim i As Long
' loop through cells (one col wide so same as looping through lines)
For Each cell In pRange.Cells
' loop through words in line
words = Split(cell.Value)
For Each word In words
If Not pDict.Exists(word) Then
' initialize collection
pDict.Add word, New Collection
End If
'try to add to collection. If row is already in collecton, nothing happend. Storing key makes you sure there're only unique rows
On Error Resume Next
pDict.Item(word).Add Item:=cell.Row, Key:=CStr(cell.Row)
On Error GoTo 0
Next word
Next cell
End Sub
Next step, is to slightly modify ArrayToString to ColToString:
Function ColToString(vCol As Collection, _
Optional vDelim As String = ",") As String
' only included to support test (be able to see what is in the arrays)
Dim vDelimString As String
Dim i As Long
For i = 1 To vCol.Count
vDelimString = vDelimString & CStr(vCol.Item(i)) & _
IIf(i < vCol.Count, vDelim, "")
Next
ColToString = vDelimString
End Function
and the test subroutine (changed only one row - Debug.Print k & ": " & ColToString(vDict.Item(k)) and target range to "F2:F5"):
Sub Test()
' minimum included here to demonstrate use of buildInvertedIndex procedure
Dim vRange As Range
Dim vDict As Dictionary
Set vRange = ActiveSheet.Range("F2:F5")
Set vDict = New Dictionary
BuildInvertedIndex vDict, vRange
' test values returned in dictionary (word: [line 1, ..., line n])
Dim k As Variant, vCounter As Long
vCounter = 0
For Each k In vDict.Keys
Debug.Print k & ": " & ColToString(vDict.Item(k))
vCounter = vCounter + 1
If vCounter >= 10 Then
Exit For
End If
Next
'clean up memory
Set vDict = Nothing
End Sub
RESULT:
UPDATE:
to improve speed of your code you could store range in array (next approach work only with single-column range, but you could easily modify it):
Test sub:
Sub TestWirhArray()
' minimum included here to demonstrate use of buildInvertedIndex procedure
Dim vRange As Range
Dim vDict As Dictionary
Dim myArr As Variant
Set vDict = New Dictionary
Set vRange = ActiveSheet.Range("F2:F20585")
myArr = vRange.Value
BuildInvertedIndexWithArr vDict, myArr, vRange.Row
' test values returned in dictionary (word: [line 1, ..., line n])
Dim k As Variant, vCounter As Long
vCounter = 0
For Each k In vDict.Keys
Debug.Print k & ": " & ColToString(vDict.Item(k))
vCounter = vCounter + 1
If vCounter >= 10 Then
Exit For
End If
Next
'clean up memory
Set vDict = Nothing
End Sub
new version of BuildInvertedIndexWithArr:
Sub BuildInvertedIndexWithArr(pDict As Dictionary, pArr, firstRow As Long)
Dim cell, words, word
Dim i As Long, j As Long
j = firstRow
' loop through cells (one col wide so same as looping through lines)
For Each cell In pArr
' loop through words in line
words = Split(cell)
For Each word In words
If Not pDict.exists(word) Then
' initialize collection
pDict.Add word, New Collection
End If
On Error Resume Next
pDict.Item(word).Add Item:=j, Key:=CStr(j)
On Error GoTo 0
Next word
j = j + 1
Next cell
End Sub

Resources