Remove characters from a word - VBA - arrays

I want to remove characters from a string based on the position. i use the below code and i manage to:
Insert all word letters in an array
Loop the array and replace the letters that i want with nothing
BUT i dont know have to remove nothing from the array, recreate array index and create the word again without those characters.
'Create an array with all letters
For j = 1 To Len(SheetName2)
ReDim Preserve arrWord(k)
arrWord(k) = Mid(SheetName2, j, 1)
k = k + 1
Next j
Counter = (Len(SheetName2) - 31) * 2
'Loop array and replace with nothing
For w = Counter - 1 To 0 Step -2
arrWord(w) = ""
Next w

You don't need an array of the characters. Just read every second character into a new string.
Public Sub OutputEverySecondChar()
Dim TestString As String
TestString = "abcdefghij"
Dim OutputString As String
Dim iChar As Long
For iChar = 1 To Len(TestString) Step 2
OutputString = OutputString & Mid$(TestString, iChar, 1)
Next iChar
Debug.Print OutputString 'output is: acegi
End Sub
Or even better as re-useable function:
Public Sub test()
Debug.Print ReturnEveryXcharcter(InputString:="abcdefghij", FirstChar:=1, xChar:=2, CharLength:=1)
End Sub
Public Function ReturnEveryXcharcter(InputString As String, FirstChar As Long, xChar As Long, CharLength As Long) As String
Dim iChar As Long
For iChar = FirstChar To Len(InputString) Step xChar
ReturnEveryXcharcter = ReturnEveryXcharcter & Mid$(InputString, iChar, CharLength)
Next iChar
End Function
So the function is even more flexible and you can eg use the following to delete every 3ʳᵈ character (beginning with the first):
Debug.Print ReturnEveryXcharcter(InputString:="abcdefghij", FirstChar:=2, xChar:=3, CharLength:=2)
'output will be: bcefhi
Edit according comment:
Then loop until it is less then 31 characters:
Public Sub test()
Dim SheetName As String
SheetName = "1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890"
Do
SheetName = ReturnEveryXcharcter(InputString:=SheetName, FirstChar:=1, xChar:=2, CharLength:=1, LengthLimit:=31)
Loop While Len(SheetName) > 31
Debug.Print SheetName
End Sub

My solution:
If Len(SheetName1) > 31 Then
SheetName2 = Replace(SheetName1, " ", "")
'Third check - Remove letter by letter starting from the beginning
If Len(SheetName2) > 31 Then
Counter = (Len(SheetName2) - 31) * 2
Part_1 = Left(SheetName2, Counter)
Part_2 = Mid(SheetName2, Counter + 1, Len(SheetName2) - (Counter))
Semi_Part = ""
For j = 1 To Counter Step 2
'Use only Part_1 because it ll be the same in all occurance
Semi_Part = Semi_Part + Mid(Part_1, j, 1)
Next j
FinalSheetname = Semi_Part & Part_2
Else
FinalSheetname = SheetName2
End If
Else
FinalSheetname = SheetName1
End If

Just for interest, you can also do this with a worksheet function if you have one of the more recent versions of Excel (O365 or 2016+) that has the CONCAT function.
The "trick" is to create an array that includes the positions of the characters you wish to retain.
In the case of deleting every even space positioned character (retaining the odd positions):
=CONCAT(MID(A1,(ROW(INDEX($A:$A,1,1):INDEX($A:$A,ROUNDUP(LEN(A1)/2,0),1))-1)*2+1,1))
Since this is an array formula, you need to "confirm" it by holding down ctrl + shift while hitting enter. If you do this correctly, Excel will place braces {...} around the formula as observed in the formula bar

Related

Deleting instances of chars using arrays and loops

I have a column where almost every cell is made of a combination of numbers and letters and symbols ("TS-403" or "TSM-7600"). I want every char that's not an integer to be deleted/replaced with an empty string, so that I'm left only with numbers ("403").
I've thought up of two approaches:
I think the best one is to create an array of integers with the numbers 0-9, and then iterate through the cells with a for loop where if the string in a cell contains a char that's not in the array, then that symbol (not the entire cell) should be erased.
Sub fixRequestNmrs()
Dim intArr() as Integer
ReDim intArr(1 to 10)
For i = 0 to 9
intArr(i) = i
Next i
Dim bRange as Range
Set bRange = Sheets(1).Columns(2)
For Each cell in bRange.Cells
if cell.Value
// if cell includes char that is not in the intArr,
// then that char should be deleted/replaced.
...
End Sub()
Perhaps the second approach is easier, which would be to use the Split() function as the '-' is always followed by the numbers, and then have that first substring replaced with "". I'm very confused on how to use the Split() function in combination with a range and a replace funtion though...
For Each cell in bRange.Cells
Cells.Split(?, "-")
...
Digits to Integer Using the Like Operator
The Function
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
' Purpose: Returns an integer composed from the digits of a string.
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Function DigitsToInteger(ByVal SearchString As String) As Long
Dim ResultString As String
Dim Char As String
Dim n As Long
For n = 1 To Len(SearchString)
Char = Mid(SearchString, n, 1)
If Char Like "[0-9]" Then ResultString = ResultString & Char
Next n
If Len(ResultString) = 0 Then Exit Function
DigitsToInteger = CLng(ResultString)
End Function
A Worksheet Example
Sub DigitsToIntegerTEST()
Const FIRST_ROW As Long = 2
' Read: Reference the (single-column) range.
Dim wb As Workbook: Set wb = ThisWorkbook ' workbook containing this code
Dim ws As Worksheet: Set ws = wb.Worksheets("Sheet1")
Dim LastRow As Long: LastRow = ws.Cells(ws.Rows.Count, "B").End(xlUp).Row
If LastRow < FIRST_ROW Then Exit Sub ' no data
Dim rg As Range: Set rg = ws.Range("B2", ws.Cells(LastRow, "B"))
Dim rCount As Long: rCount = rg.Rows.Count
' Read: Return the values from the range in an array.
Dim Data() As Variant
If rCount = 1 Then
ReDim Data(1 To 1, 1 To 1): Data(1, 1) = rg.Value
Else
Data = rg.Value
End If
' Modify: Use the function to replace the values with integers.
Dim r As Long
For r = 1 To rCount
Data(r, 1) = DigitsToInteger(CStr(Data(r, 1)))
Next r
' Write: Return the modifed values in the range.
rg.Value = Data
' To test the results in the column adjacent to the right, instead use:
'rg.Offset(, 1).Value = Data
End Sub
In VBA (Simple)
Sub DigitsToIntegerSimpleTest()
Const S As String = "TSM-7600sdf"
Debug.Print DigitsToInteger(S) ' Result 7600
End Sub
In Excel
=DigitsToInteger(A1)
If you have the CONCAT function, you can do this with a relatively simple formula -- no VBA needed:
=CONCAT(IFERROR(--MID(A1,SEQUENCE(LEN(A1)),1),""))
If you prefer a non-VBA solution in an earlier version of Excel, there is a more complex formula available, but I'd have to go back through my files to locate it.
A tricky function GetVal()
The following function
translates a string into a single characters array arr via help function String2Arr()
isolates them into numeric (category code 6) or non-numeric categories (other) via a tricky execution of Application.Match (here without its 3rd argument which is mostly used for precise search, and by comparing two arrays)
finds the starting position in the original string via Instr()
returns the value of the right substring via Val() (~> see note).
Function GetVal(ByVal s As String) As Double
Dim arr: arr = String2Arr(s): Debug.Print Join(arr, "|")
Dim chars: chars = Split(" ,',+,-,.,0,A", ",")
Dim catCodes: catCodes = Application.Match(arr, chars) 'No 3rd zero-argument!!
Dim tmp$: tmp = Join(catCodes, ""): Debug.Print Join(catCodes, "|")
Dim pos&: pos = InStr(tmp, 6) ' Pos 6: Digits; pos 1-5,7: other symbols/chars
GetVal = Val(Mid(s, pos)) ' calculate value of right substring
End Function
Notes
The Val function can translate a (sub)string with starting digits to a number, even if there are following non-numeric characters.
Help function String2Arr()
Atomizes a string into a single characters array:
Function String2Arr(ByVal s As String)
s = StrConv(s, vbUnicode)
String2Arr = Split(s, vbNullChar, Len(s) \ 2)
End Function
Example call
Dim s As String
s = "aA+*&$%(y#,'/\)!-12034.56blabla"
Debug.Print GetVal(s) ' ~~> 12034.56

Pass of `ByVal` argument to Regex function leads to very slow code , although using Array

I am using this Regex function to (Remove numeric characters from end of string if count of numbers >= 9),
Function Remove_Number_Regex(Text As String) As String
'Remove numbers from end of string if count of numbers(characters) >= 9
With CreateObject("VBScript.RegExp")
.Global = True
.Pattern = "\d{9,}(?=\.\w+$)"
Remove_Number_Regex = .Replace(Text, "")
End With
End Function
I tried on excel (as formula) and it works as it should without any error.
Then I used it inside vba using Array , but I got this error
Compile error: ByRef argument type mismatch
I fixed that error by passing ByVal argument to the declaration of Regex function
Function Remove_Number_Regex(ByVal Text As String) As String
And that leads to a very slow code to finish 18 seconds on (10K row) ,although using any other text function inside the same array takes 0.4 seconds to finish.
In advance, grateful for any helpful comments and answers.
Sub Use_Function_Remove_Number_Regex()
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual
Dim arg As Range, arr
With ActiveSheet
Set arg = .Range("O1", .Cells(.Rows.Count, "R").End(xlUp)) '10k rows
End With
arr = arg.value
Dim r As Long, j As Long
For j = 1 To 4
For r = 1 To UBound(arr)
arr(r, j) = Remove_Number_Regex(arr(r, j))
Next r
Next j
arg.value = arr
Application.Calculation = xlCalculationAutomatic
End Sub
Generally speaking; using regular expressions will slow things down. You are correct that common string-operations are faster. So, why not use them:
Function Remove_Number(Text As String) As String
Dim nr As String: nr = CStr(Val("1" & StrReverse(Split(Text, ".")(0))))
If Len(nr) > 9 Then
Remove_Number = Replace(Text, StrReverse(Mid(nr, 2)) & ".", ".")
Else
Remove_Number = Text
End If
End Function
To test this function based on your previous question:
Sub Test()
Dim arr As Variant: arr = Array("Anomaly - allhelipads1335023398818.doc", "Anomaly - oilpipingW8.doc")
For Each el In arr
Debug.Print Remove_Number(CStr(el))
Next
End Sub
Returns:
The trick used:
Split your input based on the dot, and return the 1st element from the array (zero based);
Reverse the string;
Extract the numeric value when concatenated with a '1' (to prevent trailing zeros to disappear);
Check if length is more than 9 (accounting for the '1') and if so replace the value accordingly.
Note: Depending on your version of Excel, you could just stay away from VBA alltogether. See my answer to your previous question.

How to exctract first and last numbers in a sequence

i foung similiar topic, but i can t apply completily that solution to my needs... I want to upgrade excel workbook at my job by making it more auto-entry capable.
Mostly i use excel functions, but sometimes i need some VBA coding, which im not very familiar with. So my problem is, that i need something like this mentioned on this thread. How to get the first and last numbers in a sequence
I have box numbers in different sequince in ascening order starting from "A4" to X on
Sheet1. Example Box numbers: M004935149,M004935150,M004935151,M004935202,M004935203,M004935204,M004935205, is it possible when i copy&paste(values) to sheet2 from "A4" to X (depenting on number of boxes copied) to make a string, sentence or whatever is called in specific form in some other cells. M004935149-151 // M004935202-205. I used code from topic in link above, it can make half job done but i can t figure it out how to make entry from desired cell range and display them on worksheet, and to display values in desired format. Link of screen shoots from my example is following:
I hope that someone can help. Thanks in advance.
Check this
Option Explicit
Sub test2()
Dim ws As Worksheet
Dim arr() As String, result As String, letter As String, cellValue As String, tempLastElement As String
Dim lastColumn As Long, counter As Long
Dim firstColumn As Integer, targetRow As Integer, i As Integer
Set ws = Worksheets("Sheet1")
firstColumn = 1 'number of first column with target data
targetRow = 1 'number of row with target data
lastColumn = ws.Range(ws.Cells(targetRow, firstColumn), ws.Cells(targetRow, Columns.Count).End(xlToLeft).Columns).Count
ReDim arr(1 To lastColumn - firstColumn + 1)
letter = Left(ws.Cells(targetRow, firstColumn).Value, 1) 'if count of character in start of string will be more 1, replace 1 on to count of characters
For i = 1 To UBound(arr)
cellValue = ws.Cells(targetRow, i).Value
arr(i) = Right(cellValue, Len(cellValue) - 1) 'if count of character in start of string will be more 1, replace 1 on to count of characters
Next i
ReDim sequenceArr(1 To UBound(arr))
sequenceArr(1) = arr(1)
counter = 2
For i = 1 To UBound(arr) - 1
If CLng(arr(i)) + 1 = CLng(arr(i + 1)) Then
tempLastElement = arr(i + 1)
sequenceArr(counter) = tempLastElement
Else
counter = counter + 1
sequenceArr(counter) = arr(i + 1)
counter = counter + 1
End If
Next
ReDim Preserve sequenceArr(1 To counter)
result = ""
counter = 1
For i = 1 To UBound(sequenceArr) - 1
If counter > UBound(sequenceArr) Then Exit For
If result = "" Then
result = letter & sequenceArr(counter) & "-" & Right(sequenceArr(counter + 1), 3)
counter = counter + 2
Else
result = result & "//" & letter & sequenceArr(counter) & "-" & Right(sequenceArr(counter + 1), 3)
counter = counter + 2
End If
Next
ws.Range("D4").Value = result
End Sub
Result on

Count the number of uppercase words in a text review

I want to count the number of uppercase words in a data set of movie reviews (A2:A1001) and paste the results to column E.
I am having trouble getting any response from the sub on the spreadsheet.
Sub UppercaseWordCount()
Dim ArraySplit() As String
Dim X As Integer
Dim Count As Integer
Dim NextWord As String
Dim Line As Integer
Count = 0
ArraySplit = Split("A2:A1001", " ")
Line = 2
'splitting each review into an array and using lower
'and upper bounds with Ucase function to get a count
'len function used to split up words from single characters
For X = LBound(ArraySplit) To UBound(ArraySplit)
NextWord = ArraySplit(X)
If NextWord = UCase(NextWord) And Len(NextWord) >= 2 Then
Count = Count + 1
End If
Next
'calling the sub to column E with a count for each review
Range("E" & Line).Value = Count
Line = Line + 1
End Sub
Try this. You could convert to a custom function if doing repeatedly.
Sub UppercaseWordCount()
Dim ArraySplit() As String
Dim X As Long 'long better than integer
Dim Count As Long
Dim NextWord As String
Dim r As Range
'splitting each review into an array and using lower
'and upper bounds with Ucase function to get a count
'len function used to split up words from single characters
For Each r In Range("A2:A1001") 'loop through defined range
ArraySplit = Split(r, " ") 'split each cell using space delimiter
For X = LBound(ArraySplit) To UBound(ArraySplit) 'everything else as before
NextWord = ArraySplit(X)
If NextWord = UCase(NextWord) And Len(NextWord) >= 2 Then
Count = Count + 1
End If
Next
'calling the sub to column E with a count for each review
r.Offset(, 4).Value = Count 'can use offset rather than defining a new variable
Count = 0 'reset count
Next r
End Sub

vba mapping function behaving unexpectedly

why the function below works sometimes and fails at others?
The function idea
a string gets passed to this function by a user or a procedure
the string will be split by multiple delimiters into an array of words
each word is cleaned from non-letter characters except for the characters (dash and single quote mark inside the word) an example of words having these two char. would be (man's spirit, life-loving)
if the single quote marks are found surrounding the word, they get trimmed too
after processing each word in the string, and if the word is not empty it gets stored into the array Along with it's starting position and end position
the start and end positions refer to the position of that word within the input string and excluding any non-letter characters that might surround the word.
Here's the code
Sub test()
Dim d$: d = ThisDocument.Range.Text
Dim Arr(), i&
Arr = ExtractWordsFromDoc_2(d)
For i = 0 To UBound(Arr)
ThisDocument.Range(Arr(i)(1) - 1, Arr(i)(2)).HighlightColorIndex = wdBrightGreen
Next
End Sub
Function ExtractWordsFromDoc_2(ByRef doc As Document, Optional ByVal Delimiters)
' Take a string, and return it as a one dimensional array of individual arrys, each array
' has three values (single delimited string, start range of delimited str, end range of delimited str)
' the input string is delimited by any of several characters. None of those characters are returned in
' the result. Provide a default list of Delimiters, which Should come from registry.
' But allow override.
'===================================================================================================================
Dim InputString$: InputString = doc.Range.Text
'return an array of empty string when input string is empty
If InputString = "" Then
ExtractWordsFromDoc_2 = Array("", 0, 0)
Exit Function
End If
'===================================================================================================================
Dim DelimitList As Variant, ArrayOfWords() As Variant, TmpArr() As Variant
Dim OneChar$, TempWord$, WordCount&, InputStringLength&, CharIndex&, ArrUbound&
'===================================================================================================================
'if delimiters are missing, We should get these from a Registry
If IsMissing(Delimiters) Then
DelimitList = Chr(34) & Chr(147) & Chr(148) & Chr(32) & "," & "." & vbCr & vbTab & "/" & "!" & "|" & ";" & ")" & "(" & "?"
'Chr(34)= straight double quotes mark
'Chr(147) & Chr(148) =opening and closing double quotes marks
'Chr(32) = space
Else
DelimitList = Delimiters 'user can override if needed
End If
'===================================================================================================================
InputStringLength = Len(InputString) 'get the input string length
For CharIndex = 1 To InputStringLength 'loop through each character
OneChar = VBA.Strings.Mid(InputString, CharIndex, 1) 'Read one character at a time
Select Case InStr(DelimitList, OneChar) 'Test if the character is a delimiter character
Case 0 'it is not a delimiter
TempWord = TempWord & OneChar 'Add the character to the current word
Case Is <> 0, Is = InputStringLength 'it is a delimiter or it is the last character
'if the temp word is not empty and not a quotation mark
If TempWord > "" And Not (TempWord = "'" Or TempWord = Chr(145) Or TempWord = Chr(146)) Then
TmpArr = TrimSingQuotes(TempWord) 'send that word to be cleaned from single quotaion mark
If (Not TmpArr(0) = "") Then 'if the returned word has length, count it
WordCount = WordCount + 1
ArrUbound = WordCount - 1 'set the new upper dimension for the storing array
ReDim Preserve ArrayOfWords(ArrUbound) 'expand storing array when we have a cleaned word with length
'Save new word in the last place inside the array, along with the word start and end ranges
ArrayOfWords(ArrUbound) = Array(TmpArr(0), _
CharIndex - Len(TempWord) + TmpArr(1) - 1, _
CharIndex - Len(TempWord) + TmpArr(2) - 1)
End If
TempWord = "" 'reset the Temp Word
End If
End Select
Next CharIndex
'===================================================================================================================
ExtractWordsFromDoc_2 = ArrayOfWords 'Return the storing array through function name
'do some cleaning
Erase ArrayOfWords
Erase TmpArr
End Function
Sub testTrimSingQuotes()
TrimSingQuotes (Empty)
End Sub
Function TrimSingQuotes(ByVal TempWord$)
'SSQP =starting single quote position
'ESQP = ending single quote position
'==================================================================
If TempWord = "" Then
TrimSingQuotes = Array("", 0, 0)
Exit Function
End If
'==================================================================
Dim SSQP&: SSQP = 1
Dim ESQP&: ESQP = Len(TempWord)
'==================================================================
'trim starting single quotes
Do While (Mid(TempWord, SSQP, 1) = "'" Or Mid(TempWord, SSQP, 1) = Chr(145) Or Mid(TempWord, SSQP, 1) = Chr(146)) And SSQP < ESQP
SSQP = SSQP + 1
Loop
'==================================================================
'trim ending single quotes
Do While (Mid(TempWord, ESQP, 1) = "'" Or Mid(TempWord, ESQP, 1) = Chr(145) Or Mid(TempWord, ESQP, 1) = Chr(146)) And (ESQP > SSQP)
ESQP = ESQP - 1
Loop
'==================================================================
'get the trimmed word
TempWord = Mid(TempWord, SSQP, ESQP - SSQP + 1)
'==================================================================
'test the trimmed word for output
If TempWord > "" And Not (TempWord = "'" Or TempWord = Chr(145) Or TempWord = Chr(146)) Then
TrimSingQuotes = Array(TempWord, SSQP, ESQP)
Else
TrimSingQuotes = Array("", 0, 0)
End If
End Function
To be honest, I didn't spend a lot of time (i.e. none) figuring out why your code isn't working as intended. I suspect it has something to do with calculating your position in the input string.
It is much simpler to leverage the build in Split function to do your heavy lifting, and probably much more performant than relying on the string functions like Instr and Mid. Note that this relies on 2 quirks of the Split function:
First, if you call Split on an empty string, it returns an array with a UBound of -1.
Second, VBA's version of Split doesn't remove empty entries - so, Split("foo", "foo") results in the array { vbNullString, vbNullString }. This is good, because you can tell by the result how many delimiters were in the string based on the size of the resulting array (number of delimiters in the input will always equal the number of array elements minus one. In VBA terms, delimiterCount = UBound(Split(inputString, delimiter)).
Your requirement makes this easy in that your delimiters are all 1 character.
Try something like this:
Private Function MultiSplit(inValue As String, delimiters() As Variant) As Variant()
Dim output() As Variant
Dim bound As Long
ReDim Preserve output(bound)
Dim tokens() As String
Dim index As Long
tokens = Split(inValue, delimiters(0))
If UBound(tokens) = -1 Then
MultiSplit = Array(vbNullString, 0, 0)
Exit Function
End If
'Process each delimiter.
For index = 1 To UBound(delimiters)
tokens = SubSplit(tokens, CStr(delimiters(index)))
Next index
Dim position As Long
For index = LBound(tokens) To UBound(tokens)
If tokens(index) = vbNullString Then
'This means a delimiter was removed, so increment the position to account for it.
position = position + 1
Else
'Resize the output array and write the result for the remaining token.
ReDim Preserve output(bound)
output(bound) = Array(tokens(index), position, position + Len(tokens(index)) - 1)
position = position + Len(tokens(index)) + 1
bound = bound + 1
End If
Next index
MultiSplit = output
End Function
Private Function SubSplit(inValue() As String, delimiter As String) As String()
Dim tokens() As String
Dim substring As Variant
Dim token As Variant
Dim output() As String
output = Split(vbNullString)
For Each substring In inValue
tokens = Split(substring, delimiter)
'Test for an empty token - these need to be preserved in the output.
If UBound(tokens) = -1 Then
ReDim Preserve output(UBound(output) + 1)
Else
For Each token In tokens
ReDim Preserve output(UBound(output) + 1)
output(UBound(output)) = token
Next token
End If
Next substring
SubSplit = output
End Function
Test code:
Private Function TestCode()
Dim delims() As Variant
Dim results() As Variant
Dim test As String
delims = Array(Chr(34), Chr(147), Chr(148), Chr(32), ",", ".", vbCr, vbTab, "/", "!", "|", ";", ")", "(", "?")
test = "foo|||bar,,baz?crux"
results = MultiSplit(test, delims)
Dim result As Variant
For Each result In results
Debug.Print result(0) & vbTab & result(1) & vbTab & result(2)
Next result
End Function
Note that I didn't examine the existing code even far enough to determine if your output positions were 1 or 0 based. The example above is 0 based. If you need 1 based, insert position = 1 after Dim position As Long.
Quote removal is left as an exercise for the reader.

Resources