Excel VBA compare values on multiple rows and execute additional code - arrays

I have the following task:
There are fields in my document, the combination of which needs to be compared, and if they are the same, another field on the same rows needs to be updated.
So far, I add the values in arrays (skipping the first row as header, thus iNum = 2) with select statements per column and concatenate them per row for the comparison.
Dim conc As Range 'Concatenated fields
Dim iconc() As Variant
ReDim iconc(UBound(iMatn) - 1, 1)
For iNum = 2 To UBound(iMatn)
iconc(iNum - 1, 1) = iMatn(iNum, 1) & iVendr(iNum, 1) & iInd1(iNum, 1) & iInd2(iNum, 1) 'Current concatenation
Select Case iNum - 1
Case 2: 'Compare two records
If iconc(iNum - 2, 1) = iconc(iNum - 1, 1) Then 'Compare first and second records
'Execute code to update the two fields from Extra field column
End If
Case 3: 'Compare three records
If AllSame(iconc(iNum - 3, 1), iconc(iNum - 2, 1), iconc(iNum - 1, 1)) Then
'Execute code to update the three fields from Extra field column
End If
I go through each value of the concatenation and check whether it is the same as the previous ones with a Case statement (I don't expect more than 4 or 5 to be the same, even though there could be a couple of hundred lines).
Thus I face two issues:
If there are 3 equal values, for example, the code first jumps to the case for 2. How can I make it so that it skips to the maximum value?
It needs to resume checking after the rows that were already checked. E.g. if the first two are the same, the code should start checking from the third one; basically, it should start from the line after the last of any duplicates that were found.
Example
Image: the code needs to return that there are 3 equal rows (lines 2 to 4), update the respective cells on the "Extra field" column, proceed further (from line 5), return that there are 2 equal rows (lines 6 and 7), update the same as above again, proceed further (from line 8) etc.
Any help will be highly appreciated as I am stuck with this problem.
Thank you all.

To determine how many are in each group, in order to decide how you will update the extra fields column, I would use a Dictionary & Collection object.
eg:
'Set reference to Microsoft Scripting Runtime
' (or use late-binding)
Option Explicit
Sub due()
Dim myDict As Dictionary, col As Collection
Dim i As Long, v As Variant
Dim sKey As String
Dim rTable As Range
Dim vTable As Variant, vResults As Variant
'there are more robust methods of selecting the table range
'depending on your actual layout
'And code will also make a difference if the original range includes
' or does not include the "Extra Field" Column
' Code below assumes it is NOT included in original data
Set rTable = ThisWorkbook.Worksheets("sheet2").Cells(1, 1).CurrentRegion
vTable = rTable
Set myDict = New Dictionary
For i = 2 To UBound(vTable)
sKey = vTable(i, 1) & vTable(i, 2) & vTable(i, 3) & vTable(i, 4)
Set col = New Collection
If Not myDict.Exists(sKey) Then
col.Add Item:=WorksheetFunction.Index(vTable, i, 0)
myDict.Add Key:=sKey, Item:=col
Else
myDict(sKey).Add Item:=WorksheetFunction.Index(vTable, i, 0)
End If
Next i
For Each v In myDict.Keys
Select Case myDict(v).Count
Case 2
Debug.Print v, "Do update for two rows"
Case 3
Debug.Print v, "Do update for three rows"
Case Else
Debug.Print v, "No update needed"
End Select
Next v
End Sub
=>
1234V22341212 Do update for three rows
1234v22351215 No update needed
2234v22361515 Do update for two rows
2234v22361311 No update needed
Although, I would probably use Power Query (available in Windows Excel 2010+ and 365) which can easily group by the four columns and return a count. You can then add a new column depending on that count.
Not knowing the nature of your updating Extra Field and the difference between what happens for 2, 3, 4, ... the same, it is not possible to supply any code for that purpose.
In general, you would
expand the array for each dictionary item
Add the extra column if it's not already there
Do the update
Add that to a pre-dimensioned results array
Write the results array back to the worksheet
Note that this method of working with VBA arrays will execute an order of magnitude faster than doing repeated worksheet accessing, but the code is longer.
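The general steps above can be sketched as follows. This is only a minimal sketch under stated assumptions: the key is built from the first four columns, the "update" is simply the size of the group the row belongs to (a placeholder for whatever the real update is), and the names GroupSizes, RowKey and WriteGroupSizes are hypothetical. It uses a late-bound Scripting.Dictionary, so it assumes Windows Excel, and a separator is put between the key parts so that different value combinations cannot produce the same key.

```vba
Option Explicit

' Builds the comparison key for one row of the table (hypothetical helper).
Function RowKey(vTable As Variant, ByVal i As Long) As String
    RowKey = vTable(i, 1) & "|" & vTable(i, 2) & "|" & _
             vTable(i, 3) & "|" & vTable(i, 4)
End Function

' Returns a 1-based (rows x 1) array holding, for each data row of vTable,
' the number of rows that share the same four-column key.
' Row 1 is treated as a header and receives the assumed caption "Extra field".
Function GroupSizes(vTable As Variant) As Variant
    Dim dict As Object, vResults() As Variant
    Dim i As Long, sKey As String

    Set dict = CreateObject("Scripting.Dictionary") ' late binding, Windows only

    ' First pass: count the rows per key
    For i = 2 To UBound(vTable, 1)
        sKey = RowKey(vTable, i)
        dict(sKey) = dict(sKey) + 1
    Next i

    ' Second pass: fill a pre-dimensioned results array
    ReDim vResults(1 To UBound(vTable, 1), 1 To 1)
    vResults(1, 1) = "Extra field"
    For i = 2 To UBound(vTable, 1)
        vResults(i, 1) = dict(RowKey(vTable, i))
    Next i

    GroupSizes = vResults
End Function

' Writes the results into the column to the right of the table in one go.
' Assumes the table has a header row, at least one data row and four columns.
Sub WriteGroupSizes()
    Dim rTable As Range
    Set rTable = ThisWorkbook.Worksheets("sheet2").Cells(1, 1).CurrentRegion
    rTable.Columns(rTable.Columns.Count).Offset(, 1).Value = GroupSizes(rTable.Value)
End Sub
```

The worksheet is touched only twice, once to read the table and once to write the results column, which is where the speed advantage of the array approach comes from.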

If you use a look ahead to the next row (rather than look behind) you can determine the end of the group and process accordingly.
Option Explicit
Sub CompareRows()
Dim ws As Worksheet, ar, arExtra
Dim lastrow As Long, n As Long, i As Long, k As Long
Dim s As String, sNext As String, rng As Range, iColor As Long
Set ws = ThisWorkbook.Sheets("Sheet1")
With ws
lastrow = .Cells(.Rows.Count, "A").End(xlUp).Row
ar = .Range("A1:D1").Resize(lastrow)
n = 1
For i = 1 To UBound(ar)
' look ahead to next line
If i = UBound(ar) Then
sNext = ""
Else
sNext = ar(i + 1, 1) & "_" & ar(i + 1, 2) & _
"_" & ar(i + 1, 3) & "_" & ar(i + 1, 4)
End If
If i > 1 Then ' skip header
' increment count if matched
If sNext = s Then
n = n + 1
' process group using counter n
Else
If n >= 2 Then
' first row of group
Set rng = .Cells(i, "E").Offset(1 - n)
If n >= 3 Then
iColor = rgb(128, 255, 128) ' green
Else
iColor = rgb(255, 255, 128) ' yellow
End If
'code to update n rows in Extra column
rng.Resize(n).Value = n
rng.Offset(, -4).Resize(n, 5).Interior.color = iColor
k = k + 1 ' group count
End If
n = 1
End If
End If
s = sNext
Next
End With
MsgBox k & " groups found", vbInformation
End Sub


How to get index of Element in two dimensional array?

I have an array with 2 dimensions.
I also have For Each loops which iterate over the elements of these arrays.
How can I get the index of vElement/vElement2 at the point marked by my comment in the code?
I would be very, very thankful if you can help me.
For Each vElement In Table1
For Each vElement2 In Table2
If ws_1.Cells(1, c) = vElement Then
For Row = 3 To lastRow
amountValue = amountValue + ws_1.Cells(Row, c).value
ws_2.Cells(row2, colIlosc) = amountValue
'Here i would love to have index of vElement for example. In my head it would be something like... Index(vElement) or Index(Table1(vElement))
ws_2.Cells(row2, columncodeprod) = vElement2
row2 = row2 + 1
amountValue = 0
Next Row
End If
Next vElement2
Next vElement
Show Indices of an element in a 2-dim Array - the complicated way
If I understand correctly, you are looping through a datafield array via a For Each construction and want to get the current row/column index pair of that same array.
In order to answer your question
"How to get indices of an element in a two dimensional array",
I leave aside that you would get these automatically in a more evident and usual way if you changed the logic by looping through array rows first and inside this loop eventually through array columns - see Addendum *).
To allow a reconstruction of e.g. the 6th array element in the example call below as referring to the current index pair (element i=6 ~> table1(3,2) ~> row:=3/column:=2) it would be necessary
to add an element counter i by incrementing its value by +1 each time you get the next element and
to pass this counter as argument (additionally to a reference to the datafield) to a help function getIndex()
returning results as another array, i.e. an array consisting only of two values: (1) the current array row, (2) the current array column:
Example call
Note: For better readability and in order to condense the answer to the minimum needed (c.f. MCVE) the following example call executes only one For Each loop over the table1 datafield array; you will be in a position to change this to your needs or to ask another question.
Option Explicit ' declaration head of your code module
Sub ShowIndicesOf2DimArray()
Dim table1 ' declare variant 1-based 2-dim datafield
table1 = Sheet1.Range("A2:B4") ' << change to sheets Code(Name)
Dim vElem, i As Long
Dim curRow As Long, curCol As Long ' current row/column number
For Each vElem In table1
i = i + 1 ' increment element counter
curRow = getIndex(table1, i)(1) ' <~ get row index via help function
curCol = getIndex(table1, i)(2) ' <~ get col index via help function
'optional debug info in VB Editors immediate window (here: Direktbereich)
Debug.Print i & ". " & _
" Table1(" & curRow & "," & curCol & ") = " & vElem & vbTab;
Debug.Print ", where curRow|curCol are " & Join(getIndex(table1, i), "|")
Next vElem
End Sub
Help function getIndex() called by above procedure
Function getIndex(table1, ByVal no As Long) As Variant
'Purpose: get 1-based 1-dim array with current row+column indices
ReDim tmp(1 To 2)
tmp(1) = (no - 1) Mod UBound(table1) + 1
tmp(2) = Int((no - 1) / UBound(table1) + 1)
getIndex = tmp
End Function
*) Addendum - "the simple way"
Just the other way round using row and column variables r and c as mentioned above; this allows you to refer to an item simply via table1(r,c):
Sub TheSimpleWay()
Dim table1 ' declare variant 1-based 2-dim datafield
table1 = Sheet1.Range("A2:B4") ' << change to sheets Code(Name)
Dim vElem, i As Long
Dim r As Long, c As Long ' row and column counter
For r = 1 To UBound(table1) ' start by row 1 (1-based!) up to upper boundary in 1st dimension
For c = 1 To UBound(table1, 2) ' start by col 1 (1-based!) up to upper boundary in 2nd dimension
i = i + 1
Debug.Print i & ". " & _
" Table1(" & r & "," & c & ") = " & table1(r, c) & vbTab;
Debug.Print ", where row|col are " & r & "|" & c
Next c
Next r
End Sub
There is NO index in the case you put in discussion...
vElement and vElement2 are Variant variables. They are not objects, so they have no Index property.
When you use a For Each vElement In Table1 loop, VBA starts from the first array element, goes down to the last row of the first column and then does the same for each following column.
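The column-by-column order can be checked with a small experiment. This is a sketch only; ForEachOrder and DemoForEachOrder are hypothetical names, and the array is filled with labels that spell out each element's own row and column.

```vba
Option Explicit

' Joins the elements of a 2-D array in the order in which For Each visits them.
Function ForEachOrder(arr As Variant) As String
    Dim v As Variant, s As String
    For Each v In arr
        s = s & v & " "
    Next v
    ForEachOrder = Trim$(s)
End Function

Sub DemoForEachOrder()
    Dim a(1 To 2, 1 To 3) As Variant
    Dim r As Long, c As Long
    ' Label every element with its own row and column
    For r = 1 To 2
        For c = 1 To 3
            a(r, c) = "r" & r & "c" & c
        Next c
    Next r
    ' The row index changes fastest, i.e. column 1 is exhausted first
    Debug.Print ForEachOrder(a) ' prints: r1c1 r2c1 r1c2 r2c2 r1c3 r2c3
End Sub
```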
When you need to know what you call array 'indexes', you must use For i = 1 To UBound(Table1, 1) followed by For j = 1 To UBound(Table1, 2). In that case you know the row and column of the current array element. We can consider them your pseudo-indexes...
If you really insist on extracting such indexes in an iteration of type For Each vElement In Table1, you must build them yourself. I will try an eloquent code example:
Sub testElemIndex()
Dim sh As Worksheet, Table1 As Variant, vElement As Variant
Dim i As Long, indexRow As Long, indexCol As Long
Set sh = ActiveSheet
sh.Range("C6").value = "TestIndex"
Table1 = sh.Range("A1:E10").value
For Each vElement In Table1
i = i + 1
If vElement = "TestIndex" Then
' For Each walks the array column by column, so both indices follow from the counter.
' Using (i - 1) keeps the last row of each column correct (e.g. i = 20 with 10 rows).
indexRow = (i - 1) Mod UBound(Table1, 1) + 1
indexCol = (i - 1) \ UBound(Table1, 1) + 1
Debug.Print Table1(indexRow, indexCol), indexRow, indexCol: Stop
End If
Next
End Sub
You can calculate the row and column of the array element. The code proves that, using them, the returned array value is exactly the one found...
Does this shed a little more light on the array 'indexes'...?
Dim Table1() As Variant
Dim Table2() As Variant
Table1 = Range(Cells(2, 3), Cells(lastRow, vMaxCol))
Table2 = Range(Cells(2, 1), Cells(lastRow, 1))
Table1 is Variant(1 to 33, 1 to 9)
Table2 is Variant(1 to 33, 1 to 1)
The 33 and 9 are dynamic.

How do I associate a listbox VBA Form array with absolute address

I am populating a listbox in a form using a range as so:
Private Sub UserForm_Initialize()
Names = Range("C6:D" & Cells(Rows.Count, 3).End(xlUp).Row)
For i = LBound(Names, 1) To UBound(Names, 1)
ListBox1.AddItem Names(i, 1) & "-" & Names(i, 2)
Next
OptionButton3.Value = True
End Sub
I need to call the address of each of these items later in my code to act upon; in reality each item in the listbox is to select which rows to act upon by the user placing each item in a different listbox as part of the form.
I have tried to redimension the array like so, with no success due to "Constant Expression Required":
Dim Names(6 To Cells(Rows.Count, 3).End(xlUp).Row, Range("C6:D" & Cells(Rows.Count, 3).End(xlUp).Row))
What is the best way either to associate the address with the array or to record the list of rows?
Think this is what you need:
Code
Private Sub UserForm_Initialize()
' I. Get started
' a) variables
Const iOffset As Long = 5 ' row offset 5, i.e. start at row C6
Dim names, a ' variant datafield arrays
Dim i As Long, n As Long
Dim ws As Worksheet
' b) set worksheet object to memory
Set ws = ThisWorkbook.Worksheets("MySheet") ' << change to your sheet name
' c) listbox layout just for demo
Me.ListBox1.ColumnCount = 2
Me.ListBox1.ColumnWidths = "100;50"
' d) get last row of column C
n = ws.Cells(ws.Rows.Count, 3).End(xlUp).Row
' II. Get values
' a) create 1-based 2-dim variant datafield array
names = ws.Range("C6:D" & n)
' b) concatenate names and define cell address
For i = LBound(names, 1) To UBound(names, 1)
names(i, 1) = names(i, 1) & "-" & names(i, 2)
names(i, 2) = "C" & i + iOffset
Next
' III. Fill Listbox
' get values of names array (1-liner, allows more than 10 columns :-)
ListBox1.List = names
' IV. Test to get 2nd array column into new array a
a = Application.Index(names, 0, 2)
For i = LBound(a) To UBound(a)
Debug.Print a(i, 1)
Next i
' V. clear memory
Set ws = Nothing
End Sub
Note
I added a constant defining your row offset (Const iOffset As Long = 5) and sped up the code by assigning an array to ListBox1.List in one statement instead of adding items one by one (BTW this also allows more than 10 listbox columns).
As @Rory remarked, just adding your row offset to the current ListBox1.ListIndex would be sufficient to get the row number. Note that ListIndex is zero-based, so with data starting at row 6 you add 6 rather than the 5 used with the 1-based array above. In this case you only need to make sure that a list row is selected (i.e. check If ListBox1.ListIndex > -1 Then ... do something).
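The ListIndex idea can be sketched as a small helper. This is a minimal sketch, not part of the original answer: FIRST_DATA_ROW and RowFromListIndex are hypothetical names, and it assumes the list was filled top to bottom from row 6 so that list order and sheet order still match.

```vba
Option Explicit

' Worksheet row that supplied the first list item (ListIndex 0); assumed to be row 6.
Const FIRST_DATA_ROW As Long = 6

' Translates a zero-based ListIndex into the worksheet row it was filled from.
' Returns 0 when nothing is selected (ListIndex = -1).
Function RowFromListIndex(ByVal listIndex As Long) As Long
    If listIndex > -1 Then RowFromListIndex = listIndex + FIRST_DATA_ROW
End Function
```

Inside the form this would be called as RowFromListIndex(Me.ListBox1.ListIndex). Once items are moved to a second listbox their positions no longer match the sheet, so in that case the address stored in the second array column is the safer source.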
I solved the problem by a rather inelegant hack that I don't particularly like using a helper column listing the row number. There are better ways of doing this I'm sure, but here's what I came up with:
Private Sub UserForm_Initialize()
Names = Range("C6:E" & Cells(Rows.Count, 3).End(xlUp).Row)
For i = LBound(Names, 1) To UBound(Names, 1)
ListBox1.AddItem Names(i, 3) & ": " & Names(i, 1) & "-" & Names(i, 2)
Next
OptionButton3.Value = True
End Sub
I then recall the row number in my code as follows once the user has selected which items to act on:
For i = 0 To (ListBox2.ListCount - 1)
Dim itemName() As String
itemName() = Split(ListBox2.list(i), ":")
deviceRow = itemName(0)
Debug.Print "Row number: " + deviceRow
... <SNIP>
It prints like this:
Row number: 10
Row number: 7
Row number: 14
Row number: 9
There must be better ways of doing this, but that was my solution.

Sum up column B based on column C values

I have a quick question: I try to sum up in a table of 4 columns column number 2 if the value in column number 1 AND 3 matches. I found a sample code here on stack overflow, but it counts currently based on column 1. I'm new to VBA and don't know what to change or how to adjust the code to base my calculations on column 1 and 3. Here is the sample code:
Option Explicit
Sub testFunction()
Dim rng As Excel.Range
Dim arrProducts() As String
Dim i As Long
Set rng = Sheet1.Range("A2:A9")
arrProducts = getSumOfCountArray(rng)
Sheet2.Range("A1:B1").Value = Array("Product", "Sum of Count")
' go through array and output to Sheet2
For i = 0 To UBound(arrProducts, 2)
Sheet2.Cells(i + 2, "A").Value = arrProducts(0, i)
Sheet2.Cells(i + 2, "B").Value = arrProducts(1, i)
Next
End Sub
' Pass in the range of the products
Function getSumOfCountArray(ByRef rngProduct As Excel.Range) As String()
Dim arrProducts() As String
Dim i As Long, j As Long
Dim index As Long
ReDim arrProducts(1, 0)
For j = 1 To rngProduct.Rows.Count
index = getProductIndex(arrProducts, rngProduct.Cells(j, 1).Value)
If (index = -1) Then
' create value in array
ReDim Preserve arrProducts(1, i)
arrProducts(0, i) = rngProduct.Cells(j, 1).Value ' product name
arrProducts(1, i) = rngProduct.Cells(j, 2).Value ' count value
i = i + 1
Else
' value found, add to id
arrProducts(1, index) = arrProducts(1, index) + rngProduct.Cells(j, 2).Value
End If
Next
getSumOfCountArray = arrProducts
End Function
Function getProductIndex(ByRef arrProducts() As String, ByRef strSearch As String) As Long
' returns the index of the array if found
Dim i As Long
For i = 0 To UBound(arrProducts, 2)
If (arrProducts(0, i) = strSearch) Then
getProductIndex = i
Exit Function
End If
Next
' not found
getProductIndex = -1
End Function
Sum Column B based on Column A using Excel VBA Macro
Could you please advise me how I can solve this problem. Below you can find a sample picture of my small table. The quantity of the yellow part, for instance, shall be summed up and the second row shall be deleted.
Sample Table - Picture
You said "I try to sum up in a table of 4 columns column number 2", but from your "Sample Table - Picture" I understand that you want to sum up column number 4.
Edited after the OP changed the data range.
Assuming the above, you could try the following:
Option Explicit
Sub main()
On Error GoTo 0
With ActiveSheet '<== set here the actual sheet reference needed
' With .Range("A:D").Resize(.cells(.Rows.Count, 1).End(xlUp).row) '<== here adjust "A:D" to whatever columns range you need
With .Range("A51:D" & .cells(.Rows.Count, "A").End(xlUp).row) '<== here adjust "A:D" to whatever columns range you need
With .Offset(1).Resize(.Rows.Count - 1)
.Offset(, .Columns.Count).Resize(, 1).FormulaR1C1 = "=SUMIFS(C2, C1,RC1,C3, RC3)" '1st "helper column is the 1st column at the right of data columns (since ".Offset(, .Columns.Count)")
.Columns(2).Value = .Offset(, .Columns.Count).Resize(, 1).Value 'reference to 1st "helper" column (since ".Offset(, .Columns.Count)")
.Offset(, .Columns.Count).Resize(, 1).FormulaR1C1 = "=concatenate(RC1,RC3)"
With .Offset(, .Columns.Count + 1).Resize(, 1) '2nd "helper" column is the 2nd column at the right of data columns (since ".Offset(, .Columns.Count + 1)"
.FormulaR1C1 = "=IF(countIF(R1C[-1]:RC[-1],RC[-1])=countif(C[-1],RC[-1]),1,"""")" 'reference to 1st "helper" column (with all those "C[-1]")
.Value = .Value
.SpecialCells(xlCellTypeBlanks).EntireRow.Delete
.Offset(, -1).Resize(, 2).ClearContents ' reference to both "helper" columns: ".Offset(, -1)" references the 1st since it shifts one column to the left from the one referenced in the preceding "With.." statement (which is the 2nd column at the right of data columns) and ".Resize(, 2)" enlarges to enclose the adjacent column to the right
End With
End With
End With
End With
End Sub
It makes use of two "helper" columns, which I assumed could be the two adjacent to the last data column (i.e. if the data columns are "A:D" then the helper columns are "E:F").
Should you need to use different "helper" columns, see the comments about how they are located and change the code accordingly.
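If helper columns are not wanted, the same totals can be accumulated in memory. The following is a different technique from the formula approach above, shown only as a sketch: it sums column 2 for every distinct combination of columns 1 and 3 using a late-bound Scripting.Dictionary (Windows only). SumByTwoKeys, DemoSumByTwoKeys and the range A2:C9 are assumptions for illustration, and the quantities are assumed to be numeric.

```vba
Option Explicit

' Sums column 2 of a 1-based 2-D array for every distinct pair of values
' in columns 1 and 3. Returns a dictionary whose keys are "col1|col3"
' and whose items are the totals. Column 2 is assumed to hold numbers.
Function SumByTwoKeys(data As Variant) As Object
    Dim dict As Object, r As Long, sKey As String
    Set dict = CreateObject("Scripting.Dictionary")
    For r = 1 To UBound(data, 1)
        sKey = data(r, 1) & "|" & data(r, 3)
        ' A missing key is created on first access and starts as Empty (treated as 0)
        dict(sKey) = dict(sKey) + data(r, 2)
    Next r
    Set SumByTwoKeys = dict
End Function

Sub DemoSumByTwoKeys()
    Dim data As Variant, totals As Object, k As Variant
    data = Sheet1.Range("A2:C9").Value ' assumed layout: key, quantity, key
    Set totals = SumByTwoKeys(data)
    For Each k In totals.Keys
        Debug.Print k, totals(k)
    Next k
End Sub
```

Writing the totals back and deleting the surplus rows is left out here because it depends on the sheet layout.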

Redimming a 2d array throws type mismatch

I was working on a solution to another question of mine when I stumbled across this helpful question and answer. However, implementing the answer given by Control Freak over there throws a Type Mismatch error as soon as I exit the function and return to my code on the line: Years = ReDimPreserve(Years, i, 3). I'm not a skilled enough programmer to figure out what is going wrong here, so can anybody shed some light on this?
Here is my code:
Sub DevideData()
Dim i As Integer
Dim Years() As String
ReDim Years(1, 3)
Years(1, 1) = Cells(2, 1).Value
Years(1, 2) = 2
i = 2
ThisWorkbook.Worksheets("Simple Boundary").Activate
TotalRows = ThisWorkbook.Worksheets("Simple Boundary").Range("A100000").End(xlUp).row
For row = 3 To TotalRows
Years = ReDimPreserve(Years, i, 3)
If Not Cells(row, 1).Value = Cells(row - 1, 1).Value Then
Years(i - 1, 3) = row - 1
Years(i, 1) = Cells(row, 1).Value
Years(i, 2) = row
i = i + 1
End If
Next row
End Sub
And here is the function as written by Control Freak:
Public Function ReDimPreserve(aArrayToPreserve, nNewFirstUBound, nNewLastUBound)
ReDimPreserve = False
'check if its in array first
If IsArray(aArrayToPreserve) Then
'create new array
ReDim aPreservedArray(nNewFirstUBound, nNewLastUBound)
'get old lBound/uBound
nOldFirstUBound = UBound(aArrayToPreserve, 1)
nOldLastUBound = UBound(aArrayToPreserve, 2)
'loop through first
For nFirst = LBound(aArrayToPreserve, 1) To nNewFirstUBound
For nLast = LBound(aArrayToPreserve, 2) To nNewLastUBound
'if its in range, then append to new array the same way
If nOldFirstUBound >= nFirst And nOldLastUBound >= nLast Then
aPreservedArray(nFirst, nLast) = aArrayToPreserve(nFirst, nLast)
End If
Next
Next
'return the array redimmed
If IsArray(aPreservedArray) Then ReDimPreserve = aPreservedArray
End If
End Function
I promised a fuller answer. Sorry it is later than I expected:
I got tied up with another problem,
Technique 1, which I was expecting to recommend, did not work as I expected so I added some other techniques which are much more satisfactory.
As I said in my first comment:
Public Function ReDimPreserve(aArrayToPreserve, nNewFirstUBound, nNewLastUBound)
causes aArrayToPreserve to have the default type of Variant. This does not match:
Dim Years() As String
As you discovered, redefining Years as a Variant fixes the problem. An alternative approach would be to amend the declaration of ReDimPreserve so aArrayToPreserve is an array of type String. I would not recommend that approach since you are storing both strings and numbers in the array. A Variant array will handle either strings or numbers while a String array can only handle numbers by converting them to strings for storage and back to numbers for processing.
I tried your macro with different quantities of data and different amendments and timed the runs:
Rows of data | Amendment                          | Duration of run
3,500        | Years() changed to Variant         | 4.99 seconds
35,000       | Years() changed to Variant         | 502 seconds
35,000       | aArrayToPreserve changed to String | 656 seconds
As I said in my second comment, ReDim Preserve is slow for both the inbuilt method and the VBA routine you found. For every call it must:
find space for the new larger array
copy the data from the old array to the new
release the old array for garbage collection.
ReDim Preserve is a very useful method but it must be used with extreme care. Sometimes I find that sizing an array to the maximum at the beginning and using ReDim Preserve to cut the array down to the used size at the end is a better technique. The best techniques shown below determine the number of entries required before sizing the array.
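The "size to the maximum, then cut down once" idea can be sketched like this. It is a minimal sketch with a hypothetical function name, DistinctAdjacent, and it assumes a non-empty one-dimensional input in which equal values sit next to each other, as they do in the grouped data discussed here.

```vba
Option Explicit

' Returns the distinct values of a 1-D array in which equal values are adjacent.
' The result is sized for the worst case once, filled, and trimmed with a
' single ReDim Preserve at the end instead of one resize per new entry.
Function DistinctAdjacent(values As Variant) As Variant
    Dim result() As Variant
    Dim i As Long, n As Long

    ReDim result(1 To UBound(values) - LBound(values) + 1) ' worst case: all distinct
    For i = LBound(values) To UBound(values)
        If n = 0 Then
            ' The first value always starts a new entry
            n = 1
            result(1) = values(i)
        ElseIf values(i) <> result(n) Then
            n = n + 1
            result(n) = values(i)
        End If
    Next i

    ReDim Preserve result(1 To n) ' cut down to the used size
    DistinctAdjacent = result
End Function
```

For example, DistinctAdjacent(Array("a", "a", "b", "c", "c")) yields a three-element array holding a, b and c.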
At the bottom of your routine, I added:
For i = LBound(Years, 1) To LBound(Years, 1) + 9
Debug.Print Years(i, 0) & "|" & Years(i, 1) & "|" & Years(i, 2) & "|" & Years(i, 3)
Next
For i = UBound(Years, 1) - 9 To UBound(Years, 1)
Debug.Print Years(i, 0) & "|" & Years(i, 1) & "|" & Years(i, 2) & "|" & Years(i, 3)
Next
This resulted in the following being output to the Immediate Window:
|||
|AAAA|2|2
|AAAB|3|4
|AAAC|5|7
|AAAD|8|11
|AAAE|12|16
|AAAF|17|22
|AAAG|23|23
|AAAH|24|25
|AAAI|26|28
|AOUJ|34973|34976
|AOUK|34977|34981
|AOUL|34982|34987
|AOUM|34988|34988
|AOUN|34989|34990
|AOUO|34991|34993
|AOUP|34994|34997
|AOUQ|34998|35002
|AOUR|35003|
|||
Since you have called the array Years, I doubt my string values are anything like yours. This does not matter. What matters, is that I doubt this output was exactly what you wanted.
If you write:
ReDim Years(1, 3)
The lower bounds are set to the value specified by the Option Base statement or zero if there is no Option Base statement. You have lower bounds for both dimensions of zero which you do not use. This is the reason for the “|||” at the top. There is another “|||” at the end which means you are creating a final row which you are not using. The final used row does not have an end row which I assume is a mistake.
When I can divide a routine into steps, I always validate the result of one step before advancing to the next. That way, I know any problems are within the current step and not the result of an error in an earlier step. I use Debug.Print to output to the Immediate Window most of the time. Only if I want to output a lot of diagnostic information will I write to a text file. Either way, blocks of code like mine are a significant aid to rapid debugging of a macro.
I would never write ReDim Years(1, 3). I always specify the lower bound so as to be absolutely clear. VBA is the only language I know where you can specify any value for the lower bound (providing it is less than the upper bound) so I will specify non-standard values if it is helpful for a particular problem. In this case, I see no advantage to a lower bound other than one so that is what I have used.
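The difference between implicit and explicit lower bounds is easy to see by counting elements. This is a small sketch with hypothetical names (CountElements2D, DemoLowerBounds); it assumes the module has no Option Base statement.

```vba
Option Explicit

' Number of elements in a 2-D array, whatever its bounds are.
Function CountElements2D(arr As Variant) As Long
    CountElements2D = (UBound(arr, 1) - LBound(arr, 1) + 1) * _
                      (UBound(arr, 2) - LBound(arr, 2) + 1)
End Function

Sub DemoLowerBounds()
    Dim implicitBounds() As Variant
    Dim explicitBounds() As Variant

    ReDim implicitBounds(1, 3)           ' really 0 To 1, 0 To 3
    ReDim explicitBounds(1 To 1, 1 To 3) ' exactly 1 row and 3 columns

    Debug.Print CountElements2D(implicitBounds) ' 8 elements, row 0 and column 0 included
    Debug.Print CountElements2D(explicitBounds) ' 3 elements
End Sub
```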
With two-dimensional arrays it is conventional to have columns as the first dimension and rows as the second. One exception is for arrays read from or to be written to a worksheet, for which the dimensions are the other way round. You have rows as the first dimension. If you had used the conventional sequence you could have used the ReDim Preserve method, thereby avoiding the ReDimPreserve function and the problem of non-matching types.
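With rows in the last dimension, the built-in ReDim Preserve is enough to grow the array by one record. The following is only a sketch of that layout; AppendRecord and DemoAppendRecord are hypothetical names, and the three-field record (value, start row, end row) mirrors the Years array from the question.

```vba
Option Explicit

' Appends one record to a 2-D array whose records are stored in the LAST
' dimension, the only dimension that ReDim Preserve is able to resize.
Sub AppendRecord(records() As Variant, ByVal value As Variant, ByVal startRow As Long)
    Dim n As Long
    n = UBound(records, 2) + 1
    ReDim Preserve records(1 To 3, 1 To n) ' lower bounds must stay unchanged
    records(1, n) = value
    records(2, n) = startRow
End Sub

Sub DemoAppendRecord()
    Dim Years() As Variant
    ReDim Years(1 To 3, 1 To 1)
    Years(1, 1) = "AAAA"
    Years(2, 1) = 2
    AppendRecord Years, "AAAB", 3
    Debug.Print UBound(Years, 2), Years(1, 1), Years(1, 2)
End Sub
```

This removes the type mismatch problem, but growing the array one record at a time is still slow for large inputs, so it suits small arrays best.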
Technique 1
I expected this to be the fastest technique. Experts advise us to avoid “re-inventing the wheel”. That is, if Excel has a routine that will do what you want, don’t code an alternative in VBA. However, I have found a number of examples where this is not true and I discovered this technique was one of them.
The obvious technique here is to use Filter, then create a range of the visible rows using SpecialCells and finally process each row in this range. I have used this technique very successfully to meet other requirements but not here.
I did not know the VBA to select unique rows so started the macro recorder and filtered my test data from the keyboard to get:
Range("A1:A35000").AdvancedFilter Action:=xlFilterInPlace, Unique:=True
My past uses of Filter have all converted to AutoFilter which I have found to give acceptable performance. This converted to AdvancedFilter which took 20 seconds both from the keyboard and from VBA. I do not know why it is so slow.
The second problem was that:
Set RngUnique = .Range(.Cells(1, 1), .Cells(RowLast, 1)) _
.SpecialCells(xlCellTypeVisible)
was rejected as “too complicated”.
Not being able to get the visible rows as a range means the benefits of Filter are not really available. I have counted the visible rows to simulate having RngUnique.Rows.Count. This shows the technique which has always worked with AutoFilter. If AdvancedFilter had reported the unique rows in an acceptable time I might have investigated this problem but under the circumstances it does not seem worth the effort.
The macro demonstrating this technique is:
Option Explicit
Sub Technique1()
' * Avoid using meaningless names like i. Giving every variable a meaningful
' name is helpful during development and even more helpful when you return
' to the macro in six months for maintenance.
' * My naming convention is use a sequence of keywords. The first keyword
' identifies what type of data the variable holds. So "Row" means it holds
' a row number. Each subsequent keyword narrows the scope. "RowSb" is a
' row of the worksheet "Simple Boundary" and "RowYears" is a row of the Years
' array. "RowSbCrnt" is the current row of the worksheet "Simple Boundary".
' * I can look at macros I wrote years ago and know what all the variables are.
' You may not like my convention. Fine, develop your own but do not
' try programming with random names.
' * Avoid data type Integer which specifies a 16-bit whole number and requires
' special processing on 32 and 64-bit computers. Long is now the recommended
' data type for whole numbers.
Dim NumRowsVisible As Long
Dim RowSbCrnt As Long
Dim RowSbLast As Long
Dim RowYearsCrnt As Long
Dim TimeStart As Double
Dim Years() As Variant
TimeStart = Timer ' Get the time as seconds since midnight to nearest .001
' of a second
' This can save significant amounts of time if the macro amends the
' screen or switches between workbooks.
Application.ScreenUpdating = False
With Worksheets("Simple Boundary")
' Rows.Count avoids having to guess how many rows will be used
RowSbLast = .Cells(Rows.Count, "A").End(xlUp).Row
' Hide non-unique rows
With .Range(.Cells(1, 1), .Cells(RowSbLast, 1))
.AdvancedFilter Action:=xlFilterInPlace, Unique:=True
End With
' Count number of unique rows.
' It is difficult to time small pieces of code because OS routines
' can execute at any time. However, this count takes less than .5
' of a second with 35,000 rows.
NumRowsVisible = 0
For RowSbCrnt = 2 To RowSbLast
If Not .Rows(RowSbCrnt).Hidden Then
NumRowsVisible = NumRowsVisible + 1
End If
Next
' Use count to ReDim array to final size.
ReDim Years(1 To 3, 1 To NumRowsVisible)
RowYearsCrnt = 1
Years(1, RowYearsCrnt) = .Cells(2, 1).Value
Years(2, RowYearsCrnt) = 2
For RowSbCrnt = 3 To RowSbLast
If Not .Rows(RowSbCrnt).Hidden Then
Years(3, RowYearsCrnt) = RowSbCrnt - 1
RowYearsCrnt = RowYearsCrnt + 1
Years(1, RowYearsCrnt) = .Cells(RowSbCrnt, 1).Value
Years(2, RowYearsCrnt) = RowSbCrnt
End If
Next
' Record final row for final string
Years(3, RowYearsCrnt) = RowSbLast
.ShowAllData ' Clear AdvancedFilter
End With
Application.ScreenUpdating = True
Debug.Print "Duration: " & Format(Timer - TimeStart, "#,##0.000")
' Output diagnostics
For RowYearsCrnt = 1 To 9
Debug.Print Years(1, RowYearsCrnt) & "|" & _
Years(2, RowYearsCrnt) & "|" & _
Years(3, RowYearsCrnt) & "|"
Next
' Note that rows are now in the second dimension hence the 2 in UBound(Years, 2)
For RowYearsCrnt = UBound(Years, 2) - 9 To UBound(Years, 2)
Debug.Print Years(1, RowYearsCrnt) & "|" & _
Years(2, RowYearsCrnt) & "|" & _
Years(3, RowYearsCrnt) & "|"
Next
End Sub
The output to the Immediate Window is:
Duration: 20.570
AAAA|2|2|
AAAB|3|4|
AAAC|5|7|
AAAD|8|11|
AAAE|12|16|
AAAF|17|22|
AAAG|23|23|
AAAH|24|25|
AAAI|26|28|
AOUI|34970|34972|
AOUJ|34973|34976|
AOUK|34977|34981|
AOUL|34982|34987|
AOUM|34988|34988|
AOUN|34989|34990|
AOUO|34991|34993|
AOUP|34994|34997|
AOUQ|34998|35002|
AOUR|35003|35008|
As you can see the last row is correct. A duration of 20 seconds is better than the 8 minutes of your technique but I am sure we can do better.
Technique 2
The next macro is similar to the last one but it counts the unique rows rather than use AdvancedFilter to hide the non-unique rows. This macro has a duration of 1.5 seconds with 35,000 rows. This demonstrates that counting how many rows are required for an array in a first pass of the data is a viable approach. The diagnostic output from this macro is the same as above.
Sub Technique2()
Dim NumRowsUnique As Long
Dim RowSbCrnt As Long
Dim RowSbLast As Long
Dim RowYearsCrnt As Long
Dim TimeStart As Double
Dim Years() As Variant
TimeStart = Timer ' Get the time as seconds since midnight to nearest .001
' of a second
With Worksheets("Simple Boundary")
RowSbLast = .Cells(Rows.Count, "A").End(xlUp).Row
' Count number of unique rows.
' Assume all data rows are unique until find otherwise
NumRowsUnique = RowSbLast - 1
For RowSbCrnt = 3 To RowSbLast
If .Cells(RowSbCrnt, 1).Value = .Cells(RowSbCrnt - 1, 1).Value Then
NumRowsUnique = NumRowsUnique - 1
End If
Next
' * Use count to ReDim array to final size.
' * Note that I have defined the columns as the first dimension and rows
' as the second dimension to match convention. Had I wished, this would
' have allowed me to use the standard ReDim Preserve which can only
' adjust the last dimension. However, this does not match the
' syntax of Cells which has the row first. It may have been better to
' maintain your sequence so the two sequences were the same.
ReDim Years(1 To 3, 1 To NumRowsUnique)
RowYearsCrnt = 1
Years(1, RowYearsCrnt) = .Cells(2, 1).Value
Years(2, RowYearsCrnt) = 2
For RowSbCrnt = 3 To RowSbLast
If .Cells(RowSbCrnt, 1).Value <> .Cells(RowSbCrnt - 1, 1).Value Then
Years(3, RowYearsCrnt) = RowSbCrnt - 1
RowYearsCrnt = RowYearsCrnt + 1
Years(1, RowYearsCrnt) = .Cells(RowSbCrnt, 1).Value
Years(2, RowYearsCrnt) = RowSbCrnt
End If
Next
' Record final row for final string
Years(3, RowYearsCrnt) = RowSbLast
End With
Debug.Print "Duration: " & Format(Timer - TimeStart, "#,##0.000")
' Output diagnostics
For RowYearsCrnt = 1 To 9
Debug.Print Years(1, RowYearsCrnt) & "|" & _
Years(2, RowYearsCrnt) & "|" & _
Years(3, RowYearsCrnt) & "|"
Next
' Note that rows are now in the second dimension hence the 2 in UBound(Years, 2)
For RowYearsCrnt = UBound(Years, 2) - 9 To UBound(Years, 2)
Debug.Print Years(1, RowYearsCrnt) & "|" & _
Years(2, RowYearsCrnt) & "|" & _
Years(3, RowYearsCrnt) & "|"
Next
End Sub
Technique 3
The next macro is only slightly changed from the last.
Firstly, I have replaced the literals used to identify the column numbers in worksheets and arrays with constants such as:
Const ColYrEnd As Long = 3
Under my naming convention ColYrEnd = Column of Year array holding range End hence:
Years(ColYrEnd, RowYearsCrnt) = RowCvCrnt - 1
instead of Years(3, RowYearsCrnt) = RowCvCrnt - 1
This makes no difference to the compiled code but makes the source code easier to understand because you do not have to remember what columns 1, 2 and 3 hold. More importantly, if you ever have to rearrange the columns, updating the constants is the only change required. If you ever have to search through a long macro replacing every use of 2 as a column number (while ignoring any other use of 2) by 5, you will know why this is important.
Secondly, I have used:
ColValues = .Range(.Cells(1, ColSbYear), _
.Cells(RowSbLast, ColSbYear)).Value
to import column 1 to an array. The code that read the values from the worksheet now reads them from this array. Array access is much faster than worksheet access so this reduces the runtime from 1.5 seconds to .07 seconds.
The revised code is:
Sub Technique3()
Const ColCvYear As Long = 1
Const ColSbYear As Long = 1
Const ColYrYear As Long = 1
Const ColYrStart As Long = 2
Const ColYrEnd As Long = 3
Const RowSbDataFirst As Long = 2
Const RowCvDataFirst As Long = 2
Dim ColValues As Variant
Dim NumRowsUnique As Long
Dim RowCvCrnt As Long
Dim RowSbCrnt As Long
Dim RowSbLast As Long
Dim RowYearsCrnt As Long
Dim TimeStart As Double
Dim Years() As Variant
TimeStart = Timer ' Get the time as seconds since midnight to nearest .001
' of a second
With Worksheets("Simple Boundary")
RowSbLast = .Cells(Rows.Count, ColSbYear).End(xlUp).Row
ColValues = .Range(.Cells(1, ColSbYear), _
.Cells(RowSbLast, ColSbYear)).Value
' * The above statement imports all the data from column 1 as a two dimensional
' array into a Variant. The Variant is then accessed as though it is an array.
' * The first dimension has one entry per row, the second dimension has one entry
' per column which is one in this case. Both dimensions will have a lower bound
' of one even if the first row or column loaded is not one.
End With
' Count number of unique rows.
' Assume all data rows are unique until find otherwise
NumRowsUnique = UBound(ColValues, 1) - 1
For RowCvCrnt = RowCvDataFirst + 1 To UBound(ColValues, 1)
If ColValues(RowCvCrnt, ColCvYear) = ColValues(RowCvCrnt - 1, ColCvYear) Then
NumRowsUnique = NumRowsUnique - 1
End If
Next
' I mentioned earlier that I was unsure if having rows and columns in the
' conventional sequence was correct. I am even less sure here where array
' ColValues has been loaded from a worksheet and the rows and columns are
' not in the conventional sequence.
ReDim Years(1 To 3, 1 To NumRowsUnique)
RowYearsCrnt = 1
Years(ColYrYear, RowYearsCrnt) = ColValues(RowCvDataFirst, ColCvYear)
Years(ColYrStart, RowYearsCrnt) = RowCvDataFirst
For RowCvCrnt = RowCvDataFirst + 1 To UBound(ColValues, 1)
If ColValues(RowCvCrnt, ColCvYear) <> ColValues(RowCvCrnt - 1, ColCvYear) Then
Years(ColYrEnd, RowYearsCrnt) = RowCvCrnt - 1
RowYearsCrnt = RowYearsCrnt + 1
Years(ColYrYear, RowYearsCrnt) = ColValues(RowCvCrnt, ColCvYear)
Years(ColYrStart, RowYearsCrnt) = RowCvCrnt
End If
Next
' Record final row for final string
Years(ColYrEnd, RowYearsCrnt) = UBound(ColValues, 1)
Debug.Print "Duration: " & Format(Timer - TimeStart, "#,##0.000")
' Output diagnostics
For RowYearsCrnt = 1 To 9
Debug.Print Years(ColYrYear, RowYearsCrnt) & "|" & _
Years(ColYrStart, RowYearsCrnt) & "|" & _
Years(ColYrEnd, RowYearsCrnt) & "|"
Next
' Note that rows are now in the second dimension hence the 2 in UBound(Years, 2)
For RowYearsCrnt = UBound(Years, 2) - 9 To UBound(Years, 2)
Debug.Print Years(ColYrYear, RowYearsCrnt) & "|" & _
Years(ColYrStart, RowYearsCrnt) & "|" & _
Years(ColYrEnd, RowYearsCrnt) & "|"
Next
End Sub
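The run-finding logic of Technique 3 can also be packaged as a function that works on any single-column array, which makes it easy to try without a worksheet. This is only a sketch: FindRuns and its parameter names are my own invention, and it assumes a 1-based, single-column 2D array (the shape Range.Value returns) with at least one data row.

```vba
' Hypothetical helper, not part of the macros above.
' Values       : 1-based, single-column 2D array as returned by Range.Value
' FirstDataRow : index of the first data row (2 when row 1 is a header)
' Returns a 2D array (1 To 3, 1 To RunCount):
'   row 1 = value, row 2 = first index of the run, row 3 = last index of the run
Function FindRuns(Values As Variant, FirstDataRow As Long) As Variant
    Dim RowCrnt As Long
    Dim RunCount As Long
    Dim RunCrnt As Long
    Dim Runs() As Variant

    ' First pass: count the runs so the result array is sized exactly once.
    RunCount = 1
    For RowCrnt = FirstDataRow + 1 To UBound(Values, 1)
        If Values(RowCrnt, 1) <> Values(RowCrnt - 1, 1) Then RunCount = RunCount + 1
    Next

    ' Second pass: record the value, start and end of each run.
    ReDim Runs(1 To 3, 1 To RunCount)
    RunCrnt = 1
    Runs(1, RunCrnt) = Values(FirstDataRow, 1)
    Runs(2, RunCrnt) = FirstDataRow
    For RowCrnt = FirstDataRow + 1 To UBound(Values, 1)
        If Values(RowCrnt, 1) <> Values(RowCrnt - 1, 1) Then
            Runs(3, RunCrnt) = RowCrnt - 1
            RunCrnt = RunCrnt + 1
            Runs(1, RunCrnt) = Values(RowCrnt, 1)
            Runs(2, RunCrnt) = RowCrnt
        End If
    Next
    ' Close the final run.
    Runs(3, RunCrnt) = UBound(Values, 1)

    FindRuns = Runs
End Function
```

With a worksheet, the first argument would be something like .Range(.Cells(1, 1), .Cells(RowSbLast, 1)).Value, exactly as ColValues is loaded in Technique 3.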
Other techniques
I considered introducing other techniques but I decided they were not useful for this requirement. Also, this answer is already long enough. I have provided much for you to think about and more would just be overload. As stated above I have reduced the run time for 35,000 rows from 8 minutes to 20 seconds to 1.5 seconds to .07 seconds.
Work slowly through my macros. I hope I have provided an adequate explanation of what each is doing. Once you know a statement exists, it is generally easy to look it up, so I have not explained the individual statements in much detail. Come back with questions as necessary.
As stated earlier in comments, ReDim Preserve is an expensive call when working with large datasets and is generally avoided. Here is some commented code that should perform as desired. Tested on a dataset with 200,000 rows, it took less than 5 seconds to complete. Tested on a dataset with 1000 rows, it took less than 0.1 seconds to complete.
The code uses a Collection to get the unique values out of column A, and then builds the array based on those unique values and outputs the results to another sheet. In your original code, there was nowhere that the resulting array was output, so I just made something up and you'll need to adjust the output section as needed.
Sub tgr()
Dim ws As Worksheet
Dim rngYears As Range
Dim collUnqYears As Collection
Dim varYear As Variant
Dim arrAllYears() As Variant
Dim arrYearsData() As Variant
Dim YearsDataIndex As Long
Set ws = ActiveWorkbook.Sheets("Simple Boundary")
Set rngYears = ws.Range("A1", ws.Cells(Rows.Count, "A").End(xlUp))
If rngYears.Cells.Count < 2 Then Exit Sub 'No data
Set collUnqYears = New Collection
With rngYears
.CurrentRegion.Sort rngYears, xlAscending, Header:=xlYes 'Sort data by year in column A
arrAllYears = .Offset(1).Resize(.Rows.Count - 1).Value 'Put list of years in array for faster calculation
'Get count of unique years by entering them into a collection (forces uniqueness)
For Each varYear In arrAllYears
On Error Resume Next
collUnqYears.Add CStr(varYear), CStr(varYear)
On Error GoTo 0
Next varYear
'Size the arrYearsData array appropriately
ReDim arrYearsData(1 To collUnqYears.Count, 1 To 3)
'arrYearsData column 1 = Unique Year value
'arrYearsData column 2 = Start row for the year
'arrYearsData column 3 = End row for the year
'Loop through unique values and populate the arrYearsData array with desired information
For Each varYear In collUnqYears
YearsDataIndex = YearsDataIndex + 1
arrYearsData(YearsDataIndex, 1) = varYear 'Unique year
arrYearsData(YearsDataIndex, 2) = .Find(varYear, .Cells(1), , , , xlNext).Row 'Start Row
arrYearsData(YearsDataIndex, 3) = .Find(varYear, .Cells(1), , , , xlPrevious).Row 'End Row
Next varYear
End With
'Here is where you would output your results
'Your original code did not output results anywhere, so adjust sheet and start cell as necessary
With Sheets("Sheet2")
.UsedRange.Offset(1).ClearContents 'Clear previous result data
.Range("A2").Resize(UBound(arrYearsData, 1), UBound(arrYearsData, 2)).Value = arrYearsData
.Select 'This will show the output sheet so you can see the results
End With
End Sub
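The uniqueness trick used above relies on a Collection refusing a second item with the same key. A minimal sketch of just that step, separated from the worksheet code (UniqueValues is a name I made up; note that Collection keys are strings and are compared without regard to case):

```vba
' Hypothetical helper: returns a Collection holding each distinct value once.
' Values can be any array (or anything For Each can iterate).
Function UniqueValues(Values As Variant) As Collection
    Dim Result As New Collection
    Dim Element As Variant

    For Each Element In Values
        ' Adding a key that already exists raises run-time error 457,
        ' so duplicates are skipped simply by ignoring the error.
        On Error Resume Next
        Result.Add Element, CStr(Element)
        On Error GoTo 0
    Next Element

    Set UniqueValues = Result
End Function
```

For example, UniqueValues(Array(2001, 2001, 2002)).Count gives the number of distinct years, which is what the ReDim of arrYearsData needs.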
As you mentioned in the comments, if you are going to continue this way you definitely need to move that redim inside the if statement:
If Not Cells(row, 1).Value = Cells(row - 1, 1).Value Then
Years = ReDimPreserve(Years, i, 3)
Years(i - 1, 3) = row - 1
Years(i, 1) = Cells(row, 1).Value
Years(i, 2) = row
i = i + 1
End If
I think this redimming multi-dimensional arrays is overkill for you. I have a few recommendations:
Ranges
I notice that you are using 2 values to represent the start of a range and end of a range (years(i,2) is the start and years(i,3) is the end). Instead why not just use an actual range?
Create a range variable called startNode and, when you find the end of the range, create a Range object like this: Range(startNode, endNode).
Your code will look something like this:
Sub DevideData()
Dim firstCell As Range
Dim nextRange As Range
Dim TotalRows As Long
Dim row As Long
' Activate the sheet first so the unqualified Cells calls below refer to it
ThisWorkbook.Worksheets("Simple Boundary").Activate
Set firstCell = Cells(2, 1)
TotalRows = ThisWorkbook.Worksheets("Simple Boundary").Range("A100000").End(xlUp).row
For row = 3 To TotalRows
If Not Cells(row, 1).Value = Cells(row - 1, 1).Value Then
' The value changed, so the previous group ends on the row above
Set nextRange = Range(firstCell, Cells(row - 1, 1))
Set firstCell = Cells(row, 1)
End If
Next row
End Sub
1D Array
Now you do not need to store 3 values! Just an array of ranges, which you can redim like this:
Dim years() As Range
'Do Stuff'
ReDim Preserve years(1 to i)
set years(i) = nextRange
i = i + 1
Note that the only reason that ReDimPreserve was created was so that you can redim both dimensions of a 2D array (normally you can only change the last dimension). With a 1D array you can freely redim without any troubles! :)
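To see that restriction for yourself, here is a small sketch (the function name is hypothetical). It resizes the last dimension of a 2D array successfully, then tries to resize the first dimension and returns the error number VBA raises.

```vba
' Hypothetical demo: returns the run-time error number raised when
' ReDim Preserve tries to change a dimension other than the last one.
Function TryResizeFirstDimension() As Long
    Dim Grid() As Variant

    ReDim Grid(1 To 3, 1 To 2)
    Grid(3, 2) = "kept"

    ' Allowed: only the upper bound of the last dimension changes,
    ' and the existing contents survive.
    ReDim Preserve Grid(1 To 3, 1 To 5)

    ' Not allowed: the first dimension changes while Preserve is used.
    On Error Resume Next
    ReDim Preserve Grid(1 To 4, 1 To 5)
    TryResizeFirstDimension = Err.Number   ' 9 = Subscript out of range
    On Error GoTo 0
End Function
```

That error 9 is the same "Subscript out of range" you get from an index outside the array, which is why it can be confusing when it comes from a ReDim.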
For Each Loop
Lastly I recommend that you use a for each loop instead of a regular for loop. It makes your intentions for the loop more explicit which makes your code more readable.
Dim firstCell as Range
Dim lastUniqueValue as Variant
Dim lastCell as Range
Dim iCell as Range
Set firstCell = Cells(3,1)
lastUniqueValue = firstCell.Value
Set lastCell = ThisWorkbook.Worksheets("Simple Boundary").Range("A100000").End(xlUp)
For Each iCell in Range(firstCell, lastCell)
If iCell.Value <> lastUniqueValue Then
lastUniqueValue = iCell.Value
'Do Stuff
End If
Next
Hope this helps! :)

VBA array trouble error 9 script out of range

Thanks for reading my question,
I was given a list of about 250k entries along with names and sign in dates to accompany each entry to show when they logged. My task is to find out which users signed in on consecutive days, how often and how many times.
i.e. Bob smith had 3 consecutive days one time, 5 consecutive days 3 times.
joe smith had 8 consecutive days once, 5 consecutive days 8 times
etc
I am brand new to VBA and have been struggling to write a program to do this.
code:
Option Explicit
Option Base 1
Sub CountUUIDLoop()
Dim UUID As String
Dim Day As Date
Dim Instance() As Variant
ReDim Instance(50, 50)
Dim CountUUID As Variant
Dim q As Integer
Dim i As Long
Dim j As Long
Dim f As Integer
Dim g As Integer
Dim LastRow As String
f = 1
q = 1
g = 2
LastRow = Cells.Find("*", [A1], , , xlByRows, xlPrevious).Row
For i = q To LastRow
UUID = Cells(i, "A")
Instance(f, 1) = UUID
g = 2
For j = 1 To LastRow
If UUID = Cells(j, "A") Then
Instance(f, g) = Cells(j, "B")
g = g + 1
End If
Next j
f = f + 1
q = g - 1
Next i
End Sub
The goal of this code is to go through the entries and store them in the array 'Instance' such that the 2D array would look like [UUID1, B1, B2, B3]
[UUID2, B1, B2, B3, B4]
[UUID3, B1, B2]
Where the UUID is the user, the B1 represents the date that user signed in, b2 would be the next date they signed in etc. Some users have more or less dates than others.
My main issue has come with setting up the array as I keep getting different errors around it. I'm not sure how to define this 2D array partly because there will be over 30 000 rows, each with 1->85 columns.
Any help is appreciated, let me know if anything needs further clarification. Once again this is my first time using VBA so I'm sorry ahead of time if everything I've been doing is wrong.
P.S. I used ReDim Instance(50, 50) as a test to see if I could make it work by predefining but the same errors occurred. Thanks again!
As far as I understand from your question and code, you have a table with following structure:
..............A.................B
1........LOGIN1.......DATE1
2........LOGIN1.......DATE2
3........LOGIN1.......DATE3
4........LOGIN2.......DATE4
5........LOGIN2.......DATE5
6........LOGIN3.......DATE6
And your task in this code was to fetch data in a 2D structure like this:
RESULT_ARRAY-
............................|-LOGIN1-
............................................|-DATE1
............................................|-DATE2
............................................|-DATE3
............................|-LOGIN2-
............................................|-DATE4
............................................|-DATE5
............................|-LOGIN3-
............................................|-DATE6
First of all, you need to know what goes wrong in your code. Please see comments in code below to find out the reason of error:
Option Explicit
Option Base 1
Sub CountUUIDLoop()
Dim UUID As String
Dim Day As Date
Dim Instance() As Variant ' Declaring the Variant type explicitly is optional: Variant is the default data type in VBA, so you could just write "Dim Instance()"
ReDim Instance(50, 50) ' The limit of 50 may be the reason why your script runs into the "out of range" error.
' Remember that this operation means your array now has the following dimensions: [1..50,1..50]
Dim CountUUID As Variant 'Just write like this: "Dim CountUUID"
Dim q As Integer ' you can describe all your variables in one line, like this: "Dim q as Integer,f as Integer,g as Integer"
Dim i As Long
Dim j As Long
Dim f As Integer
Dim g As Integer
Dim LastRow As String ' first mistake: you are using String data type to perform numeric operations below in your FOR-cycle
f = 1 ' Because of Option Base 1 your Instance array index starts at {1}, so this is the first element.
q = 1 ' The reason to use this variable is not obvious. You could just use constant in FOR cycle below and avoid unnecessary variables.
g = 2 ' You could remove this line, because this var is set every time in cycle below (before second FOR)
LastRow = Cells.Find("*", [A1], , , xlByRows, xlPrevious).Row ' The alternative here is to use predefined Excel constants, like this:
' "Cells.SpecialCells(xlLastCell).Row".
'If LastRow is bigger, than {50} - this could be a reason of your Error.
For i = q To LastRow ' Here goes comparison between String and Integer data type, not good thing, but type conversion should work fine here.
UUID = Cells(i, "A") ' No need for the intermediate variable here, just assign the value from this cell to Instance directly:
' Like this: Instance(f, 1) = Cells(i, "A")
Instance(f, 1) = UUID
g = 2
For j = 1 To LastRow ' It is another point, why "q" variable is not necessary. :)
If UUID = Cells(j, "A") Then ' You could use your Instance value instead of UUID here, like this: "Instance(f, 1)"
Instance(f, g) = Cells(j, "B") 'If the "g" variable somehow becomes bigger than {50}, this could be a reason for your Error.
g = g + 1
End If
Next j
f = f + 1
q = g - 1 ' "q" variable is not used after this row, so it is a strange unnecessary action
Next i
End Sub
Now that we have some information about the error, let me make some improvements to your code. I am certain that the simplest solution would use your Excel worksheets to store and count the data, with VBA automating in the background. But if you need the code with arrays, let's do this! :)
Option Explicit ' This option requires every variable to be declared before it is used. Without it, VBA silently creates any undeclared variable the first time it is used. Using this option is good practice because it catches, for example, typos in variable names.
Option Base 1 ' This option sets the default index value for arrays in your code. If this option is not set, the default index value will be {0}.
Const HEADER_ROW = 1 ' It is a number to identify your header row, next row after this one will be counted as a row with data
Const UUID = 1 ' ID of element in our "Instance" array to store UUID
Const DATES_ID = 2 ' ID of element in our "Instance" array to store dates
Function CountUUIDLoop()
ActiveSheet.Copy After:=ActiveSheet 'Copy your worksheet to new one to ensure that source data will not be affected.
Dim Instance(), dates() ' "Instance" will be used to store all the data, "dates" will be used to store and operate with dates
ReDim Instance(2, 1) ' Set first limitation to the "Instance" array in style [[uuid, dates],id]
ReDim dates(1) ' Set first limitation to the "dates" array
Instance(DATES_ID, 1) = dates
Dim CountUUID
Dim i as Long, j as Long, f as Long, active_element_id As Long 'Integer is quite enough to perform our array manipulations, but the Long data type is recommended (please refer to UPDATE2 below)
i = HEADER_ROW + 1 ' Set first row to fetch data from the table
active_element_id = 1 ' Set first active element number
With ActiveSheet ' Ensure that we are working on active worksheet.
While .Cells(i, 1) <> "" 'If operated cell is not empty - continue search for data
If i > HEADER_ROW + 1 Then
active_element_id = active_element_id + 1 ' increment active element number
ReDim Preserve Instance(2, active_element_id) ' Assign new limitation (+ 1) for our Instances, don't forget to preserve our results.
ReDim dates(1) ' Set first limitation to the "dates" array
Instance(DATES_ID, active_element_id) = dates
End If
Instance(UUID, active_element_id) = .Cells(i, 1) ' save UUID
dates(1) = .Cells(i, 2) ' save first date
j = i + 1 ' Set row to search next date from as next row from current one.
While .Cells(j, 1) <> "" 'If operated cell is not empty - continue search for data
If .Cells(j, 1) = .Cells(i, 1) Then
ReDim Preserve dates(UBound(dates) + 1) ' Expand "dates" array, if new date is found.
dates(UBound(dates)) = .Cells(j, 2) ' Save new date value.
.Cells(j, 1).EntireRow.Delete 'Remove row with found date to exclude double checking in future
Else
j = j + 1 ' If uuid is not found, try next row
End If
Wend
Instance(DATES_ID, active_element_id) = dates
i = i + 1 'After all the dates are found, go to the next uuid
Wend
.Cells(i, 1) = "UUID COUNT" ' This will write you a "UUID COUNT" text in A column below all the rest of UUIDs on active worksheet
.Cells(i, 2) = i - HEADER_ROW - 1 ' This will write you a count of UUIDs in B column below all the rest of UUIDs on active worksheet
End With
CountUUIDLoop = Instance ' This ensures that your function (!) returns an array with all UUIDs and dates inside.
End Function
This function writes the count of your UUIDs at the bottom of the active sheet and returns an array like this:
[[LOGIN1][1], [[DATE1][DATE2][DATE3]][1]]
I have used this order of storing data to avoid an error when expanding multidimensional arrays. That error is similar to yours, so you could read more about it in these questions:
How can I "ReDim Preserve" a 2D Array in Excel 2007 VBA so that I can add rows, not columns, to the array?
Excel VBA - How to Redim a 2D array?
ReDim Preserve to a Multi-Dimensional Array in Visual Basic 6
Anyway, you could use my function output ("Instance" array) to perform your further actions to find what you need or even display your uuid-dates values. :)
Good luck in your further VBA actions!
UPDATE
Here is the test procedure showing how to work with the above function's results:
Sub test()
Dim UUIDs ' The result of the "CountUUIDLoop" function will be stored there
Dim i as Long, j As Long ' Simple numeric variables used as indexes to run through our resulting array
UUIDs = CountUUIDLoop ' assign function result to a new variable
Application.DisplayAlerts = False ' Disable alerts from Excel
ActiveSheet.Delete ' Delete TMP worksheet
Application.DisplayAlerts = True ' Enable alerts from Excel
If UUIDs(UUID, 1) <> Empty Then ' This ensures that UUIDs array is not empty
Sheets.Add After:=ActiveSheet ' Add new worksheet after active one to put data into it
With ActiveSheet 'Ensure that we are working with active worksheet
.Cells(HEADER_ROW, 1) = "UUIDs/dates" ' Put the header into the "HEADER_ROW" row
For i = 1 To UBound(UUIDs, 2) ' run through all the UUIDs
.Cells(1 + HEADER_ROW, i) = UUIDs(UUID, i) ' Put UUID under the header
For j = 1 To UBound(UUIDs(DATES_ID, i)) ' run through all the dates per UUID
.Cells(j + 1 + HEADER_ROW, i) = UUIDs(DATES_ID, i)(j) ' put date into column below the UUID
Next j ' Go to next date
Next i ' Go to next UUID
.Cells.EntireColumn.AutoFit ' This will make all columns' width to fit its contents
End With
Else
MsgBox "No UUIDs are found!", vbCritical, "No UUIDs on worksheet" ' Show message box if there are no UUIDs in function result
End If
End Sub
So, if you have the following data on the active worksheet:
..............A.................B
1........LOGIN1.......DATE1
2........LOGIN1.......DATE2
3........LOGIN1.......DATE3
4........LOGIN2.......DATE4
5........LOGIN2.......DATE5
6........LOGIN3.......DATE6
...this sub will put UUIDs on the new sheet like this:
..............A.................B.................C
1........UUIDs/dates
2........LOGIN1........LOGIN2........LOGIN3
3........DATE1.........DATE4.........DATE6
4........DATE2.........DATE5
5........DATE3
UPDATE2
It is recommended to use the Long data type instead of Integer whenever an integer (whole number) variable is needed. Long is slightly faster, it has a much wider range and costs no additional memory. Here is the proof link:
MSDN:The Integer, Long, and Byte Data Types
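The "wider range" point matters for a sheet with 250k rows, because an Integer cannot hold a value above 32767. A small sketch to illustrate (the function name is made up for this example):

```vba
' Hypothetical demo: returns True if incrementing an Integer past its
' maximum raises run-time error 6 (Overflow), while a Long is unaffected.
Function IntegerCounterOverflows() As Boolean
    Dim SmallCounter As Integer
    Dim BigCounter As Long

    SmallCounter = 32767             ' Largest value an Integer can hold
    BigCounter = 32767

    BigCounter = BigCounter + 1      ' Fine: Long goes up to 2,147,483,647

    On Error Resume Next
    SmallCounter = SmallCounter + 1  ' Raises error 6: Overflow
    IntegerCounterOverflows = (Err.Number = 6)
    On Error GoTo 0
End Function
```

So any counter that can follow the row number, like f, g and q in the original code, is safer declared As Long.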
I would recommend using collections and a dictionary instead of arrays. The below code will structure the data in a way that is very similar to the way you wanted it.
Sub collect_logins_by_user_()
'you need to enable the microsoft scripting runtime
'in tools - references
'assuming unique ids are in col A and there are no gaps
'and assuming dates in col B and there are no gaps
'
'The expected runtime for this is O(n) and I have used similar code on more than 250,000 records.
'It still takes a while obviously, but should run just fine.
'
'The data will be structured in the following format:
'{id_1: [d_1, d_2,...], id_2: [d_3, d_4,...], ...}
Dim current_id As Range: Set current_id = ActiveSheet.Range("A2") 'modify range as required
Dim logins_by_users As New Dictionary
While Not IsEmpty(current_id)
If Not logins_by_users.Exists(current_id.Value) Then
Set logins_by_users(current_id.Value) = New Collection
End If
logins_by_users(current_id.Value).Add current_id.Offset(ColumnOffset:=1).Value
Set current_id = current_id.Offset(RowOffset:=1)
Wend
'Once you have the data structured, you can do whatever you want with it.
'like printing it to the immediate window.
Dim id_ As Variant
For Each id_ In logins_by_users
Debug.Print "======================================================="
Debug.Print id_
Dim d As Variant
For Each d In logins_by_users(id_)
Debug.Print d
Next d
Next id_
Debug.Print "======================================================="
End Sub
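If you would rather not set the reference, the same grouping can be done with late binding, and reading from an array instead of cell by cell. This is only a sketch under my own assumptions: GroupByKey is a made-up name, Data is a 1-based 2D array with the id in column 1 and the date in column 2 (as Range("A2:B" & lastRow).Value would return), and CreateObject("Scripting.Dictionary") requires Windows.

```vba
' Hypothetical helper: groups column 2 of Data by the key in column 1.
' Returns a Scripting.Dictionary of the form {id: Collection of values}.
Function GroupByKey(Data As Variant) As Object
    Dim Groups As Object
    Dim r As Long

    ' Late binding: no "Microsoft Scripting Runtime" reference needed
    Set Groups = CreateObject("Scripting.Dictionary")

    For r = LBound(Data, 1) To UBound(Data, 1)
        ' Create an empty Collection the first time a key is seen
        If Not Groups.Exists(Data(r, 1)) Then
            Groups.Add Data(r, 1), New Collection
        End If
        Groups(Data(r, 1)).Add Data(r, 2)
    Next r

    Set GroupByKey = Groups
End Function
```

The result can be looped over with For Each in the same way as logins_by_users above.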
I have written a bit of code that does something along the lines of what you are trying to do - it prints to the debug window the different numbers of consecutive logs per user, separated by commas.
This code makes use of the dictionary object - which essentially is an associative array where the indexes are not restricted to numbers like they are in arrays, and offers a couple of convenient features to manipulate data that arrays don't.
I have tested this on a sheet including user ids in column A and log dates in column B - including headers - and this looks to work fine. Feel free to give it a try.
Sub mysub()
Dim dic As Object
Dim logs As Variant
Dim myval As Long
Dim mykey As Variant
Dim nb As Long
Dim i As Long
Set dic = CreateObject("Scripting.dictionary")
'CHANGE TO YOUR SHEET REFERENCE HERE
For Each cell In Range(Cells(2, 1), Cells(Worksheets("Sheet8").Rows.count, 1).End(xlUp))
mykey = cell.Value
myval = cell.Offset(0, 1)
If myval <> 0 Then
On Error GoTo ERREUR
dic.Add mykey, myval
On Error GoTo 0
End If
Next cell
For Each Key In dic
logs = Split(dic(Key), ",")
logs = sortArray(logs)
i = LBound(logs) + 1
nb = 1
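Do While i <= UBound(logs)
If CLng(logs(i)) = CLng(logs(i - 1)) + 1 Then
nb = nb + 1
Else
' The run ended: record it if it spanned more than one day
If nb > 1 Then tot = tot & "," & CStr(nb)
nb = 1
End If
i = i + 1
Loop
' Record a run that continues up to the last log entry
If nb > 1 Then tot = tot & "," & CStr(nb)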
If tot <> "" Then dic(Key) = Right(tot, Len(tot) - 1)
Debug.Print "User: " & Key & " - Consecutive logs: " & dic(Key)
tot = ""
mys = ""
Next Key
Exit Sub
ERREUR:
If myval <> 0 Then dic(mykey) = dic(mykey) & "," & CStr(myval)
Resume Next
End Sub
Function sortArray(a As Variant) As Variant
For i = LBound(a) + 1 To UBound(a)
j = i
Do While a(j) < a(j - 1)
temp = a(j - 1)
a(j - 1) = a(j)
a(j) = temp
j = j - 1
If j = 0 Then Exit Do
Loop
Next i
sortArray = a
End Function
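Since the question asks how many times each streak length occurs ("5 consecutive days 3 times"), the comma-separated list could be replaced by a count per length. A rough sketch, assuming the dates for one user are already sorted and passed as whole-day serial numbers; CountStreaks is a hypothetical name and the Scripting.Dictionary requires Windows.

```vba
' Hypothetical helper: Days is a sorted array of whole-day serial numbers
' for a single user. Returns a Scripting.Dictionary mapping
' streak length (Long) -> number of times a streak of that length occurred.
' Streaks of length 1 (isolated days) are not counted.
Function CountStreaks(Days As Variant) As Object
    Dim Streaks As Object
    Dim i As Long
    Dim RunLength As Long

    Set Streaks = CreateObject("Scripting.Dictionary")
    RunLength = 1

    For i = LBound(Days) + 1 To UBound(Days)
        Select Case CLng(Days(i)) - CLng(Days(i - 1))
            Case 0
                ' Same day logged twice: neither extends nor ends the streak
            Case 1
                RunLength = RunLength + 1
            Case Else
                ' Gap found: record the streak that just ended
                If RunLength > 1 Then Streaks(RunLength) = Streaks(RunLength) + 1
                RunLength = 1
        End Select
    Next i

    ' Record a streak that runs up to the last entry
    If RunLength > 1 Then Streaks(RunLength) = Streaks(RunLength) + 1

    Set CountStreaks = Streaks
End Function
```

Reading a missing key from the dictionary creates it with an empty value, which is why Streaks(RunLength) + 1 works for the first occurrence as well.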

Resources