Flatten table in Excel using VBA - database

I'm looking to use VBA to transform a raw data extract into a flattened table for querying. Currently, I have a raw data table in Excel that summarizes the status of Phases A, B, and C for a given Engagement (note: some Engagements may not have data for all 3 phases).
Row| EngagementID | A_date | A_status | B_date | B_status | C_date | C_status
1 | 201 | 2/2 | Approved | | | |
2 | 201 | | | 3/5 | Approved | |
3 | 201 | | | | | 4/1 | Pending
4 | 203 | 2/12 | Submitted| | | |
5 | 203 | | | 2/20 | Approved | |
6 | 207 | 2/5 | Approved | | | |
I need to flatten the table to look like something this:
Row| EngagementID | Date | Status
1 | 201 | 2/2 | Approved
2 | 201 | 3/5 | Approved
3 | 201 | 4/1 | Pending
4 | 203 | 2/12| Submitted
5 | 203 | 2/20| Approved
6 | 207 | 2/5 | Approved
Additionally, I'd like to add a column for the Phase so that I can "tag" each row with the Phase (A, B, or C) that it is associated with.
I've tried the following VBA code, but it flattens the table vertically, as opposed to horizontally (merging 3 rows into 1, as opposed to 3 columns into 1):
Private Sub test()
Dim R As Long
Dim i As Integer
i = 1
R = 2
Count = 0
Do While Not IsEmpty(Range("A" & R))
If Cells(R, 1).Value = Cells(R + 1, 1).Value Then
Count = Count + 1
Else
i = 1
Do While i <= Count
Cells(R - Count, 2 + (2 * i)).Value = Cells(R - Count + i, 2 + (2 * i))
Cells(R - Count, 3 + (2 * i)).Value = Cells(R - Count + i, 3 + (2 * i))
i = i + 1
Loop
i = 1
Do While i <= Count
Rows(R - Count + i).Delete
i = i + 1
R = R - 1
Loop
Count = 0
End If
R = R + 1
Loop
End Sub
Please help!!

Try this code
Sub Test()
Dim a As Variant
Dim b As Variant
Dim i As Long
Dim j As Long
Dim k As Long
a = Range("A1").CurrentRegion.Value
ReDim b(1 To UBound(a, 1) * 3, 1 To 3)
For i = 2 To UBound(a, 1)
For j = 2 To UBound(a, 2) Step 2
If a(i, j) <> "" And a(i, j + 1) <> "" Then
k = k + 1
b(k, 1) = a(i, 1)
b(k, 2) = a(i, j)
b(k, 3) = a(i, j + 1)
End If
Next j
Next i
Range("J1").Resize(k, UBound(b, 2)).Value = b
End Sub

Related

VBA Concatenate string with previous string value (previous value changes every few cells)

I am working on building data historical of Financial System.
I need a macro that reads if the cell doesn´t contain uppercase, then concatenates the text of this cell with the previous value that was uppercases.
| FIRST STEP |
|------------------|
| | Name |
| --- | ---------- |
| 1 |DISPONIBLE: |
| 2 |Caja |
| 3 |Bancos |
| 4 |INVERSIONES:|
| 5 |Temporales |
| 6 |Largoplazo |
| 7 |CARTERA: |
| 8 |Crédito |
| | LAST STEP |
|----------------------------|
| | Name |
| --- | -------------------- |
| 1 |DISPONIBLE: |
| 2 |disponibleCaja |
| 3 |disponibleBancos |
| 4 |INVERSIONES: |
| 5 |inversionesTemporales |
| 6 |inversionesLargoplazo |
| 7 |CARTERA: |
| 8 |carteraCrédito |
This should do it:
Sub t()
Dim rng As Range, cel As Range
Dim capitalWord As String
Set rng = Range("A1:A8") 'Adjust as needed
For Each cel In rng
If IsUppercase(cel.Value) Then
capitalWord = Replace(cel.Value, ":", "")
Else
cel.Value = LCase(capitalWord) & WorksheetFunction.Proper(cel.Value)
End If
Next cel
End Sub
Public Function IsUppercase(AString As String) As Boolean
IsUppercase = (UCase(AString) = AString)
End Function
Basically it checks if the cell is uppercase, and if so, sets that as the capitalWord. Otherwise, adds the lowercase of that word to the cell value.

Consolidate Rows Using Dictionary In VBA with Sum and Max/Min to certain columns

I am attempting to consolidate rows of data that share several attributes (e.g. order number and product number). For example: Order 12345 has 4 rows of data all with the same product number but each row has a unique revenue amount.
I want to have an end result where all 4 rows are consolidated into 1 row which has the revenue amounts summed from the 4 original lines. Additionally each row has a start and end date. I need the final consolidated row to have the earliest (MIN) start date and the last (MAX) end date as the end result in the consolidated line.
My target lines to be consolidated are not always consecutive in the data and thus I figured a dictionary would be they way to go (in which a unique ID (used to identify lines that need to be consolidated) are my "key"). I found a similar question on here and used code from that answer to get to where I am at now.
I have a "unique ID" which determines what lines need to be consolidated together (if the ID is the same, the lines need to be consolidated). The Unique ID is a concatenation of 4 columns (order #, product, a contract name and state).
My current code is:
Dim oRange As Range
Dim oTarget As Range
Dim oRow As Range
Dim oRowAmend As Range
Dim oDic As Scripting.Dictionary
Dim sIndex As String
Dim vKey As Variant
Dim vItem As Variant
Dim LastRow As Long
Worksheets("ODD Data").Activate
LastRow = Worksheets("ODD Data").Range("A" & Rows.Count).End(xlUp).Row
'Define the source range
Set oRange = Sheets("ODD Data").Range("A2:CE" & LastRow)
'Define where the updated data will be printed.
Set oTarget = Sheets("Consolidated ODD Data").Range("A2:CE2")
Set oDic = New Scripting.Dictionary
For Each oRow In oRange.Rows
'Define Indexes (what is checked for duplicates)
sIndex = oRow.Cells(82) 'Column 82 is my unique ID column
'If the index exists, sum the values
If oDic.Exists(sIndex) Then
Set oRowAmend = oRow
oRowAmend.Cells(36).Value = oRow.Cells(36).Value + oRowAmend.Cells(36).Value 'Column 36 is the column which has the revenue amount I wish to sum
oDic.Remove (sIndex)
oDic.Add sIndex, oRowAmend
'If does not exist, only store their values
Else
oDic.Add sIndex, oRow
End If
Next oRow
For Each vKey In oDic
vItem = oDic.Item(vKey)
oTarget = vItem
'Points oTarget for next row
Set oTarget = oTarget.Offset(1, 0)
Next vKey
End Sub
Currently the code runs without error and I get the expected number of lines output onto the new "Consolidated ODD Data" Sheet. Column AJ (36) is not summing however. It appears that whatever the value in column AJ is for the last line to be consolidated is simply doubled (not added to the other lines needing to be consolidated). This is happening not only on the output sheet but ALSO on the original data set (which I do not want).
I have no idea how to apply the MIN / MAX Functionality to my start and end dates. Any help on this (or any part) is much appreciated. The start date is in Column O and End Date is in Column P. All the other columns will be identical between the rows I am consolidating.
I was wondering if I need to story an array as an item in my dictionary? I am new to this and getting a bit over my head!
Many thanks in advance!
Consider this data in Sheet1 starting at A1:
| Row | Key | Order | Product | Contract | State | Value | Start | End |
|-----|-----------------|-------|---------|----------|-------|-------|----------|----------|
| 1 | aaa|123|foo|bar | aaa | 123 | foo | bar | 11 | 27-11-17 | 08-01-18 |
| 2 | bbb|456|foo|bar | bbb | 456 | foo | bar | 11 | 22-11-17 | 23-12-17 |
| 3 | aaa|123|foo|bar | aaa | 123 | foo | bar | 10 | 30-11-17 | 05-01-18 |
| 4 | bbb|456|foo|bar | bbb | 456 | foo | bar | 13 | 03-12-17 | 08-01-18 |
| 5 | aaa|456|foo|bar | aaa | 456 | foo | bar | 27 | 04-12-17 | 24-12-17 |
| 6 | bbb|123|foo|bar | bbb | 123 | foo | bar | 6 | 12-12-17 | 26-12-17 |
| 7 | bbb|123|foo|bar | bbb | 123 | foo | bar | 9 | 10-12-17 | 30-12-17 |
| 8 | bbb|456|foo|bar | bbb | 456 | foo | bar | 11 | 04-12-17 | 06-01-18 |
| 9 | bbb|456|foo|bar | bbb | 456 | foo | bar | 24 | 28-11-17 | 23-12-17 |
| 10 | bbb|456|foo|bar | bbb | 456 | foo | bar | 27 | 26-11-17 | 06-01-18 |
| 11 | aaa|123|foo|bar | aaa | 123 | foo | bar | 3 | 27-11-17 | 07-01-18 |
| 12 | aaa|123|foo|bar | aaa | 123 | foo | bar | 1 | 02-12-17 | 24-12-17 |
| 13 | bbb|456|foo|bar | bbb | 456 | foo | bar | 26 | 01-12-17 | 03-01-18 |
| 14 | aaa|123|foo|bar | aaa | 123 | foo | bar | 26 | 05-12-17 | 31-12-17 |
| 15 | aaa|123|foo|bar | aaa | 123 | foo | bar | 24 | 08-12-17 | 21-12-17 |
Where the formula for Key is:
=C2&"|"&D2&"|"&E2&"|"&F2
You are well advised (per #RonRosenfeld) to use a Class for the dictionary value, e.g. Class1 (just create a new class in VB editor) and then enter:
Option Explicit
Public ConsolidatedRevenue As Double
Public FirstDate As Date
Public LastDate As Date
Then you can use code like this (with intellisense support):
Dim obj As Class1
Set obj = New Class1
obj.ConsolidatedRevenue = 99
obj.ConsolidatedRevenue = obj.ConsolidatedRevenue + 99
So, the following code will:
loop every row
if the key is not in the dictionary then add the key and a new Class1 with the data for that row
if the key is not new, then get the existing data and increment revenue and compare dates to get start and end for the consolidated item
Code:
Option Explicit
Sub Consolidate()
Dim ws As Worksheet
Dim rngData As Range
Dim objDic As Object
Dim lngCounter As Long
Dim varKey As Variant
Dim dblRevenue As Double
Dim dtStart As Date
Dim dtEnd As Date
Dim objData As Class1
Set ws = ThisWorkbook.Worksheets("Sheet1") '<-- change to your worksheet
Set rngData = ws.Range("A2:I16") '<-- change to your range with last row etc
Set objDic = CreateObject("Scripting.Dictionary") '<-- late bound reference to dictionary
For lngCounter = 1 To rngData.Rows.Count
varKey = rngData.Cells(lngCounter, 2).Value '<-- the key
dblRevenue = CDbl(rngData.Cells(lngCounter, 7).Value) '<-- the revenue
dtStart = CDate(rngData.Cells(lngCounter, 8).Value) '<-- the start date on row
dtEnd = CDate(rngData.Cells(lngCounter, 9).Value) '<-- the end date on row
' test for key in dictionary
If objDic.Exists(varKey) Then
' get existing data packet
Set objData = objDic(varKey)
' increment revenue
objData.ConsolidatedRevenue = objData.ConsolidatedRevenue + CDbl(rngData.Cells(lngCounter, 7))
' update first date if earlier
If dtStart < objData.FirstDate Then
objData.FirstDate = dtStart
End If
' update last date if later
If dtEnd > objData.LastDate Then
objData.LastDate = dtEnd
End If
Else
' create a new data packet
Set objData = New Class1
' set properties for new item
objData.ConsolidatedRevenue = dblRevenue
objData.FirstDate = dtStart
objData.LastDate = dtEnd
' store new data packet in dictionary
objDic.Add varKey, objData
End If
Next lngCounter
' test dictionary
For Each varKey In objDic.Keys
' output could go to another sheet instead of immediate window...
Debug.Print "Key: " & varKey
Debug.Print "Revenue: " & objDic(varKey).ConsolidatedRevenue
Debug.Print "First Date: " & objDic(varKey).FirstDate
Debug.Print "End Date: " & objDic(varKey).LastDate
Next varKey
End Sub
The output is:
Key: aaa|123|foo|bar
Revenue: 75
First Date: 27-Nov-17
End Date: 08-Jan-18
Key: bbb|456|foo|bar
Revenue: 112
First Date: 22-Nov-17
End Date: 08-Jan-18
Key: aaa|456|foo|bar
Revenue: 27
First Date: 04-Dec-17
End Date: 24-Dec-17
Key: bbb|123|foo|bar
Revenue: 15
First Date: 10-Dec-17
End Date: 30-Dec-17
You should be able to adapt that to your data set. To do the min/ max tests on the dates the proposed code just uses < and > between the current dates stored in the data packet (e.g. Class1 properties) and the dates from the row being processed:
' update first date if earlier
If dtStart < objData.FirstDate Then
objData.FirstDate = dtStart
End If
' update last date if later
If dtEnd > objData.LastDate Then
objData.LastDate = dtEnd
End If
HTH
Edit
Per the comment question about only printing the key dates and revenue - you can add extra fields to the class:
Option Explicit
Public ConsolidatedRevenue As Double
Public FirstDate As Date
Public LastDate As Date
Public Order As String
Public Product As String
Public Contract As String
Public State As String
'... etc
Then in the main loop, get those additional values e.g.
' ... (Dim them all first e.g. Dim strOrder As String etc)
strOrder = rngData.Cells(lngCounter, 3).Value
strProduct = rngData.Cells(lngCounter, 4).Value
strContract = rngData.Cells(lngCounter, 5).Value
strState = rngData.Cells(lngCounter, 6).Value
' ...
And then you can add them to the instance of Class1:
' ...
objData.Order = strOrder
objData.Product = strProduct
objData.Contract = strContract
objData.State = strState
' ... etc
And then when you loop the dictionary you can output them e.g.
Dim wsOutput As Worksheet
Set wsOutput = ThisWorkbook.Worksheets("Output") '<-- change to your output sheet
' loop the dictionary
Dim lng As Long
For lng = 0 To objDic.Count - 1
' ... instead of Debug.Print output to sheet with wsOutput.Cells(x, y).Value = foo
Set objData = objDic.Items()(lng)
wsOutput.Cells(lng + 1, 1).Value = objData.Order
wsOutput.Cells(lng + 1, 2).Value = objData.Product
wsOutput.Cells(lng + 1, 3).Value = objData.Contract
wsOutput.Cells(lng + 1, 4).Value = objData.State
wsOutput.Cells(lng + 1, 5).Value = objData.FirstDate
wsOutput.Cells(lng + 1, 6).Value = objData.LastDate
wsOutput.Cells(lng + 1, 7).Value = objData.ConsolidatedRevenue
' ... etc
Next lng

Excel VBA Grab Ancestor value from Relational Data

I can't seem to figure out how this would work in excel VBA
I have this relational data in excel like so:
Hierarchially / treeview, data looks like this:
The End result of the data should look like this (After using excel VBA script) where
columns A and B is relational Data
column E is input values for lookup
Column F is result ancestor value
My script thus far looks like this:
Public Sub DictionaryExamples()
Dim sht As Worksheet: Set sht = ActiveSheet
Dim exampleValues As Variant
Dim i As Long
Dim aKey As String
Dim aValue As String
Dim exampleDict As Object
'Load values into a variant array
exampleValues = Range("A1:B15").Value
'Instantiate a dictionary
Set exampleDict = CreateObject("scripting.dictionary")
'Read all keys and values, and add them to the dictionary
For i = 1 To UBound(exampleValues)
aKey = CStr(exampleValues(i, 1))
aValue = CStr(exampleValues(i, 2))
exampleDict.Add aKey, aValue
Next i
'After Dictionary setup, use input values E to output Ancestor F
Dim curCell As Range
Dim LastRow As Long
Dim temp As Variant
LastRow = sht.Cells(Rows.Count, "E").End(xlUp).row
'Loop through all values in parent to find ancestor
For Each curCell In sht.Range("E1:E" & LastRow).Cells
temp = curCell
'Search Dictionary until no matches are found, that is ancestor
Do
If exampleDict.Exists(temp) Then
temp = exampleDict(temp)
Else
'Print ancestor
curCell(, 2).Value = temp
Exit Do
End If
Loop
Next
End Sub
Result so far: (not getting correct output values)
Essentially I am using a dictionary (A= key, B=value) to use as a lookup for (E=input) and then output (F=results)
I loop through multiple times until I find a key that has no pair, and use the latest working key value as the ancestor
The "root" words in the data is unncessary I just put it there for clarification, it could a null value I just wanted to clarify which input levels are already top level ancestor values
You need to test whether your child node's parent is a root element or is a leaf (child node) itself before continuing your loop. Otherwise, you will always be writing the value of the parent node, which is "Root", and never the parent's name (key).
Option Explicit
Public Sub DictionaryExamples()
Dim sht As Worksheet: Set sht = ActiveSheet
Dim exampleValues As Variant
Dim i As Long
Dim aKey As String, aValue As String
Dim exampleDict As Object
Dim curCell As Range
'Load values into a variant array
exampleValues = Range("A2:B15").Value
'Instantiate a dictionary
Set exampleDict = CreateObject("scripting.dictionary")
'Read all keys and values, and add them to the dictionary
For i = 1 To UBound(exampleValues)
aKey = CStr(exampleValues(i, 1))
aValue = CStr(exampleValues(i, 2))
exampleDict.Add aKey, aValue
Next i
'After Dictionary setup, use input values E to output Ancestor F
With sht
'Loop through all values in parent to find ancestor
For Each curCell In .Range("E2", .Cells(Rows.Count, "E").End(xlUp))
aKey = curCell
'If the
If Not exampleDict.Exists(exampleDict(aKey)) Then
'If the node is a parent node print it's value
'To avoid confusion I'd have used: curCell(, 2).Value = "Parent Node"
curCell(, 2).Value = exampleDict(aKey)
Else
'Search Dictionary until no matches are found, that is ancestor
Do
If exampleDict.Exists(aKey) Then
'Here we test if this child node's parent is a root node
If Not exampleDict.Exists(exampleDict(aKey)) Then
'The child node's parent is a root node
curCell(, 2).Value = aKey
Exit Do
Else
'The child node's parent is also a leaf so continue
aKey = exampleDict(aKey)
End If
End If
Loop
End If
Next
End With
End Sub
Another solution (Not my original solution, received help elsewhere)
Option Explicit
Private Const LOOP_LIMIT As Integer = 100
Public Sub LineageDemo()
Dim dict As Object
Dim inputValues As Variant
Dim outputValues As Variant
Dim i As Long
'Read relations into dictionary
Set dict = BuildDictionaryOfRelations(Range("A2:A140"), Range("B2:B140"))
'Read input values into variant array
inputValues = Range("E2:E1465").Value
'Write output
ReDim outputValues(1 To UBound(inputValues), 1 To 1)
For i = 1 To UBound(inputValues)
outputValues(i, 1) = TraceAncestor(CStr(inputValues(i, 1)), dict, "Root")
Next i
Range("F2:F1465").Value = outputValues
End Sub
Private Function BuildDictionaryOfRelations(childRange As Range, parentRange As Range) As Object
Dim childValues As Variant
Dim parentValues As Variant
Dim i As Long
Dim aChild As String
Dim aParent As String
Dim dict As Object
If childRange.Columns.Count <> 1 Or parentRange.Columns.Count <> 1 _
Or childRange.Rows.Count <> parentRange.Rows.Count Then _
Err.Raise vbObjectError + 1, Description:="Bad/inconsistent category ranges"
'Load values into variant arrays
childValues = childRange.Value
parentValues = parentRange.Value
'Instantiate a dictionary
Set dict = CreateObject("scripting.dictionary")
'Populate the dictionary
For i = 1 To UBound(childValues)
aChild = CStr(childValues(i, 1))
aParent = CStr(parentValues(i, 1))
If aChild = "pizza-oven" Then Stop
dict.Add aChild, aParent
Next i
Set BuildDictionaryOfRelations = dict
End Function
Private Function TraceAncestor(aChild As String, relationDict As Object, rootString As String) As String
Dim aParent As String
Dim i As Integer
If Not (relationDict.exists(aChild)) Then
TraceAncestor = "ERROR: " & aChild & " does not appear in the CategoryName column"
Exit Function
End If
'If aChild is a root, return root
If relationDict.Item(aChild) = rootString Then
TraceAncestor = rootString
Exit Function
End If
'Trace from child to parent to parent to parent... to find ultimate ancestor
For i = 1 To LOOP_LIMIT
If Not (relationDict.exists(aChild)) Then
TraceAncestor = "ERROR: " & aChild & " does not appear in the CategoryName column"
Exit Function
End If
aParent = relationDict.Item(aChild)
If aParent = rootString Then Exit For
aChild = aParent
Next i
If i > LOOP_LIMIT Then
TraceAncestor = "ERROR: Ancestor could not be found for " & aChild & " in " & LOOP_LIMIT & " iterations"
Exit Function
End If
TraceAncestor = aChild
End Function
I only had about 1000 to 2000 cells or so, so I ended up using Jerry's Cascading tree formula here
https://sites.google.com/a/madrocketscientist.com/jerrybeaucaires-excelassistant/text-functions/cascading-tree
Option Explicit
Sub TreeStructure()
'JBeaucaire 3/6/2010, 10/25/2011
'Create a flow tree from a two-column accountability table
Dim LR As Long, NR As Long, i As Long, Rws As Long
Dim TopRng As Range, TopR As Range, cell As Range
Dim wsTree As Worksheet, wsData As Worksheet
Application.ScreenUpdating = False
'Find top level value(s)
Set wsData = Sheets("Input")
'create a unique list of column A values in column M
wsData.Range("A:A").AdvancedFilter Action:=xlFilterCopy, _
CopyToRange:=wsData.Range("M1"), Unique:=True
'Find the ONE value in column M that reports to no one, the person at the top
wsData.Range("N2", wsData.Range("M" & Rows.Count).End(xlUp) _
.Offset(0, 1)).FormulaR1C1 = "=IF(COUNTIF(C2,RC13)=0,1,"""")"
Set TopRng = wsData.Columns("N:N").SpecialCells(xlCellTypeFormulas, 1).Offset(0, -1)
'last row of persons listed in data table
LR = wsData.Range("A" & wsData.Rows.Count).End(xlUp).Row
'Setup table
Set wsTree = Sheets("LEVEL STRUCTURE")
With wsTree
.Cells.Clear 'clear prior output
NR = 3 'next row to start entering names
'Parse each run from the top level
For Each TopR In TopRng 'loop through each unique column A name
.Range("B" & NR) = TopR
Set cell = .Cells(NR, .Columns.Count).End(xlToLeft)
Do Until cell.Column = 1
'filter data to show current leader only
wsData.Range("A:A").AutoFilter Field:=1, Criteria1:=cell
'see how many rows this person has in the table
LR = wsData.Range("A" & Rows.Count).End(xlUp).Row
If LR > 1 Then
'count how many people report to this person
Rws = Application.WorksheetFunction.Subtotal(103, wsData.Range("B:B")) - 1
'insert that many blank rows below their name and insert the names
cell.Offset(1, 1).Resize(Rws).EntireRow.Insert xlShiftDown
wsData.Range("B2:B" & LR).Copy cell.Offset(1, 1)
'add a left border if this is the start of a new "group"
If .Cells(.Rows.Count, cell.Column + 1).End(xlUp).Address _
<> cell.Offset(1, 1).Address Then _
.Range(cell.Offset(1, 1), cell.Offset(1, 1).End(xlDown)) _
.Borders(xlEdgeLeft).Weight = xlThick
End If
NR = NR + 1 'increment to the next row to enter the next top leader name
Set cell = .Cells(NR, .Columns.Count).End(xlToLeft)
Loop
Next TopR
'find the last used column
i = .Cells.Find("*", .Cells(.Rows.Count, .Columns.Count), _
SearchOrder:=xlByColumns, SearchDirection:=xlPrevious).Column
'format the used data range
With Union(.Range(.[B1], .Cells(1, i)), .Range("B:BB").SpecialCells(xlCellTypeConstants, 23))
.Interior.ColorIndex = 5
.Font.ColorIndex = 2
.Font.Bold = True
.HorizontalAlignment = xlCenter
End With
.Range("B1").Interior.ColorIndex = 53
.Range("B1").Value = "LEVEL 1"
.Range("B1").AutoFill Destination:=.Range("B1", .Cells(1, i)), Type:=xlFillDefault
End With
wsData.AutoFilterMode = False
wsData.Range("M:N").ClearContents
wsTree.Activate
Application.ScreenUpdating = True
End Sub
This did 95% of the work I needed to do, the rest I just used excel formulas (no VBA needed afterwards)
steps to get ancestor data from start:
I did following procedure:
0: Have data in a relational data setup
1: Cleaned out any duplicate data conditional formatting for duplicates
2: Run Jerry's Excel VBA Macro. Results below
Col A | ColB | ColC | ColD | ColE | ColF |
| Lvl1 | Lvl2 | Lvl3 | Lvl4 | Lvl5 |
| AAA | | | | |
| | BBB | | | |
| | EEE | | | |
| | FFF | | | |
| CCC | | | | |
| | GGG | | | |
| | | III | | |
| | | | JJJ | |
| | | | | KKK |
| DDD | | | | |
| | HHH | | | |
3: Populate the the top level excel via copy+pasting through excel (I only had 3 parent top level categories, so it took 2 minutes)
Col A | ColB | ColC | ColD | ColE | ColF |
| Lvl1 | Lvl2 | Lvl3 | Lvl4 | Lvl5 |
| AAA | | | | |
| AAA | BBB | | | |
| AAA | EEE | | | |
| AAA | FFF | | | |
| CCC | | | | |
| CCC | GGG | | | |
| CCC | | III | | |
| CCC | | | JJJ | |
| CCC | | | | KKK |
| DDD | | | | |
| DDD | HHH | | | |
4: Then use a helper column in column A, via this formula
=IF(B19<>"", B19,
IF(C19<>"",C19,
IF(D19<>"",D19,
IF(F19<>"",F19))))
where C, D,E,F are subcategories levels stemming from the parent (Column B). This searches values in column C for input, if its not there, then column D, then E, then F and copies whatever it finds first.
Col A | ColB | ColC | ColD | ColE | ColF |
| Lvl1 | Lvl2 | Lvl3 | Lvl4 | Lvl5 |
| AAA | | | | |
BBB | AAA | BBB | | | |
EEE | AAA | EEE | | | |
FFF | AAA | FFF | | | |
| CCC | | | | |
GGG | CCC | GGG | | | |
III | CCC | | III | | |
JJJ | CCC | | | JJJ | |
KKK | CCC | | | | KKK |
| DDD | | | | |
HHH | DDD | HHH | | | |
5: Then use an index / match function now that all data is normalized (on columns A and B) using my original input values as the lookup
6: Clean up any data afterwards manually
Step 3 can be easily macro'd for larger sets of data, just traverse that column and paste until it finds the next value down.

Transposing a single line result into a single column result

Nearly all of the similar questions I came across are suggesting using PIVOT but I couldn't understand how to use it in my case.
This is my query
SELECT
SUM(CONVERT(INT,H01)) as H01,
SUM(CONVERT(INT,H02)) as H02,
SUM(CONVERT(INT,H03)) as H03,
SUM(CONVERT(INT,H04)) as H04,
SUM(CONVERT(INT,H05)) as H05,
SUM(CONVERT(INT,H06)) as H06,
SUM(CONVERT(INT,H07)) as H07,
SUM(CONVERT(INT,H08)) as H08,
SUM(CONVERT(INT,H09)) as H09,
SUM(CONVERT(INT,H10)) as H10,
SUM(CONVERT(INT,H11)) as H11,
SUM(CONVERT(INT,H12)) as H12,
SUM(CONVERT(INT,H13)) as H13,
SUM(CONVERT(INT,H14)) as H14,
SUM(CONVERT(INT,H15)) as H15
FROM
IslemFisi
WHERE
(H01 = 1) OR (H02 = 1) OR (H03 = 1) OR (H04 = 1) OR
(H05 = 1) OR (H06 = 1) OR (H07 = 1) OR (H08 = 1) OR
(H09 = 1) OR (H10 = 1) OR (H11 = 1) OR (H12 = 1) OR
(H13 = 1) OR (H14 = 1) OR (H15 = 1)
which returns this result set
+-------+------+------+------+-----+------+-----+------+-----+-----+------+-----+-----+------+-------+
| H01 | H02 | H03 | H04 | H05 | H06 | H07 | H08 | H09 | H10 | H11 | H12 | H13 | H14 | H15 |
+-------+------+------+------+-----+------+-----+------+-----+-----+------+-----+-----+------+-------+
| 11802 | 2239 | 1575 | 2230 | 833 | 3189 | 152 | 4234 | 617 | 428 | 2211 | 263 | 783 | 3954 | 15922 |
+-------+------+------+------+-----+------+-----+------+-----+-----+------+-----+-----+------+-------+
I want to have it in a single column but dont know how to do it. using pivot needs some aggregation functions as it seems but I don't have anything to aggregate at this point.
;WITH cte AS (
--put your query here
)
SELECT H, [VALUES]
FROM (
SELECT *
FROM cte) as p
UNPIVOT (
[VALUES] FOR H IN (H01, H02, H03, H04 ...etc)
) as unpvt
Output:
H VALUES
H01 11802
H02 2239
....

VBA Array Manipulation

I found a macro (courtesy of Jerry Beaucaire) that splits up one worksheet into many based on unique values in a given column. This works great. However...
The client has supplied a differently formatted worksheet which needs some gentle massaging to get into the format we need.
First, let me show you a snippet of JB's code:
MyArr = Application.WorksheetFunction.Transpose _
(ws.Columns(iCol).SpecialCells(xlCellTypeConstants))
From what I can tell (and I'm a total VB newbie, so what do I know..??), this populates an array with the selected row values
And this:
For Itm = 2 To UBound(MyArr)
...(code removed)
ws.Range("A" & TitleRow & ":A" & LR).EntireRow.Copy _
Sheets(MyArr(Itm) & "").Range("A1")
...(code removed)
Next Itm
...seems to do the copying.
Alright. ...fine so far.
The problem is that I need to add a step to the process. This will be tricky to explain. Please bear with me...
Title row is row 1
Data starts in row 2
Each row has 9 columns:
colA: identifier
colB-colD: x,y,z values (for top of item)
colE-colG: x,y,z values (for bottom of item)
colH and colI: can be ignored
These x,y and z values are used to define points that are used to plot lines in a 3D modelling program. Each row in the worksheet actually defines a line (well... a start point and an end point - "top" and "bottom") Unfortunately, the data(worksheet) we have received defines two sets of data for each line - both having the same start point, but with different end points. Put another way, starting with rows 3 and 4, the data in columns B-D is the same for both rows. This applies to rows 5 & 6, 7 & 8, etc.
Since all we need are a set of data POINTS, we can safely use the values from cols E-G.
HOWEVER... and this is where I need help... We need the first row of the newly created worksheet to start with the values from row 2, cols B-D. (ie. we can use the end points as our coordinates, but we still need the first start point) All the rest is fine the way it is.
For example:
Source Data:
| A | B | C | D | E | F | G |
1 | id | x-top | y-top | z-top | x-bottom | y-bottom | z-bottom |
2 | H1 | 101.2 | 0.525 | 54.25 | 110.25 | 0.625 | 56.75 |
3 | H1 | 110.25| 0.625 | 56.75 | 121.35 | 2.125 | 62.65 |
4 | H1 | 110.25| 0.625 | 56.75 | 134.85 | 3.725 | 64.125 | B,C,D same as row 3
5 | H1 | 134.85| 3.725 | 64.125| 141.25 | 4.225 | 66.75 |
6 | H1 | 134.85| 3.725 | 64.125| 148.85 | 5.355 | 69.85 | B,C,D same as row 5
What I need:
| A | B | C | D | E | F | G |
1 | id | x-top | y-top | z-top | x-bottom | y-bottom | z-bottom |
2 | H1 | | | | 101.2 | 0.525 | 54.25 |
3 | H1 | 101.2 | 0.525 | 54.25 | 110.25 | 0.625 | 56.75 |
4 | H1 | 110.25| 0.625 | 56.75 | 121.35 | 2.125 | 62.65 |
5 | H1 | 110.25| 0.625 | 56.75 | 134.85 | 3.725 | 64.125 |
6 | H1 | 134.85| 3.725 | 64.125| 141.25 | 4.225 | 66.75 |
7 | H1 | 134.85| 3.725 | 64.125| 148.85 | 5.355 | 69.85 |
So... What's the best way to do this? Can I add to the existing macro to perform this operation? If so, better to modify the array? ...better to modify the Copy routine? ...and how??
Thanks in advance for your help and please don't suggest doing it manually. There are 70,000+ rows to parse!
If you need more info, let me know!
The full macro is available for free to all at this location
To achieve your connecting points, these additions should do it:
For Itm = 2 To UBound(MyArr)
...(code removed)
ws.Range("A" & TitleRow & ":A" & LR).EntireRow.Copy _
Sheets(MyArr(Itm) & "").Range("A1")
Sheets(MyArr(Itm) & "").Rows(2).Insert xlShiftDown
Sheets(MyArr(Itm) & "").Range("E2").Resize(, 3).Value = Sheets(MyArr(Itm) & "").Range("B3").Resize(, 3).Value
...(code removed)
Next Itm

Resources