Creating a database in excel - database

I'm working with a database in excel. I will try to make it as simple as possible.
For example,
I have a vlookup range/array of fruits, and who likes each fruit.
Fruit - Person
1. Apple – DeShoun
2. Apple – John
3. Apple – Scott
4. Pear – Scott
5. Strawberries – John… etc.
In my database I have a list of fruit and the vendor that sells it
Fruit - Vendor
1. Apple – Sprouts
2. Apple – Walmart
3. Apple – Trader Joe’s
4. Strawberries – Abel Farms
5. Banana – Sprouts
6. Pear – Sprouts… etc.
I need to be able to find the fruit “apple” within my database and create new rows of information within the database so that it looks like the following.
Fruit - Vendor - Person
1. Apple – Sprouts - DeShoun
2. Apple – Walmart - DeShoun
3. Apple – Trader Joe’s - DeShoun
4. Apple – Sprouts - John
5. Apple – Walmart – John
6. Apple – Trader Joe’s - John
7. Apple – Sprouts - Scott
8. Apple – Walmart - Scott
9. Apple – Trader Joe’s – Scott
10. Strawberries – Abel Farms - John
11. Banana – Sprouts - #N/A
12. Pear – Sprouts - Scott
Since I will be working with at least 1000 rows, I need to know if there is a process to expedite this in any way.
Does anyone have any suggestions or links/articles that can point me in the right direction?
Feel free to comment or ask any questions that could help lead to a good answer.
Thanks

Let's say your Fruit-Person table is Table 2 and your Fruit-Vendor table is Table 3. Fruit is the common field across both. You will need to build a Table 1 with the unique values from the Fruit column. (There are many ways of building a list of unique values; if you aren't familiar with them, they are easy to find online.)
I am listing the process for Excel 2013; it may be slightly different in older versions.
Step 0:
You have 3 tables as per earlier description.
Step 1:
Convert each of them to an Excel Table, one by one.
Alt+N >> T, or select the range (e.g. A1:A5) >> Insert >> Table. Tick "My table has headers".
Repeat this process for all 3 tables.
Step 2:
Create Pivot Table on Multiple Ranges
A) Create a PivotTable on Table 1 (Insert >> PivotTable). Tick "Add this data to the Data Model". This is important.
B) In the PivotTable Fields pane, under ALL, you should see all 3 tables.
Step 3:
Create Relationships
In the Analyze tab, click Relationships. A box which says Manage relationships should open up. The idea is to build relationships.
A) Try building a relationship between Table 1 and Table 2.
New >> choose following options:
Table: Table 2
Column (Foreign): Fruit
Related Table: Table 1
Related Column (Primary): Fruit
B) Let's try building it now between Table 1 & 3
New >> choose following options:
Table: Table 3
Column (Foreign): Fruit
Related Table: Table 1
Related Column (Primary): Fruit
Step 4:
Forming the Pivot
A) Get Fruit from Table 1, Person from Table 2, Vendor from Table 3 (in that order) as row labels
B) Now, Table2/Fruit and Table3/Fruit need to go into the Values area.
The resulting table is almost your final table. The rows you want are the ones that have a 1 in both column D and column E. You can extract those rows by filtering and then pasting as values.
(As a process, pasting images isn't the popular method it seems, but I couldn't have explained it better visually without them)

I'm pretty new to VBA, but had a bit of a fiddle around and this seems to work sort of as you describe (as a potential example...). Have put each table of data on a separate worksheet.
Sub FruityPerson_Matching()
Dim strFruit As String, strPerson As String, strVendor As String 'to hold text.
Dim myWB As Workbook, myWS_P As Worksheet, myWS_V As Worksheet, myWS_C As Worksheet
Dim LastRow As Long, n As Long, iNewRow As Long 'Long, because Integer overflows past row 32,767.
Dim rFruit As Range, checkCell As Range
Set myWB = Application.ActiveWorkbook
Set myWS_P = myWB.Worksheets("Person")
Set myWS_V = myWB.Worksheets("Vendor")
Set myWS_C = myWB.Worksheets("Combined")
LastRow = myWS_P.Cells(myWS_P.Rows.Count, "A").End(xlUp).Row
'First loop through the list of people and find the first instance of their fruit in the vendors list:
For n = 2 To LastRow
strFruit = myWS_P.Cells(n, 1).Value 'Qualified with the Person sheet so it does not depend on the active sheet.
strPerson = myWS_P.Cells(n, 2).Value
Set rFruit = myWS_V.Range("A:B").Find(What:=strFruit, LookIn:=xlValues, _
LookAt:=xlWhole, SearchOrder:=xlByRows, SearchDirection:=xlNext, _
MatchCase:=False, SearchFormat:=False)
If Not rFruit Is Nothing Then
Set checkCell = rFruit 'For checking when findnext gets back to original cell.
strVendor = myWS_V.Cells(rFruit.Row, 2).Value
'Add this to a new row (so that it is sure to be blank) in the final combined data sheet:
iNewRow = myWS_C.Range("A" & myWS_C.Rows.Count).End(xlUp).Offset(1).Row
myWS_C.Range("A" & iNewRow).Value = strFruit
myWS_C.Range("B" & iNewRow).Value = strVendor
myWS_C.Range("C" & iNewRow).Value = strPerson
'Since there may be multiple vendors per fruit, now loop through them for the same person:
Do
Set rFruit = myWS_V.Range("A:B").FindNext(After:=rFruit)
If Not rFruit Is Nothing Then
If rFruit.Address = checkCell.Address Then Exit Do
'Shows: are back at start.
strVendor = myWS_V.Cells(rFruit.Row, 2).Value
iNewRow = myWS_C.Range("A" & myWS_C.Rows.Count).End(xlUp).Offset(1).Row
myWS_C.Range("A" & iNewRow).Value = strFruit
myWS_C.Range("B" & iNewRow).Value = strVendor
myWS_C.Range("C" & iNewRow).Value = strPerson
Else
Exit Do
End If
Loop
Else
'strFruit has no vendor: nothing to add, so move on to the next person
'(Exit Sub here would abandon everyone further down the list).
End If
Next
End Sub
The loop then moves on to the next person, and so on until it reaches the last row of data.
Something like what you had in mind?
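For anyone who wants to check the expected result independently of Excel, here is a minimal sketch of the same matching logic in Python rather than VBA. It uses the sample data from the question; the function name `combine` and the `missing` parameter are my own hypothetical names, and the row order simply follows the order in the question's expected output.

```python
from collections import defaultdict

# Sample data copied from the question.
fruit_person = [
    ("Apple", "DeShoun"), ("Apple", "John"), ("Apple", "Scott"),
    ("Pear", "Scott"), ("Strawberries", "John"),
]
fruit_vendor = [
    ("Apple", "Sprouts"), ("Apple", "Walmart"), ("Apple", "Trader Joe's"),
    ("Strawberries", "Abel Farms"), ("Banana", "Sprouts"), ("Pear", "Sprouts"),
]

def combine(fruit_vendor, fruit_person, missing="#N/A"):
    """Pair every vendor of a fruit with every person who likes that fruit."""
    people = defaultdict(list)   # fruit -> people, in input order
    vendors = defaultdict(list)  # fruit -> vendors, in input order
    for fruit, person in fruit_person:
        people[fruit].append(person)
    for fruit, vendor in fruit_vendor:
        vendors[fruit].append(vendor)

    rows = []
    for fruit, vendor_list in vendors.items():  # dicts keep insertion order
        # A fruit nobody likes still gets one row per vendor, marked as missing.
        for person in people.get(fruit, [missing]):
            for vendor in vendor_list:
                rows.append((fruit, vendor, person))
    return rows

combined = combine(fruit_vendor, fruit_person)
for row in combined:
    print(" - ".join(row))
```

Grouping both lists by fruit first means each list is read only once, which should matter more as the row count grows past the 1000 mentioned in the question.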

It can be difficult to get your head round at first, but I would recommend looking into the INDEX MATCH functions. Used together they can do exactly what vlookup does, but with a little understanding they are far more flexible and may well be better suited to your needs :)
http://fiveminutelessons.com/learn-microsoft-excel/how-use-index-match-instead-vlookup
Might be helpful, or google to find a tutorial that suits you
Specifically for your problem, the hardest part will be matching every vendor, person to each fruit... VBA might be necessary

Related

Extract data from sheet range based on condition (like DataFrame.loc) using VBA

For the Python coders out there, the problem is the equivalent of df.loc[df['Name'] == 'John'].iloc[-1].
This is what I have in the spreadsheet, let's say in cell range A1:C5.
Name | Bio | Date
John | Loves travelling to the mountains | 11/20
Joe | plays computer | 11/20
John | goes to the sea a lot | 01/22
Jenny | dances salsa | 02/22
On a separate sheet, I have 2 buttons and 2 cells
Add information to database (built in VBA)
Extract latest information to database
First cell allows me to enter the name such as 'John'
Second cell that allows me to enter the Bio (or equivalently extract the bio to)
How could I, if I type John and click the button to extract the bio, get the latest Bio (the one dated 01/22) for John into another cell?
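If you have an older version of Excel, you can do this with SUMPRODUCT:
=SUMPRODUCT(MAX(--(A2:A5=G5)*C2:C5))
In Excel 365, you can use the MAXIFS function instead, e.g. =MAXIFS(C2:C5,A2:A5,G5) for the same ranges.
Either formula returns the latest date for the name in G5.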
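Hi, I figured out a solution for anybody else looking for something similar, and have closed this query.
If we set up a data array to hold the table from the sheet range, we can loop through the data like this:
Dim rowCount As Long, i As Long
Dim dataArray As Variant
Dim bio As Variant
'Assumption: the table (without headers) is in A2:C5 of a sheet called "Database"; adjust to your layout.
dataArray = Sheets("Database").Range("A2:C5").Value
rowCount = UBound(dataArray, 1)
For i = 1 To rowCount
'Keep overwriting bio so that the last matching row wins.
If dataArray(i, 1) = Name Then
bio = dataArray(i, 2)
End If
Next
Sheets("Main").Range("A5") = bio
where rowCount is the number of rows in the table (it could also be read from a cell on the sheet) and Name holds whatever name you are looking for, such as 'John'. Because bio is overwritten on every match, you end up with the last matching row, which is the latest entry as long as rows are added in date order.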
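Since the question was phrased in pandas terms, here is a minimal plain-Python sketch of the same "last match wins" loop, using the rows from the question. The function name `latest_bio` is hypothetical, and it assumes the rows are already in date order, exactly like the VBA loop does.

```python
# Rows copied from the question's table: (name, bio, date).
rows = [
    ("John", "Loves travelling to the mountains", "11/20"),
    ("Joe", "plays computer", "11/20"),
    ("John", "goes to the sea a lot", "01/22"),
    ("Jenny", "dances salsa", "02/22"),
]

def latest_bio(rows, name):
    """Return the bio from the last row matching name, or None if absent."""
    bio = None
    for row_name, row_bio, _date in rows:
        if row_name == name:
            bio = row_bio  # later rows overwrite earlier ones
    return bio

print(latest_bio(rows, "John"))
```

If the rows are not guaranteed to be in date order, you would need to compare the dates explicitly instead of relying on position.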

Update cell value if new value is available

So I am brainstorming a few different ways to accomplish this task... but none of the ways I am thinking are very clean. I am looking for a clean way to accomplish this.
I have 2 workbooks (workbook A, workbook B).
Workbook A looks like this:
A B C D E
Tom Bob Sam Ted Meg
1 4 9 3 2
The A,B,C... are the columns (not actually on the sheet) and the 1,4,9,3,2 is data on the last row (could be row 10 or row 1000, etc...)
Workbook B looks like this:
A B
Sam 5
Meg 1
I want to update workbook A with any values in workbook B. So, in this example, Sam and Meg have new values, and I want to update workbook A to look like this:
A B C D E
Tom Bob Sam Ted Meg
1 4 5 3 1
I feel like the simplest way to do this may be to make something like a dictionary or something like that but I have never used a dictionary and don't know if some other method would be easier / simpler.
So if you want to get data from one book to another then vlookup is the simplest solution
https://photos.app.goo.gl/CRjxhJHVKm6xKJ8x7
I nested it within an if statement that checks to see if there is an error, if there is an error then it just grabs the value from the line above. I.e. if the Name doesn't exist on your second sheet, then there is an error and it will just grab from the line above.
=IF(ISERROR(VLOOKUP(A1,[Book2]Sheet1!$A$2:$B$5,2,FALSE)),A11,VLOOKUP(A1,[Book2]Sheet1!$A$2:$B$5,2,FALSE))
At face value, this does what you are asking, but the way you phrased the question implies there is other functionality you need that this solution might get in the way of. That's why I was trying to clarify what you really want to achieve.
Anyway, hopefully it's a step in the right direction... Good luck!
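Here is my take on it. No dictionary; instead it goes through the values in memory using an array. Sheet1 holds the values to update (names in column A, new values in column B) and Sheet2 holds the table being updated (names in row 1).
Sample code: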
Sub Test()
Dim lr As Long, x As Long, col As Long
Dim arr As Variant
'Step 1: Get the values to update
With Sheet1
lr = .Cells(.Rows.Count, 1).End(xlUp).Row
arr = .Range("A1:B" & lr)
End With
'Step 2: Go through our values to update
With Sheet2
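Dim hit As Range
For x = LBound(arr) To UBound(arr)
'Skip names that have no matching header instead of raising an error
Set hit = .Rows(1).Find(arr(x, 1), LookAt:=xlWhole)
If Not hit Is Nothing Then
col = hit.Column
.Cells(.Cells(.Rows.Count, col).End(xlUp).Row + 1, col) = arr(x, 2)
End If
Next x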
End With
End Sub
Remove the +1 if you want to overwrite the last entry in the found column; I assumed you wanted to write the value underneath. Also, I referred to sheets in the same workbook; you can just as well refer to the other open workbook.
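Since the question asked about a dictionary, here is a minimal sketch of that idea in Python rather than VBA, using the numbers from the question. The function name `apply_updates` is hypothetical; it overwrites the last row's values, which is the variant the question describes.

```python
# Workbook A: header row of names and the last data row, from the question.
names = ["Tom", "Bob", "Sam", "Ted", "Meg"]
last_row = [1, 4, 9, 3, 2]

# Workbook B: the new values, held as a dictionary keyed by name.
updates = {"Sam": 5, "Meg": 1}

def apply_updates(names, values, updates):
    """Return a copy of values, replaced wherever a new value exists."""
    return [updates.get(name, old) for name, old in zip(names, values)]

print(apply_updates(names, last_row, updates))
```

The dictionary lookup replaces the search for a matching header, so names missing from workbook B simply keep their old value.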
Firstly, a little more background on my problem. I am opening text files, extracting data, re-arranging the data on an excel sheet (wkbTemp), and then copying the re-arranged data to "wkbCompiledDataTable."
Here is how I solved my problem.
wkbTemp.Sheets(1).Activate
LRow = Cells(Rows.Count, 9).End(xlUp).Offset(0, 0).row
wkbTemp.Sheets(1).Range("J1:N" & LRow).Copy
wkbCompiledDataTable(1).Activate
LRow = Cells(Rows.Count, 1).End(xlUp).Offset(0, 0).row
wkbCompiledDataTable(1).Sheets(1).Range("D" & LRow).PasteSpecial _
Paste:=xlPasteValues, _
SkipBlanks:=True, _
Transpose:=True
As I said earlier, the actual application is much more complicated. This routine is actually in a loop. So for example I actually have "wkbCompiledDataTable(i)" instead of "wkbCompiledDataTable(1)." But for me this was the simplest solution. wkbTemp.Sheets(1) Column "J" has data. Column "K" is blank. Column "L", Column "M", Column "N" has data (These 3 columns are the columns I was concerned with data changing).

VBA SumIF between two Arrays

I have two large tables of data that I have been playing around with to try and minimize computing resources. The two tables roughly have the following information:
1) Freight Billing information
InvoiceNum | TrackingNum | Weight | BilledAmt | IncentiveCredit
2) ERP Generated Invoice Information
InvoiceNum | Weight | ProductRev | FreightRev | TaxAmt | DiscountAmt
I'm conducting an analysis between what the freight company is charging per order (Freight Cost) compared to what we charge customers per order (Freight Revenue). Additionally, I'm hoping to compare reported weights for both the freight company and our company to audit accuracy.
There will only be one instance of the invoice in the ERP data, but there can be multiple lines in the freight data with the same invoice number. A simple SumIF column would work well if the tables weren't both approximately 100,000 lines. I know that working with Arrays is generally much faster than working directly with ranges, so I started there.
Option Explicit
Sub compareFreightData()
Application.ScreenUpdating = False
'Declare Sheets
Dim systemSht As Worksheet
Dim billingSht As Worksheet
Set systemSht = ThisWorkbook.Sheets("SystemData")
Set billingSht = ThisWorkbook.Sheets("UPSData")
'Declare Arrays
Dim billingArray As Variant
Dim systemArray As Variant
'Declare timer
Dim t As Double 'Timer returns fractional seconds, which a Long would round away
t = Timer
'Add additional header info
systemSht.Range("K3").Value = "BilledTotal"
systemSht.Range("L3").Value = "IncentiveCredits"
systemSht.Range("M3").Value = "OrderWeight"
'Put info into arrays
billingArray = billingSht.ListObjects("UPSCSVFiles").DataBodyRange.Value
systemArray = systemSht.Range("A4:J" & systemSht.Cells(systemSht.Rows.Count,"A").End(xlUp).Row).Value
'ReDim to have the three additional columns in the array
ReDim Preserve systemArray(1 To UBound(systemArray, 1), 1 To UBound(systemArray, 2) + 3)
'****************************CALCULATE THE SUMIF*******************
'
'
'Timer
Debug.Print "Timer", Timer - t
Application.ScreenUpdating = True
End Sub
Some ideas I had were the following:
If there is a way to sum all the items in a filtered array then print the amount. (I think Filter() only works with single dimension arrays)
Loop through billing array, find the location in system array using MATCH, then add to system array (this seems very slow)
Mostly, I believe there is some easier way that I am missing and any input or ideas would be greatly appreciated.
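One way to avoid a MATCH per billing line is to total the billing rows per invoice in a single pass, then look each ERP invoice up in those totals. Here is a minimal sketch of that idea in Python rather than VBA. The sample rows and the name `sum_by_invoice` are hypothetical; only the column order is taken from the question. In VBA the same role could be played by a Scripting.Dictionary keyed on InvoiceNum.

```python
from collections import defaultdict

def sum_by_invoice(billing_rows):
    """One pass over the freight rows, accumulating totals per invoice."""
    totals = defaultdict(lambda: [0.0, 0.0, 0.0])  # billed, credit, weight
    for invoice, _tracking, weight, billed, credit in billing_rows:
        entry = totals[invoice]
        entry[0] += billed
        entry[1] += credit
        entry[2] += weight
    return totals

# Hypothetical rows: InvoiceNum, TrackingNum, Weight, BilledAmt, IncentiveCredit
billing = [
    ("INV1", "T1", 2.0, 10.0, 1.0),
    ("INV1", "T2", 3.0, 15.0, 0.5),
    ("INV2", "T3", 1.0, 8.0, 0.0),
]
system_invoices = ["INV1", "INV2", "INV3"]  # one row per invoice in the ERP data

totals = sum_by_invoice(billing)
for invoice in system_invoices:
    # Invoices with no freight lines fall back to zeros.
    billed, credit, weight = totals.get(invoice, (0.0, 0.0, 0.0))
    print(invoice, billed, credit, weight)
```

Each table is read once, so the work grows with the sum of the two row counts rather than with their product.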

Counting Unique Values for Multiple Columns in Excel

I've attempted using Pivot tables and SUMPRODUCT & COUNTIF formulas after looking through possible solutions but haven't found anything positive yet. Below is the input data:
Level 1 Level 2 Level 3 Level 4 Level 5
Tom Liz
Tom Liz Mel
Tom Liz Dan
Tom Liz Dan Ian
Tom Liz Dan Ken
Tom Tim
Tom Tim Fab
Tom Tim Fab Ken
Tom Tim Fab Ken Jan
Eve
Expected output data is below. The intent is to not have to feed in a pre-loaded list of names. The expectation is that the program could determine the counts based on the input data alone:
Counts
-------
Tom: 9
Eve: 1
Liz: 5
Tim: 4
Mel: 1
Dan: 3
Fab: 3
Ian: 1
Ken: 3
Jan: 1
Any help towards this is appreciated....thanks!
UPDATE: A preloaded list with the list of Names CAN be used to generate the counts. The above description was updated accordingly.
First enter the following UDF in a standard module:
Public Function ListUniques(rng As Range) As Variant
Dim r As Range, ary(1 To 9999, 1 To 1) As Variant
Dim i As Long, C As Collection, v As Variant
Set C = New Collection
On Error Resume Next
For Each r In rng
v = r.Value
If v <> "" Then
C.Add v, CStr(v)
End If
Next r
On Error GoTo 0
For i = 1 To 9999
If i > C.Count Then
ary(i, 1) = ""
Else
ary(i, 1) = C.Item(i)
End If
Next i
ListUniques = ary
End Function
Then highlight a section of a column, say G1 through G50, and enter the array formula:
=listuniques(A2:E11)
Array formulas must be entered with Ctrl + Shift + Enter rather than just the Enter key.
If done correctly, column G should list each unique name once.
Finally in H1 enter:
=COUNTIF($A$2:$E$11,G1)
and copy down
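As a cross-check on the expected counts, here is a minimal sketch of the same "list the uniques, then count each" idea in Python rather than VBA. The grid is copied from the question, with blank cells left out of each row.

```python
from collections import Counter

# The grid from the question, one list per row; blank cells are omitted.
grid = [
    ["Tom", "Liz"],
    ["Tom", "Liz", "Mel"],
    ["Tom", "Liz", "Dan"],
    ["Tom", "Liz", "Dan", "Ian"],
    ["Tom", "Liz", "Dan", "Ken"],
    ["Tom", "Tim"],
    ["Tom", "Tim", "Fab"],
    ["Tom", "Tim", "Fab", "Ken"],
    ["Tom", "Tim", "Fab", "Ken", "Jan"],
    ["Eve"],
]

# Count every non-blank cell; no preloaded list of names is needed.
counts = Counter(name for row in grid for name in row if name != "")

for name, count in counts.items():
    print(f"{name}: {count}")
```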
NOTE
User Defined Functions (UDFs) are very easy to install and use:
ALT-F11 brings up the VBE window
ALT-I
ALT-M opens a fresh module
paste the stuff in and close the VBE window
If you save the workbook, the UDF will be saved with it.
If you are using a version of Excel later than 2003, you must save
the file as .xlsm rather than .xlsx
To remove the UDF:
bring up the VBE window as above
clear the code out
close the VBE window
To use the UDF from Excel:
=myfunction(A1)
To learn more about macros in general, see:
http://www.mvps.org/dmcritchie/excel/getstarted.htm
and
http://msdn.microsoft.com/en-us/library/ee814735(v=office.14).aspx
and for specifics on UDFs, see:
http://www.cpearson.com/excel/WritingFunctionsInVBA.aspx
Macros must be enabled for this to work!

Count the number of specific words in a list

I have a sizeable computation to do: I have an Excel file with a column listing the unique IDs of the people who worked on each incident in our system. I would like to know the total number of interventions that have been done on all incidents. For example, let's say I have this:
ID|People working on that incident
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
0|AA0000 BB1111 CC2222 ZZ1234
1|BB1111
2|CC2222 ZZ1234 CC2222 ZZ1234
3|BB1111 CC2222 AA0000 BB1111
I have a list named List which has a zone with the list of people IDs I actually want to include. For example, let's say that the first zone of List = {"AA0000","CC2222"}.
Now, I would like to know how many interventions have been done by our employees (in List) on all the incidents I have (we have 4 in the array above). The result would be 6: 2 interventions for incident ID 0, 0 for ID 1, 2 for ID 2 and 2 for ID 3.
Assuming the data are in a different (closed) workbook, how can I calculate that using my list List and the range above A1:B4 (I would like to eventually use the whole columns, so let's say A:B)?
EDIT:
I already have something working that counts the number of times a specific word appears in a whole column.
SUM(
LEN('[myFile.xlsx]Sheet1'!$A:$A)
-LEN(
SUBSTITUTE('[myFile.xlsx]Sheet1'!$A:$A;$Z$1;"")
)
)
/LEN($Z$1)
Z1 is the word I'm looking for (example: CC2222) and '[myFile.xlsx]Sheet1'!$A:$A is the column I'm searching in.
Isn't there a really simple way to make this work with an array instead of Z1? The length is always the same (six plus a space).
Source: http://office.microsoft.com/en-ca/excel-help/count-the-number-of-words-in-a-cell-or-range-HA001034625.aspx
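To make the length-difference trick concrete, here is a minimal sketch of it in Python rather than as an Excel formula, extended from one word to a list of words. The data is the sample from the question; the names `occurrences` and `wanted` are hypothetical.

```python
# Column B of the sample data, one string per incident.
cells = [
    "AA0000 BB1111 CC2222 ZZ1234",
    "BB1111",
    "CC2222 ZZ1234 CC2222 ZZ1234",
    "BB1111 CC2222 AA0000 BB1111",
]
wanted = ["AA0000", "CC2222"]  # the IDs in List

def occurrences(cells, word):
    """Count word by how many characters vanish when it is removed."""
    removed = sum(len(c) - len(c.replace(word, "")) for c in cells)
    return removed // len(word)

total = sum(occurrences(cells, w) for w in wanted)
print(total)
```

Like the SUBSTITUTE formula, this matches substrings rather than whole words, which is safe here only because every ID has the same fixed length.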
Split your source data ColumnB with Text to Columns. Unpivot the result, delete the middle column and pivot what's left.
You could do this fairly easily with a User Defined Function. The function below takes two arguments. The first is the range constituting your second column, labelled above "People working on that incident". The second is your List, which is a range consisting of a single entry for each ID you wish to count. As shown in your example, if multiple identical IDs appear in a single entry (e.g. your ID 2 has CC2222 twice), each one will be counted.
To enter this User Defined Function (UDF), open the Visual Basic Editor.
Ensure your project is highlighted in the Project Explorer window.
Then, from the top menu, select Insert/Module and
paste the code below into the window that opens.
To use this User Defined Function (UDF), enter a formula like
=InterventionCount(B2:B5,H1:H2)
in some cell.
Option Explicit
Function InterventionCount(myRange As Range, myList As Range) As Long
Dim RE As Object, MC As Object
Dim vRange As Variant, vList As Variant
Dim sPat As String
Dim I As Long
vRange = myRange
vList = myList
If IsArray(vList) Then
For I = 1 To UBound(vList)
If Not vList(I, 1) = "" Then _
sPat = sPat & "|" & vList(I, 1)
Next I
Else
sPat = "|" & vList
End If
sPat = "\b(?:" & Mid(sPat, 2) & ")\b"
Set RE = CreateObject("vbscript.regexp")
With RE
.Global = True
.ignorecase = True
.Pattern = sPat
End With
For I = 1 To UBound(vRange)
Set MC = RE.Execute(vRange(I, 1))
InterventionCount = InterventionCount + MC.Count
Next I
End Function
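For readers who want to see the pattern-building step in isolation, here is a minimal sketch of the same approach in Python rather than VBA: join the wanted IDs into one alternation wrapped in word boundaries, then count the matches per cell. The function name `intervention_count` mirrors the UDF above but is otherwise my own; the data is the sample from the question.

```python
import re

def intervention_count(cells, wanted):
    """Count whole-word, case-insensitive matches of any wanted ID."""
    words = [re.escape(w) for w in wanted if w]  # skip blank list entries
    if not words:
        return 0
    pattern = re.compile(r"\b(?:" + "|".join(words) + r")\b", re.IGNORECASE)
    return sum(len(pattern.findall(cell)) for cell in cells)

cells = [
    "AA0000 BB1111 CC2222 ZZ1234",
    "BB1111",
    "CC2222 ZZ1234 CC2222 ZZ1234",
    "BB1111 CC2222 AA0000 BB1111",
]
print(intervention_count(cells, ["AA0000", "CC2222"]))
```

The word boundaries are what stop a shorter ID from being counted inside a longer one, which the plain substring approach cannot guarantee.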
For a non-VBA solution you could use a helper column. Again, List is a single column which contains the list of people you want to add up, one entry per cell.
If your data is in column B, then add a helper column and enter this formula in C2:
This formula must be array-entered; and the $A:$J terms represent a counter allowing for up to ten items in the entries in column B. If there might be more than that, expand as needed: e.g. for up to 26 items, you would change them to $A:$Z
=SUM(N(TRIM(MID(SUBSTITUTE(B2," ",REPT(" ",99)),(COLUMN($A:$J)=1)+(COLUMN($A:$J)>1)*(COLUMN($A:$J)-1)*99,99))=(List)))
Fill down as far as necessary, then SUM the column to get your total.
To array-enter a formula, after entering
the formula into the cell or formula bar, hold down
ctrl-shift while hitting enter. If you did this
correctly, Excel will place braces {...} around the formula.
I finally went for a completely different solution based on my working formula for 1 employee:
SUM(
LEN('[myFile.xlsx]Sheet1'!$A:$A)
-LEN(
SUBSTITUTE('[myFile.xlsx]Sheet1'!$A:$A;$Z$1;"")
)
)
/LEN($Z$1)
Instead of trying something more complicated, I just added a new column to my employee list where the total is evaluated for each employees (it was already needed elsewhere anyway). Then, I just have to sum up all the employees to get my total.
It is not as elegant as I would like, and it feels like a workaround, but since it is the easiest solution from a programming standpoint and I need the individual data anyway, it's what I really need for now.
+1 to all the other answers for your help though.

Resources