create relationship based on two columns - relationship

I have two tables like below
Table 01
Company Date Size
A 01/05/2000 30
A 01/06/2000 40
B 01/05/2000 80
B 01/06/2000 90
Table 02
Company Date sales
A 01/05/2000 30
A 01/06/2000 40
B 01/05/2000 80
B 01/06/2000 90
I want to create a relationship between these two tables based on both date and company.
How can I create a relationship between two tables on two columns?
Thanks in advance

For a Power Pivot / Data Model, you can only use one column in a relationship. You can concatenate columns in a new calculated column using the & operator, e.g.
= 'Table 01'[Company] & "|" & 'Table 01'[Date]
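The same composite-key idea can be sketched outside Power Pivot. The snippet below is only a minimal Python illustration, not Power Pivot code: the `composite_key` helper and the variable names are made up for this example, and the rows are the sample data from the question.

```python
# Sample rows from the question: (Company, Date, value).
table_01 = [("A", "01/05/2000", 30), ("A", "01/06/2000", 40),
            ("B", "01/05/2000", 80), ("B", "01/06/2000", 90)]
table_02 = [("A", "01/05/2000", 30), ("A", "01/06/2000", 40),
            ("B", "01/05/2000", 80), ("B", "01/06/2000", 90)]


def composite_key(company, date):
    """Join two columns into one key (hypothetical helper).

    The "|" delimiter keeps pairs such as ("AB", "C") and ("A", "BC") distinct.
    """
    return f"{company}|{date}"


# Index Table 02 by the single combined key, then look up each Table 01 row.
sales_by_key = {composite_key(c, d): sales for c, d, sales in table_02}
joined = [(c, d, size, sales_by_key.get(composite_key(c, d)))
          for c, d, size in table_01]
```

Each entry of `joined` holds Company, Date, Size and the matching Sales value, which is what a relationship on the concatenated column gives you in the data model.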

Assuming your Table 1 is in columns A, B, C of "Sheet1", and Table 2 is laid out the same way in "Sheet2", you can just use an INDEX/MATCH (entered with CTRL+SHIFT+ENTER).
In your Table 1, go to D2 (the first non-header row in column D, next to the 30), and use this formula:
=Index(Sheet2!$C$2:$C$10,Match($A2&$B2,Sheet2!$A$2:$A$10&Sheet2!$B$2:$B$10,0))
I'm assuming your last row is 10, if not, just change that in all parts of the formula. This should leave you with:
Table 01
Company Date Size Sales
A 01/05/2000 30 30
A 01/06/2000 40 40
B 01/05/2000 80 80
B 01/06/2000 90 90


star-schema - fact table modeling

I need to model a star schema for a business need around liquidity stress testing.
I will try to give an analogous example.
Let's say we have deals about financing, financial securities, etc.
In the fact table, at a given date, these deals have a real value of X euros, but that value will vary over time, so we also have some projected values.
My concern is how to represent these projected values for the deals, and more specifically what granularity to choose.
(The example below is an oversimplification of the fact table; in reality, dimension IDs are used.)
Method 1: as many metric columns as projection values calculated
AsofDate    DealId          0Day  1Day  7Days  1Month
2022-01-01  financingDeal1  100   99    98     85
2022-01-01  financingDeal2  150   150   120    120
2022-01-01  financingDeal3  100   99    98     85
2022-01-01  financingDeal4  100   99    98     85
Method 2: add a level of granularity. A row is no longer just a deal on a given date; it is a deal on a given date and its projection over the next few days/months
AsofDate    DealId          projection     value
2022-01-01  financingDeal1  0Day - actual  100
2022-01-01  financingDeal1  1Day           99
2022-01-01  financingDeal1  7Days          99
2022-01-01  financingDeal1  1Month         85
From where I see it:
For method 1, the main drawback is that if, in the future, we get a new projection value for 3 months, we will need to add a '3months' column in the ETL and in the OLAP cube.
For method 2, we will have as many rows as (deals * projections); we have 11 projections, so that is 11 rows for each deal, and we have 1 million+ deals.
What is your opinion on this topic?
Thanks for your consideration
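To make the two layouts concrete, here is a minimal Python sketch that unpivots Method 1 rows into Method 2 rows. The `to_long` function and the `HORIZONS` list are hypothetical names for this illustration, and the rows are two of the sample deals above; it is not a statement about how the actual ETL is built.

```python
# Projection horizons; adding "3Months" later only means extending this list.
HORIZONS = ["0Day", "1Day", "7Days", "1Month"]

# Method 1 (wide): one row per deal and date, one column per horizon.
wide_rows = [
    {"AsofDate": "2022-01-01", "DealId": "financingDeal1",
     "0Day": 100, "1Day": 99, "7Days": 98, "1Month": 85},
    {"AsofDate": "2022-01-01", "DealId": "financingDeal2",
     "0Day": 150, "1Day": 150, "7Days": 120, "1Month": 120},
]


def to_long(rows, horizons):
    """Unpivot wide rows into Method 2: one row per (date, deal, projection)."""
    return [
        {"AsofDate": r["AsofDate"], "DealId": r["DealId"],
         "projection": h, "value": r[h]}
        for r in rows
        for h in horizons
    ]


long_rows = to_long(wide_rows, HORIZONS)
```

The row count of the long layout is deals times horizons, so 2 deals and 4 horizons give 8 rows here; with the 11 projections and 1 million deals mentioned in the question that is 11 million rows per as-of date.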

Add Countif to Array Formula (Subtotal) in Excel

I am new to array formulae and have noticed that while SUBTOTAL includes many functions, it does not feature COUNTIF (only COUNT and COUNTA).
I'm trying to figure out how I can integrate a COUNTIF-like feature to my array formula.
I have a matrix, a small subset of which looks like:
A B C D E
48 53 46 64 66
48 66 89
40 38 42 49 44
(empty row)
37 33 35 39 41
Thanks to the help of @Tom Sharpe in this post, I (he) was able to average the sum of each row in the matrix provided it had complete data (so rows 2 and 4 in the example above would not be included).
Now I would like to count the number of rows with complete data (so rows 2 and 4 would be ignored) which include at least one value above a given threshold (say 45).
In the current example, the result would be 2, since row 1 has 5/5 values > 45, and row 3 has 1 value > 45. Row 5 has only values < 45, and rows 2 and 4 have partially or fully missing data, respectively.
I have recently discovered the SUMPRODUCT function and think that perhaps SUMPRODUCT(--(A1:E1>=45)) could be useful, but I'm not sure how to integrate it within Tom Sharpe's elegant code, e.g.,
=AVERAGE(IF(SUBTOTAL(2,OFFSET(A1,ROW(A1:A5)-ROW(A1),0,1,COLUMNS(A1:E1)))=COLUMNS(A1:E1),SUBTOTAL(9,OFFSET(A1,ROW(A1:A5)-ROW(A1),0,1,COLUMNS(A1:E1))),""))
Remember, I am no longer looking for the average: I want to filter rows for whether they have full data, and if they do, I want to count rows with at least 1 entry > 45.
Try the following. Enter as array formula.
=COUNT(IF(SUBTOTAL(4,OFFSET(A1,ROW(A1:A5)-ROW(A1),0,1,COLUMNS(A1:E1)))>45,IF(SUBTOTAL(2,OFFSET(A1,ROW(A1:A5)-ROW(A1),0,1,COLUMNS(A1:E1)))=COLUMNS(A1:E1),SUBTOTAL(9,OFFSET(A1,ROW(A1:A5)-ROW(A1),0,1,COLUMNS(A1:E1))))))
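The logic of this formula (row maximum above the threshold, and row count equal to the number of columns) can be checked with a small Python sketch. This is only an illustration of the rule, not Excel code: the function name is made up, and the positions of the blank cells in the partial row are an assumption, since the question does not show which columns are empty.

```python
THRESHOLD = 45

# Rows from the question; None marks a missing cell.
matrix = [
    [48, 53, 46, 64, 66],
    [48, None, 66, None, 89],        # partial row (blank positions assumed)
    [40, 38, 42, 49, 44],
    [None, None, None, None, None],  # fully missing row
    [37, 33, 35, 39, 41],
]


def count_complete_rows_above(rows, threshold):
    """Count rows with no missing cells whose largest value exceeds threshold."""
    return sum(
        1
        for row in rows
        if all(v is not None for v in row) and max(row) > threshold
    )


result = count_complete_rows_above(matrix, THRESHOLD)
```

For the sample data `result` is 2: rows 1 and 3 are complete and contain a value above 45, row 5 is complete but stays below 45, and rows 2 and 4 are skipped.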

Pulling ordered list using array functions in Excel

I have a report in excel that displays the sales results from each employee. The columns are Location, Region, Username & Sales. It is sorted by Sales descending, showing which employee has the best sales in the company.
I am attempting to have an additional sheet per region that displays the results for all employees in that region, also sorted by Sales (to avoid sorting the results of the many regions myself every day).
An example version of the first 12 rows of the Data Sheet:
G H I J K X
Row Location Username Sales Region Region
1 38 John.Doe 85 North1 North1
2 154 John.Smith 83 South2
3 23 E.Williams 83 North1
4 210 M.Williams 79 East5
5 139 Joe.Dawn 77 North2
6 22 Kay.Smith 69 South2
7 51 Jay.Smith 69 South2
8 125 L.Smith 69 East2
9 51 L.Day 69 South2
10 23 23.Guest2 67 North1
11 92 U.Goode 65 North4
I have successfully created an array function that pulls the Sales column of only the results in the specified region.
{=LARGE(SMALL(IF(IF(ISERROR(K:K),"",K:K)=$X$2,J:J),
ROW(INDIRECT("1:"&COUNTIF(K:K,$X$2)))),F2)}
I am attempting now for an array function that pulls the Username that matches the corresponding sales amount in the original array, and also matches the region. I am having trouble when a single region has 'ties' or more than one employee with the same sales that month. Here is what I started with for that function:
=INDEX(I:I,MATCH(1,(Y2=J:J)*($X$2=K:K),0))
but that fails when a single region has multiple users with the same sales. So I am trying a conditional to accommodate that, falling back to the function I know works when there is only one user with that sales value in the region.
{=IF(COUNTIF($AB$2:AB2,AB2)>1,
INDEX(I:I,
SMALL(IF(J:J=AB2,
IF(K:K=$AB$2,ROW(K:K)-ROW(INDEX(K:K,1,1))+1)),
COUNTIF($AB$2:AB2,AB2))),
INDEX(I:I,MATCH(1,(AC2=J:J)*($AB$2=K:K),0)))}
The inner piece may be sufficient if it worked, excluding the need for the conditional:
{=INDEX(I:I,
SMALL(IF(J:J=AB2,
IF(K:K=$AB$2,ROW(K:K)-ROW(INDEX(K:K,1,1))+1)),
COUNTIF($AB$2:AB2,AB2)))}
I'll use the same function for Username.
Expected results for two regions:
X       Y      Z           AA        AB      AC     AD          AE
Region  Sales  Username    Location  Region  Sales  Username    Location
North1  85     John.Doe    38        South2  83     John.Smith  154
        83     E.Williams  23                69     Kay.Smith   22
        67     23.Guest2   23                69     Jay.Smith   51
                                             69     L.Day       51
Since beginning to type this question I have found a work around that includes a few additional columns to complete the calculation, but still wanted to ask this to see if it was possible for knowledge's sake.
With North1 in X2, these are the formulas for Y2:AA2.
=IFERROR(AGGREGATE(14, 6, ($J$2:$J$999)/($K$2:$K$999=X$2), ROW(1:1)), "")
=IFERROR(INDEX($I:$I, AGGREGATE(15, 6, ROW($2:$999)/(($K$2:$K$999=X$2)*($J$2:$J$999=Y2)), COUNTIF(Y$2:Y2, Y2))), "")
=IFERROR(INDEX($H:$H, AGGREGATE(15, 6, ROW($2:$999)/(($K$2:$K$999=X$2)*($J$2:$J$999=Y2)), COUNTIF(Y$2:Y2, Y2))), "")
Fill down as necessary.
With South2 in AB2, copy Y2:AA2 to AC2:AE2 and fill down as necessary.
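What these formulas compute (filter by region, order by Sales descending, keep tied rows in sheet order) can be sketched in Python as a cross-check. The `region_report` function is a hypothetical name for this illustration, and the rows are the sample data from the question as (Location, Username, Sales, Region) tuples.

```python
# Sample rows from the data sheet, in sheet order.
rows = [
    (38, "John.Doe", 85, "North1"),
    (154, "John.Smith", 83, "South2"),
    (23, "E.Williams", 83, "North1"),
    (210, "M.Williams", 79, "East5"),
    (139, "Joe.Dawn", 77, "North2"),
    (22, "Kay.Smith", 69, "South2"),
    (51, "Jay.Smith", 69, "South2"),
    (125, "L.Smith", 69, "East2"),
    (51, "L.Day", 69, "South2"),
    (23, "23.Guest2", 67, "North1"),
    (92, "U.Goode", 65, "North4"),
]


def region_report(all_rows, region):
    """Return one region's rows ordered by Sales, highest first.

    sorted() is stable, so employees with tied Sales keep their sheet order,
    which mirrors the COUNTIF-based tie handling in the formulas.
    """
    in_region = [r for r in all_rows if r[3] == region]
    return sorted(in_region, key=lambda r: r[2], reverse=True)


north1 = region_report(rows, "North1")
south2 = region_report(rows, "South2")
```

The usernames in `south2` come out as John.Smith, Kay.Smith, Jay.Smith, L.Day, matching the expected results table in the question.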

Do arithmetic inside database. Is this possible?

Is it possible to do this dynamically: add stage_1 + stage_2 and save the result into the column called total? I am using phpMyAdmin, and the stage columns are of type float.
Car stage_1 stage_2 total
1 30 50 80
2 28 51 79
3 31 51 82
Thanks in advance for any help.
Try this:
update cartable set total = stage_1 + stage_2
In fact, instead of storing the column total in the database, you could just create a view:
create view carview as
select Car, stage_1, stage_2, stage_1 + stage_2 as total
from cartable
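The difference between the stored column and the view can be tried out with a short Python script. As an assumption for the sake of a self-contained example, it uses the standard-library `sqlite3` module with an in-memory SQLite database standing in for MySQL; the table, view, and column names follow the statements above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for the MySQL database
conn.execute(
    "CREATE TABLE cartable (Car INTEGER, stage_1 REAL, stage_2 REAL, total REAL)"
)
conn.executemany(
    "INSERT INTO cartable (Car, stage_1, stage_2) VALUES (?, ?, ?)",
    [(1, 30, 50), (2, 28, 51), (3, 31, 51)],
)

# Option 1: compute the total once and store it in the column.
conn.execute("UPDATE cartable SET total = stage_1 + stage_2")
stored_totals = conn.execute(
    "SELECT Car, total FROM cartable ORDER BY Car"
).fetchall()

# Option 2: a view that computes the total every time it is queried.
conn.execute(
    "CREATE VIEW carview AS "
    "SELECT Car, stage_1, stage_2, stage_1 + stage_2 AS total FROM cartable"
)

# Change a stage value: the view follows, the stored column does not.
conn.execute("UPDATE cartable SET stage_2 = 60 WHERE Car = 1")
view_total_car1 = conn.execute(
    "SELECT total FROM carview WHERE Car = 1"
).fetchone()[0]
stored_total_car1 = conn.execute(
    "SELECT total FROM cartable WHERE Car = 1"
).fetchone()[0]
```

After the last update, the view reports 90 for car 1 while the stored column still holds 80, which is the main reason to prefer the view when the stage values can change.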

SQLServer Calculate Average of Multiple Columns

I have generated a table using PIVOT, and the output columns are dynamic. One such output is given below:
user test1 test2 test3
--------------------------------
A1 10 20 30
A2 90 87 75
A3 78 12 34
The output of above table represents a list of users attending tests. The tests will be added dynamically, so the columns are dynamic in nature.
Now, I want to find out average marks of each user as well as average marks of each test.
I am able to calculate the average of each test, but I am puzzled about how to find the average of each user.
Is there a way to do this?
Please help.
Mahesh
You can add the marks for each user then divide by the number of columns:
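-- [user] is bracketed because USER is a reserved word in T-SQL.
-- Dividing by 3.0 avoids integer division if the marks are stored as INT.
SELECT
    [user],
    (test1 + test2 + test3) / 3.0 AS average_mark
FROM users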
Or to ignore NULL values:
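-- * 1.0 avoids integer division; NULLIF returns NULL instead of
-- dividing by zero when a user has taken no tests at all.
SELECT
    [user],
    (ISNULL(test1, 0) + ISNULL(test2, 0) + ISNULL(test3, 0)) * 1.0 / NULLIF(
        CASE WHEN test1 IS NULL THEN 0 ELSE 1 END +
        CASE WHEN test2 IS NULL THEN 0 ELSE 1 END +
        CASE WHEN test3 IS NULL THEN 0 ELSE 1 END,
    0) AS average_mark
FROM users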
Your table structure has two disadvantages:
Because your table structure is created dynamically you would also have to construct this query dynamically.
Because some students will not have taken all tests, you may have some NULL values.
You may want to consider changing your table structure to fix both of these problems. I would suggest that you use the following structure for your table:
user test mark
-------------------
A1 1 10
A2 1 90
A3 1 78
A1 2 20
A2 2 87
A3 2 12
A1 3 30
A2 3 75
A3 3 34
Then you can do this to get the average mark per user:
SELECT [user], AVG(mark) AS average_mark
FROM users
GROUP BY [user]
And this to get the average mark per test:
SELECT test, AVG(mark) AS average_mark
FROM users
GROUP BY test
Can you do it on your data source before you pivot it?
The simple answer is to UNPIVOT the same way you just PIVOTed. But the best answer is to not do the PIVOT in the first place! Store the unpivoted data in a table first, then from that do your PIVOT and your average.
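The unpivot-then-average approach can be illustrated with a small Python sketch. This is not T-SQL: `unpivot` and `average_by` are hypothetical helper names, and the rows are the sample marks from the question. Missing marks (None) are skipped, which mirrors how AVG ignores NULL values.

```python
from collections import defaultdict

# Pivoted rows as in the question: one column per test.
pivoted = [
    {"user": "A1", "test1": 10, "test2": 20, "test3": 30},
    {"user": "A2", "test1": 90, "test2": 87, "test3": 75},
    {"user": "A3", "test1": 78, "test2": 12, "test3": 34},
]


def unpivot(rows, key="user"):
    """Turn each pivoted row into (user, test, mark) tuples, skipping None."""
    return [
        (row[key], column, mark)
        for row in rows
        for column, mark in row.items()
        if column != key and mark is not None
    ]


def average_by(long_rows, index):
    """Average the mark (position 2) grouped by the field at position index."""
    groups = defaultdict(list)
    for row in long_rows:
        groups[row[index]].append(row[2])
    return {name: sum(marks) / len(marks) for name, marks in groups.items()}


long_rows = unpivot(pivoted)
avg_per_user = average_by(long_rows, 0)  # grouped by user
avg_per_test = average_by(long_rows, 1)  # grouped by test
```

Because the long form has one row per (user, test), adding a new test needs no change to either averaging step, whatever the number of test columns in the pivoted output.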
