DAX - Divide a column over itself with different filters to get percentages - pivot-table

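In Power Pivot I have tables along the lines of:
Table 1
| Year | Month | Branch_ID | Store_ID | Article | Value |
|------|-------|-----------|----------|--------------------|-------|
| 2022 | 10 | 1 | 1 | Sales | 100 |
| 2022 | 10 | 1 | 2 | Sales | 200 |
| 2022 | 10 | 1 | 2 | Operating expenses | 50 |
| 2022 | 10 | 1 | 1 | Operating expenses | 80 |
| 2022 | 10 | 1 | 2 | Cost of Sales | 20 |
| 2022 | 10 | 1 | 1 | Cost of Sales | 30 |
Table 2
| Year | Month | Branch_ID | Store_ID | Article | Value |
|------|-------|-----------|----------|-------------|-------|
| 2022 | 10 | 1 | 1 | Sales_Ecomm | 20 |
| 2022 | 10 | 1 | 2 | Sales_Ecomm | 15 |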
Table 3
| Article |
|--------------------|
| Sales |
| Operating expenses |
| Cost of Sales |
| Sales_Ecomm |
There are multiple branches and months, so these columns may not be ignored.
Table 1 and table 2 are separate. Table 3 is connected to both so that I could build a pivot table.
In the pivot table I want to have all articles re-evaluated as percentage of Sales, i.e. I am trying to get a pivot table along the lines of:
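| Store ID | Sales: Value | Sales: % of sales | Operating expenses: Value | Operating expenses: % of sales | Cost of Sales: Value | Cost of Sales: % of sales | Sales_Ecomm: Value | Sales_Ecomm: % of sales |
|----------|--------------|-------------------|---------------------------|--------------------------------|----------------------|---------------------------|--------------------|-------------------------|
| 1 | 100 | 100.00% | 80 | 80.00% | 30 | 30.00% | 20 | 20.00% |
| 2 | 200 | 100.00% | 50 | 25.00% | 20 | 10.00% | 15 | 7.50% |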
I have a measure
Val. := sum(table1[Value]) + sum(table2[value])
which seems to be working for absolute values of the articles.
However, I can't seem to come up with an appropriate DAX measure for percentages. I have tried:
%_of_Sales := [Val.] / calculate([Val.], filter(table3; table3[Article]="Sales"))
but it only counts Sales as percentage of Sales (100%), yielding #NUM! for other articles in the pivot table.
How do I define a ratio measure so that every article is evaluated against Sales?

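You're missing a crucial ALL. FILTER(table3; ...) iterates table3 as it is already filtered by the pivot's current column, so under any article other than Sales it finds no matching rows and the denominator is blank. Wrapping the table in ALL removes that filter first (swap the commas for semicolons if your locale uses them as list separators):
=
DIVIDE(
    [Val.],
    CALCULATE(
        [Val.],
        FILTER(
            ALL( table3 ),
            table3[Article] = "Sales"
        )
    )
)
which is equivalent to:
=
DIVIDE(
    [Val.],
    CALCULATE(
        [Val.],
        table3[Article] = "Sales"
    )
)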
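To sanity-check the expected percentages outside Power Pivot, the same arithmetic can be sketched in plain Python. This is only an illustration of what the measure should return for the sample rows, not a translation of the DAX engine; the function name `pct_of_sales` and the tuple layout are made up for this example.

```python
# Sample rows from Table 1 and Table 2 combined: (store_id, article, value).
rows = [
    (1, "Sales", 100), (2, "Sales", 200),
    (2, "Operating expenses", 50), (1, "Operating expenses", 80),
    (2, "Cost of Sales", 20), (1, "Cost of Sales", 30),
    (1, "Sales_Ecomm", 20), (2, "Sales_Ecomm", 15),
]

def pct_of_sales(rows):
    """Return {(store, article): value / Sales value of the same store}."""
    totals = {}
    for store, article, value in rows:
        key = (store, article)
        totals[key] = totals.get(key, 0) + value
    result = {}
    for (store, article), value in totals.items():
        # The denominator ignores the current article (the role ALL plays in
        # the DAX measure) but keeps the current store (the remaining filter).
        sales = totals.get((store, "Sales"))
        result[(store, article)] = value / sales if sales else None
    return result

pct = pct_of_sales(rows)
for key in sorted(pct):
    print(key, f"{pct[key]:.2%}")
```

The key point mirrored here is that the denominator is looked up by store only, regardless of which article the current cell belongs to.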

Related

Sum values from multiple tables grouping by a common column

I have three tables in MS SQL Server 2014. Each of them holds a couple of numeric values, a description and a date. For the sake of brevity, let's assume the following tables:
table "beverages"
day beverage amount
---------- -------- ------
2018-12-01 water 2
2018-12-01 tea 1
2018-12-01 coffee 7
2018-12-02 water 4
2018-12-02 tea 2
table "meals"
day meal amount
---------- ------ ------
2018-12-01 burger 1
2018-12-01 bread 2
2018-12-02 steak 1
table "fruit"
day fruit amount
---------- ------ ------
2018-12-01 apple 4
2018-12-01 banana 1
2018-12-02 apple 2
Then I have another table holding only a list of dates.
table "dates"
day
----------
2018-12-01
2018-12-02
What I need is a query that returns one row for each of the rows in the dates table, and in each row has the date, the total amount of beverages, the total amount of meals and the total amount of fruit for that day. I do not care for the different types of beverages, meals and fruit, just the sum. The result should be:
expected result
day beverages meals fruit
---------- ----------- ----------- -----------
2018-12-01 10 3 5
2018-12-02 6 1 2
But instead I receive
received result
day beverages meals fruit
---------- ----------- ----------- -----------
2018-12-01 40 18 30
2018-12-02 6 2 4
I already know what the problem is, just not how to fix it. Even worse, I'm sure that I knew the answer once, but now I can't even figure the right search terms to make Google tell me...
When I do the query like this (I used table variables for testing)
SELECT
    [d].[day]
    ,SUM([b].[amount]) AS [beverages]
    ,SUM([m].[amount]) AS [meals]
    ,SUM([f].[amount]) AS [fruit]
FROM #dates AS [d]
LEFT OUTER JOIN #beverages AS [b]
    ON [d].[day] = [b].[day]
LEFT OUTER JOIN #meals AS [m]
    ON [d].[day] = [m].[day]
LEFT OUTER JOIN #fruit AS [f]
    ON [d].[day] = [f].[day]
GROUP BY [d].[day]
it sums each row from the different tables more than once, because it returns every possible combination of the three tables. Removing the SUM() and GROUP BY proves that:
day beverages meals fruit
---------- ----------- ----------- -----------
2018-12-01 2 1 4
2018-12-01 2 1 1
2018-12-01 2 2 4
2018-12-01 2 2 1
2018-12-01 1 1 4
2018-12-01 1 1 1
2018-12-01 1 2 4
2018-12-01 1 2 1
2018-12-01 7 1 4
2018-12-01 7 1 1
2018-12-01 7 2 4
2018-12-01 7 2 1
2018-12-02 4 1 2
2018-12-02 2 1 2
So, what do I need to change in the query to make it sum the values for each of the three tables without multiplying it with the number of the rows in the other tables?
Group the Tables before joining like so:
SELECT
    [d].[day]
    ,[b].[amount] AS [beverages]
    ,[m].[amount] AS [meals]
    ,[f].[amount] AS [fruit]
FROM #dates AS [d]
LEFT OUTER JOIN (SELECT day, SUM(amount) AS amount FROM #beverages GROUP BY day) AS [b]
    ON [d].[day] = [b].[day]
LEFT OUTER JOIN (SELECT day, SUM(amount) AS amount FROM #meals GROUP BY day) AS [m]
    ON [d].[day] = [m].[day]
LEFT OUTER JOIN (SELECT day, SUM(amount) AS amount FROM #fruit GROUP BY day) AS [f]
    ON [d].[day] = [f].[day]
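A minimal sketch that reproduces both the inflated sums and the pre-aggregated fix on the sample data. It assumes Python's built-in sqlite3 as a stand-in for SQL Server and uses plain tables instead of temp tables, so the syntax differs slightly from the T-SQL above.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE dates (day TEXT);
CREATE TABLE beverages (day TEXT, beverage TEXT, amount INTEGER);
CREATE TABLE meals (day TEXT, meal TEXT, amount INTEGER);
CREATE TABLE fruit (day TEXT, fruit TEXT, amount INTEGER);
INSERT INTO dates VALUES ('2018-12-01'), ('2018-12-02');
INSERT INTO beverages VALUES
    ('2018-12-01', 'water', 2), ('2018-12-01', 'tea', 1),
    ('2018-12-01', 'coffee', 7), ('2018-12-02', 'water', 4),
    ('2018-12-02', 'tea', 2);
INSERT INTO meals VALUES
    ('2018-12-01', 'burger', 1), ('2018-12-01', 'bread', 2),
    ('2018-12-02', 'steak', 1);
INSERT INTO fruit VALUES
    ('2018-12-01', 'apple', 4), ('2018-12-01', 'banana', 1),
    ('2018-12-02', 'apple', 2);
""")

# Joining the raw tables multiplies rows (3 x 2 x 2 on the first day),
# so every SUM is inflated by the row counts of the other two tables.
naive = con.execute("""
    SELECT d.day, SUM(b.amount), SUM(m.amount), SUM(f.amount)
    FROM dates d
    LEFT JOIN beverages b ON d.day = b.day
    LEFT JOIN meals m ON d.day = m.day
    LEFT JOIN fruit f ON d.day = f.day
    GROUP BY d.day
    ORDER BY d.day
""").fetchall()

# Aggregating each table to one row per day first keeps every join 1:1.
fixed = con.execute("""
    SELECT d.day, b.amount, m.amount, f.amount
    FROM dates d
    LEFT JOIN (SELECT day, SUM(amount) AS amount FROM beverages GROUP BY day) b
        ON d.day = b.day
    LEFT JOIN (SELECT day, SUM(amount) AS amount FROM meals GROUP BY day) m
        ON d.day = m.day
    LEFT JOIN (SELECT day, SUM(amount) AS amount FROM fruit GROUP BY day) f
        ON d.day = f.day
    ORDER BY d.day
""").fetchall()

print(naive)
print(fixed)
```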
How about a PIVOT instead?
Example
Select *
From (
    -- The Item labels must match the column names listed in the PIVOT clause
    Select day,Item='beverages',amount from beverages
    Union All
    Select day,Item='meals' ,amount from meals
    Union All
    Select day,Item='fruit' ,amount from fruit
) src
Pivot ( sum(amount) for Item in ([beverages],[meals],[fruit]) ) pvt

SQL Server, finding the SUM of a foreign key column

I currently have 3 tables.
Table 1: customers
id(PK) name surname
----------------------------------
1 name1 surname1
2 name2 surname2
3 name3 surname3
4 name4 surname4
Table 2: sales
id(FK) game(FK) price(FK)
-----------------------------
1 1 1
2 4 4
3 4 4
4 3 3
1 3 3
2 3 3
3 2 2
Table 3: stock
id(FK) game price
-----------------------------
1 game1 20
2 game2 30
3 game3 40
4 game4 50
What I'm looking to do is find the sum of all the sales listed in the sales table (table 2).
So far, I can display a table showing how much money each game has made in total but cannot get the overall total of sales to display.
I have tried
select sum(sales.price)
from sales
However, this is just calculating the sum of the foreign key (in this case it would be 20). However, I want it to display 270.
You need to join the stock and sales tables to get the correct price of each of the items sold.
Select sum(stock.price) from sales
inner join stock on sales.game = stock.id
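The difference between summing the foreign key and summing the joined price can be checked on the sample data. This is a sketch that assumes Python's built-in sqlite3 in place of SQL Server; the table and column names follow the question.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE stock (id INTEGER PRIMARY KEY, game TEXT, price INTEGER);
CREATE TABLE sales (id INTEGER, game INTEGER, price INTEGER);
INSERT INTO stock VALUES (1, 'game1', 20), (2, 'game2', 30),
                         (3, 'game3', 40), (4, 'game4', 50);
INSERT INTO sales VALUES (1, 1, 1), (2, 4, 4), (3, 4, 4), (4, 3, 3),
                         (1, 3, 3), (2, 3, 3), (3, 2, 2);
""")

# Summing sales.price only adds up the key values stored in that column.
key_sum = con.execute("SELECT SUM(price) FROM sales").fetchone()[0]

# Joining to stock replaces each key with the actual price before summing.
total = con.execute("""
    SELECT SUM(stock.price)
    FROM sales
    INNER JOIN stock ON sales.game = stock.id
""").fetchone()[0]

print(key_sum, total)
```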

Latest record for each user number?

I did search, but the uniqueness of each question makes it hard for me to "translate" it for my dataset.
I have table A named: CLOGS17
With a sub-set of the data and fields shown:
SERIALNO EVDATE SYSNO AREA USRNO
4 2017-01-01 02:03:48.000 1 4 10
4 2017-01-01 02:09:00.000 1 4 10
4 2017-01-01 02:24:44.997 1 6 10
4 2017-01-01 02:56:50.000 1 2 18
5 2017-08-08 02:03:48.000 1 4 10
5 2017-01-09 02:09:00.000 1 4 10
6 2017-04-03 02:24:44.997 1 6 10
8 2017-05-05 02:56:50.000 1 2 18
My goal is to retrieve all records where the combination of SERIALNO + SYSNO + AREA + USRNO has not been used in the last 30 days (an inactive user, essentially) so I can delete that USRNO.
Desired output from above data would be (newest record for each SERIALNO, SYSNO, AREA, and USRNO distinct combination):
SERIALNO EVDATE SYSNO AREA USRNO
4 2017-01-01 02:09:00.000 1 4 10
4 2017-01-01 02:24:44.997 1 6 10
4 2017-01-01 02:56:50.000 1 2 18
5 2017-08-08 02:03:48.000 1 4 10
6 2017-04-03 02:24:44.997 1 6 10
8 2017-05-05 02:56:50.000 1 2 18
I am then able to get only those within the last 30 days.
Given the table data below ("Table B"), it is a list of all stored users:
SERIALNO CONTID SYSNO AREA USRID
36 001 1 * 1
36 001 1 * 18
36 001 1 * 2
36 001 1 * 29
36 001 1 * 36
36 001 1 1 10
This table contains ALL users in the system.
How can I return all the users from Table B that have not been used for a given CONTID, SYSNO, and AREA?
For the first part of your question, it is as simple as a GROUP BY over the desired fields:
SELECT SERIALNO,
       SYSNO,
       AREA,
       USRNO,
       MAX(EVDATE)
FROM CLOGS17
GROUP BY SERIALNO,
         SYSNO,
         AREA,
         USRNO
This query will give you the output you show in your question. You didn't provide much information about the second part, though.
So, to get all users that don't meet your 30-day criterion (whatever it is), left join your user table to the query above and keep the rows where the joined side is null, like this:
SELECT *
FROM tableb tb
LEFT JOIN (SELECT SERIALNO,
                  SYSNO,
                  AREA,
                  USRNO,
                  MAX(EVDATE) AS LAST_EVDATE  -- derived-table columns need a name
           FROM CLOGS17
           GROUP BY SERIALNO,
                    SYSNO,
                    AREA,
                    USRNO) a
       ON tb.SERIALNO = a.SERIALNO
      AND tb.SYSNO = a.SYSNO
      AND tb.USRID = a.USRNO  -- Table B names the user column USRID
WHERE a.AREA IS NULL
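The two steps (latest event per combination, then an anti-join to find users with no events) can be tried on a small data set. The sketch below assumes Python's built-in sqlite3 instead of SQL Server, and the `users` rows, including user 99, are hypothetical and chosen only so that one user has no log entries.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE clogs17 (serialno INTEGER, evdate TEXT, sysno INTEGER,
                      area INTEGER, usrno INTEGER);
INSERT INTO clogs17 VALUES
    (4, '2017-01-01 02:03:48.000', 1, 4, 10),
    (4, '2017-01-01 02:09:00.000', 1, 4, 10),
    (4, '2017-01-01 02:24:44.997', 1, 6, 10),
    (4, '2017-01-01 02:56:50.000', 1, 2, 18),
    (5, '2017-08-08 02:03:48.000', 1, 4, 10),
    (5, '2017-01-09 02:09:00.000', 1, 4, 10),
    (6, '2017-04-03 02:24:44.997', 1, 6, 10),
    (8, '2017-05-05 02:56:50.000', 1, 2, 18);
-- Hypothetical user list; user 99 never appears in the log.
CREATE TABLE users (serialno INTEGER, sysno INTEGER, usrid INTEGER);
INSERT INTO users VALUES (4, 1, 10), (4, 1, 18), (4, 1, 99);
""")

# Step 1: newest event per (serialno, sysno, area, usrno) combination.
latest_sql = """
    SELECT serialno, sysno, area, usrno, MAX(evdate) AS last_evdate
    FROM clogs17
    GROUP BY serialno, sysno, area, usrno
"""
latest = con.execute(latest_sql).fetchall()

# Step 2: users with no matching row in step 1 (left join, keep the NULLs).
unused = con.execute(f"""
    SELECT u.serialno, u.sysno, u.usrid
    FROM users u
    LEFT JOIN ({latest_sql}) a
           ON u.serialno = a.serialno
          AND u.sysno = a.sysno
          AND u.usrid = a.usrno
    WHERE a.usrno IS NULL
""").fetchall()

print(len(latest), unused)
```

To apply a 30-day rule, a date filter on `last_evdate` would go inside the subquery (for example in a HAVING clause); that part is left out here because the question does not pin down the exact criterion.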

Calculate Weighted Average in SQL Server

I am trying to calculate a weighted average based on the following calculations.
I have a dataset that looks something like this:
item | Date Sent | Date Received
1 | 2 Feb 10am | 3 Feb 10am
1 | 6 Feb 11am | 6 Feb 12pm
2 | 2 Feb 10am | 3 Feb 10am
2 | 6 Feb 11am | 6 Feb 12pm
I then need to calculate the average based on the time difference rounded down meaning:
Time Diff | Count |
1 | 2 |
12 | 2 |
So in this case it would be:
1 * 2 + 12 * 2 / (12 + 1)
I have already written the SQL query to calculate the aggregate table:
select
floor(datediff(hh, dateSent, dateReceived)) as hrs,
count(item) as freq
from
table
group by
floor(datediff(hh, dateSent, dateReceived))
having
floor(datediff(hh, dateSent, dateReceived)) < 100
order by
floor(datediff(hh, dateSent, dateReceived)) asc;
Should I do a subquery? I am not proficient and I have tried but keep getting syntax errors.
Can somebody help me to get the SQL query to get the weighted average?
If what you mean by "weighted average" is the average of all time differences, then weight each hrs value by its freq in an outer query:
select sum(a.hrs * a.freq) * 1.0 / sum(a.freq) as weighted_avg
from
(
    select floor(datediff(hh, dateSent, dateReceived)) as hrs,
           count(item) as freq
    from table
    group by floor(datediff(hh, dateSent, dateReceived))
    having floor(datediff(hh, dateSent, dateReceived)) < 100
    -- the order by is dropped: it is not allowed inside a derived table
) a
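For reference, a frequency-weighted mean divides SUM(hrs * freq) by SUM(freq), which is not the same as averaging the distinct hrs values. The sketch below shows the difference using Python's built-in sqlite3 as a stand-in for SQL Server; the `deliveries` table and its four rows are hypothetical, with the hour difference already computed.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE deliveries (item INTEGER, hrs INTEGER)")
# Hypothetical data: three deliveries took 1 hour, one took 12 hours.
con.executemany("INSERT INTO deliveries VALUES (?, ?)",
                [(1, 1), (2, 1), (3, 1), (4, 12)])

grouped = """
    SELECT hrs, COUNT(item) AS freq
    FROM deliveries
    GROUP BY hrs
    HAVING hrs < 100
"""

# Weighted: every hrs bucket counts as often as it occurred.
weighted = con.execute(
    f"SELECT SUM(a.hrs * a.freq) * 1.0 / SUM(a.freq) FROM ({grouped}) a"
).fetchone()[0]

# Unweighted: each distinct hrs value counts once, whatever its frequency.
unweighted = con.execute(
    f"SELECT AVG(a.hrs) FROM ({grouped}) a"
).fetchone()[0]

print(weighted, unweighted)
```

The multiplication by 1.0 forces a decimal result; without it, integer division would truncate the average.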

Aggregate function on one column, group by on another, leave a third unaffected

I feel like this isn't too bad of a problem, but I've been looking for a solution for the greater part of the day to no avail. I've seen plenty of other solutions, but they deal with getting columns that aren't unique values along with a group by and an aggregate function, which doesn't seem to help me.
The problem
I have a table of historical data as follows:
ID | source | value | date
---+--------+-------+-----------
1 | 12 | 10 | 2016-11-16
2 | 12 | 20 | 2015-11-16
3 | 12 | 30 | 2014-11-16
4 | 13 | 40 | 2016-11-16
5 | 13 | 50 | 2015-11-16
6 | 13 | 60 | 2014-11-16
I'm trying to get data before a certain date (within a loop that steps through different ranges), then get the sum of the values grouped by source. So as an example: "get all records before 30 days ago, and get the sum of the values of the unique sources, using the most recent dated entry for each".
So the first step was to remove entries with dates not in the range, an easy where date < getdate()-30 for example to get:
ID | source | value | date
---+--------+-------+-----------
2 | 12 | 20 | 2015-11-16
3 | 12 | 30 | 2014-11-16
5 | 13 | 50 | 2015-11-16
6 | 13 | 60 | 2014-11-16
Now my issue is finding a way to group by source and take the max date, and then sum up the result across all sources. The idea here is that we don't know when the last entry is, so before the specified date we get all records, then take the newest entry for each unique source, and sum those up to get the total value at that time.
So the next step would be to group by source using the max of date, resulting in :
ID | source | value | date
---+--------+-------+-----------
2 | 12 | 20 | 2015-11-16
5 | 13 | 50 | 2015-11-16
And then the final step would be to sum the values, and then this process is repeated to get the sum value for multiple dates, so this would result in the row
value | date
-------+-----------
70 | getdate() - 30
to use for the rest.
Where I'm stuck
I'm trying to group by source and use the max of date to get the most recent entry for each unique source, but if I use the aggregate function or group by, then I can't preserve the ID or value columns to stick with the chosen max row. It's totally possible I'm just misunderstanding how aggregate functions work.
Progress so far
The best place I've gotten to yet is something like
with dataInDateRange as (
    select *
    from #historicalData hd
    where hd.date < getdate() - 30
)
select ???, max(date)
from dataInDateRange
group by source
But I'm not seeing how I can do this without somehow preserving a unique ID for the row that has the max date for each source so then I can go back and sum up the numbers.
Thank you great people for any help/guidance/lessons
Use row_number():
with dataInDateRange as (
    select *
    from #historicalData hd
    where hd.date < getdate() - 30
), rows as (
    select *,
           row_number() over (partition by source
                              order by date desc) as rn
    from dataInDateRange
)
SELECT *
FROM rows
WHERE rn = 1
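A runnable sketch of the row_number() approach on the sample data, extended with the final SUM the question asks for. It assumes Python's built-in sqlite3 (SQLite 3.25 or later, for window functions) instead of SQL Server, and a fixed cutoff of 2016-01-01 in place of getdate() - 30 so the result is reproducible.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE historicalData
               (id INTEGER, source INTEGER, value INTEGER, date TEXT)""")
con.executemany("INSERT INTO historicalData VALUES (?, ?, ?, ?)", [
    (1, 12, 10, "2016-11-16"), (2, 12, 20, "2015-11-16"),
    (3, 12, 30, "2014-11-16"), (4, 13, 40, "2016-11-16"),
    (5, 13, 50, "2015-11-16"), (6, 13, 60, "2014-11-16"),
])

query = """
    WITH dataInDateRange AS (
        SELECT * FROM historicalData WHERE date < ?
    ), ranked AS (
        -- rn = 1 marks the newest row of each source within the range
        SELECT *,
               ROW_NUMBER() OVER (PARTITION BY source
                                  ORDER BY date DESC) AS rn
        FROM dataInDateRange
    )
    SELECT SUM(value) FROM ranked WHERE rn = 1
"""
cutoff = "2016-01-01"  # assumed cutoff standing in for getdate() - 30
total = con.execute(query, (cutoff,)).fetchone()[0]
print(total)
```

Because the window function keeps the whole row, ID and value stay attached to the newest date per source, which is what a plain GROUP BY with max(date) cannot do.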
