Calculate measure involving dates on a factTable with DAX - pivot-table

I have this problem:
given a "Movements" factTable that holds a list of warehouse transactions.
I want to know how many items arrived, how many were shipped (and this is trivial) but also how many are "In Order" at a particular time (and this is the difficult part)
So, each line can either be a receipt (it has a positive "qIn" value) or a shipment (positive qOut)
For example a very simple list of records could be:
ID Item TransactionDate OrderDate qIn qOut
1 A 2019-01-30 2019-01-10 5 0
2 A 2019-02-20 2019-01-15 3 0
3 A 2019-03-12 2019-01-20 0 6
4 A 2019-03-30 2019-02-20 20 0
That means:
On TransactionDate 2019-01-30, Item A arrived in quantity 5.
The order for it was created on 2019-01-10, so for those 20 days a quantity of 5 of Item A was "ordered".
However, when I look at the end of January, I should see 0 for this transaction in the "ordered" measure, because it arrived on January 30.
For the second record, instead, at the end of January I should see that a quantity of 3 was "in order", because the actual arrival was on 2019-02-20.
So, in the end, the Excel pivot table should show something like this:
Year    2019
Month   January     February    March
        IN | Ord    IN | Ord    IN | Ord
Item
A       5  | 3      3  | 20     20 | 0
The simple measure of qIn is:
qIN := SUM(Transactions[qIn])
The measure for the ordered quantity that I have come up with so far (it does not work!):
orderedQty :=
CALCULATE (
SUMX ( Transactions; Transactions[qIn] );
DATESBETWEEN (
Transactions[TransactionDate];
MINX ( Transactions; Transactions[OrderDate] );
MAXX ( Transactions; Transactions[TransactionDate] )
)
)
EDIT
The "InOrder" measure should be "additive" in the sense that it should not only take into account what has happened in the current month, but also how much of the InOrder from past months is yet to be received.
A picture would make the whole thing clearer (I still have to draw one), at least from a logic perspective. However, even with a picture, I can't see how to extract "direct measures" from that logic.
Instead, exploiting the measures already provided by @Olly, the problem could be reformulated as:
InOrderFromOtherMonths := Sum (qIn) where Order Month <> Current Month
(i.e. how many arrived in the current month that come from orders placed in past months)
InOrder := Total sum of (ORDER measure) - InOrderFromOtherMonths
PS.
I have created an Excel file with a little more interesting example.
In that file, using the "direct measure picture", the InOrder for January would be:
ID 2 + ID 5 + ID 6 (orders still open at the end of January).
In values = 3+9+17=29
With the "indirect" measure it would be:
Total sum of ORDER = 15+23+12=50
InOrderFromOtherMonths = 6+15=21
InOrder = Total sum of ORDER - InOrderFromOtherMonths = 50 - 21 = **29**

Create a Calendar table, including a YYYY-MM field. If you don't already have a calendar table, you can automatically create one in PowerPivot: Design > Date Table > New
Create an ACTIVE relationship between Calendar[Date] and Transactions[TransactionDate]
Create an INACTIVE relationship between Calendar[Date] and Transactions[OrderDate]
Now create your measures:
Measure IN:
IN:=SUM ( Transactions[qIn] )
Measure ORDERS:
ORDERS:=
CALCULATE (
SUM ( Transactions[qIn] ),
USERELATIONSHIP ( 'Calendar'[Date], Transactions[OrderDate] )
)
Measure ORDER:
ORDER:=
IF (
HASONEVALUE ( 'Calendar'[YYYY-MM] ),
CALCULATE (
[ORDERS],
FORMAT ( Transactions[TransactionDate], "YYYY-MM" ) <> VALUES ( 'Calendar'[YYYY-MM] )
)
)
And pivot to suit:
EDIT
After your question edit, I'm finding some of your labels confusing - but try creating the following measures:
Measure: Ordered
Ordered:=
CALCULATE (
SUM ( Movements[qIn] ),
USERELATIONSHIP ( 'Calendar'[Date], Movements[OrdDate] )
)
Measure: Received
Received:= SUM ( Movements[qIn] )
Measure: Outstanding
Outstanding:=
VAR EOMaxDate =
EOMONTH ( LASTDATE ( 'Calendar'[Date] ), 0 )
RETURN
IF (
ISBLANK ( [Ordered] ) && ISBLANK ( [Received] ),
BLANK(),
CALCULATE (
[Ordered] - [Received],
FILTER (
ALL ( 'Calendar'),
'Calendar'[Date] <= EOMaxDate
)
)
)
Now use those three measures in your pivot.
See https://excel.solutions/so_55596609-2/ for example XLSX file
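As a cross-check of the cumulative logic (everything ordered up to the end of the month, minus everything received up to the end of the month), here is a minimal Python sketch run on the four sample rows from the question. It is not DAX and not the answerer's implementation; the names movements, end_of_month and outstanding are made up for illustration.

```python
import calendar
from datetime import date

# Sample rows from the question:
# (id, item, transaction_date, order_date, q_in, q_out)
movements = [
    (1, "A", date(2019, 1, 30), date(2019, 1, 10), 5, 0),
    (2, "A", date(2019, 2, 20), date(2019, 1, 15), 3, 0),
    (3, "A", date(2019, 3, 12), date(2019, 1, 20), 0, 6),
    (4, "A", date(2019, 3, 30), date(2019, 2, 20), 20, 0),
]


def end_of_month(year, month):
    """Last calendar day of the given month."""
    return date(year, month, calendar.monthrange(year, month)[1])


def outstanding(rows, year, month):
    """Quantity ordered but not yet received at the end of the month."""
    cutoff = end_of_month(year, month)
    ordered = sum(r[4] for r in rows if r[3] <= cutoff)   # by order date
    received = sum(r[4] for r in rows if r[2] <= cutoff)  # by transaction date
    return ordered - received


result = {month: outstanding(movements, 2019, month) for month in (1, 2, 3)}
print(result)  # {1: 3, 2: 20, 3: 0}
```

The three values match the "Ord" columns of the pivot the question expects for January, February and March.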


ABAP - Group employees by cost center and calculate sum

I have an internal table with employees. Each employee is assigned to a cost center. In another column is the salary. I want to group the employees by cost center and get the total salary per cost center. How can I do it?
At first I have grouped them as follows:
Loop at itab assigning field-symbol(<c>)
group by <c>-kostl ascending.
Write: / <c>-kostl.
endloop.
This gives me a list of all cost-centers. In the next step I would like to calculate the sum of the salaries per cost center (the sum for all employees with the same cost-center).
How can I do it? Can I use collect?
Update:
I have tried the following code, but I get the error "The syntax for a method specification is "objref->method" or "class=>method"" on the line lv_sum_salary = sum( <p>-salary ).
loop at i_download ASSIGNING FIELD-SYMBOL(<c>)
GROUP BY <c>-kostl ascending.
Write: / <c>-kostl, <c>-salary.
data: lv_sum_salary type p DECIMALS 2.
Loop at group <c> ASSIGNING FIELD-SYMBOL(<p>).
lv_sum_salary = sum( <p>-salary ).
Write: /' ',<p>-pernr, <p>-salary.
endloop.
Write: /' ', lv_sum_salary.
endloop.
I am not sure where you got the sum function from, but there is no such built-in function. If you want to calculate a sum in a group-by loop, then you have to do it yourself.
" make sure the sum is reset to 0 for each group
CLEAR lv_sum_salary.
" Do a loop over the members of this group
LOOP AT GROUP <c> ASSIGNING FIELD-SYMBOL(<p>).
" Add the salary of the current group-member to the sum
lv_sum_salary = lv_sum_salary + <p>-salary.
ENDLOOP.
" Now we have the sum of all members
WRITE |The sum of cost center { <c>-kostl } is { lv_sum_salary }.|.
Generally speaking, to group and sum, there are these 4 possibilities (code snippets provided below):
1. SQL with an internal table as source: SELECT ... SUM( ... ) ... FROM @itab ... GROUP BY ... (since ABAP 7.52, HANA database only); NB: beware the possible performance overhead.
2. The classic way, everything coded:
   - Sort by cost center
   - Loop at the lines
   - At each line, add the salary to the total
   - If the cost center is different in the next line, process the total
3. LOOP AT with GROUP BY, and LOOP AT GROUP
4. VALUE with FOR GROUPS and GROUP BY, and REDUCE and FOR ... IN GROUP for the sum
Note that only the option with the explicit sorting will sort by cost center; the other ones won't provide a result sorted by cost center.
All the below examples have in common these declarative and initialization parts:
TYPES: BEGIN OF ty_itab_line,
kostl TYPE c LENGTH 10,
pernr TYPE c LENGTH 10,
salary TYPE p LENGTH 8 DECIMALS 2,
END OF ty_itab_line,
tt_itab TYPE STANDARD TABLE OF ty_itab_line WITH EMPTY KEY,
BEGIN OF ty_total_salaries_by_kostl,
kostl TYPE c LENGTH 10,
total_salaries TYPE p LENGTH 10 DECIMALS 2,
END OF ty_total_salaries_by_kostl,
tt_total_salaries_by_kostl TYPE STANDARD TABLE OF ty_total_salaries_by_kostl WITH EMPTY KEY.
DATA(itab) = VALUE tt_itab( ( kostl = 'CC1' pernr = 'E1' salary = '4000.00' )
( kostl = 'CC1' pernr = 'E2' salary = '3100.00' )
( kostl = 'CC2' pernr = 'E3' salary = '2500.00' ) ).
DATA(total_salaries_by_kostl) = VALUE tt_total_salaries_by_kostl( ).
and the expected result will be:
ASSERT total_salaries_by_kostl = VALUE tt_total_salaries_by_kostl(
( kostl = 'CC1' total_salaries = '7100.00' )
( kostl = 'CC2' total_salaries = '2500.00' ) ).
Examples for each possibility:
SQL on internal table:
SELECT kostl, SUM( salary ) AS total_salaries
FROM @itab AS itab ##DB_FEATURE_MODE[ITABS_IN_FROM_CLAUSE]
GROUP BY kostl
INTO TABLE @total_salaries_by_kostl.
Classic way:
SORT itab BY kostl.
DATA(total_line) = VALUE ty_total_salaries_by_kostl( ).
LOOP AT itab REFERENCE INTO DATA(line).
DATA(next_kostl) = VALUE #( itab[ sy-tabix + 1 ]-kostl OPTIONAL ).
total_line-total_salaries = total_line-total_salaries + line->salary.
IF next_kostl <> line->kostl.
total_line-kostl = line->kostl.
APPEND total_line TO total_salaries_by_kostl.
CLEAR total_line.
ENDIF.
ENDLOOP.
EDIT: I don't cover AT NEW and AT END OF because I'm not a fan of them: they don't explicitly name the fields that define the group, but implicitly consider all the fields before the mentioned field plus that field itself. I also ignore ON CHANGE OF, as it is obsolete.
LOOP AT with GROUP BY:
LOOP AT itab REFERENCE INTO DATA(line)
GROUP BY ( kostl = line->kostl )
REFERENCE INTO DATA(kostl_group).
DATA(total_line) = VALUE ty_total_salaries_by_kostl(
kostl = kostl_group->kostl ).
LOOP AT GROUP kostl_group REFERENCE INTO line.
total_line-total_salaries = total_line-total_salaries + line->salary.
ENDLOOP.
APPEND total_line TO total_salaries_by_kostl.
ENDLOOP.
VALUE with FOR and GROUP BY, and REDUCE for the sum:
total_salaries_by_kostl = VALUE #(
FOR GROUPS <kostl_group> OF <line> IN itab
GROUP BY ( kostl = <line>-kostl )
( kostl = <kostl_group>-kostl
total_salaries = REDUCE #( INIT sum = 0
FOR <line_2> IN GROUP <kostl_group>
NEXT sum = sum + <line_2>-salary ) ) ).
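For readers who want to check the expected totals outside an SAP system, the same group-and-sum idea can be sketched in Python. This is only an illustration of the accumulation pattern, not ABAP; the function name total_salaries_by_kostl mirrors the ABAP variable above, and the tuple layout is an assumption.

```python
from decimal import Decimal

# Same sample data as the ABAP internal table above: (kostl, pernr, salary)
itab = [
    ("CC1", "E1", Decimal("4000.00")),
    ("CC1", "E2", Decimal("3100.00")),
    ("CC2", "E3", Decimal("2500.00")),
]


def total_salaries_by_kostl(rows):
    """Accumulate the salary per cost center in a dictionary."""
    totals = {}
    for kostl, _pernr, salary in rows:
        # Start each cost center at 0, then add every member's salary
        totals[kostl] = totals.get(kostl, Decimal("0")) + salary
    return totals


totals = total_salaries_by_kostl(itab)
for kostl, total in totals.items():
    print(f"The sum of cost center {kostl} is {total}.")
```

Decimal is used instead of float so that the amounts keep two exact decimal places, similar to the packed number type in the ABAP declarations.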

Average number of rows by hour based on total number of days

Dear all,
In Power BI, using DirectQuery, I would like to have the sum of occurrences by hour per day, divided by the total number of days.
Let me provide some sample data.
DataTable:
ID;DATE;HOUR
715;2019-10-19;15:47:37
181;2019-10-19;15:56:11
349;2019-10-19;15:57:25
6ec;2019-10-19;15:58:16
57e;2019-10-19;16:02:35
860;2019-10-19;16:03:42
7a5;2019-10-19;16:03:52
978;2019-10-19;16:05:19
da0;2019-10-20;11:00:45
c2d;2019-10-20;23:04:53
355;2019-10-20;23:04:53
4f5;2019-10-20;23:05:10
396;2019-10-21;14:42:24
5f7;2019-10-21;14:43:37
93a;2019-10-21;14:55:36
a44;2019-10-21;14:59:21
264;2019-10-21;15:05:20
f48;2019-10-21;15:07:01
And a summarized Dimension Table with the values present in DataTable:
DimHourTable:
COMPLETEHOUR;HOUR24
15:47:37;15
15:56:11;15
15:57:25;15
15:58:16;15
16:02:35;16
16:03:42;16
16:03:52;16
16:05:19;16
11:00:45;11
23:04:53;23
23:04:53;23
23:05:10;23
14:42:24;14
14:43:37;14
14:55:36;14
14:59:21;14
15:05:20;15
15:07:01;15
Note: Relationship with Both Directions filter between DataTable[HOUR] and DimHourTable[COMPLETEHOUR]
I'm now doing this:
formula1: Occurrences = COUNTA( DataTable[id] )
formula2: CountDays = DISTINCTCOUNT( DataTable[date] )
formula3: Avg_Occurrences = DIVIDE( [Occurrences] , [CountDays] )
Then I'm putting in a matrix the following
Rows: DimHourTable[HOUR24]
Values: Avg_Occurrences
With that Sample Data, this is the average I'm getting.
11 -> 1
14 -> 4
15 -> 3
16 -> 4
23 -> 3
It ends up dividing by the number of days that contains that specific hour.
But, in reality, I would like to have this:
11 -> 0.33
14 -> 1.33
15 -> 2
16 -> 1.33
23 -> 1
I would like to divide the occurrences by the total number of days present in the DataTable, regardless of whether a given day contains that specific hour or not.
Does someone have an idea how to solve it?
Thanks in advance!
Try using this for formula2
CountDays = CALCULATE ( DISTINCTCOUNT ( DataTable[date] ), ALL ( 'DataTable' ) )
When you are using slicers, you can also try
CountDays = CALCULATE ( DISTINCTCOUNT ( DataTable[date] ), ALLSELECTED ( 'DataTable' ) )
The ALL and ALLSELECTED functions will remove the filter context that is created by DimHourTable[HOUR24], which you put on rows in your matrix visual.
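To see why dividing by the overall number of days gives the figures the question asks for, the arithmetic can be replayed on the sample data. This is a plain Python sketch rather than DAX, and the variable names (rows, total_days, avg_per_hour) are chosen here for illustration only.

```python
from collections import Counter

# (id, date, time) rows from the sample DataTable in the question
rows = [
    ("715", "2019-10-19", "15:47:37"), ("181", "2019-10-19", "15:56:11"),
    ("349", "2019-10-19", "15:57:25"), ("6ec", "2019-10-19", "15:58:16"),
    ("57e", "2019-10-19", "16:02:35"), ("860", "2019-10-19", "16:03:42"),
    ("7a5", "2019-10-19", "16:03:52"), ("978", "2019-10-19", "16:05:19"),
    ("da0", "2019-10-20", "11:00:45"), ("c2d", "2019-10-20", "23:04:53"),
    ("355", "2019-10-20", "23:04:53"), ("4f5", "2019-10-20", "23:05:10"),
    ("396", "2019-10-21", "14:42:24"), ("5f7", "2019-10-21", "14:43:37"),
    ("93a", "2019-10-21", "14:55:36"), ("a44", "2019-10-21", "14:59:21"),
    ("264", "2019-10-21", "15:05:20"), ("f48", "2019-10-21", "15:07:01"),
]

# Denominator: distinct days in the whole table, ignoring the hour on the row
total_days = len({day for _id, day, _time in rows})

# Numerator: number of occurrences per hour of the day
per_hour = Counter(int(time[:2]) for _id, _day, time in rows)

avg_per_hour = {hour: per_hour[hour] / total_days for hour in sorted(per_hour)}
for hour, avg in avg_per_hour.items():
    print(f"{hour} -> {avg:.2f}")
```

The denominator is computed once over all rows, which is the effect that removing the hour filter has on CountDays.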

ms sql table adding rows whenever level changes by more than 1 so that every row has difference of 1 in start_level and end_level

(This is my first stack overflow question. So please let me know suggestions for posing a better question, if you cannot understand.)
I have a table of around 500 people (users) who are going up the stairs from floor x (0=x, max(y) = 50). A person can climb zero, one or many levels in a single go, which corresponds to a single row of the table, along with the time taken to do so in seconds.
I want to find average time taken to go from floor a to a+1 where a is any of the floor number. To do so I intend to divide every row of the mentioned table into rows which have start_level+1= end_level. Duration will be divided equally as shown in EXPECTED OUTPUT TABLE for user b.
GIVEN TABLE INPUT
start_level end_level duration user
1 1 10 a
1 2 5 a
2 5 27 b
5 6 3 c
EXPECTED OUTPUT
start_level end_level duration user
1 1 10 a
1 2 5 a
2 3 27/3 b
3 4 27/3 b
4 5 27/3 b
5 6 3 c
Note: level jumps are in integers only.
After getting expected output, I can simply create a column sum(duration)/count(distinct users) at a start_level level to get average time taken to get one floor above from each floor.
Any help is appreciated.
You can use a Numbers table to "create" the incremental steps. Here's my setup:
CREATE TABLE #floors
(
[start_level] INT,
[end_level] INT,
[duration] DECIMAL(10, 4),
[user] VARCHAR(50)
)
INSERT INTO #floors
([start_level],
[end_level],
[duration],
[user])
VALUES (1,1,10,'a'),
(1,2,5,'a'),
(2,5,27,'b'),
(5,6,3,'c')
Then, using a Numbers table and some LEFT JOIN/COALESCE logic:
-- Create a Numbers table
;WITH Numbers_CTE
AS (SELECT TOP 50 [Number] = ROW_NUMBER()
OVER(
ORDER BY (SELECT NULL))
FROM sys.columns)
SELECT [start_level] = COALESCE(n.[Number], f.[start_level]),
[end_level] = COALESCE(n.[Number] + 1, f.[end_level]),
[duration] = CASE
WHEN f.[end_level] = f.[start_level] THEN f.[duration]
ELSE f.[duration] / ( f.[end_level] - f.[start_level] )
END,
f.[user]
FROM #floors f
LEFT JOIN Numbers_CTE n
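-- half-open range: BETWEEN is inclusive and would also match end_level, adding an extra step
ON n.[Number] >= f.[start_level]
AND n.[Number] < f.[end_level]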
AND f.[end_level] - f.[start_level] > 1
Here are the logical steps:
LEFT JOIN the Numbers table for cases where end_level >= start_level + 2 (this has the effect of giving us multiple rows - one for each incremental step)
new start_level = If the LEFT JOIN "completes": take Number from the Numbers table, else: take the original start_level
new end_level = If the LEFT JOIN "completes": take Number + 1, else: take the original end_level
new duration = If end_level = start_level: take the original duration (to avoid divide by 0), else: take the average duration over end_level - start_level
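The expected output from the question can also be reproduced with a small Python sketch of the same splitting rule. It is an illustration only, not the T-SQL above; the function name split_into_single_steps is made up, and it assumes end_level is never below start_level.

```python
# Input rows from the question: (start_level, end_level, duration, user)
floors = [
    (1, 1, 10, "a"),
    (1, 2, 5, "a"),
    (2, 5, 27, "b"),
    (5, 6, 3, "c"),
]


def split_into_single_steps(rows):
    """Split every multi-level climb into one row per level climbed."""
    result = []
    for start, end, duration, user in rows:
        steps = end - start
        if steps <= 1:
            # Zero or one level: keep the row as it is (no division by zero)
            result.append((start, end, duration, user))
            continue
        # range() excludes `end`, so the last step is (end - 1, end)
        for level in range(start, end):
            result.append((level, level + 1, duration / steps, user))
    return result


single_steps = split_into_single_steps(floors)
for row in single_steps:
    print(row)
```

User b's climb from 2 to 5 becomes three rows (2 to 3, 3 to 4, 4 to 5), each with a duration of 27/3 = 9.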

In SSRS, how can I add a row to aggregate all the rows that don't match a filter?

I'm working on a report that shows transactions grouped by type.
Type Total income
------- --------------
A 575
B 244
C 128
D 45
E 5
F 3
Total 1000
I only want to provide details for transaction types that represent more than 10% of the total income (i.e. A-C). I'm able to do this by applying a filter to the group:
Type Total income
------- --------------
A 575
B 244
C 128
Total 1000
What I want to display is a single row just above the total row that has a total for all the types that have been filtered out (i.e. the sum of D-F):
Type Total income
------- --------------
A 575
B 244
C 128
Other 53
Total 1000
Is this even possible? I've tried using running totals and conditionally hidden rows within the group. I've tried Iif inside Sum. Nothing quite seems to do what I need and I'm butting up against scope issues (e.g. "the value expression has a nested aggregate that specifies a dataset scope").
If anyone can give me any pointers, I'd be really grateful.
EDIT: Should have specified, but at present the dataset actually returns individual transactions:
ID Type Amount
---- ------ --------
1 A 4
2 A 2
3 B 6
4 A 5
5 B 5
The grouping is done using a row group in the tablix.
One solution is to solve that in the SQL source of your dataset instead of inside SSRS:
SELECT
CASE
WHEN CAST([Total income] AS FLOAT) / SUM([Total income]) OVER (PARTITION BY 1) >= 0.10 THEN [Type]
ELSE 'Other'
END AS [Type]
, [Total income]
FROM Source_Table
See also SQL Fiddle
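The bucketing idea (keep a type if it reaches 10% of the grand total, otherwise fold it into "Other") can be checked against the numbers in the question with a short Python sketch. It assumes the amounts are already summed per type; the name bucket_small_types and the threshold parameter are illustrative and not part of SSRS or of the SQL above.

```python
# Totals per type from the question (already aggregated per type)
income_by_type = {"A": 575, "B": 244, "C": 128, "D": 45, "E": 5, "F": 3}


def bucket_small_types(totals, threshold=0.10):
    """Keep types at or above the threshold share; sum the rest as 'Other'."""
    grand_total = sum(totals.values())
    result = {}
    other = 0
    for type_, amount in totals.items():
        if amount / grand_total >= threshold:
            result[type_] = amount
        else:
            other += amount
    if other:
        result["Other"] = other
    return result


report_rows = bucket_small_types(income_by_type)
print(report_rows)                 # {'A': 575, 'B': 244, 'C': 128, 'Other': 53}
print(sum(report_rows.values()))   # 1000
```

Because "Other" is just another group value, the grand total stays at 1000 without any special handling.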
Try to solve this in SQL, see SQL Fiddle.
SELECT I.*
,(
CASE
WHEN I.TotalIncome >= (SELECT Sum(I2.TotalIncome) / 10 FROM Income I2) THEN 10
ELSE 1
END
) AS TotalIncomePercent
FROM Income I
After this, create two sum groups.
SUM(TotalIncome * TotalIncomePercent) / 10
SUM(TotalIncome * TotalIncomePercent)
A second approach may be to use a calculated column in SSRS. Try to create a calculated column with the above CASE expression. If SSRS allows you to create it, you can use it in the same way as in the SQL approach.
1) To show only income greater than 10%, use a row visibility condition like
=iif(reportitems!total_income.value/10<= I.totalincome,true,false)
Here reportitems!total_income.value is the value of the textbox that holds the total of all income (the total of the detail group),
and I.totalincome is the current field value.
2) Add one more row outside of the detail group to show the "Other" income, and use an expression such as
= reportitems!total_income.value-sum(iif(reportitems!total_income.value/10<= I.totalincome,I.totalincome,nothing))

Populating a departure date field which is after the arrival date

Step 1
Arrival Date (Already generated) – 1.35 Million Times
Step 2
Randomise a number between 0 and 1
Step 3
Use the Randomised number produced above to create the script below
UPDATE BOOKINGS
SET DepartureDate
CASE WHEN RAND() Result = Between 0 and 0.3 = Departure Date will be 2 Nights Later
CASE WHEN RAND() Result = Between 0.3 and 0.4 = Departure Date will be 3 Nights Later
CASE WHEN RAND ()Result >0.4 = Departure Date will be either 1,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28 Nights Later
Do not use RAND() with a changing seed. It makes for terribly randomized data.
To get to your solution you need to create "buckets" of possible values. 3 days is supposed to happen in 10% of the cases; that makes it the smallest bucket, so we need ten buckets. 2 days goes into 3 buckets. The other values go into 2 buckets each. Then just use modulo to select one of the 10 buckets like this:
CREATE TABLE dbo.booking(Id INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,days INT);
GO
INSERT INTO dbo.booking(days)
SELECT TOP(100000) 0 FROM sys.columns A,sys.columns B,sys.columns C,sys.columns D;
GO
UPDATE b
SET days = rndm.days
FROM dbo.booking b
CROSS APPLY (
SELECT days
FROM (VALUES(0,2),(1,2),(2,2),(3,3),(4,1),(5,1),(6,4),(7,4),(8,28),(9,28))dn(n,days)
WHERE n = ABS(CHECKSUM(NEWID(),b.Id))%10
)rndm;
GO
SELECT days,COUNT(1) cnt
FROM dbo.booking
GROUP BY days;
GO
EDIT: Updated code to not use case statement.
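The bucket technique can be tried outside SQL Server with a few lines of Python. This is a sketch under assumptions: the bucket list copies the VALUES list from the T-SQL above, Python's random module stands in for ABS(CHECKSUM(NEWID())) % 10, and the names BUCKETS and random_stay_length are hypothetical.

```python
import random
from collections import Counter
from datetime import date, timedelta

# One entry per bucket, copied from the VALUES list in the T-SQL above:
# three buckets for 2 nights, one for 3 nights, two each for the others.
BUCKETS = [2, 2, 2, 3, 1, 1, 4, 4, 28, 28]


def random_stay_length(rng):
    """Pick one of the ten buckets uniformly and return its number of nights."""
    return BUCKETS[rng.randrange(len(BUCKETS))]


rng = random.Random(42)  # fixed seed only so that the sketch is repeatable
stays = [random_stay_length(rng) for _ in range(100_000)]
counts = Counter(stays)

for nights in sorted(counts):
    print(nights, counts[nights] / len(stays))

# Turning a stay length into a departure date for one booking
arrival = date(2019, 1, 10)
departure = arrival + timedelta(days=stays[0])
```

With ten equally likely buckets, roughly 30% of the stays are 2 nights and roughly 10% are 3 nights; changing the share of a value only means changing how many buckets it occupies.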
Just to let you know the final solution I used was:
UPDATE BOOKINGS
SET DepartureDate =
DATEADD(day,
CASE WHEN Rand(CHECKSUM(NEWID())) BETWEEN 0 and 0.3 THEN 2 ELSE
CASE WHEN Rand(CHECKSUM(NEWID())) BETWEEN 0.3 and 0.5 THEN 3 ELSE
Round(Rand(CHECKSUM(NEWID())) * 28,0) END END,ArrivalDate)
Thanks
Wayne
