Count Unique Values with Conditions Excel - arrays

I'm trying to create a simple chart to show how many employees are on a given corrective action level within a specified date range. The issue I'm running into is this:
The log shows associate Test 1 received a verbal warning on 8/14/19 for their productivity, then a first written warning on 8/24/19, then a final written warning that was processed later but took place on 8/23/19.
The formula I wrote will show this as 1 person at each level of correction (verbal, first written, and final written). I want it to only count the highest-level warning for each person. So the chart would only count 1 entry at the final written warning level.
What am I missing to accomplish this?
Raw Data:
Summary Chart:
Summary Chart Formula (across from Verbal level):
={SUM(--(FREQUENCY(IF(('2019'!$C:$C<>"")*('2019'!$F:$F=$B$2)*('2019'!$D:$D>=$B$3)*('2019'!$D:$D<=$C$3)*('2019'!$E:$E=$B5)*('2019'!$E:$E<>$B6),MATCH('2019'!$C:$C,'2019'!$C:$C,0)),ROW('2019'!$C:$C)-ROW('2019'!$C$2)+1)>0))}

Cracked it! I added two helper columns to the raw data in between Step and Reason.
The first, Level, is a VLOOKUP that converts the Step to a numerical value (in order of severity, the lowest being a Verbal, highest being an Exit).
The second, Max, is a MAXIFS formula to flag which step is the highest severity by associate ID and Reason:
=IF(MAXIFS(F:F,C:C,C2,H:H,H2,D:D,">="&Summary!$B$3,D:D,"<="&Summary!$C$3)=F2,"X","")
The formula in the summary chart now reads as follows:
=SUM(--(FREQUENCY(IF(('2019'!$C:$C<>"")*('2019'!$I:$I=$B$2)*('2019'!$D:$D>=$B$3)*('2019'!$D:$D<=$C$3)*('2019'!$F:$F=$B5)*('2019'!$H:$H="X"),MATCH('2019'!$C:$C,'2019'!$C:$C,0)),ROW('2019'!$C:$C)-ROW('2019'!$C$1)+1)>0))
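For anyone who wants to check the logic outside Excel, the same idea (keep only each associate's most severe step inside the date range, then count per level) can be sketched in Python. This is only an illustration: the severity ranking, the step labels, and the second associate are assumptions, and only the Test 1 dates come from the question. It also ignores the Reason grouping that the MAXIFS helper uses.

```python
from collections import Counter
from datetime import date

# Assumed severity ranking (the answer only says Verbal is lowest, Exit highest).
SEVERITY = {"Verbal": 1, "First Written": 2, "Final Written": 3, "Exit": 4}

# (associate, date, step) rows; Test 1 mirrors the example in the question.
log = [
    ("Test 1", date(2019, 8, 14), "Verbal"),
    ("Test 1", date(2019, 8, 24), "First Written"),
    ("Test 1", date(2019, 8, 23), "Final Written"),
    ("Test 2", date(2019, 8, 20), "Verbal"),  # hypothetical second associate
]

def count_highest_levels(rows, start, end):
    """Count associates by their most severe step within [start, end]."""
    highest = {}
    for associate, when, step in rows:
        if not (start <= when <= end):
            continue  # outside the reporting window
        current = highest.get(associate)
        if current is None or SEVERITY[step] > SEVERITY[current]:
            highest[associate] = step
    return Counter(highest.values())

counts = count_highest_levels(log, date(2019, 8, 1), date(2019, 8, 31))
print(dict(counts))
```

With these rows, Test 1 is counted once, at the Final Written level, rather than once per level.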

Related

SageMaker ClientError: Detected non integer labels in the dataset

I have created a SageMaker training job to train on a toy, tabular, multiclass(3) dataset which has failed with the following error:
ClientError: Detected non integer labels in the dataset. For classification tasks, the labels should be integers between 0 to (num_classes-1), exit code: 2
It sounds like they're saying that for the classes (labels) they want to see values between 0 and 2 in this case, as I have 3 classes.
I have set num_classes to 3 and have validated that I only have 3 unique values in the rightmost column of my dataset: 0, 1, and 2
I've set feature_dim to 3. I've removed the headers from my dataset. My raw data looks like 5,000 lines of this:
csv snapshot
Can anyone guess as to what might be causing this error?
I wanted to answer this because, at the time of this writing, the error message I received returned 0 hits on Google.
It turns out that the issue was that SageMaker expects the class labels to appear in the first column, by default. This is different from how datasets are typically structured. So when I got this error message, SageMaker was looking at my first column which had all sorts of float values. I fixed it by moving my labels to the first column.
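As a rough sketch of that fix, the snippet below rewrites a headerless CSV so the last column (the label) becomes the first. The function name and the sample rows are made up for illustration; in practice you would pass real file handles instead of the in-memory buffers used here.

```python
import csv
import io

def move_label_first(src, dst):
    """Copy CSV rows from src to dst, moving the last column to the front."""
    writer = csv.writer(dst, lineterminator="\n")
    for row in csv.reader(src):
        if row:  # skip blank lines
            writer.writerow([row[-1]] + row[:-1])

# Hypothetical rows: three float features followed by an integer label.
raw = io.StringIO("0.12,3.4,5.6,2\n1.5,0.7,2.2,0\n")
fixed = io.StringIO()
move_label_first(raw, fixed)
print(fixed.getvalue())
```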

SSRS - Sum of an aggregated field

I have a report that shows the individual sales commission for each employee.
The expression for the Sales Commission is:
=Sum(Fields!Total_Comission.Value)*Parameters!Distribution_Factor.Value*Fields!Individual_Factor.Value
Now I want to add a total for this field. If I just right click the cell and click on Add Total, it works but gives out the wrong total.
If I try to sum the field like this:
=Sum(Reportitems!GO_Prov.Value)
I get the error:
The Value expression for the textrun 'Textbox93.Paragraphs[0].TextRuns[0]' uses an aggregate function on a report item. Aggregate functions can be used only on report items contained in page headers and footers.
Is there a workaround to sum the aggregated field of this tablix? Maybe with custom code?
Thanks in advance.
Update1:
Unfortunately I don't know how to write custom code, but I found this:
Public Total_lookup_Sum As Integer = 0
Public Function Lookup_Sum(ByVal value As Integer) As Integer
    Total_lookup_Sum = Total_lookup_Sum + value
    Return value
End Function
The expression I used for the Sales Commission is now:
=Code.Lookup_Sum((Sum(Fields!Total_Comission.Value)*Parameters!Distribution_Factor.Value*Fields!Individual_Factor.Value))
And the expression for the field where I would like to get the sum is:
=Code.Total_lookup_Sum
Now I get the error:
There is an error on line 0 of custom code: [BC30205] End of statement expected.
Is there a way to solve this?
Scenarios like these can be tricky in SSRS. From your description (even though the screenshot doesn't really show everything), I'm guessing that you've got rows grouped by salesperson. In the column that calculates the commission, you've got a sum of "Total_Comission", but the "Individual_Factor" value is not aggregated. Again, I'm guessing, but each underlying row for a given employee must have the same "Individual_Factor" value (so using Min(Individual_Factor) would actually give the same result).
But then, when you try and just take the same formula (or even a derivation of the formula), and make an overall aggregate of all of the rows, how does SSRS know which "Individual_Factor" value to use? You don't want Min() or Max(), because that would just be the lowest or highest value across all of the salespeople.
Your suggestion of a workaround via code is generally the way that I approach this. You need a report variable, something like "Commission_Grand_Total", and then you need a function in the report code that accepts 1 parameter, and in the function you'll add the parameter value to the variable. The easiest thing to do is to make the parameter the return value of the function.
Then, in the field where you currently have your commission formula (on the salesperson row), the expression in that field becomes =Code.TheFunctionYouCreate((Sum(Fields!Total_Comission.Value)*Parameters!Distribution_Factor.Value*Fields!Individual_Factor.Value))
By passing the formula to the function, you're achieving two things:
The function will take each salesperson's calculated commission and add it to your report variable
The function will output the parameter value that you passed in (since you want to display the calculated commission amount on each salesperson's row)
Lastly, to display the overall total, the expression for that field is just the report variable that holds the overall total (that has been cumulatively added to as SSRS wrote out each salesperson's record)
TIP: I sometimes do this same sort of thing, but if I don't want the row-by-row value to be shown (I just want the cumulative total to be calculated), just put the expression that calls the function in a hidden column. SSRS will still run the function as it renders each row, but obviously it's just not displaying the result of the function.
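To make the accumulate-and-return pattern concrete, here is a small Python simulation of it. It is not SSRS code: the class, the factor values, and the row data are all hypothetical, and it only imitates what happens as each salesperson row is rendered.

```python
class CommissionTotal:
    """Stands in for a report variable plus a pass-through function."""

    def __init__(self):
        self.grand_total = 0.0

    def add(self, value):
        # Add the row's value to the running total, then return it unchanged
        # so the row can still display its own commission.
        self.grand_total += value
        return value

distribution_factor = 0.5  # hypothetical report parameter value

# Hypothetical rows: (sum of Total_Comission, Individual_Factor) per salesperson
rows = [
    (1000.0, 0.5),
    (2000.0, 0.25),
]

total = CommissionTotal()
displayed = [total.add(s * distribution_factor * f) for s, f in rows]

print(displayed)          # value shown on each salesperson row
print(total.grand_total)  # value shown in the total cell
```

The total cell only holds the right number if it is evaluated after every row has called the function, which is why the answer relies on SSRS rendering the rows first.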
Microsoft reference documentation for report variables and custom code:
https://learn.microsoft.com/en-us/sql/reporting-services/report-design/built-in-collections-report-and-group-variables-references-report-builder?view=sql-server-ver15
https://learn.microsoft.com/en-us/sql/reporting-services/report-design/add-code-to-a-report-ssrs?view=sql-server-ver15
This is untested unfortunately but...
assuming you have a rowgroup for employee called EmployeeGroup then the expression would look like this...
=SUM(
Sum(Fields!Total_Comission.Value, "EmployeeGroup")
* Parameters!Distribution_Factor.Value
* FIRST(Fields!Individual_Factor.Value, "EmployeeGroup")
)
The inner expression reads as
Sum of Total_Comission within the current employee group
... multiplied by Distribution Factor
... multiplied by the first individual factor within the current employee group
The outer expression just sums all the individual employee level sums.
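The difference between summing the per-employee results and applying the formula once to everything can be seen in a short Python sketch. The employees, amounts, and factors are invented for illustration; the point is only that the two totals disagree when the individual factors differ.

```python
from collections import defaultdict

distribution_factor = 0.5  # stands in for Parameters!Distribution_Factor

# Hypothetical detail rows: (employee, Total_Comission, Individual_Factor)
rows = [
    ("Ann", 400.0, 0.5),
    ("Ann", 600.0, 0.5),
    ("Bob", 2000.0, 0.25),
]

sums = defaultdict(float)
factors = {}
for employee, commission, factor in rows:
    sums[employee] += commission
    factors.setdefault(employee, factor)  # like FIRST(...) within the group

# Inner expression, evaluated once per employee group
per_employee = {e: sums[e] * distribution_factor * factors[e] for e in sums}

# Outer SUM over the employee-level results
grand_total = sum(per_employee.values())

# Naive total row: one factor applied to the sum over all detail rows
naive_total = sum(c for _, c, _ in rows) * distribution_factor * rows[0][2]

print(per_employee, grand_total, naive_total)
```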

Google Data Studio : how to obtain a SUM related to a COUNT_DISTINCT?

I have a dataset including 3 columns :
ID transac (The unique ID of the transaction - Dimension)
Source (The source of the transaction - Dimension)
Amount € (The amount of the transaction - Stat)
screenshot of my dataset
To count the number of transactions (for one or more sources), I use the COUNT_DISTINCT function.
I want to sum the transaction amounts (for one or more sources), but I don't want to add up the amounts of transactions that share the same ID!
Is there a way to do this calculation with a DataStudio function?
Thanks for your answers. :-)
EDIT : I saw that we could do this type of calculation via SQL here and I would like to do this in DataStudio (so that I don't have to pre-calculate the amounts per source.)
IMO, your dataset contains wrong data. Each value should be relative only to that line, but this is not the case: if the total is 20, each line should describe the contribution of that line to the total. With 4 sources, each line should be 5, or some other split that sums to 20.
To solve it in DataStudio, you need something like CALCULATE function in PowerBI, but currently DataStudio doesn't support this feature.
But there are some options to consider to repair your data:
If you're sure there are always 4 sources, just create a new calculated field with the expression Amount/4 and SUM it. It is not an elegant solution, but it works.
If your data source is Google Sheets, you can easily repair the data using formulas, like in this example:
Link to spreadsheet
For this spreadsheet, I used this formula in adjusted_amount column: =C2/COUNTIF(A:A,A2). With this column in DataStudio, just use the usual SUM aggregation function to summarize it correctly.
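The same adjustment can be sketched in Python to show why the plain SUM works afterwards. The transaction IDs, sources, and amounts below are hypothetical; the division mirrors the COUNTIF formula above.

```python
from collections import Counter

# Hypothetical rows: (transaction_id, source, amount); the amount repeats per ID
rows = [
    ("T1", "email", 20.0),
    ("T1", "search", 20.0),
    ("T1", "social", 20.0),
    ("T1", "direct", 20.0),
    ("T2", "email", 10.0),
]

# How many rows share each transaction ID, like COUNTIF(A:A, A2)
id_counts = Counter(tid for tid, _, _ in rows)

# Equivalent of =C2/COUNTIF(A:A,A2) for every row
adjusted = [amount / id_counts[tid] for tid, _, amount in rows]

total = sum(adjusted)
print(adjusted, total)
```

One caveat of this approach: once the report is filtered to a subset of sources, the sum is each transaction's apportioned share for those sources, not its full amount.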

How do I calculate the percentage of a count function?

I am trying to take the percentage of a count function in order to create an MS BIDS report resembling this Excel file:
Excel Close Rate Summary
The unique identifier for the opportunities is the field "opportunityid", so I am using COUNT(Fields!opportunityid.Value) to determine the number of cases in each stage. I want to write an expression that will return the percentage of cases in each stage per creation month, as seen in the Excel screenshot above.
This is my current MS BIDS report when I preview it
To be more specific, I want to have the percentage of "Active" and "New" opportunities in January to represent 67% and 33% respectively. 67% comes from 4/6. The 4 comes from the active opportunities out of the 6 opportunities created in January. Likewise, the 33% comes from the 2 new opportunities out of the 6 that were created in January.
There are more stage names than Active and New. Other options include New, Warm, Hot, Implementation, Active, Hibernate or Canceled. This is relevant to mention because I have tried to create an expression that counts based on the number of opportunities with a specific stage name, but have been unsuccessful.
Currently the expression I am using to calculate the percentage is:
=COUNT(Fields!new_rptstage.Value)/SUM(COUNT(Fields!opportunityid.Value),"GroupbyStageName")
Based on this expression, I am only able to get 1/1 or 100% for each of the stage names. I have tried a bunch of variations of the above expression by changing the scope, but have been unsuccessful in getting the desired results. Can someone explain how to correct this?
SAMPLE DATA:
In the sample data, I want the expression to be in the percentage column. The percentage should be the # of cases in a particular stage for the total cases that month. So looking at the above picture:
Active February 54 54/168 [have 54/168 display as a percentage]
Warm February 8 8/168
etc.
EDIT:
These are the expressions that may help show the underlying data in the chart.
The creation month expression is
=Fields!MonthCreated.Value & " " & year(Fields!createdon.Value)
The percent expression is listed above.
You don't want to use the COUNT() function. COUNT(*) returns a count of the number of rows that have a value. It doesn't return the actual value.
Since you've only shown a screenshot of your report, I don't know how your underlying data columns relate to it, but what you want to do for your Percent column expression is this:
This is pseudocode because I don't know your dataset field names:
CaseCount.Value / SUM(CaseCount.Value)
EDIT: Now that I better understand how your data relates to your report, I think the only change you need to make to your existing formula is casting it to a decimal type. It's probably rounding all fractions up to 1.
Try this for the expression in your percentage column:
=CDbl(COUNT(Fields!new_rptstage.Value))/CDbl(SUM(COUNT(Fields!opportunityid.Value),"GroupbyStageName"))
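For reference, the arithmetic the question is after (cases in a stage divided by all cases created that month) looks like this in a Python sketch. The data reproduces the January example from the question (4 Active, 2 New); the variable names are invented. Note that the denominator is counted over the whole month, not over a single stage.

```python
from collections import Counter

# Opportunities as (creation_month, stage), matching the January example
opportunities = [("January", "Active")] * 4 + [("January", "New")] * 2

per_stage = Counter(opportunities)                       # count per (month, stage)
per_month = Counter(month for month, _ in opportunities)  # count per month

percentages = {
    (month, stage): count / per_month[month]  # true division, no integer rounding
    for (month, stage), count in per_stage.items()
}

for (month, stage), share in percentages.items():
    print(f"{month} {stage}: {share:.0%}")
```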

Creating custom rollups with SSAS

I am currently working on a requirement as follows and would appreciate some help in figuring out a way to configure the aggregation of my measure:
I have a fact table that contains the following: Item ID, DateID, StoreID, ReceivedComments. The way received comments work is that on a daily basis a new record is created that adds to the value of received comments (for example, if Item 5 in Store 5 on 1 Jan had 23 Received Comments and it received 5 comments the following day, the row for Jan 2 would be Item 5, Store 5, Jan 2, 28).
We created a measure using MAX and it works fine whenever Item ID is used in the query. When we start moving to a higher level the max produces wrong results. Our requirement is to setup the measure to be as follows:
If the member selected is on the Item Level then MAX, if it's on any other level (Date or Store) then the measure should aggregate the Max of all Items under this date or store.
Due to the business rules and structure of the database Store and Item are different dimensions so I can not include them in 1 Hierarchy.
We have been playing around with Custom RollUps but so far haven't been able to get it to work.
Thanks
I would solve this by using a more traditional approach to your fact table. Instead of keeping a cumulative count in the ReceivedComments column, I would keep only the number of comments received THAT DAY.
That way, instead of using MAX, you can create your measure using SUM, and it will automatically rollup when you go to higher levels.
The only disadvantage I can see to this approach is that you will need to use a range of dates, instead of only the most recent date, to get a full total of all the comments for a given item/store/date. But that's a very small change to your MDX.
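A minimal Python sketch of that restructuring is shown below: it turns cumulative counts into per-day increments, after which a plain sum rolls up correctly. The Item 5 / Store 5 figures come from the question; the second item and the function name are assumptions for illustration.

```python
# Cumulative rows: (item, store, day, cumulative_comments)
cumulative = [
    (5, 5, 1, 23),  # from the question: Item 5, Store 5, Jan 1
    (5, 5, 2, 28),  # from the question: Item 5, Store 5, Jan 2
    (7, 5, 1, 10),  # hypothetical second item
    (7, 5, 2, 12),
]

def to_daily(rows):
    """Convert cumulative counts into per-day increments per (item, store)."""
    previous = {}
    daily = []
    for item, store, day, total in sorted(rows, key=lambda r: (r[0], r[1], r[2])):
        key = (item, store)
        daily.append((item, store, day, total - previous.get(key, 0)))
        previous[key] = total
    return daily

daily = to_daily(cumulative)

# With daily increments, SUM rolls up to the store level without special handling
store_total = sum(n for _, store, _, n in daily if store == 5)
print(daily, store_total)
```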
Someone suggested using ISLEAF to determine the level. Instead of using ISLEAF, I went with AS CASE WHEN [Item].[ItemID].CURRENTMEMBER.LEVEL IS [Item].[ItemID].[(All)] so that I don't have to account for other dimensions such as Date and Store, as I have several other dimensions that all behave the same way.
And then I went with this formula to determine the Sum of the Max of the items in a particular store like this:
SUM({[Item].[Item ID].children},[Measures].[ReceivedComments]). I expect some performance issues with this measure, but we are currently running tests to see whether it will be reliable on actual data.
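The intent of that measure (take the MAX per item, then sum those maxima at the higher level) can be illustrated with a Python sketch. It is not MDX, and the second item's rows are hypothetical; it only shows how the result differs from a plain MAX across the whole store.

```python
from collections import defaultdict

# Cumulative rows: (item, store, day, cumulative_comments); item 7 is hypothetical
rows = [
    (5, 5, 1, 23),
    (5, 5, 2, 28),
    (7, 5, 1, 10),
    (7, 5, 2, 12),
]

def sum_of_item_max(data, store):
    """Sum, over the items in a store, of each item's maximum cumulative value."""
    best = defaultdict(int)
    for item, s, _, total in data:
        if s == store:
            best[item] = max(best[item], total)
    return sum(best.values())

plain_max = max(total for _, s, _, total in rows if s == 5)
rolled_up = sum_of_item_max(rows, store=5)
print(plain_max, rolled_up)
```

Here the plain MAX reports only the largest single item, while the sum of per-item maxima gives the store-wide comment count.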
