Multiply columns in T-SQL join query - sql-server

I am using SQL Server 2008. I am trying to build a T-SQL query to calculate some performance metrics based on data from several tables. Unfortunately, I am stuck on one of the calculations and cannot figure out what is wrong. I would greatly appreciate any help.
The calculations require the total produced (from table ShiftHourCounts), the total scrap pieces (from table ShiftReportScrap), and the total downtime (from the Downtime query).
I have tried to add each operation/calculation to the query, mainly for my own education and troubleshooting.
I do not understand why column Q returns zero. Q = Tok/Tp, and the query correctly returns and calculates both Tok and Tp individually. In the example below, Q should equal 0.994.
The query currently returns correct values for everything except Q, A, P, and OEE, which always return zero.
Current query:
--OEE= A*P*Q (this is the final desired result/ calculation)
--A= (Planned run time - Unplanned Down Time)/Planned run time
--A= (Prt - Dtu)/Prt
--Prt= Maximum Available Time - Planned Down Time
--Prt= Mat-DTp
--Effective production time= Planned run time - Unplanned Down Time
--Ept=Prt-DTu
--P= (BDT*total number of produced parts)/Effective production time
--P= (BDT*Tp)/Ept
--Q= Total number of OK parts/Total number of produced parts
--Q= Tok/Tp
select
sm.SR_ID, sm.SR_PartID, sm.SR_StartTime,
isnull(sm.SR_EndTime,GETDATE()) AS EndTime,
isnull(sm.SR_BDT,1) AS BDT,
DATEDIFF(n, sm.SR_StartTime, isnull(sm.SR_EndTime, GETDATE())) AS Prt,
isnull(p.TotalProduced,0) AS Tp,
isnull(s.Scrap,0) AS Scrap,
(isnull(p.TotalProduced, 0) - isnull(s.Scrap, 0)) AS Tok,
isnull(dt.DownTimeDuration, 0) AS DTu,
((isnull(p.TotalProduced, 0) - isnull(s.Scrap, 0)) / isnull(p.TotalProduced, 0)) AS Q, --Q= Tok/Tp
((DATEDIFF(n, sm.SR_StartTime, isnull(sm.SR_EndTime, GETDATE())) - isnull(dt.DownTimeDuration, 0)) / DATEDIFF(n, sm.SR_StartTime, isnull(sm.SR_EndTime, GETDATE()))) AS A,
((isnull(sm.SR_BDT, 1) * isnull(p.TotalProduced, 0)) / (DATEDIFF(n, sm.SR_StartTime, isnull(sm.SR_EndTime, GETDATE())) - isnull(dt.DownTimeDuration, 0))) AS P,
(((isnull(p.TotalProduced, 0) - isnull(s.Scrap, 0)) / isnull(p.TotalProduced, 0)) * ((DATEDIFF(n, sm.SR_StartTime, isnull(sm.SR_EndTime, GETDATE())) - isnull(dt.DownTimeDuration, 0)) / DATEDIFF(n, sm.SR_StartTime, isnull(sm.SR_EndTime, GETDATE())))*((isnull(sm.SR_BDT,1)*isnull(p.TotalProduced,0))/(DATEDIFF(n,sm.SR_StartTime,isnull(sm.SR_EndTime,GETDATE()))-isnull(dt.DownTimeDuration,0)))) AS OEE
FROM
ShiftReportMaster sm
LEFT JOIN
(SELECT
SH_ShiftID, Sum(SH_Produced) AS TotalProduced
FROM
ShiftHourCounts
GROUP BY
SH_ShiftID) p ON (p.SH_ShiftID = sm.SR_ID)
LEFT JOIN
(SELECT
SRS_SR_ID, SRS_PartID, Sum(SRS_Scraped) AS Scrap
FROM
ShiftReportScrap
GROUP BY
SRS_SR_ID, SRS_PartID) s ON (s.SRS_SR_ID = sm.SR_ID)
AND (s.SRS_PartID = sm.SR_PartID)
LEFT JOIN
(SELECT
srd.DTR_SRID, [Downtime reasons].DT_Planned,
Sum(srd.DTR_DownTimeDuration) AS DownTimeDuration
FROM
ShiftReportDowntime srd
LEFT JOIN
[Downtime reasons] ON srd.DTR_Reason = [Downtime reasons].DT_ID
GROUP BY
srd.DTR_SRID, [Downtime reasons].DT_Planned
HAVING
((([Downtime reasons].DT_Planned) = 0))) dt ON (dt.DTR_SRID = sm.SR_ID)
WHERE
sm.SR_ID = 3689;

Most likely from integer division. Try this:
((isnull(p.TotalProduced+.0,0.0)-isnull(s.Scrap+.0,0.0))
/nullif(p.TotalProduced,0)) AS Q, --Q= Tok/Tp
Adding .0 or multiplying by 1.0 implicitly converts the integers into a decimal type.
Dividing one integer by another returns an integer, and any result between 0 and 1 comes back as 0 because the fractional part is truncated rather than rounded.
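To see the truncation in isolation, here is a minimal sketch in Python using the built-in sqlite3 module as a stand-in for SQL Server (an assumption made so that the example runs anywhere; SQLite also truncates integer division). The table name shift and the values 1000 produced / 6 scrapped are hypothetical, chosen only so that Q comes out at 0.994.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shift (tp INTEGER, scrap INTEGER)")  # hypothetical table
conn.execute("INSERT INTO shift VALUES (1000, 6)")              # hypothetical values

int_q, dec_q = conn.execute(
    """
    SELECT (tp - scrap) / tp,                    -- integer / integer
           (tp * 1.0 - scrap) / NULLIF(tp, 0)    -- forced to a non-integer type
    FROM shift
    """
).fetchone()

print(int_q)  # integer division truncates 0.994 down to 0
print(dec_q)  # approximately 0.994

# NULLIF also protects against a shift that produced nothing:
# the divisor becomes NULL, so the result is NULL (None in Python).
empty_q = conn.execute("SELECT (0 * 1.0 - 0) / NULLIF(0, 0)").fetchone()[0]
print(empty_q)
```

The same multiplication by 1.0 would be needed in the A, P, and OEE expressions, since each of them also divides one integer by another.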

Related

SQL Server : measuring real-time efficiency by operator

I've been working on some SQL code to measure efficiency in real-time for some production data. Here's a quick background:
Operators will enter in data for specific sub assemblies. This data looks something like this:
ID PO W/S Status Operator TotalTime Date
60129515_2000_6_S025 107294 S025 Completed A 38 05/08/2020
60129515_2000_7_S025 107294 S025 Completed A 46 05/08/2020
60129515_2000_8_S025 107294 S025 Completed A 55 05/08/2020
60129515_2025_6_S020 107295 S020 Completed B 58 05/08/2020
60129515_2025_7_S020 107295 S020 Completed B 47 05/08/2020
60129515_2025_8_S020 107295 S020 Completed B 45 05/08/2020
60129515_2000_1_S090 107294 S090 Completed C 33 05/08/2020
60129515_2000_2_S090 107294 S090 Completed C 34 05/08/2020
60129515_2000_3_S090 107294 S090 Completed C 21 05/08/2020
The relevant columns are the Operator, TotalTime and Date (note that the date is stored as varchar(50) because it plays nicer with Microsoft PowerApps that way).
What I need to do is:
Aggregate the sum of "TotalTime" grouped by Operator
Calculate the time elapsed based on a condition:
If between 7AM and 4PM, calculate the time elapsed since 7AM of the current day
If after 4PM, return the total time between 7AM and 4PM of the current day
Divide the SUM(TotalTime) by the TimeElapsed (AKA the first list item / second list item) in order to get a rough estimate of labor hours worked vs. hours passed in the day.
This calculation would change every time the query is run. This will allow the Microsoft PowerApp that is pulling this query to refresh the efficiency measure in real time. I've taken a stab at it already - see below:
SELECT
md.Operator,
CASE
WHEN DATEADD(HOUR, -5, GETUTCDATE()) > CONVERT(DATETIME, CONVERT(DATE, DATEADD(HOUR, -5, GETUTCDATE()))) + '7:00' AND GETDATE() < CONVERT(DATETIME, CONVERT(DATE, DATEADD(HOUR, -5, GETUTCDATE()))) + '15:45'
THEN (SUM(isNull(md.TotalTime, 0)) + SUM(isNull(md.DelTime, 0))) * 1.0 / DATEDIFF(MINUTE, CONVERT(DATETIME, CONVERT(DATE, DATEADD(HOUR, -5, GETUTCDATE()))) + '7:00' , DATEADD(HOUR, -5, GETUTCDATE())) * 100.0
ELSE (SUM(isNull(md.TotalTime, 0)) + SUM(isNull(md.DelTime, 0))) / 420 * 100.0
END AS OpEfficiency
FROM
[Master Data] AS md
WHERE
md.[Date] = CONVERT(varchar(50), DATEADD(HOUR, -5, GETUTCDATE()), 101)
GROUP BY
md.Operator
Note: the DelTime is a different column regarding delay times. I am also converting back from UTC time to avoid any time zone issues when transferring to PowerApps.
However, this is horribly inefficient. I am assuming it is because the Date needs to be converted to datetime every single time. Would it work better if I had a calculated column that already had the date converted? Or is there a better way to calculate time elapsed since a certain time?
Thanks in advance.
There are a few things you can do to increase efficiency considerably. First, you want to make sure SQL can do a simple comparison when selecting rows, so you'll start by calculating a string to match your date on since your [Date] field is a string not a date.
Second, calculate the minutes in your shift (either 540 for a full shift or scaled down to 0 at 7 AM exactly) ahead of time so you aren't calculating minutes in each row.
Third, when summing for operators, use a simple sum on the minutes and calculate efficiency from that sum and your pre-calculated shift so far minutes.
One note - I'm casting the minutes-so-far as FLOAT in my example, maybe not the best type but it's clearer than other decimal types like DECIMAL(18,6) or whatever. Pick something that will show the scale you want.
My example uses a Common Table Expression to generate that date string and minutes-so-far FLOAT, that's nice because it fits in a direct query, view, function, or stored procedure, but you could DECLARE variables instead if you wanted to.
By filtering with an INNER JOIN on the [Date] string against the pre-calculated TargetDate string, I make sure the data set is pared down to the fewest records before doing any math on anything. You'll definitely want to INDEX [Date] to keep this fast as your table fills up.
All of these together should give a pretty fast query. Good luck!
with cteNow as ( --Calculate once, up front - date as string, minutes elapsed as FLOAT (or any non-integer)
SELECT CASE WHEN 60*DATEPART(HOUR, GETUTCDATE())+DATEPART(MINUTE, GETUTCDATE()) > 60*21
--4PM in UTC-5, expressed in minutes
THEN CONVERT(float,(16-7)*60) --minutes in (4 PM-7 AM) * 60 minutes/hour
ELSE --Assume nobody is running this at 6 AM, so ELSE = between 7 and 4
CONVERT(float,60*DATEPART(HOUR, GETUTCDATE()) + DATEPART(MINUTE, GETUTCDATE()) - ((7+5)*60))
--Minutes since midnight minus minutes from midnight to 7 AM, shifted by
--UTC offset of 5 hours
END as MinutesToday --Minutes in today's shift so far
, FORMAT(DATEADD(HOUR,-5,GETUTCDATE()),'MM/dd/yyyy') as TargetDate --Date to search for
--as a string so no conversion in every row comparison. Also, index [Date] column
)
SELECT md.Operator, SUM(md.TotalTime) as TotalTime, SUM(md.TotalTime) / MinutesToday as Efficiency
FROM [Master Data] AS md INNER JOIN cteNow as N on N.TargetDate = md.[Date]
GROUP BY md.Operator, MinutesToday
BTW, you didn't make allowances for lunch or running before 7 AM, so I also ignored those. I think both could be addressed in cteNOW without adding much complexity.
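As a way to sanity-check the arithmetic in cteNow outside the database, the same CASE logic can be sketched in Python. The function names below are hypothetical, and the sketch deliberately inherits the answer's assumptions (fixed UTC-5 offset, nobody running it before 7 AM, no lunch break).

```python
from datetime import datetime

SHIFT_START_HOUR = 7   # 7 AM local time, from the question
SHIFT_END_HOUR = 16    # 4 PM local time, from the question
UTC_OFFSET_HOURS = 5   # local time is UTC-5, as assumed in the answer


def minutes_in_shift_so_far(utc_now: datetime) -> float:
    """Mirror the CASE expression in cteNow for a given UTC timestamp."""
    minutes_utc = 60 * utc_now.hour + utc_now.minute
    if minutes_utc > 60 * (SHIFT_END_HOUR + UTC_OFFSET_HOURS):
        # After 4 PM local: the whole shift has elapsed.
        return float((SHIFT_END_HOUR - SHIFT_START_HOUR) * 60)
    # Otherwise: minutes since 7 AM local, computed on the UTC clock.
    return float(minutes_utc - (SHIFT_START_HOUR + UTC_OFFSET_HOURS) * 60)


def efficiency(total_time: float, utc_now: datetime) -> float:
    """SUM(TotalTime) divided by the minutes elapsed in the shift."""
    return total_time / minutes_in_shift_so_far(utc_now)


# Operator A's three rows from the question: 38 + 46 + 55 minutes.
operator_a_total = 38 + 46 + 55
mid_shift = datetime(2020, 5, 8, 15, 30)    # 15:30 UTC = 10:30 local
after_shift = datetime(2020, 5, 8, 22, 0)   # 22:00 UTC = 17:00 local

print(minutes_in_shift_so_far(mid_shift))    # 210.0
print(minutes_in_shift_so_far(after_shift))  # 540.0
print(round(efficiency(operator_a_total, mid_shift), 3))  # 0.662
```

Because the comparison is done on the UTC clock, a run after 7 PM local (past midnight UTC) would fall into the ELSE branch and produce a negative value, so that case would need its own guard if the query can run that late.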

How can I add values to a chart that do not exist as 0 in google data studio?

I have 4 tables in BigQuery that keep statistics for messages in a message queue. The tables are: receivedMessages, processedMessages, skippedMessages and failedMessages. Each table has, among other things, a header.processingMetadata.approximateArrivalTimestamp which, as you might have guessed, is a timestamp field.
My goal is to create 4 charts, one for each of these tables, aggregated on this field, as well as a 5th chart that displays each message category as a daily percentage of the receivedMessages, together with the unknown-status messages, using the following formula:
UNKNOWN_STATUS_MESSAGES = TOTAL_RECEIVED_MESSAGES - (TOTAL_PROCESSED_MESSAGES + TOTAL_SKIPPED_MESSAGES + TOTAL_FAILED_MESSAGES)
However, some days do not have skipped or failed messages, so there are no records in BigQuery for those days in these two tables. As a result, these 2 charts have missing dates, and the UNKNOWN_STATUS_MESSAGES are not displayed correctly in the 5th chart.
I also tried the following code as a metric in my charts, with no success (changing the variable name appropriately each time).
CASE WHEN TOTAL_FAILED_MESSAGES IS NULL THEN 0 ELSE TOTAL_FAILED_MESSAGES END
Is there a way to make Google Data Studio fill the dates that have no data with 0s so I can display the charts correctly?
As long as you know the date boundaries of your chart, you can fill those holes with zeros. For instance, if you want to generate your report for the last 30 days:
with dates as (
select
x as date
from
unnest(generate_date_array(date_sub(current_date(), interval 30 day), current_date())) as x
),
-- Count each table per day first, so that the joins below cannot multiply rows
received as (
select date(header.processingMetadata.approximateArrivalTimestamp) as date, count(*) as n
from dataset.receivedMessages
group by 1
),
processed as (
select date(header.processingMetadata.approximateArrivalTimestamp) as date, count(*) as n
from dataset.processedMessages
group by 1
),
skipped as (
select date(header.processingMetadata.approximateArrivalTimestamp) as date, count(*) as n
from dataset.skippedMessages
group by 1
),
failed as (
select date(header.processingMetadata.approximateArrivalTimestamp) as date, count(*) as n
from dataset.failedMessages
group by 1
)
select
d.date,
coalesce(received.n, 0) as received_messages,
coalesce(processed.n, 0) as processed_messages,
coalesce(skipped.n, 0) as skipped_messages,
coalesce(failed.n, 0) as failed_messages,
-- Days missing from a table count as 0 here as well
coalesce(received.n, 0) - (coalesce(processed.n, 0) + coalesce(skipped.n, 0) + coalesce(failed.n, 0)) as unknown_messages
from dates d
left join received on received.date = d.date
left join processed on processed.date = d.date
left join skipped on skipped.date = d.date
left join failed on failed.date = d.date
order by 1
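The idea behind the query (build a complete list of dates, then attach counts and default to 0) can be sketched independently of BigQuery. The following Python is only an illustration: the helper name fill_missing_days and all of the daily counts are hypothetical.

```python
from datetime import date, timedelta


def fill_missing_days(counts, start, end):
    """Return {day: count} for every day in [start, end], using 0 where a day is absent."""
    n_days = (end - start).days + 1
    spine = [start + timedelta(days=i) for i in range(n_days)]  # the "dates" CTE
    return {day: counts.get(day, 0) for day in spine}           # left join + coalesce


# Hypothetical daily counts; skipped and failed have gaps, as in the question.
start, end = date(2024, 1, 1), date(2024, 1, 3)
received = {date(2024, 1, 1): 10, date(2024, 1, 2): 8, date(2024, 1, 3): 5}
processed = {date(2024, 1, 1): 7, date(2024, 1, 2): 8, date(2024, 1, 3): 3}
skipped = {date(2024, 1, 1): 2}
failed = {}

r, p, s, f = (fill_missing_days(t, start, end) for t in (received, processed, skipped, failed))

# UNKNOWN = RECEIVED - (PROCESSED + SKIPPED + FAILED), now defined for every day.
unknown = {day: r[day] - (p[day] + s[day] + f[day]) for day in r}
for day, value in unknown.items():
    print(day, value)
```

Every day in the range now has a value in all four series, so the subtraction never runs into a missing entry.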
1) I recommend doing a join in BigQuery with a date master table to return '0' for those date values.
2) Otherwise, in Data Studio, make sure there is a field X that has values for all dates. Then create a calculated field with formula X - X + TOTAL_SKIPPED_MESSAGES and X - X + TOTAL_FAILED_MESSAGES
As I found out, it is also possible to do this with a non-fixed date range by using the date parameters. So the first part of khan's answer can be rewritten as:
WITH dates AS (
select *
from unnest(generate_date_array(PARSE_DATE('%Y%m%d', @DS_START_DATE), PARSE_DATE('%Y%m%d', @DS_END_DATE), interval 1 day)) as day
)

How does the outer WHERE clause affect the way nested query is executed?

Let's say I have a table lines
b | a
-----------
17 7000
17 0
18 6000
18 0
19 5000
19 2500
I want to get the positive values of the function (a1 - a2) / (b2 - b1) for all elements in the Cartesian product of lines with different b's. (If you are interested, this gives the intersections of the lines y1 = b1*x + a1 and y2 = b2*x + a2.)
I wrote query1 for that purpose:
SELECT temp.point FROM
(SELECT DISTINCT ((l1.a - l2.a) / (l2.b - l1.b)) AS point
FROM lines AS l1
CROSS JOIN lines AS l2
WHERE l1.b != l2.b
) AS temp
WHERE temp.point > 0
It throws a "division by zero" error. I tried the same query without the WHERE clause (query2) and it works just fine
SELECT temp.point FROM
(SELECT DISTINCT ((l1.a - l2.a) / (l2.b - l1.b)) AS point
FROM lines AS l1
CROSS JOIN lines AS l2
WHERE l1.b != l2.b
) AS temp
as well as the variation with the defined SQL function (query3)
CREATE FUNCTION get_point(@a1 DECIMAL(18, 4), @a2 DECIMAL(18, 4), @b1 INT, @b2 INT)
RETURNS DECIMAL(18, 4)
WITH EXECUTE AS CALLER
AS
BEGIN
RETURN (SELECT (@a1 - @a2) / (@b2 - @b1))
END
END
GO
SELECT temp.point FROM
(SELECT DISTINCT dbo.get_point(l1.a, l2.a, l1.b, l2.b) AS point
FROM lines AS l1
CROSS JOIN lines AS l2
WHERE l1.b != l2.b
) AS temp
WHERE temp.point > 0
I have an intuitive assumption that the outer SELECT shouldn't affect the way nested SELECT is executed (or at least shouldn't break it). Even if it is not true that wouldn't explain why query3 works when query1 doesn't.
Could someone explain the principle behind this? That would be much appreciated.
If you want to guarantee that the query will always work, you'd need to wrap your calculation in something like a case statement
case when l2.b - l1.b = 0
then null
else (l1.a - l2.a) / (l2.b - l1.b)
end
Technically, the optimizer is perfectly free to evaluate conditions in whatever order it expects will be more efficient. The optimizer is free to evaluate the division before the where clause that filters out rows where the divisor would be 0. It is also free to evaluate the where clause first. Your different queries have different query plans which result in different behavior.
Realistically, though, even though a particular query might have a "good" query plan today, there is no guarantee that the optimizer won't decide in a day, a month, or a year to change the query plan to something that would throw a division by 0 error. I suppose you could decide to use a bunch of hints/ plan guides to force a particular plan with a particular behavior to be used. But that tends to be the sort of thing that bites you in the hind quarters later. Wrapping the calculation in a case (or otherwise preventing the division by 0 error) will be much safer and easier to explain to the next developer.
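To illustrate why the guard makes evaluation order irrelevant, here is a small Python sketch that deliberately performs the division for every pair first (including pairs with equal b) and only filters afterwards, which is the order that breaks query1. The helper name safe_point is hypothetical; the data is the lines table from the question.

```python
from itertools import product

# (b, a) rows of the lines table from the question.
lines = [(17, 7000), (17, 0), (18, 6000), (18, 0), (19, 5000), (19, 2500)]


def safe_point(l1, l2):
    """Equivalent of the CASE expression: None (NULL) when the divisor is 0."""
    b1, a1 = l1
    b2, a2 = l2
    if b2 - b1 == 0:
        return None
    return (a1 - a2) / (b2 - b1)


# Worst-case ordering: divide for the full cross join, filter later.
all_points = [safe_point(l1, l2) for l1, l2 in product(lines, lines)]

# Outer filter (point > 0) plus DISTINCT; NULLs never pass the comparison.
positive = sorted({p for p in all_points if p is not None and p > 0})
print(positive)
```

Without the check inside safe_point, the first pair with equal b values would raise ZeroDivisionError before the filter had a chance to discard it, which is the same situation the optimizer can create in SQL Server.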

Setting hh:mm to 24 hours format in one single t-sql Query - SQL Server 2012

I have a query which returns some hh:mm time values. The problem, however, is that it returns them in AM/PM format while they need to be in 24-hour format. I can't change the global language setting because this 24-hour time setting is query-specific.
I was wondering how to solve this issue?
The query I got now is as follows:
SELECT
dbo.qryMPDisplayPre.Datum, dbo.qryMPDisplayPre.Relatie,
dbo.qryMPDisplayPre.[Order], dbo.qryMPDisplayPre.Status,
dbo.WorkOrder.DeviceID, dbo.Relaties.RelatieNaam AS Monteur,
dbo.Orders.Omschrijving AS OrderOmschrijving,
Format(dbo.WorkOrder.WBTravelDeparture, 'hh:mm') AS TravelDeparture,
Format(dbo.WorkOrder.WBTravelArrival, 'hh:mm') AS TravelArrival,
Format(dbo.WorkOrder.WBWorkArrival, 'hh:mm') AS WorkArrival,
Format(dbo.WorkOrder.WBWorkDeparture, 'hh:mm') AS WorkDeparture,
(CASE WHEN WorkOrder.[WBtravelhours] IS NULL
THEN 0 ELSE (CAST(WorkOrder.[WBTravelHours] * 100.0 / 100.0 AS DECIMAL(30, 2))) END) AS TravelHours,
(CASE WHEN WorkOrder.[wbworkhours] IS NULL
THEN 0 ELSE (CAST(WorkOrder.[WBWorkHours] * 100.0 / 100.0 AS DECIMAL(30, 2))) END) AS WorkHours,
dbo.qryWBMontageGeboekt.Geboekt, dbo.Orders.OpdAdres,
dbo.Orders.OpdPC, dbo.Orders.OpdPlaats,
LEFT(dbo.Orders.Omschrijving, 9) AS Expr1
FROM
dbo.qryWBMontageGeboekt
RIGHT OUTER JOIN
dbo.Orders
RIGHT OUTER JOIN
dbo.Relaties
RIGHT OUTER JOIN
dbo.WorkOrder
RIGHT OUTER JOIN
dbo.qryMPDisplayPre ON dbo.WorkOrder.WONummer = dbo.qryMPDisplayPre.[Order]
AND dbo.WorkOrder.WOStatus = dbo.qryMPDisplayPre.Status
AND dbo.WorkOrder.WOAssignmentDate = dbo.qryMPDisplayPre.Datum
ON dbo.Relaties.RelatieNummer = dbo.qryMPDisplayPre.Relatie
ON dbo.Orders.Nummer = dbo.qryMPDisplayPre.[Order]
ON dbo.qryWBMontageGeboekt.Datum = dbo.qryMPDisplayPre.Datum
AND dbo.qryWBMontageGeboekt.Relatie = dbo.qryMPDisplayPre.Relatie
AND dbo.qryWBMontageGeboekt.[Order] = dbo.qryMPDisplayPre.[Order]
WHERE
(dbo.qryMPDisplayPre.Datum > '11/1/2012')
AND (dbo.qryMPDisplayPre.Status <> 0)
It is rather strange, since the values in WorkArrival are displayed correctly in 24-hour format, while the values in TravelDeparture, TravelArrival and WorkDeparture are not, even though they are formatted the same way as WorkArrival.
This made me believe that there was something wrong with the source values in the WorkOrder table. However, this table contains datetimes in 24-hour form and they are all stored the same way, so this can't be the problem.
Here is the WorkOrder table from which the values are fetched:
As you can see, these are all dates with 24-hour HH:MM values.
Below are the query results with their AM/PM-formatted time values:
As you can see, the query results are very odd. The WorkArrival field seems to return its values correctly, but the others don't. What is also strange is that the TravelDeparture field returns some of its values correctly (the top two) but others incorrectly.
Any clue how this can happen, and how I can get the values returned in 24-hour format in the query results?
In your example they should all be in 12 hour format, and I see no reason for it not being the case. The format for 12 hours is 'hh' and you are using it in all places.
Is this your original query? If not then check your format strings for upper / lower case. The format for 24 hours happens to be 'HH' (upper case instead of lower case being the only difference).

SQL Server : divide by zero error

I have searched the forums and found several responses on this topic but as I am new to SQL I'm not getting it.
I created a TSQL query that when run returns "Divide by zero error". This is because I am dividing one column by another and there is a zero in one of the records. Ok, I got that part but I can't make heads or tails of the posts explaining how to resolve the issue.
SELECT
f.Sales_Rep1 AS 'Sales Rep'
, SUM(CAST(f.Other1_Revenue + f.Other2_Revenue AS FLOAT)) AS 'MRR'
, CW.MRR_Goal AS 'MRR Goal'
, SUM((f.Other1_Revenue + f.Other2_Revenue)/(CW.MRR_Goal)) AS 'Total'
, SUM(CAST(f.Product_Revenue + f.Service_Revenue AS FLOAT)) AS 'NRR'
, CW.NRR_Goal AS 'NRR Goal'
, (cw.MRR_Goal + cw.NRR_Goal)AS 'Total Goal'
FROM
dbo.v_rpt_Opportunity AS f
INNER JOIN
dbo.v_memberpickerlist AS m ON f.Sales_Rep1 = m.Member_ID
INNER JOIN
dbo.CW_SalesGoals AS CW ON CW.Sales_Rep = f.Sales_Rep1
WHERE
(f.Expected_Close_Date >= DATEADD(MM, DATEDIFF(MM, 0, GETDATE()), 0))
AND (m.activestatus = 'active') AND (f.Status = 'Won')
OR (f.Expected_Close_Date >= DATEADD(MM, DATEDIFF(MM, 0, GETDATE()), 0))
AND (m.activestatus = 'active')
AND (f.Status LIKE '%submitted%')
GROUP BY
f.Sales_Rep1, CW.MRR_Goal, CW.NRR_Goal,
f.Other1_Revenue, f.Other2_Revenue
I know the line that is causing the issue, but I do not know how to resolve it...
SUM((f.Other1_Revenue + f.Other2_Revenue) / (CW.MRR_Goal)) AS 'Total'
What I am trying to do is get the percentage of the goal, but I can't get past the division to do that.
Any help would be greatly appreciated. Thank you!
You can use the nullif() function to avoid divide by zero exceptions:
(f.Other1_Revenue + f.Other2_Revenue) / nullif(CW.MRR_Goal, 0)
In the event that CW.MRR_Goal is 0, the result for the entire expression will be null [1].
Sample:
select 1 / nullif(0, 0) -- null
[1] I've selected null because, in my opinion, it's the most appropriate analogue for undefined in SQL. While that's helpful for an individual expression, it may not be what you're looking for in an aggregate sense.
Use a CASE statement:
SUM(CASE WHEN CW.MRR_Goal <> 0 THEN (f.Other1_Revenue + f.Other2_Revenue)/(CW.MRR_Goal) ELSE 0 END) AS 'Total'
This way, it will perform the division only if the value of CW.MRR_Goal in that record is other than zero; if it's zero, it will return 0
You could use an IIF:
SUM(IIF(CW.MRR_Goal = 0, 0, (f.Other1_Revenue + f.Other2_Revenue) / (CW.MRR_Goal)))
This way if the goal is equal to 0 then it will SUM the 0 rather than trying the division whereas if the goal is not equal to 0 then it will do the division.
More on IIF https://msdn.microsoft.com/en-GB/library/hh213574.aspx
As user3540365 mentioned, IIF is only available from SQL Server 2012 onwards, whereas CASE applies from SQL Server 2008 onwards and is part of the SQL standard.
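The practical difference between the NULLIF and CASE approaches shows up once the expression sits inside SUM. The sketch below uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption made so that it is runnable; note that SQLite itself returns NULL for a division by zero instead of raising an error, so this shows the result shapes only). The table opp and its rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE opp (rep TEXT, revenue REAL, goal REAL)")  # hypothetical table
conn.executemany(
    "INSERT INTO opp VALUES (?, ?, ?)",
    [("A", 100.0, 50.0), ("A", 30.0, 0.0), ("B", 20.0, 0.0)],  # hypothetical rows
)

rows = conn.execute(
    """
    SELECT rep,
           SUM(revenue / NULLIF(goal, 0)) AS total_nullif,
           SUM(CASE WHEN goal <> 0 THEN revenue / goal ELSE 0 END) AS total_case
    FROM opp
    GROUP BY rep
    ORDER BY rep
    """
).fetchall()

for rep, total_nullif, total_case in rows:
    print(rep, total_nullif, total_case)
```

For rep A both columns agree, because SUM skips the NULL produced by the zero goal. For rep B, where every goal is zero, the NULLIF version yields NULL while the CASE version yields 0, so the choice depends on whether "no valid goal" should read as undefined or as zero in the report.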
