I have a stored procedure that works correctly, but I don't understand the theory behind why it works. I'm identifying a consecutive period of time by using datepart and dense rank (I found the solution through help elsewhere).
select
c.bom
,h.x
,h.z
,datepart(year, c.bom) * 12 + datepart(month, c.bom) -- this returns an integer value for the year and month, so the number increments by one for each month
- dense_rank() over ( partition by h.x order by datepart(year, c.bom) * 12 + datepart(month, c.bom)) as grp -- this subtracts the dense rank from the integer month value so that consecutive months (i.e. consecutive integers) end up with the same group number
from
#c c
inner join test.vw_info_h h
on h.effective_date <= c.bom
and (h.expiration_date is null or h.expiration_date > c.bom)
I understand in theory what is happening with the grouping functionality.
How does multiplying year * 12 + month work? Why do we multiply the year? What is happening in the backend?
The year component of a date is an integer value. Since there are 12 months in a year, multiplying the year value by 12 provides the total number of months that have passed to get to the first of that year.
Here's an example. Take the date February 11, 2012 (20120211 in CCYYMMDD format).
2012 * 12 = 24144 months from year zero to the start of 2012.
24144 + 2 months (February) = 24146.
Multiplying the year value by the number of months in a year allows you to establish month-related offsets without having to do any coding to handle the edge cases between the end of one year and the start of another. For example:
11/2011 -> 24143
12/2011 -> 24144
01/2012 -> 24145
02/2012 -> 24146
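To make the arithmetic concrete, here is a small Python sketch of the same idea (not the original T-SQL; the names month_index and group_consecutive are made up for illustration). It computes year * 12 + month and then subtracts a dense rank, mirroring the grp expression in the query.

```python
from datetime import date

def month_index(d):
    """Months counted from year 0: year * 12 + month."""
    return d.year * 12 + d.month

def group_consecutive(dates):
    """Label each date with month_index minus its dense rank.

    Dates in an unbroken run of months share the same label, because
    both the index and the rank step by one from month to month.
    """
    indexes = sorted({month_index(d) for d in dates})
    rank = {idx: r for r, idx in enumerate(indexes, start=1)}  # dense rank
    return {d: month_index(d) - rank[month_index(d)] for d in dates}

# Nov 2011 through Jan 2012 is one run; Apr and May 2012 form a second run.
months = [date(2011, 11, 1), date(2011, 12, 1), date(2012, 1, 1),
          date(2012, 4, 1), date(2012, 5, 1)]
groups = group_consecutive(months)
```

The year boundary needs no special handling: December 2011 and January 2012 differ by exactly one, just like any other pair of adjacent months. A gap in the months makes the index jump by more than the rank does, which is what starts a new group.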
I've been working on some SQL code to measure efficiency in real-time for some production data. Here's a quick background:
Operators will enter in data for specific sub assemblies. This data looks something like this:
ID PO W/S Status Operator TotalTime Date
60129515_2000_6_S025 107294 S025 Completed A 38 05/08/2020
60129515_2000_7_S025 107294 S025 Completed A 46 05/08/2020
60129515_2000_8_S025 107294 S025 Completed A 55 05/08/2020
60129515_2025_6_S020 107295 S020 Completed B 58 05/08/2020
60129515_2025_7_S020 107295 S020 Completed B 47 05/08/2020
60129515_2025_8_S020 107295 S020 Completed B 45 05/08/2020
60129515_2000_1_S090 107294 S090 Completed C 33 05/08/2020
60129515_2000_2_S090 107294 S090 Completed C 34 05/08/2020
60129515_2000_3_S090 107294 S090 Completed C 21 05/08/2020
The relevant columns are the Operator, TotalTime and Date (note that the date is stored as varchar(50) because it plays nicer with Microsoft PowerApps that way).
What I need to do is:
Aggregate the sum of "TotalTime" grouped by Operator
Calculate the time elapsed based on a condition:
If between 7AM and 4PM, calculate the time elapsed since 7AM of the current day
If after 4PM, return the total time between 7AM and 4PM of the current day
Divide the SUM(TotalTime) by the TimeElapsed (AKA the first list item / second list item) in order to get a rough estimate of labor hours worked vs. hours passed in the day.
This calculation would change every time the query is run. This will allow the Microsoft PowerApp that is pulling this query to refresh the efficiency measure in real time. I've taken a stab at it already - see below:
SELECT
md.Operator,
CASE
WHEN DATEADD(HOUR, -5, GETUTCDATE()) > CONVERT(DATETIME, CONVERT(DATE, DATEADD(HOUR, -5, GETUTCDATE()))) + '7:00' AND GETDATE() < CONVERT(DATETIME, CONVERT(DATE, DATEADD(HOUR, -5, GETUTCDATE()))) + '15:45'
THEN (SUM(isNull(md.TotalTime, 0)) + SUM(isNull(md.DelTime, 0))) * 1.0 / DATEDIFF(MINUTE, CONVERT(DATETIME, CONVERT(DATE, DATEADD(HOUR, -5, GETUTCDATE()))) + '7:00' , DATEADD(HOUR, -5, GETUTCDATE())) * 100.0
ELSE (SUM(isNull(md.TotalTime, 0)) + SUM(isNull(md.DelTime, 0))) / 420 * 100.0
END AS OpEfficiency
FROM
[Master Data] AS md
WHERE
md.[Date] = CONVERT(varchar(50), DATEADD(HOUR, -5, GETUTCDATE()), 101)
GROUP BY
md.Operator
Note: the DelTime is a different column regarding delay times. I am also converting back from UTC time to avoid any time zone issues when transferring to PowerApps.
However, this is horribly inefficient. I am assuming it is because the Date needs to be converted to datetime every single time. Would it work better if I had a calculated column that already had the date converted? Or is there a better way to calculate time elapsed since a certain time?
Thanks in advance.
There are a few things you can do to increase efficiency considerably. First, you want to make sure SQL can do a simple comparison when selecting rows, so you'll start by calculating a string to match your date on since your [Date] field is a string not a date.
Second, calculate the minutes in your shift (either 540 for a full shift or scaled down to 0 at 7 AM exactly) ahead of time so you aren't calculating minutes in each row.
Third, when summing for operators, use a simple sum on the minutes and calculate efficiency from that sum and your pre-calculated shift so far minutes.
One note - I'm casting the minutes-so-far as FLOAT in my example, maybe not the best type but it's clearer than other decimal types like DECIMAL(18,6) or whatever. Pick something that will show the scale you want.
My example uses a Common Table Expression to generate that date string and minutes-so-far FLOAT, that's nice because it fits in a direct query, view, function, or stored procedure, but you could DECLARE variables instead if you wanted to.
By filtering with an INNER JOIN on the [Date] string against the pre-calculated TargetDate string, I make sure the data set is pared down to the fewest records before doing any math on anything. You'll definitely want to INDEX [Date] to keep this fast as your table fills up.
All these together should give a pretty fast query. Good luck!
with cteNow as ( --Calculate once, up front - date as string, minutes elapsed as FLOAT (or any non-integer)
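SELECT CASE WHEN 60*DATEPART(HOUR, DATEADD(HOUR,-5,GETUTCDATE()))+DATEPART(MINUTE, DATEADD(HOUR,-5,GETUTCDATE())) > 60*16
--4PM local time (UTC-5), expressed in minutes; comparing in local time also covers evenings, when UTC has already rolled past midnight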
THEN CONVERT(float,(16-7)*60) --minutes in (4 PM-7 AM) * 60 minutes/hour
ELSE --Assume nobody is running this at 6 AM, so ELSE = between 7 and 4
CONVERT(float,60*DATEPART(HOUR, GETUTCDATE()) + DATEPART(MINUTE, GETUTCDATE()) - ((7+5)*60))
--Minutes since midnight minus minutes from midnight to 7 AM, shifted by
--UTC offset of 5 hours
END as MinutesToday --Minutes in today's shift so far
, FORMAT(DATEADD(HOUR,-5,GETUTCDATE()),'MM/dd/yyyy') as TargetDate --Date to search for
--as a string so no conversion in every row comparison. Also, index [Date] column
)
SELECT md.Operator, SUM(md.TotalTime) as TotalTime, SUM(md.TotalTime) / MinutesToday as Efficiency
FROM [Master Data] AS md INNER JOIN cteNow as N on N.TargetDate = md.[Date]
GROUP BY md.Operator, MinutesToday
BTW, you didn't make allowances for lunch or running before 7 AM, so I also ignored those. I think both could be addressed in cteNOW without adding much complexity.
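For anyone who wants to check the shift arithmetic outside SQL, here is a rough Python sketch of the minutes-so-far calculation. It assumes the timestamp passed in is already local time, and the names minutes_into_shift and efficiency are hypothetical. Unlike the query above, it also clamps times before 7 AM to zero instead of going negative.

```python
from datetime import datetime, time

SHIFT_START = time(7, 0)   # 7 AM local time
SHIFT_END = time(16, 0)    # 4 PM local time

def minutes_into_shift(now):
    """Minutes elapsed since shift start, clamped to the range 0..540."""
    start = datetime.combine(now.date(), SHIFT_START)
    end = datetime.combine(now.date(), SHIFT_END)
    clamped = min(max(now, start), end)
    return (clamped - start).total_seconds() / 60

def efficiency(total_minutes_worked, now):
    """Labor minutes divided by shift minutes so far; None before the shift starts."""
    elapsed = minutes_into_shift(now)
    return total_minutes_worked / elapsed if elapsed > 0 else None
```

For example, minutes_into_shift(datetime(2020, 5, 8, 9, 30)) gives 150.0, and any time after 4 PM gives the full 540. Returning None before 7 AM sidesteps the divide-by-zero that the SQL version would hit at exactly 7:00.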
I am comparing two dates in SQL Server and need following information in precise manner:
Input:
Start Date = 12/28/2015
End Date = 12/25/2020
Result returned (each piece in different column):
1. Days = 1825 (includes start/end date in calculation)
2. Years = 4
3. Months = 11
4. Days = 28
5. How many days in Start Month = 31 days (because it is December)
6. Occupancy in Start Month = (because December has 31 days, it would be 4/31) = .12903225806
7. How many days in End Month = 31 days (because it is December)
8. Occupancy in End Month = (because December has 31 days, it would be 25/31) = .8064516129
For 1-3, you can check https://www.w3schools.com/sql/func_sqlserver_datediff.asp. It's fairly simple, you can just use datediff and specify the unit of interest.
For 4, you can simply do day(Start Date) to get the date.
For 5 and 7, you can use:
SELECT DATEADD(s,-1,DATEADD(mm, DATEDIFF(m,0,@myDate)+1,0))
to get the end of month date of the given date and use day() to get the day part of the datetime object.
For 6 and 8, it's just simple math: you get the number of days in that month from 5 and 7, and then for the Start Date it is (day(eom(Start Date)) - day(Start Date) + 1) / day(eom(Start Date)), and for the End Date it is day(End Date) / day(eom(End Date)).
I didn't give you the full code as I don't have SQL Server running beside me. Hope it helps.
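Since I can't run T-SQL here, this is a quick Python sketch of the month-length and occupancy math, checked against the numbers in the question. The helper names are made up for illustration, and calendar.monthrange stands in for the end-of-month trick above.

```python
import calendar
from datetime import date

def days_in_month(d):
    """Number of days in the month that contains d."""
    return calendar.monthrange(d.year, d.month)[1]

def start_month_occupancy(start):
    """Share of the start month occupied, counting the start day itself."""
    dim = days_in_month(start)
    return (dim - start.day + 1) / dim

def end_month_occupancy(end):
    """Share of the end month occupied, counting the end day itself."""
    return end.day / days_in_month(end)

start, end = date(2015, 12, 28), date(2020, 12, 25)
total_days = (end - start).days + 1  # +1 so both endpoints are included
```

With these inputs total_days is 1825, the start month occupancy is 4/31 and the end month occupancy is 25/31, matching the expected results. In T-SQL, remember to force a decimal division (for example multiply by 1.0), otherwise integer division gives 0.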
I have a table like this:
Year Month Code Amount
---------------------------------------
2017 11 a 7368
2017 11 b 3542
2017 12 a 4552
2017 12 b 7541
2018 1 a 6352
2018 1 b 8376
2018 2 a 1287
2018 2 b 3625
I made slicers based on Year and Month (ignoring the Code), and I want to show the SUM of Amount like this:
If I select Year 2017 and Month 12 on the slicer, the value shown should be the SUM of Amount for 2017-11, and selecting Year 2018 and Month 1 should show the SUM of Amount for 2017-12.
I have tried this one for testing, but it is not allowed:
Last Month = CALCULATE(SUM(Table[Amount]); Table[Month] = SELECTEDVALUE(Table[Month]) - 1)
How to do it right?
I want something like this
NB: I use direct query to SQL Server
Update: So far, I have added a Last_Amount column to the SQL Server table using a sub-query. Maybe you guys have a better way to handle my issue.
The filters in a CALCULATE statement are only designed to take simple statements that don't have further calculations involved. There are a couple of possible remedies.
1. Use a variable to compute the previous month number before you use it in the CALCULATE function.
Last Month =
VAR PrevMonth = SELECTEDVALUE(Table[Month]) - 1
RETURN CALCULATE(SUM(Table[Amount]), Table[Month] = PrevMonth)
2. Use a FILTER() function. This is an iterator that allows more complex filtering.
Last Month = CALCULATE(SUM(Table[Amount]),
FILTER(ALL(Table),
Table[Month] = SELECTEDVALUE(Table[Month]) - 1))
Edit: Since you are using year and month, you need to have a special case for January.
Last Month =
VAR MonthFilter = MOD(SELECTEDVALUE(Table[Month]) - 2, 12) + 1
VAR YearFilter = IF(MonthFilter = 12,
SELECTEDVALUE(Table[Year]) - 1,
SELECTEDVALUE(Table[Year]))
RETURN CALCULATE(SUM(Table[Amount]),
Table[Month] = MonthFilter,
Table[Year] = YearFilter)
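If the MOD expression looks cryptic, here is the same wraparound logic as a small Python sketch (not DAX; the function names are hypothetical and the rows are the sample data from the question). Python's % behaves like DAX MOD for a positive divisor, so the arithmetic carries over directly.

```python
def previous_month(year, month):
    """Return (year, month) of the month before the given one."""
    prev_month = (month - 2) % 12 + 1      # 1 -> 12, 2 -> 1, ..., 12 -> 11
    prev_year = year - 1 if prev_month == 12 else year
    return prev_year, prev_month

# Sample rows from the question: (year, month, code, amount)
rows = [
    (2017, 11, "a", 7368), (2017, 11, "b", 3542),
    (2017, 12, "a", 4552), (2017, 12, "b", 7541),
    (2018, 1, "a", 6352), (2018, 1, "b", 8376),
    (2018, 2, "a", 1287), (2018, 2, "b", 3625),
]

def last_month_amount(year, month):
    """Sum of Amount for the month before the selected year and month."""
    target = previous_month(year, month)
    return sum(amount for y, m, _, amount in rows if (y, m) == target)
```

Selecting 2018 and month 1 resolves to (2017, 12), so the year only rolls back when the previous month comes out as December.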
We assume that the week starts from Mon and ends at Sunday.
When comparing with last year, how can we get the right dates?
For example, I want to compare this week (10/23/2017 to 10/29/2017) with the same week last year (10/24/2016 to 10/30/2016).
I am setting 2 parameters in SSRS, BeginDate and BeginDateLastYear. The value of BeginDate = 10/23/2017
How can I set BeginDateLastYear to exactly the Monday of the same week number last year (in this case it should be 10/24/2016)?
Currently i am trying to set the value of StartDate in last year like
BeginDateLastYear =DATEADD(DateInterval.Year,-1,Parameters!BeginDate.Value)
Also i tried to use
=DATEADD(DateInterval.Week,-52,Parameters!BeginDate.Value) , but not very sure will this work precisely to same week start date of last year
In reporting services you can use the following expression for BeginDateLastYear
= DATEADD(
DateInterval.WeekOfYear,
IIF(
DATEPART(DateInterval.WeekOfYear,DATEADD(DateInterval.WeekOfYear, -52 , Parameters!BeginDate.Value))<> DATEPART(DateInterval.WeekOfYear, Parameters!BeginDate.Value) ,
-53,
-52
) ,
Parameters!BeginDate.Value
)
Code logic:
If by going back 52 weeks the two week numbers do not match go back 53 weeks, else go back 52 weeks
In case you want to implement it in SQL without using the second parameter (SSRS has some issues on cascading parameters)
SET @BeginDateLastYear =
DATEADD(
wk,
CASE WHEN DATEPART(wk,DATEADD(wk, -52 , @BeginDate))<> DATEPART(wk,@BeginDate)
THEN -53
ELSE -52 END,
@BeginDate
)
)
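The 52-or-53 rule is easy to try out in a few lines of Python. This is only a sketch under one loud assumption: it uses ISO week numbers (weeks start on Monday), which is not the default numbering of DATEPART in SSRS or T-SQL, so results can differ around the turn of the year. The function name same_week_last_year is made up.

```python
from datetime import date, timedelta

def same_week_last_year(begin):
    """Go back 52 weeks; if the ISO week number changed, go back 53 instead."""
    candidate = begin - timedelta(weeks=52)
    # isocalendar()[1] is the ISO week number (assumption: ISO numbering)
    if candidate.isocalendar()[1] != begin.isocalendar()[1]:
        candidate = begin - timedelta(weeks=53)
    return candidate
```

For the date in the question, same_week_last_year(date(2017, 10, 23)) returns date(2016, 10, 24). Subtracting whole weeks keeps the weekday, so a Monday always maps to a Monday; the 53-week branch only kicks in when the previous year had an extra week.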
I have a large array with daily data from 1926 to 2012. I want to find out how many observations are in each year (it varies from year-to-year). I have a column vector which has the dates in the form of:
19290101
19290102
.
.
.
One year here is going to be July through June of the next year.
So 19630701 to 19640630
I would like to use this vector to find the number of days in each year. I need the number of observations to use as inputs into a regression.
I can't tell whether the dates are stored numerically or as a string of characters; I'll assume they're numbers. What I suggest doing is to convert each value to the year and then using hist to count the number of dates in each year. So try something like this:
year = floor(date/10000);
obs_per_year = hist(year,1926:2012);
This will give you a vector holding the number of observations in each year, starting from 1926.
Series of years starting July 1st:
bin = datenum(1926:2012,7,1);
Bin your vector of dates (as serial date numbers, e.g. converted with datenum, so they are comparable to the bin edges) within each year with bin(1) <= x < bin(2), bin(2) <= x < bin(3), ...
count = histc(dates,bin);
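As a cross-check on the July-to-June binning, here is a short Python sketch (not MATLAB) that works directly on the YYYYMMDD integers, assuming the dates really are stored as numbers. The names fiscal_year and observations_per_fiscal_year are hypothetical.

```python
from collections import Counter

def fiscal_year(yyyymmdd):
    """Year label for a July-June year: 19630701..19640630 -> 1963."""
    year = yyyymmdd // 10000
    month = (yyyymmdd // 100) % 100
    return year if month >= 7 else year - 1

def observations_per_fiscal_year(dates):
    """Count how many dates fall in each July-June year."""
    return Counter(fiscal_year(d) for d in dates)

sample = [19630630, 19630701, 19630702, 19640630, 19640701]
counts = observations_per_fiscal_year(sample)
```

Each year is labeled by the calendar year in which it starts, so 19640630 still counts toward 1963 while 19640701 opens 1964.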