I have a table (SYS_Holidays) that have start and end dates of each holiday period. I need to output all holiday dates in a relational form. For example, I have 25-Dec-2017 to 2-Jan-2018 as one row in the input, I want to output 25-Dec, 26-Dec ... through 2-Jan as 9 rows.
I have written this script, could you please tell me how I can make it more efficient?
SELECT
H.HolidayName
, DATEADD(DAY, Number-1, H.StartDate) AS HolidayDate
FROM
SYS_Holidays AS H
CROSS JOIN Config_Numbers AS N
WHERE
-- Figure the # of days between start and end: one row for each holiday-date
-- If EndDate is null, just use StartDate (i.e. 1-day holiday)
N.Number <= DATEDIFF(DAY, H.StartDate, ISNULL(H.EndDate, H.StartDate) ) + 1
NB: Config_Numbers is a table I have created with a huge list of integers (as BIGINT)
It can be done using a date table and an inner join. Use the subquery to make a table if it is not efficient enough:
Create table #Test (HolidayName nvarchar(100), StartDate Date, EndDate Date)
Insert Into #Test Values ('Christmas', '2017-12-22', '2018-01-03'), ('Easter' , '2017-04-10', '2017-04-16')
SELECT HolidayName, DatesList.[Date] as HolidayDate
FROM #Test t
inner join (
SELECT cast(dateadd(day, number, '2017-1-1') as date) as [Date]
FROM master..spt_values WHERE type='P' AND number < 1000) AS DatesList
on t.StartDate<=DatesList.[Date] and t.EndDate>=DatesList.[Date]
I modified #cloudsafe's answer, to yield the code below. It is still much faster than any of the joins using Config_Numbers. Subtree cost came to ~0.2785.
I figured that 2048 can cover a little more than 5 years, so I broke my code up into 5-year blocks, and did a UNION to join them up.
Trouble is, I'd have to remember to do another UNION every 5-years :-(
SELECT HolidayName, DatesList.[Date] as HolidayDate
FROM SYS_Holidays AS H
inner join (
SELECT cast(dateadd(day, number, '2013-01-01') as date) as [Date]
FROM master..spt_values WHERE type='P' AND number < 2048) AS DatesList
on H.StartDate <= DatesList.[Date] and H.EndDate >=DatesList.[Date]
UNION
SELECT HolidayName, DatesList.[Date] as HolidayDate, H.HolidayId, H.CampusId, H.CategoryId
FROM SYS_Holidays AS H
inner join (
SELECT cast(dateadd(day, number, '2018-01-01') as date) as [Date]
FROM master..spt_values WHERE type='P' AND number < 2048) AS DatesList
on H.StartDate <= DatesList.[Date] and H.EndDate >=DatesList.[Date]
Any further suggestions for improvement, please?
Related
Hope you all have a fantastic time!
I am trying to create a temporary column with date values from the past 14 days.
so in essence, it should for example the below list whenever i re-call this stored procedure:
Reason for this is that we might not have data for a specific date, and we need to show that the count returns zero.
So the idea is to create a column, and then join with another table i created based on the date to do the counts per day.
I tried below code, but it does not work:
declare #a int
set #a = 1
select
[Date] = while (#a <= 14)
begin
cast(dateadd(day, -#a, getdate()) as date)
set #a = #a + 1
end
into #Temp1
select
*
from #Temp1
Any help will be greatly appreciated!
Using a simulated numbers table in a cte (you would ideally have a permanent table) you can simply do
with numbers as (
select top (14) Row_Number() over (order by (select null)) as n from sys.sysobjects
)
select cast(dateadd(day, -n, getdate()) as date)
from numbers
Instead of querying sys.sysobjects unnecessarily, you can generate the table on the fly with no reads, by using a VALUES constructor.
with numbers as (
select Row_Number() over (order by (select null)) as n
from (values(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) v(x)
)
select cast(dateadd(day, -n, getdate()) as date)
from numbers
I want to sum values where date is between de creationdate and endDate,, hence ValueEnd.
For instances the second row, the creationDate is the same as the endDate, so I have to sum the ValuePerDay of this day to the previsou value. So in the column ValueEnd it is 3.4+1.17 = 4.57
I started by calculating the sum from the days where de Difference is 1, like this:
SELECT
CONVERT(CHAR(10), CreationDate,103) CreationDate
,CONVERT(CHAR(10), EndDate,103) EndDate
,SUM(Values_an) Values_an
FROM Dat1
WHERE Difference=1
GROUP BY CONVERT(CHAR(10), CreationDate,103), CONVERT(CHAR(10), EndDate,103), Difference
However, I'm having trouble sum the values where the difference if higher than 1. Can someone help me please?
OK, judging by the provided information - and as far as I understood everything right - the following approach might solve your problem:
DECLARE #t TABLE(
CreationDate date,
EndDate date,
Value_An decimal(19,4)
)
INSERT INTO #t VALUES
('2019-03-01', '2019-03-01', 3.4)
,('2019-03-01', '2019-03-03', 3.5)
,('2019-05-01', '2019-05-01', 3.6)
,('2019-06-01', '2019-06-04', 3.7)
;WITH cteMultiRow AS(
SELECT CreationDate, COUNT(*) cntRows
FROM #t
GROUP BY CreationDate
HAVING COUNT(*) > 1
),
cte AS(
SELECT t.*
,ROW_NUMBER() OVER (PARTITION BY t.CreationDate ORDER BY t.EndDate) AS rn
,DATEDIFF(d, t.CreationDate, t.EndDate)+1 AS Difference
,CASE WHEN m.CreationDate IS NOT NULL THEN t.Value_An/(DATEDIFF(d, t.CreationDate, t.EndDate)+1) ELSE t.Value_An END AS ValuePerD
FROM #t t
LEFT JOIN cteMultiRow m ON t.CreationDate = m.CreationDate
),
cteSums AS(
SELECT c.CreationDate, SUM(c.ValuePerD) AS ValuePerD
FROM cte c
GROUP BY c.CreationDate
)
SELECT c.CreationDate, c.EndDate, c.Value_An, c.Difference, c.ValuePerD, ISNULL(s.ValuePerD, c.Value_An) AS ValueEnd
FROM cte c
LEFT JOIN cteSums s ON c.CreationDate = s.CreationDate AND c.rn = 1
I have created a stored procedure to get data. In this stored procedure, I have returned 1 table and table stores the data above 1 lakh + data. So right now I have run the stored procedure that time I get the data in above 1 minute time take. I want just with in 1 second get data. I have set also SET NOCOUNT ON; and also create missing index. Still I am getting same time for the get data.
This is my query:
DECLARE #CurMon int
DECLARE #year nvarchar(max)
SELECT #CurMon = month(getdate())
SELECT #year = year(getdate())
SELECT
FORMAT(dateadd(MM, T.i, getdate()),'MMM-yy') AS DateColumn,
ISNULL(uf.TotalCount, 0) as TotalCount
FROM
(VALUES (12-#CurMon),(11-#CurMon),(10-#CurMon),(9-#CurMon),(8-#CurMon),(7-#CurMon),(6-#CurMon), (5-#CurMon), (4-#CurMon), (3-#CurMon), (2-#CurMon), (1-#CurMon)) AS T(i)
OUTER APPLY
(SELECT DISTINCT
COUNT(datepart(MM,UF.InsertDateTime)) OVER (partition by datepart(MM,UF.InsertDateTime)) AS TotalCount
FROM dbo.UserFollowers UF
INNER JOIN dbo.Users U on U.UserId = UF.FollowerId
WHERE DATEDIFF(mm,UF.InsertDateTime, DATEADD(mm, T.i, GETDATE())) = 0 and UF.IsFollowed = 1
) uf
order by DATEPART(MM,convert(datetime,FORMAT(dateadd(MM, T.i, getdate()),'MMMM') +'01 '+#year,110))
i am also try some other query for the improve speed of query but still i am getting same time. here this query also print.
declare #StartDate datetime = dateadd(year , datediff(year , 0, getdate() ) , 0)
declare #tempT2 table
(
MNo int,
[Month] datetime,
NextMonth datetime)
;with Months as (
select top (12)
MNo = row_number() over (order by number)
,[Month] = dateadd(month, row_number() over (order by number) -1, #StartDate)
, NextMonth = dateadd(month, row_number() over (order by number), #StartDate)
from master.dbo.spt_values
)
insert into #tempT2
select * from Months
select
m.MNo
, Month = format(m.Month, 'MMM-yy')
, tally = count(UF.InsertDateTime)
from #tempT2 m
left join dbo.UserFollowers UF
INNER JOIN dbo.Users U on U.UserId = UF.FollowerId
on UF.InsertDateTime >= m.Month
and UF.InsertDateTime < m.NextMonth where UF.IsFollowed = 1
group by m.MNo,format(m.Month, 'MMM-yy')
order by MNo
here this is my both query i have try but still i am not getting success for the improve the speed of the query. and sorry but i can not see here my execution plan of the query actually i have not permission for that.
You can gain a little bit of performance by switching to a temporary table instead of a table variable, and by getting rid of format():
declare #StartDate datetime = dateadd(year , datediff(year , 0, getdate() ) , 0)
create table #Months (
MNo int not null primary key
, Month char(6) not null
, MonthStart datetime not null
, NextMonth datetime not null
)
;with Months as (
select top (12)
MNo = row_number() over (order by number)
, MonthStart = dateadd(month, row_number() over (order by number) -1, #StartDate)
, NextMonth = dateadd(month, row_number() over (order by number), #StartDate)
from master.dbo.spt_values
)
insert into #Months (MNo, Month, MonthStart, NextMonth)
select
MNo
, Month = stuff(convert(varchar(9),MonthStart,6),1,3,'')
, MonthStart
, NextMonth
from Months;
select
m.MNo
, m.Month
, tally = count(UF.InsertDateTime)
from #tempT2 m
inner join dbo.Users U
on UF.InsertDateTime >= m.MonthStart
and UF.InsertDateTime < m.NextMonth
inner join dbo.UserFollowers UF
on U.UserId = UF.FollowerId
and UF.IsFollowed = 1
group by
m.MNo
, m.Month
order by MNo
After that, you should evaluate the execution plan to determine if you need a better indexing strategy.
If you still need it to go faster, you could create an actual calendar table and look into creating an indexed view. An indexed view can be a chore get it to behave correctly depending on your sql server version, but will be faster.
Reference:
format performance - Aaron Bertrand
What is the difference between a temp table and table variable in SQL Server? - Answer by Martin Smith
When should I use a table variable vs temporary table in sql server? - Answer by Martin Smith
Indexed Views and Statistics - Paul White
Generate a set or sequence without loops - 2 - Aaron Bertrand
The "Numbers" or "Tally" Table: What it is and how it replaces a loop - Jeff Moden
Creating a Date Table/Dimension in sql Server 2008 - David Stein
Calendar Tables - Why You Need One - David Stein
Creating a date dimension or calendar table in sql Server - Aaron Bertrand
I am fairly new to SSIS, and now I have this requirement to exclude weekends in order to do a performance management. Now I have created a calendar and marked the weekends; what I am trying to do, using SSIS, is get the start and end date of every status and count how many weekends are there. I am kind of struggling to know which component to use to achieve this task.
So I have mainly two tables:
1- Table Calendar
2- Table History-Log
Calendar has the following columns:
1- ID
2- date
3- year
4- month
5- day of week
6- isweekend
History-Log has the following:
1- ID
2- Status
3- startdate
4- enddate
Your help is really appreciated.
I'm not an SSIS user, so apologies if this answer does not help, but if I wanted to get the result you describe, based on some test data:
DECLARE #Calendar TABLE (
ID INT,
[Date] DATETIME,
[Year] INT,
[Month] INT,
[DayOfWeek] VARCHAR(10),
IsWeekend BIT
)
DECLARE #HistoryLog TABLE (
ID INT,
[Status] INT,
StartDate DATETIME,
EndDate DATETIME
)
DECLARE #StartDate DATE = '20100101', #NumberOfYears INT = 10
DECLARE #CutoffDate DATE = DATEADD(YEAR, #NumberOfYears, #StartDate);
INSERT INTO #Calendar
SELECT ROW_NUMBER() OVER (ORDER BY d) AS ID,
d AS [Date],
DATEPART(YEAR,d) AS [Year],
DATEPART(MONTH,d) AS [Month],
DATENAME(WEEKDAY,d) AS [DayOfWeek],
CASE WHEN DATENAME(WEEKDAY,d) IN ('Saturday','Sunday') THEN 1 ELSE 0 END AS IsWeekend
FROM
(
SELECT d = DATEADD(DAY, rn - 1, #StartDate)
FROM
(
SELECT TOP (DATEDIFF(DAY, #StartDate, #CutoffDate))
rn = ROW_NUMBER() OVER (ORDER BY s1.[object_id])
FROM sys.all_objects AS s1
CROSS JOIN sys.all_objects AS s2
ORDER BY s1.[object_id]
) AS x
) AS y;
INSERT INTO #HistoryLog
SELECT 1, 3, '2016-01-05', '2016-01-20'
UNION
SELECT 2, 7, '2016-01-08', '2016-01-25'
UNION
SELECT 3, 4, '2016-01-01', '2016-02-03'
UNION
SELECT 4, 3, '2016-02-09', '2016-02-10'
I would use a query like this to return all of the HistoryLog records with a count of the number of weekend days between their StartDate and EndDate:
SELECT h.ID,
h.[Status],
h.StartDate,
h.EndDate,
COUNT(c.ID) AS WeekendDays
FROM #HistoryLog h
LEFT JOIN #Calendar c ON c.[Date] >= h.StartDate AND c.[Date] <= h.EndDate AND c.IsWeekend = 1
GROUP BY h.ID, h.[Status], h.StartDate, h.EndDate
ORDER BY 1
If you wanted to know the number of weekends, rather than the number of weekend days, we'd need to slightly amend this logic (and define how a range containing only one weekend day - or one starting on a Sunday and ending on a Saturday inclusive - should be handled). Assuming you just want to know how many distinct weekends are at least partially within the date range, you could do:
SELECT h.ID,
h.[Status],
h.StartDate,
h.EndDate,
COUNT(weekends.ID) AS Weekends
FROM #HistoryLog h
LEFT JOIN
(
SELECT c.ID,
c.[Date] AS SatDate,
DATEADD(DAY,1,c.[Date]) AS SunDate
FROM #Calendar c
WHERE c.[DayOfWeek] = 'Saturday'
) weekends ON h.StartDate BETWEEN weekends.SatDate AND weekends.SunDate
OR h.EndDate BETWEEN weekends.SatDate AND weekends.SunDate
OR (h.StartDate <= weekends.SatDate AND h.EndDate >= weekends.SunDate)
GROUP BY h.ID, h.[Status], h.StartDate, h.EndDate
I'm tasked with the following:
Select a list of all customers who had their nth order during a certain date range (usually a specific month).
This list needs to contain: customer id, sum of first n orders
My tables are something like this:
[dbo.customers]: customerID
[dbo.orders]: orderID, customerID,
orderDate, orderTotal
Here is what I've tried so far:
-- Let's assume our threshold (n) is 10
-- Let's assume our date range is April 2013
-- Get customers that already had n orders before the beginning of the given date range.
DECLARE #tmpcustomers TABLE (tmpcustomerID varchar(8))
INSERT INTO
#tmpcustomers
SELECT
c.customerID
FROM
orders o
INNER JOIN customers c ON o.customerID = c.customerID
WHERE
o.orderDate < '2013-04-01'
GROUP BY c.customerID
HAVING (COUNT(o.orderID) >= 10)
-- Now get all customers that have n orders sometime within the given date range
-- but did not have n orders before the beginning of the given date range.
SELECT
a.customerID, SUM(orderTotal) AS firstTenOrderTotal
SELECT
o.customerID, o.orderID, o.orderTotal
FROM
orders o
INNER JOIN customers c ON c.customerID = o.customerID
WHERE
a.customerID NOT IN ( SELECT tmpcustomerID FROM #tmpcustomers )
AND
o.orderDate > '2013-04-01'
AND
o.orderDate < '2013-05-01'
GROUP BY c.customerID
HAVING COUNT(o.orderID) >= 10
This seems to work but it's clunky and slow. Another big problem is that the firstTenOrderTotal is actually the SUM of the total amount of orders by the end of the given date range and not necessarily the first 10.
Any suggestions for a better approach would be much appreciated.
In the insert to #tmpcustomers, why are you joining back to the customer table? The order table already has the customerID that you want. Also, why are you looking for orders where the order date is before your date range? Don't you just want customers with more than n orders between a date range? This will make the second query easier.
By only having the customers with n or more orders in the table variable #tmpcustomers, you should just be able to join it and the orders table in the second query to get the sum of all the orders for those customers where you would once again limit order table records to your date range (so you do not get orders outside of that range). This will remove the having statement and the join to the customers table in your final result query.
Give this a try. Depending on your order distribution it may perform better. In this query im assembling the list of orders in the range, and then looking back to count the number of prior orders (also grabbing the orderTotal).
note: I am assuming the orderID increments as orders are placed.
If this isnt the case just use a row_number over the date to project the sequence into the query.
declare #orders table (orderID int primary key identity(1,1), customerID int, orderDate datetime, orderTotal int)
insert into #orders (customerID, orderDate, orderTotal)
select 1, '2013-01-01', 1 union all
select 1, '2013-01-02', 2 union all
select 1, '2013-02-01', 3 union all
select 2, '2013-01-25', 5 union all
select 2, '2013-01-26', 5 union all
select 2, '2013-02-02', 10 union all
select 2, '2013-02-02', 10 union all
select 2, '2013-02-04', 20
declare #N int, #StartDate datetime, #EndDate datetime
select #N = 3,
#StartDate = '2013-02-01',
#EndDate = '2013-02-20'
select o.customerID,
[total] = o.orderTotal + p.total --the nth order + total prior
from #orders o
cross
apply ( select count(*)+1, sum(orderTotal)
from #orders
where customerId = o.customerID and
orderID < o.orderID and
orderDate <= o.orderDate
) p(n, total)
where orderDate between #StartDate and #EndDate and p.n = #N
Here is my suggestion:
Use Northwind
GO
select ords.OrderID , ords.OrderDate , '<-->' as Sep1 , derived1.* from
dbo.Orders ords
join
(
select CustomerID, OrderID, ROW_NUMBER() OVER(PARTITION BY CustomerID ORDER BY OrderId DESC) AS ThisCustomerCardinalOrderNumber from dbo.Orders
) as derived1
on ords.OrderID = derived1.OrderID
where
derived1.ThisCustomerCardinalOrderNumber = 3
and ords.OrderDate between '06/01/1997' and '07/01/1997'
EDIT:::::::::
I took my CTE example, and reworked it for multiple Customers (seen below).
Give it the college try.
Use Northwind
GO
declare #BeginDate datetime
declare #EndDate datetime
select #BeginDate = '01/01/1900'
select #EndDate = '12/31/2010'
;
WITH
MyCTE /* http://technet.microsoft.com/en-us/library/ms175972.aspx */
( ShipName,ShipAddress,ShipCity,ShipRegion,ShipPostalCode,ShipCountry,CustomerID,CustomerName,[Address],
City,Region,PostalCode,Country,Salesperson,OrderID,OrderDate,RequiredDate,ShippedDate,ShipperName,
ProductID,ProductName,UnitPrice,Quantity,Discount,ExtendedPrice,Freight,ROWID) AS
(
SELECT
ShipName ,ShipAddress,ShipCity,ShipRegion,ShipPostalCode,ShipCountry,CustomerID,CustomerName,[Address]
,City ,Region,PostalCode,Country,Salesperson,OrderID,OrderDate,RequiredDate,ShippedDate,ShipperName
,ProductID ,ProductName,UnitPrice,Quantity,Discount,ExtendedPrice,Freight
, ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY OrderDate , ProductName ASC ) as ROWID /* Note that the ORDER BY (here) is directly related to the ORDER BY (near the very end of the query) */
FROM
dbo.Invoices inv /* “Invoices” is a VIEW, FYI */
where
(inv.OrderDate between #BeginDate and #EndDate)
)
SELECT
/*
ShipName,ShipAddress,ShipCity,ShipRegion,ShipPostalCode,ShipCountry,CustomerID,CustomerName,[Address],
City,Region,PostalCode,Country,Salesperson,OrderID,OrderDate,RequiredDate,ShippedDate,ShipperName,
ProductID,ProductName,UnitPrice,Quantity,Discount,ExtendedPrice,Freight,
*/
/*trim the list down a little for the final output */
CustomerID ,OrderID , OrderDate, (ExtendedPrice + Freight) as ComputedTotal
/*The below line is the “trick”. I reference the above CTE, but only get data that is less than or equal to the row that I am on (outerAlias.ROWID)*/
, (Select SUM (ExtendedPrice + Freight) from MyCTE innerAlias where innerAlias.ROWID <= outerAlias.ROWID and innerAlias.CustomerID = outerAlias.CustomerID) as RunningTotal
, ROWID as ROWID_SHOWN_FOR_KICKS , OrderDate as OrderDate
FROM
MyCTE outerAlias
GROUP BY CustomerID ,OrderID, OrderDate, ProductName,(ExtendedPrice + Freight) ,ROWID,OrderDate
/*Two Order By Options*/
ORDER BY outerAlias.CustomerID , outerAlias.OrderDate , ProductName
/* << Whatever the ORDER BY is here, should match the “ROW_NUMBER() OVER ( ORDER BY ________ ASC )” statement inside the CTE */
/*ORDER BY outerAlias.ROWID */ /* << Or, to keep is more “trim”, ORDER BY the ROWID, which will of course be the same as the “ROW_NUMBER() OVER ( ORDER BY” inside the CTE */