Multiple date ranges using CTE - sql-server

I need to generate a table of half hour periods. I have the following which works:
WITH ctePeriods AS
(
SELECT @gapStart AS HalfHourPeriod
UNION ALL
SELECT DATEADD(MINUTE, 30, HalfHourPeriod)
FROM ctePeriods
WHERE HalfHourPeriod < DATEADD(MINUTE, -30, @gapEnd)
)
Which gives me the values for the range between @gapStart and @gapEnd.
However I also have a table of ranges which I need to generate:
create table #gaps(HHFrom datetime, HHTo datetime)
Currently I'm using this to get the values for @gapStart and @gapEnd used above by getting the min and max from #gaps. But this means I'm filling in more rows than I need in ctePeriods.
Is there any way that I can use the rows in #gaps within ctePeriods so I only create the rows that I need?

I personally prefer using a Tally Table for things like this. You can use a persisted Tally Table, or create one on the fly (as I do here):
CREATE TABLE #gaps (HHFrom datetime,
HHTo datetime);
INSERT INTO #gaps (HHFrom,
HHTo)
VALUES('20190101','20190103'),
('20190217','20190315'),
('20190708',GETDATE());
GO
WITH N AS(
SELECT N
FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))N(N)),
Tally AS(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) -1 AS I
FROM N N1, N N2, N N3, N N4, N N5, N N6), --1000000 rows, feel free to increase/decrease per your own requirement
Dates AS(
SELECT G.HHFrom,
G.HHTo,
DATEADD(MINUTE, 30*T.I, G.HHFrom) AS HH
FROM #gaps G
CROSS JOIN Tally T
WHERE DATEADD(MINUTE, 30*T.I, G.HHFrom) <= G.HHTo)
SELECT *
FROM Dates D
ORDER BY D.HHFrom, D.HH;
GO
DROP TABLE #gaps;
Unlike a recursive CTE (rCTE), this statement won't "fall over" on large ranges: an rCTE fails once it exceeds 100 levels of recursion (the default limit, unless you raise it with OPTION (MAXRECURSION n)), whereas the Tally approach involves no recursion at all.
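For comparison, the recursive approach from the question can also be seeded directly from #gaps, so that each range is expanded on its own. This is only a sketch, assuming the #gaps table above still exists (run it before the DROP TABLE). The OPTION (MAXRECURSION 0) hint removes the 100-level limit, which any range longer than about 50 hours would otherwise hit.

```sql
-- Sketch: one anchor row per range in #gaps, then step forward 30 minutes at a time.
WITH ctePeriods AS
(
    SELECT G.HHFrom, G.HHTo, G.HHFrom AS HalfHourPeriod
    FROM #gaps G
    UNION ALL
    SELECT P.HHFrom, P.HHTo, DATEADD(MINUTE, 30, P.HalfHourPeriod)
    FROM ctePeriods P
    WHERE DATEADD(MINUTE, 30, P.HalfHourPeriod) <= P.HHTo
)
SELECT HHFrom, HalfHourPeriod
FROM ctePeriods
ORDER BY HHFrom, HalfHourPeriod
OPTION (MAXRECURSION 0); -- 0 = no limit on recursion depth
```

The Tally version is still likely to perform better on long ranges, since the recursive version produces its rows one level at a time.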

Related

How can I get, and re-use, the next 6 *dates* from today in SQL Server?

I need to create a CTE I can re-use that will hold seven dates. That is today and the next six days.
So, output for today (4/22/2022) should be:
2022-04-22
2022-04-23
2022-04-24
2022-04-25
2022-04-26
2022-04-27
2022-04-28
So far, I have this:
WITH seq AS
(
SELECT 0 AS [idx]
UNION ALL
SELECT [idx] + 1
FROM seq
WHERE [idx] < 6
)
SELECT DATEADD(dd, [idx], CONVERT(date, GETDATE()))
FROM seq;
The problem is my SELECT is outside the WITH, so I would need to wrap this whole thing with another WITH to re-use it, for example to JOIN on it as a list of dates, and I'm not having luck getting that nested WITH to work. How else could I accomplish this?
To be clear: I'm not trying to find records in a specific table full of dates that are from the next seven days. There are plenty of easy solutions for that. I need a list of dates for today and the next six days, that I can re-use in other queries as a CTE.
You're close. Here's an example:
with cte as (
select
1 as n
,GETDATE() as dt
union all
select
n+1
,DATEADD(dd,n,GETDATE()) as dt
from cte
where n <= 6
)
select * from cte
You can create a view for reusability and simply query the view rather than using the same CTE over and over again.
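As a rough sketch of the view suggestion, the recursive CTE can be wrapped as shown here. The view name dbo.vw_Next7Days is hypothetical, and the dates are cast to date (an assumption, to match the output format in the question).

```sql
-- CREATE VIEW must be the first statement in its batch.
CREATE VIEW dbo.vw_Next7Days   -- hypothetical name
AS
WITH cte AS (
    SELECT 1 AS n, CAST(GETDATE() AS date) AS dt
    UNION ALL
    SELECT n + 1, DATEADD(DAY, n, CAST(GETDATE() AS date))
    FROM cte
    WHERE n <= 6
)
SELECT dt
FROM cte;
GO

-- Re-use it anywhere a table could appear:
SELECT dt
FROM dbo.vw_Next7Days
ORDER BY dt;
```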
You can do this by adding a second column for the date to the CTE:
WITH seq AS (
SELECT 0 AS [idx], cast(current_timestamp as date) as date
UNION ALL
SELECT [idx] + 1, dateadd(dd, idx+1, cast(current_timestamp as date))
FROM seq
WHERE [idx] < 6
)
SELECT *
FROM seq;
See it here:
https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=208ecd76be2071529078f38b1735b0cd
Another option is you can "stack" CTEs, rather than nest, to avoid the second column:
WITH seq0 AS (
SELECT 0 AS [idx]
UNION ALL
SELECT [idx] + 1
FROM seq0
WHERE [idx] < 6
),
seq As (
SELECT dateadd(dd, idx, cast(current_timestamp as date)) as idx
FROM seq0
)
SELECT *
FROM seq;
Note how the final query only needed to reference the 2nd CTE.
See it here:
https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=22315438e4710792f368009cc6ff6451
I'd never recommend using recursion if you don't need it. It's more complex and slower. I'd just use a hardcoded list of numbers, which you could encapsulate in a TVF if you wanted to reuse it across different stored procedures/functions. If you need to reuse it in multiple places within one stored procedure, I'd just put it in a temp table.
CTE Version without Recursion
WITH cte_7days AS (
SELECT theDate = CAST(DATEADD(dd,num,GETDATE()) AS DATE)
FROM (VALUES (0),(1),(2),(3),(4),(5),(6)) AS A(num)
)
SELECT *
FROM cte_7days
CROSS APPLY Version to Remove Need for CTE
You could use something like this as your base query, and then add more joins below it depending on what your query needs.
SELECT theDate
FROM (VALUES (0),(1),(2),(3),(4),(5),(6)) AS A(num)
CROSS APPLY (SELECT theDate = CAST(DATEADD(DAY,num,GETDATE()) AS DATE)) AS B
TVF Version
CREATE FUNCTION dbo.uf_7days()
RETURNS TABLE AS
RETURN
(
SELECT theDate
FROM (VALUES (0),(1),(2),(3),(4),(5),(6)) AS A(num)
CROSS APPLY (SELECT theDate = CAST(DATEADD(DAY,num,GETDATE()) AS DATE)) AS B
)
GO
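Once the function exists, it can be joined like any table. A minimal usage sketch, assuming a hypothetical dbo.Orders table with OrderID and DueDate (date) columns:

```sql
-- dbo.Orders, OrderID and DueDate are made-up names for illustration.
SELECT d.theDate,
       COUNT(o.OrderID) AS OrdersDue   -- 0 for days with no matching orders
FROM dbo.uf_7days() AS d
LEFT JOIN dbo.Orders AS o
       ON o.DueDate = d.theDate
GROUP BY d.theDate
ORDER BY d.theDate;
```

The LEFT JOIN keeps all seven dates in the output even when no rows match.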

Get random data from SQL Server without performance impact

I need to select random rows from my SQL table. When I searched for this on Google, the suggestion was to use ORDER BY NEWID(), but that hurts performance. Since my table has more than 2,000,000 rows, this solution does not suit me.
I tried this code to get random data :
SELECT TOP 10 *
FROM Table1
WHERE (ABS(CAST((BINARY_CHECKSUM(*) * RAND()) AS INT)) % 100) < 10
It also drops performance sometimes.
Could you please suggest a good solution for getting random data from my table? I only need a small number of rows, around 30 per request. I tried TABLESAMPLE, but it returns nothing once I add my WHERE condition, because it samples by page rather than by row.
Try calculating the random IDs first, then use them to filter your big table.
Since your key is not an identity column, you need to number the records, and this will affect performance.
Note that I have used DISTINCT to make sure the random numbers are all different.
EDIT: I have modified the query to use an arbitrary filter on your big table.
declare @n int = 30
;with
t as (
-- EXTRACT DATA AND NUMBER ROWS
select *, ROW_NUMBER() over (order by YourPrimaryKey) n
from YourBigTable t
-- SOME FILTER
WHERE 1=1 /* <-- PUT HERE YOUR COMPLEX FILTER LOGIC */
),
r as (
-- RANDOM NUMBERS BETWEEN 1 AND COUNT(*) OF FILTERED TABLE
select distinct top (@n) abs(CHECKSUM(NEWID()) % n)+1 rnd
from sysobjects s
cross join (SELECT MAX(n) n FROM t) t
)
select t.*
from t
join r on r.rnd = t.n
If your uniqueidentifier key is a random GUID (not generated with NEWSEQUENTIALID() or UuidCreateSequential), you can use the method below. This will use the clustered primary key index without sorting all rows.
SELECT t1.*
FROM (VALUES(
NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID())
,(NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID())
,(NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID())) AS ThirtyKeys(ID)
CROSS APPLY(SELECT TOP (1) * FROM dbo.Table1 WHERE ID >= ThirtyKeys.ID) AS t1;

T-SQL - get only latest row for selected condition

I have a table of measurements with columns SERIAL_NBR, DATE_TIME, VALUE.
There is a lot of data, so when I need to get the last 48 hours for 2000 devices, this query
Select * from MY_TABLE where [TIME] >= DATEADD (hh, -48, @TimeNow)
takes a very long time.
Is there a way not to receive all the rows for each device, but only the latest entry? Would this speed up the query execution time?
Assuming there is a column named deviceId (change as per your needs), you can use TOP 1 WITH TIES with the window function ROW_NUMBER:
Select top 1 with ties *
from MY_TABLE
where [TIME] >= DATEADD (hh, -48, @TimeNow)
Order by row_number() over (
partition by deviceId
order by Time desc
);
You can simply create Common Table Expression that sorts and groups the entries and then pick the latest one from there.
;WITH numbered
AS ( SELECT [SERIAL_NBR], [TIME], [VALUE], row_nr = ROW_NUMBER() OVER (PARTITION BY [SERIAL_NBR] ORDER BY [TIME] DESC)
FROM MY_TABLE
WHERE [TIME] >= DATEADD (hh, -48, @TimeNow) )
SELECT [SERIAL_NBR], [TIME], [VALUE]
FROM numbered
WHERE row_nr = 1 -- we want the latest record only
Depending on the amount of data and the indexes available this might or might not be faster than Anthony Hancock's answer.
Similar to his answer you might also try the following:
(from MSSQL's point of view, the below query and Anthony's query are pretty much identical and they'll probably end up with the same query plan)
SELECT M.[SERIAL_NBR], M.[TIME], M.[VALUE] -- qualified: SERIAL_NBR exists in both M and L
FROM MY_TABLE AS M
JOIN (SELECT [SERIAL_NBR] , max_time = MAX([TIME])
FROM MY_TABLE
GROUP BY [SERIAL_NBR]) AS L -- latest
ON L.[SERIAL_NBR] = M.[SERIAL_NBR]
AND L.max_time = M.[TIME]
WHERE M.[TIME] >= DATEADD(hh,-48,@TimeNow)
Your listed column values and your code don't quite match up so you'll probably have to change this code a little, but it sounds like for each SERIAL_NBR you want the record with the highest DATE_TIME in the last 48 hours. This should achieve that result for you.
SELECT SERIAL_NBR,DATE_TIME,VALUE
FROM MY_TABLE AS M
WHERE M.DATE_TIME >= DATEADD(hh,-48,@TimeNow)
AND M.DATE_TIME = (SELECT MAX(_M.DATE_TIME) FROM MY_TABLE AS _M WHERE M.SERIAL_NBR = _M.SERIAL_NBR)
This will get you details of the latest record per serial number:
Select t.SERIAL_NBR, q.FieldsYouWant
from MY_TABLE t
outer apply
(
select top 1 t2.FieldsYouWant
from MY_TABLE t2
where t2.SERIAL_NBR = t.SERIAL_NBR
order by t2.[TIME] desc
)q
where t.[TIME] >= DATEADD (hh, -48, @TimeNow)
Also, it's worth putting DATEADD (hh, -48, @TimeNow) into a variable rather than calculating it inline.
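A minimal sketch of that last suggestion, combined with the ROW_NUMBER approach from the earlier answer. How @TimeNow gets its value is not shown in the question, so GETDATE() is an assumption here, and @Cutoff is a made-up variable name.

```sql
DECLARE @TimeNow datetime = GETDATE();                      -- assumed source of "now"
DECLARE @Cutoff  datetime = DATEADD(HOUR, -48, @TimeNow);   -- computed once, up front

WITH numbered AS (
    SELECT SERIAL_NBR, [TIME], [VALUE],
           ROW_NUMBER() OVER (PARTITION BY SERIAL_NBR ORDER BY [TIME] DESC) AS row_nr
    FROM MY_TABLE
    WHERE [TIME] >= @Cutoff
)
SELECT SERIAL_NBR, [TIME], [VALUE]
FROM numbered
WHERE row_nr = 1;   -- latest row per device within the window
```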

How to select Top % in T-SQL without using Top clause?

How to select Top 40% from a table without using the Top clause (or Top percent, the assignment is a little ambiguous) ? This question is for T-SQL, SQL Server 2008. I am not allowed to use Top for my assignment.
Thanks.
This is what I've tried but seems complicated. Isn't there an easier way ?
select top (convert (int, (select round (0.4*COUNT(*), 0) from MyTable))) * from MyTable
Try the NTILE function:
;WITH YourCTE AS
(
SELECT
(some columns),
percentile = NTILE(10) OVER(ORDER BY SomeColumn DESC)
FROM
dbo.YourTable
)
SELECT *
FROM YourCTE
WHERE percentile <= 4
NTILE(10) OVER(...) splits your rows into 10 groups of (roughly) equal size, so the top 40% are groups 1, 2, 3 and 4 of that result.
Use NTILE
CREATE TABLE #temp(StudentID CHAR(3), Score INT)
INSERT #temp VALUES('S1',75 )
INSERT #temp VALUES('S2',83)
INSERT #temp VALUES('S3',91)
INSERT #temp VALUES('S4',83)
INSERT #temp VALUES('S5',93 )
INSERT #temp VALUES('S6',75 )
INSERT #temp VALUES('S7',83)
INSERT #temp VALUES('S8',91)
INSERT #temp VALUES('S9',83)
INSERT #temp VALUES('S10',93 )
SELECT * FROM (
SELECT NTILE(10) OVER(ORDER BY Score) AS NtileValue,*
FROM #temp) x
WHERE NtileValue <= 4
ORDER BY 1
Interestingly enough, I blogged about NTILE today: Does anyone use the NTILE() windowing function?
A problem with the NTILE(10) answers given so far is that if the table has 15 rows they will return 8 rows (53%) rather than the correct number to make up 40% (6).
If the number of rows is not evenly divisible by number of buckets the extra rows all go into the first buckets rather than being evenly distributed.
This alternative (borrows SQL Menace's table) avoids that issue.
WITH CTE
AS (SELECT *,
ROW_NUMBER() OVER ( ORDER BY Score) AS RN,
COUNT(*) OVER() AS Cnt
FROM #temp)
SELECT StudentID,
Score
FROM CTE
WHERE RN <= CEILING(0.4 * Cnt )
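The uneven bucket sizes described above can be checked directly. This sketch builds 15 rows inline (the numbers 1 to 15 are just stand-in data) and counts how many rows NTILE(10) puts into each bucket.

```sql
WITH Fifteen AS (
    SELECT n
    FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),
                 (11),(12),(13),(14),(15)) AS v(n)
),
Bucketed AS (
    SELECT n, NTILE(10) OVER (ORDER BY n) AS bucket
    FROM Fifteen
)
SELECT bucket, COUNT(*) AS rows_in_bucket
FROM Bucketed
GROUP BY bucket
ORDER BY bucket;
-- Buckets 1-5 get 2 rows each and buckets 6-10 get 1 row each,
-- so "bucket <= 4" keeps 8 of the 15 rows instead of 6.
```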
Using Top t-sql command:
select top 10 [Column_1],
[Column_2] from [Table]
order by [Column_1]
Using Paging method:
select
[Column_1],
[Column_2]
from
(Select ROW_NUMBER() Over (ORDER BY [Column_1]) AS Row,
[Column_1],
[Column_2]
FROM [Table]) as [alias]
WHERE (Row between 0 and 10)
This finds the top 10 rows ordered by [Column_1]. Note that the bracketed names are placeholders.
If you could provide column names and table names, I could write more useful T-SQL. For example, to find the top 40% you would need another sub-query to get the count of all rows and then do the division; I'd likely run that as a separate query before the main one.
Calculate and set ROWCOUNT to the number of records you want.
Then execute your query for the limited set.
declare @rc as integer
select @rc = count(*)*0.40 from CTE
Set ROWCOUNT @rc
select * from CTE
ROWCOUNT is not deprecated yet - see http://msdn.microsoft.com/en-us/library/ms188774.aspx
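A slightly fuller sketch of the ROWCOUNT idea, assuming the #temp table from the earlier answer stands in for the real table. The ORDER BY and the reset to 0 are additions: without an ORDER BY the rows returned are arbitrary, and ROWCOUNT stays in effect for the session until it is reset.

```sql
DECLARE @rc int;

-- 40% of the row count, rounded up
SELECT @rc = CEILING(COUNT(*) * 0.40) FROM #temp;

SET ROWCOUNT @rc;       -- limit subsequent statements to @rc rows

SELECT StudentID, Score
FROM #temp
ORDER BY Score DESC;

SET ROWCOUNT 0;         -- 0 switches the limit off again
```

Note that if @rc works out to 0 (an empty table), SET ROWCOUNT 0 means "no limit", which happens to be harmless here.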

How can I order by count with pagination?

I have to migrate some SQL from PostgreSQL to SQL Server (2005+). On PostgreSQL i had:
select count(id) as count, date
from table
group by date
order by count
limit 10 offset 25
Now I need the same SQL for SQL Server. I tried the query below, but I get the error: Invalid column name 'count'. How can I solve it?
select * from (
select row_number() over (order by count) as row, count(id) as count, date
from table
group by date
) a where a.row >= 25 and a.row < 35
You can't reference an alias by name, at the same scope, except in an ending ORDER BY (it is an invalid reference inside of a windowing function at the same scope).
To get the exact same results, it may need to be extended to (nesting scope for clarity):
SELECT c, d FROM
(
SELECT c, d, ROW_NUMBER() OVER (ORDER BY c) AS row FROM
(
SELECT d = [date], c = COUNT(id) FROM dbo.table GROUP BY [date]
) AS x
) AS y WHERE row >= 25 AND row < 35;
This can be shortened a little bit as per mohan's answer.
SELECT c, d FROM
(
SELECT COUNT(id), [date], ROW_NUMBER() OVER (ORDER BY COUNT(id))
FROM dbo.table GROUP BY [date]
) AS y(c, d, row)
WHERE row >= 25 AND row < 35;
In SQL Server 2012, it's much easier with OFFSET / FETCH - closer to the syntax you're used to, but actually using ANSI-compatible syntax rather than proprietary voodoo.
SELECT c = COUNT(id), d = [date]
FROM dbo.table GROUP BY [date]
ORDER BY COUNT(id)
OFFSET 25 ROWS FETCH NEXT 10 ROWS ONLY;
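OFFSET / FETCH also accepts variables, which makes the page position easy to parameterize. A sketch under assumptions: dbo.Events, id and EventDate are hypothetical names chosen to avoid the reserved words, and the second ORDER BY column is added so that rows with equal counts page consistently.

```sql
DECLARE @PageSize int = 10;   -- rows per page
DECLARE @Offset   int = 25;   -- rows to skip

SELECT c = COUNT(id), d = EventDate
FROM dbo.Events               -- hypothetical table
GROUP BY EventDate
ORDER BY COUNT(id), EventDate -- tie-breaker keeps paging stable
OFFSET @Offset ROWS FETCH NEXT @PageSize ROWS ONLY;
```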
I blogged about this functionality in 2010 (lots of good comments there too) and should probably invest some time doing some serious performance tests.
And I agree with @ajon - I hope your real tables, columns and queries don't abuse reserved words like this.
It works
DECLARE @startrow int=0,@endrow int=0
;with CTE AS (
select row_number() over ( order by count(id)) as row,count(id) AS count, date
from table
group by date
)
SELECT * FROM CTE
WHERE row between @startrow and @endrow
I think this will do it
select * from (
select row_number() over (order by count(id)) as row, count(id) as count, date
from table
group by date
) a where a.row >= 25 and a.row < 35
Also, I don't know what version of SQL Server you are using but SQL Server 2012 has a new Paging feature
