Table Valued Function with Recursive CTE - sql-server

Just for fun, I’m trying to write a table valued function to generate a table of dates. For testing purposes, I am hard-coding values which should be passed in variables.
By itself, this works:
WITH cte AS (
SELECT cast('2021-10-01' AS date) AS date
UNION ALL
SELECT dateadd(day,1,date) FROM cte WHERE date<current_timestamp
)
SELECT * FROM cte OPTION(maxrecursion 0);
Note the OPTION at the end.
As a function, it won’t work unless I remove the OPTION clause at the end:
CREATE FUNCTION dates(@start date, @rows INT) RETURNS TABLE AS
RETURN
WITH cte AS (
SELECT cast('2021-10-01' AS date) AS date
UNION ALL
SELECT dateadd(day,1,date) FROM cte WHERE date<current_timestamp
)
SELECT * FROM cte -- OPTION(maxrecursion 0)
;
For the test data, that’s OK, but it will certainly fail if I give it a date at the beginning of the year, since that involves more than 100 recursions, which is the default limit.
Is there a correct syntax for this, or is it another Microsoft Quirk which needs a workaround?

I think this may be the answer.
In an answer to another question, @GarethD states:
If you think of a view more as a stored subquery than a stored query … and remember that its definition is expanded out into the main query …
(Incorrect Syntax Near Keyword 'OPTION' in CTE Statement).
If that’s the case, a view can’t include anything that can’t be included in a subquery. That includes the ORDER BY clause and hints such as OPTION.
I have yet to make sure, but I can guess that the same goes for a Table Valued Function. If so, the answer is no, there is no way to include the OPTION clause.
Note that other DBMSs are more accommodating in what can be included in subqueries, so I don’t imagine that they have the same limitations.

The solution is to use a multi-statement Table Valued Function:
DROP FUNCTION IF EXISTS dates;
GO
CREATE FUNCTION dates(@start date, @end date) RETURNS @dates TABLE(date date) AS
BEGIN
WITH cte AS (
SELECT @start AS date
UNION ALL
SELECT dateadd(day,1,date) FROM cte WHERE date<@end
)
INSERT INTO @dates(date)
SELECT date FROM cte OPTION(MAXRECURSION 0);
RETURN;
END;
GO
SELECT * FROM dates('2020-01-01','2021-01-01');
Inline Table Valued Functions are literally inlined, and clauses such as OPTION can only appear at the very end of an SQL statement, which is not necessarily at the end of the inline function.
On the other hand, a Multi Statement Table Valued function is truly self-contained, so the OPTION clause is OK there.
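If the inline form is preferred, one commonly used alternative is to leave the hint out of the function and put it on the statement that calls it, since that is where the inlined query actually ends. A minimal sketch, assuming a hypothetical function name dbo.dates_inline that is not part of the original post:

```sql
-- dbo.dates_inline is an illustrative name, not from the original question.
DROP FUNCTION IF EXISTS dbo.dates_inline;
GO
CREATE FUNCTION dbo.dates_inline(@start date, @end date) RETURNS TABLE AS
RETURN
WITH cte AS (
    SELECT @start AS date                                    -- anchor row
    UNION ALL
    SELECT DATEADD(day, 1, date) FROM cte WHERE date < @end  -- one row per day
)
SELECT date FROM cte;   -- no OPTION clause allowed here
GO
-- The hint is attached to the outer statement, after the function is expanded.
SELECT date
FROM dbo.dates_inline('2020-01-01', '2021-01-01')
OPTION (MAXRECURSION 0);
```

The trade-off is that every caller has to remember the hint; the multi-statement version hides it inside the function instead.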

Related

SQL Server : how do I make a list that has dates and facilities that are not in another table?

declare @StartDate date = '08/01/2021',
@EndDate Date = '08/04/2021';
with cte_FacilityReportingDates as
(
select distinct Facility, REPORTING_DATE
from table1 a
where REPORTING_DATE between @StartDate and @EndDate
),
cte_facility as
(
select distinct Facility
from table1 a
),
cte_ReportingDates as
(
select distinct a.REPORTING_DATE
from table1 a
where a.REPORTING_DATE between @StartDate and @EndDate
),
cte_Combine as
(
select *
from cte_facility f
cross join cte_ReportingDates d
)
select t1.FACILITY, t1.REPORTING_DATE from cte_Combine t1 where not exists (select 1 from cte_FacilityReportingDates t2 where t1.FACILITY = t2.FACILITY and t2.REPORTING_DATE between StartDate and EndDate and t2.FACILITY is null group by t1.facility, t1.REPORTING_DATE)
I've got it down to the last 50 of the race (Hat Tip to the Olympics) but can't get over the finish line. I know it is simply something I've overlooked but I'm racking my brain! I need to show the facilities and dates that are NOT in the result from cte_ReportingDates.
With proper formatting, you will encourage others to help. You removed the efforts that someone else made in formatting your code when you edited it. That was quite discouraging honestly.
When formatted properly, you can clearly see where each CTE is defined and better understand what each does. It seems you overdid your use of DISTINCT - don't just throw it into code in hopes it "fixes" something. The first CTE (cte_FacilityReportingDates) does not really need DISTINCT if it is only used to test for existence. TBH that particular CTE is a bit overkill, since the logic can easily be incorporated within the EXISTS clause below - but that is a style choice.
<with ... all your CTEs from original query ...>
select comb.FACILITY, comb.REPORTING_DATE
from cte_Combine comb
where not exists (select * from cte_FacilityReportingDates as trn
where comb.FACILITY = trn.FACILITY
and comb.REPORTING_DATE = trn.REPORTING_DATE)
order by ...;
There is no reason to apply a GROUP BY clause to the final query since it is nothing but a unique set of <FACILITY, REPORTING_DATE>. Any time you use/see such a clause with no aggregates, that should be a concern that the writer has lost the path.
Also notice the ORDER BY clause. If the order of rows matters, then the query that generates the resultset must have one. Usually it does matter.
I also used better table aliases. Cryptic ones are not helpful to the reader; develop good habits. I have no idea what the CTE named cte_FacilityReportingDates represents (it selects from "table1" - another crap name with an equally crap alias "a"), so I just made up something.
The last issue I'll highlight is the rather important assumption you made. Your logic assumes that every facility exists within table1. That is not usually a safe assumption for some sort of "activity" table (which is my guess as to what that table represents). The same applies to dates. For dates you can generate the set of all dates between two boundaries easily - I'll leave that adjustment to you if needed. You cannot do that for facility - you might (likely do or should) need another table for that.
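As a rough illustration of that last point, the sketch below generates the date range with a recursive CTE and takes facilities from a separate master table. The temp tables #Facilities and #Activity and their sample rows are assumptions made up for this example; they stand in for a real facility table and for "table1".

```sql
-- Hypothetical stand-ins: #Facilities for a facility master table,
-- #Activity for "table1" from the question.
CREATE TABLE #Facilities (FACILITY varchar(20) PRIMARY KEY);
CREATE TABLE #Activity   (FACILITY varchar(20), REPORTING_DATE date);
INSERT INTO #Facilities VALUES ('North'), ('South');
INSERT INTO #Activity   VALUES ('North', '2021-08-01'), ('North', '2021-08-02');

DECLARE @StartDate date = '2021-08-01',
        @EndDate   date = '2021-08-02';

WITH cte_dates AS (
    SELECT @StartDate AS REPORTING_DATE          -- first day of the range
    UNION ALL
    SELECT DATEADD(day, 1, REPORTING_DATE)       -- add days until the end date
    FROM cte_dates
    WHERE REPORTING_DATE < @EndDate
)
SELECT fac.FACILITY, dt.REPORTING_DATE
FROM #Facilities AS fac
CROSS JOIN cte_dates AS dt                       -- every facility on every date
WHERE NOT EXISTS (SELECT *
                  FROM #Activity AS act
                  WHERE act.FACILITY = fac.FACILITY
                    AND act.REPORTING_DATE = dt.REPORTING_DATE)
ORDER BY fac.FACILITY, dt.REPORTING_DATE
OPTION (MAXRECURSION 0);                         -- allow ranges longer than 100 days
```

With this sample data, the result should list 'South' for both dates and nothing for 'North', because only 'North' has activity rows.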

Understanding CTE Semicolon Placement

When I run this CTE in SQL Server it says the syntax is incorrect by the declare statement.
;WITH cte as
(
SELECT tblKBFolders.FolderID
from tblKBFolders
where FolderID = @FolderID
UNION ALL
SELECT tblKBFolders.FolderID
FROM tblKBFolders
INNER JOIN cte
ON cte.FolderID = tblKBFolders.ParentFolderID
)
declare @tblQueryFolders as table (FolderID uniqueidentifier)
insert into @tblQueryFolders
SELECT FolderID From cte;
But if I move the declare to before the CTE, it runs just fine.
declare @tblQueryFolders as table (FolderID uniqueidentifier)
;WITH cte as
(
SELECT tblKBFolders.FolderID
from tblKBFolders
where FolderID = @FolderID
UNION ALL
SELECT tblKBFolders.FolderID
FROM tblKBFolders
INNER JOIN cte
ON cte.FolderID = tblKBFolders.ParentFolderID
)
insert into @tblQueryFolders
SELECT FolderID From cte;
Why is that?
The answer you ask for was given in a comment already: This has nothing to do with the semicolon's placement.
Important: The CTE's WITH cannot follow right after a statement without an ending semicolon. There are many statements where a WITH clause would add something to the end of the statement (query hints, the WITH after OPENJSON etc.). The engine would have to guess whether this WITH belongs to the statement before it or starts a CTE. That's the reason why we often see
;WITH cte AS (...)
That's actually the wrong usage of a semicolon. People put it there, just not to forget about it. Anyway it is seen as better style and best practice to end T-SQL statements always with a semicolon (and do not use ;WITH, as it adds an empty statement actually).
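To make the terminator rule concrete, here is a small sketch (the variable @n and the CTE are invented for illustration). Every statement ends with its own semicolon, so the WITH can only be read as the start of a new statement:

```sql
-- Each statement is terminated, so no leading ";WITH" is needed.
DECLARE @n int = 3;

WITH cte AS (
    SELECT 1 AS n
    UNION ALL
    SELECT n + 1 FROM cte WHERE n < @n
)
SELECT n FROM cte;

-- Without the semicolon after the DECLARE, the parser cannot tell whether
-- WITH continues the previous statement, and it rejects the batch with an
-- "Incorrect syntax near the keyword 'with'" error.
```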
A CTE is not much more than syntactical sugar. Putting the CTE's code within a FROM(SELECT ...) AS SomeAlias would be roughly the same. In most cases this would lead to the same execution plan. It helps in cases, where you'd have to write the same FROM(SELECT ) AS SomeAlias in multiple places. And - in general - it makes things easier to read and understand. But it is not - by any means - comparable to a temp table or a table variable. The engine will treat it as inline code and you can use it in the same statement exclusively.
So this is the same:
WITH SomeCTE AS(...some query here...)
SELECT SomeCTE.* FROM SomeCTE;
SELECT SomeAlias.*
FROM (...some query here...) AS SomeAlias;
Your example looks like you think of the CTE as kind of a temp table definition, which you can use in the following statements. But this is not correct.
After the CTE the engine expects another CTE or a final statement like SELECT or UPDATE.
WITH SomeCTE AS(...some query here...)
SELECT * FROM SomeCTE;
or
WITH SomeCTE AS( ...query... )
,AnotherCTE AS ( ...query... )
SELECT * FROM AnotherCTE;
...or another content added with the WITH clause:
WITH XMLNAMESPACES( ...namespace declarations...)
,SomeCTE AS( ...query... )
SELECT * FROM SomeCTE;
All of these examples are one single statement.
Putting a DECLARE @Something in the middle would break this concept.

Persistent WITH statement in SQL Server 2008 [duplicate]

I've got a question that came up while I was using the WITH clause in one of my scripts. Put simply, I want to use the CTE alias multiple times instead of only in the outer query, and that is the crux.
For instance:
-- Define the CTE expression
WITH cte_test (domain1, domain2, [...])
AS
-- CTE query
(
SELECT domain1, domain2, [...]
FROM table
)
-- Outer query
SELECT * FROM cte_test
-- Now I wanna use the CTE expression another time
INSERT INTO sometable ([...]) SELECT [...] FROM cte_test
The last row will lead to the following error because it's outside the outer query:
Msg 208, Level 16, State 1, Line 12 Invalid object name 'cte_test'.
Is there a way to use the CTE multiple times, or to make it persistent? My current solution is to create a temp table where I store the result of the CTE and use this temp table for any further statements.
-- CTE
[...]
-- Create a temp table after the CTE block
DECLARE @tmp TABLE (domain1 DATATYPE, domain2 DATATYPE, [...])
INSERT INTO @tmp (domain1, domain2, [...]) SELECT domain1, domain2, [...] FROM cte_test
-- Any further DML statements
SELECT * FROM @tmp
INSERT INTO sometable ([...]) SELECT [...] FROM @tmp
[...]
Frankly, I don't like this solution. Does anyone else have a best practice for this problem?
Thanks in advance!
A Common Table Expression doesn't persist data in any way. It's basically just a way of creating a sub-query in advance of the main query itself.
This makes it much more like an in-line view than a normal sub-query would be. Because you can reference it repeatedly in one query, rather than having to type it again and again.
But it is still just treated as a view, expanded into the queries that reference it, macro like. No persisting of data at all.
This, unfortunately for you, means that you must do the persistence yourself.
If you want the CTE's logic to be persisted, you don't want an in-line view, you just want a view.
If you want the CTE's result set to be persisted, you need a temp table type of solution, such as the one you do not like.
A CTE is only in scope for the SQL statement it belongs to. If you need to reuse its data in a subsequent statement, you need a temporary table or table variable to store the data in. In your example, unless you're implementing a recursive CTE I don't see that the CTE is needed at all - you can store its contents straight in a temporary table/table variable and reuse it as much as you want.
Also note that your DELETE statement would attempt to delete from the underlying table, unlike if you'd placed the results into a temporary table/table variable.
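A minimal sketch of the "store it once, reuse it" approach for a non-recursive query follows. The tables #source and #target, their columns and the sample rows are placeholders invented for this example:

```sql
-- Hypothetical source and target tables, for illustration only.
CREATE TABLE #source (domain1 int, domain2 varchar(20));
CREATE TABLE #target (domain1 int, domain2 varchar(20));
INSERT INTO #source VALUES (1, 'a'), (2, 'b');

-- Run the query once and keep its result; no CTE is needed for this.
SELECT domain1, domain2
INTO #tmp
FROM #source
WHERE domain1 > 0;

-- The stored rows can now be used by any number of later statements.
SELECT domain1, domain2 FROM #tmp;

INSERT INTO #target (domain1, domain2)
SELECT domain1, domain2 FROM #tmp;
```

Because #tmp holds a copy, later changes to it do not touch the underlying table.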

CTE inside SQL IF-ELSE structure

I want to do something like this
declare @a int=1
if (@a=1)
with cte as
(
select UserEmail from UserTable
)
else
with cte as
(
select UserID from UserTable
)
select * from cte
This is just the example, my actual query is far more complex. So I don't want to write the SELECT statement inside IF and ELSE statement twice after the CTE.
If possible, find a way to avoid the if statement entirely.
E.g. in such a trivial example as in your question:
;with CTE as (
select UserEmail from UserTable where @a = 1
union all
select UserID from UserTable where @a != 1 or @a is null
)
select /* single select statement here */
It should generally be possible to compose one or more distinct queries into a final UNION ALL cte, instead of using if - after all, both of the queries being combined must have compatible result sets anyway, for your original question to make sense.
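Here is a self-contained sketch of that idea. The temp table #UserTable and its rows are made up for the example, and the explicit CAST is an addition: it is assumed that UserID is an int and UserEmail a string, in which case both branches need a common type for the UNION ALL to work.

```sql
-- Hypothetical stand-in for UserTable.
CREATE TABLE #UserTable (UserID int, UserEmail varchar(100));
INSERT INTO #UserTable VALUES (1, 'a@example.com'), (2, 'b@example.com');

DECLARE @a int = 1;

WITH cte AS (
    -- Only one branch returns rows, depending on @a.
    SELECT UserEmail AS Value
    FROM #UserTable
    WHERE @a = 1
    UNION ALL
    SELECT CAST(UserID AS varchar(100))   -- give both branches the same type
    FROM #UserTable
    WHERE @a <> 1 OR @a IS NULL
)
SELECT Value FROM cte;   -- the single shared SELECT
```

With @a = 1 this returns the two e-mail addresses; with any other value (or NULL) it returns the IDs as strings.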
You can't do that - the CTE must immediately be followed by exactly one SQL statement that can refer to it. You cannot split the "definition" of the CTE from the statement that uses it.
So you need to do it this way:
declare @a int=1
if (@a=1)
with cte as
(
select UserEmail from UserTable
)
select * from cte
else
with cte as
(
select UserID from UserTable
)
select * from cte
You cannot split the CTE "definition" from its usage (select * from cte).
The with cte (...) select from cte ... is a single statement. It is not 'followed' by a statement; it is part of the statement. You are asking to split the statement in two, which is obviously impossible.
As a general rule, SQL is a very unfriendly language for things like DRY and avoiding code repetition. Attempting to make code more maintainable, more readable, or simply trying to save a few keystrokes can (and usually does) result in serious runtime performance penalties (e.g. attempting to move the CTE into a table-valued function UDF). The simplest thing would be to bite the bullet (this time, and in future...) and write the CTE twice. Sometimes it makes sense to materialize the CTE into a #temp table and then operate on the #temp table, but only sometimes.
This status-quo is unfortunate, but is everything you can expect from design by committee...

Use of table valued function in order by clause

Can I use my table valued function in the order by clause of my select query?
Like this :
declare @ID int
set @ID=9011
Exec ('select top 10 * from cs_posts order by ' + (select * from dbo.gettopposter(@ID)) desc)
GetTopPoster(ID) is my table valued function.
Please help me on this.
You can use a table-valued function with a join. That also allows you to choose any combination of columns to sort by:
select top 10 *
from cs_posts p
join dbo.gettopposter(@ID) as gtp
on p.poster_id = gtp.poster_id
order by
gtp.col1
, gtp.col2
Yes. You can use a Table Valued Function just as a normal table.
Your query is not valid SQL though, despite the TVF.
For further reference:
http://msdn.microsoft.com/en-us/library/ms191165.aspx
You can't do it like that - how does it know what to order by? It doesn't know how the TVF relates to the original query. You can join the two, however (I assume cs_posts has an id column which relates to the TVF), and then order by the TVF id column.
