Understanding CTE Semicolon Placement - sql-server

When I run this CTE in SQL Server, it reports incorrect syntax near the declare statement.
;WITH cte as
(
SELECT tblKBFolders.FolderID
from tblKBFolders
where FolderID = @FolderID
UNION ALL
SELECT tblKBFolders.FolderID
FROM tblKBFolders
INNER JOIN cte
ON cte.FolderID = tblKBFolders.ParentFolderID
)
declare @tblQueryFolders as table (FolderID uniqueidentifier)
insert into @tblQueryFolders
SELECT FolderID From cte;
But if I move the declare to before the CTE, it runs just fine.
declare @tblQueryFolders as table (FolderID uniqueidentifier)
;WITH cte as
(
SELECT tblKBFolders.FolderID
from tblKBFolders
where FolderID = @FolderID
UNION ALL
SELECT tblKBFolders.FolderID
FROM tblKBFolders
INNER JOIN cte
ON cte.FolderID = tblKBFolders.ParentFolderID
)
insert into @tblQueryFolders
SELECT FolderID From cte;
Why is that?

The answer you asked for was already given in a comment: this has nothing to do with the semicolon's placement.
Important: a CTE's WITH cannot directly follow a statement that is not terminated with a semicolon. There are many statements where a WITH clause adds something to the end of the statement (table hints, the WITH after OPENJSON, etc.). The engine would have to guess whether this WITH belongs to the preceding statement or starts a CTE. That is why we often see
;WITH cte AS (...)
That is actually the wrong use of a semicolon. People put it there just so they don't forget it. It is considered better style and best practice to always end T-SQL statements with a semicolon (and not to use ;WITH, which actually adds an empty statement).
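As a minimal sketch of that advice (the variable @n and the CTE name nums are made up for illustration), terminating the preceding statement is all the parser needs:

```sql
-- Hypothetical example: @n and nums are illustrative names only.
DECLARE @n int = 3;   -- terminated with a semicolon, so the WITH below is unambiguous

WITH nums AS
(
    SELECT 1 AS n
    UNION ALL
    SELECT n + 1 FROM nums WHERE n < @n
)
SELECT n FROM nums;   -- the CTE and this SELECT form one single statement
```

Dropping the semicolon after the DECLARE should bring back the "previous statement must be terminated with a semicolon" error.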
A CTE is not much more than syntactic sugar. Putting the CTE's code within a FROM (SELECT ...) AS SomeAlias would be roughly the same, and in most cases it leads to the same execution plan. A CTE helps where you would otherwise have to write the same FROM (SELECT ...) AS SomeAlias in multiple places, and in general it makes things easier to read and understand. But it is not by any means comparable to a temp table or a table variable. The engine treats it as inline code, and you can use it only within the same statement.
So these two are the same:
WITH SomeCTE AS(...some query here...)
SELECT SomeCTE.* FROM SomeCTE;
SELECT SomeAlias.*
FROM (...some query here...) AS SomeAlias;
Your example looks like you think of the CTE as kind of a temp table definition, which you can use in the following statements. But this is not correct.
After the CTE the engine expects another CTE or a final statement like SELECT or UPDATE.
WITH SomeCTE AS(...some query here...)
SELECT * FROM SomeCTE;
or
WITH SomeCTE AS( ...query... )
,AnotherCTE AS ( ...query... )
SELECT * FROM AnotherCTE;
...or other content added with the WITH clause:
WITH XMLNAMESPACES( ...namespace declarations...)
,SomeCTE AS( ...query... )
SELECT * FROM SomeCTE;
All of these examples are one single statement.
Putting a DECLARE @Something in the middle would break this concept.
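To make the scoping concrete, here is a small sketch (the names @numbers and cte and the sample values are assumptions, not taken from the question). The table variable is declared first, the CTE feeds it inside one statement, and only the table variable survives afterwards:

```sql
-- Hypothetical example: @numbers and the sample rows are illustrative only.
DECLARE @numbers TABLE (n int);

WITH cte AS
(
    SELECT 1 AS n
    UNION ALL
    SELECT 2
)
INSERT INTO @numbers (n)
SELECT n FROM cte;          -- cte is visible only inside this one statement

SELECT n FROM @numbers;     -- works: the table variable outlives the statement
-- SELECT n FROM cte;       -- would fail here: cte no longer exists
```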

Related

Table Valued Function with Recursive CTE

Just for fun, I’m trying to write a table valued function to generate a table of dates. For testing purposes, I am hard-coding values which should be passed in variables.
By itself, this works:
WITH cte AS (
SELECT cast('2021-10-01' AS date) AS date
UNION ALL
SELECT dateadd(day,1,date) FROM cte WHERE date<current_timestamp
)
SELECT * FROM cte OPTION(maxrecursion 0);
Note the OPTION at the end.
As a function, it won’t work unless I remove the OPTION clause at the end:
CREATE FUNCTION dates(@start date, @rows INT) RETURNS TABLE AS
RETURN
WITH cte AS (
SELECT cast('2021-10-01' AS date) AS date
UNION ALL
SELECT dateadd(day,1,date) FROM cte WHERE date<current_timestamp
)
SELECT * FROM cte -- OPTION(maxrecursion 0)
;
For the test data, that’s OK, but it will certainly fail if I give it a date at the beginning of the year, since that involves more than 100 recursions.
Is there a correct syntax for this, or is it another Microsoft Quirk which needs a workaround?
I think this may be the answer.
In an answer to another question, @GarethD states:
If you think of a view more as a stored subquery than a stored query … and remember that its definition is expanded out into the main query …
(Incorrect Syntax Near Keyword 'OPTION' in CTE Statement).
If that’s the case, a view can’t include anything that can’t be included in a subquery. That includes the ORDER BY clause and hints such as OPTION.
I have yet to make sure, but I would guess that the same goes for an inline Table Valued Function. If so, the answer is no, there is no way to include the OPTION clause.
Note that other DBMSs are more accommodating in what can be included in subqueries, so I don’t imagine that they have the same limitations.
The solution is to use a multi-statement Table Valued Function:
DROP FUNCTION IF EXISTS dates;
GO
CREATE FUNCTION dates(@start date, @end date) RETURNS @dates TABLE(date date) AS
BEGIN
WITH cte AS (
SELECT @start AS date
UNION ALL
SELECT dateadd(day,1,date) FROM cte WHERE date<@end
)
INSERT INTO @dates(date)
SELECT date FROM cte OPTION(MAXRECURSION 0);
RETURN;
END;
GO
SELECT * FROM dates('2020-01-01','2021-01-01');
Inline Table Valued Functions are literally inlined, and clauses such as OPTION can only appear at the very end of an SQL statement, which is not necessarily at the end of the inline function.
On the other hand, a Multi Statement Table Valued function is truly self-contained, so the OPTION clause is OK there.
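Because the inline function's body is expanded into the calling statement, one commonly cited alternative is to keep the function inline and put the hint on the caller instead. The following is only a sketch of that idea: the function name dbo.dates_inline is hypothetical, and it assumes every caller remembers to add the hint.

```sql
-- Hypothetical inline variant; dbo.dates_inline is an illustrative name.
CREATE FUNCTION dbo.dates_inline(@start date, @end date) RETURNS TABLE AS
RETURN
WITH cte AS (
    SELECT @start AS [date]
    UNION ALL
    SELECT DATEADD(day, 1, [date]) FROM cte WHERE [date] < @end
)
SELECT [date] FROM cte;   -- no OPTION clause allowed inside the inline function
GO

-- The hint goes at the very end of the outer statement, where OPTION is legal.
SELECT [date]
FROM dbo.dates_inline('2020-01-01', '2021-01-01')
OPTION (MAXRECURSION 0);
```

The trade-off is that a caller who forgets the hint still hits the default limit of 100 recursions, which the multi-statement version avoids.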

Cannot use WITH statement two times

I want to create a view that is constructed like this:
(simplified)
Create VIEW viewAll AS
With TempLevel1 AS
(
SELECT statement
)
With TempLevel2 AS (SELECT * from TempLevel1)
SELECT * from TempLevel2
The problem is that I cannot use the With statement like this because of the following error:
Incorrect syntax near the keyword 'With'.
Incorrect syntax near the keyword 'with'. If this statement is a common table expression, an xmlnamespaces clause or a change tracking context clause, the previous statement must be terminated with a semicolon.
I have to specify that the SELECT queries are way more complex and I do have to use With two times.
Would it be a better practice to create the first with statement as another view like viewTempLevel1 (and use it in the With TempLevel2 statement)?
From the documentation for Common table Expressions (CTE), you can
Use a comma to separate multiple CTE definitions
Example is (taken straight out of the docs)
WITH Sales_CTE (SalesPersonID, TotalSales, SalesYear)
AS
-- Define the first CTE query.
(
SELECT SalesPersonID, SUM(TotalDue) AS TotalSales, YEAR(OrderDate) AS SalesYear
FROM Sales.SalesOrderHeader
WHERE SalesPersonID IS NOT NULL
GROUP BY SalesPersonID, YEAR(OrderDate)
)
, -- Use a comma to separate multiple CTE definitions.
-- Define the second CTE query, which returns sales quota data by year for each sales person.
Sales_Quota_CTE (BusinessEntityID, SalesQuota, SalesQuotaYear)
AS
(
SELECT BusinessEntityID, SUM(SalesQuota)AS SalesQuota, YEAR(QuotaDate) AS SalesQuotaYear
FROM Sales.SalesPersonQuotaHistory
GROUP BY BusinessEntityID, YEAR(QuotaDate)
)
-- Define the outer query by referencing columns from both CTEs.
SELECT SalesPersonID...
In your case, the syntax would be...
With TempLevel1 AS
( SELECT statement [...]),
TempLevel2 AS
(SELECT * from TempLevel1)
SELECT * from TempLevel2
You don't need to repeat the WITH keyword. Separate the CTE expressions by comma:
With CTE_Level1 AS
(
SELECT statement
),
CTE_Level2 AS
(
SELECT * from CTE_Level1
)
SELECT * from CTE_Level2

Strange behavior of CTE

I just answered this: Generate scripts with new ids (also for dependencies)
My first attempt was this:
DECLARE @Form1 UNIQUEIDENTIFIER=NEWID();
DECLARE @Form2 UNIQUEIDENTIFIER=NEWID();
DECLARE @tblForms TABLE(id UNIQUEIDENTIFIER,FormName VARCHAR(100));
INSERT INTO @tblForms VALUES(@Form1,'test1'),(@Form2,'test2');
DECLARE @tblFields TABLE(id UNIQUEIDENTIFIER,FormId UNIQUEIDENTIFIER,FieldName VARCHAR(100));
INSERT INTO @tblFields VALUES(NEWID(),@Form1,'test1.1'),(NEWID(),@Form1,'test1.2'),(NEWID(),@Form1,'test1.3')
,(NEWID(),@Form2,'test2.1'),(NEWID(),@Form2,'test2.2'),(NEWID(),@Form2,'test2.3');
--These are the original IDs
SELECT frms.id,frms.FormName
,flds.id,flds.FieldName
FROM @tblForms AS frms
INNER JOIN @tblFields AS flds ON frms.id=flds.FormId ;
--The same with new ids
WITH FormsWithNewID AS
(
SELECT NEWID() AS myNewFormID
,*
FROM @tblForms
)
SELECT frms.myNewFormID, frms.id,frms.FormName
,NEWID() AS myNewFieldID,flds.FieldName
FROM FormsWithNewID AS frms
INNER JOIN @tblFields AS flds ON frms.id=flds.FormId
The second select should deliver - at least I thought so - two values in "myNewFormID", each three times... But it comes up with 6 different values. This would mean, that the CTE's "NEWID()" is done for each row of the final result set. What am I missing?
Your understanding of CTEs is wrong. They are not simply a table variable that's filled with the results of the query - instead, they are a query on their own. Note that CTEs can be used recursively - this would be quite a sight with table variables :)
From MSDN:
A common table expression (CTE) can be thought of as a temporary result set that is defined within the execution scope of a single SELECT, INSERT, UPDATE, DELETE, or CREATE VIEW statement. A CTE is similar to a derived table in that it is not stored as an object and lasts only for the duration of the query. Unlike a derived table, a CTE can be self-referencing and can be referenced multiple times in the same query.
The "can be thought of" is a bit deceiving: sure, it can be thought of that way, but it's not a result set. You don't see this manifesting when you're only using pure functions, but as you've noticed, NEWID() is not pure. In reality, a CTE is more like a named subquery. In your example, you'll get the same thing if you just move the query from the CTE to the FROM clause directly.
To illustrate this even further, you can add another join on the CTE to the query:
WITH FormsWithNewID AS
(
SELECT NEWID() AS myNewFormID
,*
FROM @tblForms
)
SELECT frms.myNewFormID, frms.id,frms.FormName
,NEWID() AS myNewFieldID,flds.FieldName,
frms2.myNewFormID
FROM FormsWithNewID AS frms
INNER JOIN @tblFields AS flds ON frms.id=flds.FormId
left join FormsWithNewID as frms2 on frms.id = frms2.id
You'll see that frms2.myNewFormID contains different values than frms.myNewFormID.
Keep this in mind: you can only treat the CTE as a result set when you're only using pure functions on non-changing data; in other words, when executing the same query twice at the serializable transaction isolation level produces the same result sets.
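If each form should get exactly one new ID that stays stable across joins, one option is to generate the IDs once and store them, rather than leaving NEWID() inside the CTE. This is a sketch under that assumption; the table variable @FormsWithNewID is a hypothetical replacement for the CTE of the same name, and the sample data is cut down from the question:

```sql
-- Cut-down sample data in the spirit of the question.
DECLARE @tblForms TABLE (id uniqueidentifier, FormName varchar(100));
INSERT INTO @tblForms VALUES (NEWID(), 'test1'), (NEWID(), 'test2');

-- Hypothetical table variable: NEWID() is evaluated once per row at insert time.
DECLARE @FormsWithNewID TABLE
(
    myNewFormID uniqueidentifier,
    id          uniqueidentifier,
    FormName    varchar(100)
);
INSERT INTO @FormsWithNewID (myNewFormID, id, FormName)
SELECT NEWID(), id, FormName
FROM @tblForms;

-- Both references now read the same stored value for each form.
SELECT a.FormName, a.myNewFormID, b.myNewFormID AS sameNewFormID
FROM @FormsWithNewID AS a
INNER JOIN @FormsWithNewID AS b ON a.id = b.id;
```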
NEWID() returns a new value every time it is executed, so whenever you use it you get a different value.
For example,
select top 5 newid()
from sys.tables
order by newid()
You will not see them come back in order, because the selected column is produced with different values than the ORDER BY expression.

Use of the IN condition

I can easily create a stored procedure in SQL Server with parameters that I use with =, LIKE and most operators. But when it comes to using IN, I don't really understand what to do, and I can't find a good site to teach me.
Example
CREATE PROCEDURE TEST
@Ids --- What type should go here?
AS BEGIN
SELECT * FROM TableA WHERE ID IN ( @Ids )
END
Is this possible and if so how ?
With SQL Server 2008 and above, you can use Table Valued Parameters.
You declare a table type and can use that as a parameter (read only) for stored procedures that can be used in IN clauses.
For the different options, I suggest reading the relevant article for your version of the excellent Arrays and Lists in SQL Server, by Erland Sommarskog.
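A rough sketch of the table valued parameter approach follows. The type name dbo.IdList and the procedure name dbo.TestWithTvp are hypothetical, and the stand-in TableA assumes the ID column is an int; adjust all of these to the real schema.

```sql
-- Stand-in for the asker's table; skip this if TableA already exists.
CREATE TABLE dbo.TableA (ID int PRIMARY KEY, Name varchar(50));
GO

-- Hypothetical table type that carries the list of IDs.
CREATE TYPE dbo.IdList AS TABLE (Id int NOT NULL PRIMARY KEY);
GO

CREATE PROCEDURE dbo.TestWithTvp
    @Ids dbo.IdList READONLY      -- table valued parameters must be READONLY
AS
BEGIN
    SELECT a.*
    FROM dbo.TableA AS a
    WHERE a.ID IN (SELECT Id FROM @Ids);
END;
GO

-- Usage: fill a variable of the table type and pass it in.
DECLARE @list dbo.IdList;
INSERT INTO @list (Id) VALUES (1), (2), (3);
EXEC dbo.TestWithTvp @Ids = @list;
```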
I've done this in the past using a Split function that I add to my schema functions as described here
Then you can do the following:
CREATE PROCEDURE TEST
@Ids VARCHAR(MAX) -- comma-separated list of IDs
AS BEGIN
SELECT * FROM TableA WHERE ID IN ( SELECT items FROM dbo.Split(@Ids, ',') )
END
Just remember that IN always expects a set of values: either a list of expressions or a subquery. A list written out literally in the procedure works as is, but a single string variable holding comma-separated values is compared as one value, so it has to be split into rows first.
Another option in your specific example could be to use a join. This may perform better, but it often does not fit the real-world case you actually need. The join version would be:
SELECT *
FROM TableA AS ta
INNER JOIN dbo.Split(@Ids, ',') AS ids
ON ta.Id = ids.items
If you're asking what I think you're asking, I do this every day.
WITH myData(FileNames)
AS
(
SELECT '0608751970%'
UNION ALL SELECT '1000098846%'
UNION ALL SELECT '1000101277%'
UNION ALL SELECT '1000108488%'
)
SELECT DISTINCT f.*
FROM tblFiles (nolock) f
INNER JOIN myData md
ON f.FileNames LIKE md.FileNames
Or if you're doing this based on another table:
WITH myData(FileNames)
AS
(
SELECT RTRIM(FileNames) + '%'
FROM tblOldFiles
WHERE Active=1
)
SELECT DISTINCT f.*
FROM tblFiles (nolock) f
INNER JOIN myData md
ON f.FileNames LIKE md.FileNames

CTE inside SQL IF-ELSE structure

I want to do something like this
declare @a int=1
if (@a=1)
with cte as
(
select UserEmail from UserTable
)
else
with cte as
(
select UserID from UserTable
)
select * from cte
This is just the example, my actual query is far more complex. So I don't want to write the SELECT statement inside IF and ELSE statement twice after the CTE.
If possible, find a way to avoid the if statement entirely.
E.g. in such a trivial example as in your question:
;with CTE as (
select UserEmail from UserTable where @a = 1
union all
select UserID from UserTable where @a != 1 or @a is null
)
select /* single select statement here */
It should generally be possible to compose one or more distinct queries into a final UNION ALL cte, instead of using if - after all, both of the queries being combined must have compatible result sets anyway, for your original question to make sense.
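A runnable sketch of this pattern is below. The table variable @UserTable stands in for the asker's UserTable, and its column types are assumptions; because the two branches must have compatible types, UserID is cast to a string here.

```sql
-- Stand-in for UserTable; the column types are assumptions.
DECLARE @UserTable TABLE (UserID int, UserEmail varchar(100));
INSERT INTO @UserTable VALUES (1, 'a@example.com'), (2, 'b@example.com');

DECLARE @a int = 1;

WITH cte AS
(
    -- Only one branch returns rows, depending on @a.
    SELECT UserEmail AS val
    FROM @UserTable
    WHERE @a = 1
    UNION ALL
    SELECT CAST(UserID AS varchar(100))
    FROM @UserTable
    WHERE @a <> 1 OR @a IS NULL
)
SELECT val FROM cte;   -- the single final statement, written only once
```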
You can't do that - the CTE must immediately be followed by exactly one SQL statement that can refer to it. You cannot split the "definition" of the CTE from the statement that uses it.
So you need to do it this way:
declare @a int=1
if (@a=1)
with cte as
(
select UserEmail from UserTable
)
select * from cte
else
with cte as
(
select UserID from UserTable
)
select * from cte
You cannot split the CTE "definition" from its usage (select * from cte).
The with cte (...) select from cte ... is a single statement. The CTE is not 'followed' by a statement; it is part of the statement. You are asking to split the statement in two, which is impossible.
As a general rule, SQL is a very unfriendly language for things like DRY and avoiding code repetition. Attempting to make code more maintainable, more readable, or simply trying to save a few keystrokes can (and usually does) result in serious runtime performance penalties (e.g. attempting to move the CTE into a table-valued function). The simplest thing would be to bite the bullet (this time, and in future...) and write the CTE twice. Sometimes it makes sense to materialize the CTE into a #temp table and then operate on the #temp table, but only sometimes.
This status-quo is unfortunate, but is everything you can expect from design by committee...
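For the cases where materializing does make sense, the #temp table route could look roughly like this. The tables #UserTable and #picked are hypothetical stand-ins, and the column types are assumptions; only the branch-specific part is written twice, while the complex final query would be written once against #picked.

```sql
-- Stand-in for the asker's UserTable; column types are assumptions.
CREATE TABLE #UserTable (UserID int, UserEmail varchar(100));
INSERT INTO #UserTable VALUES (1, 'a@example.com'), (2, 'b@example.com');

DECLARE @a int = 1;

-- Hypothetical temp table that receives the rows of whichever branch runs.
CREATE TABLE #picked (val varchar(100));

IF (@a = 1)
BEGIN
    INSERT INTO #picked (val)
    SELECT UserEmail FROM #UserTable;
END
ELSE
BEGIN
    INSERT INTO #picked (val)
    SELECT CAST(UserID AS varchar(100)) FROM #UserTable;
END;

-- The shared (possibly complex) query is written once.
SELECT val FROM #picked;

DROP TABLE #picked;
DROP TABLE #UserTable;
```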

Resources