CTE inside SQL IF-ELSE structure - sql-server

I want to do something like this
declare #a int=1
if (#a=1)
with cte as
(
select UserEmail from UserTable
)
else
with cte as
(
select UserID from UserTable
)
select * from cte
This is just the example, my actual query is far more complex. So I don't want to write the SELECT statement inside IF and ELSE statement twice after the CTE.

If possible, find a way to avoid the if statement entirely.
E.g. in such a trivial example as in your question:
;with CTE as (
select UserEmail from UserTable where #a = 1
union all
select UserID from UserTable where #a != 1 or #a is null
)
select /* single select statement here */
It should generally be possible to compose one or more distinct queries into a final UNION ALL cte, instead of using if - after all, both of the queries being combined must have compatible result sets anyway, for your original question to make sense.

You can't do that - the CTE must immediately be followed by exactly one SQL statement that can refer to it. You cannot split the "definition" of the CTE from the statement that uses it.
So you need to do it this way:
declare #a int=1
if (#a=1)
with cte as
(
select UserEmail from UserTable
)
select * from cte
else
with cte as
(
select UserID from UserTable
)
select * from cte
You cannot split the CTE "definition" for its usage (select * from cte)

The with cte (...) select from cte ... is a single statement. Is not 'followed' by a statement, it is part of the statement. You are asking to split the statement in two which is obviously impossible.
As a general rule, SQL is a very unfriendly language for things like DRY and avoiding code repetition. Attempting to make code more maintainable, more readable, or simply trying to save a few keystrokes can (and usually does) result in serious runtime performance penalties (eg. attempting to move the CTE into a table value function UDF). The simplest thing would be to bite the bullet (this time, and in future...) and write the CTE twice. Sometimes it makes sense to materialize the CTE into a #temp table and then operate into on the #temp table, but only sometimes.
This status-quo is unfortunate, but is everything you can expect from design by committee...

Related

Table Valued Function with Recursive CTE

Just for fun, I’m trying to write a table valued function to generate a table of dates. For testing purposes, I am hard-coding values which should be passed in variables.
By itself, this works:
WITH cte AS (
SELECT cast('2021-10-01' AS date) AS date
UNION ALL
SELECT dateadd(day,1,date) FROM cte WHERE date<current_timestamp
)
SELECT * FROM cte OPTION(maxrecursion 0);
Note the OPTION at the end.
As a function, it won’t work unless I remove the OPTION clause at the end:
CREATE FUNCTION dates(#start date, #rows INT) RETURNS TABLE AS
RETURN
WITH cte AS (
SELECT cast('2021-10-01' AS date) AS date
UNION ALL
SELECT dateadd(day,1,date) FROM cte WHERE date<current_timestamp
)
SELECT * FROM cte -- OPTION(maxrecursion 0)
;
For the test data, that’s OK, but it will certainly fail if I give it date at the beginning of the year, since it involves more than 100 recursions.
Is there a correct syntax for this, or is it another Microsoft Quirk which needs a workaround?
I think this may be the answer.
In an answer to another question, #GarethD states:
If you think of a view more as a stored subquery than a stored query … and remember that its definition is expanded out into the main query …
(Incorrect Syntax Near Keyword 'OPTION' in CTE Statement).
If that’s the case, a view can’t include anything that can’t be included in a subquery. That includes the ORDER BY clause and hints such as OPTION.
I have yet to make sure, but I can guess that the same goes for a Table Valued Function. If so, the answer is no, there is no way include the OPTION clause.
Note that other DBMSs are more accommodating in what can be included in subqueries, so I don’t imagine that they have the same limitations.
The solution is to use a multi-statement Table Valued Function:
DROP FUNCTION IF EXISTS dates;
GO
CREATE FUNCTION dates(#start date, #end date) RETURNS #dates TABLE(date date) AS
BEGIN
WITH cte AS (
SELECT #start AS date
UNION ALL
SELECT dateadd(day,1,date) FROM cte WHERE date<#end
)
INSERT INTO #dates(date)
SELECT date FROM cte OPTION(MAXRECURSION 0)
RETURN;
END;
GO
SELECT * FROM dates('2020-01-01','2021-01-01');
Inline Table Valued Functions are literally inlined, and clauses such as OPTION can only appear at the very end of an SQL statement, which is not necessarily at the end of the inline function.
On the other hand, a Multi Statement Table Valued function is truly self-contained, so the OPTION clause is OK there.

Understanding CTE Semicolon Placement

When I run this CTE in SQL Server it says the syntax is incorrect by the declare statement.
;WITH cte as
(
SELECT tblKBFolders.FolderID
from tblKBFolders
where FolderID = #FolderID
UNION ALL
SELECT tblKBFolders.FolderID
FROM tblKBFolders
INNER JOIN cte
ON cte.FolderID = tblKBFolders.ParentFolderID
)
declare #tblQueryFolders as table (FolderID uniqueidentifier)
insert into #tblQueryFolders
SELECT FolderID From cte;
But if I move the declare to before the CTE, it runs just fine.
declare #tblQueryFolders as table (FolderID uniqueidentifier)
;WITH cte as
(
SELECT tblKBFolders.FolderID
from tblKBFolders
where FolderID = #FolderID
UNION ALL
SELECT tblKBFolders.FolderID
FROM tblKBFolders
INNER JOIN cte
ON cte.FolderID = tblKBFolders.ParentFolderID
)
insert into #tblQueryFolders
SELECT FolderID From cte;
Why is that?
The answer you ask for was given in a comment already: This has nothing to do with the semicolon's placement.
Important: The CTE's WITH cannot follow right after a statement without an ending semicolon. There are many statments, where a WITH-clause would add something to the end of the statement (query hints, the WITH after OPENJSON etc.). The engine would have to guess, whether this WITH adds to the statment before or if it is a CTE's start. That's the reason, why we often see
;WITH cte AS (...)
That's actually the wrong usage of a semicolon. People put it there, just not to forget about it. Anyway it is seen as better style and best practice to end T-SQL statements always with a semicolon (and do not use ;WITH, as it adds an empty statement actually).
A CTE is not much more than syntactical sugar. Putting the CTE's code within a FROM(SELECT ...) AS SomeAlias would be roughly the same. In most cases this would lead to the same execution plan. It helps in cases, where you'd have to write the same FROM(SELECT ) AS SomeAlias in multiple places. And - in general - it makes things easier to read and understand. But it is not - by any means - comparable to a temp table or a table variable. The engine will treat it as inline code and you can use it in the same statement exclusively.
So this is the same:
WITH SomeCTE AS(...some query here...)
SELECT SomeCTE.* FROM SomeCTE;
SELECT SomeAlias.*
FROM (...some query here...) AS SomeAlias;
Your example looks like you think of the CTE as kind of a temp table definition, which you can use in the following statements. But this is not correct.
After the CTE the engine expects another CTE or a final statement like SELECT or UPDATE.
WITH SomeCTE AS(...some query here...)
SELECT * FROM SomeCTE;
or
WITH SomeCTE AS( ...query... )
,AnotherCTE AS ( ...query... )
SELECT * FROM AnotherCTE;
...or another content added with the WITH clause:
WITH XMLNAMESPACES( ...namespace declarations...)
,SomeCTE AS( ...query... )
SELECT * FROM SomeCTE;
All of these examples are one single statement.
Putting a DECLARE #Something in the middle, would break this concept.

SELECT INTO query

I have to write an SELECT INTO T-SQL script for a table which has columns acc_number, history_number and note.
How do i facilitate an incremental value of history_number for each record being inserted via SELECT INTO.
Note, that the value for history_number comes off as a different value for each account from a different table.
SELECT history_number = IDENTITY(INT,1,1),
... etc...
INTO NewTable
FROM ExistingTable
WHERE ...
You could use ROW_NUMBER instead of identity i.e. ROW_NUMBER() OVER (ORDER BY )
SELECT acc_number
,o.historynumber
,note
,o.historynumber+DENSE_RANK() OVER (Partition By acc_number ORDER BY Note) AS NewHistoryNumber
--Or some other order by probably a timestamp...
FROM Table t
INNER JOIN OtherTable o
ON ....
Working Fiddle
The will give you an incremented count starting from history number for each accnum. I suggest you use a better order by in the rank but there was not enough info in the question.
This answer to this question may help you as well
Question
Suppose your SELECT statement is like this
SELECT acc_number,
history_number,
note
FROM [Table]
Try this Query as below.
SELECT ROW_NUMBER() OVER (ORDER BY acc_number) ID,
acc_number,
history_number,
note
INTO [NewTable]
FROM [Table]

Use of the IN condition

I can easily create a stored procedure in SQL Server with parameters that I use with =, LIKE and most operators. But when it comes to using IN, I don't really understand what to do, and I can't find a good site to teach me.
Example
CREATE PROCEDURE TEST
#Ids --- What type should go here?
AS BEGIN
SELECT * FROM TableA WHERE ID IN ( #Ids )
END
Is this possible and if so how ?
With SQL Server 2008 and above, you can use Table Valued Parameters.
You declare a table type and can use that as a parameter (read only) for stored procedures that can be used in IN clauses.
For the different options, I suggest reading the relevant article for your version of the excellent Arrays and Lists in SQL Server, by Erland Sommarskog.
I've done this in the past using a Split function that I add to my schema functions as described here
Then you can do the following:
CREATE PROCEDURE TEST
#Ids --- What type should go here?
AS BEGIN
SELECT * FROM TableA WHERE ID IN ( dbo.Split(#Ids, ',') )
END
Just remember that the IN function always expects a table of values as a result. SQL Server is smart enough to convert strings to this table format, so long as they are specifically written in the procedure.
Another option in your specific example though, could be to use a join. This will have a performance improvement, but often does not really meet a real-world example you need. The join version would be:
SELECT *
FROM TableA AS ta
INNER JOIN dbo.Split(#Ids, ',') AS ids
ON ta.Id = ids.items
If your asking what I think your asking, I do this every day..
WITH myData(FileNames)
AS
(
SELECT '0608751970%'
UNION ALL SELECT '1000098846%'
UNION ALL SELECT '1000101277%'
UNION ALL SELECT '1000108488%'
)
SELECT DISTINCT f.*
FROM tblFiles (nolock) f
INNER JOIN myData md
ON b.FileNames LIKE md.FileNames
Or if your doing this based on another table:
WITH myData(FileNames)
AS
(
SELECT RTRIM(FileNames) + '%'
FROM tblOldFiles
WHERE Active=1
)
SELECT DISTINCT f.*
FROM tblFiles (nolock) f
INNER JOIN myData md
ON b.FileNames LIKE md.FileNames

Output to Table Variable, Not CTE

Why is this legal:
DECLARE #Party TABLE
(
PartyID nvarchar(10)
)
INSERT INTO #Party
SELECT Name FROM
(INSERT INTO SomeOtherTable
OUTPUT inserted.Name
VALUES ('hello')) [H]
SELECT * FROM #Party
But the next block gives me an error:
WITH Hey (Name)
AS (
SELECT Name FROM
(INSERT INTO SomeOtherTable
OUTPUT inserted.Name
VALUES ('hello')) [H]
)
SELECT * FROM Hey
The second block gives me the error "A nested INSERT, UPDATE, DELETE, or MERGE statement is not allowed in a SELECT statement that is not the immediate source of rows for an INSERT statement.
It seems to be saying thst nested INSERT statements are allowed, but in my CTE case I did not nest inside another INSERT. I'm surprised at this restriction. Any work-arounds in my CTE case?
As for why this is illegal, allowing these SELECT operations with side effects would cause all sorts of problems I imagine.
CTEs do not get materialised in advance into their own temporary table so what should the following return?
;WITH Hey (Name)
AS
(
...
)
SELECT name
FROM Hey
JOIN some_other_table ON Name = name
If the Query Optimiser decided to use a nested loops plan and Hey as the driving table then presumably one insert would occur. However if it used some_other_table as the driving table then the CTE would get evaluated as many times as there were rows in that other table and so multiple inserts would occur. Except if the Query Optimiser decided to add a spool to the plan and then it would only get evaluated once.
Presumably avoiding this sort of mess is the motivation for this restriction (as with the restrictions on side effects in functions)

Resources