Is the data in a common table expression static? - sql-server

I'm wondering if the data in a CTE is static (does it stay the same when changes are made to the original tables it was created from) - I think the answer is yes, but I want to make sure. Example:
DECLARE @TSCourseID as INT = 123456789;
WITH CTE as
(
SELECT
TSRegistrants.TSRegistrantID
,TSRegistrants.Name
,TSRegistrants.Email
,TSRegistrants.PhoneNumber
FROM TSRegCourseDetail
JOIN TSRegistrants
ON TSRegCourseDetail.TSRegistrantID = TSRegistrants.TSRegistrantID
WHERE TSRegCourseDetail.TSCourseID = @TSCourseID
AND TSRegistrants.Name in ('User List')
)
UPDATE TSRegCourseDetail
SET TSCourseID = 987654321
WHERE TSRegistrantID in (select TSRegistrantID from CTE)
1) Would this change the data in the CTE? This query would empty it if so; I'm hoping it doesn't
2) Also, would an UPDATE ... SET ... FROM that selects TSRegistrantID from the CTE work, or be any better?
I am not a programmer, just got stuck with the hat for the time being >.<
Thanks!

The CTE is evaluated as part of the UPDATE statement, and the rows to update are identified before any changes are applied, so in the way that you mean, no, the data doesn't change.
Also, a CTE can only be used by the single statement that immediately follows it (SELECT, INSERT, UPDATE, DELETE, or MERGE). If you want it to be available for multiple statements, you need a temp table or table variable instead.
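If the captured rows do need to outlive a single statement, a table variable is one way to do it. This is only a minimal sketch, assuming the table and column names from the question; @Moved is a hypothetical name, and INT is an assumed type for TSRegistrantID.

```sql
DECLARE @TSCourseID INT = 123456789;

-- Hypothetical table variable; INT is an assumed type for TSRegistrantID.
DECLARE @Moved TABLE (TSRegistrantID INT PRIMARY KEY);

-- Capture the matching registrants once, before anything is changed.
INSERT INTO @Moved (TSRegistrantID)
SELECT DISTINCT r.TSRegistrantID
FROM TSRegCourseDetail cd
JOIN TSRegistrants r ON cd.TSRegistrantID = r.TSRegistrantID
WHERE cd.TSCourseID = @TSCourseID
  AND r.Name IN ('User List');

-- Statement 1: the same UPDATE as in the question, driven by the captured list.
UPDATE TSRegCourseDetail
SET TSCourseID = 987654321
WHERE TSRegistrantID IN (SELECT TSRegistrantID FROM @Moved);

-- Statement 2: the list is still available here, which a CTE would not be.
SELECT TSRegistrantID FROM @Moved;
```

The rows in @Moved are a real copy, so they stay the same after the UPDATE runs. A table variable only lives for the current batch; a temp table (#Moved) would last for the session instead.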

You don't need to use a CTE for this at all:
DECLARE @TSCourseID AS INT = 123456789;
UPDATE cd
SET cd.TSCourseID = 987654321
FROM TSRegCourseDetail cd
INNER JOIN TSRegistrants r ON cd.TSRegistrantID = r.TSRegistrantID
WHERE cd.TSCourseID = @TSCourseID
AND r.Name IN ( 'User List' )

Related

Possibility of reusing nested select value in other nested selects

Is it possible to create a reusable nested select statement in TSQL? Something like this?
Select(
Select1(reusable)
Select2(where Select1)
Select3(where Select1)
Select4(where Select1)
)
Or do I need to rewrite the same select statement in every lower statement to get the value I need to use in the other selects? I tried to use it as below, but I always get errors.
(SELECT TOP (1) LOCATION FROM Address AS d
WHERE(RECID = trans.Address) ) as loc,
(SELECT TOP (1) CITY WHERE
(loc = other.recid)) as CITY

SQL Server: use function after certain value in table

I am trying to find the time difference between two certain points in stored conversations. These points can differ in each conversation which makes it difficult for me. I need the time difference between the Agent's message and the first EndUser response after it.
In the example in CaseNr 1234 below I need the time difference between MessageNrs 3&4, 5&6 and 7&8.
In CaseNr 2345 I need the time difference between MessageNrs 3&4, 5&6, 7&8 and 10&11.
In CaseNr 4567 I need the time difference between 2&3 and 4&5.
As is shown, the order Agent & EndUser can differ in each conversation as well as the positions these types are in.
Is there a way to calculate the time difference the way I have described it in SQL server?
I think this code should help you.
with t(MessageNr,CaseNr,Type, AgentTime, EndUserTime) as
(
select
t1.MessageNr,
t1.CaseNr,
t1.Type,
t1.EntryTime,
(select top 1 t2.EntryTime
from [Your_Table] as t2
where t1.CaseNr = t2.CaseNr
and t2.[Type] = 'EndUser'
and t1.EntryTime < t2.EntryTime
order by t2.EntryTime) as userTime
from [Your_Table] as t1
where t1.[Type] = 'Agent'
)
select t.*, DATEDIFF(second, AgentTime, EndUserTime)
from t;
It appears the logic you require is the time difference between an Agent row and the immediately following EndUser row.
You can do this with LEAD, which will be more performant than the use of self-joins.
SELECT *,
DATEDIFF(second, t.EntryTime, t.NextTime) TimeDifference
FROM (
SELECT *,
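-- LEAD requires an OVER clause; partitioning by case and ordering by message number is assumed here
LEAD(CASE WHEN t.[Type] = 'EndUser' THEN t.EntryTime END)
OVER (PARTITION BY t.CaseNr ORDER BY t.MessageNr) NextTime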
FROM myTable t
) t
WHERE t.[Type] = 'Agent'
AND t.NextTime IS NOT NULL

Select IDs which belongs ONLY to the list passed as parameter

Let's start from data:
DECLARE @Avengers TABLE ([Hero] varchar(32), [Preference] varchar(32));
INSERT INTO @Avengers VALUES
('Captain_America','gingers'),('Captain_America','blondes'),
('Captain_America','brunettes'),('Hulk','gingers'),('Hulk','blondes'),
('Hawkeye','gingers'),('Hawkeye','brunettes'),('Iron_Man','blondes'),
('Iron_Man','brunettes'),('Thor','gingers'),('Nick_Fury','blondes');
Now I would like to pass @Preferences as a list of [Preference] values (either comma-separated or as a single-column table parameter) without knowing how many values I am going to get, and based on this select the [Hero] who prefers exactly the @Preferences provided. By that I mean: if I am after 'blondes' and 'gingers', then I am after 'Hulk' only
(NOT 'Captain_America', who prefers 'blondes', 'gingers' and 'brunettes').
I would like to get something like:
SELECT [Hero]
FROM @Avengers
WHERE *IS_ASSIGNED_ONLY_TO_THE_LIST*([Preference]) = @Preferences
Well, I think I overcomplicated my code, but it works.
SELECT a.Hero, COUNT(*), MIN(p.N)
FROM @Avengers a
LEFT JOIN ( SELECT *, COUNT(*) OVER() N
FROM @Preferences) p
ON a.Preference = p.Preference
GROUP BY a.Hero
HAVING COUNT(*) = MIN(p.N)
AND COUNT(*) = COUNT(p.Preference)
;
I'm using @Preferences as a table.
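To make the approach above concrete, here is a self-contained sketch that runs as one batch. It assumes @Preferences is a single-column table variable (its definition is not shown in the answer) and that neither table contains duplicate rows.

```sql
DECLARE @Avengers TABLE ([Hero] varchar(32), [Preference] varchar(32));
INSERT INTO @Avengers VALUES
('Captain_America','gingers'),('Captain_America','blondes'),
('Captain_America','brunettes'),('Hulk','gingers'),('Hulk','blondes'),
('Hawkeye','gingers'),('Hawkeye','brunettes'),('Iron_Man','blondes'),
('Iron_Man','brunettes'),('Thor','gingers'),('Nick_Fury','blondes');

-- Assumed shape of the parameter: one row per requested preference.
DECLARE @Preferences TABLE ([Preference] varchar(32));
INSERT INTO @Preferences VALUES ('blondes'), ('gingers');

SELECT a.Hero
FROM @Avengers a
LEFT JOIN (SELECT x.Preference, COUNT(*) OVER () AS N  -- N = size of the requested list
           FROM @Preferences x) p
  ON a.Preference = p.Preference
GROUP BY a.Hero
HAVING COUNT(*) = MIN(p.N)             -- hero has exactly as many preferences as requested
   AND COUNT(*) = COUNT(p.Preference); -- and every one of them is in the list
-- Returns: Hulk
```

Captain_America is excluded because he has three preferences against a list of two, and Hawkeye and Iron_Man are excluded because 'brunettes' finds no match in @Preferences.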

SQL Server (2014) Query Optimization when using filter variable passed to procedure

I am looking for help with optimization techniques, or a hint to help me move ahead with the problem I have. Using a temp table in the IN clause makes my query run for more than 5 seconds; changing it to a static value returns the data in under a second. I am trying to understand how to optimize this.
-- details about the number of rows in table
dept_activity table
- total rows - 17,319,666
- rows for (dept_id = 10) - 36054
-- temp table
CREATE TABLE #tbl_depts (
Id INT Identity(1, 1) PRIMARY KEY
,dept_id integer
);
-- for example I inserted one row but based on conditions multiple department numbers are inserted in this temp table
insert into #tbl_depts(dept_id) values(10);
-- this query takes more than 5 seconds
SELECT activity_type,count(1) rc
FROM dept_activity da
WHERE (
@filter_by_dept IS NULL
OR da.depart_id IN (
SELECT td.dept_id
FROM #tbl_depts td
)
)
group by activity_type;
-- this query takes less than 500 milli seconds
SELECT activity_type,count(1) rc
FROM dept_activity da
WHERE (
@filter_by_dept IS NULL
OR da.depart_id IN (
10 -- changed to static value
)
)
group by activity_type;
How can I optimize the first query so that it returns data in under a second?
You're testing this with just one value, but isn't your real case different?
The problem the optimizer has here is that it can't know how many rows the temp table in the IN clause will actually return, so it has to guess, and that is probably why the result is different. Comparing estimated and actual row counts might give some insight into this.
If your clause only contains this one criterion:
@filter_by_dept IS NULL OR da.depart_id IN
it might be good to test what happens if you separate your logic with IF blocks into one query that fetches everything and another that filters the data.
If that's not the real case, you might want to test OPTION (RECOMPILE), which could result in a better plan but will use a little more CPU, since the plan is regenerated every time. Another option is constructing the clause with dynamic SQL (either just with the temp table but optimizing away the OR, or building a full IN list if there isn't a ridiculous number of values), but that might get really ugly.
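One way to compare estimated and actual row counts, as suggested above, is to have SQL Server return the actual execution plan with the results. This is a rough sketch, assuming the dept_activity table from the question already exists; the rest mirrors the question's setup.

```sql
-- Assumes dept_activity exists as described in the question.
DECLARE @filter_by_dept INT = 10;

CREATE TABLE #tbl_depts (
    Id INT IDENTITY(1, 1) PRIMARY KEY,
    dept_id INT
);
INSERT INTO #tbl_depts (dept_id) VALUES (10);

-- Return the actual plan as XML alongside the query results.
SET STATISTICS XML ON;

SELECT da.activity_type, COUNT(1) AS rc
FROM dept_activity da
WHERE @filter_by_dept IS NULL
   OR da.depart_id IN (SELECT td.dept_id FROM #tbl_depts td)
GROUP BY da.activity_type;

SET STATISTICS XML OFF;

DROP TABLE #tbl_depts;
```

In the returned plan, each operator carries both an estimated and an actual row count. A large gap between the two on the dept_activity access is a sign that the optimizer's guess about the temp table is driving the slow plan.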
There are different ways of writing the same thing. Use whichever fits your requirements:
Separate Block
IF @filter_by_dept IS NULL
BEGIN
SELECT da.activity_type, count(1) rc
FROM dept_activity da
GROUP BY da.activity_type
END
ELSE
BEGIN
SELECT da.activity_type,COUNT(1) rc
FROM dept_activity da
INNER JOIN #tbl_depts td ON td.dept_id = da.depart_id
GROUP BY da.activity_type
END
Dynamic Query
DECLARE @sql_stmt VARCHAR(5000)
SET @sql_stmt = '
SELECT activity_type, COUNT(1) rc
FROM dept_activity da
'
IF @filter_by_dept IS NOT NULL
SET @sql_stmt = @sql_stmt + ' INNER JOIN #tbl_depts td ON td.dept_id = da.depart_id'
SET @sql_stmt = @sql_stmt + ' GROUP BY da.activity_type '
EXEC(@sql_stmt)
Simple Left Join
Comparatively, it can be slower than the above two options.
SELECT da.activity_type, count(1) rc
FROM dept_activity da
LEFT JOIN #tbl_depts td ON td.dept_id = da.depart_id
WHERE @filter_by_dept IS NULL OR td.id IS NOT NULL
GROUP BY da.activity_type
The biggest issue is most likely the use of an "optional parameter". The query optimizer has no idea whether or not @filter_by_dept is going to have a value the next time the query is executed, so it plays it safe and opts for an index scan rather than an index seek. This is where OPTION(RECOMPILE) can be your friend, especially on simple, easy-to-compile queries like this one.
Also, there are potential gains from using a WHERE EXISTS in place of the IN.
Try the following...
DECLARE @filter_by_dept INT = 10;
SELECT
da.activity_type,
rc = COUNT(1)
FROM
dbo.dept_activity da
WHERE
@filter_by_dept IS NULL
OR
EXISTS (SELECT 1 FROM #tbl_depts td WHERE da.depart_id = td.dept_id)
GROUP BY
da.activity_type
OPTION (RECOMPILE);
HTH, Jason

Preferred method of T-SQL if condition to improve query plan re-use

I want to understand which is the better method of implementing an "IF" condition inside a stored procedure.
I have seen this method used extensively, which is comparable to iterative coding...
declare @boolExpression bit = 1 --True
if @boolExpression = 1
select [column] from MyTable where [group] = 10
else
select [column] from MyTable where [group] = 20
I prefer to use a set based method...
declare @boolExpression bit = 1 --True
select [column] from MyTable where [group] = 10 and @boolExpression = 1
union all
select [column] from MyTable where [group] = 20 and @boolExpression = 0
I prefer this method because, as I understand it, it creates a reusable query plan and causes less plan cache churn. Is this fact or fiction? Which is the correct method to use?
Thanks in advance
Assuming you are missing a UNION ALL, there isn't much in it as far as I can see. The first version will cache a plan for each statement as children of a COND operator, such that only the relevant one will get invoked at execution time.
The second one will have both branches as children of a concatenation operator. The filters have a Startup Expression Predicate meaning that each seek is only evaluated if required.
You could also use it as follows:
DECLARE @boolExpression BIT = 1
SELECT [column] FROM MyTable
WHERE
CASE
WHEN @boolExpression = 1 THEN
CASE
WHEN [group] = 10 THEN 1
ELSE 0
END
ELSE
CASE
WHEN [group] = 20 THEN 1
ELSE 0
END
END = 1
I know it looks complicated, but it does the trick, especially in cases where applying a parameter in a stored procedure is optional.
